Lecture 4: Non-Linear Filters & Image Compression: B14 Image Analysis Michaelmas 2014 A. Zisserman

This lecture covers image compression and non-linear filters. It discusses lossy versus lossless compression and JPEG compression. JPEG uses discrete cosine transform (DCT) and quantization for compression. Non-linear filters discussed include median filters, which remove noise while preserving edges, and bilateral filters, which reduce noise while avoiding blurring of edges. Bilateral filters perform a weighted average of nearby pixels, weighting by both spatial closeness and similarity of pixel values.


Lecture 4: Non-linear filters & Image Compression

B14 Image Analysis Michaelmas 2014 A. Zisserman

• Image Compression
• Non-lossy vs lossy, JPEG

• Non-linear filters
• Median filter, bilateral filter, non-local means

• Inpainting
Image Compression
Why compress?

• Storage
• Bandwidth
The key attribute which enables compression is redundancy

Example
PAL colour video 768 x 576 pixels at 25 frames per second = 33 Mbytes per second
A 2-hour video = 238 Gbytes (cf. a CD holds 0.7 Gbytes, a single-layer DVD holds 4.7 Gbytes)
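
The figures in this example can be checked with a few lines of arithmetic. This is a minimal sketch that assumes 3 bytes per pixel (8 bits per colour channel), which the slide does not state explicitly.

```python
# Raw data rate of PAL colour video.
width, height = 768, 576
bytes_per_pixel = 3            # assumption: one byte each for R, G and B
frames_per_second = 25

bytes_per_second = width * height * bytes_per_pixel * frames_per_second
bytes_two_hours = bytes_per_second * 2 * 60 * 60

print(f"{bytes_per_second / 1e6:.1f} Mbytes per second")
print(f"{bytes_two_hours / 1e9:.1f} Gbytes for a 2-hour video")
```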

Compression/decompression pipeline

image → encoder → compressed image → decoder → uncompressed image
Two types of compression
1. Loss-less compression (error free), e.g. gzip, zip
• typical compression ratios in range 2:1 to 3:1
2. Lossy compression
• typical compression ratios in range 10:1 to 50:1
• no perceptible visual difference

Example JPEG compression ratio 16:1


Loss-less Compression

Method 1: Run length encoding

Example – white bar length 40

256 x 256 Image = 65536 bytes

Run length encode as


{ (32748,0), (40,255), (32748,0) } = 3 x (4 + 1) = 15 bytes
(each run length is an integer requiring 4 bytes, each intensity 1 byte)
Compression ratio = 65536 / 15 ≈ 4369

Extensions: code ramps; code regions
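
The white bar example can be reproduced with a short run-length encoder. This is a minimal sketch: the function names are illustrative, the image is treated as one flat sequence of intensities, and the byte count uses the 4 + 1 bytes per run stated in the example.

```python
def run_length_encode(pixels):
    """Return a list of (run_length, value) pairs for a flat pixel sequence."""
    runs = []
    for value in pixels:
        if runs and runs[-1][1] == value:
            runs[-1][0] += 1           # extend the current run
        else:
            runs.append([1, value])    # start a new run
    return [tuple(run) for run in runs]

def run_length_decode(runs):
    """Invert run_length_encode."""
    pixels = []
    for length, value in runs:
        pixels.extend([value] * length)
    return pixels

# 256 x 256 black image containing one white run of 40 pixels
image = [0] * 32748 + [255] * 40 + [0] * 32748
runs = run_length_encode(image)
encoded_bytes = len(runs) * (4 + 1)    # 4-byte run length + 1-byte intensity
compression_ratio = len(image) / encoded_bytes
print(runs, encoded_bytes, round(compression_ratio))
```

Because decoding recovers the input exactly, this is a loss-less scheme.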


Method 2: Coding redundancy

Intensity histogram

(Figure: frequency against intensity; use a short code for frequent intensities and a long code for rare ones)

• instead of using the same length code (1 byte) for every intensity, use variable length coding
• e.g. 2 bits for the most common intensity, longer codes for less common intensities
• Use Huffman coding
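
As an illustration of variable length coding, the following is a minimal sketch of Huffman code construction from an intensity histogram. The pixel values and their frequencies are made up for the example, and the function name is illustrative.

```python
import heapq
from collections import Counter

def huffman_code(symbols):
    """Build a prefix code (symbol -> bit string) from a sequence of symbols."""
    counts = Counter(symbols)
    if len(counts) == 1:                      # degenerate case: a single symbol
        return {next(iter(counts)): "0"}
    # Heap entries: (frequency, tie_breaker, {symbol: code_so_far})
    heap = [(freq, i, {sym: ""}) for i, (sym, freq) in enumerate(counts.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        f0, _, c0 = heapq.heappop(heap)       # two least frequent subtrees
        f1, _, c1 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in c0.items()}
        merged.update({s: "1" + c for s, c in c1.items()})
        heapq.heappush(heap, (f0 + f1, tie, merged))
        tie += 1
    return heap[0][2]

# Made-up histogram: intensity 0 is by far the most common
pixels = [0] * 60 + [255] * 25 + [128] * 10 + [64] * 5
code = huffman_code(pixels)
total_bits = sum(len(code[p]) for p in pixels)
print(code, total_bits, "bits instead of", 8 * len(pixels))
```

The most common intensity receives the shortest code, so the total is well below the 8 bits per pixel of a fixed length code.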
Joint Photographic Experts Group (JPEG)

• Standard for compression of general images


• Allows progressive display of images
• Uses both lossy and lossless compression techniques
• Uses Discrete Cosine Transform (DCT)
JPEG Pipeline

Encoding:
1. Divide image into 8 x 8 pixel blocks
2. Compute DCT coefficients
3. Quantize DCT coefficients (lossy step)
4. Huffman code

Decoding:
1. Decode coefficients
2. Reconstruct each block from its DCT coefficients
3. Recompose image from 8 x 8 blocks
Discrete Cosine Transform (DCT) (cf Fourier series)

The forward cosine transform is defined as

\[ F(u,v) = \frac{4\,C(u)\,C(v)}{N^2} \sum_{x=0}^{N-1} \sum_{y=0}^{N-1} f(x,y) \cos\left[\frac{\pi u}{N}\left(x+\frac{1}{2}\right)\right] \cos\left[\frac{\pi v}{N}\left(y+\frac{1}{2}\right)\right] \]

where

\[ C(w) = \begin{cases} 1/\sqrt{2} & \text{for } w = 0, \\ 1 & \text{for } w = 1, 2, \ldots, N-1 \end{cases} \]

The inverse DCT is defined as

\[ f(x,y) = \sum_{u=0}^{N-1} \sum_{v=0}^{N-1} C(u)\,C(v)\,F(u,v) \cos\left[\frac{\pi u}{N}\left(x+\frac{1}{2}\right)\right] \cos\left[\frac{\pi v}{N}\left(y+\frac{1}{2}\right)\right] \]
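
The forward and inverse transforms can be evaluated directly from their definitions. This is a minimal, unoptimised sketch that takes C(0) = 1/√2 and C(w) = 1 otherwise, the choice for which the inverse formula exactly undoes the forward formula. The function names and the test block are illustrative.

```python
import math

def C(w):
    """Normalisation term: 1/sqrt(2) for the zero-frequency index, else 1."""
    return 1 / math.sqrt(2) if w == 0 else 1.0

def basis(k, n, N):
    """cos[(pi * k / N) * (n + 1/2)], one sample of the 1D cosine basis."""
    return math.cos(math.pi * k / N * (n + 0.5))

def dct2(f):
    """Forward DCT of an N x N block, straight from the double sum."""
    N = len(f)
    return [[4 * C(u) * C(v) / N**2
             * sum(f[x][y] * basis(u, x, N) * basis(v, y, N)
                   for x in range(N) for y in range(N))
             for v in range(N)] for u in range(N)]

def idct2(F):
    """Inverse DCT of an N x N block of coefficients."""
    N = len(F)
    return [[sum(C(u) * C(v) * F[u][v] * basis(u, x, N) * basis(v, y, N)
                 for u in range(N) for v in range(N))
             for y in range(N)] for x in range(N)]

# Made-up 8 x 8 block of intensities
block = [[(x * 8 + y) % 17 * 3 for y in range(8)] for x in range(8)]
F = dct2(block)
restored = idct2(F)
max_error = max(abs(block[x][y] - restored[x][y])
                for x in range(8) for y in range(8))
print(f"largest reconstruction error: {max_error:.2e}")
```

Without quantization the round trip is exact up to floating point error, so the transform itself loses no information.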
(Figure: image divided into 8 x 8 blocks; the 8 x 8 basis images for each block)


JPEG Quantization Step

out(u,v) = round(F(u,v)/Z(u,v))

where Z(u,v) is the normalization matrix

• Z(u,v) chosen to weight coefficients which are visually salient
• scale Z(u,v) to alter compression quality
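
The quantization step and its inverse can be sketched as follows. The coefficient values and the matrix Z are made up for illustration (Z is not the table from the JPEG standard), and the function names are hypothetical. Note that Python's round() rounds exact halves to the nearest even integer.

```python
def quantize(F, Z):
    """out(u,v) = round(F(u,v) / Z(u,v)): the lossy step."""
    n = len(F)
    return [[round(F[u][v] / Z[u][v]) for v in range(n)] for u in range(n)]

def dequantize(out, Z):
    """Approximate the original coefficients as out(u,v) * Z(u,v)."""
    n = len(out)
    return [[out[u][v] * Z[u][v] for v in range(n)] for u in range(n)]

# Made-up 4 x 4 coefficients: large low-frequency terms, small high-frequency terms
F = [[236.0, -23.0, 4.0, 1.0],
     [-11.0, 9.0, -2.0, 0.5],
     [5.0, -3.0, 1.0, 0.2],
     [1.0, 0.6, -0.3, 0.1]]
# Hypothetical normalization matrix: coarser steps at higher frequencies
Z = [[16 + 8 * (u + v) for v in range(4)] for u in range(4)]

out = quantize(F, Z)
F_approx = dequantize(out, Z)
nonzero = sum(1 for row in out for value in row if value != 0)
print(out, nonzero)
```

Most quantized coefficients become zero, which is what makes the subsequent entropy coding effective. Multiplying Z by a larger scale factor zeroes more coefficients at the cost of a larger reconstruction error.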
Example

(Figure panels: DCT coefficients F(u,v); normalization; normalized & quantized out(u,v); denormalized & quantized)
JPEG performance

CR:     original  12:1  16:1  21:1  25:1  33:1  41:1  48:1
Bytes:  65536     5675  4016  3158  2643  2007  1613  1373

Note “blocking artifacts”
Differences wrt original image

JPEG CR:        12:1  21:1  33:1  48:1
mean abs diff:  1.3   2.2   4.1   9.4

(Figure: pixels for which absolute intensity difference > 5)


Non-linear filters

• Non-linear filters can achieve results that linear filters cannot, e.g.

• Suppression of non-Gaussian noise, e.g. spikes

• Edge preserving properties

• Examples include median filters, and morphological filters


Median filter

Definition
1. rank-order neighbourhood intensities
2. select middle value
In 1D
The median of 2N+1 samples is the value which has N values smaller than or equal to it, and N values larger than or equal to it.
Example
{ 5, 7, 3, 4, 5, 19, 6, 4, 9 } → sort → { 3, 4, 4, 5, 5, 6, 7, 9, 19 }
Median of set = 5
Note the “odd man out” effect, e.g. with a median filter of width 3:
{ 1, 1, 1, 7, 1, 1, 1 } → { ?, 1, 1, 1, 1, 1, ? }

(In the figure, the filters have width 5.)

Median filter
• no new grey values are created
• edge is preserved
• spike is removed
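
The 1D behaviour can be tried out with a few lines of code. This is a minimal sketch; the function name is illustrative, and the output covers only positions where the whole window fits, which corresponds to leaving the "?" end values undefined.

```python
from statistics import median

def median_filter_1d(signal, width):
    """Median filter of odd width, evaluated only where the window fits."""
    half = width // 2
    return [median(signal[i - half:i + half + 1])
            for i in range(half, len(signal) - half)]

# "Odd man out": the isolated 7 is removed by a width 3 filter
print(median_filter_1d([1, 1, 1, 7, 1, 1, 1], 3))

# A spike followed by a step edge, filtered with width 5
signal = [0, 0, 0, 9, 0, 0, 0, 0, 5, 5, 5, 5]
filtered = median_filter_1d(signal, 5)
print(filtered)
```

In the second example the spike disappears, the step stays at the same position, and every output value already occurs in the input.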
Median filter in 2D
• A median filter operates over a window by selecting
the median intensity in the window

Source: K. Grauman
Median properties

Suppose f{i} and g{i} are two sets of 2N+1 values, then
1. med( k f{i} ) = k med( f{i} )
2. med( k + f{i} ) = k + med( f{i} )

The median is not linear because, in general,

• med( f{i} + g{i} ) ≠ med( f{i} ) + med( g{i} )
e.g.
• med( {1,2,3} + {5,4,6} ) = med( {6,6,9} ) = 6
• med( {1,2,3} ) + med( {5,4,6} ) = 2 + 5 = 7
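
These properties can be checked numerically. The sketch below uses the two sets from the example and an arbitrary constant k = 3 (an assumption made for the illustration).

```python
from statistics import median

f = [1, 2, 3]
g = [5, 4, 6]
k = 3   # arbitrary constant chosen for the check

# Properties 1 and 2: scaling and shifting commute with the median
scaled = median([k * a for a in f]) == k * median(f)
shifted = median([k + a for a in f]) == k + median(f)

# Additivity fails: the median of the element-wise sum is not the sum of medians
median_of_sum = median([a + b for a, b in zip(f, g)])   # med({6, 6, 9})
sum_of_medians = median(f) + median(g)                  # 2 + 5

print(scaled, shifted, median_of_sum, sum_of_medians)
```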
Gaussian noise

2D median filter, 3 x 3 neighbourhood

• sharpens edges, reduces noise, but …


• generates jagged edges
Gaussian noise
Comparison with Gaussian filter

Gaussian
Median

Gaussian: upper lip smoother, eye better preserved


3 x 3 median applied 10 times

• patchy effect
• important details lost (e.g. ear-ring)
Salt and Pepper noise

(Figure panels: Gaussian, σ = 1 pixel; 3 x 3 median)
Median filter

(Figure: salt and pepper noise image and median filtered image, with plots of a row of the image)

Matlab: output_im = medfilt2(im, [h w]);

Source: M. Hebert
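
For readers without Matlab, a 2D median filter can be sketched in plain Python. The function name is illustrative, and the border handling (border pixels are copied unchanged) is a simplification that is not claimed to match medfilt2.

```python
from statistics import median

def median_filter_2d(image, h, w):
    """Median over an h x w window (odd sizes); border pixels are copied."""
    rows, cols = len(image), len(image[0])
    rh, rw = h // 2, w // 2
    out = [row[:] for row in image]
    for r in range(rh, rows - rh):
        for c in range(rw, cols - rw):
            window = [image[i][j]
                      for i in range(r - rh, r + rh + 1)
                      for j in range(c - rw, c + rw + 1)]
            out[r][c] = median(window)
    return out

# Flat grey image with one "salt" and one "pepper" pixel
noisy = [[100] * 5 for _ in range(5)]
noisy[2][2] = 255   # salt
noisy[1][3] = 0     # pepper
cleaned = median_filter_2d(noisy, 3, 3)
print(cleaned)
```

Both outliers are replaced by the surrounding grey value, since isolated extreme values never reach the middle of the sorted window.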


Application: Film Restoration
• Objective: remove vertical scratches and blotches
Bilateral filtering
(Figure: noisy signal f(x) + n(x), and the result of Gaussian filtering with σ = 2 and with σ = 4)

Spatial averaging removes noise and blurs edges

Can we remove noise but not blur edges?

Filtering f(x,y) with a Gaussian filter h(x,y):

for a pixel at p, rewrite this as a weighted average

\[ I_n(p) = \sum_{q \in S} G_\sigma(\|p - q\|)\, I(q) \]

over the support of the filter S, where

\[ G_\sigma(x) = e^{-\frac{x^2}{2\sigma^2}} \]
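Bilateral filter

Same idea: weighted average of neighbouring pixels but with an additional weighting term

\[ I_n(p) = \frac{1}{W} \sum_{q \in S} G_{\sigma_s}(\|p - q\|)\, G_{\sigma_r}(|I(p) - I(q)|)\, I(q) \]

where 1/W is the normalization factor, G_{\sigma_s} is the spatial weight, G_{\sigma_r} is the range weight, and I_n(p) is the new value at p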
1D illustration

(Figure: pixel intensity against pixel position, with the spatial and range extents marked around pixel p)

Only pixels close in space and in range are considered
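
The weighted average can be written out for a 1D signal. This is a minimal sketch: the function names, the truncation of the support S to a fixed radius, and the test signal are assumptions made for the illustration.

```python
import math

def gaussian(x, sigma):
    """G_sigma(x) = exp(-x^2 / (2 sigma^2))."""
    return math.exp(-x * x / (2 * sigma * sigma))

def bilateral_filter_1d(signal, sigma_s, sigma_r, radius):
    """Bilateral filter with the support S truncated to +/- radius samples."""
    out = []
    for p, Ip in enumerate(signal):
        total = 0.0
        W = 0.0                                   # normalization factor
        for q in range(max(0, p - radius), min(len(signal), p + radius + 1)):
            w = gaussian(p - q, sigma_s) * gaussian(Ip - signal[q], sigma_r)
            total += w * signal[q]
            W += w
        out.append(total / W)
    return out

# Noisy step edge (made-up values)
signal = [0.0, 0.1, -0.1, 0.05, 1.0, 0.9, 1.1, 1.0]
smoothed = bilateral_filter_1d(signal, sigma_s=2, sigma_r=0.2, radius=3)
# An infinite range sigma makes the range weight 1, giving a Gaussian blur
blurred = bilateral_filter_1d(signal, sigma_s=2, sigma_r=float("inf"), radius=3)
print([round(v, 2) for v in smoothed])
print([round(v, 2) for v in blurred])
```

With a small range sigma the samples on the other side of the step receive almost no weight, so the edge stays sharp while the noise on each side is averaged. With an infinite range sigma the same code smears the step.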
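Gaussian blur vs bilateral filter

(Figure: around pixel p, Gaussian blur uses the spatial weight only; the bilateral filter uses both the spatial and the range weights. Input and output shown.)

\[ I_n(p) = \frac{1}{W} \sum_{q \in S} G_{\sigma_s}(\|p - q\|)\, G_{\sigma_r}(|I(p) - I(q)|)\, I(q) \]

reproduced from [Durand 02]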


Exploring the Parameter Space

(Figure: input image filtered with σs = 2, 6, 18 and σr = 0.1, 0.25, ∞; σr = ∞ is Gaussian blur)
Basic denoising

Noisy input Bilateral filter 7x7 window


Non-Local Means Denoising

Key idea: to denoise a pixel p use patches in the image with similar neighbourhoods

Image self-similarity

A. Buades, B. Coll, J.M. Morel, "A non local algorithm for image denoising", CVPR 2005
Non-Local Means Denoising

Assume that if the neighbourhoods are similar, then the central pixel value is also similar (apart from noise)
Algorithm:
• Select similar (low distance) neighbourhoods to p’s, where the distance between two neighbourhoods is the Gaussian weighted sum of squared differences d(Np, Nq)
• Compute a Gaussian weighted sum of the central pixel values as

\[ I_n(p) = \frac{1}{W} \sum_{q \in \Phi} e^{-\frac{d(N_p, N_q)}{h^2}}\, I(q) \]

where Φ is the set of neighbourhoods and \( W = \sum_{q \in \Phi} e^{-\frac{d(N_p, N_q)}{h^2}} \)

(Figure: neighbourhoods Np and Nq around pixels p and q; maps of d(Np, Nq) and of e^{-d(Np, Nq)/h^2}, black = low)


Parameters:
• neighbourhood size: 5 x 5
• search region: 7 x 7
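
The algorithm can be sketched in plain Python for a small greyscale image. The defaults follow the parameters above (5 x 5 neighbourhood, 7 x 7 search region). The function name, the values of h and sigma_patch, the clamping of coordinates at the image border, and the test image are assumptions made for the illustration.

```python
import math

def nlm_denoise(image, patch_radius=2, search_radius=3, h=10.0, sigma_patch=1.0):
    """Non-local means on a 2D list of intensities (unoptimised sketch)."""
    rows, cols = len(image), len(image[0])

    def pixel(r, c):
        # Clamp coordinates so that patches near the border stay defined
        return image[min(max(r, 0), rows - 1)][min(max(c, 0), cols - 1)]

    offsets = [(dr, dc)
               for dr in range(-patch_radius, patch_radius + 1)
               for dc in range(-patch_radius, patch_radius + 1)]
    # Gaussian weights over the patch, normalised to sum to 1
    kernel = [math.exp(-(dr * dr + dc * dc) / (2 * sigma_patch ** 2))
              for dr, dc in offsets]
    ksum = sum(kernel)
    kernel = [k / ksum for k in kernel]

    def distance(pr, pc, qr, qc):
        """d(Np, Nq): Gaussian weighted sum of squared differences."""
        return sum(k * (pixel(pr + dr, pc + dc) - pixel(qr + dr, qc + dc)) ** 2
                   for k, (dr, dc) in zip(kernel, offsets))

    out = [[0.0] * cols for _ in range(rows)]
    for pr in range(rows):
        for pc in range(cols):
            total = 0.0
            W = 0.0
            for qr in range(max(0, pr - search_radius), min(rows, pr + search_radius + 1)):
                for qc in range(max(0, pc - search_radius), min(cols, pc + search_radius + 1)):
                    w = math.exp(-distance(pr, pc, qr, qc) / h ** 2)
                    total += w * image[qr][qc]
                    W += w
            out[pr][pc] = total / W
    return out

# Made-up 8 x 8 image: dark left half, bright right half, small deterministic "noise"
noisy = [[(0 if c < 4 else 100) + ((r * 7 + c * 3) % 5) - 2 for c in range(8)]
         for r in range(8)]
denoised = nlm_denoise(noisy)
print([round(v, 1) for v in denoised[3]])
```

Neighbourhoods that straddle the edge differently get a large distance and therefore a negligible weight, so each pixel is averaged almost entirely with pixels from its own side of the edge.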
Removing film grain
original
Removing film grain
denoised
Inpainting
Inpainting techniques

• Automated texture generation to fill in regions


Texture Synthesis

Texture
sample

Output

Efros & Leung, ICCV 1999


Synthesizing One Pixel

p
input image

synthesized image

• Select patches with similar neighbourhoods


• Choose a central pixel value randomly from this set
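
The two steps for synthesizing one pixel can be sketched as follows. This is a simplified illustration and not the authors' implementation: it uses an unweighted sum of squared differences over the already known neighbours, and the function name, the tolerance eps and the checkerboard test data are assumptions.

```python
import random

def synthesize_pixel(sample, output, known, pr, pc, radius=1, eps=0.1, rng=random):
    """Pick a value for output[pr][pc] from sample pixels with similar neighbourhoods."""
    # Offsets of neighbours of p that are inside the output and already known
    offsets = [(dr, dc)
               for dr in range(-radius, radius + 1)
               for dc in range(-radius, radius + 1)
               if (dr, dc) != (0, 0)
               and 0 <= pr + dr < len(output)
               and 0 <= pc + dc < len(output[0])
               and known[pr + dr][pc + dc]]
    if not offsets:
        raise ValueError("pixel has no known neighbours")

    # Distance from p's neighbourhood to every full neighbourhood in the sample
    candidates = []
    for r in range(radius, len(sample) - radius):
        for c in range(radius, len(sample[0]) - radius):
            d = sum((sample[r + dr][c + dc] - output[pr + dr][pc + dc]) ** 2
                    for dr, dc in offsets)
            candidates.append((d, sample[r][c]))

    # Keep the neighbourhoods close to the best match, then choose one at random
    d_min = min(d for d, _ in candidates)
    similar = [value for d, value in candidates if d <= d_min * (1 + eps)]
    return rng.choice(similar)

# Checkerboard texture sample and a 3 x 3 output with an unknown centre
sample = [[255 if (r + c) % 2 == 0 else 0 for c in range(6)] for r in range(6)]
output = [[255, 0, 255],
          [0, None, 0],
          [255, 0, 255]]
known = [[value is not None for value in row] for row in output]
value = synthesize_pixel(sample, output, known, 1, 1)
print(value)
```

For this regular texture every well matching neighbourhood has the same centre, so the random choice is forced. For a stochastic texture the set of similar neighbourhoods contains different centre values, which is where the randomness enters.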

Slide from Alyosha Efros, ICCV 1999


Neighborhood Window

input

Slide from Alyosha Efros, ICCV 1999


Hole Filling

Slide from Alyosha Efros, ICCV 1999


A Cautionary Note …

Removal of Nikolai Yezhov


Removal of foreground objects using inpainting

original frame
Removal of foreground objects using inpainting

inpainted frame
Removal of foreground objects using inpainting

original frame
Removal of foreground objects using inpainting

inpainted frame
There is much more …

• Image reorganization, e.g. using the patch transform


• Image retargeting, e.g. change of aspect ratio without losing or distorting semantic content

(Figure: scale vs retarget)
