Lecture 11: Camera ISP

The document discusses the camera processing pipeline, including how light passes through the lens and is captured by the sensor. It covers topics like pinhole cameras, lenses, aperture, depth of field, chromatic aberration, lens distortion, vignetting, sensor types, Bayer demosaicking, ISO, color spaces, gamma correction, and calculating luminance from RGB values. The pipeline involves multiple complex steps to convert the light information into a digital image file.

Camera Processing Pipeline
Kari Pulli
VP Computational Imaging
Light

Imaging without optics?
Each point on the sensor would record the integral of light arriving from every point on the subject.
All sensor points would record similar colors.
Pinhole camera (a.k.a. camera obscura)

Linear perspective with viewpoint at pinhole


Effect of pinhole size
Stopping down the pinhole:
Large pinhole: geometric blur
Optimal pinhole: too little light
Small pinhole: diffraction blur
Add a lens to get more light
Real lenses are complex
Thin lens approximation: Gauss’s ray diagram

http://graphics.stanford.edu/courses/cs178/applets/thinlens.html
Changing the focus distance:  1/so + 1/si = 1/f

To focus on objects at different distances,
move the sensor relative to the lens.

At so = si = 2f we get 1:1 imaging (macro), because
1/(2f) + 1/(2f) = 1/f

Can't focus on objects closer to the lens than f.
http://graphics.stanford.edu/courses/cs178/applets/gaussian.html
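As a quick numerical check of the thin-lens relation above, the equation can be solved for the sensor-side image distance. A minimal Python sketch (the function name and units are my own):

```python
def image_distance(s_o: float, f: float) -> float:
    """Solve the thin-lens equation 1/s_o + 1/s_i = 1/f for the image distance s_i.

    s_o (object distance) and f (focal length) are in the same units, e.g. mm.
    Objects at s_o <= f form no real image behind the lens.
    """
    if s_o <= f:
        raise ValueError("cannot focus on objects closer to the lens than f")
    return 1.0 / (1.0 / f - 1.0 / s_o)

# A 50 mm lens focused at 2 m needs the sensor ~51.3 mm behind the lens;
# at s_o = 2f = 100 mm the image distance is also 100 mm (1:1 macro).
print(image_distance(2000.0, 50.0))  # ~51.28
print(image_distance(100.0, 50.0))   # 100.0
```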
Circle of confusion

Focusing:
1/Dnear = 1/D + c/(A·d),   1/Dfar = 1/D − c/(A·d)

[Diagram: camera with aperture diameter A and circle of confusion c on the sensor at distance d behind the lens; scene focused at distance D; the range from Dnear to Dfar is the DOF.]

• Depth of field (DOF) = the range of distances that are in focus

http://graphics.stanford.edu/courses/cs178/applets/dof.html
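The near and far limits above are easy to evaluate numerically. A small sketch, assuming D is the focused distance, d the lens-to-sensor distance, A the aperture diameter, and c the largest acceptable circle of confusion (all in the same units):

```python
def depth_of_field(D, d, A, c):
    """Near/far limits of acceptable focus from the circle-of-confusion formulas:
    1/D_near = 1/D + c/(A*d),  1/D_far = 1/D - c/(A*d)."""
    k = c / (A * d)
    D_near = 1.0 / (1.0 / D + k)
    inv_far = 1.0 / D - k
    D_far = 1.0 / inv_far if inv_far > 0 else float("inf")  # focused beyond the hyperfocal distance
    return D_near, D_far

# 50 mm lens at f/2.8 (A ~ 17.9 mm), focused at 2 m (d ~ 51.3 mm), c = 0.03 mm:
print(depth_of_field(2000.0, 51.3, 50.0 / 2.8, 0.03))  # roughly (1877, 2140) mm
```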
Chromatic aberration

Different wavelengths refract at different rates,
so they have different focal lengths.
Correct with an achromatic doublet:
strong positive lens + weak negative lens = weak positive compound lens
aligns red and blue focus
Lens distortion

Radial change in magnification


(a) pincushion
(b) barrel distortion
Vignetting

Irradiance at a pixel is proportional to
the projected area of the aperture as seen from the pixel
the projected area of the pixel as seen from the aperture
1 / distance² from the aperture to the pixel
Combining all these, each contributes roughly a factor of cos θ,
so light drops off as cos⁴ θ.

Fix by calibrating:
take a photo of a uniformly white object;
the picture shows the attenuation, divide the pixel values by it (see the sketch below).
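A minimal sketch of that division-based fix, assuming both images are float arrays of the same shape (function and variable names are illustrative):

```python
import numpy as np

def correct_vignetting(image: np.ndarray, flat_field: np.ndarray) -> np.ndarray:
    """Divide out the lens falloff measured from a photo of a uniformly white object."""
    gain_map = flat_field / flat_field.max()        # 1.0 at the brightest point, <1 in the corners
    return image / np.clip(gain_map, 1e-6, None)    # avoid division by zero
```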
CMOS sensor
Front- vs. Back-illuminated sensor
Anti-aliasing filter

Two layers of birefringent material split one ray into 4 rays.

[Comparison images: anti-aliasing filter removed vs. normal]


From “raw-raw” to RAW

Pixel Non-Uniformity
each pixel has a slightly different sensitivity to light
typically within 1% to 2%
reduce by calibrating an image with a flat-field image
also eliminate the effects of vignetting and other optical variations

Stuck pixels
some pixels are turned always on or off
identify, replace with filtered values

Dark floor
temperature adds noise
sensors usually have a ring of covered pixels
around the exposed sensor, subtract their signal
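A rough sketch of the dark-floor and stuck-pixel corrections described above (the flat-field step is shown in the vignetting section); the function and its inputs are illustrative, not any particular camera's calibration code:

```python
import numpy as np
from scipy.ndimage import median_filter

def clean_raw(raw: np.ndarray, dark_ring_mean: float, stuck_mask: np.ndarray) -> np.ndarray:
    """Subtract the dark floor and patch stuck pixels in a raw frame.

    dark_ring_mean: mean signal of the optically shielded border pixels.
    stuck_mask: boolean map of pixels known to be always on or off.
    """
    img = raw.astype(np.float32) - dark_ring_mean              # remove the dark floor
    img[stuck_mask] = median_filter(img, size=3)[stuck_mask]   # replace stuck pixels with filtered values
    return img
```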
AD Conversion

The sensor converts the continuous light signal to a continuous electrical signal.
The analog signal is converted to a digital signal:
at least 10 bits (even on cell phones), often 12 or more
(roughly) linear sensor response
ISO = amplification in AD conversion

Before conversion the signal can be amplified


ISO 100 means no amplification
ISO 1600 means 16x amplification
+: can see details in dark areas better
-: noise is amplified as well; sensor more likely to saturate
ISO
From “raw-raw” to RAW

Pixel Non-Uniformity
each pixel in a CCD has a slightly different sensitivity to light,
typically within 1% to 2% of the average signal
can be reduced by calibrating an image with a flat-field image
flat-field images are also used to eliminate the effects of
vignetting and other optical variations
Stuck pixels
some pixels are turned always on or off
identify, replace with filtered values
Dark floor
temperature adds noise
sensors usually have a ring of covered pixels
around the exposed sensor, subtract their signal
Color filter array

Bayer pattern
Demosaicking
Your eyes do it too…
Demosaicking
First choice: bilinear interpolation

Easy to implement

But fails at sharp edges


Take edges into account

Use bilateral filtering to avoid interpolating across edges
("Adaptive demosaicking," Ramanath, Snyder, JEI 2003)
Take edges into account

"High-quality linear interpolation for demosaicing of Bayer-patterned color images," Malvar, He, Cutler, ICASSP 2004

Predict edges and adjust. Assumptions:
luminance correlates with RGB
edges = luminance change

When estimating G at a red pixel:
if the measured R differs from the bilinearly estimated R, the luminance changes there.
Correct the bilinear estimate by the difference between the estimate and the real value:

Ĝ(i, j) = Ĝ_bilinear(i, j) + α · [R(i, j) − R̂_bilinear(i, j)]
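A sketch of that green-at-red step in Python; with α = 1/2 this matches the 5×5 linear filter of Malvar et al. for this site, though the full method also handles the other Bayer sites with tuned gains (indexing assumes the pixel is at least two samples from the border):

```python
import numpy as np

def green_at_red(raw: np.ndarray, i: int, j: int, alpha: float = 0.5) -> float:
    """Gradient-corrected green estimate at a red site of a Bayer mosaic."""
    g_bilinear = (raw[i - 1, j] + raw[i + 1, j] + raw[i, j - 1] + raw[i, j + 1]) / 4.0
    r_bilinear = (raw[i - 2, j] + raw[i + 2, j] + raw[i, j - 2] + raw[i, j + 2]) / 4.0
    # Bilinear green, corrected by how much the red sample deviates from its own bilinear estimate.
    return g_bilinear + alpha * (raw[i, j] - r_bilinear)
```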


Denoising using non-local means

Most image details occur repeatedly.
Each color indicates a group of squares in the image which are almost indistinguishable.
Image self-similarity can be used to eliminate noise:
it suffices to average the squares which resemble each other.

"Image and movie denoising by nonlocal means," Buades, Coll, Morel, IJCV 2006
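An illustrative (and deliberately slow) per-pixel version of that averaging; parameter names and values are arbitrary, and the pixel is assumed to lie far enough from the image border:

```python
import numpy as np

def nlmeans_pixel(img: np.ndarray, i: int, j: int,
                  patch: int = 3, search: int = 10, h: float = 10.0) -> float:
    """Non-local means estimate for one pixel: a weighted average of pixels whose
    surrounding patches resemble the patch around (i, j)."""
    p = patch // 2
    ref = img[i - p:i + p + 1, j - p:j + p + 1].astype(np.float64)
    num, den = 0.0, 0.0
    for y in range(i - search, i + search + 1):
        for x in range(j - search, j + search + 1):
            cand = img[y - p:y + p + 1, x - p:x + p + 1].astype(np.float64)
            w = np.exp(-np.sum((ref - cand) ** 2) / (h * h))  # similar patches get weights near 1
            num += w * img[y, x]
            den += w
    return num / den
```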
BM3D (Block Matching 3D)
The CIE XYZ System

A standard created in 1931 by the CIE
(Commission Internationale de l'Eclairage)
Defined in terms of three color matching functions x̄, ȳ, z̄
Given an emission spectrum, we can use the CIE matching functions
to obtain the X, Y, and Z coordinates
Y corresponds to luminance perception
http://graphics.stanford.edu/courses/cs178/applets/threedgamut.html
The CIE Chromaticity Diagram

Intensity is measured as the distance from the origin
black = (0, 0, 0)

Chromaticity coordinates give a notion of color independent of brightness.

A projection onto the plane x + y + z = 1 yields a chromaticity value, characterized by
dominant wavelength (= hue), and
excitation purity (= saturation),
the distance from the white point at (1/3, 1/3, 1/3)
Perceptual (non-)uniformity

The XYZ color space is not perceptually uniform!
Enlarged ellipses of constant color in XYZ space
CIE L*a*b*: uniform color space

Lab is designed to approximate human vision


it aspires to perceptual uniformity
L component closely matches human perception
of lightness

A good color space for image processing


Break RGB to Lab channels
Blur “a” channel (red-green)
Blur “b” channel (blue-yellow)
Blur “L” channel
YUV, YCbCr, …

Family of color spaces for video encoding


Uses the fact that the eye is more sensitive to luminance than to chrominance
Filters the color channels down (2:1, 4:1)

Channels
Y = luminance [linear]; Y’ = luma [gamma corrected]
CbCr / UV = chrominance [always linear]

Y′CbCr is not an absolute color space


it is a way of encoding RGB information
the actual color depends on the RGB primaries used
How many bits are needed for smooth shading?

With a given adaptation, human vision has a contrast sensitivity of ~1%.
Call black 1 and white 100; you can see differences of
1, 1.01, 1.02, …  (needed step size ~0.01)
98, 99, 100  (needed step size ~1)
With linear encoding:
delta 0.01: 100 steps between 99 and 100 → wasteful
delta 1: only 1 step between 1 and 2 → lose detail in shadows
Instead, apply a non-linear power function (gamma), which provides an adaptive step size.
Gamma encoding

With the "delta" ratio of 1.01,
need about 480 steps to reach 100
takes almost 9 bits

8 bits, nonlinearly encoded, is
sufficient for broadcast-quality digital TV
contrast ratio ~50:1

With poor viewing conditions or display quality, fewer bits are needed.

http://graphics.stanford.edu/courses/cs178/applets/gamma.html
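The step count is just a logarithm; the short computation below reproduces the roughly 9-bit figure (a strict 1% ratio gives log(100)/log(1.01) ≈ 463 steps) and contrasts it with the cost of a fixed linear step:

```python
import math

# Multiplicative 1% steps needed to cover a 100:1 range: 1.01**n = 100.
steps = math.log(100) / math.log(1.01)
print(steps, math.ceil(math.log2(steps)))     # ~463 steps -> about 9 bits

# A fixed linear step of 0.01 (fine enough for the shadows) would instead need
# 100 / 0.01 = 10,000 codes, i.e. about 14 bits.
print(math.ceil(math.log2(100 / 0.01)))
```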
Luminance from RGB

If three sources of the same radiance appear R, G, B:
green will appear the brightest: it has high luminous efficiency
red will appear less bright
blue will be the darkest

Luminance by NTSC: 0.2990 R + 0.5870 G + 0.1140 B
based on the phosphors in use in 1953

Luminance by CIE: 0.2126 R + 0.7152 G + 0.0722 B
based on contemporary phosphors

Luminance by ITU: 0.2125 R + 0.7154 G + 0.0721 B

1/4 R + 5/8 G + 1/8 B works fine
quick to compute: (R>>2) + (G>>1) + (G>>3) + (B>>3)
range is [0, 252]
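The shift-and-add form is trivial to verify; a small sketch with 8-bit inputs:

```python
def fast_luma(r: int, g: int, b: int) -> int:
    """Approximate luminance 1/4 R + 5/8 G + 1/8 B using only shifts and adds."""
    return (r >> 2) + (g >> 1) + (g >> 3) + (b >> 3)

print(fast_luma(255, 255, 255))  # 252: truncation in the shifts caps the range at [0, 252]
print(fast_luma(0, 0, 0))        # 0
```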
Cameras use sRGB

sRGB is a standard RGB color space (since 1996)


uses the same primaries as used in studio monitors and HDTV
and a gamma curve typical of CRTs
allows direct display

First need to map from sensor RGB to standard


need calibration
sRGB from XYZ

1. Linear transformation: XYZ → (3×3 matrix) → RGB_sRGB
2. Nonlinear distortion: RGB_sRGB → R′G′B′_sRGB
   if R_sRGB ≤ 0.0031308:  R′_sRGB = 12.92 · R_sRGB
   if R_sRGB > 0.0031308:  R′_sRGB = 1.055 · R_sRGB^(1/2.4) − 0.055
3. Quantization: R_8bit = round(255 · R′_sRGB)

Linear relation between XYZ and sRGB (columns = red, green, blue primaries according to ITU-R BT.709-3):
   X = 0.4124 R_sRGB + 0.3576 G_sRGB + 0.1805 B_sRGB
   Y = 0.2126 R_sRGB + 0.7152 G_sRGB + 0.0722 B_sRGB
   Z = 0.0193 R_sRGB + 0.1192 G_sRGB + 0.9505 B_sRGB
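A compact sketch of steps 2 and 3 (the transfer curve and quantization), assuming the input already holds linear sRGB values in [0, 1]:

```python
import numpy as np

def encode_srgb(rgb_linear: np.ndarray) -> np.ndarray:
    """Apply the sRGB transfer curve, then quantize to 8 bits."""
    lin = np.clip(rgb_linear, 0.0, 1.0)
    encoded = np.where(lin <= 0.0031308,
                       12.92 * lin,                               # linear segment near black
                       1.055 * np.power(lin, 1.0 / 2.4) - 0.055)  # gamma segment
    return np.round(255.0 * encoded).astype(np.uint8)
```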


Image processing in linear or non-linear space?

Simulating the physical world: use linear light
a weighted average of gamma-corrected pixel values is not a linear convolution!
bad for antialiasing
want to numerically simulate a lens? Undo gamma first.

Dealing with human perception: use non-linear coding
it allows minimizing perceptual errors due to quantization
Film response curve

Toe region: the chemical process is just starting
Middle: follows a power function
if a given amount of light turned half of the grain crystals to silver,
the same amount more turns half of the rest
Shoulder region: close to saturation
Toe and shoulder preserve more dynamic range in dark and bright areas,
at the cost of reduced contrast
Film has more dynamic range than print (~12 bits)
Digital camera response curve

Digital cameras modify the response curve

May use different response curves at different exposures


impossible to calibrate and invert!
3A

Automated selection of key camera control values


auto-focus
auto-exposure
auto-white-balance
Contrast-based auto-focus

ISP can filter pixels with configurable IIR filters


to produce a low-resolution sharpness map of the image

The sharpness map helps estimate the best lens position


by summing the sharpness values (= Focus Value)
either over the entire image
or over a rectangular area

http://graphics.stanford.edu/courses/cs178/applets/autofocusCD.html
Hunt for the peak

First a coarse search;
back up after passing the peak, then do a finer search.

[Plot: focus value vs. lens position; samples 1–9 along the scan direction climb past the peak, then back up for a fine search.]
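A coarse-to-fine hill climb in the spirit of the figure; focus_value stands in for the ISP's summed sharpness map at a given integer lens position, and the step sizes and stopping rule are arbitrary choices:

```python
def autofocus(focus_value, positions, coarse_step=8, fine_step=1):
    """Coarse sweep until the focus value clearly drops, then rescan finely around the best position."""
    best_p = positions[0]
    best_v = last_v = focus_value(best_p)
    for p in positions[::coarse_step]:
        v = focus_value(p)
        if v > best_v:
            best_p, best_v = p, v
        if v < 0.9 * last_v:          # clearly past the peak: stop the coarse sweep
            break
        last_v = v
    lo = max(best_p - coarse_step, positions[0])
    hi = min(best_p + coarse_step, positions[-1])
    for p in range(lo, hi + 1, fine_step):
        v = focus_value(p)
        if v > best_v:
            best_p, best_v = p, v
    return best_p
```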
Auto-White-Balance

The dominant light source (illuminant) produces a color


cast that affects the appearance of the scene objects
The color of the illuminant determines the color normally
associated with white by the human visual system
Auto-white-balance
Identify the illuminant color
Neutralize the color of the illuminant

(source: www.cambridgeincolour.com)
Identify the color of the illuminant

Prior knowledge about the ambient light
Candle flame light (~1850 K)
Sunset light (~2000 K)
Summer sunlight at noon (~5400 K)

Known reference object in the picture
best: find something that is white or gray

Assumptions about the scene
Gray world assumption (gray in sRGB space!)
Best way to do white balance

Grey card:
take a picture of a neutral object (white or gray)
deduce the weight of each channel

If the object is recorded as rw, gw, bw,
use weights k/rw, k/gw, k/bw,
where k controls the exposure.
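A minimal sketch of those per-channel weights; anchoring k to the green channel (an assumption, one common choice) leaves green unchanged:

```python
import numpy as np

def gray_card_gains(rw, gw, bw, k=None):
    """White-balance gains k/rw, k/gw, k/bw from the recorded gray-card values."""
    k = gw if k is None else k        # k sets the overall exposure
    return k / rw, k / gw, k / bw

def apply_white_balance(img, gains):
    """img: float RGB image in [0, 1]; gains: per-channel (r, g, b) multipliers."""
    return np.clip(img * np.asarray(gains), 0.0, 1.0)
```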
Brightest pixel assumption

Highlights usually have the color of the light source


at least for dielectric materials

White balance by using the brightest pixels


plus potentially a bunch of heuristics
in particular use a pixel that is not saturated / clipped
Color temperature
x, y chromaticity diagram

Colors of a black-body heated at different temperatures


fall on a curve (Planckian locus)
Colors change non-linearly with temperature
but almost linearly with reciprocal temperatures 1/T
Mapping the colors

For a given sensor


pre-compute the transformation matrices between the sensor
color space and sRGB at different temperatures

Estimate a new transformation


by interpolating between pre-computed matrices
ISP can apply the linear transformation
Estimating the color temperature

Use scene mode
Use the gray world assumption (R = G = B) in sRGB space
really, just R = B; ignore G
Estimate the color temperature in a given image:
apply the pre-computed matrices to get sRGB for T1 and T2
calculate the average values R, B under each
solve α, use it to interpolate the matrices (or 1/T)

1/T = (1 − α) · (1/T1) + α · (1/T2)
R = (1 − α) R1 + α R2,   B = (1 − α) B1 + α B2
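One way to carry out that interpolation in code; M1 and M2 are the pre-computed sensor-to-sRGB matrices for temperatures T1 and T2, and α is chosen so the interpolated result satisfies the gray-world constraint R = B:

```python
import numpy as np

def interpolated_awb_matrix(raw_avg, M1, M2):
    """Blend two pre-computed color matrices so the scene average comes out gray (R == B)."""
    R1, _, B1 = M1 @ raw_avg          # scene average in sRGB under the T1 matrix
    R2, _, B2 = M2 @ raw_avg          # ... and under the T2 matrix
    # Solve (1-a)*R1 + a*R2 == (1-a)*B1 + a*B2 for the interpolation weight a.
    alpha = (R1 - B1) / ((R1 - B1) - (R2 - B2))
    alpha = float(np.clip(alpha, 0.0, 1.0))
    return (1.0 - alpha) * M1 + alpha * M2
```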


Auto-exposure

Goal: well-exposed image (not a very well defined goal!)

Possible parameters to adjust


Exposure time
Longer exposure time leads to brighter image, but also motion blur
Aperture (f-number)
Larger aperture (smaller f-number) lets more light in causing the
image to be brighter, also makes depth of field shallower
Phone cameras often have fixed aperture
Analog and digital gain
Higher gain makes image brighter but amplifies noise as well
ND filters on some cameras
Exposure metering

Cumulative distribution function (CDF) of the image intensity values:
P percent of the image pixels have an intensity lower than Y.

[Plot: percentile (0–100) vs. intensity (0–1); the point (Y, P) is marked on the CDF.]
Exposure metering examples

Adjustment examples:
P = 0.995, Y = 0.9: at most 0.5% of pixels are saturated (protect highlights)
P = 0.1, Y = 0.1: at most 10% of pixels are under-exposed (protect shadows)
Auto-exposure sits somewhere in between, e.g., P = 0.9, Y = 0.4

[Example images: highlights / auto-exposure / shadows]
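A percentile check like this is easy to express with a quantile; the tolerance and the returned adjustment directions below are illustrative:

```python
import numpy as np

def metering_decision(intensity: np.ndarray, p: float = 0.90, y: float = 0.40,
                      tol: float = 0.05) -> str:
    """Compare the P-th percentile of intensity (values in [0, 1]) against the target Y."""
    actual_y = np.quantile(intensity, p)   # intensity below which a fraction p of the pixels lie
    if actual_y > y + tol:
        return "decrease exposure"         # too much of the image is bright
    if actual_y < y - tol:
        return "increase exposure"         # too much of the image is dark
    return "ok"
```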


JPEG Encoding

1. Transform RGB to YUV or YIQ and subsample color


2. DCT on 8x8 image blocks
3. Quantization
4. Zig-zag ordering and run-length encoding
5. Entropy coding
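Steps 2 and 3 for a single 8×8 luma block can be sketched as below; the quantization table is the commonly used baseline luminance table (roughly quality 50), and SciPy's dctn/idctn stand in for a real encoder's DCT:

```python
import numpy as np
from scipy.fft import dctn, idctn

# Baseline JPEG luminance quantization table: coarse steps for high frequencies,
# to which the eye is less sensitive.
Q_LUMA = np.array([
    [16, 11, 10, 16,  24,  40,  51,  61],
    [12, 12, 14, 19,  26,  58,  60,  55],
    [14, 13, 16, 24,  40,  57,  69,  56],
    [14, 17, 22, 29,  51,  87,  80,  62],
    [18, 22, 37, 56,  68, 109, 103,  77],
    [24, 35, 55, 64,  81, 104, 113,  92],
    [49, 64, 78, 87, 103, 121, 120, 101],
    [72, 92, 95, 98, 112, 100, 103,  99]], dtype=np.float64)

def encode_block(block: np.ndarray) -> np.ndarray:
    """Level-shift, 2-D DCT, and quantize one 8x8 luma block (steps 2-3)."""
    coeffs = dctn(block.astype(np.float64) - 128.0, norm="ortho")
    return np.round(coeffs / Q_LUMA).astype(np.int32)

def decode_block(quantized: np.ndarray) -> np.ndarray:
    """Dequantize and invert the DCT (what the decoder does)."""
    return idctn(quantized * Q_LUMA, norm="ortho") + 128.0
```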
Alternatives?

JPEG 2000
ISO, 2000
better compression, inherently hierarchical, random access, …
but much more complex than JPEG

JPEG XR
Microsoft, 2006; ISO / ITU-T, 2010
good compression, supports tiling (random access without
having to decode whole image), better color accuracy (incl.
HDR), transparency, compressed domain editing

But JPEG stays


too large an install base
Traditional camera APIs

Real image sensors are pipelined:
while one frame is exposing,
the next one is being prepared,
and the previous one is being read out.

Viewfinding / video mode:
pipelined, high frame rate
settings changes take effect sometime later

Still capture mode:
need to know which parameters were used
→ reset the pipeline between shots → slow

[Diagram: image sensor stages 1 Configure (exposure, frame rate), 2 Expose (gain, digital zoom), 3 Readout; imaging pipeline stages 4 Receive (format), 5 Demosaic (coefficients), 6 Color correction (white balance), …]
The FCam Architecture

A software architecture for programmable cameras


that attempts to expose the maximum device capabilities
while remaining easy to program
Sensor

A pipeline that converts requests into images


No global state
state travels in the requests through the pipeline
all parameters packed into the requests
Image Signal Processor (ISP)

Receives sensor data, and optionally transforms it


untransformed raw data must also be available
Computes helpful statistics
histograms, sharpness maps
Devices

Devices (like the Lens and Flash) can


schedule Actions
to be triggered at a given time into an exposure
Tag returned images with metadata
Everything is visible

Programmer has full control over sensor settings


and access to the supplemental statistics from ISP
No hidden daemon running autofocus/metering
nobody changes the settings under you
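To make the contrast with the stateful traditional API concrete, here is a hypothetical pseudo-code sketch of the request-based model (the real FCam API is C++ and its class and method names differ; everything below is illustrative):

```python
class Request:
    """All capture state travels in the request; the sensor itself keeps no global state."""
    def __init__(self, exposure_us, gain, resolution, actions=()):
        self.exposure_us = exposure_us
        self.gain = gain
        self.resolution = resolution
        self.actions = actions          # e.g. "fire the flash 5 ms into the exposure"

def capture_burst(sensor):
    # Queue two differently exposed frames back to back; each returned frame is
    # tagged with the exact parameters and statistics that produced it.
    sensor.queue(Request(exposure_us=10000, gain=1.0, resolution=(1920, 1080)))
    sensor.queue(Request(exposure_us=40000, gain=1.0, resolution=(1920, 1080)))
    return sensor.get_frame(), sensor.get_frame()
```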
Android Camera Architecture

http://source.android.com/devices/camera/index.html
HAL v3

You might also like