Learning Image Processing with OpenCV - Sample Chapter
José Luis Espinosa Aranda holds a PhD in computer science from the University
of Castilla-La Mancha. He was a finalist in the 2009 Certamen Universitario
Arquímedes de Introducción a la Investigación Científica in Spain for his final
degree project. His research interests involve computer vision, heuristic
algorithms, and operational research. He is currently working at the VISILAB
group as an assistant researcher and developer in computer vision topics.
Jesús Salido Tercero gained his electrical engineering degree and PhD (1996) from
Universidad Politécnica de Madrid (Spain). He then spent 2 years (1997 and 1998)
as a visiting scholar at the Robotics Institute (Carnegie Mellon University, Pittsburgh,
USA), working on cooperative multirobot systems. Since his return to the Spanish
University of Castilla-La Mancha, he has divided his time between teaching courses
on robotics and industrial informatics and researching vision and intelligent systems.
Over the last 3 years, his efforts have been directed toward developing vision
applications on mobile devices. He has coauthored a book on OpenCV programming
for mobile devices.
Ismael Serrano Gracia received his degree in computer science in 2012 from
the University of Castilla-La Mancha. He got the highest marks for his final degree
project on person detection, an application that uses depth cameras with OpenCV libraries.
Currently, he is a PhD candidate at the same university, holding a research grant from
the Spanish Ministry of Science and Research. He is also working at the VISILAB
group as an assistant researcher and developer on different computer vision topics.
Noelia Vállez Enano has liked computers since her childhood, though she didn't have
one before her mid-teens. In 2009, she finished her studies in computer science at the
University of Castilla-La Mancha, where she graduated with top honors. She started
working at the VISILAB group through a project on mammography CAD systems
and electronic health records. Since then, she has obtained a master's degree in physics
and mathematics and has enrolled for a PhD degree. Her work involves using image
processing and pattern recognition methods. She also likes teaching and working in
other areas of artificial intelligence.
Computational Photography
Computational photography refers to techniques that allow you to extend the
typical capabilities of digital photography. This may include hardware add-ons or
modifications, but it mostly refers to software-based techniques. These techniques
may produce output images that cannot be obtained with a "traditional" digital
camera. This chapter introduces some of the lesser-known techniques available in
OpenCV for computational photography: high-dynamic-range imaging, seamless
cloning, decolorization, and non-photorealistic rendering. All four are inside
the photo module of the library. Note that other techniques inside this module
(inpainting and denoising) have already been covered in previous chapters.
High-dynamic-range images
The typical images we process have 8 bits per pixel (bpp). Color images also use 8
bits to represent the value of each channel, that is, red, green, and blue. This means
that only 256 different intensity values are used. This 8 bpp limit has prevailed
throughout the history of digital imaging. However, it is obvious that light in nature
does not have only 256 different levels. We should, therefore, consider whether this
discretization is desirable or even sufficient. The human eye, for example, is known
to capture a much higher dynamic range (the span between the dimmest and
brightest light levels it can perceive), estimated at between 1 and 100 million light
levels. With only 256 light levels, bright areas may appear overexposed or
saturated, while dark scenes are simply captured as black.
There are cameras that can capture more than 8 bpp. However, the most common
way to create high-dynamic-range images is to use an 8 bpp camera and take images
with different exposure values. When we do this, the problems of a limited dynamic
range become evident. Consider, for example, the following figure:
The top-left image is mostly black, but window details are visible.
Conversely, the bottom-right image shows details of the room, but
the window details are barely visible.
We can take pictures with different exposure levels using modern smartphone
cameras. With iPhones and iPads, for example, as of iOS 8, it is very easy to change
the exposure with the native camera app. When you touch the screen, a yellow box
appears with a small sun beside it. Swiping up or down then changes the
exposure (see the following screenshot).
The range of exposure levels is quite large, so we may have to
repeat the swiping gesture a number of times.
If you use previous versions of iOS, you can download camera apps such as Camera+
that allow you to focus on a specific point and change exposure.
For Android, tons of camera apps are available on Google Play that can adjust the
exposure. One example is Camera FV-5, which has both free and paid versions.
If you use a handheld device to capture the images, make sure the
device is static. In fact, you may well use a tripod. Otherwise, images
with different exposures will not be aligned. Also, moving subjects will
inevitably produce ghost artifacts. Three images are sufficient for most
cases, with low, medium, and high exposure levels.
Smartphones and tablets are handy for capturing a number of images with different
exposures. To create HDR images, we need to know the exposure (or shutter) time
for each captured image (see the following section for the reason). Not all apps allow
you to control (or even see) this manually (the iOS 8 native app doesn't). At the time
of writing, at least two free apps allow this for iOS: Manually and ManualShot!
On Android, the free Camera FV-5 allows you to control and see exposure times. Note
that f-stop and ISO are two other parameters that control the exposure.
Images that are captured can be transferred to the development computer and used
to create the HDR image.
As of iOS 7, the native camera app has an HDR mode that automatically
captures three images in a rapid sequence, each with a different exposure.
These images are also automatically combined into a single (sometimes
better) image.
Example
OpenCV (as of version 3.0) provides functions to create HDR images from a set of
images taken with different exposures. There's even a tutorial example called
hdr_imaging, which reads a list of image files and exposure times (from a text file)
and creates the HDR image.
In order to run the hdr_imaging tutorial, you will need to download
the required image files and text files with the list. You can download
them from https://github.com/Itseez/opencv_extra/tree/master/testdata/cv/hdr.
#include <opencv2/photo.hpp>
#include <opencv2/highgui.hpp>
#include <iostream>
using namespace cv;
using namespace std;
int main(int, char** argv)
{
    // Load the three input images; the filenames here are placeholders
    // for the cup images supplied with the book's code...
    vector<Mat> images;
    images.push_back(imread("cup_1div66.jpg"));
    images.push_back(imread("cup_1div32.jpg"));
    images.push_back(imread("cup_1div12.jpg"));
    // Exposure times, in seconds, for each image...
    vector<float> times;
    times.push_back((float)1/66);
    times.push_back((float)1/32);
    times.push_back((float)1/12);
    // Estimate camera response...
    Mat response;
    Ptr<CalibrateDebevec> calibrate = createCalibrateDebevec();
    calibrate->process(images, response, times);
    // Show the estimated camera response function...
    cout << response;
    // Create and write the HDR image...
    Mat hdr;
    Ptr<MergeDebevec> merge_debevec = createMergeDebevec();
    merge_debevec->process(images, hdr, times, response);
    imwrite("hdr.hdr", hdr);
    cout << "\nDone. Press any key to exit...\n";
    waitKey(); // Wait for key press
    return 0;
}
The example uses three images of a cup (the images are available along with the
code accompanying this book). The images were taken with the ManualShot! app
mentioned previously, using exposures of 1/66, 1/32, and 1/12 seconds; refer to
the following figure:
Note that createCalibrateDebevec expects the images and exposure times in STL
vectors (the STL, or Standard Template Library, is the collection of common data
structures and algorithms available in standard C++). The camera response function
is given as 256 real values, representing the mapping between pixel values and
irradiance. Actually, it is a 256 x 3 matrix (one column for each of the three color
channels). The following figure shows you the response given by the example:
The cout part of code displays the matrix in the format used by
MATLAB and Octave, two widely used packages for numerical
computation. It is straightforward to copy the matrix in the output
and paste it in MATLAB/Octave in order to display it.
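It is also possible to dump the response in other formats directly from OpenCV. The
following lines (not part of the original example) print it as comma-separated values:
// Print the response as CSV; redirect the output to a file and plot it
// with any spreadsheet or plotting tool...
cout << format(response, Formatter::FMT_CSV) << endl;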
The resulting HDR image is stored in the lossless RGBE format. This image format
uses one byte per color channel plus one byte as a shared exponent. The format
uses the same principle as the one used in the floating-point number representation:
the shared exponent allows you to represent a much wider range of values. RGBE
images use the .hdr extension. Note that as it is a lossless image format, .hdr files
are relatively large. In this example, the RGB input images are 1224 x 1632 (100
to 200 KB each), while the output .hdr file occupies 5.9 MB.
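If we later want to load the .hdr file back, a plain imread call with the default flags
would return an 8 bpp image. A minimal sketch (assuming the hdr.hdr file written
by the example):
// Ask imread to keep the native depth and color; the result is a
// 32-bit floating-point, 3-channel image (CV_32FC3)...
Mat hdr2 = imread("hdr.hdr", IMREAD_ANYDEPTH | IMREAD_ANYCOLOR);
cout << "channels: " << hdr2.channels() << ", depth is CV_32F: "
     << (hdr2.depth() == CV_32F) << endl;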
The example uses Debevec and Malik's method, but OpenCV also provides another
calibration function based on Robertson's method. Both the calibration and merge
functions are available, namely createCalibrateRobertson and createMergeRobertson.
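The Robertson variant is used in the same way. A minimal sketch, assuming the
images and times vectors from the previous example:
// Robertson's method: same pipeline, different calibration/merge objects...
Mat responseR, hdrR;
Ptr<CalibrateRobertson> calibrateR = createCalibrateRobertson();
calibrateR->process(images, responseR, times);
Ptr<MergeRobertson> mergeR = createMergeRobertson();
mergeR->process(images, hdrR, times, responseR);
imwrite("hdr_robertson.hdr", hdrR);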
For more information on the other functions and the theory behind
them, refer to http://docs.opencv.org/trunk/modules/photo/doc/hdr_imaging.html.
Finally, note that the example does not display the resulting image. The HDR image
cannot be displayed on conventional screens, so we need to perform another step
called tone mapping.
Tone mapping
When high-dynamic-range images are to be displayed, information can be lost. This
is due to the fact that computer screens also have a limited contrast ratio, and printed
material is also typically limited to 256 tones. When we have a high-dynamic-range
image, it is necessary to map the intensities to a limited set of values. This is called
tone mapping.
Simply scaling the HDR image values down to the reduced range of the display
device is not sufficient to provide a realistic output. Scaling typically produces
images that appear to lack detail (contrast), washing out the original scene content.
Ultimately, tone-mapping algorithms aim at providing outputs that appear visually
similar to the original scene (that is, similar to what a human would see when
viewing the scene). Various tone-mapping algorithms have been proposed and it
is still a matter of extensive research. The following lines of code can apply tone
mapping to the HDR image obtained in the previous example:
Mat ldr;
Ptr<TonemapDurand> tonemap = createTonemapDurand(2.2f);
tonemap->process(hdr, ldr);     // ldr is a floating-point image with
                                // values in the interval [0..1]
ldr.convertTo(ldr, CV_8U, 255); // scale to [0..255] for display
imshow("LDR", ldr);
The method was proposed by Durand and Dorsey in 2002. The constructor actually
accepts a number of parameters that affect the output. The following figure shows
you the output. Note how this image is not necessarily better than any of the three
original images:
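Durand's is not the only operator available: OpenCV 3 also includes
createTonemapDrago, createTonemapReinhard, and createTonemapMantiuk. A minimal
sketch with Reinhard's operator, reusing the hdr image from the example (the gamma
value of 2.2 is just an illustrative choice):
// Reinhard's tone-mapping operator; the output is again a
// floating-point image with values in [0..1]...
Mat ldr2;
Ptr<TonemapReinhard> tonemapR = createTonemapReinhard(2.2f);
tonemapR->process(hdr, ldr2);
ldr2.convertTo(ldr2, CV_8U, 255); // scale to [0..255] for display
imshow("LDR (Reinhard)", ldr2);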
Alignment
The scene that will be captured with multiple exposure images must be static. The
camera must also be static. Even if both conditions are met, it is advisable to
perform an alignment procedure.
OpenCV provides an algorithm for image alignment proposed by G. Ward in 2003.
The main function, createAlignMTB, takes an input parameter that defines the
maximum shift (actually, the base-two logarithm of the maximum shift in each
dimension). The following lines should be inserted right before estimating the
camera response function in the previous example:
vector<Mat> images_(images);             // copy of the unaligned inputs
Ptr<AlignMTB> align = createAlignMTB(4); // 4 = max 16 pixel shift
align->process(images_, images);         // aligned images replace 'images'
Exposure fusion
We can also combine images with multiple exposures using neither camera response
calibration (and its exposure times) nor an intermediate HDR image. This is called
exposure fusion. The method was proposed by Mertens et al. in 2007. The following
lines perform exposure fusion (images is the STL vector of input images; refer to the
previous example):
Mat fusion;
Ptr<MergeMertens> merge_mertens = createMergeMertens();
merge_mertens->process(images, fusion); // fusion is a floating-point
                                        // image with values in [0..1]
fusion = fusion*255;
imwrite("fusion.png", fusion);
Exposure fusion
Seamless cloning
In photomontages, we typically want to cut an object or person out of a source image
and insert it into a target image. Of course, this can be done in a straightforward way
by simply pasting the object. However, that would not produce a realistic effect. See,
for example, the following figure, in which we wanted to insert the boat in the top half
of the image into the sea at the bottom half of the image:
Cloning
As of OpenCV 3, a seamless cloning function is available that produces more realistic
results. The function is called seamlessClone and it uses a method proposed by
Pérez and Gangnet in 2003. The following seamlessCloning example shows you
how it can be used:
#include <opencv2/photo.hpp>
#include <opencv2/highgui.hpp>
#include <iostream>
using namespace cv;
using namespace std;
int main(int, char** argv)
{
    // Load and show images...
    Mat source = imread("source1.png", IMREAD_COLOR);
    Mat destination = imread("destination1.png", IMREAD_COLOR);
    Mat mask = imread("mask.png", IMREAD_COLOR);
    imshow("source", source);
    imshow("mask", mask);
    imshow("destination", destination);

    // p is the point in the destination where the center of the source
    // will be placed (near the top-right corner here)
    Point p;
    p.x = 2*destination.size().width/3;
    p.y = destination.size().height/4;

    Mat result;
    seamlessClone(source, destination, mask, p, result, NORMAL_CLONE);
    imshow("result", result);

    cout << "\nDone. Press any key to exit...\n";
    waitKey(); // Wait for key press
    return 0;
}
Seamless cloning
The last parameter of seamlessClone selects the exact method to be used (three
methods are available, NORMAL_CLONE, MIXED_CLONE, and MONOCHROME_TRANSFER, each
producing a different final effect). The library also provides the following related
functions:

Function             Effect
colorChange          Multiplies each of the three color channels of the
                     source image by a factor, applying the multiplication
                     only in the region given by the mask.
illuminationChange   Changes the apparent illumination of the region given
                     by the mask.
textureFlattening    Washes out the texture of the region given by the
                     mask, retaining only the gradients at its edges.
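As an illustration, the following sketch applies colorChange to the source and mask
images of the seamlessCloning example (the multiplier values are arbitrary choices,
not from the example):
// Boost the red channel and attenuate green and blue, but only inside
// the region given by the mask...
Mat recolored;
colorChange(source, mask, recolored, 2.5f, 0.5f, 0.5f);
imshow("colorChange", recolored);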
Decolorization
Decolorization is the process of converting a color image to grayscale. Given this
definition, the reader may well ask, don't we already have grayscale conversion?
Yes, grayscale conversion is a basic routine in OpenCV and any image-processing
library. The standard conversion is based on a linear combination of the R, G, and
B channels. The problem is that such a conversion may produce images in which
contrast in the original image is lost. The reason is that two different colors (which
are perceived as contrasts in the original image) may end up being mapped to the
same grayscale value. Consider the conversion of two colors, A and B, to grayscale.
Let's suppose that B is a variation of A in the R and G channels:
A = (R,G,B)
=> G = (R+G+B)/3
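A tiny check of this effect in code (the channel values are arbitrary; note that
OpenCV stores pixels in B, G, R order):
#include <opencv2/core.hpp>
#include <iostream>
using namespace cv;
using namespace std;
int main()
{
    Vec3b A(150, 120, 90);  // (B, G, R)
    Vec3b B(150, 90, 120);  // G and R differ: a visibly different color
    int grayA = (A[0] + A[1] + A[2]) / 3; // 120
    int grayB = (B[0] + B[1] + B[2]) / 3; // 120: the contrast is lost
    cout << "grayA=" << grayA << " grayB=" << grayB << endl;
    return 0;
}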
The key call is decolor; the first half of the example listing loads color_image_3.png
into source and shows a standard grayscale conversion for comparison:
Mat decolorized;
Mat dummy = Mat(source.size(), CV_8UC3); // receives decolor's color-boost output
decolor(source, decolorized, dummy);
imshow("decolorized", decolorized);
cout << "\nDone. Press any key to exit...\n";
waitKey(); // Wait for key press
return 0;
}
The example is straightforward. After reading the image and showing the result
of a standard grayscale conversion, it uses the decolor function to perform the
decolorization. The image used (the color_image_3.png file) is included in the
opencv_extra repository at https://github.com/Itseez/opencv_extra/tree/master/testdata/cv/decolor.
The image used in the example is actually an extreme case. Its
colors have been chosen so that the standard grayscale output is
fairly homogeneous.
Non-photorealistic rendering
As part of the photo module, four functions are available that transform an input
image in a way that produces a non-realistic but still artistic output. The functions
are very easy to use and a nice example is included with OpenCV (npr_demo). For
illustrative purposes, here we show you a table that allows you to grasp the effect
of each function. Take a look at the following fruits.jpg input image, included
with OpenCV:
Function              Effect
edgePreservingFilter  Smoothing is a handy and frequently used filter. This
                      function performs smoothing while preserving object
                      edge details.
detailEnhance         Enhances the details of the image.
pencilSketch          A pencil-like line drawing version of the image (both
                      grayscale and color outputs are produced).
stylization           Watercolor effect.
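As a quick reference, the following sketch (with illustrative default parameters;
it is not the npr_demo example itself) applies the four functions to fruits.jpg:
#include <opencv2/photo.hpp>
#include <opencv2/highgui.hpp>
using namespace cv;
int main()
{
    Mat src = imread("fruits.jpg", IMREAD_COLOR);
    Mat smoothed, enhanced, sketchGray, sketchColor, stylized;
    edgePreservingFilter(src, smoothed);        // edge-aware smoothing
    detailEnhance(src, enhanced);               // detail enhancement
    pencilSketch(src, sketchGray, sketchColor); // gray and color sketches
    stylization(src, stylized);                 // watercolor-like effect
    imshow("stylization", stylized);
    waitKey();
    return 0;
}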
Summary
In this chapter, you learned what computational photography is and the related
functions available in OpenCV 3. We explained the most important functions within
the photo module, but note that other functions of this module (inpainting and noise
reduction) were also considered in previous chapters. Computational photography
is a rapidly expanding field, with strong ties to computer graphics. Therefore, this
module of OpenCV is expected to grow in future versions.
The next chapter will be devoted to an important aspect that we have not yet
considered: time. Many of the functions explained take a significant time to compute
the results. The next chapter will show you how to deal with that using modern
hardware.