Ashwin Pajankar - Raspberry Pi Image Processing Programming - With NumPy, SciPy, Matplotlib, and OpenCV-Apress (2022)
Raspberry Pi Image Processing Programming
With NumPy, SciPy, Matplotlib,
and OpenCV
Second Edition
Ashwin Pajankar
Raspberry Pi Image Processing Programming
Ashwin Pajankar
Nashik, Maharashtra, India
Table of Contents
Chapter 8: Filters ........................................ 145
    Kernels, Convolution, and Correlation ................. 146
    Low-Pass Filters ...................................... 150
        Blurring .......................................... 150
        Noise Reduction ................................... 156
    High-Pass Filters ..................................... 161
    Fourier Filters ....................................... 165
    Summary ............................................... 167
Appendix .................................................. 223
    pgmagick Image Processing ............................. 223
    Connecting a Display .................................. 225
        Using a VGA Display ............................... 225
        Booting Up After Connecting a Display ............. 226
    Connecting to Ethernet/Wired Network .................. 234
    Remote Desktop with VNC ............................... 235
Index ..................................................... 241
About the Author
Ashwin Pajankar earned a master of technology in computer science
engineering from IIIT Hyderabad and has over 25 years of experience
in the area of programming. He started his journey in programming
and electronics at the tender age of seven with an MS-DOS computer
and the BASIC programming language. He is now proficient in assembly
language, C, C++, Java, shell scripting, JavaScript, Go, HTML, and
Python. His other technical expertise includes single-board computers
such as the Raspberry Pi and Banana Pro, microcontroller boards such as
the Arduino, and embedded boards such as the BBC micro:bit. He has
extensively worked on domains such as software/product testing, software
automation, databases, data analytics and visualization, computer vision,
and web development.
He is currently a freelance online instructor teaching programming
and electronics to more than 82,000 professionals. He also regularly
conducts live programming bootcamps for software professionals.
His growing YouTube channel has an audience of more than 11,000
subscribers. He has published more than 20 books on programming and
electronics with many international publishers and is writing more books
with Apress. He also regularly reviews books on the topics of programming
and electronics written by other authors.
Apart from his work in the area of technology, he is active in the
community as a leader and volunteer for many social causes. He has
won several awards at his university (IIIT Hyderabad) and at past
workplaces for his leadership in community service, uplifting the
underprivileged through education, skill-based training, and employment.
About the Technical Reviewer
Lentin Joseph is an author, roboticist, and
robotics entrepreneur from India. He runs
a robotics software company called Qbotics
Labs in Kochi, Kerala. He has ten years of
experience in the robotics domain, primarily
in the Robot Operating System, OpenCV,
and PCL.
He has authored ten books on ROS, namely,
Learning Robotics Using Python, first and
second editions; Mastering ROS for Robotics
Programming, first and second editions;
ROS Robotics Projects, first and second
editions; ROS Programming: Building Powerful Robots; and Robot
Operating System (ROS) for Absolute Beginners, first and second editions.
He is also co-editor of the book Autonomous Driving and Advanced
Driver-Assistance Systems (ADAS): Applications, Development, Legal Issues,
and Testing.
He obtained his master's in robotics and automation in India and has
also worked at the Robotics Institute, Carnegie Mellon University, USA. He
is also a TEDx speaker.
Acknowledgments
I would like to thank my friend Anuradha for encouraging me to share
my knowledge with the world. I would like to thank my longtime mentors
from Apress, Celestin and Aditee, for giving me an opportunity to share
my knowledge and experience with readers. I thank Mark Powers and
James Markham for helping me to shape this book as per Apress standards.
I am indebted to the technical reviewer, Lentin Joseph. I also thank Prof.
Govindrajulu Sir’s family, Srinivas (son) and Amy (daughter-in-law),
for allowing me to dedicate this book to his memory and sharing with
us Govindrajulu Sir’s biographical information and his photograph for
publication. I would also like to thank all the people associated with Apress
who have been instrumental in bringing this project to reality.
Introduction
I want to keep this introduction short, concise, and precise. I have been
working in the domain of image processing for quite a while now. I
was introduced to Python more than 15 years ago. When I first worked
with image processing using Raspberry Pi, I found it tedious to comb
through all the literature available as printed books, video tutorials, and
online tutorials, as most lacked a step-by-step yet comprehensive guide for
a beginner getting started. It was then that I resolved to write a book, and
I published the first edition of this book. It has been almost five years
since the first edition was published, and it needed a lot of updates.
This book is the fruit of my efforts and the cumulative experience of
thousands of hours (apart from those spent writing the book itself):
going through technical documentation, watching training videos,
writing code with the help of different tools, debugging faulty code
snippets, posting questions and participating in discussions on various
question-answer and technical forums online, and referring to various
code repositories for direction. I have written this edition of the book in
such a way that beginners will find it easy to understand the topics. This
book has hundreds of code examples and images (of outputs of code
execution and screenshots) so readers can understand each concept
perfectly. All the code examples are adequately explained.
The book begins with a general discussion of Raspberry Pi and
Python programming. It is followed by a discussion of concepts related
to the domain of image processing. Then, it explores the libraries Pillow
and TKinter. The following few chapters focus on the scientific Python
ecosystem and image processing with libraries such as NumPy, SciPy,
and Matplotlib. The last chapter discusses how OpenCV can be used for
image processing.
CHAPTER 1
Introduction to Single-Board Computers and Raspberry Pi
Let’s start this exciting journey of exploring the scientific domain of digital
image processing with Raspberry Pi. To begin, we must be comfortable
with the basics of single-board computers (SBCs) and the Raspberry Pi.
This chapter discusses the definition, history, and philosophy of SBCs. It
compares SBCs to regular computers, then moves toward the most popular
and bestselling SBC of all time, the Raspberry Pi. By the end of this chapter,
we will be comfortable with the basic concepts of SBCs and have adequate
knowledge to independently set up our own Raspberry Pi.
• Pico-ITX
• PXI
• Qseven
• VMEbus
• VPX
• VXI
• AdvancedTCA
• CompactPCI
• Mini-ITX
• PC/104
• PICMG
History of SBCs
Dyna-Micro was the first true SBC. It was based on the Intel C8080A and
used Intel’s first EPROM, the C1702A. The Dyna-Micro was rebranded
and marketed by E&L Instruments of Derby, Connecticut, in 1976 as
the MMD-1 (Mini-Micro Designer 1). It became famous as the leading
example of a microcomputer. SBCs were very popular in the earlier days of
computing, as many home computers were actually SBCs. However, with
the rise of PCs, the popularity of SBCs declined. Since 2010, there has been
a resurgence in the popularity of SBCs due to their lower production costs.
Apart from the MMD-1, here are a few other popular historical SBCs:
• CubieBoard
• BeagleBone
The Raspberry Pi
The Raspberry Pi is a family of credit card–sized SBCs developed in the United
Kingdom by the Raspberry Pi Foundation. The Raspberry Pi Foundation was
formed by Eben Upton in 2009. The aim in developing the Raspberry Pi was
to promote the teaching of basic computer science in schools and developing
countries by providing a low-cost computing platform.
Raspberry Pi Foundation’s Raspberry Pi was released in 2012. It was
a massive hit and sold over two million units in two years. Subsequently,
the Raspberry Pi Foundation released revised versions of the Raspberry Pi.
They also released other accessories for the Pi.
More information about the Raspberry Pi Foundation can be found on
their website at https://www.raspberrypi.org.
The product page for the Raspberry Pi's current production models
and other accessories can be found at https://www.raspberrypi.org/products.
I have written, executed, and tested all the code examples in this
book on Raspberry Pi 4 Model B units with 8GB RAM. Table 1-2 lists the
specifications of the Raspberry Pi 4 Model B.
Table 1-2. Raspberry Pi 4 Model B Specifications

Architecture:      ARMv8
SoC:               Broadcom BCM2711
CPU:               Quad-core Cortex-A72 (ARM v8) 64-bit SoC @ 1.5GHz
GPU:               Broadcom VideoCore VI
Memory:            2GB, 4GB, or 8GB LPDDR4-3200 SDRAM (depending on model)
USB:               2 × USB 3.0 ports, 2 × USB 2.0 ports
Video output:      2 × micro-HDMI ports (up to 4Kp60 supported), 2-lane MIPI DSI display port
On-board storage:  microSDHC slot
On-board network:  2.4GHz and 5.0GHz IEEE 802.11ac wireless, Bluetooth 5.0 BLE, Gigabit Ethernet
Power source:      5V/3A DC via USB-C connector
The image clearly shows all the important components of the board:
the USB ports, Ethernet port, micro-HDMI ports, 3.5mm audio jack,
USB Type-C power port, and the CSI and DSI connectors.
Figure 1-2 shows the top view of the board.
1. Image provided by Laserlicht under a CC BY-SA 4.0 license (https://creativecommons.org/licenses/by-sa/4.0/deed.en).
We can see the microSD card slot clearly in the bottom view
(Figure 1-3).
2. Image provided by Laserlicht under a CC BY-SA 4.0 license (https://creativecommons.org/licenses/by-sa/4.0/deed.en).
Raspberry Pi Setup
We have to set up Raspberry Pi before we can use it for exploration and
adventure. As mentioned earlier, I am using a Raspberry Pi 4 Model B for
this setup. Make sure that you use a model of RPi board that has built-in
Wi-Fi. The setup process is almost the same for all models.
We are going to use the Raspberry Pi board in headless mode. This
means that we will not connect any keyboard, mouse, or display. We will
just connect it to the Wi-Fi network and access it remotely. So, we are not
going to need a lot of hardware for this. Apart from the RPi board itself, we
will need the following things:
We can procure all these things and the RPi boards at online
marketplaces or local hobby electronics stores.
3. Image provided by Alen under a CC0 1.0 Universal (CC0 1.0) license (https://creativecommons.org/publicdomain/zero/1.0/).
country=IN
ctrl_interface=DIR=/var/run/wpa_supplicant GROUP=netdev
update_config=1
network={
    scan_ssid=1
    ssid="TP-Link_710E"
    psk="internet1"
}
Booting Up Raspberry Pi
This is the easiest part. Follow these steps:
We can see the lights on the RPi board blinking at this point. This
means that the RPi is booting up. Wait for a couple of minutes for the boot
process to complete.
Since we enabled SSH and provided the Wi-Fi settings before booting up,
the RPi connects to the Wi-Fi network during the boot process.
There are many ways to find out the IP address. If you are part of an
organization (workplace, research lab, or university), check with your
network/system administrator to find out the IP address of the RPi board.
If it is your personal Wi-Fi, then on all the UNIX-like systems (Linux,
BSD, macOS), you can run a command to find out the IP address. On
Debian and derivatives (Ubuntu, Raspberry Pi OS, etc.), we can install the
nmap utility with the following command:

sudo apt install nmap

This utility (nmap) scans the network. The command to find all
the IP addresses connected to the network is as follows:

nmap -sn 192.168.0.*
You can find the IP address of your own system with the commands
ipconfig (Windows) or ifconfig (UNIX-like systems). From this list, we
can eliminate all the known devices with known IP addresses. If there are
too many devices attached to the home network, then turn off the Wi-Fi of
all the unnecessary devices. In my case, the IP address of the RPi board is
192.168.0.100.
On a Windows computer, we can install the Zenmap utility, which is
the graphical front end for nmap. We can download and install it from
https://nmap.org/zenmap/. It is also available for UNIX-like systems.
Install the utility and open it. It looks as shown in Figure 1-13.
It is easy to use. Enter 192.168.0.* in the Target text box and the
command we used earlier (nmap -sn 192.168.0.*) in the Command text
box, then click the Scan button. It takes some
time for the scan to finish. Sometimes the scan fails prematurely. In such
cases, keep trying, and it will work after a couple of attempts. It is all open
source and free, so we cannot complain. It produces the result shown
in Figure 1-13 once finished. Sometimes, it also shows the names of the
manufacturers of the connected devices beside their MAC addresses. This
makes it easy to identify the RPi board.
Once we have the IP address of the RPi board, we can log in remotely.
We can use a variety of SSH tools, such as the built-in ssh utility on
UNIX-like operating systems, to remotely log in to the RPi board. Just run
the following command in the terminal emulator of your operating system:

ssh pi@192.168.0.100
Keep in mind that this File Explorer utility is running on the RPi.
We are accessing its GUI using X-11 Forwarding. This way, with the
combination of nohup (no hangup) and &, we can launch any GUI utility
from the terminal. The & ensures that control of the terminal is returned
to us, and nohup keeps the process running even after the user logs out.
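As a concrete sketch of this pattern (using sleep as a stand-in for a GUI program, since any command behaves the same way under nohup and &):

```shell
# Launch a command with nohup (survives logout) and & (returns the
# prompt immediately). "sleep" stands in for a GUI program here.
nohup sleep 1 >/tmp/nohup-demo.log 2>&1 &
BG_PID=$!
echo "prompt is back; background PID is $BG_PID"
wait "$BG_PID"   # for the demo only; normally you would not wait
STATUS=$?
echo "background job exited with status $STATUS"
```

In the File Explorer case from the text, you would replace `sleep 1` with the file manager's command name.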
Configuring Raspberry Pi
Now, type sudo raspi-config at the prompt and press Enter. This opens
raspi-config, the configuration tool for Raspberry Pi OS. First, update it,
as shown in Figure 1-16.
It will take some time to update. Once updated, go to the fifth option,
Localization (Figure 1-17).
Set all these options as per your choice, and in the main menu choose
Finish. It will ask to reboot. Choose Yes, and it will reboot the RPi.
The Raspberry Pi OS
An operating system is the set of basic programs and utilities that make
a computer work. It is an interface between the user and the computer.
Raspberry Pi OS is a free operating system based on the popular Linux
distribution Debian. It is optimized for the Raspberry Pi family of SBCs
and has even been ported to other, similar SBCs, such as the Banana Pro.
Updating the RPi
The Raspberry Pi must be connected to the internet in order to update it
successfully. Let’s update the firmware and the Raspberry Pi OS.
Updating the Firmware
To update the firmware, run the command sudo rpi-update.
To update the Raspberry Pi OS, run sudo apt update followed by
sudo apt full-upgrade. This fetches new versions of the packages marked
for upgrade onto the local machine, detects and installs any
dependencies, and removes obsolete packages.
Doing this regularly will keep the Raspberry Pi OS up to date. After
entering these commands, it will take a while to update the OS, because
these commands fetch the data and the packages from remote repositories
on the internet.
Note The command apt help will list all the options associated
with the APT utility.
To shut down the RPi, use either of the following commands:

sudo shutdown -h now
sudo init 0

To reboot it, use either of the following commands:

sudo reboot
sudo init 6
Summary
In this chapter, we learned how to set up and access RPi in headless mode
over Wi-Fi. We are comfortable with this part now.
The next chapter will focus on the concepts involved with digital
image processing. We will also write basic programs with Python 3.
CHAPTER 2
Introduction to Python and Digital Image Processing
In the last chapter, we explored the amazing world of single-board
computers and Raspberry Pi. We booted up the Raspberry Pi, connected it
to the internet, and updated the Raspberry Pi OS.
In this chapter, we will get started with Python and the concepts
of digital image processing (DIP). Let’s begin with an introduction to
Python. I personally find Python amazing and have been enchanted
by it. Python is a simple yet powerful programming language. When
programmers use Python, it is easy to focus on solving a given problem,
as they do not have to worry much about syntax. Python perfectly fits the
philosophy of Raspberry Pi, which is programming for everyone. That is
why it is the most popular programming language for the Raspberry Pi
and many other platforms.
The following is a list of topics we will learn in this chapter:
• History of Python
• Features of Python
• Python 3
• Readability counts.
Features of Python
The following are the features of Python that have made it popular and
beloved in the programming community.
Simple
Python is a simple language with a minimalist approach. Reading a
well-written Python program feels almost like reading English text.
Easy to Learn
Due to its simple and English-like syntax, Python is extremely easy to
learn. That is the prime reason why it is taught as the first programming
language to high school and university students who take introductory
programming courses. An entire new generation of programmers is
learning Python as their first programming language.
Easy to Read
Unlike other high-level programming languages, Python does not
obfuscate the code and make it unreadable. The English-like structure of
the Python code makes it easier to read compared to code written in other
programming languages. This makes it easier to understand and easier to
learn compared to other high-level languages like C and C++.
Easy to Maintain
As Python code is easy to read, easy to understand, and easy to learn,
anyone maintaining the code becomes comfortable with the codebase
very quickly. I can vouch for this from my personal experience maintaining
and enhancing large legacy codebases that were written in a combination
of Bash and Python 2.
Open Source
Python is an open source project, which means its source code is
freely available. You can make changes to it to suit your needs and use the
original and modified code in your applications.
High-Level Language
While writing Python programs, you do not have to manage low-level
details like memory management, CPU timing, and process scheduling.
All these tasks are managed by the Python interpreter. You can write the
code directly in easy-to-understand, English-like syntax.
Portable
The Python interpreter has been ported to many OS platforms. Python
code is also portable. All the Python programs will work on the supported
platform without requiring many changes if you are careful to avoid
system-dependent coding.
You can use Python on GNU/Linux, Windows, Android, FreeBSD, Mac
OS, iOS, Solaris, OS/2, Amiga, Palm OS, QNX, VMS, AROS, AS/400, BeOS,
OS/390, z/OS, Psion, Acorn, PlayStation, Sharp Zaurus, RISC OS, VxWorks,
Windows CE, and PocketPC.
Interpreted
Python is an interpreted language. Let’s take a look at what that means.
Programs written in compiled languages like C, C++, and Java are
translated first into an intermediate or machine-level format. When we
run the program, this compiled format is loaded from secondary storage
(i.e., the hard disk) into memory (RAM) by the linker/loader.
So, C, C++, and Java have a separate compiler and linker/loader. This
is not the case with Python. Python runs the program directly from the
source code. You do not have to bother compiling and linking to the proper
libraries. This makes Python programs truly portable: you can copy a
program from one computer to another, and it runs fine as long as the
necessary libraries are installed on the target computer.
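To make this concrete, Python's standard dis module lets us peek at the bytecode the interpreter compiles and runs behind the scenes. This is a small illustrative sketch, not from the original text:

```python
import dis

def greet():
    # Ordinary source code; the interpreter compiles this to bytecode
    # transparently, with no separate compile/link step for us.
    return "Hello World!"

dis.dis(greet)   # show the bytecode the interpreter executes
print(greet())
```

Running it prints the bytecode instructions for greet() followed by the greeting itself.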
Object Oriented
Python supports procedure-oriented programming as well as object-
oriented programming paradigms.
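A brief sketch of the two paradigms side by side (the example names are my own, not from the book):

```python
# Procedural style: a plain function operating on its arguments.
def rectangle_area(width, height):
    return width * height

# Object-oriented style: data and behavior bundled in a class.
class Rectangle:
    def __init__(self, width, height):
        self.width = width
        self.height = height

    def area(self):
        return self.width * self.height

print(rectangle_area(3, 4))    # 12
print(Rectangle(3, 4).area())  # 12
```

Both compute the same result; the choice of style is up to the programmer.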
Memory Management
In assembly language and in programming languages like C and C++,
memory management is the programmer's responsibility, in addition to
the task at hand, which creates an unnecessary burden. In Python, the
interpreter takes care of memory
management. This helps the programmers steer clear of memory issues
and focus on the task at hand.
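We can observe this automatic management with the standard sys module. This is an illustrative sketch; the exact reference counts are CPython implementation details:

```python
import sys

data = [0] * 1000   # the interpreter allocates the list for us
alias = data        # a second reference to the same object

# getrefcount() reports at least 3 here: data, alias, and the
# temporary reference held by the function call itself.
print(sys.getrefcount(data))

del alias           # drop one reference; the list lives on
print(len(data))    # still accessible through "data"
del data            # last reference gone; CPython frees the memory
```

No free() or delete call is ever needed; dropping the last reference is enough.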
Powerful
Python has everything in it that a modern programming language
needs. It is used in applications such as computer vision,
supercomputing, drug discovery, scientific computing, simulation, and
bioinformatics. Millions of programmers around the world use Python.
Many big organizations like NASA, Google, SpaceX, and Cisco use Python
for their applications and infrastructure.
Community Support
I find this to be the most appealing feature of Python. Recall that
Python is open source. It also has a community of almost a million
programmers throughout the world (probably more, as today high school
kids are learning Python too). There are also plenty of forums on the
internet to support programmers who encounter a roadblock. None of my
queries related to Python have ever gone unanswered.
Python 3
Python 3 was released in 2008. The Python development team decided
to do away with some of the redundant features of Python, simplify some
more features, rectify some design flaws, and add a few much-needed
features.
It was decided that a major revision number was needed for this and
that the resultant release would not be backward compatible. Python
2.x and 3.x were supposed to coexist in parallel so the programmer
community would have enough time to migrate their code and third-party
libraries from 2.x to 3.x. Python 2.x code cannot be run as-is in most cases,
as there are significant differences between 2.x and 3.x.
You can check the installed version of Python 3 by running either of the
following commands:

python3 -V
python3 --version
You can check the location of the Python 3 binary by running the
following command:
which python3
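The same checks can be done from inside Python itself, using the standard sys module (a small sketch):

```python
import sys

# Equivalent of `python3 -V`: the running interpreter's version info.
print(sys.version_info)
assert sys.version_info.major == 3, "the examples in this book assume Python 3"

# Equivalent of `which python3`: the path of the running interpreter.
print(sys.executable)
```

This is handy inside scripts that should refuse to run under Python 2.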
Interactive Mode
Python’s interactive mode is a command-line shell. It provides immediate
output for every executed statement. It also stores the output of previously
executed statements in the active memory. As new statements are
executed by the Python interpreter, the entire sequence of previously
executed statements remains available in the session.
To start the interactive mode, run python3 in the terminal:

pi@raspberrypi:~ $ python3
Python 3.4.2 (default, Oct 19 2014, 13:31:11)
[GCC 4.9.1] on linux
Type "help", "copyright", "credits" or "license" for more
information.
>>>

To leave the interactive mode, call exit():

>>> exit()
pi@raspberrypi:~ $
Script Mode
Script mode is where the Python script files (.py) are executed by the
Python interpreter.
Create a file called test.py and add the statement print('Hello World!')
to it. Save the file and run it with the Python 3 interpreter
as follows:

python3 test.py
IDEs for Python
An integrated development environment (IDE) is a software suite that has
all the basic tools needed to write and test programs. A typical IDE has
a compiler, a debugger, a code editor, and a build automation tool. Most
programming languages have various IDEs to make programmers' lives
easier. Python, too, has many IDEs. Let's look at a few of them.
IDLE
IDLE stands for Integrated DeveLopment Environment; IDLE3 is the
version for Python 3. It is popular with Python beginners, and we have to
install it on Raspberry Pi OS with the following command:

sudo apt install idle3

Figure 2-1 shows the IDLE3 code editor and an interactive prompt.
Geany
Geany is a text editor that uses the GTK+ toolkit and has the basic features
of an integrated development environment. It supports many file types
and programming languages. It has some nice features. Check out
https://www.geany.org for more details. Figure 2-3 shows the Geany text
editor window.
Type print("Hello World!") in the code editor and save the file in
the /home/pi directory as test.py. Click Build in the menu bar and then
choose Execute. You can also use the F5 keyboard shortcut to execute
the program. The program will execute in a terminal window. You have to
press Enter to close the execution window. The default Python interpreter
for Geany is Python 2. We will need to change it to Python 3. To do that,
go to Build ➤ Set Build Commands. The window shown in Figure 2-4
will appear.
Thonny IDE
The Thonny IDE also comes installed with the latest editions of Raspberry
Pi OS. It works with Python 3 by default (and only with Python 3, so we
do not have to change any settings). We can invoke it with the following
command:

thonny
Signal Processing
Anything that carries any information, when represented mathematically,
is called a signal. The process or technique used to extract useful and
relevant information from a given signal is known as signal processing. The
system that does this type of task is known as a signal processing system.
The best example of a signal processing system is the human brain. It
processes various types of signals from various senses. The human brain
is a biological signal processing system. When the system is made up of
electronic components, it is known as an electronic signal processing
system. Signal processing is a discipline that combines mathematics and
electrical engineering.
There are two types of electronic signals—analog and digital. The
following table lists the differences between these two types of signals.
Analog Digital
Image Processing
An image is a signal. So, image processing is a type of signal processing.
Image processing systems are types of signal processing systems. The
combination of the eye and brain is an example of a biological image
processing system. There are two types of image processing systems—
analog and digital.
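To make the idea of a digital image concrete before we meet real libraries, an image can be modeled as a plain grid of discrete intensity values. This is a toy sketch of my own; later chapters use NumPy arrays for this:

```python
# A tiny 3x3 grayscale "image": 0 is black, 255 is white.
image = [
    [0,   128, 255],
    [128, 255, 128],
    [255, 128, 0],
]

# A minimal digital image processing step: invert the intensities
# to produce the photographic negative.
negative = [[255 - pixel for pixel in row] for row in image]

print(negative)
```

Every operation in this book, however sophisticated, ultimately manipulates grids of numbers like this one.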
The domain of digital image processing is related to the following areas:
• Signal processing
• Digital electronics
• Computer/machine vision
• Biological vision
• Artificial intelligence, pattern recognition, and machine
learning
Exercise
Complete the following exercise to better understand the background
of Python 3:
• Visit and explore the Python home page at https://www.python.org.
• Visit and explore the Python documentation page at https://docs.python.org/3/.
• Check the new features of the latest releases of Python at https://docs.python.org/3/whatsnew/index.html.
Summary
In this chapter, we learned about the background, history, and features
of Python. We also studied the important differences between Python 2.x
and Python 3.x. We learned to use Python 3 in script and interactive
modes. We looked at a few popular IDEs for Python and configured Geany
for Python 3 on the Pi. Finally, we learned about the field of digital image
processing. In the next chapter, we will get started with a popular digital
image processing library in Python called Pillow. We will also learn how to
use the Tkinter library to show images.
CHAPTER 3
Getting Started
In the previous chapter, we learned about the philosophy of Python. We
also learned why we should use Python 3, as well as several concepts
related to digital image processing. This chapter will look at image
processing programming using Python 3 on the Raspberry Pi. We will
learn how to connect a Raspberry Pi to various camera sensors in order to
acquire images. We will be introduced to the Pillow and Tkinter libraries in
Python 3. The following is a list of topics we will learn in this chapter:
• Image sources
• Using a webcam
Image Sources
To learn how to do digital image processing, we are going to need digital
images. There are standard image datasets that are used all around the
world; we will use a couple of free-to-use images from the following
database: http://sipi.usc.edu/database/.
mkdir DIP
cd DIP
mkdir code
mkdir dataset
cd code
mkdir chapter03
Extract all the images into the dataset directory. Now the directory
structure will look like Figure 3-1.
We use the tree command in the DIP directory to view the directory
structure.
Using a Webcam
Let’s see how to capture images using a standard USB webcam. The
Raspberry Pi 3 has four USB ports. You can use one of these to connect a
webcam to the Raspberry Pi. I am using a Logitech c922 USB webcam (see
Figure 3-2).
Image provided by Pmwiki under a CC0 1.0 Universal (CC0 1.0) license (https://creativecommons.org/publicdomain/zero/1.0/)
Attach the webcam and run the command lsusb in the terminal. This
displays a list of all USB devices connected to the computer. The output
will be similar to the following:
pi@raspberrypi:~$ lsusb
Bus 002 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
Bus 001 Device 003: ID 046d:085c Logitech, Inc. C922 Pro
Stream Webcam
Bus 001 Device 002: ID 2109:3431 VIA Labs, Inc. Hub
Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
guvcview
If you can see the USB webcam in the output, it means that the webcam
has been detected. There are quite a few Linux utilities with which to
capture images using a webcam. If you like GUI, guvcview is one of them.
Use the following command to install it if it is not already installed:

sudo apt-get install guvcview
fswebcam
The other useful utility is fswebcam, a command-line tool. Install it using the following command:

sudo apt-get install fswebcam

You can invoke the fswebcam utility from the terminal to capture an image with the webcam as follows:

fswebcam -r 1920x1080 --no-banner test.jpg
This will capture an image with a resolution of 1920 x 1080 pixels. The --no-banner flag disables the timestamp banner on the image. The image is saved as test.jpg. The output of the command is as follows:
Then, check in the current directory for the image file called test.jpg.
This is the normal camera module, suitable for normal lighting conditions. Figure 3-5 shows the NoIR version, suitable for low lighting conditions. More details about them can be found at https://www.raspberrypi.org/products/camera-module-v2. The camera can be connected to any model of Raspberry Pi by a dedicated camera port, as shown in Figure 3-5.
Image provided by the Raspberry Pi Foundation under a CC-by-SA 4.0 license (https://creativecommons.org/licenses/by-sa/4.0/deed.en)
Image provided by the Raspberry Pi Foundation under a CC-by-SA 4.0 license (https://creativecommons.org/licenses/by-sa/4.0/deed.en)
The package includes the camera module and a C-to-CS mount adapter. The camera looks as shown in Figure 3-6. We can read more about the camera module on the product page at https://www.raspberrypi.com/products/raspberry-pi-high-quality-camera/.
Image provided by the Raspberry Pi Foundation under a CC-by-SA 4.0 license (https://creativecommons.org/licenses/by-sa/4.0/deed.en)
raspistill
The command-line utility we use to capture images using the camera
module is raspistill. We call it as follows:
raspistill -o test.png
pip3 install pillow

This will install the Pillow library for Python 3. To check whether it is installed properly, open Python 3 in interpreter mode and run the following sequence of commands. If Pillow is installed properly, they will display its version number:

>>> import PIL
>>> PIL.__version__
Working with Images
Let's work with images now. Before we begin, we have to install an image-viewing utility called xv in order for the built-in function show() to work. The problem is that the xv utility is deprecated. So, we will install another utility called xli and point xv to it with a symbolic link. Run the following commands:

sudo apt-get install xli
sudo ln -s /usr/bin/xli /usr/bin/xv
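A minimal sketch of prog01.py as the text describes it follows; a generated image stands in for the book's dataset file ("/home/pi/DIP/Dataset/4.1.07.tiff") so the sketch runs anywhere:

```python
from PIL import Image

# Create an image in memory; the book opens a TIFF from the dataset
# directory instead.
im = Image.new("RGB", (256, 256), color=(180, 90, 40))
im.show()  # saves a temporary copy and hands it to the external viewer
```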
python3 prog01.py
It will show the image in an xli window. This is the simplest Pillow
program; it loads an image in a Python variable with the function call
open() and displays it with the function call show(). In the first line, we are
importing the Image module of Pillow. We will learn more about this module
in the next chapter. The standard version of show() is not very efficient,
because it saves the image to a temporary file and calls the xv utility to
display the image. However, it is handy for the purposes of debugging.
The output is as shown in Figure 3-7.
Due to the limitations of the function show(), we will use Python's built-in GUI module Tkinter to display images whenever needed. Listing 3-2 creates an empty window, as shown in Figure 3-8.
4.1.07, 4.1.08: Picture of jelly beans taken at USC. Free to use (http://sipi.usc.edu/database/copyright.php)
import tkinter as tk
root = tk.Tk()
root.title("Test")
root.mainloop()
This program uses Python’s built-in module for GUI, called Tkinter.
We are importing it in the second line. The ImageTk module provides the
functionality to convert the Pillow image to a Tk-compatible image with
the PhotoImage() function. We create a Tk window with the following
statement:
root = tk.Tk()
The code displays the Pillow image as a label. The method title() sets the title of the window. In the code, the Label() and pack() functions are used to create and place the image label. The line l.photo = photo keeps a reference to the image for Python's garbage collector, so that the image is not collected prematurely. root.mainloop() runs the main loop of the Tk GUI.
The output is shown in Figure 3-9.
RGB
TIFF
(256, 256)
{'compression': 'raw', 'dpi': (1, 1), 'resolution': (1, 1)}
('R', 'G', 'B')
The format of the image refers to the file format. Size refers to the
resolution of the image in pixels. Info refers to the auxiliary information
of the image. The function getbands() retrieves the bands of the colors in
the image.
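These attributes can be checked on a small in-memory image (a sketch; an image opened from a .tiff file would additionally report its format as 'TIFF'):

```python
from PIL import Image

# Inspect the basic properties of a Pillow image.
im = Image.new("RGB", (256, 256))
print(im.mode)        # RGB
print(im.format)      # None for images created in memory
print(im.size)        # (256, 256)
print(im.getbands())  # ('R', 'G', 'B')
```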
Summary
In this chapter, we got started with the basics of Pillow. We learned how
to load and display an image. We learned a bit about image properties.
We also learned how to capture images using various sensors. In the next
chapter, we will explore the modules Image, ImageChops, and ImageOps in
Pillow in detail.
CHAPTER 4
Basic Operations
on Images
In the previous chapter, we started with using Pillow for image processing.
We also used Tkinter for displaying images. In this chapter, we will learn
various arithmetic and logical operations to use on images. We will
explore using Image, ImageChops, and ImageOps modules in Pillow to
implement these operations. We will also learn how to use the slide bar in
Tkinter to dynamically change the program input.
We will be learning about the following topics:
• Image module
• ImageChops module
• ImageOps module
Image Module
Let’s start with the Image module. We can perform myriad operations on
images with this module.
Image Channels
We can use the routine split() to split an image into its constituent
channels. We can also merge various images into a single image with the
routine merge(). Listing 4-1 shows a demonstration of this.
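A sketch of what split() and merge() do, using an in-memory image (the book's listing opens files from the dataset instead):

```python
from PIL import Image

# split() yields one single-band image per channel; merge() recombines
# them -- here with the red and blue channels deliberately swapped.
im = Image.new("RGB", (128, 128), color=(10, 20, 30))
r, g, b = im.split()
merged = Image.merge("RGB", (b, g, r))
print(merged.getpixel((0, 0)))  # (30, 20, 10)
```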
Colorspace Conversion
You can change the mode of an image using the routine convert(), as
shown in Listing 4-2.
res1 = im1.convert("L")
root = tk.Tk()
root.title("Colorspace Conversion Demo")
photo = ImageTk.PhotoImage(res1)
l = tk.Label(root, image=photo)
l.pack()
l.photo = photo
root.mainloop()
The code in Listing 4-2 changes the mode of the image to L (grayscale). The routine convert() supports all possible conversions between the RGB, CMYK, and L modes. We can read more about colorspaces at https://www.color-management-guide.com/color-spaces.html.
Image Blending
We can blend two images using the blend() method. It takes three
arguments—two images to be blended and the value of alpha. The
mathematical formula it uses for blending is as follows:
output = image1 * (1.0 - alpha) + image2 * alpha
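The formula can be verified numerically on two solid-color grayscale images:

```python
from PIL import Image

# 100 * (1 - 0.25) + 200 * 0.25 = 125
im1 = Image.new("L", (8, 8), color=100)
im2 = Image.new("L", (8, 8), color=200)
out = Image.blend(im1, im2, alpha=0.25)
print(out.getpixel((0, 0)))  # 125
```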
Now, we will write a program that can change the value of alpha so
that we can experience the blending effect ourselves. We will use the scale
widget in Tkinter.
The program in Listing 4-3 demonstrates this process.
import tkinter as tk
from PIL import Image, ImageTk

def show_value_1(alpha):
    # Called whenever the slider moves; alpha arrives as a string.
    print(alpha)
    img = Image.blend(im1, im2, float(alpha))
    photo = ImageTk.PhotoImage(img)
    l['image'] = photo
    l.photo = photo

root = tk.Tk()
root.title('Blending Demo')
im1 = Image.open("/home/pi/DIP/Dataset/4.1.07.tiff")
im2 = Image.open("/home/pi/DIP/Dataset/4.1.08.tiff")
photo = ImageTk.PhotoImage(im1)
l = tk.Label(root, image=photo)
l.pack()
l.photo = photo
w1 = (tk.Scale(root, label="Alpha", from_=0, to=1,
               resolution=0.01, command=show_value_1,
               orient=tk.HORIZONTAL))
w1.pack()
root.mainloop()
The code in Listing 4-3 creates a scale widget using the tk.Scale()
call statement. The statement has more than 79 characters, so in order to
conform to the PEP8 standard, I wrote it to fit in two lines, each consisting
of fewer than 79 characters. The parameters passed to the tk.Scale() call
are as follows:
When you change the slider with the mouse pointer, it calls the custom show_value_1() function. We print the current value of the trackbar position to the console for debugging purposes. The statement img = Image.blend(im1, im2, float(alpha)) creates a blended image. The following lines update the image in the Tkinter window:
photo = ImageTk.PhotoImage(img)
l['image'] = photo
l.photo = photo
We will use the same basic code skeleton to demonstrate the other
features of the Pillow library.
Resizing an Image
You can resize an image using the routine resize(), as shown in Listing 4-4.
This example varies the image size from (128, 128) to (512, 512). The
routine resize() takes the new size tuple as an argument. The code also invokes the Tkinter window in full-screen mode with the root.attributes() function call. To close this window, press Alt+F4 on the keyboard. Run the code and have a look at the output.
Rotating an Image
We can use the routine rotate(), which takes the angle of rotation as an
argument. The code in Listing 4-5 demonstrates this idea.
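A brief sketch of rotate(); the expand argument is an addition beyond the text, which only mentions the angle:

```python
from PIL import Image

# rotate() takes the angle in degrees (counter-clockwise). By default
# the output keeps the original size; expand=True grows the canvas.
im = Image.new("RGB", (100, 50))
out = im.rotate(45)
print(out.size)   # (100, 50) -- the rotated corners are cropped
out2 = im.rotate(45, expand=True)
print(out2.size)  # larger than the original in both dimensions
```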
Also, the following line (prog07.py in the code bundle) will flip the
image vertically:
out = im.transpose(Image.FLIP_TOP_BOTTOM)
The code in Listing 4-8 opens an image from a given location, copies it into the im1 variable, and saves the copy to the current location as test.tiff. You can also read the value of an individual pixel with the getpixel() method, which takes a coordinate tuple:

print(im.getpixel((100,100)))
Mandelbrot Set
A Mandelbrot set is a fractal, and we can generate it with Pillow.
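A compact sketch of generating the Mandelbrot set with Pillow follows; the image size, the iteration limit, and the region of the complex plane are arbitrary choices, not necessarily the book's:

```python
from PIL import Image

# Render the Mandelbrot set into a grayscale Pillow image.
W, H, MAX_ITER = 200, 200, 40
image = Image.new("L", (W, H))
for px in range(W):
    for py in range(H):
        # Map the pixel to a point c in the complex plane.
        c = complex(-2.0 + 3.0 * px / W, -1.5 + 3.0 * py / H)
        z = 0j
        n = 0
        while abs(z) <= 2 and n < MAX_ITER:
            z = z * z + c
            n += 1
        # Points inside the set stay dark; escape time shades the rest.
        image.putpixel((px, py), 255 - int(255 * n / MAX_ITER))
```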
Finally, the program saves the generated image to a file:

image.save("mandel.png", "PNG")
Julia Set
We can create a Julia set as demonstrated in Listing 4-10.
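A similar sketch for a Julia set; the constant c = -0.7 + 0.27j is an arbitrary choice (every c yields a different set), as are the size and iteration limit:

```python
from PIL import Image

# Render a Julia set into a grayscale Pillow image.
W, H, MAX_ITER = 200, 200, 40
c = complex(-0.7, 0.27)
image = Image.new("L", (W, H))
for px in range(W):
    for py in range(H):
        # For a Julia set, the pixel maps to the starting z; c is fixed.
        z = complex(-1.5 + 3.0 * px / W, -1.5 + 3.0 * py / H)
        n = 0
        while abs(z) <= 2 and n < MAX_ITER:
            z = z * z + c
            n += 1
        image.putpixel((px, py), 255 - int(255 * n / MAX_ITER))
```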
Note You can read more about Julia sets at the following URLs:
https://mathworld.wolfram.com/JuliaSet.html
https://www.britannica.com/science/Julia-set
Noise and Gradients
Noise is an unwanted component of a signal. Depending on the type of signal, noise can take various forms. An image is a signal, so an image can have noise. Noise can be inadvertently introduced into an image at the time of capture or during transmission. We can generate (or simulate) noise as shown in Listing 4-11.
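One simple way to simulate grayscale noise (a sketch, not necessarily the book's exact listing) is to fill an image with random intensities:

```python
import random
from PIL import Image

# Assign every pixel a random 8-bit intensity.
W, H = 256, 256
img = Image.new("L", (W, H))
img.putdata([random.randint(0, 255) for _ in range(W * H)])
lo, hi = img.getextrema()
print(lo, hi)  # close to 0 and 255 for this many random samples
```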
ImageChops Module
This module has routines for many basic arithmetic and logical operations
that we can use on images. Let’s look at them quickly one by one.
We can add two images together using the add() method. We can invert an image using the invert() method, as follows:

im2 = ImageChops.invert(im1)

The darker() method returns the set of darker pixels from two images. Just like with darker(), you can use the lighter() method to return the set of lighter pixels:

im3 = ImageChops.lighter(im1, im2)
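The behavior of ImageChops.add(), which clips sums at 255, can be checked numerically on solid-color images:

```python
from PIL import Image, ImageChops

# add() sums pixel values, clipping the result at 255.
im1 = Image.new("L", (4, 4), color=100)
im2 = Image.new("L", (4, 4), color=200)
out = ImageChops.add(im1, im2)
print(out.getpixel((0, 0)))   # min(100 + 200, 255) -> 255
out2 = ImageChops.add(im1, im1)
print(out2.getpixel((0, 0)))  # 100 + 100 -> 200
```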
im1 = Image.open("/home/pi/DIP/Dataset/ruler.512.tiff")
im2 = Image.open("/home/pi/DIP/Dataset/ruler.512.tiff")
im2 = im2.transpose(Image.ROTATE_90)
im3 = ImageChops.logical_and(im1.convert("1"),
                             im2.convert("1"))
im3 = ImageChops.logical_or(im1.convert("1"), im2.convert("1"))
gray21.512, ruler.512: Test patterns constructed at USC-SIPI. Free to use (http://sipi.usc.edu/database/copyright.php)
You can subtract one image from another using the subtract() method. You can compute the logical XOR of two binary images as follows:

im3 = ImageChops.logical_xor(im1.convert("1"),
                             im2.convert("1"))

You can superimpose two images on top of each other using the Soft Light algorithm as follows:

im3 = ImageChops.soft_light(im1, im2)

You can superimpose two images on top of each other using the Hard Light algorithm as follows:

im3 = ImageChops.hard_light(im1, im2)

You can superimpose two images on top of each other using the Overlay algorithm as follows:

im3 = ImageChops.overlay(im1, im2)
im2 = ImageChops.offset(im1,
xoffset=128,
yoffset=128)
ImageOps Module
This module has many predefined and useful operations.
You can automatically adjust the contrast of an image as follows:
im2 = ImageOps.autocontrast(im1)
You can crop an equal border from all sides of an image using the crop() method, which takes the number of border pixels as its second argument:

im2 = ImageOps.crop(im1, border=10)

You can flip an image vertically and mirror it horizontally as follows:

im2 = ImageOps.flip(im1)
im2 = ImageOps.mirror(im1)
You can reduce the number of bits of all the color channels using the
posterize() method. This takes the image and the number of bits to keep
for every channel as arguments. The following is an example:
im2 = ImageOps.posterize(im1, 3)

You can equalize the histogram of an image, convert it to grayscale, and invert it as follows:

im2 = ImageOps.equalize(im1)
im2 = ImageOps.grayscale(im1)
im2 = ImageOps.invert(im1)
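The effect of posterize() can be checked numerically; the intensity 173 is an arbitrary example:

```python
from PIL import Image, ImageOps

# posterize() keeps only the given number of high-order bits in every
# channel: 173 is 0b10101101, and keeping 3 bits leaves 0b10100000.
im = Image.new("L", (4, 4), color=173)
out = ImageOps.posterize(im, 3)
print(out.getpixel((0, 0)))  # 160
```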
Summary
This chapter explored the modules Image, ImageChops, and ImageOps
in detail. In the next chapter, we will explore a few more Pillow modules
for advanced operations on images, such as filtering, enhancements,
histograms, and quantization.
CHAPTER 5
Advanced Operations
on Images
The previous chapter explored the arithmetic and logical operations to use
on images with Pillow. There is more to the world of image processing than
that, however. Pillow has a lot more functionality to offer. You can enhance
and filter images. You can also calculate the histogram of an image as well
as its channels. We will cover the following topics in this chapter:
• ImageFilter module
• ImageEnhance module
• Color quantization
ImageFilter Module
We can use the ImageFilter module in Pillow to perform a variety of
filtering operations on images. You can use filters to remove noise, to add
blur effects, and to sharpen and smooth your images. Listing 5-1 shows a
simple image-filtering program that uses the ImageFilter module.
Uncomment the lines in the code file and see the other filters in action.
The code in Listing 5-2 uses the GaussianBlur() method for a custom
filter. The radius of the blur is 5. We can change the radius. Let’s use a
slider in Tkinter to modify the code and make the blur radius dynamic
(see Listing 5-3).
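The core GaussianBlur() call can be sketched without the Tkinter scaffolding; the white square on a black background is a stand-in test image:

```python
from PIL import Image, ImageFilter

# GaussianBlur(radius): the larger the radius, the stronger the blur.
im = Image.new("L", (64, 64), color=0)
im.paste(255, (24, 24, 40, 40))        # a 16x16 white square
blurred = im.filter(ImageFilter.GaussianBlur(radius=5))
print(blurred.getpixel((32, 32)))      # center: lowered from 255
print(blurred.getpixel((0, 0)))        # far corner: still nearly black
```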
Listing 5-4 shows a code example of the kernel used for image
convolution.
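A sketch of ImageFilter.Kernel() with a 3x3 box (mean) kernel; the kernel values here are an arbitrary example:

```python
from PIL import Image, ImageFilter

# Kernel(size, coefficients, scale) divides the weighted sum by scale,
# so this 3x3 kernel of ones with scale=9 computes the local mean.
kernel = ImageFilter.Kernel((3, 3), [1, 1, 1,
                                     1, 1, 1,
                                     1, 1, 1], scale=9)
im = Image.new("L", (16, 16), color=90)
out = im.filter(kernel)
print(out.getpixel((8, 8)))  # 90 -- a uniform image is unchanged
```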
We can use the Digital Unsharp Mask filter, as shown in Listing 5-5.
import tkinter as tk
from PIL import Image, ImageFilter, ImageTk

def show_value_1(radius):
    custom_filter = ImageFilter.UnsharpMask(radius=float(radius))
    img = im1.filter(custom_filter)
    photo = ImageTk.PhotoImage(img)
    l['image'] = photo
    l.photo = photo

root = tk.Tk()
root.title('Digital Unsharp Mask Demo')
im1 = Image.open("/home/pi/DIP/Dataset/4.1.07.tiff")
photo = ImageTk.PhotoImage(im1)
l = tk.Label(root, image=photo)
l.pack()
l.photo = photo
w1 = (tk.Scale(root, label="Blur Radius", from_=0, to=10,
               resolution=0.2, command=show_value_1,
               orient=tk.HORIZONTAL))
w1.pack()
root.mainloop()
import tkinter as tk
from PIL import Image, ImageFilter, ImageTk

def show_value_1(window_size):
    # The median filter requires an odd window size.
    custom_filter = ImageFilter.MedianFilter(size=int(window_size))
    img = im1.filter(custom_filter)
    photo = ImageTk.PhotoImage(img)
    l['image'] = photo
    l.photo = photo

root = tk.Tk()
root.title('Median Filter Demo')
im1 = Image.open("/home/pi/DIP/Dataset/4.1.07.tiff")
photo = ImageTk.PhotoImage(im1)
l = tk.Label(root, image=photo)
l.pack()
l.photo = photo
w1 = (tk.Scale(root, label="Window Size", from_=1, to=19,
               resolution=1, command=show_value_1,
               orient=tk.HORIZONTAL))
w1.pack()
root.mainloop()
You can change the median filter to a min filter using the following line of code:

custom_filter = ImageFilter.MinFilter(size=int(window_size))
Similarly, you can use a max filter:

custom_filter = ImageFilter.MaxFilter(size=int(window_size))
Next, we will see an example of a mode filter. The mode filter works
with even and odd window sizes. Listing 5-7 shows a code example of the
mode filter.
import tkinter as tk
from PIL import Image, ImageFilter, ImageTk

def show_value_1(window_size):
    custom_filter = ImageFilter.ModeFilter(size=int(window_size))
    img = im1.filter(custom_filter)
    photo = ImageTk.PhotoImage(img)
    l['image'] = photo
    l.photo = photo

root = tk.Tk()
root.title('Mode Filter Demo')
im1 = Image.open("/home/pi/DIP/Dataset/4.1.07.tiff")
photo = ImageTk.PhotoImage(im1)
l = tk.Label(root, image=photo)
l.pack()
l.photo = photo
w1 = (tk.Scale(root, label="Window Size", from_=1, to=19,
               resolution=1, command=show_value_1,
               orient=tk.HORIZONTAL))
w1.pack()
root.mainloop()
You can also apply a box blur by changing the filter as follows:

custom_filter = ImageFilter.BoxBlur(radius=int(window_size))

You've seen how to utilize filters; now, let's take a look at how images can be enhanced.
The following enhancer changes the color balance of an image:

enhancer = ImageEnhance.Color(im1)
img = enhancer.enhance(float(factor))
We can follow the same style of coding for all the other image enhancement operations. First, we create an enhancer, and then we apply the enhancement factor to it. You may also have noticed w1.set(1) in the code. This sets the scale to 1 at the beginning. Changing the argument to set() changes the default position of the scale pointer.
Run the program in Listing 5-8 and take a look at the output.
To change the contrast, use the code in Listing 5-9.
Run the program in Listing 5-9 and take a look at the output.
The following enhancer is used to change the brightness:

enhancer = ImageEnhance.Brightness(im1)

The following enhancer is used to change the sharpness:

enhancer = ImageEnhance.Sharpness(im1)

For finer control of the sharpness, use a finer resolution value for the scale widget so that the enhancement factor changes in smaller steps.

These were all image enhancement operations. The next section looks at more advanced image operations.
Color Quantization
Color quantization is the process of reducing the number of distinct colors
in an image. The new image should be similar to the original image in
appearance. Color quantization is done for a variety of purposes, including
when you want to store an image in a digital medium. Real-life images
have millions of colors. However, encoding them in the digital format and
retaining all the color-related information would result in a huge image
size. If you limit the number of colors in the image, you’ll need less space
to store the color-related information. This is the practical application of
quantization. The Image module has a method called quantize() that’s
used for image quantization.
The code in Listing 5-10 demonstrates image quantization in Pillow.
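A self-contained sketch of quantize(); the gradient fill is a stand-in for a photograph:

```python
from PIL import Image

# quantize(n) returns a palette-mode image with at most n colors.
im = Image.new("RGB", (64, 64))
# Fill with a red gradient so there are 64 distinct colors to reduce.
im.putdata([(x * 4, 0, 0) for y in range(64) for x in range(64)])
q = im.quantize(16)
print(q.mode)               # P (palette mode)
print(len(q.getcolors()))   # at most 16 distinct colors remain
```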
Histograms and Equalization
You likely studied frequency tables in statistics in school. Well, the
histogram is nothing but a frequency table visualized. You can calculate
the histogram of any dataset represented in the form of numbers.
The Image module has a method called histogram() that’s used
to calculate the histogram of an image. An RGB image has three 8-bit
channels. This means that it can have a staggering 256 x 256 x 256 number
of colors. Drawing a histogram of such a dataset would be very difficult. So,
the histogram() method calculates the histogram of individual channels
in an image. Each channel has 256 distinct intensities. The histogram is a
list of values for each intensity level of a channel.
The histogram for each channel is a list of 256 numbers. Suppose the histogram for the Red channel has the values (4, 8, 0, 19, …, 90). This means that four pixels have the red intensity of 0, eight pixels have the red intensity of 1, no pixels have the red intensity of 2, 19 pixels have the red intensity of 3, and so on, until the last value, which indicates that 90 pixels have the red intensity of 255.
When we combine the histogram of all three channels, we get a list of
768 numbers. In this chapter, we will just compute the histogram. We will
not show it visually. When we learn about the advanced image processing
library scipy.ndimage, we will learn how to represent histograms for each
channel individually.
The code in Listing 5-11 calculates and stores the histograms of an
image and its individual channels.
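The layout of the returned list can be checked on a solid-color image:

```python
from PIL import Image

# histogram() returns a flat list: 256 counts per channel, so 768
# values for an RGB image.
im = Image.new("RGB", (10, 10), color=(5, 6, 7))
hist = im.histogram()
print(len(hist))       # 768
print(hist[5])         # 100 -- every pixel has red intensity 5
print(hist[256 + 6])   # 100 -- green counts occupy indices 256-511
```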
Modify this program to directly print the histograms of the image and
channels.
A grayscale image (L mode image) will have a histogram of only 256
values because it has only a single channel.
Histogram Equalization
You can adjust the histogram to enhance the image contrast. This
technique is known as histogram equalization. The ImageOps.equalize()
method equalizes the histogram of the image. Listing 5-12 shows an
example of this process.
The program in Listing 5-12 prints the histogram of the original image
after the equalization. Add the im1.show() statement to the program and
then run it to see the difference between the images.
Summary
In this chapter, we explored how to use the Pillow library for advanced image processing. Pillow is good for beginners who want an easy-to-program, less mathematical image processing library. However, if you want a more mathematical and scientific approach, Pillow might not be your best choice. In the following chapters, we will learn about a more powerful library for image processing, scipy.ndimage. It is widely used by the scientific community all over the world. We will also learn the basics of the NumPy and matplotlib libraries, which come in handy when processing and displaying images.
CHAPTER 6
Introduction to the
Scientific Python
Ecosystem
In the previous chapter, we studied advanced image processing with
Pillow. Pillow is a nice starting point for image processing operations.
However, it has its limitations. When it comes to implementing elaborate
image processing operations, such as segmentation, morphological
operations, advanced filters, and measurements, Pillow proves to be
inadequate. We really need to use better libraries for advanced image
processing. The SciPy ecosystem serves as a foundation for all the
scientific uses of Python.
SciPy stands for Scientific Python. It extensively uses NumPy
(Numerical Python) and matplotlib for numeric operations and data
visualization, respectively. This chapter covers the basics of NumPy, SciPy,
and matplotlib. It also explores introductory-level programming examples
using NumPy, scipy.misc, and matplotlib. We will explore the following
topics in this chapter:
• NumPy
• SciPy
• matplotlib

By the end of this chapter, we will be comfortable with the basics of the SciPy ecosystem.

The ecosystem includes the following core components:

• NumPy
• SciPy
• matplotlib
• IPython
• Pandas
• SymPy
• Nose
Many other libraries use the core modules in the ecosystem for additional algorithms and data structures. Examples include OpenCV and the scikit family of libraries (such as scikit-image and scikit-learn).
Figure 6-1 aptly summarizes the role of the Scientific Python ecosystem
in the world of scientific computing.
Simple Examples
Let's study a few simple examples. First, we will read a built-in image from scipy.misc, store it in a variable, and check the data type of that variable, as shown in Listing 6-1.
<class 'numpy.ndarray'>
All the variables that store image data belong to this type as far as
Scientific Python is concerned. This means that the data type of the image
is ndarray in NumPy. To get started with scientific image processing,
and any type of scientific programming in general, we must know what
NumPy is.
The NumPy homepage at www.numpy.org says this:
NumPy is the fundamental package for scientific computing
with Python.
It offers the following features:
• A powerful multi-dimensional array object
To get started with image processing using SciPy and NumPy, we need
to learn the basics of N-dimensional (or multi-dimensional) array objects
in NumPy.
NumPy's N-dimensional array is a homogeneous multi-dimensional array; it contains elements all of the same data type. Each dimension is known as an axis. The class corresponding to the N-dimensional array in NumPy is numpy.ndarray. This is what we saw in the output of Listing 6-1. All the major image processing libraries we will use represent images as NumPy ndarrays.
uint8
(768, 1024, 3)
3
2359296
Let’s look at what each of these means. The dtype attribute is for the
data type of the elements that represent the image. In this case, it is uint8,
which means an unsigned 8-bit integer. This means it can have 256 distinct
values. The shape attribute gives the dimensions or size of the image. In this case, it is a color image. Its resolution is 1024 x 768, and it has three color channels corresponding to the colors red, green, and blue. Each channel of each pixel can have one of 256 possible values, so the combinations can produce 256*256*256 distinct colors for each pixel.
You can visualize a color image as an arrangement of three two-
dimensional planes. A grayscale image is a single plane of grayscale values.
ndim represents the dimensions. A color image has three dimensions, and
a grayscale image has two dimensions. size represents the total number of
elements in the array. It can be calculated by multiplying the values of the
dimensions. In this case, it is 768*1024*3 = 2359296.
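The same attributes can be inspected on a small hand-made array:

```python
import numpy as np

# A 3-row x 4-column x 3-channel array of unsigned 8-bit values,
# standing in for a tiny color image.
img = np.zeros((3, 4, 3), dtype=np.uint8)
print(img.dtype)   # uint8
print(img.shape)   # (3, 4, 3)
print(img.ndim)    # 3
print(img.size)    # 36
```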
We can see the RGB value corresponding to each individual pixel, as
shown in Listing 6-3.
The code in Listing 6-3 accesses the value of the pixel located at (10, 10).
The output is [172 169 188].
This concludes the basics of NumPy and image processing. You will
learn more about NumPy as and when needed throughout the chapters.
Matplotlib
We have thus far used the misc.imshow() method to display an image. While this method is useful for simple applications, it is primitive. We must use a more advanced framework for scientific applications, and matplotlib serves this purpose. It is a MATLAB-style plotting and data visualization library for Python, and we have already installed it. It is an integral part of the Scientific Python ecosystem. Just like NumPy, matplotlib is a vast topic and warrants a dedicated book. The examples in this book use the pyplot module in matplotlib for the image processing requirements. Listing 6-4 shows a simple program for image processing.
The code in Listing 6-4 imports the pyplot module. The imshow()
method adds the image to the plot window. The show() method shows the
plot window. The output is shown in Figure 6-2.
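A sketch of the same program, with a random array standing in for the sample image the book loads; the Agg backend line is only needed on a headless machine:

```python
import matplotlib
matplotlib.use("Agg")   # off-screen rendering; omit on a desktop
import matplotlib.pyplot as plt
import numpy as np

# A random array stands in for the sample image.
img = np.random.randint(0, 256, (64, 64, 3), dtype=np.uint8)
plt.imshow(img)   # add the image to the plot window
plt.show()        # display the plot window
```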
We can also turn off the axes (or the ruler) and add a title to the image,
as shown in Listing 6-5.
Image provided by Judy Weggelaar (public domain, no rights reserved, free for any use). License: https://commons.wikimedia.org/wiki/File:Raccoon_procyon_lotor.jpg
plt.imshow(img, cmap='gray')
plt.axis('off')
plt.title('Ascent')
plt.show()
Image provided by Hillebrand Steve, U.S. Fish and Wildlife Service (public domain, no rights reserved, free for any use). License: https://commons.wikimedia.org/wiki/File:Accent_to_the_top.jpg
We have used the subplot() method before, with imshow(). The first
two arguments in the subplot() method specify the dimensions of the
grid, and the third argument specifies the position of the image in the grid.
The numbering of the images in the grid starts from the top-left edge. The
top-left position is the first position, the next position is the second one,
and so on. The result is shown in Figure 6-4.
Image Channels
You can separate image channels of a multi-channel image. The code for
that process is shown in Listing 6-7.
import matplotlib.pyplot as plt
from scipy import misc

img = misc.face()
r = img[:, :, 0]
g = img[:, :, 1]
b = img[:, :, 2]
titles = ['face', 'Red', 'Green', 'Blue']
images = [img, r, g, b]
plt.subplot(2, 2, 1)
plt.imshow(images[0])
plt.axis('off')
plt.title(titles[0])
plt.subplot(2, 2, 2)
plt.imshow(images[1], cmap='Reds')
plt.axis('off')
plt.title(titles[1])
plt.subplot(2, 2, 3)
plt.imshow(images[2], cmap='Greens')
plt.axis('off')
plt.title(titles[2])
plt.subplot(2, 2, 4)
plt.imshow(images[3], cmap='Blues')
plt.axis('off')
plt.title(titles[3])
plt.show()
We can use the np.dstack() method, which merges all the channels,
to create the original image, as shown in Listing 6-8.
import matplotlib.pyplot as plt
import numpy as np
from scipy import misc

img = misc.face()
r = img[:, :, 0]
g = img[:, :, 1]
b = img[:, :, 2]
output = np.dstack((r, g, b))
plt.imshow(output)
plt.axis('off')
plt.title('Combined')
plt.show()
Run the code in Listing 6-8 to see the workings of np.dstack() for yourself.
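The effect of np.dstack() can be checked on tiny channel planes:

```python
import numpy as np

# np.dstack stacks 2-D planes along a new third axis: three (H, W)
# channel planes become one (H, W, 3) image.
r = np.full((2, 2), 255, dtype=np.uint8)
g = np.zeros((2, 2), dtype=np.uint8)
b = np.zeros((2, 2), dtype=np.uint8)
img = np.dstack((r, g, b))
print(img.shape)   # (2, 2, 3)
print(img[0, 0])   # [255   0   0] -- a pure red pixel
```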
When we convert between the Pillow and NumPy representations of an image, the types involved look like this:

<class 'PIL.TiffImagePlugin.TiffImageFile'>
<class 'numpy.ndarray'>
<class 'PIL.Image.Image'>
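A sketch of the conversions that produce types like these; an in-memory image stands in for the TIFF file:

```python
import numpy as np
from PIL import Image

# Convert a Pillow image to a NumPy array and back again.
pil_img = Image.new("RGB", (8, 8))
arr = np.asarray(pil_img)     # Pillow image -> numpy.ndarray
back = Image.fromarray(arr)   # ndarray -> Pillow image
print(type(arr))              # <class 'numpy.ndarray'>
print(type(back))             # <class 'PIL.Image.Image'>
```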
Summary
In this chapter, we were introduced to the Scientific Python stack and its
constituent libraries, such as NumPy and matplotlib. We also explored the
scipy.misc module for basic image processing and conversion. In the next
chapter, we will start exploring the scipy.ndimage module for more image
processing operations.
CHAPTER 7
Transformations
and Measurements
In the previous chapter, we were introduced to the Scientific Python stack.
We learned the basics of NumPy and matplotlib. We explored the useful
modules ndarray and pyplot from NumPy and matplotlib, respectively.
We also learned about the scipy.misc module and how to perform basic
image processing with it. In this chapter, we will further explore the SciPy
library. We will learn to use the scipy.ndimage library for processing
images. We will also explore methods to use for image transformation and
image measurement.
By the end of this chapter, we will be comfortable with performing
operations on images.
Transformations
We studied a few basic transformations that are possible with scipy.misc in the previous chapter. Here, we will look at a few more.
plt.title('Zoom Demo')
plt.axis('off')
plt.show()
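The shift and zoom transformations can be checked on a small array:

```python
import numpy as np
from scipy import ndimage as ndi

# shift() moves the array content; vacated cells get the constant 0.
a = np.arange(16, dtype=float).reshape(4, 4)
shifted = ndi.shift(a, (1, 0))   # shift down by one row
print(shifted[0])                # [0. 0. 0. 0.]
# zoom() resamples: a factor of 2 doubles both dimensions.
zoomed = ndi.zoom(a, 2)
print(zoomed.shape)              # (8, 8)
```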
We can also use the function rotate() to rotate an image. Listing 7-3
shows several ways to use this function.
import matplotlib.pyplot as plt
from scipy import misc, ndimage as ndi

def shift_func(output_coords,
               x_shift=128,
               y_shift=128):
    # Map each output coordinate back to its input coordinate.
    return (output_coords[0] - x_shift,
            output_coords[1] - y_shift)
img = misc.ascent()
plt.subplot(1, 2, 1)
plt.imshow(img, cmap='gray')
plt.title('Original')
plt.subplot(1, 2, 2)
plt.imshow(ndi.geometric_transform(img, shift_func),
cmap='gray')
plt.title('Shift')
plt.show()
img = misc.ascent()
plt.subplot(1, 3, 1)
plt.imshow(img, cmap='gray')
plt.title('Original')
T = [[1, 0, 128], [0, 1, 128], [0, 0, 1]]
plt.subplot(1, 3, 2)
plt.imshow(ndi.affine_transform(img, T), cmap='gray')
plt.title('Translation')
S = [[0.5, 0, 0], [0, 0.5, 0], [0, 0, 1]]
plt.subplot(1, 3, 3)
plt.imshow(ndi.affine_transform(img, S), cmap='gray')
plt.title('Scaling')
plt.show()
Measurements
Let’s work with measurements. The first one is the histogram, which is
a visual representation of the frequency distribution of pixel intensities.
Listing 7-7 shows a demonstration of a histogram.
img = misc.face()
hist = ndi.histogram(img, 0, 255, 256)
plt.plot(hist, 'k')
plt.title('Face Histogram')
plt.grid(True)
plt.show()
We pass the image, the lowest and highest limits of the bins, and the
number of bins as arguments to the function that computes the histogram.
The output is shown in Figure 7-7.
0
(201, 268)
255
(190, 265)
(0, 255, (201, 268), (190, 265))
22932324
87.47987365722656
80.0
2378.9479362999555
48.774459877070456
(259.9973423539629, 254.6090727219797)
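The listings that produce the values above are not shown on this page. A sketch of the scipy.ndimage measurement routines involved, run on a small synthetic array instead of the book's grayscale test image, is:

```python
import numpy as np
import scipy.ndimage as ndi

# A small synthetic grayscale image stands in for the book's test image.
img = np.arange(16, dtype=np.uint8).reshape(4, 4)

print(ndi.minimum(img), ndi.minimum_position(img))  # lowest value and where
print(ndi.maximum(img), ndi.maximum_position(img))  # highest value and where
print(ndi.extrema(img))        # (min, max, min_position, max_position)
print(ndi.sum_labels(img))     # sum of all pixel intensities
print(ndi.mean(img))
print(ndi.median(img))
print(ndi.variance(img))
print(ndi.standard_deviation(img))
print(ndi.center_of_mass(img))
```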
print(ndi.find_objects(a))
print(ndi.find_objects(a, max_label=2))
print(ndi.find_objects(a, max_label=4))
print(ndi.find_objects(img, max_label=4))
print(ndi.sum_labels(img))
Summary
In this chapter, we explored methods for transformations and
measurements. We studied the shift and zoom transformations and
calculated the histogram of a grayscale image. We also calculated statistical
information about the images.
In the next chapter, we will study image kernels and filters, their types,
and their applications in image enhancement in detail.
CHAPTER 8
Filters
In the previous chapter, we learned the functionality found in the
scipy.ndimage module of the SciPy library. We learned how to apply
transformations such as shift and zoom on an image. We also learned how
to obtain statistical information about an image. We saw how to compute
and plot the histogram for an image.
In Chapter 5, we studied image filters using the Pillow library. In this
chapter, we will study in detail the theory behind those filters. We will also
see the types of filters and kernels used in image processing, as well as the
applications of those filters.
The following are topics covered in this chapter:
• Kernels
• Low-pass filters
• High-pass filters
• Fourier filters
By the end of this chapter, we will be comfortable with SciPy filters and
their applications.
[ 1 4 10 16 22 28 33]
[[ 5 9 14 18]
[21 25 30 34]
[41 45 50 54]
[57 61 66 70]]
[ 3 8 14 20 26 32 35]
[[ 5 9 14 18]
[21 25 30 34]
[41 45 50 54]
[57 61 66 70]]
Low-Pass Filters
We can use various kernels with the convolution operation to create filters.
Filters are used to enhance an image or to perform operations on an
image. The filters that remove the high-frequency data from a signal (or
an image, for that matter) are known as low-pass filters. There are many
applications of low-pass filters. Let’s look at them now.
Blurring
We can blur images with blurring kernels. SciPy comes with many routines
that apply blurring kernels via convolution. Listing 8-5 demonstrates a
Gaussian filter being applied to an image.
The routine for the Gaussian filter accepts the image and the value for
sigma. The output is as shown in Figure 8-3.
A uniform filter replaces the value of a pixel by the mean value of an area
centered at the pixel. Let’s see a uniform filter, as demonstrated in Listing 8-6.
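Listing 8-6 is likewise not reproduced here; a sketch of uniform_filter() on the same kind of synthetic image is:

```python
import numpy as np
import scipy.ndimage as ndi

img = np.zeros((64, 64))
img[16:48, 16:48] = 255.0

# Each output pixel becomes the mean of the size x size neighborhood
# centered on it (a box, or mean, filter).
smoothed = ndi.uniform_filter(img, size=9)
```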
The routine for the uniform filter accepts the image and the size of the
filter. The output is as shown in Figure 8-4.
plt.subplot(1, 3, 2)
plt.imshow(y3, cmap='gray')
plt.title('Sigma = 3')
plt.subplot(1, 3, 3)
plt.imshow(y6, cmap='gray')
plt.title('Sigma = 6')
plt.show()
The routine for the one-dimensional Gaussian filter accepts the image
and the value of Sigma. The output is as shown in Figure 8-5.
plt.title('Original Image')
plt.subplot(1, 3, 2)
plt.imshow(y9, cmap='gray')
plt.title('Size = 9')
plt.subplot(1, 3, 3)
plt.imshow(y12, cmap='gray')
plt.title('Size = 12')
plt.show()
The routine for the one-dimensional uniform filter accepts the image
and the size of the filter. The output is as shown in Figure 8-6.
Rank filters are non-linear filters. A rank filter uses local gray-level
ordering to compute the output. We can even apply a rank filter to an
image, as demonstrated in Listing 8-10.
img = misc.ascent()
out1 = ndi.rank_filter(img, size=10, rank=10)
out2 = ndi.rank_filter(img, size=45, rank=65)
plt.subplot(1, 3, 1)
plt.imshow(img, cmap='gray')
plt.title('Original Image')
plt.subplot(1, 3, 2)
plt.imshow(out1, cmap='gray')
plt.title('rank=10')
plt.subplot(1, 3, 3)
plt.imshow(out2, cmap='gray')
plt.title('rank=65')
plt.show()
Noise Reduction
We can use low-pass filters for noise reduction. In the next two examples
(Listing 8-11 and Listing 8-12), we will create artificial noise with NumPy
routines and add it to the test image. Listing 8-11 shows how to reduce
noise with the Gaussian filter.
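Listings 8-11 and 8-12 are not reproduced on this page. The idea can be sketched as follows — add Gaussian noise with NumPy, then smooth it away with a Gaussian (or median) filter; the synthetic image and noise level here are stand-ins for the book's test image:

```python
import numpy as np
import scipy.ndimage as ndi

rng = np.random.default_rng(0)
img = np.zeros((64, 64))
img[16:48, 16:48] = 1.0

# Add zero-mean Gaussian noise to the clean image.
noisy = img + 0.2 * rng.standard_normal(img.shape)

denoised_gauss = ndi.gaussian_filter(noisy, sigma=2)
denoised_median = ndi.median_filter(noisy, size=5)

# The filtered images show much less pixel-to-pixel variation
# inside the flat regions than the noisy image does.
print(noisy[:16, :16].std(), denoised_gauss[:16, :16].std())
```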
import numpy as np
import matplotlib.pyplot as plt
img = misc.ascent()
output1 = ndi.minimum_filter1d(img, 9)
output2 = ndi.minimum_filter(img, 9)
output3 = ndi.maximum_filter1d(img, 9)
output4 = ndi.maximum_filter(img, 9)
output = [output1, output2, output3, output4]
titles = ['Minimum 1D', 'Minimum',
'Maximum 1D', 'Maximum']
for i in range(4):
    plt.subplot(2, 2, i+1)
    plt.imshow(output[i], cmap='gray')
    plt.title(titles[i])
    plt.axis('off')
plt.show()
High-Pass Filters
High-pass filters work in the exact opposite way to low-pass filters: they
allow the high-frequency components of a signal to pass through. They also
use kernels and convolution. However, the kernels used for high-pass filters
are different from those used in low-pass filters. Applying a high-pass
filter to an image sharpens it (the opposite of blurring) and
highlights edges.
Listing 8-14 demonstrates the Prewitt filter to detect edges in the
test image.
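The body of Listing 8-14 is not reproduced here. A sketch of applying the Prewitt filter with scipy.ndimage on a synthetic image is:

```python
import numpy as np
import scipy.ndimage as ndi

img = np.zeros((64, 64))
img[16:48, 16:48] = 1.0

# Prewitt responses along each axis; combining them with
# np.hypot gives the overall edge magnitude.
px = ndi.prewitt(img, axis=0)
py = ndi.prewitt(img, axis=1)
edges = np.hypot(px, py)
```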
We can also apply Sobel filters to each axis separately and to the entire
image, detecting horizontal edges, vertical edges, and edges along both
axes. Listing 8-15 generates a test image and applies Sobel filters.
import numpy as np
import scipy.ndimage as ndi
import matplotlib.pyplot as plt
img = np.zeros((516, 516))
img[128:-128, 128:-128] = 1
img = ndi.gaussian_filter(img, 8)
rotated = ndi.rotate(img, -20)
noisy = rotated + 0.09 * np.random.random(rotated.shape)
sx = ndi.sobel(noisy, axis=0)
sy = ndi.sobel(noisy, axis=1)
sob = np.hypot(sx, sy)
titles = ['Original', 'Rotated', 'Noisy',
'Sobel (X-axis)', 'Sobel (Y-axis)', 'Sobel']
output = [img, rotated, noisy, sx, sy, sob]
for i in range(6):
    plt.subplot(2, 3, i+1)
    plt.imshow(output[i])
    plt.title(titles[i])
    plt.axis('off')
plt.show()
    plt.subplot(2, 2, i+1)
    plt.imshow(output[i],
               cmap='gray')
    plt.title(titles[i])
    plt.axis('off')
plt.show()
Fourier Filters
Fourier filters work in the frequency domain. They compute the Fourier
transform of the image, manipulate the frequencies, and then compute the
inverse Fourier transform to produce the final output. We can apply various
Fourier filters to images, as demonstrated in Listing 8-17.
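Listing 8-17 itself is not reproduced on this page. A sketch of a Fourier-domain low-pass filter using scipy.ndimage.fourier_gaussian (the image and sigma are stand-ins) is:

```python
import numpy as np
import scipy.ndimage as ndi

img = np.zeros((64, 64))
img[16:48, 16:48] = 255.0

# Forward FFT -> attenuate high frequencies -> inverse FFT.
freq = np.fft.fft2(img)
freq = ndi.fourier_gaussian(freq, sigma=3)
out = np.fft.ifft2(freq).real
```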
Summary
In this chapter, we were introduced to myriad filters. We saw their types
and looked at their applications. The image-filtering topic is too vast to be
completely explored in a single chapter.
In the next chapter, we will cover morphological operators, image
thresholding, and basic segmentation.
CHAPTER 9
Morphology,
Thresholding,
and Segmentation
In the previous chapter, we studied the theory behind image filters, along
with the types and practical applications of filters used to enhance images.
In this chapter, we are going to study and demonstrate important
concepts in image processing, such as morphology, morphological
operations on images, thresholding, and segmentation. We will study and
demonstrate the following:
• Distance transforms
• Morphology and morphological operations
Distance Transforms
A distance transform is an operation performed on binary images.
Binary images have background elements (zero values, shown as black)
and foreground elements (nonzero values, shown as white). A distance
transform replaces each foreground element with the value of the shortest
distance to the background. scipy.ndimage has three methods for computing
the distance transform of a binary image. The code in Listing 9-1 illustrates
how a distance transform can be used practically to generate test images.
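The body of Listing 9-1 is not reproduced here. A sketch of the three distance-transform routines, run on a small synthetic binary image, is:

```python
import numpy as np
import scipy.ndimage as ndi

img = np.zeros((9, 9), dtype=int)
img[2:7, 2:7] = 1   # a foreground square on a zero background

# The three routines: exact Euclidean, brute force, and chamfer.
d_edt = ndi.distance_transform_edt(img)
d_bf = ndi.distance_transform_bf(img)
d_cdt = ndi.distance_transform_cdt(img)
```

Each foreground pixel now holds its distance to the nearest background pixel, so the center of the square carries the largest value.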
Structuring Element
A structuring element is a matrix that is used to probe a given binary
image. It comes in various shapes, like a ball, a ring, or a line, and in
various sizes, like a 3 x 3 or a 7 x 7 matrix. Larger structuring elements
take more time for computation. A simple structuring element can be
defined as a matrix of ones with odd dimensions; np.ones((3, 3)) is an
example of this.
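scipy.ndimage also provides helpers for generating structuring elements; the diamond-shaped matrices printed later in this chapter can be produced this way. A sketch:

```python
import scipy.ndimage as ndi

# The basic 3x3 cross-shaped (4-connected) structuring element.
struct = ndi.generate_binary_structure(2, 1)
print(struct.astype(int))

# Dilating the structure with itself grows it into larger diamonds.
print(ndi.iterate_structure(struct, 2).astype(int))   # 5x5 diamond
print(ndi.iterate_structure(struct, 3).astype(int))   # 7x7 diamond
```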
opening, closing]
titles = ['Original', 'Erosion',
'Dilation', 'Opening',
'Closing']
for i in range(5):
    print(output[i])
    plt.subplot(1, 5, i+1)
    plt.imshow(output[i],
               interpolation='nearest')
    plt.title(titles[i])
    plt.axis('off')
plt.show()
The code example in Listing 9-2 generates a binary image and applies
all the binary morphological operations to it. The output is shown in
Figure 9-2.
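The head of Listing 9-2 is not reproduced on this page. A sketch of generating a binary image and applying the four basic binary morphological operations (the image size and block position are stand-ins) is:

```python
import numpy as np
import scipy.ndimage as ndi

img = np.zeros((7, 7), dtype=int)
img[2:5, 2:5] = 1   # a 3x3 foreground block

erosion = ndi.binary_erosion(img).astype(int)
dilation = ndi.binary_dilation(img).astype(int)
opening = ndi.binary_opening(img).astype(int)
closing = ndi.binary_closing(img).astype(int)
```

With the default cross-shaped structuring element, erosion shrinks the block to its center pixel, dilation grows it outward, and opening and closing are erosion-then-dilation and dilation-then-erosion, respectively.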
img[x, y] = 0
noise_removed = ndi.binary_fill_holes(img).astype(int)
output = [img, noise_removed]
titles = ['Original', 'Noise Removed']
for i in range(2):
    print(output[i])
    plt.subplot(1, 2, i+1)
    plt.imshow(output[i],
               interpolation='nearest')
    plt.title(titles[i])
    plt.axis('off')
plt.show()
    plt.imshow(output[i],
               interpolation='nearest')
    plt.title(titles[i])
    plt.axis('off')
plt.show()
o2=ndi.binary_propagation(img, mask=mask,
structure=struct).astype(int)
output = [img, o1, o2]
titles = ['Original',
'Output 1',
'Output 2']
for i in range(3):
    print(output[i])
    plt.subplot(1, 3, i+1)
    plt.imshow(output[i],
               interpolation='nearest')
    plt.title(titles[i])
    plt.axis('off')
plt.show()
The code shown in Listing 9-8 applies gray dilation and gray erosion
operations to a distance transform.
img = ndi.distance_transform_bf(img)
for i in range(3):
    print(output[i])
    plt.subplot(1, 3, i+1)
    plt.imshow(output[i], interpolation='nearest',
               cmap='rainbow')
    plt.title(titles[i])
    plt.axis('off')
plt.show()
The code in Listing 9-8 uses a structuring element that’s 3 x 3 for both
operations. The output is shown in Figure 9-8.
Thresholding and Segmentation
This is the final part of the chapter, and it deals with one of the most
important applications of image processing: segmentation. In a thresholding
operation, we convert a grayscale image to a binary (black-and-white)
image based on a threshold value. The pixels with intensity values
greater than the threshold are assigned white, and the pixels with intensity
values lower than or equal to the threshold are assigned black. This is
known as binary thresholding, and is the most basic form of thresholding
and segmentation. An example is shown in Listing 9-9.
The code in Listing 9-9 sets the threshold at 127. On the grayscale, a
pixel value of 127 corresponds to mid-gray. The resulting thresholded
image is shown in Figure 9-9.
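The body of Listing 9-9 is not reproduced here; binary thresholding is essentially a single NumPy comparison. A sketch, using a synthetic gradient in place of the book's grayscale image:

```python
import numpy as np

img = np.arange(256, dtype=np.uint8).reshape(16, 16)  # synthetic grayscale
# Pixels brighter than the threshold become white (255); the rest black (0).
binary = (img > 127).astype(np.uint8) * 255
```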
[[0 1 0]
[1 1 1]
[0 1 0]]
[[0 0 1 0 0]
[0 1 1 1 0]
[1 1 1 1 1]
[0 1 1 1 0]
[0 0 1 0 0]]
[[0 0 0 1 0 0 0]
[0 0 1 1 1 0 0]
[0 1 1 1 1 1 0]
[1 1 1 1 1 1 1]
[0 1 1 1 1 1 0]
[0 0 1 1 1 0 0]
[0 0 0 1 0 0 0]]
Summary
In this chapter, we studied distance transforms for generating test images.
We then learned about morphology and how to use morphological
operations on images. Morphological operations come in two varieties—
binary and grayscale. We also studied thresholding, which is the simplest
form of image segmentation.
In the next chapter, we will study and demonstrate video processing.
CHAPTER 10
Video Processing
In the previous chapter, we learned about using morphological
operations in the domain of image processing. We also demonstrated
all those operations with libraries such as NumPy, SciPy, and matplotlib.
Till now, throughout this book, we have been discussing how to work
with static images. We know the means of acquiring static images and also
how to process them. This chapter is dedicated to processing streams of
continuous images, such as video files and a live webcam feed. We will also
be introduced to a new library, OpenCV, which we are going to use for
reading the live webcam stream and video files. For processing the video
and the live webcam feed, we will still use SciPy.
OpenCV stands for open source computer vision. It is a library for
real-time computer vision. It includes a lot of functionality for image
processing and video processing. It also has implementations of many
computer vision algorithms. It is primarily written in C++. It has interfaces
(APIs; application programming interfaces) for languages such as C++,
Python, Java, MATLAB/OCTAVE, and JavaScript. We can explore this at the
webpage https://opencv.org/.
Let’s get started by installing the library on our Raspberry Pi with the
following command:
import numpy as np
import cv2 as cv
cap = cv.VideoCapture(0)
cap.set(cv.CAP_PROP_FRAME_WIDTH, 256)
cap.set(cv.CAP_PROP_FRAME_HEIGHT, 144)
if not cap.isOpened():
    print("Cannot open camera...")
    exit()
while True:
    ret, frame = cap.read()
    cv.imshow('frame', frame)
    if cv.waitKey(1) == 27:
        break
cap.release()
cv.destroyAllWindows()
Execute the program. The indicator LED embedded in the webcam will
glow, and the program will produce the output window shown in
Figure 10-1 on the display.
That’s me! We can stop the program by pressing the ESC key on the
keyboard. The window (Figure 10-1) will close, and the LED indicator in
the webcam will stop glowing. Now, it is time to understand the program
line-by-line. In the first two lines, we import the required libraries (which
includes OpenCV) and create aliases for our convenience. Then, the
following line creates an object corresponding to the webcam:
cap = cv.VideoCapture(0)
The argument passed is the index of the webcam. If only one webcam
is attached, then the argument passed is always 0. The following two lines
define the resolution at which the webcam will operate:
cap.set(cv.CAP_PROP_FRAME_WIDTH, 256)
cap.set(cv.CAP_PROP_FRAME_HEIGHT, 144)
https://stackoverflow.com/questions/11420748/setting-camera-parameters-in-opencv-python
Also, I wish to make readers aware that we will find the video stream
from the webcam faster and more responsive at lower resolutions. As the
processing power of the Raspberry Pi is limited, we will have more frames
per second (FPS) in the output window at lower resolutions. We can get a
list of all the standard resolutions at the following resources on the web:
https://en.wikipedia.org/wiki/List_of_common_resolutions
https://yusef.es/labs/resolutions.htm
if not cap.isOpened():
    print("Cannot open camera...")
    exit()
The while loop block runs as long as we do not hit the ESC key on the
keyboard.
while True:
    ret, frame = cap.read()
The function call cap.read() returns the status and the frame. The
frame is a NumPy ndarray object that holds the current frame. We can
manipulate it as usual with NumPy and SciPy routines, just as we have
practiced in earlier chapters.
If the status of the webcam frame read operation is unsuccessful, we
should terminate the loop. The following line shows the current frame in
a window:
cv.imshow('frame', frame)
The call cv.waitKey(1) inside the loop checks for a press of the ESC
key (key code 27) and breaks out of the loop when it is detected. Finally,
we release the camera and close all the windows:
cap.release()
cv.destroyAllWindows()
Consider this program as the template for all the programs in the rest
of the chapter. We can modify the program to read and show a video file.
We have to modify the code where we address the webcam, as follows:
cap = cv.VideoCapture('test.mp4')
We are passing the name of the video file as the argument. We also
have to adjust the delay passed to cv.waitKey() in the while loop block so
that the video plays at a watchable frame rate.
import numpy as np
import cv2 as cv
cap = cv.VideoCapture(0)
cap.set(cv.CAP_PROP_FRAME_WIDTH, 256)
cap.set(cv.CAP_PROP_FRAME_HEIGHT, 144)
if not cap.isOpened():
    print("Cannot open camera...")
    exit()
while True:
    ret, frame = cap.read()
    gray = cv.cvtColor(frame,
                       cv.COLOR_BGR2GRAY)
    cv.imshow('frame', gray)
    if cv.waitKey(1) == 27:
        break
cap.release()
cv.destroyAllWindows()
gray = cv.cvtColor(frame,
cv.COLOR_BGR2GRAY)
import numpy as np
import cv2 as cv
from datetime import datetime
cap = cv.VideoCapture(0)
cap.set(cv.CAP_PROP_FRAME_WIDTH, 256)
cap.set(cv.CAP_PROP_FRAME_HEIGHT, 144)
if not cap.isOpened():
    print("Cannot open camera...")
    exit()
while True:
    ret, frame = cap.read()
    print(datetime.now())
    print(frame.dtype)
    print(frame.shape)
    print(frame.ndim)
    print(frame.size)
    b = frame[:, :, 0]
    g = frame[:, :, 1]
    r = frame[:, :, 2]
    cv.imshow('Red', r)
    cv.imshow('Green', g)
    cv.imshow('Blue', b)
    if cv.waitKey(1) == 27:
        break
cap.release()
cv.destroyAllWindows()
In this code listing (prog02.py), we can see the properties of each frame
as well as all the channels, as shown in the output (Figure 10-2).
Also, the console will show the output, like the one here, several times
per second (so, it will scroll really fast):
2022-01-30 13:00:30.427360
uint8
(144, 176, 3)
3
76032
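The frame returned by cap.read() behaves exactly like any other NumPy array; the properties printed above can be reproduced without a webcam by building an array of the same shape:

```python
import numpy as np

# A synthetic BGR frame with the same shape that OpenCV reported above.
frame = np.zeros((144, 176, 3), dtype=np.uint8)
print(frame.dtype)   # uint8
print(frame.shape)   # (144, 176, 3)
print(frame.ndim)    # 3
print(frame.size)    # 76032
# OpenCV stores channels in BGR order, hence the indices below.
b, g, r = frame[:, :, 0], frame[:, :, 1], frame[:, :, 2]
```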
Geometric Transformation
Let’s rotate the webcam frame continuously. Listing 10-4 shows the code.
import numpy as np
import cv2 as cv
import scipy.ndimage as ndi
angle = 0
cap = cv.VideoCapture(0)
cap.set(cv.CAP_PROP_FRAME_WIDTH, 256)
cap.set(cv.CAP_PROP_FRAME_HEIGHT, 144)
if not cap.isOpened():
    print("Cannot open camera...")
    exit()
while True:
    ret, frame = cap.read()
cap.release()
cv.destroyAllWindows()
Convolution
Let’s define a kernel and apply it to the frames in the stream using the
convolution operation, as in Listing 10-5.
import numpy as np
import cv2 as cv
import scipy.ndimage as ndi
cap = cv.VideoCapture(0)
n = 5
k = np.ones((n, n, n), np.uint8)/(n*n*n)
cap.set(cv.CAP_PROP_FRAME_WIDTH, 256)
cap.set(cv.CAP_PROP_FRAME_HEIGHT, 144)
if not cap.isOpened():
    print("Cannot open camera...")
    exit()
while True:
    ret, frame = cap.read()
    output = ndi.convolve(frame, k,
                          mode='nearest')
cap.release()
cv.destroyAllWindows()
The OpenCV library also has the functionality to create a trackbar. Let’s
create a trackbar in such a way that it adjusts the size of the kernel matrix.
Listing 10-6 contains the code.
import numpy as np
import cv2 as cv
import scipy.ndimage as ndi
def nothing(x):
    pass
cv.createTrackbar('Kernel Size',
                  WINDOW_NAME,
                  1, 5, nothing)
cap = cv.VideoCapture(0)
n = 1
cap.set(cv.CAP_PROP_FRAME_WIDTH, 256)
cap.set(cv.CAP_PROP_FRAME_HEIGHT, 144)
if not cap.isOpened():
    print("Cannot open camera...")
    exit()
while True:
    ret, frame = cap.read()
    cv.imshow(WINDOW_NAME, output)
    if cv.waitKey(1) == 27:
        break
cap.release()
cv.destroyAllWindows()
We are already familiar with most of the code. Let’s understand the
new parts. The following code defines an empty function:
def nothing(x):
    pass
cv.createTrackbar('Kernel Size',
WINDOW_NAME,
1, 5, nothing)
We can now change the size of the kernel by sliding the tracker on the
trackbar. Figure 10-5 shows the output.
We are going to use the code for the trackbar frequently throughout the
chapter.
Correlation
Let’s use the trackbar and apply a correlation on the live webcam feed, as
shown in Listing 10-7.
import numpy as np
import cv2 as cv
import scipy.ndimage as ndi
def nothing(x):
    pass
cap.set(cv.CAP_PROP_FRAME_WIDTH, 256)
cap.set(cv.CAP_PROP_FRAME_HEIGHT, 144)
if not cap.isOpened():
    print("Cannot open camera...")
    exit()
while True:
    ret, frame = cap.read()
    cv.imshow(WINDOW_NAME, output)
    if cv.waitKey(1) == 27:
        break
cap.release()
cv.destroyAllWindows()
Filtering
We have learned about many filters in earlier chapters. Let’s apply them on
the live webcam feed one by one. Listing 10-8 shows the Gaussian filter.
import numpy as np
import cv2 as cv
import scipy.ndimage as ndi
def nothing(x):
    pass
cap.set(cv.CAP_PROP_FRAME_WIDTH, 256)
cap.set(cv.CAP_PROP_FRAME_HEIGHT, 144)
if not cap.isOpened():
    print("Cannot open camera...")
    exit()
while True:
    output = ndi.gaussian_filter(frame,
                                 sigma=n)
    cv.imshow(WINDOW_NAME, output)
    if cv.waitKey(1) == 27:
        break
cap.release()
cv.destroyAllWindows()
import numpy as np
import cv2 as cv
import scipy.ndimage as ndi
def nothing(x):
    pass
cap.set(cv.CAP_PROP_FRAME_WIDTH, 256)
cap.set(cv.CAP_PROP_FRAME_HEIGHT, 144)
if not cap.isOpened():
    print("Cannot open camera...")
    exit()
while True:
    ret, frame = cap.read()
    output = ndi.uniform_filter(frame,
                                size=n)
    cv.imshow(WINDOW_NAME, output)
    if cv.waitKey(1) == 27:
        break
cap.release()
cv.destroyAllWindows()
Let’s apply the percentile filter. The code in Listing 10-10 defines two
trackbars for adjusting the arguments passed to the filter.
import numpy as np
import cv2 as cv
import scipy.ndimage as ndi
def nothing(x):
    pass
cap.set(cv.CAP_PROP_FRAME_WIDTH, 256)
cap.set(cv.CAP_PROP_FRAME_HEIGHT, 144)
if not cap.isOpened():
    print("Cannot open camera...")
    exit()
while True:
    ret, frame = cap.read()
    s = cv.getTrackbarPos('Size',
                          WINDOW_NAME)
    p = cv.getTrackbarPos('Percentile',
                          WINDOW_NAME)
    cv.imshow(WINDOW_NAME, output)
    if cv.waitKey(1) == 27:
        break
cap.release()
cv.destroyAllWindows()
import numpy as np
import cv2 as cv
import scipy.ndimage as ndi
def nothing(x):
    pass
cap = cv.VideoCapture(0)
cap.set(cv.CAP_PROP_FRAME_WIDTH, 256)
cap.set(cv.CAP_PROP_FRAME_HEIGHT, 144)
if not cap.isOpened():
    print("Cannot open camera...")
    exit()
while True:
    ret, frame = cap.read()
    output = ndi.minimum_filter(frame,
                                size=n)
    cv.imshow(WINDOW_NAME, output)
    if cv.waitKey(1) == 27:
        break
cap.release()
cv.destroyAllWindows()
import numpy as np
import cv2 as cv
import scipy.ndimage as ndi
def nothing(x):
    pass
cap.set(cv.CAP_PROP_FRAME_WIDTH, 256)
cap.set(cv.CAP_PROP_FRAME_HEIGHT, 144)
if not cap.isOpened():
    print("Cannot open camera...")
    exit()
while True:
    ret, frame = cap.read()
    output = ndi.maximum_filter(frame,
                                size=n)
    cv.imshow(WINDOW_NAME, output)
    if cv.waitKey(1) == 27:
        break
cap.release()
cv.destroyAllWindows()
We can also apply Prewitt and Laplace filters as shown in Listing 10-13.
import numpy as np
import cv2 as cv
import scipy.ndimage as ndi
cap = cv.VideoCapture(0)
cap.set(cv.CAP_PROP_FRAME_WIDTH, 256)
cap.set(cv.CAP_PROP_FRAME_HEIGHT, 144)
if not cap.isOpened():
    print("Cannot open camera...")
    exit()
while True:
    ret, frame = cap.read()
    o1 = ndi.prewitt(frame)
    o2 = ndi.laplace(frame)
    cv.imshow("Prewitt", o1)
    cv.imshow("Laplace", o2)
    if cv.waitKey(1) == 27:
        break
cap.release()
cv.destroyAllWindows()
Note We can read more about both filters at the following URLs:
https://www.geeksforgeeks.org/matlab-image-edge-detection-using-prewitt-operator-from-scratch/
https://www.l3harrisgeospatial.com/docs/laplacianfilters.html
import numpy as np
import cv2 as cv
import scipy.ndimage as ndi
def nothing(x):
    pass
cap.set(cv.CAP_PROP_FRAME_WIDTH, 256)
cap.set(cv.CAP_PROP_FRAME_HEIGHT, 144)
if not cap.isOpened():
    print("Cannot open camera...")
    exit()
while True:
    ret, frame = cap.read()
    s = cv.getTrackbarPos('Sigma',
                          WINDOW_NAME)
    output = ndi.gaussian_gradient_magnitude(gray,
                                             sigma=s)
    cv.imshow(WINDOW_NAME, output)
    if cv.waitKey(1) == 27:
        break
cap.release()
cv.destroyAllWindows()
import numpy as np
import cv2 as cv
import scipy.ndimage as ndi
def nothing(x):
    pass
cap.set(cv.CAP_PROP_FRAME_WIDTH, 256)
cap.set(cv.CAP_PROP_FRAME_HEIGHT, 144)
if not cap.isOpened():
    print("Cannot open camera...")
    exit()
while True:
    ret, frame = cap.read()
    s = cv.getTrackbarPos('Sigma',
                          WINDOW_NAME)
    output = ndi.gaussian_laplace(gray,
                                  sigma=s)
    cv.imshow(WINDOW_NAME, output)
    if cv.waitKey(1) == 27:
        break
cap.release()
cv.destroyAllWindows()
Morphological Operations
We can also apply morphological operations on live video, as shown in
Listing 10-16.
import numpy as np
import cv2 as cv
import scipy.ndimage as ndi
def nothing(x):
    pass
cv.createTrackbar('Sigma',
WINDOW_NAME,
1, 10, nothing)
cap = cv.VideoCapture(0)
cap.set(cv.CAP_PROP_FRAME_WIDTH, 256)
cap.set(cv.CAP_PROP_FRAME_HEIGHT, 144)
if not cap.isOpened():
    print("Cannot open camera...")
    exit()
while True:
    ret, frame = cap.read()
    n1 = cv.getTrackbarPos('Sigma',
                           WINDOW_NAME)
    cv.imshow(WINDOW_NAME, output)
    if cv.waitKey(1) == 27:
        break
cap.release()
cv.destroyAllWindows()
import numpy as np
import cv2 as cv
import scipy.ndimage as ndi
def nothing(x):
    pass
cv.createTrackbar('Sigma',
WINDOW_NAME,
1, 10, nothing)
cap = cv.VideoCapture(0)
cap.set(cv.CAP_PROP_FRAME_WIDTH, 256)
cap.set(cv.CAP_PROP_FRAME_HEIGHT, 144)
if not cap.isOpened():
    print("Cannot open camera...")
    exit()
while True:
    ret, frame = cap.read()
    n1 = cv.getTrackbarPos('Sigma',
                           WINDOW_NAME)
    cv.imshow(WINDOW_NAME, output)
    if cv.waitKey(1) == 27:
        break
cap.release()
cv.destroyAllWindows()
Summary
In this chapter, we learned how to apply the SciPy routines to a live video
feed from a webcam and on video files. We are now comfortable with
processing videos (both live and stored files).
Conclusion
In this book, we covered various Python libraries meant for image
processing. We also used the Raspberry Pi 4 single-board computer
with the Raspberry Pi OS Debian Buster as the preferred platform for
demonstrating all the Python 3 code. All the code examples can be run on
any other operating system as well, such as Windows, macOS, other Linux
distributions, Unix distributions, and BSD (such as FreeBSD).
Appendix
This section examines topics that will help us to work with image
processing and the Raspberry Pi SBC in a more efficient way. While I could not
find a suitable place for these topics in any of the chapters, I personally feel
that they are quite useful in the learning process. This appendix section
covers the following topics:
• Connecting a display
We can add some text to an image with the code shown in Listing A-4.
These are the basics of the library. Please run all the preceding
programs and see the output. One can explore this library further at
https://pythonhosted.org/pgmagick/.
Connecting a Display
Up until now, we have been accessing the Raspberry Pi in headless
mode (connecting using SSH and X-11 forwarding using Wi-Fi). We can
also connect a visual display (an HDMI/VGA monitor) by choosing the
appropriate connector. If we are opting for HDMI with RPi 4B, then we
need a micro-HDMI to HDMI adapter (check online marketplaces for
this). If we are opting for a VGA monitor, we need a micro-HDMI to VGA
adapter. If we check online marketplaces, we can find a micro-HDMI to
HDMI and VGA combined adapter that takes care of both cases.
Open the config.txt file and make the following changes to it:
Save the file after making these changes. The microSD card is now
ready for the Pi with a VGA monitor.
Click on the Next button, and you will be shown the following window
(Figure A-2).
Choose the most appropriate options for your region and click Next
(you can also navigate to the previous options by clicking on Back). You
will see the following (Figure A-3) window.
Check the box if the taskbar (in the top area of the screen) does
not fit the screen, and then click on Next. It shows the following
(Figure A-5) window.
We can set up the Wi-Fi here (if not set up already; if it is, click on the
Skip button), and it shows the update window (Figure A-6).
We can click on the Next button, and it will update the software (we
can also skip this, but I recommend updating the software). Once done, it
will show a success message (Figure A-7).
Here, click on the Restart button. After reboot, if you have skipped
changing the password, it will show the following warning (Figure A-8).
Here, make sure that at least the options SSH and VNC are enabled.
I need other options for my development and programming lessons, so I
enabled them all.
Finally, if you are connecting for the very first time, it will show a
warning message. Click on the Always button, and it will launch the
remote desktop (Figure A-12).