How to do image processing in Python: Step-by-Step Guide
Working with images is an integral part of many technology solutions today. Most
developers would agree that processing images in Python can be challenging initially.
This article will provide a step-by-step guide to mastering image processing in
Python. You'll learn the fundamentals, essential techniques, and even advanced
methods to build real-world image processing applications.
We'll cover everything from setting up the Python environment to manipulating,
enhancing, and segmenting images. You'll also see how to develop projects for
classification, detection, and even building an image search engine. By the end,
you'll have the skills to tackle any image processing task in Python.
Introduction to Image Processing with Python
Image processing refers to various techniques that allow computers to understand
and modify digital images. It involves analyzing pixel information to perform
operations like identifying objects, detecting edges, adjusting brightness/contrast,
applying filters, recognizing text, etc.
Python is a popular language for image processing due to its extensive libraries,
simple syntax, and active developer community. Key libraries like OpenCV,
PIL/Pillow, scikit-image, and more enable you to work with images in Python.
Understanding the Basics of Image Processing
Image processing relies on analyzing pixel data from digital images to identify and
modify elements within them. Key concepts include:
Image acquisition: Capturing or importing images via cameras, scanners etc.
Preprocessing: Transforming images before analysis (resizing, rotation, noise
removal etc.).
Feature detection: Identifying pixels/regions of interest like edges, corners or
objects.
Analysis: Extracting meaningful information from images using the detected
features.
Manipulation: Transforming images based on the extracted information (filtering,
morphing etc.).
The Advantages of Python in Image Processing
Python is a preferred language for image processing due to:
Extensive libraries like OpenCV, PIL/Pillow, scikit-image etc. offering specialized
functionality.
Simple and readable code thanks to its clean syntax. Easy for beginners to adopt.
Vibrant developer community providing abundant code examples and
troubleshooting support.
Interoperability with languages like C++ for performance-critical operations.
Rapid prototyping enabled by Python's interpreted nature.
Overview of Python Image Processing Libraries
Some key image processing libraries in Python include:
OpenCV: Comprehensive library with over 2500 algorithms ranging from facial
recognition to shape analysis.
PIL/Pillow: Offers basic image handling and processing functionality.
scikit-image: Implements algorithms for segmentation, filtering, feature detection
etc.
Mahotas: Specialized library for computer vision operations.
SimpleCV: Provides an easy interface to OpenCV for rapid prototyping.
With these mature libraries, Python makes an excellent choice for developing image
processing and computer vision applications.
How do I start image processing in Python?
To get started with image processing in Python, follow these key steps:
Import Required Libraries
The main library used for image processing in Python is OpenCV (Open Source
Computer Vision Library). Other useful libraries include scikit-image, Pillow,
matplotlib, etc.
import cv2
import numpy as np
from skimage.io import imread
import matplotlib.pyplot as plt
Load the Image
Use imread() from scikit-image or cv2.imread() from OpenCV to load images into
Python.
img = imread('image.jpg')
Perform Image Processing Techniques
There are many image processing techniques like blurring, sharpening, thresholding,
filtering, edge detection etc. that can be applied.
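For example, to convert an image to grayscale. Note that imread() from scikit-image
returns channels in RGB order, so the RGB flag is used here; images loaded with
cv2.imread() are in BGR order and need cv2.COLOR_BGR2GRAY instead:
gray_img = cv2.cvtColor(img, cv2.COLOR_RGB2GRAY)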
Save/Display Result
Use matplotlib to display images. To save processed images, use cv2.imwrite() .
plt.imshow(gray_img, cmap='gray')
plt.show()
cv2.imwrite('gray_image.jpg', gray_img)
This covers the basic workflow to load, process and visualize images in Python.
Check out OpenCV and scikit-image documentation for more image processing
operations.
Is image processing with Python easy?
Python makes image processing very accessible due to its extensive libraries and
ready-made functions. For example, the OpenCV library provides over 500 functions
for common image processing tasks like:
Image resizing and rotation
Blurring and sharpening
Edge detection
Object detection
You don't need to code these from scratch - just call the function and pass in your
image. This makes development much faster compared to lower-level languages like
C++.
Here's a simple example to resize an image with OpenCV in four lines of Python:
import cv2
img = cv2.imread('image.jpg')
resized = cv2.resize(img, (100, 100))
cv2.imwrite('resized.jpg', resized)
So while you still need some programming knowledge, Python and libraries like
OpenCV, scikit-image and Pillow make image processing tasks straightforward for
developers at any level.
The key benefits are:
Simple syntax and readability
Extensive libraries for common tasks
High-level functions instead of coding from scratch
Rapid prototyping and development
This makes Python a popular choice for computer vision and image processing.
What is the Python tool for image processing?
Pillow, the maintained fork of PIL, is one of the most widely used Python libraries
for image processing. Here are some key things to know about Pillow:
Open-source library that builds on the now-discontinued PIL (Python Imaging Library)
Provides extensive support for different image formats like JPEG, PNG, GIF, BMP
and TIFF
Useful for basic image manipulation tasks like resizing, cropping, rotating, blurring
etc.
Has image enhancement capabilities like contrast adjustment, sharpening, color
space conversions etc.
Supports creating thumbnails, applying filters, drawing shapes and text onto images
Integrates well with popular Python data analysis libraries like NumPy and SciPy
In summary, Pillow offers a versatile toolkit to load, manipulate and save images for
various applications using Python. Its simple API, maturity as a library and integration
with NumPy make it a convenient choice for developers looking to integrate image
processing capabilities into their Python programs.
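As a rough illustration of that toolkit, the sketch below creates an image in memory, makes a thumbnail, applies a filter, and converts to and from NumPy. The colors and sizes are arbitrary assumptions chosen for the example.

```python
import numpy as np
from PIL import Image, ImageFilter

# Create a 200x100 RGB image in memory instead of opening a file.
img = Image.new("RGB", (200, 100), color=(30, 144, 255))

# thumbnail() resizes in place and preserves the aspect ratio,
# so it is applied to a copy to keep the original untouched.
thumb = img.copy()
thumb.thumbnail((64, 64))

# Apply a built-in filter, then convert to 8-bit grayscale ("L" mode).
blurred = img.filter(ImageFilter.GaussianBlur(radius=2))
gray = blurred.convert("L")

# Pillow images convert to NumPy arrays (rows, columns, channels) and back.
arr = np.asarray(img)
round_trip = Image.fromarray(arr)
```

Note the differing conventions: Pillow reports size as (width, height), while the NumPy array shape is (height, width, channels).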
Which algorithm is used for image processing in
Python?
Python has several algorithms and libraries that are commonly used for image
processing tasks. Some of the most popular options include:
SciPy - This scientific computing library contains modules for image processing like
binary morphology, filtering, interpolation, etc. It is useful for tasks like image
enhancement, restoration, and segmentation.
OpenCV - The OpenCV library is widely used for computer vision and image
processing. It provides algorithms for tasks ranging from facial recognition to image
stitching. Useful for object detection, classification, and tracking.
scikit-image - Also known as skimage, this library focuses specifically on image
processing. It has tools for segmentation, denoising, feature extraction, registration
and more. Easy to use and integrate into machine learning workflows.
Pillow - Pillow is a popular Python imaging library used for basic image manipulation
like resizing, cropping, filtering, color space conversions etc. Handy for preparing
images for input/output.
So in summary, SciPy and scikit-image are good for scientific image analysis while
OpenCV focuses on computer vision. Pillow provides general utility functions for
image handling. The choice depends on the specific task - classification, object
recognition, enhancement etc. But all these libraries complement each other.
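To make the SciPy entry concrete, here is a small sketch of binary morphology and connected-component labeling with scipy.ndimage. The toy mask and the 3x3 structuring element are assumptions picked so the result is easy to follow.

```python
import numpy as np
from scipy import ndimage

# Toy binary mask: two solid blobs plus one isolated noisy pixel.
binary = np.zeros((10, 10), dtype=bool)
binary[1:4, 1:4] = True     # 3x3 blob
binary[6:9, 5:9] = True     # 3x4 blob
binary[0, 9] = True         # single-pixel noise

# Binary opening (erosion followed by dilation) removes structures
# smaller than the 3x3 structuring element and keeps the blobs.
cleaned = ndimage.binary_opening(binary, structure=np.ones((3, 3)))

# Label the remaining connected regions; count is the number of blobs.
labels, count = ndimage.label(cleaned)
```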
Preparing the Python Image Processing Environment
Installing Python and PIP for Image Processing
To get started with image processing in Python, you'll need to have Python and PIP
(Python package manager) installed on your system. Here are step-by-step
instructions for installation:
1. Download the latest Python release from python.org. Make sure to download
version 3.6 or higher.
2. Follow the installation wizard, customizing any options as desired. Make sure
Python is added to your system's PATH.
3. Open a new command prompt window and run pip --version to confirm PIP is
installed with Python. If not, run python -m ensurepip --upgrade to install it.
Once Python and PIP are installed, you have the base environment ready for image
processing libraries.
Installing Essential Python Image Processing Libraries
The main libraries we'll use are:
OpenCV - for core image processing operations
NumPy - provides multidimensional array data structures
SciPy - adds scientific computing routines, including the ndimage module for image filtering
Pillow - adds support for image file reading/writing
To install them:
1. At the command prompt, run: pip install opencv-python
2. Run: pip install numpy scipy
3. Run: pip install pillow
This will download and install the latest versions of these important libraries.
Other useful optional libraries like scikit-image, Mahotas, SimpleITK can also be
installed via PIP.
Importing Libraries for Image Processing in Python
Once the libraries are installed, we can import them into our Python scripts.
For example:
import cv2
import numpy as np
from PIL import Image
import scipy.ndimage
OpenCV is imported under its module name cv2, and NumPy is conventionally aliased
as np to keep later code short.
The environment is now ready for loading images, applying filters, transformations
and running analysis algorithms!
Fundamentals of Working with Images in Python
Python provides various libraries for working with images, such as OpenCV,
PIL/Pillow, scikit-image, etc. This section will introduce some core concepts and
techniques for handling images in Python.
Loading and Handling Images with OpenCV and PIL
To load an image in Python using OpenCV, we use the cv2.imread() function. For
example:
import cv2
img = cv2.imread('image.jpg')
Similarly, with the Python Imaging Library (PIL), we use the Image.open() method:
from PIL import Image
img = Image.open('image.jpg')
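These functions load the image data into a NumPy array or PIL Image object
respectively, which provides various properties and pixel data access. Note that
cv2.imread() returns None instead of raising an error when a file cannot be read,
and that it stores color channels in BGR order.
Some key attributes when working with loaded image data:
shape : (height, width, channels) of a NumPy array
size : (width, height) of a PIL Image; on a NumPy array it is the total element count
dtype : Data type of pixels in a NumPy array
getpixel() / item() : Get value of a pixel from a PIL Image / NumPy array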
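The attributes listed above can be compared side by side on the same pixel data. The sketch below uses an in-memory array rather than a file; the variable names array_img and pil_img are illustrative.

```python
import numpy as np
from PIL import Image

# A 120-row by 200-column colour image held as a NumPy array.
array_img = np.zeros((120, 200, 3), dtype=np.uint8)
array_img[10, 20] = (255, 128, 0)          # set the pixel at row 10, column 20

# The same data wrapped as a PIL Image.
pil_img = Image.fromarray(array_img)

array_shape = array_img.shape               # (height, width, channels)
pil_size = pil_img.size                     # (width, height)
pixel_dtype = array_img.dtype               # uint8
first_channel = array_img.item(10, 20, 0)   # indexed as (row, column, channel)
pil_pixel = pil_img.getpixel((20, 10))      # indexed as (x, y)
```

The main trap is index order: NumPy uses (row, column), that is (y, x), while Pillow uses (x, y).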
Efficient Image Storing Techniques with OpenCV and PIL
To save an image to disk after processing, OpenCV provides cv2.imwrite() :
cv2.imwrite('new_image.jpg', img)
And with PIL:
img.save('new_image.jpg')
Some best practices for efficient image saving:
Use JPG for photographs (lossy, small files) and PNG for graphics or masks that need lossless storage
Adjust quality parameter for best compression/quality trade-off
Convert normalized float arrays back to the 0-255 uint8 range before saving to 8-bit formats like JPG or PNG
Visualizing Images with Matplotlib in Python
The Matplotlib library provides simple visualization of images using plt.imshow(),
which expects RGB channel order (convert images loaded with cv2.imread() from BGR first):
import matplotlib.pyplot as plt
plt.imshow(img)
plt.show()
Some parameters that help enhance visualization:
cmap : Colormap for intensity values
interpolation : Algorithm for pixel interpolation
This allows inspection of images at various stages of processing pipelines.
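A short sketch of these display options follows. It assumes a synthetic BGR image in place of a loaded file and uses the non-interactive Agg backend so it also runs on machines without a display; drop that line and call plt.show() for interactive use.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # non-interactive backend; an assumption for headless runs
import matplotlib.pyplot as plt
import numpy as np

# Synthetic image in OpenCV's BGR order: pure red.
bgr = np.zeros((50, 50, 3), dtype=np.uint8)
bgr[..., 2] = 255

# Reverse the channel axis (BGR -> RGB) so Matplotlib shows the right colours.
rgb = np.ascontiguousarray(bgr[..., ::-1])
gray = rgb.mean(axis=2)

fig, axes = plt.subplots(1, 2, figsize=(6, 3))
axes[0].imshow(rgb)
axes[0].set_title("RGB")
# cmap maps single-channel intensities to colours; interpolation controls resampling.
axes[1].imshow(gray, cmap="gray", vmin=0, vmax=255, interpolation="nearest")
axes[1].set_title("Grayscale")
for ax in axes:
    ax.axis("off")

figure_path = os.path.join(tempfile.mkdtemp(), "preview.png")
fig.savefig(figure_path)
plt.close(fig)
```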
Essential Image Manipulation Techniques in Python
Image processing is an important capability in Python, enabling tasks like resizing,
cropping, rotating, and otherwise manipulating images. This section will cover some
of the essential image manipulation techniques using popular Python libraries like
OpenCV, PIL/Pillow, NumPy, and SciPy.
Image Resizing with Python Libraries
Resizing images is a common requirement in applications like creating thumbnails,
fitting images to specific dimensions, or scaling for display purposes.
The OpenCV library provides simple methods like cv2.resize() to resize images.
You can specify the output dimensions directly as (width, height):
import cv2
img = cv2.imread('image.jpg')
resized = cv2.resize(img, (100, 200))
The PIL/Pillow library also offers image resizing with Image.resize(), which takes
the target size in pixels as (width, height). To scale by a percentage, compute the
new size from the original dimensions:
from PIL import Image
img = Image.open('image.jpg')
resized = img.resize((100, 100)) # exact size in pixels
half = img.resize((img.width // 2, img.height // 2)) # 50% scale
Both libraries make image resizing straightforward in Python.
Cropping Images Using Python
Cropping extracts a region of interest from an image, a useful technique for focusing
on key parts or removing unwanted areas.
NumPy array slicing provides an easy way to crop images loaded with OpenCV, which
are NumPy arrays. Rows (y) are indexed first, then columns (x). If img is a
NumPy array, we can extract a 100x100 pixel square starting at x=50, y=50 like:
cropped = img[50:150, 50:150]
Alternatively, Pillow's Image.crop() method crops a PIL Image by a
(left, upper, right, lower) box of pixel coordinates:
box = (50, 50, 150, 150)
cropped = img.crop(box)
This selects the same region as the NumPy slicing. Both approaches provide simple
ways to implement cropping.
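The two cropping approaches can be checked against each other on a non-square region, where the index order matters. The coordinates x, y, w, h below are arbitrary example values.

```python
import numpy as np
from PIL import Image

# Random 200-row by 300-column colour image used as test data.
rng = np.random.default_rng(0)
arr = rng.integers(0, 256, size=(200, 300, 3), dtype=np.uint8)

# Region of interest: top-left corner (x, y) and size (w, h).
x, y, w, h = 60, 40, 100, 80

# NumPy slicing: rows (y) first, then columns (x).
numpy_crop = arr[y:y + h, x:x + w]

# Pillow: box is (left, upper, right, lower).
pil_crop = Image.fromarray(arr).crop((x, y, x + w, y + h))
```

NumPy slicing returns a view of the original array, so call .copy() on the crop if the original should not change when the crop is modified.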
Image Rotation and Flipping Techniques
Rotating or flipping images may be required in applications like correcting
orientations or generating augmented data.
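OpenCV provides the cv2.rotate() method for rotating images in 90 degree
increments. For an arbitrary angle, build a rotation matrix with
cv2.getRotationMatrix2D() and apply it with cv2.warpAffine():
rotated90 = cv2.rotate(img, cv2.ROTATE_90_CLOCKWISE)
h, w = img.shape[:2]
M = cv2.getRotationMatrix2D((w / 2, h / 2), 30, 1.0) # center, angle in degrees, scale
rotated30 = cv2.warpAffine(img, M, (w, h))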
Similarly, Pillow offers Image.rotate(), which turns the image counterclockwise, and
Image.transpose() for flips and 90 degree rotations:
rotated90 = img.rotate(90)
flipped = img.transpose(Image.FLIP_LEFT_RIGHT)
These functions enable flexible image rotation and flipping manipulations.
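Two details of Pillow's rotation are easy to miss: positive angles rotate counterclockwise, and the canvas keeps its original size unless expand=True is passed. The sketch below marks one corner pixel to make the direction visible; the image size and marker color are arbitrary.

```python
from PIL import Image

# 200x100 black image with a red marker in the top-left corner.
img = Image.new("RGB", (200, 100), "black")
img.putpixel((0, 0), (255, 0, 0))

# Positive angles rotate counterclockwise; expand=True grows the canvas to fit.
ccw = img.rotate(90, expand=True)

# Horizontal mirror: the marker moves to the top-right corner.
flipped = img.transpose(Image.FLIP_LEFT_RIGHT)

# For a non-right angle, expand=True avoids clipping the corners.
tilted = img.rotate(30, expand=True)
```

After the 90 degree counterclockwise turn, the marker that started in the top-left corner ends up in the bottom-left corner of the 100x200 result.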
Overall, Python imaging libraries like OpenCV and PIL provide powerful yet easy to
use tools for essential image processing techniques, from resizing and cropping to
rotations, making them very useful for tasks like data augmentation and image
correction.
Advanced Image Filtering and Enhancement in Python
Image processing techniques like filtering and enhancement allow you to manipulate
images in Python to achieve various effects. This guide will demonstrate some
advanced methods using the OpenCV library.
Image Blurring Techniques with OpenCV
Applying blur effects can be useful for reducing image noise. OpenCV provides
several blurring techniques:
Box (averaging) filter - Simple averaging of pixel neighborhoods with cv2.blur().
Easy to apply but can produce blocky, unnatural looking results.
Gaussian blur - Uses a Gaussian kernel to produce more natural blurs. Adjustable
kernel size allows control over blur intensity. Useful for smoothing noise, although
it softens edges as well; cv2.medianBlur() and cv2.bilateralFilter() preserve edges better.
Here is an example applying a 15 x 15 Gaussian blur in OpenCV Python:
import cv2
image = cv2.imread('image.jpg')
blurred = cv2.GaussianBlur(image, (15, 15), 0)
cv2.imwrite('blurred.jpg', blurred)
This smooths the image. Passing 0 as the third argument lets OpenCV derive the
Gaussian standard deviation from the kernel size.
Sharpening Images with Convolutional Filters
Sharpening increases the apparent crispness of an image. It cannot recover detail
lost to poor focus, but convolutional filters accentuate edges and fine details.
Some OpenCV sharpening filter options:
Unsharp masking - Boosts edge contrast for perceived sharpness.
Laplacian filters - Detects rapid changes in pixel values to emphasize edges.
High-pass filters - Retain high frequency details while suppressing lower
frequencies.
Here's an example using a Laplacian-based 3x3 sharpening kernel in OpenCV:
import cv2
import numpy as np
image = cv2.imread('image.jpg')
kernel = np.array([[0, -1, 0],
[-1, 5,-1],
[0, -1, 0]])
sharpened = cv2.filter2D(image, -1, kernel)
This brings out finer details for improved clarity.
Edge Detection in Python Using Canny Algorithm
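The Canny algorithm is widely used for edge detection. It applies Gaussian
smoothing to reduce noise, computes intensity gradients to highlight edges, thins
them with non-maximum suppression, then uses hysteresis thresholding to discard
weak or disconnected edges.
Here is Canny edge detection in OpenCV Python, where 100 and 200 are the lower and
upper hysteresis thresholds: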
import cv2
image = cv2.imread('image.jpg')
edges = cv2.Canny(image, 100, 200)
cv2.imwrite('canny_edges.jpg', edges)
This produces a clear edge map isolating prominent contours in the image.
Advanced filters like these enable effective image analysis and manipulation with
OpenCV in Python.
Exploring Image Segmentation Techniques with Python
Image segmentation is an important technique in image processing and computer
vision that involves partitioning an image into multiple segments. This allows easier
analysis of the image contents by simplifying representation into something more
meaningful and easier to analyze.
Python offers simple and powerful tools to perform image segmentation thanks to
libraries like OpenCV, scikit-image, and others. In this section, we'll explore some of
the popular image segmentation techniques and how to implement them in Python.
Applying Thresholding Techniques in Python
Thresholding is one of the simplest segmentation methods. It converts a grayscale
image to a binary image by setting pixel values above a threshold to white and
values below to black. This separates the image into foreground and background
regions.
Here is an example using OpenCV's threshold function:
import cv2
import numpy as np
img = cv2.imread('image.jpg', 0)
ret, thresh = cv2.threshold(img, 127, 255, cv2.THRESH_BINARY)
We can also use adaptive thresholding which calculates the threshold for smaller
regions, giving better results for images with varying illumination:
thresh_adapt = cv2.adaptiveThreshold(img, 255, cv2.ADAPTIVE_THRESH_MEAN_C,
cv2.THRESH_BINARY, 11, 2)
Segmentation with Watershed Algorithm in Python
The watershed algorithm treats an image like a topographic map, with pixel
intensities representing heights. It then finds "catchment basins" and "watershed
ridge lines" to segment the image.
We can use OpenCV's implementation in Python:
import numpy as np
import cv2
from matplotlib import pyplot as plt
img = cv2.imread('coins.jpg')
gray = cv2.cvtColor(img,cv2.COLOR_BGR2GRAY)
ret, thresh = cv2.threshold(gray,0,255,cv2.THRESH_BINARY_INV+cv2.THRESH_OTSU)
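# Noise removal with morphological opening
kernel = np.ones((3,3),np.uint8)
opening = cv2.morphologyEx(thresh,cv2.MORPH_OPEN,kernel, iterations = 2)
# Sure background: dilate the foreground so everything outside is background
sure_bg = cv2.dilate(opening,kernel,iterations=3)
# Sure foreground: pixels far from any boundary
dist_transform = cv2.distanceTransform(opening,cv2.DIST_L2,5)
ret, sure_fg = cv2.threshold(dist_transform,0.7*dist_transform.max(),255,0)
sure_fg = np.uint8(sure_fg)
# Unknown region lies between sure background and sure foreground
unknown = cv2.subtract(sure_bg,sure_fg)
# Label markers: background becomes 1, objects 2 and up, unknown region 0
ret, markers = cv2.connectedComponents(sure_fg)
markers = markers+1
markers[unknown==255] = 0
# Apply watershed; boundary pixels are marked with -1 and drawn in green
markers = cv2.watershed(img,markers)
img[markers == -1] = [0,255,0]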
This performs several pre and post-processing steps on the image before applying
watershed. The final segmented image separates each coin successfully.
Foreground Extraction with GrabCut Algorithm
GrabCut is an interactive segmentation method. It allows a user to draw an initial
bounding box around the foreground object to extract. It then iteratively refines the
segmentation by modeling foreground and background colors with Gaussian mixture models.
Here is an example with OpenCV:
import numpy as np
import cv2
from matplotlib import pyplot as plt
img = cv2.imread('messi.jpg')
mask = np.zeros(img.shape[:2],np.uint8)
bgdModel = np.zeros((1,65),np.float64)
fgdModel = np.zeros((1,65),np.float64)
rect = (50,50,450,290)
cv2.grabCut(img,mask,rect,bgdModel,fgdModel,5,cv2.GC_INIT_WITH_RECT)
mask = np.where((mask==2)|(mask==0),0,1).astype('uint8')
img = img*mask[:,:,np.newaxis]
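We initialize a rectangle, given as (x, y, width, height), around Messi. GrabCut then
evolves the segmentation to tightly fit just the foreground object. The final mask keeps
pixels labeled definite or probable foreground (1 and 3) and zeroes out definite or
probable background (0 and 2).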
Developing Python Image Processing Projects
Image processing is an exciting field with many real-world applications. Here are
some ideas for Python image processing projects you can develop to put your skills
to use:
Image Classification Projects with Python and OpenCV
Image classification involves training machine learning models to categorize images
into different classes. Here are some project ideas:
Build a custom image classifier to detect specific objects. Gather images of those
objects, label them, and train a convolutional neural network in a deep learning
framework, using OpenCV and Python for image loading and preprocessing. This could be
used for quality control in manufacturing, identifying wildlife with camera traps, or
even detecting ripe produce.
Create a classifier that can identify plant diseases from leaf images. Collect images
of healthy and infected plant leaves, label them by disease type, and train a model to
categorize new leaf images by disease. This could help farmers identify crop
infections early to prevent spread.
Develop an app that identifies dog breeds from user-submitted photos. Use transfer
learning with a pre-trained model like ResNet50 to retrain the final layer, adding new
output classes for different dog breeds. Collect labeled photos of each breed to train the classifier.
The key steps are gathering a dataset, labeling images, training/validating/testing
models, and exporting the model to production. Use data
augmentation, hyperparameter tuning, and techniques like transfer learning to
improve accuracy.
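Data augmentation, mentioned above, can be sketched with NumPy alone. The helper name augment and the specific transforms (random flip, 90 degree rotation, brightness scaling between 0.8 and 1.2) are illustrative assumptions; real projects typically use the augmentation tools of their training framework.

```python
import numpy as np

def augment(image, rng):
    """Return a randomly flipped, rotated, and brightness-scaled copy (hypothetical helper)."""
    out = image
    if rng.random() < 0.5:
        out = out[:, ::-1]                       # horizontal flip
    quarter_turns = int(rng.integers(0, 4))      # 0, 90, 180, or 270 degrees
    out = np.rot90(out, quarter_turns)
    factor = rng.uniform(0.8, 1.2)               # brightness scaling
    out = np.clip(out.astype(np.float32) * factor, 0, 255)
    return out.astype(np.uint8)

rng = np.random.default_rng(0)
# Square synthetic image so rotations keep the same shape.
image = rng.integers(0, 256, size=(32, 32, 3), dtype=np.uint8)

# Generate eight augmented variants of the same training image.
batch = [augment(image, rng) for _ in range(8)]
```

Only apply transforms that preserve the label: a horizontal flip is harmless for dog breeds but would change the meaning of text or road signs.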
Object Localization and Detection with Python
Locating and drawing bounding box regions around objects in images is another
useful application of computer vision. Project ideas include:
Face detection app that draws boxes around faces in images. Use Haar cascades
with OpenCV to identify facial features. Could be used to automatically tag people in
photos.
Traffic camera analyzer that highlights all vehicles in a traffic video feed. Use
background subtraction and contour detection to identify cars and trucks and draw
boxes around them. Useful for automated traffic monitoring.
Product scanner that locates retail products on store shelves. Train an object
detection model on product images and apply it to shelf images to identify and
highlight items. Assist with inventory audits and checking stock levels.
The key techniques are training object detection models like SSD and YOLOv3 or
using Haar cascades for things like faces. Outputs are bounding box regions
identifying object locations.
Creating an Image Search Engine with Python
Building a reverse image search engine lets people discover similar images. Project
ideas include:
Fashion image search site for finding clothing and accessory ideas. Allow image
uploads and return visually similar catalog images, linking to shopping options.
Interior design search tool for matching furniture and decor styles. Index interior
images and return the most similar images from the database to user uploads.
Plagiarism checker that compares figures and scanned pages in essay submissions
against indexed sources to detect copied work. Use image hashing to compare incoming
images to indexed original content.
Use perceptual image hashing to give images a fingerprint. Index the hashes for
storage and fast lookup. Calculate hash of search images and find closest matches
in index using a distance metric like Hamming distance.
The key skills are building an image database, generating hashes, indexing for
search, and writing matching logic. These allow building versatile search apps.
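The hashing and matching steps can be sketched with an average hash, one of the simplest perceptual hashes. This is a minimal NumPy sketch under assumptions: the function names are hypothetical, the hash is 8x8 (64 bits), and production systems often use more robust hashes.

```python
import numpy as np

def average_hash(gray, hash_size=8):
    """64-bit average hash of a 2D grayscale array (hypothetical helper)."""
    height, width = gray.shape
    # Crop so both dimensions divide evenly into hash_size blocks.
    h_crop = height - height % hash_size
    w_crop = width - width % hash_size
    blocks = gray[:h_crop, :w_crop].astype(np.float64).reshape(
        hash_size, h_crop // hash_size, hash_size, w_crop // hash_size
    )
    small = blocks.mean(axis=(1, 3))          # downscale by block averaging
    return (small > small.mean()).flatten()   # one bit per block

def hamming_distance(hash_a, hash_b):
    """Number of differing bits between two hashes."""
    return int(np.count_nonzero(hash_a != hash_b))

# Horizontal gradient, a slightly re-toned copy, and a vertical gradient.
original = np.tile(np.linspace(0, 255, 64), (64, 1)).astype(np.uint8)
retoned = (original.astype(np.float64) * 0.9 + 10).astype(np.uint8)
different = original.T.copy()

near_distance = hamming_distance(average_hash(original), average_hash(retoned))
far_distance = hamming_distance(average_hash(original), average_hash(different))
```

The re-toned copy hashes identically to the original (distance 0), while the vertical gradient differs in half of the 64 bits (distance 32). A search engine would store one hash per indexed image and return the entries with the smallest distance to the query hash.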
Conclusion: Mastering Python Image Processing
Python is a versatile programming language that offers powerful image processing
capabilities. By following this tutorial, you have learned key image processing
techniques in Python:
Image resizing, cropping, rotation and flipping using OpenCV and Pillow to
manipulate image properties
Applying filters like blurring, sharpening and edge detection to alter image appearance
Segmenting images with thresholding, watershed and GrabCut
Detecting and localizing objects in images with OpenCV
Planning projects for image classification and image search
With these fundamentals, you can now confidently take on more advanced Python
computer vision and image analysis projects. Check out the OpenCV and Pillow
documentation to continue expanding your skills. Additionally, active communities like
PyImageSearch provide code examples and applied tutorials on cutting-edge
techniques.
By mastering Python image processing, you open up possibilities in diverse fields like
medical imaging, satellite imagery analysis, machine inspection systems, facial
recognition, and more. This versatile skill set will serve you well in both research and
industry applications.