How to Do Image Processing in Python: Step-By-Step Guide
Working with images is an integral part of many technology solutions today. Most
developers would agree that processing images in Python can be challenging initially.
Python is a popular language for image processing due to its extensive libraries,
simple syntax, and active developer community. Key libraries like OpenCV,
PIL/Pillow, scikit-image, and more enable you to work with images in Python.
Image processing relies on analyzing pixel data from digital images to identify and
modify elements within them. Key concepts include:
Simple and readable code thanks to its clean syntax. Easy for beginners to adopt.
OpenCV: Comprehensive library with over 2500 algorithms ranging from facial
recognition to shape analysis.
PIL/Pillow: Offers basic image handling and processing functionality.
With these mature libraries, Python makes an excellent choice for developing image
processing and computer vision applications.
To get started with image processing in Python, follow these key steps:
The main library used for image processing in Python is OpenCV (Open Source
Computer Vision Library). Other useful libraries include scikit-image, Pillow,
matplotlib, etc.
import cv2
import numpy as np
from skimage.io import imread
import matplotlib.pyplot as plt
Use imread() from scikit-image or cv2.imread() from OpenCV to load images into
Python.
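img = imread('image.jpg')
# Convert to grayscale (assumes a color image; skimage's imread returns RGB order)
gray_img = cv2.cvtColor(img, cv2.COLOR_RGB2GRAY)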
Save/Display Result
plt.imshow(gray_img, cmap='gray')
plt.show()
cv2.imwrite('gray_image.jpg', gray_img)
This covers the basic workflow to load, process and visualize images in Python.
Check out OpenCV and scikit-image documentation for more image processing
operations.
Python makes image processing very accessible due to its extensive libraries and
ready-made functions. For example, the OpenCV library provides over 500 functions
for common image processing tasks like:
Edge detection
Object detection
You don't need to code these from scratch - just call the function and pass in your
image. This makes development much faster compared to lower-level languages like
C++.
Here's a simple example that resizes an image with OpenCV in four lines of Python:
import cv2
img = cv2.imread('image.jpg')
resized = cv2.resize(img, (100, 100))
cv2.imwrite('resized.jpg', resized)
So while you still need some programming knowledge, Python and libraries like
OpenCV, scikit-image and Pillow make image processing tasks straightforward for
developers at any level.
This makes Python a popular choice for computer vision and image processing.
Pillow (the maintained fork of PIL) is one of the most widely used Python libraries
for image processing. Here are some key things to know about Pillow:
Open-source library that builds on the now-discontinued PIL (Python Imaging Library)
Provides extensive support for different image formats like JPEG, PNG, GIF, BMP
and TIFF
Useful for basic image manipulation tasks like resizing, cropping, rotating, blurring
etc.
Has image enhancement capabilities like contrast adjustment, sharpening, color
space conversions etc.
Supports creating thumbnails, applying filters, drawing shapes and text onto images
Integrates well with popular Python data analysis libraries like NumPy and SciPy
In summary, Pillow offers a versatile toolkit to load, manipulate and save images for
various applications using Python. Its simple API, maturity as a library and integration
with NumPy make it a convenient choice for developers looking to integrate image
processing capabilities into their Python programs.
Several Python libraries are commonly used for image processing tasks. Some of
the most popular options include:
SciPy - This scientific computing library contains modules for image processing like
binary morphology, filtering, interpolation, etc. It is useful for tasks like image
enhancement, restoration, and segmentation.
OpenCV - The OpenCV library is widely used for computer vision and image
processing. It provides algorithms for tasks ranging from facial recognition to image
stitching. Useful for object detection, classification, and tracking.
Pillow - Pillow is a popular Python imaging library used for basic image manipulation
like resizing, cropping, filtering, color space conversions etc. Handy for preparing
images for input/output.
So in summary, SciPy and scikit-image are good for scientific image analysis while
OpenCV focuses on computer vision. Pillow provides general utility functions for
image handling. The choice depends on the specific task - classification, object
recognition, enhancement etc. But all these libraries complement each other.
To get started with image processing in Python, you'll need to have Python and pip
(Python's package installer) installed on your system. Here are step-by-step
instructions for installation:
1. Download the latest Python release from python.org. Use a currently supported
Python 3 version, since recent releases of the libraries below no longer support
older versions such as 3.6.
2. Follow the installation wizard, customizing any options as desired. Make sure
Python is added to your system's PATH.
3. Open a new command prompt window and run pip --version to confirm pip is
installed with Python. If not, run python -m ensurepip --upgrade or follow the
official pip installation instructions.
Once Python and pip are installed, you have the base environment ready for image
processing libraries.
To install them:
1. At the command prompt, run: pip install opencv-python
2. Run: pip install numpy scipy
3. Run: pip install pillow
This will download and install the latest versions of these important libraries.
Other useful optional libraries such as scikit-image, Mahotas, and SimpleITK can
also be installed via pip.
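A quick way to confirm the installs worked is to check that each library can be found by Python. The sketch below is only an illustration: the helper name check_installed and the MODULES mapping are made up for this example, and it uses the standard library's importlib.util.find_spec. Note that the import name can differ from the pip package name.

```python
import importlib.util

# Hypothetical mapping for this example: pip package name -> import name
MODULES = {
    "opencv-python": "cv2",
    "numpy": "numpy",
    "scipy": "scipy",
    "pillow": "PIL",
}

def check_installed(modules=MODULES):
    """Return {package name: True/False} depending on whether it can be imported."""
    return {
        package: importlib.util.find_spec(module) is not None
        for package, module in modules.items()
    }

result = check_installed()
for package, found in result.items():
    print(f"{package}: {'installed' if found else 'missing'}")
```

Any package reported as missing can be installed with the matching pip install command from the steps above.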
Once the libraries are installed, we can import them into our Python scripts.
For example:
import cv2
import numpy as np
from PIL import Image
import scipy.ndimage
OpenCV is imported under its module name cv2, and NumPy gets the conventional alias np to keep later code short.
The environment is now ready for loading images, applying filters, transformations
and running analysis algorithms!
Python provides various libraries for working with images, such as OpenCV,
PIL/Pillow, scikit-image, etc. This section will introduce some core concepts and
techniques for handling images in Python.
To load an image in Python using OpenCV, we use the cv2.imread() function. For
example:
import cv2
img = cv2.imread('image.jpg')
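Similarly, with Pillow (the maintained fork of the Python Imaging Library, PIL), we
use the Image.open() function:
from PIL import Image
img = Image.open('image.jpg')
These functions load the image data into a NumPy array or PIL Image object
respectively, which provides various properties and pixel data access. Note that
cv2.imread() returns None instead of raising an error if the file cannot be read.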
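To make the two representations concrete, here is a minimal sketch of the properties and pixel access each one offers. It assumes a small synthetic red image built in memory (not a file from this guide), so it runs without any image on disk.

```python
import numpy as np
from PIL import Image

# Synthetic image, 4 pixels wide and 3 pixels high, filled with red
pil_img = Image.new("RGB", (4, 3), color=(255, 0, 0))

print(pil_img.size, pil_img.mode)   # Pillow reports (width, height) and the color mode
print(pil_img.getpixel((0, 0)))     # Pillow addresses pixels as (x, y)

# The same data viewed as a NumPy array, the form OpenCV works with
arr = np.asarray(pil_img)
print(arr.shape, arr.dtype)         # NumPy reports (height, width, channels)
print(arr[0, 0])                    # NumPy addresses pixels as [row, column]
```

Keep in mind that Pillow uses (width, height) while NumPy shapes are (height, width, channels), and that arrays loaded with cv2.imread() store channels in BGR order rather than RGB.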
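# Save with OpenCV (img is the NumPy array returned by cv2.imread)
cv2.imwrite('new_image.jpg', img)
# Save with Pillow (img is the Image returned by Image.open)
img.save('new_image.jpg')
# Display with matplotlib (convert an OpenCV BGR array with cv2.cvtColor(img, cv2.COLOR_BGR2RGB) first)
import matplotlib.pyplot as plt
plt.imshow(img)
plt.show()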
The OpenCV library provides simple methods like cv2.resize() to resize images.
You can specify the output dimensions directly:
import cv2
img = cv2.imread('image.jpg')
resized = cv2.resize(img, (100, 200))  # target size is (width, height)
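The PIL/Pillow library also offers image resizing with Image.resize(), which takes
the target size in pixels as (width, height). To scale by a percentage, compute the
new size from the current dimensions:
from PIL import Image
img = Image.open('image.jpg')
resized = img.resize((100, 100))  # exact size in pixels
resized = img.resize((img.width // 2, img.height // 2))  # 50% scale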
Cropping extracts a region of interest from an image, a useful technique for focusing
on key parts or removing unwanted areas.
Cropping an OpenCV image uses NumPy array slicing. If img is a NumPy array, we
can extract a 100x100 pixel square starting at x=50, y=50 with img[50:150, 50:150];
rows (y) come first, then columns (x). With Pillow, img.crop((50, 50, 150, 150))
takes a (left, upper, right, lower) box.
This selects the same region as the NumPy slicing. Both approaches provide simple
ways to implement cropping.
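As a sketch of both cropping approaches side by side, the example below builds a synthetic 200x200 grayscale image (an assumption made so no file is needed) and crops the same 100x100 square with NumPy slicing and with Pillow.

```python
import numpy as np
from PIL import Image

# Synthetic 200x200 grayscale image with varying pixel values
arr = (np.arange(200 * 200).reshape(200, 200) % 256).astype(np.uint8)
pil_img = Image.fromarray(arr)

x, y, w, h = 50, 50, 100, 100

# NumPy / OpenCV style: rows (y) first, then columns (x)
crop_np = arr[y:y + h, x:x + w]

# Pillow style: box is (left, upper, right, lower)
crop_pil = pil_img.crop((x, y, x + w, y + h))

print(crop_np.shape, crop_pil.size)
print(np.array_equal(crop_np, np.asarray(crop_pil)))
```

The NumPy slice is a view on the original array, so call .copy() on it if the crop will be modified independently of the source image.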
rotated90 = img.rotate(90)  # counterclockwise; pass expand=True to avoid cropping non-square images
flipped = img.transpose(Image.FLIP_LEFT_RIGHT)  # mirror horizontally
Overall, Python imaging libraries like OpenCV and PIL provide powerful yet easy to
use tools for essential image processing techniques, from resizing and cropping to
rotations, making them very useful for tasks like data augmentation and image
correction.
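For images held as NumPy arrays, which is how OpenCV returns them, a 90-degree rotation and a horizontal flip can also be sketched with plain NumPy. The tiny array below is a made-up stand-in for an image, chosen so the effect of each operation is easy to read.

```python
import numpy as np

# Stand-in for a 2-row, 3-column grayscale image
img = np.array([[1, 2, 3],
                [4, 5, 6]])

rotated = np.rot90(img)    # 90 degrees counterclockwise
flipped = np.fliplr(img)   # mirror left to right

print(rotated)
print(flipped)
```

Both functions work unchanged on color arrays of shape (height, width, channels), because they only rearrange the first two axes.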
Image processing techniques like filtering and enhancement allow you to manipulate
images in Python to achieve various effects. This section demonstrates some
more advanced methods using the OpenCV library.
Applying blur effects can be useful for reducing image noise. OpenCV provides
several blurring techniques:
Averaging (box) filter - Replaces each pixel with the plain average of its
neighborhood. Easy to apply but can produce unnatural looking results.
Gaussian blur - Uses a Gaussian kernel to produce more natural blurs. Adjustable
kernel size allows control over blur intensity. Useful for smoothing noise, although
it also softens edges.
import cv2
image = cv2.imread('image.jpg')
# Kernel size must be odd; a sigma of 0 lets OpenCV derive it from the kernel size
blurred = cv2.GaussianBlur(image, (15, 15), 0)
cv2.imwrite('blurred.jpg', blurred)
Sharpening brings images into better focus. Convolutional filters accentuate edges
and fine details.
import cv2
import numpy as np
image = cv2.imread('image.jpg')
kernel = np.array([[0, -1, 0],
                   [-1, 5, -1],
                   [0, -1, 0]])
sharpened = cv2.filter2D(image, -1, kernel)
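To see why this kernel sharpens, it helps to work out by hand what the filter computes at a single pixel. The sketch below uses plain NumPy and three made-up 3x3 patches; filter_center is a hypothetical helper for this illustration, not an OpenCV function.

```python
import numpy as np

kernel = np.array([[0, -1, 0],
                   [-1, 5, -1],
                   [0, -1, 0]])

def filter_center(patch, kernel):
    """Weighted sum that a 3x3 filter computes for the patch's center pixel."""
    return int(np.sum(patch * kernel))

# Uniform region: every pixel is 100
flat = np.full((3, 3), 100)
# Vertical edge, center pixel on the dark side
dark_side = np.array([[100, 100, 200]] * 3)
# Vertical edge, center pixel on the bright side
bright_side = np.array([[100, 200, 200]] * 3)

flat_out = filter_center(flat, kernel)
dark_out = filter_center(dark_side, kernel)
bright_out = filter_center(bright_side, kernel)

print(flat_out, dark_out, bright_out)   # 100 0 300
```

Because the kernel weights sum to 1, flat regions keep their value, while pixels next to an edge are pushed further apart (100 drops to 0 and 200 rises to 300). On 8-bit images cv2.filter2D saturates results to the 0-255 range, so the 300 would be stored as 255.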
import cv2
image = cv2.imread('image.jpg')
edges = cv2.Canny(image, 100, 200)  # lower and upper hysteresis thresholds
cv2.imwrite('canny_edges.jpg', edges)
This produces a clear edge map isolating prominent contours in the image.
Advanced filters like these enable effective image analysis and manipulation with
OpenCV in Python.
Python offers simple and powerful tools to perform image segmentation thanks to
libraries like OpenCV, scikit-image, and others. In this section, we'll explore some of
the popular image segmentation techniques and how to implement them in Python.
import cv2
import numpy as np
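img = cv2.imread('image.jpg', 0)  # flag 0 loads the image as grayscale
# Pixels above 127 become 255 (white); all others become 0 (black)
ret, thresh = cv2.threshold(img, 127, 255, cv2.THRESH_BINARY)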
We can also use adaptive thresholding which calculates the threshold for smaller
regions, giving better results for images with varying illumination:
The watershed algorithm treats an image like a topographic map, with pixel
intensities representing heights. It then finds "catchment basins" and "watershed
ridge lines" to segment the image.
import numpy as np
import cv2
from matplotlib import pyplot as plt
img = cv2.imread('coins.jpg')
gray = cv2.cvtColor(img,cv2.COLOR_BGR2GRAY)
ret, thresh = cv2.threshold(gray,0,255,cv2.THRESH_BINARY_INV+cv2.THRESH_OTSU)
# Noise removal
kernel = np.ones((3,3),np.uint8)
opening = cv2.morphologyEx(thresh,cv2.MORPH_OPEN,kernel, iterations = 2)
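# Sure background area: dilate the cleaned-up mask
sure_bg = cv2.dilate(opening,kernel,iterations=3)
# Sure foreground area: pixels far from any background
dist_transform = cv2.distanceTransform(opening,cv2.DIST_L2,5)
ret, sure_fg = cv2.threshold(dist_transform,0.7*dist_transform.max(),255,0)
# Unknown region: neither sure background nor sure foreground
sure_fg = np.uint8(sure_fg)
unknown = cv2.subtract(sure_bg,sure_fg)
# Label each sure-foreground blob, shifting labels so the background is 1, not 0
ret, markers = cv2.connectedComponents(sure_fg)
markers = markers+1
# Mark the unknown region with 0 so that watershed decides it
markers[unknown==255] = 0
# Apply watershed; boundary pixels are labeled -1 and drawn in green (BGR)
markers = cv2.watershed(img,markers)
img[markers == -1] = [0,255,0]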
The code applies several preprocessing steps (Otsu thresholding, morphological
opening, and a distance transform) to build markers before running watershed. In the
result, the boundaries between the coins are marked in green.
import numpy as np
import cv2
from matplotlib import pyplot as plt
img = cv2.imread('messi.jpg')
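# Mask that grabCut fills with background/foreground labels
mask = np.zeros(img.shape[:2],np.uint8)
# Internal model buffers required by the algorithm
bgdModel = np.zeros((1,65),np.float64)
fgdModel = np.zeros((1,65),np.float64)
# Rectangle (x, y, width, height) that encloses the foreground object
rect = (50,50,450,290)
cv2.grabCut(img,mask,rect,bgdModel,fgdModel,5,cv2.GC_INIT_WITH_RECT)
# Labels 0 and 2 are (probable) background; 1 and 3 are (probable) foreground
mask = np.where((mask==2)|(mask==0),0,1).astype('uint8')
img = img*mask[:,:,np.newaxis]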
Image processing is an exciting field with many real-world applications. Here are
some ideas for Python image processing projects you can develop to put your skills
to use:
Build a custom image classifier to detect specific objects. Gather images of those
objects, label them, and train a convolutional neural network model in Python to
recognize them, using OpenCV to load and preprocess the images. This could be used
for quality control in manufacturing, identifying wildlife with camera traps, or even
detecting ripe produce.
Create a classifier that can identify plant diseases from leaf images. Collect images
of healthy and infected plant leaves, label them by disease type, and train a model to
categorize new leaf images by disease. This could help farmers identify crop
infections early to prevent spread.
Develop an app that identifies dog breeds from user-submitted photos. Use transfer
learning with a pre-trained model like ResNet50: replace and retrain the final layer so
that its output classes are the dog breeds. Collect labeled images of dogs to train
the classifier.
Locating and drawing bounding box regions around objects in images is another
useful application of computer vision. Project ideas include:
Face detection app that draws boxes around faces in images. Use Haar cascades
with OpenCV to identify facial features. Could be used to automatically tag people in
photos.
Traffic camera analyzer that highlights all vehicles in a traffic video feed. Use
background subtraction and contour detection to identify cars and trucks and draw
boxes around them. Useful for automated traffic monitoring.
Product scanner that locates retail products on store shelves. Train an object
detection model on product images and apply it to shelf images to identify and
highlight items. Assist with inventory audits and checking stock levels.
The key techniques are training object detection models like SSD and YOLOv3 or
using Haar cascades for things like faces. Outputs are bounding box regions
identifying object locations.
Building a reverse image search engine lets people discover similar images. Project
ideas include:
Fashion image search site for finding clothing and accessory ideas. Allow image
uploads and return visually similar catalog images, linking to shopping options.
Interior design search tool for matching furniture and decor styles. Index interior
images and return the most similar images from the database to user uploads.
Plagiarism checker that compares essay submissions against web sources to detect
copied work. Use image hashing to compare incoming images/docs to indexed
original content.
Use perceptual image hashing to give each image a fingerprint. Index the hashes for
storage and fast lookup. Calculate the hash of each search image and find the closest
matches in the index using a distance metric like Hamming distance.
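One simple perceptual hash is the average hash, sketched below with Pillow and NumPy. This is only one of several hashing schemes and is chosen here for brevity; the function names average_hash and hamming_distance and the synthetic gradient images are assumptions made for this example.

```python
import numpy as np
from PIL import Image

def average_hash(img, hash_size=8):
    """Shrink to hash_size x hash_size grayscale, then mark pixels above the mean."""
    small = img.convert("L").resize((hash_size, hash_size))
    pixels = np.asarray(small, dtype=np.float64)
    return (pixels > pixels.mean()).flatten()

def hamming_distance(hash_a, hash_b):
    """Number of bit positions where the two hashes differ."""
    return int(np.count_nonzero(hash_a != hash_b))

# Synthetic 64x64 test images: a gradient, a re-lit copy, and an inverted copy
gradient = np.tile(np.linspace(0, 255, 64), (64, 1)).astype(np.uint8)
original = Image.fromarray(gradient)
relit = Image.fromarray((gradient // 2 + 60).astype(np.uint8))
inverted = Image.fromarray(255 - gradient)

h_original = average_hash(original)
d_relit = hamming_distance(h_original, average_hash(relit))
d_inverted = hamming_distance(h_original, average_hash(inverted))

print(d_relit, d_inverted)
```

A small distance means the images look alike, so a search index can return the stored hashes closest to the query hash. The brightness-adjusted copy stays close to the original, while the inverted image lands far away.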
The key skills are building an image database, generating hashes, indexing for
search, and writing matching logic. These allow building versatile search apps.
Applying filters like blurring and edge detection to alter image appearance
Detecting and localizing objects in images with OpenCV and deep learning
With these fundamentals, you can now confidently take on more advanced Python
computer vision and image analysis projects. Check out the OpenCV and Pillow
documentation to continue expanding your skills. Additionally, active communities like
PyImageSearch provide code examples and applied tutorials on cutting-edge
techniques.
By mastering Python image processing, you open up possibilities in diverse fields like
medical imaging, satellite imagery analysis, machine inspection systems, facial
recognition, and more. This versatile skill set will serve you well in both research and
industry applications.