CV (Unit1&2ans)
1) What is computer vision? Write about goals and examples of computer vision.
Computer vision is a field of artificial intelligence (AI) that enables computers and systems to derive
meaningful information from digital images, videos and other visual inputs — and take actions or
make recommendations based on that information.
Computer vision works much the same as human vision. Human sight has the advantage of lifetimes
of context to train how to tell objects apart, how far away they are, whether they are moving and
whether there is something wrong in an image. Computer vision trains machines to perform these
functions, but it has to do it in much less time with cameras, data and algorithms rather than retinas,
optic nerves and a visual cortex.
Examples:
Self-Driving Cars: Computer vision is a critical component of self-driving cars that allows them to
perceive the world around them and react in real-time to changing traffic conditions.
Healthcare: From medical imaging to patient monitoring, computer vision plays a crucial role in the
healthcare industry by enabling medical professionals to diagnose diseases accurately.
Security: Computer vision is used heavily in security systems like facial recognition to identify and
track potential threats.
Retail: Retailers use computer vision to analyze customer behavior and preferences, allowing them to
offer personalized recommendations and improve overall customer satisfaction.
Agriculture: Computer vision can be used for crop monitoring and analysis, including identifying areas
that need irrigation or detecting pests and diseases.
………………………………………………………………………………………………
2) Compare and contrast image processing and computer vision.
Computer Vision:
In Computer Vision, computers or machines are made to gain high-level understanding from the input digital
images or videos with the purpose of automating tasks that the human visual system can do. It uses many
techniques and Image Processing is just one of them.
Image Processing:
Image Processing is the field of enhancing images by tuning various parameters and features of the images, so Image Processing is a subset of Computer Vision. Here, transformations are applied to an input image and the resultant output image is returned. Some of these transformations are sharpening, smoothing, and stretching.
………………………………………………………………………………………………………………….
3) What is an image? Explain different types of images with representation.
An image is defined as a two-dimensional function, F(x, y), where x and y are spatial coordinates, and the amplitude of F at any pair of coordinates (x, y) is called the intensity of the image at that point. When x, y, and the amplitude values of F are all finite, we call it a digital image. In other words, an image can be defined as a two-dimensional array arranged in rows and columns.
A digital image is composed of a finite number of elements, each of which has a particular value at a particular location. These elements are referred to as picture elements, image elements, or pixels; "pixel" is the term most widely used to denote the elements of a digital image.
Types of an image
BINARY IMAGE– The binary image, as its name suggests, contains only two pixel values, i.e. 0 and 1, where 0 refers to black and 1 refers to white. This image is also known as monochrome.
Gray-scale images– Grayscale images are monochrome images, meaning they have only one channel: they carry no colour information, only intensity, and each pixel holds one of the available grey levels. A typical grayscale image uses 8 bits/pixel, which gives 256 different grey levels. In medical imaging and astronomy, 12- or 16-bit/pixel images are used.
Colour images– Colour images are three-band monochrome images in which each band corresponds to a different colour; the actual information stored in the digital image is the grey-level information in each spectral band. The images are represented with red, green and blue bands (RGB images), and each colour image has 24 bits/pixel, i.e. 8 bits for each of the three colour bands (RGB).
8 bit COLOR FORMAT– This is the best-known image format. It has 256 different shades and is commonly known as a grayscale image. In this format, 0 stands for black, 255 stands for white, and 127 stands for mid-grey.
16 bit COLOR FORMAT– This is a colour image format with 65,536 different colours, also known as the High Color format. In this format the distribution of bits is not the same as in a grayscale image: a 16-bit pixel is divided into three components, red, green and blue, the familiar RGB format.
[Figure: sample 8-bit and 16-bit images]
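The image types above can be sketched directly as NumPy arrays (a minimal illustration; the pixel values are arbitrary):

```python
import numpy as np

# Binary image: only two pixel values, 0 (black) and 1 (white)
binary = np.array([[0, 1],
                   [1, 0]], np.uint8)

# Grayscale image: one channel, 8 bits/pixel -> 256 grey levels (0..255)
gray = np.array([[0, 127],
                 [200, 255]], np.uint8)

# Colour image: three 8-bit bands (24 bits/pixel), one value per band
color = np.zeros((2, 2, 3), np.uint8)
color[0, 0] = (255, 0, 0)  # one pixel set to full red (R, G, B order here)

print(binary.max(), int(gray.max()), color.shape)
```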
……………………………………………………………………………………………..
4) How different types of images are represented as matrix? Explain.
As we know, images are arranged in rows and columns, which gives the following representation:

    f(x, y) = [ f(0, 0)      f(0, 1)      ...  f(0, N-1)
                f(1, 0)      f(1, 1)      ...  f(1, N-1)
                ...
                f(M-1, 0)    f(M-1, 1)    ...  f(M-1, N-1) ]

The right side of this equation is, by definition, a digital image. Every element of this matrix is called an image element, picture element, or pixel.
A grayscale image is a two-dimensional array of numbers. An 8-bit image has entries between 0 and 255: the value 255 represents white and the value 0 represents black. Lower numbers translate to darker pixels, while higher numbers translate to lighter pixels. For an image that has m x n pixels (i.e., "picture elements"), we represent that image using a matrix of size m x n. The entries of the matrix indicate the pixel value of the corresponding part of the image. For example, an image of 4 x 5 pixels is represented by a 4 x 5 matrix of intensities.
In this representation, each pixel in an image is assigned a value, also known as intensity, that determines
its brightness or color.
There are several types of images, including grayscale images, black & white images, and color images.
Each type of image is represented differently as a matrix, depending on the number of channels and the
type of encoding technique used.
Grayscale images: A grayscale image is a single-channel image where the value of each
pixel is represented by a single scalar value (0-255) indicating the level of gray. In a grayscale image, the
matrix is usually represented as a two-dimensional array, where each element represents the intensity at a
particular pixel location.
For example, if we have a grayscale image of size 5x5, the image can be represented as
follows:
[155, 200, 50, 10, 100]
[100, 255, 75, 0, 60]
[40, 175, 250, 55, 110]
[90, 120, 40, 215, 180]
[220, 30, 170, 80, 190]
Black & white images: A black and white image is a binary image, where each pixel can have
only two values - black (0) or white (1). Here, the matrix is again represented as a two-dimensional array. If
a pixel is represented by 0, it will be black, and vice versa.
For example, if we have a binary image of size 4x4, the image can be represented as follows:
[0, 1, 1, 0]
[1, 0, 0, 1]
[0, 1, 1, 1]
[1, 0, 1, 0]
Color images: A color image contains three channels - red, green, and blue - each
represented by a two-dimensional matrix. In this case, the matrix is usually represented as a three-
dimensional array, with one dimension for each channel. The values of these matrices determine the color
and brightness of each pixel in the image. Another common representation is Hue, Saturation, and Value
(HSV), which is more intuitive for color manipulation or recognition tasks.
For example, if we have a color image of size 3x3, the image can be represented as follows:
[(255, 0, 0), (0, 255, 0), (0, 0, 255)]
[(255, 255, 0), (255, 0, 255), (0, 255, 255)]
[(255, 255, 255), (128, 128, 128), (0, 0, 0)]
Multi-spectral images: Multispectral images capture additional spectral data beyond visible light, such as infrared, ultraviolet or radar frequencies, and are typically seen in satellite imagery. They have more than three channels of data and require higher-dimensional (hyperspectral) matrices for their representation.
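The example matrices above map directly onto NumPy arrays, which is the representation OpenCV itself uses (a minimal sketch):

```python
import numpy as np

# The 5x5 grayscale example as a 2-D matrix of intensities
gray = np.array([[155, 200,  50,  10, 100],
                 [100, 255,  75,   0,  60],
                 [ 40, 175, 250,  55, 110],
                 [ 90, 120,  40, 215, 180],
                 [220,  30, 170,  80, 190]], np.uint8)
print(gray.shape)   # (5, 5): m rows x n columns

# The 3x3 colour example as a 3-D array: rows x columns x channels
color = np.array([[(255, 0, 0), (0, 255, 0), (0, 0, 255)],
                  [(255, 255, 0), (255, 0, 255), (0, 255, 255)],
                  [(255, 255, 255), (128, 128, 128), (0, 0, 0)]], np.uint8)
print(color.shape)  # (3, 3, 3)
```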
…………………………………………………………………………………..
5) What is a pixel? How to manipulate pixels? Explain with example.
The pixel -- a word invented from picture element -- is the basic unit of programmable color on a
computer display or in a computer image. Think of it as a logical -- rather than a physical -- unit.
Pixels are the smallest units in a digital display. Up to millions of pixels make up an image or video on a device's screen. Each pixel comprises subpixels that emit red, green and blue (RGB) light at different intensities. The RGB colour components make up the gamut of different colours that appear on a display or computer monitor.
To manipulate pixels in an image, we need to access and change its RGB (Red, Green, Blue) values that
determine its color. There are several Python libraries available like OpenCV, Numpy for manipulating pixels
in images. These libraries allow us to read, modify, and save images using different functions.
import cv2
from google.colab.patches import cv2_imshow

image = cv2.imread('teeej.jpg')
gray_image = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
_, BnW_image = cv2.threshold(gray_image, 127, 255, cv2.THRESH_BINARY)
cv2_imshow(image)
cv2_imshow(gray_image)
cv2_imshow(BnW_image)
A simple binary thresholding technique in OpenCV can be used to convert an image to black and white. To apply thresholding, we first convert the colour image to grayscale using the cv2.cvtColor(image, color_space_conversion) function. The next step is the conversion of this grayscale image to a black and white image. The syntax of the binary thresholding function is:
cv2.threshold(image, threshold, max_value, cv2.THRESH_BINARY)
………………………………………………………………………………………………………………………
6) What are the challenges in computer vision? Explain.
Deformation: The subject of computer vision analysis is not only solid objects but also bodies that can deform and change their shapes, which adds complexity to object detection. For example, a football player changes pose over time, so images of the same player differ across poses. If an object detector is trained to find a person only in a standing or running position, it may fail to detect a player who is lying on the field or bending down to make a manoeuvre.
Occlusion: Sometimes objects can be covered by other things, which makes it difficult to read the signs and
identify these objects. For example, in the first below image, a cup is covered by the hand of the person
holding this cup.
Illumination Conditions: Lighting has a very large influence on the appearance of objects. The same object looks different under different lighting conditions: the less illuminated the scene, the less visible the objects are. All of these factors affect the detector's ability to identify objects.
Cluttered or textured background: Objects that need to be identified may blend into the background, making
it difficult to identify them. For example, the below picture shows a lot of items, the location of which is
confusing when identifying scissors or other items of interest. In such cases, the object detector will
encounter detection problems.
………………………………………………………………………………………
7) Explain different application areas of computer vision
Object Identification and Recognition: Identifying and recognizing real-world objects from
images, videos, or live feeds.
Autonomous Vehicles: Computer vision plays a vital role in autonomous vehicles by enabling
them to perceive the environment accurately and make appropriate decisions.
Healthcare: It is employed in medical imaging for accurate diagnoses of various ailments such
as cancer, tumors, dental, and brain injuries.
Augmented Reality: AR heavily depends on computer vision technologies like tracking facial expressions, gestures, and movements, and understanding environmental information.
Quality Control: Ensuring consistency in products by detecting defects, impurities and
ensuring proper packaging requirements which helps in reducing costs and saving time.
Security and Surveillance: Visual biometric techniques are used to provide authenticated
access to secured areas.
Robotics: Used for robotic navigation, mapping, and perception, making robots more reliable
and efficient.
Agriculture: Computer vision has been useful in analyzing plant growth, soil morphology, checking fertilizer requirements, and monitoring crop health.
Retail: Retailers use computer vision-powered tech-tools for predicting customer behavior
which helps them in increasing sales and improving store efficiency.
Gaming: Used in gaming with motion tracking, gesture recognition and facial expressions
features increasing user engagement and experience.
Sports Analytics: A computerized system that tracks player movements and provides analysis
of their performance based on that data.
Cultural Heritage: Preserving, restoring, and analysing ancient cultures, artifacts and documents through digitisation.
Space exploration: used for capturing images, videos, and extracting relevant data of celestial
objects for further research.
Forensic analysis: The ability to analyse visual evidence gathered from crime scenes has
improved significantly over recent years through developing better algorithms and computer simulations.
Environmental Research: Computer vision helps identify fragile ecosystems and assess changes in them through advanced image processing techniques.
Optical character recognition
Optical Character Recognition (OCR) is the process of detecting and reading text in images through
computer vision. Detection of text from document images enables Natural Language Processing algorithms
to decipher the text and make sense of what the document conveys.
Furthermore, the text can be easily translated into multiple languages, making it easily interpretable to
anyone. OCR, however, is not limited to the detection of text from document images only. Novel OCR
algorithms make use of Computer Vision and NLP to recognize text from supermarket product names, traffic
signs, and even from billboards, making them an effective translator and interpreter.
Analyze satellite images
Satellite image processing and analysis is a significant computational method that finds application in the military, agriculture, natural disaster prevention, natural resource identification, and so forth.
Deep fake detection
With fake news taking over the media space, it becomes harder for the average person to determine what's real and what's not. Deep fakes are becoming so good that even experts might fail to identify them. A computer vision system can identify the elements of photos and videos that have been manipulated in any way, and may detect fake product reviews, fake customer complaints, fake news, etc.
Law enforcement and defense
Technology can be extremely helpful in ensuring public security. Of course, there’s a controversy
surrounding public surveillance, but the fact remains that this technology can help detect suspicious
individuals, dangerous criminals, and terrorists in public places. The technology can also be used in
defense, helping the military identify weapons of mass destruction and other hazardous objects over vast
areas.
………………………………………………………………………………………….
8) What is a digital image? How images are formed? Explain.
An image is a two-dimensional array in which colour information is arranged along the x and y spatial axes. So, in order to understand how an image is formed, we should first understand how a signal is formed.
Signal
A signal is a mathematical and statistical representation that relates us to the physical world. It can be measured through its dimensions over time and space. Signals are used to convey information from one source to another.
Relationship
A signal conveys information about the physical world around us; it can be a voice, an image, etc. Whatever we speak is first converted into a signal or wave and then transferred to others over time. Likewise, while capturing an image, a digital camera transfers a signal from one system to another.
How is a digital image formed?
A digital image is formed from small units of data, i.e. pixels, which are stored in computers. When we capture an image with a digital camera in the presence of light, the camera's sensor converts the incoming light into digital signals.
[Same as Qn 3]
and
[Same as qn 4]
………………………………………………………………………………………………………………………………………
9) What is OpenCV? Write python code to read, save and display images in OpenCV.
OpenCV is a Python open-source library, which is used for computer vision in Artificial intelligence,
Machine Learning, face recognition, etc.
OpenCV, the CV is an abbreviation form of a computer vision, which is defined as a field of study
that helps computers to understand the content of the digital images such as photographs and
videos.
OpenCV stands for Open Source Computer Vision Library, which is widely used for image
recognition or identification. It was officially launched in 1999 by Intel. It was written in C/C++ in the
early stage, but now it is commonly used in Python for the computer vision as well.
The first alpha version of OpenCV was released for the common use at the IEEE Conference on
Computer Vision and Pattern Recognition in 2000, and between 2001 and 2005, five betas were
released. The first 1.0 version was released in 2006.
The second version of OpenCV was released in October 2009 with significant changes, including a major overhaul of the C++ interface aimed at easier, more type-safe patterns and better implementations. Development is now done by an independent Russian team, which releases a new version roughly every six months.
Loading image
The very first thing is to import the necessary packages and in our case its cv2 . OpenCV (Open
Source Computer Vision Library) is an open source computer vision and machine learning software
library.
Syntax: img = cv2.imread("path")
Example: img = cv2.imread("C:\\pictures\\apple.jpg")
cv2.imread will return the NumPy array of the image. To load the particular image, we need to pass
the image path as a parameter to that function. Here, we are passing the image path as static but
one can also pass it as a command line argument.
Display image
# displaying image
cv2.imshow('Image file', img)
cv2.waitKey(0)
As a second step, we will display the image using the cv2.imshow function. cv2.imshow accepts two parameters: the first is a string, the title of the display window; the second is the actual image/NumPy array (img in our case) that we want to display. Then we call cv2.waitKey, which pauses execution of the script until the user presses a key. We pass 0 as a parameter, which means execution resumes as soon as any key is pressed.
Finally, we want to save the image. To save the image we are going to use cv2.imwrite function,
which accepts two parameters. The first parameter is the path of the image, where you want to
save the image. In my case, I want to save the image as newfile.jpg. And the second parameter is
the actual image or the numpy array that we want to save.
……………………………………………………………
10) How pixels are accessed and manipulated in OpenCV? Explain with an example program.
The definition of an image is very simple: it is a two-dimensional view of a 3D world. Furthermore, a digital
image is a numeric representation of a 2D image as a finite set of digital values. We call these values pixels
and they collectively represent an image. Basically, a pixel is the smallest unit of a digital image (if we zoom
in a picture, we can detect them as miniature rectangles close to each other) that can be displayed on a
computer screen.
In OpenCV, an image is read with the cv2.imread() function and displayed with the cv2.imshow() function. Since the loaded image is a NumPy array, you can access and manipulate individual pixels by indexing the array with a pixel's coordinates.
We mainly use grayscale images as the default choice. Having only one channel makes image processing more convenient: we often convert an image to grayscale because dealing with a single channel is easier and faster. In OpenCV we can also perform image and video analysis in full colour, which we will demonstrate as well.
# Necessary imports
import cv2
import numpy as np
import matplotlib.pyplot as plt
from google.colab.patches import cv2_imshow
# Loading our image with a cv2.imread() function
img=cv2.imread("Cybertruck.jpg",cv2.IMREAD_COLOR)
# Loading our image with a cv2.imread() function
gray=cv2.imread("Cybertruck.jpg",cv2.IMREAD_GRAYSCALE)
# For Google Colab we use the cv2_imshow() function
# but we can use cv2.imshow() if we are programming on our computer
cv2_imshow(img)
cv2_imshow(gray)
First, we need to read the image we want to work with using the cv2.imread() function. If the image is
not in the working directory, make sure to know the exact file path. If we are working in Google Colab we
need to upload our image from our computer. With this in mind, in the following examples we are going to
read the image of the Tesla truck.
If we want to load a color image, we just need to add a second parameter. The value that’s needed
for loading a color image is cv2.IMREAD_COLOR. There’s also another option for loading a color image: we
can just put the number 1 instead cv2.IMREAD_COLOR and we will obtain the same output.
The value that’s needed for loading a grayscale image is cv2.IMREAD_GRAYSCALE, or we can just put
the number 0 instead as an argument.
To display an image, we will use the cv2.imshow() function.
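The indexing just described can be sketched with a small array (pure NumPy here; exactly the same syntax applies to an image loaded with cv2.imread()):

```python
import numpy as np

# A 4x4 BGR image; OpenCV images are indexed as img[y, x]
img = np.zeros((4, 4, 3), np.uint8)

# Access one pixel's (B, G, R) values
b, g, r = img[0, 0]

# Manipulate a single pixel
img[0, 0] = (255, 0, 0)      # pure blue in BGR order

# Manipulate a whole region at once with slicing
img[2:4, 2:4] = (0, 0, 255)  # bottom-right block becomes red

print(img[0, 0], img[3, 3])
```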
Unit2
1) How do you create black and white images in OpenCV? Explain with examples.
To create a black image, we can use the np.zeros() method. It creates a NumPy n-dimensional array of the given size with all elements 0. Since all elements are zero, displaying it with the cv2.imshow() or plt.imshow() functions shows a black image.
To create a white image, we can use the np.ones() method. It creates a NumPy n-dimensional array of the given size with all elements 1. We multiply this array by 255, so every element becomes 255, and displaying it with the cv2.imshow() or plt.imshow() functions gives a white image.
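A minimal sketch of the white-image recipe just described:

```python
import numpy as np

# All-ones array scaled by 255 -> every pixel becomes white
white_image = np.ones((200, 200, 3), np.uint8) * 255
print(white_image[0, 0])  # [255 255 255]
```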
import cv2
from google.colab.patches import cv2_imshow
import numpy as np
blank_image = np.zeros((200,200,3), np.uint8)
cv2_imshow(blank_image)
blank_image[:,0:100] = (255,0,0) # (B, G, R)
blank_image[:,100:200] = (0,255,0)
cv2_imshow(blank_image)
Output: [figure: the black image, then the image with its left half blue and right half green]
Further, we can fill the image with different colours of blue, green, and red by accessing the pixel values through the index values of the NumPy array, as shown above.
(Or)
To create a black image in OpenCV using Python, we can use the cv2.imread() method by passing 0 as a second argument, or use the numpy np.zeros() method. Here is an example:
import cv2
import numpy as np
from google.colab.patches import cv2_imshow
black_image = np.zeros((200, 200, 3), np.uint8)
cv2_imshow(black_image)
2) Differentiate between cropping and resizing of an image.
Cropping and resizing are two common operations performed on images in image processing or computer vision
applications. The main differences between them are:
Cropping an image refers to selecting a region of interest (ROI) from an input image by keeping only a part of it, which can be rectangular or any other shape. In contrast, resizing an image means changing the size of the whole image, uniformly or non-uniformly, based on specific requirements.
The output image after cropping retains the original aspect ratio, while resizing may not. Also, when we crop an
image, we keep the same number of pixels for the selected ROI. However, in resizing, the total number of pixels
changes based on the new size, which can lead to loss or gain of information in the image.
Cropping an image typically helps in eliminating unwanted regions from an image or zooming into a localized
area of interest, whereas resizing helps to scale down large images for efficient processing or to adapt an image to a
specific display size.
from google.colab.patches import cv2_imshow
import cv2
img = cv2.imread('input.jpg')
dimension = img.shape
# Define ROI coordinates (x, y) and size (w, h) for cropping
x, y, w, h = 100, 50, 200, 150
# Perform cropping
cropped_img = img[y:y+h, x:x+w]
# Perform resizing
resized_img = cv2.resize(img, (300, 400))
cv2_imshow(cropped_img)
cv2_imshow(resized_img)
cv2.waitKey(0)
cv2.destroyAllWindows()
In the above example, we load an input image using the cv2.imread() method, define the ROI coordinates to crop, and obtain the cropped image by indexing the pixel values. Next, we resize the original image using the cv2.resize() method and display both the cropped and resized images. Finally, we use the cv2.waitKey() and cv2.destroyAllWindows() methods to close the windows holding the images.
…………………………………………………………………………………………………………
3) Write OpenCV program example to illustrate cropping procedure in an image.
Cropping an image refers to selecting a rectangular region or a portion of an image. This selected part can be then
modified, analyzed or enlarged as per the requirements. It is a common operation performed in computer vision and
image processing.
In OpenCV, images can be cropped using the slicing operator ‘:’. For example, consider a grayscale image 'img' of
dimensions (300, 400). To crop this image from pixels (50, 50) to (250, 350), we can use the code:
crop_img = img[50:250, 50:350]
cv2.imshow('Cropped', crop_img)
cv2.waitKey(0)
cv2.destroyAllWindows()
Here, img[50:250, 50:350] selects rows 50 to 249 and columns 50 to 349 of the original image and creates a new
cropped image 'crop_img'. It is then displayed using cv2.imshow() function.
It is important to note that OpenCV images are NumPy arrays, and slicing returns a view rather than a copy: the cropped image shares memory with the original rather than holding its own copy of the pixel values. Therefore, any modifications made to the cropped image will affect the original image as well (use .copy() to obtain an independent copy).
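This view-versus-copy behaviour comes from NumPy slicing and is easy to demonstrate (pure NumPy here; an image loaded with cv2.imread() behaves identically):

```python
import numpy as np

img = np.zeros((10, 10), np.uint8)

crop = img[2:5, 2:5]   # slicing returns a view, not a copy
crop[:] = 255          # modify the cropped region...
print(img[3, 3])       # ...and the original image changes too: 255

safe = img[2:5, 2:5].copy()  # .copy() gives an independent image
safe[:] = 7
print(img[3, 3])       # still 255; the original is unaffected
```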
Cropping an image is useful when only specific parts of an image are required for further processing. This reduces the
number of pixels to process and hence improves computational time. Further, it can also help in removing unwanted
parts of an image, improving its quality and making it easier to analyze.
Example:
import cv2
from google.colab.patches import cv2_imshow
img = cv2.imread("yooo.jpg")
cropped_image = img[150:300, 150:500]
cv2_imshow(img)
cv2_imshow(cropped_image)
cv2.waitKey(0)
print(cropped_image.shape)
(150, 350, 3)
…………………………………………………………………………………………….
4) Show the process of copying a region to another in an image
Copying a region to another in an image in OpenCV refers to selecting a portion of an image and copying it over
to another location within the same or different image. This can be useful in various applications such as object
detection or removing unwanted portions of an image.
The process of copying a region to another is usually done by first selecting the region of interest using a
rectangle shape. The coordinates of the top-left and bottom-right corners of the rectangle are used to define the region
of interest.
Once the region of interest is selected, we can copy it over to another location in the same or different image.
This is achieved by specifying the destination location on the target image where the selected region will be copied to.
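The original notes show this as a screenshot; the steps can be sketched as follows, with a synthetic array standing in for "Penguins.jpg" (sized so the coordinates below fit):

```python
import numpy as np

# A synthetic 300x900 image stands in for cv2.imread("Penguins.jpg")
img = np.zeros((300, 900, 3), np.uint8)
img[50:200, 150:400] = (0, 255, 0)        # pretend content in the source region

# Select the region of interest by slicing (150 rows x 250 columns)
copy_image = img[50:200, 150:400].copy()  # .copy() avoids aliasing the source

# Paste it into a destination region of exactly the same size
img[40:190, 630:880] = copy_image

print(img[100, 700])  # a pasted pixel, copied from the source region
```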
The above code demonstrated the copying of a region to another in an image using OpenCV. Below are the steps
involved:
1. First, we import necessary packages including OpenCV and google.colab.patches.
2. We load an image called "Penguins.jpg" using the cv2.imread() method.
3. The next step involves selecting a region of interest (ROI) from the loaded image using slicing
technique. Here, "img[50:200,150:400]" selects pixels/region starting from x=150 to x=400 and y=50 to y=200. This step
results in a cropped image called copy_image.
4. To see the original image with the selected ROI, we use the cv2.imshow() method.
5. In the next step, we place the selected/cropped image onto another region of interest in the same
image. Here, "img[40:190,630:880]" pastes the cropped image starting from x=630 to x=880 and y=40 to y=190.
6. Finally, we display the modified/processed image using cv2.imshow() method again.
These are the steps involved in copying a region to another in an image using OpenCV
…………………………………………………………………………….
5) Create a black rectangle and white circle in OpenCV and apply bitwise OR operation on them.
The images can be subjected to arithmetic operations such as addition, subtraction, and bitwise operations
(AND, OR, NOT, XOR). These operations can help to improve the properties of the input images. Image
arithmetic is necessary for analyzing the properties of the input image. The operated images can then be used
as an enhanced input image, and many more operations can be applied to the image. Image arithmetic is the
application of one or more images to one of the standard arithmetic operations or a logical operator. The
operators are applied pixel by pixel, so the value of a pixel in the output image is determined solely by the
values of the corresponding pixels in the input images. As a result, the images must usually be the same size.
When adding a constant offset to an image, one of the input images may be a constant value.
Bitwise operations are used in image manipulation to extract important parts. The following Bitwise operations are used
in this article:
AND
OR
NOT
XOR
Bitwise operations are also useful for image masking. These operations can be used to enable image creation. These
operations can help to improve the properties of the input images.
NOTE: Bitwise operations should only be performed on input images of the same dimensions.
The OR operator typically takes two binary or greyscale images as input and outputs a third image whose pixel values
are the first image’s pixel values ORed with the corresponding pixels from the second. A variant of this operator takes a
single input image and ORs each pixel with a constant value to generate the output.
A bitwise 'OR' examines every pixel in the two inputs, and if *EITHER* pixel in the two images is greater than 0, then
the output pixel has a value of 255, otherwise it is 0.
Code :
import cv2
import numpy as np
from google.colab.patches import cv2_imshow
img1 = np.zeros((300, 300), np.uint8)
cv2.rectangle(img1, (25, 25), (275, 275), 255, -1)   # white rectangle on black
img2 = np.zeros((300, 300), np.uint8)
cv2.circle(img2, (150, 150), 150, 255, -1)           # white circle on black
bitwise_or = cv2.bitwise_or(img1, img2)
cv2_imshow(bitwise_or)
cv2.waitKey(0)
……………………………………………………………………………………
6) What are the different ways of resizing an image? Explain resizing procedures with suitable examples.
Resizing an image
Scaling, or simply resizing, is the process of increasing or decreasing the size of an image in terms of width
and height. When resizing an image, it’s important to keep in mind the aspect ratio — which is the ratio of an
image’s width to its height. Ignoring the aspect ratio can lead to resized images that look compressed and
distorted.
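The aspect-ratio bookkeeping can be kept in a small helper that computes the new (width, height) pair from a target width; resize_dims is a hypothetical name, not an OpenCV function, and its result is what you would pass to cv2.resize:

```python
def resize_dims(width, height, new_width):
    """Return (new_width, new_height) preserving the aspect ratio.

    Hypothetical helper, not part of OpenCV; the returned tuple is the
    (width, height) argument you would pass to cv2.resize.
    """
    ratio = new_width / width          # scale factor implied by the target width
    return new_width, int(height * ratio)

# e.g. halving a 640x480 image keeps the 4:3 aspect ratio
print(resize_dims(640, 480, 320))
```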
We can downscale the original image into a smaller dimension by using cv2.resize() method and specifying the new
dimension (width, height) in the function.
import cv2
from google.colab.patches import cv2_imshow
image = cv2.imread('tej.jpg')
height, width = image.shape[:2]
print(f"Height:{height}, Width:{width}")   # e.g. Height:360, Width:561
# Setting new dimension for image size - 50% of the original size
resize_image = cv2.resize(image, (width // 2, height // 2))
cv2_imshow(image)
cv2_imshow(resize_image)
We can upscale the original image into a larger dimension by using cv2.resize() method and specifying the new
dimension (width, height) in the function.
import cv2
from google.colab.patches import cv2_imshow
image = cv2.imread('tej.jpg')
height, width = image.shape[:2]
print(f"Height:{height}, Width:{width}")
# Setting new dimension for image size - twice the original size
resize_image = cv2.resize(image, (width * 2, height * 2))
cv2_imshow(image)
cv2_imshow(resize_image)
In other ways of resizing an image, the aspect ratio is not preserved. In such cases, the width or height of the
image is changed independently, based on the requirement.
Below is an example of resizing an image only with an increased width, while keeping the height same as the original
image.
Code Example:
import cv2
from google.colab.patches import cv2_imshow
image = cv2.imread('tej.jpg')
height, width = image.shape[:2]
# Increase only the width; keep the original height
resize_image = cv2.resize(image, (width * 2, height))
cv2_imshow(resize_image)
Finally, to resize the width and height simultaneously, we pass both the new width and the new height as
arguments to the cv2.resize function.
Code Example:
import cv2
from google.colab.patches import cv2_imshow
image = cv2.imread('tej.jpg')
# Set both dimensions explicitly (the aspect ratio is not preserved)
resize_image = cv2.resize(image, (400, 200))
cv2_imshow(resize_image)
7) What is the need of image mask? Explain the image mask procedure with an example in OpenCV.
Masking is used in Image Processing to output the Region of Interest, or simply the part of the image that we
are interested in. We tend to use bitwise operations for masking as it allows us to discard the parts of the
image that we do not need.
For example, let’s say that we were building a computer vision system to recognize faces. The only part of the
image we are interested in finding and describing is the parts of the image that contain faces — we simply
don’t need the rest of the image’s content. Provided that we could find the faces in the image, we may construct
a mask to show only the faces in the image.
Example:
#masking
import cv2
import numpy as np
from google.colab.patches import cv2_imshow
img = cv2.imread('pokemon.jpg')
cv2_imshow(img)
# Blank single-channel image to draw the mask on
blank = np.zeros((img.shape[0], img.shape[1]), dtype='uint8')
# A white filled circle at the centre of the image serves as the mask
circle = cv2.circle(blank, (img.shape[1] // 2, img.shape[0] // 2), 100, 255, -1)
masked = cv2.bitwise_and(img, img, mask=circle)
cv2_imshow(masked)
cv2.waitKey(0)
Output:
(Figure: the input image, the circular mask, and the masked image)
In the above code, the cv2.circle() method is used to draw the filled circle that serves as the mask. The syntax
of the cv2.circle() method is:
Syntax: cv2.circle(image, center_coordinates, radius, color, thickness)
The mask is then applied with the cv2.bitwise_and() method, whose syntax is:
Syntax: cv2.bitwise_and(source1_array, source2_array, destination_array, mask)
Parameters:
where source1_array is the array corresponding to the first input image on which the bitwise and
operation is to be performed,
source2_array is the array corresponding to the second input image on which the bitwise and operation is
to be performed,
destination_array is the resulting array obtained by performing the bitwise operation on the two input
arrays, and
mask is an optional 8-bit single-channel array that restricts the operation to the pixels where the mask is
non-zero.
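A minimal sketch of what the mask argument does, using plain NumPy on a tiny made-up array: wherever the mask is non-zero the source pixel is kept, and everywhere else the output is 0, which is the result cv2.bitwise_and(img, img, mask=mask) would produce here:

```python
import numpy as np

# Tiny single-channel 'image' and mask (made-up values)
img  = np.array([[10, 20], [30, 40]], dtype=np.uint8)
mask = np.array([[255, 0], [0, 255]], dtype=np.uint8)

# Keep pixels where the mask is non-zero; zero out the rest
masked = np.where(mask > 0, img, 0).astype(np.uint8)

print(masked)
```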
………………………………………………………………………………………
Images can be subjected to arithmetic operations such as addition and subtraction, and to bitwise
operations (AND, OR, NOT, XOR). These operations can improve the properties of the input images
and are useful for analyzing them. The operated images can then serve as enhanced inputs to which
further operations are applied. Image arithmetic is the application of one of the standard arithmetic
operations or a logical operator to one or more images. The operators are applied pixel by pixel, so
the value of a pixel in the output image is determined solely by the values of the corresponding pixels
in the input images. As a result, the images must usually be the same size. When adding a constant
offset to an image, one of the inputs may be a constant value.
Bitwise Operations
Bitwise operations are used in image manipulation to extract important parts of an image. The following
bitwise operations are covered here:
AND
OR
NOT
XOR
Bitwise operations are also useful for image masking and for constructing new images, and they can
help to improve the properties of the input images.
NOTE: Bitwise operations should only be performed on input images of the same dimensions.
The AND operator (and the NAND operator in a similar fashion) typically takes two binary or integer
graylevel images as input and produces a third image whose pixel values are just those of the first
image ANDed with the corresponding pixels from the second. This operator can be modified to
produce the output by taking a single input image and ANDing each pixel with a predetermined
constant value.
Code :
import cv2
import numpy as np
img1 = cv2.imread('input1.png')
img2 = cv2.imread('input2.png')
dest_and = cv2.bitwise_and(img2, img1, mask = None)
cv2.imshow('Bitwise And', dest_and)
cv2.waitKey(0)
OR Bitwise Operation of Image
The OR operator typically takes two binary or greyscale images as input and outputs a third image
whose pixel values are the first image’s pixel values ORed with the corresponding pixels from the
second. A variant of this operator takes a single input image and ORs each pixel with a constant
value to generate the output.
import cv2
import numpy as np
img1 = cv2.imread('input1.png')
img2 = cv2.imread('input2.png')
dest_or = cv2.bitwise_or(img1, img2, mask = None)
cv2.imshow('Bitwise OR', dest_or)
cv2.waitKey(0)
Logical NOT, also known as invert, is an operator that takes a binary or grayscale image as input
and generates its photographic negative.
Code :
import cv2
import numpy as np
img1 = cv2.imread('input1.png')
dest_not = cv2.bitwise_not(img1, mask = None)
cv2.imshow('Bitwise Not', dest_not)
cv2.waitKey(0)
XOR Bitwise Operation of Image
The XOR operator takes two binary or greyscale images as input and outputs a third image whose
pixel values are the first image's pixel values XORed with the corresponding pixels from the second.
Code :
import cv2
import numpy as np
img1 = cv2.imread('input1.png')
img2 = cv2.imread('input2.png')
dest_xor = cv2.bitwise_xor(img1, img2, mask = None)
cv2.imshow('Bitwise XOR', dest_xor)
cv2.waitKey(0)
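Because these operators act bit by bit on each uint8 pixel, their effect can be checked with plain NumPy (the sample values below are made up; on equal-sized uint8 arrays OpenCV's bitwise functions give the same pixel-wise results):

```python
import numpy as np

a = np.uint8(0b1100)   # 12
b = np.uint8(0b1010)   # 10

and_ = a & b           # 0b1000 -> 8
or_  = a | b           # 0b1110 -> 14
xor_ = a ^ b           # 0b0110 -> 6
not_ = np.uint8(~a)    # 8-bit complement of 12 -> 243

print(and_, or_, xor_, not_)
```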
……………………………………………………….
The adjustment applies the linear transform new_image(x, y) = alpha * image(x, y) + beta,
where alpha (alpha > 0) controls the contrast and beta controls the brightness.
Steps
To change the contrast and brightness of an image, you could follow the steps given below −
Import the required library OpenCV. Make sure you have already installed it.
Read the input image using cv2.imread() method. Specify the full path of the image.
Define alpha (which controls contrast) and beta (which controls brightness), then
call the cv2.convertScaleAbs() function to change the contrast and brightness of the image. This function
returns the image with adjusted contrast and brightness. Alternatively, we can use
the cv2.addWeighted() method to change contrast and brightness.
Display the contrast and brightness adjusted image.
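The steps above implement the linear transform new_pixel = alpha * pixel + beta, saturated to [0, 255]. Below is a minimal NumPy sketch of that formula (pixel values made up for illustration; for non-negative results this matches what cv2.convertScaleAbs computes):

```python
import numpy as np

img = np.array([[100, 200]], dtype=np.uint8)  # made-up pixel values
alpha, beta = 1.5, 20                          # contrast and brightness

# new_pixel = alpha * pixel + beta, rounded and saturated to [0, 255]
adjusted = np.clip(alpha * img.astype(np.float64) + beta, 0, 255)
adjusted = np.rint(adjusted).astype(np.uint8)

print(adjusted)
```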
Let's see the examples to change the contrast and brightness of an image.
Input Image
We will use the following image as the input file in the examples below.
Example
In this Python program, we change the contrast and brightness of the input image
using cv2.convertScaleAbs() method.
import cv2
image = cv2.imread('food1.jpg')
# alpha > 1 increases contrast; beta shifts brightness
alpha = 1.5
beta = 20
adjusted = cv2.convertScaleAbs(image, alpha=alpha, beta=beta)
cv2.imshow('adjusted', adjusted)
cv2.waitKey()
cv2.destroyAllWindows()
Output
When you execute the above code it will produce the following output window -
Example
In this Python program, we change the contrast and brightness of the input image
using cv2.addWeighted() method.
import cv2
import numpy as np
img = cv2.imread('food1.jpg')
# out = alpha*img + beta, expressed as a weighted sum with a zero image
alpha = 1.5
beta = 20
out = cv2.addWeighted(img, alpha, np.zeros(img.shape, img.dtype), 0, beta)
cv2.imshow('adjusted', out)
cv2.waitKey(0)
cv2.destroyAllWindows()
Output
When you execute the above code, it will produce the following output window.