CV (Unit1&2ans)
1) What is computer vision? Write about goals and examples of computer vision.
Computer vision is a field of artificial intelligence (AI) that enables computers and systems to derive
meaningful information from digital images, videos and other visual inputs — and take actions or
make recommendations based on that information.
Computer vision works much the same as human vision. Human sight has the advantage of lifetimes
of context to train how to tell objects apart, how far away they are, whether they are moving and
whether there is something wrong in an image. Computer vision trains machines to perform these
functions, but it has to do it in much less time with cameras, data and algorithms rather than retinas,
optic nerves and a visual cortex.
Examples:
Self-Driving Cars: Computer vision is a critical component of self-driving cars that allows them to
perceive the world around them and react in real-time to changing traffic conditions.
Healthcare: From medical imaging to patient monitoring, computer vision plays a crucial role in the
healthcare industry by enabling medical professionals to diagnose diseases accurately.
Security: Computer vision is used heavily in security systems like facial recognition to identify and
track potential threats.
Retail: Retailers use computer vision to analyze customer behavior and preferences, allowing them to
offer personalized recommendations and improve overall customer satisfaction.
Agriculture: Computer vision can be used for crop monitoring and analysis, including identifying areas
that need irrigation or detecting pests and diseases.
………………………………………………………………………………………………
2) Compare and contrast image processing and computer vision.
Computer Vision:
In Computer Vision, computers or machines are made to gain high-level understanding from the input digital
images or videos with the purpose of automating tasks that the human visual system can do. It uses many
techniques and Image Processing is just one of them.
Image Processing:
Image Processing is the field of enhancing images by tuning various parameters and features of the images, so Image Processing is a subset of Computer Vision. Here, transformations are applied to an input image and the resultant output image is returned. Some of these transformations are sharpening, smoothing, and stretching.
………………………………………………………………………………………………………………….
3) What is an image? Explain different types of images with representation.
An image is defined as a two-dimensional function, F(x, y), where x and y are spatial coordinates, and the amplitude of F at any pair of coordinates (x, y) is called the intensity of the image at that point. When x, y, and the amplitude values of F are all finite, we call it a digital image. In other words, an image can be defined as a two-dimensional array arranged in rows and columns.
A digital image is composed of a finite number of elements, each of which has a particular value at a particular location. These elements are referred to as picture elements, image elements, or pixels; "pixel" is the term most widely used to denote the elements of a digital image.
Types of an image
BINARY IMAGE– The binary image, as its name suggests, contains only two pixel values, i.e. 0 and 1, where 0 refers to black and 1 refers to white. This image is also known as monochrome.
Gray-scale images– Grayscale images are monochrome images, meaning they have only one channel: they carry no colour information, only intensity, and each pixel holds one of the available grey levels. A typical grayscale image uses 8 bits/pixel, which gives 256 different grey levels. In medical imaging and astronomy, 12- or 16-bit/pixel images are used.
Colour images– Colour images are three-band monochrome images in which each band corresponds to a different colour; the actual information stored in the digital image is the grey-level information in each spectral band. The images are represented with red, green and blue bands (RGB images), and each colour image has 24 bits/pixel, i.e. 8 bits for each of the three colour bands (RGB).
8 bit COLOR FORMAT– This is the best-known image format. It has 256 different shades and is commonly known as a grayscale image. In this format, 0 stands for black, 255 stands for white, and 127 stands for mid-grey.
16 bit COLOR FORMAT– This is a colour image format with 65,536 different colours, also known as the High Color format. In this format the distribution of bits is not the same as in a grayscale image: a 16-bit pixel is divided into three components, red, green and blue, the familiar RGB format.
[Figure: sample 8-bit and 16-bit images]
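The image types above can be sketched directly as NumPy arrays (a minimal illustration; the pixel values are arbitrary):

```python
import numpy as np

# Binary image: only two pixel values, 0 (black) and 1 (white)
binary = np.array([[0, 1],
                   [1, 0]], np.uint8)

# Grayscale image: one channel, 8 bits/pixel -> 256 grey levels (0..255)
gray = np.array([[0, 127],
                 [200, 255]], np.uint8)

# Colour image: three 8-bit bands (24 bits/pixel), one value per band
color = np.zeros((2, 2, 3), np.uint8)
color[0, 0] = (255, 0, 0)  # one pixel set to full red (R, G, B order here)

print(binary.max(), int(gray.max()), color.shape)
```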
……………………………………………………………………………………………..
4) How different types of images are represented as matrix? Explain.
As we know, images are arranged in rows and columns, which gives the following representation:

    f(x, y) = [ f(0, 0)      f(0, 1)      ...  f(0, N-1)
                f(1, 0)      f(1, 1)      ...  f(1, N-1)
                ...
                f(M-1, 0)    f(M-1, 1)    ...  f(M-1, N-1) ]

The right side of this equation is, by definition, a digital image. Every element of this matrix is called an image element, picture element, or pixel.
A grayscale image is a two-dimensional array of numbers. An 8-bit image has entries between 0 and 255: the value 255 represents white and the value 0 represents black. Lower numbers translate to darker pixels, while higher numbers translate to lighter pixels. For an image that has m x n pixels (i.e., "picture elements"), we represent that image using a matrix of size m x n. The entries of the matrix indicate the pixel value of the corresponding part of the image. For example, an image of 4 x 5 pixels is represented by a 4 x 5 matrix of intensities.
In this representation, each pixel in an image is assigned a value, also known as intensity, that determines
its brightness or color.
There are several types of images, including grayscale images, black & white images, and color images.
Each type of image is represented differently as a matrix, depending on the number of channels and the
type of encoding technique used.
Grayscale images: A grayscale image is a single-channel image where the value of each
pixel is represented by a single scalar value (0-255) indicating the level of gray. In a grayscale image, the
matrix is usually represented as a two-dimensional array, where each element represents the intensity at a
particular pixel location.
For example, if we have a grayscale image of size 5x5, the image can be represented as
follows:
[155, 200, 50, 10, 100]
[100, 255, 75, 0, 60]
[40, 175, 250, 55, 110]
[90, 120, 40, 215, 180]
[220, 30, 170, 80, 190]
Black & white images: A black and white image is a binary image, where each pixel can have
only two values - black (0) or white (1). Here, the matrix is again represented as a two-dimensional array. If
a pixel is represented by 0, it will be black, and vice versa.
For example, if we have a binary image of size 4x4, the image can be represented as follows:
[0, 1, 1, 0]
[1, 0, 0, 1]
[0, 1, 1, 1]
[1, 0, 1, 0]
Color images: A color image contains three channels - red, green, and blue - each
represented by a two-dimensional matrix. In this case, the matrix is usually represented as a three-
dimensional array, with one dimension for each channel. The values of these matrices determine the color
and brightness of each pixel in the image. Another common representation is Hue, Saturation, and Value
(HSV), which is more intuitive for color manipulation or recognition tasks.
For example, if we have a color image of size 3x3, the image can be represented as follows:
[(255, 0, 0), (0, 255, 0), (0, 0, 255)]
[(255, 255, 0), (255, 0, 255), (0, 255, 255)]
[(255, 255, 255), (128, 128, 128), (0, 0, 0)]
Multi-spectral images: Multispectral images capture additional spectral data beyond visible light, such as infrared, ultraviolet or radar frequencies, and are typically seen in satellite imagery. They have more than three channels of data and require higher-dimensional (hyperspectral) matrices for their representation.
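The example matrices above map directly onto NumPy arrays, which is the representation OpenCV itself uses (a minimal sketch):

```python
import numpy as np

# The 5x5 grayscale example as a 2-D matrix of intensities
gray = np.array([[155, 200,  50,  10, 100],
                 [100, 255,  75,   0,  60],
                 [ 40, 175, 250,  55, 110],
                 [ 90, 120,  40, 215, 180],
                 [220,  30, 170,  80, 190]], np.uint8)
print(gray.shape)   # (5, 5): m rows x n columns

# The 3x3 colour example as a 3-D array: rows x columns x channels
color = np.array([[(255, 0, 0), (0, 255, 0), (0, 0, 255)],
                  [(255, 255, 0), (255, 0, 255), (0, 255, 255)],
                  [(255, 255, 255), (128, 128, 128), (0, 0, 0)]], np.uint8)
print(color.shape)  # (3, 3, 3)
```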
…………………………………………………………………………………..
5) What is a pixel? How to manipulate pixels? Explain with example.
The pixel -- a word invented from picture element -- is the basic unit of programmable color on a
computer display or in a computer image. Think of it as a logical -- rather than a physical -- unit.
Pixels are the smallest units in a digital display. Up to millions of pixels make up an image or video on a device's screen. Each pixel comprises subpixels that emit red, green and blue (RGB) light at different intensities. The RGB colour components make up the gamut of different colours that appear on a display or computer monitor.
To manipulate pixels in an image, we need to access and change its RGB (Red, Green, Blue) values that
determine its color. There are several Python libraries available like OpenCV, Numpy for manipulating pixels
in images. These libraries allow us to read, modify, and save images using different functions.
import cv2
from google.colab.patches import cv2_imshow

image = cv2.imread('teeej.jpg')
gray_image = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
_, BnW_image = cv2.threshold(gray_image, 127, 255, cv2.THRESH_BINARY)
cv2_imshow(image)
cv2_imshow(gray_image)
cv2_imshow(BnW_image)
A simple binary thresholding technique in OpenCV can be used to convert an image to black and white. To apply thresholding, we first convert the colour image to grayscale using the cv2.cvtColor(image, color_space_conversion) function. The next step is the conversion of this grayscale image to a black and white image. The syntax of the binary thresholding function is:
cv2.threshold(image, threshold, max_value, cv2.THRESH_BINARY)
………………………………………………………………………………………………………………………
6) What are the challenges in computer vision? Explain.
Deformation: The subject of computer vision analysis is not only solid objects but also bodies that can deform and change their shapes, which adds complexity to object detection. For example, a football player changes pose over time, so images of the same player differ across poses. If an object detector is trained to find a person only in a standing or running position, it may fail to detect a player who is lying on the field or bending down to make a manoeuvre.
Occlusion: Sometimes objects can be covered by other things, which makes it difficult to read the signs and
identify these objects. For example, in the first below image, a cup is covered by the hand of the person
holding this cup.
Illumination Conditions: Lighting has a very large influence on the appearance of objects. The same object looks different under different lighting conditions: the less illuminated the scene, the less visible the objects are. All of these factors affect the detector's ability to identify objects.
Cluttered or textured background: Objects that need to be identified may blend into the background, making
it difficult to identify them. For example, the below picture shows a lot of items, the location of which is
confusing when identifying scissors or other items of interest. In such cases, the object detector will
encounter detection problems.
………………………………………………………………………………………
7) Explain different application areas of computer vision
Object Identification and Recognition: Identifying and recognizing real-world objects from
images, videos, or live feeds.
Autonomous Vehicles: Computer vision plays a vital role in autonomous vehicles by enabling
them to perceive the environment accurately and make appropriate decisions.
Healthcare: It is employed in medical imaging for accurate diagnoses of various ailments such
as cancer, tumors, dental, and brain injuries.
Augmented Reality: AR heavily depends on computer vision technologies like tracking facial expressions, gestures, and movements, and understanding environmental information.
Quality Control: Ensuring consistency in products by detecting defects, impurities and
ensuring proper packaging requirements which helps in reducing costs and saving time.
Security and Surveillance: Visual biometric techniques are used to provide authenticated
access to secured areas.
Robotics: Used for robotic navigation, mapping, and perception, making robots more reliable
and efficient.
Agriculture: Computer vision has been useful in analyzing plant growth, soil morphology, checking fertilizer requirements, and monitoring crop health.
Retail: Retailers use computer vision-powered tech-tools for predicting customer behavior
which helps them in increasing sales and improving store efficiency.
Gaming: Used in gaming with motion tracking, gesture recognition and facial expressions
features increasing user engagement and experience.
Sports Analytics: A computerized system that tracks player movements and provides analysis
of their performance based on that data.
Cultural Heritage: Preserving, restoring, and analysing ancient cultures, artifacts and documents through digitisation.
Space exploration: used for capturing images, videos, and extracting relevant data of celestial
objects for further research.
Forensic analysis: The ability to analyse visual evidence gathered from crime scenes has
improved significantly over recent years through developing better algorithms and computer simulations.
Environmental Research: Computer vision helps identify fragile ecosystems and assess changes in them through advanced image processing techniques.
Optical character recognition
Optical Character Recognition (OCR) is the process of detecting and reading text in images through
computer vision. Detection of text from document images enables Natural Language Processing algorithms
to decipher the text and make sense of what the document conveys.
Furthermore, the text can be easily translated into multiple languages, making it easily interpretable to
anyone. OCR, however, is not limited to the detection of text from document images only. Novel OCR
algorithms make use of Computer Vision and NLP to recognize text from supermarket product names, traffic
signs, and even from billboards, making them an effective translator and interpreter.
Analyze satellite images
Satellite image processing and analysis is a significant computational method that finds application in the military, agriculture, natural disaster prevention, natural resource identification, and so forth.
Deep fake detection
With fake news taking over the media space, it becomes harder for the average person to determine what's real and what's not. Deep fakes are becoming so good that even experts might fail to identify them. A computer vision system can identify the elements of photos and videos that have been manipulated in any way, and may detect fake product reviews, fake customer complaints, fake news, etc.
Law enforcement and defense
Technology can be extremely helpful in ensuring public security. Of course, there’s a controversy
surrounding public surveillance, but the fact remains that this technology can help detect suspicious
individuals, dangerous criminals, and terrorists in public places. The technology can also be used in
defense, helping the military identify weapons of mass destruction and other hazardous objects over vast
areas.
………………………………………………………………………………………….
8) What is a digital image? How images are formed? Explain.
An image is a two-dimensional array in which colour information is arranged along the x and y spatial axes. So, in order to understand how an image is formed, we should first understand how a signal is formed.
Signal
A signal is a mathematical and statistical representation that relates us to the physical world. It can be measured through its dimensions over time and space. Signals are used to convey information from one source to another.
Relationship
A signal conveys information about the physical world around us; it can be a voice, an image, etc. Whatever we speak is first converted into a signal or wave and then transferred to others over time. Likewise, while capturing an image, a digital camera transfers a signal from one system to another.
How is a digital image formed?
A digital image is formed from small units of data, i.e. pixels, which are stored in computers. When we capture an image with a digital camera in the presence of light, the camera's sensor converts the incoming light into digital signals.
[Same as Qn 3]
and
[Same as qn 4]
………………………………………………………………………………………………………………………………………
9) What is OpenCV? Write python code to read, save and display images in OpenCV.
OpenCV is a Python open-source library, which is used for computer vision in Artificial intelligence,
Machine Learning, face recognition, etc.
OpenCV, the CV is an abbreviation form of a computer vision, which is defined as a field of study
that helps computers to understand the content of the digital images such as photographs and
videos.
OpenCV stands for Open Source Computer Vision Library, which is widely used for image
recognition or identification. It was officially launched in 1999 by Intel. It was written in C/C++ in the
early stage, but now it is commonly used in Python for the computer vision as well.
The first alpha version of OpenCV was released for the common use at the IEEE Conference on
Computer Vision and Pattern Recognition in 2000, and between 2001 and 2005, five betas were
released. The first 1.0 version was released in 2006.
The second version of OpenCV was released in October 2009 with significant changes, including a major overhaul of the C++ interface aimed at easier, more type-safe patterns and better implementations. Development is now done by an independent Russian team, which releases a new version roughly every six months.
Loading image
The very first thing is to import the necessary packages and in our case its cv2 . OpenCV (Open
Source Computer Vision Library) is an open source computer vision and machine learning software
library.
Syntax: img = cv2.imread("path")
Example: img = cv2.imread("C:\\pictures\\apple.jpg")
cv2.imread will return the NumPy array of the image. To load the particular image, we need to pass
the image path as a parameter to that function. Here, we are passing the image path as static but
one can also pass it as a command line argument.
Display image
# displaying image
cv2.imshow('Image file', img)
cv2.waitKey(0)
As a second step, we will display the image using the cv2.imshow function. cv2.imshow accepts two parameters: the first is a string, the title of the display window; the second is the actual image/NumPy array (img in our case) that we want to display. Then we call cv2.waitKey, which pauses execution of the script until the user presses a key. We pass 0 as a parameter, which means execution resumes as soon as any key is pressed.
Finally, we want to save the image. To save the image we are going to use cv2.imwrite function,
which accepts two parameters. The first parameter is the path of the image, where you want to
save the image. In my case, I want to save the image as newfile.jpg. And the second parameter is
the actual image or the numpy array that we want to save.
……………………………………………………………
10) How pixels are accessed and manipulated in OpenCV? Explain with an example program.
The definition of an image is very simple: it is a two-dimensional view of a 3D world. Furthermore, a digital
image is a numeric representation of a 2D image as a finite set of digital values. We call these values pixels
and they collectively represent an image. Basically, a pixel is the smallest unit of a digital image (if we zoom
in a picture, we can detect them as miniature rectangles close to each other) that can be displayed on a
computer screen.
In OpenCV, an image is read with the cv2.imread() function and displayed with the cv2.imshow() function. Since the loaded image is a NumPy array, you can access and manipulate individual pixels by indexing the array with a pixel's coordinates.
We mainly use grayscale images as the default choice. Having only one channel makes image processing more convenient: we often convert an image to grayscale because dealing with a single channel is easier and faster. In OpenCV we can also perform image and video analysis in full colour, which we will demonstrate as well.
# Necessary imports
import cv2
import numpy as np
import matplotlib.pyplot as plt
from google.colab.patches import cv2_imshow
# Loading our image with a cv2.imread() function
img=cv2.imread("Cybertruck.jpg",cv2.IMREAD_COLOR)
# Loading our image with a cv2.imread() function
gray=cv2.imread("Cybertruck.jpg",cv2.IMREAD_GRAYSCALE)
# For Google Colab we use the cv2_imshow() function
# but we can use cv2.imshow() if we are programming on our computer
cv2_imshow(img)
cv2_imshow(gray)
First, we need to read the image we want to work with using the cv2.imread() function. If the image is
not in the working directory, make sure to know the exact file path. If we are working in Google Colab we
need to upload our image from our computer. With this in mind, in the following examples we are going to
read the image of the Tesla truck.
If we want to load a color image, we just need to add a second parameter. The value that’s needed
for loading a color image is cv2.IMREAD_COLOR. There’s also another option for loading a color image: we
can just put the number 1 instead cv2.IMREAD_COLOR and we will obtain the same output.
The value that’s needed for loading a grayscale image is cv2.IMREAD_GRAYSCALE, or we can just put
the number 0 instead as an argument.
To display an image, we will use the cv2.imshow() function.
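The indexing just described can be sketched with a small array (pure NumPy here; exactly the same syntax applies to an image loaded with cv2.imread()):

```python
import numpy as np

# A 4x4 BGR image; OpenCV images are indexed as img[y, x]
img = np.zeros((4, 4, 3), np.uint8)

# Access one pixel's (B, G, R) values
b, g, r = img[0, 0]

# Manipulate a single pixel
img[0, 0] = (255, 0, 0)      # pure blue in BGR order

# Manipulate a whole region at once with slicing
img[2:4, 2:4] = (0, 0, 255)  # bottom-right block becomes red

print(img[0, 0], img[3, 3])
```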
Unit2
1) How do you create black and white images in OpenCV? Explain with examples.
To create a black image, we can use the np.zeros() method. It creates a NumPy n-dimensional array of the given size with all elements 0. Since all elements are zero, displaying it with the cv2.imshow() or plt.imshow() functions shows a black image.
To create a white image, we can use the np.ones() method. It creates a NumPy n-dimensional array of the given size with all elements 1. We multiply this array by 255, so every element becomes 255, and displaying it with the cv2.imshow() or plt.imshow() functions gives a white image.
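A minimal sketch of the white-image recipe just described:

```python
import numpy as np

# All-ones array scaled by 255 -> every pixel becomes white
white_image = np.ones((200, 200, 3), np.uint8) * 255
print(white_image[0, 0])  # [255 255 255]
```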
import cv2
from google.colab.patches import cv2_imshow
import numpy as np
blank_image = np.zeros((200,200,3), np.uint8)
cv2_imshow(blank_image)
blank_image[:,0:100] = (255,0,0) # (B, G, R)
blank_image[:,100:200] = (0,255,0)
cv2_imshow(blank_image)
Output: [figure: the black image, then the image with its left half blue and right half green]
Further, we can fill the image with different colours of blue, green, and red by accessing the pixel values through the index values of the NumPy array, as shown above.
(Or)
To create a black image in OpenCV using Python, we can use the cv2.imread() method by passing 0 as a second argument, or use the numpy np.zeros() method. Here is an example:
import cv2
import numpy as np
from google.colab.patches import cv2_imshow
black_image = np.zeros((200, 200, 3), np.uint8)
cv2_imshow(black_image)
2) Differentiate between cropping and resizing of an image.
Cropping and resizing are two common operations performed on images in image processing or computer vision
applications. The main differences between them are:
Cropping an image refers to selecting a region of interest (ROI) from an input image by keeping only a part of it, which can be rectangular or any other shape. In contrast, resizing an image means changing the size of the whole image, uniformly or non-uniformly, based on specific requirements.
The output image after cropping retains the original aspect ratio, while resizing may not. Also, when we crop an
image, we keep the same number of pixels for the selected ROI. However, in resizing, the total number of pixels
changes based on the new size, which can lead to loss or gain of information in the image.
Cropping an image typically helps in eliminating unwanted regions from an image or zooming into a localized
area of interest, whereas resizing helps to scale down large images for efficient processing or to adapt an image to a
specific display size.
from google.colab.patches import cv2_imshow
import cv2
img = cv2.imread('input.jpg')
dimension = img.shape
# Define ROI coordinates (x, y) and size (w, h) for cropping
x, y, w, h = 100, 50, 200, 150
# Perform cropping
cropped_img = img[y:y+h, x:x+w]
# Perform resizing
resized_img = cv2.resize(img, (300, 400))
cv2_imshow(cropped_img)
cv2_imshow(resized_img)
cv2.waitKey(0)
cv2.destroyAllWindows()
In the above example, we load an input image using the cv2.imread() method, define the ROI coordinates to crop, and obtain the cropped image by indexing the pixel values. Next, we resize the original image using the cv2.resize() method and display both the cropped and resized images. Finally, we use the cv2.waitKey() and cv2.destroyAllWindows() methods to close the windows holding the images.
…………………………………………………………………………………………………………
3) Write OpenCV program example to illustrate cropping procedure in an image.
Cropping an image refers to selecting a rectangular region or a portion of an image. This selected part can be then
modified, analyzed or enlarged as per the requirements. It is a common operation performed in computer vision and
image processing.
In OpenCV, images can be cropped using the slicing operator ‘:’. For example, consider a grayscale image 'img' of
dimensions (300, 400). To crop this image from pixels (50, 50) to (250, 350), we can use the code:
crop_img = img[50:250, 50:350]
cv2.imshow('Cropped', crop_img)
cv2.waitKey(0)
cv2.destroyAllWindows()
Here, img[50:250, 50:350] selects rows 50 to 249 and columns 50 to 349 of the original image and creates a new
cropped image 'crop_img'. It is then displayed using cv2.imshow() function.
It is important to note that OpenCV images are NumPy arrays, and slicing returns a view rather than a copy: the cropped image shares memory with the original rather than holding its own copy of the pixel values. Therefore, any modifications made to the cropped image will affect the original image as well (use .copy() to obtain an independent copy).
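This view-versus-copy behaviour comes from NumPy slicing and is easy to demonstrate (pure NumPy here; an image loaded with cv2.imread() behaves identically):

```python
import numpy as np

img = np.zeros((10, 10), np.uint8)

crop = img[2:5, 2:5]   # slicing returns a view, not a copy
crop[:] = 255          # modify the cropped region...
print(img[3, 3])       # ...and the original image changes too: 255

safe = img[2:5, 2:5].copy()  # .copy() gives an independent image
safe[:] = 7
print(img[3, 3])       # still 255; the original is unaffected
```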
Cropping an image is useful when only specific parts of an image are required for further processing. This reduces the
number of pixels to process and hence improves computational time. Further, it can also help in removing unwanted
parts of an image, improving its quality and making it easier to analyze.
Example:
import cv2
from google.colab.patches import cv2_imshow
img = cv2.imread("yooo.jpg")
cropped_image = img[150:300, 150:500]
cv2_imshow(img)
cv2_imshow(cropped_image)
cv2.waitKey(0)
print(cropped_image.shape)
(150, 350, 3)
…………………………………………………………………………………………….
4) Show the process of copying a region to another in an image
Copying a region to another in an image in OpenCV refers to selecting a portion of an image and copying it over
to another location within the same or different image. This can be useful in various applications such as object
detection or removing unwanted portions of an image.
The process of copying a region to another is usually done by first selecting the region of interest using a
rectangle shape. The coordinates of the top-left and bottom-right corners of the rectangle are used to define the region
of interest.
Once the region of interest is selected, we can copy it over to another location in the same or different image.
This is achieved by specifying the destination location on the target image where the selected region will be copied to.
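The original notes show this as a screenshot; the steps can be sketched as follows, with a synthetic array standing in for "Penguins.jpg" (sized so the coordinates below fit):

```python
import numpy as np

# A synthetic 300x900 image stands in for cv2.imread("Penguins.jpg")
img = np.zeros((300, 900, 3), np.uint8)
img[50:200, 150:400] = (0, 255, 0)        # pretend content in the source region

# Select the region of interest by slicing (150 rows x 250 columns)
copy_image = img[50:200, 150:400].copy()  # .copy() avoids aliasing the source

# Paste it into a destination region of exactly the same size
img[40:190, 630:880] = copy_image

print(img[100, 700])  # a pasted pixel, copied from the source region
```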
The above code demonstrated the copying of a region to another in an image using OpenCV. Below are the steps
involved:
1. First, we import necessary packages including OpenCV and google.colab.patches.
2. We load an image called "Penguins.jpg" using the cv2.imread() method.
3. The next step involves selecting a region of interest (ROI) from the loaded image using slicing
technique. Here, "img[50:200,150:400]" selects pixels/region starting from x=150 to x=400 and y=50 to y=200. This step
results in a cropped image called copy_image.
4. To see the original image with the selected ROI, we use the cv2.imshow() method.
5. In the next step, we place the selected/cropped image onto another region of interest in the same
image. Here, "img[40:190,630:880]" pastes the cropped image starting from x=630 to x=880 and y=40 to y=190.
6. Finally, we display the modified/processed image using cv2.imshow() method again.
These are the steps involved in copying a region to another in an image using OpenCV
…………………………………………………………………………….
5) Create a black rectangle and white circle in OpenCV and apply bitwise OR operation on them.
The images can be subjected to arithmetic operations such as addition, subtraction, and bitwise operations
(AND, OR, NOT, XOR). These operations can help to improve the properties of the input images. Image
arithmetic is necessary for analyzing the properties of the input image. The operated images can then be used
as an enhanced input image, and many more operations can be applied to the image. Image arithmetic is the
application of one or more images to one of the standard arithmetic operations or a logical operator. The
operators are applied pixel by pixel, so the value of a pixel in the output image is determined solely by the
values of the corresponding pixels in the input images. As a result, the images must usually be the same size.
When adding a constant offset to an image, one of the input images may be a constant value.
Bitwise operations are used in image manipulation to extract important parts. The following Bitwise operations are used
in this article:
AND
OR
NOT
XOR
Bitwise operations are also useful for image masking. These operations can be used to enable image creation. These
operations can help to improve the properties of the input images.
NOTE: Bitwise operations should only be performed on input images of the same dimensions.
The OR operator typically takes two binary or greyscale images as input and outputs a third image whose pixel values
are the first image’s pixel values ORed with the corresponding pixels from the second. A variant of this operator takes a
single input image and ORs each pixel with a constant value to generate the output.
A bitwise 'OR' examines every pixel in the two inputs, and if *EITHER* pixel in the two images is greater than 0, then
the output pixel has a value of 255, otherwise it is 0.
Code :
import cv2
import numpy as np
from google.colab.patches import cv2_imshow
img1 = np.zeros((300, 300), np.uint8)
cv2.rectangle(img1, (25, 25), (275, 275), 255, -1)   # white rectangle on black
img2 = np.zeros((300, 300), np.uint8)
cv2.circle(img2, (150, 150), 150, 255, -1)           # white circle on black
bitwise_or = cv2.bitwise_or(img1, img2)
cv2_imshow(bitwise_or)
cv2.waitKey(0)
……………………………………………………………………………………
6) What are the different ways of resizing an image? Explain resizing procedures with suitable examples.
Resizing an image
Scaling, or simply resizing, is the process of increasing or decreasing the size of an image in terms of width
and height. When resizing an image, it’s important to keep in mind the aspect ratio — which is the ratio of an
image’s width to its height. Ignoring the aspect ratio can lead to resized images that look compressed and
distorted.
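The aspect-ratio bookkeeping can be kept in a small helper that computes the new (width, height) pair from a target width; resize_dims is a hypothetical name, not an OpenCV function, and its result is what you would pass to cv2.resize:

```python
def resize_dims(width, height, new_width):
    """Return (new_width, new_height) preserving the aspect ratio.

    Hypothetical helper, not part of OpenCV; the returned tuple is the
    (width, height) argument you would pass to cv2.resize.
    """
    ratio = new_width / width          # scale factor implied by the target width
    return new_width, int(height * ratio)

# e.g. halving a 640x480 image keeps the 4:3 aspect ratio
print(resize_dims(640, 480, 320))
```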
We can downscale the original image into a smaller dimension by using cv2.resize() method and specifying the new
dimension (width, height) in the function.
import cv2
from google.colab.patches import cv2_imshow
image = cv2.imread('tej.jpg')
height, width = image.shape[:2]
print(f"Height:{height}, Width:{width}")   # e.g. Height:360, Width:561
# Setting new dimension for image size - 50% of the original size
resize_image = cv2.resize(image, (width // 2, height // 2))
cv2_imshow(image)
cv2_imshow(resize_image)
We can upscale the original image into a larger dimension by using cv2.resize() method and specifying the new
dimension (width, height) in the function.
import cv2
from google.colab.patches import cv2_imshow
image = cv2.imread('tej.jpg')
height, width = image.shape[:2]
print(f"Height:{height}, Width:{width}")
# Setting new dimension for image size - twice the original size
resize_image = cv2.resize(image, (width * 2, height * 2))
cv2_imshow(image)
cv2_imshow(resize_image)
In other ways of resizing an image, the aspect ratio is not preserved. In such cases, the width or height of the
image is changed independently, based on the requirement.
Below is an example of resizing an image only with an increased width, while keeping the height same as the original
image.
Code Example:
import cv2
from google.colab.patches import cv2_imshow
image = cv2.imread('tej.jpg')
height, width = image.shape[:2]
# Increase only the width; keep the original height
resize_image = cv2.resize(image, (width * 2, height))
cv2_imshow(resize_image)
Finally, to resize the width and height simultaneously, we pass both the new width and the new height as
arguments to the cv2.resize function.
Code Example:
import cv2
from google.colab.patches import cv2_imshow
image = cv2.imread('tej.jpg')
# Set both dimensions explicitly (the aspect ratio is not preserved)
resize_image = cv2.resize(image, (400, 200))
cv2_imshow(resize_image)
7) What is the need of image mask? Explain the image mask procedure with an example in OpenCV.
Masking is used in Image Processing to output the Region of Interest, or simply the part of the image that we
are interested in. We tend to use bitwise operations for masking as it allows us to discard the parts of the
image that we do not need.
For example, let’s say that we were building a computer vision system to recognize faces. The only part of the
image we are interested in finding and describing is the parts of the image that contain faces — we simply
don’t need the rest of the image’s content. Provided that we could find the faces in the image, we may construct
a mask to show only the faces in the image.
Example:
#masking
import cv2
import numpy as np
from google.colab.patches import cv2_imshow
img = cv2.imread('pokemon.jpg')
cv2_imshow(img)
# Blank single-channel image to draw the mask on
blank = np.zeros((img.shape[0], img.shape[1]), dtype='uint8')
# A white filled circle at the centre of the image serves as the mask
circle = cv2.circle(blank, (img.shape[1] // 2, img.shape[0] // 2), 100, 255, -1)
masked = cv2.bitwise_and(img, img, mask=circle)
cv2_imshow(masked)
cv2.waitKey(0)
Output:
(Figure: the input image, the circular mask, and the masked image)
In the above code, the cv2.circle() method is used to draw the filled circle that serves as the mask. The syntax
of the cv2.circle() method is:
Syntax: cv2.circle(image, center_coordinates, radius, color, thickness)
The mask is then applied with the cv2.bitwise_and() method, whose syntax is:
Syntax: cv2.bitwise_and(source1_array, source2_array, destination_array, mask)
Parameters:
where source1_array is the array corresponding to the first input image on which the bitwise and
operation is to be performed,
source2_array is the array corresponding to the second input image on which the bitwise and operation is
to be performed,
destination_array is the resulting array obtained by performing the bitwise operation on the two input
arrays, and
mask is an optional 8-bit single-channel array that restricts the operation to the pixels where the mask is
non-zero.
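A minimal sketch of what the mask argument does, using plain NumPy on a tiny made-up array: wherever the mask is non-zero the source pixel is kept, and everywhere else the output is 0, which is the result cv2.bitwise_and(img, img, mask=mask) would produce here:

```python
import numpy as np

# Tiny single-channel 'image' and mask (made-up values)
img  = np.array([[10, 20], [30, 40]], dtype=np.uint8)
mask = np.array([[255, 0], [0, 255]], dtype=np.uint8)

# Keep pixels where the mask is non-zero; zero out the rest
masked = np.where(mask > 0, img, 0).astype(np.uint8)

print(masked)
```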
………………………………………………………………………………………
Images can be subjected to arithmetic operations such as addition and subtraction, and to bitwise
operations (AND, OR, NOT, XOR). These operations can improve the properties of the input images
and are useful for analyzing them. The operated images can then serve as enhanced inputs to which
further operations are applied. Image arithmetic is the application of one of the standard arithmetic
operations or a logical operator to one or more images. The operators are applied pixel by pixel, so
the value of a pixel in the output image is determined solely by the values of the corresponding pixels
in the input images. As a result, the images must usually be the same size. When adding a constant
offset to an image, one of the inputs may be a constant value.
Bitwise Operations
Bitwise operations are used in image manipulation to extract important parts of an image. The following
bitwise operations are covered here:
AND
OR
NOT
XOR
Bitwise operations are also useful for image masking and for constructing new images, and they can
help to improve the properties of the input images.
NOTE: Bitwise operations should only be performed on input images of the same dimensions.
The AND operator (and the NAND operator in a similar fashion) typically takes two binary or integer
graylevel images as input and produces a third image whose pixel values are just those of the first
image ANDed with the corresponding pixels from the second. This operator can be modified to
produce the output by taking a single input image and ANDing each pixel with a predetermined
constant value.
Code :
import cv2
import numpy as np
img1 = cv2.imread('input1.png')
img2 = cv2.imread('input2.png')
dest_and = cv2.bitwise_and(img2, img1, mask = None)
cv2.imshow('Bitwise And', dest_and)
cv2.waitKey(0)
OR Bitwise Operation of Image
The OR operator typically takes two binary or greyscale images as input and outputs a third image
whose pixel values are the first image’s pixel values ORed with the corresponding pixels from the
second. A variant of this operator takes a single input image and ORs each pixel with a constant
value to generate the output.
import cv2
import numpy as np
img1 = cv2.imread('input1.png')
img2 = cv2.imread('input2.png')
dest_or = cv2.bitwise_or(img1, img2, mask = None)
cv2.imshow('Bitwise OR', dest_or)
cv2.waitKey(0)
Logical NOT, also known as invert, is an operator that takes a binary or grayscale image as input
and generates its photographic negative.
Code :
import cv2
import numpy as np
img1 = cv2.imread('input1.png')
dest_not = cv2.bitwise_not(img1, mask = None)
cv2.imshow('Bitwise Not', dest_not)
cv2.waitKey(0)
XOR Bitwise Operation of Image
The XOR operator takes two binary or greyscale images as input and outputs a third image whose
pixel values are the first image's pixel values XORed with the corresponding pixels from the second.
Code :
import cv2
import numpy as np
img1 = cv2.imread('input1.png')
img2 = cv2.imread('input2.png')
dest_xor = cv2.bitwise_xor(img1, img2, mask = None)
cv2.imshow('Bitwise XOR', dest_xor)
cv2.waitKey(0)
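Because these operators act bit by bit on each uint8 pixel, their effect can be checked with plain NumPy (the sample values below are made up; on equal-sized uint8 arrays OpenCV's bitwise functions give the same pixel-wise results):

```python
import numpy as np

a = np.uint8(0b1100)   # 12
b = np.uint8(0b1010)   # 10

and_ = a & b           # 0b1000 -> 8
or_  = a | b           # 0b1110 -> 14
xor_ = a ^ b           # 0b0110 -> 6
not_ = np.uint8(~a)    # 8-bit complement of 12 -> 243

print(and_, or_, xor_, not_)
```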
……………………………………………………….
The adjustment applies the linear transform new_image(x, y) = alpha * image(x, y) + beta,
where alpha (alpha > 0) controls the contrast and beta controls the brightness.
Steps
To change the contrast and brightness of an image, you could follow the steps given below −
Import the required library OpenCV. Make sure you have already installed it.
Read the input image using cv2.imread() method. Specify the full path of the image.
Define alpha (which controls contrast) and beta (which controls brightness), then
call the cv2.convertScaleAbs() function to change the contrast and brightness of the image. This function
returns the image with adjusted contrast and brightness. Alternatively, we can use
the cv2.addWeighted() method to change contrast and brightness.
Display the contrast and brightness adjusted image.
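The steps above implement the linear transform new_pixel = alpha * pixel + beta, saturated to [0, 255]. Below is a minimal NumPy sketch of that formula (pixel values made up for illustration; for non-negative results this matches what cv2.convertScaleAbs computes):

```python
import numpy as np

img = np.array([[100, 200]], dtype=np.uint8)  # made-up pixel values
alpha, beta = 1.5, 20                          # contrast and brightness

# new_pixel = alpha * pixel + beta, rounded and saturated to [0, 255]
adjusted = np.clip(alpha * img.astype(np.float64) + beta, 0, 255)
adjusted = np.rint(adjusted).astype(np.uint8)

print(adjusted)
```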
Let's see the examples to change the contrast and brightness of an image.
Input Image
We will use the following image as the input file in the examples below.
Example
In this Python program, we change the contrast and brightness of the input image
using cv2.convertScaleAbs() method.
import cv2
image = cv2.imread('food1.jpg')
# alpha > 1 increases contrast; beta shifts brightness
alpha = 1.5
beta = 20
adjusted = cv2.convertScaleAbs(image, alpha=alpha, beta=beta)
cv2.imshow('adjusted', adjusted)
cv2.waitKey()
cv2.destroyAllWindows()
Output
When you execute the above code it will produce the following output window -
Example
In this Python program, we change the contrast and brightness of the input image
using cv2.addWeighted() method.
import cv2
import numpy as np
img = cv2.imread('food1.jpg')
# out = alpha*img + beta, expressed as a weighted sum with a zero image
alpha = 1.5
beta = 20
out = cv2.addWeighted(img, alpha, np.zeros(img.shape, img.dtype), 0, beta)
cv2.imshow('adjusted', out)
cv2.waitKey(0)
cv2.destroyAllWindows()
Output
When you execute the above code, it will produce the following output window.