Computer Vision Important Questions Answers

The document contains a series of objective questions and answers related to Computer Vision, covering fundamental concepts such as the goals of Computer Vision, applications, and specific techniques like Convolutional Neural Networks (CNNs) and image processing methods. Key topics include image classification, object detection, and the significance of various image features and layers in CNNs. Each question is accompanied by an explanation to clarify the correct answers.

COMPUTER VISION

OBJECTIVE QUESTIONS

1. What is the primary goal of Computer Vision?


(a) To mimic human thought processes
(b) To enable machines to see and analyze images
(c) To develop new programming languages
(d) To create artificial life forms
Answer: (b) To enable machines to see and analyze images
Explanation: Computer Vision is a domain of Artificial Intelligence that allows machines to process
and analyze visual data using algorithms.

2. Which of the following is NOT an application of Computer Vision?


(a) Facial recognition
(b) Self-driving cars
(c) Cooking food
(d) Medical imaging
Answer: (c) Cooking food
Explanation: Computer Vision is widely used in various fields like security (facial recognition),
transportation (self-driving cars), and healthcare (medical imaging), but it is not used for cooking
food.

3. What is the role of the Convolutional Layer in a Convolutional Neural Network (CNN)?
(a) To extract features such as edges and shapes from the input
(b) To reduce image resolution
(c) To classify images directly
(d) To store images for future use
Answer: (a) To extract features such as edges and shapes from the input
Explanation: The Convolutional Layer detects features such as edges, gradients, and textures in the
input image; early layers capture simple, low-level features, while deeper layers combine them into
higher-level ones.

4. In an RGB image, how is color information stored?


(a) Using a single grayscale channel
(b) Using three separate color channels: Red, Green, and Blue
(c) Using a hexadecimal color code
(d) Using a binary color system
Answer: (b) Using three separate color channels: Red, Green, and Blue
Explanation: In an RGB image, each pixel has three values corresponding to Red, Green, and Blue
intensities, which combine to form different colors.

5. What is the purpose of the Rectified Linear Unit (ReLU) in a CNN?


(a) To convert the image into grayscale
(b) To remove all negative values from the feature map
(c) To reduce the size of the image
(d) To classify images into categories
Answer: (b) To remove all negative values from the feature map
Explanation: ReLU introduces non-linearity by replacing all negative values with zero, making the
feature extraction process more effective.
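As an illustrative sketch (the feature-map values below are made up, not from any real network), ReLU can be applied to a small feature map with NumPy:

```python
import numpy as np

# A small feature map as it might come out of a convolution (values are illustrative)
feature_map = np.array([[-3, 5],
                        [ 2, -1]])

# ReLU: replace every negative value with zero, keep positive values unchanged
relu_output = np.maximum(0, feature_map)

print(relu_output)
# [[0 5]
#  [2 0]]
```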

6. What does "Object Detection" in Computer Vision involve?


(a) Assigning a single label to an image
(b) Identifying and locating multiple objects within an image

Prepared by: M. S. KumarSwamy, TGT(Maths) Page - 1 -


(c) Enhancing image brightness
(d) Converting images into text
Answer: (b) Identifying and locating multiple objects within an image
Explanation: Object detection involves recognizing multiple objects in an image and determining
their positions.

7. What is a key advantage of Max Pooling in CNNs?


(a) It increases the image resolution
(b) It retains only the most important features while reducing dimensionality
(c) It converts images into 3D models
(d) It replaces grayscale images with color images
Answer: (b) It retains only the most important features while reducing dimensionality
Explanation: Max Pooling selects the maximum value from a region in the feature map, preserving
essential information while reducing computation.
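The dimensionality reduction can be seen in a small NumPy sketch (illustrative values; 2×2 pooling with stride 2):

```python
import numpy as np

# A 4x4 feature map (values are illustrative)
fmap = np.array([[1, 3, 2, 4],
                 [5, 6, 1, 2],
                 [7, 2, 9, 0],
                 [1, 8, 3, 4]])

# 2x2 max pooling with stride 2: keep only the largest value in each 2x2 block
pooled = fmap.reshape(2, 2, 2, 2).max(axis=(1, 3))

print(pooled)
# [[6 4]
#  [8 9]]
```

The 4×4 map shrinks to 2×2, but the strongest activation in each region survives.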

8. How does Google Translate use Computer Vision?


(a) By recognizing and translating text from images
(b) By predicting future translations
(c) By modifying image pixels
(d) By analyzing video content
Answer: (a) By recognizing and translating text from images
Explanation: Google Translate uses optical character recognition (OCR) to identify text in an image
and translate it into another language.

9. What is the significance of Pixel Value in an image?


(a) It determines the file format of the image
(b) It represents the brightness and/or color of a pixel
(c) It defines the image compression ratio
(d) It decides the physical size of the image
Answer: (b) It represents the brightness and/or color of a pixel
Explanation: Each pixel in an image has a numerical value that represents its color and brightness.

10. Why are corners considered better features in an image than edges?
(a) Because they are larger in size
(b) Because they have unique intensity variations in multiple directions
(c) Because they are smoother
(d) Because they contain more pixels
Answer: (b) Because they have unique intensity variations in multiple directions
Explanation: Corners are more distinguishable than edges because the image intensity changes
sharply in every direction around a corner, whereas along an edge it changes in only one direction.

11. What is the primary function of Image Classification in Computer Vision?


(a) Assigning an image one label from a set of predefined categories
(b) Identifying and locating multiple objects in an image
(c) Converting images into 3D models
(d) Enhancing the brightness of an image
Answer: (a) Assigning an image one label from a set of predefined categories
Explanation: Image Classification is a fundamental task in Computer Vision where an image is
categorized into one of the predefined classes.

12. How does Google’s "Search by Image" feature utilize Computer Vision?
(a) By using textual data to find similar images
(b) By analyzing image features and comparing them to a database

(c) By converting images into text
(d) By generating new images from existing ones
Answer: (b) By analyzing image features and comparing them to a database
Explanation: Google’s "Search by Image" feature uses Computer Vision techniques to compare
different image features and retrieve similar images.

13. What does "Instance Segmentation" in Computer Vision do?


(a) Identifies objects in an image without marking their location
(b) Detects and assigns a unique label to each pixel of different objects in an image
(c) Increases image resolution
(d) Converts grayscale images into colored images
Answer: (b) Detects and assigns a unique label to each pixel of different objects in an image
Explanation: Instance Segmentation is the process of detecting objects, categorizing them, and
labeling each pixel of the objects in an image.

14. Why do self-driving cars rely on Computer Vision?


(a) To generate new road maps
(b) To detect objects, recognize signals, and navigate routes
(c) To control the speed of the car
(d) To communicate with other vehicles
Answer: (b) To detect objects, recognize signals, and navigate routes
Explanation: Self-driving cars use Computer Vision to identify objects like pedestrians, traffic
signals, and obstacles to navigate safely.

15. What is the significance of "Grayscale Images" in Computer Vision?


(a) They allow for color enhancement
(b) They are easier to process as they contain only intensity values ranging from 0 to 255
(c) They increase the size of an image
(d) They store more color information than RGB images
Answer: (b) They are easier to process as they contain only intensity values ranging from 0 to 255
Explanation: Grayscale images reduce computational complexity since they only have one channel
instead of three (R, G, and B).

16. What is a Kernel in image processing?


(a) A high-resolution image format
(b) A matrix used in convolution operations to process an image
(c) A data storage unit for images
(d) A color-coding technique
Answer: (b) A matrix used in convolution operations to process an image
Explanation: A Kernel is a small matrix used to filter images, extract features, and apply effects like
blurring or sharpening.
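A minimal sketch of the convolution operation itself, in plain NumPy (the image and kernel are illustrative; like most deep-learning code, this computes cross-correlation, i.e. it does not flip the kernel):

```python
import numpy as np

# A Laplacian-style 3x3 kernel that responds to intensity changes
kernel = np.array([[ 0, -1,  0],
                   [-1,  4, -1],
                   [ 0, -1,  0]])

# A 5x5 grayscale image with a bright vertical stripe down the middle
image = np.zeros((5, 5))
image[:, 2] = 255

# Slide the kernel over every valid 3x3 window (no padding)
h, w = image.shape
out = np.zeros((h - 2, w - 2))
for i in range(h - 2):
    for j in range(w - 2):
        out[i, j] = np.sum(image[i:i+3, j:j+3] * kernel)

print(out[0])  # [-255.  510. -255.] -> strong response at the stripe
```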

17. What is the purpose of the Pooling Layer in Convolutional Neural Networks (CNNs)?
(a) To reduce computational load and extract dominant features
(b) To add more pixels to an image
(c) To transform images into 3D representations
(d) To change image colors
Answer: (a) To reduce computational load and extract dominant features
Explanation: The Pooling Layer reduces the spatial dimensions of feature maps, making
computations faster while preserving key features.

18. Why do images have a maximum pixel value of 255 in an 8-bit system?
(a) Because 255 is the largest value an 8-bit binary number can represent
(b) Because images cannot store values beyond 255
(c) Because the computer cannot handle more than 255 colors
(d) Because every image pixel must be black or white
Answer: (a) Because 255 is the largest value an 8-bit binary number can represent
Explanation: In an 8-bit system, each pixel is represented using 8 bits, allowing for 2^8 = 256
possible values (0 to 255).
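The arithmetic behind the 0-255 range can be checked directly in Python:

```python
# 8 bits give 2**8 = 256 distinct values, so intensities run from 0 to 255
bits = 8
num_values = 2 ** bits       # 256
max_value = num_values - 1   # 255

print(bin(max_value))        # 0b11111111 (eight ones)
```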

19. How does facial recognition work using Computer Vision?


(a) By matching an image with stored text data
(b) By detecting and analyzing facial features to recognize individuals
(c) By comparing hair colors in an image
(d) By estimating the height of a person in an image
Answer: (b) By detecting and analyzing facial features to recognize individuals
Explanation: Facial recognition technology analyzes facial features like the distance between the
eyes and nose to identify individuals.

20. What is the purpose of OpenCV in Computer Vision?


(a) To store high-resolution images
(b) To provide an open-source library for image processing and analysis
(c) To convert text into images
(d) To create 3D animations
Answer: (b) To provide an open-source library for image processing and analysis
Explanation: OpenCV (Open Source Computer Vision Library) is widely used for processing
images, object detection, and facial recognition.

21. What is the primary difference between Classification and Object Detection in Computer
Vision?
(a) Classification identifies multiple objects, whereas Object Detection assigns a single label to an
image
(b) Classification assigns a single label to an image, while Object Detection identifies and locates
multiple objects
(c) Both perform the same function in different ways
(d) Classification is used for videos, and Object Detection is used for images
Answer: (b) Classification assigns a single label to an image, while Object Detection identifies and
locates multiple objects
Explanation: Classification determines what an image represents, while Object Detection finds
multiple objects and their locations in an image.

22. What is the role of "Feature Maps" in Convolutional Neural Networks (CNNs)?
(a) To store high-resolution image data
(b) To capture essential features from an image after applying convolution
(c) To convert images into text format
(d) To directly classify objects in an image
Answer: (b) To capture essential features from an image after applying convolution
Explanation: Feature maps contain extracted information such as edges, textures, and patterns after
convolution is applied to an image.

23. What is the main function of the Fully Connected Layer in a CNN?
(a) To downsample the image
(b) To classify the extracted features into specific categories
(c) To detect the edges of objects in an image
(d) To apply different color filters
Answer: (b) To classify the extracted features into specific categories
Explanation: The Fully Connected Layer takes the feature maps from previous layers and classifies
them into specific object categories.



24. What is the primary advantage of using Convolutional Neural Networks (CNNs) for image
processing?
(a) They store images in grayscale format
(b) They automatically learn spatial hierarchies of features
(c) They require manual feature selection for classification
(d) They do not require any labeled data for training
Answer: (b) They automatically learn spatial hierarchies of features
Explanation: CNNs efficiently learn and extract features from images through convolutional layers
without the need for manual feature selection.

25. In a CNN, what does the Rectified Linear Unit (ReLU) activation function do?
(a) It normalizes pixel values in an image
(b) It removes negative values from the feature map, introducing non-linearity
(c) It converts an image into binary form
(d) It merges multiple layers in a CNN
Answer: (b) It removes negative values from the feature map, introducing non-linearity
Explanation: ReLU replaces negative values with zero, ensuring non-linearity in the neural network
for better learning of complex patterns.

26. Which of the following statements about Convolution in image processing is correct?
(a) Convolution applies a filter (kernel) over an image to extract features
(b) Convolution increases the size of the image
(c) Convolution removes colors from an image
(d) Convolution is only used for reducing noise in images
Answer: (a) Convolution applies a filter (kernel) over an image to extract features
Explanation: Convolution involves sliding a kernel over an image to detect features such as edges,
textures, and patterns.

27. What happens to an image when it undergoes Max Pooling in a CNN?


(a) The image becomes larger in size
(b) Only the most important features are retained while reducing dimensionality
(c) The colors in the image change
(d) The image becomes more detailed
Answer: (b) Only the most important features are retained while reducing dimensionality
Explanation: Max Pooling helps reduce the size of the feature map while keeping the most essential
features, improving efficiency in processing.

28. In an 8-bit grayscale image, what does a pixel value of 0 represent?


(a) White
(b) Black
(c) Gray
(d) Transparent
Answer: (b) Black
Explanation: In grayscale images, a pixel value of 0 represents black, while 255 represents white,
with values in between representing shades of gray.

29. How does OpenCV help in Computer Vision applications?


(a) It provides an open-source library for image and video processing
(b) It only works with grayscale images
(c) It stores high-resolution image data
(d) It converts videos into 3D models
Answer: (a) It provides an open-source library for image and video processing
Explanation: OpenCV is a widely used open-source library that provides tools for image processing,
object detection, and facial recognition.



30. Why are corners considered good features for object detection?
(a) They have unique intensity variations in multiple directions
(b) They are easier to blur
(c) They appear more frequently in images
(d) They provide information about an object’s background
Answer: (a) They have unique intensity variations in multiple directions
Explanation: Corners are easy to detect and track because their intensity values change distinctly in
multiple directions, making them good features for object detection.

31. Which of the following is an example of Computer Vision in retail?


(a) Image compression
(b) Customer movement tracking and inventory management
(c) Creating 3D models of stores
(d) Predicting customer purchases using text analysis
Answer: (b) Customer movement tracking and inventory management
Explanation: Computer Vision is used in retail to analyze customer movement patterns and manage
inventory by tracking stock levels using image data.

32. What is the primary reason for using Convolutional Neural Networks (CNNs) in image
processing?
(a) CNNs require less memory than traditional neural networks
(b) CNNs automatically extract important features from images
(c) CNNs can only process grayscale images
(d) CNNs store images in binary format
Answer: (b) CNNs automatically extract important features from images
Explanation: CNNs use convolution layers to detect patterns like edges, textures, and shapes,
reducing the need for manual feature selection.

33. How does Google Translate use Computer Vision for real-time translation?
(a) By using voice input to recognize languages
(b) By identifying and translating text in images using Optical Character Recognition (OCR)
(c) By analyzing hand gestures
(d) By converting text into speech for translation
Answer: (b) By identifying and translating text in images using Optical Character Recognition
(OCR)
Explanation: Google Translate uses OCR to extract text from images and then translates it into the
desired language.

34. What is the purpose of the Convolution operation in image processing?


(a) To add noise to an image
(b) To apply filters that extract important features from an image
(c) To convert images into text format
(d) To increase the resolution of an image
Answer: (b) To apply filters that extract important features from an image
Explanation: Convolution applies a kernel (filter) to an image to highlight features like edges,
patterns, and textures.

35. In a grayscale image, what does a pixel value of 255 represent?


(a) Black
(b) White
(c) A completely transparent pixel
(d) A random noise value

Answer: (b) White
Explanation: In an 8-bit grayscale image, pixel values range from 0 (black) to 255 (white), with
intermediate values representing different shades of gray.

36. What is the main advantage of using Pooling layers in a CNN?


(a) They increase image resolution
(b) They make the model more computationally efficient by reducing spatial dimensions
(c) They add color information to images
(d) They remove background noise from images
Answer: (b) They make the model more computationally efficient by reducing spatial dimensions
Explanation: Pooling layers downsample feature maps, retaining essential features while reducing
computational complexity.

37. Which of the following is a correct statement about RGB images?


(a) Each pixel has three values corresponding to Red, Green, and Blue intensities
(b) RGB images only contain shades of gray
(c) The pixel values in an RGB image range from 0 to 100
(d) RGB images are stored as a single-layer grayscale image
Answer: (a) Each pixel has three values corresponding to Red, Green, and Blue intensities
Explanation: In an RGB image, each pixel is represented by three values (R, G, and B), which define
its color intensity.
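A tiny NumPy sketch (an illustrative 2×2 image) shows the three-channel layout of an RGB image:

```python
import numpy as np

# A 2x2 RGB image: every pixel holds three 8-bit values (R, G, B)
img = np.zeros((2, 2, 3), dtype=np.uint8)
img[0, 0] = [255, 0, 0]      # pure red
img[0, 1] = [0, 255, 0]      # pure green
img[1, 0] = [0, 0, 255]      # pure blue
img[1, 1] = [255, 255, 255]  # white: all three channels at maximum

# Each channel can be viewed as its own single-layer image
red_channel = img[:, :, 0]
print(red_channel)
# [[255   0]
#  [  0 255]]
```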

38. What is a key difference between Max Pooling and Average Pooling in CNNs?
(a) Max Pooling selects the maximum pixel value from a region, while Average Pooling calculates
the average pixel value
(b) Max Pooling increases the image size, while Average Pooling reduces it
(c) Max Pooling blurs the image, while Average Pooling sharpens it
(d) Max Pooling only works on grayscale images, while Average Pooling works on color images
Answer: (a) Max Pooling selects the maximum pixel value from a region, while Average Pooling
calculates the average pixel value
Explanation: Max Pooling retains the most significant features by selecting the highest value in a
region, while Average Pooling smooths out the values.
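The contrast between the two pooling operations is easy to see on a single 2×2 region (illustrative values):

```python
import numpy as np

# One 2x2 region of a feature map
region = np.array([[1, 4],
                   [2, 3]])

max_pooled = region.max()    # 4   -> keeps the strongest activation
avg_pooled = region.mean()   # 2.5 -> smooths all activations together
```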

39. Why is Convolution useful in Computer Vision applications like facial recognition?
(a) It helps in resizing images to a fixed dimension
(b) It extracts important patterns like edges and shapes from faces
(c) It reduces the color depth of an image
(d) It converts a 2D image into a 3D model
Answer: (b) It extracts important patterns like edges and shapes from faces
Explanation: Convolution applies filters to images to detect patterns like facial features, making it
essential for facial recognition.

40. What is the purpose of edge detection in Computer Vision?


(a) To reduce image file size
(b) To identify boundaries and shapes within an image
(c) To convert images into grayscale
(d) To smooth out color transitions in an image
Answer: (b) To identify boundaries and shapes within an image
Explanation: Edge detection highlights transitions between different objects or features in an image,
helping in object detection and segmentation.

41. What does the term "Pixel" stand for in digital imaging?
(a) A 3D representation of an image
(b) The smallest unit of a digital image

(c) A special type of image filter
(d) A method of image encryption
Answer: (b) The smallest unit of a digital image
Explanation: A pixel (short for "picture element") is the smallest unit of a digital image that
contributes to forming the overall picture.

42. How does Facial Recognition technology use Computer Vision?


(a) By comparing text descriptions of faces
(b) By detecting and analyzing facial features for identification
(c) By changing facial expressions in photos
(d) By converting faces into animated images
Answer: (b) By detecting and analyzing facial features for identification
Explanation: Facial recognition technology uses Computer Vision to identify and verify people
based on their facial features.

43. What is the main purpose of Image Segmentation in Computer Vision?


(a) To convert images into text
(b) To divide an image into meaningful regions for analysis
(c) To apply random colors to an image
(d) To increase the resolution of an image
Answer: (b) To divide an image into meaningful regions for analysis
Explanation: Image Segmentation is used to partition an image into multiple segments to simplify
analysis and object detection.

44. Why is OpenCV widely used in Computer Vision applications?


(a) It provides tools for image processing and analysis
(b) It is the only software available for image editing
(c) It is a paid software for advanced graphics rendering
(d) It is mainly used for creating animated videos
Answer: (a) It provides tools for image processing and analysis
Explanation: OpenCV (Open Source Computer Vision Library) is a widely used open-source library
for image processing, object detection, and video analysis.

45. What does Instance Segmentation do differently from Object Detection?


(a) It identifies the presence of an object without detecting its shape
(b) It assigns a pixel-wise mask to each detected object in an image
(c) It blurs out unwanted parts of an image
(d) It converts an image into a 3D model
Answer: (b) It assigns a pixel-wise mask to each detected object in an image
Explanation: Unlike Object Detection, Instance Segmentation detects objects and provides a precise
mask for each object in the image.

46. What is the primary purpose of the Kernel in a Convolutional Neural Network (CNN)?
(a) To add random noise to an image
(b) To extract specific features from an image during the convolution operation
(c) To convert an image into a grayscale format
(d) To increase the size of an image
Answer: (b) To extract specific features from an image during the convolution operation
Explanation: A kernel (or filter) slides over an image and extracts features such as edges, textures,
and patterns.

47. Why are Grayscale images used in many Computer Vision applications?
(a) They contain more color information
(b) They are easier to process since they only contain intensity values

(c) They make images larger and more detailed
(d) They automatically enhance image quality
Answer: (b) They are easier to process since they only contain intensity values
Explanation: Grayscale images are computationally efficient because they contain only one intensity
value per pixel instead of three (R, G, B).

48. What is the main function of the Fully Connected Layer in a CNN?
(a) To classify the extracted features into specific categories
(b) To apply filters to an image
(c) To detect object edges in an image
(d) To reduce image noise
Answer: (a) To classify the extracted features into specific categories
Explanation: The Fully Connected Layer takes feature maps from previous layers and maps them to
specific object categories.

49. How does Convolution help in image processing?


(a) By applying a filter to highlight important features in an image
(b) By increasing the resolution of an image
(c) By converting images into 3D models
(d) By removing all color information from an image
Answer: (a) By applying a filter to highlight important features in an image
Explanation: Convolution applies a filter (kernel) to extract important features such as edges and
textures.

50. What role does the Pooling Layer play in a CNN?


(a) It increases image brightness
(b) It reduces the size of feature maps while retaining important information
(c) It converts images into text format
(d) It generates new colors in an image
Answer: (b) It reduces the size of feature maps while retaining important information
Explanation: The Pooling Layer downsamples feature maps, making computations more efficient
while preserving key image features.

51. What is the significance of the term "Resolution" in digital images?


(a) The number of colors an image can display
(b) The total number of pixels in an image
(c) The speed at which an image loads
(d) The amount of time taken to process an image
Answer: (b) The total number of pixels in an image
Explanation: Resolution refers to the number of pixels in an image, usually expressed as width ×
height, such as 1280 × 1024 pixels.

52. What is the difference between Classification and Classification + Localization in Computer
Vision?
(a) Classification only identifies the object, while Classification + Localization identifies both the
object and its location
(b) Classification detects multiple objects, while Classification + Localization detects only one object
(c) Classification + Localization is a simpler form of Classification
(d) There is no difference between the two
Answer: (a) Classification only identifies the object, while Classification + Localization identifies
both the object and its location
Explanation: Classification assigns a label to an image, while Classification + Localization identifies
the object and marks its position within the image.



53. How does a computer represent colors in an RGB image?
(a) Using a single grayscale value for each pixel
(b) Using three separate color channels: Red, Green, and Blue
(c) Using a hexadecimal color code
(d) By assigning a single numerical value to each image
Answer: (b) Using three separate color channels: Red, Green, and Blue
Explanation: RGB images are stored as three separate layers for Red, Green, and Blue, with each
pixel having three values corresponding to these colors.

54. What is the main advantage of using Convolutional Neural Networks (CNNs) over
traditional Machine Learning models for image processing?
(a) CNNs require less data for training
(b) CNNs automatically extract and learn features from images without manual feature selection
(c) CNNs use more computational resources but do not improve accuracy
(d) CNNs are only useful for text processing
Answer: (b) CNNs automatically extract and learn features from images without manual feature
selection
Explanation: CNNs use convolutional layers to detect patterns in images, eliminating the need for
manually defined features.

55. What happens when the Rectified Linear Unit (ReLU) activation function is applied in a
CNN?
(a) It converts all pixel values to grayscale
(b) It removes negative values from the feature map, keeping only positive values
(c) It increases the image resolution
(d) It adds more color intensity to the image
Answer: (b) It removes negative values from the feature map, keeping only positive values
Explanation: ReLU introduces non-linearity into the network by setting negative values to zero
while keeping positive values unchanged.

56. What is the function of Edge Detection in Computer Vision?


(a) It increases the brightness of an image
(b) It helps identify object boundaries by detecting changes in intensity
(c) It converts colored images into black and white
(d) It blurs the image to reduce noise
Answer: (b) It helps identify object boundaries by detecting changes in intensity
Explanation: Edge Detection techniques, such as Sobel and Canny filters, help identify the edges of
objects by detecting intensity variations in an image.
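As an illustrative sketch of Sobel-style edge detection (plain NumPy on a synthetic image; a real application would typically call a library such as OpenCV instead):

```python
import numpy as np

# Sobel kernel for horizontal intensity changes (responds to vertical edges)
sobel_x = np.array([[-1, 0, 1],
                    [-2, 0, 2],
                    [-1, 0, 1]])

# A 5x5 image: dark left half, bright right half -> one vertical edge
image = np.zeros((5, 5))
image[:, 3:] = 255

# Apply the kernel at every valid 3x3 window (no padding)
h, w = image.shape
grad = np.zeros((h - 2, w - 2))
for i in range(h - 2):
    for j in range(w - 2):
        grad[i, j] = np.sum(image[i:i+3, j:j+3] * sobel_x)

print(grad[0])  # [   0. 1020. 1020.] -> large values mark the edge
```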

57. Why is Max Pooling used in CNNs?


(a) To increase the computational complexity of the model
(b) To reduce the size of the feature map while retaining essential information
(c) To add more detail to the image
(d) To remove all unwanted objects from an image
Answer: (b) To reduce the size of the feature map while retaining essential information
Explanation: Max Pooling selects the highest pixel value from a region, reducing image size while
preserving the most important features.

58. How does Google’s Search by Image feature use Computer Vision?
(a) By comparing different features of an input image with a database of images
(b) By scanning images for hidden text only
(c) By converting images into videos
(d) By detecting objects without analyzing their features

Answer: (a) By comparing different features of an input image with a database of images
Explanation: Google’s Search by Image feature uses Computer Vision to analyze and compare
image features to find similar images online.

59. Why do self-driving cars rely heavily on Computer Vision?


(a) To capture high-quality images of roads for future reference
(b) To detect objects, recognize road signs, and navigate autonomously
(c) To entertain passengers with enhanced visual effects
(d) To replace GPS technology
Answer: (b) To detect objects, recognize road signs, and navigate autonomously
Explanation: Self-driving cars use Computer Vision to analyze their surroundings, detect objects like
pedestrians and road signs, and make driving decisions.

60. What is the importance of Optical Character Recognition (OCR) in Computer Vision?
(a) It converts images into audio files
(b) It detects objects in a video
(c) It recognizes and extracts text from images
(d) It enhances image brightness
Answer: (c) It recognizes and extracts text from images
Explanation: OCR is a Computer Vision technique that allows machines to recognize and extract
text from images, making it useful for applications like document scanning and translation.

QUESTIONS AND ANSWERS (2 marks)

1. What is Computer Vision, and how does it work?


Answer: Computer Vision is a field of Artificial Intelligence (AI) that enables machines to process
and analyze visual data such as images and videos. It works by using algorithms and models to
interpret, recognize, and make decisions based on visual inputs.
Explanation: Computer Vision mimics human vision by detecting patterns, objects, and features in
images through techniques like image processing, object detection, and deep learning models.

2. What are the main applications of Computer Vision?


Answer: Computer Vision is widely used in applications such as facial recognition, self-driving cars,
medical imaging, retail (customer tracking and inventory management), Google’s Search by Image,
and augmented reality filters in social media apps.
Explanation: The ability of Computer Vision to analyze and process images enables industries like
healthcare, security, and retail to automate tasks and improve efficiency.

3. What is the difference between Object Detection and Instance Segmentation?


Answer: Object Detection identifies and locates multiple objects in an image, while Instance
Segmentation assigns a pixel-wise mask to each detected object, distinguishing them from the
background.
Explanation: Object Detection marks objects with bounding boxes, whereas Instance Segmentation
provides precise outlines by labeling every pixel belonging to an object.

4. How does a Convolutional Neural Network (CNN) process an image?


Answer: A CNN processes an image through multiple layers, including convolution layers for feature
extraction, activation layers (like ReLU) for non-linearity, pooling layers for dimensionality
reduction, and fully connected layers for classification.
Explanation: CNNs automatically learn spatial hierarchies of features, from simple edges in early
layers to complex patterns in deeper layers, making them highly effective in image recognition.
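The stages described above can be sketched in a few lines of pure Python. This is a toy illustration only, not a real CNN: the image, kernel, and layer sizes are invented for demonstration.

```python
# Toy illustration of the CNN stages described above:
# convolution -> ReLU -> max pooling. Pure Python, no frameworks.

def convolve2d(image, kernel):
    """Valid-mode 2D convolution (no padding, stride 1)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    out = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            s = sum(image[i + m][j + n] * kernel[m][n]
                    for m in range(kh) for n in range(kw))
            row.append(s)
        out.append(row)
    return out

def relu(fmap):
    """Replace negative values with zero (element-wise)."""
    return [[max(0, v) for v in row] for row in fmap]

def max_pool(fmap, size=2):
    """Non-overlapping max pooling."""
    out = []
    for i in range(0, len(fmap) - size + 1, size):
        row = []
        for j in range(0, len(fmap[0]) - size + 1, size):
            row.append(max(fmap[i + m][j + n]
                           for m in range(size) for n in range(size)))
        out.append(row)
    return out

image = [
    [1, 2, 0, 1, 2, 0],
    [0, 1, 2, 0, 1, 2],
    [2, 0, 1, 2, 0, 1],
    [1, 2, 0, 1, 2, 0],
    [0, 1, 2, 0, 1, 2],
    [2, 0, 1, 2, 0, 1],
]
# An illustrative vertical-edge-style kernel.
kernel = [[1, 0, -1],
          [1, 0, -1],
          [1, 0, -1]]

features = relu(convolve2d(image, kernel))   # 4x4 feature map
pooled = max_pool(features)                  # 2x2 after pooling
print(pooled)
```

In a real CNN the pooled features would then be flattened and passed to fully connected layers for classification.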

Prepared by: M. S. KumarSwamy, TGT(Maths) Page - 11 -


5. What is the function of a Kernel in image processing?
Answer: A Kernel (or filter) is a small matrix that slides over an image to apply transformations such
as edge detection, sharpening, or blurring.
Explanation: The convolution operation between the Kernel and the image helps extract specific
features needed for further analysis, such as detecting edges or textures.

6. Why is the Rectified Linear Unit (ReLU) activation function used in CNNs?
Answer: ReLU replaces negative values in the feature map with zero while keeping positive values
unchanged, introducing non-linearity to the model.
Explanation: Since most real-world data is non-linear, ReLU helps CNNs capture complex patterns
by ensuring that the network can learn from variations in input images.
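The effect of ReLU on a feature map can be shown in one line (toy values chosen for illustration):

```python
# ReLU keeps positive values and zeroes out negatives.
def relu(feature_map):
    return [[max(0, v) for v in row] for row in feature_map]

fmap = [[-3, 5],
        [2, -1]]
print(relu(fmap))  # [[0, 5], [2, 0]]
```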

7. How does Max Pooling improve the efficiency of a CNN?


Answer: Max Pooling reduces the spatial dimensions of a feature map while preserving its most
important features, making computations faster and reducing overfitting.
Explanation: By selecting only the highest value from a region, Max Pooling retains key information
and eliminates redundant data, helping CNNs process images more effectively.

8. What is Optical Character Recognition (OCR), and how is it used in Computer Vision?
Answer: OCR is a technology that recognizes and extracts text from images, enabling machines to
read printed or handwritten text.
Explanation: OCR is widely used in document scanning, automatic license plate recognition, and
real-time translation apps like Google Translate, where text is detected and converted into digital
format.

9. Why are Grayscale images commonly used in Computer Vision?


Answer: Grayscale images simplify computations because they contain only intensity values
(ranging from 0 to 255) instead of three color channels (R, G, B).
Explanation: Since color information is not always necessary for tasks like edge detection, using
grayscale images reduces computational complexity and speeds up processing.
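The standard luminosity conversion from an RGB pixel to a single intensity uses the BT.601 weights (0.299 R + 0.587 G + 0.114 B), the formula adopted by common image libraries:

```python
# Convert an RGB pixel to a single grayscale intensity using the
# common luminosity weights (ITU-R BT.601).
def to_gray(r, g, b):
    return round(0.299 * r + 0.587 * g + 0.114 * b)

print(to_gray(255, 255, 255))  # 255 (white)
print(to_gray(0, 0, 0))        # 0 (black)
print(to_gray(255, 0, 0))      # pure red maps to 76
```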

10. How does Computer Vision contribute to self-driving cars?


Answer: Computer Vision enables self-driving cars to detect objects, recognize road signs, analyze
traffic conditions, and navigate safely by processing real-time visual data.
Explanation: By using deep learning models and sensor-based vision systems, self-driving cars can
identify obstacles, lane markings, and pedestrians, making autonomous navigation possible.

11. What is the difference between Image Classification and Object Detection in Computer
Vision?
Answer: Image Classification assigns a single label to an entire image, while Object Detection
identifies multiple objects within an image and provides their locations.
Explanation: Image Classification is useful for tasks where only one object needs to be recognized,
whereas Object Detection is necessary when multiple objects in an image must be identified and
localized.

12. How do self-driving cars use Computer Vision to navigate?


Answer: Self-driving cars use Computer Vision to detect objects, recognize traffic signals, track
lanes, and analyze road conditions.
Explanation: By processing real-time image data from cameras and sensors, self-driving cars can
make decisions like stopping for pedestrians, changing lanes, or maintaining a safe distance from
other vehicles.

13. What is the significance of Edge Detection in image processing?

Answer: Edge Detection identifies sharp changes in pixel intensity, allowing objects and boundaries
to be distinguished within an image.
Explanation: Techniques like the Sobel and Canny edge detectors help in applications such as object
recognition, image segmentation, and facial recognition.
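A toy illustration of the Sobel horizontal-gradient kernel on a tiny image with a dark-to-bright vertical boundary (pixel values chosen for demonstration):

```python
# Applying the Sobel x-kernel to a tiny grayscale image.
# A dark-to-bright vertical boundary produces strong gradient responses.
def convolve3x3(img, k):
    h, w = len(img), len(img[0])
    out = [[0] * (w - 2) for _ in range(h - 2)]
    for i in range(h - 2):
        for j in range(w - 2):
            out[i][j] = sum(img[i + m][j + n] * k[m][n]
                            for m in range(3) for n in range(3))
    return out

sobel_x = [[-1, 0, 1],
           [-2, 0, 2],
           [-1, 0, 1]]  # responds to vertical edges

image = [
    [0, 0, 255, 255],
    [0, 0, 255, 255],
    [0, 0, 255, 255],
    [0, 0, 255, 255],
]
gx = convolve3x3(image, sobel_x)
print(gx)  # [[1020, 1020], [1020, 1020]] — large values mark the boundary
```

A full Canny pipeline adds Gaussian smoothing, non-maximum suppression, and hysteresis thresholding on top of such gradients.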

14. Why is Convolution an essential operation in Computer Vision?


Answer: Convolution helps extract features such as edges, textures, and patterns from an image by
applying a filter (kernel) to it.
Explanation: By sliding a kernel across an image and computing weighted sums, convolution helps
in feature detection, which is critical for tasks like object recognition and facial detection.

15. How does Google Translate use Computer Vision for real-time translation?
Answer: Google Translate uses Optical Character Recognition (OCR) to extract text from images
and overlay translated text in the user’s preferred language.
Explanation: By using OCR, Google Translate recognizes letters and words in images, converting
them into digital text that can be translated and displayed in real-time.

16. What is the purpose of the Pooling Layer in a Convolutional Neural Network (CNN)?
Answer: The Pooling Layer reduces the spatial dimensions of the feature map while preserving
essential information.
Explanation: Pooling (such as Max Pooling or Average Pooling) helps in making CNNs more
efficient by reducing computation and preventing overfitting.

17. How does the Fully Connected Layer contribute to the performance of a CNN?
Answer: The Fully Connected Layer takes extracted features from previous layers and classifies
them into specific categories.
Explanation: It converts the feature map into a one-dimensional array and applies weights to
determine the final class of an object, such as recognizing whether an image contains a dog or a cat.

18. Why is grayscale conversion useful in Computer Vision applications?


Answer: Grayscale conversion reduces computational complexity by representing each pixel with a
single intensity value instead of three color values (RGB).
Explanation: Since many Computer Vision tasks focus on shape and texture rather than color,
grayscale images make processing faster while preserving essential features.

19. What is the role of the Kernel in Convolutional operations?


Answer: A Kernel is a small matrix used in convolution to detect patterns like edges, textures, and
corners in an image.
Explanation: Different Kernels perform different operations—such as sharpening, blurring, or
detecting vertical and horizontal edges—by modifying pixel values in an image.
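Two of the common kernels mentioned above, and their effect on the center pixel of a small patch (illustrative values — a bright "noise" spike surrounded by darker neighbors):

```python
# Box-blur averages the neighborhood; sharpen amplifies the center pixel.
blur = [[1/9] * 3 for _ in range(3)]             # average of 9 pixels
sharpen = [[0, -1, 0],
           [-1, 5, -1],
           [0, -1, 0]]                            # emphasizes the center

patch = [
    [10, 10, 10],
    [10, 100, 10],   # a bright noise spike in the middle
    [10, 10, 10],
]

center_blurred = sum(patch[m][n] * blur[m][n]
                     for m in range(3) for n in range(3))
center_sharpened = sum(patch[m][n] * sharpen[m][n]
                       for m in range(3) for n in range(3))
print(round(center_blurred))   # 20 — the spike is smoothed toward its neighbors
print(center_sharpened)        # 460 — the spike is amplified
```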

20. How does Facial Recognition work using Computer Vision?


Answer: Facial Recognition detects, extracts, and analyzes facial features to identify or verify
individuals.
Explanation: By mapping key facial points (such as the distance between eyes and nose shape),
Computer Vision compares these features with stored facial data to recognize a person.

21. What is Image Segmentation, and why is it important in Computer Vision?


Answer: Image Segmentation is the process of dividing an image into meaningful regions to identify
objects more precisely.
Explanation: It helps in tasks like medical imaging, object detection, and scene understanding by
isolating different objects in an image for further analysis.

22. How does a Convolutional Neural Network (CNN) differ from a traditional neural network?

Answer: A CNN is specifically designed for image processing and uses convolutional layers to
extract spatial features, whereas a traditional neural network processes data without spatial awareness.
Explanation: CNNs reduce computational complexity by preserving spatial structures through
convolution and pooling, making them effective for image recognition tasks.

23. What is meant by "Feature Extraction" in Computer Vision?


Answer: Feature Extraction is the process of identifying important patterns, edges, and textures in an
image to aid in recognition and classification.
Explanation: It reduces image complexity while retaining essential information, making it easier for
algorithms to analyze and interpret the image.

24. What is the difference between Max Pooling and Average Pooling in CNNs?
Answer: Max Pooling selects the highest pixel value from a region, while Average Pooling calculates
the average of pixel values in that region.
Explanation: Max Pooling retains the most prominent features, making it useful for object detection,
whereas Average Pooling smooths features, reducing sensitivity to noise.
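The difference is easy to see on a single 2x2 region (toy values):

```python
# Max vs Average pooling over the same 2x2 region.
region = [[1, 3],
          [2, 8]]

flat = [v for row in region for v in row]
max_pooled = max(flat)              # 8 — keeps the most prominent feature
avg_pooled = sum(flat) / len(flat)  # 3.5 — smooths the region
print(max_pooled, avg_pooled)
```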

25. Why is OpenCV widely used in Computer Vision applications?


Answer: OpenCV is an open-source library that provides tools for image processing, object
detection, and machine learning in Computer Vision.
Explanation: It offers efficient algorithms for image manipulation, making it useful for applications
such as facial recognition, motion detection, and augmented reality.

26. What is Optical Flow, and where is it used in Computer Vision?


Answer: Optical Flow is a technique used to track the movement of objects in consecutive video
frames.
Explanation: It is commonly used in video surveillance, motion analysis, and self-driving cars to
detect moving objects and predict their trajectories.

27. How does Google’s "Search by Image" feature utilize Computer Vision?
Answer: It compares an input image’s features to a vast database of images to find visually similar
results.
Explanation: Using feature matching techniques, it helps users find information about objects,
landmarks, or even products using an image instead of text.

28. What are some common challenges faced in Computer Vision?


Answer: Challenges include variations in lighting, occlusions, image noise, and differences in object
orientation.
Explanation: These factors can affect the accuracy of Computer Vision models, requiring robust
algorithms and extensive training data to overcome them.

29. What is the role of Data Augmentation in training Computer Vision models?
Answer: Data Augmentation artificially increases the size of a dataset by applying transformations
like rotation, flipping, and scaling to images.
Explanation: This helps improve model generalization and prevents overfitting by exposing the
model to varied input patterns.
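The basic transformations can be sketched on a tiny "image" represented as a 2D list (a minimal illustration; real pipelines use libraries that also handle rotation by arbitrary angles, scaling, and cropping):

```python
# Simple augmentations: horizontal flip, vertical flip, 90-degree rotation.
img = [[1, 2],
       [3, 4]]

def hflip(m):
    return [row[::-1] for row in m]

def vflip(m):
    return m[::-1]

def rotate90(m):
    # Transpose, then reverse each row -> clockwise rotation.
    return [list(row)[::-1] for row in zip(*m)]

augmented = [img, hflip(img), vflip(img), rotate90(img)]
print(augmented)  # four variants of the same image
```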

30. What is the purpose of the Activation Function in a CNN?


Answer: The Activation Function introduces non-linearity into the network, allowing it to learn
complex patterns.
Explanation: Functions like ReLU (Rectified Linear Unit) help improve CNN performance by
retaining essential features while filtering out irrelevant information.

QUESTIONS AND ANSWERS - 3 marks

1. Explain the concept of Computer Vision and how it mimics human vision.
Answer: Computer Vision is a field of Artificial Intelligence (AI) that enables machines to process,
analyze, and interpret visual data such as images and videos. It allows machines to "see" by using
algorithms to extract meaningful information from visual inputs.
Explanation: Similar to human vision, Computer Vision involves capturing images, processing them
to recognize patterns, and making decisions based on that information. It is widely used in
applications such as facial recognition, self-driving cars, and medical imaging.

2. How does Object Detection work in Computer Vision, and how is it different from Image
Classification?
Answer: Object Detection identifies and locates multiple objects within an image, whereas Image
Classification assigns a single label to an entire image without pinpointing object locations.
Explanation: Object Detection uses bounding boxes to specify object positions and is essential for
applications like surveillance and autonomous driving. Image Classification is useful when only one
object needs to be recognized in an image.

3. What are the different types of tasks performed in Computer Vision?


Answer: Computer Vision tasks include:
 Image Classification: Assigning a category label to an image.
 Object Detection: Identifying multiple objects and their locations.
 Instance Segmentation: Detecting objects and assigning a pixel-wise mask.
 Optical Character Recognition (OCR): Extracting text from images.
Explanation: Each task serves different purposes, from recognizing objects in an image to
understanding text from scanned documents.

4. Explain the role of Convolutional Neural Networks (CNNs) in Computer Vision.


Answer: CNNs are deep learning models designed to process and analyze images by automatically
learning spatial hierarchies of features.
Explanation: CNNs use convolutional layers to extract low- to high-level features, ReLU for non-
linearity, pooling layers for dimensionality reduction, and fully connected layers for classification.
They are widely used in facial recognition, medical imaging, and autonomous vehicles.

5. How do Pooling Layers improve the efficiency of CNNs?


Answer: Pooling Layers reduce the spatial dimensions of feature maps while preserving essential
information, making the model more computationally efficient.
Explanation:
 Max Pooling: Retains the highest pixel value in a region, emphasizing prominent features.
 Average Pooling: Computes the average pixel value in a region, smoothing out variations.
This helps prevent overfitting and speeds up computations.

6. Describe how Google Translate uses Computer Vision to translate text from images.
Answer: Google Translate uses Optical Character Recognition (OCR) to detect and extract text
from images, then applies language translation algorithms to provide real-time translation.
Explanation: OCR scans an image, recognizes characters, converts them into digital text, and then
translates them into the selected language. This feature is helpful for reading foreign signs, menus,
and documents.

7. What is Image Segmentation, and how is it different from Object Detection?


Answer: Image Segmentation divides an image into multiple meaningful regions, whereas Object
Detection identifies and localizes objects within an image.
Explanation:
 Object Detection provides bounding boxes around detected objects.

 Instance Segmentation assigns a unique pixel-wise mask to each object for precise
identification.
Segmentation is crucial in applications like medical imaging (tumor detection) and
autonomous driving (road segmentation).

8. Why is Edge Detection important in Computer Vision, and how does it work?
Answer: Edge Detection identifies sudden changes in pixel intensity, helping to define object
boundaries in an image.
Explanation:
 Techniques like Sobel and Canny edge detection highlight transitions between different
objects.
 It is used in applications such as object recognition, fingerprint matching, and face detection.

9. How does Computer Vision assist in self-driving cars?


Answer: Self-driving cars use Computer Vision to detect objects, recognize road signs, analyze
traffic signals, and track lanes for navigation.
Explanation:
 Cameras capture real-time visual data.
 Deep learning models process this data to identify vehicles, pedestrians, and obstacles.
 The car makes driving decisions based on the analysis, ensuring safety and accuracy.

10. What challenges are faced in Computer Vision applications, and how can they be
overcome?
Answer: Challenges include variations in lighting, occlusions, image noise, and differences in object
orientation.
Explanation:
 Lighting issues can be handled with adaptive thresholding techniques.
 Occlusions require deep learning models trained on diverse datasets.
 Noise reduction techniques like Gaussian filtering improve image quality.
 Data augmentation helps models generalize better for different orientations and perspectives.

11. What is meant by Feature Extraction in Computer Vision, and why is it important?
Answer: Feature Extraction is the process of identifying key attributes such as edges, corners,
textures, and patterns from an image to help in recognition and classification.
Explanation:
 Feature Extraction reduces image complexity while retaining essential information.
 It enables models to recognize objects efficiently by focusing on unique characteristics rather
than raw pixel data.
 It is used in applications like facial recognition, image classification, and object detection.

12. How does Optical Character Recognition (OCR) work in Computer Vision?
Answer: OCR is a technique that extracts text from images and converts it into a digital format for
further processing.
Explanation:
 OCR detects characters in an image by analyzing pixel patterns.
 It uses pre-trained models to recognize letters, numbers, and symbols.
 OCR is commonly used in applications like document scanning, license plate recognition, and
translation apps.

13. What is Instance Segmentation, and how is it useful in real-world applications?


Answer: Instance Segmentation detects objects in an image and assigns a pixel-wise mask to each
detected object, differentiating multiple objects of the same class.
Explanation:
 It is useful in medical imaging (e.g., segmenting tumors in scans).

 In autonomous driving, it helps in identifying pedestrians and vehicles separately.
 It improves accuracy over traditional object detection by providing precise object boundaries.

14. Explain the role of the Convolutional Layer in a CNN.


Answer: The Convolutional Layer is responsible for extracting features such as edges, textures, and
patterns from an image by applying filters (kernels).
Explanation:
 The layer applies a kernel to small regions of an image to detect spatial features.
 It captures patterns at different levels, from simple edges to complex shapes.
 Multiple convolutional layers allow deep networks to learn hierarchical representations of
images.

15. What are some challenges in Object Detection, and how can they be addressed?
Answer: Challenges in Object Detection include occlusion, varying object sizes, lighting conditions,
and background noise.
Explanation:
 Occlusion can be addressed using deep learning models trained on diverse datasets.
 Multi-scale detection techniques help recognize objects of different sizes.
 Image preprocessing techniques like contrast adjustment and normalization improve
detection in poor lighting.
 Advanced models like YOLO (You Only Look Once) improve real-time detection accuracy.

16. How does Google’s Search by Image feature utilize Computer Vision techniques?
Answer: Google’s Search by Image feature analyzes image features and compares them with a
database to find visually similar images.
Explanation:
 It extracts key features such as color, shape, and texture.
 Feature matching algorithms compare input images with indexed images in the database.
 Applications include finding product information, identifying landmarks, and verifying image
sources.

17. Why is the ReLU activation function used in CNNs, and how does it improve performance?
Answer: ReLU (Rectified Linear Unit) introduces non-linearity by converting negative values to zero
while keeping positive values unchanged.
Explanation:
 Without ReLU, CNNs would behave like linear models and fail to capture complex patterns.
 ReLU allows networks to learn deeper features by introducing non-linearity.
 It improves training speed by reducing the likelihood of vanishing gradients.

18. What is the significance of data augmentation in training Computer Vision models?
Answer: Data augmentation artificially expands the training dataset by applying transformations like
rotation, flipping, scaling, and cropping to existing images.
Explanation:
 It improves model generalization by exposing it to varied input patterns.
 It reduces overfitting by preventing the model from memorizing specific image features.
 It helps in training models with limited real-world datasets by creating diverse variations.

19. How does the Pooling Layer in CNNs help in feature extraction?
Answer: The Pooling Layer reduces the spatial dimensions of the feature map while retaining
essential information, improving computational efficiency.
Explanation:
 Max Pooling retains the highest pixel value in a region, focusing on prominent features.
 Average Pooling smooths the feature map by averaging values.
 Pooling helps CNNs extract robust features and reduces model complexity.

20. Explain how Computer Vision is used in self-driving cars for navigation.
Answer: Self-driving cars use Computer Vision to analyze their surroundings, detect objects, and
make real-time driving decisions.
Explanation:
 Cameras capture images of roads, traffic signs, and obstacles.
 Deep learning models process these images to identify lanes, pedestrians, and other vehicles.
 The car's system makes navigation decisions, such as stopping at red lights or avoiding
collisions.

21. How is Computer Vision used in medical imaging, and what are its benefits?
Answer: Computer Vision is used in medical imaging for tasks like detecting tumors, analyzing X-
rays, and converting 2D scans into 3D models.
Explanation:
 Early diagnosis: It helps detect diseases at an early stage, improving treatment outcomes.
 Automation: Reduces manual work for radiologists by automating image analysis.
 3D reconstruction: Converts 2D scans into detailed 3D models for better understanding of
complex structures.

22. How do self-driving cars use real-time Object Detection for safe navigation?
Answer: Self-driving cars use Computer Vision to detect pedestrians, vehicles, lane markings, and
traffic signals for safe navigation.
Explanation:
 Real-time image processing captures visual data from multiple cameras.
 Deep learning models (e.g., YOLO, SSD) identify and classify objects on the road.
 Decision-making algorithms use detected objects to take actions like braking, steering, or
accelerating.

23. Explain how facial recognition technology works and its security applications.
Answer: Facial recognition analyzes facial features and compares them with a database for
identification or authentication.
Explanation:
 Feature extraction: Identifies key facial landmarks such as eyes, nose, and mouth.
 Face encoding: Converts facial features into numerical representations.
 Security applications: Used in surveillance, biometric authentication, and access control
(e.g., phone unlocking, airport security).

24. What role does Computer Vision play in industrial automation and quality control?
Answer: Computer Vision automates inspection processes in industries to detect defects, measure
product dimensions, and ensure quality control.
Explanation:
 Defect detection: Identifies flaws in manufacturing (e.g., cracks, incorrect labels).
 Assembly verification: Ensures correct placement of parts in assembly lines.
 Efficiency improvement: Reduces human error and speeds up production.

25. How does Google Lens use Computer Vision for object recognition?
Answer: Google Lens analyzes images and uses deep learning models to recognize objects, text, and
landmarks.
Explanation:
 Image processing: Extracts features from an input image.
 Comparison with databases: Matches the image with a vast dataset to find relevant
information.
 Applications: Used for translating text, identifying plants/animals, and shopping by scanning
barcodes or products.

26. How does Computer Vision assist in sports analytics and athlete performance tracking?
Answer: Computer Vision helps analyze player movements, track ball trajectories, and provide real-
time game statistics.
Explanation:
 Player tracking: Detects and records player movements using motion capture.
 Performance analysis: Measures speed, agility, and accuracy for training improvement.
 Referee assistance: Used in goal-line technology and automated officiating (e.g., Hawk-Eye
in tennis).

27. How does AI-powered Optical Character Recognition (OCR) enhance document
digitization?
Answer: AI-powered OCR extracts text from scanned documents and converts them into editable and
searchable digital files.
Explanation:
 Pre-processing: Enhances contrast and removes noise from scanned images.
 Text recognition: Identifies letters, words, and paragraphs using deep learning models.
 Applications: Used in banking, legal documentation, and automatic invoice processing.

28. What are the advantages of using AI-powered face filters in social media applications?
Answer: AI-powered face filters detect facial features and apply augmented reality (AR) effects in
real time.
Explanation:
 Facial landmark detection: Identifies key points on the face (e.g., eyes, lips, nose).
 Filter application: Overlays digital effects like masks, makeup, or animations.
 User engagement: Enhances interaction on platforms like Instagram, Snapchat, and TikTok.

29. How does Computer Vision improve inventory management in retail stores?
Answer: Computer Vision automates inventory tracking by analyzing shelf images and detecting
stock levels.
Explanation:
 Shelf scanning: Uses security cameras or robots to monitor stock availability.
 Object recognition: Identifies missing or misplaced products.
 Data analytics: Helps retailers optimize stock replenishment and reduce losses.

30. Explain how Convolutional Neural Networks (CNNs) are used in satellite image analysis.
Answer: CNNs analyze satellite images to detect changes in landscapes, track deforestation, and
monitor urban development.
Explanation:
 Feature extraction: Identifies land types, water bodies, and buildings.
 Change detection: Compares images over time to track environmental changes.
 Disaster management: Helps assess damage after natural disasters like floods and
earthquakes.

QUESTIONS AND ANSWERS - 5 marks

1. Explain the concept of Computer Vision. What are its key applications?
Answer: Computer Vision is a field of Artificial Intelligence (AI) that enables machines to interpret
and analyze visual data such as images and videos. It allows computers to recognize objects, detect
patterns, and make decisions based on image inputs.
Key Applications:
 Facial Recognition: Used in security systems and biometric authentication.
 Self-Driving Cars: Helps vehicles detect obstacles, road signs, and lanes.

 Medical Imaging: Aids in diagnosing diseases using X-rays, MRIs, and CT scans.
 Retail and Inventory Management: Tracks stock levels using image analysis.
 Google Search by Image: Finds similar images by comparing visual features.
Explanation: Computer Vision mimics human vision by processing images through deep learning
models. It is widely applied in industries like healthcare, automotive, security, and retail to automate
processes and improve efficiency.

2. Describe the working of Convolutional Neural Networks (CNNs) in Computer Vision.


Answer: CNNs are deep learning models specifically designed for image processing. They extract
important features from images through multiple layers.
Working of CNN:
1. Convolutional Layer: Detects features such as edges and textures using filters (kernels).
2. Activation Function (ReLU): Removes negative values and introduces non-linearity.
3. Pooling Layer: Reduces the size of feature maps to retain essential information while
improving computational efficiency.
4. Fully Connected Layer: Flattens the extracted features and classifies the image.
5. Softmax/Output Layer: Assigns probabilities to different categories.
Explanation: CNNs help in tasks like facial recognition, medical diagnosis, and autonomous driving.
They can recognize patterns at different levels, from simple edges to complex objects, making them
highly effective in Computer Vision applications.

3. What is Object Detection, and how does it differ from Image Classification?
Answer: Object Detection is a Computer Vision technique used to identify multiple objects in an
image and determine their locations. It differs from Image Classification, which assigns a single label
to an entire image without identifying object locations.
Differences:
Feature    | Object Detection                                        | Image Classification
Output     | Identifies and localizes multiple objects               | Assigns a single category to an image
Use Case   | Autonomous vehicles, security systems                   | Facial recognition, medical diagnostics
Complexity | More complex due to bounding boxes and multiple objects | Simpler, as it only provides a category label
Explanation: Object Detection uses advanced models like YOLO (You Only Look Once) and SSD
(Single Shot MultiBox Detector) to detect objects in real-time. It is crucial for applications like self-
driving cars and surveillance.

4. What is Optical Character Recognition (OCR), and how does it work in Computer Vision?
Answer: OCR is a technology that enables machines to read printed or handwritten text from images
and convert it into digital format.
Working of OCR:
1. Pre-processing: Enhances image quality by adjusting brightness and removing noise.
2. Text Detection: Identifies areas containing text.
3. Character Recognition: Uses deep learning models to recognize individual characters.
4. Post-processing: Corrects errors and formats text for better readability.
Applications:
 Digitizing printed documents.
 Automatic license plate recognition.
 Translating text in images (e.g., Google Translate).
Explanation: OCR has revolutionized document processing, making it easier to extract text from
physical documents, scanned images, and even handwritten notes.

5. Explain the concept of Image Segmentation and its different types.

Answer: Image Segmentation is the process of dividing an image into meaningful regions to analyze
specific objects or areas.
Types of Image Segmentation:
1. Semantic Segmentation: Assigns a category to every pixel but does not differentiate between
different instances of the same object.
2. Instance Segmentation: Detects objects and assigns unique labels to different instances of
the same object.
3. Panoptic Segmentation: Combines both semantic and instance segmentation for a more
detailed understanding.
Applications:
 Medical imaging (e.g., tumor detection).
 Self-driving cars (road segmentation).
 Satellite imagery analysis.
Explanation: Image Segmentation is essential in applications that require precise object separation,
such as medical diagnosis and environmental monitoring.

6. How does Google’s Search by Image feature utilize Computer Vision?


Answer: Google’s Search by Image feature allows users to search for similar images by uploading a
picture instead of entering text.
Process:
1. Feature Extraction: Identifies key visual elements such as colors, shapes, and textures.
2. Feature Matching: Compares the extracted features with an extensive image database.
3. Ranking and Display: Displays visually similar images and related content.
Applications:
 Identifying objects, places, and landmarks.
 Finding product details from an image.
 Detecting fake images or plagiarism.
Explanation: This feature uses deep learning and Computer Vision to enhance user experience and
improve image-based search functionalities.
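The feature matching step can be illustrated with a toy version: represent each image by an intensity histogram and score candidates by histogram intersection. Google's actual system relies on learned deep embeddings; this pure-Python sketch only conveys the idea:

```python
# Toy sketch of "feature matching": represent each image by a grayscale
# intensity histogram and compare histograms with an intersection score.
# Production search systems use deep feature embeddings instead.

def histogram(image, bins=4):
    """Count pixels falling into equal-width intensity bins (0-255)."""
    counts = [0] * bins
    for row in image:
        for pixel in row:
            counts[min(pixel * bins // 256, bins - 1)] += 1
    return counts

def similarity(h1, h2):
    """Histogram intersection: 1.0 means identical distributions."""
    overlap = sum(min(a, b) for a, b in zip(h1, h2))
    return overlap / max(sum(h1), 1)

query = [[10, 20], [200, 210]]
match = [[12, 22], [198, 205]]
score = similarity(histogram(query), histogram(match))
print(score)  # 1.0: both images have the same bin distribution
```

Ranking then amounts to sorting the database by this score, highest first.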
7. What are the advantages and challenges of using Computer Vision in self-driving cars?
Answer:
Advantages:
• Accurate object detection: Recognizes pedestrians, vehicles, and road signs.
• Real-time decision-making: Helps navigate traffic and avoid collisions.
• Enhanced safety: Reduces human error and improves road safety.
Challenges:
• Adverse weather conditions: Fog, rain, and low light can affect accuracy.
• Processing speed: Requires high computational power for real-time processing.
• Unexpected obstacles: Difficulty in handling unpredictable events (e.g., animals crossing the road).
Explanation: Despite challenges, Computer Vision is critical for autonomous vehicles.
Improvements in AI and sensor technologies continue to enhance its reliability.
8. What is Edge Detection in Computer Vision, and why is it important?
Answer: Edge Detection identifies boundaries within an image by detecting sudden intensity changes
in pixel values.
Techniques:
• Sobel Operator: Detects edges in horizontal and vertical directions.
• Canny Edge Detection: Finds strong edges and reduces noise for precise results.
• Laplacian Operator: Highlights areas with rapid intensity change.
Importance:
• Used in medical imaging to highlight critical features.
• Essential for object recognition and image segmentation.
• Helps in facial recognition by detecting facial contours.
Explanation: Edge Detection simplifies image processing by reducing data complexity and
extracting essential features.
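The Sobel operator described above can be sketched in a few lines of plain Python; real pipelines (for example OpenCV's cv2.Sobel) add smoothing, border handling, and vectorized computation:

```python
# Pared-down Sobel edge detector on a tiny grayscale grid. Gradients in
# the x and y directions are combined into an edge magnitude.

SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]

def convolve_at(image, kernel, r, c):
    """Apply a 3x3 kernel centred on pixel (r, c)."""
    return sum(kernel[i][j] * image[r - 1 + i][c - 1 + j]
               for i in range(3) for j in range(3))

def sobel_magnitude(image):
    """Edge magnitude |Gx| + |Gy| for interior pixels (no padding)."""
    h, w = len(image), len(image[0])
    return [[abs(convolve_at(image, SOBEL_X, r, c)) +
             abs(convolve_at(image, SOBEL_Y, r, c))
             for c in range(1, w - 1)]
            for r in range(1, h - 1)]

# A vertical step edge: dark left half, bright right half.
img = [[0, 0, 255, 255]] * 4
edges = sobel_magnitude(img)
print(edges)  # interior pixels on the dark/bright boundary respond strongly
```

A flat region would produce zero magnitude everywhere; only the intensity jump triggers a response, which is exactly the "sudden intensity change" the answer describes.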
9. How does Pooling improve the performance of a Convolutional Neural Network (CNN)?
Answer: Pooling layers reduce the spatial dimensions of feature maps while preserving important
information.
Types of Pooling:
1. Max Pooling: Retains the highest pixel value in a region, preserving key features.
2. Average Pooling: Computes the average pixel value, smoothing variations.
Advantages:
• Reduces computational complexity.
• Prevents overfitting by summarizing essential information.
• Improves CNN performance by focusing on dominant features.
Explanation: Pooling ensures that CNNs extract meaningful features efficiently while reducing
memory usage and processing time.
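Max Pooling can be illustrated with a short pure-Python sketch, assuming even dimensions and non-overlapping 2x2 windows:

```python
# Minimal 2x2 max-pooling pass over a feature map: each spatial dimension
# is halved while the strongest activation in every window is kept.

def max_pool_2x2(feature_map):
    """Non-overlapping 2x2 max pooling (assumes even dimensions)."""
    pooled = []
    for r in range(0, len(feature_map), 2):
        row = []
        for c in range(0, len(feature_map[0]), 2):
            row.append(max(feature_map[r][c], feature_map[r][c + 1],
                           feature_map[r + 1][c], feature_map[r + 1][c + 1]))
        pooled.append(row)
    return pooled

fmap = [
    [1, 3, 2, 1],
    [4, 6, 5, 0],
    [7, 2, 9, 8],
    [1, 0, 3, 4],
]
print(max_pool_2x2(fmap))  # [[6, 5], [7, 9]]
```

The 4x4 map shrinks to 2x2, a 75% reduction in values to process, while the dominant activations survive; Average Pooling would instead replace `max` with the window mean.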
10. What are the challenges faced in Computer Vision, and how can they be overcome?
Answer:
Challenges:
• Variations in lighting: Poor lighting can affect image recognition.
• Occlusion: Objects may be partially hidden.
• Noise in images: Can lead to misinterpretation.
Solutions:
• Adaptive thresholding: Adjusts brightness dynamically.
• Deep learning models: Trained on diverse datasets to handle occlusions.
• Noise reduction techniques: Use Gaussian filtering for clearer images.
Explanation: Overcoming these challenges requires advanced AI models, improved data collection,
and efficient image preprocessing techniques.
11. Explain the working and significance of Feature Extraction in Computer Vision.
Answer: Feature Extraction is the process of identifying important attributes, such as edges, textures,
and patterns, from an image for further analysis.
Working of Feature Extraction:
1. Pre-processing: Image is resized, converted to grayscale, and noise is removed.
2. Feature Detection: Identifies edges, corners, and shapes using algorithms like Sobel, Canny,
or Harris corner detection.
3. Feature Selection: Chooses the most relevant features to reduce computational load.
4. Feature Representation: Converts features into numerical data for machine learning models.
Significance:
• Reduces image complexity while retaining essential information.
• Improves the accuracy of object detection and classification.
• Enhances model performance in tasks such as facial recognition and medical imaging.
Explanation: Feature Extraction enables computers to identify objects in images efficiently. It is
crucial for applications like autonomous vehicles, security surveillance, and industrial automation.
12. Describe the different layers of a Convolutional Neural Network (CNN) and their roles.
Answer: A CNN consists of multiple layers designed to process and analyze images efficiently.
Layers in a CNN:
1. Convolutional Layer: Extracts features such as edges and textures using filters (kernels).
2. Activation Layer (ReLU): Introduces non-linearity by converting negative values to zero.
3. Pooling Layer: Reduces the spatial dimensions of feature maps while preserving important
features (e.g., Max Pooling).
4. Fully Connected Layer: Flattens the extracted features and passes them to the classification
model.
5. Output Layer (Softmax): Assigns probabilities to different categories for classification.
Explanation: CNNs are widely used in image processing tasks such as facial recognition, self-driving cars, and medical image analysis due to their ability to extract hierarchical features automatically.
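How spatial dimensions change as data flows through these layers follows the standard formula out = (in - kernel + 2*padding) / stride + 1. The sketch below traces a hypothetical 32x32 input through two conv/pool stages:

```python
# Trace how the spatial size of a feature map shrinks through a small,
# hypothetical CNN, using the standard output-size formula:
#     out = (in - kernel + 2 * padding) // stride + 1

def output_size(in_size, kernel, stride=1, padding=0):
    return (in_size - kernel + 2 * padding) // stride + 1

size = 32                                        # e.g. a 32x32 input image
size = output_size(size, kernel=3, padding=1)    # conv 3x3, padding 1 -> 32
size = output_size(size, kernel=2, stride=2)     # max pool 2x2       -> 16
size = output_size(size, kernel=3, padding=1)    # conv 3x3, padding 1 -> 16
size = output_size(size, kernel=2, stride=2)     # max pool 2x2       -> 8
print(size)  # 8
```

The Fully Connected Layer then flattens this final 8x8 grid (times the channel count) into a single vector before classification.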
13. What is Image Resolution, and how does it impact Computer Vision applications?
Answer: Image Resolution refers to the number of pixels in an image, determining its clarity and
level of detail.
Impact on Computer Vision:
• Higher Resolution: Provides more detail but requires more processing power.
• Lower Resolution: Faster processing but may lead to loss of critical features.
• Scaling Techniques: Resizing images while maintaining key features is essential in applications like object detection and facial recognition.
Examples:
• High-resolution images are used in medical imaging for accurate diagnosis.
• Low-resolution images are sufficient for applications like barcode scanning.
Explanation: Proper selection of resolution helps balance accuracy and computational efficiency in
Computer Vision applications.
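Down-scaling can be sketched with nearest-neighbour sampling, the crudest resizing method; libraries such as Pillow offer higher-quality resampling (bilinear, Lanczos):

```python
# Down-scaling by nearest-neighbour sampling: trade detail for speed by
# keeping only every `factor`-th pixel in both directions.

def downscale(image, factor):
    """Keep every `factor`-th pixel row and column."""
    return [row[::factor] for row in image[::factor]]

hi_res = [
    [10, 10, 20, 20],
    [10, 10, 20, 20],
    [30, 30, 40, 40],
    [30, 30, 40, 40],
]
lo_res = downscale(hi_res, 2)
print(lo_res)  # [[10, 20], [30, 40]]
```

Here a factor of 2 drops 75% of the pixels, which is exactly the accuracy-vs-speed trade-off the answer describes: fine detail inside each 2x2 block is discarded.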
14. Explain the importance of Convolutional Filters (Kernels) in Computer Vision.
Answer: Convolutional Filters (Kernels) are small matrices that slide over an image to extract
specific features.
Types of Convolutional Filters:
1. Edge Detection Filters (Sobel, Prewitt, Canny): Identify edges and object boundaries.
2. Sharpening Filters: Enhance details in an image.
3. Blurring Filters: Reduce noise and smooth images.
4. Embossing Filters: Highlight specific patterns or textures.
Importance:
• Detects essential features like edges, textures, and shapes.
• Helps in object recognition and classification.
• Used in applications such as facial recognition, self-driving cars, and medical imaging.
Explanation: Convolutional Filters are essential for reducing image complexity while preserving key
features, enabling accurate object detection and image analysis.
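The sliding-kernel idea can be shown with a sharpening filter applied to the interior of a tiny grayscale grid (a minimal pure-Python sketch, without the border padding real libraries apply):

```python
# Apply a 3x3 sharpening kernel to the interior of a grayscale grid.
# The kernel boosts the centre pixel relative to its neighbours,
# exaggerating local contrast.

SHARPEN = [[ 0, -1,  0],
           [-1,  5, -1],
           [ 0, -1,  0]]

def apply_kernel(image, kernel):
    """Convolve interior pixels with a 3x3 kernel (no padding)."""
    h, w = len(image), len(image[0])
    return [[sum(kernel[i][j] * image[r - 1 + i][c - 1 + j]
                 for i in range(3) for j in range(3))
             for c in range(1, w - 1)]
            for r in range(1, h - 1)]

img = [
    [10, 10, 10],
    [10, 50, 10],
    [10, 10, 10],
]
print(apply_kernel(img, SHARPEN))  # [[210]]: the bright centre is amplified
```

Swapping `SHARPEN` for an averaging kernel (all weights 1/9) would instead blur the image, showing how the kernel values alone determine which feature a convolution extracts.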
15. What is the role of Data Augmentation in training Computer Vision models?
Answer: Data Augmentation is a technique used to artificially expand training datasets by applying
transformations to existing images.
Types of Data Augmentation:
1. Rotation: Rotates images to create variations in orientation.
2. Flipping: Applies horizontal or vertical flips to diversify data.
3. Scaling: Resizes images while preserving aspect ratio.
4. Brightness Adjustment: Modifies image brightness to simulate different lighting conditions.
Importance:
• Prevents overfitting by increasing dataset variability.
• Improves model generalization to handle real-world variations.
• Enhances performance in tasks like facial recognition, object detection, and autonomous driving.
Explanation: Data Augmentation helps improve the robustness of Computer Vision models, making
them more accurate and reliable in real-world scenarios.
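Two of the augmentations listed above, flipping and rotation, reduce to simple index manipulation on a pixel grid; frameworks such as torchvision provide these (plus randomized variants) ready-made:

```python
# Two basic augmentations implemented directly on a pixel grid:
# a horizontal flip and a 90-degree clockwise rotation.

def flip_horizontal(image):
    """Mirror each row left-to-right."""
    return [row[::-1] for row in image]

def rotate_90(image):
    """Rotate clockwise: transpose, then reverse each row."""
    return [list(row[::-1]) for row in zip(*image)]

img = [[1, 2],
       [3, 4]]
print(flip_horizontal(img))  # [[2, 1], [4, 3]]
print(rotate_90(img))        # [[3, 1], [4, 2]]
```

Each transform yields a new training sample with the same label, which is how augmentation multiplies dataset size without collecting new images.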
16. What is Edge Detection in Computer Vision, and what are its real-world applications?
Answer: Edge Detection is a technique used to identify significant changes in pixel intensity,
highlighting object boundaries.
Edge Detection Algorithms:
1. Sobel Operator: Detects edges in horizontal and vertical directions.
2. Canny Edge Detector: Uses multi-stage filtering for precise edge detection.
3. Laplacian of Gaussian (LoG): Detects edges by identifying zero-crossing points.
Real-World Applications:
• Medical Imaging: Identifies structures in X-rays and MRIs.
• Autonomous Vehicles: Detects lanes and road boundaries.
• Facial Recognition: Identifies key facial features.
Explanation: Edge Detection is a fundamental technique in Computer Vision, improving the
accuracy of object recognition and scene understanding.
17. How does Object Tracking work in Computer Vision, and where is it used?
Answer: Object Tracking involves following an object’s movement across multiple frames in a
video.
Techniques:
1. Correlation Filters: Track objects by comparing image patches.
2. Optical Flow: Detects object motion based on pixel movement.
3. Deep Learning-based Tracking: Uses neural networks to improve tracking accuracy.
Applications:
• Surveillance Systems: Tracks individuals in security footage.
• Sports Analytics: Analyzes player movements and tracks the ball.
• Augmented Reality (AR): Follows objects for interactive applications.
Explanation: Object Tracking plays a crucial role in real-time applications, enabling automation and
intelligent decision-making.
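A toy nearest-centroid tracker conveys the core idea: match each detection in the new frame to the closest object tracked in the previous frame. This is a hypothetical minimal sketch; real trackers such as SORT add motion models and handle objects entering or leaving the scene:

```python
# Toy nearest-centroid tracker: each detection in a new frame is matched
# to the closest tracked object from the previous frame.

def track(previous, detections):
    """Map each track id to the nearest new detection (x, y)."""
    updated = {}
    for track_id, (px, py) in previous.items():
        nearest = min(detections,
                      key=lambda d: (d[0] - px) ** 2 + (d[1] - py) ** 2)
        updated[track_id] = nearest
    return updated

frame1 = {"car": (10, 10), "person": (80, 40)}
frame2_detections = [(12, 11), (78, 43)]
print(track(frame1, frame2_detections))
# {'car': (12, 11), 'person': (78, 43)}
```

Because identities persist across frames, the same object keeps its label even as its coordinates change, which is what distinguishes tracking from per-frame detection.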
18. Explain the role of Computer Vision in Retail and Inventory Management.
Answer: Computer Vision helps retailers monitor inventory, track customer behavior, and improve
store management.
Applications:
1. Stock Level Monitoring: Uses cameras to track shelf stock in real-time.
2. Customer Behavior Analysis: Detects foot traffic patterns and popular products.
3. Self-Checkout Systems: Recognizes products and automates billing.
Benefits:
• Reduces manual labor and improves efficiency.
• Minimizes errors in stock management.
• Enhances customer experience with personalized recommendations.
Explanation: Retailers leverage Computer Vision to optimize operations, reduce losses, and improve
customer engagement.
19. What are the challenges in implementing Computer Vision, and how can they be addressed?
Answer: Challenges include environmental factors, data quality issues, and computational
complexity.
Challenges and Solutions:
1. Lighting Variations: Adaptive thresholding and exposure adjustments help improve
visibility.
2. Occlusion and Object Overlaps: Deep learning models trained on diverse datasets improve
recognition.
3. Processing Speed: Hardware accelerators like GPUs and TPUs speed up computations.
4. Data Privacy Concerns: Secure data handling and encryption ensure privacy compliance.
Explanation: Overcoming these challenges requires continuous advancements in AI models,
improved hardware, and better training datasets.
20. Explain the use of Computer Vision in Satellite Image Analysis.
Answer: Computer Vision helps analyze satellite images for environmental monitoring, urban
planning, and disaster management.
Applications:
1. Deforestation Tracking: Detects illegal logging and forest degradation.
2. Agriculture Monitoring: Assesses crop health and soil conditions.
3. Disaster Management: Identifies flood-prone areas and damage assessment after natural
disasters.
Benefits:
• Enables large-scale environmental analysis.
• Provides real-time insights for decision-making.
• Helps governments and organizations plan infrastructure and resource allocation.
Explanation: Satellite Image Analysis powered by Computer Vision plays a vital role in global