Computer Vision Questions

The document contains multiple-choice questions, fill-in-the-blank questions, one-word answers, and open-ended questions related to the field of Computer Vision. It covers various topics such as image classification, object detection, and the use of deep learning techniques like Convolutional Neural Networks. Additionally, it discusses the importance of visual data processing in applications like self-driving cars and facial recognition.


Multiple-Choice Questions (MCQs)

1.​ Which domain of Artificial Intelligence enables machines to "see" and
interpret images or visual data?
o​ a) Natural Language Processing
o​ b) Computer Vision
o​ c) Machine Learning
o​ d) Robotics​
Answer: b) Computer Vision
2.​ What is the task of assigning an input image one label from a fixed set of
categories called?
o​ a) Object Detection
o​ b) Classification
o​ c) Segmentation
o​ d) Localization​
Answer: b) Classification
3.​ Which of the following applications uses Computer Vision for identifying
and analyzing objects in an image or video?
o​ a) Facial Recognition
o​ b) Medical Imaging
o​ c) Object Detection
o​ d) All of the above​
Answer: d) All of the above
4.​ What term is used to describe the smallest unit of information that makes
up a picture?
o​ a) Pixel
o​ b) Grid
o​ c) Color value
o​ d) Resolution​
Answer: a) Pixel
5.​ In which year was the concept of Computer Vision first introduced?
o​ a) 1980s
o​ b) 1990s
o​ c) 1970s
o​ d) 2000s​
Answer: c) 1970s
6.​ Which of the following tasks involves both identifying the object and its
spatial position within the image, but is used for single objects only?
o​ a) Object Detection
o​ b) Instance Segmentation
o​ c) Classification + Localization
o​ d) Semantic Segmentation​
Answer: c) Classification + Localization
7.​ Which deep learning technique is commonly used in Computer Vision
tasks such as Object Detection and Image Classification?
o​ a) Recurrent Neural Networks (RNN)
o​ b) Convolutional Neural Networks (CNN)
o​ c) Decision Trees
o​ d) Support Vector Machines (SVM)​
Answer: b) Convolutional Neural Networks (CNN)
8.​ In Computer Vision, which of the following algorithms would be most
suitable for separating objects in a crowded scene and labeling them
distinctly, even if they belong to the same category?
o​ a) Image Classification
o​ b) Instance Segmentation
o​ c) Edge Detection
o​ d) Object Detection​
Answer: b) Instance Segmentation
9.​ In the context of pixel values, a grayscale image has pixel values between:
o​ a) 0 to 100
o​ b) 0 to 255
o​ c) -128 to 128
o​ d) 0 to 1024​
Answer: b) 0 to 255
10.​Which task in Computer Vision involves detecting and recognizing
instances of real-world objects like faces, bicycles, or buildings in images
or videos?
o​ a) Semantic Segmentation
o​ b) Object Detection
o​ c) Image Classification
o​ d) Image Enhancement​
Answer: b) Object Detection
11.​Which of the following technologies has traditionally not been a core part
of Computer Vision, but is now frequently integrated with it in applications
such as Google Translate's camera mode?
o​ a) Optical Character Recognition (OCR)
o​ b) Image Segmentation
o​ c) Augmented Reality (AR)
o​ d) Facial Recognition​
Answer: c) Augmented Reality (AR)
12.​Which of the following is the most significant challenge when applying
deep learning-based object detection models to real-time video analysis in
dynamic environments?
o​ a) Dealing with variations in lighting conditions
o​ b) Real-time processing and inference speed
o​ c) Image resolution and pixel density
o​ d) Lack of labeled data for training​
Answer: b) Real-time processing and inference speed
13.​Which technique is most commonly used to prevent overfitting in deep
neural networks trained for Computer Vision tasks?
o​ a) Batch Normalization
o​ b) Data Augmentation
o​ c) Learning Rate Annealing
o​ d) Skip Connections​
Answer: b) Data Augmentation
14.​What is the primary disadvantage of using traditional methods such as
edge detection and thresholding compared to modern deep learning
approaches in Computer Vision?
o​ a) They are computationally expensive and slow
o​ b) They rely heavily on handcrafted features and are not adaptable
o​ c) They perform poorly on large-scale datasets
o​ d) They require less data for training​
Answer: b) They rely heavily on handcrafted features and are not
adaptable
15.​In a Convolutional Neural Network (CNN), which layer is primarily
responsible for learning the spatial hierarchies of features in an image?
o​ a) Fully Connected Layer
o​ b) Convolutional Layer
o​ c) Pooling Layer
o​ d) Dropout Layer​
Answer: b) Convolutional Layer
16.​In the context of instance segmentation, the Mask R-CNN algorithm extends
Faster R-CNN with a mask-prediction branch and improves feature alignment
by introducing a ________ in place of ROI Pooling.
o​ a) Semantic Decoder
o​ b) ROI Align Layer
o​ c) Fully Connected Layer
o​ d) Bounding Box Regression​
Answer: b) ROI Align Layer
17.​Which of the following loss functions is most commonly used for training a
model for image segmentation tasks, particularly in the case of imbalanced
datasets (e.g., background vs object pixels)?
o​ a) Cross-Entropy Loss
o​ b) Mean Squared Error
o​ c) Dice Loss
o​ d) Hinge Loss​
Answer: c) Dice Loss (a short Dice loss sketch follows this section)
18.​Which of the following is an example of Computer Vision?
o​ a) Playing music on a speaker
o​ b) Detecting faces in a photo
o​ c) Writing a document in Word
o​ d) Sending an email​
Answer: b) Detecting faces in a photo
19.​What is the smallest unit of an image called?
o​ a) Pixel
o​ b) Byte
o​ c) Grid
o​ d) Color​
Answer: a) Pixel
20.​Which primary colors make up an RGB image?
o​ a) Red, Yellow, Green
o​ b) Red, Green, Blue
o​ c) Blue, Yellow, Orange
o​ d) Black, White, Gray​
Answer: b) Red, Green, Blue
21.​Which technology allows self-driving cars to identify objects on the road?
o​ a) Computer Vision
o​ b) Virtual Reality
o​ c) Augmented Reality
o​ d) Cloud Computing​
Answer: a) Computer Vision
22.​What is the process of assigning an image to a specific category called?
o​ a) Image Editing
o​ b) Image Classification
o​ c) Image Resolution
o​ d) Image Resizing​
Answer: b) Image Classification
23.​What does a pixel represent in an image?
a) A file name
b) A single point of color or brightness
c) The size of the image
d) The shape of the image​
Answer: b) A single point of color or brightness
24.​ Which of these tasks is NOT part of Computer Vision?
a) Image Classification
b) Object Detection
c) Video Compression
d) Instance Segmentation​
Answer: c) Video Compression
25.​What is the main difference between Classification and Object Detection?
a) Classification labels objects, while Object Detection locates them
b) Classification works with videos, and Object Detection works with images
c) Classification is easier than Object Detection
d) There is no difference​
Answer: a) Classification labels objects, while Object Detection locates them
26.​ Which of these is NOT a primary component of an RGB image?
a) Red
b) Green
c) Blue
d) Yellow​
Answer: d) Yellow
27.​What do self-driving cars use Computer Vision for?
a) Reading traffic signs
b) Recognizing pedestrians
c) Navigating roads
d) All of the above​
Answer: d) All of the above
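For the Dice loss named in question 17, a short sketch makes the idea concrete. This is an illustrative NumPy version for binary masks, assuming `pred` and `target` are float arrays of identical shape with values in [0, 1]; the function name and smoothing constant are our own choices, not part of the original questions.

```python
import numpy as np

def dice_loss(pred, target, smooth=1.0):
    # Dice loss = 1 - Dice coefficient; smooth avoids division by zero
    intersection = np.sum(pred * target)
    dice = (2.0 * intersection + smooth) / (np.sum(pred) + np.sum(target) + smooth)
    return 1.0 - dice  # 0 = perfect overlap, 1 = no overlap

# Toy example: two 4x4 masks that overlap on half of the predicted pixels
pred = np.array([[1, 1, 0, 0]] * 4, dtype=float)
target = np.array([[1, 0, 0, 0]] * 4, dtype=float)
print(dice_loss(pred, target))  # about 0.31
```

Because the loss measures overlap between the predicted and true masks rather than scoring every pixel independently, it stays informative even when foreground pixels are heavily outnumbered by background pixels.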

Fill in the Blanks


1.​ Computer Vision allows machines to _________ images or visual data.
o​ Answer: see
2.​ In Computer Vision, the task of identifying both the object and its location
in an image is called ________.
o​ Answer: Classification + Localization
3.​ The technology behind self-driving cars relies heavily on ________ to
identify objects and navigate.
o​ Answer: Computer Vision
4.​ A ________ image consists of various shades of gray and has pixel values
ranging from 0 (black) to 255 (white).
o​ Answer: Grayscale
5.​ Images in the digital form are made up of small units called ________.
o​ Answer: Pixels
6.​ The process of detecting instances of objects in an image and assigning a
unique label to each pixel based on the object it belongs to is called
_________.
o​ Answer: Instance Segmentation
7.​ In the context of object detection, ________ involves locating objects and
classifying them, but without distinguishing pixel-level boundaries.
o​ Answer: Object Detection
8.​ The primary model architecture used for tasks like facial recognition, image
classification, and object detection is called ________.
o​ Answer: Convolutional Neural Network (CNN)
9.​ In a grayscale image, the pixel value of 0 represents ________, and the
pixel value of 255 represents ________.
o​ Answer: black, white
10.​The technique of using machine learning algorithms to classify an image
based on a predefined set of labels or categories is called ________.
o​ Answer: Image Classification
11.​When the resolution of an image is high, the number of ________
increases, making the image clearer and more detailed.
o​ Answer: Pixels
12.​When using CNNs for image classification, the layer that reduces the
spatial dimensions of the feature map, thus reducing the computational
load, is called the ________.
o​ Answer: Pooling Layer
13.​The challenge of distinguishing between visually similar objects or
detecting small objects in images is referred to as the ________ problem in
object detection.
o​ Answer: Scale Variance
14.​In the context of deep learning, the term ________ is used to describe the
concept of adding noise or random changes to training data to artificially
expand the dataset and improve model generalization.
o​ Answer: Data Augmentation
15.​The task of creating an exact pixel-wise mask for each object in an image,
rather than just bounding boxes, is known as ________.
o​ Answer: Instance Segmentation
16.​The process of identifying and locating objects in an image is called __________.​
Answer: Object Detection
17.​A grayscale image uses pixel values ranging from ________ to ________.​
Answer: 0 to 255
18.​The technique that enables machines to “see” and analyze visual data is known
as __________.​
Answer: Computer Vision
19.​The resolution of an image is defined by its ________ and ________.​
Answer: Width, Height
20.​__________ images are made by combining different intensities of red, green,
and blue colors.​
Answer: RGB
21.​Computer Vision allows machines to process and analyze __________ data.​
Answer: Visual
22.​__________ is a common example of Computer Vision in smartphones, used to
unlock the device.​
Answer: Facial Recognition
23.​The most common pixel format uses an 8-bit integer, giving values from
__________ to __________.​
Answer: 0, 255
24.​A grayscale image has only one plane of __________, while an RGB image has
three.​
Answer: Pixels
25.​The process of detecting multiple objects in an image is called __________.​
Answer: Object Detection

One-Word Answers
1.​ What is the most common pixel format used in images?
o​ Answer: Byte
2.​ What color model is used in RGB images?
o​ Answer: Red, Green, Blue
3.​ What is the process of dividing an image into smaller regions or segments
called?
o​ Answer: Segmentation
4.​ What type of object recognition is used in systems like Google Search by
Image?
o​ Answer: Object Detection
5.​ What is used to define the clarity or detail of an image based on the
number of pixels?
o​ Answer: Resolution
6.​ Which task is focused on pixel-level classification and categorizes each
pixel in an image into predefined classes?
o​ Answer: Semantic Segmentation
7.​ What is the primary challenge in real-time facial recognition systems in
terms of lighting conditions and facial occlusions?
o​ Answer: Robustness
8.​ In the context of a self-driving car, what is the term for detecting and
identifying objects on the road, such as pedestrians, other cars, or traffic
signs?
o​ Answer: Object Detection
9.​ In an RGB image, the color value of a pixel is represented using how many
channels?
o​ Answer: Three
10.​Which concept in image processing involves enhancing the clarity or detail
of an image, typically used before performing higher-level analysis?
o​ Answer: Image Enhancement
11.​The primary challenge for object detection systems in crowded or complex
environments is to accurately differentiate between ________ objects.
o​ Answer: overlapping
12.​Which architecture in deep learning is specifically designed to handle
sequential image data, such as video or time-series images?
o​ Answer: 3D Convolutional Neural Network (3D CNN)
13.​In a Convolutional Neural Network, the operation that downsamples feature
maps, reducing their spatial dimensions while retaining the most important
features, is called ________.
o​ Answer: Pooling
14.​The method of training a deep network in which the weights are updated by
propagating the gradient of the loss backward through the layers is referred
to as ________.
o​ Answer: Backpropagation
15.​What is the name of the smallest element in an image?​
Answer: Pixel
16.​What kind of images have only shades of gray and no color?​
Answer: Grayscale
17.​What is the process of teaching a computer to recognize patterns in
images?​
Answer: Training
18.​Which popular technology is used to unlock phones using your face?​
Answer: Facial Recognition
19.​What term describes creating a 3D model from 2D images in medical
imaging?​
Answer: Reconstruction
20.​What is the term for a collection of pixels in an image?​
Answer: Resolution
21.​ What is the full form of RGB in images?​
Answer: Red, Green, Blue
22.​ What kind of image uses only shades of gray?​
Answer: Grayscale
23.​ What does the Google Translate app use to recognize and translate text from
images?​
Answer: Optical Character Recognition (OCR)
24.​ What is the process of marking an object and its location in an image called?​
Answer: Localization
25.​An image with higher resolution contains fewer pixels.​
Answer: False
26.​Computer Vision is a subfield of Artificial Intelligence.​
Answer: True
27.​In a grayscale image, pixel values can range from 0 to 255.​
Answer: True
28.​RGB images cannot represent shades of gray.​
Answer: False
29.​Instance Segmentation identifies each pixel that belongs to a specific object.​
Answer: True

Open-Ended Questions
1.​ What are some real-life examples of Computer Vision applications?​
Answer: Facial recognition, self-driving cars, Google Translate’s camera feature,
and face filters on social media apps.
2.​ Why is it important for machines to understand visual data?​
Answer: It helps in making decisions automatically, like identifying road signs for
self-driving cars or detecting diseases in medical imaging.
3.​ How do RGB images differ from grayscale images?​
Answer: RGB images are made up of three color components—Red, Green, and
Blue—while grayscale images have only shades of gray.
4.​ What do you understand by the term ‘Resolution’ of an image?​
Answer: Resolution refers to the number of pixels in an image, usually measured
as width × height.
5.​ Can you name one example where object detection is useful in everyday
life?​
Answer: Security cameras use object detection to recognize people and detect
movements.
6.​ What are some common applications of Computer Vision?​
Answer: Facial recognition, self-driving cars, medical imaging, Google Translate,
and face filters on social media apps.
7.​ Why is Computer Vision important for industries like retail?​
Answer: It helps with customer behavior tracking, inventory management, and
optimizing store layouts.
8.​ How does an RGB image store color information?​
Answer: By combining different intensities of Red, Green, and Blue for each
pixel.
9.​ Explain how lighting conditions affect Computer Vision.​
Answer: Poor lighting can make it harder for algorithms to detect or identify
objects accurately.
10.​What is the main difference between Object Detection and Instance
Segmentation?​
Answer: Object Detection finds and labels objects, while Instance Segmentation
also labels each pixel of the object.
11.​Explain the basics of an image and how computers interpret it.​
Answer: An image is made up of pixels, the smallest units of information. Each
pixel has a value representing brightness or color. Computers store these values
as numbers in a grid, with higher resolutions having more pixels. RGB images
store three values per pixel for Red, Green, and Blue, while grayscale images
store a single value between 0 and 255.
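As an illustration of the answer above, the following minimal NumPy sketch builds a tiny grayscale image and a tiny RGB image directly as arrays; the sizes and pixel values are invented purely for demonstration.

```python
import numpy as np

# A 2x2 grayscale image: one 8-bit value per pixel, 0 = black, 255 = white
gray = np.array([[0, 64],
                 [128, 255]], dtype=np.uint8)
print(gray.shape)   # (2, 2) -> a single plane of pixel values

# A 2x2 RGB image: three 8-bit values (Red, Green, Blue) per pixel
rgb = np.zeros((2, 2, 3), dtype=np.uint8)
rgb[0, 0] = [255, 0, 0]       # top-left pixel is pure red
rgb[1, 1] = [255, 255, 255]   # bottom-right pixel is white
print(rgb.shape)    # (2, 2, 3) -> three values stored per pixel
```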
12.​Discuss how Computer Vision is used in self-driving cars.​
Answer: Self-driving cars use Computer Vision to identify objects like
pedestrians, vehicles, and traffic signs. They also monitor the environment and
analyze navigational routes. This helps in decision-making, like when to stop or
turn.
13.​What is the role of pixels in forming an image?​
Answer: Pixels are the smallest units of an image. They store color or brightness
information and collectively form the complete picture.
14.​How is resolution related to the quality of an image?​
Answer: Higher resolution means more pixels in the image, leading to greater
detail and sharper quality.
15.​What are the three primary tasks of Computer Vision?​
Answer: Image Classification, Object Detection, and Instance Segmentation.
16.​Why is grayscale commonly used in image processing?​
Answer: Grayscale simplifies computations as it uses only one channel, reducing
complexity compared to RGB images.
17.​How does Object Detection differ from Classification?​
Answer: Classification assigns a label to an image, while Object Detection
identifies and locates multiple objects within an image.
18.​What does Computer Vision contribute to medical imaging?​
Answer: It helps analyze 2D scan images, creates interactive 3D models, and
assists doctors in understanding patients’ health conditions.
19.​What is the purpose of Google’s “Search by Image” feature?​
Answer: It compares features of an input image with a database to find similar
images and provide search results.
20.​How do RGB and grayscale images differ in data storage?​
Answer: RGB images store three values per pixel (Red, Green, Blue), while
grayscale images store a single brightness value per pixel.
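A hedged sketch of that storage difference, assuming the Pillow library is available and using `photo.jpg` as a placeholder file name:

```python
from PIL import Image
import numpy as np

img = Image.open("photo.jpg")          # placeholder path
rgb = np.asarray(img.convert("RGB"))   # three values per pixel
gray = np.asarray(img.convert("L"))    # one value per pixel

print(rgb.shape, rgb.nbytes)    # (H, W, 3) -> 3 bytes per pixel
print(gray.shape, gray.nbytes)  # (H, W)    -> 1 byte per pixel
```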
21.​Explain how Computer Vision enables machines to interpret and analyze
visual data.​
Answer: Computer Vision uses algorithms to process images or videos and
extract meaningful information. Tasks include detecting objects, classifying them,
and identifying their locations. It relies on techniques like image preprocessing,
machine learning, and neural networks to enable applications such as facial
recognition, object detection, and scene understanding.
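One concrete example of such a pipeline, sketched here with OpenCV's bundled Haar-cascade face detector (the cascade file ships with OpenCV; `group.jpg` is a placeholder image path):

```python
import cv2

# Load the pretrained frontal-face Haar cascade shipped with OpenCV
cascade_path = cv2.data.haarcascades + "haarcascade_frontalface_default.xml"
detector = cv2.CascadeClassifier(cascade_path)

img = cv2.imread("group.jpg")                  # placeholder image path
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)   # preprocessing: one channel

# Detect faces and draw a bounding box around each one
faces = detector.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
for (x, y, w, h) in faces:
    cv2.rectangle(img, (x, y), (x + w, y + h), (0, 255, 0), 2)

cv2.imwrite("group_detected.jpg", img)
```

Modern systems usually replace the Haar cascade with a CNN-based detector, but the preprocess, detect, and localize structure stays the same.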
22.​Discuss the applications of Computer Vision in daily life with examples.​
Answer: Computer Vision is widely used in various fields:
Facial Recognition: For unlocking phones and security systems.
Retail: Tracking customer movements and managing inventory.
Medical Imaging: Assisting in creating 3D models from 2D scans for diagnosis.
Self-Driving Cars: Recognizing traffic signs, pedestrians, and navigating routes.
Social Media: Applying filters on platforms like Instagram and Snapchat.
28.​How do lighting conditions impact the accuracy of Computer Vision
applications?​
Answer: Lighting significantly affects how images are captured and analyzed.
Poor lighting can cause blurriness or obscure details, making it harder for
algorithms to detect objects. Bright lighting might cause reflections or glare,
leading to incorrect interpretations. This is why robust algorithms often include
image enhancement techniques to adjust for varying lighting conditions.
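As a sketch of such an enhancement step, histogram equalization in OpenCV spreads pixel intensities over the full 0-255 range so that detail in poorly lit regions becomes easier to analyze; the parameters and the `dark.jpg` file name are illustrative.

```python
import cv2

gray = cv2.imread("dark.jpg", cv2.IMREAD_GRAYSCALE)  # placeholder path

# Global histogram equalization: stretches intensities across 0-255
equalized = cv2.equalizeHist(gray)

# CLAHE: a contrast-limited, tile-based variant that copes better with
# uneven lighting than a single global adjustment
clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
adaptive = clahe.apply(gray)

cv2.imwrite("equalized.jpg", equalized)
cv2.imwrite("clahe.jpg", adaptive)
```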
29.​Describe the role of pixels and resolution in image quality.​
Answer: Pixels are the building blocks of an image, each representing a tiny
portion of the picture. Resolution, defined as the number of pixels in width ×
height, determines the image's detail. Higher resolution means more pixels,
resulting in better clarity and detail. For example, a 1280 × 1024 resolution
contains 1.31 million pixels, offering more detail than a lower-resolution image.
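The pixel count in this answer follows directly from the width × height definition; a small worked calculation (the 8-bit grayscale and 24-bit RGB storage assumptions are ours, for illustration only):

```python
width, height = 1280, 1024
pixels = width * height
print(pixels)                    # 1310720, i.e. about 1.31 million pixels

# Raw, uncompressed storage: 1 byte per pixel (grayscale) vs 3 bytes (RGB)
print(pixels * 1 / 1024 / 1024)  # 1.25 MiB for 8-bit grayscale
print(pixels * 3 / 1024 / 1024)  # 3.75 MiB for 24-bit RGB
```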
30.​Why is Computer Vision considered a critical technology for autonomous
vehicles?​
Answer: Autonomous vehicles rely on Computer Vision to perceive their
surroundings. Cameras capture images of the environment, and algorithms
detect objects like cars, pedestrians, and traffic lights. This data helps the vehicle
make real-time decisions, such as when to stop, accelerate, or turn. Without
Computer Vision, self-driving technology would not be able to operate safely and
effectively.
31.​How does the Google Translate app use Computer Vision to translate text?​
Answer: The Google Translate app uses Computer Vision with Optical Character
Recognition (OCR) to detect text in images. The app captures an image of text,
identifies characters, and translates them into the desired language. Augmented
Reality overlays the translated text onto the original image, making it appear in
real-time.
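A rough sketch of the OCR half of this pipeline, assuming the Tesseract engine and the pytesseract wrapper are installed (the AR overlay and the translation call are not shown, and `sign.jpg` is a placeholder):

```python
from PIL import Image
import pytesseract  # requires the Tesseract OCR engine on the system

image = Image.open("sign.jpg")             # placeholder photo of printed text
text = pytesseract.image_to_string(image)  # detected characters as a string
print(text)  # this text would then be passed to a translation service
```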
32.​What is Instance Segmentation, and how is it used in Computer Vision
applications?​
Answer: Instance Segmentation identifies each object in an image, labels it, and
assigns a unique label to every pixel of that object. It is commonly used in fields
like autonomous vehicles (to detect and distinguish multiple objects like cars and
pedestrians) and medical imaging (to identify and segment different tissues or
organs in scans).
33.​Explain the importance of pixel values in an image and their range in
grayscale images.​
Answer: Pixel values represent the brightness or intensity of an image. In
grayscale images, pixel values range from 0 (black) to 255 (white). Intermediate
values indicate varying shades of gray. These values are crucial for analyzing the
image, as they form the basis for tasks like filtering, edge detection, and
segmentation.
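A tiny NumPy sketch of how these raw values feed a basic analysis step such as global thresholding (the array contents are invented):

```python
import numpy as np

gray = np.array([[ 12,  80, 200],
                 [  0, 130, 255],
                 [ 40, 180,  90]], dtype=np.uint8)

# Simple global threshold: pixels brighter than 127 become foreground
mask = gray > 127
print(mask.astype(np.uint8))
# [[0 0 1]
#  [0 1 1]
#  [0 1 0]]
```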
34.​What is Computer Vision, and how does it differ from traditional image
processing?​
Answer: Computer Vision is a field of Artificial Intelligence that enables machines
to interpret and analyze visual data. Unlike traditional image processing, which
focuses on transforming images (e.g., filtering, resizing), Computer Vision aims to
understand and extract meaningful information from images.
35.​Why are RGB images composed of three channels?​
Answer: RGB images use three channels—Red, Green, and Blue—because
these primary colors can be combined in varying intensities to produce a wide
spectrum of colors.
36.​What is the significance of the value range 0–255 in images?​
Answer: The range 0–255 represents the intensity of a pixel in 8-bit grayscale or
RGB images. 0 indicates black, 255 indicates white or maximum intensity, and
intermediate values represent varying shades or colors.
37.​How does facial recognition work in Computer Vision?​
Answer: Facial recognition uses algorithms to detect and analyze facial features
such as the eyes, nose, and mouth. It matches these features with a stored
database to identify individuals.
38.​What is the primary advantage of using grayscale images over RGB images
in analysis?​
Answer: Grayscale images reduce computational complexity because they use a
single channel for pixel intensity, compared to the three channels in RGB images.
39.​What is the role of Object Detection in autonomous vehicles?​
Answer: Object Detection helps autonomous vehicles identify obstacles, traffic
signs, and pedestrians to ensure safe navigation.
40.​Why is classification combined with localization in certain tasks?​
Answer: Classification with localization not only identifies the object in an image
but also determines its position, which is crucial for applications like robotics and
augmented reality.
41.​What is the difference between Object Detection and Instance
Segmentation?​
Answer: Object Detection identifies and locates objects in an image, while
Instance Segmentation goes further by labeling each pixel of the objects
individually.
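A hedged sketch of that difference in practice, using the pretrained Mask R-CNN model from torchvision (the `weights="DEFAULT"` argument assumes a recent torchvision release, and `street.jpg` is a placeholder):

```python
import torch
from torchvision.io import read_image
from torchvision.models.detection import maskrcnn_resnet50_fpn
from torchvision.transforms.functional import convert_image_dtype

model = maskrcnn_resnet50_fpn(weights="DEFAULT")  # pretrained on COCO
model.eval()

img = convert_image_dtype(read_image("street.jpg"), torch.float)  # placeholder
with torch.no_grad():
    output = model([img])[0]

# Object-detection style output: labels, confidence scores, bounding boxes
print(output["labels"], output["scores"], output["boxes"].shape)
# Instance-segmentation output: one pixel-wise mask per detected object
print(output["masks"].shape)  # (num_objects, 1, H, W)
```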
42.​Explain the role of Computer Vision in retail, with examples.​
Answer: In retail, Computer Vision is used for tracking customer movements,
analyzing store layouts, and optimizing shelf placements. For instance, security
camera footage can help monitor foot traffic and suggest better product
placements. It is also used in inventory management, where algorithms estimate
stock levels by analyzing images of shelves.
43.​Discuss the significance of pixels and their arrangement in a 2D grid.​
Answer: Pixels are the smallest units of an image, each representing a color or
intensity value. They are arranged in a 2D grid, where each pixel corresponds to
a specific position in the image. The arrangement and values of these pixels
determine the visual representation of the image.
44.​Describe the process and importance of classification in Computer Vision.​
Answer: Classification assigns a label to an image from a set of predefined
categories. This process is essential for tasks like detecting defective products in
factories or identifying handwritten digits. By training models on labeled data,
algorithms learn to recognize patterns and classify new images accurately.
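A minimal sketch of this train-then-classify loop, using scikit-learn's bundled 8 × 8 handwritten-digit images; the choice of a logistic-regression classifier is ours, purely for illustration:

```python
from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

digits = load_digits()                              # 8x8 grayscale digits, labels 0-9
X = digits.images.reshape(len(digits.images), -1)   # flatten each image to 64 pixels
X_train, X_test, y_train, y_test = train_test_split(
    X, digits.target, test_size=0.25, random_state=0)

clf = LogisticRegression(max_iter=2000)             # learn from labeled examples
clf.fit(X_train, y_train)

print(clf.score(X_test, y_test))                    # accuracy on unseen images
print(clf.predict(X_test[:5]))                      # predicted labels for new images
```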
45.​What challenges do lighting conditions pose to Computer Vision systems?
How are they addressed?​
Answer: Poor lighting can obscure details, and excessive brightness can create
glare, making object detection difficult. These challenges are addressed by
preprocessing techniques like contrast adjustment, noise reduction, and adaptive
thresholding.
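The preprocessing steps named above can be sketched with OpenCV as follows (parameter values are illustrative and `uneven.jpg` is a placeholder):

```python
import cv2

gray = cv2.imread("uneven.jpg", cv2.IMREAD_GRAYSCALE)  # placeholder path

# Noise reduction with a small Gaussian blur
blurred = cv2.GaussianBlur(gray, (5, 5), 0)

# Adaptive thresholding: the threshold is computed per 11x11 neighborhood,
# which handles uneven lighting better than a single global value
binary = cv2.adaptiveThreshold(blurred, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C,
                               cv2.THRESH_BINARY, 11, 2)

cv2.imwrite("binary.jpg", binary)
```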
46.​How do medical professionals benefit from Computer Vision in diagnosis?​
Answer: Computer Vision helps create detailed 3D models from 2D scans,
allowing medical professionals to examine tissues and organs more precisely. It
also assists in detecting anomalies, such as tumors, by highlighting areas of
concern in medical images.
47.​Explain how the Google Translate app combines Optical Character
Recognition (OCR) and Augmented Reality (AR).​
Answer: The app uses OCR to detect and extract text from an image. AR
overlays the translated text onto the original image, allowing users to view the
translation in real time as if it were part of the scene.
48.​What are the steps involved in Object Detection using Computer Vision?​
Answer:
Preprocessing the input image.
Applying feature extraction to identify patterns.
Using machine learning or deep learning algorithms to detect objects.
Drawing bounding boxes around detected objects to indicate their location.
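A very simple sketch of these four steps using classical OpenCV operations (contours stand in for the learned detector, and `objects.jpg` is a placeholder):

```python
import cv2

# 1. Preprocess: load, convert to grayscale, blur to suppress noise
img = cv2.imread("objects.jpg")
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
blurred = cv2.GaussianBlur(gray, (5, 5), 0)

# 2. Feature extraction: edges as simple low-level patterns
edges = cv2.Canny(blurred, 50, 150)

# 3. "Detection": group edge pixels into candidate object contours
contours, _ = cv2.findContours(edges, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)

# 4. Draw bounding boxes around the detected regions
for contour in contours:
    x, y, w, h = cv2.boundingRect(contour)
    cv2.rectangle(img, (x, y), (x + w, y + h), (0, 255, 0), 2)

cv2.imwrite("detected.jpg", img)
```

A deep-learning detector would replace steps 2-3 with a CNN, but the overall flow is the same.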
49.​Describe the concept of resolution and its impact on digital images.​
Answer: Resolution refers to the number of pixels in an image, usually expressed
as width × height. Higher resolution provides finer details and better clarity, but it
also increases storage and processing requirements. For example, a
high-resolution image is crucial for medical imaging but may be unnecessary for
simple applications like thumbnails.
