Computer Vision Questions
One-Word Answers
1. What is the most common pixel format used in images?
o Answer: Byte
2. What color model is used in RGB images?
o Answer: Red, Green, Blue
3. What is the process of dividing an image into smaller regions or segments
called?
o Answer: Segmentation
4. What type of object recognition is used in systems like Google Search by
Image?
o Answer: Object Detection
5. What is used to define the clarity or detail of an image based on the
number of pixels?
o Answer: Resolution
6. Which task is focused on pixel-level classification and categorizes each
pixel in an image into predefined classes?
o Answer: Semantic Segmentation
7. What is the primary challenge in real-time facial recognition systems in
terms of lighting conditions and facial occlusions?
o Answer: Robustness
8. In the context of a self-driving car, what is the term for detecting and
identifying objects on the road, such as pedestrians, other cars, or traffic
signs?
o Answer: Object Detection
9. In an RGB image, the color value of a pixel is represented using how many
channels?
o Answer: Three
10.Which concept in image processing involves enhancing the clarity or detail
of an image, typically used before performing higher-level analysis?
o Answer: Image Enhancement
11.The primary challenge for object detection systems in crowded or complex
environments is to accurately differentiate between ________ objects.
o Answer: overlapping
12.Which architecture in deep learning is specifically designed to handle
sequential image data, such as video or time-series images?
o Answer: 3D Convolutional Neural Network (3D CNN)
13.In a Convolutional Neural Network, the operation that downsamples each
feature map, reducing dimensionality while retaining important features,
is called ________.
o Answer: Pooling
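As a rough illustration of pooling, the sketch below applies 2 x 2 max pooling to a small feature map stored as nested lists. The function name max_pool_2x2 and the sample values are assumptions made for this example; in practice a deep learning library's pooling layer would be used instead.

```python
def max_pool_2x2(feature_map):
    """Downsample a 2D feature map by keeping the max of each 2x2 block."""
    rows, cols = len(feature_map), len(feature_map[0])
    pooled = []
    # Step by 2 so the blocks do not overlap; a trailing odd row/column is dropped.
    for r in range(0, rows - 1, 2):
        pooled_row = []
        for c in range(0, cols - 1, 2):
            block = (
                feature_map[r][c], feature_map[r][c + 1],
                feature_map[r + 1][c], feature_map[r + 1][c + 1],
            )
            pooled_row.append(max(block))
        pooled.append(pooled_row)
    return pooled


# Hypothetical 4x4 feature map, used only for illustration.
feature_map = [
    [1, 3, 2, 0],
    [4, 6, 1, 1],
    [0, 2, 9, 5],
    [1, 1, 3, 7],
]
pooled = max_pool_2x2(feature_map)
print(pooled)  # [[6, 2], [2, 9]]
```

The 4 x 4 input shrinks to 2 x 2, yet the strongest response in each region survives.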
14.The method of training a deep network in which the error gradient is
propagated backward from the output layer so that the weights of every layer
can be updated is referred to as ________.
o Answer: Backpropagation
15.What is the name of the smallest element in an image?
Answer: Pixel
16.What kind of images have only shades of gray and no color?
Answer: Grayscale
17.What is the process of teaching a computer to recognize patterns in
images?
Answer: Training
18.Which popular technology is used to unlock phones using your face?
Answer: Facial Recognition
19.What term describes creating a 3D model from 2D images in medical
imaging?
Answer: Reconstruction
20.What is the term for the total number of pixels in an image?
Answer: Resolution
21. What is the full form of RGB in images?
Answer: Red, Green, Blue
22. What kind of image uses only shades of gray?
Answer: Grayscale
23. What does the Google Translate app use to recognize and translate text from
images?
Answer: Optical Character Recognition (OCR)
24. What is the process of marking an object and its location in an image called?
Answer: Localization
True or False
25.An image with higher resolution contains fewer pixels.
Answer: False
26.Computer Vision is a subfield of Artificial Intelligence.
Answer: True
27.In a grayscale image, pixel values can range from 0 to 255.
Answer: True
28.RGB images cannot represent shades of gray.
Answer: False
29.Instance Segmentation identifies each pixel that belongs to a specific object.
Answer: True
Open-Ended Questions
1. What are some real-life examples of Computer Vision applications?
Answer: Facial recognition, self-driving cars, Google Translate’s camera feature,
and face filters on social media apps.
2. Why is it important for machines to understand visual data?
Answer: It helps in making decisions automatically, like identifying road signs for
self-driving cars or detecting diseases in medical imaging.
3. How do RGB images differ from grayscale images?
Answer: RGB images are made up of three color components—Red, Green, and
Blue—while grayscale images have only shades of gray.
4. What do you understand by the term ‘Resolution’ of an image?
Answer: Resolution refers to the number of pixels in an image, usually measured
as width × height.
5. Can you name one example where object detection is useful in everyday
life?
Answer: Security cameras use object detection to recognize people and detect
movements.
6. What are some common applications of Computer Vision?
Answer: Facial recognition, self-driving cars, medical imaging, Google Translate,
and face filters on social media apps.
7. Why is Computer Vision important for industries like retail?
Answer: It helps with customer behavior tracking, inventory management, and
optimizing store layouts.
8. How does an RGB image store color information?
Answer: By combining different intensities of Red, Green, and Blue for each
pixel.
9. Explain how lighting conditions affect Computer Vision.
Answer: Poor lighting can make it harder for algorithms to detect or identify
objects accurately.
10.What is the main difference between Object Detection and Instance
Segmentation?
Answer: Object Detection finds and labels objects, while Instance Segmentation
also labels each pixel of the object.
11.Explain the basics of an image and how computers interpret it.
Answer: An image is made up of pixels, the smallest units of information. Each
pixel has a value representing brightness or color. Computers store these values
as numbers in a grid, with higher resolutions having more pixels. RGB images
store three values per pixel for Red, Green, and Blue, while grayscale images
store a single value between 0 and 255.
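The grid-of-numbers idea can be sketched with plain nested lists. The variable names and the helper describe are assumptions for this example; image libraries normally use array types instead.

```python
# A 2x3 grayscale image: one intensity (0-255) per pixel.
gray_image = [
    [0, 128, 255],
    [64, 192, 32],
]

# The same size in RGB: three values (red, green, blue) per pixel.
rgb_image = [
    [(255, 0, 0), (0, 255, 0), (0, 0, 255)],
    [(0, 0, 0), (128, 128, 128), (255, 255, 255)],
]


def describe(image):
    """Return (height, width, channels) for a nested-list image."""
    height, width = len(image), len(image[0])
    first_pixel = image[0][0]
    channels = len(first_pixel) if isinstance(first_pixel, tuple) else 1
    return height, width, channels


print(describe(gray_image))  # (2, 3, 1)
print(describe(rgb_image))   # (2, 3, 3)
```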
12.Discuss how Computer Vision is used in self-driving cars.
Answer: Self-driving cars use Computer Vision to identify objects like
pedestrians, vehicles, and traffic signs. They also monitor the environment and
analyze navigational routes. This helps in decision-making, like when to stop or
turn.
13.What is the role of pixels in forming an image?
Answer: Pixels are the smallest units of an image. They store color or brightness
information and collectively form the complete picture.
14.How is resolution related to the quality of an image?
Answer: Higher resolution means more pixels in the image, leading to greater
detail and sharper quality.
15.What are the three primary tasks of Computer Vision?
Answer: Image Classification, Object Detection, and Instance Segmentation.
16.Why is grayscale commonly used in image processing?
Answer: Grayscale simplifies computations as it uses only one channel, reducing
complexity compared to RGB images.
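A minimal sketch of the channel reduction follows. It converts RGB pixels to a single intensity using the weights 0.299, 0.587 and 0.114, which are the ITU-R BT.601 luma coefficients; this is one common convention among several, and the function name to_grayscale is an assumption for this example.

```python
def to_grayscale(rgb_image):
    """Convert nested-list RGB pixels to single-channel intensities."""
    return [
        # Weighted sum of the three channels (BT.601 luma weights).
        [round(0.299 * r + 0.587 * g + 0.114 * b) for (r, g, b) in row]
        for row in rgb_image
    ]


# Hypothetical 2x2 RGB image: red, green, blue and white pixels.
rgb_image = [
    [(255, 0, 0), (0, 255, 0)],
    [(0, 0, 255), (255, 255, 255)],
]
gray_image = to_grayscale(rgb_image)
print(gray_image)  # [[76, 150], [29, 255]]

values_rgb = sum(len(pixel) for row in rgb_image for pixel in row)
values_gray = sum(len(row) for row in gray_image)
print(values_rgb, values_gray)  # 12 4
```

The grayscale version holds one third as many numbers, which is where the saving in computation comes from.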
17.How does Object Detection differ from Classification?
Answer: Classification assigns a label to an image, while Object Detection
identifies and locates multiple objects within an image.
18.What does Computer Vision contribute to medical imaging?
Answer: It helps analyze 2D scan images, creates interactive 3D models, and
assists doctors in understanding patients’ health conditions.
19.What is the purpose of Google’s “Search by Image” feature?
Answer: It compares features of an input image with a database to find similar
images and provide search results.
20.How do RGB and grayscale images differ in data storage?
Answer: RGB images store three values per pixel (Red, Green, Blue), while
grayscale images store a single brightness value per pixel.
21.Explain how Computer Vision enables machines to interpret and analyze
visual data.
Answer: Computer Vision uses algorithms to process images or videos and
extract meaningful information. Tasks include detecting objects, classifying them,
and identifying their locations. It relies on techniques like image preprocessing,
machine learning, and neural networks to enable applications such as facial
recognition, object detection, and scene understanding.
22.Discuss the applications of Computer Vision in daily life with examples.
Answer: Computer Vision is widely used in various fields:
Facial Recognition: For unlocking phones and security systems.
Retail: Tracking customer movements and managing inventory.
Medical Imaging: Assisting in creating 3D models from 2D scans for diagnosis.
Self-Driving Cars: Recognizing traffic signs, pedestrians, and navigating routes.
Social Media: Applying filters on platforms like Instagram and Snapchat.
28.How do lighting conditions impact the accuracy of Computer Vision
applications?
Answer: Lighting significantly affects how images are captured and analyzed.
Poor lighting can cause blurriness or obscure details, making it harder for
algorithms to detect objects. Bright lighting might cause reflections or glare,
leading to incorrect interpretations. This is why robust algorithms often include
image enhancement techniques to adjust for varying lighting conditions.
29.Describe the role of pixels and resolution in image quality.
Answer: Pixels are the building blocks of an image, each representing a tiny
portion of the picture. Resolution, defined as the number of pixels in width ×
height, determines the image's detail. Higher resolution means more pixels,
resulting in better clarity and detail. For example, a 1280 × 1024 resolution
contains 1.31 million pixels, offering more detail than a lower-resolution image.
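The arithmetic behind that figure can be checked with a few lines. The helper name pixel_count and the other two resolutions are assumptions chosen for comparison.

```python
def pixel_count(width, height):
    """Total number of pixels in a width x height image."""
    return width * height


for width, height in [(640, 480), (1280, 1024), (1920, 1080)]:
    total = pixel_count(width, height)
    print(f"{width} x {height} = {total:,} pixels ({total / 1_000_000:.2f} million)")
# 640 x 480 = 307,200 pixels (0.31 million)
# 1280 x 1024 = 1,310,720 pixels (1.31 million)
# 1920 x 1080 = 2,073,600 pixels (2.07 million)
```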
30.Why is Computer Vision considered a critical technology for autonomous
vehicles?
Answer: Autonomous vehicles rely on Computer Vision to perceive their
surroundings. Cameras capture images of the environment, and algorithms
detect objects like cars, pedestrians, and traffic lights. This data helps the vehicle
make real-time decisions, such as when to stop, accelerate, or turn. Without
Computer Vision, self-driving technology would not be able to operate safely and
effectively.
31.How does the Google Translate app use Computer Vision to translate text?
Answer: The Google Translate app uses Computer Vision with Optical Character
Recognition (OCR) to detect text in images. The app captures an image of text,
identifies characters, and translates them into the desired language. Augmented
Reality then overlays the translated text onto the original image, so the
translation appears in place in real time.
32.What is Instance Segmentation, and how is it used in Computer Vision
applications?
Answer: Instance Segmentation identifies each object in an image, labels it, and
marks every pixel that belongs to it, so that separate objects of the same class
remain distinct. It is commonly used in fields
like autonomous vehicles (to detect and distinguish multiple objects like cars and
pedestrians) and medical imaging (to identify and segment different tissues or
organs in scans).
33.Explain the importance of pixel values in an image and their range in
grayscale images.
Answer: Each pixel value represents the brightness (intensity) at one point in an image. In
grayscale images, pixel values range from 0 (black) to 255 (white). Intermediate
values indicate varying shades of gray. These values are crucial for analyzing the
image, as they form the basis for tasks like filtering, edge detection, and
segmentation.
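To show how pixel values feed a task like edge detection, the sketch below takes the difference between horizontally adjacent pixels. This is a deliberately simple stand-in for real edge detectors, and the name horizontal_edges is an assumption for this example.

```python
def horizontal_edges(gray_image):
    """Absolute difference between each pixel and its right-hand neighbor."""
    return [
        [abs(row[c + 1] - row[c]) for c in range(len(row) - 1)]
        for row in gray_image
    ]


# Hypothetical image: dark on the left, bright on the right.
gray_image = [
    [10, 10, 200, 200],
    [10, 10, 200, 200],
]
edges = horizontal_edges(gray_image)
print(edges)  # [[0, 190, 0], [0, 190, 0]]
```

The large value marks the boundary between the dark and bright regions; flat areas give zero.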
34.What is Computer Vision, and how does it differ from traditional image
processing?
Answer: Computer Vision is a field of Artificial Intelligence that enables machines
to interpret and analyze visual data. Unlike traditional image processing, which
focuses on transforming images (e.g., filtering, resizing), Computer Vision aims to
understand and extract meaningful information from images.
35.Why are RGB images composed of three channels?
Answer: RGB images use three channels—Red, Green, and Blue—because
these primary colors can be combined in varying intensities to produce a wide
spectrum of colors.
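A few channel combinations make this concrete. The color table and the helpers to_hex and is_gray are assumptions for this example; the values assume 8 bits per channel.

```python
# (red, green, blue) intensities, each from 0 to 255.
colors = {
    "red": (255, 0, 0),
    "yellow": (255, 255, 0),      # red + green
    "cyan": (0, 255, 255),        # green + blue
    "magenta": (255, 0, 255),     # red + blue
    "white": (255, 255, 255),     # all three at full intensity
    "mid gray": (128, 128, 128),  # equal channels give a shade of gray
}


def to_hex(rgb):
    """Format an RGB triple as a web-style hex string."""
    r, g, b = rgb
    return f"#{r:02x}{g:02x}{b:02x}"


def is_gray(rgb):
    """A pixel is a shade of gray when all three channels are equal."""
    r, g, b = rgb
    return r == g == b


for name, rgb in colors.items():
    print(name, to_hex(rgb), "gray" if is_gray(rgb) else "colored")
```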
36.What is the significance of the value range 0–255 in images?
Answer: The range 0–255 represents the intensity of a pixel in 8-bit grayscale or
RGB images. 0 indicates black, 255 indicates white or maximum intensity, and
intermediate values represent varying shades or colors.
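The 0 to 255 range follows from storing each value in 8 bits. The sketch below shows the count of levels and one way to keep a computed value inside the range; both helper names are assumptions for this example.

```python
def intensity_levels(bits):
    """Number of distinct values that fit in the given number of bits."""
    return 2 ** bits


def clamp_to_8bit(value):
    """Force a computed intensity into the valid 0-255 range."""
    return max(0, min(255, int(value)))


print(intensity_levels(8))   # 256 levels, numbered 0 to 255
print(clamp_to_8bit(300))    # 255
print(clamp_to_8bit(-20))    # 0
print(clamp_to_8bit(127.9))  # 127
```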
37.How does facial recognition work in Computer Vision?
Answer: Facial recognition uses algorithms to detect and analyze facial features
such as the eyes, nose, and mouth. It matches these features with a stored
database to identify individuals.
38.What is the primary advantage of using grayscale images over RGB images
in analysis?
Answer: Grayscale images reduce computational complexity because they use a
single channel for pixel intensity, compared to the three channels in RGB images.
39.What is the role of Object Detection in autonomous vehicles?
Answer: Object Detection helps autonomous vehicles identify obstacles, traffic
signs, and pedestrians to ensure safe navigation.
40.Why is classification combined with localization in certain tasks?
Answer: Classification with localization not only identifies the object in an image
but also determines its position, which is crucial for applications like robotics and
augmented reality.
41.What is the difference between Object Detection and Instance
Segmentation?
Answer: Object Detection identifies and locates objects in an image, while
Instance Segmentation goes further by labeling each pixel of the objects
individually.
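The difference shows up in the shape of the output. In the sketch below, the detection result is a label plus a box, while the segmentation result is a map with one entry per pixel. All data and names here are hypothetical and chosen only for illustration.

```python
# Hypothetical detection output: one (label, box) pair per object,
# with box = (top, left, bottom, right), all inclusive.
detections = [("cat", (0, 1, 1, 2)), ("dog", (1, 3, 2, 4))]

# Hypothetical instance segmentation output for the same scene:
# 0 = background, 1 = the cat, 2 = the dog.
instance_map = [
    [0, 1, 1, 0, 0],
    [0, 1, 0, 0, 2],
    [0, 0, 0, 2, 2],
]


def box_area(box):
    """Number of pixels covered by an inclusive bounding box."""
    top, left, bottom, right = box
    return (bottom - top + 1) * (right - left + 1)


def mask_area(instance_map, instance_id):
    """Number of pixels that actually belong to one instance."""
    return sum(row.count(instance_id) for row in instance_map)


for instance_id, (label, box) in enumerate(detections, start=1):
    print(label, "box pixels:", box_area(box),
          "object pixels:", mask_area(instance_map, instance_id))
# cat box pixels: 4 object pixels: 3
# dog box pixels: 4 object pixels: 3
```

Each box covers a pixel that is not part of the object, whereas the instance map records exactly which pixels are.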
42.Explain the role of Computer Vision in retail, with examples.
Answer: In retail, Computer Vision is used for tracking customer movements,
analyzing store layouts, and optimizing shelf placements. For instance, security
camera footage can help monitor foot traffic and suggest better product
placements. It is also used in inventory management, where algorithms estimate
stock levels by analyzing images of shelves.
43.Discuss the significance of pixels and their arrangement in a 2D grid.
Answer: Pixels are the smallest units of an image, each representing a color or
intensity value. They are arranged in a 2D grid, where each pixel corresponds to
a specific position in the image. The arrangement and values of these pixels
determine the visual representation of the image.
44.Describe the process and importance of classification in Computer Vision.
Answer: Classification assigns a label to an image from a set of predefined
categories. This process is essential for tasks like detecting defective products in
factories or identifying handwritten digits. By training models on labeled data,
algorithms learn to recognize patterns and classify new images accurately.
45.What challenges do lighting conditions pose to Computer Vision systems?
How are they addressed?
Answer: Poor lighting can obscure details, and excessive brightness can create
glare, making object detection difficult. These challenges are addressed by
preprocessing techniques like contrast adjustment, noise reduction, and adaptive
thresholding.
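One simple form of contrast adjustment is a linear stretch, sketched below for a dim grayscale image. The function name stretch_contrast is an assumption for this example, and integer division is used so results are rounded down; noise reduction and adaptive thresholding are not shown.

```python
def stretch_contrast(gray_image):
    """Rescale intensities so the darkest pixel maps to 0 and the brightest to 255."""
    values = [value for row in gray_image for value in row]
    low, high = min(values), max(values)
    if high == low:
        # Flat image: there is no contrast to stretch.
        return [row[:] for row in gray_image]
    return [
        [(value - low) * 255 // (high - low) for value in row]
        for row in gray_image
    ]


# Hypothetical under-exposed image using only the range 40-80.
dim_image = [
    [40, 50, 60],
    [50, 60, 80],
]
stretched = stretch_contrast(dim_image)
print(stretched)  # [[0, 63, 127], [63, 127, 255]]
```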
46.How do medical professionals benefit from Computer Vision in diagnosis?
Answer: Computer Vision helps create detailed 3D models from 2D scans,
allowing medical professionals to examine tissues and organs more precisely. It
also assists in detecting anomalies, such as tumors, by highlighting areas of
concern in medical images.
47.Explain how the Google Translate app combines Optical Character
Recognition (OCR) and Augmented Reality (AR).
Answer: The app uses OCR to detect and extract text from an image. AR
overlays the translated text onto the original image, allowing users to view the
translation in real time as if it were part of the scene.
48.What are the steps involved in Object Detection using Computer Vision?
Answer:
Preprocessing the input image.
Applying feature extraction to identify patterns.
Using machine learning or deep learning algorithms to detect objects.
Drawing bounding boxes around detected objects to indicate their location.
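The four steps above can be sketched as a toy pipeline on a tiny grayscale image. This is only an illustration under strong assumptions: the "detector" is a brightness threshold rather than a trained model, it treats all bright pixels as one object, and every function name is hypothetical.

```python
def preprocess(image):
    """Step 1: scale 0-255 intensities to the range 0.0-1.0."""
    return [[value / 255 for value in row] for row in image]


def extract_features(image, threshold=0.5):
    """Step 2: a very crude feature map that marks bright pixels."""
    return [[value > threshold for value in row] for row in image]


def detect(features):
    """Step 3: treat all marked pixels as one object; return its box or None."""
    coords = [
        (r, c)
        for r, row in enumerate(features)
        for c, hit in enumerate(row)
        if hit
    ]
    if not coords:
        return None
    rows = [r for r, _ in coords]
    cols = [c for _, c in coords]
    return min(rows), min(cols), max(rows), max(cols)


def draw_box(image, box, value=255):
    """Step 4: return a copy of the image with the box outline set to `value`."""
    top, left, bottom, right = box
    out = [row[:] for row in image]
    for r in range(top, bottom + 1):
        for c in range(left, right + 1):
            if r in (top, bottom) or c in (left, right):
                out[r][c] = value
    return out


# Hypothetical image: a bright 2x2 object on a dark background.
image = [
    [10, 10, 10, 10, 10],
    [10, 200, 220, 10, 10],
    [10, 210, 230, 10, 10],
    [10, 10, 10, 10, 10],
]
box = detect(extract_features(preprocess(image)))
print(box)  # (1, 1, 2, 2)
annotated = draw_box(image, box)
```

A real detector replaces steps 2 and 3 with learned features and a trained model, but the flow from pixels to boxes is the same.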
49.Describe the concept of resolution and its impact on digital images.
Answer: Resolution refers to the number of pixels in an image, usually expressed
as width × height. Higher resolution provides finer details and better clarity, but it
also increases storage and processing requirements. For example, a
high-resolution image is crucial for medical imaging but may be unnecessary for
simple applications like thumbnails.
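The storage cost can be estimated as width x height x channels at 8 bits per channel. The sketch below covers uncompressed data only (real file formats usually compress), and the helper name raw_size_bytes and the sample sizes are assumptions for this example.

```python
def raw_size_bytes(width, height, channels, bits_per_channel=8):
    """Uncompressed size of an image in bytes."""
    return width * height * channels * bits_per_channel // 8


examples = [
    ("thumbnail, RGB", 128, 128, 3),
    ("1280 x 1024, grayscale", 1280, 1024, 1),
    ("1280 x 1024, RGB", 1280, 1024, 3),
]
for name, width, height, channels in examples:
    print(f"{name}: {raw_size_bytes(width, height, channels):,} bytes")
# thumbnail, RGB: 49,152 bytes
# 1280 x 1024, grayscale: 1,310,720 bytes
# 1280 x 1024, RGB: 3,932,160 bytes
```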