ML Study Design - Google Street View Blurring System
Objective: design an ML system that automatically detects privacy-sensitive regions (faces, people, vehicles) in Street View imagery and blurs them to protect privacy.
What is an RPN?
A Region Proposal Network (RPN) is a critical component in object detection systems like
Faster R-CNN. Its role is to generate proposals—regions in an image that are likely to contain
objects—quickly and accurately.
How it works:
A small convolutional network slides over the backbone's feature map; at each location it scores a set of anchor boxes (multiple scales and aspect ratios) for objectness and regresses offsets that refine those anchors into proposals.
The RPN enables end-to-end training, allowing the model to jointly optimize region proposals and object classification.
It is computationally efficient because it operates directly on the shared feature map, without a separate sliding-window or region-extraction step over the raw image.
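As a rough illustration (not part of the original notes), a minimal RPN head in PyTorch might look like the sketch below; the channel count and number of anchors are assumed values, and a real implementation adds anchor generation, proposal decoding, and NMS on top of this.

```python
import torch
import torch.nn as nn

class RPNHead(nn.Module):
    """Minimal RPN head sketch: a 3x3 conv slides over the backbone feature map;
    1x1 convs then predict, for each of k anchors per location, one objectness
    score and 4 box-regression offsets."""
    def __init__(self, in_channels=256, num_anchors=9):
        super().__init__()
        self.conv = nn.Conv2d(in_channels, in_channels, kernel_size=3, padding=1)
        self.objectness = nn.Conv2d(in_channels, num_anchors, kernel_size=1)
        self.bbox_deltas = nn.Conv2d(in_channels, num_anchors * 4, kernel_size=1)

    def forward(self, feature_map):
        x = torch.relu(self.conv(feature_map))
        return self.objectness(x), self.bbox_deltas(x)

# Example: a single 256-channel feature map of spatial size 50x50.
features = torch.randn(1, 256, 50, 50)
scores, deltas = RPNHead()(features)
print(scores.shape, deltas.shape)  # torch.Size([1, 9, 50, 50]) torch.Size([1, 36, 50, 50])
```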
YOLO
YOLO (You Only Look Once) is a single-stage object detection system that performs detection
directly on a grid overlaid on the input image. Here's a concise breakdown:
1. Grid Division: The input image is divided into an S x S grid (e.g., 7x7).
2. Cell Predictions: Each grid cell is responsible for predicting:
o B bounding boxes: Each bounding box has:
(x, y): Center coordinates relative to the cell.
(w, h): Width and height relative to the image.
Confidence score: Probability of an object being in the box and the box
being accurate.
o C class probabilities: A probability distribution over the C object classes.
3. Encoding: These predictions are encoded into a tensor of size S x S x [B * 5 + C], where
5 represents the 4 bounding box coordinates + the confidence score (e.g., 7 x 7 x 30 for S = 7, B = 2, C = 20).
4. Non-Max Suppression (NMS): Because multiple cells might detect the same object,
NMS filters out redundant bounding boxes: the highest-confidence box is kept, and any
other box whose IoU (Intersection over Union) with it exceeds a threshold is suppressed
(see the sketch after this list).
5. Loss Function: YOLO uses a loss function that combines:
o Bounding box regression loss (how well the predicted boxes match the ground
truth).
o Confidence loss (how accurate the objectness predictions are).
o Classification loss (how accurate the class predictions are).
No Region Proposals: YOLO directly predicts bounding boxes and class probabilities
without a separate region proposal step. This is what makes it much faster.
Grid-Based Detection: Detection happens at the grid cell level. Each cell is responsible
for predicting objects whose centers fall within it.
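As an illustration of the NMS step referenced above (not from the presentation), a minimal greedy NMS in Python could look like this; the [x1, y1, x2, y2] box format and the 0.5 IoU threshold are assumptions:

```python
import numpy as np

def iou(box_a, box_b):
    """IoU of two boxes given as [x1, y1, x2, y2]."""
    x1, y1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    x2, y2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter + 1e-9)

def nms(boxes, scores, iou_threshold=0.5):
    """Greedy NMS: keep the highest-scoring box, drop boxes that overlap it too much."""
    order = list(np.argsort(scores)[::-1])
    keep = []
    while order:
        best = order.pop(0)
        keep.append(int(best))
        order = [i for i in order if iou(boxes[best], boxes[i]) < iou_threshold]
    return keep

boxes = [[10, 10, 60, 60], [12, 12, 62, 62], [100, 100, 150, 150]]
scores = [0.9, 0.8, 0.7]
print(nms(boxes, scores))  # [0, 2] -- the second box overlaps the first and is suppressed
```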
Some other models
Example of per-class evaluation and mAP for a two-stage detector:
Two classes ("cat", "dog") and two images are used for a simplified example.
Image 1: 2 cats, 1 dog (ground truth). Predictions: 3 cat predictions (2 correct), 1 correct
dog prediction.
Image 2: 1 cat, 2 dogs (ground truth). Predictions: 1 correct cat prediction, 3 dog
predictions (2 correct).
Cat: Image 1: P = 2/3 ≈ 0.67, R = 1.0. Image 2: P = 1.0, R = 1.0. AP (simplified average of per-image precisions): ≈ 0.835
Dog: Image 1: P = 1.0, R = 1.0. Image 2: P = 2/3 ≈ 0.67, R = 1.0. AP (simplified average of per-image precisions): ≈ 0.835
mAP (mean of the per-class APs): ≈ 0.835. (A full evaluation would instead integrate precision over recall at one or more IoU thresholds; the averaging here is a simplification.)
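A small Python sketch of this simplified calculation follows; the counts are just the numbers from the example above, and this per-image averaging is a stand-in for real AP, which integrates a precision-recall curve.

```python
# (true positives, predictions, ground-truth boxes) per image, per class,
# taken from the two-image example above.
counts = {
    "cat": [(2, 3, 2), (1, 1, 1)],   # Image 1, Image 2
    "dog": [(1, 1, 1), (2, 3, 2)],
}

aps = {}
for cls, per_image in counts.items():
    precisions = []
    for tp, num_pred, num_gt in per_image:
        precision = tp / num_pred
        recall = tp / num_gt
        precisions.append(precision)
        print(f"{cls}: P={precision:.2f}, R={recall:.2f}")
    aps[cls] = sum(precisions) / len(precisions)  # simplified per-class AP

map_score = sum(aps.values()) / len(aps)          # mean over classes
print(f"AP per class: {aps}")
print(f"mAP = {map_score:.3f}")  # ~0.833 exactly; the notes round 2/3 to 0.67, giving 0.835
```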
Pry's Questions:
●Question: How do you determine what to blur when trying to protect privacy, considering that
blurring the whole face might be excessive?
●Answer: Jit did not directly address this question. However, Prasa and he agreed that blurring
specific features like eyes and noses could suffice for de-identification while reducing processing
demands.
●Question: How does the system differentiate overlapping objects, like a person in a vehicle,
when applying different blurring to each?
●Answer: Jit didn't explicitly answer this. However, his presentation explained that the system
uses bounding boxes and assigns object classes (person, car, etc.) to each detected region, which
suggests the model can distinguish overlapping objects and apply different blurring to each based
on its class. The detector components that make this possible are:
1. Bounding Box Regression: Predicting separate bounding boxes for each object in the
overlap region.
2. Object Classification: Assigning class probabilities (e.g., person, vehicle) to each
predicted bounding box.
3. Non-Maximum Suppression (NMS): Ensuring that overlapping boxes with lower
confidence scores are suppressed, retaining the most confident predictions for distinct
objects.
4. Feature Maps: Utilizing spatial and contextual features from the image to accurately
separate and classify objects even in overlapping scenarios.
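As a concrete but hypothetical illustration of applying different blurring per detected class, here is a small OpenCV sketch; the class names, kernel sizes, and the blur_detections helper are illustrative assumptions, not part of the notes or of Google's pipeline:

```python
import cv2
import numpy as np

# Hypothetical per-class blur strengths (Gaussian kernel sizes must be odd).
BLUR_KERNELS = {"person": (51, 51), "vehicle": (25, 25)}

def blur_detections(image, detections, min_confidence=0.5):
    """Apply a class-dependent Gaussian blur to each detected box.
    detections: list of (class_name, confidence, (x1, y1, x2, y2)).
    Overlapping boxes are blurred independently, so a person sitting
    inside a vehicle box receives both treatments."""
    out = image.copy()
    for class_name, confidence, (x1, y1, x2, y2) in detections:
        if confidence < min_confidence or class_name not in BLUR_KERNELS:
            continue
        roi = out[y1:y2, x1:x2]
        out[y1:y2, x1:x2] = cv2.GaussianBlur(roi, BLUR_KERNELS[class_name], 0)
    return out

# Usage on a dummy image: one "person" box nested inside a "vehicle" box.
frame = np.zeros((400, 600, 3), dtype=np.uint8)
dets = [("vehicle", 0.9, (100, 150, 400, 350)), ("person", 0.8, (180, 180, 260, 330))]
blurred = blur_detections(frame, dets)
```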
●Question: Since Google Street View is a live view, how does the system process real-time
images and select frames for blurring?
●Answer: Jit clarified that Google captures panoramic images and stitches them together to
create a route view. This process suggests the blurring happens on static images rather than live
video streams. Pry and Prasa discussed the complexities of handling live video streams,
proposing frame rate reduction and selective image processing as potential solutions.
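If one did need to handle a live stream, the frame-rate-reduction idea could be sketched as below; the sampling interval and the detect_and_blur placeholder are assumptions for illustration, not part of the discussion:

```python
import cv2

SAMPLE_EVERY = 5  # run the expensive detection/blurring step on every 5th frame

def detect_and_blur(frame):
    """Placeholder for the detection + blurring pipeline discussed above."""
    return frame  # a real system would return the frame with sensitive regions blurred

def process_stream(source=0):
    cap = cv2.VideoCapture(source)
    last_output, frame_index = None, 0
    while cap.isOpened():
        ok, frame = cap.read()
        if not ok:
            break
        if frame_index % SAMPLE_EVERY == 0:
            last_output = detect_and_blur(frame)  # subsampled: "frame rate reduction"
        # for skipped frames, last_output (the most recent blurred frame) could be reused
        frame_index += 1
    cap.release()
    return last_output
```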