
Active Learning for Deep Object Detection

Clemens-Alexander Brust¹, Christoph Käding¹,² and Joachim Denzler¹,²


1 Computer Vision Group, Friedrich Schiller University Jena, Germany
2 Michael Stifel Center Jena, Germany

Keywords: Active Learning, Deep Learning, Object Detection, YOLO, Continuous Learning, Incremental Learning.

Abstract: The great success that deep models have achieved in the past is mainly owed to large amounts of labeled training data. However, the acquisition of labeled data for new tasks aside from existing benchmarks is both challenging and costly. Active learning can make the process of labeling new data more efficient by selecting unlabeled samples which, when labeled, are expected to improve the model the most. In this paper, we combine a novel method of active learning for object detection with an incremental learning scheme (Käding et al., 2016b) to enable continuous exploration of new unlabeled datasets. We propose a set of uncertainty-based active learning metrics suitable for most object detectors. Furthermore, we present an approach to leverage class imbalances during sample selection. All methods are evaluated systematically in a continuous exploration context on the PASCAL VOC 2012 dataset (Everingham et al., 2010).

1 INTRODUCTION

Labeled training data is highly valuable and the basic requirement of supervised learning. Active learning aims to expedite the process of acquiring new labeled data, ordering unlabeled samples by the expected value from annotating them. In this paper, we propose novel active learning methods for object detection. Our main contributions are (i) an incremental learning scheme for deep object detectors without catastrophic forgetting based on (Käding et al., 2016b), (ii) active learning metrics for detection derived from uncertainty estimates and (iii) an approach to leverage selection imbalances for active learning.

While active learning is widely studied in classification tasks (Kovashka et al., 2016; Settles, 2009), it has received much less attention in the domain of deep object detection. In this work, we propose methods that can be used with any object detector that predicts a class probability distribution per object proposal. Scores from individual detections are aggregated into a score for the whole image (see Fig. 1). All methods rely on the intuition that model uncertainty and valuable samples are likely to co-occur (Settles, 2009). Furthermore, we show how the balanced selection of new samples can improve the resulting performance of an incrementally learned system.

In continuous exploration application scenarios, e.g., in camera streams, new data becomes available over time or the distribution underlying the problem changes itself. We simulate such an environment using splits of the PASCAL VOC 2012 (Everingham et al., 2010) dataset. With our proposed framework, a deep object detection system can be trained in an incremental manner while the proposed aggregation schemes enable selection of valuable data for annotation. In consequence, a deep object detector can explore unknown data and adapt itself involving minimal human supervision. This combination results in a complete system enabling continuously changing scenarios.

1.1 Related Work

Object Detection using CNNs. An important contribution to object detection based on deep learning is R-CNN (Girshick et al., 2014). It delivers a considerable improvement over previously published sliding window-based approaches. R-CNN employs selective search (Uijlings et al., 2013), an unsupervised method to generate region proposals. A pre-trained CNN performs feature extraction. Linear SVMs (one per class) are used to score the extracted features and a threshold is applied to filter the large number of proposed regions. Fast R-CNN (Girshick, 2015) and Faster R-CNN (Ren et al., 2015) offer further improvements in speed and accuracy.


Figure 1: Our proposed system for continuous exploration scenarios. Unlabeled images are evaluated by a deep object detection method. The margins of predictions (i.e., absolute difference of highest and second-highest class score) are aggregated to identify valuable instances by combining scores of individual detections.

Later on, R-CNN is combined with feature pyramids to enable efficient multi-scale detections (Lin et al., 2017). YOLO (Redmon et al., 2016) is a more recent deep learning-based object detector. Instead of using a CNN as a black box feature extractor, it is trained in an end-to-end fashion. All detections are inferred in a single pass (hence the name "You Only Look Once") while detection and classification are capable of independent operation. YOLOv2 (Redmon and Farhadi, 2017) and YOLOv3 (Redmon and Farhadi, 2018) improve upon the original YOLO in several aspects. These include among others different network architectures, different priors for bounding boxes and considering multiple scales during training and detection. SSD (Liu et al., 2016) is a single-pass approach comparable to YOLO introducing improvements like assumptions about the aspect ratio distribution of bounding boxes as well as predictions on different scales. As a result of a series of improvements, it is both faster and more accurate than the original YOLO. DSSD (Fu et al., 2017) further improves upon SSD in focusing more on context with the help of deconvolutional layers.

Active Learning for Object Detection. The authors of (Abramson and Freund, 2006) propose an active learning system for pedestrian detection in videos taken by a camera mounted on the front of a moving car. Their detection method is based on AdaBoost while sampling of unlabeled instances is realized by hand-tuned thresholding of detections. Object detection using generalized Hough transform in combination with randomized decision trees, called Hough forests, is presented in (Yao et al., 2012). Here, costs are estimated for annotations, and instances with highest costs are selected for labeling. This follows the intuition that those examples are most likely to be difficult and therefore considered most valuable. Another active learning approach for satellite images using sliding windows in combination with an SVM classifier and margin sampling is proposed in (Bietti, 2012). The combination of active learning for object detection with crowd sourcing is presented in (Vijayanarasimhan and Grauman, 2014). A part-based detector for SVM classifiers in combination with hashing is proposed for use in large-scale settings. Active learning is realized by selecting the most uncertain instances for labeling. In (Roy et al., 2016), object detection is interpreted as a structured prediction problem using a version space approach in the so called "difference of features" space. The authors propose different margin sampling approaches estimating the future margin of an SVM classifier.

Like our proposed approach, most related methods presented above rely on uncertainty indicators like least confidence or 1-vs-2. However, they are designed for a specific type of object detection and therefore can not be applied to deep object detection methods in general whereas our method can. Additionally, our method does not propose single objects to the human annotator. It presents whole images at once and requests labels for every object.

Active Learning for Deep Architectures. In (Wang and Shang, 2014) and (Wang et al., 2016), uncertainty-based active learning criteria for deep models are proposed. The authors offer several metrics to estimate model uncertainty, including least confidence, margin or entropy sampling. Wang et al. additionally describe a self-taught learning scheme, where the model's prediction is used as a label for further training if uncertainty is below a threshold. Another type of margin sampling is presented in (Stark et al., 2015). The authors propose querying samples according to the quotient of the highest and second-highest class probability. The visual detection of defects using a ResNet is presented in (Feng et al., 2017). The authors propose two methods: uncertainty sampling (i.e., defect probability of 0.5) and positive sampling (i.e., selecting every positive sample since they are very rare) for querying unlabeled instances as model update after labeling. Another work which presents uncertainty sampling is (Liu et al., 2017). In addition, a query by committee strategy as well as active learning involving weighted incremental dictionary learning for active learning are proposed.


In the work of (Gal et al., 2017), several uncertainty-related measures for active learning are presented. Since they use Bayesian CNNs, they can make use of the probabilistic output and employ methods like variance sampling, entropy sampling or maximizing mutual information. Finally, the authors of (Beluch et al., 2018) show that ensemble-based uncertainty measures are able to perform best in their evaluation. All of the works introduced above are tailored to active learning in classification scenarios. Most of them rely on model uncertainty, similar to our applied selection criteria.

Besides estimating the uncertainty of the model, further retraining-based approaches maximize the expected model change (Huang et al., 2016) or the expected model output change (Käding et al., 2016a) that unlabeled samples would cause after labeling. Since each bounding box inside an image has to be evaluated according to its active learning score, both measures would be impractical in terms of runtime without further modifications. A more complete overview of general active learning strategies can be found in (Kovashka et al., 2016; Settles, 2009).

2 PREREQUISITE: ACTIVE LEARNING

In active learning, a value or metric v(x) is assigned to any unlabeled example x to determine its possible contribution to model improvement. The current model's output can be used to estimate a value, as can statistical properties of the example itself. A high v(x) means that the example should be preferred during selection because of its estimated value for the current model.

In the following section, we propose a method to adapt an active learning metric for classification to object detection using an aggregation process. This method is applicable to any object detector whose output contains class scores for each detected object.

Classification. For classification, the model output for a given example x is an estimated distribution of class scores p̂(c|x) over classes K. This distribution can be analyzed to determine whether the model made an uncertain prediction, a good indicator of a valuable example. Different measures of uncertainty are a common choice for selection, e.g., (Ertekin et al., 2007; Fu and Yang, 2015; Hoi et al., 2006; Jain and Kapoor, 2009; Kapoor et al., 2010; Käding et al., 2016c; Tong and Koller, 2001; Beluch et al., 2018).

For example, if the difference between the two highest class scores is very low, the example may be located close to a decision boundary. In this case, it can be used to refine the decision boundary and is therefore valuable. The metric is defined using the highest scoring classes c1 and c2:

    v_1vs2(x) = 1 − (max_{c1 ∈ K} p̂(c1|x) − max_{c2 ∈ K\c1} p̂(c2|x))² .    (1)

This procedure is known as 1-vs-2 or margin sampling (Settles, 2009). We use 1-vs-2 as part of our methods since its operation is intuitive and it can produce better estimates than, e.g., least confidence approaches (Käding et al., 2016a). However, our proposed aggregation method could be applied to any other active learning measure.

3 ACTIVE LEARNING FOR DEEP OBJECT DETECTION

Using a classification metric on a single detection is straightforward, if class scores are available. However, aggregating the metrics of individual detections into a score for a complete image can be done in many different ways. In the section below, we propose simple and efficient aggregation strategies. Afterwards, we discuss the problem of class imbalance in datasets.

3.1 Aggregation of Detection Metrics

Possible aggregations include calculating the sum, the average or the maximum over all detections. However, for some aggregations, it is not clear how an image without any detections should be handled.

Sum. A straightforward method of aggregation is the sum. Intuitively, this method prefers images with lots of uncertain detections in them. When aggregating detections using a sum, empty examples should be valued zero. It is the neutral element of addition, making it a reasonable value for an empty sum. A low valuation effectively delays the selection of empty examples until there are either no better examples left or the model has improved enough to actually produce detections on them. The value of a single example x can be calculated from the detections D in the following way:

    v_Sum(x) = Σ_{i ∈ D} v_1vs2(x_i) .    (2)
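To make Eqs. (1) and (2) concrete, the following minimal Python sketch scores a single unlabeled image from its per-detection class-score vectors. It is an illustrative sketch only, not the authors' released implementation; the function names and the assumption that each detection yields a normalized class-score vector are editorial.

    # Illustrative sketch (not the released implementation):
    # 1-vs-2 margin score per detection, Eq. (1), and Sum aggregation, Eq. (2).
    import numpy as np

    def v_1vs2(class_scores):
        """Margin-based uncertainty of one detection from its class-score vector."""
        top2 = np.sort(np.asarray(class_scores, dtype=float))[-2:]
        margin = top2[1] - top2[0]          # highest minus second-highest score
        return 1.0 - margin ** 2

    def v_sum(detections):
        """Whole-image value via Sum aggregation; empty images score 0."""
        return sum(v_1vs2(scores) for scores in detections)

    # Example: one uncertain and one confident detection in the same image.
    print(v_sum([[0.5, 0.4, 0.1], [0.9, 0.05, 0.05]]))

Averaging or taking the maximum of the same per-detection scores yields the alternatives discussed next.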


Average. Another possibility is averaging each detection's scores. The average is not sensitive to the number of detections, which may make scores more comparable between images. If a sample does not contain any detections, it will be assigned a zero score. This is an arbitrary rule because there is no true neutral element w.r.t. averages. However, we choose zero to keep the behavior in line with the other metrics:

    v_Avg(x) = (1 / |D|) Σ_{i ∈ D} v_1vs2(x_i) .    (3)

Maximum. Finally, individual detection scores can be aggregated by calculating the maximum. This can result in a substantial information loss. However, it may also prove beneficial because of increased robustness to noise from many detections. For the maximum aggregation, a zero score for empty examples is valid. The maximum is not affected by zero valued detections, because no single detection's score can be lower than zero:

    v_Max(x) = max_{i ∈ D} v_1vs2(x_i) .    (4)

3.2 Handling Selection Imbalances

Class imbalances can lead to worse results for classes underrepresented in the training set. In a continuous learning scenario, this imbalance can be countered during selection by preferring instances where the predicted class is underrepresented in the training set. An instance is weighted by the following rule:

    w_c = (#instances + #classes) / (#instances_c + 1) ,    (5)

where c denotes the predicted class. We assume a symmetric Dirichlet prior with α = 1, meaning that we have no prior knowledge of the class distribution, and estimate the posterior after observing the total number of training instances as well as the number of instances of class c in the training set. The weight w_c is then defined as the inverse of the posterior to prefer underrepresented classes. It is multiplied with v_1vs2(x) before aggregation to obtain a final score.

4 EXPERIMENTS

In the following, we present our evaluation. First we show how the proposed aggregation metrics are able to enhance recognition performance while selecting new data for annotation. After this, we will analyze the gained improvements when our proposed weighting scheme is applied. The code for our experiments is available¹.

¹ https://github.com/cvjena/cn24-active

Data. We use the PASCAL VOC 2012 dataset (Everingham et al., 2010) to assess the effects of our methods on learning. To specifically measure incremental and active learning performance, both training and validation set are split into parts A and B in two different random ways to obtain more general experimental results. Part B is considered "new" and is comprised of images with the object classes bird, cow and sheep (first way) or tvmonitor, cat and boat (second way). Part A contains all other 17 classes and is used for initial training. The training set for part B contains 605 and 638 images for the first and second way, respectively. The decision towards VOC in favor of recently published datasets was motivated by the conditions of the dataset itself. Since it mainly contains images showing fewer objects, it is possible to split the data into a known and unknown part without having images containing classes from both parts of the splits.

Active Exploration Protocol. Before an experimental run, the part B datasets are divided randomly into unlabeled batches of ten samples each. This fixed assignment decreases the probability of very similar images being selected for the same batch compared to always selecting the highest valued samples, which would lead to less diverse batches. This is valuable while dealing with data streams, e.g., from camera traps, or data with low intra-class variance. The construction of diverse unlabeled data batches is a well known topic in batch-mode active learning (Settles, 2009). However, the construction of diverse batches could lead to unintended side-effects and an evaluation of those is beyond the scope of the current study. The unlabeled batch size is a trade-off between a tight feedback loop (smaller batches) and computational efficiency (larger batches). As a side-effect of the fixed batch assignment, there are some samples left over during selection (i.e., five for the first way and eight for the second way of splitting).

The unlabeled batches are assigned a value using the sum of the active learning metric over all images in the corresponding batch as a meta-aggregation. Other functions such as average or maximum could be considered too, but are also beyond the scope of this paper.

The highest valued batch is selected for an incremental training step (Käding et al., 2016b). The network is updated using the annotations from the dataset in lieu of a human annotator. Please note, annotations are not needed for update batch selection but for the update itself. This process is repeated from the point of batch valuation until there are no unlabeled batches left. The assignment of samples to unlabeled batches is not changed during an experimental run.

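Continuing the earlier sketch (and reusing its hypothetical v_1vs2 helper), the snippet below illustrates how the class-balancing weight of Eq. (5) could enter the image score and how batches might be ranked by the summed image values described above. Class counts, data layout and function names are editorial assumptions, not the released code.

    # Illustrative sketch: weighting of Eq. (5) and batch-level meta-aggregation.
    def class_weight(train_counts, c):
        """w_c = (#instances + #classes) / (#instances_c + 1), cf. Eq. (5)."""
        total_instances = sum(train_counts.values())
        num_classes = len(train_counts)
        return (total_instances + num_classes) / (train_counts.get(c, 0) + 1)

    def weighted_image_score(detections, predicted_classes, train_counts):
        """Sum aggregation of weighted 1-vs-2 scores for a single image."""
        return sum(class_weight(train_counts, c) * v_1vs2(scores)
                   for scores, c in zip(detections, predicted_classes))

    def select_batch(unlabeled_batches, train_counts):
        """Pick the unlabeled batch with the highest summed image value."""
        return max(unlabeled_batches,
                   key=lambda batch: sum(weighted_image_score(d, p, train_counts)
                                         for d, p in batch))

Here each batch is assumed to be a list of (detections, predicted_classes) pairs, one pair per image.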

Algorithm 1: Detailed description of the experimental protocol. Please note that in an actual continuous learning scenario, new examples are always added to U. The loop is never left because U is never exhausted. The described splitting process would have to be applied regularly.

    Require: known labeled samples L, unknown samples U, initial model f0, active learning metric v
      U = {U1, U2, ...} ← split of U into random batches
      f ← f0
      while U is not empty do
        calculate scores for all batches in U using f
        Ubest ← highest scoring batch in U according to v
        Ybest ← annotations for Ubest          (human-machine interaction)
        f ← incrementally train f using L and (Ubest, Ybest)
        U ← U \ Ubest
        L ← L ∪ (Ubest, Ybest)
      end while
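Procedurally, Algorithm 1 can be read as the following Python sketch; the scoring, annotation and incremental-training routines are passed in as placeholders, since the concrete system behind them is the YOLO-based setup described in Section 4.

    # Sketch of the loop in Algorithm 1 (all callables are placeholders).
    def explore(L, unlabeled_batches, model, score_batch, annotate, train_incrementally):
        while unlabeled_batches:
            # value every remaining batch with the current model f
            scores = [score_batch(model, batch) for batch in unlabeled_batches]
            best_idx = scores.index(max(scores))
            U_best = unlabeled_batches.pop(best_idx)
            Y_best = annotate(U_best)                     # human-machine interaction
            model = train_incrementally(model, L, (U_best, Y_best))
            L = L + list(zip(U_best, Y_best))             # selected batch becomes known data
        return model, L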
Evaluation. We report mean average precision (mAP) as described in (Everingham et al., 2010) and validate after each five new batches (i.e., 50 new samples). The result is averaged over five runs for each active learning metric and way of splitting for a total of ten runs. As a baseline for comparison, we evaluate the performance of random selection, since there is no other work suitable for direct comparison without any adjustments as of yet.

Setup – Object Detector. We use YOLO as our deep object detection framework (Redmon et al., 2016). More precisely, we use the YOLO-Small architecture as an alternative to larger object detection networks, because it allows for much faster training. Our initial model is obtained by fine-tuning the Extraction model² on part A of the VOC dataset for 24,000 iterations using the Adam optimizer (Kingma and Ba, 2014), for each way of splitting the dataset into parts A and B, resulting in two initial models. The first half of initial training is completed with a learning rate of 0.0001. The second half and all incremental experiments use a lower learning rate of 0.00001 to prevent divergence. Other hyperparameters match (Redmon et al., 2016), including the augmentation of training data using random crops, exposure or saturation adjustments.

² http://pjreddie.com/media/files/extraction.weights

Setup – Incremental Learning. Extending an existing CNN without sacrificing performance on known data is not a trivial task. Fine-tuning exclusively on new data leads to a severe degradation of performance on previously learned examples (Kirkpatrick et al., 2016; Shmelkov et al., 2017). We use a straightforward, but effective fine-tuning method (Käding et al., 2016b) to implement incremental learning. With each gradient step, the mini-batch is constructed by randomly selecting from old and new examples with a certain probability of λ or 1 − λ, respectively. After completing the learning step, the new data is simply considered old data for the next step. This method can balance known and unknown data performance successfully. We use a value of 0.5 for λ to make as few assumptions as possible and perform 100 iterations per update. Algorithm 1 describes the protocol in more detail. The method can be applied to YOLO object detection with some adjustments. Mainly, the architecture needs to be changed when new classes are added. Because of the design of YOLO's output layer, we rearrange the weights to fit new classes, adding 49 zero-initialized weights per class.
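The mixing of old and new examples in this fine-tuning method can be sketched as follows; only the λ-based selection is taken from the text, the data structures are editorial assumptions.

    # Sketch of the incremental fine-tuning mini-batch (the experiments use lambda = 0.5):
    # every slot is drawn from the old data with probability lambda, otherwise from the new data.
    import random

    def build_minibatch(old_data, new_data, batch_size, lam=0.5, rng=random):
        batch = []
        for _ in range(batch_size):
            pool = old_data if rng.random() < lam else new_data
            batch.append(rng.choice(pool))
        return batch

    # After the update iterations, the new samples are merged into the old data:
    # old_data.extend(new_data)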
4.1 Results

We focus our analysis on the new, unknown data. However, not losing performance on known data is also important. We analyze the performance on the known part of the data (i.e., part A of the VOC dataset) to validate our method. In the worst case, the mAP decreases from 36.7% initially to 32.1% averaged across all experimental runs and methods although three new classes were introduced. We can see that the incremental learning method from (Käding et al., 2016b) causes only minimal losses on known data. These losses in performance are also referred to as "catastrophic forgetting" in literature (Kirkpatrick et al., 2016). The method from (Käding et al., 2016b) does not require additional model parameters or adjusted loss terms for added samples like comparable approaches such as (Shmelkov et al., 2017) do, which is important for learning indefinitely.


Table 1: Validation results on part B of the VOC data (i.e., new classes only). Bold face indicates block-wise best results, i.e., best results with and without additional weighting (· + w). Underlined face highlights overall best results.

                 50 samples   100 samples   150 samples   200 samples   250 samples   All samples
                 mAP / AULC   mAP / AULC    mAP / AULC    mAP / AULC    mAP / AULC    mAP / AULC
    Baseline
    Random       8.7 / 4.3    12.4 / 14.9   15.5 / 28.8   18.7 / 45.9   21.9 / 66.2   32.4 / 264.0
    Our Methods
    Max          9.2 / 4.6    12.9 / 15.7   15.7 / 30.0   19.8 / 47.8   22.6 / 69.0   32.0 / 269.3
    Avg          9.0 / 4.5    12.4 / 15.2   15.8 / 29.2   19.3 / 46.8   22.7 / 67.8   33.3 / 266.4
    Sum          8.5 / 4.2    14.3 / 15.6   17.3 / 31.4   19.8 / 49.9   22.7 / 71.2   32.4 / 268.2
    Max + w      9.2 / 4.6    13.0 / 15.7   17.0 / 30.7   20.6 / 49.5   23.2 / 71.4   33.0 / 271.0
    Avg + w      8.7 / 4.3    12.5 / 14.9   16.6 / 29.4   19.9 / 47.7   22.4 / 68.8   32.7 / 267.1
    Sum + w      8.7 / 4.4    13.7 / 15.6   17.5 / 31.2   20.9 / 50.4   24.3 / 72.9   32.7 / 273.6

Figure 2: Value of examples of cow, sheep and bird as determined by the Sum, Avg and Max metrics using the initial model (rows: most valuable, highest-scoring examples per metric with and without weighting; bottom row: least valuable, zero-scoring examples). The top seven selection is not affected by using our weighting method to counter training set class imbalances.

Performance of active learning methods is usually evaluated by observing points on a learning curve (i.e., performance over number of added samples). In Table 1, we show the mAP for the new classes from part B of VOC at several intermediate learning steps as well as after exhausting the unlabeled pool. In addition we show the area under the learning curve (AULC) to further improve comparability among the methods. In our experiments, the number of samples added equals the number of images.

Quantitative Results – Fast Exploration. Gaining accuracy as fast as possible while minimizing the human supervision is one of the main goals of active learning. Moreover, in continuous exploration scenarios, like live camera feeds or other continuous automatic measurements, it is assumed that new data is always available. Hence, the pool of valuable examples will rarely be exhausted. To assess the performance of our methods in this fast exploration context, we evaluate the models after learning small amounts of samples. At this point there is still a large number of diverse samples for the methods to choose from, which makes the following results much more relevant for practical applications than results on the full dataset.

In general, we can see that incremental learning works in the context of the new classes in part B of the data, meaning that we observe an improving performance for all methods. After adding only 50 samples, Max and Avg are performing better than passive selection while the Sum metric is outperformed marginally. When more and more samples are added (i.e., 100 to 250 samples), we observe a superior performance of the Sum aggregation. But also the two other aggregation techniques are able to reach better rates than mere random selection. We attribute the fast increase of performance for the Sum metric to its tendency to select samples with many objects inside, which leads to more annotated bounding boxes. However, the target application is a scenario where the amount of unlabeled data is huge or new data is approaching continuously and hence a complete evaluation by humans is infeasible. Here, we consider the amount of images to be evaluated more critical than the time needed to draw single bounding boxes.
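As a side note on the AULC column of Table 1: the reported values are consistent with a simple trapezoidal integration of the mAP learning curve over the 50-sample evaluation steps, starting from zero; the exact convention is not spelled out in the text, so the sketch below is an assumption.

    # Sketch (assumed convention): AULC as cumulative trapezoids over 50-sample steps.
    def aulc(map_checkpoints, start=0.0):
        """map_checkpoints: mAP after 50, 100, 150, ... added samples."""
        area, prev = 0.0, start
        for m in map_checkpoints:
            area += (prev + m) / 2.0
            prev = m
        return area

    print(aulc([8.7, 12.4, 15.5]))   # ~28.9, close to the 28.8 reported for Random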


Figure 3: Evolution of detections on examples from the validation set (columns: new classes bird, cow, sheep from part B and known classes aeroplane, car from part A; rows: initial prediction, after 50 samples, after 150 samples).

Another interesting fact is the almost equal performance of Max and Avg, which can be explained as follows: the VOC dataset consists mostly of images with only one object in them. Therefore, both metrics lead to a similar score if objects are identified correctly.

We can also see that the proposed balance handling (i.e., · + w) causes slight losses in performance at very early stages up to 100 samples. At subsequent stages, it helps to gain noticeable improvements. Especially the Sum method benefits from the sample weighting scheme. A possible explanation for this behavior would be the following: At early stages, the classifier has not seen many samples of each class and therefore suffers more from misclassification errors. Hence, the weighting scheme is not able to encourage the selection of rare class samples since the classifier decisions are still too unstable. At later stages, this problem becomes less severe and the weighting scheme is much more helpful than in the beginning. This could also explain the performance of Sum in general. Further details on learning pace are given later in a qualitative study on model evolution. Additionally, the Sum aggregation tends to select batches with many detections in them. Hence, it is natural that the improvement is most noticeable with this aggregation technique since it helps to find batches with many rare objects in them.

Quantitative Results – All Available Samples. In our case, active learning only affects the sequence of unlabeled batches if we train until there is no new data available. Therefore, there are only very small differences between each method's results for mAP after training has completed. The small differences indicate that the chosen incremental learning technique (Käding et al., 2016b) is suitable for the faced scenario. In continuous exploration, it is usually assumed that there will be more new unlabeled data available than can be processed. Nevertheless, evaluating the long term performance of our metrics is important to detect possible deterioration over time compared to random selection. In contrast to this, the differences in AULC arise from the improvements of the different methods during the experimental run and therefore should be considered as a distinctive feature implying the performance over the whole experiment. Having this in mind, we can still see that Sum performs best while the weighting generally leads to improvements.

Quantitative Results – Class-wise Analysis. To validate the efficacy of our sample weighting strategy as discussed in Section 3.2, it is important to measure not only overall performance, but to look at metrics for individual classes. Fig. 4 shows the performance over time on the validation set for each individual class. For reference, we also provide the class distribution over the relevant part of the VOC dataset, indicated by the number of object instances in total as well as the number of images with at least one instance in it.

In the first row, we observe an advantage for the weighted method when looking at the performance of cow. Out of the three classes in this way of splitting, cow has the fewest instances in the dataset. The performance of tvmonitor in the second row shows a similar pattern, where it is also the class with the lowest number of object instances in the dataset. Analyzing bird and cat, the top classes by number of instances in each way of splitting, we observe only small differences in performance. Thus, we can show evidence that our balancing scheme is able to improve performance on rare classes while it does not affect performance on frequent classes.


Figure 4: Class-wise validation results on part B of the VOC dataset (i.e., unknown classes). The first row details the first way of splitting (bird, cow, sheep) while the second row shows the second way (boat, cat, tvmonitor), each plotting AP (%) over the number of added samples for Sum and Sum + w. For reference, the distribution of samples (object instances as well as images with at least one instance) over the VOC dataset is provided in the third row.

Intuitively, these observations are in line with our expectations regarding our handling of class imbalances, where examples of rare classes should be preferred during selection. We start to observe the advantages after around 100 training examples, because, for the selection to happen correctly, the prediction of the rare class needs to be correct in the first place.

Qualitative Results – Sample Valuation. We calculate whole image scores over bird, cow and sheep samples using our corresponding initial model trained on the remaining classes for the first way of splitting. Figure 2 shows those images that the three aggregation metrics consider the most valuable. Additionally, common zero scoring images are shown. The least valuable images shown here are representative of all proposed metrics because they do not lead to any detections using the initial model. Note that there are more than seven images with zero score in the training dataset. The images shown in the figure have been selected randomly.

Intuitively, the Sum metric should prefer images with many objects in them over single objects, even if individual detection values are low. Although VOC consists mostly of images with a single object, all seven of the highest scoring images contain at least three objects. The Average and Maximum metrics prefer almost identical images since the average and maximum tend to be nearly equal for few detections. With few exceptions, the most valuable images contain pristine examples of each object. They are well lit and isolated. The objects in the zero scoring images are more noisy and hard to identify even for the human viewer, resulting in fewer reliable detections.

Qualitative Results – Model Evolution. Observing the change in model output as new data is learned can help estimate the number of samples needed to learn new classes and identify possible confusions. Fig. 3 shows the evolution from initial guesses to correct detections after learning 150 samples, corresponding to a fast exploration scenario. For selection, the Sum metric is used.

The class confusions shown in the figure are typical for this scenario. cow and sheep are recognized as the visually similar dog, horse and cat. bird is often classified as aeroplane. After selecting and learning 150 samples, the objects are detected and classified correctly and reliably.

During the learning process, there are also unknown objects. Please note, being able to mark objects as unknown is a direct consequence of using YOLO. Those objects have a detection confidence above the required threshold, but no classification is certain enough. This property of YOLO is important for the discovery of objects of new classes. Nevertheless, if similar information is available from other detection methods, our techniques could easily be applied.


5 CONCLUSIONS

In this paper, we propose several uncertainty-based active learning metrics for object detection. They only require a distribution of classification scores per detection. Depending on the specific task, an object detector that will report objects of unknown classes is also important. Additionally, we propose a sample weighting scheme to balance selections among classes.

We evaluate the proposed metrics on the PASCAL VOC 2012 dataset (Everingham et al., 2010) and offer quantitative and qualitative results and analysis. We show that the proposed metrics are able to guide the annotation process efficiently, which leads to superior performance in comparison to a random selection baseline. In our experimental evaluation, the Sum metric is able to achieve the best results overall, which can be attributed to the fact that it tends to select batches with many individual objects in them. However, the targeted scenario is an application with huge amounts of unlabeled data where we consider the amount of images to be evaluated as more critical than the time needed to draw single bounding boxes. Examples would be camera streams or camera trap data. To expedite annotation, our approach could be combined with a weakly supervised learning approach as presented in (Papadopoulos et al., 2016). We also showed that our weighting scheme leads to even increased accuracies.

All presented metrics could be applied to other deep object detectors, such as the variants of SSD (Liu et al., 2016), the improved R-CNNs, e.g., (Ren et al., 2015), or the newer versions of YOLO (Redmon and Farhadi, 2017). Moreover, our proposed metrics are not restricted to deep object detection and could be applied to arbitrary object detection methods if they fulfill the requirements. It only requires a complete distribution of classification scores per detection. Also the underlying uncertainty measure could be replaced with arbitrary active learning metrics to be aggregated afterwards. Depending on the specific task, an object detector that will report objects of unknown classes is also important.

The proposed aggregation strategies also generalize to selection of images based on segmentation results or any other type of image partition. The resulting scores could also be applied in a novelty detection scenario.

REFERENCES

Abramson, Y. and Freund, Y. (2006). Active learning for visual object detection. Technical report, University of California, San Diego.

Beluch, W. H., Genewein, T., Nürnberger, A., and Köhler, J. M. (2018). The power of ensembles for active learning in image classification. In Computer Vision and Pattern Recognition (CVPR).

Bietti, A. (2012). Active learning for object detection on satellite images. Technical report, California Institute of Technology, Pasadena.

Ertekin, S., Huang, J., Bottou, L., and Giles, L. (2007). Learning on the border: active learning in imbalanced data classification. In Conference on Information and Knowledge Management.

Everingham, M., Van Gool, L., Williams, C. K. I., Winn, J., and Zisserman, A. (2010). The pascal visual object classes (voc) challenge. International Journal of Computer Vision (IJCV).

Feng, C., Liu, M.-Y., Kao, C.-C., and Lee, T.-Y. (2017). Deep active learning for civil infrastructure defect detection and classification. In International Workshop on Computing in Civil Engineering (IWCCE).

Fu, C.-J. and Yang, Y.-P. (2015). A batch-mode active learning svm method based on semi-supervised clustering. Intelligent Data Analysis.

Fu, C.-Y., Liu, W., Ranga, A., Tyagi, A., and Berg, A. C. (2017). Dssd: Deconvolutional single shot detector. arXiv preprint arXiv:1701.06659.

Gal, Y., Islam, R., and Ghahramani, Z. (2017). Deep bayesian active learning with image data. arXiv preprint arXiv:1703.02910.

Girshick, R. (2015). Fast R-CNN. In International Conference on Computer Vision (ICCV).

Girshick, R., Donahue, J., Darrell, T., and Malik, J. (2014). Rich feature hierarchies for accurate object detection and semantic segmentation. In Computer Vision and Pattern Recognition (CVPR).

Hoi, S. C., Jin, R., and Lyu, M. R. (2006). Large-scale text categorization by batch mode active learning. In International Conference on World Wide Web (WWW).

Huang, J., Child, R., Rao, V., Liu, H., Satheesh, S., and Coates, A. (2016). Active learning for speech recognition: the power of gradients. arXiv preprint arXiv:1612.03226.

Jain, P. and Kapoor, A. (2009). Active learning for large multi-class problems. In Computer Vision and Pattern Recognition (CVPR).

Käding, C., Freytag, A., Rodner, E., Perino, A., and Denzler, J. (2016a). Large-scale active learning with approximated expected model output changes. In German Conference on Pattern Recognition (GCPR).

Käding, C., Rodner, E., Freytag, A., and Denzler, J. (2016b). Fine-tuning deep neural networks in continuous learning scenarios. In ACCV Workshop on Interpretation and Visualization of Deep Neural Nets (ACCV-WS).

Käding, C., Rodner, E., Freytag, A., and Denzler, J. (2016c). Watch, ask, learn, and improve: A lifelong learning cycle for visual recognition. In European Symposium on Artificial Neural Networks (ESANN).


Kapoor, A., Grauman, K., Urtasun, R., and Darrell, T. (2010). Gaussian processes for object categorization. International Journal of Computer Vision (IJCV).

Kingma, D. P. and Ba, J. (2014). Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980.

Kirkpatrick, J., Pascanu, R., Rabinowitz, N. C., Veness, J., Desjardins, G., Rusu, A. A., Milan, K., Quan, J., Ramalho, T., Grabska-Barwinska, A., Hassabis, D., Clopath, C., Kumaran, D., and Hadsell, R. (2016). Overcoming catastrophic forgetting in neural networks. arXiv preprint arXiv:1612.00796.

Kovashka, A., Russakovsky, O., Fei-Fei, L., and Grauman, K. (2016). Crowdsourcing in computer vision. Foundations and Trends in Computer Graphics and Vision.

Lin, T.-Y., Dollár, P., Girshick, R., He, K., Hariharan, B., and Belongie, S. (2017). Feature pyramid networks for object detection. In CVPR.

Liu, P., Zhang, H., and Eom, K. B. (2017). Active deep learning for classification of hyperspectral images. Selected Topics in Applied Earth Observations and Remote Sensing.

Liu, W., Anguelov, D., Erhan, D., Szegedy, C., Reed, S., Fu, C.-Y., and Berg, A. C. (2016). SSD: Single shot multibox detector. In European Conference on Computer Vision (ECCV).

Papadopoulos, D. P., Uijlings, J. R. R., Keller, F., and Ferrari, V. (2016). We don't need no bounding-boxes: Training object class detectors using only human verification. In Computer Vision and Pattern Recognition (CVPR).

Redmon, J., Divvala, S., Girshick, R., and Farhadi, A. (2016). You only look once: Unified, real-time object detection. In Computer Vision and Pattern Recognition (CVPR).

Redmon, J. and Farhadi, A. (2017). Yolo9000: Better, faster, stronger. In Computer Vision and Pattern Recognition (CVPR).

Redmon, J. and Farhadi, A. (2018). Yolov3: An incremental improvement. arXiv preprint arXiv:1804.02767.

Ren, S., He, K., Girshick, R., and Sun, J. (2015). Faster R-CNN: Towards real-time object detection with region proposal networks. In Neural Information Processing Systems (NIPS).

Roy, S., Namboodiri, V. P., and Biswas, A. (2016). Active learning with version spaces for object detection. arXiv preprint arXiv:1611.07285.

Settles, B. (2009). Active learning literature survey. Technical report, University of Wisconsin–Madison.

Shmelkov, K., Schmid, C., and Alahari, K. (2017). Incremental learning of object detectors without catastrophic forgetting. In International Conference on Computer Vision (ICCV).

Stark, F., Hazırbas, C., Triebel, R., and Cremers, D. (2015). Captcha recognition with active deep learning. In Workshop New Challenges in Neural Computation.

Tong, S. and Koller, D. (2001). Support vector machine active learning with applications to text classification. Journal of Machine Learning Research (JMLR).

Uijlings, J. R., Van De Sande, K. E., Gevers, T., and Smeulders, A. W. (2013). Selective search for object recognition. International Journal of Computer Vision (IJCV), 104(2):154–171.

Vijayanarasimhan, S. and Grauman, K. (2014). Large-scale live active learning: Training object detectors with crawled data and crowds. International Journal of Computer Vision (IJCV).

Wang, D. and Shang, Y. (2014). A new active labeling method for deep learning. In International Joint Conference on Neural Networks (IJCNN).

Wang, K., Zhang, D., Li, Y., Zhang, R., and Lin, L. (2016). Cost-effective active learning for deep image classification. Circuits and Systems for Video Technology.

Yao, A., Gall, J., Leistner, C., and Van Gool, L. (2012). Interactive object detection. In Computer Vision and Pattern Recognition (CVPR).

