Introduction to Multiple Instance Learning
Marc-André Carbonneau
Supervisors: Eric Granger and Ghyslain Gagnon
October 19th, 2016
Outline of the presentation
1. Definition and formulation.
2. Applications.
3. Types of approaches.
4. Characteristics of MIL problems.
What Is Multiple Instance Learning?
Problem Formulation
Multiple Instance Learning
What it is:
• It is a form of weakly supervised learning.
• Training instances are arranged in sets, called bags.
• A label is provided for entire bags but not for instances.
What it is not:
• Supervised learning
• Unsupervised learning
• Semi-supervised learning
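As a minimal sketch of the bag structure described above, the snippet below stores a label per bag and none per instance. The `Bag` class, its field names, and the feature values are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Bag:
    """A set of instances that shares a single label (hypothetical container)."""
    instances: List[Sequence[float]]  # one feature vector per instance
    label: int                        # one label for the whole bag, none per instance


# Toy training set: the labels describe bags, never individual instances.
training_set = [
    Bag(instances=[[0.1, 0.9], [0.8, 0.2], [0.4, 0.4]], label=1),
    Bag(instances=[[0.2, 0.1], [0.3, 0.3]], label=0),
]

for bag in training_set:
    print(len(bag.instances), "instances, bag label:", bag.label)
```

Bags may contain different numbers of instances, which is why each bag keeps its own list instead of a fixed-size array.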
Illustration of a MIL problem
[Figure labels: “Can enter the secret room”; “Can I enter the secret room?”]
More on MIL assumptions: J. Foulds and E. Frank, “A Review of Multi-Instance Learning Assumptions,” Knowl. Eng. Rev., vol. 25, no. 1, pp. 1–25, Mar. 2010.
Example of relaxed MIL assumptions
• Both sand and water segments are positive instances for beach pictures.
• However, a picture of a beach must contain both sand and water segments. Otherwise, it could be a picture of a desert or of the sea.
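The beach example can be sketched in a few lines. The code assumes each instance has already been assigned a concept name, which a real system would have to infer. The function names `standard_assumption` and `all_concepts_present` are hypothetical and do not come from the cited papers.

```python
def standard_assumption(instance_concepts, positive_concept):
    """Bag is positive if at least one instance belongs to the positive concept."""
    return any(c == positive_concept for c in instance_concepts)


def all_concepts_present(instance_concepts, required_concepts):
    """Relaxed rule: bag is positive only if every required concept appears."""
    return set(required_concepts).issubset(instance_concepts)


# Hypothetical segment concepts extracted from three pictures.
beach = ["sand", "water", "sky"]
desert = ["sand", "sky"]
sea = ["water", "sky"]

for name, segments in [("beach", beach), ("desert", desert), ("sea", sea)]:
    print(name, all_concepts_present(segments, {"sand", "water"}))
```

Under the standard assumption with "sand" as the positive concept, the desert picture would wrongly be labeled as a beach, which is why the relaxed rule is needed here.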
Image from: J. Amores, “Multiple instance classification: Review, taxonomy and comparative study,” Artif. Intell., vol. 201, pp. 81–105, Aug. 2013.
Tasks that can be performed in MIL
[Figure panels: Supervised Learning; Bag classification in MIL; Instance classification in MIL; Group-based classification and set classification]
Image from: V. Cheplygina, D. M. J. Tax, and M. Loog, “On classification with bags, groups and sets,” Pattern Recognition Letters, vol. 59, pp. 11–17, Jul. 2015.
What Can I Do with Multiple Instance Learning?
Applications
Molecule Classification
This is the first MIL application, published in:
T. G. Dietterich, R. H. Lathrop, and T. Lozano-Pérez, “Solving the Multiple Instance Problem with Axis-parallel Rectangles,” Artificial Intelligence, 1997.
Objective: Predict if a molecule produces a given effect.
Bag: Collection of all conformations of the same molecule.
Instance: Conformation of a molecule.
Justification: Conformations are not observable individually.
Content-Based Image Retrieval
Objective: Classify images based on their subject.
Bag: Collection of segments or patches extracted from an image.
Instance: Image segment or patch.
Justification: Images can represent composite objects or concepts.
Note: Bag-of-words methods are MIL methods.
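To illustrate why bag-of-words fits the MIL setting, here is a minimal sketch that turns a bag of patch descriptors into one fixed-length histogram. The two-word `codebook` and the patch values are made up for illustration. In practice the codebook would typically be learned, for example by clustering.

```python
import math


def nearest_word(instance, codebook):
    """Index of the codebook entry closest to the instance (Euclidean distance)."""
    distances = [math.dist(instance, word) for word in codebook]
    return distances.index(min(distances))


def bag_of_words(bag, codebook):
    """Embed a bag as a normalized histogram of nearest codebook words."""
    histogram = [0.0] * len(codebook)
    for instance in bag:
        histogram[nearest_word(instance, codebook)] += 1.0
    return [count / len(bag) for count in histogram]


# Hypothetical 2-word codebook and a bag of four patch descriptors.
codebook = [[0.0, 0.0], [1.0, 1.0]]
bag = [[0.1, 0.0], [0.9, 1.0], [1.0, 0.8], [0.2, 0.1]]

embedding = bag_of_words(bag, codebook)
print(embedding)
```

The resulting vector has the same length for every bag, so any standard supervised classifier can then be trained on bag labels.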
Image from: Y. Chen, J. Bi, and J. Z. Wang, “MILES: Multiple-Instance Learning via Embedded Instance Selection,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 28, no. 12, pp. 1931–1947, 2006.
Object Localization in Images
Objective: Find objects in images.
Bag: Collection of candidate annotation boxes.
Instance: Sub-image corresponding to a candidate window.
Justification: A large quantity of data can be used for learning because costly strong annotations are not necessary.
Taxonomy from: J. Amores, “Multiple instance classification: Review, taxonomy and comparative study,” Artificial Intelligence, vol. 201, pp. 81–105, Aug. 2013.
Instance Space Methods
These methods try to uncover the true nature of each instance in order to make a decision on bag labels.
Pros:
• Can be directly used for instance classification tasks.
Cons:
• Do not work when instances have no precise classes.
Methods: MI-SVM, mi-SVM, APR, RSIS, EM-DD, MIL-Boost, SbMIL
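The general idea of an instance space method can be sketched as follows: score every instance, then aggregate the scores into a bag decision. This is not an implementation of any method listed above. The linear scorer, the hand-picked `weights` and `bias`, and the max aggregation under the standard assumption are all assumptions made for illustration.

```python
def instance_score(instance, weights, bias):
    """Linear score of one instance; positive values mean 'positive instance'."""
    return sum(w * x for w, x in zip(weights, instance)) + bias


def classify_bag(bag, weights, bias):
    """Label every instance, then call the bag positive if any instance is."""
    scores = [instance_score(x, weights, bias) for x in bag]
    instance_labels = [int(s > 0) for s in scores]
    bag_label = int(max(scores) > 0)  # max aggregation (standard assumption)
    return bag_label, instance_labels


# Hand-picked parameters, not trained on any data.
weights, bias = [1.0, 1.0], -1.0
positive_bag = [[0.1, 0.2], [0.9, 0.8]]
negative_bag = [[0.1, 0.1], [0.3, 0.2]]

print(classify_bag(positive_bag, weights, bias))
print(classify_bag(negative_bag, weights, bias))
```

Because instance labels are produced on the way to the bag label, the same model can be reused for instance classification, which matches the advantage listed above.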
Image from: M.-A. Carbonneau, V. Cheplygina, E. Granger, and G. Gagnon, “Multiple Instance Learning: A Survey on Problems Characteristics and Applications,” to be submitted to IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017.
Task
Instance and bag classification are two different tasks.
Many authors have observed that the best algorithm for instance classification is rarely the best for bag classification.
G. Vanwinckelen, V. do O, D. Fierens, and H. Blockeel, “Instance-level accuracy versus bag-level accuracy in multi-instance learning,” Data Mining and Knowledge Discovery, 2015.
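A toy calculation shows how the two accuracies can diverge. The labels below are invented for illustration and assume the standard MIL assumption. The hypothetical detector finds only one positive instance per positive bag, which is enough to label every bag correctly.

```python
def accuracy(predicted, actual):
    """Fraction of predictions that match the ground truth."""
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)


# Hypothetical ground-truth instance labels, grouped by bag.
true_instances = [[1, 1, 1, 0], [0, 0, 0], [1, 1, 0, 0]]
# Hypothetical predictions: only one positive found per positive bag.
pred_instances = [[1, 0, 0, 0], [0, 0, 0], [1, 0, 0, 0]]

# Standard assumption: a bag is positive if any of its instances is positive.
true_bags = [int(any(b)) for b in true_instances]
pred_bags = [int(any(b)) for b in pred_instances]
bag_acc = accuracy(pred_bags, true_bags)

# Instance-level accuracy is computed over all instances pooled together.
flat_true = [y for b in true_instances for y in b]
flat_pred = [y for b in pred_instances for y in b]
inst_acc = accuracy(flat_pred, flat_true)

print("bag accuracy:", bag_acc)
print("instance accuracy:", round(inst_acc, 3))
```

Here every bag is classified correctly while 3 of the 11 instances are mislabeled, so an algorithm tuned for bag accuracy has little incentive to find every positive instance.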
• The relation between the instances:
  • Co-occurrences
  • Structure
  • Intra-bag similarities
Data Distribution
The type of distribution is important when choosing a MIL algorithm.
Not all MIL algorithms easily deal with:
• Multi-modal distributions
• Unknown negative distribution