Machine Learning 2D5362
Lecture 1:
Introduction to Machine Learning
Machine Learning
• Date/Time:
Tuesday ???
Thursday 13.30
• Location:
BB2 ?
• Course requirements:
active participation
homework assignments
course project
• Credits:
3-5 credits depending on course project
• Course webpage:
http://www.nada.kth.se/~hoffmann/ml.html
Course Material
Textbook (recommended):
• Machine Learning
Tom M. Mitchell, McGraw-Hill, 1997
ISBN: 0-07-042807-7 (available as paperback)
Further readings:
• An Introduction to Genetic Algorithms
Melanie Mitchell, MIT Press, 1996
• Reinforcement Learning – An Introduction
Richard Sutton and Andrew Barto, MIT Press, 1998
• Selected publications:
check course webpage
Course Overview
Introduction to machine learning
Concept learners
Decision tree learning
Neural networks
Evolutionary algorithms
Instance based learning
Reinforcement learning
Machine learning in robotics
Software Packages & Datasets
• MLC++
• Machine learning library in C++
• http://www.sig.com/Technology/mlc
• GALIB
• MIT GALib in C++
• http://lancet.mit.edu/ga
• UCI
• Machine Learning Data Repository UC Irvine
• http://www.ics.uci.edu/~mlearn/ML/Repository.html
Possible Course Projects
Applying machine learning techniques to your own
problem, e.g. classification, clustering, data
modeling, object recognition
Investigating combinations of multiple classifiers
Comparing different approaches in genetic fuzzy
systems
Learning robotic behaviors using evolutionary
techniques or reinforcement learning
LEGO Mindstorms
Scout
Scout Robots
16 sonar sensors
Laser range scanner
Odometry
Differential drive
Simulator
API in C
LEGO Mindstorms
Touch sensor
Light sensor
Rotation sensor
Video cam
Motors
Learning & Adaptation
Definition:
A computer program is said to learn from
experience E with respect to some class
of tasks T and performance measure P,
if its performance at tasks in T, as
measured by P, improves with
experience E.
Disciplines relevant to ML
Artificial intelligence
Bayesian methods
Control theory
Information theory
Computational complexity theory
Philosophy
Psychology and neurobiology
Statistics
Applications of ML
Learning to recognize spoken words
SPHINX (Lee 1989)
Learning to drive an autonomous vehicle
ALVINN (Pomerleau 1989)
Learning to classify celestial objects
(Fayyad et al 1995)
Learning to play world-class backgammon
TD-GAMMON (Tesauro 1992)
Designing the morphology and control structure
of electro-mechanical artefacts
GOLEM (Lipson, Pollack 2000)
Artificial Life
GOLEM Project (Nature: Lipson, Pollack 2000)
http://golem03.cs-i.brandeis.edu/index.html
Evolve simple electromechanical locomotion machines from
basic building blocks (bars, actuators, artificial neurons) in a
simulation of the physical world (gravity, friction).
The individuals that demonstrate the best locomotion
ability are fabricated through rapid prototyping technology.
Evolvable Robot
Arrow
Ratchet
Tetra
Evolved Creatures
Evolved creatures: Sims (1994)
http://genarts.com/karl/evolved-virtual-creatures.html
Darwinian evolution of virtual block creatures for
swimming, jumping, following, competing for a block
Learning Problem
Based on experience E
T: play checkers
P: percentage of games won
What experience?
What exactly should be learned?
How shall it be represented?
What specific algorithm to learn it?
Type of Training Experience
Direct or indirect?
Direct: board state -> correct move
Indirect: outcome of a complete game
Credit assignment problem
Teacher or not?
Teacher selects board states
Learner can select board states
Is training experience representative of
performance goal?
Training playing against itself
Performance evaluated playing against world champion
Choose Target Function
ChooseMove : B → M : board state → move
Maps a legal board state to a legal move
Evaluate : B → V : board state → board value
Assigns a numerical score to any given board
state, such that better board states obtain a
higher score
Select the best move by evaluating all successor
states of legal moves and pick the one with the
maximal score
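The selection rule above can be sketched as follows; `choose_move` and its helper arguments are hypothetical names, not part of the lecture:

```python
# Sketch of move selection via an evaluation function: score every
# successor state and pick the move leading to the best one.
def choose_move(board, legal_moves, apply_move, evaluate):
    return max(legal_moves, key=lambda m: evaluate(apply_move(board, m)))

# Toy demo: the "board" is an integer, a move adds to it, and the
# evaluation prefers states close to 2.
best = choose_move(0, [1, 2, 3],
                   apply_move=lambda b, m: b + m,
                   evaluate=lambda b: -(b - 2) ** 2)
# best is the move 2, since the state 0 + 2 scores highest
```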
Possible Definition of Target
Function
If b is a final board state that is won then
V(b) = 100
If b is a final board state that is lost then
V(b) = -100
If b is a final board state that is drawn then
V(b)=0
If b is not a final board state, then
V(b)=V(b’), where b’ is the best final board
state that can be achieved starting from b
and playing optimally until the end of the
game.
Gives correct values but is not operational
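Why this definition is correct but not operational can be seen on a toy game tree; the tree and state names below are invented for illustration. For a real game the recursion would have to search to the end of every line of play, which is infeasible:

```python
# Toy two-ply game tree illustrating the recursive definition of V(b).
# Non-final states map to their successors; final states carry outcomes.
TREE = {"b0": ["b1", "b2"], "b1": ["win", "draw"], "b2": ["loss", "draw"]}
FINAL = {"win": 100, "loss": -100, "draw": 0}

def V(b, our_turn=True):
    if b in FINAL:
        return FINAL[b]
    # "Playing optimally until the end": we maximize, the opponent minimizes.
    values = [V(s, not our_turn) for s in TREE[b]]
    return max(values) if our_turn else min(values)
```

Here V("b0") = 0: the opponent steers "b1" to the draw, and choosing "b2" would allow a loss.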
State Space Search
[Figure: game tree with unknown values V(b) = ?; a drawn final state has V(b) = 0]
Depth-First Search
Breadth-First Search
Number of Board States
Tic-Tac-Toe:
#board states = Σ 9!/(nX! nO! nblank!) over all legal
piece counts (nX = nO or nX = nO + 1), e.g.
9!/(5! 4! 0!) + 9!/(4! 4! 1!) + 9!/(4! 3! 2!) + … + 9
= 6045 non-empty board states
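The count can be checked directly by summing the multinomial coefficients over all legal piece counts (X moves first, so the number of X pieces equals or exceeds by one the number of O pieces):

```python
from math import factorial

def multinomial(nx, no, nb):
    # Number of ways to place nx X's, no O's, and nb blanks on 9 squares
    return factorial(9) // (factorial(nx) * factorial(no) * factorial(nb))

# Sum over all legal non-empty boards: nx == no or nx == no + 1.
total = sum(multinomial(nx, no, 9 - nx - no)
            for nx in range(10) for no in range(10 - nx)
            if nx in (no, no + 1) and nx + no > 0)
# total == 6045
```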
Example: 4x4 checkers
[Figures: a sequence of 4x4 checkers positions with estimated values,
from V(b0) = 20 down to V(b6) = -197.5, illustrating how values are
propagated back through the game]
Training data: pairs of board states before and after a move
Design Choices
Determine type of training experience:
games against experts / games against self / table of correct moves
Determine target function:
Board → Move / Board → Value
Determine representation of learned function:
polynomial / linear function of six features / artificial neural network
Determine learning algorithm:
gradient descent / linear programming
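One concrete path through these choices, a linear function of six features trained by gradient descent (the LMS rule), can be sketched as follows; the feature values and learning rate are made up for illustration:

```python
# Linear evaluation function: V_hat(b) = w0 + w1*x1 + ... + w6*x6
def v_hat(w, x):
    return w[0] + sum(wi * xi for wi, xi in zip(w[1:], x))

def lms_update(w, x, v_train, eta=0.01):
    # Move each weight a small step in the direction that reduces the
    # error between the training value and the current estimate.
    error = v_train - v_hat(w, x)
    return [w[0] + eta * error] + [wi + eta * error * xi
                                   for wi, xi in zip(w[1:], x)]

w = [0.0] * 7                  # w0 plus six feature weights
x = [1, 0, 2, 0, 1, 3]         # hypothetical feature values for one board
w2 = lms_update(w, x, v_train=100)
# After one update the estimate moves from 0 toward the training value 100.
```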
Learning Problem Examples
Credit card applications
Task T: Distinguish ”good” applicants from
”risky” applicants.
Performance measure P : ?
Experience E : ? (direct/indirect)
Target function : ?
Performance Measure P:
Error based: minimize the percentage of incorrectly
classified customers: P = (Nfp + Nfn) / N
Nfp: # false positives (rejected good customers)
Nfn: # false negatives (accepted bad customers)
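This measure can be computed directly; note that on the slide "positive" means "risky", so a false positive rejects a good customer and a false negative accepts a bad one. The function name and labels below are illustrative:

```python
# P = (Nfp + Nfn) / N for accept/reject decisions against true quality.
def error_rate(decisions, true_quality):
    n_fp = sum(1 for d, q in zip(decisions, true_quality)
               if d == "reject" and q == "good")
    n_fn = sum(1 for d, q in zip(decisions, true_quality)
               if d == "accept" and q == "bad")
    return (n_fp + n_fn) / len(decisions)

P = error_rate(["accept", "reject", "accept", "reject"],
               ["good",   "good",   "bad",    "bad"])
# One rejected good customer and one accepted bad one out of four: P = 0.5
```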
Customer record:
income, owns house, credit history, age, employed, accept
$40000, yes, good, 38, full-time, yes
$25000, no, excellent, 25, part-time, no
$50000, no, poor, 55, unemployed, no
T: Customer data → accept/reject
T: Customer data → probability of being a good customer
T: Customer data → expected utility/profit
Learning methods
Decision rules:
If income < $30,000 then reject
Bayesian network:
P(good | income, credit history,….)
Neural Network:
Nearest Neighbor:
Take the same decision as for the customer
in the data base that is most similar to the
applicant
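The nearest-neighbor rule on the customer records above can be sketched with a made-up numeric encoding (income in $1000s; house ownership, credit history, and employment mapped to small integers):

```python
import math

# Encoded records: (income, owns_house, credit, age, employed) -> decision
database = [
    ((40, 1, 1, 38, 2), "accept"),  # $40000, owns house, good, 38, full-time
    ((25, 0, 2, 25, 1), "reject"),  # $25000, no house, excellent, 25, part-time
    ((50, 0, 0, 55, 0), "reject"),  # $50000, no house, poor, 55, unemployed
]

def classify(applicant):
    # Take the same decision as for the most similar stored customer.
    def distance(record):
        return math.dist(applicant, record[0])
    return min(database, key=distance)[1]

decision = classify((42, 1, 1, 40, 2))  # closest to the first record
```

Income dominates the unscaled Euclidean distance here; a real implementation would normalize the features first.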
Learning Problem Examples
Obstacle Avoidance Behavior of a
Mobile Robot
Task T: Navigate robot safely through an
environment.
Performance measure P : ?
Experience E : ?
Target function : ?
Performance Measure P:
Neural Networks
Require direct training experience
Reinforcement Learning
Indirect training experience
Evolutionary Algorithms
Indirect training experience
Evolutionary Algorithms
[Figure: a population of bit-string genotypes (e.g. 10111, 01001)
evolving through mutation, recombination, and selection; a coding
scheme maps each genotype into phenotype space, where its fitness f
is evaluated]
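The loop in the figure, mutation, recombination, and selection over a population of bit strings, can be sketched minimally; the fitness function (count of 1-bits) is a stand-in for the slide's f:

```python
import random

random.seed(0)
GENOME_LEN, POP_SIZE, GENERATIONS = 5, 8, 30

def fitness(g):
    return sum(g)                            # onemax: count of 1-bits

def mutate(g, rate=0.1):
    return [bit ^ (random.random() < rate) for bit in g]

def recombine(a, b):
    cut = random.randrange(1, GENOME_LEN)    # one-point crossover
    return a[:cut] + b[cut:]

def select(pop):
    # Tournament selection: keep the fitter of two random individuals.
    return max(random.sample(pop, 2), key=fitness)

pop = [[random.randint(0, 1) for _ in range(GENOME_LEN)]
       for _ in range(POP_SIZE)]
best = max(pop, key=fitness)
for _ in range(GENERATIONS):
    pop = [mutate(recombine(select(pop), select(pop)))
           for _ in range(POP_SIZE)]
    best = max(pop + [best], key=fitness)
```

After a few generations the best-seen genotype approaches the all-ones string.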
Evolution of Simple Navigation
Issues in Machine Learning
What algorithms can approximate functions
well and when?
How does the number of training examples
influence accuracy?
How does the complexity of the hypothesis
representation impact accuracy?
How does noisy data influence accuracy?
What are the theoretical limits of learnability?
Machine vs. Robot Learning
Machine Learning | Robot Learning
Learning in a vacuum | Embedded learning
Statistically well-behaved data | Data distribution not homogeneous
Mostly off-line | Mostly on-line
Informative feedback | Qualitative and sparse feedback
Computational time not an issue | Time is crucial
Hardware does not matter | Hardware is a priority
Convergence proof | Empirical proof
Learning in Robotics
• behavioral adaptation:
adjust the parameters of individual behaviors according
to some direct feedback signal (e.g. adaptive control)
• evolutionary adaptation:
application of artificial evolution to robotic systems
• sensor adaptation:
adapt the perceptual system to the environment
(e.g. classification of different contexts, recognition)
• learning complex, deliberative behaviors:
unsupervised learning based on sparse feedback from
the environment, credit assignment problem
(e.g. reinforcement learning)