03u Handout
(Machine Learning Foundations)
Roadmap
1 When Can Machines Learn?
• credit approve/disapprove
• email spam/non-spam
• patient sick/not sick
• ad profitable/not profitable
• answer correct/incorrect (KDDCup 2010)
• Y = {1c, 5c, 10c, 25c}, or Y = {1, 2, ..., K} (abstractly)
• binary classification: special case with K = 2
[Figure: coin classification plot with an unlabeled '?' point; axis: Size]
(image by Robert-Owen-Wahl from Pixabay)
multilabel classification:
classify input to multiple (or no) categories
Y = 2^{apple, orange, strawberry, kiwi} (the set of all subsets of the categories)
What Tags?
[Figure: image + image ⇒ image]
(Leonardo da Vinci, in Public Domain) (Van Gogh, in Public Domain) (Pjfinlay, with CC0)
all images are downloaded from Wikipedia
Y: a ‘manifold’ ⊂ R^{w×h×c},
arguably not just multi-pixel regression
Hsuan-Tien Lin (NTU CSIE) Machine Learning Foundations 10/40
Types of Learning Learning with Different Output Space Y
Mini Summary
Learning with Different Output Space Y
• binary classification: Y = {−1, +1}
• multiclass classification: Y = {1, 2, ..., K}
• multilabel classification: Y = 2^{1, 2, ..., K}
• regression: Y = R
• image generation: Y ⊂ R^{w×h×c}
• ... and a lot more!!
[Figure: learning-flow diagram; unknown target function f: X → Y; hypothesis set H]
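The output spaces listed in this summary can be written down concretely. The following is a minimal sketch, assuming K = 4 to match the coin example; the names `binary_Y`, `multiclass_Y`, `multilabel_Y`, and `power_set` are illustrative and do not come from the slides.

```python
from itertools import combinations

K = 4  # assumed number of classes, e.g. the coins {1c, 5c, 10c, 25c}

binary_Y = {-1, +1}                  # binary classification
multiclass_Y = set(range(1, K + 1))  # multiclass: {1, 2, ..., K}


def power_set(items):
    """Return all subsets of items, i.e. the multilabel space 2^{items}."""
    items = sorted(items)
    return [frozenset(c)
            for r in range(len(items) + 1)
            for c in combinations(items, r)]


multilabel_Y = power_set(multiclass_Y)  # multilabel: every subset, including none

print(len(binary_Y), len(multiclass_Y), len(multilabel_Y))  # 2 4 16
```

The multilabel space grows as 2^K, which is one reason it is treated separately from multiclass classification. Regression (Y = R) and image generation (Y ⊂ R^{w×h×c}) have infinite output spaces and cannot be enumerated this way.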
Fun Time
What is this learning problem?
The entrance system of the school gym, which does automatic face
recognition based on machine learning, is built to charge four different
groups of users differently: Staff, Student, Professor, Other. What type
of learning problem best fits the need of the system?
1 binary classification
2 multiclass classification
3 regression
4 structured learning
Reference Answer: 2
There is an ‘explicit’ Y that contains four
classes.
[Figure: learning-flow diagram (unknown target function f: X → Y; hypothesis set H) with a coin scatter plot; axes: Size, Mass]
supervised learning:
every x_n comes with its corresponding y_n
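To make the supervised setting concrete, here is a minimal sketch in which every input (size, mass) comes with its coin label. The measurements are made up for illustration, and the nearest-centroid rule is just one simple learner chosen for brevity, not the method used in the lecture.

```python
# Hypothetical (size, mass) measurements, each paired with its label y_n.
data = [
    ((19.0, 2.5), 1), ((19.2, 2.6), 1),
    ((21.2, 5.0), 5), ((21.0, 4.9), 5),
    ((17.9, 2.3), 10), ((18.0, 2.2), 10),
    ((24.3, 5.7), 25), ((24.1, 5.6), 25),
]


def fit_centroids(examples):
    """Average the inputs of each class; needs a label for every example."""
    sums = {}
    for (size, mass), y in examples:
        s = sums.setdefault(y, [0.0, 0.0, 0])
        s[0] += size
        s[1] += mass
        s[2] += 1
    return {y: (s[0] / s[2], s[1] / s[2]) for y, s in sums.items()}


def predict(centroids, x):
    """Return the class whose centroid is closest to x (squared distance)."""
    return min(
        centroids,
        key=lambda y: (x[0] - centroids[y][0]) ** 2 + (x[1] - centroids[y][1]) ** 2,
    )


centroids = fit_centroids(data)
print(predict(centroids, (24.0, 5.5)))  # 25
```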
[Figures: further coin scatter plots; axes: Size, Mass; coin values 1, 5, 10, 25]
Reinforcement Learning
a ‘very different’ but natural way of learning
(Public Domain, from Wikipedia; used here for educational purposes; all other rights still belong to Google DeepMind)
(CC-BY-SA 3.0 by Stannered on Wikipedia) (CC-BY-SA 2.0 by Frej Bjon on Wikipedia) (Public Domain, from Wikipedia)
(Public Domain, from Wikipedia; used here for educational purposes; all other rights still belong to OpenAI)
GPT-3: Self-Supervised
• mainly next-token prediction from 2048 tokens
• 175 billion parameters trained with 500 billion tokens
ChatGPT: Supervised (Few-Shot) + Supervised (Ranking) + Reinforcement
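Next-token prediction is called self-supervised because the labels come from the text itself. The sketch below shows how one unlabeled sequence yields (context, next token) training pairs. It assumes whitespace-separated words as tokens and a context of 2 tokens standing in for the 2048 mentioned above; `next_token_pairs` is an illustrative name, not an API of any real system.

```python
def next_token_pairs(tokens, context_size):
    """Turn one unlabeled token sequence into (context, next_token) pairs."""
    pairs = []
    for i in range(1, len(tokens)):
        # The context is at most context_size tokens immediately before position i.
        context = tokens[max(0, i - context_size):i]
        pairs.append((tuple(context), tokens[i]))
    return pairs


tokens = "machines learn from data".split()  # assumed word-level tokenization
pairs = next_token_pairs(tokens, context_size=2)
for context, target in pairs:
    print(context, "->", target)
```

No human labeling is involved: each target is simply the token that follows its context in the raw text.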
Fun Time
What is this learning problem?
To build a tree recognition system, a company decides to gather one
million pictures from the Internet. Then, it asks each of the 10
company members to view 100 pictures and record whether each
picture contains a tree. The pictures and records are then fed to a
learning algorithm to build the system. What type of learning problem
does the algorithm need to solve?
1 supervised
2 unsupervised
3 semi-supervised
4 reinforcement
Reference Answer: 3
The 1,000 records are the labeled (x_n, y_n); the
other 999,000 pictures are the unlabeled x_n.
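The counts in the reference answer follow from simple arithmetic, sketched here under the assumption (implicit in the answer) that no two members view the same picture.

```python
total_pictures = 1_000_000
members = 10
pictures_per_member = 100

labeled = members * pictures_per_member  # pictures with a recorded y_n
unlabeled = total_pictures - labeled     # pictures with x_n only

print(labeled, unlabeled)  # 1000 999000
```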
Types of Learning Learning with Different Protocol f ⇒ (x_n, y_n)
[Figure: learning-flow diagram (unknown target function f: X → Y; hypothesis set H) with coin scatter plots; axes: Size, Mass]
real-world ML system
different from textbook settings
Mini Summary
Learning with Different Protocol f ⇒ (x_n, y_n)
• batch: all known data
• online: sequential (passive) data
• online + batch: best of both worlds
• active: strategically-observed data
• ... and more!!
[Figure: learning-flow diagram; unknown target function f: X → Y; hypothesis set H]
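The difference between the batch and online protocols can be sketched with a deliberately trivial learner: a constant predictor that minimizes squared error, which is the mean of the observed outputs. The learner and the function names below are assumptions made for illustration only; the slides do not prescribe any particular algorithm.

```python
def batch_fit(ys):
    """Batch protocol: all data is known before learning starts."""
    return sum(ys) / len(ys)


def online_fit(stream):
    """Online protocol: data arrives one example at a time.

    Yields the current estimate after each example, using the
    incremental mean update so past data need not be stored.
    """
    estimate, n = 0.0, 0
    for y in stream:
        n += 1
        estimate += (y - estimate) / n
        yield estimate


ys = [2.0, 4.0, 6.0, 8.0]
estimates = list(online_fit(ys))
print(batch_fit(ys))  # 5.0
print(estimates)      # [2.0, 3.0, 4.0, 5.0]
```

After the last example the online estimate agrees with the batch one, but the online learner had a usable hypothesis at every step along the way.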
Reference Answer: 3
The algorithm takes an active but naïve
strategy: ask when ‘confused’. You should
probably do the same when taking a class. :-)
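The "ask when confused" strategy can be sketched for a very simple case. The example assumes a noiseless 1-D threshold target (label +1 at or above some unknown threshold, −1 below it); the learner queries a label only when earlier answers do not already imply it. The function names and data are hypothetical.

```python
def active_threshold_learner(xs, oracle):
    """Learn a 1-D threshold, querying labels only for 'confusing' points."""
    lo = float("-inf")  # largest point known to be labeled -1
    hi = float("inf")   # smallest point known to be labeled +1
    queries = 0
    for x in xs:
        if lo < x < hi:  # confused: label not implied by earlier answers
            queries += 1
            if oracle(x) > 0:
                hi = x
            else:
                lo = x
        # Otherwise the label is implied: x <= lo means -1, x >= hi means +1.
    return lo, hi, queries


def oracle(x):
    """Hypothetical labeler with a hidden threshold at 0.5."""
    return +1 if x >= 0.5 else -1


xs = [0.1, 0.9, 0.5, 0.3, 0.7, 0.95, 0.05, 0.45]
lo, hi, queries = active_threshold_learner(xs, oracle)
print(lo, hi, queries)  # 0.45 0.5 5
```

Only 5 of the 8 points needed a label here; a passive learner would have asked for all 8.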
Types of Learning Learning with Different Input Space X
• patient info for cancer diagnosis
• often including ‘human intelligence’ on the learning task
[Figure: coin scatter plot; axes: Size, Mass]
x = (symmetry, density)
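As a sketch of how such concrete features might be computed from a raw digit image, the code below assumes density is the average pixel intensity and symmetry is the negated average difference between the image and its left-right mirror. These definitions are assumptions for illustration; the slide only names the two features.

```python
def density(img):
    """Average pixel intensity of a 2-D image given as a list of rows."""
    pixels = [p for row in img for p in row]
    return sum(pixels) / len(pixels)


def symmetry(img):
    """Negated mean absolute difference between img and its mirror image.

    A perfectly left-right symmetric image scores 0; less symmetric
    images score lower (more negative).
    """
    diffs = [abs(a - b) for row in img for a, b in zip(row, reversed(row))]
    return -sum(diffs) / len(diffs)


# Tiny hypothetical 3x3 'images' with 0/1 intensities.
one_like = [[0, 1, 0],
            [0, 1, 0],
            [0, 1, 0]]
five_like = [[1, 1, 1],
             [1, 0, 0],
             [1, 1, 1]]

x_one = (symmetry(one_like), density(one_like))
x_five = (symmetry(five_like), density(five_like))
print(x_one, x_five)
```

The two 9-pixel raw images are each reduced to a 2-dimensional x, which is where the 'human intelligence' about the task enters.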
[Figure: layered feature extraction for digit recognition; features φ1, ..., φ6 feed the outputs z1 (is it a ‘1’?) and z5 (is it a ‘5’?)]
• layered extraction: simple to complex features
• natural for difficult learning tasks with raw features, like vision
Mini Summary
Learning with Different Input Space X
• concrete: sophisticated (and related) physical meaning
• raw: simple physical meaning
• abstract: no (or little) physical meaning
• ... and more!!
[Figure: learning-flow diagram; unknown target function f: X → Y; hypothesis set H]
Fun Time
What features can be used?
Consider a problem of building an online image advertisement system
that shows the users the most relevant images. What features can you
choose to use?
1 concrete
2 concrete, raw
3 concrete, abstract
4 concrete, raw, abstract
Reference Answer: 4
concrete user features, raw image features,
and maybe abstract user/image IDs
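One way the three kinds of features in this answer could be combined into a single input vector is sketched below. The particular user features (age, click count), the one-hot encoding of the user ID, and all names are hypothetical choices made for illustration.

```python
def one_hot(index, size):
    """Encode an abstract ID as a vector with a single 1.0 at position index."""
    return [1.0 if i == index else 0.0 for i in range(size)]


def build_features(user_age, user_clicks, image_pixels, user_id, num_users):
    """Concatenate concrete, raw, and abstract features into one input x."""
    concrete = [float(user_age), float(user_clicks)]            # concrete user features
    raw = [float(p) for row in image_pixels for p in row]       # raw image pixels
    abstract = one_hot(user_id, num_users)                      # abstract user ID
    return concrete + raw + abstract


x = build_features(
    user_age=30,
    user_clicks=12,
    image_pixels=[[0, 255], [128, 64]],
    user_id=2,
    num_users=5,
)
print(len(x))  # 11
```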