1 ML Overview

Uploaded by

luticia
Copyright
© © All Rights Reserved
Statistical Learning

Machine Learning Overview


Outline
• What is machine learning?
• Supervised Learning
• Classification
• Regression
• Unsupervised Learning
• Clustering
• Reinforcement Learning
Part I: What is machine learning?
What is machine learning?
• Arthur Samuel (1959): Machine learning is the field of study that gives the computer the ability to learn without being explicitly programmed.

https://tung-dn.github.io/programming.html
• Tom Mitchell (1997): A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E.
Taxonomy of ML
• Supervised Learning
• Unsupervised Learning
• Reinforcement Learning
Part II: Supervised Learning
Example 1: Predict whether a user likes a song or not
[Figure: songs rated by user Sharon, plotted by tempo (relaxed to fast) and intensity and marked Like or Dislike; a model learned from these points predicts the label of a new data point, marked "?"]
Example 2: Classify Images
http://www.image-net.org/

Experience/Data: images with labels (indoor, outdoor)
[Figure: labeled images (Label: outdoor, Label: indoor) are divided into training data and test data; learning (i.e., training) uses the training data, and testing on the test data measures performance]
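The training/testing workflow in the figure can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the "average brightness" feature, the toy data, and the threshold rule are all hypothetical and do not come from the slides.

```python
import random

def train_test_split(samples, test_fraction=0.25, seed=0):
    """Shuffle the labeled samples and hold out a fraction for testing."""
    rng = random.Random(seed)
    shuffled = samples[:]
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

def fit_threshold(train):
    """Learning step: place a threshold halfway between the two class means."""
    indoor = [x for x, y in train if y == "indoor"]
    outdoor = [x for x, y in train if y == "outdoor"]
    return (sum(indoor) / len(indoor) + sum(outdoor) / len(outdoor)) / 2

def predict(threshold, x):
    """Predict a label from a single brightness value."""
    return "outdoor" if x > threshold else "indoor"

def accuracy(threshold, data):
    """Performance measure: fraction of correctly labeled samples."""
    correct = sum(predict(threshold, x) == y for x, y in data)
    return correct / len(data)

# Hypothetical data: (average image brightness, label).
data = [(0.20, "indoor"), (0.25, "indoor"), (0.30, "indoor"), (0.35, "indoor"),
        (0.70, "outdoor"), (0.75, "outdoor"), (0.80, "outdoor"), (0.85, "outdoor")]

train, test = train_test_split(data)
threshold = fit_threshold(train)          # uses the training data only
print("test accuracy:", accuracy(threshold, test))
```

The test samples are never shown to `fit_threshold`, so the reported accuracy estimates performance on unseen data.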
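How to represent data?
• Input data: $x \in \mathbb{R}^d$, where $d$ is the feature dimension
• Example: $x = (x_1, x_2)^\top$, where $x_1$ is the tempo and $x_2$ is the intensity
• There can be many features!
[Figure: a song shown as a point x in the tempo (relaxed to fast) vs. intensity plane]
How to represent data?
• Label: $y \in \{0, 1\}$
• The label is where the "supervision" comes from
[Figure: points in the tempo (relaxed to fast) vs. intensity plane marked y=1 and y=0]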
Represent various types of data
• Image
- Pixel values

• Bank account
- Credit rating, balance, number of deposits in the last day, week, month, and year, number of withdrawals
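As a rough sketch of how such data becomes a feature vector, the snippet below flattens a tiny image into pixel values and reads numeric fields from an account record in a fixed order. The 2x2 image, the field names, and the values are hypothetical and chosen only for illustration.

```python
def image_to_features(image):
    """Flatten a 2-D grid of pixel values into a single feature vector."""
    return [pixel for row in image for pixel in row]

def account_to_features(account):
    """Read numeric account fields in a fixed order to form a feature vector."""
    fields = ["credit_rating", "balance", "deposits_last_week", "withdrawals_last_week"]
    return [float(account[name]) for name in fields]

# Hypothetical 2x2 grayscale image (pixel values in 0..255).
image = [[0, 255],
         [128, 64]]

# Hypothetical bank account record.
account = {
    "credit_rating": 720,
    "balance": 1500.50,
    "deposits_last_week": 2,
    "withdrawals_last_week": 5,
}

x_image = image_to_features(image)        # d = 4
x_account = account_to_features(account)  # d = 4
print(len(x_image), x_image)
print(len(x_account), x_account)
```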
Two Types of Supervised Learning Algorithms

• Classification
• Regression
Example of regression: housing price prediction
Given: a dataset that contains $n$ samples $(x_1, y_1), (x_2, y_2), (x_3, y_3), \ldots, (x_n, y_n)$
Task: if a residence has $x$ square feet, predict its price $y \in \mathbb{R}$
[Figure: scatter plot of price against square feet]
Example of regression: housing price prediction

Input x with more features (e.g., lot size)
(credit: Stanford CS229)
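One simple way to make the single-feature version of this prediction is to fit a straight line by least squares. The slides do not prescribe a method here, so treat the following as a sketch under that assumption; the square footage and price values are hypothetical.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept for one feature."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical training samples (x_i, y_i): square feet and price in $1000s.
square_feet = [1000, 1500, 2000, 2500]
price = [210, 290, 410, 490]

slope, intercept = fit_line(square_feet, price)

# Predict the price of a residence that is not in the training data.
predicted = slope * 1800 + intercept
print(f"price = {slope:.3f} * sqft + {intercept:.1f}; 1800 sqft -> {predicted:.1f}")
```

With more features (for example, lot size), the same idea extends to fitting one weight per feature.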
Supervised Learning: More examples
• x = raw pixels of the image
• y = bounding boxes

Russakovsky et al. 2015


Two Types of Supervised Learning Algorithms
• Classification: the label is a discrete variable, $y \in \{1, 2, 3, \ldots, K\}$
• Regression: the label is a continuous variable, $y \in \mathbb{R}$
Training Data for Supervised Learning

Training data is a collection of input instances to the learning algorithm:

$(x_1, y_1), (x_2, y_2), (x_3, y_3), \ldots, (x_n, y_n)$

where each $x_i$ is an input and each $y_i$ is its label.

The training data is the "experience" given to a learning algorithm.

Goal of Supervised Learning

Given training data

$(x_1, y_1), (x_2, y_2), (x_3, y_3), \ldots, (x_n, y_n)$

learn a function mapping $f : X \to Y$ such that $f(x)$ predicts the label $y$ on future data $x$ (not in the training data).
Goal of Supervised Learning

Training set error:
• 0-1 loss for classification: $\ell = \frac{1}{n} \sum_{i=1}^{n} \mathbb{1}(f(x_i) \neq y_i)$
• Squared loss for regression: $\ell = \frac{1}{n} \sum_{i=1}^{n} (f(x_i) - y_i)^2$

A learning algorithm optimizes the training objective

$f^* = \arg\min_f \mathbb{E}_{(x,y)} \, \ell(f(x), y)$

Details in upcoming lectures :)
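The two training set errors can be computed directly from their definitions. In this sketch the predictors `classifier` and `regressor` and the small datasets are hypothetical stand-ins for a learned function f.

```python
def zero_one_loss(f, xs, ys):
    """Fraction of samples where the prediction f(x_i) differs from y_i."""
    return sum(1 for x, y in zip(xs, ys) if f(x) != y) / len(xs)

def squared_loss(f, xs, ys):
    """Mean of the squared differences (f(x_i) - y_i)^2."""
    return sum((f(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

# Hypothetical classifier: predict 1 when x > 0.5, otherwise 0.
def classifier(x):
    return 1 if x > 0.5 else 0

xs_cls = [0.1, 0.4, 0.6, 0.9]
ys_cls = [0, 1, 1, 1]            # the second sample will be misclassified

# Hypothetical regressor: f(x) = 2x.
def regressor(x):
    return 2 * x

xs_reg = [1.0, 2.0, 3.0]
ys_reg = [2.0, 5.0, 6.0]         # the second sample is off by 1

cls_error = zero_one_loss(classifier, xs_cls, ys_cls)
reg_error = squared_loss(regressor, xs_reg, ys_reg)
print("0-1 loss:", cls_error)
print("squared loss:", reg_error)
```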
Quiz Break
Q1-1: Which is true about feature vectors?

A. Feature vectors can have at most 10 dimensions


B. Feature vectors have only numeric values
C. The raw image can also be used as the feature vector
D. Text data don’t have feature vectors
Answer: C
A. Feature vectors can be high-dimensional.
B. Some feature vectors can have other types of values, such as strings.
D. Bag-of-words is a type of feature vector for text.
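To make the bag-of-words idea concrete, here is a minimal sketch that turns text into a vector of word counts. The example sentences are hypothetical, and splitting on whitespace is a simplifying assumption; real tokenizers handle punctuation and other details.

```python
from collections import Counter

def build_vocabulary(documents):
    """Collect every distinct lowercase word, sorted so positions are stable."""
    return sorted({word for doc in documents for word in doc.lower().split()})

def bag_of_words(document, vocabulary):
    """Count how often each vocabulary word appears in the document."""
    counts = Counter(document.lower().split())
    return [counts[word] for word in vocabulary]

# Hypothetical documents.
documents = [
    "the song is fast",
    "the song is relaxed and the tempo is slow",
]

vocabulary = build_vocabulary(documents)
vectors = [bag_of_words(doc, vocabulary) for doc in documents]

print(vocabulary)
for vector in vectors:
    print(vector)
```

Each document becomes a vector whose dimension equals the vocabulary size, regardless of the document's length.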
Quiz Break
Q1-2: Which of the following is not a common task of supervised learning?

A. Object detection (predicting bounding boxes from raw images)
B. Classification
C. Regression
D. Dimensionality reduction

Answer: D. Dimensionality reduction is typically an unsupervised task.
Part II: Unsupervised Learning
(no teacher)
Unsupervised Learning
• Given: a dataset with no labels, $x_1, x_2, \ldots, x_n$
• Goal: discover interesting patterns and structures in the data
[Figure: two scatter plots of intensity against tempo; the labels y=1 and y=0 appear on one of them]
Clustering
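• Given: a dataset with no labels, $x_1, x_2, \ldots, x_n$
• Output: a division of the data into clusters such that points within a cluster are similar (intra-cluster similarity) and points in different clusters are dissimilar (inter-cluster dissimilarity)
[Figure: scatter plot of intensity against tempo]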
Clustering

Clustering Irises using three different features

The colors represent clusters identified by the algorithm, not y's provided as input
Clustering
• You probably have >1000 digital photos stored on your phone
• After this class you will be able to organize them better
(based on visual similarity)
Clustering Genes
Clustering Words with Similar Meanings

[Arora-Li-Liang-Ma-Risteski, TACL’17,18]
How do we perform clustering?
• Many clustering algorithms. We will look at the two most
frequently used ones:
• K-means clustering: we specify the desired number of
clusters, and use an iterative algorithm to find them
• Hierarchical clustering: we build a binary tree over the
dataset
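The slides leave the details of hierarchical clustering to the next lecture. As a rough preview only, the sketch below builds a binary tree bottom-up by repeatedly merging the two closest clusters. The choice of single linkage (distance between the closest pair of points), the one-dimensional data, and the nested-tuple tree representation are all assumptions made for illustration.

```python
def leaves(tree):
    """Return the data points stored under a tree node."""
    if isinstance(tree, tuple):
        return leaves(tree[0]) + leaves(tree[1])
    return [tree]

def single_linkage_distance(a, b):
    """Distance between two clusters: the closest pair of points."""
    return min(abs(x - y) for x in leaves(a) for y in leaves(b))

def agglomerative(points):
    """Merge the two closest clusters until a single binary tree remains."""
    clusters = list(points)            # every point starts as its own leaf
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = single_linkage_distance(clusters[i], clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        merged = (clusters[i], clusters[j])   # internal node with two children
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters[0]

# Hypothetical one-dimensional data.
points = [1.0, 1.5, 5.0, 5.2, 9.0]
tree = agglomerative(points)
print(tree)
```

Unlike K-means, this procedure does not need the number of clusters up front; cutting the tree at different depths yields different numbers of clusters.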
K-means clustering
• Very popular clustering method
• Don't confuse it with the k-NN classifier
• Input: a dataset $x_1, x_2, \ldots, x_n$; the number of clusters $k$ is assumed to be given
K-means clustering
Step 1: Randomly pick 2 positions as the initial cluster centers (not necessarily data points).
Step 2: For each point x, determine its cluster: find the closest center in Euclidean space.
Step 3: Update each cluster center to the centroid of the points assigned to it.
Repeat steps 2 and 3 until convergence.

Converged solution! No labels required!
[Figure: each step illustrated on a scatter plot of intensity against tempo]
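The steps above can be sketched as a short Python function. This is a minimal illustration, not a production implementation: the data points are hypothetical, and the initial centers are fixed by hand rather than picked at random so that the run is reproducible.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def closest_center(point, centers):
    """Step 2: index of the center nearest to the point."""
    distances = [squared_distance(point, c) for c in centers]
    return distances.index(min(distances))

def centroid(points):
    """Step 3: coordinate-wise mean of a group of points."""
    n = len(points)
    return tuple(sum(coord) / n for coord in zip(*points))

def k_means(points, initial_centers, max_iters=100):
    """Alternate steps 2 and 3 until the centers stop changing."""
    centers = list(initial_centers)
    assignments = []
    for _ in range(max_iters):
        assignments = [closest_center(p, centers) for p in points]
        new_centers = []
        for j, old_center in enumerate(centers):
            members = [p for p, a in zip(points, assignments) if a == j]
            # Keep the old center if no point was assigned to it.
            new_centers.append(centroid(members) if members else old_center)
        if new_centers == centers:     # converged
            break
        centers = new_centers
    return centers, assignments

# Hypothetical (tempo, intensity) points forming two groups.
points = [(1, 1), (1, 2), (2, 1), (8, 8), (8, 9), (9, 8)]
initial_centers = [(0.0, 0.0), (10.0, 10.0)]   # step 1, fixed here for reproducibility

centers, assignments = k_means(points, initial_centers)
print(centers)
print(assignments)
```

Note that no labels are passed in; the cluster indices in `assignments` are produced by the algorithm itself.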
K-means clustering: A demo
https://www.naftaliharris.com/blog/visualizing-k-means-clustering/
Hierarchical Clustering (more to follow next lecture)
Quiz Break
Q2-1: Which is true about machine learning?

A. The process doesn’t involve human inputs


B. The machine is given the training and test data for learning
C. In clustering, the training data also have labels for learning
D. Supervised learning involves labeled data
Answer: D
A. The labels are human inputs.
B. The machine should not have test data for learning.
C. No labels are available for clustering.
Quiz Break
Q2-2: Which is true about unsupervised learning?

A. There are only 2 unsupervised learning algorithms
B. K-means clustering is a type of hierarchical clustering
C. The K-means algorithm automatically determines the number of clusters k
D. Unsupervised learning is widely used in many applications

Answer: D
Part III: Reinforcement Learning
(Learn from reward)
Reinforcement Learning
• Given: an agent that can take actions and a reward function specifying how good an action is.
• Goal: learn to choose actions that maximize the total future reward.

Google DeepMind
Reinforcement Learning Key Problems
1. Problem: actions may have delayed effects.
• Requires credit-assignment
2. Problem: maximal reward action is unknown
• Exploration-exploitation trade-off

“...the problem [exploration-exploitation] was proposed [by British scientist] to be dropped over Germany so that German scientists could also waste their time on it.”
- Peter Whittle

Multi-armed Bandit
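The exploration-exploitation trade-off can be seen in a small simulation of a multi-armed bandit. The sketch below uses the epsilon-greedy strategy, which is one common approach and is not named in the slides; the arm reward probabilities, `epsilon`, and the number of steps are hypothetical choices.

```python
import random

def epsilon_greedy_bandit(true_means, steps=5000, epsilon=0.1, seed=0):
    """Play a Bernoulli bandit, exploring with probability epsilon."""
    rng = random.Random(seed)
    n_arms = len(true_means)
    counts = [0] * n_arms          # how often each arm was pulled
    estimates = [0.0] * n_arms     # running average reward per arm
    total_reward = 0.0
    for _ in range(steps):
        if rng.random() < epsilon:
            arm = rng.randrange(n_arms)               # explore: random arm
        else:
            arm = estimates.index(max(estimates))     # exploit: best estimate so far
        # The agent only sees the sampled reward, never true_means.
        reward = 1.0 if rng.random() < true_means[arm] else 0.0
        counts[arm] += 1
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
        total_reward += reward
    return estimates, counts, total_reward

# Hypothetical arms with unknown (to the agent) success probabilities.
estimates, counts, total_reward = epsilon_greedy_bandit([0.2, 0.5, 0.8])
print("estimated means:", [round(e, 2) for e in estimates])
print("pull counts:", counts)
print("total reward:", total_reward)
```

Setting `epsilon` to 0 removes exploration entirely, so the agent can get stuck on an arm that only looked best early on; setting it to 1 ignores what has been learned.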
Today’s recap
• What is machine learning?
• Supervised Learning
• Classification
• Regression
• Unsupervised Learning
• Reinforcement Learning
Thanks!
