
1

Introduction to machine learning
CSCI-P 556
ZORAN TIGANJ
2
Today

- Chapter 1 from the Hands-On ML textbook (The Machine Learning Landscape)
- Applications of ML
- Types of ML
- Main challenges in ML
- Next time we will cover Chapter 2 (End-to-End Machine Learning Project).
3
Announcements

- Download slides and watch videos of previous lectures on Canvas
- See all deadlines for assignments on Canvas
- Stay tuned for our first quiz: it will be released today (with a deadline in a week).
- Use the Q&A community to ask questions
4
Examples of Applications

- Analyzing images of products on a production line to automatically classify them
- Detecting tumors in brain scans
- Automatically classifying news articles
- Automatically flagging offensive comments on discussion forums
- Summarizing long documents automatically
- Creating a chatbot or a personal assistant
- Forecasting your company’s revenue next year, based on many performance metrics
- Making your app react to voice commands
- Detecting credit card fraud
- Segmenting clients based on their purchases so that you can design a different marketing strategy for each segment
5
Examples of Applications

- Representing a complex, high-dimensional dataset in a clear and insightful diagram
- Recommending a product that a client may be interested in, based on past purchases
- Building an intelligent bot for a game
6
Common divisions of ML algorithms

- The most common division of ML algorithms is based on the amount and type of supervision:
  - supervised learning
  - unsupervised learning
  - semisupervised learning
  - reinforcement learning
7
Supervised learning

- In supervised learning, the training set you feed to the algorithm includes the desired solutions, called labels.
9
Supervised learning algorithms

- Here are some of the most important supervised learning algorithms:
  - k-Nearest Neighbors
  - Linear Regression
  - Logistic Regression
  - Support Vector Machines (SVMs)
  - Decision Trees and Random Forests

(Margin note: unsupervised learning, by contrast, covers k-means clustering, anomaly detection, and visualization & dimensionality reduction.)
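As a minimal sketch of the supervised setup (an illustration, not from the slides): in scikit-learn, the textbook's library, training means fitting a model on features together with their labels; the Iris dataset here is just a stand-in for any labeled data.

```python
# Supervised learning sketch: the training data includes the labels y.
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)        # features X and desired labels y

clf = LogisticRegression(max_iter=1000)  # one of the algorithms listed above
clf.fit(X, y)                            # the algorithm sees the labels
print(clf.predict(X[:3]))                # predicted classes for new instances
```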
12
Unsupervised learning

- In unsupervised learning, the training data is unlabeled.
- Here are some of the most important unsupervised learning algorithms:
  - Clustering
    - K-Means
  - Anomaly detection and novelty detection
    - One-class SVM
    - Isolation Forest
  - Visualization and dimensionality reduction
    - Principal Component Analysis (PCA)
    - Kernel PCA
    - Locally Linear Embedding (LLE)
    - t-Distributed Stochastic Neighbor Embedding (t-SNE)
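A minimal clustering sketch, assuming scikit-learn and synthetic data: k-means is given only the feature matrix, never any labels.

```python
# Unsupervised learning sketch: no labels are passed to fit_predict.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(42)
X = np.vstack([rng.normal(0, 1, (50, 2)),    # one synthetic blob...
               rng.normal(5, 1, (50, 2))])   # ...and another, farther away

kmeans = KMeans(n_clusters=2, n_init=10, random_state=42)
clusters = kmeans.fit_predict(X)             # groups discovered from X alone
print(kmeans.cluster_centers_)
```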
13
Unsupervised learning: data visualization with t-SNE
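The original slide shows a figure only; as an illustrative sketch, t-SNE in scikit-learn projects a high-dimensional dataset (here, 64-pixel digit images) down to 2D for plotting.

```python
# Dimensionality reduction sketch: 64-D digits squeezed into 2-D for a scatter plot.
from sklearn.datasets import load_digits
from sklearn.manifold import TSNE

X, _ = load_digits(return_X_y=True)               # 1797 images, 64 features each
X_2d = TSNE(n_components=2, random_state=42).fit_transform(X)
print(X_2d.shape)                                 # (1797, 2)
```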
14
Unsupervised learning: anomaly detection
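This slide is also figure-only; here is a hedged sketch of anomaly detection with scikit-learn's Isolation Forest on synthetic data.

```python
# Anomaly detection sketch: IsolationForest flags points that are easy to isolate.
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)
X = rng.normal(0, 1, (200, 2))
X[:3] += 8                                  # plant three obvious outliers

iso = IsolationForest(random_state=0).fit(X)
print(iso.predict(X)[:5])                   # -1 marks anomalies, +1 normal points
```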
15
Semisupervised learning

- Semisupervised learning: plenty of unlabeled instances, and a few labeled instances.
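A minimal semisupervised sketch (an illustration, not the lecture's method): scikit-learn's LabelPropagation treats -1 as "unlabeled" and spreads the few known labels to the rest.

```python
# Semisupervised sketch: many unlabeled instances (-1), few labeled ones.
import numpy as np
from sklearn.datasets import load_iris
from sklearn.semi_supervised import LabelPropagation

X, y = load_iris(return_X_y=True)
rng = np.random.default_rng(0)
y_partial = y.copy()
y_partial[rng.random(len(y)) < 0.9] = -1      # hide ~90% of the labels

model = LabelPropagation().fit(X, y_partial)
print((model.transduction_ == y).mean())      # labels inferred for all instances
```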
16
Reinforcement Learning

- The learning system, called an agent, can observe the environment, select and perform actions, and get rewards in return (or penalties in the form of negative rewards).
- It must then learn by itself what the best strategy, called a policy, is to get the most reward over time.
- A policy defines what action the agent should choose when it is in a given situation.
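To make the agent/policy vocabulary concrete, here is a tiny tabular sketch (a hypothetical 5-state, 2-action environment, not from the lecture): the Q-table estimates long-run reward, and the policy maps a situation to an action.

```python
# Reinforcement learning sketch: an epsilon-greedy policy over a Q-table.
import numpy as np

n_states, n_actions = 5, 2           # hypothetical toy environment
Q = np.zeros((n_states, n_actions))  # estimated long-run reward per (state, action)
rng = np.random.default_rng(0)

def policy(state, epsilon=0.1):
    """The policy: which action to choose in a given situation."""
    if rng.random() < epsilon:
        return int(rng.integers(n_actions))  # explore at random
    return int(np.argmax(Q[state]))          # exploit current knowledge

def update(s, a, reward, s_next, alpha=0.5, gamma=0.9):
    """After acting and observing a reward, improve the estimate (Q-learning)."""
    Q[s, a] += alpha * (reward + gamma * Q[s_next].max() - Q[s, a])
```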
17
Batch and Online Learning

- In batch learning, the system is incapable of learning incrementally: it must be trained using all the available data.
- If you want a batch learning system to know about new data (such as a new type of spam), you need to train a new version of the system from scratch on the full dataset (not just the new data, but also the old data).
18
Batch and Online Learning

- In online learning, you train the system incrementally by feeding it data instances sequentially, either individually or in small groups called mini-batches.
- Each learning step is fast and cheap, so the system can learn about new data on the fly, as it arrives.

(Margin notes: batch learning trains on all available data at once; online learning trains incrementally by feeding in data sequentially, in mini-batches; out-of-core learning handles vast quantities of data that cannot fit in a computer's main memory by chopping the data into mini-batches and using online learning techniques to learn from them.)
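A minimal online-learning sketch, assuming scikit-learn's SGDClassifier and synthetic mini-batches: partial_fit performs one fast, cheap learning step per batch, so data can arrive on the fly.

```python
# Online learning sketch: the model is updated one mini-batch at a time.
import numpy as np
from sklearn.linear_model import SGDClassifier

clf = SGDClassifier()
classes = np.array([0, 1])           # all classes must be declared up front
rng = np.random.default_rng(0)

for _ in range(100):                 # mini-batches arriving sequentially
    X_batch = rng.normal(size=(32, 4))
    y_batch = (X_batch.sum(axis=1) > 0).astype(int)
    clf.partial_fit(X_batch, y_batch, classes=classes)  # one cheap step
```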
19
Instance-Based Versus Model-Based Learning

- Instance-based learning: the system learns the examples by heart, then generalizes to new cases by using a similarity measure to compare them to the learned examples (or a subset of them).
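A minimal instance-based sketch with made-up points: k-NN stores the training examples and classifies a new case by similarity (here, distance) to them.

```python
# Instance-based sketch: no model parameters are fit; stored examples decide.
from sklearn.neighbors import KNeighborsClassifier

X_train = [[0, 0], [0, 1], [5, 5], [5, 6]]   # examples learned "by heart"
y_train = [0, 0, 1, 1]

knn = KNeighborsClassifier(n_neighbors=3).fit(X_train, y_train)
print(knn.predict([[4.8, 5.2]]))             # nearest stored examples vote: [1]
```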
20
Instance-Based Versus Model-Based Learning

- Model-based learning: build a model of the available examples and then use that model to make predictions.
21
Model-based learning: example

- Suppose you want to know if money makes people happy, so you download the Better Life Index data from the OECD’s website and stats about gross domestic product (GDP) per capita from the IMF’s website.
22
Model-based learning: example

Model selection: we select a simple linear model with two parameters, θ0 and θ1:
life_satisfaction = θ0 + θ1 × GDP_per_capita
25
Model-based learning: example

- Now we can use the model to predict life satisfaction.
- For example, say you want to know how happy Cypriots are, and the OECD data does not have the answer. Fortunately, you can use your model to make a good prediction: you look up Cyprus’s GDP per capita, find $22,587, then apply your model and find that life satisfaction is likely to be somewhere around 4.85 + 22,587 × 4.91 × 10⁻⁵ ≈ 5.96.
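A sketch of the whole model-based workflow in scikit-learn; the four GDP/satisfaction pairs below are placeholders, not the actual OECD/IMF data, while the final line reproduces the slide's prediction with its fitted parameters θ0 = 4.85 and θ1 = 4.91 × 10⁻⁵.

```python
# Model-based sketch: fit a linear model, then use it to predict.
from sklearn.linear_model import LinearRegression

X = [[9054], [27195], [37675], [50962]]  # GDP per capita (placeholder values)
y = [5.1, 5.9, 6.5, 7.3]                 # life satisfaction (placeholder values)

model = LinearRegression().fit(X, y)     # learns theta0 (intercept), theta1 (slope)

# The slide's fitted parameters give the Cyprus prediction directly:
print(4.85 + 22_587 * 4.91e-5)           # ≈ 5.96
```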
26
27
Main Challenges of Machine Learning

- Insufficient Quantity of Training Data
  - In a famous paper published in 2001, Microsoft researchers Michele Banko and Eric Brill showed that very different Machine Learning algorithms, including fairly simple ones, performed almost identically well on a complex problem of natural language disambiguation once they were given enough data.
28
Main Challenges of Machine Learning

- Nonrepresentative Training Data (sampling bias)
29
Main Challenges of Machine Learning

- Poor-Quality Data
  - If some instances are clearly outliers, it may help to simply discard them or try to fix the errors manually.
  - If some instances are missing a few features (e.g., 5% of your customers did not specify their age), you must decide whether you want to ignore this attribute altogether, ignore these instances, fill in the missing values (e.g., with the median age), or train one model with the feature and one model without it.
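For the "fill in the missing values with the median" option, a minimal sketch using scikit-learn's SimpleImputer on a made-up age/income table:

```python
# Missing-feature sketch: fill a missing age with the column median.
import numpy as np
from sklearn.impute import SimpleImputer

X = np.array([[25.0, 50_000.0],
              [np.nan, 62_000.0],   # this customer did not specify an age
              [41.0, 48_000.0]])

imputer = SimpleImputer(strategy="median")
X_filled = imputer.fit_transform(X)
print(X_filled[1, 0])                # 33.0, the median of 25 and 41
```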
30
Main Challenges of Machine Learning

- Irrelevant Features:
  - Garbage in, garbage out.
  - Feature engineering can help (selecting the most useful features to train on among existing features).
31
Main Challenges of Machine Learning

- Overfitting the Training Data
  - Example: a high-degree polynomial life satisfaction model that strongly overfits the training data. Even though it performs much better on the training data than the simple linear model, it won't generalize well to new cases.
  - Constraining a model to make it simpler and reduce the risk of overfitting is called regularization.
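A hedged sketch of regularization: the same high-degree polynomial features, but Ridge's alpha penalty constrains the weights; the data and hyperparameter values here are illustrative.

```python
# Regularization sketch: Ridge (alpha > 0) keeps a degree-10 model simple.
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(0)
X = rng.uniform(0, 3, (20, 1))                    # 20 illustrative points
y = 2 * X.ravel() + rng.normal(0, 0.5, 20)

model = make_pipeline(PolynomialFeatures(degree=10), Ridge(alpha=1.0))
model.fit(X, y)                                   # alpha shrinks the 11 weights
```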
32
Main Challenges of Machine Learning

- Underfitting the Training Data
  - The opposite of overfitting: it occurs when your model is too simple to learn the underlying structure of the data.
  - For example, a linear model of life satisfaction is prone to underfit; reality is just more complex than the model, so its predictions are bound to be inaccurate, even on the training examples.

(Recap of the main challenges: insufficient quantity of training data; non-representative training data (sampling bias); poor quality of data (outliers, missing features); irrelevant features; overfitting the training data, where the model works well on the training data but won't generalize; underfitting the training data.)

Feature extraction: combining existing features to produce a more useful one (as we saw earlier, dimensionality reduction algorithms can help).
33
Testing and Validating

- To ensure that our algorithm is not overfitting the training data, it is common to split the data into two sets:
  - the training set and
  - the test set
- You train your model using the training set, and you test it using the test set.
- The error rate on new cases is called the generalization error (or out-of-sample error).

(Note: overfitting shows up as a low training error but a high generalization error.)
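A minimal sketch of the split, assuming scikit-learn: the gap between the two scores below is exactly the overfitting symptom just noted.

```python
# Train/test split sketch: estimate the generalization error on held-out data.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)         # 80% train, 20% test

model = DecisionTreeClassifier().fit(X_train, y_train)
print("training accuracy:", model.score(X_train, y_train))
print("test accuracy:   ", model.score(X_test, y_test))  # generalization estimate
```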
34
Hyperparameter Tuning and Model Selection

- How do you choose the value of the regularization hyperparameter?
  - One option is to train 100 different models using 100 different values for this hyperparameter.
  - If you do this and evaluate the models on the test set, it is likely going to lead to overfitting.
  - The problem is that you measured the generalization error multiple times on the test set, and you adapted the model and hyperparameters to produce the best model for that particular set.
  - A common solution to this problem is called holdout validation: you simply hold out part of the training set to evaluate several candidate models and select the best one.

We create a validation set to tune the hyperparameters.

(Note: a hyperparameter is a parameter of the learning algorithm itself, not of the model, e.g., the amount of regularization to apply, the learning rate, the batch size, or the number of epochs.)
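A minimal holdout-validation sketch (illustrative dataset and candidate values): hyperparameters are compared on the validation set, and the test set is touched exactly once at the end.

```python
# Holdout validation sketch: tune on a validation set, not on the test set.
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)
X_tr, X_val, y_tr, y_val = train_test_split(X_train, y_train, random_state=42)

best_C, best_score = None, -1.0
for C in [0.01, 0.1, 1.0, 10.0]:                  # candidate hyperparameter values
    model = LogisticRegression(C=C, max_iter=1000).fit(X_tr, y_tr)
    score = model.score(X_val, y_val)             # evaluated on validation only
    if score > best_score:
        best_C, best_score = C, score

final = LogisticRegression(C=best_C, max_iter=1000).fit(X_train, y_train)
print("test accuracy:", final.score(X_test, y_test))  # test set used exactly once
```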
36
Next time

- Chapter 2 of the textbook: End-to-End Machine Learning Project
