Lec1 - Introduction

The document provides an introduction to machine learning, defining it as the study of algorithms that improve their performance with experience. It discusses various types of learning, including supervised, unsupervised, and reinforcement learning, along with practical applications such as spam filtering and autonomous cars. Additionally, it outlines the process of framing a learning problem, training models, and evaluating their performance.

Introduction to Machine Learning

Book
Course Assessment
Course total grade 150

Grading:
- Attendance 10
- Lab + Assignments 20
- Midterm Exam 30
- Final Exam 90
What is Machine Learning?
“Learning is any process by which a system improves
performance from experience.”
- Herbert Simon

Definition by Tom Mitchell (1998):


Machine Learning is the study of algorithms that
• improve their performance P
• at some task T
• with experience E.
A well-defined learning task is given by <P, T, E>.
Traditional Programming

Data + Program → Computer → Output

Machine Learning

Data + Output → Computer → Program

Some examples of tasks that are best solved
by using a learning algorithm
• Recognizing patterns:
– Facial identities or facial expressions
– Handwritten or spoken words
– Medical images
• Generating patterns:
– Generating images or motion sequences
• Recognizing anomalies:
– Unusual credit card transactions
– Unusual patterns of sensor readings in a nuclear power plant
• Prediction:
– Future stock prices or currency exchange rates
State of the Art Applications of
Machine Learning

Example (Spam Filter)
Traditional way of Programming
1. First you would look at what spam typically looks like. You might notice that
some words or phrases (such as “4U,” “credit card,” “free,” and “amazing”)
tend to come up a lot in the subject.
2. You would write a detection algorithm for each of the patterns that you
noticed, and your program would flag emails as spam if a number of these
patterns are detected.
3. You would test your program, and repeat steps 1 and 2 until it is good enough.
Example (Spam Filter)
Machine Learning Approach

In contrast, a spam filter based on Machine Learning techniques automatically learns
which words and phrases are good predictors of spam by detecting unusually frequent
patterns of words in the spam examples compared to the ham examples.
The program is much shorter, easier to maintain, and most likely more accurate.
Example (Spam Filter)
Machine Learning Approach: Automatically Adapting to Change

If spammers notice that all their emails containing “4U” are blocked, they
might start writing “For U” instead. A spam filter using traditional programming
techniques would need to be updated to flag “For U” emails. If spammers keep
working around your spam filter, you will need to keep writing new rules forever.
In contrast, a spam filter based on Machine Learning techniques automatically notices
that “For U” has become unusually frequent in spam flagged by users, and it starts
flagging them without your intervention.
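
The idea of learning which words predict spam can be sketched in a few lines of Python. This is only a minimal illustration, not a real spam filter: the example emails, the `word_rates` helper and the smoothing constant 0.1 are all assumptions made up for this sketch.

```python
from collections import Counter

# Toy labeled emails (made-up data, for illustration only).
spam = ["free credit card 4U", "amazing free offer 4U", "free amazing prize"]
ham = ["meeting notes attached", "lunch tomorrow", "free to meet tomorrow"]

def word_rates(emails):
    """Return the fraction of emails in which each lowercase word appears."""
    counts = Counter()
    for text in emails:
        counts.update(set(text.lower().split()))
    return {word: n / len(emails) for word, n in counts.items()}

spam_rates = word_rates(spam)
ham_rates = word_rates(ham)

# Score each spam word by how much more often it appears in spam than in ham.
# The 0.1 smoothing term (an arbitrary choice) avoids division by zero.
scores = {
    word: (rate + 0.1) / (ham_rates.get(word, 0.0) + 0.1)
    for word, rate in spam_rates.items()
}
top_words = sorted(scores, key=scores.get, reverse=True)[:3]
print(top_words)
```

Note that "free" gets a lower score than "4u" here because it also appears in a ham email. If spammers switch to a new phrase, recomputing the scores on newly flagged emails picks it up without any hand-written rule.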
Example (Spam Filter)

A well-defined learning task is given by <P, T, E>.

T: The task is to flag spam among new emails.
E: The experience is the training data.
P: The performance measure could be, for example, the ratio of correctly
classified emails.
Autonomous Cars

• Nevada made it legal for autonomous cars to drive on roads in June 2011
• As of 2013, four states (Nevada, Florida, California, and Michigan) have
legalized autonomous cars

Penn’s Autonomous Car (Ben Franklin Racing Team)
Autonomous Car Sensors

Machine Learning in
Automatic Speech Recognition
A Typical Speech Recognition System

ML is used to predict phone states from the sound spectrogram

Deep learning has state-of-the-art results

# Hidden Layers        1     2     4     8     10    12
Word Error Rate (%)    16.0  12.8  11.4  10.9  11.0  11.1

Baseline GMM performance = 15.4%

[Zeiler et al. “On rectified linear units for speech recognition” ICASSP 2013]
Types of Learning
• Supervised (inductive) learning
– Given: training data + desired outputs (labels)
• Unsupervised learning
– Given: training data (without desired outputs)
• Semi-supervised learning
– Given: training data + a few desired outputs
• Reinforcement learning
– Rewards from sequence of actions
Supervised Learning
Supervised learning is the ML setting in which the training
data you feed to the algorithm includes the desired
solutions, called labels. It is divided into:
• Regression
• Classification
Supervised Learning: Regression
• Given (x1, y1), (x2, y2), ..., (xn, yn)
• Learn a function f (x) to predict y given x
– y is real-valued == regression
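
As a rough sketch of regression, the snippet below fits a straight line f(x) to synthetic (x, y) pairs with NumPy. The data, the noise level and the choice of a degree-1 polynomial are assumptions for illustration only.

```python
import numpy as np

# Synthetic (x, y) pairs: y is roughly 2x + 1 plus small noise (made-up data).
rng = np.random.default_rng(0)
x = np.linspace(0.0, 10.0, 50)
y = 2.0 * x + 1.0 + rng.normal(scale=0.5, size=x.shape)

# Fit a degree-1 polynomial, i.e. f(x) = slope * x + intercept.
slope, intercept = np.polyfit(x, y, deg=1)

def f(x_new):
    """Predict a real-valued y for a new x."""
    return slope * x_new + intercept

print(slope, intercept, f(12.0))
```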
[Figure: September Arctic Sea Ice Extent (1,000,000 sq km) by year, 1970 to 2020]
Supervised Learning
Classification example:

The spam filter is a good example of this: it is trained with many example emails along
with their class (spam or ham), and it must learn how to classify new emails.

A labeled training set for supervised learning (e.g., spam classification)


Supervised Learning: Classification
• Given (x1, y1), (x2, y2), ..., (xn, yn)
• Learn a function f (x) to predict y given x
– y is categorical == classification
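
A minimal classification sketch, assuming scikit-learn is available: a logistic regression model learns to predict a categorical label (0 = benign, 1 = malignant) from a single feature. The tumor sizes below are invented for illustration and are not real medical data.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Made-up tumor sizes (cm) with labels: 0 = benign, 1 = malignant.
X = np.array([[1.0], [1.5], [2.0], [2.5], [3.5], [4.0], [4.5], [5.0]])
y = np.array([0, 0, 0, 0, 1, 1, 1, 1])

clf = LogisticRegression()
clf.fit(X, y)

# Classify two new tumors, one small and one large.
predictions = clf.predict(np.array([[1.2], [4.8]]))
print(predictions)
```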
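[Figure: Breast Cancer (Malignant / Benign). The label (1 = Malignant, 0 = Benign) is plotted against Tumor Size; along the Tumor Size axis, one region is marked "Predict Benign" and the other "Predict Malignant".]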
Supervised Learning
• x can be multi-dimensional
– Each dimension corresponds to an attribute

- Clump Thickness
- Uniformity of Cell Size
- Uniformity of Cell Shape

[Figure: Age versus Tumor Size]
Unsupervised Learning
In unsupervised learning, as you might guess, the
training data is unlabeled.
The model tries to learn without a teacher, so it must
find structure and patterns in the data on its own.
Given: input
Output: hidden structure

An unlabeled training set for unsupervised learning
Unsupervised Learning
• Given x1, x2, ..., x n (without labels)
• Output hidden structure behind the x’s
– E.g., clustering
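
Clustering can be sketched with scikit-learn's `KMeans`. The two synthetic blobs and the choice of two clusters are assumptions for this example; in practice the number of clusters is usually not known in advance.

```python
import numpy as np
from sklearn.cluster import KMeans

# Two synthetic groups of unlabeled 2-D points (made-up data).
rng = np.random.default_rng(0)
blob_a = rng.normal(loc=[0.0, 0.0], scale=0.3, size=(20, 2))
blob_b = rng.normal(loc=[5.0, 5.0], scale=0.3, size=(20, 2))
X = np.vstack([blob_a, blob_b])

# Ask k-means to find two clusters; no labels are given.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)
labels = kmeans.fit_predict(X)
print(labels)
```

The cluster ids themselves (0 or 1) are arbitrary; what matters is that points from the same blob receive the same id.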
Unsupervised Learning
• Independent component analysis – separate a combined
signal into its original sources
Reinforcement Learning
• Given a sequence of states and actions with (delayed)
rewards, output a policy
– Policy is a mapping from states → actions that tells you what to
do in a given state
• Examples:
– Credit assignment problem
– Game playing
– Robot in a maze
– Balance a pole on your hand

Reinforcement Learning
Key components of Reinforcement learning:
Agent: The learner or decision-maker.

Environment: The space in which the agent operates.

State: A representation of the current situation or condition in which the agent is.

Actions: What the agent can do or the choices it can make.

Reward: Feedback from the environment in response to the agent’s actions.

Policy: A strategy used by the agent to determine its actions based on the current state.
Reinforcement Learning
The Agent-Environment Interface

Agent and environment interact at discrete time steps: $t = 0, 1, 2, \ldots$

Agent observes state at step $t$: $s_t \in S$
produces action at step $t$: $a_t \in A(s_t)$
gets resulting reward: $r_{t+1} \in \mathbb{R}$
and resulting next state: $s_{t+1}$

$\cdots \; s_t \xrightarrow{a_t} r_{t+1},\, s_{t+1} \xrightarrow{a_{t+1}} r_{t+2},\, s_{t+2} \xrightarrow{a_{t+2}} r_{t+3},\, s_{t+3} \xrightarrow{a_{t+3}} \cdots$
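
The interaction loop can be sketched as follows. The environment here is a hypothetical one-dimensional corridor invented for this example, and the policy is purely random rather than learned, so the code only illustrates the state, action, reward cycle.

```python
import random

def step(state, action):
    """Hypothetical environment: a corridor with positions 0..4.
    Reaching position 4 gives reward 1 and ends the episode."""
    next_state = max(0, min(4, state + action))
    reward = 1.0 if next_state == 4 else 0.0
    done = next_state == 4
    return next_state, reward, done

def policy(state):
    """A fixed random policy: move left (-1) or right (+1)."""
    return random.choice([-1, 1])

random.seed(0)
state, total_reward, t = 0, 0.0, 0
done = False
while not done and t < 1000:
    action = policy(state)                      # agent produces a_t from s_t
    state, reward, done = step(state, action)   # environment returns s_{t+1}, r_{t+1}
    total_reward += reward
    t += 1
print(t, total_reward)
```

A learning agent would additionally use the observed rewards to improve `policy` over time.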
Framing a Learning Problem
Machine learning model maps from features to prediction

$f(x) \to y$
($x$: features, $y$: prediction)

Examples
• Classification
• Is this a dog or a cat?
• Is this email spam or not?

• Regression
• What will the stock price be tomorrow?
• What will be the high temperature tomorrow?
Learning has three stages
Training: optimize model parameters

Validation: intermediate evaluations to design/select model

Test: final performance evaluation


Training: the model is fit to data to minimize a loss or maximize an objective function

$\theta^* = \operatorname{argmin}_{\theta} \, Loss(f(X; \theta), Y)$

where $\theta^*$ are the model parameters that minimize the loss, $X$ holds the features of all training examples, and $Y$ holds the “ground truth” predictions of all training examples.

Example: learn to predict the next day’s temperature given the preceding days’ temperatures

X (1 row per example, 1 column per feature)    Y
37.5  41.2  51.0  48.3                         50.5
47.0  46.5  48.9  50.5                         47.6
…                                              …
67.0  64.7  63.0  61.4                         60.2

Loss: sum squared error
$Loss(f(X; \theta), Y) = \sum_i (f(X_i; \theta) - y_i)^2$

Model: linear
$f(X_i; \theta) = A X_i + b$

Optimization via ordinary least squares regression
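
A minimal sketch of ordinary least squares with NumPy, assuming a linear model with weights A and bias b. The data is synthetic (100 examples with 4 features, standing in for the temperatures of four preceding days), and the "true" parameters are invented so that the fit can be checked.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: 100 examples, 4 features each (made-up values).
X = rng.normal(loc=50.0, scale=10.0, size=(100, 4))
true_A = np.array([0.1, 0.2, 0.3, 0.4])
true_b = 2.0
Y = X @ true_A + true_b + rng.normal(scale=0.1, size=100)

# Append a column of ones so the bias b is learned as an extra weight.
X_aug = np.hstack([X, np.ones((X.shape[0], 1))])

# Least squares solution: the theta that minimizes the sum squared error.
theta, *_ = np.linalg.lstsq(X_aug, Y, rcond=None)
A_hat, b_hat = theta[:4], theta[4]

loss = np.sum((X_aug @ theta - Y) ** 2)
print(A_hat, b_hat, loss)
```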
Model design and “hyperparameter”
tuning is performed using a validation set

Select model, e.g. linear regression

Set training parameters:
• Feature selection
• Learning rate, regularization parameters, …

Sometimes, there are clear “train”, “val”, and “test” sets. Other
times, you need to split “train” into a train set for learning
parameters and a val set for checking model performance.
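
One simple way to create the three sets is to shuffle the example indices and cut them into disjoint parts. The helper name `train_val_test_split` and the 60/20/20 proportions are assumptions chosen for this sketch.

```python
import numpy as np

def train_val_test_split(n_examples, val_frac=0.2, test_frac=0.2, seed=0):
    """Return three disjoint index arrays that together cover range(n_examples)."""
    rng = np.random.default_rng(seed)
    indices = rng.permutation(n_examples)
    n_test = int(n_examples * test_frac)
    n_val = int(n_examples * val_frac)
    test_idx = indices[:n_test]
    val_idx = indices[n_test:n_test + n_val]
    train_idx = indices[n_test + n_val:]
    return train_idx, val_idx, test_idx

train_idx, val_idx, test_idx = train_val_test_split(100)
print(len(train_idx), len(val_idx), len(test_idx))
```

The index arrays can then be used to slice the feature matrix and the targets consistently, for example `X[train_idx]` and `y[train_idx]`.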
Testing: The effectiveness of the model
is evaluated on a held out test set
“Held out”: not used in training; ideally not viewed by developers, e.g. in
a private test server

Common performance measures
Classification error: $\frac{1}{N} \sum_i \mathbb{1}[f(X_i) \neq y_i]$ (for a classification model; $y_i$ is the target/true label)
Cross-entropy: $-\frac{1}{N} \sum_i \log f(y = y_i \mid X_i)$ (for a probabilistic model)
RMSE: $\sqrt{\frac{1}{N} \sum_i (f(X_i) - y_i)^2}$ (regression measure)
$R^2$: $1 - \sum_i (f(X_i) - y_i)^2 / \sum_i (y_i - \bar{y})^2$ (unitless regression measure; $\bar{y}$ is the expectation/mean/average of $y$)
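
These measures can be written directly with NumPy. The function names below are chosen for this sketch; the cross-entropy function assumes that `probs[i, k]` holds the predicted probability of class k for example i.

```python
import numpy as np

def classification_error(y_pred, y_true):
    """Fraction of examples whose predicted label differs from the true label."""
    return float(np.mean(np.asarray(y_pred) != np.asarray(y_true)))

def cross_entropy(probs, y_true):
    """Mean negative log-probability assigned to the true class."""
    probs = np.asarray(probs, dtype=float)
    y_true = np.asarray(y_true)
    return float(-np.mean(np.log(probs[np.arange(len(y_true)), y_true])))

def rmse(y_pred, y_true):
    """Root mean squared error."""
    diff = np.asarray(y_pred, dtype=float) - np.asarray(y_true, dtype=float)
    return float(np.sqrt(np.mean(diff ** 2)))

def r_squared(y_pred, y_true):
    """1 minus (residual sum of squares / total sum of squares)."""
    y_pred = np.asarray(y_pred, dtype=float)
    y_true = np.asarray(y_true, dtype=float)
    ss_res = np.sum((y_pred - y_true) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)

print(classification_error([1, 0, 1, 1], [1, 1, 1, 0]))  # 0.5
print(r_squared([1, 2, 3], [1, 2, 3]))                   # 1.0
```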

In machine learning research, usually data is collected once and then
randomly sampled into train and test partitions
Train and test samples are “i.i.d.”, independent and identically distributed
In many real-world applications, the input to the model in deployment comes from a
different distribution than training
Various Function Representations
• Numerical functions
– Linear regression
– Neural networks
– Support vector machines
• Symbolic functions
– Decision trees
• Instance-based functions
– Nearest-neighbor
• Probabilistic Graphical Models
– Naïve Bayes
– Bayesian networks
Evaluation
• Accuracy
• Precision and recall
• Squared error
• Entropy
• etc.
ML in Practice
• Understand domain, prior knowledge, and goals
• Data integration, selection, cleaning, pre-processing, etc.
• Learn models
• Interpret results
• Consolidate and deploy discovered knowledge
These steps form a loop and are repeated as needed.
Complete example to understand
steps of Machine Learning
Suppose you want to know whether money makes people happy.

Step 1: Download the Better Life Index data from the OECD’s website
(OECD Data Explorer - Archive • Better Life Index)
as well as statistics about GDP (Gross Domestic Product) per capita
from the IMF’s website
(World Economic Outlook Databases, imf.org).
Step 2: Join the tables and sort by GDP per capita.

The table shows an excerpt of what you get.


Complete example to understand
steps of Machine Learning
The plot appears roughly linear: as GDP per capita increases, life satisfaction
also increases, although the data is noisy. So we will choose a linear model.

This step is called model selection: you selected a linear model of life satisfaction
with just one attribute, GDP per capita.
Complete example to understand
steps of Machine Learning
Linear model equation: life_satisfaction = θ0 + θ1 × GDP_per_capita

The second step is to determine the model parameters, that is, to choose the values
that make the line fit the scattered points as closely as possible. (This is the training step.)
Complete example to understand
steps of Machine Learning
How can you know which values of the parameters θ0 and θ1 will make
your model perform best?

• You need to specify a performance measure: either a utility function (or fitness
function) that measures how good your model is,
• or a cost function that measures how bad it is (error).

For linear regression problems, people typically use a cost function that
measures the distance between the linear model’s predictions and the training
examples; the objective is to minimize this distance.
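
A cost function of this kind can be sketched as the mean squared error (MSE) of a one-attribute linear model. The four (GDP per capita, life satisfaction) pairs below are invented for illustration; they are not the OECD or IMF data.

```python
import numpy as np

def mse_cost(theta0, theta1, x, y):
    """Mean squared distance between the predictions theta0 + theta1 * x and the targets y."""
    predictions = theta0 + theta1 * x
    return float(np.mean((predictions - y) ** 2))

# Made-up (GDP per capita, life satisfaction) pairs, for illustration only.
x = np.array([10_000.0, 20_000.0, 30_000.0, 40_000.0])
y = np.array([5.3, 5.9, 6.3, 6.8])

good_cost = mse_cost(4.85, 4.91e-5, x, y)  # parameters close to the trend of this toy data
bad_cost = mse_cost(0.0, 0.0, x, y)        # a poor guess gives a much larger cost
print(good_cost, bad_cost)
```

Training amounts to searching for the pair of parameter values with the lowest cost.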
Complete example to understand
steps of Machine Learning
The learning algorithm finds the parameters that make the linear model fit your data best,
for example with the least mean squared error (MSE).
This is called training the model. In our case the algorithm finds that the optimal
parameter values are θ0 = 4.85 and θ1 = 4.91 × 10^-5.
Complete example to understand
steps of Machine Learning
After training, the model can be used to make predictions.
For example, suppose you want to know how happy Cypriots are, and the OECD data does not have
the answer.

Fortunately, you can use your model to make a good prediction: you look up
Cyprus’s GDP per capita, find $22,587, and then apply your model and find that
life satisfaction is likely to be somewhere around
4.85 + 22,587 × 4.91 × 10^-5 = 5.96.


Complete example to understand
steps of Machine Learning
Python code that loads the data, prepares it, creates a scatterplot for visualization, and
then trains a linear model and makes a prediction.
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import sklearn.linear_model
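# Load the data
oecd_bli = pd.read_csv("oecd_bli_2015.csv", thousands=',')
gdp_per_capita = pd.read_csv("gdp_per_capita.csv", thousands=',', delimiter='\t',
                             encoding='latin1', na_values="n/a")
# Prepare the data
# NOTE: prepare_country_stats() is not defined in these slides. It is assumed to be a
# helper that joins the two tables and returns a DataFrame with the columns
# "GDP per capita" and "Life satisfaction".
country_stats = prepare_country_stats(oecd_bli, gdp_per_capita)
X = np.c_[country_stats["GDP per capita"]]
y = np.c_[country_stats["Life satisfaction"]]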
# Visualize the data
country_stats.plot(kind='scatter', x="GDP per capita", y='Life satisfaction')
plt.show()
# Select a linear model
model = sklearn.linear_model.LinearRegression()
# Train the model
model.fit(X, y)
# Make a prediction for Cyprus
X_new = [[22587]] # Cyprus' GDP per capita
print(model.predict(X_new)) # outputs [[ 5.96242338]]
Complete example to understand
steps of Machine Learning
Replacing the Linear Regression model with k-Nearest Neighbors
regression in the previous code is as simple as replacing these two
lines:

import sklearn.linear_model
model = sklearn.linear_model.LinearRegression()

with these two:

import sklearn.neighbors
model = sklearn.neighbors.KNeighborsRegressor(n_neighbors=3)
Complete example to understand
steps of Machine Learning
If all went well, your model will make good predictions.
If not, you may need to:
• use more attributes (employment rate, health, air pollution, etc.),
• get more or better-quality training data,
• or perhaps select a more powerful model (e.g., a Polynomial Regression model).
In summary:
• You studied the data.
• You selected a model.
• You trained it on the training data (i.e., the learning algorithm searched for the
model parameter values that minimize a cost function).
• Finally, you applied the model to make predictions on new cases (this is called
inference), hoping that this model will generalize well.
Another Complete example to understand
steps of Machine Learning (Heart attack)
Problem:
A heart attack may happen when the flow of blood to a part of the
heart muscle gets blocked.
Solution:
We want to use an ML model to predict whether a person is at risk of a
heart attack.
So, we are going to follow these steps (workflow):
Step 1: Data collection and features.
Step 2: Data preprocessing.
Step 3: Feature selection.
Step 4: Model training.
Step 5: Model evaluation.
Step 6: Deployment and monitoring.
Another Complete example to
understand steps of Machine
Learning (Heart attack)
Step one: Data collection and features
The features may include:
• Age
• Gender
• Blood pressure
• Cholesterol levels
• and others
Another Complete example to
understand steps of Machine Learning
(Heart attack)
Step two: Data preprocessing
This includes:
• Cleaning and normalizing the collected data.
• Possibly dealing with missing values.
• Possibly encoding categorical data as numerical values that the computer can process.
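
A small pandas sketch of these preprocessing steps. The patient records and column names are invented for illustration, and median filling and min-max scaling are just one common choice among several.

```python
import pandas as pd

# Made-up patient records; the column names are illustrative assumptions.
df = pd.DataFrame({
    "age": [54, 61, None, 47],
    "gender": ["M", "F", "F", "M"],
    "cholesterol": [230.0, None, 190.0, 260.0],
})

# Deal with missing values: fill each numeric column with its median.
numeric_cols = ["age", "cholesterol"]
df[numeric_cols] = df[numeric_cols].fillna(df[numeric_cols].median())

# Normalize: min-max scale the numeric columns to the range [0, 1].
df[numeric_cols] = (df[numeric_cols] - df[numeric_cols].min()) / (
    df[numeric_cols].max() - df[numeric_cols].min()
)

# Encode the categorical column as numeric indicator (one-hot) columns.
df = pd.get_dummies(df, columns=["gender"])
print(df)
```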
Another Complete example to
understand steps of Machine Learning
(Heart attack)
Step three: Feature selection
This includes identifying the most predictive features, which will
help in predicting a heart attack, and excluding the irrelevant features.
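
One possible way to pick predictive features is a univariate test such as scikit-learn's `SelectKBest` with the ANOVA F-score. The data below is synthetic: one column is constructed to depend on the label and two are pure noise, so the example only shows the mechanics.

```python
import numpy as np
from sklearn.feature_selection import SelectKBest, f_classif

rng = np.random.default_rng(0)
n = 200
y = rng.integers(0, 2, size=n)                    # synthetic labels: 0 or 1
informative = y + rng.normal(scale=0.5, size=n)   # feature correlated with the label
noise_1 = rng.normal(size=n)                      # irrelevant feature
noise_2 = rng.normal(size=n)                      # irrelevant feature
X = np.column_stack([noise_1, informative, noise_2])

# Keep the single feature with the highest F-score.
selector = SelectKBest(score_func=f_classif, k=1)
X_selected = selector.fit_transform(X, y)
kept = selector.get_support(indices=True)
print(kept)
```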
Another Complete example to understand
steps of Machine Learning (Heart attack)
Step four: Model training
This includes:
• Dividing the data into training and test sets, for example 80% training and 20% testing.
• Using the processed training data to train the ML model.
Common ML models include:
• Logistic regression
• Decision tree
• Random forest
• Gradient boosting models
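
The split and the training step might look like the sketch below, which uses scikit-learn's `train_test_split` and `LogisticRegression`. The features and labels are synthetic stand-ins, not real patient data, and logistic regression is just one of the models listed above.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 500
X = rng.normal(size=(n, 3))  # three synthetic, already-normalized features

# Synthetic labels: higher values of the first two features raise the risk.
risk = 2.0 * X[:, 0] + 1.0 * X[:, 1]
y = (risk + rng.normal(scale=0.5, size=n) > 0).astype(int)

# 80% of the examples for training, 20% held out for testing.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

model = LogisticRegression()
model.fit(X_train, y_train)
test_accuracy = model.score(X_test, y_test)
print(test_accuracy)
```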
Another Complete example to understand
steps of Machine Learning (Heart attack)
Step five: Model evaluation
After training the model, we need to evaluate its performance, so
we use different evaluation metrics for classification:

• Accuracy
• Precision
• Recall
• AUC-ROC curve

Note: other metrics are used for regression.
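
The classification metrics above are available in `sklearn.metrics`. The true labels and predicted scores below are made up for illustration, and the 0.5 decision threshold is an assumption.

```python
from sklearn.metrics import accuracy_score, precision_score, recall_score, roc_auc_score

# Made-up ground truth (1 = heart attack) and predicted probabilities.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_score = [0.9, 0.2, 0.6, 0.4, 0.7, 0.1, 0.8, 0.3]

# Turn probabilities into hard predictions with a 0.5 threshold.
y_pred = [1 if s >= 0.5 else 0 for s in y_score]

accuracy = accuracy_score(y_true, y_pred)     # 6 of 8 correct = 0.75
precision = precision_score(y_true, y_pred)   # 3 of 4 predicted positives are correct = 0.75
recall = recall_score(y_true, y_pred)         # 3 of 4 actual positives are found = 0.75
auc = roc_auc_score(y_true, y_score)          # 14 of 16 positive/negative pairs ranked correctly = 0.875
print(accuracy, precision, recall, auc)
```

Accuracy, precision and recall depend on the chosen threshold, while AUC-ROC is computed from the scores and summarizes performance across all thresholds.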


Lessons Learned about Learning
• Learning can be viewed as using direct or indirect experience to
approximate a chosen target function.

• Function approximation can be viewed as a search through a space
of hypotheses (representations of functions) for one that best fits a
set of training data.

• Different learning methods assume different hypothesis spaces
(representation languages) and/or employ different search techniques.
