
Machine Learning: Concepts and Frameworks

Dr. Aadam S.O. Olatunji


Associate Professor of Computer Science, IAU, Dammam, KSA.

With some content adapted from:
Ethem Alpaydin, Introduction to Machine Learning, © The MIT Press, 2014
[email protected]
http://www.cmpe.boun.edu.tr/~ethem/i2ml3e
Machine Learning Paradigms / The Sub-Fields of ML / Application Areas of ML

• Supervised Learning: Association, Classification, Regression
• Unsupervised Learning: Clustering
• Semi-supervised Learning: Co-training (mixing a small labeled dataset with a large unlabeled dataset), Active learning (interactive supervised learning)
• Reinforcement Learning

Learning Associations
• Basket analysis:
P(Y | X): the probability that somebody who buys X also buys Y, where X and Y are products/services.

Example: P(chips | biscuit) = 0.7
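A minimal sketch of how P(Y | X) could be estimated by counting over a list of market baskets (the transaction data below are made up):

# Made-up transaction data: each basket is the set of products one customer bought.
baskets = [
    {"biscuit", "chips", "milk"},
    {"biscuit", "chips"},
    {"biscuit", "tea"},
    {"chips", "cola"},
]

def conditional_prob(baskets, x, y):
    # P(y | x): among baskets containing x, the fraction that also contain y.
    with_x = [b for b in baskets if x in b]
    if not with_x:
        return 0.0
    return sum(y in b for b in with_x) / len(with_x)

print(conditional_prob(baskets, "biscuit", "chips"))   # 2/3, i.e. about 0.67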

Supervised Learning
• Given: training examples (x1, f(x1)), (x2, f(x2)), …, (xP, f(xP)) for some unknown function (system) y = f(x)
• Find an approximation f̂(x) of f
  – Predict y = f̂(x) for an x that is not in the training set
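As a toy illustration of finding f̂ from training pairs (not from the slides; NumPy and the data values are assumed/made up), a simple parametric model can be fitted to the given (x, f(x)) examples and then used to predict at an unseen x:

import numpy as np

# Hypothetical training pairs (x_i, f(x_i)) observed from an unknown system f.
x_train = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y_train = np.array([1.1, 2.9, 5.2, 6.8, 9.1])

# Choose a hypothesis class (here a degree-1 polynomial) and fit f_hat to the examples.
f_hat = np.poly1d(np.polyfit(x_train, y_train, deg=1))

# Predict y = f_hat(x) for an x that is not in the training set.
print(f_hat(2.5))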

Generic Supervised Machine Learning Framework

[Flowchart]
Start
→ Step 1 – Data preparation: data gathering & preparation, data preprocessing
→ Step 2 – Learning from data: model building, model testing; if the model needs refining, tune the parameters and rebuild the models using the parameters obtained, then validate the model
→ Step 3 – Carrying out prediction & interpreting results: prediction using the models, results analysis & interpretation, identification of hot-spots
→ End
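A compact sketch of this three-step framework in Python with scikit-learn (assumed to be available; the dataset, the k-nearest-neighbours model and the tuning loop are illustrative choices, not part of the original slides):

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import accuracy_score

# Step 1 -- data preparation: gather and preprocess (a built-in toy dataset here).
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

# Step 2 -- learning from data: build the model, test it, refine if needed.
best_model, best_acc = None, 0.0
for k in (1, 3, 5, 7):                       # a very simple parameter-tuning loop
    model = KNeighborsClassifier(n_neighbors=k).fit(X_train, y_train)
    acc = accuracy_score(y_test, model.predict(X_test))
    if acc > best_acc:
        best_model, best_acc = model, acc    # in practice, tune on a separate validation set

# Step 3 -- carrying out prediction and interpreting the results.
print("best accuracy:", best_acc)
print("prediction for the first test sample:", best_model.predict(X_test[:1]))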
Supervised Learning…… Contd.
• Classification: the output y takes values in a discrete, finite set of class labels
• Regression: the output y ∈ ℝ (a real-valued number)

Supervised Learning (Predictive Modeling)

• Supervised learning is the first type of machine learning, in which labelled data are used to train the algorithms.
• The algorithms are trained on marked data, where both the input and the output are known.
• The input data are called features (denoted by X), and the corresponding outputs are the targets/class labels (denoted by Y).
• The algorithm learns by comparing its actual predictions with the correct outputs to find errors.
• The raw data are divided into two parts:
  – the first part is used for training the algorithm (70%), and
  – the second is used for testing the trained algorithm (30%).
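A minimal sketch of the 70%/30% split described above, using scikit-learn's train_test_split on a tiny made-up dataset:

import numpy as np
from sklearn.model_selection import train_test_split

# Tiny made-up dataset: 10 examples with 2 features each, and binary labels.
X = np.arange(20).reshape(10, 2)
y = np.array([0, 0, 0, 0, 0, 1, 1, 1, 1, 1])

# 70% of the raw data for training the algorithm, 30% for testing it.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)
print(len(X_train), "training examples,", len(X_test), "test examples")   # 7 and 3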



Supervised Learning…….Contd.
• Approaches (Category of Supervised Learning)
– Classification (discrete predictions)
– Regression (continuous predictions)
• Common considerations
– Representation (Features)
– Feature Selection
– Functional form
– Evaluation of predictive power
Classification vs. Regression
• If I want to predict whether a patient will
die from a disease within six months, that is
classification
• If I want to predict how long the patient will
live, that is regression
• (Students should come up with examples in each category)

Representation
• Definition of thing or things to be predicted
– Classification: classes
– Regression: regression variable
• Definition of things (instances) to make
predictions for
– Individuals
– Families
– Neighborhoods, etc.
• Choice of descriptors (features) to describe
different aspects of instances
Classification
• Example: Credit
scoring
• Differentiating
between low-risk and
high-risk customers
from their income and
savings
Discriminant: IF income > θ1 AND savings > θ2
THEN low-risk ELSE high-risk
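The discriminant above can be written directly as a rule. A minimal sketch, where the threshold values θ1 and θ2 are made up (in practice they would be learned from the data):

# Hypothetical thresholds (theta1, theta2) that would normally be learned from data.
THETA1 = 30_000   # income threshold
THETA2 = 10_000   # savings threshold

def credit_risk(income, savings):
    # IF income > theta1 AND savings > theta2 THEN low-risk ELSE high-risk
    if income > THETA1 and savings > THETA2:
        return "low-risk"
    return "high-risk"

print(credit_risk(45_000, 15_000))   # low-risk
print(credit_risk(45_000, 5_000))    # high-risk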
Classification: Applications
• Aka Pattern recognition
• Face recognition: Pose, lighting, occlusion (glasses,
beard), make-up, hair style
• Character recognition: Different handwriting styles.
• Speech recognition: Temporal dependency.
• Medical diagnosis: From symptoms to illnesses
• Biometrics: Recognition/authentication using
physical and/or behavioral characteristics: Face, iris,
signature, etc
• Outlier/novelty detection:
Face Recognition
[Figure: training examples of a person and test images, from the ORL face dataset, AT&T Laboratories, Cambridge UK]
Regression
• Example: Price of a
used car
• x: car attributes, y: price
• Linear model: y = wx + w0
• General model: y = g(x | θ), where g(·) is the model and θ its parameters

Linear Regression
– Main assumptions:
  • The model is a linear weighted sum of the attribute values.
  • The relationship between the attributes and the target is (approximately) linear.
  • Attributes and target values are real-valued.
– Hypothesis space
  • Fixed size (parametric): limited modeling potential
y = Σ_{i=1..d} a_i x_i + b

– Multiple linear regression (can be extended to non-linear cases, e.g. via transformed/polynomial features)
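A small sketch of fitting y = Σ a_i x_i + b by least squares with NumPy (the used-car data below are made up):

import numpy as np

# Made-up used-car data: columns are attributes (age in years, mileage in 1000 km).
X = np.array([[1.0,  20.0],
              [3.0,  45.0],
              [5.0,  80.0],
              [7.0, 120.0]])
y = np.array([18.0, 14.0, 9.5, 6.0])          # prices, in thousands

# Append a column of ones so the intercept b is fitted together with a_1..a_d.
X1 = np.hstack([X, np.ones((X.shape[0], 1))])
coef, *_ = np.linalg.lstsq(X1, y, rcond=None)
a, b = coef[:-1], coef[-1]

print("weights a:", a, "intercept b:", b)
print("predicted price for a 4-year-old car with 60,000 km:", a @ np.array([4.0, 60.0]) + b)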


Regression Applications
• Navigating a car: angle of the steering wheel
• Kinematics of a robot arm: given the end-effector position (x, y), predict the joint angles α1 = g1(x, y) and α2 = g2(x, y)

 Response surface design


Regression Applications …
 Navigating a car: Angle of the steering wheel
 Predicting the stock prices
 Predicting oil and gas properties, e.g. permeability,
porosity, etc.
 Predicting a student's actual score in an exam
 etc…

Unsupervised Learning
• Learning “what normally happens”
• No output
• Clustering: Grouping similar instances
• Example applications
– Customer segmentation in CRM
– Image compression: Color quantization
– Bioinformatics: Learning motifs
– etc.
• It can be used as a preprocessing step for supervised learning
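As a minimal illustration of the clustering idea above, a sketch with scikit-learn's KMeans on made-up 2-D data (the number of clusters is chosen by hand here):

import numpy as np
from sklearn.cluster import KMeans

# Made-up 2-D data, e.g. customers described by two features.
X = np.array([[1, 2], [1, 4], [1, 0],
              [10, 2], [10, 4], [10, 0]])

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
print("cluster labels: ", kmeans.labels_)
print("cluster centres:", kmeans.cluster_centers_)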

Unsupervised Learning (Descriptive Modeling)


 Clustering:
 The goal here is to divide the input dataset into logical groups of related items.
 Some examples are grouping similar news articles, grouping similar customers
based on their profile, etc.

 Dimension Reduction:
 Here the goal is to simplify a large input dataset by mapping it to a lower-dimensional space.
 For example, carrying out analysis on a high-dimensional dataset is computationally intensive; to simplify it, you may want to find the key variables that hold a significant percentage (say 95%) of the information and use only those for analysis (see the sketch after this slide's bullets).

 Anomaly Detection:
 Anomaly detection, also commonly known as outlier detection, is the identification of items, events or observations that do not conform to an expected pattern or behaviour compared with other items in a given dataset.
 It has applicability in a variety of domains, such as machine or system health monitoring, event detection, fraud/intrusion detection, etc.
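For the dimension-reduction case above (the sketch referenced earlier), scikit-learn's PCA can keep just enough components to retain a chosen share of the variance, here 95% as in the example; the data below are randomly generated for illustration:

import numpy as np
from sklearn.decomposition import PCA

# Made-up high-dimensional data: 200 samples of 50 correlated features.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5)) @ rng.normal(size=(5, 50)) + 0.01 * rng.normal(size=(200, 50))

# Keep the smallest number of components that explains at least 95% of the variance.
pca = PCA(n_components=0.95)
X_reduced = pca.fit_transform(X)
print("reduced from", X.shape[1], "to", X_reduced.shape[1], "dimensions")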



Semi-supervised Learning

• Co-training (mix a small labeled dataset with a large unlabeled dataset)
• Active learning (interactive supervised learning)


Semi-supervised Learning
 Semi-supervised machine learning is a combination of supervised and
unsupervised machine learning methods.

 In semi-supervised learning, an algorithm learns from a dataset that includes both


labeled and unlabeled data, usually mostly unlabeled.

Why is Semi-Supervised Machine Learning important?
 When you don’t have enough labeled data to produce an accurate model and you don’t have the ability or resources to get more, you can use semi-supervised techniques to increase the size of your training data.
 You can use a semi-supervised learning algorithm to label the data, and retrain the model with the newly labeled dataset.

There is no way to verify that the algorithm produced labels that are 100% accurate, resulting in less trustworthy outcomes than traditional supervised techniques.
https://www.datarobot.com/wiki/semi-supervised-machine-learning/
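A minimal sketch of the "label the unlabeled data and retrain" idea using scikit-learn's SelfTrainingClassifier (assumes scikit-learn 0.24 or later; by convention, unlabeled points are marked with the label -1):

import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.semi_supervised import SelfTrainingClassifier

X, y = load_iris(return_X_y=True)

# Pretend most labels are unknown: keep roughly 20% labeled, mark the rest as -1 (unlabeled).
rng = np.random.default_rng(0)
y_semi = y.copy()
y_semi[rng.random(len(y)) > 0.2] = -1

# The self-training wrapper labels confident unlabeled points and retrains the base model.
model = SelfTrainingClassifier(LogisticRegression(max_iter=1000))
model.fit(X, y_semi)
print("accuracy on the fully labeled set:", model.score(X, y))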



Reinforcement Learning
– Reinforcement learning:
  The agent acts on its environment and receives some evaluation of its action (reinforcement), but it is not told which action is the correct one to achieve its goal.

– Unlike supervised learning:
  a situation in which sample (input, output) pairs of the function to be learned can be perceived or are given
  » You can think of it as if there is a kind teacher
Reinforcement Learning contd…
• Learning a policy. The reinforcement (feedback) could be:
– positive: Think of it as adding something in order to increase a response. (e.g. adding
praise will increase the chances of your child cleaning his or her room) ,

– Negative: taking something negative away in order to increase a response (e.g.


stopped nagging when someone you’ve been nagging did good)

– Punishment: adding something aversive in order to decrease a behavior


– Extinction (When you remove something in order to decrease a behavior)

• No supervised output but delayed reward


RL is learning from interaction

Task:
- Learn how to behave successfully to achieve a goal while interacting with an external environment
- Learn via experiences!
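The slides do not name a specific algorithm, but as one standard illustration of learning from interaction with a delayed reward, here is a minimal tabular Q-learning sketch on a made-up corridor environment (all names and numbers are illustrative):

import random

# Tiny made-up environment: a corridor of states 0..4, goal at state 4.
# Actions: 0 = left, 1 = right. Reward 1 only when the goal is reached (delayed reward).
N_STATES, GOAL = 5, 4
Q = [[0.0, 0.0] for _ in range(N_STATES)]
alpha, gamma, epsilon = 0.5, 0.9, 0.1          # learning rate, discount, exploration rate

def step(state, action):
    nxt = max(0, min(N_STATES - 1, state + (1 if action == 1 else -1)))
    return nxt, (1.0 if nxt == GOAL else 0.0), nxt == GOAL

for episode in range(200):
    s = 0
    for _ in range(100):                        # cap the episode length
        explore = random.random() < epsilon or Q[s][0] == Q[s][1]
        a = random.randrange(2) if explore else (0 if Q[s][0] > Q[s][1] else 1)
        s2, r, done = step(s, a)
        # Q-learning update: move Q(s, a) toward r + gamma * max_a' Q(s2, a').
        Q[s][a] += alpha * (r + gamma * max(Q[s2]) - Q[s][a])
        s = s2
        if done:
            break

print("greedy action per state (1 = right):", [0 if q[0] > q[1] else 1 for q in Q[:GOAL]])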
Reinforcement Learning: Applications
• Credit assignment problem
• Game playing
• Robot in a maze
• Multiple agents, partial observability, ...
• etc…


Types of Machine Learning Paradigms (summary)
[Figure: the main tasks performed in each type of machine learning]



Machine Learning Process
1. Collecting data: a data-set with variety, density and volume of relevant data will help in better learning.

2. Preparing the data: this involves fixing issues with the collected data-set, e.g. handling outliers and managing missing data points. Break the cleaned data-set into two parts, one for training and the other for evaluating the program. Visualize the data.

3. Training a model: choose an appropriate algorithm and a representation of the data, in the form of a model suited to your problem. Use the training data-set to train the model.

4. Evaluating the model: to test the accuracy and precision of the model, use the test data-set kept aside in step 2.

5. Improving the performance: this might involve choosing a different model and algorithm altogether, introducing more variables and/or data to train the model, or optimizing the parameters of the model (see the sketch after this list).
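A sketch mapping the five steps to code with scikit-learn (the dataset, the SVM model and the parameter grid are illustrative choices, not prescribed by the slides):

from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score, precision_score

# 1. Collecting data: a built-in dataset stands in for the collected data.
X, y = load_breast_cancer(return_X_y=True)

# 2. Preparing the data: split into a training part and an evaluation part.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

# 3. Training a model: choose an algorithm and fit it on the training set.
model = SVC(kernel="rbf", gamma="scale").fit(X_train, y_train)

# 4. Evaluating the model: accuracy and precision on the held-out test set.
y_pred = model.predict(X_test)
print("accuracy:", accuracy_score(y_test, y_pred), "precision:", precision_score(y_test, y_pred))

# 5. Improving the performance: optimize the model's parameters (here via grid search).
search = GridSearchCV(SVC(), {"C": [0.1, 1, 10], "gamma": ["scale", 0.01]}, cv=5)
search.fit(X_train, y_train)
print("tuned accuracy:", accuracy_score(y_test, search.predict(X_test)))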



Machine Learning Process

Data Exploration:
  1. Explore the data
  2. Visualize the data
  3. Feature selection
  4. Feature extraction

Learning (build the data model):
   Supervised learning
   Un-supervised learning
   Semi-supervised learning
   Reinforcement learning

Evaluation:
   Precision/Recall …
   Overfitting
   Test/validation data



Resources: Datasets
 UCI Repository (recommended in this course):
http://www.ics.uci.edu/~mlearn/MLRepository.html
 UCI KDD Archive:
http://kdd.ics.uci.edu/summary.data.application.html
 Statlib: http://lib.stat.cmu.edu/
 Delve: http://www.cs.utoronto.ca/~delve/

Resources: Journals
 Journal of Machine Learning Research
www.jmlr.org
 http://www.jmlr.org/mloss/
 Applied Soft Computing:
http://www.journals.elsevier.com/applied-soft-computing/
 Machine Learning
 IEEE Transactions on Neural Networks
 IEEE Transactions on Pattern Analysis and
Machine Intelligence
 ...

Resources: Conferences
 International Conference on Machine Learning (ICML)
 European Conference on Machine Learning (ECML)
 Neural Information Processing Systems (NIPS)
 Computational Learning Theory (COLT)
 International Joint Conference on Artificial Intelligence (IJCAI)
 ACM SIGKDD Conference on Knowledge Discovery and Data
Mining (KDD)

