LM #02-ML Concepts & Frameworks
Frameworks
• Association
• Supervised Learning
– Classification
– Regression
• Reinforcement Learning
11/17/2024 2
Learning Associations
• Basket analysis:
$P(Y \mid X)$: probability that somebody who buys X also buys Y, where X and Y are products/services.
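As a minimal sketch, the conditional probability above can be estimated by counting baskets. The transaction data and the helper name `conditional_probability` are hypothetical, chosen only for illustration.

```python
# Hypothetical toy transactions; each set is one customer's basket.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
    {"bread", "milk", "eggs"},
]

def conditional_probability(baskets, x, y):
    """Estimate P(y | x) as count(baskets with x and y) / count(baskets with x)."""
    with_x = [b for b in baskets if x in b]
    if not with_x:
        return None  # P(y | x) is undefined when nobody bought x
    with_both = [b for b in with_x if y in b]
    return len(with_both) / len(with_x)

p_milk_given_bread = conditional_probability(transactions, "bread", "milk")
print(p_milk_given_bread)
```

Here four baskets contain bread and three of those also contain milk, so the estimate is 3/4.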
Supervised Learning
• Given: Training examples
$(x_1, f(x_1)), (x_2, f(x_2)), \ldots, (x_P, f(x_P))$
for some unknown function (system) $y = f(x)$
• Find $f(x)$
– Predict $y = f(x)$, where $x$ is not in the training set
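One very simple way to predict y for an unseen x is to copy the output of the closest training input (a nearest-neighbour rule). This is only an illustrative sketch; the training pairs and the function name `predict_nearest` are assumptions, not part of the slides.

```python
# Hypothetical training set of (x, f(x)) pairs; the true f is unknown to the learner.
training_examples = [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2), (4.0, 7.8)]

def predict_nearest(examples, x_new):
    """Predict y for x_new by copying the output of the closest training input."""
    nearest_x, nearest_y = min(examples, key=lambda pair: abs(pair[0] - x_new))
    return nearest_y

# 2.8 is not in the training set; its nearest training input is 3.0.
y_hat = predict_nearest(training_examples, 2.8)
print(y_hat)
```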
Generic Supervised Machine Learning Framework
[Figure: flowchart. Recoverable labels: Start; Model Building; Parameter tuning; Model Testing; Building models using the parameters obtained; Model validation; Need to refine model? (Y/N); End]
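The build, test, and refine loop of this framework can be sketched roughly as follows. The k-nearest-neighbour regressor, the candidate values of k, and the toy data are all assumptions made for illustration; the slides do not prescribe a particular model.

```python
def build_model(train, k):
    """Model building: a k-nearest-neighbour regressor parameterised by k."""
    def predict(x):
        nearest = sorted(train, key=lambda p: abs(p[0] - x))[:k]
        return sum(y for _, y in nearest) / len(nearest)
    return predict

def validation_error(model, data):
    """Model testing: mean squared error on held-out examples."""
    return sum((model(x) - y) ** 2 for x, y in data) / len(data)

# Hypothetical training and held-out data.
train = [(0, 0.1), (1, 0.9), (2, 2.2), (3, 2.9), (4, 4.1), (5, 5.0)]
held_out = [(0.5, 0.5), (2.5, 2.5), (4.5, 4.5)]

best_k, best_err = None, float("inf")
for k in (1, 2, 3, 6):  # each pass is one "need to refine model?" iteration
    err = validation_error(build_model(train, k), held_out)
    if err < best_err:
        best_k, best_err = k, err

print(best_k, best_err)
```

The loop stops when the candidate settings are exhausted; in practice the stopping rule would depend on the application.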
Supervised Learning…… Contd.
• Classification
• Regression: $y \in \mathbb{R}$
Representation
• Definition of thing or things to be predicted
– Classification: classes
– Regression: regression variable
• Definition of things (instances) to make
predictions for
– Individuals
– Families
– Neighborhoods, etc.
• Choice of descriptors (features) to describe
different aspects of instances
Classification
• Example: Credit scoring
• Differentiating between low-risk and high-risk customers from their income and savings
Discriminant: IF income > θ1 AND savings > θ2 THEN low-risk ELSE high-risk
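The discriminant can be written directly as a small function. The threshold values used below are hypothetical; in practice θ1 and θ2 would be learned from labelled customer data.

```python
def credit_risk(income, savings, theta1, theta2):
    """Apply the rule: low-risk only if income > theta1 AND savings > theta2."""
    if income > theta1 and savings > theta2:
        return "low-risk"
    return "high-risk"

# Hypothetical thresholds, for illustration only.
THETA1, THETA2 = 40_000, 10_000

print(credit_risk(55_000, 20_000, THETA1, THETA2))  # both conditions hold
print(credit_risk(55_000, 5_000, THETA1, THETA2))   # savings too low
```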
Classification: Applications
• Also known as pattern recognition
• Face recognition: Pose, lighting, occlusion (glasses, beard), make-up, hair style
• Character recognition: Different handwriting styles
• Speech recognition: Temporal dependency
• Medical diagnosis: From symptoms to illnesses
• Biometrics: Recognition/authentication using physical and/or behavioral characteristics: face, iris, signature, etc.
• Outlier/novelty detection
Face Recognition
[Figure: training examples of a person and test images, from the ORL dataset, AT&T Laboratories, Cambridge, UK]
Regression
• Example: Price of a used car
• $x$: car attributes
  $y$: price
  $y = g(x \mid \theta)$
  $g(\cdot)$: model, $\theta$: parameters
• Linear model: $y = wx + w_0$
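For the linear model y = wx + w0 with a single attribute, the least-squares parameters have a closed form. The sketch below assumes one attribute (car age) and made-up prices; the function name `fit_line` is hypothetical.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = w*x + w0 for one attribute."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    w = sxy / sxx               # slope
    w0 = mean_y - w * mean_x    # intercept
    return w, w0

# Hypothetical data: car age in years and price in thousands.
ages = [1, 2, 3, 4, 5]
prices = [18.0, 16.0, 14.0, 12.0, 10.0]

w, w0 = fit_line(ages, prices)
predicted_price = w * 6 + w0    # prediction for an unseen 6-year-old car
print(w, w0, predicted_price)
```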
Linear Regression
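– Main Assumptions:
• The output is a linear weighted sum of attribute values.
• The relationship between attributes and target is (approximately) linear. (Linear separability is the corresponding assumption for linear classifiers, not for regression.)
• Attributes and target values are real valued.
– Hypothesis Space
• Fixed size (parametric): Limited modeling potential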
$$y = \sum_{i=1}^{d} a_i x_i + b$$
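With d attributes, the weights a_i and the offset b can be estimated by ordinary least squares. The following is a sketch using NumPy's `np.linalg.lstsq`; the data are synthetic and generated from known coefficients purely so the result can be checked.

```python
import numpy as np

# Hypothetical data generated from y = 2*x1 - 1*x2 + 0.5 (no noise).
X = np.array([[1.0, 2.0], [2.0, 0.0], [3.0, 1.0], [4.0, 3.0]])
y = 2.0 * X[:, 0] - 1.0 * X[:, 1] + 0.5

# Append a column of ones so the offset b is learned together with the weights a_i.
X_aug = np.hstack([X, np.ones((X.shape[0], 1))])
coeffs, *_ = np.linalg.lstsq(X_aug, y, rcond=None)
a, b = coeffs[:-1], coeffs[-1]

print(a, b)
```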
Unsupervised Learning
• Learning “what normally happens”
• No output
• Clustering: Grouping similar instances
• Example applications
– Customer segmentation in CRM
– Image compression: Color quantization
– Bioinformatics: Learning motifs
– etc.
• It can be used as a preprocessing step for supervised learning
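Clustering for color quantization can be illustrated with a tiny k-means on gray levels. This is a one-dimensional sketch under simplifying assumptions: the pixel values, the initial centers, and the function name `kmeans_1d` are hypothetical.

```python
def kmeans_1d(values, centers, iterations=10):
    """Plain k-means on scalar values: assign to nearest center, then re-average."""
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for v in values:
            idx = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            clusters[idx].append(v)
        # Keep the old center if a cluster ends up empty.
        centers = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return centers

# Hypothetical gray levels: a dark group and a bright group.
gray_levels = [10, 12, 14, 200, 205, 210]
centers = kmeans_1d(gray_levels, centers=[0.0, 255.0])
print(centers)
```

Each pixel would then be stored as the index of its nearest center, which is what compresses the image.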
Dimension Reduction:
Here the goal is to simplify a large input dataset by mapping it to a lower-dimensional space.
For example, analysis of a high-dimensional dataset is computationally intensive, so to simplify it you may want to find the key variables that hold a significant percentage (say 95%) of the information and use only those for analysis.
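One common way to do this is principal component analysis (PCA), which keeps the fewest directions that together explain a chosen share of the variance. Note that PCA finds linear combinations of the variables rather than selecting original variables, so it is one possible reading of the idea above, not the only one. The data below are synthetic.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical data: 200 samples, 5 features driven by 2 hidden directions plus small noise.
latent = rng.normal(size=(200, 2))
mixing = rng.normal(size=(2, 5))
X = latent @ mixing + 0.01 * rng.normal(size=(200, 5))

X_centered = X - X.mean(axis=0)
# Squared singular values of the centered data are proportional to the
# variance along each principal direction.
_, s, vt = np.linalg.svd(X_centered, full_matrices=False)
explained = s**2 / np.sum(s**2)

# Smallest number of components whose cumulative share reaches 95%.
k = int(np.searchsorted(np.cumsum(explained), 0.95) + 1)
X_reduced = X_centered @ vt[:k].T

print(k, X_reduced.shape)
```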
Anomaly Detection:
Anomaly detection, also commonly known as outlier detection, is the identification of items, events or observations that do not conform to an expected pattern or behavior in comparison with other items in a given dataset.
It is applicable in a variety of domains, such as machine or system health monitoring, event detection, and fraud/intrusion detection.
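A very simple detector flags values that lie far from the mean in units of standard deviation (a z-score rule). This is only one of many possible approaches; the sensor readings and the threshold of 2.0 are assumptions chosen for illustration.

```python
import statistics

# Hypothetical sensor readings with one obvious anomaly.
readings = [10.1, 9.8, 10.0, 10.2, 9.9, 10.0, 25.0, 10.1]

def zscore_outliers(values, threshold=2.0):
    """Return values whose distance from the mean exceeds threshold * std."""
    mean = statistics.mean(values)
    std = statistics.pstdev(values)
    if std == 0:
        return []  # all values identical, nothing stands out
    return [v for v in values if abs(v - mean) / std > threshold]

outliers = zscore_outliers(readings)
print(outliers)
```

With small samples a single extreme value inflates the standard deviation, so the threshold has to be chosen with the sample size in mind.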
Semi-supervised Learning
Semi-supervised machine learning combines supervised and unsupervised machine learning methods: it learns from a small amount of labeled data together with a larger amount of unlabeled data.
Reinforcement Learning
Task
– Learn how to behave successfully to achieve a goal while interacting with an external environment
– Learn via experiences!
Applications
• Credit assignment problem
• Game playing
• Robot in a maze
• Multiple agents, partial observability, ...
• etc…
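The "robot in a maze" setting can be sketched with tabular Q-learning on a one-dimensional corridor. Q-learning is one standard reinforcement learning algorithm, chosen here for illustration; the slides do not name a specific method. The corridor size, reward, learning rate, and discount factor are all hypothetical.

```python
import random

N_STATES, GOAL = 5, 4      # corridor cells 0..4, reward on reaching cell 4
ACTIONS = (-1, +1)         # step left or step right
ALPHA, GAMMA = 0.5, 0.9    # hypothetical learning rate and discount factor

def step(state, action):
    """Environment: move within the corridor, reward 1 on reaching the goal."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    reward = 1.0 if next_state == GOAL else 0.0
    return next_state, reward

rng = random.Random(0)
q = [[0.0, 0.0] for _ in range(N_STATES)]  # q[state][action_index]

for _ in range(200):                        # episodes of experience
    state = 0
    while state != GOAL:
        a = rng.randrange(2)                # explore uniformly at random
        next_state, reward = step(state, ACTIONS[a])
        # Q-learning update: move the estimate toward reward + discounted best future value.
        target = reward + GAMMA * max(q[next_state])
        q[state][a] += ALPHA * (target - q[state][a])
        state = next_state

# Greedy policy read off the learned values for the non-goal cells.
policy = ["right" if q[s][1] > q[s][0] else "left" for s in range(GOAL)]
print(policy)
```

The agent is never told that moving right is correct; it learns this from the delayed reward at the goal, which is the credit assignment problem in miniature.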
2. Preparing the data: This involves fixing issues with the collected data set, e.g. handling outliers and managing missing data points. Split the cleaned data set into two parts, one for training and the other for evaluating the model. Visualize the data.
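A minimal sketch of this step: drop rows with missing values, then shuffle and split into training and test parts. Dropping rows is only one way to manage missing data; the records, the 25% test fraction, and the function name `train_test_split` are assumptions for illustration.

```python
import random

def train_test_split(rows, test_fraction=0.25, seed=0):
    """Shuffle a copy of the rows and hold out a fraction for testing."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

# Hypothetical (feature, label) records; None marks a missing feature value.
raw = [(1.0, 0), (None, 1), (2.5, 1), (3.0, 0), (4.2, 1),
       (5.1, 0), (None, 0), (6.3, 1), (7.7, 1), (8.0, 0)]

cleaned = [(x, y) for x, y in raw if x is not None]  # drop rows with missing data
train, test = train_test_split(cleaned)

print(len(cleaned), len(train), len(test))
```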
4. Evaluating the model: To test the accuracy and precision of the model, use the test data set kept aside in step 2.
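Accuracy, precision, and recall on the held-out test set can be computed from the counts of true positives, false positives, and false negatives. The labels and predictions below are made up for illustration.

```python
def evaluate(y_true, y_pred, positive=1):
    """Return (accuracy, precision, recall) for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)
    accuracy = correct / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return accuracy, precision, recall

# Hypothetical test-set labels and model predictions.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

accuracy, precision, recall = evaluate(y_true, y_pred)
print(accuracy, precision, recall)
```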
Data Exploration
1. Explore the Data
2. Visualize the Data
3. Feature Selection
4. Feature Extraction
Learning (Data Model)
– Supervised Learning
– Un-supervised Learning
– Semi Supervised Learning
– Reinforcement Learning
Evaluation
– Precision/Recall …
– Overfitting
– Test/validation data
Resources: Journals
Journal of Machine Learning Research
www.jmlr.org
http://www.jmlr.org/mloss/
Applied Soft Computing:
http://www.journals.elsevier.com/applied-soft-computing/
Machine Learning
IEEE Transactions on Neural Networks
IEEE Transactions on Pattern Analysis and Machine Intelligence
...
Resources: Conferences
International Conference on Machine Learning (ICML)
European Conference on Machine Learning (ECML)
Neural Information Processing Systems (NIPS)
Computational Learning
International Joint Conference on Artificial Intelligence (IJCAI)
ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD)