Ch8 - Learning from Examples
© Bushra Alhijawi 2
Learning Elements
• Machine learning research has produced a large variety of learning elements.
• Major issues in the design of learning elements:
• Which components of the performance element are to be improved?
• What representation is used for those components?
• What kind of feedback is available:
• supervised learning.
• reinforcement learning.
• unsupervised learning.
• semi-supervised learning.
• What prior knowledge is available?
Machine Learning Types
• Supervised learning → uses labeled datasets
to train algorithms to classify data or predict
outcomes accurately.
• Unsupervised learning → uses unlabeled
datasets to train algorithms to analyze and
discover hidden patterns or data groupings
without the need for human intervention.
• Semi-Supervised learning → similar to
supervised learning but uses labeled and
unlabeled datasets to train algorithms.
• Reinforcement learning → a behavioral ML
model that learns as it goes by trial and
error.
Supervised Learning
• The dataset is a collection of labeled examples,
each consisting of features and a label.
Supervised Regression
[Diagram: features (Feature 1 … Feature n) → supervised ML algorithm → label (Goal 1)]
• Linear model: $f(x) = wx + b$
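The linear model f(x) = wx + b can be fitted to labeled examples in several ways. The sketch below assumes ordinary least squares with a single feature. The function names and the toy data are hypothetical and chosen only for illustration.

```python
def fit_line(xs, ys):
    """Least-squares fit of f(x) = w*x + b to paired samples."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope: covariance of x and y divided by the variance of x.
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    w = cov_xy / var_x
    b = mean_y - w * mean_x
    return w, b


def predict(x, w, b):
    """Apply the fitted model f(x) = w*x + b."""
    return w * x + b


# Hypothetical toy data generated from y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
w, b = fit_line(xs, ys)
print(w, b, predict(5.0, w, b))
```

Here the features are the x values and the labels are the y values. Once w and b are learned, the model predicts the label of an unseen x.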
Supervised Regression – Polynomial Regression
• Polynomial regression fits a nonlinear relationship between
dependent and independent variables.
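As a rough illustration of fitting a nonlinear relationship, the sketch below fits a polynomial by least squares. It assumes the normal equations are solved with a small Gaussian-elimination helper. Both helper functions and the toy data are hypothetical, and a numerical library would normally do this step.

```python
def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix [a | b], copied so the inputs are not modified.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_polynomial(xs, ys, degree):
    """Least-squares coefficients [c0, ..., cd] of c0 + c1*x + ... + cd*x^d."""
    size = degree + 1
    # Normal equations (X^T X) c = X^T y, where X[i][j] = xs[i] ** j.
    xtx = [[sum(x ** (i + j) for x in xs) for j in range(size)]
           for i in range(size)]
    xty = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(size)]
    return solve_linear_system(xtx, xty)


def evaluate(coeffs, x):
    """Evaluate the fitted polynomial at x."""
    return sum(c * x ** i for i, c in enumerate(coeffs))


# Hypothetical toy data generated from y = x^2 - x + 3.
xs = [-2.0, -1.0, 0.0, 1.0, 2.0]
ys = [x ** 2 - x + 3 for x in xs]
coeffs = fit_polynomial(xs, ys, degree=2)
print(coeffs)
```

The model is still linear in its coefficients; only the features (powers of x) are nonlinear.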
Supervised Classification
[Diagram: features (Feature 1 … Feature n) → supervised ML algorithm → label (Goal 1)]
Supervised Classification - k-Nearest Neighbors
• The KNN algorithm assumes that
SIMILAR things exist in close
proximity.
• As the value of K increases, the
predictions become more stable
due to majority voting (or
averaging) and are therefore
more likely to be accurate, but
only up to a certain point;
beyond it, errors tend to rise again.
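A minimal sketch of KNN classification by majority vote is shown below, assuming Euclidean distance as the measure of proximity. The function name and the toy points (including one deliberately noisy label) are hypothetical.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k):
    """Classify `query` by majority vote among its k nearest training points."""
    # Sort training indices by Euclidean distance to the query.
    order = sorted(range(len(train_points)),
                   key=lambda i: math.dist(train_points[i], query))
    nearest_labels = [train_labels[i] for i in order[:k]]
    # Majority vote; ties go to the label that was encountered first.
    return Counter(nearest_labels).most_common(1)[0][0]


# Hypothetical data: class "A" near the origin, class "B" far away,
# plus one noisy "B" point at (2, 2) inside the "A" region.
points = [(1, 1), (1, 2), (2, 1), (6, 6), (7, 6), (6, 7), (2, 2)]
labels = ["A", "A", "A", "B", "B", "B", "B"]
query = (2.2, 2.2)

for k in (1, 3, 5, 7):
    print(k, knn_predict(points, labels, query, k))
```

With K = 1 the noisy neighbor decides the prediction, while K = 3 and K = 5 outvote it. With K = 7 every point votes, so the larger class wins regardless of proximity, which illustrates the "up to a certain point" caveat.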
Supervised Classification – Decision Tree
• A decision tree is an acyclic
graph that can be used to
make decisions.
• It is a non-parametric
supervised learning
method.
• Decision trees are powerful
algorithms, capable of fitting
complex datasets.
• The goal is to create a model
that predicts the value of a
target variable by learning
simple decision rules inferred
from the data features.
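To make "learning simple decision rules" concrete, the sketch below searches for the single best rule of the form "feature <= threshold". It assumes Gini impurity as the split criterion, which the slides do not specify. A full decision tree would apply this search recursively to each resulting subset; the function names and data are hypothetical.

```python
def gini(labels):
    """Gini impurity of a list of class labels (0.0 means pure)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))


def best_split(rows, labels):
    """Return (weighted impurity, feature index, threshold) of the best rule."""
    best = None
    n = len(rows)
    for f in range(len(rows[0])):
        for threshold in sorted({row[f] for row in rows}):
            left = [lab for row, lab in zip(rows, labels) if row[f] <= threshold]
            right = [lab for row, lab in zip(rows, labels) if row[f] > threshold]
            if not left or not right:
                continue  # a rule must separate the data into two groups
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if best is None or score < best[0]:
                best = (score, f, threshold)
    return best


# Hypothetical examples: (feature 0, feature 1) with a yes/no target.
rows = [(25, 30), (30, 60), (35, 40), (45, 80), (50, 35), (55, 90)]
labels = ["no", "yes", "no", "yes", "no", "yes"]
score, feature, threshold = best_split(rows, labels)
print(score, feature, threshold)
```

In this toy data the rule "feature 1 <= 40" separates the two classes perfectly, so it is chosen as the root decision.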
Performance Evaluation – Supervised Learning
• Evaluating a machine learning model is important to assess its performance,
identify potential problems, and make improvements.
• Careful evaluation helps improve a model's accuracy, generalizability, and
reliability.
• Performance evaluation steps:
• Data Sampling.
• Define the evaluation metrics.
• Train the machine learning algorithm.
• Test the machine learning model.
• Analyze the results to identify potential problems.
• Refine the model.
Evaluation – Data Sampling
• Divide the data into training, validation, and testing sets.
• The training set is used to train the model.
• The validation set is used to optimize the hyperparameters and prevent overfitting.
• The testing set is used to evaluate the final performance of the model.
• Data sampling techniques:
• Hold-Out → The original dataset is randomly divided into two subsets: the training
set and the testing set.
• K-fold Cross-Validation → The original dataset is divided into K subsets (folds). The model is
then trained and tested K times, with each fold serving as the testing set once and
the remaining folds serving as the training set.
• Leave-One-Out → A special case of K-fold cross-validation in which K equals n, the
number of samples in the dataset, so each sample serves as the testing set exactly once.
• Bootstrapping (the resampling step behind bagging) → Multiple new datasets are created by randomly sampling
from the original dataset with replacement, and the performance is then
estimated on each of these new datasets.
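The four sampling techniques above can be sketched as index generators. This is a minimal illustration using only the Python standard library. The function names are hypothetical, and the K-fold helper does not shuffle, which is an assumption made for readability.

```python
import random


def hold_out_split(n_samples, test_fraction, rng):
    """Randomly split indices 0..n_samples-1 into (train, test) lists."""
    indices = list(range(n_samples))
    rng.shuffle(indices)
    n_test = int(round(n_samples * test_fraction))
    return indices[n_test:], indices[:n_test]


def k_fold_splits(n_samples, k):
    """Yield (train, test) index lists; each fold is the test set exactly once."""
    indices = list(range(n_samples))
    # Distribute the samples as evenly as possible over the k folds.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


def leave_one_out_splits(n_samples):
    """Leave-one-out is K-fold cross-validation with K equal to n_samples."""
    return k_fold_splits(n_samples, n_samples)


def bootstrap_sample(n_samples, rng):
    """Draw n_samples indices with replacement (duplicates are expected)."""
    return [rng.randrange(n_samples) for _ in range(n_samples)]


rng = random.Random(0)  # fixed seed so the sketch is reproducible
train_idx, test_idx = hold_out_split(10, test_fraction=0.3, rng=rng)
folds = list(k_fold_splits(10, k=5))
loo = list(leave_one_out_splits(4))
boot = bootstrap_sample(10, rng)
print(len(train_idx), len(test_idx), len(folds), len(loo), len(boot))
```

Each function returns indices and not data, so the same split can be applied to the features and the labels together.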
Evaluation – Data Sampling
• [Figure: Hold-Out method]
• [Figure: Leave-One-Out method]
• [Figure: Bootstrapping]
Evaluation Metrics – Regression
• Mean squared error (MSE): the average of squared differences between the
predicted output and the true output.
• Mean absolute error (MAE): the average of the absolute differences between the
predicted output and the true output.
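Written out, the two definitions are $\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n}(y_i - \hat{y}_i)^2$ and $\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}|y_i - \hat{y}_i|$, where $y_i$ is the true output and $\hat{y}_i$ the predicted output. A minimal sketch with hypothetical values:

```python
def mean_squared_error(y_true, y_pred):
    """Average of the squared differences between predictions and truth."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def mean_absolute_error(y_true, y_pred):
    """Average of the absolute differences between predictions and truth."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical true and predicted outputs.
y_true = [3.0, 5.0, 2.0, 7.0]
y_pred = [2.0, 5.0, 4.0, 8.0]
print(mean_squared_error(y_true, y_pred))   # 1.5
print(mean_absolute_error(y_true, y_pred))  # 1.0
```

Because MSE squares each error, the single error of 2 contributes 4 of the total 6, so MSE penalizes large errors more heavily than MAE does.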
Evaluation Metrics – Classification
• Confusion matrix → A summary of the number of correct and incorrect
predictions made by the model, comparing the predicted labels to the true labels
in the data.
• True Positives (TP): Positive records are
correctly classified as positive.
• True Negatives (TN): Negative records are
correctly classified as negative.
• False Positives (FP): Negative records are
misclassified as positive.
• False Negatives (FN): Positive records are
misclassified as negative.
Evaluation Metrics – Classification
• How to build the confusion matrix?
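One way to build the confusion matrix for a binary task is to walk through the true and predicted labels in pairs and count each of the four cases. The sketch below assumes the positive class is encoded as 1. The function name and the labels are hypothetical.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (TP, TN, FP, FN) for a binary classification task."""
    tp = tn = fp = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1  # positive record classified as positive
            else:
                fp += 1  # negative record misclassified as positive
        else:
            if truth == positive:
                fn += 1  # positive record misclassified as negative
            else:
                tn += 1  # negative record classified as negative
    return tp, tn, fp, fn


# Hypothetical true labels and model predictions.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 1, 0, 1, 0]
tp, tn, fp, fn = confusion_counts(y_true, y_pred)
print(tp, tn, fp, fn)  # 3 3 1 1
```

The four counts always sum to the number of records, which is a quick sanity check on the matrix.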
Unsupervised Learning
• In unsupervised learning, the dataset is a collection of unlabeled
examples (ONLY Features).
• Unsupervised learning methods discover hidden patterns in
data without any patterns or relationships being known in advance.
Unsupervised Learning
[Diagram: features (Feature 1 … Feature n) → unsupervised ML algorithm → grouping by internal similarity → Category (Cluster 1, Cluster 2)]
Example records:
Hyundai Creta | 2015 | Manual | 41000 | 1582 CC
Honda Jazz V | 2011 | Manual | 46000 | 1199 CC
Audi A4 | 2013 | Automatic | 40670 | 1968 CC
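The slides do not name a specific clustering algorithm. As one possible illustration of grouping unlabeled examples by internal similarity, the sketch below uses k-means with Euclidean distance. The function name, the deterministic initialization (first k points), and the toy points are assumptions made for this example.

```python
import math


def k_means(points, k, max_iters=100):
    """Cluster points into k groups; returns (labels, centroids)."""
    # Deterministic initialization for this sketch: the first k points.
    centroids = [tuple(p) for p in points[:k]]
    labels = None
    for _ in range(max_iters):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [min(range(k), key=lambda c: math.dist(p, centroids[c]))
                      for p in points]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(sum(dim) / len(members)
                                     for dim in zip(*members))
    return labels, centroids


# Hypothetical unlabeled examples with two features each.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.5),
          (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]
labels, centroids = k_means(points, k=2)
print(labels)  # [0, 0, 0, 1, 1, 1]
```

No labels are given to the algorithm; the cluster numbers it returns are arbitrary identifiers and carry no meaning of their own.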
Semi-Supervised Learning
• In semi-supervised learning, the dataset contains both labeled and unlabeled
examples.
• Usually, the quantity of unlabeled examples is much higher than the number of
labeled examples.
• The ML algorithm is first trained on the labeled data and then predicts a label for every
unlabeled sample (pseudo-labeled data). Then, a
new model can be trained with a mixture of labeled and pseudo-labeled data.
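The pseudo-labeling procedure described above can be sketched as follows. A 1-nearest-neighbor classifier stands in for "the ML algorithm" purely as an assumption. Any supervised model could take its place, and all names and data are hypothetical.

```python
import math


def nearest_neighbor_label(points, labels, query):
    """Stand-in base model: label of the closest labeled point (1-NN)."""
    i = min(range(len(points)), key=lambda j: math.dist(points[j], query))
    return labels[i]


def pseudo_label(labeled_points, labels, unlabeled_points):
    """Return a training set that mixes labeled and pseudo-labeled samples."""
    # Step 1: the model built from labeled data predicts each unlabeled sample.
    pseudo = [nearest_neighbor_label(labeled_points, labels, q)
              for q in unlabeled_points]
    # Step 2: merge both sets; a new model can then be trained on the mixture.
    return labeled_points + unlabeled_points, labels + pseudo


# Hypothetical data: few labeled examples, more unlabeled ones.
labeled = [(1.0, 1.0), (9.0, 9.0)]
labels = ["low", "high"]
unlabeled = [(1.5, 0.5), (8.0, 9.5), (2.0, 2.0), (7.5, 8.0)]

mixed_points, mixed_labels = pseudo_label(labeled, labels, unlabeled)
print(mixed_labels)
# The "new model" here is again 1-NN, now using the enlarged training set.
print(nearest_neighbor_label(mixed_points, mixed_labels, (3.0, 3.0)))
```

Pseudo-labels are only as good as the first model's predictions, so practical pipelines often keep only the predictions the model is confident about.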
Reinforcement Learning
• In Reinforcement learning, the machine “lives” in an environment and can
perceive the state of that environment as a vector of features. The machine can
execute actions in every state. Different actions bring different rewards and could
also move the machine to another state of the environment.
• Reinforcement learning → always looks for the BEST BEHAVIORS.
• The agent tries to select the BEST action among the options available in the current
state of the environment and, based on that selection, receives a reward or a penalty.
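The slides do not name a specific algorithm. As one possible illustration of learning from states, actions, and rewards, the sketch below uses tabular Q-learning on a tiny hypothetical environment: four states on a line, with a reward of 1 for reaching the last one. The environment, the parameter values, and all names are assumptions.

```python
import random

N_STATES = 4          # states 0..3 on a line; state 3 is the goal
ACTIONS = (0, 1)      # 0 = move left, 1 = move right


def step(state, action):
    """Hypothetical environment: returns (next_state, reward, done)."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def q_learning(episodes=500, alpha=0.5, gamma=0.9, seed=0):
    """Learn a table q[state][action] of expected discounted rewards."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Trial and error: explore with uniformly random actions.
            action = rng.choice(ACTIONS)
            next_state, reward, done = step(state, action)
            # Move the estimate toward reward + discounted best future value.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


q = q_learning()
# Best learned behavior: the highest-valued action in each non-goal state.
policy = [max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES - 1)]
print(policy)  # [1, 1, 1]
```

After training, the best action in every non-goal state is "move right", which is the shortest path to the reward.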
Machine Learning Applications
• Online Customer Support
Several websites nowadays offer the option to chat with customer support
representatives while navigating the site. In most cases, you talk to a chatbot. These
bots extract information from the website and present it to the customers.
With machine learning, they tend to understand user queries better and give
better answers.
• Search Engine Result Refining
Search engines use machine learning to improve search results. Every time you
execute a search, the algorithms watch how you respond to the results.
• Product Recommendations
The product recommendations are based on your behavior with the website/app,
past purchases, items liked or added to the cart, brand preferences, etc.
Machine Learning Applications
• Virtual Personal Assistants
E.g., Siri, Alexa, and Google. Machine learning collects and refines the
information based on your previous interactions with them.
• Traffic Predictions
Machine learning in such scenarios helps estimate the regions where
congestion can be found based on daily experiences.
www.psut.edu.jo
Princess Sumaya University for Technology
Amman 11941 Jordan
P.o.Box 1438 Al-Jubaiha