05 - Machine Learning

The document provides an overview of machine learning approaches, including classical, reinforcement, and ensemble learning, as well as neural networks and deep learning. It discusses the differences between supervised and unsupervised learning, highlighting their applications, drawbacks, and evaluation methods. Additionally, it addresses issues related to data preparation, classification, prediction, and the challenges of overfitting and underfitting in model performance.

Uploaded by

buitouyenglobaltwe

Introduction to data science

Overview of Machine Learning

Machine Learning Approaches

◼ Classical learning
  ◼ Supervised learning
  ◼ Unsupervised learning
  ◼ Semi-supervised learning
◼ Ensemble learning
  ◼ Boosting
  ◼ Bagging
  ◼ Stacking
◼ Reinforcement learning
  ◼ Genetic Algorithm (GA)
  ◼ Q-Learning
  ◼ …
◼ Neural nets (NN) and deep learning
  ◼ Feed forward NN
  ◼ Back Propagation NN
  ◼ Convolutional NN
  ◼ Recurrent NN
  ◼ …
Supervised vs. Unsupervised Learning

◼ Supervised learning (classification)
  ◼ Supervision: The training data (observations, measurements, etc.) are accompanied by labels indicating the class of the observations
  ◼ New data is classified based on the training set
◼ Unsupervised learning (clustering)
  ◼ The class labels of the training data are unknown
  ◼ Given a set of measurements, observations, etc., the aim is to establish the existence of classes or clusters in the data
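
The contrast can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the 1-D measurements, the labels "low"/"high", and the helper names `fit_centroids`, `classify` and `two_means` are all hypothetical and do not come from the slides. The supervised part uses the labels to build one centroid per class; the unsupervised part sees only the measurements and looks for two groups on its own.

```python
from statistics import mean

# Hypothetical 1-D measurements; labels are only available in the supervised case.
train_x = [1.0, 1.2, 0.8, 5.0, 5.3, 4.9]
train_y = ["low", "low", "low", "high", "high", "high"]

def fit_centroids(xs, ys):
    """Supervised: use the class labels to compute one centroid per class."""
    return {label: mean(x for x, y in zip(xs, ys) if y == label)
            for label in set(ys)}

def classify(centroids, x):
    """Assign a new observation to the class with the nearest centroid."""
    return min(centroids, key=lambda label: abs(centroids[label] - x))

def two_means(xs, iterations=10):
    """Unsupervised: look for two cluster centers without using any labels."""
    c0, c1 = min(xs), max(xs)  # simple initialization at the extremes
    for _ in range(iterations):
        g0 = [x for x in xs if abs(x - c0) <= abs(x - c1)]
        g1 = [x for x in xs if abs(x - c0) > abs(x - c1)]
        c0, c1 = mean(g0), mean(g1)
    return c0, c1

centroids = fit_centroids(train_x, train_y)   # needs labels
new_label = classify(centroids, 4.5)          # new data classified from the training set
cluster_centers = two_means(train_x)          # no labels used at all
```

On this toy data the two approaches find the same two groups, but only the supervised model can name them, because the names come from the labels.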
Supervised Learning: Classification vs. Prediction

◼ Classification
  ◼ predicts categorical class labels (discrete or nominal)
  ◼ classifies data (constructs a model) based on the training set and the values (class labels) in a classifying attribute and uses it in classifying new data
◼ Prediction (Regression)
  ◼ models continuous-valued functions, i.e., predicts unknown or missing values
◼ Typical applications
  ◼ Credit approval
  ◼ Target marketing
  ◼ Medical diagnosis
  ◼ Fraud detection
Supervised Learning: Drawbacks

◼ Supervised learning requires human expertise: Expert annotators play an invaluable role in guiding your model’s training, but they can be difficult to recruit.
◼ Supervised learning is labor-intensive: You’ll need to have a big enough team with relevant expertise to accurately label large datasets.
◼ Supervised learning is time-intensive: In addition to top talent, you’ll need the bandwidth to accurately annotate the dataset so that your model is capable of producing predictable outcomes.
Classification: A Two-Step Process

◼ Model construction: describing a set of predetermined classes
  ◼ Each tuple/sample is assumed to belong to a predefined class, as determined by the class label attribute
  ◼ The set of tuples used for model construction is the training set
  ◼ The model is represented as classification rules, decision trees, or mathematical formulae
◼ Model usage: for classifying future or unknown objects
  ◼ Estimate accuracy of the model
    ◼ The known label of each test sample is compared with the classified result from the model
    ◼ Accuracy rate is the percentage of test set samples that are correctly classified by the model
    ◼ The test set must be independent of the training set; otherwise the accuracy estimate is overly optimistic and overfitting goes undetected
  ◼ If the accuracy is acceptable, use the model to classify data tuples whose class labels are not known
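
A minimal sketch of the two steps, assuming a hypothetical dataset with a single numeric feature and a simple threshold model (neither is taken from the slides). The helper names `holdout_split` and `accuracy_rate` are illustrative. The point is that the samples used to estimate accuracy never take part in model construction.

```python
import random

def holdout_split(samples, test_fraction=0.25, seed=0):
    """Shuffle and split so that no sample appears in both sets."""
    shuffled = samples[:]
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

def accuracy_rate(model, labelled_set):
    """Percentage of samples whose known label matches the model's output."""
    correct = sum(1 for x, label in labelled_set if model(x) == label)
    return 100.0 * correct / len(labelled_set)

# Hypothetical data: the label is "pos" when the feature exceeds 5.
samples = [(x, "pos" if x > 5 else "neg") for x in range(12)]
train_set, test_set = holdout_split(samples)

# Step 1 (model construction): place a threshold halfway between the
# two classes as observed in the training set only.
largest_neg = max(x for x, label in train_set if label == "neg")
smallest_pos = min(x for x, label in train_set if label == "pos")
threshold = (largest_neg + smallest_pos) / 2

def model(x):
    return "pos" if x > threshold else "neg"

# Step 2 (model usage): estimate accuracy on the independent test set
# before using the model on tuples whose labels are unknown.
test_accuracy = accuracy_rate(model, test_set)
```

Measuring accuracy on `train_set` instead would always report 100% for this model, which is why the estimate has to come from the held-out samples.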
Process (1): Model Construction

Training Data → Classification Algorithms → Classifier (Model)

Training data:

NAME   RANK             YEARS   TENURED
Mike   Assistant Prof   3       no
Mary   Assistant Prof   7       yes
Bill   Professor        2       yes
Jim    Associate Prof   7       yes
Dave   Assistant Prof   6       no
Anne   Associate Prof   3       no

Classifier (Model):

IF rank = ‘professor’ OR years > 6 THEN tenured = ‘yes’
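
The rule can be checked against the six training tuples with a short Python sketch. The function name `tenured_rule` is illustrative, and returning 'no' when the condition fails is an assumption: the slide only states the 'yes' branch.

```python
# Training data transcribed from the slide: (name, rank, years, tenured).
training_data = [
    ("Mike", "Assistant Prof", 3, "no"),
    ("Mary", "Assistant Prof", 7, "yes"),
    ("Bill", "Professor", 2, "yes"),
    ("Jim", "Associate Prof", 7, "yes"),
    ("Dave", "Assistant Prof", 6, "no"),
    ("Anne", "Associate Prof", 3, "no"),
]

def tenured_rule(rank, years):
    """IF rank = 'professor' OR years > 6 THEN tenured = 'yes'.

    Assumption: every other case is classified as 'no'.
    """
    return "yes" if rank.lower() == "professor" or years > 6 else "no"

# Compare the rule's output with the known label of each training tuple.
matches = sum(tenured_rule(rank, years) == tenured
              for _, rank, years, tenured in training_data)
training_accuracy = 100.0 * matches / len(training_data)
```

The rule reproduces all six training labels, which says nothing yet about how it behaves on data it was not built from.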
Process (2): Using the Model in Prediction

Testing Data → Classifier

Testing data:

NAME      RANK             YEARS   TENURED
Tom       Assistant Prof   2       no
Merlisa   Associate Prof   7       no
George    Professor        5       yes
Joseph    Assistant Prof   7       yes

Unseen Data → Classifier: (Jeff, Professor, 4) → Tenured?
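
As a sketch, the tenure rule from the model construction step (IF rank = 'professor' OR years > 6 THEN tenured = 'yes') can be applied to the testing data and to the unseen tuple. The function name `tenured_rule` and the default 'no' branch are assumptions made for illustration.

```python
def tenured_rule(rank, years):
    """Rule from the model construction step; 'no' is the assumed default."""
    return "yes" if rank.lower() == "professor" or years > 6 else "no"

# Testing data transcribed from the slide: (name, rank, years, tenured).
testing_data = [
    ("Tom", "Assistant Prof", 2, "no"),
    ("Merlisa", "Associate Prof", 7, "no"),
    ("George", "Professor", 5, "yes"),
    ("Joseph", "Assistant Prof", 7, "yes"),
]

# Estimate accuracy: compare each known test label with the rule's output.
correct = sum(tenured_rule(rank, years) == tenured
              for _, rank, years, tenured in testing_data)
test_accuracy = 100.0 * correct / len(testing_data)
misclassified = [name for name, rank, years, tenured in testing_data
                 if tenured_rule(rank, years) != tenured]

# Use the model on the unseen tuple (Jeff, Professor, 4).
jeff_prediction = tenured_rule("Professor", 4)
```

Three of the four test tuples are classified correctly (75%); Merlisa has 7 years but is not tenured, so the rule gets her wrong. For Jeff the rule answers 'yes' because his rank is Professor.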
Machine learning in data mining
Issues regarding classification and prediction
Issues: Data Preparation

◼ Data cleaning
◼ Preprocess data in order to reduce noise and handle
missing values
◼ Relevance analysis (feature selection)
◼ Remove the irrelevant or redundant attributes
◼ Data transformation
◼ Generalize and/or normalize data
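
The three preparation steps can be sketched on a tiny table. Everything here is a hypothetical illustration: the records, the column names, and the specific choices (mean imputation for cleaning, dropping constant columns as a crude relevance filter, min-max scaling for normalization) are assumptions, not methods prescribed by the slides.

```python
from statistics import mean

# Hypothetical records; None marks a missing value.
rows = [
    {"age": 25, "income": 30000, "country": "VN"},
    {"age": None, "income": 52000, "country": "VN"},
    {"age": 41, "income": 74000, "country": "VN"},
]

def impute_mean(rows, column):
    """Data cleaning: replace missing values with the column mean."""
    fill = mean(r[column] for r in rows if r[column] is not None)
    return [{**r, column: fill if r[column] is None else r[column]} for r in rows]

def drop_constant_columns(rows):
    """Relevance analysis: a column with a single value carries no information."""
    keep = [c for c in rows[0] if len({r[c] for r in rows}) > 1]
    return [{c: r[c] for c in keep} for r in rows]

def min_max_normalize(rows, column):
    """Data transformation: rescale a numeric column to the range [0, 1].

    Assumes the column is not constant (constant columns were dropped above).
    """
    lo = min(r[column] for r in rows)
    hi = max(r[column] for r in rows)
    return [{**r, column: (r[column] - lo) / (hi - lo)} for r in rows]

cleaned = impute_mean(rows, "age")
selected = drop_constant_columns(cleaned)
prepared = min_max_normalize(min_max_normalize(selected, "age"), "income")
```

After these steps the missing age is filled with 33, the constant `country` column is gone, and both remaining columns lie between 0 and 1.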
Issues: Evaluating Classification Methods

◼ Accuracy
  ◼ classifier accuracy: predicting class label
  ◼ predictor accuracy: guessing value of predicted attributes
◼ Speed
  ◼ time to construct the model (training time)
  ◼ time to use the model (classification/prediction time)
◼ Robustness: handling noise and missing values
◼ Scalability: efficiency in disk-resident databases
◼ Interpretability
  ◼ understanding and insight provided by the model
◼ Other measures, e.g., goodness of rules, such as decision tree size or compactness of classification rules
Confusion matrix:

                      Actual class +                        Actual class –
Predicted class +     True Positive (TP)                    False Positive (FP), Type I error
Predicted class –     False Negative (FN), Type II error    True Negative (TN)
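
A short sketch of how the four cells are counted from paired actual and predicted labels. The label lists and the function name `confusion_counts` are hypothetical, chosen only to illustrate the definitions.

```python
def confusion_counts(actual, predicted, positive="+"):
    """Count the four confusion-matrix cells for a binary classifier."""
    tp = fp = fn = tn = 0
    for a, p in zip(actual, predicted):
        if p == positive and a == positive:
            tp += 1          # predicted +, actually +
        elif p == positive:
            fp += 1          # predicted +, actually - (Type I error)
        elif a == positive:
            fn += 1          # predicted -, actually + (Type II error)
        else:
            tn += 1          # predicted -, actually -
    return {"TP": tp, "FP": fp, "FN": fn, "TN": tn}

# Hypothetical labels for eight test samples.
actual    = ["+", "+", "+", "-", "-", "-", "-", "+"]
predicted = ["+", "-", "+", "-", "+", "-", "-", "+"]
counts = confusion_counts(actual, predicted)
```

For these lists the counts are TP = 3, FP = 1, FN = 1 and TN = 3, and the four cells always sum to the number of test samples.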
$\text{Miss Detection Rate} = \frac{FN}{TP + FN}$

$\text{False Alarm Rate} = \frac{FP}{FP + TN}$

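Example: Given a confusion matrix, calculate Accuracy, Precision, Recall and F1-Score.

$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$
$\text{Precision} = \frac{TP}{TP + FP}$
$\text{Recall} = \frac{TP}{TP + FN}$
$\text{F1-Score} = \frac{2 \cdot \text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}}$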
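
The confusion matrix for this example did not survive in the text, so the sketch below uses hypothetical counts (TP = 40, FP = 10, FN = 20, TN = 30) purely to show the arithmetic. It uses the standard definitions of each measure; the function name `classification_metrics` is illustrative.

```python
def classification_metrics(tp, fp, fn, tn):
    """Standard binary-classification measures from confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "precision": precision,
        "recall": recall,
        "f1": 2 * precision * recall / (precision + recall),
        "miss_detection_rate": fn / (tp + fn),   # equals 1 - recall
        "false_alarm_rate": fp / (fp + tn),
    }

# Hypothetical counts, not the values from the original slide.
metrics = classification_metrics(tp=40, fp=10, fn=20, tn=30)
```

With these counts, accuracy is 0.70, precision 0.80, recall about 0.667 and F1 about 0.727. The sketch does not guard against empty denominators, for example when the model never predicts the positive class.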
Issues: Evaluating Regression Methods
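
Mean Squared Error (MSE): $\text{MSE} = \frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2$

Mean Absolute Error (MAE): $\text{MAE} = \frac{1}{n} \sum_{i=1}^{n} |y_i - \hat{y}_i|$

Root Mean Square Error (RMSE): $\text{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2}$

Mean Absolute Percentage Error (MAPE): $\text{MAPE} = \frac{100\%}{n} \sum_{i=1}^{n} \left| \frac{y_i - \hat{y}_i}{y_i} \right|$

R2 (R-squared): $R^2 = 1 - \frac{SSR}{SST}$, with $SSR = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2$ and $SST = \sum_{i=1}^{n} (y_i - \bar{y})^2$

where: $y_i$ is the actual value, $\hat{y}_i$ is the predicted value, $\bar{y}$ is the mean of the actual values, and $n$ is the number of samples

SSR is the sum of squared residuals, and SST is the total sum of squares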
Calculate MSE, MAE, RMSE, R2
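
The data for this exercise is not present in the text, so the following sketch uses four hypothetical actual/predicted pairs to show how the measures are computed. It relies on the standard definitions, with R2 taken as 1 - SSR/SST; the function name `regression_metrics` is illustrative.

```python
from math import sqrt

def regression_metrics(actual, predicted):
    """Standard regression error measures for paired actual/predicted values."""
    n = len(actual)
    residuals = [y - y_hat for y, y_hat in zip(actual, predicted)]
    ssr = sum(r ** 2 for r in residuals)             # sum of squared residuals
    y_bar = sum(actual) / n
    sst = sum((y - y_bar) ** 2 for y in actual)      # total sum of squares
    return {
        "mse": ssr / n,
        "mae": sum(abs(r) for r in residuals) / n,
        "rmse": sqrt(ssr / n),
        # MAPE is undefined when an actual value is zero.
        "mape": 100.0 / n * sum(abs(r / y) for r, y in zip(residuals, actual)),
        "r2": 1 - ssr / sst,
    }

# Hypothetical values, not the data from the original slide.
actual = [3.0, 5.0, 2.0, 7.0]
predicted = [2.0, 5.0, 4.0, 8.0]
metrics = regression_metrics(actual, predicted)
```

For these values MSE is 1.5, MAE is 1.0, RMSE is about 1.225 and R2 is about 0.593.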

Issues: Overfitting and underfitting

▪ Underfitting happens when a model is too simple to capture the underlying patterns in the data
→ Poor performance on both the training and test sets
▪ Overfitting occurs when a model is too complex and memorizes the training data, including its noise, instead of learning patterns that generalize
→ Good performance on the training set but poor performance on the test set
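
A toy comparison can make the two failure modes concrete. The data and the three models below are hypothetical choices for illustration: the training points follow y = 2x with alternating noise of +1 and -1, the test points follow y = 2x exactly, a constant predictor stands in for an underfitting model, and a nearest-neighbour lookup stands in for a model that memorizes the training set.

```python
from statistics import mean

# Hypothetical data: training points are y = 2x with noise, test points are y = 2x.
train = [(0, 1), (2, 3), (4, 9), (6, 11), (8, 17)]
test = [(1.5, 3), (3.5, 7), (5.5, 11), (7.5, 15)]

def mse(model, data):
    """Mean squared error of a model over (x, y) pairs."""
    return mean((y - model(x)) ** 2 for x, y in data)

def fit_constant(data):
    """Too simple: always predicts the mean training target."""
    avg = mean(y for _, y in data)
    return lambda x: avg

def fit_line(data):
    """Least-squares straight line, a reasonable fit for this data."""
    x_bar = mean(x for x, _ in data)
    y_bar = mean(y for _, y in data)
    slope = (sum((x - x_bar) * (y - y_bar) for x, y in data)
             / sum((x - x_bar) ** 2 for x, _ in data))
    return lambda x: y_bar + slope * (x - x_bar)

def fit_memorizer(data):
    """Memorizes the training set: returns the target of the nearest training x."""
    return lambda x: min(data, key=lambda point: abs(point[0] - x))[1]

fitters = {"constant": fit_constant, "line": fit_line, "memorizer": fit_memorizer}
results = {name: (mse(fit(train), train), mse(fit(train), test))
           for name, fit in fitters.items()}
```

Each entry of `results` holds (training MSE, test MSE). The constant model is poor on both sets (about 32.96 and 20.64). The memorizer is perfect on the training set (0) but its test error (2.0) is far worse than the line's (0.04), because it reproduces the training noise.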
Other machine learning models

▪ Ensemble learning: