Supervised ML

The document provides an introduction to machine learning, covering supervised and unsupervised algorithms, dataset splitting, and model evaluation metrics. It details various classification algorithms, particularly focusing on Random Forest, including hyperparameter tuning using GridSearchCV. The document also emphasizes the importance of metrics like accuracy, precision, recall, and F1-score for evaluating model performance.

Introduction to Machine Learning

Prof. Denio Duarte
[email protected]
Introduction
● Learning based on dataset features
○ Supervised algorithms (labelled data)
■ Depending on the type of the label:
● Classification (discrete)
● Regression (continuous)
○ Unsupervised algorithms (unlabelled data)
Introduction
● Before starting
○ ML algorithms learn by applying a given approach that
generalizes from the data to predict the correct classes (labels)
○ The dataset must be split so that the built model can be
trained and then tested on unseen examples
■ Training set (+/- 70%)
■ Test set (+/- 30%)
Introduction
● Before starting
○ Dataset splitting

from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)
# random_state guarantees the same random split across executions
# shuffle=True (default) shuffles the examples before splitting
# stratify=y preserves the class proportions in both sets (default None: no stratification)
Introduction
● Before starting
○ A model must be evaluated using metrics
■ Classification is the simplest case
– Compare the real value against the predicted one
■ Evaluating regression models is trickier
– Subtract the real value from the predicted one (0 means
that real and predicted are the same) – the residual error
(see the sketch below)
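
A minimal sketch of the residual idea, assuming NumPy arrays of real and predicted values (the numbers below are made up for illustration):

import numpy as np

y = np.array([3.0, 5.0, 2.5])      # real values
y_hat = np.array([2.8, 5.4, 2.5])  # predicted values

residuals = y - y_hat              # 0 means real and predicted agree
print(residuals)                   # [ 0.2 -0.4  0. ]
print(np.mean(residuals ** 2))     # regression metrics aggregate residuals, e.g. MSE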
Introduction
● Before starting
○ Classification (metrics: sklearn.metrics)
■ Balanced classes
● Accuracy (metrics.accuracy_score(y, y_hat))
■ Imbalanced classes
● Precision (metrics.precision_score(y, y_hat))
● Recall (metrics.recall_score(y, y_hat))
● F1-score (metrics.f1_score(y, y_hat))
■ Getting all metrics at once (see the sketch below)
● metrics.classification_report(y, y_hat)
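
A minimal sketch of these calls, assuming y holds the real labels and y_hat the predictions of some fitted classifier (the small arrays are made up for illustration):

from sklearn import metrics

y = [0, 1, 1, 0, 1, 1]      # real labels
y_hat = [0, 1, 0, 0, 1, 1]  # predicted labels

print(metrics.accuracy_score(y, y_hat))   # fraction of correct predictions
print(metrics.precision_score(y, y_hat))  # of the predicted positives, how many are real
print(metrics.recall_score(y, y_hat))     # of the real positives, how many were found
print(metrics.f1_score(y, y_hat))         # harmonic mean of precision and recall
print(metrics.classification_report(y, y_hat))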
Introduction
● Confusion matrix
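
The figure on this slide is not reproduced; a minimal sketch of building a confusion matrix with scikit-learn, reusing the illustrative labels above:

from sklearn.metrics import confusion_matrix

y = [0, 1, 1, 0, 1, 1]
y_hat = [0, 1, 0, 0, 1, 1]

print(confusion_matrix(y, y_hat))  # rows: real classes, columns: predicted classes
# [[2 0]
#  [1 3]]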
Classification
● Some algorithms
○ Decision tree
○ Random forest
○ Support vector machines
○ Logistic regression
Random Forest
● Several decision trees (estimators) are built, and each
one predicts a value
○ The class with the most votes is chosen (see the sketch below)

Source: https://towardsdatascience.com/understanding-random-forest-58381e0602d2
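
A minimal sketch of fitting a forest and inspecting the per-tree predictions, assuming X_train, X_test, y_train come from the earlier split (the inspection loop is illustrative, not from the slides):

import numpy as np
from sklearn.ensemble import RandomForestClassifier

rf = RandomForestClassifier(n_estimators=100, random_state=42)
rf.fit(X_train, y_train)

# each fitted tree in rf.estimators_ can predict on its own
votes = np.array([tree.predict(X_test) for tree in rf.estimators_])
print(votes.shape)         # (100, number of test examples)
# note: scikit-learn actually averages the trees' class probabilities
# rather than counting hard votes, but the idea is the same
print(rf.predict(X_test))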
Random Forest
● The most informative attributes are placed closer to the
root (an informativeness function is applied)
○ entropy
○ gini (an alternative to entropy that is computationally
simpler, since it avoids logarithms; see the sketch below)

Source: https://towardsdatascience.com/understanding-random-forest-58381e0602d2
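
A small worked sketch of the two impurity measures, computed from the class proportions p of a node (not from the slides):

import numpy as np

def entropy(p):
    # Shannon entropy: -sum(p_i * log2(p_i)), ignoring zero proportions
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

def gini(p):
    # Gini impurity: 1 - sum(p_i ** 2); no logarithm needed
    return 1.0 - np.sum(p ** 2)

p = np.array([0.5, 0.5])  # a perfectly mixed binary node
print(entropy(p))         # 1.0, the maximum for two classes
print(gini(p))            # 0.5, the maximum for two classes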
Random Forest
● Every ML algorithm has a set of hyperparameters used to
tune the model, e.g. for RandomForestClassifier:
○ criterion ['gini', 'entropy']
○ n_estimators n – number of trees (default 100)
○ max_features ['auto', 'sqrt', 'log2'] – maximal number of
attributes considered when splitting a node
○ max_depth n – maximal depth of a tree (default None)
○ bootstrap [True, False] – whether bootstrap samples are
used to build the trees
○ Several others are worth studying (see the sketch below)
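
As a hedged illustration, these hyperparameters are set when the estimator is created (the values below are arbitrary examples, not recommendations):

from sklearn.ensemble import RandomForestClassifier

rf = RandomForestClassifier(
    criterion='entropy',  # informativeness function
    n_estimators=200,     # number of trees
    max_features='sqrt',  # attributes considered at each split
    max_depth=10,         # limit the depth of each tree
    bootstrap=True,       # resample the training set for each tree
)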
Random Forest
● How to choose the best hyperparameters
○ sklearn.model_selection.GridSearchCV
○ Runs a given estimator over every combination of the
given hyperparameters and returns the best one

from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

param_grid = {'bootstrap': [True, False], 'n_estimators': [50, 100, 200, 300],
              'max_features': ['auto', 'sqrt', 'log2'], 'criterion': ['gini', 'entropy']}
# note: 'auto' was removed from RandomForestClassifier in scikit-learn 1.3;
# keep only 'sqrt' and 'log2' on recent versions
best_RF = GridSearchCV(estimator=RandomForestClassifier(), param_grid=param_grid)
# the grid has 2x4x3x2 = 48 combinations; each is fitted once per CV fold (5 by default)
best_RF.fit(X_train, y_train)
best_RF.best_estimator_
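
A possible follow-up to inspect and evaluate the winning model, assuming the earlier X_test/y_test split (a sketch, not from the original slides):

from sklearn import metrics

print(best_RF.best_params_)      # the winning hyperparameter combination
y_hat = best_RF.predict(X_test)  # uses the best model, refit on the whole training set
print(metrics.accuracy_score(y_test, y_hat))
print(metrics.precision_score(y_test, y_hat))  # for multiclass data, pass e.g. average='macro'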
Exercise
● Based on the previous exercise
○ Propose a set of hyperparameters and show the best
model using precision and accuracy
