Assignment ML3

The assignment involves using the Pen-Digits dataset to implement and evaluate machine learning techniques including decision trees, bagging, and boosting. Key tasks include generating visualizations, fitting models, tuning hyperparameters, and comparing model performances based on various metrics. Additionally, the assignment emphasizes improving model accuracy through PCA and feature selection, with a focus on evaluating the impact of these enhancements.

Uploaded by

harithmsylhy3


Assignment: ML (3)

Dataset:
Use the Pen-Digits dataset (training set and test set), with the provided splits, to answer
the questions below.

# Decision tree
1- Generate a scatterplot matrix to show the relationships between the variables and a
heatmap to determine correlated attributes, then write a summary of what you noticed.
2- Ensure data is in the correct format for downstream processes (e.g., remove redundant
information, convert categorical to numerical values, address missing values, etc.)
3- Fit a decision tree to the training data. Plot the tree, interpret the results, and display
accuracy and Confusion Matrix.
4- Try different ways to improve the decision tree algorithm (e.g., use different splitting
strategies, prune the tree after splitting). Does pruning the tree improve the accuracy?
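
The fitting and pruning steps above could be sketched roughly as follows. This is a minimal illustration, not a reference solution: it assumes scikit-learn is available, uses scikit-learn's bundled `load_digits` data as a stand-in for Pen-Digits, and makes its own split instead of the provided one. Cost-complexity pruning is only one possible pruning strategy.

```python
from sklearn.datasets import load_digits
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.tree import DecisionTreeClassifier, export_text

# Stand-in data (assumption): scikit-learn's bundled digits set, not Pen-Digits.
X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# Baseline: unpruned tree with the Gini splitting criterion.
tree = DecisionTreeClassifier(criterion="gini", random_state=0)
tree.fit(X_train, y_train)
y_pred = tree.predict(X_test)
base_acc = accuracy_score(y_test, y_pred)
cm = confusion_matrix(y_test, y_pred)
print(export_text(tree, max_depth=2))  # text view of the top levels of the tree
print(f"unpruned accuracy: {base_acc:.3f}")
print(cm)

# Cost-complexity pruning: take candidate alphas from the pruning path and
# pick one by cross-validation on the training data only.
path = tree.cost_complexity_pruning_path(X_train, y_train)
candidate_alphas = [max(float(a), 0.0) for a in path.ccp_alphas[::10]]
cv_scores = {
    a: cross_val_score(
        DecisionTreeClassifier(random_state=0, ccp_alpha=a), X_train, y_train, cv=3
    ).mean()
    for a in candidate_alphas
}
best_alpha = max(cv_scores, key=cv_scores.get)

pruned = DecisionTreeClassifier(random_state=0, ccp_alpha=best_alpha)
pruned.fit(X_train, y_train)
pruned_acc = accuracy_score(y_test, pruned.predict(X_test))
print(f"pruned accuracy (ccp_alpha={best_alpha:.5f}): {pruned_acc:.3f}")
```

Comparing `base_acc` with `pruned_acc` gives one data point for the pruning question. Passing `criterion="entropy"` to the same class is one way to try a different splitting strategy.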

# Bagging
(Bagging generates a set of bootstrap datasets, fits an estimator on each bootstrap
dataset, and combines their predictions by majority voting (soft or hard) to reach the final decision.)
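
The definition above can be written out by hand in a few lines. The sketch below is only an illustration under stated assumptions: it uses scikit-learn's bundled `load_digits` data as a stand-in for Pen-Digits, decision trees as the base estimator, 15 bootstrap rounds (an arbitrary choice), and hard voting.

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Stand-in data (assumption): scikit-learn's bundled digits set, not Pen-Digits.
X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

rng = np.random.default_rng(0)
n_estimators = 15  # arbitrary illustrative value
n_classes = len(np.unique(y_train))

votes = []
for _ in range(n_estimators):
    # Bootstrap dataset: sample the training rows with replacement.
    idx = rng.integers(0, len(X_train), size=len(X_train))
    estimator = DecisionTreeClassifier(random_state=0)
    estimator.fit(X_train[idx], y_train[idx])
    votes.append(estimator.predict(X_test))

votes = np.stack(votes)  # shape: (n_estimators, n_test_samples)

# Hard voting: the most frequent predicted label for each test sample.
y_pred = np.array(
    [np.bincount(column, minlength=n_classes).argmax() for column in votes.T]
)
bagged_acc = float((y_pred == y_test).mean())
print(f"hand-rolled bagging accuracy: {bagged_acc:.3f}")
```

Soft voting would average the class probabilities from `predict_proba` instead of counting labels.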
1- Apply the bagging strategy to classify the test set samples, using the SVM and Decision Tree
algorithms as base estimators. Display the accuracy and Confusion Matrix for each.
2- Apply the Random Forest algorithm (the baseline), then fine-tune this baseline. For the
number of estimators, try 5 different values within the interval [10, 200]. Plot
accuracy vs. number of estimators.
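
One possible shape for these two tasks, using scikit-learn's `BaggingClassifier` and `RandomForestClassifier`, is sketched below. It again assumes the bundled `load_digits` data as a stand-in for Pen-Digits; the 10 bagged estimators and the five forest sizes are illustrative choices, not values the assignment prescribes.

```python
from sklearn.datasets import load_digits
from sklearn.ensemble import BaggingClassifier, RandomForestClassifier
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Stand-in data (assumption): scikit-learn's bundled digits set, not Pen-Digits.
X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# Bagging with two different base estimators. The base estimator is passed
# positionally because its keyword name differs between scikit-learn versions.
bagged_models = {
    "bagged SVM": BaggingClassifier(SVC(), n_estimators=10, random_state=0),
    "bagged tree": BaggingClassifier(
        DecisionTreeClassifier(random_state=0), n_estimators=10, random_state=0
    ),
}
bag_acc = {}
for name, model in bagged_models.items():
    y_pred = model.fit(X_train, y_train).predict(X_test)
    bag_acc[name] = accuracy_score(y_test, y_pred)
    print(name, f"accuracy: {bag_acc[name]:.3f}")
    print(confusion_matrix(y_test, y_pred))

# Random forest: accuracy for 5 values of n_estimators in [10, 200].
rf_acc = {}
for n in (10, 50, 100, 150, 200):
    forest = RandomForestClassifier(n_estimators=n, random_state=0)
    forest.fit(X_train, y_train)
    rf_acc[n] = accuracy_score(y_test, forest.predict(X_test))
print(rf_acc)
```

The keys and values of `rf_acc` are the x and y coordinates for the accuracy vs. number of estimators plot. For model selection rather than plotting, a validation split or cross-validation would avoid choosing the forest size on the test set.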

# Boosting
1- Use the GradientBoosting classifier to classify the test set samples. GradientBoosting has 2
important hyperparameters: the number of estimators and the learning rate.
First, tune the number of estimators by trying 4 values in the interval [10,
200]. Then, using the tuned number of estimators, tune the learning rate
by trying 4 values within the range [0.1, 0.9]. Display the accuracy and
Confusion Matrix separately for the best value of each parameter (number of
estimators and learning rate).
2- Build an XGBoost classifier with the same parameter values that you obtained in the
previous question. Provide the accuracy and Confusion Matrix.
3- Comment on Bagging and Boosting approaches.
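
The two-stage tuning described in the first boosting question might look like the sketch below. It is illustrative only: it assumes scikit-learn's bundled `load_digits` data as a stand-in for Pen-Digits, carves a validation split out of the training data for tuning, and picks the grids `[10, 70, 130, 200]` and `[0.1, 0.3, 0.6, 0.9]` arbitrarily within the stated intervals. `staged_predict` is used so that one fitted model can be scored at several ensemble sizes.

```python
from sklearn.datasets import load_digits
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split

# Stand-in data (assumption): scikit-learn's bundled digits set, not Pen-Digits.
X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)
# Validation split used only for tuning (an assumption of this sketch).
X_tr, X_val, y_tr, y_val = train_test_split(
    X_train, y_train, test_size=0.25, stratify=y_train, random_state=0
)

n_grid = [10, 70, 130, 200]      # illustrative values in [10, 200]
lr_grid = [0.1, 0.3, 0.6, 0.9]   # illustrative values in [0.1, 0.9]

# Stage 1: fit once with the largest size, then score intermediate sizes.
gb = GradientBoostingClassifier(n_estimators=max(n_grid), random_state=0)
gb.fit(X_tr, y_tr)
staged = list(gb.staged_predict(X_val))  # staged[i] uses i + 1 estimators
n_scores = {n: accuracy_score(y_val, staged[n - 1]) for n in n_grid}
best_n = max(n_scores, key=n_scores.get)

# Stage 2: tune the learning rate with the number of estimators fixed.
lr_scores = {}
for lr in lr_grid:
    model = GradientBoostingClassifier(
        n_estimators=best_n, learning_rate=lr, random_state=0
    ).fit(X_tr, y_tr)
    lr_scores[lr] = accuracy_score(y_val, model.predict(X_val))
best_lr = max(lr_scores, key=lr_scores.get)

# Final model: refit on the full training set, evaluate on the test set.
final = GradientBoostingClassifier(
    n_estimators=best_n, learning_rate=best_lr, random_state=0
).fit(X_train, y_train)
y_pred = final.predict(X_test)
test_acc = accuracy_score(y_test, y_pred)
cm = confusion_matrix(y_test, y_pred)
print(f"n_estimators={best_n}, learning_rate={best_lr}, accuracy={test_acc:.3f}")
print(cm)
```

If the `xgboost` package is installed, `best_n` and `best_lr` can be passed as `n_estimators` and `learning_rate` to `xgboost.XGBClassifier` for the second question.
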
# Improving with PCA and Feature Selection
1- Compare the performance of the models in terms of the following criteria: precision,
recall, accuracy, F1-score. Identify the model that performed best and worst according to
each criterion.
2- Choose the best model, then reduce its complexity and focus on the most important
features by using Principal Component Analysis (PCA) and feature selection, with the aim
of improving its performance further.
3- Evaluate the improved model in terms of precision, recall, accuracy, and F1-score.
This shows the impact of PCA and feature selection on its performance.
4- Compare the enhanced model's performance with that of its original version, and write a
comment on the comparison.
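
A rough sketch of the comparison in this section is given below. It rests on several assumptions: the bundled `load_digits` data stands in for Pen-Digits, a random forest is treated as the "best model" purely for illustration, feature selection is done with `SelectFromModel` on forest importances (one option among many), PCA keeps 95% of the variance, and metrics are macro-averaged. `macro_scores` is a helper defined here, not a library function.

```python
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectFromModel
from sklearn.metrics import accuracy_score, precision_recall_fscore_support
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler


def macro_scores(model, X_eval, y_eval):
    """Return accuracy plus macro-averaged precision, recall and F1."""
    y_pred = model.predict(X_eval)
    precision, recall, f1, _ = precision_recall_fscore_support(
        y_eval, y_pred, average="macro", zero_division=0
    )
    return {
        "accuracy": accuracy_score(y_eval, y_pred),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Stand-in data (assumption): scikit-learn's bundled digits set, not Pen-Digits.
X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# Original model (assumed "best" for the sake of the example).
baseline = RandomForestClassifier(n_estimators=100, random_state=0)
baseline.fit(X_train, y_train)

# Enhanced model: keep features with at least median importance, standardise,
# project onto the components explaining 95% of the variance, then classify.
reduced = Pipeline([
    ("select", SelectFromModel(
        RandomForestClassifier(n_estimators=100, random_state=0),
        threshold="median",
    )),
    ("scale", StandardScaler()),
    ("pca", PCA(n_components=0.95, random_state=0)),
    ("clf", RandomForestClassifier(n_estimators=100, random_state=0)),
])
reduced.fit(X_train, y_train)

scores_base = macro_scores(baseline, X_test, y_test)
scores_reduced = macro_scores(reduced, X_test, y_test)
for metric in scores_base:
    print(f"{metric:9s} original={scores_base[metric]:.3f} "
          f"reduced={scores_reduced[metric]:.3f}")
```

Reducing dimensionality does not guarantee higher scores; the printed table only makes the trade-off between model size and each metric visible.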
