
Developing Curricula for Artificial Intelligence and Robotics (DeCAIR)

618535-EPP-1-2020-1-JO-EPPKA2-CBHE-JP

Course Title: Applied Machine Learning

Experiment Number: 8

Experiment Name: Regression and Classification Using Classical Techniques

Objectives: The students learn to solve regression and classification problems using various classical machine learning techniques.

Introduction: There are many classical machine learning techniques that can solve a wide range of machine learning problems. In this experiment, the students learn how to apply, tune, and evaluate many classical models to solve regression and classification problems.

Materials: Computer with Python integrated development environment (IDE) software installed (PyCharm is recommended).

Procedure

Exercise 1: Diabetes Dataset

The scikit-learn diabetes dataset has 10 baseline variables (age, sex, body mass index, average blood pressure, and six blood serum measurements) that were obtained for each of 442 diabetes patients, as well as the response of interest, a quantitative measure of disease progression one year after the baseline. For more detail, check: https://scikit-learn.org/stable/datasets/toy_dataset.html#diabetes-dataset

The dataset ships with scikit-learn, so you can load it using the following code:


from sklearn import datasets
X, y = datasets.load_diabetes(return_X_y=True)
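
As a quick sanity check, the sizes quoted above can be confirmed before any model is trained. This is a minimal sketch; the variable name `diabetes` is chosen here for illustration and is not part of the original handout.

```python
from sklearn import datasets

# Load the full Bunch object so that feature names are available as well.
diabetes = datasets.load_diabetes()
X, y = diabetes.data, diabetes.target

print(X.shape)                 # (number of patients, number of baseline variables)
print(y.shape)                 # one disease-progression value per patient
print(diabetes.feature_names)  # short names of the baseline variables
```
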

Using cross_val_score(reg, X, y, scoring="neg_mean_squared_error", cv=3), find which is the best regressor among the following regressors to predict diabetes disease progression one year after the baseline:

1. sklearn.linear_model.LinearRegression
2. sklearn.neighbors.KNeighborsRegressor
3. sklearn.tree.DecisionTreeRegressor
4. sklearn.ensemble.RandomForestRegressor
5. sklearn.svm.LinearSVR
6. sklearn.svm.SVR
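
One possible way to organize this comparison is sketched below. It keeps every regressor at its default hyperparameters; the fixed `random_state=0`, the raised `max_iter` for LinearSVR, and the dictionary name `rmse_by_model` are assumptions made here for repeatability and readability, not requirements of the exercise. The RMSE is taken as the square root of the mean cross-validated MSE.

```python
import numpy as np
from sklearn import datasets
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsRegressor
from sklearn.svm import SVR, LinearSVR
from sklearn.tree import DecisionTreeRegressor

X, y = datasets.load_diabetes(return_X_y=True)

# Untuned baselines; random_state is fixed only so that reruns match.
regressors = {
    "LinearRegression": LinearRegression(),
    "KNeighborsRegressor": KNeighborsRegressor(),
    "DecisionTreeRegressor": DecisionTreeRegressor(random_state=0),
    "RandomForestRegressor": RandomForestRegressor(random_state=0),
    "LinearSVR": LinearSVR(random_state=0, max_iter=10000),
    "SVR": SVR(),
}

rmse_by_model = {}
for name, reg in regressors.items():
    # The scorer returns negated MSE values (higher is better), one per fold.
    scores = cross_val_score(reg, X, y, scoring="neg_mean_squared_error", cv=3)
    rmse_by_model[name] = float(np.sqrt(-scores.mean()))

# Print the models from lowest (best) to highest RMSE.
for name, rmse in sorted(rmse_by_model.items(), key=lambda item: item[1]):
    print(f"{name:24s} RMSE = {rmse:.2f}")
```
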

Experiment with tuning the hyperparameters of some of these regressors to improve their respective RMSE. The RMSE is the square root of the negated neg_mean_squared_error score.
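
Tuning can be done by hand, or a grid search can automate it. The following is a minimal sketch using GridSearchCV on KNeighborsRegressor; the choice of model and the grid values are illustrative assumptions, not values prescribed by the handout. It uses the built-in "neg_root_mean_squared_error" scorer so that the RMSE can be read off directly.

```python
from sklearn import datasets
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KNeighborsRegressor

X, y = datasets.load_diabetes(return_X_y=True)

# Illustrative grid; other ranges may work better.
param_grid = {
    "n_neighbors": [3, 5, 10, 15, 20, 30],
    "weights": ["uniform", "distance"],
}

search = GridSearchCV(
    KNeighborsRegressor(),
    param_grid,
    scoring="neg_root_mean_squared_error",  # negated RMSE, higher is better
    cv=3,
)
search.fit(X, y)

best_rmse = -search.best_score_
print("Best hyperparameters:", search.best_params_)
print(f"Best cross-validated RMSE: {best_rmse:.2f}")
```

The same pattern applies to the other regressors by swapping the estimator and the grid.
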

Exercise 2: Iris Dataset

The Iris dataset is a famous dataset that contains the sepal and petal lengths and widths of 150 iris flowers of three different species: Setosa, Versicolor, and Virginica.
For more detail, check: https://scikit-learn.org/stable/datasets/toy_dataset.html#iris-dataset

The dataset ships with scikit-learn, so you can load it using the following code:

from sklearn import datasets
X, y = datasets.load_iris(return_X_y=True)

Using cross_val_score(clf, X, y, scoring="accuracy", cv=3), find which is the best classifier among the following classifiers to predict the iris species:

1. sklearn.linear_model.SGDClassifier
2. sklearn.neighbors.KNeighborsClassifier
3. sklearn.tree.DecisionTreeClassifier
4. sklearn.ensemble.RandomForestClassifier
5. sklearn.svm.LinearSVC
6. sklearn.svm.SVC
7. sklearn.ensemble.VotingClassifier
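
A possible layout for this comparison is sketched below, again with default hyperparameters. VotingClassifier has no default members, so the three base estimators and the hard-voting setting chosen here are assumptions for illustration only; `random_state=0`, the raised `max_iter`, and the name `accuracy_by_model` are likewise choices made here.

```python
from sklearn import datasets
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.linear_model import SGDClassifier
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC, LinearSVC
from sklearn.tree import DecisionTreeClassifier

X, y = datasets.load_iris(return_X_y=True)

# Untuned baselines; random_state is fixed only so that reruns match.
classifiers = {
    "SGDClassifier": SGDClassifier(random_state=0),
    "KNeighborsClassifier": KNeighborsClassifier(),
    "DecisionTreeClassifier": DecisionTreeClassifier(random_state=0),
    "RandomForestClassifier": RandomForestClassifier(random_state=0),
    "LinearSVC": LinearSVC(random_state=0, max_iter=10000),
    "SVC": SVC(),
}

# The ensemble members below are an arbitrary illustrative choice.
classifiers["VotingClassifier"] = VotingClassifier(
    estimators=[
        ("knn", KNeighborsClassifier()),
        ("tree", DecisionTreeClassifier(random_state=0)),
        ("svc", SVC()),
    ],
    voting="hard",
)

accuracy_by_model = {}
for name, clf in classifiers.items():
    scores = cross_val_score(clf, X, y, scoring="accuracy", cv=3)
    accuracy_by_model[name] = float(scores.mean())

# Print the models from highest (best) to lowest mean accuracy.
for name, acc in sorted(accuracy_by_model.items(), key=lambda item: -item[1]):
    print(f"{name:24s} accuracy = {acc:.3f}")
```
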

Experiment with tuning the hyperparameters of some of these classifiers to improve their respective accuracies.
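
As one example of such tuning, the sketch below runs a grid search over the C and gamma hyperparameters of an RBF-kernel SVC. The grid values are illustrative assumptions; the handout does not prescribe which hyperparameters or ranges to try.

```python
from sklearn import datasets
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

X, y = datasets.load_iris(return_X_y=True)

# Illustrative grid; it includes the SVC defaults (C=1, gamma="scale").
param_grid = {
    "C": [0.1, 1, 10, 100],
    "gamma": ["scale", 0.01, 0.1, 1],
    "kernel": ["rbf"],
}

search = GridSearchCV(SVC(), param_grid, scoring="accuracy", cv=3)
search.fit(X, y)

print("Best hyperparameters:", search.best_params_)
print(f"Best cross-validated accuracy: {search.best_score_:.3f}")
```
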

Data Collection: Capture the output of your code.

Data Analysis: Evaluate the models used.

Required Reporting:
• Submit your code and the captured output.
• Report the sensitivity of each model to its hyperparameters.
• Report the best hyperparameter values for each model.
• Report the best model for each problem.
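
One way to quantify the sensitivity of a model to a hyperparameter is to sweep that hyperparameter alone and record how the cross-validated score changes. The sketch below does this with validation_curve for the max_depth of a DecisionTreeClassifier on the iris data; the model, the hyperparameter, the depth values, and the use of the score range (`spread`) as a sensitivity measure are all assumptions made for illustration.

```python
from sklearn import datasets
from sklearn.model_selection import validation_curve
from sklearn.tree import DecisionTreeClassifier

X, y = datasets.load_iris(return_X_y=True)

depths = [1, 2, 3, 4, 5, 8]  # illustrative values of max_depth
train_scores, test_scores = validation_curve(
    DecisionTreeClassifier(random_state=0),
    X,
    y,
    param_name="max_depth",
    param_range=depths,
    scoring="accuracy",
    cv=3,
)

# Rows correspond to hyperparameter values, columns to cross-validation folds.
mean_test = test_scores.mean(axis=1)
for depth, acc in zip(depths, mean_test):
    print(f"max_depth={depth}: mean accuracy = {acc:.3f}")

# A large spread suggests that the model is sensitive to this hyperparameter.
spread = float(mean_test.max() - mean_test.min())
print(f"Accuracy spread across max_depth values: {spread:.3f}")
```
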

Safety Considerations: Standard safety precautions related to using a computer.

References:
1. Applied Machine Learning presentation titled “End-to-End Machine Learning Project.”
2. Applied Machine Learning presentation titled “Classification.”
3. Applied Machine Learning presentation titled “Classical Techniques.”
4. Aurélien Géron, Hands-On Machine Learning with Scikit-Learn, Keras and TensorFlow, O’Reilly, 3rd Edition, 2022.

The European Commission's support for the production of this publication does not constitute an endorsement of the contents, which reflect
the views only of the authors, and the Commission cannot be held responsible for any use which may be made of the information contained
therein.
