PythonForML2023 Laboratory07 08 Regression Classification Update2
During this laboratory, you will learn the basic implementation and use of regression and classification models in Python. We are using the same red wine dataset, which can be found at https://archive.ics.uci.edu/dataset/186/wine+quality. Next time, you will implement your own ideas, based on the presented examples, for the project datasets.
2. Regression
Regression aims to predict a target value based on some features. We will use several models for this purpose, all of which are available in the scikit-learn library. To start, let's use a very simple linear regression. After importing LinearRegression from sklearn.linear_model, we create a model simply by calling LinearRegression(). Then, to train our model, we use the function fit(), which takes the training data and their labels as arguments. Once the model is trained, we can use the function predict() to make predictions. Importing, creating, and training models, as well as making predictions for the training and validation sets, are shown in the listing below. As you can see, with scikit-learn it is very straightforward.
from sklearn.linear_model import LinearRegression

reg = LinearRegression()
reg.fit(x_train, y_train)
y_pred_train = reg.predict(x_train)
y_pred = reg.predict(x_val)
For the evaluation of the obtained predictions, we will use the root mean squared error (RMSE), which is calculated according to the following equation:

RMSE = \sqrt{\frac{1}{N} \sum_{n=1}^{N} (\hat{y}_n - y_n)^2},

where \hat{y}_n is the predicted value, y_n is the real value, and N is the number of samples. RMSE can be calculated from scratch as follows.
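The from-scratch listing is missing from this version of the handout; below is a minimal sketch, assuming y_val holds the validation labels and y_pred the predictions obtained above:

import numpy as np

# square the errors, average them, and take the root, as in the equation above
rmse = np.sqrt(np.mean((y_pred - y_val) ** 2))

Alternatively, scikit-learn provides mean_squared_error in sklearn.metrics, from which the root can be taken.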
Task 1: Split the data into training, validation, and test sets. Take the quality as the target value y, and do not forget to drop it from the rest of the data. Implement linear regression and calculate the RMSE for the training and validation sets.
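One possible approach (the split ratios, random_state, and the DataFrame name df are illustrative assumptions) applies scikit-learn's train_test_split twice:

from sklearn.model_selection import train_test_split

y = df['quality']                    # target
x = df.drop(columns=['quality'])     # features without the target
# 60/20/20 split: first separate the test set, then split the rest into train and validation
x_rest, x_test, y_rest, y_test = train_test_split(x, y, test_size=0.2, random_state=42)
x_train, x_val, y_train, y_val = train_test_split(x_rest, y_rest, test_size=0.25, random_state=42)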
Task 2: A simple linear model can easily be extended to a weighted linear model just by using the additional parameter sample_weight. Check how to use this parameter and implement weighted linear regression.
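As an illustration only, a minimal sketch in which the weighting scheme (up-weighting high-quality wines) is an arbitrary assumption:

import numpy as np
from sklearn.linear_model import LinearRegression

# give samples with quality >= 7 twice the weight of the others
weights = np.where(y_train >= 7, 2.0, 1.0)
reg_w = LinearRegression()
reg_w.fit(x_train, y_train, sample_weight=weights)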
Now, let's consider a more sophisticated model: a neural network. To use it, we need to import MLPRegressor from sklearn.neural_network.
As neural networks are more complex, we should consider some parameters for the customization of our model:
• Parameter hidden_layer_sizes takes a tuple where each number denotes the number of neurons
in a particular hidden layer. For example, (15, 10, 5) stands for 15 neurons in the first hidden layer,
10 neurons in the second hidden layer, and 5 neurons in the third hidden layer.
• random_state allows for reproducibility.
• max_iter is the maximal number of epochs, in case convergence is not reached earlier.
• solver sets the algorithm for weight optimization.
These are only selected parameters; you can find the description of all of them in the scikit-learn documentation.
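Putting these parameters together, a minimal sketch of creating and training such a regressor (the particular values are only an example):

from sklearn.neural_network import MLPRegressor

# three hidden layers (15, 10, and 5 neurons), Adam optimizer, fixed seed for reproducibility
reg = MLPRegressor(hidden_layer_sizes=(15, 10, 5), solver='adam', max_iter=500, random_state=42)
reg.fit(x_train, y_train)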
Task 3: Implement a neural network regressor. Make predictions and calculate the RMSE for the training and validation sets. Repeat the previous step, trying your own ideas for the model parameter setup.
As you can see, once the model is created, training and making predictions look the same for different models, which helps keep the code consistent and makes examining different models easy. However, not all regression methods have a direct implementation. Still, they may often be implemented quite easily with scikit-learn. If you would like to learn how to cleverly implement polynomial regression using a linear regression model, take a look here: https://scikit-learn.org/stable/modules/linear_model.html#polynomial-regression-extending-linear-models-with-basis-functions. Generally, handling nonlinear models as linear models operating on nonlinear functions of the features might be beneficial, as it keeps the fast performance of linear methods while allowing more complex problems to be solved.
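A minimal sketch of that approach, with degree 2 as an arbitrary choice:

from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures

# expand the inputs with all polynomial terms up to degree 2,
# then fit an ordinary linear model on the expanded features
poly = PolynomialFeatures(degree=2)
x_train_poly = poly.fit_transform(x_train)
reg = LinearRegression()
reg.fit(x_train_poly, y_train)
y_pred = reg.predict(poly.transform(x_val))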
3. Classification
As classification aims to assign a class label to each sample, we will modify our task. Suppose that we want to predict whether a wine is good or not. Let's assume that a wine is good if its quality is higher than five. So, we can simply prepare the labels for training as follows. Do not forget to do the same for the other subsets, or to perform the thresholding in advance.
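The listing is missing from this version of the handout; a minimal sketch of the thresholding (overwriting y_train, so that the later listings can keep using the same name, is an assumption here):

# good wine: quality above five -> label 1, otherwise 0
y_train = (y_train > 5).astype(int)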
Having the labels arranged, we may pick the first classification model. Let's start with the logistic regression classifier. Similarly to the case of regression, after creating the model we use the function fit() for training and predict() for prediction.
from sklearn.linear_model import LogisticRegression

clf = LogisticRegression()
clf.fit(x_train, y_train)
y_pred_train = clf.predict(x_train)
y_pred = clf.predict(x_val)
For now, we will use just the accuracy metric to check the performance of the models. Generally, this is not enough, and we will discuss this issue during lab 9.
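For example, accuracy can be computed with scikit-learn's accuracy_score, assuming y_val has been thresholded in the same way as y_train:

from sklearn.metrics import accuracy_score

acc = accuracy_score(y_val, y_pred)   # fraction of correctly classified samples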
Task 5: Implement the logistic regression model, train it, and make predictions on the training and validation sets. Check its accuracy.
Task 6: Use the logistic regression model, but only with some selected features. Compare the two scenarios by picking four different features.
Various other models might be created and used similarly. Each of them has its own specific parameters that should be tuned, e.g. the kernel of the SVM, the number of neighbors for KNN, or the maximal depth for decision trees.
from sklearn import svm, tree
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier

clf = svm.SVC(kernel='rbf')
clf = MLPClassifier()
clf = KNeighborsClassifier(n_neighbors=5)
clf = tree.DecisionTreeClassifier(max_depth=3)
Task 7: Implement the SVM model. Try different kernels using the kernel parameter.
Task 8: Implement a multi-layer perceptron. Set your own parameter configuration. Some of the parameters were described in the section covering regression.
Task 10: Probably, you've easily obtained around 70-75% accuracy. Try improving the models' performance by configuring their parameters.
Task 11: Pick the model (with tuned parameters) that seems to work best for you. Now train it on both the training and validation sets and check its final performance on the test subset, as sketched below.
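A possible sketch of combining the subsets, assuming pandas objects and an already configured classifier clf:

import pandas as pd

# merge the training and validation data for the final fit
x_full = pd.concat([x_train, x_val])
y_full = pd.concat([y_train, y_val])
clf.fit(x_full, y_full)
print(clf.score(x_test, y_test))   # accuracy on the held-out test set (y_test thresholded like y_train)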
4. Pipelines
The scikit-learn module implements so-called pipelines, which facilitate creating processing routines when we want to execute the same processing path for different data sets, e.g. training and validation. This might include, for example, outlier removal or feature scaling. The Pipeline takes a list of tuples as an argument, each tuple constructed from a name and a transform. The name is a string, and a transform must implement the fit and transform methods. The final transform, serving as the estimator, only requires the fit method. Let's take a look at an example. Here we put the standardization and the MLP into one pipeline, so the data input to the clf.fit() method is first standardized, and then the neural network is trained.
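The construction listing is missing from this version of the handout; a minimal sketch consistent with the description above (the step names 'scaler' and 'mlp' are illustrative):

from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.neural_network import MLPClassifier

clf = Pipeline([
    ('scaler', StandardScaler()),   # transform step: standardize the features
    ('mlp', MLPClassifier()),       # final step: the estimator
])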
clf.fit(x_train_subset, y_train)
Task 12: Implement a pipeline consisting of two steps: standardization and a neural network. Select a reasonable size of the net and compare its performance with and without feature scaling while using the stochastic gradient descent ('sgd') algorithm for learning.
Task 13: Compare the regression performance (with one selected model) for all features and for a selected subset of features. Firstly, choose two features, then four, and then suggest your own concepts. Remember that for feature selection in regression, the correlation matrix is useful.
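For instance, assuming the whole dataset is in a DataFrame df, the correlations with the target can be inspected as follows:

# absolute correlations with the quality target, strongest first
print(df.corr()['quality'].abs().sort_values(ascending=False))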
Task 14: Implement a linear regression and a neural network regressor with the pH feature as the target.
Task 15: Implement KNN and decision tree classifiers. Try adjusting their parameters.
For your project datasets, discuss specific tasks with the course instructor. However, in general, all teams
should complete the following tasks.
Task 1: Perform the classification or regression task outlined in your project scope using the chosen model.
Draw conclusions.
Task 6: Describe the implementation of the model used to solve the project task in a dedicated report
section.