ML Lab 08 Manual - Logistic Regression (Ver7)
Machine Learning
Introduction
Objectives
Lab Conduct
code/screenshot/plot between the #START and #END parts of these
commented lines. Do NOT remove the commented lines.
• Use the tab key to provide the indentation in Python.
• When you provide the code in the report, keep the font size at 12.
Theory
Logistic Regression is another basic supervised learning technique besides
Linear Regression. In logistic regression, the linear regression algorithm is
modified by applying a sigmoid function to the predicted value, which causes the
prediction to fall between 0 and 1. Thus, logistic regression is actually a
classification technique built from linear regression. The sigmoid function is
a type of activation function. Aside from loss and accuracy, logistic
regression at times also requires calculation of precision and recall. This is
needed when the dataset is skewed, that is, when the two class labels are not
equally distributed in the dataset.
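As a compact restatement of the paragraph above, the hypothesis can be written as the sigmoid applied to the linear model. The symbols w and b (weights and bias) follow the gradient descent section of this manual; writing the linear part as a dot product is an assumption about notation.

```latex
g(z) = \frac{1}{1 + e^{-z}}, \qquad
h(x) = g\left(w \cdot x + b\right), \qquad
0 < h(x) < 1
```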
Lab Task 1 - Dataset Preparation, Feature Scaling ______________________
Download a dataset containing several columns. You will need to select at least
3 of the feature columns to make your own dataset. Also, you will need to
choose the label column that your model will predict. Ensure that the label
column has discrete, binary values. Specify the features and the label that you
choose. You will need to attach your dataset in the submission.
The dataset examples are to be divided into 2 separate portions: training and
validation datasets (choose a ratio between 80-20 and 70-30). Load the dataset
into your Python program, split it into the two portions and store them as NumPy
arrays (Xtrain, ytrain, Xval, yval). Next, use feature scaling to rescale the
feature columns of both datasets so that their values range from 0 to 1. Finally,
print both of the datasets (you need to show any 5 rows of each dataset).
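The split and rescaling steps could be sketched as follows. This is only an illustration under stated assumptions: the data here is synthetic placeholder data rather than a downloaded dataset, the helper names train_val_split and min_max_scale are hypothetical, and the minimum and maximum are taken from the training portion only, so validation values may fall slightly outside the 0 to 1 range.

```python
import numpy as np

def train_val_split(X, y, val_fraction=0.2, seed=0):
    """Shuffle the examples and split them into training and validation parts."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(X.shape[0])
    n_val = int(round(val_fraction * X.shape[0]))
    val_idx, train_idx = idx[:n_val], idx[n_val:]
    return X[train_idx], y[train_idx], X[val_idx], y[val_idx]

def min_max_scale(X_train, X_val):
    """Rescale each feature column using the training minimum and maximum."""
    x_min = X_train.min(axis=0)
    x_max = X_train.max(axis=0)
    # Guard against constant columns to avoid division by zero.
    span = np.where(x_max > x_min, x_max - x_min, 1.0)
    return (X_train - x_min) / span, (X_val - x_min) / span

# Synthetic placeholder data: 50 examples, 3 features, binary label (assumption).
rng = np.random.default_rng(42)
X = rng.uniform(0, 100, size=(50, 3))
y = (X[:, 0] + X[:, 1] > 100).astype(int)

Xtrain, ytrain, Xval, yval = train_val_split(X, y, val_fraction=0.2)
Xtrain, Xval = min_max_scale(Xtrain, Xval)

print("Training rows:\n", Xtrain[:5], ytrain[:5])
print("Validation rows:\n", Xval[:5], yval[:5])
```

Scaling with training statistics keeps information about the validation set out of the preprocessing. If strict 0 to 1 bounds are required on both portions, each portion could instead be scaled with its own minimum and maximum.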
represents the sigmoid activation function. In this task, you will write a function
that takes in a value z as argument and outputs the result of the sigmoid
activation g(z). Provide the code for this task. For the screenshot, provide the
plot of the sigmoid activation function.
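A minimal sketch of such a function, assuming the standard definition g(z) = 1 / (1 + e^(-z)) and NumPy input, is shown here. The function name sigmoid is an assumption; the manual only calls it g(z).

```python
import numpy as np

def sigmoid(z):
    """Return g(z) = 1 / (1 + exp(-z)) element-wise."""
    z = np.asarray(z, dtype=float)
    return 1.0 / (1.0 + np.exp(-z))

# Evaluate the function on a grid of z values.
z = np.linspace(-6, 6, 13)
g = sigmoid(z)
for zi, gi in zip(z, g):
    print(f"z = {zi:5.1f}  g(z) = {gi:.4f}")
```

The arrays z and g can be passed to a plotting library such as matplotlib to produce the required plot.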
cost_function(X, y)
X and y are the features and labels of the training/validation dataset. The
function will return the cost value. For binary classification, use the given cost
function:
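$$J(w) = \frac{1}{m}\sum_{i=0}^{m-1}\left[-y^{(i)}\log\left(h(x^{(i)})\right) - \left(1 - y^{(i)}\right)\log\left(1 - h(x^{(i)})\right)\right]$$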
hypothesis. Provide the code and all relevant screenshots showcasing the use of
your cost function.
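One possible sketch of the cost function, assuming the standard binary cross-entropy cost, is given here. The manual's signature is cost_function(X, y); passing the weights w and bias b explicitly is an assumption made so the example is self-contained, and the clipping constant eps is a hypothetical safeguard against log(0).

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cost_function(X, y, w, b):
    """Binary cross-entropy cost averaged over the m examples in X."""
    m = X.shape[0]
    h = sigmoid(X @ w + b)
    eps = 1e-12                      # assumption: avoids log(0)
    h = np.clip(h, eps, 1.0 - eps)
    return np.sum(-y * np.log(h) - (1.0 - y) * np.log(1.0 - h)) / m

# Tiny illustrative dataset with 3 scaled features (placeholder values).
X = np.array([[0.1, 0.2, 0.3],
              [0.9, 0.8, 0.7],
              [0.2, 0.1, 0.4],
              [0.8, 0.9, 0.6]])
y = np.array([0.0, 1.0, 0.0, 1.0])

w = np.zeros(3)
b = 0.0
cost = cost_function(X, y, w, b)
print("cost with zero weights:", cost)
```

With all weights at zero the hypothesis is 0.5 for every example, so the cost equals ln 2 (about 0.693), which is a convenient sanity check.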
gradient_descent(X, y, alpha)
X and y are the features and labels of the training dataset, and alpha is the
learning rate, a tunable hyperparameter. The gradient descent algorithm
is given as follows:
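$$dw_j = \frac{\partial J}{\partial w_j} = \frac{1}{m}\sum_{i=1}^{m}\left(h(x^{(i)}) - y^{(i)}\right)x_j^{(i)}$$

$$db = \frac{\partial J}{\partial b} = \frac{1}{m}\sum_{i=1}^{m}\left(h(x^{(i)}) - y^{(i)}\right)$$

$$w_j := w_j - \alpha\frac{\partial J}{\partial w_j}$$

$$b := b - \alpha\frac{\partial J}{\partial b}$$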
The gradient descent update for logistic regression may look identical to that in
linear regression. However, the formulas are not the same, because the hypothesis
h(x) here is the sigmoid of the linear combination and the cost function for
logistic regression is different from that used in linear regression.
For the submission, you will need to run the gradient descent algorithm once to
update the weights. You will need to print the weights, training cost and
validation cost both before and after the weight update. Provide the code and
all relevant screenshots of the final output.
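A single update step could be sketched as below. The manual's signature is gradient_descent(X, y, alpha); here the weights w and bias b are passed in and returned explicitly, which is an assumption made to keep the example self-contained. The training and validation arrays are small placeholder values, not a real dataset.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cost_function(X, y, w, b):
    h = np.clip(sigmoid(X @ w + b), 1e-12, 1.0 - 1e-12)
    return -np.mean(y * np.log(h) + (1.0 - y) * np.log(1.0 - h))

def gradient_descent(X, y, w, b, alpha):
    """Perform one gradient descent update and return the new (w, b)."""
    m = X.shape[0]
    error = sigmoid(X @ w + b) - y      # h(x) - y for every example
    dw = X.T @ error / m                # partial derivatives w.r.t. each w_j
    db = np.sum(error) / m              # partial derivative w.r.t. b
    return w - alpha * dw, b - alpha * db

# Placeholder training and validation data (assumption).
Xtrain = np.array([[0.1, 0.2, 0.3],
                   [0.9, 0.8, 0.7],
                   [0.2, 0.1, 0.4],
                   [0.8, 0.9, 0.6]])
ytrain = np.array([0.0, 1.0, 0.0, 1.0])
Xval = np.array([[0.3, 0.2, 0.5],
                 [0.7, 0.9, 0.5]])
yval = np.array([0.0, 1.0])

w, b = np.zeros(3), 0.0
cost_before = cost_function(Xtrain, ytrain, w, b)
val_before = cost_function(Xval, yval, w, b)
print("before:", w, b, cost_before, val_before)

w_new, b_new = gradient_descent(Xtrain, ytrain, w, b, alpha=0.5)
cost_after = cost_function(Xtrain, ytrain, w_new, b_new)
val_after = cost_function(Xval, yval, w_new, b_new)
print("after: ", w_new, b_new, cost_after, val_after)
```

Returning the updated parameters instead of modifying global variables makes it easy to print the values before and after the update, as the task requires.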
$$\text{Precision} = \frac{TP}{TP + FP}$$

$$\text{Recall} = \frac{TP}{TP + FN}$$
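The two formulas can be computed from arrays of true and predicted labels, as in the sketch below. The function name precision_recall and the convention of returning 0.0 when a denominator is zero are assumptions; the label arrays are made-up illustrative values.

```python
import numpy as np

def precision_recall(y_true, y_pred):
    """Compute precision and recall for binary labels (positive class = 1)."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    tp = int(np.sum((y_true == 1) & (y_pred == 1)))   # true positives
    fp = int(np.sum((y_true == 0) & (y_pred == 1)))   # false positives
    fn = int(np.sum((y_true == 1) & (y_pred == 0)))   # false negatives
    # Assumption: report 0.0 when the denominator is zero.
    precision = tp / (tp + fp) if (tp + fp) > 0 else 0.0
    recall = tp / (tp + fn) if (tp + fn) > 0 else 0.0
    return precision, recall

y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 1, 0, 1, 0]
precision, recall = precision_recall(y_true, y_pred)
print("precision:", precision, "recall:", recall)
```

In this example there are 3 true positives, 1 false positive and 1 false negative, so both values are 0.75.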
Start the training at some value of alpha. Try multiple training attempts with
various alpha values and find the best value of the step size. Once you have
found the value for the step size, showcase the output by making three plots:
Ensure all axes are labeled appropriately. Provide the code (excluding function
definitions) and all plots of the final output.
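The search over step sizes might be organized as in the following sketch, which records the training and validation cost after every epoch so the histories can be plotted afterwards. The candidate alpha values, the epoch count, the synthetic data, the helper name train, and the rule of picking the alpha with the lowest final validation cost are all assumptions for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cost_function(X, y, w, b):
    h = np.clip(sigmoid(X @ w + b), 1e-12, 1.0 - 1e-12)
    return -np.mean(y * np.log(h) + (1.0 - y) * np.log(1.0 - h))

def train(Xtrain, ytrain, Xval, yval, alpha, epochs=200):
    """Run gradient descent and record the cost after every epoch."""
    w, b = np.zeros(Xtrain.shape[1]), 0.0
    train_hist = [cost_function(Xtrain, ytrain, w, b)]
    val_hist = [cost_function(Xval, yval, w, b)]
    for _ in range(epochs):
        error = sigmoid(Xtrain @ w + b) - ytrain
        w = w - alpha * (Xtrain.T @ error) / len(ytrain)
        b = b - alpha * error.mean()
        train_hist.append(cost_function(Xtrain, ytrain, w, b))
        val_hist.append(cost_function(Xval, yval, w, b))
    return w, b, train_hist, val_hist

# Synthetic placeholder data already scaled to the 0-1 range (assumption).
rng = np.random.default_rng(0)
X = rng.uniform(0, 1, size=(60, 3))
y = (X[:, 0] + X[:, 1] > 1.0).astype(float)
Xtrain, ytrain, Xval, yval = X[:48], y[:48], X[48:], y[48:]

alphas = [0.01, 0.1, 1.0]           # hypothetical candidate step sizes
results = {a: train(Xtrain, ytrain, Xval, yval, a) for a in alphas}
best_alpha = min(alphas, key=lambda a: results[a][3][-1])

for a in alphas:
    _, _, tr, va = results[a]
    print(f"alpha={a}: final train cost={tr[-1]:.4f}, final val cost={va[-1]:.4f}")
print("best alpha by final validation cost:", best_alpha)
```

The lists train_hist and val_hist hold one value per epoch (plus the initial cost), so they can be plotted directly against the epoch number.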
Lab Task 6 – Prediction and Scatter Plot ________________________________
Save the weights that fit the best model and use them to create a prediction
function. The prediction function will take the features as input and output the
predicted class of the label. To convert the output to 0 or 1, a threshold of 0.5
needs to be applied to the predicted value h(x). Call your prediction function by
giving it some input values to make at least three predictions. Print your
predictions and take a screenshot. Additionally, your program must make a
scatter plot showing the training and validation examples. The coordinates in
the scatter plot correspond to the inputs (x). The class is denoted by:
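A prediction function that applies the 0.5 threshold to h(x) could look like the sketch below. The saved weights w_best and b_best are hypothetical placeholder values, not trained results, and the function name predict is an assumption.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def predict(X, w, b, threshold=0.5):
    """Return the predicted class (0 or 1) for each row of features in X."""
    X = np.atleast_2d(np.asarray(X, dtype=float))
    h = sigmoid(X @ w + b)
    return (h >= threshold).astype(int)

# Hypothetical saved weights for a 3-feature model (placeholders).
w_best = np.array([4.0, 4.0, 0.0])
b_best = -4.0

inputs = np.array([[0.9, 0.8, 0.5],
                   [0.1, 0.2, 0.5],
                   [0.6, 0.6, 0.1]])
predictions = predict(inputs, w_best, b_best)
for x, p in zip(inputs, predictions):
    print("features:", x, "-> predicted class:", p)
```

For the scatter plot, the same function can be applied to Xtrain and Xval, and the resulting classes can then be used to choose the marker or color of each point in a plotting library such as matplotlib.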