Vectorized Logistic Regression

Logistic Regression
• Recall, for the non-vectorized implementation of Logistic Regression, the batch update rule is:

For each feature (j=0; j<=n; j++) {
    $\theta_j := \theta_j - \alpha \dfrac{1}{m} \displaystyle\sum_{i=1}^{m} \bigl(h_\theta(x^{(i)}) - y^{(i)}\bigr)\, x_j^{(i)}$
}

• Notice that, inside the sum over the $m$ training examples, $h_\theta(x^{(i)}) = \dfrac{1}{1 + e^{-\theta^{\mathsf T} x^{(i)}}}$ must be computed for each training example $i$ (see the sketch below).
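A minimal Octave sketch of that per-example computation (illustrative; x(i,:) is the i-th row of the design matrix, including the intercept term, and prevTheta holds the parameters from the previous iteration):

z = x(i,:) * prevTheta;      % scalar: theta' * x for training example i
h = 1.0/(1.0 + exp(-z));     % hypothesis h_theta(x^(i))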
• Moreover, a vector $h$ is created, whose $i$-th component is the hypothesis for training example $i$:
$h = \begin{pmatrix} \dfrac{1}{1+e^{-z^{(1)}}} \\ \vdots \\ \dfrac{1}{1+e^{-z^{(m)}}} \end{pmatrix}$, where $z^{(i)} = \theta^{\mathsf T} x^{(i)}$.
• The dimension of $h$ is $m \times 1$.
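A minimal Octave sketch of building that vector with a single matrix product (assuming x is the m-by-(n+1) design matrix with an intercept column and theta is (n+1)-by-1):

z = x * theta;            % m-by-1 vector: z(i) = theta' * x(i,:)'
h = 1 ./ (1 + exp(-z));   % element-wise sigmoid, also m-by-1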
• Furthermore, $y^{(i)}$ is subtracted from the corresponding component of $h$:
$h - y = \begin{pmatrix} h^{(1)} - y^{(1)} \\ h^{(2)} - y^{(2)} \\ \vdots \\ h^{(m)} - y^{(m)} \end{pmatrix}$
• The result is then combined with the feature vector $x_j$ through a vector product.
• Therefore, to compute the slope for feature $j$ (see the sketch after this list):
  1. Compute the vector $h$.
  2. Subtract the given vector $y$ from $h$.
  3. Take the sum of the products obtained from the component-wise multiplication of the vectors $(h - y)$ and $x_j$, i.e. $x_j^{\mathsf T}(h - y)$.
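A minimal Octave sketch of those three steps for a single feature j (illustrative variable names; x(:,j) denotes the j-th column of the design matrix):

h   = 1 ./ (1 + exp(-(x * theta)));   % step 1: the vector h
err = h - y;                          % step 2: subtract the given vector y from h
slope_j = sum(err .* x(:,j));         % step 3: sum of the component-wise products
% equivalently: slope_j = x(:,j)' * err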
• Consider the dimensions of the vectors in $x_j^{\mathsf T}(h - y)$: $x_j^{\mathsf T}$ is $1 \times m$ and $(h - y)$ is $m \times 1$, so the product is a scalar.
• Expanding:
$\dfrac{\partial J(\theta)}{\partial \theta_j} = \dfrac{1}{m}\displaystyle\sum_{i=1}^{m}\bigl(h^{(i)} - y^{(i)}\bigr)\, x_j^{(i)} = \dfrac{1}{m}\, x_j^{\mathsf T}(h - y)$
• This is the slope for feature $j$.
• The list of slopes, formed by finding the slope for each feature, forms a gradient (i.e., vector of partial derivatives):
$\nabla J(\theta) = \dfrac{1}{m}\, X^{\mathsf T}(h - y)$
• Expanding the vector product:
$X^{\mathsf T}(h - y) = \begin{pmatrix} x_0^{\mathsf T}(h - y) \\ x_1^{\mathsf T}(h - y) \\ \vdots \\ x_n^{\mathsf T}(h - y) \end{pmatrix}$
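A minimal Octave sketch of that expansion, computing the slope column by column (an illustrative loop, with h computed as above; it is equivalent to the single matrix product):

grad = zeros(size(theta));
for j = 1:numel(theta)
  grad(j) = (1/m) * (x(:,j)' * (h - y));   % slope (partial derivative) for feature j
end
% equivalent one-liner: grad = (1/m) * (x' * (h - y));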
• For the vectorized implementation, the batch update rule is:

For each feature (j=0; j<=n; j++) {
    $\theta_j := \theta_j - \dfrac{\alpha}{m}\, x_j^{\mathsf T}(h - y)$
}

• In vector format:
$\theta := \theta - \dfrac{\alpha}{m}\, X^{\mathsf T}(h - y)$, where $h = g(X\theta)$ and $g$ is the sigmoid.
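A minimal Octave sketch of one such vectorized update step (assuming alpha is the learning rate and x includes the intercept column):

h = 1 ./ (1 + exp(-(x * theta)));            % hypotheses for all m examples
theta = theta - (alpha/m) * (x' * (h - y));  % update every theta_j at once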
• Example 2 uses the data set of “Exercise 4: Logistic Regression and Newton’s Method” from:
http://openclassroom.stanford.edu/MainFolder/CoursePage.php?course=MachineLearning

 #    x1         x2         y
 1    5.55E+01   6.95E+01   1
 2    4.10E+01   8.15E+01   1
 …    …          …          …
 78   1.85E+01   7.45E+01   0
 79   1.60E+01   7.25E+01   0
 80   3.35E+01   6.80E+01   0
Opening and Plotting Data Files
clear all; close all; clc
x = load('ex4x.dat'); y = load('ex4y.dat');   % exam scores (m-by-2) and pass/fail labels (m-by-1)
figure;
hold on
set(0, 'defaultaxesfontname', 'Arial');
set(0, 'defaultaxesfontsize', 16);
for i = 1:length(y)
  if (y(i) == 1)
    plot3(x(i,1), x(i,2), y(i), '+', 'color', 'g', 'markersize', 8);   % positive examples
  else
    plot3(x(i,1), x(i,2), y(i), 'o', 'color', 'r', 'markersize', 8);   % negative examples
  endif
endfor
% x(:,1) is plotted on the x-axis and x(:,2) on the y-axis
xlabel('Exam 1 Score', 'fontsize', 18, 'fontname', 'Arial');
ylabel('Exam 2 Score', 'fontsize', 18, 'fontname', 'Arial');
zlabel('Pass/Fail', 'fontsize', 18, 'fontname', 'Arial');
title('Exam Scores', 'fontsize', 20, 'fontname', 'Arial');
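Before gradient descent, an intercept column of ones is typically prepended to x so that the first parameter acts as the bias term. A sketch of the usual setup for this exercise (the exact variable names are assumptions):

[m, n] = size(x);
x = [ones(m, 1), x];      % add intercept term: x becomes m-by-(n+1)
theta = zeros(n+1, 1);    % initial parameters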
m = numTrainSam;                       % same setup as in Example 1
prevTheta = theta;
for t = 1:maxIterations
  totError = 0;
  for j = 1:numFeatures
    totSlope = 0;
    for i = 1:m
      % compute z = theta' * x(i,:) one feature at a time
      z = 0;
      for jj = 1:numFeatures
        z = z + prevTheta(jj)*x(i,jj);
      end
      h = 1.0/(1.0 + exp(-z));                                % sigmoid hypothesis for example i
      totSlope = totSlope + (h - y(i))*x(i,j);                % accumulate slope for feature j
      totError = totError - y(i)*log(h) - (1-y(i))*log(1-h);  % accumulate cost
    end
    theta(j) = theta(j) - learningRate*(totSlope/numTrainSam);  % batch update for feature j
  end
  prevTheta = theta;
  % the cost is accumulated once per feature, so average over both m and numFeatures
  errorPerIteration(t) = totError/(numTrainSam*numFeatures);
end
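The vectorized loop below uses a helper g for the sigmoid. A minimal Octave sketch of how it could be defined (its name follows the original exercise; treat the exact form as an assumption):

g = @(z) 1.0 ./ (1.0 + exp(-z));   % element-wise sigmoid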
for t = 1:MAX_ITR
  % Update theta: one matrix-vector product per step
  z = x * theta;                   % m-by-1
  h = g(z);                        % element-wise sigmoid
  grad = (1/m) .* (x' * (h - y));  % (n+1)-by-1 gradient
  theta = theta - alpha .* grad;   % simultaneous update of all parameters
  % Calculate J (for testing convergence)
  J(t) = (1/m)*sum(-y.*log(h) - (1-y).*log(1-h));
end
• Dimensions in the vectorized computation: $X$ is $m \times (n+1)$ and $\theta$ is $(n+1) \times 1$, so $z = X\theta$ (and hence $h$ and $h - y$) is $m \times 1$.
• $X^{\mathsf T}$ has $m$ columns and $(h - y)$ has $m$ rows, so the product $X^{\mathsf T}(h - y)$ is defined and yields an $(n+1) \times 1$ gradient.
Comparison: Non-vectorized vs Vectorized

100,000 iterations:
                 Non-vectorized   Vectorized
CPU Time (s)     4805.7           24.430
θ0               -3.998494        -3.998494
θ1               0.065799         0.065799
θ2               0.024059         0.024059

10,000,000 iterations:
                 Non-vectorized   Vectorized
CPU Time (s)     ≈500,000         2532.5
θ0                                -16.37865
θ1                                0.14834
θ2                                0.15891
Visualizing Classification Result

[Figure: plot of the training data and the model's predictions for θ ≈ (−16.380, 0.1483, 0.1589). Legend: + positive training examples, o negative training examples, Δ predictions. Some examples labeled as Positive are predicted as Negative, and some labeled as Negative are predicted as Positive.]

• The model is unable to correctly classify 15 of the 80 training examples.
• Training accuracy = 65/80 = 81.25%
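A minimal Octave sketch of how such predictions and the training accuracy could be computed (hypothetical snippet; assumes x already contains the intercept column and theta holds the learned parameters):

p = (1 ./ (1 + exp(-(x * theta)))) >= 0.5;   % predicted labels (1 = Positive, 0 = Negative)
accuracy = mean(p == y);                     % fraction correctly classified, 65/80 = 0.8125 here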
• These misclassified examples are very difficult to differentiate or classify.