
Tutorial 1.

Linear Regression

January 11, 2017

1 Tutorial: Linear Regression


Agenda:

1. Spyder interface
2. Linear regression running example: boston data
3. Vectorize cost function
4. Closed form solution
5. Gradient descent

In [1]: import matplotlib
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline

In [2]: from sklearn.datasets import load_boston
boston_data = load_boston()
print(boston_data['DESCR'])

Boston House Prices dataset

Notes
------
Data Set Characteristics:

:Number of Instances: 506

:Number of Attributes: 13 numeric/categorical predictive

:Median Value (attribute 14) is usually the target

:Attribute Information (in order):


- CRIM per capita crime rate by town
- ZN proportion of residential land zoned for lots over 25,000 sq.ft.
- INDUS proportion of non-retail business acres per town
- CHAS Charles River dummy variable (= 1 if tract bounds river; 0 otherwise)
- NOX nitric oxides concentration (parts per 10 million)
- RM average number of rooms per dwelling
- AGE proportion of owner-occupied units built prior to 1940
- DIS weighted distances to five Boston employment centres
- RAD index of accessibility to radial highways
- TAX full-value property-tax rate per $10,000

- PTRATIO pupil-teacher ratio by town
- B 1000(Bk - 0.63)^2 where Bk is the proportion of blacks by town
- LSTAT % lower status of the population
- MEDV Median value of owner-occupied homes in $1000's

:Missing Attribute Values: None

:Creator: Harrison, D. and Rubinfeld, D.L.

This is a copy of UCI ML housing dataset.


http://archive.ics.uci.edu/ml/datasets/Housing

This dataset was taken from the StatLib library which is maintained at Carnegie Mellon University.

The Boston house-price data of Harrison, D. and Rubinfeld, D.L. 'Hedonic
prices and the demand for clean air', J. Environ. Economics & Management,
vol.5, 81-102, 1978. Used in Belsley, Kuh & Welsch, 'Regression diagnostics
...', Wiley, 1980. N.B. Various transformations are used in the table on
pages 244-261 of the latter.

The Boston house-price data has been used in many machine learning papers that address regression
problems.

**References**

- Belsley, Kuh & Welsch, 'Regression diagnostics: Identifying Influential Data and Sources of Collinearity', Wiley, 1980. 244-261.
- Quinlan,R. (1993). Combining Instance-Based and Model-Based Learning. In Proceedings on the Tenth International Conference of Machine Learning, 236-243, University of Massachusetts, Amherst. Morgan Kaufmann.
- many more! (see http://archive.ics.uci.edu/ml/datasets/Housing)

In [3]: # take the boston data
data = boston_data['data']
# we will only work with two of the features: INDUS and RM
x_input = data[:, [2,5]]
y_target = boston_data['target']
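
A quick shape check confirms the slice: x_input should be 506x2 (one row per instance, one column per selected feature) and y_target a length-506 vector.

In [ ]: # sanity check: x_input is (506, 2), y_target is (506,)
print(np.shape(x_input), np.shape(y_target))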

In [5]: # Individual plots for the two features:
plt.title('Industrialness vs Med House Price')
plt.scatter(x_input[:, 0], y_target)
plt.xlabel('Industrialness')
plt.ylabel('Med House Price')
plt.show()

plt.title('Avg Num Rooms vs Med House Price')
plt.scatter(x_input[:, 1], y_target)
plt.xlabel('Avg Num Rooms')
plt.ylabel('Med House Price')
plt.show()

1.1 Define cost function
$$E(y, t) = \frac{1}{2N} \sum_{i=1}^{N} \left( y^{(i)} - t^{(i)} \right)^2$$

$$E(y, t) = \frac{1}{2N} \sum_{i=1}^{N} \left( w_1 x_1^{(i)} + w_2 x_2^{(i)} + b - t^{(i)} \right)^2$$

In [6]: def cost(w1, w2, b, X, t):
            '''
            Evaluate the cost function in a non-vectorized manner for
            inputs `X` and targets `t`, at weights `w1`, `w2` and `b`.
            '''
            costs = 0
            for i in range(len(t)):
                y_i = w1 * X[i, 0] + w2 * X[i, 1] + b
                t_i = t[i]
                costs += 0.5 * (y_i - t_i) ** 2
            return costs / len(t)

In [7]: cost(3, 5, 20, x_input, y_target)

Out[7]: 2241.1239166749006

In [8]: cost(3, 5, 0, x_input, y_target)

Out[8]: 1195.1098850543478

1.2 Vectorizing the cost function:


$$E(y, t) = \frac{1}{2N} \| Xw + b\mathbf{1} - t \|^2$$
In [9]: def cost_vectorized(w1, w2, b, X, t):
            '''
            Evaluate the cost function in a vectorized manner for
            inputs `X` and targets `t`, at weights `w1`, `w2` and `b`.
            '''
            N = len(t)  # use the targets passed in, not the global y_target
            w = np.array([w1, w2])
            y = np.dot(X, w) + b * np.ones(N)
            return np.sum((y - t)**2) / (2.0 * N)

In [10]: cost_vectorized(3, 5, 20, x_input, y_target)

Out[10]: 2241.1239166749015

In [11]: cost(3, 5, 0, x_input, y_target)

Out[11]: 1195.1098850543478

4
1.3 Comparing speed of the vectorized vs unvectorized code
We'll see below that the vectorized code runs roughly 4x faster than the non-vectorized code on this example.
Hopefully this will convince you to always vectorize your code whenever possible!

In [12]: import time

t0 = time.time()
print(cost(4, 5, 20, x_input, y_target))
t1 = time.time()
print(t1 - t0)

3182.40634167
0.00229597091675

In [13]: t0 = time.time()
print(cost_vectorized(4, 5, 20, x_input, y_target))
t1 = time.time()
print(t1 - t0)

3182.40634167
0.000537872314453
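
One-shot timings like these are noisy. For a more stable comparison, the standard-library `timeit` module can average over many calls; here is a minimal sketch (the repetition count is an arbitrary choice):

In [ ]: import timeit

# average each version over many calls to smooth out timer noise
n_reps = 100
t_loop = timeit.timeit(lambda: cost(4, 5, 20, x_input, y_target), number=n_reps)
t_vec = timeit.timeit(lambda: cost_vectorized(4, 5, 20, x_input, y_target), number=n_reps)
print('loop:       %f s per call' % (t_loop / n_reps))
print('vectorized: %f s per call' % (t_vec / n_reps))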

1.4 Plotting cost in weight space


We'll plot the cost over the two weights, fixing the bias at b = -22.89831573
(we'll see where that number comes from later).
Notice that the contours are ovals.

In [15]: w1s = np.arange(-1.0, 0.0, 0.01)
w2s = np.arange(6.0, 10.0, 0.1)
z_cost = []
for w2 in w2s:
    z_cost.append([cost_vectorized(w1, w2, -22.89831573, x_input, y_target)
                   for w1 in w1s])
z_cost = np.array(z_cost)
np.shape(z_cost)
W1, W2 = np.meshgrid(w1s, w2s)
CS = plt.contour(W1, W2, z_cost, 25)
plt.clabel(CS, inline=1, fontsize=10)
plt.title('Costs for various values of w1 and w2 for b=-22.89831573')
plt.xlabel("w1")
plt.ylabel("w2")
plt.plot([-0.33471389], [7.82205511], 'o')  # this will be the minimum we find below
plt.show()

2 Exact Solution
Work this out on the board:

1. ignore biases (add an extra feature & weight instead)
2. get the equations from the partial derivatives (sketched below)
3. vectorize
4. write code
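
For reference, a sketch of steps 2-3 in matrix notation, consistent with the vectorized cost above:

$$E(w) = \frac{1}{2N} \| Xw - t \|^2, \qquad \nabla_w E = \frac{1}{N} X^\top (Xw - t)$$

Setting the gradient to zero gives the normal equations $X^\top X w = X^\top t$, so the optimal weights are $w^* = (X^\top X)^{-1} X^\top t$, which is exactly what `solve_exactly` computes below.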

In [16]: # add an extra feature (a column of the input) that is just all ones
x_in = np.concatenate([x_input, np.ones([np.shape(x_input)[0], 1])], axis=1)
x_in

Out[16]: array([[  2.31 ,   6.575,   1.   ],
       [  7.07 ,   6.421,   1.   ],
       [  7.07 ,   7.185,   1.   ],
       ...,
       [ 11.93 ,   6.976,   1.   ],
       [ 11.93 ,   6.794,   1.   ],
       [ 11.93 ,   6.03 ,   1.   ]])

In [17]: def solve_exactly(X, t):
             '''
             Solve linear regression exactly. (fully vectorized)

             Given `X` - NxD matrix of inputs
                   `t` - target outputs
             Returns the optimal weights as a D-dimensional vector
             '''
             N, D = np.shape(X)
             A = np.matmul(X.T, X)
             c = np.dot(X.T, t)
             return np.matmul(np.linalg.inv(A), c)

In [18]: solve_exactly(x_in, y_target)

Out[18]: array([ -0.33471389, 7.82205511, -22.89831573])

In [19]: # In real life we don't want to code it directly.
# np.linalg.lstsq returns (solution, sum of squared residuals, rank of X, singular values of X).
np.linalg.lstsq(x_in, y_target)

Out[19]: (array([ -0.33471389, 7.82205511, -22.89831573]),


array([ 19807.614505]),
3,
array([ 318.75354429, 75.21961717, 2.10127199]))

2.1 Implement Gradient Function


$$\frac{\partial E}{\partial w_j} = \frac{1}{N} \sum_i x_j^{(i)} \left( y^{(i)} - t^{(i)} \right)$$
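
Stacking these partial derivatives into a vector gives the matrix form implemented in the code below:

$$\nabla_w E = \frac{1}{N} X^\top (y - t) = \frac{1}{N} X^\top (Xw - t)$$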

In [20]: # Vectorized gradient function
def gradfn(weights, X, t):
    '''
    Given `weights` - a current "guess" of what our weights should be
          `X`       - matrix of shape (N,D) of input features
          `t`       - target y values
    Return gradient of each weight evaluated at the current value
    '''
    N, D = np.shape(X)
    y_pred = np.matmul(X, weights)
    error = y_pred - t
    # note: use the argument `X` here, not the global `x_in`
    return np.matmul(np.transpose(X), error) / float(N)
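
As a quick sanity check, `gradfn` can be compared against a finite-difference approximation of the cost's gradient. A minimal sketch (the test point `w_test` and step size `eps` are arbitrary choices):

In [ ]: # finite-difference check: E(w) = ||Xw - t||^2 / (2N), so gradfn should
# match the centered-difference approximation of E at any test point
def full_cost(w, X, t):
    return np.sum((np.matmul(X, w) - t) ** 2) / (2.0 * len(t))

w_test = np.array([1.0, 2.0, 3.0])  # arbitrary test point (D = 3 with the ones column)
eps = 1e-6
numeric = np.zeros_like(w_test)
for j in range(len(w_test)):
    w_plus, w_minus = w_test.copy(), w_test.copy()
    w_plus[j] += eps
    w_minus[j] -= eps
    numeric[j] = (full_cost(w_plus, x_in, y_target) -
                  full_cost(w_minus, x_in, y_target)) / (2 * eps)
print(np.allclose(gradfn(w_test, x_in, y_target), numeric, rtol=1e-4))  # expect True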

In [23]: def solve_via_gradient_descent(X, t, print_every=5000,
                                        niter=100000, alpha=0.005):
             '''
             Given `X` - matrix of shape (N,D) of input features
                   `t` - target y values
             Solves for linear regression weights.
             Return weights after `niter` iterations.
             '''
             N, D = np.shape(X)
             # initialize all the weights to zeros
             w = np.zeros([D])
             for k in range(niter):
                 dw = gradfn(w, X, t)
                 w = w - alpha*dw
                 if k % print_every == 0:
                     print('Weight after %d iteration: %s' % (k, str(w)))
             return w

In [24]: solve_via_gradient_descent(X=x_in, t=y_target)

Weight after 0 iteration: [ 1.10241186 0.73047508 0.11266403]


Weight after 5000 iteration: [-0.48304613 5.10076868 -3.97899253]
Weight after 10000 iteration: [-0.45397323 5.63413678 -7.6871518 ]
Weight after 15000 iteration: [ -0.43059857 6.06296553 -10.66851736]
Weight after 20000 iteration: [ -0.41180532 6.40774447 -13.06553969]
Weight after 25000 iteration: [ -0.39669551 6.68494726 -14.9927492 ]
Weight after 30000 iteration: [ -0.38454721 6.90781871 -16.54222851]
Weight after 35000 iteration: [ -0.37477995 7.08700769 -17.78801217]
Weight after 40000 iteration: [ -0.36692706 7.23107589 -18.78962409]
Weight after 45000 iteration: [ -0.36061333 7.34690694 -19.59492155]
Weight after 50000 iteration: [ -0.35553708 7.44003528 -20.24238191]
Weight after 55000 iteration: [ -0.35145576 7.5149106 -20.762941 ]
Weight after 60000 iteration: [ -0.34817438 7.57511047 -21.18147127]
Weight after 65000 iteration: [ -0.34553614 7.62351125 -21.51797024]
Weight after 70000 iteration: [ -0.343415 7.66242555 -21.78851591]
Weight after 75000 iteration: [ -0.34170959 7.69371271 -22.00603503]
Weight after 80000 iteration: [ -0.34033844 7.71886763 -22.18092072]
Weight after 85000 iteration: [ -0.33923604 7.73909222 -22.32152908]
Weight after 90000 iteration: [ -0.3383497 7.75535283 -22.4345784 ]
Weight after 95000 iteration: [ -0.33763709 7.76842638 -22.52547023]

Out[24]: array([ -0.33706425, 7.77893565, -22.59853432])

In [25]: # For comparison, this was the exact result; gradient descent above is
# still slowly approaching it after 100,000 iterations.
np.linalg.lstsq(x_in, y_target)

Out[25]: (array([ -0.33471389, 7.82205511, -22.89831573]),


array([ 19807.614505]),
3,
array([ 318.75354429, 75.21961717, 2.10127199]))
