Machine Learning Exercises in Python, Part 1: Curious Insight
This post is part of a series covering the exercises from Andrew Ng's machine learning class on Coursera. The original code, exercise text, and data files for all of the exercises are available in the Github repo mentioned below.

I'd heard plenty about Coursera and the "MOOC" phenomenon but had not had the time to dive in and take a class. Earlier this year I finally pulled the trigger and signed up for Andrew Ng's Machine Learning class. I completed the whole thing from start to finish, including the programming assignments. The experience opened my eyes to the power of this type of education platform, and I've been hooked ever since.
This blog post will be the first in a series covering the programming exercises from Andrew's class. One aspect of the course that I didn't particularly care for was the use of Octave for assignments. Although Octave is a fine tool for learning, most real-world data analysis is done in either R or Python (certainly there are other languages and tools being used, but these two are unquestionably at the top of the list). Since I'm a Python user, I decided to re-implement the exercises in Python; the full source code is in my IPython repo on Github. You'll also find the data used in these exercises and the original exercise PDFs in sub-folders off the root directory if you're interested.
While I can explain some of the concepts involved in this exercise along the way, it's impossible for me to convey all the information you might need to fully comprehend it. If you're interested in machine learning but haven't been exposed to it yet, I encourage you to check out the class on Coursera.
In the first part of exercise 1, we're tasked with implementing simple linear regression to predict profits for a food truck. Suppose you are the CEO of a restaurant franchise and are considering different cities for opening a new outlet. The chain already has trucks in various cities and you have data for profits and populations from the cities. You'd like to figure out what the expected profit of a new food truck might be given only the population of the city it would operate in. Let's start by importing some libraries.
import os
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
%matplotlib inline
Now let's get things rolling. We can use pandas to load the data into a data frame and display the first few rows using the "head" function.
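A minimal sketch of that step, assuming the file ex1data1.txt sits in a "data" sub-folder of the working directory (as in the repo):

path = os.path.join(os.getcwd(), 'data', 'ex1data1.txt')
# the exercise file has no header row, so supply the column names
data = pd.read_csv(path, header=None, names=['Population', 'Profit'])
data.head()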
Population Profit
0 6.1101 17.5920
1 5.5277 9.1302
2 8.5186 13.6620
3 7.0032 11.8540
4 5.8598 6.8233
Another useful function available in pandas is "describe", which computes basic summary statistics for each column. This is helpful to get a "feel" for the data during the exploratory analysis stage of a project.
data.describe()
[output of data.describe(): count, mean, std, min, 25%, 50%, 75%, and max for Population and Profit]
Examining stats about your data can be helpful, but sometimes you need to find ways to visualize it too. Fortunately this data set only has one explanatory variable and one target variable, so we can plot it in two dimensions to get a good sense of what it looks like. We can use the "plot" function provided by pandas for this.
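A one-liner does the job (the figsize is an assumption, chosen to match the later plots):

data.plot(kind='scatter', x='Population', y='Profit', figsize=(12,8))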
It really helps to actually look at what's going on, doesn't it? We can clearly see that there's a cluster of values around cities with smaller populations, and a somewhat linear trend of increasing profit as the size of the city increases. Now let's get to the fun part - implementing a linear regression algorithm in Python!
In case you're not familiar with it, linear regression is an approach to modeling the relationship between a dependent variable and one or more independent variables (if there's one independent variable then it's called simple linear regression, and if there's more than one independent variable then it's called multiple linear regression). There are lots of different types and variants of linear regression that are outside the scope of this discussion so I won't go into that here, but to put it simply - we're trying to create a linear model of the data X, using some number of parameters theta, that describes the variance of the data such that given a new data point that's not in X, we could accurately predict what the outcome would be without actually knowing the answer ahead of time.
In this implementation we're going to use an optimization technique called gradient descent to find the parameters theta. If you're familiar with linear algebra, you may be aware that there's another way to find the optimal parameters for a linear model called the "normal equation", which solves the problem in one shot with a series of matrix operations. However, the issue with this approach is that it doesn't scale very well for large data sets. In contrast, we can use variants of gradient descent and other optimization methods on data sets of virtually any size, so for machine learning problems this approach is more practical.
Okay, that's enough theory. Let's write some code. The first thing we need is a cost function. The cost function evaluates the quality of our model by calculating the error between our model's prediction for a data point, using the model parameters, and the actual data point. For example, if the actual profit for a given city is 4 and we predicted that it was 7, our error is (7 - 4)^2 = 3^2 = 9 (assuming a squared-error loss function). We do this for each data point in X and sum the result to get the cost. Here's the function:
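def computeCost(X, y, theta):
    # squared error between the predictions (X * theta.T) and the targets y,
    # summed over the data set and scaled by the conventional factor of 2m
    inner = np.power(((X * theta.T) - y), 2)
    return np.sum(inner) / (2 * len(X))

(This sketch assumes X, y, and theta are numpy matrices, which is how we'll set them up below.)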
Notice that there are no loops. We're taking advantage of numpy's linear algebra capabilities to compute the result as a series of matrix operations. This is far more computationally efficient than an unoptimized "for" loop.

In order to make this cost function work seamlessly with the pandas data frame we created above, we need to do a couple of things. First, we need to insert a column of 1s at the beginning of the data frame to make the matrix operations work correctly (I won't go into detail on why this is needed, but it's in the exercise text if you're interested - basically it accounts for the intercept term in the linear equation). Second, we need to separate our data into the independent variables X and our dependent variable y.
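Those two steps might look like this (a sketch; the column slicing assumes Population and Profit are the only columns, as above):

# add a column of ones for the intercept term
data.insert(0, 'Ones', 1)

# set X (training data) and y (target variable)
cols = data.shape[1]
X = data.iloc[:, 0:cols-1]
y = data.iloc[:, cols-1:cols]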
Finally, we're going to convert our data frames to numpy matrices and instantiate a parameter matrix theta. One useful debugging trick is to look at the shape of the matrices you're dealing with. It's also helpful to remember when walking through the steps in your head that matrix multiplication requires the inner dimensions to agree: an (i x j) matrix times a (j x k) matrix yields an (i x k) matrix.
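For example (theta starts as a 1 x 2 row of zeros so that X * theta.T produces a column of predictions):

X = np.matrix(X.values)
y = np.matrix(y.values)
theta = np.matrix(np.array([0, 0]))

# sanity-check: the shapes should be (rows, 2), (1, 2), and (rows, 1)
X.shape, theta.shape, y.shape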
Okay, so now we can try out our cost function. Remember the parameters
were initialized to 0 so the solution isn't optimal yet, but we can see if it
works.
computeCost(X, y, theta)
32.072733877455676
So far so good. Next we need to define a function to perform gradient descent on the parameters theta using the update rules defined in the exercise text.
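For reference, the update rule being implemented is the standard one from the course, where alpha is the learning rate and m is the number of training samples:

$$\theta_j := \theta_j - \frac{\alpha}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) x_j^{(i)}$$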
def gradientDescent(X, y, theta, alpha, iters):
    temp = np.matrix(np.zeros(theta.shape))  # scratch space for the updated parameters
    parameters = int(theta.ravel().shape[1])
    cost = np.zeros(iters)  # cost history, one entry per iteration
    for i in range(iters):
        error = (X * theta.T) - y
        for j in range(parameters):
            term = np.multiply(error, X[:,j])
            temp[0,j] = theta[0,j] - ((alpha / len(X)) * np.sum(term))
        theta = temp
        cost[i] = computeCost(X, y, theta)
    return theta, cost
The idea with gradient descent is that for each iteration, we compute the gradient of the error term in order to figure out the appropriate direction to move our parameter vector. In other words, we're calculating the changes to make to our parameters in order to reduce the error, thus bringing our solution closer to the optimal solution (i.e. the best fit).
This is a fairly complex topic and I could easily devote a whole blog post just to gradient descent. If you want to learn more, I would recommend starting with this article and branching out from there.
Once again we're relying on numpy and linear algebra for our solution. You may notice that my implementation is not 100% optimal - in particular, there's a way to get rid of that inner loop and update all of the parameters at once. I'll leave it up to the reader to figure it out for now (I'll cover it in a later post).
Now that we've got a way to evaluate solutions, and a way to find a good solution, it's time to apply it to our data set.
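The run itself is a couple of lines (alpha = 0.01 and 1000 iterations are assumptions that work well for this exercise):

# initialize the learning rate and iteration count
alpha = 0.01
iters = 1000

# perform gradient descent to "fit" the model parameters
g, cost = gradientDescent(X, y, theta, alpha, iters)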
Note that we've initialized a few new variables here. If you look closely at the gradient descent function, it has parameters called alpha and iters. Alpha is the learning rate - it's a factor in the update rule for the parameters that helps determine how quickly the algorithm will converge to the optimal solution. Iters is just the number of iterations. There is no hard and fast rule for how to initialize these parameters, and typically some trial and error is involved. We now have a parameter vector describing what we believe is the optimal linear model for our data set. One quick way to evaluate just how good our regression model is might be to look at the total error of our new solution on the data set:
computeCost(X, y, g)
4.5159555030789118
That's certainly a lot better than 32, but it's not a very intuitive way to look at it. Fortunately we have some other techniques at our disposal. Remember the scatter plot from before? Let's overlay a line representing our model on top of a scatter plot of the data to see how well it fits. We can use numpy's "linspace" function to create an evenly-spaced series of points within the range of our data, and then "evaluate" those points using our model to see what the expected profit would be. We can then turn it into a line graph and plot it:
x = np.linspace(data.Population.min(), data.Population.max(), 100)
f = g[0, 0] + (g[0, 1] * x)

fig, ax = plt.subplots(figsize=(12,8))
ax.plot(x, f, 'r', label='Prediction')
ax.scatter(data.Population, data.Profit, label='Training Data')
ax.legend(loc=2)
ax.set_xlabel('Population')
ax.set_ylabel('Profit')
ax.set_title('Predicted Profit vs. Population Size')
Not bad! Our solution looks like an optimal linear model of the data set. Since the gradient descent function also outputs a vector with the cost at each training iteration, we can plot that as well:
fig, ax = plt.subplots(figsize=(12,8))
ax.plot(np.arange(iters), cost, 'r')
ax.set_xlabel('Iterations')
ax.set_ylabel('Cost')
ax.set_title('Error vs. Training Epoch')
Notice that the cost always decreases - this is an example of what's called a convex optimization problem. If you were to plot the entire solution space for the problem (i.e. plot the cost as a function of the model parameters for every possible value of the parameters), you would see that it looks like a "bowl" shape with a "basin" representing the optimal solution.
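As one last sanity check, we can plug a population into the fitted line to get a point estimate of profit. The values below are a hypothetical example, assuming (per the exercise text) that populations and profits are both expressed in units of 10,000:

# hypothetical: predicted profit (in $10,000s) for a city of 35,000 people
predicted_profit = g[0, 0] + g[0, 1] * 3.5
print(predicted_profit)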
That's all for now! In part 2 we'll finish off the first exercise by extending this example to more than 1 variable. I'll also show how the above solution can be reached using a popular machine learning library called scikit-learn.
AUTHOR
John Wittenauer