
Cost Function
We can measure the accuracy of our hypothesis function by using a cost function. This takes an average
(actually a fancier version of an average) of all the results of the hypothesis with inputs from x's compared to the
actual output y's.

J(\theta_0, \theta_1) = \frac{1}{2m} \sum_{i=1}^{m} \left(\hat{y}_i - y_i\right)^2 = \frac{1}{2m} \sum_{i=1}^{m} \left(h_\theta(x_i) - y_i\right)^2

To break it apart, it is (1/2) x̄, where x̄ is the mean of the squares of hθ(xi) − yi, or the difference between the
predicted value and the actual value.

This function is otherwise called the "Squared error function", or "Mean squared error". The mean is halved (1/(2m))
as a convenience for the computation of the gradient descent, as the derivative term of the square function will
cancel out the 1/2 term.
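
As a rough, minimal sketch (not part of the course materials), the cost above can be computed in a few lines of Python with NumPy; the function name compute_cost and the toy data are assumptions made for this illustration.

```python
import numpy as np

def compute_cost(x, y, theta0, theta1):
    """Halved mean squared error J(theta0, theta1) for h(x) = theta0 + theta1 * x."""
    m = len(y)                            # number of training examples
    predictions = theta0 + theta1 * x     # h_theta(x_i) for every example
    squared_errors = (predictions - y) ** 2
    return squared_errors.sum() / (2 * m)

# Toy data lying exactly on y = 1 + 2x, so the matching parameters give J = 0.
x = np.array([1.0, 2.0, 3.0])
y = np.array([3.0, 5.0, 7.0])
print(compute_cost(x, y, 1.0, 2.0))  # 0.0 -- the line passes through every point
print(compute_cost(x, y, 0.0, 0.0))  # a larger cost for a poor fit
```

With parameters that fit the data perfectly the cost is exactly 0, which matches the J(θ0, θ1) = 0 case described below.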

Now we are able to concretely measure the accuracy of our predictor function against the correct results we have,
so that we can predict new results we don't have.

If we try to think of it in visual terms, our training data set is scattered on the x-y plane. We are trying to make a
straight line (defined by hθ(x)) which passes through this scattered set of data. Our objective is to get the best
possible line. The best possible line will be such that the average squared vertical distance of the scattered
points from the line is the least. In the best case, the line should pass through all the points of our training
data set. In such a case the value of J(θ0, θ1) will be 0.

ML: Gradient Descent
So we have our hypothesis function and we have a way of measuring how well it fits the data. Now we need
to estimate the parameters in the hypothesis function. That's where gradient descent comes in.

Imagine that we graph our hypothesis function based on its fields θ0 and θ1 (actually we are graphing the cost
function as a function of the parameter estimates). This can be kind of confusing; we are moving up to a higher
level of abstraction. We are not graphing x and y themselves, but the parameter range of our hypothesis function and
the cost resulting from selecting a particular set of parameters.

We put θ0 on the x axis and θ1 on the y axis, with the cost function on the vertical z axis. The points on our graph
will be the result of the cost function using our hypothesis with those specific theta parameters.

We will know that we have succeeded when our cost function is at the very bottom of the pits in our graph, i.e.
when its value is the minimum.
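
A minimal sketch of that idea in Python follows (the exact update rule is derived later in the course; the learning rate alpha, the iteration count, and the toy data here are assumptions for illustration). Starting from an arbitrary (θ0, θ1), we repeatedly step in the direction that lowers the cost until we settle near the bottom of a pit.

```python
import numpy as np

def gradient_descent(x, y, alpha=0.01, iterations=5000):
    """Batch gradient descent on J(theta0, theta1) for h(x) = theta0 + theta1 * x."""
    m = len(y)
    theta0, theta1 = 0.0, 0.0            # start anywhere on the cost surface
    for _ in range(iterations):
        error = (theta0 + theta1 * x) - y
        grad0 = error.sum() / m          # dJ/dtheta0
        grad1 = (error * x).sum() / m    # dJ/dtheta1
        theta0 -= alpha * grad0          # step "downhill" on the cost surface
        theta1 -= alpha * grad1
    return theta0, theta1

# Data generated from y = 1 + 2x; the returned parameters should approach (1.0, 2.0).
x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([3.0, 5.0, 7.0, 9.0])
print(gradient_descent(x, y))
```

Each iteration moves the parameters a small step toward lower cost, so the returned values land close to the bottom of the cost surface for this data.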
