GR 1 Report Week 7
Contents
1. Introduction
2. The need for machine learning
3. Linear regression
   3.1. Cost function
   3.2. Gradient descent algorithm
   3.3. Multiple linear regression
4. Feature engineering
5. Polynomial regression
1. Introduction
This report covers the basics of supervised machine learning, focusing mainly on regression problems. It delves into fundamental concepts such as linear regression, the cost function, gradient descent, feature scaling and feature engineering.
3. Linear regression
The goal of machine learning is to find a model $f(x)$ that roughly fits the training data so that, from this model, the machine can predict future outcomes when it encounters data it has never seen before. A regression problem requires the machine to predict a value from an infinite range of possible values. The simplest method to tackle regression problems is called linear regression. In linear regression, the model takes the form:
$$f_{w,b}(x^{(i)}) = w x^{(i)} + b$$
Here, $w$ and $b$ are called the weight and bias of the model, while the superscript $(i)$ in $x^{(i)}$ denotes the $i$-th training example.
3.1. Cost function
To measure how well a particular choice of $w$ and $b$ fits the training data, a cost function $J(w, b)$ is used; for linear regression it is the squared-error cost over the $n$ training examples:
$$J(w, b) = \frac{1}{2n} \sum_{i=1}^{n} \left( f_{w,b}(x^{(i)}) - y^{(i)} \right)^2$$
Figure 1: The plot of the cost function $J(w, b)$
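As an illustration, here is a minimal NumPy sketch of how this squared-error cost could be computed for a set of training examples; the names compute_cost, x_train and y_train are placeholders, not part of the report.

```python
import numpy as np

def compute_cost(x, y, w, b):
    """Squared-error cost J(w, b) for single-feature linear regression."""
    n = x.shape[0]                 # number of training examples
    predictions = w * x + b        # f_{w,b}(x^(i)) for every example at once
    errors = predictions - y
    return np.sum(errors ** 2) / (2 * n)

# Tiny illustrative training set
x_train = np.array([1.0, 2.0, 3.0])
y_train = np.array([2.0, 4.0, 6.0])
print(compute_cost(x_train, y_train, w=2.0, b=0.0))   # perfect fit, prints 0.0
```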
3.2. Gradient descent algorithm
Since the cost function is convex, it has no local minima other than the global minimum. In linear regression, the minimum can be reached by starting at an arbitrary point $(w, b)$, calculating the partial derivatives $\frac{\partial}{\partial w} J(w, b)$ and $\frac{\partial}{\partial b} J(w, b)$ at that point, and nudging $w$ and $b$ in the direction in which the slope goes down, which means the cost function is heading toward the global minimum. This step is repeated until the cost is around the area of the global minimum, i.e. the partial derivatives approach 0. Here is how $w$ and $b$ are updated at each step:
$$w = w - \alpha \frac{\partial}{\partial w} J(w, b) = w - \alpha \frac{1}{n} \sum_{i=1}^{n} \left( f_{w,b}(x^{(i)}) - y^{(i)} \right) x^{(i)}$$
$$b = b - \alpha \frac{\partial}{\partial b} J(w, b) = b - \alpha \frac{1}{n} \sum_{i=1}^{n} \left( f_{w,b}(x^{(i)}) - y^{(i)} \right)$$
The $\alpha$ here is the learning rate of the model, a hyperparameter that is set before training starts. Note that $\alpha$ needs to be chosen carefully: if $\alpha$ is too small, the model will take a long time to train; on the contrary, if $\alpha$ is too big, the model can actually get worse over time because the algorithm overshoots the minimum again and again.
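To make the update rule concrete, below is one possible NumPy implementation of batch gradient descent for the single-feature model; the function name, the learning rate and the synthetic data are illustrative assumptions, not values from the report.

```python
import numpy as np

def gradient_descent(x, y, alpha=0.01, num_iters=1000):
    """Batch gradient descent for the single-feature model f_{w,b}(x) = w*x + b."""
    n = x.shape[0]
    w, b = 0.0, 0.0                      # arbitrary starting point
    for _ in range(num_iters):
        errors = (w * x + b) - y         # f_{w,b}(x^(i)) - y^(i) for every example
        dj_dw = np.sum(errors * x) / n   # partial derivative of J with respect to w
        dj_db = np.sum(errors) / n       # partial derivative of J with respect to b
        w -= alpha * dj_dw               # update both parameters using the same gradients
        b -= alpha * dj_db
    return w, b

x_train = np.array([1.0, 2.0, 3.0, 4.0])
y_train = np.array([3.0, 5.0, 7.0, 9.0])    # generated from y = 2x + 1
w, b = gradient_descent(x_train, y_train, alpha=0.05, num_iters=5000)
print(w, b)                                  # should end up close to w = 2, b = 1
```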
3.3. Multiple linear regression
When each training example has multiple features, the input is represented as a vector $\vec{x}$ and the model uses a weight vector $\vec{w}$:
$$f_{\vec{w},b}(\vec{x}) = \vec{w} \cdot \vec{x} + b$$
By representing the training data as vectors, the machine can use GPU parallel computing to calculate the dot product of $\vec{w}$ and $\vec{x}$ efficiently.
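For example, a single prediction for a multi-feature example can be computed with one dot product call, as in this small sketch (the weights and feature values are made up for illustration):

```python
import numpy as np

# Hypothetical weights and one training example with three features
w = np.array([0.5, -1.2, 3.0])
x = np.array([1200.0, 3.0, 2.0])
b = 10.0

# f_{w,b}(x) = w . x + b computed as a single vectorized dot product
prediction = np.dot(w, x) + b
print(prediction)    # 612.4
```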
For the gradient descent implementation, there is a slight difference with multiple linear regression: every feature gets its own weight update:
$$w_j = w_j - \alpha \frac{\partial}{\partial w_j} J(\vec{w}, b) = w_j - \alpha \frac{1}{n} \sum_{i=1}^{n} \left( f_{\vec{w},b}(\vec{x}^{(i)}) - y^{(i)} \right) x_j^{(i)} \quad \text{for } j = 0, 1, \dots, m-1$$
$$b = b - \alpha \frac{\partial}{\partial b} J(\vec{w}, b) = b - \alpha \frac{1}{n} \sum_{i=1}^{n} \left( f_{\vec{w},b}(\vec{x}^{(i)}) - y^{(i)} \right)$$
where $m$ is the number of features.
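Below is one possible vectorized version of these updates, assuming the training examples are stored in an (n, m) NumPy matrix X; the function name and the synthetic data are placeholders.

```python
import numpy as np

def gradient_descent_multi(X, y, alpha=0.01, num_iters=1000):
    """Batch gradient descent for f_{w,b}(x) = w . x + b, where X has shape (n, m)."""
    n, m = X.shape
    w = np.zeros(m)                  # one weight per feature
    b = 0.0
    for _ in range(num_iters):
        errors = X @ w + b - y       # shape (n,): f(x^(i)) - y^(i)
        dj_dw = X.T @ errors / n     # shape (m,): gradient for every w_j at once
        dj_db = np.sum(errors) / n
        w -= alpha * dj_dw
        b -= alpha * dj_db
    return w, b

# Synthetic example with 2 features generated from w = [2, -1], b = 0.5
X_train = np.array([[1.0, 5.0], [2.0, 1.0], [3.0, 4.0], [4.0, 2.0]])
y_train = X_train @ np.array([2.0, -1.0]) + 0.5
w, b = gradient_descent_multi(X_train, y_train, alpha=0.05, num_iters=20000)
print(w, b)    # should approach [2, -1] and 0.5
```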
4. Feature engineering
Feature engineering is a crucial part of optimization in machine learning. It is the practice of organizing and processing the data before training so that the final input $X$ fits the model best. One key concept of feature engineering is feature scaling, where the features are rescaled so that they all have a similar impact on the final state of the model. For example, suppose a housing price input has 2 features: the size of the house in $m^2$, ranging from 500 to 2000, and the number of bedrooms, ranging from 1 to 5. In this case, the value of the first feature is much higher than the value of the second one. This leads to an uneven influence on the model, where even a small change in the size of the house already outweighs any change in the number of bedrooms. The model will try to compensate by choosing the weights $w_1$ and $w_2$ so that their influences balance out, leading to contour plots of the cost that are "skinny" in the $w_1$ direction compared to the $w_2$ direction and causing the gradient descent algorithm to run slower, since it has a longer path to travel before reaching the global minimum.
One common way to rescale a feature is mean normalization:
$$x_{\text{rescaled}} = \frac{x - \mu}{\max(x) - \min(x)}$$
where $\mu$ is the average value of the feature.
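As a sketch, mean normalization could be applied column by column to a feature matrix like this; the housing numbers below simply mirror the report's example, and the function name is hypothetical.

```python
import numpy as np

def mean_normalize(X):
    """Rescale every feature (column) so that all features span a similar range."""
    mu = X.mean(axis=0)                              # average value of each feature
    feature_range = X.max(axis=0) - X.min(axis=0)    # max(x) - min(x) per feature
    return (X - mu) / feature_range

# House size in m^2 (500-2000) and number of bedrooms (1-5)
X = np.array([[500.0, 1.0],
              [1200.0, 3.0],
              [2000.0, 5.0]])
print(mean_normalize(X))    # both columns now lie roughly within [-0.5, 0.5]
```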
5. Polynomial regression
Sometimes, linear regression is not enough to fit the input data. In that case, polynomial regression can be used to fit a curve over the training data instead of just a straight line. Polynomial regression enables a model to fit the data better, but it might cause some issues such as calculation overflow, since polynomial regression introduces features whose values grow exponentially with the degree; this makes feature scaling even more important. A general form of polynomial regression:
$$f_{\vec{w},b}(x) = w_1 x + w_2 x^2 + w_3 x^3 + \dots + w_n x^n + b$$
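One way to realize this in practice is to build the powers of x as extra feature columns and then fit a linear model on them. The sketch below is illustrative: the names and data are made up, and for brevity it solves the fit with NumPy's least-squares routine instead of gradient descent.

```python
import numpy as np

def polynomial_features(x, degree):
    """Build the feature matrix [x, x^2, ..., x^degree] from a 1-D input array."""
    return np.column_stack([x ** p for p in range(1, degree + 1)])

# Synthetic data following a cubic curve plus a little noise (illustrative only)
rng = np.random.default_rng(0)
x = np.linspace(-2.0, 2.0, 50)
y = 0.5 * x**3 - x + 1.0 + rng.normal(scale=0.1, size=x.shape)

# Expand x into polynomial features, then fit a linear model on those features
X_poly = polynomial_features(x, degree=3)
X_design = np.column_stack([X_poly, np.ones_like(x)])   # last column plays the role of b
coeffs, *_ = np.linalg.lstsq(X_design, y, rcond=None)
w, b = coeffs[:-1], coeffs[-1]
print(w, b)    # w should be close to [-1, 0, 0.5] and b close to 1
```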