
5.1. Intro To Machine Learning

This document is an introduction to Machine Learning, covering its definition, its main types (supervised, unsupervised, and reinforcement learning), and core techniques such as linear regression. It details supervised learning tasks (classification and regression) and unsupervised learning tasks (clustering and dimensionality reduction), and explains the concepts of hypothesis, cost function, and the gradient descent algorithm used to train machine learning models.

Intro to Machine Learning
Part 1

Dr. Oybek Eraliev
Department of Computer Engineering
Inha University in Tashkent
Email: [email protected]


Content

• What is Machine Learning?
• Supervised Learning
• Unsupervised Learning
• Linear Regression with One Variable
• Linear Regression with Multiple Variables


What is Machine Learning?

Machine Learning (ML) – the use and development of computer systems that
are able to learn and adapt without following explicit instructions, by using
algorithms and statistical models to analyse and draw inferences from patterns
in data.

Machine Learning (ML) – the field of study that gives computers the ability to learn without being explicitly programmed. — Arthur Samuel (1959)



What is Machine Learning?

Machine Learning is a part of Artificial Intelligence.

[Figure: nested circles — Deep Learning inside Machine Learning inside Artificial Intelligence]


What is Machine Learning?
Machine learning is broadly divided into:

• Supervised learning
• Unsupervised learning
• Reinforcement learning
• Recommender systems


Content

• What is Machine Learning?
• Supervised Learning
• Unsupervised Learning
• Linear Regression with One Variable
• Linear Regression with Multiple Variables


Supervised Learning
Supervised learning splits into two problem types:

A. Classification problem
Algorithms:
• Logistic Regression
• Decision Tree
• Naive Bayes
• K-Nearest Neighbors
• Support Vector Machine
Examples:
• Email spam detection
• Speech recognition

B. Regression problem
Algorithms:
• Linear Regression
• Ridge Regression
• Stepwise Regression
Examples:
• Stock market prediction
• Rainfall prediction
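As a quick illustration of the classification setting, here is a minimal sketch using scikit-learn's LogisticRegression (the library choice is mine — the slides name the algorithm but no library), with invented toy features standing in for a spam-detection task:

```python
# A minimal classification sketch, assuming scikit-learn is installed.
# The two features (e.g. link count, count of "spammy" words) and the
# labels are invented toy data, not taken from the slides.
from sklearn.linear_model import LogisticRegression

X = [[0, 1], [1, 0], [5, 8], [6, 7], [1, 1], [7, 9]]  # feature vectors
y = [0, 0, 1, 1, 0, 1]                                # 0 = ham, 1 = spam

clf = LogisticRegression()
clf.fit(X, y)

print(clf.predict([[6, 8], [0, 0]]))  # expected: [1 0]
```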
Supervised Learning
Classification problem

[Figure: images of apples and a banana (input) are passed to the model (algorithm), which outputs the labels "Apple" and "Banana"]


Supervised Learning
Regression problem
Temp (°C)   Temp (°F)
10          50
13          55.4
22          71.6
35          95

[Figure: a new input of 45 °C is passed to the model (algorithm), which outputs 113 °F]
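A minimal regression sketch for the same table, again assuming scikit-learn (the library choice is mine, not the slides'):

```python
# A minimal regression sketch, assuming scikit-learn is installed.
# Fits a line to the Celsius -> Fahrenheit table from the slide.
from sklearn.linear_model import LinearRegression

X = [[10], [13], [22], [35]]   # Temp (°C)
y = [50, 55.4, 71.6, 95]       # Temp (°F)

model = LinearRegression()
model.fit(X, y)

print(model.intercept_, model.coef_[0])  # ~32.0 and ~1.8
print(model.predict([[45]]))             # ~[113.]
```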


Content

• What is Machine Learning?
• Supervised Learning
• Unsupervised Learning
• Linear Regression with One Variable
• Linear Regression with Multiple Variables


Unsupervised Learning

Unsupervised learning splits into two problem types:

Clustering problem:
• K-Means
• MeanShift

Dimensionality reduction:
• Principal Component Analysis (PCA)
• Linear Discriminant Analysis (LDA)
• Autoencoders (AEs)


Unsupervised Learning
Clustering problem

[Figure: unlabeled data points (input) are passed to the model (algorithm), which groups them into clusters (output)]
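A minimal clustering sketch, assuming scikit-learn (library choice is mine); the 2-D points are invented toy data:

```python
# A minimal clustering sketch, assuming scikit-learn is installed.
# K-Means receives unlabeled points and assigns each to one of k clusters.
from sklearn.cluster import KMeans

X = [[1, 1], [1.5, 2], [1, 0.5],    # one blob
     [8, 8], [8.5, 9], [9, 8]]      # another blob

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)
labels = kmeans.fit_predict(X)

print(labels)                   # e.g. [0 0 0 1 1 1] (cluster ids are arbitrary)
print(kmeans.cluster_centers_)  # one center near each blob
```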
Content

• What is Machine Learning?
• Supervised Learning
• Unsupervised Learning
• Linear Regression with One Variable
• Linear Regression with Multiple Variables


Dataset

Temp (°C), x   Temp (°F), y
10             50
15             59
20             68
25             77
30             86
35             ?

[Figure: scatter plot of the data, Temp (°F) from 50 to 100 on the vertical axis against Temp (°C)]

F = 32 + 1.8·t


House prices (Dataset)

Size (feet²), x   Price (in 1000s of $), y
2104              460
1416              232
1534              315
852               178
…                 …

[Figure: scatter plot of price (200–800) against size (feet²), 500–3500]
Training set of the temperature dataset (m rows):

     Temp (°C), x   Temp (°F), y
1    10             50
2    15             59
3    20             68
4    25             77
…    …              …

Notation:
x = "input" variable / feature
y = "output" variable / target
m = number of training examples
(x, y) = one training example
$(x^{(i)}, y^{(i)})$ = the i-th training example

For example: $(x^{(1)}, y^{(1)}) = (10, 50)$, $(x^{(2)}, y^{(2)}) = (15, 59)$, $(x^{(3)}, y^{(3)}) = (20, 68)$.


How do we represent h?

[Figure: Training Set → Learning Algorithm → Hypothesis h; h takes the input x (Temp, °C) and produces the estimated output y (Temp, °F)]

Hypothesis:   $h_\theta(x) = \theta_0 + \theta_1 x$, equivalently $y = b + wx$.

Matching this to $F = 32 + 1.8 \cdot t$ gives $x = t$, $\theta_0 = b = 32$, and $\theta_1 = w = 1.8$.

This model is Linear Regression with one variable.
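A two-line sketch of this hypothesis (plain Python, no libraries needed); the parameter values are the ones recovered above:

```python
# The hypothesis h_theta(x) = theta0 + theta1 * x for the temperature data.
theta0, theta1 = 32.0, 1.8

def h(x):
    return theta0 + theta1 * x

print(h(35))  # 95.0 -- fills in the "?" row of the dataset (35 °C -> 95 °F)
```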
Hypothesis: $h_\theta(x) = \theta_0 + \theta_1 x$, where $\theta_0, \theta_1$ are the parameters.

Temperature dataset:

     Temp (°C), x   Temp (°F), y
1    10             50
2    15             59
3    20             68
4    25             77
…    …              …

Different parameter choices give different hypotheses:

[Figure: three plots of $h_\theta(x)$ on the same axes, x from 0 to 3]
• $\theta_0 = 1.5$, $\theta_1 = 0$:   $h(x) = 1.5 + 0 \cdot x$ (horizontal line)
• $\theta_0 = 0$, $\theta_1 = 0.5$:   $h(x) = 0 + 0.5 \cdot x$ (line through the origin)
• $\theta_0 = 1$, $\theta_1 = 0.5$:   $h(x) = 1 + 0.5 \cdot x$


Cost function

[Figure: scatter plot of training examples $(x^{(i)}, y^{(i)})$ with a candidate line $h_\theta(x)$]

Idea: choose $\theta_0, \theta_1$ so that $h_\theta(x)$ is close to $y$ for our training examples $(x, y)$:

$$\underset{\theta_0,\, \theta_1}{\text{minimize}} \;\; \frac{1}{2m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2, \qquad h_\theta(x^{(i)}) = \theta_0 + \theta_1 x^{(i)}$$

The quantity being minimized is the cost function

$$J(\theta_0, \theta_1) = \frac{1}{2m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2,$$

also called the squared error function.
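A minimal NumPy sketch of this cost function (the library choice is mine), evaluated on the temperature data:

```python
# Squared error cost J(theta0, theta1) = 1/(2m) * sum((h(x_i) - y_i)^2).
import numpy as np

x = np.array([10, 15, 20, 25, 30])  # Temp (°C)
y = np.array([50, 59, 68, 77, 86])  # Temp (°F)

def J(theta0, theta1):
    m = len(x)
    h = theta0 + theta1 * x
    return np.sum((h - y) ** 2) / (2 * m)

print(J(32.0, 1.8))  # 0.0 -- this line fits the data exactly
print(J(0.0, 2.0))   # > 0 -- a worse hypothesis has higher cost
```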


Hypothesis:
  $h_\theta(x) = \theta_0 + \theta_1 x$
Parameters:
  $\theta_0, \theta_1$
Cost function:
  $J(\theta_0, \theta_1) = \frac{1}{2m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2$
Goal:
  $\underset{\theta_0,\, \theta_1}{\text{minimize}} \; J(\theta_0, \theta_1)$

Simplified (fix $\theta_0 = 0$):
  $h_\theta(x) = \theta_1 x$
  $J(\theta_1) = \frac{1}{2m} \sum_{i=1}^{m} \left( \theta_1 x^{(i)} - y^{(i)} \right)^2$
  $\underset{\theta_1}{\text{minimize}} \; J(\theta_1)$


Worked example with the simplified hypothesis $h_\theta(x) = \theta_1 x$ and the training set (1, 1), (2, 2), (3, 3), so $m = 3$:

[Figure: left — the data points with the lines $h_\theta(x) = \theta_1 x$ for $\theta_1 = 1$, $\theta_1 = 0.5$, and $\theta_1 = 0$; right — the bowl-shaped curve $J(\theta_1)$ with these values marked]

$$J(\theta_1) = \frac{1}{2m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2$$

$$J(1) = \frac{1}{2 \cdot 3} \left( (1-1)^2 + (2-2)^2 + (3-3)^2 \right) = \frac{1}{6} \cdot 0 = 0$$

$$J(0.5) = \frac{1}{2 \cdot 3} \left( (0.5-1)^2 + (1-2)^2 + (1.5-3)^2 \right) = \frac{1}{6} \cdot 3.5 \approx 0.58$$

$$J(0) = \frac{1}{2 \cdot 3} \left( (0-1)^2 + (0-2)^2 + (0-3)^2 \right) = \frac{1}{6} \cdot 14 \approx 2.33$$

θ₁       1      0.5     0      1.5     2
J(θ₁)    0      0.58    2.33   0.58    2.33

Goal: $\underset{\theta_1}{\text{minimize}} \; J(\theta_1)$ — the minimum is at $\theta_1 = 1$, where $J(\theta_1) = 0$.
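A short check of these numbers in NumPy (a sketch; the library choice is mine):

```python
# Reproduces the worked J(theta1) values for the toy set (1,1), (2,2), (3,3).
import numpy as np

x = np.array([1, 2, 3])
y = np.array([1, 2, 3])

def J(theta1):
    m = len(x)
    return np.sum((theta1 * x - y) ** 2) / (2 * m)

for t in [1, 0.5, 0, 1.5, 2]:
    print(t, round(J(t), 2))  # 1 -> 0.0, 0.5 -> 0.58, 0 -> 2.33, 1.5 -> 0.58, 2 -> 2.33
```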
Cost function:   $J(\theta_0, \theta_1) = \frac{1}{2m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2$

Goal:   $\underset{\theta_0,\, \theta_1}{\text{minimize}} \; J(\theta_0, \theta_1)$

Outline:
• Start with some $(\theta_0, \theta_1)$, e.g. $\theta_0 = 0$, $\theta_1 = 0$.
• Keep changing $\theta_0, \theta_1$ to reduce $J(\theta_0, \theta_1)$ until we hopefully end up at a minimum.


[Figure: 3-D surface plot of $J(\theta_0, \theta_1)$ over the $(\theta_0, \theta_1)$ plane. Source: Machine Learning course (Andrew Ng)]


Gradient descent algorithm

Repeat until convergence {
    $\theta_j := \theta_j - \alpha \frac{\partial}{\partial \theta_j} J(\theta_0, \theta_1)$   (for $j = 0$ and $j = 1$)
}

Here $\alpha$ is the learning rate, and $\frac{\partial}{\partial \theta_j} J(\theta_0, \theta_1)$ is the derivative part.

For the simplified problem:   $\underset{\theta_1}{\text{minimize}} \; J(\theta_1)$, with $\theta_1 \in \mathbb{R}$.
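A minimal sketch of this update rule on a one-parameter cost (plain Python; the quadratic $J(\theta_1) = (\theta_1 - 1)^2$ is my stand-in example, chosen so the minimum is known to sit at $\theta_1 = 1$):

```python
# Gradient descent on J(theta1) = (theta1 - 1)^2, whose derivative is
# dJ/dtheta1 = 2 * (theta1 - 1). The minimum is at theta1 = 1.
alpha = 0.1     # learning rate
theta1 = 3.0    # arbitrary starting point

for step in range(100):
    grad = 2 * (theta1 - 1)         # derivative part
    theta1 = theta1 - alpha * grad  # the update rule

print(theta1)  # ~1.0
```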


Effect of the slope (for the simplified cost $J(\theta_1)$):

Positive slope: $\frac{d}{d\theta_1} J(\theta_1) \geq 0$, so
$\theta_1 := \theta_1 - \alpha \cdot (\text{positive value})$ — $\theta_1$ decreases, moving toward the minimum.

Negative slope: $\frac{d}{d\theta_1} J(\theta_1) \leq 0$, so
$\theta_1 := \theta_1 - \alpha \cdot (\text{negative value})$ — $\theta_1$ increases, moving toward the minimum.

[Figure: two plots of $J(\theta_1)$ — one with the current $\theta_1$ on the right branch (positive slope), one on the left branch (negative slope)]


$\theta_1 := \theta_1 - \alpha \frac{d}{d\theta_1} J(\theta_1)$

If $\alpha$ is too small, gradient descent can be slow.

If $\alpha$ is too large, gradient descent can overshoot the minimum. It may fail to converge, or even diverge.

[Figure: two plots of $J(\theta_1)$ — tiny steps creeping toward the minimum for small $\alpha$; steps bouncing across and away from the minimum for large $\alpha$]


[Figure: $J(\theta_1)$ with the current value of $\theta_1$ at a local optimum, where the slope = 0]

If the current value of $\theta_1$ is at a local optimum, the derivative is zero, so the update

$\theta_1 := \theta_1 - \alpha \frac{d}{d\theta_1} J(\theta_1) = \theta_1 - \alpha \cdot 0 = \theta_1$

leaves $\theta_1$ unchanged.


Gradient descent can converge to a local minimum even with the learning rate $\alpha$ fixed:

$\theta_1 := \theta_1 - \alpha \frac{d}{d\theta_1} J(\theta_1)$

As we approach a local minimum, the derivative shrinks, so gradient descent automatically takes smaller steps. There is no need to decrease $\alpha$ over time.

[Figure: $J(\theta_1)$ with successively smaller steps descending from point $a$ toward the minimum at $b$]


Temperature (Dataset):   $h_\theta(x) = \theta_0 + \theta_1 x$

Temp (°C), x   Temp (°F), y
10             50
15             59
20             68
25             77
30             86

[Figure: the data points with the fitted line, y (50–100) against x (10–30)]

From $F = 32 + 1.8 \cdot t$:   $\theta_0 = 32$, $\theta_1 = 1.8$, so $h_\theta(x) = 32 + 1.8 \cdot x$.


Gradient descent for the linear regression model

Linear regression model:
  $h_\theta(x) = \theta_0 + \theta_1 x$
  $J(\theta_0, \theta_1) = \frac{1}{2m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2$
  Goal: $\underset{\theta_0,\, \theta_1}{\text{minimize}} \; J(\theta_0, \theta_1)$

Gradient descent algorithm:
Repeat until convergence {
    $\theta_j := \theta_j - \alpha \frac{\partial}{\partial \theta_j} J(\theta_0, \theta_1)$   (for $j = 0$ and $j = 1$)
}

Computing the derivative part:

$$\frac{\partial}{\partial \theta_j} J(\theta_0, \theta_1) = \frac{\partial}{\partial \theta_j} \frac{1}{2m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2 = \frac{\partial}{\partial \theta_j} \frac{1}{2m} \sum_{i=1}^{m} \left( \theta_0 + \theta_1 x^{(i)} - y^{(i)} \right)^2$$

$$j = 0: \quad \frac{\partial}{\partial \theta_0} J(\theta_0, \theta_1) = \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)$$

$$j = 1: \quad \frac{\partial}{\partial \theta_1} J(\theta_0, \theta_1) = \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) \cdot x^{(i)}$$


Gradient descent algorithm

Repeat until convergence {
    $\theta_0 := \theta_0 - \alpha \cdot \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)$
    $\theta_1 := \theta_1 - \alpha \cdot \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) \cdot x^{(i)}$
    (update $\theta_0$ and $\theta_1$ simultaneously)
}
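A minimal NumPy implementation of these two updates on the temperature data (the library choice, $\alpha$, and iteration count are mine; on this unscaled data a larger $\alpha$ such as 0.01 can diverge, so the sketch uses a smaller value):

```python
# Batch gradient descent for h(x) = theta0 + theta1 * x on the
# Celsius -> Fahrenheit data. Both parameters are updated simultaneously.
import numpy as np

x = np.array([10.0, 15.0, 20.0, 25.0, 30.0])  # Temp (°C)
y = np.array([50.0, 59.0, 68.0, 77.0, 86.0])  # Temp (°F)
m = len(x)

theta0, theta1 = 0.0, 0.0  # start at (0, 0)
alpha = 0.004              # chosen small enough to converge on unscaled x

for _ in range(50_000):
    error = theta0 + theta1 * x - y   # h(x_i) - y_i for all i
    grad0 = error.sum() / m           # dJ/dtheta0
    grad1 = (error * x).sum() / m     # dJ/dtheta1
    theta0, theta1 = theta0 - alpha * grad0, theta1 - alpha * grad1

print(theta0, theta1)  # ~32.0 and ~1.8, i.e. F = 32 + 1.8 * C
```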


Putting it together: gradient descent iterations on the temperature data, with $\alpha = 0.01$:

[Figure: left — the data with the hypothesis line $h_\theta(x)$ at each numbered iteration, from a flat line (1) to the exact fit (5); right — the cost $J(\theta_0, \theta_1)$ decreasing along the gradient descent steps 1–5]

Iterations:
1. $\theta_0 = 0$, $\theta_1 = 0$
2. $\theta_0 = \ldots$, $\theta_1 = \ldots$
3. $\theta_0 = \ldots$, $\theta_1 = \ldots$
4. $\theta_0 = \ldots$, $\theta_1 = \ldots$
5. $\theta_0 = 32$, $\theta_1 = 1.8$

Cost function:   $J(\theta_0, \theta_1) = \frac{1}{2m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2$
