AI2025 Lecture02 Recording Slides
https://docs.google.com/presentation/d/1xC-lg8RnaO4wTwwQWYEJBUtjj2Caxn5QsHM0LEc9YVc/edit#slide=id.g27be7003ef_0_9
(Figure: inference examples. From image information, infer the label "CAT"; from study hours, predict exam points (4 hours -> ? points).)
(Figure: training. The model is trained with a labeled dataset.)
Machine Learning
Predict (test) with trained model
(Figure: prediction (test) with the trained model. From image information, predict the label; from study hours, predict the points (4 hours -> ? points).)
Input \mathbf{x} (features) -> Model -> Output \hat{y} (prediction), where \hat{y} \approx y, the label (actual value)
Training?
D = \{(\mathbf{x}^{(1)}, y^{(1)}), (\mathbf{x}^{(2)}, y^{(2)}), \ldots, (\mathbf{x}^{(m)}, y^{(m)})\}
Training means building a model so that it can predict the labels using the training data
Each row of data is called an observation or a tuple
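A minimal sketch of this setup in Python (the variable names and the tiny hours-to-points dataset are illustrative only, not taken from any library):

# Training data D: each (x, y) pair is one observation (hours studied, points earned)
D = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]

# A trained model maps input features x to a prediction y_hat that should be close to the label y
def model(x, w=2.0):          # assume a simple linear model; w = 2.0 happens to fit this data
    return x * w

for x, y in D:
    y_hat = model(x)
    print(x, y, y_hat)        # here each prediction y_hat equals its label y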
Prof. Woowhan Jung, DSLAB, Hanyang Univ.
Classification vs Regression
(Figure: regression predicts a continuous value, e.g. a predicted rating on a scale of roughly 60 to 100; classification predicts a discrete label.)
Q2. Classification? Regression?
▪ Recognizing "Cat" from an image
▪ Predicting exam points from study hours (4 hours -> ? points)
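A small sketch to make the contrast concrete (both functions and their outputs are hypothetical examples, not real trained models):

# Classification: the output is a discrete label from a fixed set of classes
def classify(image_features):
    return "Cat"                       # e.g., one of {"Cat", "Dog", ...}

# Regression: the output is a continuous number
def predict_points(hours):
    return 2.0 * hours                 # e.g., 4 study hours -> 8.0 points

print(classify("some image"), predict_points(4))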
Training data (hours studied and points earned):
Hours (x)   Points (y)
1           2
2           4
3           6
4           ?
Linear model: \hat{y} = x \cdot w
(Figure: Points vs. Hours with candidate lines for different weights, e.g. w = 2 and w = 3)

With w = 3:
x   y   \hat{y} = x \cdot w   loss = (\hat{y} - y)^2
1   2   3                     1
2   4   6                     4
3   6   9                     9
mean = 14/3

With w = 4:
x   y   \hat{y} = x \cdot w   loss
1   2   4                     4
2   4   8                     16
3   6   12                    36
mean = 56/3

With w = 2:
x   y   \hat{y} = x \cdot w   loss
1   2   2                     0
2   4   4                     0
3   6   6                     0
mean = 0/3

Per-sample loss for different weights:
x   y   w=0   w=1   w=2   w=3   w=4
1   2   4     1     0     1     4
2   4   16    4     0     4     16
3   6   36    9     0     9     36
MSE
import numpy as np
import matplotlib.pyplot as plt

# Training data: hours studied (x) and points earned (y)
x_data = [1.0, 2.0, 3.0]
y_data = [2.0, 4.0, 6.0]

# Model: y_hat = x * w (w is set by the sweep below)
def forward(x):
    return x * w

# Loss function: squared error for one sample
def loss(x, y):
    y_pred = forward(x)
    return (y_pred - y) * (y_pred - y)

# Sweep w from 0.0 to 4.0 and record the MSE for each candidate value
w_list = []
mse_list = []
for w in np.arange(0.0, 4.1, 0.1):
    print("w=", w)
    l_sum = 0
    for x_val, y_val in zip(x_data, y_data):
        y_pred_val = forward(x_val)
        l = loss(x_val, y_val)
        l_sum += l
        print("\t", x_val, y_val, y_pred_val, l)
    print("MSE=", l_sum / 3)
    w_list.append(w)
    mse_list.append(l_sum / 3)

# Plot loss vs. w; the minimum (MSE = 0) is at w = 2.0, matching the tables above
plt.plot(w_list, mse_list)
plt.ylabel('Loss')
plt.xlabel('w')
plt.show()
Examples:
▪ Area of house -> house price
▪ # of iPhones sold -> Apple's sales
Squared error: (\hat{y} - y)^2
Loss function: L(\hat{y}^{(i)}, y^{(i)}) = (y^{(i)} - \hat{y}^{(i)})^2
Cost function: J(\mathbf{w}, b) = \frac{1}{m} \sum_{i=1}^{m} L(\hat{y}^{(i)}, y^{(i)}) = \frac{1}{m} \sum_{i=1}^{m} (y^{(i)} - \hat{y}^{(i)})^2
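A direct translation of these definitions into Python might look as follows (a sketch; the array shapes and the tiny hours-to-points dataset are assumptions for illustration):

import numpy as np

def J(w, b, X, y):
    # Cost: average of the per-sample squared-error losses, (1/m) * sum((y - y_hat)^2)
    y_hat = X @ w + b                    # predictions for all m samples
    return np.mean((y - y_hat) ** 2)

X = np.array([[1.0], [2.0], [3.0]])      # m = 3 samples, one feature (hours)
y = np.array([2.0, 4.0, 6.0])            # labels (points)
print(J(np.array([2.0]), 0.0, X, y))     # 0.0, since w = 2 and b = 0 fit this data exactly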
▪ Given
  ▪ Training data D = \{(\mathbf{x}^{(1)}, y^{(1)}), (\mathbf{x}^{(2)}, y^{(2)}), \ldots, (\mathbf{x}^{(m)}, y^{(m)})\}
▪ Our goal
  ▪ Find \mathbf{w}, b that minimize J(\mathbf{w}, b) = \frac{1}{m} \sum_{i=1}^{m} (y^{(i)} - \hat{y}^{(i)})^2
Q. How?
Applicable methods: gradient descent, linear least squares, …
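For the second option, a one-feature least-squares fit can be sketched in closed form with NumPy (the slides only name the method; this particular formulation with a bias column is an illustrative choice):

import numpy as np

x = np.array([1.0, 2.0, 3.0])            # hours
y = np.array([2.0, 4.0, 6.0])            # points

# Design matrix with a bias column, so the model is y_hat = w*x + b
A = np.stack([x, np.ones_like(x)], axis=1)
(w, b), *_ = np.linalg.lstsq(A, y, rcond=None)
print(w, b)                               # approximately 2.0 and 0.0 for this data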
loss = MSE = \frac{1}{N} \sum_{n=1}^{N} (\hat{y}_n - y_n)^2
Goal: \arg\min_{w} loss(w)
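In the grid search above, this argmin is just the w whose recorded MSE is smallest; a self-contained, vectorized sketch of the same sweep (the grid and data mirror the earlier snippet):

import numpy as np

x = np.array([1.0, 2.0, 3.0])
y = np.array([2.0, 4.0, 6.0])

w_grid = np.arange(0.0, 4.1, 0.1)                            # same sweep as before
mse = np.array([np.mean((x * w - y) ** 2) for w in w_grid])  # MSE for each candidate w
best_w = w_grid[int(np.argmin(mse))]                         # arg min over the grid
print("best w =", best_w)                                    # approximately 2.0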
(Figure: loss vs. w, showing a random initial weight as the starting point, the gradient (slope) \partial loss / \partial w at that point, and the global loss minimum.)
Gradient Descent Algorithm
(Figure: loss vs. w. From the random initial weight (starting point), each update moves w toward the global loss minimum: w_{prev} -> w_{new}.)
Gradient (slope) = \frac{\partial loss}{\partial w}
w_{new} = w_{prev} - \alpha \frac{\partial loss}{\partial w}
\alpha = learning rate (a small value)
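Before the derivative is worked out on the next slides, the slope can also be estimated numerically, which is a handy sanity check (a sketch; loss_at and its single-sample data are hypothetical helpers, not part of the lecture code):

def loss_at(w, x=1.0, y=2.0):
    return (x * w - y) ** 2                   # squared-error loss for one sample

def numerical_gradient(w, eps=1e-5):
    # central difference: d(loss)/dw ~ (loss(w + eps) - loss(w - eps)) / (2 * eps)
    return (loss_at(w + eps) - loss_at(w - eps)) / (2 * eps)

alpha = 0.01                                   # learning rate (small value)
w = 4.0                                        # random initial weight (starting point)
w = w - alpha * numerical_gradient(w)          # one update: w_new = w_prev - alpha * gradient
print(w)                                       # roughly 3.96, one small step toward the minimum at w = 2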
Gradient Descent Algorithm
(Figure: loss vs. w. Near the minimum the gradient is smaller, so compared to the previous jump, w moves more slowly.)
Gradient (slope) = \frac{\partial loss}{\partial w}
w_{new} = w_{prev} - \alpha \frac{\partial loss}{\partial w}
\alpha = learning rate (a small value)
Calculate Derivative
loss = (\hat{y} - y)^2 = (x \cdot w - y)^2
w_{new} = w_{prev} - \alpha \frac{\partial loss}{\partial w}
\frac{\partial loss}{\partial w} = 2x(x \cdot w - y)
https://www.derivative-calculator.net/
(Figure: loss vs. w. From the random initial weight (starting point), gradient-descent updates move w toward the global loss minimum.)
Gradient (slope) = \frac{\partial loss}{\partial w}
w_{new} = w_{prev} - \alpha \frac{\partial loss}{\partial w}
w_{new} = w_{prev} - \alpha \cdot 2x(x \cdot w - y)
\alpha = learning rate (a small value)
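Putting the pieces together, a sketch of the full gradient-descent loop for this one-weight model (it mirrors the earlier grid-search code but updates w with the analytic gradient; the learning rate, initial weight, and epoch count are arbitrary choices):

x_data = [1.0, 2.0, 3.0]
y_data = [2.0, 4.0, 6.0]

w = 1.0           # random initial weight (starting point)
alpha = 0.01      # learning rate (small value)

def forward(x):
    return x * w

def gradient(x, y):
    # d(loss)/dw = 2x(x*w - y), from the derivative above
    return 2 * x * (x * w - y)

for epoch in range(100):
    for x_val, y_val in zip(x_data, y_data):
        w = w - alpha * gradient(x_val, y_val)   # w_new = w_prev - alpha * d(loss)/dw
    mse = sum((forward(x) - y) ** 2 for x, y in zip(x_data, y_data)) / len(x_data)

print("w =", w, "MSE =", mse)
print("Prediction for 4 study hours:", forward(4.0))   # approaches 8 points as w -> 2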