

AI System Semiconductor Design


Lecture 2: Introduction to Machine Learning
Lecturer: Taewook Kang
Acknowledgments
Lecture material adapted from
PyTorchZeroToAll, Prof. Sung Kim, HKUST
Prof. Woowhan Jung, DSLAB, Hanyang Univ.
CM20315 Prof. Simon Prince, Dr. Georgios Exarchakis, and Dr. Andrew Barnes, University of Bath
Dr. Hyungjoo Seo, Apple, CA, USA
Class Schedule - Python Part
▪ What is ML?
▪ Classification vs. Regression
▪ Linear regression
▪ Gradient descent for linear regression
▪ Logistic regression
▪ Shallow Neural Networks
▪ Deep Neural Networks
▪ Stochastic gradient descent (SGD)
▪ MNIST Python implementation



What is ML?

https://docs.google.com/presentation/d/1xC-lg8RnaO4wTwwQWYEJBUtjj2Caxn5QsHM0LEc9YVc/edit#slide=id.g27be7003ef_0_9



What is Human Intelligence?



What is Human Intelligence?
What to eat for lunch?



What is Human Intelligence?
What to eat for lunch?

Information → Infer / Guess



What is Human Intelligence?
What to wear?

Information → Infer



What is Human Intelligence?
What is this picture?

Image information → Prediction: CAT



What is Human Intelligence?
What is this number?

Image information → Prediction: 2



What is Human Intelligence?
What would be the grade if I study 4 hours?

Information: 4 hours → Prediction: ? points



Machine Learning
What to wear?

Information → Infer



https://styledna.ai/
Machine Learning
What is this picture?

Image information → Prediction: CAT



Machine Learning
Machine needs lots of training

Image information → Prediction: 2



Machine Learning
Machine needs lots of training
Labeled dataset → training → Model
Machine Learning
Predict (test) with trained model

Test dataset → Trained model:
Image information → Prediction: 2



Machine Learning
What would be the grade if I study 4 hours?

Information: 4 hours → Prediction: ? points

Hours (x) | Points (y) | Dataset
1         | 2          | Training
2         | 4          | Training
3         | 6          | Training
4         | ?          | Test



Deep Learning?

Deep Learning by Ian Goodfellow, Yoshua Bengio, Aaron Courville


Supervised Learning

Source: https://udlbook.github.io/udlbook/


SUPERVISED LEARNING



Supervised learning

Goal: generalize the input-output relationship

Input $\mathbf{x}$ (features) → Model → Output $\hat{y}$ (prediction) ≈ $y$ (label, actual)

Training?
Dataset: $D = \{(\mathbf{x}^{(1)}, y^{(1)}), (\mathbf{x}^{(2)}, y^{(2)}), \ldots, (\mathbf{x}^{(m)}, y^{(m)})\}$
Training means building a model that can predict the labels from the training data.
Each row of data is called an observation or a tuple.

Source: Prof. Woowhan Jung, DSLAB, Hanyang Univ.
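As a small illustration of this notation, a labeled dataset can be stored in Python as a list of (features, label) tuples, one per observation — the values below are hypothetical:

# Toy labeled dataset D = {(x^(i), y^(i))}; each tuple is one observation.
D = [
    ([1.0, 2.0], 0.0),  # (x^(1), y^(1))
    ([2.0, 0.5], 1.0),  # (x^(2), y^(2))
    ([3.0, 1.5], 0.0),  # (x^(3), y^(3))
]

for x, y in D:
    print("features:", x, "label:", y)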
Classification vs Regression

[Figure: regression example — life span (years), axis from 60 to 100]

            | Classification            | Regression
Output type | Categorical value (class) | Numeric value




Classification vs Regression
Q1. Classification? Regression?
[Figure: predicting life span (years) on a 60-100 scale; a predicted rating]

Q2. Classification? Regression?
[Figure: images to be labeled Cat or Dog]

            | Classification            | Regression
Output type | Categorical value (class) | Numeric value


Classification or Regression?

• Univariate regression problem (one output, real value)


• Fully connected network



Classification or Regression?

• Multivariate regression problem (>1 output, real value)


• Graph neural network



Text Classification

• Binary classification problem (two discrete classes)


• Transformer network



Image Classification

• Multiclass classification problem (discrete classes, >2 possible classes)


• Convolutional neural network (CNN)
LINEAR REGRESSION



Linear Regression – Problem Statement

What would be the grade if I study 4 hours?

Information: 4 hours → Prediction: ? points

Hours (x) | Points (y) | Dataset
1         | 2          | Training
2         | 4          | Training
3         | 6          | Training
4         | ?          | Test

This is a supervised learning problem.



Linear Regression – Model Design

What would be the best model for the data? Linear?

Let's make it simple!

Hours (x) | Points (y)
1         | 2
2         | 4
3         | 6
4         | ?

General form: $\hat{y} = xw + b$
Linear (simplified): $\hat{y} = xw$

(The hat denotes a predicted value.)



Linear Regression – Model Design

* The machine starts with a random guess: w = random value

Hours (x) | Points (y)
1         | 2
2         | 4
3         | 6

[Figure: training data points (Hours vs. Points) with three candidate lines w1, w2, w3]

Goal: find a line that fits the y best!


Training Loss (Error)

Hours (x) | Points (y) | Prediction ŷ (w=3) | Loss (w=3)
1         | 2          | 3                  | 1
2         | 4          | 6                  | 4
3         | 6          | 9                  | 9
mean = 14/3 ≈ 4.67

MSE: Mean Square Error
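Worked out explicitly for w = 3: each prediction is $\hat{y} = 3x$ while the label is $y = 2x$, so each per-sample loss is $(\hat{y} - y)^2 = x^2$ and

$$\mathrm{MSE} = \frac{1^2 + 2^2 + 3^2}{3} = \frac{14}{3} \approx 4.67$$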



Training Loss (Error)

Hours (x) | Points (y) | Prediction ŷ (w=4) | Loss (w=4)
1         | 2          | 4                  | 4
2         | 4          | 8                  | 16
3         | 6          | 12                 | 36
mean = 56/3 ≈ 18.7

MSE: Mean Square Error



Training Loss (Error)

Hours (x) | Points (y) | Prediction ŷ (w=2) | Loss (w=2)
1         | 2          | 2                  | 0
2         | 4          | 4                  | 0
3         | 6          | 6                  | 0
mean = 0/3 = 0

MSE: Mean Square Error



Training Loss (Error)

MSE: mean square error

Hours (x) | Points (y) | Loss (w=0)  | Loss (w=1) | Loss (w=2) | Loss (w=3) | Loss (w=4)
1         | 2          | 4           | 1          | 0          | 1          | 4
2         | 4          | 16          | 4          | 0          | 4          | 16
3         | 6          | 36          | 9          | 0          | 9          | 36
MSE       |            | 56/3 ≈ 18.7 | 14/3 ≈ 4.7 | 0          | 14/3 ≈ 4.7 | 56/3 ≈ 18.7



Loss Graph
w      | 0           | 1          | 2 | 3          | 4
MSE(w) | 56/3 ≈ 18.7 | 14/3 ≈ 4.7 | 0 | 14/3 ≈ 4.7 | 56/3 ≈ 18.7

[Figure: MSE plotted against w — a parabola with its minimum at w = 2]



Coding Practice: Model & Loss

w = 1.0  # a random guess: a random value

# model for the forward pass
def forward(x):
    return x * w

# Loss function
def loss(x, y):
    y_pred = forward(x)
    return (y_pred - y) * (y_pred - y)



Compute Loss for w

for w in np.arange(0.0, 4.1, 0.1):
    print("w=", w)
    l_sum = 0
    for x_val, y_val in zip(x_data, y_data):
        y_pred_val = forward(x_val)
        l = loss(x_val, y_val)
        l_sum += l
        print("\t", x_val, y_val, y_pred_val, l)
    print("MSE=", l_sum / 3)



Plot Loss for w

w_list = []
mse_list = []
for w in np.arange(0.0, 4.1, 0.1):
    print("w=", w)
    l_sum = 0
    for x_val, y_val in zip(x_data, y_data):
        y_pred_val = forward(x_val)
        l = loss(x_val, y_val)
        l_sum += l
        print("\t", x_val, y_val, y_pred_val, l)
    print("MSE=", l_sum / 3)
    w_list.append(w)
    mse_list.append(l_sum / 3)

plt.plot(w_list, mse_list)
plt.ylabel('Loss')
plt.xlabel('w')
plt.show()



Practice
▪ Stop the lecture here
▪ Complete the loss plot code
▪ Run the code in your own Python environment
▪ The answer code follows
▪ If you are already familiar with Python, this should be a very easy task



Complete Code Answer
import numpy as np
import matplotlib.pyplot as plt

x_data = [1.0, 2.0, 3.0]
y_data = [2.0, 4.0, 6.0]

w = 1.0  # a random guess: random value, 1.0

# our model for the forward pass
def forward(x):
    return x * w

# Loss function
def loss(x, y):
    y_pred = forward(x)
    return (y_pred - y) * (y_pred - y)

w_list = np.arange(0.0, 4.1, 0.1)
mse_list = []
for w in w_list:
    print("w=", w)
    l_sum = 0
    for x_val, y_val in zip(x_data, y_data):
        y_pred_val = forward(x_val)
        l = loss(x_val, y_val)
        l_sum += l
        print("\t", x_val, y_val, y_pred_val, l)
    print("MSE=", l_sum / len(x_data))
    mse_list.append(l_sum / len(x_data))

plt.plot(w_list, mse_list)
plt.ylabel('Loss')
plt.xlabel('w')
plt.show()



Linear Regression
▪ Modelling the linear relationship between a scalar response (label) and one or more
explanatory variables (features)

Examples:
Area of house -> house price
# of iPhones sold -> Apple’s sales



Linear Regression
▪ Data: $D = \{(\mathbf{x}^{(1)}, y^{(1)}), (\mathbf{x}^{(2)}, y^{(2)}), \ldots, (\mathbf{x}^{(m)}, y^{(m)})\}$
▪ Model: Input $\mathbf{x}$ → Model → Output $\hat{y} \approx y$ (label)
  ▪ Input: $\mathbf{x}^{(i)} \in \mathbb{R}^d$
  ▪ Output: $\hat{y}^{(i)} = \mathbf{w}^\top \mathbf{x}^{(i)} + b$
  ▪ Parameters: $\mathbf{w} \in \mathbb{R}^d$, $b \in \mathbb{R}$

Training a linear regression model?
Finding the model parameters $\mathbf{w}$ and $b$ which make $\hat{y} \approx y$.

Measuring the distance between $\hat{y}$ and $y$:
Squared error: $(\hat{y} - y)^2$
Loss function: $L(\hat{y}^{(i)}, y^{(i)}) = (y^{(i)} - \hat{y}^{(i)})^2$
Cost function: $J(\mathbf{w}, b) = \frac{1}{m}\sum_{i=1}^{m} L(\hat{y}^{(i)}, y^{(i)}) = \frac{1}{m}\sum_{i=1}^{m} (y^{(i)} - \hat{y}^{(i)})^2$

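To ground the notation, here is a minimal NumPy sketch of the model and cost function; the data values and the zero initialization are hypothetical, for illustration only:

import numpy as np

# Toy data: m = 3 observations with d = 2 features each (hypothetical values).
X = np.array([[1.0, 2.0],
              [2.0, 0.5],
              [3.0, 1.5]])      # inputs, shape (m, d)
y = np.array([3.0, 2.0, 4.0])   # labels, shape (m,)

w = np.zeros(2)  # parameters w in R^d
b = 0.0          # bias b in R

y_hat = X @ w + b              # predictions: y_hat^(i) = w^T x^(i) + b
J = np.mean((y - y_hat) ** 2)  # cost J(w, b): mean of the squared errors
print("J(w, b) =", J)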


Training a linear regression model

▪ Given
  ▪ Training data $D = \{(\mathbf{x}^{(1)}, y^{(1)}), (\mathbf{x}^{(2)}, y^{(2)}), \ldots, (\mathbf{x}^{(m)}, y^{(m)})\}$
▪ Our goal
  ▪ Find $\mathbf{w}, b$ that minimize $J(\mathbf{w}, b) = \frac{1}{m}\sum_{i=1}^{m} \left(y^{(i)} - \hat{y}^{(i)}\right)^2$

Q. How?
Applicable methods: gradient descent, linear least squares, ...

We are going to use gradient descent!



GRADIENT DESCENT ALGORITHM



Learning (training)?: Find w that minimizes the loss
w      | 0           | 1          | 2 | 3          | 4
MSE(w) | 56/3 ≈ 18.7 | 14/3 ≈ 4.7 | 0 | 14/3 ≈ 4.7 | 56/3 ≈ 18.7

$$\mathrm{loss}(w) = \mathrm{MSE} = \frac{1}{N}\sum_{n=1}^{N} \left(\hat{y}_n - y_n\right)^2$$

$$\arg\min_{w}\ \mathrm{loss}(w)$$

[Figure: MSE plotted against w]



Gradient Descent Algorithm

Gradient (slope) $= \dfrac{\partial\,\mathrm{loss}}{\partial w}$

[Figure: loss curve vs. w — a random initial weight is the starting point; the global loss minimum lies at the bottom of the curve]
Gradient Descent Algorithm

Gradient (slope) $= \dfrac{\partial\,\mathrm{loss}}{\partial w}$

$$w_{\mathrm{new}} = w_{\mathrm{prev}} - \alpha \frac{\partial\,\mathrm{loss}}{\partial w}, \qquad \alpha = \text{learning rate (small value)}$$

[Figure: starting from a random initial weight, one update step moves $w_{\mathrm{prev}}$ to $w_{\mathrm{new}}$, toward the global loss minimum]
Gradient Descent Algorithm

Gradient (slope) $= \dfrac{\partial\,\mathrm{loss}}{\partial w}$

$$w_{\mathrm{new}} = w_{\mathrm{prev}} - \alpha \frac{\partial\,\mathrm{loss}}{\partial w}, \qquad \alpha = \text{learning rate (small value)}$$

Compared to the previous jump, w moves slower: the gradient shrinks as $w_{\mathrm{prev}}$ approaches the global loss minimum.
Calculate Derivative
$$\mathrm{loss} = (\hat{y} - y)^2 = (xw - y)^2$$

$$w_{\mathrm{new}} = w_{\mathrm{prev}} - \alpha \frac{\partial\,\mathrm{loss}}{\partial w}$$

$$\frac{\partial\,\mathrm{loss}}{\partial w} = 2x(xw - y)$$

https://www.derivative-calculator.net/

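To make one update step concrete on the toy data (x = 1, y = 2), assuming $w_{\mathrm{prev}} = 3$ and a hypothetical learning rate $\alpha = 0.01$:

$$\frac{\partial\,\mathrm{loss}}{\partial w} = 2 \cdot 1 \cdot (1 \cdot 3 - 2) = 2, \qquad w_{\mathrm{new}} = 3 - 0.01 \cdot 2 = 2.98,$$

a small move from w = 3 toward the optimum w = 2.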


Gradient Descent Algorithm

Gradient (slope) $= \dfrac{\partial\,\mathrm{loss}}{\partial w}$

$$w_{\mathrm{new}} = w_{\mathrm{prev}} - \alpha \frac{\partial\,\mathrm{loss}}{\partial w} = w_{\mathrm{prev}} - \alpha \cdot 2x(xw - y), \qquad \alpha = \text{learning rate (small value)}$$

[Figure: loss curve vs. w — from a random initial starting point, repeated updates descend to the global loss minimum]
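Putting the update rule to work, here is a minimal training-loop sketch in the style of the earlier practice code; the learning rate 0.01 and the epoch count 100 are hypothetical choices, not values from the slides:

x_data = [1.0, 2.0, 3.0]
y_data = [2.0, 4.0, 6.0]

w = 1.0       # random initial guess for the weight
alpha = 0.01  # learning rate (a hypothetical small value)

# model for the forward pass: y_hat = x * w
def forward(x):
    return x * w

# gradient of the per-sample loss: d(loss)/dw = 2x(xw - y)
def gradient(x, y):
    return 2 * x * (x * w - y)

for epoch in range(100):
    for x_val, y_val in zip(x_data, y_data):
        w = w - alpha * gradient(x_val, y_val)  # w_new = w_prev - alpha * gradient
    mse = sum((forward(x) - y) ** 2 for x, y in zip(x_data, y_data)) / len(x_data)
    if epoch % 10 == 0:
        print("epoch", epoch, "w =", round(w, 4), "MSE =", round(mse, 6))

print("prediction for 4 hours:", forward(4.0))  # approaches 8 points as w -> 2

Each epoch updates w once per sample; with these hypothetical settings, w approaches 2, where the loss graph had its minimum.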
