lec03_MultLinRegression
Contents
Outline
Example: Blood Glucose Level
Data from AIM 94 Experiment
Demo on GitHub
• Demo: demo2_glucose.ipynb
Loading the Data
• sklearn package:
– Many methods for machine learning
– Datasets
– Will be used throughout this class
• The diabetes dataset is one example
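A minimal sketch of loading it directly from sklearn (the demo notebook may do this differently):

```python
from sklearn.datasets import load_diabetes

data = load_diabetes()
X = data.data              # feature matrix, shape (442, 10)
y = data.target            # target vector, shape (442,)
print(data.feature_names)  # ['age', 'sex', 'bmi', 'bp', 's1', ..., 's6']
```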
Finding a Mathematical Model
• Attributes: $x_1$ = Age, $x_2$ = Sex, $x_3$ = BMI, $x_4$ = BP, $x_5$ = S1, …, $x_{10}$ = S6
• Target: $y$ = glucose level
• Model: $y \approx \hat{y} = f(x_1, \ldots, x_{10})$
Matrix Representation of Data
• Data is a feature matrix and a target vector
• $n$ samples: one sample per row
• $k$ features / attributes / predictors: one feature per column
$$X = \begin{bmatrix} x_{11} & \cdots & x_{1k} \\ \vdots & \ddots & \vdots \\ x_{n1} & \cdots & x_{nk} \end{bmatrix}, \qquad y = \begin{bmatrix} y_1 \\ \vdots \\ y_n \end{bmatrix}$$
Outline
Multivariable Linear Model for Glucose
glucose ≈ prediction = $\beta_0 + \beta_1 \cdot \text{Age} + \cdots + \beta_4 \cdot \text{BP} + \beta_5 \cdot \text{S1} + \cdots + \beta_{10} \cdot \text{S6}$
• General form (intercept $\beta_0$, 10 features, prediction $\hat{y}$ of the target):
$$y \approx \hat{y} = \beta_0 + \beta_1 x_1 + \cdots + \beta_4 x_4 + \beta_5 x_5 + \cdots + \beta_{10} x_{10}$$
Multiple Variable Linear Model
Example: Heart Rate Increase
https://www.mercurynews.com/2017/10/29/4851089/
Why Use a Linear Model?
• Consider
$$A = \begin{bmatrix} 1 & 2 \\ 3 & 4 \\ 5 & 6 \end{bmatrix}, \qquad B = \begin{bmatrix} 2 & 0 \\ 3 & 2 \end{bmatrix}, \qquad x = \begin{bmatrix} 2 \\ 3 \end{bmatrix}$$
• Compute (computations on the board; a numpy version follows below):
– Matrix-vector multiply: $Ax$
– Transpose: $A^T$
– Matrix multiply: $AB$
– Solution to linear equations: solve $x = Bu$ for $u$
– Matrix inverse: $B^{-1}$
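These operations map directly onto numpy; a minimal sketch using the matrices above:

```python
import numpy as np

A = np.array([[1, 2], [3, 4], [5, 6]])
B = np.array([[2, 0], [3, 2]])
x = np.array([2, 3])

print(A @ x)               # matrix-vector multiply Ax -> [ 8 18 28]
print(A.T)                 # transpose A^T, shape (2, 3)
print(A @ B)               # matrix multiply AB, shape (3, 2)
u = np.linalg.solve(B, x)  # solve x = Bu for u -> [1. 0.]
print(np.linalg.inv(B))    # matrix inverse B^{-1}
```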
Slopes, Intercept and Inner Products
• Model with coefficients $\beta$: $\hat{y} = \beta_0 + \beta_1 x_1 + \cdots + \beta_k x_k$
• Sometimes written in the weight-bias form:
$$\hat{y} = b + w_1 x_1 + \cdots + w_k x_k$$
• Inner product:
– $w \cdot x = \sum_{j=1}^{k} w_j x_j$
– Alternate notations: $w^T x = \langle w, x \rangle$
Matrix Form of Linear Regression
• $A$ is an $n \times p$ feature matrix, one row per sample (with an intercept column of ones, $p = k + 1$)
• Matrix equation: $\hat{y} = A\beta$
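A sketch of this equation in numpy, using the diabetes data from earlier; beta here is a placeholder, not a fitted value:

```python
import numpy as np
from sklearn.datasets import load_diabetes

X, y = load_diabetes(return_X_y=True)  # n x k features and target
n, k = X.shape
A = np.column_stack([np.ones(n), X])   # n x (k+1): intercept column plus features
beta = np.zeros(k + 1)                 # placeholder coefficients; beta[0] is the intercept
yhat = A @ beta                        # the matrix equation yhat = A beta
```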
In-Class Exercise
Outline
Least Squares Model Fitting
• Residual sum of squares:
$$\mathrm{RSS}(\beta) = \sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2$$
– Note that $\hat{y}_i$ is implicitly a function of $\beta = (\beta_0, \ldots, \beta_k)$
– Also called the sum of squared residuals (SSR) and sum of squared errors (SSE)
• Least squares solution: Find $\beta$ to minimize RSS.
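Numerically this is a one-liner; a minimal sketch, assuming the A, y, and beta arrays from the earlier feature-matrix sketch:

```python
def rss(beta, A, y):
    """Residual sum of squares RSS(beta) for the linear model yhat = A @ beta."""
    residuals = y - A @ beta
    return residuals @ residuals  # same as np.sum(residuals ** 2)
```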
Variants of RSS
Finding Parameters via Optimization
A general ML recipe
RSS as a Vector Norm
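Stacking the residuals into a vector, the RSS from the previous slide is a squared Euclidean norm, which is what makes the matrix solution possible:
$$\mathrm{RSS}(\beta) = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2 = \| y - A\beta \|_2^2$$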
Least Squares Solution
• Least squares solution: The vector $\hat{\beta}$ that minimizes the RSS is
$$\hat{\beta} = (A^T A)^{-1} A^T y$$
Proving the LS Formula
• Least squares formula: The vector $\hat{\beta}$ that minimizes the RSS is
$$\hat{\beta} = (A^T A)^{-1} A^T y$$
– Proof strategy: solve $\nabla \mathrm{RSS}(\beta) = 0$
Gradients of Multi-Variable Functions
Proof of the LS Formula
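A sketch of the standard calculation (assuming $A^T A$ is invertible):
$$\mathrm{RSS}(\beta) = \|y - A\beta\|_2^2 = y^T y - 2\beta^T A^T y + \beta^T A^T A\,\beta$$
$$\nabla_{\beta}\,\mathrm{RSS}(\beta) = -2A^T y + 2A^T A\,\beta = 0 \;\Longrightarrow\; A^T A\,\hat{\beta} = A^T y \;\Longrightarrow\; \hat{\beta} = (A^T A)^{-1} A^T y$$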
In-Class Exercise
Outline
Arrays and Vectors in Python
• Python:
– Arrays can have 1, 2, 3, … dimensions
– Vectors can be 1D arrays; matrices are generally 2D arrays
– Vectors that are 1D arrays are neither row nor column vectors
– If x is 1D and A is 2D, both left multiplication x.dot(A) and right multiplication A.dot(x) are defined: numpy treats x as a row or a column vector as needed
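A minimal sketch of this behavior:

```python
import numpy as np

A = np.array([[2, 0], [3, 2]])  # 2D array: a 2 x 2 matrix
x = np.array([2, 3])            # 1D array: neither row nor column vector

print(x.ndim, A.ndim)           # 1 2
print(x.dot(A))                 # x treated as a row vector:    [13  6]
print(A.dot(x))                 # x treated as a column vector: [ 4 12]
```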
Fitting Using sklearn
Manually Computing the Solution
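One way to compute $\hat{\beta} = (A^T A)^{-1} A^T y$ directly in numpy (a sketch; the demo notebook may differ, and lstsq is the numerically safer route):

```python
import numpy as np
from sklearn.datasets import load_diabetes

X, y = load_diabetes(return_X_y=True)
A = np.column_stack([np.ones(X.shape[0]), X])  # add intercept column

# Normal equations: beta = (A^T A)^{-1} A^T y
beta = np.linalg.solve(A.T @ A, A.T @ y)       # solve, rather than an explicit inverse

# Equivalent and more numerically stable:
beta_lstsq, _, _, _ = np.linalg.lstsq(A, y, rcond=None)
```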
Calling the sklearn Linear Regression method
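A minimal sklearn version of the same fit (LinearRegression adds the intercept itself):

```python
from sklearn.datasets import load_diabetes
from sklearn.linear_model import LinearRegression

X, y = load_diabetes(return_X_y=True)

reg = LinearRegression()  # fits the intercept by default
reg.fit(X, y)

print(reg.intercept_)     # beta_0
print(reg.coef_)          # beta_1, ..., beta_10
```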
Outline
Simple vs. Multiple Regression
Comparison to Single Variable Models
Special Case: Single Variable
• Suppose $k = 1$ predictor.
• Feature matrix and coefficient vector:
$$A = \begin{bmatrix} 1 & x_1 \\ \vdots & \vdots \\ 1 & x_n \end{bmatrix}, \qquad \beta = \begin{bmatrix} \beta_0 \\ \beta_1 \end{bmatrix}$$
• LS solution:
$$\beta = \left( \tfrac{1}{N} A^T A \right)^{-1} \tfrac{1}{N} A^T y = P^{-1} r, \qquad P = \begin{bmatrix} 1 & \bar{x} \\ \bar{x} & \overline{x^2} \end{bmatrix}, \quad r = \begin{bmatrix} \bar{y} \\ \overline{xy} \end{bmatrix}$$
• Obtain the single-variable solutions for the coefficients (after some algebra):
$$\beta_1 = \frac{s_{xy}}{s_x^2}, \qquad \beta_0 = \bar{y} - \beta_1 \bar{x}, \qquad R^2 = \frac{s_{xy}^2}{s_x^2\, s_y^2}$$
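These closed forms can be checked numerically; a sketch assuming a single feature such as BMI (column 2 of the sklearn diabetes data):

```python
import numpy as np
from sklearn.datasets import load_diabetes

X, y = load_diabetes(return_X_y=True)
x = X[:, 2]                                     # single predictor: BMI column

sxy = np.mean((x - x.mean()) * (y - y.mean()))  # s_xy
sx2 = np.var(x)                                 # s_x^2
beta1 = sxy / sx2
beta0 = y.mean() - beta1 * x.mean()

# Should agree with the matrix least squares solution:
A = np.column_stack([np.ones(len(x)), x])
print(np.linalg.lstsq(A, y, rcond=None)[0])     # [beta0, beta1]
print(beta0, beta1)
```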
Simple Linear Regression for Diabetes Data
Scatter Plot
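A matplotlib sketch of the scatter plot with the fitted line, reusing the single-variable BMI fit above (hypothetical styling; the demo notebook's plot may differ):

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import load_diabetes

X, y = load_diabetes(return_X_y=True)
x = X[:, 2]                                   # BMI column
beta1 = np.mean((x - x.mean()) * (y - y.mean())) / np.var(x)
beta0 = y.mean() - beta1 * x.mean()

plt.scatter(x, y, s=10, label="data")
xs = np.linspace(x.min(), x.max(), 100)
plt.plot(xs, beta0 + beta1 * xs, "r-", label="fitted line")
plt.xlabel("BMI (normalized)")
plt.ylabel("target")
plt.legend()
plt.show()
```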
Outline
Next Lecture
• Model Selection