Linear Models - Numeric Prediction
Linear Models
Connectionist and Statistical Language Processing
Frank Keller
[email protected]
- linear regression
- least square estimation
- evaluating a numeric model, correlation
- selecting a regression model
- linear regression for classification
- regression trees, model trees
Literature: Witten and Frank (2000: ch. 4, 6), Howell (2002: ch. 15).
Numeric Prediction
An instance in the data set has the following general form:

\langle a_{1,i}, a_{2,i}, \ldots, a_{k,i}, x_i \rangle

where a_{1,i}, \ldots, a_{k,i} are attribute values, and x_i is the target value, for the i-th instance in the data set.

So far we have only seen classification tasks, where the target value x_i is categorical (represents a class). Techniques such as decision trees and Naive Bayes are not (directly) applicable if the target is numeric. Instead, algorithms for numeric prediction can be used, e.g., linear models.

Example

Predict CPU performance from configuration data:

cycle time (ns)   memory min (kB)   memory max (kB)   cache (kB)   chan min   chan max   performance
      125               256               6000            256          16        128           198
       29              8000              32000             32           8         32           269
       29              8000              32000             32           8         32           220
       29              8000              32000             32           8         32           172
       29              8000              16000             32           8         16           132
      ...
      125              2000               8000              0           2         14            52
      480               512               4000             32           0          0            67
      480              1000               4000              0           0          0            45
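To make the representation concrete: a minimal Python sketch (ours, not part of the original slides) storing instances as pairs of an attribute vector and a numeric target, using two rows of the CPU data set above:

# Each instance: (attribute values a_1, ..., a_k, numeric target x).
instances = [
    ([125, 256, 6000, 256, 16, 128], 198),
    ([29, 8000, 32000, 32, 8, 32], 269),
]

for attributes, target in instances:
    print(attributes, "->", target)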
Linear Regression
Linear regression is a technique for numeric prediction that is widely used in psychology, medical research, etc.

Key idea: find a linear equation that predicts the target value x from the attribute values a_1, \ldots, a_k:

(1)  x = w_0 + w_1 a_1 + w_2 a_2 + \ldots + w_k a_k

Here, w_1, \ldots, w_k are the regression coefficients, and w_0 is called the intercept. These are the model parameters that need to be induced from the data set.

Linear Regression

The regression equation computes the following predicted value \hat{x}_i for the i-th instance in the data set:

(2)  \hat{x}_i = w_0 + \sum_{j=1}^{k} w_j a_{j,i}

Key idea: to determine the coefficients w_0, \ldots, w_k, minimize e, the squared difference between the predicted and the actual values, summed over all n instances in the data set:

(3)  e = \sum_{i=1}^{n} (x_i - \hat{x}_i)^2 = \sum_{i=1}^{n} \Big( x_i - w_0 - \sum_{j=1}^{k} w_j a_{j,i} \Big)^2
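As an illustration of equation (2), a minimal Python sketch; the function name and the coefficients are invented for this example:

def predict(w0, weights, attributes):
    # Equation (2): x_hat = w0 + sum_j w_j * a_j
    return w0 + sum(w * a for w, a in zip(weights, attributes))

# Two attributes with made-up coefficients:
print(predict(1.0, [0.5, 2.0], [4.0, 3.0]))  # 1.0 + 0.5*4.0 + 2.0*3.0 = 9.0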
Least Square Estimation

For simplicity, assume a regression equation with a single attribute and no intercept: \hat{x} = w a. Then the error is:

(4)  e = \sum_i (x_i - w a_i)^2

To minimize the error, take the derivative of e with respect to w:

(5)  \frac{\partial e}{\partial w} = \sum_i (-2 a_i x_i + 2 w a_i^2) = -2 \sum_i a_i x_i + 2 w \sum_i a_i^2

The derivative is the slope of the error function. The slope is zero at all points at which the function has a minimum:

(6)  -2 \sum_i a_i x_i + 2 w \sum_i a_i^2 = 0

By solving this equation for w, we obtain a formula for computing the value of w that minimizes the error:

(7)  w = \frac{\sum_i a_i x_i}{\sum_i a_i^2}

This formula can be generalized to regression equations with more than one coefficient.
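Equation (7) translates directly into code. A minimal Python sketch (function name ours), applied to the sample data set of the example that follows:

def lse_single_coefficient(a, x):
    # Equation (7): w = sum_i a_i x_i / sum_i a_i^2
    return sum(ai * xi for ai, xi in zip(a, x)) / sum(ai ** 2 for ai in a)

a = [1, 2, 1, 5]
x = [2, 5, 2, 8]
print(round(lse_single_coefficient(a, x), 2))  # 54/31 = 1.74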
Example
Sample data set:

a   x
1   2
2   5
1   2
5   8

[Figure: scatter plot of the sample data with the fitted regression line; x-axis: attribute value a, y-axis: target value x.]

Applying equation (7) to this data set yields the regression coefficient:

(8)  w = \frac{\sum_i a_i x_i}{\sum_i a_i^2} = \frac{54}{31} = 1.74
Example
Compute the mean squared error for the sample data set and the regression equation x = 1.74a:
(9)  \mathrm{MSE} = \frac{1}{n} \sum_{i=1}^{n} (x_i - \hat{x}_i)^2

a   x   x̂      (x - x̂)^2
1   2   1.74    0.068
2   5   3.48    2.310
1   2   1.74    0.068
5   8   8.70    0.490

MSE = 0.734

Intuitively, the MSE represents how much the predicted values diverge from the actual values on average. Note that the MSE is the quantity that the LSE algorithm minimizes.
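The table can be checked with a short Python sketch (variable names ours):

a = [1, 2, 1, 5]
x = [2, 5, 2, 8]
x_hat = [1.74 * ai for ai in a]  # regression equation x = 1.74a

# Equation (9): MSE = (1/n) sum_i (x_i - x_hat_i)^2
mse = sum((xi - xh) ** 2 for xi, xh in zip(x, x_hat)) / len(x)
print(round(mse, 3))  # 0.734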
Correlation Coefficient
The correlation coefficient r measures the degree of linear association between the predicted and the actual values:

(10)  r = \frac{S_{PA}}{S_P S_A}

(11)  S_{PA} = \frac{\sum_i (\hat{x}_i - \bar{\hat{x}})(x_i - \bar{x})}{n - 1}

(12)  S_P = \sqrt{\frac{\sum_i (\hat{x}_i - \bar{\hat{x}})^2}{n - 1}}, \qquad S_A = \sqrt{\frac{\sum_i (x_i - \bar{x})^2}{n - 1}}

Here \bar{x} and \bar{\hat{x}} are the means of the actual and the predicted values, S_A and S_P the corresponding standard deviations, and S_{PA} the covariance of the actual and predicted values.

Example

Compute the correlation coefficient for the example data set:
\bar{x} = (2 + 5 + 2 + 8)/4 = 4.25
\bar{\hat{x}} = (1.74 + 3.48 + 1.74 + 8.70)/4 = 3.92

S_{PA} = ((1.74 - 3.92)(2 - 4.25) + (3.48 - 3.92)(5 - 4.25)
        + (1.74 - 3.92)(2 - 4.25) + (8.70 - 3.92)(8 - 4.25))/3 = 9.14

S_P^2 = ((1.74 - 3.92)^2 + (3.48 - 3.92)^2
       + (1.74 - 3.92)^2 + (8.70 - 3.92)^2)/3 = 10.85

S_A^2 = ((2 - 4.25)^2 + (5 - 4.25)^2 + (2 - 4.25)^2 + (8 - 4.25)^2)/3 = 8.25

r = 9.14 / (\sqrt{10.85} \cdot \sqrt{8.25}) = 0.97

r^2 = 0.93
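A short Python sketch that reproduces this computation, with n - 1 = 3 in the denominators as in the formulas above:

x = [2, 5, 2, 8]                  # actual values
x_hat = [1.74, 3.48, 1.74, 8.70]  # predicted values

n = len(x)
mean_a = sum(x) / n        # 4.25
mean_p = sum(x_hat) / n    # 3.915

s_pa = sum((p - mean_p) * (xi - mean_a) for p, xi in zip(x_hat, x)) / (n - 1)
s_p2 = sum((p - mean_p) ** 2 for p in x_hat) / (n - 1)
s_a2 = sum((xi - mean_a) ** 2 for xi in x) / (n - 1)

r = s_pa / (s_p2 * s_a2) ** 0.5
print(round(r, 2), round(r ** 2, 2))  # 0.97 0.93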
Correlation Coefficient
Some important properties:

- The correlation coefficient r ranges from 1.0 (perfect correlation) through 0 (no correlation) to -1.0 (perfect negative correlation).
- Intuitively, r expresses how well the data points fit on the straight line described by the regression model.
- We can test if r is significant. Null hypothesis: there is no linear relationship between the actual and the predicted values.

Partial Correlations

We can compute the multiple correlation coefficient, which tells us how well the full regression model (with all attributes) fits the target values.

We can also compute the correlation between the values of a single attribute and the target values. However, this is not very useful, as attributes can be intercorrelated, i.e., they correlate with each other (collinearity).

We therefore need to compute the partial correlation coefficient, which tells us how much variance is uniquely accounted for by an attribute once the other attributes are partialled out.

Selecting a Regression Model

- Backward elimination: compute a model consisting of all attributes, then eliminate the attribute with the lowest partial r. Iterate until the multiple r deteriorates.
- Forward selection: compute a model consisting only of the attribute with the highest partial r. Then add the next best attribute. Stop when the multiple r doesn't improve.

Different model selection algorithms can yield different models; a simplified sketch of forward selection is given below.
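A simplified Python sketch of forward selection. It is illustrative only: instead of computing partial correlations it greedily adds the attribute that most improves the multiple r of a least-squares fit, which captures the spirit of the procedure but not its exact statistics:

import numpy as np

def multiple_r(A, x):
    # Least-squares fit with intercept; the multiple r is the correlation
    # between fitted and actual target values.
    X = np.column_stack([np.ones(len(x)), A])
    w, *_ = np.linalg.lstsq(X, x, rcond=None)
    return np.corrcoef(X @ w, x)[0, 1]

def forward_selection(A, x, tol=1e-3):
    # A: n-by-k attribute matrix, x: target vector.
    selected, best_r = [], 0.0
    remaining = list(range(A.shape[1]))
    while remaining:
        r, j = max((multiple_r(A[:, selected + [j]], x), j) for j in remaining)
        if r - best_r < tol:  # stop when the multiple r doesn't improve
            break
        selected.append(j)
        remaining.remove(j)
        best_r = r
    return selected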
Leave-one-out cross-validation: set k, the number of folds, to the number of instances in the data set, i.e., test on each instance separately.
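A minimal Python sketch of leave-one-out evaluation for the single-coefficient model x = wa, reusing lse_single_coefficient from the earlier sketch:

def leave_one_out_mse(a, x):
    errors = []
    for i in range(len(a)):
        # Train on all instances except the i-th; test on the held-out one.
        a_train, x_train = a[:i] + a[i + 1:], x[:i] + x[i + 1:]
        w = lse_single_coefficient(a_train, x_train)
        errors.append((x[i] - w * a[i]) ** 2)
    return sum(errors) / len(errors)

print(round(leave_one_out_mse([1, 2, 1, 5], [2, 5, 2, 8]), 3))  # 4.154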
Linear Regression for Classification

Linear regression can also be used for classification: perform one regression for each class, with the target set to 1 for training instances that belong to the class and 0 otherwise (the class's membership function). To classify a new instance, compute the value of each membership function, and assign the new instance the class with the highest value.

This procedure is called multiresponse linear regression.
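A minimal Python sketch of multiresponse linear regression, assuming a 0/1 encoding of the membership functions and an ordinary least-squares fit per class (names ours):

import numpy as np

def train_multiresponse(A, labels, classes):
    X = np.column_stack([np.ones(len(labels)), A])
    models = {}
    for c in classes:
        # Membership function: target is 1 for instances of class c, else 0.
        target = np.array([1.0 if label == c else 0.0 for label in labels])
        models[c], *_ = np.linalg.lstsq(X, target, rcond=None)
    return models

def classify(models, attributes):
    x = np.concatenate(([1.0], attributes))
    # Assign the class whose membership function yields the highest value.
    return max(models, key=lambda c: float(models[c] @ x))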
Linear Separability

Linear regression approximates a linear function. This means that the classes have to be linearly separable.

[Figure: two scatter plots over attributes a1 and a2, illustrating linearly separable and non-separable classes.]
Regression Trees
Regression trees are decision trees for numeric attributes. The leaves are not labeled with classes, but with the mean of the target values of the instances classified by a given branch.

To construct a regression tree, choose splitting attributes to minimize the intra-subset variation for each branch, i.e., maximize the standard deviation reduction (instead of the information gain):

(13)  \mathrm{SDR} = \sigma(T) - \sum_i \frac{|T_i|}{|T|} \, \sigma(T_i)

where T is the set of instances classified at a given node, T_1, T_2, \ldots are the subsets that T is split into, and \sigma is the standard deviation.
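Equation (13) as a short Python sketch. The split shown is a hypothetical partition of the performance values from the CPU example, not one chosen by an actual tree inducer:

import numpy as np

def sdr(targets, subsets):
    # Equation (13): SDR = sigma(T) - sum_i |T_i| / |T| * sigma(T_i)
    t = np.asarray(targets, dtype=float)
    return t.std() - sum(
        len(s) / len(t) * np.asarray(s, dtype=float).std() for s in subsets
    )

performance = [198, 269, 220, 172, 132, 52, 67, 45]
left, right = performance[:5], performance[5:]
print(round(sdr(performance, [left, right]), 2))  # approx. 46.45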
Model Trees
Model trees are regression trees that have linear regression models at their leaves, not just numeric values.
Induction algorithm:

- Induce a regression tree using standard deviation reduction as the splitting criterion.
- At each leaf, build a linear regression model for the instances classified by this leaf.

Model trees get around the problem of linear separability by combining several regression models.
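A minimal Python sketch of the idea, restricted to a single split with one linear model per leaf; a real model tree chooses splits by SDR, recurses, and may smooth the leaf models:

import numpy as np

def fit_leaf(a, x):
    # Linear regression model (with intercept) for the instances at a leaf.
    X = np.column_stack([np.ones(len(a)), a])
    w, *_ = np.linalg.lstsq(X, x, rcond=None)
    return w

def depth_one_model_tree(a, x, split):
    # Assumes both sides of the split are non-empty.
    a, x = np.asarray(a, dtype=float), np.asarray(x, dtype=float)
    w_left = fit_leaf(a[a <= split], x[a <= split])
    w_right = fit_leaf(a[a > split], x[a > split])
    def predict(ai):
        w = w_left if ai <= split else w_right
        return w[0] + w[1] * ai
    return predict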
Summary
Linear regression models are used for numeric prediction. They fit a linear equation that combines attribute values to predict a numeric target attribute.
Least square estimation can be used to determine the coefficients in the regression equation so that the difference between predicted and actual values is minimal.

A numeric model can be evaluated using the mean squared error or the correlation coefficient.

Regression models can be used for classification, either directly in multiresponse regression or in combination with decision trees: regression trees, model trees.

References

Howell, David C. 2002. Statistical Methods for Psychology. Pacific Grove, CA: Duxbury, 5th edn.

Witten, Ian H., and Eibe Frank. 2000. Data Mining: Practical Machine Learning Tools and Techniques with Java Implementations. San Francisco: Morgan Kaufmann.