0% found this document useful (0 votes)

3 views68 pages

Simple Linear Regression

Uploaded by

ZeroVector

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PPTX, PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

3 views68 pages

Simple Linear Regression

Uploaded by

ZeroVector

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PPTX, PDF, TXT or read online on Scribd

You are on page 1/ 68

Simple Linear

Regression

1
Part I
Background Review

2
Regression Analysis
Statistical model is a mathematical description of the
data structure/data generating mechanism
Parametric model
 Easier to fit, interpret, infer
 More powerful (statistically)
 Model complexity is fixed

Nonparametric model
 No distributional assumption
 More flexible
 Model complexity may grow

Semiparametric model

3
Regression Analysis
Example: exam scores
Parametric: approximate the class distribution by a normal
distribution with certain parameters (mean and variance)
(hence we can say mean +/- one standard deviation ~ 68%)
Nonparametric: use the histogram

So far, we are dealing with only one variable

What if we have more variables available?
Want to exploit other information for a better picture

4
Regression Analysis
Regression studies the relationship between
Response/outcome/dependent variables; and
Predictor/explanatory/independent variables

In this course, we only deal with parametric approach

Goal: estimate the parameters
After building the model: interpret, infer, and predict

As we will see, the regression framework covers many of

the techniques you learnt in CB2200
5
Types of Variables
Nominal Binary
Qualitative/ No orderings in categories Only two categories
• Martial Status • Yes/No
Categorical • Eye Color • Male/Female
Categories are naturally ordered
Ordinal • Likert/rating scale
• Letter grades

Variables

• Number of children
Discrete • Defects per hour

Quantitative/
Numerical
• Weight
Continuous • Voltage
6
Let’s look at the simplest
case
To study the relationship between two numerical
variables, such as
Exam score vs. Time spent on doing revision
Apartment price vs. Gross floor area
Electricity consumption vs. Air temperature

We can use Pearson correlation coefficient, also known

as linear correlation coefficient

7
Linear Correlation
Analysis
Scatter plot

8
Linear Correlation
Analysis Cont’d
(Sample) Linear correlation coefficient,

 Dimensionless

 “Sign” indicates the direction (positive / negative) of a

linear relationship
 “Magnitude” measures the strength of a linear
relationship
 are the sample means
 , are the sample variances

9
Linear Correlation Analysis Cont’d
t-test for correlation coefficient
Important!! Note the
(no linear correlation) slight abuse of notations
(linear correlation exists) • Upright t denotes the
value of statistic
t-statistic t • denotes the
distribution itself
p-value |t| • denotes its upper tail
denotes a distribution with degree of freedom
quantile (d.f.)
Reject if |t| > or p-value <

10
Example
Is residential apartment price related to its gross floor
area and age of the building?

 Source: HKEA Transaction Records, http://www.hkea.com.hk/private/TransServ

11
 Data: Transactions of residential apartments in Tseung Kwan O during 1 – 8 April 2014
Example Cont’d
The data file consists of records, with records contain
missing values, we are going to make use of the
following three variables
Price = Price in million HK$
GrossFA = Gross floor area in ft2
Age = Age of building in years

12
Example • R will not process codes after #, use for comments
 #set working directory
 setwd(“C:/Users/chiwchu/Google Drive/Academic/CityU/MS3252/Lecture")

 #load the data • library(…) is to load packages

 library(readxl) • attach(…) so that variables in the
 Example = read_excel("Example.xlsx") database can be used directly;
 o/w, use Example$Price, etc
attach(Example)

 #scatter plots of Price vs GrossFA and Price vs Age

 plot(GrossFA,Price)
 plot(Age,Price)

• cbind(…) is to combine by columns

 #compute correlations and t-tests
• rbind(…) is to combine by rows
 cor(cbind(Price,GrossFA,Age))
 cor.test(Price,GrossFA)
13
Example Cont’d

14
Example Cont’d

Correlation coefficient Correlation coefficient

between Price itself between Price and GrossFA

Calculated value of p-value of t-test for

t-test statistics correlation coefficient

15
Conditional Distribution
Probability/density -> Distribution
Conditional probability/density -> Conditional
distribution
e.g. Let denote the random variable of whether it will
rain tomorrow (1=yes, 0=no)
If the probability of raining tomorrow is 0.4, then has a
Bernoulli(0.4) distribution,
But what if we know whether a typhoon is coming?
Let denote the random variable of whether a typhoon is
coming (1=yes, 0=no)
 can be random itself, but we can think of it as fixed
16
Conditional Distribution
Given the information of , the probability of raining
tomorrow and hence the distribution of may change!
Say, conditional probability , then the conditional
distribution of is
Similarly, the conditional distribution of could be

The conditional distribution of , particularly the

conditional mean, varies across different values of
Regression is about the study of conditional distribution!

17
Part II
Formulation and Estimation

18
Overview of Regression Analysis
Input
Response / outcome / dependent variable,
 The variable we wish to explain or predict
Predictor / covariate / explanatory / independent variable,
 The variable used to explain the response variable
Output
A (linear) function that allows us to
 Model association: Explain the variation of the response caused
by the predictor(s)
 Provide prediction: Estimate the value of the response based on

value(s) of the predictor(s)

19
Simple Linear Regression -
Formulation
A simple linear regression model consists of two
components
Regression line: A straight line that describes the
dependence of the average value (conditional mean) of the
-variable on one -variable
Random error: The unexpected deviation of the observed
value from the expected value
Population Slope Coefficient
Population Intercept
Predictor

Response Random Error

Regression Line 20
Simple Linear Regression -
Formulation
(Linear) regression model
Assumptions
 Linearity of regression equation
 is a linear function
 Error normality (can be dropped if sample size is large, why?)
 has a normal distribution for all
 Zero mean of errors (not really an assumption with the intercept)

 Constant variances of errors

 Error independence
 are independent for all
21
Simple Linear Regression -
Formulation
Equivalently, the linear regression model can be written as
 (mean function)
 (variance function)
 are independent and normally distributed
In other words, are independent
 denotes a normal distribution with mean and variance
We also call it mean regression model

22
Simple Linear Regression -
Formulation
Framework: we have one response and predictors
 here because we only have one
We obtain a random sample of size , containing the values
of and for each individual/subject/observation ,
Our goal is to model/infer about the conditional mean of
given
As the conditional mean is characterized by and , that
means we need to estimate and from the data

23
Simple Linear Regression -
Estimation
Goal: estimate and
Let’s denote these estimates by and
our notations for parameters: Greek alphabets represent
the population/true versions; English alphabets represent
the sample/estimated analogies.
Two methods (turn out to be equivalent for linear
regression):
Least Squares Estimator (LSE)/Ordinary Least Squares (OLS)
Maximum Likelihood Estimator (MLE)

24
Simple Linear Regression -
Estimation
𝑌
𝒀𝒊
𝒆𝒊
^
𝒀 𝒊

^ =𝒃 +𝒃 𝑿
𝒀 𝒊 𝟎 𝟏 𝒊

We are assuming
𝑋 normality of Y for
𝑋𝑖
every level of X
 represents the sample intercept
 represents the sample slope coefficient
 represents the sample residual error 25
Simple Linear Regression -
Estimation
 and are estimated using the least squares method,
which minimize the sum of squares errors (SSE)

26
Simple Linear Regression -
Estimation
The solution to and can be obtained by differentiating
with respect to and
That is to solve

and

simultaneously

27
Simple Linear Regression -
Estimation
The solutions are

and

Also, the estimate for the error variance is given by

28
Simple Linear Regression -
Estimation
Maximum Likelihood Estimation is to find the parameters
that maximize the likelihood/probability of observing the
sample
Recall that
The density function of is
Assume is known and equals 1 for simplicity…
The joint likelihood/probability of observing these given
these will be
Maximizing this likelihood function is equivalent to
minimizing , which is exactly the SSE, so MLE = LSE!
29
Example Cont’d

• lm(…) means linear model

• summary(…) reports summary
of variables/ model results

𝑏0
𝑏1

𝑆 𝑒= √ 𝑀𝑆𝐸

or
30
Example – The Model &
Interpretation of Coefficients Cont’d
The estimated simple linear regression equation

where = Price in million HK$

= Gross floor area in ft2
The estimated slope coefficient,
Measures the estimated change in the average value of as
a result of a one-unit increase in
 says that the price of an apartment increases by , on
average, for each square foot increase in gross floor area

31
Example – The Model &
Interpretation of Coefficients Cont’d
The estimated simple linear regression equation

where = Price in million HK$

= Gross floor area in ft2
The estimated intercept coefficient,
Denotes the estimated average value of when is zero
 says that the price of an apartment is , on average, when
the gross floor area is zero (any problem?)
Interpret with caution with the -value is out of range!!

32
Example Cont’d
Regress Price against Age

33
Example Cont’d
The relationship between apartment price and age of
the building is

where = Price in million HK$

= Age of building in years
If the building gets 1 year older, the average apartment
price decreases by

34
Confidence Interval (CI)
Confidence interval estimate for slope coefficient

 R program

% CI for

 Upon repeated sampling, those CI will cover the true

parameter with approximately 95% chance
 We are 95% confident that the population parameter is
contained in between the CI
35
Special Case I: One Sample
In other words, no are considered (imagine )
The linear regression model assumes that
 are independent
This is equivalent to fitting a normal distribution to the
entire sample!!
Intuitively, what are the best estimates for and ?


36
Special Case II: Two Groups
Now, the are either 0 or 1 indicating which group the
observation belongs to
The linear regression model assumes that
 are independent
 are independent
This is equivalent to fitting two normal distributions to the
two groups respectively!!

37
Part III
Goodness of Fit,
Parameter Inference,
and Model Significance

38
Goodness of Fit and Model
Significance
We want to compare the fitted model with against the
null model without
Fitted/Full model = the model you considered
Null model = special case I = a horizontal line at
(Saturated model = data = the model with perfect fit)

Saturated model <----- Fitted model -----> Null model

One simplest way to evaluate the goodness of fit is to

look at the variance/variation breakdown (although not
always a good idea) 39
Analysis of Variance
(ANOVA)
Total variation of the -variable is made up of two parts

Sum of Squares Total,

 Total variation of around their mean,
Sum of Squares Regression,
 Variation explained by the regression equation
Sum of Squares Errors,
 Variation not explained by the regression equation
Question: what are the values of and for the null and
saturated models?
40
Analysis of Variance
(ANOVA) Cont’d
SST, the total variation of -variable
we wish to explain

Price

SSE = SST - SSR

GrossFA

SSR, the variation of -variable that

being explained by the regression
equation with the p
41
Analysis of Variance
(ANOVA) Cont’d
Coefficient of determination,

Measures the proportion of variation of explained by the

regression equation with the predictor
Measures the “goodness of fit” of the regression model

Remark!! in simple linear regression, i.e. when there is

one -variable

42
Example
Which independent variable, GrossFA or Age, provides a
better explanation to the variation of apartment price?

SSR

SSE

=MSE
For SST, use either of
• sum(anova(m1)[,2])
• var(Price)*(length(Price)-1)

43
Inferences about the Parameters –
A -Variable Significance
t-test for a slope coefficient
(no linear relationship)
(linear relationship exists)
t-statistic t
where = standard error of the slope

p-value |t|
has a distribution with d.f.
Reject if |t| > or p-value <

44
Inferences about the Parameters –
A -Variable Significance Cont’d
 measures the variation in the slope of regression lines from
different possible samples (one color denotes one sample)
𝒀 𝒀

Small 𝑿 Large 𝑿


 variation of the errors around the regression line
45
Inferences about the Parameters –
A -Variable Significance Cont’d
Recall , we can show that !!
Test for is equivalent to the t-test for linear correlation
coefficient

46
Example
Is GrossFA significantly affecting the apartment price?

𝑏1 𝑆𝑏 1
t p-value

d.f. =

47
Example Cont’d
Is GrossFA significantly affecting the apartment price?
In R, use
• qt(.975,78) to obtain C.V.
• 2*(1-pt(10.81,78)) to obtain p-value
t
In exam,
At = • use t-table to obtain C.V.
• p-value is not computable by hand,
d.f. = = but a range can be found at best
C.V. =

Reject , GrossFA significantly affects

apartment price.

p-value < < , reject 48

Example Cont’d
Is Age having a significant negative relationship with
price?
t

At =
d.f. = ) =
C.V. =

Reject , Age has a significant negative

impacts on apartment price.

p-value = < , reject

Note: For one-tail test, 49

Inferences about the Parameters –
Overall Model Significance
F-test for the overall model
(the model is insignificant)
(the model is significant)
F-statistic F
where = Mean Squares Regression
= Mean Squares Errors
Again a slight abuse
= no. of predictors (excluding intercept) of notations
= no. of observations
p-value F
denotes an distribution with d.f. and
Reject if F > or p-value <
50
Example
Is the model significant?

F
d.f. =

p-value

SSR MSR
SSE MSE

51
Example Cont’d
Is the model significant?
In R, use
• qf(.95,78) to obtain C.V.
• 1-pf(116.90,78) to obtain p-value
F
In exam,
• use F-table to obtain C.V.
At = • p-value is not computable by hand
d.f. = =
C.V. =

Reject , the model is significant.

p-value < < , reject 52

Inferences about the Parameters –
Overall Model Significance
• The F statistic is a monotone transform of
• Essentially it is testing whether the is significantly bigger
than 0

• For simple linear regression (only one predictor)

• , i.e. F-test is equivalent to two-tail t-test!!

53
Part IV
Prediction and Diagnostics

54
Prediction of New Observations –
Point Prediction
Convert the given -value into the same measurement
scale as the observed -value
As the estimated slope coefficient is scale dependent
Ideally, only use the regression equation to predict the -
value when the given -value is inside the observed data
range
As we are not sure whether the linear relationship will go
beyond the range of observed -value

55
Example Cont’d
What is the estimated price for an apartment with gross floor
area ft2?
Prediction given by the simple linear regression equation

where = Price in million HK$

= Gross floor area in ft2
The expected price for an apartment with gross floor area ft 2
is
What is the estimated mean price for apartments with gross
floor area ft2? – same estimate, but any differences?
56
Prediction of New
Observations Cont’d
The prediction given by regression models raised from
different possible samples will vary

𝒀
^
𝒀 𝒊 Which prediction
^
𝒀 𝒊 we should trust?
^
𝒀 𝒊

𝑿𝒊 𝑿
57
Prediction of New Observations –
Interval Prediction Cont’d
Confidence interval estimate for the mean of -variable
given a -value

where , is the given -value

 R program
predict(m1,level=.95,interval="confidence")
 Note that , where is the sample variance of

58
Prediction of New Observations –
Interval Prediction Cont’d
Prediction interval estimate for an individual -value given
a -value

where
 R program
predict(m1,level=.95,interval=“prediction")
 It is still a type of confidence interval, although we are using
the term prediction interval to differentiate them

59
Prediction of New Observations –
Interval Prediction Cont’d
𝒀 Prediction Interval for an
individual -value

Confidence Interval for the

mean of -variable

𝑿𝒊 𝑿
60
Example Cont’d
Determine a % confidence interval for the mean
apartment price for flats of ft2 gross area
Also, construct a % prediction interval for the apartment
price for a flat of ft2 gross area

confidence interval for mean

point estimate for mean

61
Regression Assumptions
Linearity of regression equation
 is a linear function
Error normality
 has a normal distribution for all
Constant variances of errors

Error independence
 are independent for all

62
Residual Analysis
Check the regression assumptions by examining the
residuals
Residuals (or errors),
Plot
Residuals against the predictor for checking linearity and
constant variances
Residuals against index for checking error independence
Histogram of the residuals for examining error normality

63
Residual Analysis
e Cont’d

0
e
𝑿
Residuals has a systematic
0 pattern, and -variables are not
having a liner relationship, but a

𝑿 e curved one

 Residuals fall within a

horizontal band centered 0
around 0, displaying a
random pattern
𝑿
Error variance increases with -
64
value
Residual Analysis Cont’d

e e

0 0

Index(Time) Index(Time)

 Residuals displaying a
random pattern
Negative residuals are
associated mainly with the
early trials, and positive
residuals with the later trials,
time of the data being
collected affects the residuals
and -values
65
Residual Analysis Cont’d

35 35
30 30
25 25
20 20
%

%
15 15
10 10
5 5
0 0
-0.75 -0.5 -0.25 0 0.25 0.5 0.75 -0.75 -0.5 -0.25 0 0.25 0.5 0.75
e e

 Residuals follow a
symmetrical and bell shape
Residuals are being right-
skewed
distribution

66
Summary
Description Response Predictor Correlation Error
Population Version
Sample Analogy
Variance of Estimator
(take square root to get
standard error)

Intercept Slope Expected value of New observation of

67
Summary
 is the breakdown of variance / variations
 = a single number in that quantifies the model explained
variation / measures the goodness of fit
t-statistic t tests the significance of a single predictor, i.e.
whether
F-statistic F tests the significance of the entire model, i.e.
whether all , is the number of predictors
In this chapter with a single , and
Point prediction and confidence interval prediction

Astm E739
No ratings yet
Astm E739
7 pages
Applied Linear Regression Models 4th Ed Note
No ratings yet
Applied Linear Regression Models 4th Ed Note
46 pages
Econometrics: A Simple Introduction
From Everand
Econometrics: A Simple Introduction
K.H. Erickson
3.5/5 (5)
Sample Size for Analytical Surveys, Using a Pretest-Posttest-Comparison-Group Design
From Everand
Sample Size for Analytical Surveys, Using a Pretest-Posttest-Comparison-Group Design
Joseph George Caldwell
No ratings yet
Applied Life Data Analysis
83% (6)
Applied Life Data Analysis
656 pages
CHAPTER 9 Estimation and Confidence Intervals
100% (1)
CHAPTER 9 Estimation and Confidence Intervals
45 pages
Simple+Linear+Regression
No ratings yet
Simple+Linear+Regression
71 pages
Lecture 6 Simple Linear Regression
No ratings yet
Lecture 6 Simple Linear Regression
36 pages
Simple Linear Regression
No ratings yet
Simple Linear Regression
95 pages
QMM Epgdm 5
No ratings yet
QMM Epgdm 5
58 pages
Regression Analysis
No ratings yet
Regression Analysis
22 pages
Chapter2 1
No ratings yet
Chapter2 1
55 pages
Linear Regression
No ratings yet
Linear Regression
216 pages
Simple Linear Regression
No ratings yet
Simple Linear Regression
27 pages
10 - Regression 1
No ratings yet
10 - Regression 1
58 pages
Course 10-Part 1
No ratings yet
Course 10-Part 1
32 pages
Simple Linear Regression
100% (1)
Simple Linear Regression
50 pages
Fundamentals of Business Statistics: 6E John Loucks
No ratings yet
Fundamentals of Business Statistics: 6E John Loucks
40 pages
03 Revisions L Regression
No ratings yet
03 Revisions L Regression
25 pages
Multiple+Regression+A
No ratings yet
Multiple+Regression+A
32 pages
Chap13_Simple Linear Regression (1)
No ratings yet
Chap13_Simple Linear Regression (1)
60 pages
The Bucharest University of Economic Studies Bucharest Business School Romanian - French INDE MBA Program
No ratings yet
The Bucharest University of Economic Studies Bucharest Business School Romanian - French INDE MBA Program
67 pages
Simple Linear Regression sample
No ratings yet
Simple Linear Regression sample
55 pages
03 - Simple Linear Regression
No ratings yet
03 - Simple Linear Regression
13 pages
BES - Lecture 10 - Simple Linear Regression
No ratings yet
BES - Lecture 10 - Simple Linear Regression
15 pages
Data Science 03 - Regression PDF
No ratings yet
Data Science 03 - Regression PDF
32 pages
F_Regression
No ratings yet
F_Regression
65 pages
Simple Regression
No ratings yet
Simple Regression
35 pages
Lecture-3---Linear-Regression-imran-20022025-092939am
No ratings yet
Lecture-3---Linear-Regression-imran-20022025-092939am
46 pages
Topic Simple Linear Regression
No ratings yet
Topic Simple Linear Regression
38 pages
15.Simple Linear Regression-530
No ratings yet
15.Simple Linear Regression-530
54 pages
Estimation of Causal Relationships I: Illustration 1
No ratings yet
Estimation of Causal Relationships I: Illustration 1
8 pages
Slides Prepared by John S. Loucks St. Edward's University
No ratings yet
Slides Prepared by John S. Loucks St. Edward's University
48 pages
Regression Equations
No ratings yet
Regression Equations
94 pages
Chapter14
No ratings yet
Chapter14
65 pages
Lecture 10
No ratings yet
Lecture 10
38 pages
Chap 11
No ratings yet
Chap 11
64 pages
Lecture 6 - Regression Analysis
No ratings yet
Lecture 6 - Regression Analysis
34 pages
PE Civil: Transportation e-book Practice Exam
No ratings yet
PE Civil: Transportation e-book Practice Exam
41 pages
Lecture6 Regression
No ratings yet
Lecture6 Regression
42 pages
Simple Linear Regression Analysis - ReliaWiki
No ratings yet
Simple Linear Regression Analysis - ReliaWiki
29 pages
Chapter 9 Simple Linear Regression and Correlation (1) (1)
No ratings yet
Chapter 9 Simple Linear Regression and Correlation (1) (1)
56 pages
06 Least Squar Regression
No ratings yet
06 Least Squar Regression
25 pages
BUSINESS STATISTICS: Simple Linear Regression and Correlation
No ratings yet
BUSINESS STATISTICS: Simple Linear Regression and Correlation
55 pages
EECM3724 Unit 9 ch14 Slides 2023
No ratings yet
EECM3724 Unit 9 ch14 Slides 2023
57 pages
Chapter 2 - Quantitative Analysis
No ratings yet
Chapter 2 - Quantitative Analysis
244 pages
Correlation Simple Regression
No ratings yet
Correlation Simple Regression
26 pages
Simple Lin Regress Inference
No ratings yet
Simple Lin Regress Inference
51 pages
Chap 12
No ratings yet
Chap 12
62 pages
MAP 716 Lecture 4 Simple Linear Regression
No ratings yet
MAP 716 Lecture 4 Simple Linear Regression
23 pages
Chapter No 11 (Simple Linear Regression)
No ratings yet
Chapter No 11 (Simple Linear Regression)
3 pages
Statistics For Business STAT130: Unit 8: Correlation and Regression Analysis
No ratings yet
Statistics For Business STAT130: Unit 8: Correlation and Regression Analysis
56 pages
CH 12
No ratings yet
CH 12
57 pages
Week 2
No ratings yet
Week 2
33 pages
ESBE7ch12a (1) (1)
No ratings yet
ESBE7ch12a (1) (1)
48 pages
Corelation and Regression
No ratings yet
Corelation and Regression
137 pages
Session 3 - Linear Regression
No ratings yet
Session 3 - Linear Regression
96 pages
Lecture10 - SIMPLE LINEAR REGRESSION
No ratings yet
Lecture10 - SIMPLE LINEAR REGRESSION
13 pages
Chapter 2 Simple Linear Regression - Jan2023
No ratings yet
Chapter 2 Simple Linear Regression - Jan2023
66 pages
Simple Linear Regression
No ratings yet
Simple Linear Regression
64 pages
Complete Business Statistics: Simple Linear Regression and Correlation
No ratings yet
Complete Business Statistics: Simple Linear Regression and Correlation
50 pages
Econometrics Unit 3 Tedy Best
No ratings yet
Econometrics Unit 3 Tedy Best
147 pages
Notes2
No ratings yet
Notes2
16 pages
Methods for Non-Compartmental Pharmacokinetic Analysis With Observations Below the Limit of Quantification
No ratings yet
Methods for Non-Compartmental Pharmacokinetic Analysis With Observations Below the Limit of Quantification
13 pages
Unit-3 PROPORTIONS
No ratings yet
Unit-3 PROPORTIONS
18 pages
Homework Problems Stat 479: March 28, 2012
No ratings yet
Homework Problems Stat 479: March 28, 2012
44 pages
Statistical Analysis of Spatial and Spatio Temporal Point Patterns Third Edition Peter J. Diggle all chapter instant download
100% (11)
Statistical Analysis of Spatial and Spatio Temporal Point Patterns Third Edition Peter J. Diggle all chapter instant download
75 pages
Econ-2042- Unit 6-W12-13
No ratings yet
Econ-2042- Unit 6-W12-13
77 pages
zhang1994
No ratings yet
zhang1994
22 pages
Explaining Cointegration Analysis - Part II
No ratings yet
Explaining Cointegration Analysis - Part II
32 pages
Econometrics Notes - University of Utah (370 Pages)
No ratings yet
Econometrics Notes - University of Utah (370 Pages)
370 pages
thống kê
No ratings yet
thống kê
4 pages
Ecf480 FPD 3 2015 2
No ratings yet
Ecf480 FPD 3 2015 2
15 pages
Mock Paper SI
No ratings yet
Mock Paper SI
5 pages
Caro 2013
No ratings yet
Caro 2013
9 pages
Economics Data Analysis Excel
No ratings yet
Economics Data Analysis Excel
6 pages
Sampling and Estimation
No ratings yet
Sampling and Estimation
12 pages
Course Outline Mat361 Summer 2023
No ratings yet
Course Outline Mat361 Summer 2023
3 pages
Lecture Notes20120425152410
No ratings yet
Lecture Notes20120425152410
118 pages
Estimation of Multinomial Logit Models in R: The Mlogit Packages
No ratings yet
Estimation of Multinomial Logit Models in R: The Mlogit Packages
73 pages
Subjective Questions
No ratings yet
Subjective Questions
8 pages
Estimation of Parameter
No ratings yet
Estimation of Parameter
19 pages
4_5861748881427009429
No ratings yet
4_5861748881427009429
29 pages
Fisher On Design
No ratings yet
Fisher On Design
15 pages
Statistics Exercise Solution
100% (1)
Statistics Exercise Solution
19 pages
Mme PDF
No ratings yet
Mme PDF
9 pages
Passing-Bablok Regression For Method Comparison
No ratings yet
Passing-Bablok Regression For Method Comparison
9 pages
L1Norm Genetic
No ratings yet
L1Norm Genetic
10 pages
12
No ratings yet
12
16 pages
pop pk
No ratings yet
pop pk
17 pages

Simple Linear Regression

Uploaded by

Simple Linear Regression

Uploaded by

Simple Linear

So far, we are dealing with only one variable

In this course, we only deal with parametric approach

As we will see, the regression framework covers many of

We can use Pearson correlation coefficient, also known

 “Sign” indicates the direction (positive / negative) of a

 Source: HKEA Transaction Records, http://www.hkea.com.hk/private/TransServ

 #load the data • library(…) is to load packages

 #scatter plots of Price vs GrossFA and Price vs Age

• cbind(…) is to combine by columns

Correlation coefficient Correlation coefficient

Calculated value of p-value of t-test for

The conditional distribution of , particularly the

value(s) of the predictor(s)

Response Random Error

 Constant variances of errors

Also, the estimate for the error variance is given by

• lm(…) means linear model

where = Price in million HK$

where = Price in million HK$

where = Price in million HK$

 Upon repeated sampling, those CI will cover the true

Saturated model <----- Fitted model -----> Null model

One simplest way to evaluate the goodness of fit is to

Sum of Squares Total,

SSE = SST - SSR

SSR, the variation of -variable that

Measures the proportion of variation of explained by the

Remark!! in simple linear regression, i.e. when there is

Reject , GrossFA significantly affects

p-value < < , reject 48

Reject , Age has a significant negative

p-value = < , reject

Note: For one-tail test, 49

Reject , the model is significant.

p-value < < , reject 52

• For simple linear regression (only one predictor)

where = Price in million HK$

where , is the given -value

Confidence Interval for the

confidence interval for mean

point estimate for mean

 Residuals fall within a

Intercept Slope Expected value of New observation of

You might also like