Chapter 2 - Simple Linear Regression Function
Solomon Estifanos
MARCH, 2022
DEBRE BERHAN, ETHIOPIA
Outline
Simple Linear Regression Models
Regression analysis refers to estimating functions showing the relationship between two or more variables
and the corresponding tests.
Regression analysis is concerned with the study of the relationship between one variable (known as the
dependent variable) and one or more other variables (known as the independent variable(s)).
A central point in regression analysis is estimating regression functions accompanied by some
preceding and some succeeding steps in the econometrics methodology.
Simple Linear . . . Con’t . . .
𝒀𝒊 = 𝒇(𝑿𝟏, 𝑿𝟐, 𝑿𝟑, . . .)

Y is called the:            The X's are called the:
Dependent Variable          Independent Variables
Explained Variable          Explanatory Variables
Endogenous Variable         Exogenous Variables
Regressand                  Regressors
• In addition to showing the causal relationship among the variables, regression
analysis has the following objectives and uses:
1) To estimate the mean or average value of the dependent variable, given the value of the independent
variable(s);
2) To test hypotheses about the sign and magnitude of the relationship between the dependent variable
and one or more independent variable(s); and
3) To predict or forecast future value(s) of the dependent variable, which is in turn used in policy
formulation.
Simple Linear . . . Con’t . . .
✓ Having said this much about regression functions in general, let us now see some more points.
✓ Consider the meaning of the words 'simple' and 'linear' in the topic. In simple linear
regression analysis, the term 'simple' refers to the fact that we use only two
variables (one dependent and one independent variable).
✓ If the number of independent or explanatory variables is greater than one, we don't use the
word 'simple'; the model is then a multiple linear regression.
Simple Linear . . . Con’t . . .
a) The dependent variable is quantity demanded and the independent variable is price.
b) Economic theory does not specify whether demand is studied using one equation or a more
elaborate system of simultaneous equations. Let us use a single-equation system.
c) Economic theory is not clear about the mathematical form (linear or non-linear) of the demand
function. Let us choose a linear function. Therefore, the demand function is given as
𝒀𝒊 = 𝜶 + 𝜷𝑿𝒊
Simple Linear . . . Con’t . . .
d) The above form of the demand function implies that the relationship between Y and X is exact: the
whole variation in Y is due to changes in X only, and there are no other factors or variables affecting the
dependent variable except X. If this were true, all quantity-price pairs would fall on a straight line when
plotted in the X-Y plane. However, if we gather information from the market and plot it in the X-Y plane,
not all the price-quantity pairs fall on the straight line: many of the points lie on it, but some lie above
and some lie below the straight line.
[Scatter diagram: observed price-quantity pairs scattered on, above, and below the fitted straight line]
Simple Linear . . . Con’t . . .
The deviation of points from the straight line may be attributed to several reasons. These are:
1) Omitted Variables from the Function: some relevant variables are left out of the function because
   a) some of the factors are random, appearing in an unpredictable way and time, and
   b) some of the factors have a very small influence on the dependent variable.
2) Errors of Aggregation: We often use aggregate data in which we add magnitudes referring to individual behaviour
which are dissimilar. In this case, the variables expressing individual peculiarities are missing.
3) Errors of Measurement: errors in the measurement of the variables, which are inevitable due to the methods of
collecting and processing statistical information.
In order to take into account the above sources of error, we include in econometric functions a random
variable, usually denoted by the letter 'U' and called the error term, the disturbance term, or the
random term.
𝒀𝒊 = 𝜶 + 𝜷𝑿𝒊 + 𝑼𝒊
The true relationship which connects the variables involved is split into two parts: a systematic (deterministic) part, 𝜶 + 𝜷𝑿𝒊, and a random part, 𝑼𝒊.
Such models are called stochastic or probabilistic models and are familiar in econometrics. The above
model shows that the relationship between the two variables is inexact, and the total variation of the
dependent variable is split into two additive components, explained variation and residual variation, which
can be shown as:
Total Sum of Squares (TSS) = Explained Sum of Squares (ESS) + Residual Sum of Squares (RSS)
The variation in the dependent variable is not one hundred percent explained by the variation in the
explanatory variable. Thus, the variation in the dependent variable is expressed as the sum of
explained variation and random variation, given as follows:
Σ(Yᵢ − Ȳ)² = Σ(Ŷᵢ − Ȳ)² + Σeᵢ²
Ordinary Least Square Method (OLS) and Classical Linear Regression
Model (CLRM) Assumptions
CLRM Assumptions
Assumption 1
The error terms 'Uᵢ' are randomly distributed; Uᵢ is a random real variable. The value which
Uᵢ may assume in any period depends on chance: some values may be positive, some
negative, and some zero.
Assumption 2
• The disturbance terms 'Uᵢ' have zero mean. The values of the disturbance terms deviate
in both directions: some are negative, some are zero, and some are positive, and their sum
or average is zero. That is, E(Uᵢ) = ΣUᵢ/n = 0; multiplying both sides by the sample size
'n' we obtain ΣUᵢ = 0.
CLRM Assumptions Con’t . . .
Assumption 3
Independence of the Uᵢ's: the disturbance terms are not correlated. This means that there is no
systematic variation or relation among the values of the error terms (Uᵢ and Uⱼ), where i =
1, 2, 3, ..., j = 1, 2, 3, ... and i ≠ j.
This is represented by zero covariance among the error terms: Cov(Uᵢ, Uⱼ) = 0 for i ≠ j.
Otherwise, an autocorrelation problem arises.
Assumption 4
The disturbance terms have constant variance in each period (homoscedasticity). This is given as follows: Var(Uᵢ) = E(Uᵢ²) = σᵤ² for all i.
Assumption 6
• Linearity of the model in parameters. Simple linear regression requires linearity in
parameters, but not necessarily linearity in variables. The same technique can be applied
to estimate regression functions of forms such as Yᵢ = α + βXᵢ² + Uᵢ or ln Yᵢ = α + β ln Xᵢ + Uᵢ,
which are non-linear in the variables but linear in the parameters.
CLRM Assumptions Con’t . . .
Assumption 7
The explanatory variable Xᵢ is fixed in repeated samples. Each value of Xᵢ does not vary,
for instance owing to a change in sample size. This means the explanatory variables are non-
random and hence distribution-free.
Assumption 8
• The disturbance term Ui is assumed to have a normal distribution with zero mean and a
constant variance.
RSS = Σeᵢ² = Σ(Yᵢ − α̂ − β̂Xᵢ)², where RSS = Residual Sum of Squares.
Estimation Using OLS Method Con’t . . .
The method of OLS involves finding the estimates of the intercept and the slope for which the
RSS is minimized. To minimize the RSS we take the first order partial derivatives and equate
them to zero.
Partial derivative with respect to α̂:

∂Σeᵢ²/∂α̂ = 2Σ(Yᵢ − α̂ − β̂Xᵢ)(−1) = 0
Σ(Yᵢ − α̂ − β̂Xᵢ) = 0
ΣYᵢ − nα̂ − β̂ΣXᵢ = 0
ΣYᵢ = nα̂ + β̂ΣXᵢ ------------------------------------------------------- (1)

Partial derivative with respect to β̂:

∂Σeᵢ²/∂β̂ = 2Σ(Yᵢ − α̂ − β̂Xᵢ)(−Xᵢ) = 0
Σ(YᵢXᵢ − α̂Xᵢ − β̂Xᵢ²) = 0
ΣYᵢXᵢ = α̂ΣXᵢ + β̂ΣXᵢ² ----------------------------------------------- (2)
Estimation Using OLS Method Con’t . . .
Let us represent equations (1) and (2) in matrix form as:

[ n      ΣXᵢ  ] [α̂]   [ ΣYᵢ   ]
[ ΣXᵢ    ΣXᵢ² ] [β̂] = [ ΣYᵢXᵢ ]

Applying Cramer's rule to solve for β̂ we get:

β̂ = (nΣYᵢXᵢ − ΣYᵢΣXᵢ)/(nΣXᵢ² − (ΣXᵢ)²)  or  β̂ = (ΣYᵢXᵢ − nȲX̄)/(ΣXᵢ² − nX̄²)  or  β̂ = Σxᵢyᵢ/Σxᵢ²

where xᵢ = Xᵢ − X̄ and yᵢ = Yᵢ − Ȳ are deviations from the means, and α̂ = Ȳ − β̂X̄.
Example 2.1: The following data refer to the quantity supplied (Yᵢ) and the price (Xᵢ) of a commodity:

Yi 69 76 52 56 57 77 58 55 67 53 72 64
Xi 9 12 6 10 9 10 7 8 12 6 11 8

Fit the simple linear regression equation Y = f(X) and interpret your result.
Solution
ΣYᵢ = 756    Σyᵢ² = 894    Σyᵢxᵢ = 156
ΣXᵢ = 108    Σxᵢ² = 48     Ȳ = 63    X̄ = 9

β̂ = Σxᵢyᵢ/Σxᵢ² = 156/48 = 3.25    α̂ = Ȳ − β̂X̄ = 63 − (3.25)(9) = 33.75

The fitted regression line is Ŷᵢ = 33.75 + 3.25Xᵢ: a one-unit increase in price raises the quantity supplied by 3.25 units on average.
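The arithmetic in the solution above can be verified with a short script; a minimal sketch in plain Python, using the deviation formulas β̂ = Σxᵢyᵢ/Σxᵢ² and α̂ = Ȳ − β̂X̄:

```python
# Example 2.1 worked in plain Python: OLS slope and intercept from the
# deviation formulas beta = sum(x*y)/sum(x^2) and alpha = Ybar - beta*Xbar.
Y = [69, 76, 52, 56, 57, 77, 58, 55, 67, 53, 72, 64]
X = [9, 12, 6, 10, 9, 10, 7, 8, 12, 6, 11, 8]

n = len(Y)
x_bar = sum(X) / n          # 9
y_bar = sum(Y) / n          # 63

# Deviations from the means
x = [xi - x_bar for xi in X]
y = [yi - y_bar for yi in Y]

sum_xy = sum(a * b for a, b in zip(x, y))   # 156
sum_x2 = sum(a * a for a in x)              # 48

beta_hat = sum_xy / sum_x2                  # 3.25
alpha_hat = y_bar - beta_hat * x_bar        # 33.75

print(f"Y_hat = {alpha_hat:.2f} + {beta_hat:.2f} X")
```

Running it reproduces the fitted line Ŷ = 33.75 + 3.25X found above.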
The coefficient of determination is the measure of the amount or proportion of the total
variation of the dependent variable that is determined or explained by the model or the
presence of the explanatory variable in the model.
The total variation of the dependent variable is split in two additive components: a part
explained by the model and a part represented by the random term.
Total Variation in Yᵢ: Σ(Yᵢ − Ȳ)²
Total Explained Variation: Σ(Ŷᵢ − Ȳ)²
Total Unexplained Variation: Σeᵢ²
The total variation of the dependent variable is: TSS = ESS + RSS, which means the total sum
of squares of the dependent variable is split into the explained sum of squares and the residual
sum of squares.
Coefficient of Determination (𝑹𝟐 ) Con’t . . .
The coefficient of determination is given by the formula:

R² = ESS/TSS = Σ(Ŷᵢ − Ȳ)²/Σ(Yᵢ − Ȳ)²  or  R² = 1 − RSS/TSS = 1 − Σeᵢ²/Σyᵢ²  or  R² = β̂Σyᵢxᵢ/Σyᵢ²
The higher the coefficient of determination, the better the fit; conversely, the smaller the
coefficient of determination, the poorer the fit. That is why the coefficient of determination is
used to compare two or more models. One minus the coefficient of determination is called the
coefficient of non-determination, and it gives the proportion of the variation in the dependent
variable that remained undetermined or unexplained by the model.
Example 2.2: Refer to Example 2.1. Determine what percentage of the variation in the quantity
supplied is explained by the price of the commodity and what percentage remains unexplained.
R² = β̂Σyᵢxᵢ/Σyᵢ² = (3.25 × 156)/894 = 0.5671 = 56.71%
This result shows that 56.71% of the variation in the quantity supplied of the commodity under
consideration is explained by the variation in the price of the commodity, and the remaining 43.29%
is left unexplained by the price of the commodity. In other words, there may be other important
explanatory variables left out that could contribute to the variation in the quantity supplied of the
commodity under consideration.
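The R² computation in Example 2.2 can be checked in a couple of lines, using the sums already obtained in Example 2.1:

```python
# Example 2.2 check: R^2 = beta_hat * sum(x*y) / sum(y^2), with the
# Example 2.1 values beta_hat = 3.25, sum(xy) = 156, sum(y^2) = 894.
beta_hat = 3.25
sum_xy = 156
sum_y2 = 894

r_squared = beta_hat * sum_xy / sum_y2          # 507/894
non_determination = 1 - r_squared               # coefficient of non-determination

print(f"R^2 = {r_squared:.4f}")                 # 0.5671
print(f"Unexplained share = {non_determination:.4f}")
```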
Properties of OLS Estimates and Gauss-Markov Theorem
We would like an estimate to be as close as possible to the value of the true population parameter, i.e. to vary
within only a small range around the true parameter. How are we to choose among the different econometric
methods the one that gives 'good' estimates? We need some criteria for judging the 'goodness' of an
estimate.
‘Closeness’ of the estimate to the population parameter is measured by the mean and variance or standard
deviation of the sampling distribution of the estimates of the different econometric methods.
The ideal or optimum properties that the OLS estimates possess may be summarized by the well-known
theorem known as the Gauss-Markov Theorem.
Statement of the theorem: "Given the assumptions of the classical linear regression model, the OLS
estimators, in the class of linear and unbiased estimators, have the minimum variance, i.e. the OLS
estimators are BLUE." Sometimes the theorem is referred to as the BLUE theorem, i.e. Best, Linear,
Unbiased Estimator.
Properties of OLS Estimates . . . Con’t . . .
a) Linear: it is a linear function of a random variable, such as the dependent variable Y.
b) Unbiased: its average or expected value is equal to the true population parameter.
c) Minimum Variance: It has a minimum variance in the class of linear and unbiased
estimators. An unbiased estimator with the least variance is known as an efficient
estimator.
Properties of OLS Estimates . . . Con’t . . .
Linearity: (for β̂)
Proposition: β̂ is linear in Y.

β̂ = Σxᵢyᵢ/Σxᵢ² = Σxᵢ(Yᵢ − Ȳ)/Σxᵢ² = (ΣxᵢYᵢ − ȲΣxᵢ)/Σxᵢ² = ΣxᵢYᵢ/Σxᵢ²    (since Σxᵢ = 0)

Now, let kᵢ = xᵢ/Σxᵢ² (i = 1, 2, ..., n). Then β̂ = ΣkᵢYᵢ, a linear function of the random variable Y.
Therefore, β̂ is linear in Y.

Unbiasedness: (for β̂)
Proposition: β̂ is an unbiased estimator of the true parameter β. If θ̂ is an estimator of θ, then
E(θ̂) − θ = amount of bias, and θ̂ is the unbiased estimator of θ if bias = 0, i.e. E(θ̂) = θ.

We know that β̂ = ΣkᵢYᵢ = Σkᵢ(α + βXᵢ + Uᵢ) = αΣkᵢ + βΣkᵢXᵢ + ΣkᵢUᵢ

where Σkᵢ = Σxᵢ/Σxᵢ² = 0 and

ΣkᵢXᵢ = ΣxᵢXᵢ/Σxᵢ² = Σ(Xᵢ − X̄)Xᵢ/Σxᵢ² = (ΣXᵢ² − nX̄²)/(ΣXᵢ² − nX̄²) = 1

Hence β̂ = β + ΣkᵢUᵢ, so β̂ − β = ΣkᵢUᵢ and E(β̂) = β + ΣkᵢE(Uᵢ) = β, since E(Uᵢ) = 0.
Therefore, β̂ is an unbiased estimator of β.
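The unbiasedness result can be illustrated with a small simulation: drawing many samples from Yᵢ = α + βXᵢ + Uᵢ with E(Uᵢ) = 0 and averaging the OLS slopes should recover β. A minimal sketch, where the true values α = 2, β = 0.5 and the fixed X values are illustrative assumptions:

```python
import random

random.seed(0)
alpha_true, beta_true = 2.0, 0.5       # illustrative true parameters
X = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]    # fixed in repeated samples (Assumption 7)
x_bar = sum(X) / len(X)
x = [xi - x_bar for xi in X]
sum_x2 = sum(a * a for a in x)

slopes = []
for _ in range(20000):
    # Disturbances drawn with zero mean (Assumptions 1 and 2)
    Y = [alpha_true + beta_true * xi + random.gauss(0, 1) for xi in X]
    y_bar = sum(Y) / len(Y)
    beta_hat = sum(a * (b - y_bar) for a, b in zip(x, Y)) / sum_x2
    slopes.append(beta_hat)

mean_slope = sum(slopes) / len(slopes)
print(f"Average of beta_hat over 20000 samples: {mean_slope:.3f}")
```

The average of the estimated slopes comes out very close to the true β = 0.5, as the proof predicts.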
Properties of OLS Estimates . . . Con’t . . .
Minimum Variance
We have to establish that, out of the class of linear and unbiased estimators, the OLS estimators α̂ and β̂ have the
smallest sampling variances. For this, we shall first obtain the variances of α̂ and β̂ and then establish that each has the
minimum variance in comparison with the variances of other linear and unbiased estimators obtained by any
econometric method other than OLS.

var(β̂) = E(β̂ − β)² = E(ΣkᵢUᵢ)² = E(Σkᵢ²Uᵢ²) + E(Σᵢ≠ⱼ kᵢkⱼUᵢUⱼ) = σᵤ²Σkᵢ²    (using Assumptions 3 and 4)

Therefore, var(β̂) = σᵤ²/Σxᵢ², since Σkᵢ² = Σxᵢ²/(Σxᵢ²)² = 1/Σxᵢ².

To check whether β̂ possesses the minimum variance property, we compare its variance with
that of some other linear and unbiased estimator of β, say β*.
We want to prove that any other linear and unbiased estimator of the true population
parameter obtained from any other econometric method has a larger variance than the OLS
estimator.
Properties of OLS Estimates . . . Con’t . . .
Suppose β* = ΣwᵢYᵢ is an alternative linear and unbiased estimator of β, with wᵢ ≠ kᵢ. To prove whether β̂ has
minimum variance or not, let us compute var(β*) to compare with var(β̂).
Write wᵢ = kᵢ + cᵢ, where kᵢ = xᵢ/Σxᵢ² and cᵢ is an arbitrary constant. Unbiasedness of β* requires
Σwᵢ = 0 and ΣwᵢXᵢ = 1, from which we can derive Σcᵢ = 0 and Σcᵢxᵢ = 0, where xᵢ = Xᵢ − X̄.

var(β*) = var(ΣwᵢYᵢ) = Σwᵢ² var(Yᵢ) = σᵤ²Σwᵢ² = σᵤ²Σ(kᵢ + cᵢ)² = σᵤ²(Σkᵢ² + Σcᵢ²)    since Σkᵢcᵢ = Σcᵢxᵢ/Σxᵢ² = 0

Given that cᵢ is an arbitrary constant, Σcᵢ² ≥ 0. Thus var(β*) ≥ var(β̂). This proves that β̂ possesses the
minimum variance property. In a similar way we can prove that the least squares estimate of the intercept
(α̂) possesses minimum variance.
Variance of the Error/Random Term (𝑼𝒊 )
✓ You may observe that the variances of the OLS estimates involve σᵤ², the
population variance of the random disturbance term. But it is difficult to obtain the
population data of the disturbance term for technical and economic reasons. Hence
σᵤ² is difficult to compute, which implies that the variances of the OLS estimates are also
difficult to compute. We can, however, compute these variances if we use the unbiased
estimate of σᵤ², namely σ̂ᵤ², computed from the sample residuals eᵢ from the expression:

σ̂ᵤ² = Σeᵢ²/(n − 2)

To use σ̂ᵤ² in the expressions for the variances of α̂ and β̂, we have to prove that
σ̂ᵤ² = Σeᵢ²/(n − 2) is an unbiased estimator of σᵤ².
There are different tests that are available to test the statistical reliability of the parameter
estimates. The following are the common ones;
A) The standard error test
B) The Student's t-test
Confidence Interval and … Con’t . . .
A) The Standard Error Test
This test first establishes the two hypotheses to be tested, commonly known as the
null and alternative hypotheses. The null hypothesis states that the sample comes
from a population whose parameter is not significantly different from zero, while the
alternative hypothesis states that the sample comes from a population whose parameter
is significantly different from zero. The two hypotheses are given as follows:
H0: βi=0
H1: βi≠0
The standard error test is outlined as follows:
Confidence Interval and … Con’t . . .
1. Compute the standard deviations of the parameter estimates using the above formulas
for the variances of the parameter estimates. This is because the standard deviation is the
positive square root of the variance:

se(β̂₁) = √(σ̂ᵤ²/Σxᵢ²)
se(β̂₀) = √(σ̂ᵤ²ΣXᵢ²/(nΣxᵢ²))
2. Compare the standard errors of the estimates with the numerical values of the estimates and
make a decision.
a) If se(β̂ᵢ) < ½|β̂ᵢ|, reject the null hypothesis and conclude that the estimate is statistically significant.
b) If se(β̂ᵢ) > ½|β̂ᵢ|, do not reject the null hypothesis and conclude that the estimate is not statistically significant.
Confidence Interval and … Con’t . . .
Example 2.4: The regression below shows the estimated regression of supply on price, where the
numbers in parentheses are standard errors. Test the statistical significance of the estimates using
the standard error test.

Ŷᵢ = 33.75 + 3.25Xᵢ
     (8.3)    (0.9)
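A worked check of Example 2.4, applying the decision rule se(β̂) < ½|β̂| from the steps above to both estimates:

```python
# Standard error test for Example 2.4: an estimate is judged statistically
# significant when its standard error is less than half the absolute value
# of the estimate, i.e. se(b) < |b|/2.
estimates = {"intercept": (33.75, 8.3), "slope": (3.25, 0.9)}

results = {}
for name, (b, se) in estimates.items():
    results[name] = se < abs(b) / 2
    print(f"{name}: se = {se}, |b|/2 = {abs(b) / 2:.3f}, "
          f"significant = {results[name]}")
```

Both comparisons (8.3 < 16.875 and 0.9 < 1.625) lead to rejecting the null hypothesis, so both estimates are statistically significant.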
4. Determine the calculated value of t. The test statistic (using the t-test) is given by:

t_cal = β̂ᵢ/se(β̂ᵢ)

The test rule or decision is given as follows: reject H₀ if |t_cal| > t_{α/2}(n − k); otherwise do not reject H₀.
Confidence Interval and … Con’t . . .
Example 2.5: Refer to Example 2.1. Is the price of the commodity significant in determining the quantity
supplied of the commodity under consideration? Use α = 0.05.

H₀: β₁ = 0
H₁: β₁ ≠ 0

As we found in Example 2.1, β̂₁ = 3.25 and se(β̂₁) = 0.8979. Hence

t_cal = β̂₁/se(β̂₁) = 3.25/0.8979 = 3.62

The tabulated value for 10 degrees of freedom at α/2 = 0.025 is 2.228. Since the calculated t is greater
than the tabulated value, we reject the null hypothesis and conclude that the price of the commodity
is statistically significant in determining the quantity supplied of the commodity at the 5% level of
significance.
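Example 2.5 can be verified in a few lines; the tabulated value 2.228 (t at α/2 = 0.025 with 10 degrees of freedom) is taken from the t-table:

```python
# t-test of Example 2.5: t_cal = beta_hat / se(beta_hat), compared with the
# tabulated t at alpha/2 = 0.025 and n - k = 10 degrees of freedom.
beta_hat = 3.25
se_beta = 0.8979
t_tab = 2.228                          # t_{0.025, 10} from the t-table

t_cal = beta_hat / se_beta
reject_h0 = abs(t_cal) > t_tab

print(f"t_cal = {t_cal:.2f}")          # 3.62
print(f"Reject H0: {reject_h0}")       # True: price is significant
```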
Confidence Interval and … Con’t . . .
Confidence Interval Estimation of the Regression Coefficients
✓ In the above section, we have seen how to test the reliability of parameter estimates. But
one thing that must be clear is that rejecting the null hypothesis does not mean that the
parameter estimates are correct estimates of the true population parameters.
✓ It means that the estimate comes from the sample drawn from the population whose
population parameter is significantly different from zero. In order to define how close to
the estimate the true parameter lies, we must construct a confidence interval for the
parameter.
✓ We can construct 100(1 − α)% confidence intervals for the sample regression coefficients.
To do so we need to have the standard errors of the sample regression coefficients. The
standard error of a given coefficient is the positive square root of the variance of the
coefficient.
Confidence Interval and … Con’t . . .
✓ Variance of the intercept (β̂₀):

var(β̂₀) = σ̂ᵤ²ΣXᵢ²/(nΣxᵢ²)

✓ Variance of the slope (β̂₁):

var(β̂₁) = σ̂ᵤ²(1/Σxᵢ²),    where σ̂ᵤ² = Σeᵢ²/(n − k)

The standard errors are the positive square roots of the variances, as repeatedly defined
above, and the 100(1 − α)% confidence interval for the slope is:
Confidence Interval and … Con’t . . .
β̂₁ − t_{α/2}(n − k)·se(β̂₁) ≤ β₁ ≤ β̂₁ + t_{α/2}(n − k)·se(β̂₁)
Example 2.6: From Example 2.1 above, determine the 95% confidence interval for the slope.

β̂₁ = Σxy/Σx² = 156/48 = 3.25    β̂₀ = Ȳ − β̂₁X̄ = 63 − (3.25)(9) = 33.75

σ̂ᵤ² = Σeᵢ²/(n − k) = 387/(12 − 2) = 387/10 = 38.7

var(β̂₁) = σ̂ᵤ²(1/Σxᵢ²) = 38.7(1/48) = 0.80625    se(β̂₁) = √0.80625 = 0.8979

The tabulated value of t for n − k = 12 − 2 = 10 degrees of freedom and α/2 = 0.025 is 2.228.
Hence, the 95% confidence interval for the slope is given by:

β̂₁ ± t_{α/2}·se(β̂₁) = 3.25 ± (2.228)(0.8979) = 3.25 ± 2 = (1.25, 5.25)
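The interval computation in Example 2.6 can be reproduced step by step:

```python
# 95% confidence interval for the slope in Example 2.6.
n, k = 12, 2
sum_e2 = 387                      # residual sum of squares
sum_x2 = 48
beta_hat = 3.25
t_tab = 2.228                     # t_{0.025, 10} from the t-table

sigma2_hat = sum_e2 / (n - k)     # 38.7
var_beta = sigma2_hat / sum_x2    # 0.80625
se_beta = var_beta ** 0.5         # about 0.8979

lower = beta_hat - t_tab * se_beta
upper = beta_hat + t_tab * se_beta
print(f"95% CI for the slope: ({lower:.2f}, {upper:.2f})")   # (1.25, 5.25)
```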
Test of Model Adequacy (Overall Significance Test)
Is the estimated equation a useful one? To answer this, an objective measure of some sort
is desirable.
The total variation in the dependent variable Y can be partitioned into two parts: one that accounts for
variation due to the regression equation (the explained portion) and another that is associated
with the unexplained portion of the model.
Σ(Y − Ȳ)² = Σ(Ŷ − Ȳ)² + Σ(Y − Ŷ)²
The calculated variance ratio is F_cal = (ESS/(k − 1))/(RSS/(n − k)).
Decision: If the calculated variance ratio exceeds the tabulated value, that is, if
Fcal > Fα(k − 1, n − 2), we then conclude that R² is significant (or that the linear
regression model is adequate or statistically significant).
Note that the F-test is designed to test the significance of all variables or a set of
variables in a regression model. In the two-variable model, however, it is used to test
the explanatory power of a single variable (X) and, at the same time, is equivalent to
the test of significance of R².
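The overall significance test can be carried out with the Example 2.1 figures, where ESS = β̂Σxy = 3.25 × 156 = 507 and RSS = TSS − ESS = 894 − 507 = 387; the critical value F₀.₀₅(1, 10) = 4.96 is taken from the F-table:

```python
# Overall significance (F) test sketched with the Example 2.1 figures.
n, k = 12, 2
ess, rss = 507.0, 387.0           # explained and residual sums of squares
f_tab = 4.96                      # F_{0.05}(1, 10) from the F-table

f_cal = (ess / (k - 1)) / (rss / (n - k))
adequate = f_cal > f_tab

print(f"F_cal = {f_cal:.2f}")     # 13.10
print(f"Model adequate: {adequate}")
```

Note that in this two-variable model F_cal = 13.10 equals the square of the t-statistic from Example 2.5 (3.62² ≈ 13.10), as the equivalence described above implies.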
Prediction Using Simple Linear Regression Model
Predicting future values of the dependent variable is one of the key tasks in econometric analysis. The estimated
regression equation Ŷᵢ = α̂ + β̂Xᵢ is used for predicting the values of Y for given values of X. To proceed, let X₀ be
the given value of X. Then we predict the corresponding value Y₀ of Y by

Ŷ₀ = α̂ + β̂X₀

Note that the predictant is unbiased in the sense that E(Ŷ₀) = E(Y₀), since both Ŷ₀ and Y₀ are random variables.
Example
Using the following information about money demand (M) measured in billion USD and
interest rate (R) measured in percentage for eight different economy samples;
46
Example Con’t . . .
𝑴 (𝒀) 𝑹 (𝑿) 𝒚 𝒙 𝒚𝟐 𝒙𝟐 𝒚𝒙
56 6.3 14.125 0.337 199.516 0.114 4.767
50 4.6 8.125 -1.363 66.016 1.856 -11.070
46 5.1 4.125 -0.863 17.016 0.744 -3.558
30 7.3 -11.875 1.338 141.016 1.789 -15.883
20 8.9 -21.875 2.938 478.516 8.629 -64.258
35 5.3 -6.875 -0.663 47.266 0.439 4.555
37 6.7 -4.875 0.738 23.766 0.544 -3.595
61 3.5 19.125 -2.463 365.766 6.064 -47.095
Example Con’t . . .
β̂₁ = Σxy/Σx² = −136.138/20.179 = −6.747

β̂₀ = Ȳ − β̂₁X̄ = 41.875 − (−6.747)(5.963) = 82.10

M̂(Money Demand) = β̂₀ + β̂₁·Interest Rate (R)
M̂(Money Demand) = 82.10 − 6.747·R

B. If in a 9th economy the interest rate is R = 8.1, predict the demand for money (M)
in this economy:

M̂(Money Demand) = 82.10 − 6.747(8.1) = 82.10 − 54.65 = 27.45 billion USD
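As a check, the fit can be recomputed directly from the raw data with the slope formula β̂₁ = Σxᵢyᵢ/Σxᵢ² used throughout the chapter; a minimal plain-Python sketch:

```python
# Money-demand example recomputed from the raw data:
# slope = sum(xy)/sum(x^2), intercept = Mbar - slope*Rbar.
M = [56, 50, 46, 30, 20, 35, 37, 61]          # money demand (Y), billion USD
R = [6.3, 4.6, 5.1, 7.3, 8.9, 5.3, 6.7, 3.5]  # interest rate (X), percent

n = len(M)
m_bar = sum(M) / n                 # 41.875
r_bar = sum(R) / n                 # 5.9625

x = [ri - r_bar for ri in R]
y = [mi - m_bar for mi in M]
sum_xy = sum(a * b for a, b in zip(x, y))     # about -136.14
sum_x2 = sum(a * a for a in x)                # about 20.18

beta1 = sum_xy / sum_x2            # about -6.75
beta0 = m_bar - beta1 * r_bar      # about 82.10

m_pred = beta0 + beta1 * 8.1       # prediction for the 9th economy, R = 8.1
print(f"M_hat = {beta0:.2f} + ({beta1:.3f})*R; M(8.1) = {m_pred:.2f}")
```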
Example using STATA Output

. reg qtybeer pricebeer

      Source |       SS       df       MS            Number of obs =      30
-------------+------------------------------         F(  1,    28) =   51.14
       Model |  1156.92449     1  1156.92449         Prob > F      =  0.0000
    Residual |  633.490177    28  22.6246492         R-squared     =  0.6462
-------------+------------------------------         Adj R-squared =  0.6335
       Total |  1790.41467    29  61.7384368         Root MSE      =  4.7565

     qtybeer |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+---------------------------------------------------------------
   pricebeer |  -9.835284   1.375388    -7.15   0.000    -12.65264   -7.017929
       _cons |   86.40601   4.324293    19.98   0.000     77.5481    95.26392

Fitted equation: qtybeer = 86.406 − 9.835·pricebeer

Reading the output:
• qtybeer is the outcome variable (Y) and pricebeer is the predictor variable (X).
• Prob > F is the p-value of the model. It tests whether R² is different from 0. Usually we need
a p-value lower than 0.05 to show a statistically significant relationship between X and Y.
• R-squared shows the amount of variance of Y explained by X. In this case, the price of beer
explains 64.62% of the variance in the quantity demanded of beer.
• Adj R-squared shows the same as R-squared but adjusted by the number of cases and number
of variables. When the number of variables is small and the number of cases is very large,
Adj R-squared is closer to R-squared. This provides a more honest association between X and Y.
• The coefficient on pricebeer means that for each one-point increase in the price of beer, the
quantity demanded of beer decreases by 9.835 million liters.
• The two-tail p-values test the hypothesis that each coefficient is different from 0. To reject this,
the p-value has to be lower than 0.05 (you could also choose an alpha of 0.01). In this case, the
price of beer is statistically significant in explaining the quantity demanded of beer.
• The t-values test the hypothesis that the coefficient is different from 0. To reject this, you need a
t-value greater than 1.96 (for 95% confidence). You can get the t-values by dividing the coefficient
by its standard error. The t-values also show the importance of a variable in the model.
• The confidence interval shows the lower and upper limits within which the true population
parameter is found.