Chapter 2
Simple Linear Regression
Model : Two Variable Case
After learning this chapter you will understand :
Population Regression Function.
Stochastic Error Term.
Sample Regression Function.
Method of Ordinary Least Squares.
Assumptions of CLRM.
Properties of OLS Estimators.
Gauss-Markov Theorem.
Hypothesis Testing of OLS Estimators.
Coefficient of Determination.
Normality Tests :
Normal Probability Plot,
Jarque-Bera Test.
Forecasting.
Scaling and units of measurement.
For Full Course Video Lectures of
All Subjects of Eco. (Hons), B Com (H), BBE, MA Economics
Register yourself at
www.primeacademy.in
Dheeraj Suri Classes
Prime Academy
9899192027
Basic Concepts
1. Regression : Regression means returning or stepping back to the average or normal. It was first used by Sir Francis Galton. Regression analysis, in the general sense, means the estimation or prediction of the unknown value of one variable from the known value of another variable. In the words of M. M. Blair, "Regression analysis is a mathematical measure of the average relationship between two or more variables in terms of the original units of the data". It is one of the most important statistical tools and is extensively used in business and economics to study the relationship between two or more causally related variables, and for the estimation of demand and supply curves, cost functions, production and consumption functions, etc.
2. Linear Regression : Since in this text we are concerned only with linear regression models, it is essential to know the meaning of linearity. The term linearity can be interpreted in two different ways as under :
(i) Linearity in the Variables : If all the variables involved in the regression equation have degree one, then the regression equation is said to be linear in the variables.
(ii) Linearity in the Parameters : If the conditional expectation of Y, E(Y|Xᵢ), is a linear function of the parameters, the β's, then it is linear in the parameters. Here it may or may not be linear in the variables.
In our analysis, by linearity we mean linear in the parameters only. The regression model may or may not be linear in the variables, the X's, but it is essentially linear in the parameters, the β's.
3. Regression Equations : Regression equations are used to estimate the value of one variable on the basis of the value of the other variable. If we have two variables X and Y, the two regression equations for them are X on Y and Y on X.
4. Regression Equation of Y on X : The line of regression of Y on X is the line
which gives the best estimate of Y for any given value of X.
5. Regression Equation of X on Y : The line of regression of X on Y is the line
which gives the best estimate of X for any given value of Y.
Note : Generally we study only the regression equation of Y on X, where Y is the dependent or explained variable and X is the independent or explanatory variable.
6. A Linear Probabilistic Model : As a first approximation we may assume that the Population Regression Function (PRF) is a linear function, i.e., it is of the type :
E(Y|Xᵢ) = β₁ + β₂Xᵢ
Where,
E(Y|Xᵢ) means the mean or expected value of Y corresponding to, or conditional upon, a given value of X; here by linearity we mean linear in the parameters.
β₁ and β₂ are unknown but fixed parameters known as regression coefficients. β₁ is called the intercept term and β₂ is called the slope term.
7. Stochastic Specification of PRF : The E(Y|Xᵢ) = β₁ + β₂Xᵢ form of the regression function implies that the relationship between X and Y is exact, that is, all the variations in Y are solely due to changes in X and there are no other factors affecting the dependent variable. If this were true then all the points of X and Y pairs, if plotted on a two dimensional plane, would fall on a straight line. However, if we gather observations on the actual data and plot them on a diagram, we see they do not fall on a straight line (or any other smooth curve for that matter). The deviations of the observations from the line may be attributed to several factors.
In statistical analysis, however, one generally acknowledges the fact that the relationship is not exact by explicitly including a random factor, known as the disturbance term, in the linear regression model. So the linear regression model becomes :
Yᵢ = β₁ + β₂Xᵢ + uᵢ
Where uᵢ is known as the stochastic or random or residual or error term.
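To see the difference between the deterministic PRF and its stochastic specification, the following minimal Python sketch simulates data around an assumed PRF (the parameter values β₁ = 2, β₂ = 0.5 and the error standard deviation 5 are illustrative assumptions, not taken from the text):

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed "true" parameters, for illustration only; in practice they are unknown
beta1, beta2 = 2.0, 0.5
X = np.array([80., 100., 120., 140., 160., 180., 200., 220., 240., 260.])

EY = beta1 + beta2 * X            # deterministic PRF: E(Y|Xi) = beta1 + beta2*Xi
u = rng.normal(0.0, 5.0, X.size)  # stochastic disturbance term ui
Y = EY + u                        # stochastic specification: Yi = E(Y|Xi) + ui

print(np.column_stack((X, EY, Y)))  # observed Y scatters around the PRF line
```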
8. Sample Regression Function : Suppose that we have taken a sample of a few values of Y corresponding to given Xᵢ. The line which is drawn to fit the data is called the sample regression line. Mathematically, the sample regression line can be expressed as :
Ŷᵢ = β̂₁ + β̂₂Xᵢ or Ŷᵢ = b₁ + b₂Xᵢ
Where,
Ŷᵢ is the estimator of E(Y|Xᵢ), or the estimator of the population conditional mean,
β̂₁ or b₁ is the estimator of β₁, and
β̂₂ or b₂ is the estimator of β₂.
9. Stochastic Specification of SRF : The stochastic sample regression function can be expressed as :
Yᵢ = β̂₁ + β̂₂Xᵢ + eᵢ or Yᵢ = Ŷᵢ + eᵢ
Where eᵢ is the estimator of the population error term uᵢ.
10. Estimating Model Parameters : Our basic objective in regression analysis is to estimate the stochastic PRF
Yᵢ = β₁ + β₂Xᵢ + uᵢ
on the basis of the SRF
Yᵢ = β̂₁ + β̂₂Xᵢ + eᵢ
because generally our estimation is based on a single sample from some population. Due to sampling variations, our estimate of the PRF based on the SRF is only approximate. To estimate the regression equation we use the method of least squares.
11. Assumptions of Classical Linear Regression Model : In order to use the method of ordinary least squares (OLS), the following basic assumptions must hold for a two variable regression model :
Assumption 1 : Linear regression model. The regression model Yᵢ = β₁ + β₂Xᵢ + uᵢ is linear in the parameters. This interpretation of linearity is that the conditional expectation of Y, E(Y|Xᵢ), is a linear function of the parameters, the β's; it may or may not be linear in the variable X. In this interpretation E(Y|Xᵢ) = β₁ + β₂Xᵢ² is a linear (in the parameters) regression model. To see this, let us suppose X takes the value 3. Therefore, E(Y|X = 3) = β₁ + 9β₂, which is obviously linear in β₁ and β₂.
Note : A function is said to be linear in the parameter, say, β₁, if β₁ appears with a power of 1 only and is not multiplied or divided by any other parameter (for example, β₁β₂, β₂/β₁, and so on).
Assumption 2: X values are fixed in repeated sampling. Values taken by the
regressor X are considered fixed in repeated samples. More technically, X is
assumed to be nonstochastic.
Assumption 3 : Zero mean value of disturbance uᵢ. Given the value of X, the mean, or expected, value of the random disturbance term uᵢ is zero. Technically, the conditional mean value of uᵢ is zero. Symbolically, we have E(uᵢ|Xᵢ) = 0. A violation of this assumption introduces bias in the intercept of the regression equation.
Assumption 4 : Homoscedasticity or equal variance of uᵢ. Given the value of X, the variance of uᵢ is the same for all observations. That is, the conditional variances of uᵢ are identical. Symbolically, we have
var(uᵢ|Xᵢ) = E[uᵢ − E(uᵢ|Xᵢ)]² = E(uᵢ²|Xᵢ) = σ²
where var stands for variance.
Assumption 5 : No autocorrelation between the disturbances. Given any two X values, Xᵢ and Xⱼ (i ≠ j), the correlation between any two uᵢ and uⱼ (i ≠ j) is zero. Symbolically,
cov(uᵢ, uⱼ|Xᵢ, Xⱼ) = E{[uᵢ − E(uᵢ)]|Xᵢ}{[uⱼ − E(uⱼ)]|Xⱼ}
= E(uᵢ|Xᵢ)E(uⱼ|Xⱼ)
= 0
where i and j are two different observations and where cov means covariance.
Assumption 6 : Zero covariance between uᵢ and Xᵢ, or E(uᵢXᵢ) = 0. Formally,
cov(uᵢ, Xᵢ) = E[uᵢ − E(uᵢ)][Xᵢ − E(Xᵢ)]
= E[uᵢ(Xᵢ − E(Xᵢ))] since E(uᵢ) = 0
= E(uᵢXᵢ) − E(Xᵢ)E(uᵢ) since E(Xᵢ) is nonstochastic
= E(uᵢXᵢ) since E(uᵢ) = 0
= 0 by assumption
Assumption 7: The number of observations n must be greater than the
number of parameters to be estimated. Alternatively, the number of
observations n must be greater than the number of explanatory variables.
Assumption 8: Variability in X values. The X values in a given sample must not
all be the same. Technically, var (X) must be a finite positive number.
Assumption 9: The regression model is correctly specified. Alternatively, there
is no specification bias or error in the model used in empirical analysis.
Assumption 10: There is no perfect multicollinearity. That is, there are no
perfect linear relationships among the explanatory variables.
12. Method of Least Squares : The method of Ordinary Least Squares is attributed to the German mathematician Carl Friedrich Gauss. This method is based on the principle that the sum of the squared differences between the estimated values and the actual observed values should be minimum, and that the variables X and Y are related according to the simple linear regression model. The values of β₁, β₂ and σ² will almost never be known to an investigator. Instead, sample data consisting of n observed pairs (X₁, Y₁), (X₂, Y₂), ..., (Xₙ, Yₙ) will be available, from which the model parameters and the true regression line itself can be estimated. The least squares method is used to obtain the best fitting straight line to the given data, i.e., the line that best predicts the values of the dependent variable from those of the independent variable. Using this method the regression equation may be found as under :
Regression Equation of Y on X :
Ŷ = β̂₁ + β̂₂X
Normal Equations :
ΣY = nβ̂₁ + β̂₂ΣX and ΣXY = β̂₁ΣX + β̂₂ΣX²
where β̂₁ and β̂₂ are constants and their values can be obtained by solving the normal equations.
The least squares estimate of the slope coefficient β̂₂ of the true regression line is :
β̂₂ = Σ(Xᵢ − X̄)(Yᵢ − Ȳ) / Σ(Xᵢ − X̄)² = [nΣXᵢYᵢ − ΣXᵢΣYᵢ] / [nΣXᵢ² − (ΣXᵢ)²]
The least squares estimate of the intercept coefficient β̂₁ of the true regression line is :
β̂₁ = Ȳ − β̂₂X̄
Note : Regression Equation of X on Y :
X̂ = β̂₁ + β̂₂Y
Normal Equations :
ΣX = nβ̂₁ + β̂₂ΣY and ΣXY = β̂₁ΣY + β̂₂ΣY²
where β̂₁ and β̂₂ are constants and their values can be obtained by solving the normal equations.
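As a minimal sketch of the Y on X formulas above in Python (the data points are made up purely for illustration):

```python
import numpy as np

# Illustrative sample data (made up for this sketch)
X = np.array([1., 2., 3., 4., 5., 6., 7., 8., 9., 10.])
Y = np.array([3., 4., 6., 7., 8., 9., 11., 12., 13., 15.])

# Slope: beta2_hat = sum((Xi - Xbar)(Yi - Ybar)) / sum((Xi - Xbar)^2)
x_dev = X - X.mean()
y_dev = Y - Y.mean()
beta2_hat = (x_dev * y_dev).sum() / (x_dev ** 2).sum()

# Intercept: beta1_hat = Ybar - beta2_hat * Xbar
beta1_hat = Y.mean() - beta2_hat * X.mean()

print(beta1_hat, beta2_hat)
# Cross-check against numpy's built-in least-squares fit: returns [slope, intercept]
print(np.polyfit(X, Y, 1))
```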
13. Variations in Yᵢ : The variations in Yᵢ may be classified as under :
(i) Total Variation : The total variation in the actual Y values about their sample mean Ȳ is called the total variation in Y. It may also be called the total sum of squares (TSS). Σyᵢ², where yᵢ = Yᵢ − Ȳ, is a measure of the total variation in Y, such that
TSS = Σ(Yᵢ − Ȳ)², i.e., Σyᵢ²
(ii) Explained Variation : Σŷᵢ² = Σ(Ŷᵢ − Ȳ)² = β̂₂²Σ(Xᵢ − X̄)² is the variation of the estimated Y values (Ŷᵢ) about their mean (which equals Ȳ), which appropriately may be called the sum of squares due to regression (i.e., due to the explanatory variable), or simply the explained sum of squares (ESS), such that
ESS = Σ(Ŷᵢ − Ȳ)²
(iii) Unexplained Variation : Σeᵢ² represents the unexplained variation of the Y values about the regression line, or simply the residual sum of squares (RSS), such that
RSS = Σ(Yᵢ − Ŷᵢ)² or Σeᵢ²
Where,
Total Sum of Squares = Explained Sum of Squares + Residual Sum of Squares
i.e., TSS = ESS + RSS
Yᵢ = Actual Value, Ŷᵢ = Estimated Value, Ȳ = Actual Mean
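The decomposition TSS = ESS + RSS can be verified numerically; here is a minimal sketch reusing the illustrative data from the earlier OLS example:

```python
import numpy as np

X = np.array([1., 2., 3., 4., 5., 6., 7., 8., 9., 10.])
Y = np.array([3., 4., 6., 7., 8., 9., 11., 12., 13., 15.])

# OLS fit (slope and intercept as in the formulas above)
b2 = ((X - X.mean()) * (Y - Y.mean())).sum() / ((X - X.mean()) ** 2).sum()
b1 = Y.mean() - b2 * X.mean()
Y_hat = b1 + b2 * X

TSS = ((Y - Y.mean()) ** 2).sum()      # total variation
ESS = ((Y_hat - Y.mean()) ** 2).sum()  # explained variation
RSS = ((Y - Y_hat) ** 2).sum()         # unexplained (residual) variation

print(TSS, ESS + RSS)  # equal up to floating-point rounding
```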
14. Estimating σ² and σ : The parameter σ² determines the amount of variability inherent in the regression model. A large value of σ² means that the observed (Xᵢ, Yᵢ) are quite spread out about the true regression line, whereas when σ² is small the observed points tend to fall very close to the true regression line. The variance of the population error term, σ², is usually unknown. We therefore need to replace it by an estimate using sample information. Since the population error term is unobservable, one can use the estimated residuals to find an estimate. We start by forming the residual term
eᵢ = Yᵢ − β̂₁ − β̂₂Xᵢ
After estimating the residuals, we compute the residual sum of squares, denoted
Σeᵢ² = Σ(Yᵢ − Ŷᵢ)² = Σ(Yᵢ − β̂₁ − β̂₂Xᵢ)²
We observe that, first of all, the two parameters β̂₁ and β̂₂ must be estimated, which implies a loss of two degrees of freedom. With this information we may use the following formula for estimating σ² :
σ̂² = Σeᵢ² / (n − 2)
Its positive square root, σ̂, is known as the standard error of estimate or the standard error of the regression (se). It is simply the standard deviation of the Y values about the estimated regression line and is often used as a summary measure of the "goodness of fit" of the estimated regression line.
Note : The term number of degrees of freedom means the total number of observations in the sample (= n) less the number of independent (linear) constraints or restrictions put on them. In other words, it is the number of independent observations out of a total of n observations. For example, before Σeᵢ² can be computed, β̂₁ and β̂₂ must first be obtained. These two estimates therefore put two restrictions on the eᵢ. Therefore, there are n − 2, not n, independent observations to compute Σeᵢ².
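A short sketch of this estimator in Python, continuing with the illustrative data used above:

```python
import numpy as np

X = np.array([1., 2., 3., 4., 5., 6., 7., 8., 9., 10.])
Y = np.array([3., 4., 6., 7., 8., 9., 11., 12., 13., 15.])
n = X.size

b2 = ((X - X.mean()) * (Y - Y.mean())).sum() / ((X - X.mean()) ** 2).sum()
b1 = Y.mean() - b2 * X.mean()
e = Y - (b1 + b2 * X)                 # estimated residuals e_i

# sigma2_hat = RSS / (n - 2): two degrees of freedom lost to b1 and b2
sigma2_hat = (e ** 2).sum() / (n - 2)
se_regression = np.sqrt(sigma2_hat)   # standard error of the regression
print(sigma2_hat, se_regression)
```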
15. Variances and Standard Errors of OLS Estimators : OLS estimators are random variables, because their values change from sample to sample. So we would like to know something about the sampling variability of these estimators. These sampling variabilities are measured by the variances of the estimators. The variances and standard errors of the OLS estimators are computed by the following formulae :
Var(b₁) = σ²ΣXᵢ² / [nΣ(Xᵢ − X̄)²], SE(b₁) = √Var(b₁)
Var(b₂) = σ² / Σ(Xᵢ − X̄)², SE(b₂) = √Var(b₂)
In practice the unknown σ² is replaced by its unbiased estimator σ̂².
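In code, these formulas look as follows (a sketch; σ² is unknown in practice, so the estimate σ̂² from the previous section is used):

```python
import numpy as np

X = np.array([1., 2., 3., 4., 5., 6., 7., 8., 9., 10.])
Y = np.array([3., 4., 6., 7., 8., 9., 11., 12., 13., 15.])
n = X.size

b2 = ((X - X.mean()) * (Y - Y.mean())).sum() / ((X - X.mean()) ** 2).sum()
b1 = Y.mean() - b2 * X.mean()
e = Y - (b1 + b2 * X)
sigma2_hat = (e ** 2).sum() / (n - 2)  # estimate of the error variance

Sxx = ((X - X.mean()) ** 2).sum()
var_b2 = sigma2_hat / Sxx                         # Var(b2)
var_b1 = sigma2_hat * (X ** 2).sum() / (n * Sxx)  # Var(b1)

print(np.sqrt(var_b1), np.sqrt(var_b2))           # SE(b1), SE(b2)
```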
16. Test of Significance of Regression Coefficients : As per the central limit theorem, the regression coefficients b₁ and b₂ follow a normal distribution with their means equal to the true β₁ and β₂ and variances as computed above. The following steps are taken to test the significance of the slope term of the regression equation :
(i) Define the Null hypothesis (H₀) and the Alternative hypothesis (H₁).
H₀ : β₂ = 0, i.e., the slope term is statistically insignificant.
H₁ : The slope term is statistically significant, i.e.,
β₂ ≠ 0 (Two tailed test)
β₂ > 0 (Upper tailed test)
β₂ < 0 (Lower tailed test)
(ii) Find out the tail of the test, i.e., determine whether it is a single tail or two tail test.
(iii) Calculate the standard error of b₂.
(iv) Calculate the test statistic 't' as under :
t = (b₂ − β₂) / SE(b₂), which under H₀ : β₂ = 0 reduces to t = b₂ / SE(b₂)
(v) Set the Level of Significance 'α'.
(vi) Find tα (for a single tail test) or tα/2 (for a two tail test) for n − 2 degrees of freedom from the table.
(vii) Compare |t| and tα (or tα/2).
(a) If |t| < tα (or tα/2), then do not reject the Null hypothesis.
(b) If |t| > tα (or tα/2), then reject the Null hypothesis.
Similarly we can test the statistical significance of the intercept term β₁.
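A minimal sketch of steps (i) to (vii) for a two tailed test at α = 0.05, using scipy for the table value and the same illustrative data:

```python
import numpy as np
from scipy import stats

X = np.array([1., 2., 3., 4., 5., 6., 7., 8., 9., 10.])
Y = np.array([3., 4., 6., 7., 8., 9., 11., 12., 13., 15.])
n = X.size

b2 = ((X - X.mean()) * (Y - Y.mean())).sum() / ((X - X.mean()) ** 2).sum()
b1 = Y.mean() - b2 * X.mean()
e = Y - (b1 + b2 * X)
se_b2 = np.sqrt((e ** 2).sum() / (n - 2) / ((X - X.mean()) ** 2).sum())

t_stat = b2 / se_b2                            # t under H0 : beta2 = 0
alpha = 0.05
t_crit = stats.t.ppf(1 - alpha / 2, df=n - 2)  # two-tailed critical value

if abs(t_stat) > t_crit:
    print("Reject H0: the slope term is statistically significant")
else:
    print("Do not reject H0")
```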
17. Confidence Interval : Let us assume that 'α' is the level of significance, or the probability of committing a type I error; then the confidence intervals of the regression coefficients are computed as under :
Confidence Interval of Intercept Term :
P(b₁ − tα/2·SE(b₁) ≤ β₁ ≤ b₁ + tα/2·SE(b₁)) = 1 − α
Confidence Interval of Slope Term :
P(b₂ − tα/2·SE(b₂) ≤ β₂ ≤ b₂ + tα/2·SE(b₂)) = 1 − α
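The slope interval translates directly into code; a sketch with the same illustrative data and α = 0.05:

```python
import numpy as np
from scipy import stats

X = np.array([1., 2., 3., 4., 5., 6., 7., 8., 9., 10.])
Y = np.array([3., 4., 6., 7., 8., 9., 11., 12., 13., 15.])
n = X.size

b2 = ((X - X.mean()) * (Y - Y.mean())).sum() / ((X - X.mean()) ** 2).sum()
b1 = Y.mean() - b2 * X.mean()
e = Y - (b1 + b2 * X)
se_b2 = np.sqrt((e ** 2).sum() / (n - 2) / ((X - X.mean()) ** 2).sum())

alpha = 0.05
t_half = stats.t.ppf(1 - alpha / 2, df=n - 2)  # t_{alpha/2} with n-2 df

# 95% confidence interval for the slope beta2
print(b2 - t_half * se_b2, b2 + t_half * se_b2)
```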
18. The Coefficient of Determination : The coefficient of determination is a measure of how well a regression model is likely to predict future outcomes. The coefficient of determination r² is the square of the sample correlation coefficient between the outcomes and the predicted values. It is given by
Coefficient of determination = r² = Explained Variance / Total Variance
Properties of r² : The following two properties of r² may be noted :
(i) r² is a non-negative quantity.
(ii) Its limits are 0 ≤ r² ≤ 1.
19. The Goodness of Fit Test : Once the regression line has been fitted, we would like to know how good the fit is; in other words, we would like to measure the discrepancy of the actual observations from the fitted line. This is important since the closer the data to the line, the better the fit, or, in other words, the better the explanation of the variation of the dependent variable by the independent variables. A usual measure of the goodness of fit is the square of the correlation coefficient, r². This is the proportion of the total variation of the dependent variable explained by the independent variable. In other words,
r² = Explained Sum of Squares (ESS) / Total Sum of Squares (TSS)
The closer the value of r² to 1, the better the fit of the regression model, because r² = 1 means the regression is a perfect fit, i.e., Ŷᵢ = Yᵢ.
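A short sketch computing r² both ways, as ESS/TSS and as the squared correlation coefficient, with the same illustrative data:

```python
import numpy as np

X = np.array([1., 2., 3., 4., 5., 6., 7., 8., 9., 10.])
Y = np.array([3., 4., 6., 7., 8., 9., 11., 12., 13., 15.])

b2 = ((X - X.mean()) * (Y - Y.mean())).sum() / ((X - X.mean()) ** 2).sum()
b1 = Y.mean() - b2 * X.mean()
Y_hat = b1 + b2 * X

TSS = ((Y - Y.mean()) ** 2).sum()
ESS = ((Y_hat - Y.mean()) ** 2).sum()
r2 = ESS / TSS                        # equivalently 1 - RSS/TSS

# In the two-variable model r2 equals the squared correlation between X and Y
print(r2, np.corrcoef(X, Y)[0, 1] ** 2)
```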
20. Gauss-Markov Theorem : The Gauss-Markov theorem states that, provided the assumptions of the CLRM are satisfied, the OLS estimators are BLUE, i.e., Best (most efficient), Linear (combinations of Yᵢ), Unbiased Estimators of the regression parameters. Thus, the OLS estimators have the following properties :
(i) Linearity : b₁ and b₂ are linear estimators, i.e., they are linear functions of the random variable Yᵢ.
(ii) Unbiasedness : OLS estimators are unbiased.
(a) b₁ and b₂ are unbiased estimators of β₁ and β₂, i.e., E(b₁) = β₁ and E(b₂) = β₂.
(b) The OLS estimator of the error variance is unbiased, i.e., E(σ̂²) = σ².