Econometrics Chapter Four - Phoenix

Panel data regression models utilize observations on the same cross-sectional units over multiple time periods, allowing for the analysis of dynamics and heterogeneity among units. These models offer advantages over pure cross-section or time series data, such as increased information, variability, and the ability to study complex behavioral models. The document discusses various estimation approaches, including fixed effects and random effects, highlighting how they account for individual and time-specific variations in the data.

Uploaded by

dirgu4553

CHAPTER FOUR
Introduction to Panel Data Regression Models

4.1 Introduction

Panel data consist of observations on the same cross-sectional, or individual, units over several time periods. In panel data the same cross-sectional unit (say a family, a firm, or a state) is surveyed over time. In short, panel data have space as well as time dimensions.

There are other names for panel data, such as pooled data (pooling of time series and cross-sectional observations), combination of time series and cross-section data, micropanel data, longitudinal data (a study over time of a variable or group of subjects), event history analysis (e.g., studying the movement over time of subjects through successive states or conditions), and cohort analysis (e.g., following the career path of 2005 graduates of a business and economics school). Although there are subtle variations, all these names essentially connote movement over time of cross-sectional units. We will therefore use the term panel data in a generic sense to include one or more of these terms, and we will call regression models based on such data panel data regression models.

What are the advantages of panel data over cross-section or time series data?

1. Since panel data relate to individuals, firms, states, countries, etc., over time, there is bound to be heterogeneity in these units. The techniques of panel data estimation can take such heterogeneity explicitly into account by allowing for individual-specific variables, as we shall show shortly. We use the term individual in a generic sense to include micro units such as individuals, firms, states, and countries.
2. By combining time series of cross-section observations, panel data give "more informative data, more variability, less collinearity among variables, more degrees of freedom and more efficiency."
3. By studying the repeated cross section of observations, panel data are better suited to study the dynamics of change. Spells
of unemployment, job turnover, and labor mobility are better studied with panel data.
4. Panel data can better detect and measure effects that simply cannot be observed in pure cross-section or pure time series data. For example, the effects of minimum wage laws on employment and earnings can be better studied if we include successive waves of minimum wage increases in the federal and/or state minimum wages.
5. Panel data enable us to study more complicated behavioral models. For example, phenomena such as economies of scale and technological change can be better handled by panel data than by pure cross-section or pure time series data.
6. By making data available for several thousand units, panel data can minimize the bias that might result if we aggregate individuals or firms into broad aggregates.

In short, panel data can enrich empirical analysis in ways that may not be possible if we use only cross-section or time series data.

4.2 ESTIMATION OF PANEL DATA REGRESSION MODELS

1) THE FIXED EFFECTS APPROACH

Estimation of panel data depends on the assumptions we make about the intercept, the slope coefficients, and the error term, $u_{it}$. There are several possibilities:

a) Assume that the intercept and slope coefficients are constant across time and space and the error term captures differences over time and individuals.
b) The slope coefficients are constant but the intercept varies over individuals.
c) The slope coefficients are constant but the intercept varies over individuals and time.
d) All coefficients (the intercept as well as slope coefficients) vary over individuals.
e) The intercept as well as slope coefficients vary over individuals and time.

Consider the following example that shows how real gross investment ($Y$) depends on the real value of the firm ($X_2$) and real capital stock ($X_3$) in four companies. Data for each company on the three variables are available for the period 1935-1954.
Thus, there are four cross-sectional units and 20 time periods. In all, therefore, we have 80 observations. A priori, $Y$ is expected to be positively related to $X_2$ and $X_3$.

It is assumed that there are a maximum of $N$ cross-sectional units or observations and a maximum of $T$ time periods. If each cross-sectional unit has the same number of time series observations, then such a panel (data) is called a balanced panel. In the present example we have a balanced panel, as each company in the sample has 20 observations. If the number of observations differs among panel members, we call such a panel an unbalanced panel.

[Table: investment data for the four companies, 1935-1954 (columns: year, Y, X2, X3 for each company); the values are illegible in the source.]

a) All Coefficients Constant across Time and Individuals

The simplest, and possibly naive, approach is to disregard the space and time dimensions of the pooled data and just estimate the usual OLS regression. That is,
stack the 20 observations for each company one on top of the other, thus giving in all 80 observations for each of the variables in the model. The OLS results are as follows:

$$\hat{Y} = -63.3041 + 0.1101 X_2 + 0.3034 X_3$$
se = (29.6124) (0.0137) (0.0493)
t = (-2.1376) (8.0188) (6.1545)
$R^2 = 0.7565$, Durbin-Watson = 0.2187, n = 80, df = 77

The estimated model assumes that the intercept values of GE, GM, US, and Westinghouse are the same. It also assumes that the slope coefficients of the two $X$ variables are identical for all four firms. Obviously, these are highly restrictive assumptions. Therefore, despite its simplicity, the pooled regression may distort the true picture of the relationship between $Y$ and the $X$'s across the four companies. What we need to do is find some way to take into account the specific nature of the four companies. How this can be done is explained next.

b) Slope Coefficients Constant but the Intercept Varies across Individuals: The Fixed Effects or Least-Squares Dummy Variable (LSDV) Regression Model

One way to take into account the "individuality" of each company or each cross-sectional unit is to let the intercept vary for each company but still assume that the slope coefficients are constant across firms. To see this, we write the above model as:

$$Y_{it} = \beta_{1i} + \beta_2 X_{2it} + \beta_3 X_{3it} + u_{it} \qquad (4.2)$$

Notice that we have put the subscript $i$ on the intercept term to suggest that the intercepts of the four firms may be different; the differences may be due to special features of each company, such as managerial style or managerial philosophy.

In the literature, model (4.2) is known as the fixed effects (regression) model (FEM). The term "fixed effects" is due to the fact that, although the intercept may differ across individuals (here the four companies), each individual's intercept does not vary over time; that is, it is time invariant. Notice that if we were to write the intercept as $\beta_{1it}$,
it will suggest that the intercept of each company or individual is time variant. It may be noted that the FEM given in (4.2) assumes that the (slope) coefficients of the regressors do not vary across individuals or over time.

How do we actually allow for the (fixed effect) intercept to vary between companies? We can easily do that by the dummy variable technique that we learned in Chapter 1. Therefore, we write (4.2) as:

$$Y_{it} = \alpha_1 + \alpha_2 D_{2i} + \alpha_3 D_{3i} + \alpha_4 D_{4i} + \beta_2 X_{2it} + \beta_3 X_{3it} + u_{it} \qquad (4.3)$$

where $D_{2i} = 1$ if the observation belongs to GM, 0 otherwise; $D_{3i} = 1$ if the observation belongs to US, 0 otherwise; and $D_{4i} = 1$ if the observation belongs to WEST, 0 otherwise. Since we have four companies, we have used only three dummies to avoid falling into the dummy-variable trap (i.e., the situation of perfect collinearity). Here there is no dummy for GE. In other words, $\alpha_1$ represents the intercept of GE, and $\alpha_2$, $\alpha_3$, and $\alpha_4$, the differential intercept coefficients, tell by how much the intercepts of GM, US, and WEST differ from the intercept of GE. In short, GE becomes the comparison company. Of course, you are free to choose any company as the comparison company.

Since we are using dummies to estimate the fixed effects, in the literature the model (4.3) is also known as the least-squares dummy variable (LSDV) model. So, the terms fixed effects and LSDV can be used interchangeably. In passing, note that the LSDV model (4.3) is also known as the covariance model, and $X_2$ and $X_3$ are known as covariates.

The results based on (4.3) are as follows:

$$\hat{Y}_{it} = -245.7924 + 161.5722 D_{2i} + 339.6328 D_{3i} + 186.5666 D_{4i} + 0.1079 X_{2it} + 0.3461 X_{3it}$$
se = (35.8112) (46.4563) (23.9863) (31.5068) (0.0175) (0.0266)
t = (-6.8635) (3.4779) (14.1594) (5.9214) (6.1653) (12.9821)
$R^2 = 0.9345$, d = 1.1076, df = 74

The intercept values of the four companies are statistically different, being -245.7924 for GE, -84.2202 (= -245.7924 + 161.5722) for GM, 93.8404 (= -245.7924 + 339.6328) for US, and -59.2258 (= -245.7924 + 186.5666) for WEST.
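The step from differential intercepts to company-specific intercepts is plain addition, and a few lines of Python make it easy to check. This is only an illustrative sketch: the coefficient values are the LSDV estimates reported in the text, while the variable names (`alpha_1`, `differential`, `intercepts`) are chosen here for illustration and are not part of the original chapter.

```python
# LSDV estimates reported in the text for the dummy-variable model; GE is the base company.
alpha_1 = -245.7924  # common intercept, i.e. the intercept of GE
differential = {"GM": 161.5722, "US": 339.6328, "WEST": 186.5666}

# Intercept of each company = base intercept + its differential intercept.
intercepts = {"GE": alpha_1}
for company, alpha_j in differential.items():
    intercepts[company] = round(alpha_1 + alpha_j, 4)

for company, value in intercepts.items():
    print(f"{company:>4}: {value:10.4f}")
```

Choosing a different comparison company would change the differential coefficients but not the implied intercept of any company.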
These differences in the intercepts may be due to unique features of each company, such as differences in management style or managerial talent.

c) Slope Coefficients Constant but the Intercept Varies over Individuals As Well As Time

The Time Effect. Just as we used the dummy variables to account for the individual (company) effect, we can allow for a time effect. Such time effects can be easily accounted for if we introduce time dummies, one for each year. Since we have data for 20 years, from 1935 to 1954, we can introduce 19 time dummies. To consider this possibility we can combine cross section and time, as follows:

$$Y_{it} = \alpha_1 + \alpha_2 D_{GM,i} + \alpha_3 D_{US,i} + \alpha_4 D_{WEST,i} + \lambda_0 + \lambda_1 Dum35 + \cdots + \lambda_{19} Dum53 + \beta_2 X_{2it} + \beta_3 X_{3it} + u_{it}$$

[...] the investment functions for the four companies are the same except for their intercepts. In all the cases we have considered, the $X$ variables had a strong impact on $Y$.

d) All Coefficients Vary across Individuals

Here we assume that the intercepts and the slope coefficients are different for all individual, or cross-section, units. This is to say that the investment functions of GE, GM, US, and WEST are all different. We can easily extend our LSDV model to take care of this situation. In (4.3) we introduced the individual dummies in an additive manner. To let the slopes differ as well, in the context of the Grunfeld investment function, what we have to do is multiply each of the company dummies by each of the $X$ variables. That is, we estimate the following model:

$$Y_{it} = \alpha_1 + \alpha_2 D_{2i} + \alpha_3 D_{3i} + \alpha_4 D_{4i} + \beta_2 X_{2it} + \beta_3 X_{3it} + \gamma_1 (D_{2i} X_{2it}) + \gamma_2 (D_{2i} X_{3it}) + \gamma_3 (D_{3i} X_{2it}) + \gamma_4 (D_{3i} X_{3it}) + \gamma_5 (D_{4i} X_{2it}) + \gamma_6 (D_{4i} X_{3it}) + u_{it}$$

You will notice that the $\gamma$'s are the differential slope coefficients, just as $\alpha_2$, $\alpha_3$, and $\alpha_4$ are the differential intercepts. If one or more of the $\gamma$ coefficients are statistically significant, it will tell us that one or more slope coefficients are different from the base group. For example, say $\beta_2$ and $\gamma_1$ are statistically significant.
In this case $(\beta_2 + \gamma_1)$ will give the value of the slope coefficient of $X_2$ for General Motors, suggesting that the GM slope coefficient of $X_2$ is different from that of General Electric, which is our comparison company. If all the differential intercept and all the differential slope coefficients are statistically significant, we can conclude that the investment functions of General Motors, United States Steel, and Westinghouse are different from that of General Electric. If this is in fact the case, there may be little point in estimating the pooled regression.

Although easy to use, the LSDV model has some problems that need to be borne in mind. If you introduce too many dummy variables, as in the model with both company and year dummies, you will run up against the degrees of freedom problem. In that case, we have 80 observations but only 55 degrees of freedom: we lose 3 df for the three company dummies, 19 df for the 19 year dummies, 2 for the two slope coefficients, and 1 for the common intercept. And with so many variables in the model, there is always the possibility of multicollinearity, which might make precise estimation of one or more parameters difficult.

2) THE RANDOM EFFECTS APPROACH

If the dummy variables do in fact represent a lack of knowledge about the (true) model, why not express this ignorance through the disturbance term $u_{it}$? This is precisely the approach suggested by the proponents of the so-called error components model (ECM) or random effects model (REM). The basic idea is to start with

$$Y_{it} = \beta_{1i} + \beta_2 X_{2it} + \beta_3 X_{3it} + u_{it}$$

Instead of treating $\beta_{1i}$ as fixed, we assume that it is a random variable with a mean value of $\beta_1$ (no subscript $i$ here). The intercept value for an individual company can then be expressed as:

$$\beta_{1i} = \beta_1 + \varepsilon_i, \qquad i = 1, 2, \ldots, N$$

where $\varepsilon_i$ is a random error term with a mean value of zero and variance of $\sigma_\varepsilon^2$.
What we are essentially saying is that the four firms included in our sample are a drawing from a much larger universe of such companies, that they have a common mean value for the intercept ($= \beta_1$), and that the individual differences in the intercept values of each company are reflected in the error term $\varepsilon_i$.

The above model can be rewritten as

$$Y_{it} = \beta_1 + \beta_2 X_{2it} + \beta_3 X_{3it} + \varepsilon_i + u_{it} = \beta_1 + \beta_2 X_{2it} + \beta_3 X_{3it} + w_{it}$$

where

$$w_{it} = \varepsilon_i + u_{it}$$

The composite error term $w_{it}$ consists of two components: $\varepsilon_i$, which is the cross-section, or individual-specific, error component, and $u_{it}$, which is the combined time series and cross-section error component. The error components model takes its name from the fact that the composite error term $w_{it}$ consists of two (or more) error components.

The usual assumptions made by ECM are that

$$\varepsilon_i \sim N(0, \sigma_\varepsilon^2)$$
$$u_{it} \sim N(0, \sigma_u^2)$$
$$E(\varepsilon_i u_{it}) = 0, \qquad E(\varepsilon_i \varepsilon_j) = 0 \quad (i \neq j)$$
$$E(u_{it} u_{is}) = E(u_{it} u_{jt}) = E(u_{it} u_{js}) = 0 \quad (i \neq j;\ t \neq s)$$

Notice carefully the difference between FEM and ECM. In FEM each cross-sectional unit has its own (fixed) intercept value, in all $N$ such values for $N$ cross-sectional units. In ECM, on the other hand, the intercept $\beta_1$ represents the mean value of all the (cross-sectional) intercepts and the error component $\varepsilon_i$ represents the (random) deviation of the individual intercept from this mean value. However, keep in mind that $\varepsilon_i$ is not directly observable; it is what is known as an unobservable, or latent, variable.

The results of ECM estimation of the Grunfeld investment function are presented in the table below. Several aspects of this regression should be noted. First, if you sum the random effect values given for the four companies, it will be zero, as it should (why?). Second, the mean value of the random error component, $\varepsilon_i$, is the common intercept value of -73.0353. The random effect value of GE of -169.9282 tells us by how much the random error component of GE differs from the common intercept value. A similar interpretation applies to the other three values of the random effects.
Third, the $R^2$ value is obtained from the transformed generalized least squares (GLS) regression.

ECM ESTIMATION OF THE GRUNFELD INVESTMENT FUNCTION

[Table: coefficient, t statistic, and p value for the intercept, X2, and X3, followed by the random effect of each company. Legible entries: intercept -73.0353 (p = 0.3870); p = 0.0000 for both slope coefficients; random effects GE -169.9282, GM 9.5078 (sign unclear), Westinghouse 13.87475. The remaining entries are illegible in the source.]

4.3 FIXED EFFECTS (LSDV) VERSUS RANDOM EFFECTS MODEL

The challenge facing a researcher is: Which model is better, FEM or ECM? The answer to this question hinges on the assumption one makes about the likely correlation between the individual, or cross-section specific, error component $\varepsilon_i$ and the $X$ regressors. If it is assumed that $\varepsilon_i$ and the $X$'s are uncorrelated, ECM may be appropriate, whereas if $\varepsilon_i$ and the $X$'s are correlated, FEM may be appropriate.

The assumption underlying ECM is that the $\varepsilon_i$ are a random drawing from a much larger population. But sometimes this may not be so. Keeping this fundamental difference between the two approaches in mind, the following observations made by Judge et al. may be helpful:

1. If $T$ (the number of time series data) is large and $N$ (the number of cross-sectional units) is small, there is likely to be little difference in the values of the parameters estimated by FEM and ECM. Hence the choice here is based on computational convenience. On this score, FEM may be preferable.
2. When $N$ is large and $T$ is small, the estimates obtained by the two methods can differ significantly. Recall that in ECM $\beta_{1i} = \beta_1 + \varepsilon_i$, where $\varepsilon_i$ is the cross-sectional random component, whereas in FEM we treat $\beta_{1i}$ as fixed and not random. In the latter case, statistical inference is conditional on the observed cross-sectional units in the sample. This is appropriate if we strongly believe that the individual, or cross-sectional, units in our sample are not random drawings from a larger sample. In that case, FEM is appropriate. However, if the cross-sectional units in the sample are regarded as random drawings,
then ECM is appropriate, for in that case statistical inference is unconditional.
3. If the individual error component $\varepsilon_i$ and one or more regressors are correlated, then the ECM estimators are biased, whereas those obtained from FEM are unbiased.
4. If $N$ is large and $T$ is small, and if the assumptions underlying ECM hold, ECM estimators are more efficient than FEM estimators.

A formal test that will help us to choose between FEM and ECM was developed by Hausman in 1978. We will not discuss the details of this test, for they are beyond the scope of the undergraduate level. The null hypothesis underlying the Hausman test is that the FEM and ECM estimators do not differ substantially. The test statistic developed by Hausman has an asymptotic $\chi^2$ distribution. If the null hypothesis is rejected, the conclusion is that ECM is not appropriate and that we may be better off using FEM, in which case statistical inferences will be conditional on the $\varepsilon_i$ in the sample.

SUMMARY AND CONCLUSIONS

1. Panel regression models are based on panel data. Panel data consist of observations on the same cross-sectional, or individual, units over several time periods.
2. There are several advantages to using panel data. First, they increase the sample size considerably. Second, by studying repeated cross-section observations, panel data are better suited to study the dynamics of change. Third, panel data enable us to study more complicated behavioral models.
3. Despite their substantial advantages, panel data pose several estimation and inference problems. Since such data involve both cross-section and time dimensions, problems that plague cross-sectional data (e.g., heteroscedasticity) and time series data (e.g., autocorrelation) need to be addressed. There are some additional problems, such as cross-correlation in individual units at the same point in time.
4.
There are several estimation techniques to address one or more of these problems. The two most prominent are (1) the fixed effects model (FEM) and (2) the random effects model (REM) or error components model (ECM).
5. In FEM the intercept in the regression model is allowed to differ among individuals in recognition of the fact that each individual, or cross-sectional, unit may have some special characteristics of its own. To take into account the differing intercepts, one can use dummy variables. The FEM using dummy variables is known as the least-squares dummy variable (LSDV) model. FEM is appropriate in situations where the individual-specific intercept may be correlated with one or more regressors. A disadvantage of LSDV is that it consumes a lot of degrees of freedom when the number of cross-sectional units, $N$, is very large, in which case we will have to introduce $N$ dummies (but suppress the common intercept term).
6. An alternative to FEM is ECM. In ECM it is assumed that the intercept of an individual unit is a random drawing from a much larger population with a constant mean value. The individual intercept is then expressed as a deviation from this constant mean value. One advantage of ECM over FEM is that it is economical in degrees of freedom, as we do not have to estimate $N$ cross-sectional intercepts. We need only to estimate the mean value of the intercept and its variance. ECM is appropriate in situations where the (random) intercept of each cross-sectional unit is uncorrelated with the regressors.
7. The Hausman test can be used to decide between FEM and ECM.
8. Despite its increasing popularity in applied research, and despite the increasing availability of such data, panel data regressions may not be appropriate in every situation. One has to use some practical judgment in each case.
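The contrast between the pooled regression and the fixed effects (LSDV) regression summarized above can be tried out on a small synthetic panel. The following is a minimal sketch, not the chapter's own computation: it assumes NumPy is available, uses made-up data rather than the Grunfeld data, has a single regressor for brevity, and all names (`alpha`, `beta`, `ols`, `demean`) are hypothetical. The unit intercepts are deliberately made to correlate with the regressor, the situation in which the text says FEM is appropriate.

```python
import numpy as np

rng = np.random.default_rng(0)
N, T = 4, 20                          # four units, 20 periods: a balanced panel
unit = np.repeat(np.arange(N), T)     # unit index of each stacked observation

# Hypothetical data-generating process (not the Grunfeld data):
# the unit intercepts are deliberately correlated with the regressor.
alpha = np.array([-250.0, -80.0, 90.0, -60.0])
beta = 0.5
x = alpha[unit] / 50.0 + rng.normal(size=N * T)
y = alpha[unit] + beta * x + rng.normal(size=N * T)

def ols(X, y):
    """Least-squares coefficients for y = X b + e."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

ones = np.ones(N * T)

# Pooled OLS: one intercept and one slope for all units.
b_pooled = ols(np.column_stack([ones, x]), y)

# LSDV: common intercept plus N - 1 dummies (unit 0 is the comparison unit).
dummies = (unit[:, None] == np.arange(1, N)).astype(float)
b_lsdv = ols(np.column_stack([ones, dummies, x]), y)

# Within estimator: subtract each unit's own mean, then regress.
def demean(v):
    return v - (np.bincount(unit, weights=v) / T)[unit]

xd, yd = demean(x), demean(y)
b_within = (xd @ yd) / (xd @ xd)

print("pooled slope:", b_pooled[1])
print("LSDV slope:  ", b_lsdv[-1])
print("within slope:", b_within)
print("LSDV intercepts:", np.concatenate([[b_lsdv[0]], b_lsdv[0] + b_lsdv[1:N]]))
```

In this setup the LSDV slope and the within (demeaned) slope coincide up to rounding, which is why the dummy-variable regression is a fixed effects estimator, while the pooled slope is pulled far from `beta` because the ignored intercept differences are correlated with `x`.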
