Chapter 8

Local Regression Models

William S. Cleveland
Eric Grosse

Local regression models provide methods for fitting regression functions, or regression surfaces, to data. Two examples are shown in Figures 8.1 and 8.2. In the first there is one predictor, and the fitted function is a curve. In the second there are two predictors, and the fitted surface is shown by a contour plot. The basic idea of local regression is this: for each x in the space of the predictors, there is a neighborhood containing x in which the regression surface is well approximated by a function from a specific parametric class; in this chapter there will be two such classes, linear polynomials and quadratic polynomials. Specifications of local regression models lead to fitting methods that consist of smoothing the response as a function of the predictors; thus the fitting methods are nonparametric regression procedures. Recall that in Chapters 4 to 6, responses are modeled as parametric functions of the predictors, and in Chapter 7 the regression surface of two or more predictors is modeled by additive functions of the predictors. Local regression models are an alternative to these approaches; in many cases they provide more flexible estimation of the surface. The fitting method, called loess, is used throughout the chapter, and the examples show how data are analyzed in practice with the S functions that implement it.

8.1 Statistical Models and Fitting

8.1.1 Definition of Local Regression Models

Suppose, for each i from 1 to n, that y_i is a measurement of the response and x_i is a corresponding vector of measurements of p predictors. In a local regression model, the response and predictors are related by

    y_i = g(x_i) + ε_i,

where g is the regression surface and the ε_i are random errors. For each x in the space of the predictors, g(x) is the expected value of the response. A local regression model is completed by making specifications about the properties of the errors and of the regression surface. We will now discuss the specifications that are allowable; the S functions and objects that implement them are described in Section 8.2.

Specification of the Errors

In all cases, we suppose that the ε_i are independent random variables with mean 0. One of two families of probability distributions can be specified. The first is the Gaussian. The second is the family of symmetric distributions, which covers the situation where the errors have a distribution with tails longer than the normal (leptokurtosis), and which leads to robust estimation.

The variances of the ε_i can be specified in one of two ways. The first is simply that they are a constant, σ². The second is that ε_i has variance σ²/a_i, where the a priori weights a_i are positive and known.

Specification of the Surface

Suppose first that all predictors are numeric. For each x in the space of the predictors, the loess fit at x is determined by the observations in a neighborhood of x. The overall size of the neighborhoods is governed by a parameter defined in Section 8.1.2. Size, of course, requires a notion of distance; since the predictors are numeric, we will use Euclidean distance. For two or more numeric predictors, the shapes of the neighborhoods are determined by deciding whether to normalize the scales of the numeric predictors; we will elaborate on this later.

We will allow the specification of one of two classes of parametric functions: linear and quadratic polynomials. Suppose, for example, that there are two predictors, u and v. If we specify linear, the class consists of three monomials: a constant, u, and v. If we specify quadratic, the class consists of six monomials: a constant, u, v, u², v², and uv. If there are two or more numeric predictors and we specify λ = 2, we can in addition specify that the squares of certain predictors be dropped from the class; this is useful when exploration of the data or diagnostic plots suggests that the surface is curved in some variables but not in others.

Finally, we can specify that a proper subset of the numeric predictors enter the model conditionally parametrically: given the values of the predictors not in the subset, the surface belongs to the parametric class as a function of the subset. Making such a specification, when the data support it, yields a more parsimonious fit.
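The two classes of local parametric functions are easiest to see as sets of fitting variables (monomials). Here is a minimal sketch in S-style code; the helper local.monomials() is hypothetical, written only for illustration, and also shows the effect of dropping a square:

    local.monomials <- function(u, v, lambda = 2, drop.square = character(0)) {
        X <- cbind(constant = 1, u = u, v = v)      # lambda = 1: three monomials
        if(lambda == 2) {                           # lambda = 2: six monomials
            X <- cbind(X, u2 = u^2, v2 = v^2, uv = u * v)
            # dropping the square of a predictor removes that monomial
            if("u" %in% drop.square) X <- X[, colnames(X) != "u2"]
            if("v" %in% drop.square) X <- X[, colnames(X) != "v2"]
        }
        X
    }

For example, local.monomials(u, v, lambda = 2, drop.square = "u") yields the five columns constant, u, v, v², and uv.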
If there are one or more factor predictors, the specifications apply separately for each cell of their combined levels. To summarize, a local regression model is specified by choosing:

• a Gaussian or symmetric error distribution;
• constant variance or a priori weights;
• locally linear or locally quadratic fitting in the numeric predictors;
• the neighborhood size;
• normalization of the scales;
• dropping of squares;
• a conditionally parametric subset.

Identically Distributed, Gaussian Errors: One Numeric Predictor

Suppose the errors are Gaussian with constant variance σ², and that we want to compute ĝ at a specific value x of a single numeric predictor. Let

    Δ_i(x) = |x_i − x|

be the distances from x to the x_i, and let Δ_(i)(x) be these distances ordered from smallest to largest. Let

    T(u) = (1 − u³)³ for 0 ≤ u < 1, and T(u) = 0 for u ≥ 1,

be the tricube weight function. The smoothness of the loess fit depends on the specification of the neighborhood parameter, α > 0. As α increases, ĝ becomes smoother. Suppose α ≤ 1, and let q be αn truncated to an integer. We define a weight for each observation (x_i, y_i):

    w_i(x) = T(Δ_i(x) / Δ_(q)(x)).

The weights are positive for the q observations closest to x and decrease as distance from x increases. ĝ(x) is then computed by locally weighted least squares: if λ is 1, a linear polynomial is fitted to the (x_i, y_i) with the weights w_i(x); if λ is 2, a quadratic is fitted. In either case, ĝ(x) is the value of the fitted polynomial at x. For α > 1, the w_i(x) are defined in the same manner, but with Δ_(q)(x) replaced by Δ_(n)(x)α.

Identically Distributed, Gaussian Errors: Two or More Numeric Predictors

We continue to suppose the errors are identically distributed and Gaussian. The one additional issue that needs to be addressed for p numeric predictors with p > 1 is the notion of distance in the space of the predictors. Suppose x is a value in the space. To define neighborhood weights we need to define the distance, Δ_i(x), from x to x_i, the ith observation of the predictors. We will use Euclidean distance, but the x_i do not have to be the raw measurements. Typically, it makes sense to take x_i to be the raw measurements normalized in some way. We will normalize the predictors by dividing them by their 10% trimmed sample standard deviation, and call this the standard normalization. There are, however, situations where we might choose not to normalize, for example, if the predictors represent position in space.

Armed with the Δ_i(x), the loess fitting method for p > 1 is just an obvious generalization of the one-predictor method. For α ≤ 1, neighborhood weights are defined using the same formulas used for one predictor, and the local fitting is carried out with the linear or quadratic monomials in the p predictors. For α > 1, Δ_(q)(x) is replaced by Δ_(n)(x)α^(1/p).
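The one-predictor recipe is short enough to sketch directly. The function below is a hypothetical illustration, not the loess implementation: it computes ĝ(x0) for λ = 1 by forming tricube neighborhood weights and solving the weighted least-squares problem with lm():

    tricube <- function(u) ifelse(u < 1, (1 - u^3)^3, 0)

    local.fit <- function(x, y, x0, alpha = 0.75) {
        q <- floor(alpha * length(x))         # alpha * n truncated to an integer
        delta <- abs(x - x0)                  # distances from x0 to the x_i
        w <- tricube(delta / sort(delta)[q])  # neighborhood weights w_i(x0)
        fit <- lm(y ~ x, weights = w)         # weighted least squares, lambda = 1
        sum(coef(fit) * c(1, x0))             # the local line, evaluated at x0
    }

Evaluating local.fit() on a grid of x0 values traces out the fitted curve; the loess software is far more efficient, computing the fit exactly only at selected points.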
Dropping Squares and Conditionally Parametric Fitting for Two or More Predictors

Suppose λ has been specified to be 2. Suppose, in addition, that we have specified the squares of certain predictors to be dropped. Then those monomials are not used in the local fitting.

Suppose a proper subset of the predictors has been specified to be conditionally parametric. Then we simply ignore these predictors in computing the distances that are used in the definition of the neighborhood weights. It is an easy exercise to show that this results in a conditionally parametric fit.

Symmetric Errors and Robust Fitting

Suppose the ε_i have been specified to have a symmetric distribution. Then we modify the loess fitting procedures to produce a robust estimate, one that is not adversely affected if the errors have a long-tailed distribution, yet loses little efficiency in the Gaussian case. The loess robust estimate begins with the Gaussian-error estimate, from which the residuals

    ε̂_i = y_i − ĝ(x_i)

are computed. Let

    B(u) = (1 − u²)² for |u| < 1, and B(u) = 0 for |u| ≥ 1,

be the bisquare weight function, and let

    m = median |ε̂_i|

be the median absolute residual. The robustness weights are

    r_i = B(ε̂_i / 6m).

An updated estimate is computed using the loess fitting method, but with the neighborhood weights w_i(x) replaced by r_i w_i(x); in this way, observations with large residuals receive reduced weight. New residuals are then computed, and the estimate is updated in the same manner several times; a code sketch of one such iteration follows at the end of this section.

Factor Predictors

Suppose the model has one or more factor predictors. The surface is then fitted separately for each combination of levels of the factors; this conditions the fitting by dividing the data into cells. The cells do not interact in the fitting in any way, with one exception: if the error distribution is specified to be symmetric, the residuals from all cells are pooled in forming the median absolute residual.

Errors with Unequal Scales

Suppose we specify that the ε_i have variance σ²/a_i, with known a priori weights a_i. The fitting methods remain as described above, except that the neighborhood weights w_i(x) are replaced by a_i w_i(x).
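To make the robustness iteration concrete, here is a hypothetical sketch of a single update, reusing tricube() from the earlier sketch; it illustrates the procedure and is not the loess code:

    bisquare <- function(u) ifelse(abs(u) < 1, (1 - u^2)^2, 0)

    robust.update <- function(x, y, g.hat, alpha = 0.75) {
        res <- y - g.hat              # residuals from the current fit
        m <- median(abs(res))         # median absolute residual
        r <- bisquare(res / (6 * m))  # robustness weights
        q <- floor(alpha * length(x))
        sapply(x, function(x0) {      # refit at each x_i, weights r_i * w_i(x)
            w <- r * tricube(abs(x - x0) / sort(abs(x - x0))[q])
            fit <- lm(y ~ x, weights = w)
            sum(coef(fit) * c(1, x0))
        })
    }

Starting from the Gaussian-error fitted values, a few successive calls to robust.update() correspond to the iterations just described.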
8.2 S Functions and Objects

In the sections that follow, we fit local regression models to several datasets and then carry out graphical diagnostics to check the fitted models. Our goal is to show how the data are analyzed in practice using S, and how each dataset presents a different challenge. We begin, however, by rapidly running through the S functions for fitting and inference to give an overview; the details come later.

The basic modeling function is loess(). Let's apply it to some concocted data in the data frame madeup, which has two numeric predictors and a response:

    > names(madeup)
    [1] "response" "one"      "two"
    > attach(madeup)

We will fit a Gaussian model with the smoothing parameter, α, equal to 0.5 and the degree, λ, of the locally fitted polynomial equal to 2:

    > madeup.m <- loess(response ~ one + two, span = 0.5, degree = 2)
    > madeup.m
    Call:
    loess(formula = response ~ one + two, span = 0.5, degree = 2)

    Number of Observations:          100
    Equivalent Number of Parameters: 14.9
    Residual Standard Error:         0.9698
    Multiple R-squared:              0.76
    Residuals:
        min   1st Q  median  3rd Q   max
     -2.289 -0.5064  0.1243 0.7359 2.357

Notice that the printing shows the equivalent number of parameters; this measure of the amount of smoothing, which is defined in Section 8.4, is analogous to the number of parameters in a parametric fit. Also shown is an estimate of σ, the standard error of the residuals. We can update the fit by dropping the square of the first predictor and making it conditionally parametric:

    > madeup.new <- update(madeup.m, drop.square = "one",
    +     parametric = "one")
    > madeup.new
    Call:
    loess(formula = response ~ one + two, span = 0.5, degree = 2,
        parametric = "one", drop.square = "one")

    Number of Observations:          100
    Equivalent Number of Parameters: 6.9

Later in the chapter these tools are applied to the gas data: gas.m is a loess fit of NOx, the oxides of nitrogen emitted by an experimental engine, against E, the equivalence ratio at which the engine was run.

[Figure 8.5: Residuals against E with a scatterplot smoothing; first fit to gas.]

The residuals were also plotted against the fitted values and, again, no convincing dependence was found. To check the assumption of a Gaussian distribution of the errors, we will make a Gaussian probability plot. To help us judge the straightness of the points on such plots, we will write a little function that draws a line through the lower and upper quartiles:

    > qqline <- function(data) {
    +     data.quartiles <- quantile(data, c(0.25, 0.75))
    +     norm.quartiles <- qnorm(c(0.25, 0.75))
    +     b <- diff(data.quartiles) / diff(norm.quartiles)
    +     a <- data.quartiles[1] - b * norm.quartiles[1]
    +     abline(a, b)
    + }

Now we make the plot:

    > qqnorm(residuals(gas.m))
    > qqline(residuals(gas.m))

[Figure 8.6: Residuals against E with a scatterplot smoothing; second fit to gas.]

[Figure 8.7: Square-root absolute residuals against fitted values with a scatterplot smoothing.]

[Figure 8.8: Gaussian quantile plot of residuals with a line passing through the lower and upper quartiles.]

The result, shown in Figure 8.8, suggests that the Gaussian specification is justified, which allows us to carry out statistical inference based on Gaussian distribution theory. First, we compute 99% pointwise confidence intervals; the function pointwise() computes the limits, returning the upper limits, the fitted values, and the lower limits at the evaluation points. The plot() method used earlier to graph the curve will also compute and draw confidence intervals:

    > plot(gas.m, confidence = 7)

The result is shown in Figure 8.9. The limits are computed at equally spaced points from the minimum to the maximum of E.

We know from the residual diagnostics that the smoother fit, with span = 1, does not adequately fit the data, but for purposes of illustration we will make a statistical comparison of the two models. Here are their summaries:

    > gas.m
    Call:
    loess(formula = NOx ~ E, span = 2/3, degree = 2)

    Number of Observations:          22
    Equivalent Number of Parameters: 5.5
    Residual Standard Error:         0.2406
    Multiple R-squared:              0.96
    Residuals:
         min  1st Q  median  3rd Q    max
     -0.8606 -0.213 0.02811 0.1271 0.6234

    > gas.null
    Call:
    loess(formula = NOx ~ E, span = 1, degree = 2)

    Number of Observations:          22
    Equivalent Number of Parameters: 3.5

As the span increases, the equivalent number of parameters drops, but the residual sum of squares grows as a measure of the lack of fit. We can test the null model against the alternative:

    > anova(gas.null, gas.m)
    Model 1:
    loess(formula = NOx ~ E, span = 1, degree = 2)
    Model 2:
    loess(formula = NOx ~ E, span = 2/3, degree = 2)

    Analysis of Variance Table
        ENP    RSS    Test F Value    Pr(F)
    1   3.5  4.689  1 vs 2   10.14 0.000861
    2   6.5  1.776

The result, as expected, is highly significant.
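For readers working in modern R, whose loess() descends from this software, the pointwise limits and the nested-model test can be reproduced along the following lines. The data here are synthetic stand-ins (the gas data frame is not part of base R), so the numbers will not match the output above:

    set.seed(1)
    E   <- runif(22, 0.6, 1.2)    # synthetic stand-in for the equivalence ratio
    NOx <- 4 * exp(-(E - 0.9)^2 / 0.02) + rnorm(22, sd = 0.3)

    gas.m    <- loess(NOx ~ E, span = 2/3, degree = 2)
    gas.null <- loess(NOx ~ E, span = 1, degree = 2)

    # 99% pointwise confidence limits at equally spaced points in E
    grid <- data.frame(E = seq(min(E), max(E), length.out = 7))
    p <- predict(gas.m, grid, se = TRUE)
    upper <- p$fit + qt(0.995, p$df) * p$se.fit
    lower <- p$fit - qt(0.995, p$df) * p$se.fit

    # the F test for the nested pair of fits
    anova(gas.null, gas.m)

Here predict(..., se = TRUE) supplies the look-up degrees of freedom of Section 8.4 as p$df, and anova() applied to two loess objects carries out the analysis of variance for nested models.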
8.2.2 Ethanol Data

The experiment that produced the gas data we just analyzed was also run with gasoline replaced by ethanol. There were 88 runs and two predictors: E, as before, and C, the compression ratio of the engine. The data are in the data frame ethanol. In an earlier analysis of these data it was discovered that an additive fit did not approximate the regression surface adequately because of an interaction between C and E. To make typing easier we will attach the data frame:

    > attach(ethanol)

Exploratory Data Display

A sensible way to begin an analysis with two or more predictors is a scatterplot matrix of the data, shown in Figure 8.10.

[Figure 8.10: Ethanol data; scatterplot matrix of NOx, C, and E.]

The scatterplot matrix shows that the values of the two predictors are nearly uncorrelated and that C takes on one of only a few distinct values. In the conditioning plots that follow, the conditioning intervals are shown on the given panel; as we move from left to right and then from bottom to top through the dependence panels, we move through the intervals in order. Successive intervals overlap, and the fraction of points shared by successive intervals is specified by the overlap argument (here, overlap = 1/4).

8.3.3 Graphics

In some cases, enough evaluation is done by plot() for loess objects that we want it to be done once and saved for future renderings of the graph; this is true, in particular, when confidence intervals are requested. The evaluation can be done and saved by the function preplot(), and the result rendered later by plot(). Here ethanol.cp is a fit to the ethanol data from earlier in the chapter:

    > ethanol.plot <- preplot(ethanol.cp, confidence = 7)
    > plot(ethanol.plot)

8.4 Statistical and Computational Methods

In Section 8.4.1 we discuss the statistical methods of inference that are used with fitted local regression models; in Section 8.4.2, we discuss computational methods that underlie the loess fitting. To keep the discussion from becoming too complicated, we suppose throughout that the predictors are all numeric; extending the results to factor predictors is straightforward.

8.4.1 Statistical Inference

Throughout this section we will suppose that the errors have been specified to be Gaussian and the variances have been specified to be constant. One important property of a Gaussian-error loess estimate, ĝ(x), is that it is linear in the y_i; that is,

    ĝ(x) = Σ_{i=1}^{n} l_i(x) y_i,

where the l_i(x) do not depend on the y_i. This linearity results in a distribution theory for the estimate that is very similar to that for classical parametric fitting.

Suppose the diagnostic methods have been applied and have revealed no lack of fit in ĝ(x); we will take this to mean that ĝ(x) is an unbiased estimate of g(x). Suppose, further, that diagnostic checking has verified the specifications of the error terms of the model.

Estimation of σ

Since ĝ(x) is linear in the y_i, the vector of fitted values at the x_i can be written

    ŷ = L y,

where L is an n × n matrix that depends on the x_i and the fitting specifications but not on the y_i. For k = 1 and 2, let

    δ_k = tr ((I − L)ᵀ(I − L))^k,

where I is the n × n identity matrix. We estimate σ² by the normalized residual sum of squares

    σ̂² = Σ_{i=1}^{n} ε̂_i² / δ₁.

Confidence Intervals for g(x)

Since the standard deviation of ĝ(x) is σ(x) = σ ||l(x)||, where l(x) is the vector of the l_i(x), we estimate σ(x) by

    σ̂(x) = σ̂ ||l(x)||.

Let

    ρ = δ₁² / δ₂.

The distribution of

    (ĝ(x) − g(x)) / σ̂(x)

is well approximated by a t distribution with ρ degrees of freedom; we can use this result to form confidence intervals for g(x) based on ĝ(x), of the form ĝ(x) ± c σ̂(x) with c a quantile of the t distribution with ρ degrees of freedom. Notice the analogy with parametric fitting: for a classical parametric fit, I − L is a projection, so δ₁ = δ₂ and ρ reduces to the usual integer residual degrees of freedom. For loess, δ₁²/δ₂ and δ₁ are typically close, but not equal.

Analysis of Variance for Nested Models

We can use the analysis of variance to test a null local regression model against an alternative one. Let the parameters of the null model be α⁰ and λ⁰, and let the parameters of the alternative model be α and λ. For the test to make sense, the null model should be nested in the alternative. Being nested expresses the idea that the null model cannot capture any effect that the alternative cannot capture; more precisely, it is a specification of when it makes sense to use the analysis of variance to compare two models. The null model is nested in the alternative if the following conditions hold:

• the null and alternative models have the same neighborhood variables;
• the fitting variables of the null model are a subset of the fitting variables of the alternative model;
• if the square of a numeric predictor is dropped from the alternative model, then it must not be present in the null model; the converse need not be true.

Let rss⁰ be the residual sum of squares of the null model and rss be the residual sum of squares of the alternative, and let δ₁⁰, δ₂⁰ and δ₁, δ₂ be the corresponding δ quantities. The test statistic, which is analogous to that for the analysis of variance in the parametric case, is

    F̂ = ((rss⁰ − rss) / (δ₁⁰ − δ₁)) / σ̂².

Its distribution is well approximated by an F distribution with denominator degrees of freedom ρ, defined earlier, and numerator look-up degrees of freedom

    ν = (δ₁⁰ − δ₁)² / (δ₂⁰ − δ₂).

The Equivalent Number of Parameters

The equivalent number of parameters of a loess fit is defined to be

    μ = tr LᵀL;

if the ĝ(x_i) are the fitted values, then μ is the sum of their variances divided by σ². Its analogue for a parametric fit is the number of parameters. This measure is also useful in selecting α: having selected all of the neighborhood and fitting variables except α, we can get, approximately, a desired value μ by taking α to be 1.2τ/μ, where τ is the number of fitting variables.

Symmetric Errors

When the error distribution is specified to be symmetric, inferences are based on pseudo-values. Let r_i and m be the robustness weights and the median absolute residual from the final update of the fit, ĝ(x). The pseudo-values are computed from the fitted values ĝ(x_i) and the residuals ε̂_i, with the residuals downweighted according to the r_i and the scale 6m. Inferences are then carried out by applying the inference procedures of the Gaussian case to the pseudo-values; for example, to compute a confidence interval for g(x), we apply the Gaussian-case interval to the pseudo-values. The coverage of intervals computed by this procedure is well approximated by the nominal coverage.
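As a check on these formulas, modern R exposes the quantities involved; the component names one.delta and two.delta are an assumption based on the current loess implementation and are worth verifying against ?loess. Continuing with the synthetic gas.m from the sketch in Section 8.2:

    sigma.hat <- sqrt(sum(residuals(gas.m)^2) / gas.m$one.delta)  # estimate of sigma
    rho <- gas.m$one.delta^2 / gas.m$two.delta                    # look-up df

    p <- predict(gas.m, data.frame(E = 0.9), se = TRUE)           # p$df should match rho
    c(lower = p$fit - qt(0.995, rho) * p$se.fit,                  # 99% interval for g(0.9)
      upper = p$fit + qt(0.995, rho) * p$se.fit)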
