
A Note on Kernel Regression
Partho Sarkar

3. Locally weighted regression


There is one pitfall inherent to kernel regression. Consider what happens when $x_j$ approaches a boundary of the data (left or right): the kernel weights can no longer be symmetric. To illustrate, consider the right boundary of the data, and the process of obtaining a prediction $\hat{y}_0$ at a point $x_0$ at or near this boundary. Only points to the left of $x_0$ can receive kernel weights (other than $x_0$ itself); there are simply no points to the right of $x_0$ to receive any weight. Now, if the data (and the true function $f$) are decreasing toward the right boundary, then all the $y$-values in the weighted sum used to obtain $\hat{y}_0$ are, most likely, greater than or equal to the value of $y$ at $x_0$. This boundary bias results in a prediction $\hat{y}_0$ that is too high (see the figure below).

Figure 5: Kernel vs. Locally Weighted regression (legend: Kernel fit, Memory points, Query points, Locally Linear)
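To make this boundary effect concrete, here is a minimal sketch in Python (not part of the original note; the data, the Gaussian kernel and the bandwidth h are illustrative assumptions). It computes a kernel regression estimate at the right-most memory point of a decreasing function and compares it with the true value there.

```python
import numpy as np

def gaussian_kernel(u):
    """Standard Gaussian kernel (any symmetric kernel shows the same effect)."""
    return np.exp(-0.5 * u**2) / np.sqrt(2.0 * np.pi)

# Hypothetical memory set: a function that decreases towards the right boundary
rng = np.random.default_rng(0)
x_mem = np.linspace(0.0, 14.0, 30)
y_mem = 20.0 - x_mem + rng.normal(0.0, 0.3, x_mem.size)   # true f(x) = 20 - x, plus noise

h = 1.5                       # bandwidth (illustrative choice)
x0 = x_mem[-1]                # query point at the right boundary

# Kernel weights: only points to the left of x0 (and x0 itself) get weight
w = gaussian_kernel((x_mem - x0) / h)
w = w / w.sum()

y0_hat = np.sum(w * y_mem)    # kernel regression estimate at the boundary
print(f"true f(x0) = {20.0 - x0:.2f}, kernel estimate = {y0_hat:.2f}")
# The estimate is pulled upward: every neighbour of x0 lies to its left,
# where the true function values are larger.
```

Any symmetric kernel and any trend that decreases toward the boundary produce the same qualitative result: the boundary estimate is dragged toward the larger $y$-values on the left.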

Locally weighted regression (also called local polynomial regression) is a form of non-parametric regression that addresses this boundary problem. It uses weighted least squares (WLS) regression⁵ to fit a d-th degree polynomial to the data, where d is an integer; e.g., d = 1 is local linear regression, d = 2 is local quadratic regression, etc. The weights assigned to the observations are calculated via the kernel function, as above. These weights are then used to estimate the coefficients of a local polynomial fit. The simple kernel regression described previously is just a special case of locally weighted regression, with d = 0.

5 Weighted least squares is a method of regression similar to ordinary least squares in that it also minimizes a sum of squared residuals. However, instead of weighting all the residuals equally, each residual is given its own weight, so that points with a greater weight contribute more to the sum.

Apart from the boundary problem in kernel regression mentioned earlier, locally weighted regression also addresses the problem of potentially inflated bias and variance in the interior of the data set when the points are not uniformly densely distributed, or when substantial curvature is present in the underlying (unknown) regression function. The figure above illustrates these points: the locally linear regression fit appears more accurate than the kernel regression fit, especially towards the boundaries and at points of curvature.

We now sketch the procedure for locally weighted regression. Consider, as before, fitting $y_j$ at the point $x_j$. First, the weights $w_{ij}$ are obtained for the $i = 1, 2, \dots, m$ points in the memory set. This results in the vector of kernel weights $\mathbf{w}_j$:

$\mathbf{w}_j = (w_{1j} \;\; w_{2j} \;\cdots\; w_{mj})$
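As a concrete illustration (the Gaussian kernel, the bandwidth h, and the function name are assumptions made for this sketch, not prescribed by the note), the weight vector for one query point can be computed in Python as:

```python
import numpy as np

def kernel_weight_vector(x_mem, x_j, h=1.0):
    """Kernel weights w_1j, ..., w_mj of the m memory points for one query point x_j.

    A Gaussian kernel K(u) = exp(-u^2 / 2) is assumed here; any kernel discussed
    earlier in the note could be substituted.
    """
    u = (np.asarray(x_mem, dtype=float) - x_j) / h
    return np.exp(-0.5 * u**2)
```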
Recall that the simple (zero order / Nadaraya-Watson) kernel regression estimate of $y_j$ is a weighted sum of the $y_i$'s:

14. $\hat{y}_j = \hat{m}(x_j) = \sum_{i=1}^{m} w_{ij}\, y_i$

where $\hat{y}_j$ is the predicted value of $y_j$.
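In code, equation 14 is just a normalized weighted sum of the stored y-values; this minimal sketch (Python, hypothetical names) takes a weight vector such as the one produced by kernel_weight_vector above.

```python
import numpy as np

def nadaraya_watson(w_j, y_mem):
    """Zero-order kernel regression estimate (equation 14): a weighted sum of the y's.

    w_j   : kernel weights w_1j, ..., w_mj for one query point (non-negative)
    y_mem : the m stored y-values from the memory set
    """
    w_j = np.asarray(w_j, dtype=float)
    y_mem = np.asarray(y_mem, dtype=float)
    return np.sum(w_j * y_mem) / np.sum(w_j)   # dividing by the weight sum normalizes the weights
```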


With local polynomial regression, though, the $w_{ij}$ for a fixed $j$ become the weights to be used in a weighted least squares regression⁶. The idea behind locally weighted regression is to use weighted least squares to fit a d-th order polynomial:

15. $y_j = \beta_{0j} + \beta_{1j} x_j + \beta_{2j} x_j^2 + \dots + \beta_{dj} x_j^d$

where the coefficients $\beta_{kj}$ depend on the (X, Y) points in memory and the kernel weights $w_{ij}$. This is explained below.

6 It is important to note that these weights are distinct for each query point, i.e., they vary with changing j.

The weight matrix for local polynomial regression is derived from the elements of $\mathbf{w}_j$ as:

16. $W_j = \operatorname{diag}(\mathbf{w}_j) = \begin{pmatrix} w_{1j} & & 0 \\ & \ddots & \\ 0 & & w_{mj} \end{pmatrix}$

Following the procedure of weighted least squares, the estimated coefficients for the locally weighted regression fit at $x_j$ are then found via

17. $\hat{\boldsymbol{\beta}}_j = (X' W_j X)^{-1} X' W_j \mathbf{y}$

where $\hat{\boldsymbol{\beta}}_j = (\hat{\beta}_{0j} \;\; \hat{\beta}_{1j} \;\cdots\; \hat{\beta}_{dj})'$ is the column vector of regression coefficients and $X$ is the matrix of regressors


18. $X = \begin{pmatrix} 1 & x_1 & x_1^2 & \cdots & x_1^d \\ \vdots & \vdots & \vdots & & \vdots \\ 1 & x_m & x_m^2 & \cdots & x_m^d \end{pmatrix}$

for locally weighted regression, determined by the degree d of the polynomial. Note that a column of constants (1s) is the first column; this corresponds to the constant term $\beta_{0j}$ in the equation below. Thus, provided $(X' W_j X)^{-1}$ exists, the fit at $x_j$ is obtained as:

19. $\hat{y}_j = \mathbf{x}_j \hat{\boldsymbol{\beta}}_j = \hat{\beta}_{0j} + \hat{\beta}_{1j} x_j + \hat{\beta}_{2j} x_j^2 + \dots + \hat{\beta}_{dj} x_j^d$

where $\mathbf{x}_j$ is the j-th row of the X matrix.

Note that a separate regression on all the memory points has to be carried out for every query point, i.e., the coefficients have to be re-estimated for every $x_j$ (though they are used to estimate $\hat{y}_j$ only for the j-th point). This makes local polynomial regression even more computationally intensive than simple kernel regression for sizeable memory and query sets.

Authors generally agree that for the majority of cases, a first order fit (local linear regression) is an adequate choice for d. Local linear regression balances computational ease with the flexibility to reproduce the patterns that exist in the data. Nonetheless, local linear regression may fail to capture sharp curvature if it is present in the data. In such cases, local quadratic regression (d = 2) may be needed to provide an adequate fit. Most authors agree there is usually no need for polynomials of order d > 2.
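The procedure above can be summarized in a short Python sketch (not from the original note: the Gaussian kernel, the bandwidth h, and the function names are illustrative assumptions, and a linear solver replaces the explicit inverse for numerical stability). It implements equations 16-19 for one query point and then loops over a query set, re-estimating the coefficients at every point as described.

```python
import numpy as np

def local_poly_fit(x_mem, y_mem, x_query, h=1.0, d=1):
    """Locally weighted polynomial fit of degree d at a single query point.

    Builds W_j = diag(w_j) (eq. 16), solves beta_j = (X'W_jX)^(-1) X'W_j y (eq. 17)
    with the regressor matrix X of eq. 18, and evaluates the fit as in eq. 19.
    """
    x_mem = np.asarray(x_mem, dtype=float)
    y_mem = np.asarray(y_mem, dtype=float)

    w_j = np.exp(-0.5 * ((x_mem - x_query) / h) ** 2)       # kernel weights for this query point
    W_j = np.diag(w_j)                                      # eq. 16
    X = np.vander(x_mem, N=d + 1, increasing=True)          # eq. 18: columns 1, x, x^2, ..., x^d

    beta_j = np.linalg.solve(X.T @ W_j @ X, X.T @ W_j @ y_mem)   # eq. 17

    x_row = np.vander(np.array([x_query]), N=d + 1, increasing=True)[0]
    return x_row @ beta_j                                   # eq. 19

def local_poly_regression(x_mem, y_mem, x_queries, h=1.0, d=1):
    """Fit at every query point; the coefficients are re-estimated for each one."""
    return np.array([local_poly_fit(x_mem, y_mem, xq, h=h, d=d) for xq in x_queries])

# Illustrative usage: a locally linear (d = 1) fit on a grid of query points.
# y_hat = local_poly_regression(x_mem, y_mem, np.linspace(0.0, 14.0, 100), h=1.5, d=1)
```

Setting d = 0 recovers the simple kernel regression estimate of equation 14, d = 1 gives the locally linear fit of Figure 5, and d = 2 adds the local quadratic flexibility mentioned above.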

4. Multivariate Kernel Regression


When there are multiple explanatory variables (k > 1), the basic principles of kernel regression remain the same, but their implementation becomes more complex. Our independent variable data will now look like a matrix⁷ $X$, where $x_{ki}$ is the i-th value of the k-th variable $X_k$:

$X = \begin{pmatrix} x_{11} & x_{21} & \cdots & x_{k1} \\ \vdots & \vdots & & \vdots \\ x_{1m} & x_{2m} & \cdots & x_{km} \end{pmatrix}$

The data in memory will now take the form of pairs of vectors of values of the independent and dependent variables, $(\mathbf{X}_1, y_1), (\mathbf{X}_2, y_2), \dots, (\mathbf{X}_m, y_m)$, where $\mathbf{X}_i$ is the i-th independent variable observation vector⁸,

$\mathbf{X}_i = [\,x_{1i} \;\; x_{2i} \;\cdots\; x_{ki}\,]'$

7 Matrices and vectors are shown in bold type.
8 It is more convenient for later work to express this as a column vector, hence the transpose operator.
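As a small illustration of this data layout (the numbers and array names below are hypothetical, not taken from the note), the memory matrix can be stored with one row per observation and one column per variable, and each observation vector $\mathbf{X}_i$ recovered as a column vector, matching the transpose convention in footnote 8.

```python
import numpy as np

# Hypothetical memory set with m = 4 observations of k = 3 independent variables:
# row i holds (x_1i, x_2i, ..., x_ki), so X has shape (m, k).
X = np.array([[1.0, 2.0, 3.0],
              [1.5, 2.5, 3.5],
              [2.0, 3.0, 4.0],
              [2.5, 3.5, 4.5]])
y = np.array([10.0, 11.0, 12.0, 13.0])

# The i-th observation vector X_i, expressed as a column vector (the reshape plays
# the role of the transpose in X_i = [x_1i x_2i ... x_ki]').
i = 2
X_i = X[i, :].reshape(-1, 1)
print(X_i.shape)   # (3, 1), i.e. (k, 1)
```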

