Locally Weighted Linear Regression

Locally weighted linear regression is a non-parametric machine learning algorithm: unlike ordinary linear regression, it does not learn a fixed set of parameters.
So what is linear regression?
Linear regression is a supervised learning algorithm used for computing linear relationships between input (X) and output (Y).

Terminology Involved

number_of_features(i) = Number of features involved.
number_of_training_examples(m) = Number of training examples.
output_sequence(y) = Output Sequence.

$\theta^T x$ = predicted point.
$J(\theta)$ = cost function of point.

The steps involved in ordinary linear regression are:

Training phase: Compute $\theta$ to minimize the cost.
$J(\theta) = \sum_{i=1}^m (\theta^T x^i - y^i)^2$

Predict output: for given query point x,
return: $\theta^T x$

Linear Regression
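
As a concrete illustration of these two steps, here is a minimal NumPy sketch. The function names are made up for this example, and the closed-form normal equation is used in place of an explicit optimization loop; this is one common way to minimize the cost, not necessarily the one the original author intended:

```python
import numpy as np


def fit_linear_regression(x_train: np.ndarray, y_train: np.ndarray) -> np.ndarray:
    """Training phase: compute theta minimizing J(theta) = sum_i (theta^T x^i - y^i)^2."""
    # Closed-form least-squares solution: theta = (X^T X)^{-1} X^T y
    # (pinv is used instead of inv for numerical robustness).
    return np.linalg.pinv(x_train.T @ x_train) @ x_train.T @ y_train


def predict(theta: np.ndarray, x_query: np.ndarray) -> float:
    """Prediction: return theta^T x for a given query point x."""
    return float(theta @ x_query)
```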

This training phase works when the data points are linear, but a question arises again: can we predict a non-linear relationship between x and y, as shown below?

Non-linear Data



So here comes the role of a non-parametric algorithm, which doesn't compute predictions based on a fixed set of parameters. Rather, the parameters $\theta$ are computed individually for each query point x.

While computing $\theta$, a higher "preference" is given to points in the vicinity of x than to points farther from x.

Cost Function $J(\theta) = \sum_{i=1}^m w^i (\theta^T x^i - y^i)^2$
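
This weighted cost can be evaluated directly in code; a small sketch, assuming NumPy arrays x (shape m×n), y (length m), and per-point weights w (length m), with a hypothetical helper name:

```python
import numpy as np


def weighted_cost(theta: np.ndarray, x: np.ndarray, y: np.ndarray, w: np.ndarray) -> float:
    """J(theta) = sum_i w^i * (theta^T x^i - y^i)^2"""
    residuals = x @ theta - y          # theta^T x^i - y^i for every training point
    return float(np.sum(w * residuals ** 2))
```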

$w^i$ is a non-negative weight associated with training point $x^i$.
$w^i$ is large for $x^i$'s lying closer to the query point x.
$w^i$ is small for $x^i$'s lying farther from the query point x.

A typical weight can be computed using

$w^i = \exp\left(-\frac{(x^i-x)(x^i-x)^T}{2\tau^2}\right)$

Where $\tau$ is the bandwidth parameter that controls how quickly $w^i$ falls off with the distance of $x^i$ from x.
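
A sketch of this weighting scheme in NumPy (the helper name local_weights is made up for illustration); note how a smaller $\tau$ makes the weights decay faster with distance from the query point:

```python
import numpy as np


def local_weights(x_query: np.ndarray, x_train: np.ndarray, tau: float) -> np.ndarray:
    """w^i = exp(-(x^i - x)(x^i - x)^T / (2 * tau^2)) for every training point x^i."""
    diff = x_train - x_query                # x^i - x, one row per training point
    sq_dist = np.sum(diff * diff, axis=1)   # (x^i - x)(x^i - x)^T for each i
    return np.exp(-sq_dist / (2 * tau ** 2))
```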

Let's look at an example:

Suppose we had a query point x = 5.0 and training points $x^1$ = 4.9 and $x^2$ = 3.0, then we can calculate the weights as:

$w^i = \exp\left(-\frac{(x^i-x)(x^i-x)^T}{2\tau^2}\right)$ with $\tau$ = 0.5

$w^1 = \exp\left(-\frac{(4.9-5)^2}{2(0.5)^2}\right) = 0.9802$

$w^2 = \exp\left(-\frac{(3-5)^2}{2(0.5)^2}\right) = 0.000335$

So, $J(\theta) = 0.9802\,(\theta^T x^1 - y^1)^2 + 0.000335\,(\theta^T x^2 - y^2)^2$

So we can conclude that the weight falls exponentially as the distance between x and $x^i$ increases, and so does the contribution of the prediction error for $x^i$ to the cost.
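
The two weights from the example above can be checked with a few lines of Python:

```python
import math

tau = 0.5
x_query = 5.0
for x_i in (4.9, 3.0):
    # w^i = exp(-(x^i - x)^2 / (2 * tau^2)) for a 1-D training point
    w = math.exp(-((x_i - x_query) ** 2) / (2 * tau ** 2))
    print(f"x^i = {x_i}: w = {w:.6f}")
# x^i = 4.9: w = 0.980199
# x^i = 3.0: w = 0.000335
```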

Steps involved in LWL are:
Compute $\theta$ to minimize the cost.
$J(\theta) = \sum_{i=1}^m w^i (\theta^T x^i - y^i)^2$
Predict output: for given query point x,
return: $\theta^T x$

LWL
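
Putting the two steps together, here is a minimal end-to-end sketch. The function name is hypothetical, and it assumes the weighted cost is minimized with the closed-form weighted normal equation $\theta = (X^T W X)^{-1} X^T W y$, where W is a diagonal matrix of the $w^i$; this is one standard way to solve the weighted least-squares problem, not necessarily the author's implementation:

```python
import numpy as np


def local_weighted_predict(
    x_query: np.ndarray, x_train: np.ndarray, y_train: np.ndarray, tau: float
) -> float:
    """Fit theta around x_query using the weighted cost, then return theta^T x_query."""
    diff = x_train - x_query
    w = np.exp(-np.sum(diff * diff, axis=1) / (2 * tau ** 2))  # w^i for each point
    w_mat = np.diag(w)
    # theta = (X^T W X)^{-1} X^T W y, the minimizer of sum_i w^i (theta^T x^i - y^i)^2
    theta = np.linalg.pinv(x_train.T @ w_mat @ x_train) @ x_train.T @ w_mat @ y_train
    return float(theta @ x_query)


# Illustrative 1-D data with an intercept column (values made up for this example):
x = np.array([[1.0, 1.0], [1.0, 2.0], [1.0, 3.0], [1.0, 4.0]])
y = np.array([1.2, 1.9, 3.2, 3.8])
print(local_weighted_predict(np.array([1.0, 2.5]), x, y, tau=0.5))
```

Because $\theta$ is re-fitted for every query point, the prediction follows the local trend of the data, which is what lets LWL capture non-linear relationships between x and y.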