
Type: Technical Note

KDE – Direct Plug-in Method


A quick recap: In the KDE Optimization Primer, we derived the formula for the optimal bandwidth and
the minimum AMISE, as follows:

$$h_{opt} = \left[ \frac{R(K)}{\mu_2^2(K)\, R(f^{(2)})\, n} \right]^{1/5}$$

$$\mathrm{AMISE}(h_{opt}) = \frac{5}{4} \left[ \frac{R(f^{(2)})\, R(K)^4\, \mu_2^2(K)}{n^4} \right]^{1/5}$$
Where:

- $R(f^{(2)})$ is defined as follows:

$$R(f^{(2)}) = \int_{-\infty}^{\infty} \left[ f^{(2)}(x) \right]^2 dx$$

- $R(K)$ and $\mu_2^2(K)$ are known constant quantities determined by the selection of the kernel function (e.g., Gaussian).
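
For example, for the Gaussian kernel $K(u) = \phi(u) = \frac{1}{\sqrt{2\pi}} e^{-u^2/2}$, these constants are:

$$R(K) = \int_{-\infty}^{\infty} \phi^2(u)\, du = \frac{1}{2\sqrt{\pi}}, \qquad \mu_2(K) = \int_{-\infty}^{\infty} u^2 \phi(u)\, du = 1$$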
The main problem in using the formula above in practice is that we don't know the value of $R(f^{(2)})$: the integral of the squared second derivative of the underlying probability density function $f(x)$, which we are trying to estimate.

How can we overcome this problem? We can estimate $R(f^{(2)})$ using the KDE itself.

Before we go any further, we need to introduce a neat math trick. Assuming that, for any $r \geq 0$, $f^{(r)}(\pm\infty) \to 0$, the following relation holds (by repeated integration by parts):

$$\int_{-\infty}^{\infty} \left[ f^{(s)}(x) \right]^2 dx = (-1)^s \int_{-\infty}^{\infty} f^{(2s)}(x)\, f(x)\, dx$$
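
For instance, a single integration-by-parts step, with the boundary term vanishing under the assumption above, reads:

$$\int_{-\infty}^{\infty} f^{(s)}(x)\, f^{(s)}(x)\, dx = \Big[ f^{(s)}(x)\, f^{(s-1)}(x) \Big]_{-\infty}^{\infty} - \int_{-\infty}^{\infty} f^{(s+1)}(x)\, f^{(s-1)}(x)\, dx = -\int_{-\infty}^{\infty} f^{(s+1)}(x)\, f^{(s-1)}(x)\, dx$$

Repeating the step $s$ times shifts all the derivatives onto one factor and accumulates the $(-1)^s$ sign.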

So, if we are to apply it to $R(f^{(2)})$ (i.e., $s = 2$):

$$R(f^{(2)}) = \int_{-\infty}^{\infty} \left[ f^{(2)}(x) \right]^2 dx = \int_{-\infty}^{\infty} f^{(4)}(x)\, f(x)\, dx$$

Remember that $f(x)$ is the probability density function, so we can express $R(f^{(2)})$ as follows:

$$R(f^{(2)}) = \mathrm{E}\!\left[ f^{(4)}(X) \right] = \psi_4$$



In effect, we converted $R(f^{(2)})$ into the expectation of the 4th derivative of $f(x)$, which can be estimated (i.e., $\hat\psi_4$) non-parametrically using the sample data:

$$\hat\psi_4 = \frac{1}{n} \sum_{i=1}^{n} \hat f^{(4)}(X_i)$$

Where $\hat f^{(4)}(\cdot)$ is a data-driven estimator of the fourth derivative of $f(x)$.

$\psi_r$ Estimation
Using a KDE estimator with kernel function $L(\cdot)$ and bandwidth $g$, the estimator $\hat f(x; g)$ and its derivatives are defined as follows:

$$\hat f(x; g) = \frac{1}{ng} \sum_{i=1}^{n} L\!\left( \frac{x - X_i}{g} \right)$$

$$\hat f^{(1)}(x; g) = \frac{1}{ng^2} \sum_{i=1}^{n} L^{(1)}\!\left( \frac{x - X_i}{g} \right)$$

$$\vdots$$

$$\hat f^{(r)}(x; g) = \frac{1}{ng^{r+1}} \sum_{i=1}^{n} L^{(r)}\!\left( \frac{x - X_i}{g} \right)$$

Plugging $\hat f^{(r)}(x; g)$ into the sample average above, the estimator $\hat\psi_r$ of the expected $r$-th derivative is expressed as follows:

$$\hat\psi_r = \frac{1}{n^2 g^{r+1}} \sum_{i=1}^{n} \sum_{j=1}^{n} L^{(r)}\!\left( \frac{X_i - X_j}{g} \right)$$
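
To make this concrete, here is a minimal sketch of the double-sum estimator, assuming a Gaussian kernel $L = \phi$ so that $L^{(r)}(u) = (-1)^r \mathrm{He}_r(u)\,\phi(u)$, where $\mathrm{He}_r$ is the probabilists' Hermite polynomial. The names `gauss_deriv` and `psi_hat` are illustrative, not from any library, and the choice of $L$ and $g$ is discussed next:

```python
import numpy as np
from scipy.stats import norm
from numpy.polynomial.hermite_e import hermeval

def gauss_deriv(u, r):
    # r-th derivative of the standard normal pdf: phi^(r)(u) = (-1)^r He_r(u) phi(u)
    c = np.zeros(r + 1)
    c[r] = 1.0                                   # coefficients selecting He_r
    return ((-1) ** r) * hermeval(u, c) * norm.pdf(u)

def psi_hat(x, r, g):
    # Double-sum estimator: psi_r_hat = (n^2 g^(r+1))^-1 * sum_i sum_j phi^(r)((X_i - X_j)/g)
    x = np.asarray(x, dtype=float)
    n = x.size
    u = (x[:, None] - x[None, :]) / g            # all pairwise differences (X_i - X_j)/g
    return gauss_deriv(u, r).sum() / (n ** 2 * g ** (r + 1))
```

The double sum costs $O(n^2)$ kernel evaluations; practical implementations often use binned approximations for large samples.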

The next question is: what are the kernel function $L(\cdot)$ and the bandwidth $g$? Are they the same ones we use for the final KDE? Not necessarily, as our goal here is to minimize the error in the $\hat\psi_r$ estimate.

Under certain regularity assumptions (Wand and Jones (1995)), the asymptotic bias and variance of $\hat\psi_r$ can be obtained, so we can compute the asymptotic mean squared error (AMSE):

$$\mathrm{AMSE}\!\left[ \hat\psi_r(g) \right] = \left[ \frac{L^{(r)}(0)}{n g^{r+1}} + \frac{\mu_2(L)\, \psi_{r+2}\, g^2}{2} \right]^2 + \frac{2\, \psi_0\, R(L^{(r)})}{n^2 g^{2r+1}} + \frac{4}{n} \left[ \int_{-\infty}^{\infty} \left[ f^{(r)}(x) \right]^2 f(x)\, dx - \psi_r^2 \right]$$

And the AMSE-optimal bandwidth is:

$$g_{\mathrm{AMSE}} = \left[ \frac{-k!\, L^{(r)}(0)}{\mu_k(L)\, \hat\psi_{r+k}\, n} \right]^{\frac{1}{r+k+1}}$$



Where:

- $k$ is the number of stages, provided $\mu_k(L) \neq 0$.

Note that a symmetric kernel function has a positive $k$-th moment (i.e., $\mu_k(L) > 0$) for even $k$ (i.e., $k \in \{0, 2, 4, 6, 8, \ldots\}$).

- For $k = 0$ (Silverman method), we start with a known distribution (e.g., Gaussian), calculate $\hat\psi_4$ (analytically), and compute the optimal bandwidth.
- For $k = 2$ (Direct plug-in method), we start with a known distribution (e.g., Gaussian). Then:
  o Calculate $\hat\psi_8$ (analytically).
  o Stage 1:
    ▪ Compute the optimal bandwidth value for estimating $\hat\psi_6$.
    ▪ Calculate $\hat\psi_6$.
  o Stage 2:
    ▪ Compute the optimal bandwidth value for estimating $\hat\psi_4$.
    ▪ Calculate $\hat\psi_4$.
  o Calculate the optimal KDE bandwidth.
- For $k = 4$ (Direct plug-in method), we start with a known distribution (e.g., Gaussian). Then:
  o Calculate $\hat\psi_{12}$ (analytically).
  o Stage 1:
    ▪ Compute the optimal bandwidth value for estimating $\hat\psi_{10}$.
    ▪ Calculate $\hat\psi_{10}$.
  o Stage 2:
    ▪ Compute the optimal bandwidth value for estimating $\hat\psi_8$.
    ▪ Calculate $\hat\psi_8$.
  o Stage 3:
    ▪ Compute the optimal bandwidth value for estimating $\hat\psi_6$.
    ▪ Calculate $\hat\psi_6$.
  o Stage 4:
    ▪ Compute the optimal bandwidth value for estimating $\hat\psi_4$.
    ▪ Calculate $\hat\psi_4$.
  o Calculate the optimal KDE bandwidth.
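
As a concrete illustration of the $g_{\mathrm{AMSE}}$ formula above (used at each stage of the procedures just listed), here is a minimal sketch for a Gaussian kernel $L = \phi$, whose derivatives at zero follow from the probabilists' Hermite polynomials and whose even moments are $\mu_k(\phi) = (k-1)!!$. The helper name `g_amse` is illustrative only, not part of any library:

```python
import math
import numpy as np
from scipy.stats import norm
from numpy.polynomial.hermite_e import hermeval

def g_amse(r, k, psi_rk, n):
    # Pilot bandwidth for estimating psi_r, given an estimate psi_rk of psi_{r+k},
    # assuming a Gaussian kernel L = phi (an illustrative sketch).
    c = np.zeros(r + 1)
    c[r] = 1.0                                               # coefficients selecting He_r
    L_r_0 = ((-1) ** r) * hermeval(0.0, c) * norm.pdf(0.0)   # L^(r)(0) = phi^(r)(0)
    mu_k = math.prod(range(1, k, 2))                         # mu_k(phi) = (k - 1)!! for even k
    return (-math.factorial(k) * L_r_0 / (mu_k * psi_rk * n)) ** (1.0 / (r + k + 1))
```

For the two-stage case described below, step 3 of the direct plug-in algorithm corresponds to `g_amse(6, 2, psi8, n)` and step 5 to `g_amse(4, 2, psi6, n)`.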



Typically, two stages ($k = 2$) are considered a good trade-off between bias (which is mitigated as $k$ increases) and variance (which increases with $k$).

Direct Plug-in Method (Sheather & Jones)


This is the method proposed by Sheather and Jones (1991), where they consider $L = K$ and $k = 2$, yielding what we call the Direct Plug-In (DPI). The algorithm is:

1. Using the sample data, calculate $\hat\sigma = \min(s, \hat\sigma_{\mathrm{IQR}})$, where $s$ is the sample standard deviation and $\hat\sigma_{\mathrm{IQR}}$ is the interquartile-range-based scale estimate (typically the sample IQR divided by 1.349, the IQR of a standard normal).

2. Assume a Gaussian underlying distribution (i.e., $f(x) = \phi(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\!\left( -\frac{(x - \mu)^2}{2\sigma^2} \right)$), then calculate (analytically) $\hat\psi_8$ using $\hat\sigma$ from step 1:

$$\hat\psi_8 = \int_{-\infty}^{\infty} \phi^{(8)}(x)\, \phi(x)\, dx = \frac{105}{32\sqrt{\pi}\, \hat\sigma^9}$$
3. Calculate the optimal bandwidth $g_1$ for estimating $\hat\psi_6$:

$$g_1 = \left[ \frac{-2 K^{(6)}(0)}{\mu_2(K)\, \hat\psi_8\, n} \right]^{\frac{1}{9}}$$

4. Estimate $\hat\psi_6$:

$$\hat\psi_6 = \frac{1}{n^2 g_1^7} \sum_{i=1}^{n} \sum_{j=1}^{n} K^{(6)}\!\left( \frac{X_i - X_j}{g_1} \right)$$

5. Calculate the optimal bandwidth $g_2$ for estimating $\hat\psi_4$:

$$g_2 = \left[ \frac{-2 K^{(4)}(0)}{\mu_2(K)\, \hat\psi_6\, n} \right]^{\frac{1}{7}}$$
6. Estimate $\hat\psi_4$:

$$\hat\psi_4 = \frac{1}{n^2 g_2^5} \sum_{i=1}^{n} \sum_{j=1}^{n} K^{(4)}\!\left( \frac{X_i - X_j}{g_2} \right)$$

7. Now, using $\hat\psi_4$ as an estimate for $R(f^{(2)})$:

$$h_{\mathrm{DPI}} = \left[ \frac{R(K)}{\mu_2^2(K)\, \hat\psi_4\, n} \right]^{\frac{1}{5}}$$
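
To tie the seven steps together, here is a minimal, self-contained sketch of the algorithm for the Gaussian kernel ($K = L = \phi$, so $\mu_2(K) = 1$ and $R(K) = 1/(2\sqrt{\pi})$). The function names (`dpi_bandwidth`, `gauss_deriv`, `psi_hat`) are illustrative, not part of any particular library, and $\hat\sigma_{\mathrm{IQR}}$ is taken as IQR/1.349:

```python
import numpy as np
from scipy.stats import norm, iqr
from numpy.polynomial.hermite_e import hermeval

def gauss_deriv(u, r):
    # phi^(r)(u) = (-1)^r He_r(u) phi(u), He_r = probabilists' Hermite polynomial
    c = np.zeros(r + 1)
    c[r] = 1.0
    return ((-1) ** r) * hermeval(u, c) * norm.pdf(u)

def psi_hat(x, r, g):
    # Steps 4 and 6: double-sum estimator of psi_r with pilot bandwidth g
    n = x.size
    u = (x[:, None] - x[None, :]) / g
    return gauss_deriv(u, r).sum() / (n ** 2 * g ** (r + 1))

def dpi_bandwidth(data):
    x = np.asarray(data, dtype=float)
    n = x.size
    # Step 1: robust scale estimate
    sigma = min(x.std(ddof=1), iqr(x) / 1.349)
    # Step 2: normal-reference value of psi_8
    psi8 = 105.0 / (32.0 * np.sqrt(np.pi) * sigma ** 9)
    # Steps 3-4: pilot bandwidth g1 (mu_2(phi) = 1), then estimate psi_6
    g1 = (-2.0 * gauss_deriv(0.0, 6) / (psi8 * n)) ** (1.0 / 9.0)
    psi6 = psi_hat(x, 6, g1)
    # Steps 5-6: pilot bandwidth g2, then estimate psi_4
    g2 = (-2.0 * gauss_deriv(0.0, 4) / (psi6 * n)) ** (1.0 / 7.0)
    psi4 = psi_hat(x, 4, g2)
    # Step 7: plug psi_4 into the AMISE-optimal bandwidth, with R(phi) = 1/(2 sqrt(pi))
    return (1.0 / (2.0 * np.sqrt(np.pi) * psi4 * n)) ** (1.0 / 5.0)
```

For a Gaussian sample, `dpi_bandwidth(np.random.normal(size=500))` should land close to the normal-reference bandwidth $(4/3)^{1/5}\,\hat\sigma\, n^{-1/5} \approx 1.06\,\hat\sigma\, n^{-1/5}$.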



Conclusion
In this paper, we assumed that the kernel function K is not only symmetric but also has four (4) continuous derivatives. This assumption excludes the use of many kernel functions (e.g., uniform, triangular, bi-weight, tri-weight), but fortunately the Gaussian kernel meets the conditions and is often used with the DPI method.

The Sheather and Jones Direct Plug-in method is popular in practice for a broad set of cases, yielding good performance for smooth densities, at least in simulation.

References
- Silverman, B.W. (1986). Density Estimation for Statistics and Data Analysis. Chapman & Hall/CRC, London.
- Zucchini, W. (2003). Applied Smoothing Techniques, Part 1: Kernel Density Estimation.
- Park, B.U. and Marron, J.S. (1990). Comparison of Data-Driven Bandwidth Selectors. Journal of the American Statistical Association, Vol. 85, No. 409, pp. 66-72.
- Sheather, S.J. and Jones, M.C. (1991). A Reliable Data-Based Bandwidth Selection Method for Kernel Density Estimation. Journal of the Royal Statistical Society, Series B, 53:683-690.
- Wand, M.P. and Jones, M.C. (1995). Kernel Smoothing. Chapman & Hall, London.

