Nonparametric Methods: Jason Corso
Jason Corso
SUNY at Buffalo
Histograms
[Figure: joint distribution p(X, Y) shown for Y = 1 and Y = 2, plotted against X, together with the marginals p(X) and p(Y).]
(1)
Hence the model for the density p(x) is constant over the width of
each bin. (And often the bins are chosen to have the same width,
$\Delta_i = \Delta$.)
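The piecewise-constant model described above can be sketched in a few lines of Python. This is a minimal illustration, not the slides' own code: it assumes the standard histogram estimate, in which a bin of width $\Delta$ holding n_i of the N samples is assigned density n_i / (N $\Delta$). The symbols n_i and N, the function name `histogram_density`, and the uniform test data are all assumptions introduced here.

```python
import numpy as np

def histogram_density(samples, low, high, width):
    """Piecewise-constant density estimate with equal-width bins.

    Assumes every sample lies in [low, high]; samples outside that
    range are ignored by np.histogram and the estimate would then
    integrate to less than one.
    """
    samples = np.asarray(samples, dtype=float)
    num_bins = int(round((high - low) / width))
    edges = low + width * np.arange(num_bins + 1)
    counts, _ = np.histogram(samples, bins=edges)
    # Density in each bin: fraction of samples in the bin, divided by bin width.
    density = counts / (samples.size * width)
    return edges, counts, density

# Illustrative data only (hypothetical, not from the slides).
rng = np.random.default_rng(0)
data = rng.uniform(0.0, 1.0, size=50)
edges, counts, density = histogram_density(data, 0.0, 1.0, 0.25)
for left, c, d in zip(edges[:-1], counts, density):
    print(f"bin starting at {left:.2f}: count={c}, density={d:.2f}")
```

Because each bin contributes density times width, the estimate integrates to one whenever all samples fall inside the binned range.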
Bin Number    0    1    2
Bin Count     3    6    7
[Figure: histogram density estimates of the same data for three bin widths, $\Delta = 0.04$, $\Delta = 0.08$, and $\Delta = 0.25$.]
(2)
(4)
The binomial for k peaks very sharply about the mean. So, we expect
k/n to be a very good estimate for the probability P (and thus for
the space-averaged density).
This estimate is increasingly accurate as n increases.
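The sharpening of k/n around P can be checked with a short simulation. The sketch below is an illustration under stated assumptions: it draws k from a binomial distribution with P = 0.7 for n = 20, 50, and 100 (the values that appear in the accompanying figure) and compares the empirical spread of k/n with the standard binomial result sqrt(P(1 - P)/n). The function name `fraction_spread`, the number of trials, and the seed are hypothetical choices made here.

```python
import numpy as np

def fraction_spread(P, n, trials=20000, seed=0):
    """Return the empirical mean and standard deviation of k/n,
    where k ~ Binomial(n, P), over a number of simulated trials."""
    rng = np.random.default_rng(seed)
    k = rng.binomial(n, P, size=trials)
    ratio = k / n
    return float(np.mean(ratio)), float(np.std(ratio))

P = 0.7
for n in (20, 50, 100):
    mean, std = fraction_spread(P, n)
    theory = np.sqrt(P * (1 - P) / n)  # standard deviation of k/n for a binomial
    print(f"n={n:4d}  mean k/n={mean:.3f}  std={std:.3f}  theory={theory:.3f}")
```

The mean of k/n stays near P while its spread shrinks roughly like 1/sqrt(n), which is the sense in which the estimate becomes more accurate as n grows.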
[Figure: relative probability (vertical axis) versus k/n (horizontal axis), with curves for n = 20, 50, and 100 and true probability P = 0.7.]
FIGURE 4.1. The relative probability an estimate given by Eq. 4 will yield a particular
Practical Concerns
The validity of our estimate depends on two contradictory
assumptions:
1
[Figure: two rows of panels for n = 1, 4, 9, 16, and 100. Top row: $V_n = 1/\sqrt{n}$. Bottom row: $k_n = \sqrt{n}$.]
FIGURE 4.2. There are two leading methods for estimating the density at a point, here
at the center of each square. The one shown in the top row is to start with a large
volume centered on the test point and shrink it according to a function such as
$V_n = 1/\sqrt{n}$. The other method, shown in the bottom row, is to decrease the volume
in a data-dependent way, for instance letting the volume enclose some number
$k_n = \sqrt{n}$ of sample points. The sequences in both cases represent random variables
that generally converge and allow the true density at the test point to be calculated.
From: Richard O. Duda, Peter E. Hart, and David G. Stork, Pattern Classification.
Copyright © 2001 by John Wiley & Sons, Inc.
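The two strategies in the caption can be compared on one-dimensional data. The sketch below is a rough illustration, not the slides' method: it assumes the usual estimate p(x) ≈ (k/n)/V, takes the "volume" to be the length of an interval centered on the test point, and uses standard normal samples as hypothetical data. The function names are invented for this example.

```python
import numpy as np

def shrinking_volume_estimate(samples, x):
    """Fix the volume V_n = 1/sqrt(n) around x and count the samples inside."""
    n = samples.size
    volume = 1.0 / np.sqrt(n)  # interval length in one dimension
    k = np.sum(np.abs(samples - x) <= volume / 2.0)
    return k / (n * volume)

def nearest_neighbor_estimate(samples, x):
    """Grow the interval around x until it encloses k_n = sqrt(n) samples.

    Assumes no sample coincides exactly with x, otherwise the volume
    could be zero.
    """
    n = samples.size
    k = max(1, int(round(np.sqrt(n))))
    distances = np.sort(np.abs(samples - x))
    volume = 2.0 * distances[k - 1]  # smallest centered interval holding k samples
    return k / (n * volume)

rng = np.random.default_rng(0)
for n in (1, 4, 9, 16, 100, 10000):
    samples = rng.standard_normal(n)  # hypothetical data
    fixed = shrinking_volume_estimate(samples, 0.0)
    knn = nearest_neighbor_estimate(samples, 0.0)
    print(f"n={n:6d}  fixed-volume={fixed:.3f}  k-nearest={knn:.3f}")
```

For small n both estimates are very noisy, and the fixed-volume estimate is often zero because no sample falls inside the interval. Both settle down only as n becomes large, which matches the caption's remark that the sequences are random variables that generally converge.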