Grad Lecture 3

Lecture 3:

• Testing Models: chi-squared
• “Scientific Method”
• Student’s t
• Correlation test
• Non-parametric tests
Statistical Uncertainties
Fundamental, calculable, random variations due to inherently limited sampling of the underlying distribution (i.e. counting statistics).

Systematic Uncertainties
Incidental, estimated (bounded) biases incurred as a result of limited measurement precision (also always present).

There is no universally applicable method for estimating/bounding* systematic uncertainties. A typical approach relies on independent cross-checks, accounting for possible statistical limitations of calibration procedures, knowledge of the experimental design, and general consistency arguments.

* Systematic errors that are “determined” become corrections!

Because of their very different nature, there is no standard, mathematically rigorous way to combine the two types of uncertainty. The convention is thus to quote results in the form:

Result ± Uncertainty (stat) ± Uncertainty (sys)

with error bars drawn to show the statistical and systematic components separately.


How do you then make use of such data points to fit a model? It is generally assumed that systematic uncertainties can be treated in a similar way to statistical uncertainties, with careful attention to correlations.

Ideally, the best way to treat systematic uncertainties is as free parameters in the model fit, constrained by the separately determined bounds on their values.
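As a minimal sketch of this idea (not from the lecture; the data and uncertainty values below are invented for illustration), a common systematic offset can be included as a nuisance parameter with a Gaussian penalty term added to the χ²:

```python
import numpy as np
from scipy.optimize import minimize

# Toy data: repeated measurements of a single quantity (hypothetical values)
x = np.array([10.2, 9.8, 10.5, 10.1, 9.9])
sigma_stat = 0.3   # per-point statistical uncertainty
sigma_sys = 0.5    # separately determined bound on a common offset

def chi2(params):
    mu, delta = params  # delta = common systematic offset (nuisance parameter)
    # Standard chi^2 term plus a Gaussian constraint (penalty) on the offset
    return np.sum(((x - mu - delta) / sigma_stat) ** 2) + (delta / sigma_sys) ** 2

res = minimize(chi2, x0=[x.mean(), 0.0])
mu_fit, delta_fit = res.x
print(f"mu = {mu_fit:.2f}, offset = {delta_fit:.2f}")
```

With data that are internally consistent, the fit pulls the offset towards zero; with data that prefer a shift, the offset can absorb it, at a χ² cost set by the bound σ_sys.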
Testing Models
Consider:

$$\chi^2 \equiv \sum_{i=1}^{n} g_i^2$$

where the $g_i$ are samples drawn from a normal (i.e. Gaussian) distribution of unit variance ($\sigma = 1$).

Then the distribution of this quantity defines a χ² (“chi-squared”) distribution with n degrees of freedom: the effective number of independent samples contributing to the variance.

The χ² probability density function for n degrees of freedom has the form:

$$P(\chi^2, n) = \frac{(\chi^2)^{\frac{n}{2}-1}\, e^{-\frac{\chi^2}{2}}}{2^{\frac{n}{2}}\, \Gamma\!\left(\frac{n}{2}\right)}$$

where $\Gamma(k) = (k-1)!$ if k is a positive integer, or $\Gamma(z) = \int_0^{\infty} x^{z-1} e^{-x}\, dx$ for any real value z.

Note that:

$$P(\chi^2, 2) = \frac{1}{2} e^{-\frac{\chi^2}{2}}$$

(will come back to this)

And the integral probability is given by:

$$P(>\chi^2, n) = 1 - \frac{\gamma\!\left(\frac{n}{2}, \frac{\chi^2}{2}\right)}{\Gamma\!\left(\frac{n}{2}\right)} \quad \text{where} \quad \gamma(z, \alpha) = \int_0^{\alpha} x^{z-1} e^{-x}\, dx$$

The probability to be outside of √4 = 2 standard deviations is 0.0455.
The probability to be outside of √1 = 1 standard deviation is 0.3173.
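These tail probabilities can be checked numerically; a quick sketch, assuming SciPy is available:

```python
from scipy.stats import chi2

# P(chi^2 > 4 | 1 DoF): probability of lying outside sqrt(4) = 2 standard deviations
print(chi2.sf(4, 1))   # ~0.0455
# P(chi^2 > 1 | 1 DoF): probability of lying outside 1 standard deviation
print(chi2.sf(1, 1))   # ~0.3173
```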
Pearson’s χ2 Test
So, for example, if we have a model, m, involving k free parameters (determined by a fit to the data) that seeks to predict the values, x, of n data points, each with normally distributed uncertainties, we can construct the sum:

$$S \equiv \sum_{i=1}^{n} \frac{(x_i - m_i)^2}{\sigma_{m_i}^2}$$

Dividing by $\sigma_{m_i}^2$ normalises things to give a Gaussian distribution with unit variance. For binned data (Poisson statistics), the variance equals the mean, so $\sigma_{m_i}^2 = m_i$.

For example, imagine fitting a straight line (2 parameters: slope and intercept) to a set of data. You can always force the line to go through 2 of the data points exactly, so only n − 2 of the data points will contribute to the variance around the model.

“If my model is correct, how often would a randomly drawn sample of data yield a value of χ² at least as large as this?”

Determining the best values for the model parameters by choosing them so as to minimise χ² is called the “Method of Least Squares.”

Note that, if you vary one of the model parameters from its best-fit value until χ² increases by 1, this represents the change in the model parameter associated with 1 unit of variance in the fit quality (i.e. the “1σ uncertainty” in the model parameter).
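A minimal sketch of the straight-line case (the data values here are invented, and unit uncertainties are assumed so that χ² reduces to the sum of squared residuals):

```python
import numpy as np

# Hypothetical data with unit uncertainties (sigma_i = 1 for simplicity)
x = np.array([0., 1., 2., 3., 4., 5.])
y = np.array([1.1, 2.9, 5.2, 6.8, 9.1, 10.9])

# Least-squares straight line: minimises chi^2 when sigma_i = 1
slope, intercept = np.polyfit(x, y, 1)
residuals = y - (slope * x + intercept)
chi2_min = np.sum(residuals ** 2)
dof = len(x) - 2   # 2 fitted parameters: slope and intercept
print(slope, intercept, chi2_min, dof)
```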
Example 1:

Say we have n measurements of some quantity, with each measurement having a different Gaussian uncertainty. What is the best estimate for the mean value of this quantity?

$$\chi^2 = \sum_{i=1}^{n} \frac{(x_i - \mu)^2}{\sigma_i^2}$$

Let’s find the value of µ that minimises this:

$$\frac{\partial \chi^2}{\partial \mu} = -2 \sum_{i=1}^{n} \frac{(x_i - \mu)}{\sigma_i^2} = -2 \sum_{i=1}^{n} \frac{x_i}{\sigma_i^2} + 2\mu \sum_{i=1}^{n} \frac{1}{\sigma_i^2} = 0$$

$$\mu_{best} = \frac{\sum_{i=1}^{n} x_i/\sigma_i^2}{\sum_{i=1}^{n} 1/\sigma_i^2} = \frac{\sum_{i=1}^{n} w_i x_i}{\sum_{i=1}^{n} w_i} \quad \text{where} \quad w_i \equiv \frac{1}{\sigma_i^2}$$
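A quick numerical check of the weighted mean (the measurement values are invented), including the 1σ uncertainty $\sigma_\mu = 1/\sqrt{\sum w_i}$ that follows from the Δχ² = 1 condition:

```python
import numpy as np

# Measurements of the same quantity with different Gaussian uncertainties
# (hypothetical values for illustration)
x = np.array([10.3, 9.9, 10.6])
sigma = np.array([0.2, 0.3, 0.6])

w = 1.0 / sigma ** 2
mu_best = np.sum(w * x) / np.sum(w)
sigma_mu = 1.0 / np.sqrt(np.sum(w))   # 1-sigma uncertainty on mu

def chi2(mu):
    return np.sum(((x - mu) / sigma) ** 2)

# Varying mu by one sigma_mu raises chi^2 by exactly 1 (chi^2 is quadratic in mu)
print(mu_best, sigma_mu, chi2(mu_best + sigma_mu) - chi2(mu_best))
```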
Example 2:

A newly commissioned underground neutrino detector sees a rate of internal radioactive contamination decreasing as a function of time. Measurements of the number of such events observed are taken on 10 consecutive days. Determine the best-fit mean decay time in order to determine the source of the contamination.

Decay probability:

$$P(t) = \frac{1}{t_o} e^{-t/t_o}$$

where $t_o$ is the mean decay lifetime. The model is normalised to the total number of events observed in 10 days, and the integral over each bin is often approximated by the average.

How many degrees of freedom?

• 10 independent data points
• fit parameter $t_o$
• but the normalisation is also based on the observed data (for a single bin, the variance would be zero)

DoF = 10 − 2 = 8

How good is the fit?

$$P(\chi^2 > 11.34 \,|\, 8 \text{ DoF}) = 0.18$$
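The quoted p-value can be reproduced directly from the χ² survival function (a sketch assuming SciPy):

```python
from scipy.stats import chi2

# Goodness of fit for the decay-time example: chi^2 = 11.34 with 8 DoF
p = chi2.sf(11.34, 8)
print(f"P(chi^2 > 11.34 | 8 DoF) = {p:.2f}")   # ~0.18
```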

The best fit corresponds to ²²²Rn, with a mean lifetime of 5.51 days. (Figure: χ² as a function of $t_o$, showing the best-fit value and the ~1σ range on either side where χ² increases by 1.)
“Chi by eye”

(Figure: data points with error bars compared against a model curve; each residual is read off in units of its error bar.)

$$\chi^2 \sim (0.3)^2 + (0.1)^2 + (1.2)^2 + (1.7)^2 + (0.3)^2 + (0.8)^2 + (0.8)^2 = 5.8$$

Degrees of Freedom = 7 − 2 = 5

NOTE: This doesn’t tell you which model is correct, but it can tell you which models don’t fit well!

Quoted chance probability based on correlation analysis: 6 × 10⁻⁴
Twenty repeated measurements, x:

1705, 1712, 1693, 1715, 1778, 1756, 1681, 1707, 1712, 1710,
1721, 1717, 1731, 1710, 1777, 1693, 1747, 1690, 1692, 1722

Avg (m) = 1718.45

$$\chi^2 = \sum_{i=1}^{20} \frac{(x_i - 1718.45)^2}{1718.45} = 8.22$$
The “Want ’em” Effect
DoF = 20 – 1 = 19 (98.4% chance of getting something larger)
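This “too good to be true” χ² can be verified from the numbers above (a sketch assuming SciPy; the denominator is the mean because the counts are Poisson-distributed, with variance equal to the mean):

```python
import numpy as np
from scipy.stats import chi2

counts = np.array([1705, 1712, 1693, 1715, 1778, 1756, 1681, 1707, 1712, 1710,
                   1721, 1717, 1731, 1710, 1777, 1693, 1747, 1690, 1692, 1722])
mean = counts.mean()                              # 1718.45
chi2_val = np.sum((counts - mean) ** 2 / mean)    # Poisson: variance = mean
p_larger = chi2.sf(chi2_val, len(counts) - 1)     # DoF = 20 - 1 = 19
print(mean, chi2_val, p_larger)
```

A scatter this far below the Poisson expectation suggests the measurements were (perhaps unconsciously) steered towards the expected value.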
Scientific Method:

(Flowchart, starting from a cloud of candidate models)

Model (simplest and most predictive)
→ Test for reproducible predictions
→ Rejected with high confidence: move on to the next simplest & most predictive model, and repeat
→ Not rejected with high confidence: continue testing

A theory is judged not on what it can explain, but on what it can reproducibly predict!

We don’t prove models correct; we reject those models that are wrong!
Don’t state that data are “consistent”
with a given model, but rather that they
are “not inconsistent.”
Student’s t
(Often misinterpreted as referring to being from or for “a student,” rather than the fact that the name of the author happens to be “Student.”)
(Except this was actually a pseudonym used by William Sealy Gosset in his 1908 paper, who was couching himself as “a student”!)

Recall that the rms deviation in the estimated mean from a set of n samples is given by:

$$\sigma_m = \frac{\sigma}{\sqrt{n}}$$

where σ is the rms of the full distribution. But what if we don’t know σ a priori and all we have are the sampled estimators?

$$t \equiv \frac{\bar{x} - \mu}{s/\sqrt{n}} \quad \text{where} \quad \bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i \,, \qquad s^2 = \frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})^2$$
We want to find the distribution of

$$t \equiv \frac{\bar{x} - \mu}{s/\sqrt{n}}$$

for ν = # degrees of freedom. As you would expect, this approaches the shape of a Gaussian distribution as the sample size grows.
Pearson Correlation Coefficient
A test of linear correlation between two sets of data:

$$r_{xy} = \frac{1}{n-1} \sum_{i=1}^{n} \left(\frac{x_i - \bar{x}}{s_x}\right) \left(\frac{y_i - \bar{y}}{s_y}\right)$$

This is just the covariance normalised by the sample rms deviations.

The value of this quantity runs from 1 (completely correlated) to −1 (completely anti-correlated), with zero indicating no correlation. This statistic provides a relative measure of linear correlation but, in general, the probability distribution for r will depend on the distributions of x and y.

IF x and y are uncorrelated and each drawn from a normal distribution (such that, jointly, they can be described by a 2-D Gaussian), then:

$$\sigma_r = \sqrt{\frac{1 - r^2}{n - 2}}$$

(n − 2 DoF for the 2 free parameters in a linear fit)

From which it is possible to define a t statistic for r:

$$t_r = r \sqrt{\frac{n-2}{1-r^2}}$$
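A short numerical illustration (the data are invented, nearly linear values):

```python
import numpy as np
from scipy.stats import pearsonr

# Hypothetical, strongly correlated data
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([1.1, 2.1, 2.9, 4.2, 4.8, 6.1])

r, p = pearsonr(x, y)
t_r = r * np.sqrt((len(x) - 2) / (1 - r ** 2))   # t statistic for r
print(r, p, t_r)
```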
Spearman Rank-Order Correlation Coefficient
A non-parametric test of correlation between two sets of data (i.e. linearity is not assumed).

Define $R_i$ as the ‘rank’ of $x_i$ (i.e. the numerical position in an ordered list of the n data points from lowest to highest x value). Define $S_i$ as the ‘rank’ of $y_i$ (likewise, ordered from lowest to highest y value).

Note: it’s possible to have identical ranks if a data set contains multiple identical values! In that case you should ascribe to each of these an ‘average’ rank value.

Then define the rank coefficient as:

$$r = \frac{1}{n-1} \sum_{i=1}^{n} \left(\frac{R_i - \bar{R}}{s_R}\right) \left(\frac{S_i - \bar{S}}{s_S}\right) = \frac{\sum_i (R_i - \bar{R})(S_i - \bar{S})}{\sqrt{\sum_i (R_i - \bar{R})^2 \sum_i (S_i - \bar{S})^2}}$$

Similarly, the probability distribution can be approximated by the t statistic:

$$t_r = r \sqrt{\frac{n-2}{1-r^2}}$$

This approximation is generally pretty good, and it no longer depends on the actual distributions of x & y.
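A small illustration of the non-parametric point (invented data): a monotonic but decidedly non-linear relation still gives a perfect rank correlation.

```python
import numpy as np
from scipy.stats import spearmanr

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.exp(x)                  # strictly increasing, so the ranks match exactly

rho, p = spearmanr(x, y)
print(rho)                     # 1.0: perfect rank-order correlation
```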
Kolmogorov-Smirnov (and the like)
A non-parametric test of distributions.

Plot the cumulative fraction of events less than or equal to a particular value of x as a function of x, along with the cumulative distribution for some model. For example: is this data Normally distributed?

data point   x value   fraction ≤ x
    1          10.1        0.1
    2          10.7        0.2
    3          11.2        0.3
    4          14.6        0.4
    5          15.1        0.5
    6          16.3        0.6
    7          16.5        0.7
    8          18.8        0.8
    9          24.2        0.9
   10          27.9        1.0

$\bar{x} = 16.54$, $s_x = 5.8$

The data (a cumulative step function) are compared against the integral Gaussian for mean = $\bar{x}$ and σ = $s_x$.

Equivalently, the comparison can be drawn over a more clearly defined span on the x-axis, with a visually simple model expectation that is independent of the test distribution. The maximum deviations here are:

D+ = 0.2
D− = 0.13

There are several statistics that can be used to assess the level of agreement:

K-S statistics (Clustering):
D+ = maximum positive deviation from the model line
D− = maximum negative deviation from the model line
D = max(D+, D−)
V = D+ + D− (Kuiper test)

Cramér-von Mises (Variance):

$$W^2 = \sum_{i=1}^{n} \left(y_i - \frac{2i-1}{2n}\right)^2 + \frac{1}{12n}$$

$$U^2 = W^2 - n\left(\bar{y} - \frac{1}{2}\right)^2$$

In general, the probability distributions for these statistics need to be determined by Monte Carlo calculations. However, for continuous variables tested against a well-defined model distribution under the null hypothesis, tables and approximate parameterisations exist to obtain p-values.
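A sketch of the comparison above using `scipy.stats.kstest` (note: because the mean and σ are estimated from the same data, the returned p-value is biased towards agreement; a Lilliefors-type correction or a Monte Carlo calculation would be needed for a proper p-value):

```python
import numpy as np
from scipy.stats import kstest

# The 10 data points from the example above
data = np.array([10.1, 10.7, 11.2, 14.6, 15.1, 16.3, 16.5, 18.8, 24.2, 27.9])
mean, s = data.mean(), data.std(ddof=1)   # 16.54, ~5.8

# Compare against the Gaussian CDF with the fitted parameters
result = kstest(data, 'norm', args=(mean, s))
print(result.statistic)   # D = max(D+, D-), ~0.2 as read off the plot
```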

(Table: for each “high tail” test statistic T, a modified test statistic T* and an approximate parameterisation for P(T* > z) are tabulated. Note the trial factor for choosing the best of D+ and D−!)

M.A. Stephens, Journal of the Royal Statistical Society, Series B (Methodological), Vol. 32, No. 1 (1970), pp. 115-122.
