Phy3004w Poisson Statistics Practical UCT


PHY3004W Poisson Statistics

Chloe Sole, SLXCHL001


May 1, 2017

Abstract
Using a Geiger-Muller counter we measured the number of counts per 10 seconds of radiation emitted by a $^{60}$Co source.
From this data we plotted the running mean for multiple samples with various arithmetic means. Following this,
the frequency distributions were plotted and the values we obtained were compared to the expected Poisson distribution. We
found that our data can be described by a Poisson distribution.

Introduction and Theory


A Geiger-Muller detector purely counts gamma rays from our $^{60}$Co source and loses all information about the energies of
the gamma rays [1]. We fit a Poisson statistical model to the data we obtain to determine whether or not radiation
from our source can be statistically described by a Poisson distribution. Our data consisted of 100 trials at a particular
distance from our source, chosen to obtain a mean count rate per 10 seconds within a desired range. Before analyzing any of
the data, we recognise that background radiation skews the data and hence we must remember to correct for this.
To determine the running mean of a set of data we used equation 1 [2].
$$r_c(j) = \bar{x}_j = \frac{1}{j}\sum_{i=1}^{j} x_i \qquad (1)$$
The final value of the running mean should be, by definition, the arithmetic mean, which is defined by equation 2. The
uncertainty of the running mean values is defined as $\sqrt{r_c(j)/j}$ [2]. We will show that the final uncertainty
value is very similar to the standard uncertainty of the mean, which is calculated by $\sqrt{s^2/N}$, where $s^2$ is the
variance and $N$ is the total number of data points in the sample (the number of trials) [3].
$$\bar{x} = \frac{1}{N}\sum_{i=1}^{N} x_i \qquad (2)$$
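As a minimal sketch of equation 1, the running mean and its uncertainty $\sqrt{r_c(j)/j}$ could be computed with NumPy as shown below. The function name `running_mean` and the sample counts are illustrative assumptions, not the code used for this report, and the sketch assumes the running mean stays non-negative so that the square root is defined.

```python
import numpy as np

def running_mean(counts):
    """Return r_c(j) and its uncertainty sqrt(r_c(j)/j) for j = 1..N."""
    counts = np.asarray(counts, dtype=float)
    j = np.arange(1, counts.size + 1)   # number of trials included so far
    r_c = np.cumsum(counts) / j         # equation 1
    u_r = np.sqrt(r_c / j)              # assumes r_c >= 0
    return r_c, u_r

# Hypothetical counts per 10 seconds, for illustration only.
example_counts = [72, 68, 75, 70, 66, 74, 71, 69]
r_c, u_r = running_mean(example_counts)
# The last entry of r_c is the arithmetic mean of equation 2.
```

Because the uncertainty scales as $1/\sqrt{j}$, the first few entries of `u_r` are much larger than the last ones.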
To calculate the variance of the data we use equation 3, where $s$ is the standard deviation [2]:
$$s^2 = \frac{1}{N-1}\sum_{i=1}^{N} (x_i - \bar{x})^2 \qquad (3)$$
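To make the comparison between the two uncertainty estimates concrete, the sketch below computes the sample variance of equation 3 together with $\sqrt{s^2/N}$ and $\sqrt{\bar{x}/N}$ (the final running mean uncertainty). For Poissonian data the variance equals the mean, which is why the two are expected to be close. The helper name `mean_uncertainties` and the sample data are assumptions for illustration.

```python
import numpy as np

def mean_uncertainties(counts):
    """Return mean, sample variance, sqrt(s^2/N) and sqrt(mean/N)."""
    x = np.asarray(counts, dtype=float)
    n = x.size
    mean = x.mean()                    # equation 2
    s2 = x.var(ddof=1)                 # equation 3: divides by N - 1
    sem = np.sqrt(s2 / n)              # standard uncertainty of the mean
    poisson_sem = np.sqrt(mean / n)    # final running mean uncertainty
    return mean, s2, sem, poisson_sem

# Hypothetical data, for illustration only.
mean, s2, sem, poisson_sem = mean_uncertainties([2, 4, 4, 4, 5, 5, 7, 9])
```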
We use this to calculate the standard error of the mean as mentioned above. We also need to calculate the variance
of the variance, which can be done in multiple ways. The simplest is to use the expectation value of the sample variance
minus the mean squared, equation 4, which assumes Poissonian data. Another method is to compute the variance of the variance
from the 4th moment of the data, equation 5 [1].

$$E[s^2 - \bar{x}^2] = \frac{2N\bar{x}^2 + (N-1)\bar{x}}{N(N-1)} \qquad (4)$$
$$V[s^2] = \frac{1}{N}\left(m_4 - \frac{N-3}{N-1}s^4\right) \qquad (5)$$
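The two estimates in equations 4 and 5 can be sketched as follows. The function names are assumptions for illustration, and $m_4$ is taken to be the fourth central moment of the sample. With $N = 100$ and $\bar{x} = 70.44$, the first function gives about 100.94, the value listed for the 70-80 data set in table 3.

```python
import numpy as np

def var_of_var_poisson(mean, n):
    """Equation 4: variance of s^2 assuming Poissonian data."""
    return (2 * n * mean**2 + (n - 1) * mean) / (n * (n - 1))

def var_of_var_moment(counts):
    """Equation 5: variance of s^2 from the fourth central moment."""
    x = np.asarray(counts, dtype=float)
    n = x.size
    s2 = x.var(ddof=1)                     # equation 3
    m4 = np.mean((x - x.mean()) ** 4)      # fourth central moment (assumed definition)
    return (m4 - (n - 3) / (n - 1) * s2**2) / n

v_poisson = var_of_var_poisson(70.44, 100)
```

The uncertainty of the variance is then the square root of either result.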
Here $m_4$ is the fourth central moment of the data. If our data is Poissonian these two should yield similar results. To
then calculate the uncertainty of the variance we simply take the square root of $E[s^2 - \bar{x}^2]$ or $V[s^2]$. To ensure
we are propagating uncertainties correctly we use equation 6 from our first year measurement manual [3].

$$u(R) = R\sqrt{\left\{a\,\frac{u(A)}{A}\right\}^2 + \left\{b\,\frac{u(B)}{B}\right\}^2} \qquad (6)$$
where $R = cA^aB^b$. For the Poisson frequency plots we needed to bin the data, ensuring that most of our bins had a
frequency greater than 5. We estimated the uncertainty in the frequency in a bin by $\sqrt{n_i(1 - n_i/N)}$, where $n_i$ is
the number of trials in bin $i$ and $N$ is the total number of trials. $\chi^2$ is typically the parameter which determines
how good a fit we have for a model fitted to data. We shall be performing a $\chi^2$/ddof analysis on our Poisson fitted
plots, where ddof is the degrees of freedom, typically the number of points minus the number of free parameters. In our case
the degrees of freedom is the number of bins minus 1. $\chi^2$ is defined by equation 7 [4], [5].

$$\chi^2 = \sum_{i=1}^{N_{bins}} \frac{(O_i - E_i)^2}{u_i^2} \qquad (7)$$
where the sum runs over the bins, $O_i$ is the observed frequency, $E_i$ is the expected frequency of a Poisson distribution
and $u_i$ is the uncertainty in $O_i$.
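The goodness of fit calculation of equation 7 can be sketched as below, using the binomial bin uncertainty $\sqrt{n_i(1 - n_i/N)}$ described above and expected frequencies from the Poisson distribution. All function names are assumptions for illustration. The sketch also assumes integer counts, integer bin edges with each bin covering [lo, hi), and no bin that is empty or contains every trial, since the bin uncertainty would then be zero.

```python
import math
import numpy as np

def poisson_pmf(k, mean):
    """P(k; mean) = mean^k exp(-mean) / k!, evaluated in log space."""
    return math.exp(k * math.log(mean) - mean - math.lgamma(k + 1))

def expected_frequencies(bin_edges, mean, n_trials):
    """Expected number of trials in each bin [lo, hi) for integer counts."""
    return np.array([
        n_trials * sum(poisson_pmf(k, mean) for k in range(lo, hi))
        for lo, hi in zip(bin_edges[:-1], bin_edges[1:])
    ])

def reduced_chi2(observed, expected, n_trials):
    """Return chi^2 (equation 7) and chi^2/ddof with ddof = bins - 1."""
    obs = np.asarray(observed, dtype=float)
    exp = np.asarray(expected, dtype=float)
    u2 = obs * (1.0 - obs / n_trials)    # squared bin uncertainty, needs 0 < obs < N
    chi2 = np.sum((obs - exp) ** 2 / u2)
    ddof = obs.size - 1
    return chi2, chi2 / ddof

# Hypothetical binned frequencies, for illustration only.
chi2, chi2_per_ddof = reduced_chi2([10, 20, 40, 20, 10], [12, 18, 40, 22, 8], 100)
```

A value of $\chi^2$/ddof near 1 indicates that the scatter of the data about the model is comparable to the estimated uncertainties.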
After producing these plots we found the number of bins that agreed with their Poisson fit, $n_a$, over the total number of
bins for that plot, $N_{bins}$, and estimated the uncertainty in this fraction in a similar way to the uncertainty in the
frequency per bin [2], using equation 8.
$$\sigma_{n_a} = \sqrt{n_a\left(1 - \frac{n_a}{N_{bins}}\right)} \qquad (8)$$
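A minimal sketch of this step is given below. Two points are assumptions rather than statements from the report: a bin is taken to "agree" when the expected frequency lies within one bin uncertainty of the observed frequency, and the uncertainty of the fraction is obtained by dividing equation 8 by $N_{bins}$. The function name and sample frequencies are hypothetical.

```python
import numpy as np

def agreement_fraction(observed, expected, n_trials):
    """Return the fraction of agreeing bins and its estimated uncertainty."""
    obs = np.asarray(observed, dtype=float)
    exp = np.asarray(expected, dtype=float)
    u = np.sqrt(obs * (1.0 - obs / n_trials))   # bin uncertainty
    n_a = int(np.sum(np.abs(obs - exp) <= u))   # assumed agreement criterion
    n_bins = obs.size
    u_na = np.sqrt(n_a * (1.0 - n_a / n_bins))  # equation 8
    return n_a / n_bins, u_na / n_bins          # assumed scaling to the fraction

# Hypothetical binned frequencies, for illustration only.
fraction, u_fraction = agreement_fraction(
    [10, 20, 40, 20, 10], [14, 18, 40, 25, 8], 100
)
```

Note that equation 8 returns zero when every bin agrees, so the estimate carries no information in that case.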

Experimental Setup

Figure 1: The experimental setup used to obtain a count rate of between 70 and 80 counts per 10 second interval. (a) The
voltage set to 600 V and a time interval of 10 seconds. (b) The setup of the detector and the source.

Each group of students had a different distance between the source and detector in order to obtain different ranges of count
rate per 10 seconds. Our Geiger-Muller counter was connected to the mains and set up to detect a count rate per 10 seconds
of between 70 and 80. The voltage was set to 600 V. We are aware that the voltage of the Geiger counter drifts and this may
cause discrepancies in the data. 100 trials were taken per range of count rates per 10 seconds.

Results and Discussions
To start our analysis we subtracted the average background count, 3.85 counts per 10 seconds, from each trial in each data
set. This ensured our running means and calculations after this point aren't biased or skewed by radiation not from our
source. The running mean plots were produced using Python and equation 1. The blue line represents the mean to which the
data set converges. The uncertainty of the running mean was calculated using $\sqrt{r_c(j)/j}$. Figures 2 and 3 together
show the running means of 4 different data sets and of the background radiation. From these figures we can see that our
running means do converge to the arithmetic mean; this is because our estimator is unbiased and over 10 seconds the counts
detected were all within a reasonable range of our data set's mean. We can clearly see that the uncertainty in our running
mean is an estimate of the standard error of the mean. We note that this agrees with the fact that the standard error of the
mean requires a large data set in order to be effective. This explains why the error bars are so large for small values of
j.

Figure 2: The running means of 3 other data sets and the background radiation. (a) Data set with count rates between 40 and
50 counts per 10 seconds and an arithmetic mean of $\bar{x} = 38.10$. (b) Data set with count rates between 60 and 70 counts
per 10 seconds and an arithmetic mean of $\bar{x} = 62.13$. (c) The running mean of the measured background radiation
converges to the arithmetic mean of $\bar{x} = 3.85$. (d) Data set with count rates between 90 and 100 counts per 10 seconds
and an arithmetic mean of $\bar{x} = 96.36$.

Table 1: The standard error of the mean, $\sigma_{\bar{x}}$, calculated using $\sqrt{s^2/N}$ and using the value of the final
running mean uncertainty, $\sqrt{r_c(j)/j}$.

Data Set      $\sqrt{s^2/N}$    $\sqrt{r_c(j)/j}$
Background    0.21              0.20
40-50         0.71              0.63
60-70         0.82              0.79
70-80         0.85              0.84
90-100        0.90              0.98

Looking at table 1 we can see that the calculated uncertainty in the mean is very similar for both methods. From
here forward we shall use the uncertainty in the mean as calculated by the square root of the variance over the total number
of trials, N.

Figure 3: The running mean of the data set of 100 trials with a count rate of between 70 and 80 counts per 10 seconds. This
running mean converges to the arithmetic mean of the data set, $\bar{x} = 70.44$.

Table 2: The calculated averages with their uncertainties, calculated using $\sqrt{s^2/N}$ and $V[s^2] = E[s^2 - \bar{x}^2]$.

Data Set      $\bar{x}$    $\sigma_{\bar{x}}$    $s^2$     $\sigma_{s^2}$
Background    3.85         0.21                  4.488     0.058
0-10          3.73         0.28                  8.044     0.056
10-20         8.93         0.31                  9.51      0.13
20-30         21.98        0.45                  19.90     0.32
30-40         31.64        0.58                  34.19     0.45
40-50         38.10        0.71                  50.95     0.55
50-60         52.42        0.78                  60.54     0.75
60-70         62.43        0.82                  67.28     0.89
70-80         70.44        0.85                  72.8      1.0
90-100        96.36        0.90                  81.3      1.4

Table 2 lists, per data set, the mean (equation 2) and its uncertainty, $\sqrt{s^2/N}$, along with the associated variance
(equation 3) and its uncertainty, the square root of equation 4. Looking at these values we note a general trend of
increasing uncertainty with increasing average counts, but we also note that it increases considerably slower than the
counts per 10 seconds; hence at the higher count rates the uncertainty is much smaller in comparison to the value of the
mean. At the smaller mean values we note that there is only one order of magnitude difference between the uncertainty of the
mean and the mean. For the data set of 0-10 counts per 10 seconds we noted that when we removed the background radiation we
obtained some negative counts. This is physically impossible. We explain this by the fact that, since our background counts
per 10 seconds fall between 0 and 10 counts, the data set with the source present and a count rate of 0-10 counts per 10
seconds is virtually indistinguishable from the background radiation.
We chose to take the variance of the variance as equal to the expectation value of the variance minus the mean squared, as
it yields, on average, a slightly smaller uncertainty. The results of using the expectation value and using equation 5 can
be seen in table 3.
From here we binned our data with respect to count rate per 10 seconds, ensuring that 80% of the bins had more than 5 trials
with that count per 10 seconds. This allowed us to estimate the uncertainty in the frequency of trials with a count rate
in bin i, as none of the bins were empty. Having 80% of the bins with a trial frequency of greater than 5 makes sure that
when we do a $\chi^2$/ddof analysis we get an accurate result. Refer to figure 4 for the frequency plots of the binned count
rates per 10 seconds. To determine how well the Poisson distribution, of 100 trials and the same mean, fits our data we
perform a $\chi^2$/ddof analysis. We do this by using equation 7 and dividing by the number of bins minus one. Using Python
to do the calculations we get the results found in table 4. We can see from figure 4 that the data set with the mean count
rate per 10 seconds of 62.13 fits its corresponding Poisson distribution the best. From our goodness of fit analysis we find
that the value we obtain for this data set implies that we have too many parameters; this interpretation comes from UCT's
2016 second year physics Python practical 3 [4].

Figure 4: The frequency plots of the count rates per 10 seconds of 100 trials having the desired mean. The Poisson
distribution of 100 trials at the same mean was plotted as the solid line on top of the data points. Poisson distribution:
$P(x, \bar{x}) = \frac{\bar{x}^x e^{-\bar{x}}}{x!}$. (a) Data set with an arithmetic mean of $\bar{x} = 38.10$ counts per 10
seconds. (b) Data set with an arithmetic mean of $\bar{x} = 62.13$ counts per 10 seconds. (c) The frequency plot of the data
set of 100 trials with a count rate of between 70 and 80 counts per 10 seconds. (d) Data set with count rates between 90 and
100 counts per 10 seconds and an arithmetic mean of $\bar{x} = 96.36$.

Table 3: The results from the different methods of calculating the variance of the sample variance.

Data Set    $E[s^2 - \bar{x}^2]$    $V[s^2]$    $\sigma_E$    $\sigma_V$
40-50       29.70                   79.61       0.54          0.82
60-70       79.36                   70.20       0.89          0.84
70-80       100.94                  90.78       1.0           0.95
90-100      188.54                  168.22      1.4           1.3

Table 4: The calculated goodness of fit analysis of our Poisson plots.

Data Set    $\chi^2$    $\chi^2$/ddof
40-50       3.79        0.76
60-70       0.78        0.157
70-80       4.92        0.98
90-100      4.94        0.99

From our values in table 2 we complete a check to see if our data can be described by a Poisson distribution. If our data
is Poissonian then $s^2/\bar{x} = 1$. To ensure we have the correct uncertainty for $s^2/\bar{x}$ we use equation 6 for the
propagation of the uncertainties. We then plot $s^2/\bar{x}$ against $\bar{x}$, refer to figure 5a. From this figure we can
see that all the data sets agree with unity except two values. This still means that 80% of the data sets agree with
$s^2/\bar{x} = 1$, which implies that the radiation from our $^{60}$Co source can be described by the Poisson distribution.
Our next test of whether our data matches a Poisson distribution is the plot of the fraction of bins that agree with the
Poisson distribution, figure 4, against $\bar{x}$. We calculated the fraction of bins that agreed with the Poisson
distribution for the 10 data sets, background included, and plotted these values with their estimated uncertainty, refer to
figure 5b. We know that the expected value is 0.68 because, if our error bars truly represent the 68% probability that
the data point lies in that range, then there is a 68% probability that the Poisson prediction actually agrees with our data.
Referring to figure 5b we see that our data agrees with the Poisson distribution; however, our uncertainties are very large.
We note that the estimate for the uncertainty is unreliable if $n_a < N_{bins}$, which is the case for the majority of our
plots as not every single point agreed with that data set's associated Poisson curve.
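The propagation step for the ratio $s^2/\bar{x}$ can be sketched as below: equation 6 with $a = 1$ and $b = -1$ gives the uncertainty of a quotient. The function names, the input values and the one-uncertainty criterion for agreement with unity are all assumptions for illustration, not values or code from this report.

```python
import math

def propagate_ratio(numerator, u_num, denominator, u_den):
    """Equation 6 for R = A / B, i.e. exponents a = 1 and b = -1."""
    ratio = numerator / denominator
    u_ratio = abs(ratio) * math.sqrt(
        (u_num / numerator) ** 2 + (u_den / denominator) ** 2
    )
    return ratio, u_ratio

def agrees_with_unity(ratio, u_ratio):
    """Assumed criterion: unity lies within one uncertainty of the ratio."""
    return abs(ratio - 1.0) <= u_ratio

# Hypothetical variance and mean with their uncertainties.
ratio, u_ratio = propagate_ratio(70.0, 10.0, 68.0, 0.8)
poissonian = agrees_with_unity(ratio, u_ratio)
```

The sign of the exponent $b$ drops out because each term is squared, so a product and a quotient propagate in the same way.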

Figure 5: (a) The ratio $s^2/\bar{x}$ plotted against $\bar{x}$ for each data set. (b) The fraction of bins that agree with
the Poisson distribution plotted against $\bar{x}$.

Python code for all the calculations and plots will be attached at the end.

Conclusions
From our analysis we can see that our data can be described by a Poisson distribution. From figures 4, 5a and 5b we can see
that the majority of the data agrees with a Poisson distribution.

References
[1] Will Horowitz. Poisson part 1,2, 2016.
[2] Trisha Salagaram. Poisson analysis lecture, 2017.
[3] Andy Buffler, Saalih Allie, Fred Lubben, and Bob Campbell. Introduction to Measurement in the Physics Laboratory. UCT,
3.6 edition, 2010.
[4] Spencer Wheaton. Python activity 3, 2016.
[5] G. F. Knoll. Radiation Detection and Measurement, chapter 3. Wiley, 3rd edition, 2000.
