04 Numerics
R Programming
Numerical Computation
Computer Arithmetic Is Not Exact
> macheps =
function()
{
    ## halve eps until 1 + eps/2 rounds to 1
    eps = 1
    while(1 + eps/2 != 1)
        eps = eps/2
    eps
}
> macheps()
[1] 2.220446e-16
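R records machine epsilon in the constant .Machine$double.eps, which agrees with the loop above:
> .Machine$double.eps
[1] 2.220446e-16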
Floating-Point Precision
> a = 12345678901234567890
> print(a, digits=20)
[1] 12345678901234567168
The largest value representable as a double is approximately 1.797693 × 10^308.
> x = 1+1.234567890e-10
> print(x, digits = 20)
[1] 1.0000000001234568003
> y = x - 1
> print(y, digits = 20)
[1] 1.234568003383174073e-10
Subtracting 1 exposes the cancellation: only the leading digits of y agree with the true value 1.234567890e-10.
The function
f(x) = x^7 - 7x^6 + 21x^5 - 35x^4 + 35x^3 - 21x^2 + 7x - 1
is algebraically equal to (x - 1)^7, but evaluating the expanded form near x = 1 suffers from severe cancellation.
To check this we can compute and graph the function over the
range [.988, 1.012].
[Figure: the expanded polynomial evaluated over [.988, 1.012]; the computed values oscillate on a scale of about 1e-14.]
Problem Analysis
> x = .99
> y = c(x^7, - 7*x^6, + 21*x^5, - 35*x^4, +
35*x^3, - 21*x^2, + 7*x, - 1)
> cumsum(y)
[1] 9.320653e-01 -5.658296e+00 1.431250e+01
[4] -1.930837e+01 1.465210e+01 -5.930000e+00
[7] 1.000000e+00 -1.521006e-14
Example: The Sample Variance
The sample variance can be computed from the definition, s^2 = ∑(x_i - x̄)^2 / (n - 1), or from the algebraically equivalent one-pass form s^2 = (∑ x_i^2 - n x̄^2) / (n - 1). If the mean of the x_i is far from 0, then ∑ x_i^2 and n x̄^2 will be large and nearly equal to each other. The relative error which results from applying the right-hand side formula can be very large.
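A minimal sketch of the effect (the data here are invented for illustration): shifting the data far from 0 leaves the true variance unchanged but can destroy the accuracy of the one-pass formula.
> x = rnorm(100) + 1e8
> (sum(x^2) - length(x) * mean(x)^2) / (length(x) - 1)  ## one-pass: inaccurate
> var(x)                                                ## centred computation: accurate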
The exponential function has the power series expansion
e^x = 1 + x + x^2/2! + x^3/3! + · · ·
and this series can be used as the basis for an algorithm for
computing e^x.
> expf =
function(x, n = 10)
1 + sum(x^(1:n) / cumprod(1:n))
> expf(1)
[1] 2.718282
The Exponential Function (Cont.)
Computing the terms with cumprod(x / 1:n) avoids forming the potentially huge values x^n and n! separately.
> expf =
function(x, n = 10)
1 + sum(cumprod(x / 1:n))
> expf(1)
[1] 2.718282
A better strategy is to keep adding terms until the sum stops changing:
> expf =
function(x)
{
    n = 0
    term = 1
    oldsum = 0
    newsum = 1
    ## add terms until the sum no longer changes
    while(newsum != oldsum) {
        oldsum = newsum
        n = n + 1
        ## each term comes from the previous one
        term = term * x / n
        newsum = newsum + term
    }
    newsum
}
Power Series – Positive x
For positive x every term in the series is positive and the algorithm works well. Why does it run into trouble for negative x?
Power Series – Reasons
For negative x the terms alternate in sign, so summing them produces cancellation and a potentially large relative error. The cure is to sum the series for |x| and use e^(-x) = 1/e^x.
> expf =
function(x)
{
## remember the sign and work with |x|
if ((neg = (x < 0))) x = -x
n = 0; term = 1
oldsum = 0; newsum = 1
while(newsum != oldsum) {
oldsum = newsum
n = n + 1
term = term * x / n
newsum = newsum + term
}
if (neg) 1/newsum else newsum
}
A Vectorised Algorithm
> expf =
function(x)
{
x = ifelse((neg = (x < 0)), -x, x)
n = 0; term = 1
oldsum = 0; newsum = 1
while(any(newsum != oldsum)) {
oldsum = newsum
n = n + 1
term = term * x / n
newsum = newsum + term
}
ifelse(neg, 1/newsum, newsum)
}
Vectorisation Results and Accuracy
> x = c(-20, -10, 0, 10, 20)
> expf(x)
[1] 2.061154e-09 4.539993e-05 1.000000e+00
[4] 2.202647e+04 4.851652e+08
> exp(x)
[1] 2.061154e-09 4.539993e-05 1.000000e+00
[4] 2.202647e+04 4.851652e+08
plot(x, y)
[Figure: scatterplot of y against x; x runs from 5 to 25, y from 0 to 120.]
plot(x, y, type = "l")
[Figure: the same data drawn as lines (type = "l").]
plot(x, y, type = "b")
[Figure: the same data drawn as both points and lines (type = "b").]
Color Control In R
[Figure: demonstration chart of settings numbered 1 through 25, in rows of five.]
Line Customisation
[Figure: Stopping Distance (ft) against Speed (mph); x from 5 to 25, y from 0 to 120.]
(Note that this is very old data)
Customising Plot Axes
[Figure: plot of y against x with customised axes; both axes run from -1 to 1.]
Root Finding
a_n x^n + · · · + a_1 x + a_0
x^2 + 2x + 1 = 0
6x^5 + 5x^4 + 4x^3 + 3x^2 + 2x + 1 = 0
The roots of the last equation can be found with polyroot as follows.
> polyroot(1:6)
[1] 0.2941946+0.6683671i
[2] -0.3756952+0.5701752i
[3] -0.3756952-0.5701752i
[4] 0.2941946-0.6683671i
[5] -0.6703320+0.0000000i
[Figure: graph of 1 + 2x + 3x^2 + 4x^3 + 5x^4 + 6x^5 for x from -1.5 to 1.0, crossing zero near x = -0.67.]
An R Function for General Root-Finding
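uniroot searches an interval for a zero of a function. The output below is consistent with a call of roughly this form (the target function and interval are assumptions based on the figure that follows):
> uniroot(function(x) pnorm(x) - 0.975,
          c(1, 3))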
$f.root
[1] 1.128129e-07
$iter
[1] 7
$estim.prec
[1] 6.103516e-05
Returned Value – More Accuracy
$f.root
[1] 0
$iter
[1] 8
$estim.prec
[1] 0.0001787702
[Figure: graph of pnorm(x) - 0.975 for x from 1.0 to 3.0, crossing zero near x = 1.96.]
Finding Minima and Maxima
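One-dimensional minima and maxima can be found with optimize. The output below matches a call along these lines (the search interval is an assumption):
> imax = optimize(sin, c(0, pi), maximum = TRUE)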
$objective
[1] 1
> pi/2
[1] 1.570796
[Figure: graph of sin(x) for x from 0 to 3, with the maximum at x = π/2.]
Interpolation
– Interpolation: s(x_i) = y_i, i = 0, . . . , n.
– Continuity: s_{i-1}(x_i) = s_i(x_i), i = 1, . . . , n - 1.
– Smoothness: s′_{i-1}(x_i) = s′_i(x_i) and s″_{i-1}(x_i) = s″_i(x_i), i = 1, . . . , n - 1.
[Figure: monthly temperatures, Degrees C (12 to 20) against Month.]
Interpolating Temperatures
> xm = 1:13 + .5    ## 13 time points spanning a full year
> ym = c(20.1, 20.3, 19.4, 16.9, 14.3,
         12.3, 11.4, 12.1, 13.5, 14.8,
         16.7, 18.5, 20.1)    ## first and last values equal, as "periodic" requires
> f = splinefun(xm, ym, method = "periodic")
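splinefun returns a function, so the fitted spline can be evaluated anywhere. A sketch of how the curve below might be drawn (the grid is an assumption):
> t = seq(2, 12, length.out = 200)
> plot(t, f(t), type = "l",
       xlab = "Month", ylab = "Temperature")
> points(xm, ym)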
[Figure: the periodic spline through the monthly temperatures; Temperature (12 to 18) against x (2 to 12).]
Heating and Cooling Rates
[Figure: rate of temperature change, roughly -3 to 2 degrees per month, plotted against Month (2 to 12).]
The Hottest Day
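The values below are consistent with optimize calls of roughly this form applied to the spline (the search intervals are assumptions):
> imax = optimize(f, c(1.5, 4), maximum = TRUE)
> imin = optimize(f, c(6, 9))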
$objective
[1] 20.37027
> 28 * (imax$maximum - 2)
[1] 3.171179
$objective
[1] 11.39856
Which means
nθ = ∑_{i=1}^{n} x_i, or θ = x̄.
Optimisation of General Functions
[Figure: the objective function (values roughly 3.0 to 5.0) plotted against θ.]
Optimisation in R
Let's look for the value which minimises the sum of squared
deviations of the values 0.06, 0.36, 0.59, 0.73, 0.96, 0.22,
0.85, 0.05, 0.09 around a given point.
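A sketch of the computation with optimize, consistent with the output below (the search interval is an assumption):
> y = c(0.06, 0.36, 0.59, 0.73, 0.96,
        0.22, 0.85, 0.05, 0.09)
> optimize(function(theta) sum((y - theta)^2),
           c(0, 1))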
$objective
[1] 1.018622
Example: Least-Squares (Cont.)
> mean(y)
[1] 0.4344444
> median(y)
[1] 0.36
[Figure: Stopping Distance (ft) against Speed (mph) for the cars data.]
Example: Least Absolute Deviations
> Q = function(theta)
sum(abs(cars$dist - theta[1] -
theta[2] * cars$speed))
The starting point comes from the least-squares fit:
> theta = coef(lm(dist ~ speed, data = cars))
Now we call optim with this starting point and function.
> z = optim(theta, Q)
> z$convergence
[1] 0
Example: Least Absolute Deviations
> z$par
(Intercept) speed
-11.600009 3.400001
> theta
(Intercept) speed
-17.579095 3.932409
We can see the difference between the two fits with a plot.
Example: Least Absolute Deviations
[Figure: the cars data, Stopping Distance (ft) against Speed (mph), with both fitted lines drawn.]
Example: Women’s Fertility
F(t) = P(T ≤ t) = 1 - h^a / (h + t)^a.
The chi-square statistic, X^2 = ∑ (Obs - Exp)^2 / Exp, can be computed as follows.
> chi2 =
function(Obs, Exp)
sum((Obs - Exp)^2/Exp)
Example: Chi-Square Statistic Value
We can now apply this function to the Hutterite data and our
computed expected values.
> X2 = chi2(Hutterites,
342 * diff(F(breaks, 5, 10)))
> df = length(Hutterites) - 1
> 1 - pchisq(X2, df)
[1] 0
One way to look for such values is to search for the values of a
and h which make the chi-square statistic as small as possible.
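The results below match a two-parameter search of roughly this form (the starting values are assumptions, taken from the values used above):
> z = optim(c(5, 10),
        function(p) chi2(Hutterites,
            342 * diff(F(breaks, p[1], p[2]))))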
> z$par
[1] 3.126338 9.582163
> z$value
[1] 16.76438
Example: Degrees of Freedom
It might seem that the statistic should be compared with a chi-square distribution on length(Hutterites) - 1 degrees of freedom. In fact this is not the case, because we have used our data to estimate the best-fitting parameter values; each estimated parameter costs a further degree of freedom.
> f = function(t)
      F(t, z$par[1], z$par[2]) - .5
> uniroot(f, c(0, 24), tol = 1e-12)$root    ## median of the fitted distribution
[1] 2.378409
> f = function(t)
      F(t, z$par[1], z$par[2]) - .9
> uniroot(f, c(0, 24), tol = 1e-12)$root    ## 90th percentile
[1] 10.4315
Example: Graphing the Fitted Distribution
> f = function(t, a, h)
a * h^a / (h + t)^(a + 1)
[Figure: the fitted density, Density (0 to 0.30) against Time (Months) (0 to 35).]
Maximum Likelihood
Q(θ) = -l(θ)
An Example: The Normal Distribution
> Q =
function(theta)
-sum(log(dnorm(x, theta[1], theta[2])))
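The output below suggests a minimisation along these lines (the starting values are assumptions):
> z = optim(c(mean(x), sd(x)), Q)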
> z$par[2]
[1] 4.518694
> sd(x) * sqrt(9/10)
[1] 4.524195
Choice of Optimisation Method
so that BFGS has done a better job than the default (Nelder–Mead) method.
Graphing the Likelihood Surface
> Q2 =
function(theta1, theta2) {
## Recycle arguments
if (length(theta1) < length(theta2))
theta1 = rep(theta1,
length = length(theta2))
if (length(theta2) < length(theta1))
theta2 = rep(theta2,
length = length(theta1))
## Vector calculation
ans = numeric(length(theta1))
for(i in 1:length(ans))
ans[i] = -sum(log(dnorm(x, theta1[i],
theta2[i])))
ans
}
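Q2 can then be evaluated over a grid with outer and passed to contour; a sketch (the grid ranges are assumptions read off the figure below):
> theta1 = seq(16, 24, length.out = 50)
> theta2 = seq(3, 6, length.out = 50)
> contour(theta1, theta2,
          outer(theta1, theta2, Q2))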
Plotting the Log-Likelihood
[Figure: contour plot of Q2 over theta1 (16 to 24) and theta2 (3 to 6); contour levels run from about 30 to 42.]
Likelihood Profiles
[Figure: four panels showing Q against theta1 (16 to 24) for fixed theta2; the visible panel labels include theta2 = 5 and theta2 = 6.]
Grid Searches for Good Starting Values
> sd(x)/sqrt(length(x))
[1] 1.508065
Explaining the Different Standard Errors
> sqrt(diag(solve(z$hessian)))
[1] 1.430677 1.011642
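Note that optim only returns a Hessian when one is requested, so the call producing z must have looked something like this (the starting values are assumptions):
> z = optim(c(mean(x), sd(x)), Q, hessian = TRUE)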