
DEM4110 - Interpolation and Extrapolation - 2021

The document discusses various interpolation techniques used to estimate population figures between census periods. It defines interpolation as inferring intermediate values within a data series, compared to extrapolation which infers values beyond the series. Methods discussed include polynomial interpolation, exponential functions, and Aitken's iterative procedure.

DEM 4110

Interpolation Techniques
The University of Zambia
Department of Population Studies
Introduction
• The most complete and reliable source of
information on the population of countries
and their geographic subdivisions is a census
based on house-to-house enumeration.

• However, censuses alone are inadequate for most
purposes, since figures are also needed for dates
between and after census years
Public use of data
• Market research analysts,
• Public and private planners,
– For determining national and subnational
allocations of funds
– Guiding administrative planning
To think about…

• How do we meet the need for day-to-day
population figures?
WE ESTIMATE!
THE METHODS AND MATERIALS OF DEMOGRAPHY , Page 684-
Types of Population estimates

• Intercensal estimates,
• Postcensal estimates,
• Projections

– Postcensal estimates and projections can be
regarded as extrapolations, and intercensal
estimates as interpolations.
Definitions
• Interpolation is narrowly defined as the
art of inferring intermediate values
within a given series of data by use of a
mathematical formula or a graphic
procedure.

• Extrapolation is the art of inferring
values that go beyond the series of data.
Cont’d
• Many of the techniques used for
interpolation are suitable also for
extrapolation; hence, the term
interpolation is often used to refer to
both types of inference.
Uses of Interpolation
• Estimating intermediate or external values in a
given series
• Subdividing grouped data into component parts
• Inferring rates for subgroups from rates for
broad groups
Methods of interpolation
• Mathematical formula,
• Graphical fitting of data, or
• A combination of the two
Cont’d
• Graphic Interpolation- Plot a series of
given data on a large-scale graph, draw a
free-hand curve through the plotted
points, and then read off the
intermediate points from the graph as
needed
• Mathematical formula has the quality of
imputing a regularity or smoothness to
the given series of data
Interpolating point data
• Polynomial interpolation,
• Exponential functions,
• Osculatory interpolation,
• and use of spline functions.
Polynomial Interpolation

• A given series conforms to an equation of the
general type
$$y = a + bx + cx^2 + dx^3 + \cdots$$
Cont….
– $y = a + bx$ is a straight line passing through any
two given points
– $y = a + bx + cx^2$ is a quadratic, or parabola,
passing through any three given points
– $y = a + bx + cx^2 + dx^3$ is a cubic, passing
through any four given points.

• A polynomial of nth degree passes through
any n+1 given points
Choice of degree of polynomial
• Depends on the type of data to be interpolated
• The smoothness criterion normally requires use of a higher-degree
equation than a straight line.
• Greater smoothness would normally be achieved by
employing at least two observations before, and two
observations after, the point of interpolation
• This would seem to call for at least 4-point interpolation
by a third-degree polynomial where possible, but often
3-point or even 2-point interpolation will give about the
same results
Cont…
• Recall the function
$$y = a + bx + cx^2 + dx^3 + \cdots$$
– This simply means $y$ is a function of $x$, or
$$y = f(x)$$
Cont…
• f(a) means the value of the function
when x equals a
• f(b) the value of the function when x
equals b….
• then 𝑓(𝑥) will be the desired interpolated
value of the function 𝑓 for any 𝑥.
Scenario
• Calculate the population of X in year
1975
Date (x)   Pop (y)
1960       16321
1965
1970       30567
1975       f(x)
1980       52108
1985
Methods of Application
• Waring’s formula
• Aitken’s iterative Procedure
• Newton’s interpolation methods
Waring’s Formula
• The Waring formula is also known as the
Lagrange formula or the Waring-Lagrange formula
• Used to derive the multipliers to
interpolate for the 𝑓(𝑥) value
corresponding to a given 𝑥 value.
• The Waring formula for interpolating
between four points by a polynomial is
as follows:
Cont’
$$
f(x) = f(a)\,\frac{(x-b)(x-c)(x-d)}{(a-b)(a-c)(a-d)}
     + f(b)\,\frac{(x-a)(x-c)(x-d)}{(b-a)(b-c)(b-d)}
     + f(c)\,\frac{(x-a)(x-b)(x-d)}{(c-a)(c-b)(c-d)}
     + f(d)\,\frac{(x-a)(x-b)(x-c)}{(d-a)(d-b)(d-c)}
$$
Cont’d
• The formula given is equivalent to the
polynomial
$$y = a + bx + cx^2 + dx^3$$
passing through the four points (a, f(a)), (b, f(b)), (c, f(c))
and (d, f(d)), and is used to derive f(x). (In the polynomial, a, b, c, d
are coefficients; in the formula they are the given x values.)
• A value of f(x) can be obtained given the
values of f(a), f(b), f(c) and f(d)
• The points do not have to be equally spaced, i.e. the x values
need not be separated by a constant interval
Cont’d
• Formula is suitable for computing the
multipliers or coefficients to be applied to f(a),
f(b), f(c) and f(d) to obtain f(x)

• The multipliers for any particular interpolation
formula add to 1.00
Cont’d
• The following formula, Waring's 3-point formula,
is equivalent to the polynomial
$$y = a + bx + cx^2$$
• This is a parabola passing through the three
points (a, f(a)), (b, f(b)) and (c, f(c))
Cont’
$$
f(x) = f(a)\,\frac{(x-b)(x-c)}{(a-b)(a-c)}
     + f(b)\,\frac{(x-a)(x-c)}{(b-a)(b-c)}
     + f(c)\,\frac{(x-a)(x-b)}{(c-a)(c-b)}
$$
Examples
1. Given the following values
x     20    30     60     70     90
y    150   390   2430   3750   7710

- Find f(50)
- Find f(55)
- Find f(40)
- Find f(80)
- Find f(35)
Cont’d
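• Solutions

f(50) = 1470

f(80) = 5490 with the 4-point formula on x = 30, 60, 70, 90
(5497.2 if the multipliers are first rounded to two decimals)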
Cont’d
• Calculate 1975 population using the data
given
Cont’
$$
f(1975) = f(1960)\,\frac{(1975-1970)(1975-1980)(1975-1990)}{(1960-1970)(1960-1980)(1960-1990)}
        + f(1970)\,\frac{(1975-1960)(1975-1980)(1975-1990)}{(1970-1960)(1970-1980)(1970-1990)}
        + f(1980)\,\frac{(1975-1960)(1975-1970)(1975-1990)}{(1980-1960)(1980-1970)(1980-1990)}
        + f(1990)\,\frac{(1975-1960)(1975-1970)(1975-1980)}{(1990-1960)(1990-1970)(1990-1980)}
$$
Cont’
• $f(1975) = f(1960)(-0.0625) + f(1970)(0.5625) + f(1980)(0.5625) + f(1990)(-0.0625)$

• $f(1975) = 16321(-0.0625) + 30567(0.5625) + 52108(0.5625) + 87724(-0.0625)$
Cont…

Date (x)   Pop (y)   Multipliers
1960       16321     -0.0625
1970       30567      0.5625
1980       52108      0.5625
1990       87724     -0.0625


Cont….

Date (x)   Pop (y)   Multipliers   Weighted pop
1960       16321     -0.0625       -1020.0625
1970       30567      0.5625       17193.9375
1980       52108      0.5625       29310.75
1990       87724     -0.0625       -5482.75
1975                               40001.875
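The multiplier table above can be reproduced with a short script. This is a minimal sketch, not part of the original slides: the function name `waring_multipliers` is an illustrative choice, and the data are the four census values used in the worked example.

```python
def waring_multipliers(xs, x):
    """Return one Waring (Lagrange) multiplier per given abscissa in xs.

    The multiplier for xs[i] is the product of (x - xs[j]) / (xs[i] - xs[j])
    over all j != i. The abscissae need not be equally spaced.
    """
    mults = []
    for i, xi in enumerate(xs):
        m = 1.0
        for j, xj in enumerate(xs):
            if j != i:
                m *= (x - xj) / (xi - xj)
        mults.append(m)
    return mults


years = [1960, 1970, 1980, 1990]
pops = [16321, 30567, 52108, 87724]

mults = waring_multipliers(years, 1975)
weighted = [m * p for m, p in zip(mults, pops)]
estimate = sum(weighted)

for yr, p, m, w in zip(years, pops, mults, weighted):
    print(f"{yr}  {p:>6}  {m:+.4f}  {w:>12.4f}")
print(f"multipliers sum to {sum(mults):.2f}")
print(f"1975 estimate: {estimate:.3f}")  # 40001.875
```

Calling `waring_multipliers(years, 1965)` or `waring_multipliers(years, 1985)` gives the multipliers needed for the other two target years.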
Cont’d
• Calculate 1965 and 1985 populations
Aitken’s iterative procedure
• Aitken’s (1932) iterative procedure is a
system of successive linear
interpolations equivalent to
interpolation by a polynomial of any
desired degree
Cont’
Formula:

$$f(x) = \frac{f(a)(b-x) - f(b)(a-x)}{(b-x) - (a-x)}$$
Cont…
Given        Computational stages                                Proportionate
ordinates    1             2                3                    parts

f(a)                                                             (a – x)
f(b)         f(x: a, b)                                          (b – x)
f(c)         f(x: a, c)    f(x: a, b, c)                         (c – x)
f(d)         f(x: a, d)    f(x: a, b, d)    f(x: a, b, c, d)     (d – x)
Cont…
• First column, “Given ordinates”: the given
data, i.e the four observations
• “Proportionate parts”: differences between the
given abscissa and the one for which the
interpolation is wanted
Cont’d
• Entries of the computational stage (1) are each
calculated by computing diagonal cross-
products, differencing them, and dividing by
the difference between the proportionate
parts.
• Each of the expressions f(x:a,b), f(x:a,c),
f(x:a,d), etc. is an estimate of f(x) obtained by
linear interpolation or extrapolation of f(a) and
one of the subsequent f(b), f(c), or f(d) values
Cont….
• $$f(x: a, b) = \frac{f(a)(b-x) - f(b)(a-x)}{(b-x) - (a-x)}$$
• $$f(x: a, c) = \frac{f(a)(c-x) - f(c)(a-x)}{(c-x) - (a-x)}$$
• $$f(x: a, d) = \frac{f(a)(d-x) - f(d)(a-x)}{(d-x) - (a-x)}$$
• These give an estimate of f(x) obtained by
linear interpolation of f(a) and one of the
subsequent given values
Cont…
• Successive linear interpolations using
results of the previous stage
• $$f(x: a, b, c) = \frac{f(x: a, b)(c-x) - f(x: a, c)(b-x)}{(c-x) - (b-x)}$$
• $$f(x: a, b, d) = \frac{f(x: a, b)(d-x) - f(x: a, d)(b-x)}{(d-x) - (b-x)}$$
• The final computation
• $$f(x: a, b, c, d) = \frac{f(x: a, b, c)(d-x) - f(x: a, b, d)(c-x)}{(d-x) - (c-x)}$$
Example

                        Computational stages
Date (x)   Pop f(x)     1         2         3         Proportionate parts
1980        5700
1990        7383
2000        9885
2010       13092
Solutions

                        Computational stages
Date (x)   Pop f(x)     1          2           3          Proportionate parts
1980        5700                                          1980 − 1995 = −15
1990        7383        8224.5                            1990 − 1995 = −5
2000        9885        8838.75    8531.625               2000 − 1995 = 5
2010       13092        9396       8517.375    8538.75    2010 − 1995 = 15
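The computational stages in the solution table can be generated programmatically. The sketch below is an illustration under stated assumptions rather than the course's own implementation: the function name `aitken` and the row-list layout are hypothetical choices, and the interpolation date is 1995, as implied by the proportionate parts.

```python
def aitken(xs, ys, x):
    """Aitken's iterative procedure for interpolating f(x).

    Returns one row per given point: [given ordinate, stage 1, stage 2, ...].
    The last entry of the last row is the final interpolated value.
    """
    parts = [xi - x for xi in xs]      # proportionate parts (a - x), (b - x), ...
    rows = [[y] for y in ys]           # first column: the given ordinates
    n = len(xs)
    for stage in range(1, n):
        pivot = stage - 1              # row whose previous-stage value is reused
        for i in range(stage, n):
            # diagonal cross-products, differenced, divided by the
            # difference between the two proportionate parts
            cross = (rows[pivot][stage - 1] * parts[i]
                     - rows[i][stage - 1] * parts[pivot])
            rows[i].append(cross / (parts[i] - parts[pivot]))
    return rows


years = [1980, 1990, 2000, 2010]
pops = [5700, 7383, 9885, 13092]

rows = aitken(years, pops, 1995)
for yr, row in zip(years, rows):
    print(yr, row)
print("1995 estimate:", rows[-1][-1])  # 8538.75
```

Each stage is a linear interpolation between two results of the previous stage, so adding a fifth given point would only add one more row and one more stage.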


Newton’s Interpolation formula
• Divided difference
• Forward differences
• Backward differences
Newton’s divided difference formula
• Consider observations $x_0, x_1, x_2, \ldots, x_n$ and $y_0, y_1, y_2, \ldots, y_n$,
the corresponding values of the curve $y = f(x)$
• Define the divided differences (dd) as follows
0-th order dd:
• $f(x_0) = y_0$
1-st order dd:
• $$f(x_1, x_0) = \frac{f(x_1) - f(x_0)}{x_1 - x_0}$$
2-nd order dd:
• $$f(x_2, x_1, x_0) = \frac{f(x_2, x_1) - f(x_1, x_0)}{x_2 - x_0}$$
Cont…
• 3-rd order dd:
– $$f(x_3, x_2, x_1, x_0) = \frac{f(x_3, x_2, x_1) - f(x_2, x_1, x_0)}{x_3 - x_0}$$
• The 0-th order divided differences are just the given
$f(x_i)$ values;
• The 1-st order dd are computed from the 0-th, the 2-nd from the 1-st,
the 3-rd from the 2-nd, and so on.
Cont…
x_i    0-th: y = f(x_i)   1st dd         2nd dd              3rd dd
x_0    f(x_0)
                          f(x_1, x_0)
x_1    f(x_1)                            f(x_2, x_1, x_0)
                          f(x_2, x_1)                        f(x_3, x_2, x_1, x_0)
x_2    f(x_2)                            f(x_3, x_2, x_1)
                          f(x_3, x_2)
x_3    f(x_3)
Newton’s DD interpolating formula
$$
f(x) = f(x_0) + (x - x_0)\,f(x_1, x_0) + (x - x_0)(x - x_1)\,f(x_2, x_1, x_0)
     + (x - x_0)(x - x_1)(x - x_2)\,f(x_3, x_2, x_1, x_0) + \cdots
$$
Example
• Given the populations for 1960, 1970, 1980
and 1990, find the population for 1975
• First, calculate the divided differences
Date (x_i)    0-th: Pop f(x_i)   1st dd    2nd dd    3rd dd
x_0 = 1960    16321
                                 1424.6
x_1 = 1970    30567                        36.475
                                 2154.1              1.13
x_2 = 1980    52108                        70.375
                                 3561.6
x_3 = 1990    87724
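Newton’s DD interpolating formula
$$
f(1975) = f(1960) + (1975 - 1960)\,f(x_1, x_0) + (1975 - 1960)(1975 - 1970)\,f(x_2, x_1, x_0)
        + (1975 - 1960)(1975 - 1970)(1975 - 1980)\,f(x_3, x_2, x_1, x_0)
$$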
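The divided-difference table and the resulting 1975 estimate can be checked with a short script. This is a minimal sketch, assuming the standard nested form of Newton's divided-difference polynomial; the function names `divided_differences` and `newton_dd` are illustrative, not from the slides.

```python
def divided_differences(xs, ys):
    """Return the top diagonal of the divided-difference table:
    [f(x0), f(x1, x0), f(x2, x1, x0), ...]."""
    col = list(ys)                     # 0-th order: the given values
    coeffs = [col[0]]
    for order in range(1, len(xs)):
        # each new column is built from the previous one
        col = [(col[i + 1] - col[i]) / (xs[i + order] - xs[i])
               for i in range(len(col) - 1)]
        coeffs.append(col[0])
    return coeffs


def newton_dd(xs, ys, x):
    """Evaluate Newton's divided-difference interpolating polynomial at x."""
    total, product = 0.0, 1.0
    for k, c in enumerate(divided_differences(xs, ys)):
        total += c * product           # c * (x - x0)(x - x1)...(x - x_{k-1})
        product *= (x - xs[k])
    return total


years = [1960, 1970, 1980, 1990]
pops = [16321, 30567, 52108, 87724]

coeffs = divided_differences(years, pops)
print([round(c, 3) for c in coeffs])   # approximately [16321, 1424.6, 36.475, 1.13]
print(round(newton_dd(years, pops, 1975), 3))
```

The result agrees with the value obtained from Waring's formula on the same four points (40001.875), as expected, since both formulas describe the same cubic.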
Forward differences
• Consider observations $x_0, x_1, x_2, \ldots, x_n$ and $y_0, y_1, y_2, \ldots, y_n$,
the corresponding values of the curve $y = f(x)$
• $\Delta$ is the forward difference operator
• $\Delta y_0 = y_1 - y_0$, $\Delta y_1 = y_2 - y_1$, …, $\Delta y_{n-1} = y_n - y_{n-1}$
• $\Delta y_0, \Delta y_1, \ldots, \Delta y_{n-1}$ are the first forward differences
of $y$
• Second forward differences, $\Delta^2$, are given
by differences of first forward differences
Cont….
• $\Delta^2 y_0 = \Delta(\Delta y_0) = \Delta(y_1 - y_0) = \Delta y_1 - \Delta y_0 = (y_2 - y_1) - (y_1 - y_0) = y_2 - 2y_1 + y_0$
• Differences of second forward differences give third
forward differences, denoted by $\Delta^3$
• If the values of $x$ are equally spaced with common
difference $h$, and $y = f(x)$ is the given function, then
• $\Delta f(x_i) = f(x_i + h) - f(x_i)$
Cont…
x      f(x)    Δf(x)               Δ²f(x)                  Δ³f(x)
x_0    y_0
               Δy_0 = y_1 − y_0
x_1    y_1                         Δ²y_0 = Δy_1 − Δy_0
               Δy_1 = y_2 − y_1                            Δ³y_0 = Δ²y_1 − Δ²y_0
x_2    y_2                         Δ²y_1 = Δy_2 − Δy_1
               Δy_2 = y_3 − y_2
x_3    y_3
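Newton’s FD interpolation formula
$$
f(x) = y_0 + p\,\Delta y_0 + \frac{p(p-1)}{2!}\,\Delta^2 y_0 + \frac{p(p-1)(p-2)}{3!}\,\Delta^3 y_0
     + \cdots + \frac{p(p-1)(p-2)\cdots(p-n+1)}{n!}\,\Delta^n y_0
$$
• Where $p = \dfrac{x - x_0}{h}$ and $h$ is the common difference between successive $x$ values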
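As a check on the forward-difference formula, the sketch below applies it to the equally spaced census series used earlier (h = 10 years). The function names `forward_differences` and `newton_forward` are illustrative assumptions; the code follows the standard Newton forward-difference formula with p = (x − x0)/h.

```python
def forward_differences(ys):
    """Return the leading forward differences [y0, Δy0, Δ²y0, ...]."""
    col = list(ys)
    leading = [col[0]]
    while len(col) > 1:
        col = [b - a for a, b in zip(col, col[1:])]   # next difference column
        leading.append(col[0])
    return leading


def newton_forward(x0, h, ys, x):
    """Newton's forward-difference interpolation for equally spaced data."""
    p = (x - x0) / h
    total, coeff = 0.0, 1.0            # coeff holds p(p-1)...(p-k+1) / k!
    for k, diff in enumerate(forward_differences(ys)):
        total += coeff * diff
        coeff *= (p - k) / (k + 1)
    return total


pops = [16321, 30567, 52108, 87724]    # 1960, 1970, 1980, 1990

print(forward_differences(pops))                # [16321, 14246, 7295, 6780]
print(newton_forward(1960, 10, pops, 1975))     # 40001.875
```

Here p = 1.5, and the estimate matches the Waring and divided-difference results for the same four points.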

Backward differences
• Consider observations $x_0, x_1, x_2, \ldots, x_n$ and
$y_0, y_1, y_2, \ldots, y_n$, the corresponding values of the
curve $y = f(x)$
• $\nabla$ is the backward difference operator
• $\nabla y_1 = y_1 - y_0$, $\nabla y_2 = y_2 - y_1$, …, $\nabla y_n = y_n - y_{n-1}$
• $\nabla y_1, \nabla y_2, \ldots, \nabla y_n$ are called the first backward
differences of $y$
• Second backward differences are given
by differences of first backward differences
Cont….
• $\nabla^2 y_2 = \nabla(\nabla y_2) = \nabla(y_2 - y_1) = \nabla y_2 - \nabla y_1 = (y_2 - y_1) - (y_1 - y_0) = y_2 - 2y_1 + y_0$
• Differences of second backward differences give third
backward differences, denoted by $\nabla^3$
• If the values of $x$ are equally spaced with common
difference $h$, and $y = f(x)$ is the given function, then
• $\nabla f(x_i) = f(x_i) - f(x_i - h)$
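The backward-difference definitions can be verified numerically. The sketch below builds a backward-difference table for the census series used in the earlier examples; the function name `backward_difference_table` and the use of `None` for undefined entries are illustrative choices.

```python
def backward_difference_table(ys):
    """Return table where table[k][i] is the k-th backward difference of y_i.

    Entries that are undefined (i < k) are None.
    """
    table = [list(ys)]                     # order 0: the given values
    for k in range(1, len(ys)):
        prev = table[-1]
        row = [None] * len(ys)
        for i in range(k, len(ys)):
            row[i] = prev[i] - prev[i - 1]  # difference with the preceding entry
        table.append(row)
    return table


ys = [16321, 30567, 52108, 87724]          # y0..y3 for 1960, 1970, 1980, 1990
table = backward_difference_table(ys)

print(table[1])   # [None, 14246, 21541, 35616]
print(table[2])   # [None, None, 7295, 14075]
print(table[3])   # [None, None, None, 6780]

# The second backward difference at y2 equals y2 - 2*y1 + y0.
print(table[2][2] == ys[2] - 2 * ys[1] + ys[0])   # True
```

The numbers are the same as in a forward-difference table of the same series; only the indexing differs, since each backward difference at one point equals the forward difference at the preceding point.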
Osculatory interpolation
• In general, the more regular the sequence of data
points, the easier and simpler it is to interpolate values
between given data points.
• Population data, however, are often quite irregular;
appearing as waves
• These waves or “oscillations” are reflections of
demographic changes, such as rising or declining birth
rates, mortality rates, and external migration rates
Cont’d
• Equations include:
– Sprague’s Fifth-Difference Equation
– Karup-King’s Third-Difference Equation
– The Beers Six-Term Ordinary and Modified
Formulas

Involves combining two overlapping polynomials into one
equation.

See Methods & Materials, Pg 688 for equations


Cont’d
• It is common practice that national annual statistical
reports are published with population figures in five-
year age cohorts, rather than single-year cohorts.
• When demographic reports are published with 5-year
cohorts, the population in single-year age cohorts
can be estimated by interpolation
• Such disaggregation is necessary for the calculation of
various rates used in planning.
Cont’d
• The formulas for osculatory interpolation can be
expressed in terms of coefficients or multipliers that
are applied to the given data.
• An interpolated value can then be readily computed by
multiplying the given data by the corresponding
coefficients and by accumulating the products.
• In this way, the analyst has only to select the method
of interpolation and to know how to use the
multipliers; he or she does not need to be familiar with
the formula itself or with the mathematical derivation
of the multipliers.
Cont’d
• The Karup-King formula is applied to four
points
• The Sprague formula to six points and
• The Beers formulas to six points
• For all formulas the given points must be
equally spaced
Cont’d
• Example: Given the data from the U.S. life
table, use the Karup-King formula and the
relevant multipliers to interpolate to
single ages between $l_{45}$ and $l_{50}$
Cont’d
• The general equation for these
interpolations is
$$N_{2+x} = m_1 N_{1.0} + m_2 N_{2.0} + m_3 N_{3.0} + m_4 N_{4.0}$$
Where:

• $x$ is a fraction between 0 and 1;
• $N_{1.0}$, $N_{2.0}$, $N_{3.0}$, and $N_{4.0}$ represent four given points
• $m_1$, $m_2$, $m_3$, $m_4$ are the four multipliers associated with
the four given points
Cont’d
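• If we wish to compute $l_{48}$, our x fraction
will be 0.6 (three-fifths of the way from $l_{45}$ to $l_{50}$)
• Thus, our equation will be

$$N_{2+x} = m_1 N_{40} + m_2 N_{45} + m_3 N_{50} + m_4 N_{55}$$
Or

$$l_{48} = m_1 l_{40} + m_2 l_{45} + m_3 l_{50} + m_4 l_{55}$$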
Multipliers are…..
Cont’d
• Select multipliers for $N_{2.6}$
$$l_{48} = -0.048(93{,}064) + 0.424(91{,}378) + 0.696(88{,}756) - 0.072(84{,}711) = 89{,}952$$
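The multiplier arithmetic for l48 can be sketched in a few lines. The multipliers and life-table values below are exactly those quoted in the worked example; the variable names are illustrative, and multipliers for other fractions would have to be taken from the published Karup-King table rather than from this snippet.

```python
# Karup-King multipliers for N2.6 (x = 0.6), as quoted in the worked example
multipliers = [-0.048, 0.424, 0.696, -0.072]

# Given life-table values l40, l45, l50, l55
l_given = [93064, 91378, 88756, 84711]

# Multiply each given value by its multiplier and accumulate the products
l48 = sum(m * l for m, l in zip(multipliers, l_given))

print(round(sum(multipliers), 3))   # 1.0
print(round(l48))                   # 89952
```

The same pattern applies to l46, l47 and l49: only the row of multipliers changes, while the four given values stay the same.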
Note:
• For interpolation of ages 0–1 to 4–5, the “first interval”
coefficients are used.
• For interpolation of ages 101–102 to 105–106, the
“last interval” coefficients are used.
• For interpolation of all other ages, the “middle
interval” coefficients are used.
Cont’d
• Derive the following
– l46
– l47
– l49
See
Methods and Materials pg. 685-690
Graduation (Smoothing)
• Graduation or “smoothing” is another type of
interpolation designed to obtain a smooth series of
values from an irregular series of observed values
(remove noise from the data set which allows for
important patterns to stand out)

• Most of the interpolation formulas introduced combine
interpolation with some smoothing or graduation

• We introduce mathematical graduation methods of
demographic data
Cont’d
• Mathematical graduation of census data can be
employed to derive figures for 5-year age groups that
are corrected primarily for net reporting error.

• What these graduation procedures do, essentially, is to
“fit” different curves to the original 5- or 10-year
totals, modifying the original 5-year totals
Cont’d
Smoothing Age Data
• Particular pattern is expected
• Once indices for age-digit preference
(Whipple’s, Myers’, etc.) or graphs indicate that
the age structure of the population is not
correct, a decision should be made about
whether or not the age structure should be
adjusted
• Smoothing techniques come into play to
correct data for age misreporting
Cont’d
• Among the major graduation methods are the Carrier-Farrag
(1959) ratio method, Karup-King-Newton
quadratic interpolation, the Arriaga formula, and
methods developed by the United Nations.
• The U.S. Census Bureau (1994) has developed a
spreadsheet program, AGESMTH, that smooths the
5-year totals of a population using most of these
methods.

See POPULATION ANALYSIS WITH MICROCOMPUTERS


Quote…
“Facts are stubborn, but statistics are
more pliable”
---Mark Twain--
References
• Shryock: The Methods and Materials of Demography
• R. L. Burden & J. D. Faires: Numerical
Analysis (9th Edition)
