MST121 Handbook PDF
Using Mathematics
Handbook
This publication forms part of an Open University course. Details of this and
other Open University courses can be obtained from the Student Registration
and Enquiry Service, The Open University, PO Box 197, Milton Keynes
MK7 6BJ, United Kingdom: tel. +44 (0)845 300 6090, email
[email protected]
Alternatively, you may visit the Open University website at
http://www.open.ac.uk where you can learn more about the wide range of
courses and packs offered at all levels by The Open University.
To purchase a selection of Open University course materials visit
http://www.ouw.co.uk, or contact Open University Worldwide, Walton Hall,
Milton Keynes MK7 6AA, United Kingdom, for a brochure: tel. +44 (0)1908
858793, fax +44 (0)1908 858787, email [email protected]
First published 1997. Second edition 2004. Third edition 2008. Reprinted 2008.
Copyright © 1997, 2004, 2008 The Open University
SI units 4
Mathematical modelling 4
Notation 6
Glossary 9
MST121 Block A 26
MST121 Block B 31
MST121 Block C 38
MST121 Block D 45
If you are taking MST121 and MS221 together, then we suggest that you use
the MS221 Handbook for both courses.
3
The Greek alphabet
A α alpha N ν nu
B β beta Ξ ξ xi
Γ γ gamma O o omicron
∆ δ delta Π π pi
E ε epsilon P ρ rho
Z ζ zeta Σ σ sigma
H η eta T τ tau
Θ θ theta Y υ upsilon
I ι iota Φ φ phi
K κ kappa X χ chi
Λ λ lambda Ψ ψ psi
M µ mu Ω ω omega
SI units
The International System of units (SI units) is an internationally agreed set of units
and symbols for measuring physical quantities.
Some of these are base units, such as
metre symbol m (measurement of length),
There are also derived units, which are used for quantities whose measurement
combines base units in some way. Some of these are
area $\mathrm{m}^2$ (metres squared or square metres),
Mathematical modelling
[Figure: the mathematical modelling cycle, with five stages: Specify purpose, Create model, Do mathematics, Interpret results, Evaluate.]
4
Some useful graphs
5
Notation
Some of the notation used in the course is listed below. The right-hand column gives
the chapter and page of MST121 where the symbol is first used.
6
e the base for the natural logarithm function and the A3 33
exponential function; e = 2.718 281 . . .
exp the exponential function A3 34
$f^{-1}$ the inverse function of the one-one function f A3 37
arccos x the angle in the interval [0, π] whose cosine is x A3 41
arcsin x the angle in the interval $[-\frac{1}{2}\pi, \frac{1}{2}\pi]$ whose sine is x A3 40
arctan x the angle in the interval $(-\frac{1}{2}\pi, \frac{1}{2}\pi)$ whose tangent is x A3 41
$\log_a$ the logarithm function to the base a A3 42
ln the natural logarithm function, that is, $\log_e$ where e = 2.718 281 . . . A3 44
$\sum_{i=1}^{n} a_i$ the sum $a_1 + a_2 + \cdots + a_n$ B1 11
A a matrix B2 19
AB the product of the matrices A and B B2 19
$\mathbf{A}^n$ the nth power of the square matrix A B2 21
A+B the sum of the matrices A and B B2 22
kA the scalar multiple of the matrix A by the real number k B2 23
A−B the difference of the matrices A and B B2 23
$a_{ij}$ the element in the ith row and jth column of the matrix A B2 24
v a vector B2 26
$v_i$ the ith component of the vector v B2 26
I the identity matrix B2 35
$\mathbf{A}^{-1}$ the inverse of the invertible matrix A B2 36
det A the determinant of the square matrix A B2 38
Ax = b the matrix form of a pair of simultaneous linear equations B2 41
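$\mathbf{0}$ the zero vector B3 8
$\mathbf{i}$ the Cartesian unit vector $\begin{pmatrix} 1 \\ 0 \end{pmatrix}$ B3 8
$\mathbf{j}$ the Cartesian unit vector $\begin{pmatrix} 0 \\ 1 \end{pmatrix}$ B3 8
$|\mathbf{a}|$ the magnitude of the vector a B3 10
$\overrightarrow{PQ}$ the displacement vector from P to Q B3 12
$\overrightarrow{OQ}$ the position vector of Q B3 12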
7
$f'(x)$ the (first) derivative of the function f at the point x C1 13
$f''$ the second derived function of f C1 28
$f''(x)$ the second derivative of the function f at the point x C1 28
$\frac{dy}{dx}$ the (first) derivative of y with respect to x (Leibniz notation) C1 41
$\frac{d^2y}{dx^2}$ the second derivative of y with respect to x (Leibniz notation) C1 42
$\frac{d}{dx}(y)$ a variation of $\frac{dy}{dx}$ C1 43
$\frac{d}{dx}(f(x))$ a variation of the Leibniz notation for $f'(x)$ C1 43
ṡ the (first) derivative of s with respect to t, where t is time C1 43
(Newton’s notation)
s̈ the second derivative of s with respect to t, where t is time C1 43
(Newton’s notation)
$\int f(x)\,dx$ the indefinite integral of f(x) with respect to x C2 8
$\int f$ the indefinite integral of the function f C2 8
$\int_a^b f(x)\,dx$ the definite integral of f(x) from a to b C2 41
$\int_a^b f$ the definite integral of the function f from a to b C2 41
$[F(x)]_a^b$ F(b) − F(a) C2 41
y(a) = b shorthand for the initial condition y = b when x = a C3 10
P (E) the probability that the event E occurs D1 10
P (X = j) the probability that a random variable X takes the value j D1 37
µ the mean of a probability distribution, or a population mean D1 43, D2 23
$\bar{x}$ the sample mean D2 21
s the sample standard deviation D2 25
σ the standard deviation of a probability distribution, or a population standard deviation D2 24
SE a standard error, that is, the standard deviation of a sampling distribution D3 14, D4 20
ESE an estimated standard error D4 22
$Q_1$ the lower quartile D4 9
$Q_3$ the upper quartile D4 9
$H_0$ the null hypothesis of a hypothesis test D4 18
$H_1$ the alternative hypothesis of a hypothesis test D4 18
Glossary
Below is a glossary of terms used in MST121. First the definition of the term is
given, then the page in this handbook where more detail can be found, and finally
the chapter and page of MST121 where the term is first used.
restricted to $(-\frac{1}{2}\pi, \frac{1}{2}\pi)$.
9
circle The set of points in a plane that are a fixed distance from a specified point in the plane. 27 A2 20
closed form A formula that defines a sequence $a_n$ in terms of the subscript n. It should be accompanied by a statement of the appropriate range for n. 26 A1 8
codomain A set containing all the outputs of a function. See also function. A3 9
coefficient matrix The square matrix used when a pair of simultaneous linear equations are written in matrix form. 35 B2 40
coefficient (of a term) The factor by which the term is multiplied in a particular product. A0 31
column (of a matrix) See matrix.
common difference The difference between any two successive terms in an arithmetic sequence. 26 A1 14
common logarithm The logarithm function with base 10. A3 44
common ratio The ratio of any two successive terms in a geometric sequence. 26 A1 19
completed-square form The completed-square form of $x^2 + 2px$ is $(x + p)^2 - p^2$. 27 A2 26
component form (of a vector) The description of a vector a in terms of the Cartesian unit vectors i and j: $\mathbf{a} = a_1\mathbf{i} + a_2\mathbf{j}$. 36 B3 16
component (of a vector) See vector.
composite function A function k with rule of the form $k(x) = g(f(x))$. C1 48
confidence interval An interval of plausible values for a population parameter. 47 D3 19
constant (1) A significant number; for example π. A0 28
10
11
12
arbitrary constants.
in which the probability of success in each trial is the same. The probabilities of obtaining the values 1, 2, 3, . . . form a geometric sequence.
13
graph (of a real function f) The set of points (x, f(x)) in the Cartesian plane. A3 9
half-life The time that it takes for the mass of a radioactive substance to decay to half of its original amount. 43 C3 28
histogram A diagram that represents a data set in which each value (or group of values) is represented by a rectangle whose area is proportional to the frequency of that value (or group of values). D2 15
hypotenuse The longest side of a right-angled triangle. A2 21
infinite sequence A sequence that has a first term but no final term. A1 6
14
integrand In an integral, the function which is to be integrated. 40 C2 8
integration The process of finding either an integral or the indefinite integral of a function. 40 C2 8
intercept A value of x or y where a line (or curve) meets the x-axis or y-axis, respectively. The x-intercept is the value of x where it meets the x-axis. The y-intercept is the value of y where it meets the y-axis. 29 A2 11
interquartile range The difference between the upper quartile and the lower quartile of a batch of data. 47 D4 11
interval An unbroken subset of the real line. A3 7
15
16
17
18
19
rise from A to B The rise from a point $A(x_1, y_1)$ to a point $B(x_2, y_2)$ is $y_2 - y_1$. A2 9
root (of an equation) An alternative term for a solution of an equation.
row (of a matrix) See matrix.
rule (of a function) The process for converting each input value in the domain of the function into a unique output value. See function. 28 A3 6
run from A to B The run from a point $A(x_1, y_1)$ to a point $B(x_2, y_2)$ is $x_2 - x_1$. A2 9
sample A subset of a population. 46 D2 8
20
skewed A data set which is not symmetric (or, equivalently, for which a frequency diagram is not symmetric) is said to be skewed. If the large data values are more spread out than the small data values, so that a frequency diagram has a longer right tail than left tail, then the data set is right-skew. If the left tail of a frequency diagram is longer than the right tail, then the data set is left-skew. The terms right-skew and left-skew are also used to describe probability distributions. D2 11
slope (of a line) If A and B are two points on a line, then the slope of the line is (rise from A to B) ÷ (run from A to B). The slope is also called the gradient. 27 A2 9
smooth A function f is said to be smooth if the derivative exists at each point of the domain of f. C1 14
solution of a differential equation A function, y = F(x) say, (or a more general equation relating the independent and dependent variables) for which the differential equation is satisfied. 42 C3 6
solution (of an equation) A value of the unknown for which the equation is satisfied. A0 35
solving a triangle The process of determining all the angles and side lengths of a triangle. A2 37
speed A (scalar) measure of how fast an object is moving, irrespective of its direction of motion. The speed is the magnitude of the velocity vector. B3 26
square matrix A matrix with the same number of rows and columns. B2 21
standard error The standard deviation of a sampling distribution. D3 14, D4 20
standard error of the mean The standard deviation of the sampling distribution of the mean for samples of size n, for any given sample size n. 47 D3 14
standard normal distribution The normal distribution with mean 0 and standard deviation 1. 46 D2 36
stationary point A point $x_0$ in the domain of a smooth function f at which $f'(x_0) = 0$, or the corresponding point $(x_0, f(x_0))$ on the graph of f. 39 C1 22
step size The value of the variable h in Euler’s method, which determines the distance between the successive values of x at which solution estimates are calculated. 44 C3 39
subscript In the notation $a_n$, n is called the subscript of a. A1 7
21
subtended angle (at a point) The angle that lies between two lines drawn from the point to the endpoints of a line segment or an arc of a circle. A2 21
tangent (of an angle θ) $\tan\theta = \frac{\sin\theta}{\cos\theta}$, where $\theta \neq \pm\frac{1}{2}\pi, \pm\frac{3}{2}\pi, \ldots$. 27 A2 36
22
Background material for MST121
Rounding numbers
To round to a given number of decimal places, look at the digit one place to the
right of the number of places specified. If this digit is 5 or more, then round up; if it
is less than 5, then round down.
To round to a given number of significant figures, start counting significant
figures from the first non-zero digit on the left, and follow the rules for rounding.
Scientific notation
In scientific notation, positive numbers are expressed in the form $a \times 10^n$, where $1 \le a < 10$ and n is an integer.
$a^{-n} = \frac{1}{a^n}$, $a^0 = 1$, $a^{1/n} = \sqrt[n]{a}$, $a^{m/n} = \sqrt[n]{a^m} = (\sqrt[n]{a})^m$
Surds
An irrational number is in surd form if it is written in terms of roots of rational numbers. For example, $\sqrt{2}$, $4\sqrt{5}$ and $3\sqrt{7} + 4\sqrt{3}$ are numbers in surd form.
Calculating means
To calculate the mean of a batch of data, add together the values (x) in the batch to give $\sum x$, and divide by n, the number of values in the batch.
Algebra
Difference of two squares: $a^2 - b^2 = (a - b)(a + b)$.
23
Background
Angle measurement
The angle subtended at the centre of a circle by an arc equal in length to the radius
of the circle is defined to be one radian. Thus 2π radians = 360°, and the rules for
converting between degrees and radians are
$x \text{ radians} = x \times \frac{180}{\pi} \text{ degrees}, \quad y \text{ degrees} = y \times \frac{\pi}{180} \text{ radians}.$
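The two conversion rules can be sketched in Python as follows. The function names `radians_to_degrees` and `degrees_to_radians` are illustrative choices, not names used in the course.

```python
import math

def radians_to_degrees(x):
    """Convert x radians to degrees: x * 180 / pi."""
    return x * 180 / math.pi

def degrees_to_radians(y):
    """Convert y degrees to radians: y * pi / 180."""
    return y * math.pi / 180

# A right angle is 90 degrees, that is, pi/2 radians.
print(degrees_to_radians(90))
print(radians_to_degrees(math.pi / 2))
```

Python's standard library offers `math.degrees` and `math.radians` for the same purpose.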
Polygons
A plane figure which is a closed shape whose sides are straight lines is called a
polygon. A point where two sides meet is called a vertex. A polygon with n sides
(and hence n vertices) is referred to as an n-gon.
The angle sum of an n-gon is (n − 2) × 180°, that is, (n − 2)π radians.
An n-gon is said to be regular if all its sides are equal and all its angles are equal.
Triangles
A triangle is a polygon with three sides. Its angle sum is 180°, that is, π radians. If
all three sides are of equal length, then it is an equilateral triangle and all three
angles are 60°. If two sides are of equal length, then it is an isosceles triangle and
the two angles opposite the equal length sides are equal.
The area of a triangle is $\frac{1}{2}ab\sin\theta$, where a and b are two side lengths, and θ is the angle between the sides.
Right-angled triangles
Pythagoras’ Theorem: For a triangle ABC with side lengths a, b and c (opposite
A, B and C, respectively), where the angle at C is a right angle, $c^2 = a^2 + b^2$.
The side opposite the right angle is known as the hypotenuse.
24
Quadrilaterals
A quadrilateral is a polygon with four sides. Its angle sum is 360°, that is,
2π radians.
A quadrilateral in which all sides and all angles are equal is a square.
In a parallelogram, opposite angles are equal and the two diagonals bisect each other.
The area of a rectangle is A = lb, where l is the length and b is the breadth.
Circles
A circle of radius r has
area $A = \pi r^2$.
Congruence
Two figures are congruent if they have the same shape and the same size.
Two n-gons are congruent if all corresponding sides and angles are equal.
Similarity
Two figures are similar if they have the same shape; their sizes need not be the
same.
Two n-gons are similar if each angle in one n-gon is equal to the corresponding angle
in the other. In this case, the length of each side in one n-gon is the same multiple of
the corresponding length in the other.
Prisms
A prism is a solid with constant cross-section. A cylinder is a prism with circular
cross-section.
The surface area of a prism is the sum of the areas of its faces. In particular, the
surface area of a cylinder is $A = 2\pi r^2 + 2\pi rh$, where r is the radius of the circular
cross-section and h is the length.
The volume of a prism is the area of its cross-section multiplied by its length. In
particular, the volume of a cylinder is $V = \pi r^2 h$.
25
MST121 A1
Types of sequences
Convention: The first term of a sequence has subscript 1, unless otherwise indicated.
An arithmetic sequence with first term a and common difference d can be
specified by either of the following recurrence systems:
$x_1 = a, \quad x_{n+1} = x_n + d \quad (n = 1, 2, 3, \ldots)$,
with closed form $x_n = a + (n - 1)d \quad (n = 1, 2, 3, \ldots)$;
$x_0 = a, \quad x_{n+1} = x_n + d \quad (n = 0, 1, 2, \ldots)$,
with closed form $x_n = a + nd \quad (n = 0, 1, 2, \ldots)$.
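As a minimal sketch, the following Python compares the recurrence system and the closed form for an arithmetic sequence whose first term has subscript 1. The function names are illustrative assumptions, and n is assumed to be at least 1.

```python
def arithmetic_by_recurrence(a, d, n):
    """Return [x_1, ..., x_n] using x_1 = a, x_{k+1} = x_k + d."""
    terms = [a]
    for _ in range(n - 1):
        terms.append(terms[-1] + d)
    return terms

def arithmetic_closed_form(a, d, n):
    """Return [x_1, ..., x_n] using the closed form x_k = a + (k - 1)d."""
    return [a + (k - 1) * d for k in range(1, n + 1)]

# Both descriptions generate the same terms.
print(arithmetic_by_recurrence(3, 4, 5))
print(arithmetic_closed_form(3, 4, 5))
```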
A geometric sequence with first term a and common ratio r can be specified by
either of the following recurrence systems:
26
MST121 A1/A2
Lines
Type                     Slope                                          Equation
Parallel to x-axis       m = 0                                          y = c, where c is a constant
Parallel to y-axis       Infinite                                       x = d, where d is a constant
Not parallel to y-axis   $m = (y_2 - y_1)/(x_2 - x_1)$, where           $y - y_1 = m(x - x_1)$ or $y = mx + c$,
                         $(x_1, y_1)$ and $(x_2, y_2)$ are              where c is the y-intercept
                         points on the line
If two lines are parallel, then they have equal slopes. If two lines are
perpendicular, then either the product of their slopes is −1 or one has slope 0 and
the other has infinite slope.
The distance between two points $(x_1, y_1)$ and $(x_2, y_2)$ is
$\sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2}$.
The midpoint of a line segment with endpoints $(x_1, y_1)$ and $(x_2, y_2)$ is
$\left(\frac{x_1 + x_2}{2}, \frac{y_1 + y_2}{2}\right)$.
Circles
Geometrically, a circle is the set of points that are at a fixed distance (the radius)
from a specified point (the centre). Algebraically, a circle with centre (a, b) and
radius r has the equation
$(x - a)^2 + (y - b)^2 = r^2$.
To find the equation of a circle, given three points A, B and C on the circle, find the
perpendicular bisectors of the line segments AB and BC. The centre of the circle is
the intersection point of the two perpendicular bisectors. The radius of the circle is
the distance from the centre to any of the points A, B or C.
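The three-point construction can be sketched in Python. This is one possible implementation under stated assumptions: each perpendicular bisector is written as a linear equation (a point is on the bisector of AB exactly when it is equidistant from A and B), and the two equations are solved by Cramer's rule. The name `circle_through` is hypothetical.

```python
import math

def circle_through(A, B, C):
    """Return (centre, radius) of the circle through the points A, B and C."""
    (ax, ay), (bx, by), (cx, cy) = A, B, C
    # Perpendicular bisector of AB:
    #   (bx - ax) x + (by - ay) y = (bx^2 + by^2 - ax^2 - ay^2) / 2
    a1, b1 = bx - ax, by - ay
    c1 = (bx**2 + by**2 - ax**2 - ay**2) / 2
    # Perpendicular bisector of BC, in the same form.
    a2, b2 = cx - bx, cy - by
    c2 = (cx**2 + cy**2 - bx**2 - by**2) / 2
    det = a1 * b2 - a2 * b1
    if det == 0:
        raise ValueError("the three points lie on a line, so no circle exists")
    # Intersection of the two bisectors gives the centre.
    x = (c1 * b2 - c2 * b1) / det
    y = (a1 * c2 - a2 * c1) / det
    # The radius is the distance from the centre to any of the points.
    return (x, y), math.hypot(ax - x, ay - y)

centre, radius = circle_through((1, 0), (0, 1), (-1, 0))
print(centre, radius)
```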
To complete the square of $x^2 + 2px$, use
$x^2 + 2px = x^2 + 2px + p^2 - p^2 = (x + p)^2 - p^2$.
Trigonometry
Let P (x, y) be a point on the unit circle (with centre O) such that the angle from the
positive x-axis to OP is θ (measured anticlockwise if θ is positive, clockwise if θ is
negative). Then
$\cos\theta = x$, $\sin\theta = y$ and $\tan\theta = \frac{\sin\theta}{\cos\theta}$ (provided that $\cos\theta \neq 0$).
27
MST121 A2/A3
Trigonometric identities
cos(−θ) = cos θ,   cos(π − θ) = −cos θ,   cos(θ + 2π) = cos θ
A line with slope m passing through the point $(x_1, y_1)$ has parametric equations
$x = t + x_1, \quad y = mt + y_1$.
A line passing through the two points $(x_1, y_1)$ and $(x_2, y_2)$ has parametric
equations
$x = x_1 + t(x_2 - x_1), \quad y = y_1 + t(y_2 - y_1)$.
Functions
A (real) function is specified by giving
the domain, that is, the set of allowable input values, which are real numbers;
the rule for converting each input value to a unique output value, which is also
a real number.
The output of a function f for a given input x is called the image of x under f , and
is written f (x). The set of all outputs of the function f is called the image set of f .
Convention: When a function is specified just by a rule, it is understood that the
domain of the function is the largest possible set of real numbers for which the rule is
applicable.
Function notation
A standard notation used to specify a function f is
$f(x) = x^2 + 1 \quad (0 \le x \le 6)$.
Other notations used to specify the same function f are
$f : x \longmapsto x^2 + 1 \quad (0 \le x \le 6)$;
$f : [0, 6] \longrightarrow \mathbb{R}, \quad x \longmapsto x^2 + 1$.
$|x| = \begin{cases} x, & \text{if } x \ge 0, \\ -x, & \text{if } x < 0. \end{cases}$
28
MST121 A3
The scalings and translations in the table above can, with two exceptions, be applied
in any order with the same result. The exceptions are that the result of applying
both a horizontal translation and an x-scaling depends in general on the order in
which these are applied, and similarly for both a vertical translation and a y-scaling.
Inverse functions
A real function f is one-one if it has the following property: for all $x_1, x_2$ in the
domain of f, if $x_1 \neq x_2$, then $f(x_1) \neq f(x_2)$.
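A real function f is increasing if it has the following property: for all $x_1, x_2$ in the
domain of f, if $x_1 < x_2$, then $f(x_1) < f(x_2)$.
A real function f is decreasing if it has the following property: for all $x_1, x_2$ in the
domain of f, if $x_1 < x_2$, then $f(x_1) > f(x_2)$.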
To obtain the rule for the function $f^{-1}$ (in terms of x), solve the equation
y = f(x) to obtain x in terms of y, and then exchange the roles of x and y. The
image set of f is the domain of $f^{-1}$, and vice versa.
To obtain the graph of $y = f^{-1}(x)$, reflect the graph of y = f(x) in the 45° line.
29
Logarithms
An exponential function $f(x) = a^x$, where a > 0 and a ≠ 1, has domain R and image
set (0, ∞). Its inverse function, called logarithm to the base a and denoted by $\log_a$,
has domain (0, ∞) and image set R. Thus, for y > 0,
$x = \log_a y$ means that $y = a^x$.
The natural logarithm has base e = 2.718 281 . . . and is often written as ln. The
common logarithm has base 10 and is often written as log.
Provided that a > 0 and a ≠ 1, the logarithm to the base a has the following
properties:
(a) $\log_a 1 = 0$, $\log_a a = 1$;
(b) for x > 0 and y > 0,
(i) $\log_a(xy) = \log_a x + \log_a y$,
(ii) $\log_a(x/y) = \log_a x - \log_a y$;
(c) for x > 0 and p in R, $\log_a(x^p) = p\log_a x$.
To use logarithms to solve an equation of the form $a^x = k$, where k > 0 and
a > 0, a ≠ 1, apply the function ln to both sides of the equation, and use
property (c) to obtain
$x = \frac{\ln k}{\ln a}$.
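A short Python sketch of this method follows. The function name `solve_exponential` is an illustrative assumption; `math.log` with one argument is the natural logarithm ln.

```python
import math

def solve_exponential(a, k):
    """Solve a**x = k for x, using x = ln k / ln a."""
    if a <= 0 or a == 1 or k <= 0:
        raise ValueError("need a > 0, a != 1 and k > 0")
    return math.log(k) / math.log(a)

# Solve 2**x = 10 and check the answer by substituting back.
x = solve_exponential(2, 10)
print(x, 2 ** x)
```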
30
MST121 B1
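$\sum_{i=0}^{n} ar^i = a + ar + ar^2 + \cdots + ar^n = a\,\frac{1 - r^{n+1}}{1 - r} \quad (r \neq 1)$
$\sum_{i=0}^{\infty} ar^i = \frac{a}{1 - r} \quad (|r| < 1)$
$\sum_{i=1}^{n} (a + bx_i) = an + b\sum_{i=1}^{n} x_i, \qquad \sum_{i=m}^{n} (a + bx_i) = a(n - m + 1) + b\sum_{i=m}^{n} x_i$
$\sum_{i=1}^{n} i = 1 + 2 + 3 + \cdots + n = \frac{1}{2}n(n + 1)$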
Exponential model
The (discrete) exponential model for population variation is based on the assumption
of a constant proportionate growth rate, r. The model is described by either the
recurrence relation
$P_{n+1} = (1 + r)P_n \quad (n = 0, 1, 2, \ldots)$,
or its closed-form solution
$P_n = (1 + r)^n P_0 \quad (n = 0, 1, 2, \ldots)$,
where $P_n$ is the population size at n years after some chosen starting time. The
proportionate growth rate r is the proportionate birth rate minus the proportionate
death rate.
Logistic model
The logistic model for population variation is based on the assumption of a
proportionate growth rate R(P ) of the form R(P ) = r(1 − P/E), where r and E are
positive parameters. The model is described by the recurrence relation
$P_{n+1} - P_n = rP_n\left(1 - \frac{P_n}{E}\right) \quad (n = 0, 1, 2, \ldots)$,
where $P_n$ is the population size at n years after some chosen starting time. The
positive constant r represents the proportionate growth rate of the population when
the population size is small, and the positive constant E represents the equilibrium
population level (the population size at which the proportionate growth rate is zero).
The long-term behaviour of sequences generated by the logistic recurrence relation
(with $0 < P_0 \le E(1 + 1/r)$) depends on the value of r, as shown in the table below.
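The logistic recurrence relation can be iterated directly. The following Python is a minimal sketch; the function name `logistic_sequence` and the parameter values in the example are illustrative assumptions, not data from the course.

```python
def logistic_sequence(p0, r, E, n):
    """Return [P_0, ..., P_n] for P_{k+1} - P_k = r * P_k * (1 - P_k / E)."""
    populations = [p0]
    for _ in range(n):
        p = populations[-1]
        populations.append(p + r * p * (1 - p / E))
    return populations

# Hypothetical parameters: r = 0.5, E = 100, starting from P_0 = 10.
for year, size in enumerate(logistic_sequence(10, 0.5, 100, 10)):
    print(year, round(size, 2))
```

Starting exactly at the equilibrium level E gives a constant sequence, because the proportionate growth rate is zero there.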
31
Reciprocal Rule
If the terms of a sequence $b_n$ are of the form $1/a_n$, where terms of the sequence $a_n$
become arbitrarily large as n increases, then $\lim_{n \to \infty} b_n = 0$.
If p > 0, then $n^p \to \infty$ as $n \to \infty$.
If p < 0, then $n^p \to 0$ as $n \to \infty$.
If p = 0, then $n^p = 1$.
32
MST121 B2
$\begin{pmatrix} 0.4 & 0.3 \\ 0.2 & 0.1 \\ 0.4 & 0.6 \end{pmatrix}$
If the outputs of one network feed directly into an equal number of inputs in a
second network, then the matrix representing the combined network is obtained by
multiplying the matrices representing the two original networks.
Matrix multiplication
Two matrices A and B can be multiplied only if the number of columns of A equals
the number of rows of B. The element in the ith row and jth column of the product
matrix AB is obtained by adding up the products of corresponding elements of the
ith row of A and the jth column of B.
Thus, if A is an m × n matrix and B is an n × p matrix, then C = AB is an m × p
matrix with elements given by
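$c_{ij} = \sum_{k=1}^{n} a_{ik}b_{kj} \quad (i = 1, 2, \ldots, m \text{ and } j = 1, 2, \ldots, p)$.
For example, if $\mathbf{A} = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \\ a_{31} & a_{32} \end{pmatrix}$ and $\mathbf{B} = \begin{pmatrix} b_{11} & b_{12} & b_{13} \\ b_{21} & b_{22} & b_{23} \end{pmatrix}$, then
$\mathbf{C} = \mathbf{AB} = \begin{pmatrix} a_{11}b_{11} + a_{12}b_{21} & a_{11}b_{12} + a_{12}b_{22} & a_{11}b_{13} + a_{12}b_{23} \\ a_{21}b_{11} + a_{22}b_{21} & a_{21}b_{12} + a_{22}b_{22} & a_{21}b_{13} + a_{22}b_{23} \\ a_{31}b_{11} + a_{32}b_{21} & a_{31}b_{12} + a_{32}b_{22} & a_{31}b_{13} + a_{32}b_{23} \end{pmatrix}$.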
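The element formula for a matrix product translates directly into code. The sketch below assumes matrices are stored as lists of rows; the function name `mat_mul` is illustrative.

```python
def mat_mul(A, B):
    """Multiply an m x n matrix A by an n x p matrix B (lists of rows)."""
    m, n, p = len(A), len(B), len(B[0])
    if any(len(row) != n for row in A):
        raise ValueError("the number of columns of A must equal the number of rows of B")
    # c_ij is the sum over k of a_ik * b_kj.
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(p)]
            for i in range(m)]

# A 3 x 2 matrix times a 2 x 3 matrix gives a 3 x 3 matrix.
A = [[1, 2], [3, 4], [5, 6]]
B = [[1, 0, 2], [0, 1, 3]]
print(mat_mul(A, B))
```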
33
Matrix addition
Two matrices A and B can be added only if they have the same size. If A and B are
m × n matrices, then C = A + B is also an m × n matrix with elements given by
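$c_{ij} = a_{ij} + b_{ij} \quad (i = 1, 2, \ldots, m \text{ and } j = 1, 2, \ldots, n)$.
For example, if $\mathbf{A} = \begin{pmatrix} a_{11} & a_{12} & a_{13} \end{pmatrix}$ and $\mathbf{B} = \begin{pmatrix} b_{11} & b_{12} & b_{13} \end{pmatrix}$, then
$\mathbf{A} + \mathbf{B} = \begin{pmatrix} a_{11} + b_{11} & a_{12} + b_{12} & a_{13} + b_{13} \end{pmatrix}$.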
Scalar multiplication
When a matrix is scalar multiplied by a real number k, each element of the matrix is
multiplied by k. For example, if
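$\mathbf{A} = \begin{pmatrix} a_{11} & a_{12} & a_{13} \end{pmatrix}$, then $k\mathbf{A} = \begin{pmatrix} ka_{11} & ka_{12} & ka_{13} \end{pmatrix}$.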
Vectors
A vector is a matrix with only one column. Elements of a vector v are often called
components and are specified as $v_i$. The size of a vector is the number of
components it has.
For vectors u and v, the vectors u + v and ku are formed according to the
definitions for general matrices.
Population modelling
A matrix model for the structure of a population in terms of two interdependent
subpopulations $J_n$ and $A_n$ is given by
$\mathbf{p}_{n+1} = \mathbf{M}\mathbf{p}_n \quad (n = 0, 1, 2, \ldots)$,
where M is a 2 × 2 matrix and $\mathbf{p}_n$ is the vector $\begin{pmatrix} J_n \\ A_n \end{pmatrix}$ which gives the
subpopulation sizes at n years after a chosen starting time.
The closed-form solution for this model is
$\mathbf{p}_n = \mathbf{M}^n\mathbf{p}_0 \quad (n = 1, 2, 3, \ldots)$.
34
Inverting 2 × 2 matrices
The matrix $\mathbf{I} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$ is the 2 × 2 identity matrix. For any 2 × 2 matrix A,
AI = IA = A.
If two 2 × 2 matrices A and B have the property that AB = I = BA, then B is the
inverse of A. The inverse of a matrix A is usually denoted $\mathbf{A}^{-1}$. The inverse of the
general 2 × 2 matrix $\begin{pmatrix} a & b \\ c & d \end{pmatrix}$ is given by
$\frac{1}{ad - bc}\begin{pmatrix} d & -b \\ -c & a \end{pmatrix}$, provided $ad - bc \neq 0$.
If the determinant of a matrix A is zero, then A does not have an inverse and A is
non-invertible.
35
MST121 B3
Triangle Rule
To find the sum a + b of two vectors a and b in geometric form.
1. Choose any point P in the plane.
2. Draw an arrow to represent a, with tail at P and tip at Q, say.
3. Draw an arrow to represent b, with tail at Q and tip at R, say.
4. Draw the arrow with tail at P and tip at R, to complete the triangle PQR. This
last arrow represents the vector a + b.
Parallelogram Rule
To find the sum a + b of two vectors a and b in geometric form.
1. Choose any point P in the plane.
2. Draw an arrow to represent a, with tail at P and tip at Q, say.
3. Draw an arrow to represent b, with tail at P and tip at S, say.
4. Complete the parallelogram PQRS, and draw the arrow with tail at P and tip at
R. This last arrow represents the vector a + b.
Scalar multiplication
If a is a vector in geometric form and k is a real number, then the scalar multiple ka
has magnitude |ka| = |k||a|. If k is non-zero, then the direction of ka is the same as
that of a if k > 0, or opposite to that of a if k < 0.
$a_2 = |\mathbf{a}|\sin\theta$.
36
Sine Rule
For any triangle, the side lengths a, b, c and corresponding opposite angles A, B, C
are related by the formulas
$\frac{\sin A}{a} = \frac{\sin B}{b} = \frac{\sin C}{c}$ or, equivalently, $\frac{a}{\sin A} = \frac{b}{\sin B} = \frac{c}{\sin C}$.
Cosine Rule
For any triangle, the side lengths a, b, c and corresponding opposite angles A, B, C
are related by the formulas
$a^2 = b^2 + c^2 - 2bc\cos A, \qquad \cos A = \frac{b^2 + c^2 - a^2}{2bc}$,
$b^2 = c^2 + a^2 - 2ca\cos B, \qquad \cos B = \frac{c^2 + a^2 - b^2}{2ca}$,
$c^2 = a^2 + b^2 - 2ab\cos C, \qquad \cos C = \frac{a^2 + b^2 - c^2}{2ab}$.
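As a sketch of how the two forms of the Cosine Rule are used, the following Python finds a side from two sides and the included angle, and an angle from three sides. Angles are in radians, and the function names are illustrative assumptions.

```python
import math

def side_from_two_sides_and_angle(b, c, A):
    """Return side a from a^2 = b^2 + c^2 - 2bc cos A."""
    return math.sqrt(b**2 + c**2 - 2 * b * c * math.cos(A))

def angle_from_three_sides(a, b, c):
    """Return angle A (opposite side a) from cos A = (b^2 + c^2 - a^2) / (2bc)."""
    return math.acos((b**2 + c**2 - a**2) / (2 * b * c))

# A triangle with sides 3 and 4 enclosing a right angle has third side 5.
print(side_from_two_sides_and_angle(3, 4, math.pi / 2))
print(angle_from_three_sides(5, 3, 4))
```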
37
MST121 C1
Differentiation
Differentiation is a process which enables you to find: the gradient of a graph, and
the rate at which one variable changes with respect to another.
Let f be a function.
Function f(x)           Derivative f′(x)
cos(ax)                 −a sin(ax)
$e^{ax}$                $ae^{ax}$
ln(ax) (ax > 0)         1/x (ax > 0)
Product Rule
If k is a function with rule of the form k(x) = f (x)g(x), where f and g are smooth
functions, then k is smooth and
$k'(x) = f'(x)g(x) + f(x)g'(x)$.
In Leibniz notation, if y = uv, where u = f(x) and v = g(x), then
$\frac{dy}{dx} = \frac{du}{dx}v + u\frac{dv}{dx}$.
38
Quotient Rule
If k is a function with rule of the form k(x) = f (x)/g(x), where f and g are smooth
functions, then k is smooth and
$k'(x) = \frac{g(x)f'(x) - f(x)g'(x)}{(g(x))^2}$.
In Leibniz notation, if y = u/v, where u = f(x) and v = g(x) ≠ 0, then
$\frac{dy}{dx} = \frac{1}{v^2}\left(v\frac{du}{dx} - u\frac{dv}{dx}\right)$.
Composite Rule
If k is a function with rule of the form k(x) = g(f (x)), where f and g are smooth
functions, then k is smooth and
$k'(x) = g'(f(x))f'(x)$.
In Leibniz notation (Chain Rule), if y = g(u), where u = f(x), then
$\frac{dy}{dx} = \frac{dy}{du}\frac{du}{dx}$.
Increasing/Decreasing Criterion
Let I be an open interval in the domain of a smooth function f .
Stationary points
Let f be a smooth function. The function f has a stationary point at $x = x_0$ if
$f'(x_0) = 0$. The corresponding point $(x_0, f(x_0))$ on the graph of f is also called a
stationary point.
39
MST121 C1/C2
Optimisation Procedure
To find the greatest (or least) value of a smooth function f on a closed interval I
within the domain of f , proceed as follows.
1. Find the stationary points of f .
2. Evaluate f at each of the endpoints of I and at each of the stationary points
inside I.
3. Choose the greatest (or least) of the function values found in Step 2.
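The three steps can be sketched in Python. In this sketch the stationary points (Step 1) are assumed to have been found already, for example by solving f′(x) = 0 by hand, and are passed in as a list; the function name `greatest_and_least` is hypothetical.

```python
def greatest_and_least(f, a, b, stationary_points):
    """Return (greatest, least) value of f on the closed interval [a, b].

    stationary_points: the solutions of f'(x) = 0, found beforehand (Step 1).
    """
    # Step 2: evaluate f at the endpoints and at the stationary points inside the interval.
    candidates = [a, b] + [x for x in stationary_points if a < x < b]
    values = [f(x) for x in candidates]
    # Step 3: choose the greatest (or least) of these values.
    return max(values), min(values)

# f(x) = x^3 - 3x has f'(x) = 3x^2 - 3, so its stationary points are x = -1 and x = 1.
print(greatest_and_least(lambda x: x**3 - 3 * x, -2, 3, [-1, 1]))
```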
Integration
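The function F is an integral of the function f if $F' = f$. The indefinite integral
of f(x) is
$\int f(x)\,dx = F(x) + c$,
where c is an arbitrary constant. The definite integral of f(x) from a to b, written $\int_a^b f(x)\,dx$,
is defined to be
$[F(x)]_a^b = F(b) - F(a)$,
where F is any integral of f.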
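Function f(x)           Integral $\int f(x)\,dx$
a (constant)            $ax + c$
$x^n$ (n ≠ −1)          $\frac{1}{n+1}x^{n+1} + c$
$\frac{1}{x}$ (x > 0)   $\ln x + c$
$e^{ax}$                $\frac{1}{a}e^{ax} + c$
cos(ax)                 $\frac{1}{a}\sin(ax) + c$
sin(ax)                 $-\frac{1}{a}\cos(ax) + c$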
40
MST121 C2
Double-angle formulas
sin(2θ) = 2 sin θ cos θ
$\cos(2\theta) = \cos^2\theta - \sin^2\theta = 2\cos^2\theta - 1 = 1 - 2\sin^2\theta$
Modelling motion
The SI units for kinematic quantities are as follows.
$v = \frac{ds}{dt}$ and $a = \frac{dv}{dt}$.
The following formulas apply for the motion of a particle along a straight line with
constant acceleration a, if at time t = 0 the particle has velocity $v_0$ and position $s_0$.
The velocity and position of the particle are related by the equation
$v^2 - 2as = v_0^2 - 2as_0$.
41
MST121 C2/C3
Direct integration
The general solution of the differential equation dy/dx = f (x) is the indefinite
integral
$y = \int f(x)\,dx = F(x) + c$,
where F (x) is any integral of f (x) and c is an arbitrary constant. Any initial
condition
y = b when x = a, that is, y(a) = b,
enables a value for the arbitrary constant c to be found. The corresponding
particular solution satisfies both the differential equation and the initial condition.
42
MST121 C3
Implicit differentiation
If y is a function of x and H(y) = F(x), then $H'(y)\frac{dy}{dx} = F'(x)$.
Separation of variables
The method applies to differential equations of the form dy/dx = f (x)g(y).
1. Divide both sides by g(y), for g(y) ≠ 0, to obtain
$\frac{1}{g(y)}\frac{dy}{dx} = f(x)$.
2. Integrate both sides with respect to x. The outcome is
$\int \frac{1}{g(y)}\,dy = \int f(x)\,dx$.
3. Carry out the two integrations, introducing one arbitrary constant, to obtain the
general solution in implicit form. If possible, manipulate the resulting equation
to make y the subject, thus expressing the general solution in explicit form.
43
Euler’s method
Euler’s method for solving the initial-value problem
$\frac{dy}{dx} = f(x, y), \quad y(x_0) = y_0$
is described by the pair of recurrence relations
$x_{n+1} = x_n + h, \quad y_{n+1} = y_n + hf(x_n, y_n) \quad (n = 0, 1, 2, \ldots)$,
where h is the step size between the successive values of x at which solution
estimates are calculated. Each calculated value $y_n$ is an estimate of the
corresponding ‘true solution’ y at $x = x_n$; that is, $y_n$ is an estimate of $y(x_n)$. The
sequence of estimates depends on the choice of both the step size h and the overall
number of steps N . Decreasing h, while increasing N to cover the same range of
x-values, leads to progressively improved estimates for the solution values, and with
a small enough step size, any desired level of accuracy can be achieved.
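Euler's method can be sketched in a few lines of Python. The function name `euler` and the test problem dy/dx = y, y(0) = 1 (whose true solution at x = 1 is e) are illustrative choices, not examples taken from the course.

```python
import math

def euler(f, x0, y0, h, N):
    """Apply N steps of Euler's method to dy/dx = f(x, y), y(x0) = y0.

    Returns the lists [x_0, ..., x_N] and [y_0, ..., y_N].
    """
    xs, ys = [x0], [y0]
    for _ in range(N):
        # y_{n+1} = y_n + h f(x_n, y_n), using the current x_n and y_n.
        ys.append(ys[-1] + h * f(xs[-1], ys[-1]))
        # x_{n+1} = x_n + h.
        xs.append(xs[-1] + h)
    return xs, ys

# Estimate y(1) with two step sizes that cover the same range of x-values.
for h, N in [(0.1, 10), (0.01, 100)]:
    xs, ys = euler(lambda x, y: y, 0.0, 1.0, h, N)
    print(h, ys[-1], abs(ys[-1] - math.e))
```

In this example the smaller step size gives the estimate closer to the true value, in line with the remark above about decreasing h while increasing N.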
44
MST121 D1
Probability
For any event E, 0 ≤ P (E) ≤ 1.
The multiplication rule for independent events states that if E and F are independent events, then P(E and F) = P(E) × P(F).
Geometric distributions
If a sequence of trials of an experiment is carried out and the probability of success
in each trial is p (0 < p < 1) independently of the results of earlier trials, then X, the
number of trials required to obtain a success, has a geometric distribution. The
probability function of X is given by
$P(X = j) = (1 - p)^{j-1}p, \quad j = 1, 2, 3, \ldots$.
The mean number of trials required to obtain a success is 1/p.
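The probability function and the mean 1/p can be checked numerically. The sketch below uses an illustrative function name `geometric_pmf` and approximates the infinite sums by their first 1000 terms, which is an assumption that works because the remaining terms are negligible for the value of p used.

```python
def geometric_pmf(j, p):
    """Return P(X = j) = (1 - p)**(j - 1) * p for j = 1, 2, 3, ..."""
    return (1 - p) ** (j - 1) * p

p = 0.25
# The probabilities sum to 1, and the mean (sum of j * P(X = j)) is 1/p.
total = sum(geometric_pmf(j, p) for j in range(1, 1001))
mean = sum(j * geometric_pmf(j, p) for j in range(1, 1001))
print(total, mean, 1 / p)
```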
Probability distributions
The mean of the probability distribution of a discrete random variable X is denoted
by µ and is defined to be
$\mu = \sum_{j} j \times P(X = j)$,
where the summation is over all values j which X can take, that is, for which
P (X = j) > 0.
The corresponding formula for the mean of a continuous random variable X with
probability density function f is
$\mu = \int_{-\infty}^{\infty} x f(x)\,dx$.
45
MST121 D2
The sample mean and sample standard deviation are examples of sample
statistics. In general, given a sample of data from a population, the sample mean $\bar{x}$
is used to estimate the population mean µ, and the sample standard deviation s is
used to estimate the population standard deviation σ.
Normal distributions
A normal distribution is a continuous probability distribution. Probabilities are
calculated by finding areas under a normal curve, which has the following typical
shape.
46
MST121 D3/D4
Sampling distributions, the Central Limit Theorem and confidence intervals
The sampling distribution of the mean for samples of size n from a population with
mean µ and standard deviation σ has mean µ and standard deviation $\sigma/\sqrt{n}$. These
results hold for any sample size.
The standard deviation of the sampling distribution of the mean is called the
standard error of the mean and is denoted by SE.
The Central Limit Theorem states that, for large sample sizes (at least 25), the
sampling distribution of the mean for samples of size n from a population with mean
µ and standard deviation σ may be approximated by a normal distribution with
mean µ and standard deviation
$\mathrm{SE} = \frac{\sigma}{\sqrt{n}}$.
Given a sample of size n from a population, a 95% confidence interval for the
population mean µ is given by
$\left(\bar{x} - 1.96\frac{s}{\sqrt{n}},\ \bar{x} + 1.96\frac{s}{\sqrt{n}}\right)$,
where $\bar{x}$ is the sample mean and s is the sample standard deviation. The sample size
n must be at least 25.
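A 95% confidence interval of this form can be computed as sketched below. The function name is hypothetical, the data are made up for illustration, and the sample standard deviation is assumed to be the one computed by Python's `statistics.stdev` (divisor n − 1).

```python
import math
import statistics

def confidence_interval_95(sample):
    """Return the 95% confidence interval (lower, upper) for the population mean."""
    n = len(sample)
    if n < 25:
        raise ValueError("the sample size must be at least 25")
    mean = statistics.mean(sample)
    s = statistics.stdev(sample)  # assumption: divisor n - 1
    half_width = 1.96 * s / math.sqrt(n)
    return mean - half_width, mean + half_width

# Illustrative data only: the integers 1 to 25.
print(confidence_interval_95(list(range(1, 26))))
```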
The box extends from the lower quartile to the upper quartile, and a vertical line is
drawn through the box at the median. The whiskers extend from the ends of the box
to the minimum and maximum values in the batch of data.
47
MST121 D4
Hypothesis testing
There are three stages involved in a hypothesis test:
1. Set up the null and alternative hypotheses.
2. Calculate the test statistic.
3. Report conclusions.
The two-sample z-test is a hypothesis test which may be used when a sample of at
least 25 observations is available from each of two populations. It may be used to
investigate whether there is a difference between the means of the populations. The
three stages involved in carrying out the two-sample z-test are outlined below.
Stage 1: Hypotheses
Set up the null and alternative hypotheses:
$H_0: \mu_A = \mu_B$,
$H_1: \mu_A \neq \mu_B$,
where $\mu_A$ and $\mu_B$ are the means of populations A and B, respectively.
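Stage 2: Test statistic
Calculate the test statistic
$z = \frac{\bar{x}_A - \bar{x}_B}{\mathrm{ESE}}$, where $\mathrm{ESE} = \sqrt{\frac{s_A^2}{n_A} + \frac{s_B^2}{n_B}}$,
$\bar{x}_A$ and $\bar{x}_B$ are the sample means, $s_A$ and $s_B$ are the sample standard deviations, and
$n_A$ and $n_B$ are the sizes of the samples from A and B, respectively.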
Stage 3: Conclusions
If −1.96 < z < 1.96, then H0 is not rejected at the 5% significance level.
The conclusion should be expressed in terms of the hypothesis being tested.
The quantity ESE in the test statistic for the two-sample z-test is the estimated
standard error of the difference between two sample means; that is, it is the
estimated value of the standard deviation of the sampling distribution of the
difference between two sample means.
48