(A) Some Basic Mathematics

QM4F Some basic mathematics (logarithms, differentiation and optimisation, matrices and vectors) Darryl Holden September 7, 2015


QM4F

Some basic mathematics (logarithms, differentiation and optimisation, matrices and vectors)

Darryl Holden

September 7, 2015

1
The maths notation
y = f (x)
(‘y equals f of x’) indicates that the variable y is determined
by (is a function of) the variable x. In a diagram with x on the
horizontal and y on the vertical the function appears as a
means of journeying from the horizontal to the vertical.
An example is
$$y = f(x) = ax^2 + bx + c. \qquad (1)$$
As long as a is not zero (so $x^2$ actually appears) we have a
quadratic function.

2
In fact different a, b, and c give different quadratic functions so
we might take (1) as defining the family of quadratic functions.
The particular quadratic

$$y = f(x) = 3x^2 - 6x - 24 \qquad (2)$$

follows from a = 3, b = −6, and c = −24. We have

$$f(-1) = -15 \qquad f(0) = -24 \qquad f(1) = -27.$$

3
The roots of a quadratic are the x which make the y of (1)
equal to zero. They can be obtained by using
$$\frac{-b \pm \sqrt{b^2 - 4ac}}{2a} \qquad (3)$$
where ± indicates two calculations, one with − and one with
+. Note that there are two distinct roots if $b^2 - 4ac$ is positive,
a single root ($-\frac{b}{2a}$) if $b^2 - 4ac = 0$, and no real roots if
$b^2 - 4ac$ is negative (because a negative number does not have a
real square root).
For the quadratic in (2) I can also find the roots by writing the
quadratic as

$$3(x^2 - 2x - 8) = 3(x - 4)(x + 2),$$

so the roots are x = 4 and x = −2.
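As a quick check of the formula in (3) against the factorisation, here is a minimal Python sketch. The function names `quadratic_roots` and `f` are illustrative choices, not part of the slides.

```python
import math

def quadratic_roots(a, b, c):
    """Return the real roots of a*x**2 + b*x + c = 0 as a sorted list."""
    if a == 0:
        raise ValueError("a must be non-zero for a quadratic")
    disc = b * b - 4 * a * c  # the b^2 - 4ac term in (3)
    if disc < 0:
        return []  # no real roots
    if disc == 0:
        return [-b / (2 * a)]  # a single root, -b/(2a)
    root = math.sqrt(disc)
    return sorted([(-b - root) / (2 * a), (-b + root) / (2 * a)])

def f(x):
    return 3 * x**2 - 6 * x - 24  # the quadratic in (2)

print(quadratic_roots(3, -6, -24))  # [-2.0, 4.0]
print(f(-1), f(0), f(1))            # -15 -24 -27
```

Both roots agree with the factorisation 3(x − 4)(x + 2).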

4
If a = 0 in (1) we obtain

y = f (x) = bx + c

which defines a linear function, corresponding to a straight line
in y, x space. Now c is the value of y when x equals zero, often
called the vertical intercept, and b is the increase in y when x
increases by one. To see this write

$$y_2 = bx_2 + c \qquad y_1 = bx_1 + c,$$

$$y_2 - y_1 = b(x_2 - x_1),$$
and let $x_2 = x_1 + 1$.

5
Two other functions we’ll consider in due course are the
exponential function and the natural logarithm function,

$$y = e^x \qquad y = \ln(x).$$

Sometimes it is important to be aware that only certain x are
allowed. E.g., the value of ln(x) is only defined for positive x. A
term like ln(−3) is not defined. Sometimes it is important to
be aware that only certain y are possible. E.g., $e^x$ always results
in a positive value. So there are no x which result in $e^x = -3$.

6
notation: it is often a good idea to move away from the general
y = f(x) notation and use symbols appropriate in a particular
application. For example in economics you might find q = f(p)
to suggest that quantity depends on price. In fact in a market
there is a quantity demanded by consumers ($q^d$ say) and a
quantity supplied by producers ($q^s$ say) and both depend on
price. We might write $q^d = f(p)$ (quantity demanded depends
on price) and $q^s = h(p)$ (quantity supplied depends on price)
where f and h denote different functions of p. $q^d$ ($q^s$) is
assumed to be a decreasing (increasing) function of p.

7
e

The value after t years of a pound invested at an interest rate
of r is given by

$$V_t = (1 + r)^t \qquad t = 0, 1, 2, 3, \ldots \qquad (4)$$

if interest is paid annually. Now suppose interest is paid at the
rate $\frac{r}{2}$ every six months. The future value is now
$$V_t = \left[1 + \frac{r}{2}\right]^{2t}$$
for $t = 0, \frac{1}{2}, 1, \frac{3}{2}, 2, \frac{5}{2}, \ldots$

8
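Similarly, if interest is paid at the rate $\frac{r}{3}$ every four months
then
$$V_t = \left[1 + \frac{r}{3}\right]^{3t}$$
for $t = 0, \frac{1}{3}, \frac{2}{3}, 1, \frac{4}{3}, \frac{5}{3}, 2, \frac{7}{3}, \ldots$
The general rule for the case where interest is paid at the rate
$\frac{r}{n}$ n times per year is
$$V_t = \left[1 + \frac{r}{n}\right]^{nt}$$
for $t = 0, \frac{1}{n}, \frac{2}{n}, \ldots, 1, \frac{n+1}{n}, \ldots$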

9
We can ask what happens when n goes to infinity (n → ∞)
and interest accrues on a continuous basis, and we write t ≥ 0
for the resulting expression. We have
$$\lim_{n \to \infty} \left[1 + \frac{r}{n}\right]^n = e^r$$
where e is the nonrepeating, nonterminating decimal that
begins e = 2.71828 . . ., or, in a bit more detail,

2.718281828459045235360287471352662497757247093699959 . . .

10
The implied continuous time future value is

$$V(t) = e^{rt} \qquad t \ge 0$$

which is also commonly presented as V(t) = exp(rt). We speak
of the exponential function being evaluated at rt. For t = 1 we
have
$$V(1) = e^r$$
and then r = 0.05 gives $e^{0.05} = 1.051271096$, which is close to, but
not exactly, the 1 + r we obtain with an annual interest
payment (put t = 1 in (4)).
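The move from annual to continuous compounding can be illustrated numerically. The sketch below is only an illustration: the function name `future_value` and the particular values of n are my own choices. It evaluates the value after one year for increasingly frequent interest payments and compares it with exp(r).

```python
import math

def future_value(r, n, t=1.0):
    """Value after t years of one pound with interest r/n paid n times per year."""
    return (1 + r / n) ** (n * t)

r = 0.05
for n in (1, 2, 3, 12, 365):
    # The value rises with n but stays below the continuous limit exp(r).
    print(n, future_value(r, n))
print("continuous", math.exp(r))
```

The printed values increase with n and approach exp(0.05) from below.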

11
The exponential function and logarithms

I use my calculator to obtain

$$e^1 = 2.718281828 \qquad e^2 = 7.389056099$$

$$e^{-1} = 0.3678794412 \qquad e^{-2} = 0.1353352832$$


and can now introduce the natural logarithm or logarithm to
the base e, ln. The definition is as the power to which e has to
be raised to obtain a given number. We have

ln(2.718281828) = 1 ln(7.389056099) = 2

ln(0.3678794412) = −1 ln(0.1353352832) = −2
where top left is ln(e) = 1.

12
In mathematics notation, and as in the various examples on the
previous slide,
$$y = e^x$$
means x = ln(y). So the exponential function takes us from x
to y and the natural logarithm takes us from y back to x. The
natural logarithm is called the inverse function to the
exponential function.

aside (the vice versa bit): writing the previous paragraph with y
and x interchanged means $x = e^y$ is the same as y = ln(x). So
now the natural logarithm takes us from x to y and the
exponential takes us back to x. The exponential is called the
inverse to the natural logarithm function.

13
On the next slide I have a sketch of the exponential function,
$y = e^x$, on the right and a sketch of the natural logarithm
function, y = ln(x), on the left.

On the right $y = e^x$ takes us from the horizontal to the vertical
and x = ln(y) takes us from the vertical to the horizontal.

On the left y = ln(x) takes us from the horizontal to the
vertical and $x = e^y$ takes us from the vertical to the horizontal.

14
[Figure: sketches of y = ln(x) (left) and $y = e^x$ (right).]

15
Some rules are
$$\ln(ab) = \ln(a) + \ln(b),$$
as long as a and b are positive,

$$\ln(a^p) = p \ln(a),$$

as long as a is positive. Also we have ln(1) = 0 and ln(e) = 1.

We also have
$$e^a e^b = e^{a+b} \qquad (e^a)^p = e^{ap}$$
with $e^1 = e$ and $e^0 = 1$.

16
example: How many years are required for a doubling of value in

$$V(t) = Ie^{rt} \qquad t \ge 0$$

where I is an initial investment? We need to find the t which
makes V(t) = 2I, which is

$$2 = e^{rt}.$$

We have
$$\ln(2) = \ln(e^{rt}) = rt \ln(e) = rt$$
and then
$$t = \frac{\ln(2)}{r}.$$
For r = 0.05 we have t = 13.86 years.

17
In the same framework we can find the r which is required for
value to double in some amount of time, $t^*$ say. We need
$$2 = e^{rt^*}$$
and
$$\ln(2) = \ln(e^{rt^*}) = rt^* \ln(e) = rt^*$$
and
$$r = \frac{\ln(2)}{t^*}.$$

18
Then for $t^* = 8$ we have
$$r = \frac{\ln(2)}{8} = 0.08664339757$$
corresponding to around 8.7%.
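The two doubling calculations can be reproduced with a short Python sketch. The function names `doubling_time` and `doubling_rate` are illustrative, not from the slides.

```python
import math

def doubling_time(r):
    """Years needed for V(t) = I*exp(r*t) to reach 2*I."""
    return math.log(2) / r

def doubling_rate(t_star):
    """Continuous rate r needed for value to double in t_star years."""
    return math.log(2) / t_star

print(round(doubling_time(0.05), 2))  # 13.86
print(round(doubling_rate(8), 4))     # 0.0866
```

Note that the initial investment I plays no role: it cancels when we set V(t) = 2I.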

19
Common logarithms are logarithms to base 10. Now we need
to ask what power 10 has to be raised to in order to obtain a given
value. For example log(100) = 2 as $100 = 10^2$. You can have
logarithms to any base greater than 1 but e and 10 are the
bases most commonly in use.

20
Differentiation

The derivative of the function y = f(x) is denoted as $y'$ (y
dash), or $f'$ (f dash), or $\frac{dy}{dx}$ (d y by d x), or $\frac{df}{dx}$ (d f by d x).
For $f(x) = x^2$ we have
$$\frac{dy}{dx} = 2x.$$
We see that (i) the function is decreasing in value when the
derivative is negative, (ii) the function is increasing in value
when the derivative is positive, and (iii) the function is at its
minimum value when the derivative is zero.

21
(i), (ii), and (iii) are true for all functions so we see a use for
the derivative in providing information about the nature of a
given function. (iii) suggests the usefulness of the derivative in
minimisation problems (minimise cost, minimise risk, . . . ). In
fact a zero derivative also goes with a function being at its
maximum so the derivative will also be useful in maximisation
problems (maximise profit, maximise return, . . . ). But we’ll
need to discover how to distinguish between maximum and
minimum if both go with zero derivative.

22
formal definition: The formal definition of the derivative is
$$\lim_{h \to 0} \frac{f(x + h) - f(x)}{h}$$

where $\frac{f(x + h) - f(x)}{h}$ is the slope of the line ab in the following
figure. Letting h go to zero implies dragging b back along the
function in the direction of a. In the limit the two points
become one and we have the derivative as the slope of the
tangent line to the function at a, which is the slope of tt. We
have a negatively sloped tangent line, ie a negative derivative,
indicating the function is decreasing in value at a.

23
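[Figure: the function f plotted against x, showing the points a and b on the curve, the horizontal distance h and the vertical distance f(x + h) − f(x) between them, and the tangent line tt at a.]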

24
For $f(x) = x^2$ we have

$$\frac{f(x + h) - f(x)}{h} = \frac{(x + h)^2 - x^2}{h} = \frac{x^2 + h^2 + 2hx - x^2}{h}$$
which is
$$\frac{h^2 + 2hx}{h} = h + 2x.$$
We obtain 2x by letting h go to zero. So we have
$$\frac{d(x^2)}{dx} = 2x$$
by applying the basic definition. Alternatively we have the
p = 2 application of the power rule.
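The limiting argument can also be seen numerically. In this small Python sketch the names `difference_quotient` and `square`, and the choice x = 3, are illustrative assumptions of mine.

```python
def difference_quotient(f, x, h):
    """Slope of the line joining (x, f(x)) and (x + h, f(x + h))."""
    return (f(x + h) - f(x)) / h

def square(x):
    return x ** 2

x = 3.0
for h in (1.0, 0.1, 0.01, 0.001):
    # For f(x) = x^2 the quotient equals h + 2x, so it approaches 2x = 6.
    print(h, difference_quotient(square, x, h))
```

As h shrinks the printed slope moves towards 2x, in line with the h + 2x expression.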

25
$$\frac{d(x^p)}{dx} = px^{p-1} \qquad \text{(power rule)}$$
Notice that this works for all p, so that, for example (p = 0.5)

$$\frac{d(x^{0.5})}{dx} = 0.5x^{-0.5}$$
Which raises two questions. What is $x^{-0.5}$? Answer: $\frac{1}{x^{0.5}}$, so
that
$$\frac{d(x^{0.5})}{dx} = \frac{1}{2x^{0.5}},$$
(similarly $x^{-2} = \frac{1}{x^2}$, . . . ), and what is $x^{0.5}$? Answer: $\sqrt{x}$, the
square root of x. This function is only defined for x ≥ 0.

26
Two special cases of the power rule merit separate mention.
p = 1 gives
$$\frac{d(x)}{dx} = 1,$$
and p = 0 goes with
$$\frac{d(\text{constant})}{dx} = 0.$$

27
Differentiating the quadratic
For the quadratic

$$y = ax^2 + bx + c$$
we can write the derivative, $\frac{d(ax^2 + bx + c)}{dx}$, as
$$\frac{d(ax^2)}{dx} + \frac{d(bx)}{dx} + \frac{d(c)}{dx},$$
so the individual terms making up y are differentiated
separately, and then as
$$a\frac{d(x^2)}{dx} + b\frac{d(x)}{dx},$$
which is
$$2ax + b.$$
We get a zero derivative at $x = -\frac{b}{2a}$.

28
Interpreting the derivative
For the linear function y = bx + c the derivative is $\frac{dy}{dx} = b$. But
b, and therefore now $\frac{dy}{dx}$, is the increase in y when x increases
by 1. That $\frac{dy}{dx}$ is the increase in y when x increases by 1 is not
quite strictly correct in general but it is how the derivative is
interpreted. The interpretation is not quite strictly correct
because it is based on moving from a, in the figure on slide 23,
along the tangent line tt. This is not quite the same as moving
around the curve, but it ought to be OK if the change in x is
‘small’.

29
The word marginal is often attached to a derivative. For
example if c = c(q) relates a firm’s costs c to its output q then
$\frac{dc}{dq}$, the increase in costs when output increases by 1, is called
marginal cost and if u = u(w) relates an individual’s utility u
to their wealth w then $\frac{du}{dw}$, the increase in utility when wealth
increases by 1, is called marginal utility. The common
assumption of ‘decreasing marginal utility’ means marginal
utility, $\frac{du}{dw}$, decreases as wealth increases, meaning the w
derivative of $\frac{du}{dw}$ is negative. The w derivative of $\frac{du}{dw}$ is the
second derivative of u(w).

30
Two important tangents: The tangent line to the exponential
function at 0 is
$$t(x) = 1 + x.$$
For ‘small’ x the implication is that $e^x \approx 1 + x$. For
example $e^{0.1} = 1.105170918$ is just a little bigger than 1 + 0.1.
That $e^x$ is greater than t(x) is correctly suggested by the
sketch on slide 15.
Let $P_t$ ($P_{t-1}$) be the price of an asset at time t (t − 1). The tangent
to the ln function at $P_{t-1}$ is
$$t(x) = \ln(P_{t-1}) + \frac{x - P_{t-1}}{P_{t-1}}.$$

31
As long as $P_t$ is ‘close’ to $P_{t-1}$ we have
$$\ln(P_t) \approx \ln(P_{t-1}) + \frac{P_t - P_{t-1}}{P_{t-1}}$$
or
$$\ln(P_t) - \ln(P_{t-1}) \approx \frac{P_t - P_{t-1}}{P_{t-1}}.$$
For example ln(105) − ln(100) = 0.04879016417 is just a little
smaller than $\frac{105 - 100}{100} = 0.05$. That $\ln(P_t)$ is smaller than $t(P_t)$ is
correctly suggested by the sketch on slide 15.
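Both tangent approximations are easy to check numerically. The following Python sketch is only illustrative; the function names `log_return` and `simple_return` are my own labels for the two sides of the approximation.

```python
import math

def log_return(p_now, p_prev):
    """ln(P_t) - ln(P_{t-1})."""
    return math.log(p_now) - math.log(p_prev)

def simple_return(p_now, p_prev):
    """(P_t - P_{t-1}) / P_{t-1}."""
    return (p_now - p_prev) / p_prev

# The log difference sits just below the proportionate change.
print(log_return(105, 100), simple_return(105, 100))

# The exponential sits just above its tangent 1 + x at small x.
print(math.exp(0.1), 1 + 0.1)
```

The gap between the two measures grows as the price change gets larger, which is why the approximation is only recommended for ‘close’ prices.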

32
[Figure: a function f plotted against x, with a maximum at $x_l$ and a minimum at $x_r$.]

33
maximum, minimum, and the second derivative
In the diagram we have a function with a maximum at $x_l$. The
derivative of the function is zero at this x value, as indicated
by the horizontal tangent. To the left (right) the derivative is
positive (negative) as the function is increasing (decreasing).
We see that the derivative is decreasing as we pass through $x_l$.
But a decreasing derivative will correspond to a negative
derivative of the derivative, the derivative of the derivative
being the second derivative of the original function, as in
$$\frac{d^2y}{dx^2} = \frac{d}{dx}\left(\frac{dy}{dx}\right)$$
(d 2 y by d x squared is the derivative of $\frac{dy}{dx}$).

34
So: for a maximum we need a zero first derivative and a
negative second derivative.
For the minimum at $x_r$ there is again a zero first derivative but
a positive second derivative as the first derivative is increasing
in value. So: for a minimum we need a zero first derivative and
a positive second derivative.
For the quadratic the second derivative is the derivative of the
first derivative, 2ax + b, so we have
$$\frac{d^2y}{dx^2} = 2a.$$
Now we see a positive (negative) a gives a positive (negative)
second derivative and $-\frac{b}{2a}$ corresponds to a minimum
(maximum) of the quadratic.

35
As another example consider

$$y = x^3 - 3x^2 - 24x + 30$$

where the first derivative is
$$\frac{dy}{dx} = 3x^2 - 6x - 24 \qquad (5)$$
and the second derivative is
$$\frac{d^2y}{dx^2} = 6x - 6. \qquad (6)$$

36
The x which make $\frac{dy}{dx} = 0$, the stationary points of the
function, are x = −2 and x = 4. To determine their nature
requires using the second derivative in (6). We have

$$\frac{d^2y}{dx^2} = 6(-2) - 6 = -18$$
at x = −2. The negative second derivative means x = −2 is
revealed as a maximum. We have
$$\frac{d^2y}{dx^2} = 6(4) - 6 = 18$$
at x = 4. The positive second derivative means x = 4 is
revealed as a minimum.
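The second derivative test for this cubic can be written as a small Python sketch. The helper names `dy`, `d2y` and `classify` are illustrative, and the "inconclusive" branch is my own label for the case where both derivatives are zero.

```python
def dy(x):
    return 3 * x**2 - 6 * x - 24  # first derivative, equation (5)

def d2y(x):
    return 6 * x - 6  # second derivative, equation (6)

def classify(x):
    """Classify a point using the first and second derivatives."""
    if dy(x) != 0:
        return "not stationary"
    if d2y(x) < 0:
        return "maximum"
    if d2y(x) > 0:
        return "minimum"
    return "inconclusive"

for x in (-2, 4):
    print(x, d2y(x), classify(x))
# -2 -18 maximum
# 4 18 minimum
```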

37
Aside: What about $\frac{dy}{dx} = 0$ and $\frac{d^2y}{dx^2} = 0$? You could have a
minimum (see $y = x^4$, on the right in the following sketch) or a
maximum (see $y = -x^4$, not shown in the following sketch).
But there is also the possibility of a point of inflexion (see
$y = x^3$, on the left in the following sketch), which is neither a
maximum nor a minimum.

38
[Figure: sketches of $y = x^3$ (left) and $y = x^4$ (right), each plotted as f against x.]

39
Partial differentiation and optimisation with two choice variables
Let $y = f(x_1, x_2)$, indicating y is determined by the values of
both $x_1$ and $x_2$. There are now two first derivatives, called
partial derivatives, and denoted using ∂ rather than d, as in
$$\frac{\partial y}{\partial x_1} \qquad \frac{\partial y}{\partial x_2}$$
In obtaining the first (second) partial derivative you simply
treat $x_2$ ($x_1$) as constant so we are given information about
what happens to y when $x_1$ ($x_2$) changes with $x_2$ ($x_1$) constant.

40
For $y = x_1^2 - 16x_1 + 5x_2^2 - 38x_2 + 4x_1x_2 + 27$ the first order
partial derivatives are
$$\frac{\partial y}{\partial x_1} = 2x_1 - 16 + 4x_2 \qquad \frac{\partial y}{\partial x_2} = 10x_2 - 38 + 4x_1.$$
For optimisation (maximisation and minimisation) both first
order partial derivatives are required to be zero. So we need to
solve $2x_1 - 16 + 4x_2 = 0$ and $10x_2 - 38 + 4x_1 = 0$ for $x_1$ and $x_2$.
This gives us $x_1 = 2$ and $x_2 = 3$. To discover what this
stationary point corresponds to requires the second order
partial derivatives. Two of these are basically just second
derivatives of the same ilk as previously.

41
 
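$$\frac{\partial^2 y}{\partial x_1^2} = \frac{\partial}{\partial x_1}\left(\frac{\partial y}{\partial x_1}\right),$$
which is 2 for our current example, and
$$\frac{\partial^2 y}{\partial x_2^2} = \frac{\partial}{\partial x_2}\left(\frac{\partial y}{\partial x_2}\right),$$
which is 10 for our current example.
There is also
$$\frac{\partial^2 y}{\partial x_1 \partial x_2}$$
where $\frac{\partial y}{\partial x_1}$ is differentiated with respect to $x_2$ or $\frac{\partial y}{\partial x_2}$ is
differentiated with respect to $x_1$. Both routes lead to the same
expression, 4 in our current example.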

42
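The conditions needed for optimisation are as follows. In
addition to $\frac{\partial y}{\partial x_1} = 0$ and $\frac{\partial y}{\partial x_2} = 0$ we also need

(i) For a maximum
$$\frac{\partial^2 y}{\partial x_1^2} < 0 \text{ and } \frac{\partial^2 y}{\partial x_2^2} < 0 \text{ and } \frac{\partial^2 y}{\partial x_1^2}\frac{\partial^2 y}{\partial x_2^2} - \left(\frac{\partial^2 y}{\partial x_1 \partial x_2}\right)^2 > 0.$$

(ii) For a minimum
$$\frac{\partial^2 y}{\partial x_1^2} > 0 \text{ and } \frac{\partial^2 y}{\partial x_2^2} > 0 \text{ and } \frac{\partial^2 y}{\partial x_1^2}\frac{\partial^2 y}{\partial x_2^2} - \left(\frac{\partial^2 y}{\partial x_1 \partial x_2}\right)^2 > 0.$$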
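Applied to the running example, where the second order partial derivatives are 2, 10 and 4, these conditions can be checked with a short Python sketch. The function names and the "conditions not met" label are illustrative assumptions, not part of the slides.

```python
def first_partials(x1, x2):
    """First order partial derivatives of the example function."""
    return (2 * x1 - 16 + 4 * x2, 10 * x2 - 38 + 4 * x1)

def classify_two_variable(f11, f22, f12):
    """Apply the second order conditions at a stationary point.

    f11 and f22 are the own second partial derivatives,
    f12 is the cross partial derivative.
    """
    det = f11 * f22 - f12 ** 2
    if det > 0 and f11 < 0 and f22 < 0:
        return "maximum"
    if det > 0 and f11 > 0 and f22 > 0:
        return "minimum"
    return "conditions not met"

print(first_partials(2, 3))             # (0, 0)
print(classify_two_variable(2, 10, 4))  # minimum
```

Here 2 × 10 − 4² = 4 is positive and both own second derivatives are positive, so the stationary point at (2, 3) is a minimum.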

43
A constrained optimisation
We have an unconstrained minimum of
$y = x_1^2 - 16x_1 + 5x_2^2 - 38x_2 + 4x_1x_2 + 27$ at $x_1 = 2$ and $x_2 = 3$.
Now let’s consider the consequences of the constraint

$$x_1 = x_2.$$

There are sophisticated techniques for solving constrained
optimisation problems[a] but the method of substitution works
here. Just substitute $x_2 = x_1$ to obtain

$$y = x_1^2 - 16x_1 + 5x_1^2 - 38x_1 + 4x_1^2 + 27 = 10x_1^2 - 54x_1 + 27.$$

[a] The method of Lagrange multipliers, named after . . .

44
We have
$$\frac{dy}{dx_1} = 20x_1 - 54 \qquad \frac{d^2y}{dx_1^2} = 20.$$
We see $x_1 = 2.7$ gives a minimum of y subject to $x_1 = x_2$.

The cost of the constraint will be in the value of y achieved.
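To put a number on that cost, a minimal Python sketch can evaluate y at the unconstrained and constrained minima. The function name `y` and the rounding to six decimals are my own choices.

```python
def y(x1, x2):
    return x1**2 - 16 * x1 + 5 * x2**2 - 38 * x2 + 4 * x1 * x2 + 27

unconstrained = y(2, 3)    # the unconstrained stationary point
constrained = y(2.7, 2.7)  # the minimum subject to x1 = x2

print(unconstrained)                          # -46
print(round(constrained, 6))                  # -45.9
print(round(constrained - unconstrained, 6))  # 0.1
```

The constrained minimum is slightly higher than the unconstrained one, as it must be when a constraint binds.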

45
Some rules for differentiation
(a) The exponential and logarithmic functions have derivatives
$$\frac{de^x}{dx} = e^x \qquad \frac{d\ln(x)}{dx} = \frac{1}{x}.$$

(b) The product rule tells us that the derivative of $f_1 \times f_2$ is

$$f_1\frac{df_2}{dx} + f_2\frac{df_1}{dx}.$$
As an example consider x × ln(x) (which is only defined for
positive x).
We have $f_1 = x$ and $f_2 = \ln(x)$ or $f_1 = \ln(x)$ and $f_2 = x$. It
does not matter which combination we choose.

46
The second option gives us
$$\frac{df_1}{dx} = \frac{1}{x} \qquad \frac{df_2}{dx} = 1.$$
We obtain the derivative of x × ln(x) as ln(x) + 1.

47
(c) The quotient rule applies when $y = \frac{f_1}{f_2}$ gives y as the ratio
of two functions. We have
$$\frac{dy}{dx} = \frac{1}{f_2^2}\left(f_2\frac{df_1}{dx} - f_1\frac{df_2}{dx}\right).$$
For example the derivative of $\frac{x^3}{e^x}$ is

$$\frac{e^x \times 3x^2 - x^3e^x}{(e^x)^2}$$
which is
$$\frac{e^x x^2(3 - x)}{(e^x)^2} = \frac{x^2(3 - x)}{e^x}.$$
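The product rule and quotient rule examples can be checked against a numerical derivative. This Python sketch uses a central difference, which is my own choice of approximation; the name `numerical_derivative`, the step h and the evaluation point x = 2 are likewise assumptions for illustration.

```python
import math

def numerical_derivative(f, x, h=1e-6):
    """Central difference approximation to the derivative of f at x."""
    return (f(x + h) - f(x - h)) / (2 * h)

x = 2.0

# Product rule example: the derivative of x*ln(x) is ln(x) + 1.
product_numeric = numerical_derivative(lambda v: v * math.log(v), x)
product_formula = math.log(x) + 1

# Quotient rule example: the derivative of x^3/e^x is x^2 (3 - x)/e^x.
quotient_numeric = numerical_derivative(lambda v: v**3 / math.exp(v), x)
quotient_formula = x**2 * (3 - x) / math.exp(x)

print(product_numeric, product_formula)
print(quotient_numeric, quotient_formula)
```

In each printed pair the numerical value and the formula agree to several decimal places.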

48
(d) The chain rule
Suppose we have the functions y = 3u and u = 2x and want to
obtain $\frac{dy}{dx}$. We can write y = 3u = 3(2x) = 6x and then
$$\frac{dy}{dx} = 6.$$
Or we can write
$$\frac{dy}{du} = 3 \qquad \frac{du}{dx} = 2$$
and
$$\underbrace{\frac{dy}{dx} = \frac{dy}{du}\frac{du}{dx}}_{\text{the (a?) chain rule}} = 3 \times 2 = 6$$
just as if one du knocked out the other.

49
You usually need to introduce u to the story. For example

y = f (x) = ln(g(x))

can be differentiated if we introduce u = g(x). We have

y = ln(u) u = g(x)

and then
$$\frac{dy}{du} = \frac{1}{u} \qquad \frac{du}{dx} = g'(x).$$
The chain rule gives
$$\frac{d\ln(g(x))}{dx} = \frac{g'(x)}{g(x)}. \qquad (7)$$

50
A similar manipulation gives

$$\frac{de^{g(x)}}{dx} = g'(x) \times e^{g(x)}.$$

51
The result in (7) explains a common interpretation of $\frac{d\ln(g(x))}{dx}$.
It is as the increase in g(x) when x increases by 1 unit ($g'(x)$)
expressed as a proportion of the initial value (because of
division by g(x)). So: the interpretation is as the proportionate
increase in g when x increases by 1 unit.
A related story applies to
$$\frac{d\ln(g(x))}{d\ln(x)} = \frac{x}{g(x)}\frac{dg}{dx}.$$
Some of the details are missing but we have the percentage
increase in y = g(x) when x increases by 1%. This is the elasticity
measurement of the responsiveness of y to x.

52
The derivative $\frac{d\ln(g(x))}{d\ln(x)}$ may look a little strange but it will
certainly be true that $\frac{d\ln(g(x))}{d\ln(x)} = \gamma$ in

$$\ln(g) = \pi + \gamma \ln(x).$$

So γ has an interpretation of an elasticity and we have a
constant elasticity specification.
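That the elasticity is the same at every x can be seen in a small Python sketch. The parameter values for π and γ below are hypothetical, chosen only for illustration, and the function name `elasticity` and the central difference in ln(x) are my own choices.

```python
import math

def elasticity(g, x, h=1e-6):
    """Approximate d ln(g(x)) / d ln(x) by a central difference in ln(x)."""
    upper = math.log(g(x * math.exp(h)))
    lower = math.log(g(x * math.exp(-h)))
    return (upper - lower) / (2 * h)

pi_, gamma = 1.5, -0.8  # hypothetical parameter values

def g(x):
    # ln(g) = pi + gamma*ln(x) is the same as g(x) = exp(pi) * x**gamma
    return math.exp(pi_) * x ** gamma

for x in (0.5, 2.0, 10.0):
    print(x, elasticity(g, x))
```

The printed elasticity is approximately γ at each x, whatever the value of π.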

53
Matrices and vectors
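A matrix is defined as a rectangular array of numbers arranged
in rows and columns and contained in brackets. Examples are
$$A = \begin{bmatrix} 1 & 0 \\ -2 & 5.8 \\ 4.3 & -7.4 \end{bmatrix} \qquad B = \begin{bmatrix} 4 & -6.3 \\ 1 & 12.45 \end{bmatrix} \qquad C = \begin{bmatrix} -8 & 4.6 \end{bmatrix}$$
and
$$D = \begin{bmatrix} 4 & -4 & 5.6 \\ 2 & 15 & -6.5 \end{bmatrix} \qquad E = \begin{bmatrix} 7.2 \\ -3.2 \\ 1.2 \\ -8 \end{bmatrix}$$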

54
A is described as ‘3 by 2’ (or 3 × 2 or (3, 2)), as in 3 rows and 2
columns. The row count always comes first. So B is 2 by 2, and
is an example of a square matrix, C is 1 by 2, D is 2 by 3, and
E is 4 by 1. Capitals are routinely used to denote matrices.
As well as being a matrix C can also be called a row vector. As
well as being called a matrix E can also be called a column
vector. Lowercase, as in c and e, is acceptable for vectors. For
both matrices and vectors boldface, A, c, . . . , is often used.

55
An attempt at generality might present
$$M = \begin{bmatrix} m_{11} & m_{12} & \ldots & m_{1q} \\ m_{21} & m_{22} & \ldots & m_{2q} \\ \vdots & & & \vdots \\ \vdots & & & \vdots \\ m_{p1} & \ldots & \ldots & m_{pq} \end{bmatrix}$$

as a p by q matrix. Note the convention with the subscripts.
$m_{21}$ is in the second row and the first column rather than the
first row and the second column.

56
Addition and subtraction: addition and subtraction are only
defined for matrices of the same size. Given matrices of the
same size we add, or subtract, corresponding elements. For
example given
$$F = \begin{bmatrix} 1 & 0 \\ -2 & 5.8 \\ 4.3 & -7.4 \end{bmatrix} \qquad G = \begin{bmatrix} 4 & -6.3 \\ 1 & 12.45 \\ -5 & 7.9 \end{bmatrix}$$

we have
$$F + G = \begin{bmatrix} 1 + 4 & 0 - 6.3 \\ -2 + 1 & 5.8 + 12.45 \\ 4.3 - 5 & -7.4 + 7.9 \end{bmatrix} = \begin{bmatrix} 5 & -6.3 \\ -1 & 18.25 \\ -0.7 & 0.5 \end{bmatrix}$$

57
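and
$$F - G = \begin{bmatrix} 1 - 4 & 0 + 6.3 \\ -2 - 1 & 5.8 - 12.45 \\ 4.3 + 5 & -7.4 - 7.9 \end{bmatrix} = \begin{bmatrix} -3 & 6.3 \\ -3 & -6.65 \\ 9.3 & -15.3 \end{bmatrix}$$

transposition: if you take the rows of a matrix and use them to
make the columns of a new matrix you obtain the transpose of
the original matrix. Writing the transpose with a superscript T, we have
$$E^{T} = \begin{bmatrix} 7.2 & -3.2 & 1.2 & -8 \end{bmatrix} \qquad F^{T} = \begin{bmatrix} 1 & -2 & 4.3 \\ 0 & 5.8 & -7.4 \end{bmatrix}$$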
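Addition, subtraction and transposition are simple to express in code. The sketch below uses plain Python lists of rows as a stand-in for matrices, which is my own representation choice; the function names are illustrative and results are rounded for display because of floating point arithmetic.

```python
def add(m1, m2):
    """Element-by-element sum of two matrices of the same size."""
    if len(m1) != len(m2) or any(len(r1) != len(r2) for r1, r2 in zip(m1, m2)):
        raise ValueError("matrices must be the same size")
    return [[a + b for a, b in zip(r1, r2)] for r1, r2 in zip(m1, m2)]

def subtract(m1, m2):
    """Element-by-element difference of two matrices of the same size."""
    return add(m1, [[-b for b in row] for row in m2])

def transpose(m):
    """Rows of m become the columns of the result."""
    return [list(col) for col in zip(*m)]

F = [[1, 0], [-2, 5.8], [4.3, -7.4]]
G = [[4, -6.3], [1, 12.45], [-5, 7.9]]

print([[round(v, 2) for v in row] for row in add(F, G)])
print([[round(v, 2) for v in row] for row in subtract(F, G)])
print(transpose(F))  # [[1, -2, 4.3], [0, 5.8, -7.4]]
```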

58
multiplication: Our A is 3 by 2 and our B is 2 by 2. The
product AB will exist as a 3 by 2 matrix in this case. How do
we find the product?
To find the entry in the first row and the first column we take
the first row of A and the first column of B, we multiply
corresponding elements and sum. We have

1 × 4 + 0 × 1 = 4.

To find the entry in the first row and the second column we
take the first row of A and the second column of B, we
multiply corresponding elements and sum. We have

1 × (−6.3) + 0 × 12.45 = −6.3.

59
To find the entry in the second row and the first column we
take the second row of A and the first column of B, we
multiply corresponding elements and sum. We have

(−2) × 4 + 5.8 × 1 = −2.2.

To find the entry in the second row and the second column we
take the second row of A and the second column of B, we
multiply corresponding elements and sum. We have

−2 × (−6.3) + 5.8 × 12.45 = 84.81.

60
To find the entry in the third row and the first column we take
the third row of A and the first column of B, we multiply
corresponding elements and sum. We have

4.3 × 4 − 7.4 × 1 = 9.8.

To find the entry in the third row and the second column we
take the third row of A and the second column of B, we
multiply corresponding elements and sum. We have

4.3 × (−6.3) − 7.4 × 12.45 = −119.22.

61
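We have

$$AB = \begin{bmatrix} 1 \times 4 + 0 \times 1 & 1 \times (-6.3) + 0 \times 12.45 \\ (-2) \times 4 + 5.8 \times 1 & -2 \times (-6.3) + 5.8 \times 12.45 \\ 4.3 \times 4 - 7.4 \times 1 & 4.3 \times (-6.3) - 7.4 \times 12.45 \end{bmatrix} = \begin{bmatrix} 4 & -6.3 \\ -2.2 & 84.81 \\ 9.8 & -119.22 \end{bmatrix}$$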
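The row-by-column procedure can be written as a short Python sketch that recomputes AB entry by entry. Matrices are stored as lists of rows and the function name `matmul` is an illustrative choice; values are rounded for display because of floating point arithmetic.

```python
def matmul(m1, m2):
    """Product of a p-by-q matrix m1 and a q-by-s matrix m2 (lists of rows)."""
    if len(m1[0]) != len(m2):
        raise ValueError("columns of the first must equal rows of the second")
    columns = list(zip(*m2))
    # Each entry: multiply a row of m1 by a column of m2, element by element, and sum.
    return [[sum(a * b for a, b in zip(row, col)) for col in columns] for row in m1]

A = [[1, 0], [-2, 5.8], [4.3, -7.4]]
B = [[4, -6.3], [1, 12.45]]

AB = matmul(A, B)
print([[round(v, 2) for v in row] for row in AB])
# [[4, -6.3], [-2.2, 84.81], [9.8, -119.22]]
```

Calling `matmul(B, A)` raises a `ValueError` in this sketch, because B has 2 columns while A has 3 rows.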

62
This all works because there are exactly as many entries in the
rows of A as there are in the columns of B. That is, there are
exactly as many columns in A as there are rows in B. This is
what is required in general for a product to exist: the number
of columns of the first matrix equals the number of rows of the
second matrix. So, for example, the product BA will not exist
for the current A and B.

The general rule: let $M_1$ be p by q and $M_2$ be r by s. The
product $M_1M_2$ exists if q = r. If it exists it is p by s. The
product $M_2M_1$ exists if s = p. If it exists it is r by q.

63
The inverse matrix
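a single equation: writing $x = \frac{b}{a} = a^{-1}b$ solves

$$ax = b.$$

The solution exists as long as $a \neq 0$. If a is zero then it is not
possible to divide by it.
two equations: the equations

$$a_{11}x_1 + a_{12}x_2 = b_1 \qquad a_{21}x_1 + a_{22}x_2 = b_2$$

can be written as Ax = b, where
$$A = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix} \qquad x = \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} \qquad b = \begin{bmatrix} b_1 \\ b_2 \end{bmatrix}$$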

64
Solving Ax = b requires finding a matrix B giving
$$BA = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$$
where $\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$ is the identity matrix I (or $I_2$) of order 2. If
such a B exists it must be 2 by 2 and we will have

$$BAx = Bb$$

which is
$$x = Bb,$$
as BAx = Ix = x, so that we have an expression for x.

65
If it exists B is actually called the inverse of A, and is denoted
$A^{-1}$. Not all 2 by 2 matrices have inverses. When it exists $A^{-1}$
is given by
$$A^{-1} = \begin{bmatrix} \frac{a_{22}}{a_{11}a_{22} - a_{12}a_{21}} & \frac{-a_{12}}{a_{11}a_{22} - a_{12}a_{21}} \\ \frac{-a_{21}}{a_{11}a_{22} - a_{12}a_{21}} & \frac{a_{11}}{a_{11}a_{22} - a_{12}a_{21}} \end{bmatrix}$$

which is often presented as
$$A^{-1} = \frac{1}{a_{11}a_{22} - a_{12}a_{21}} \begin{bmatrix} a_{22} & -a_{12} \\ -a_{21} & a_{11} \end{bmatrix}$$

where a 2 by 2 matrix is multiplied by a scalar. This involves
the multiplication of every element of the 2 by 2 matrix by the
scalar.

66
The previous slide reveals what is required for $A^{-1}$ to exist.
We require
$$a_{11}a_{22} - a_{12}a_{21} \neq 0 \qquad (8)$$
for division by $a_{11}a_{22} - a_{12}a_{21}$ to be defined. As $a_{11}a_{22} - a_{12}a_{21}$
is the determinant of A a non-zero determinant is what is
required for $A^{-1}$ to exist.
So if (8) holds we can find $A^{-1}$ and the unique solution to
Ax = b, as $A^{-1}b$. What happens if $A^{-1}$ doesn’t exist? There
are a number of possibilities.

67
 
Consider $A = \begin{bmatrix} 1 & 1 \\ 2 & 2 \end{bmatrix}$ where $A^{-1}$ does not exist. When
$b = \begin{bmatrix} 1 \\ 6 \end{bmatrix}$ we have two contradictory equations so there will be
no $x = \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}$ solving them. When $b = \begin{bmatrix} 1 \\ 2 \end{bmatrix}$ we don’t really
have two equations. There will be an infinity of $x = \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}$
solving them. We require $x_2 = 1 - x_1$ for arbitrary $x_1$.
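The 2 by 2 inverse formula and the role of the determinant can be sketched in Python. As an illustration I reuse the first order conditions from the partial differentiation example, rearranged as $2x_1 + 4x_2 = 16$ and $4x_1 + 10x_2 = 38$; the function names `inverse_2x2` and `solve_2x2` are my own, and returning `None` for a zero determinant is a design choice of this sketch.

```python
def inverse_2x2(a):
    """Inverse of a 2 by 2 matrix, or None when the determinant is zero."""
    (a11, a12), (a21, a22) = a
    det = a11 * a22 - a12 * a21
    if det == 0:
        return None
    return [[a22 / det, -a12 / det], [-a21 / det, a11 / det]]

def solve_2x2(a, b):
    """Solve A x = b as x = A^{-1} b, or return None if A has no inverse."""
    inv = inverse_2x2(a)
    if inv is None:
        return None
    return [inv[0][0] * b[0] + inv[0][1] * b[1],
            inv[1][0] * b[0] + inv[1][1] * b[1]]

print(solve_2x2([[2, 4], [4, 10]], [16, 38]))  # [2.0, 3.0]
print(inverse_2x2([[1, 1], [2, 2]]))           # None
```

The first call recovers $x_1 = 2$ and $x_2 = 3$. The second shows that the determinant check alone cannot distinguish the no-solution case from the infinitely-many-solutions case; that depends on b.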

68
A great application of matrix methods is in defining the least
squares estimator of unknown parameters in a regression
equation. Of course this requires we learn something about
regression equations, unknown parameters, and least squares
estimation.

69
