Engineering Mathematics II
Contents
1 Matrices and determinant
1.1 Types of matrices
1.1.1 Some special types of square matrices
1.1.2 Operation with matrices
1.1.3 Equality
1.1.4 Addition
1.1.5 Subtraction
1.1.6 Scalar Multiplication
1.1.7 Matrix Multiplication
1.1.8 Transpose of a matrix
1.1.9 Inverse of a Matrix
1.1.10 Properties of transpose and inverse of matrix
1.2 Determinant of a matrix
1.2.1 Determinant of order two
1.2.2 Determinant of order three
1.2.3 Properties of determinant
1.2.4 Calculation of the Inverse of matrix
1.2.5 Minor and cofactors
1.2.6 Inverse of a square matrix
1.3 System of linear equations
1.4 Diagonalization
1.5 Eigen-system and Characteristic Polynomial of a Square Matrix
1.6 Exercises
2 Fourier Series
2.1 Theory
2.2 Exercises
2.3 Some useful integrals
2.4 Useful trigonometric results
2.5 Fourier series with period L ≠ 2π
2.6 Full worked Solutions
Mr. NIYIGABA Emmanuel, Assistant Lecturer (IPRC MUSANZE) 2
3 Laplace Transform
3.1 Important Formulae
3.2 Properties of Laplace Transform
5 Appendices
5.1 Exercises
5.2 Some questions
5.3 Solutions
6 Differential Equation
6.1 Some definitions
6.2 Formulation of Differential equation
6.3 Solution of a Differential Equation
6.3.1 First order DE by Separation of Variables Method
6.3.2 Homogeneous Differential Equations
6.3.3 Linear Differential Equations
6.3.4 Bernoulli's Equation
6.3.5 Exact Differential Equation
6.4 Linear Second order Differential Equation with Constant Coefficients
6.4.1 Solution of homogeneous second order DE with Constant Coefficients
6.4.2 Solution of non-homogeneous second order DE with Constant Coefficients
8 Probability distribution
8.1 Introduction
8.2 Discrete Random Variables
8.3 Probability Density Function
8.4 Cumulative Distribution Function
8.5 Expectation and Variance
8.5.1 Expected Value of a Function of X
8.5.2 Variance
8.6 The Binomial Distribution
8.6.1 Expectation and Variance
8.6.2 The Poisson's Distribution
8.6.3 Binomial Approximation
8.6.4 Derivation of the Poisson's distribution
9 Introduction to Statistics
9.1 Arrangement of data
9.2 Types of Variable
9.2.1 Dependent and Independent Variables
9.3 Experimental and Non-Experimental Research
9.3.1 Categorical Variables
9.3.2 Continuous variables
10 Descriptive Statistics
10.1 Measures of Central Tendency
10.1.1 Introduction
10.1.2 Mean (Arithmetic)
10.1.3 Median
10.1.4 Mode
10.2 Measures of Dispersion
10.2.1 Average Deviation
10.2.2 Variance
10.2.3 Standard Deviation
10.3 Five number summary and box plot
10.3.1 Five number summary
10.3.2 Box plot
10.3.3 Procedure for constructing a box plot
10.3.4 Information Obtained from a Box plot
10.3.5 Range and interquartile range
10.4 Graphical Representation of the Statistical Data
10.5 Frequency Distributions
10.5.1 Grouped Data: Tabular presentation of data
10.5.2 Some terminologies
10.5.3 Relative Frequency Distributions
10.5.4 General Rules for Organizing Data into Groups
10.6 Normal distribution curve
10.7 Exercises
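\[ A = \begin{pmatrix} 1 & 2 & 3 \\ 4 & 4 & 6 \\ 7 & 8 & 9 \end{pmatrix}, \tag{1} \]
\[ B = \begin{pmatrix} 1 & 2 & 3 \end{pmatrix}, \tag{2} \]
\[ C = \begin{pmatrix} 1 \\ 4 \\ 7 \end{pmatrix}, \tag{3} \]
\[ D = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 4 & 0 \\ 0 & 0 & 9 \end{pmatrix}, \tag{4} \]
\[ E = \begin{pmatrix} 1 & 2 & 3 & 0 \\ 4 & 4 & 6 & 10 \\ 7 & 8 & 9 & 23 \end{pmatrix}, \tag{5} \]
\[ UT = \begin{pmatrix} 1 & 0 & 9 \\ 0 & 4 & 7 \\ 0 & 0 & 9 \end{pmatrix}, \tag{6} \]
\[ LT = \begin{pmatrix} 1 & 0 & 0 \\ 78 & 4 & 0 \\ 9 & 6 & 9 \end{pmatrix} \quad \text{and} \tag{7} \]
\[ A = \begin{pmatrix} 1 & 2 & 3 \\ 4 & 4 & 6 \\ 7 & 8 & 9 \\ 4 & 4 & 6 \end{pmatrix}. \tag{8} \]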
The element $a_{ij}$ is located in the $i$th row and the $j$th column; for example, $a_{32}$ is the entry in the 3rd row and the 2nd column. The order of a matrix is given by its number of rows and its number of columns: a matrix with m rows and n columns has order m × n.
• Square matrices: the number of rows is the same as the number of columns, for example the matrix in Equation (1).
• Rectangular matrices: the number of rows is different from the number of columns, for example the matrices in Equations (8) and (5).
We may also distinguish row matrices and column matrices: a matrix made of a single row is called a row matrix, and a matrix made of a single column is called a column matrix. The matrices given by Equations (3) and (2) are examples of a column matrix and a row matrix respectively.
• Diagonal matrix: all entries above and below the leading diagonal are zero. The matrix given by Equation (4) is an example of a diagonal matrix.
• Lower triangular matrix: all entries above the leading diagonal are zero. The matrix given by Equation (7) is an example of a lower triangular matrix.
• Upper triangular matrix: all entries below the leading diagonal are zero. The matrix given by Equation (6) is an example of an upper triangular matrix.
• Identity matrix: all entries are zero except the entries on the leading diagonal, which are all equal to one (1).
Example:
\[ \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}. \]
Solution: Since A and B have the same order 4, we have the following: x = 6, z = 8 and y = 20.
1.1.4 Addition
The orders of the matrices must be the same; add corresponding elements together. Note that matrix addition is commutative and associative, i.e.
A + B = B + A (commutativity) and
(A + B) + C = A + (B + C) (associativity).
1.1.5 Subtraction
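The orders of the matrices must be the same; subtract corresponding elements. Note that matrix subtraction is neither commutative nor associative (the same holds for subtraction of real numbers).
Example: Compute
1. A + B,
2. A − B and
3. A + 3B,
where
\[ A = \begin{pmatrix} 2 & 3 & 4 \\ 0 & 2 & 6 \\ 6 & 3 & 8 \end{pmatrix} \quad \text{and} \quad B = \begin{pmatrix} 1 & 3 & 1 \\ 0 & 2 & 6 \\ 6 & 3 & 1 \end{pmatrix}. \]
Solution:
\[ A + B = \begin{pmatrix} 3 & 6 & 5 \\ 0 & 4 & 12 \\ 12 & 6 & 9 \end{pmatrix}, \]
\[ A - B = \begin{pmatrix} 1 & 0 & 3 \\ 0 & 0 & 0 \\ 0 & 0 & 7 \end{pmatrix}, \]
\[ A + 3B = \begin{pmatrix} 5 & 12 & 7 \\ 0 & 8 & 24 \\ 24 & 12 & 11 \end{pmatrix}. \tag{12} \]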
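The three results of this example can be reproduced with a few lines of Python. This is only a minimal sketch: the helper names `add`, `scale` and `sub` are illustrative choices, matrices are stored as lists of rows, and the second matrix of the example is called `B` here.

```python
# Matrices from the example, stored as lists of rows.
A = [[2, 3, 4], [0, 2, 6], [6, 3, 8]]
B = [[1, 3, 1], [0, 2, 6], [6, 3, 1]]

def add(X, Y):
    """Entry-wise sum of two matrices of the same order."""
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def scale(c, X):
    """Multiply every entry of X by the scalar c."""
    return [[c * x for x in row] for row in X]

def sub(X, Y):
    """Entry-wise difference X - Y, written as X + (-1)Y."""
    return add(X, scale(-1, Y))

print(add(A, B))            # A + B
print(sub(A, B))            # A - B
print(add(A, scale(3, B)))  # A + 3B
```

Swapping the arguments of `add` gives the same matrix, while swapping the arguments of `sub` does not, which illustrates the remarks on commutativity.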
1. Commutativity of Addition A + B = B + A,
2. Associativity of Addition A + (B + C) = (A + B) + C,
5. Distributive c (A + B) = cA + cB,
7. Additive Identity A + O = O + A = A,
The product is defined only when the number of columns in the first matrix is equal to the number of rows in the second matrix. The order of the product is the number of rows in the first matrix by the number of columns in the second matrix; that is, the dimensions of the product are the outer dimensions. The element in row i, column j of the product is formed by pairing each element in row i of the first matrix with the corresponding element in column j of the second matrix, multiplying the paired elements and summing the products. If the first matrix has n columns, there are n products to sum for each element of the product.
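1.
\[ AB = \begin{pmatrix} 2(1) + 3(0) + 4(6) & 2(3) + 3(2) + 4(3) & 2(1) + 3(6) + 4(1) \\ 0(1) + 2(0) + 6(6) & 0(3) + 2(2) + 6(3) & 0(1) + 2(6) + 6(1) \\ 6(1) + 3(0) + 8(6) & 6(3) + 3(2) + 8(3) & 6(1) + 3(6) + 8(1) \end{pmatrix} = \begin{pmatrix} 26 & 24 & 24 \\ 36 & 22 & 18 \\ 54 & 48 & 32 \end{pmatrix} \]
2.
\[ A^2 = \begin{pmatrix} 2 & 3 & 4 \\ 0 & 2 & 6 \\ 6 & 3 & 8 \end{pmatrix} \begin{pmatrix} 2 & 3 & 4 \\ 0 & 2 & 6 \\ 6 & 3 & 8 \end{pmatrix} = \begin{pmatrix} 28 & 24 & 58 \\ 36 & 22 & 60 \\ 60 & 48 & 106 \end{pmatrix} \]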
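The row-by-column rule can be sketched in Python as follows. The function name `matmul` is an illustrative choice, and the matrices are assumed to be the A and B of the preceding worked example, stored as lists of rows.

```python
def matmul(X, Y):
    """Row-by-column product; needs columns of X equal to rows of Y."""
    inner = len(Y)
    if any(len(row) != inner for row in X):
        raise ValueError("columns of X must equal rows of Y")
    return [
        [sum(X[i][k] * Y[k][j] for k in range(inner)) for j in range(len(Y[0]))]
        for i in range(len(X))
    ]

A = [[2, 3, 4], [0, 2, 6], [6, 3, 8]]
B = [[1, 3, 1], [0, 2, 6], [6, 3, 1]]

print(matmul(A, B))  # the product AB
print(matmul(A, A))  # the square of A
print(matmul(A, B) == matmul(B, A))  # matrix multiplication is not commutative
```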
• $(AB)^T = B^T A^T$,
• $(AB)^{-1} = B^{-1} A^{-1}$,
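Solution:
\[ \begin{vmatrix} 10 & 10 \\ 2 & -6 \end{vmatrix} = 10(-6) - 2(10) = -80 \]
\[ \begin{vmatrix} 10 & 10 & -10 \\ 2 & -6 & 12 \\ -4 & -12 & 6 \end{vmatrix} = 10 \begin{vmatrix} -6 & 12 \\ -12 & 6 \end{vmatrix} - 10 \begin{vmatrix} 2 & 12 \\ -4 & 6 \end{vmatrix} - 10 \begin{vmatrix} 2 & -6 \\ -4 & -12 \end{vmatrix} = 960 \]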
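The expansion along the first row used above generalises to any order. The following Python function is a minimal recursive sketch of that expansion (the name `det` is illustrative); it is fine for small matrices, although it is not how determinants of large matrices are computed in practice.

```python
def det(M):
    """Determinant by cofactor expansion along the first row."""
    n = len(M)
    if n == 1:
        return M[0][0]
    total = 0
    for j in range(n):
        # Minor: delete the first row and column j.
        minor = [row[:j] + row[j + 1:] for row in M[1:]]
        total += (-1) ** j * M[0][j] * det(minor)
    return total

print(det([[10, 10], [2, -6]]))
print(det([[10, 10, -10], [2, -6, 12], [-4, -12, 6]]))
```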
2. If a matrix has two identical rows or columns, then its determinant is zero.
3. If a matrix is triangular, then its determinant is the product of the entries on the leading diagonal.
4. If two rows or two columns of a matrix are interchanged, then its determinant changes sign.
6. If a multiple of a row or column of a matrix is added to another row or column, then its determinant is not changed.
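if the matrix
\[ A = \begin{pmatrix} a_{11} & a_{12} & a_{13} & \cdots & a_{1n} \\ a_{21} & a_{22} & a_{23} & \cdots & a_{2n} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & a_{m3} & \cdots & a_{mn} \end{pmatrix} \tag{17} \]
is invertible.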
Example: Consider the following diagram and find the current through each junction.
Solution:
Remark
If
\[ \begin{vmatrix} a_{11} & a_{12} & a_{13} & \cdots & a_{1n} \\ a_{21} & a_{22} & a_{23} & \cdots & a_{2n} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & a_{m3} & \cdots & a_{mn} \end{vmatrix} \neq 0, \]
then the system has a unique solution and is said to be consistent. If this determinant is zero, there are two cases to discuss: either there are infinitely many solutions or there is no solution. This has been discussed in class.
1.4 Diagonalization
Diagonalization is the process of transforming a square matrix A into a similar diagonal matrix D. We say that two matrices A and D are similar if there exists an invertible matrix B such that $D = B^{-1}AB$. To find this matrix B, we need the concepts of eigenvalues and eigenvectors of a given square matrix.
Definition 1
If X is a nonzero vector such that $AX = \lambda X$, then the scalar $\lambda$ is called an eigenvalue of A corresponding to X.
Definition 2
Such a nonzero vector X is called an eigenvector of A corresponding to the eigenvalue $\lambda$.
Definition 3
The association of eigenvalues and eigenvectors is called the eigen-system of A.
Definition 4
The determinant $p(\lambda) = \det(A - \lambda I)$ is called the characteristic polynomial of the matrix A. Here I is the identity matrix with the same dimension as the matrix A.
Remark
1. The degree of the characteristic polynomial is exactly the order of matrix A.
2. All eigenvalues of the matrix A are the roots of the characteristic polynomial.
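Example
Consider the following matrix
\[ A = \begin{pmatrix} 1 & 0 & -1 \\ 1 & 2 & 1 \\ 2 & 2 & 3 \end{pmatrix} \]
• Calculate the determinant of matrix A.
Solution
• Calculate the determinant of matrix A.
\[ \begin{vmatrix} 1 & 0 & -1 \\ 1 & 2 & 1 \\ 2 & 2 & 3 \end{vmatrix} = 1(6 - 2) - 1(2 - 4) = 6 \]
• Find all eigenvalues and eigenvectors of the matrix A. The characteristic polynomial factorizes (up to sign) as $(x - 3)(x - 2)(x - 1)$, hence the eigenvalues are given by the roots of the characteristic polynomial, i.e. $(x - 3)(x - 2)(x - 1) = 0$ implies that x = 3, x = 2 and x = 1 are the eigenvalues of the matrix A.
The eigenvectors are collected as the columns of the following matrix:
\[ \begin{pmatrix} 1 & 1 & 1 \\ -1 & -\frac{1}{2} & -1 \\ -2 & -1 & 0 \end{pmatrix}. \]
• Find a matrix P which transforms the matrix A into a diagonal form. The matrix
\[ P = \begin{pmatrix} 1 & 1 & 1 \\ -1 & -\frac{1}{2} & -1 \\ -2 & -1 & 0 \end{pmatrix} \quad \text{transforms A into} \quad D = \begin{pmatrix} 3 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 1 \end{pmatrix} \]
such that $D = P^{-1}AP$.
• Hence find $A^4$, $A^2$ and $A^{-1}$.
\[ P^{-1} = \begin{pmatrix} -1 & -1 & -\frac{1}{2} \\ 2 & 2 & 0 \\ 0 & -1 & \frac{1}{2} \end{pmatrix}, \quad \text{thus} \]
\[ A^4 = P D^4 P^{-1} = \begin{pmatrix} -49 & -50 & -40 \\ 65 & 66 & 40 \\ 130 & 130 & 81 \end{pmatrix} \]
\[ A^2 = P D^2 P^{-1} = \begin{pmatrix} -1 & -2 & -4 \\ 5 & 6 & 4 \\ 10 & 10 & 9 \end{pmatrix} \]
\[ A^{-1} = P D^{-1} P^{-1} = \begin{pmatrix} \frac{2}{3} & -\frac{1}{3} & \frac{1}{3} \\ -\frac{1}{6} & \frac{5}{6} & -\frac{1}{3} \\ -\frac{1}{3} & -\frac{1}{3} & \frac{1}{3} \end{pmatrix} \]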
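The diagonalization in this example can be checked with exact rational arithmetic. The sketch below assumes the matrices A, P and the inverse of P as read from the example; the helper names `matmul`, `diag_power` and `power_via_diag` are illustrative.

```python
from fractions import Fraction as F

def matmul(X, Y):
    """Row-by-column product of two matrices stored as lists of rows."""
    return [
        [sum(X[i][k] * Y[k][j] for k in range(len(Y))) for j in range(len(Y[0]))]
        for i in range(len(X))
    ]

A = [[1, 0, -1], [1, 2, 1], [2, 2, 3]]
# Columns of P are eigenvectors for the eigenvalues 3, 2 and 1.
P = [[F(1), F(1), F(1)], [F(-1), F(-1, 2), F(-1)], [F(-2), F(-1), F(0)]]
P_inv = [[F(-1), F(-1), F(-1, 2)], [F(2), F(2), F(0)], [F(0), F(-1), F(1, 2)]]

def diag_power(k):
    """D**k for D = diag(3, 2, 1); k may be negative."""
    return [[F(3) ** k, 0, 0], [0, F(2) ** k, 0], [0, 0, F(1) ** k]]

def power_via_diag(k):
    """A**k computed as P D**k P^(-1)."""
    return matmul(matmul(P, diag_power(k)), P_inv)

print(matmul(matmul(P_inv, A), P))  # should be diag(3, 2, 1)
print(power_via_diag(2))
print(power_via_diag(4))
print(power_via_diag(-1))
```

Only the diagonal entries are raised to the power k, which is the practical benefit of diagonalizing before computing powers or the inverse.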
1.6 Exercises
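1. Expand the following determinants:
(a) \[ \begin{vmatrix} 2 & -3 & 4 \\ 5 & 1 & -6 \\ -7 & 8 & -9 \end{vmatrix} \]
(b) \[ \begin{vmatrix} 5 & 0 & 7 \\ 8 & -6 & -4 \\ 2 & 3 & 9 \end{vmatrix} \]
2. Show the following equalities:
(a) \[ \begin{vmatrix} b - c & c - a & a - b \\ c - a & a - b & b - c \\ a - b & b - c & c - a \end{vmatrix} = 0 \]
(b) \[ \begin{vmatrix} x + y & x & x \\ 5x + 4y & 4x & 2x \\ 10x + 8y & 8x & 3x \end{vmatrix} = x^3 \]
(c) \[ \begin{vmatrix} x & y & z \\ x^2 & y^2 & z^2 \\ x^3 & y^3 & z^3 \end{vmatrix} = xyz(x - y)(y - z)(z - x) \]
(d) \[ \begin{vmatrix} a + x & y & z \\ x & a + y & z \\ x & y & a + z \end{vmatrix} = a^2(a + x + y + z) \]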
x y a + z
Find A and B.
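5. Given
\[ 3 \begin{pmatrix} x & y \\ z & w \end{pmatrix} = \begin{pmatrix} x & 6 \\ -1 & 2w \end{pmatrix} + \begin{pmatrix} 4 & x + y \\ z + w & 3 \end{pmatrix}. \]
Find x, y, z and w.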
2 Fourier Series
2.1 Theory
2.2 Exercises
3 Laplace Transform
Let $f(t)$ be a function defined for all positive values of t. Then
\[ F(s) = \int_0^\infty e^{-st} f(t)\,dt, \tag{18} \]
provided the integral exists, is called the Laplace transform of $f(t)$. It is denoted as
\[ L[f(t)] = F(s) = \int_0^\infty e^{-st} f(t)\,dt \]
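\[ L[1] = \frac{1}{s} \tag{19} \]
\[ L[e^{at}] = \frac{1}{s - a} \tag{20} \]
\[ L[\sin(at)] = \frac{a}{s^2 + a^2} \tag{21} \]
\[ L[\cos(at)] = \frac{s}{s^2 + a^2} \tag{22} \]
\[ L[\sinh(at)] = \frac{a}{s^2 - a^2} \tag{23} \]
\[ L[\cosh(at)] = \frac{s}{s^2 - a^2} \tag{24} \]
\[ L[t^n] = \frac{n!}{s^{n+1}} \tag{25} \]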
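The formulae above can be sanity-checked numerically by approximating the defining integral. The sketch below uses Simpson's rule on a truncated interval [0, T]; the function name `laplace` and the values of T and n are illustrative assumptions, and the truncation is only reasonable when the integrand has decayed to a negligible size by t = T.

```python
import math

def laplace(f, s, T=40.0, n=40000):
    """Approximate the integral of exp(-s t) f(t) over [0, T] by Simpson's rule (n even)."""
    h = T / n
    total = f(0.0) + math.exp(-s * T) * f(T)
    for i in range(1, n):
        t = i * h
        weight = 4 if i % 2 else 2
        total += weight * math.exp(-s * t) * f(t)
    return total * h / 3

s = 2.0
print(laplace(lambda t: 1.0, s), 1 / s)                           # formula (19)
print(laplace(lambda t: math.exp(t), s), 1 / (s - 1))             # formula (20), a = 1
print(laplace(lambda t: math.sin(3 * t), s), 3 / (s**2 + 9))      # formula (21), a = 3
print(laplace(lambda t: math.cos(3 * t), s), s / (s**2 + 9))      # formula (22), a = 3
print(laplace(lambda t: t**3, s), math.factorial(3) / s**4)       # formula (25), n = 3
```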
4. Laplace transform of an integral:
\[ L\left[\int_0^t f(t)\,dt\right] = \frac{F(s)}{s}, \quad \text{where } L[f(t)] = F(s), \]
5. Division by t theorem: if $L[f(t)] = F(s)$, then
\[ L\left[\frac{1}{t} f(t)\right] = \int_s^\infty F(s)\,ds, \]
3. $V_L = L\,\dfrac{dI}{dt}$
4. $I = \dfrac{dq}{dt}$
4.3 Illustrations
4.3.1 Example One
Suppose the current and charge on the capacitor in the circuit of Figure: 1 are zero at time
zero. Find the output voltage response to an input voltage modelled by δ(t) volts.
Solution
The output voltage is $E_{out}(t) = q(t)/C$, so we will determine $q(t)$. We also have $E_{in}(t) = \delta(t)$. By Kirchhoff's voltage law, we have the following:
\[ LI' + RI + \frac{1}{C}q = E_{in}(t), \tag{26} \]
i.e.
\[ LI' + RI + \frac{1}{C}q = \delta(t). \tag{27} \]
Since $I = dq/dt$, this can be written as follows:
\[ L\frac{d^2q}{dt^2} + R\frac{dq}{dt} + \frac{1}{C}q = \delta(t). \tag{28} \]
Assume that $q(0) = q'(0) = 0$; then, applying the Laplace transform to both sides, we get the following:
Solution
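Applying the Laplace transform to both sides of equations (54) and (55), we obtain the following:
\[ 4I(s) + \frac{12}{s}I(s) - \frac{12}{s}I_1(s) = \frac{1}{s}, \tag{36} \]
\[ I_1(s) + \frac{4}{s}I_1(s) - \frac{2}{s}I(s) = 0. \tag{37} \]
Simplification gives us the following:
\[ (s + 3)I(s) - 3I_1(s) = \frac{1}{4}, \tag{38} \]
\[ -2I(s) + (s + 4)I_1(s) = 0. \tag{39} \]
\[ I_1(t) = \frac{1}{10}e^{-t} - \frac{1}{10}e^{-6t} \quad \text{and} \tag{44} \]
\[ I(t) = \frac{3}{20}e^{-t} + \frac{1}{10}e^{-6t}. \tag{45} \]
\[ q_1 = -\frac{1}{10}e^{-t} + \frac{1}{60}e^{-6t} + \frac{1}{12} \quad \text{and} \tag{46} \]
\[ q = -\frac{3}{20}e^{-t} - \frac{1}{60}e^{-6t} + \frac{1}{6} \]
\[ I_2 = I - I_1 \quad \text{and} \quad q_2 = q - q_1. \tag{47} \]
\[ I_2(t) = \frac{1}{20}e^{-t} + \frac{1}{5}e^{-6t} \quad \text{and} \tag{48} \]
\[ q_2(t) = -\frac{1}{20}e^{-t} - \frac{1}{30}e^{-6t} + \frac{1}{12}. \tag{49} \]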
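The step from the algebraic system (38)-(39) to the time-domain currents (44)-(45) can be cross-checked without redoing the partial fractions. The sketch below solves the 2 × 2 system by Cramer's rule at a few rational values of s and compares the result with the transforms of (44) and (45), using L[e^{at}] = 1/(s − a). The function names are illustrative.

```python
from fractions import Fraction as F

def solve_currents(s):
    """Solve (s+3) I - 3 I1 = 1/4 and -2 I + (s+4) I1 = 0 by Cramer's rule."""
    a11, a12, b1 = s + 3, F(-3), F(1, 4)
    a21, a22, b2 = F(-2), s + 4, F(0)
    det = a11 * a22 - a12 * a21
    I = (b1 * a22 - a12 * b2) / det
    I1 = (a11 * b2 - a21 * b1) / det
    return I, I1

def transforms_of_stated_solution(s):
    """Laplace transforms of equations (45) and (44), term by term."""
    I = F(3, 20) / (s + 1) + F(1, 10) / (s + 6)
    I1 = F(1, 10) / (s + 1) - F(1, 10) / (s + 6)
    return I, I1

for s in (F(1), F(2), F(7, 3)):
    print(s, solve_currents(s) == transforms_of_stated_solution(s))
```

Agreement at several values of s is a consistency check, not a proof, but two rational functions of this degree that agree at enough points must coincide.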
5 Appendices
5.1 Exercises
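1. Calculate the determinant of the following matrices.
(a)
\[ \begin{pmatrix} 2 & 2 & 5 & 7 \\ 0 & 5 & 7 & 0 \\ 0 & 0 & 2 & 5 \\ 0 & 0 & 0 & 10 \end{pmatrix}, \quad \text{2 marks} \tag{50} \]
(b)
\[ \begin{pmatrix} 2 & 2 & 5 & 7 \\ 0 & 5 & 7 & 0 \\ 7 & 0 & 2 & 5 \\ 14 & 0 & 4 & 10 \end{pmatrix} \tag{51} \]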
Find A and B.
3. \[ \begin{vmatrix} x & y & z \\ x^2 & y^2 & z^2 \\ x^3 & y^3 & z^3 \end{vmatrix} = xyz(x - y)(y - z)(z - x) \]
i. $F(s) = \dfrac{s + 4}{s^2 + 4} + 5$.
ii. $F(s) = \dfrac{s}{s^2 + 4} + \dfrac{3}{s^2 + 9}$.
iii. $F(s) = \dfrac{e^{-5s}}{s^2 + 1} + e^{-3s} + \dfrac{2}{s^3}$.
(b) Find the differential equation whose solution is y = A cos(2x) + B sin(2x), where A
and B are arbitrary constants.
5.3 Solutions
Q1 (a) Solve the following Differential equations:
i. $\dfrac{dy}{dx} = \dfrac{\cos(x)}{\cos(y)}$.
$\cos(y)\,dy = \cos(x)\,dx$,
$\sin(y) = \sin(x) + C$.
ii. $y'' + 5y' + 6y = 0$.
The characteristic equation is given by $m^2 + 5m + 6 = 0$, which implies that m = −2 or m = −3. 1 mark
Thus the solution is the following: $y = Ae^{-2x} + Be^{-3x}$.
iii. $y'' + y' + y = 2e^{2t}$.
The characteristic equation is given by $m^2 + m + 1 = 0$, which implies that
$m = -\frac{1}{2} + i\frac{\sqrt{3}}{2}$ or $m = -\frac{1}{2} - i\frac{\sqrt{3}}{2}$.
Thus the homogeneous solution is as follows: $y_h = e^{-\frac{1}{2}t}\left(A\cos\left(\frac{\sqrt{3}}{2}t\right) + B\sin\left(\frac{\sqrt{3}}{2}t\right)\right)$.
The particular solution is of the form $y_p = Ce^{2t}$. Substituting gives $4C + 2C + C = 7C = 2$, thus $C = \frac{2}{7}$ and $y_p = \frac{2}{7}e^{2t}$. Therefore the general solution is as follows:
\[ y = y_p + y_h = e^{-\frac{1}{2}t}\left(A\cos\left(\frac{\sqrt{3}}{2}t\right) + B\sin\left(\frac{\sqrt{3}}{2}t\right)\right) + \frac{2}{7}e^{2t}. \]
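(b) Find the Fourier series representation of $f(t) = t^2$ on the interval $-\pi < t < \pi$.
This function is even, so $b_n = 0$ and we only need $a_0$ and $a_n$:
\[ a_0 = \frac{2}{\pi}\int_0^\pi t^2\,dt = \frac{2\pi^2}{3}, \]
\[ a_n = \frac{2}{\pi}\int_0^\pi t^2\cos(nt)\,dt = \frac{4}{n^2}(-1)^n. \]
So
\[ f(t) = \frac{a_0}{2} + \sum_{n=1}^\infty a_n\cos(nt) = \frac{\pi^2}{3} + \sum_{n=1}^\infty \frac{4}{n^2}(-1)^n\cos(nt). \]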
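A quick numerical check of a Fourier series is to evaluate a partial sum and compare it with the function. The sketch below does this for the standard cosine series of t² on (−π, π), namely π²/3 + 4 Σ (−1)ⁿ cos(nt)/n² with n starting at 1; the function name `partial_sum` and the number of terms are illustrative choices.

```python
import math

def partial_sum(t, N):
    """First N cosine terms of the Fourier series of t**2 on (-pi, pi)."""
    series = sum((-1) ** n * math.cos(n * t) / n**2 for n in range(1, N + 1))
    return math.pi**2 / 3 + 4 * series

for t in (0.0, 1.0, 2.5, math.pi):
    print(t, t**2, partial_sum(t, 2000))
```

At t = π the series reduces to π²/3 + 4 Σ 1/n², so the check at that point is equivalent to the classical sum Σ 1/n² = π²/6.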
\[ F(s) = \frac{2}{s^3} + \frac{e^{-5s}}{s^2 + 1} + e^{-3s} \]
Thus
\[ P^{-1}AP = \begin{pmatrix} 3 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 1 \end{pmatrix}. \]
i. $F(s) = \dfrac{s + 4}{s^2 + 4} + 5$.
$f(t) = \cos(2t) + 2\sin(2t) + 5\delta(t)$.
ii. $F(s) = \dfrac{s}{s^2 + 4} + \dfrac{3}{s^2 + 9}$.
$f(t) = \cos(2t) + \sin(3t)$.
iii. $F(s) = \dfrac{e^{-5s}}{s^2 + 1} + e^{-3s} + \dfrac{2}{s^3}$.
(b) Find the differential equation whose solution is y = A cos(2x) + B sin(2x), where A
and B are arbitrary constants.
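$y = A\cos(2x) + B\sin(2x)$
$y' = 2B\cos(2x) - 2A\sin(2x)$
$y'' = -4A\cos(2x) - 4B\sin(2x) = -4y$, hence the required differential equation is $y'' + 4y = 0$.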
6 Differential Equation
6.1 Some definitions
Definition 1: An equation which involves differential coefficients is called a differential equation.
Examples
1. \[ \frac{dy}{dx} = \frac{1 + x^2}{1 - y^2}, \]
2. \[ \frac{d^2y}{dx^2} = 2\frac{dy}{dx} - 8y, \]
3. \[ \alpha\frac{d^2y}{dx^2} = \left[1 + \left(\frac{dy}{dx}\right)^2\right]^{3/2}, \]
4. \[ x\frac{\partial u}{\partial x} + y\frac{\partial u}{\partial y} = nu. \]
We distinguish two types of differential equations: ordinary differential equations and partial differential equations, i.e. ODEs and PDEs.
Definition 2: An ODE is a differential equation involving derivatives with respect to a single independent variable.
Definition 3: A PDE is a differential equation involving partial derivatives with respect to more than one independent variable.
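1. \[ L\frac{d^2q}{dt^2} + R\frac{dq}{dt} + \frac{1}{C}q = E\sin\omega t \]
2. \[ \cos t\,\frac{d^2q}{dt^2} + \sin t\left(\frac{dq}{dt}\right)^2 + \frac{1}{C}q = E\sin\omega t \]
3. \[ \alpha\frac{d^2y}{dx^2} = \left[1 + \left(\frac{dy}{dx}\right)^2\right]^{3/2}, \]
The order of all the above equations is 2. The degree of equations 1 and 2 is 1. The degree of equation 3 is 2.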
1. $y = Ax + A^2$
2. $y = A\cos x + B\sin x$
3. $y^2 = Ax^2 + Bx + C$
Solution:
1. $y = Ax + A^2$
Differentiating once gives $y' = A$, and differentiating twice gives $y'' = 0$. Substituting $A = y'$ into the original relation eliminates the constant: $y = xy' + (y')^2$.
2. $y = A\cos x + B\sin x$
Differentiating once gives $y' = -A\sin x + B\cos x$, and differentiating twice gives $y'' = -(A\cos x + B\sin x)$. Replacing $A\cos x + B\sin x$ by y, we get $y'' = -y$.
3. $y^2 = Ax^2 + Bx + C$
Differentiating three times, we get the following:
\[ 2yy' = 2Ax + B \]
\[ 2y'y' + 2yy'' = 2A \]
\[ 2y''y' + 2y'y'' + 2y'y'' + 2yy''' = 0 \]
\[ 6y''y' + 2yy''' = 0 \]
\[ 3y''y' + yy''' = 0 \]
Exercises
Write the order and degree of the following differential equations:
1. (a) $y'' + a^2x = 0$
(b) $\left[1 + (y')^2\right]^{3/2} = y''$
(c) $x^2(y'')^3 + y(y')^2 + y^4 = 0$
f (y)dy = φ(x)dx,
we say that variables are separable. We get solution by integrating both sides.
Working Rule
Examples:
1. Solve
\[ \frac{dy}{dx} = \frac{x(2\ln x + 1)}{\sin y + y\cos y} \]
• Separate variables as $(\sin y + y\cos y)\,dy = x(2\ln x + 1)\,dx$.
• Integrate both sides as $\int(\sin y + y\cos y)\,dy = \int x(2\ln x + 1)\,dx$ ⇒
\[ -\cos y + y\sin y + \cos y = 2\left[\frac{x^2}{2}\ln x - \int\frac{x}{2}\,dx\right] + \frac{x^2}{2} \]
\[ y\sin y = x^2\ln(x) + C \]
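2. Solve $x^4y' + x^3y = \sec(xy)$.
$x^3(xy' + y) = \sec(xy)$. Let $u = xy$; then $\frac{du}{dx} = xy' + y$. This gives the following: $x^3u' = \sec u$, i.e. $\cos u\,du = x^{-3}\,dx$. Integrating both sides,
\[ \sin u = \frac{x^{-2}}{-2} + C \Rightarrow \sin xy = -\frac{x^{-2}}{2} + C \]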
3. Solve
\[ (2x^2 + 3y^2 - 7)x\,dx = (3x^2 + 2y^2 - 8)y\,dy. \]
\[ \frac{x\,dx}{y\,dy} = \frac{3x^2 + 2y^2 - 8}{2x^2 + 3y^2 - 7} \]
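An implicit solution obtained by separation of variables can be checked numerically. The sketch below does this for the first example, whose solution has the form y sin y − x² ln x = C: by implicit differentiation the slope is −F_x/F_y, and this should match the right-hand side of the differential equation. The function names and the sample points are illustrative, and the partial derivatives are approximated by central differences.

```python
import math

def F(x, y):
    """Left-hand side of the implicit solution F(x, y) = C."""
    return y * math.sin(y) - x * x * math.log(x)

def rhs(x, y):
    """Right-hand side of dy/dx = x(2 ln x + 1) / (sin y + y cos y)."""
    return x * (2 * math.log(x) + 1) / (math.sin(y) + y * math.cos(y))

def implicit_slope(x, y, h=1e-6):
    """dy/dx along a level curve of F, using central differences."""
    Fx = (F(x + h, y) - F(x - h, y)) / (2 * h)
    Fy = (F(x, y + h) - F(x, y - h)) / (2 * h)
    return -Fx / Fy

for point in [(1.5, 1.0), (2.0, 0.5), (3.0, 1.2)]:
    print(point, implicit_slope(*point), rhs(*point))
```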
\[ \frac{dy}{dx} = \frac{3xy + y^2}{3x^2 + xy}. \]
In such a case we put $y = v(x)x$ and $y' = v + xv'$. The reduced equation involves v and x only. The new differential equation can be solved by the separation of variables method.
Working Rule
• Put $y = vx \Rightarrow y' = v + xv'$
• Separate variables
Example 1: Solve
\[ (2xy + x^2)y' = 3y^2 + 2xy \]
\[ y' = \frac{3y^2 + 2xy}{2xy + x^2}. \]
This is a homogeneous DE.
• Put $y = vx \Rightarrow y' = v + xv'$
\[ xv' + v = \frac{3v^2x^2 + 2x^2v}{2x^2v + x^2}. \]
\[ x\frac{dv}{dx} = \frac{v^2 + v}{2v + 1} \]
• Separate variables
\[ \frac{dx}{x} = \frac{2v + 1}{v^2 + v}\,dv \]
• Integrate both sides
\[ \int\frac{dx}{x} = \int\frac{2v + 1}{v^2 + v}\,dv \]
\[ \ln|x| + \ln|c| = \ln|v^2 + v| \]
Hence $v^2 + v = cx$ and, since $v = y/x$, the solution is $y^2 + xy = cx^3$.
Example 2:
\[ y' = \frac{3xy + y^2}{3x^2 + xy} \]
Let $y = vx$; then $y' = v + xv'$.
\[ v + xv' = \frac{3x^2v + v^2x^2}{3x^2 + x^2v} \]
\[ v + xv' = \frac{3v + v^2}{v + 3} \]
\[ v + xv' = v \]
\[ xv' = 0 \]
Thus v is a constant c, and
\[ y = cx \]
Exercises: Solve the following differential equations:
1. \[ \frac{dy}{dx} = \frac{y}{x} + x\sin(y/x) \]
2. \[ (y^2 - xy)\,dx + x^2\,dy = 0 \]
3. \[ (x^2 - y^2)\,dx + 2xy\,dy = 0 \]
4. \[ x(y - x)\frac{dy}{dx} = y(y + x) \]
5. \[ x(x - y)\,dy + y^2\,dx = 0 \]
6. \[ \frac{dy}{dx} + \frac{x - 2y}{2x - y} = 0 \]
7. \[ \frac{dy}{dx} = \frac{3xy + y^2}{3x^2} \]
\[ e^{\int P\,dx}\frac{dy}{dx} + e^{\int P\,dx}Py = Q(x)e^{\int P\,dx}. \]
\[ \frac{d(yI(x))}{dx} = Q(x)I(x) \]
\[ yI(x) = \int Q(x)I(x)\,dx + C \]
\[ y = \frac{1}{I(x)}\int Q(x)I(x)\,dx + \frac{C}{I(x)} \]
The value $I(x) = e^{\int P\,dx}$ is called the integrating factor.
Working Rule
• Step 1: Convert the given equation to the standard form of a linear differential equation, i.e. $\frac{dy}{dx} + Py = Q(x)$.
• Step 2: Find the integrating factor $I(x) = e^{\int P\,dx}$.
• Step 3: The solution is $y = \frac{1}{I(x)}\int Q(x)I(x)\,dx + \frac{C}{I(x)}$.
Example 1: Solve $(x + 1)\frac{dy}{dx} = y + e^x(x + 1)^2$
• Step 1: Convert the given equation to the standard form of a linear differential equation, i.e.
\[ \frac{dy}{dx} = \frac{1}{x + 1}y + e^x(x + 1) \Rightarrow \frac{dy}{dx} - \frac{1}{x + 1}y = e^x(x + 1). \]
• Step 2: Find the integrating factor $I(x) = e^{\int\frac{-1}{x + 1}\,dx} = \frac{1}{x + 1}$.
• Step 3: The solution is
\[ y = (x + 1)\int\frac{1}{x + 1}e^x(x + 1)\,dx + C(x + 1) \Rightarrow y = (x + 1)e^x + C(x + 1). \]
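The result of Example 1 can be verified by substituting it back into the equation. The sketch below does this numerically: it approximates y' by a central difference and evaluates the residual (x + 1)y' − y − eˣ(x + 1)², which should be close to zero for any value of the constant. The function names, the value of C and the sample points are illustrative assumptions.

```python
import math

def y(x, C):
    """Candidate general solution y = (x + 1) e^x + C (x + 1)."""
    return (x + 1) * math.exp(x) + C * (x + 1)

def residual(x, C, h=1e-5):
    """(x + 1) y' - y - e^x (x + 1)^2, with y' from a central difference."""
    dy = (y(x + h, C) - y(x - h, C)) / (2 * h)
    return (x + 1) * dy - y(x, C) - math.exp(x) * (x + 1) ** 2

for x in (0.0, 0.5, 1.0, 2.0):
    print(x, residual(x, C=2.5))
```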
Example 2: Solve $(x^3 - x)\frac{dy}{dx} - (3x^2 - 1)y = x^5 - 2x^3 + x$
• Write $y = z^{1/(1-n)}$
\[ y\frac{dw}{dx} + \frac{1}{x}yw = e^xy \Rightarrow \frac{dw}{dx} + \frac{1}{x}w = e^x. \]
This equation is not in Bernoulli's form, but it is linear. Thus its solution is given by
\[ w = \frac{(x - 1)e^x + C}{x} \Rightarrow y = e^{\frac{(x - 1)e^x + C}{x}}. \]
Example 3: Using Bernoulli's method, solve the following differential equation:
\[ r\sin\theta - \cos\theta\,\frac{dr}{d\theta} = r^2 \]
Solution:
\[ r = \frac{1}{\sin\theta + C\cos\theta} \]
Example 4: Solve the following differential equation:
\[ \frac{d\theta}{dr} - \frac{\tan\theta}{1 + r} = (1 + r)e^r\sec\theta \]
Solution:
\[ \sin(\theta) = (1 + r)(e^r + C) \]
The form above is said to be an exact differential equation if the following holds:
\[ \frac{\partial M}{\partial y} = \frac{\partial N}{\partial x} \]
where $\frac{\partial M}{\partial y}$ denotes the differential coefficient of M with respect to y keeping x constant, and $\frac{\partial N}{\partial x}$ is the differential coefficient of N with respect to x keeping y constant.
Working rule
• Step 2: Integrate terms of N which do not contain x with respect to y keeping x constant.
\[ x^5 + x^3y^2 - x^2y^3 - y^5 = c \]
\[ ay'' + by' + cy = R(x). \]
1. If $\Delta = b^2 - 4ac > 0$, then the solution is given by $y = Ae^{m_1x} + Be^{m_2x}$, where A and B are arbitrary constants and $m_1$, $m_2$ are the roots of the characteristic equation.
3. If $\Delta = b^2 - 4ac < 0$, then the solution is given by $y = (A\cos\beta x + B\sin\beta x)e^{\alpha x}$, where A and B are arbitrary constants, $\alpha = \frac{-b}{2a}$, $\beta = \frac{\sqrt{-\Delta}}{2a}$, and $\alpha \pm i\beta$ are the roots of the characteristic equation.
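The case analysis on the discriminant can be sketched as a small Python function. The function name and the returned labels are illustrative; the repeated-root branch (Δ = 0) is included for completeness as the standard third case. For the complex case the function returns α and β rather than the roots themselves.

```python
import math

def characteristic_roots(a, b, c):
    """Classify the roots of a m^2 + b m + c = 0 by the sign of the discriminant."""
    disc = b * b - 4 * a * c
    if disc > 0:
        r = math.sqrt(disc)
        # y = A e^{m1 x} + B e^{m2 x}
        return ("real distinct", (-b + r) / (2 * a), (-b - r) / (2 * a))
    if disc == 0:
        # Standard repeated-root case: y = (A + B x) e^{m x}
        return ("real repeated", -b / (2 * a), -b / (2 * a))
    # y = (A cos(beta x) + B sin(beta x)) e^{alpha x}; roots are alpha +/- i beta
    return ("complex", -b / (2 * a), math.sqrt(-disc) / (2 * a))

print(characteristic_roots(1, 5, 6))  # y'' + 5y' + 6y = 0
print(characteristic_roots(1, 1, 1))  # y'' + y' + y = 0
```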
Let again
then
\[ y_p' = u(x)y_1' + v(x)y_2' \]
\[ y_p'' = u'(x)y_1' + v'(x)y_2' + u(x)y_1'' + v(x)y_2'' \]
Putting $y_p$, $y_p'$ and $y_p''$ in $ay'' + by' + cy = R(x)$, we get the following:
\[ a\left(u'(x)y_1' + v'(x)y_2' + u(x)y_1'' + v(x)y_2''\right) + b\left(u(x)y_1' + v(x)y_2'\right) + cu(x)y_1 + cv(x)y_2 = R(x) \]
\[ au'(x)y_1' + av'(x)y_2' + au(x)y_1'' + av(x)y_2'' + bu(x)y_1' + bv(x)y_2' + cu(x)y_1 + cv(x)y_2 = R(x) \]
Since $y_1$ and $y_2$ satisfy the homogeneous equation, the terms in $u(x)$ and $v(x)$ cancel, leaving
\[ au'(x)y_1' + av'(x)y_2' = R(x) \]
\[ u'(x)y_1' + v'(x)y_2' = \frac{1}{a}R(x) \tag{55} \]
Putting (54) and (55) together, we get the following:
Let $w(y_1, y_2) = y_1y_2' - y_1'y_2$; then the values of u and v are given by the following:
\[ u(x) = -\int\frac{y_2R(x)}{a\,w(y_1, y_2)}\,dx \tag{58} \]
\[ v(x) = \int\frac{y_1R(x)}{a\,w(y_1, y_2)}\,dx \tag{59} \]
We distinguish different cases to be specific we shall consider only the following cases:
• $R(x) = A_0e^{rx}$
1. If r is not a root of the characteristic equation $am^2 + bm + c = 0$, we set the particular solution to be $y_p = Ae^{rx}$, where $A = \frac{A_0}{ar^2 + br + c}$.
1. If $iw$ is not a root of the characteristic equation $am^2 + bm + c = 0$, we set the particular solution to be $y_p = A\cos wx + B\sin wx$, where A and B are constants to be found.
\[ P_n(x) = A_0 + A_1x + A_2x^2 + A_3x^3 + \dots + A_nx^n \]
1. Random experiment: Some experiments may give different results even when they are performed under the same conditions. They are called random experiments. For example, tossing a coin or throwing a die is a random experiment.
2. Trial or event: Performing a random experiment is called a trial and its outcome is called an event. For instance, tossing a coin is a trial and turning up a head or a tail is an event.
3. Equally likely events: Two events are said to be equally likely if neither of them can be expected in preference to the other. For example, if we draw a card from a well shuffled pack, we may get any card; the 52 different cases are equally likely.
4. Compound events: When two or more events occur in composition with each other, the simultaneous occurrence is termed a compound event. For instance, when a die is thrown, getting 5 or 6 is a compound event.
5. Exhaustive event: The set of all possible outcomes of a single performance of a random experiment is the exhaustive event or sample space. Each outcome is called a sample point. For example, in the case of tossing a coin once, the sample space is S = {H, T}.
6. Independent events: Two events are independent when the actual happening of one of them does not influence in any way the chance (probability) of the happening of the other. For example, the event of getting a head on the first coin and the event of getting a tail on the second coin in a simultaneous throw of two coins are independent.
7. Mutually exclusive events: Two events are said to be mutually exclusive if the occurrence of one excludes the occurrence of the other. For instance, on tossing a coin we get either a head or a tail, but not both.
8. Favourable events: The events that ensure the required happening are said to be favourable events. For instance, in throwing a die to obtain an even number, 2, 4 and 6 are the favourable ways.
10. Complement events: Events are said to be complements if they are mutually exclusive and they exhaust the entire sample space.
2. If A and B are mutually exclusive then Pr(A ∪ B) = Pr(A) + Pr(B), and if A and B are not mutually exclusive then Pr(A ∪ B) = Pr(A) + Pr(B) − Pr(A ∩ B).
3. Pr(S) = 1
4. Pr(Ā) = 1 − Pr(A)
Examples
1. An integer is chosen at random from the set S = {x | x ∈ Z+, x < 14}. Let A be the event of choosing a multiple of three and let B be the event of choosing an even number. Find the probability of A ∪ B, A ∩ B and A − B.
Solution:
S = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13}.
A = {3, 6, 9, 12} and B = {2, 4, 6, 8, 10, 12}.
A ∪ B = {2, 3, 4, 6, 8, 9, 10, 12}, A ∩ B = {6, 12} and A − B = {3, 9}.
Thus
Pr(A ∪ B) = 8/13, Pr(A ∩ B) = 2/13, Pr(A − B) = 2/13.
2. A coin is weighted so that a head is three times as likely to appear as a tail. Find Pr(T) and Pr(H).
Solution: Pr(T) + Pr(H) = 1 but 3Pr(T) = Pr(H), therefore Pr(T) + 3Pr(T) = 1, hence Pr(T) = 1/4 and Pr(H) = 3Pr(T) = 3/4.
3. If A and B are events with P r(A) = 1/8 and P r(B) = 1/12 and P r(A ∩ B) = 14 . Find
P r(A ∪ B).
1
Solution : P r(A ∪ B) = P r(A) + P r(B) − P r(A ∩ B) = 18 + 12 − 14 = 13
24
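Probabilities of events in a finite, equally likely sample space reduce to counting, so the first example above is easy to check with Python sets and exact fractions. The helper name `pr` is illustrative; both set differences are printed so that A − B and B − A can be told apart.

```python
from fractions import Fraction

S = set(range(1, 14))               # positive integers below 14
A = {x for x in S if x % 3 == 0}    # multiples of three
B = {x for x in S if x % 2 == 0}    # even numbers

def pr(event):
    """Probability of an event when all outcomes in S are equally likely."""
    return Fraction(len(event), len(S))

print(sorted(A | B), pr(A | B))
print(sorted(A & B), pr(A & B))
print(sorted(A - B), pr(A - B))
print(sorted(B - A), pr(B - A))
# Addition rule for events that are not mutually exclusive:
print(pr(A | B) == pr(A) + pr(B) - pr(A & B))
```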
Event    Ways of happening    Ways of failing
A        M                    N
B        m                    n
Both events happen in Mm ways and both fail in Nn ways; exactly one happens and the other fails in Mn + Nm ways. The total number of cases is Mm + Mn + Nm + Nn = (M + N)(m + n).
We have
\[ Pr(A \cap B) = \frac{Mm}{(M + N)(m + n)} = \frac{M}{M + N}\cdot\frac{m}{m + n} = Pr(A)Pr(B) \]
Note that if A and B are independent events, then Ā and B̄ are independent events too, i.e. Pr(A ∩ B) = Pr(A)Pr(B) ⇒ Pr(Ā ∩ B̄) = Pr(Ā)Pr(B̄).
The proof is simple. We know that
\[ Pr(\bar{A} \cap \bar{B}) = Pr(\overline{A \cup B}) = 1 - Pr(A \cup B) = 1 - Pr(A) - Pr(B) + Pr(A \cap B). \]
But A and B are independent, i.e. Pr(A ∩ B) = Pr(A)Pr(B), which implies that
Pr(Ā ∩ B̄) = 1 − Pr(A) − Pr(B) + Pr(A)Pr(B) = (1 − Pr(A))(1 − Pr(B)) = Pr(Ā)Pr(B̄).
EXAMPLE: An article manufactured by a company consists of two parts A and B. In the process of manufacture of part A, 9 out of 100 are likely to be defective; similarly, in the process of manufacture of part B, 5 out of 100 are likely to be defective. Calculate the probability that the assembled article will not be defective, assuming that the events of finding part A non-defective and part B non-defective are independent.
Solution
The probability that part A is defective is 9/100.
The probability that part B is defective is 5/100.
The probability that part A is not defective is 1 − 9/100 = 91/100.
The probability that part B is not defective is 1 − 5/100 = 95/100.
The probability that the assembled article will not be defective is (91/100) × (95/100) = 0.8645, since those events are independent.
Example:
Consider a random experiment in which 2 dice are thrown at the same time and the result is
recorded in the form (d1, d2). Let A be the event that a six appears in the first place,
and let B be the event that a six appears in the second place.
1. Calculate the probability of the event A.
2. Calculate the probability of the event B.
3. Calculate the probability of the compound event A ∩ B.
4. Calculate the probability of the compound event A ∪ B.
5. Determine whether the events A and B are independent.
1. Pr(A) = 6/36 = 1/6,
2. Pr(B) = 6/36 = 1/6,
3. Pr(A ∩ B) = 1/36,
4. Pr(A ∪ B) = 11/36,
5. Pr(A)Pr(B) = 1/36 = Pr(A ∩ B).
Hence A and B are independent events.
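The five answers can be checked by listing the 36 equally likely outcomes. The following is a minimal Python sketch; the helper name `prob` is illustrative and not part of these notes.

```python
from fractions import Fraction
from itertools import product

# Sample space: all 36 ordered outcomes (d1, d2) of two fair dice.
outcomes = list(product(range(1, 7), repeat=2))

def prob(event):
    """Probability of an event given as a true/false test on an outcome (d1, d2)."""
    favourable = sum(1 for o in outcomes if event(o))
    return Fraction(favourable, len(outcomes))

p_a = prob(lambda o: o[0] == 6)                  # six in the first place
p_b = prob(lambda o: o[1] == 6)                  # six in the second place
p_and = prob(lambda o: o[0] == 6 and o[1] == 6)  # A and B
p_or = prob(lambda o: o[0] == 6 or o[1] == 6)    # A or B

print(p_a, p_b, p_and, p_or)  # 1/6 1/6 1/36 11/36
print(p_and == p_a * p_b)     # True, so A and B are independent
```

Exact fractions are used so that the independence test Pr(A ∩ B) = Pr(A)Pr(B) is not affected by rounding.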
Example:
A factory runs two machines A and B. Machine A operates 80% of the time while Machine B
operates 60% of the time. And at least one machine operates for 92% of the time. Do these
machines operate independently?
Solution :
P r(A) = 0.8
P r(B) = 0.6
P r(A ∪ B) = 0.92
P r(A ∪ B) = P r(A) + P r(B) − P r(A ∩ B)
0.92 = 0.8 + 0.6 − P r(A ∩ B)
P r(A ∩ B) = 0.48
Since Pr(A)Pr(B) = 0.8 × 0.6 = 0.48 = Pr(A ∩ B), the two machines operate independently.
\[
\Pr(A/B) = \frac{\Pr(A \cap B)}{\Pr(B)}.
\]
2. P r(B)
3. P r(A ∪ B)
4. P r(B/A)
5. P r(A/B)
Solution
1. Pr(A) = 1/2
2. Pr(B) = 1/2
3. Pr(A ∪ B) = Pr(A) + Pr(B) − Pr(A ∩ B) = 1/2 + 1/2 − 1/4 = 3/4
4. Pr(B/A) = Pr(A ∩ B)/Pr(A) = 1/2
5. Pr(A/B) = Pr(A ∩ B)/Pr(B) = 1/2
\[
\Pr(B_r/A) = \frac{\Pr(B_r)\,\Pr(A/B_r)}{\sum_{i=1}^{n} \Pr(B_i)\,\Pr(A/B_i)}
\]
EXAMPLE
Machines A and B produce 60% and 40%, respectively, of the total output of a factory. Of the
parts produced by machine A, 3% are defective, and of the parts produced by machine
B, 5% are defective. A part is selected at random from a day's production and found to be
defective. What is the probability that it came from machine A?
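As a hedged illustration of Bayes' formula applied to this example, the sketch below takes B_1 = "made by machine A" and B_2 = "made by machine B", and the event A of the formula as "the part is defective". The dictionary names are illustrative only.

```python
from fractions import Fraction

# Prior probabilities: share of the total output produced by each machine.
prior = {"A": Fraction(60, 100), "B": Fraction(40, 100)}
# Conditional probability that a part is defective, given the machine that made it.
defect_given = {"A": Fraction(3, 100), "B": Fraction(5, 100)}

# Denominator of Bayes' formula: total probability that a part is defective.
p_defect = sum(prior[m] * defect_given[m] for m in prior)

# Posterior probability that a defective part came from each machine.
posterior = {m: prior[m] * defect_given[m] / p_defect for m in prior}

print(posterior["A"])                   # 9/19
print(round(float(posterior["A"]), 4))  # 0.4737
```

Although machine A produces most of the output, a defective part is slightly more likely to have come from machine B, because B has the higher defect rate.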
3. Drawing 5 cards from a deck for a poker hand (done without replacement, so not independent)
Example:
What is the probability of rolling exactly two sixes in 6 rolls of a die?
SOLUTION
There are five things you need to do to work a binomial story problem.
1. Define Success first. Success must be for a single trial. Success = "Rolling a 6 on a single
die"
Any time a six appears, it is a success (denoted S) and any time something else appears, it
is a failure (denoted F). The ways you can get exactly 2 successes in 6 trials are given below.
The probability of each is written to the right of the way it could occur. Because the trials
are independent, the probability of the event (all six dice) is the product of the probabilities of
the individual outcomes (each die).
Notice that each of the 15 probabilities is exactly (1/6)^2 × (5/6)^4. Also, note
that 1/6 is the probability of success and you needed 2 successes. The 5/6 is the probability
of failure, and if 2 of the 6 trials were successes, then 4 of the 6 must be failures. Note that 2 is
the value of r and 4 is the value of n − r. Further note that there are fifteen ways this can occur.
This is the number of ways 2 successes can occur in 6 trials without repetition and with order
not being important, that is, a combination of 6 things taken 2 at a time. Thus the required
probability is 15 × (1/6)^2 × (5/6)^4 ≈ 0.2009.
The probability of getting exactly r successes in n trials, with the probability of success on a
single trial being p, is:
\[
P(X = r) = \binom{n}{r} \times p^{r} \times q^{n-r}
\]
with
\[
\binom{n}{r} = \frac{n!}{r!\,(n-r)!} \quad \text{and} \quad p + q = 1.
\]
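The binomial formula translates directly into a few lines of Python. This is a minimal sketch; the function name `binomial_pmf` is an illustrative choice, and `math.comb` supplies the number of combinations.

```python
from math import comb

def binomial_pmf(r: int, n: int, p: float) -> float:
    """P(X = r): probability of exactly r successes in n independent trials."""
    q = 1.0 - p
    return comb(n, r) * p**r * q**(n - r)

# Exactly two sixes in six rolls of a fair die.
print(round(binomial_pmf(2, 6, 1 / 6), 4))  # 0.2009

# The probabilities for r = 0, 1, ..., n add up to 1 (up to rounding error).
total = sum(binomial_pmf(r, 6, 1 / 6) for r in range(7))
print(round(total, 10))                     # 1.0
```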
Example1:
A coin is tossed 10 times. What is the probability that exactly 6 heads will occur?
SOLUTION:
Success = "A head is flipped on a single coin", p = 0.5, q = 0.5, n = 10 and r = 6.
\[
P(X = 6) = \binom{10}{6} \times 0.5^{6} \times 0.5^{4} = 210 \times 0.015625 \times 0.0625 = 0.205078125.
\]
Example2:
Suppose a biased coin comes up heads with probability 0.3 when tossed. What is the proba-
bility of achieving 0, 1,..., 6 heads after six tosses?
Solution:
8 Probability distribution
8.1 Introduction
To define probability distributions for the simplest cases, one needs to distinguish between dis-
crete and continuous random variables. In the discrete case, one can easily assign a probability
to each possible value: for example, when throwing a fair die, each of the six values 1 to 6 has
the probability 1/6. In contrast, when a random variable takes values from a continuum then,
typically, probabilities can be non-zero only if they refer to intervals: in quality control one
might demand that the probability of a "500 g" package containing between 490 g and 510 g
should be no less than 98%.
A discrete variable is a variable which can only take a countable number of values. In this
example, the number of heads can only take 4 values (0, 1, 2, 3) and so the variable is discrete.
The variable is said to be random if the sum of the probabilities is one.
Quite often, the probability density function will be given to you in terms of x. In the above
example,
\[
P(X = x) = \binom{3}{x} \Big/ 2^{3} = \frac{3!}{x!\,(3-x)!\;8}
\]
(see permutations and combinations for the meaning of $\binom{3}{x}$).
Example:
A die is thrown repeatedly until a 6 is obtained. Find the probability density function for the
number of times we throw the die.
Solution:
Let X be the random variable representing the number of times we throw the die.
P(X = 1) = 1/6 (if we throw the die only once, we get a 6 on our first throw; its probability is
1/6). P(X = 2) = (5/6)(1/6) (if we throw the die twice, we must throw something
that isn't a 6 with our first throw, the probability of which is 5/6, and we must throw a 6 on
our second throw, the probability of which is 1/6), etc. In general,
\[
P(X = x) = \left(\frac{5}{6}\right)^{x-1} \left(\frac{1}{6}\right), \qquad x = 1, 2, 3, \ldots
\]
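The general term can be checked numerically. The sketch below, with the illustrative function name `throws_until_six_pmf`, evaluates the first few probabilities exactly and shows that the partial sums approach 1, as they must for a probability distribution.

```python
from fractions import Fraction

def throws_until_six_pmf(x: int) -> Fraction:
    """P(X = x): the first 6 appears on throw number x (x = 1, 2, 3, ...)."""
    return Fraction(5, 6) ** (x - 1) * Fraction(1, 6)

print(throws_until_six_pmf(1))  # 1/6
print(throws_until_six_pmf(2))  # 5/36

# The first 50 terms leave out exactly (5/6)**50, the chance of no 6 in 50 throws.
partial = sum(throws_until_six_pmf(x) for x in range(1, 51))
print(float(1 - partial))
```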
So the expected value is the sum of: [(each of the possible outcomes) × (the
probability of the outcome occurring)].
In more concrete terms, the expectation is what you would expect the outcome of an experiment
to be on average.
Example
What is the expected value when we roll a fair die?
Solution
There are six possible outcomes: 1, 2, 3, 4, 5, 6. Each of these has a probability of 1/6 of
occurring. Let X represent the outcome of the experiment.
Therefore
\[
P(X = 1) = P(X = 2) = P(X = 3) = P(X = 4) = P(X = 5) = P(X = 6) = \frac{1}{6}.
\]
For instance, P(X = 6) = 1/6 means that the probability that the outcome of the experiment is
6 is 1/6. Thus the expected value is calculated as follows:
\[
E(X) = 1 \times P(X = 1) + 2 \times P(X = 2) + 3 \times P(X = 3) + 4 \times P(X = 4) + 5 \times P(X = 5) + 6 \times P(X = 6),
\]
\[
E(X) = \frac{1}{6}\,[1 + 2 + 3 + 4 + 5 + 6] = \frac{7}{2}.
\]
So the expectation is 3.5. If you think about it, 3.5 is halfway between the smallest and largest
values the die can take, and so this is what you should have expected.
Example
For the above experiment (with the die), calculate E(X^2). Using our notation above, f(x) = x^2:
\[
E(X^2) = 1 \times P(X = 1) + 4 \times P(X = 2) + 9 \times P(X = 3) + 16 \times P(X = 4) + 25 \times P(X = 5) + 36 \times P(X = 6),
\]
\[
E(X^2) = \frac{1}{6}\,[1 + 4 + 9 + 16 + 25 + 36] = \frac{91}{6}.
\]
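Both expectations follow the same rule: weight each value of f(x) by the probability of x and add. Here is a minimal Python sketch of that rule; the names `die` and `expectation` are illustrative.

```python
from fractions import Fraction

# Probability distribution of a fair die: value -> probability.
die = {x: Fraction(1, 6) for x in range(1, 7)}

def expectation(dist, f=lambda x: x):
    """E[f(X)] = sum of f(x) * P(X = x) over all possible values x."""
    return sum(f(x) * p for x, p in dist.items())

e_x = expectation(die)                    # E(X)
e_x2 = expectation(die, lambda x: x**2)   # E(X^2)
print(e_x, e_x2)  # 7/2 91/6
```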
Some useful properties of Expected values
The expected value of a constant is just the constant, so for example E(1) = 1.
Multiplying a random variable by a constant multiplies the expected value by that constant,
so E[2X] = 2E[X]. Consider the cases where a and b are constants, then
E[aX + b] = aE[X] + b,
E[X + Y ] = E(X) + E(Y ),
E[X − Y ] = E[X] − E[Y ].
8.5.2 Variance
The variance of a random variable tells us something about the spread of the possible values
of the variable. For a discrete random variable X, the variance of X is written as V ar(X).
Variance is calculated as follows:
\[
Var(X) = E[(X - \mu)^2], \quad \text{where } \mu = E[X].
\]
This can also be written as follows:
\[
Var(X) = E[X^2] - \mu^2 = E[X^2] - (E[X])^2.
\]
The standard deviation of X is the square root of Var(X). Note that the variance does not behave
in the same way as expectation when we multiply and add constants to random variables. In
fact, for constants a and b,
\[
Var[aX + b] = a^2\, Var(X),
\]
and, when X and Y are independent,
\[
Var[X + Y] = Var[X] + Var[Y],
\]
\[
Var[X - Y] = Var[X] + Var[Y].
\]
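The shortcut formula and the rule for Var[aX + b] can be verified on the fair die. This is a sketch under illustrative assumptions: the constants a = 3 and b = 4 are arbitrary, and the helper names are not part of these notes.

```python
from fractions import Fraction

die = {x: Fraction(1, 6) for x in range(1, 7)}

def expectation(dist, f=lambda x: x):
    """E[f(X)] for a discrete distribution given as value -> probability."""
    return sum(f(x) * p for x, p in dist.items())

def variance(dist):
    """Var(X) = E[X^2] - (E[X])^2."""
    return expectation(dist, lambda x: x**2) - expectation(dist) ** 2

var_x = variance(die)
print(var_x)  # 35/12

# Distribution of Y = aX + b for the arbitrary constants a = 3, b = 4.
a, b = 3, 4
shifted = {a * x + b: p for x, p in die.items()}
print(variance(shifted) == a**2 * var_x)  # True: adding b changes nothing, a enters squared
```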
1. The Poisson distribution is useful because the counts of many kinds of random events are
well described by it.
2. If events occur independently of one another at a constant average rate, with a mean number
of occurrences λ in a given time period, then the number of occurrences within that time
period follows a Poisson distribution.
Example
There are 50 misprints in a book which has 250 pages. Find the probability that page 100 has
no misprints.
The average number of misprints on a page is 50/250 = 0.2. Therefore, if we let X be the random
variable denoting the number of misprints on a page, X can be modelled by a Poisson distribution
with parameter λ = 0.2. The probability that page 100 has no misprints is then
P(X = 0) = e^{−λ} λ^0 / 0! = e^{−0.2} ≈ 0.8187.
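For the misprint example, the Poisson probabilities can be evaluated with the standard formula P(X = k) = e^{−λ} λ^k / k!. The following is a small sketch; the function name `poisson_pmf` is an illustrative choice.

```python
from math import exp, factorial

def poisson_pmf(k: int, lam: float) -> float:
    """P(X = k) for a Poisson random variable with mean lam."""
    return exp(-lam) * lam**k / factorial(k)

lam = 50 / 250  # average number of misprints per page

p_none = poisson_pmf(0, lam)
print(round(p_none, 4))      # 0.8187  (no misprints on the page)
print(round(1 - p_none, 4))  # 0.1813  (at least one misprint)
```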
1. The memoryless property of a binomial process carries across to a Poisson process;
2. The Poisson process is often a good approximation to the binomial process; and therefore
3. The various distributions of the Poisson process are often good approximations to their
corresponding binomial process distributions.
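The quality of the approximation can be seen by comparing the two distributions side by side. The values n = 1000 and p = 0.002 below are assumptions chosen only to illustrate the case of many trials with a small success probability.

```python
from math import comb, exp, factorial

def binomial_pmf(k, n, p):
    """P(X = k) for a binomial distribution with n trials and success probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def poisson_pmf(k, lam):
    """P(X = k) for a Poisson distribution with mean lam."""
    return exp(-lam) * lam**k / factorial(k)

n, p = 1000, 0.002  # illustrative values: many trials, small success probability
lam = n * p         # Poisson mean matched to the binomial mean

for k in range(5):
    print(k, round(binomial_pmf(k, n, p), 5), round(poisson_pmf(k, lam), 5))

# Largest difference between the two probabilities over k = 0, ..., 19.
max_gap = max(abs(binomial_pmf(k, n, p) - poisson_pmf(k, lam)) for k in range(20))
print(max_gap)
```

With a larger p (for example p = 0.3) the two columns drift apart, which is why the approximation is normally reserved for rare events.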
9 Introduction to Statistics
Statistics is the science that deals with the collection, analysis and interpretation of numer-
ical information. This science is divided into two areas: descriptive statistics and inferential
statistics. The theoretical base of the science of statistics is a field within mathematics called
mathematical statistics. Here, statistics is presented as an abstract, tightly integrated structure
of axioms, theorems, and rigorous proofs, involving many other areas of mathematics such as
calculus, probability theory, and higher algebra. To make this theoretical structure available to
the non-mathematician, an interpretative discipline has been developed called general statis-
tics in which the presentation is greatly simplified and often non-mathematical. From this
simplified version, each specialized field (e.g., agriculture, anthropology, biology, economics,
and engineering) takes material that is appropriate for its own numerical data.
Definition 1: Statistics is concerned with the collection, ordering and analysis of data.
Definition 2: Statistical data consist of a set of recorded observations or values of some
variables.
Definition 3: A data variable is any quantity that can take a number of values. It can be
discrete or continuous.
A statistical exercise normally consists of four stages:
Example: The ages of students in the Department of Food Processing and Agriculture were
recorded:
23 28 27 21 24 24 23 26 22 25 23 25 25 26 25 25 22 27 23 23 25 26 25 23 25 24 25 27 23 26 22
24 22 25 25 22 24 24 26 24 26 24 25 23 25 23 24 24 22 25 22 26 24.
We can order these values, analyse them and then draw a useful conclusion.
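As a rough illustration of "ordering and analysing" the recorded ages, the sketch below sorts them and counts how often each age occurs. The variable names are illustrative; `collections.Counter` from the Python standard library does the counting.

```python
from collections import Counter

# The 53 recorded ages, in the order they were listed.
ages = [
    23, 28, 27, 21, 24, 24, 23, 26, 22, 25, 23, 25, 25, 26, 25, 25, 22, 27,
    23, 23, 25, 26, 25, 23, 25, 24, 25, 27, 23, 26, 22, 24, 22, 25, 25, 22,
    24, 24, 26, 24, 26, 24, 25, 23, 25, 23, 24, 24, 22, 25, 22, 26, 24,
]

ordered = sorted(ages)        # ordering: lowest age first
frequency = Counter(ordered)  # analysis: how often each age occurs

for age in sorted(frequency):
    print(age, frequency[age])

most_common_age = frequency.most_common(1)[0][0]
print("n =", len(ages), "most common age:", most_common_age)
```

A conclusion one might draw from such a tally is that the ages cluster around the mid-twenties, with very few students at the extremes.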
Imagine that a tutor asks 100 students to complete a maths test. The tutor wants to know
why some students perform better than others. Whilst the tutor does not know the answer to
this, she thinks that it might be because of two reasons: (1) some students spend more time
revising for their test; and (2) some students are naturally more intelligent than others. As
such, the tutor decides to investigate the effect of revision time and intelligence on the test
performance of the 100 students. The dependent and independent variables for the study are:
Dependent Variable: Test Mark (measured from 0 to 100)
Independent Variables: Revision time (measured in hours) and Intelligence (measured using IQ
score)
The dependent variable is simply that, a variable that is dependent on an independent vari-
able(s). For example, in our case the test mark that a student achieves is dependent on revision
time and intelligence. Whilst revision time and intelligence (the independent variables) may (or
may not) cause a change in the test mark (the dependent variable), the reverse is implausible;
in other words, whilst the number of hours a student spends revising and the higher a student’s
IQ score may (or may not) change the test mark that a student achieves, a change in a student’s
test mark has no bearing on whether a student revises more or is more intelligent (this simply
doesn’t make sense).
Therefore, the aim of the tutor’s investigation is to examine whether these independent vari-
ables - revision time and IQ - result in a change in the dependent variable, the students’ test
scores. However, it is also worth noting that whilst this is the main aim of the experiment, the
tutor may also be interested to know if the independent variables - revision time and IQ - are
also connected in some way.
In the section on experimental and non-experimental research that follows, we find out a little
more about the nature of independent and dependent variables.
• Nominal variables are variables that have two or more categories, but which do not have
an intrinsic order. For example, a real estate agent could classify their types of property
into distinct categories such as houses, condos, co-ops or bungalows. So "type of property"
is a nominal variable with 4 categories called houses, condos, co-ops and bungalows. Of
note, the different categories of a nominal variable can also be referred to as groups or
levels of the nominal variable. Another example of a nominal variable would be classifying
where people live in Rwanda by PROVINCE. In this case there will be many more levels
of the nominal variable (5 in fact).
• Dichotomous variables are nominal variables which have only two categories or levels.
For example, if we were looking at gender, we would most probably categorize somebody
as either "male" or "female". This is an example of a dichotomous variable (and also a
nominal variable). Another example might be if we asked a person if they owned a mobile
phone. Here, we may categorise mobile phone ownership as either "Yes" or "No". In the
real estate agent example, if type of property had been classified as either residential or
commercial then "type of property" would be a dichotomous variable.
• Ordinal variables are variables that have two or more categories, just like nominal variables,
except that the categories can also be ordered or ranked. So if you asked someone if they liked
the policies of the Democratic Party and they could answer either "Not very much",
"They are OK" or "Yes, a lot" then you have an ordinal variable. Why? Because you
have 3 categories, namely "Not very much", "They are OK" and "Yes, a lot", and you can
rank them from the most positive (Yes, a lot), to the middle response (They are OK), to
the least positive (Not very much). However, whilst we can rank the levels, we cannot
place a "value" on them; we cannot say that "They are OK" is twice as positive as "Not
very much", for example.
• Interval variables are variables that can be measured along a continuum and that have a
numerical value (for example, temperature measured in degrees Celsius or Fahrenheit).
Equal differences on the scale are equally large: the difference between 20 °C and 30 °C is the
same as the difference between 30 °C and 40 °C. However, temperature measured in degrees
Celsius or Fahrenheit is NOT a ratio variable.
• Ratio variables are interval variables, but with the added condition that 0 (zero) of the
measurement indicates that there is none of that variable. So, temperature measured in
degrees Celsius or Fahrenheit is not a ratio variable because 0 °C does not mean there is
no temperature. However, temperature measured in kelvin is a ratio variable, as 0 kelvin
(often called absolute zero) indicates that there is no temperature whatsoever. Other
examples of ratio variables include height, mass, distance and many more. The name
"ratio" reflects the fact that you can use the ratio of measurements. So, for example, a
distance of ten metres is twice the distance of five metres.
10 Descriptive Statistics
In descriptive statistics, techniques are provided for processing raw numerical data into usable
forms. These techniques include methods for collecting, organizing, summarizing, describing,
and presenting numerical information. If entire groups (populations) were always available for
study, then descriptive statistics would be all that is required.
This formula is usually written in a slightly different manner using the Greek capital letter Σ,
pronounced "sigma", which means "sum of...":
\[
\bar{x} = \frac{\sum x}{n}
\]
You may have noticed that the above formula refers to the sample mean. So, why have we
called it a sample mean? This is because, in statistics, samples and populations have very
different meanings and these differences are very important, even if, in the case of the mean,
they are calculated in the same way. To acknowledge that we are calculating the population
mean and not the sample mean, we use the Greek lower case letter "mu", denoted as µ:
\[
\mu = \frac{\sum x}{n}
\]
The mean is essentially a model of your data set: a single value that stands in for all the
observations. You will notice, however, that the mean is not often one of the actual values that
you have observed in your data set. However, one of its important properties is that it
minimises error in the prediction of any one value in your data set. That is, it is the value that
produces the lowest total squared error when it is used in place of each value in the data set.
An important property of the mean is that it includes every value in your data set as part of
the calculation. In addition, the mean is the only measure of central tendency where the sum
of the deviations of each value from the mean is always zero, i.e. $\sum_{i=1}^{n} (x_i - \bar{X}) = 0$.
When not to use the mean?
The mean has one main disadvantage: it is particularly susceptible to the influence of outliers.
These are values that are unusual compared to the rest of the data set by being especially small
or large in numerical value. For example, consider the wages of staff at a factory below:
Staff 1 2 3 4 5 6 7 8 9 10
Salary 15k 18k 16k 14k 15k 15k 12k 17k 90k 95k
The mean salary for these ten staff is 30.7k. However, inspecting the raw data suggests that
this mean value might not be the best way to accurately reflect the typical salary of a worker,
as most workers have salaries in the 12k to 18k range. The mean is being skewed by the two
large salaries. Therefore, in this situation, we would like to have a better measure of central
tendency. As we will find out later, taking the median would be a better measure of central
tendency in this situation.
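The effect of the two large salaries is easy to reproduce. The sketch below uses the ten salaries from the staff table (in thousands) and the `statistics` module from the Python standard library; the cut-off of 50k used to drop the outliers is an arbitrary illustrative choice.

```python
from statistics import mean, median

# Salaries in thousands for the ten staff in the table.
salaries = [15, 18, 16, 14, 15, 15, 12, 17, 90, 95]

print(mean(salaries))    # 30.7
print(median(salaries))  # 15.5

# Leaving out the two outliers brings the mean back into the typical range.
typical = [s for s in salaries if s < 50]
print(mean(typical))     # 15.25
```

The median barely notices the two outliers, which is why it describes the typical salary better here.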
Another time when we usually prefer the median over the mean (or mode) is when our data
is skewed (i.e., the frequency distribution for our data is skewed). If we consider the normal
distribution - as this is the most frequently assessed in statistics - when the data is perfectly
normal, the mean, median and mode are identical. Moreover, they all represent the most
typical value in the data set. However, as the data becomes skewed the mean loses its ability
to provide the best central location for the data because the skewed data is dragging it away
from the typical value. However, the median best retains this position and is not as strongly
influenced by the skewed values. This is explained in more detail in the skewed distribution
section later.
10.1.3 Median
The median is the middle score for a set of data that has been arranged in order of magnitude.
The median is less affected by outliers and skewed data. In order to calculate the median,
suppose we have the data below: 65 55 89 56 35 14 56 55 87 45 92
We first need to rearrange that data into order of magnitude (smallest first):
14 35 45 55 55 56 56 65 87 89 92
Our median mark is the middle mark - in this case, 56. It is the middle
mark because there are 5 scores before it and 5 scores after it. This works fine when you have
an odd number of scores, but what happens when you have an even number of scores? What if
you had only 10 scores? Well, you simply have to take the middle two scores and average the
result. So, if we look at the example below:
65 55 89 56 35 14 56 55 87 45
We again rearrange that data into order of magnitude (smallest first):
14 35 45 55 55 56 56 65 87 89
Only now we have to take the 5th and 6th scores in our data set and average them to get a
median of 55.5.
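The two cases (odd and even number of scores) can be written as one short procedure. This is a minimal sketch; the function name `median_of` is illustrative.

```python
def median_of(scores):
    """Middle value of the sorted scores; average of the two middle values if the count is even."""
    ordered = sorted(scores)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]                       # odd count: single middle score
    return (ordered[mid - 1] + ordered[mid]) / 2  # even count: average the middle pair

odd_marks = [65, 55, 89, 56, 35, 14, 56, 55, 87, 45, 92]
even_marks = [65, 55, 89, 56, 35, 14, 56, 55, 87, 45]

print(median_of(odd_marks))   # 56
print(median_of(even_marks))  # 55.5
```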
10.1.4 Mode
The mode is the most frequent score in our data set. On a bar chart or histogram it corresponds
to the highest bar. You can, therefore, sometimes consider the mode as being the
most popular option. An example of a mode is presented below:
Normally, the mode is used for categorical data where we wish to know which is the most
common category, as illustrated below:
We can see above that the most common form of transport, in this particular data set, is the
bus. However, one of the problems with the mode is that it is not unique, so it leaves us with
problems when we have two or more values that share the highest frequency, such as below:
We are now stuck as to which mode best describes the central tendency of the data. This
is particularly problematic when we have continuous data because we are more likely not to
have any one value that is more frequent than the others. For example, consider measuring the
weights of 30 people (to the nearest 0.1 kg). How likely is it that we will find two or more people
with exactly the same weight (e.g., 67.4 kg)? The answer is: probably very unlikely. Many
people might be close, but with such a small sample (30 people) and a large range of possible
weights, you are unlikely to find two people with exactly the same weight; that is, to the nearest
0.1 kg. This is why the mode is very rarely used with continuous data.
Another problem with the mode is that it will not provide us with a very good measure of
central tendency when the most common mark is far away from the rest of the data in the data
set, as depicted in the diagram below:
In the above diagram the mode has a value of 2. We can clearly see, however, that the mode is
not representative of the data, which is mostly concentrated around the 20 to 30 value range.
To use the mode to describe the central tendency of this data set would be misleading.
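The non-uniqueness of the mode can be shown with a few lines of Python. The survey answers below are hypothetical data invented for illustration, not data from these notes; `statistics.multimode` returns every value that shares the highest frequency.

```python
from statistics import multimode

# Hypothetical answers to "How do you travel to work?" (illustrative data only).
transport = ["bus", "car", "bus", "train", "car", "walk", "bus"]
print(multimode(transport))  # ['bus']

# Here two categories share the highest frequency, so there are two modes.
tied = ["bus", "car", "bus", "car", "train"]
print(multimode(tied))       # ['bus', 'car']
```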
• Range,
• interquartile range,
10.2.2 Variance
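The population variance σ^2 (the Greek letter sigma squared) and the sample variance s^2 are
given by
\[
\sigma^2 = \frac{1}{n} \sum_{i=1}^{n} (x_i - \mu)^2 \ \text{ for populations and } \ s^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2 \ \text{ for samples.}
\]
It can also be written as follows, where f_i is the frequency of the value x_i:
\[
\sigma^2 = \frac{1}{n} \sum_{i=1}^{p} f_i (x_i - \mu)^2 \ \text{ for populations and } \ s^2 = \frac{1}{n-1} \sum_{i=1}^{p} f_i (x_i - \bar{x})^2 \ \text{ for samples.}
\]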
Note: Example to be given in class!!
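As a hedged sketch of the two variance formulas, the code below applies them to the eleven marks used in the median section and compares the results with `pvariance` and `variance` from the Python standard library. The function names `population_variance` and `sample_variance` are illustrative.

```python
from statistics import pvariance, variance

# Marks reused from the median section (illustrative choice of data).
marks = [65, 55, 89, 56, 35, 14, 56, 55, 87, 45, 92]

def population_variance(xs):
    """sigma^2: mean squared deviation from the mean, dividing by n."""
    mu = sum(xs) / len(xs)
    return sum((x - mu) ** 2 for x in xs) / len(xs)

def sample_variance(xs):
    """s^2: squared deviations from the sample mean, dividing by n - 1."""
    x_bar = sum(xs) / len(xs)
    return sum((x - x_bar) ** 2 for x in xs) / (len(xs) - 1)

print(population_variance(marks), pvariance(marks))
print(sample_variance(marks), variance(marks))
```

The sample variance is always the larger of the two, because it divides the same sum of squares by n − 1 instead of n.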
3. The Quartile one is the number or observation which leaves 25 % of the data on the left
hand side and 75% on the right side.
4. The Quartile three is the number or observation which leaves 75 % of the data on the left
hand side and 25% on the right side.
5. The Median is the number or observation which leaves 50 % of the data on the left hand
side and 50% on the right side.
2. (a) If the lines are about the same length, the distribution is approximately symmetric.
(b) If the right line is larger than the left line, the distribution is positively skewed.
(c) If the left line is larger than the right line, the distribution is negatively skewed.
Exercise: A dietitian is interested in comparing the sodium content of real cheese with the
sodium content of a cheese substitute. The data for two random samples are shown. Compare
the distributions, using box plots.
R = max − min
The interquartile range is the difference between the quartile 3 and quartile 1:
IR = Q3 − Q1
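The range and interquartile range can be computed as in the sketch below, again on the eleven marks from the median section. Note that textbooks and software use slightly different conventions for locating quartiles; the `"inclusive"` method of `statistics.quantiles` is one common choice and is an assumption here, not necessarily the convention used in class.

```python
from statistics import quantiles

marks = sorted([65, 55, 89, 56, 35, 14, 56, 55, 87, 45, 92])

data_range = max(marks) - min(marks)  # R = max - min

# Three cut points dividing the ordered data into four equal parts.
q1, q2, q3 = quantiles(marks, n=4, method="inclusive")
iqr = q3 - q1                         # IR = Q3 - Q1

print("range:", data_range)
print("Q1, median, Q3:", q1, q2, q3)
print("interquartile range:", iqr)
```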
• Lower and upper limits: For the class 20-29, the numbers 20 and 29 are the lower limit
and the upper limit respectively.
• Class boundaries: The class boundaries are 0.5 below the lower limit and 0.5 above the upper
limit. For example, for the class 20-29, the lower bound is 20 − 0.5 = 19.5 and the upper
bound is 29 + 0.5 = 29.5. Note: If the data are whole numbers we use 0.5; if they have
one decimal place we use 0.05, and so on.
• Class interval C: It is the difference between the upper and lower class boundaries. For
example, for the same class C = 29.5 − 19.5 = 10.
• Class width W: It is the difference between the upper and lower class limits. For
example, for the same class W = 29 − 20 = 9.
• Central value (or mid-value): It is the average of the upper and lower boundaries.
For example, for the same class c_i = (29.5 + 19.5)/2 = 24.5.
• Relative frequency: It is the frequency of a class expressed as a percentage of the total
frequency. For example, for the same class 20-29, the relative frequency is (6 × 100)/50 = 12%.
• A cumulative frequency distribution shows, for each class, the total number of
observations in all classes up to and including that class. When plotted, this gives a
distribution curve, or ogive.
Now from the above data, we can have the following table:
Overtime hours   Frequencies   Central values   Lower bounds   Upper bounds
10-19             2            14.5              9.5           19.5
20-29             6            24.5             19.5           29.5
30-39            14            34.5             29.5           39.5
40-49            17            44.5             39.5           49.5
50-59             8            54.5             49.5           59.5
60-69             3            64.5             59.5           69.5
Total            50
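The columns of the table, together with the relative and cumulative frequencies, can be generated from the class limits and frequencies alone. This is a minimal sketch; the tuple layout and variable names are illustrative choices.

```python
# (lower limit, upper limit, frequency) for each class of overtime hours.
classes = [(10, 19, 2), (20, 29, 6), (30, 39, 14), (40, 49, 17), (50, 59, 8), (60, 69, 3)]
total = sum(freq for _, _, freq in classes)

rows = []
cumulative = 0
for lower, upper, freq in classes:
    lower_bound = lower - 0.5              # whole-number data: boundaries lie 0.5 outside the limits
    upper_bound = upper + 0.5
    central = (lower_bound + upper_bound) / 2
    relative = 100 * freq / total          # relative frequency in percent
    cumulative += freq                     # running total up to and including this class
    rows.append((f"{lower}-{upper}", freq, central, lower_bound, upper_bound, relative, cumulative))

for row in rows:
    print(*row)
```

The last cumulative frequency equals the total number of observations, which is a quick check that no class has been missed.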
Example
As an illustration, suppose that a sample consists of the heights of 100 male students at MP
University. We arrange the data into classes or categories and determine the number of
individuals belonging to each class, called the class frequency. The resulting arrangement,
shown in the table below, is called a frequency distribution or frequency table.
The first class or category, for example, consists of heights from 60 to 62 inches, indicated by
60 − 62, which is called a class. Here 60 is the lower class limit and 62 is the upper class limit.
Since 5 students have heights belonging to this class, the corresponding class frequency is
5. Since a height that is recorded as 60 inches is actually between 59.5 and 60.5 inches while
one recorded as 62 inches is actually between 61.5 and 62.5 inches, we could just as well have
recorded the class as 59.5 − 62.5. The next class would then be 62.5 − 65.5, etc. In the class
59.5 − 62.5, the numbers 59.5 and 62.5 are often called the lower class boundary and the upper
class boundary respectively. The class width, denoted by w, which is usually the same for all
classes, is the difference between the upper and lower class limits. In this case
w = 62 − 60 = 2. The class interval, denoted by c, which is also usually the same for all
classes, is the difference between the upper and lower class boundaries. In this case
c = 62.5 − 59.5 = 3. The midpoint of the class, which can be taken as representative of the
class, is called the class mark. In the above, the class mark corresponding to the class 60 − 62
is 61. A graph for the frequency distribution can be supplied by a histogram, as shown in the
figure below, or by a polygon graph (often called a frequency polygon) connecting the midpoints
of the tops of the bars in the histogram. It is of interest that the shape of the graph seems to
indicate that the sample is drawn from a population of heights that is normally distributed.
a) Arrange these grades (raw data set) into an array from the lowest grade to the highest
grade.
b) Construct a table showing class intervals and class midpoints and the absolute, relative,
and cumulative frequencies for each grade.
Solutions
a) Arranged grades
b) Note that since we are dealing here with discrete data (i.e., data expressed in whole
numbers), we used the actual grades as the class midpoints. We get the following table:
3. Calculate the size of each class using the formula c = (R + 1)/K.
4. Determine the number of observations falling into each class interval; that is, find the
class frequencies and the class midpoints.
Note: Example to be given in class!!
The standard normal distribution is a normal distribution with a mean of 0 and a standard
deviation of 1 (i.e., µ = 0 and σ = 1). Any normal distribution can be converted into a standard
normal distribution by letting µ = 0 and expressing deviations from µ in standard deviation
units (z scale). To find probabilities (areas) for problems involving the normal distribution, we
first convert the X value into its corresponding z value, as follows:
\[
z = \frac{X - \mu}{\sigma}
\]
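The conversion to z values and the lookup of areas can be sketched in Python with `statistics.NormalDist`. The package weights below (mean 500 g, standard deviation 5 g) are assumed values chosen only for illustration; they do not come from these notes.

```python
from statistics import NormalDist

def z_score(x: float, mu: float, sigma: float) -> float:
    """Convert an X value into its z value: z = (X - mu) / sigma."""
    return (x - mu) / sigma

# Hypothetical package weights: mean 500 g, standard deviation 5 g (assumed values).
mu, sigma = 500.0, 5.0
z_low = z_score(490, mu, sigma)
z_high = z_score(510, mu, sigma)

# Area under the standard normal curve between the two z values.
standard_normal = NormalDist(mu=0.0, sigma=1.0)
p_between = standard_normal.cdf(z_high) - standard_normal.cdf(z_low)

print(z_low, z_high)        # -2.0 2.0
print(round(p_between, 4))  # 0.9545
```

Under these assumed values about 95.45% of packages would weigh between 490 g and 510 g, the same area one would read from a table of the standard normal distribution for z = ±2.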
10.7 Exercises
1. Give short notes on categorical variables and provide examples.
2. With the help of examples, describe the difference between descriptive statistics and
inferential statistics.
4. Explain what a box plot is and give its role in testing the symmetry of a distribution.
(a) The range of a statistical data is the difference between the median and mode.
(b) The mode of data is the observation with lowest frequency.
(c) The median of statistical data is a number which divides the data into two equal
parts.
7. Doughnuts are distributed in boxes according to the capacity of the boxes. The numbers of
doughnuts are as follows:
27 38 22 33 39 45 32 43 41 57 45 40 59 11 38 34 22 62 33 48 43 57 37 34 51
29 41 45 31 46 25 57 39 42 55 20 37 35 66 45 32 44 47 42 46 54 65 17 35 53 .
Use these data to calculate
(b) the median of the data,
(c) the mode of the data,
(d) quartile three,
(e) the mean of the data,
(f) the variance and standard deviation of the data.
8. The sampling of the ages of students from a school gave the following information:
$\sum_{i=1}^{n} x_i^2 = 204802$, $\sum_{i=1}^{n} x_i = 3278$, where $n = \sum_{i=1} f_i = 53$.
(a) Calculate the mean age, variance and standard deviation of the age.
(b) What can you say about the obtained results in (a)?
(a) Determine the mean, mode, median, variance, standard deviation, quartile one and
quartile three.
(b) Draw the bar-chart, pie-chart and box plot.
10. The lengths of 20 seeds are measured and the results, in centimetres to two significant figures,
are as follows: 7.3 7.1 6.6 7.0 7.8 7.3 7.5 6.2 6.9 6.7 6.5 6.8 7.2 7.4 6.5 6.9 7.2
7.6 7.0 6.8
(a) Compile step by step a table showing frequency distribution and relative frequency
distribution for regular classes of 0.2 cm from 6.2 cm to 7.9 cm.
(b) Compile step by step a table showing lower and upper limits, central values, lower
and upper bounds and their corresponding frequencies for all classes, for regular classes
of 0.2 cm from 6.2 cm to 7.9 cm.
(c) Draw the corresponding histogram, frequency polygon curve and the frequency curve.
(d) Find the mean, variance, and standard deviation of the grouped data.
(e) Find the modal class and the mode of the grouped data.
(f) Find the mode, mean, median and variance of the ungrouped data.
(g) Compare the values obtained above with those from the grouped data.
(h) In one paragraph, draw a conclusion about the length of the seeds.
11. The lengths in millimetres of 40 spindles were measured with the following results: 20.90
20.57 20.86 20.74 20.82 20.63 20.53 20.89 20.75 20.65 20.71 21.03 20.72 20.41
20.94 20.75 20.79 20.65 21.08 20.89 20.50 20.88 20.97 20.78 20.64 20.92 21.07
21.16 20.80 20.77 20.82 20.72 20.60 20.90 20.86 20.68 20.75 20.88 20.56 20.94
(a) Do the same as in the question one from 20.40 to 21.20 and consider the regular
intervals of 0.01.
(b) Is the frequency curve obtained a normal distribution curve? Explain your answer.
12. A company wants to investigate the precision of a new employee who is employed to
make doughnuts. A well-made doughnut has a perimeter of 20 cm. The perimeters
of a sample of 34 doughnuts are measured and the following results in cm are obtained.
19.63 19.82 19.96 19.75 19.86 19.82 16.61 19.97 20.07 19.89 20.16 19.56 20.05
19.96 19.68 19.87 19.90 19.66 19.77 19.99 20.00 20.11 20.01 19.84 19.73 19.93
20.03 19.86 19.81 19.77 19.78 19.75 19.87 19.72
(a) Arrange the data into 7 equal classes of width 0.09 cm for the range 19.50 cm to
20.19 cm and determine the frequency distribution.
(b) Find the mean, variance, and standard deviation of the grouped data.
(c) Is this employee accurate?
(d) Find the class having the highest frequency.
(e) Find the lower class boundary of the 3rd class.
(f) Find the upper class boundary of the 7th class.
(g) Find the central value of the 5th class.
13. Specialist edible oil manufacturers and many food companies convert raw materials into
a wide range of food products. The main products of edible oil manufacturers are described
in the following table by the quantities imported and their corresponding
prices in 1995.