STAT 111: Introduction To Statistics & Probability For Actuaries
Dr. P. S. Andam 2
NATURE OF STATISTICS
• Statistics refers to the theory of information, with inference making as its
objective.
BRANCHES OF STATISTICS
DATA
• Data is the raw material of statistics. It refers to unprocessed facts and
figures from which conclusions can be drawn.
• Data can either be:
Categorical or Qualitative
• Cannot be expressed as numerical values though numbers can be
assigned to them.
Quantitative
• Results from either counting or measuring. Also known as numerical.
Examples of Data Types
• Classify the following as either categorical or quantitative:
- Number of letters in the English alphabet
- Heights of athletes in UG
VARIABLES
• A variable refers to any characteristic of an object under a survey.
• It is either measured (or counted) or categorized.
• Examples of variables include:
Age, height, sex, performance in a quiz, time, marital status etc.
MEASUREMENT SCALES FOR VARIABLES
Qualitative Variables
• Nominal Scale: measurement that classifies values into mutually exclusive
groups where order/rank is unimportant.
• Ordinal Scale: measurement that classifies data also into mutually exclusive
categories that can be ranked.
Example: State the scale of measurement that would be used for each of the
following variables:
Political party, performance on an IQ test (Pass, Fail), awards at a ceremony (1st
position, 1st runner-up, …), religion, sex, blood group (A, B, O, AB), insurance
claim severity (Very high, high, moderate, …), etc.
MEASUREMENT SCALES FOR VARIABLES CONT’D
Quantitative Variables:
• Interval scale: ranks data, and precise differences between units of measure
exist; however, there is no meaningful zero.
NB: temperature (0 °C does not mean a total absence of temperature), IQ (a score of
zero does not indicate no intelligence).
• Ratio scale: possesses all the properties of the interval scale, and the ratio between
any two values is also meaningful.
NB: this is the highest level of measurement, and there exists a true zero (0).
Eg: Waiting time (zero means did not wait), treatment cost (zero means
paid nothing).
SUMMARY OF MEASUREMENT SCALES & VARIABLES
Primary Sources
• Primary data are obtained when researchers collect the data
themselves by designing experiments or conducting surveys. Eg:
interviews, administering questionnaires
Secondary Sources
• Secondary data are collected from other sources such as
libraries, the internet or corporate bodies.
USES OF STATISTICS (STATISTICAL DATA)
• Financial Planning
• Problem solving
• Political & economic decision-making
• Employment Opportunities
LECTURE TWO - OUTLINE
• Meaning of Data Reduction
• Describing Categorical & Quantitative Data using:
1. Frequency tables
DATA REDUCTION
• Data in its raw form is often large and hard to interpret.
• The process of organizing data so that meaning can be drawn
from it is known as Data Reduction.
• Data reduction is a step in the data mining process.
Pepsi 3 3/10
SUMMARIZING CATEGORICAL DATA CONT’D
• Exercise
A partial relative frequency distribution is given by the table below.
Year     Relative Frequency
First    0.122
Second   0.180
Third    0.400
Fourth   _
(i) What is the relative frequency in the fourth year?
(ii) If the total sample is 200, what is the frequency of the fourth year?
(iii) Show the frequency distribution.
SUMMARIZING CATEGORICAL DATA CONT’D
PIE & BAR CHARTS
• Both useful for displaying categorical data with small number of classes.
• In a pie chart, the segment area represents the category value.
• Generally bar charts are better for display. Relative lengths are easier to
judge than relative areas.
SUMMARIZING CATEGORICAL DATA CONT’D
SUMMARIZING QUANTITATIVE DATA CONT’D
• Ungrouped frequency distributions
This is meant for discrete data.
Eg: The following are the ages of 10 Level 100 actuarial students in UG:
16, 17, 17, 18, 18, 18, 18, 19, 19, 20.
Ages (x)   Tally   Frequency (f)
16         /       1
17         //      2
18         ////    4
19         //      2
20         /       1
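The tallying in the table above can be reproduced with a few lines of Python. This is only a sketch using the ten ages from the example; the variable names are illustrative.

```python
from collections import Counter

# Ages of the 10 Level 100 students, taken from the example above.
ages = [16, 17, 17, 18, 18, 18, 18, 19, 19, 20]

# Counter tallies how many times each distinct value occurs.
freq = Counter(ages)

print("Age  Frequency")
for age in sorted(freq):
    print(f"{age:<4} {freq[age]}")
```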
SUMMARIZING QUANTITATIVE DATA CONT’D
• Grouped Frequency Distributions
- Used for continuous data.
- As a ‘rule of thumb’, the number of classes is given by $\sqrt{N}$, where N is
the number of observations
- One way to determine the class width is to use the formula:
$\text{Class Width} = \frac{\text{Highest Value} - \text{Lowest Value}}{\text{Number of Classes}}$
SUMMARIZING QUANTITATIVE DATA CONT’D
The following relates to marks obtained by Level 400 Actuarial Science
Students in UG:
52, 98, 85, 59, 92, 61, 81, 88, 58, 72, 72, 57, 78, 65, 62, 69, 80, 58, 60,74.
The frequency table is shown as follows:
Marks Tally Frequency Relative frequency Cumulative frequency
51 – 60 //// / 6 0.30 6
61 – 70 //// 4 0.20 10
71 – 80 //// 5 0.25 15
81 – 90 /// 3 0.15 18
91 – 100 // 2 0.10 20
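As a rough check, the frequency, relative frequency and cumulative frequency columns above can be rebuilt from the raw marks. The sketch below assumes the same class limits as the table (51 to 60, 61 to 70, and so on); the names are illustrative.

```python
# Marks of the 20 students, as listed in the example above.
marks = [52, 98, 85, 59, 92, 61, 81, 88, 58, 72,
         72, 57, 78, 65, 62, 69, 80, 58, 60, 74]

# Class limits as used in the table above.
classes = [(51, 60), (61, 70), (71, 80), (81, 90), (91, 100)]

n = len(marks)
rows = []
cumulative = 0
for low, high in classes:
    f = sum(low <= m <= high for m in marks)   # class frequency
    cumulative += f                            # running total
    rows.append((f"{low}-{high}", f, f / n, cumulative))

for label, f, rel, cum in rows:
    print(f"{label:<8} {f:>2} {rel:>5.2f} {cum:>3}")
```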
SUMMARIZING QUANTITATIVE DATA CONT’D
Stem & Leaf Plots
• Suitable for data measured by the interval and ratio scales of measurement.
• The actual data values are included in this graph.
• A stem plot consists of a series of horizontal rows of numbers.
• Each row is labeled with a number called its stem.
• All numbers that follow the stem are called the leaves.
• Useful for a small set of numeric data.
• Gives an impression of location, spread and shape of values.
SUMMARIZING QUANTITATIVE DATA CONT’D
• Example
Question: Consider the distribution of aptitude scores policy: 200, 204,
209, 210, 211, 212, 217, 218, 218, 222, 227, 227, 227, 228,
230, 231, 233, 237, 238, 241, 242, 242, 243, 247, 251, 251,
253, 254, 256, 260.
Solution (Stem & Leaf Plot):
Key: Stem = Hundreds & Tens, Leaf = Ones
20 | 049
21 | 012788
22 | 27778
23 | 01378
24 | 12237
25 | 11346
26 | 0
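A stem-and-leaf plot like the one above can be generated mechanically by splitting each score into its stem (hundreds and tens) and leaf (ones). The following is a minimal sketch; the names are illustrative.

```python
from collections import defaultdict

# Aptitude scores from the question above.
scores = [200, 204, 209, 210, 211, 212, 217, 218, 218, 222, 227, 227, 227, 228,
          230, 231, 233, 237, 238, 241, 242, 242, 243, 247, 251, 251,
          253, 254, 256, 260]

stems = defaultdict(list)
for value in sorted(scores):
    stem, leaf = divmod(value, 10)   # stem = hundreds & tens, leaf = ones
    stems[stem].append(leaf)

# Join the leaves of each stem into one string, e.g. 20 -> "049".
plot = {stem: "".join(str(leaf) for leaf in leaves) for stem, leaves in stems.items()}
for stem in sorted(plot):
    print(f"{stem} | {plot[stem]}")
```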
SUMMARIZING QUANTITATIVE DATA CONT’D
Example: The following relates to marks obtained by Level 400 Actuarial
Science Students in UG:
52, 98, 85, 59, 92, 61, 81, 88, 58, 72, 72, 57, 78, 65, 62, 69, 80, 58, 60,74.
For the distribution, draw a
(a) Cumulative frequency polygon
(b) Ogive
(c) Histogram
(d) Dot plot
LECTURE THREE - OUTLINE
Descriptive Statistics for Univariate Data
• Measures of Location or Central Tendency:
- Mean (Arithmetic, Geometric & Harmonic), Mode & Median
• Measures of Dispersion or Spread or Variation:
- Range, Variance, Standard deviation, Coefficient of Variation,
Skewness & Kurtosis.
• Measures of Position
- Deciles, Quartiles & Percentiles
• Exploratory Data Analysis
- Box plots
INTRODUCTION
• When only one characteristic is measured on an experimental
unit, the data obtained is termed Univariate data.
• Graphical displays are not adequate for making inferences.
• As such, numerical measures are pursued.
MEASURES OF CENTRAL TENDENCY
Properties of the mean
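Given that $x_1, x_2, \ldots, x_n$ are sample units and $c \in \mathbb{R}$, then:
(a) $\frac{(x_1 \pm c) + (x_2 \pm c) + \cdots + (x_n \pm c)}{n} = \bar{x} \pm c$
(b) $\frac{c x_1 + c x_2 + \cdots + c x_n}{n} = c\bar{x}$
(c) $\left(\sum_{i=1}^{n} c_1 x_i\right) + c_2 = c_1 n \bar{x} + c_2$
NB: (a) Translation property (b) Scaling property (c) Linear combination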
MEASURES OF CENTRAL TENDENCY
The Weighted Mean
• This reflects the relative importance of observations by including their
weights.
• If $x_1, x_2, \ldots, x_n$ are $n$ measurements and $w_1, w_2, \ldots, w_n$ are their relative
importance or weights (NB: $\sum_{i=1}^{n} w_i = 1$), then
$\bar{x}_w \text{ (the weighted mean)} = \sum_{i=1}^{n} w_i x_i$
Example: Suppose there are 3 sections of this course, and the average scores in the
final exam are $\bar{x}_A = 71$ for section A, $\bar{x}_B = 85$ for section B and $\bar{x}_C = 78$ for
section C. If the 3 sections are the same size, what is the mean score?
Suppose the sizes are 10 for A, 15 for B and 20 for C; calculate the average score.
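The two cases in the example can be worked through with a small helper. This is a sketch only: the function `weighted_mean` is hypothetical, and it assumes the section sizes are turned into weights by dividing each by the total so that the weights sum to 1.

```python
# Section averages from the example above.
section_means = {"A": 71, "B": 85, "C": 78}

def weighted_mean(values, weights):
    """Return sum(w_i * x_i) after normalising the weights to sum to 1."""
    total = sum(weights)
    return sum(w / total * x for x, w in zip(values, weights))

means = list(section_means.values())
equal_sizes = weighted_mean(means, [1, 1, 1])       # all sections the same size
unequal_sizes = weighted_mean(means, [10, 15, 20])  # sizes 10, 15 and 20

print(round(equal_sizes, 2), round(unequal_sizes, 2))
```

With equal sizes the weighted mean reduces to the ordinary mean of the three section averages; with unequal sizes the larger sections pull the result towards their own averages.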
MEASURES OF CENTRAL TENDENCY
The Median
• This is the middle number in a data set.
• It indicates that 50% of the data elements lie on either side of it.
• To find the median, data elements have to be arranged in order (either
ascending or descending).
• For ungrouped data, the median $\hat{m}$ is the value at:
$\hat{m} = \left(\frac{n+1}{2}\right)\text{th position}$, if $n$ is odd, and
$\hat{m} = \frac{\left(\frac{n}{2}\right)\text{th position} + \text{next position}}{2}$, if $n$ is even
MEASURES OF CENTRAL TENDENCY
• However, for grouped data, the median is
$\hat{m} = L_f + \left(\frac{\frac{\sum f}{2} - \sum f_m}{f_m}\right) C$, where
MEASURES OF CENTRAL TENDENCY
The Mode
• It is the data element with the highest frequency.
• It can be used as a measure of central tendency for categorical data
• For grouped (numerical) data, it is calculated as:
$\text{Mode} = L_m + \left(\frac{\Delta_1}{\Delta_1 + \Delta_2}\right) C$, where
- $L_m$ = lower class boundary of the modal class
- $\Delta_1$ = excess of the modal class’ frequency over the frequency of the next
lower class
- $\Delta_2$ = excess of the modal class’ frequency over the frequency of the
next upper class
MEASURES OF CENTRAL TENDENCY
Example;
The table below is a collection of the marks obtained by level 400 Stats
students in a quiz. Calculate the mean, mode & median of the
distribution.
Marks Frequency
51 – 60 4
61 – 70 6
71 – 80 5
81 – 90 3
91 – 100 2
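One way to work the example is sketched below. It rests on several assumptions that the slides do not state explicitly: class boundaries lie 0.5 below the lower class limits, C is the class width (10), the grouped mean uses class midpoints, and in the median formula the subtracted term is the cumulative frequency of the classes before the median class.

```python
# (lower limit, upper limit, frequency) for each class in the table above.
classes = [(51, 60, 4), (61, 70, 6), (71, 80, 5), (81, 90, 3), (91, 100, 2)]

total = sum(f for _, _, f in classes)
width = 10  # class width C (assumed: difference between successive boundaries)

# Mean: class midpoints weighted by their frequencies.
mean = sum((lo + hi) / 2 * f for lo, hi, f in classes) / total

# Median: first class whose cumulative frequency reaches total / 2.
cum_before = 0
for lo, hi, f in classes:
    if cum_before + f >= total / 2:
        median = (lo - 0.5) + (total / 2 - cum_before) / f * width
        break
    cum_before += f

# Mode: class with the highest frequency, then the delta formula.
freqs = [f for _, _, f in classes]
k = freqs.index(max(freqs))
d1 = freqs[k] - (freqs[k - 1] if k > 0 else 0)
d2 = freqs[k] - (freqs[k + 1] if k < len(freqs) - 1 else 0)
mode = (classes[k][0] - 0.5) + d1 / (d1 + d2) * width

print(mean, median, round(mode, 2))
```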
MEASURES OF DISPERSION
• Dispersion or Variability is the degree of spread of numerical
data about an average.
MEASURES OF DISPERSION
The Range
• Mathematically,
𝑹𝒂𝒏𝒈𝒆 = 𝑴𝒂𝒙𝒊𝒎𝒖𝒎 𝒗𝒂𝒍𝒖𝒆 − 𝑴𝒊𝒏𝒊𝒎𝒖𝒎 𝒗𝒂𝒍𝒖𝒆
• Although it is very easy to compute, it is strongly affected by outliers and is
not meaningful for categorical data.
• For grouped data, the range is given by:
$\text{Range} = \text{Upper class boundary of the highest class} - \text{Lower class boundary of the lowest class}$
MEASURES OF DISPERSION
Variance and Standard deviation
• Most commonly used measures of dispersion
• Unlike the range, the variance (and the standard deviation) involves all elements
within a dataset.
• The intuition behind these statistics is that we want a statistic that is
- small when observations are clustered around the mean &
- large when they are spread out
• The relationship between the variance and the standard deviation is
$\text{Standard deviation} = \sqrt{\text{Variance}}$
MEASURES OF DISPERSION
• The sample variance ($s^2$) is calculated as:
$s^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n-1}$
i.e. the sum of the squared deviations from the sample mean divided by
(n – 1), where n = sample size.
• The population variance ($\sigma^2$) is given by:
$\sigma^2 = \frac{\sum_{i=1}^{n} (X_i - \mu)^2}{n}$
i.e. the sum of the squared deviations from the population mean divided by
n, where n = population size.
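The difference between the two divisors can be seen on a small data set. The sketch below reuses the ten student ages from the earlier frequency-table example purely as illustrative data, and cross-checks the hand computation against Python's `statistics` module.

```python
import statistics

# Illustrative data: the ten ages used in the ungrouped frequency example.
ages = [16, 17, 17, 18, 18, 18, 18, 19, 19, 20]

n = len(ages)
mean = sum(ages) / n
ss = sum((x - mean) ** 2 for x in ages)  # sum of squared deviations

sample_var = ss / (n - 1)    # divide by n - 1 when the data are a sample
population_var = ss / n      # divide by n when the data are the whole population

print(round(sample_var, 4), round(population_var, 4))
print(round(sample_var ** 0.5, 4))  # sample standard deviation
```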
MEASURES OF DISPERSION
Properties of the Variance (Standard deviation)
• It is not affected by translation
If $x_1, x_2, \ldots, x_n$ are sample units with variance, say, $s^2 = \pi$, then
$(x_1 \pm c), (x_2 \pm c), \ldots, (x_n \pm c)$ also have variance $s^2 = \pi$ for $c \in \mathbb{R}$.
• It responds to scaling
If $x_1, x_2, \ldots, x_n$ are sample units with variance, say, $s^2 = \pi$, then
$(c x_1), (c x_2), \ldots, (c x_n)$ have variance $c^2 \pi$ for $c \in \mathbb{R}$.
MEASURES OF DISPERSION: Example
SHAPE OF DISTRIBUTIONS
• Right or positive skewed distributions have a long tail to the right.
• This implies the bulk of the measurements in the distribution lies to the
left
• Thus their mean > median > mode.
SHAPE OF DISTRIBUTIONS
Kurtosis
• This refers to the peakedness of a dataset.
• That is, how peaked or flat a distribution is.
• The kurtosis (K) of a normal distribution is 3 (i.e. mesokurtic)
• However, highly /sharply peaked distributions are leptokurtic
(K > 3)
• Flat peaked distributions are platykurtic (K < 3)
SHAPE OF DISTRIBUTIONS
$K = \frac{1}{n-1} \sum_{i=1}^{n} \left(\frac{x_i - \bar{x}}{s}\right)^4$
MEASURES OF POSITION
Deciles
• These are statistics that divide an ordered data set into ten equal parts
• The kth decile is given by:
$k\text{th decile} = \left(\frac{k}{10} \times n\right)\text{th position}$, where n = total frequency.
• To find a decile, put the dataset in ascending order.
MEASURES OF POSITION
Quartiles
• These are three statistics that divide a data set into four equal parts.
• The first or lower quartile ($Q_1$) is given by:
$Q_1 = \left(\frac{1}{4} \times N\right)\text{th position}$
• The second or middle quartile ($Q_2$) is given by:
$Q_2 = \left(\frac{1}{2} \times N\right)\text{th position}$
• The third or upper quartile ($Q_3$) is given by:
$Q_3 = \left(\frac{3}{4} \times N\right)\text{th position}$
• Interquartile range = $Q_3 - Q_1$
MEASURES OF POSITION
Percentiles
• These are statistics that divide an ordered data set into 100 equal parts
• The kth percentile is given by:
$k\text{th percentile} = \left(\frac{k}{100} \times n\right)\text{th position}$, where n = total frequency.
• To find a percentile, put the dataset in ascending order.
EXPLORATORY DATA ANALYSIS
• Geared towards analyzing data sets to summarize their main
characteristics
• It often uses visual methods.
• One useful visual method is the box plot (box-and-whisker plot)
• The purpose of exploratory data analysis is to examine data to find out
what information can be discovered about the data such as the center
and the spread.
EXPLORATORY DATA ANALYSIS
Box Plots
• It is constructed using these five specific values;
- Maximum Value
- Minimum value
- Lower Quartile
- Upper Quartile
- Middle Quartile
• These values are called a five-number summary of the data set.
EXPLORATORY DATA ANALYSIS
A boxplot is a graph of a data set obtained by
• drawing a horizontal line from the minimum data value to 𝑸𝟏
• drawing a horizontal line from 𝑸𝟑 to the maximum data value
• drawing a box whose vertical sides pass through 𝑸𝟏 and 𝑸𝟑 with a
vertical line inside the box passing through the median or 𝑸𝟐 .
• The lines are known as whiskers.
• The lowest value is the lower fence while the maximum value is the
upper fence.
EXPLORATORY DATA ANALYSIS
• Diagram depicting a box-and-whisker plot.
[Figure: box-and-whisker plot drawn along the x-axis, marking the lower fence, $Q_1$, $Q_2$, $Q_3$ and the upper fence.]
EXPLORATORY DATA ANALYSIS
Summary of Steps to Construct a Box Plot
• Calculate the three (3) quartiles, including the median.
• Obtain your IQR (interquartile range)
NB: The IQR is the length of the interval that contains the middle 50% of the data.
• Obtain your horizontal line
EXPLORATORY DATA ANALYSIS
Outliers
• These are values that lie above the upper fence or below the lower fence.
$\text{Lower fence} = Q_1 - 1.5 \times IQR$
$\text{Upper fence} = Q_3 + 1.5 \times IQR$
• Outliers may result from errors or miscalculations
NB: On the box plot, indicate outliers with an asterisk (*)
• The vertical line within the box corresponds to the median
EXPLORATORY DATA ANALYSIS
Example;
Amounts of fuel consumed per day by 8 buses tested for a fixed journey
are given below: 260, 290, 300, 320, 330, 340, 345, 520.
(a) Construct a box – and – whisker plot.
(b) Describe the graph
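The numbers needed for part (a) can be sketched as follows. Quartile conventions differ between textbooks; this sketch assumes $Q_1$ and $Q_3$ are the medians of the lower and upper halves of the ordered data, which may give slightly different values from a positional rule, and it uses the 1.5 × IQR fences to flag outliers.

```python
from statistics import median

# Fuel consumed per day by the 8 buses, from the example above.
fuel = sorted([260, 290, 300, 320, 330, 340, 345, 520])

half = len(fuel) // 2
q1 = median(fuel[:half])    # median of the lower half (assumed convention)
q2 = median(fuel)           # overall median
q3 = median(fuel[-half:])   # median of the upper half (assumed convention)

iqr = q3 - q1
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = [x for x in fuel if x < lower_fence or x > upper_fence]

print("Five-number summary:", min(fuel), q1, q2, q3, max(fuel))
print("Fences:", lower_fence, upper_fence)
print("Outliers:", outliers)
```

Under this convention the value 520 falls above the upper fence, so it would be marked with an asterisk on the plot.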
LECTURE FOUR - OUTLINE
Descriptive Statistics for Bivariate Data
• Contingency Tables
• Scatter Plots
• Correlation
- Parametric Correlation Coefficients
- Non-parametric Correlation Coefficients
INTRODUCTION
Bivariate Data Set
• It is a collection of data consisting of two variables of an experimental
unit. For instance, Height and Weight of a particular person.
CONTINGENCY TABLES: Layout (r × n)
VARIABLE B
VARIABLE A C1 C2 … Cn Row Sum
1 x11 x12 … x1n R1
⋮ ⋮ ⋮ ⋱ ⋮ ⋮
CONTINGENCY TABLES
Example;
A group of financial professionals consisting of actuaries and financial
analysts were asked to choose between an old method of valuation and a new
one. Of the 50 actuaries, 20 chose the new method while 25 were
conservative; 5 actuaries did not make a choice. Of the 150 financial
analysts, 60 chose the new method, 75 stuck with the old method and
15 made no choice.
Construct a contingency table for the information given above.
CONTINGENCY TABLES
Solution
                         Valuation Method
Financial Professional   New Valuation Method   Old Valuation Method   No Choice   Total
Actuary                  20                     25                     5           50
Financial Analyst        60                     75                     15          150
Total                    80                     100                    20          200
CONTINGENCY TABLES
• Some important information that can be deduced from contingency tables are:
(1) Proportions (percentages)
These proportions are with respect to the total number of people (i.e. 200)
Eg: 10% of the financial professionals are actuaries who went with the new valuation method (20 out of 200), etc.
SCATTER PLOTS
• It is the best way of graphically displaying the relationship
between quantitative variables in a bivariate data.
SCATTER PLOTS
Examining Scatter Plots
• Look for the following
CORRELATION COEFFICIENTS
• This statistic measures the strength and the direction of the
linear relationship between variables.
CORRELATION COEFFICIENTS
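Pearson’s Correlation Coefficient (r)
• $r = \frac{\mathrm{Cov}(X, Y)}{s_x s_y} = \frac{\frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{s_x s_y}$, where
$s_x = \sqrt{\frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2}$ & $s_y = \sqrt{\frac{1}{n-1} \sum_{i=1}^{n} (y_i - \bar{y})^2}$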
CORRELATION COEFFICIENTS
Characteristics of the Correlation Coefficient
• The Cov (X, Y) in the correlation formula denotes the Covariance.
• It provides a measure of strength of the correlation between two or
more variables.
• It has the following properties:
- Cov (X, X) = Var (X)
- Cov (X, Y) = Cov (Y, X). i.e. it is symmetric.
- It is unaffected by translation but affected by scaling.
CORRELATION COEFFICIENTS
Properties of the Correlation Coefficient
• Corr (X, X) = 1
• It only measures linear association
• It is strongly affected by outliers
CORRELATION COEFFICIENTS
Interpreting the Pearson’s Correlation Coefficient (|r|≤ 𝟏)
r Interpretation
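0.00 – 0.20 Very weak
$r = \frac{n \sum_{i=1}^{n} x_i y_i - \left(\sum_{i=1}^{n} x_i\right)\left(\sum_{i=1}^{n} y_i\right)}{\sqrt{\left[n \sum_{i=1}^{n} x_i^2 - \left(\sum_{i=1}^{n} x_i\right)^2\right]\left[n \sum_{i=1}^{n} y_i^2 - \left(\sum_{i=1}^{n} y_i\right)^2\right]}}$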
CORRELATION COEFFICIENT
Example: Calculate the Correlation Coefficient
Student # of absences (x) Final Grade (y)
A 6 82
B 12 86
C 15 43
D 9 74
E 12 58
F 5 90
G 8 78
CORRELATION COEFFICIENT
• The computations for the correlation coefficient are shown below:
STUDENT x y xy $x^2$ $y^2$
A 6 82 492 36 6724
B 12 86 1032 144 7396
C 15 43 645 225 1849
D 9 74 666 81 5476
E 12 58 696 144 3364
F 5 90 450 25 8100
G 8 78 624 64 6084
TOTALS 67 511 4605 719 38993
CORRELATION COEFFICIENTS
• From the table:
$\sum x = 67$, $\sum y = 511$, $\sum xy = 4605$, $\sum x^2 = 719$, $\sum y^2 = 38993$.
Therefore, $r = \frac{n \sum xy - \sum x \sum y}{\sqrt{[n \sum x^2 - (\sum x)^2][n \sum y^2 - (\sum y)^2]}}$. Since $n = 7$,
$r = \frac{7(4605) - (67 \times 511)}{\sqrt{[7(719) - (67)^2][7(38993) - (511)^2]}}$
$r = -0.7892$
NB: Interpret the results
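The sums and the coefficient can be recomputed directly from the rows of the computation table, which is a useful guard against arithmetic slips. The sketch below uses the (x, y) pairs exactly as they appear in that table; the variable names are illustrative.

```python
from math import sqrt

# (x, y) pairs as listed in the computation table above (students A to G).
data = [(6, 82), (12, 86), (15, 43), (9, 74), (12, 58), (5, 90), (8, 78)]

n = len(data)
sum_x = sum(x for x, _ in data)
sum_y = sum(y for _, y in data)
sum_xy = sum(x * y for x, y in data)
sum_x2 = sum(x * x for x, _ in data)
sum_y2 = sum(y * y for _, y in data)

# Computational form of Pearson's r.
r = (n * sum_xy - sum_x * sum_y) / sqrt(
    (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
)

print(sum_x, sum_y, sum_xy, sum_x2, sum_y2)
print(round(r, 4))
```

The column totals agree with the TOTALS row of the table, including the sum of $y^2$ of 38993.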
Correlation Coefficient
Spearman’s Rank Correlation Coefficient (𝒓𝒔 )
• Used when data does not follow a Normal Distribution.
• That is, it is a Non-parametric Statistic.
• The correlation coefficient is given by the formula
$r_s = 1 - \frac{6 \sum_{i=1}^{n} d_i^2}{n(n^2 - 1)}$
Correlation Coefficient
Treating tied ranks when dealing with Spearman’s Rank Correlation
• Assign the mean of the tied ranks to each tied score.
For instance, if there is a tie between the scores in the 4th and 5th positions,
assign the mean value $\frac{4+5}{2} = 4.5$ to each of the two positions.
• The next score receives the 6th position
Correlation Coefficient
Example;
Calculate the Spearman’s rank Correlation Coefficient for the data below:
STUDENT   Performance in Statistics (x)   Performance in Finance (y)
A 73 77
B 76 78
C 78 79
D 65 80
E 86 86
F 82 89
G 91 95
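A possible way to carry out the ranking and apply the formula is sketched below. It assumes ranks are assigned from 1 for the smallest score upward, with $d_i$ the difference between a student's two ranks, and it relies on there being no tied scores in this data set; the helper `ranks` is hypothetical.

```python
# Scores from the table above.
stats_marks = {"A": 73, "B": 76, "C": 78, "D": 65, "E": 86, "F": 82, "G": 91}
finance_marks = {"A": 77, "B": 78, "C": 79, "D": 80, "E": 86, "F": 89, "G": 95}

def ranks(scores):
    """Rank scores from 1 (smallest) upward; assumes no ties, as in this data."""
    ordered = sorted(scores, key=scores.get)
    return {student: position for position, student in enumerate(ordered, start=1)}

rx, ry = ranks(stats_marks), ranks(finance_marks)
n = len(rx)

# Sum of squared rank differences, then Spearman's formula.
sum_d2 = sum((rx[s] - ry[s]) ** 2 for s in rx)
r_s = 1 - 6 * sum_d2 / (n * (n ** 2 - 1))

print(sum_d2, r_s)
```

Ranking from largest to smallest instead would give the same coefficient, as long as both variables are ranked in the same direction.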
Correlation Coefficient
• The table for computation of the Spearman’s rank correlation coefficient
is shown below;
LECTURE FIVE - OUTLINE
Set Theory & Counting Processes
• Algebra of Sets
• Counting rules
- Permutations
- Combinations
INTRODUCTION: SETS
• A set is a collection of well-defined and distinct objects.
• “Well-defined” implies there is no doubt whatsoever about whether or not a
given item belongs to the set under consideration.
• “Distinct” in the sense that no two identical objects must be contained in the
same set.
Examples of sets:
(i) The set of all students in STAT 111 class
(ii) The set of all months with less than 30 days
(iii) The set of all integers > 1
INTRODUCTION: SETS
• The objects that belong to a set are called its elements or members.
• Sets are denoted with capital letters such as A, B, 𝓔.
• Elements are denoted by lower case letters such as a, b, z.
• "𝒂 ∈ 𝑩” means “a is an element of set B” & “𝒂 ∉ 𝑩” means otherwise.
• Sets can be described in three common ways:
– By definition (stating in words what it contains)
– By the roster method (listing the elements)
– By the property method (set-builder notation)
INTRODUCTION: SETS
Types of Sets
• Universal set (i.e. the population or sample in some cases)
• Equal & Equivalent sets
• Countable & Uncountable sets
• Null or Empty set (denoted by ∅ 𝑜𝑟 { })
• Singleton set
• Subsets
“𝐀 ⊂ 𝑩” means A is a subset of B. This implies that if 𝒂 ∈ 𝑨 then
𝒂 ∈ 𝑩.
SET OPERATIONS
Union of Sets (U)
• “A U B” denotes A union B.
• 𝐴 ∪ 𝐵 = {𝑥|𝑥 ∈ 𝐴 𝑜𝑟 𝑥 ∈ 𝐵 𝑜𝑟 (𝑥 ∈ 𝐴 𝑎𝑛𝑑 𝑥 ∈ 𝐵)}
SET OPERATIONS
Intersection of Sets
• "𝐴 ∩ 𝐵“ denotes the intersection of two sets A and B.
• 𝐴 ∩ 𝐵 = {𝑥|𝑥 ∈ 𝐴 𝑎𝑛𝑑 𝑥 ∈ 𝐵}
• If 𝐴 ∩ 𝐵 = ∅, then A and B are known as disjoint sets.
SET OPERATIONS
Complement of a Set
• Given a set A, its complement is denoted by $A'$ or $A^c$ or $\bar{A}$.
• $A^c = \{x \mid x \in \mathcal{U},\ x \notin A\}$
• i.e. those elements belonging to $\mathcal{U}$ (the universal set) but not in A.
Note the following laws of set algebra relating to the complement of sets:
− $(A^c)^c = A$
− $\mathcal{U}^c = \emptyset$
− $\emptyset^c = \mathcal{U}$
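The set operations and complement laws above can be tried out with Python's built-in `set` type. The universal set and the sets A and B below are hypothetical examples chosen only for illustration.

```python
# Hypothetical universal set U = {1, ..., 10} and two subsets of it.
universal = set(range(1, 11))
A = {2, 4, 6, 8}
B = {1, 2, 3, 4}

def complement(s):
    """Elements of the universal set that are not in s."""
    return universal - s

union = A | B          # A ∪ B
intersection = A & B   # A ∩ B

print(sorted(union), sorted(intersection))
print(complement(complement(A)) == A)        # (A^c)^c = A
print(complement(universal) == set())        # U^c = ∅
print(complement(set()) == universal)        # ∅^c = U
```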
Solution:
Each die has six faces and there are six dice in all, so by the fundamental
principle of counting the number of ways the faces may show up is:
6 ways for die 1 × 6 ways for die 2 × … × 6 ways for the 6th die
= 6 × 6 × 6 × 6 × 6 × 6 = $6^6$ = 46656 ways.
COUNTING PRINCIPLES
Factorial
• It is denoted by the exclamation mark. i.e. !
• Given a positive integer 𝑛, the product of all the whole numbers from 𝑛
to 1 is called 𝑛 factorial which is denoted as 𝒏!
• 𝒏! = 𝒏 𝒏 − 𝟏 𝒏 − 𝟐 … 𝟑 ∙ 𝟐 ∙ 𝟏
• 𝟎! = 𝟏
Eg: Given the numbers 1, 2, 3, 4, how many different three-digit numbers
can be formed from them if repetitions are not allowed?
Solution:
(i) 𝑛(ways) = 4 × 3 × 2 = 24 ways (equivalently $\frac{4!}{(4-3)!} = 4! = 24$)
Eg2: Given the numbers 1, 2, 3, 4, how many different three-digit numbers
can be formed from them if repetitions are allowed?
Solution: 𝑛(ways) = $4^3$ = 4 × 4 × 4 = 64 ways
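Both counts can be confirmed by listing the arrangements explicitly. This is a brute-force sketch using the standard `itertools` module.

```python
from itertools import permutations, product

digits = [1, 2, 3, 4]

# Three-digit numbers with no repeated digit: ordered selections of 3 from 4.
no_repeats = list(permutations(digits, 3))

# Three-digit numbers where digits may repeat: 4 choices for each position.
with_repeats = list(product(digits, repeat=3))

print(len(no_repeats), len(with_repeats))
```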
CYCLIC PERMUTATIONS
• Typically, actual positions do not matter.
For instance; if six people are sitting in a circle, we do not get a new permutation
if they all move one position in a clockwise (or anti-clockwise) direction.
Eg2: Suppose each of the executives was accompanied by the secretary to take
minutes at the meeting.
(a) How many arrangements are possible that alternate the executives and their
secretaries?
(b) If a secretary should sit by his executive, how many arrangements are possible
that alternate the executives and the secretaries?
CYCLIC PERMUTATIONS
Solution:
(a) Suppose an executive sits down at the start. Then there are (5 – 1)! different
arrangements for the remaining executives. The five secretaries can be
seated in the 5 alternating seats, so there are 5! possibilities for them.
Then, by the multiplication principle, there are 5! × 4! = 2880
different arrangements.
(b) Suppose the first to sit down is an executive. Then there are (5 – 1)! different
arrangements of the executives. There are two ways the first secretary can sit, either
to the left or to the right of her executive. Once she sits, all other places are
automatic for the rest of the secretaries. Hence there are 2(4!) = 48 possible
arrangements.
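Both answers can be checked by brute force. The sketch below assumes five executives and five secretaries (as implied by the (5 – 1)! in the solution), fixes one executive's seat to discount rotations, and places the secretaries in the gaps between neighbouring executives; the function name is hypothetical.

```python
from itertools import permutations

N = 5  # number of executives (and of secretaries), as implied by (5 - 1)! above

def count_arrangements(secretary_beside_own_executive):
    """Count alternating circular seatings with executive 0 fixed in place."""
    count = 0
    for rest in permutations(range(1, N)):
        executives = (0,) + rest  # clockwise order of the executives
        for secretaries in permutations(range(N)):
            # secretaries[i] sits between executives[i] and executives[i + 1].
            if secretary_beside_own_executive and not all(
                s in (executives[i], executives[(i + 1) % N])
                for i, s in enumerate(secretaries)
            ):
                continue
            count += 1
    return count

part_a = count_arrangements(False)
part_b = count_arrangements(True)
print(part_a, part_b)
```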
PERMUTATIONS WITH REPETITIONS
• Suppose we want to find the number of permutations (with repetition)
of $n$ objects of which $n_1$ are of one kind, $n_2$ of a second kind, …, and $n_k$ of a kth kind.
• Then the number of ways is $\frac{n!}{n_1!\, n_2! \cdots n_k!}$, where $n_1 + n_2 + \cdots + n_k = n$.
Eg: In how many ways can the letters of the word STATISTICS be arranged?
Solution:
$n = 10$, $n_1$ (number of S) = 3, $n_2$ (number of T) = 3,
$n_3$ (number of I) = 2, $n_4$ (number of A) = $n_5$ (number of C) = 1
∴ The number of ways = $\frac{10!}{3!\,3!\,2!\,1!\,1!} = 50400$ ways.
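The letter counts and the final figure can be verified programmatically. This sketch counts the repeated letters with `Counter` and divides $n!$ by the factorial of each count.

```python
from collections import Counter
from math import factorial

word = "STATISTICS"
counts = Counter(word)  # how many times each letter appears

# n! divided by the factorial of each letter's multiplicity.
arrangements = factorial(len(word))
for c in counts.values():
    arrangements //= factorial(c)

print(dict(counts))
print(arrangements)
```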
NB: It must be noted that the sample space can be finite, countably
infinite or uncountably infinite, i.e. discrete or continuous.
$\bigcup_{i=1}^{n} A_i = S$
For instance, when a die is thrown, the events {1}, {2}, {3}, {4}, {5}, {6}
are collectively exhaustive since their union equals the sample space $S$.
Again, when a coin is tossed, the events {𝐻} and {𝑇} are collectively
exhaustive since their union is the sample space 𝑆.
INTRODUCTION: Definition of terms
Partition
• The events 𝑨𝟏 , 𝑨𝟐 , … , 𝑨𝒏 form a partition of the sample space 𝑆 if
(a) $A_i \neq \emptyset,\ \forall i = 1, 2, \ldots, n$
(b) $A_i \cap A_j = \emptyset,\ \forall i \neq j$ & $i, j = 1, 2, \ldots, n$
(c) $\bigcup_{i=1}^{n} A_i = S$
• Condition (a) means empty classes are not allowed, (b) implies
classes or events should be pairwise mutually exclusive and (c) means
the classes or events must be collectively exhaustive.
INTRODUCTION: Definition of terms
Example: Partition
A coin is tossed thrice. Partition the sample space 𝑆 according to the
number of heads in the outcome.
Solution
The sample space is 𝑆 = {𝐻𝐻𝐻, 𝐻𝐻𝑇, 𝐻𝑇𝐻, 𝑇𝐻𝐻, 𝐻𝑇𝑇, 𝑇𝐻𝑇, 𝑇𝑇𝐻, 𝑇𝑇𝑇}.
The Partitions are:
$A_1 = \{HHH\}$ (3 heads)
$A_2 = \{HHT, HTH, THH\}$ (2 heads)
$A_3 = \{TTH, THT, HTT\}$ (1 head)
$A_4 = \{TTT\}$ (No heads)
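The partition can be generated and checked against the three conditions (non-empty, pairwise disjoint, union equal to S). The following is a small sketch; the dictionary `partition` is keyed by the number of heads.

```python
from itertools import product

# Sample space for three tosses of a coin.
sample_space = {"".join(t) for t in product("HT", repeat=3)}

# Group outcomes by the number of heads they contain.
partition = {}
for outcome in sample_space:
    partition.setdefault(outcome.count("H"), set()).add(outcome)

blocks = list(partition.values())
non_empty = all(blocks)
disjoint = sum(len(b) for b in blocks) == len(set().union(*blocks))
exhaustive = set().union(*blocks) == sample_space

print({k: sorted(v) for k, v in sorted(partition.items())})
print(non_empty, disjoint, exhaustive)
```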
INTRODUCTION: Definition of terms
Independent Events
• Two events 𝐴 and 𝐵 are independent if the occurrence (or non-occurrence) of
one of them is not affected by the occurrence (or non-occurrence) of the
other.
Example (Independence): When two coins are tossed, the occurrence of the
event “head” on the first coin and “tail” on the second coin are independent.
• Otherwise, the two events are dependent.
Example (Dependence): A box contains two blue pens and one red pen. Two
pens are picked at random successively. The events “blue pen picked in the first
round” and “red pen picked in the second round” are dependent. Clearly, the
likelihood that you will pick a red pen depends on whether it has been picked
already or not.
INTRODUCTION: Definition of terms
Properties of Independent Events
If A & 𝐵 defined over the same sample space 𝑆 are independent events, then
(a) A & 𝐵′ are independent
(b) A’ & 𝐵 are independent
(c) A’ & 𝐵′ are independent
• Also, if $B \subseteq A$, then
$P(A \mid B) = \frac{P(A \cap B)}{P(B)} = \frac{P(B)}{P(B)} = 1$
• Also
$P(B \mid A) = P(B)$
$P(B) = \sum_{i=1}^{n} P(A_i \cap B)$
$P(B) = \sum_{i=1}^{n} P(A_i) P(B \mid A_i)$
(b) It is later learned that the selected survey subject was smoking a
cigar. Also, 9.5% of males smoke cigars, whereas 1.7% of females
smoke cigars. Use this additional information to find the probability
that the selected subject is a male.
[Figure: the random variable $X$ maps each sample point $s$ in the sample space to a real number $X(s)$ in the real number system.]
Introduction
Example:
Let the random experiment be the tossing of a fair coin twice.
The Sample space 𝑺 = {𝑯𝑯, 𝑯𝑻, 𝑻𝑯, 𝑻𝑻}
We can define a r.v. 𝑿 as the number of tails.
In tabular form;
Sample Point             HH   HT   TH   TT
$X$ (number of tails)    0    1    1    2
Introduction
NB: (a) From the example, the original sample space has four sample
points but the event space has 3 points i.e. 𝟎, 𝟏, 𝟐
(b) 𝑿 is a function with domain 𝜴 and range ⊂ ℝ
Example:
Give the sample space and range space (event space) of each of the following r.v.s
when two coins are tossed:
(i) Number of heads × Number of tails
(ii) Number of heads + Number of tails
[Figure: bar chart of a p.m.f., with probabilities 1/8, 2/8 and 3/8 marked on the vertical axis and the values $x_i$ = 0, 1, 2, 3 on the horizontal axis.]
Discrete Probability Distributions
Example 2: A committee of 4 is to be selected from a group of 5 men and
5 women. Let 𝑿 be the r.v. representing the number of women in the
committee. Create the p.m.f.
Solution:
$X$ = number of women; hence $x = 0, 1, 2, 3, 4$.
The number of points in the sample space $S$ is $\binom{10}{4}$.
Number of ways of selecting women = $\binom{5}{x}$
Number of ways of selecting men = $\binom{5}{y}$
Discrete Probability Distributions
But $x + y = 4 \Rightarrow y = 4 - x$
Therefore, the number of ways of constituting the committee is:
$\binom{5}{x}\binom{5}{y} = \binom{5}{x}\binom{5}{4-x}$
Hence, in conclusion,
$P(X = x) = \frac{\binom{5}{x}\binom{5}{4-x}}{\binom{10}{4}}, \quad x = 0, 1, 2, 3, 4.$
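The p.m.f. can be tabulated exactly with `math.comb` and `Fraction`, which also confirms that the probabilities sum to 1. This is a sketch; the function name `pmf` is illustrative.

```python
from fractions import Fraction
from math import comb

def pmf(x):
    """P(X = x): x women from 5 and 4 - x men from 5, out of all committees of 4."""
    return Fraction(comb(5, x) * comb(5, 4 - x), comb(10, 4))

table = {x: pmf(x) for x in range(5)}

for x, p in table.items():
    print(x, p)
print("Total:", sum(table.values()))
```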
Discrete Probability Distributions
Example 3:
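Verify that $p(x)$ is a pmf of some r.v. $X$ if:
$p(x) = \begin{cases} \frac{1}{21}(2x + 3), & x = 1, 2, 3 \\ 0, & \text{elsewhere} \end{cases}$
Example 4:
Given $p(x) = \begin{cases} k(x - 1), & x = 3, 4, 5 \\ 0, & \text{otherwise} \end{cases}$,
find $k \in \mathbb{R}$ such that $p(x)$ is a legitimate pmf.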
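Both examples come down to the two pmf conditions: every probability is non-negative and the probabilities sum to 1. A short sketch using exact fractions:

```python
from fractions import Fraction

# Example 3: p(x) = (2x + 3) / 21 for x = 1, 2, 3.
p3 = {x: Fraction(2 * x + 3, 21) for x in (1, 2, 3)}
is_pmf = all(v >= 0 for v in p3.values()) and sum(p3.values()) == 1

# Example 4: k * sum of (x - 1) over x = 3, 4, 5 must equal 1.
k = 1 / sum(Fraction(x - 1) for x in (3, 4, 5))

print(is_pmf, k)
```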
Continuous Probability Distributions
• The probability distribution for a continuous random variable is given
either in graphical form or functional form.
• This is known as probability density function (pdf) denoted by 𝒇 𝒙 .
• Tables cannot be used because listing all values is impossible.
Definition: A function $f(x)$ defined on the real numbers is called a pdf if it
satisfies the following properties:
P1: $f(x) \geq 0,\ \forall x$
P2: $\int_{-\infty}^{\infty} f(x)\,dx = 1$
[Figure: graph of $f(x) = x/8$ for $x$ from 0 to 4, with 1/8, 2/8, 3/8 and 4/8 marked on the vertical axis.]
• The cdf is the most universal characteristic of a r.v. thus, it exists for all
random variables be it discrete or continuous.
• It is also known as the distribution function
[Figure: graph with $x$ from 0 to 4 on the horizontal axis and 2/8, 4/8 and 6/8 marked on the vertical axis.]
Distribution function of Continuous r.v.
Definition: Let $X$ be a continuous r.v. with pdf $f(x)$. Then the cdf $F(x)$ is
given by
$F(x) = \int_{-\infty}^{x} f(t)\,dt$
Example: The pdf of $X$ is given by: $f(x) = \begin{cases} 0, & x < 0 \\ \frac{x}{2}, & 0 \leq x \leq 2 \\ 0, & x > 2 \end{cases}$
Find the cdf of $X$.
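Integrating the pdf piece by piece suggests $F(x) = 0$ for $x < 0$, $F(x) = x^2/4$ for $0 \leq x \leq 2$ and $F(x) = 1$ for $x > 2$. The sketch below compares that closed form with a simple numerical integration of the pdf; the helper `cdf_numeric` is a hypothetical midpoint-rule approximation.

```python
def pdf(x):
    """pdf from the example: x / 2 on [0, 2], zero elsewhere."""
    return x / 2 if 0 <= x <= 2 else 0.0

def cdf(x):
    """Closed form obtained by integrating the pdf piece by piece."""
    if x < 0:
        return 0.0
    if x <= 2:
        return x ** 2 / 4
    return 1.0

def cdf_numeric(x, steps=10_000):
    """Midpoint Riemann sum of the pdf from 0 to x (the pdf is 0 below 0)."""
    if x <= 0:
        return 0.0
    h = x / steps
    return sum(pdf((i + 0.5) * h) for i in range(steps)) * h

for x in (0.5, 1.0, 1.5, 2.0):
    print(x, cdf(x), round(cdf_numeric(x), 6))
```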
Theorem: If $P(X \leq x_0) \geq \frac{1}{2}$ and $P(X \geq x_0) \geq \frac{1}{2}$, then $x_0$ is said to be the
median of the distribution.
$E(X) = \sum_{i=1}^{n} x_i\, p(x_i)$
When the $x_i$ are all equally likely, $f_i = 1,\ \forall i$, and the expectation
becomes:
$E(X) = \frac{1}{N} \sum_{i=1}^{n} f_i x_i$
Numerical Characteristics of Random Variables
Definition: [The Expectation of a Continuous r.v.]
Suppose $X$ is a continuous random variable; then the expectation is given by:
$E(X) = \int_{-\infty}^{\infty} x f(x)\,dx$
Property 4: The expectation of the deviation of a r.v. $X$ from its mean is zero, i.e.
$E(X - \mu) = 0$
Numerical Characteristics of Random Variables
Variance
• The variance of a r.v. $X$ is the expectation of the square of the deviation of the
r.v. from its expected value, i.e.
$Var(X) = E[X - E(X)]^2 = E[X - \mu]^2 = \sum_{i=1}^{n} (x_i - \mu)^2 p(x_i)$
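Solution:
$E(2X) = \sum_{i=0}^{3} 2 x_i\, p(x_i) = 2 \sum_{i=0}^{3} x_i\, p(x_i) = 2\left(0 \times \frac{1}{8} + 1 \times \frac{3}{8} + 2 \times \frac{3}{8} + 3 \times \frac{1}{8}\right) = 3$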
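The same calculation can be carried out for any function of $X$. The sketch below uses the p.m.f. from the solution (probabilities 1/8, 3/8, 3/8, 1/8 on the values 0 to 3); the helper `expectation` is hypothetical.

```python
from fractions import Fraction

# p.m.f. used in the solution above.
pmf = {0: Fraction(1, 8), 1: Fraction(3, 8), 2: Fraction(3, 8), 3: Fraction(1, 8)}

def expectation(g=lambda x: x):
    """E[g(X)] = sum of g(x) * p(x) over all values x."""
    return sum(g(x) * p for x, p in pmf.items())

mean = expectation()                               # E(X)
e_2x = expectation(lambda x: 2 * x)                # E(2X)
variance = expectation(lambda x: (x - mean) ** 2)  # E[(X - mu)^2]
deviation = expectation(lambda x: x - mean)        # E(X - mu)

print(mean, e_2x, variance, deviation)
```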
Numerical Characteristics of Random Variables
Moments
• The moment of a r.v. 𝑿 is the expectation of different 𝒌 powers (𝒌 =
𝟏, 𝟐, … ) of the r.v when the expectation exists.
Types of Moments
• Moment about the origin
• Moment about the mean
• Moment about a point
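Solution: $S = \{1, 2, 3, 4, 5, 6\}$, $P(X = 1) = \frac{1}{6}$ & $P(X = 0) = \frac{5}{6}$. Hence,
(i) $P(X = x) = \begin{cases} \frac{1}{6}, & x = 1 \\ \frac{5}{6}, & x = 0 \end{cases}$
(ii) $E(X) = p = \frac{1}{6}$ & $Var(X) = pq = \frac{5}{36}$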
Bernoulli Distribution
Example 2: Suppose the probability of germination of a bean seed is
0.8 and the germination of a seed is considered a success. If 10 seeds are
planted independently of each other, describe the experiment and
characterize its probability distribution.
Solution: The experiment involves 10 Bernoulli trials with success
probability (or parameter) $p = 0.8$. Thus, if $X$ = germination of a seed, then $X$ =
0 or 1, where 0 = non-germination & 1 = germination of a seed.
$P(X = x) = \begin{cases} 0.8, & x = 1 \\ 0.2, & x = 0 \end{cases}$
NB:
Theorem: The Binomial Distribution is a legitimate probability distribution.
Binomial Distribution
Properties of the Binomial Distribution
If $X \sim b(x; n, p)$ then:
(a) $E(X) = np$
(b) $Var(X) = npq$, where $q = 1 - p$
Solution:
Let a head be ‘a success’; then $X$ = the number of heads,
and $n = 5$, $p = \frac{1}{2}$, $q = 1 - p = \frac{1}{2}$.
The r.v. $X \sim b(x; 5, \frac{1}{2})$
Binomial Distribution
(a) $P(X = 3) = b(3; 5, \frac{1}{2}) = \binom{5}{3}\left(\frac{1}{2}\right)^3\left(\frac{1}{2}\right)^{5-3} = 0.3125$
(b) $P(X \geq 3) = P(X = 3) + P(X = 4) + P(X = 5) = 0.5$
(c) $P(X \geq 1) = 1 - P(X = 0) = 0.96875$
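These probabilities follow directly from the binomial p.m.f. and can be reproduced in a few lines. This is a sketch with $n = 5$ and $p = 1/2$ as in the solution; the function name `binom_pmf` is illustrative.

```python
from math import comb

def binom_pmf(x, n, p):
    """b(x; n, p) = C(n, x) * p^x * (1 - p)^(n - x)."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

n, p = 5, 0.5

p_exactly_3 = binom_pmf(3, n, p)
p_at_least_3 = sum(binom_pmf(x, n, p) for x in range(3, n + 1))
p_at_least_1 = 1 - binom_pmf(0, n, p)

print(p_exactly_3, p_at_least_3, p_at_least_1)
print(n * p, n * p * (1 - p))  # mean np and variance npq
```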
$B(r; n, p) = \sum_{x=0}^{r} b(x; n, p), \quad x = 0, 1, 2, \ldots, n$
Solution: Let $X$ be the number of times the sum of the two numbers which
show up is 10.
Then $X \sim b(x; 360, p)$
The required event is $\{(4, 6), (6, 4), (5, 5)\}$, so $p = \frac{3}{36} = \frac{1}{12}$
Binomial Distribution
(a) $E(X) = np = 360 \times \frac{1}{12} = 30$
(b) $Var(X) = np(1 - p) = 360 \times \frac{1}{12} \times \left(1 - \frac{1}{12}\right) = 27.50$
Therefore, the standard deviation $\sigma = \sqrt{Var(X)} = \sqrt{27.50} = 5.24$
$P(r; \lambda) = P(X \leq r) = \sum_{x=0}^{r} p(x; \lambda)$