Class Note 303
Chapter 1
Time Series Analysis
#Types of Population
i) Finite Population: The finite population is also known as a countable
population, in which the population can be counted.
ii) Infinite Population: The infinite population is also known as an
uncountable population, in which counting units in the population is not
possible.
to individual.
#Types of Variables
i) Discrete Variables: Discrete variables are variables for which the values
they can take are countable and have a finite number of possibilities.
ii) Continuous variables: A continuous variable is defined as a variable that
can take an uncountable set of values or infinite set of values.
The interval may be an hour, a day, a week, a month, or a calendar year. For
example, hourly temperature readings, daily sales in a shop, weekly sales in a
market, monthly production in an industry, yearly agricultural production, and
population growth over ten years are all time series.
#Objectives of Time Series Analysis: The main objective of analyzing time
series data is to get a concrete idea about the past behavior of the data so that
an appropriate course of action for the future can be taken. More specifically,
the objectives can be pointed out as follows:
1) To identify the pattern and trend, and isolate the influencing factors or
effects.
2) To apply the idea obtained from analyzing the pattern of time series data for
future planning and control.
Prepared By: Md. Toufiq Hasan (24th Batch, AIS, RU)
# Mathematical Models for Time-Series Analysis: There are two mathematical
models which are used for the decomposition of a time series into different
components. These are:
• Multiplicative Model
• Additive Model
# Multiplicative Model: In traditional or classical time series analysis, it is
assumed that there is a multiplicative relationship among these four components.
Let Yt denote the value of a series at time t. Symbolically,
Yt = Tt × St × Ct × It
where, Tt = Secular Trend Component, St = Seasonal component, Ct = Cyclical
component and It = Irregular component.
For example, if Tt =450, St = 1.4, Ct = 1.6 and It = 0.9, then, Yt = 450 × 1.4 × 1.6 ×
0.9 = 907.2
# Additive Model: According to this model, a time series is the sum of its four
components. Symbolically,
Yt = Tt + St + Ct + It
where, Tt = Secular Trend Component, St = Seasonal component, Ct = Cyclical
component and It = Irregular component.
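The two models can be compared with a minimal Python sketch. The function names `multiplicative` and `additive` are hypothetical helpers, not part of the notes. The component values are the ones from the multiplicative example above; reusing them in the additive call is only for illustration, since in an additive model every component would normally be expressed in the units of the series.

```python
def multiplicative(trend, seasonal, cyclical, irregular):
    """Yt = Tt x St x Ct x It (hypothetical helper name)."""
    return trend * seasonal * cyclical * irregular


def additive(trend, seasonal, cyclical, irregular):
    """Yt = Tt + St + Ct + It (hypothetical helper name)."""
    return trend + seasonal + cyclical + irregular


# Component values taken from the multiplicative example in the notes.
y_mult = multiplicative(450, 1.4, 1.6, 0.9)
# Same numbers reused purely to show the additive formula.
y_add = additive(450, 1.4, 1.6, 0.9)

print(round(y_mult, 1))
print(round(y_add, 1))
```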
# Secular Trend (Tt): Many time series met in practice exhibit a tendency of either
growing or reducing fairly steadily over time. This tendency of a time series data
over a long period of time is called secular trend. Some series increase slowly,
some increase fast, others decrease at varying rate, and some remain relatively
constant for long periods of time.
For example, despite short-run deviations from the trend, the gross domestic
product of a country, the prices or production of a farm or a country, and the
money supply show an upward trend over the years due to increasing population.
On the other hand, the population growth rate or death rate shows a downward
trend. Time series data can be represented by a historigram (a line graph of the
values plotted against time), which exhibits the long-term trend.
# Factors affecting trend: There are several factors that affect trend in time series
data. Some of the important factors are:
1. Population: The ever-increasing population of a country is responsible for
increasing trends in series like prices, population, production, sales, etc.
2. Technology, Institution and culture: Downward or upward trends in some
factors are caused by technological, institutional or cultural changes.
# Reasons for studying Trends: The reasons for studying trend in a time series
data are pointed below:
1. It allows us to describe the historical pattern of time series data.
# Seasonal Variations (St): Seasonal variations are like cycles, but they occur
over short and repetitive calendar periods. By seasonal variation we mean a
periodic movement that repeats itself with remarkable similarity at regular
intervals of time, the period being no longer than one year. Hence, the seasonal
component of a time series is the repetitive and predictable movement of
observations around the trend line during particular time intervals of the year.
In order to measure or detect the seasonal component, the data must be given in
small units of time, such as hours, days, weeks, months or quarters. Many
business and economic time series met in practice consist of quarterly or
monthly observations.
(i) Climate and weather factors: Changes in climate and weather conditions such
as humidity, heat, rainfall etc. act on different products and industries
differently and cause changes in the demand for them.
(ii) Social customs and religious factors: Due to religious festivals, the sales
or demand of particular products vary over the year.
# Cyclical Fluctuation: A cycle is a wave-like pattern about a long-term trend
that is usually apparent over a number of years. The term cycle refers to the
recurrent variations in a time series that usually last longer than a year. One
complete period is called a cycle. Many business and economic time series met in
practice appear to exhibit oscillatory, or cyclical, patterns unconnected with
seasonal behavior. They are not necessarily regular, but follow a rather smooth
pattern of upswings and downswings. Examples of cycles include the business
cycle that records periods of economic recession and inflation, long-term
product-demand cycles and cycles in the monetary and financial sectors. There
are four well-defined periods or phases in the business cycle, namely
i) prosperity, ii) decline, iii) depression and iv) improvement.
# There are two reasons for identifying the irregular components in time series
data. These are:
to certain degree of error owing to the unpredictable erratic, influences, which may
enter.
i) Graphic method: The given data are plotted on a graph paper and a free hand
trend line fitted to the data is obtained just by inspection. A freehand curve drawn
through the data values is often an easy and perhaps, adequate representation of
data. Forecasts can be obtained simply by extending the line.
A trend line fitted by this method should conform to the following conditions:
a) The trend line should be smooth - a straight line or mix of long gradual curves.
b) The numerical sum of vertical deviations of the observations below the trend
should be equal to the numerical sum of deviations above the line.
c) The sum of squares of deviations of the observations from the trend line should
be as small as possible.
d) The trend line should bisect the cycles so that area above the trend line should
be equal to the area below the line, not only for the entire series but as much as
possible for each full cycle.
Example: Fit a trend line to the following data by using the freehand curve method.
Sales turnover: 80 85 97 110 160 94 86 174 180 200 135 120 105
Solution: At first we plot the turnover against year in a graph
Merits
i) This is the simplest, quickest and easiest method of estimating trend values.
ii) This method is very flexible in the sense that it can be used irrespective of the
nature of the trend component, whether it is linear or non-linear.
iii) If the statistician knows the past behavior of the data series, it is possible to
obtain a secular trend by this method, even sometimes better than by any other
mathematical method.
Demerits
i) From its nature of fitting, it is clear that this method is very subjective, because
the trend line depends on personal judgment and therefore what happens to be a
good fit to one individual may not be so for others.
ii) The trend line drawn cannot have much value if it is used as a basis for
prediction.
iii) It is very time consuming to construct a free hand curve if a careful and
conscientious job is to be done.
ii) Semi-average method: In this method, the given data are divided into two
equal parts preferably with equal number of periods. If there is an odd number of
years or period like 7, 9 etc. the middle year is ignored and the two equal parts are
formed. An average of each part is computed and the two points thus obtained are
centered corresponding to the middle period and shown on the graph. A straight
line is drawn through these two points. The values lying on this line describe the
trends. By projecting the line it is possible to forecast the future values.
Like the free hand curve method, the semi-average method of determining the
trend has the following merits and demerits.
Merits
i) This is a more logical and easier method of determining trend than freehand
curve method;
ii) This method does not take that much time to find the trend component;
iii) This is an objective method of determining trends, as everyone who applies
this method is bound to get the same result.
Demerits
i) This method assumes straight line relationship between the time period and the
observations, so if the relationship deviates much from the linearity, then forecast
by this method will be biased and less reliable;
ii) Since this method is based on the arithmetic mean, it also bears the
limitations of this mean. For example, if there are some extreme values in either
half or both halves of the series, then the trend line will not indicate the true
picture of the growth factor. This danger is the greatest if the time period
represented by the average is small.
commodity from the following data.
Year   Sales
2010   20
2011   24
2012   22
2013   30
2014   28
2015   32
Solution: Since there are six years, we will take an average of the first three years
namely 2010, 2011, 2012 and last three years namely 2013, 2014, 2015. These
averages will be plotted corresponding to the middle period i.e. 2011 and 2014
respectively. When these two are joined we will get the trend line by the method of
semi-averages.
Year   Sales   3-year total         Semi-average   Trend value
2013   30                                          27.33
2014   28      30 + 28 + 32 = 90    30             30.00
2015   32                                          32.67
Trend Value Calculation: Semi-average increment per year = (30 - 22)/3 = 8/3
2010 : 22 - 8/3 = 19.33 ; 2011 : 22 ; 2012 : 22 + 8/3 = 24.67
2013 : 30 - 8/3 = 27.33 ; 2014 : 30 ; 2015 : 30 + 8/3 = 32.67
The original sales figures and the estimated trend values are shown in the
following graph.
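The semi-average calculation can be sketched in Python as follows. This is a minimal illustration, not a prescribed procedure: the function name `semi_average_trend` is hypothetical, and the sketch assumes equally spaced periods and drops the middle observation when the number of periods is odd, as the method description states.

```python
def semi_average_trend(years, values):
    """Return trend values from the semi-average method (hypothetical helper).

    Assumes equally spaced periods. With an odd number of periods the
    middle observation is left out of both halves.
    """
    n = len(values)
    half = n // 2
    first_years, first_vals = years[:half], values[:half]
    second_years, second_vals = years[n - half:], values[n - half:]

    # Average of each half, centred on the middle period of that half.
    avg1 = sum(first_vals) / half
    avg2 = sum(second_vals) / half
    mid1 = sum(first_years) / half
    mid2 = sum(second_years) / half

    # Increment per period along the line joining the two semi-averages.
    slope = (avg2 - avg1) / (mid2 - mid1)
    return [avg1 + slope * (year - mid1) for year in years]


years = [2010, 2011, 2012, 2013, 2014, 2015]
sales = [20, 24, 22, 30, 28, 32]
trend = semi_average_trend(years, sales)
print([round(t, 2) for t in trend])
```

Extending the same line beyond 2015 (a later year passed through the same formula) would give a forecast in the spirit of the method.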
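or, $\sum y = na + b\sum x$ ………… (i)
or, $\sum xy = a\sum x + b\sum x^2$ ………. (ii)
Solving (i) and (ii) gives
$b = \dfrac{\sum xy - \frac{\sum x \sum y}{n}}{\sum x^2 - \frac{(\sum x)^2}{n}}$ , $a = \bar{y} - b\bar{x}$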
#Problem: The following table shows the production of a fertilizer factory.
Year   Production
2014   70
2015   75
2016   92
2017   98
2018   89
2019   91
2020   100
Requirement:
(i) Calculate the values for the straight-line trend by the least squares method.
#Solution:
(i) y = a + bx, where
$b = \dfrac{\sum xy - \frac{\sum x \sum y}{n}}{\sum x^2 - \frac{(\sum x)^2}{n}}$ , $a = \bar{y} - b\bar{x}$
Here, x = (Year − 2017)/1; for example, for 2014, x = (2014 − 2017)/1 = -3
Year    y     x     xy     x²
2014    70    -3    -210   9
2015    75    -2    -150   4
2016    92    -1    -92    1
2017    98    0     0      0
2018    89    1     89     1
2019    91    2     182    4
2020    100   3     300    9
Total   615   0     119    28
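Here, $b = \dfrac{\sum xy - \frac{\sum x \sum y}{n}}{\sum x^2 - \frac{(\sum x)^2}{n}} = \dfrac{119 - \frac{0 \times 615}{7}}{28 - \frac{0^2}{7}} = \dfrac{119}{28} = 4.25$
$a = \bar{y} - b\bar{x} = \dfrac{615}{7} - 4.25 \times 0 = 87.8571$
(ii) For 2023, x = 6, so the estimated production for 2023 is -
87.85 + 4.25 × 6 = 113.35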
Year    y     x     Trend Line (87.85 + 4.25x)
2014    70    -3    75.1
2015    75    -2    79.35
2016    92    -1    83.6
2017    98    0     87.85
2018    89    1     92.1
2019    91    2     96.35
2020    100   3     100.6
Total   615         ≈ 615
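The least squares fit for this problem can be reproduced with a short Python sketch. The function name `fit_trend_line` is hypothetical, and the sketch assumes an odd number of consecutive years so that the middle year can serve as the origin (x = 0), as in the worked solution. Because it keeps the intercept at full precision (615/7 rather than 87.85), the forecast differs slightly in the second decimal from the hand calculation.

```python
def fit_trend_line(years, y):
    """Fit y = a + b*x by least squares with x measured from the middle year.

    Hypothetical helper; assumes an odd number of consecutive years.
    """
    n = len(y)
    origin = years[n // 2]
    x = [year - origin for year in years]

    sum_x = sum(x)
    sum_y = sum(y)
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    sum_x2 = sum(xi * xi for xi in x)

    # Same formulas as in the notes.
    b = (sum_xy - sum_x * sum_y / n) / (sum_x2 - sum_x ** 2 / n)
    a = sum_y / n - b * sum_x / n
    return a, b, origin


years = list(range(2014, 2021))
production = [70, 75, 92, 98, 89, 91, 100]

a, b, origin = fit_trend_line(years, production)
trend = [a + b * (year - origin) for year in years]
forecast_2023 = a + b * (2023 - origin)

print(round(a, 4), b)
print(round(forecast_2023, 2))
```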
Chapter 2
Probability Distribution
#Probability distribution: A probability distribution is related to a theoretical
frequency distribution. It describes how outcomes are expected to vary. If we
toss a fair coin twice, the possible outcomes are HH, HT, TH, TT.
The number of heads (H) in these outcomes is 2, 1, 1, 0 respectively, so the
number of heads takes values in {0, 1, 2}.
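For the two-toss coin example, the probability distribution of the number of heads can be enumerated directly. The following Python sketch is only an illustration; the variable names are hypothetical and the coin is assumed fair, so the four outcomes are equally likely.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# All equally likely outcomes of tossing a fair coin twice: HH, HT, TH, TT.
outcomes = ["".join(toss) for toss in product("HT", repeat=2)]

# Count how many outcomes give 0, 1 or 2 heads.
head_counts = Counter(outcome.count("H") for outcome in outcomes)

# Probability of each number of heads.
distribution = {
    heads: Fraction(count, len(outcomes))
    for heads, count in sorted(head_counts.items())
}

for heads, prob in distribution.items():
    print(heads, prob)
```

Under these assumptions the probabilities are 1/4, 1/2 and 1/4 for 0, 1 and 2 heads, and they sum to 1.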
#Types of Probability Distributions:
1. Discrete Probability Distributions: Discrete random variable
a. Binomial Distribution
b. Poisson Distribution
2. Continuous Probability Distributions: Continuous random variable
a. Normal Distribution
or, $\binom{n}{x} p^{x} q^{n-x}$
If n and p are the two parameters of the binomial distribution, then the mean and
variance of the distribution are np and npq respectively, where q = 1 - p.
It is to be noted that the mean is greater than the variance of the distribution
since q < 1.
6. Number of successful hits in a target out of fixed number of hits.
1. It is a discrete probability distribution with parameters n and p.
2. The mean of the distribution is np and its variance is npq.
3. The distribution is symmetric if p = q = 0.5
4. The distribution tends to the Poisson distribution if the number of trials n
tends to infinity and p tends to 0 such that np remains finite.
5. The distribution is positively skewed if p < 0.5 and negatively skewed if p >
0.5.
#Problem: It was found that 5% of the products of a lot are defective. If 8
products are selected randomly, what is the probability of getting fewer than 3
defective products?
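One way to check this problem numerically is to sum the binomial probabilities for x = 0, 1 and 2 with n = 8 and p = 0.05. The Python sketch below assumes the selections are independent; the function name `binomial_pmf` is a hypothetical helper.

```python
from math import comb


def binomial_pmf(x, n, p):
    """P(X = x) = C(n, x) * p^x * q^(n - x), with q = 1 - p."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)


n, p = 8, 0.05

# "Less than 3 defective" means x = 0, 1 or 2.
prob_less_than_3 = sum(binomial_pmf(x, n, p) for x in range(3))

print(round(prob_less_than_3, 4))
```

Under these assumptions the probability comes out at roughly 0.994.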
1. The mean of the distribution is λ.
2. The variance of the distribution is also λ.
3. When λ tends to ∞, Poisson distribution tends to normal distribution.
4. The distribution is positively skewed and leptokurtic.
#Normal distribution: A random variable X is said to have a normal
distribution if its probability density function is defined by -
$f(x; \mu, \sigma^2) = \dfrac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2\sigma^2}(x-\mu)^2}$ ; $-\infty < x < +\infty$
Here π and e are mathematical constants: π ≈ 3.1416 and e ≈ 2.7183.
The mean of the normal distribution is μ and its variance is σ².
Chapter 3
Sampling Distribution
#3. What is statistic?
Answer: A characteristic of a sample is called a statistic. Alternatively, any
function of a random sample is known as a statistic.
Answer: The population characteristic is called a parameter.
iii) F-distributions
iv) Distribution of Sample mean
v) Distribution of Sample proportion
Answer: The central limit theorem states that ‘Regardless of the shape of the
population, the distribution of the sample means approaches the normal probability
That means, $Z = \dfrac{\bar{X} - \mu}{\sigma/\sqrt{n}} \sim N(0, 1)$
#9. What is Chi-Square (χ²) Distribution?
Answer: The sum of squares of n independent standard normal variates is called
chi-square with n degrees of freedom. Let Z1, Z2, Z3, ... , Zn be n independent
standard normal variables, then chi-square, denoted χ², is defined as -
$\chi^2_n = \sum_{i=1}^{n} Z_i^2 = \sum Z^2$
μ and variance $\sigma^2$. Then the sample mean is $\bar{X} = \dfrac{1}{n}\sum_{i=1}^{n} x_i$
$= \dfrac{1}{n} \times E(n\bar{x}) = \dfrac{1}{n} \times n\mu = \mu$
$= Var\left(\dfrac{x_1}{n} + \dfrac{x_2}{n} + \dfrac{x_3}{n} + .... + \dfrac{x_n}{n}\right)$
$= \dfrac{1}{n^2} \times n\sigma^2 = \dfrac{\sigma^2}{n}$
Answer: The standard error of the sample mean is $\dfrac{\sigma}{\sqrt{n}}$.
Problem 1: Suppose light bulbs have a life length with an average of 4,000 hours
and a variance of 40,000. If we take a sample of 100 light bulbs and calculate
mean, what is the probability that mean will be: (i) between 3950 and 4100? (ii)
more than 4150?
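Solution: (i) We have µ = 4000 , $\sigma^2$ = 40000, so σ = 200 and n = 100
For $\bar{x}$ = 3950 , z = $\dfrac{\bar{x} - \mu}{\sigma/\sqrt{n}} = \dfrac{3950 - 4000}{200/\sqrt{100}} = \dfrac{-50}{20}$ = -2.5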
(ii) To find the probability that the sample mean will be more than 4150 hours, we
again need to calculate the z-score:
For $\bar{x}$ = 4150 , z = $\dfrac{\bar{x} - \mu}{\sigma/\sqrt{n}} = \dfrac{4150 - 4000}{200/\sqrt{100}} = \dfrac{150}{20}$ = 7.5
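Both parts of Problem 1 can be checked numerically. The Python sketch below is a rough illustration: it builds the standard normal CDF from `math.erf` instead of reading a z-table, and the helper name `normal_cdf` is hypothetical. It assumes the sample mean is approximately normal with standard error σ/√n, as in the notes.

```python
from math import erf, sqrt


def normal_cdf(z):
    """Standard normal CDF, P(Z <= z), via the error function."""
    return 0.5 * (1 + erf(z / sqrt(2)))


mu, variance, n = 4000, 40000, 100
standard_error = sqrt(variance) / sqrt(n)  # 200 / 10 = 20

# (i) P(3950 < sample mean < 4100)
z_low = (3950 - mu) / standard_error
z_high = (4100 - mu) / standard_error
p_between = normal_cdf(z_high) - normal_cdf(z_low)

# (ii) P(sample mean > 4150)
z_4150 = (4150 - mu) / standard_error
p_above = 1 - normal_cdf(z_4150)

print(z_low, z_high, round(p_between, 4))
print(z_4150, p_above)
```

With a z-score of 7.5 the upper-tail probability in part (ii) is practically zero, which is why printed z-tables do not list it.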
#17. What is sampling distribution for proportion?
Answer: Suppose, a variable has two categories which follows binomial
distribution with parameters n and π, and suppose a random sample of size n is
taken from the population, where P is the proportion of a particular category of the
variable. We know the mean and variance of distribution are nπ and nπ(1-π)
respectively.
The standard error of the estimated P is given by $\sigma_p = \sqrt{\dfrac{\pi(1-\pi)}{n}}$ ; $z = \dfrac{P - \pi}{\sigma_p}$
Problem 2: A manufacturer of pens has determined from experience that 5% of the
pens he produces are defective. If a random sample of 420 pens is examined, what
is the probability that the proportion defective is between 0.024 and 0.044?
Solution: Here, $\sigma_p = \sqrt{\dfrac{\pi(1-\pi)}{n}} = \sqrt{\dfrac{0.05(1-0.05)}{420}} = 0.0106$
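The remaining steps of Problem 2 can be sketched numerically in Python. This is an illustrative calculation under the normal approximation for the sample proportion; the helper name `normal_cdf` and the variable names are hypothetical, and the CDF is computed with `math.erf` rather than a z-table, so the result may differ slightly from a table-based answer.

```python
from math import erf, sqrt


def normal_cdf(z):
    """Standard normal CDF, P(Z <= z), via the error function."""
    return 0.5 * (1 + erf(z / sqrt(2)))


pi_defective = 0.05   # population proportion of defective pens
n = 420               # sample size

# Standard error of the sample proportion.
sigma_p = sqrt(pi_defective * (1 - pi_defective) / n)

# z-scores for the two bounds of the sample proportion.
z_low = (0.024 - pi_defective) / sigma_p
z_high = (0.044 - pi_defective) / sigma_p

prob = normal_cdf(z_high) - normal_cdf(z_low)

print(round(sigma_p, 4))
print(round(z_low, 2), round(z_high, 2), round(prob, 3))
```

Under these assumptions the probability is roughly 0.28.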
Problem 3: It is known that 65% of items of a lot are defective. What is the
probability that a simple random sample of size 100 items will show the proportion
of defective items to be 60% or less.
$z = \dfrac{P - \pi}{\sigma_p}$
The required probability is (0.5 - 0.3508) = 0.1492 ≈ 14.92%
#17. What are the parts of statistic?
Answer: Statistic is mainly divided into two parts: (i) Statistical Inference (ii)
Statistical Method
$E(\bar{X}) = \mu$
#24. What is consistency?
Answer: Consistency refers to the effect of sample size on the accuracy of the
estimator. A statistic is said to be a consistent estimator of the population
parameter if it approaches the parameter as the sample size increases. Thus, the
sample mean $\bar{X}$ is said to be a consistent estimator of the population mean μ
if $\bar{X}$ → μ as n → ∞.
mean uses all the sample values for its computation while mode and the median do
not, hence, sample mean is a better estimator in this sense.
shopkeeper would say that it is very likely (with a confidence of 95%, say) that the
daily sales amount would be between Tk. 19500 and Tk. 20500.
#Case I. Confidence interval for the population mean when a sample is drawn from a
normal population with known variance σ².
At the 95% confidence level, the confidence interval for the population mean μ is
given by
$\bar{X} \pm z\sigma_{\bar{x}} = \bar{X} \pm 1.96\,\sigma_{\bar{x}}$
#Problem: A magazine editor wants to know the average family income of its
readers. A random sample of 49 readers yields income data with a mean of Tk.
13,000 and an estimate of the population standard deviation (σ) of Tk. 2,000.
Construct a 95% confidence interval estimate of the population mean.
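Solution: Here, $\sigma_{\bar{x}} = \sigma/\sqrt{n} = 2000/\sqrt{49} = 285.714$
We have, $\bar{X} \pm z\sigma_{\bar{x}}$ = 13000 ± 1.96 × 285.714
= 13000 ± 560
The confidence interval is Tk. 12440 to Tk. 13560.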
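The interval above can be verified with a few lines of Python. The function name `mean_confidence_interval` is a hypothetical helper, and z = 1.96 is the 95% critical value used in the notes.

```python
from math import sqrt


def mean_confidence_interval(sample_mean, sigma, n, z=1.96):
    """Return (lower, upper) for sample_mean ± z * sigma / sqrt(n)."""
    standard_error = sigma / sqrt(n)
    margin = z * standard_error
    return sample_mean - margin, sample_mean + margin


# Values from the magazine-reader problem.
lower, upper = mean_confidence_interval(sample_mean=13000, sigma=2000, n=49)

print(round(lower), round(upper))
```

Passing a different `z` (for example 2.576 for a 99% level) would widen the interval for the same data.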