This document summarizes a research paper that proposes a new approach for determining an optimal interval length in high order fuzzy time series models to improve forecasting accuracy. The approach uses an optimization technique called the golden section search with a single-variable constraint to find the optimum interval length. The method is applied to enrollment data from the University of Alabama and shows improved results compared to existing first and high order fuzzy time series approaches.


Expert Systems with Applications 37 (2010) 5052–5055


Finding an optimal interval length in high order fuzzy time series


Erol Egrioglu a, Cagdas Hakan Aladag b,*, Ufuk Yolcu a, Vedide R. Uslu a, Murat A. Basaran c
a Department of Statistics, Ondokuz Mayis University, Samsun 55139, Turkey
b Department of Statistics, Hacettepe University, Ankara 06532, Turkey
c Department of Mathematics, Nigde University, Nigde 51000, Turkey

Keywords: Forecasting; Fuzzy sets; High order fuzzy time series forecasting model; Length of interval; Optimization

Abstract

Univariate fuzzy time series approaches, which have been widely used in recent years, can be divided into two classes: first order and high order models. The literature has shown that high order fuzzy time series approaches improve forecasting accuracy. The length of interval is vital for obtaining accurate forecasts in fuzzy time series. As shown for first-order models by Egrioglu, Aladag, Basaran, Uslu, and Yolcu (2009), the length of interval also plays a very important role in high order models. In this study, a new approach which uses an optimization technique with a single-variable constraint is proposed to determine an optimal interval length in high order fuzzy time series models. The optimization procedure determines the length of interval that gives the best forecasting accuracy. In the optimization process, we used a MATLAB function employing an algorithm based on golden section search and parabolic interpolation. The proposed method was used to forecast the enrollments of the University of Alabama, where it outperformed the methods it was compared with.
© 2009 Elsevier Ltd. All rights reserved.

* Corresponding author. Tel.: +90 3122977900.
E-mail address: [email protected] (C.H. Aladag).
0957-4174/$ - see front matter © 2009 Elsevier Ltd. All rights reserved.
doi:10.1016/j.eswa.2009.12.006

1. Introduction

Fuzzy time series approaches have been successfully applied to data that include uncertainty, such as stock exchange, temperature and enrollment data. They have found many diverse application areas because they differ from conventional approaches in many respects. The most important difference is that they do not require theoretical assumptions to be checked.
The fuzzy time series approach was first proposed by Song and Chissom (1993a, 1993b, 1994). Sullivan and Woodall (1994) proposed a method based on a Markov model. Chen (1996) proposed a simpler method that does not require matrix operations. Huarng (2001) pointed out that the interval length influences the forecasting performance and proposed two methods for defining the length, one based on the average and one based on the distribution. Egrioglu, Aladag, Basaran, Uslu, and Yolcu (2009) introduced a new approach based on the optimization of the interval length. The studies mentioned above can be categorized as first-order fuzzy time series models.
Since first-order fuzzy time series models have a simple structure, they can be insufficient to explain more complex relationships. For this reason, Chen (2002) proposed a new method which analyzes a high order fuzzy time series forecasting model. Aladag, Basaran, Egrioglu, Yolcu, and Uslu (2009) introduced a new approach, which uses a feed-forward neural network for defining fuzzy relations and is based on a high order fuzzy time series forecasting model.
In this study, a new approach is proposed that analyzes a high order fuzzy time series forecasting model by optimizing the interval length, which plays an important role in partitioning the universe of discourse of the time series. The optimization of the interval length is executed by a MATLAB function called "fminbnd", which uses parabolic interpolation together with golden section search. The proposed method is applied to the enrollment data of the University of Alabama. The results are compared with those of the first and high order approaches available in the literature.
In Section 2, the fundamental definitions of fuzzy time series are presented. In Section 3, the method of Chen (2002) is given. In the subsequent section, the proposed method and its application results are presented. The final section provides a brief conclusion.

2. Fuzzy time series

The definition of fuzzy time series was first introduced by Song and Chissom (1993a, 1993b). In fuzzy time series approaches, theoretical assumptions do not need to be validated, unlike in conventional time series procedures. The most important advantage of fuzzy time series approaches is to be able
to work with a very small set of data and not to require the linearity assumption. Some general definitions of fuzzy time series are given as follows:
Let $U$ be the universe of discourse, where $U = \{u_1, u_2, \ldots, u_b\}$. A fuzzy set $A_i$ of $U$ is defined as $A_i = f_{A_i}(u_1)/u_1 + f_{A_i}(u_2)/u_2 + \cdots + f_{A_i}(u_b)/u_b$, where $f_{A_i}$ is the membership function of the fuzzy set $A_i$, $f_{A_i}: U \to [0, 1]$. Here $u_a$ is a generic element of the fuzzy set $A_i$, and $f_{A_i}(u_a)$ is the degree of belongingness of $u_a$ to $A_i$, with $f_{A_i}(u_a) \in [0, 1]$ and $1 \le a \le b$.

Definition 1. Fuzzy time series. Let $Y(t)$ $(t = \ldots, 0, 1, 2, \ldots)$, a subset of real numbers, be the universe of discourse on which fuzzy sets $f_j(t)$ are defined. If $F(t)$ is a collection of $f_1(t), f_2(t), \ldots$, then $F(t)$ is called a fuzzy time series defined on $Y(t)$.

Definition 2. Fuzzy time series relationships. Assume that $F(t)$ is caused only by $F(t-1)$. Then the relationship can be expressed as $F(t) = F(t-1) * R(t, t-1)$, where $R(t, t-1)$ is the fuzzy relationship between $F(t)$ and $F(t-1)$ and $*$ represents an operator. To sum up, let $F(t-1) = A_i$ and $F(t) = A_j$. The fuzzy logical relationship between $F(t)$ and $F(t-1)$ can be denoted as $A_i \to A_j$, where $A_i$ is the left-hand side and $A_j$ is the right-hand side of the fuzzy logical relationship. Furthermore, these fuzzy logical relationships can be grouped to establish different fuzzy relationships.

Definition 3. Let $F(t)$ be a fuzzy time series. If $F(t)$ is caused by $F(t-1), F(t-2), \ldots, F(t-m)$, then this fuzzy logical relationship is represented by
$$F(t-m), \ldots, F(t-2), F(t-1) \to F(t),$$
and it is called the $m$th order fuzzy time series forecasting model.

3. Chen's high order fuzzy time series method

Chen (2002) proposed a forecasting method based on high order fuzzy time series. The method produces more accurate forecasts than the first-order fuzzy time series methods (Chen, 2002). The model given in Definition 3 can be analyzed by the high order fuzzy time series approach. The steps of the algorithm proposed by Chen (2002) are as follows:

Step 1. Define the universe of discourse and its subintervals. Based on the minimum and maximum values in the data set, the variables $D_{\min}$ and $D_{\max}$ are defined. Then choose two arbitrary positive numbers $D_1$ and $D_2$ so that the interval can be divided evenly:
$$U = [D_{\min} - D_1,\; D_{\max} + D_2].$$
Step 2. Define fuzzy sets based on the universe of discourse and fuzzify the historical data.
Step 3. Fuzzify observed rules.
Step 4. Establish fuzzy logical relationships and group them based on the current states of the data of the fuzzy logical relationships. Based on the linguistically defined variables, the $k$th order fuzzy logical relationship $A_{ik}, A_{i(k-1)}, \ldots, A_{i1} \to A_j$ can be established. For example, suppose the values of years $i-1$ and $i$ correspond to the fuzzy values $A_a$ and $A_b$, and the value of year $i+1$ corresponds to the fuzzy value $A_j$. Then the 2nd order fuzzy logical relationship can be written as $A_a, A_b \to A_j$. In a similar manner, the higher order fuzzy logical relationships and fuzzy logical groups for the 3rd, 4th and other high orders are constructed.
Step 5. Forecast and defuzzify. To obtain crisp values as forecasts, the following procedure is employed to defuzzify the fuzzy values.

• If the $k$th order fuzzified history time series for year $i$ are $A_{ik}, A_{i(k-1)}, \ldots,$ and $A_{i1}$, where $k \ge 2$, and there is the following fuzzy logical relationship in the $k$th order fuzzy logical relationship groups:
$$A_{ik}, A_{i(k-1)}, \ldots, A_{i1} \to A_j,$$
where $A_{ik}, A_{i(k-1)}, \ldots, A_{i1}$ and $A_j$ are fuzzy sets, the maximum membership value of $A_j$ occurs at interval $u_j$, and the midpoint of $u_j$ is $m_j$, then the forecast for year $i$ is $m_j$.
• If the $k$th order fuzzified history time series for year $i$ are $A_{ik}, A_{i(k-1)}, \ldots,$ and $A_{i1}$, where $k \ge 2$, and there are the following fuzzy logical relationships in the $k$th order fuzzy logical relationship groups:
$$A_{ik}, A_{i(k-1)}, \ldots, A_{i1} \to A_{j1}$$
$$A_{ik}, A_{i(k-1)}, \ldots, A_{i1} \to A_{j2}$$
$$\vdots$$
$$A_{ik}, A_{i(k-1)}, \ldots, A_{i1} \to A_{jp},$$
where $A_{ik}, A_{i(k-1)}, \ldots, A_{i1}, A_{j1}, A_{j2}, \ldots,$ and $A_{jp}$ are fuzzy sets, then the forecast for year $i$ is ambiguous (i.e. the fuzzy data of year $i$ may be $A_{j1}$ or $A_{j2}$ or $A_{jp}$). In this case, we must find a higher order fuzzified history time series for year $i$ for which the forecast is unambiguous. Assume that there exists an integer $m$ that resolves this ambiguity, where $m \ge k$, such that the $m$th order fuzzified time series of year $i$ are $A_{im}, A_{i(m-1)}, \ldots,$ and $A_{i1}$, and there is the following fuzzy logical relationship in the $m$th order fuzzy logical relationship groups:
$$A_{im}, A_{i(m-1)}, \ldots, A_{i1} \to A_j,$$
where $A_{im}, A_{i(m-1)}, \ldots, A_{i1}$ and $A_j$ are fuzzy sets, the maximum membership value of $A_j$ occurs at interval $u_j$, and the midpoint of $u_j$ is $m_j$. Then the forecast for year $i$ is $m_j$.
• If the $k$th order fuzzified history time series for year $i$ are $A_{ik}, A_{i(k-1)}, \ldots,$ and $A_{i1}$, where $k \ge 2$, and the right-hand side of the corresponding fuzzy logical relationship in the $k$th order fuzzy logical relationship groups is empty:
$$A_{ik}, A_{i(k-1)}, \ldots, A_{i1} \to \#,$$
where $A_{ik}, A_{i(k-1)}, \ldots,$ and $A_{i1}$ are fuzzy sets, their maximum membership values occur at intervals $u_{ik}, u_{i(k-1)}, \ldots,$ and $u_{i1}$, respectively, and the midpoints of these intervals are $m_{ik}, m_{i(k-1)}, \ldots,$ and $m_{i1}$, respectively, then the forecast for year $i$ is calculated as follows:
$$\frac{1 \times m_{ik} + 2 \times m_{i(k-1)} + \cdots + k \times m_{i1}}{1 + 2 + \cdots + k}.$$

4. The proposed method based on the optimization of MSE

In fuzzy time series approaches, the length of intervals affects the forecasting performance. Hence, it is important to choose an effective length of intervals to improve forecasting accuracy. The proposed method optimizes the length of interval while following the algorithm of Chen (2002).
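
Because the objective evaluated during the optimization is the forecasting error of Chen's (2002) procedure, it may help to see Steps 1 to 5 of Section 3 in code. The following Python sketch is illustrative only and is not the authors' MATLAB implementation. It assumes crisp fuzzification (each observation is assigned to the single interval that contains it), and the names `fuzzify`, `midpoint`, `relationship_groups` and `forecast_at` are hypothetical. The fallback used when the history is exhausted before an ambiguity is resolved is also an assumption, since the text assumes a resolving order exists.

```python
from collections import defaultdict


def fuzzify(values, lower, length):
    """Steps 1-3 (simplified): index j of the interval
    [lower + j*length, lower + (j+1)*length) that holds each value."""
    return [int((v - lower) // length) for v in values]


def midpoint(j, lower, length):
    """Midpoint m_j of interval u_j."""
    return lower + (j + 0.5) * length


def relationship_groups(states, order):
    """Step 4: map each left-hand side (A_ik, ..., A_i1) to the set of
    right-hand sides observed after it."""
    groups = defaultdict(set)
    for t in range(order, len(states)):
        groups[tuple(states[t - order:t])].add(states[t])
    return groups


def forecast_at(states, t, order, lower, length):
    """Step 5: crisp forecast for position t from the states before it."""
    if not order <= t <= len(states):
        raise ValueError("need order <= t <= len(states)")
    m = order
    rhs = set()
    while m <= t:
        lhs = states[t - m:t]  # ordered oldest ... most recent
        rhs = relationship_groups(states, m).get(tuple(lhs), set())
        if len(rhs) == 1:  # unambiguous relationship: midpoint of A_j
            return midpoint(next(iter(rhs)), lower, length)
        if not rhs:  # empty right-hand side: weighted average of midpoints
            weights = range(1, m + 1)  # largest weight on the most recent state
            total = sum(w * midpoint(j, lower, length)
                        for w, j in zip(weights, lhs))
            return total / sum(weights)
        m += 1  # ambiguous: try a higher order
    # Assumption (not in the source): history exhausted, average the candidates.
    return sum(midpoint(j, lower, length) for j in rhs) / len(rhs)


# Toy series, not the enrollment data: intervals of length 10 starting at 0.
toy_states = fuzzify([10, 20, 30, 10, 20, 30, 10], lower=0, length=10)
print(forecast_at(toy_states, len(toy_states), order=2, lower=0, length=10))  # 25.0
```

Under these assumptions, an MSE objective for a given interval length could be formed by calling `forecast_at` for every year that has enough history and averaging the squared differences from the actual values.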

Table 2
The results of the proposed method.

Order   Optimum length of interval   MSE
2       231.7782                     62,639
3       222.0880                     60,714
4       404.2322                     172,820

We used a MATLAB function called "fminbnd" to minimize MSE. The function "fminbnd" finds the minimum of a single-variable function on a fixed interval. It finds a minimum for a problem specified by
$$\min_x f(x) \quad \text{subject to} \quad x_1 < x < x_2.$$
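
For illustration, a problem of this form can be solved with a plain golden section search. The Python sketch below is a simplified stand-in and not MATLAB's "fminbnd": it omits the parabolic interpolation step, assumes `f` has a single minimum on the interval, and uses a hypothetical function name and tolerance. Like "fminbnd", it only evaluates `f` at interior points.

```python
import math

INV_PHI = (math.sqrt(5) - 1) / 2  # reciprocal of the golden ratio, about 0.618


def golden_section_min(f, x1, x2, tol=1e-6):
    """Approximate the minimizer of f on (x1, x2), assuming f is unimodal there."""
    a, b = x1, x2
    c = b - INV_PHI * (b - a)  # left interior point
    d = a + INV_PHI * (b - a)  # right interior point
    fc, fd = f(c), f(d)
    while b - a > tol:
        if fc < fd:
            # Minimum lies in [a, d]; the old c becomes the new right point.
            b, d, fd = d, c, fc
            c = b - INV_PHI * (b - a)
            fc = f(c)
        else:
            # Minimum lies in [c, b]; the old d becomes the new left point.
            a, c, fc = c, d, fd
            d = a + INV_PHI * (b - a)
            fd = f(d)
    return (a + b) / 2


# Toy objective with its minimum at x = 3, searched on (0, 10).
x_hat = golden_section_min(lambda x: (x - 3) ** 2, 0, 10)
print(round(x_hat, 4))
```

Each iteration reuses one of the two previous function values, so only one new evaluation of `f` is needed per step. That matters when every evaluation means re-running a full forecasting procedure.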

In this problem, $x$, $x_1$, and $x_2$ are scalars and $f(x)$ is a function that returns a scalar. In MATLAB, $\hat{x} = \mathrm{fminbnd}(f(x), x_1, x_2)$ returns a value $\hat{x}$ that is a local minimum of the scalar valued function $f(x)$ in the interval $x_1 < x < x_2$. In other words, to find the minimum of the function $f(x)$ in the interval $(x_1, x_2)$,
$$a = \mathrm{fminbnd}(f(x), x_1, x_2)$$
can be used in MATLAB. $f(a)$ gives the local minimum value in the interval $(x_1, x_2)$.
The algorithm used by "fminbnd" is based on golden section search, introduced by Kiefer (1953), and parabolic interpolation. Unless the left endpoint $x_1$ is very close to the right endpoint $x_2$, "fminbnd" never evaluates $f(x)$ at the endpoints, so $f(x)$ need only be defined for $x$ in the interval $x_1 < x < x_2$. If the minimum actually occurs at $x_1$ or $x_2$, "fminbnd" returns an interior point at a distance of no more than 2 * TolX from $x_1$ or $x_2$, where TolX is the termination tolerance. See Brent (1973) or Forsythe, Malcolm, and Moler (1976) for the details of the algorithm.
For a comparative study, the proposed method, Huarng's (2001) average and distribution based length methods, and the methods of Chen (1996), Chen (2002), Song and Chissom (1993a, 1994), Sullivan and Woodall (1994), Aladag et al. (2009) and Egrioglu et al. (2009) are applied to the enrollment data of the University of Alabama. The enrollment data are presented in Table 1, the results obtained from the proposed method are summarized in Table 2, and the results obtained from the other methods are summarized in Table 3.
The initial value for the length of the interval was chosen as 13,000, as in previous studies. MSE for the forecasted observations is used as the measure of forecasting accuracy, so the objective function value equals the MSE obtained from Chen's (2002) method. The optimal length of interval is obtained by single-variable constrained optimization, minimizing the MSE with the MATLAB function "fminbnd" over the interval between 200 and 1000. By optimizing the length of interval we minimize the MSE and thereby increase the forecasting accuracy. We use the function "fminbnd" as follows:
$$x = \mathrm{fminbnd}(f_{\mathrm{MSE}}(x), 200, 1000),$$
where $f_{\mathrm{MSE}}(x)$ gives the MSE value obtained from the forecasted and actual values for the length of interval $x$ when Chen's (2002) method is used. The left and right end points are taken as 200 and 1000, respectively. The function used in the optimization step produces the minimum value of $f_{\mathrm{MSE}}(x)$ in the interval $200 < x < 1000$. The results obtained from the second, third, and fourth order models are given in Table 2. According to Table 2, the best result is obtained with the 3rd order model, for which the optimum length of interval is 222.0880.
A key point in choosing an effective length of intervals is that it should be neither too large nor too small. When the length of intervals is too large, there will be no fluctuations in the fuzzy time series. On the other hand, when the length is too small, the meaning of fuzzy time series will be diminished, as mentioned in Huarng (2001). To avoid finding either a very small or a very large interval in the optimization process, the constrained optimization technique was employed over the interval between 200 and 1000. In other words, we take the left and right end points as 200 and 1000, respectively.

Table 3
The comparison of the results.

Method                        Order   MSE
Song and Chissom (1993a)      1       412,499
Song and Chissom (1994)       1       775,687
Sullivan and Woodall (1994)   1       386,055
Chen (1996)                   1       407,507
Huarng (2001)                 1       78,792 (a)
                                      124,707 (b)
Chen (2002)                   3       86,694
Aladag et al. (2009)          2       78,073
Egrioglu et al. (2009)        1       66,661
Our proposed method           3       60,714
(a) Average based length.
(b) Distribution based length.

5. Results and discussion

In this study, we propose a new approach which is based on the optimization of the interval length for analyzing a high order fuzzy time series forecasting model. The proposed approach improves on the algorithm given by Chen (2002) in terms of forecasting accuracy. In the study by Chen (2002), the interval length is left to the researcher's discretion. However, it is known that the choice of interval length does affect the forecasting performance of fuzzy time series approaches.
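
In symbols, the optimization recapped above can be written as follows. This is a sketch under one assumption: MSE is taken to be the usual mean of squared forecast errors, since the text does not spell out the formula. The symbols $n$ (number of forecasted years), $y_t$ (actual enrollment) and $\hat{y}_t(x)$ (forecast from Chen's (2002) method with interval length $x$) are notation introduced here for illustration.

```latex
\hat{x} \;=\; \arg\min_{200 < x < 1000} f_{\mathrm{MSE}}(x),
\qquad
f_{\mathrm{MSE}}(x) \;=\; \frac{1}{n} \sum_{t=1}^{n} \bigl( y_t - \hat{y}_t(x) \bigr)^2 .
```
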
It is observed that calculating the interval length with an optimization procedure, instead of selecting it arbitrarily, yields better forecast accuracy. In this study, the interval length which minimizes MSE is determined by optimization, which differs from the method proposed by Chen (2002). Egrioglu et al. (2009) previously introduced a first-order fuzzy time series approach that optimizes the interval length and showed that the forecasting performance is improved significantly. Therefore, this study can be considered a generalization of the method proposed by Egrioglu et al. (2009).
Huarng (2001) showed how the selection of the length of interval affects the fuzzy time series analysis. Briefly, a very wide interval length will diminish the fluctuations of the time series, while a narrow one causes a very crisp time series. Therefore, in the proposed method, the choice of the left and right end points of the interval used by the optimization procedure is very important in terms of forecasting accuracy. In the previous section, we explained why we restricted the interval length to between 200 and 1000 for the enrollment data.
When the proposed method is applied to the enrollment data, the order of the model is taken as 2, 3 and 4, respectively. The best forecasts are obtained from the proposed model with order 3. The proposed method achieves the smallest MSE value when compared with the other methods, which are Huarng's (2001) average and distribution based length methods, Chen (1996), Chen (2002), Song and Chissom (1993a), Song and Chissom (1994), Sullivan and Woodall (1994), Aladag et al. (2009) and Egrioglu et al. (2009).

Table 1
The enrollment data.

Years   Actual   Years   Actual
1971    13,055   1982    15,433
1972    13,563   1983    15,497
1973    13,867   1984    15,145
1974    14,696   1985    15,163
1975    15,460   1986    15,984
1976    15,311   1987    16,859
1977    15,603   1988    18,150
1978    15,861   1989    18,970
1979    16,807   1990    19,328
1980    16,919   1991    19,337
1981    16,388   1992    18,876

References

Aladag, C. H., Basaran, M. A., Egrioglu, E., Yolcu, U., & Uslu, V. R. (2009). Forecasting in high order fuzzy times series by using neural networks to define fuzzy relations. Expert Systems with Applications, 36, 4228–4231.
Brent, R. P. (1973). Algorithms for minimization without derivatives. Englewood Cliffs, New Jersey: Prentice-Hall.
Chen, S. M. (1996). Forecasting enrollments based on fuzzy time-series. Fuzzy Sets and Systems, 81, 311–319.
Chen, S. M. (2002). Forecasting enrollments based on high order fuzzy time series. Cybernetics and Systems, 33, 1–16.
Egrioglu, E., Aladag, C. H., Basaran, M. A., Uslu, V. R., & Yolcu, U. (2009). A new approach based on the optimization of the length of intervals in fuzzy time series. Journal of Intelligent and Fuzzy Systems, in press.
Forsythe, G. E., Malcolm, M. A., & Moler, C. B. (1976). Computer methods for mathematical computations. Prentice-Hall.
Huarng, K. (2001). Effective length of intervals to improve forecasting in fuzzy time-series. Fuzzy Sets and Systems, 123, 387–394.
Kiefer, J. (1953). Sequential minimax search for a maximum. In Proceedings of the American Mathematical Society (Vol. 4, pp. 502–506), MR0055639. doi:10.2307/2032161.
Song, Q., & Chissom, B. S. (1993a). Fuzzy time series and its models. Fuzzy Sets and Systems, 54, 269–277.
Song, Q., & Chissom, B. S. (1993b). Forecasting enrollments with fuzzy time series – Part I. Fuzzy Sets and Systems, 54, 1–10.
Song, Q., & Chissom, B. S. (1994). Forecasting enrollments with fuzzy time series – Part II. Fuzzy Sets and Systems, 62(1), 1–8.
Sullivan, J., & Woodall, W. H. (1994). A comparison of fuzzy forecasting and Markov modeling. Fuzzy Sets and Systems, 64(3), 279–293.
