Prosiding SEAMS 2011
Proceedings of the 6th SEAMS-GMU International
Conference on Mathematics and Its Applications
Yogyakarta - Indonesia, 12 - 15 July 2011
Department of Mathematics
Faculty of Mathematics & Natural Sciences
Universitas Gadjah Mada
Sekip Utara Yogyakarta - INDONESIA 55281
Phone : +62 - 274 - 552243 ; 7104933
Fax. : +62 - 274 555131
PROCEEDINGS OF THE 6TH
SOUTHEAST ASIAN MATHEMATICAL SOCIETY
GADJAH MADA UNIVERSITY
INTERNATIONAL CONFERENCE ON MATHEMATICS
AND ITS APPLICATIONS 2011
DEPARTMENT OF MATHEMATICS
FACULTY OF MATHEMATICS AND NATURAL SCIENCES
UNIVERSITAS GADJAH MADA
YOGYAKARTA, INDONESIA
2012
Published by
Department of Mathematics
Faculty of Mathematics and Natural Sciences
Universitas Gadjah Mada
Sekip Utara, Yogyakarta, Indonesia
Telp. +62 (274) 7104933, 552243
Fax. +62 (274) 555131
PROCEEDINGS OF
THE 6TH SOUTHEAST ASIAN MATHEMATICAL SOCIETY-GADJAH MADA UNIVERSITY
INTERNATIONAL CONFERENCE ON MATHEMATICS AND ITS APPLICATIONS 2011
Copyright © 2012 by Department of Mathematics, Faculty of Mathematics and
Natural Sciences, Universitas Gadjah Mada, Yogyakarta, Indonesia
ISBN 978-979-17979-3-1
PROCEEDINGS OF THE 6TH
SOUTHEAST ASIAN MATHEMATICAL SOCIETY-GADJAH MADA
UNIVERSITY
INTERNATIONAL CONFERENCE ON MATHEMATICS AND ITS
APPLICATIONS 2011
Chief Editor:
Sri Wahyuni
Managing Editor :
Managing Team :
Supporting Team :
Parjilan Warjinah
Siti Aisyah Emiliana Sunaryani Yuniastuti
Susiana Karyati
Tutik Kristiastuti Sudarmanto
Tri Wiyanto Wira Kurniawan
Sukir Widodo Sumardi
EDITORIAL BOARDS
Analysis
Supama
Atok Zulijanto
Applied Mathematics
Fajar Adi Kusumo
Salmah
Computer Science
Edi Winarko
MHD. Reza M.I. Pulungan
The Editors
October, 2012
CONTENTS
Title i
Publisher and Copyright ii
Managerial Boards iii
Editorial Boards iv
List of Reviewers v
Preface vii
Paper of Invited Speakers
On Things You Can’t Find: Retrievability Measures and What to Do with Them ……............... 1
Andreas Rauber and Shariq Bashir
The Linear Quadratic Optimal Regulator Problem of Dynamic Game for Descriptor System… 79
Salmah
Algebra
Application of Fuzzy Number Max-Plus Algebra to Closed Serial Queuing Network with
Fuzzy Activity Time ………………………………………………………………………………………………..…………… 193
M. Andy Rudhito, Sri Wahyuni, Ari Suparwanto, F. Susilo
Enumerating of Star-Magic Coverings and Critical Sets on Complete Bipartite Graphs………… 205
M. Roswitha, E. T. Baskoro, H. Assiyatun, T. S. Martini, N. A. Sudibyo
Construction of Rate s/2s Convolutional Codes with Large Free Distance via Linear System
Approach ………………………………………………………………………………………………........…………………….. 213
Ricky Aditya and Ari Suparwanto
Characteristics of IBN, Rank Condition, and Stably Finite Rings ………....................................... 223
Samsul Arifin and Indah Emilia Wijayanti
The Existence of Moore Penrose Inverse in Rings with Involution …........................................ 249
Titi Udjiani SRRM, Sri Wahyuni, Budi Surodjo
Analysis
On Necessary and Sufficient Conditions for into l1 Superposition Operator ……...………... 289
Elvina Herawaty, Supama, Indah Emilia Wijayanti
A DRBEM for Steady Infiltration from Periodic Flat Channels with Root Water Uptake ……….. 297
Imam Solekhudin and Keng-Cheng Ang
Applied Mathematics
Sequence Analysis of DNA H1N1 Virus Using Super Pair Wise Alignment .............................. 331
Alfi Yusrotis Zakiyyah, M. Isa Irawan, Maya Shovitri
Optimization Problem in Inverted Pendulum System with Oblique Track ……………………………. 339
Bambang Edisusanto, Toni Bakhtiar, Ali Kusnanto
Existence of Traveling Wave Solutions for Time-Delayed Lattice Reaction-Diffusion Systems 347
Cheng-Hsiung Hsu, Jian-Jhong Lin, Ting-Hui Yang
Effect of Rainfall and Global Radiation on Oil Palm Yield in Two Contrasted Regions of
Sumatera, Riau and Lampung, Using Transfer Function .......................................................... 365
Divo D. Silalahi, J.P. Caliman, Yong Yit Yuan
A Mathematical Model of Periodic Maintenance Policy based on the Number of Failures for
Two-Dimensional Warranted Product ..…................................................................................. 403
Hennie Husniah, Udjianna S. Pasaribu, A.H. Halim
The Existence of Periodic Solution on STN Neuron Model in Basal Ganglia …………………………. 413
I Made Eka Dwipayana
Expected Value Approach for Solving Multi-Objective Linear Programming with Fuzzy
Random Parameters ….............................................................................................................. 427
Indarsih, Widodo, Ch. Rini Indrati
Chaotic S-Box with Piecewise Linear Chaotic Map (PLCM) ...................................................... 435
Jenny Irna Eva Sari and Bety Hayat Susanti
Safety Analysis of Timed Automata Hybrid Systems with SOS for Complex Eigenvalues …….. 471
Noorma Yulia Megawati, Salmah, Indah Emilia Wijayanti
Global Asymptotic Stability of Virus Dynamics Models and the Effects of CTL and Antibody
Responses ………………………………………………………………………………….……………………………………….. 481
Nughtoth Arfawi Kurdhi and Lina Aryati
A Simple Diffusion Model of Plasma Leakage in Dengue Infection …………………………..………… 499
Nuning Nuraini, Dinnar Rachmi Pasya, Edy Soewono
The Sequences Comparison of DNA H5N1 Virus on Human and Avian Host Using Tree
Diagram Method …………………………………..……………………………………………………………..…………….. 505
Siti Fauziyah, M. Isa Irawan, Maya Shovitri
Fuzzy Controller Design on Model of Motion System of the Satellite Based on Linear Matrix
Inequality …………………………………………………………………….…………….…………..…….…………………….. 515
Solikhatun and Salmah
Unsteady Heat and Mass Transfer from a Stretching Surface Embedded in a Porous Medium
with Suction/injection and Thermal Radiation Effects…………………………………………………………. 529
Stanford Shateyi and Sandile S Motsa
Stability Analysis and Optimal Harvesting of Predator-Prey Population Model with Time
Delay and Constant Effort of Harvesting ………………………………………………………..……………………. 567
Syamsuddin Toaha
Dynamic Analysis of Ethanol, Glucose, and Saccharomyces for Batch Fermentation ………….. 579
Widowati, Nurhayati, Sutimin, Laylatusysyarifah
Survey of Methods for Monitoring Association Rule Behavior ............................. 589
Ani Dijah Rahajoe and Edi Winarko
A Comparison Framework for Fingerprint Recognition Methods .................... 601
Ary Noviyanto and Reza Pulungan
The Global Behavior of Certain Turing System …………………. ..................................... 615
Janpou Nee
Logic Approach Towards Formal Verification of Cryptographic Protocol ......... 621
D.L. Crispina Pardede, Maukar, Sulistyo Puspitodjati
A Framework for an LTS Semantics for Promela …........... ................. .............. 631
Suprapto and Reza Pulungan
Mathematics Education
Consistency of the Bootstrap Estimator for Mean Under Kolmogorov Metric and Its
Implementation on Delta Method ……................................………………….................................. 679
Bambang Suprihatin, Suryo Guritno, Sri Haryatmi
Multivariate Time Series Analysis Using RcmdrPlugin.Econometrics and Its Application for
Finance ..................................................................................................................................... 689
Dedi Rosadi
Unified Structural Models and Reduced-Form Models in Credit Risk by the Yield Spreads …. 697
Di Asih I Maruddani, Dedi Rosadi, Gunardi, Abdurakhman
New Weighted High Order Fuzzy Time Series for Inflation Prediction ……................................ 715
Dwi Ayu Lusia and Suhartono
Prediction the Cause of Network Congestion Using Bayesian Probabilities ............................. 737
Erwin Harapap, M. Yusuf Fajar, Hiroaki Nishi
Solving Black-Scholes Equation by Using Interpolation Method with Estimated Volatility……… 751
F. Dastmalchisaei, M. Jahangir Hossein Pour, S. Yaghoubi
Artificial Ensemble Forecasts: A New Perspective of Weather Forecast in Indonesia ............... 763
Heri Kuswanto
Second Order Least Square for ARCH Model …………………………....................………………............. 773
Herni Utami, Subanar, Dedi Rosadi, Liqun Wang
Valuing Employee Stock Options Using Monte Carlo Method ……………................................. 813
Kuntjoro Adji Sidarto and Dila Puspita
Recommendation Analysis Based on Soft Set for Purchasing Products ................................. 831
R.B. Fajriya Hakim, Subanar, Edi Winarko
Ordering Dually in Triangles (Ordit) and Hotspot Detection in Generalized Linear Model for
Poverty and Infant Health in East Java ……................................…………………………………………. 865
Yekti Widyaningsih, Asep Saefuddin, Khairil Anwar Notodiputro, Aji Hamim Wigena
Empirical Properties and Mixture of Distributions: Evidence from Bursa Malaysia Stock
Market Indices …………………....................................................................................................... 879
Zetty Ain Kamaruzzaman, Zaidi Isa, Mohd Tahir Ismail
An Improved Model of Tumour-Immune System Interactions …………………………………………….. 895
Trisilowati, Scott W. McCue, Dann Mallet
Paper of Invited Speakers
Proceedings of ”The 6th SEAMS-UGM Conference 2011”
pp. 1–8.

ON THINGS YOU CAN’T FIND: RETRIEVABILITY MEASURES
AND WHAT TO DO WITH THEM

Andreas Rauber and Shariq Bashir
Abstract. Information retrieval systems are commonly evaluated along two criteria,
namely effectiveness and efficiency, i.e. how well and how quickly they retrieve relevant
documents. More recently, another measure referred to as retrievability has been gaining
importance, namely how unbiased an IR system is against documents of certain charac-
teristics, i.e. how well all documents can theoretically be found. This paper provides a
short introduction to this concept of retrievability and shows different ways of measuring
and estimating it. We further discuss how retrievability can be used to evaluate
and tune retrieval systems, and discuss open challenges in terms of better understanding
the mathematical characteristics of similarity computation in high-dimensional feature
spaces.
1. INTRODUCTION
In the basic vector space model of information retrieval (IR), both documents as
well as queries are represented as vectors in a high-dimensional feature space. Distance
measures such as cosine distance, any L-norm, and others are used to identify the most
similar document vectors to any query vector, resulting in a ranked list of documents
to be returned to the user. This model poses several interesting challenges and oppor-
tunities for improving IR systems, be it in terms of identifying an optimized feature
space, or selecting a suitable metric that can handle the frequently very high-dimensional
(several tens of thousands of dimensions) and very sparse feature spaces.
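As an illustration of this ranking process (not taken from the paper; the document and query vectors below are hypothetical toy term-frequency vectors), a minimal sketch of cosine-based retrieval might look as follows.

```python
import numpy as np

def cosine_rank(doc_vectors, query_vector):
    """Rank documents by cosine similarity to the query (most similar first)."""
    doc_norms = np.linalg.norm(doc_vectors, axis=1)
    q_norm = np.linalg.norm(query_vector)
    # Small epsilon guards against division by zero for empty documents/queries.
    sims = (doc_vectors @ query_vector) / (doc_norms * q_norm + 1e-12)
    return np.argsort(-sims), sims

# Hypothetical term-frequency vectors for three documents over a 4-term vocabulary.
docs = np.array([[2.0, 0.0, 1.0, 0.0],
                 [0.0, 3.0, 0.0, 1.0],
                 [1.0, 1.0, 1.0, 1.0]])
query = np.array([1.0, 0.0, 1.0, 0.0])

order, scores = cosine_rank(docs, query)
print(order, scores)   # ranked document indices and their similarity scores
```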
Evaluation of IR systems is usually performed along two dimensions, namely effec-
tiveness (how good the system is at returning relevant documents) and efficiency (how
time/memory efficient the system is). With the dominant effectiveness-based measures
being precision and recall, retrievability has recently evolved as an important additional
dimension along which to evaluate information retrieval (IR) systems. It basically eval-
uates to what extent each document can be found by a given system. As each retrieval
model has certain characteristics that determine its behavior against documents of dif-
ferent length, vocabulary size, vocabulary distribution etc., retrievability can be used to
determine a potential bias introduced by a certain retrieval model against specific types
of documents. While retrievability has been primarily used in the context of recall-
oriented application domains (specifically patent retrieval), it may also be applied in a
range of more conventional application domains to understand which documents cannot
be found under certain assumptions, and to tune retrieval systems.
Still, the reasons as to why some documents have low or high retrievability, and
the characteristics of the bias introduced by a retrieval system are not fully understood.
Even more, estimating retrievability without processing prohibitively large numbers of
queries is a challenging task. We thus need to strive for a more systematic approach, re-
lying on the mathematical characteristics of the feature space as resulting from the data
as well as the query space, and their relationship. Similar to the insights gained from
understanding the behavior of distance metrics in very high-dimensional and specifically
very sparse feature spaces [1], we need to obtain a more solid model of the interplay
between data- and query space under a specific distance metric in order to understand
system bias and optimize retrieval. This extended abstract aims at raising a few ques-
tions in this context.
The remainder of this paper is organized as follows. A short introduction to re-
trievability measurement is provided in Section 2. Section 3 then proceeds with describing
some research challenges in the context of retrievability research, before providing some
conclusions in Section 4.
\hat{r}(d) = \frac{\sum_{q \in Q} f(k_{dq}, c)}{|\hat{Q}|}    (3)
where Q̂ is the set of queries that can retrieve d when not considering any rank
cut-off factor.
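A rough sketch of how this estimate can be computed from a batch of ranked result lists is given below; it is not the authors' code, and it uses the common cumulative scoring choice f(k_dq, c) = 1 if document d is ranked at or above the cut-off c for query q and 0 otherwise, with hypothetical result lists.

```python
from collections import defaultdict

def retrievability(result_lists, cutoff):
    """result_lists: dict mapping query -> ranked list of document ids.
    Returns a dict mapping each retrievable document d to r_hat(d):
    the number of queries retrieving d within the cut-off, divided by the
    number of queries that can retrieve d at any rank (the set Q_hat)."""
    hits = defaultdict(int)
    reachable = defaultdict(set)
    for query, ranking in result_lists.items():
        for rank, doc in enumerate(ranking, start=1):
            reachable[doc].add(query)     # d occurs somewhere in the result list
            if rank <= cutoff:            # cumulative scoring f(k_dq, c)
                hits[doc] += 1
    return {d: hits[d] / len(queries) for d, queries in reachable.items()}

# Hypothetical ranked results for three queries over four documents.
results = {"q1": ["d1", "d2", "d3"],
           "q2": ["d2", "d1", "d4"],
           "q3": ["d3", "d4", "d1"]}
print(retrievability(results, cutoff=2))
```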
Numerous experiments have meanwhile been performed on retrievability-related
analysis. One of the surprising findings is that most retrieval systems exhibit a rather
strong bias, either strongly favoring or disfavoring longer or shorter documents that are
rather vocabulary-rich or vocabulary-poor. In most corpora there is a surprisingly high
number of documents that have a retrievability of 0, i.e. that cannot be found via any
query up to a reasonable length, excluding extremely specific queries consisting of
very rare terms. For examples of such analyses, see [2, 3, 4].
3.1. Understanding retrievability. While we have observed the effect of largely vary-
ing retrievability values for different documents in various retrieval models, we still lack
a solid understanding of the relationship between the document space, the query space
and the distance measures used to retrieve documents, which underlie not only all effectiveness-
based measures but also retrievability. On the one hand, in situations where the char-
acteristics of the document space and the query space are highly similar, low-retrievable
documents tend to be outliers in the document space. This seems to be the case in some
multimedia retrieval scenarios such as music retrieval. However, in text retrieval the
situation is somewhat different: while the documents live in a rather continuous space
of weighted term frequency values, the query space is rather discrete, as conventional
queries consist of combinations of unique query terms. Thus, queries basically form
rather discrete sub-spaces of those terms present in the query, ignoring the importance
of each term as expressed in the weighted model used for representing the documents.
As a result of this, even documents that live in rather dense regions of the document
space may end up having low retrievability, as they are never close enough to the dis-
crete query points; this constitutes a different reason for low retrievability, rooted in
the document space.
4. CONCLUSIONS
Retrievability measurement has evolved as a promising evaluation measure for the perfor-
mance of an information retrieval system. It measures to what extent a system provides
in principle equal access to all objects, i.e. to what extent each object is equally likely to
be found using specific queries. Quite surprisingly, information retrieval systems differ
significantly in the bias they impose on a document corpus, returning some objects for
a large number of potential queries, while other objects are virtually never retrieved
within the top-n documents for all possible queries.
Obtaining a better understanding of this phenomenon will offer a basis for better
understanding and optimizing retrieval systems. It will also allow us to better un-
derstand the limitations of what can and what cannot be found. This is increasingly
essential as the corpora of information that we are searching in become increasingly
large, with the willingness of users to scan long result lists diminishing. Thus, under-
standing which data items cannot be retrieved via any query may provide significant
insights on the features and limitations of a system. In order to achieve this, we need
to more thoroughly investigate the mathematical properties of these high-dimensional
and very sparse feature spaces, the characteristics of both the document as well as the
query feature space and the relationship between these two as influenced by the distance
measures used for retrieval.
References
[1] Aggarwal, C. C. and Hinneburg, A. and Keim, D. A., On the Surprising Behavior of Dis-
tance Metrics in High Dimensional Spaces. In Proceedings of the 8th International Conference
on Database Theory (ICDT ’01), 420–434, Springer, 2001.
[2] Bashir, S. and Rauber, A. Identification of low/high retrievable patents using content-based fea-
tures. In PaIR ’09: Proceeding of the 2nd international workshop on Patent information retrieval,
pages 9–16, 2009.
[3] Bashir, S. and Rauber, A. Improving Retrievability and Recall by Automatic Corpus Partition-
ing. In: Transactions on Large-Scale Data- and Knowledge-Centered Systems, 2:122-140,
2010.
[4] Azzopardi, L. and Vinay, V. Retrievability: an evaluation measure for higher order information
access tasks. In CIKM ’08: Proceeding of the 17th ACM conference on Information and knowledge
management, pages 561–570, New York, NY, USA, 2008. ACM.
[5] Sakai, T. Comparing metrics across TREC and NTCIR: the robustness to system bias. In CIKM
’08: Proceeding of the 17th ACM conference on Information and knowledge management, pages
581–590. ACM, 2008.
Andreas Rauber
Vienna University of Technology.
e-mail: [email protected]
Shariq Bashir
Vienna University of Technology.
e-mail: [email protected]
Proceedings of ”The 6th SEAMS-UGM Conference 2011”
pp. 9–18.
A QUASI-STOCHASTIC DIFFUSION-REACTION
DYNAMIC MODEL FOR TUMOUR GROWTH

Keng Cheng Ang
1. INTRODUCTION
Cancer is one of the main causes of death in the world. According to the World
Health Organization, in 2005, out of a total of 58 million deaths worldwide, cancer
accounts for 7.6 million, or around 13%, of all deaths (http://www.who.org/). In Sin-
gapore, cancer is the leading cause of death and is responsible for more than 25% of
all deaths in the past few years (http://www.moh.gov.sg/). It is therefore not sur-
prising that cancer research, both theoretical and experimental, has been gaining more
attention in recent years.
Cancer is a generic term for a group of diseases characterized by the abnormal
and uncontrolled growth of cells. Normal and healthy cells can grow, divide, die and
be replaced in a regulated fashion. However, if a cell becomes transformed as a result
of mutations in certain key genes, it can lose its ability to control its growth. This
in turn may lead to excessive proliferation, creating a cluster of cells, better known as
a primary tumour. If the tumour is malignant, it possesses the ability to spread and
invade neighbouring tissues. In fact, what makes cancer so lethal is the spread of cancer,
or metastasis, to other sites of the body because if the spread is not controlled, it can
result in death.
The development of cancer can be divided into three distinct stages: avascular,
vascular and metastatic. When a primary tumour is first formed, it is simply a collection
of cells without any supply of blood vessels. This is the avascular stage and the avascular
tumour can sometimes remain dormant for a long period of time. If the tumour manages
to induce blood vessels to grow towards it and in turn develops its own network of vessels
(vasculature), it becomes a vascular tumour. The vascular tumour enters the metastatic
stage when its cancer cells are able to escape from the primary tumour to invade tissues
in a distant site and form a secondary tumour.
Avascular tumour growth usually begins with the presence of proliferating cells
dividing and reproducing at some rate. Some of these cells will then turn quiescent, that
is, alive but not dividing. However, if conditions are right, quiescent cells may begin
dividing again. Many cancer therapies specifically target dividing cells. Some quiescent
cells may eventually die, forming a central core of necrotic (dead) cells and the typical
three zone tumour spheroid is formed. This is shown schematically in Figure 1. Such a
typical structure is evident in experimental culture studies performed by researchers
such as Folkman and Hochberg [6].
Figure 1. Schematic of the three-zone tumour spheroid: proliferating cells (outer rim), quiescent cells (middle layer), and necrotic cells (central core).
2. THE MODEL
The tumour growth model considered here assumes that the cells are divided
into sub-populations of proliferating, quiescent and necrotic cells whose cell densities
are denoted by p(x, t), q(x, t) and n(x, t) respectively where t denotes time and x is
the one-dimensional space coordinate. Like the model first proposed by Sherratt and
Chaplain [10] and later modified by Tan and Ang [11], the current model also consists
of a set of partial differential equations based on the usual diffusion-reaction dynamics.
These equations describe the evolutions of p, q and n with time and along x.
However, unlike these previous models, we do not assume that movements of
proliferating and quiescent cells are inhibited by their proximity to each other. Instead,
the standard diffusion model is used for both cell sub-populations p and q. The resulting
model is a set of governing equations stated as follows.
\frac{\partial p}{\partial t} = \frac{\partial^2 p}{\partial x^2} + g(c)\,p\,(1 - p - q - n) - f(c)\,p    (1)

\frac{\partial q}{\partial t} = \frac{\partial^2 q}{\partial x^2} + f(c)\,p - h(c)\,q    (2)

\frac{\partial n}{\partial t} = h(c)\,q    (3)

c = \frac{c_0\,\gamma}{\gamma + p}\,(1 - \alpha(p + q + n))    (4)
The functions g(c), f (c) and h(c) are variable rates of growth, and c(x, t) is a
function representing nutrient concentration. Following Tan and Ang, f (c) and h(c)
are assumed to be decreasing. Proliferating cells become quiescent at the rate of f (c)
and quiescent cells become necrotic at the rate of h(c). In addition, proliferating cells
divide at a rate of g(c) and a Gompertz growth rate is used to represent this rate in this
model. We also assume that as c tends to ∞, both f and h vanish. The growth process is
driven by nutrient supply, whose concentration in this case is governed by Equation (4),
as proposed by Sherratt and Chaplain for in vivo multicellular spheroids. Here, c0 and γ
are constant parameters, with c0 representing the nutrient concentration in the absence
of tumour cell population, and α ∈ (0, 1] represents a constant of proportionality.
Although several functional forms for f (c) and h(c) are possible, in the current
discussion, we take f(c) = \tfrac{1}{2}(1 − \tanh(4c − 2)) and h(c) = \tfrac{1}{2}f(c). For a standard
Gompertz model, we assume g(c) = \beta e^{\beta c} for some β between 0 and 1. For convenience,
we fix g(0) = 1 and set the total cell density p + q + n to 1 at t = 0 and x = 0 in all
computations.
Equations (1) to (4) and the functions f (c), g(c) and h(c) constitute a model for
growth of tumour cells based on diffusion-reaction dynamics.
3. NUMERICAL SOLUTION
The governing equations (1) to (4) may be solved numerically using standard
finite difference methods. In our case, we employ the forward difference approximation
for time derivatives and central difference approximations for the space derivatives.
Using ∆t and ∆x as the time step and space interval respectively in a finite difference
scheme, the following set of discretized equations is obtained.
p_i^{j+1} = p_i^j + \frac{\Delta t}{\Delta x^2}\left(p_{i+1}^j - 2p_i^j + p_{i-1}^j\right) + \Delta t\left[g(c_i^j)(1 - r_i^j) - f(c_i^j)\right]p_i^j    (5)

q_i^{j+1} = q_i^j + \frac{\Delta t}{\Delta x^2}\left(q_{i+1}^j - 2q_i^j + q_{i-1}^j\right) + \Delta t\left[f(c_i^j)\,p_i^j - h(c_i^j)\,q_i^j\right]    (6)

n_i^{j+1} = n_i^j + \Delta t\,h(c_i^j)\,q_i^j    (7)

c_i^{j+1} = \frac{c_0\,\gamma}{\gamma + p_i^j}\left(1 - \alpha r_i^j\right)    (8)

where r_i^j = p_i^j + q_i^j + n_i^j. In these discretized equations, the superscript and subscript
refer to the time level and space position respectively. As an example, p_i^j = p(x_i, t_j) =
p(i\Delta x, j\Delta t).
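As an illustration only (not the authors' MATLAB code), one explicit forward-time, central-space update of equations (5)–(8) could be sketched as below; the rate functions follow the forms chosen in Section 2, and the parameter values and initial data are assumptions in the ranges suggested later in this section.

```python
import numpy as np

# Assumed parameter values (within the ranges suggested in the text).
beta, c0, gamma, alpha = 0.5, 1.0, 10.0, 0.8
f = lambda c: 0.5 * (1.0 - np.tanh(4.0 * c - 2.0))   # proliferating -> quiescent rate
h = lambda c: 0.5 * f(c)                              # quiescent -> necrotic rate
g = lambda c: beta * np.exp(beta * c)                 # Gompertz-type division rate

def step(p, q, n, c, dt, dx):
    """One explicit step of the deterministic scheme (5)-(8)."""
    r = p + q + n
    lap = lambda u: (np.roll(u, -1) - 2.0 * u + np.roll(u, 1)) / dx**2
    p_new = p + dt * lap(p) + dt * (g(c) * (1.0 - r) - f(c)) * p
    q_new = q + dt * lap(q) + dt * (f(c) * p - h(c) * q)
    n_new = n + dt * h(c) * q
    c_new = c0 * gamma / (gamma + p) * (1.0 - alpha * r)
    # Zero-flux boundary conditions for the diffusing populations.
    for u in (p_new, q_new):
        u[0], u[-1] = u[1], u[-2]
    return p_new, q_new, n_new, c_new

# Example driver on a truncated domain, in the spirit of Section 3.
x = np.arange(0.0, 211.0, 1.0)
p = 0.01 * np.exp(-0.1 * x)
q = np.zeros_like(x)
n = np.zeros_like(x)
c = c0 * gamma / (gamma + p) * (1.0 - alpha * (p + q + n))
for _ in range(400):                 # 400 steps of dt = 0.04 reach t = 16
    p, q, n, c = step(p, q, n, c, dt=0.04, dx=1.0)
```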
Given the randomness expected in cell mitosis and cell diffusion in proliferation, it
is both reasonable and realistic to include a stochastic term in the form of multiplicative
noise to Equation (1). Following Doering, Sargsyan and Smereka [4], Equation (5) may
be modified as
p_i^{j+1} = p_i^j + \frac{\Delta t}{\Delta x^2}\left(p_{i+1}^j - 2p_i^j + p_{i-1}^j\right) + \Delta t\left[g(c_i^j)(1 - r_i^j) - f(c_i^j)\right]p_i^j + \tau\sqrt{p_i^j\,\Delta t}\,\Delta W_i^j    (9)
where ∆Wij are independent Gaussians with mean zero and variance ∆t, and τ is
a suitably chosen scaling factor used to control the amplitude of the added noise. To
solve the stochastic differential equation numerically, we use the Euler-Maruyama (EM)
method as described by Higham [8]. A discretized Brownian path over a given duration,
say [0, T ] with a chosen incremental value of δt is first constructed. As part of the finite
difference computation, Equation (9) is used with a stepsize of ∆t = Rδt for some
positive integer R. This is so that the stepsize for the finite difference scheme is
always an integer multiple of the increment for the discretized Brownian path. This, in
essence, is the EM method which ensures that the set of points used in the discretized
Brownian path contains the points of the EM computation.
The increment \Delta W_i^j is then computed using

\Delta W_i^j = W_i^{j+1} - W_i^j = \sum_{k=jR+1}^{(j+1)R} dW_k

and W_0 = 0.
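A minimal sketch of this construction (illustrative, not the authors' implementation): a discretised Brownian path with increment δt = Δt/R is generated for each grid point, R consecutive fine increments are summed to form ΔW_i^j, and the noise term is added to the deterministic update of p in the form read off from the reconstruction of equation (9) above.

```python
import numpy as np

rng = np.random.default_rng(0)

def brownian_increments(num_points, num_coarse_steps, R, dt):
    """Fine Brownian increments dW_k for each grid point, with variance
    delta_t = dt / R, shaped (num_coarse_steps * R, num_points)."""
    delta_t = dt / R
    return np.sqrt(delta_t) * rng.standard_normal((num_coarse_steps * R, num_points))

def coarse_increment(dW, j, R):
    """Delta W^j = W^{j+1} - W^j: sum of the R fine increments of coarse step j."""
    return dW[j * R:(j + 1) * R].sum(axis=0)    # variance R * delta_t = dt

# Hypothetical use with tau, dt, R as in the text; p_det stands in for the
# deterministic right-hand side of the update already computed for step j.
tau, dt, R, num_points, num_steps = 0.1, 0.04, 2, 211, 400
dW = brownian_increments(num_points, num_steps, R, dt)
p = 0.01 * np.exp(-0.1 * np.arange(num_points))
p_det = p.copy()
j = 0
noise = tau * np.sqrt(np.maximum(p, 0.0) * dt) * coarse_increment(dW, j, R)
p_next = p_det + noise
```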
The boundary conditions and initial conditions used in the present discussion, as
well as some suitable parameter values are listed below.
Boundary conditions:

\frac{\partial p}{\partial x} = 0 \quad\text{and}\quad \frac{\partial q}{\partial x} = 0 \qquad \text{at } x = 0 \text{ and as } x \to \infty.

Initial conditions:

q(x, 0) = n(x, 0) = 0 \quad\text{and}\quad p(x, 0) = 0.01\exp(-0.1x).

Suitable parameter values:

\alpha \in (0.2, 0.9), \quad \beta \in (0.1, 1.0), \quad \gamma = 10, \quad c_0 = 1.
For the purpose of discussion, we have also fixed R = 2 and ∆t = 0.04, and
T = 16. Theoretically, x ∈ [0, ∞); however, in practice, we set a finite upper limit
for the space dimension. This upper limit has to be large enough to capture the main
features we wish to see in the results. After some computational experiments, it is found
that x = 210 is good enough, and that ∆x = 1 is an appropriate space interval.
The finite difference equations with the initial boundary conditions and suggested
parameter values, time steps and space intervals, are solved using MATLAB. Results are
presented and discussed in the next section.
Figure 2. Profiles of proliferating cells p, quiescent cells q, and necrotic cells n against space x at t = 2 up to t = 16: (a) deterministic, without noise; (b) quasi-stochastic, with noise.
cells. During this period, however, necrosis is limited to the central tumour core. This
is consistent with results obtained experimentally by Nirmala et al [9].
As time passes from t = 8 to t = 16, proliferating cells continue to increase in
numbers and move away from the centre, indicating absence of tumour regression. This
also compares well with results from Nirmala et al, who observed that there was no
limiting spheroid volume. From t = 8 onwards, necrotic cell density begins to build up
and a necrotic core starts to take shape.
Assuming that the model is radially symmetrical, images of tumour growth may
be constructed in two dimensions from the numerical results obtained by randomly dis-
tributing the cells along a circumference fixed by the corresponding value of x. The
result is a series of images simulating tumour growth progressing towards the distinc-
tive three-zone structure as shown in Figure 3. Comparing the last image in the figure
with the schematic diagram in Figure 1, it is evident that the model provides a rea-
sonable representation of tumour growth. Moreover, the patterns demonstrated by the
model are consistent with experimental observations of Dorie et al [5], and Folkman
and Hochberg [6].
for k = 1, 2, 3, where

\bar{p}_j^k = \frac{1}{N_k}\sum_{i=0}^{N_k - 1} \phi(x_i^k)\,p_{j,i}^k, \qquad x_i^k = \frac{210\,i}{N_k},

and p_{j,i}^k is the approximation to p(x_i^k, T) with N = N_k and the jth independent
realisation of the Brownian path. In the present convergence study, we used
N1 = 210, N2 = 420, N3 = 840 and N4 = 1680, with ∆t = 0.04. Many runs were
carried out to obtain values of Sk and Table 1 records the results of five such runs.
Results shown in the table indicate that with multiplicative noise, the ratio Si /Si+1 is
about 4 in all cases, suggesting that the current numerical method has achieved weak
convergence. It seems reasonable to say that the finite difference scheme used in the present
study converges reasonably well and is stable.
6. CONCLUDING REMARKS
In this paper, we examine a quasi-stochastic dynamic model for tumour growth
by including multiplicative noise in one of the governing equations. The model is solved
using a standard finite difference method, together with the Euler-Maruyama method to
handle the stochastic component. The result is a more realistic model, producing a
simulated tumour growth that agrees with published experimental observations.
A brief analysis of the numerical method using space averages indicates that
the method is both convergent and stable, although it is possible to achieve better
convergence if implicit or semi-implicit finite difference schemes are used. Nevertheless,
the focus of this paper is on a more realistic tumour growth model that can be solved
using a reasonable numerical scheme.
References
[1] Adam, J.A., A simplified mathematical model of tumour growth, Mathematical Bioscience, 81,
224-229, 1986.
[2] Burton, A.C., Rate of growth of solid tumours as a problem of diffusion, Growth, 80, 157-176,
1966.
[3] Davie, A.M. and Gaines, J.G., Convergence of numerical schemes for the solution of parabolic
stochastic partial differential equations, Mathematics of Computation, 70, 121-134, 2000.
[4] Doering, C.R., Sargsyan, K.V. and Smereka, P., A numerical method for some stochastic
differential equations with multiplicative noise, Physics Letters A, 344, 149-155, 2005.
[5] Dorie, M., Kallman, R. and Coyne, M., Effect of cytochalasin b, nocodazole and irradiation
on migration and internalization of cells and microspheres in tumour cell spheroids, Experimental
Cell Research, 166, 370-378, 1986.
[6] Folkman, J. and Hochberg, M., Self-regulation of growth in three dimensions, Journal of Ex-
perimental Medicine, 138, 745-753, 1973.
[7] Greenspan, H.P., Models for the growth of a solid tumour by diffusion, Studies in Applied Math-
ematics, 62, 317-340, 1972.
[8] Higham, D.J., An algorithmic introduction to numerical simulation of stochastic differential equa-
tions, SIAM Review, 43, 525-546, 2001.
[9] Nirmala, C., Rao, J.S., Ruifrok, A.C., Langford, L.A. and Obeyesekere, M., Growth char-
acteristics of glioblastoma spheroids, International Journal of Oncology, 19, 1109-1115, 2001.
[10] Sherratt, J.A. and Chaplain, M.A.J., A new mathematical model for avascular tumour growth,
Journal of Mathematical Biology, 43, 291-312, 2001.
[11] Tan, L.S. and Ang, K.C., A numerical simulation of avascular tumour growth, ANZIAM Journal,
46, C902-C917, 2005.
[12] Ward, J.P. and King, J.R., Mathematical modelling of avascular tumour growth, IMA Journal
of Mathematics Applied in Medicine and Biology, 14, 39-69, 1997.
*-RINGS IN RADICAL THEORY

Halina France-Jackson
Abstract. A semiprime ring A is called a *-ring if A/I is a prime radical ring for every nonzero
ideal I of A. In this survey paper we will show the versatility of *-rings by discussing
some open problems of radical theory which were solved with the aid of *-rings.
Keywords and Phrases: radical; strongly-prime, prime-essential, filial rings; atoms
1. INTRODUCTION
In this paper, all rings are associative and all classes of rings are closed under
isomorphisms and contain the one-element ring 0. The fundamental definitions and
properties of radicals can be found in Andrunakievich and Rjabukhin [2], Divinsky [4]
and Gardner and Wiegandt [14]. The notation I C A means that I is a two-sided ideal
of a ring A. For a class µ of rings, an ideal I of a ring A is called a µ-ideal if the
factor ring A/I is in the class µ and U (µ) denotes the class of all rings that cannot be
homomorphically mapped onto a nonzero ring in µ. As usual, for a radical γ, the γ
radical of a ring A is denoted by γ (A) and S (γ) = {A : γ (A) = 0} is the semisimple
class of γ. π denotes the class of all prime rings and β = U (π) denotes the prime
radical. A ring A is simple if it has no proper ideals, that is, for every I C A, either
I = 0 or I = A.
A ring R is called a ∗-ring (France-Jackson [5]) if it satisfies one of the following
equivalent conditions:
(1) R ∈ π and R/I ∉ π for every proper I C R;
(2) R ∈ S (β) and R/I ∈ β for every 0 ≠ I C R.
Example 1.1. Every simple prime ring is a ∗-ring.
Example 1.2. (France-Jackson [5]) W = {2x/(2y + 1) : x, y ∈ Z and gcd(2x, 2y + 1) = 1} is
a Jacobson radical ∗-ring without minimal ideals.
Example 1.3. (France-Jackson [9]) A nonzero prime heart P H(R) of a prime ring R
is a ∗-ring, where P H(R) = ∩{I : 0 ≠ I C R and R/I ∈ π}.
Example 1.4. (France-Jackson [5]) Let x be a nonzero element of the centre of a semiprime
ring A, and let I be an ideal of Ax maximal with respect to having an empty intersection
with the set {x^n : n ≥ 1}; then Ax/I is a ∗-ring.
Example 1.5. Let M be the unique maximal ideal of a commutative local principal
ideal domain R with the identity element 1. If M ≠ 0, then M is a ∗-ring which is not
a simple ring.
Proof. Since R is a local ring, the set of nonunits of R is the ideal M. Suppose I ⊊ M
is a nonzero prime ideal of M. Then I C R and, since R is a commutative principal
ideal domain, it follows that I = iR and M = nR for some 0 ≠ i ∈ I and n ∈ M.
Then, since i = i1 ∈ I ⊊ M, it follows that i = nr for some r ∈ R. If r ∉ M, then r
is a unit of R and then n = ir^{-1} ∈ I since I C R. But this implies that M ⊆ I ⊊ M,
a contradiction. Thus r ∈ M. But then, since M/I is a prime commutative ring and
0 + I = i + I = (n + I)(r + I), it follows that n ∈ I or r ∈ I. But n ∈ I implies
M ⊆ I ⊊ M, a contradiction, so we must have r ∈ I. Then r = is for some s ∈ R which
implies i1 = i = nr = n(is) = i(ns) since R is commutative. This implies 1 = ns
since R has no zero divisors. This means that 1 ∈ M which implies that M = R. This
contradicts the maximality of M. Thus M has no nontrivial prime ideals. Moreover,
since R is a prime ring, so is M, which implies that M is a ∗-ring.
Suppose M is a simple ring and let 0 ≠ m ∈ M. Then 0 ≠ mM C M because R
is a commutative ring without zero divisors. But, since M is simple, this implies that
mM = M. Then mx = m = m1 for some x ∈ M and, since R has no zero divisors, it
follows that 1 = x ∈ M, which implies that M = R, a contradiction. Thus M is not a
simple ring.
In this survey paper we will show the versatility of ∗-rings by discussing some
open problems of radical theory which were solved with the aid of ∗-rings.
Examples are domains, prime Goldie rings and simple rings with unity. Every
prime ring can be embedded in an SP ring. All SP rings are coefficient rings for some
primitive group rings and this was the initial motivation for their study.
Handelman and Lawrence [15] observed that if a prime ring A whose centre is a
field F contains a nonzero nil ideal that is locally nilpotent as an F-algebra, then A
is not SP. But, by the Golod-Shafarevich theorem (Gardner and Wiegandt [14]), there exists a
finitely generated nil algebra A which is not nilpotent and so not locally nilpotent. So
they asked:
Can an SP ring with unity contain a nonzero nil ideal?
In Korolczuk [16] an essential extension of a ∗-ring which fits the prerequisites
was constructed as follows:
Let A be a finitely generated nil ring which is not nilpotent and let A^1 be the
Dorroh extension of A, which is again finitely generated. Since for any natural number
n, A^n is finitely generated, by Zorn's Lemma we can find an ideal I of A^1 which is
maximal with respect to not containing any A^n. Then A^1/I is an SP ring with unity and,
as A ∈ N implies N(A^1) ≠ 0, we have N(A^1/I) ≠ 0.
By the choice of I, for every nonzero prime ideal J/I of A^1/I, we have A^n ⊆ J
for some n. But, since A^1/J ≃ (A^1/I)/(J/I) ∈ π, then A ⊆ J. Thus 0 ≠ (A + I)/I is
contained in the prime heart P H(A^1/I) of A^1/I, that is, the intersection of all nonzero
prime ideals of A^1/I. But by France-Jackson [9], a nonzero prime heart of a prime ring
is a ∗-ring, so A^1/I is an essential extension of its prime heart.
It is well known (Tumurbat and Wiegandt [26]) that if γ and δ are special radicals
that coincide on all simple prime rings and on all prime rings without minimal ideals,
then γ = δ. Since polynomial rings have no minimal ideals, Ferrero, Tumurbat and
Wiegandt [26] asked:
Can two distinct special radicals coincide on all simple rings and on polynomial
rings A[x] for all rings A?
In Korolczuk [17] (respectively France-Jackson [6]) it was proved that blW (respec-
tively lW) is an atom of S (respectively K) and is not generated by a single simple
idempotent ring.
Since every special (respectively supernilpotent) radical generated by a nonzero
∗-ring is an atom in S (Korolczuk [16]) (respectively in K, France-Jackson [6]), Rjabukhin's
question now becomes:
This question is still open. The following is also an open problem (and a very
difficult one):
It was shown in France-Jackson [5] that the equality of β and U (∗k ) is equivalent
to the lattice S (respectively K) being atomic, with every atom the smallest special
(respectively supernilpotent) radical containing some ∗-ring.
2.7. Prime-like atoms. It was shown in Tumurbat and France-Jackson [25] that the
collection Lspl of all special and prime-like radicals is a complete sublattice of the
lattice S of all special radicals. Minimal elements of Lspl are called prime-like atoms.
It is natural to ask:
Do prime-like atoms exist?
Since blW is a special atom and it is a prime-like radical, it is clearly a prime-like
atom.
2.8. Supernilpotent nonspecial radicals. Almost nilpotent rings are rings whose
every proper homomorphic image is nilpotent. For example, the ∗-ring W is almost
nilpotent. Prime essential rings are semiprime rings in which no nonzero ideal is
a prime ring.
Example 2.1. France-Jackson [7]
Let A be the ∗-ring W , let κ be an infinite cardinal number greater than the
cardinality of A and let W (κ) be the set of all finite words made from a well-ordered
alphabet of cardinality κ, lexicographically ordered. Then W (κ) is a semigroup with
multiplication defined by xy = max {x, y} and the semigroup ring A (W (κ)) is a nonzero
prime essential ring whose every prime homomorphic image is isomorphic to the ring
A.
Since special radicals are hereditary and they contain the prime radical β, every
special radical is supernilpotent. Therefore Andrunakievich [1] asked:
Is every supernilpotent radical special?
In France-Jackson [7] it was proved that any radical (other than β) whose semisim-
ple class contains all prime essential rings is nonspecial. This yields non-speciality of
certain known radicals such as the lower radical L2 generated by the class of all almost
nilpotent rings and thus shows that L2 does not coincide with the antisimple radical
which answers a question of van Leeuwen and Heyman [20] in the negative. A natural
question arises:
Can a nonspecial radical contain a nonzero prime essential ring?
It was proved in France-Jackson [7] that the nonhereditary (and therefore non-
special) Jenkins radical U({all prime simple rings}) (Leavitt and Jenkins [18]) contains the nonzero
nonsimple prime essential ring A(W(κ)) constructed in Example 2.1.
Let ρ be a supernilpotent radical. Let ρ∗ be the class of all rings A such that
either A is simple in ρ or the factor ring A/I is in ρ for every nonzero ideal I of A and
every minimal ideal M of A is in ρ. Let L (ρ∗ ) be the lower radical determined by ρ∗
and let ρϕ denote the upper radical determined by the class of all subdirectly irreducible
rings with ρ-semisimple hearts. Le Roux and Heyman [19] proved that ρ ⊆ L (ρ∗ ) ⊆ ρϕ
and L (G ∗ ) = Gϕ , where G is the Brown-McCoy radical. They asked:
Is it true that L (ρ∗ ) = ρϕ when ρ is replaced by the prime radical β, the locally
nilpotent radical L, the nil radical N or the Jacobson radical J , respectively?
matrices with entries from A having only finitely many nonzero entries, then R is a
nonsimple idempotent ∗-ring with zero centre and so R ∈ χ \ α.
3. CONCLUDING REMARKS
Although many problems were solved using ∗-rings, very little is known about
their structure. Thus there is a strong motivation for studying ∗-rings to determine
their properties.
References
[1] Andrunakievich V. A., Radicals of associative rings I, (in Russian), Mat. Sb. 44, 179-212, 1958.
[2] Andrunakievich V. A. and Rjabukhin Yu. M. , Radicals of algebras and structure theory, (in
Russian), Nauka, Moscow, 1979.
[3] Booth G. L. and France-Jackson H., On the lattice of matric-extensible radicals II, Acta Math.
Hungar. 112 (3), 187-199, 2006.
[4] Divinsky N, Rings and Radicals, Allen & Unwin: London, 1965.
[5] France-Jackson H., ∗-rings and their radicals, Quaestiones Math. 8 (3), 231-239, 1985.
[6] France-Jackson H., On atoms of the lattice of supernilpotent radicals, Quaestiones Math. 10
(3), 251-256, 1987.
[7] France-Jackson H., On prime essential rings, Bull. Austral. Math. Soc. Ser A, 47, 287-290, 1993.
[8] France-Jackson H., On the Tzintzis radical, Acta Math. Hungar. 67 (3), 261-263, 1995.
[9] France-Jackson H., Rings related to special atoms, Quaestiones Math. 24 (1), 105-109, 2001.
[10] France-Jackson H., On a nonsimple idempotent ∗-ring with zero centre, Acta Math. Hungar.
100 (4), 325-327, 2003.
[11] France-Jackson H., On supernilpotent nonspecial radicals, Bull. Austral. Math. Soc., 78, 107-
110, 2008.
[12] Gardner B. J., Small ideals in radical theory, Acta Math. Hungar., 43, 287-294, 1984.
[13] Gardner B. J., Some results and open problems concerning special radicals, in: Radical Theory
(Proceedings of the 1988 Sendai Conference, Sendai 24-30 July 1988), (ed. S. Kyuno) (Uchida
Rokakuho Pub. Co. Ltd, Tokyo, Japan, 1989) 25-26, 1989.
[14] Gardner B. J. and Wiegandt R. Radical Theory of Rings, Marcel Dekker Inc., New York, 2004.
[15] Handelman D. and Lawrence J., Strongly prime rings, Trans. Amer. Math. Soc. 211, 209-223,
1975.
[16] Korolczuk H., Lattices of Radicals of Rings, PhD thesis (in Polish), University of Warsaw, 1982.
[17] Korolczuk H., A note on the lattice of special radicals, Bull. Polish Acad. Sci. Math. 29, 103-104,
1981.
[18] Leavitt W. G. and Jenkins T. L., Non-hereditariness of the maximal ideal radical class, J.
Natur. Sci. Math. 7, 202-205, 1967.
[19] Le Roux H. J. and Heyman G. A. P., A question on the characterization of certain upper radical
classes, Boll. Unione Mat. Ital. Sez. A 17(5), 67-72, 1980.
[20] van Leeuwen L. C. A. and Heyman G. A. P., A radical determined by a class of almost nilpotent
rings, Acta Math. Hungar. 26, 259-262, 1975.
[21] Puczylowski E. R. and Roszkowska E, Atoms of lattices of radicals of associative rings, in
Radical Theory (Proceedings of the 1988 Sendai Conference, Sendai 24-30 July 1988), (ed. S.
Kyuno) (Uchida Rokakuho Pub. Co. Ltd, Tokyo, Japan, 1989) 123-134, 1989.
[22] Rjabukhin J. M., Overnilpotent and special radicals (in Russian), Algebry i moduli Mat. Issled.
48, Kishinev, 80-93, 1978
[23] Snider R. L., Lattices of radicals, Pacific J. Math., 40, 207-220, 1972.
[24] Roszkowska E., Lattices of Radicals of Associative Rings, PhD thesis (in Polish), University of
Warsaw, 1995.
[25] Tumurbat S. and France-Jackson H., On prime-like radicals, Bull. Austral. Math. Soc., 82,
113-119, 2010.
[26] Tumurbat S. and Wiegandt R., A note on special radicals and partitions of simple rings, Comm.
Algebra 30 (4), 1769-1777, 2002.
[27] Tzintzis G., An almost subidempotent radical property, Acta Math. Hungar. 49, 173-184, 1987.
Halina France-Jackson
Nelson Mandela Metropolitan University
e-mail: [email protected]
Proceedings of ”The 6th SEAMS-UGM Conference 2011”
pp. 29–40.

CLEAN RINGS AND CLEAN MODULES

Indah Emilia Wijayanti
Abstract. The intensive investigation of (strongly) n-clean rings and right (left) clean
rings has been done by many authors. Moreover, in every case they show some al-
most similar behaviour. The author has generalized those notions by combining the
definitions of n-clean rings and right clean rings, i.e. right n-clean rings. In this paper
we give an overview of the properties of (strongly) n-clean rings and right n-clean rings,
especially the product, the quotient ring, the homomorphic image and matrices over a
(strongly) n-clean ring or right n-clean ring. Furthermore, we also give an overview of
some properties of (strongly) n-clean modules and right clean modules.
Keywords and Phrases: (strongly) n-clean rings, right clean rings, right n-clean rings,
(strongly) n-clean modules, right clean modules.
1. INTRODUCTION
Throughout, by the ring R we mean an associative ring with identity 1R. An
element e in the ring R is called (strongly) clean if e can be decomposed into a sum of a
(nonzero) idempotent element and a unit element. An element r in the ring R is called
(strongly) n-clean if r can be decomposed into a sum of a (nonzero) idempotent element
and n unit elements. A ring R is called strongly n-clean and n-clean if all its elements
are strongly n-clean and n-clean respectively.
Chen and Cui in [4] gave some characterizations of a (strongly) clean ring. Khak-
sari and Moghimi [6] introduced a slight generalization of clean rings to n-clean rings.
Moreover, they also presented some properties of clean modules. A clean module is a
module whose endomorphism ring is a clean ring. Wang and Chen in [7] studied 2-
clean rings and presented some properties which can be generalized to n-clean rings.
Călugăreanu [1] defined a generalization of clean to right clean by replacing the
units by right units. Furthermore, Wijayanti [8] proposed a more general definition, i.e.
right n-clean rings. In this paper we give an overview of the properties of strongly or
right n-clean rings, especially the product of strongly or right n-clean rings, the quo-
tient ring, homomorphic image of a strongly or right n-clean ring and matrices over a
strongly or right n-clean ring. Furthermore, we also give an overview of some properties
of (strongly) n-clean modules and right clean modules which are referred to Camillo
et.al [2] and Zhang [9].
A nonzero element s in the ring R is called right invertible if there exists t ∈ R
such that st = 1R. Sometimes we call right invertible elements right units, i.e. elements
which have a right inverse but not necessarily a left inverse. An element e in a ring
R is called idempotent if e^2 = e. We denote by Ur(R) the set of right invertible elements of
R and by Id(R) the set of idempotent elements in R.
We recall the Pierce decomposition which plays an important role in this work.
Let R be a ring with identity and {e1, e2, . . . , en} idempotent elements in R such that
e1 + e2 + . . . + en = 1R. Then R can be decomposed into a direct sum of the e_iRe_j for
every i, j = 1, 2, . . . , n, denoted by R = \bigoplus_{i,j=1}^{n} e_iRe_j. We call this decomposition the
Pierce decomposition of R. Sometimes we denote this decomposition as a generalized
matrix:

R \simeq \begin{pmatrix} e_1Re_1 & e_1Re_2 & \cdots & e_1Re_n \\ e_2Re_1 & e_2Re_2 & \cdots & e_2Re_n \\ \vdots & \vdots & \ddots & \vdots \\ e_nRe_1 & e_nRe_2 & \cdots & e_nRe_n \end{pmatrix}
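As a small illustration (not from the paper), take R = M_2(k), the ring of 2 × 2 matrices over a ring k with identity, with the orthogonal idempotents e_1 = E_{11} and e_2 = E_{22} (the standard matrix units), so that e_1 + e_2 = 1_R. Then e_iRe_j consists exactly of the matrices whose only possibly nonzero entry sits in position (i, j), and

R = e_1Re_1 \oplus e_1Re_2 \oplus e_2Re_1 \oplus e_2Re_2

recovers the familiar entrywise decomposition of a matrix, matching the generalized matrix form above.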
We summarize some propositions in which many properties of clean rings are sim-
ilar to those of both their generalizations and special cases; these are taken from [6] and
[8].
Proposition 2.1. Let f : R → S be a ring homomorphism. The following assertions
are satisfied.
(i) If R is a (right) n-clean ring, then Im(f ) is a (right) n-clean ring.
(ii) If R is a strongly n-clean ring and Id(R) ∩ Ker(f ) = 0, then Im(f ) is a strongly
n-clean ring.
Proof. We recall the proof of assertion (i) from Proposition 2 of [8] and the proof of (ii)
from Proposition 2.4 of [6].
(i) For any r ∈ R there is an idempotent element e ∈ Id(R) and right invertible
elements u1 , u2 , . . . , un ∈ Ur (R) such that
r = e + u1 + · · · + un .
It implies
f (r) = f (e) + f (u1 ) + · · · + f (un ).
Since f (e) ∈ Id(S) and f (ui ) ∈ Ur (S) for all i’s, it is clear f (r) is a right n-clean element
and S is right n-clean.
(ii) Let s ∈ Im(f), so there exists r ∈ R such that f(r) = s. But as R is strongly
n-clean, we have r = e + u1 + · · · + un, where e is a non-zero idempotent element and the ui are
units. Therefore
s = f(r) = f(e + u1 + · · · + un) = f(e) + f(u1) + · · · + f(un).
Since f(e) ≠ 0 (because e ≠ 0 and Id(R) ∩ Ker(f) = 0) and f(e) is an idempotent element in Im(f), and the f(ui) are units in Im(f),
we conclude that Im(f) is a strongly n-clean ring.
Then we have immediately the following corollary.
Corollary 2.1. Let R, S and T be rings and consider the short exact sequence
0 → S → R → T → 0.
The following assertions are satisfied.
(i) If R is a (right) n-clean ring, then S and T are (right) n-clean rings.
(ii) If R is a strongly n-clean ring and Id(R) ∩ Ker(f ) = 0, then S and T are strongly
n-clean rings.
This means that for every (right) n-clean ring, its ideals and quotient rings are also
(right) n-clean rings. For strongly n-clean rings we need an additional condition, i.e. the nonzero idem-
potent elements should not be contained in the kernel. Now we give the following proposition.
Proposition 2.2. Let {Rλ}Λ be a family of rings. The following assertions are satisfied.
(i) The product of rings ∏_Λ Rλ is a right n-clean ring if and only if Rλ is right n-clean
for all λ ∈ Λ.
(ii) If each Rλ is strongly n-clean for all λ ∈ Λ, then the product of rings ∏_Λ Rλ is a
strongly n-clean ring.
Proof. We prove assertion (i).
(⇒) It is clear from Proposition 2.1, since pµ : ∏_Λ Rλ → Rµ is an epimorphism, and it
means Rµ is right n-clean for all µ ∈ Λ.
(⇐) Assume now every Rλ is right n-clean. Take any (rλ)Λ in ∏_Λ Rλ. Since rλ is
a right n-clean element, rλ = eλ + u1λ + · · · + unλ, where eλ ∈ Id(Rλ) and uiλ ∈ Ur(Rλ)
for all λ ∈ Λ. Then we obtain
(rλ)Λ = (eλ + u1λ + · · · + unλ)Λ = (eλ)Λ + (u1λ)Λ + · · · + (unλ)Λ,
where (eλ)Λ ∈ Id(∏_Λ Rλ) and each (uiλ)Λ ∈ Ur(∏_Λ Rλ).
Thus, if a ring R is a direct sum of (right) n-clean rings, then R is a (right) n-clean
ring. Consider the following counter example to show that the converse of statement
(ii) of Proposition 2.2 is not true.
Example 2.1. In Z3 we have 0 = 1 + 1 + 1, 1 = 1 + 1 + 2 and 2 = 1 + 2 + 2. Also we
have that Z3 × Z2 is strongly 2-clean. But Z2 is not strongly 2-clean.
To obtain a necessary and sufficient condition for a strongly n-clean ring, we have a
special condition, referring to Proposition 2.2 of [6], as shown in the following
proposition.
Proposition 2.3. Let {Rλ} be a family of rings such that at least one of them is
strongly n-clean and the others are n-clean rings; then ∏_{λ∈Λ} Rλ is strongly n-clean.
In fact, if a ring is (right) n-clean, then it is also (right) m-clean for any m ≥ n,
as we show in the next proposition. But this property does not hold for strongly n-clean rings,
as shown in the counter example. We have already seen that Z3 is strongly 2-clean.
Since 2 ∈ Z3 cannot be expressed as a sum of an idempotent element and 3 units, Z3
is not strongly 3-clean.
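Such claims about Z_m can be checked by brute force; the following sketch (illustrative only, not from the paper) enumerates idempotents and units of Z_m and tests whether every element is a sum of an idempotent (nonzero when the strong version, as defined in this paper, is required) and k units.

```python
from itertools import product
from math import gcd

def units(m):
    return [u for u in range(m) if gcd(u, m) == 1]

def idempotents(m, nonzero=False):
    es = [e for e in range(m) if (e * e) % m == e]
    return [e for e in es if e != 0] if nonzero else es

def is_k_clean(m, k, strongly=False):
    """True if every element of Z_m is a sum of an idempotent (nonzero if
    `strongly`, following this paper's definition) and k units of Z_m."""
    us, es = units(m), idempotents(m, nonzero=strongly)
    reachable = {(e + sum(combo)) % m for e in es for combo in product(us, repeat=k)}
    return set(range(m)) <= reachable

# For instance, one can compare Z_3 and Z_2 for strong 2-cleanness as in Example 2.1.
print(is_k_clean(3, 2, strongly=True), is_k_clean(2, 2, strongly=True))
```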
Proposition 2.4. Let R be a ring. If an element r ∈ R is (right) n-clean, then r is
also (right) m-clean for any non-zero integer m ≥ n.
Proof. It is sufficient to prove that any right n-clean element r in R is right
(n + 1)-clean. Let r be a right n-clean element in R, then r = e + u1 + u2 + · · · + un,
where e ∈ Id(R) and ui ∈ Ur(R) for all i's. Consider that e = (1 − e) + (2e − 1). Thus
we obtain
r = (1 − e) + (2e − 1) + u1 + u2 + · · · + un
where 1 − e ∈ Id(R) and 2e − 1 ∈ Ur(R), i.e. (2e − 1)(2e − 1) = 4e^2 − 4e + 1 = 4e − 4e + 1 = 1.
The next investigation is to look for the relationship between a strongly or right
n-clean ring with matrices over it.
Proposition 2.5. Let A, B be rings, C an (A, B)-bimodule and

R = \begin{pmatrix} A & C \\ 0 & B \end{pmatrix}.

R is (right) n-clean if and only if A and B are (right) n-clean.
(⇐) Now take any \begin{pmatrix} a & c \\ 0 & b \end{pmatrix} ∈ R, where a ∈ A and b ∈ B. Since A and B are
right n-clean, there exist an idempotent e_a ∈ A and right units u_{a1}, . . . , u_{an} ∈ A such
that a = e_a + u_{a1} + . . . + u_{an}. Also there exist an idempotent e_b ∈ B and right units
u_{b1}, . . . , u_{bn} ∈ B such that b = e_b + u_{b1} + . . . + u_{bn}. Moreover,

\begin{pmatrix} a & c \\ 0 & b \end{pmatrix} = \begin{pmatrix} e_a & 0 \\ 0 & e_b \end{pmatrix} + \begin{pmatrix} u_{a1} & c \\ 0 & u_{b1} \end{pmatrix} + \begin{pmatrix} u_{a2} & 0 \\ 0 & u_{b2} \end{pmatrix} + \cdots + \begin{pmatrix} u_{an} & 0 \\ 0 & u_{bn} \end{pmatrix}

where \begin{pmatrix} e_a & 0 \\ 0 & e_b \end{pmatrix} is idempotent and the \begin{pmatrix} u_{ai} & 0 \\ 0 & u_{bi} \end{pmatrix} are right units for all i's. Consider now

\begin{pmatrix} u_{a1} & c \\ 0 & u_{b1} \end{pmatrix} \begin{pmatrix} u_{a1}^{-1} & -u_{a1}^{-1}c\,u_{b1}^{-1} \\ 0 & u_{b1}^{-1} \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}

where u_{a1}^{-1} and u_{b1}^{-1} are right inverses of u_{a1} and u_{b1} respectively. We conclude that
\begin{pmatrix} u_{a1} & c \\ 0 & u_{b1} \end{pmatrix} is a right unit and R is right n-clean.
We now give a sufficient condition for n-clean, strongly n-clean and right n-clean
rings respectively.
Proposition 2.6. Let e be an idempotent element in R. If eRe and (1 − e)R(1 − e)
are strongly or right n-clean, then R is a strongly or right n-clean ring.
Hence

A = \begin{pmatrix} a & x \\ y & b \end{pmatrix} = \begin{pmatrix} f + u_1 + \cdots + u_n & x \\ y & g + v_1 + \cdots + v_n + yu_1^{-1}x \end{pmatrix} = \begin{pmatrix} f & 0 \\ 0 & g \end{pmatrix} + \begin{pmatrix} u_1 & x \\ y & v_n + yu_1^{-1}x \end{pmatrix} + \begin{pmatrix} u_2 & 0 \\ 0 & v_1 \end{pmatrix} + \cdots + \begin{pmatrix} u_n & 0 \\ 0 & v_{n-1} \end{pmatrix}
It is sufficient to show that \begin{pmatrix} u_1 & x \\ y & v_n + yu_1^{-1}x \end{pmatrix} is a right unit. But

\begin{pmatrix} u_1 & x \\ y & v_n + yu_1^{-1}x \end{pmatrix} \begin{pmatrix} u_1^{-1} + u_1^{-1}xv_n^{-1}yu_1^{-1} & -u_1^{-1}xv_n^{-1} \\ -v_n^{-1}yu_1^{-1} & v_n^{-1} \end{pmatrix} = 1

and

\begin{pmatrix} u_1^{-1} + u_1^{-1}xv_n^{-1}yu_1^{-1} & -u_1^{-1}xv_n^{-1} \\ -v_n^{-1}yu_1^{-1} & v_n^{-1} \end{pmatrix} \begin{pmatrix} u_1 & x \\ y & v_n + yu_1^{-1}x \end{pmatrix} = 1

as needed. Then R is right n-clean.
such that we have T(R) = \begin{pmatrix} e_1Re_1 & M \\ N & B \end{pmatrix}. Take any element in T(R), say
r = \begin{pmatrix} a & x \\ y & b \end{pmatrix}, where a ∈ e_1Re_1, x ∈ M, y ∈ N and b ∈ B. Since e_1Re_1 and B are right n-
clean, we can find f ∈ Id(e_1Re_1) and u_i ∈ Ur(e_1Re_1) such that a = f + u_1 + u_2 + · · · + u_n.
Now consider that yu_1^{-1}x ∈ B, for u_1^{-1} ∈ e_1Re_1 satisfies u_1u_1^{-1} = 1_{e_1Re_1}. Thus
b − yu_1^{-1}x ∈ B and we have

r = \begin{pmatrix} a & x \\ y & b \end{pmatrix} = \begin{pmatrix} f + u_1 + u_2 + \cdots + u_n & x \\ y & g + v_1 + v_2 + \cdots + v_n + yu_1^{-1}x \end{pmatrix} = \begin{pmatrix} f & 0 \\ 0 & g \end{pmatrix} + \begin{pmatrix} u_1 & x \\ y & v_n + yu_1^{-1}x \end{pmatrix} + \begin{pmatrix} u_2 & 0 \\ 0 & v_1 \end{pmatrix} + \cdots + \begin{pmatrix} u_n & 0 \\ 0 & v_{n-1} \end{pmatrix},

where g ∈ Id(B) and the v_i's are in Ur(B). Similar to the argument in Proposition 2.6, we
conclude that \begin{pmatrix} u_1 & x \\ y & v_n + yu_1^{-1}x \end{pmatrix} is right invertible in T(R).
The converse of Proposition 2.6 is not true since we cannot always obtain a right
n-clean element ere ∈ eRe even though r ∈ R is a right n-clean element. Moreover, the
observation of necessary and sufficient condition of clean ideals has been done by Chen
and Chen in [3].
Proposition 2.8. Let ei ’s be idempotent elements in a ring R, i = 1, 2, . . . , n, which
are pairwise orthogonal and e1 + e2 + · · · + en = 1R . If each ei Rei is strongly or right
n-clean, i = 1, 2, . . . , n, then R is strongly or right n-clean.
The next proposition gives a simpler condition, i.e. the idempotent elements are not
necessarily pairwise orthogonal. The sketch of its proof is quite similar to the proof of
Proposition 2.7, so we skip the proof in this note.
Proposition 2.9. Let ei ’s be idempotent elements, i = 1, 2, . . . , n. If each ei Rei is
strongly or right n-clean, then generalized matrix T (R) is strongly or right n-clean.
One of the important results of this work is the following proposition, in which we prove
that the clean properties of a ring can be transferred to the matrices over this ring.
Proposition 2.10. If R is a strongly or right n-clean ring and ei's are idempotent elements
in R, i = 1, 2, . . . , n, then the ring of n × n matrices over R is also strongly or right n-clean.
is strongly or right n-clean. Then for any n × n matrix A over R, A ∈ T (R). Since
T (R) is strongly or right n-clean, A is also strongly or right n-clean.
3. CLEAN MODULES
We give some definitions of clean modules as follow.
Definition 3.1. An R-module M is called strongly or right n-clean module if EndR (M )
is a strongly or right n-clean ring.
If n = 1, then we obtain a strongly or right clean module. One example of a clean
module is a continuous module (see Camillo et al. [2] and Haily-Rahnaoui [5]).
Some previous authors have investigated the properties of strongly clean
modules (Khaksari-Moghimi [6], Zhang [9]) and right clean modules (Călugăreanu [1]).
For further investigation of strongly or right clean modules, we refer to the prop-
erties of strongly or right clean rings. For example, motivated by Corollary 2.10 of
the paper by Khaksari-Moghimi [6] we have the following interpretation.
Proposition 3.1. If $\{M_i\}$, $i = 1, \ldots, n$, is a family of strongly or right $n$-clean modules,
$M = M_1 \oplus M_2 \oplus \cdots \oplus M_n$ and for any $f : M_i \to M_j$, $f = 0$ if $i \neq j$, then $M$ is strongly
or right $n$-clean.
Proof. Let M = M1 ⊕ M2 ⊕ · · · ⊕ Mn and consider
EndR (M ) = EndR (M1 ⊕ M2 ⊕ · · · ⊕ Mn )
= EndR (M1 ) ⊕ EndR (M2 ) ⊕ · · · ⊕ EndR (Mn ).
Since every $End_R(M_i)$ is a strongly or right $n$-clean ring, according to Proposition 2.2
$End_R(M)$, as a product of the $End_R(M_i)$'s, is a strongly or right $n$-clean ring. Hence $M$ is
a strongly or right $n$-clean module.
Proposition 3.2. If $M$ is a strongly or right $n$-clean module, $S = End_R(M)$ and $S$
contains an idempotent element, then the ring of $n \times n$ matrices over $S$ is also a strongly or right
$n$-clean ring.
Proof. Since $M$ is a strongly or right $n$-clean module, $S$ is a strongly or right $n$-clean
ring. Then we apply Proposition 2.6 to obtain that the $n \times n$ matrices over $S$ are also
strongly or right $n$-clean.
We recall the important result of Camillo et al. [2] (Propositions 2.2 and 2.3),
which gives necessary and sufficient conditions for clean and strongly clean elements.
Proposition 3.3. Let M be an R-module, S = EndR (M ) and e ∈ S an idempotent
element. Denote A = Ker(e) and B = Im(e).
(i) An element f ∈ S is clean if and only if there exists a decomposition M = C ⊕ D
such that f (A) ⊆ C, (1 − f )B ⊆ D and both f : A → C and (1 − f ) : B → D are
isomorphisms.
(ii) An element f ∈ S is strongly clean if and only if there exists a decomposition
M = A ⊕ B such that f (A) ⊆ A, (1 − f )B ⊆ B and both f : A → A and
(1 − f ) : B → B are isomorphisms.
Under some conditions, we next show that a (strongly or right) $n$-clean ring $R$ can
make certain $R$-modules $M$ also strongly or right $n$-clean.
Proposition 3.6. Let $M$ be a finite rank free $R$-module. If $R$ is (strongly or right)
$n$-clean, then so is $M$.
Proof. It is clear that $Hom_R(M, N)$ is an $End_R(M)$-module by defining the following scalar
multiplication:
$$End_R(M) \times Hom_R(M, N) \to Hom_R(M, N), \qquad (f, g) \mapsto g \circ f.$$
Since $M$ is a strongly $n$-clean module, $End_R(M)$ is a strongly $n$-clean ring. According to a
result in [6], $Hom_R(M, N)$ is also a strongly $n$-clean module.
Proposition 3.8 is restricted to finite rank free modules. However, this condition can be
generalized to free modules of countably infinite rank.
$$0 \longrightarrow \operatorname{Ker}\psi \longrightarrow F \stackrel{\psi}{\longrightarrow} P \longrightarrow 0.$$
Since $P$ is projective, this short exact sequence splits, so we have $F \cong \operatorname{Ker}(\psi) \oplus P$.
4. CONCLUDING REMARKS
There are some open problems related to the cleanness of modules. One of the
crucial investigations is an alternative definition of clean modules which is more natural
than the one using the endomorphism ring. In case we still apply the current definition,
we might investigate necessary and sufficient conditions for an $n$-clean element in an
endomorphism ring. Also, the investigation of the properties of free modules and $n$-clean
modules has not yet been done.
References
[1] Călugăreanu, G., One-sided Clean Rings, Studia Universitatis Babes-Bolyai, Vol. 55, No. 3,
2010.
[2] Camillo, V.P., Khurana, D., Lam, T.Y., Nicholson, W.K., Continuous Modules are Clean,
Journal of Algebra, 304 No.1, 94 - 111, 2006.
[3] Chen, H. and Chen, M., On Clean Ideals, International Journal of Mathematics and Mathe-
matical Sciences (IJMMS), 62, 3949 - 3956, 2003.
[4] Chen, W. and Cui, S., On Clean Rings and Clean Elements, Southeast Asian Bulletin of Math-
ematics, 32, 855-861, 2008.
[5] Haily, A. and Rahnaoui, H., Endomorphisms of Continuous Modules with Some Chain Condi-
tions, International Journal of Algebra, 4 (8), 397 - 402, 2010.
[6] Khaksari, A. and Moghimi, G., Some Results on Clean Rings and Modules, World Applied
Sciences Journal, 6 (10), 1384 - 1387, 2009.
[7] Wang, Z. and Chen, J.L., 2-Clean Rings, arXiv:math/0610918v1 [math.RA] 30 Oct 2006.
[8] Wijayanti, I.E., On Right n-Clean Rings, submitted to Jurnal Matematika dan Sains (JMS) ITB,
2011.
[9] Zhang,H., On Strongly Clean Modules, Communications in Algebra, 37(4), 1420-1427, 2009.
Intan Muchtadi-Alamsyah
Abstract. A Nakayama algebra is an algebra that is both right and left serial. In this
paper we explain some research on Nakayama algebras that have been conducted: a con-
struction of an explicit tilting complex that gives derived equivalence between symmetric
Nakayama algebras and Brauer tree algebras. Then we explain our ongoing research
on Nakayama algebras in group representation theory based on the above result. For a
class of non-symmetric Nakayama algebras we explain our ongoing research on Nakayama
algebras with mutation.
Keywords and Phrases: Nakayama algebra, Brauer tree algebra, derived equivalence,
cluster tilted algebra, mutation quiver.
1. INTRODUCTION
A Nakayama algebra is an algebra that is both right and left serial. Nakayama
algebras are central in the representation theory of finite dimensional algebras. They
have a well understood module category, with particularly nice combinatorial properties.
For an algebraically closed field k, they are given as path algebras of finite quivers (the
Gabriel quivers) modulo ideals generated by linear combinations of paths. Moreover,
their Gabriel quivers are either of Dynkin type An or a finite oriented cycle.
An explicit tilting complex can be constructed that gives a derived equivalence between the
symmetric Nakayama algebra and the Brauer tree algebra associated to a line without
exceptional vertex. They are derived equivalent based on the method in [19] and also
using the Green orders corresponding to these algebras [17, Section 4.4]. An application
is the result in [20], where this equivalence is used in the representation theory of braid
groups. Moreover by Rickard [22, Theorem 4.2], up to derived equivalence, a Brauer
tree algebra is determined by the number of edges of the Brauer tree and the multiplicity
of the exceptional vertex. Hence, an arbitrary Brauer tree algebra is derived equivalent
to a Brauer star algebra associated with a star having the same number of edges, i.e.
the Nakayama algebra with m simple modules and Loewy length n where m divides n.
In recent years, a major direction within representation theory has been the study
of cluster tilted algebras. These algebras occur as endomorphism algebras of certain
objects in triangulated Hom-finite categories related to derived categories. For these
algebras there is a concept of mutation. This notion is related to mutation of quivers
with (super-)potentials occurring in mathematical physics. For two algebras which
are related by a single mutation, their module categories have a similar structure, the
algebras are said to be ”nearly Morita equivalent”. It is an interesting general problem
to understand and explore algebras which are nearly Morita equivalent to algebras with
a well understood module category such as Nakayama algebras. In our research the
intersection between Nakayama algebras and cluster tilted algebras is explored.
In group representation theory, by a result of Dade [11], blocks with cyclic de-
fect groups are Brauer tree algebras and the Brauer correspondents of these blocks are
Nakayama algebras. Hence the derived equivalence between Brauer tree algebras and
Nakayama algebras gives a new alternative approach to Broué's conjecture [6] for the case of
blocks with cyclic defect groups. Research on invariants of derived equivalence has been
conducted, the most recent by Zimmermann [25], and our research in this area is
the invariance of the p-regular subspace of blocks with abelian defect groups. Fan and
Kulshammer [12], by using perfect isometries, have shown the invariance of these subspaces.
By using derived equivalence between Brauer tree algebras and Nakayama algebras, one
may expect an alternative proof of Fan and Kulshammer’s method to get this invariance.
2. NAKAYAMA ALGEBRAS
A quiver Q is a quadruple (Q0 , Q1 , s, t) where Q0 is the set of vertices (points),
Q1 is the set of arrows and for each arrow α ∈ Q1 , the vertices s(α) and t(α) are the
source and the target of α, respectively (see [3]). If i and j are vertices, an (oriented)
path in Q of length m from i to j is a formal composition of arrows
p = α1 α2 · · · αm
where s(α1 ) = i, t(αm ) = j and t(αk−1 ) = s(αk ), for k = 2, · · · , m. To any vertex i ∈ Q0
we attach a trivial path of length 0, say ei , starting and ending at i such that for any
arrow α (resp. β) such that s(α) = i (resp. t(β) = i) then ei α = α (resp. βei = β). We
identify the set of vertices and the set of trivial paths.
Let KQ be the K-vector space generated by the set of all paths in Q. Then KQ can be
endowed with a structure of K-algebra with multiplication induced by concatenation of
paths, that is,
$$(\beta_1\beta_2\cdots\beta_n)(\alpha_1\alpha_2\cdots\alpha_n) = \begin{cases} \beta_1\beta_2\cdots\beta_n\alpha_1\alpha_2\cdots\alpha_n, & \text{if } t(\beta_n) = s(\alpha_1) \\ 0, & \text{otherwise.} \end{cases}$$
KQ is called the path algebra of the quiver Q. The algebra KQ can be graded by
$$KQ = KQ_0 \oplus KQ_1 \oplus \cdots \oplus KQ_m \oplus \cdots,$$
where $KQ_\ell$ is the subspace spanned by the paths of length $\ell$.
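The concatenation rule above can be made concrete with a small Python sketch that multiplies two paths of a quiver, returning 0 when the target of the first path does not match the source of the second. The quiver, vertex names and the Path container are illustrative choices, not notation from the paper, and the K-linear extension to all of KQ is left implicit.

from dataclasses import dataclass

@dataclass(frozen=True)
class Arrow:
    name: str
    source: str
    target: str

@dataclass(frozen=True)
class Path:
    arrows: tuple  # composable arrows, listed left to right

    def source(self):
        return self.arrows[0].source

    def target(self):
        return self.arrows[-1].target

def multiply(beta: Path, alpha: Path):
    """Concatenate beta with alpha if t(beta_n) = s(alpha_1), otherwise return 0."""
    if beta.target() == alpha.source():
        return Path(beta.arrows + alpha.arrows)
    return 0

# Example on the quiver 1 --a--> 2 --b--> 3
a = Arrow("a", "1", "2")
b = Arrow("b", "2", "3")
print(multiply(Path((a,)), Path((b,))))   # the path ab
print(multiply(Path((b,)), Path((a,))))   # 0, since t(b) = 3 != s(a) = 1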
Definition 2.1. Let Q be a finite connected quiver. The ideal of path algebra KQ
generated by arrows of Q is called arrow ideal and denoted by RQ .
Definition 2.2. Let Q be a finite quiver and RQ be the arrow ideal in path algebra KQ.
An ideal I in KQ is admissible if there exists m ≥ 2 such that
$$R_Q^m \subseteq I \subseteq R_Q^2.$$
If I is an admissible ideal in KQ, (Q, I) is called bound quiver. The quotient algebra
KQ/I is called bound path algebra.
Theorem 2.1. [3, Theorem 3.2] A basic and connected algebra A is a Nakayama algebra
if and only if its quiver QA is one of the following quivers:
(1) An quivers
Proposition 2.1. [3, Proposition 3.8] Let A be a basic and connected algebra, which
is not isomorphic to K. Then A is a self-injective Nakayama algebra if and only if
A ≅ KQ/I where Q is the cyclic quiver (an oriented cycle) on m vertices
with m ≥ 1 and $I = R^n$ for some n ≥ 2, where R denotes the arrow ideal of KQ.
We denote KQ/I in the previous proposition by Nnm. The algebra Nnm is sym-
metric if and only if m divides n. For the case of the Nakayama algebra Nnn, the paths $e_i$ of
length 0 are mutually orthogonal idempotents and their sum is the unit element. The
Loewy series of the indecomposable projective modules $P_i$ are as follows, reading the
composition factors from the top down to the socle:
$$P_i = \begin{matrix} S_i \\ S_{i+1} \\ \vdots \\ S_n \\ S_1 \\ S_2 \\ \vdots \\ S_i \end{matrix}$$
2.1. Brauer Tree Algebras. Let G be a finite connected tree with a cyclic ordering
of the edges adjacent to a given vertex and with a particular vertex v, the exceptional
vertex, and a positive integer m, the multiplicity of the exceptional vertex. To this data
(G, v, m) one associates a finite dimensional symmetric algebra, called a Brauer tree
algebra, characterized up to Morita equivalence by the following properties:
(1) The isomorphism classes of simple modules are parametrized by the edges of
G. Denote by Pj a projective cover of a simple module Sj corresponding to an
edge j. Then rad(Pj )/soc(Pj ) is the direct sum of two uniserial modules Ua
and Ub where a and b are the vertices of j.
(2) For c in {a, b}, let $j = j_0, j_1, \ldots, j_r$ be the cyclic ordering of the r + 1 edges around c.
Then the composition factors of $U_c$, starting from the top, are $S_{j_1}, S_{j_2}, \ldots, S_{j_r}, S_{j_0}, S_{j_1}, \ldots, S_{j_r}, \ldots$,
where the number of composition factors is m(r + 1) − 1 if c is the exceptional
vertex and r otherwise.
Associated to a Brauer tree algebra are two numerical invariants: the number of edges
of the tree and the multiplicity of the exceptional vertex.
Two examples are
(1) A basic Brauer tree algebra associated to a line with n edges numbered $1, \ldots, n$
such that i is adjacent to i + 1, and with no exceptional vertex. We assume
n > 1.
3. DERIVED EQUIVALENCE
In 1989, Rickard [21] and Keller [15] have given a necessary and sufficient criterion
for the existence of derived equivalences between two rings as a generalization of Morita
equivalence. Rickard's theorem says that for two rings A and B the derived categories
$D^b(A)$ and $D^b(B)$ of A and B are equivalent as triangulated categories if and only if
there exists an object T in $D^b(A)$, called a tilting complex, satisfying similar properties
The proof of this theorem is based on the method in [19] and also uses the Green
orders corresponding to these algebras [17, Section 4.4] (see also [19, Example 5.2]). An
application is the result in [20], where this equivalence is used in the representation theory
of braid groups.
Any quiver Q with no loops and no cycles of length two can be mutated at vertex
i to a new quiver Q∗ by the following rules (a small computational sketch in terms of the exchange matrix follows the list):
(1) The vertex i is removed and replaced by a vertex i∗ , all other vertices are kept.
(2) For any arrow i → j in Q there is an arrow j → i∗ in Q∗ .
(3) For any arrow j → i in Q there is an arrow i∗ → j in Q∗ .
(4) If there are r > 0 arrows j1 → i, s > 0 arrows i → j2 and t arrows j2 → j1 in
Q, there are t − rs arrows j2 → j1 in Q∗ . (Here, a negative number of arrows
means arrows in the opposite direction.)
(5) all other arrows are kept.
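When a quiver with no loops and no 2-cycles is encoded by its skew-symmetric exchange matrix B, with B[i][j] the number of arrows i → j (negative if the arrows point the other way), rules (1)-(5) amount to the Fomin-Zelevinsky matrix mutation. The sketch below is one way to compute it; the encoding and the function name are illustrative, not taken from the paper.

import numpy as np

def mutate(B: np.ndarray, k: int) -> np.ndarray:
    """Mutate the exchange matrix B at vertex k (Fomin-Zelevinsky matrix mutation)."""
    B = np.asarray(B, dtype=int)
    M = B.copy()
    n = B.shape[0]
    for i in range(n):
        for j in range(n):
            if i == k or j == k:
                # Rules (2)-(3): arrows incident to the mutated vertex are reversed.
                M[i, j] = -B[i, j]
            else:
                # Rule (4): every path i -> k -> j contributes a new arrow i -> j,
                # and 2-cycles between i and j are cancelled.
                M[i, j] = B[i, j] + (abs(B[i, k]) * B[k, j] + B[i, k] * abs(B[k, j])) // 2
    return M

# Linearly oriented A_3 quiver: 0 -> 1 -> 2
B = np.array([[0, 1, 0],
              [-1, 0, 1],
              [0, -1, 0]])
print(mutate(B, 1))   # mutation at the middle vertex yields a 3-cycle

Iterating this over all vertices and identifying matrices up to simultaneous permutation of rows and columns enumerates the mutation class discussed below.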
In [9], Buan and Vatne provide an explicit description of the mutation class of
An -quivers, whereas a geometric interpretation of mutation of An -quivers is given by
Caldero, Chapoton and Sciffler [10]. For two algebras which are related by a single
mutation, their module categories have a similar structure, the algebras are said to be
”nearly Morita equivalent”.
It is an interesting general problem to understand and explore algebras which
are nearly Morita equivalent to algebras with a well understood module category such
as Nakayama algebras. Therefore, in our research the intersection between Nakayama
algebras and cluster tilted algebras is explored. Ringel in [23] has given the classification
of selfinjective cluster tilted algebras.
Theorem 4.1. [23] The selfinjective cluster tilted algebras are
(1) the Nakayama algebras Nn−2,n where n ≥ 3,
(2) algebras with an even number 2m of simples, m indecomposable projectives have
length 3 and the remaining m have length m + 1.
Based on results by Ringel [23] and Buan and Vatne [9] we get the following
theorem
Theorem 4.2. Nakayama algebras that admit some mutations are the selfinjective
Nakayama algebras Nn−2,n and the Nakayama algebras associated to An quivers.
The next step will be to classify the mutation class of these algebras, i.e. to
characterize the algebras which are nearly Morita equivalent to these algebras.
Theorem 5.1. [17, Proposition 6.3.2] Let R and S be two rings and assume $D^b(R) \cong D^b(S)$ as triangulated categories. Let T be the tilting complex over R with endomorphism
ring B. Then the centers of R and S are isomorphic.
Our research in this area is the invariance of the p-regular subspace of blocks with
abelian defect groups. We start with a prime p, F a field of characteristic p > 0 and
G a finite group with an abelian Sylow p-subgroup. An element of G whose order cannot
be divided by p is called a p′-element. A conjugacy class formed by those elements is
called a p-regular class.
The F-subspace of the group algebra FG spanned by all p-regular class sums in
G is denoted by $Z_{p′}FG$. Meyer showed in [18] that this subspace is a subalgebra of the
center ZFG of FG.
If C is a conjugacy class in G, then a defect group of C is a Sylow p-subgroup of
$C_G(g)$, the centralizer of g, with g belonging to C. A block B of FG is the smallest submodule
which contains indecomposable submodules of FG and simple submodules of FG. Define
$Z_{p′}B = B \cap Z_{p′}FG$. Fan and Kulshammer proved in [12] the following result:
Theorem 5.2. [12] If B is a block with abelian defect group, then $Z_{p′}B$ is a subalgebra
of ZB, the center of B. Moreover, $Z_{p′}B$ is invariant under perfect isometry, and hence,
under derived equivalence.
By using derived equivalence between Brauer tree algebras and Nakayama alge-
bras, one may expect an alternative proof of Fan and Kulshammer’s method to get this
invariance without going through the perfect isometry. Our preliminary result is the
invariance of the ranks of the p-regular subspaces, as follows:
We fix a prime number p and a p-modular system (k, O, F), that is, O is a complete
discrete valuation ring with field of fractions k of characteristic 0 and residue field F
of characteristic p. The O-algebras we consider will always be free of finite rank as
O-modules.
Fix a finite group G and a block $B_1$ of the group algebra OG, with defect group
D. Denote by $G_{p′}$ the set of p-regular elements in G. Denote by $OG_{p′}$ the O-sublattice
of OG spanned by $G_{p′}$ and $Z_{p′}OG = ZOG \cap OG_{p′}$. We set $Z_{p′}B_1 = B_1 \cap Z_{p′}OG$.
Theorem 5.3. For blocks $B_1$ and $B_2$ with cyclic defect groups of OG and OH, respec-
tively, a derived equivalence between $B_1$ and $B_2$ implies rank $Z_{p′}B_1$ = rank $Z_{p′}B_2$.
Proof. The blocks $B_1$ and $B_2$ are Brauer tree algebras and, by [22, Theorem 4.2], up
to derived equivalence a Brauer tree algebra is determined by the number of edges of
the Brauer tree and the multiplicity of the exceptional vertex. Hence $B_1$ and $B_2$ have
the same number of edges, which is also the same number of simple modules. Since by
[14, Remarks 4.9] the rank of $Z_{p′}B_1$ (resp. $Z_{p′}B_2$) coincides with the number of simple
$B_1$-modules (resp. $B_2$-modules), consequently $Z_{p′}B_1$ and $Z_{p′}B_2$ have the same rank.
QED
For further research, using the derived equivalence between Nakayama algebras
and Brauer tree algebras, we will show that the isomorphism given in Lemma 5.1 maps
$Z_{p′}B_1$ to $Z_{p′}B_2$. This will provide a new invariant of derived equivalence and give an
alternative proof of Fan and Kulshammer's result.
References
[1] Assem I., Brustle T., Schiffler R. Cluster-tilted algebras as trivial extensions, Bull. London
Math. Soc. 40 (1), 151-162, 2008.
[2] Assem I., Brustle T., Schiffler R. Cluster-tilted algebras and slices, Journal of Algebra Vol-
ume 319, Issue 8, 3464-3479, 2008.
[3] Assem, I., Simson, D., Skowronski, A., Elements of the Representation Theory of Assosiative
Algebras, London Math Soc Student Text 65, Cambridge Univ Press, 2006.
[4] Auslander,M., Reiten,I., Smaloe,S.O. Representation Theory of Artin Algebras, Cambridge
Univ Press, 1995.
[5] Bernstein I. N., Gelfand I. M., Ponomarev V. A. Coxeter functors, and Gabriel's theorem,
Uspehi Mat. Nauk 28, no. 2, 19-33, 1973.
[6] Broué, M., Isométries parfaites, types de blocs, catégories dérivées, Astérisque 181-182, 61-92,
1990.
[7] Buan A., Marsh R., Reiten I. Cluster-tilted algebras, Trans. Amer. Math. Soc. 359, no. 1,
323-332, 2007.
[8] Buan A., Marsh R., Reineke M., Reiten I., Todorov G. Tilting theory and cluster combina-
torics, Adv. Math. 204, 572-618, 2006.
[9] Buan A. and Vatne D., Derived Equivalence for Cluster-tilted Algebras of Type An . J. Algebra
319, no. 7, 2723-2738, 2008.
[10] Caldero P., Chapoton F., Schiffler R. Quivers with relations arising from clusters (An case),
Trans. Amer. Math. Soc. 358, no. 3, 1347-1364, 2006.
[11] Dade, E.C., Blocks with cyclic defect groups, Annals of Math. 84, 20-49, 1966.
[12] Fan, Y., and Kulshammer,B., A note on blocks with abelian defect groups, preprint.
[13] Fomin S., Zelevinsky A. Cluster Algebras I: Foundations, J. Amer. Math. Soc. 15, no. 2,
497-529, 2002.
[14] Huppert,B. Character theory of finite groups, Walter de Gruyter - Berlin - New York, 1998.
[15] Keller, B., A remark on tilting theory and DG-algebras, Manuscripta Mathematica 79, 247-253,
1993.
[16] Keller B., On triangulated orbit categories, Documenta Math. 10, 551-581, 2005.
[17] Koenig,S. and Zimmermann, A., Derived Equivalences for Group Rings, Lecture Notes in Math-
ematics 1685, Springer-Verlag, Berlin, 1998.
[18] Meyer,H. On a subalgebra of the centre of a group ring, Journal of Algebra, 295, 293-302, 2006.
[19] Muchtadi-Alamsyah, I., Homomorphisms of complexes via homologies, J.Algebra 294, 321-345,
2005.
[20] Muchtadi-Alamsyah, I., Braid action on derived category Nakayama algebras, Communications
in Algebra 36:7, 2544-2569, 2008.
[21] Rickard, J., Derived equivalences as derived functors, J.London Math.Soc 43, 37-48, 1991.
[22] Rickard, J., Derived categories and stable equivalence, J. Pure Appl. Algebra 61, 303-317, 1989.
[23] Ringel, C.M., The self-injective cluster tilted algebras, to appear in Archiv der Mathematik.
[24] Ringel C.M. Some Remarks Concerning Tilting Modules and Tilted Algebras. Origin. Relevance.
Future, LMS Lecture Notes Series 332(An appendix to the Handbook of Tilting Theory), Cam-
bridge University Press, 2007.
[25] Zimmermann, A., Invariance of generalized Reynolds ideals under derived equivalences, Mathe-
matical Proceedings of the Royal Irish Academy 107A (1), 1-9, 2007.
Intan Muchtadi-Alamsyah
Algebra Research Group
Faculty of Mathematics and Natural Sciences
Institut Teknologi Bandung.
e-mail : [email protected]
Proceedings of ”The 6th SEAMS-UGM Conference 2011”
pp. 51–64.
1. INTRODUCTION
Image analysis refers to the task of automatically extracting information from im-
ages. Examples include automatic tracking of vehicles in video sequences, handwriting
recognition, face recognition, counting cells on microscope slides, predicting crop yield
from aerial photographs, detecting cancer from x-ray images, etc. The applications to
defense, forensics, surveillance, agriculture, science, technology and medicine are vast
and growing. Mathematics and statistics underpins image analysis and much of this
mathematics is common across the various areas of application. At the same time, each
application and each class of images has its own peculiarities that impact some aspects
of how mathematics is used.
This paper examines the peculiarities of medical image analysis and more specif-
ically, mathematical issues arising in automatic detection of breast cancer in screening
mammograms. The examples and methods considered here reflect the current and re-
cent research interests of the authors and are not intended to represent the full range of
activity in this field. Readers interested in a comprehensive account of mammography
and computer-aided diagnosis of breast cancer are encouraged to consult the collection
of papers edited by Suri and Rangayyan [14].
The present paper is aimed at a mathematical audience and does not presume
any familiarity with either image analysis or the application field of computer-assisted
screening mammography. Accordingly, brief introductions to these topics are provided.
which the object is not present and then designing a classification rule that best sepa-
rates the two groups. If the models are normal distributions, for example, an optimal
threshold for classification can be derived from elementary theory.
An image can be viewed as a point in an N dimensional feature space where N
is the number of pixels (picture elements). The number N ranges from a few tens of
thousands for very small images to several million. Without context information to
connect the information in separate pixels, classification in such a high dimensional
feature space is virtually impossible.
Instead, the usual practice is to extract features from the images that are thought
to represent the information of interest more efficiently. Features may include the
strength, orientation and juxtaposition of edges, shapes of regions of high contrast,
spatial distributions of lines, textures, or colors. In this way an image is reduced to a
small number of features and classification is based on these features only. Thus object
detection requires a choice of features that allow good classification and a method for
extracting these features.
The choice of features is usually determined from the context of the application.
An astronomer may provide descriptions of comets that distinguish them from stars
and planets, for example, and these descriptions will form the basis of the choice of
features.
The extraction of these features is often the most crucial task in an image analysis
problem. The detection of the comet may require identifying the tail. Where does the
tail start and end? How can the tail be distinguished from background stars or noise
in the image? Even if edges and lines can be found, how do they connect to form the
objects of interest? These are the issues that are the most difficult to solve.
The process of identifying coherent regions in the image is called segmentation.
Once the image has been segmented, features such as shape, contrast, texture, orienta-
tion, can be measured for each segment separately in order to find which of the regions,
if any, match the expected features of the object of interest.
Segmentation, in turn, depends on the quality and complexity of the image. Many
images include noise or artifacts. One segmentation method may work perfectly well
if edges are sharp and noise is low, but may fail if edges are fuzzy and noise levels
are high. This motivates the use of preprocessing steps such as noise reduction, edge
enhancement, histogram equalization, etc., to give segmentation methods the best chance
of producing good results.
Altogether, a typical detection task involves the following steps:
preprocessing → segmentation → feature extraction → classification
In addition, many tasks require the additional step of image registration: align-
ing images to highlight differences or measure similarities. For example, in looking for
comets, comparing images taken at different times (say on consecutive nights) can reveal
the motion of a comet relative to apparently stationary stars far away.
One frustrating aspect of this process is that individual steps cannot be fully
evaluated in isolation. What is the best method for noise reduction for a particular
class of images? One can try several methods, but the difficulty is in judging the
results. Visual assessment is possible but subjective. In the long run, the quality of
noise reduction can only be measured according to the quality of the image segmentation
that follows. The quality of the image segmentation can only be measured according to
the quality of the features that can be extracted and these can be judged only according
to the final accuracy of the classification step.
Fortunately, classification can be measured using methods of machine learning if
there are sufficient examples of images where the true state (object present or object
absent) is known. Classification performance is often reported in terms of sensitivity
and specificity and these are commonly summarized using ROC analysis.
Classification of objects in images refers to the task of assigning a particular (or
previously detected) object in the image to one of two or more classes. For example, a
system might acquire images of vehicles on a road and the task is to classify these as cars,
trucks, motorcycles, or bicycles. The processing steps are very similar to those listed
for detection except that the classification is usually an assignment to many classes. In
some classification tasks, the segmentation is already known but many image analysis
tasks involve both a detection and a classification step.
Methods for feature extraction (including feature selection) and classification are
not very different for general image analysis tasks and medical image analysis. However,
several aspects of preprocessing, segmentation and image registration are very different
in the context of medical images, particularly x-ray images, than in the context of visual
images. These three steps will be discussed in subsequent sections.
1.3. What is Special About X-ray Images? The two main aspects of medical x-
ray images that impact choices of image analysis methods are that the images are
projection images and that the signal to noise ratio is low. The fact that the images are
projection images means that the intensity value of a single pixel reflects the aggregate
attenuation of an x-ray beam through several tissues (skin, heart, rib, lung, for example).
Hence a pixel does not "belong to" any single object. By contrast, a pixel in a visual
image represents either a part of a building or a tree or bird or someone's face, but,
in any event, only one object. Remarkably, this important distinction between x-ray
images and visual images is often neglected even though everyone is entirely aware of
the distinction. A second consequence of being a projection image is that objects of
uniform x-ray attenuation, but of rounded shape, necessarily have poorly defined edges
in the image. This is because the path length of the x-ray beam through the object
becomes shorter closer to the edge of the object. Since organs in the human body
are nearly always rounded in shape, edges defining objects in medical x-ray images are
seldom sharp (Fig. 1).
Low signal to noise ratios occur in all areas of image analysis but the distinction
is that, once identified, developments in technology offer the possibility of improving
this ratio. In the case of x-ray images, improvements in technology are used to lower
dosage as a priority over improving the signal to noise ratio.
Other peculiarities of x-ray images include the phenomenon of beam hardening
(energy dependent attenuation) and scattering, but these issues will not be considered
in this paper.
2. PREPROCESSING
The key preprocessing step, and the one that is most sensitive to the special
circumstances of x-ray images, is that of noise reduction. Naive noise reduction can
provide visually satisfying results (Fig. 2), but this kind of noise reduction is not well
suited to mammography. The trouble is that some outlying intensities or a small group
of intensities might be noise due to scattering or photon statistics, but might equally
well represent actual structure such as a narrow ridge of intensities running across the
line along which the intensities were sampled. In mammograms, such ridges may result
from fibers in normal tissue or spicules associated with malignant tumors (Fig. 3).
Noise reduction is needed that eliminates spike noise but retains narrow ridges.
Elegant methods for noise reduction with these properties exist based on ”anisotropic
smoothing" introduced by Perona and Malik [12]. The idea starts with the noisy image
as the initial condition of a diffusion process in which the diffusion coefficient depends
on the local image gradient. If the coefficient is a decreasing function of the gradient magnitude,
for example, then the diffusion is strong where the gradient is small and weak where
the gradient is large. Thus edges are preserved while the regions of similar intensity are
smoothed. This method is called anisotropic, even though the action is locally isotropic.
However, truly anisotropic methods based on these general principles do exist [18].
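As a concrete illustration, the following is a minimal numpy sketch of Perona-Malik-style smoothing with an exponential conductivity function; the step size, the conductivity parameter kappa and the iteration count are illustrative choices, not values from the paper.

import numpy as np

def perona_malik(image, n_iter=20, kappa=30.0, dt=0.2):
    """Edge-preserving smoothing: diffusion is weak where the local gradient is large."""
    u = image.astype(float).copy()
    for _ in range(n_iter):
        # One-sided differences to the four nearest neighbours
        # (np.roll gives periodic borders, which is adequate for a sketch).
        dN = np.roll(u, -1, axis=0) - u
        dS = np.roll(u, 1, axis=0) - u
        dE = np.roll(u, -1, axis=1) - u
        dW = np.roll(u, 1, axis=1) - u
        # Conductivity decreases with the gradient magnitude, so edges are preserved.
        cN = np.exp(-(dN / kappa) ** 2)
        cS = np.exp(-(dS / kappa) ** 2)
        cE = np.exp(-(dE / kappa) ** 2)
        cW = np.exp(-(dW / kappa) ** 2)
        u += dt * (cN * dN + cS * dS + cE * dE + cW * dW)
    return u

# Example: smooth a noisy synthetic step image.
img = np.zeros((64, 64)); img[:, 32:] = 100.0
noisy = img + np.random.default_rng(0).normal(0, 10, img.shape)
smoothed = perona_malik(noisy)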
Even so, in our own work on mammograms [1], anisotropic smoothing did not
perform as well as a method called neutrosophic image denoising [6]. In this method,
each pixel in the image I is replaced by a vector P (i, j) = (T (i, j), U (i, j), F (i, j)) loosely
representing the membership states of the pixel at location (i, j) in a region. The three
states are True (T ), Undecided (U ), False (F ). The values are computed as
$$T(i, j) = \frac{\bar{I}(i, j) - I_{\min}(i, j)}{I_{\max}(i, j) - I_{\min}(i, j)}, \qquad
U(i, j) = \frac{\delta(i, j) - \delta_{\min}(i, j)}{\delta_{\max}(i, j) - \delta_{\min}(i, j)}, \qquad
F(i, j) = 1 - T(i, j),$$
where $\delta(i, j) = |I(i, j) - \bar{I}(i, j)|$.
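One possible numpy realization of these membership maps is sketched below. Here Ī is taken as a local mean and the extrema are computed with local min/max filters over a small window; since the paper only writes them as functions of (i, j), the choice of neighbourhood is an assumption of this sketch, not the method of [6].

import numpy as np
from scipy.ndimage import uniform_filter, minimum_filter, maximum_filter

def neutrosophic_maps(I, window=5, eps=1e-12):
    """Return the (T, U, F) membership maps of an image under the stated assumptions."""
    I = I.astype(float)
    I_bar = uniform_filter(I, size=window)          # local mean, playing the role of I-bar
    I_min = minimum_filter(I_bar, size=window)
    I_max = maximum_filter(I_bar, size=window)
    T = (I_bar - I_min) / (I_max - I_min + eps)
    delta = np.abs(I - I_bar)
    d_min = minimum_filter(delta, size=window)
    d_max = maximum_filter(delta, size=window)
    U = (delta - d_min) / (d_max - d_min + eps)
    F = 1.0 - T
    return T, U, F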
studying this phenomenon is that there is no general criterion for good noise reduction.
In other words, stating the right theorem is not even possible.
3. IMAGE SEGMENTATION
Most segmentation methods used in mammography rely solely on pixel intensity
to distinguish masses or candidate masses from normal tissue. Early papers suggested
linear filters for detection. Significant post processing was required to realise reasonable
detection performance [13].
3.2. Graph Based Methods. A raft of segmentation methods exists which are either
directly or loosely based on graph theory. The advantage is that the emphasis is away
from detecting edges (as in the case of snakes) and instead on identifying regions of similar
intensity or regions that are similar according to some other attribute or set of attributes.
The basic setting is to form a graph G = (V, E) where V is a set of vertices and E is
a set of edges. Thus e ∈ E means e = (v_i, v_j) for some v_i, v_j ∈ V. The vertices are the
pixels that comprise the image and the objective is to assign edges so that the connected
components of the graph correspond to regions of interest in the image. These methods
are used to segment the entire image, meaning that the image is decomposed into the
union of disjoint segments. In contrast, many segmentation schemes, including snakes,
are used to delineate one object of interest at a time.
3.2.1. Adaptive Pyramids. The adaptive pyramid [10] [7] starts with every vertex con-
nected to each of its eight neighbors. This neighborhood is called the support set of the
vertex. The method works by selecting a subset of these vertices to ”survive” to the next
level of the pyramid. A rule is used to ensure that if two vertices are connected by an
edge, only one can survive and if a vertex does not survive, there is at least one vertex in
its support which does survive. The rule selects the surviving vertices according to how
well they represent their support region. Often the criterion used is similar intensity
but other attributes could be used instead. A surviving vertex inherits the supports of
vertices in the previous level that did not survive but are more similar to the surviving
vertex than other surviving vertices. If this process continues, eventually there is one
vertex at the top of the pyramid and so the entire image constitutes one segment. In
order to achieve useful segmentation, a rule is used to decide if a non-surviving vertex
at a certain level of the pyramid is not sufficiently similar to any surviving vertex to
be associated to any vertex at the next level. Such a vertex is called a root. All the
vertices in the base of the pyramid (the original image) associated to this vertex form
a separate segment (component) of the image.
3.2.2. Minimum Spanning Trees. This method, based on work by Felzenszwalb and
Huttenlocher [5], starts with a collection of eligible edges E comprising the edges be-
tween pixels and their four nearest neighbors. Each edge is assigned an edge weight
according to some rule, for example,
$$w(e) = w((v_i, v_j)) = \begin{cases} |I(v_i) - I(v_j)|, & (v_i, v_j) \in E \\ \infty, & \text{otherwise.} \end{cases}$$
The edges are sorted according to weight so that $w(e_i) \le w(e_j)$ for i < j and the initial
graph is set as $H^0 = (V, F^0)$ where $F^0$ is the empty set. The graph $H^q = (V, F^q)$ is
constructed from $H^{q-1} = (V, F^{q-1})$ as follows. Let $e_q = (v_i, v_j)$ denote the qth edge. If
$v_i$ and $v_j$ lie in different components of $H^{q-1}$ and $w(e_q)$ is small compared to the internal
variation of both components, set $F^q = F^{q-1} \cup \{e_q\}$. Otherwise set $F^q = F^{q-1}$. The
rule for merging or not merging components depends on a single parameter that directly
controls the granularity of the segmentation and provides a handle for automatically
tuning the segmentation to a particular class of images [1].
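A compact way to realise this merging rule is with a union-find structure that tracks, for each current component, its size and internal variation (the largest weight so far accepted inside it). The sketch below follows the Felzenszwalb-Huttenlocher criterion with a single granularity parameter k; the data layout and the parameter name are illustrative rather than taken from [1] or [5].

import numpy as np

def segment_edges(n_vertices, edges, k=500.0):
    """edges: list of (weight, i, j); returns a root label for every vertex."""
    parent = list(range(n_vertices))
    size = [1] * n_vertices
    internal = [0.0] * n_vertices      # largest accepted edge weight inside each component

    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]
            a = parent[a]
        return a

    for w, i, j in sorted(edges):
        ri, rj = find(i), find(j)
        if ri == rj:
            continue
        # Merge only if w is small compared to the internal variation of both components.
        if w <= min(internal[ri] + k / size[ri], internal[rj] + k / size[rj]):
            parent[rj] = ri
            size[ri] += size[rj]
            internal[ri] = max(internal[ri], internal[rj], w)
    return [find(v) for v in range(n_vertices)]

def grid_edges(I):
    """4-connected grid graph on an image with weights |I(v_i) - I(v_j)|."""
    h, w = I.shape
    idx = lambda r, c: r * w + c
    edges = []
    for r in range(h):
        for c in range(w):
            if r + 1 < h:
                edges.append((abs(float(I[r, c]) - float(I[r + 1, c])), idx(r, c), idx(r + 1, c)))
            if c + 1 < w:
                edges.append((abs(float(I[r, c]) - float(I[r, c + 1])), idx(r, c), idx(r, c + 1)))
    return edges

# Example: two flat regions separated by a step are kept apart for small k.
I = np.array([[0, 0, 9, 9], [0, 0, 9, 9]], dtype=float)
print(segment_edges(I.size, grid_edges(I), k=4.0))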
3.3. Statistical Region Merging. The image is viewed as the union of disjoint regions
of uniform intensity plus additive, normally distributed noise. Again the process starts
by viewing each pixel as an isolated region in the image. Regions are merged sequentially
according to the likelihood that the regions have the same mean. The image I is realised
as Q independent random variables and the number of gray levels in I is g. According
to [11], for any fixed pair of regions R and R′ in I and for any 0 < δ ≤ 1,
$$\mathrm{Prob}\!\left( \left|(\bar{R} - \bar{R}') - E(\bar{R} - \bar{R}')\right| \ge g \sqrt{\frac{1}{2Q}\left(\frac{1}{|R|} + \frac{1}{|R'|}\right)\ln\frac{2}{\delta}} \right) < \delta.$$
Thus R and R′ should be merged if
$$\left|\bar{R} - \bar{R}'\right| \le g \sqrt{\frac{1}{2Q}\left(\frac{1}{|R|} + \frac{1}{|R'|}\right)\ln\frac{2}{\delta}}.$$
In practice, Q is selected according to the saliency of the smallest objects the user hopes
to be able to segment [2] and δ is set to be a very small number, for example δ ≈ 1/|I|.
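The merging predicate can be coded directly from the displayed bound. The following small sketch uses the assumptions stated in the text (g gray levels, Q as in the model, δ of the order of 1/|I|); the variable names are illustrative.

import numpy as np

def should_merge(mean_R, size_R, mean_Rp, size_Rp, g=256, Q=32, delta=None, n_pixels=None):
    """Statistical region merging predicate: merge R and R' if their means are close enough."""
    if delta is None:
        delta = 1.0 / n_pixels          # delta is set to roughly 1/|I| in practice
    bound = g * np.sqrt((1.0 / (2.0 * Q)) * (1.0 / size_R + 1.0 / size_Rp) * np.log(2.0 / delta))
    return abs(mean_R - mean_Rp) <= bound

# Example: two regions of 100 pixels each in a 256 x 256 image.
print(should_merge(120.0, 100, 128.0, 100, n_pixels=256 * 256))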
3.4. Mixture Models. So far, our group has found that statistical region merging
provides the best segmentation when judged according to the final performance of the
full breast cancer detection scheme. However, there are plenty of avenues left to explore.
To begin with, the methods presented above are all based on the model that a region
of interest has uniform intensity and that each pixel is associated with exactly one
structure within the breast. Since mammograms are projection images and the structures
comprising the breast are generally rounded in shape, neither of these assumptions is
valid.
Current work is aimed at building a more realistic model of the mammogram by
viewing the image as the sum of bivariate Gaussian distributions. The objective is to
find the number of such distributions together with the set of means and variances that
best explain the content of the image.
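One hedged way to prototype such a model is to treat the (nonnegative, normalized) image as a density over pixel coordinates, draw coordinates with probability proportional to intensity, and fit bivariate Gaussian mixtures of increasing size, choosing the number of components by a criterion such as BIC. This is only an illustrative approach, not the authors' implementation; scikit-learn's GaussianMixture is assumed available.

import numpy as np
from sklearn.mixture import GaussianMixture

def fit_image_mixture(I, n_samples=20000, max_components=10, seed=0):
    """Fit bivariate Gaussian mixtures to intensity-weighted pixel coordinates, select by BIC."""
    rng = np.random.default_rng(seed)
    rows, cols = np.indices(I.shape)
    coords = np.column_stack([rows.ravel(), cols.ravel()]).astype(float)
    weights = I.ravel().astype(float)           # assumes nonnegative intensities
    weights = weights / weights.sum()
    sample = coords[rng.choice(len(coords), size=n_samples, p=weights)]
    best, best_bic = None, np.inf
    for k in range(1, max_components + 1):
        gm = GaussianMixture(n_components=k, covariance_type="full", random_state=seed).fit(sample)
        bic = gm.bic(sample)
        if bic < best_bic:
            best, best_bic = gm, bic
    return best   # means_, covariances_ and weights_ describe the fitted model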
3.5. Role of Texture. The discussion so far has focused on using image intensity alone
to segment the image. Two regions may have the same mean intensity but differ in the
variance of the intensity values. More generally, the distribution of edges, lines, bumps,
may vary between regions even if mean intensity does not. These notions have led
to the use of texture in image analysis generally, especially in classification tasks. In
mammography, several studies have considered texture in classifying masses as benign or
malignant [3],[17],[20]. In this context, the region of interest has already been identified
and so texture descriptions can be computed for the region without difficulty.
In segmentation, the use of texture is more difficult because texture cannot be
measured one pixel at a time. Typically, each pixel is assigned a neighborhood and
textures measured over that neighborhood are assigned to the pixel. Many texture
measures may be used resulting in the representation of each pixel by a vector of texture
attributes. If the boundary assigned to the pixel spills across two regions of different
texture, then the texture attributes assigned to the pixel will reflect neither region
accurately. Thus regions of uniform texture will have blurred boundaries in this vector
representation.
Another problem is that there is usually no way to know ahead of time which
texture attributes characterize the region of interest. A popular approach is to measure
a large number of feature attributes, say by applying a filter bank, and searching
for clusters, called textons, in the resulting space of output vectors [16]. Each texton
represents a pattern that appears often in the image. Each pixel in the original image is
then assigned to the texton which lies closest to the vector of filter outputs associated
with the pixel.
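The texton idea can be prototyped with a small Gaussian-derivative filter bank and k-means clustering of the per-pixel response vectors; the bank below and the number of textons are illustrative choices and do not reproduce the filter bank of [16].

import numpy as np
from scipy.ndimage import gaussian_filter
from sklearn.cluster import KMeans

def texton_map(I, sigmas=(1.0, 2.0, 4.0), n_textons=8, seed=0):
    """Assign each pixel to a texton found by clustering filter-bank responses."""
    I = I.astype(float)
    responses = []
    for s in sigmas:
        responses.append(gaussian_filter(I, sigma=s))                    # smoothed intensity
        responses.append(gaussian_filter(I, sigma=s, order=(0, 1)))      # derivative along columns
        responses.append(gaussian_filter(I, sigma=s, order=(1, 0)))      # derivative along rows
        responses.append(gaussian_filter(I, sigma=s, order=(1, 1)))      # cross derivative
    X = np.stack([r.ravel() for r in responses], axis=1)                 # one vector per pixel
    labels = KMeans(n_clusters=n_textons, n_init=10, random_state=seed).fit_predict(X)
    return labels.reshape(I.shape)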
4. IMAGE REGISTRATION
The literature on image registration is huge because of its fundamental role in
tracking objects in sequences of images such as video streams and in coordinating stereo
views of objects. Most of the work is in the context of visual images and radar images
(signals). In these settings, objects of interest are usually robust in the sense that shapes
are generally consistent and the relative positions of many objects remain constant.
Even if a car moves with respect to buildings in the background, these objects separately
retain their shapes well so that, locally, the map, T , that matches points in one image
to the next can be modelled as having nice mathematical properties (Fig. 4). A small
patch in the first image can be searched for a match in the second image, testing all
possible translations and rotations, for example. If reliable matches are found this way
for several patches, the map T can be inferred by restricting the class to, say, affine
maps.
In the case of mammograms, the situation is quite different for several reasons.
First, screening visits are typically one to three years apart. Breast tissue naturally
changes over time. Breast become more fatty with age, calcium deposits increase, etc.
Second, the positioning of the breast and exposure settings are generally not consistent.
Third, different types of film or detectors may be used as technology advances. Fourth,
and most important, the breast is compressed between two plates at acquisition. The
soft tissue rolls inconsistently so that, in the x-ray image, being a projection image, the
relative positions of objects are not necessarily consistent. The map between x-rays of
the same breast from consecutive visits is not even a function since a single pixel in one
image may correspond to two locations in the second image (Fig. 5).
These considerations may discourage attempts to register mammograms, but there
is a mitigating factor. Aligning all the tissue is not necessary. Only candidate tumors
need to be associated in order to decide which anomalies are new, which have changed
and which have remained essentially unchanged. This view inspired a method to replace
true registration by matching only the information content relevant to cancer [9]. The
first step is to use detection methods to find all possible candidates for masses in both
images. Typically, 40-50 regions are included even though most images have no tumors
and very few have more than one. Next all the candidate masses in one image are
assigned a ”mass-like” score that indicates how much these objects resemble true masses.
This score is based on shape, contrast and texture features. The candidate masses are
then viewed as vertices of a graph with attributes assigned to indicate fuzzy descriptions
of relative location. In addition, the breast boundary is included in the list to provide a
fuzzy description of the location within the breast too. The graphs for the two images
are then aligned using a graph matching algorithm.
Figure 5. The circle at the top represents a breast in the normal state.
The small circle and the cross represent anomalies in the breast lying
in the same vertical plane. Compression at the first visit (C1 ) results
in a distorted breast and the circle and cross happen to align. An x-
ray image taken top to bottom of the breast shows these two objects
as superimposed. Once acquisition is complete, the breast resumes its
normal shape so the map C1 can be viewed as invertible. At the second
visit, the compression C2 results in a different relative position of the
two anomalies and the resulting x-ray shows the circle and cross as
separate objects. The induced map C2 can be modeled as invertible.
However, the map T between the two x-ray images is not even a func-
tion.
If an anomaly is found in the current image with a high mass-like score but is
matched to a mass in the same location with a similar mass-like score, then the anomaly is
rejected as a candidate cancer. This step reduces the number of false positive detections
of cancer. On the other hand, if an anomaly is found in the current image with no match
in the previous image or is matched to a much smaller anomaly in the previous image,
then this will be flagged as likely to be cancer. The key is that anomalies can be matched
even if their locations relative to other anomalies are slightly different in the two images
or overlap in one image.
5. CONCLUDING REMARKS
This has been a rather haphazard gallop through mathematical ideas arising in
computer-aided screening mammography. The message is that the most mathematically
appealing solutions do not always provide the best results, but mathematical ideas still
provide the way forward in improving early detection of breast cancer.
Radiologists are very good at spotting breast cancer without computers and
computer-aided diagnosis systems in current use are quite successful. Hence, the best
one can usually hope for is a slight improvement over current performance. The in-
clusion of graph matching to circumvent image registration (Section 4), for example,
resulted in a reduction in the false positive rate from 1.04 false positive reports per
image to 1.00 at a true detection rate of 80 percent. On the other hand, due to the very
large number of women attending screening programs world wide and the high preva-
lence of breast cancer, a small improvement in screening performance has the potential
to save thousands of lives.
Acknowledgement. The authors thank the National Breast Cancer Foundation and
the Flinders Medical Centre Foundation for support and BreastScreen SA for access to
archives of screening mammograms.
References
[1] M. Bajger, F. Ma, and M. J. Bottema. Automatic tuning of MST segmentation of mammograms
for registration and mass detection algorithms. In M. J. Bottema, B. C. Lovel, A. J. Maeder, H. Shi,
Y. Zhang, editors, 2009 Digital Image Computing Techniques and Applications, Melbourne, Aus-
tralia, Dec. 2009, IEEE Computer Society, pages 400–407, 2009.
[2] M. Bajger, S. Williams, F. Ma, and M. J. Bottema. Mammographic mass detection with statistical
region merging in digital mammography. In Digital Image Computing Techniques and Applica-
tions, Sydney, Australia, Dec. 2010, IEEE Computer Vision Society, pages 27–32, 2010.
[3] H-P. Chan, D. Wei, M. A. Halvie, B. Sahiner, D. D. Adler, M. M. Goodsitt, and N. Petrick.
Computer-aided classification of mammographic masses and normal tissue: Linear discriminant
analysis in texture feature space. Phys. Med. Biol., 40:857–876, 1995.
[4] L. D. Cohen. On active contour models and balloons. Comput. Vision, Graphics, and Image Proc.:
Image Understanding, 53:211–218, 1991.
[5] P. F. Felzenszwalb and D. P. Huttenlocher. Image segmentation using local variation. Proceedings
of IEEE Conference on Computer Vision and Pattern Recognition, pages 98–104, 1998.
[6] Y. Guo and H. D. Cheng. New neutrosophic approach to image segmentation. Int. Jour. Computer
Vision, 42:587–595.
[7] J. M. Jolion and A. Montanvert. The adaptive pyramid: A framework for 2d image analysis.
Computer Vision, Graphics, and Image Processing, 55(3):339–348, May 1992.
[8] M. Kass, A. Witkin, and D. Terzopoulos. Snakes: Active contour models. International Journal
of Computer Vision, 1(4):321–331, 1987.
[9] F. Ma, M. Bajger, , and M. J. Bottema. Temporal analysis of mammograms based on graph
matching. In E. A. Krupinski, editor, Digital Mammography 9th International Workshop, IWDM
2008, Tucson, AZ, USA, July 2008, Proceedings, number 5116 in Lecture notes in computer
science, pages 158–165. Springer, 2008.
[10] A. Montanvert, P. Meer, and A. Rosenfeld. Hierarchical image analysis using irregular tessellations.
IEEE Transactions on Pattern Analysis and Machine Intelligence, 13(4):307–316, April 1991.
[11] R. Nock and F. Nielsen. Statistical region merging. Trans. Pattern Anal. Mach. Intell., 26:1452–
1458, 2007.
[12] P. Perona and J. Malik. Scale-space and edge detection using anisotropic diffusion. IEEE Trans-
actions on Pattern Analysis and Machine Intelligence, 12:629–639, 1990.
[13] B. Sahiner, H-P. Chan, N. Petrick, M. A. Helvie, and L. M. Hadjiiski. Improvement of mam-
mographic mass characterization using spiculation measures and morphological features. Medical
Physics, 28(7):1455–1465, 2001.
[14] J. S. Suri and R. M. Rangayyan. Recent advances in breast imaging, mammography, and computer-
aided diagnosis of breast cancer. SPIE, 2006.
[15] P. Taylor, H. Potts, L. Wilkinson, and R. Givin-Wilson. Impact of CAD with full field digital
mammography on workload and cost. In J. Marti, A. Oliver, J. Freixenet, and R. Marti, editors,
Digital Mammography 10th International Workshop, IWDM 2010, Girona, Spain, June 2010,
Proceedings, number 6136 in Lecture notes in computer science, pages 1–8. Springer, 2010.
[16] M. Varma and A. Zisserman. A statistical approach to texture classification from single images.
Int. Jour. Computer Vision, 62:61–81, 2005.
[17] D. Wei, H-P. Chan, et al. False-positive reduction technique for detection of masses on digital
mammograms: Global and local multiresolution texture analysis. Medical Physics, 24(6):903–914,
1997.
[18] J. Weickert. Anisotropic Diffusion in Image Processing. B. G. Teubner, Stuttgart, 1998.
[19] C. Xu and J. L. Prince. Snakes, shapes and gradient vector flow. IEEE Trans. Image Proc.,
7:359–369, 1998.
[20] R. Zwiggelaar and E. R. E. Denton. Texture based segmentation. In S. M. Astley, M. Brady,
C. Rose, and R. Zwiggelaar, editors, Digital Mammography 8th International Workshop, IWDM
2006, Manchester, UK, June 2006, Proceedings, number 4046 in Lecture notes in computer science,
pages 433–440. Springer, 2006.
M. J. Bottema
Flinders University.
e-mail: [email protected]
M. Bajger
Flinders University.
e-mail: [email protected]
F. Ma
Flinders University.
e-mail: [email protected]
S. Williams
Flinders University.
e-mail: [email protected]
Proceedings of ”The 6th SEAMS-UGM Conference 2011”
pp. 65–78.
Reza Pulungan
Abstract. This paper lays out the past and the future of one of the most interesting
research problems in the area of phase-type distributions: the problem of their minimal
representations. We will chronologically present contemporary results, including our own
contributions to the problem, and provide several pointers and possible approaches in
attempting to solve the problem in future work.
Keywords and Phrases: Phase-type distributions, Markov chain, minimal representa-
tions, order.
1. INTRODUCTION
The problem of minimal representations remains one of the open problems in the
research area of phase-type (PH) distributions [16, 18]. Given a phase-type distribution,
a minimal representation is an absorbing Markov chain with the fewest number of
states, whose distribution of time to absorption is governed by the same phase-type
distribution. Obtaining minimal representations is important in various circumstances,
including, but not limited to, modeling formalisms that support compositionality [13,
2, 12]. In such circumstance, models are constructed by composing smaller components
via various operations that usually result in exponential blowups of the state space.
Ensuring that all components and all intermediate results of the composition come in
minimal representations will significantly reduce these blowups.
Previous studies [1, 19, 20, 3, 4, 15, 5, 6] have produced several techniques to
obtain these minimal representations. However, the frontier is still limited to acyclic
phase-type distributions [9, 10, 11, 22], namely those phase-type distributions having
at least one Markovian representation that contains no cycle. Even in this case, the
resulting algorithm is not yet satisfactory, for it contains non-linear programming, which
can be inefficient in many cases.
This paper lays out the past and the future of one of the most interesting research
problems in the area of phase-type distributions: the problem of their minimal repre-
sentations. We will chronologically present contemporary results, including our own
contributions to the problem, and provide several pointers and possible approaches in
attempting to solve the problem in future work.
The paper is organized as follows: Section 2 introduces phase-type distributions
and other concepts required throughout the paper. In this section, we also formulate
the problem of the order of phase-type distributions. Section 3 lays out previous partial
solutions to the problem. In Section 4, we describe our contribution in solving the prob-
lem by proposing an algorithm to reduce the size of acyclic phase-type representations.
The paper is concluded in Section 5.
2. PRELIMINARIES
2.1. Phase-Type Distributions. Let the stochastic process {X(t) ∈ S | t ∈ R+ } be
a homogeneous Markov process defined on a discrete and finite state space
S = {s1 , s2 , · · · , sn , sn+1 }
and with time parameter t ∈ R+ := [0, ∞). The Markov process is a finite continuous-
time Markov chain (CTMC). We view the structure of such a CTMC as a tuple
$M = (S, R)$ where $R$ is a rate matrix $R : S \times S \to \mathbb{R}_+$. The rate matrix $R$ is related
to the corresponding infinitesimal generator matrix by $Q(s, s') = R(s, s')$ if $s \neq s'$, and
$Q(s, s) = -\sum_{s' \neq s} R(s, s')$, for all $s, s' \in S$. If state $s_{n+1}$ is absorbing (i.e.,
$Q(s_{n+1}, s_{n+1}) = 0$) and all other states $s_i$ are transient (i.e., there is a nonzero proba-
bility that the state will never be visited once it is left, or equivalently, there exists at
least one path from the state to the absorbing state), the infinitesimal generator matrix
of the Markov chain can be written as:
$$Q = \begin{pmatrix} A & \vec{A} \\ \vec{0} & 0 \end{pmatrix}.$$
Matrix $A$ is called a PH-generator and it is non-singular because the first $n$ states
in the Markov chain are transient. Vector $\vec{A}$ is a column vector whose component $\vec{A}_i$,
for $i = 1, \cdots, n$, represents the transition rate from state $s_i$ to the absorbing state. The
Markov chain is fully specified by the generator matrix $Q$ and the initial probability
vector $(\vec{\alpha}, \alpha_{n+1})$, where $\vec{\alpha}$ is an $n$-dimensional row vector corresponding to the initial
probabilities of the transient states and $\alpha_{n+1}$ is the initial probability to be immediately
in the absorbing state. Therefore $\vec{\alpha}\vec{1} + \alpha_{n+1} = 1$, where $\vec{1}$ is an $n$-dimensional column
vector whose components are all equal to 1.
Definition 2.1 (Phase-Type Distribution [16]). A probability distribution on R+ is
a phase-type (PH) distribution if and only if it is the distribution of the time until
absorption in a Markov process of the type described above.
The pair $(\vec{\alpha}, A)$ is called the representation of the PH distribution, and $PH(\vec{\alpha}, A)$
is used to denote the PH distribution with representation $(\vec{\alpha}, A)$.
The probability distribution of the time until absorption in the Markov chain
(hence of the PH distribution) is given by:
$$F(t) = 1 - \vec{\alpha}\exp(At)\vec{1}, \qquad \text{for } t \ge 0. \qquad (1)$$
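Equation (1) can be evaluated numerically with a matrix exponential. The sketch below computes F(t) for a small acyclic (hypoexponential) representation; the particular rates are arbitrary illustration values, not taken from the paper.

import numpy as np
from scipy.linalg import expm

def ph_cdf(alpha, A, t):
    """F(t) = 1 - alpha * exp(A t) * 1 for a PH representation (alpha, A)."""
    ones = np.ones(A.shape[0])
    return 1.0 - alpha @ expm(A * t) @ ones

# Hypoexponential example: pass through states with rates 2, 3, 5 before absorbing.
rates = [2.0, 3.0, 5.0]
n = len(rates)
A = np.zeros((n, n))
for i, lam in enumerate(rates):
    A[i, i] = -lam
    if i + 1 < n:
        A[i, i + 1] = lam
alpha = np.zeros(n); alpha[0] = 1.0

for t in (0.5, 1.0, 2.0):
    print(t, ph_cdf(alpha, A, t))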
The Laplace-Stieltjes transform (LST) of the PH distribution is given by:
$$\tilde{f}(s) = \int_{-\infty}^{\infty} \exp(-st)\, dF(t) = \vec{\alpha}(sI - A)^{-1}\vec{A} + \alpha_{n+1}, \qquad (2)$$
where $s \in \mathbb{R}_+$ and $I$ is the $n$-dimensional identity matrix. Consider the LST of the PH
distribution in (2). This transform is a rational function. The degree of the denominator
polynomial of its LST expressed in irreducible ratio is called the algebraic degree (or
simply the degree) of the distribution.
It is known [16, 18] that a given PH distribution has more than one irreducible
representation. The size of a minimal irreducible representation, namely a representa-
tion with the fewest possible number of states, is referred to as the order of the PH
distribution. O’Cinneide in [18] showed that the order of a PH distribution may be
different from, but at least as great as, its algebraic degree. Therefore the following
theorem is straightforward.
Theorem 2.2. Let n be the size of a PH representation whose size is equal to the
algebraic degree of its PH distribution. Then the order of the PH distribution is n.
2.5. The Problem of the Order of Phase-Type Distributions. The main problem
addressed in this paper is the problem of the order of PH distributions, namely: given a
PH distribution, what is its order? Stated differently, we would like to find the minimal
number of states required to represent a given PH distribution as an absorbing CTMC.
The PH distribution can be given in various ways: as a probability distribution in
mathematical formulas, as a Laplace-Stieltjes transform, or even as a PH representation
of a certain size.
As a byproduct, of course, it would be advantageous to also be able to devise
methods to compute a minimal representation—namely a PH representation whose size
is equal to the order—of the given PH distribution.
3. PREVIOUS RESULTS
In this section, previous results on the partial solutions to the problem of the order
of PH distributions are presented. The first early result is given in Theorem 2.2. This
theorem is a restatement of lemmas found in [16, 18]. The theorem basically establishes
that the lower bound of the order of PH distributions is their respective algebraic degree.
In the following subsections, we present further partial results, starting with acyclic
PH distributions, then general PH distributions, the relationship between simplicity and
order, and, in the end, an attempt to find non-minimal but nonetheless sparse repre-
sentations.
He and Zhang in [9] provided an algorithm, called the spectral polynomial algo-
rithm, to obtain the ordered bidiagonal representation of any given acyclic PH repre-
sentation. The spectral polynomial algorithm is of complexity O(n3 ) where n is the size
of the given acyclic PH representation.
In [19], O’Cinneide formally characterized acyclic PH distributions by proving
Theorem 2.1. The characterization basically relates acyclic PH distributions to the
shape of their density functions and LSTs. The theorem maintains that the LST of any
acyclic PH distribution is a rational function and all of its poles are real. Hence, a PH
representation could be cyclic; but as long as its LST has only real poles, there must
exist an acyclic PH representation that has the same PH distribution.
The following three theorems by Commault and Chemla in [4] specify certain
conditions for acyclic PH representations to be minimal, namely to have their size be
equal to their respective order.
Theorem 3.2 ([4]). The order of a PH distribution with LST \tilde{f}(s) = P(s)/Q(s), where
P(s) and Q(s) are co-prime polynomials, such that Q(s) has degree n with n real roots
and P(s) has degree less than or equal to one, is n.
Theorem 3.2 establishes that the convolution of several exponential distributions
always produces minimal PH representations. This means that Erlang representations—
formed by a convolution of several exponential distributions of the same rate—and
hypoexponential representations—formed by a convolution of several exponential dis-
tributions of possibly different rates—are always minimal.
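To make the convolution structure concrete, the sketch below is our illustration (not the paper's construction): it builds the generator of a hypoexponential, i.e. bidiagonal, representation Bi(λ1, ..., λn) with NumPy; the helper name make_bidiagonal and the rates are ours.

import numpy as np

def make_bidiagonal(rates):
    """PH-generator of a hypoexponential (ordered bidiagonal) representation Bi(l1,...,ln)."""
    n = len(rates)
    A = np.zeros((n, n))
    for i, lam in enumerate(rates):
        A[i, i] = -lam          # leave state i with rate lambda_i
        if i + 1 < n:
            A[i, i + 1] = lam   # move to the next state in the chain
    return A

# Erlang-3 with rate 2.0 is the special case of equal rates.
print(make_bidiagonal([2.0, 2.0, 2.0]))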
Theorem 3.3 ([4]). Consider a PH distribution with LST \tilde{f}(s) = P(s)/Q(s), where
P(s) and Q(s) are co-prime polynomials with real roots, such that:
P(s) = \frac{(s+\mu_1)(s+\mu_2)}{\mu_1 \mu_2}, \quad \mu_1 \ge \mu_2 > 0, \quad and
Q(s) = \prod_{i=1}^{n} \frac{s+\lambda_i}{\lambda_i}, \quad \lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_n > 0.
If \mu_2 \ge \lambda_n and (\mu_1 + \mu_2) \ge (\lambda_{n-1} + \lambda_n), then the order of the distribution is n.
Theorem 3.4 ([4]). Consider a PH distribution with LST \tilde{f}(s) = P(s)/Q(s), where
P(s) and Q(s) are co-prime polynomials with real roots, such that:
P(s) = \prod_{i=1}^{m} \frac{s+\mu_i}{\mu_i}, \quad \mu_1 \ge \mu_2 \ge \cdots \ge \mu_m > 0, \quad and
Q(s) = \prod_{i=1}^{n} \frac{s+\lambda_i}{\lambda_i}, \quad \lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_n > 0, \quad n > m.
If \mu_m \ge \lambda_n, \mu_{m-1} \ge \lambda_{n-1}, \ldots, \mu_1 \ge \lambda_{n-m+1}, then the order of such a PH distribution is n.
Theorems 3.3 and 3.4 provide conditions under which the convolution of several
exponential distributions of possibly different rates, where the process may start not only
in the first state, is minimal.
So far, the partial results only provide conditions for the order of acyclic PH
distributions to be equal to the size of the representations. This, in itself, is important,
since it provides a means to determine whether an existing PH representation is already
minimal, or whether we should first try to find a smaller or even a minimal representation
before using it. However, algorithmic results would also be useful. Such results
will allow us to obtain not only the order but also the minimal PH representations
themselves. We shall return to this issue in Section 4, where such algorithmic methods
are described.
—states to represent such PH distributions. Since the poles of the LST of a PH dis-
tribution are eigenvalues of the PH-generator of any of its representation, the order of
the representation increases when the angle between the position of any complex poles
and the vertical line passing through the real dominating pole decreases [6]. This the-
orem assures us that finding a PH representation of a size that is exactly equal to the
algebraic degree of its PH distribution is not always possible.
3.3. PH-Simplicity and Order. Let {Xt | t ∈ R≥0 } be an absorbing Markov process
representing a PH distribution and let τ be a random variable denoting its absorption
time.
Definition 3.1 ([14]). The dual or the time-reversal representation of the absorbing
Markov process {Xt | t ∈ R≥0 } is given by an absorbing Markov process {Xτ −t | t ∈
R≥0 }.
The relationship between the two processes can be described intuitively as follows:
the probability of being in state s at time t in one Markov process is equal to the
probability of being in state s at time τ − t in the time-reversal Markov process and
vice versa.
Lemma 3.1 ([3, 5]). Given a PH representation (\vec{\alpha}, A), its dual representation is
(\vec{\beta}, B) such that:
\vec{\beta} = \vec{A}^T M \quad and \quad B = M^{-1} A^T M,
where M = diag(\vec{m}) is a diagonal matrix whose diagonal components are formed by the
components of the vector \vec{m} = -\vec{\alpha} A^{-1}.
Lemma 3.1 provides a recipe to obtain the dual representation of a given PH
representation. It is important to note that the sizes of a PH representation and of its
dual are equal.
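As a small illustration of Lemma 3.1 (the numbers and the function name are ours), the dual can be computed directly under the formulas above, \vec{m} = -\vec{\alpha}A^{-1}, M = diag(\vec{m}), \vec{\beta} = \vec{A}^T M and B = M^{-1}A^T M, assuming NumPy is available.

import numpy as np

def dual_representation(alpha, A):
    """Dual (time-reversal) representation of (alpha, A) following Lemma 3.1."""
    exit_vec = -A @ np.ones(A.shape[0])      # exit-rate vector A_vec = -A*1
    m = -alpha @ np.linalg.inv(A)            # m = -alpha * A^{-1}
    M = np.diag(m)
    beta = exit_vec @ M                      # beta = A_vec^T M
    B = np.linalg.inv(M) @ A.T @ M           # B = M^{-1} A^T M
    return beta, B

A = np.array([[-3.0,  2.0],
              [ 1.0, -4.0]])
alpha = np.array([0.7, 0.3])
beta, B = dual_representation(alpha, A)
print(beta, B, sep="\n")

A quick sanity check is that \vec{\beta} carries the same total initial probability mass as \vec{\alpha}, consistent with both representations having the same size and distribution.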
The notion of PH-simplicity, on the other hand, was first formalized in [17] and
it is closely related to the notion of simplicity in convex analysis.
Definition 3.2. A PH-generator A (of dimension n) is PH-simple if and only if for
any two n-dimensional substochastic vectors \vec{\alpha}_1 and \vec{\alpha}_2, where \vec{\alpha}_1 \neq \vec{\alpha}_2, we have
PH(\vec{\alpha}_1, A) \neq PH(\vec{\alpha}_2, A).
Theorem 3.7 ([3]). Given a PH representation of size n. If both PH-generators of the
representation and its dual representation are PH-simple, then the algebraic degree of
the associated PH distribution is n.
Theorem 3.7 establishes the relationship between PH-simplicity and the order of
PH distributions. In particular, the theorem maintains that if PH-generators of a PH
representation and of its dual representation are both PH-simple, then, no matter their
initial probability distributions, both representations are minimal and the order of the
associated PH distribution is equal to the size of the representations.
3.4. Mixture of Monocyclic Erlang. Figure 1 depicts an example of a monocyclic
Erlang representation in graph form.
The representation has n states and ends in an absorbing state, depicted by the
black circle. The representation is basically formed by a convolution of n exponential
distributions of the same rate λ—hence, Erlang—but with a single cycle from the last
to the first state—hence, monocyclic—with rate µ < λ.
Mocanu and Commault in [15] showed that a conjugate pair of complex poles in the
LST of a PH distribution can be represented by a single monocyclic Erlang, and they
proceeded to prove Theorem 3.8.
Figure 1. A monocyclic Erlang representation: states 1, 2, ..., n in series with rate λ,
a feedback transition of rate µ from state n to state 1, and exit to the absorbing state
with rate λ − µ.
4. OUR CONTRIBUTION
In this section, we will explore further on the algorithmic solution to the problem
in the field of acyclic PH distributions. In [10], HE and Zhang provided an algorithm
L-terms. The LST of an exponential distribution with rate λ is given by \tilde{f}(s) = \frac{\lambda}{s+\lambda}.
Let L(\lambda) = \frac{s+\lambda}{\lambda}, i.e., the reciprocal of the LST. We call a single expression of L(·) an
L-term. The LST of an ordered bidiagonal representation (\vec{\beta}, Bi(\lambda_1, \lambda_2, \cdots, \lambda_n)) can
be written as:
\tilde{f}(s) = \frac{\beta_1}{L(\lambda_1) \cdots L(\lambda_n)} + \frac{\beta_2}{L(\lambda_2) \cdots L(\lambda_n)} + \cdots + \frac{\beta_n}{L(\lambda_n)}
       = \frac{\beta_1 + \beta_2 L(\lambda_1) + \cdots + \beta_n L(\lambda_1) \cdots L(\lambda_{n-1})}{L(\lambda_1) L(\lambda_2) \cdots L(\lambda_n)},  (3)
but this may not be in irreducible ratio form. Here the denominator polynomial corre-
sponds exactly to the sequence of the transition rates of the ordered bidiagonal represen-
tation, and thus its degree is equal to the size of the ordered bidiagonal representation.
Reduction. Observing (3), we see that in order to remove a state from the ordered
bidiagonal representation, we have to find a common L-term in both the numerator and
denominator polynomials. If we find that, we might be able to drop a state from the
representation. But removing a common L-term from the numerator and denominator
involves redistributing the initial probability distribution. This may not be possible,
because the resulting vector ~δ may not be substochastic (a vector ~δ is substochastic if
δi ≥ 0 and ~δ~1 ≤ 1). Otherwise, a state can be removed. The procedure of identifying
and properly removing a state from an ordered bidiagonal representation is based on
Lemma 4.1 (see [21] for proof).
Let M E(~ α, A, ω
~ ) denote the matrix-exponential (ME) distribution of representa-
α, A, ω
tion (~ ~ ). The set of PH distributions is a subset of ME distributions. In particular,
the initial distributionP vector α~ in an ME representation is allowed to be non-stochastic
vector, as long as 0 < i α ~ i ≤ 1. Let ~1|x be a vector of dimension x whose components
are all equal to 1.
Lemma 4.1. If for some 1 ≤ i ≤ n, β1 + β2 L(λ1 ) + · · · + βi L(λ1 ) · · · L(λi−1 ) is divisible
by L(λi ) then there exists a unique vector ~δ such that:
~ Bi(λ1 , · · · , λn )) = M E(~δ, Bi(λ1 , · · · , λi−1 , λi+1 , · · · , λn ), ~1|n−1 ).
P H(β,
If vector ~δ is substochastic, then
~ Bi(λ1 , · · · , λn )) = P H(~δ, Bi(λ1 , · · · , λi−1 , λi+1 , · · · , λn )).
P H(β,
If both conditions are fulfilled, then switching from the given representation
~ Bi(λ1 , · · · , λn )) to the smaller representation (~δ, Bi(λ1 , · · · , λi−1 , λi+1 , · · · , λn )) means
(β,
reducing the size from n to n − 1. Algorithmically, we investigate the two conditions
for a given λi . The divisibility of the numerator polynomial is obtained by checking
whether R(−λi ) = 0, where R(s) is the numerator polynomial in (3). The substochas-
ticity (i.e., the absence of negative components) of ~δ is checked while computing it,
as explained below.
Let Bi1 := Bi(λ1 , · · · , λi ), Bi2 := Bi(λ1 , · · · , λi−1 ). Lemma 4.2 (see [21] for
proof) states that we can simply ignore the last n − i states in both bidiagonal chains.
Lemma 4.2. If δj = βj+1 , for i ≤ j ≤ n − 1, then:
~ Bi(λ1 , · · · , λn )) = M E(~δ, Bi(λ1 , · · · , λi−1 , λi+1 , · · · , λn ), ~1|n−1 )
P H(β,
implies:
P H([β1 , · · · , βi ], Bi1 ) = M E([δ1 , · · · , δi−1 ], Bi2 , ~1|i−1 ). (4)
Algorithm. Lemma 4.1 can thus be turned into an algorithm that reduces the size of
a given APH representation (~α, A), which we give here in an intuitive form (a code sketch
follows the listing).
(1) Use SPA to turn (~α, A) into (~β, Bi(λ1 , · · · , λn )), which takes O(n^3) time.
(2) Set i to 2.
(3) While i ≤ n:
(a) Check divisibility w.r.t. λi (i.e., whether R(−λi) = 0), which takes O(n) time.
(b) If not divisible (i.e., R(−λi) ≠ 0), continue the while-loop with i set to i + 1.
Otherwise, construct (7), and then solve it by backward substitution. This
takes O(n^2) time and produces (~δ, Bi(λ1 , · · · , λi−1 , λi+1 , · · · , λn )). If the vec-
tor ~δ is substochastic (which takes O(n) time to check), continue with the
PH representation (~δ, Bi(λ1 , · · · , λi−1 , λi+1 , · · · , λn )) and decrease n
to n − 1; otherwise continue the while-loop with (~β, Bi(λ1 , · · · , λn )) and i
set to i + 1.
(4) Return (~β, Bi(λ1 , · · · , λn )).
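The sketch below (ours) mirrors this loop in Python with SymPy. Since the paper's equation (7) and its backward substitution are not reproduced here, the vector ~δ is instead obtained by matching the numerator polynomials of the two chains, which is equivalent under Lemma 4.1; the function names and the final example are illustrative.

import sympy as sp

s = sp.symbols('s')

def L(lam):
    # An L-term, the reciprocal of the exponential LST: L(lambda) = (s + lambda)/lambda.
    return (s + lam) / lam

def numerator(weights, lams):
    # R(s) = w_1 + w_2*L(lam_1) + ... + w_n*L(lam_1)...L(lam_{n-1}), cf. equation (3).
    terms, prod = [], sp.Integer(1)
    for j in range(len(lams)):
        terms.append(weights[j] * prod)
        prod *= L(lams[j])
    return sp.expand(sum(terms))

def reduce_once(beta, lams, i):
    # Try to remove state i (1-based) from (beta, Bi(lams)); return the smaller
    # representation or None if the divisibility/substochasticity tests fail.
    R = numerator(beta, lams)
    if sp.simplify(R.subs(s, -lams[i - 1])) != 0:      # divisibility test R(-lam_i) = 0
        return None
    new_lams = lams[:i - 1] + lams[i:]
    deltas = list(sp.symbols(f'd1:{len(new_lams) + 1}'))
    # Match the numerator of the reduced chain against the quotient R(s)/L(lam_i);
    # this replaces the paper's backward substitution of equation (7).
    diff = sp.expand(numerator(deltas, new_lams) - sp.cancel(R / L(lams[i - 1])))
    sol = sp.solve(sp.Poly(diff, s).all_coeffs(), deltas, dict=True)
    if not sol:
        return None
    delta = [sol[0][d] for d in deltas]
    # Substochasticity: nonnegative components summing to at most 1.
    if all(c >= 0 for c in delta) and sum(delta) <= 1:
        return delta, new_lams
    return None

# Illustrative use: (1/3, 2/3) with Bi(2, 3) is a size-2 representation of an
# exponential distribution with rate 2, so the state with rate 3 can be removed.
print(reduce_once([sp.Rational(1, 3), sp.Rational(2, 3)], [2, 3], 2))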
5. CONCLUDING REMARKS
This paper has described one of the most interesting research problems in the
area of phase-type distributions: the problem of their order and hence their minimal
representations. Several partial solutions to the problem have been discussed. For
acyclic phase-type distributions, the problem has basically been solved. An algorithm
that is guaranteed to transform any given acyclic phase-type representation to its min-
imal representation has been proposed in [10]. Although the algorithm involves solving
non-linear programming, which can be difficult, highly unstable and prone to numerical
errors, this algorithm is an excellent basis for further developments and improvements.
Based on our own proposed algorithm to reduce the size of acyclic phase-type represen-
tations, we think that to achieve minimality, non-linearity seems to be unavoidable.
For the general phase-type distributions, the problem is still open. Currently
available partial results are restricted to conditions for minimality without algorithmic
possibilities. The only algorithmic method that we are aware of is the algorithm pro-
posed in [15]. However, this algorithm only strives for obtaining sparse representations,
not minimal ones. Nevertheless, we think that this algorithm is also an excellent ba-
sis for further developments and improvements towards an algorithm that can produce
minimal representations, since the output of the algorithm is quite similar to ordered
bidiagonal representations.
References
[1] Aldous, D. and Shepp, L., The least variable phase-type distribution is erlang, Communications
in Statistics: Stochastic Models, 3, 467-473, 1987.
[2] Bernardo, M. and Gorrieri, R., Extended Markovian process algebra, In CONCUR 96, Con-
currency Theory, 7th International Conference, Pisa, Italy, August 26-29, 1996, Proceedings,
volume 1119 of Lecture Notes in Computer Science, 315-339, Springer, 1996.
[3] Commault, C. and Chemla, J.-P., On dual and minimal phase-type representations, Communi-
cations in Statistics: Stochastic Models, 9(3), 421-434, 1993.
[4] Commault, C. and Chemla, J.-P., An invariant of representations of phase-type distributions
and some applications, Journal of Applied Probability, 33, 368-381, 1996.
[5] Commault, C. and Mocanu, S., A generic property of phase-type representations, Journal of
Applied Probability, 39, 775-785, 2002.
[6] Commault, C. and Mocanu, S., Phase-type distributions and representations: Some results and
open problems for system theory, International Journal of Control, 76(6), 566-580, 2003.
[7] Cumani, A., On the canonical representation of homogeneous Markov processes modelling failure-
time distributions, Microelectronics and Reliability, 22, 583-602, 1982.
[8] Dmitriev, N. and Dynkin, E.B., On the characteristic numbers of a stochastic matrix. Comptes
Rendus (Doklady) de l'Académie des Sciences de l'URSS (Nouvelle Série), 49, 159-162, 1945.
[9] He, Q.-M. and Zhang, H., Spectral polynomial algorithms for computing bi-diagonal representa-
tions for phase-type distributions and matrix-exponential distributions, Stochastic Models, 2(2),
289-317, 2006.
[10] He, Q.-M. and Zhang, H., An Algorithm for Computing Minimal Coxian Representations, IN-
FORMS Journal on Computing, ijoc.1070.0228, 2007.
[11] He, Q.-M. and Zhang, H., On matrix exponential distributions, Advances in Applied Probability,
39(1), 271-292, 2007.
[12] Hermanns, H., Interactive Markov Chains: The Quest for Quantified Quality, Lecture Notes in
Computer Science, 2428, Springer, 2002.
[13] Hillston, J., A compositional approach to performance modelling, Cambridge University Press,
1996.
[14] Kelly, F.P., Reversibility and Stochastic Networks, Wiley, 1979.
[15] Mocanu, S. and Commault, C., Sparse representation of phase-type distributions, Communica-
tions in Statistics: Stochastic Models, 15(4), 759-778, 1999.
[16] Neuts, M. F., Matrix-Geometric Solutions in Stochastic Models: An Algorithmic Approach,
Dover, 1981.
[17] O’Cinneide, C. A., On non-uniqueness of representations of phase-type distributions, Communi-
cations in Statistics: Stochastic Models, 5(2), 247-259, 1989.
[18] O’Cinneide, C. A., Characterization of phase-type distributions, Communications in Statistics:
Stochastic Models, 6(1), 1-57, 1990.
[19] O’Cinneide, C. A., Phase-type distributions and invariant polytopes, Advances in Applied Prob-
ability, 23(43), 515-535, 1991.
[20] O’Cinneide, C. A., Triangular order of triangular phase-type distributions, Communications in
Statistics: Stochastic Models, 9(4), 507-529, 1993.
[21] Pulungan, R., Reduction of acyclic phase-type representations, Ph.D. Dissertation, Saarland
University, Saarbrücken, Germany, 2009.
[22] Pulungan, R. and Hermanns, H., Acyclic minimality by construction—almost, Sixth Interna-
tional Conference on Quantitative Evaluation of Systems, QEST 2009, 61-72, IEEE Computer
Society, 2009.
Reza Pulungan
Department of Computer Science and Electronics,
Faculty of Mathematics and Natural Sciences,
Universitas Gadjah Mada, Yogyakarta, Indonesia.
e-mail : [email protected]
Proceedings of ”The 6th SEAMS-UGM Conference 2011”
pp. 79 - 88
SALMAH
Abstract. In this paper the noncooperative linear quadratic game problem is
considered. We present necessary and sufficient conditions for the existence of optimal
strategies for linear quadratic continuous non-zero-sum two-player dynamic games for index-
one descriptor systems. The connection of the game solution with the solution of coupled
Riccati equations is studied. In the noncooperative game with open-loop structure, we study the
Nash solution of the game. If the second player is allowed to select his strategy first, he is called
the leader of the game, and the first player, who selects his strategy second, is
called the follower. A Stackelberg strategy is the optimal strategy for the leader under the
assumption that the follower reacts by playing optimally.
Keywords and Phrases : Dynamic, game, noncooperative, descriptor, system
1. INTRODUCTION
Dynamic game theory brings three key ingredients to many situations in economics, ecology, and
elsewhere: optimizing behaviour, the presence of multiple agents, and the consequences of decisions.
Therefore this theory has been used to study various policy problems, especially in macro-
economics. In applications one often encounters systems described by differential equations
subject to algebraic constraints. Descriptor systems give a realistic model for such
systems.
In policy coordination problems, questions arise as to whether policies are coordinated and what
information the parties have. One scenario is the noncooperative open-loop game. In this
scenario, the parties cannot react to each other's policies, and the only information that the
players know is the model structure and the initial state.
In this paper we consider a linear open-loop dynamic game in which the players
satisfy a linear descriptor system and minimize quadratic objective functions. For the finite
horizon problem, the solution of a generalized Riccati differential equation is studied. If the
planning horizon is extended to infinity, the differential Riccati equation becomes an
algebraic Riccati equation.
2. PRELIMINARIES
If the second player is allowed to select his strategy first, he is called the leader of the
game, and the first player, who selects his strategy second, is called the follower. A
Stackelberg strategy is the optimal strategy for the leader under the assumption that the
follower reacts by playing optimally.
The assumptions which are needed are given below.
Assumption 2.1: Descriptor system (1) is regular, impulse controllable and finite dynamics
stabilizable, which requires:
(i). | | , except for a finite number of ,
(ii). ( ) ( ) ,
(iii). ( | ) [ ]
To derive the necessary conditions for the optimal Nash solution we need the Hamiltonian
functions as follows.
( ) ( ) ( ),
( ) ( ) ( ),
With the Lagrange multiplier method as in [11], the necessary conditions for the objective
functions to be optimal in the Nash sense are
, , ̇ i=1,2. (3.1)
Substituting these equations into (1) yields
̇( ) ( ) ( ) (3.2)
and
with i=1,2. (3.3)
The boundary conditions are
( ) and ( ) ( ). (3.4)
From the necessary conditions of the optimal Nash solution we get that the optimal strategies
for the two-player dynamic game satisfy (3.2) and the boundary
conditions (3.4), i = 1, 2.
We can write in matrix form and get
̇( ) ( )
( ) ( ̇ ( )) ( ) ( ( )) , (3.5)
̇ ( ) ( )
with boundary conditions (3.4). System (3.5) can be written in descriptor form with
coefficient matrices Ẽ and Ã. If the pencil (Ẽ, Ã) is regular, system (3.5) will have a solution
(see [11]). We need the following assumption for equation (3.5).
Assumption 3.1: Descriptor system (3.5) is regular and impulse free.
If Assumption 3.1 is satisfied then system (3.5) will be regular and impulse controllable.
For the two-player linear quadratic dynamic game, define two generalized differential Riccati
equations as follows:
̇
̇
with
with boundary condition
( ) (3.6)
The following theorem concerns the relationship between the existence of the dynamic
game solution and the generalized Riccati differential equations (3.6).
Theorem 3.1: The two-player linear quadratic dynamic game (2.1), (2.2) has, for
every consistent initial state, a Nash equilibrium if the set of differential Riccati equations
(3.6) has a set of solutions on [0, T].
Moreover the optimal feedback Nash equilibrium is given by
( ) ( ) ( ), ( ) ( ) ( ),
where x(t) is a solution of the closed loop system
̇( ) ( ( ) ( )) ( ) ( )
AT K1 x(t ) Q1 x(t ) ,
and by the same reasoning we get
ET 2 (t ) AT K2 x(t ) Q2 x(t ) .
These two equations have solutions.
For the two-player infinite-time linear quadratic dynamic game the players satisfy system (1).
The objective functions to be minimized are of the form
J_i(u_1, u_2) = \frac{1}{2}\int_0^{\infty} \Big( x^T(t) Q_i x(t) + \sum_{j=1}^{2} u_j^T(t) R_{ij} u_j(t) \Big)\, dt, \quad i = 1, 2,  (3.7)
with all matrices symmetric. Furthermore, the state weighting matrices are positive semidefinite
and the control weighting matrices are positive definite.
The generalized algebraic Riccati equations for the two-player infinite-time problem that are
related to the Nash equilibrium are
(3.8)
with .
It can be proved that Theorem 3.1 is also satisfied for the optimal control in the infinite-
time problem; therefore the optimal Nash strategies have the same form,
with constant matrices that solve (3.8).
To derive the necessary conditions for the optimal solution we need the Hamiltonian function
for the follower:
( ) ( ) ( ),
With the Lagrange multiplier method as in [11], the necessary conditions for the follower's
objective function to be optimal are
, , ̇ i=1,2. (4.1)
From the first equation of (4.1) we get
̇ ( ) ( ) ( ). (4.2)
From the second equation of (4.1) we get that the optimal control for the follower is
. (4.3)
The boundary conditions are
( ) and ( ) ( ). (4.4)
For the second player as the leader define Hamiltonian
( ) ( ) ( ). (4.5)
Taking the derivative of (4.5), we get
. (4.6)
Taking the derivative of (4.5), we also get
̇ . (4.7)
With the second player as the leader, let the Hamiltonian be
( ) ( ) ( )
( ),
or
( ) ( ) ( )
( ) ( ). (4.8)
Taking the derivative of (4.8) with respect to x, we get
̇ ( ) ( ) ( ) ( ). (4.9)
Taking the derivative of (4.8), we get
( ). (4.10)
Taking the derivative of (4.8), we get
. (4.11)
Taking the derivative of (4.8), we get
̇ . (4.12)
Substituting (4.9) into (4.12) we get
̇ . (4.13)
Substituting (4.3) into (4.13) we get
̇ . (4.14)
From the necessary conditions of the optimal Stackelberg solution we get that the optimal
strategies for the two-player dynamic game satisfy (4.2), (4.8) and (4.13) and the
boundary conditions (4.4), i = 1, 2.
We can write in matrix form and get
E 0 0 0 x A S1 S2 0 x
0 ET 0 0 0 A S21 S1
0 0 E T
0 1 Q1 0 AT 0 1
0 E T 2 Q2 AT 2
0 0 Q1 0
(4.15)
with boundary conditions (4.4). System (4.15) can be written in descriptor form with
coefficient matrices Ẽ and Ã. If the pencil (Ẽ, Ã) is regular, system (4.15) will have a solution.
We need the following assumption for equation (4.15).
Assumption 4.1: Descriptor system (4.15) is regular and impulse free.
If Assumption 4.1 is satisfied then system (4.15) will be regular and impulse controllable.
For the two-player Stackelberg linear quadratic dynamic game, define the generalized differential
Riccati equations as follows:
̇
̇
̇
with .
with boundary condition
( ) , ( ) . (4.16)
The following theorem concerns the relationship between the existence of the dynamic
game solution and the generalized Riccati differential equations (4.16).
Theorem 4.1: The two-player linear quadratic dynamic game (2.1), (2.2) has, for
every consistent initial state, a Stackelberg equilibrium if the set of differential Riccati
equations (4.16) has a set of solutions on [0, T].
Moreover, the optimal Stackelberg equilibrium is given by
( ) ( ) ( ), ( ) ( ) ( )
where x(t) is a solution of the closed loop system
̇( ) ( ( ) ( )) ( ) ( ) .
The generalized algebraic Riccati equations for the two-player infinite-time problem that are
related to the Stackelberg equilibrium are
With . (4.17)
5. NUMERICAL EXAMPLE
We will give numerical example to find optimal Nash solution of game by try to find ARE
solution. Consider system
\begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} \dot{x}_1 \\ \dot{x}_2 \end{bmatrix}
= \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}
+ \begin{bmatrix} 0 \\ 1 \end{bmatrix} u_1 + \begin{bmatrix} 1 \\ 0 \end{bmatrix} u_2,  (4.18)
x_1(0) = x_{10}, \quad x_2(0) = x_{20}.
For the cost function, we are given
Q_1 = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}, \quad Q_2 = \begin{bmatrix} 0 & 1 \\ 1 & 1 \end{bmatrix}, \quad R_1 = 1, \quad R_2 = 2, \quad R_{21} = 1.
We will find the solution of the generalized algebraic Riccati equation (4.18) to get the optimal
Nash equilibrium of the game.
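As a side illustration of ours (not part of the paper's computation), the example data can be set up symbolically and the regularity of the pencil (E, A) checked; the SymPy sketch below only verifies that det(sE − A) is not identically zero and does not solve the Riccati equations.

import sympy as sp

s = sp.symbols('s')

E  = sp.Matrix([[1, 0], [0, 0]])
A  = sp.Matrix([[0, 1], [1, 0]])
B1 = sp.Matrix([0, 1])
B2 = sp.Matrix([1, 0])
Q1 = sp.Matrix([[1, 0], [0, 1]])
Q2 = sp.Matrix([[0, 1], [1, 1]])
R1, R2, R21 = 1, 2, 1

# Regularity of the descriptor pair (E, A): det(sE - A) must not vanish identically.
char_poly = (s * E - A).det()
print(char_poly, char_poly.equals(0))   # prints -1 and False, so the pencil is regular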
Because , we have
K1 (1,1) 0
K1N ,
aK1 (1,1) 1 a (4.32)
Let K2(2,1)=c we can write
1 K1 (1,1) 0
K 2 N ,
c ab K1 (1,1)
(4.33)
where ( ) and ( ) can be found from (4.24) and (4.30). Solution for ( ),
( )are ( ) . Take ( ) we get ( ) .
Optimal Nash control gain for the players is given by
4 4
0 1 0
K1N 3 K2 N 3
a 1 a
4 ab 1 ab 4
4
3 3 18 3 .(4.34)
Now we will find the Stackelberg equilibrium of the game. Because we get
. From the first Riccati equation in (4.17) we get (4.19)-(4.22). From the second
Riccati equation in (4.17) we get
1
K2 (2,1) L2 (1,2) P(1,1) L2 (1,2) K1 (2,1) L2 (1,1) K2 (1,1) 0,
2 (4.35)
K2 (2,2) L2 (1,1) L2 (1,2) K1 (2,2) 1 0, (4.36)
K 2 (1,1) L2 (2,2) L2 (2,2) K1 (2,1) 1 0, (4.37)
P(2,2) 1 L2 (2,2) K1 (2,2) 0. (4.38)
From the third Riccati equation (4.17) we get
1
P(1,1) K1 (1,1) K1 (1,1) K1 (2,1) K 2 (1,1) K2 (2,1) 0,
2 (4.39)
P(1,1) P(2,2) K1 (2,2) K2 (2,2) 0, (4.40)
P(2,2) P(1,1) P(2,2) K1 (2,1) K1 (1,1) K1 (2,1) K2 (1,1) K2 (2,1) 0, (4.41)
K1 (2,2)( P(2,2) 2) K2 (2,2) 0. (4.42)
Take ( ) , we get from (4.22) ( ) . From (4.21) we get
( ) ( ) .
From (4.20) we get
( ) ( ).
Take ( ) From (4.38) and because ( ) we get
( ) .
From (4.42) we get
( ) ( ).
Take ( ) , from (4.36) we get
( ) ( ) .
( ) ( ) ( ) ( ). (4.44)
Therefore we can find ( ) from (4.44). Let ( ) . From (4.39) we can get
( ). Let ( ) .
Then the optimal Stackelberg strategies for the game are given by
( ), ( ).
( ) ( )
6. CONCLUDING REMARK
This paper considers two-player non-zero-sum linear quadratic dynamic games with descriptor
systems for the finite-horizon and infinite-horizon cases. Necessary conditions for the existence of a
Nash equilibrium and a Stackelberg equilibrium have been derived with the Hamiltonian method.
The paper also considers two coupled Riccati-type differential equations for the finite-horizon case and
algebraic Riccati equations for the infinite-horizon case, related to the Nash equilibrium and the
Stackelberg equilibrium.
References
[1] BASAR, T., AND OLSDER, G.J., Dynamic Noncooperative Game Theory, second Edition,
Academic Press, London, San Diego, 1995.
[2] DAI, L., Singular Control Systems, Springer Verlag, Berlin, 1989.
[3] ENGWERDA, J., On the Open-loop Nash Equilibrium in LQ-games, Journal of Economic
Dynamics and Control, Vol. 22, 729-762, 1998.
[4] ENGWERDA, J.C., LQ Dynamic Optimization and Differential Games, Chichester: John
Wiley & Sons, 229-260, 2005.
[5] KATAYAMA, T., AND MINAMINO, K., Linear Quadratic Regulator and Spectral
Factorization for Continuous Time Descriptor Systems, Proceedings of the 31st
Conference on Decision and Control, Tucson, Arizona, 967-972, 1992.
[6] LEWIS, F.L., A survey of Linear Singular Systems, Circuits System Signal Process,
vol.5, no.1, 3-36, 1986.
[7] MINAMINO, K., The Linear Quadratic Optimal Regulator and Spectral Factorization for
Descriptor Systems, Master Thesis, Department of Applied Mathematics and Physics,
Faculty of Engineering, Kyoto University, 1992.
[8] MUKAIDANI, H. AND XU, H., Nash Strategies for Large Scale Interconnected Systems,
43rd IEEE Conference on Decision and Control Bahamas, 2004, pp 4862-4867.
[9] SALMAH, BAMBANG, S., NABABAN, S.M., AND WAHYUNI, S., Non-Zero-Sum Linear
Quadratic Dynamic Game with Descriptor Systems, Proceeding Asian Control
Conference, Singapore, pp 1602-1607, 2002.
[10] SALMAH, N-Player Linear Quadratic Dynamic Game for Descriptor System,
Proceeding of International Conference Mathematics and its Applications SEAMS-
GMU, Gadjah Mada University, Yogyakarta, Indonesia, 2007.
[11] XU, H., AND MIZUKAMI, K., Linear Quadratic Zero-sum Differential Games for
Generalized State Space Systems, IEEE Transactions on Automatic Control, Vol. 39
No.1, January, 1994, 143-147,1994.
SALMAH
Department of Mathematics Gadjah Mada University
e-mail: [email protected]
Proceedings of ”The 6th SEAMS-UGM Conference 2011”
pp. 89–120
CHAOTIC DYNAMICS AND BIFURCATIONS IN IMPACT SYSTEMS
SERGEY KRYZHEVICH
INTRODUCTION
1. MATHEMATICAL MODEL
function of the parameter , and . We may suppose without loss of generality that
column vector, consisting of elements . Suppose that the system (1.1) is defined
for and the following Newtonian impact condition takes place as soon as the first
Condition 1. If then ,
.
Consider a vibro-impact system
(1.2)
open.
2) The set , corresponding to
6) The set is at most countable. All the limit points of this set belong to .
Since the solutions of the vibro-impact systems are discontinuous at impact instants, the
classical results on integral continuity are not applicable. Nevertheless, the following two
statements hold true.
smoothly depend on .
). The solution of the vibro-impact system (1.2) is also a solution of (1.1) over
the period .
for all , ,
, (2.1)
, , .
Here
Fix a small positive and consider the shift mapping for the system (1.2), given by
3. THE SEPARATRIX
Denote .
Lemma 3.1. There exists a neighborhood of zero such that if the parameters and θ are
small enough, the set is a surface of the dimension , which is the graph
(3.1)
Figure 2. The curve of initial data, corresponding to grazing and a near-grazing stretching of
a small square R in the phase space.
as . Then
Let . Denote the elements of the matrix by and ones of the matrix by
. Denote the columns of matrices and by and respectively, the strings of the
, defined by formulae
, .
Condition 3.
1) Either or the matrix does not have eigenvalues on the unit circle in ℂ.
2)
(4.1)
Condition 4.
1) Either or the matrix does not have eigenvalues on the unit circle in ℂ.
2)
(4.2)
From the geometrical point of view, the first items of Conditions 3 and
4 mean that the equilibria are hyperbolic saddles, and (4.1), (4.2) provide that the
corresponding stable and unstable manifolds intersect. This will be shown below.
Later on we shall suppose that Condition 3 is satisfied. Otherwise, we consider the mapping
instead of . Then the matrix is replaced with , and the condition (4.1) is
replaced with (4.2). A similar reasoning proves the statement of Theorem 1 in the
considered case (see the right part of Figure 4). All the proofs given below may be repeated
for this case.
Theorem 1. If Condition 2 and either Condition 3 or Condition 4 are satisfied, there exist
values and such that for all the mapping is chaotic in the
sense of Devaney, 1987. More precisely, there exist an integer m and a compact set
invariant with respect to and such that the following conditions are satisfied.
4) The set is transitive i.e. there exists a point such that the orbit
is dense in .
Remark. Similar results may be obtained for systems with any finite number of
grazings over the period.
5. GRAZING
Now we start to prove Theorem 1. Note that all the mappings , corresponding to the same
of the corresponding system, having an impact at the moment . Suppose that the
and the parameter are chosen so that there exists a neighborhood such that
smooth in the neighborhood of the point , let us estimate the Jacobi matrix .
Denote
.
Similarly, we define the values . Consider the Taylor formula for values
as functions of :
(5.1)
Here all functions, denoted by the letter with different indices, are smooth with respect
to all arguments except Denote
, ,
.
Denote
Clearly, . Then, similarly to the results of the paper Ivanov (1996), we obtain
(5.2)
Note that as .
. Here .
6. LYAPUNOV EXPONENTS
We check that the equilibrium point of the mapping is hyperbolic and estimate the
largest and the smallest absolute values of the eigenvalues of the matrix and
of small perturbations of this matrix. The mapping can be represented as the
tends to as .
The matrix
It follows from the form of this matrix, that the matrix has the eigenvalue
Lemma 6.1. If Condition 3 is satisfied, there exist positive constants and such that
for any matrix such that and any the matrix does
not have eigenvalues on the unit circle in ℂ.
PROOF: Suppose the statement of Lemma 6.1 is not true. Then there exists a sequence of
matrices and sequences such that all matrices have
and that
as .
). Let , ,
.
.
From the definition of eigenvalues and eigenvectors we obtain
(6.1)
corresponding to , we have
7. HOMOCLINIC POINT
The mapping is differentiable at the points of the set . Due to Lemma 6.1 the
eigenvalues of the Jacobi matrices are outside the unit circle, provided is small. Then, due to
the Perron theorem, in a small neighborhood of the point there exist the local stable
manifold and the unstable one of the mapping . Both of them are smooth
invariant sets of the mapping . The obtained sets consist, generally
speaking, of a countable number of connected components. Each of these components is
a partially smooth manifold.
PROOF: If Condition (4.1) is satisfied then, for small values of the parameter, the
two connected components. Denote one, which does not contain the point , by . The
for any .
The surface is not smooth in the neighborhood of the manifold . For the points
, where
independent. The vectors and lie at different half-spaces, separated by the hyperplane
parameter and the neighborhood may be chosen so that the surfaces and
intersect transversally. This proves the lemma for the considered case (see the left part of
Figure 4). □
The Smale-Birkhoff theorem (Smale, 1965) on the existence of a chaotic invariant set
in a neighborhood of a homoclinic point is not applicable in the considered case since the
mapping is discontinuous. However, similar techniques will help us to find a chaotic
8. SYMBOLICAL DYNAMICS
Consider the new smooth coordinates in the neighborhood of the point , such that the
following conditions are satisfied.
1) The point corresponds to .
stable and the unstable manifolds are given by the conditions and
respectively.
4) The direction of the tangent line to the axis, corresponding to the coordinate , taken at
the point coincide with one of the vector , and one, corresponding to the coordinate
are chosen so that and there exist natural numbers and such that
Figure 5. Domain .
components. One of them (let us denote it by ) contains the point . Another one,
is invariant with respect to the mapping , compact and nonempty, since it contains
the point . Moreover, the set does intersect neither with the inverse images
for any integer there is a neighborhood of the set such that the mapping
is smooth.
Lemma 8.1. For any k∈ ℕ, any set such that for any j=0,...,
PROOF: Consider an arbitrary arc , joining the parts and of the boundary of the set
Let us call such arcs admissible. Similarly to the well-known Palis lemma (λ-lemma), Palis &
di Melo, 1982, Chapter 2, Lemma 7.1, one may show that for small values of , and
the unit one. Particularly, this means that the set contains two admissible arcs
and . Fix the index . It follows from what is proved, that the inverse image of
any admissible arc , contains an admissible arc . Applying the same procedure
curve
.
such that for any . Due to Lemma 8.1 for any sequence one may
Therefore the set has the power of the continuum. The unit shift of the index to the left
corresponds to the mapping . The presence of this symbolic dynamics proves
9. EXAMPLE
Consider the following two-degree-of-freedom system
; . (9.1)
Assume that
(9.2)
correspond to the zeros of the first component of a solution . Consider the system
(9.4)
Note that the second component of a solution of System (9.1) (or one
. (9.5)
If Conditions (9.2) are satisfied, this equation has a unique periodic solution, which
has a single zero over the period . The general solution of Equation (9.5) is of the
form
Here and are arbitrary constants. All periodic solutions of System (9.4) with one impact
over the period correspond to values and such that there exists (the impact
moment) satisfying
(9.6)
.
Fixing values and consider (9.6) as a system of the variables and . Let
. Suppose
.
The first two lines of (9.6) may be rewritten as follows:
(9.7)
If , then and the component of the periodic solution has only one
zero (of the multiplicity 2) over the period. Otherwise the condition (9.8) uniquely defines the
value ϑ∈ . It follows from the third equation of (9.5) that
where
.
Substituting the obtained expression for to the first equation of (9.7), we obtain
, (9.9)
where
the only point of the period. If then for every fixed value of the couple
the equation (9.9) has two zeros. Both of them correspond to the branches of
periodical solutions, which continuously depend on and . These solutions have the only
This means that the condition (9.3) coincides in the considered case with
(4.1). Then the conditions of Theorem 1 are satisfied. Therefore, the vibro-impact system (9.4)
has a non-hyperbolic invariant set, which contains invariant subsets of the shift mapping,
described by the symbolic dynamics.
In this section we compare the main results of the current paper with experiments and simulations
made by other authors. As was noticed in the introduction, there are hundreds of papers
where numerical and experimental approaches were applied to study vibro-impact systems.
Here we do not try to give a review of all these results; we quote those of the papers Molenaar,
van de Water & de Wegerand, 2000 and Ing, Pavlovskaia, Wiercigroch, Banerjee, 2008 in
order to give a confirmation of the results of the current paper. Though both discussed models
describe single degree-of-freedom oscillations, it is impossible to avoid all oscillations in
other dimensions, so one still needs to consider a higher-dimensional model (and, consequently,
Theorem 1) to have a theoretical justification of the mentioned experimental results.
Let us start our mini-review with the paper Molenaar, van de Water & de Wegerand,
2000. There, an experiment with the mechanical system, depicted in Figure 6 and modeling
atomic force microscopy, is described.
The aim of the experiment is to construct bifurcation diagrams near grazing impact,
and to explore the geometric convergence of series of period-adding bifurcations. To reach
this goal, a precise control of the frequency and amplitude of the excitation is needed.
The experiment consists of a U-shaped, brass leaf spring that is excited horizontally
by means of a large electromagnetic exciter on which it is mounted. The beam has length 13
cm, width 2 cm, and is made of 0.2 mm thick material; its clamped ends are 2 cm apart. The
U shape suppresses undesired torsional motion of the beam. When the deflection of the beam
is large enough, a ceramic ball that is attached to the beam collides with a hardened steel plate
on the exciter. These materials are chosen such that the wear due to frequent impacts is
negligible, so that the distance between the stop and the equilibrium position of the beam is
constant. A problem in this experiment is the excitation of many higher harmonic modes upon
impact. To increase the damping of these, adhesive tape is glued on the inner side of the
spring and on the side that faces the exciter. The upper side of the spring is kept shiny for the
measurement of the deflection of the spring using a laser beam.
The period of non-excited oscillations of the spring is 41.41 ms. The near-impact
positions of the ceramic ball were fixed by the laser; the error does not exceed 0.3 mm. The
frequency of the exciting force (oscillations of the pendant) is assumed to be constant (around
Figure 7 shows how the grazing bifurcation, which takes place for
and transfers the periodic motion with one impact over the period to one
with two impacts, changes the phase portrait of the system. We see, instead of a stable
periodic solution, a strange attractor which persists up to at least .
Another case, corresponding to the excitation frequency equal to 21.97 Hz (Figure
8), is even more interesting. In this case there are at least two grazing bifurcations,
responsible for chaotic behavior (for and for ).
It was one of the conclusions of the quoted paper that robust chaotic oscillations may
appear in a neighborhood of grazing. This conclusion, justified by numerical simulations and
bifurcation analysis, shows that the result of Theorem 1 is applicable to real-life systems.
A similar problem has been analyzed by the research group in non-smooth
dynamics at Aberdeen University (see Ing, Pavlovskaia, Wiercigroch, Banerjee, 2008 and
Wiercigroch & Sin, 1998). The main aim of that paper was the modeling of piecewise-smooth
dynamical systems, particularly ones that appear in percussion drilling problems.
The experimental rig consists of a block of mild steel supported by parallel leaf
springs. These provide the primary stiffness while preventing the mass from rotation. The
secondary stiffness consists of a beam, mounted on a separate column, which prevents large
displacement in the positive vertical direction. Contact between the two is controlled via an
adjustable bolt mounted on the beam. Harmonic excitation of the system via the base is then
generated by an electromagnetic shaker. It is assumed that there is no coupling between the
oscillator and the shaker, due to the large mass ratio in favour of the shaker.
Measurements are recorded using an eddy current displacement probe to monitor the
displacement of the mass, and accelerometers to measure the base and mass accelerations. 100
Hz low-pass filtering is performed on the pre-amplified accelerometer signals. The time
history is then plotted, and the Savitzky-Golay algorithm is used for polynomial smoothing of
the data. As a by-product of this operation the first derivative is available, which enables direct
plotting of the phase portrait.
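For illustration only (this is not the authors' processing code), the smoothing-plus-derivative step can be reproduced with SciPy's Savitzky-Golay filter; the sampling rate, window length, polynomial order and the synthetic signal below are placeholders.

import numpy as np
from scipy.signal import savgol_filter

fs = 1000.0                      # assumed sampling rate in Hz (placeholder)
t = np.arange(0, 2.0, 1.0 / fs)
x = np.sin(2 * np.pi * 9.0 * t) + 0.05 * np.random.randn(t.size)  # stand-in displacement signal

# Polynomial smoothing of the displacement and its first derivative in one pass.
x_smooth = savgol_filter(x, window_length=51, polyorder=3)
x_dot = savgol_filter(x, window_length=51, polyorder=3, deriv=1, delta=1.0 / fs)

# (x_smooth, x_dot) can now be plotted against each other as a phase portrait.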
Bifurcations are monitored as a function of frequency by incrementing the frequency
a small amount, allowing transients to decay, and then using the base acceleration signal to
construct an appropriate Poincaré stroboscopic map. The system was driven from a
nonimpacting to an impacting response, and care was taken to follow each attractor for as
long as it remained stable, in order to capture bifurcation phenomena in the experimental
system which could be compared to the model.
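A hedged sketch of that bookkeeping, under the assumption that some simulator or data source is available: for each forcing frequency a transient is discarded and the response is then sampled once per forcing period to form the stroboscopic Poincaré section. The function simulate_response is a hypothetical stand-in, not something from the paper.

import numpy as np

def stroboscopic_section(simulate_response, freq, n_transient=200, n_sample=100):
    """Sample the displacement once per forcing period after transients have decayed.

    simulate_response(freq, t) is assumed to return the displacement at times t.
    """
    period = 1.0 / freq
    t = (n_transient + np.arange(n_sample)) * period   # one sample per forcing period
    return simulate_response(freq, t)

def bifurcation_diagram(simulate_response, freqs):
    # One column of stroboscopic samples per excitation frequency.
    return {f: stroboscopic_section(simulate_response, f) for f in freqs}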
The experimental system was designed so that it could be described by a simple
mathematical model, shown in Figure 9. The primary system is described very well by a
linear oscillator. The effects of inertia of the secondary beam, and additional damping during
the contact phase, are expected to be small in comparison to that of the main oscillator, and
are neglected. The beam is considered to provide stiffness support only.
Figure 9. Physical model of the oscillator.
The experimental results are shown in Figure 10: bifurcation diagrams (a) recorded
experimentally for the mass displacement under varying frequency.
These results also confirm the main idea of Theorem 1 that the chaotic oscillations
may appear in a neighborhood of grazing.
CONCLUSION
The structure of invariant manifolds is studied for near-grazing periodic solutions of vibro-
impact systems. The presence of homoclinic points has been established for vibro-impact
systems satisfying some general conditions. This provides the existence of a Devaney
chaotic invariant set.
Comparing the results of the current paper with those on the existence of stochastic chaos
or Li-Yorke chaos, one may say that the chaotic invariant set obtained in this paper is always
structurally stable, i.e., it persists if the parameters of the system are slightly perturbed.
Comparing the results of the paper with the single degree of freedom case studied in
Kryzhevich, 2008, we needed new techniques, such as Lemma 6.1, to estimate Lyapunov
exponents (in the single degree of freedom case they are automatically nonzero). The technique to find a
homoclinic point (Lemma 7.1) is also quite different in the case of many degrees of freedom.
Nevertheless, the result on single degree of freedom systems can still be generalized.
ACKNOWLEDGEMENTS
This work was supported by the UK Royal Society (joint research project with Aberdeen
University), by the Russian Federal Program "Scientific and pedagogical cadres", grant no.
2010-1.1-111-128-033 and by the Chebyshev Laboratory (Department of Mathematics and
Mechanics, Saint-Petersburg State University) under the grant 11.G34.31.0026 of the
Government of the Russian Federation.
References
[1] AKHMET, M. U. (2009). Li-Yorke chaos in systems with impacts. Journal of Mathematical
Analysis & Applications, 351(2), 804-810, 2009.
[2] BERNARDO, M., BUDD C. J., CHAMPNEYS A. R. AND KOWALCZYK P., Bifurcations and
Chaos in Piecewise-Smooth Dynamical Systems: Theory and Applications. New York,
Springer, 2007.
[3] BABITSKY, V. I. (1998) Theory of Vibro-Impact Systems and Applications, Berlin,
Germany, Springer.
[4] BANERJEE, S., YORKE, J. A. AND GREBOGI, C. (1998) Robust chaos. Physical Review
Letters, 80(14), 3049-3052.
[5] BUDD, C. , GRAZING IN IMPACT OSCILLATORS. IN: BRANNER B. AND HJORTH P. (Ed.) Real
and Complex Dynamical Systems (pp. 47-64). Kluwer Academic Publishers, 1995.
[6] BUNIMOVICH L.A., PESIN YA. G., SINAI YA. G. AND JACOBSON M. V., Ergodic theory of
smooth dynamical systems. Modern problems of mathematics. Fundamental trends, 2, 113-
231, 1985.
[7] CHERNOV N., MARKARIAN R. , Introduction to the ergodic theory of chaotic billiards.
IMCA, Lima, 2001.
[9] CHIN W., OTT E., NUSSE, H. E. AND GREBOGI, C., Universal behavior of impact oscillators
near grazing incidence. Physics Letters A 201(2), 197-204, 1995.
[10] DEVANEY, R. L., An Introduction to Chaotic Dynamical Systems. Redwood City, CA:
Addison-Wesley, 1987.
[12] GORBIKOV, S. P. AND MEN'SHENINA, A. V., Statistical description of the limiting set for
chaotic motion of the vibro-impact system. Automation & remote control, 68(10), 1794-1800,
2007.
[13] HOLMES, P. J., The dynamics of repeated impacts with a sinusoidally vibrating table.
Journal of Sound and Vibration 84(2), 173-189, 1982.
[14] ING, J., PAVLOVSKAIA E., WIERCIGROCH M, BANERJEE S., Experimental study of an
impact oscillator with a one-side elastic constraint near grazing. Physica D 239, 312-321,
2008.
[15] Ivanov, A. P., Bifurcations in impact systems. Chaos, Solitons & Fractals 7(10), 1615-
1634, 1996.
[16] KOZLOV, V. V. AND TRESCHEV, D. V., Billiards. A Genetic Introduction to the Dynamics
of Systems with Impacts. Translations of mathematical Monographs, 89. Providence, RI:
American Mathematical Society, 1991.
[19] MOLENAAR J, VAN DE WATER W. AND DE WEGERAND J., Grazing impact oscillations.
Physical Review E, 62(2), 2030-2041, 2000.
[21] PALIS, J., DI MELO, W., Geometric Theory of Dynamical Systems. Springer-Verlag, 1982.
[23] SMALE, S., Diffeomorphisms with many periodic points. Differential & Combinatorial.
Topology. Princeton: University Press, 63-81, 1965.
[24] THOMSON, J. M. T., GHAFFARI, R., Chaotic dynamics of an impact oscillator. Physical
Review A, 27(3), 1741-1743, 1983.
[25] WIERCIGROCH, M., SIN, V.W.T., Experimental study of a symmetrical piecewise base-
excited oscillator, J. Appl. Mech., 65, 657-663, 1998.
CONTRIBUTION OF FUZZY SYSTEMS FOR TIME SERIES ANALYSIS
SUBANAR AND AGUS MAMAN ABADI
Abstract. A time series is a realization or sample function from a certain stochastic process. The
main goals of the analysis of time series are forecasting, modeling and characterizing. Conventional
time series models, i.e., autoregressive (AR), moving average (MA), and hybrid AR and MA (ARMA)
models, assume that the time series is stationary. Other methods to model time series are soft
computing techniques, which include fuzzy systems, neural networks, genetic algorithms and hybrids.
These techniques have been used to model the complexity of relationships in nonlinear time series
because they are universal approximators capable of approximating any real continuous function on
a compact set to any degree of accuracy. As universal approximators, fuzzy systems have the
capability to model nonstationary time series. Not all kinds of series data can be analyzed by
conventional time series methods. Song and Chissom [19] introduced fuzzy time series as a dynamic
process with linguistic values as its observations. Techniques to model fuzzy time series data are
based on fuzzy systems. In this paper, we apply a fuzzy model to forecast the interest rate of Bank
Indonesia certificates; it gives better prediction accuracy than other fuzzy time series methods
and conventional statistical methods (AR and ECM).
Keywords and Phrases : soft computing, fuzzy systems, time series, fuzzy time series, fuzzy relation
1. INTRODUCTION
Soft computing deals with imprecision, uncertainty, partial truth, and approximation to
achieve tractability, robustness and low solution cost. Soft computing techniques include
fuzzy systems, neural networks, genetic algorithms and hybrids (Zadeh, [25]). These
techniques have been used to model the complexity of relationships in nonlinear time series
because they are universal approximators capable of approximating any real continuous
function on a compact set to any degree of accuracy (Wang, [22]). Soft computing
techniques, as universal approximators, make no assumptions about the structure of the data.
Fuzzy systems are systems combining a fuzzifier, fuzzy rule bases, a fuzzy inference
engine and a defuzzifier (Wang, [22]). These systems have the advantage that the developed models
are characterized by linguistic interpretability and the generated rules can be understood,
verified and extended. As universal approximators, fuzzy systems have the capability to model
nonstationary time series, and the effect of data pre-processing on the forecast performance
has been studied (Zhang, et al., [26]; Zhang & Qi, [27]). Studies on data pre-processing using soft
computing methods have been carried out. Popoola [16] analyzed the effect of data pre-processing on the
forecast performance of subtractive clustering fuzzy systems. Then, Popoola [16]
developed a fuzzy model for time series using a wavelet-based pre-processing method. Wang
[23] and Tseng, et al. [20] applied fuzzy models to analyze financial time series data.
Not all kinds of series data can be analyzed by conventional time series methods. Song
and Chissom [19] introduced fuzzy time series as a dynamic process with linguistic values as
its observations. Techniques to model fuzzy time series data are based on fuzzy systems.
Some researchers have developed fuzzy time series models. Hwang et al. [12] used data
variations for modeling, and Huarng [11] constructed a fuzzy time series model by determining
effective interval lengths. Then, Sah and Degtiarev [18] and Chen and Hsu [8] established
first-order fuzzy time series models. Lee, et al. [14] and Jilani et al. [13] developed high-order
fuzzy time series models. Abadi, et al. ([1], [2], [3], [4]) developed a fuzzy model for fuzzy time series data that
optimizes the fuzzy relations. This method was applied to forecast the interest rate of Bank
Indonesia certificates and gave better prediction accuracy than other fuzzy time series
methods and conventional statistical methods (AR and ECM).
The rest of this paper is organized as follows. In Section 2, we briefly review
conventional time series models. In Section 3, we introduce fuzzy systems and their properties. In
Section 4, the construction of a fuzzy model for time series data using the table lookup scheme
(Wang's method) is introduced. Optimization of the fuzzy model for time series data is discussed
in Section 5. We also give an example of the application of fuzzy systems to forecasting the interest
rate of Bank Indonesia Certificates based on time series data in Section 6. Finally, some
conclusions are given in Section 7.
2. MATHEMATICAL MODEL
A time series can be expressed as {X_t : t = 1, 2, ..., N}, where t is the time index and N is the total
number of observations.
The main goals of the analysis of time series are forecasting, modeling and
characterizing. Conventional statistical models for time series analysis can be classified into
linear models and non-linear models. Linear models are autoregressive (AR), moving average
(MA), and hybrid AR and MA (ARMA) models. The linear models assume that the underlying
data generation process is time invariant. The orders of simple autoregressive and moving
average models can be determined by the autocorrelation function (ACF) and partial
autocorrelation function (PACF) plots of the time series data, and the models can be identified
from those functions (Makridakis, et al., [15]). Box and Jenkins [6] introduced a model
combining both AR and MA models called the ARMA model. An ARMA model with order
(p, q) is expressed as ARMA(p, q), where p is the order of the autoregressive (AR) part and q is
the order of the moving average (MA) part. The models assume that the time series data is stationary. If the time
series data is nonstationary, then the modified model, the integrated ARMA or ARIMA model, is
used to generate a model (Chatfield, [7]). If the dependence is nonlinear and the variance of a
time series increases with time, i.e., the time series is heteroskedastic, then the series is
modeled by the autoregressive conditional heteroskedastic (ARCH) model (Engle, [9]).
Bollerslev [5] introduced the generalization of the ARCH model called the generalized ARCH
(GARCH) model.
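As a side note from us (not part of the paper), this identification-and-fitting workflow is available in standard libraries; the sketch below uses statsmodels, assuming it is installed, with a synthetic series and an illustrative ARIMA(1, 1, 1) order rather than one identified from real data.

import numpy as np
from statsmodels.tsa.arima.model import ARIMA
from statsmodels.graphics.tsaplots import plot_acf, plot_pacf

# Stand-in nonstationary series: a random walk with drift.
rng = np.random.default_rng(0)
y = np.cumsum(0.1 + rng.normal(size=300))

# ACF/PACF plots of the differenced series help choose the AR and MA orders.
plot_acf(np.diff(y))
plot_pacf(np.diff(y))

# Fit an ARIMA(p, d, q) model; the order here is illustrative only.
result = ARIMA(y, order=(1, 1, 1)).fit()
print(result.summary())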
3. FUZZY SYSTEMS
In this section, we introduce some basic definitions and properties of fuzzy systems.
Definition 3.1 (Zimmermann, [28]) Let U be a universal set. A fuzzy set A in the universal set U is a set characterized by a membership function \mu_A : U \to [0, 1].
Definition 3.2 (Wang, [22]) A fuzzy relation Q in U_1 \times U_2 \times \cdots \times U_n is defined as a fuzzy set in the product space U_1 \times U_2 \times \cdots \times U_n.
Based on the definition of fuzzy relation, the concept of compositions of fuzzy relation can be
generated.
where A_i^l and B^l are fuzzy sets in U_i and V respectively, and x = (x_1, x_2, \ldots, x_n) \in U, y \in V.
The fuzzy rule (3.1) can be represented by the fuzzy relation Ru^{(l)} = A_1^l \times \cdots \times A_n^l \times B^l
in U \times V, whose membership function is defined by
\mu_{Ru^{(l)}}(x_1, x_2, \ldots, x_n, y) = \mu_{A_1^l}(x_1) \cdots \mu_{A_n^l}(x_n) \cdot \mu_{B^l}(y).
A fuzzy system is a system combining a fuzzifier, fuzzy rule bases, a fuzzy inference
engine and a defuzzifier. In the fuzzy inference engine, fuzzy logic principles are used to combine the
fuzzy rules in the fuzzy rule bases into a mapping from a fuzzy set A in U to a fuzzy set B in V. In
applications, if the input and output of the fuzzy system are real numbers, then a fuzzifier and a
defuzzifier are needed. A fuzzifier is defined as a mapping from U to a fuzzy set A that maps x* \in U to the fuzzy set A in U.
There are three kinds of fuzzifiers, i.e., the singleton fuzzifier, the Gaussian fuzzifier and the triangular
fuzzifier. A defuzzifier is defined as a mapping from a fuzzy set B in V, the output of the
fuzzy inference engine, to a real number y* \in V. There are three kinds of defuzzifiers, i.e., center
of gravity, center average and maximum.
Definition 3.4 (Wang, [22]) Let A' be a fuzzy set in U. A fuzzy inference engine based on
individual-rule inference with union combination, Mamdani's product implication, algebraic
product for all t-norm operators, and maximum for all s-norm operators, gives as output the fuzzy set
B' in V whose membership function is
\mu_{B'}(y) = \max_{l=1}^{K} \sup_{x \in U} \Big( \mu_{A'}(x) \prod_{i=1}^{n} \mu_{A_i^l}(x_i)\, \mu_{B^l}(y) \Big).  (3.2)
If the fuzzy sets B^l are normal with centers y^l, then the fuzzy system using Mamdani
implication, this fuzzy inference engine, the singleton fuzzifier and the center average defuzzifier is
f(x) = \frac{\sum_{l=1}^{K} y^l \prod_{i=1}^{n} \mu_{A_i^l}(x_i)}{\sum_{l=1}^{K} \prod_{i=1}^{n} \mu_{A_i^l}(x_i)},  (3.3)
with input x \in U \subset R^n and output f(x) \in V \subset R.
The advantage of the fuzzy system (3.3) is that the computation of the system is simple.
The fuzzy system (3.3) is a nonlinear mapping that maps x \in U \subset R^n to f(x) \in V \subset R.
Different membership functions of A_i^l and B^l give different fuzzy systems. If the
membership functions of A_i^l and B^l are Gaussian, then the fuzzy system (3.3) becomes
f(x) = \frac{\sum_{l=1}^{M} y^l \prod_{i=1}^{n} a_i^l \exp\big(-\big(\frac{x_i - x_i^l}{\sigma_i^l}\big)^2\big)}{\sum_{l=1}^{M} \prod_{i=1}^{n} a_i^l \exp\big(-\big(\frac{x_i - x_i^l}{\sigma_i^l}\big)^2\big)},  (3.4)
where a_i^l \in (0, 1], \sigma_i^l \in (0, \infty), x_i^l, y^l \in R.
Theorem 3.5 (Wang, [22]) Let U be a compact set in R^n. For any real continuous function
g(x) on U and for every \epsilon > 0, there exists a fuzzy system f(x) in the form of (3.4) such that
\sup_{x \in U} |f(x) - g(x)| < \epsilon.
Based on Theorem 3.5, fuzzy systems can be used to approximate any real
continuous function on a compact set with any degree of accuracy. In applications, not all
values of the function are known, so it is necessary to construct the fuzzy system based on sample
data (x_0^l, y_0^l), l = 1, 2, \ldots, N. If in (3.4) we choose a_i^l = 1, \sigma_i^l = \sigma and x_i^l = x_{0i}^l, then the fuzzy system (3.4) has the
following property.
Theorem 3.6 (Wang, [22]) For arbitrary \epsilon > 0, there exists \sigma^* > 0 such that the fuzzy system
(3.4) with \sigma = \sigma^* has the property that |f(x_0^l) - y_0^l| < \epsilon, for all l = 1, 2, \ldots, N.
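To make (3.4) concrete, here is a small sketch of ours (not from the paper) of the Gaussian fuzzy system as a NumPy function, with rule centers placed at sample points in the spirit of Theorem 3.6; all numbers and names are illustrative.

import numpy as np

def gaussian_fuzzy_system(x, centers, y_bar, sigma, a=1.0):
    """Evaluate the fuzzy system (3.4) at a single input vector x.

    centers : (M, n) rule centers x_i^l;  y_bar : (M,) rule output centers;
    sigma   : width of the Gaussian membership functions (scalar here).
    """
    x = np.atleast_1d(x)
    # Firing strength of each rule: product of Gaussian memberships over the inputs.
    weights = np.prod(a * np.exp(-((x - centers) / sigma) ** 2), axis=1)
    return np.dot(weights, y_bar) / np.sum(weights)

# Rule centers taken at three sample points of a one-dimensional function.
centers = np.array([[0.0], [1.0], [2.0]])
y_bar = np.array([0.0, 1.0, 4.0])           # sample outputs, e.g. g(x) = x^2
for sigma in (1.0, 0.1):
    print(sigma, gaussian_fuzzy_system([1.0], centers, y_bar, sigma))
# As sigma shrinks, f(x_0^l) approaches y_0^l, in the spirit of Theorem 3.6.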
In this section, the construction of a fuzzy model for time series data using the table lookup
scheme is introduced. Suppose we are given the following N training
data: (x_{1p}(t-1), x_{2p}(t-1), \ldots, x_{mp}(t-1); x_{1p}(t)), p = 1, 2, 3, \ldots, N. The construction of fuzzy relations for
modeling time series data from the training data based on the table lookup scheme is presented as
follows:
Step 1. Define the universal sets for the main factor and the secondary factors. Let U
be the universal set for the main factor, with x_{1p}(t-1), x_{1p}(t) \in U, and let V_i, i = 2, 3, \ldots, m,
be the universal sets for the secondary factors.
Step 2. Define fuzzy sets on the universal sets. Let A_{1,k}(t-i), \ldots, A_{N_k,k}(t-i) be N_k fuzzy sets in the
time series F_k(t-i). The fuzzy sets are continuous, normal and complete in the corresponding universal set, i = 0, 1, k = 1, 2, 3, \ldots, m.
Step 3. Set up fuzzy relations using the training data. From this step we have the following M collections of fuzzy relations designed from the training data:
(A_{j_1^*,1}^l(t−1), A_{j_2^*,2}^l(t−1), …, A_{j_m^*,m}^l(t−1)) → A_{i_1^*,1}^l(t),  l = 1, 2, 3, …, M.   (4.1)
Step 4. Determine the membership function for each fuzzy relation obtained in Step 3. The fuzzy relation (4.1) can be viewed as a fuzzy relation on U × V with U = U_1 × ⋯ × U_m ⊂ R^m, V ⊂ R, and the membership function of the fuzzy relation is
μ_{R^l} = μ_{A_{j_1^*,1}(t−1)}(x_{1p}(t−1)) μ_{A_{j_2^*,2}(t−1)}(x_{2p}(t−1)) ⋯ μ_{A_{j_m^*,m}(t−1)}(x_{mp}(t−1)) μ_{A_{i_1^*,1}(t)}(x_{1p}(t)).
Step 5. For a given fuzzy set input A(t−1) in the input space U, establish the fuzzy set output A^l(t) with
μ_{A^l(t)}(x_1(t)) = sup_{x∈U} ( μ_A(x(t−1)) μ_{R^l}(x(t−1); x_1(t)) ), where x(t−1) = (x_1(t−1), …, x_m(t−1)).
Step 6. Find the fuzzy set A(t) as the combination of the M fuzzy sets A^1(t), A^2(t), A^3(t), …, A^M(t), defined by
μ_{A(t)}(x_1(t)) = max( μ_{A^1(t)}(x_1(t)), …, μ_{A^M(t)}(x_1(t)) )
= max_{l=1,…,M} ( sup_{x∈U} ( μ_A(x(t−1)) μ_{R^l}(x(t−1); x_1(t)) ) )
= max_{l=1,…,M} ( sup_{x∈U} ( μ_A(x(t−1)) ∏_{f=1}^m μ_{A_{i_f,f}(t−1)}(x_f(t−1)) μ_{A_{i_1,1}(t)}(x_1(t)) ) ).
Step 7. Calculate the forecasting outputs. Based on the Step 6, if fuzzy set input A(t 1) is
Step 8. Defuzzify the output of the model. If the desired output of the model is a fuzzy set, then we stop at Step 7; this step is used when a real-valued output is wanted. If the fuzzy set input A(t−1) is given with the Gaussian membership function
μ_{A(t−1)}(x(t−1)) = exp( −Σ_{i=1}^m (x_i(t−1) − x_i^*(t−1))² / a_i² ),
then
μ_B(y) = max_{l=1,…,K} [ exp( −Σ_{i=1}^m (x_i^*(t−1) − x_i^l(t−1))² / (a_i² + (σ_i^l)²) ) μ_{B^l}(y) ]   (4.3)
with y ∈ [α_1, β_1]. If a real input (x_1(t−1), …, x_m(t−1)) is given, then the forecasting real output is obtained by defuzzifying (4.3), which yields the forecasting model (4.4).
5.1 Selection of Input Variables. Given M fuzzy relations, where the r-th fuzzy relation is expressed by: IF x_1 is A_1^r and x_2 is A_2^r and … and x_n is A_n^r, THEN y is B^r with weight w^r, and μ_{A_i^r}(x_i) = exp( −(x_i − x̄_i^r)²/(σ_i^r)² ), Saez and Cipriano [17] defined a sensitivity measure that depends on the input variable and whose computation is based on the training data. Thus, for each variable x_i we compute I_i as the sum of the squared mean and the squared standard deviation of the sensitivity of variable x_i. Then, the input variable with the smallest value I_i is discarded. Based on this procedure, to choose the important input variables, we take the variables having the biggest values I_i.
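Once the sensitivities have been evaluated on the training data (for example by numerically differentiating the fuzzy model with respect to each input, which is an assumption here), the ranking rule above reduces to a few lines:

import numpy as np

def rank_inputs_by_sensitivity(sens):
    """sens : (N, n) array, sens[p, i] = sensitivity of the model output
    with respect to input variable x_i at the p-th training datum.
    Returns variable indices sorted from most to least important."""
    mean = sens.mean(axis=0)
    std = sens.std(axis=0)
    importance = mean ** 2 + std ** 2      # I_i = (mean of sensitivity)^2 + (std of sensitivity)^2
    return np.argsort(importance)[::-1], importance

# usage: keep, say, the two most important lags
# order, I = rank_inputs_by_sensitivity(sensitivity_matrix)
# selected = order[:2]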
5.2 Construction of Complete Fuzzy Relations Using the Method of Degree of Fuzzy Relation. In modeling time series, if there is only a small number of training data, then the resulting fuzzy relations may not cover all values in the input domain. So in this paper, a procedure to construct complete fuzzy relations is introduced. Given the following N input-output training data: (x_{1p}, x_{2p}, …, x_{np}; y_p), p = 1, 2, …, N, with x_{ip} ∈ [α_i, β_i], i = 1, 2, …, n. The method to design complete fuzzy relations is given by the following steps:
Step 1. Define fuzzy sets to cover the input and output spaces. For each space [α_i, β_i], i = 1, 2, …, n, define N_i fuzzy sets A_i^j, j = 1, 2, …, N_i, which are complete and normal in [α_i, β_i]. Similarly, define N_y fuzzy sets B^j, j = 1, 2, …, N_y, on the output space. Each antecedent has the form: x_1 is A_1^{j_1} and x_2 is A_2^{j_2} and … and x_n is A_n^{j_n}. The consequence of each antecedent is chosen based on the training data as follows: for any training data (x_{1p}, x_{2p}, …, x_{np}; y_p) and for any fuzzy set B^j, choose B^{j*} such that μ_{A_1^{j_1}}(x_{1p}) ⋯ μ_{A_n^{j_n}}(x_{np}) μ_{B^{j*}}(y_p) ≥ μ_{A_1^{j_1}}(x_{1p}) ⋯ μ_{A_n^{j_n}}(x_{np}) μ_{B^j}(y_p); if several such B^{j*} exist, then choose one of them. From this step, we have the fuzzy relations in the form:
IF x_1 is A_1^{j_1} and x_2 is A_2^{j_2} and … and x_n is A_n^{j_n}, THEN y is B^{j*}.
So, if this process is continued for every antecedent, there are ∏_{i=1}^n N_i complete fuzzy relations.
Theorem 5.1 If A is the set of fuzzy relations constructed by Wang's method and B is the set of fuzzy relations generated by the method of degree of fuzzy relation, then A ⊆ B.
Based on Theorem 5.1, the method of degree of fuzzy relation is a generalization of Wang's method.
5.3 Reduction of Fuzzy Relations Using the Singular Value Decomposition Method. If the number of training data is large, then the number of fuzzy relations may be large too, and increasing the number of fuzzy relations adds to the complexity of the computations. To overcome this, we apply the singular value decomposition method (Yen et al. [24]). The reduction of fuzzy relations is done by the following steps, referring to Abadi et al. [4]:
Step 1. Set up the firing strength of each fuzzy relation for each training datum (x; y) = (x_1(t−1), x_2(t−1), …, x_m(t−1); x_1(t)) as follows:
L^l(x; y) = ( ∏_{f=1}^m μ_{A_{i_f,f}(t−1)}(x_f(t−1)) μ_{A_{i_l,1}(t)}(x_1(t)) ) / ( Σ_{k=1}^M ∏_{f=1}^m μ_{A_{i_f,f}(t−1)}(x_f(t−1)) μ_{A_{i_k,1}(t)}(x_1(t)) ).
Step 2. Construct N x M firing strength matrix L (Lij ) where Lij is firing strength of j-th
fuzzy relation for i-th datum, i = 1, 2, …, N, j = 1, 2, …, M.
Step 3. Compute singular value decomposition of L as L USV T .
Step 4. Determine the biggest r singular values, with r ≤ rank(L).
Step 5. Partition V as V = ( V11  V12 ; V21  V22 ), where V11 is an r × r matrix and V21 is an (M − r) × r matrix, and construct V1^T = (V11^T, V21^T).
Step 6. Apply the QR-factorization to V1^T and find an M × M permutation matrix P such that V1^T P = QR, where Q is an r × r orthogonal matrix, R = [R11, R12], and R11 is an r × r upper triangular matrix.
Step 7. The positions of the ones in the first r columns of the matrix P indicate the positions of the r most important fuzzy relations.
Step 8. Construct time series forecasting model (4.3) or (4.4) using the r most important
fuzzy relations.
Step 9. If the model is optimal, then stop. If it is not yet optimal, then go to Step 4.
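Steps 3–7 are a standard subset-selection computation; a minimal numpy/scipy sketch (assuming the firing-strength matrix L has already been assembled as in Steps 1–2) is:

import numpy as np
from scipy.linalg import qr

def select_important_relations(L, r):
    """L : (N, M) firing-strength matrix, r <= rank(L).
    Returns the column indices of the r most important fuzzy relations."""
    U, S, Vt = np.linalg.svd(L, full_matrices=False)   # Step 3: L = U S V^T
    V = Vt.T
    V1 = V[:, :r]                                      # Step 5: first r right singular vectors
    # Step 6: QR factorization with column pivoting on V1^T; the permutation plays the role of P
    Q, R, piv = qr(V1.T, pivoting=True)
    return piv[:r]                                     # Step 7: positions of the r key relations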
In this section, singular value decomposition method is applied to forecast interest rate of
Bank Indonesia Certificate (BIC) based on time series data. First, the method of sensitivity
input is applied to select input variables. Second, singular value decomposition method is
applied to select the optimal fuzzy relations. The initial fuzzy model with 8 input variables
(x(k-8), x(k-7), …, x(k-1)) from data of interest rate of BIC will be considered. The universal
set of 8 inputs and 1 output is [10, 40] and 7 fuzzy sets A1 , A2 ,..., A7 are defined on each
universal set of input and output with Gaussian membership function. Then the procedure in
Section 5.1 is applied to find the significant inputs. The distribution of the sensitivity measures I_i of the input variables is shown in Figure 1(a). We consider both the two and the three input variables with the biggest values I_i. Selecting the two and the three input variables with the biggest sensitivities gives the input variables x(k-8), x(k-1) and x(k-8), x(k-3), x(k-1), respectively.
Then time series model constructed by two input variables x(k-8) and x(k-1) has
better prediction accuracy than time series model constructed by three input variables x(k-8),
x(k-3), x(k-1). So we choose x(k-8) and x(k-1) as input variables to predict the value x(k). Then the method of degree of fuzzy relation is applied, yielding the 49 fuzzy relations shown in Table 1.
The singular value decomposition method in Section 5.3 is applied to get optimal
fuzzy relations. The singular values of firing strength matrix are shown in Figure 1(b). There
are 10 optimal fuzzy relations. The positions of the 10 most important fuzzy relations are
known as 1, 2, 8, 9, 10, 15, 17, 29, 37, 44 printed bold in Table 1. The resulted fuzzy relations
are used to design time series forecasting model (4.3) and (4.4).
Figure 2. Prediction and true values of interest rate of BIC using: (a) singular
value decomposition method (b) degree of fuzzy relation method
7. CONCLUSION
In this paper, we have presented the capability of fuzzy systems to model time series data. As universal approximators, fuzzy systems have the capability to model nonstationary time series. A distinctive feature of fuzzy systems is that they can formulate problems based on expert knowledge or on empirical data. We also presented a method to select input variables and
reduce fuzzy relations of time series model based on training data. The method was used to
get significant input variables and optimal number of fuzzy relations. We applied the
proposed method to forecast the interest rate of BIC. The result was that forecasting interest
rate of BIC using the proposed method has a higher accuracy than that using conventional
time series methods.
REFERENCES
[1] ABADI, A.M., SUBANAR, WIDODO AND SALEH, S., Designing fuzzy time series model and
its application to forecasting inflation rate. 7th World Congress in Probability and
Statistics. Singapore: National University of Singapore, 2008.
[2] ABADI, A.M., SUBANAR, WIDODO, SALEH, S., Constructing Fuzzy Time Series Model
Using Combination of Table lookup and Singular Value Decomposition Methods
and Its Applications to Forecasting Inflation Rate, Jurnal ILMU DASAR, 10(2), 190-
198, 2009.
[3] ABADI, A.M., SUBANAR, WIDODO, SALEH, S., A New Method for Generating Fuzzy
Rules from Training Data and Its Applications to Forecasting Inflation Rate and
Interest Rate of Bank Indonesia Certificate, Journal of Quantitative Methods, 5(2),
78-83, 2009.
[4] ABADI, A.M., SUBANAR, WIDODO, SALEH, S., Fuzzy Model for Forecasting Interest Rate
of Bank Indonesia Certificate, Proceedings of the 3rd International Conference on
Quantitative Methods Used in Economics and Business, Faculty of Economics,
Universitas Malahayati, Bandar Lampung, June 16-18, 2010.
[5] BOLLERSLEV, T., Generalized Autoregressive Conditional Heteroscedasticity, Journal of
Econometrics, 31, 307-327, 1986.
[6] BOX, G.E.P. AND JENKINS, G.M., Time Series Analysis: Forecasting and Control, Holden-
Day, San Francisco, 1970.
[7] CHATFIELD, C., The Analysis of Time Series: An Introduction, Sixth Edition, Chapman &
Hall/CRC Press, Boca Raton, 2004.
[8] CHEN, S.M. AND HSU, C.C., A New Method to Forecasting Enrollments Using Fuzzy
Time Series, International Journal of Applied Sciences and Engineering, 3(2), 234-
244, 2004.
[9] ENGLE, R.F., Autoregressive Conditional Heteroscedasticity with Estimate of Variance of
United Kingdom Inflation, Econometrica, 50, 987-1008, 1982.
[10] GOLUB, G.H., KLEMA, V., STEWART, G.W., Rank Degeneracy and Least Squares
Problems, Technical Report TR-456, Dept. of Computer Science, University of
Maryland, College Park, 1976.
[11] HUARNG, K., Effective Lengths of Intervals to Improve Forecasting in Fuzzy Time
Series, Fuzzy Sets and Systems 123, 387-394, 2001.
[12] HWANG, J.R., CHEN, S.M., LEE, C.H., Handling Forecasting Problems Using Fuzzy
Time Series, Fuzzy Sets and Systems 100, 217-228, 1998.
[13] JILANI, T.A, BURNEY, S.M.A., ARDIL, C., Multivariate High Order Fuzzy Time Series
Forecasting for Car Road Accidents. International Journal of Computational
Intelligence, 4(1), 15-20, 2007.
[14] LEE, L.W., WANG, L.H., CHEN, S.M., LEU, Y.H., Handling Forecasting Problems Based
SUBANAR
Department of Mathematics, Faculty of Mathematics and Natural Sciences, Gadjah
MadaUniversity, Indonesia
e-mail: [email protected]
Abstract. This is a survey of quivers and representations of quivers from the geometric point of view. Specifically, we study degenerations of finite-dimensional representations of quivers. In this paper we prove the following result of Brion: if 0 → U → M → N → 0 is an exact sequence of finite-dimensional representations of a bound quiver (Q, I), then M degenerates to U ⊕ N.
1. INTRODUCTION
A quiver is a finite directed (oriented) graph, possibly with multiple arrows and loops. Using a quiver Q one can define an algebra over an algebraically closed field k called the path algebra kQ. Conversely, if an algebra A is basic, connected, and finite dimensional, we may obtain its associated quiver Q_A [1]. By using quivers, studying algebras becomes more interesting because we can work with a graphical structure. Similarly, using a bound quiver (Q, I) associated to an algebra A, we can visualize any (finite dimensional) A-module as a k-linear representation of (Q, I) [1]. Following [3], we can show that the representations M of the bound quiver (Q, I) naturally form an affine variety and that the product G_d of general linear groups acts on the category rep_k(Q, I) of all representations of the bound quiver (Q, I) so that the G_d-orbits are the isomorphism classes of representations in rep_k(Q, I). The points in the closure of the G_d-orbit of a representation M of the bound quiver (Q, I) may be viewed as geometric degenerations of the representation M. In this paper we prove the following result in [2]. Let 0 → U → M → N → 0 be an exact sequence of finite-dimensional representations of the bound quiver (Q, I). Then M degenerates to U ⊕ N.
2. PRELIMINARIES
Throughout the paper, k denotes an algebraically closed field. A quiver is an oriented graph. More specifically, a quiver Q = (Q_0, Q_1, s, t) is a quadruple consisting of two sets: Q_0, whose elements are called vertices, and Q_1, whose elements are called arrows, and two maps s, t : Q_1 → Q_0 which associate to each arrow α ∈ Q_1 its source s(α) ∈ Q_0 and its target t(α) ∈ Q_0, respectively. A quiver Q is said to be finite if Q_0 and Q_1 are finite sets [1]. We will only consider finite quivers Q.
Example 2.1. Let Q be the quiver with Q_0 := {1, 2, 3}, Q_1 := {α, β, γ, λ}, s(α) := 2, t(α) := 1, s(γ) := 2, t(γ) := 1, s(β) := 3, t(β) := 2, and s(λ) = t(λ) := 3. Then Q consists of two parallel arrows α, γ from 2 to 1, an arrow β from 3 to 2, and a loop λ at 3.
family dim V := (dim V_a)_{a∈Q_0} is the dimension vector of V. It lies in the additive group N^{Q_0} consisting of all tuples d = (d_a)_{a∈Q_0} of non-negative integers. The dimension of V is the natural number Σ_{a∈Q_0} dim V_a. For a path τ = α_l α_{l−1} ⋯ α_1 : a → b in the quiver Q, the k-linear map f_τ : V_a → V_b is defined to be the identity map of V_a if l = 0 and the composition f_{α_l} ∘ f_{α_{l−1}} ∘ ⋯ ∘ f_{α_1} if l > 0. A representation of the bound quiver (Q, I) is a representation M = (M_a, f_α)_{a∈Q_0, α∈Q_1} with the additional property that for each k-linear combination ρ = Σ_{i=1}^r k_i τ_i ∈ I, where for all i ∈ {1, …, r} the τ_i are paths from a to b, the k-linear map f_ρ = Σ_{i=1}^r k_i f_{τ_i} is zero. In this paper, we will only consider finite-dimensional representations of the bound quiver (Q, I) [5].
Let V = (V_a, f_α)_{a∈Q_0, α∈Q_1} and W = (W_a, g_α)_{a∈Q_0, α∈Q_1} be two representations of the bound quiver (Q, I). A morphism ϕ : V → W is a family ϕ = (ϕ_a)_{a∈Q_0} of linear maps (ϕ_a : V_a → W_a)_{a∈Q_0} such that for any arrow α : a → b the equality g_α ∘ ϕ_a = ϕ_b ∘ f_α holds. In fact, the category of (finite-dimensional) k-representations of the bound quiver (Q, I), denoted by rep_k(Q, I), is equivalent to the category of finite-dimensional kQ/I-modules [1].
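To make these definitions concrete, the following small sketch (not from the paper; the quiver, the relation and the matrices are arbitrary test data) stores a representation of a bound quiver as matrices and checks a relation and a candidate morphism numerically.

import numpy as np

# quiver 1 <--alpha-- 2 <--beta-- 3 with the (hypothetical) zero relation alpha*beta
arrows = {"alpha": (2, 1), "beta": (3, 2)}          # name -> (source, target)
dim = {1: 1, 2: 2, 3: 1}                            # dimension vector

# a representation: one matrix f_arrow of size dim[target] x dim[source]
f = {"alpha": np.array([[1.0, 0.0]]),
     "beta":  np.array([[0.0], [1.0]])}

# the path alpha*beta acts by composing the matrices along the path
f_path = f["alpha"] @ f["beta"]
print("relation alpha*beta satisfied:", np.allclose(f_path, 0))

# a morphism phi = (phi_a): check g_alpha @ phi_source == phi_target @ f_alpha
g = f                                                # compare the representation with itself
phi = {a: np.eye(dim[a]) for a in dim}               # the identity morphism
ok = all(np.allclose(g[name] @ phi[s], phi[t] @ f[name])
         for name, (s, t) in arrows.items())
print("phi is a morphism:", ok)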
Example 2.3. Let Q be the quiver with five vertices bound by the commutativity relation I = ⟨βα − δγ⟩, and let M, V and W be the representations of (Q, I) whose vector spaces are copies of k (and k²) and whose linear maps are given by the indicated scalars and matrices. [Diagrams of Q and of the representations M, V and W not reproduced.]
For a dimension vector d, the representations of Q with underlying spaces k^{d_a} form the affine space
M_d = ⊕_{α:a→b, α∈Q_1} Hom_k(k^{d_a}, k^{d_b}) = ⊕_{α:a→b, α∈Q_1} M_{d_b×d_a}(k),
where M_{d_b×d_a}(k) denotes the set of all d_b × d_a-matrices. The representations of the bound quiver (Q, I) are parameterized by the subset
{ (V_α)_{α∈Q_1} ∈ ⊕_{α:a→b, α∈Q_1} M_{d_b×d_a}(k) | V_ρ = Σ_{i=1}^r k_i ∏_{j=1}^{s_i} V_{α_j^i} = 0 for every ρ = Σ_{i=1}^r k_i ∏_{j=1}^{s_i} α_j^i ∈ I, k_i ∈ k, α_j^i paths from a to b }
for all (G_c)_{c∈Q_0} ∈ G_d, for all V ∈ rep_k(Q, I) parameterized by (V_β)_{β∈Q_1} ∈ M_d and for all arrows α : a → b. The G_d-orbits are the isomorphism classes of representations in M_d.
Denote by O_M the G_d-orbit of a representation M in rep_k(Q, I). The main object of our interest is the closure of the orbit O_M (in the Zariski topology). There are two main reasons for studying such orbit closures. By inspecting them as affine varieties with the methods of algebraic geometry, we can achieve a deeper understanding of the category of representations. On the other hand, our orbit closures provide many interesting examples of affine varieties, whose geometric properties are derived from known properties of the category of representations.
α
Example 3.1. Let Q := 1 ← − 2; I = h0i; and d = (d1 , d2 ) ∈ N2 . The group Gd =
GLd1 × GLd2 acts on Md via
Q := 1 λ
for the Ua and completing them to the bases of Ma , we obtain a point (Mα )α∈Q1 ∈ Md
that parameterize M ((Mα )α∈Q1 ∼= M) such that Mα (kra ) ⊂ krb for all α : a → b. Here,
r = (ra )a∈Q0 denotes the dimension vector of U. The family of restrictions (induced
by the monomorphism ϕ) (U_α : k^{r_a} → k^{r_b})_{α:a→b, α∈Q_1} parameterize U, that is, (U_α)_{α∈Q_1} ≅ U, where U_α ∈ M_{r_b×r_a}(k). Therefore, for all α ∈ Q_1,
M_α = ( U_α  X_α ; O  Y_α )
with X_α ∈ M_{r_b×s_a}(k), Y_α ∈ M_{s_b×s_a}(k), and s_a = d_a − r_a for all a ∈ Q_0. Moreover, the vector
space kda can be decomposed into
kda = kra ⊕ ksa
for all a ∈ Q0 . Using the surjectivity of ψa , we obtain that s = (sa )a∈Q0 is the
dimension vector of N and the family of quotient maps (induced by epimorphism
ψ) (N_α : k^{s_a} → k^{s_b})_{α:a→b, α∈Q_1} parameterize N, that is, (N_α)_{α∈Q_1} ≅ N, where N_α ∈ M_{s_b×s_a}(k). Hence, for all α ∈ Q_1, we obtain
M_α = ( U_α  X_α ; O  N_α ).
Define a homomorphism of algebraic groups
λ : GL1 → GLd
t 7→ (λa (t))a∈Q0 ,
where
λ_a(t) = ( tI_{r_a}  0 ; 0  I_{s_a} )
in the decomposition kda = kra ⊕ ksa for all a ∈ Q0 . Therefore, we have
λ(t) · M = ( λ_b(t) M_α λ_a(t)^{-1} )_{α:a→b, α∈Q_1}
= ( ( tI_{r_b}  0 ; 0  I_{s_b} ) ( U_α  X_α ; O  N_α ) ( tI_{r_a}  0 ; 0  I_{s_a} )^{-1} )_{α:a→b, α∈Q_1}
= ( ( U_α  tX_α ; O  N_α ) )_{α∈Q_1}.
Thus λ_M(t) ≅ M on an open dense subset GL_1 of k and λ_M(0) ≅ U ⊕ N. By Lemma 3.1, M degenerates to U ⊕ N.
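The effect of the one-parameter subgroup λ(t) can be seen numerically for the simplest quiver 1 ← 2: conjugation scales the off-diagonal block X_α by t, so as t → 0 the matrix tends to the block-diagonal form representing U ⊕ N. The sketch below is only an illustration with arbitrary 1 × 1 blocks.

import numpy as np

# M_alpha in block upper-triangular form ( U_alpha  X_alpha ; 0  N_alpha )
U_a = np.array([[2.0]])          # 1 x 1
N_a = np.array([[3.0]])          # 1 x 1
X_a = np.array([[5.0]])          # 1 x 1, the "glueing" block
M_a = np.block([[U_a, X_a], [np.zeros((1, 1)), N_a]])

def act(t, M):
    """Conjugate M_alpha by lambda(t) = diag(t*I_r, I_s) at both vertices."""
    lam = np.diag([t, 1.0])
    return lam @ M @ np.linalg.inv(lam)

for t in [1.0, 0.1, 0.01, 0.0]:
    if t == 0.0:
        # the limit point: the block-diagonal matrix representing U + N
        print(t, np.block([[U_a, np.zeros((1, 1))], [np.zeros((1, 1)), N_a]]))
    else:
        print(t, act(t, M_a))    # equals ( U_a  t*X_a ; 0  N_a )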
4. CONCLUDING REMARKS
We have shown that if 0 → U → M → N → 0 is an exact sequence of finite-dimensional representations of a bound quiver (Q, I), then M degenerates to U ⊕ N. This means that we have constructed, in algebraic terms, degenerations arising from geometric degenerations of representations of the bound quiver.
Acknowledgement. The authors would like to thank I-MHERE FMIPA ITB for financial support based on Surat Perjanjian No.113/I1.B01/I-MHERE ITB/SPK/2011.
References
[1] I. Assem, D. Simson, and A. Skowronski, Elements of the Representation Theory of Associative
Algebras, in : Techniques of Representation Theory, vol 1, Cambridge University Press, New York,
2006.
[2] M. Brion, Representations of Quivers, Lecture notes available on http://www-fourier.ujf-
grenoble.fr/˜mbrion/notes quivers rev.pdf, 2000.
[3] Darmajid, Variety Representasi Aljabar dan Variety Modul, Prosiding Seminar Nasional Aljabar
2011, 18-27, 2011.
[4] H. Kraft, Geometric Methods in Representation Theory, in: Representations of Algebras, Lecture
Notes in Math., 944, 180–258, 1982.
[5] C. Riedtmann, Degenerations for Representations of Quivers with Relations, Ann. Sci. École
Norm. Sup. 4, 275-301, 1986.
Darmajid
Algebra Research Division, Institut Teknologi Bandung.
e-mail: [email protected]
Intan Muchtadi-Alamsyah
Algebra Research Division, Institut Teknologi Bandung.
e-mail: [email protected]
Proceedings of ”The 6th SEAMS-UGM Conference 2011”
Algebra, pp. 145–158.
Abstract. For an arbitrary finite commutative group (A; ?), we are interested in defining a particular subset L^n_{a,b} of the set O^n(A) of all n-ary operations on A. We study its properties and the connection between this set and the n-clone Pol^n α_A of all n-ary quasilinear operations in O^n(A). We are also interested in the properties of L^n_{a,b} for
1. INTRODUCTION
Let A be an arbitrary set and let O(A) be the set of all operations on the set
A. A clone on the set A, i.e. a subset of O(A) that is closed under superposition
and contains all projections has been widely studied by many authors see e.g., ([2], [3],
[4], [5], [6], [9], [10], [11], [12], [13]). An important clone is P olαA consisting of all
operations which preserve αA = {(u, v, w, z)|u, v, w, z ∈ A, u ? v = w ? z} for a group
(A; ?). Thus, if f is an n-ary operation for n ≥ 1 in P olαA and (ui , vi , wi , zi ) ∈ αA ,
i = 1, . . . , n, then (f (u1 , . . . , un ), f (v1 , . . . , vn ), f (w1 , . . . , wn ), f (z1 , . . . , zn )) ∈ αA , i.e.
f (u1 , . . . , un ) ? f (v1 , . . . , vn ) = f (w1 , . . . , wn ) ? f (z1 , . . . , zn ). It is easy to understand
that Pol α_A = ⋃_{n=1}^∞ { f ∈ O^n(A) | f(u_1 ? v_1, …, u_n ? v_n) ? f(e, …, e) = f(u_1, …, u_n) ? f(v_1, …, v_n) } for the identity element e of the group. It is well-known that if (A; ?)
is an elementary abelian p-group for a prime number p and |A| = pm , then P olαA
is a maximal clone on A ([12]). Now, consider P oln αA = P olαA ∩ On (A), which we
call an n-clone on A. If (ui , vi , wi , zi ) ∈ α, i = 1, . . . , n then we have u ?n v = w ?n z
for u = (u1 , . . . , un ), v = (v1 , . . . , vn ), w = (w1 , . . . , wn ) and z = (z1 , . . . , zn ) and ?n is
defined by (x1 , . . . , xn ) ?n (y1 , . . . , yn ) = (x1 ? y1 , . . . , xn ? yn ). Therefore for arbitrary
definition, we have Cπ1 ∩ Cπ2 = ∅ for π1 6= π2 . Thus, for every f ∈ On (A) there is a
unique πf ∈ O1 (A) such that f ∈ Cπf . We use id for the identity operation on A and
cny for constant n-ary operation, i.e. cny (x) = y for all x ∈ An and y ∈ A. It is clear that
cny ∈ Cc1y . Furthermore, let ∆An = {x̂|x ∈ A}. If f ∈ Cid , then f (x̂) = x and if f ∈ Cc1y ,
then f (x̂) = y for all x̂ ∈ ∆An (see [8]). Now, for the main aim of this paper, we define
Lna,b := {f ∈ On (A)|f (x) ? f (a ?n x−1 ) = b for all x ∈ An } for fixed a ∈ An and b ∈ A.
Lna,b could be the empty set but also it could be equal to On (A). Moreover, in general
P oln αA and Lna,b are not comparable with respect to inclusion. Before we come to the
properties of this set, recall that we obtain in the following way a semigroup of n-ary
operations on A.
Let O^n(A) be the set of all n-ary operations on A. On O^n(A) we define an operation + by f + g := f(g, …, g) for arbitrary f, g ∈ O^n(A), i.e. (f + g)(x) = f(g(x), …, g(x)) for every x ∈ A^n. The operation + is associative, giving a semigroup (O^n(A); +)
(see [1], [2], [6], [7] and [8]). By this definition, generally, if f ∈ Cπ1 and g ∈ Cπ2 ,
then f + g ∈ Cπ1 ◦π2 for the composition operation ◦ in O1 (A). Therefore we have
f + g = cny for every f ∈ Cc1y (see [8]). Moreover, it is clear that if C is an n-clone
on A, then (C; +) is a subsemigroup of (On (A); +). Particularly for A = {0, 1}, we
have C4n = {f ∈ On (A)|f (0̂) = 0, f (1̂) = 1}, ¬C4n = {f ∈ On (A)|f (0̂) = 1, f (1̂) = 0},
K0n = {f ∈ On (A)|f (0̂) = f (1̂) = 0} and K1n = {f ∈ On (A)|f (0̂) = f (1̂) = 1}. Clearly,
C4n , ¬C4n , K0n and K1n are all disjoint and On (A) = C4n ∪ ¬C4n ∪ K0n ∪ K1n . By cn0 and
cn1 we mean the constant operations with value 0 and 1, respectively. Moreover, the
operation + has the following properties:
f + g =  g       if f ∈ C_4^n,
         ¬g      if f ∈ ¬C_4^n,
         c_0^n   if f ∈ K_0^n,
         c_1^n   if f ∈ K_1^n.
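These four cases can be verified mechanically for small n by enumerating all Boolean operations; the following brute-force check (illustrative, not from the paper) confirms them for n = 2.

from itertools import product

n = 2
points = list(product((0, 1), repeat=n))          # A^n for A = {0, 1}
ops = [dict(zip(points, vals)) for vals in product((0, 1), repeat=len(points))]

def plus(f, g):                                   # (f + g)(x) = f(g(x), ..., g(x))
    return {x: f[(g[x],) * n] for x in points}

def cls(f):                                       # which of C4, notC4, K0, K1 contains f?
    z, o = f[(0,) * n], f[(1,) * n]
    return {(0, 1): "C4", (1, 0): "notC4", (0, 0): "K0", (1, 1): "K1"}[(z, o)]

expected = {"C4": lambda f, g: plus(f, g) == g,
            "notC4": lambda f, g: plus(f, g) == {x: 1 - g[x] for x in points},
            "K0": lambda f, g: plus(f, g) == {x: 0 for x in points},
            "K1": lambda f, g: plus(f, g) == {x: 1 for x in points}}

print(all(expected[cls(f)](f, g) for f in ops for g in ops))   # True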
We recall the definition of a four-part semigroup (see [2]). We use four non-empty,
finite and pairwise disjoint sets S1 = {a11 , a12 , . . . , a1nr }, S2 = {a21 , a22 , . . . , a2nr },
S3 = {a31 , a32 , . . . , a3ns }, S4 = {a41 , a42 , . . . , a4ns } and define a binary operation ∗ on
S = S_1 ∪ S_2 ∪ S_3 ∪ S_4 by
a_{ij} ∗ a_{lk} =  a_{lk}          if a_{ij} ∈ S_1,
                   a_{tk}          if a_{ij} ∈ S_2, where t = 1 if l = 2, t = 2 if l = 1, t = 3 if l = 4, t = 4 if l = 3,
                   a^∗ ∈ S_3       if a_{ij} ∈ S_3,
                   a^{∗∗} ∈ S_4    if a_{ij} ∈ S_4.
The binary operation ∗ is well-defined, and it can be checked that it is associative,
giving us a semigroup (S; ∗) called a four-part semigroup (see [2]). The following theo-
rem gives a characterization of all four-part subsemigroups of the four-part semigroup
(On ({0, 1}); +).
Theorem 1.1. ([7]) A set S ⊆ On ({0, 1}) is the universe of a four-part subsemigroup
of (On ({0, 1}); +) if and only if
(i) S ∩ ¬C4n 6= ∅
(ii) S ∩ ¬C4n = ¬(S ∩ C4n ), S ∩ K1n = ¬(S ∩ K0n ) and
(iii) {cn0 } ⊆ S.
Remark 1.1. From the above characterization we know that if S1 , S2 ⊆ On ({0, 1}) both
form four-part semigroups, then S1 ∩ S2 also forms a four-part semigroup. Moreover, it
can be shown that S ⊆ On ({0, 1}) forms a four-part semigroup if and only if S ∩C4n 6= ∅,
cn0 ∈ S and ¬f ∈ S for all f ∈ S.
Recall also that a semigroup S = (S; ∗) is called a two-constant semigroup if there
are subsets S1 , S2 of S such that S = S1 ∪ S2 , S1 ∩ S2 = ∅, Si 6= ∅, i = 1, 2 and if there
are two fixed elements b^∗ ∈ S_1, b^{∗∗} ∈ S_2 such that
a ∗ b =  b^∗ ∈ S_1      if a ∈ S_1,
         b^{∗∗} ∈ S_2   if a ∈ S_2
(see [2]). The following theorem characterizes the two-constant subsemigroups of the four-part semigroup (O^n({0, 1}); +).
Theorem 1.2. ([7]) A subset S ⊆ On ({0, 1}) is the universe of a two-constant sub-
semigroup of (On ({0, 1}); +) if and only if S ⊆ K0n ∪ K1n and {cn0 , cn1 } ⊆ S.
2. PROPERTIES OF L^n_{a,b} AND Pol^n α_A FOR AN ARBITRARY FINITE COMMUTATIVE GROUP (A; ?)
Lemma 2.1. Let (A; ?,−1 , e) be a finite commutative group and let n ≥ 1 be a natural
number. For every a, a0 , â ∈ An and b, b0 , y ∈ A, the following propositions are true.
(i) Lna,b 6= ∅ if and only if x ?n x 6= a for all x ∈ An or there exists y ∈ A such that
b = y ? y.
(ii) L^n_{a,b} ∩ L^n_{a,b'} = ∅ if b ≠ b'.
(iii) If f ∈ L^n_{â,b} and g ∈ L^n_{a',a}, then f + g ∈ L^n_{a',b}.
(iv) L^n_{a,b} contains the projection e_i^n if and only if a_i = b.
(v) L^n_{a,b} contains the constant element c_y^n if and only if b = y ? y.
(vi) If {x ? x | x ∈ A} = A, then there exists a unique constant element c_y^n ∈ O^n(A) such that c_y^n ∈ L^n_{a,b}.
(vii) L^n_{â,b} ∩ C_{c_y^1} ≠ ∅ if and only if b = y ? y.
(viii) (⋂_{a∈A} L^n_{â,a}) ∩ C_{c_y^1} = ∅ for all y ∈ A.
Proof: (i) Assume that there exists x ∈ An such that x ?n x = a, i.e. a ?n x−1 = x
and b 6= y ? y for all y ∈ A. Then for every f ∈ On (A) we have f (x) ? f (a ?n x−1 ) =
f (x) ? f (x) 6= b, i.e f 6∈ Lna,b . Thus Lna,b = ∅, a contradiction. Conversely, let x ?n x 6= a
for all x ∈ An . Then for every x ∈ An we have x 6= a ?n x−1 . Thus, we can choose f
such that f (x) = e and f (a ?n x−1 ) = b for every x ∈ An and have f ∈ Lna,b . If there
exist y ∈ A such that b = y ? y then cny (x) ? cny (a ?n x−1 ) = y ? y = b, i.e. cny ∈ Lna,b .
Thus Lna,b 6= ∅.
(ii) Let b, b0 ∈ A such that b 6= b0 . Assume that there exists f ∈ Lna,b ∩ Lna,b0 . Then for
all x ∈ An we obtain f (x) ? f (a ?n x−1 ) = b and f (x) ? f (a ?n x−1 ) = b0 and hence b = b0 ,
a contradiction.
(iii) Let f ∈ Lnâ,b and g ∈ Lna0 ,a . Then for every x ∈ An we obtain f (x) ? f (â ?n x−1 ) = b
and g(x) ? g(a0 ?n x−1 ) = a. Therefore
Proof: (i) Let f ∈ Lna,b and x ∈ An . Then we have f (x) ? f (a ?n x−1 ) = b and
hence
c f (x) ? c f (a ?n x−1 ) = ωc (f (x)) ? ωc (f (a ?n x−1 ))
= (f (x))−1 ? c ? (f (a ?n x−1 ))−1 ? c
= (f (x) ? f (a ?n x−1 ))−1 ? c ? c
= b−1 ? c ? c
= ωc (b) ? c.
Thus c f ∈ Lna,ωc (b)?c . Conversely, let c f ∈ Lna,ωc (b)?c . Then c f (x)?c f (a?n x−1 ) =
ωc (b) ? c, i.e. ωc (f (x)) ? ωc (f (a ?n x−1 )) = b−1 ? c ? c. Since ωc ωc (x) = x for every x ∈ An
we obtain
f (x) ? f (a ?n x−1 ) = ωc (ωc (f (x))) ? ωc (ωc (f (a ?n x−1 )))
= (ωc (f (x)))−1 ? c ? (ωc (f (a ?n x−1 )))−1 ? c
= (ωc (f (x)) ? ωc (f (a ?n x−1 )))−1 ? c ? c
= (b−1 ? c ? c)−1 ? c ? c
= b
and hence f ∈ Lna,b .
(ii) Let Lna,b contain c f for all f ∈ Lna,b . Then for all f ∈ Lna,b and x ∈ An we have
f (x) ? f (a ?n x−1 ) = c f (x) ? c f (a ?n x−1 ) = b and therefore we have
b = c f (x) ? c f (a ?n x−1 )
= ωc (f (x)) ? ωc (f (a ?n x−1 ))
= (f (x))−1 ? c ? (f (a ?n x−1 ))−1 ? c
= (f (x) ? f (a ?n x−1 ))−1 ? c ? c
= b−1 ? c ? c,
i.e. b ? b = c ? c. Conversely, let b ∈ A. If b ? b = c ? c, then b = b−1 ? c ? c = ωc (b) ? c
and thus by (i), Lna,b contains c f for all f ∈ Lna,b .
(iii) Let b , c f ∈ Lnb̂,b ∩Lnĉ,c . Then by (ii) and Proposition 2.1, we get f = b b f ∈ Lnb̂,b
and f = c c f ∈ Lnĉ,c , i.e f ∈ Lnb̂,b ∩ Lnĉ,c . Conversely, let f ∈ Lnb̂,b ∩ Lnĉ,c . Then by (ii),
The following result gives a necessary and sufficient condition for Lna,b to be an
n-clone on A.
Theorem 2.1. Let (A; ?,−1 , e) be a finite commutative group and let n ≥ 1 be a natural
number. For every a ∈ An and b ∈ A the following propositions are equivalent.
(i) Lna,b contains all projections.
(ii) Lna,b is an n-clone on A.
(iii) a = b̂.
Proof: (i) ⇔ (iii) is clear by Lemma 2.1 (iv).
(ii)⇔ (iii) Let Lna,b be an n-clone on A. Then Lna,b contains all projections. Therefore
a = b̂. Conversely, let a = b̂, i.e. Lna,b = Lnb̂,b . Thus by Lemma 2.1 (iv), Lna,b contains
all projections eni , i ∈ {1, . . . , n}. Moreover, let f, g1 , . . . , gn be in Lna,b = Lnb̂,b . Then
gi (x) ? gi (b̂ ?n x−1 ) = b, i.e. gi (b̂ ?n x−1 ) = b ? gi (x)−1 for all i = 1, . . . , n. Therefore for
arbitrary x ∈ An and g(x) = (g1 (x), . . . , gn (x)) we have
f (g1 , . . . , gn )(x) ? f (g1 , . . . , gn )(b̂ ?n x−1 )
= f (g1 (x), . . . , gn (x)) ? f (g1 (b̂ ?n x−1 ), . . . , gn (b̂ ?n x−1 ))
= f (g1 (x), . . . , gn (x)) ? f (b ? g1 (x)−1 , . . . , b ? gn (x)−1 )
= f (g1 (x), . . . , gn (x)) ? f (b̂ ?n (g1 (x)−1 , . . . , gn (x)−1 ))
= f (g(x)) ? f (b̂ ?n (g(x))−1 )
= b.
Therefore Lna,b is an n-clone on A.
By the definition of the operation + on On (A), it is clear that if C is an n-clone
on A, then (C; +) is a subsemigroup of (On (A); +). In the following part we will prove
some results on semigroups in On (A) related to Lna,b . A direct consequence of Theorem
2.1 is the following proposition.
Proposition 2.2. Let (A; ?,−1 , e) be a finite commutative group. For every a, b ∈ A
the following four propositions are equivalent.
(i) L1a,b forms a subsemigroup in (O1 (A); ◦).
(ii) L1a,b contains the identity operation.
(iii) L1a,b is a 1-clone.
(iv) a=b.
Proposition 2.3. Let (A; ?,−1 , e) be a finite commutative group and let n ≥ 1 be a
natural number. For every a ∈ An and a, b, c, y ∈ A the following four propositions are
true.
(i) (Lnâ,a ; +) is a subsemigroup of (On (A); +).
(ii) (Lna,y?y ∩ Cc1y ; +) is a subsemigroup of (On (A); +) which is not an n-clone.
(iii) If b ? b = c ? c, then (Lnb̂,b ; +) is a subsemigroup of (On (A); +) containing c f
for all f ∈ Lnb̂,b .
Proof: ⇒ Let Lna,b form a subsemigroup of (On (A); +). Let f ∈ Lna,b and y ∈ A.
Then we can find πf ∈ O1 (A) such that πf (x) = f (x̂) for every x ∈ A. Moreover, by
Lemma 2.3 we can find g ∈ Lna,b and x ∈ An such that y = g(x). Therefore we have
g(x) ? g(a ?n x−1 ) = b, i.e. g(a ?n x−1 ) = b ? (g(x))−1 = b ? y −1 and f + g ∈ Lna,b , i.e.
(f + g)(x) ? (f + g)(a ?n x−1 ) = b. Hence
πf(y) ? πf(b ? y^{-1}) = πf(y) ? πf(g(a ?n x^{-1}))
= f(ŷ) ? f(g(a ?n x^{-1}), …, g(a ?n x^{-1}))
= f(g(x), …, g(x)) ? f(g(a ?n x^{-1}), …, g(a ?n x^{-1}))
= (f + g)(x) ? (f + g)(a ?n x^{-1})
= b,
i.e. πf ∈ L1b,b .
⇐ Let f, g ∈ Lna,b and x ∈ An . By assumption we have πf ∈ L1b,b , i.e. πf (y) ? πf (b ?
y −1 ) = b for all y ∈ A and g(x) ? g(a ?n x−1 ) = b, i.e. g(a ?n x−1 ) = b ? (g(x))−1 = b ? y −1
for y = g(x). Therefore
(f + g)(x) ? (f + g)(a ?n x^{-1}) = f(g(x), …, g(x)) ? f(g(a ?n x^{-1}), …, g(a ?n x^{-1}))
= f(ŷ) ? f(b ? y^{-1}, …, b ? y^{-1})
= πf(y) ? πf(b ? y^{-1})
= b,
i.e. f + g ∈ Lna,b and hence (Lna,b ; +) is a subsemigroup of (On (A); +). This completes
the proof.
Now, we come to some properties of P oln αA and the connection between Lna,b
and P oln αA .
Lemma 2.4. Let (A; ?,−1 , e) be a finite commutative group and let n ≥ 1 be a natural
number. Let c ∈ A and f ∈ On (A). Then f ∈ P oln αA if and only if c f ∈ P oln αA .
Proof: Let f ∈ P oln αA . For arbitrary (ui , vi , wi , zi ) ∈ αA , i = 1, 2, . . . , n, put
u = (u1 , . . . , un ), v = (v1 , . . . , vn ), w = (w1 , . . . , wn ) and z = (z1 , . . . , zn ). Then we
have f (u) ? f (v) = f (w) ? f (z). By c f (x) = ωc (f (x)) = (f (x))−1 ? c we get
c f (u) ? c f (v) = ωc (f (u)) ? ωc (f (v))
= (f (u))−1 ? c ? (f (v))−1 ? c
= (f (u) ? f (v))−1 ? c ? c
= (f (w) ? f (z))−1 ? c ? c
= (f (w))−1 ? c ? (f (z))−1 ? c
= ωc (f (w)) ? ωc (f (z))
= c f (w) ? c f (z).
Hence c f ∈ P oln αA . Conversely, let c f ∈ P oln αA and (ui , vi , wi , zi ) ∈ αA , i =
1, 2, . . . , n. Then c f (u) ? c f (v) = c f (w) ? c f (z), i.e. ωc (f (u)) ? ωc (f (v)) =
ωc (f (w)) ? ωc (f (z)). Using the properties ωc ωc (x) = x for all x ∈ A and ωc (f (x)) =
(f (x))−1 ? c we obtain
f (u) ? f (v) = ωc ωc (f (u)) ? ωc ωc (f (v))
= (ωc (f (u)))−1 ? c ? (ωc (f (v)))−1 ? c
= (ωc (f (u)) ? ωc (f (v)))−1 ? c ? c
= (ωc (f (w)) ? ωc (f (z)))−1 ? c ? c
= (ωc (f (w)))−1 ? c ? (ωc (f (z)))−1 ? c
= ωc ωc (f (w)) ? ωc ωc (f (z))
= f (w) ? f (z),
i.e. f ∈ P oln αA .
The following results show the connection between the sets of Lna,b and P oln αA .
Lemma 2.5. Let (A; ?, ^{-1}, e) be a finite commutative group, let n ≥ 1 be a natural number and let a ∈ A^n be arbitrary. Then Pol^n α_A ⊆ ⋃_{b∈A} L^n_{a,b}.
3. PROPERTIES OF L^n_{a,b} FOR PARTICULAR FINITE COMMUTATIVE GROUPS
In this section we will study some properties of the set Lna,b for elementary abelian
p-groups, i.e. abelian groups in which all non-identity elements have order p for prime
numbers p and then for the group (Zm = Z/mZ; +) for arbitrary natural numbers
m ≥ 2.
Lemma 3.1. Let (A; ?,−1 , e) be an elementary abelian 2-group (|A| = 2m ) and let n ≥ 1
be a natural number. For every a, â ∈ An and a, b, y ∈ A the following propositions are
satisfied.
(i) Lnê,e = On (A) and Lnê,b = ∅ for all b 6= e.
(ii) Lna,b contains c f for all f ∈ Lna,b and for all c ∈ A.
(iii) Lna,b contains some constant elements if and only if b = e. Moreover, Lna,e
contains all constant elements of On (A).
(iv) Lnâ,b ∩ Cc1y 6= ∅ if and only if b = e.
(v) If L^n_{a,b} ∩ C_{c_y^1} ≠ ∅, then (L^n_{a,b} ∩ C_{c_y^1}; +) is a subsemigroup of (O^n(A); +) if and only if b = e.
(vi) {^c f | c ∈ A} ⊆ ⋂_{a∈A} L^n_{â,a} if and only if f ∈ ⋂_{a∈A} L^n_{â,a}.
(vii) {e_i^n | i = 1, 2, …, n} ∪ (⋃_{c∈A} {^c e_i^n | i = 1, 2, …, n}) ⊆ ⋂_{a∈A} L^n_{â,a}.
Proof: (i) Let (A; ?,−1 , e) be a 2-group, i.e. x−1 = x for all x ∈ A. Then for
every f ∈ On (A) we have f (x) ? f (ê ? x−1 ) = f (x) ? f (ê ? x) = f (x) ? f (x) = e. Therefore
f ∈ Lnê,e , i.e. On (A) ⊆ Lnê,e and thus Lnê,e = On (A). Moreover, by Lemma 2.1 (ii), we
(i) Lna,b 6= ∅.
(ii) Lna,b contains a constant element cny for a unique y ∈ A.
(iii) Lna,b ∩ Cc1y forms a semigroup for a unique y ∈ A.
(iv) If f ∈ Lna,b , then c f ∈ Lna,b if and only if b = c.
(v) If f ∈ ⋂_{a∈A} L^n_{â,a}, then f ∈ C_id.
Proof: (i) By the fact that {x ? x|x ∈ A} = A for a p-group A and Lemma 2.1
(i).
(ii) By assumption and by Lemma 2.1 (vi).
(iii) By (ii), there is a unique cny in Lna,b and hence Lna,b ∩ Cc1y 6= ∅. Moreover, for all
f, g ∈ Lna,b ∩ Cc1y we have f + g = cny ∈ Lna,b ∩ Cc1y , i.e. Lna,b ∩ Cc1y forms a subsemigroup
of (On (A); +).
(iv) Let f ∈ L^n_{a,b}. By assumption, b ? b = c ? c if and only if b = c. Therefore, by Lemma 2.2 (ii), ^c f ∈ L^n_{a,b} if and only if b ? b = c ? c, if and only if b = c.
(v) Let f ∈ ⋂_{a∈A} L^n_{â,a}. Assume that f ∉ C_id, i.e. there exists x̂ ∈ ∆_{A^n} such that f(x̂) ≠ x. Then for â ∈ ∆_{A^n} such that â = x̂ ?n x̂ (equivalently a = x ? x) we get f(x̂) ? f(â ?n x̂^{-1}) = f(x̂) ? f(x̂) ≠ x ? x = a. Therefore f ∉ L^n_{â,a}, a contradiction.
Recall that for the group (Z_m; +), where + is the usual addition modulo m and A = Z_m, by f̄ for f ∈ O^n(A) we mean the n-ary operation on A mapping x ∈ A^n to ω(f(x)), where ω(x) = m − 1 − x ([8]).
The following propositions are true for (Z/mZ; +).
Proposition 3.2. Let n ≥ 1, m ≥ 2 be two natural numbers and let A = Zm . For every
a ∈ An and b ∈ A the following propositions hold.
(i)If m is odd, then Lna,b 6= ∅.
(ii) f ∈ L^n_{a,b} if and only if f̄ ∈ L^n_{a,−b−2}.
(iii) L^n_{a,b} contains f̄ for all f ∈ L^n_{a,b} if and only if 2b = −2.
(iv) If 2b = −2, then (L^n_{â,a}; +) is a semigroup containing f̄ for all f ∈ L^n_{â,a}.
(v) If m is even, then Lna,b contains some constant elements cni if and only if b is
even. Moreover, Lna,b contains exactly two constants.
(vi) If m is odd, then Lna,b contains a constant element cni , i ∈ A.
Proof: (i) and (vi) Let m be odd. It is easy to check that {a + a|a ∈ Zm } = Zm .
Applying Lemma 2.1 (i) we have Lna,b 6= ∅ for all a ∈ An = Zm n
and b ∈ A = Zm . Thus
we have (i). Moreover, by Lemma 2.1 (vi), we obtain (vi).
(ii), (iii) and (iv) Putting c = m − 1, we get that f̄ is equal to ^c f. Then applying Lemma 2.2 (i), Lemma 2.2 (ii) and Proposition 2.3 (iii), we obtain (ii), (iii) and (iv), respectively.
(v) Let m be even. Then by Lemma 2.1 (v), it is clear that Lna,b contains some constant
elements cni if and only if b is even. Moreover, for every even number b, 0 ≤ b ≤ m − 1,
we have that xb = 2b and yb = m+b 2 are the only numbers in A = Zm satisfying
xb + xb = yb + yb = b. Therefore, cnxb (x) + cnxb (a −n x) = xb + xb = b and cnyb (x) +
cnyb (a −n x) = yb + yb = b for every x ∈ An = Zm n
, i.e. cnxb , cnyb ∈ Lna,b .
4. PROPERTIES OF L^n_{a,b} AND Pol^n α_A FOR A = {0, 1}
In this section we consider A = {0, 1} and the 2-group ({0, 1}; +), where + is
addition modulo 2, since in this case our sets Lna,b are sets of Boolean operations and
Boolean operations play an important role in many applications.
Proposition 4.1. For A = {0, 1} the following propositions are true.
(i) L2a,0 forms a subsemigroup of (O2 ({0, 1}); +) for all a ∈ A2 .
(ii) L2a,1 does not form a subsemigroup of (O2 ({0, 1}); +) for all (1, 1) 6= a ∈ A2 .
Proof: Let c_0^2, c_1^2, e_1^2, e_2^2, ¬e_1^2, ¬e_2^2, f_+, ¬f_+ be the following binary operations on {0, 1}:
c20 c21 e21 e22 ¬e21 ¬e22 f+ ¬f+
(0, 0) 0 1 0 0 1 1 0 1
(0, 1) 0 1 0 1 1 0 1 0
(1, 0) 0 1 1 0 0 1 1 0
(1, 1) 0 1 1 1 0 0 0 1.
By simple counting we get precisely the following sets L2a,b .
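The counting can be reproduced mechanically; the sketch below (illustrative) enumerates all sixteen binary Boolean operations and lists, for each pair (a, b), which of the operations in the table above belong to L^2_{a,b}.

from itertools import product

points = list(product((0, 1), repeat=2))
names = {(0, 0, 0, 0): "c0", (1, 1, 1, 1): "c1",
         (0, 0, 1, 1): "e1", (0, 1, 0, 1): "e2",
         (1, 1, 0, 0): "not e1", (1, 0, 1, 0): "not e2",
         (0, 1, 1, 0): "f+", (1, 0, 0, 1): "not f+"}

def in_L(f, a, b):
    # membership test: f(x) + f(a - x) = b (mod 2) for every x; over Z_2, a - x = a + x
    return all((f[x] + f[tuple((ai + xi) % 2 for ai, xi in zip(a, x))]) % 2 == b
               for x in points)

for a in points:
    for b in (0, 1):
        members = [names.get(vals, "?")
                   for vals in product((0, 1), repeat=4)
                   if in_L(dict(zip(points, vals)), a, b)]
        print(f"L^2_{a},{b}:", members)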
Lemma 4.1. Let A = {0, 1} and let n ≥ 1 be a natural number. Then the following
propositions hold:
(i) Ln0̂,0 = On (A) and Ln0̂,1 = ∅,
(ii) For all a ∈ An and b ∈ A the set Lna,b contains ¬f whenever f ∈ Lna,b ,
(iii) If a ∈ An and 0̂ 6= a 6= 1̂, then Lna,1 ∩(K0n \{cn0 }) 6= ∅ and Lna,1 ∩(K1n \{cn1 }) 6= ∅.
(iii) Lna,1 forms a subsemigroup of (On ({0, 1}); +) if and only if a = 1̂. Moreover,
Ln1̂,1 = C ∪ ¬C for some C ⊆ C4n .
Proof: Let A = {0, 1} and a ∈ An . By Lemma 4.1 (ii), Lna,b contains ¬f for all
f∈ Lna,b . Moreover, by Lemma 3.1 (iii), Lna,0 contains the two constants cn0 and cn1 .
(i) If a 6= 1̂ then by Lemma 2.1 (iv), Lna,0 contains the projection eni whenever ai = 0.
Since eni (0̂) = 0 and eni (1̂) = 1 we have eni ∈ C4n and thus Lna,0 ∩ C4n 6= ∅. Therefore we
have cn0 , cn1 ∈ Lna,0 , ¬(Lna,0 ∩ C4n ) = Lna,0 ∩ ¬C4n 6= ∅ and ¬(Lna,0 ∩ K0n ) = Lna,0 ∩ K1n and
hence by Theorem 1.1, (Lna,0 ; +) is a four-part semigroup.
(ii) Now, let a = 1̂, i.e. L^n_{a,0} = L^n_{1̂,0}. We show that L^n_{1̂,0} ∩ C_4^n = ∅ and L^n_{1̂,0} ∩ ¬C_4^n = ∅.
Assume that there is f ∈ L1̂,0 ∩ C4n or f ∈ L1̂,0 ∩ ¬C4n . Then we have f (0̂) = 0 and
f (1̂) = 1 or f (0̂) = 1 and f (1̂) = 0. Therefore f (0̂) + f (1̂ −n 0̂) = f (0̂) + f (1̂) = 1
and hence f 6∈ Ln1̂,0 , a contradiction. Thus Ln1̂,0 ⊆ K0n ∪ K1n such that cn0 , cn1 ∈ Ln1̂,0
and Ln1̂,0 contains the negations of its elements, i.e. Ln1̂,0 = C ∪ ¬C for some cn0 ∈ C ⊆
K0n . Therefore by Theorem 1.2, Ln1̂,0 forms a two-constant semigroup containing the
negations of its elements.
(iii) Let Lna,1 form a semigroup. Assume that a 6= 1̂. By Lemma 4.1 (i), â 6= 0̂ and
by Lemma 3.1 (iii), Lna,1 does not contain any constant element. By Lemma 4.1 (iii),
Lna,1 ∩ (K0n \ {cn0 }) 6= ∅ and hence f + g = cn0 6∈ Lna,1 , for every cn0 6= f ∈ Lna,1 ∩ K0n , g ∈
Lna,1 , a contradiction. Thus a = 1̂. Conversely, if a = 1̂, then by Proposition 2.3 (i),
L^n_{a,1} = L^n_{1̂,1} forms a subsemigroup of (O^n({0, 1}); +). Moreover, by Lemma 3.1 (iii), L^n_{1̂,1} contains neither c_0^n nor c_1^n. Hence L^n_{1̂,1} ∩ K_0^n and L^n_{1̂,1} ∩ K_1^n must be empty
sets. Furthermore, by Lemma 4.1 (ii), Ln1̂,1 contains all the negations of its elements.
Therefore Ln1̂,1 = C ∪ ¬C for some C ⊆ C4n .
Theorem 4.1. Let A = {0, 1} and let n ≥ 1. Then (P oln αA ; +) is a four-part sub-
semigroup of (On (A); +).
Proof: Each n-ary projection eni belongs to P oln αA . Thus P oln αA ∩ C4n 6= ∅.
Moreover, the constant cn0 belongs to P oln αA . The operation ¬ corresponds to c for
c = 1 and then by Lemma 2.4 we have ¬f ∈ P oln αA whenever f ∈ P oln αA . Therefore
by Theorem 1.1, (P oln αA ; +) is a four-part semigroup.
References
[1] Butkote, R., Denecke, K., Semigroup Properties of Boolean Operations, Asian-Eur. J. Math,
Vol. 1, No. 2, 157–176, 2008.
[2] Butkote, R., Universal-algebraic and Semigroup-theoretical Properties of Boolean Operations,
Dissertation, Universität Potsdam, 2009.
[3] Denecke, K., Lau, D., Pöschel, R. and Schweigert, D., Hyperidentities, Hyperequational
Classes and Clone Congruences, Contributions to General Algebra 7, 97-118, Verlag Hölder-
Pichler-Tempsky, Wien, 1991.
[4] Denecke, K., Wismath, S. L., Hyperidentities and Clones, Gordon and Breach Science Publisher,
2000.
[5] Denecke, K., Wismath, S. L., Universal Algebra and Applications in Theoretical Computer
Science, Chapman and Hall, 2002.
[6] Denecke, K., Wismath, S. L., Universal Algebra and Coalgebra, World Scientific, 2009.
[7] Denecke, K., Susanti, Y., Semigroup-theoretical Properties of Boolean Operations, —–, 2009
(submitted).
[8] Denecke, K., Susanti, Y., Semigroups of n-ary Operations on Finite Sets,—–, 2010 (submitted).
[9] Fearnley, A., Clones on Three Elements Preserving a Binary Relation, Algebra Universalis 56,
165-177, 2007.
[10] Lau, D., Function Algebras on Finite Sets, Springer, 2006.
[11] Pöschel, R., Kalužnin, L. A., Funktionen- und Relationenalgebren, VEB Deutscher Verlag der Wissenschaften, Berlin, 1979.
[12] Rosenberg, I. G., Über die Funktionale Vollständigkeit in den Mehrwertigen Logiken, Rozpravy Československé Akad. Věd, Ser. Math. Nat. Sci. 80, 3-93, 1970.
[13] Szendrei, Á., Clones in Universal Algebra, Les Presses de L’ Université de Montréal, 1986.
Denecke, K.
Universität Potsdam, Am Neuen Palais 10, 14469 Potsdam Deutschland.
e-mail: [email protected]
Susanti, Y.
Universitas Gadjah Mada, FMIPA UGM Sekip Utara Bulaksumur 55281 Yogyakarta Indonesia.
e-mail: [email protected]
Proceedings of ”The 6th SEAMS-UGM Conference 2011”
Algebra, pp. 159–168.
1. INTRODUCTION
Two aspects stand out in system theory: robust stabilization and model reduction. The robust stabilization problem is the problem of finding a controller that stabilizes not only the nominal plant but also plants similar to the nominal one. Moreover, a controller of low order is often desirable in practice. One approach to obtaining a low-order controller is model reduction. For these purposes, coprime factorization of the representation of a transfer function has become a powerful tool. The coprime factorization theory was later generalized to infinite-dimensional systems in [1, 3].
Curtain and Opmeer [2, 9] have developed a model reduction method based on LQG-balanced realizations for infinite-dimensional systems. An LQG-balanced realization is a realization of a transformed system such that the solutions of the corresponding control and filter Riccati operator equations of the Linear Quadratic Gaussian (LQG) controller are equal and diagonal. The key step in the analysis of the LQG-balanced realization is to construct the normalized left-coprime factorization (NLCF) system using the solutions of the LQG-Riccati equations of infinite-dimensional systems [2].
During the last decades, there has been much research on the design of H∞ controllers, which are robust to system uncertainty and disturbances. In [8], it is shown that the LQG-balanced realization of FDLTI systems can be carried over to the H∞-balanced realization based on H∞-type controllers. Hence, it is very interesting to generalize the H∞-balanced realization to infinite-dimensional systems. The H∞-balanced realization is constructed via the normalized left-coprime factorization (NLCF) system using the solutions of the H∞-Riccati equations of infinite-dimensional systems [4]. The systems considered are assumed to be exponentially stabilizable and detectable linear state systems with bounded and finite-rank input and output operators. Furthermore, we derive the connection between the controllability and observability gramians of the NLCF system and the solutions of the H∞-Riccati equations.
on X is exponentially stable [3] if there exist positive constants M and α such that
kSF (t)k ≤ M e−αt , for all t ≥ 0.
Next, we review an H∞ -control problem for infinite-dimensional systems. This
problem is concerned with a generalized plant
( ẋ(t)  )   ( A  B  0  B ) ( x(t)  )
( z1(t) ) = ( C  0  0  0 ) ( w1(t) )        (2)
( z2(t) )   ( 0  0  0  I ) ( w2(t) )
( y(t)  )   ( C  0  I  0 ) ( u(t)  )
Put z := (z1, z2)^T and w := (w1, w2)^T, where z is the error signal and w is the disturbance signal. The state space description of the controller with transfer function K is given
signals. The state space description of the controller with transfer function K is given
by
The closed-loop transfer function from w to z will be denoted by Tzw . The existence
of suboptimal H∞ -control for infinite-dimensional systems is given by the following
theorem.
Theorem 2.1. [10, 7] There exists an admissible controller K for the system (2) such that ‖Tzw‖_∞ < γ, γ > 0, if and only if there exist operators X, Y ∈ L(X) with X = X* ≥ 0, Y = Y* ≥ 0 that satisfy
1. the operator Y satisfies the filter Riccati equation
AYx + YA*x − (1 − γ^{-2})YC*CYx + BB*x = 0,  x ∈ D(A*).   (5)
Moreover, when these conditions hold, one such controller K (3) can be constructed with
A_K = A − (1 − γ^{-2})BB*X − ZC*C,
B_K = ZC*,
C_K = −B*X,
Z = (I − γ^{-2}YX)^{-1}Y = Y(I − γ^{-2}XY)^{-1}.
Since M̃, Ñ, X and Y are analytic and bounded on C_0^+, the identity (14) must hold for all s ∈ C_0^+, where C_0^+ is the set of complex numbers with real part larger than zero.
3. The last step, we will show that
M̃(iω)M̃(iω)∗ + Ñ(iω)Ñ(iω)∗ = I, ω ∈ R. (15)
Note that for ω ∈ R will be obtained
M̃(iω)M̃(iω)∗ + Ñ(iω)Ñ(iω)∗
= [I − C(iωI − AY )−1 β 2 Y C ∗ ][I − Cβ 2 Y (−iωI − A∗Y )−1 C ∗ ]+
C(iωI − AY )−1 βBβB ∗ (−iωI − A∗Y )−1 C ∗
= I − C(iωI − AY )−1 β 2 Y C ∗ − Cβ 2 Y (−iωI − A∗Y )−1 C ∗ +
C(iωI − AY )−1 β 4 Y C ∗ CY + β 2 BB ∗ (−iωI − A∗Y )−1 C ∗
= I+
C(iωI − AY )−1 AY β 2 Y + β 2 Y A∗Y + β 4 Y C ∗ CY + β 2 BB ∗ (−iωI − A∗Y )−1 C ∗
= I.
To establish (15), we substitute AY = A − β 2 Y C ∗ C to the following form
AY β 2 Y + β 2 Y A∗Y + β 4 Y C ∗ CY + β 2 BB ∗
= [A − β 2 Y C ∗ C]β 2 Y + β 2 Y [A − β 2 Y C ∗ C]∗ + β 4 Y C ∗ CY + β 2 BB ∗
= Aβ 2 Y + β 2 Y A∗ − β 2 Y C ∗ Cβ 2 Y + β 2 BB ∗
=0 by (6).
This verifies (15).
Note that the NLCF system (AY , BY , CY , DY ) is exponentially stable. The con-
nection between the controllability and observability gramians of NLCF system with
the solutions of the Riccati equations is given by the following lemma.
Proof. Since the NLCF system (AY , BY , CY , DY ) is exponentially stable, then the con-
trollability gramian LB = L∗B ≥ 0 is the unique solution of the Lyapunov equation [3,
Lemma 4.1.24]
AY β 2 Y x + β 2 Y A∗Y x = −β 2 BB ∗ x − β 4 Y C ∗ CY x. (19)
We see that L_B and β²Y are both solutions of the Lyapunov equation (18). By the uniqueness of the solution, we conclude that L_B = β²Y.
Similarly, the observability gramian LC = L∗C ≥ 0 is the unique solution of the
Lyapunov equation
Since the maximum Hankel singular value of the NLCF system (AY, BY, CY, DY) is less
than one, we have that (I − LC LB ) is invertible and (I − LC LB )−1 : D(A∗ ) → D(A∗ )
[3, Lemma 9.4.7, Lemma 8.3.2].
We verify that Q := (I − LC LB )−1 LC is a solution of the control LQG-Riccati (7).
where we have used (21) and (17). Hence, Q = (I − LC LB )−1 LC satisfies the Riccati
equation
QAY x + A∗Y Qx = −(I − LC LB )−1 C ∗ C(I − LB LC )−1 x + QBY BY∗ Qx
QAY x + A∗Y Qx − QBY BY∗ Qx = −(I − LC LB )−1 C ∗ C(I − LB LC )−1 x. (22)
Then substitute BY = [βB − β 2 Y C ∗ ] to (22) such that we obtain
QAY x + A∗Y Qx − Qβ 2 BB ∗ Qx − Qβ 2 Y C ∗ Cβ 2 Y Qx
= −(I − LC LB )−1 C ∗ C(I − LB LC )−1 x. (23)
Adding C ∗ C to the both sides of (23) and using β 2 Y = LB , we obtain
QAY x + A∗Y Qx − Qβ 2 BB ∗ Q + C ∗ Cx
= C ∗ Cx + QLB C ∗ CLB Qx − (I − LC LB )−1 C ∗ C(I − LB LC )−1 x
= (I − LC LB )−1 [(I − LC LB )C ∗ C(I − LB LC ) + LC LB C ∗ CLB LC − C ∗ C] ·
(I − LB LC )−1 x
= −(I − LC LB )−1 [LC LB C ∗ C(I − LB LC ) + (I − LC LB )C ∗ CLB LC ] (I − LB LC )−1 x
= −QLB C ∗ C − C ∗ CLB Qx.
Hence, for x ∈ D(A), we have
QAY x + A∗Y Qx − Qβ 2 BB ∗ Qx + C ∗ Cx = −QLB C ∗ Cx − C ∗ CLB Qx. (24)
Moreover, using AY = A − β 2 Y C ∗ C, we obtain
XAYx + A*YXx − Xβ²BB*Xx + C*Cx    (25)
= XAx + A*Xx − Xβ²YC*Cx − C*Cβ²YXx − Xβ²BB*Xx + C*Cx.
Note that X is the unique solution of the control LQG-Riccati equation (7). Using
β 2 Y = LB , we can rewrite (25) as
XAY x + A∗Y Xx − Xβ 2 BB ∗ Xx + C ∗ Cx = −XLB C ∗ Cx − C ∗ CLB Xx. (26)
According to (24) and (26), we have Q and X are both solutions to the control LQG-
Riccati equation. So by the uniqueness property, we have
X = Q = (I − LC LB )−1 LC = LC (I − LB LC )−1 . (27)
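For a finite-dimensional example, the two identities L_B = β²Y and X = (I − L_C L_B)^{-1} L_C can be checked numerically. The sketch below (an illustration with random test matrices, assuming scipy; the Riccati and Lyapunov equations are those stated above) is not part of the paper.

import numpy as np
from scipy.linalg import solve_continuous_are, solve_lyapunov

rng = np.random.default_rng(0)
n, m, p, gamma = 4, 2, 2, 2.0
beta2 = 1.0 - gamma ** (-2)

A = rng.standard_normal((n, n)) - 3 * np.eye(n)   # shifted test matrix (typically stable)
B = rng.standard_normal((n, m))
C = rng.standard_normal((p, n))

# control Riccati:  A'X + XA - beta^2 X B B' X + C'C = 0
X = solve_continuous_are(A, B, C.T @ C, np.eye(m) / beta2)
# filter Riccati:   A Y + Y A' - beta^2 Y C'C Y + B B' = 0   (solved as the dual problem)
Y = solve_continuous_are(A.T, C.T, B @ B.T, np.eye(p) / beta2)

AY = A - beta2 * Y @ C.T @ C
BY = np.hstack([np.sqrt(beta2) * B, beta2 * Y @ C.T])

LB = solve_lyapunov(AY, -BY @ BY.T)               # controllability gramian of the NLCF system
LC = solve_lyapunov(AY.T, -C.T @ C)               # observability gramian of the NLCF system

print(np.allclose(LB, beta2 * Y, atol=1e-8))                                # L_B = beta^2 Y
print(np.allclose(X, np.linalg.solve(np.eye(n) - LC @ LB, LC), atol=1e-6))  # X = (I - L_C L_B)^{-1} L_C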
4. CONCLUDING REMARKS
We have constructed the normalized left-coprime factorization (NLCF) system based on the H∞-Riccati equations. There is a relationship between the controllability and observability gramians of the NLCF system and the solutions of the Riccati equations. In future work, we will extend model reduction techniques based on H∞-balancing via the NLCF system.
References
[1] Curtain, R. F., Robust stabilizability of normalized coprime factors: the infinite-dimensional
case, Int. J. Control 51, 1173-1190, 1990.
[2] Curtain, R. F., Model reduction for control design for distributed parameter systems, in Research
Directions in Distributed Parameter systems, SIAM, Philadelphia, PA, 95-121, 2003.
[3] Curtain, R. F. and Zwart, H. J., An Introduction to Infinite-Dimensional Systems, Springer-
Verlag, New York, 1995.
[4] Fatmawati, Model Reduction and Low Order Controller Design Strategy for Infinite Dimensional
Systems, PhD Dissertation, Institut Teknologi Bandung, Indonesia, 2010.
[5] Glover, K. and McFarlane, D., Robust stabilization of normalized coprime factor plant de-
scriptions with H∞ -bounded uncertainty, IEEE Trans. on Automatic Control AC-34, 821-830,
1989.
[6] Meyer, D. E., Fractional balanced reduction: model reduction via fractional representation, IEEE
Trans. on Automatic Control, 35, 1341-1345, 1990.
[7] Morris, K. A, H∞ -output feedback of infinite-dimensional systems via approximation, Systems
Control Lett. , 44, 211-217, 2001.
[8] Mustafa, D. and Glover, K., Controller reduction by H∞ -balanced truncation, IEEE Trans.
on Automatic Control, 36, 668-682, 1991.
[9] Opmeer, M. R., LQG balancing for continuous-time infinite-dimensional systems, SIAM J. Con-
trol and Optim., 46, 1831-1848, 2007.
[10] van Keulen, B., H∞ -Control for Distributed Parameter Systems: A State-Space Approach, Sys-
tems & Control: Foundation & Applications, Birkhäuser, Boston, 1993.
FATMAWATI
Department of Mathematics, Faculty of Science and Technology, Universitas Airlangga,
Kampus C Jl. Mulyorejo Surabaya, Indonesia.
e-mail: [email protected]
Roberd Saragih
Industrial and Financial Mathematics Group, Faculty of Mathematics and Natural Science,
Institut Teknologi Bandung,
Jl. Ganesa 10 Bandung, Indonesia.
e-mail: [email protected]
Yudi Soeharyadi
Analysis and Geometry Group, Faculty of Mathematics and Natural Science, Institut Teknologi
Bandung,
Jl. Ganesa 10 Bandung, Indonesia.
e-mail: [email protected]
Proceedings of ”The 6th SEAMS-UGM Conference 2011”
Algebra, pp. 169 - 174.
Abstract. Consider functions from any set X into a lattice L with a smallest element. The set of all such functions is called the L-power set of X and is denoted by L^X. It is known that L^X is a complete Heyting algebra containing 0_X and 1_X. The lattice L can be isomorphically embedded into the lattice L^X, so L can be viewed as a subset of L^X. The purpose of this note is to construct L as a complete Heyting algebra containing both a smallest and a largest element. We first show that the smallest element of L^X is contained in every nonempty subset of the sublattice of L^X. Since every nonempty subset of L also contains the smallest element, it follows that such a subset always has a supremum and an infimum in L, which is therefore a complete lattice. Moreover, since L^X is a relatively pseudo-complemented lattice, so is L. Finally, we can prove that L is a complete Heyting algebra.
Keywords and Phrases: complete lattice, relatively pseudo-complemented, Heyting algebra, complete Heyting algebra.
1. INTRODUCTION
1.1 Fundamental Definitions. The concept of a complete Heyting algebra is connected with particular classes of lattices, such as relatively pseudo-complemented lattices and lattices satisfying the infinitely distributive law. On the other hand, we know that lattices can be studied both by an algebraic approach and by an order-theoretic approach. Furthermore, we can use the notation (V; ∧, ∨) for an algebra with the two binary operations ∧ and ∨. In this section, we give some basic definitions and examples related to lattices and Heyting algebras. We also start by recalling some well-known results. For basic notations and results, we refer the reader to the references [6].
respectively.

Figure 1 and Figure 2 (lattice diagrams)

b) The closed interval [0, 1] with the usual order is a complete lattice.
c) The set of all subsets of some set, ordered by inclusion, is a complete lattice.
d) The set of all integers with the usual ordering is not a complete lattice.
The following definition explains some notation relating with the mapping between two
lattices.
Definition 1.3. [1] Let : L L * be any mapping where L and L* are lattice such that:
(i) x L y x L* y ,
(ii) x L y x L* y ,
(iii) x L y x L* y ,
for any x, y L . Mapping is called:
a) isoton if it holds (i),
b) meet homomorphism if it holds (i) and (ii),
c) join homomorphism if it holds (i) and (ii),
d) lattice homomorphism if it holds (i), (ii) and (iii).
(⋁_{a∈A} a) ∧ b = ⋁_{a∈A} (a ∧ b), for all A ⊆ L and for all b ∈ L.
The following definition of a Heyting algebra as a lattice is taken from Katsuya (1981).
Definition 1.7. [4]. A lattice L is a Heyting algebra if it is a bounded distributive lattice and
an RPC lattice.
The connection between a lattice (as a poset) and a Heyting algebra is at the core of this paper.
Proposition 1.8. [3] Let (L, ≤) be a bounded lattice that is also a relatively pseudo-complemented lattice. Then the three binary operations infimum, supremum and relative pseudo-complement make (L, ∧, ∨, ⇒, 0, 1) a Heyting algebra. Conversely, a Heyting algebra is a bounded lattice such that for all a and b in L there is a greatest element x of L for which a ∧ x ≤ b holds.
Definition 1.9. [5] By an L-subset of X, we mean a function from any set X into L where L is a
complete Heyting algebra. The set of all L-subsets of X is called the L-power set of X and is
denoted by LX.
Here are the known results that are taken from [6].
Theorem 1.10. [6] Let L be a complete Heyting algebra. LX is a complete Heyting algebra if
LX is a relatively pseudo-complemented lattice.
Theorem 1.11. [6] The L-power set of X, together with the operations union, intersection, and relative pseudo-complement, constitutes a complete Heyting algebra whose partial ordering is ⊆. Its maximal and minimal elements are 1_X and 0_X, respectively. Moreover, the lattice L can be isomorphically embedded into the lattice L^X.
From now on, L^X is a complete Heyting algebra together with the operations union, intersection, and relative pseudo-complement, containing 1_X and 0_X, unless otherwise stated. On the other hand, we assume L to be a lattice with a smallest element 0.
The relation between the lattice L and the lattice L^X shown in [6] is as follows. We start by recalling the embedding L → L^X that sends a ∈ L to the function a_Y ∈ L^X defined by
a_Y(x) := a if x ∈ Y, and a_Y(x) := 0 if x ∈ X \ Y,
where Y ⊆ X and a ∈ L.
Case 1. For an arbitrary lattice L, the definition of this function fails (the value 0 need not exist).
Case 2. For a lattice L with a bottom element, the definition of the function is well defined.
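Case 2 can be checked on a small finite example; the sketch below (illustrative — the lattice, the set X and the subset Y are arbitrary test data) verifies that the map a ↦ a_Y preserves meets and joins computed pointwise in L^X when L has a bottom element.

from math import gcd

# the divisor lattice of 12: meet = gcd, join = lcm, bottom element 0_L = 1
L = [1, 2, 3, 4, 6, 12]
meet = gcd
join = lambda a, b: a * b // gcd(a, b)
bottom = 1

X = ["p", "q", "r"]
Y = {"p", "q"}                       # the fixed subset Y of X

def embed(a):
    """a |-> a_Y : X -> L, a_Y(x) = a for x in Y and the bottom element otherwise."""
    return {x: (a if x in Y else bottom) for x in X}

def pointwise(op, f, g):
    return {x: op(f[x], g[x]) for x in X}

ok = all(pointwise(meet, embed(a), embed(b)) == embed(meet(a, b)) and
         pointwise(join, embed(a), embed(b)) == embed(join(a, b))
         for a in L for b in L)
print("a -> a_Y preserves meets and joins:", ok)     # True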
The following lemma, taken from Davey and Priestley [2], will be used to prove the main result.
Proposition 2.2. Let X be any set, let L be any lattice with a smallest element, and let L^X be a complete Heyting algebra. Suppose there exists an isomorphic embedding of L into L^X with the property that the sublattice L contains every function a_Y in L^X such that the image of a equals a_Y(x), for any x ∈ X, Y ⊆ X and a ∈ L. Then L is a complete Heyting algebra.
Proof: By the definition above, the image of a equals a_Y(x) for any x ∈ X; in particular, since a ≥ 0 for every a ∈ L, there exists at least the element 0 ∈ L such that L contains a_Y ∈ L^X with this property. Let the set of all functions satisfying this property be denoted by S_a := { a_Y(x) ∈ L^X | the image of a equals a_Y(x) }; it is contained in L because 0 ∈ L implies 0_Y ∈ S_a, which is the smallest element of L^X. Thus, every nonempty subset A of L contains the smallest element 0 ∈ L, and L is said to satisfy the minimum condition; this means that L fulfills the descending chain condition. Furthermore, for every non-empty subset A of L there exists a finite subset B of A such that sup A = sup B exists in L. From this we conclude that L is a complete lattice. Moreover, L^X satisfies the infinitely distributive law, and so does L. According to Birkhoff, L is a relatively pseudo-complemented lattice. Finally, by the properties of relatively pseudo-complemented lattices, the result in [6] shows that L is a complete Heyting algebra.
3. CONCLUDING REMARK
If L is an arbitrary lattice and LX is a complete Heyting algebra, then we cannot construct a
complete Heyting algebra from L, because the definition of the embedding mapping φ from L into
LX fails. On the other hand, a lattice L with a smallest element can be made into a complete
Heyting algebra if there is an isomorphic embedding φ with the property that the sublattice
φ(L) contains all functions aY such that φ(a) ≥ aY(x) for any x ∈ X, Y ⊆ X and a ∈ L.
References
[1] BIRKHOFF, G., Lattice Theory, vol. 25, American Mathematical Society Colloquium Publications, New
York, 1984.
[2] DAVEY, B. A. AND PRIESTLEY, H. A., Introduction to Lattices and Order, Second Edition, Cambridge
University Press, 2002.
[3] RASIOWA, H. AND SIKORSKI, R., The Mathematics of Metamathematics, Warszawa, Poland, 1963.
[4] KATSUYA, E., Completion and Coproduct of Heyting Algebras, Tsukuba Journal of Mathematics, vol. 5, No. 2,
1981.
[5] MORDESON, J. AND MALIK, D. S., Fuzzy Commutative Algebra, World Scientific Publishing Co. Pte. Ltd.,
Singapore, 1998.
[6] MONIM, H.O.L., WIJAYANTI, I.E. AND WAHYUNI, S., Construction of a Complete Heyting Algebra LX, presented
at the National Conference on Algebra, Padjadjaran University, Bandung, 201
SRI WAHYUNI
Department of Mathematics, Faculty of Mathematics and Natural Science,
Gadjah Mada University, Yogyakarta-Indonesia.
e-mail: [email protected]
Proceedings of ”The 6th SEAMS-UGM Conference 2011”
Algebra, pp. 175 - 182.
Abstract. A bilinear form semigroup is a semigroup constructed from a bilinear form. In this
paper we establish the properties of a bilinear form semigroup as a regular semigroup. The
bilinear form semigroup will be denoted by . We get some properties of the fuzzy regular
bilinear form subsemigroup of , i.e: is a fuzzy regular bilinear form subsemigroup of
if and only if , is a regular subsemigroup of , provided . If is a
nonempty subset of , then is a regular subsemigroup of if and only if , the
characteristics function of , is a fuzzy regular bilinear form subsemigroup of . If is a
fuzzy regular bilinear form subsemigroup of , then .
Keywords and Phrases : fuzzy subsemigroup, fuzzy regular bilinear form subsemigroup
1. INTRODUCTION
The basic concept of a fuzzy subset was introduced by Zadeh in 1965. The concept has
been developed in many areas of algebraic structures, such as fuzzy subgroups, fuzzy subrings,
fuzzy subsemigroups, etc. Following the articles of Asaad (1991), Kandasamy (2003),
Mordeson & Malik (1998) and Ajmal (1994), a fuzzy subset of a set X is a mapping from X into
the interval [0, 1]. Let μ be a fuzzy subset of X and t ∈ [0, 1]; then the level subset μ_t is
defined as the set of all elements of X whose images under the fuzzy subset μ are greater than
or equal to t, i.e. μ_t = {x ∈ X | μ(x) ≥ t}.
Moreover, let S be a semigroup; then a mapping μ : S → [0, 1] is called a fuzzy
subsemigroup of the semigroup S if
μ(xy) ≥ min{μ(x), μ(y)} for all x, y ∈ S. (1)
The fuzzy subset μ is a fuzzy subsemigroup of the semigroup S if and only if every nonempty
level subset μ_t is a subsemigroup of S.
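A small illustrative check (an assumed example, not from the paper): a fuzzy subset μ of the multiplicative semigroup Z₆ together with a test of condition (1) and of the closedness of one level subset.

```python
S = range(6)
op = lambda x, y: (x * y) % 6                      # the semigroup operation on Z_6
mu = {0: 1.0, 1: 0.8, 2: 0.5, 3: 0.5, 4: 0.5, 5: 0.8}   # hypothetical membership values

# condition (1): mu(xy) >= min(mu(x), mu(y)) for all x, y
is_fuzzy_subsemigroup = all(mu[op(x, y)] >= min(mu[x], mu[y]) for x in S for y in S)

t = 0.8
level = {x for x in S if mu[x] >= t}               # the level subset mu_t
level_closed = all(op(x, y) in level for x in level for y in level)
print(is_fuzzy_subsemigroup, level, level_closed)  # True {0, 1, 5} True
```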
Let and be vector spaces over field , with the characteristics of is zero. A
function is called a bilinear form if is linear with respect to each variable.
Every bilinear form determines two linear transformation, , which is defined as
and , which is defined as . In this case,
and are the dual spaces of and , respectively. In this paper, we denote and
as the set of all linear operators of and , respectively. For , we can
construct the following sets:
The structure of the set is a semigroup with respect to the binary operation define as
. The semigroup is called a bilinear form semigroup
related to the bilinear form .
Let S be a semigroup and a ∈ S; the element a is called a regular element if
there exists x ∈ S such that a = axa. A semigroup S is called regular if every element
is regular. The element a is called completely regular if there exists x ∈ S such that
a = axa and ax = xa. A semigroup is called completely regular if every element is a
completely regular element.
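A hedged sketch (an assumed example): checking the regularity condition a = axa in the multiplicative semigroup Z₆, every element of which happens to be regular.

```python
S = range(6)
op = lambda x, y: (x * y) % 6

def is_regular(a):
    # a is regular if some x in S satisfies a = a*x*a
    return any(op(op(a, x), a) == a for x in S)

print(all(is_regular(a) for a in S))   # True for this particular semigroup
```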
The purpose of this paper is to define the fuzzy regular bilinear form subsemigroup
and to investigate its characteristics.
2. RESULTS
and
(2)
then is called a fuzzy regular bilinear form subsemigroup.
Furthermore,
, .
From this point we have is a fuzzy regular bilinear form subsemigroup of semigroup
.
Hence,
Thus,
Therefore it implies .
The following proposition gives a condition on a semigroup homomorphism under which
the image of a fuzzy regular bilinear form subsemigroup is again a fuzzy regular bilinear form
subsemigroup, and vice versa.
Proof:
1. For every is a regular subsemigroup bilinear form , provided
. In fact, if there is such that is not a nonempty set
and is not regular semigroup, then there is such that:
, ,i.e. and
, ,
and
Let :
and
Clearly, , and so ,
i.e. is not regular. It contradicts that is a fuzzy regular bilinear form
subsemigroup of semigroup
2. If is a fuzzy regular bilinear form subsemigroup of semigroup , then for every
, , there exists such that
. Since is surjective, for every there exists
such that .
For every , , there exists
with
that implies
It is proved that , so .
3. CONCLUDING REMARKS
References
[1] AJMAL, NASEEM., Homomorphism of Fuzzy groups, Correspondence Theorem and Fuzzy Quotient Groups.
Fuzzy Sets and Systems 61, p:329-339. North-Holland. 1994
[2] AKTAŞ, HACI, On Fuzzy Relation and Fuzzy Quotient Groups. International Journal of Computational
Cognition Vol 2 ,No 2, p: 71-79. 2004
[3] ASAAD, MOHAMED, Group and Fuzzy Subgroup. Fuzzy Sets and Systems 39, p:323-328. North-Holland. 1991
[4] HOWIE, J.M., An Introduction to Semigroup Theory. Academic Press, Ltd, London, 1976
[5] KANDASAMY, W.B.V., Smarandache Fuzzy Algebra. American Research Press and W.B. Vasantha Kandasamy
Rehoboth. USA. 2003
[6] KARYATI, S.WAHYUNI, B. SURODJO, SETIADJI , Ideal Fuzzy Semigrup. Seminar Nasional MIPA dan Pendidikan
MIPA di FMIPA, Universitas Negeri Yogyakarta, tanggal 30 Mei 2008. Yogyakarta, 2008.
[7] KARYATI, S. WAHYUNI , B. SURODJO, SETIADJI, The Fuzzy Version Of The Fundamental Theorem Of
Semigroup Homomorphism. The 3rd International Conference on Mathematics and Statistics (ICoMS-3)Institut
Pertanian Bogor, Indonesia, 5-6 August 2008. Bogor, 2008.
[8] KARYATI, S. WAHYUNI, B. SURODJO, SETIADJI, Beberapa Sifat Ideal Fuzzy Semigrup yang Dibangun oleh
Subhimpunan Fuzzy. Seminar Nasional Matematika, FMIPA, UNEJ. Jember, 2009.
[9] MORDESON, J.N AND MALIK, D.S. Fuzzy Commutative Algebra. World Scientific Publishing Co. Pte. Ltd.
Singapore, 1998.
KARYATI
Ph. D student, Mathematics Department, Gadjah Mada University
e-mail: [email protected]
SRI WAHYUNI
Mathematics Department, Gadjah Mada University
e-mail: [email protected]
BUDI SURODJO
Mathematics Department, Gadjah Mada University
e-mail: [email protected]
SETIADJI
Mathematics Department, Gadjah Mada University
Proceedings of ”The 6th SEAMS-UGM Conference 2011”
Algebra, pp. 183 – 192.
Abstract. Given a directed graph E, we can define a Leavitt path algebra LR(E) whose
coefficients are in a commutative unital ring R. The Leavitt path algebras LR(E) generalize the Leavitt path
algebras LK(E) over a field K. In general, LR(E) and LK(E) have the same basic properties;
however, some differences do exist. In addition to differing in the sufficient condition of the graded uniqueness
theorem, they also differ in the sufficient condition of the Cuntz-Krieger uniqueness theorem.
Keywords and Phrases: Path algebra, Leavitt path algebra, Graded uniqueness theorem, the Cuntz-
Krieger uniqueness theorem.
1. INTRODUCTION
Any mathematical object involving points and connections between them may be called a
graph. Graphs are not only viewed as combinatorial objects that sit at the core of
mathematical intuition, but can also be described algebraically. An amalgamation of graph
theory and algebra can create new algebras such as path algebras over a field.
In [1], [2], and [3] the authors explained in detail how to construct Leavitt path algebras
over a graph E with coefficients in a field K, denoted LK(E). Leavitt path algebras
are constructed from path algebras over a field on the Leavitt extended graph with the Cuntz-Krieger
conditions.
Leavitt path algebras LR(E) over a commutative unital ring R are a generalization of
Leavitt path algebras LK(E) over a field K. Tomforde in [4] constructs LR(E) by using the
definition of a Leavitt E-family. It is interesting to examine the similarities and differences
between the two constructions as well as their properties.
In [5] the two constructions of LK(E) and LR(E) are compared, and we obtain that the
definition of LK(E) can be expressed as the K-algebra constructed by a Leavitt E-family. Conversely,
LR(E) can be stated as the path R-algebra on the Leavitt extended graph which satisfies
certain Cuntz-Krieger conditions. Several properties of LK(E) still hold for LR(E), but many
others differ. One of the differences is the graded
uniqueness theorem of Leavitt path algebras.
This paper is a continuation of the one presented previously in Bandung. The focus of the
previous article was the graded uniqueness theorem of Leavitt path algebras [5], while the
present one is the Cuntz-Krieger uniqueness theorem on LR(E) and LK(E). Both this paper and
the previous one give not only a review but also detailed proofs. In addition, Tomforde [4]
proved that LR(E) is a ℤ-graded ring, while [5] described in detail, with a different proof, that
LR(E) is a ℤ-graded algebra.
The paper is organized into five sections. After the introduction in
Section 1, we describe the constructions of Leavitt path algebras in Section 2. In Section 3 we
clarify the properties of Leavitt path algebras. The Cuntz-Krieger uniqueness theorem
of Leavitt path algebras is explained in Section 4, and finally we conclude the paper in
Section 5.
2.1. Leavitt Path Algebra over a Field. In [1], [2] and [3] Aranda Pino et al. explained in
detail how to construct Leavitt path algebras with coefficients in a field. They begin by
reminding the reader of the construction of the standard path algebra of a graph.
Definition 2.1.1. Let K be a field and E = (E⁰, E¹, r, s) be a graph. The path K-algebra over E is
defined as the free K-algebra KE with the relations:
a. vᵢ vⱼ = δᵢⱼ vᵢ for every vᵢ, vⱼ ∈ E⁰;
b. eᵢ = eᵢ r(eᵢ) = s(eᵢ) eᵢ for every eᵢ ∈ E¹.
In addition to the definition of path algebras, the definition of the extended graph is needed to
construct Leavitt path algebras. The following are definitions related to the extended graph and
Leavitt path algebras.
Definition 2.1.2. Given a graph E = (E⁰, E¹, r, s), we define a new graph called the extended graph,
that is Ê = (E⁰, E¹ ∪ (E¹)*, r′, s′), where (E¹)* = {eᵢ* : eᵢ ∈ E¹} and the functions r′, s′ agree with
r, s on E¹ and satisfy r′(eᵢ*) = s(eᵢ), s′(eᵢ*) = r(eᵢ). An edge eᵢ is called a real edge
and eᵢ* is called a ghost edge. If the path p = e₁e₂...eₙ is a real path, then
p* = eₙ*...e₂*e₁* is a ghost path.
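A minimal sketch (assumed data structures, not the paper's notation) of forming the extended graph: every real edge e gets a ghost edge e* whose source and range maps are swapped.

```python
def extend_graph(E0, E1, r, s):
    """E0: list of vertices, E1: list of real edge names, r/s: dicts edge -> vertex."""
    ghost = [e + "*" for e in E1]
    r2, s2 = dict(r), dict(s)
    for e in E1:                 # r'(e*) = s(e) and s'(e*) = r(e)
        r2[e + "*"] = s[e]
        s2[e + "*"] = r[e]
    return E0, E1 + ghost, r2, s2

# Example: a single edge e from v to w.
print(extend_graph(["v", "w"], ["e"], {"e": "w"}, {"e": "v"}))
```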
Definition 2.1.3. Let E be a graph. The Leavitt path algebra of E over a field K, denoted LK(E),
is defined as the path algebra of the extended graph Ê which satisfies the
Cuntz-Krieger relations:
(CK1) e* f = δ_{e,f} r(e) for every e, f ∈ E¹;
2.2.Leavitt Path Algebra over Commutative Unital Ring.Tomforde [4] defined Leavitt
path algebras over commutative ring with unit differently. We will introduce Leavitt E-family
of graph E over ring R to construct definition of Leavitt path algebras.
Definition 2.2.2. Given a graph E and commutative unital ring R. The Leavitt path algebra
with coefficient in R, denoted LR(E), is the universal R-algebra generated by a Leavitt E-
family.
If we examine Definitions 2.1.1, 2.1.2, 2.1.3, 2.2.1 and 2.2.2 and compare the two
constructions, then Definition 2.1.3 of LK(E) can be expressed as the
universal K-algebra generated by a Leavitt E-family. On the other hand, LR(E) can be stated
as the path R-algebra on the Leavitt extended graph which meets certain Cuntz-Krieger conditions.
Therefore, here is a brief review of the Leavitt path algebras LR(E), which are a generalization of
LK(E).
In [5] the authors reviewed not only how to construct Leavitt path algebras but also
their properties. In addition, they reviewed some properties of LK(E) without proof and
explained in detail several properties of LK(E) which still hold in LR(E), as follows:
1. All elements rv of LR(E) are non-zero, for all v ∈ E⁰ and r ∈ R \ {0}.
2. Both Leavitt path algebras are unital if E⁰ is finite; they both contain local units if E⁰
is infinite.
3. Both are ℤ-graded algebras, whose elements are linear combinations of monomials.
Associated with the third property above, there is a small difference between [4] and [5]
in stating the ℤ-graded property of LR(E), and they prove it differently. Tomforde
proved that LR(E) is a ℤ-graded ring, while the authors in [5] showed that LR(E) is a ℤ-graded
algebra, with a complete proof.
As previously discussed is a generalization of . They have some similar
properties, but there are also several different properties. One of them is “the graded
uniqueness theorem of Leavitt path algebras” that was completely reviewed in [5]. The
following will be resubmitted without proof of the graded uniqueness theorem of .
Theorem 3.1. Let E be a graph, and let R be a commutative unital ring. If S is a graded ring
and π : LR(E) → S is a graded ring homomorphism with the property that π(rv) ≠ 0 for all
v ∈ E⁰ and for all r ∈ R \ {0}, then π is injective.
In Theorem 3.1 above, if the commutative unital ring R is replaced by a field K, then it
is always true that for all r ∈ K \ {0} there exists r⁻¹ ∈ K \ {0} such that r r⁻¹ = 1.
Therefore the sufficient condition π(rv) ≠ 0 for all v ∈ E⁰ and for all r ∈ R \ {0} can be replaced
with π(v) ≠ 0 for all v ∈ E⁰. Thus we obtain the following corollary.
In this section we consider graphs that satisfy Condition (L). As defined
earlier, a graph E is said to satisfy Condition (L) if every cycle in E has an exit, where an
edge e is called an exit of the path α = e₁e₂...eₙ if there exists an index i such that s(e) = s(eᵢ)
and e ≠ eᵢ. Next we discuss in detail the lemmas that support the proof of the Cuntz-
Krieger uniqueness theorem.
Lemma 4.1. Let E is a graph satisfying Condition (L). If F is a finite subset of and
, then there exists a path such that and for every .
Proof : Given and a finite subset , then there are three cases.
CASE 1. : There is a path from v to a sink , i.e. .
Let be a path with a sink. For any path , we can see that
. It is not possible because is a , and
then and .
CASE 2. : There is no path from v to a sink .
Given . If with , is a path that no
repeated vertices, then for any . But that is
impossible, because this condition will imply that for some . It means
that there is a contradiction with no repetition of vertices. Thus, it is proved that .
CASE 3. : the path whose vertex v is a source has repetition of vertices and there is a path
from v to base point of a cycle in E.
Given any path , which has a repetition of vertices and there exists a
path of the shortest length such that and is the base point of a cycle in E.
Choose a cycle that has the shortest length with . Because any cycle has an
exit, we can suppose that be an exit for . Let be the segment of from
to .
Consider the following illustration :
[Illustration: a path from v through the vertices v₁, ..., v₅ along edges e₁, ..., e₆ forming a cycle, with f an exit of the cycle leading to v₆; one label is attached to every edge in the path and another to every edge in the cycle.]
Because of the minimality of length of both and , the path has the
property that . If we select sufficiently many repetitions of the cycle
, then it can be guaranteed that the length of path is greater than or equal to M, in other
word, . Therefore, we have that that
is impossible since this will result in for some . It is a contradiction
with ). Thus, .
Lemma 4.2. Let E be a graph satisfying Condition L. Every polynomial in only real edges
and , there exists real paths , such that for some
and some
Proof : Given a graph E satisfying Condition (L), commutative ring R with unit. Suppose
is a polynomial in only real edges and . It will be proved that there are path
such that for some and , by the mathematical
induction on degree x (deg.x) :
The first step will show that it is correct on deg.x = 0 :
If is a polynomial in only real edges with deg.x = 0 and then
for and , with . Chosen
then .
The second step, suppose that it is true on deg.x N – 1, i.e. there exists path
such that for some and , then we will prove that it is true
on deg.x = N : If is a polynomial in only real edges with deg.x = N and
.
If x does not have a terms of degree 0, then such that each xi is a non
zero polynomial in real edges of degree N–1 or less, and with for
any . By induction hypothesis, is a polynomial non zero of deg.
N–1, there are such that , for some and .
If we take then .
If x has a term of degree 0, then we denote with
deg. , and for every . Chosen
, by lemma 4.1. there exists such that and we
have for every . If we take then
Lemma 4.3.Let E be a graph and R be a commutative unital ring. Every polynomial in only
real edges with , if there exists with , then for any edge with
it is the case that .
Proof : let is a non zero polynomial in only real edges with deg.x = k for some
. Since be graded -algebra, i.e. , then ,
where and each , for any . Since , it can be
assumed that . For every with we obtain .
Suppose that then . It contradicts the previous
proposition, that for and . Hence it must be .
Lemma 4.4. Let E be a graph and R be a commutative unital ring. If any non zero polynomial
in only real edges , then there exists E* such that x 0 and x is a polynomial
in only real edges.
Corollary 4.6. Let E be a graph satisfying Condition (L) and K be a field. If is a ring
homomorphism from to S with the property that for all then is
injective.
5. CONCLUSION
Leavitt path algebras LR(E) which is a generalization of Leavitt path algebras LK(E)
has some differences, in addition to some similarities. One of the differences is The Cuntz-
Krieger Uniqueness Theorem on LR(E) and LK(E), in addition to the graded uniqueness
theorem. The Cuntz-Krieger Uniqueness Theorem on LR(E) and LK(E) requires graph E which
satisfy the Condition L, i.e. if every cycle in E has an exit.
The difference between the two constructions lies in the sufficient conditions on the
ring homomorphism π such that π is injective. In the Cuntz-Krieger uniqueness theorem for
LR(E), the condition is π(rv) ≠ 0 for all v ∈ E⁰ and r ∈ R \ {0}. This sufficient condition can still be applied
if the commutative unital ring R is replaced by a field K. Since every element of the field K has an
inverse, the sufficient condition on the ring homomorphism π from
LK(E) to S is π(v) ≠ 0 for all v ∈ E⁰.
Both the graded uniqueness theorem and the Cuntz-Krieger uniqueness theorem have
a similar statement, since both have the same form of sufficient condition.
The ring homomorphism in the Cuntz-Krieger uniqueness theorem does not need to be graded,
but the graph E must satisfy Condition (L). Conversely, the graded
uniqueness theorem requires a graded homomorphism π, but it does not require the graph
to satisfy Condition (L).
As a follow-up, many things can still be studied for the Leavitt path algebras LR(E)
and LK(E) by developing what the authors have already studied, for example the necessary and
sufficient conditions for LR(E) to be simple, semisimple, prime, or semiprime.
References
[1] ABRAMS, G., ARANDA PINO, G., Leavitt Path Algebra of a Graph, J. Algebra 293 (2), 319 – 334, 2005.
[2] ARANDA PINO, G.,, A Course On Leavitt Path Algebra, ITB, 2010
[3] ARANDA PINO, G., PERERA, F., MOLINA, M. S., Graph algebras: bridging the gap between analysis and algebra,
University of Malaga Press, Spain, 2007.
[4] TOMFORDE, M, Leavitt Path Algebras With Coefficient In A Commutative Ring, J. Algebra., 2009.
[5] WARDATI, K., ET AL, Teorema Keunikan Graded Aljabar Lintasan Leavitt (The Graded Uniqueness Theorem on
Leavitt Path Algebras), presented at the “National Conference on Algebra 2011” hosted by the Padjadjaran
University, Bandung, on April 30, 2011.
KHURUL WARDATI
Department of Mathematics, Faculty of Science and Technology, State Islamic University
Sunan Kalijaga.
e-mail: [email protected]
SRI WAHYUNI
Department of Mathematics, Faculty of Mathematics and Natural Sciences, Gadjah Mada
University.
e-mail: [email protected]
Proceedings of ”The 6th SEAMS-UGM Conference 2011”
Algebra, pp. 193 – 204.
Abstract. The activity times in a queuing network are seldom precisely known and can
therefore be represented by fuzzy numbers, called fuzzy activity times. This paper
aims to determine the dynamical model of a closed serial queuing network with fuzzy
activity times and its periodic properties using a max-plus algebra approach. The finding
shows that the dynamics of the network can be modeled as a recursive system of fuzzy
number max-plus linear equations. The periodic properties of the network can be obtained
from the fuzzy number max-plus eigenvalue and eigenvector of the matrix in the system. In the
network, for a given level of risk, the earliest early departure times of the customers can be
determined, so that the customers' departure interval times lie in the smallest interval
whose lower and upper bounds are periodic.
Keywords and Phrases: max-plus algebra, queuing network, fuzzy activity times, periodic.
1. INTRODUCTION
We discuss the closed serial queuing network of n single servers, with infinite
buffer capacity and n customers (Krivulin [4]). The network works on the First-
In First-Out (FIFO) principle. In the system, the customers have to pass through the queues
consecutively in order to receive service at each server. One cycle of network services is the
process from the entry of a customer into the buffer of the 1st server until it leaves the nth server. After
completion of service at the nth server, the customer returns to the first queue for a new cycle of
network services. Suppose that at the initial time of observation none of the servers is giving
service and the buffer of the ith server contains one customer, for each i = 1, 2, ..., n. It is
assumed that the transition of customers from a queue to the next one requires no time.
Figure 1 (Krivulin [5]) gives the initial state of the closed serial queuing network,
where customers are expressed by "•".
Figure 1 Closed Serial Queuing Network
The closed serial queuing network can be found in assembly plant systems, such as those
assembling cars and electronic goods. The customers in this system are pallets while the servers
are assembly machines. A pallet is a kind of desk or tray on which components or semi-
finished goods are placed and moved between the assembly machines. At first, the 1st pallet enters
the buffer of the 1st machine and then enters the 1st machine, while the 2nd pallet enters the
buffer of the 1st machine. In the 1st machine, components are placed and prepared for assembly in
the next machine. Next, the 1st pallet enters the buffer of the 2nd machine and the 2nd pallet enters the 1st
machine, and so forth for the n available pallets, so that the state in Figure 1
above, the initial state of observation, is reached. After assembly is completed in the nth
machine, the assembled goods leave the network, while the pallet goes back to the
buffer of the 1st machine to begin a new cycle of network services, and so on.
Max-plus algebra (Baccelli, et al. [1]; Heidergott, et al. [3]), namely the set of
all real numbers ℝ together with the operations max and plus, has been used to model a closed serial
queuing network algebraically with deterministic activity times (Krivulin [4]; Krivulin
[5]). In modeling and analyzing a network, its activity times are sometimes
not known, for instance because the network is in its design phase and data on activity times or
their distribution are not yet fixed. These activity times can be estimated based on experience and the
opinions of experts and network operators. Such network activity times are modeled using fuzzy
numbers and are called fuzzy activity times. Scheduling problems involving fuzzy numbers can be seen in
Chanas and Zielinski [2] and Soltoni and Haji [9]. Network models
involving fuzzy numbers can be seen in Lüthi and Haring [6].
In this paper we determine the dynamical model of a closed serial queuing network
with fuzzy activity time and its periodic properties using max-plus algebra approach. This
approach will use some concepts such as: fuzzy number max-plus algebra, fuzzy number
max-plus eigenvalue and eigenvector (Rudhito [8]). We will discuss a closed serial queuing
network as discussed in Krivulin [4] and Krivulin [5], where the crisp activity times are
replaced with fuzzy activity times, which can be modeled by fuzzy numbers. The dynamical
model of the network can be obtained analogously to the crisp activity time case. The periodic
properties of the network can be obtained from the fuzzy number max-plus eigenvalue and
eigenvector of matrix in the system. We will use some concepts and result on max-plus
algebra, interval max-plus algebra and fuzzy number max-plus algebra.
2. MAX-PLUS ALGEBRA
In this section we will review some concepts and results of max-plus algebra, matrix
over max-plus algebra and max-plus eigenvalue. Further details can be found in Baccelli, et
al. [1].
Let Rε := ℝ ∪ {ε}, where ℝ is the set of all real numbers and ε := −∞. Define two
operations ⊕ and ⊗ by
a ⊕ b := max(a, b) and a ⊗ b := a + b
for every a, b ∈ Rε.
We can show that (Rε, ⊕, ⊗) is a commutative idempotent semiring with neutral element ε = −∞
and unity element e = 0. Moreover, (ℝ, ⊕, ⊗) is a semifield, that is, (ℝ, ⊕, ⊗) is a
commutative semiring in which for every a ∈ ℝ there exists −a such that a ⊗ (−a) = 0. Then
(Rε, ⊕, ⊗) is called the max-plus algebra, and is written as Rmax. The relation ⪯m is defined on
Rmax by x ⪯m y iff x ⊕ y = y. In Rmax, the operations ⊕ and ⊗ are consistent with respect to the
order ⪯m, that is, for every a, b, c ∈ Rmax, if a ⪯m b, then a ⊕ c ⪯m b ⊕ c and a ⊗ c ⪯m b ⊗ c.
Define x^⊗0 := 0 and x^⊗k := x ⊗ x^⊗(k−1), for k = 1, 2, ... .
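A short sketch (not from the paper) of these scalar operations in Python, with −∞ playing the role of ε.

```python
EPS = float("-inf")

def oplus(a, b):   # a ⊕ b = max(a, b)
    return max(a, b)

def otimes(a, b):  # a ⊗ b = a + b, with ε absorbing
    return EPS if EPS in (a, b) else a + b

def opow(x, k):    # x^{⊗k}, with x^{⊗0} = 0
    return 0 if k == 0 else otimes(x, opow(x, k - 1))

print(oplus(3, EPS), otimes(3, 4), opow(2, 3))   # 3 7 6
```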
Define Rmax^{m×n} := {A = (Aij) | Aij ∈ Rmax, i = 1, 2, ..., m and j = 1, 2, ..., n}, the set of
all m × n matrices over the max-plus algebra. In particular, for A, B ∈ Rmax^{n×n} and α ∈ Rmax we define
(α ⊗ A)ij = α ⊗ Aij, (A ⊕ B)ij = Aij ⊕ Bij and (A ⊗ B)ij = ⊕_{k=1}^{n} Aik ⊗ Bkj.
We define the matrix E ∈ Rmax^{n×n} by (E)ij := 0 if i = j and (E)ij := ε if i ≠ j, and the matrix
ε ∈ Rmax^{n×n} by (ε)ij := ε for every i and j. For any matrix A ∈ Rmax^{n×n}, one can define
A^⊗0 = En and A^⊗k = A ⊗ A^⊗(k−1) for k = 1, 2, ... . The
operations ⊕ and ⊗ are consistent with respect to the order ⪯m, that is, for every A, B, C ∈
Rmax^{n×n}, if A ⪯m B, then A ⊕ C ⪯m B ⊕ C and A ⊗ C ⪯m B ⊗ C.
Define Rmax^n := {x = [x₁, x₂, ..., xₙ]ᵀ | xᵢ ∈ Rmax, i = 1, 2, ..., n}. Note that Rmax^n can be
viewed as Rmax^{n×1}. The elements of Rmax^n are called vectors over Rmax, or shortly vectors. A
vector x ∈ Rmax^n is said to be not equal to the vector ε, written x ≠ ε, if there exists
i ∈ {1, 2, ..., n} such that xᵢ ≠ ε.
Conversely, for every weighted directed graph G = (V, A) one can define a matrix A ∈
Rmax^{n×n}, called the weighting matrix of the graph G, where
Aij = w(j, i) if (j, i) ∈ A and Aij = ε if (j, i) ∉ A. The mean weight of a path is defined as the sum of the
weights of the individual arcs of this path, divided by the length of this path. If such a path is
a circuit, one talks about the mean weight of the circuit, or simply the cycle mean. It follows
that a formula for the maximum cycle mean λmax(A) in G(A) is
λmax(A) = ⊕_{k=1}^{n} (1/k) ⊗ ( ⊕_{i=1}^{n} (A^⊗k)ii ).
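A sketch of this maximum cycle mean formula in Python (lists of lists, ⊕ = max, ⊗ = +, ε = −∞); the 2 × 2 matrix is a made-up example.

```python
EPS = float("-inf")

def mp_matmul(A, B):
    n = len(A)
    return [[max((A[i][k] + B[k][j]) if EPS not in (A[i][k], B[k][j]) else EPS
                 for k in range(n)) for j in range(n)] for i in range(n)]

def max_cycle_mean(A):
    n, P, best = len(A), A, EPS
    for k in range(1, n + 1):           # P equals A^{⊗k} at step k
        trace_max = max(P[i][i] for i in range(n))
        if trace_max != EPS:
            best = max(best, trace_max / k)   # (1/k) ⊗ (⊕_i (A^⊗k)_ii)
        P = mp_matmul(P, A)
    return best

A = [[EPS, 3], [2, EPS]]   # a single circuit 1 -> 2 -> 1 of weight 5 and length 2
print(max_cycle_mean(A))   # 2.5
```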
The matrix A ∈ Rmax^{n×n} is said to be irreducible if its precedence graph G(A) = (V, A) is
strongly connected, that is, for every i, j ∈ V with i ≠ j there is a path from i to j. We can show
that a matrix A ∈ Rmax^{n×n} is irreducible if and only if (A ⊕ A^⊗2 ⊕ ... ⊕ A^⊗(n−1))ij ≠ ε for
every i, j with i ≠ j (Schutter, 1996).
Given A ∈ Rmax^{n×n}. A scalar λ ∈ Rmax is called a max-plus eigenvalue of the matrix A if there
exists a vector v ∈ Rmax^n with v ≠ ε_{n×1} such that A ⊗ v = λ ⊗ v. The vector v is called a max-plus
eigenvector of the matrix A associated with λ. We can show that λmax(A) is a max-plus eigenvalue
of the matrix A. For the matrix B = −λmax(A) ⊗ A, if (B*)ii = 0, then the i-th column of the matrix B* is a
max-plus eigenvector associated with λmax(A). Moreover, if the matrix A is irreducible, then λmax(A) is
the unique max-plus eigenvalue of A and the max-plus eigenvector v associated with λmax(A) satisfies
vᵢ ≠ ε for every i ∈ {1, 2, ..., n} (Baccelli, et al., 2001).
In this section we will review some concepts and results of interval max-plus algebra,
matrix over interval max-plus algebra and interval max-plus eigenvalue. Further details can
be found in Rudhito, et al. [7] and Rudhito [8].
The (closed) max-plus interval x in Rmax is a subset of Rmax of the form
x = [ x , x ] = {x Rmax x m x m x },
which is shortly called interval. The interval x y if and only if y m x m x m y .
Especially x = y if and only if x = y and x = y . The number x Rmax can be represented
as interval [x, x]. Define I(R) := {x = [ x , x ] x , x R, m x m x } { }, where
:= [, ]. Define x y = [ x y , x y ] and x y = [ x y , x y ] for every x, y
I(R). We can show that (I(R), , ) is a commutative idempotent semiring with neutral
element = [, ] and unity element 0 = [0, 0]. This commutative idempotent semiring
matrix A = ( A ij ) R m n m n
max and A = ( A ij ) R max , which are called lower bound matrix
In this section we will review some concepts and results of fuzzy number max-plus
algebra, matrix over fuzzy number max-plus algebra and fuzzy number max-plus eigenvalue.
Further details can be found in Rudhito [8].
~ ~
Fuzzy set K in universal set X is represented as the set of ordered pairs K = {(x,
~
K~ (x)) x X } where K~ is a membership function of fuzzy set K , which is a mapping
~ ~
from universal set X to closed interval [0, 1]. Support of a fuzzy set K is supp( K ) = {x X
~ ~ ~
K~ (x) 0}. Height of a fuzzy set K is height( K ) = sup { K~ (x)}. A fuzzy set K is
x X
~ ~
said to be normal if height( K ) = 1. For a number [0, 1], -cut of a fuzzy set K is
~ ~
cut( K ) = K ={x X K~ (x) }. A fuzzy sets K is said to be convex if K is convex,
that is contains line segment between any two points in the K , for every [0, 1],
A fuzzy number ã is defined as a fuzzy set in the universal set ℝ which satisfies the
following properties: (i) ã is normal, that is, the 1-cut a¹ is nonempty; (ii) for every α ∈ (0, 1]
the α-cut aᵅ is a closed interval in ℝ, that is, there exist real numbers a(α) ≤ ā(α) such that
aᵅ = [a(α), ā(α)] = {x ∈ ℝ | a(α) ≤ x ≤ ā(α)}; (iii) supp(ã) is bounded. For α = 0, define
a⁰ = [inf(supp(ã)), sup(supp(ã))]. Since
every closed interval in ℝ is convex, aᵅ is convex for every α ∈ [0, 1], hence ã is convex.
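An illustrative sketch (assumed example): the α-cut of a triangular fuzzy number (l, m, u), which is the closed interval [l + α(m − l), u − α(u − m)].

```python
def alpha_cut(l, m, u, alpha):
    """alpha-cut of the triangular fuzzy number with support [l, u] and modal value m."""
    return (l + alpha * (m - l), u - alpha * (u - m))

print(alpha_cut(2, 3, 5, 0.0))   # (2.0, 5.0): the support
print(alpha_cut(2, 3, 5, 1.0))   # (3.0, 3.0): the modal value
```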
Let F(R) ~ := F(R) { ~ }, where F(R) is set of all fuzzy numbers and ~ : = {}
~ ~
with the -cut of ~ is = [,]. Define two operations and such that for every
~
a~ , b F(R) ~ , with a = [ a , a ] I(R)max and b = [ b , b ] I(R)max,
~ and b~ , written a~ ~ ~
i) Maximum of a b , is a fuzzy number whose -cut is interval
b , a b ] for every [0, 1]
[a
~ and b~ , written a~
ii) Addition of a
~ ~
b , is a fuzzy number whose -cut is interval
[ a b , a b ] for every [0, 1].
~ ~
We can show that (F(R) ~ , , ) is a commutative idempotent semiring. The
~ ~
commutative idempotent semiring F(R)max := (F(R) ~ , , ) is called fuzzy number max-
plus algebra, and is written as F(R)max (Rudhito, et al. [8]).
m n ~ ~ ~
Define F(R) max := { A = ( A ij) A ij F(R)max, i = 1, 2, ..., m and j = 1, 2, ..., n }. The
n
elements of F(R) mmax are called matrices over fuzzy number max-plus algebra or shortly
fuzzy number matrices. The operations on fuzzy number matrices can be defined in the same
~ n
way with the operations on matrices over max-plus algebra. Define matrix E F(R) nmax ,
~
0 if i j ~ ~
, and matrix F(R) max , with ( )ij := ~ for every i and j.
~ n n
with ( E )ij : =
~
if i j
~ ~
For every A F(R) mmax n and [0, 1], define -cut matrix of A as the interval
~
matrix A = ( Aij ), with Aij is the -cut of A ij for every i and j. Define matrix A = ( Aij )
mn
R max and A = ( Aij ) R mn
max which are called lower bound and upper bound of matrix
~ ~ n
A, respectively. We can conclude that the matrices A , B F(R) m
max are equal iff A = B ,
that isAij = Bij for every [0, 1] and for every i and j. For every fuzzy number matrix
~ ~ ~ ~
A , A [ A , A ]. Let F(R)max, A , B F(R) mmax n . We can show that A)
[ A ,
A ] and (A B) [ A B , A B ] for every [0, 1]. Let
~ ~
A F(R) mmax p , A F(R) max
p n
. We can show that (A B) [ A B , A B ] for
every [0, 1].
Define F(R) max := { ~ ~x , ~ ~ T ~
n
x =[ 1 x 2 , ... , xn ] | xi F(R)max , i = 1, ... , n }. The
n
elements in F(R) max are called fuzzy number vectors over F(R)max or shortly fuzzy number
vectors. A fuzzy number vector ~
x F(R) nmax is said to be not equal to fuzzy number vector
ε , written ~
~ x ~ xi ~ .
ε , if there exists i {1, 2, ..., n} such that ~
~ n n n
Fuzzy number matrix A F(R) nmax is said to be irreducible if A I(R) max is
~
irreducible for every [0, 1]. We can show that A is irreducible if and only if A
0
n
R nmax is irreducible (Rudhito, et al. [7]).
~ ~
n
A F(R) nmax
Let . The fuzzy number scalar F(R)max is called fuzzy number max-
~ ~ F(R) n with v~
plus eigenvalue of matrix A if there exists a fuzzy number vector v max
~ ~ ~ ~ ~ ~ ~ ~
ε n1 such that A v = v . The vector v is called fuzzy number max-plus
~ ~ ~
eigenvectors of matrix A associaed with . Given A F(R) nmax n
. We can show that the
~ ~ ~ ~
fuzzy number scalar max ( A ) = max
, where max is a fuzzy set in R with membership
[0,1]
function ~ (x) = ~ (x), and ~ is the characteristic function of the set [max( A ),
max max max
~
max( A )], is a fuzzy number max-plus eigenvalues of matrix A . Based on fundamental
max-plus eigenvector associated with eigenvalues max( A ) and max( A ), we can find
~
fundamental fuzzy number max-plus eigenvector associated with max (Rudhito [8]).
~ ~ ~
Moreover, if matrix A is irreducible, then max ( A ) is the unique fuzzy number max-plus
~
eigenvalue of matrix A and the fuzzy number max-plus eigenvector associated with
~
max( A ) is v~ , where v~i ~ for every i {1, 2, ..., n}.
We discuss the closed serial queuing network of n single-server, with a infinite buffer
capacity and n customers, as in Figure 1.
Let a~i(k) = the fuzzy arrival time of the kth customer at server i,
d~i(k) = the fuzzy departure time of the kth customer at server i,
t~i = the fuzzy service time of the kth customer at server i,
for k = 1, 2, ... and i = 1, 2, ..., n. In what follows ⊕ and ⊗ denote the fuzzy number max-plus operations.
The dynamics of the queue at server i can be written as
d~i(k) = max( t~i + a~i(k), t~i + d~i(k−1) ), (1)
a~i(k) = d~n(k−1) if i = 1, and a~i(k) = d~_{i−1}(k−1) if i = 2, ..., n. (2)
Using fuzzy number max-plus algebra notation, equation (1) can be written as
d~i(k) = ( t~i ⊗ a~i(k) ) ⊕ ( t~i ⊗ d~i(k−1) ). (3)
Let d~(k) = [d~1(k), d~2(k), ..., d~n(k)]ᵀ, a~(k) = [a~1(k), a~2(k), ..., a~n(k)]ᵀ and let T~ be the
n × n fuzzy number matrix with t~1, t~2, ..., t~n on the diagonal and ε~ elsewhere. Then equations (3)
and (2) can be written as
d~(k) = ( T~ ⊗ a~(k) ) ⊕ ( T~ ⊗ d~(k−1) ), (4)
a~(k) = G~ ⊗ d~(k−1), (5)
where G~ is the n × n fuzzy number matrix whose entries are ε~ except for 0~ in position (1, n)
and in positions (i, i−1) for i = 2, ..., n.
Substituting equation (5) into equation (4), we obtain
d~(k) = T~ ⊗ G~ ⊗ d~(k−1) ⊕ T~ ⊗ d~(k−1)
= T~ ⊗ ( G~ ⊕ E~ ) ⊗ d~(k−1)
= A~ ⊗ d~(k−1) (6)
with the fuzzy number matrix A~ = T~ ⊗ ( G~ ⊕ E~ ), whose entries are ε~ except for t~1 in
positions (1, 1) and (1, n) and t~i in positions (i, i) and (i, i−1) for i = 2, ..., n.
Equation (6) is the dynamical model of the closed serial queuing network.
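A hedged simulation sketch of recursion (6): for a fixed α-cut the lower- and upper-bound systems are ordinary max-plus recursions d(k) = A ⊗ d(k−1). The service times below are made-up numbers, not data from the paper.

```python
EPS = float("-inf")

def mp_matvec(A, d):
    return [max((A[i][j] + d[j]) if EPS not in (A[i][j], d[j]) else EPS
                for j in range(len(d))) for i in range(len(A))]

def queue_matrix(t):                 # A = T ⊗ (G ⊕ E) for n servers
    n = len(t)
    A = [[EPS] * n for _ in range(n)]
    for i in range(n):
        A[i][i] = t[i]               # t_i on the diagonal
        A[i][(i - 1) % n] = t[i]     # t_i on the "previous server" position (and (1, n))
    return A

t_lower = [3, 4, 2]                  # e.g. lower bounds of fuzzy service times at some α
A = queue_matrix(t_lower)
d = [0, 0, 0]                        # early departure times d(0)
for k in range(3):
    d = mp_matvec(A, d)
print(d)                             # departure times after three cycles
```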
Dynamical model recursive equation of the closed serial queuing network (6) can be
~
represented through the early departure time of customer d (0) , with its -cut d (0)
[ d (0) , d (0) ] for every [0, 1]. For every [0, 1] hold d (k ) = A d (k 1)
k
[ A d (k 1) , A d (k 1) ] = [ ( A ) d (0) , ( A ) d (0) ] (A )
k k
k ~
d(0). Thus, for every [0, 1] hold d (k ) = (A ) d(0). Hence we have d (k )
~~k ~ ~
= A d (0) . Since the early departure time of customer can be determined exactly, it is a
~
crisp time, that is a point fuzzy number d (0) , with d (0) [ d (0) , d (0) ] where d (0)
= d (0) for every [0, 1].
Since the precedence graph of the matrix A⁰ in the model of the closed serial queuing network
(Figure 1) is strongly connected, the matrix A⁰ is irreducible. Hence the matrix A~ in
equation (6) is irreducible. Thus the matrix A~ has a unique fuzzy number max-plus eigenvalue
λ~max(A~), and v~ is the fundamental fuzzy number max-plus eigenvector associated
with λ~max(A~), where v~i ≠ ε~ for every i ∈ {1, 2, ..., n}.
~ where its -cut vector is
*
We construct fuzzy number vector v v* [ v* , v* ],
using the following steps. For every [0, 1] dan i = 1, 2, ..., n, form
1. v = 1 v , v = 1 v , with 1 = min(v 0 i ) .
i
* *
4. v = v , v
= 4() v , with 4() = min(v0 i v i ) .
i
The fuzzy number vector v~* is also a fuzzy number max-plus eigenvalue associated with
~ ~
max ( A ). From construction above, the components of v* , that is v* i are all non-negative
0 0
and there exist i {1, 2, ..., n } such that v* i
= 0 for every [0, 1]. Meanwhile, its -cut
vector is the smalest interval, in the sense that min(v* i v* i ) = 0 for i = 1, 2, ..., n, among
0 0
all possible fuzzy number max-plus eigenvector, the modification of the fundamental fuzzy
number max-plus eigenvector v~ , where all the lower bounds of its components are non-
negative and at least one zero.
Since the fuzzy number vector v~* is a fuzzy number max-plus eigenvector associated
~ ~
with max ( A )
~ ~ ~ ~ ~
A v~* = max ( A ) v~* or A v* = max(A) v* or
[ A v* , A v* ] = [max( A ) v* , max( A ) v* ].
Hence A v* = max( A ) v* and A v* = max( A ) v*
for every [0, 1].
~
For some [0, 1], we can take the early departure time of customer d (0) = v* ,
that is the earliest of early departure time of a customer, such that the lower bound of
customer departure time intervals are periodic. This is because there exist i {1, 2, ..., n }
such that v* i = 0 for every [0, 1]. Since the operation and on matrix are consistent
with respect to the order “ m ”, then
( A ) v*
k
m ( A ) k v* m ( A ) k v* .
This resulted
d (k ) [ ( A ) v* , ( A ) v* ] [ ( A ) v* , ( A ) v* ]
k k k k
= [ (max ( A )) v* , (max ( A )) v* ]
k k
= [ (max ( A )) , (max ( A )) ] [ v* , v* ]
k k
= [ max ( A ), max ( A )] [ v* , v* ] for every k = 1, 2, 3 , ... .
k
Thus for some [0, 1], vector v* is the earliest of early departure time of a customer, so
that the customer's departure interval time will be in the smallest interval where the lower
bound and upper bound are periodic with the period max( A ) and max( A ), respectively.
In the same way as above, we can show that for some [0, 1], if we take the early
~
departure time d (0) = v , where v* m v m v , then we have
d (k ) [ ( A ) v, ( A ) v] [ ( A ) v* , ( A ) v* ]
k k k k
= [ max ( A ), max ( A )] [ v* , v* ].
k
References
[1] BACCELLI, F., COHEN, G., OLSDER, G.J. AND QUADRAT, J.P., Synchronization and Linearity, John Wiley &
Sons, New York, 2001.
[2] CHANAS, S. AND ZIELINSKI, P., Critical path analysis in the network with fuzzy activity times, Fuzzy Sets and
Systems, 122, 195–204, 2001.
[3] HEIDERGOTT, B., OLSDER, J. G AND WOUDE, J., Max Plus at Work, Princeton, Princeton University Press,
2005.
[4] KRIVULIN, N.K., A Max-Algebra Approach to Modeling and Simulation of Tandem Queuing Systems.
Mathematical and Computer Modeling, 22 , N.3, 25-31, 1995.
[5] KRIVULIN, N.K., The Max-Plus Algebra Approach in Modelling of Queueing Networks, Proc. 1996 Summer
Computer Simulation Conf., Portland, OR, July 21-25, 485-490, 1996.
[6] LÜTHI, J. AND HARING, G., Fuzzy Queueing Network Models of Computing Systems, Proceedings of the 13th
UK Performance Engineering Workshop, Ilkley, UK, Edinburgh University Press, July 1997.
[7] RUDHITO, A., WAHYUNI, S., SUPARWANTO, A. AND SUSILO, F., Matriks atas Aljabar Max-Plus Interval. Jurnal
Natur Indonesia 13, No. 2., 94-99, 2011.
[8] RUDHITO, A., Aljabar Max-Plus Bilangan Kabur dan Penerapannya pada Masalah Penjadwalan dan Jaringan
Antrian Kabur, Disertasi: Program S3 Matematika FMIPA Universitas Gadjah Mada, Yogyakarta, 2011.
[9] SOLTONI, A. AND HAJI, R., A Project Scheduling Method Based on Fuzzy Theory, Journal of Industrial and
Systems Engineering, 1, No.1, 70 – 80, 2007.
M. ANDY RUDHITO
Department of Mathematics and Natural Science Education, Sanata Dharma University,
Yogyakarta, Indonesia.
e-mail: [email protected]
SRI WAHYUNI
Department of Mathematics, Gadjah Mada University, Yogyakarta, Indonesia.
e-mail: [email protected]
ARI SUPARWANTO
Department of Mathematics, Gadjah Mada University, Yogyakarta, Indonesia.
e-mail: [email protected]
F. SUSILO
Department of Mathematics, Sanata Dharma University, Yogyakarta, Indonesia.
e-mail: [email protected]
Proceedings of ”The 6th SEAMS-UGM Conference 2011”
Algebra, pp. 205–212.
Keywords and Phrases: K1,n -magic covering, critical sets, complete bipartite graph.
1. INTRODUCTION
We consider finite and simple graphs. The vertex set of a graph G is denoted
by V (G) while the edge set by E(G). An edge-covering of G is a family of different
subgraphs H1 , . . . Hk such that any edge of E belongs to at least one of the subgraphs
Hi , 1 ≤ i ≤ k. If every Hi is isomorphic to a given graph H, then G admits an H-
covering. A total labeling of G is an injective function f : V ∪ E → {1, 2, . . . , |V | + |E|}
such that for each subgraph H′ = (V′, E′) of G isomorphic to H, Σv∈V′ f(v) + Σe∈E′ f(e)
is constant. Further, if f(V) = {1, 2, . . . , |V |} then G is called H-supermagic.
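A small verification sketch (not the authors' program): given a total labeling and the covering subgraphs (each a pair of vertex and edge name sets), check whether all subgraph sums are equal. The labels below are those of the first row of Table 1 later in the paper, with assumed incidences e: a–c, f: a–d, g: b–d, h: b–c for K_{2,2}.

```python
def is_h_magic(lab, subgraphs):
    sums = {sum(lab[x] for x in V | E) for V, E in subgraphs}
    return len(sums) == 1

lab = {"a": 1, "b": 5, "c": 4, "d": 2, "e": 8, "f": 6, "g": 7, "h": 3}
stars = [({"a", "c", "d"}, {"e", "f"}), ({"b", "c", "d"}, {"g", "h"}),
         ({"c", "a", "b"}, {"e", "h"}), ({"d", "a", "b"}, {"f", "g"})]
print(is_h_magic(lab, stars))   # True: every K_{1,2} sum equals 21
```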
The H-supermagic covering was first introduced by Gutiérrez and Lladó [3] in
2005, on which H-supermagic labeling of stars, complete bipartite graphs, paths and
cycles were found. In [4], Lladó and Morragas studied Cn -supermagic labeling of some
graphs, i.e. wheel, windmill, prism and books. They proved that those graphs men-
tioned are Ch -supermagic for some h. In [5], Maryati et al. proved that some classes
of trees such as subdivision of stars, shrubs and banana tree graphs are Ph -supermagic.
Furthermore, cycle-supermagic labelings of the chain graphs kCn-snake, triangle ladders
TLn, grids Pm × Pn for n = 2, 3, 4, 5, and also fans Fn and books Bn are found in Ngurah et
al. [7]. For certain schakles and amalgamations of a connected graph, Maryati et al.
[6] showed results on the same topics, and a path-amalgamation of isomorphic
graphs has also been treated by Salman and Maryati (see [8]). For some of the labelings,
we number all vertices and all edges, and we call these numbers positions. A graph
labeling can then be represented as a set of ordered pairs of a position and its label.
Baskoro et al. in [1] defined a critical set of a graph G with labeling f as a set
Qf (G) = {(a, b) | a, b ∈ {1, 2, . . . , |V (G) ∪ E(G)|}}, where the ordered pair (a, b) represents
label b in position a, which satisfies:
(1) f is the only labeling of G which has label b in position a for every (a, b) ∈ Qf (G);
(2) no proper subset of Qf (G) satisfies (1).
If a critical set has c members, then we say it has size c.
A critical set can be applied on secret sharing schemes. Secret sharing schemes
were first introduced by Blakley [2] and Shamir [9] in 1979. A secret sharing scheme is
a method of sharing a secret S among a finite set of participants P = {P1 , P2 , . . . Pn } in
such a way that if the participants in A ⊆ P are qualified to know the secret, then by
pooling together their partial information, they can reconstruct the secret S; but any
B ⊂ P which is not qualified to know S, can not reconstruct the secret. The key S is
chosen by special participant D, called dealer and the dealer gives partial information,
called the share to each participant to share the secret S.
In this talk we enumerate all possible K1,n-magic coverings of Km,n for m =
n, n = 2, 3 and for some cases with n = 4. We also provide some critical sets for certain
K1,n-magic coverings.
star magic covering. In labeling K2,2 , the vertices are labeled by {a, b, c, d} and the
edges are labeled by {e, f, g, h}, ordered from left to right (see Figure 1 ). The magic
constants m(f ) are found by summing all four possible K1,2 -star magic coverings (see
Figure 2).
[Figure 1: the graph K2,2 with vertices a, b, c, d and edges e, f, g, h. Figure 2: the four K1,2 subgraphs of K2,2.]
3.1. Possible K1,2-magic Coverings on K2,2. The labels of the vertices and edges are
positive integers from 1 to |V (G)| + |E(G)|. From Equation (1), T, the sum of the vertex labels,
must be divisible by 4, and there are 6 possible K1,2-magic coverings on K2,2
(see Figure 3).
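A brute-force enumeration sketch (not the authors' program): assign {1, ..., 8} to the vertices a, b, c, d and edges e, f, g, h of K_{2,2} (with the incidences assumed above) and count the star-magic assignments. No quotient by graph symmetry is taken, so the raw count is larger than the number of essentially different coverings listed in Table 1.

```python
from itertools import permutations

names = ["a", "b", "c", "d", "e", "f", "g", "h"]
stars = [({"a", "c", "d"}, {"e", "f"}), ({"b", "c", "d"}, {"g", "h"}),
         ({"c", "a", "b"}, {"e", "h"}), ({"d", "a", "b"}, {"f", "g"})]

count = 0
for p in permutations(range(1, 9)):
    lab = dict(zip(names, p))
    sums = {sum(lab[x] for x in V | E) for V, E in stars}
    if len(sums) == 1:          # all four K_{1,2} sums equal
        count += 1
print(count)
```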
Table 1 shows the only six possible K1,2-magic coverings on K2,2; the last
three are the duals of the first three. For the notion of duality, see [10]. Given a star
[Figure 3: the six possible K1,2-magic coverings (a)–(f) of K2,2.]
magic covering λ, its dual labeling λ′ is defined by λ′(x) = v + e + 1 − λ(x) for every vertex
x and λ′(x, y) = v + e + 1 − λ(x, y) for every edge {x, y}.
T m(f) a b c d e f g h
12 21 1 5 4 2 8 6 7 3
16 22 7 1 2 6 4 3 8 6
6 2 5 3 1 7 8 4
20 23 2 8 7 3 6 6 1 4
3 7 4 6 8 2 1 5
24 24 8 4 5 7 1 3 6 2
3.2. Possible K1,3 -magic Coverings on K3,3 . We label K3,3 in the same order as
in Figure 4, the vertices are labeled by {a, b, c, d, e, f } and the edges are labeled by
{g, . . . , o}, again, ordered from left to right. There are 6 K1,3 -magic coverings on a
bipartite graph K3,3. To find the magic constant m(f), see Equation (2); all values of T
found are vertex-label sums divisible by 3. To get K1,3-star magic coverings on the
bipartite graph K3,3, we first search all possible combinations of vertex labels summing to T.
As an example, for T = 21 we need only one combination of vertex labels, {1,2,3,4,5,6}.
For T = 24, we have 3 combinations of vertex labels: {1,2,3,4,5,9}, {1,2,3,4,6,8},
{1,2,3,5,6,7}. But there is no guarantee that every combination for T has a star magic
covering.
To find all possible coverings, we also use a duality, e.g. T = 75 is a dual of
T = 21, and T = 72 is a dual of T = 24. From Table 2, it can be seen that we have
[Figure 4: the graph K3,3 with vertices a, b, c (top), d, e, f (bottom) and edges g, ..., o.]
20 labelings for T = 24 and T = 72. The results of the enumeration of the possible
labelings from T = 21 to T = 75 are presented in Table 2. For each odd T we could not
find a K1,3-magic covering, but there is no guarantee that every even T admits a star magic
covering.
T 21 24 27 30 33 36 39 42 45 48
m(f ) 47 48 49 50 51 52 53 54 55 56
Labelings 0 20 0 52 0 109 0 166 0 224
T 51 54 57 60 63 66 69 72 75
m(f ) 57 58 59 60 61 62 63 64 65
Labelings 0 166 0 109 0 52 0 20 0
Proposition 3.1. If the sum of labels of vertices T is odd, then there is no star magic
covering on complete bipartite graphs K3,3 .
Proof. Suppose there is a star magic covering on the complete bipartite graph K3,3 with
an odd sum of vertex labels T. If T is odd then m(f) must be odd too, and it follows that
s is also odd according to Equation (2). Suppose T has the label set {a, b, c, d, e, f} as in
Figure 4; then we have s as follows:
6 m(f) = 2s + 2T, so s = 3 m(f) − T. (2)
It is clear that if T is odd then m(f) is also odd, and then s is even, a contradiction.
3.3. K1,4-magic Coverings on K4,4. There are hundreds of combinations of label
sets on K4,4. For example, for T = 56, using a computer program we found 402
combinations, which yield more than 922 labelings (we are still working on it). Some of the
labelings for certain T have been found, but much work remains. A list of the possible
labelings can be seen in Table 3.
T 40 48 56
m(f ) 90 93 96
Labelings 0 135 922
{(1,1), (3,4)} {(1,8), (4,3)} {(3,4), (4,2)} {(4,2), (5,8)} {(5,8), (5,6)}
{(1,1), (2,5)} {(1,8), (6,3)} {(3,4), (4,6)} {(4,2), (7,3)} {(5,8), (6,2)}
{(1,1), (2,7)} {(1,8), (6,4)} {(3,4), (6,2)} {(4,3), (6,7)} {(5,8), (6,6)}
{(1,1), (3,6)} {(1,8), (7,5)} {(3,4), (6,6)} {(4,3), (7,8)} {(5,8), (7,7)}
{(1,1), (5,5)} {(1,8), (8,2)} {(3,4), (7,1)} {(4,2), (8,3)} {(6,2), (8,5)}
{(1,1), (6,6)} {(2,2), (6,4)} {(3,4), (7,7)} {(4,3), (8,6)} {(6,3), (7,6)}
{(1,1), (7,7)} {(2,2), (7,5)} {(3,4), (8,5)} {(4,6), (6,2)} {(6,3), (8,2)}
{(1,1), (8,4)} {(2,4), (5,1)} {(3,5), (4,3)} {(4,6), (7,1)} {(6,4), (8,6)}
{(1,3), (5,8)} {(2,4), (7,6)} {(3,5), (4,7)} {(4,7), (5,1)} {(6,6), (7,7)}
{(1,3), (6,2)} {(2,4), (8,2)} {(3,5), (6,3)} {(4,7),(8,2)} {(6,6), (8,3)}
{(1,3), (3,4)} {(2,5), (5,8)} {(3,5), (6,7)} {(5,1), (6,4)} {(6,7), (7,8)}
{(1,6), (3,5)} {(2,5), (6,6)} {(3,5), (7,2)} {(5,1), (6,7)} {(6,7), (8,4)}
{(1,6), (5,1)} {(2,5), (7,7)} {(3,5), (7,6)} {(5,1), (6,3)} {(7,3), (8,4)}
{(1,6), (6,7)} {(2,5), (8,3)} {(3,5), (8,4)} {(5,1), (7,6)} {(7,5), (8,6)}
{(1,8), (2,2)} {(2,7), (5,5)} {(3,6), (7,3)} {(5,1), (8,4)} {(7,6), (8,2)}
{(1,8), (2,4)} {(2,7), (6,2)} {(3,7), (6,4)} {(5,1), (8,6)} {(7,7), (8,3)}
{(1,8), (3,5)} {(2,7), (8,4)} {(4,2), (5,5)} {(5,5), (7,3)}
Open Problem. The enumeration of possible labelings on K4,4 for T ∈ [64, 160] with
99 ≤ k ≤ 135 has not been carried out yet. There may well be hundreds or even
thousands of them, since we have already found 922 labelings from 280 out of the 402
combinations counted by a computer program.
References
[1] Baskoro, E. T., Simanjuntak, R. and Adithia, M. T., Secret Sharing Scheme based on magic-
labeling, Proceeding of 12th National Mathematics Conference, 23-27, 2004.
[2] Blakley, G. R., Safeguarding Cryptographic Keys, Proc. AFIPS 12th, New York, (48) 313-317,
1979.
[3] Gutiérrez, A. and Lladó, A., Magic Coverings, JCMCC 55, 43-56, 2005.
[4] Lladó, A. and Moragas, J., Cycle-magic Graphs, Discrete Mathematics 307(23), 2925-2933,
2007.
[5] Maryati, T. K., Baskoro, E. T. and Salman, A. N., M., Ph -supermagic labelings of some trees,
JCMCC 65, 197-204, 2008.
[6] Maryati, T. K., Salman, A. N. M., Baskoro, E. T., Ryan, J.and Miller, M., On H-supermagic
labelings for certain schakles and amalgamations of a connected graph, Utilitas Mathematica 83,
333-342, 2010.
[7] Ngurah, A. A. G., Salman, A. N. M. and Susilowati, L., H-supermagic labeling of graphs,
Discrete Mathematics 310(8), 1293-1300, 2010.
[8] Salman, A. N., M. and Maryati, T. K., On graph-(super)magic labelings of a path-amalgamation
of isomorphic graphs, Proc. of the 6thIMT-GT Conference on Mathematics, Statistics and its
Applications ICMSA2010, Universiti Tunku Abdul Rahman, Kuala Lumpur, Malaysia, 2010.
[9] Shamir, A., How to share a secret, Comm. ACM 22 No.11, 612-613, 1979.
[10] Wallis, W.D., Magic Graph, Birkhäuser, Boston, Basel, Berlin, 2001.
E. T. Baskoro, H. Assiyatun
Combinatorial Mathematics Research Division,
Faculty of Mathematics and Natural Sciences
Institut Teknologi Bandung, Indonesia.
e-mails: [email protected]; [email protected]
Proceedings of ”The 6th SEAMS-UGM Conference 2011”
Algebra, pp. 213–222.
2. CONVOLUTIONAL CODES
In this section, we introduce the basic concepts of convolutional codes. Most
definitions and theorems in this section can be found in Johannesson and Zigangirov [1].
In convolutional codes, the length of a codeword is not constant. The input message
of a convolutional code will later be generalized to a sequence of messages with finitely
many nonzero entries; the sequence is represented as a polynomial vector. We first discuss
the definition of convolutional codes from an abstract algebra point of view. We recall
that for any finite field of order q, denoted Fq, the polynomial ring Fq[z] is a
principal ideal domain (PID). Moreover, Fq^n[z] is a finitely generated free module over
Fq[z]. Bases of free modules over a PID have the same properties as bases of vector spaces
over a field.
Definition 2.1. Given a finite field Fq, a convolutional code C over Fq is defined as
an Fq[z]-submodule of Fq^n[z].
Definition 2.3. Given a rate k/n convolutional code C with encoder matrix G(z).
(1) Memory of C is defined as the highest degree of entries of G(z).
(2) Complexity of C is defined as the highest degree of full size minors of G(z).
We also have other parameters such as the row degrees and the total degree. A row degree
is the highest degree of the entries in a row of the encoder matrix, and the total degree is the sum
of all row degrees. The encoder matrix of a convolutional code is unique up to matrix
equivalence, as given in the following lemma:
Lemma 2.1. Two polynomial matrices G(z) and G′(z) are encoder matrices of a con-
volutional code C if and only if there exists a unimodular matrix U(z) ∈ GLk(Fq[z]) such that
G(z) = U(z) · G′(z). In that case we say that G(z) and G′(z) are equivalent.
Encoder matrix with high memory might make the encoding process more com-
plicated. So, we are concerned about choosing encoder matrix such that its memory
is minimal. We say that an encoder matrix is minimal if its total degree is minimum
among all equivalent encoders. For later discussion, we limit our discussion for minimal
encoder. By choosing minimal encoder matrix, the total degree and so the memory of
the encoder matrix are minimal. So, the encoding map will be easier to do.
As in linear block codes, we can also define a concept of distance for convolutional
codes; usually the distance of a convolutional code is called the free distance.
Definition 2.4. Given a convolutional code C and v(z), w(z) ∈ C. If v(z) and w(z)
represent the sequences of codewords v = v1, v2, v3, · · · and w = w1, w2, w3, · · · respectively,
then:
(1) the free distance of v(z) and w(z), denoted df(v(z), w(z)), is defined as
df(v(z), w(z)) = Σ_{i=1}^{∞} d(vi, wi),
where d(vi, wi) = 1 if vi ≠ wi and d(vi, wi) = 0 if vi = wi, for i ∈ ℕ;
(2) the free weight of v(z), denoted wtf(v(z)), is defined as wtf(v(z)) = df(v(z), 0);
(3) the free distance of C, denoted df(C), is defined as
df(C) = min {df(v(z), w(z)) : v(z), w(z) ∈ C, v(z) ≠ w(z)}
= min {wtf(v(z)) : v(z) ∈ C, v(z) ≠ 0}.
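A small sketch of this free weight/distance computation for finite sequences of codeword blocks; the sequences below are made-up examples, not codes from the paper.

```python
def d(vi, wi):                    # the per-block indicator used in Definition 2.4
    return 0 if vi == wi else 1

def free_distance(v, w):          # v, w: finite lists of codeword blocks (tuples)
    length = max(len(v), len(w))
    pad = lambda s: s + [(0, 0)] * (length - len(s))
    return sum(d(a, b) for a, b in zip(pad(v), pad(w)))

v = [(1, 1), (0, 1), (0, 0)]
w = [(1, 1), (1, 0), (0, 0)]
print(free_distance(v, w), free_distance(v, [(0, 0)] * 3))   # 1 and the free weight 2
```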
The free distance of a convolutional code is determined by the number of differing digits
of the codeword sequences, not of the polynomial vectors. As in linear block codes, if
a convolutional code C has free distance df(C) = d, then C can detect up to d − 1 errors
and correct up to ⌊(d − 1)/2⌋ errors. We can also generalize the Singleton bound for linear block
codes to convolutional codes, as given and proved in Rosenthal and Smarandache
[2].
Theorem 2.1. Every rate k/n convolutional code C with complexity δ and encoder
matrix G(z) satisfies
df(C) ≤ (n − k) · ( ⌊δ/k⌋ + 1 ) + δ + 1. (2)
Theorem 2.1 is called the generalized Singleton bound. Convolutional codes which
attain equality in (2) are called Maximum Distance Separable (MDS) convolutional
codes. The upper bound on the free distance of a convolutional code can be very large,
depending on its complexity. However, it is not easy to determine the free distance of
a convolutional code; consequently, constructing a convolutional code with a prescribed
free distance is not easy either. For efficient error correction we need "good"
convolutional codes, which have large free distance. Therefore, we should consider
construction methods for convolutional codes whose free distance is guaranteed to be large enough.
One construction method uses the linear system approach, which will be discussed in the next
sections.
3. REPRESENTATION OF CODES
In this section, we will discuss about connection between convolutional codes and
linear systems. First, we discuss about representation of convolutional codes as discrete
time-invariant linear systems. Later, we will discuss about how to construct codes from
given systems. All proofs of theorems and lemmas in this section can be found on York
[4].
Lemma 3.1. If the matrices A, B, C and D in system (6) are given, then there exist polynomial
matrices X(z), Y(z) and U(z) of size k × δ, k × (n − k) and k × k respectively such that:
z·I −A C
Ker O −I = Im X(z) Y (z) U (z) . (7)
−B D
Moreover, G(z) = Y (z) U (z) is the encoder matrix of a convolutional code C.
We have seen that from a linear system we can construct an encoder matrix. The rate and complexity of the constructed code depend on the dimensions of the matrices in the system. So, from Theorem 3.1 and Lemma 3.1 we have a close relation between convolutional codes and linear systems. In the next section, we construct a convolutional code by first defining the system (choosing the quadruple (A, B, C, D)). However, if the system (A, B, C, D) is transformed to (K, L, M) form by (5), the three properties in Theorem 3.1 do not always hold. So, we are interested in properties of the system under which all properties in Theorem 3.1 hold. One can show that every representation system of a convolutional code is a controllable system, but not all representation systems are observable. We say a convolutional code is an observable code if its representation system is observable. The controllability and observability of a system in (A, B, C, D) form can be checked by means of the matrices
\[ \Phi_\delta(A, B) = \begin{pmatrix} B \\ BA \\ BA^2 \\ \vdots \\ BA^{\delta-1} \end{pmatrix} \qquad \text{and} \qquad \Omega_\delta(A, C) = \begin{pmatrix} C & AC & A^2C & \cdots & A^{\delta-1}C \end{pmatrix}. \]
From system theory, a system is controllable (observable) if Φδ(A, B) (respectively Ωδ(A, C)) has full rank. For a controllable and observable system, we have the following lemma:
Lemma 3.2. If (A, B, C, D) is a quadruple of matrices in system (6) such that rank(Ωδ(A, C)) = δ and rank(Φδ(A, B)) = δ, then the triple of matrices (K, L, M) defined as in (5) satisfies the properties in Theorem 3.1.
We can say that controllability and observability of the system are the conditions needed for a "converse" of Theorem 3.1. In the later discussion, we construct convolutional codes by defining a controllable and observable system. In addition, we choose (A, B, C, D) such that the constructed codes have large free distance.
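The controllability and observability tests above reduce to rank computations over F_q. The following sketch (ours, for illustration only; q is taken to be a prime so that arithmetic modulo q models the field, and the matrices are built with the shape used in (9) below) checks that Φδ(A, B) has full rank δ for one small choice of parameters.

```python
# Sketch (not from the paper): rank test for the controllability matrix
# over a prime field F_q, using plain Gaussian elimination modulo q.
def mat_mul(X, Y, q):
    return [[sum(X[i][t] * Y[t][j] for t in range(len(Y))) % q
             for j in range(len(Y[0]))] for i in range(len(X))]

def rank_mod_q(M, q):
    M = [row[:] for row in M]
    rank, rows, cols = 0, len(M), len(M[0])
    for c in range(cols):
        piv = next((r for r in range(rank, rows) if M[r][c] % q), None)
        if piv is None:
            continue
        M[rank], M[piv] = M[piv], M[rank]
        inv = pow(M[rank][c], -1, q)          # modular inverse, q prime
        M[rank] = [(x * inv) % q for x in M[rank]]
        for r in range(rows):
            if r != rank and M[r][c]:
                M[r] = [(M[r][j] - M[r][c] * M[rank][j]) % q for j in range(cols)]
        rank += 1
    return rank

def controllability_matrix(A, B, q, delta):
    blocks, block = [], B
    for _ in range(delta):
        blocks.extend(block)                   # stack B, BA, BA^2, ...
        block = mat_mul(block, A, q)
    return blocks

# Example: q = 11, alpha = 2 (a primitive element), s = 2, delta = 2,
# A = diag(alpha^s, alpha^{2s}), B with rows (alpha^{i*j})_{j=1..delta}.
q, alpha, s, delta = 11, 2, 2, 2
A = [[pow(alpha, (j + 1) * s, q) if i == j else 0 for j in range(delta)]
     for i in range(delta)]
B = [[pow(alpha, i * j, q) for j in range(1, delta + 1)] for i in range(s)]
print(rank_mod_q(controllability_matrix(A, B, q, delta), q))  # expect delta = 2
```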
distance is the maximum possible. The case of rate 1/2 has also been proved completely. In this section, we generalize the proof for rate 1/2 convolutional codes to rate s/2s, for any natural number s.
First, recall that for rate s/2s, the size of the matrix D in system (6) is s × s. Therefore we can choose D = I_s, the identity matrix of size s × s, and the input and output spaces have the same dimension. Moreover, we can modify system (6) to:
\[ \begin{aligned} x_{t+1} &= x_t (A - CB) + y_t B \\ u_t &= -x_t C + y_t D \qquad\qquad (8) \\ v_t &= (y_t \; u_t), \qquad x_0 = x_{p+1} = 0. \end{aligned} \]
In other words, in (8) we exchange the roles of the input u_t and the output y_t. The problem is to guarantee that both (6) and (8) are controllable and observable. First we consider the matrices A and B below:
\[ A = \begin{pmatrix} \alpha^{s} & 0 & \cdots & 0 \\ 0 & \alpha^{2s} & \ddots & \vdots \\ \vdots & \ddots & \ddots & 0 \\ 0 & \cdots & 0 & \alpha^{\delta s} \end{pmatrix}, \qquad B = \begin{pmatrix} 1 & 1 & \cdots & 1 \\ \alpha & \alpha^{2} & \cdots & \alpha^{\delta} \\ \alpha^{2} & \alpha^{4} & \cdots & \alpha^{2\delta} \\ \vdots & \vdots & \ddots & \vdots \\ \alpha^{s-1} & \alpha^{2(s-1)} & \cdots & \alpha^{\delta(s-1)} \end{pmatrix}, \tag{9} \]
where α is a primitive element of the field F_q and q ≥ δ²s. One can show that A and B form a controllable pair. To make system (8) also controllable, we must make A − CB and B form a controllable pair too. We consider the following lemma:
Lemma 4.1. Let α be a primitive element of the field F_q, assume that q ≥ max{δ²s, 3δs + 1} and s | q − 1, and let A, B be the matrices defined in (9) and
\[ A' = \begin{pmatrix} \alpha^{(\delta+1)s} & 0 & \cdots & 0 \\ 0 & \alpha^{(\delta+2)s} & \ddots & \vdots \\ \vdots & \ddots & \ddots & 0 \\ 0 & \cdots & 0 & \alpha^{2\delta s} \end{pmatrix}. \]
Then there exists a δ × s matrix C such that A' = S^{-1}(A − CB)S for some invertible matrix S of size δ × δ.
Proof. We want to show that A' is similar to A − CB, which is equivalent to showing that
\[ \det\bigl(xI - (A - CB)\bigr) = \det(xI - A') = (x - \alpha^{(\delta+1)s})(x - \alpha^{(\delta+2)s}) \cdots (x - \alpha^{2\delta s}). \]
Write C = (c_1, c_2, \ldots, c_\delta)^T. Substituting x = α^{(δ+1)s}, α^{(δ+2)s}, α^{(δ+3)s}, …, α^{2δs}, the equation det(xI − (A − CB)) = 0 becomes a system of equations in c_1·B, c_2·B, …, c_δ·B. Because B has full row rank, c_1, c_2, …, c_δ are determined uniquely. Solving this system of equations gives the result.
From Lemma 4.1, we have an invertible matrix S of size δ × δ such that system (8) is equivalent to
\[ \begin{aligned} x_{t+1} &= x_t A' + y_t BS \\ u_t &= -x_t S^{-1} C + y_t \qquad\qquad (10) \\ v_t &= (y_t \; u_t), \qquad x_0 = x_{p+1} = 0. \end{aligned} \]
Theorem 4.1. The convolutional code C constructed from system (10) has free distance at least 2(δ + 1).
Proof. Take any v(z) ∈ C with v(z) ≠ 0 and deg(v(z)) = γ, and write v(z) = v_0 z^γ + v_1 z^{γ−1} + · · · + v_{γ−1} z + v_γ, v(z) = (y(z) u(z)), y(z) = y_0 z^γ + y_1 z^{γ−1} + · · · + y_{γ−1} z + y_γ and u(z) = u_0 z^γ + u_1 z^{γ−1} + · · · + u_{γ−1} z + u_γ. Because deg(v(z)) = γ, we must have v_0 ≠ 0, so by the equations of system (6) the vectors u_0 and y_0 cannot both be zero. We consider two cases, (γ + 1)s < q − 1 and (γ + 1)s ≥ q − 1. For (γ + 1)s < q − 1, from Lemma 6.1.2 and Proposition 6.1.3 in York [4] we have (u_0 u_1 · · · u_γ) · Φ_{γ+1}(A, B) = 0. Because (γ + 1)s < q − 1, the matrix Φ_{γ+1}(A, B) is equivalent to a Vandermonde matrix of size (γ + 1)s × δ, and so any δ rows of Φ_{γ+1}(A, B) are linearly independent.
Therefore, we get wt(u_0 u_1 · · · u_γ) ≥ δ + 1. Similarly, considering system (10), we have (y_0 y_1 · · · y_γ) · Φ_{γ+1}(A', B') = 0 and any δ rows of Φ_{γ+1}(A', B') are linearly independent, so wt(y_0 y_1 · · · y_γ) ≥ δ + 1 as well, and hence wt(v_0 · · · v_γ) ≥ 2(δ + 1) in this case.
For the case (γ + 1)s ≥ q − 1, consider the folded digits
\[ u' = \Bigl( u_0 + u_{\frac{q-1}{s}} + \cdots \;\;\; u_1 + u_{\frac{q-1}{s}+1} + \cdots \;\;\; \cdots \;\;\; u_{\frac{q-1}{s}-1} + u_{\frac{2(q-1)}{s}-1} + \cdots \Bigr). \]
From the state equations,
\[ x_{\frac{q-1}{s}-1} + x_{\frac{2(q-1)}{s}-1} + \cdots = \bigl( x_0 + x_{\frac{q-1}{s}} + \cdots \bigr) \cdot A^{\frac{q-1}{s}-1}, \]
and therefore
\[ y_{\frac{q-1}{s}-1} + y_{\frac{2(q-1)}{s}-1} + \cdots = \bigl( x_{\frac{q-1}{s}-1} + x_{\frac{2(q-1)}{s}-1} + \cdots \bigr) \cdot C = \bigl( x_0 + x_{\frac{q-1}{s}} + \cdots \bigr) \cdot A^{\frac{q-1}{s}-1} C, \]
which means
\[ y' = \Bigl( y_0 + y_{\frac{q-1}{s}} + \cdots \;\;\; \cdots \;\;\; y_{\frac{q-1}{s}-1} + y_{\frac{2(q-1)}{s}-1} + \cdots \Bigr) = \bigl( x_0 + x_{\frac{q-1}{s}} + \cdots \bigr) \cdot \Omega_\delta(A, C). \]
Because Ω_δ(A, C) is a Vandermonde matrix multiplied by a nonsingular diagonal matrix, any δ columns of Ω_δ(A, C) are linearly independent. Because y' ≠ 0, one can show that y' contains at most δ zero digits; in other words, wt(y_0 · · · y_γ) ≥ wt(y') ≥ q − 1 − δ ≥ 3δs − δ ≥ 3δ − δ = 2δ. Because u' = 0 and u_0 ≠ 0, we must have wt(u_0 · · · u_γ) ≥ 2. Consequently wt(v_0 · · · v_γ) = wt(y_0 · · · y_γ) + wt(u_0 · · · u_γ) ≥ 2δ + 2 = 2(δ + 1). For u' ≠ 0 and y' = 0 the proof is similar to the previous case. For u' = 0 and y' = 0, in a similar way to the two previous cases one can see that the only codeword satisfying these conditions is the zero codeword. Therefore, to determine the free distance of the code we only need to consider the previous cases, and we have shown that d_f(C) ≥ 2(δ + 1).
By choosing these special matrices, we can construct convolutional codes with large free distance. The larger the complexity, the larger the free distance of the constructed code. For very large complexity we have:
\[ 1 \ge \lim_{\delta \to \infty} \frac{d_f(C)}{d_{f\,\max}} \ge \lim_{\delta \to \infty} \frac{2(\delta+1)}{s\left(\frac{\delta}{s}+1\right) + \delta + 1} = \lim_{\delta \to \infty} \frac{2\delta + 2}{2\delta + s + 1} = \frac{2}{2} = 1, \tag{11} \]
in other words, for very large complexity the free distance of the constructed code is near the maximum possible value. For s = 1 the free distance of our construction is the maximum possible, i.e. the construction yields MDS convolutional codes. For s > 1 the constructed free distance is not maximum, but it is near the maximum for very large complexity. These conditions hold for any rate s/2s.
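As a quick check of the s = 1 claim (our own illustration, not from the paper): for a rate 1/2 code we have n = 2 and k = 1, so the bound (2) gives d_f(C) ≤ (2 − 1)(⌊δ/1⌋ + 1) + δ + 1 = 2δ + 2 = 2(δ + 1), which coincides with the lower bound of Theorem 4.1; hence the construction is MDS in this case.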
5. CONCLUDING REMARKS
From our discussion we can draw some important conclusions. First, we have seen convolutional codes as an extension of linear block codes to messages of various lengths: any sequence of messages can be encoded by a convolutional code using the polynomial representation. From the Generalized Singleton Bound, we also see that the free distance of a convolutional code may be very large, depending on its complexity, but it is difficult to construct a convolutional code with a prescribed free distance. Therefore, we use the representation of convolutional codes as linear systems to simplify the construction; in the construction, we first define the system and then determine the encoder matrix. For the special case of rate s/2s, where s is a natural number, we can choose special matrices (A, B, C, D) such that the constructed codes have free distance which is large and near the maximum possible. However, for large complexity we also need a field of large order, which may cause some difficulty in the construction; other construction methods over fields of small order should therefore be considered.
References
[1] Johanesson, R. and Zigangirov, K. Sh., Fundamentals of Convolutional Coding, IEEE Press,
New York, 1999.
[2] Rosenthal, J. and Smarandache, R., Maximum Distance Separable Convolutional Codes, Ap-
plicable Algebra in Engineering, Communication and Computing, Vol. 10, No. 1, 15-32, 1999.
[3] Smarandache, R. and Rosenthal, J., A State Space Approach for Constructing MDS Rate 1/n
Convolutional Codes, IEEE Information Theory Workshop, 1998.
[4] York, E.V., Ph.D thesis: Algebraic Description and Construction of Error-Correcting Codes, a
Systems Theory Point of View, University of Notre Dame, Indiana, 1997.
Ricky Aditya
Mathematics Graduate Student, Universitas Gadjah Mada, Yogyakarta, Indonesia.
e-mail: [email protected]
Ari Suparwanto
Department of Mathematics, Universitas Gadjah Mada, Yogyakarta, Indonesia.
e-mail: ari [email protected]
Proceedings of ”The 6th SEAMS-UGM Conference 2011”
Algebra, pp. 223 - 232.
Abstract. We discuss the characterization of the IBN property of a ring and its properties. We also discuss the (strong) rank condition and stable finiteness, together with their properties, which help in understanding IBN. We then examine the relationships among them.
Keywords and Phrases: IBN, (strong) rank condition, and stably finite.
1. INTRODUCTION
We assume throughout that every ring R is associative and has an identity, and that each module M is unital. For a vector space V over a field F (or, more generally, over a division ring D) of dimension n, the following three properties hold.
(i) Any basis of V has cardinality n.
(ii) Any generating set for V has cardinality ≥ n.
(iii) Any linearly independent set in V has cardinality ≤ n.
In the discussion of an R-module M, however, M does not necessarily have a basis; in other words, these three conditions are not necessarily satisfied. If these properties remain valid for every free module M over the ring R with a basis of cardinality n, then conditions (i), (ii) and (iii) above motivate the conditions called IBN, the rank condition, and the strong rank condition, respectively. [3] is an excellent reference for these three notions, and this paper gives some details of their properties. To further understand IBN and the (strong) rank condition, and to recognize the classes of rings satisfying them, we also discuss a condition that is stronger than all of them, called stable finiteness. The definitions of these conditions are as follows:
Definition 2.1: R is said to have IBN (invariant basis number) if for any natural numbers m and n, R^m ≅ R^n (as right R-modules) implies m = n.
Thus a ring with IBN is one for which any free right module has a unique rank. On the other hand, denoting the set of m × n matrices over R by M_{m×n}(R), the two conditions in the proposition below can be shown to be equivalent.
Proposition 2.2: For any ring R, the following conditions are equivalent:
(i) R has IBN.
(ii) For all X ∈ M_{m×n}(R) and Y ∈ M_{n×m}(R), XY = I_m and YX = I_n imply m = n.
Proof:
(i) ⇒ (ii) Let X ∈ M_{m×n}(R) and Y ∈ M_{n×m}(R) with XY = I_m and YX = I_n. Then there exist module homomorphisms f : R^n → R^m and g : R^m → R^n whose matrices are X and Y respectively. From XY = I_m we get f ∘ g = id_{R^m}, which means g is a right inverse of f, and from YX = I_n we get g ∘ f = id_{R^n}, which means g is also a left inverse, so f must be an isomorphism. This implies R^m ≅ R^n, and m = n by (i).
(ii) ⇒ (i) If R^m ≅ R^n, then there is an isomorphism k : R^m → R^n; its matrix and the matrix of k^{-1} satisfy the two equations in (ii), so m = n by (ii). □
Definition 2.3: R satisfies the rank condition if for any n ∈ ℕ, any set of R-module generators for R^n has cardinality ≥ n.
Proof:
(i) ⇒ (ii) Note that each list v = (v_1, ..., v_m) of elements of R^n determines a linear transformation L_v : R^m → R^n by L_v(α_1, ..., α_m) = v_1 α_1 + · · · + v_m α_m for all (α_1, ..., α_m) ∈ R^m. Since R^m → R^n → 0 is exact, L_v : R^m → R^n is an epimorphism, so v spans R^n, and
(iii) ⇒ (i) Let n ∈ ℕ and let G be any generating set for the free right module R^n with |G| = m. Then there exists an epimorphism L_G : R^m → R^n defined by L_G(r_1, ..., r_m) = g_1 r_1 + · · · + g_m r_m for all (r_1, ..., r_m) ∈ R^m. Recall that every exact sequence of free modules splits, so there is a monomorphism L_G' : R^n → R^m such that L_G ∘ L_G' = 1_{R^n}. Then we get [L_G][L_G'] = I_n for the matrices [L_G] ∈ M_{n×m}(R) and [L_G'] ∈ M_{m×n}(R), so m ≥ n by (iii). We have shown that R satisfies the rank condition.
Now we consider a condition stronger than the rank condition, called the strong rank condition.
Definition 2.5: R satisfies the strong rank condition if for any n ∈ ℕ, any set of linearly independent elements in R^n has cardinality ≤ n.
As for the rank condition, the strong rank condition can also be phrased in three equivalent statements, among them:
(iii) For any right R-module M generated by n elements, any n + 1 elements of M are linearly dependent.
Proof:
(i) ⇒ (ii) Note that each list v = (v_1, ..., v_m) of elements of R^n determines a linear transformation L_v : R^m → R^n by L_v(α_1, ..., α_m) = v_1 α_1 + · · · + v_m α_m for all (α_1, ..., α_m) ∈ R^m. Since 0 → R^m → R^n is exact, L_v : R^m → R^n is a monomorphism, so v is linearly independent in R^n, and by (i) we get m ≤ n, since |v| = m.
(ii) ⇒ (i) Let n ∈ ℕ and let H be a linearly independent set in R^n with |H| = m. Then there exists a monomorphism L_H : R^m → R^n defined by L_H(r_1, ..., r_m) = h_1 r_1 + · · · + h_m r_m for all (r_1, ..., r_m) ∈ R^m. This implies that 0 → R^m → R^n is exact, and m ≤ n by (ii). We have shown that R satisfies the strong rank condition.
(i) ⇒ (iii) First suppose that in any right R-module M generated by n elements, any n + 1 elements of M are linearly dependent. Then, for any n, R^n cannot contain a copy of R^{n+1}_R. This implies that R satisfies the strong rank condition. Conversely, assume that R satisfies the strong rank condition. Let M be any right R-module generated by x_1, ..., x_n and let y_1, ..., y_{n+1} ∈ M. Let φ : R^n → M be the R-epimorphism defined by φ(e_i) = x_i (where the e_i form the standard basis of R^n), and let f_i ∈ R^n (1 ≤ i ≤ n + 1) be such that φ(f_i) = y_i.
Corollary 2.7: A ring R satisfies the strong rank condition if and only if every homogeneous system of n linear equations over R, Σ_{j=1}^m a_{ij} x_j = 0 (1 ≤ i ≤ n), with m > n unknowns has a nontrivial solution over R.
The following observation says that the strong rank condition is stronger than the rank
condition.
Proposition 2.8: If R satisfies the strong rank condition, then it satisfies the rank condition.
Proof: Assume R satisfies the strong rank condition, and consider an epimorphism φ : R^k → R^n. Then φ must split (recall that every exact sequence of free modules splits), so we get a monomorphism ψ : R^n → R^k with φ ∘ ψ = 1_{R^n}. By the strong rank condition, we have n ≤ k. We have shown that R satisfies the rank condition.
The following definition and proposition concern stable finiteness, which is stronger than IBN and the rank condition. [5, Ch. 1 and 2] is a good reference for properties of stably finite rings. Stable finiteness can also be described in three equivalent statements.
Proof:
(i) ⇒ (ii) Let n ∈ ℕ and suppose R^n ≅ R^n ⊕ K (as free right R-modules). Then there is an isomorphism f : R^n → R^n ⊕ K whose matrix lies in M_{m×n}(R), with m = n + x and x the dimension of K, and there also exists g : R^n ⊕ K → R^n with matrix in M_{n×m}(R) such that f g = 1_{R^n}. We can now say that f, g ∈ E and, by (iv), we get g f = 1, so f is an isomorphism.
Since R ≅ End(R_R), the following consequence may seem surprising: every stably finite ring is Dedekind finite.
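A concrete illustration (ours, not from the paper): every commutative ring is stably finite. Indeed, if A, B ∈ M_n(R) with R commutative and AB = I_n, then det(A) det(B) = 1, so det(A) is a unit, A is invertible via its adjugate, and hence BA = A^{-1}(AB)A = I_n.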
We begin with the relationship between a ring homomorphism and IBN, the rank condition, the strong rank condition and stable finiteness.
Proof:
1) Let φ : R^k → R^n be an isomorphism of free right R-modules. Tensoring with S over R gives an isomorphism φ ⊗ 1 : R^k ⊗_R S → R^n ⊗_R S. But R^k ⊗_R S ≅ S^k and R^n ⊗_R S ≅ S^n, so we get an isomorphism S^k ≅ S^n, and by IBN on S we get k = n. For the rank condition we argue in the same way, letting φ : R^k → R^n be an epimorphism and using the rank condition on S. We have shown that R satisfies the strong rank condition.
3) Upon identifying R with f(R), the identity e of R is an idempotent in S, with the complementary idempotent f = 1 − e satisfying Rf = fR = 0. Let A, B ∈ M_n(R) be such that AB = eI_n. Then for A + fI_n, B + fI_n ∈ M_n(S),
(A + fI_n)(B + fI_n) = AB + f²I_n = eI_n + fI_n = (e + f)I_n = I_n.
If S is stably finite, this implies that I_n = (B + fI_n)(A + fI_n) = BA + fI_n, so we get BA = eI_n.
Proof: First assume R satisfies the rank condition. If R^n ≅ R^m, then the rank condition gives m = n, so R has IBN. Now assume R does not satisfy the rank condition, so that there is an epimorphism φ : R^k → R^n with k < n. But then
R^k ≅ R^n ⊕ ker φ ≅ R^k ⊕ (R^{n−k} ⊕ ker φ),
where R^{n−k} ⊕ ker φ ≠ 0. Therefore R is not stably finite. We have shown that stable finiteness implies the rank condition.
The additional hypothesis that R is simple will be needed when we show that the rank condition implies stable finiteness.
Proposition 3.3: A simple ring R satisfies the rank condition if and only if it is stably finite.
Proof: The "if" part follows from (3.2). Conversely, if R satisfies the rank condition, then R/I ≠ 0 is stably finite for some ideal I. But then, R being simple, the projection map R → R/I must be an isomorphism, so R ≅ R/I is stably finite.
Proposition 3.4: For any nonzero ring R, the following properties hold:
(i) R satisfies IBN / the rank condition / stable finiteness iff for all n ∈ ℕ, M_n(R) satisfies IBN / the rank condition / stable finiteness.
(ii) R satisfies IBN / the rank condition / stable finiteness iff R[[x]] satisfies IBN / the rank condition / stable finiteness.
(iii) R satisfies IBN / the rank condition / stable finiteness iff R[x] satisfies IBN / the rank condition / stable finiteness.
Schematically:
M_n(R) is "IBN / RC / SF" ⟺ R is "IBN / RC / SF" ⟺ R[[x]] is "IBN / RC / SF" ⟺ R[x] is "IBN / RC / SF".
Proof:
(i) The relationship between the ring R and M_n(R) for IBN, the rank condition, and stable finiteness.
a) Suppose M_n(R) has IBN. Since we have a ring homomorphism f : R → M_n(R) defined by f(r) = diag(r, ..., r), R also has IBN. Conversely, suppose M_n(R) does not have IBN. Then there exist p, q ∈ ℕ with p ≠ q and matrices A ∈ M_{p×q}(M_n(R)), B ∈ M_{q×p}(M_n(R)) with AB = I_p and BA = I_q.
the rank condition. Conversely, suppose R has IBN / the rank condition. Since we also have a ring homomorphism g : R[[x]] → R defined by g(Σ_{i≥0} r_i x^i) = r_0, we get that R[[x]] also has IBN / the rank condition.
b) Suppose R[[x]] is stably finite. Since R is a subring of R[[x]], R is also stably finite. Conversely, suppose R is stably finite, and consider the ideal (x) of R[[x]] generated by x. Since 1 + (x) consists of units of R[[x]], we get (x) ⊆ rad(R[[x]]). But then R[[x]]/(x) ≅ R, and R[[x]]/(x) is stably finite iff R[[x]] is stably finite, so the fact that R is stably finite implies that R[[x]] is also stably finite.
(iii) The relationship between the ring R and R[x] for IBN, the rank condition, and stable finiteness.
a) For IBN / the rank condition, the same proof as in a) of (ii) above applies.
Proposition 3.5: For any nonzero ring R, the following properties hold:
(i) R satisfies the strong rank condition iff for all n ∈ ℕ, M_n(R) satisfies the strong rank condition.
(ii) If R[[x]] satisfies the strong rank condition, then R satisfies the strong rank condition.
(iii) If R[x] satisfies the strong rank condition, then R satisfies the strong rank condition.
Schematically:
M_n(R) satisfies "SRC" ⟺ R satisfies "SRC", and R[[x]] satisfies "SRC" ⟹ R satisfies "SRC" ⟸ R[x] satisfies "SRC".
Proof:
(i) Suppose M_n(R) satisfies the strong rank condition, and consider the embedding φ : R → M_n(R) defined by φ(r) = diag(r, ..., r). Viewing M_n(R) as a left R-module via φ, we have M_n(R) ≅ R^{n²}, which is a flat R-module, so we get that R satisfies the strong rank condition. Conversely, suppose R does not satisfy the strong rank condition. Then there is an m such that there exists an embedding R^{m+1} → R^m, and hence an embedding (R^{n²})^{m+1} → (R^{n²})^m; therefore M_n(R) also does not satisfy the strong rank condition.
(ii) Suppose R[[x]] satisfies the strong rank condition. Viewing R[[x]] as a left R-module via the embedding R → R[[x]], R[[x]] is a flat R-module, so R also satisfies the strong rank condition.
(iii) Suppose R[x] satisfies the strong rank condition. Then so does R, since R[x] ≅ R ⊕ R ⊕ · · · as a left R-module, which is free and flat.
References
[1] BERRICK, A. J., KEATING, M. E., An Introduction to Rings and Modules, 2000, Cambridge University Press,
United Kingdom.
[2] BREAZ, S., CALUGAREANU, G., AND SCHULTZ, P., 1991, Modules With Dedekind Finite Endomorphism Rings,
Babes Bolyai University, 1-13.
[3] HAGHANY, A., VARADARAJAN, K., 2002, IBN And Related Properties For Rings, Acta Mathematica
Hungarica, 94, 251-261.
[4] HUNGERFORD, T. W. , Algebra, 2000, Springer Verlag, New York.
[5] LAM, T. Y., A First Course in Noncommutative Rings, 1991, Springer Verlag, New York.
[6] LAM, T. Y., Exercises in Modules and Rings, 2007, Springer Verlag, New York.
[7] LAM, T. Y., Lectures On Modules And Rings, 1999, Springer Verlag, New York.
SAMSUL ARIFIN
Department of Mathematics
Surya College of Teaching and Education
e-mail: [email protected]
Abstract. Let G be a graph with a set of vertices V(G) and a set of edges E(G). The distance from vertex u to vertex v in G, denoted by d(u, v), is the length of a shortest path from vertex u to v. The eccentricity of vertex u in graph G is the maximum distance from vertex u to any other vertex in G, denoted by e(u). Vertex v is an eccentric vertex of u if d(u, v) = e(u). The eccentric digraph ED(G) of a graph G is a digraph that has the same set of vertices as G, and there is an arc (directed edge) joining vertex u to v if v is an eccentric vertex of u. In this paper, we answer the open problem proposed by Boland and Miller [1] to find the eccentric digraph of various classes of graphs. In particular, we determine the eccentric digraph of the cartesian product graph Pn × Pm, where Pn and Pm are paths on n and m vertices, respectively.
1. INTRODUCTION
Most of the notation and terminology follows that of Chartrand and Oellermann [2] and Gallian [4]. Let G be a graph with a set of vertices V(G) and a set of edges E(G). The distance from vertex u to vertex v in G, denoted by d(u, v), is the length of a shortest path from vertex u to v. If there is no path joining vertex u and vertex v, then d(u, v) = ∞. The eccentricity of vertex u in graph G is the maximum distance from vertex u to any other vertex in G, denoted by e(u), so that e(u) = max{d(u, v) | v ∈ V(G)}. The radius of a graph G, denoted by rad(G), is the minimum eccentricity over all vertices of G. The diameter of a graph G, denoted by diam(G), is the maximum eccentricity over all vertices of G. If e(u) = rad(G), then vertex u is called a central vertex. The center of a graph G, denoted by cen(G), is the subgraph induced by the central vertices of G. Vertex v is an eccentric vertex of u if d(u, v) = e(u). The eccentric digraph ED(G) of a graph G is a digraph that has the same set of vertices as G, V(ED(G)) = V(G), and there is an arc (directed edge) joining vertex u to v if v is an eccentric vertex of u. An arc of a digraph D joining vertex u to v, together with the arc joining v to u, is called a symmetric arc. Further, Fred Buckley concluded that for almost every graph G, its
eccentric digraph is ED(G) = G*, where G* is the complement of G in which every edge is replaced by a symmetric arc.
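A small illustration (ours, not from the paper): for the 5-cycle C5, every vertex has eccentricity 2, and the eccentric vertices of u are exactly its two non-neighbours; hence ED(C5) is the complement of C5 (again a 5-cycle) with every edge replaced by a symmetric arc, in agreement with the observation above.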
One of the topics in graph theory is to determine the eccentric digraph of a given
graph. The eccentric digraph of a graph was initially introduced by Fred Buckley (Boland and
Miller [1]). Some authors have investigated the problem of finding the eccentric digraph.
For example, Boland and Miller [1] determined the eccentric digraph of a digraph, while
Gimbert et al. [5] found a characterisation of eccentric digraphs. Boland and Miller [1] also proposed an open problem to find the eccentric digraph of various classes of graphs. Some results related to this open problem can be found in [6, 7].
In this paper, we also answer the open problem proposed by Boland and Miller [1]. In particular, we determine the eccentric digraph of the cartesian product graph Pn × Pm.
The materials of this research are mostly from papers related to the eccentric digraph.
There are three steps to determine the eccentric digraph of a given graph (a small computational sketch is given after this description). In the first step, we determine the distance from vertex u to every vertex v of the graph, denoted by d(u, v), using the Breadth First Search (BFS) Moore algorithm taken from Chartrand and Oellermann [2], as follows.
(1) Take any vertex, say u, and label it 0, stating the distance from u to itself; all other vertices are labelled ∞.
(2) All vertices with label ∞ that are adjacent to u are labelled 1.
(3) All vertices with label ∞ that are adjacent to a vertex labelled 1 are labelled 2, and so on, until the required vertex, say v, has been labelled.
In the second step, we determine the eccentricity of vertex u by taking the maximum distance from u, and we obtain the eccentric vertex v of u whenever d(u, v) = e(u).
In the final step, by joining an arc from every vertex u to each of its eccentric vertices, we obtain the eccentric digraph of the given graph.
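The following Python sketch (ours, for illustration only; the function and variable names are not from the paper) carries out the three steps above for the grid graph Pn × Pm and can be used, for instance, to inspect the case P4 × P6 of Theorem 3.1 below.

```python
# Sketch: eccentric digraph of the cartesian product P_n x P_m,
# following the three steps described above.
from collections import deque

def grid_neighbors(v, n, m):
    i, j = v
    for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        if 0 <= i + di < n and 0 <= j + dj < m:
            yield (i + di, j + dj)

def bfs_distances(source, n, m):
    """Step 1: BFS (Moore algorithm) distances from `source`."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in grid_neighbors(u, n, m):
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist

def eccentric_digraph(n, m):
    """Steps 2 and 3: eccentricities and the arcs u -> eccentric vertex."""
    vertices = [(i, j) for i in range(n) for j in range(m)]
    arcs = []
    for u in vertices:
        dist = bfs_distances(u, n, m)
        ecc = max(dist.values())
        arcs.extend((u, v) for v in vertices if dist[v] == ecc)
    return arcs

if __name__ == "__main__":
    for arc in eccentric_digraph(4, 6):
        print(arc)
```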
Theorem 3.1. Let Pn × Pm be the cartesian product graph with n and m even. Then the eccentric digraph ED(Pn × Pm) is the digraph 2S_{p,p}, where S_{p,p} is a double star with p = (nm − 4)/4.
Proof. Using the BFS Moore algorithm, we obtain the eccentricities of the Pn × Pm graph as in Table 1. The eccentricities of all vertices from Table 1 are used to determine the eccentric vertices of all vertices of the Pn × Pm graph. The arcs are obtained by joining every vertex to its eccentric vertices. Table 2 shows the eccentric vertices and arcs of the Pn × Pm graph.
Figure 1. The cartesian product P4 × P6 graph and its eccentric digraph
Theorem 3.2. Let Pn × Pm be the cartesian product graph with n even and m odd. Then the eccentric digraph ED(Pn × Pm) is the digraph S¹_{p,p} ∪ S²_{p,p} ∪ 2K_{2,n/2}, where S¹_{p,p} and S²_{p,p} are double stars with p = (nm − n − 4)/4, and K_{2,n/2} = K₂ + K_{n/2}, K₂ being the set of the two cut vertices of S¹_{p,p} and S²_{p,p}.
Proof. Using the BFS Moore algorithm, we obtain the eccentricities of the Pn × Pm graph as in Table 3. The eccentricities of all vertices from Table 3 are used to determine the eccentric vertices of all vertices of the Pn × Pm graph. The arcs are obtained by joining every vertex to its eccentric vertices. Table 4 shows the eccentric vertices and arcs of the Pn × Pm graph.
Table 3. Eccentricities of the vertices of the Pn × Pm graph, where n is even and m is odd.
Table 4. Eccentric vertices and arcs of the Pn × Pm graph, where n is even and m is odd.
From Table 4, each vertex of the Pn × Pm graph is joined by an arc to its eccentric vertices. These arcs are not symmetric except those between 11 and nm and between n1 and 1m. Therefore the eccentric digraph of the Pn × Pm graph with n even and m odd can be formed into S¹_{p,p} ∪ S²_{p,p} ∪ 2K_{2,n/2}, where S¹_{p,p} and S²_{p,p} are double stars with p = (nm − n − 4)/4, and K_{2,n/2} = K₂ + K_{n/2}, K₂ being the set of the two cut vertices of S¹_{p,p} and S²_{p,p}. □
The following figure shows the cartesian product P4 × P5 graph and its eccentric digraph.
Figure 2. The cartesian product P4 × P5 graph and its eccentric digraph
Theorem 3.3. Let Pn × Pm be the cartesian product graph with n and m odd. Then the eccentric digraph ED(Pn × Pm) is the digraph S¹_{p,p} ∪ S²_{p,p} ∪ K_{2,(n−1)/2} ∪ K_{2,(m−1)/2} ∪ K_{1,4}, where S¹_{p,p} and S²_{p,p} are double stars with p = ((n − 1)(m − 1) − 4)/4, and K_{1,4} = K₁ + K₄, K₄ being the set of all cut vertices of S¹_{p,p} and S²_{p,p}.
Proof. Using the BFS Moore algorithm, we obtain the eccentricities of the Pn × Pm graph as in Table 5. Table 6 shows the eccentric vertices and arcs of the Pn × Pm graph.
Table 5. Eccentricities of the Pn × Pm graph vertices, where n and m are odd
Vertex of Pn × Pm                                      Eccentricity
11, 1m, n1, nm                                         n + m − 2
12, 21, 1(m−1), 2m, (n−1)1, n2, (n−1)m, n(m−1)         n + m − 3
...                                                    ...
(the remaining vertex classes, down to the central vertex ((n+1)/2)((m+1)/2), whose eccentricity is (n + m)/2 − 1)
From Table 6, each vertex of the Pn × Pm graph is joined by an arc to its eccentric vertices. These arcs are not symmetric except those between 11 and nm and between n1 and 1m. The eccentric digraph of the Pn × Pm graph with n and m odd can therefore be formed into S¹_{p,p} ∪ S²_{p,p} ∪ K_{2,(n−1)/2} ∪ K_{2,(m−1)/2} ∪ K_{1,4}, where S¹_{p,p} and S²_{p,p} are double stars with p = ((n − 1)(m − 1) − 4)/4, and K_{1,4} = K₁ + K₄, K₄ being the set of all cut vertices of S¹_{p,p} and S²_{p,p}. □
The following figure shows an example: the cartesian product P3 × P5 graph and its eccentric digraph.
Figure 3. The cartesian product P3 × P5 graph and its eccentric digraph
4. CONCLUDING REMARK
The results show that the eccentric digraph of the Pn × Pm graph is the digraph
1. 2S_{p,p}, where S_{p,p} is a double star with p = (nm − 4)/4, for n and m even;
2. S¹_{p,p} ∪ S²_{p,p} ∪ 2K_{2,n/2}, where S¹_{p,p} and S²_{p,p} are double stars with p = (nm − n − 4)/4 and K_{2,n/2} = K₂ + K_{n/2}, K₂ being the set of the two cut vertices of S¹_{p,p} and S²_{p,p}, for n even and m odd;
3. S¹_{p,p} ∪ S²_{p,p} ∪ K_{2,(n−1)/2} ∪ K_{2,(m−1)/2} ∪ K_{1,4}, where S¹_{p,p} and S²_{p,p} are double stars with p = ((n − 1)(m − 1) − 4)/4 and K_{1,4} = K₁ + K₄, K₄ being the set of all cut vertices of S¹_{p,p} and S²_{p,p}, for n and m odd.
As mentioned in the previous sections, the main goal of this paper is to find the eccentric digraph of a given class of graphs. Several authors have conducted research on such problems, and most of them have left open problems in their papers for future research. We suggest that readers investigate the problem proposed by Boland and Miller [1] by considering other classes of graphs.
References
[1] BOLAND, J. AND M. MILLER, The Eccentric Digraph of a Digraph, Proceeding of AWOCA’01, Lembang-
Bandung, Indonesia, 2001.
[2] CHARTRAND, G. AND O. R. OELLERMANN, Applied and Algorithmic Graph Theory, International Series in
Pure and Applied Mathematics, McGraw-Hill Inc, California, 1993.
[3] CHARTRAND, G. AND L. LESNIAK, Graphs and Digraphs, Chapman and Hall/CRC, New York, 1996.
[4] GALLIAN, J. A. Dynamic Survey of Graph Labeling, The Electronic Journal of Combinatorics, 2009, #16, 1-
219.
[5] GIMBERT, J., N. LOPEZ, M. MILLER, AND J. RYAN, Characterization of eccentric digraphs, Discrete
Mathematics, 2006, Vol. 306, Issue 2, 210 - 219.
[6] KUSMAYADI, T.A. AND M. RIVAI. The Eccentric Digraph of an Umbrella Graph, Proceeding of INDOMS
International Conference on Mathematics and Its Applications (IICMA), Gadjah Mada University
Yogyakarta, Indonesia 2009, ISBN 978-602-96426-0-5, pp 0627-0638
[7] KUSMAYADI, T.A. AND M. RIVAI. The Eccentric Digraph of a Double Cones Graph, Proceeding of INDOMS
International Conference on Mathematics and Its Applications (IICMA), Gadjah Mada University
Yogyakarta, Indonesia 2009, ISBN 978-602-96426-0-5, pp 0639-0646
SRI KUNTARI
Department of Mathematics Faculty of Mathematics and Natural Sciences Sebelas Maret
University Surakarta.
e-mail: [email protected]
1. INTRODUCTION
modules, (i.e.; finite indexed set, finitely -linearly independent, and ).We obtain that
N is -linearly independent if and only if there exist finite indexed set with
∐ is a monomorphism. Furthermore, N is finitely -linearly independent if and
only if every , N is a finitely -linearly independent.
In section 3, we study definition of notion subcategory for { } and shown
that is subcategory of category R-Mod. We obtain properties subcategory that are
for any submodule and factor module of modules in belong to .
Example 2.1 Let ℝ be the ring of real numbers and ℝ[x] be the ring of polynomials over ℝ. Then
there exist a monomorphism , which is defined by where
with , and for .
In studying the notions of direct sums and copies of modules, we need the properties of morphisms of direct sums, such as in the following proposition.
Studying the characterization is an important way to observe further properties. The following proposition gives the characterization of linear independence.
{ .
Thus ⁄
∐ ∐ ⁄
∐ is a mono-morphism, with
∐ ⁄ . This means N is -linearly independent. □
Definition 2.4 Let a class of R-modules and an R-module N be given. The class is finitely linearly independent to N (or N is finitely linearly independent over the class) if there is a monomorphism from N into the product ∏ of finitely many indexed modules in the class.
Proof: Suppose N is isomorphic to a direct sum of modules in the class. Then there exists an isomorphism f, so f is both a monomorphism and an epimorphism. This implies that the class is linearly independent to N and that it generates N.
Conversely, suppose the class is both linearly independent to N and generates N. Then there exists an isomorphism, so N is isomorphic to a direct sum of modules in the class. □
3. SUBCATEGORY
In this section, some examples are given which will be used to show that the collection under consideration is a subcategory of the category R-Mod.
Proof:
1. Let N, K and L be R-modules. If , then ⁄ ⁄ . For N is M-linearly
independent, we have ⁄ . So we obtain K is M-linearly independent, this
implies ⁄ .
2. Let N, K and L be R-modules. If , then ⁄ ⁄ . By the modules
⁄ ⁄
isomorphism theorem, (i.e. ⁄ ⁄ ⁄ ), this implies ⁄ ⁄ .
3. Let ⁄ be a factor module in . Then we have ⁄ in . Now we
showed that ( ⁄ ) ⁄ , as follow:
We define ⁄ ( ⁄ ) or ̅̅̅̅̅̅̅ ̅̅̅ , so we have shown:
i. is a mapping. Take any ̅̅̅̅̅̅̅ ̅̅̅̅̅̅̅̅ ⁄ with ̅̅̅̅̅̅̅ ̅̅̅̅̅̅̅̅ , so
we have;
̅̅̅̅̅̅̅ ̅̅̅̅̅̅̅̅
̅̅̅̅̅̅ ̅̅̅̅̅̅
(̅̅̅̅̅̅̅) (̅̅̅̅̅̅̅̅) ,
̅̅̅ ̅̅̅̅
̅̅̅ ̅̅̅̅
̅̅̅ ̅̅̅̅ ,
iii. is injective. Take any ̅̅̅ ̅̅̅̅ ( ⁄ ) with ̅̅̅ ̅̅̅̅ , so
we have;
̅̅̅ ̅̅̅̅
̅̅̅̅̅̅̅ ̅̅̅̅̅̅̅̅ ,
4. CONCLUDING REMARK
References
[1] ADKINS, W., WEINTRAUB, S.H., Algebra, Springer-Verlag New York Berlin Heidelberg, 1992.
[2] ANDERSON, F.W., FULLER, K.R., Rings and Category of Modules, 2nd edition, Springer-Verlag, 1992.
[3] BEACHY, J. A., M-injective Modules and Prime M-ideals, Article,----
[4] GARMINIA, H., ASTUTI, P., Journal of The Indonesian Mathematical Society, Karakterisasi Modul -
koherediter, Indonesian Mathematical Society, 2006.
[5] MACLANE, S. , BIRKHOFF, G., Algebra, MacMillan Publishing CO., INC., New York, 1979.
[6] PASSMAN, S. DONALD, A Course in Ring Theory, Wadsworth & Brooks, Pacific Grove, California, 1991.
[7] WISBAUER, R., Foundations of Module and Ring Theory, University of Dűsseldorf, Germany, Gordon and
Breach Science Publishers, 1991.
SUPRAPTO
Junior High School of 1 Banguntapan, Bantul, Yogyakarta, Indonesia.
Ph.D. Student of Mathematics, Universitas Gadjah Mada, Yogyakarta, Indonesia.
e-mail : [email protected] HP 0817 273 274
SRI WAHYUNI
Department of Mathematics, Universitas Gadjah Mada, Yogyakarta, Indonesia.
e-mail : [email protected]
IRAWATI
Department of Mathematics, Institute Technology Bandung, Indonesia.
e-mail : [email protected]
Proceedings of ”The 6th SEAMS-UGM Conference 2011”
Algebra, pp. 249 - 258.
Abstract. In this paper we present the Moore–Penrose inverse in rings with involution. J.J. Koliha and Pedro Patricio formulated the definition of the Moore–Penrose inverse in rings with involution and showed that two one-sided invertibility conditions imply Moore–Penrose invertibility. In this paper a ring means an associative ring with unit 1 ≠ 0.
Several concepts in rings given by other authors motivated this paper: regularity and group invertibility of elements were given by Pedro Patricio and Roland Puystjens, and Drazin invertibility was given by J.J. Koliha. We use those concepts to investigate the existence of the Moore–Penrose inverse in rings with involution.
1. INTRODUCTION
During the decade 1910–1920, E. H. Moore introduced and studied the general inverse for arbitrary complex matrices. The generalized inverse of a matrix was rediscovered by R. Penrose in 1955. In [5] R. Penrose described a generalization of the inverse of a non-singular matrix as the unique solution of a certain set of equations; see Theorem 1.1 below. This generalized inverse exists for any (possibly rectangular) matrix with complex entries and is nowadays called the Moore–Penrose inverse.
Theorem 1.1: [5] If A and X are matrices (not necessarily square) with complex elements, then the four equations
AXA = A (1.1)
XAX = X (1.2)
(AX)^H = AX (1.3)
(XA)^H = XA (1.4)
have a unique solution X for any A.
The conjugate transpose of the matrix A is written A^H. The unique solution of (1.1), (1.2), (1.3) and (1.4) is called the Moore–Penrose inverse of A and written X = A^+. (Note that A need not be a square matrix and may even be zero.)
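As a computational illustration (ours, not from the paper), NumPy's pinv computes the matrix Moore–Penrose inverse, and the four Penrose equations (1.1)–(1.4) can be checked numerically:

```python
# Sketch: verify the four Penrose equations (1.1)-(1.4) numerically.
import numpy as np

A = np.array([[1.0, 0.0, 2.0],
              [0.0, 1.0, 1.0]])          # a rectangular example matrix
X = np.linalg.pinv(A)                     # Moore-Penrose inverse A^+

checks = {
    "(1.1) AXA = A": np.allclose(A @ X @ A, A),
    "(1.2) XAX = X": np.allclose(X @ A @ X, X),
    "(1.3) (AX)^H = AX": np.allclose((A @ X).conj().T, A @ X),
    "(1.4) (XA)^H = XA": np.allclose((X @ A).conj().T, X @ A),
}
for name, ok in checks.items():
    print(name, ok)
```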
In this paper we present the Moore–Penrose inverse in rings with involution. Koliha and Pedro Patricio [3] formulated the definition of the Moore–Penrose inverse in rings with involution and showed that two one-sided invertibility conditions imply Moore–Penrose invertibility.
Definition 1.2: [3] An involution on a ring R is an operation a ↦ a* such that
i) (a + b)* = a* + b*,
ii) (ab)* = b*a*,
iii) (a*)* = a,
for each a, b ∈ R.
The next well-known lemma asserts that two one-sided invertibility conditions imply Moore–Penrose invertibility.
Lemma 1.4: [3] Let a ∈ R. Then a ∈ R⁺ if and only if there exist x, y ∈ R such that
axa = a = aya, (ay)* = ay, (xa)* = xa.
In this case a⁺ = xay.
Proof: By (1.20) we obtain that a = aa*x* = a(xa)* = axa ⟺ (xa)* = xa, and by (1.25) we have a = y*a*a = (ay)*a = aya ⟺ (ay)* = ay. Similarly we have axa = a = aya. Further, a = aya = axaya = aa⁺a with a⁺ = xay.
There are several concepts of elements in rings with involution. In this section we discuss the concepts of regular element and group inverse, which were given by Pedro Patricio and Roland Puystjens [4]. We also study the Drazin inverse, which was given by Koliha [2].
Further, if the group inverse exists then it is unique, and a is called group invertible. Indeed, suppose x and y are group inverses of a. Then
axa = a (2.4)
xax = x (2.5)
ax = xa (2.6)
and
aya = a (2.7)
yay = y (2.8)
ay = ya. (2.9)
By (2.4)–(2.9) we get
x = xax = axx = ayaxx = ayxax = ayxayax = yaxayax = yayax = yayxa = yyaxa = yya = y,
hence the group inverse is unique.
Definition 2.2 : [4]An element a ∈R is said to have a Drazin inverse if there exists b ∈R such
that
ab = ba (2.10)
b = ab2 (2.11)
ak= ak+1 b (2.12)
for some nonnegative integer k. The least nonnegative integer k for which these equations
hold is the Drazin index i(a) of a. The set of all Drazin invertible elements of R will be
denoted by RD.
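A quick illustration (ours, not from the paper): if a is nilpotent with a^k = 0, then b = 0 is a Drazin inverse of a, since ab = ba = 0, b = ab² = 0 and a^k = a^{k+1}b = 0; here the Drazin index i(a) is the nilpotency index of a.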
Definition 2.3:[4]An element a ∈ R is regular (in the sense of von Neumann) if it has an inner
inverse x, that is, if there exists x ∈R such that axa = a. Any inner inverse of a will be denoted
by . The set of all regular elements of R will be denoted by .
Applying the involution to Definition 2.5, we observe that a is *-cancellable if and only if a* is *-cancellable.
Proof:
(⇒) Suppose that a is *-cancellable, that is, a*ax = 0 ⇒ ax = 0 and xaa* = 0 ⇒ xa = 0. Then
(a*)*a*x* = 0 ⇒ xaa* = 0 ⇒ xa = 0 ⇒ (xa)* = 0 ⇒ a*x* = 0,
and
x*a*(a*)* = 0 ⇒ a*ax = 0 ⇒ ax = 0 ⇒ (ax)* = 0 ⇒ x*a* = 0,
so a* is *-cancellable.
(⟸) Conversely, suppose a* is *-cancellable. Then
a*ax = 0 ⇒ (a*ax)* = 0 ⇒ x*a*(a*)* = 0 ⇒ x*a* = 0 ⇒ (ax)* = 0 ⇒ ax = 0,
and
xaa* = 0 ⇒ (xaa*)* = 0 ⇒ (a*)*a*x* = 0 ⇒ a*x* = 0 ⇒ (xa)* = 0 ⇒ xa = 0.
Hence a is *-cancellable if and only if a* is *-cancellable.
It is often useful to observe that if a is *-cancellable then a*a and aa* are *-cancellable.
Proof:
Suppose a is *-cancellable and let (a*a)*(a*a)x = 0. Then
(a*a)*(a*a)x = 0 ⇒ a*aa*ax = 0 ⇒ aa*ax = 0 ⇒ x*a*(a*)*a* = 0 ⇒ x*a*(a*)* = 0 ⇒ (a*ax)* = 0 ⇒ a*ax = 0.
Hence a*a is *-cancellable. In a similar way one can prove that aa* is *-cancellable.
Studying the concepts of elements in rings with involution described previously motivates us to find the relation between those concepts and the Moore–Penrose inverse. In the next theorem we see how those concepts establish the existence of the Moore–Penrose inverse. While J.J. Koliha [3] proved the next theorem using the spectral idempotent element, this paper uses another way: the theorem is proved only by using group invertibility, Drazin invertibility and regular elements. We also prove several characterizations that were not proven by Koliha.
Proof: First we prove the implications (i) ⇒ (iii) ⇒ (v) ⇒ (vii) ⇒ (i). (*)
(i) ⇒ (iii) We need to prove that a is *-cancellable.
Given a ∈ R⁺ and a*ax = 0, we have ax = aa⁺ax = (aa⁺)*ax = (a⁺)*a*ax = 0.
Given a ∈ R⁺ and xaa* = 0, we have xa = xaa⁺a = xa(a⁺a)* = xaa*(a⁺)* = 0.
Next we prove that a*a ∈ R⁺. The Moore–Penrose invertibility of a*a is obtained by verifying that (a*a)⁺ = a⁺(a⁺)*.
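For matrices, this identity can again be checked numerically (an illustration of ours, reusing the NumPy sketch above):

```python
# Sketch: numerical check of (A^* A)^+ = A^+ (A^+)^* for a matrix A.
import numpy as np

A = np.array([[1.0, 2.0],
              [0.0, 1.0],
              [1.0, 0.0]])
lhs = np.linalg.pinv(A.conj().T @ A)
rhs = np.linalg.pinv(A) @ np.linalg.pinv(A).conj().T
print(np.allclose(lhs, rhs))              # expected: True
```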
References
[1] BEN ISRAEL, The Moore of The Moore Penrose Inverse , The Electronic Journal of Linear Algebra 2002.
[2] J.J.KOLIHA ANDA ADI , Generalized Drazin Inverse, Glasglow Math, 1996
[3] J.J.KOLIHA AND PEDRO PATRICIO, Elements of Rings with Equal Spectral Idempotents , Australian
Mathematical Society M, 2001.
[4] PEDRO PATRICIO AND ROLAND PUYSTJENS, Drazin Moore Penrose Invertibility in Rings, AMS, 2004.
[5] R.PENROSE, ( Communicated by J.A.Todd) , A Generalized Inverse for Matrices , St. John’s College
Cambridge,1954.
BUDI SURODJO
UGM
e-mail: [email protected]
SRI WAHYUNI
UGM
e-mail: [email protected]
Proceedings of ”The 6th SEAMS-UGM Conference 2011”
Analysis, pp. 259–266.
Atok Zulijanto
Abstract. In this paper, we prove a result that yields an upper bound of the oscillation
index of a function that is a limit of a sequence of Baire-1 functions using the zero index
of related gauges.
Keywords and Phrases: Oscillation index, convergence index, Baire-1 functions, gauges
.
1. PRELIMINARIES
Let X be a metrizable space. A real-valued function f is said to be of Baire class
one or simply Baire-1, if there exists a sequence (fn ) of real-valued continuous functions
that converges pointwise to f . The Baire Characterization Theorem [1] states that if
X is a Polish (separable completely metrizable), then f : X → R is a Baire-1 function if
and only if for all nonempty closed subset F of X, f |F has a point of continuity. This
leads naturally to an ordinal index for Baire-1 functions, called the oscillation index
(see, [2] and [3]). In [3], Kechris and Louveau also introduced another ordinal index for
Baire-1 functions, called the convergence index. The study of Baire-1 function in terms
of ordinal indices was continued by several authors (see, e.g.,[4], [6] and [7]).
Let X be a metrizable space and C denote the collection of all closed subsets of X.
Now, let ε > 0 and a function f : X → R be given. For any H ∈ C, let D0 (f, ε, H) = H
and D1 (f, ε, H) be the set of all x ∈ H such that for every open set U containing x,
there are two points x1 and x2 in U ∩ H with |f (x1 ) − f (x2 )| ≥ ε. For all α < ω1 (the
first uncountable ordinal number), set
gauge of f . We will recall the definition of convergence index of a sequence in the next
section.
Before we present our results, we recall the definition of zero index of gauges (see,
[8]). Let X be a Polish space and π be a positive function on X. For any closed subset
H of X, let Z 0 (π, H) = H and Z 1 (π, H) be the set of all x ∈ H such that for any
neighborhood U of x it holds that inf{π(y) : y ∈ U ∩ H} = 0. For all α < ω1 , define
Z^{α+1}(π, H) = Z¹(π, Z^α(π, H)). If α is a limit ordinal, let
\[ Z^{\alpha}(\pi, H) = \bigcap_{\alpha' < \alpha} Z^{\alpha'}(\pi, H). \]
We shall write o(π) for oX (π). It is easy to prove that for any closed subset H of X,
Z 1 (π, H) is closed.
The following theorems can be found in [8].
Theorem 1.2. (see [8], Proposition 3) Let ε > 0 and a Baire-1 function f : X → R be
given. If δ : X → R+ is an ε-gauge of f , then β(f, ε) ≤ o(δ).
Theorem 1.3. (see [8], Theorem 4) Let f : X → R be a Baire-1 function. Then for
any ε > 0 there exists an ε-gauge δ of f such that o(δ) = β(f, ε).
In order to compute the oscillation index of Baire-1 functions using zero index of
the appropriate gauges, we need the following computational tool that can be found in
[8].
Theorem 1.4. (see [8], Theorem 6) If π1 , π2 : X → R+ are positive functions with
o(π1 ) ≤ ω ξ and o(π2 ) ≤ ω ξ for some ξ < ω1 , then o(π1 ∧ π2 ) ≤ ω ξ .
2. RESULTS
Throughout this section, let X be a Polish space with a compatible metric d.
Before we proceed to our main result, we recall the definition of convergence index of
a sequence. Let (fn ) be a sequence of real-valued functions on X and H be a closed
subset of X. For any ε > 0, let D0 ((fn ), ε, H) = H and D1 ((fn ), ε, H) be the set of
those x ∈ H such that for every neighborhood U of x and any N ∈ N, there are n and
m in N with n > m > N and x0 ∈ U ∩ H such that |fn (x0 ) − fm (x0 )| ≥ ε. For all
countable ordinals α, let
Dα+1 ((fn ), ε, H) = D1 ((fn ), ε, Dα ((fn ), ε, H)).
Proof. Let ε > 0 be given. By Theorem 1.3, there exists a sequence (δ_n)_{n≥1} of positive functions on X such that for each n ∈ ℕ, δ_n is an ε/3-gauge of f_n and o(δ_n) = β(f_n, ε/3) ≤ ω^ξ. By replacing δ_{n+1} with δ'_{n+1} = δ_{n+1} ∧ δ_n for each n ∈ ℕ if necessary, we can assume that δ_{n+1} ≤ δ_n for all n ∈ ℕ.
For all x ∈ D^α((f_n), ε/3, X) \ D^{α+1}((f_n), ε/3, X), there exist r_α(x) > 0 and N(α, x) ∈ ℕ such that whenever n > m ≥ N(α, x) we have
|f_n(x') − f_m(x')| < ε/3
for all x' ∈ B(x, r_α(x)) ∩ D^α((f_n), ε/3, X). Taking the limit as n → ∞, we have
|f(x') − f_m(x')| ≤ ε/3   (1)
for all m ≥ N(α, x) and all x' ∈ B(x, r_α(x)) ∩ D^α((f_n), ε/3, X).
Since {B(x, r_α(x)/2) : x ∈ D^α((f_n), ε/3, X) \ D^{α+1}((f_n), ε/3, X)} is an open cover of the separable (thus Lindelöf) space D^α((f_n), ε/3, X) \ D^{α+1}((f_n), ε/3, X), there exists (x_i)_{i=1}^∞ ⊆ D^α((f_n), ε/3, X) \ D^{α+1}((f_n), ε/3, X) such that
\[ D^{\alpha}\bigl((f_n), \tfrac{\varepsilon}{3}, X\bigr) \setminus D^{\alpha+1}\bigl((f_n), \tfrac{\varepsilon}{3}, X\bigr) \subseteq \bigcup_{i=1}^{\infty} B\Bigl(x_i, \frac{r_{\alpha}(x_i)}{2}\Bigr). \]
For all x ∈ D^α((f_n), ε/3, X) \ D^{α+1}((f_n), ε/3, X), let j(x) = min{j : x ∈ B(x_j, r_α(x_j)/2)}. For all m, j ∈ ℕ, denote
\[ r^{\alpha}_{m,j} = \frac{1}{m} \wedge \min_{1 \le i \le j} \frac{r_{\alpha}(x_i)}{2}. \]
Then (r^α_{m,j})_m and (r^α_{m,j})_j are non-increasing sequences. For all α < γ_0 and m ∈ ℕ, let U^{α+1}_m be the 1/m-neighborhood of D^{α+1}((f_n), ε/3, X), set m_x = min{m : x ∈ D^α((f_n), ε/3, X) \ U^{α+1}_m} and N_{α,j} = max_{1≤i≤j} N(α, x_i).
Define δ : X → ℝ⁺ by
δ(x) = r^α_{m_x, j(x)} ∧ δ_{N_{α,j(x)}}(x)
whenever x ∈ D^α((f_n), ε/3, X) \ D^{α+1}((f_n), ε/3, X). Let x, y ∈ X with d(x, y) < min{δ(x), δ(y)}. Suppose that x ∈ D^α((f_n), ε/3, X) \ D^{α+1}((f_n), ε/3, X). Since y ∈ B(x, δ(x)) and δ(x) ≤ r^α_{m_x, j(x)} ≤ 1/m_x, we have y ∉ D^{α+1}((f_n), ε/3, X). Therefore y ∈ D^β((f_n), ε/3, X) \ D^{β+1}((f_n), ε/3, X) for some β ≤ α. By symmetry we have β = α.
If j(x) ≤ j(y), then N_{α,j(x)} ≤ N_{α,j(y)}. It follows that δ_{N_{α,j(x)}} ≥ δ_{N_{α,j(y)}}. Therefore
d(x, y) < min{δ_{N_{α,j(x)}}(x), δ_{N_{α,j(y)}}(y)} ≤ min{δ_{N_{α,j(x)}}(x), δ_{N_{α,j(x)}}(y)},
which implies that |f_{N_{α,j(x)}}(x) − f_{N_{α,j(x)}}(y)| < ε/3. Also, since x ∈ B(x_{j(x)}, r_α(x_{j(x)})/2) and y ∈ B(x, δ(x)), we see that
d(y, x_{j(x)}) ≤ d(y, x) + d(x, x_{j(x)}) < δ(x) + r_α(x_{j(x)})/2 ≤ r_α(x_{j(x)})/2 + r_α(x_{j(x)})/2 = r_α(x_{j(x)}).
Thus, both x and y belong to B(x_{j(x)}, r_α(x_{j(x)})) ∩ D^α((f_n), ε/3, X) and N_{α,j(x)} ≥ N(α, x_{j(x)}). It follows from (1) that |f_{N_{α,j(x)}}(z) − f(z)| ≤ ε/3 for z = x or z = y. Therefore we have
|f(x) − f(y)| ≤ |f(x) − f_{N_{α,j(x)}}(x)| + |f_{N_{α,j(x)}}(x) − f_{N_{α,j(x)}}(y)| + |f_{N_{α,j(x)}}(y) − f(y)| < ε/3 + ε/3 + ε/3 = ε.
Similarly, if we assume that j(x) ≥ j(y), we also obtain |f(x) − f(y)| < ε.
Acknowledgement. I would like to thank Dr Tang Wee Kee for suggesting this
problem to me.
References
[1] Baire, R., Sur les Fonctions des Variables Relles, Ann. Mat. Pura ed Appl. 3, 1-122, 1899.
[2] Haydon, R., Odell, E. and Rosenthal, H. P., On Certain Classes of Baire-1 Functions with
Applications to Banach Space Theory, in : Functional Analysis, Lecture Notes in Math., 1470,
1-35, Springer, New York, 1991.
[3] Kechris, A. S. and Louveau, A., A Classification of Baire Class 1 Functions, Trans. Amer.
Math. Soc. 318, 209-236, 1990.
[4] Kiriakouli, P., A Classification of Baire-1 Functions, Trans. Amer. Math. Soc. 351, 4599-4609,
1999.
[5] Lee, P.-Y., Tang, W.-K, and Zhao, D., An Equivalent Definition of Functions of The First Baire
Class, Proc. Amer. Math. Soc. 129, 2273-2275, 2000.
[6] Leung, D. H. and Tang, W.-K., Functions of Baire Class One, Fund. Math. 179, 225-247, 2003.
[7] Leung, D. H. and Tang, W.-K., Extension of Functions with Small Oscillation, Fund. Math.
192, 183-193, 2006.
[8] Leung, D. H., Tang, W.-K. and Zulijanto, A., A gauge Approach to An Ordinal Index of Baire
One Functions, Fund. Math. 210, 99-109, 2010.
Atok Zulijanto
Department of Mathematics Faculty of Mathematics and Natural Sciences
Gadjah Mada University, Indonesia
e-mail: [email protected]
Proceedings of ”The 6th SEAMS-UGM Conference 2011”
Analysis, pp. 267–274.
1. INTRODUCTION
Based on the definition of regulated function that is introduced by Tvrdŷ [5], the
set of all regulated functions has been characterized as the closure of the union of C[a, b]
and BV [a, b], where C[a, b] and BV [a, b] stand for the collection of all continuous func-
tions on [a, b] ⊆ R and the collection of all bounded variation functions on [a, b] ⊆ R,
respectively [2]. Based on the result of the characterization of the regulated function,
some theorems on the Henstock–Stieltjes integral on [a, b] ⊆ ℝ could be improved by using regulated functions. In [3], Indrati generalized the concept of regulated function to the n-dimensional space and used it to show that the product of two Henstock integrable functions is still Henstock integrable on a cell E ⊆ ℝⁿ. In this paper we characterize the set of regulated functions on a cell E ⊆ ℝⁿ. We characterize regulated functions in the Henstock–Stieltjes integral to obtain a generalization of the result in [2]. Furthermore, we build a convergence theorem for the Henstock–Stieltjes integral on a cell E ⊆ ℝⁿ by using regulated functions.
Some concepts that will be used in the generalization are stated below [3].
In this discussion, a cell E stands for a non degenerate closed and bounded interval
in the Euclidean space Rn . Its volume will be represented by |E|.
267
268 Ch. Rini Indrati
We start the result of the research in this paper by giving some characterization
below. The characterization of the regulated function has been done based on the
Definition 2.2.
From the definition of regulated function, we have a characteristic of regulated
function in a sequence of step functions in Theorem 2.2.
Theorem 2.2. A function f is regulated on a cell E ⊆ Rn if and only if there exists a
sequence of step functions {ϕk } that converges uniformly to f on E.
Lemma 2.2. If ϕ and φ are step functions on a cell E, then ϕ + φ and αϕ are step
functions.
Proof. (⇒) Let ε > 0 be given. There exists a step function ϕ on E such that for every x ∈ E we have
|ϕ(x) − f(x)| < ε/2.
Let D = {I} = {I₁, I₂, . . . , I_p} be the partition of E associated with the step function ϕ on E. Therefore, for any x, y ∈ I_i^o, we have
|f(x) − f(y)| ≤ |f(x) − ϕ(x)| + |ϕ(y) − f(y)| < ε.
(⇐) Let ε > 0 be given. From the hypothesis, there is a partition D = {I} = {I_i = [a_i, b_i], i = 1, 2, 3, . . . , p} of E such that for every x, y ∈ I_i^o, 1 ≤ i ≤ p, we have
|f(x) − f(y)| < ε.
We define a function ϕ on E by
ϕ(x) = f(a_i) for x ∈ I_i^o, and ϕ(x) = min{f(x) : x ∈ ∂(I_i)} otherwise.
Then ϕ is a step function on E and for every x ∈ E,
|ϕ(x) − f(x)| < ε.
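As a small computational illustration (ours, not from the paper), the one-dimensional case of this construction is easy to carry out: a continuous function on [0, 1] is approximated uniformly by a step function built from a sufficiently fine partition.

```python
# Sketch: uniform approximation of a continuous function on [0, 1]
# by a step function, in the spirit of the construction above (1-D case).
import math

def step_approximation(f, num_pieces):
    """Return a step function taking the value f(left endpoint)
    on each subinterval of an equally spaced partition of [0, 1]."""
    cuts = [i / num_pieces for i in range(num_pieces + 1)]
    def phi(x):
        i = min(int(x * num_pieces), num_pieces - 1)   # subinterval index
        return f(cuts[i])
    return phi

f = math.sin
phi = step_approximation(f, 1000)
error = max(abs(f(x) - phi(x)) for x in [k / 10000 for k in range(10001)])
print(error)   # small; shrinks as num_pieces grows (f is uniformly continuous)
```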
where the supremum is taken over all partitions D = {I} of E. We say that g has bounded variation on E if V_g(E) < ∞, i.e., there is a constant M ≥ 0 such that for every partition D = {I} of E we have
(D) Σ |g(I)| ≤ M.
In this section, we will consider that the function g is additive on a cell E.
3.1. The Henstock–Stieltjes Integrability of Regulated Functions. In the Riemann–Stieltjes integral on the real line, there is no guarantee of the integrability of a function with respect to a function when they have a common point of discontinuity. The Henstock–Stieltjes integral on the real line does give such a guarantee [2]. In this paper we give a generalized result in Theorem 3.4.
Definition 3.3. A function f is said to be Henstock–Stieltjes integrable with respect to a function g on a cell E ⊆ ℝⁿ if there exists a real number A such that for every ε > 0 there is a positive function δ on E such that for every δ-fine partition D = {(I, x)} = {(I₁, x₁), (I₂, x₂), . . . , (I_p, x_p)} of E we have
|(D) Σ f(x)g(I) − A| < ε,
where (D) Σ f(x)g(I) = Σ_{i=1}^p f(x_i) g(I_i).
The real number A in Definition 3.3 is unique and is called the Henstock–Stieltjes integral value of f with respect to g on E, written
A = (HS) ∫_E f dg.
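As an illustration only (ours, in the one-dimensional case with hypothetical choices of f and g), a sum of the form (D) Σ f(x)g(I) over a tagged partition can be computed directly; for fine enough partitions it approaches the integral value A.

```python
# Sketch: a Henstock/Riemann-Stieltjes style sum over a tagged partition
# of [0, 1] in one dimension, with g(I) = length of I (so A = integral of f).
def stieltjes_sum(f, g_of_interval, tagged_partition):
    """(D) sum of f(tag) * g(I) over a tagged partition [((a, b), tag), ...]."""
    return sum(f(tag) * g_of_interval(a, b) for (a, b), tag in tagged_partition)

def uniform_tagged_partition(p):
    """p subintervals of [0, 1], each tagged by its left endpoint."""
    return [((i / p, (i + 1) / p), i / p) for i in range(p)]

f = lambda x: x * x                       # example integrand
g = lambda a, b: b - a                    # interval function: length
for p in (10, 100, 1000):
    print(p, stieltjes_sum(f, g, uniform_tagged_partition(p)))
# The sums approach 1/3, the integral of x^2 over [0, 1].
```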
Proof. Since g has bounded variation, there exists a constant M > 0 such that for every partition D = {I} of E we have
(D) Σ |g(I)| ≤ M.
Put δ(x) = 2^{−(n+1)}δ for every x ∈ E. As a consequence, for every two δ-fine partitions
D₁ = {(I, x)} = {(I₁, x₁), (I₂, x₂), . . . , (I_p, x_p)} and D₂ = {(J, y)} = {(J₁, y₁), (J₂, y₂), . . . , (J_m, y_m)} of E, we have
\[ \Bigl| (D_1)\sum f(x) g(I) - (D_2)\sum f(y) g(J) \Bigr| = \Bigl| \sum_{i=1}^{p} \sum_{j=1}^{m} \bigl( f(x_i) - f(y_j) \bigr) g(I_i \cap J_j) \Bigr| \le \sum_{i=1}^{p} \sum_{j=1}^{m} |f(x_i) - f(y_j)|\, |g(I_i \cap J_j)| < \frac{\varepsilon}{M} \sum_{i=1}^{p} \sum_{j=1}^{m} |g(I_i \cap J_j)| \le \frac{\varepsilon}{M} \cdot M = \varepsilon. \]
Proof. Let ε > 0 be given. Since f is regulated on E, by Theorem 2.4 there exists a partition D = {I} = {I₁, I₂, . . . , I_p} of E such that for every x, y ∈ I_i^o, 1 ≤ i ≤ p, we have
|f(x) − f(y)| < ε/M,
where M = V_g(E). We define a positive function δ on E such that for every x ∈ I_i^o, B(x, δ(x)) ⊆ I_i^o, for i = 1, 2, 3, . . . , p. Therefore, for any two δ-fine partitions D₁ = {(I, x)} = {(I₁, x₁), (I₂, x₂), . . . , (I_p, x_p)} and
there exists A ∈ ℝ such that {A_k} converges to A. The number A is the Henstock–Stieltjes integral value of f with respect to g on E. Moreover, we have
\[ \lim_{k \to \infty} (HS)\int_E f_k\, dg = (HS)\int_E \lim_{k \to \infty} f_k\, dg. \]
4. CONCLUDING REMARKS
We have given a characterization of regulated functions in Section 2, especially Theorem 2.4. The space of all regulated functions includes the space of all continuous functions on a cell E ⊆ ℝⁿ. From Theorem 2.4 we prove that every regulated function on a cell E ⊆ ℝⁿ is Henstock–Stieltjes integrable with respect to an additive function of bounded variation, in Theorem 3.4. The convergence theorem involving regulated functions is stated in Theorem 3.6. These results open opportunities for solving differential equation problems with discontinuities.
References
[1] Bartle, R.G. and Sherbert, D.R., Introduction to Real Analysis, Third Edition, J. Wiley &
Sons, USA, 2000.
[2] Indrati, Ch. R., On the Regulated Function, Proceeding of National Seminar on Mathematics, ,
Surabaya, Indonesia, June 20, 2009, pp. 8 - 24, 2009.
[3] Indrati, Ch. R., The Application of Regulated Function on the Multiplication of Two Henstock
Integrable Functions, Proceeding of ICMSA IMT-GT, Padang, Indonesia, 2009.
[4] Pfeffer, W.F., The Riemann Approach to Integration, Cambridge University Press, New-York,
USA, 1993.
[5] Tvurdŷ, M., Linear Bounded Functionals on the Space of regular Regulated Functions, Tatra
Mountains Mathematical Publications 8, 203 - 210, 1996.
Abstract. In this paper we define a uniformly continuous mapping which is induced by a symmetric
gauge, called M s -uniformly continuous mapping. By this M s -uniformly continuous mapping, we
characterize new compactness which is induced by gauge symmetric, called symmetric gauge
compact.
Keywords and Phrases: Symmetric gauge, M s -uniformly continuous mapping, Symmetric
gauge compact.
1. INTRODUCTION
a function δ defined on X with positive values. A gauge is used to define the Henstock–Kurzweil integral [1] and to provide a Cauchy-type characterization for Baire class one functions [4]. For any positive-valued function δ on a metric space (X, d), we can define an open neighbourhood: for each x ∈ X,
B(x, δ(x)) = {y ∈ X : d(x, y) < δ(x)}.
Since for an arbitrary topological space we do not use a metric, and consequently we cannot define an open neighbourhood as in a metric space, we define a gauge on a topological space X as a function from X to the topology of the space itself. In 2005, Zhao introduced a binary relation R_M induced by a gauge δ on the topological space X [5]: for every x, y ∈ X, x R_M y if and only if y ∈ δ(x) or x ∈ δ(y). Using this relation he introduced M-uniformly continuous mappings and gauge compact spaces [5].
2.1. Symmetric Gauge. Let (X, d) be a metric space and a be any positive number. We define a gauge δ_a on X by δ_a(x) = {z ∈ X : d(z, x) < a}, for every x ∈ X.
Theorem 6 Let X and Y be topological spaces. Every continuous mapping is Ms- uniformly
continuous.
Corollary 9 A space Y is an R0 -space if and only if for any space X, every M s - uniformly
continuous mapping is continuous.
In this section we will use a symmetric gauge to generate gauge compactness, i.e. a topological space X is called gauge compact if for every gauge δ on X there exist finitely many points x1, x2, ..., xn in X such that for every x ∈ X there is xi ∈ {x1, x2, ..., xn} so that x RM xi, see [5].
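To make the covering condition concrete, the following minimal Python sketch (not part of the paper) checks the relation x RM y for the symmetric gauge δa(x) = {z : d(z, x) < a} on a finite sample of real numbers, and verifies that a given finite set of points covers the sample through RM. The sample, the radius a, and the candidate centers are illustrative assumptions.

```python
# Illustrative sketch (not from the paper): the relation x R_M y induced by the
# symmetric gauge delta_a(x) = {z : d(z, x) < a} on a finite sample of real numbers,
# and a check that finitely many points x_1,...,x_n cover the sample via R_M.
def related(x, y, a):
    """x R_M y iff x lies in delta_a(y) or y lies in delta_a(x); for the
    symmetric gauge delta_a both conditions reduce to |x - y| < a."""
    return abs(x - y) < a or abs(y - x) < a

def covers(centers, points, a):
    """True if every point is R_M-related to some chosen center."""
    return all(any(related(x, c, a) for c in centers) for x in points)

if __name__ == "__main__":
    sample = [k / 10 for k in range(0, 101)]      # points in [0, 10] (assumed sample)
    a = 3.0                                        # gauge radius (assumed value)
    centers = [0.0, 5.0, 10.0]                     # candidate finite set
    print(covers(centers, sample, a))              # True: three points suffice here
```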
4. CONCLUDING REMARK
References
[1] BARTLE, R. G. AND SHERBERT, D. R., Introduction to Real Analysis, 3rd edition, John Wiley and Sons, Inc., New York, 2000.
[2] ENGELKING, R., General Topology, revised and completed edition, Heldermann Verlag, Berlin, 1989.
[3] PREUSS, G., Theory of Topological Structures - An Approach to Categorical Topology, D. Reidel Publishing Company, Dordrecht, 1988.
[4] LEE, P.Y., TANG, W. K., AND ZHAO, D., An equivalent definition of functions of the first Baire class, Proc.
Amer. Math. Soc., 129, 2273-2275, 2001.
[5] ZHAO, D., A New Compactness Type Topological Property, Quaestiones Mathematicae, 28, 1-11, 2005.
Abstract. Let (G, τ) and (Lc(V), τ′) be a topological group and a topological vector space, respectively. A continuous linear representation ρc of (G, τ) into (Lc(V), τ′) is a homomorphism of (G, τ) into (GLc(V), τGL). Therefore Ker(ρc) is a normal subgroup of G. In this paper, we prove that (Ker(ρc), τKr) is a topological normal subgroup of the topological group (G, τ), where τKr is the topology induced by τ. Furthermore, we also prove that the set G̅ = G/Ker(ρc) = {g + Ker(ρc) | g ∈ G} is a topological quotient group and there is
1. INTRODUCTION
2. MAIN RESULTS
Furthermore, we will find the connection between G̅ and the continuous linear representation ρc.
Let G be a topological group, G̅ be a topological quotient group and GLc(V) be a topological subspace of Lc(V). Suppose we have the diagram
G → GLc(V),  G → G̅,
where ρc : G → GLc(V) and π : G → G̅ are continuous homomorphisms. Then there exists a map ρ̄c : G̅ → GLc(V) such that ρ̄c ∘ π = ρc. This is stated in the following proposition.
Corollary. For any x, y ∈ G, if ρc(x) = ρc(y) then there exists an isomorphism from G̅ into Im(ρc) such that the diagram
G → Im(ρc),  G → G̅,  G̅ → Im(ρc)
is commutative.
Proof. If ρc(x) = ρc(y) then (ρ̄c ∘ π)(x) = (ρ̄c ∘ π)(y), that is ρ̄c(π(x)) = ρ̄c(π(y)). So ρ̄c(x̄) = ρ̄c(ȳ), that is Tx̄ = Tȳ, for every x, y ∈ G or x̄, ȳ ∈ G̅. ■
3. CONCLUSION
Let ρc be a continuous linear representation from a topological group (G, τ) into a topological vector space (V, τ′); then
1. the set G̅ = G/Ker(ρc) is a topological quotient group;
2. there is a continuous linear representation from G̅ into (V, τ′), that is, a homomorphism ρ̄c : G̅ → GLc(V) such that ρ̄c ∘ π = ρc;
3. for any x, y ∈ G, if ρc(x) = ρc(y) then G̅ is isomorphic to Im(ρc) and Im(ρc) is isomorphic to G̅.
REFERENCES
SUPERPOSITION OPERATOR
Abstract. Let L be a Banach lattice and 𝜙 be a weight function that satisfies the Δ2-condition; we define the space of L-valued sequences ℓ𝜙(L). In this paper we discuss necessary and sufficient conditions for a superposition operator to map the space ℓ𝜙(L) into the space ℓ1.
Keywords and Phrases: Banach lattice, weight function, Δ2-condition, superposition operator.
1. INTRODUCTION
Let ℝ be the set of all real nunbers, L be a Banach lattice and be the space of all L-
valued sequences. We denote the k-th term of sequence by . Any vektor
subspace of is called L-valued sequence space.
A L-valued sequence space X is called BK-Space if it is a Banach space and the
canonical function defined by continuous for all k ∈ℕ. For
sequence and N ∈ ℕ, finite sequences defined by and
zero otherwise. A Banach space X is said to have the AK if X contains all finite sequences
and every sequence , the limit as N →∞, hold.
We write for the real sequence space of all sequences assosiated with absolutely
convergent series. It is known that is a BK space with norm defined by
A characterization of superposition operators on Orlicz spaces was given by Robert and Šragin. A complete investigation of superposition operators on the sequence spaces of all bounded and null sequences and of p-absolutely convergent series (0 < p < ∞), respectively, was given by Dedagich and Zabreĭko. The acting condition for superposition operators into ℓ1 was proved by Chew Tuan Seng under the assumption that the functions g(k, ·) are continuous. The results of Šragin contain characterizations of superposition operators on spaces generated by a sequence of 𝜙-functions. The main aim of the present paper is to introduce the L-valued sequence space ℓ𝜙(L), where 𝜙 is a weight function and L is a Banach lattice. For a function g such that
(1) g(k, 0) = 0 for every k ∈ ℕ, and
(2) g(k, ·) is continuous on L for every k ∈ ℕ,
necessary and sufficient conditions are given for the superposition operator to map ℓ𝜙(L) into ℓ1.
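As a concrete and purely illustrative picture of the object being studied, the following Python sketch applies a superposition operator Pg : (xk) ↦ (g(k, xk)) to a truncated real-valued sequence. The generating function g(k, u) = u/k² and the choice L = ℝ are hypothetical assumptions satisfying conditions (1) and (2); this is not the paper's construction.

```python
# Minimal sketch (illustrative, not the paper's construction): a superposition
# operator P_g acting on a truncated L-valued sequence, here with L = R.
# g(k, u) = u / k**2 is a hypothetical choice with g(k, 0) = 0 and g(k, .) continuous.
import math

def g(k, u):
    return u / k**2                               # hypothetical generating function

def superposition(x):
    """Apply P_g: (x_k) -> (g(k, x_k)) to a finite (truncated) sequence."""
    return [g(k, xk) for k, xk in enumerate(x, start=1)]

if __name__ == "__main__":
    x = [math.sin(k) for k in range(1, 51)]       # a bounded truncated sequence
    y = superposition(x)
    print(sum(abs(v) for v in y))                 # finite partial l1-sum of the image
```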
Lemma 2.1. Let the L-valued sequence space X be a BK-space with the AK property. If x ∈ X and N ∈ ℕ, then for any number ε > 0 there exists a number δ > 0 so that for every y with … < δ, we have … for k ∈ ℕ.
A sufficient condition for a functional F : X → ℝ to be continuous on the normed space X is presented in the following theorem.
Theorem 2.2. Let the L-valued sequence space X be a BK-space with the AK property. If the function g satisfies the condition g(k, 0) = 0 and g(k, ·) is continuous on L for each k ≥ 1, then for N ∈ ℕ the functional defined by … is continuous on X.
Proof. Suppose not continuous on X, then there are and real numbers such
that for every δ > 0 there exist so that if then valid
for every N ∈ ℕ
If there exist such that if then
.
The result can be choosed with such that
Since is continuous on L for each k ≥ 1, then there is a real number δ > 0 so that for
every with is applicable:
(1)
So
and so on.
So forth so that the obtained sequence and sequence with and
increasing sequence natural number such that for and there is so
that for every with is obtained:
(3)
Defined for each k ∈ ℕ. So there is a sequence . Let
then for natural numbers n ≥ m applies:
So .
This means that Cauchy sequence in X. Because X complete, then there is so that
. Or . Since x ,z ∈ X then
For a Banach lattice L and the set of real numbers ℝ, a function 𝜙 : L ⟶ ℝ is a weight function if it is non-decreasing on L+, continuous, even, and satisfies the doubling condition (Δ2-condition), that is, there exists a real number M > 0 such that 𝜙(2u) ≤ M 𝜙(u) for every u ∈ L+.
For a weight function 𝜙 that satisfies the Δ2-condition, the function … is defined by … . It follows that the sequence … is monotone non-decreasing on ℝ. The space ℓ𝜙(L) is defined by
3. MAIN RESULT
The main result of this research is Theorem 3.3. The following lemmas are required to prove it.
Lemma 3.1. If given any ∈ then for every real number β > 0 there is a real number
α > 0 so that if 𝜙 for every k, then
Proof :Take any real number β > 0, then there is so that . Since 𝜙 satisfy the
for every k. ∎
Lemma 3.2. Given any function that satisfy the condition g(k, 0) = 0 and
is continuous on L for each k ≥ 1. If there is real number α, β > 0 so that
for each with result
then for every real number ɛ > 0 there is with for each with
and for
Proof :Since is continous on L, hence if then for a δ > 0.
Also since 𝜙 is continous on L, then for δ > 0 is true . So for each k ∈ℕ,
Therefore, for each k, the set
is upper bound.. Furthermore, define
.
can be seen that for each . Because the function 𝜙 is continous on L then the set
is closed and finite. It means .
Next take any real number ε > 0 with 0 < ε ≤ 1, then there is so that
So that
for i = 1, 2,... , l–1 with and
As a result
for i = 1, 2,... , l.
and obtained
≤ lβ – (l – 1) + ε = β + ε, for each m,
As a result β + ε.
So for for each k, with . ∎
Theorem 3.3. Let g satisfy (1) and (2). The superposition operator acts from ℓ𝜙(L) to ℓ1 if and only if the following condition is satisfied: there exist real numbers 𝛼, 𝛽 > 0 and … with … for each … so that … whenever …
Proof :Sufficient condition, take any then . So there is
exist N ∈ ℕ such that .
As a result
for each k
Using the hypothesis, for each k valid
So for every k
Because
and
then
It means
The necessary condition. Let is a superposition operator, functional
given by
Then by lemma 2.1 , the functional continous on that also it is continous at 0.
This means for each number ε > there is a number η > 0 so that
If then
From lemma 3.1, there is exist a number 𝛼 > 0 so that if
then .
References
[1] APPEL,JURGEN AND PETER P. ZABREJKO, Nonlinear Superposition Operators,Cambridge University Press, 1990.
[2] CHEW TUAN SENG, Superposition Operator On ω0 and W0, comment. Math (2)29,149-153, 1990.
[3] DEDAGICH, F AND ZABREἱKO, P.P., On Superposition Operators in 𝓵p spaces , Sibirsk.Math. Zh. 28, no 1, 86-98
(Russian), English tranlation, Serbian Math. J.28, no.1,63-73,1987.
[4] KAMTHAN, P.K. AND GUPTA, M., Sequences Space and Series, Marcel Dekker, Inc , 1981.
[5] MEYER, P. AND NIERBERG, Banach Lattice, Springer-Verlag, 1991.
[6] PAREDES, L.I., Boundedness of Superposition Operators on w0, SEA Bull. Math.,Vol.15, number 2, 145-151,
1991.
[7] PAREDES, L.I., Orthogonally Additive Functional and Superposition Operators on w0(ф) Ph.D dissertation,
University of The Philippines, 1993.
[8] PETRANUARAT, S AND YUPAPRON KEMPRASIT, Superposition Operators on ℓp and c0 into ℓq (1 p, q < ), SEA
Bull. Math. Vol 21, 139-147, 1997.
[9] RAO, M.M. AND REN, Z.D., 2002 : Applications of Orlicz Spaces, Marcell Dekker, Inc, N. Y
[10] ROBERT,I.J, Continuité d’un opérateur nonlinéar sur certains espacesde suites, C.R. Acad. Sci. Paris, Ser. A 259
(1964), 1287-1290.
[11]ŠHRAGIN, I.V., 1976, Condition for imbedding of classes and their consequences (Russian), Math. Zametki (5)
20, 681-692 ; English translation : Math.Note. 20, 942-948.
[12] SRI DARU, U., 1998 : Operator superposisi Terbatas Pada Beberapa Ruang Barisan, Disertasi Doktor,
Universitas Gadjah Mada
ELVINA HERAWATY
Department of Mathematic, FMIPA USU, Medan
e-mail : [email protected]
SUPAMA
Department of Mathematic, FMIPA UGM, Yogyakarta, 55281
e-mail : [email protected]
Abstract. Water flow in unsaturated soils that is induced by infiltration and plant
root water uptake processes is governed by Richards’ equation. To study the governing
equation more conveniently, it is transformed into a form of Helmholtz equation using
Kirchhoff transformation with dimensionless variables. It may be difficult or even impos-
sible to obtain the analytical solutions of boundary value problems involving this type
of Helmholtz equation. In this study, we employ the dual reciprocity boundary element
method (DRBEM) to solve these problems numerically. The proposed method is tested
through an example involving infiltration from periodic flat channels with root water
uptake.
1. INTRODUCTION
The study of infiltration through soils has been considered by numerous re-
searchers. Philips [10], Batu [3], Azis et al [2], Lobo et al [9], and Clements et al
[4] are some of such researchers.
In this paper, we investigate solutions to time independent water flow problems
in unsaturated soils involving processes of infiltration and absorption by plant roots.
The process of water absorption by the plant roots is also known as root water uptake
process. A set of transformations is employed to obtain the governing equation in
the form of a Helmholtz equation. An integral formulation is used to construct a
numerical scheme based on the dual reciprocity boundary element method or DRBEM,
for obtaining numerical solutions of the Helmholtz equation. An example of infiltration
from periodic flat channels with root water uptake is considered to test the scheme.
The solutions obtained are compared with solutions of problems involving infiltration
from the same flat channels without root water uptake process.
[Figure: problem geometry, showing the flat channel (2L), the region 2D, the root zone (Xm, Zm), and the X and Z axes.]
The actual root uptake, S, depends on the potential cumulative root water uptake
and soil water pressure head [5], and is modelled as
S(X, Z) = γ(ψ)Sm (X, Z), (5)
where γ is the dimensionless water stress response function, which takes values from
zero to one, and ψ is the suction potential (L). In this study we assume that the root
water uptake is under the condition of no stress, γ = 1.
where
\[
s(x,z) = \frac{2\pi\, l_t\, \beta^{*}(x,z)}{\alpha L \int_a^b \int_0^{z_m} \beta^{*}(x,z)\, dz\, dx}\,\frac{T_{pot}}{v_0},
\tag{21}
\]
\[
\beta^{*}(x,z) = \left(1-\frac{x}{x_m}\right)\left(1-\frac{z}{z_m}\right) e^{-\left(\frac{p_Z}{Z_m}\left|Z^{*}-2z/\alpha\right| + \frac{p_X}{X_m}\left|X^{*}-2x/\alpha\right|\right)}, \qquad 0 \le x \le b,\; z \ge 0,
\tag{22}
\]
and
\[
l_t = \frac{\alpha}{2} L_t, \quad x_m = \frac{\alpha}{2} X_m, \quad z_m = \frac{\alpha}{2} Z_m, \quad a = \frac{\alpha}{2} L, \quad b = \frac{\alpha}{2}(L+D).
\tag{23}
\]
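For readers who wish to experiment with the root water uptake term, the following Python sketch evaluates β*(x, z) of (22) and the sink term s(x, z) of (21) with a crude quadrature. All parameter values in the sketch are assumptions made only for illustration; they are not the values of the paper's example.

```python
# Sketch of evaluating the dimensionless root water uptake term of (21)-(23),
# using numpy and a simple average-value quadrature. Parameter values below
# (L, D, Xm, Zm, Lt, pX, pZ, X*, Z*, alpha, Tpot, v0) are assumed for illustration.
import numpy as np

alpha, L, D = 0.01, 50.0, 50.0              # cm^-1, cm, cm (assumed)
Xm, Zm, Lt = 200.0, 200.0, 100.0            # root-zone extent and channel length (assumed)
pX, pZ, Xstar, Zstar = 1.0, 1.0, 0.0, 0.0   # uptake-shape parameters (assumed)
Tpot, v0 = 0.4, 1.0                         # potential transpiration / reference flux (assumed)

a, b = alpha / 2 * L, alpha / 2 * (L + D)                     # (23)
xm, zm, lt = alpha / 2 * Xm, alpha / 2 * Zm, alpha / 2 * Lt

def beta_star(x, z):
    """Dimensionless uptake distribution beta*(x, z) of equation (22)."""
    shape = (1 - x / xm) * (1 - z / zm)
    decay = np.exp(-(pZ / Zm * np.abs(Zstar - 2 * z / alpha)
                     + pX / Xm * np.abs(Xstar - 2 * x / alpha)))
    return shape * decay

def s_uptake(x, z, nx=200, nz=200):
    """Dimensionless sink term s(x, z) of equation (21)."""
    xs = np.linspace(a, b, nx)
    zs = np.linspace(0.0, zm, nz)
    X, Z = np.meshgrid(xs, zs)
    integral = beta_star(X, Z).mean() * (b - a) * zm          # average-value estimate
    return 2 * np.pi * lt * beta_star(x, z) / (alpha * L * integral) * Tpot / v0

print(s_uptake(0.3, 0.2))
```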
Boundary conditions (13) to (17) can be written as
\[
\frac{\partial\phi}{\partial n} = \frac{2\pi}{\alpha L} - \phi, \qquad 0 \le x \le \frac{\alpha}{2}L \ \text{ and } z = 0,
\tag{24}
\]
\[
\frac{\partial\phi}{\partial n} = -\phi, \qquad \frac{\alpha}{2}L < x \le \frac{\alpha}{2}(L+D) \ \text{ and } z = 0,
\tag{25}
\]
\[
\frac{\partial\phi}{\partial n} = 0, \qquad x = 0 \ \text{ and } z \ge 0,
\tag{26}
\]
\[
\frac{\partial\phi}{\partial n} = 0, \qquad x = \frac{\alpha}{2}(L+D) \ \text{ and } z \ge 0,
\tag{27}
\]
(28)
and
\[
\frac{\partial\phi}{\partial n} = -\phi, \qquad 0 \le x \le \frac{\alpha}{2}(L+D) \ \text{ and } z = \infty.
\tag{29}
\]
Here, ∂φ/∂n = (∂φ/∂x)nx + (∂φ/∂z)nz is the normal derivative of φ.
4. METHOD OF SOLUTION
According to Ang [1], an integral equation to solve equation (20) is
\[
\lambda(\xi,\eta)\,\phi(\xi,\eta) = \iint_R \varphi(x,z;\xi,\eta)\,\left[\phi(x,z) + s(x,z)e^{-z}\right] dx\, dz
+ \int_C \left[ \phi(x,z)\,\frac{\partial}{\partial n}\varphi(x,z;\xi,\eta) - \varphi(x,z;\xi,\eta)\,\frac{\partial}{\partial n}\phi(x,z) \right] ds(x,z),
\tag{30}
\]
where
\[
\lambda(\xi,\eta) = \begin{cases} \tfrac{1}{2}, & (\xi,\eta) \text{ lies on a smooth part of } C, \\ 1, & (\xi,\eta) \in R, \end{cases}
\tag{31}
\]
and
\[
\varphi(x,z;\xi,\eta) = \frac{1}{4\pi}\ln\left[(x-\xi)^2 + (z-\eta)^2\right].
\tag{32}
\]
[Figure: discretization of the boundary into line segments C(1), ..., C(N) with midpoints (a(1), b(1)), ..., (a(N), b(N)) and chosen interior points (a(N+1), b(N+1)), ..., (a(N+M), b(N+M)); on each segment φ and ∂φ/∂n are approximated by constants φ(i) and p(i).]
Let C(1), C(2), C(3), ..., C(N) be the line segments on the boundary, let the points (a(1), b(1)), (a(2), b(2)), ..., (a(N), b(N)) be the midpoints of the segments, and let the points (a(N+1), b(N+1)), (a(N+2), b(N+2)), (a(N+3), b(N+3)), ..., (a(N+M), b(N+M)) be the chosen interior points.
The value of φ(x, z) + s(x, z)e^{−z} in (30) may be approximated by
\[
\phi(x,z) + s(x,z)e^{-z} \simeq \sum_{i=1}^{N+M} \delta^{(i)} \rho(x,z; a^{(i)}, b^{(i)}),
\tag{33}
\]
so that
\[
\iint_R \varphi(x,z;\xi,\eta)\,\left[\phi(x,z) + s(x,z)e^{-z}\right] dx\, dz \simeq \sum_{i=1}^{N+M} \delta^{(i)}\, \Upsilon(\xi,\eta; a^{(i)}, b^{(i)}),
\tag{35}
\]
where
\[
\Upsilon(\xi,\eta; a^{(i)}, b^{(i)}) = \lambda(\xi,\eta)\,\chi(\xi,\eta; a^{(i)}, b^{(i)})
+ \int_C \left[ \varphi(x,z;\xi,\eta)\,\frac{\partial}{\partial n}\chi(x,z; a^{(i)}, b^{(i)}) - \chi(x,z; a^{(i)}, b^{(i)})\,\frac{\partial}{\partial n}\varphi(x,z;\xi,\eta) \right] ds(x,z),
\tag{36}
\]
and
\[
\chi(x,z; a^{(i)}, b^{(i)}) = \frac{1}{4}\left[(x-a^{(i)})^2 + (z-b^{(i)})^2\right] + \frac{1}{16}\left[(x-a^{(i)})^2 + (z-b^{(i)})^2\right]^2 + \frac{1}{25}\left[(x-a^{(i)})^2 + (z-b^{(i)})^2\right]^{5/2}.
\tag{37}
\]
The boundary integral in (36) is approximated by
\[
\sum_{j=1}^{N} \left[\frac{\partial}{\partial n}\chi(x,z; a^{(i)}, b^{(i)})\right]_{(x,z)=(a^{(j)}, b^{(j)})} \int_{C^{(j)}} \varphi(x,z;\xi,\eta)\, ds(x,z)
+ \sum_{j=1}^{N} \chi(a^{(j)}, b^{(j)}; a^{(i)}, b^{(i)}) \int_{C^{(j)}} \frac{\partial}{\partial n}\varphi(x,z;\xi,\eta)\, ds(x,z).
\tag{38}
\]
Collocating (33) at all boundary midpoints and interior points gives
\[
\phi(a^{(k)}, b^{(k)}) + s(a^{(k)}, b^{(k)})\,e^{-b^{(k)}} = \sum_{i=1}^{N+M} \delta^{(i)} \rho(a^{(k)}, b^{(k)}; a^{(i)}, b^{(i)}), \qquad k = 1, 2, \ldots, N+M,
\tag{39}
\]
so that
\[
\delta^{(i)} = \sum_{k=1}^{N+M} \omega^{(ik)} \left[\phi(a^{(k)}, b^{(k)}) + s(a^{(k)}, b^{(k)})\,e^{-b^{(k)}}\right], \qquad i = 1, 2, \ldots, N+M,
\tag{40}
\]
where the ω^{(ik)} are the entries of the inverse of the (N+M) × (N+M) matrix with entries ρ(a^{(k)}, b^{(k)}; a^{(i)}, b^{(i)}).
Now, (ξ, η) in equation (30) is taken to be (a^{(n)}, b^{(n)}). The value of φ can then be approximated by
\[
\lambda(a^{(n)}, b^{(n)})\,\phi^{(n)} = \sum_{k=1}^{N+M} \nu^{(nk)} \left[\phi^{(k)} + s(a^{(k)}, b^{(k)})\,e^{-b^{(k)}}\right]
+ \sum_{m=1}^{N} \left[\phi^{(m)} F_1^{(m)}(a^{(n)}, b^{(n)}) - p^{(m)} F_2^{(m)}(a^{(n)}, b^{(n)})\right], \qquad n = 1, 2, \ldots, N+M,
\tag{42}
\]
where
\[
\nu^{(nk)} = \sum_{i=1}^{N+M} \Upsilon(a^{(n)}, b^{(n)}; a^{(i)}, b^{(i)})\, \omega^{(ik)},
\tag{43}
\]
\[
F_1^{(m)}(a^{(n)}, b^{(n)}) = \int_{C^{(m)}} \varphi(x,z; a^{(n)}, b^{(n)})\, ds(x,z),
\tag{44}
\]
and
\[
F_2^{(m)}(a^{(n)}, b^{(n)}) = \int_{C^{(m)}} \frac{\partial}{\partial n}\left(\varphi(x,z; a^{(n)}, b^{(n)})\right) ds(x,z).
\tag{45}
\]
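The interpolation step behind (33), (39) and (40) can be sketched numerically as follows. The radial function ρ(r) = 1 + r used in the sketch is a common dual-reciprocity choice taken only for illustration; the paper pairs its own ρ with the particular solution χ of (37). The collocation points and data values are hypothetical.

```python
# Sketch of the dual reciprocity interpolation step of equations (33), (39), (40):
# data at the N+M collocation points are expanded in radial functions rho centered
# at those points, and the coefficients delta are obtained by inverting the
# collocation matrix (its inverse has entries omega^(ik)).
import numpy as np

def rho(p, q):
    """Radial interpolation function (illustrative choice rho(r) = 1 + r)."""
    return 1.0 + np.linalg.norm(np.asarray(p) - np.asarray(q))

def drm_coefficients(points, values):
    """Solve values_k = sum_i delta_i * rho(points_k, points_i) for delta."""
    n = len(points)
    R = np.array([[rho(points[k], points[i]) for i in range(n)] for k in range(n)])
    return np.linalg.solve(R, np.asarray(values))   # delta = omega @ values

if __name__ == "__main__":
    # a few boundary midpoints and interior points of a toy domain (assumed)
    pts = [(0.0, 0.0), (0.5, 0.0), (1.0, 0.0), (0.5, 0.5), (0.25, 0.75)]
    vals = [np.exp(-z) for (_, z) in pts]            # stand-in for phi + s*exp(-z)
    delta = drm_coefficients(pts, vals)
    # reconstruct the data at the collocation points to check the interpolation
    recon = [sum(d * rho(p, q) for d, q in zip(delta, pts)) for p in pts]
    print(np.allclose(recon, vals))                  # True
```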
[Figure 3. Geometry of the problem. Figure 4. Transformed geometry of the problem. (Labels: 100 cm dimensions, α = 0.01 cm⁻¹, axes X, Z and x, z.)]
In the present discussion, the value of potential transpiration rate, Tpot , is set to
0.4 cm/day, which is the same value assumed by Li et al in their experiments [8]. The
value of α is taken to be 0.01 cm−1 , a typical value for homogeneous soil. The domain
in Figure 4 lies between z = 0 and z = 4.
Numerical values of Φ on lines x = l for various values of l are presented in Figures
5, 6, 7, 8, and 9. In order to make comparisons, graphs of Φ for the condition of no
root water uptake process are also given in the same figures accordingly.
[Figures 5 and 6: Φ versus z (0 ≤ z ≤ 1.2), with and without root water uptake.]
Values of Φ for problems involving infiltration with and without root water uptake
process along x = 0.05 and x = 0.15 are respectively shown in Figures 5 and 6. These
two lines are under a channel. Since the lines are under the channel, the maximum
values of Φ for both problems are achieved at z = 0. As z increases the values of Φ
decrease and eventually the values of Φ converge to some constants between 3 and 3.2.
It can be seen that there are decreases in Φ when there is root water extraction.
The drop in Φ increases as z increases from 0 to 0.5, and seems to remain the same for
z ≥ 0.5. In other words, the drop seems to increase except in the region deeper than the depth of the root zone.
[Figures 7 and 8: Φ versus z (0 ≤ z ≤ 1.2), with and without root water uptake.]
Figures 7 and 8 show the values of Φ for problems involving infiltration with and
without root water extraction along x = 0.35 and x = 0.25 respectively. These two lines
are not under the channels. Since there is no flux at the surface of the soil, the minimum
values of Φ for both problems are at z = 0. The values of Φ increase as z increases and
converge to some constants. There are drops in the values of the dimensionless MFP of
infiltration with root water uptake process compared with those of infiltration without
root water uptake extraction. As before, the drop increases in the region shallower than
the depth of the root zone, and seems to be constant in the region deeper than the root
zone.
[Figure 9: Φ versus z (0 ≤ z ≤ 1.2), with and without root water uptake.]
From Figure 9, it can be seen that the values of Φ for the two problems, with
and without root water uptake process, are constants except at z = 0. This is probably
because of the singularity of point (0.25,0).
6. CONCLUDING REMARKS
A study of a model with new factor incorporated, root water uptake process,
has been made. A numerical method, the DRBEM, is employed to obtain numerical
solutions of the dimensionless MFP. Numerical solutions of the dimensionless MFP
obtained from this problem are compared with those obtained from a problem involving
infiltration from the same periodic flat channels without root water uptake process.
The results are presented in graphs for some lines along the z axis. They illustrate
the effect of root water extraction on the dimensionless MFP. The root water extraction
process reduces the dimensionless MFP. This implies declines in water contents in the
soil. The results shown are reasonable as roots absorb water from soil, and therefore the
model is accepted. However, it is not a trivial task if we want to verify quantitatively.
References
[1] Ang, W. T., A Beginner’s Course in Boundary Element Method, Universal Publishers, Boca Raton, Florida, 2007.
[2] Azis, M. I., Clements, D. L. and Lobo, M., A Boundary Element Method for Steady Infiltration
from Periodic Channels, ANZIAM J., 44(E), C61 - C78, 2003.
[3] Batu, V., Steady Infiltration from Single and Periodic Strip Sources, Soil Sci. Soc. Am. J. 42,
544 - 549, 1978.
[4] Clements, D. L., Lobo, M. and Widana, N., A Hypersingular Boundary Integral Equation for
a Class of Problems Concerning Infiltration from Periodic Channels, El. J. of Bound. Elem., 5, 1
- 16, 2007.
[5] Feddes, R. A., Kowalik, P. J., Zaradny, H., Simulation of Field Water Use and Crop Yield,
John Wiley & Sons, New York, 1978.
[6] Gardner, W. R., Some Steady State Solutions of the Unsaturated Moisture Flow Equation with
Application to Evaporation from a Water Table, Soil Sci., 85, 228 - 232, 1958.
[7] Hoffman, G. J. and van Genuchten, M. T., Soil Properties and Efficient Water Use, Water
Management for Salinity Control, 73 - 85. In H. M. Taylor et al. (eds) Limitation to Efficient
Water Use in Crop Production. Am. Soc. Agron., Madison, WI, 1983.
[8] Li, K. Y., De Jong, R. and Boisvert, J. B., An Exponential Root-Water-Uptake Model with Water Stress Compensation, J. Hydrol., 252, 189 - 204, 2001.
[9] Lobo, M., Clements, D. L. and Widana, N., Infiltration from Irrigation Channels in a Soil with Impermeable Inclusion, ANZIAM J., 46(E), C1055 - C1068, 2005.
[10] Philips, J. R., Flow in Porous Media, Annu. Rev. Fluid Mechanics 2, 177 - 204, 1970.
[11] Prasad, R., A Linear Root Water Uptake Model, J. Hydrol. 99, 297 - 306, 1988.
[12] Raats, P. A. C, Steady Flows of Water and Salt in Uniform Soil Profiles with Plant Roots, Soil
Sci. Soc. Am. Proc. 38, 717 - 722, 1974.
[13] Vrugt, J. A., Hopmans, J. W. and Šimunek, J., Calibration of a Two-Dimensional Root Water
Uptake Model, Soil Sci. Soc. Am. J. 65, 1027 - 1037, 2001.
[14] Vrugt, J. A., van Wijk, M. T., Hopmans, J. W. and Šimunek, J., One-, Two-, and Three-
Dimensional Root Water Uptake Functions for Transient Modeling, Water Resources Res. 37,
2457 - 2470, 2001.
Imam Solekhudin
Mathematics and Mathematics Education, National Institute of Education,
Nanyang Technological University, Singapore.
Permanently at Department of Mathematics, Faculty of Mathematics and Natural Sciences,
Gadjah Mada University, Yogyakarta-Indonesia.
e-mail: [email protected]
Keng-Cheng Ang
Mathematics and Mathematics Education, National Institute of Education,
Nanyang Technological University, Singapore.
e-mail: [email protected]
Proceedings of ”The 6th SEAMS-UGM Conference 2011”
Analysis, pp. 309–316.
1. INTRODUCTION
The determinant gives a mechanism for measuring subsets of multi-dimensional space; for example, a second-order determinant gives a way to compute the area of the parallelogram spanned by two vectors in the plane. Also, an operator acting on function spaces may not only depend on a main variable but also on several other function-variables that are often treated as parameters. Examples of such operators are ubiquitous in harmonic analysis: for example multiplier operators, the Calderon commutators, and the Cauchy integral along Lipschitz curves. These kinds of operators can be treated as multilinear operators.
In this article we study the boundedness of bilinear generalized fractional integral operators of the form
\[
I_\alpha(f,g)(x) = \int_{\mathbb{R}^n} \frac{f(x+y)\, g(x-y)}{|y|^{n-\alpha}}\, dy,
\]
309
where 0 < α < n [2]. There have been many similar investigations of bilinear operators. In 1997, M.T. Lacey solved the long-standing Calderon conjecture on the bilinear Hilbert transform [3]. Calderon studied the bilinear Hilbert transform, which is intimately related to Carleson’s theorem asserting the pointwise convergence of Fourier series [1]. L. Grafakos and N. Kalton studied multilinear fractional integral operators on classical Lebesgue spaces [2].
Recently the study of one variable version of the above operator has been extended
from the classical Lebesgue spaces to Morrey spaces. Let 1 ≤ p < ∞ and 0 ≤ λ ≤ n,
the classical Morrey space Lp,λ = Lp,λ (Rn ) is defined to be the space of all functions
f ∈ Lploc (Rn ) for which
\[
\|f\|_{p,\lambda} = \sup_{x\in\mathbb{R}^n,\; r>0} \left( \frac{1}{r^{\lambda}} \int_{B(x,r)} |f(y)|^p\, dy \right)^{1/p} < \infty,
\]
where B(x, r) is the open ball centered at x ∈ ℝⁿ with radius r > 0 [5]. There are many results about the Morrey space, and also about the boundedness of the classical fractional integral operator for one function. For example, Adams and Chiarenza-Frasca proved that
\[
\|I_\alpha f\|_{q,\lambda} \le C_{p,\lambda}\, \|f\|_{p,\lambda}
\]
for 1 < p < n/α, 0 ≤ λ < n − αp and 1/q = 1/p − α/(n−λ).
In [5] E. Nakai defined the generalized Morrey spaces. For 1 ≤ p < ∞ and a suitable function φ : (0, ∞) → (0, ∞), he defined the (generalized) Morrey space Mp,φ = Mp,φ(ℝⁿ) to be the space of all functions f ∈ Lp_loc(ℝⁿ) for which
\[
\|f\|_{p,\varphi} = \sup_{x\in\mathbb{R}^n,\; r>0} \left( \frac{1}{\varphi(r)}\, \frac{1}{|B(x,r)|} \int_{B(x,r)} |f(y)|^p\, dy \right)^{1/p} < \infty.
\]
Notice that for φ(t) = t^{(λ−n)/p}, 0 ≤ λ ≤ n, we have Mp,φ = Lp,λ, the classical Morrey space. The function φ satisfies the two conditions:
(1) There exists C₁ such that for all r, s with 1/2 ≤ r/s ≤ 2 we have 1/C₁ ≤ φ(r)/φ(s) ≤ C₁;
(2) There exists C₂ such that ∫_r^∞ φ^p(t)/t dt ≤ C₂ φ^p(r) for all 1 < p < ∞.
The condition (1) is known as the doubling condition. For more detail about this, see the works of H. Gunawan and Eridani [4], and the references therein.
In order to prove the boundedness of fractional integral operators, one usually first proves the boundedness of the Hardy-Littlewood maximal operator for one function, defined by the formula
\[
Mf(x) = \sup_{r>0} \frac{1}{|B(x,r)|} \int_{B(x,r)} |f(y)|\, dy.
\]
The boundedness of this operator on the classical Morrey space was proved by Chiarenza-Frasca: for p > 1 and 0 ≤ λ < n the inequality
\[
\|Mf\|_{p,\lambda} \le C_{p,\lambda}\, \|f\|_{p,\lambda}
\]
holds. Then in [5], Nakai proved the extension of this for generalized Morrey space,
that is the inequality
and then we will use it to prove the boundedness of Iα (f, g). In the proof we will use
the results of one variable maximal Hardy-Littlewood operators.
Theorem 2.1. Let p, q be real numbers with p, q > 1 and let s be the harmonic mean of p, q. Then for f ∈ Mp,φ and g ∈ Mq,φ the bimaximal function satisfies M₂(f, g) ∈ Ms,φ² with ‖M₂(f, g)‖_{s,φ²} ≤ C ‖f‖_{p,φ} ‖g‖_{q,φ}.
Proof. In order to prove the inequality, we may assume that f, g > 0. Using Hölder’s inequality we have
\[
\frac{1}{|B(0,r)|} \int_{B(0,r)} f(x-y)\, g(x+y)\, dy
\le \frac{1}{|B(0,r)|} \left( \int_{B(x,r)} f(y)^{p/s}\, dy \right)^{s/p} \left( \int_{B(x,r)} g(y)^{q/s}\, dy \right)^{s/q}
= \left( \frac{1}{|B(0,r)|} \int_{B(x,r)} f(y)^{p/s}\, dy \right)^{s/p} \left( \frac{1}{|B(0,r)|} \int_{B(x,r)} g(y)^{q/s}\, dy \right)^{s/q}.
\]
Using the definition of the maximal function for one component, we obtain the relation between the bilinear maximal function and the classical Hardy-Littlewood maximal function. In this case we get
\[
M_2(f,g)(x) \le \left( M_1 f^{p/s}(x) \right)^{s/p} \left( M_1 g^{q/s}(x) \right)^{s/q}.
\]
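Before continuing with the norm estimate, a small discrete illustration of this pointwise bound may be helpful. The following Python sketch compares both sides on a one-dimensional grid; here s is chosen with 1/s = 1/p + 1/q so that s/p + s/q = 1 and Hölder's inequality applies window by window, and the data are random test vectors, not quantities from the paper.

```python
# Discrete one-dimensional illustration (not from the paper) of the pointwise bound
# M2(f, g)(x) <= (M1 f^{p/s}(x))^{s/p} * (M1 g^{q/s}(x))^{s/q}.
import numpy as np

def window_average(h, j, r):
    lo, hi = max(0, j - r), min(len(h), j + r + 1)
    return h[lo:hi].mean()

def maximal(h, j, radii):
    return max(window_average(h, j, r) for r in radii)

def bimaximal(f, g, j, radii):
    # average of f(x - y) g(x + y): pair the reversed window of f with that of g
    best = 0.0
    for r in radii:
        lo, hi = j - r, j + r + 1
        if lo < 0 or hi > len(f):
            continue
        best = max(best, np.mean(f[lo:hi][::-1] * g[lo:hi]))
    return best

rng = np.random.default_rng(0)
f, g = rng.random(201), rng.random(201)
p, q = 3.0, 3.0
s = 1.0 / (1.0 / p + 1.0 / q)                      # so that s/p + s/q = 1
radii, j = range(1, 40), 100
lhs = bimaximal(f, g, j, radii)
rhs = maximal(f ** (p / s), j, radii) ** (s / p) * maximal(g ** (q / s), j, radii) ** (s / q)
print(lhs <= rhs + 1e-12, lhs, rhs)
```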
Consequently,
\[
\frac{1}{\varphi^2(r)}\, \frac{1}{|B(x,r)|} \int_{B(x,r)} M_2(f,g)(x)^{s}\, dx
\le \frac{1}{\varphi^2(r)}\, \frac{1}{|B(x,r)|} \int_{B(x,r)} \left(M_1 f^{p/s}(x)\right)^{s^2/p} \left(M_1 g^{q/s}(x)\right)^{s^2/q} dx
\]
\[
\le \left( \frac{1}{\varphi(r)}\, \frac{1}{|B(x,r)|} \int_{B(x,r)} \left(M_1 f^{p/s}(x)\right)^{(s^2/p)(p/s)} dx \right)^{s/p}
\times \left( \frac{1}{\varphi(r)}\, \frac{1}{|B(x,r)|} \int_{B(x,r)} \left(M_1 g^{q/s}(x)\right)^{(s^2/q)(q/s)} dx \right)^{s/q}
\]
\[
= \left( \frac{1}{\varphi(r)}\, \frac{1}{|B(x,r)|} \int_{B(x,r)} \left(M_1 f^{p/s}(x)\right)^{s} dx \right)^{s/p}
\times \left( \frac{1}{\varphi(r)}\, \frac{1}{|B(x,r)|} \int_{B(x,r)} \left(M_1 g^{q/s}(x)\right)^{s} dx \right)^{s/q}.
\]
With inequality (1) in hand, we can conclude that the inequality (3) holds.
Theorem 3.1. Let p, q be real numbers greater than 1 and let s be their harmonic mean, and let φ be a positive function satisfying the doubling condition on its domain (0, ∞) and also φ(t) ≤ Ct^β for −n/s ≤ β < −α, 1 < s < n/α. Then, for r = 2βs/(α + 2β), the bifractional integral operators satisfy
Proof. Let x ∈ Rn and R > 0 be any real numbers. In the proof, we will write C for
any constant bound. Then for any f ∈ Lp and g ∈ Lq we can write the operator as
\[
I_\alpha(f,g)(x) = \int_{|y|<R} \frac{f(x-y)\, g(x+y)}{|y|^{n-\alpha}}\, dy + \int_{|y|>R} \frac{f(x-y)\, g(x+y)}{|y|^{n-\alpha}}\, dy.
\]
For the first integral I_α^{(1)}(f,g)(x), we can write
\[
I_\alpha^{(1)}(f,g)(x) = \int_{|y|<R} \frac{f(x-y)\, g(x+y)}{|y|^{n-\alpha}}\, dy
= \sum_{k=-\infty}^{-1} \int_{2^k R \le |y| < 2^{k+1}R} \frac{|f(x-y)\, g(x+y)|}{|y|^{n-\alpha}}\, dy
\]
\[
\le \sum_{k=-\infty}^{-1} \left(2^k R\right)^{\alpha-n} \int_{2^k R \le |y| < 2^{k+1}R} |f(x-y)|\, |g(x+y)|\, dy
\le 2^{n} \sum_{k=-\infty}^{-1} \left(2^k R\right)^{\alpha} \frac{1}{\left(2^{k+1}R\right)^{n}} \int_{|y| < 2^{k+1}R} |f(x-y)|\, |g(x+y)|\, dy.
\]
For the second integral I_α^{(2)}(f,g)(x), we write
\[
I_\alpha^{(2)}(f,g)(x) = \int_{|y|>R} \frac{f(x-y)\, g(x+y)}{|y|^{n-\alpha}}\, dy
\le \sum_{k=0}^{\infty} \int_{2^k R \le |y| < 2^{k+1}R} \frac{|f(x-y)|\, |g(x+y)|}{|y|^{n-\alpha}}\, dy
\le \sum_{k=0}^{\infty} \left(2^k R\right)^{\alpha-n} \int_{|y| < 2^{k+1}R} |f(x-y)|\, |g(x+y)|\, dy.
\]
In the last line, we use the property that φ is a doubling function. Finally, because of the order of φ, we have
\[
I_\alpha^{(2)}(f,g)(x) \le C\, \|f\|_{p,\varphi}\, \|g\|_{q,\varphi} \sum_{k=0}^{\infty} \left(2^k R\right)^{\alpha+2\beta} \le C R^{\alpha+2\beta}\, \|f\|_{p,\varphi}\, \|g\|_{q,\varphi}
\]
if α + 2β < 0. If ‖f‖_{p,φ} = 0 or ‖g‖_{q,φ} = 0 then M(f, g)(x) = 0 and there is nothing to prove; otherwise we can set
\[
R = \left( \frac{M(f,g)(x)}{\|f\|_{p,\varphi}\, \|g\|_{q,\varphi}} \right)^{1/(2\beta)}.
\]
Combining both inequalities, we have
References
[1] A. Calderon, Commutators of singular integral operators, Proc. Natl. Acad. Sci. USA 53 (1965),
1092-1099.
[2] L. Grafakos and N Kalton, Some Remarks on Multilinear Maps and Interpolation, Mathema-
tische Annalen 319 (2001), no. 1, 151–180.
[3] M. Lacey and C Thiele, Lp estimates for the bilinear Hilbert transform, Proc. Natl. Acad. Sci.
USA 94 (1997), 33-35
[4] H. Gunawan and Eridani, Fractional Integrals and Generalized Olsen Inequalities, Kyungpook
Mathematical Journal 49, (2009) 31-39
[5] E. Nakai, Hardy-Littlewood maximal operator, singular integral operators, and the Riesz potentials
on generalized Morrey spaces”, Math. Nachr. 166 (1994), 95-103.
[6] I. Sihwaningrum, Operator Integral Fraksional dan Ruang Morrey Tak Homogen yang Diperumum,
Disertasi, Institut Teknologi Bandung, 2010 (in Indonesian).
Janny Lindiarni
FMIPA Institut Teknologi Bandung.
e-mail: [email protected]
Proceedings of ”The 6th SEAMS-UGM Conference 2011”
Applied Mathematics, pp. 317–322.
Agah D. Garnadi
Abstract. Iterative regularization methods for nonlinear ill-posed equations of the form
F (a) = y, where F : D(F ) ⊂ X → Y is an operator between Hilbert spaces X and
Y , usually involve calculation of the Fréchet derivatives of F at each iterate and at the
unknown solution a]. A modified form of the generalized Gauss-Newton method which requires the Fréchet derivative of F only at an initial approximation a0 of the solution a] was studied by Mahale and Nair [11]. This work studies an a posteriori stopping rule of Lepskij type for this method. A numerical experiment from an inverse source potential problem is presented.
Keywords and Phrases: Nonlinear ill-posed problem, a posteriori stopping rule, regular-
ized Gauss-Newton.
1. INTRODUCTION
A nonlinear ill-posed problem is usually posed as a nonlinear operator equation F(a) = y, with an operator
F : D(F) ⊂ X → Y
between Hilbert spaces X and Y. We assume that F is one-to-one and Fréchet differ-
entiable on its domain D(F ) and denote the derivative at a point a ∈ D(F ) by F 0 [a].
Since it is ill-posed, F does not have a bounded inverse.
In this work we will focus on the simplified iteratively regularized Gauss-Newton
method (sIRGNM) which is a variant of iteratively regularized Gauss-Newton method
(IRGNM), one of the most attractive iterative regularization methods. For an overview
on iterative regularization methods for non-linear ill-posed problem, we refer to the
monograph by Kaltenbacher, Neubauer, and Scherzer [10] or Bakushinsky, Kokurin,
Kokurin, and Smirnova [1]. At (n+1)−st iteration of the IRGNM, the iterate aδ(n+1) ∈ X
is defined as the unique global minimizer of the quadratic functional a ↦ ‖F′[aδn](a − aδn) + F(aδn) − yδ‖²_Y + αn‖a − a0‖²_X, n ∈ ℕ₀, where a0 ∈ D(F) is some initial guess and αn is a regularization parameter; here we choose αn = α0 qⁿ for some 0 < q < 1. The (n + 1)-st iterate aδ(n+1) can be expressed in the closed form
\[
a^{\delta}_{(n+1)} := a_0 + \left(F'[a^{\delta}_n]^* F'[a^{\delta}_n] + \alpha_n I\right)^{-1} F'[a^{\delta}_n]^* \left(y^{\delta} - F(a^{\delta}_n) + F'[a^{\delta}_n](a^{\delta}_n - a_0)\right).
\tag{1.1}
\]
In a variant of the IRGNM, we approximate F′[aδn] by an equivalent linear operator A, typically by F′[a0]. Hence, instead of the previous formula, at the (n + 1)-st iteration we use
\[
a^{\delta}_{(n+1)} := a_0 + \left(F'[a_0]^* F'[a_0] + \alpha_n I\right)^{-1} F'[a_0]^* \left(y^{\delta} - F(a^{\delta}_n) + F'[a_0](a^{\delta}_n - a_0)\right).
\tag{1.2}
\]
This variant is called the simplified IRGNM; it is widely used in practice but lacks theoretical grounding. Kaltenbacher [9] initiated the study of such methods, and they were studied closely and in detail recently by Mahale & Nair [11], Jin [8] and George [5].
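To fix ideas, the following Python sketch runs the simplified iteration (1.2) on a small artificial nonlinear map with the derivative frozen at the initial guess. The operator, data, noise level, and parameters (α₀, q) are illustrative assumptions only and have nothing to do with the paper's inverse source potential problem.

```python
# Toy sketch of the simplified IRGNM iteration (1.2): F(a) = A a + 0.1 a^3
# (componentwise), with the Frechet derivative frozen at the initial guess a0.
import numpy as np

rng = np.random.default_rng(1)
m = 20
A = rng.standard_normal((m, m)) / np.sqrt(m)

def F(a):
    return A @ a + 0.1 * a**3

def Fprime(a):
    return A + np.diag(0.3 * a**2)            # Jacobian of F at a

a_true = rng.standard_normal(m)
delta = 1e-3
y_delta = F(a_true) + delta * rng.standard_normal(m)

a0 = np.zeros(m)
A0 = Fprime(a0)                               # frozen derivative F'[a0]
alpha0, q = 1.0, 0.5
a = a0.copy()
for n in range(15):
    alpha = alpha0 * q**n
    rhs = A0.T @ (y_delta - F(a) + A0 @ (a - a0))
    a = a0 + np.linalg.solve(A0.T @ A0 + alpha * np.eye(m), rhs)
    print(n, np.linalg.norm(F(a) - y_delta))  # residual, e.g. for a discrepancy check
```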
An important issue during the iteration is when to terminate, since the error ‖aδn − a]‖ deteriorates as n → ∞ in the presence of noise. One widely used rule is the discrepancy principle, in which the iteration is terminated at the index N(δ, yδ) for which the criterion ‖F(aN) − yδ‖ ≤ τδ is satisfied for the first time, with some parameter τ > 1. In [2, 3] the authors studied a Lepskij-type stopping rule for the IRGNM with deterministic and random noise; their studies showed, both theoretically and numerically, that compared to the discrepancy principle the proposed stopping rule yields results that are at least as good, and in some cases even better.
In this work, we examine this stopping rule for the sIRGNM, filling a gap left behind by the work of [2] on the IRGNM and complementing the extensive survey of Bauer & Lukas [4] on stopping criteria for linear inverse problems.
modification, this is also true for the sIRGNM, by utilizing the properties of F′[a0], which are likely known a priori. The essence of the Lepskij stopping rule is to extract information from the a-priori bound (2.1) to detect the point after which the propagated data error becomes dominant. In the following theorem, as given in [2], this situation is stated precisely; as the proof is quite illustrative and short, we reproduce it here.
Theorem 2.1. [2] Let aδn be the sequence of iterates produced by an iterative regulariza-
tion method for an initial guess a0 from some admissible set and data (δ, y δ ) satisfying
uobs := y δ = F (a] ) + δξ. (2.2)
We assume that
• There exists an a-priori known index Kmax = Kmax (δ) such that aδn is well
defined for 0 ≤ n ≤ Kmax .
• There exists an ’optimal’ stopping index N = N (δ, y δ , a] ) ∈ {0, 1, · · · , Kmax },
and a known increasing function Φ : IN0 → [0, ∞) such that
kaδn − a] k ≤ Φ(n)δ, n = N, · · · , Kmax . (2.3)
Then the error at the Lepskij stopping index n∗ = n∗(δ, yδ) defined by
n∗ := min{ n ∈ {0, · · · , Kmax(δ)} : ‖aδm − aδn‖ ≤ 2Φ(m)δ, ∀m = n + 1, · · · , Kmax },
is bounded by
‖aδn∗ − a]‖ ≤ 3Φ(N)δ.
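A minimal Python sketch of the stopping index of Theorem 2.1 is given below, under the reading that the rule compares later iterates with earlier ones; the iterates, the function Φ, and δ are toy data chosen only to exercise the selection.

```python
# Sketch of a Lepskij-type stopping index: among precomputed iterates a_0,...,a_Kmax,
# pick the smallest n whose successors all stay within 2*Phi(m)*delta of a_n.
import numpy as np

def lepskij_index(iterates, Phi, delta):
    K = len(iterates) - 1
    for n in range(K + 1):
        ok = all(np.linalg.norm(iterates[m] - iterates[n]) <= 2 * Phi(m) * delta
                 for m in range(n + 1, K + 1))
        if ok:
            return n
    return K

if __name__ == "__main__":
    rng = np.random.default_rng(2)
    # toy iterates: first converge toward a point, then drift away (noise propagation)
    target = np.ones(5)
    iterates = [target + 0.5**n * rng.standard_normal(5)
                + 0.01 * n**2 * rng.standard_normal(5) for n in range(12)]
    n_star = lepskij_index(iterates, Phi=lambda m: 1.0 + m, delta=0.05)
    print("Lepskij stopping index:", n_star)
```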
[Figures: L2-error versus noise level (left) and reconstructions (right) for κ = 1.1 (top) and κ = 0.3 (bottom); curves compare the Lepskij stopping rule, the optimal stopping index, and the exact solution.]
References
[1] Bakushinsky,A.B. and Kokurin,M.Y. and Kokurin,M.I.U. and Smirnova,A., Iterative Meth-
ods for Ill-Posed Problems: An Introduction,(De Gruyter, Berlin, 2010)
[2] Bauer,F. and Hohage,T., A Lepskij-type stopping-Rule for regularized Newton-type methods,
Inv. Problems, 21,1975, 2005.
[3] Bauer,F. and Hohage,T., A Lepskij-type Stopping-Rule for Newton-type Methods with Random
Noise, PAMM, 5,15, 2005.
[4] Bauer,F. and Lukas,M.A., Comparing parameter choice methods for regularization of ill-posed
problems, Mathematics and Computers in Simulation,81(9),1795, 2011.
[5] George,S., On convergence of regularized modified Newton’s method for nonlinear ill-posed prob-
lems, J. Inv. and Ill-posed Prob.,18(2), 133, 2010.
[6] Bauer,F. and Hohage,T. and Munk,A. Iteratively Regularized Gauss-Newton Methods for Non-
linear Inverse Problem with Random Noise, SIAM J. Num.An., 47,1827,2009.
[7] Hohage,T., On the numerical solution of a three dimensional inverse medium scattering prob-
lem,Inv.Problem, 17, 1743, 2001.
[8] Jin, Q.N. On a class of frozen regularized Gauss-Newton methods for nonlinear inverse prob-
lems,Math. Comp., 79, 2191, 2010.
[9] Kaltenbacher,B., A posteriori parameter choice strategies for some Newton type methods for
the regularization of nonlinear ill-posed problems, Num. Math., v 79, 501, 1998.
[10] Kaltenbacher,B. and Neubauer,A. and Scherzer,O., Iterative Regularization Methods for
Nonlinear Ill-Posed Problems, (de Gruyter, Berlin, 2008).
[11] Mahale, P. and Nair,M.T., A simplified generalized Gauss-Newton method for nonlinear ill-
posed problems, Math. Comp., 78, 171-184, 2009.
Agah D. Garnadi
Dept. of Mathematics
Institut Pertanian Bogor.
e-mail: [email protected]
Proceedings of ”The 6th SEAMS-UGM Conference 2011”
Applied Mathematics, pp. 323–330.
1. INTRODUCTION
Let i = 1, 2, and s and xi be variables dependent on time t, that denote the
densities of a single renewable prey and the ith exploitative competitor respectively.
These competitors contend by exploiting their consumption of the prey. We study the
population model
\[
\frac{ds}{dt} = s(1-s) - \frac{1}{b_1} F_1 x_1 - \frac{1}{b_2} F_2 x_2, \qquad s(0) = s_0,
\]
\[
\frac{dx_i}{dt} = F_i x_i - d_i x_i, \qquad x_i(0) = x_{i0}, \quad i = 1, 2,
\tag{E$_3$}
\]
where s0 and xi0 are initial densities, ai, bi, di and mi are parameters, and
\[
F_i = \frac{m_i s}{a_i + s + x_i}, \qquad i = 1, 2,
\]
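The behaviour established later in Theorem 3.1 can be previewed numerically. The following Python sketch integrates (E)₃ with scipy for one hypothetical parameter set in which d_i > m_i, so that both competitors die out and the prey density s(t) approaches the carrying capacity 1; the parameter and initial values are assumptions for illustration only.

```python
# Numerical illustration of system (E)_3: with d_i > m_i both competitors die out
# and s(t) should approach the carrying capacity 1 (cf. Theorem 3.1).
import numpy as np
from scipy.integrate import solve_ivp

a = [1.0, 1.5]; b = [1.0, 1.0]; d = [1.5, 1.6]; m = [1.0, 1.2]   # assumed parameters

def rhs(t, u):
    s, x1, x2 = u
    F = [m[i] * s / (a[i] + s + (x1 if i == 0 else x2)) for i in range(2)]
    ds = s * (1 - s) - F[0] * x1 / b[0] - F[1] * x2 / b[1]
    dx1 = (F[0] - d[0]) * x1
    dx2 = (F[1] - d[1]) * x2
    return [ds, dx1, dx2]

sol = solve_ivp(rhs, (0, 200), [0.2, 0.5, 0.5], rtol=1e-8, atol=1e-10)
print(sol.y[:, -1])   # expect approximately [1, 0, 0]
```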
Definition 2.2. [5] Let x(t) be a given solution of (3a) defined for all t > 0. We say
that p is an omega limit point of x(t) if there exists a sequence htn i ↑ ∞ such that
hx(tn )i → p. The set of all omega limit points of x(t) is called the omega limit set of
x(t).
If x(t) is forward bounded, that is, x(t) stays inside a fixed compact subset of D
for sufficiently large t, then as one of the main results in [3], the omega limit set of x(t)
is nonempty. Moreover, the following result has been used in many population models
to show that the solution tends to an equilibrium point [4].
Theorem 2.1. (Markus [3, 4]) Let (3a) → (3b) in D, and x(t) be a forward bounded
solution of (3a), with a nonempty omega limit set Ω. Suppose that P is a locally
asymptotically stable fixed point of (3b). If there is an omega limit point y0 ∈ Ω such
that the solution y(t) of (3b), with y(0) = y0 , has limt→∞ y(t) = P , then limt→∞ x(t) =
P.
3. ANALYSIS
Now, we are given equation (2) where limt→∞ xi (t) = 0 for i = 1, 2. Following
Hsu, Hubbell, and Waltman [2], we investigate on the limiting behavior of s(t), one of
the coordinates of the solution (s(t) , x1 (t) , x2 (t)). To facilitate our analysis, we let
ε > 0 be arbitrarily small. Then (2) implies that for i = 1, 2, there is a τi > 0 large
enough such that
\[
x_i(t) = |x_i(t) - 0| < \frac{1}{2}\min\left\{\frac{b_i}{m_i},\, a_1,\, a_2\right\}\cdot\varepsilon, \qquad \text{for all } t > \tau_i,
\tag{4a}
\]
which yields
\[
x_i(t) < \frac{b_i}{2m_i}\,\varepsilon, \qquad x_i(t) < \frac{a_i b_i}{m_i}\,\varepsilon, \qquad \text{for all } t > \tau = \max(\tau_1, \tau_2).
\tag{4b}
\]
Note that our treatment of xi(t) yields the partial derivative ∂xi(t)/∂s = 0, since holding t fixed makes xi(t) constant. Moreover, we have the following partial derivatives:
\[
\frac{\partial G}{\partial s} = 1 - 2s - \sum_{i=1}^{2} \frac{m_i\,(a_i + x_i(t))\, x_i(t)}{b_i\,(a_i + s + x_i(t))^{2}},
\qquad
\frac{\partial G_\infty}{\partial s} = 1 - 2s.
\]
If we let D = (0, ∞), so that ai + s + xi (t) > 0, for i = 1, 2, for each s ∈ D, and for all
t > 0, then it follows that G(s, t) and G∞ (s) are:
• Continuous in (s, t) for all s ∈ D and for all t > 0; and,
• Continuously differentiable for all s ∈ D.
Furthermore, G(s, t) and G∞ (s) are respectively the vector fields of the following ordi-
nary differential equations:
\[
\frac{ds}{dt} = s(1-s) - \sum_{i=1}^{2} \frac{1}{b_i}\,\frac{m_i\, s\, x_i(t)}{a_i + s + x_i(t)},
\tag{5a}
\]
\[
\frac{ds}{dt} = s(1-s),
\tag{5b}
\]
where (5b) corresponds to the s-subsystem (1). The following result enables us to
compare (5a) with (5b).
Lemma 3.1. If equation (2) holds, then (5a) → (5b) in D = (0, ∞).
Proof. Let E be any compact subset of D. Then for i = 1, 2, for all s ∈ E, and for all t > τ,
\[
x_i(t) > 0, \qquad a_i + s + x_i(t) > 0, \qquad \left|\frac{s}{a_i + s + x_i(t)}\right| = \frac{s}{a_i + s + x_i(t)} < 1,
\tag{6}
\]
and by (4b), where |x_i(t)| < \frac{b_i}{2m_i}\varepsilon,
\[
|G(s,t) - G_\infty(s)| < \sum_{i=1}^{2} \frac{m_i}{b_i}\cdot\frac{b_i}{2m_i}\,\varepsilon = \varepsilon.
\]
Therefore, |G(s, t) − G∞ (s)| < ε for all time t > τ , and for all s ∈ E. That is,
G(s, t) → G∞ (s) locally uniformly in s ∈ D, as t → ∞. Consequently, we have
(5a) → (5b) in D.
3.2. Forward Bounded Solution. Now we return to the solution of (E)3 , which
is (s(t) , x1 (t) , x2 (t)). Our next result relates one of the component functions s(t)
with (5a).
Lemma 3.2. If (s(t) , x1 (t) , x2 (t)) is a solution of (E)3 , then s(t) is a forward bounded
solution of (5a) for all time t ≥ 0.
Proof. From the fundamental properties of an initial value problem, the solution
(s(t) , x1 (t) , x2 (t))
exists for all t ≥ 0, where the component function s(t) is:
(1) Continuously differentiable, that is, s(t) and its derivative s0 (t) are both con-
tinuous;
(2) Positive, so that s(t) ∈ (0, ∞) = D; and,
(3) Defined by the equation
2
X 1 mi s(t) xi (t)
s0 (t) = s(t) [1 − s(t)] − · .
b ai + s(t) + xi (t)
i=1 i
With each other coordinate xi (t) assumed to be an explicit function, it follows that
s0 (t) = G(s(t) , t). Therefore, s(t) is a solution of (5a) for all t ≥ 0.
It can be shown that there is a sufficiently large T > τ such that (4b) holds and s(t) ≤ 1 + ε for all t ≥ T. Assuming further that 4ε ≤ 1, this yields
\[
s'(t) = s(t)\,[1 - s(t)] - \sum_{i=1}^{2} \frac{m_i}{b_i}\,\frac{s(t)\, x_i(t)}{a_i + s(t) + x_i(t)}
> s(t)\,[1 - s(t)] - \sum_{i=1}^{2} \frac{m_i}{a_i b_i}\, s(t)\, x_i(t),
\]
since a_i + s(t) + x_i(t) > a_i,
\[
> s(t)\,[1 - s(t)] - \sum_{i=1}^{2} \frac{1}{4}\, s(t), \qquad \text{since } x_i(t) < \frac{a_i b_i}{m_i}\,\varepsilon < \frac{a_i b_i}{4 m_i},
\]
\[
= \frac{1}{2}\, s(t)\left(1 - \frac{s(t)}{1/2}\right).
\]
Letting s_T = s(T), we have
\[
s'(t) \ge \frac{1}{2}\, s(t)\left(1 - \frac{s(t)}{1/2}\right), \qquad t \ge T,
\]
from which, by the theory of differential inequalities,
\[
\min\left\{s_T, \tfrac{1}{2}\right\} \le \frac{\tfrac{1}{2}\, s_T}{s_T + \left(\tfrac{1}{2} - s_T\right)\exp\left(-\tfrac{1}{2}(t-T)\right)} \le s(t) \le 1 + \varepsilon,
\tag{7}
\]
3.3. Limiting Behavior. We are now ready to prove our main result.
Theorem 3.1. Let (s(t) , x1 (t) , x2 (t)) be the solution of (E)3 . If limt→∞ x1 (t) = 0
and limt→∞ x2 (t) = 0, then limt→∞ s(t) = 1.
Proof. Suppose that limt→∞ x1 (t) = 0 and limt→∞ x2 (t) = 0, that is, equation (2)
holds. Then, (5a) → (5b) in (0, ∞) by Lemma 3.1. Since Lemma 3.2 states that s(t)
is a forward bounded solution of (5a), it follows that s(t) has a nonempty omega limit
set Ω. With s(t) differentiable for all t ≥ 0, one can find a sequence htm i ↑ ∞ and
some y0 such that hs(tm )i → y0 and y0 ∈ Ω. Furthermore, with inequality (7) true for
sufficiently large t, we must have y0 > 0.
Since the limiting system (5b) corresponds to the s-subsystem (1), it has a locally
asymptotically stable equilibrium point K = 1, from which the unique solution y(t)
of (5b), with y(0) = y0 , satisfies limt→∞ y(t) = 1. Therefore, limt→∞ s(t) = 1 by
Theorem 2.1.
4. CONCLUSION
To summarize, we analyzed the limiting behavior of (s(t) , x1 (t) , x2 (t)) and in
particular one of its components s(t), by considering the s-subsystem (1), which is
a logistic growth model. To apply Markus’ theory to (E)3 under the assumption of
equation (2), we constructed a non-autonomous system (5a) that can be compared to
the s-subsystem, such that s(t) is a forward bounded solution of (5a).
Our main result, Theorem 3.1, implies that when both competitors do not survive, the prey saturates. That is, not only does the prey become the sole survivor, but its density also tends to the carrying capacity. The key to this result is the global attractiveness
of the carrying capacity of a logistic growth model, aside from being a locally asymptot-
ically stable equilibrium point. Theorem 3.1 also demonstrates a relationship between
a given system and one of its subsystems, where under (2) the model (E)3 eventually
behaves more like that of (1).
Acknowledgement. The authors would like to thank the Department of Science and
Technology for their financial support of this research, through the Accelerated Science
and Technology Human Resource Development Program (ASTHRDP). The authors are
grateful for the helpful comments of Prof. Jean-Stephane Dhersin, from Université Paris
13, as well as the insights of S. B. Hsu, S. P. Hubbell and P. Waltman on their paper
[2].
References
[1] DeAngelis, D. L., Goldstein, R. A., and O’Neill, R. V., A Model for Trophic Interaction, Ecology 56, 881-892, 1975.
[2] Hsu, S. B., Hubbell, S. P., and Waltman, P., Competing Predators, SIAM Journal on Applied
Mathematics 35, 617-625, 1978.
[3] Markus, L., Asymptotically autonomous differential systems, in: Contributions to the Theory of
Nonlinear Oscillations, Vol. 3, Princeton University Press, 1956.
[4] Thieme, H. R., Asymptotically Autonomous Differential Equations in the Plane, Rocky Mountain
Journal of Mathematics 24, 351-380, 1993.
[5] Wiggins, S., Introduction to Applied Nonlinear Dynamical Systems and Chaos, Texts in Applied
Mathematics, No. 2, Springer-Verlag, New York, 2003.
Lorna S. Almocera
University of the Philippines, Cebu
e-mail: [email protected]
Polly W. Sy
University of the Philippines, Diliman
e-mail: [email protected]
Proceedings of ”The 6th SEAMS-UGM Conference 2011”
Applied Mathematics, pp. 331 - 338.
Abstract. Sequence analysis is one of the methods used to infer function, structural evolution, and features from a sequence, and one such method is Super Pairwise Alignment. This method is applied here to assess the homology of DNA sequences of the H1N1 virus.
Keywords and Phrases: DNA, Homologous, Super Pairwise Alignment
1. INTRODUCTION
DNA sequencing methodology was developed in the late 1970s and has become one of the most widely used techniques in molecular biology. The importance of this technique is underlined by the volume of research funds now being invested in the development of automated sequencers and sequence analysis systems. Sequence analysis in molecular biology includes a very wide range of relevant topics, such as map construction, translation, protein analysis, similarity search, alignment with a similar sequence, and submission and retrieval. The most common task in sequence analysis is alignment with a similar sequence: an optimal alignment is achieved between two similar sequences (DNA or amino acid) and the percentage of similarity is calculated [1].
One of the methods used in sequence alignment is dynamic programming. Although it generates optimal alignments, dynamic programming takes too much time due to its high computational complexity O(N²). A majority of sequence alignment software utilizes dynamic programming, such as the global Needleman-Wunsch and the local Smith-Waterman alignments [3]. Both are classical algorithms in sequence alignment. Based on the results of the research of Shen et al., both methods have disadvantages, one of which is the speed of computation. To solve this problem, Shen et al. proposed a new method, Super Pairwise Alignment, which combines combinatorial and probabilistic analysis [2, 3].
2011 Mathematics Subject Classification: Computer Science for Biology and other natural sciences
The management of a new virus can be analyzed through homology. In the problem of disease identification, viral DNA mutates, which may give rise to new viruses. The H1N1 case is one example of mutation: the H1N1 virus mutates quickly enough that a hemagglutinin mutation from aspartic acid (D) to glycine (G) at position 222 has been reported [4]. This motivates the present work to apply the Super Pairwise Alignment method to the analysis of DNA sequences of the H1N1 virus.
1.1 Super Pairwise Alignment. The mathematical methods used to study mutation and alignment fall mainly into three groups: stochastic analysis, modulus structure, and combinatorial graph theory [2]. At first glance, a DNA sequence structure may seem disorderly and unsystematic, and the nucleotides at each position (or a group of positions) are not fixed; that is to say, a biological sequence is a stochastic sequence. Statistically, we may find that the frequency of observing molecules or segments changes across different datasets of biological sequences. Therefore, we may use stochastic models to describe biological sequences.
1.1.1 Parameter Estimation. The key to solving the uniform alignment of a pair of sequences is knowing how to estimate the parameters in the mutation mode T based on the sequences (A, B). Here T is a collection of statistical parameters, and
T̂ = { (î_k, ℓ̂_k), k = 1, 2, ..., k̂_a }
is a set of statistics determined by (A, B) that estimates the parameter set T. The vital problem of uniform alignment of pairwise sequences is the estimation of the parameters in T. The approach to solving this problem is briefly described below.
1.1.2 Algorithm. Let (A, B) be two fixed sequences. This algorithm is based on the research of Shen et al. [2]. Specifically, we first select the important parameters n, h, τ, ε, ε′. Here, n is selected according to the convergence of the law of large numbers or the central limit theorem; typically, we choose n = 20, 50, 100, 150, etc. The thresholds ε, ε′ are selected based on the error rate of mutation and the error rate of the independent random variables; thus we choose 0 < ε, ε′ < 0.75. The parameters h, τ, as two local modifications, are chosen to be proportional to n, typically between 0 and 0.5 times n. The SPA algorithm is described below (a simplified illustration of the sliding-window step is given after this description):
w( A, B; i, j, n) w '
then let iˆ1 0 . This means the shifting mutations occurs at the beginning of [1, n].
Otherwise, go to step (2)
w( A, B; i, j, n) w
w( A, B; i, j, n) w
b. Estimate 1 based on the estimation iˆ1 of the first mutation position in T. typically,
w( A, B; iˆ1 , iˆ1 , n) , w( A, B; iˆ1 , iˆ1 , n) , 1, 2,3,...
If pair (iˆ1 , iˆ1 ) or pair (iˆ1 , iˆ1 ) satisfies w 0.3 or 0.4, where w is it
corresponding sliding window function, then this is the length of the shifting
mutation, specifically:
- If w( A, B; iˆ1 , iˆ1 , n) , we note that ˆ1 and we insert virtual
symbols into sequence B following the positions iˆ1 , while keeping sequence A
invariant
334 A.Y. ZAKIYYAH, M . I. IRAWAN, M . SHOVITRI
symbols into sequence A B following the positions iˆ1 , while keeping sequence
B invariant.
Through the use of these two steps we may estimate the local mutations mode
T1 i1 , 1 and its corresponding locally uniform alignment C1 , D1 . It is
decomposed as follows
C1 C1,1 , A2,1 , D1 D1,1 , B2,1
Denote the length of vector C1,1 and D1,1 by iˆ1 ˆ 1 . Since there is no shifting
e.
Continuing the above process, we find the sequence iˆk , ˆ k and the corresponding
sequence Ck , Dk for all k 1, 2,3... . The process will terminate at some k0
.
The process will be terminate at some Ck0 C1,k0 , A2, k0 and Dk0 D1, k0 , B2,k0
have shifting mutation occurring in ( A2,k0 , B2,k0 ) .
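The sliding-window step referred to above can be illustrated with the short Python sketch below, which measures w as the fraction of mismatches in a window of length n and scans a small range of shifts. It is a simplified stand-in for the SPA window function; the thresholds and the test strings (resembling the sequences used in the next subsection) are illustrative assumptions.

```python
# Simplified sketch of an SPA-style sliding window: w is the fraction of mismatching
# characters between A[i:i+n] and B[j:j+n]; a shift l is accepted when the window
# comparison for that shift falls below a threshold.
def window_mismatch(A, B, i, j, n):
    pairs = list(zip(A[i:i + n], B[j:j + n]))
    if not pairs:
        return 1.0
    return sum(a != b for a, b in pairs) / len(pairs)

def detect_shift(A, B, n=20, max_shift=5, accept=0.4):
    """Positive shift l means l gap symbols would be inserted at the start of B
    (so A[l:l+n] is compared with B[0:n]); negative l is the symmetric case."""
    def w_of(l):
        return window_mismatch(A, B, max(0, l), max(0, -l), n)
    best = min(range(-max_shift, max_shift + 1), key=w_of)
    return (best, w_of(best)) if w_of(best) < accept else (None, w_of(best))

if __name__ == "__main__":
    core = "TAGACACAGTACTAGAAAAGCGTA"        # hypothetical fragment
    A = "G" + core                           # A is B preceded by one extra base
    B = core
    print(detect_shift(A, B))                # expect a shift of 1 with w = 0.0
```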
1.2. DNA Virus H1N1. In this research, we obtain sequences of H1N1 from the database of the National Center for Biotechnology Information [5]. For preliminary research, we use the ClustalW software to align several strains of H1N1 virus DNA. The output of this software is a phylogenetic tree, which shows the inferred evolutionary relationships among various biological species or other entities based upon similarities and differences in their physical and genetic characteristics. The result of running the ClustalW program to determine the phylogenetic tree is as follows:
gi|284999378|gb|GU576514.1|
gi|284999370|gb|GU576506.1|
gi|283831864|gb|GU451262.1|
gi|283831900|gb|GU451280.1|
gi|284999362|gb|GU576500.1|
gi|284999366|gb|GU576502.1|
gi|284999368|gb|GU576504.1|
gi|284999372|gb|GU576508.1|
gi|284999374|gb|GU576510.1|
gi|284999376|gb|GU576512.1|
This section discusses the alignment of several H1N1 virus DNA sequences using Super Pairwise Alignment (SPA). One of the pairwise alignments is between the DNA sequences GU451262 and GU451280, whose lengths are 923 bp and 909 bp respectively. These DNA sequences were taken from NCBI [5]. For the first step, we take the local similarity window (n); for this problem this study uses the value n0 = 20 and ε′ = 0.6, and obtains the following local similarity result from the first 20 characters of the sequences:
GU451262: G T A G A C A C A G T A C T A G A A A A
GU451280: T A G A C A C A G T A C T A G A A A A G
From List 2.1, there are three pairs of matching characters, at positions 17 to 19. Next, the sliding window value (w) is determined and compared with ε′ = 0.6. If w ≥ ε′, it can be assumed that a shifting mutation occurs at i = j = 0; otherwise there is no shifting in [1, n] and we can continue to estimate î. In this case, for the alignment of GU451262 and GU451280, the similarity is 3 and w = 0.85 ≥ 0.6. In the second step, after the local mutation has been detected, the shift length ℓ is estimated based on î and the criterion on w. The result from running the program is given in the table below:
ℓ:  −5   −4   −3   −2   −1    0    1    2    3    4    5
w:  0.7  0.6  0.7  0.6  0.6  0.85  0.0  0.8  0.6  0.6  0.7
From the table above the suitable value of ℓ can be determined. The suitable value is ℓ = 1, for which w = 0 < 0.4. This study therefore assigns a shifting distance of 1 and inserts one ‘-’ at the first position of the second sequence, so that the sequences GU451262 and GU451280 change accordingly.
In this case, after inserting one gap at the first position, no further alignment is needed, because this step already gives the optimal alignment. The output contains 14 gaps and 99.56% homology. In comparison, alignment using the BLAST software gives 98% homology with the same number of gaps. The two sequences are closely related; according to ClustalW, the GU451262 and GU451280 sequences are located in one subdivision.
In applying the Super Pairwise Alignment algorithm, some parameters need adjustment, such as the value of the local similarity window (n). An inaccurate choice of the local similarity (n) affects the optimality of the sequence alignment. Using the same parameter values as in the previous alignment to align GU576500 and GU451280 does not yield an optimal alignment. These DNA sequences have lengths of 1701 bp and 909 bp, respectively. For the alignment of these two sequences, the local similarity result from the first 20 characters of the sequences is:
GU576500: A T G A A G G C A A T A C T A G T A G T
GU451280: T A G A C A C A G T A C T A G A A A A G
Figure 2.4 The result of local similarity
From Figure 2.4, there are three pairs of matching characters, at positions 3, 4 and 18. The sliding window result is w = 0.85 ≥ 0.6, so it can be concluded that there is a shifting mutation at i = j = 0. Next, this study determines the value of ℓ. The result is the same as for the alignment between GU451262 and GU451280, namely w = 0 < 0.4 at ℓ = 1. This study therefore takes a shifting distance of 1 and inserts one ‘-’ at the first position of the second sequence. With the addition of one gap at the first position, the alignment of the two sequences has 231 identical characters. Compared with the alignment produced by the BLAST software there is a difference: using BLAST, there are 97 gaps at the first position and the final
3. CONCLUDING REMARK
Super Pairwise Alignment yields an optimal alignment. This study found that sequences GU451262 and GU451280 are 99.56% homologous; when both sequences are aligned with the BLAST software, the result is about 85%. This study still faces several problems, for example how to determine the local similarity parameter; the choice of local similarity influences the value of the optimal alignment.
Acknowledgement. I would like to express my gratitude to the Department for Higher Education for the scholarship. It is very useful to me in continuing my study and research at Institut Teknologi Sepuluh Nopember (ITS) Surabaya.
References
[1] GRIFFIN, HUGH G. AND ANNETTE M. GRIFFIN, Computer Analysis of Sequence Data, Humana Press, Totowa, 1994.
[2] PUZELLI, SIMONA. MARCIA FACCHININ, D OMENICO SPAGNOLO, MARIA A.DE MARCO, LAURA C ALZONETTI,
ALESSANDRO ZANETTI, R OBERTO FUMAGGALLI, MARIA L.TANZI, ANTONIO C ASSONE,G IOVANNIE REZZA,
ISABELLA DONATELLI, AND THE SURVEILLANCE GROUP FOR PANDEMIC A (H1N1) 2009., Transmission of
Hemaglutinin D222G Mutant Strain of Pandemic (H1N1) 2009 Virus. Vol t6. No,5 May 2010.
[3] SHEN, SHI-YI AND TUSZYNSKI, J.A., Theory and Mathematical Methods for Bioinformatics, Springer, New York, 2008.
[4] SHEN, SHI Y I., J UN Y ANG, ADAM Y AO, PEI I NG H WANG. Super Pairwise Alignment (SPA) :An Efficient
Approach to Global Alignment For Homologous Sequences. Journal of Computational Biology Volume 9,
Number 3,2002@Mary Ann Liebert inc Pp477-486
[5] Database Sequences DNA H1N1 Virus, National Center for Biotechnology Information (2011)
www.ncbi.nlm.nih.gov
M.ISA IRAWAN
Supervisor, Lecturer of Mathematics Department at Institut TeknologiSepuluh Nopember
(ITS)
e-mail: [email protected]
MAYA SHOVITRI
co.Supervisor, Lecturer of Biology Department at Institut Teknologi Sepuluh Nopember
(ITS)
e-mail: [email protected]
Proceedings of The 6th SEAMS-GMU Conference 2011
Applied Mathematics, pp. 339–346.
Abstract. This paper studies an optimization problem, i.e., the optimal tracking error control problem, for an inverted pendulum model with an oblique track. We characterize the minimum tracking error in terms of the pendulum’s parameters. In particular, we derive a closed-form expression for the pendulum length which gives the minimum error. It is shown that the minimum error can always be accomplished as long as the ratio between the mass of the pendulum and that of the cart satisfies a certain constancy, regardless of the type of material used for the pendulum.
Keywords and Phrases: inverted pendulum, tracking error, optimal pendulum length.
1. INTRODUCTION
Direct pendulum as well as inverted pendulum models are important devices in supporting education and research activities in the field of control systems, as they have distinctive characteristics: they are nonlinear and unstable systems that can be linearized around fixed points, their complexity can be modified, and they can easily be applied to actual systems. In the field of engineering, direct and inverted pendulums are utilized to monitor displacements of the foundations of structures such as dams, bridges, and piers. Cranes work based on pendulum principles. In geology, inverted pendulum systems aid us in detecting seismic noise due to macro-seismic, oceanic, and atmospheric activities [11]. In physiology we may employ the pendulum laws to study human balancing [8, 9, 10]. There are several theoretical studies of pendulum systems. An analytical treatment of the stability problem in the context of delayed feedback control of the inverted pendulum can be found in [1], while a discussion of the limitations of controlling an inverted pendulum system in terms of the Poisson integral formula and the complementary sensitivity integral is presented in [13]. Further, an H2 control performance limits
If the stability control problem is carried out under the constraint of minimizing the
tracking error, then it is referred to as the optimal tracking error control problem. Formulated
in the time domain, the problem is to achieve the minimal tracking error E∗, where

E∗ := inf_{K∈K} ∫_0^∞ |e(t)|² dt,    (3)

with K the set of all stabilizing controllers. In the modern paradigm, however, the primary
interest is not in how to find the optimal controller, which is commonly represented
by Youla's parameterization [12]. Rather, we are interested in relating the optimal
performance to some simple characteristics of the plant to be controlled. In other
words, we provide analytical closed-form expressions of the optimal performance in
terms of the dynamics and structure of the plant [3, 4, 5, 6]. From (1) and (2) we may
construct a single-input and single-output (SISO) plant by selecting either P = Px or
P = Pθ , or alternatively a single-input and two-output (SITO) plant by selecting both
plants, i.e., P = (Px , Pθ )T .
Theorem 3.1. Let P be an SISO plant which has non-minimum phase zeros zi (i =
1, . . . , nz) and unstable poles pj (j = 1, . . . , np). Then the analytical closed-form expression
of (3) is given by

E∗ = Σ_{i=1}^{nz} 2 Re zi / |zi|² + Σ_{j,k=1}^{np} 4 Re pj Re pk (1 − φ(p̄j))(1 − φ(pk)) / ((p̄j + pk) p̄j pk σ̄j σk),

where

φ(s) := Π_{i=1}^{nz} (zi + s)/(z̄i − s),

σj := 1 if np = 1, and σj := Π_{k≠j} (pk − pj)/(p̄k + pj) if np ≥ 2.
Theorem 3.1 shows that the minimum tracking error is mainly determined by the
non-minimum phase zeros and unstable poles of the plant. In particular, it is clear
that non-minimum phase zeros close to the imaginary axis contribute a more detrimental
effect. Moreover, unstable poles and unstable zeros close to each other deteriorate the
minimum tracking error, as revealed by the following corollaries.
Corollary 3.1. If P has only one non-minimum phase zero z and unstable pole p, both
are real, then

E∗ = 2/z + 8p/(z − p)².

Corollary 3.2. If P has only one non-minimum phase zero z and two unstable poles
p1 and p2, then

E∗ = 2/z + (8(p1 + p2)/(p1 − p2)²) [ p1(p1 + p2)/(z − p1)² − 2p1p2/((z − p1)(z − p2)) + p2(p1 + p2)/(z − p2)² ].
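As a quick numerical check of Corollary 3.1, the sketch below evaluates the closed-form expression for arbitrary illustrative values of the zero z and the pole p (these numbers are not taken from the paper).

```python
def tracking_error_one_zero_one_pole(z, p):
    """Minimum tracking error E* for one real non-minimum phase zero z
    and one real unstable pole p (Corollary 3.1): E* = 2/z + 8p/(z-p)^2."""
    return 2.0 / z + 8.0 * p / (z - p) ** 2

# arbitrary illustrative values: zero at z = 3, pole at p = 1
print(tracking_error_one_zero_one_pole(3.0, 1.0))  # 2/3 + 8/4 = 2.666...
```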
By imposing simple differential calculus on (6) we determine the length which provides
the lowest possible tracking error:

ℓ∗ = M(3cos²α − 8 + √(1024 − 768cos²α + 9cos⁴α)) / ((8 − 6cos²α)ϕ),    (7)

where ϕ is the "length density" constant which represents the ratio between the mass and the
length of the pendulum, i.e., ϕ := m/ℓ. We can see from (7) that the optimal length
can be reduced by decreasing the mass of the cart or by selecting a material for the
pendulum with a bigger length density.
By reformulating (7) we may have

m/M = (3cos²α − 8 + √(1024 − 768cos²α + 9cos⁴α)) / (8 − 6cos²α),    (8)
which suggests that, for a given elevation α, the minimum error can always be accomplished
as long as the ratio between the mass of the pendulum and that of the cart
[Figure 3. Minimum tracking error versus pendulum length ℓ (m) for track elevations α = 0°, 15°, 30°.]
equals the constant on the right-hand side of (8), regardless of the type of material
we use for the pendulum. In particular, for α = 0 we have a kind of magic number

m/M = (√265 − 5)/2.
To illustrate our result, we consider a rod-shaped pendulum made from platinum
mounted on a cart of weight 1 kg. We assume that the radius of the pendulum rod is fixed at
1 cm while its length is varied. The pendulum then has a length
density of 3.3677 kg/m. Figure 3 depicts the minimum tracking error calculated from
(6) with respect to the pendulum length ℓ and the track elevation α. It confirms
that more elevation requires more control effort. Figure 4 plots the relation between the
track elevation and the optimal pendulum length based on (7). It is shown that ℓ∗ is
a decreasing function of α, indicating that more elevation, and thus more control effort,
can be compensated by selecting a shorter pendulum.
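The following sketch evaluates the optimal length as reconstructed in (7) for the platinum-rod example above (M = 1 kg, ϕ = 3.3677 kg/m); the three elevation values simply mirror the curves discussed in the text.

```python
import math

def optimal_length(M, phi, alpha_deg):
    """Optimal pendulum length l* from (7); M = cart mass [kg],
    phi = length density m/l [kg/m], alpha = track elevation [deg]."""
    c = math.cos(math.radians(alpha_deg))
    num = M * (3 * c**2 - 8 + math.sqrt(1024 - 768 * c**2 + 9 * c**4))
    return num / ((8 - 6 * c**2) * phi)

# platinum rod example from the text: M = 1 kg, phi = 3.3677 kg/m
for alpha in (0, 15, 30):
    print(alpha, round(optimal_length(1.0, 3.3677, alpha), 3))
# alpha = 0 gives roughly 1.67 m, decreasing with the track elevation,
# consistent with the decreasing curve in Figure 4
```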
5. CONCLUSION
We have examined a simple but interesting optimization problem that arises in
the field of control engineering. From the perspective of the tracking error control problem of a
pendulum, it has been shown that the lowest possible tracking error depends solely
on the pendulum parameters. In particular, we provide an analytical closed-form
expression for the optimal pendulum length. The approach adopted in this paper, moreover,
enables us to design an apparatus that optimally accomplishes a certain objective.
Figure 4. The optimal pendulum length with respect to the track elevation.
References
[1] Atay, F.M., ”Balancing the Inverted Pendulum Using Position Feedback,” Applied Mathematics
Letters, 12, 51–56, 1999.
[2] Chen, G., Chen, J., and Middleton, R., ”Best Tracking and Regulation Performance under
Control Energy Constraint,” IEEE Transactions on Automatic Control, 48:8, 1320–1336, 2003.
[3] Chen, J., Hara, S., and Chen, G., ”Optimal Tracking Performance for SIMO systems,” IEEE
Transactions on Automatic Control, 47:10, 1770–1775, 2002.
[4] Chen, J., Qiu, L., and Toker, O., ”Limitations on Maximal Tracking Accuracy,” IEEE Trans-
actions on Automatic Control, 45:22, 326–331, 2000.
[5] Hara, S., Bakhtiar, T., and Kanno, M., ”The Best Achievable H2 Tracking Performances for
SIMO Feedback Control Systems,” Journal of Control Science and Engineering, 2007, 2007.
[6] Hara, S. and Kogure, C., ”Relationship between H2 Control Performance Limits and RHP
Pole/Zero Locations,” Proceedings of the 2003 SICE Annual Conference, Fukui, Japan, 1242–
1246, 2003.
[7] Lazar, T. and Pastor, P., ”Factors Limiting Controlling of an Inverted Pendulum,” Acta Poly-
technica Hungarica, 8:4, 23–34, 2011.
[8] Loram, I.D. and Lakie, M., ”Human Balancing of an Inverted Pendulum: Position Control by
Small, Ballistic-Like, Throw and Catch Movements,” Journal of Physiology, 540:3, 1111–1124,
2002.
[9] Loram, I.D., Gawthrop, P.J. and Lakie, M., ”The Frequency of Human, Manual Adjustments
in Balancing an Inverted Pendulum is Constrained by Intrinsic Physiological Factors,” Journal of
Physiology, 577:1, 417–432, 2006.
[10] Loram, I.D., Kelly, S.M., and Lakie, M., ”Human Balancing of an Inverted Pendulum: Is Sway
Size Controlled by Ankle Impedance?” Journal of Physiology, 532:3, 879–891, 2001.
[11] Taurasi, I., Inverted Pendulum Studies for Seismic Attenuation, SURF Final Report LIGO
T060048-00-R, California Institute of Technology, USA, 2005.
[12] Vidyasagar, M., Control System Synthesis: A Factorization Approach, MIT Press, Cambridge,
MA, 1985.
[13] Woodyatt, A.R., Middleton, R.H., and Freudenberg, J.S., Fundamental Constraints for the
Inverted Pendulum Problem, Technical Report EE9716, Department of Electrical and Computer
Engineering, the University of Newcastle, Australia, 1997.
Bambang Edisusanto
Madrasah Tsanawiyah Negeri Pakem,
Jl. Cepet Purwobinangun (PKM), Sleman 55582, Yogyakarta.
e-mail: [email protected]
Toni Bakhtiar
Departemen Matematika, Institut Pertanian Bogor,
Jl. Meranti, Kampus IPB Darmaga, Bogor 16880.
e-mail: [email protected]
Ali Kusnanto
Departemen Matematika, Institut Pertanian Bogor,
Jl. Meranti, Kampus IPB Darmaga, Bogor 16880.
e-mail: [email protected]
Proceedings of ”The 6th SEAMS-UGM Conference 2011”
Applied Mathematics, pp. 347–364.
Abstract. In this work the existence of traveling wave solutions of some time-delayed
lattice reaction-diffusion systems is studied. Employing an iterative method coupled with
the explicit construction of upper and lower solutions in the theory of quasi-monotone
dynamical systems, we obtain a critical speed, c∗, and show the existence of traveling
wave solutions connecting the trivial solution and the coexistence state when the wave
speed c is larger than c∗.
Keywords and Phrases: Traveling Wave Solution, Delayed LDE, Upper and Lower Solutions.
1. INTRODUCTION
The purpose of this work is to investigate the existence of traveling wave solutions
for the following time-delayed lattice reaction-diffusion systems:

d/dt un,i(t) = dn[un,i−1(t) − 2un,i(t) + un,i+1(t)] + un,i(t) fn((ui)t(−τn)),    (1)

where t ∈ R, dn > 0, un,i ∈ C¹(R, R), fn ∈ C¹(R^N, R), τn = (τn,1, · · · , τn,N) for some
nonnegative constants τn,1, · · · , τn,N and

(ui)t(−τn) = (u1,i(t − τn,1), · · · , uN,i(t − τn,N)),

for i ∈ Z and 1 ≤ n ≤ N. We assume fn(0, · · · , 0) > 0 and there exist positive
numbers k1, · · · , kN such that fn(k1, · · · , kN) = 0 for each n. Then it is obvious that
0 := (0, · · · , 0) and K := (k1, · · · , kN) are equilibria of systems (1).
2010 Mathematics Subject Classification: Primary: 34A33, 34C37, 34K10, 35C07; Secondary: 92B20.
The systems of lattice differential equations (1) can be seen as the discrete version
of the following time-delayed reaction-diffusion systems:

∂/∂t un(x, t) = dn Δun(x, t) + un(x, t) fn(u1(x, t − τn,1), · · · , uN(x, t − τn,N)).    (2)

The systems (1) or (2) describe the dynamical interaction of N species distributed on
the one-dimensional integer lattice Z¹ or on R¹, respectively. In recent years, the study of
the existence of traveling wave solutions for systems (1) and (2) has attracted a lot of
attention. For related results on lattice differential equations, we refer the readers
to Bates et al. [1], Chen et al. [2], Chow et al. [3], Hsu et al. [4, 5, 6], Huang et
al. [7], Keener [8], Ma et al. [11], Mallet-Paret [12], Wu et al. [13, 14] and Zinner et
al. [16, 15], and the references cited therein.
Motivated by the works of Lin et al. [10] and Hsu et al. [6], we extend the
results of Lin et al. [10] to the lattice systems (1) in this work. A traveling wave
solution of systems (1) is a solution of the form

un,i(t) = φn(i + ct), for all i ∈ Z, t ∈ R, 1 ≤ n ≤ N,

where each φn ∈ C¹(R, R) and c ∈ R is called the wave speed. Under the moving
coordinate s = i + ct, we then have the profile equation

cφ′n(s) = dn[φn(s − 1) − 2φn(s) + φn(s + 1)] + φn(s)fn(Φs(−cτn)),    (3)

for 1 ≤ n ≤ N and Φs(−cτn) = (φ1(s − cτn,1), · · · , φN(s − cτn,N)). Our purpose is
to find positive traveling wave solutions of (1) connecting the equilibria 0 = (0, · · · , 0) and
K = (k1, · · · , kN), i.e., the functions φn are positive for all 1 ≤ n ≤ N and satisfy the
following asymptotic boundary conditions:

lim_{s→−∞} (φ1(s), · · · , φN(s)) = 0 and lim_{s→∞} (φ1(s), · · · , φN(s)) = K.    (4)
is not empty, where λn,1 (c) and λn,2 (c) are roots of the following function
∆n (λ, c) = −cλ + dn (e−λ + eλ − 2) + fn (0, · · · , 0).
Based on the assumptions (A1), (A2) and (A3), we will show that the functions
f1, · · · , fN satisfy the special condition called (EMQM). Then we can use the assumption
(A4) and apply the iteration method to obtain the traveling wave solutions of
systems (1). Our main results are stated as follows.
Theorem 1.1. Assume (A1)∼(A4) hold. Then there exists a positive number c∗ such
that if c > c∗ then there is a δ > 0 such that equation (3) has positive solutions satisfying
(4) when max{τn,n | 1 ≤ n ≤ N} < δ.
Note that although we apply techniques similar to those in Lin et al. [10],
there are some differences. First, the authors of that paper establish three pairs of
upper-lower solutions for three specific models to derive the existence of traveling wave
solutions, respectively. Compared with the results of that paper, we establish the pair
of upper-lower solutions of systems (1) explicitly and generally. Then we obtain the
existence of traveling wave solutions for more general models. Moreover, our results
also generalize the results in Hsu et al. [6].
The remainder of this paper is organized as follows. In Section 2, we introduce
some definitions as well as notations, and show that part (1) of condition (EMQM) holds
under some sufficient conditions. Next, we define the solution operator for equation
(3) and examine its properties in Section 3. In Section 4, we introduce the characteristic
function for equation (3) and use its roots to establish the upper-lower solutions.
According to the results in Sections 2, 3 and 4, we then use the iteration scheme and
Schauder's fixed point theorem to prove our main results in Section 5. In the final section,
we illustrate some well-known models and use our results to derive the existence
of traveling wave solutions.
2. PRELIMINARIES
In this section, we will introduce some notations, a terminology and a lemma
which will be used in the proof of the main theorem. First of all, some definitions are
given as follows.
where 3K = (3k1, · · · , 3kN). Then Cb(R, R^N) is a Banach space with the norm
‖Φ‖ := sup_{s∈R, 1≤n≤N} |φn(s)|.
(3) Let Φ̂ = (φ̂1, · · · , φ̂N) and Φ̌ = (φ̌1, · · · , φ̌N) be two functions in C3K(R, R^N)
which are continuously differentiable for all except finitely many t. They are called a pair
of upper-lower solutions of (3), respectively, if Φ̌ ≤ Φ̂ and they satisfy the following
inequalities for all except finitely many t:

−cφ̌′n(s) + dn(φ̌n(s − 1) − 2φ̌n(s) + φ̌n(s + 1)) + φ̌n(s)fn(Φs(−cτn)) ≥ 0,    (6)

and

−cφ̂′n(s) + dn(φ̂n(s − 1) − 2φ̂n(s) + φ̂n(s + 1)) + φ̂n(s)fn(Ψs(−cτn)) ≤ 0.    (7)

(4) Let Φ̂ and Φ̌ be a pair of upper-lower solutions of (3). We define Γ(Φ̌, Φ̂) as
the set of all functions Φ ∈ C3K(R, R^N) satisfying Φ̌ ≤ Φ ≤ Φ̂ such that
eβn s [φ̂n (s) − φn (s)] and eβn s [φn (s) − φ̌n (s)]
for some βn > 0, 1 ≤ n ≤ N , are nondecreasing for all s ∈ R.
Definition 2.2. The functions f1, · · · , fN of systems (1) are said to satisfy condition
(EMQM) if the following conditions hold:
(1) There exist positive real numbers β1, · · · , βN such that, given any 1 ≤ n ≤ N
and s ∈ R,
ψn (s)fn (Φ̄s (−cτn )) − φn (s)fn (Φs (−cτn )) + (cβn − 2dn )(ψn (s) − φn (s)) ≥ 0,
for all Φ̄ = (φ1 , · · · , ψn , · · · , φN ), Φ = (φ1 , · · · , φN ) ∈ C 1 (R, RN ) with 0 ≤
φn ≤ ψn ≤ 3kn , 0 ≤ φj ≤ 3kj for j ∈ {1, · · · , N }\{n} and eβn s (ψn (s) − φn (s))
is nondecreasing for s ∈ R.
(2) fn(x1, · · · , xN) is monotone with respect to xj for 1 ≤ n, j ≤ N with n ≠ j.
Next, we show that part (1) of condition (EMQM) holds when all the delay times
τn,n are small enough.
Lemma 2.1. Let c > 0 be fixed. There exists some δ > 0 such that if τn,n < δ for all
1 ≤ n ≤ N, then (1) of condition (EMQM) holds.
for 1 ≤ n ≤ N , and a fixed point of the operator G is a solution of (3). Some properties
Lemma 3.3. Assume (EMQM) holds. Then G : Γ(Φ̌, Φ̂) → Γ(Φ̌, Φ̂) is a compact
operator.
Proof. Let Φ = (φ1, · · · , φN) ∈ Γ(Φ̌, Φ̂). First of all, we claim that eβn s[φ̂n(s) −
Gn(Φ)(s)] and eβn s[Gn(Φ)(s) − φ̌n(s)] are nondecreasing for all s ∈ R and 1 ≤ n ≤ N.
Here we only prove the former case; the latter case can be shown in the same way.
For any 1 ≤ n ≤ N, by the definition of the upper solutions and condition (1)
of (EMQM), it is easy to see that
d/ds ( eβn s[φ̂n(s) − Gn(Φ)(s)] )
= eβn s( βn φ̂n(s) + φ̂′n(s) − Hn(Φ)(s) )
≥ eβn s( βn φ̂n(s) + φ̂′n(s) − Hn(Ψ)(s) ) + eβn s( Hn(Ψ)(s) − Hn(Φ)(s) )
≥ 0
for all except finitely many s, where Ψ(s) = (ψ1(s), · · · , ψN(s)) and

ψj(s) = φ̂j(s), if ∂fn(X)/∂xj ≥ 0,
ψj(s) = φ̂j(s), if j = n,
ψj(s) = φ̌j(s), if ∂fn(X)/∂xj ≤ 0.
Then the continuity of φ̂n(s) and Gn(Φ)(s) implies that eβn s[φ̂n(s) − Gn(Φ)(s)] is nondecreasing
for all s ∈ R. Hence, the assertion of our claim follows.
Next, we prove that Φ̌(s) ≤ G(Φ)(s) ≤ Φ̂(s) for all s ∈ R. To this end, we first
show that

φ̌n(s) ≤ Gn(φ1, φ2, · · · , φ̌n, · · · , φN)(s), for all s ∈ R, 1 ≤ n ≤ N.    (8)
Without loss of generality, we may assume that n = 1. By (A1) and (A2), we have

G1(φ̌1, φ2, · · · , φN)(s) = e^{−β1 s} ∫_{−∞}^{s} e^{β1 z} H1(φ̌1, φ2, · · · , φN)(z) dz
                         ≥ e^{−β1 s} ∫_{−∞}^{s} e^{β1 z} H1(φ̌1, ψ2, · · · , ψN)(z) dz,
where

ψj(s) = φ̂j(s), if ∂fn(X)/∂xj ≤ 0,
ψj(s) = φ̌j(s), if ∂fn(X)/∂xj ≥ 0.
Note that φ̌′1 may fail to exist at finitely many real numbers. If s ∈ R and φ̌′1 exists on
(−∞, s), then we have

e^{−β1 s} ∫_{−∞}^{s} e^{β1 z} H1(φ̌1, ψ2, · · · , ψN)(z) dz ≥ e^{−β1 s} ∫_{−∞}^{s} e^{β1 z} (φ̌′1(z) + β1 φ̌1(z)) dz = φ̌1(s),    (9)

since Φ̌ is a lower solution. On the other hand, if s ∈ R and φ̌′1 fails to exist at finitely many
points of (−∞, s), then by improper integrals, integration by parts and arguments similar to those
above, one can also easily check that (9) remains true. Hence the inequality (8) follows.
In the same way, we can also obtain that

Gn(φ1, φ2, · · · , φ̂n, · · · , φN)(s) ≤ φ̂n(s) for all s ∈ R, 1 ≤ n ≤ N.    (10)

From (8), (10) and Lemma 3.2, we know that

φ̌n(s) ≤ Gn(φ1, φ2, · · · , φ̌n, · · · , φN)(s) ≤ Gn(φ1, φ2, · · · , φn, · · · , φN)(s)
       ≤ Gn(φ1, φ2, · · · , φ̂n, · · · , φN)(s) ≤ φ̂n(s)

for s ∈ R and 1 ≤ n ≤ N. Therefore, G(Φ) ∈ Γ(Φ̌, Φ̂).
The proof of the compactness of the operator G is similar to the proof in Li et al. [9] and
is omitted here.
is not empty. We start to establish the upper and lower solutions of (3) by using the
properties of the characteristic functions.
First, we define the function hn,q(t) = e^{λn,1 t} − q e^{ηλn,1 t}, for 1 ≤ n ≤ N, where
q > 1 and η is a real number satisfying

1 < η < min{ λn,2/λn,1 , (λn,1 + λm,1)/λn,1 | 1 ≤ n, m ≤ N }.

Direct computation implies that hn,q(t) has a unique global maximum mn(q) at t = tn(q), where

mn(q) = (1 − 1/η)(1/(qη))^{1/(η−1)} and tn(q) = (1/(λn,1(η − 1))) ln(1/(qη)).

It is clear that limq→∞ tn(q) = −∞ and limq→∞ mn(q) = 0+. Let σ(q) > 1 with
mn(q)/σ(q) < kn for all 1 ≤ n ≤ N and set

t∗n(q) := max{t | hn,q(t) = mn(q)/σ(q)}.

Note that hn,q(t) = 0 at t = (1/(λn,1(η − 1))) ln(1/q), and

tn(q) < t∗n(q) < (1/(λn,1(η − 1))) ln(1/q).
Then the fact q > 1 implies that t∗n(q) < 0. Hence we can choose a number δq > 0
small enough such that

mn(q)/(2σ(q)) < kn − (kn − mn(q)/σ(q)) e^{−γ t∗n(q)} < mn(q)/σ(q),

for all γ ∈ (0, δq) and any 1 ≤ n ≤ N. Then there exists a number tn(γ, q) satisfying

t∗n(q) < tn(γ, q) < (1/(λn,1(η − 1))) ln(1/q)    (13)

and

kn − (kn − mn(q)/σ(q)) e^{−γ tn(γ,q)} = hn,q(tn(γ, q)).
Next, let κ be a positive number in ∩_{n=1}^{N} (λn,1, min_{1≤j≤N}{λn,1 + λj,1, λn,2}). Then
we consider the functions ĥn,q(t) = e^{λn,1 t} + q kn e^{κt} for 1 ≤ n ≤ N. For each n, one can
easily check that there exists a unique number t̂n(q) such that

ĥn,q(t̂n(q)) = 3kn and limq→∞ t̂n(q) = −∞.

ĥn,q(t̂∗n(q)) < kn < kn + kn e^{−γ t̂∗n(q)} for any γ > 0.    (14)
There also exists some δ̂q > 0 such that if 0 < γ < δ̂q, then

ĥn,q(t̂n(q)) = 3kn > kn + kn e^{−γ t̂n(q)},    (15)

t̂∗n(q) < t̂n(γ, q) < t̂n(q) and ĥn,q(t̂n(γ, q)) = kn + kn e^{−γ t̂n(γ,q)},

where 0 < γ < δ̂q. For convenience, let us replace tn(γ, q), mn(q), σ(q) and t̂n(γ, q) by
tn, mn, σ and t̂n respectively, and define
φ̌n(t) = e^{λn,1 t} − q e^{ηλn,1 t},      if t < tn,
        kn − (kn − mn/σ) e^{−γt},       if t ≥ tn,                  (16)

φ̂n(t) = e^{λn,1 t} + q kn e^{κt},       if t < t̂n,
        kn + kn e^{−γt},                if t ≥ t̂n.                 (17)
From the definitions of each φ̂n and each φ̌n, it is easy to see that

lim_{t→−∞} φ̌n(t) = lim_{t→−∞} φ̂n(t) = 0 and lim_{t→∞} φ̌n(t) = lim_{t→∞} φ̂n(t) = kn

for all 1 ≤ n ≤ N.
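The sketch below simply evaluates one component of the lower and upper solutions (16)–(17) at a few points. All numerical values (λn,1, η, κ, q, γ, kn, mn/σ and the switch points tn, t̂n) are illustrative placeholders, not chosen to satisfy the matching conditions of the construction; the code only shows the qualitative behaviour (both functions tend to 0 as t → −∞ and to kn as t → +∞, with the lower one below the upper one).

```python
import math

lam1, eta, kappa = 1.0, 1.5, 1.8     # hypothetical characteristic roots
q, gamma, k = 2.0, 0.05, 1.0         # hypothetical q, gamma and k_n
m_over_sigma = 0.2
t_n, that_n = -2.0, -1.5             # hypothetical transition points

def phi_lower(t):
    return math.exp(lam1 * t) - q * math.exp(eta * lam1 * t) if t < t_n \
        else k - (k - m_over_sigma) * math.exp(-gamma * t)

def phi_upper(t):
    return math.exp(lam1 * t) + q * k * math.exp(kappa * t) if t < that_n \
        else k + k * math.exp(-gamma * t)

for t in (-10.0, -3.0, 0.0, 50.0):
    print(t, round(phi_lower(t), 4), round(phi_upper(t), 4))
```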
[Figure: sketch of the lower solution e^{λn,1 t} − q e^{ηλn,1 t} (below the level kn) and the upper solution approaching kn + kn e^{−γt}, with the switch points tn and t̂n marked on the t-axis.]
Lemma 4.3. Assume (A1)∼(A4) hold. Then there is a c∗ > 0 such that for each c > c∗
there exists a δ > 0 such that the functions Φ̂ = (φ̂1, · · · , φ̂N) and Φ̌ = (φ̌1, · · · , φ̌N)
defined by (17) and (16), respectively, are a pair of upper-lower solutions of (3) if q is
large enough, γ is small enough and τn,n < δ for all 1 ≤ n ≤ N.
Now we show that (6) and (7) hold. To prove (6), let us recall that the function
Φ = (φ1, · · · , φN) in (6) is defined by

φj(t) = φ̂j(t), if ∂fn(X)/∂xj ≤ 0,
φj(t) = φ̌j(t), if j = n,
φj(t) = φ̌j(t), if ∂fn(X)/∂xj ≥ 0,

and then we consider the following two cases: (i) t < tn and (ii) t ≥ tn.
−cφ̌′n(t) + dn(φ̌n(t − 1) − 2φ̌n(t) + φ̌n(t + 1)) + φ̌n(t)fn(Φt(−cτn))
≥ −cλn,1(e^{λn,1 t} − qη e^{ηλn,1 t}) + dn(e^{λn,1(t−1)} − q e^{ηλn,1(t−1)}) − 2dn(e^{λn,1 t} − q e^{ηλn,1 t}) + dn(e^{λn,1(t+1)} − q e^{ηλn,1(t+1)}) + φ̌n(t)fn(Φt(−cτn))
= −qΔn(ηλn,1, c) e^{ηλn,1 t} + φ̌n(t)( fn(Φt(−cτn)) − fn(0) )
for some X(t) ∈ R^N. Since φ̌n(t) ≥ 0 and φ̂j(t) ≤ e^{λj,1 t} + q kn e^{κt} for all j, then

φ̌n(t)( fn(Φt(−cτn)) − fn(0) ) ≥ φ̌n(t) (∂fn(X(t))/∂xn) φ̌n(t − cτn,n)
    + Σ_{j∈In−} (∂fn(X(t))/∂xj) φ̌n(t) ( e^{λj,1(t−cτn,j)} + q kn e^{κ(t−cτn,j)} ).
Moreover, the fact φ̌n (t) ≥ 0 for all t implies that qeηλn,1 t ≤ eλn,1 t for all t < tn . Then,
it is easy to see that |φ̌n (t)| ≤ 2eλn,1 t and
−qΔn(ηλn,1, c) e^{ηλn,1 t} + φ̌n(t) (∂fn(X(t))/∂xn) φ̌n(t − cτn,n)
    + Σ_{j∈In−} (∂fn(X(t))/∂xj) φ̌n(t) ( e^{λj,1(t−cτn,j)} + q kn e^{κ(t−cτn,j)} ) > 0.
(ii) Assume t ≥ tn.
Since φ̂j(t) ≤ kj + kj e^{−γt} and φ̌j(t) ≥ kj − (kj − mj/σ) e^{−γt} for all t ∈ R and 1 ≤ j ≤ N,
it is clear that

fn(Φt(−cτn)) ≥ (∂fn(X(t))/∂xn)( φ̌n(t − cτn,n) − kn )
    + Σ_{j∈In−} (∂fn(X(t))/∂xj) kj e^{−γ(t−cτn,j)}
    + Σ_{j∈In+} (−∂fn(X(t))/∂xj)(kj − mj/σ) e^{−γ(t−cτn,j)}.
By (13), if t ≥ tn then
According to (21) and (18), it is easy to see that (6) holds by taking all mj/σ small
enough and letting γ → 0.
On the other hand, if tn < t < tn + cτn,n, we have

φ̌n(t − cτn,n) = kn − (kn − mn/σ) e^{−γ tn} + ε(t),

where ε(t) → 0 as τn,n → 0, and

( Σ_{j∈In−} (∂fn(X(t))/∂xj) kj e^{γcτn,j} + Σ_{j∈In+} (−∂fn(X(t))/∂xj)(kj − mj/σ) e^{γcτn,j} ) e^{−γ tn}
    + (∂fn(X(t))/∂xn)( φ̌n(t − cτn,n) − kn )
≥ ( Σ_{j∈In−} (∂fn(X(t))/∂xj) kj e^{γcτn,j} + Σ_{j∈In+} (−∂fn(X(t))/∂xj)(kj − mj/σ) e^{γcτn,j} ) e^{−γ tn}
    + (∂fn(X(t))/∂xn) ε(t) + (−∂fn(X(t))/∂xn)(kn − mn/σ) e^{−γ tn}.
Note that e^{−γt} < e^{−γ tn}. According to equations (13), (21) and (18), the inequality (6)
holds by taking all mj/σ small enough and letting γ as well as τn,n be small.
Next, we prove that (7) holds for all 1 ≤ n ≤ N. To this end, let us recall that the
function Ψ = (ψ1, · · · , ψN) in (7) is defined by

ψj(t) = φ̂j(t), if ∂fn(X)/∂xj ≥ 0,
ψj(t) = φ̂j(t), if j = n,
ψj(t) = φ̌j(t), if ∂fn(X)/∂xj ≤ 0,

and then we also consider the following two cases: (i) t < t̂n and (ii) t ≥ t̂n.
(i) Assume t < t̂n.
−cφ̂′n(t) + dn(φ̂n(t − 1) − 2φ̂n(t) + φ̂n(t + 1)) + φ̂n(t)fn(Ψt(−cτn))
≤ −cλn,1 e^{λn,1 t} − cκ q kn e^{κt} + dn( e^{λn,1(t−1)} + q kn e^{κ(t−1)} + e^{λn,1(t+1)} + q kn e^{κ(t+1)} ) − 2dn( e^{λn,1 t} + q kn e^{κt} ) + φ̂n(t)fn(Ψt(−cτn))
= q kn Δn(κ, c) e^{κt} + φ̂n(t)( fn(Ψt(−cτn)) − fn(0) ).
From the choice of κ and (12), it is clear that Δn(κ, c) < 0. Similar to equation (19),
we can get the following inequality
for some X(t) ∈ R^N. By (18), we know that if τn,n is small enough then
Furthermore, we have
if t̂n and τn,n are small enough. Hence the inequality (7) holds.
(ii) Assume t ≥ t̂n.
−cφ̂′n(t) + dn(φ̂n(t − 1) − 2φ̂n(t) + φ̂n(t + 1)) + φ̂n(t)fn(Ψt(−cτn))
≤ e^{−γt} kn( cγ + dn(e^{γ} − 2 + e^{−γ}) ) + φ̂n(t)fn(Ψt(−cτn)).
Note that φ̂n(t) > kn for all t > t̂n. If t > cτn,n + t̂n and γ is small enough, then (18)
implies that

φ̂n(t)fn(Ψt(−cτn)) ≤ φ̂n(t) ( Σ_{j∈In+∪{n}} (∂fn(X(t))/∂xj) kj e^{γcτn,j}
    + Σ_{j∈In−} (−∂fn(X(t))/∂xj)(kj − mj/σ) e^{γcτn,j} ) e^{−γt}
≤ −Ln kn e^{−γt}.
Thus, if γ is small enough, we have
e−γt kn (cγ + dn (eγ − 2 + e−γ )) + φ̂n (t)fn (Ψt (−cτn )) ≤ 0.
By (18), if t̂n < t < cτn,n + t̂n and γ is small enough, then

fn(Ψt(−cτn)) ≤ (−∂fn(X(t))/∂xn) kn + (∂fn(X(t))/∂xn) φ̂n(t − cτn,n)
    + ( Σ_{j∈In+} (∂fn(X(t))/∂xj) kj e^{γcτn,j} + Σ_{j∈In−} (−∂fn(X(t))/∂xj)(kj − mj/σ) e^{γcτn,j} ) e^{−γt}
= (−∂fn(X(t))/∂xn) kn + (∂fn(X(t))/∂xn)( kn + kn e^{−γ t̂n} + ε(cτn,n) )
    + ( Σ_{j∈In+} (∂fn(X(t))/∂xj) kj e^{γcτn,j} + Σ_{j∈In−} (−∂fn(X(t))/∂xj)(kj − mj/σ) e^{γcτn,j} ) e^{−γt}
≤ (∂fn(X(t))/∂xn)( kn e^{−γ t̂n} + ε(cτn,n) )
    + ( Σ_{j∈In+} (∂fn(X(t))/∂xj) kj e^{γcτn,j} + Σ_{j∈In−} (−∂fn(X(t))/∂xj)(kj − mj/σ) e^{γcτn,j} ) e^{−γ t̂n}
≤ (∂fn(X(t))/∂xn) ε(cτn,n) − Ln e^{−γ t̂n},
where ε(t) → 0 as τn,n → 0. Note that eγ + e−γ > 2, for all γ > 0. Therefore, if γ and
τn,n are small enough then
e^{−γt} kn( cγ + dn(e^{γ} − 2 + e^{−γ}) ) + φ̂n(t)fn(Ψt(−cτn))
≤ e^{−γ t̂n} kn( cγ + dn(e^{γ} − 2 + e^{−γ}) ) − 3Ln kn e^{−γ t̂n} + 3kn |ε(t)| |∂fn(X(t))/∂xn| ≤ 0.
The proof is complete.
4.2. Existence Of Traveling Wave Solutions. Now we prove the results of Theorem
1.1 in this subsection.
Proof of Theorem 1.1. Let c∗ > 0 be as defined in Lemma 4.3. For c > c∗, we can choose
q large enough and γ small enough. Then, by Lemma 4.3, there exists
a δ > 0 such that if |τn,n| < δ for all 1 ≤ n ≤ N then the functions Φ̂ = (φ̂1, · · · , φ̂N)
and Φ̌ = (φ̌1, · · · , φ̌N) defined by (16) and (17) form a pair of upper-lower solutions of
(3). One can verify that eβn s(φ̂n(s) − φ̌n(s)) is nondecreasing for all s ∈ R. This implies that
Γ(Φ̌, Φ̂) is a non-empty, convex, bounded and closed set with respect to the supremum norm ‖·‖.
Therefore, by Lemma 3.1, Lemma 3.3 and Schauder's fixed point theorem, equation
(3) has a solution Y(t) = (y1(t), · · · , yN(t)) ∈ Γ(Φ̌, Φ̂) satisfying the inequalities
5. APPLICATIONS
In this section, we will apply our main theorem to show the existence of traveling
wave solutions for various types of lattice reaction-diffusion systems.
Example 5.1 (N-Species Delayed Lotka-Volterra Ecological Models).
where i ∈ Z, dn, rn > 0, τnm ≥ 0 and ann < 0 for 1 ≤ m, n ≤ N. If the anm are positive
for all n ≠ m then system (22) is called a cooperative model; if the anm are negative for
n ≠ m then system (22) is called a competitive model; and if anm akℓ < 0 for some
n ≠ m and k ≠ ℓ then system (22) is called a predator-prey system.
For systems (22), we assume that there is a positive equilibrium (k1, · · · , kN), i.e.,
the equations

rn + ann kn + Σ_{m≠n} anm km = 0    (23)

hold for some ki > 0, i = 1, · · · , N. Let Φ(t) = (φ1(t), · · · , φN(t)) be a traveling wave
solution of (22); then the corresponding profile equations are

−cφ′n(t) = dn(φn(t − 1) − 2φn(t) + φn(t + 1)) + φn(t)( rn + ann φn(t − cτnn) + Σ_{m≠n} anm φm(t − cτnm) ).    (24)
It is clear that the conditions (A1) and (A2) hold for (22). By elementary computation,
one can see that condition (A3) holds when

−ann kn > Σ_{m≠n} |anm| km.    (25)

Therefore, if the conditions (23) and (25) hold, then we obtain the same results stated
in Theorem 1.1 for systems (22).
Note that the equations (23) can be rewritten as

−ann kn = rn + Σ_{m≠n} anm km,

so the condition (A3) (or (25)) always holds for cooperative systems.
However, it is not easy to verify condition (23) for general systems. Here we only
consider the following two-species ecological system:

u′i(t) = d1( ui−1(t) − 2ui(t) + ui+1(t) ) + ui(t)( r1 + a11 ui(t − τ11) + a12 vi(t − τ12) ),
v′i(t) = d2( vi−1(t) − 2vi(t) + vi+1(t) ) + vi(t)( r2 + a22 vi(t − τ22) + a21 ui(t − τ21) ).    (26)
If a11 a22 − a12 a21 ≠ 0, then it is obvious that the equilibrium (k1, k2) can be expressed
explicitly by

(k1, k2) = ( (−r1 a22 + r2 a12)/(a11 a22 − a12 a21), (−r2 a11 + r1 a21)/(a11 a22 − a12 a21) ).

It is required that k1 and k2 be positive. Hence we assume

a11 a22 − a12 a21 > 0, −r1 a22 + r2 a12 > 0 and −r2 a11 + r1 a21 > 0.    (27)

Under the above assumptions, (k1, k2) is the unique positive equilibrium of (26). Note that
the assumption (27) also implies that the equilibrium (k1, k2) of (26) is linearly stable for
the following ODEs:

u′i(t) = ui(t)( r1 + a11 ui(t) + a12 vi(t) ),
v′i(t) = vi(t)( r2 + a22 vi(t) + a21 ui(t) ).
Now we only need to verify the condition (A3). By (26), the inequalities in (A3)
can be stated as the following:
−a11 k1 > |a12 |k2 and − a22 k2 > |a21 |k1 . (28)
Then we consider the following two cases for competitive and predator-prey systems
respectively.
◦ Assume a12 > 0 and a21 < 0.
By the formula of (k1 , k2 ), the condition (28) is equivalent to
2r1 a21 a22 < r2 (a11 a22 + a12 a21 ). (29)
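For orientation, the sketch below integrates the lattice system (26) numerically by forward Euler time stepping. It is only an illustration of how a front connecting the two equilibria spreads along the lattice: the delays are set to zero and every parameter value (d1, d2, r1, r2, the aij, the chain length and the step sizes) is an invented example, not a case analysed in the paper.

```python
import numpy as np

# illustrative competitive parameters with coexistence state K = (2/3, 2/3)
d1, d2 = 1.0, 1.0
r1, r2 = 1.0, 1.0
a11, a12, a21, a22 = -1.0, -0.5, -0.5, -1.0
M, dt, steps = 200, 0.01, 5000            # 200 lattice sites, T = 50

u = np.zeros(M); v = np.zeros(M)
u[:20] = 0.6; v[:20] = 0.6                # seed the left end near K

def lap(w):
    # discrete Laplacian w_{i-1} - 2 w_i + w_{i+1} with reflecting ends
    return np.concatenate(([w[1] - w[0]],
                           w[:-2] - 2 * w[1:-1] + w[2:],
                           [w[-2] - w[-1]]))

for _ in range(steps):
    u += dt * (d1 * lap(u) + u * (r1 + a11 * u + a12 * v))
    v += dt * (d2 * lap(v) + v * (r2 + a22 * v + a21 * u))

# the left part of the chain has reached K while the right part is still near 0
print(u[::40].round(3))
```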
References
[1] Bates, P. W., Chen, X. and Chmaj, A. J. J. , Traveling Waves of Bistable Dynamics on a
Lattice, SIAM J. Math. Anal. 35(2), 520-546, 2003.
[2] Chen, X. and Guo, J.-S., Existence and Asymptotic Stability of Traveling Waves of Discrete
Quasilinear Monostable Equations, Journal of Differential Equations 184(2), 549-569, 2002.
[3] Chow, S.-N., Mallet-Paret, J. and Shen, W., Traveling Waves in Lattice Dynamical Systems,
Journal of Differential Equations 149(2), 248-291, 1998.
[4] Hsu, C.-H. and Lin, S.-S., Existence and Multiplicity of Traveling Waves in a Lattice Dynamical
System, Journal of Differential Equations 164(2), 431-450, 2000.
[5] Hsu, C.-H., Lin, S.-S. and Shen, W., Traveling Waves in Cellular Neural Networks, Internat. J.
Bifur. Chaos Appl. Sci. Engrg. 9(7), 1307-1319, 1999.
[6] Hsu, C.-H. and Yang, T.-H., Traveling Plane Wave Solutions of Delayed Lattice Differential
Systems in Competitive Lotka-Volterra Type, Discrete Contin. Dyn. Syst. Ser. B 14(1), 111-128,
2010.
[7] Huang, J., Lu, G. and Ruan, S., Traveling Wave Solutions in Delayed Lattice Differential Equa-
tions with Partial Monotonicity, Nonlinear Anal. 60(7), 1331-1350, 2005.
[8] Keener, J. P., Propagation and Its Failure in Coupled Systems of Discrete Excitable Cells, SIAM
Journal on Applied Mathematics 47(3), 556-572, 1987.
[9] Li, W.-T., Lin, G. and Ruan, S., Existence of Travelling Wave Solutions in Delayed Reaction-
diffusion Systems with Applications to Diffusion-competition Systems, Nonlinearity 19(6), 1253-
1273, 2006.
[10] Lin, G., Li, W.-T. and Ma, M., Traveling Wave Solutions in Delayed Reaction Diffusion Systems
with Applications to Multi-species Models, Discrete Contin. Dyn. Syst. Ser. B 13(2), 393-414,
2010.
[11] Ma, S., Liao, X. and Wu. J., Traveling Wave Solutions for Planar Lattice Differential Systems
with Applications to Neural Networks, Journal of Differential Equations, 182(2), 269-297, 2002.
[12] Mallet-Paret, J., The Global Structure of Traveling Waves in Spatially Discrete Dynamical
Systems, Journal of Dynamics and Differential Equations 11(1), 49-127, 1999.
[13] Wu, J. and Zou, X., Asymptotic and Periodic Boundary Value Problems of Mixed Fdes and
Wave Solutions of Lattice Differential Equations, Journal of Differential Equations 135(2), 315-
357, 1997.
[14] Wu, J. and Zou, X., Traveling Wave Fronts of Reaction-diffusion Systems with Delay, Journal of
Dynamics and Differential Equations 13(3), 651-687, 2001.
[15] Zinner, B., Harris, G. and Hudson, W., Traveling Wavefronts for the Discrete Fishers Equation,
Journal of Differential Equations 105(1), 46-62, 1993.
[16] Zinner, B. Existence of Traveling Wavefront Solutions for the Discrete Nagumo Equation, Journal
of Differential Equations, 96(1), 1-27, 1992.
Cheng-Hsiung Hsu
Department of Mathematics, National Central University,
Chung-Li 32001, Taiwan.
e-mail: [email protected]
Jian-Jhong Lin
Department of Mathematics, National Tsing Hua University,
Hsinchu 30013, Taiwan.
e-mail: [email protected]
Ting-Hui Yang
Department of Mathematics, Tamkang University,
Tamsui, Taipei County 25137, Taiwan.
e-mail: [email protected]
Proceedings of ”The 6th SEAMS-UGM Conference 2011”
Applied Mathematics, pp. 365 - 378.
Abstract. This paper attempts to study the relationship of rainfall and global
radiation with oil palm yield in Riau and Lampung. The method is based on multi-input
transfer function analysis, which is a multivariate time series analysis. This method
combines several properties of univariate ARIMA models and multiple linear regression
analysis. Based on the model obtained, in Riau oil palm yield is affected by the previous yield at t-1,
t-2, t-3 (1, 2, 3 months before harvest). Rainfall affects yield at t (actual), t-2, t-11, t-13
(2, 11, 13 months before harvest), while global radiation affects yield at t (actual), t-1,
t-8, t-9 (1, 8, 9 months before harvest). In Lampung, oil palm yield is affected by the previous yield
at t-1, t-2 (1, 2 months before harvest). Rainfall affects yield at t (actual), t-1, t-7, t-8 (1, 7,
8 months before harvest), while global radiation affects yield at t (actual), t-1, t-6, t-7
(1, 6, 7 months before harvest). The Mean Absolute Percentage Error (MAPE) and Mean
Absolute Deviation (MAD) values in Lampung are smaller than in Riau, so this model can be
used in Lampung to predict the future level of oil palm yield. In Riau, an ARIMA
model can alternatively be used to predict and explain the future level of oil palm yield.
Keywords and Phrases: Multi-input transfer function, Correlation, Oil Palm Yield
I. INTRODUCTION
Climatic conditions such as rainfall and global radiation are uncontrollable parameters
of the environment. In Libo Estate of Riau, rainfall follows a seasonal pattern with rainy and
dry seasons. Divo, D. S [1] has observed that high rainfall in Libo Estate occurs in October,
November and December, while low rainfall occurs in February and June.
Climatic conditions have a relationship with oil palm yield. Chow [2] showed that yield
and rainfall effects of a number of fields were significant in practically all fields analyzed.
Ochs and Daniel [3] described an empirical relationship between soil water deficit and yield,
which could be used to predict yield from rainfall data. Goh [4] compared data on rainfall and
________________________________
2010 Mathematics Subject Classification: STATISTICS Applications to biology (62P10)
FFB yield from a number of countries, but the relationship was only moderately good.
Various methods have been developed for forecasting oil palm yield. Ahmad Alwi [5]
used an ARIMA model that gave good precision for forecasting oil palm yield. However, the
weakness of this method was that the forecasting result did not consider the effects of uncontrollable
parameters like rainfall and global radiation. Hence, another statistical analysis was needed
that can reflect the combined effects of many different types of causal
phenomena.
This paper attempts to study the relationship of rainfall and global radiation with oil
palm yield in Riau and Lampung. The research was located in the oil palm plantations of
PT. SMART Tbk, which is one of the largest private oil palm companies in Indonesia. The
method used in this paper is multi-input transfer function analysis, a
multivariate time series analysis that combines several properties of univariate ARIMA models
and multiple linear regression analysis. The model obtained will be used to determine the
relationship of rainfall and global radiation with oil palm yield and to predict the future
level of oil palm yield.
Yield and climate data were recorded monthly. In Riau the data cover the period from 1998 to
2010 and were collected at Libo Estate (LIBE); in Lampung the data cover the period from 2000 to 2010
and were collected at Sungai Buaya Estate (SBYE).
The climate parameters, rainfall and global radiation, were provided by the meteorological
stations at both locations. Oil palm yield data were taken from the SMARTRI Fertilizer Trials
LIBE-14 and SBYE-01. For the purpose of this study, we focused only on the optimum level
in each experiment.
Figure 1. Trend average per month: rainfall, global radiation, and oil palm yield in Riau
(LIBE) and Lampung (SBYE).
According to the figure, it can be seen that rainfall at LIBE has a different pattern from SBYE:
the LIBE pattern looks more like an equatorial type, while the SBYE pattern looks like a monsoon type. The
same holds for global radiation: at LIBE global radiation is higher
than at SBYE and the pattern also differs between the two locations. Based on these conditions,
we investigate the contribution of rainfall and global radiation to oil palm yield.
2.1. Time Series Analysis. In time series analysis, one important step is to identify and
build models. The aim is to find out whether the time series data used follow a stochastic time
series model procedure, which is usually presented as the ARIMA (p, d, q) model (Hamilton
[6]):
or

φp(B)(1 − B)^d Zt = θq(B) at

at : random noise.
2.2. Transfer Function. The transfer function model is a multivariate time series analysis which
combines several properties of the univariate ARIMA model and multiple linear regression
analysis. The general model of the transfer function (Wei [7]) is:

yt = v0 xt + v1 xt−1 + ... + nt    (2)

with v(B)xt = v0 xt + v1 xt−1 + ... + vn xt−n. Then

v(B) = ωs(B) B^b / δr(B) and nt = ( θq(B) / φp(B) ) at.

Thus,

yt = ( ωs(B) / δr(B) ) xt−b + ( θq(B) / φp(B) ) at    (Wei [7]),

where
ωs(B) := ω0 − ω1 B − ω2 B² − ... − ωs B^s,
δr(B) := 1 − δ1 B − δ2 B² − ... − δr B^r,
φp(B) := 1 − φ1 B − φ2 B² − ... − φp B^p,
θq(B) := 1 − θ1 B − θ2 B² − ... − θq B^q.

For several input series this generalizes to the multi-input transfer function model

yt = Σ_{j=1}^{m} ( ωj(B) / δj(B) ) xj,t−bj + ( θ(B) / φ(B) ) at.    (3)
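In practice, a model of the form (3) can be approximated by regressing the output on lagged input series while letting an ARMA structure absorb the noise term. The sketch below shows one such approximation with statsmodels' SARIMAX; the data frame, the column names, the chosen lags and the ARIMA order are all hypothetical placeholders, not the series or orders estimated in this paper.

```python
import numpy as np
import pandas as pd
from statsmodels.tsa.statespace.sarimax import SARIMAX

df = pd.DataFrame({                      # hypothetical monthly records
    "yield": np.random.rand(120),
    "rain":  np.random.rand(120),
    "rad":   np.random.rand(120),
})
# lagged inputs play the role of the numerator terms of the transfer function
exog = pd.concat(
    {"rain_l0": df["rain"], "rain_l11": df["rain"].shift(11),
     "rad_l0":  df["rad"],  "rad_l8":   df["rad"].shift(8)}, axis=1).dropna()
endog = df["yield"].loc[exog.index]

model = SARIMAX(endog, exog=exog, order=(1, 1, 1))   # ARMA noise on differenced data
result = model.fit(disp=False)
print(result.summary().tables[1])                    # coefficients of the lagged inputs
forecast = result.forecast(steps=1, exog=exog.iloc[-1:])
```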
[Flowchart of the modeling procedure, from start to end: model identification and parameter estimation of the transfer function.]
[Table 1 (partial): yield 2,3 ton/ha/month (0,7; 1,0; 5,1) in Riau and 1,5 ton/ha/month (1,0; 0,1; 5,5) in Lampung.]
Based on the description of the recorded data (Table 1), the average rainfall in
Riau (LIBE) from 1998 to 2010 was 202,46 mm per month with a standard deviation of
97,61. The average global radiation was 514,68 MJ/m² per month with a standard deviation of
47,6, and the average oil palm yield was 2,34 ton/ha with a standard deviation of 0,70. The
average rainfall in Lampung (SBYE) from 2000 to 2010 was 182,8 mm per
month with a standard deviation of 128,1. The average global radiation was 467,6 MJ/m² per
month with a standard deviation of 42,6, and the average oil palm yield was 1,5 ton/ha with a
standard deviation of 1,0.
Figure 3. Correlation between input and output variables in Riau (LIBE) and Lampung
(SBYE).
Based on the correlation values (Figure 3), in Riau rainfall was significantly and positively
correlated with oil palm yield at t-9 and t-21 (9 and 21 months before harvest). Global radiation
was significantly and negatively correlated with oil palm yield at t-20 (20 months before the
actual month). In Lampung, rainfall was significantly and positively correlated with oil palm yield at
t-8 and t-9 (8 and 9 months before harvest). Global radiation was significantly and negatively
correlated with oil palm yield at t-5 (5 months before the actual month). The rainfall, global
radiation and oil palm yield series were differenced once to obtain input and output
variables that are stationary in mean and variance.
3.2. Prewhitening on Input and Output Series. Prewhitening was carried out on the
input and output series to identify the transfer function model parameters.
o Riau
αt = x1t / ( (1 − 0,781B)(1 − 0,690B¹²) ) ;   βt = y1t / ( (1 − 0,781B)(1 − 0,690B¹²) )
o Lampung
αt = x1t / ( (1 − 0,890B)(1 − 0,687B¹²) ) ;   βt = y1t / ( (1 − 0,890B)(1 − 0,687B¹²) )
o Riau
αt = x2t / ( (1 − 0,737B)(1 − 0,663B¹²) ) ;   βt = y2t / ( (1 − 0,737B)(1 − 0,663B¹²) )
o Lampung
αt = x2t / ( (1 − 0,947B)(1 − 0,667B¹²) ) ;   βt = y2t / ( (1 − 0,947B)(1 − 0,667B¹²) )
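A small sketch of the prewhitening step is given below: fit an AR-type filter to the input series, pass both the input and the output through the same filter, and inspect the cross-correlations of the filtered series. The arrays x and y are placeholder data and the AR(1) order is illustrative, not the seasonal filters listed above.

```python
import numpy as np
from statsmodels.tsa.arima.model import ARIMA
from statsmodels.tsa.stattools import ccf

x = np.random.rand(150)          # differenced input series (e.g. rainfall)
y = np.random.rand(150)          # differenced output series (yield)

fit = ARIMA(x, order=(1, 0, 0)).fit()
phi = fit.arparams[0]            # estimated AR(1) coefficient of the input

alpha = x[1:] - phi * x[:-1]     # prewhitened input
beta  = y[1:] - phi * y[:-1]     # output filtered with the same coefficient

lags = ccf(beta, alpha)[:25]     # sample cross-correlations at lags 0..24
top = np.argsort(-np.abs(lags))[:3]
print(top, lags[top])            # lags with the largest cross-correlation
```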
3.3. Single-Input Transfer Function on Rainfall and Oil Palm Yield. After identification
of the ARIMA model, the transfer function parameters and the diagnostic tests of the model, the final model
obtained for the single-input transfer function between rainfall and the yield variable is
given as:
o Riau
yt 0,388 yt 1 0,0009 x1t 0,0098 x1t 11
at 1,238at 1 0,3298at 2 0,661at 12
0,817at 13 0,217at 14
It was seen that in Riau rainfall affects the oil palm yield at t (actual) and t-11
(11 months before harvest).
o Lampung
yt 0,682 yt 1 0,009 x1t 0,001x1t 7
at 1,125at 1 0,302at 2 0,632at 12
0,712at 13 0,191at 14
It was seen that rainfall in Lampung affects the oil palm yield at t (actual) and
t-7 (7 months before harvest).
3.4. Single-Input Transfer Function on Global Radiation and Oil Palm Yield. After
identification of the ARIMA model, the transfer function parameters and the diagnostic tests of the model,
the final model obtained for the single-input transfer function between global radiation and the
yield variable is given as:
o Riau
yt 1.081yt 2 0,001x2t 0,0019 x2t 8
at 0,834at 1 1,081at 2 0,901at 3
0,64at 12 0,53at 13 0,69at 14 0,93at 15
It was seen that global radiation in Riau affects the oil palm yield at t (actual) and t-8
(8 months before harvest).
o Lampung
yt 0,7127 yt 1 0,0021x 2t 0,0025 x 2t 6
at 0,420at 1 0,208at 2 0,611at 12
0,175at 13 0,436at 14
It was seen that global radiation in Lampung affects the oil palm yield at t (actual) and
t-6 (6 months before harvest).
3.5. Multi-Input Transfer Function: Rainfall, Global Radiation and Oil Palm Yield. The
final multi-input transfer function model obtained was given as:
o Riau
From the model obtained, it is known that the actual oil palm yield in Riau is
affected by the previous yield at t-1, t-2, t-3 (1, 2, 3 months before harvest). Rainfall affects yield at t
(actual), t-2, t-11, t-13 (2, 11, 13 months before harvest), while global radiation affects
yield at t (actual), t-1, t-8, t-9 (1, 8, 9 months before harvest).
o Lampung
In Lampung, oil palm yield is affected by the previous yield at t-1, t-2 (1, 2 months before
harvest). Rainfall affects yield at t (actual), t-1, t-7, t-8 (1, 7, 8 months before harvest), while
global radiation affects yield at t (actual), t-1, t-6, t-7 (1, 6, 7 months before harvest).
3.6. Forecasting. Based on the final model obtained, the oil palm yield from January-
December 2011 was forecasted and presented in Table 2.
Table 2. Forecasting on oil palm yield (ton/ha/month) in Riau and Lampung on January to
December 2011.
Month    Yield 2011, Riau (ton/ha)    Yield 2011, Lampung (ton/ha)
Jan      2,4                          1,4
Feb      2,0                          0,8
Mar      2,1                          0,7
Apr      2,2                          0,8
May      1,8                          1,1
Jun      2,4                          0,5
Jul      2,2                          1,7
Aug      2,1                          1,2
Sep      2,3                          2,2
Oct      2,9                          2,9
Nov      2,1                          2,8
Dec      2,4                          1,8
From the forecasting result, we see that the yield (ton/ha) in Riau is larger than in
Lampung, with a total yield in 2011 of 26,9 ton/ha/year in Riau against 17,9
ton/ha/year in Lampung.
Figure 4. Oil palm yield, actual vs forecast: (a) Riau, (b) Lampung.
Riau: MAPE = ( Σ_{t=1}^{n} |yt − ŷt| / yt ) / n × 100% = 22%

Lampung: MAPE = ( Σ_{t=1}^{n} |yt − ŷt| / yt ) / n × 100% = 18%

Riau: MAD = ( Σ_{t=1}^{n} |yt − ŷt| ) / n = 0,095

Lampung: MAD = ( Σ_{t=1}^{n} |yt − ŷt| ) / n = 0,024
Comparing the actual yield with the forecast results, the MAPE in Riau was 22% and the MAD was
0,095. This means that the yield forecast for Riau has a 22% average error and an average
deviation of 0,095 from the actual values. In Lampung, the MAPE was 18% and the MAD was 0,024, so
the yield forecast for Lampung using this method has an 18% average error and an average
deviation of 0,024 from the actual values. Since the MAPE in Riau is greater than 20%, an ARIMA model
can also be applied in this region to estimate and explain the future level of oil palm yield.
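The two accuracy measures used above are straightforward to compute; a minimal sketch with toy numbers (not the yield series of the paper) is:

```python
import numpy as np

def mape(actual, forecast):
    """Mean Absolute Percentage Error, in percent."""
    actual, forecast = np.asarray(actual, float), np.asarray(forecast, float)
    return 100.0 * np.mean(np.abs(actual - forecast) / actual)

def mad(actual, forecast):
    """Mean Absolute Deviation."""
    actual, forecast = np.asarray(actual, float), np.asarray(forecast, float)
    return np.mean(np.abs(actual - forecast))

# toy numbers only, not the series used in the paper
y_true = [2.4, 2.0, 2.1, 2.2]
y_hat  = [2.1, 2.2, 1.9, 2.5]
print(mape(y_true, y_hat), mad(y_true, y_hat))
```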
4. CONCLUSION
In this paper we have discussed the application of the multi-input transfer function to predict
the future level of oil palm yield, which in general has given good results. However, we still consider that to
build a good forecasting model, a long-term data record is needed. Based on this result, it is still
necessary to examine this relationship using other statistical methods to obtain a better
model with smaller MAPE and MAD.
References
[1] DIVO, D .S. 2010. Probability Analysis of Rainy Event With the Weilbull Distribution as a Basic Management
in Oil Palm Plantation. Proceeding of 2010 Conference on Industrial and Applied Mathematics, ITB Bandung.
[2] CHOW, C.S. 1987. The Seasonal and Rainfall Effects on Palm Oil in Peninsular Malaysia. Proceeding of 1987
Oil Palm/Palm Oil Conference, pp.46-55, Kuala Lumpur.
[3] OCHS R. AND DANIEL C. 1976. Research on techniques adapted to dry regions. In: Oil palm Research (Ed. By
R.H.V. Corley, J.J. Hardon & B.J. Wood), pp. 315-330, Elsevier, Amsterdam.
[4] GOH K.J.2000. Climatic requirements of the oil palm for high yields. In: Managing oil palm for high yields:
agronomic principles (Ed. By Goh K.J.), pp. 1-17, Malaysian Soc.Soil Sci. and Param Agric. Surveys, Kuala
Lumpur.
[5] AHMAD ALWI AND CHAN, K.W.1990.The Future of Oil Palm Yield Forecasting: Guthrie’s Autoregressive
Integrated Moving Average Methode. In: Proc.1989 Int. Palm Oil Dev. Conf. Agriculture (Ed. By. B. S. Jalani
et al.), PP.144-150 Palm Oil Rest. Inst. Malaysia, Kuala Lumpur.
[6] HAMILTON, JAMES.D. 1994.Time Series Analysis. Princeton University Press, 41 William St.Princeton. New
Jersey.
[7] WEI, W.W.S. 1990. Time Series Analysis, Univariate and Multivariate Methods, Canada. Addison Wesley
Publishing Company.
DIVO D. SILALAHI
SMART Research Institute (SMARTRI), PT. SMART Tbk, Indonesia
e-mail: [email protected]
J.P. CALIMAN
SMART Research Institute (SMARTRI), PT. SMART Tbk, Indonesia
e-mail: [email protected]
DYLMOON HIDAYAT
1. INTRODUCTION
Frames were first introduced by Duffin and Schaeffer to study an irregular sampling
problem in the context of non-harmonic Fourier series [1]. Frames are different from bases:
even though they can both represent functions in a Hilbert space as series, frames may be linearly
dependent. This implies that the representation by a frame is not unique. In some applications,
the redundancy of the representation plays an important role, for instance in signal processing. The
redundancy leads to robustness: the presence of uncorrelated noise is less destructive to the
quality of the signal [2]. We developed a Continuously Translated Framelet (CTF) as a
continuous translation and a discrete dilation of a framelet. We showed that a CTF can be
associated to an operator that will be called the continuously translated framelet operator (CTF
Operator). Given a CTF, the corresponding CTF operator is self adjoint, bounded, and
positive. It is well known that every bounded self adjoint operator has a spectral
decomposition [3]. Using the spectral decomposition of the CTF operator, we define a
family of new CTFs.
1.1. Frame and Framelet. Let be a separable Hilbert space with inner product and
The constants m and M are called the frame bounds. If m = M then the frame is called tight.
Given a fixed function we mean by Dj, Tk and the following:
(1)
(2)
(3)
Definition 1. A family is called a (discrete) framelet if there exist two positive constants m
and M such that
for all    (4)
The constants m and M are called the framelet bounds. If m = M then the framelet is called
tight.
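The displayed condition (4) did not survive extraction. For orientation only, the standard discrete framelet (affine frame) condition found in the cited literature has the following form; this is the textbook statement, not necessarily the author's exact wording:

```latex
m\,\|f\|^{2} \;\le\; \sum_{j\in\mathbb{Z}}\sum_{k\in\mathbb{Z}}
\bigl|\langle f,\, D_{j}T_{k}\psi \rangle\bigr|^{2} \;\le\; M\,\|f\|^{2}
\qquad \text{for all } f \in L^{2}(\mathbb{R}).
```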
Lemma 4 [5]. Let { } be a framelet satisfying Equation (4). Then for every positive odd
integer n the family { } remains a frame with the same bounds.
Proof of Theorem 3
From Lemma 4 we can see that the frame condition is true for all positive odd integers n
by the CTF condition. Therefore we can find the adjoint G* of G as follows. Let us write g(j, z)
as a function in L²( ). Then
So F = G*G. Therefore
By the CTF condition in equation (5), the expression is bounded and we have
Since a positive operator is always self adjoint, it is clear that the following corollary is
true.
We know that a self adjoint operator can be decomposed by its spectral family as stated in the
following theorem:
Theorem 8. Let F be a bounded self adjoint operator on a Hilbert space with inf F = m and
sup F = M. Then there exists a spectral family { } on the interval [m, M] such that
and the fact that the spectral family satisfies , implies that are
pairwise orthogonal projections; therefore we have the following definition:
Theorem 10. Let F be as in Equation (7). For each non-negative integer s, is
bounded, positive, self adjoint and admits the following properties:
(i)
and
(ii)
Proof. (i) since and for a positive integers, then , then
and finally
Corollary 11
for
and
for
3. FAMILY OF FRAMELETS
Now we will show that the CTF operator commutes with the dilation and translation operators.
Theorem 13. If F is the CTF operator and Dm is the dilation operator then
FDm = DmF.
Proof. Using a simple substitution, it is not hard to show that
(8)
and
(9)
Moreover,
by equation (8)
by equation (9)
for all f.
by equation (10)
by
equation (11)
for every f.
Finally we prove that the CTF operator preserves dilations and translations.
Corollary 15. For each integer s, the CTF operator preserves dilations and translations.
Proof. Let Dm and Tl be a dilation operator as in (1) and a translation operator as in (2), respectively.
We will show that and .
It is clear that for s = 0, F^(0) = I commutes with Dm and Tl. For s = 1,
FDm = DmF and FTl = TlF
by Theorem 13 and Theorem 14. Hence by induction
(s)
then is a CTF.
(s)
Proof. That is a family of translates and dilates follows from Corollary 15. By Theorem 6 it
suffices to show that the CTF operator G(s) for the CTF (s)
is positive and bounded. Where
Consider
Corollary 18
(s)
is a CTF is a CTF for any
References
[1] DUFFIN, R. AND SCHAEFFER, A., A Class of Non-Harmonic Fourier Series, Trans. Amer. Math. Soc. 72, 341-366,
1952.
[2] GRIBONVAL, R., DEPALLE, P., RODET, X., AND MALLAT, S., Sound Signal Decomposition Using a High
Resolution Matching Pursuit, In Proc. Int. Computer Music Conf., 293-296, 1996.
[3] RIESZ, F. AND SZ.-NAGY, B., Functional Analysis, Translated from the French by Leo F. Boron, Frederick
Ungar Pub., New York, 1955.
[4] DAUBECHIES, I., GROSSMANN, A., AND MEYER, Y., Painless Non-Orthogonal Expansions, J. Math. Physics
27, 1271-1276, 1986.
[5] CHUI, C. K. AND SHI, X. L., n× Oversampling Preserves Any Tight Affine Frame for Odd n, Proc. of the AMS,
121 no. 2, 511-517, 1994.
DYLMOON HIDAYAT
Universitas Pelita Harapan, Department of Mathematics Education.
e-mail: [email protected]
Proceedings of ”The 6th SEAMS-UGM Conference 2011
Applied Mathematics, pp. 389 - 394.
ENDAR H. NUGRAHANI
1. INTRODUCTION
There are essentially three types of models which can be used to examine a system of
vehicular traffic, namely: microscopic, macroscopic, and kinetic models [1,2]. Microscopic
models focus on the modeling of individual cars with their deterministic or stochastic
interactions. Macroscopic models have the form of partial differential equations of
conservation type, which determine the relation between density and flux of a vehicular system.
Kinetic models predict the statistical distribution of cars with respect to their location and
velocity. The relevant literature shows that since the first continuum model describing traffic
flow given by Lighthill and Whitham [3], much progress has been made in the development
of microscopic (follow-the-leader) models on the one hand, and of macroscopic (fluid-type)
models on the other hand [4].
A kinetic model, which is also known as a mesoscopic model, describes the traffic as a
system of interacting gas particles, which is formulated as a Boltzmann-like equation. The
model is based on the time evolution of a one-vehicle probability distribution function in a
phase space [5]. The first mesoscopic (or gas-kinetic) traffic flow model appeared as early as
1960, when Prigogine and Andrews [6] wrote a Boltzmann-like equation to describe the time
evolution of a one-vehicle distribution function in a phase space, where the position and the
velocity of the vehicles play a central role [4].
However, until the 1990s, mesoscopic traffic models did not get much attention from
scientists due to their lack of ability to describe traffic operations outside of the free-flow
regime. Additionally, compared to macroscopic traffic flow models, gas-kinetic traffic
models have a large number of independent variables, which increases the computational
complexity. Nevertheless, in the last decades, the scientific interest in mesoscopic traffic
models has been enriched by the publication of some works that apply these models to
derive macroscopic traffic models. In fact, macroscopic equations for relevant traffic
variables can be derived from a Boltzmann-like traffic equation by averaging over the
instantaneous velocity of the vehicles. This is a well-known procedure in kinetic theory
[1,4,6,7].
The organization of this paper is as follows: Section 2 presents the underlying
multilane microscopic traffic model by considering some reaction thresholds. In Section 3,
the corresponding multilane kinetic model is discussed. Moreover, the results of some
numerical simulations are presented in Section 4. Finally, Section 5 gives a concluding
remark.
Consider a multilane road to be modelled. Let the car under consideration be denoted
by c, and the leading and following cars be denoted by c and c , respectively, whereas the
corresponding cars on the left and right lanes are cl , cl , and cr , cr . The velocities before
and after an interaction are given by v and v'. The maximal velocity is denoted by w, such that all
velocities are between 0 and w.
Let H0 be the minimum headway between cars. The following are the thresholds for
lane changing to the right (HR), lane changing to the left (HL), braking (HB), accelerating (HA),
and free driving (HF):
HR = H0 + vTR
HL = H0 + vTL
HB = H0 + vTB
HA = H0 + vTA
HF = H0 + wTF
with TRS , TLS TB . Therefore, the interactions under consideration can now, according to [1],
be formulated as follows.
Interaction 1 (Lane changing to the right). If v v and H R (v) is satisfied, then the car
will change lane to the right. Thus, a car will be able to pass the car ahead, only if there is
sufficient space on the right lane, i.e. if
Moreover, c and c will accelerate after lane changing with new velocity of
v~ , if xr x H F v~ , if x x H F
v v
v , otherwise v , otherwise
with ṽ, ṽ according to a desired probability distribution function with density fD, i.e.
choose ṽ = FD^{-1}(ξ), with ξ a uniform random variable on the interval (0,1), where

FD(v) = ∫_0^v fD(v̂) dv̂.
Interaction 2 (Lane changing to the left). If v v and if H L (v ) is crossed, then the car
will change lane to the left, only if there is enough space available, i.e. if
As before, c and c will accelerate after lane changing with new velocity of
v~ , if xl x H F v~ , if x x H F
v v
v , otherwise v , otherwise
v v (v v) , ,
with uniformly distributed on [0, 1]. Braking will take place only under condition when
acceleration is still possible, i.e. for any v, v+ the following condition should be satisfied:

H_A(v) ≥ H_B(v), or equivalently T_B / T_A ≤ 1.

If acceleration takes place, the new velocity is

v' = v + β (min(w, v+) − v),   with β uniformly distributed on [0, 1].

Acceleration is allowed only if there is still the possibility of braking, i.e. the velocities v, v+
should satisfy

H_B(v) ≤ H_A(v), or equivalently T_A / T_B ≥ 1.
Interaction 5 (Acceleration II, Free driving). If v ≤ v+ and the free driving threshold H_F(v) is
satisfied, then the car will accelerate freely towards the desired velocity. The new velocity is
distributed according to the desired probability distribution with density f_D, i.e.

v' = F_D^{-1}(η).
In the kinetic description, the one-vehicle distribution function f = f(x, c, t) evolves according to a
Boltzmann-like equation of the form

∂f/∂t + c ∂f/∂x = Q(f, f),

where the interaction term

Q(f, f) = (1 − p) ∫_c^w (c' − c) f(x, c', t) f(x, c, t) dc' − (1 − p) ∫_0^c (c − c') f(x, c', t) f(x, c, t) dc'

describes the deceleration processes due to slower vehicles which cannot be immediately
overtaken. The first part of the interaction term corresponds to situations where a vehicle with
velocity c' must decelerate to velocity c, causing an increase of the one-vehicle distribution
function, while the second one describes the decrease of the one-vehicle distribution function
due to situations in which vehicles with velocity c must decelerate to an even slower velocity c'.
The positive part of this interaction term is also known as the gain term, and the negative part
as the loss term [1].
From another point of view, let f_α(x, v) be the probability density function of a car
in lane α, and let f_α^(2)(x, v, h, v+) be the corresponding pair density of that car and the car ahead,
where h denotes the headway and v+ the velocity of the leading car. These functions satisfy

f_α(x, v) = ∫_0^w ∫_0^∞ f_α^(2)(x, v, h, v+) dh dv+,

and q_α(h; v, f) denotes the probability density of the headway to the car ahead of a car with velocity v,
where the cars have velocity distribution function f.
The kinetic equation for the distribution functions (f_1, ..., f_N) on N lanes is established
by defining gain (G) and loss (L) terms for each interaction. The resulting equation reads

∂_t f_α + v ∂_x f_α = C̃_α(f_1^(2), ..., f_N^(2), f_1, ..., f_N),

where the interaction term C̃_α is defined as

C̃_α(f_1^(2), ..., f_N^(2), f_1, ..., f_N)
  = (G̃_B − L̃_B)(f_{α−1}, f_α^(2), f_{α+1}) + (G̃_A − L̃_A + G̃_F − L̃_F)(f_α^(2))
  + [G̃_R(f_{α−1}^(2), f_α) − L̃_L(f_{α−1}, f_α^(2), f_{α+1})] (1 − δ_{α,1})
  + [G̃_L(f_α, f_{α+1}^(2), f_{α+2}) − L̃_R(f_α^(2), f_{α+1})] (1 − δ_{α,N}).
4. NUMERICAL SIMULATION
A numerical study has been carried out for the multilane microscopic model based on the
above-mentioned interaction thresholds, following [8]. The simulation assumes a 3-lane
highway with the corresponding threshold parameters. The computed quantity is the
average velocity of the cars in the system, denoted by u(t). The simulation is carried out for
various values of the density parameter, i.e. 0.1, 0.2, 0.4, 0.6. The result is given in Figure 1.
This result shows that cars move at lower velocity when the density becomes
higher, as a result of the more limited space for the vehicles on the highway. This agrees
quite well with traffic operations in real-life traffic.
5. CONCLUDING REMARK
Vehicular traffic systems can be explained quite well using a Boltzmann-like kinetic
equation, which originally models the phenomena of interacting gas particles. In the case of a
multilane highway, the microscopic interactions between vehicles play the role of the
interacting gas particles in the construction of a kinetic model of vehicular traffic.
Acknowledgement. The author would like to thank the Department of Mathematics, IPB,
for its financial support. The author also thanks her former student Desyarti Safarini TLS.
References
[1] KLAR, A. AND WEGENER, R., A hierarchy of models for multilane vehicular traffic I: modeling, SIAM J. on
App. Math. 59:983-1001, 1998.
[2] ILLNER, R., BOHUN, C.S., MCCOLLUM, S. AND VAN ROODE, T. Mathematical Modelling: A Case Studies
Approach. AMS, Providence, Rhode Island, 2005.
[3] LIGHTHILL, M. J. AND WHITHAM, G. B., On kinematic waves: II. A theory of traffic flow on long crowded
roads. Proceedings of the Royal Society, Series A 229, 317-345, 1955.
[4] MARQUES JR., W. AND MENDEZ, A. R., On the kinetic theory of vehicular traffic flow: Chapman-Enskog
expansion versus Grad’s moment method, arXiv:1011.6603v1 [math-ph], 2010.
[5] HERTY, M., KLAR, A., PARESCHI, L. General kinetic models for vehicular traffic flow and Monte Carlo
methods, Preprint, TU Kaiserslautern - Germany, 2005.
[6] PRIGOGINE, I. AND ANDREWS, F., A Boltzmann like approach for traffic flow, Oper. Res. 8: 789, 1960.
[7] HELBING, D. AND TREIBER, M., Enskog equations for traffic flow evaluated up to Navier-Stokes order,
Granular Matter 1, 21, 1998.
[8] KLAR, A. AND WEGENER, R., A hierarchy of models for multilane vehicular traffic II: numerical
investigations, SIAM J. on App. Math. 59:1002-1011, 1998.
ENDAR H. NUGRAHANI
Bogor Agricultural University.
e-mails: [email protected] / [email protected]
Proceedings of ”The 6th SEAMS-UGM Conference 2011”
Applied Mathematics, pp. 395–402.
Fajar Adi-Kusumo
1. INTRODUCTION
This paper is a sequel of [2]. In [2], a 6-dimensional system of ODEs which represents
a system of three coupled oscillators was studied. The nonlinearity of the system
is quadratic, preserves the energy, and is perturbed by a linear term. The system is
a generalization of the system in [3] in the sense of the dimension and the resonance. In
[3], the system is a 4-dimensional system which is reduced to a 3-dimensional system in
normal form. Analysis of the normal form of the 3-dimensional system was done in
[3, 4, 1]. In [2] the author combines two classes of resonance, i.e. the widely-separated
frequencies, which is the extreme type of higher order resonance, and the lowest order
resonance in the system.
In applications, the systems in [2] and [3] are motivated by an atmospheric model
which is called Ultra Low Frequency Variability (ULFV). The model represents the long-
time behavior of the interaction patterns in the atmosphere.
Using the averaging method and some further coordinate transformations, the 6-dimensional
system was transformed into a 5-dimensional system, see [2]. The transformed system is
called the normal form. The normal form preserves the general properties of the original
system, but reduces its dimension. In this paper we will analyze the basic properties of
the normal form.
2. PROBLEM FORMULATION
Let us now consider a system of ODE in R5 :
ṙ = δ2 xr + δ1 pr + µ1 r
ṗ = (2δ1 q + ω1 )q − δ1 r2 + γxp + µ2 p
q̇ = −(2δ1 q + ω1 )p + γxq + µ2 q (1)
ẋ = (αx + βy + ω2 )y − δ2 r2 − γ(p2 + q 2 ) + µ3 x
ẏ = −(αx + βy + ω2 )x + µ3 y.
We assume that the parameters µi , i = 1, 2, 3, are small parameters (0 < µi ≪ 1). For
µi = 0 the system is called the unperturbed system, which is conservative. In this case, the
solutions of System (1) lie on an invariant hyper-sphere (a sphere in R5 ).
To prove the theorem, we use the first integral of System (1) for µi = 0, that is
V = (1/2)(r^2 + p^2 + q^2 + x^2 + y^2). For µi ≠ 0, the same function V serves as a Lyapunov
function of System (1) and proves the global stability of the trivial equilibrium.
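A quick numerical check of this statement is to integrate System (1) and monitor V along a trajectory; in the conservative case µi = 0 the value of V should stay constant. The parameter values below are arbitrary illustrative choices, not taken from the paper.

```python
import numpy as np
from scipy.integrate import solve_ivp

# Illustrative (not from the paper) parameter values for System (1).
d1, d2, g, a, b, w1, w2 = 0.3, 0.2, 0.5, 0.1, 0.4, 1.0, 2.0
mu1 = mu2 = mu3 = 0.0          # mu_i = 0: the conservative (unperturbed) case

def rhs(t, z):
    r, p, q, x, y = z
    return [d2*x*r + d1*p*r + mu1*r,
            (2*d1*q + w1)*q - d1*r**2 + g*x*p + mu2*p,
            -(2*d1*q + w1)*p + g*x*q + mu2*q,
            (a*x + b*y + w2)*y - d2*r**2 - g*(p**2 + q**2) + mu3*x,
            -(a*x + b*y + w2)*x + mu3*y]

def V(z):
    # first integral V = (1/2)(r^2 + p^2 + q^2 + x^2 + y^2)
    return 0.5 * float(np.sum(np.asarray(z)**2))

sol = solve_ivp(rhs, (0.0, 50.0), [0.3, 0.2, -0.1, 0.4, 0.1], rtol=1e-9, atol=1e-12)
print(V(sol.y[:, 0]), V(sol.y[:, -1]))   # V is (numerically) conserved for mu_i = 0
```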
The other interesting parameter is γ. In [2], the parameter shows the interaction
between the system in the extreme type of higher order resonance class and the one in
the lowest order resonance class. In Sections 3 and 4 we will show that this
parameter influences the complexity of the equilibrium points of the system and also
the existence of their bifurcation values.
The analysis in this paper is focused on exploring some properties of the conservative
system, that is, System (1) with µ1 = µ2 = µ3 = 0. In this case, we will show the
existence of a manifold of equilibria and describe the dynamics of the system on the
hyper-sphere.
which is a member of the manifold of equilibria (2). The equilibrium (3) lies on the
intersection point between the manifold of equilibria (2) and the hyper-sphere.
and ω2^2 δ1^2 + β γ ω1^2 ≥ 0.

δ1 β y^2 − (γ + 2 δ2) q^2 + δ1 ω2 y − δ2 ω1 q = 0.    (4)
Equation (4) is an ellipse for γ > −2δ2 , a hyperbola for γ < −2δ2 , and a parabola for
γ = −2δ2 . Furthermore, we parameterize q = q◦ and we have an equilibrium point
(r, p, q, x, y) = ( √(q◦ (2 δ1 q◦ + ω1) / δ1), 0, q◦, 0, −ω2/(2β) ± √H/(2 δ1 β) ),    (5)

with H = 4 δ1^2 β (γ + 2 δ2) q◦^2 + 4 β δ1 δ2 ω1 q◦ + ω2^2 δ1^2, which lies on the hyper-sphere.
For δ1 > 0, the value of r of Equilibrium (5) exists for q◦ ≤ −ω1/δ1 or q◦ ≥ 0.
Otherwise, for δ1 < 0, the value of r of the equilibrium exists for q◦ ≤ 0 or q◦ ≥ −ω1/(2 δ1).
Furthermore, the value of y of Equilibrium (5) exists for H ≥ 0.
Equilibrium (6) exists for −γ δ2 δ1^2 (4 δ2^2 + 4 δ2 γ + γ^2) r◦^2 + ω1^2 δ2 γ ≥ 0.
Proof. Firstly, we consider the manifold of equilibria in the (x, y)-plane, which is the line l
(Ω2 = 0). The value of R1 is computed by calculating the distance between this line
and the trivial equilibrium. For R < R1 there is no intersection point between the
hyper-sphere and the line, so the nontrivial equilibrium which is on the line Ω2 = 0 does
not exist. The nontrivial equilibrium which is on the line l exists only for R ≥ R1 . For
R = R1 the hyper-sphere has only one intersection point with the line l, and for
R > R1 the hyper-sphere intersects the line l at two points. Secondly, on the
(p, y)-plane we have a manifold of equilibria, see Equation (2), and for γ ≠ 0 we have
4 δ1^2 γ p^2 − 4 β δ1^2 y^2 − 4 ω2 δ1^2 y + γ ω1^2 = 0.    (8)

The intersection of the hyper-sphere with the (p, y)-plane is the circle R^2 = p^2 + y^2.
If we substitute the equation of the circle into Equation (8), then we have

R^2 = ((β + γ)/γ) y^2 + (ω2/γ) y − ω1^2/(4 δ1^2).    (9)
Equation (9) has its maximum value at y = −ω2/(2(β + γ)), so we obtain the bifurcation
value R2 by substituting this value of y into Equation (9). For γ > 0, the manifold of
equilibria (2) is an ellipse. In this case, the hyper-sphere has no intersection point with
the ellipse for R > R2, it has one intersection point for R = R2, and two intersection
points for R < R2.
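As a small numerical illustration of this step, the snippet below evaluates Equation (9) at y = −ω2/(2(β + γ)) to obtain R2; the parameter values are arbitrary illustrative choices (picked so that the resulting R2^2 is positive), not values from the paper.

```python
import numpy as np

def R2_from_eq9(beta, gamma, delta1, omega1, omega2):
    """Evaluate R^2 of Equation (9) at y = -omega2/(2(beta+gamma)), the
    extremal point used in the text, and return R2 when it is real."""
    y_star = -omega2 / (2.0 * (beta + gamma))
    R_sq = ((beta + gamma) / gamma) * y_star**2 + (omega2 / gamma) * y_star \
           - omega1**2 / (4.0 * delta1**2)
    return np.sqrt(R_sq) if R_sq >= 0 else float("nan")

# arbitrary illustrative values
print(R2_from_eq9(beta=-0.9, gamma=0.5, delta1=0.3, omega1=1.0, omega2=2.0))
```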
By Theorem 4.1 we know that the existence of the nontrivial equilibria of Sys-
tem (7) depends on the radius of the hyper-sphere. In applications, the radius of the
hyper-sphere represents the energy of the system, and the nontrivial equilibrium can be
interpreted as a structure in the atmosphere which does not change in time.
5. CONCLUDING REMARKS
The analysis in this paper is focused on the basic properties of System (1), namely
the invariance properties, the symmetries, and also the conservative properties. For
the conservative system, we show the existence of the equilibrium points and the
bifurcations related to the radius of the hyper-sphere. Some complicated problems are
still left open in this case, i.e. more complicated bifurcation values related to
the radius of the hyper-sphere, which occur at the intersection points between the hyper-
sphere and the (r, q, y)-space and between the hyper-sphere and the (r, p, x, y)-space, and
the stability of the equilibria.
Another interesting problem which is still open is the dynamics of System
(1) for µi ≠ 0. For p = 0, q = 0, and δ1 = 0, the dimension of System (1) can
be reduced to a 3-dimensional system. In [4] and [1] the authors found that there are
chaotic solutions of the system.
References
[1] Adi-Kusumo, F., Tuwankotta, J. M., and Setya-Budhi, W., Chaos and Strange Attractors in
Coupled Oscillators with Energy-preserving Nonlinearity, J. Phys. A: Math. Theor. 41, 255101
(17pp), 2008
[2] Adi-Kusumo, F., Normalisation of A Coupled-Three Oscillator with Energy-Preserving Quadratic
Nonlinearity Near 1 : 2 : ε - Resonance, Proceedings of IICMA 2009, Applied Mathematics, pp.
335-340, 2010.
[3] Tuwankotta, J. M., Widely Separated Frequencies in Coupled Oscillators with Energy-preserving
Quadratic Nonlinearity, Physica D 182, p.125-149, 2003.
[4] Tuwankotta, J. M., Chaos in coupled oscillators with widely separated frequencies and energy-
preserving nonlinearity, Int. Journal on Nonlinear Mechanics 41, p. 180-191, 2006.
Fajar Adi-Kusumo
Applied Mathematics Group, Department of Mathematics, Gadjah Mada University, INDONE-
SIA.
e-mails: f [email protected]
Proceedings of ”The 6th SEAMS-UGM Conference 2011”
Applied Mathematics , pp. 403 – 412.
B. P.ISKANDAR
Abstract. This paper deals with a periodic replacement policy based on the number of
failures after the expiry of warranty. The product is sold with a two-dimensional non-
renewing failure replacement warranty. We model product failures using the one-
dimensional approach, which allows modeling the effect of age and usage on the
product's degradation. Under the periodic replacement policy, for a given usage rate y, the
product is repaired with minimal repair when it fails at the first n − 1 failures, and it is
replaced with a new one at the nth failure. We obtain the global optimal value of n which
minimizes the expected cost per unit time to the buyer and give a numerical example to
illustrate the optimal solution.
Keywords and Phrases: Periodic replacement with n failure policy, two-dimensional
warranty, expected cost per unit time.
1. INTRODUCTION
Reviews of the various maintenance policies that have been developed are abundant; examples can be found in
[21] and [24]. Maintenance policies following the expiry of a warranty have been studied by
[23], [9], [25] and [10]. However, most maintenance policies studied are characterized by one
time scale, e.g. age.
A maintenance policy characterized by two time scales, i.e. age and usage, has received
less attention in the literature (see [5]), although in reality there are many products sold with a
two-time-scale warranty scheme. In the automotive industry, many products are sold with a two-
dimensional warranty. For example, a dump truck is warranted for 36 months or 30000
miles, whichever comes first. Recently, [6] introduced a periodic replacement policy for a
two-dimensional non-renewing failure replacement warranty, and they extended the model
into a hybrid policy [7]. Those papers dealt with replacement in terms of the product's age. In
practice, there are alternative ways of deciding when replacement should be carried out. [18]
pointed out that for a large and complex system one could make only minimal repairs at each
failure and make a planned replacement at periodic times. Alternatively, a planned
replacement is done when the nth failure has occurred.
In this paper, we consider a replacement policy in which minimal repair is carried out
for every failure until the penultimate failure (the (n−1)th failure), and at the last predetermined
failure (the nth failure) the product is replaced with a new one. Different from earlier papers
addressing the same problems [2], [18], [13], the present paper takes a two-
dimensional warranty policy into consideration. The outline of the paper is as
follows. In Section 2 we give the model formulation. The periodic replacement policy
considered depends on the usage rate and is characterized by one parameter. Section 3
deals with the analysis of the optimal replacement policy. Section 4 presents numerical
examples for the case where the product has a Weibull failure distribution. Finally, in Section
5, we conclude with a brief discussion of future research.
2. MODEL FORMULATION
2.1 Warranty Policy and Coverage. The product is sold with a two-dimensional non-
renewing failure replacement warranty (NFRW) with warranty region Ω, the rectangle
[0, W) × [0, U), where W is the time limit and U is the usage limit. With NFRW, all failures
under warranty are rectified at no cost to the buyer. It is assumed that the rectification is done
through a minimal repair and the repaired product comes with the original warranty. The
warranty ceases at the first instance when the age of the product reaches W or its usage
reaches U, whichever occurs first. Wy denotes the warranty expiry time when the usage rate
is y (see Fig. 1).
We assume that the usage rate Y (e.g., the annual distance travelled for an automobile)
varies across the customer population but does not change for a given consumer. Here Y is a
random variable with a density function g(y), 0 ≤ y < ∞. Conditional on Y = y, the total
usage at age x is a linear function of x and is given by
u = y x.    (3)
Figure 1. The warranty region: the warranty expiry time Wy as a function of the usage rate y, for the cases y ≥ U/W and y < U/W (axes: usage u versus age x).

Wy = W     if y ≤ U/W,    (1)
Wy = U/y   if y > U/W.    (2)
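As a small illustration of Eqs. (1)-(2), the sketch below computes Wy for a few usage rates, using the warranty limits W = 2 years and U = 2 (10^4 km) taken from the numerical example in Section 4; the function name is ours.

```python
def warranty_expiry(y, W=2.0, U=2.0):
    """Warranty expiry time Wy (Eqs. (1)-(2)): the age limit W applies when
    y <= U/W, otherwise the usage limit U is reached first, at age U/y.
    W is in years and U in 10^4 km, as in the numerical example of Section 4."""
    return W if y <= U / W else U / y

print([warranty_expiry(y) for y in (0.5, 1.0, 2.0, 4.0)])   # -> [2.0, 2.0, 1.0, 0.5]
```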
B) Modelling First Failure: For a product sold with a two-dimensional warranty, one
needs to model the product's degradation taking into account both age and usage. The
authors in [20] have introduced a more appropriate model, which uses the accelerated failure
time (AFT) model, to represent the effect of the usage rate on degradation. Let y0 denote the
nominal usage rate value associated with component reliability. When the actual usage rate is
different from this nominal value, the component reliability can be affected and this in turn
affects the product reliability. As the usage rate increases above the nominal value, the rate of
degradation increases and this, in turn, accelerates the time to failure. Consequently, the
product reliability decreases [increases] as the usage rate increases [decreases]. Using the
AFT formulation, if T0 [Ty] denotes the time to first failure under usage rate y0 [y], then

Ty = (y0 / y)^γ T0.    (6)

Under usage rate y, the time to first failure has distribution function F(x; θ(y)).
The hazard and the cumulative hazard functions associated with F(x; θ(y)) are given by

h(x; θ(y)) = f(x; θ(y)) / (1 − F(x; θ(y))),    (9)

H(x; θ(y)) = ∫_0^x h(t; θ(y)) dt.    (10)
2.3 The Expected Cost Per Unit Time. We first consider a periodic replacement policy for a
given usage rate y and then the policy for various values of the usage rate. Let ny denote the
number-of-failures parameter of the periodic replacement policy for a given usage rate. The
periodic replacement policy is defined as follows.
For a given usage rate y, the product is repaired with minimal repair when it fails at the first
ny − 1 failures, and it is replaced with a new one at the ny-th failure.
We seek the optimal value of ny which minimizes the expected cost per unit time to the buyer,
and we let n*y denote this optimal value as y varies.
For a given usage rate y, the expected cost per unit time is obtained as follows.
Let J(ny) denote the expected cost per unit time, which is given by

J(ny) = E[Cost per cycle] / E[Cycle length],    (11)
as in [22]. Let Cm and Cr (Cm < Cr) denote the cost of each minimal repair and the cost of a
replacement, respectively. Since all failures during the cycle are restored by minimal
repair, failures occur according to a non-homogeneous Poisson process in (0, x) with
intensity function λy(x) = h(x; θ(y)). Let Cd denote the downtime cost incurred by the buyer at each
minimal repair. Sold with NFRW, the cost per cycle depends on whether replacement is
performed within or outside the warranty region. Therefore, from the buyer's perspective, the cost
per cycle is

E[Cost per cycle] = Cd H(Wy) + [ny − H(Wy) − 1](Cd + Cm) + Cr,    (12)

where H(Wy) = ∫_0^{Wy} λy(x) dx. This cost is composed of the downtime cost at each minimal
repair within the warranty region, and the downtime and minimal repair cost at each minimal repair after
the warranty expiry.
The expected cycle length is

E[Cycle length] = lim_{T→∞} [ T P(Y_{ny} > T) + ∫_0^T x dP(Y_{ny} ≤ x) ].    (13)
Similarly, the expected cost per cycle in the second case is

E[Cost per cycle] = Cd H(Wy) + Cr,    (15)

where H(Wy) = ∫_0^{Wy} λy(x) dx, and the expected cycle length is

E[Cycle length] = Wy P(Y_{ny} > Wy) + ∫_0^{Wy} x dP(Y_{ny} ≤ x).    (16)
Minimizing J(ny) leads to the optimality condition

( Σ_{j=0}^{ny−1} ∫_0^∞ p_j(x) dx ) / ( ∫_0^∞ p_{ny}(x) dx ) − ny + H(Wy) + 1 ≥ [Cd H(Wy) + Cr] / (Cd + Cm),    (18)

where p_j(x) = [H(x)]^j e^{−H(x)} / j!. Next, let L(ny) denote the left-hand side of (18), i.e.

L(ny) = ( Σ_{j=0}^{ny−1} ∫_0^∞ p_j(x) dx ) / ( ∫_0^∞ p_{ny}(x) dx ) − ny + H(Wy) + 1.    (19)
Then we have the following theorem, as in [13] and [18] for the case of a non-warranted product.
Theorem 1. Suppose that the product is sold with a two-dimensional NFRW, λy(x) is
IFR, and λy(∞) ≥ [Cd H(Wy) + Cr] / (Cd + Cm). Then there exists a finite and unique solution
n*y satisfying

L(ny) ≥ [Cd H(Wy) + Cr] / (Cd + Cm),   ny = 1, 2, ...,    (20)

where L(ny) is given in eq. (19). The corresponding expected cost per unit time is given by
eq. (14).
Proof. Since lim_{ny→∞} L(ny) ≥ [Cd H(Wy) + Cr] / (Cd + Cm), L(ny + 1) − L(ny) is non-negative,
and ∫_0^∞ p_{ny}(x) dx is a decreasing function of ny. Consequently, there is a finite and unique
optimal solution n*y, with L(n*y) ≥ [Cd H(Wy) + Cr] / (Cd + Cm). The corresponding expected cost per unit
time is given by eq. (14).
where p_j(x) = [H(x)]^j e^{−H(x)} / j!. Next, let L(ny) denote the left-hand side of (21), i.e.

L(ny) = ( Σ_{j=0}^{ny−1} ∫_0^{Wy} p_j(x) dx ) / ( ∫_0^{Wy} p_{ny}(x) dx ).    (22)
Theorem 2. Suppose that the product is sold with a two-dimensional NFRW, λy(x) is
IFR, and λy(∞) ≥ Cr / Cd. Then there exists a finite and unique solution n*y satisfying

L(ny) ≥ Cr / Cd,   ny = 1, 2, ...,    (23)

where L(ny) is given in eq. (22). The corresponding expected cost per unit time is given by
eq. (17).
By the same argument as in the proof of Theorem 1, there is a finite and unique
optimal solution n*y, and hence L(n*y) ≥ Cr / Cd. The corresponding expected cost per unit
time is given by eq. (17).
Corollary 1. Suppose that the product is sold with a two-dimensional NFRW, h(x; θ(y))
is IFR with x, y ≥ 0, and Cr, Cd > 0.
If L(ny) ≥ [Cd H(Wy) + Cr] / (Cd + Cm), then there exists a finite and unique solution n_y^{G*}
with x_y ≥ Wy, with the corresponding expected cost per unit time as in (14).
If [Cd H(Wy) + Cr] / (Cd + Cm) > L(ny) ≥ Cr / Cd, then n_y^{w*} satisfies x_y ≤ Wy, with the
corresponding expected cost per unit time as in (17).
If L(ny) < Cr / Cd, then n_y^{G*} satisfies x_y ≤ Wy, with the corresponding expected cost per unit time as
in (17).
Proof: It is clear as a consequence of Theorems 1 and 2.
4. NUMERICAL EXAMPLES
We consider a special case where the failure distribution function, with nominal design
usage rate y0 and scale parameter θ0, is F0(x; θ0) = 1 − exp(−(x/θ0)^β), where β is the shape
parameter. The conditional failure distribution function, given the usage rate y, is
F(x; θ(y)) = 1 − exp(−(x/θ(y))^β), with θ(y) given by (7). The hazard function associated
with F(x; θ(y)) is h(x; θ(y)) = (β/θ0)(y/y0)^{γβ}(x/θ0)^{β−1}, and its cumulative hazard
function is H(Wy) = (Wy/θ0)^β (y/y0)^{γβ}. In what follows we obtain the optimal
solution n*y, provided that n*y ≥ 1 + Cr/(1 + Cd) and n_y^w = [H(Wy)] + 1, where [k] is defined as
the greatest integer contained in k.
For the numerical examples we consider the following values: (1) Warranty Policy: W = 2
(years) and U = 2 (10^4 km); (2) Design Reliability: y0 (10^4 km per year), θ0 = 1 (year),
and β = 2; (3) AFT Model: γ = 1.5; (4) Costs: Cm = Cd = 1, Cr = 5. The numerical computation is
done with Maple 9.5. The results in Table 1 indicate that an increase in the usage rate leads to
earlier replacement in terms of the number of failures. Furthermore, Table 2 shows that if the
costs of downtime and minimal repair are higher, due to the number of failures (and the
increase of the usage rate), then the replacement period is shorter than in the
opposite case. This is a realistic conclusion, since buyers would normally undertake
replacement early to avoid a high total maintenance cost due to the high penalty of downtime and
repair costs.
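Because several of the expressions above involve the cumulative hazard H and the Poisson terms p_j, a quick way to sanity-check an n-failure policy of this type is to estimate J(ny) by direct simulation of the failure process. The sketch below does this for the Weibull/AFT example of this section, following the verbal cost rules (downtime cost Cd per minimal repair under warranty, Cd + Cm per repair after expiry, and Cr at replacement). The cumulative hazard H(x) = [(x/θ0)(y/y0)^γ]^β and the value y0 = 1 are assumptions, and all names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)

# Weibull/AFT example of Section 4 (y0 = 1 is an assumption)
theta0, beta, gamma, y0 = 1.0, 2.0, 1.5, 1.0
Cd, Cm, Cr = 1.0, 1.0, 5.0
W, U = 2.0, 2.0

def Wy(y):                       # warranty expiry time, Eqs. (1)-(2)
    return W if y <= U / W else U / y

def failure_times(n, y):
    """First n failure times under minimal repair: H(T_k) = E_1 + ... + E_k with
    E_i ~ Exp(1) and cumulative hazard H(x) = ((x/theta0) * (y/y0)**gamma)**beta."""
    u = np.cumsum(rng.exponential(size=n))
    return theta0 * (y0 / y) ** gamma * u ** (1.0 / beta)

def J_estimate(n, y, n_rep=10000):
    """Monte Carlo estimate of the expected cost per unit time for the n-failure policy."""
    costs, lengths = 0.0, 0.0
    wy = Wy(y)
    for _ in range(n_rep):
        t = failure_times(n, y)
        repairs = t[:-1]                          # failures 1 .. n-1 are minimally repaired
        in_w = np.count_nonzero(repairs <= wy)    # repairs covered by the warranty
        costs += Cd * in_w + (Cd + Cm) * (len(repairs) - in_w) + Cr
        lengths += t[-1]                          # cycle ends at the n-th failure
    return costs / lengths

y = 1.0
print({n: round(J_estimate(n, y), 3) for n in range(1, 8)})
```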
Table 1: n*y and J(n*y) for 0.8 ≤ y ≤ 3.0, for the warranty regions W = [0, 2) × [0, ∞) and W = [0, 2) × [0, 2).
5. CONCLUDING REMARK
In this paper, we have studied a periodic replacement with nth failure policy for a
product sold with a two-dimensional warranty. For the case of two-dimensional warranties, one
can also study other replacement policies, such as a replacement based on cumulative repair cost [15],
[16], [19]. These topics are currently under investigation.
References
[1] BARLOW, R.E., AND HUNTER, L., “Optimal preventive maintenance policies,” Operations Research, vol. 8, pp.
90–100, 1960.
[2] BARLOW, R.E., PROSCHAN, F., HUNTER, C.H., Mathematical Theory of Reliability, New York: John Wiley,
1965.
[3] BLISCHKE, W.R., AND MURTHY, D.N.P., Reliability: Modeling,Prediction, and Optimisation, New York: John
Wiley, 2000.
[4] CHUN, Y.H., AND TANG, K., “Cost analysis of two-attribute warranty policies based on the product usage rate”,
IEEE Transactions in Engineering Management, vol. 46, pp. 201-209, 1999.
[5] FRICKENSTEIN, S.G., AND WHITAKER, L.R, “Age replacement policies in two time scales,” Naval Research
Logistics, vol. 50, pp. 592-613, 2003.
[6] HUSNIAH, H., AND ISKANDAR, B.P., An Optimal Periodic Replacement Policy for a Product Sold with a Two-
Dimensional Warranty. Proceedings of the 9th Asia Pacific Industrial Engineering & Management Systems,
pp. 232-238, 2008.
[7] HUSNIAH, H., U. PASARIBU, A.H. HAKIM AND B.P. ISKANDAR. A Hybrid Minimal Repair and Age Replacement
Policy for Warranted Products. Proceedings of the 2nd Asia Pacific Conference on Manufacturing System, pp.
8.25-8.30, 2009.
[8] ISKANDAR, B.P., WILSON R.J., AND MURTHY D.N.P. Two-dimensional combination warranty policies. RAIRO
Operational Research, 28: 57-75, 1994.
[9] JUNG, G..M., AND PARK, D.H., Optimal maintenance policies during the post-warranty period. Reliability
Engineering and System Safety, 82:173-185. 2003.
[10] JUNG, K.M., HAN, S.S., AND PARK, D.H. , Optimization of cost and downtime for replacement model following
the expiration of warranty, Reliability Engineering and System Safety, 93:995-1003, 2008.
[11] LAWLESS, J. Statistical Models and Methods for Lifetime Data, Wiley, New York, 1982.
[12] LAWLESS J., HU, J., AND CAO, J., Methods for estimation of failure distributions and rates from automobile
warranty data, Lifetime Data Analysis , 1, 227-240, 1995.
[13] MAMABOLO R.M., AND BEICHELT F.E., Maintenance policies with minimal repair, Economic Quality Control,
19, p.143-166, 2004.
[14] MURTHY, D.N.P., AND ISKANDAR, B.P., A new shock damage model: Part I - Model formulation and analysis,
Reliability Engineering and System Safety, 31, p. 191-208, 1991.
[15] MURTHY, D.N.P., AND ISKANDAR B.P., A New shock damage model: Part II-Optimal maintenance policies,
Reliability Engineering and System Safety, 31 p. 211-231, 1991.
[16] MURTHY, D.N.P., AND WILSON, R.J., Modelling two-dimensional warranties, In: Proceedings of the Fifth
International Symposium on Applied Stochastic Models and Data Analysis, Granada, Spain, 481-492, 1991.
[17] MOSKOWITZ, H., AND CHUN, Y.H. ,A Poisson regression model for two-attribute warranty policies.Naval
Research Logistics, 41,355-375, 1994.
[18] NAKAGAWA, T., Maintenance Theory of Reliability. Springer-Verlag, London, 2005
[19] NAKAGAWA, T., Shock and Damage Models. Springer-Verlag, London, 2007.
[20] NAT, J., ISKANDAR, B.P., MURTHY, D.N.P. , A repair-replace strategy based on usage rate for items sold with a
two-dimensional warranty. Reliability Engineering and System Safety, 94, 611-617, 2009.
[21] PIERSKALLA, W.P., AND VOELKER, J .A. , A survey of maintenance models: the control and surveillance of
deteriorating systems, Naval Research Logistics, 23, 353-388, 1976.
[22] ROSS, S.M., Stochastic Processes. John Wiley & Sons, Inc., Canada, 1996.
[23] SAHIN, I.,AND POLATOGLU, H., Maintenance strategies following the expiration of warranty. IEEE Transactions
on Reliability, 45(2), 220-228, 1996.
[24] VALDEZ-FLORES, C., AND FELDMAN, R.M., A survey of preventive maintenance models for stochastically
deteriorating single-unit systems. Naval Research Logistics, 36, 419-446, 1989.
[25] YEH, R.H., CHEN, M.Y., AND LIN, C.Y., Optimal periodic replacement policy for repairable products under
free-repair warranty. EJOR, 176, 1678-1686, 2007.
HENNIE HUSNIAH
Department of Industrial Engineering Institut Teknologi Bandung, Jalan Ganesa 10, Bandung
40132, Indonesia
e-mail: [email protected]
UDJIANNA PASARIBU
Department of Mathematics Institut Teknologi Bandung, Jalan Ganesa 10, Bandung 40132,
Indonesia.
e-mail: [email protected]
A. HAKIM HALIM
Department of Mathematics Institut Teknologi Bandung, Jalan Ganesa 10, Bandung 40132,
Indonesia.
e-mail: [email protected]
BERMAWI. P. ISKANDAR
Department of Mathematics Institut Teknologi Bandung, Jalan Ganesa 10, Bandung 40132,
Indonesia.
e-mails: [email protected]
Proceedings of ”The 6th SEAMS-UGM Conference 2011”
Applied Mathematics, pp. 413 - 418.
Abstract. The basal ganglia are the part of the brain which balances the motoric pathways of neuronal signal
movements. Parkinson's disease is one of the diseases caused by basal ganglia disability. This
neurodegenerative disease is caused by a deficit of dopamine produced by the Substantia Nigra pars
Compacta (SNpc) in the basal ganglia. This deficit reduces the activity of the Globus Pallidus Externa
(GPe) and increases the activity of the Subthalamic Nucleus (STN). The pattern of the neuron signal is almost
periodic. For that reason, a periodic solution must exist among the solutions of the STN neuron model
constructed by considering the dynamics of ions within the cells. The STN itself receives stimulus input
from the GPe. This input varies, so the periodic solution must exist for various values of the
current input to the STN neuron. Using Matcont, the existence of a Hopf bifurcation in the STN neuron
model guarantees the existence of periodic solutions for various values of the current input.
Keywords and Phrases : Ganglia basalis, Parkinson, STN, Matcont.
1. INTRODUCTION
Every neuron cell communicates with the others by inhibitory or excitatory
stimulation. The STN neurons are located in the basal ganglia, which have the function of
balancing body movement. The GPe neurons stimulate the STN cells by inhibition. A basal
ganglia disorder may lead to Parkinson's disease.
The membrane potential is caused by differences in ion concentrations inside and outside
the cell. The neuron membrane is selectively permeable to certain ions such as K+, Na+, and
Ca2+. When an ion concentration changes suddenly, it results in an action potential.
2. METHODS
Recent neuron models are derivatives of the Hodgkin-Huxley model, developed from
the single squid neuron system. Our model is extended from the Hodgkin-Huxley model and includes five
compartments as a dynamical system. Those compartments are the membrane potential of the STN
neuron (v), the slow gating variables (n, h, r), and the calcium ion concentration ([Ca]) (Terman, 2002). The
system of differential equations for the STN neuron is:

Cm dv/dt = −I_L − I_K − I_Na − I_T − I_Ca − I_AHP − I_{G→S}
dn/dt = φ_n (n_∞(v) − n) / τ_n(v)
dh/dt = φ_h (h_∞(v) − h) / τ_h(v)
dr/dt = φ_r (r_∞(v) − r) / τ_r(v)
d[Ca]/dt = ε (−I_Ca − I_T − k_Ca [Ca])
The leakage current is given by I_L = g_L (v − v_L), and the other currents follow the Hodgkin-
Huxley formulation (Beuter, 2003):

I_K = g_K n^4 (v − v_K)
I_Na = g_Na m_∞^3(v) h (v − v_Na)
I_T = g_T a_∞^3(v) b_∞^2(r) (v − v_Ca)
I_Ca = g_Ca s_∞^2(v) (v − v_Ca)
I_AHP = g_AHP (v − v_K) [Ca] / ([Ca] + k_1)
The time constants have the form τ_X(v) = τ_X^0 + τ_X^1 / (1 + exp(−(v − θ_X^τ)/σ_X^τ)), where X can be n, h or r.
The steady state is formulated as Y_∞(v) = 1 / (1 + exp(−(v − θ_Y)/σ_Y)), where Y can be
n, m, h, a, r or s. The inactivation of the T current is determined by
the concentration of Ca2+ in the cell. The current I_{G→S} represents the stimulus input from the GPe neurons
(Guyton, 1993). This input varies depending on the GPe stimulus to the STN neurons. For that
reason, it is important to know the impact of different input currents on the STN cell.
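To show how the pieces above assemble into a computable model, the sketch below integrates the reconstructed STN equations with scipy. All numerical parameter values and sigmoid constants are placeholders chosen only so that the system integrates (the published values are those of Terman et al., 2002), and the GPe input I_{G→S} is set to zero; none of these numbers should be read as the values used in this paper.

```python
import numpy as np
from scipy.integrate import solve_ivp

# Placeholder parameter values (assumptions, not the values of Appendix I).
p = dict(Cm=1.0, gL=2.25, vL=-60.0, gK=45.0, vK=-80.0, gNa=37.5, vNa=55.0,
         gT=0.5, gCa=0.5, vCa=140.0, gAHP=9.0, k1=15.0, kCa=22.5, eps=5e-5,
         phi_n=0.75, phi_h=0.75, phi_r=0.2, I_GS=0.0)

def sig(v, theta, sigma):
    """Generic steady-state sigmoid 1 / (1 + exp(-(v - theta)/sigma))."""
    return 1.0 / (1.0 + np.exp(-(v - theta) / sigma))

def rhs(t, z):
    v, n, h, r, ca = z
    # steady states and (placeholder) time constants
    n_inf, m_inf = sig(v, -32.0, 8.0), sig(v, -30.0, 15.0)
    h_inf, a_inf = sig(v, -39.0, -3.1), sig(v, -63.0, 7.8)
    r_inf, s_inf = sig(v, -67.0, -2.0), sig(v, -39.0, 8.0)
    b_inf = sig(r, 0.4, -0.1)
    tau_n = tau_h = tau_r = 1.0 + 100.0 * sig(v, -80.0, -26.0)
    # ionic currents of the model
    IL   = p['gL'] * (v - p['vL'])
    IK   = p['gK'] * n**4 * (v - p['vK'])
    INa  = p['gNa'] * m_inf**3 * h * (v - p['vNa'])
    IT   = p['gT'] * a_inf**3 * b_inf**2 * (v - p['vCa'])
    ICa  = p['gCa'] * s_inf**2 * (v - p['vCa'])
    IAHP = p['gAHP'] * (v - p['vK']) * ca / (ca + p['k1'])
    dv  = (-IL - IK - INa - IT - ICa - IAHP - p['I_GS']) / p['Cm']
    dn  = p['phi_n'] * (n_inf - n) / tau_n
    dh  = p['phi_h'] * (h_inf - h) / tau_h
    dr  = p['phi_r'] * (r_inf - r) / tau_r
    dca = p['eps'] * (-ICa - IT - p['kCa'] * ca)
    return [dv, dn, dh, dr, dca]

sol = solve_ivp(rhs, (0.0, 200.0), [-60.0, 0.1, 0.5, 0.2, 0.1], max_step=0.2)
print(sol.y[0, -5:])   # last few membrane-potential values
```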
3. RESULTS
First, the stability of the system above is analyzed by finding the critical points and
the eigenvalues associated with each critical point. This step is done by finding the
state values at which the system stays constant in time. With the software
Matcont, the equilibrium curve can be obtained as in Figure 1:
Figure 1: Equilibrium curve for a single STN neuron cell. There are two Hopf bifurcations.
There are four regions which have different dynamics. In region I,
all the eigenvalues of the system have negative real part, so that
the critical points are stable in this region. Regions II and III lie between
the two Hopf bifurcation points. The eigenvalues in both regions have
positive real part, so the critical points are not stable but periodic solutions exist. This periodicity is
guaranteed by the existence of the Hopf bifurcations. In the last region, region IV,
all eigenvalues again have negative real part, which means the critical points are stable.
A positive value of the input current means the STN cell receives an excitatory stimulus from the GPe,
a negative value means the STN cell receives an inhibitory stimulus, and a zero value means the STN cell
has no stimulus. The interesting part is that inhibition from the GPe within a certain range cannot
cancel the spiking of the STN cell, because the capability of the cell itself to spike is stronger
than the inhibition from the GPe cells. The periodic solutions in this range
can be seen in Figure 2:
The amount of inhibition received by the STN cell can be defined by a step function as follows:
Figure 4: After 500 ms of regular spiking, the STN cell receives an inhibitory current for 450 ms.
The STN cell produces repetitive spiking in region I. When the inhibitory current is
received for 450 ms, the system jumps to region IV. In region IV, the
STN cell cannot produce repetitive spiking because all the eigenvalues there have negative
real part, which means the solution converges to the critical point. At the end of the
inhibition, the spiking becomes intense. This situation is called bursting. After bursting,
the repetitive spiking appears again.
4. DISCUSSION
Bursting occurs within a very short time interval. The dynamics of the ionic cell are generally
chaotic. Therefore, it is interesting to observe the behavior of every ion. The factors that
influence the length of the bursting also need to be investigated. The inhibition itself, which
makes the STN cell appear to stand still, should cause the ion flow into and out of the cell
to be balanced.
Appendix I
The parameter values for the model discussed above are as given in (Terman, 2002).
References
[1] BEUTER, A., GLASS, L., MACKEY, M.C., TITCOMBE, M.S., Nonlinear Dynamics in Physiology and Medicine, vol.
25, Springer-Verlag, New York, 2003.
[2] DHOOGE, A., GOVAERTS, W., KUZNETSOV, Y.A., SAUTOIS, B., Matcont: A Matlab Package for Dynamical Systems
with Applications to Neural Activity. 2006.
[3]GUYTON, Buku Ajar Fisiologi Kedokteran, vol. 7, Penerbit Buku Kedokteran, 1993.
[4]SHERWOOD, L., Fisiologi Manusia: dari Sel ke Sistem, vol. 2, Penerbit Buku Kedokteran, 1996.
[5]TERMAN, D., RUBIN, J.E., YEW, A.C., WILSON, C.J., Activity Patterns in a Model for the Subthalamopallidal
Network of the Basal Ganglia, The journal of Neuroscience, April 1, 2002, 22(7):2963-2976.
[6]TERMAN, D., An Introduction to Dynamical Systems and Neuronal Dynamics. 2004.
[7]VERHULST, F., Nonlinear Differential Equations and Dynamical Systems. Springer-Verlag, Inc, New York. 1996.
Abstract. A base station is very important to cellular telecommunication: it provides service to
cellular subscribers. Because of the limited radio spectrum capacity, providers must build several
base stations at different locations to give cellular service to their subscribers in the targeted area.
However, it is not efficient if a base station site serves only one telecommunication provider,
because its construction cost is very high. The optimization of joint base station locations is
discussed in this paper. Every base station site will serve several GSM and CDMA providers. An
integer programming model based on the set-covering problem is developed to determine the minimum
number of joint base stations and their optimum locations. A Branch-and-Bound algorithm is used to
find a set of optimum solutions.
Keywords and Phrases : base station, set-covering, integer programming, cellular
telecommunication
1. INTRODUCTION
Cellular telecommunication technology has been advancing very fast up to now.
This brings many benefits both to telecommunication providers and to cellular
subscribers. The providers can give subscribers telecommunication service with better
performance (in speed and reliability) and wider capacity by using newer technology.
However, because of the limited coverage radius of a base station, the advancement of cellular
technology cannot benefit many people if the providers do not build a number of base stations
in many areas.
A base station is a site where the basic components of a cellular telecommunication
network are located. These components are the cellular tower, radio base station (RBS), power
supply, sectoral antennas, microwave dish, baseband microwave processing, and a shelter to
protect the equipment from the weather (http://www.withoutthecat.com). The base station
connects a subscriber in one location to others in long-distance locations. Therefore, the
base station is very important to cellular telecommunication.
However, it is not efficient if a base station site serves only one cellular provider,
because the construction cost of a cellular tower (macro cell) is very high. In addition to the
cellular tower cost, the provider must also fund the leased-area cost of the cellular site and the cost
of hiring maintenance and security personnel to maintain and protect the base station site. All
those site costs can be reduced if several providers use the same base station site. These
shared costs can then reduce the cellular service tariff imposed on the subscribers.
A macro cell tower can serve a number of providers according to its size. Each
cellular provider, GSM or CDMA, operates in its own specific spectrum, which is
different from the others. No interference problem will exist across providers (across
spectra) if the position of each sectoral antenna is adjusted correctly (see Laiho et al. [6]).
Therefore, sharing a cellular tower among several providers is possible.
Based on our knowledge, the available literature on the base station placement
problem implicitly assumes one provider for each base station (see e.g. Mathar and Niessen [8],
Amaldi et al. [1-4], Chen and Yuan [5]). Because the traffic density or the number of
subscribers of each provider, in the same target area, is not always the same, the
number of base stations needed by each provider may be different. Therefore, it is not
realistic to assume each provider needs the same number of base stations. We propose an
integer linear programming model based on the set-covering problem to fill this gap.
2. MODEL ASSUMPTIONS
To understand the system described by the model developed in this study, the
context and the assumptions used in developing the integer programming model are
explained first in this section.
First, the subscriber location can be at any point in the area of cellular service
coverage because of the mobility factor. The subscribers located in one particular area (one
RW, rukun warga) are represented by a single point that is called a demand point. Many
previous studies assume that a demand point only represents the subscribers of a single
provider. This assumption is contrary to reality because at one demand point there are multi-
provider subscribers. In this study, one point represents the demand of multiple providers, and
is called a multi-mode demand point.
Second, the joint base station (JBS) which serves the subscribers located at the same
point is most likely different if the providers are different. This happens because the JBS
consists of a number of radio base stations (RBS), each with a distinct coverage radius. Each RBS in
one JBS belongs to a distinct provider and each has a distinct coverage radius according to
the demand traffic density of the corresponding provider. The different coverage radii for different traffic
densities are intended to maintain the quality of service in an area.
Third, each RBS is assumed to have sufficient capacity to serve the subscribers
in its service territory. The coverage radius of an RBS can be adjusted through the transmitter
power adjustment, where the signal is channeled through the sectoral antennas. An RBS with a
shorter coverage radius is assigned to denser traffic areas, and vice versa, an RBS with a longer
coverage radius is assigned to less dense traffic areas such as suburban or rural areas. The coverage
radius adjustment is expected to meet the level of GOS (grade of service) specified by the
provider.
Fourth, the alternative locations of the JBS are assumed to have met the requirements for the
location of a base station, such as no radio wave interference from other sources, no blocking
hills, no tall buildings, and no large trees. In addition, the land area is sufficient and meets the
eligibility requirements for establishing a tower, and the locations meet the rules set by the local
government. A comprehensive set of requirements for a base station site has been discussed by Freeman
(2007, p. 38-43).
Fifth, the RBS assumed in this study are the RBS for GSM (2G), the RBS CDMA
2000 1x for CDMA providers, and multi-standard (multi-mode) RBS. A multi-standard RBS
is capable of serving multiple systems simultaneously. For example, the family of RBS 6000 by
Ericsson can provide GSM service, WCDMA service, and LTE service at the same time
(Ericsson, 2009). The placement of RBS devoted only to Internet data services such as
EV-DO, WCDMA, or LTE is not discussed in this study.
Sixth, inter-provider inter-cell interference, intra-provider inter-cell interference,
and intra-provider intra-cell interference are not addressed in this study. Each provider has its
own spectrum, which is different from that of any other provider. Therefore, there is no
interference between providers. The spectrum of cellular telecommunication is allocated by
Ditjenpostel Kementerian Kominfo (the FCC in the USA, and the ITU worldwide) to avoid
interference between cellular providers.
Intra-provider inter-cell interference for the GSM system is minimized by the
frequency division mechanism. The spectrum of each GSM provider is divided into several
frequency slots, and then a distinct frequency slot is allocated to each adjacent cell. In
addition, a so-called guard band (in kHz) is usually sacrificed as the gap between
two consecutive frequency slots (see Weicker et al. [10]). Interference is most likely to occur
between two cells that use the same frequency slot (co-channel) if the coverage radii of
the two cells overlap each other. Fortunately, this interference can be ignored because the two
co-channel cells are separated by a considerable distance, with several other cells between them.
For systems that do not use frequency division, such as CDMA 2000, WCDMA, and
LTE, interference between cells is prevented by minimizing the overlap between the two
cells.
Interference from other sources that generate electromagnetic waves at radio
frequency is not discussed. This kind of interference is covered by the
assumptions on base station locations that have been described previously.
The standard set-covering model is

Minimize    Σ_j c_j X_j    (1)

Subject to:  Σ_{j ∈ N_i} X_j ≥ 1,   for all i,    (2)

while the proposed joint base station model is

Minimize    Σ_j c_j X_j + Σ_r Σ_j w_rj Y_rj    (5)

Subject to:  Σ_r Σ_j t_ikrj Y_rj ≥ 1,   for all i, k,    (6)

             Y_rj − X_j ≤ 0,   for all r, j,    (7)

             Σ_r Y_rj ≥ 2 X_j,   for all j.    (8)
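The model (1)-(8) can be prototyped directly with an off-the-shelf branch-and-bound solver. The sketch below sets it up with the open-source PuLP package on a tiny invented instance; the sets, costs, and coverage matrix are illustrative only, the k and r indices of (6) are collapsed into a single provider index for brevity, and constraint (8) is read here as requiring at least two providers at every opened site, which is an assumption.

```python
import pulp

# Toy instance (invented): 3 candidate JBS sites, 2 providers, 4 demand points
sites, providers, demands = range(3), range(2), range(4)
c = {0: 10, 1: 12, 2: 9}                                 # site opening costs c_j
w = {(r, j): 2 for r in providers for j in sites}        # RBS placement costs w_rj
# t[i, r, j] = 1 if an RBS of provider r at site j covers demand point i
t = {(i, r, j): int((i + j) % 3 != 0) for i in demands for r in providers for j in sites}

prob = pulp.LpProblem("joint_base_station", pulp.LpMinimize)
X = pulp.LpVariable.dicts("X", sites, cat="Binary")      # open site j
Y = pulp.LpVariable.dicts("Y", [(r, j) for r in providers for j in sites], cat="Binary")

prob += pulp.lpSum(c[j] * X[j] for j in sites) + \
        pulp.lpSum(w[r, j] * Y[r, j] for r in providers for j in sites)          # (5)
for i in demands:
    for r in providers:
        prob += pulp.lpSum(t[i, r, j] * Y[r, j] for j in sites) >= 1             # (6)
for r in providers:
    for j in sites:
        prob += Y[r, j] - X[j] <= 0                                              # (7)
for j in sites:
    prob += pulp.lpSum(Y[r, j] for r in providers) >= 2 * X[j]                   # (8)

prob.solve(pulp.PULP_CBC_CMD(msg=False))   # branch-and-bound via the bundled CBC solver
print([j for j in sites if X[j].value() == 1])
```

The bundled CBC solver carries out the Branch-and-Bound search mentioned in the abstract.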
4. NUMERICAL SIMULATION
To illustrate the computation and applicability of the proposed model, we use the data of the cellular
tower distribution (Figure 1) and the map of the demand point distribution (points of RW, rukun
warga, Figure 2), both for the City of Surakarta, Central Java. Of the 105 tower locations in
use today, only 43 points are eligible to serve as alternative JBS locations (j = 1, 2, ..., 43).
The population of Surakarta is around 600,000 citizens, spread over 598 RW. The center
point of each RW serves as the location of a demand point (i = 1, 2, ..., 598). There are 11
cellular service providers in Surakarta, consisting of five GSM providers and six CDMA
providers (k = 1, 2, ..., 11) (r = 1, 2, ..., 11). The binary values of the parameter t_ikrj are hypothetical
data, with reference to the approximate coverage radius of each RBS of each provider. The RBS
coverage radius for the GSM system in urban areas ranges from 0.5 km to 2 km, while for
CDMA it ranges between 3 and 5 km, due to the large capacity of the CDMA system and its smaller
number of subscribers. The values of the parameters c_j and w_rj are also hypothetical.
5. CONCLUDING REMARK
We propose a kind of multiple-type demand and multiple-type facility location model applied
to the JBS location problem. The problem is formulated as a single-objective linear integer
program. The standard set-covering model is a special case of our model, obtained by setting it to
single-type demand and single-type facility. We impose many assumptions about interference
in this study and use the non-polynomial Branch-and-Bound algorithm to solve the integer
model. It would be interesting for future study to accommodate a multiple-objective approach
to represent a more realistic problem, i.e. a problem with three different stakeholders (provider,
subscriber, and government). Relaxation of the interference assumptions may also be interesting.
References
[1] AMALDI, E., CAPONE, A., AND MALUCELLI, F., Discrete Models and Algorithms for the Capacitated Location
Problem Arising in UMTS Network Planning. Proceedings of the 5th International Workshop on Discrete
Algorithms and Methods for Mobile Computing and Communications (DIAL-M), ACM, pp.1–8, 2001.
[2] AMALDI, E., CAPONE, A., AND MALUCELLI, F., Planning UMTS Base Station Location: Optimization Models
With Power Control and Algorithms. IEEE Transactions on Wireless Communications, vol.2, no.5, pp.939-952.,
2003.
[3] AMALDI, E., BELOTTI, P., CAPONE, A., AND MALUCELLI, F., Optimizing base station location and configuration in
UMTS networks. Annals of Operations Research, 146, pp.135–151, 2006.
[4] AMALDI, E., CAPONE, A., AND MALUCELLI, F., Radio planning and coverage optimization of 3G cellular
networks. Wireless Networks, 14, pp.435–447, 2008.
[5] CHEN, L. AND YUAN, D., Solving a minimum-power covering problem with overlap constraint. European
Journal of Operational Research, 203, pp.714–723, 2010.
[6] LAIHO, J., WACKER, A., AND NOVOSAD, T., Radio Network Planning and Optimisation for UMTS, 2nd edition.
John Wiley & Sons Ltd, Chichester, England, 2006.
[7] Learn about what is on a cell tower: Without the Cat - Sponsored by MD7 (http://www.withoutthecat.com), URL
retrieved 12 May 2011.
[8] MATHAR, R. AND NIESSEN, T., Optimum positioning of base stations for cellular radio networks. Wireless
I Wayan Suletra
Industrial Engineering Department, Sebelas Maret University, Solo.
e-mail: [email protected]
Widodo
Department of Mathematics Gadjah Mada University
e-mail: [email protected]
Subanar
Department of Mathematics Gadjah Mada University
e-mail: [email protected]
Proceedings of ”The 6th SEAMS-UGM Conference 2011”
Applied Mathematics, pp. 427–434.
Abstract. In this paper we present the expected value approach for solving multi-
objective linear programming (MOLP) problems with fuzzy random parameters. There are two
MOLPs. The first is an MOLP with fuzzy random objective function coefficients, and the
second is an MOLP with fuzzy random right-hand sides. The expected value approach
transforms the problems into MOLPs with fuzzy parameters. In this paper we use triangular
fuzzy numbers for the fuzzy random objective function coefficients and non-decreasing linear
fuzzy numbers for the fuzzy random right-hand sides. We also introduce a method to solve
MOLP with fuzzy parameters.
The probability concept is preserved in this definition. We propose the expected value
approach to transform the problems into MOLPs with fuzzy parameters. The approach
preserves the linearity properties. Next, we solve the MOLPs by a new method.
1.2. Preliminaries. We give a brief summary of the basic theory from Bector and
Chandra [1] and a brief review of our previous results [3].
In this paper we solve the fuzzy MOLP with fuzzy objective function coef-
ficients by a modified simplex method. Because we have fuzzy parameters, we need to
use a fuzzy ranking to obtain optimality criteria. There are many methods proposed to
determine the ranking of fuzzy numbers. In this paper, we choose the fuzzy number
ranking given in the following definition.
Definition 1.3. The ranking of a fuzzy number Ā is defined by

R(Ā) = ( ∫_{a_l}^{a_u} x µ_Ā(x) dx ) / ( ∫_{a_l}^{a_u} µ_Ā(x) dx ),

where a_l and a_u are the lower and the upper limits of the support of Ā.
The formula for R(Ā) above represents the centroid of Ā. In the MOLP with fuzzy
random objective function coefficients, we have triangular fuzzy numbers. For a trian-
gular fuzzy number (TFN) Ā = (a_l , a, a_u ), it can be verified that R(Ā) = (a_l + a + a_u )/3.
Given two TFNs Ā = (a_l , a, a_u ) and B̄ = (b_l , b, b_u ), we define Ā ≤_R B̄ if (a_l + a +
a_u ) ≤ (b_l + b + b_u ).
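As a small illustration of Definition 1.3 specialised to TFNs, the snippet below computes the centroid ranking R and compares two arbitrarily chosen triangular fuzzy numbers; the function name and example values are ours.

```python
def rank_tfn(a):
    """Centroid ranking of a TFN A = (a_l, a, a_u): R(A) = (a_l + a + a_u) / 3."""
    al, am, au = a
    return (al + am + au) / 3.0

A, B = (1.0, 2.0, 4.0), (2.0, 3.0, 3.5)
print(rank_tfn(A), rank_tfn(B), rank_tfn(A) <= rank_tfn(B))   # A <=_R B iff R(A) <= R(B)
```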
1.2.2. Expected Value of Fuzzy Random Variables. One of the methods to solve prob-
abilistic programming is the expected value approach. Based on this idea, we want to
solve fuzzy random programming by the expected value approach. First, we should
define the expected value of a fuzzy random variable (FRV). In [3], we have defined
the expected value of discrete and continuous FRVs.
Definition 1.4. The expected value of n discrete fuzzy random variables Ā˜_i on the
α-level is Ē(α) = Σ_{i=1}^{n} µ_i^{-1}(α) p_i , for α ∈ [0, 1], where µ_i^{-1} and p_i are the pre-image and the
probability of the fuzzy random variable Ā˜_i , respectively.
Furthermore, we have obtained some theorems in [3]. Some of them are the
following:
Theorem 1.1. If all discrete fuzzy random variables have continuous membership func-
tions, then the expected value is a fuzzy number with a continuous membership function.
Corollary 1.1. If all discrete fuzzy variables of the FRV are non-decreasing linear fuzzy
numbers, then the expected value is a non-decreasing linear fuzzy number.
Corollary 1.2. If all discrete fuzzy variables of the FRV are TFNs, then the expected value
is a TFN.
2. MAIN RESULTS
We have two kinds of MOLPs with fuzzy random parameters: the MOLP
with fuzzy random objective function coefficients and the MOLP with fuzzy random right-hand sides.
We solve them by the expected value approach, which transforms them into fuzzy MOLPs. Finally, we
solve the MOLPs by a new method.
2.1. MOLP with Fuzzy Random Objective Function Coefficients. Let us consider
the multi-objective linear programming problem with fuzzy random objective function coeffi-
cients below:
min (f1 (x, c˜¯1 ), f2 (x, c˜¯2 ), · · · , fk (x, c˜¯k )) (1)
subject to Ax ≥ b, x ≥ 0.
Based on Corollary 1.2, we have E(fK (x, c˜¯K )) = fK (x, c¯K ), where c¯K is a vector
of triangular fuzzy numbers, for K = 1, 2, ..., k.
We denote the set of all feasible solutions of (1) by D. Using the expected value approach,
we define two kinds of solutions for (1): the E-optimal solution and the E-efficient solution.
Definition 2.1. A feasible solution x∗ is an E-optimal solution for (1) if
E(fK (x∗ , c̄˜K )) ≤ E(fK (x, c̄˜K )) for all K = 1, ..., k and x ∈ D.
Definition 2.2. A feasible solution x∗ is an E-efficient solution for (1) if there
exists no x ∈ D such that E(fK (x, c̄˜K )) ≤ E(fK (x∗ , c̄˜K )) for all K = 1, ..., k and
E(fl (x, c̄˜l )) < E(fl (x∗ , c̄˜l )) for some l = 1, ..., k.
We choose the weighting method to get a single linear programming problem with fuzzy objec-
tive function coefficients,
min w1 E(f1 ) + w2 E(f2 ) + · · · + wk E(fk ) (2)
subject to Ax ≥ b, x ≥ 0.
Using the arithmetic properties of fuzzy numbers, model (2) is a linear program-
ming problem with triangular fuzzy numbers as objective function coefficients. The problem can
be solved by a modified simplex method. We choose the fuzzy ranking R of Definition
1.3 to obtain the optimality criterion, so we have fuzzy decision variables.
min z̄ = c̄x (3)
subject to Ax ≥ b, x ≥ 0.
where b̄˜ = (b̄˜1 , · · · , b̄˜m ) and b̄˜i is the fuzzy random parameter of the i-th right-hand side. Here, we
use non-decreasing linear fuzzy numbers for the b̄˜i , i = 1, 2, · · · , m.
By the expected value approach the problem becomes a fuzzy MOLP,
min (f1 (x, c1 ), f2 (x, c2 ), · · · , fk (x, ck )) (6)
subject to Ax ≥ E(b̄˜ ), x ≥ 0,
where E(b̄˜ ) = (E(b̄˜1 ), ..., E(b̄˜m ))T . By Corollary 1.1, E(b̄˜j ) is a non-decreasing
linear fuzzy number, for all j. We write E(b̄˜j ) as b̄j^{ltt} = (b_{jl} , b_j , b_{ju} ).
We solve (6) by a new method that transforms problem (6) into a crisp model. The
method is different from [4]. In our problem, we assume that the value of fK increases
or stays the same as the rhs values of the constraints increase, for all K. So we have one type
of membership function for the objective functions. Since the rhs is fuzzy, the value of
the objective function is also fuzzy. We distinguish the degrees of the membership functions
of the constraints and of the objective functions, denoted by α1 and α2 , respectively. The values of the
rhs determine the values of the objective functions, so we let α1 ≥ α2 . We want to maximize
the sum of α1 and α2 :
max α1 + α2 (7)
subject to fK + α2 (UK − LK ) ≤ UK , K = 1, 2, ..., k
Ai x − α1 pi ≥ bi , i = 1, 2, ..., m
α1 ≥ α2
0 ≤ α1 ≤ 1, 0 ≤ α2 ≤ 1, x ≥ 0.
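Model (7) is an ordinary crisp linear program once the data are fixed, so it can be solved with any LP solver. The sketch below sets it up with scipy.optimize.linprog for two variables, two objectives and two constraints; all coefficient values are invented for illustration (they are not the data of Example 2.1), and L_K, U_K, p_i denote the aspiration levels and rhs tolerances of the model.

```python
import numpy as np
from scipy.optimize import linprog

# Invented illustrative data for model (7).
c1, c2 = np.array([3.0, 2.0]), np.array([1.0, 4.0])   # objective coefficients of f_1, f_2
A = np.array([[2.0, 1.0], [1.0, 2.0]])                # constraint matrix
b = np.array([20.0, 10.0])                            # modal rhs values b_i
p = np.array([2.0, 1.0])                              # rhs tolerances p_i
L = np.array([10.0, 15.0])                            # best (aspiration) objective levels L_K
U = np.array([40.0, 60.0])                            # worst acceptable objective levels U_K

# variables z = (x1, x2, alpha1, alpha2); maximise alpha1 + alpha2 <=> minimise -alpha1 - alpha2
cost = np.array([0.0, 0.0, -1.0, -1.0])
A_ub = np.vstack([
    np.column_stack([np.vstack([c1, c2]), np.zeros(2), U - L]),   # f_K + alpha2 (U_K - L_K) <= U_K
    np.column_stack([-A, p.reshape(-1, 1), np.zeros((2, 1))]),    # A_i x - alpha1 p_i >= b_i
    np.array([[0.0, 0.0, -1.0, 1.0]]),                            # alpha2 <= alpha1
])
b_ub = np.concatenate([U, -b, [0.0]])
res = linprog(cost, A_ub=A_ub, b_ub=b_ub,
              bounds=[(0, None), (0, None), (0, 1), (0, 1)])
print(res.x)   # (x1, x2, alpha1, alpha2)
```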
By the expected value approach, we transform model (5) to model (6). Then we
solve model (6) by the method above. Finally, we have crisp model (7).
We give an example for problem (5).
Example 2.1. Given the MOLP with fuzzy random right-hand sides below.
We compute the expected values of the rhs by Definition 1.1 and Corollary 1.1 and
obtain E(b̄˜1 ) = b̄1^{ltt} = (18, 20, 21) and E(b̄˜2 ) = b̄2^{ltt} = (9, 10, 11).
The results from our method are x∗1 = 4.789, x∗2 = 5.251 and f1 = 194.978,
f2 = 462.555, with α1∗ = 1, α2∗ = 0.4566.
3. CONCLUDING REMARKS
In this paper, the expected value approach for solving MOLP with fuzzy random
objective function coefficients and MOLP with fuzzy random right-hand sides has been
presented. The approach transforms the problems into MOLPs with fuzzy parameters, and
we solve them by different methods. The modified simplex method can solve the MOLP with
fuzzy objective function coefficients. The solution depends on the fuzzy ranking, so we
define the E-efficient and R-optimal solutions. The second method can solve the MOLP
with fuzzy rhs.
References
[1] Bector, C.R. and Chandra, S., Fuzzy Mathematical Programming and Fuzzy Matrix Games,
Springer, Germany, 2005.
[2] Caballero, R., Cerda, E., Munoz, M.M. and Rey, L., Stochastic Approach versus Mul-
tiobjective Approach for Obtaining Efficient Solutions in Multi Objective Programming Prob-
lem,European Journal of Operational Research, 158(3), 633-648, 2004.
[3] Indarsih, Widodo and Rini, Expected Value of Fuzzy Random Variables, Submitted to IJQM.
[4] Jana, B. and Roy, T.K., Multi-objective Fuzzy Linear Programming and Its application in
Transportation Model,Tamsui Oxford Journal of Mathematical Sciences,2,243-268, 2005.
[5] Luhandjula, M.K. and Gupta, M.M., On Fuzzy Stochastic Optimization, Fuzzy Set and System
81, 47-55, 1996.
[6] Maleki, H.R., Tata, M. and Mashinchi,M., Linear Programming with Fuzzy Variables, Fuzzy
Sets and Systems, 109, 21-33, 2000.
Indarsih
Department of Mathematics, Gadjah Mada University.
e-mail: [email protected]
Widodo
Department of Mathematics, Gadjah Mada University.
e-mail: widodo [email protected]
1. INTRODUCTION
One of the most intense areas of research in the field of symmetric cryptosystems is
S-box design [13]. S-boxes are quite important components of modern cryptosystems
(especially block ciphers) in the sense that S-boxes bring nonlinearity to block ciphers and
strengthen their cryptographic security [7]. They are typically used to obscure the relationship
between the key and the ciphertext (Shannon's property of confusion). In many cases, S-
boxes are carefully chosen to resist cryptanalysis.
According to [4], there are a number of criteria for S-box design, i.e. an S-box should
satisfy the Avalanche Criterion (AC), Strict Avalanche Criterion (SAC), Bit Independence
Criterion (BIC), XOR Table Distribution, Avalanche Weight Distribution (AWD),
and Nonlinearity. AWD is a criterion for testing the whole block cipher algorithm.
The construction of S-boxes can be done in various ways. The most common methods
for constructing S-boxes are based on: random generation, testing against a set of design
criteria, algebraic constructions having good properties, or a combination of these [10]. One of
the random functions used to generate S-boxes is a chaotic dynamic function. The
characteristics of chaos, with the properties of ergodicity, mixing and exactness, and sensitivity to
initial conditions, make such functions suitable for use in cryptography. These characteristics
correspond to the confusion and diffusion properties in cryptography. The role of ergodicity in
chaotic systems is to hide the relationship between plaintext, key, and ciphertext by a uniform
distribution of the output for each input, while the sensitivity to initial conditions is
equivalent to the concept of diffusion in cryptography, spreading a single input bit to all
outputs [6].
There are four points to be considered in the S-box construction process using chaotic
iteration functions: 1) how to divide the range into result areas, 2) what kind of chaotic
function is used, 3) the number of iterations, and 4) the initial value [3].
In this paper, we present a method for obtaining dynamically cryptographically
strong substitution boxes (S-boxes) based on the Piecewise Linear Chaotic Map (PLCM), which
has the confusion and diffusion properties. PLCM uses the mixing property to get the values of the S-box.
In addition, PLCM requires two inputs as initial values that will produce the output
value. The sensitivity property also results in random output, where a tiny change of the input
produces a significant change of the output.
We construct S-boxes of dimension 8 x 8, and there are a number of S-boxes with
different input values. We decided to use the 8 x 8 dimension with reference to the S-box dimension
used in the Advanced Encryption Standard (AES) algorithm, a well accepted standard
block cipher. The cryptographic properties of these S-boxes, such as the strict avalanche criterion,
output bit independence, and nonlinearity, are analyzed in detail. From this
study, we will conclude whether S-boxes generated by the PLCM function satisfy good
S-box criteria for certain input values. The expected result of this study is to
expand knowledge of S-box construction with the PLCM function. Such S-boxes could also be
used for a particular encryption algorithm.
2. THEORETICAL BACKGROUND
2.1. Substitution Box (S-box). In general, an S-box takes some number of input bits, m,
and transforms them into some number of output bits, n: an m×n S-box can be implemented
as a lookup table with 2^m words of n bits each [11]. Fixed tables are normally used, as in the
Data Encryption Standard (DES), but in some ciphers the tables are generated dynamically
from the key, e.g., the Blowfish and Twofish encryption algorithms. Bruce Schneier
describes the International Data Encryption Algorithm (IDEA) modular multiplication step as a
key-dependent S-box [14].
Definition 1. [15] An n x n S-box is a mapping f : {0,1}^n → {0,1}^n which
maps n-bit input strings, X = (x_1, x_2, ..., x_n), to n-bit output strings, Y = (y_1, y_2, ..., y_n),
where Y = f(X). Figure 1 shows the S-box scheme.
Figure 1. Substitution box (S-box) scheme [5]: the input bits x_1, x_2, ..., x_n enter the S-box and are mapped to the output bits y_1, y_2, ..., y_n.
Mister and Adams [9] explained that an S-box can be represented in three ways:
1. An n x m S-box S is a mapping S : {0,1}^n → {0,1}^m. S can be represented as 2^n m-bit numbers, denoted r_0, ..., r_{2^n − 1}, in which case S(x) = r_x, 0 ≤ x ≤ 2^n − 1, and the r_i are the rows of the S-box.
2. S(x) = [c_{m−1}(x) c_{m−2}(x) ... c_0(x)], where the c_i are fixed Boolean functions c_i : {0,1}^n → {0,1}; these are the columns of the S-box.
3. S can be represented by a 2^n x m binary matrix M with the (i, j) entry being bit j of row i.
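To make the three representations concrete, the following small Python sketch (ours, not from the paper; the toy lookup table is arbitrary) shows a 3 x 3 S-box stored as rows, one of its column Boolean functions, and its binary-matrix form.

```python
# Toy 3x3 S-box given in the row (lookup-table) representation: S(x) = rows[x]
rows = [5, 1, 7, 2, 6, 0, 3, 4]          # 2^3 entries, each a 3-bit number

def column(j):
    """Column representation: the fixed Boolean function c_j(x) = bit j of S(x)."""
    return [(rows[x] >> j) & 1 for x in range(len(rows))]

# Binary-matrix representation: entry (i, j) is bit j of row i
matrix = [[(rows[i] >> j) & 1 for j in range(3)] for i in range(8)]
```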
(1/2^n) W(a_j^{e_i}) = 1/2   for all i, j.    (1)
k_SAC(i, j) lies in the range [0, 1] and can be interpreted as the probability of a change in the j-th output bit when the i-th input bit changes. If k_SAC(i, j) is not equal to 1/2 for some pair (i, j), then the S-box does not satisfy the SAC.
The relative error of the SAC results can be obtained from the following condition: an S-box satisfies the SAC within the range ±ε_S if, for every i and j,
(1/2)(1 − ε_S) ≤ k_SAC(i, j) ≤ (1/2)(1 + ε_S).    (4)
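As an illustration of how the SAC matrix k_SAC(i, j) can be estimated for an 8 x 8 S-box given as a 256-entry lookup table, the following Python sketch (our own; the function name and structure are not from the paper) counts, for every input, how flipping input bit i changes output bit j.

```python
import numpy as np

def sac_matrix(sbox, n=8):
    """Estimate k_SAC(i, j): probability that output bit j flips
    when input bit i is flipped, averaged over all 2^n inputs."""
    k = np.zeros((n, n))
    for x in range(2 ** n):
        y = sbox[x]
        for i in range(n):
            y_flipped = sbox[x ^ (1 << i)]   # flip input bit i
            diff = y ^ y_flipped             # avalanche vector
            for j in range(n):
                k[i, j] += (diff >> j) & 1
    return k / (2 ** n)

# An S-box satisfies SAC within tolerance eps_S if
# 0.5*(1 - eps_S) <= k[i, j] <= 0.5*(1 + eps_S) for all i, j (cf. eq. (4)).
```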
From equation (6), the avalanche variables generate a correlation coefficient in the range [0, 1], which means:
a. if the value is 1, then the avalanche variables are always identical or complements of one another;
b. if the value is 0, then the avalanche variables are independent.
In the BIC analysis, the BIC(f) value is taken as the relative error ε_B. Thus, for an n x n S-box, the maximum value of ε_B = BIC(f) is called the maximum relative error of the BIC results, denoted by ε_BIC.
c · f(x) = ⊕_{i=1}^{m} c_i f_i(x),    (9)
where c = (c_1, c_2, ..., c_m) ∈ Z_2^m.
For a cryptosystem not to be susceptible to linear cryptanalysis, NL_f is required to be as close as possible to its maximum value (perfect nonlinearity). The maximum nonlinearity value (perfect nonlinearity) of a Boolean function is given by N_f ≤ 2^{n−1} − 2^{n/2 − 1} [4].
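For completeness, a hedged Python sketch (ours, not the authors' implementation) of how the nonlinearity NL_f of an n x n S-box can be computed: each nonzero linear combination c · f(x) of the output bits as in (9) is evaluated with the Walsh-Hadamard transform, and the minimum over all c is taken.

```python
import numpy as np

def nonlinearity(sbox, n=8):
    """Minimum nonlinearity over all nonzero linear combinations
    c . f(x) of the S-box output bits (cf. eq. (9))."""
    best = None
    for c in range(1, 2 ** n):
        # Truth table of x -> parity(c & sbox[x]), encoded as +/-1
        tt = np.array([(-1) ** bin(c & sbox[x]).count("1") for x in range(2 ** n)])
        # In-place fast Walsh-Hadamard transform: correlations with all linear functions
        walsh = tt.copy()
        h = 1
        while h < 2 ** n:
            for s in range(0, 2 ** n, 2 * h):
                a = walsh[s:s + h].copy()
                b = walsh[s + h:s + 2 * h].copy()
                walsh[s:s + h] = a + b
                walsh[s + h:s + 2 * h] = a - b
            h *= 2
        nl = 2 ** (n - 1) - np.max(np.abs(walsh)) // 2
        best = nl if best is None else min(best, nl)
    return best
```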
2.5. Piecewise Linear Chaotic Map (PLCM). The PLCM is a piecewise function that has
uniform invariant density and a good correlation function, so it can be used in
cryptography. PLCM functions have been used as encryption functions, as random number
generators, and as the f function.
Given a real interval X = [α, β] ⊂ R, a PLCM F : X → X is a multi-segmental mapping [6]:
F(x)|_{C_i} = F_i(x) = a_i x + b_i,   i = 1, ..., m,    (10)
where {C_i}_{i=1}^m is a partition of X satisfying ∪_{i=1}^m C_i = X and C_i ∩ C_j = ∅ for all i ≠ j.
Equation (10) fulfils the surjectivity property if every linear part is mapped onto X, i.e., F_i(C_i) = X for all i = 1, ..., m. If X = [0,1], the map is said to be a normalized PLCM. A map on X = [α, β] can be transformed into the normalized form by
F_{[0,1]}(x) = ( F((β − α)x + α) − α ) / (β − α).    (11)
F(x, p) = x/p,                  x ∈ [0, p),
          (x − p)/(1/2 − p),    x ∈ [p, 1/2),
          F(1 − x, p),          x ∈ [1/2, 1].    (13)
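A minimal Python sketch of the normalized PLCM of equation (13); the function name and the sample iteration are ours.

```python
def plcm(x, p):
    """Normalized piecewise linear chaotic map of eq. (13), with 0 < p < 1/2."""
    if x < p:
        return x / p
    if x < 0.5:
        return (x - p) / (0.5 - p)
    return plcm(1.0 - x, p)   # third branch folds [1/2, 1] back onto [0, 1/2]

# Example: iterate from IC = 0.125 with p = 0.150
x = 0.125
for _ in range(100):
    x = plcm(x, 0.150)
```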
3. CHAOTIC S-BOX
In this study, we construct 8 x 8 S-boxes using the chaotic dynamic PLCM function, which
needs two input parameters, IC and p, with 0 < p < 1/2 and 0 ≤ IC ≤ 1. According to [1],
the input parameters of the PLCM can be selected arbitrarily from the possible values of IC and p.
For this study, we select 4 pairs of values that represent each equation in the PLCM function.
In the PLCM function there are 3 equations with different ranges of IC, from which we randomly
take one pair of input values IC and p. However, it should be checked whether the chosen input
pair allows the function to iterate and generate the S-box. For example, if the value of IC or p is
zero, then the next iteration yields zero, so we cannot generate the S-box.
Therefore, based on the experiment, we select 4 pairs of input values as shown in Table 1.
Table 1. Sample values of PLCM
Input value              Values
Initial Condition (IC)   0.125, 0.425, 0.750
Parameter value (p)      0.150, 0.275, 0.450, 0.125
In Table 1, we can see that there are 3 values of IC and 4 values of p. From these values, we
generate S-boxes with 4 input pairs (IC, p), i.e., (0.125, 0.150), (0.425, 0.275),
(0.750, 0.450), and (0.750, 0.125). Each pair satisfies one of the ranges of IC: the pair
(0.125, 0.150) satisfies the first range of IC, the pair (0.425, 0.275) satisfies the second range,
and the pairs (0.750, 0.450) and (0.750, 0.125) satisfy the third range, for which the next
iteration of the function uses the first and second equations.
The number of iterations of the chaos function is one of the factors that influence the results
[3]. According to [2], values with minimum error and significant differences are obtained after
the 100th iteration. In this study, we use 1, 50, and 100 iterations to determine the influence of
the number of iterations on the resulting S-box output.
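The paper does not spell out how the chaotic orbit is mapped to the 256 S-box entries; the sketch below (building on the plcm function above) illustrates one common choice (partitioning [0, 1) into 256 equal bins and taking the first unused byte value) purely as an assumption, not necessarily the authors' exact procedure.

```python
def generate_sbox(ic, p, warmup=100):
    """Illustrative 8x8 S-box generation from a PLCM orbit.
    The mapping from orbit values to bytes (256 equal bins, skipping
    duplicates) is an assumption, not taken from the paper."""
    x = ic
    for _ in range(warmup):            # discard transient iterations
        x = plcm(x, p)
    sbox, used = [], set()
    while len(sbox) < 256:             # relies on the mixing of the orbit
        x = plcm(x, p)
        v = min(int(x * 256), 255)     # map orbit value to a byte
        if v not in used:
            used.add(v)
            sbox.append(v)
    return sbox

sbox1 = generate_sbox(0.125, 0.150)
```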
Furthermore, we perform the S-box analysis based on the S-box testing criteria; the criteria
to be evaluated include the SAC, BIC, and nonlinearity. There are 12 S-boxes generated with
different input values and numbers of iterations, as shown in Table 2.
Table 3.S-Box1
0 1 2 3 4 5 6 7 8 9 A B C D E F
0 EA 43 06 9D 0B 13 5C B8 9C C0 77 E0 7B FB D9 05
1 45 72 DD CB 99 22 BA 56 CF 69 84 A6 2D 38 1C 98
2 79 F8 15 47 24 14 E5 A9 3D EE 33 B2 34 53 0D DA
3 63 E8 42 7D 19 39 9F AA 62 DF 44 11 7F A1 68 D5
4 CE 6B 6F 12 92 55 C5 A7 7E 2E 58 76 1A 32 2F 01
5 41 E4 F0 10 EC A5 59 AE 03 37 FE DB AB F1 C7 6E
6 6C B3 35 2A CA B4 DC 9A 3E 25 F9 9E A3 BF A2 AF
7 E7 8D 4B 2C 89 8A 8C D3 CD 1E 0A 8E 6A 71 73 31
8 7C FA 88 30 91 B0 FC 5D 6D EF 8B F5 E2 F6 C6 96
9 29 D8 86 3C 1D 95 94 BC 4D 9B FF E6 00 48 09 75
A 28 23 54 5E DE E1 02 74 90 D0 E9 93 07 BD 8F 2B
B 26 80 A4 BE 0C 08 C3 FD 36 3A C9 49 83 04 F4 17
C 20 78 B7 87 CC 21 5F C2 4F B1 4A C1 1B D6 F7 57
D 0E 66 81 70 46 4E 60 61 50 EB 85 65 64 E3 5A B9
E 18 D7 A8 F3 1F ED C4 C8 AC 3B 52 D4 D1 3F B6 27
F 7A 0F AD 82 97 16 5B 51 A0 BB 40 D2 4C 67 B5 F2
3.2.2. Experimental Results of BIC. The results of the BIC criterion test on each S-box are
summarized in Table 5. The value of each entry in the table is the correlation coefficient
between different output components j, k, corr(a_j^{e_t}, a_k^{e_t}). From the calculation of
the BIC criterion for all S-boxes, we find the maximum correlation value ε_max and the
average of the correlation values.
In Table 5, we can see that the maximum value of the correlation coefficient is attained by Sbox2
and Sbox3. Meanwhile, the minimum mean value is attained by Sbox1 and the maximum mean
value by Sbox8.
For the overall analysis of the S-boxes, we can conclude for the BIC criterion that if the
maximum correlation obtained is close to 0, then the avalanche variables of the outputs are
independent. The chaotic S-boxes produce good correlation values, as the average of the
resulting correlations is around or below 0.1. However, the overall maximum correlation and
maximum error value generated by the chaotic S-boxes is 0.4056696.
Figure 2 shows the maximum correlation coefficient results for each S-box.
Figure 2. BIC graph: the maximum BIC correlation coefficient for S-boxes 1-12 (maximum value 0.40567).
In Table 6, we can see that the minimum nonlinearity values of the chaotic S-boxes range from
88 to 96. The minimum value of NL_f is not yet close to the perfect nonlinearity value.
Based on the minimum nonlinearity value, the number of vectors, and the probability,
it can be concluded that the chaotic S-boxes do not satisfy the ideal (perfect) nonlinearity,
and with probabilities far from one half (ranging between 0.625 and 0.6525) we can say that
the chaotic S-boxes are susceptible to linear cryptanalysis.
Figure: minimum nonlinearity NLM(min) of each S-box; the values range from about 88 to 96.
3.3. Analysis of S-Box. In this section, we compare the maximum values of each S-box
according to the test criteria used in this research: SAC, BIC, and nonlinearity. From
Table 7 and Table 8, we can conclude that all S-boxes satisfy the SAC and BIC criteria,
but do not satisfy the nonlinearity criterion.
From Table 9, we can conclude that the chaotic S-boxes have a higher error value than the
AES S-box, the standard algorithm. However, compared with Vergili's random
S-boxes, the chaotic S-boxes have a smaller error value for all parameters except the maximum
value of BIC, which approaches the maximum value of Vergili's random S-boxes. Therefore,
we can conclude that the AES S-box is still the best S-box among the three, but the chaotic S-box is
better than Vergili's random S-boxes. Moreover, the test results indicate that the chaotic S-box is
nearly similar to Vergili's S-boxes, whose generation process is also random.
2. x = 1/2: if the value of x is 1/2, then we use the second equation in the PLCM function
and obtain F(x) = F(1 − 1, p) = F(0, p), which produces the value x = 0.
3. x = p: if the value of x is p, then we use the first equation in the PLCM function,
so the output becomes F(x) = p/p = 1.
4. x = 1: if the value of x is 1, then we use the third equation in the PLCM function,
so the output becomes F(x) = F(1 − 1, p) = F(0, p), which reduces to the first equation
and produces the value x = 0.
4. CONCLUSION
We have shown that the chaotic PLCM function can be used to generate random S-box
outputs. There is no significant effect on the output of an S-box generated from the same IC and p
with different numbers of iterations; this is because the values of the test results are
distributed in the same range.
In this study, only the SAC and BIC criteria are satisfied by the test results, while the
nonlinearity criterion is not. When using the PLCM function as a generator of S-box outputs,
we have to choose the parameters IC and p carefully to avoid bad results.
Open Problem
There are some problems that we have not yet addressed:
- choosing more values of the parameters IC and p, covering all input values;
- generating S-boxes with many different numbers of iterations;
- there are some approaches related to dynamical systems, such as spread and
dispersion, that can be used to analyze the chaotic S-box;
- generating S-boxes with different chaos functions, such as 2-dimensional and 3-dimensional
chaos functions.
References
[1] ASIM, M. AND VARUN JEOTI. 2008. Efficient and Simple Method for Designing Chaotic S-boxes. ETRI Journal, Volume 30, Number 1.
[2] BOSE, R. AND BANNERJEE, A. 1999. Implementing Symmetric (Single-Key) Cryptography Using Chaos Functions. 7th Int. Conf. on Advanced Computing and Communications, Roorkee, India.
[3] JAKIMOSKI, G. AND KOCAREV, L. 2001. Chaos and Cryptography: Block Encryption Ciphers Based on Chaotic Maps. IEEE Trans. Circuits Syst. I, Volume 48, Number 2.
[4] KAVUT, S. AND YUCEL, M.D. 2004. On Some Cryptographic Properties of Rijndael.
[5] KWANGJO, KIM. 1990. A Study on The Construction and Analysis of S-box for Symmetric Cryptosystem. Yokohama National University.
[6] LI, SHUJUN AND GONZALO ALVAREZ. 2005. Some Basic Cryptographic Requirements for Chaos-Based Cryptosystems. Hong Kong Polytechnic University, China.
[7] MAR, PHYU PHYU AND KHIN MAUNG LATT. 2008. New Analysis Methods on Strict Avalanche Criterion of S-boxes. World Academy of Science, Engineering and Technology, Number 48.
[8] MENEZES, ALFRED J., PAUL C. VAN OORSCHOT, SCOTT A. VANSTONE. 1997. Handbook of Applied Cryptography. CRC Press LLC, Boca Raton.
[9] MISTER, S. AND ADAMS, C. Practical S-box Design. Nortel, Station C, Ottawa, Canada.
[10] NYBERG, K. 1991. Perfect Nonlinear S-boxes. Advances in Cryptology, Proceedings of EUROCRYPT'91. Berlin: Springer-Verlag.
[11] SCHNEIER, BRUCE. 1996. Applied Cryptography: Protocols, Algorithms, and Source Code in C, Second Edition. New York: John Wiley & Sons, Inc.
[12] SHANNON, CLAUDE. 1949. Communication Theory of Secrecy Systems.
[13] STALLINGS, WILLIAM. 2011. Cryptography and Network Security: Principles and Practices. Fifth edition. Prentice Hall.
[14] YOUSSEF, AMR M. 1997. Analysis and Design of Block Cipher. PhD Thesis. Queen's University, Canada.
[15] YUCEL, M.D. AND VERGILI, I. 2001. Avalanche and Bit Independence Properties for the Ensembles of Randomly Chosen n x n S-boxes. EE Department of METU, Turkey.
[16] WEBSTER, A.F. AND TAVARES, S.E. 1989. On the Design of S-boxes. Department of Electrical Engineering, Queen's University.
Abstract. A model of predator-prey dynamics with infected prey in a toxic environment is discussed in this
article. The objectives of this research are to find out whether each of the populations becomes extinct or not
and to study the existence of the toxicant concentration. In this paper, a model of predator-prey dynamics with
infected prey in a toxic environment is constructed. The toxicant affects the growth rates of the susceptible
and infected prey but not the growth rate of the predator population. There are four
equilibrium points. With appropriate parameter values and any initial value near the equilibrium
points, in the long run there are three possibilities: only the infected population
becomes extinct, only the predator population becomes extinct, or both the infected prey and
the predator population become extinct. Moreover, with appropriate parameter values, the
susceptible and infected prey populations and the predator population persist in the long run for any initial
value. Numerical simulations are given to illustrate the stability behaviour of the equilibrium points.
Keywords and Phrases: predator-prey, toxic environment, equilibrium points, infected prey,
stability.
1. INTRODUCTION
It is well known that pollution of the environment is a very serious problem in the
world today because it is a threat to organisms. Organisms are often exposed to a
polluted environment and take up toxicants. In order to use and regulate toxic substances
wisely, we must assess the risk to populations exposed to toxicants. Therefore, it is
important to study the effects of toxicants on populations and to find a theoretical threshold
value which determines permanence or extinction of a population. With the rapid expansion of
agriculture and modern industry, much toxicant contaminates the ecosystem.
Contact between species and the environment happens frequently. The changes in the
environment caused by pollution affect a variety of life. One example is the use of
pesticides. Pesticides are useful tools in agriculture and forestry because they quickly kill
a significant portion of a pest population. Pesticides can be sprayed instantaneously and
regularly, and all available evidence suggests that pesticides pose potential health hazards
not only to livestock and wildlife but also to mammals and to human beings.
Modelling the depletion and conservation of forestry resources (effects of population and
pollution) was studied by Shukla and Dubey [5], and nonlinear models for the survival of two
competing species dependent on a resource in industrial environments were investigated
by Dubey and Hussain [2].
Pollution can depress the growth rates of species. As species do not exist
alone in nature, it is of more biological significance to study the permanence-extinction
threshold of each population for two or more interacting species subjected to a toxicant in
a polluted environment. In an ecosystem there are biological interactions, that is, interactions of
two or more species. Modelling the interaction of two biological species
in a polluted environment was studied by Dubey and Hussain [1] without spreading of
infection to the species. Modelling of populations with the spread of a disease was studied by
Mena-Lorca and Hethcote [3].
We propose a mathematical model, constructed following Sinha et al. [4], combining
three basic models dealing with prey-predator interaction, disease spread,
and the effect of environmental pollution on a single prey species. We consider that the prey
population is affected by an infectious disease. From the model, we found a new equilibrium
point which is locally asymptotically stable.
2. MODEL
Transfer diagram:
dx1/dt = θ − βx1x2 − α1x1y − r1Ux1 − d1x1,    (2.1)
dx2/dt = βx1x2 − α2x2y − r2Ux2 − dx2,    (2.2)
dy/dt = α1x1y + α2x2y − d3y,    (2.3)
dC/dt = Q − αC − δC(x1 + x2),    (2.4)
dU/dt = δC(x1 + x2) − mU.    (2.5)
Initial conditions:
x1(0) = x10 > 0,
x2(0) = x20 > 0,
y(0) = y0 > 0,
U(0) = U0 > 0,
C(0) = C0 > 0,
where
θ : recruitment rate,
α1 : predation rate of susceptible prey,
α2 : predation rate of infected prey,
β : disease contact rate,
d1 : natural death rate of the susceptible prey population,
d3 : natural death rate of the predator population,
d2 : natural death rate of the infected prey population,
h : disease-induced death rate of the infected prey population,
d = (d2 + h) : net death rate of the infected prey population,
m : natural washout rate of the toxicant from the organism,
r1 : the rate at which susceptible prey decrease due to the toxicant,
r2 : the rate at which infected prey decrease due to the toxicant,
δ : uptake rate of the toxicant by the organism,
α : natural depletion rate of the environmental toxicant,
Q : exogenous input rate of the toxicant into the environment.
The following lemma shows that all the solutions of the model are bounded in the region
{(x1, x2, y, C, U) ∈ R^5 : x1, x2, y, C, U ≥ 0} as t tends to infinity.
Lemma 2.1 As t , all the solutions of the Model 2.1-2.5 will lie in the following
region:
where
Q1 C Q M1M 2
m1 , m2 , m3 , m4 min , M1 , M 2 , M 3 .
1 32 2 1 3 1 m
where
1 Q
1 r1 d1 ,
1 1 m1
2 max d1 ,d ,1 ymax ,2 ymax ,rU
1 max ,r2U max ,
x1 2 y rU
1
d 4 x1 m 3 x1 r1 x1C
1
x x1
x1
m x1 12 x1 y 2 x1 12 x1 y Cr1 x1 2 r1 x12C m12 x1 y
m x1 12 x1 y 0
By Routh Hurwitz Theorem, hence:
d r1 Qd3 d Q1 Qd3
E1 3 , 0, 1, , locally asymtotically
1 d3 1m 1 d3 1 1 d3 m 1 d3
stable.∎
Explanation: With appropriate parameter value and any initial value near equilibrium
points, then for a long time, only the infected population that will be extinct.
x1 rU
1
d 1 x1 d3 3 x1 m 2
1
x
x1 m x1 r1 x1 C
x1 x1
x1 m 2 r1 x12C r1 x1 C x1 0
x1
By the Routh-Hurwitz Theorem, hence:
r U d r U d1 Q mU
E3 2 , 1 , 0, ,U locally asymtotically stable.∎
r2 U d
Explanation: With appropriate parameter value and any initial value near equilibrium
points, then for a long time, the infected prey and predator population will be extinct.
1 x1 2 x 2 d3 4 m x1 x 2 3
x1
m
r1 x1 C r2 x 2 C 2 x1 x 2 x1 x 2 m 2
x1 x1
r2 x 2 C
r2 x1 x 2 C r1 x1 C 2 x1 x 2 m
2
x1
m
x 1 x2 r1 x1 C r2 x 2 C 2 x1 x 2
x1
x 1
x 2 r1 x1 C r2 x 2 C x1 x 2 r2 x1 x 2 C
r1 x12 C r2 x 2 C x1 x 2 r2 x1 x 2 C r1 x12 C
r2 x 2 C
2 x1 x 2 m
x1
By Routh Hurwitz Theorem, hence:
r U d r U d1 Q mU
E3 2 , 1 , 0, ,U locally asymtotically stable. ∎
r2 U d
Explanation: With appropriate parameter value and any initial value near equilibrium
points, then for a long time, only the predator population that will be extinct.
V2 a11 x1 ˆx1 a12 x1 xˆ 1 x2 xˆ 2 a13 x1 xˆ 1 y ˆy a15 x1 xˆ 1 U Uˆ
2
a22 x2 ˆx2 a23 x2 ˆx2 y ˆy a25 x2 ˆx2 U Uˆ a33 y ˆy
2 2
a x ˆx C Cˆ a x2 xˆ 2 C Cˆ a55 U Uˆ
2
a44 C Cˆ
2
14 1 1 24
a45 C Cˆ U Uˆ ,
where
a11 x2 1 y rU
1 d1 ,
a12 ˆx1 ˆx2 ,
a13 1 ˆx1 ˆy ,
a C,ˆ
24
ˆ
a15 r1 ˆx1 C,
a22 d 2 y r2U x1 ,
a23 2 ˆx2 ˆy ,
a C, ˆ
24
ˆ
a25 r2 ˆx2 C,
a33 d3 1 x1 2 x2 ,
a44 x1 x2 ,
a45 x1 x2 ,
a55 m .
If
i. 4a122 a11a22 ,
ii. 2a132 a11a33 ,
iii. 3a142 a11a44 ,
iv. 3a152 a11a55 ,
v. 2
2a23 a22 a33 ,
vi. 2
3a24 a22 a44 ,
vii. 2
3a25 a22 a55 ,
viii. 2
3a45 a44 a55 .
then V2 is negative definite.
Hence, ˆ , 1 ˆ1 , 2 ˆ 2 , ˆ, r1 rˆ1 , r2 rˆ2 , Q Qˆ , dan m̂ m ,
where,
̂ is positive root of 11 2 12 13 0 ,
̂1 is positive root of 2112 221 23 0 ,
̂ is positive root of 3112 321 33 0 ,
̂ 2 positive root of 4122 42 2 43 0 ,
1 a d1
1
m x2 a 1 ya rU ˆ
1/ 2
r̂1 C.
3x̂1
ˆ ˆx ˆx 1 d y r U x x x
1/ 2
Q 2 ,
1 2 a 2 a 1b 1a 2a
3
1 m
1/ 2
r̂2 d 2 ya r2U a x1b Cˆ ,
x̂2 3
3 2 x1b x2b
2
m̂ .
x1a x2 a
We consider the following set of parameters from Sinha [4]:
θ = 40, α1 = 0.01, α2 = 0.02, β = 0.04, r1 = 0.005, r2 = 0.001, d1 = 0.01, d = 0.05, d3 = 0.5, Q = 10,
δ = 0.2, α = 0.1, m = 0.3. Then we get the following results (time in years):
Figure 1. Time series of x1(t), x2(t), y(t), C(t), and U(t) for 0 ≤ t ≤ 50.
Figure 2. Time series of the solution for 0 ≤ t ≤ 100.
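As a hedged illustration of how such simulations can be reproduced, the following Python sketch integrates the reconstructed system (2.1)-(2.5) with the parameter set above using scipy; the initial values are our own illustrative choice, since they are not stated in the text.

```python
import numpy as np
from scipy.integrate import solve_ivp

# Parameters as listed above (theta, alpha1, alpha2, beta, r1, r2, d1, d, d3, Q, delta, alpha, m)
theta, a1, a2, beta = 40.0, 0.01, 0.02, 0.04
r1, r2, d1, d, d3 = 0.005, 0.001, 0.01, 0.05, 0.5
Q, delta, alpha, m = 10.0, 0.2, 0.1, 0.3

def rhs(t, s):
    x1, x2, y, C, U = s
    dx1 = theta - beta*x1*x2 - a1*x1*y - r1*U*x1 - d1*x1
    dx2 = beta*x1*x2 - a2*x2*y - r2*U*x2 - d*x2
    dy  = a1*x1*y + a2*x2*y - d3*y
    dC  = Q - alpha*C - delta*C*(x1 + x2)
    dU  = delta*C*(x1 + x2) - m*U
    return [dx1, dx2, dy, dC, dU]

# Illustrative initial values (assumptions, not given explicitly in the text)
sol = solve_ivp(rhs, (0, 50), [70, 10, 20, 5, 5], dense_output=True)
```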
3. CONCLUSION
The prey-predator system in this case has a globally asymptotically stable equilibrium, so that with
appropriate parameter values and for any initial values of the susceptible prey
population, infected prey population, predator population, toxicant concentration in the
environment, and toxicant concentration in the prey population, in the long run the
number of susceptible prey will approach (d3 − α2 x2*)/α1, the number of infected
prey will approach x2*, the number of predators will approach (β x̂1 − r2 Û − d)/α2,
the toxicant concentration in the environment will approach Q/(α + δ(x̂1 + x̂2)),
and the toxicant concentration in the prey population will approach Û.
References
[1] DUBEY, B. AND HUSSAIN, J., Modelling the Interaction of Two Biological Species in a Polluted
Environment, Journal of Mathematical Analysis and Applications 246, 58-79, 1998.
[2] DUBEY, B. AND HUSSAIN, J., Nonlinear Models for The Survival of Two Competing Species Dependent
on Resource in Industrial Environments, Nonlinear Analysis: Real World Applications 4, 21-44, 2001.
[3] MENA-LORCA, J. AND HETHCOTE, W., Dynamic Models of Infectious Diseases as Regulators of
Population Sizes, J.Math Biol. 30: 693-716,1991.
[4] SINHA, S., MISRA, O.P., AND DHAR, J., Study of a Prey-Predator Dynamics Under the Simultaneous
Effect of Toxicant and Disease, J. Nonlinear Sci. Appl.1, no. 2, 102-117, 2008.
[5] SHUKLA, J. B. AND DUBEY, B., Modelling the Depletion and Conservation Forestry Resources: Effects
of Population and Pollution, J. Math. Biol. 36: 71-94, 1997.
LINA ARYATI
Mathematics Department, Gadjah Mada University.
e-mail: [email protected]
ZENITH PURISHA
Mathematics Department, Gadjah Mada University.
e-mail: [email protected]
Proceedings of ”The 6th SEAMS-UGM Conference 2011”
Applied Mathematics, pp. 459 - 470.
1. INTRODUCTION
A snakeboard is a modified version of a skateboard in which the front and back pairs
of wheels are independently actuated. The extra degree of freedom enables the rider to
generate forward motion by twisting the body back and forth, while simultaneously
moving the wheels with the proper phase relationship. As with skateboarding, a snakeboard
can also be used on flat or curved surfaces.
The motion of the snakeboard was first investigated in detail by Lewis et al. [1]. That work
discussed geometrically the nonholonomic constraints, the dynamics, the control, and the
forces of the snakeboard. After that, research on the snakeboard was developed by Ostrowski et al. [2].
ẋ = J(x) ∂H/∂x (x) + b(x)u,    (1)
y = b^T(x) ∂H/∂x (x).    (2)
Here, x ∈ X is the state of the system, with the state space X being a smooth manifold, and
J(x) : T_x*M → T_xM is a skew-symmetric vector bundle map. The map has locally the matrix
representation
J(x) = [ 0     I_n ]
       [ -I_n  0   ].    (3)
The input vector fields are of the form b(q) = (0, B^T(q))^T. The Hamiltonian is a smooth
function on X representing the total energy of the system. The state space of the
system is X = T*Q, the cotangent bundle of the n-dimensional configuration
manifold Q. Local coordinates for T*Q are denoted by (q, p) with q ∈ Q. The kinetic
energy is defined by a Riemannian metric g on Q and defines a quadratic form in p, i.e.,
⟨p, p⟩_{g^{-1}(q)} = g^{ij}(q) p_i p_j. The Hamiltonian is then given by
H(q, p) = (1/2) ⟨p, p⟩_{g^{-1}(q)} + V(q).    (4)
The constraints can be described as
[ q̇^i ]   [ 0   1 ] [ ∂H/∂q^i ]   [ 0          ]       [ 0         ]
[ ṗ_i ] = [ -1  0 ] [ ∂H/∂p_i ] + [ ω_i^j(q)  ] λ_j + [ B_i^l(q) ] u_l,    (6)
y^l = B_i^l(q) ∂H/∂p_i,    (7)
0 = ω_i^j(q) ∂H/∂p_i   for j = 1, ..., k.    (8)
Here, λ_j are functions of both q and p, called Lagrange multipliers, B_i^l(q) are input
vector fields, and u_l ∈ R^2. Because the constraints are satisfied at all times,
equations (6), (7), and (8) are called a constrained Port-Controlled Hamiltonian System.
The snakeboard is a modified version of a skateboard in which the front and back
pairs of wheels are independently actuated. The extra degree of freedom enables the rider to
generate forward motion by twisting the body back and forth, while simultaneously moving
the wheels with the proper phase relationship. The configuration manifold S^2 × T^3 is the
configuration space for a snakeboard played on the curved surface (the internal surface of a sphere).
Let {Θ, Φ, θ} represent the position and orientation of the center of the board, ψ the
relative angle between the main body and the rotor, and φ the relative angle between the
main body and the back wheel. The distance between the center of the board and the wheels is l.
The inertia tensor of the snakeboard is
M = [ mR^2   0               0            0     0   ]
    [ 0      mR^2 sin^2 Θ    0            0     0   ]
    [ 0      0               ml^2 + J_t   J_t   0   ]
    [ 0      0               J_t          J_t   0   ]
    [ 0      0               0            0     J_w ].    (9)
If the energy of the snakeboard played on the arena in the form of the internal surface of a
sphere is small enough, the above constraints can be implemented. Therefore, the
constraints of the snakeboard played on the internal surface of the sphere can be written in the
spherical coordinate system as one-forms
P
mR 2
P
mR sin
2 2
1
2 P P
2
(15)
2 ml J t
2
ml P
2
P
and the output variables ml J t J t ml 2 J t
P
Jw
P 1 R cos sin 2 R cos sin
P 1 R sin cos 2 R sin cos
P l cos l cos
1 2
P U
P U
P P ml 2 (16)
y1
ml 2 J t J t ml 2 J t
(17)
P
y 2
Jw
While the constraints are given by
P P 1 2 P P
0 cos sin cos 2 l cos (18)
mR 2
mR sin
2 2
2 ml J t
P P 1 2 P P (19)
0 2 cos sin cos l cos
mR mR 2 sin 2 2 ml 2 J t
In the above equations of motion the constraint forces still appear, and the constraints impose
no restrictions in the momentum phase space. The PCHS method requires a
basis consisting of vector fields which annihilate all of the constraint one-forms and
diagonalize the inertia metric of the system (snakeboard). The calculation, however, is not
simple, and not every case gives an analytic answer.
The Christoffel symbols {Γ^i_{jk} : i, j, k ∈ {1, ..., n}} of the inertia tensor M are defined by
Γ^i_{jk} = (1/2) M^{li} ( ∂M_{lj}/∂q^k + ∂M_{lk}/∂q^j − ∂M_{kj}/∂q^l ),    (20)
where M^{li} are the components of M^{-1} (the summation convention is assumed throughout
the paper). All relevant quantities are assumed to be smooth. In coordinates the equations
of motion are
q̈^k + Γ^k_{ij} q̇^i q̇^j = Σ_{a=1}^{m} M^{kj} (F_a)_j u^a,    (21)
where X^i and Y^i are the i-th components of X and Y. The operator ∇ is called an affine
connection and is determined by the functions Γ^i_{jk}.
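To illustrate the computation in equation (20), the following sympy sketch (ours; it uses only the spherical 2 x 2 block diag(mR^2, mR^2 sin^2 Θ) of the inertia tensor above as a toy metric) evaluates one Christoffel symbol symbolically.

```python
import sympy as sp

Theta, Phi = sp.symbols('Theta Phi')
m, R = sp.symbols('m R', positive=True)
q = [Theta, Phi]

# Spherical block of the inertia tensor: diag(m R^2, m R^2 sin^2 Theta)
M = sp.diag(m*R**2, m*R**2*sp.sin(Theta)**2)
Minv = M.inv()

def christoffel(i, j, k):
    """Gamma^i_{jk} = (1/2) M^{li} (dM_lj/dq^k + dM_lk/dq^j - dM_kj/dq^l), cf. eq. (20)."""
    return sp.simplify(sum(
        sp.Rational(1, 2) * Minv[l, i] *
        (sp.diff(M[l, j], q[k]) + sp.diff(M[l, k], q[j]) - sp.diff(M[k, j], q[l]))
        for l in range(2)))

# Example: Gamma^Theta_{Phi Phi} = -sin(Theta) cos(Theta)
print(christoffel(0, 1, 1))
```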
Let L_X f be the Lie derivative of a scalar function f with respect to the vector field X.
Given a scalar function f, its gradient grad f is the unique vector field defined implicitly by
⟨grad f, X⟩ = L_X f.    (23)
∇_{q̇} q̇ = Σ_{a=1}^{m} M^{-1} F_a u^a    (24)
can be written as
∇_{q̇} q̇ = Σ_{a=1}^{m} P(M^{-1} F_a) u^a,    (25)
Y Y P (Y )
(26)
X X X
Y P Y
(27)
X X
Now let us state the main theorem [5], without proof, on which our study is based.
Theorem: Let {X_1, ..., X_{n-p}} be an orthogonal basis of vector fields for D. The
generalized Christoffel symbols of the constrained connection are
Γ̃^k_{ij} = (1 / ||X_k||^2) ⟨∇_{X_i} X_j, X_k⟩,    (28)
where v_i are the components of q̇ along {X_1, ..., X_{n-p}}, i.e., q̇ = v_i X_i, and where the
coefficients of the control forces are
Y^k_a = (1 / ||X_k||^2) ⟨F_a, X_k⟩.    (30)
Furthermore, if the control forces are differentials of functions, that is, if F_a = dφ_a for
some a ∈ {1, ..., m}, then
Y^k_a = (1 / ||X_k||^2) L_{X_k} φ_a.    (31)
The number of Christoffel symbols calculated according to equation (8) is 125, which
is not a small number. Fortunately, there are only three symbols which do not vanish.
They are explicitly given by
(32)
and
(33)
Furthermore, we must find an orthogonal basis spanning D, i.e., annihilating all of the
constraint one-forms. Initially, we presume a vector field X_1 of the form
X_1 = v_1 ∂/∂Θ + v_2 ∂/∂Φ + v_3 ∂/∂θ + v_4 ∂/∂ψ + v_5 ∂/∂φ.    (34)
This vector field must eliminate the constraint one-forms ω^1 and ω^2. So the vector field
of the snakeboard is
l sin l cos R cos
X1 ,
cos sin cos (35)
X '2 ,
' (36)
X '3 .
'
(37)
Note that the field X'_3 is perpendicular to both X_1 and X'_2. A direct way of computing an
orthogonal basis {X_1, X_2, X_3} from the basis {X_1, X'_2, X'_3} is to define
X_2 = X'_2 − ( ⟨X'_2, X_1⟩ / ⟨X_1, X_1⟩ ) X_1    (38)
so that
1
f8 , , , (3(cos 2 sin ( cos 2 R(3sin 2 cos 2 cos 2 ) sin l
3
(40)
1
sin cos sin ) cos3 cos 4 cos 2 sin l (3sin 2
3
3
1
cos cos ) cos sin ( cos 2 R(sin 2 cos 2
2 2 2
3
cos 2 )(cos 4 cos 2 sin 2 cos 2 2lf1 , , , ) sin
1 2
cos sin l (cos 4 cos 2 sin 2 cos 2 lf1 , , , )
3 3
1
sin ) cos cos cos l sin (sin 2 cos 2
3 2 2
3
cos 2 )(cos 4 cos 2 sin 2 cos 2 2lf1 , , , )) cos 2
cos 2 J t f1 , , , cos 2 )
f 9 , , , (cos J t (3cos 2 cos 2 2lf1 , , , sin 2
cos 2 cos 4 cos 2 ) f1 , , , cos (( R cos 2 (sin 2 cos 2
cos 2 ) sin l sin 3 cos sin ) sin cos l cos 2
sin cos 2 (sin 2 cos 2 cos 2 )))
f10 , , , ( f1 , , , J t cos ( cos3 cos 4 ) sin R
sin cos 2 cos 4 sin cos 2 l sin l (sin cos 2
sin 3 Rf1 , , , cos 2 sin ) cos sin
cos 2 l 2 f1 , , , cos 2 )
f11 , , , (l 3 sin f1 , , , cos Rm( cos 2 cos 2 (1
cos 2 ) cos 4 lf1 , , , ))
f12 , , , ( J t cos 2 cos ( cos 2 2 cos 2 cos 2 sin 2 ) sin ( cos 4
cos 2 cos 2 cos 2 sin 2 cos 2 2lf1 , , , )
cos 2 f1 , , , )
According to equation (6), the only non-vanishing Christoffel symbols are
f 3 , , ,
111
cos 2 cos sin f1 , , ,
f 4 , , ,
122
m 2 l 6 sin f1 , , ,
f 5 , , ,
112
ml R sin f1 , , ,
3 3
f 6 , , ,
121
ml R sin f1 , , ,
3 2
f 7 , , ,
112
cos 2 cos 2 sin f 2 , , ,
f8 , , ,
222
ml R sin f1 , , , f 2 , , ,
3 3
f 9 , , ,
122
sin f1 , , , f 2 , , ,
2
f10 , , ,
221
sin f1 , , , f 2 , , ,
f11 , , ,
31
2
cos 2 f 2 , , ,
f12 , , ,
32
2
f1 , , , f 2 , , ,
2
(41)
We also compute the three norms
(42)
(43)
(44)
The next step is to calculate all of the general external forces acting on the snakeboard, by
using the following equation
(45)
Finally, in our coordinate system, the kinematic equations are given by
J cos 2
cos 2
t
sin ml 3 f1 , , ,
v 111vv 122
12
1
v 121 v 131v 132
g tan sin
U
lRf1 , , ,
2 2
(48)
112 vv 22
2
12
2
v 21
2
v 31 v 32
l 2 mf1 , , , gJ t sin cos cos cos sin l 2 f1 , , ,
U
J t f 2 , , ,
1
U
Jw
The dynamics and kinematics of the snakeboard obtained by making use of the method of the
constrained Levi-Civita connection do not contain Lagrange multipliers expressing
constraint forces. The forces are therefore hidden by the method, according to the Lemma.
The nonholonomic constraints of the snakeboard do not restrict the direction of the motion of the
snakeboard, as shown by the kinematic equations, while all the forces that appear in
the dynamic equations of the snakeboard are represented by the right-hand side of equation (12).
Therefore, the equations of motion of the snakeboard obtained by making use of the
method of the constrained Levi-Civita connection are well defined.
3. CONCLUDING REMARK
References
[1]. LEWIS, A., OSTROWSKI, J., MURRAY, R. AND BURDICK, J., Nonholonomic Mechanics and Locomotion: the Snakeboard Example, In Proceedings of the IEEE Conference on Robotics and Automation, 3, 2391-2400, San Diego, CA, USA, 1994.
[2]. OSTROWSKI, J. P., BURDICK, J. W., LEWIS, A. D. AND MURRAY, R. M., The Mechanics of Undulatory Locomotion: The Mixed Kinematic and Dynamic Case, In IEEE International Conference on Robotics and Automation, 2, 1945-1951, Nagoya, Japan, 1995.
[3]. KOON, W. S. AND MARSDEN, J. E., Optimal Control for Holonomic and Nonholonomic Mechanical Systems with Symmetry and Lagrangian Reduction, SIAM J. of Control and Optimization, 35, 901-929, 1997.
[4]. BLANKENSTEIN, G., Symmetries and Locomotion of a 2D Mechanical Network: The Snakeboard, European Sponsored Project GeoPlex, IST-2001-34166, 2001.
[5]. BULLO, F. AND ZEFRAN, M., On Mechanical Control Systems with Nonholonomic Constraints and Symmetries, Systems and Control Letters, 45(1), 133-143, 2001.
[6]. KRISHNAPRASAD, P. S. AND TSAKIRIS, D. P., 1998, Oscillations, SE(2)-Snakes and Motion Control: Study of the Roller Racer, Center for Dynamics and Control of Smart Structures (CDCSS), Technical Report, University of Maryland, College Park.
[7]. DUINDAM, V. AND STRAMIGIOLI, S., Energy-Based Model-Reduction and Control of Nonholonomic Mechanical Systems, In Proceedings of the IEEE International Conference on Robotics and Automation, 4584-4589, New Orleans, LA, 2004.
Muharani Asnal
Kelompok Penelitian Kosmologi, Astrofisika dan Fisika Matematik (KAM),
Laboratorium Fisika Atom dan Inti, Universitas Gadjah Mada Yogyakarta.
e-mail: [email protected]
Abstract. In this paper we investigate the safety analysis, or reachability, of timed automata hybrid
systems as an extension of the safety analysis of linear systems. The safety verification problem of
linear systems with complex eigenvalues is converted to an emptiness problem for a semi-algebraic
set. The sum of squares (SOS) decomposition is used to check the emptiness of the set.
Suppose we are given a set of initial states at mode i and a set of final states at mode i+1. The safety
analysis is carried out by determining the solution of the differential system at mode i and evaluating
it at the time the transition occurs. This solution then becomes the initial state at mode i+1, which we
analyze using the safety analysis of linear systems.
Keywords: hybrid systems, sum of squares (SOS), safety, reachability, complex eigenvalues.
1. INTRODUCTION
Hybrid systems are used for modeling and analyzing systems which have interacting
continuous-valued and discrete-valued state variables. The continuous state variable may be
the value of the state in continuous time, in discrete time, or a mixture of the two.
A method which provides safety certificates in realistic computation time has been
developed for linear systems with polyhedral sets of initial and final states using geometric
programming in Yazarel and Pappas [6]. The extension of this method to timed automata hybrid systems
has been carried out using a geometric programming approach in Megawati et al. [8].
In Yazarel et al. [7], the SOS method is used to analyze safety problems for systems with a certain
eigenstructure. The safety verification problem of a linear system with a certain eigenstructure is
converted to an emptiness problem for a semi-algebraic set. The sum of squares (SOS)
decomposition is used to check the emptiness of the set.
The safety analysis of linear systems with sum of squares (SOS) determines the
feasibility of a Boolean combination of sets defined by polynomial equalities and inequalities. The
polynomial equality describes the trajectory of the system, and the polynomial inequalities
describe the regions of initial and final states. The tool used to check the emptiness of the set
of polynomial equalities and inequalities is the Positivstellensatz Theorem.
Following [7], in this paper we discuss the safety predicate for timed automata hybrid
systems, especially for the case of two-mode hybrid systems with purely complex eigenvalues.
2. SUM OF SQUARES
The safety analysis of linear systems with sum of squares (SOS) determines the feasibility
of a Boolean combination of sets defined by polynomial equalities and inequalities. The polynomial
equality describes the trajectory of the system, and the polynomial inequalities describe the
regions of initial and final states. The tool used to check the emptiness of the set of polynomial
equalities and inequalities is the Positivstellensatz theorem, which is discussed in [1].
Definition 2.1. The monomial m_α associated to the n-tuple α = (α_1, ..., α_n) has the form
m_α(x) = x^α = x_1^{α_1} x_2^{α_2} ... x_n^{α_n},
where α ∈ Z_+^n.
where c R.
Definition 2.3. A multivariate polynomial f(x) is a sum of squares (SOS) if there exist
polynomials p_i(x), i = 1, 2, ..., m, such that
f(x) = Σ_{i=1}^{m} p_i^2(x).
The condition that f(x) = Σ_{i=1}^{m} p_i^2(x) is equivalent to the existence of a positive
semidefinite matrix Q such that f(x) = Z(x)^T Q Z(x), where Z(x) is a vector of monomials.
The ideal I(p_i) generated by the p_i(x) is
I(p_i) = { Σ_i a_i p_i : a_i are polynomials for all i }.
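For a small worked illustration (ours, not taken from the paper): the polynomial f(x, y) = x^2 − 2xy + 2y^2 + 1 is SOS, since f(x, y) = (x − y)^2 + y^2 + 1^2 with p_1 = x − y, p_2 = y, p_3 = 1. Equivalently, f(x, y) = Z(x)^T Q Z(x) with Z(x) = (x, y, 1)^T and
Q = [ 1  -1  0 ; -1  2  0 ; 0  0  1 ],
which is positive semidefinite (its leading principal minors are 1, 1, and 1).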
P(p_i) = { a + Σ_{j=1}^{k} b_j q_j : a, b_j are sums of squares, q_j ∈ M(p_i) for j = 1, ..., k }.
The above definitions will be used in the Positivstellensatz lemma. The following lemma from
[7] provides a characterization of infeasibility certificates for real solutions of systems of
polynomial equalities and inequalities.
Lemma 2.7. Let {f_j}, {h_l} be finite sets of polynomials in x. Then the following statements are
equivalent.
1. The following set is empty:
{ x ∈ R^n : f_j(x) ≥ 0, h_l(x) = 0, ∀ j, l }.    (1)
X_f = { x_f ∈ R^n : ∧_{i=m+1}^{m+k} p_i(x_f) ≥ 0 },    (6)
Definition 3.1. Given a set of initial states X_0, the forward reachable set Post(A, X_0) and
the backward reachable set Pre(A, X_0) of the linear system (3) are defined as
Post(A, X_0) = { x_f ∈ R^n : ∃ t, x_0 such that t ≥ 0, x_0 ∈ X_0, x_f = e^{At} x_0 },    (7)
Pre(A, X_0) = { x_f ∈ R^n : ∃ t, x_0 such that t ≥ 0, x_0 ∈ X_0, x_f = e^{-At} x_0 }.    (8)
The forward and backward safety predicates of linear systems are defined in the following
definition.
Definition 3.2. Given a set of final or unsafe states X_f, the forward safety predicate is
Safe^+(A, X_0, X_f) = 1 if Post(A, X_0) ∩ X_f = ∅, and 0 otherwise,    (9)
and the backward safety predicate is
Safe^-(A, X_0, X_f) = 1 if Pre(A, X_0) ∩ X_f = ∅, and 0 otherwise.    (10)
and Safe^+(A, X_0, X_f) = 1.
Λ = blockdiag( [ 0  ω_1 ; -ω_1  0 ], ..., [ 0  ω_m ; -ω_m  0 ] ),    (12)
where the eigenvalues of Λ are ±iω_i, ω_i ∈ Q. The differential equation in each 2-dimensional
subspace takes the form
[ ż_{2i-1} ; ż_{2i} ] = [ 0  ω_i ; -ω_i  0 ] [ z_{2i-1} ; z_{2i} ],   i = 1, ..., m.    (13)
Equation (13) has the following solution:
z_{2i-1}(t) = cos(ω_i t) z_{0,2i-1} + sin(ω_i t) z_{0,2i},
z_{2i}(t) = -sin(ω_i t) z_{0,2i-1} + cos(ω_i t) z_{0,2i}.    (14)
Let the sets of initial and final (unsafe) states be given in the eigenspace, where x_0 and x_f are
states of the sets X_0 and X_f defined in (5) and (6). Since T is invertible, Z_0 and Z_f become
Z_0 = { z_0 ∈ R^n : ∧_{i=1}^{m} p_{zi}(z_0) ≥ 0 },
Z_f = { z_f ∈ R^n : ∧_{i=1}^{m} p_{zi}(z_f) ≥ 0 },
where the p_{zi}(z) are polynomials with rational coefficients. The safety analysis for linear systems
in [7] with purely imaginary eigenvalues is given in the following theorem.
Theorem 4.1. Given a linear system (A, X_0, X_f) where A is a diagonalizable matrix with
purely imaginary eigenvalues, the following statements are equivalent.
1. Safe^+(A, X_0, X_f) = 1.
2. Safe^+(Λ, Z_0, Z_f) = 1.
3. The following set, defined by polynomial equalities and inequalities for the system in modal
coordinates, is empty:
z_{f,2i} + y_i z_{0,2i-1} − w_i z_{0,2i} = 0,   i = 1, ..., m,
z_{f,2i-1} − w_i z_{0,2i-1} − y_i z_{0,2i} = 0,   i = 1, ..., m,
w_i − f_i(w, y) = 0,   i = 1, ..., m,
y_i − g_i(w, y) = 0,   i = 1, ..., m,    (15)
w^2 + y^2 − 1 = 0,
p_{zi}(z_0) ≥ 0,   i = 1, ..., m,    (16)
p_{zi}(z_f) ≥ 0,   i = 1, ..., m,    (17)
where ω_i ∈ R, n_i ∈ Z, d_i ∈ Z, d = Π_{i=1}^{n} d_i, d ∈ Z, s_i = ω_i d, c = gcd(s_1, s_2, ..., s_n),
gcd denotes the greatest common divisor, and f(x, y) and g(x, y) are polynomial functions defined by
cos(c t) = f(cos t, sin t),    (18)
sin(c t) = g(cos t, sin t).    (19)
In this part, we discuss the safety predicate of timed automata hybrid systems,
especially for the case of two-mode hybrid systems.
Let us start with the system at mode 1 (q_1), with dynamic equation ẋ = A_1 x. Assume that at time t the
mode jumps from mode 1 to mode 2 (q_2) with new dynamic equation ẋ = A_2 x, where
x(t) ∈ R^n is the state at time t and A_1, A_2 ∈ R^{n×n} are the system matrices. The sets of initial states
(q_1, X_0) and final states (q_2, X_f) are defined as follows:
X_0 = { x_0 ∈ R^n : ∧_{i=1}^{m} p_i(x_0) ≥ 0 },
X_f = { x_f ∈ R^n : ∧_{i=m+1}^{m+k} p_i(x_f) ≥ 0 },
where the p_i(x) are polynomials with rational coefficients. Following [2], we define the reachability
set of a timed automata hybrid system.
Definition 5.1. A state (q̂, x̂) is reachable if there exists a finite execution (τ, q, x), where
τ = {[τ_i, τ_i']}_{i=0}^{N}, with (q_N(τ_N'), x_N(τ_N')) = (q̂, x̂). All reachable states are collected in
the reachable set.
If the unsafe set of the timed automata hybrid system is denoted by X_f, the safety predicate of the
timed automata hybrid system is defined as follows.
Definition 5.2. If X_f at mode q_2 is the final or unsafe state set, then the forward safety
predicate is defined as
Safe^+(A, (q_1, X_0), (q_2, X_f)) = 1 if the unsafe state is unreachable, and 0 otherwise.
The problem in the safety analysis of a timed automata hybrid system is: given the two sets of
initial and final (unsafe) states, determine whether the state trajectory from an initial state can
reach the final state, or equivalently whether Safe^+(A, (q_1, X_0), (q_2, X_f)) = 1.
From Definition 5.1, the following steps are used to evaluate the safety predicate of the
timed automata hybrid system.
1. Determine the solution of the differential equation at mode 1 (q_1) with x_0 ∈ X_0. Find the
state value x(T), where T denotes the time of occurrence of the discrete transition. The state
x(T) becomes the initial state of mode 2 (q_2), denoted by x̂_0 = x(T).
2. Determine the safety analysis using the same steps, with the initial state of mode 2 (q_2)
being x̂_0 = x(T).
Example: Given a hybrid automaton with time to change mode T = 2 and the system matrices
A_1 = [ -2  2 ; 1  -3 ],   A_2 = [ 0  1 ; -9  0 ].
The set of initial states of mode q_1 is 3(x_{01} − 1)^2 + (x_{02} − 2)^2 − 1 ≤ 0, and the final or
unsafe state set of mode q_2 is as given in (22) below. The state at the mode change is
x(2) = e^{2A_1} x_0
     = [ (2/3)e^{-2} + (1/3)e^{-8}   (2/3)e^{-2} − (2/3)e^{-8} ; (1/3)e^{-2} − (1/3)e^{-8}   (1/3)e^{-2} + (2/3)e^{-8} ] x_0
     ≈ [ 0.0903  0.0900 ; 0.0450  0.0453 ] x_0.
Next, we analyze the safety predicate at mode q_2. The initial state of mode q_2 is
x̂_0 = x(2) ≈ [ 0.0903  0.0900 ; 0.0450  0.0453 ] x_0,   where x_0 ∈ X_0.
The eigenvalues of the matrix A_2 are λ_1 = 3i and λ_2 = -3i.
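The mode-1 flow map above can be checked numerically; the following Python sketch (ours) uses scipy.linalg.expm, with the matrix signs taken from the reconstruction above (an assumption consistent with the stated eigenvalues).

```python
import numpy as np
from scipy.linalg import expm

# System matrices as reconstructed (signs inferred from the stated eigen-structure)
A1 = np.array([[-2.0, 2.0], [1.0, -3.0]])
A2 = np.array([[0.0, 1.0], [-9.0, 0.0]])
T = 2.0

Phi1 = expm(A1 * T)              # flow map of mode q1 over [0, T]
print(Phi1)                      # approx [[0.0903, 0.0900], [0.0450, 0.0453]]

# The state entering mode q2 is x_hat0 = Phi1 @ x0 for any x0 in X0,
# e.g. the centre of the initial ellipse, x0 = (1, 2):
x_hat0 = Phi1 @ np.array([1.0, 2.0])

# Eigenvalues of A2 are purely imaginary (+-3i), as used in Theorem 4.1
print(np.linalg.eigvals(A2))
```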
h_1 : z_{f1} − w z_{01} − y z_{02} = 0,
h_2 : z_{f2} + y z_{01} − w z_{02} = 0,    (20)
h_3 : w^2 + y^2 − 1 = 0,
f_1 : (ẑ_{01} − 1)^2 + (ẑ_{02} − 2)^2 − 1 ≤ 0,    (21)
f_2 : (z_{f1} − 4)^2 + (z_{f2} − 4)^2 − 1 ≤ 0.    (22)
The SOSTOOLS test returns Safe^+(A, (q_1, X_0), (q_2, X_f)) = 1. It can be concluded that if
x_0 ∈ X_0 is selected at mode q_1, then x_f ∉ X_f, i.e., the system is safe.
6. CONCLUSION
In this paper, the safety analysis of linear systems with purely complex eigenvalues using
SOS has been extended to analyze the safety of a timed automata hybrid system with two
modes. We assumed that the system matrices of each mode of the timed automata hybrid system
are diagonalizable. The safety verification for the hybrid timed automaton proceeds as follows:
given the two sets of initial and final (unsafe) states, find the solution of the differential equation
of mode 1 with x_0 ∈ X_0 and determine the state x(T) at the mode-change time T. Take this
terminal state as the initial state of the next mode and then carry out the safety analysis in
mode 2 as in the safety analysis of linear systems.
References
[1] BOCHNAK, J., COSTE, M., AND ROY, M.F., 1998, "Real Algebraic Geometry", Springer-Verlag, Berlin.
[2] BEMPORAD, A., DE SCHUTTER, B. AND HEEMELS, W.M.P.H., 2003, "Modelling and Control of Hybrid Systems", Lecture Notes of the DISC Course.
[3] BRANICKY, M.S., BORKAR, V.S. AND MITTER, S.K., "A Unified Framework for Hybrid Control: Model and Optimal Control", IEEE Trans. on Automatic Control, Vol. 43, pp. 31-45, 1998.
[4] YAZAREL, H. AND PAPPAS, G.J., "Geometric Programming Relaxations for Linear System Reachability", Proceedings of the American Control Conference, pp. 553-559, 2004.
[5] YAZAREL, H., PRAJNA, S., AND PAPPAS, G.J., "SOS for Safety", Proceedings of the 43rd IEEE Conference on Decision and Control, Vol. 1, pp. 461-466, 2004.
[6] MEGAWATI, N.Y., SUTARTO, H.Y., SALMAH, SUPARWANTO, A., WIJAYANTI, I.E., SOLIKHATUN, BUDIYONO, A., JOELIANTO, E., 2009, "Safety Analysis of a Class of Timed Automata Hybrid Systems", Proceedings of the International Conference on Instrumentation, Control and Automation, ICA, 20-22 October 2009, pp. 277-282.
Salmah
Department of Mathematics, Gadjah Mada University, Yogyakarta, Indonesia
e-mail :[email protected]
Abstract. Various viruses have attacked humans and caused diseases. In the human
body, a virus requires a host cell in order to reproduce and live on. The existence of virus
particles activates the immune system, in which CTLs and antibodies play a role. In this
paper, we present the virus dynamics with CTL and antibody responses. The global
stability of the equilibrium points of the virus dynamics models with CTL and antibody responses
is explored by using appropriate Lyapunov functions. We derive the basic reproduction
number of the virus R0 and the immune response reproduction numbers R1 and R2 for the
virus infection model. The global dynamics of the model are completely determined by
the value of R0: if R0 < 1 then the virus-free equilibrium is globally asymptotically stable,
and if R0 > 1 there is a unique endemic equilibrium which takes over this property. In
addition, we show that the CTL and antibody responses have an important role in controlling
the density of free virus particles and of infected cells.
Keywords and Phrases: Virus, CTL and antibody immune responses, reproduction number, global stability, Lyapunov function.
1. INTRODUCTION
A virus is one of the intracellular pathogens which disturb cell growth. For
reproduction, the virus has to enter a cell and use the cell's metabolic machinery. Each
virus has an affinity for a particular type of cell. For example, the HIV virus recognizes
CD4+ T cells, white blood cells also known as helper T cells. A virus particle
encounters a noninfected cell, which becomes an infected cell. The infected cells then produce
more virus over time. In the course of a viral infection, the host cell is changed or even
killed. On the other hand, virus particles activate the immune system. The role of the
immune system is to fight off invasion by foreign pathogens such as viruses. Note that
the immune response after a viral infection is universal and necessary to eliminate or
control the disease. During the process of viral infection, the host response is induced,
which is initially rapid and nonspecific (natural killer cells, macrophage cells, etc.) and later
specific (CTLs, antibodies). However, in most virus infections, CTLs and antibodies play
a critical part in antiviral defense.
In fact, the specific immune system has three major branches. Two of them are
mainly effector responses; that is, they directly fight the virus. The third branch is
mainly a regulatory response that helps the effector responses to become established.
The two effector responses are CTL and antibody. There is a salient distinction between
their respective roles. That is, while antibodies attach to the free virus and neutralize
it, CTL identify and destroy infected cells. In addition, CTL can secrete substances
that trigger a reaction inside the infected cells that prevents viral genome from being
expressed. The helping branch of the immune system are the so called CD4+ T helper
cells. They help the induction of antibody and CTL responses.
In their books, Nowak and May [5] and Wodarz [10] have developed several
mathematical models (systems of differential equations) describing the dynamics of
viruses and the responsiveness of the immune system. They discussed the equilibrium points of the
models and the basic reproduction number of the virus. Furthermore, Korobeinikov [2]
analyzed the global properties of the basic virus dynamics model, whereas Kurdhi and
Aryati [3] and Pruss et al. [8] analyzed the global stability of equilibrium points of model
for virus dynamics with cytotoxic T lymphocyte (CTL). Moreover, Yousfy et al. [9] and
Wodarz [10] analyzed the model of virus dynamics with CTL and antibody responses.
However, they did not consider the effects of the immune responses in controlling the virus
infection and the interaction between CTL and antibody when they work together as parts
of the immune system. In this paper, we analyze the global stability of the model
for virus dynamics with CTL and antibody responses by using appropriate Lyapunov
functions. In the global stability theorems, we consider the values of R0, R1, and R2
as the basic reproduction numbers for the virus, CTL, and antibody responses, respectively. In
particular, the role of immune responses and interaction between CTL and antibody
can be described through equilibrium points of each models and their stability.
The paper is organized as follows. In Section 2 we formulate the virus dynamics
models with and without CTL. In Section 3 we formulate and analyze the global stability
of the virus dynamics model with CTL and antibody. The numerical simulation of the
models is presented in Section 4. Finally, the conclusions are summarized in Section 5.
viruses are mZ, µI, and cV, respectively. Thus, the average respective life times of non-infected
cells, infected cells, and free virus particles are 1/m, 1/µ, and 1/c, respectively. The
interaction among these population variables is described by a system of differential
equations (Nowak and May [5]; Wodarz [10])
Ż = α − mZ − rV Z, t ≥ 0,
I˙ = rV Z − µI, t ≥ 0,
(1)
V̇ = kI − cV, t ≥ 0,
Z(0) = Z0 , I(0) = I0 , V (0) = V0 ,
with given constants α, m, r, µ, k, c > 0 and initial values Z0 , I0 , V0 ≥ 0. The basic
reproduction number
R0 = αrk / (mµc)    (2)
is the average number of newly infected cells produced by a single infected cell at the
beginning of the infection when almost all cells are still noninfected. Persistence or
extinction of an infection depends on the quantity of R0 . To describe it, we first note
that the system (1) has two equilibrium points:
Q̄ = (Z̄, Ī, V̄) = (α/m, 0, 0),    (3)
Q* = (Z*, I*, V*) = ( α/(mR0), mc(R0 − 1)/(rk), m(R0 − 1)/r ).    (4)
The first one is the virus-free equilibrium point, which is always positive, whereas the second
one is the endemic equilibrium point, which has positive value if and only if R0 > 1. In fact, it
can be proved that if R0 < 1 then the equilibrium Q̄ is globally asymptotically stable. If
R0 > 1 then Q̄ becomes unstable, whereas the equilibrium Q* is globally asymptotically
stable (see Korobeinikov [2, Thm. 1.1]).
One of the important roles in the immune response is played by CTLs. Their number
is denoted by T. Following Nowak and May [5] and Pruss et al. [8], we assume that
infected cells I are destroyed at a rate sIT by CTLs and that the CTL proliferation rate dIT is
proportional to the abundance of infected cells and CTLs. Hence, model (1) can be
modified to
Ż = α − mZ − rV Z, t ≥ 0,
I˙ = rV Z − µI − sIT, t ≥ 0,
V̇ = kI − cV, t ≥ 0, (5)
Ṫ = dIT − nT, t ≥ 0,
Z(0) = Z0 , I(0) = I0 , V (0) = V0 , T (0) = T0 ,
with given constants α, m, r, µ, k, c, s, d, n > 0 and initial values Z0, I0, V0, T0 ≥ 0.
In this model, the interaction between infected cells and CTLs is very similar to the
dynamics between predator and prey in ecology: the CTLs are predators that grow on
and kill their prey (the infected cells).
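As a hedged illustration of this predator-prey analogy, the following Python sketch integrates model (5) and evaluates R0 from (2); all parameter and initial values are our own illustrative assumptions, not taken from the paper.

```python
import numpy as np
from scipy.integrate import solve_ivp

# Illustrative parameter values (assumptions, not from the paper)
alpha, m, r, mu, k, c = 10.0, 0.1, 0.001, 0.5, 50.0, 3.0
s, d, n = 0.01, 0.005, 0.1

R0 = alpha * r * k / (m * mu * c)        # basic reproduction number, eq. (2)

def rhs(t, state):
    Z, I, V, T = state
    return [alpha - m*Z - r*V*Z,         # non-infected cells
            r*V*Z - mu*I - s*I*T,        # infected cells, killed by CTLs
            k*I - c*V,                   # free virus
            d*I*T - n*T]                 # CTL population

sol = solve_ivp(rhs, (0, 200), [alpha/m, 1.0, 1.0, 1.0])
print(R0, sol.y[:, -1])                  # R0 and the state at t = 200
```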
3.2. Analysis of the model. In this subsection, we study the global asymptotic
stability of model (10). This model reduces to (5) if A0 = 0. We obtain three
equilibrium points
Ē = (Z̄, Ī, V̄, T̄, Ā) = (α/m, 0, 0, 0, 0),    (12)
E* = (Z*, I*, V*, T*, A*) = ( µc/(rk), mc(R0 − 1)/(rk), m(R0 − 1)/r, 0, 0 ),    (13)
Ê = (Ẑ, Î, V̂, T̂, Â) = ( αdc/(mdc + rkn), n/d, kn/(dc), αdrk/(s(mdc + rkn)) − µ/s, 0 ),    (14)
where Ē is always positive, whereas E* and Ê are positive if and only if R0 > 1 and
R1 < R0, respectively. There are exactly two more endemic equilibrium points, namely
Ẽ = (Z̃, Ĩ, Ṽ, T̃, Ã) = ( αg/(mg + rh), αrh/(µ(mg + rh)), h/g, 0, αrkg/(µp(mg + rh)) − c/p ),    (15)
Ĕ = (Z̆, Ĭ, V̆, T̆, Ă) = ( αg/(mg + rh), n/d, h/g, αrhd/(ns(mg + rh)) − µ/s, (kng − cdh)/(dhp) ),    (16)
where Ẽ and Ĕ are strictly positive if and only if R0 > R2 and R2 < R1 < R0,
respectively, where
R2 = 1 + rh/(mg).    (17)
Here, R2 is called the basic reproduction number for the antibody response of system (10).
Prior to activation of the CTL and antibody responses, the spread of a virus infection
depends on the value of R0. If R0 > 1, then the infection will spread initially. The
existence of infected cells and virus particles triggers the activation of the CTL and
antibody responses. At this stage, persistence or extinction of the CTL and antibody
responses depends on the values of R1 and R2. The next theorems describe the global
asymptotic stability of the equilibrium points completely in terms of the numbers R0,
R1, and R2.
Theorem 3.1. Let R0 <1. Then Ē is the unique positive equilibrium of (10). It is
globally asymptotically stable.
Proof. We introduce the function Φ0 : Ω0 → R, where Ω0 = {(Z, I, V, T, A) ∈ R^5 :
Z > 0, I, V, T, A ≥ 0} and
Φ0(Z, I, V, T, A) = Z̄ ( Z/Z̄ − ln(Z/Z̄) ) + I + (r Z̄/c) V + (p/g) A + (s/d) T.
Clearly, Φ0 ∈ C^1(Ω0) and Φ0(Z, I, V, T, A) is positive definite with respect to Ω0. Calculating
the time derivative of Φ0(Z, I, V, T, A) along the positive solutions of the model (10), we obtain
Φ̇0(Z, I, V, T, A) = α ( 2 − Z/Z̄ − Z̄/Z ) + µI(R0 − 1) − (sn/d) T − (αrph/(mcg)) A.
Using the arithmetic-geometric inequality, we have that
Z/Z̄ + Z̄/Z ≥ 2
for all Z > 0, and the equality holds only for Z = Z̄. Furthermore, since R0 < 1 and
I, T, A ≥ 0, we obtain that Φ̇0(Z, I, V, T, A) ≤ 0 for all (Z, I, V, T, A) ∈ Ω0. Thus
Φ0 is a Lyapunov function, and Φ̇0(Z, I, V, T, A) = 0 when Z = Z̄, I = 0, T = 0, and
A = 0. Let M be the largest invariant set in the set
H = {(Z, I, V, T, A) ∈ Ω0 : Φ̇0(Z, I, V, T, A) = 0}
  = {(Z, I, V, T, A) ∈ Ω0 : Z = Z̄, I = 0, T = 0, A = 0}.
We have from the first equation of (10) that M ={Ē}. It follows from La Salle’s principle
that the equilibrium Ē is globally asymptotically stable on Ω0 .
Theorem 3.2. Let 1 < R0 < R1 and R0 < R2. Then Ē and E* are the positive equilibria
of (10). The equilibrium E* is globally asymptotically stable.
Theorem 3.3. Let R1 < R0 and R1 < R2. If R0 < R2, then Ē, E*, and Ê are the positive
equilibria of (10), whereas if R2 < R0, then Ē, E*, Ê, and Ẽ are the positive equilibria
of (10). The equilibrium Ê is globally asymptotically stable.
We have from the second equation of (10) that M ={Ê}. It follows from La Salle’s
principle that the equilibrium Ê is globally asymptotically stable on Ω̂.
Theorem 3.4. Let R2 < R0 < R1. Then Ē, E*, and Ẽ are the positive equilibria of
(10). The equilibrium Ẽ is globally asymptotically stable.
Clearly, Φ̃ ∈ C^1(Ω̃) and Φ̃(Z, I, V, T, A) is positive definite with respect to Ω̃. Calculating
the time derivative of Φ̃(Z, I, V, T, A) along the positive solutions of the model (10), we
obtain
˙
1 2 α2 α2 1
Φ̃(Z, I, V, T, A) =α 3 1 − + − mZ + 2 − 1−
R2 R2 R2 mZ R2 mZ R2
αr 1 V Z mµ 1 I
− 1− − R2 1 −
µ R2 I r R2 V
αs rh µn
+ − T.
µ mg + rh αd
Since R2 > 1 and the arithmetic mean is greater than or equal to the geometric mean,
it is clear that
α2 1 αr 1 V Z mµ 1 I 1
1− + 1− + R2 1 − ≥ 3α 1 − ,
R2 mZ R2 µ R2 I r R2 V R2
and
α2 2α
mZ + ≥
R22 mZ R2
for all Z, I, V > 0, and the equalities hold only for Z = Z̃, I = Ĩ, and V = Ṽ. Furthermore, since
R1 − 1 αrk kng rh rh µn
R0 < R1 < R2 ⇐⇒ < 1+ ⇐⇒ <
R2 − 1 mµc cdh mg mg + rh αd
and T ≥ 0, we obtain that dΦ̃/dt (Z, I, V, T, A) ≤ 0 for all (Z, I, V, T, A) ∈ Ω̃. Thus Φ̃ is
a Lyapunov function, and dΦ̃/dt (Z, I, V, T, A) = 0 when Z = Z̃, I = Ĩ, V = Ṽ, and T = 0.
Let M be the largest invariant set in the set
H = {(Z, I, V, T, A) ∈ Ω̃ : dΦ̃/dt (Z, I, V, T, A) = 0} = {(Z, I, V, T, A) ∈ Ω̃ : Z = Z̃, I = Ĩ, V = Ṽ, T = 0}.
We have from the third equation of (10) that M = {Ẽ}. It follows from La Salle's
principle that the equilibrium Ẽ is globally asymptotically stable on Ω̃.
Theorem 3.5. Let R2 <R1 <R0 . Then Ē, E ∗ , Ê, Ẽ, and Ĕ are the positive equilibriums
of (10). The equilibrium Ĕ is globally asymptotically stable.
Clearly, Φ̆ ∈ C 1 (Ω̆) and Φ̆(Z, I, V, T, A) is positive difinite with respect to Ω̆. Calculating
the time derivative of Φ̆(Z, I, V, T, A) along the positive solutions of the model (10), we
obtain
˙
1 2 α2 α2 1
Φ̆(Z, I, V, T, A) =α 3 1 − + − mZ + 2 − 1−
R2 R2 R2 mZ R2 mZ R2
αr R 1 V Z mµ R0 R2 1 I
− 1− − 1− ,
µ R0 R2 I r R R2 V
where R = kng
cdh R2 . Using the arithmetic-geometric inequality, since R2 > 1, we have
that
α2 1 αr R3 1 V Z mµ R0 R2 1 I 1
1− + 1− + 1− ≥ 3α 1 − ,
R2 mZ R2 µ R0 R2 I r R3 R2 V R2
and
α2 2α
mZ + 2 ≥
R2 mZ R2
˘ and V = V̆ . Then, we
for all Z, I, V >0, and the equalities hold only for Z = Z̆, I = I,
˙
obtain that Φ̆(Z, I, V , T , A) ≤ 0 for all (Z, I, V , T , A) ∈ Ω̆. Thus Φ̆ is a Lyapunov
˙ ˘ and V = V̆ . Let M be the
function. And Φ̆(Z, I, V, T, A) = 0, when Z = Z̆, I = I,
largest invariant set in the set
˙
H = {(Z, I, V, T, A} ∈ Ω̆ : Φ̆(Z, I, V, T, A) = 0}
˘ V = V̆ , T = 0}.
= {(Z, I, V, T, A} ∈ Ω̆ : Z = Z̆, I = I,
We have from the second and third equations of (10) that M is the singleton {Ĕ}. It
follows from La Salle’s principle that the equilibrium Ĕ is globally asymptotically stable
on Ω̆.
We observe that for R0 < R1 and R0 < R2 , the CTL and antibody responses have
no influence for large t in so far the solution of (10) converges to the equilibrium given
by T = 0 and A = 0 and the equilibrium of the basic virus system (1). In case R0 > 1,
notice that the threshold condition R0 < R1 and R0 < R2 are equivalent to I∗ < nd and
V∗ < hg , respectively. And that the number of CTLs and antibodies decrease strictly if
and only if I < nd and V < hg , respectively, due to (10) (where T , A> 0, say). Therefore
since I and V converges to I∗ and V∗ for large t, respectively, the number of CTLs and
antibodies converges to zero as t → ∞. Hence, in order to trigger a significant immune
responses the reproduction rate must be large enough to push I and V over the critical
value nd and hg , respectively. In this case, the following three outcomes can be observed.
(i) If R1 < R0 and R1 < R2 , the CTL response develops and the antibody response
cannot become established. This is because the CTL response is strong and
reduces virus load to levels that are too low to stimulate the antibody response.
In this case, Iˆ = nd , whereas the threshold condition R1 < R2 is equivalent to
V̂ < hg . Therefore, the number of antibodies converges to zero as t → ∞.
Dynamics of Virus With CTL and Antibody Responses 491
(ii) If R2 < R0 < R1 , the antibody response develops and a sustained CTL fails.
This is because the antibody response is strong relative to the CTL response
and reduces virus load to levels that are too low to stimulate the CTL. In this
case, we have Ṽ = hg . Moreover, since R2 < R0 and R2 < R1 , it is clear that
R2 (R2 − 1) < R0 (R2 − 1) = R2 (R2 − 1) + (R2 − 1)(R0 − R2 ),
R2 (R2 − 1) < R2 (R1 − 1) = R2 (R2 − 1) + R2 (R1 − R2 ).
Furthermore, since R2 < R0 < R1 , we have that (R2 − 1)(R0 − R2 ) < R2 (R1 −
R2 ). Therefore,
R2 − 1 R1 − 1
R0 (R2 − 1) < R2 (R1 − 1) ⇐⇒ <
R2 R0
α n
⇐⇒ (R2 − 1) <
µR2 d
˜ n
⇐⇒ I < .
d
Thus the number of CTLs converges to zero as t → ∞.
(iii) If R2 < R1 < R0 , both CTL and antibody responses develop. It is attained
because in this case, I˘ = nd and V̆ = hg .
These outcomes are thus governed by competition between CTL and antibody
responses for the virus population. This is because the virus population is a resource
that both CTL and antibody require for survival.
The role of CTL and antibody immune responses can be described through equi-
librium points of each models. Comparing the model (10) with the basic model (1), we
have that
V∗ R0 − 1 I∗ R0 − 1 Z∗ R2
= > 1, = > 1, = < 1.
V̆ R2 − 1 I˘ R1 − 1 Z̆ R0
Thus the CLT and antibody immune responses decrease the density of free viruses and
of infected cells and increase the density of noninfected cells. In addition, the effects of
antibody response is shown by the following comparison
V̂ R1 − 1 Iˆ Ẑ R2
= > 1, = 1, = < 1.
V̆ R2 − 1 I˘ Z̆ R1
Compared with the model (5), the antibody response thus decreases the density of
infected cells and increases the density of noninfected cells. However, the antibody
response has no influence to the density of infected cells.
4. NUMERICAL SIMULATION
In this section, we perform some numeric simulations to demonstrate the theoret-
ical results obtained in Section 4 by using Mathematica 7.0. We present the numerical
simulations to observe the dynamics of system (10) with a set of parameter values in
Table 1. We have seen in previous sections that the value of R0 , R1 , and R2 play a
dicisive rule in determining the virus and immune responses dynamics. We can get
492 N.A. Kurdhi and L. Aryati
the immune responses during acute infection. As the virus population grows, both CTL
and antibody responses will start to expand. The outcome of the dynamics in acute
infection depends on the relative strengths of the responses. According to the numerical
result, three outcomes are possible.
(i) The CTL response is strong relative to the antibody response. Thus, the CTL
response develops while the antibody response does not become fully estab-
lished. And the CTL will clear the infection.
494 N.A. Kurdhi and L. Aryati
Figure 2. State trajectories for (i) system (1), (ii) system (5), dan
(iii) system (10), starting from different initial conditions.
(ii) The CTL response is weak relative to the antibody response. However, the
antibody response is unlikely to clear the infection. The reason is that while
free virus particles are removed, a relatively large pool of infected cells remains
because they do not become killed. Hence, the result is persistent infection in
the presence of an ongoing antibody response.
(iii) Both the CTL and antibody responses are sufficiently strong to become fully
established. And the outcome is virus clearance.
Dynamics of Virus With CTL and Antibody Responses 495
5. CONCLUDING REMARKS
In this paper, we have studied the global dynamics of virus dynamics model with
CTL and antibody immune responses. By constructing suitable Lyapunov functions,
sufficient conditions have been derived for the global stability of five equilibrium points.
If the basic reproduction number for virus infection R0 < 1, the virus-free equilibrium is
globally asymtotically stable, and in case R0 > 1 there is a unique endemic equlibrium
which takes over this property. The stability of the four endemic equilibrium points is
also dependent upon both the basic reproduction number for CTL response R1 and for
antibody response R2 , which determine the persistence or extinction of CTL and anti-
body responses; If 1<R0 <R1 and R0 <R2 , the equilibrium E ∗ is globally asymptotically
stable and the infection becomes chronic but without CTL and antibody responses; If
R1 <R0 and R1 <R2 , the equilibrium Ê is globally asymptotically stable and the infec-
tion turns to chronic with CTL response but without antibody response; If R2 <R0 <R1 ,
the equilibrium Ẽ is globally asymptotically stable and the infection becomes chronic
with antibody response but without antibody response; If R2 <R1 <R0 , the equilibrium
Ĕ is globally asymptotically stable and the infection turns to chronic with CTL and
antibody responses.
The interaction between CTL and antibody is the competition for virus popula-
tion. This is because both the CTL and antibody proliferate in response to stimulation
496 N.A. Kurdhi and L. Aryati
by the same virus. We have shown it by analysis and numerical simulation. For ex-
ample, if the CTL response suppresses virus load to levels that are to low to stimulate
the antibody response, then a successful antibody response might not be generated.
Conversely, if the antibody reduce virus load to levels that are to low to stimulate the
CTL, a successful CTL response will not be established.
Dynamics of Virus With CTL and Antibody Responses 497
From numerical simulation, we see that the persistence of the CTL and antibody
response will decrease the density of infected cells and of free virus particles either in
equilibrium condition or in peak of infection. Hence, the CTL and antibody responses
plays an important role in the reduction of the virus infection.
References
[1] Adams, B. M., Banks, H. T., Davidian, M., Hee-Dae Kwon, and Tran, H. T., Dynamic
Multidrug Therapies for HIV: Optimal and STI Control Approaches, Mathematical Biosciences
and Engineering 1, 223-241, 2004.
[2] Korobeinikov, A., Global properties of Basic Virus Dynamics Models, Bull. Math. Biol. 68,
615-626, 2009.
[3] Kurdhi, N. A. and Aryati, L., Global Stability of Virus Dynamics Model with CTL Response,
Department of Mathematics UGM, 2010.
[4] Nagumo, N., Uber die lage der integralkurven gewohnlicher differential gleichungen, Proc. Phys-
Math. Soc. Japan 24, 551-559, 1942.
[5] Nowak, M. A. and May, R., Virus Dynamics, Oxford University Press, Inc., New York, 2000.
[6] Perelson, A. S., Kirschner, D. E., and Boer, R. D., Dynamics of HIV infection of CD4+ T
Cells, Mathematical Biosciences 114, 81-125, 1993.
[7] Perko, L., Differential Equations and Dynamical Systems, Springer-Verlag, New York, 1991.
[8] Pruss, J., Zacher, R., and Schnaubelt, R., Global Asymptotic Stability of Equilibria in Models
for Virus Dynamics, Math. Model. Nat. Phenom 3, 126-142, 2008.
[9] Yousfi, N., Hattaf, K., and Rachik, M., Analysis of a HCV Model with CTL and Antibody
Responses, Applied Mathematical Sciences 3, 2835-2846, 2009.
[10] Wodarz, D., Killer Cells Dynamics, Mathematical and Computational Approaches to Immunolog,
Springer-Verlag, New York, 2007.
Lina Aryati
Department of Mathematics, Gadjah Mada University.
e-mail: [email protected]
498 N.A. Kurdhi and L. Aryati
Proceedings of ”The 6th SEAMS-UGM Conference 2011”
Applied Mathematics, pp. 499 - 504
Abstract. In this paper we present a mathematical model of diffusion within vascular wall on
dengue infection that may capture the plasma leakage phenomena. The aim of this model is to
analyze the relation between the increased level of cytokine and blood concentration within vascular
wall. Numerical solutions of the diffusion model are obtained by finite difference method.
Simulations of the model indicate the effect of cytokine for variations of parameter values. It is
shown that high increased level cytokine will cause high blood concentration in the vascular wall that
may contribute to plasma leakage.
Keywords and Phrases: diffusion model,dengue viruses, cytokine level, plasma leakage.
1. INTRODUCTION
Dengue virus infection is an acute febrile disease that has become major public
health problems in many tropical and subtropical regions of the world. One of the forms of
the illness is Dengue Haemorrhagic Fever (DHF). DHF is characterized by plasma leakage
that may lead to death. Once inoculated into a human host, dengue has an incubation period
of 3-14 days (average 4-7 days) while viral replication takes place in target dendritic cells
[3,4].
Infection of target cells, primarily those of the reticuloendothelial system, such as
dendritic cells, hepatocytes, and endothelial cells. After a person is infected with dengue, the
person develops an immune response to that dengue subtype. The immune response produced
specific antibodies to that subtype's surface proteins that prevent the virus from binding to
macrophage cells (the target cell that infected by dengue viruses). However, if another
subtype of dengue virus infects the individual, the virus will activate the immune system to
attack the first subtype.
The immune system is tricked because the four dengue subtypes (DEN 1, DEN 2,
DEN 3 and DEN 4 ) have very similar surface antigens. The antibodies bind to the surface
_______________________________
2010 Mathematics Subject Classification :34C60, 92D30
499
500 N. NURAINI, D . R. PASYA, E. SOEWONO
proteins but do not inactivate virus. The immune response attracts numerous macrophages,
which the virus proceeds to infect because it has not been inactivated. This makes the viral
infection much more acute [6].
The infected macrophages then give signals to the immune system. This
phenomenon is called by antigen presentation. As the result, it may lead activation of the T
cells. Activated T cell is purpose to produce a range of cytotoxics, which lyses infected
monocytes-macrophages and some uninfected target cells, and cytokine that regulate or
"help" the immune response such as gamma interferon (IFN-), IL-2, IL-4, IL-5, IL-6, IL-8,
IL-10, IL-12, TNF-α , and TNF-β [2,3,4]. Macrophages or monocytes which are infected by
dengue viruses also produce TNF- α , TNF- β, IL-1, IL-1B, IL-6, and platelet activating factor
(PAF). Base on Kurane and Ennis's hypothesis, the rapid increase in the levels of TNF- α, IL-
2, IL-6, IFN-γ , and PAF induce increasing vascular permeability, plasma leakage, shock and
malfunction of the coagulation system, which may lead to hemorrhagic [3].
Activation of complement is another important clinical manifestation in DHF. It was
reported that levels of C3a and C5a, complement activation products, are correlated with the
severity of DHF and the levels of C3a and C5a reached the peak at the time of defervesce
when plasma leakage become most apparent [5].
In summary, there are several factors that increasing vascular permeability such as
cytokine, chemical mediator (PAF) and complement activation. Increasing vascular
permeability may be leading to plasma leakage. One of manifestations that induce plasma
leakage is an increasing hematocrite level up to 20 % above than normal condition [1].
The model for dengue transmission among population is established, some of them
is explained in [7]. But the mathematical modelling of dengue infection within a host is quite
rare,modeling of dynamical virus for this infection using differential equation is studied in
[4,5,6]. The model in [4] discuss an immune response but in [5,6] without immune response.
In this model we discuss a simple model to capture the plasma leakage phenomena in dengue
infection within a host. We use an one-dimensional diffusion equation with cytokine effect to
see the dynamic of blood concentration in vascular wall.
2. MODEL FORMULATION
a
b
We assume that the concentration of blood represent plasma leakage only affected
by cytokine. Suppose that C is the concentration of blood (density in %), the radius of
vascular wall represented by r (in mm), diffusion coefecient of blood is D, t for time. The
model equation is as follow,
in case of diffusion process from inner radius represent by a, into outer radius, b, of vascular
wall the equation (1) would be transformed by r (b a)r a :
~
In this model we assume that blood stream pumped in the body periodically C 1 e it with
angular velocity ω rad/s Then we have the boundary condition for inner vascular wall as
We assumed that no blood leakage in the outside boundary of the vascular wall, then the
boundary condition for outside part of vascular wall is
Solving the equation (1) numerically, we use finite difference and develop the derivatives
approximation to determine the values of unknown function at points in its domain
3. NUMERICAL SOLUTION
In this numerical simulation, we used hypothetical data to present the trend of blood
concentration at viremia phase (1-7 days), because it is difficult to get the real data for this
phenomena. Table 1 will give a parameter values that used in simulation.
502 N. NURAINI, D . R. PASYA, E. SOEWONO
Figure 3 expresses the comparison between blood concentrations at the normal condition and
infection. The first condition represent by K(t) = 0, the second, K(t) is not zero. Figure 3a
(left) simulate the dengue infection before plasma leakage takes place, for K(t)=0, from the
first, fourth, and seventh day, there is no blood concentration changes. On the contrary, in
Figure 3b (right), when K(t)=2.10-4+60, from the first, fourth, and seventh day, the blood
concentration changes along the radius of vascular wall.
A S i m p le Di f fu s i on M od el Of P la s m a Lea k a ge In Den gu e In fec t i on 503
Figure 4, simulate the condition that increasing cytokine level will be increasing blood
concentration. It can be seen on Figure 4a, where K(t)=2.10 -4t+60, the cytokine level will
increase 1% from normal condition. In Figure 4b, when K(t)=5.10 -4t+60, the blood
concentration will increase more than or equal to 20% from normal condition, this may lead
to plasma leakage.The simulation confirm the medical information that one of manifestations
that induce plasma leakage is an increasing hematocrite level up to 20 % above than normal
condition [1].
4. CONCLUDING REMARK
In this paper we formulate a mathematical model to capture plasma leakage phenomena in
vascular wall caused by dengue infection. Numerical solutions of the diffusion model are
obtained by finite difference method. Numerical simulations of the model indicate the effect
of cytokine for variations of parameter values. It is shown that high increased level cytokine
will cause high blood concentration in the vascular wall that may contribute to plasma
leakage.
References
[1] B AYLEY N.J.T. The Mathematical Theory of Infectious Diseases and its Application, Griffin, London. 1975.
[2] KURANE, I, Dengue Hemorrhagic Fever with Special Emphasis on Immunopatho- genesi, National Institute
of Infectious Disease. Tokyo, Japan, 2006.
[3] MAZUMDAR, J, An Introduction to Mathematical Physiology and Biology, Cambridge University Press 1999.
[4] NURAINI, N, TASMAN H, SOEWONO, E, and KUNTJORO, AS A with-in Dengue infection model with immune
response, Journal Mathematical and Computer Modelling 49 pp 1148 - 1155.2009.
[5] NURAINI, N, SOEWONO, E, KUNTJORO, AS Mathematical Model of Dengue Internal Transmission Process,
Journal on Indonesian Mathematical Society (MIHMI) Vol 13, pp.123-132, 2007.
[6] NURAINI, N, ARI, Y, KUNTJORO, AS Model Matematik Penyebaran Internal Demam Berdarah dalam Tubuh
Manusia, Prosiding Konferensi Nasional Matematika XIII, UNNES. 2006.
[7] SUPRIATNA, A.K, NURAINI, N, SOEWONO, E, Mathematical Model of Dengue Transmission and control,
Dengue Virus: Detection, Diagnosis and Control. , Basak Ganim and Adam Reis, Nova Science
Publishers, New York pp.187 – 208, 2010.
NUNING NURAINI
Institut Teknologi Bandung.
e-mail: [email protected]
EDY SOEWONO
Institut Teknologi Bandung.
e-mail: [email protected]
Proceedings of ”The 6th SEAMS-UGM Conference 2011
Applied Mathematics, pp. 505 – 514.
Abstract. This research compares DNA sequences of H5N1 virus to analyze them with a tree
diagram method. We use tree diagram method to analyze similarity level of nucleotides. A tree
diagram method is one of several methods to align a pair of DNA sequences at entirely length.
This method uses concept data structure in general tree with post -order traversal. Furthermore, we
obtain mutation level of nucleotides. The scoring equation and parameters are determined.
Keywords and Phrase : Global Alignment, Pair of DNA Sequences, Tree Diagra m Method.
1. INTRODUCTION
The comparison of the existing sequences is a modern method for studying the
evolutionary interaction between genes. It is based on the alignment-the process of arranging
two or more sequences to achieve the maximum level of identity (for evaluation purposes),
the degree of similarity and eventual homology. Sequence alignment is an important method
in DNA and protein analysis [6,3]. Increasing of new biological sequences is basic of any
sequence analysis [4]. Bioinformatics is a collection of mathematical, statistical and
computational methods for analyzing biological sequences, that is, DNA, RNA and amino
acid (protein) sequences.
We will compare the DNA sequences of H5N1 to analyze their similarity level. H5N1
is Influenza-A virus that has a segmented single strain negative RNA linear genome. Inside
the host cell, virus’ RNA will be a reverse transcription into RNA-DNA hybrid and
eventually forms the DNA. Furthermore, the virus’ DNA will enter into the nucleus’ host cell.
DNA virus will damage the DNA host’s and to forms mRNA (messenger RNA), then mRNA
will translate to produce a viral envelope protein to form new viruses. The virus has very high
mutation rate, so it can create different variations of the viruses [5].
In the previous research, the similarity level in segment HA and NA protein was
505
506 S FAUZIYAH , M . I IRAWAN , M. SHOVITRI
determined by [1] using tools EMBOSS, this tools applied “needle” algorithm. This research
use tree diagram method to align a pair of DNA sequences [7]. This method consists of
three parts: (i) simple alignment algorithm, (ii) extension algorithm, (iii) Graphical Simple
Alignment tree (GSA tree). These theories will be explained as follows
1.1.1 DNA Sequences. DNA (deoxyribonucleic acid) sequences are associated with the four-
letter DNA alphabet {A,C, G, T}, where A, C, G and T stand for the nucleic acids or
nucleotides Adenine, Cytosine, Guanine and Thymine respectively. Most DNA sequences
currently is being studied come from DNA molecules that found in chromosomes. They that
are located in the nuclei of the cells of living organisms. More information about DNA can be
found in [4].
DNA sequences are string of letters from a four-letter alphabet called nucleotides (A,
C, G, T). The length of sequence is a variable and not all sequences are of the same length.
Generally, we use the following description of DNA sequence:
A = (a1 a2 … am) B = (b1 b2 … bn) (1)
Where the capital letters A, B represent the sequence and a i, bi, ci represent the basic
units of the sequence at position i, whose elements are obtained from the set {A,C, G, T}. For
instance, DNA sequence could do substitutions (change of one nucleotide to another),
insertions and deletions (gain or loss of one or more nucleotides) and therefore algorithms
should include the possibility of gaps [9]. There are some biology assumptions about
beginning and the end of a sequence are useful to development of algorithm [2].
1.1.2 Sequence Alignment. An alignment between two sequences is simply a pairwise match
between the characters of each sequence. A true alignment of DNA sequences is one that
reflects the evolutionary relationship between two or more homologs (sequences that share a
common ancestor).
Given two sequences (X and Y) in first equation with lengths is m,n respectively, let
c be the total length of the alignment we have, i.e.,
max(n, m) c n m
The alignment is represented by a matrix M(X,Y), with size of matrix is 2 x c (2 is
row and c is column). The row are the sequences and the columns are matches, mismatch,
insertion and deletion (called “indels”) [2].
Example: Given two sequences, X = GAATTAGTTA and Y = GGATCGA. Where length is
m = 10, n = 7 respectively. One possible arrangement can be defined as :
G A A T T A G T T A
M ( X ,Y )
G G A T C G A
Since there are different ways of arranging the sequences and find mismatch and
indels, so we are interested in the best possible arrangement. Furthermore, the “best”
alignment depends on how the scoring of matches, mismatches and gaps. Since we are
interested in the similarity of two sequences, we would reward a match and penalize a
mismatch/gap. Thus, the first step is to define an appropriate scoring equation in order to
quantify the sequence alignments.
A scoring equation can be designed to quantify the edit distance (mutations,
insertions and deletions):
Th e S eq u en c es C om p a ri s on Of DNA H5 N1 Vi ru s . . . 507
where, S is the percentage sequence similarity, Ls is the number of aligned residues with
similar characteristics, La and Lb are the total lengths of each individual sequence in
alignment.
1.2 Tree Diagram Methods. In this section we discuss about Tree Diagram Methods and
their theory related to the problem discussed in this research. Tree diagram method is one
method to align a pair of DNA sequence in entire length (global alignment). This method
consists three parts: (1) improved simple alignment algorithm, (2) extension algorithm, and
(3) GSA Tree [7].
1.2.1 Improved Simple Alignment Algorithm. Given two sequences X={ x1 , x2 ,..., xm } and
Y={ y1 , y2 ,..., yn }, where m and n is denote the length of X and Y, respectively. So
improved simple alignment algorithms can be defined as step for align X and Y sequence,
with initial position is first base (x1) of X overlaps with the first base (y1) of Y. Then the
sliding process is done along the left and right, respectively. These steps shown by [7] as
follows:
(1) Initial position
x1 x2 x3 … xm
y1 y2 y3 … ym … yn
(2) Every times X moves one base position along the right direction
x1 x2 x3 … xm
y1 y2 y3 … ym … yn
(3) Every times X moves one base position along the left direction
x1 x2 x3 … xm
y1 y2 y3 … ym … yn
Every alignment has score S according to the scoring equation on equation 2 nd. Then
at the step (2) and (3) still define another score S’i and S’j, respectively is
508 S FAUZIYAH , M . I IRAWAN , M. SHOVITRI
1.2.2 Extension Algorithm. This algorithm used to protect the longer common substring than
C in R from being split by C.
Let C is a common substring in R, with C {C1 , C2 ,..., Cm } . If C , then
none longest common substring in C can be extended. If C {C1 , C2 ,..., Cm } there are m
longest common substrings, where | C1 || C2 | ... | Cm | k with k denotes the number of
matches within a longest common substring. Here are some steps of extension algorithm to
find the longest common substring [7]:
(1) Let K be length of the longest common substring of X and Y.
(2) When k=K, none of the longest common substrings Ci of R can be extended into
longer common substring.
(3) When k < K, there are exist at least a longer common substring than C i . There
are several sub-steps to find out the longer common substrings as the following:
(a) Let LL be the number of mismatches from the right end of C i-1 to the left
end of Ci. When i = 1, LL denotes the number of mismatches from the left
end of X and Y to the left end of C1. Similarly, let LR denotes the number of
mismatches from the right end of Ci to the left of Ci+1. When i= 1, LR
denotes the number of mismatches from the right end of Ci to the right end
of X and Y.
(b) When K < LL, the K mismatches are extracted from the left of Ci.
Otherwise, the LL mismatches are extracted from the left of Ci. Similarly,
when K < LR, the K mismatches are extracted from the right of C i.
Otherwise, the LR mismatches are extracted from the right of Ci. Then the
sequences extracted from left of Ci , Ci and from the right of Ci are
connected into two new sub-sequences Si1 and Si2.
(c) Apply the simple alignment algorithm to Si1 and Si2. If there exist a new
longer common substring within Si1 and Si2 than Ci, we find several choice,
i.e., a new longer common substring or Ci. If there is an increment of sore
when the new longer common substring comes into being, we can replace
the original Ci with a new substring also called as Ci.
(4) As for every Ci of R, the original Ci is replaced by a new longest common
substring if a new substring exists.
Output these algorithm is data C and U from R (or repaired R) which has extend if extension
is occur.
Th e S eq u en c es C om p a ri s on Of DNA H5 N1 Vi ru s . . . 509
1.2.3 Graphical Simple Alignment Tree. This algorithm use to explore the appropriate gaps in
Uj1. Any several step in here [7].
(1) Compute the scores of all simple alignment of Uj1 by the simple alignment
algorithm. A good simple alignment Rj1 of Uj1 is generated when its score
maximum.
(2) If there is increment of the score due to appropriate gaps within U j1, Uj1 can be
further divided into the second level sub-alignment. When and how to add the gaps
in the sequences? Now, let Ci2 be the longest common substrings of Rj1, where Ci2
{Ci21 , Ci2 2 ,...Ci2 m } . Then there are two sub-steps as the following:
(a) If Ci2 {Ci21 , Ci2 2 ,...Ci2 m } ,there are m longest common substrings. Then Uj1
can be further divided into the second level sub-alignment by Ci2 . Let U2j be
U 2j , the operation flow goes back to the step (1). And the level of sub-
alignment enters the next.
(b) If Ci2 , there is no a longest common substrings in R1j . Then U1j can not
be further broken down. The good simple alignment R1j of U1j becomes a leaf
node in GSA tree. The two sequences of R1j might be entirely overlapping, or
partially overlapping, or one sequence might be aligned entirely internally to the
other. When the two sequences of R1j are entirely overlapping, there are no gaps
within R1j . Otherwise, the hanging ends of the overlap come into being the gaps
of R1j . The relative position of these gaps is fixed, and becomes gaps within the
final global alignment.
These above steps repeatedly do, until all U the last level sub-alignment cannot be
further decomposed by the improved simple alignment algorithm and the extension
algorithm. Then we can obtain a Graphical Simple Alignment Tree (GSA Tree) for string X
and Y, consisting of a series of substrings.
1.2.4 General Tree Using post-order Traversal. Global alignment in this method is formed
by GSA tree. In this case, GSA tree is a general tree or tree structure that has any children.
We can construct any general tree in figure 1.
510 S FAUZIYAH , M . I IRAWAN , M. SHOVITRI
X,Y
C11 C31
U21
U32
U12
C22
General tree in Figure 1 is construct from any sequence X and Y. In the position of root is
consist best alignment R, from improved simple alignment algorithm and extension algorithm. From
R, we can divide two kind of child in each level, i.e: C and U. C is a longest common substring from R
and U is substrings spaced by C. There are consists two types of nodes: inner and leaf nodes. The
global alignment of string X and Y is formed by all leaf nodes. To obtain global alignment, GSA tree
is traversed by post-order traversal of tree. A post-order traversal of a general tree performs a
post-order traversal of the root’s sub-trees from left to right, then visits the root [8]. Then all
inner nodes are deleted from the result of post-order traversal [7].
In this section, we discuss about the analysis of sequences DNA virus H5N1 using
tree diagram methods. This method is described in figure 2.
Output:
Improved Extension length
Input 2 algorithm
Simple Similarity
sequence Gaps
alignment
Score
C U
GSA Tree
In figure 2. After we input a pair of sequences DNA virus H5N1, we can process
alignment of two sequences in three part, i.e improved simple alignment, extension algorithm
Th e S eq u en c es C om p a ri s on Of DNA H5 N1 Vi ru s . . . 511
and GSA Tree. The scoring equation is S = p q r (o ke) and we choose the
parameters 5, 4 (from DNAFULL matrix) and o = 10, e = 0.5 (default gap open
and gap extension in tools EMBOSS). Then we obtain percentage of similarity and gaps to
analysis mutation level in nucleotides.
We obtain DNA sequences in this research from GenBank database. These data are 2
sequences from host human and 4 sequences from host avian. The result of alignment using
this method can explain in Table 1.
Table.1, The result alignment sequence DNA H5N1 virus on HA segment using tree diagram
methods
Using Table 1, we found similarity level internal host (human-human and avian-
avian) and external host (human-avian). The similarity level in human-human is 89.2%,
avian-avian is 91.3%, and human-avian is 90.1%. furthermore, we count mutation level of
512 S FAUZIYAH , M . I IRAWAN , M. SHOVITRI
nucleotides using information of similarity and gap in the Table 1. We obtain the mutation in
human-human is 7%, avian-avian is 7.3% and human-avian is 7.8%. These result was same
with the result alignment from EMBOSS tools (a tools for pairwise alignment using “needle”-
algorithm).
3. CONCLUDING REMARK
According to this result, we can conclude that tree diagram method is sufficient to
align the sequences by applying the concept of a tree data structure that contains simple
alignment algorithms, extension algorithms and GSA tree. This method can improved the
alignment of two DNA sequences by exploring the appropriate gaps, gradually based on
simple alignment algorithms and extension algorithms. Based on validation results using tools
EMBOSS, shows that the optimal alignment generated from parameters match, mismatches,
penalty gap open and penalty gap extend respectively 5, 4 and o = 10, e = 0.5.
Furthermore, based from similarity level result of sequence DNA H5N1 virus on
segment HA, in internal and external host we conclude that in biological this is an indication
that they are different species. Based from mutation level result shown that mutation level on
this virus is high, and the highest is mutation level between host human-avian 7.8%.
References
[1] CHEN, G.W, CHANG, S.C, MOK, C.K, LO, Y.L, KUNG, Y.N, HUANG, J.H, SHIH, Y.H, WANG, J.Y, CHIANG, C,
CHEN, C.J, SHIH, S.R., Genomic signature of Human versus Avian Influenza A viruses. Emerging
infectious Diseases. www.cdc. Vol. 12, no.9, September 2006.
[2] ESCARINO, CLAUDIA-RANGEL: A two-base encoded DNA sequences alignment problem in computational
biology. Math-In-Industry Project, National Institute Of Genomic Medicine, Mexico, 2009.
[3] I. EIDHAMMER. Protein Bioinformatics: an algorithmic to sequences and structure analysis’ John Wiley &
Sons, Ltd ISBN: 0-470-84839-1. 2004
[4] ISAEV, ALEXANDER: Introduction to Mathematical Methods in Bioinformatics, Springer-Verlag Berlin
Heidelberg, Germany, 2004,.
[5] PETERSON A. TOWNSEND, SARAH E. BUSH, ERICA SPACKMAN, DAVID E. SWAYNE, AND HON S: Influenza
A Virus Infections in Land Birds, People’s Republic of China”, Emerging Infectious Diseases •
www.cdc.gov/eid • Vol. 14, No. 10, 2008.
[6] PEVSNER, JONATHAN. Bioinformatics and Functional Genomics . Department of Neurology, Kennedy
Krieger Institut & department of neuroscience and division of health Sciences informatics, The John
Hopkins School of Medicine, Baltimore: Maryland, 2009.
[7] QI, Z.H, QI, X.Q., New method for alignment 2 DNA sequences by tree data structure. Journal of
theoretical Biology 263, 227-236, 2009.
[8] SHAFFER, A. Clifford: A Practical Introduction to Data Structures and Algorithm Analysis Edition 3.2
(C++ Version), Department of Computer Science Virginia Tech Blacksburg, VA 24061, 2011.
[9] HEN, SHIYI NANKAI, JACK A. TUSZYNSKI: Theory and Mathematical Methodes for Bioinformatics, Springer
Vierlag, San Francisco, 2008.
[10] XIONG, JIN: Essential Bioinformatics, CAMBRIDGE University Press, United States Of America, 2006.
Th e S eq u en c es C om p a ri s on Of DNA H5 N1 Vi ru s . . . 513
SITI FAUZIYAH
Graduate Student of Mathematics Department at Institut Teknologi Sepuluh Nopember (ITS)
Surabaya.
e-mail: [email protected]
M. ISA IRAWAN:
supervisor, lecturer of Mathematics Department at Institut Teknologi Sepuluh Nopember
(ITS) Surabaya.
e-mail: [email protected]
MAYA SHOVITRI
co. supervisor, lecturer of Biology Department at Institut Teknologi Sepuluh Nopember (ITS)
Surabaya.
e-mail: [email protected]
514 S FAUZIYAH , M . I IRAWAN , M. SHOVITRI
Proceedings of ”The 6th SEAMS-UGM Conference 2011”
Applied Mathematics, pp. 515 – 528.
Abstract. In this paper, the linear state feedback controller for fuzzy model is designed by using
the linear matrix inequalities (LMIs). Fuzzy model are described as sum of weighting of r
subsystems. The controller design is guarantees the stability of system and satisfies the desired
transient responses of system. If the r approach to infinity, the existence of controller that stabilize
the system will be difficult to obtain. It is caused by find the solution of set of LMIs that satisfying
some the conditions. By relaxing the stability conditions we will formulate the problem of controller
design in LMIs feasibility problem.
Keywords and Phrases : fuzzy control, feedback, linear matrix inequalities, Lyapunov stability,
pole placement, Takagi-Sugeno model, relaxed stability conditions .
1. INTRODUCTION
In recent years, there have been many research efforts on these issues based on the
Takagi-Sugeno (TS) model based fuzzy control. For this TS model based fuzzy control
system, Wang et all [14] proved the stability by finding a common symmetric positive
definite matrix P for the r subsystems in general and suggested the idea of using Linear
Matrix Inequalities (LMIs). The process of controller design are involves an iterative process,
that is, for each rule a controller is designed based on consideration of local performance
only, then LMI-based stability analysis is carried out to check the global stability condition.
In the case that the stability conditions are not satisfied, the controller for each rule should be
redesigned.
Olsder [9] and Ogata, K, [8] presented about the basic theories of systems and controls.
Boyd, et all [1] discussed about the linear matrix inequality in system and control theory.
Lam, H.K et all [4] designed the controller to fuzzy system by linear matrix inequality
approach. Mastorakis, N.E. [5] discussed about the modeling of dynamic system by TS fuzzy
model. Messousi, W.E. et all [6] discussed a bout how the pole placement on fuzzy model by
515
516 SOLIKHATUN AND SALMAH
LMI-approach. Tanaka, K., dan Sugeno, M.[12] analyzed the stability and designed the fuzzy
control system.
Motivated by the LMI formulation of pole placement constraint of the conventional
state feedback in Chilali [2] and Hong, S.K. & Nam, Y [3], Solikhatun [11] modify the
formulation and apply to the multi-objective TS model based fuzzy logic controller design
problem. The design the fuzzy controller system for the way of simultaneously guaranteeing
global stability and adequate transient behavior (pre-specified transient performance) are
formulated. Tanaka and Wang [13] wrote that if the r approach to infinity then the existence
of controller that stabilize the system will be difficult to obtain. It is caused by find the
solution of set of LMIs that satisfying some the conditions. By relaxing the stability
conditions we will formulate the problem of controller design in LMIs feasibility problem.
Polderman [10] presented about the dynamic of satellite of motion system and the
factors that affect it and derive the equations of satellite motion of dynamic without
disturbance. Last, we will present the simulation results by apply the proposed methodology
to model of motion system of the satellite. The followed definitions, lemmas and theorems
are used in the main results.
Definition 1. The matrix P Rnxn is called positive definite matrix if ut Pu 0, u Rn .
Lemma 2. Suppose Q, R R nxn are symmetric matrix and matrix S R nxn . The condition
Q S
S t R
0 is equivalent to
R 0, Q SR1S 0 .
Lemma 2 is known as Schur complement.
Definition 3. Consider the linear x Ax, A Rnxn , x Rn , x(0) x0 .
system
Equilibrium point x is stable if for all 0 there ( ) 0 such that for each solution
x(t , x0 )
If x0 x then x(t , x0 ) x , t t0 .
Equilibrium point x is asymptotically stable if x stable and there 0 0 such that for each
solution x(t , x0 )
if x0 x 0 then lim x(t , x0 ) x 0 .
t
The stability of linear system can be formulated by LMI. It is known by Lyapunov
theorem about stability. In stabilization, we need a controller such that the state feedback
fuzzy control system is asymptotically stable, i.e.
lim x(t ) 0
t
continuously then
1. V (0) 0 and V ( x) 0, x S \ {0} .
2. If V ( x) 0, x S then x 0 is stable.
3. If V ( x) 0, x S \ {0} then x 0 is asymptotically stable.
Definition 6. Consider the set X and set M X . Consider function M is defined as
function of
M : X [0,1]
x X with the real number M (x) on [0,1] , with
that corresponding the each element of
the M (x) represent membership function grades for x on M . The fuzzy set M X is
defined by x, M ( x), x X .
Definition 7. A subset of D of the complex plane is called an LMI-D region if there exist a
symmetric matrix [ kl ] Rmxm and [kl ] Rmxm such that
D z C f D ( z) 0
where the characteristic function f D is given by
f D ( z) [ kl kl z kl z ]1k ,l m .
Example 8. Consider a circle LMI region D
D x iy C ( x q)2 y 2 r 2
r 0 , where the characteristic function is given by
centered at (-q, 0) and has radius
r z q
f D ( z) .
z q r
As shown in Figure 2, we can chose the poles in region D such that desired transient
responses of system.
518 SOLIKHATUN AND SALMAH
max n Im
(-q,0)
max d
Re
r
Definition 9. Consider the circular region D of the left half complex plane. The system
x Ax, x Rn , A Rnxn is said D-stable if all the poles lies on LMI-D region.
1.1 Affine Fuzzy Model. The problem of LMI-based fuzzy state feedback controller becomes
yet more complex if some of model parameters are unknown. By using a Takagi-Sugeno (TS)
fuzzy model, a nonlinear model can be expressed as a weighted sum of r simple subsystem.
The inference performed via the Takagi-Sugeno model is an interpolation of all the relevant
linear models. Takagi and Sugeno define the inference in the rule base as the weighted
average of each rule’s consequents :
r
( A x(t ) B u(t ) d )
i i i i
x(t ) i 1
r
. (1)
i 1
i
The TS fuzzy model consists of an if-then rule base. The rule antecedents partition a subset of
the model variables into fuzzy sets. The consequent of each rule is a simple functional
expression. The i-th rule of the Takagi-Sugeno fuzzy model is of the following form :
If x1 (t ) is Li1 and xn (t ) is Lin and u (t ) is M i
then x(t ) Ai x(t ) Biu(t ) di
where i 1, 2,..., r and r is the number of rules and Lij , j 1, 2,..., n and M i are fuzzy
sets centered at the i-th operating point. The categories of the fuzzy sets are expressed as
N_Left, Z_Equal and P_Right where N_Left represents negative, Z_Equal zero and P_Right
positive (Figure 1).
R egu la Fu z z y C on t ro ll er D es i gn on M od el of M ot i on S ys t e m . . . 519
-1 0 1
The truth value of the i-th rule in the set i is obtained as the product of the
membership function grades :
i ( x, u) L ( x1 )...L ( xn ).M (u)
i1 in i
with L ( x j )
ij
represent membership function grades for Lij at x j . Consider the linearized
state space form with the bias term d induced from the model linearization is follow:
x(t ) Ax(t ) Bu(t ) d (2)
with x ( x1 , x2 ,..., xn ) R , A R , B R , u R and d R . The model
n nxn nxm m n
(2) is known an affine model. When d 0 , this model is called a linear model.
1.2 Design Controller. The linear control theory can be used to design the consequent parts
of the fuzzy control rules because they are described by linear state equations. Suppose that
the control input u is
ui (t ) ui (t ) k0i
in order to cancel the bias term d i . Then the Takagi-Sugeno fuzzy model is described by
x(t ) Ai x(t ) Biui (t ), i 1, 2,..., r .
Hence, the state feedback controller described by
ui (t ) Ki x(t )
where Ki R n is vector of feedback gains to be chosen for i-th operating point. Therefore a
set of r control rules takes the following form:
If x1 (t ) is Li1 and xn (t ) is Lin and u (t ) is Mi
ui (t 1) Ki x(t ) koi
then
where the index t 1 in the consequent part is introduced to distinguish the previous control
action in the antecedent part in order to avoid algebraic loops.
The resulting total control action is
520 SOLIKHATUN AND SALMAH
(K x ki i 0i )
u i 1
r
. (3)
i 1
i
Substituting (3) into (1), the state feedback fuzzy control system can be represented by
r r
( A B K
i 1 j 1
i j i i j )x
x(t ) r r
. (4)
i 1 j 1
i j
i ( z ) r
Define hi r
then h ( z) 1. The system (4) can be written by
i
i ( z)
i 1
i 1
r
x (t ) hi Gii 2 hi h jGij ,
2
i 1 i j
( Ai Bi K j ) ( Aj B j Ki )
Gii Ai Bi Ki , i 1,2,...r and Gij ,i j r .
2
r r
1 r
Corollary 10. hi ( z)
i 1
2
2hi ( z)hj ( z) 0 where
r 1 i 1 i j
h ( z) 1, h ( z) 0
i 1
i i
for all i.
Proof. It holds since
r
1 r 1 r
hi ( z) i j (hi ( z) hj ( z))2 0 .■
2
2 h ( z ) h ( z )
i 1 r 1 i 1 i j r 1 i 1 i j
Corollary 11. If the number of rules that fire for all t is less than or equal to s, where
1 s r , then
r
1 r
hi ( z) 2hi ( z)hj ( z) 0
2
i 1 s 1 i 1 i j
r
where h ( z) 1, h ( z) 0 for all i.
i 1
i i
( Ai Bi K j )t P P( Ai Bi K j ) 0, i, j 1, 2,..., r . (5)
v* 0 rQ qQ ( Ai Bi K j )Q v 0
* 0
0 v qQ ( Ai Bi K j ) Q rQ
t
0 v
rv*Qv qv*Qv v* ( Ai Bi K j )Qv
* 0
qv Qv v Q( Ai Bi K j ) v rv*Qv
* t
r q
v*Qv 0
q r
Because Q 0 then we obtain
r q
f D ( ) 0.
q r
In other word the fuzzy control system (4) is D-stable.
Contrary. Consider the fuzzy control system (4) is D-stable. We divide into two cases. For the
case Ai Bi K j , i, j 1,2,..., r is diagonal matrix Diag (l ), l 1,2,..., n, l D .
rI qI It
Suppose that 0 then according to the Lemma 2, we obtain
qI I rI
522 SOLIKHATUN AND SALMAH
rI 0
1
rI (qI It )(rI )1 (qI I ) 0 rI (qI It )(qI I ) 0
r
Because r 0 then rI 0 . It is contradiction with assumption. For the case
Ai Bi K j , i, j 1,2,..., r is diagonalizable matrix then there exists an invertible matrix T
such that
T 1 ( Ai Bi K j )T , i, j 1,2,..., r
with is a diagonal matrix that elements of are eigenvalues of
Ai Bi K j , i, j 1,2,..., r . Then
rI qI I (T 1 ( Ai Bi K j )T )t
1
qI (T ( Ai Bi K j )T ) I rI
rI qI It
0.
qI I rI
For the case Ai Bi K j , i, j 1,2,..., r is not diagonalizable matrix then there exists an
invertible matrix T such that
T 1 ( Ai Bi K j )T J , i, j 1,2,..., r
with J is a Jordan matrix. Then
rI qI I (T 1 ( Ai Bi K j )T )t
1
qI (T ( Ai Bi K j )T ) I rI
rI qI IJ t
0.
qI JI rI
Let Q TT such that Q 0 . Then
*
T 0 rI qI I (T 1 ( Ai Bi K j )T )t T * 0
0 T qI (T 1 ( A B K )T ) I *
0
i i j rI 0 T
rTT * qTT * TT * (T 1 ( Ai Bi K j )T )t
1 0.
qTT (T ( Ai Bi K j )T )TT rTT *
* *
rQ qQ Q(T 1 ( Ai Bi K j )T )t
1 0
qQ (T ( Ai Bi K j )T )Q rQ
Furthermore if we take Re(Q) is matrix that its element is real part of Q then
R egu la Fu z z y C on t ro ll er D es i gn on M od el of M ot i on S ys t e m . . . 523
We will formulate a problem for the design of fuzzy state feedback control system
that guarantees stability and satisfies desired transient responses by using the LMIS
constraints. The LMIs formulations of fuzzy state feedback synthesis problem are followed:
Theorem 14. The fuzzy control system (4) can be stabilized in the LMI-D region if there
exists a common positive definite matrix Q and Yi such that the following conditions hold
AiQ QAit BiYi Yi t Bit 0
AiQ QAit BiY j Y jt Bit Aj Q QAtj B jYi Yi t Btj
0 (6)
2 2
rQ qQ QAit Yi t Bit
0, i, j 1,2,..., r .
qQ AiQ BiYi rQ
1
Given solution (Q, Yi ) , the fuzzy state feedback gain is obtained by Ki Yi Q .
Proof. System (4) can be presented as
1 r
x(t ) i iGii 2 i j Gij x ,
W i 1 i j
1 1
Gij ( Ai Bi K j ) ( Aj B j Ki ), i j , Gii ( Ai Bi Ki ) ( Ai Bi Ki ) and
2 2
r r
W i j . By define Q P 1 then LMI of (5) of stability Lyapunov can be rewrite
i 1 j 1
as
QGii t Gii Q 0, i 1, 2,..., r
QGij t Gij Q 0, i j r.
It is equivalent to
524 SOLIKHATUN AND SALMAH
Q( Ai Bi Ki )t ( Ai Bi Ki )Q 0
Q( Ai Bi K j ) ( Ai Bi K j )Q
t
Q( Aj B j Ki )t ( Aj B j Ki )Q
0
2 2
Q( Ai Bi Ki )t ( Ai Bi Ki )Q 0
Q Q
( Ai Bi K j )t ( Ai Bi K j )Q ( Aj B j Ki )t ( Aj B j Ki )Q 0
2 2
2 2
The last sufficient can be derived immediately from Theorem 13.■
Fuzzy controller model (4) is described as sum of weighting of r subsystems. The controller
design is guarantees the stability of system and satisfies the desired transient responses of
system. If the r approach to infinity, the existence of controller that stabilize the system will
be difficult to obtain. It is caused by find the solution of set of LMIs that satisfying some the
conditions.
There are two approach to relax the stability conditions according [7], namely, the global and
the regional Membership Function Shape Dependent (MFSD). In this paper we used the
regional MFSD. The operating regions of membership functions is divided into r region. Each
of region have has individual constraints that brings regional information to relaxation of
stability condition by some slack matrices T. We will formulate the problem of controller
design in LMIs feasibility problem as followed:
Theorem 15. Assume that the number of rules that fire for all t is less than or equal to s,
where 1 s r . The fuzzy control system (4) can be stabilized in the specified region D if
there exists a common positive definite matrix Q , Yi and a common positive semi definite
matrix T such that the following conditions hold
AiQ QAit BiYi Yi t Bit ( s 1)T 0
AiQ QAit BiY j Y jt Bit Aj Q QAtj B jYi Yi t Btj ( s 1)T
0
2 2 2 (7)
rQ qQ QAit Yi t Bit
0, i j 1,2,..., r , hi h j
qQ AiQ BiYi rQ
where s 1 . Given solution (Q, Yi ) , the fuzzy state feedback gain is obtained by
Ki YiQ1 .
Proof. Consider a candidate of Lyapunov function V ( x(t )) xt (t ) Px(t ), P 0 . Then
R egu la Fu z z y C on t ro ll er D es i gn on M od el of M ot i on S ys t e m . . . 525
r
V ( x(t )) hi ( z )xt (t )(Gii P PGii ) x(t )
2
i 1
r (Gij G ji )t (G G ji )
i 1 i j
2 hi ( z ) h j ( z ) x t
(t )
2
P P ij
2
x(t )
1 1
Gij ( Ai Bi K j ) ( Aj B j Ki ), i j and Gii ( Ai Bi Ki ) ( Ai Bi Ki ) .
2 2
From condition of second LMI (7) and Corollary 10, we have
r r
V ( x(t )) hi ( z )xt (t )(Gii P PGii ) x(t ) 2hi ( z )h j ( z ) xt (t )Tx(t )
2
i 1 i 1 i j
r r
hi ( z )xt (t )(Gii P PGii ) x(t ) ( s 1) hi ( z )xt (t )Tx(t )
2 2
i 1 i 1
r
hi ( z )xt (t )(Gii P PGii ( s 1)T ) x(t ).
2
i 1
r
Where x(t ) r and u (t ) ur (t ) .
( t ) u (t )
( )
The r (t ) is the distance between the surface earth and the satellite that is affected by
time. The (t ) t is the different between angle and angle position that is affected by time.
Let the x1 (t ) as variable fuzzy, then exist three power one rules as followed:
526 SOLIKHATUN AND SALMAH
1.5
To: Out(1)
1
9
x 10 Respon impulse bef ore are given controller
5 0.5
0 0
Amplitude
To: Out(1)
-5
-0.5 -4
x 10
-10
-15
Amplitude
-20 5
x 10 2
To: Out(2)
15
10
To: Out(2)
1
5
0
-5 0 0.5 1 1.5 2 2.5 3 3.5 4 4.5 5
0 0.5 1 1.5 2 2.5 3 3.5 4 4.5 5
Time (sec)
Time (sec)
R egu la Fu z z y C on t ro ll er D es i gn on M od el of M ot i on S ys t e m . . . 527
1.5
To: Out(1)
0.5
0
Amplitude
-0.5 -4
x 10
2
To: Out(2)
0
0 0.5 1 1.5 2 2.5 3 3.5 4 4.5 5
Time (sec)
3. CONCLUSIONS
In this paper, the linear state feedback fuzzy controller with guaranteed stability and pre-
specified transient performance is presented. By formulate the system into TS-fuzzy model
and recasting these constraints into LMIs, we formulate an LMI feasibility problem for the
design of the fuzzy state feedback control system. If the r approach to infinity then the
existence of controller that stabilizes the system will be difficult to obtain. It is caused by find
the solution of set of LMIs that satisfying some the conditions. By relaxing the stability
conditions we have formulated the problem of controller design in LMIs feasibility problem.
References
[1] BOYD, S., GHAOUI, FERON, LE., E. AND BALAKRISHNAN, V., LMI in System and Control Theory,
SIAM, Philadelphia, 1994.
[2] CHILALI, M., AND GAHINET, P., H Design with Pole Placement Constraints: An LMI Approach,
IEEE Trans. Automatic Control, Vol 41, No 3 pp 358-367,1996.
[3] HONG, S.K. AND NAM, Y., Stable Fuzzy Control System Design with Pole Placement Constraint: An
LMI Approach, Computer in Industry, Elsevier Science, 2003.
[4] LAM, H.K., LEUNG, F.H.F. AND TAM, P.K.S., A LMI Approach for Control of Uncertain Fuzzy
System, IEEE Control Systems Magazine, August 2002.
[5] MASTORAKIS, N.E., Modeling Dynamical Systems via the Takagi-Sugeno Fuzzy Model, Department of
Electrical Engineering and Computer Science, Hellenic Naval Academy Piraeus, Greece, 2004.
[6] MESSOUSI, W.E., PAGES, O. AND HAJJAJI, A.E., Robust Pole Placement for Fuzzy Models with
Parametric Uncertainties: An LMI Approach, University of Picardie Jules Verne, France, 2005.
[7] NARIMANI, M and LAM, HK., Relaxed LMI-Based Stability Conditions for Takagi-Sugeno Fuzzy
Control Systems Using Regional-Membership-Function-Shape-Dependent Analysis Approach, IEEE
Transaction on Fuzzy Systems, Vol 17, No 5, 2009.
[8] OGATA, K, Modern Control Engineering, 2nd ed. Englewood Cliffs, N.J,: Prentice Hall, Inc, 1990.
[9] OLSDER,J., Mathematical System Theory, Faculty of Technical Mathematics and Informatics, Delf
528 SOLIKHATUN AND SALMAH
Solikhatun
Department of Mathematics
Faculty of Mathematics and Natural Sciences
Gadjah Mada University, Yogyakarta
e-mail : [email protected]
Salmah
Department of Mathematics
Faculty of Mathematics and Natural Sciences
Gadjah Mada University, Yogyakarta
e-mail : [email protected]
Proceedings of ”The 6th SEAMS-UGM Conference 2011”
Applied Mathematics, pp. 529–546.
Abstract. This work investigates the application of the new successive linearisation
method (SLM) to the problem of unsteady heat and mass transfer from a stretching sur-
face embedded in a porous medium with suction/injection and thermal radiation effects.
The governing non-linear momentum, energy and mass transfer equations are success-
fully solved numerically using the SLM approach coupled with the spectral collocation
method for iteratively solving the governing linearised equations. Comparison of the SLM
results for various flow parameters against numerical results and other published results,
obtained using the Homotopy Analysis Method and Runge-Kutta methods, for related
problems indicates that the SLM is a very powerful tool which is much more accurate
and efficient than other methods. The SLM converges much faster than the traditional
methods like the Homotopy Analysis Method and is very easy to implement.
Keywords and Phrases: successive linearization, heat & mass transfer, porous medium,
thermal radiation.
1. INTRODUCTION
The study of heat and mass transfer over a stretching surface is important in many
industrial applications such as hot rolling and wire drawing, glass fibre production, the
aerodynamic extrusion of plastic sheets, the continuous casting, paper production, glass
blowing and metal spinning. The quality of the final product depends on the rate of
heat transfer at the surface. In the pioneering work of Crane [1], the flow of Newtonian
fluid over a linearly stretching surface was studied. Subsequently, the pioneering works
of Crane are extended by many authors to explore various aspects of the flow and heat
transfer occurring in an infinite domain of the fluid surrounding the stretching sheet
529
530 S. Shateyi, S.S. Motsa
(e.g Laha et al. [2]; Afzal [3]; Prasad et al. [4]; Abel and Mahesha [5]; Abel et al. [6];
Cortell [7]).
Physically, the problem of natural/mixed convection flow past a stretching sheet
embedded in porous medium arises in some metallurgical processes which involve the
cooling of continuous strips or filaments by drawing them through quiescent fluid. Draw-
ing the strips through porous media allows to control the rate of cooling better and the
final product of desired characteristics can be achieved. Abdou [8] developed a numeri-
cal model to study the effect of thermal radiation on unsteady boundary layer flow with
temperature dependent viscosity and thermal conductivity due to a stretching sheet in
porous media. Pal and Mondal [9] performed a boundary layer analysis to study the
influence of thermal radiation and buoyancy force on two-dimensional magnetohydrody-
namic flow of an incompressible viscous and electrically conducting fluid over a vertical
stretching sheet embedded in a porous medium in the presence of inertia effect.
Motivated by the previous works and the vast possible industrial applications, it
is of interest in this article to analyze unsteady heat and mass transfer from a stretching
permeable surface embedded in a porous medium with suction/injection and thermal
radiation. The governing partial differential equations are transformed into ordinary
differential equations using the similarity transformation, before being solved by a new
technique called successive linearization method (SLM).
2. MATHEMATICAL FORMULATION
We consider an unsteady boundary-layer flow due to a stretching permeable sur-
face embedded in a uniform porous medium and issuing from a slot as shown in Figure
1. We assume that equal and opposite forces are applied along the x− axis so that the
sheet is stretched, keeping the origin fixed in the fluid of the ambient temperature T∞
and concentration C∞ .At t = 0, the sheet is impulsively stretched with the variable ve-
locity Uw (x, t), the temperature distribution Tw (x, t) and the concentration distribution
Cw (x, t) varies both along the sheet and with time. We assume the fluid properties to
be constant. The radiative heat flux in the x− direction is negligible in comparison with
that in the y−direction. The fluid flow over the unsteady stretching sheet is composed
of a concentration. Under these assumption with the usual Boussinesq approximation,
the governing boundary-layer equations for this investigation are given by:
∂u/∂x + ∂v/∂y = 0,     (1)

∂u/∂t + u ∂u/∂x + v ∂u/∂y = ν ∂²u/∂y² + gβ_T(T − T∞) + gβ_C(C − C∞) − (ν/k)u,     (2)

∂T/∂t + u ∂T/∂x + v ∂T/∂y = α ∂²T/∂y² − (1/(ρc_p)) ∂q_r/∂y,     (3)

∂C/∂t + u ∂C/∂x + v ∂C/∂y = D ∂²C/∂y².     (4)
Figure 1. Physical model and coordinate system: the stretching sheet issues from a slit at the origin, is stretched along the x-axis by an applied force, and the y-axis is normal to the sheet.
By using the Rosseland diffusion approximation (Hossain et al. [12], Seddeek [13]) and following Raptis [14] among other researchers, the radiative heat flux q_r is given by

q_r = −(4σ*/(3K_s)) ∂T⁴/∂y,     (7)

where σ* and K_s are the Stefan–Boltzmann constant and the Rosseland mean absorption coefficient, respectively. We assume that the temperature differences within the flow are sufficiently small such that T⁴ may be expressed as a linear function of temperature,

T⁴ ≈ 4T∞³ T − 3T∞⁴.     (8)

Using (7) and (8) in the last term of equation (3) we obtain

∂q_r/∂y = −(16σ*T∞³/(3K_s)) ∂²T/∂y².     (9)
The stretching velocity Uw(x, t), the surface temperature Tw(x, t) and the surface concentration Cw(x, t) are assumed to be of the form:

Uw(x, t) = ax/(1 − ct),  Tw(x, t) = T∞ + bx/(1 − ct),  Cw(x, t) = C∞ + bx/(1 − ct),     (10)

where a, b and c are positive constants with dimension of reciprocal time and ct < 1. The effective stretching rate a/(1 − ct) increases with time. In the context of polymer extrusion the material properties may vary with time even though the sheet is being pulled by a constant force. With unsteady stretching (i.e. c ≠ 0), however, a⁻¹ becomes the representative time scale of the resulting unsteady boundary layer problem. The expressions for the temperature Tw(x, t) and concentration Cw(x, t) of the sheet represent a situation in which the sheet temperature and concentration increase (decrease) if b is positive (negative) from T∞ and C∞, respectively, at the leading edge (x = 0) in proportion to x, and such that the amount of temperature and concentration increase (decrease) along the sheet increases with time. Further, in equation (5) Vw is the suction/injection parameter, with Vw > 0 for injection and Vw < 0 for suction.
We introduce the following self-similar transformation (see Ishak et al. [15] and Andersson et al. [16], among others):

η = (Uw/(νx))^{1/2} y,  ψ = (νxUw)^{1/2} f(η),  θ = (T − T∞)/(Tw − T∞),  φ = (C − C∞)/(Cw − C∞),     (11)

where ψ(x, y) is the physical stream function, which automatically satisfies the continuity equation (1). The velocity components are then given as u = ∂ψ/∂y and v = −∂ψ/∂x.
The governing equations are then transformed into a set of ordinary differential equations and associated boundary conditions as given below:

f''' + f f'' − (f')² − K f' − A(f' + (η/2) f'') + Gr θ + Gc φ = 0,     (12)

((1 + R)/Pr) θ'' + f θ' − f' θ − A(θ + (η/2) θ') = 0,     (13)

(1/Sc) φ'' + f φ' − f' φ − A(φ + (η/2) φ') = 0,     (14)

where the prime indicates differentiation with respect to η and

A = c/a,  R = 16σ*T∞³/(3K_s αρc_p),  Gr = gβ_T(Tw − T∞)x³/ν²,  Gc = gβ_C(Cw − C∞)x³/ν²,  K = νx/(kUw),  Pr = ν/α,  Sc = ν/D.     (15)
In view of equations (11), the boundary conditions (5) to (6) transform accordingly. The physical quantities of interest are the skin-friction coefficient Cf, the local Nusselt number Nux and the local Sherwood number Shx, which are defined as follows:

(1/2) Re_x^{1/2} Cf = −f''(0),     (18)

Nux Re_x^{−1/2} = −θ'(0),     (19)

Shx Re_x^{−1/2} = −φ'(0),     (20)

where Re_x = xUw/ν is the local Reynolds number.
To solve the transformed equations (12)–(14) we apply the successive linearization method (SLM), in which the unknown functions are written as

f_i(η) = F_i(η) + Σ_{m=0}^{i−1} f_m(η),  θ_i(η) = G_i(η) + Σ_{m=0}^{i−1} θ_m(η),  φ_i(η) = H_i(η) + Σ_{m=0}^{i−1} φ_m(η),     (21)

where i = 1, 2, 3, …; F_i, G_i, H_i are unknown functions and f_m, θ_m, φ_m (m ≥ 1) are the successive approximations which are obtained by recursively solving the linear part of the equation system that results from substituting (21) into the governing equations (12)–(14). Substituting (21) into the governing equations gives
F_i''' + a_{1,i−1} F_i'' + a_{2,i−1} F_i' + a_{3,i−1} F_i + Gr G_i + Gc H_i + F_i'' F_i − F_i' F_i' = r_{i−1},

((1 + R)/Pr) G_i'' + b_{1,i−1} G_i' + b_{2,i−1} G_i + b_{3,i−1} F_i' + b_{4,i−1} F_i + F_i G_i' − F_i' G_i = s_{i−1},

(1/Sc) H_i'' + c_{1,i−1} H_i' + c_{2,i−1} H_i + c_{3,i−1} F_i' + c_{4,i−1} F_i + F_i H_i' − F_i' H_i = t_{i−1},     (22)
where the coefficient parameters a_{k,i−1}, b_{k,i−1}, c_{k,i−1} (k = 1, …, 4), r_{i−1}, s_{i−1} and t_{i−1} are defined as

a_{1,i−1} = Σ f_m − (Aη/2),  a_{2,i−1} = −2 Σ f_m' − K − A,  a_{3,i−1} = Σ f_m'',

b_{1,i−1} = a_{1,i−1},  b_{2,i−1} = −Σ f_m' − A,  b_{3,i−1} = −Σ θ_m,  b_{4,i−1} = Σ θ_m',

c_{1,i−1} = a_{1,i−1},  c_{2,i−1} = b_{2,i−1},  c_{3,i−1} = −Σ φ_m,  c_{4,i−1} = Σ φ_m',

r_{i−1} = −[ Σ f_m''' + (Σ f_m)(Σ f_m'') − (Σ f_m')² − (K + A) Σ f_m' − (Aη/2) Σ f_m'' + Gr Σ θ_m + Gc Σ φ_m ],

s_{i−1} = −[ ((1 + R)/Pr) Σ θ_m'' + (Σ θ_m')(Σ f_m) − (Σ f_m')(Σ θ_m) − A(Σ θ_m + (η/2) Σ θ_m') ],

t_{i−1} = −[ (1/Sc) Σ φ_m'' + (Σ φ_m')(Σ f_m) − (Σ f_m')(Σ φ_m) − A(Σ φ_m + (η/2) Σ φ_m') ],

where all sums run from m = 0 to i − 1,
which are chosen to satisfy the boundary conditions. The solutions for f_m, θ_m, φ_m for m ≥ 1 are obtained by successively solving the linearised form of equations (22), which is given as
f_i''' + a_{1,i−1} f_i'' + a_{2,i−1} f_i' + a_{3,i−1} f_i + Gr θ_i + Gc φ_i = r_{i−1},     (24)

((1 + R)/Pr) θ_i'' + b_{1,i−1} θ_i' + b_{2,i−1} θ_i + b_{3,i−1} f_i' + b_{4,i−1} f_i = s_{i−1},     (25)

(1/Sc) φ_i'' + c_{1,i−1} φ_i' + c_{2,i−1} φ_i + c_{3,i−1} f_i' + c_{4,i−1} f_i = t_{i−1},     (26)

with boundary conditions

f_i(0) = f_i'(0) = f_i'(∞) = θ_i(0) = θ_i(∞) = φ_i(0) = φ_i(∞) = 0.     (27)
Once each solution for f_i, θ_i and φ_i (i ≥ 1) has been found by iteratively solving equations (24)–(26) for each i, the approximate solutions for f(η), θ(η) and φ(η) are obtained as

f(η) ≈ Σ_{m=0}^{M} f_m(η),  θ(η) ≈ Σ_{m=0}^{M} θ_m(η),  φ(η) ≈ Σ_{m=0}^{M} φ_m(η).     (28)
Since the coefficient parameters and the right hand side of equations (24) - (26), for i =
1, 2, 3, . . ., are known (from previous iterations), the equation system (24 - 27) can easily
be solved using analytical means (whenever possible) or any numerical methods such as
finite differences, finite elements, Runge-Kutta based shooting methods or collocation
methods. In this work, equations (24 - 27) are solved using the Chebyshev spectral
collocation method.
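To make the iteration structure concrete, the following minimal Python sketch (ours, not the authors' implementation) applies the successive linearisation idea to the reduced momentum equation f''' + f f'' − (f')² − K f' = 0 with f(0) = 0, f'(0) = 1, f'(∞) = 0, i.e. equation (12) with Gr = Gc = A = 0. For this special case f = (1 − e^{−sη})/s with s = √(1 + K) is an exact solution, which lets the convergence of the iterates be checked. For brevity each linear problem is solved with scipy's solve_bvp on a truncated domain rather than with Chebyshev collocation; the value of K, the domain length, the initial guess and all variable names are our own illustrative choices.

```python
import numpy as np
from scipy.integrate import solve_bvp

# Reduced test case of (12): f''' + f f'' - (f')^2 - K f' = 0,
# f(0) = 0, f'(0) = 1, f'(inf) = 0, exact solution f = (1 - exp(-s*eta))/s.
K, L, n = 0.5, 10.0, 201                # our choices: parameter, cut-off, grid size
eta = np.linspace(0.0, L, n)
s = np.sqrt(1.0 + K)
f_exact = (1.0 - np.exp(-s * eta)) / s

# initial approximation f_0 = 1 - exp(-eta), chosen only to satisfy the BCs
fs, fs1, fs2, fs3 = 1.0 - np.exp(-eta), np.exp(-eta), -np.exp(-eta), np.exp(-eta)

for i in range(6):
    # residual of the current approximation; the correction F solves the
    # linearised equation  F''' + fs F'' - (2 fs' + K) F' + fs'' F = r
    r = -(fs3 + fs * fs2 - fs1**2 - K * fs1)

    def ode(x, y):
        F, F1, F2 = y                                   # F, F', F''
        F3 = (np.interp(x, eta, r) - np.interp(x, eta, fs) * F2
              + (2.0 * np.interp(x, eta, fs1) + K) * F1
              - np.interp(x, eta, fs2) * F)
        return np.vstack([F1, F2, F3])

    def bc(ya, yb):
        # the updated sum must satisfy f(0) = 0, f'(0) = 1, f'(L) = 0
        return np.array([ya[0] + fs[0], ya[1] - (1.0 - fs1[0]), yb[1] + fs1[-1]])

    sol = solve_bvp(ode, bc, eta, np.zeros((3, n)))
    F, F1, F2 = sol.sol(eta)
    F3 = ode(eta, np.vstack([F, F1, F2]))[2]
    fs, fs1, fs2, fs3 = fs + F, fs1 + F1, fs2 + F2, fs3 + F3
    print(f"iteration {i + 1}: max|f - f_exact| = {np.abs(fs - f_exact).max():.2e}")
```

Each pass solves one linear boundary value problem for the correction and adds it to the running sum, exactly as in (21) and (28); the printed maximum error should decrease rapidly from one iteration to the next.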
Table 1 indicates that the SLM is very accurate and rapidly converges to the
numerical results generated by bvp4c. The numerical results obtained by Ziabakhsh et
al. [10] are comparable with the present numerical results of bvp4c. The HAM results
are not so accurate. We also observe in Table 1 that the local skin friction for the flow
is increased by increasing values of fw and A.
Table 2 depicts the comparison between the present successive linearisation results and the bvp4c numerical results against the results of Pal and Hiremath [11]. It can be clearly seen in this table that the SLM is very accurate and converges rapidly to the numerical results generated by bvp4c. At the 3rd order of approximation the solution has already converged. The numerical results obtained by Pal and Hiremath [11] using the Runge–Kutta–Fehlberg method with a shooting technique are not as accurate as our present results. In Table 2 we also observe that the skin friction increases as the values of the permeability parameter K increase.
The results of both the SLM and the numerical bvp4c computations are displayed in Figures 2 - 5 for the non-dimensional velocity f′(η), temperature θ(η) and concentration φ(η). As expected, the velocity f′(η) is a decreasing function of A, as clearly shown in Figure 2. As can be seen in Figure 2, increasing the unsteadiness parameter A reduces the flow properties such as the velocity, the temperature θ(η) and the concentration φ(η). As evidenced in Figure 2, the boundary layer thickness decreases with increasing values of A. As a consequence the transition of the boundary layer to turbulent flow conditions occurs farther downstream. This confirms that the stretching of surfaces can be used as a flow stabilizing mechanism.

Figure 3 depicts the effects of varying the permeability parameter K of the porous medium on the velocity f′(η), temperature θ(η) and concentration φ(η). The parameter K, as defined in equation (15), is inversely proportional to the actual permeability k of the porous medium. Since the porous medium represents a resistance to the flow that restricts the motion of the fluid along the surface, the stream function and consequently the velocity f′(η) decrease as K increases (that is, as the physical permeability k decreases). With increasing K the thickness of the boundary layer increases, so the velocity decreases with the increase of K. This is by virtue of the fact that the opposing effect of the porous medium also increases and leads to an enhanced deceleration of the flow. We observe in Figure 3 that both the fluid temperature and the concentration are increasing functions of the permeability parameter K. This is consistent with the fact that an increase of the porosity parameter causes the fluid velocity to decrease, because of which there is a rise in the temperature and concentration in the boundary layer.
Figure 4 represents the velocity, temperature and concentration profiles for fw = 0, 1, 2 and 3. We see that the effect of suction is to decrease the horizontal velocity f′(η). The physical explanation for this behaviour is that when stronger suction is applied, the heated fluid is sucked through the wall where buoyancy forces can act to decelerate the flow with more influence of viscosity. Sucking decelerated fluid particles through the porous wall reduces the growth of the momentum boundary layer as well as of the thermal and concentration boundary layers. From Figure 4 it is clear that the dimensionless temperature and concentration decrease due to suction. The physical interpretation of this is that fluid at the ambient conditions is brought closer to the surface, which reduces the thermal and solutal boundary layer thicknesses, whereas the temperature and concentration increase due to injection. The thermal and solutal boundary layer thicknesses increase (decrease) with injection (suction), which causes decreases (increases) in the rates of heat and mass transfer.

Figure 2. Effect of the unsteadiness parameter A (A = 0, 1, 2, 5) on the velocity f′(η), temperature θ(η) and concentration φ(η) profiles.

Figure 3. Effect of the permeability parameter K (K = 0, 2, 4, 8) on the velocity f′(η), temperature θ(η) and concentration φ(η) profiles.
Figure 5 shows the thermal radiation effect on the velocity, temperature and concen-
tration profiles. Increasing the thermal radiation parameter produces increases in the
stream function, velocity and temperature of the fluid. This can be explained by the
fact that the effect of radiation R is to increase the rate of energy transport to the fluid
and accordingly to increase the fluid temperature. In Figure 5 we see that radiation
has no significant effect on the concentration of the flow.
Figure 4. Effect of the suction/injection parameter fw (fw = 0, 1, 2, 3) on the velocity f′(η), temperature θ(η) and concentration φ(η) profiles.

Figure 5. Effect of the thermal radiation parameter R (R = 0, 1, 2, 5) on the velocity f′(η), temperature θ(η) and concentration φ(η) profiles.
Nomenclature
a, b, c constants
A unsteadiness parameter
C species concentration at any point in the flow field
Cf skin-friction coefficient
Cw species concentration at the wall
cp specific heat at constant pressure
C∞ species concentration at the free stream
D molecular diffusivity of the species concentration
f dimensionless stream function
f′ dimensionless velocity
fw mass transfer coefficient
g acceleration due to gravity
Gc Concentration buoyancy parameter
Gr Grashof number
k Darcy permeability
K permeability parameter
Ks mean-absorption coefficient
N ux local Nusselt number
Pr Prandtl number
qr radiative heat flux
R thermal radiation parameter
Rex Reynolds number
Sc Schmidt number
Shx local Sherwood number
T fluid temperature at any point
Tw fluid temperature at the wall
T∞ free stream temperature
u streamwise velocity
Uw velocity of the stretching sheet
v normal velocity
Vw Suction/injection velocity
x streamwise coordinate axis
y normal coordinate axis
Greek Symbol
α thermal diffusivity
ν kinematic viscosity
βC volumetric coefficient expansion with concentration
βT volumetric coefficient of thermal expansion
ρ density of the fluid
σ* Stefan–Boltzmann constant
η similarity variable
θ dimensionless temperature
φ Dimensionless concentration
5. CONCLUDING REMARKS
In this work, we employed a very powerful new linearisation technique, known
as the Successive Linearisation Method (SLM), to study the unsteady heat and mass
transfer from a stretching surface embedded in a porous medium with suction/injection
and thermal radiation effects. The SLM results for the governing flow properties, such
as velocity profiles, temperature profiles, concentration profiles, wall heat transfer and
mass transfer, were compared with results obtained using MATLAB’s bvp4c function
and excellent agreement was observed. The SLM was found to converge very rapidly to the numerical results, and an accuracy of 10⁻⁷ was achieved after only two or three iterations for all values of the governing physical parameters considered.
Acknowledgement. The authors acknowledge the financial support from the National
Research Foundation (NRF) and University of Venda.
References
[1] Crane, L. J., Flow past a stretching plate, Z. Angew. Math. Phys. 21, 645-647, 1970.
[2] Laha, M. K., P. S Gupta and A. S Gupta, Heat transfer characteristics of the flow of an
incompressible viscous fluid over a stretching sheet, Warme-und Stoffubertrag, 24, 151-153, 1989.
[3] Afzal, N., Heat transfer from a stretching surface, Int. J. Heat Mass Trans. 36, 1128-1131, 1993.
[4] Prasad, K. V., A. S Abel, and P. S. Datti., Diffusion of chemically reactive species of non-
Newtonian fluid immersed in a porous medium over a stretching sheet, Int.J. Non-Linear Mech.
38, 651-657, 2003.
[5] Abel, M.S. and Mahesha N., Heat transfer in MHD viscoelastic fluid over a stretching sheet
with variable thermal conductivity, non-uniform heat source and radiation, Appl. Math. Modell.
32, 1965-1983, 2008.
[6] Abel, P. G. Siddheshwar and Mahantesh M. Nandeppanava, Heat transfer in a viscoelastic
boundary layer flow over a stretching sheet with viscous dissipation and non-uniform heat source,
Int. J. Heat Mass Trans. 50, 960-966, 2007.
[7] Cortell, R., Viscoelastic fluid flow and heat transfer over a stretching sheet under the effects of
a non-uniform heat source, viscous dissipation and thermal radiation, Int. J. Heat Mass Trans.
50, 3152-3162, 2007.
[8] Abdou, M.M.M., Effect of radiation with temperature dependent viscosity and thermal conductivity on an unsteady stretching sheet through porous media, Nonlinear Analysis: Modelling and Control, 15(3), 257-270, 2010.
[9] Pal, D., and S. Chatterjee, Heat and mass transfer in MHD non-Darcian flow of a
micropolar fluid over a stretching sheet embedded in a porous media with non-uniform
heat source and thermal radiation,Commun Nonlinear Sci Numer Simulat, 15(7), 1843-1857,
2010.
[10] Ziabakhsh, Z, Domairry, G. Mozaffari, M. Mahbobifar, M., Analytical solution of heat
transfer over an unsteady stretching permeable surface with prescribed wall temper-
ature, J. Taiwan Inst. Chem. Eng., 41 (2), 169-177, 2010.
[11] Pal, D., and P. S. Hiremath, Computational modeling of heat transfer over an unsteady
stretching surface embedded in a porous medium, Meccanica, 45(3), 415-524, 2009.
[12] Hossain, M. A., M.A. Alim, and D. A. S. Rees, The effect of radiation on free convection
from a porous vertical plate,Int. J. Heat Mass Transfer, 42, 181 - 191, 1999.
[13] Seddeek, M. A.M., Thermal radiation and buoyancy effects on MHD free convection heat generation flow over an accelerating permeable surface with temperature dependent viscosity, Canadian Journal of Physics, 79(4), 725-732, 2001.
[14] Raptis, A., Flow of a micropolar fluid past a continuously moving plate by the pres-
ence of radiation, Int. J. Heat Mass Transfer, 41, 2865-2866, 1998.
[15] Ishak, A., R. Nazar, and I. Pop, Heat transfer over an unsteady stretching permeable surface with prescribed wall temperature, Nonlinear Analysis: Real World Applications, 10, 2909-2913, 2009.
[16] Andersson, H. I., J. B. Aarseth, N. Braud, and B. S. Dandapat, Flow of a power-law
fluid film on an unsteady stretching surface, J.Non-Newtonian Fluid Mech. 62, 1-8, 1996.
Abstract. We develop a level set method for computing multi-valued solutions to quasilinear hyperbolic partial differential equations. Here we apply the method to a nonlinear two-channel dissipation model. The model does not have a level set equation, so we find level set equations that approximate the model, and we explain the difference between the level set and level-set-like equations. The multi-valued solutions of the model are approximated as the zeros of a set of scalar functions that solve initial value problems for time dependent linear partial differential equations in an augmented space.
Keywords and Phrases: Hyperbolic PDEs, Multi-valued solutions, Level set method
1. INTRODUCTION
The phenomenon of wave breaking occurs in many physical systems. Whereas the physics of one system may dictate that a shock develops after the wave breaking event, the physics of another system may dictate that the formation of multivalued solutions is appropriate after the wave breaking event. Physical systems where multi-valued solutions may be appropriate include geometric optics (Osher et al. [10]), arrival times in seismic imaging (Fomel et al. [5]), nonlinear plasma waves, stellar dynamics and galaxy formation, multi-lane traffic flows, and multi-phase fluids. We therefore need to compute multivalued solutions.
There are two classes of methods used to compute multivalued solutions of nonlinear partial differential equations. The first class is the Lagrangian methods, which solve a set of ordinary differential equations in order to trace the wavefronts (Benamou [2], [3]). The second class involves Eulerian methods, which solve partial differential equations on a fixed grid. The methods based in physical space often use the classical Liouville equations. For the computation of wavefronts, Liouville equation based phase space techniques have been used with the
______________________________________________
2010 Mathematics Subject Classification :35F06,35L06,65M06
segment projection method (Engquist et al. [4]) or with the level set method (Osher et al. [10], Jin et al. [6], [7]). There are also nonlinear partial differential equations that cannot be transformed into Liouville equations; these have been solved with the level set method, for example multidimensional hyperbolic partial differential equations (Liu et al. [8], [9]).
In this paper we develop a level set method to solve a system of nonlinear partial differential equations that does not have a level set equation. The system is

u_t + u u_x = v − u,
v_t + v v_x = u − v.     (1)

We may view these as inviscid Burgers equations in two parallel channels with a dissipative exchange between them, which is called the nonlinear two-channel dissipation model. This system was presented by Van Beckum [1], who also derived a travelling wave solution, i.e. a solution that travels at constant speed and is undisturbed in shape. Here we find level set equations that approximate the nonlinear two-channel dissipation model. The method is named the level-set-like method.
In this section we discuss level set equations for the computation of multi-valued solutions to nonlinear differential equations. We formulate level set equations for a first order ordinary differential equation, a first order partial differential equation and a system of ordinary differential equations. In the last example we give level set equations that approximate a system of ordinary differential equations. We present these in the following lemmas.
φ(u, x, t) = u e^t − f(x − u(e^t − 1)).

If φ(u, x, t) = 0, then

u = f(ξ) e^{−t},  x = ξ + f(ξ)(1 − e^{−t}).

The last two equations are the solution of the characteristics of the initial value problem

∂u(x, t)/∂t + u(x, t) ∂u(x, t)/∂x = −u(x, t),
u(x, 0) = f(x).

The proof is complete.
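As a quick numerical illustration of this characteristic solution (our sketch, not part of the original paper), the script below samples ξ and evaluates u = f(ξ)e^{−t} and x = ξ + f(ξ)(1 − e^{−t}) at a few times; once the map ξ ↦ x folds over, the sampled graph of u is multi-valued. The initial profile f and all numerical choices are ours, for illustration only.

```python
import numpy as np

# Lagrangian tracing of the characteristics of u_t + u u_x = -u, u(x,0) = f(x):
#   u = f(xi) * exp(-t),   x = xi + f(xi) * (1 - exp(-t))
def trace(f, xi, t):
    u = f(xi) * np.exp(-t)
    x = xi + f(xi) * (1.0 - np.exp(-t))
    return x, u

f = lambda x: -np.tanh(5.0 * x)          # a steep decreasing profile (our choice)
xi = np.linspace(-2.0, 2.0, 2001)
for t in (0.0, 0.5, 1.5):
    x, u = trace(f, xi, t)
    folds = int(np.sum(np.diff(x) < 0))  # where dx/dxi < 0 the graph of u has folded
    print(f"t = {t:3.1f}: folded sample points = {folds}")
```

The folding of the characteristics is precisely the situation in which a single-valued viscosity solution would introduce a shock, while the level set viewpoint keeps all branches.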
Lemma 3. If the pair (φ₁(x, y, t), φ₂(x, y, t)) is the solution of the initial value problem

∂φ₁/∂t + f(x, y) ∂φ₁/∂x + g(x, y) ∂φ₁/∂y = 0,
∂φ₂/∂t + f(x, y) ∂φ₂/∂x + g(x, y) ∂φ₂/∂y = 0,     (7)
φ₁(x, y, 0) = x − x₀,  φ₂(x, y, 0) = y − y₀,

then the intersection of φ₁(x, y, t) = 0 and φ₂(x, y, t) = 0 is the solution of the initial value problem

dx/dt = f(x, y),  dy/dt = g(x, y),  x(0) = x₀,  y(0) = y₀.     (8)
The characteristics of the first and the second equation of (7) are, respectively,

dx/dt = f(x, y),  dy/dt = g(x, y),  with t = 0, x = x₀, y = β₁,

and

dx/dt = f(x, y),  dy/dt = g(x, y),  with t = 0, x = α₂, y = y₀.     (9)

The intersection of the two systems above is

dx/dt = f(x, y),  dy/dt = g(x, y),  x(0) = x₀,  y(0) = y₀,

which is the characteristic of the initial value problem (8).
Lemma 3 describes the exact level set equations; in the following Lemma 4 we give approximate level set equations for the initial value problem (8).

Lemma 4. If the pair (φ₁(x, y, t), φ₂(x, y, t)) is the solution of the initial value problem

∂φ₁/∂t + f(x, y) ∂φ₁/∂x = 0,
∂φ₂/∂t + g(x, y) ∂φ₂/∂y = 0,     (10)
φ₁(x, y, 0) = x − x₀,  φ₂(x, y, 0) = y − y₀,

then the intersection of φ₁(x, y, t) = 0 and φ₂(x, y, t) = 0 approximates the solution of the initial value problem (8).

Suppose the solution of the system above at Δt is (x̄, ȳ) and the exact solution of system (8) at Δt is (x₁, y₁). The error can be computed using a Taylor series as follows:

x₁ − x̄ = [x(0) + f(x₀, y₀)Δt + O(Δt²)] − [x(0) + f(x₀, ȳ)Δt + O(Δt²)] = (f(x₀, y₀) − f(x₀, ȳ))Δt + O(Δt²).

Applying the mean value theorem, we obtain

f(x₀, y₀) − f(x₀, ȳ) = (∂f(x₀, y*)/∂y)(y₀ − ȳ),  y₀ ≤ y* ≤ ȳ,

and since y₀ − ȳ = O(Δt) we have x₁ − x̄ = O(Δt²). Similarly we can prove the estimate for the variable y.
3. LEVEL-SET-LIKE-METHOD
In this section we state the theorem of the level-set-like method for the nonlinear two-channel dissipation model without boundary conditions, using the idea of Lemma 4. We then obtain a corollary of the level-set-like method for the boundary value problem of the nonlinear two-channel dissipation model.
Theorem 1. If the pair (φ¹, φ²) is the solution of the initial value problem

∂φ¹/∂t + u ∂φ¹/∂x + (v − u) ∂φ¹/∂u = 0,
∂φ²/∂t + v ∂φ²/∂x + (u − v) ∂φ²/∂v = 0,     (11)
φ¹(x, t₀, u, v) = u − f(x),
φ²(x, t₀, u, v) = v − g(x),

then the intersection of φ¹(x, t₀ + Δt, u, v) = 0 and φ²(x, t₀ + Δt, u, v) = 0 approximates, at time t₀ + Δt, the solution of the initial value problem

u_t + u u_x = v − u,  v_t + v v_x = u − v,  u(x, t₀) = f(x),  v(x, t₀) = g(x).     (12)
Proof. The characteristics of the first equation of (11) are

dx/dt = u,  x(t₀) = α₁,
du/dt = v − u,  u(t₀) = γ₁,
dv/dt = 0,  v(t₀) = β₁,
dφ¹/dt = 0,  φ¹(t₀) = γ₁ − f(α₁),

and using the zero level φ¹(x, t, u, v) = 0, so that γ₁ = f(α₁), we have the solution:

v = β₁,
u = β₁ + (f(α₁) − β₁) e^{−(t−t₀)},     (13)
x = α₁ + β₁(t − t₀) + (f(α₁) − β₁)(1 − e^{−(t−t₀)}).

Similarly, for the second equation of (11) we have the solution:

u = β₂,
v = β₂ + (g(α₂) − β₂) e^{−(t−t₀)},     (14)
x = α₂ + β₂(t − t₀) + (g(α₂) − β₂)(1 − e^{−(t−t₀)}).

The intersection of equations (13) and (14) for (u, v) gives

u = [f(α₁) + g(α₂) − g(α₂) e^{−(t−t₀)}] / [2 − e^{−(t−t₀)}],
v = [g(α₂) + f(α₁) − f(α₁) e^{−(t−t₀)}] / [2 − e^{−(t−t₀)}].
Suppose (x̄, ū, v̄) approximates the exact solution (x, u, v) of the initial value problem (12) at t₀ + Δt, so we have

ū = [f(α₁) + g(α₂) − g(α₂) e^{−Δt}] / [2 − e^{−Δt}],
v̄ = [g(α₂) + f(α₁) − f(α₁) e^{−Δt}] / [2 − e^{−Δt}].     (15)

We have the characteristic form of the initial value problem (12) as follows:

du/dt = v − u on dx/dt = u, with initial condition t = t₀, x = α₁, u = f(α₁),
dv/dt = u − v on dx/dt = v, with initial condition t = t₀, x = α₂, v = g(α₂).     (16)
The error in ū at t₀ + Δt on the characteristic dx/dt = u can be computed using a Taylor series as follows:

ū(t₀ + Δt) − u(t₀ + Δt) = [ū(t₀) + (dū(t₀)/dt)Δt + (d²ū(t₀)/dt²)(Δt²/2) + O(Δt³)]
  − [u(t₀) + (du(t₀)/dt)Δt + (d²u(t₀)/dt²)(Δt²/2) + O(Δt³)]
= [f(α₁) + (g(α₂) − f(α₁))Δt + (d²ū(t₀)/dt²)(Δt²/2) + O(Δt³)]
  − [f(α₁) + (g(α₁) − f(α₁))Δt + (d²u(t₀)/dt²)(Δt²/2) + O(Δt³)]
= (g(α₂) − g(α₁))Δt + [d²ū(t₀)/dt² − d²u(t₀)/dt²](Δt²/2) + O(Δt³).

It is clear that if g is a constant function, then the error is O(Δt²). Similarly, we can prove that

v̄(t₀ + Δt) − v(t₀ + Δt) = O(Δt²).
The error in x̄ at t₀ + Δt on the characteristic dx/dt = u can be computed using a Taylor series as follows:

x̄(t₀ + Δt) − x(t₀ + Δt) = [x̄(t₀) + (dx̄(t₀)/dt)Δt + (d²x̄(t₀)/dt²)(Δt²/2) + O(Δt³)]
  − [x(t₀) + (dx(t₀)/dt)Δt + (d²x(t₀)/dt²)(Δt²/2) + O(Δt³)]
= [α₁ + ū(t₀)Δt + (d²x̄(t₀)/dt²)(Δt²/2) + O(Δt³)] − [α₁ + u(t₀)Δt + (d²x(t₀)/dt²)(Δt²/2) + O(Δt³)]
= [d²x̄(t₀)/dt² − d²x(t₀)/dt²](Δt²/2) + O(Δt³)
= O(Δt²).

The proof is complete.
This theorem has initial values which are single-valued functions. We can also apply it when the initial values of (12) are multivalued functions; for example, if the values of the function f at x = x₀ are f₁, f₂, …, f_m, which we denote by

f(x₀) = {f₁, f₂, …, f_m},

then the level set equation has the initial value

φ¹(x₀, t₀, u, v) = u − f_i  if  |u − f_i| = min{|u − f_j| : j = 1, 2, …, m}.

We can also generalize Theorem 1 to approximate values obtained at a set of grid times

t₀ < t₁ < t₂ < … < t_n.

The approximate value at each t_n is obtained by using some of the values obtained in the previous step. We call this method the level-set-like method, because the idea is that of the level set method, even though the system does not have an exact level set equation. The method has local errors of the form

ū(t_{n+1}) − u(t_{n+1}) = O(Δt²),
v̄(t_{n+1}) − v(t_{n+1}) = O(Δt²),     (19)
x̄(t_{n+1}) − x(t_{n+1}) = O(Δt²),

where (x̄, ū, v̄) is the approximate solution and (x, u, v) is the exact solution. The global error is still difficult to find.
u ∂φ¹/∂x + (v − u) ∂φ¹/∂u = 0,
v ∂φ²/∂x + (u − v) ∂φ²/∂v = 0,     (17)

for u and v not zero, so that equation (17) can be written as

∂φ¹/∂x + ((v − u)/u) ∂φ¹/∂u = 0,
∂φ²/∂x + ((u − v)/v) ∂φ²/∂v = 0.     (18)

Differentiating φ¹(x, u, v) and φ²(x, u, v) with respect to x and comparing with equation (18), we
Corollary: Theorem 1, given for problems on the whole x-axis, is also valid for problems
with boundary conditions.
4. CONCLUDING REMARK
The level set method transforms nonlinear differential equations into linear differential equations in a higher dimension. The level-set-like method, in contrast, approximates the solution of nonlinear differential equations by linear differential equations in higher dimensions. The multivalued solutions of the two-channel dissipation model are computed as the zeros of the level set equations.
References
[1] VAN BECKUM F.P.H., Travelling wave solution of a coastal zone non-Fourier dissipation model, Proceedings of
the Symposium on Coastal Zone Management, 2003.
[2] BENAMOU J,-D., Big ray tracing: Multi-valued travel time field computation using viscosity solution of the
eikonal equation, J. Comp. Phys. 128, 463-474, 1996.
[3] BENAMOU J.-D., Direct computation of multivalued phase space solutions for Hamilton-Jacobi equations, Commun. Pure Appl. Math. 52(11), 1443-1475, 1999.
[4] ENGQUIST B., RUNBORG O. AND TORNBERG A.-K., High frequency wave propagation by the segment projection method, J. Comp. Phys. 178, 373-390, 2002.
[5] FOMEL S. AND SETHIAN J.A., Fast phase space computation of multiple arrivals, Proc. Natl. Acad. Sci., 99(11), 7329-7334, 2002.
[6] JIN S. AND OSHER S., A level set method for the computation of multivalued solutions to quasilinear hyperbolic PDEs and Hamilton-Jacobi equations, Comm. Math. Sci. 1(3), 575-591, 2003.
[7] JIN S., LIU H., OSHER S. AND TSAI R., Computing multi-valued physical observables for the high frequency
limit of symmetric hyperbolic systems, J. of Comp. Physics, 2005
[8] LIU H. L, AND WANG Z. M., Computing multivalued velocity and electrical fields for 1D Euler-Poisson, App.
Num. Math, 2005
[9] LIU H., CHENG L.T. AND OSHER S, A level set framework for capturing multi-valued solutions of nonlinear
first-order equations, J. Sci. Comput,2005
[10] OSHER S., CHENG L.T., KANG M., SHIM H. AND TSAI Y.H., Geometric optics in a phase space based level set and Eulerian framework, J. Comp. Phys., 179, 622-648, 2002.
SUMARDI
Department of Mathematics, Gadjah Mada University, Indonesia
e-mails: [email protected], [email protected]
SOEPARNA DARMAWIJAYA
Gadjah Mada University, Indonesia
E-mails:
LINA ARYATI
Department of Mathematics, Gadjah Mada University, Indonesia
e-mail: [email protected]
F. P. H. VAN BECKUM:
e-mail: [email protected]
Proceedings of ”The 6th SEAMS-UGM Conference 2011”
Applied Mathematics, pp. 559 - 564.
1. INTRODUCTION
In this section, we discuss a method to solve problem (1). In this paper D(T) denotes the domain of an operator T. First we make the following assumptions.

If D_{A,B} = {z(t) ∈ D(A) | Az(t) + Bf(t) ∈ Ran M}, then D_{A,B} is a subspace of H, and any strict solution z(t) of the degenerate Cauchy problem (1) is clearly in D_{A,B}. Obviously we have Ker A ⊂ D_{A,B} and the operator A|_{D_{A,B}} is closed.

Following B. Thaller and S. Thaller [5], the operator M is injective if and only if Ker M = {0}. In order to transform problem (1) into a nondegenerate problem we restrict the domain D(M) of the operator M to (Ker M)⊥ ∩ D(M) = D(M_r). Therefore the restriction of the operator M to D(M_r), defined by

M_r = M|_{D(M_r)},  where D(M_r) = (Ker M)⊥ ∩ D(M),

is an invertible operator.
Next, we define an operator A₀ as the restriction of A to (Ker M)⊥,

A₀x(t) = A[(P^T)^{-1}{x(t)} ∩ D_{A,B}] ∩ Ran M,  for every x(t) ∈ D(A₀),

where D(A₀) = {x(t) ∈ (Ker M)⊥ | (P^T)^{-1}{x(t)} ∩ D_{A,B} ≠ ∅}.

The operator A restricted in this way may, however, become a multi-valued operator A₀ on (Ker M)⊥, so we need the following assumption.

Assumption 2.3: PD_{A,B} ⊂ D_{A,B} and the operator (QAP)|_{PD_{A,B}} has a bounded inverse.
Under Assumptions 2.1, 2.2 and 2.3, a vector z(t) ∈ H is in the subspace D_{A,B} if and only if z(t) ∈ D(A) and Pz(t) = −(QAP)^{-1}QAP^T z(t). Therefore any x(t) ∈ P^T D_{A,B} ⊂ (Ker M)⊥ uniquely determines z(t) ∈ D_{A,B} such that x(t) = P^T z(t) and z(t) = (1 − (QAP)^{-1}QA)x(t). Hence the set (P^T)^{-1}x(t) ∩ D_{A,B}
Finally, by Assumptions 2.1, 2.2, 2.3 and 2.4, the degenerate Cauchy problem (1) can be reduced to the nondegenerate problem:

(d/dt) M_r x(t) = A₀x(t) + Y_A B f(t),  x(0) = P^T z₀,     (5)

where M_r is invertible.
Assumption 2.5: Let D_{A,B} ⊂ D(M) and let the operator M have a closed domain.

Remark 2.6: If M is closed and densely defined (Assumption 2.1), by using the closed graph theorem we have the equivalent formulation: M_r is bounded and defined on all of P^T H.
By Assumptions 2.1, 2.2, 2.3, 2.4 and 2.5, the nonhomogeneous abstract degenerate Cauchy problem with a bounded operator on the nonhomogeneous term can be written in the normal form:

(d/dt) x(t) = A₁x(t) + (M_r)^{-1} Y_A B f(t),  where A₁ = (M_r)^{-1} A₀.     (6)

The operator A₁ = (M_r)^{-1}A₀ on the natural domain

D(A₁) = {x ∈ P^T D_A | A₀x ∈ Ran M} = A₀^{-1}(Ran M) ∩ P^T D_A,

is closed, because it is the product of a boundedly invertible operator (M_r)^{-1} with a
x(t) = e^{A₁t} P^T z₀ + A₁ ∫₀^t e^{A₁(t−s)} g(s) ds,  where g(t) = P^T A^{-1} B f(t).     (7)
Proof: We know that

x(t) = e^{A₁t} P^T z₀ + A₁ ∫₀^t e^{A₁(t−s)} g(s) ds,

where

(d/dt) M_r x(t) = A₀x(t) + Y_A B f(t),  x(0) = P^T z₀  on D(A₀) ∩ P^T D_{A,B},
where

M_r = [1 0; 0 0] MP^T,  A₀ = Y_A A = [1 0; 0 0] AP^T,  and  Y_A B = [1; 0] B.

By restriction to D(A₀) ∩ P^T D_{A,B} the problem can be expressed as the nondegenerate problem

[1 0; 0 0] x′ = [1 0; 0 0] x + [1; 0] f.

According to (6) we have

A₁ = [1 0; 0 0].

So the solution of the descriptor system above on P^T D_{A,B} ⊂ (Ker M)⊥ ∩ D_{A,B} is

x(t) = e^{A₁t} P^T z₀ + ∫₀^t e^{A₁(t−s)} A₁ g(s) ds = e^{A₁t} P^T z₀ + A₁ ∫₀^t e^{A₁(t−s)} g(s) ds,

where g(t) = P^T A^{-1} B f. According to Theorem 2.8, any solution of the nondegenerate
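The following small numerical sketch (our own toy example, not taken from the paper) mimics the reduction above in finite dimensions: for M z'(t) = A z(t) + B f(t) with the singular matrix M = diag(1, 0), the second row is an algebraic constraint, and eliminating it leaves a nondegenerate equation for the remaining component, in the spirit of (5)–(6). The matrices, the forcing f and all names are hypothetical.

```python
import numpy as np
from scipy.integrate import solve_ivp

# Toy degenerate problem  M z' = A z + B f(t)  with M = [[1,0],[0,0]] singular.
A = np.array([[-1.0, 2.0],
              [ 0.5, -1.0]])
B = np.array([1.0, 1.0])
f = lambda t: np.sin(t)

a, b = A[0]
c, d = A[1]                      # constraint row: 0 = c z1 + d z2 + B[1] f(t), d != 0

def z2_of(z1, t):
    # algebraic component determined by the constraint
    return -(c * z1 + B[1] * f(t)) / d

def reduced(t, z1):
    # differential part after eliminating z2: the finite-dimensional "normal form"
    return a * z1 + b * z2_of(z1, t) + B[0] * f(t)

sol = solve_ivp(reduced, (0.0, 5.0), [1.0], dense_output=True, rtol=1e-8)
t = np.linspace(0.0, 5.0, 6)
z1 = sol.sol(t)[0]
z2 = z2_of(z1, t)
residual = c * z1 + d * z2 + B[1] * f(t)   # zero by construction of z2
print("z1 samples:", np.round(z1, 4))
print("z2 samples:", np.round(z2, 4))
print("max |constraint residual| =", np.abs(residual).max())
```

The reduced scalar equation plays the role of the normal form (6), while the eliminated component is recovered from the algebraic constraint at every time.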
3. CONCLUDING REMARK
References
[1] CARROL, R.W. AND SHOWALTER, R.E., Singular and Degenerate Cauchy Problems:Math. Sci.
Engrg., vol 127, Academic Press, New York-San Fransisco-London, 1976.
[2] DAI, L. A., A Singular Control Systems, Lecture Notes in Control and Inform, Sci., vol.118, Springer-
Verlag, Berlin-Heidelberg-New York, 1989.
[3] FAVINI, A., Laplace Transform Method for a Class of Degenerate Evolution Problems, Rend. Mat. Appl. 12(2), 511-536.
[4] FAVINI, A., Abstract Potential Operator and Spectral Method for a Class of Degenerate Evolution Problems, J. Differential Equations, 39, 212-225, 1981.
[5] THALLER, B. AND THALLER, S., Factorization of Degenerate Cauchy Problems : The Linear Case. J.
Operator Theory. 36:121-146, 1996.
[6] THALLER, B. AND THALLER, S., Approximation of Degenerate Cauchy Problems. SFB F0003
”Optimierung und Kontrolle” 76. University of Graz.
[7] WEIDMANN, J., Linear Operators in Hilbert Spaces, Springer-Verlag, Berlin-Heidelberg-New York, 1980.
[8] ZEIDLER, E., Nonlinear Functional Analysis and Its Applications II/A. Springer-Verlag, Berlin-
Heidelberg- New York, 1990.
SUSILO HARIYANTO
Department of Mathematics, Faculty of Mathematics and Natural Sciences, Diponegoro
University, Semarang, Indonesia.
e-mail: [email protected]
LINA ARYATI
Department of Mathematics, Faculty of Mathematics and Natural Sciences, Gadjah Mada
University, .Yogyakarta, Indonesia.
e-mail:[email protected]
WIDODO
Department of Mathematics, Faculty of Mathematics and Natural Sciences, Gadjah Mada
University, Yogyakarta, Indonesia.
e-mail: [email protected]
Proceedings of ”The 6th SEAMS-GMU Conference 2011”
Applied Mathematics, pp. 567 - 578.
SYAMSUDDIN TOAHA
Abstract. This paper studies the effect of time delay and harvesting on the dynamics of a predator-prey model based on the Lotka-Volterra model. The time delay is incorporated in the growth rate of the prey equation. In the delayed predator-prey model, the predator and prey are then harvested with constant efforts. It is shown that the time delay can induce instability, Hopf bifurcation and stability switches. The constant efforts do not affect the stability of the stable equilibrium point when the positive equilibrium point exists and is stable. For the model with constant effort, we find that there exists a critical value of the efforts, for a certain value of the time delay, that maximizes the profit function and the present value. This means that the predator and prey populations can live in coexistence while also giving maximum profit and present value.
1. INTRODUCTION
between 40% and 60%. The main problem of the MSY is that it is economically irrelevant, because it considers only the benefits of exploitation and disregards the operating cost of resource exploitation. Confronted with the inadequacy of the MSY, people have tried to replace it by the optimum sustainable yield (OSY) or maximum profit.

In this paper we present a deterministic and continuous model for a predator-prey population based on the Lotka-Volterra model and then extend the model by incorporating time delay and harvesting. Similar models were considered in [10], which discussed the effect of time delay on the stability of the equilibrium point related to the maximum profit problem. The objective of this paper is to study the combined effects of harvesting and time delay on the dynamics of the predator-prey model. In addition, for the model with constant effort of harvesting we relate the stable equilibrium point to the maximum profit and to the present value of a continuous time-stream of revenue by using Pontryagin's maximum principle.
We consider a predator-prey model based on the Lotka-Volterra model with one predator and one prey population. The model for the rate of change of the prey population (x) and the predator population (y) is

dx/dt = rx(1 − x/K) − αxy,
dy/dt = −cy + βxy.     (1)

The model includes the parameter K, the carrying capacity of the prey population in the absence of the predator. The parameter r is the intrinsic growth rate of the prey, c is the mortality rate of the predator without prey, α measures the rate of consumption of prey by the predator, and β measures the conversion of consumed prey into the predator reproduction rate. All parameters are assumed to be positive.

The equilibrium points of model (1) are (0, 0), (K, 0) and E₀ = (c/β, r(βK − c)/(αβK)). In order to get a positive equilibrium point we assume that βK − c > 0. The characteristic equation of the Jacobian matrix evaluated at the
denotes a density dependent feedback mechanism which takes τ units of time to respond to changes in the population density. If we regard the gestation period of the prey as τ, then the per capita growth rate function should carry a time delay τ.
We consider model (2) in which the two populations are subjected to constant efforts of harvesting. The model with harvesting is as follows:

dx(t)/dt = rx(t)[1 − x(t − τ)/K] − αx(t)y(t) − q_x E_x x(t),
dy(t)/dt = −cy(t) + βx(t)y(t) − q_y E_y y(t).     (3)

Here, q_x and q_y are the catchability coefficients of the prey and predator populations, respectively. The constants E_x and E_y are the efforts of harvesting for the prey and predator populations. For the analysis we set q_x = q_y = 1. The model (3) can then be rewritten as

dx(t)/dt = r₁x(t)[1 − x(t − τ)/K₁] − αx(t)y(t),
dy(t)/dt = −c₁y(t) + βx(t)y(t),     (4)

where r₁ = r − E_x, K₁ = (r − E_x)K/r, and c₁ = c + E_y.
r
c1 r1 ( K1 c1 )
The equilibrium point of model (4) is E1 x1 , y1 , . In
K1
order to have a positive equilibrium point we assume that r E x , K c 0 , and
K1 c1 0 , or equivalently E , E ,
x y where
K
Ex , E y Ex E y K c, Ex 0, E y 0 .
r
To linearize the model about the equilibrium point E₁ of model (4), let u(t) = x(t) − x₁ and v(t) = y(t) − y₁. We then obtain the linearized model

u'(t) = −(r₁/K₁)x₁ u(t − τ) − αx₁ v(t),
v'(t) = βy₁ u(t).

From the linearized model we have the characteristic equation

λ² + P₁λe^{−λτ} + Q₁ = 0,     (5)

where P₁ = (r₁/K₁)x₁ and Q₁ = αβx₁y₁.
K1
For 0 , the characteristic equation (5) becomes
2 P1 Q1 0 , (6)
P1 P1 4Q1
2
which has the roots 1, 2 . Since P1 and Q1 are both positive, the
2
characteristic equation has negative real roots. Hence, for 0 and E x , E y , the
equilibrium point E1 is globally asymptotically stable.
Now for τ > 0, if λ = iω, ω > 0, is a root of the characteristic equation (5), then we have

−ω² + iωP₁e^{−iωτ} + Q₁ = 0,
−ω² + iωP₁cos(ωτ) + ωP₁ sin(ωτ) + Q₁ = 0.

Separating the real and imaginary parts, we have

−ω² + ωP₁ sin(ωτ) + Q₁ = 0,
ωP₁ cos(ωτ) = 0.     (7)

Squaring both sides of the equations in (7) gives

ω²P₁² sin²(ωτ) = (ω² − Q₁)²,  ω²P₁² cos²(ωτ) = 0.

Adding both equations and regrouping by powers of ω, we obtain the fourth degree polynomial

ω⁴ − (P₁² + 2Q₁)ω² + Q₁² = 0,     (8)

from which we have

ω±² = ½[(P₁² + 2Q₁) ± √(P₁⁴ + 4P₁²Q₁)].     (9)

From (9) we can see that there are two positive solutions of ω². We can now find the values of τ by substituting ω² into equation (7) and solving for τ. We obtain

τ_k⁺ = (π/2 + 2kπ)/ω₊,  τ_k⁻ = (3π/2 + 2kπ)/ω₋,  k = 0, 1, 2, ….     (10)
Theorem 1. Let βK − c > 0, (E_x, E_y) ∈ Ω and τ_k^± be defined by equation (10). Then there exists a positive integer m such that there are m switches from stability to instability and back to stability. In other words, when τ ∈ [0, τ₀⁺) ∪ (τ₀⁻, τ₁⁺) ∪ … ∪ (τ_{m−1}⁻, τ_m⁺), the equilibrium point E₁ of model (4) is stable, and when τ ∈ (τ₀⁺, τ₀⁻) ∪ (τ₁⁺, τ₁⁻) ∪ … ∪ (τ_{m−1}⁺, τ_{m−1}⁻), the equilibrium point E₁ is unstable. Therefore, there are bifurcations at the equilibrium point E₁ for τ = τ_k^±, k = 0, 1, 2, ….
Proof. From (6) we know that the equilibrium point E₁ is stable for τ = 0. Then to prove the theorem we need only verify the transversality conditions (see Cushing [3]),

d(Re λ)/dτ |_{τ=τ_k⁺} > 0  and  d(Re λ)/dτ |_{τ=τ_k⁻} < 0.
Differentiating equation (5) with respect to τ we obtain

[2λ + P₁e^{−λτ} − τP₁λe^{−λτ}] (dλ/dτ) = P₁λ²e^{−λτ},

so that

(dλ/dτ)^{-1} = [2λe^{λτ} + P₁(1 − λτ)] / (P₁λ²).

For convenience, we study (dλ/dτ)^{-1} instead of dλ/dτ. From the characteristic equation (5) we know that e^{λτ} = −P₁λ/(λ² + Q₁). Then we have

(dλ/dτ)^{-1} = (Q₁ − λ²)/(λ²(λ² + Q₁)) − τ/λ.

Therefore

sign{ d(Re λ)/dτ }|_{λ=iω} = sign{ Re (dλ/dτ)^{-1} }|_{λ=iω}
= sign{ Re[ (Q₁ − λ²)/(λ²(λ² + Q₁)) − τ/λ ] }|_{λ=iω}
= sign{ (Q₁ + ω²)/(ω²(ω² − Q₁)) }
= sign{ ω⁴ − Q₁² }.

From equation (8) we know that ω⁴ − Q₁² = 2ω⁴ − (P₁² + 2Q₁)ω². Then we have

sign{ d(Re λ)/dτ }|_{λ=iω} = sign{ 2ω⁴ − (P₁² + 2Q₁)ω² } = sign{ 2ω² − (P₁² + 2Q₁) }.

By substituting the expressions for ω±², it is easy to see that the sign is positive for ω₊² while the sign is negative for ω₋². Therefore, crossing from left to right with increasing τ occurs for the values of τ corresponding to ω₊, and crossing from right to left occurs for the values of τ corresponding to ω₋. From (9) and the last result we can verify that the transversality conditions are satisfied. Therefore the τ_k^± are Hopf bifurcation values.
Example 1. Consider model (4) with parameters r = 1.1, K = 110, α = 0.2, c = 0.8, β = 0.1, E_x = 0.1, and E_y = 0.2. The equilibrium point of the model in the positive quadrant is (10, 4.5). For τ = 0, the Jacobian matrix of the model associated with the equilibrium point has eigenvalues −0.05000 ± 0.94736i. This means that the equilibrium point of the model without time delay is stable. Following Theorem 1 we have τ₀⁺ = 1.57080, τ₀⁻ = 5.23599, τ₁⁺ = 7.85398, τ₁⁻ = 12.21730, τ₂⁺ = 14.13717, τ₂⁻ = 19.19862, τ₃⁺ = 20.42035, τ₃⁻ = 26.17994, τ₄⁺ = 26.70354, τ₄⁻ = 33.16126, τ₅⁺ = 32.98672 and τ₅⁻ = 40.14257. Since τ₅⁺ < τ₄⁻, we then have 4 stability switches from stability to instability and back to stability.
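The delay values in Example 1 can be reproduced directly from (9) and (10); the short Python sketch below (ours, included only as a check) computes ω± and the first few τ_k^± for the parameter values of Example 1.

```python
import numpy as np

# Reproduce the stability-switch delays of Example 1.
r, K, alpha, c, beta = 1.1, 110.0, 0.2, 0.8, 0.1
Ex, Ey = 0.1, 0.2

r1, K1, c1 = r - Ex, (r - Ex) * K / r, c + Ey
x1, y1 = c1 / beta, r1 * (beta * K1 - c1) / (alpha * beta * K1)
P1, Q1 = r1 * x1 / K1, alpha * beta * x1 * y1

disc = np.sqrt(P1**4 + 4 * P1**2 * Q1)
w_plus  = np.sqrt(0.5 * (P1**2 + 2 * Q1 + disc))
w_minus = np.sqrt(0.5 * (P1**2 + 2 * Q1 - disc))

print("E1 =", (x1, y1))
for k in range(6):
    tau_plus  = (np.pi / 2 + 2 * k * np.pi) / w_plus       # crossings to the right
    tau_minus = (3 * np.pi / 2 + 2 * k * np.pi) / w_minus  # crossings back to the left
    print(f"k={k}: tau_k^+ = {tau_plus:.5f}, tau_k^- = {tau_minus:.5f}")
```

With these parameters ω₊ = 1 and ω₋ = 0.9, so τ_k⁺ = π/2 + 2kπ and τ_k⁻ = (3π/2 + 2kπ)/0.9, which matches the values listed above; since τ₅⁺ < τ₄⁻, the stability windows stop after the fourth switch.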
4. BIONOMIC EQUILIBRIUM
Bionomic equilibrium is a concept that integrates biological equilibrium and economic equilibrium (Bhattacharya and Begum [1]). As discussed before, the biological equilibrium is found by solving dx/dt = 0 and dy/dt = 0 simultaneously. The economic equilibrium is reached when the total revenue from selling the harvested biomass equals the total cost of the harvesting efforts.

Let cx = harvesting cost per unit effort of the prey population,
cy = harvesting cost per unit effort of the predator population,
cf = fixed cost of harvesting,
px = price per unit biomass of the prey population,
py = price per unit biomass of the predator population.

The profit function is given by

π = p_x x E_x + p_y y E_y − (c_f + c_x E_x + c_y E_y).

The harvesting cost per unit effort is actually not constant, but here we assume the cost is constant in order to simplify the analysis. The bionomic equilibrium (x_∞, y_∞, E_{x∞}, E_{y∞})
is found by solving the following simultaneous equations

r₁(1 − x/K₁) − αy = 0,
−c₁ + βx = 0,     (11)

and

p_x x E_x + p_y y E_y − c_f − c_x E_x − c_y E_y = 0.     (12)

By solving equation (11) we get x = c₁/β and y = r₁(βK₁ − c₁)/(αβK₁). Substituting x = c₁/β and y = r₁(βK₁ − c₁)/(αβK₁) into equation (12) we get
pyr KE pc
E y 1 x 1 E y x c x E x c f
2
with critical point
K K
1 2 p y r1
E x ,
E y , where Ex
1K
, E y
1
1
, 1 p x p y ,
1 K
2
B1 p y Kymin
2
p y Kry min p y cry min c y Kymin c f r c y Kr c y cr .
A1
The critical point of the profit function is E x 2 . Substitute Ex 2 to get
2 px K
2 p x Kymin 2 p x Kr A1 2 p x cr
E y 2
2 px r
The critical point (E_{x2}, E_{y2}) maximizes the profit function and also prevents the predator population from extinction. When the efforts E_{x2} and E_{y2} are applied to the model, the equilibrium point E₁ lies in the positive quadrant and is stable.
The critical point (E_{x1}, E_{y1}) belongs to Ω and the maximum profit is π_max = 20.800625. After we substitute E_{x1} = 0.5225 and E_{y1} = 3.7750, the equilibrium point becomes (x, y) = (47.75, 0). When we apply the critical effort (E_{x1}, E_{y1}) = (0.5225, 3.7750), the profit function is at its maximum level but the predator population becomes extinct. This policy will not be considered. If we take, for example, y_min = 1, we have Ω₁ = {(E_x, E_y) | 10E_x + E_y − 7 ≤ 0, E_x ≥ 0, E_y ≥ 0}. The profit function becomes π(E_x) = −100E_x² + 74.5E_x + 1.5 and the critical point of the profit function is E_{x2} = 0.3725 and E_{y2} = 3.2750. The critical point (E_{x2}, E_{y2}) belongs to Ω₁ and the maximum profit becomes π_max = 15.375625. Substituting E_{x2} = 0.3725 and
get y
e t p y q y y c y .
qy y
From the Hamiltonian equation we also have
H x rx
e t p x q x E x x r 1 y q x E x y y and
x K K
H
y
e t p y q y E y x x y c x q y E y .
H
From the Pontryagin’s maximum principle x we get
x
e t p x q x x c x t x rx
e p x q x E x x r 1 y q x E x y y 0 ,
qx x K K
or equivalently
e t Kp x q x x Kc x e t p x q x2 E x xK x q x xrK 2 x q x x 2 r
(15)
x q x xyK x q x2 xE x K y yq x xK 0
H
Again, since y we get
y
e t p y q y y c y e t
p y q y E y x x y c x q y E y 0 ,
qy y
or equivalently
et p y q y y c y et p y q 2y E y y xxqy y
(16)
y q y y c x q y E y 0 .
By substituting x
e t p x q x x c x
and y
e t p y q y y c y into
qx x qy y
equation (15) and (16) and solving simultaneously, we get
Ex
1
qx Kq y cx
Kp x qx xqy Kc x q y rKq y px qx x rKq y cx 2 x 2rq y px qx
(17)
2 xrq y cx yKq y px qx x yKq y cx qx xKp y q y y qx xKc y
and
Ey
1
q y qx c y
p y q y yqx c y qx q y ypx qx x q y ycx cqx p y q y y
(18)
cqxc y xqx p y q y y xqxc y .
By substituting x = (c + q_y E_y)/β and y = (βKr − cr − βKq_x E_x − rq_y E_y)/(αβK), which constitute the stable equilibrium point, into equations (17) and (18), and then solving simultaneously, we get the values of the control variables E_x and E_y. Therefore, the values of E_x, E_y, x, and y maximize the present value (13).
6. CONCLUSIONS
References
[1] BHATTACHARYA, D. K. AND BEGUM, S., Bionomic Equilibrium of Two-Species System, Mathematical
Biosciences, 135(2), 111-127, 1996.
[2] CLARK, C. W., Mathematical Bioeconomics, The Optimal Management of Renewable Resources, 2nd Ed.,
SYAMSUDDIN TOAHA
Department of Mathematics, Hasanuddin University, Makassar, Indonesia.
e-mail: [email protected]
Proceedings of ”The 6th SEAMS-UGM Conference 2011”
Applied mathematics, pp. 579–588.
Abstract. In this paper, a dynamic model is presented to describe the behavior of the glucose concentration, Saccharomyces, and ethanol concentration during the batch fermentation process. The desired product of batch alcohol fermentation is ethanol. The mathematical model takes the form of a system of nonlinear differential equations. The stability of the equilibrium points of the dynamic model is discussed. Further, a numerical simulation based on experimental data is carried out to analyze the stability of the dynamic model. From the simulation results, the behavior of glucose, Saccharomyces, and ethanol reaches steady state by the 3rd day.
Keywords and Phrases : Dynamic model, ethanol, glucose, Saccharomyces, steady state
1. INTRODUCTION
_________________________________
2010 Mathematics Subject Classification : 92B99
2. MATHEMATICAL MODELING
dP/dt = μX,
dS/dt = −qX,     (1)
dX/dt = V S X/(K + S),

where μ is the growth rate of ethanol (mg/ml), q is the rate of glucose consumption (mg/ml), P is the concentration of ethanol (ml/ml), S is the concentration of glucose (mg/ml), X is the Saccharomyces wet weight (mg/ml), V is the maximal growth rate of Saccharomyces (mg/ml), and K is the Michaelis-Menten constant.
Let the equilibrium for ethanol, glucose, and Saccharomyces model
system of equations (1). Equilibrium point can be obtained by
, , .
(2)
, , , (3)
where , .
3. STABILITY ANALYSIS
Consider,
(4)
where ̅ ̅ and ̅ .
Linearization of model (4) at the equilibrium point using a Taylor series is as follows
̅
̅ ̅ ̅
̅ (5)
̅ ̅ ̅
̅
̅ ̅ ̅
̅
̅
̅
̅ ̅
̅
̅
[ ] [ ̅] (7)
̅ ̅
[ ]
J (8)
[ ]
The behavior of the system (4) around the equilibrium point ( ) can be seen from the
Jacobian matrix as follows:
J ( ) [ ]
J ( ) | |
(9)
Equation (9) has solutions λ₁ and λ₂. This indicates that the behaviour of the system around the equilibrium point is stable if both eigenvalues have negative real parts, and unstable if at least one eigenvalue has a positive real part. From the above results, the stability of the equilibrium point is determined by the eigenvalues of the Jacobian matrix.
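As a small illustration of this eigenvalue criterion (our sketch; the matrix entries below are placeholders, not the Jacobian obtained from the data of [10]), the stability check can be carried out numerically as follows.

```python
import numpy as np

# Hypothetical 3x3 Jacobian evaluated at an equilibrium (illustrative values only)
J = np.array([[0.0, 0.0,  0.5],
              [0.0, 0.0, -0.8],
              [0.0, 0.3, -0.1]])
eigenvalues = np.linalg.eigvals(J)
print("eigenvalues:", eigenvalues)
print("all real parts negative:", bool(np.all(eigenvalues.real < 0)))
```

The same two lines of linear algebra apply to the Jacobian computed from the fermentation data; only the matrix entries change.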
4. NUMERICAL SIMULATION
(10)
̅
̅
̅
̅ (11)
̅
̅ ̅
Based on the data of ethanol, glucose, and Saccharomyces wet weight [10], we obtain the
Jacobian matrix,
J [ ]
and
̅ ̅ ̅
̅
̅
̅
Furthermore, we evaluate the behavior of the dynamic model around the equilibrium point.
Figures: simulated time profiles of the ethanol concentration P (ml/ml), the glucose concentration S (mg/ml), and the Saccharomyces wet weight X (mg/ml) against time (days).
5. CONCLUDING REMARK
References
[1] ARELLANO-PLAZA, M., et.al., Unstructured kinetic model for tequila batch fermentation International Journal of
Mathematics and Computers in Simulation, 1, 1, 1-6, 2007.
[2] BOYCE, W. E. AND DIPRIMA, R. C., Elementary Differential Equation and Boundary Value Problem, John Wiley
& Sons, Inc, New York, 1992.
[3] CHENG-CHE LI, Mathematical models of ethanol inhibition effect during alcohol fermentation, Nonlinear
Analysis, 71, e1608-e1619, 2009.
[4] CRONIN, J., Differential Equations: Introduction and Qualitative Theory, New York: Marcel Dekker, Inc., 1994.
[5] HISAHARU, T., MICHIMASA, DIMITAR, SUBHABRATA, AND TOSHIOMA, Application of the Fuzzy Theory to Simulation of Batch Fermentation, Japan, 1985.
[6] JAMES, L., Kinetics of ethanol inhibition in alcohol fermentation, Biotechnol. Bioeng., 280-285, 1985.
[7] LEDDER, G., Differential Equations: A Modeling Approach, The McGraw-Hill, New York, 2005.
[8] LEI, F., ROTBOLL, M. JORGENSEN, S.B., A biochemically structured model for saccharomyces cerevisiae, Journal
of Biotechnology, 88, 205-221, 2001.
[9] WEI, C. AND CHEN, L. Dynamics analysis of mathematical model of ethanol fermentation with gas stripping,
Journal of Nonlinear Dynamic, 77,. 13-23, 2009.
[10] WIDOWATI, NURHAYATI, DAN SUTIMIN, Laporan Penelitian: Model dinamik fermentasi alkohol untuk
menentukan optimasi produk etanol: Studi Kasus Industri alkohol Sukoharjo, FMIPA Universitas Diponegoro,
Semarang, 2011.
[11] WIDOWATI, NURHAYATI, LAILATUSYSYARIFAH, Kestabilan model dinamik fermentasi alkohol secara kontinu,
Prosiding Seminar Nasional Statistika, ISBN: 978-979-097-142-4, Mei 2011.
WIDOWATI
Mathematics Department, Diponegoro University
Jl. Prof. H. Soedarto, S.H., Semarang, 50275, Indonesia.
e-mail: [email protected]
NURHAYATI
Biology Department, Diponegoro University
Jl. Prof. H. Soedarto, S.H., Semarang, 50275, Indonesia.
SUTIMIN
Mathematics Department, Diponegoro University
Jl. Prof. H. Soedarto, S.H., Semarang, 50275, Indonesia.
LAILATUSYSYARIFAH
Mathematics Department, Diponegoro University
Jl. Prof. H. Soedarto, S.H., Semarang, 50275, Indonesia.
Proceedings of ”The 6th SEAMS-UGM Conference 2011”
Computer, Graph and Combinatorics, pp. 589–600.
Abstract. Much of the existing research in data mining has focused on how to generate rules efficiently from static databases. However, the database from which the rules are generated has often been collected over a considerable period of time, making it more subject to change than not. As a result, it is possible that the underlying rules will also change as a function of time. As an example, consider the personalization of web sites: changes in the topology of a web site may result in different user navigation behaviour. Observing the changes in the rules could therefore provide more useful information than the rules themselves. In this paper, we survey different methods for monitoring the behaviour of association rules over periods of time.
Keywords and Phrases: Data mining, Association rules, rule evolution, rule monitoring.
1. INTRODUCTION
Much of the existing research in data mining has focused on how to generate rules efficiently from static datasets. However, the dataset used for mining has often been collected over a considerable period of time, making it more subject to change than not. Moreover, consumer behavior and preferences also change over time. As a result, it is possible that the underlying rules will also change as a function of time. As an example, consider the personalization of web sites: changes in the topology of a web site may result in different user navigation behaviour.

Research associated with mining rule changes has focused on several areas. In classification mining, there has been work on concept drift, see [1, 2]. Here concept drift refers to the phenomenon that some or all of the rules defining classes change over time. In association rule mining, a number of works on mining rule changes have been done with the objective of observing the behaviour of association rules over several time periods [3, 4, 5, 6, 7]. In addition, other works focus on the changes that occur over two periods of time [8, 9, 10]. Apart from these, a line of research called incremental mining has emerged. Its objective is to update previously discovered rules incrementally when the underlying dataset is updated. By doing this, they can
avoid scanning the whole dataset again. In incremental mining the statistical properties
of known patterns are updated, instead of being recorded over time as in mining rule
change. Incremental mining algorithms designed for maintaining discovered association
rules can be found in [11, 12, 13, 14, 15].
In this paper, we survey different methods for monitoring the behaviour of associ-
ation rules. We first give an overview of research in mining association rule changes. In
this context, mining rule change generally consists of three steps. The most important
step is the monitoring step, in which different methods to monitor the rules have been
proposed. We will describe each step briefly, then present the classification of meth-
ods currently used in the monitoring. Finally, we describe how each method in each
category is used to monitor the behaviour of association rules.
1.1. Overview of Research in Mining Rule Change. Based on their mining platform, research on mining association rule changes has been done in both temporal databases and non-temporal databases. However, only a few studies use temporal databases as a platform for the data mining task, for example [3]. Saraee et al. [16] were the first to introduce a framework for mining association rules and sequential patterns from a temporal database, using the ORES temporal database management system. However, the framework does not consider rule changes. This work focuses on two areas and their integration, i.e., data mining as a technique to increase the quality of data, and temporal databases as a technique to keep the history of data. A number of enhancements to the basic algorithms for mining association rules and sequential patterns are introduced. One of them is a new measure for mining association rules, called time confidence. Tansel et al. [3] study the problem of discovering association rules and their evolution from temporal databases. The proposed approach allows the user to observe the changes in association rules that occur over periods of time. The observed changes include a decrease/increase in the support/confidence of an association rule and the addition/removal of items from a particular itemset.
The problem of monitoring the support and confidence of association rules in non-temporal databases has been addressed in [4, 5, 6, 7]. Agrawal et al. [4] propose a method to monitor rules from different time periods. The rules discovered from different time periods are collected into a rule base. Ups and downs in support or confidence over time are represented and defined using shape operators. The user can then query the rule base by specifying history specifications. In addition, the user can specify triggers over the rule base in which the triggering condition is a query on the shape of the history. In [5], Liu et al. propose a technique that uses statistical methods to analyze the behavior of association rules over time. They focus on determining rules that are semi-stable, stable, or show trends over several time periods. [6] proposes visualization techniques that allow the user to visually analyze association rules and their changing behaviour over a number of time periods. Baron et al. [7] introduce the GRM (General Rule Model) to model both the content and the statistics of a rule as a temporal object. Based on these two components of a rule, different types of pattern evolution are defined, such as changes of statistics or content, the disappearance of a rule, and correlated changes of pairs of rules. In [17], Baron et al. study the evolution of web usage patterns using PAM (PAttern Monitor). The association rules that show which pages tend to be visited within the same user session are generated from a web server. They demonstrate how the mechanisms implemented in PAM can be used to identify interesting changes in the usage behaviour. In most of these works, the behaviour of rules is based on the behaviour of the rules' statistics, i.e., the changes in support and confidence values. They do not consider changes in the rule contents.
Other works in mining association rule changes focus on detecting the changes between two datasets, i.e., finding rule changes that occur from one dataset to another [8, 9, 10]. Ganti et al. [8] present a general framework, called FOCUS, for measuring changes or differences between two sets of association rules from two datasets. They compute a deviation measure which makes it possible to quantify the difference between two datasets in terms of the models they induce. In [9], Dong and Li introduce a new kind of pattern, called emerging patterns. The support differences of association rules mined from two datasets are used to detect the emerging patterns. Liu et al. [10] study the discovery of fundamental rule changes. They consider rules of the form r₁, …, r_{m−1} → r_m and detect changes in support or confidence between two consecutive time periods by applying a chi-square test.
The rest of the paper is organized as follows. Section 2 describes the three basic steps in mining association rule changes. Section 3 describes several monitoring methods based on statistical tests, while Section 4 describes monitoring methods based on visualization. Section 5 presents methods to monitor rule behaviour across two datasets. The conclusion and future work are given in Section 6.
Step 1: Partitioning the dataset. In this step, the important parameters are the length of the interval during which data is accumulated, and the number of such intervals. We then walk over the dataset D to extract a subset of the dataset for each time period. Let ti = [bi, fi) be a time period, where bi denotes its starting time point and fi denotes its end. The time periods t1, t2, . . . , tn are consecutive, non-overlapping, fixed-length time periods. Di denotes the portion of the dataset that is valid during the time period ti.
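To make Step 1 concrete, the following Python sketch partitions a timestamped dataset into consecutive, non-overlapping, fixed-length sub-datasets D1, . . . , Dn. The representation of a transaction as a (timestamp, itemset) pair is an illustrative assumption, not part of the paper.

from datetime import datetime, timedelta

def partition_dataset(transactions, start, interval, n_periods):
    """Split timestamped transactions into consecutive, non-overlapping,
    fixed-length sub-datasets D_1, ..., D_n (Step 1)."""
    periods = [(start + i * interval, start + (i + 1) * interval)
               for i in range(n_periods)]                 # t_i = [b_i, f_i)
    subsets = [[] for _ in range(n_periods)]
    for ts, items in transactions:                         # (timestamp, itemset) pairs
        for i, (b, f) in enumerate(periods):
            if b <= ts < f:                                # half-open interval
                subsets[i].append(items)
                break
    return periods, subsets

# Example: twelve weekly periods starting 1 January 2011
# periods, subsets = partition_dataset(data, datetime(2011, 1, 1),
#                                      timedelta(days=7), n_periods=12)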
Step 2: Mining rules from sub-datasets. Two different approaches are generally used to mine the rules from the set of sub-datasets. The first approach is to mine each sub-dataset Di in sequence. Let Ri be the set of temporal association rules from Di; after the mining we then have the rule sets R1, R2, R3, . . ., accordingly. If R is the set of rules that will be monitored in the next step, R is defined as R = {r | r ∈ (R1 ∪ R2 ∪ . . . ∪ Rq)}. It is possible for a rule r ∈ R to appear in Ri but not in Rj (i ≠ j), because r may not satisfy minsup and/or minconf in Dj. This approach is used in [3, 4, 5, 7, 17].
The second approach is to mine the rules from one sub-dataset, and apply the resulting rules to the other sub-datasets to calculate their support and confidence values there. This means that only an initial mining session is launched (on D1). At each later time period, an instance of each existing rule is created, and its statistics are computed from the sub-dataset of the corresponding time period. If R is the set of rules that will be monitored in the next step, R is defined as R = {r | r ∈ R1}. Thus, for each rule, we get a sequence of support and confidence values. This approach is used in [6].
The first approach results in a larger number of rules than the second one. However, users may find this more useful, as it gives a more detailed view of the whole data. In the second approach, since the monitoring is focused only on the rules generated from the first time period, it cannot detect new rules that appear in later time periods. It can only detect rules that disappear in later time periods.
A variation of the second approach is proposed in [18], which selects a subset of the rules generated in the first time period. If R is the set of rules that will be monitored, then R is a subset of R1. This reduces the computational effort to a minimum while focusing only on interesting rules. If the user chooses to monitor all rules in R1, this variation becomes similar to the second approach described above. Choosing the rules to be monitored is generally user and application dependent. The rules that are interesting to one user may be of no interest to another user, and the interestingness of patterns varies from application to application. Regardless of which approach is used, the number of discovered rules can still be large. Several methods have been proposed to reduce the number of generated rules, for example by pruning [19, 20] or by using templates [21].
For monitoring purposes, we need the support and confidence values of every rule r ∈ R in all time periods. Therefore, we need to obtain the missing support and confidence values in certain time periods for each r ∈ R. This can be done by rescanning the corresponding sub-dataset to calculate the support and confidence values. If a rule does not appear in a sub-dataset Dk, we set its support and confidence values in time period tk to zero.
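A minimal sketch of this step, assuming each sub-dataset is a list of transactions (item sets) and a rule is a pair of frozensets (antecedent, consequent); these representations are illustrative, not prescribed by the paper.

def support_confidence(rule, subset):
    """Support and confidence of a rule X -> Y in one sub-dataset D_k."""
    lhs, rhs = rule
    if not subset:
        return 0.0, 0.0                       # rule absent: both values set to zero
    n_lhs = sum(1 for t in subset if lhs <= t)
    n_rule = sum(1 for t in subset if (lhs | rhs) <= t)
    support = n_rule / len(subset)
    confidence = n_rule / n_lhs if n_lhs else 0.0
    return support, confidence

def rule_history(rule, subsets):
    """Support/confidence sequence of a rule over all time periods, obtained by
    (re)scanning every sub-dataset so that no period is left without values."""
    return [support_confidence(rule, d) for d in subsets]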
Step 3: Monitoring rules over time. The direct and simple approach is to monitor each rule from one time period to the other time periods by comparing its support and confidence across all time periods. This can be done using a graph, where the x-axis represents the time line and the y-axis represents the support of a large itemset or the support/confidence of a particular rule. It is useful when the user wants to see the fluctuation in a particular rule. This method is used in [3, 22]. However, it has two drawbacks. First, it often reports far too many changes, and most of them are simply the snowball effect of some fundamental changes. Second, analysing the difference in supports/confidences may miss some interesting changes [10].
In this paper, we describe monitoring methods which are more advanced than the above method. We classify these methods into three categories: statistical-based methods, visualization-based methods, and methods to monitor rules across two datasets, as shown in Table 1. Statistical-based methods use statistical tests, while visualization-based methods use visualization.
Definition 3.1. Semi-stable confidence rules. Let minsup and minconf be the
minimum support and confidence, supD and confD be the support and confidence of a
rule r from the whole dataset D, confi be the confidence of the rule in the time period
ti , and α be a specified significance level. The rule r is a semi-stable confidence rule
over the time periods t1 , t2 , . . . , tn , if the following two conditions are met:
1. supD ≥ minsup and confD ≥ minconf
2. for each time period ti , we fail to reject the following null hypothesis at significance
level α: Ho : confi ≥ minconf
The first condition ensures that the support and confidence of the rule r satisfy the minimum thresholds on the whole dataset. The second condition is tested using the z test.
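As an illustration of the second condition, a Python sketch of a one-sided z test for a proportion, applied period by period; the exact test statistic used in the cited work is not spelled out here, so this construction is an assumption.

from math import sqrt
from scipy.stats import norm

def is_semi_stable(conf_hist, n_lhs_hist, minconf, alpha=0.05):
    """Check the second condition of Definition 3.1: in every period t_i we must
    fail to reject H0: conf_i >= minconf (one-sided z test for a proportion).
    conf_hist[i]  : observed confidence in period t_i
    n_lhs_hist[i] : transactions in D_i containing the rule's antecedent"""
    z_crit = norm.ppf(alpha)                  # left-tail critical value, e.g. -1.645
    for p_hat, n in zip(conf_hist, n_lhs_hist):
        if n == 0:
            return False
        z = (p_hat - minconf) / sqrt(minconf * (1 - minconf) / n)
        if z < z_crit:                        # H0 rejected in this period
            return False
    return True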
3.2. Detecting Stable Rules. A semi-stable rule requires only that its confidences (supports) over time are not statistically below minconf (minsup). However, the confidences (supports) of the rule may still vary a great deal, and hence its behaviour can be unpredictable. A stable rule is a semi-stable rule whose confidences (supports) are homogeneous.
Definition 3.2. Stable confidence rules. Let minsup and minconf be the minimum
support and confidence, supD and confD be the support and confidence of a rule r from
the whole dataset D, confi be the confidence of the rule in the time period ti , and α be
a specified significance level. The rule r is a stable confidence rule over the time periods
t1 , t2 , . . . , tn , if the following two conditions are met:
1. r is a semi-stable confidence rule
2. we fail to reject the following null hypothesis at significance level α: Ho : conf1 =
conf2 = . . . = confn
The second condition is tested using the χ2 test for homogeneity of multiple proportions.
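A sketch of this homogeneity test using a 2 × n contingency table (per period: how many transactions containing the antecedent do or do not also contain the consequent); the table construction is an assumption about how the test is applied.

from scipy.stats import chi2_contingency

def is_stable(conf_hist, n_lhs_hist, alpha=0.05):
    """Second condition of Definition 3.2: chi-square test of homogeneity of the
    confidences over all periods (H0: conf_1 = ... = conf_n)."""
    hits = [round(p * n) for p, n in zip(conf_hist, n_lhs_hist)]
    misses = [n - h for h, n in zip(hits, n_lhs_hist)]
    chi2, p_value, dof, _ = chi2_contingency([hits, misses])
    return p_value >= alpha                  # fail to reject H0: confidences homogeneous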
3.3. Detecting Rules that Exhibit Trends. Sometimes the users are more interested in knowing whether changes in the support or confidence of a rule are random or whether there is an underlying trend. In this case, a statistical test called the run test can be used to detect whether a rule's support or confidence values exhibit a trend. The run test can find those rules that exhibit trends, but it does not indicate the type of trend.
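The run test referenced here can be sketched as a runs-above/below-the-median test with a normal approximation; this particular variant is an assumption, since the details of the test are not fixed in this survey.

from math import sqrt
from scipy.stats import norm

def exhibits_trend(values, alpha=0.05):
    """Runs test on a support/confidence history: H0 'the sequence is random' is
    rejected when the number of runs above/below the median is too small or too
    large, i.e. the history shows a non-random pattern (a possible trend)."""
    med = sorted(values)[len(values) // 2]
    signs = [v > med for v in values if v != med]    # ignore ties with the median
    n1, n2 = sum(signs), len(signs) - sum(signs)
    if n1 == 0 or n2 == 0:
        return False
    runs = 1 + sum(1 for a, b in zip(signs, signs[1:]) if a != b)
    mu = 2 * n1 * n2 / (n1 + n2) + 1
    var = 2 * n1 * n2 * (2 * n1 * n2 - n1 - n2) / ((n1 + n2) ** 2 * (n1 + n2 - 1))
    z = (runs - mu) / sqrt(var)
    return abs(z) > norm.ppf(1 - alpha / 2)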
3.4. Detecting Significant Change. In [17], a mechanism called a change detector is used to identify significant changes. In this mechanism, a two-tailed binomial test is used to verify whether an observed change is statistically significant or not.
For a rule r and a statistical measure s, at a time point ti it is tested whether r.s(ti−1) = r.s(ti) at significance level α. The test is applied to the subset of data Di accumulated between ti−1 and ti, so the null hypothesis states that Di−1 is drawn from the same population as Di, where Di−1 and Di have an empty intersection by definition. Then, for a rule r, an alert is raised at each time point ti at which the null hypothesis is rejected.
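A sketch of such a change detector with SciPy's binomial test; treating the previous period's support as the null proportion is an assumption about how the test in [17] is set up.

from scipy.stats import binomtest

def raises_alert(count_i, n_i, prev_support, alpha=0.05):
    """Two-tailed binomial test: under H0 the support observed in D_i equals the
    support measured on D_{i-1}. An alert is raised when H0 is rejected.
    count_i      : transactions in D_i supporting the rule
    n_i          : size of D_i
    prev_support : the rule's support in the previous period"""
    if prev_support <= 0.0 or prev_support >= 1.0:
        return count_i / n_i != prev_support          # degenerate null proportion
    return binomtest(count_i, n_i, prev_support, alternative='two-sided').pvalue < alpha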
where xi and yi are the values of X and Y in the i-th time period, respectively. Given a similarity threshold value ε, if this distance is below the user-defined threshold ε, we say that the two series are similar.
The parameter ε is a distance parameter that controls when two series should be considered similar. It can be either user-defined or determined automatically. For a rule r with n time periods, ε is calculated as
ε = ( Σ_{i=1}^{n−1} |z_{i+1} − z_i| ) / (n − 1),
where z_i is the support (confidence) value of r in period i. The bigger the value of ε, the more rules will be included as similar rules with respect to r.
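A sketch of the automatically determined parameter ε and the similarity check; the distance between two histories is treated as a pluggable function (a Euclidean placeholder below), since the paper's own distance definition precedes this passage.

from math import sqrt

def epsilon_threshold(z):
    """Average absolute change of the rule's own support (confidence) history."""
    return sum(abs(z[i + 1] - z[i]) for i in range(len(z) - 1)) / (len(z) - 1)

def is_similar(x, y, eps, distance=None):
    """Two series are similar when their distance is below the threshold eps."""
    if distance is None:                               # placeholder distance
        distance = lambda a, b: sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))
    return distance(x, y) < eps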
4.2. Visualizing Neighbour Rules. This method is used to visualize the neighbour rules of a rule r. A rule r1 : lhs1 → rhs1 is a neighbour of a rule r2 : lhs2 → rhs2 if the following two conditions are met:
(i). rhs1 = rhs2
(ii). lhs1 ⊇ lhs2 or lhs1 ⊆ lhs2
As an example, take a rule r : A, B → D; then the rules r1 : A → D, r2 : B → D, and r3 : A, B, C → D are neighbours of r. But r4 : A, C → D is not, because {A, B} ⊉ {A, C} and {A, B} ⊈ {A, C}.
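The neighbour relation can be expressed directly on antecedent/consequent item sets, as in this sketch (the rule representation is illustrative):

def is_neighbour(rule1, rule2):
    """lhs1 -> rhs1 is a neighbour of lhs2 -> rhs2 iff the consequents are equal
    and one antecedent contains the other."""
    (lhs1, rhs1), (lhs2, rhs2) = rule1, rule2
    return rhs1 == rhs2 and (lhs1 >= lhs2 or lhs1 <= lhs2)

# Example from the text:
r  = (frozenset('AB'), frozenset('D'))                  # r : A, B -> D
r1 = (frozenset('A'),  frozenset('D'))                  # r1: A -> D
r4 = (frozenset('AC'), frozenset('D'))                  # r4: A, C -> D
assert is_neighbour(r1, r) and not is_neighbour(r4, r)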
naive algorithms such as the Apriori algorithm [23] are not efficient for discovering EPs. Therefore, an efficient algorithm for discovering EPs was proposed in [9].
5.2. Detecting Fundamental Rule Changes. Fundamental rule changes are defined as changes that cannot be explained by other changes (the formal definition is given below). To detect fundamental rule changes, two techniques are used: quantitative analysis and qualitative analysis. Quantitative analysis measures the magnitude of change, while qualitative analysis finds the direction of change. We describe each analysis by focusing our discussion on changes in a rule's support; changes in rule confidence can be handled in the same way. This method considers rules r of the form r : a1, a2, . . . , an → y.
5.2.1. Quantitative Analysis. In order to perform this analysis, we need to calculate the
expected support of a rule r in t2 , which is defined as follows:
1. If r is a 1-condition rule, its expected support in t2 is its support in t1
2. If r is a k-condition rule (k > 1) of the form r : a1, a2, . . . , ak → y, then r can be considered as a combination of two rules, a 1-condition rule rone and a (k − 1)-condition rule rrest, where
rone : ai → y   and   rrest : a1, a2, . . . , aj → y,
with {a1, a2, . . . , aj} = {a1, a2, . . . , ak} − {ai}. Let supt(x) be the support of a rule x in time period t. The expected supports of r in t2 with respect to rone and rrest are
Erone(supt2(r)) = min( (supt1(r) / supt1(rone)) × supt2(rone), 1 )        (1)
Errest(supt2(r)) = min( (supt1(r) / supt1(rrest)) × supt2(rrest), 1 )        (2)
After we know the expected support of a rule r in t2 , we can check if the change
in support of a rule r from t1 to t2 is fundamental or not. The change is fundamental
if:
1. r is a 1-condition rule and its support is significantly different from its expected
support, or
2. r is a k-condition rule (k > 1), and Erone (supt2 (r)), Errest (supt2 (r)) and supt2 (r)
are significantly different, for all rone and rrest combinations.
The χ2 test is then used to check whether the support is significantly different from the expected support. That is, if r is a 1-condition rule, the difference is significant if we reject the null hypothesis Ho : E(supt2(r)) = supt2(r) at significance level α. If r is a k-condition rule, the difference is significant if we reject the null hypothesis Ho : Erone(supt2(r)) = Errest(supt2(r)) = supt2(r) at significance level α, for all rone and rrest combinations.
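A sketch of the quantitative analysis: the expected supports of equations (1) and (2), and the chi-square comparison of expected and observed frequencies; the 2 × 2 table construction follows the example below, and the helper names are ours, not from [10].

from scipy.stats import chi2_contingency

def expected_support(sup_t1_r, sup_t1_part, sup_t2_part):
    """Equations (1)/(2): expected support of r in t2 w.r.t. r_one or r_rest."""
    return min(sup_t1_r / sup_t1_part * sup_t2_part, 1.0)

def change_is_significant(expected_sup, observed_sup, n1, n2, alpha=0.05):
    """Chi-square test on a 2 x 2 table of tuples that do / do not satisfy the rule,
    comparing the expected support against the observed support in t2."""
    table = [[round(expected_sup * n1), n1 - round(expected_sup * n1)],
             [round(observed_sup * n2), n2 - round(observed_sup * n2)]]
    chi2, p_value, _, _ = chi2_contingency(table, correction=False)
    return p_value < alpha          # significant -> candidate fundamental change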
As an example, consider the example shown in Table 2 (adopted from [10]). Since r1 is a 1-condition rule, to test if the change in support of r1 is fundamental, we have the null hypothesis Ho : E(supt2(r1)) = supt2(r1), where E(supt2(r1)) = supt1(r1) = 0.052. Assuming that the size of the dataset in each period is 1000 tuples, we have a 2 × 2 contingency table in which each cell contains the observed frequencies of tuples that satisfy r1 and that do not satisfy r1, for each time period, as shown in Table 3. From this table, we can compute the chi-square value, which is equal to 0.7. Using a significance level of 5% and 1 degree of freedom, the critical value is 3.84. Since the chi-square value is smaller than the critical value, we do not reject Ho, and conclude that the support of r1 is not significantly different1. This means that r1 does not show a fundamental change in support.
1 The computation is performed using a chi-square calculator, which is available on the web at
http://schnoodles.com/cgi-bin/web chi.cgi
6. CONCLUSION
All of the proposed methods, whether statistical or visualization based, consider only the statistical properties of a rule, i.e., its support or confidence. In [7], a first step toward an integrated treatment of the two aspects of a rule, its content and its statistics, has been made by proposing the GRM, which models both the content and the statistics of a rule as a temporal object. In our next research, we will combine statistical and visualization methods for observing the evolution of temporal association rules generated from interval sequence data.
References
[1] Hembold, H.M., Long, P.M.: Tracking drifting concepts by minimizing disagreements. Machine
Learning 114 (1996) 27–45
[2] Hickey, R.J., Black, M.M.: Refined time stamps for concept drift detection during mining for
classification rules. In Roddick, J.F., Hornsby, K., eds.: Proceedings of the 1st International
Workshop, TSDM 2000. Volume 2007 of LNAI., Lyon, France, Springer (2000) 20–30
[3] Tansel, A.U., Ayan, N.F.: Discovery of association rules in temporal databases. In: Proceedings
of the 4th International Conference on Knowledge Discovery and Data Mining, Distributed Data
Mining Workshop, New York City, New York, USA (1998)
[4] Agrawal, R., Psaila, G.: Active data mining. In Fayyad, U.M., Uthurusamy, R., eds.: Proceed-
ings of the First International Conference on Knowledge Discovery and Data Mining (KDD’95),
Montréal, Québec, Canada (1995) 3–8
[5] Liu, B., Ma, Y., Lee, R.: Analyzing the interestingness of association rules from the temporal
dimension. In: Proceedings of IEEE International Conference on Data Mining (ICDM’01), Silicon
Valley, USA (2001) 377–384
[6] Zhao, K., Liu, B.: Visual analysis of the behavior of discovered rules. In: Workshop Notes in
ACM-SIGMOD 2001 Workshop on Visual Data Mining, San Fransisco, CA, USA (2001)
[7] Baron, S., Spiliopoulou, M.: Monitoring change in mining results. In Kambayashi, Y., Winiwarter,
W., Arikawa, M., eds.: Proceedings of the 3rd International Conference on Data Warehousing and
Knowledge Discovery (DaWak’01). Volume 2114 of LNCS., Munich, Germany, Springer (2001)
51–60
[8] Ganti, V., Gehrke, J., Ramakrishnan, R.: A framework for measuring changes in data character-
istics. In: Proceedings of the 18th ACM SIGACT-SIGMOD-SIGART Symposium on Principles of
Database Systems, Philadelphia, Pennsylvania (1999) 126–137
[9] Dong, G., Li, J.: Efficient mining of emerging patterns: Discovering trends and differences. In:
Knowledge Discovery and Data Mining. (1999) 43–52
[10] Liu, B., Hsu, W., Ma, Y.: Discovering the set of fundamental rule changes. In: Proceedings of
the 7th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San
Francisco, CA, USA (2001) 335–340
[11] Cheung, D.W.L., Ng, V., Tam, B.W.: Maintenance of discovered knowledge: A case in multi-level
association rules. In: Knowledge Discovery and Data Mining. (1996) 307–310
[12] Cheung, D.W.L., Lee, S.D., Kao, B.: A general incremental technique for maintaining discovered
association rules. In: Database Systems for Advanced Applications. (1997) 185–194
[13] Ayan, N.F., Tansel, A.U., Arkun, M.E.: An efficient algorithm to update large itemsets with
early pruning. In: Proceedings of the 5th ACM SIGKDD International Conference on Knowledge
Discovery and Data Mining, San Diego, CA, USA (1999) 287–291
[14] Omiecinski, E., Savasere, A.: Efficient mining of association rules in large dynamic databases.
In Embury, S.M., Fiddian, N.J., Gray, W.A., Jones, A.C., eds.: Proceedings of the 16th British
National Conference on Databases (BNCOD'98). Volume 1405 of LNCS., Cardiff, Wales, U.K.,
Springer (1998) 49–63
[15] Ganti, V., Gehrke, J., Ramakrishnan, R.: DEMON: Mining and monitoring evolving data. In:
Proceedings of the 16th International Conference on Data Engineering (ICDE’00), San Diego,
CA, USA, IEEE Computer Society Press (2000) 439–448
[16] Saraee, M.H., Theodoulidis, B.: Knowledge discovery in temporal databases. In: IEE Colloquium
on ’Knowledge Discovery in Databases’, IEE, London (1995) 1–4
[17] Baron, S., Spiliopoulou, M.: Monitoring the evolution of web usage patterns. In: Proceedings of
the First European Web Mining Forum (EMWF 2003), Cavtat-Dubrovnik, Croatia (2003) 181–200
[18] Baron, S., Spiliopoulou, M., Günther, O.: Efficient monitoring of patterns in data mining envi-
ronments. In Kalinichenko, L.A., Manthey, R., Thalheim, B., Wloka, U., eds.: Proceedings of the
7th East European Conference on Advances in Databases and Information Systems (ADBIS’03).
Volume 2798 of LNCS., Dresden, Germany, Springer (2003) 253–265
[19] Toivonen, H., Klemettinen, M., Ronkainen, P., Hatonen, K., Mannila, H.: Pruning and grouping
of discovered association rules. In: ECML-95 Workshop on Statistics, Machine Learning, and
Knowledge Discovery in Databases, Heraklion, Greece (1995) 47–52
[20] Liu, B., Hsu, W., Ma, Y.: Pruning and summarizing the discovered associations. In: Proceedings
of the Fifth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining,
San Diego, CA, USA (1999) 125–134
[21] Klemettinen, M., Mannila, H., Ronkainen, P., Toivonen, H., Verkamo, A.: Finding interesting
rules from large sets of discovered association rules. In Adam, N., Bhargava, B., Yesha, Y., eds.:
Proceedings of the 3rd International Conference on Information and Knowledge Management,
Gaithersburg, Maryland, ACM Press (1994) 401–407
[22] Koundourakis, G., Theodoulidis, B.: Association rules and evolution in time. In: Proceedings
of Methods and Applications of Artificial Intelligence, Second Hellenic Conference on AI, SETN
2002, Thessaloniki, Greece (2002) 261–272
[23] Agrawal, R., Srikant, R.: Fast algorithms for mining association rules. In: Proceedings of the 20th
International Conference on Very Large Data Bases. (1994) 487–499
Edi Winarko
Faculty of Mathematics and Natural Sciences,
Universitas Gadjah Mada, Jogjakarta.
e-mail: [email protected]
Proceedings of ”The 6th SEAMS-UGM Conference 2011”
Computer, Graph and Combinatorics, pp.601–614.
Abstract. Many fingerprint recognition methods have been proposed and the need arises
for a methodology to compare these methods, in order to be able to decide whether a
particular method is better than another. In this paper, we report on our effort to
develop a methodology to compare the robustness of fingerprint recognition methods. As
a case study, we apply this methodology to compare two recent fingerprint recognition
algorithms proposed by Chikkerur (2005) and Wibowo (2006). We are able to conclude
that, overall, Chikkerur’s algorithm performs better than Wibowo’s.
Keywords and Phrases: Fingerprint recognition methods, comparison framework.
1. INTRODUCTION
The need for biometrics that can be used to recognize people based on their bodily characteristics has existed for a long time. Biometric recognition is associated with identification (“Who is X?”) and verification (“Is this X?”) [13]. Alphonse Bertillon, chief of the criminal identification division of the police department in Paris, conceived the idea that body measurements can be used to identify criminals; this idea changed major law enforcement departments in the mid-19th century [3].
Not all body measurements are eligible to be a biometric. The human fingerprint, which has been used for authentication purposes for more than 100 years [3, 4, 7], is one of the most well-known biometrics. Fingerprints can serve as a biometric because they have characteristics that are feasible to measure, distinct, permanent, accurate, reliable, and acceptable [7]. There are three levels of fingerprint features that can be used in recognition processes [5]:
(1) Global level: the ridge flows of fingerprints create particular patterns, such as
shown in Figure 1.
(2) Local level: there are 150 different patterns or forms of ridges in fingerprints. These patterns are called minutiae (see Figure 2). The most popular minutiae are ridge endings and ridge bifurcations.
(3) Very-fine level: at this level, we look at deeper levels of detail in the ridges. The most important feature is the sweat pore, which can be observed using a high-resolution sensor (1000 dpi) (see Figure 2).
Figure 2. Black solid circles are minutiae and circles with holes are sweat pores [5]
2. PRELIMINARIES
2.1. Failures in Biometric Systems. There are two possible errors in biometric systems [2], namely:
(1) α-error, a failure that occurs when the comparison rejects, or concludes as different, things which are actually the same. Hence, this is also called a false non-match. The ratio of this failure is called the false non-match rate (FNMR) or false reject rate (FRR).
(2) β-error, a failure that occurs when the comparison accepts, or concludes as the same, things which are actually different. Hence, this is also called a false match. The ratio of this failure is called the false match rate (FMR) or false accept rate (FAR).
2.2. False Non-Match (FNM) and False Match (FM). In order to define FNM and FM, we first define feature extraction, matching, and the process of making conclusions. We define a biometric sample as Sik, where i is the individual that the biometric sample belongs to and k denotes the index of the successful acquisition process (different biometric samples can be acquired from the same individual). The features of every biometric sample Sik, denoted by Xik, are then extracted. The matching result, denoted by Yik,i′k′, is obtained by matching the biometric samples Sik and Si′k′. The next step is to decide whether the two biometric samples represent the same biometric. Given a threshold τ, two biometrics are not similar if Yik,i′k′ > τ and two biometrics are similar if Yik,i′k′ ≤ τ.
We define Dii′j as a binary function that represents the j-th conclusion drawn from comparing the biometrics of the i-th and i′-th individuals. If the value of Dii′j is one, then the system has made a mistake, and if the value of Dii′j is zero, then the system was right. Dii′j is defined as follows:
From Dii′j we can compute the false match rate (FMR) and the false non-match rate (FNMR) as follows:
FMR = ( Σ_i Σ_{i′≠i} Σ_j Dii′j ) / ( Σ_i Σ_{i′≠i} nii′ ),        (2)
FNMR = ( Σ_i Σ_j Diij ) / ( Σ_i nii ),        (3)
where nii′ is the number of comparisons between two individuals, and nii is the number of times an individual is compared with himself. If i = i′ the comparison is called genuine matching, and if i ≠ i′ it is called imposter matching.
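A small Python sketch of equations (2) and (3); the dictionary-based bookkeeping of decisions and comparison counts is an illustrative assumption.

def error_rates(decisions, comparisons):
    """FMR and FNMR from equations (2) and (3).
    decisions[(i, j)]   : list of D_{ij} values (1 = wrong conclusion, 0 = right)
    comparisons[(i, j)] : n_{ij}, number of comparisons of individuals i and j"""
    fm    = sum(sum(d) for (i, j), d in decisions.items() if i != j)   # false matches
    fnm   = sum(sum(d) for (i, j), d in decisions.items() if i == j)   # false non-matches
    n_imp = sum(n for (i, j), n in comparisons.items() if i != j)      # imposter matchings
    n_gen = sum(n for (i, j), n in comparisons.items() if i == j)      # genuine matchings
    return (fm / n_imp if n_imp else 0.0), (fnm / n_gen if n_gen else 0.0)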
2.3. Sensitivity and Specificity. Sensitivity and specificity are used to measure the success of an algorithm in detecting minutiae [8]. They are defined as follows:
Sensitivity = 1 − missedminutiae / groundtruth,        (4)
Specificity = 1 − falseminutiae / groundtruth,        (5)
where missedminutiae is the number of genuine minutiae that are not detected, falseminutiae is the number of false minutiae that are detected, and groundtruth is the number of minutiae defined by fingerprint experts.
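As a worked instance of equations (4) and (5), the first row of Table 3 for Chikkerur's method (38 ground-truth minutiae, 13 missed, 21 false) reproduces the reported 65.79 % and 44.74 %:

def sensitivity_specificity(missed_minutiae, false_minutiae, ground_truth):
    """Equations (4) and (5)."""
    return 1 - missed_minutiae / ground_truth, 1 - false_minutiae / ground_truth

sens, spec = sensitivity_specificity(13, 21, 38)   # -> (0.6579..., 0.4473...)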
2.4. Equal Error Rate (EER). The Equal Error Rate (EER) is an objective evaluation criterion for classifier performance testing. It is objective in the sense that the rejection threshold is selected independently. The Equal Error Rate is defined by the intersection point of the FNMR and FMR curves as functions of the rejection threshold [6]. In other words, the EER is the value at which FNMR is equal to FMR, as shown in Figure 3.
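A sketch of how the EER can be read off from sampled FNMR and FMR curves; the discretization over thresholds is an assumption.

def equal_error_rate(thresholds, fnmr_curve, fmr_curve):
    """Approximate the EER by the threshold at which |FNMR - FMR| is smallest."""
    k = min(range(len(thresholds)), key=lambda i: abs(fnmr_curve[i] - fmr_curve[i]))
    return thresholds[k], (fnmr_curve[k] + fmr_curve[k]) / 2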
2.5. Mean and Standard Deviation. The mean is used to express the typical value of a collection of values. The mean of N data points Xi, denoted by X̄, is defined by [9]:
X̄ = ( Σ_{i=1}^{N} Xi ) / N.        (6)
Besides the mean, we need a way to measure the spread of the collection of values around its mean. The standard deviation of N data points Xi, denoted by s, is defined by [9]:
s = √( Σ_{i=1}^{N} (Xi − X̄)² / (N − 1) ).        (7)
3.1. Enhancement Testing. The enhancement testing compares only the enhancement process of fingerprint recognition methods. The aim of this testing is to compare the success rate of each enhancement process. Figure 4 shows how enhancement tests are carried out. To fairly compare the quality of the enhancement processes, we use third-party software that does not contain any enhancement process whatsoever. We use MINDTCT [11] as the feature extraction software and BOZORTH3 [10] as the matcher software to perform verification. The quality of each enhancement process is then represented by its FNMR and FMR.
3.2. Feature Extraction Testing. The feature extraction testing compares only the feature extraction process of fingerprint recognition methods. The difficulty of this testing lies in the fact that a particular feature extraction method might be linked to a particular enhancement method. Therefore, we need to pass all necessary parameters from the enhancement process to the feature extraction process, if any. We have to make sure that the image is not changed by the enhancement process. Figure 5 shows that the enhancement process is retained in the testing, but the filtering process that modifies the raw image is removed. With this scheme, parameters that are required during the feature extraction process can be passed on without changing the input image, and hence the input images before and after the enhancement process are the same.
The result, as shown in Figure 5, is a fingerprint image with additional feature points. We then compute the values of sensitivity and specificity to determine the performance of the feature extraction process. Ideally, we need fingerprint experts to create a standard template of the genuine fingerprint features, so that we can compute the values of sensitivity and specificity precisely.
3.3. Matching Testing. The matching testing compares only the matching process of fingerprint recognition methods. The difficulty of this testing lies in the differences between feature representations. A particular matcher might be tied to a particular feature representation. Therefore, the feature representations are converted to a common format that conforms with the matchers. To compare fairly, we have to make sure that the features are the same although they may have different representations.
In this testing, we compute the mean and standard deviation of matched feature points (minutiae) in genuine matching and imposter matching. Using the combination of the mean and standard deviation of matched minutiae, the ability of the matcher to distinguish fingerprint images through their features can be observed. The distance between the mean ± standard deviation intervals of genuine matching and imposter matching is used to determine the threshold. The greater the distance, the easier it is to determine the threshold. An overlap of the mean ± standard deviation intervals of genuine matching and imposter matching (i.e., the intervals intersect each other) means that there must have been mistakes or failures in the matching process. If such an overlap exists, the threshold cannot be determined precisely.
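This overlap criterion can be checked directly from Table 4, as in the following sketch:

def intervals_overlap(mean_gen, std_gen, mean_imp, std_imp):
    """True when the mean +/- std intervals of genuine and imposter matching intersect,
    i.e. the threshold cannot be determined precisely."""
    return max(mean_gen - std_gen, mean_imp - std_imp) <= \
           min(mean_gen + std_gen, mean_imp + std_imp)

# Data set 1 of Table 4: Chikkerur 7.29 +/- 3.43 (genuine) vs 3.45 +/- 0.79 (imposter)
print(intervals_overlap(7.29, 3.43, 3.45, 0.79))   # True: the intervals intersect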
3.4. Overall Testing. Besides the partial tests (i.e., enhancement testing, feature extraction testing and matching testing), we also perform an overall testing that compares the whole process of the fingerprint recognition methods. This testing is used to compare the overall evaluation results of the fingerprint recognition methods.
The EER is used as the evaluation measure for this overall testing. The EER provides information about the best performance of the FRM, namely the point at which the FMR is the same as the FNMR.
4. EXPERIMENTAL DATA
We collect our data set using a 500 dpi resolution fingerprint sensor that can
produce images of size 280 × 360 pixels. Figure 7 shows several examples of the obtained
fingerprint images. We also use a particular naming scheme: each fingerprint image is named according to the format name.fingerprint-code.index-of-acquisition.bmp.
From 16 volunteers, a total of 640 fingerprint images have been collected. From
each volunteer we took 40 images; eight different images for each finger.
5. EXPERIMENTAL RESULTS
To demonstrate the use of the framework, we will use and compare the imple-
mentations of two fingerprint recognition methods based on Chikkerur’s [1] and Wi-
bowo’s [12].
5.2. Feature Extraction Testing Result. In the second testing, we compare two feature extraction methods: the chain-code-based method [1] and the templating-based method [12]. The result of the feature extraction testing is shown in Table 3. The columns of Table 3
are as follows:
(1) File Name is the name of the file of the fingerprint image.
(2) Ground truth is the number of genuine minutiae based on the benchmark.
(3) Total is the total number of minutiae that can be extracted.
(4) Missed is the number of genuine minutiae that cannot be extracted.
(5) False is the number of minutiae that are extracted but not genuine.
(6) Match is the number of minutiae that are extracted and genuine.
(7) Sens. and Spec. are the values of sensitivity and specificity, respectively.
The associated sensitivity and specificity of the two methods from Table 3 are shown in Figure 9. From Figure 9, we observe that both the sensitivity and the specificity of Chikkerur's method are higher than those of Wibowo's. This means that Chikkerur's feature extraction method performs better than Wibowo's feature extraction method.
5.3. Matching Testing Result. In this testing, we compare two matching methods:
graph-based [1] and point pattern matching based on alignment methods [12]. The result
Table 3. Feature Extraction Testing Result
No File Name Ground Total Missed False Match Sens. (%) Spec. (%)
truth C W C W C W C W C W C W
1 aaron.0.01.bmp 38 46 77 13 19 21 58 25 19 65.79 50.00 44.74 -52.63
2 arief.0.01.bmp 23 52 84 10 11 39 72 13 12 56.52 52.17 -69.57 -213.04
3 ata.0.01.bmp 24 60 96 12 17 48 89 12 7 50.00 29.17 -100.00 -270.83
4 beny.0.01.bmp 29 38 85 10 15 19 71 19 14 65.52 48.28 34.48 -144.83
5 christ.0.01.bmp 17 35 70 10 9 28 62 7 8 41.18 47.06 -64.71 -264.71
6 danu.0.01.bmp 28 53 27 8 13 33 12 20 15 71.43 53.57 -17.86 57.14
7 danu.1.08.bmp 27 41 57 13 14 27 44 14 13 51.85 48.15 0.00 -62.96
8 danu.4.07.bmp 23 48 32 10 16 35 25 13 7 56.52 30.43 -52.17 -8.70
9 firsty.0.01.bmp 27 35 82 12 15 20 70 15 12 55.56 44.44 25.93 -159.26
10 ilham.0.01.bmp 23 38 97 9 15 24 89 14 8 60.87 34.78 -4.35 -286.96
11 illy.0.01.bmp 29 44 106 13 16 28 93 16 13 55.17 44.83 3.45 -220.69
12 reza.0.01.bmp 30 33 111 15 14 18 95 15 16 50.00 53.33 40.00 -216.67
13 ria.0.01.bmp 25 54 200 9 14 38 189 16 11 64.00 44.00 -52.00 -656.00
14 riza.0.01.bmp 41 67 72 22 19 48 50 19 22 46.34 53.66 -17.07 -21.95
15 sigit.0.01.bmp 31 34 66 19 17 22 52 12 14 38.71 45.16 29.03 -67.74
16 willy.0.01.bmp 37 43 74 21 23 27 60 16 14 43.24 37.84 27.03 -62.16
Average 28.25 45.06 83.50 12.88 15.44 29.69 70.69 15.38 12.81 54.54 44.80 -10.82 -165.75
STD 6.26 9.99 38.71 4.32 3.29 9.65 39.27 4.08 4.17 9.46 7.91 44.91 167.91
Table 4. Matching Testing Result
Method
Data set Chikkerur’s Wibowo’s
Mean STD Mean STD Mean STD Mean STD
Genuine Genuine Imposter Imposter Genuine Genuine Imposter Imposter
1 7.29 3.43 3.45 0.79 24.28 11.73 20.14 10.30
2 12.58 6.94 3.20 0.73 25.63 13.47 10.40 3.95
3 12.76 6.39 3.13 0.55 22.90 11.73 8.05 2.87
4 10.95 5.15 2.96 0.63 19.53 9.69 7.57 4.51
5 8.97 5.32 3.13 0.69 21.47 11.12 11.23 5.44
6 11.98 5.62 2.90 0.61 18.68 9.60 6.22 2.76
7 6.32 3.15 3.15 0.60 22.49 10.83 17.31 10.45
8 10.90 6.03 3.16 0.74 20.78 9.93 8.84 4.03
Avg. 10.22 5.25 3.14 0.67 21.97 11.01 11.22 5.54
Table 5. The Overall Testing Result
Method
Data set Chikkerur’s Wibowo’s
Mean STD Mean STD Mean STD Mean STD
EER FNMR/ Gen. Gen. Imp. Imp. EER FNMR/ Gen. Gen. Imp. Imp.
(%) FMR (%) (%) (%) (%) (%) FMR (%) (%) (%) (%)
1 7.60 0.20 18.54 12.65 6.48 3.26 19.51 0.43 22.74 9.84 18.24 7.42
2 10.00 0.10 28.08 15.20 7.64 1.79 19.10 0.23 38.07 18.70 15.45 5.20
3 11.30 0.05 27.91 12.90 7.95 1.87 17.60 0.18 38.83 18.13 13.66 4.51
4 14.50 0.04 38.61 14.64 9.48 2.46 18.70 0.20 37.74 17.77 13.83 6.16
5 9.60 0.14 21.40 12.77 7.60 1.86 19.77 0.30 33.42 18.17 16.27 6.73
6 13.55 0.05 32.09 13.81 9.31 2.48 17.20 0.20 40.15 19.28 13.30 5.13
7 6.95 0.28 21.47 18.08 6.26 2.27 21.38 0.40 25.62 10.86 18.39 8.46
8 12.20 0.09 31.35 16.02 8.64 2.52 18.50 0.24 36.53 18.50 14.64 5.75
Avg. 10.71 0.12 27.43 14.51 7.92 2.31 18.97 0.27 34.14 16.41 15.47 6.17
STD 2.69 0.08 1.31 0.10
of the matching testing is shown in Table 4. Table 4 shows the comparison of the mean and the standard deviation of genuine and imposter matchings. The values of the mean and the standard deviation of both genuine and imposter matchings of both methods are plotted in Figure 10.
From Figure 10, we observe that Wibowo's method produces more overlaps than Chikkerur's method (seven overlaps compared to three). This means that Chikkerur's matcher is better at distinguishing fingerprint images based on their features than Wibowo's matcher.
5.4. Overall Testing Result. The overall testing result is shown in Table 5. Table 5
shows the comparison of EER with the corresponding FNMR/FMR and also the com-
parison of the mean and the standard deviation of both genuine and imposter matchings
(Mean Gen., STD Gen., Mean Imp. and STD Imp.). The values of the mean and the
standard deviation of both genuine and imposter matchings are the measure of the sim-
ilarity between two fingerprint images. The comparison of the mean and the standard
deviation of both genuine and imposter matchings is depicted in Figure 11 and the
comparison of EER with the corresponding FNMR/FMR is depicted in Figure 12.
From Figure 11, we observe that Chikkerur's method produces greater gaps between genuine matching and imposter matching than Wibowo's method. This means
that Chikkerur’s method is able to distinguish fingerprint images better than Wibowo’s.
This result can also be confirmed in Figure 12: Chikkerur’s method produces higher
values of FNMR/FMR than Wibowo’s method.
Figure 11. Overall testing result: the similarity value of (A) Wi-
bowo’s method, and (B) Chikkerur’s method. The continuous line rep-
resents genuine matching, while the dashed line represents imposter
matching
Figure 12. Overall testing result: the value of (A) EER, and (B)
FNMR/FMR. The continuous line represents Chikkerur’s FRM, while
the dashed line represents Wibowo’s FRM
6. CONCLUDING REMARKS
Our experiments show that, overall, Chikkerur's FRM is better than Wibowo's FRM. This conclusion is based on the partial comparison results. Chikkerur's enhancement method performs better than Wibowo's, which is shown by the fact that Chikkerur's method has a smaller false non-match rate in the accuracy testing. Chikkerur's feature extraction method also performs better than Wibowo's, which is shown by its higher values of sensitivity and specificity. For the matching method, Chikkerur's method can distinguish fingerprints based on their features better than Wibowo's method. In addition, we estimated the classification accuracy of the whole FRM in the overall testing. In this testing, Chikkerur's method has a higher accuracy than Wibowo's. Hence, the partial tests and the overall testing lead us to the same conclusion: Chikkerur's method is better than Wibowo's.
In this paper, we have developed a framework that can be used to compare fingerprint recognition methods. We have also demonstrated the use of the proposed framework by comparing two recent methods. The experiments showed that the comparison framework performs well in measuring the relative quality of the two fingerprint recognition methods. Since a fingerprint recognition method can usually be divided into three processes, i.e., enhancement, feature extraction and matching, the proposed comparison framework provides specific and detailed information on each process. The comparison results of each process enable us to investigate the performance of a fingerprint recognition method in more detail. This framework provides a basis for comparing other fingerprint recognition methods.
References
[1] Chikkerur, S.S.: Online Fingerprint Verification System. Master’s thesis, State University of New
York at Buffalo, Buffalo, New York, June 2005.
[2] Dunstone, T., Yager, N.: Biometric System and Data Analysis Design, Evaluation, and Data
Mining. Springer, 2009.
[3] Jain, A.K., Ross, A., Prabhakar, S.: An introduction to biometric recognition. IEEE Transac-
tions on Circuits and Systems for Video Technology, 14(1), 4–20, 2004, http://dx.doi.org/10.
1109/TCSVT.2003.818349.
[4] Komarinski, P.: Automated Fingerprint Identification Systems (AFIS). Academic Press, 2004.
[5] Maltoni, D., Maio, D., Jain, A.K., Prabhakar, S.: Handbook of Fingerprint Recognition.
Springer Publishing Company, Incorporated, 2009.
[6] Poh, N., Bengio, S.: Evidences of equal error rate reduction in biometric authentication fusion.
Idiap-RR Idiap-RR-43-2004, IDIAP, 2004.
[7] Ravi, J., Raja, K.B., Venugopal, K.R.: Fingerprint Recognition Using Minutia Score Matching.
CoRR abs/1001.4186, 2010.
[8] Sherlock, B., Monro, D., Millard, K.: Fingerprint enhancement by directional Fourier fil-
tering. IEEE Proceedings - Vision, Image, and Signal Processing, 141(2), 87–94, 1994, http:
//link.aip.org/link/?IVI/141/87/1.
[9] Stockburger, D.W.: Introductory statistics: Concepts, models and applications 1998, http:
//business.clayton.edu/arjomand/book/sbk00.html.
[10] Watson, C.I., Garris, M.D., Tabassi, E., Wilson, C.L., Mccabe, R.M., Janet, S., Ko, K.:
User’s Guide to Export Controlled Distribution of NIST Biometric Image Software (NBIS-EC),
2007.
[11] Watson, C.I., Garris, M.D., Tabassi, E., Wilson, C.L., Mccabe, R.M., Janet, S., Ko, K.:
User’s Guide to NIST Biometric Image Software (NBIS), 2007.
[12] Wibowo, M.E.: Sistem Identifikasi Sidik Jari Berdasarkan Minutiae. Master’s thesis, Universitas
Gadjah Mada, Yogyakarta, Indonesia, October 2006.
[13] Woodward, J.D., Orlans, N.M.: Biometrics. McGraw-Hill, Inc., New York, NY, USA 2002.
Ary Noviyanto
Faculty of Computer Science, Universitas Indonesia, Indonesia
e-mail: [email protected]
Most of this work was done when the first author was with Universitas Gadjah Mada
Reza Pulungan
Department of Computer Science and Electronics
Faculty of Mathematics and Natural Sciences, Universitas Gadjah Mada, Indonesia
e-mail: [email protected]
Proceedings of ”The 6th SEAMS-UGM Conference 2011”
Computer, Graph and Combinatorics, pp. 615–620.
Janpou Nee
Abstract. We first extend the general maximum principle of Lou and Ni [6] for elliptic equations to parabolic equations; we then show that a certain Turing system has a global attractor when the diffusion coefficient is close to 1. We also show the existence of a periodic solution arising from a Hopf bifurcation.
Keywords and Phrases: Global existence, global attractor, periodic solution.
1. INTRODUCTION
Recently, much research has been devoted to the study of the instability of steady states of reaction-diffusion systems [3, 5, 8, 7] that is induced by differences in the diffusion coefficients. Such a phenomenon is the signature of a Turing system [11]. The importance of such instability in Turing systems is that it corresponds to models of the development of morphogenesis in biology. Mathematically, such instability corresponds to a Hopf bifurcation.
In this paper, we study the Brusselator equation, which is a model of an autocatalytic reaction in which one of the reactants is also the product. This model is characterized by the reactions:
A → X, B + X → C + Y, 2X + Y → 3X, X → E.
Moreover, it is a model of the activation-depletion mechanism of a chemical (or biological chemical reaction [1, 2, 5, 8]) and a Turing system as well. The behavior of the solutions of such a system is complicated because the system contains many parameters. To simplify the parameters of the equation, we rescale the parameters and change variables. Eventually, the Brusselator model takes the form:
Eventually, the equation of Brusselator model looks like:
ut = D∆u + α − (1 + β)u + u2 v
x ∈ Ω, (1)
vt = ∆v + βu − u2 v,
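For readers who wish to experiment with (1), a minimal one-dimensional explicit finite-difference sketch is given below; the parameter values, zero-flux boundary conditions and discretization are chosen purely for illustration and are not taken from the paper.

import numpy as np

alpha, beta, D = 2.0, 4.0, 1.05            # D close to 1, as required by the theorems
N, L, T, dt = 200, 10.0, 50.0, 1e-3
dx = L / (N - 1)
u = alpha + 0.01 * np.random.rand(N)       # small perturbation of the steady state (alpha, beta/alpha)
v = beta / alpha + 0.01 * np.random.rand(N)

def lap(w):                                # discrete Laplacian with zero-flux boundaries
    w_ext = np.concatenate(([w[1]], w, [w[-2]]))
    return (w_ext[:-2] - 2 * w + w_ext[2:]) / dx**2

for _ in range(int(T / dt)):               # explicit Euler time stepping
    du = D * lap(u) + alpha - (1 + beta) * u + u**2 * v
    dv = lap(v) + beta * u - u**2 * v
    u, v = u + dt * du, v + dt * dv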
Proof. We only prove (i), since (ii) can be derived similarly. Let (x0, t0) be an interior point of the domain Ω × (0, T) such that w(x0, t0) = max_{ξ∈Ω×(0,T)} w(ξ). Thus (5) implies
g(x0, t0, w(x0, t0)) ≥ wt(x0, t0) − ∆w(x0, t0) ≥ 0.
Let (x0, t0) ∈ ∂Ω̄ × [0, T]; then we argue by contradiction (cf. [10]) and assume that g(x0, t0, w(x0, t0)) < 0, where w(x0, t0) = max_{ξ∈∂Ω̄×[0,T]} w(ξ). By (4), g(x0, t0, w(x0, t0)) + ∆w(x0, t0) ≥ wt, so we have wt < 0; thus w is larger at some earlier time. This is a contradiction, and hence (i) holds.
With the help of the maximum principle above, the following results hold.
and a µ-periodic solution is now a 1-periodic solution satisfying u(t, x) = u(t + 1, x) and v(t, x) = v(t + 1, x).
To show the existence of a 1-periodic solution of equation (13), we denote
LD = ( D∆   0
        0    ∆ ).
We consider the Banach space X = C(Ω) × C(Ω), let U = (u, v), and denote F(α, β, µ, u, v) = F(α, β, µ, U); then (1) may be rewritten as follows:
Ut = LD U + F(α, β, µ, U),    U(t) = U(t + 1) ∈ X,        (14)
E = {(x, y) ∈ X : α/(1 + β) ≤ x ≤ α + µ³β(β + 1)/α,   αβ/(α² + µ³β(β + 1)) ≤ y ≤ µ³β(β + 1)/α}.
By (15), we let the Poincaré map be such that
α/(1 + β) ≤ u ≤ α + µ³β(β + 1)/α,    αβ/(α² + µ³β(β + 1)) ≤ v ≤ µ³β(β + 1)/α.        (17)
Thus P(X′) ⊂ E ∩ X². By the Arzelà–Ascoli theorem, the Poincaré map has a fixed point, and hence (1) has a periodic solution.
5. CONCLUDING REMARKS
The solution of this system is rather complicated. If the condition |D − 1| ≤ ε in the theorems could be removed, this would be a major improvement. However, the blow-up result of the last section indicates that this task could be difficult.
Many research questions about this system remain open, for example, the threshold of the activation and depletion. In [9], we only observed a global condition for it, global in the sense of the concentration of the reaction, but not local. It would be interesting if one could find a local condition for the threshold in any Turing system.
References
[1] K.J. Brown and F.A. Davidson, Global bifurcation in the Brusselator system, Nonlinear Anal-
ysis, 24, 1713-1725, 1995.
[2] T. Erneux and E. Reiss, Brusselator isolas, SIAMJ. appl. Math., 43, 1240-1246, 1983.
[3] P. Fife, Mathematical Aspects of Reacting and Diffusing Systems, Lecture Notes in Biomathematics 28, Springer-Verlag, Berlin, Heidelberg, New York, 1979.
M. Ghergu, Non-constant steady-state solutions for Brusselator type systems, Nonlinearity, 21, 2331–2345, 2008.
[4] Daniel Henry, Geometric Theory of Semi-linear Parabolic Equations, Lecture notes in Math.
840, Springer-Verlag, New York, 1981.
[5] T. Kolokolnikov, T. Erneuxa, J. Weib, Mesa-type patterns in the one-dimensional Brusselator
and their stability, Physica D, 214, 63-77 2006.
[6] Lou Y. and Ni W-M, Diffusion, self-diffusion and cross-diffusion, J. Diff. Eqns, 131, 79–131, 1996.
[7] J. D. Murray, Mathematical Biology, Springer-Verlag, Berlin, Heidelberg, New York 1993.
[8] G. Meinhardt, Models of Biological Pattern Formation, London Academic Press 1982.
[9] J. Nee, On Brusselator Equation to appear.
[10] J. H. Protter and H. F. Weinberger, Maximum Principles in Differential Equations, 2nd ed.,
Springer-Verlag, New York, 1984.
[11] A.M. Turing, The chemical basis of morphogenesis, Phil. Trans. R. Soc. London B, 237, 37–72,
1952.
Janpou Nee
Chieng-Kuo Technology University
e-mail: [email protected]
Proceedings of ”The 6th SEAMS-UGM Conference 2011”
Computer, Graph and Combinatorics, pp. 621 - 630
Abstract. A number of techniques based on logic have been developed to provide formal verification of cryptographic protocols. Some of them are based on the logic of belief, others on the logic of knowledge, and some combine the logic of knowledge and the logic of belief. This paper discusses two logical approaches towards the verification of security protocols. We implement the BAN logic approach and the CS logic approach to verify a security protocol. A case study incorporating the verification of an authentication protocol is presented. The results of the analysis highlight the advantages and limitations of these approaches, and compare their specifications and capabilities.
1. INTRODUCTION
destruction, and disclosure. Therefore, much attention is directed to the development and use of cryptographic protocols.
In the literature, two main classes of cryptographic protocols have been proposed, namely authentication protocols and key distribution protocols. The main purpose of an authentication protocol is to allow principals to identify themselves to each other. A cryptographic key distribution protocol aims to distribute keys among the principals.
The design of cryptographic protocols is very difficult and complicated. If a protocol is not designed carefully enough, it will contain weaknesses that can be an ideal starting point for various attacks. In that context, it is not surprising that there are several examples of cryptographic protocols that were believed to be good and were later shown to have security flaws. It is also well known that the design of cryptographic protocols is prone to error. Formal methods seem well suited to address this issue. The application of formal methods to cryptographic protocols became widespread in the early 1990s. Indeed, the search for undiscovered security flaws in cryptographic protocols encouraged the use of formal analysis techniques. This fact further stimulated research into the development of several different formal methods to detect weaknesses in protocols.
This paper discusses two logical approaches towards the verification of security protocols. We implement the BAN logic approach and the CS logic approach to verify a security protocol. A case study incorporating the verification of an authentication protocol is presented. The results of the analysis highlight the advantages and limitations of these approaches, and compare their specifications and capabilities.
Coffey and Saidha [6] developed a logic for the verification of security protocols with public keys, referred to as the CS logic. The CS logic is based on the logic of belief and the logic of knowledge. The inference rules used are those of natural deduction. Both a knowledge operator and a belief operator are used in this logic.
2.1. The BAN Logic. BAN logic, proposed by Mike Burrows, Martin Abadi, and Roger Needham [1], is an approach based on the logic of belief. It allows the assumptions and goals of a protocol to be stated abstractly in a logic of belief. BAN logic provides notations, constructs, inference rules, and postulates. Using BAN logic, the analysis of a protocol is performed as follows:
1. The original protocol is idealized using BAN logic notations and constructs.
2. Assumptions about the initial state are written down.
3. Logical formulas are attached to the statements of the protocol as assertions about the state of the system after each statement.
4. The logical postulates are applied to the assumptions and the assertions in order to discover the beliefs held by the parties in the protocol.
This procedure may be repeated as new assumptions are found to be necessary and as the idealized protocol is refined.
The BAN logic formalism is built on three kinds of objects: the principals involved in security protocols, the encryption/decryption and signing/verification keys held by the principals, and the messages exchanged between principals. The symbols P, Q, and R denote specific principals; X, Y range over statements; K ranges over keys (encryption keys); K⁻¹ ranges over the corresponding decryption keys; and Np, Nq, Nc denote specific statements. All of these are used in writing propositional formulas connected by conjunction, denoted by a comma. In forming a proposition, BAN logic uses several constructs (Table 1).
At the idealization stage, the protocol is rewritten using BAN logic notation and constructs. The initial assumptions needed in the analysis are expressed using the existing notation and constructs. BAN logic has 19 inference rules. These rules are used in security protocol analysis to examine whether the ultimate goal of the security protocol is achieved. The inference rules of BAN logic include:
Decomposition:
P believes (X, Y) ⊢ P believes X ;   P believes Q believes (X, Y) ⊢ P believes Q believes X ;
P believes Q said (X, Y) ⊢ P believes Q said X
Decomposition under the sees operator:
P sees (X, Y) ⊢ P sees X ;   P sees ⟨X⟩_Y ⊢ P sees X ;
P believes Q ↔K P and P sees {X}_K ⊢ P sees X ;
P believes →K P and P sees {X}_K ⊢ P sees X ;
P believes →K Q and P sees {X}_{K⁻¹} ⊢ P sees X
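To illustrate how such rules can be applied mechanically, the following toy Python sketch encodes formulas as nested tuples and implements only the decomposition rule shown above; it is a hypothetical illustration, not an existing tool.

def decompose(formula):
    """From P believes (X, Y) infer P believes X and P believes Y; the analogous
    rule applies to the 'sees' operator."""
    op, principal, body = formula
    if op in ('believes', 'sees') and body[0] == 'conj':
        return [(op, principal, body[1]), (op, principal, body[2])]
    return []

# P believes (X, Y)  |-  P believes X ; P believes Y
print(decompose(('believes', 'P', ('conj', 'X', 'Y'))))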
2.2. The CS Logic. Another logical approach towards the formal verification of cryptographic protocols is the CS logic, proposed by Coffey and Saidha [6]. The CS logic is built on the logic of belief and the logic of knowledge. It provides belief operators and knowledge operators. One knowledge operator is propositional and deals with knowledge of statements or facts. The other knowledge operator is a predicate and deals with knowledge of objects (e.g. cryptographic keys, ciphertext data, etc.). The inference rules provided are the standard inferences required for natural deduction. The axioms of the logic are at a sufficiently low level to express the fundamental properties of cryptographic protocols.
The language of the CS logic, shown in Table 2, is used to formally express the logical postulates and protocol facts. The language includes the classical logical connectives of conjunction '∧', disjunction '∨', complementation '¬', and material implication '⊃'. The symbol '∀' denotes universal quantification and '∃' denotes existential quantification. Membership of a set is denoted by the symbol '∈', and set exclusion by '/'. The symbol '├' denotes a logical theorem.
a, b, c        Variables
Φ              An arbitrary statement
Σ and Ψ        Arbitrary entities
i, j           Range over entities
ENT            Set of all possible entities
k              A cryptographic key
t, t', t''     Time
e(x, k)        Encryption function, the encryption of x using key k
d(x, k⁻¹)      Decryption function, the decryption of x using key k⁻¹
K              Propositional knowledge operator; KΣ,t Φ means Σ knows statement Φ at time t
L              Knowledge predicate; LΣ,t x means Σ knows and can reproduce object x at time t
B              Belief operator; BΣ,t Φ means Σ believes at time t that statement Φ is true
C              Contains operator; C(x, y) means that the object x contains the object y
S              Emission operator; S(Σ, t, x) means Σ sends message x at time t
R              Reception operator; R(Σ, t, x) means Σ receives message x at time t
Logical Axioms
A1. (a) ∀t ∀p ∀q (KΣ,t p ∧ KΣ,t (p ⊃ q) ⊃ KΣ,t q)
    (b) ∀t ∀p ∀q (BΣ,t p ∧ BΣ,t (p ⊃ q) ⊃ BΣ,t q)
A2. ∀t ∀p (KΣ,t p ⊃ p)
A3. (a) ∀t ∀x ∀i, i ∈ {ENT} (Li,t x ⊃ ∀t', t' ≥ t, Li,t' x)
    (b) ∀t ∀x ∀i, i ∈ {ENT} (Ki,t x ⊃ ∀t', t' ≥ t, Ki,t' x)
Non-logical Axioms
A5. ∀t ∀x (S(Σ, t, x) ⊃ LΣ,t x ∧ ∃i, i ∈ {ENT/Σ}, ∃t', t' > t, R(i, t', x))
A6. ∀t ∀x (R(Σ, t, x) ⊃ LΣ,t x ∧ ∃i, i ∈ {ENT/Σ}, ∃t', t' < t, S(i, t', x))
A7. (a) ∀t ∀x ∀i, i ∈ {ENT} (Li,t x ∧ Li,t k ⊃ Li,t (e(x, k)))
    (b) ∀t ∀x ∀i, i ∈ {ENT} (Li,t x ∧ Li,t k⁻¹ ⊃ Li,t (d(x, k⁻¹)))
A8. (a) ∀t ∀x ∀i, i ∈ {ENT} ((Li,t k ∧ ∃t', t' < t, Li,t' (e(x, k))) ∨ (∃y (R(i, t, y) ∧ C(y, e(x, k)))) ⊃ Li,t (e(x, k)))
    (b) ∀t ∀x ∀i, i ∈ {ENT} ((Li,t k⁻¹ ∧ ∃t', t' < t, Li,t' (d(x, k⁻¹))) ∨ (∃y (R(i, t, y) ∧ C(y, d(x, k⁻¹)))) ⊃ Li,t (d(x, k⁻¹)))
A9. ∀t (∃i, i ∈ {ENT}, Li,t k⁻¹ ⊃ ∀j, j ∈ {ENT/i}, ¬Lj,t k⁻¹)
A10. ∀t ∀x (∀i, i ∈ {ENT} (Li,t d(x, k⁻¹) ⊃ Li,t x))
4. PROTOCOL VERIFICATION
4.1. Verification Using BAN Logic. The original NSSK protocol without idealization has been presented in the previous section. The corresponding idealized protocol, using BAN logic notations and constructs, is as follows:
Message 2   S → A : {Na, (A ↔Kab B), #(A ↔Kab B), {(A ↔Kab B)}Kbs}Kas
Message 3   A → B : {(A ↔Kab B)}Kbs
Message 4   B → A : {Nb, (A ↔Kab B)}Kab from B
Message 5   A → B : {Nb, (A ↔Kab B)}Kab from A
The initial assumptions are:
A believes A ↔Kas S ;   B believes B ↔Kbs S ;   S believes A ↔Kab B ;
A believes (S controls A ↔K B) ;   B believes (S controls A ↔K B) ;
A believes (S controls #(A ↔K B)) ;
A believes #(Na) ;   B believes #(Nb) ;
S believes #(A ↔Kab B) ;   B believes #(A ↔Kab B).
Once all the assumptions have been written, verification is conducted to prove that certain formulas hold as conclusions. These conclusions describe the goal of an authentication protocol. The authentication between A and B is complete if there is a K such that the following beliefs are attained:
A believes A ↔Kab B ;   B believes A ↔Kab B ;
A believes B believes A ↔Kab B ;   B believes A believes A ↔Kab B.
and then by the nonce-verification and the jurisdiction postulates, we immediately obtain
B believes A ↔Kab B.
the key Kab. A can then deduce that B believes in the key, A believes B believes A ↔Kab B. In
Message 5, A replies similarly, and then B can deduce that A also believes in the key, B believes A believes A ↔Kab B. This result shows that the NSSK protocol attained its objective.
The verification using BAN logic proved that there is no flaw in the protocol.
4.2. Protocol Verification Using CS Logic. The first step of the verification procedure is the formalization of the protocol using CS logic notations. The NSSK protocol is then rewritten as
Message 2   KA,t2 (R(A, t2, {Na, B, Kab, {Kab, A}Kbs}Kas))
Message 3   KB,t3 (R(B, t3, {Kab, A}Kbs))
Message 4   KA,t4 (R(A, t4, {Nb}Kab))
Message 5   KB,t5 (R(B, t5, {Nb}Kab)).
The goals of this protocol are specified as follows:
Goal 1 (Message 2) : KA,t2 (∃t, t0 < t < t2, S(S, t, {Na, B, Kab, {Kab, A}Kbs}Kas))
Goal 2 (Message 3) : KB,t3 (∃t, t2 < t < t3, S(A, t, {Kab, A}Kbs))
Goal 3 (Message 4) : KA,t4 (∃t, t3 < t < t4, S(B, t, {Nb}Kab))
Goal 4 (Message 5) : KB,t5 (∃t, t4 < t < t5, S(A, t, {Nb}Kab)).
The initial assumptions are
LA,t0 (Kas) ; LB,t0 (Kbs) ; LS,t0 (Kas) ; LS,t0 (Kbs) ;
KA,t0 (∀i, i ∈ {ENT}, ∀t', t' < t0, ¬Li,t' (Na)) ;
KB,t0 (∀i, i ∈ {ENT}, ∀t', t' < t0, ¬Li,t' (Nb)).
Message 2 of the protocol is analyzed in order to determine whether Goal 1 can be derived using the axioms and the inference rules of the CS logic. By using axiom A6 and inference rule R3 on Message 2, KA,t2 (R(A, t2, {Na, B, Kab, {Kab, A}Kbs}Kas)), we obtain ∃i, i ∈ {ENT/A}, ∃t', t' < t2, S(i, t', {Na, B, Kab, {Kab, A}Kbs}Kas). By A9, only S knows the key Kas, so that KA,t2 (∃t', t' < t2, S(S, t', {Na, B, Kab, {Kab, A}Kbs}Kas)). This can be compared to Goal 1, except that the time range is not restricted to being after t0. This shows that there is a flaw in the NSSK protocol.
BAN logic and CS logic are two logical approaches for the formal verification of cryptographic security protocols. BAN logic is built on the logic of belief, while CS logic is built on the logic of belief and the logic of knowledge. The analysis of a protocol using BAN logic consists of four stages: idealization; definition of assumptions about the initial states; attachment of logical formulas to the statements of the protocol; and application of the logical postulates to the assumptions and the assertions in order to discover the beliefs held by the parties in the protocol. The verification process using CS logic involves four steps: formalization of the protocol messages; specification of the initial assumptions; specification of the protocol goals; and application of the axioms and inference rules. The objective of the verification is to prove whether the desired goals of the protocol can be derived from the initial assumptions and protocol steps. A difference between BAN logic and CS logic is that the latter employs timelines in the verification: the time range of every step in the protocol can be verified using CS logic. The verification of the NSSK protocol using CS logic shows that the protocol has a flaw, while BAN logic failed to show that the protocol has a flaw.
6. CONCLUSION
References
[1]. BURROWS, M., ABADI, M., AND NEEDHAM, R., A Logic of Authentication, DEC System Research Centre
Report, 39, February 1989.
[2]. GONG, L., NEEDHAM, R., AND YAHALOM, R., Reasoning about Belief in Cryptographic Protocols, Proceedings
1990 IEEE Symposium on Research in Security and Privacy, IEEE Computer Society Press, 234-248. 1990.
[3]. SYVERSON, P. F. AND VAN OORSCHOT, P. C., On Unifying Some Cryptographic Protocol Logics, IEEE
Symposium on Research in Security and Privacy, 14-28, 1990.
[4]. ABADI, M., AND TUTTLE, M., A Semantics for A Logic of Authentication, Proceedings of the ACM Symposium
of Principles of Distributed Computing, ACM Press, 201-216, 1991.
[5]. VAN OORSCHOT, P. C., Extending Cryptographic Logics of Belief to Key Agreement Protocols, Proceedings of
The 1st ACM Conference on Communications and Computer Security, 1993.
[6]. COFFEY, T., AND SAIDHA, P. Logic for Verifying Public-Key Cryptographic Protocols, IEEE Proceedings
Online, 1996.
[7]. NEEDHAM, R. M., AND SCHROEDER, M. D., Using Encryption for Authentication in Large Networks of
Computers. Commun. ACM., 1978, 21 (12), pp.993-999.
[8]. LAL, S., JAIN, M., AND CHAPLOT, V., Approaches to Formal Verification of Security Protocols.
http://arxiv.org/ftp/arxiv/papers/1101/1101.1815.pdf.
D. L. CRISPINA PARDEDE
Gunadarma University
e-mail: [email protected]
MAUKAR
Gunadarma University
e-mail: [email protected]
SULISTYO PUSPITODJATI
Gunadarma University
e-mail: [email protected]
Proceedings of ”The 6th SEAMS-UGM Conference 2011”
Computer, Graph and Combinatorics, pp. 631–646.
Abstract. A high-level specification language PROMELA can be used not only to model
interactions that occur in distributed or reactive systems, but also to express requirements
of logical correctness about those interactions. Several approaches to a formal semantics
for PROMELA have been presented, ranging from the less complete formal semantics to
the more complete ones. This paper presents a significantly different approach to provide
a formal semantics for PROMELA model, namely by an operational semantics given as
a set of Structured Operational Semantics (SOS) rules. The operational semantics of a
PROMELA statement with variables and channels is given by a program graph. The
program graphs for the processes of a PROMELA model constitute a channel system.
Finally, the transition system semantics for channel systems yields a transition system
that formalizes the stepwise behavior of the PROMELA model.
Keywords and Phrases: PROMELA, formal semantics, SOS rules, program graphs, chan-
nel systems, transition systems.
1. PRELIMINARIES
It is still a challenging problem to build automated tools to verify systems especially
reactive ones and to provide simpler formalisms to specify and analyze the system’s
behavior. Such specification languages should be simple and easy to understand, so that
users do not require a steep learning curve in order to be able to use them [1]. Besides,
they should be expressive enough to formalize the stepwise behavior of the processes and
their interactions. Furthermore, they must be equipped with a formal semantics which
renders the intuitive meaning of the language constructs in an unambiguous manner.
The objective of this work is to assign to each model built in the specification language
PROMELA a (labeled) transition system that can serve as a basis for further automated
analysis, e.g., simulation or model checking against temporal logical specifications.
A PROMELA model consists of a finite number of processes to be executed
concurrently. PROMELA supports communications over shared variables and message
passing along either synchronous or buffered FIFO-channels. The formal semantics of
a PROMELA model can be provided by means of a channel system, which then can
be unfolded into a transition system [1]. In PROMELA, the stepwise behavior of the
processes is specified using a guarded command language with several features of classical
imperative programming languages (variable assignments, conditional and repetitive
commands, and sequential composition), communication actions where processes may
send and receive messages from the channels, and atomic regions that avoid undesired
interleaving [1].
Several studies on the formal semantics of PROMELA have been carried out with
various approaches [2, 3, 4]. Considering these previous studies, the derivation approach
of formal semantics of PROMELA presented here consists of three phases of transfor-
mation. First, a PROMELA model is transformed into the corresponding program
graphs; then the generated program graphs will constitute a channel system. Finally,
the transition system semantics for the resulting channel systems then produces a tran-
sition system of the PROMELA model that formalizes the operational behavior of that
model. The rules of transitions presented here are operational semantics given as a set
of Structured Operational Semantics (SOS) rules. SOS rules are a standard way to
provide formal semantics for process algebra, and are useful for every kind of operational
semantics.
The discussion of the LTS semantics of PROMELA with this approach should be
considered an initial version that covers only a small part of PROMELA's features.
Therefore, in order to handle more features, further research will be required in
the near future. One contribution of this research is an initial use of a labeled
transition system (LTS) to explain the behavior of a PROMELA model.
1.1. PROMELA. PROMELA is a modeling language equipped with communication
primitives that facilitate an abstraction of the analyzed systems, suppressing details that
are not related to the characteristics being modeled. A model in PROMELA consists of
a collection of processes that interact by means of message channels and shared variables.
All processes in the model are global objects. In a PROMELA model, initially only
one process is executed, while all other processes are executed after a run statement [6].
A process of type init must be declared explicitly in every PROMELA model and it
can contain run statements of other processes. The init process is comparable to the
function main() of a standard C program. Processes can also be created by adding
active in front of the proctype declaration as shown in Figure 1.
Line (1) defines a process named Bug, with a formal parameter x of type byte; the
body of the process is in between { and }. Line (2) defines the init process containing two run
statements of process Bug with different actual parameters; the processes start executing
after the run statement. Line (3) creates three processes named Bar, and the processes
are executed immediately.
Communications among processes are modeled by using message channels that
are capable of describing data transfers from one process to another; they can be either
buffered or rendezvous. For buffered communications, a channel is declared with a
maximum length of no less than one, for example, chan buffname = [N] of byte, where
N is a positive constant that defines the size of the buffer. The policy of channel
communication in message passing is FIFO (first-in-first-out). PROMELA also allows
run One(qname[0]);
run Two(qforb);
qname[0] ! qforb
}
If a receive operation attempts to retrieve more parameters than are available, the values of the extra parameters will be
undefined; on the other hand, if it retrieves fewer parameters than the number that
was sent, the extra values will be lost.
The send operation is executable only when the channel being used is not full,
while the receive operation is executable only when the channel (storing the values)
is not empty. Figure 3 shows an example that uses some of these mechanisms of data
communication using message channels. The process of type One has two channels, q1
and q2; the first is a parameter and the second a local channel. The process of
type Two has only one channel, qforb, as a parameter. Channel qforb is not declared as
an array and therefore does not need an index in the send operation at the end of the
init process. The value printed by the process of type Two will be 123.
The discussion so far is about asynchronous communications between processes
via message channels created in statements such as chan qname = [N] of { byte },
where N is a positive constant that defines the buffer size. A channel size of zero, as
in chan port = [0] of { byte }, defines a rendezvous port that can only pass, but
not store, single-byte messages. Message interactions via such rendezvous ports are
synchronous, by definition. Figure 4 gives an example to illustrate this situation. The
two run statements are placed in an atomic sequence to enforce the two processes to start
simultaneously. They do not need to terminate simultaneously, nor to complete
running before the atomic sequence terminates. Channel name is a global rendezvous
port. The two processes synchronously execute their first statement : a handshake on
message msgtype and a transfer of the value 124 to local variable state. The second
statement in process of type XX is not executable, since there is no matching receive
operation in process of type YY.
PROMELA allows several general structures of control flow, namely atomic sequences,
conditionals (if-statement), repetition (do-statement), and unconditional jumps
(goto) [7]. The if-statement has a positive number of choices (guards). The if-statement
is executable if at least one of its guards is executable; when more than one guard is
executable, one of them is chosen non-deterministically.
#define msgtype 33
chan name = [0] of { byte, byte };
proctype XX() {
name!msgtype(124);
name!msgtype(121) }
proctype YY() {
byte state;
name?msgtype(state) }
init { atomic { run XX(); run YY() } }
if
:: (n % 2 != 0) -> n = 1
:: (n >= 0) -> n = n-2
:: (n % 3 == 0) -> n = 3
:: else -> skip
fi;
proctype keeper()
{
do
:: timeout -> guard!reset
od
}
proctype monitor () {
(1) ............ assert (n <= 3);
}
proctype receiver () {
...
toReceiver ? msg;
(2) ............ assert (msg != ERROR);
...
}
The keeper process shown above is the definition of the process that will send a reset message to a channel named guard
whenever the system is blocked.
The assert statement, i.e., assert(any boolean condition), is always executable. If
the specified boolean condition holds (true), the statement has no effect; otherwise,
the statement will produce an error report during the verification process. The assert
statement is often used within PROMELA models to check whether certain properties
are valid in a state. Figure 7 shows that the assert statement in line (1) will have no effect
whenever the value of variable n is less than or equal to 3; otherwise it will produce an
error report during the verification process, and similarly for the assert statement in line (2).
Another interesting statement in PROMELA is the unless statement, with syntax
{statement1} unless {statement2}. The mechanism of execution of the unless statement
can be explained as follows. The starting point of the execution is in statement1, but
before each statement in statement1 is executed, the enabledness of statement2 is checked. If
statement2 is enabled then statement1 is aborted and statement2 is executed; otherwise
statement1 is executed. Figure 8 illustrates the use of the unless statement in a code
fragment. The result of the statement execution depends on the value of c: if c is equal
to 4 then x will be equal to 0, since the statement {x != 4; x = 1} is then not enabled and the
statement {x > 3; x = 0} is executed. In case the value of c is 5, then x is equal to 1,
since the statement {x != 4; x = 1} is enabled. This means that the statement {x > 3; x = 0} is
aborted and the statement {x != 4; x = 1} is executed.
byte x = c;
{ x > 3; x = 0 }
unless
{ x != 4; x = 1 }
1.3. Program Graph. A program graph (PG) over a set of typed variables is a
digraph (directed graph) whose edges are labeled with conditions on these variables
and actions. The effect of the actions is formalized by means of a mapping Effect :
Act × Eval(Var) → Eval(Var), which indicates how the evaluation η of the variables is
changed by performing an action. The formal definition of a program graph is as follows.
Definition 1.2. A program graph PG over a set Var of typed variables is a tuple
(Loc, Act, Effect, ↪, Loc0, g0) where Loc is a set of locations, Act is a set of actions,
Effect : Act × Eval(Var) → Eval(Var) is the effect function, ↪ ⊆ Loc × Cond(Var) ×
Act × Loc is the conditional transition relation, Loc0 ⊆ Loc is a set of initial locations,
and g0 ∈ Cond(Var) is the initial condition.
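A minimal sketch (ours; the class and field names are not from the paper) of Definition 1.2 as a data structure, with Effect taking an action together with a variable evaluation:

from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Tuple

Eval = Dict[str, Any]                       # a variable evaluation eta

@dataclass
class ProgramGraph:
    locations: List[Any]                    # Loc
    actions: List[str]                      # Act
    effect: Callable[[str, Eval], Eval]     # Effect(alpha, eta) -> eta'
    edges: List[Tuple]                      # conditional transitions (loc, guard g, alpha, loc')
    init_locations: List[Any]               # Loc0
    init_cond: Callable[[Eval], bool]       # g0

# Example: one counter variable x and a single edge  start -(x < 3 : inc)-> start
def effect(alpha: str, eta: Eval) -> Eval:
    if alpha == "inc":
        return {**eta, "x": eta["x"] + 1}   # inc increases x by one
    return eta

pg = ProgramGraph(
    locations=["start"],
    actions=["inc"],
    effect=effect,
    edges=[("start", lambda eta: eta["x"] < 3, "inc", "start")],
    init_locations=["start"],
    init_cond=lambda eta: eta["x"] == 0,
)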
We write ℓ -g:α-> ℓ′ as shorthand for (ℓ, g, α, ℓ′) ∈ ↪. The condition g is called the guard of
the conditional transition ℓ -g:α-> ℓ′; if g is a tautology, the conditional transition is
simply written ℓ -α-> ℓ′. The behavior in a location ℓ ∈ Loc depends on the current variable
evaluation η. A nondeterministic choice is made between all transitions ℓ -g:α-> ℓ′ which
satisfy condition g in evaluation η (i.e., η ⊨ g). The execution of α changes the variable
evaluation according to Effect(α, η), and the system subsequently moves to ℓ′; otherwise,
the system stops.
Each program graph can be interpreted as a transition system. The underlying
transition system of a program graph results from unfolding (or flattening). Its states
consist of a control component, i.e., a location of the program graph, together with an
evaluation η of the variables. States are thus pairs of the form ⟨ℓ, η⟩. An initial state
is an initial location together with an evaluation that satisfies the initial condition g0. To
formulate properties of the system described by a program graph, the set AP of atomic
propositions comprises the locations ℓ ∈ Loc and Boolean conditions on the variables.
Definition 1.3. (Transition System Semantics of a Program Graph) The transition
system TS(PG) of program graph PG = (Loc, Act, Effect, ↪, Loc0, g0) over variable
set Var is (S, Act, →, I, AP, L) where
• S = Loc × Eval(Var),
• → ⊆ S × Act × S is defined by the rule: if ℓ -g:α-> ℓ′ and η ⊨ g, then ⟨ℓ, η⟩ -α-> ⟨ℓ′, Effect(α, η)⟩,
• I = {⟨ℓ, η⟩ | ℓ ∈ Loc0, η ⊨ g0},
• AP = Loc ∪ Cond(Var),
• L(⟨ℓ, η⟩) = {ℓ} ∪ {g ∈ Cond(Var) | η ⊨ g}.
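Building on the ProgramGraph sketch above, the following sketch (ours) unfolds the reachable part of TS(PG) by applying the rule of Definition 1.3:

def unfold(pg, init_evals):
    # init_evals: candidate initial evaluations; only those satisfying g0 are kept
    init = [(l, frozenset(e.items())) for l in pg.init_locations
            for e in init_evals if pg.init_cond(e)]
    states, transitions, todo = set(init), [], list(init)
    while todo:
        loc, frozen = todo.pop()
        eta = dict(frozen)
        for (l, guard, alpha, l2) in pg.edges:
            if l == loc and guard(eta):                  # rule premise: eta |= g
                eta2 = pg.effect(alpha, eta)             # Effect(alpha, eta)
                target = (l2, frozenset(eta2.items()))
                transitions.append(((loc, frozen), alpha, target))
                if target not in states:
                    states.add(target)
                    todo.append(target)
    return states, transitions

states, trans = unfold(pg, [{"x": 0}])
print(len(states), len(trans))   # 4 reachable states (x = 0..3) and 3 transitions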
1.5. Communication via Shared Variables. The interleaving operator ||| can be
used to model asynchronous concurrency in which the subprocesses act completely
independently of each other, i.e., without any form of message passing or contention
on shared variables. The interleaving operator for transition systems is, however, too
simplistic for most parallel systems with concurrent or communicating components.
In order to deal with parallel programs with shared variables, an interleaving
operator will be defined on the level of program graphs (instead of directly on transition
systems). The interleaving of program graphs PG1 and PG2 is denoted PG1 ||| PG2.
The underlying transition system of the resulting program graph PG1 ||| PG2, i.e.,
TS(PG1 ||| PG2), describes a parallel system whose components communicate via shared
variables. In general, TS(PG1 ||| PG2) ≠ TS(PG1) ||| TS(PG2).
Definition 1.5. (Interleaving of Program Graphs)
Let PGi = (Loci, Acti, Effecti, ↪i, Loc0,i, g0,i), for i = 1, 2, be two program graphs
over the variables Vari. The program graph PG1 ||| PG2 over Var1 ∪ Var2 is defined by
PG1 ||| PG2 = (Loc1 × Loc2, Act1 ⊎ Act2, Effect, ↪, Loc0,1 × Loc0,2, g0,1 ∧ g0,2)
where ↪ is defined by the rules: if ℓ1 -g:α->1 ℓ1′, then ⟨ℓ1, ℓ2⟩ -g:α-> ⟨ℓ1′, ℓ2⟩; and if
ℓ2 -g:α->2 ℓ2′, then ⟨ℓ1, ℓ2⟩ -g:α-> ⟨ℓ1, ℓ2′⟩.
The program graphs PG1 and PG2 have the variables Var1 ∩ Var2 in common.
These are the shared (sometimes also called global) variables. The variables in
Var1 \ Var2 are the local variables of PG1, and similarly, those in Var2 \ Var1 are the
local variables of PG2.
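A minimal sketch (ours) of Definition 1.5, constructing PG1 ||| PG2 from two ProgramGraph values as defined in the earlier sketch; action names of the two graphs are assumed to be distinct:

def interleave(pg1, pg2):
    locations = [(l1, l2) for l1 in pg1.locations for l2 in pg2.locations]
    # each edge of PG1 acts on the left component, each edge of PG2 on the right
    edges = [((l1, l2), g, a, (l1p, l2))
             for (l1, g, a, l1p) in pg1.edges
             for l2 in pg2.locations] + \
            [((l1, l2), g, a, (l1, l2p))
             for (l2, g, a, l2p) in pg2.edges
             for l1 in pg1.locations]
    effect = lambda a, eta: pg1.effect(a, eta) if a in pg1.actions else pg2.effect(a, eta)
    return ProgramGraph(
        locations=locations,
        actions=pg1.actions + pg2.actions,   # disjoint union, assuming distinct names
        effect=effect,
        edges=edges,
        init_locations=[(l1, l2) for l1 in pg1.init_locations for l2 in pg2.init_locations],
        init_cond=lambda eta: pg1.init_cond(eta) and pg2.init_cond(eta),
    )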
1.6. Handshaking. The term handshaking means that concurrent processes that want
to interact have to do this in a synchronous fashion. Hence, processes can interact only
if they are both participating in this interaction at the same time, i.e., they shake hands [1].
Definition 1.6. (Handshaking, i.e., Synchronous Message Passing) Let TSi = (Si, Acti, →i, Ii, APi, Li), i = 1, 2, be transition systems and H ⊆ Act1 ∩ Act2 with τ ∉ H. The transition system TS1 ||H TS2 is defined as follows:
TS1 ||H TS2 = (S1 × S2, Act1 ∪ Act2, →, I1 × I2, AP1 ∪ AP2, L), where L(⟨s1, s2⟩) =
L1(s1) ∪ L2(s2), and the transition relation → is defined by the rules:
• interleaving for α ∉ H: if s1 -α->1 s1′, then ⟨s1, s2⟩ -α-> ⟨s1′, s2⟩; and if s2 -α->2 s2′, then ⟨s1, s2⟩ -α-> ⟨s1, s2′⟩;
• handshaking for α ∈ H: if s1 -α->1 s1′ and s2 -α->2 s2′, then ⟨s1, s2⟩ -α-> ⟨s1′, s2′⟩.
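A minimal sketch (ours) of the two rules of Definition 1.6, with each transition system given simply as a list of transitions and a list of states:

def handshake(trans1, trans2, states1, states2, H):
    # trans1/trans2: lists of (s, alpha, s'); returns the transitions of TS1 ||_H TS2
    composed = []
    for (s1, a, s1p) in trans1:
        if a not in H:                                   # interleaving, left component
            composed += [((s1, s2), a, (s1p, s2)) for s2 in states2]
    for (s2, a, s2p) in trans2:
        if a not in H:                                   # interleaving, right component
            composed += [((s1, s2), a, (s1, s2p)) for s1 in states1]
    for (s1, a, s1p) in trans1:                          # handshaking: both move together
        for (s2, b, s2p) in trans2:
            if a in H and a == b:
                composed.append(((s1, s2), a, (s1p, s2p)))
    return composed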
In a program graph over (Var, Chan), each transition is either a usual conditional transition (labeled with a guard and an action) or one of the communication actions, with their respective intuitive meaning:
c!v : transmit the value v along channel c,
c?x : receive a message via channel c and assign it to variable x.
Let Comm = {c!v, c?x | c ∈ Chan, v ∈ dom(c), x ∈ Var with dom(x) ⊇ dom(c)} denote the
set of communication actions, where Chan is a finite set of channels with typical element
c.
The transition relation ↪ of a program graph over (Var, Chan) consists of two
types of conditional transitions. Conditional transitions ℓ -g:α-> ℓ′ are labeled with guards
and actions; these conditional transitions can happen if the guard holds. Alternatively,
conditional transitions may be labeled with communication actions. This yields
conditional transitions of the types ℓ -g:c!v-> ℓ′ (for sending v along c) and ℓ -g:c?x-> ℓ′ (for receiving a
message along c).
2. TRANSFORMATION
PROMELA is a descriptive language used to model, in particular, concurrent systems.
A PROMELA model P mostly consists of a finite number of processes P1, . . . , Pn
to be executed concurrently. PROMELA supports communication over shared variables
and message passing along either synchronous or asynchronous (buffered FIFO) channels.
The formal semantics of a PROMELA program can be provided by means of a channel
system, which then can be unfolded into a transition system.
As already mentioned in the previous section, the discussion here will only cover a
small part of PROMELA's features, concentrating primarily on the basic elements of
PROMELA. A basic PROMELA model consists of statements that represent
the operational behavior of the processes P1, P2, . . . , Pn together with a Boolean condition
on the final values of the program variables. It is represented as P = [P1 | P2 | . . . | Pn],
where each process Pi is normally built from one or more statements; thus, the
statements formalize the operational behavior of the process Pi. The main elements
of the statements are the atomic command (skip), variable assignment (x := expr),
the communication activities of reading a value for variable x from channel c (c?x) and sending
the current value of expression expr over channel c (c!expr); conditional commands
(if..fi); and repetitive commands (do..od). The syntax of basic PROMELA statements
is shown in Figure 9.
Since a PROMELA statement itself is built from variables, expressions and channels,
it is useful to discuss these briefly before proceeding with the discussion of statements.
The variables in a basic PROMELA model P
are used to store either global information about the system as a whole or information that
is local to one specific process Pi, depending on where the variable declaration takes
place. They may be of a basic type: bit, Boolean, byte, short, integer or channel.
Similarly, data domains for the channels must be specified: they must also be declared
[Figure (fragment): program graph of a conditional command, with locations Initial state, conditional command, y := x, and Exit, and edges labeled true : x := 0, x > 1 : y := x + y, and true : y := x.]
For conditional commands, the set of sub-statements is defined as the set consisting
of the if..fi statement itself and the sub-statements of its guarded commands. Thus, its
sub-statements are defined as:
sub(conditional command) = {conditional command} ∪ sub(stmnt1) ∪ . . . ∪ sub(stmntn).
For example, sub(if :: x > 1 → y := x + y :: true → x := 0; y := x fi) = {if ::
x > 1 → y := x + y :: true → x := 0; y := x fi, y := x + y, x := 0; y := x, y :=
x, exit}.
The sub-statements of a loop command (loop = do :: g1 → stmnt1 . . . :: gn →
stmntn od) are defined as: sub(loop) = {loop, exit} ∪ {stmnt; loop | stmnt ∈ sub(stmnt1) \ {exit}} ∪
. . . ∪ {stmnt; loop | stmnt ∈ sub(stmntn) \ {exit}}. For example, sub(do :: x > 1 → y :=
x + y :: y < x → x := 0; y := x od) = {do :: x > 1 → y := x + y :: y < x →
x := 0; y := x od, y := x + y; do :: x > 1 → y := x + y :: y < x → x := 0; y :=
x od, x := 0; y := x; do :: x > 1 → y := x + y :: y < x → x := 0; y := x od, y :=
x; do :: x > 1 → y := x + y :: y < x → x := 0; y := x od, exit}.
For atomic regions atomic{stmnt}, the sub-statement is defined as:
sub(atomic{stmnt}) = {atomic{stmnt}, exit}.
Then, for example the sub-statements of atomic{b1 := true; x := 2} is sub(atomic
{b1 := true; x := 2}) = {atomic{b1 := true; x := 2}, exit}.
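A minimal sketch (ours; statements are encoded as nested tuples of our own choosing) of the sub-statement function for the constructs above; the case for sequential composition follows the standard definition, whose statement falls in a part of the text that is cut off here:

def sub(stmnt):
    kind = stmnt[0]
    if kind in ("skip", "assign", "send", "recv"):       # basic statements
        return {stmnt, ("exit",)}
    if kind == "seq":                                    # stmnt1 ; stmnt2
        _, s1, s2 = stmnt
        return {("seq", t, s2) if t != ("exit",) else s2 for t in sub(s1)} | sub(s2)
    if kind == "if":                                     # if :: g1 -> s1 ... fi
        return {stmnt} | set().union(*(sub(s) for _, s in stmnt[1]))
    if kind == "do":                                     # do :: g1 -> s1 ... od
        loops = {("seq", t, stmnt) for _, s in stmnt[1]
                 for t in sub(s) if t != ("exit",)}
        return {stmnt, ("exit",)} | loops
    if kind == "atomic":
        return {stmnt, ("exit",)}
    raise ValueError(kind)

loop = ("do", ((("gt", "x", 1), ("assign", "y", "x+y")),
               (("lt", "y", "x"), ("seq", ("assign", "x", "0"), ("assign", "y", "x")))))
print(len(sub(loop)))   # 5: the loop, exit, and one 'stmnt; loop' per proper sub-statement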
3. INFERENCE RULES
The inference rules for the atomic commands, such as skip, assignment, communi-
cation actions, and sequential composition, conditional and repetitive commands give
rise to the edges of a large program graph in which the set of locations agrees with the
set of basic PROMELA statements [1]. Thus, the edges have the form :
stmnt -g:α-> stmnt′  or  stmnt -g:comm-> stmnt′.
(1) The execution of the atomic command skip has the trivial guard (true) and terminates in one step:
skip -true : id-> exit
where id denotes an action that does not change the values of the variables, i.e.,
for all variable evaluations η, Effect(id, η) = η.
(2) Similarly, the execution of a statement consisting of an assignment x := expr
has trivial guard (true) and terminates in one step.
x := expr -true : assign(x, expr)-> exit
where assign(x, expr) denotes the action that changes the value of variable
x according to the assignment x := expr and does not affect the other variables,
i.e., for every variable evaluation η (η ∈ Eval(Var)) and every y ∈ Var with y ≠ x,
Effect(assign(x, expr), η)(y) = η(y), while Effect(assign(x, expr), η)(x)
is the value of expr when evaluated over η.
(3) For the communication actions c!expr and c?x the following axioms apply:
provided cap(c) ≠ 0,
c?x -dom(c) ⊆ dom(x) : c?x-> exit,
and provided len(c) < cap(c),
c!expr -dom(Eval(expr)) ⊆ dom(c) : c!expr-> exit,
where cap(c) is the maximum capacity of channel c, len(c) is the current number of
messages in channel c, dom(·) denotes the domain (set of values) of a type, and
Eval(expr) is the value of expression expr after evaluation.
(4) For an atomic region atomic{x1 := expr1; . . . ; xm := exprm}, its effect is defined
as the cumulative effect of the assignments xi := expri. It can be defined by the
rule:
atomic{x1 := expr1; . . . ; xm := exprm} -true : αm-> exit
where α0 = id and Effect(αi, η) = Effect(assign(xi, expri), Effect(αi−1, η)) for 1 ≤ i ≤ m.
(5) There are two defined rules for sequential composition stmnt1 ; stmnt2 that
distinguish whether stmnt1 terminates in one step. If stmnt1 does not terminate
in one step, then the following rule applies :
if stmnt1 -g:α-> stmnt1′ and stmnt1′ ≠ exit, then stmnt1; stmnt2 -g:α-> stmnt1′; stmnt2.
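A minimal sketch (ours, using the same tuple encoding as in the earlier sketch) of how rules (1), (2) and (5) generate program-graph edges; the case where stmnt1 terminates (so the target becomes stmnt2) is the standard complement of rule (5), whose text is cut off by the page break:

def edges(stmnt):
    kind = stmnt[0]
    if kind == "skip":                                   # rule (1)
        return [(stmnt, "true", "id", ("exit",))]
    if kind == "assign":                                 # rule (2)
        _, x, expr = stmnt
        return [(stmnt, "true", f"assign({x},{expr})", ("exit",))]
    if kind == "seq":                                    # rule (5) and its complement
        _, s1, s2 = stmnt
        out = []
        for (_, g, a, s1p) in edges(s1):
            # if s1 -g:a-> s1' and s1' != exit, then s1;s2 -g:a-> s1';s2
            target = s2 if s1p == ("exit",) else ("seq", s1p, s2)
            out.append((stmnt, g, a, target))
        return out
    return []                                            # other constructs omitted here

for e in edges(("seq", ("assign", "x", "0"), ("assign", "y", "x"))):
    print(e)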
4. CONCLUDING REMARKS
A new approach to a formal semantics for PROMELA has been presented. This
approach first derives a channel system from a PROMELA model P which consists of a finite
number of processes P1, P2, . . . , Pn to be executed concurrently (P = [P1 | P2 | . . . | Pn]).
The channel system is CS = [PG1 | PG2 | . . . | PGn], where PGi corresponds to process
Pi, so that the transition system of CS is composed of the transition systems of the PGi. Each
program graph PGi can be interpreted as a transition system, where the underlying transition
system of a program graph results from unfolding (flattening). The transition
system semantics for channel systems then yields a transition system TS(P) that formalizes the stepwise
(operational) behavior of the PROMELA model P. This approach is modular, which makes
reasoning about and understanding the semantics easier. It is more practical and fundamental
than the one given in [8]. Therefore, this approach should be more suitable for reasoning
about the implementation of a PROMELA interpreter.
References
[1] Baier, C., and Katoen, J. P., Principles of Model Checking, The MIT Press, Cambridge,
Massachusetts, 2008.
[2] Bevier, W. R., Toward an Operational Semantics of PROMELA in ACL2, Proceedings of the
Third SPIN Workshop, SPIN97, 1997.
[3] Natarajan, V. and Holzmann, G. J., Outline for an Operational Semantic of PROMELA,
Technical report, Bell Laboratories, 1993.
[4] Ruys, T., SPIN and Promela Model Checking, University of Twente, Department of Computer
Science, Formal Methods and Tools, 1993.
[5] Shin, H., Promela Semantics, presentation from The SPIN Model Checker by G. J. Holzmann,
2007.
[6] Spoletini, P., Verification of Temporal Logic Specification via Model Checking, Politecnico Di
Milano Dipartimento di Elettronicae Informazione, 2005.
[7] Vielvoije, E., Promela to Java, Using an MDA approach, TUDelft Software Engineering Research
Group Department of Software Technology Faculty EEMCS, Delft University of Technology Delft,
the Netherlands, 2007.
[8] Weise, C., An Incremental formal semantics for PROMELA, Prentice Hall Software Series,
Englewood Cliffs, 1991.
Suprapto
Universitas Gadjah Mada.
e-mail: [email protected]
Reza Pulungan
Universitas Gadjah Mada.
e-mail: [email protected]
Proceedings of ”The 6th SEAMS-UGM Conference 2011”
Mathematics Education, pp. 647 - 658.
Abstract. This paper presents a brief study on modelling lecturers' performance with a modified
Hotelling-Fuzzy approach. The observation data are considered as a fuzzy set obtained from
a students' survey at the Mathematics Department of the Faculty of Science and Mathematics, SWCU, in the year
2008-2009. The modified Hotelling statistic relies on a harmonic mean instead of the arithmetic mean which
is normally used in the literature. The characterizing function used is an exponential function which
describes the fuzziness. The result is a general measurement of lecturers' performance. Based on
the 4 variables used in the analysis (Utilizes content scope and sequence planning, Clearness of
assignments and evaluations, Systematical in lecturing, and Encourages students attendance) we
obtain that the lecturers' performance is fair, poor, fair, and poor respectively.
Keywords and Phrases: Hotelling, harmonic mean, fuzzy, exponential function, characteristic
function.
1. INTRODUCTION
A university is one of the post-secondary investments in human development in education, where most
activities are based on the learning process in the university. This process relies on
the interaction between lecturers and students. After some period of time, students may acquire
knowledge which may determine their futures. Therefore lecturers are required to transfer
their knowledge such that students are capable of reproducing the knowledge obtained at their
universities as a support for themselves in their working places.
Since a lecturer is an important agent in the effort of human development in a
university, some necessary conditions must be satisfied by a lecturer. This paper proposes
some results on lecturers' performance based on a students' survey. Students evaluate a lecturer
by answering some questions (some are adapted from the literature available on the web), which are listed in
Table 1; these are considered the same items in Indonesian that have been used as
The data used here are obtained from a students' survey in the Mathematics Department of the
Faculty of Science and Mathematics, Satya Wacana Christian University, in the year 2008-2009.
Furthermore, the lecturers' performance cannot be placed exactly at one of the values 0, 1, 2 or 3.
Therefore the data can be considered non-precise and hence we use fuzzy sets in presenting the
analysis.
The remainder of the paper is organized as follows. The theoretical background used is
presented in Section 2. Procedures to analyze the obtained data are shown in Section 3. The
analysis is then shown in Section 4, and finally some conclusions and remarks are written in the
last section of this paper.
2. MODELLING HOTELLING-HARMONIC-FUZZY
ON LECTURERS’ PERFORMANCE
2.1 Review Stage. Some authors have proposed quality evaluations based on mathematical
approaches such as fuzzy sets. Teacher performance is studied with a fuzzy
system [2] to generate assessment criteria. In complex systems such as the cooling process of a
metal, a combined PCA (Principal Component Analysis) algorithm and Hotelling's T² are
used to define a quality index after all data are normalized to the range −1 to 1 [3]. Other
instruments for lecturers' performance have been developed and validated using variable
analysis [1]. Lecturers' strictness has also been estimated using intuitionistic fuzzy sets [4]. There are
also several standards that can be used to indicate the quality of a teacher; these are more
qualitative approaches and are easily found on the internet. In strategic planning, SWOT
is one of the most cited references. In SWOT analysis some uncertainties can be
encountered, and therefore using fuzzy sets renews the classical SWOT by presenting the
internal and external variables in the fuzzy sense, as shown by Ghazinoory et al. [5].
It is assumed that the given data contain non-precision. Thus the data will be represented
as a fuzzy set [6]. The fuzzy data are indicated by a characteristic function. There are
some well-known characteristic functions. In this research, we apply an exponential
characteristic function, i.e.

ξ(x) = L(x) for x < m1;  ξ(x) = 1 for m1 ≤ x ≤ m2;  ξ(x) = R(x) for x > m2,   (1)

with L(x) = exp(−((m1 − x)/a1)^q1) and R(x) = exp(−((x − m2)/a2)^q2).
Figure 1. ξ(x) for the given datum 0.5 (denoted by a star on the peak); the values of the parameters m1, a1, q1 are 0.4980, 0.1, 2 respectively, and the values of the parameters m2, a2, q2 are 0.2, 0.502, 0.8 respectively. The tolerance for x* is reduced. The corresponding α-cut is also shown as a horizontal line for α = 0.2.
A problem that appears here is that the determination of the parameters becomes time consuming
for each value. Up to now, there exists no optimization procedure to find the best values of
the parameters here. Therefore we may vary the parameters based on the given data. Let us consider
the following example.
Example 1: Let the given data be a vector x = [0.5 0.67 0.83 1 1.17 1.33 1.5 1.67 1.83 2]^T.
Let the vector x be an element of the observation space M_X. We want to
express this vector in the fuzzy sense. By trial and error we choose the value of each
parameter. This means that each value in the vector has its own characteristic function. We
will also define the average value of this vector by the harmonic mean and the characteristic
function of this average value. The parameters used for all characteristic functions are
a1 = 0.1, q1 = q2 = 2, a2 = 0.2. These parameters are chosen freely. The illustration of the
characteristic functions is depicted in Figure 2a. The values of m1 and m2 are taken from
the given vector. The characteristic function of the harmonic average is obtained as
ξ(x) = L(x) for x < 1.039;  ξ(x) = 1 for x = 1.039;  ξ(x) = R(x) for x > 1.039,

with L(x) = exp(−((x − 1.039)/0.1)²) and R(x) = exp(−((x − 1.039)/0.2)²).
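As a small numerical check of Example 1 (ours, not the authors' code), the harmonic mean of x and the corresponding exponential characterizing function can be computed as follows; for simplicity the sketch takes m1 = m2 = x_H, which is an assumption:

import numpy as np

x = np.array([0.5, 0.67, 0.83, 1.0, 1.17, 1.33, 1.5, 1.67, 1.83, 2.0])
x_h = len(x) / np.sum(1.0 / x)
print(round(x_h, 3))                       # about 1.04, matching the value 1.039 used above

def xi(t, m1, m2, a1=0.1, a2=0.2, q1=2, q2=2):
    # exponential characterizing function of Equation (1)
    if t < m1:
        return np.exp(-((m1 - t) / a1) ** q1)
    if t <= m2:
        return 1.0
    return np.exp(-((t - m2) / a2) ** q2)

print(round(xi(1.0, x_h, x_h), 3))         # membership of the value 1.0 in the fuzzy harmonic mean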
Thus the characteristic function is defined after the harmonic mean is obtained. The statistical
test for multivariate data is based on the T² statistic, named after Hotelling [7]; in the classical sense

T² = n (X̄ − μ0)′ S⁻¹ (X̄ − μ0)   (2)

and T² ~ T²_{p,n−1}, or equivalently ((n − p) / (p(n − 1))) T² ~ F_{p,n−p}.   (3)
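A minimal sketch (ours; the variable names and data layout are assumptions) of the T² statistic of Equation (2), together with the F quantile used to judge it, using NumPy and SciPy:

import numpy as np
from scipy import stats

def hotelling_T2(X, mu0):
    # X: observations, one row per sample; mu0: hypothesized mean vector
    n, p = X.shape
    xbar = X.mean(axis=0)
    S = np.cov(X, rowvar=False)                    # sample covariance matrix
    diff = xbar - mu0
    return n * diff @ np.linalg.inv(S) @ diff

def critical_value(n, p, alpha):
    # (n-1)p/(n-p) * F_{p, n-p}(alpha), the threshold used in the test below
    return (n - 1) * p / (n - p) * stats.f.ppf(1 - alpha, p, n - p)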
Figure 2. The illustration of all characteristic functions with a1 = 0.1, q1 = 2 and a2 = 0.2, q2 = 2. The values of m1 and m2 are taken from the given vector.
Since T² is distributed as ((n − 1)p / (n − p)) F_{p,n−p}(α), T² can be used for hypothesis
testing. A test of the hypothesis H0 : μ = μ0 versus H1 : μ ≠ μ0 at the α level of significance rejects H0 in favor of
H1 if [7]

T² > ((n − 1)p / (n − p)) F_{p,n−p}(α).   (4)

Thus the control limits of the T² control chart can be formed as [8]

UCL = ((n − 1)p / (n − p)) F_{p,n−p}(α) and LCL = 0.   (5)

There is a reason proposed in [8] for LCL = 0, but it is ignored here. It is convenient to refer to
the simultaneous confidence intervals for each μj, j = 1, …, p, as ([7], page 193)

x̄j − sqrt(((n − 1)p / (n − p)) F_{p,n−p}(α)) sqrt(s_jj / n) ≤ μj ≤ x̄j + sqrt(((n − 1)p / (n − p)) F_{p,n−p}(α)) sqrt(s_jj / n),  j = 1, …, p.

Equation (5) allows us to simplify the statement above as

x̄j − sqrt(UCL · s_jj / n) ≤ μj ≤ x̄j + sqrt(UCL · s_jj / n),  j = 1, …, p.   (6)
In this paper, we will replace the arithmetic mean by a harmonic mean in the fuzzy
sense. The harmonic mean is used to increase the precision of the average quantity (since we
know that the arithmetic mean ≥ the harmonic mean), and the harmonic mean is formulated as

x_H = n / ( Σ_{i=1}^{n} 1/x_i ).
The harmonic mean will be useful if we have highly oscillating data. Additionally, there exists
no literature so far that proceeds in this way, which shows the originality of this paper, although the
number of data used is considerably small. The covariance matrix S used is the usual
one, i.e. S = [S_pq] with

S_pq = (1/(n − 1)) Σ_{i=1}^{n} (x_pi − x̄_p)(x_qi − x̄_q),  for p, q = 1, 2, …, M.
Example 2 shows the idea of Hotelling-Harmonic-Fuzzy. Note that the number of samples
for each item (variable) is 6.
Example 2. Suppose the given data are as shown in Table 2, where each entry is an
average over the n_j students evaluating each lecturer. We assume that the result is
independent of the number of students in each class.
If each number in Table 2 were represented by a characteristic function, then many
parameters would have to be determined. Hence we define the characteristic function of the
harmonic mean of each item. The resulting characteristic function of the harmonic mean is depicted in
Figure 2.
Furthermore, the harmonic mean of each lecturer over all items, and of each item
over all lecturers, are shown in Table 2. Based on the average quantity, one has 1.77, which is
not precisely one of the original values (0, 1, 2, and 3). Since it is close to 2, the lecturers'
performance is considered satisfactory. This paper will evaluate this more rigorously.
The covariance matrix in the classical sense can be obtained by the function cov in
MATLAB; one obtains

S = [ 0.2776 0.2921 0.2748 0.0939
      0.2921 0.3180 0.2759 0.1185
      0.2748 0.2759 0.4447 0.2008
      0.0939 0.1185 0.2008 0.2422 ].

Furthermore, using the harmonic mean, the sample covariance matrix S_h is obtained as

S_h = [ 0.2396 0.2531 0.2449 0.0889
        0.2531 0.2765 0.2488 0.1114
        0.2449 0.2488 0.4015 0.1881
        0.0889 0.1114 0.1881 0.2157 ].

In the remaining paragraphs we will compute all related formulas using the harmonic mean.
One may also observe that the covariance matrix S_h is positive definite by computing its
eigenvalues, which are all positive (0.0015, 0.0717, 0.1710 and 0.8891). We have also
computed the inverse matrix of S_h as

S_h⁻¹ = [ 395.0665 324.4680 71.3350 67.0196
          324.4680 274.6743 53.3587 54.7142
          71.3350 53.3587 20.4292 15.9823
          67.0196 54.7142 15.9823 19.2192 ].

Finally, one needs the Hotelling statistic used in this paper, which is mostly taken from the literature
(Johnson and Wichern, 2007) and reformulated using the harmonic mean.
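As a small check (ours, not the authors' script), the positive definiteness of S_h and its inverse can be verified numerically from the entries quoted above:

import numpy as np

S_h = np.array([[0.2396, 0.2531, 0.2449, 0.0889],
                [0.2531, 0.2765, 0.2488, 0.1114],
                [0.2449, 0.2488, 0.4015, 0.1881],
                [0.0889, 0.1114, 0.1881, 0.2157]])

eigenvalues = np.linalg.eigvalsh(S_h)       # all positive implies S_h is positive definite
print(np.round(eigenvalues, 4))
print(np.round(np.linalg.inv(S_h), 4))      # compare with the inverse quoted in the text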
The hypothesis test with H0 : μ = μ0 = (3, 2, 2, 1)^T against H1 : μ ≠ (3, 2, 2, 1)^T at the level of
significance α = 0.10 is employed here. The observed statistic is T² = 238.9666. Comparing the
observed T² with the critical value, we get T² > 92.434, and consequently we reject H0 at the 10% level of significance.
For completeness, the value of F_{4,2}(0.10) is computed using a MATLAB function by typing
finv(0.90,4,2). Finally, we let fuzziness take place in this hypothesis.
2.2 Hotelling-fuzzy. We agree that the hypothesis is proposed by assuming that the given
μ0 is known. In this case we present the example μ0 = (3, 2, 2, 1)^T with its
characteristic functions, and using the procedure shown in Example 1, we directly obtain the
characteristic functions as depicted in Figure 3. Since we allow μ0 to vary as in Figure 3, we will
also have as many values of T² as the number of points used to define each characteristic function. Thus we
have a vector of values of T², as indicated in Figure 3.
Figure 3. The illustration of all characteristic functions of μ0 with a1 = 0.1, q1 = 2 and a2 = 0.2, q2 = 2. The values of m1 and m2 are taken from the harmonic mean of each item in Table 2. Note that there exist 4 characteristic functions, but the second and the third lie on the same curve.
Note that the given μ0 in the fuzzy sense comprises all points x on the horizontal axis
with ξ(x) > 0. Practically, this means that we need to search the index of each characteristic
function (since each function is discretized) and hence find the corresponding value of x. What
will be a problem here? We always need a set of μ0 (with the number of elements the
same as the predicted one). Let us study ξ(x) = 0.6 for one of the possible memberships of
μ0. We draw a horizontal line that denotes ξ(x) = 0.6 to find intersection points such that we
can draw the vertical lines (denoted by an arrow in Figure 4).
The vertical lines denote the possible new sets of μ0, and we always have two values for
each given characteristic value (membership) that may act as the bounds of each fuzzy μ0.
Let us denote these intervals as I_{μ0,j}, where the index j denotes the j-th component of
μ0. Thus, to make the computation simpler, we choose inf(I_{μ0,j}) and sup(I_{μ0,j}) to continue our
hypothesis test. The number of points used will influence the obtained fuzzy μ0.
Figure 4. The illustration of all μ0 taken from the characteristic functions with a1 = 0.1, q1 = 2 and a2 = 0.2, q2 = 2. The values of m1 and m2 are taken from the harmonic mean of each item in Table 2.
Due to the numerical techniques, the idea of using inf(I_{μ0,j}) and sup(I_{μ0,j}) still cannot
be implemented automatically. Instead, we can do it manually, but this will be time consuming if we have a
large number of points. Thus we leave this problem for future research. In this paper we
apply μ0 in the hypothesis following the usual procedure in the classical sense.
Example 3. Suppose that we have μ0 = (3, 2, 2, 1)^T and the resulting characteristic functions are
shown in Figure 4. If ξ(x) = 0.6, we get intervals for each value of μ0 = (3, 2, 2, 1)^T as shown in
Figure 4. Practically, there exists no ξ(x) = 0.6 precisely, since its value is constructed by
Equation (1). Manually, by drawing the horizontal line and the vertical lines as shown in
Figure 4, we have approximations of inf(I_{μ0,j}) and sup(I_{μ0,j}). We propose that
inf(I_{μ0,j}) = [0.8, 1.9, 1.9, 2.9] and sup(I_{μ0,j}) = [1.1, 2.1, 2.1, 3.2]. These results lead to the
same conclusion: H0 : μ = (3, 2, 2, 1)^T is again rejected at the 10% level of significance.
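A small closed-form alternative (ours) to reading the bounds off the plot: for the exponential characterizing function of Equation (1) with m1 = m2 = μ0,j and the parameters quoted in the caption of Figure 4 (an assumption), the points where ξ(x) = 0.6 solve exp(−(d/a)^q) = 0.6:

import numpy as np

mu0 = np.array([3.0, 2.0, 2.0, 1.0])
a1, a2, q = 0.1, 0.2, 2
alpha = 0.6

r = (-np.log(alpha)) ** (1.0 / q)            # solves exp(-(d/a)^q) = alpha for d = a * r
inf_I = mu0 - a1 * r                         # left ends of the alpha-cut intervals
sup_I = mu0 + a2 * r                         # right ends of the alpha-cut intervals
print(np.round(inf_I, 2), np.round(sup_I, 2))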
On the other hand, one may determine all sets of μ0 to find a 100(1 − α)% confidence
region for the mean of a p-dimensional normal distribution such that

T² = n (X̄ − μ0)′ S⁻¹ (X̄ − μ0) ≤ ((n − 1)p / (n − p)) F_{p,n−p}(α)   (7)

provided the covariance matrix S is positive definite. The complete observation will be
examined in Section 4. One may introduce Principal Component Analysis (PCA) or variable
analysis to reduce the number of variables if necessary.
3. RESEARCH METHOD
The data used are taken from the students' survey. We assume that the observations
are independent of the given lecturers and of the number of students in each class. Table 1
contains the questions which are used to evaluate lecturers' performance. The result is shown in
Table 3.
TABLE III
STUDENTS SURVEY FOR 6 LECTURERS BASED ON 16 VARIABLES.
THE SYMBOL Dj INDICATES THE NAME OF THE j-TH LECTURER.
No D1 D2 D3 D4 D5 D6
1 4.00 2.67 2.89 2.71 2.62 3.20
2 4.00 2.51 3.00 2.57 2.58 3.05
3 4.00 2.72 2.11 2.29 3.01 2.83
4 3.20 2.41 2.78 2.14 3.48 2.83
5 4.00 2.59 2.11 2.29 3.14 2.88
6 3.20 2.79 2.89 2.29 3.48 2.85
7 4.00 2.41 3.11 2.43 3.35 3.12
8 3.73 2.33 2.33 2.14 2.88 2.67
9 4.00 2.21 3.00 2.71 2.75 3.02
10 3.73 2.33 2.89 2.29 2.67 3.03
11 3.73 2.62 3.00 2.71 3.40 2.92
12 4.00 2.31 2.56 2.86 3.18 2.43
13 3.47 2.51 3.56 2.86 3.05 3.38
14 3.47 2.23 3.22 2.71 3.05 3.20
15 3.73 2.36 3.11 2.29 2.92 2.92
16 3.73 2.64 2.78 2.57 3.35 3.12
Evaluation is based on Table 2, which means that we have p = 16 as the number of
variables. Unfortunately, the covariance matrix is almost singular, which causes the inverse
matrix to be badly obtained. One may observe this by computing its determinant, which tends to 0
(of order 10^(-170)). According to [7] (page 110), this is caused by the number of samples being less than
the number of variables, which happens for any sample here. This means that some variables
should be removed from the study. The corresponding reduced data matrix will then lead to a
covariance matrix of full rank and nonzero generalized variance. Thus, as mentioned earlier, we
consider the proportion of the total variance explained by the eigenvalues, Σ_{j=1}^{k} λ_j / Σ_{j=1}^{p} λ_j.
Computing these proportions up to the sixth eigenvalue, we observe that only six variables have nonzero proportions. Thus we
will only consider these 6 variables in the analysis. Additionally, we also observe that
λ_j ≈ 0 for j = 7, …, p, whereas λ1 = 3.1702, λ2 = 0.3948, λ3 = 0.2859, λ4 = 0.1110,
λ5 = 0.0274, and λ6 = 0.0016. It is reasonable to choose the first six variables.
Unfortunately, the covariance matrix is then still singular. To handle this problem, the pseudo-inverse
matrix is applied. Again, one needs to note from Equation (2) that the number of
samples is not allowed to equal the number of variables, which leads to division
by zero. Since the eigenvalues represent the variances of the variables, we select only the first four
variables such that n > p. In this case, we are led to the result in Section 2.
The control limit of the control chart can be obtained by using Eq. (4); one obtains

UCL = ((6 − 1) · 4 / (6 − 4)) F_{4,2}(0.10) = 92.4342 and LCL = 0.

Section 2 suggests that we compute the confidence intervals of the reduced data, and we have the
following results:
1.0453 ≤ μ1 ≤ 4.8552 ;  0.8337 ≤ μ2 ≤ 4.9181 ;
0.2776 ≤ μ3 ≤ 5.1428 ;  0.9465 ≤ μ4 ≤ 4.5201 .
Up to now, a qualitative conclusion on the lecturers' performance, which is practically useful
for application, has not yet been reached. These intervals show us that all observed values
lie in the intervals. Finally, we suggest that the evaluation proceed as follows:
1. Compute UCL = ((n − 1)p / (n − p)) F_{p,n−p}(α), where n is the number of observed lecturers and p is
the number of variables used in the evaluation (the number of items used in the survey).
2. Compute the harmonic mean of each variable.
3. Compute the confidence interval of the harmonic mean of each variable using
Equation (6); a small sketch of these steps is given below.
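The following minimal sketch (ours, not the authors' code; the data layout X with one row per lecturer and one column per variable is an assumption) implements the three steps with NumPy and SciPy:

import numpy as np
from scipy import stats

def evaluate(X, alpha=0.10):
    # X: observations, one row per lecturer, one column per survey variable
    n, p = X.shape
    ucl = (n - 1) * p / (n - p) * stats.f.ppf(1 - alpha, p, n - p)   # step 1: Equation (5)
    x_h = n / np.sum(1.0 / X, axis=0)                                # step 2: harmonic mean of each variable
    s_jj = np.var(X, axis=0, ddof=1)                                 # diagonal entries of the covariance matrix
    half = np.sqrt(ucl * s_jj / n)                                   # step 3: half-width of Equation (6)
    return ucl, x_h, x_h - half, x_h + half

# For n = 6 lecturers and p = 4 variables, UCL = 10 * F_{4,2}(0.10), about 92.43.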
TABLE IV
TABLE V
QUALITY OF LECTURERS' PERFORMANCE BASED ON THE DATA SURVEY 2008-2009.
Utilizes content scope and sequence planning (2.75) : Fair
Clearness of assignments and evaluations (1.57) : Poor
Systematical in lecturing (1.61) : Fair
Encourages students attendance (1.39) : Poor
5. CONCLUSION
References
[1] YUSRIZAL AND HALIM, A., "Development and Validation of an Instrument to Assess the Lecturers' Performance in the Education and Teaching Duties", Jurnal Pendidikan Malaysia 34(2), 33-47, 2009.
[2] JIAN MA AND ZHOU, D., "Fuzzy Set Approach to the Assessment of Student-Centered Learning", IEEE Transactions on Education, Vol. 43, No. 2, 2000.
[3] BOUHOUCHE, S., M. LAHRECHE, A. MOUSSAOUI, AND J. BAST, "Quality Monitoring Using Principal Component Analysis and Fuzzy Logic Application in Continuous Casting Process", American Journal of Applied Sciences, 4 (9): 637-644, ISSN 1546-9239, 2007.
[4] SHANNON, A., E. SOTIROVA, K. ATANASSOV, M. KRAWCZAK, P.M. PINTO, T. KIM, "Generalized Net Model of Lecturers' Evaluation of Student Work With Intuitionistic Fuzzy Estimations", Second International Workshop on IFSs, Banska Bystrica, Slovakia, NIFS 12, 4, 22-28, 2006.
[5] GHAZINOORY, A., ZADEH, A.E., AND MEMARIANI, A., "Fuzzy SWOT Analysis", Journal of Intelligent & Fuzzy Systems 18, 99-108, IOS Press, 2007.
[6] VIERTL, R., Statistical Methods for Non-Precise Data, CRC Press, Tokyo, 1996.
[7] JOHNSON, R.A. AND WICHERN, D.W., Applied Multivariate Statistical Analysis, 6th ed., Prentice Hall, ISBN 0-13-187715-1, 2007.
[8] KHALIDI, M.S.A., "Multivariate Quality Control, Statistical Performance and Economic Feasibility", Dissertation, Wichita State University, Wichita, Kansas, 2007.
ADI SETIAWAN
FSM-Satya Wacana Christian University.
e-mail: [email protected]
Proceedings of ”The 6th SEAMS-UGM Conference 2011”
Mathematics Education, pp. 659 - 670.
WARLI
Abstract. This research aims at describing the differences between the creativity of reflective and impulsive
students in solving mathematics problems. The analyses and results are based largely on written tests
and task-based interviews. This research is of an explorative and qualitative nature. The subjects are
junior high school students of reflective and impulsive cognitive styles, as measured by the MFFT
(Matching Familiar Figures Test). Method triangulation was used to validate the collected data by
comparing written test and interview results. This research yields the following differences in
creativity quality between reflective and impulsive students in solving problems. In the planning phase,
reflective students perform slightly better than impulsive ones. In the executing phase, for the novelty and
flexibility qualities, reflective students perform better than impulsive students. In the looking-back phase,
reflective students tend to look back at their work, while impulsive students do not.
Keywords and Phrases: creativity, problem solving, reflective, impulsive, and cognitive styles.
1. INTRODUCTION
Since the 1980's to present problem-solving has been recognized as a difficult task in
learning mathematics. LeBlanc, Proudfit & Putt [1] said that developing skill in problem
solving has long been recognized as one of the important goals in the elementary school
mathematics program. Instruction in problem solving has also been recognized as being a
difficult task. Suryadi et. al. [2] also said that solving the problem is still considered the most
difficult for students to learn and for teachers to teach it. This reinforced the results of tests
conducted by the PISA (Programme for International Student Assessment) Indonesian student
achievement has not been satisfactory. Mode of mathematical problem-solving abilities of
students Indonesia is located at Level 1, which is about 49.7% of students are at the lowest
level. At level 1, students are only able to solve mathematical problems can be solved with
one step [3]. In addition, research Siswono [4] showed that one of the problems in learning
mathematics in junior high is the low ability students in problem solving (word problems), in
particular on problems that are non-routine or open-ended. One cause of the low problem-solving
ability is that planning does not consider varied problem-solving strategies or strategies that
encourage creative thinking skills in finding answers to problems.
Polya [5] says that real problem-solving skill rests on the idea of devising a plan.
Likewise, Orton [6] states that the crucial and sometimes very difficult stages are the middle two,
particularly stage 2 (devising a plan), for which creativity, inventiveness and insight might be
required. From these opinions it is clear that creativity is an important asset for improving
students' problem-solving abilities, and developing problem-solving ability indirectly
develops creative solutions to problems.
Creativity in problem solving is the individual's ability to generate ideas that are
"new" in finding a way or tool to obtain answers to a question (problem) fluently and flexibly.
Fluency in problem solving refers to the diversity (variety) of correct answers the students give to the
problem. Flexibility in problem solving refers to the ability of students to solve
problems in many different ways; students are able to change a solution of a problem into a
different solution. "New" in problem solving refers to the ability of students to answer the
problem with a few different but correct answers, or with an answer that is not usually
given by individuals (students) at their stage of intellectual development or their
knowledge level.
According to Sternberg [7], creativity is a unique meeting point of three psychological
attributes: intelligence, cognitive style, and personality/motivation. Keeping these three attributes in
mind helps to understand what lies behind the creative individual. Woodman and Schoenfeldt
[7] say that creativity can be investigated from the perspectives of: 1) personality differences,
2) differences in cognitive style or ability, and 3) social psychology, which combines
individual creative behavior and socially creative behavior. Based on these opinions,
creativity and cognitive style have a close relationship. Creativity can be investigated from
the perspective of cognitive style differences. Cognitive style is an important part of assessing
creativity, because cognitive style is one of the psychological attributes of creativity.
Therefore, creative problem solving can be assessed based on different cognitive styles.
Cognitive style is a characteristic of individuals in remembering, organizing, processing,
and problem solving, in an attempt to distinguish, understand, store, create, and use
information. The cognitive style examined in this study is the one proposed by Jerome
Kagan [19], the reflective vs. impulsive cognitive style. Impulsive students are students who
are quick in answering a problem, but not (or less) accurate, so that their
answers tend to be wrong. Reflective students are students who are slow in
responding to problems, but accurate, so that their answers tend to be correct. The reasons for
choosing the reflective vs. impulsive cognitive style (Kagan) include: a) the object studied is
creative problem solving, which requires divergent and reflective thinking skills; b) people
make errors in thinking by being hasty, disheveled, unfocused, and narrow, tendencies which
impulsive children tend to have; c) the slow thinking that reflective children tend to have may either
hinder or support creative problem solving; and d) the reflective vs. impulsive cognitive style
has not been much studied and developed in depth in Indonesia.
Different cognitive styles of students may cause differences in creativity in solving
problems. Nietfeld and Bosma [8] describe impulsives as individuals who act without much
forethought, are spontaneous, and take more risks in everyday activities and reflectives as
more cautious, intent upon correctness or accuracy, and take more time to ponder situations.
According to Kozhevnikov [9], researchers did find, however, that impulsive children displayed
more aggression than reflective children and also that reflective children exhibited more
advanced moral judgment than impulsive ones. The differences in executive functioning and
attentional control would be reflected in cognitive style dimensions located on the control
allocation metadimension, with reflective–impulsive individuals differing primarily in
allocation of their attentional resources when performing simple perceptual tasks and with
constricted–flexible individuals differing in their level of self-monitoring when carrying out
complex thinking and reasoning processes. Such an approach relating information processing
theories and intelligence components to different cognitive style dimensions could provide a
general research model, which could be more fully adapted by investigators concerned with
the specific relations among learning, memory, attention, and cognitive style. Gender
may also cause differences in the reflective vs. impulsive cognitive style.
Norton [10] noted that boys tended to be more impulsive, had a willingness to take risks, and were
happier to launch into practical work even though they did not know what they were doing. In
contrast, girls were more reflective and felt inhibited about commencing the task.
Regarding the relationship between the reflective vs. impulsive cognitive style and problem solving,
McKinney [11] explains that the data show that reflective children process information on
tasks/problems more efficiently than impulsive children and use more
systematic strategies. According to Landry [12], student problem-solving behavior
can be classified along a continuum from impulsive to reflective. Reflective problem solving
is characterized by thoughtfulness and looking back. Impulsive problem solving is
characterized by a precipitous jump to an implemented solution without either sufficient
reflection or thought. Although correct solutions can be conceived from an intuitive leap,
standard practice prescribes reflective approaches to software development. Orton
[6] notes that Suydam and Weaver also found that more impulsive students are often poor problem
solvers, while more reflective students are likely to be good problem solvers. Based on
these opinions, it can be said that children of impulsive or reflective cognitive style differ
in their problem-solving strategies. This allows children who have different cognitive
styles to have different problem-solving profiles as well.
Several previous studies have related reflective or impulsive children to
creativity. Fareer [13] investigated the relationship between impulsivity-reflectivity
and creative thinking, critical thinking, and intelligence; there was no difference between
reflective students and impulsive students on cognitive factors (action recognition,
interpretation, fluency in skills, general thinking ability, spontaneous flexibility, and
general critical thinking ability). Fuqua, Bartsch, and Phye [14] found a significant effect of
cognitive tempo: reflective subjects had higher scores than impulsive subjects on each
measure of creativity. But Ward [14] found no significant relationship between students'
reflectivity-impulsivity and Wallach-Kogan divergent scores. Kagan and Kogan [15]
explain that the reflective-impulsive dimension also affects the quality of inductive
reasoning. Garrett [16] found that reflectives performed better on logical deductive
reasoning tasks than impulsives.
This study examines the differences between the creativity of students with
these distinct cognitive styles (reflective and impulsive) in solving
mathematical problems. The purpose of this study is to describe in detail the differences
between the creativity of reflective and impulsive students in solving mathematics problems.
2. METHOD
This study intends to describe in detail the differences in mathematical problem-solving
creativity, in particular in geometry, of the research subjects. To obtain these
descriptions, creative problem-solving tests were administered. This type of research is exploratory
qualitative research with primary data in the form of writing (the written test) and of words
(the task-based interview).
The research subjects are Grade VII junior high school students of reflective and
impulsive cognitive styles. The subjects were 8 students, four reflective and four
impulsive. The instrument used to determine the reflective-impulsive cognitive style was developed
from the test made by Jerome Kagan, the MFFT (Matching Familiar Figures Test). The
reasons include: 1) the MFFT is a typical instrument for assessing the reflective-impulsive
cognitive style [17]; 2) the MFFT is an instrument that is widely used to measure cognitive tempo
[18].
The main instrument of the research is the researcher himself, aided by auxiliary instruments,
including: 1) a problem-solving task, 2) an interview guide, and 3) the MFFT (for determination of the
subjects). The PST (problem-solving task) instrument is used to obtain data on the creative geometry
problem solving of reflective and impulsive students. The interview guide instrument is used to
explore the students' creative solving of the geometry problems.
The process of collecting data in this study used two methods, i.e., the problem-solving
task and the interview. The given PST concerns geometry and consists of two problems, namely: 1)
the area of a rectangle, and 2) the perimeter of a rectangle. The problem-solving
task was followed by clarification, to clear up some written answers that were unclear or
not written down. The triangulation used in this study is method triangulation. The data are
said to be valid if the data from the written test results equal the data from the interviews.
Furthermore, the data are analyzed to obtain valid conclusions on the outcome of the research.
For the valid data, scoring (coding) was carried out with respect to each indicator of creativity and each problem-solving phase. Scoring was performed twice, namely achievement scores and weighted scores. The achievement score is the score a student achieves on the problem-solving task. The weighted score is obtained by multiplying the achievement score by the weight of each creativity indicator, and the weights are determined by the quality of each indicator: novelty is given weight 3, flexibility weight 2, and fluency weight 1. The quality of creativity is the sum of the weighted scores for fluency, flexibility, and novelty. If each indicator reaches the highest achievement score of 3, the highest possible weighted total is 18, since (3 x 1) + (3 x 2) + (3 x 3) = 18. The quality of creativity is then classified according to criteria based on this weighted total.
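To make the weighted scoring concrete, the following is a minimal sketch in R of the computation described above; the function name and the example achievement scores are hypothetical, and only the weights 1, 2, 3 and the maximum of 18 come from the text.

# Quality of creativity = weighted sum of achievement scores (0-3) with
# weights: fluency = 1, flexibility = 2, novelty = 3 (maximum quality = 18).
creativity_quality <- function(fluency, flexibility, novelty) {
  sum(c(fluency, flexibility, novelty) * c(1, 2, 3))
}
creativity_quality(fluency = 3, flexibility = 3, novelty = 3)  # 18, the highest possible score
creativity_quality(fluency = 2, flexibility = 1, novelty = 0)  # 4, a hypothetical low-creativity profile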
[Table 4: valid score data of the reflective and impulsive subjects by problem-solving phase and creativity indicator. Legend: PS = phase of problem solving, CI = creativity indicator, P-i = problem i (i = 1, 2), K = fluency, B = novelty, F = flexibility.]
Based on the valid score (code) data of the reflective and impulsive students in Table 4, the differences in creative problem solving are explained below for every stage of problem solving and every indicator of creativity (fluency, novelty, and flexibility).
3.1.2. Novelty Indicator. Based on Table 4, both reflective and impulsive subjects tend not to meet the novelty indicator in the problem-solving plan. There is, however, one reflective subject who meets the novelty indicator in planning the solution, while the others do not. As the interview excerpt above shows, "... (silence) I cut it into pieces, then the pieces are joined to form a new shape"; this plan is "new", because it goes beyond what is usual for a child of her age.
Referring to these facts, it can be concluded that both reflective and impulsive subjects tend not to meet the novelty indicator in the problem-solving plan; nevertheless, reflective students are better than impulsive students with respect to novelty.
3.1.3. Flexibility Indicator. Based on Table 4, there are two reflective subjects who meet the flexibility indicator in planning the solution, while the others do not; among the impulsive subjects, none meets it. Referring to these facts, it can be concluded that, in planning the solution, the flexibility of reflective students is better than that of impulsive students.
To determine the difference in creativity between reflective and impulsive students at the planning stage, the weighted scores of the three indicators, namely fluency, novelty, and flexibility, are summed. The results of this summation are presented in the diagram below.
P-i = Problem i, i = 1, 2
S.i = Subject i, i = 1, 2, 3, 4
Figure 1. Profile of the Differences between Reflective and Impulsive Children's Creativity in the Devising a Plan Phase
Figure 1 shows that the creativity of reflective students in planning the problem solution tends to be low, whereas the creativity of impulsive students in planning the problem solution tends to be very low.
3.2. Creativity Differences between Reflective and Impulsive Students in the Carrying Out the Plan Phase
3.2.1. Fluency Indicator. Based on Table 4, in general both reflective and impulsive students tend to meet the fluency indicator in carrying out the plan, although one reflective subject and one impulsive subject were only able to produce two solutions. Referring to these facts, it can be concluded that reflective and impulsive students are about equally fluent in problem solving; both tend to be very fluent in carrying out the plan.
3.2.2. Novelty Indicator. Based on Table 4, three reflective subjects meet the novelty indicator in carrying out the plan, whereas no impulsive subject does. The following is an excerpt of the interview with KP (a reflective subject).
R : Are you able to transform it again into another form?
KP : ... (silence) it becomes a rectangle
R : How do you do it?
KP : ... (silence) I move the pieces
Referring to these facts, it can be concluded that, with respect to novelty in carrying out the plan, reflective students are better than impulsive students.
It can also be concluded that reflective and impulsive students tend not to meet the flexibility indicator in the executing phase; however, reflective students are slightly more flexible than impulsive students at this stage.
To determine the difference in creativity between reflective and impulsive students in the executing phase, the weighted scores of the three indicators, namely fluency, novelty, and flexibility, are summed. The results of this summation are presented in the diagram below.
P-i = Problem i, i = 1, 2
S.i = Subject i, i = 1, 2, 3, 4
Figure 2. Profile of the Differences between Reflective and Impulsive Children's Creativity in the Executing Phase
Figure 2 shows that the creativity of reflective students in the executing phase tends to be high, whereas the creativity of impulsive students in the executing phase tends to be low.
Taking the planning and executing phases together, the creativity of reflective students in solving geometry problems tends to be high, whereas the creativity of impulsive students in solving geometry problems tends to be very low.
3.3. Creativity Differences between Reflective and Impulsive Students in the Looking Back Phase
At the looking back phase, reflective students tend to examine the results of their work. One reflective subject did not check his work on the answer sheet, but he checked it before writing it down; an answer is only written on the answer sheet once he believes it to be correct. Impulsive students, by contrast, tend not to examine the results of their work. A common reason is that they check while writing an answer and, if something is wrong, immediately cross it out or correct it. Based on this, we can conclude that impulsive students tend not to examine the results of their work, whereas reflective students tend to check the results of their work.
4. CONCLUDING REMARK
1. In the devising a plan phase, the creativity of reflective students in planning the problem solving tends to be low. Meanwhile, the creativity of impulsive students in planning the problem solving tends to be very low.
2. In the executing phase, the creativity of reflective students in carrying out the problem solving tends to be high, whereas the creativity of impulsive students in carrying out the problem solving tends to be low.
3. In the looking back phase, reflective students were more cautious in executing the plan (trying things out first) and considered various aspects, so that they obtained fewer answers, but correct ones; impulsive students were less accurate in executing the plan (fewer trials) and rushed through the problems, so that they produced more answers, but often wrong ones. Impulsive students tend not to examine the results of their work, whereas reflective students tend to check the results of their work.
References
[1] LEBLANC, JOHN F., PROUDFIT, LINDA & PUTT, IAN J. Teaching in Problem Solving in the Elementary
School. In Krulik, Stephen & Reys, Robert E. (Ed) Problem Solving in School Mathematics. Reston,
Virginia: NCTM Yearbook 1980.
[2] SUHERMAN, ERMAN DKK. Strategi Pembelajaran Matematika Kontemporer. Bandung: Jurusan
Pendidikan Matematika FPMIPA UPI Bandung. 2001
[3] BALITBANG-DEPDIKNAS. Rembug Nasional Pendidikan Tahun 2007. Jakarta: Badan Penelitian dan
Pengembangan, Departemen Pendidikan Nasional. 2007.
[4] SISWONO, TATAG YE. Desain Tugas untuk Mengidentifikasi Kemampuan Berpikir Kreatif dalam
Matematika. Pancaran Pendidikan Tahun XIX, No. 63 April 2006. Jember: FKIP Universitas Jember.
[5] POLYA, G. How to Solve It. Second Edition. Princeton, New Jersey: Princeton University Press. 1973.
[6] ORTON, ANTHONY. Learning Mathematics: Issues, Theory and Classroom Practice. Second Edition.
Printed and bound in Great Britain by Dotesios Ltd., Trowbridge, Wilts. 1992.
[7] STERNBERG, ROBERT J. & LUBART, TODD I. Defying the Crowd Cultivating Creativity in a Culture of
Conformity. The Free Press. New York. 1995.
[8] NIETFELD, JOHN & BOSMA, ANTON. Examining the Self-Regulation of Impulsive and Reflective Response
Style on Academic Tasks. Journal of Research in Personality. Vol. 32, (2003) 118 – 140,
[9] KOZHEVNIKOV, MARIA. Cognitive Styles in the Context of Modern Psychology: Toward an Integrated
Framework of Cognitive Style. Psychological Bulletin. 2007, Vol. 133, No. 3, (2007) 464–481. Retrieved
April 1, 2011, from http://www.nmr.mgh.harvard.edu.cognitive/styles2007
[10] NORTON, STEPHEN. Pedagogies for the Engagement of Girls in the Learning of Proportional Reasoning
through Technology Practice. Mathematics Education Research Journal 2006, Vol. 18, No. 3, (2006) 69–
99. Retrieved April 1, 2011, from http://www.merga.net.auMERJ/18/3/Norton.
[11] MCKINNEY, JAMES D. Problem Solving Strategies in Reflective and Impulsive Children. Journal of
Educational Psychology Vol. 67 No.6. (1976) 807-820.
[12] LANDRY, JEFFREY P.; PARDUE, J. HAROLD; DORAN, MICHAEL V.; DAIGLE, ROY J. 2002. Encouraging
Students to Adopt Software Engineering Methodologies: The Influence of Structured Group Labs on
Beliefs and Attitudes. Journal of Engineering Education. (2011) 103-107. Retrieved April 23, 2011,
from http://jee.org
[13] AL-SHARKAWY, ANWAR M. (1998) Cognitive Style Research in The Arab World. Psychology in The Arab
Countries. Cairo, Egypt: Menoufia University Press. Retrieved June 15, 2005.
[14] KOGAN, NATHAN. Creativity and Cognitive Style: A Life-Span Perspective. In Baltes, R.B. & Schaie,
K.W. Life-Span Developmental Psychology. Academic Press. London. 1973.
[15] KAGAN, J AND KOGAN, N. Individual Variation in Cognitive Process. Dalam Mussen, P (Edt.)
Carmichael’s Manual of Child Psychology (3rd ed. Vol. 1), New York: Wiley. 1970
[16] GARRETT, ROGER M. Problem-solving and Cognitive Style. Research in Science & Technological
Education, Vol. 7: No. 1, (1989) 27 — 44. Retrieved April 23, 2011, from http://pdfserve.informaword
[17] ROZENCWAJG, PAULETTE & CORROYER, DENIS. Cognitive Processes in the Reflective-Impulsive
Cognitive Style. The Journal of Genetic Psychology, 166(4), (2005) 451 – 463.
[18] KENNY, ROBERT F. Digital Narrative as a Change Agent to Teach Reading to Media-Centric Students.
International Journal of Social Sciences, Volume 2, Number 3, 2007.
[19] KAGAN, JEROME. Impulsive and Reflective Children: Significance of Conceptual Tempo. In Krumboltz,
J.D. (Ed.) Learning and the Educational Process. Chicago: Rand McNally & Company. 1965.
WARLI
Department of Mathematics Education, UNIROW Tuban, East Java
e-mail: [email protected]
Proceedings of ”The 6th SEAMS-UGM Conference 2011”
Statistics, pp. 671 – 678.
1. INTRODUCTION
Many products are sold with a warranty policy in which the manufacturer agrees to repair or replace failed items free of charge up to a time W or up to a usage U, whichever occurs first, from the time of the initial purchase. For example, a certain motor company in Indonesia gives its customers a warranty of 3 years or 100,000 km of usage, whichever occurs first. This is an example of a two-dimensional warranty policy, because two factors affect the warranty coverage: the age of the product and its usage. An approach that models this type of warranty with a one-dimensional model was clearly explained in Blischke and Murthy [2]; it was originally proposed by Moskowitz and Chun [3]. Another approach to modeling two-dimensional warranty policies is via a two-dimensional renewal process, which will be investigated thoroughly in this paper.
To lessen the difficulty of computing the renewal function, which is the main task in warranty cost estimation, Brown, Solomon and Stephens [1] proposed several estimators based on Monte Carlo simulation. These estimators are unbiased and compete favorably with the naive estimator N(t). In this paper we give an alternative method for computing the renewal function by using one of those estimators and applying the copula method in place of the joint distribution function of the two random variables W, the time until the product fails, and U, the usage of the product.
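As a concrete illustration of what the renewal function measures (though not the estimator of Brown et al. [1] nor the copula-based estimator developed below), the following R sketch approximates M(w, u) = E[N(w, u)] by naive Monte Carlo, counting how many successive failures fit within an age limit w and a usage limit u; the exponential and gamma marginals and the independence between them are purely illustrative assumptions.

# Naive Monte Carlo approximation of the two-dimensional renewal function
# M(w, u) = E[N(w, u)]: count renewals whose cumulative time stays below w
# and whose cumulative usage stays below u (illustrative marginals).
renewal_mc <- function(w, u, nsim = 5000, nmax = 200) {
  counts <- replicate(nsim, {
    W <- rexp(nmax, rate = 1.2)               # illustrative inter-failure times
    U <- rgamma(nmax, shape = 2, rate = 1)    # illustrative usage between failures
    sum(cumsum(W) <= w & cumsum(U) <= u)      # renewals inside [0, w] x [0, u]
  })
  mean(counts)
}
renewal_mc(w = 3, u = 2)   # estimated expected number of warranty claims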
Consider the following setting:
1. N(w, u) denotes the number of renewals over the rectangle [0, w] x [0, u];
2. {(W_i, U_i), i = 1, 2, ...} is a sequence of independent and identically distributed nonnegative bivariate random variables with a common joint distribution function F(w, u);
3. S_n = (S_n^W, S_n^U), where S_n^W = W_1 + ... + W_n and S_n^U = U_1 + ... + U_n for n >= 1.
Let M(w, u) denote the expected number of renewals over the rectangle [0, w] x [0, u], in other words M(w, u) = E[N(w, u)]. We have, from Blischke and Murthy [2],
M(w, u) = F(w, u) + ∫_0^w ∫_0^u M(w − x, u − y) dF(x, y).   (1)
The last expression is quite difficult to solve, because M appears both inside and outside the integration, which leads to an integral equation. One way to solve this equation is by using an estimator. Brown et al. [1] give an estimator of (1), but for the one-dimensional case, by using a weighted average. After some modification in order to fit the two-dimensional case, we have the following Brown et al. estimator for the two-dimensional renewal function:
(2)
allow one to easily model and estimate the distribution of random vectors by estimating the marginals and the copula separately. There are many parametric copula families available, which usually have parameters that control the strength of dependence.
Suppose that we can find a copula function C such that H(x_1, ..., x_n) = C(F_1(x_1), ..., F_n(x_n)), where H is the n-dimensional distribution function of the random vector and F_1, ..., F_n are its marginal distribution functions. According to the famous Sklar theorem for copulas, such a copula function can always be found.
Theorem (Sklar). Let H be a two-dimensional distribution function with marginal distribution functions F and G. Then there exists a copula C such that
H(x, y) = C(F(x), G(y)).
Conversely, for any univariate distribution functions F and G and any copula C, the function H defined in this way is a two-dimensional distribution function with marginals F and G. Furthermore, if F and G are continuous, then C is unique.
Corollary (Main result). We can obtain a new estimator from (2) using the copula method as follows:
(3)
Proof: The result follows by substituting Sklar's theorem into equation (2). □
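The construction in Sklar's theorem can be sketched in a few lines of R: a bivariate distribution H(w, u) = C(F(w), G(u)) is assembled from two marginals and a copula. The Clayton copula, the Weibull and lognormal marginals and the parameter values below are hypothetical choices for illustration, not the family or parameters fitted to the claim data.

# Joint CDF from marginals via a copula, H(w, u) = C(F(w), G(u)),
# using a Clayton copula C(a, b) = (a^(-theta) + b^(-theta) - 1)^(-1/theta), theta > 0.
clayton <- function(a, b, theta) (a^(-theta) + b^(-theta) - 1)^(-1 / theta)
F_w <- function(w) pweibull(w, shape = 1.5, scale = 2)     # illustrative marginal of failure time W
G_u <- function(u) plnorm(u, meanlog = 0, sdlog = 0.8)     # illustrative marginal of usage U
H   <- function(w, u, theta = 2) clayton(F_w(w), G_u(u), theta)
H(w = 1.5, u = 1.0)   # P(W <= 1.5, U <= 1.0) under this copula model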
We observe 33 failure-time data points for the most frequently claimed component from a Japanese motorcycle importer in Indonesia. The claims include failures of the cylinder, piston and piston ring. We use these data to estimate the marginal distributions and the copula parameters. Furthermore, we find the copula function which best fits the data and use it in simulating the warranty cost estimation. Some statistics of the claim data are given below:
To find the best copula among these six families, we use the Kolmogorov-Smirnov test, with the following result:
From this table we see that the goodness-of-fit test gives a good result for all families except families no. 2 and no. 18 (p-value < 0.05). The larger the p-value, the better the model fits the data; the best fit is family no. 14 with its estimated parameter.
Thus we obtain the fitted copula model with its estimated parameters.
References
[1] BROWN, M., SOLOMON, H., AND STEPHENS, M., (1981). Monte Carlo simulation of the renewal function, J.
Appl. Prob., 18, 426-434
[2] BLISCHKE, W.R., MURTHY, D.N., (1994) Warranty Cost Analysis, Marcel Dekker, Inc., New York
[3] MOSKOWITZ, H., CHUN, Y.H., (1988) A Bayesian Approach to the Two-Attribute Warranty Policy, Paper No. 950,
Krannert Graduate School of Management, Purdue University
[4] BLISCHKE, W.R., KARIM, M.R., AND MURTHY, D.N.P., (2011) Warranty Data Collection and Analysis, Springer
Table: Estimated values for a single motorcycle item sale for various warranty periods
U\W 0.5 1 1.5 2 2.5 3 3.5 4 4.5 5 5.5 6
0.3 0.012 0.026 0.043 0.045 0.048 0.050 0.052 0.054 0.050 0.050 0.052 0.052
0.6 0.029 0.092 0.122 0.153 0.180 0.182 0.184 0.190 0.193 0.194 0.195 0.200
0.9 0.042 0.148 0.251 0.303 0.362 0.368 0.376 0.385 0.386 0.397 0.397 0.412
1.2 0.062 0.202 0.340 0.509 0.549 0.575 0.589 0.591 0.593 0.631 0.633 0.636
1.5 0.069 0.261 0.444 0.619 0.721 0.807 0.828 0.843 0.872 0.873 0.880 0.897
1.8 0.079 0.306 0.588 0.757 0.922 1.068 1.122 1.140 1.158 1.183 1.199 1.205
2.1 0.087 0.331 0.629 0.893 1.143 1.287 1.314 1.341 1.485 1.497 1.506 1.566
2.4 0.090 0.366 0.696 1.091 1.323 1.409 1.596 1.617 1.669 1.683 1.752 1.792
2.7 0.093 0.378 0.794 1.158 1.406 1.659 1.834 1.936 2.004 2.006 2.041 2.110
3 0.101 0.393 0.837 1.203 1.572 1.781 2.003 2.160 2.261 2.337 2.348 2.411
Table: Estimated warranty cost per motorcycle item for various warranty periods
U\W 0.5 1 1.5 2 2.5 3 3.5 4 4.5 5 5.5 6
0.3 39145 8395 13809 14495 15541 16065 16860 17452 16088 16132 16779 16895
0.6 9394 29584 39265 49500 58194 58795 59461 61254 62299 62529 63098 64452
0.9 13687 47671 81217 98027 117053 118723 121384 124241 124600 128089 128130 133195
1.2 19944 65252 109932 164441 177282 185782 190370 190881 191623 203660 204416 205271
1.5 22149 84395 143498 200015 232741 260661 267364 272194 281716 281947 284295 289772
1.8 25632 98960 189795 244637 297816 344924 362538 368285 373937 382209 387267 389096
2.1 27940 106782 203015 288598 369193 415638 424311 433116 479759 483449 486422 505958
2.4 29040 118366 224873 352337 427355 455216 515450 522137 538986 543542 565906 578791
2.7 30048 122083 256553 373927 454056 535940 592391 625408 647149 648005 659122 681571
3 32501 126780 270378 388686 507724 575171 647109 697709 730399 754854 758415 778646
Proceedings of ”The 6th SEAMS-UGM Conference 2011”
Statistics, pp. 679 – 688.
Abstract. It is known that, by the Strong Law of Large Numbers, the sample mean X̄ converges almost surely to the population mean μ. The Central Limit Theorem asserts that the distribution of √n(X̄ − μ) converges to a Normal distribution with mean 0 and variance σ² as n → ∞. In the bootstrap view, the key bootstrap terminology says that the population is to the sample as the sample is to the bootstrap samples. Therefore, when we want to investigate the consistency of the bootstrap estimator for the sample mean, we investigate the distribution of √n(X̄* − X̄) in contrast to √n(X̄ − μ), where X̄* is the bootstrap version of X̄ computed from the bootstrap sample X*. The asymptotic theory of the bootstrap sample mean is useful for studying the consistency of many other statistics; thereupon some authors call √n(X̄ − μ) a pivotal statistic. Here are two out of several ways of proving the consistency of the bootstrap estimator. Firstly, consistency under the Mallows-Wasserstein metric was studied by Bickel and Freedman [2]. The other is consistency under the Kolmogorov metric, which is part of the paper of Singh [9]. In our paper, we investigate the consistency of the mean under the Kolmogorov metric comprehensively and use this result to study the consistency of the bootstrap variance using the delta method. The accuracy of the bootstrap estimator using the Edgeworth expansion is discussed as well. Results of simulations show that the bootstrap gives good estimates of the standard error, which agree with the theory. All results of the Monte Carlo simulations are also presented in order to yield apparent conclusions.
Keywords and phrases: Bootstrap, consistency, Kolmogorov metric, delta method, Edgeworth
expansion, Monte Carlo simulations
1. INTRODUCTION
Some questions usually arise in the study of estimation of an unknown parameter θ: (1) which estimator θ̂ should be used or chosen? (2) having chosen to use a particular θ̂, is this estimator consistent for the population parameter θ? (3) how accurate is θ̂ as an estimator of θ? The bootstrap is a general methodology for answering the second and third questions. Consistency theory is needed to ensure that the estimator is consistent for the actual parameter, as desired.
Consider the case where the parameter θ is the population mean μ. The consistent estimator for μ is the sample mean θ̂ = X̄ = (1/n) Σ_{i=1}^n X_i. The consistency theory is then extended to the bootstrap sample mean.
The consistency of the bootstrap estimator for the mean is then applied to study the consistency of the bootstrap estimate for the variance using the delta method. We describe the consistency of the bootstrap estimates for the mean and the variance. Section 2 reviews the consistency of the bootstrap estimate for the mean under the Kolmogorov metric. Section 3 deals with the consistency of the bootstrap estimate for the variance using the delta method. Section 4 discusses the results of Monte Carlo simulations involving bootstrap standard errors and density estimation for the mean and the variance. Section 5, the last section, briefly describes the concluding remarks of the paper.
K(H_n, H_Boot) → 0 a.s.
Let the functional T be defined as T(X_1, X_2, ..., X_n; F) = √n (X̄ − μ), where X̄ and μ are the sample mean and the population mean, respectively. The bootstrap version of T is T(X_1*, X_2*, ..., X_n*; F_n) = √n (X̄* − X̄), where X̄* is the bootstrap sample mean. The distribution of T is denoted by H_n and the (conditional) distribution of its bootstrap version by H_Boot.
We state some theorems and a lemma which are needed to show that K(H_n, H_Boot) → 0 a.s.
X̄_n → μ a.s., i.e., S_n = Σ_{i=1}^n X_i satisfies S_n / n → μ a.s.
Lemma (Kronecker). If 0 < a_n ↑ ∞ and Σ_{j=1}^∞ X_j / a_j converges a.s., then
(1/a_n) Σ_{j=1}^n X_j → 0 a.s.
Proof. Set b_n = Σ_{i=1}^n X_i / a_i and a_0 = b_0 = 0. Then b_n → b a.s. and
(1/a_n) Σ_{j=1}^n X_j = (1/a_n) Σ_{j=1}^n a_j (b_j − b_{j−1})
= b_n − (1/a_n) Σ_{j=1}^n b_{j−1} (a_j − a_{j−1})
→ b − b = 0. □
(The last step uses the fact that, since a_n ↑ ∞, the weighted averages (1/a_n) Σ_{j=1}^n b_{j−1}(a_j − a_{j−1}) converge to the same limit b.)
distribution of the X_i, such that
sup_x | P( √n (X̄ − μ)/σ ≤ x ) − Φ(x) | ≤ C E|X_1 − μ|³ / ( σ³ √n ).
E|X|^p < ∞ for some 0 < p < 1. Then S_n / n^{1/p} → 0 a.s.
Proof. This is a consequence of the corollary following Theorem 1 and the Kronecker lemma, as desired. □
Now we show the consistency of H_Boot under the Kolmogorov metric, following Singh [9] and DasGupta [4]. We can write
K(H_n, H_Boot) = sup_x | P_F(T_n ≤ x) − P_*(T_n* ≤ x) |
≤ sup_x | P_F(T_n ≤ x) − Φ(x/σ) | + sup_x | Φ(x/σ) − Φ(x/s) | + sup_x | Φ(x/s) − P_*(T_n* ≤ x) |
= A_n + B_n + C_n, say.
By Polya's theorem, we conclude that A_n → 0. Also, by the SLLN we obtain s² → σ² a.s., and by the continuous mapping theorem s → σ a.s.; hence we conclude that B_n → 0 a.s. Finally, by the Berry-Esseen theorem,
C_n ≤ C E_*|X_1* − X̄_n|³ / ( √n (Var_{F_n} X_1*)^{3/2} ) = ( C / (n^{3/2} s³) ) Σ_{i=1}^n |X_i − X̄_n|³
≤ (C/s³) [ (1/n^{3/2}) Σ_{i=1}^n |X_i − μ|³ + |X̄_n − μ|³ / √n ], up to a constant factor.
Since X̄_n → μ a.s., it is clear that |X̄_n − μ|³ / (√n s³) → 0 a.s. For the first term, let Y_i = |X_i − μ|³ and take p = 2/3; the Marcinkiewicz-Zygmund SLLN then yields
(1/n^{3/2}) Σ_{i=1}^n |X_i − μ|³ = (1/n^{1/p}) Σ_{i=1}^n Y_i → 0 a.s. as n → ∞.
Thus A_n + B_n + C_n → 0 a.s., and hence K(H_n, H_Boot) → 0 a.s.
Since √n(X̄ − μ) →_d N(0, σ²) and K(H_n, H_Boot) → 0 a.s., we can infer that √n(X̄* − X̄) →_d N(0, σ*²), where σ*² is the bootstrap version of σ². On the other hand,
∫ e^{itx} dp(x) = r(it) e^{−t²/2},
where r(it) can be derived from the Hermite polynomials
via the relation (−d/dx)^n φ(x) = H_n(x) φ(x), where the Hermite polynomial H_n satisfies H_n(x) = (−1)^n e^{x²/2} (d^n/dx^n) e^{−x²/2}. The bootstrap estimate of H admits an analogous expansion
Ĥ(x) = P( T* ≤ x | X ) = Φ(x/σ̂) + n^{−1/2} p̂(x/σ̂) φ(x/σ̂) + O_p(n^{−1}),
where p̂ is obtained from p by replacing unknowns with their bootstrap estimates. According to Davison and Hinkley [5], the estimates in the coefficients of p̂ are typically distant O_p(n^{−1/2}) from their respective values in p, and so p̂ − p = O_p(n^{−1/2}). Hall [7] also showed that σ̂ − σ = O_p(n^{−1/2}), whence Ĥ(x) − H(x) = Φ(x/σ̂) − Φ(x/σ) + O_p(n^{−1}). Thus we can deduce that Φ(x/σ̂) − Φ(x/σ) is generally of size n^{−1/2}, not n^{−1}. Hence,
P( T* ≤ x | X ) − P( T ≤ x ) = O_p(n^{−1/2}).
Consistency of the bootstrap sample mean is useful for studying the consistency of many other statistics, see e.g. van der Vaart [10] and Cheng and Huang [3].
The delta method consists of using a Taylor expansion to approximate a random vector of the form φ(T_n) by the polynomial φ(θ) + φ′(θ)(T_n − θ) + ... in T_n − θ. This method is useful to deduce the limit law of φ(T_n) − φ(θ) from that of T_n − θ. The method is also valid in the bootstrap view, which is given in the following theorem.
surely.
Let θ = μ be the population mean; then θ̂_n = X̄ is the sample mean. The SLLN asserts that X̄ → μ almost surely, and the consistency of the bootstrap estimate for the variance is obtained from this result by the delta method. Again, the SLLN asserts that the unbiased sample variance
s² = (1/(n−1)) Σ_{i=1}^n (X_i − X̄)²
converges almost surely to σ². Let
s²* = (1/(n−1)) Σ_{i=1}^n (X_i* − X̄*)²
be the bootstrap estimate of the sample variance, the counterpart of s². Set
s²* = (n/(n−1)) [ (1/n) Σ_{i=1}^n X_i*² − ( (1/n) Σ_{i=1}^n X_i* )² ].
The question is whether s²* converges a.s. to s². We see that s² equals φ(X̄, (1/n) Σ_{i=1}^n X_i²) and s²* equals φ(X̄*, (1/n) Σ_{i=1}^n X_i*²) for the map φ(x, y) = (n/(n−1)) (y − x²). Thus, according to Theorem 5, we conclude that s²* → s² almost surely.
The simulation is conducted using S-Plus, and the sample consists of the marks of a statistics test for 20 students: 80, 90, 75, 50, 85, 85, 45, 65, 50, 95, 70, 90, 35, 45, 50, 75, 70, 95, 60, 70. The sample mean is X̄ = 69.0 with standard error 18.4. Efron and Tibshirani [6] suggest using at least B = 50 bootstrap samples for standard errors and about 1000 for confidence intervals in order to obtain good approximations. Using B = 2000 bootstrap samples, the simulation gives X̄* = 69.12 with an estimated standard error of 18.1, which is a good approximation. Figure 1 depicts the density estimates for the distributions of √n(X̄* − X̄) and √n(s²* − s²), respectively. From the figure, we can infer that the distributions of both statistics are approximately normal.
[Figure 1: kernel density estimates of sqrt(n) * (mean.boot − mean.sample) (left) and sqrt(n) * (var.boot − var.sample) (right).]
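The simulation just described can be reproduced along the following lines in R (the paper used S-Plus); the seed and the plotting commands are arbitrary choices, and the resulting numbers will differ slightly from run to run.

# Nonparametric bootstrap of the sample mean and the unbiased sample variance
# for the 20 statistics marks, with B = 2000 resamples.
set.seed(1)
x <- c(80, 90, 75, 50, 85, 85, 45, 65, 50, 95, 70, 90, 35, 45, 50, 75, 70, 95, 60, 70)
n <- length(x); B <- 2000
boot_mean <- replicate(B, mean(sample(x, n, replace = TRUE)))
boot_var  <- replicate(B, var(sample(x, n, replace = TRUE)))
mean(boot_mean); sd(boot_mean)                  # bootstrap mean and standard error of the sample mean
plot(density(sqrt(n) * (boot_mean - mean(x))))  # density of sqrt(n) * (mean.boot - mean.sample)
plot(density(sqrt(n) * (boot_var  - var(x))))   # density of sqrt(n) * (var.boot  - var.sample)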
5. CONCLUDING REMARK
A number of points arise from the considerations of Sections 2, 3, and 4, among which we note the following.
1. Since X̄ → μ a.s. and X̄* − X̄ → 0 a.s., according to the bootstrap terminology we conclude that X̄* is a consistent estimator for μ.
2. So far, by using the delta method we have shown that the unbiased bootstrap sample variance satisfies s²* − s² → 0 a.s., and it is obvious that the same holds for the biased version ŝ²* = (1/n) Σ_{i=1}^n (X_i* − X̄*)². Accordingly, both s²* and ŝ²* are consistent estimators for σ².
3. The results of the Monte Carlo simulations show that the bootstrap estimators are good approximations, as represented by their standard errors and the plots of the density estimates.
References
[1] BABU, G. J. AND SINGH, K. On one term Edgeworth correction by Efron’s bootstrap, Sankhya, 46, 219-232,
1984.
[2] BICKEL, P. J. AND FREEDMAN, D. A. Some asymptotic theory for the bootstrap, Ann. Statist., 9, 1196-1217,
1981.
[3] CHENG, G. AND HUANG, J. Z. Bootstrap consistency for general semiparametric M-estimation, Ann. Statist.,
5, 2884-2915, 2010.
[4] DASGUPTA, A. Asymptotic Theory of Statistics and Probability, Springer, New York, 2008.
[5] DAVISON, A. C. AND HINKLEY, D. V. Bootstrap Methods and Their Application, Cambridge University
Press, Cambridge, 2006.
[6] EFRON, B. AND TIBSHIRANI, R. Bootstrap methods for standard errors, confidence intervals, and other
measures of statistical accuracy, Statistical Science, 1, 54-77, 1986.
[7] HALL, P. The Bootstrap and Edgeworth Expansion, Springer-Verlag, New York, 1992.
[8] HUTSON, A. D. AND ERNST, M. D. The exact bootstrap mean and variance of an L-estimator, J. R. Statist.
Soc, 62, 89-94, 2000.
[9] SINGH, K. On the asymptotic accuracy of Efron’s bootstrap, Ann. Statist., 9, 1187-1195, 1981.
[10] VAN DER VAART, A. W. Asymptotic Statistics, Cambridge University Press, Cambridge, 2000.
BAMBANG SUPRIHATIN
University of Sriwijaya
e-mail: [email protected]
SURYO GURITNO
Gadjah Mada University
e-mail: [email protected]
SRI HARYATMI
Gadjah Mada University
e-mail: [email protected]
Proceedings of ”The 6th SEAMS-UGM Conference 2011”
Statistics, pp. 689 – 696.
DEDI ROSADI
Abstract. In this paper, we discuss the application of our new R-GUI software, which we call RcmdrPlugin.Econometrics (Rosadi [6],[7]), especially for multivariate time series analysis. We show an empirical application of this software to finance modeling, in particular to forecasting the yield curve of Indonesian Government Bonds using a multivariate time series model, the VAR (Vector Autoregressive) model. Further detailed applications of RcmdrPlugin.Econometrics to various problems of econometrics and time series analysis in business, economics and finance are discussed in Rosadi [8].
1. INTRODUCTION
R (R Development Core Team [5]), an open source programming environment for data analysis and graphics, has become the 'lingua franca' of data analysis and statistical computing. The functionality of R is based on add-on packages/libraries (similar to toolboxes in MATLAB). A default installation of R automatically installs and loads several basic packages. Beyond these packages, there are thousands of contributed packages available in CRAN and related websites. For time series analysis, there are various R packages, available under the task views Econometrics, Finance and Time Series in CRAN, with the main user interaction via a Command Line Interface (CLI). Unfortunately, for teaching purposes for example, the R-CLI seems to be less user friendly and relatively difficult to use, especially if we compare it with commercial econometrics software with extensive GUI capabilities, such as Eviews. To solve this problem, Hodgess and Vobach [3] introduced RcmdrPlugin.epack and Rosadi [6],[7] introduced RcmdrPlugin.Econometrics. In this paper, we discuss the latest development of RcmdrPlugin.Econometrics, which has not been reported in Rosadi [6],[7] yet, especially for the purpose of modeling multivariate time series.
The rest of this paper is organized as follows. In the second section, we briefly review R-GUIs and R Commander and discuss the design philosophy of RcmdrPlugin.Econometrics. In section three, we discuss VAR modeling and its computation using the R-CLI and RcmdrPlugin.Econometrics. In section four, we provide an empirical application of RcmdrPlugin.Econometrics in finance, namely forecasting the yield curve using a VAR model. The final section concludes.
The main user interaction with R is via the CLI (Command Line Interface). To improve the user-friendliness of R, some statisticians and programmers have developed R-GUI versions; see http://www.sciviews.org for more information on R-GUIs. One of the most popular R-GUI packages is R Commander (Rcmdr). It provides a point-and-click GUI for doing basic statistical analysis, and it can be easily extended using suitable plug-ins (Fox [1],[2]). Currently, on the CRAN server, there are several Rcmdr plug-ins. For time series and econometrics analysis, Hodgess and Vobach [3] introduced RcmdrPlugin.epack and Rosadi [6],[7] introduced RcmdrPlugin.Econometrics. Compared to RcmdrPlugin.epack, it is shown in Rosadi [7] that RcmdrPlugin.Econometrics is easier to use, has better input-output dialogs, a more comprehensive GUI layout (it is more compatible with Eviews and has a better input dialog for forecasting purposes than Eviews) and better menu coverage.
In this paper, we discuss the latest development of RcmdrPlugin.Econometrics, which has not been reported in Rosadi [6],[7] yet, especially for the purpose of modeling multivariate time series. For multivariate time series analysis, RcmdrPlugin.Econometrics can currently be used for Granger causality and cointegration tests, the Johansen test, VAR (Vector Autoregressive), ECM and VECM modeling, modeling dynamic linear models (ADL) and modeling linear panel models (see Figure 1). These types of analysis are not available in RcmdrPlugin.epack.
In the following sections, we describe in detail the application of RcmdrPlugin.Econometrics, especially for VAR modeling. We provide empirical examples in finance showing unique features of RcmdrPlugin.Econometrics. Further detailed applications of the software to econometrics and time series analysis are discussed in Rosadi [8].
3. VAR MODELING
3.1. Introduction. The VAR(p) model with k endogenous variables y_t = (y_{1t}, ..., y_{kt})′ can be written as
y_t = A_1 y_{t−1} + ... + A_p y_{t−p} + C D_t + u_t ,
where the A_i are (k x k) coefficient matrices, D_t collects deterministic regressors with coefficient matrix C, and u_t is a k-dimensional error term.
3.2. VAR Analysis using the R-CLI. For illustrative purposes, we provide an example of the computation of a VAR model using the R-CLI. We simulate the following VAR(2)
R(t, m) = β₀ + β₁ [1 − exp(−m/λ)] / (m/λ) + β₂ { [1 − exp(−m/λ)] / (m/λ) − exp(−m/λ) }
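As an indication of the kind of R-CLI computation discussed in Section 3.2 (the paper's own simulated VAR(2) example is not reproduced here), the following sketch simulates an illustrative bivariate VAR(2) and fits it with the vars package, which is assumed to be installed; the coefficient matrices and series names are hypothetical.

# Simulate an illustrative bivariate VAR(2) and estimate it with vars::VAR.
library(vars)
set.seed(123)
n  <- 200
A1 <- matrix(c(0.5, 0.1, 0.2, 0.4), 2, 2)   # hypothetical lag-1 coefficients
A2 <- matrix(c(0.2, 0.0, 0.0, 0.1), 2, 2)   # hypothetical lag-2 coefficients
y  <- matrix(0, n, 2); colnames(y) <- c("y1", "y2")
for (t in 3:n) {
  y[t, ] <- as.numeric(A1 %*% y[t - 1, ] + A2 %*% y[t - 2, ]) + rnorm(2, sd = 0.1)
}
fit <- VAR(y, p = 2, type = "const")   # equation-by-equation OLS estimation of the VAR(2)
summary(fit)
predict(fit, n.ahead = 10)             # 10-step-ahead forecasts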
5. CONCLUDING REMARKS
for the purpose of modeling multivariate time series. We provide an empirical example of the package for finance modeling and show unique features of the package. Further detailed applications of RcmdrPlugin.Econometrics to various problems of econometrics and time series analysis in business, economics and finance are discussed in Rosadi [8].
References
[1] FOX, J., The R Commander: A Basic-Statistics Graphical User Interface to R, Journal of Statistical Software,
Vol. 14, Issue 9, 2005.
[2] FOX, J. , RcmdrPlugin.TeachingDemos. 2009 [Online] Available at www.cran.r-project.org
[3] HODGESS, E. AND VOBACH, C., RcmdrPlugin.epack: A Menu Driven Package for Time Series in R. Paper
presented at the annual meeting of the The Mathematical Association of America MathFest, TBA, Madison,
Wisconsin, Jul 28, 2008
[4] LÜTKEPOHL, H., New Introduction to Multiple Time Series Analysis, Springer, New York, 2006.
[5] R DEVELOPMENT CORE TEAM, R: A language and environment for statistical computing. R Foundation for
Statistical Computing, Vienna, Austria, 2011. ISBN 3-900051-00-3.
[6] ROSADI, D., Rplugin.Econometrics: R-GUI for teaching Time Series Analysis, Proceeding of CompStat 2010,
1st Edition., ISBN: 978-3-7908-2603-6, Springer Verlag, Paris, 2010
[7] ROSADI, D., Teaching Time Series analysis course using RcmdrPlugin.Econometrics, Proceeding USER 2010,
Gaithersburg, Washington DC, USA, 2010
[8] ROSADI, D., Analisa Ekonometrika dan Runtun Waktu dengan R : Aplikasi untuk bidang Ekonomi, Bisnis dan
Keuangan, Andi Offset, Yogyakarta (in Bahasa Indonesia), 2011
[9] ROSADI, D., Pemodelan Kurva Imbal Hasil dan Komputasinya dengan Paket Software
RcmdrPlugin.Econometrics, Prosiding Seminar Nasional Statistika, 21 Mei 2011, Program Studi Statistika,
Universitas Diponegoro (in Bahasa Indonesia)
[10] ROSADI, D., NUGRAHA, Y. A. AND DEWI, R.K., 2011, Forecasting the Indonesian Government Securities
Yield Curve using Neural Networks and Vector Autoregressive Model, Bulletin of the International Statistical
Institute 58th Session, 21-26 August 2011, Dublin, Ireland
[11] STANDER, S.Y., Yield Curve Modeling, Palgrave Macmillan, New York, 2005.
DEDI ROSADI
Dept. of Mathematics, Faculty of Math. and Natural Sciences, Gadjah Mada University
e-mail: [email protected]
Proceedings of ”The 6th SEAMS-UGM Conference 2011”
Statistics, pp. 697 – 704.
Abstract. In recent years a lot of work has been done to bridge the gap between the two main approaches in credit risk modeling: structural models and reduced-form models. Many papers try to achieve this using special assumptions about the problem. This paper is a literature study about unified models. We use a method that unifies these two models by removing the discrepancy in yield spreads between structural models and reduced-form models. We show the equivalence of the yield spreads of the structural model and the reduced-form model, and the unified model is obtained.
Keywords and Phrases: Credit Risk, Structural Models, Reduced Form Models, Yield Spreads
1. INTRODUCTION
In credit risk modeling, there are two broad classes of models: structural models and reduced-form models. Structural models use the capital structure to find the default probability and the mean recovery rate (Merton [16]). Reduced-form models use the market spread to find the default probability and the mean recovery rate (Jarrow & Turnbull [8]; Jarrow, Lando, & Turnbull [9]).
Structural models are based on the information set available to the firm’s
management, which includes continuous-time observations of both asset values and
liabilities. Reduced-form models are based on the information set available to the market,
typically including only partial observations of both the firm’s asset values and liabilities.
The main distinguishing characteristic of structural models with respect to reduced-
form models is the link the former provide between the probability of default and the firms’
fundamental financial variables: assets and liabilities. Reduced form models use market
prices of the firms’ defaultable instruments (such as bonds) to extract both their default
probabilities and their credit risk dependencies, relying on the market as the only source of
information regarding the firms’ credit risk structure. Although easier to calibrate, reduced
form models lack the link between credit risk and the information regarding the firms’
financial situation incorporated in their assets and liabilities structure.
Merton [16] was the first to build a model based on the capital structure of the firm, which has become the basis of the structural approach. In his approach, the company defaults at the bond maturity time T if its asset value falls below some fixed barrier at time T. Thus the default time is a discrete random variable which takes the value T if the company defaults and infinity if the company does not default. As a result, the equity of the firm becomes a contingent claim on the firm's asset value. Black and Cox [3] extend the definition of the default event and generalize Merton's method into the first-passage approach: the firm defaults when the historical low of the firm's asset value falls below some barrier D. Thus, the default event can take place before the maturity date T.
These models ignore the possibility of bankruptcy of the underlying firm, while in the real world firms have a positive probability of default in finite time (Mendoza [15]). Empirical studies of structural models of credit risk for Indonesia have been carried out in Maruddani et al. [12] and Maruddani et al. [13].
The intensity-based approach, also known as the reduced-form model and a counterpart of the structural model, was introduced by Artzner & Delbaen [2], Jarrow & Turnbull [8] and Duffie & Singleton [5]. In this approach, the default event is modelled as either a stopped Poisson process or a stopped Cox process with intensity h_t. The intensity h_t is then called the hazard rate in the reduced-form approach, since the product of h_t and an infinitesimal time period dt is the default probability of the firm in that infinitesimal period dt, given that the firm has not defaulted before time t. It was shown in Lando [11] and Duffie & Singleton [5] that defaultable bonds can be priced as if they were default-free by using an interest rate that is the risk-free rate adjusted by the intensity. The results of the two models are usually different.
These credit models ignore the information in the stock option market, and there is no connection between the equity model and the credit model (Mendoza [15]). An empirical study of Indonesian reduced-form models of credit risk has been carried out in Maruddani et al. [14].
In recent years, some papers have tried to bridge the gap between these two models.
Many papers try to obtain this using special assumptions about the problem (Elizalde [6],
Chen & Panjer [4], Akat [1]). Elizalde [6] argues that the key element to link both approaches
lies in the model’s information assumptions. Using a specification of a structural model
where investors do not have complete information about the dynamics of the processes which
trigger the firm’s default, these reconciliation models derive a cumulative rate of default
consistent with a reduced form model. Akat [1] proposed a model where the credit default
event is defined as the minimum of the two default times, one from the structural default and
the other from the exogenous intensity.
This paper is a literature study about unified models. We study a method for unifying structural models and reduced-form models by showing the equivalence of the two yield spreads, from which the unified model is obtained (Chen & Panjer [4]).
2. YIELD SPREADS
when credit risk is involved. Empirically, we assume that treasury bonds issued by the government are default-free, while corporate bonds issued by firms are defaultable. A defaultable bond is always mentioned together with its issuer. Thus we use P_C(t, T) to denote the price of a defaultable zero-coupon bond issued by a firm, paying $1 at maturity date T given that there is no default. We will see in this section that credit risk theory shares many similarities with interest rate theory.
The term structure of the defaultable bond price P_C(t, T) is a function of T for fixed t. We only consider bond prices for T ≥ t, and we assume that both bond prices vanish to 0 as T goes to infinity (Yi [17]).
Definition 3. The yield spread R(t, T) for a defaultable bond P_C(t, T) is defined as the difference of the two yields mentioned above:
(3)
3. STRUCTURAL MODEL
Under Merton's model the value of the firm V_t follows a geometric Brownian motion,
dV_t = (μ − γ) V_t dt + σ V_t dW_t ,   (4)
where σ² is the instantaneous variance, μ is the instantaneous return on the firm value, γ is the total payout including dividends or coupons, and W_t is a standard Brownian motion.
A default event can only occur at the bond's maturity, and occurs only if the value of the firm V_T at bond maturity T falls below the debt obligation D. The default-free interest rate is assumed to be a positive constant r, therefore the risk-free bond price is given by
P(t, T) = e^{−r(T−t)} .   (5)
(6)
The yield spread at time t in Merton's model, for a risky debt that matures at time T, is given by Merton as
(7)
where
(8)
Let the equity of the firm at time t, that is the stock price, be denoted by S_t. The price of the equity is considered as the price of a European call option on the value of the firm. Black and Scholes [3] show that the price of the equity is
S_t = V_t Φ(d₁) − D e^{−r(T−t)} Φ(d₂) ,   (9)
where
d₁ = [ ln(V_t / D) + (r + σ²/2)(T − t) ] / ( σ √(T − t) ) ,   (10)
d₂ = d₁ − σ √(T − t) .   (11)
Assume that the company has survived for t years and denote its random failure time by τ. At the current time t, the probability that the firm will default at time T is
(12)
The recovery rate δ(T) is the proportion of the obligation D recovered, namely V_T/D, if the firm defaults at time T. The formal definition of the recovery rate can be expressed as
(13)
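A small R sketch of the Merton quantities discussed in this section (equity value, risk-neutral default probability, defaultable bond price and the implied yield spread) is given below; the input values are hypothetical and the formulas assume the standard Merton setting with constant rate r, no payouts and default only at maturity, so they should be read as an illustration rather than as the paper's exact expressions (6)-(13).

# Merton model: equity as a call on firm value V, with debt face value D due in tau years.
merton <- function(V, D, r, sigma, tau) {
  d1 <- (log(V / D) + (r + 0.5 * sigma^2) * tau) / (sigma * sqrt(tau))
  d2 <- d1 - sigma * sqrt(tau)
  equity <- V * pnorm(d1) - D * exp(-r * tau) * pnorm(d2)     # Black-Scholes call, cf. (9)-(11)
  bond   <- D * exp(-r * tau) * pnorm(d2) + V * pnorm(-d1)    # defaultable zero-coupon bond price
  list(equity       = equity,
       default_prob = pnorm(-d2),                             # risk-neutral default probability
       bond         = bond,
       yield_spread = -log(bond / D) / tau - r)               # defaultable yield minus risk-free rate
}
merton(V = 120, D = 100, r = 0.05, sigma = 0.25, tau = 2)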
4. REDUCED-FORM MODEL
The reduced-form or intensity-based approach goes back to Artzner & Delbaen [2], Jarrow & Turnbull [8], Lando [11], and Duffie & Singleton [5]. The basic idea is to model the default process as a stopped Poisson process (Yi [17]).
Reduced-form models use a hazard rate framework to model default. Following the models of Jarrow & Turnbull [8] and Jarrow, Lando, & Turnbull [9], we assume the financial market is frictionless with a finite time horizon. The price of a riskless bond can be written as
P(t, T) = E[ exp( −∫_t^T r(s) ds ) ] ,   (14)
where r(s) is the spot rate at time s. The price of the risky bond can be written as
(15)
If the default-free spot rates and the default process are independent, the price of a risky bond is
(16)
(17)
where the recovery term represents the mean recovery rate if the maturity is time T.
5. UNIFIED MODEL
Up to now, there are two main quantitative approaches to analyzing credit risk: the structural approach and the reduced-form approach. Both classes have their own advantages and disadvantages. For example, although reduced-form models have a lot of room for calibration to historical data, they lack a financial interpretation of the model parameters. On the other hand, structural models have a nice explanation in financial terms and are rather intuitive, but they have difficulty capturing short-term credit risk in particular and are much harder to apply when there is more than one name involved. Hence, a framework that combines the two general classes of models in a way that none of the above criticisms apply would be the ideal framework for modeling credit risk.
The other critical assumption of the structural model is that the evolution of the firm value follows a diffusion process. As a consequence, the yield spreads of corporate bonds, especially those with short maturities, are difficult to explain in this context (Jones et al. [10], Fons [7]).
In the reduced-form model, since the hazard rate of default is modeled as an exogenous process, it is unknown what economic mechanism lies behind the default process. According to Duffie & Singleton [5], the parameters of reduced-form models are unstable when the models are applied to fit observed yield spreads.
In this paper, we unify the structural model with the reduced-form model by showing that the yield spread of the Merton model is equivalent to the yield spread of a reduced-form model. After rearranging the yield spread of the Merton structural model, the price of a risky bond in a structural model can be rewritten as the price of a contingent claim under risk-neutral valuation that pays the full obligation if there is no default and pays the recovery rate if default happens at maturity, just as in a reduced-form model.
We start with a reduced-form model under risk-neutral valuation. The corresponding value-of-the-firm process defines the default probability and the mean recovery rate. We now show that the yield spread of the Merton model is equivalent to the yield spread of a reduced-form model. The yield spread of the Merton model is
(18)
which is the yield spread of a reduced-form model in equation (17).
In general, the structural model and reduced-form model can be unified by
(19)
Structural models usually impose assumptions on the value of the firm V_t, while reduced-form models usually impose assumptions on the components related to equation
(20)
References
[1] AKAT, M., 2007, A Unified Credit Risk Model, Dissertation, Department of Mathematics, Stanford University.
[2] ARTZNER, P. AND DELBAEN, F., 1995, Default Risk Insurance and Incomplete Markets, Mathematical Finance,
5:3, 187-195.
[3] BLACK, F. AND SCHOLES, M., 1973, The Pricing of Options and Corporate Liabilities, Journal of Political
Economy, 81, 637-654.
[4] CHEN, C. AND PANJER, H., 2007, Unifying Discrete Structural Models and Reduced Form Models in Credit
Risk using a Jump Diffusion Process, Insurance: Mathematics and Economics, 33, 357-380.
[5] DUFFIE, D. AND SINGLETON, K., 1999, Modeling Term Structure of Defaultable Bonds, Review of Financial
Studies, 12, 687 – 720.
[6] ELIZALDE, A., 2005, Credit Risk Models III: Reconciliation Reduced-Structural Models, www.abelelizalde.com
[7] FONS, J.S., 1994, Using Default Rates to Model the Term Structures of Credit Risk, Financial Analysts Journal,
25-32.
[8] JARROW, R.A. AND TURNBULL, S.M., 1995, Pricing Derivatives on Financial Securities Subject to Credit Risk,
The Journal of Finance, 50, 53-85.
[9] JARROW, R.A., LANDO, D., AND TURNBULL, S.M., 1997, A Markov Model for the Term Structure of Credit
Risk Spreads, The Review of Financial Studies, 10, 481-523.
[10] JONES, E.P., MASON, S.P., AND ROSENFELD, E., 1984, Contingent Claims Analysis of Corporate Capital
Structures: An Emprical Investigation, Journal of Finance, 39, 611 – 627.
[11] LANDO, D., 1998, On Cox processes and credit risky securities. Review of Derivatives Research, 2, 99-120.
[12] MARUDDANI, D.A.I., 2011a, Pengukuran Risiko Kredit Obligasi dengan Model Merton, Jurnal Ekonomi
Manajemen dan Akuntansi, Fakultas Ekonomi Universitas Mercu Buana Yogyakarta, Vol. 1, No. 1, 123-141.
[13] MARUDDANI, D.A.I., ROSADI, D., GUNARDI, AND ABDURAKHMAN, 2011b, Credit Spreads Obligasi Korporasi
dengan Model Merton, Prosiding Seminar Nasional Statistika Universitas Diponegoro, ISBN: 978-979-097-
142-4.
[14] MARUDDANI, D.A.I., ROSADI, D., GUNARDI, AND ABDURAKHMAN, 2011c, Credit Spreads pada Reduced-Form
Model, Jurnal Media Statistika, Universitas Diponegoro, Vol. 4., No. 1, 57-63.
[15] MENDOZA, N, 2009, Unified Credit-Equity Modeling, Recent Advancements in the Theory and Practice of
Credit Derivatives, The University of Texas at Austin.
[16] MERTON, R., 1974, On the Pricing of Corporate Debt: The Risk Structure of Interest Rates, Journal of Finance,
29, 449–470.
[17] YI, C., 2005, Credit Risk from Theory to Application, Thesis, McMaster University.
DI ASIH I MARUDDANI
Department of Mathematics, Faculty of Mathematics and Natural Science,
Diponegoro University, Semarang, Indonesia
e-mail: [email protected]
DEDI ROSADI
Department of Mathematics, Faculty of Mathematics and Natural Science,
Gadjah Mada University, Yogyakarta, Indonesia
e-mail: [email protected]
GUNARDI
Department of Mathematics, Faculty of Mathematics and Natural Science,
Gadjah Mada University, Yogyakarta, Indonesia
e-mail: [email protected]
ABDURAKHMAN
Department of Mathematics, Faculty of Mathematics and Natural Science,
Gadjah Mada University, Yogyakarta, Indonesia
e-mail: [email protected]
Proceedings of ”The 6th SEAMS-UGM Conference 2011”
Statistics, pp. 705 - 714.
Abstract. Parameter estimation is one of the important steps in interest rate modeling. The interest rate model is defined under the risk-neutral measure, while the data in the real world characterize the distribution of the interest rate model under the actual measure. In this case, a change of measure of the model is needed, and it is carried out by using Girsanov's theorem. This paper investigates the effect of this change of measure on Vasicek's model and on the Cox, Ingersoll and Ross model. The implementation shows that the simulations of the Vasicek and CIR models obtained via the change of measure have a smaller mean absolute error than the simulations of those models without the change of measure.
Keywords and Phrases : changing measure, Girsanov’s theorem, interest rate model, Vasicek’s
model, Cox, Ingersoll and Ross’ model.
1. INTRODUCTION
The dynamics of interest rates are important factors to consider in pricing derivative products such as bonds, options, and equity. As the value of the interest rate changes from time to time, the interest rate is considered a stochastic process, and its dynamics can be represented by stochastic differential equations (SDEs). This paper discusses two interest rate models, the Vasicek and the Cox, Ingersoll and Ross (CIR) models. Based on Dominedo [2], the Vasicek and CIR models have unique solutions, and based on Indarti [5] and Wahyuni [10], both are stable models. To determine the models' parameters, real/historical data are needed. According to Brigo [1], data are collected in the real world, and their statistical properties characterize the distribution of the interest rate process under the actual measure, while the models are defined under the risk-neutral measure. An interest rate model under the risk-neutral measure guarantees that, if the model is used for asset pricing, the martingale property is satisfied so that there is no arbitrage. To implement the model on real data, a transformation of the model from the risk-neutral measure to the actual measure is therefore needed. For this, Girsanov's theorem is used, so that the parameter estimation process can
proceed under the actual measure. To obtain the model's parameters under the risk-neutral measure, a market price of risk is used; it relates the model's parameters under the actual measure to the parameters under the risk-neutral measure. Hence the model's parameters can be obtained and simulations of the two models can be generated. By comparing the simulations, the effect of the change of measure is investigated. The content of this paper is as follows: first, the change of measure in the Vasicek and CIR models is discussed, followed by the parameter estimation of the Vasicek and CIR models; next, the effect of the change of measure is discussed, followed by a concluding remark and acknowledgements.
2.1 Changing Measure in the Vasicek Model. The Vasicek model is an interest rate model introduced by Oldrich Vasicek in 1977. The interest rate in the Vasicek model has a mean reversion property, i.e. the interest rate appears to be pulled back to some long-run average over time. The Vasicek model under the risk-neutral measure P is defined as
dr_t = α(β − r_t) dt + σ dW_t ,   (1)
with r_0 as an initial value, where r_0, α, β, and σ are positive constants; r_t is the interest rate at time t, β is the long-run level (mean reversion level), α is the speed of adjustment of the interest rate towards its long-run level β, σ² is the variance rate, and W_t is a Brownian motion on a probability space (Ω, F, P).
In order to obtain the analytic solution of (1), set X_t = r_t − β; then we get dX_t = −αX_t dt + σ dW_t, which is an Ornstein-Uhlenbeck process. To solve this Ornstein-Uhlenbeck process, set Y_t = e^{αt} X_t. Then, by using the Itô-Doeblin formula, we get
Y_t = Y_0 + ∫_0^t e^{αs} (αX_s − αX_s + ½ · 0) ds + ∫_0^t σ e^{αs} dW_s = Y_0 + σ ∫_0^t e^{αs} dW_s ,
so we obtain
X_t = e^{−αt} X_0 + σ e^{−αt} ∫_0^t e^{αs} dW_s .
E ∫_0^T Θ_u² Z_u² du < ∞. Set Z = Z(T). Then E[Z] = 1 and, under the probability measure P̃, the process W̃_t, 0 ≤ t ≤ T, is a Brownian motion.
Z_t = exp( −∫_0^t λ r_u dW_u − ½ ∫_0^t λ² r_u² du ) ,
W̃_t = W_t + ∫_0^t λ r_u du ,
so that, in differential form,
dW_t = dW̃_t − λ r_t dt .   (4)
By substituting equation (4) into the Vasicek model (1), we obtain the Vasicek model under the actual measure as follows:
dr_t = [ αβ − (α + σλ) r_t ] dt + σ dW̃_t .   (5)
2.2 Changing Measure in the CIR Model. The Cox-Ingersoll-Ross (CIR) model is an interest rate model which was constructed by Cox, Ingersoll, and Ross in 1980 and published in 1985. Like the Vasicek model, the CIR model has the mean reversion property. The CIR model eliminates the main drawback of the Vasicek model, namely a positive probability of getting a negative interest rate; the interest rate in the CIR model is always positive. The CIR model under the risk-neutral measure is defined as
dr_t = α(β − r_t) dt + σ √(r_t) dW_t ,   r(0) = r_0 ,   (7)
with r_0 as an initial value, where r_0, α, β, σ are positive constants; r_t is the interest rate at time t, β is the mean reversion level, α is the speed of adjustment of the interest rate towards its long-run level β, σ² is the variance rate, and W_t is a Brownian motion on a probability space (Ω, F, P).
The CIR model has no analytic solution. However, its mean and variance can be calculated analytically [8]. By using the Itô-Doeblin formula with Y_t = e^{αt} r_t, for 0 < u < t one obtains
E[ r_t | F_u ] = r_u e^{−α(t−u)} + β (1 − e^{−α(t−u)}) ,   (8)
and
Var[ r_t | F_u ] = r_u (σ²/α) ( e^{−α(t−u)} − e^{−2α(t−u)} ) + β (σ²/(2α)) (1 − e^{−α(t−u)})² .   (9)
In the CIR model, the interest rate admits a noncentral chi-square distribution [1].
By using Girsanov's theorem with Θ_t = λ √(r_t), we have
Z_t = exp( −∫_0^t λ √(r_u) dW_u − ½ ∫_0^t λ² r_u du ) ,
W̃_t = W_t + ∫_0^t λ √(r_u) du ,
so that, in differential form,
dW_t = dW̃_t − λ √(r_t) dt .   (10)
Then, by substituting (10) into the CIR model (7), we obtain the CIR model under the actual measure as follows:
dr_t = [ αβ − (α + σλ) r_t ] dt + σ √(r_t) dW̃_t .   (11)
3.1 Parameter Estimation of the Vasicek Model. After obtaining the Vasicek model under the actual measure, the parameter estimation is performed by using the maximum likelihood method. To simplify the estimation process, the Vasicek model under the actual measure can also be expressed as
dr_t = α*(β* − r_t) dt + σ* dW̃_t ,   (12)
where
α* = α + σλ ,   β* = αβ / (α + σλ) ,   σ* = σ .   (13)
The market price of risk λ relates the model's parameters under the actual measure to the parameters under the risk-neutral measure, as shown in equations (13).
The analytic solution of the Vasicek model under the actual measure (12) can be obtained in a similar way as the analytic solution of the Vasicek model under the risk-neutral measure. It can be shown that the analytic solution of the Vasicek model under the actual measure has the form
r_t = r_u e^{−α*(t−u)} + β* (1 − e^{−α*(t−u)}) + σ* ∫_u^t e^{−α*(t−s)} dW̃_s ,
with distribution
r_t ~ N( r_u e^{−α*(t−u)} + β* (1 − e^{−α*(t−u)}) ,  (σ*²/(2α*)) (1 − e^{−2α*(t−u)}) ) .   (14)
Equations (3) and (14) show that the Vasicek model has different distributions under the two measures.
According to Brigo [1], let
γ = exp(−α*) ,   V² = (σ*²/(2α*)) (1 − exp(−2α*)) .   (15)
By using the maximum likelihood method, the parameter estimates are given by
γ̂ = [ n Σ_{i=1}^n r_i r_{i−1} − Σ_{i=1}^n r_i Σ_{i=1}^n r_{i−1} ] / [ n Σ_{i=1}^n r_{i−1}² − ( Σ_{i=1}^n r_{i−1} )² ] ,
μ̂ = Σ_{i=1}^n ( r_i − γ̂ r_{i−1} ) / ( n (1 − γ̂) ) ,
V̂² = (1/n) Σ_{i=1}^n ( r_i − γ̂ r_{i−1} − μ̂ (1 − γ̂) )² .   (16)
Then, by substituting the above equations into equation (15), we get the estimated parameters of the Vasicek model under the actual measure as
α̂* = −ln γ̂ ,   β̂* = μ̂ = Σ_{i=1}^n ( r_i − γ̂ r_{i−1} ) / ( n (1 − γ̂) ) ,   σ̂* = sqrt( 2 α̂* V̂² / (1 − exp(−2 α̂*)) ) .   (17)
Since the model under the actual measure in equation (12) has the same form as the model under the risk-neutral measure (1), the parameter estimates of the model under the risk-neutral measure (without changing measure), obtained by the maximum likelihood method, can be stated analogously as
α̂ = −ln γ̂ ,   β̂ = Σ_{i=1}^n ( r_i − γ̂ r_{i−1} ) / ( n (1 − γ̂) ) ,   σ̂ = sqrt( 2 α̂ V̂² / (1 − exp(−2 α̂)) ) .   (18)
From equation (13), we have
α = α* − σ*λ ,   β = α* β* / (α* − σ*λ) ,   σ = σ* .   (19)
Finally, by substituting equation (17) into equation (19), the parameter estimates of the Vasicek model under the risk-neutral measure can be stated as
α̂ = α̂* − σ̂* λ ,   β̂ = α̂* β̂* / (α̂* − σ̂* λ) ,   σ̂ = σ̂* .   (20)
According to Brigo [1] and Zeytun [11], λ is known as the market price of risk, which is assumed to be constant. In this paper, λ is chosen so that the parameters satisfy the stability condition of the Vasicek model [5].
3.2 Parameter Estimation of CIR Model. The parameters of the CIR model under the actual
measure are estimated using the Generalized Method of Moments (GMM) following Matyas [8].
GMM is an estimation technique in which the unknown parameters are
estimated by matching population (or theoretical) moments, which are functions of the unknown
parameters, with the appropriate sample moments. Parameter estimation does not require
knowledge of the data distribution.
The CIR model under the actual measure (11) can also be expressed as
dr_t = ( c_1 + c_2 r_t ) dt + σ* √(r_t) dW̃_t,   (21)
where
c_1 = κθ,   c_2 = −(κ + λσ)   and   σ* = σ.   (22)
In equation (21), there are three parameters that will be estimated, ĉ1 , ĉ2 , and ˆ 2 that can
4. DISCUSSION
The effect of changing measure in interest rate models is shown through simulations
using MATLAB. In the case of the Vasicek model, the simulations are performed using the
Monte Carlo method [3], whereas the simulations of the CIR model are performed using the
Milstein method, since the CIR model does not have an analytic solution. The simulations are
conducted using the daily interest rate data of the zero-coupon bond with five years to
maturity, which can be downloaded from www.bankofengland.co.uk.
There are two simulations for each model: one using the model’s parameters
with changing measure and one without changing measure. By assuming that the daily interest rate
data and the interest rate model share the same measure (both satisfy the risk-neutral
measure), the model’s parameters without changing measure can be obtained directly from
the corresponding risk-neutral model. The model’s parameters with changing measure are
obtained by first changing the measure of the model into the actual measure, then estimating
the model’s parameters under the actual measure using the daily interest rate data, and finally
substituting them back into the equations that relate the parameters in the risk-neutral and actual
measures in order to get the model’s parameters under the risk-neutral measure.
Based on the data, by equation (18) the parameter estimates of the Vasicek model
without changing measure are κ̂ = 0.2439, θ̂ = 0.0479, σ̂ = 0.002743, and by equation
(20) the parameter estimates with changing measure are
κ̂ = 0.238463, θ̂ = 0.048988, σ̂ = 0.002743.
The Monte Carlo iteration scheme of r_t for 0 = t_0 < t_1 < ... < t_n can be expressed as
r(t_{i+1}) = θ + ( r(t_i) − θ ) e^{−κ(t_{i+1} − t_i)} + σ sqrt( ( 1 − e^{−2κ(t_{i+1} − t_i)} ) / (2κ) ) Z_{i+1},
where Z_1, ..., Z_n are independent random variables with the N(0,1) distribution. This scheme is used to
generate the simulation of the Vasicek model.
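The following Python sketch implements the Monte Carlo iteration scheme above; parameter and variable names are illustrative, and the same routine can be run with the parameters estimated with or without changing measure.

    import numpy as np

    def simulate_vasicek(r0, kappa, theta, sigma, t_grid, n_paths=1000, seed=0):
        """Exact (transition-density) Monte Carlo simulation of the Vasicek model
        on the time grid t_grid, as in the iteration scheme above."""
        rng = np.random.default_rng(seed)
        t_grid = np.asarray(t_grid, dtype=float)
        r = np.empty((n_paths, len(t_grid)))
        r[:, 0] = r0
        for i in range(len(t_grid) - 1):
            dt = t_grid[i + 1] - t_grid[i]
            mean = theta + (r[:, i] - theta) * np.exp(-kappa * dt)
            std = sigma * np.sqrt((1.0 - np.exp(-2.0 * kappa * dt)) / (2.0 * kappa))
            r[:, i + 1] = mean + std * rng.standard_normal(n_paths)
        return r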
The simulation of the Vasicek model is shown in Figure 1, which compares the interest rate data with the
simulations with and without changing measure (vertical axis r(t)).
In Figure 1 we can see that the data and its simulations differ. On the time interval t ≤ 0.2
the simulation is quite close to the data, but over longer time intervals the simulation and the
data differ considerably. In other words, the Vasicek model is only reasonably good for analyzing
interest rates over a short time interval, which agrees with [5], which states that the Vasicek model is a
short-rate model. The simulations in Figure 1 show that the mean absolute error of the simulation
without changing measure is 0.003389 and the mean absolute error of the simulation with
changing measure is 0.003319.
Using the same data, by equation (24) the parameter estimates of the CIR model
without changing measure are κ̂ = 1.55770, θ̂ = 0.04095, and σ̂ = 0.03292, and by
equation (25) the parameter estimates with changing measure are
κ̂ = 1.49187, θ̂ = 0.04276, σ̂ = 0.03292.
The Milstein scheme for the CIR model on the interval [0, T] is
r_j = r_{j−1} + κ( θ − r_{j−1} ) Δt + σ √(r_{j−1}) ( W_j − W_{j−1} ) + (σ²/4) [ ( W_j − W_{j−1} )² − Δt ],
where j = 1, 2, ..., N and Δt = T/N.
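A minimal Python sketch of the Milstein scheme above is given below; truncating negative excursions at zero is an implementation choice of this sketch, not something prescribed in the text.

    import numpy as np

    def simulate_cir_milstein(r0, kappa, theta, sigma, T, N, seed=0):
        """Milstein discretization of the CIR model on [0, T] with step dt = T/N,
        following the scheme above; negative excursions are truncated at zero."""
        rng = np.random.default_rng(seed)
        dt = T / N
        r = np.empty(N + 1)
        r[0] = r0
        for j in range(1, N + 1):
            dW = np.sqrt(dt) * rng.standard_normal()
            r[j] = (r[j - 1]
                    + kappa * (theta - r[j - 1]) * dt
                    + sigma * np.sqrt(max(r[j - 1], 0.0)) * dW
                    + 0.25 * sigma**2 * (dW**2 - dt))
            r[j] = max(r[j], 0.0)   # keep the simulated rate non-negative
        return r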
Figure 2. The Comparison of Interest Rate Data with Simulations of CIR Model with and
without Changing Measure
The simulations of the CIR model show that the mean absolute error of the simulation without
changing measure is 0.0028 while the mean absolute error of the simulation with changing
measure is 0.0016. Thus, for both models, the mean absolute error of the simulation with changing
measure is smaller than the mean absolute error of the simulation without changing measure.
5. CONCLUDING REMARK
The mean absolute error of the simulations of the Vasicek and CIR models using parameter
estimates with changing measure is smaller than the mean absolute error of the simulations
using parameter estimates without changing measure.
Therefore, changing measure is one factor to consider in interest rate modeling.
References
[1] BRIGO, DAMIANO AND FABIO MERCURIO, Interest Rate Models: Theory and Practice, Springer Finance, New York, 2001.
[2] DOMINEDO, Affine jump-diffusion interest rate models and application to insurance, thesis, Universitas Roma Tor Vergata, 2009.
[3] GLASSERMAN, PAUL, Monte Carlo Methods in Financial Engineering, Applications of Mathematics, Springer, New York, 2004.
[4] HIGHAM, DESMOND J., An Algorithmic Introduction to Numerical Simulation of Stochastic Differential Equations, SIAM Review, Vol. 43, No. 3, 525-546, 2001.
[5] HULL, J.C., Options, Futures, and Other Derivatives (5th ed.), Prentice-Hall, New Jersey, 2003.
[6] INDARTI, DINA, Analisis Stabilitas dan Implementasi Model Vasicek, thesis, Universitas Indonesia, 2011.
[7] KLOEDEN, P.E. AND PLATEN, E., Numerical Solution of Stochastic Differential Equations, Applications of Mathematics, Vol. 23, Springer-Verlag, Berlin Heidelberg, 1991.
[8] MATYAS, LASZLO, Generalized Method of Moments Estimation, Cambridge University Press, 1999.
[9] MISHRA, RAJA KUMAR, Study of Positivity Preserving Numerical Methods for Cox-Ingersoll-Ross Interest Rate Model, project report, Indian Institute of Science, Bangalore, 2010.
[10] SHREVE, STEVEN E., Stochastic Calculus for Finance II: Continuous-Time Models, Springer Finance, New York, 2004.
[11] WAHYUNI, IAS SRI, Kajian Stabilitas dan Implementasi Model Cox, Ingersoll, and Ross, thesis, Universitas Indonesia, 2011.
[12] ZEYTUN, S. AND A. GUPTA, A Comparative Study of the Vasicek and the CIR Model of the Short Rate, Institut für Techno- und Wirtschaftsmathematik (ITWM), Fraunhofer, 2007.
DINA INDARTI
University of Indonesia.
e-mail: [email protected]
BEVINA D. HANDARI
University of Indonesia.
e-mail: [email protected]
Abstract. Weighted fuzzy time series, developed based on concepts of fuzzy theory, is a relatively new
method for time series forecasting. Up to now, a single order of weighted fuzzy time series, either
non-seasonal or seasonal, has mostly been used for time series forecasting. In practice, many time
series forecasting problems also deal with more than a single order, known as a high order model.
This paper focuses on the development of a new weighted fuzzy time series method for high order
models. New rules to find the forecast in a high order weighted fuzzy time series model are also
proposed. Three datasets about Indonesia’s inflation are used as case studies. The root mean squared
error on the testing datasets is used for evaluating forecast accuracy. The results are compared to
three other weighted fuzzy time series methods (i.e. Chen’s, Yu’s, and Cheng’s methods) and two
classical statistical methods, namely ARIMA and exponential smoothing models. The results show
that the proposed weighted high order fuzzy time series yields more accurate forecasts than the other
methods in two datasets, whereas ARIMA yields the best forecast in one dataset.
Keywords and Phrases: fuzzy time series, high order, inflation, new weight.
1. INTRODUCTION
Fuzzy time series is a concept which can be used to deal with forecasting problems in
which the historical data are linguistic values. Fuzzy time series was first introduced by Song
and Chissom [10, 11]. Song and Chissom [11] stated that fuzzy time series can be divided into two
types, namely time-variant and time-invariant. If the relations between time t and
its prior times t − k (where k = 1, 2, ..., m) are all the same, it is a time-invariant fuzzy time series; otherwise
it is time-variant. This paper discusses time-invariant fuzzy time series. First order time-invariant fuzzy
time series was studied by Chen [1]. The fuzzy time series model proposed by Chen ignores recurrence and does not
properly assign weights to the various fuzzy relationships. This problem was addressed by Yu [12],
Cheng et al. [4], and Lee and Suhartono [7]. Furthermore, Chen [2] also studied high
order time-invariant fuzzy time series.
2010 Mathematics Subject Classification: 62A86 (Fuzzy analysis in statistics)
In this paper, the first order method introduced by Lee and Suhartono [7] is
developed for high order models. First, new rules to find the forecast in a high order weighted
fuzzy time series model are proposed. Then, an empirical study is carried out using data about
Indonesia’s inflation as a case study. The root mean squared error is used for evaluating
forecast accuracy, particularly on the testing datasets. The results are compared to three other
weighted fuzzy time series methods (i.e. Chen’s, Yu’s, and Cheng’s methods) and two
classical statistical methods, namely ARIMA and exponential smoothing models. The results
show that the proposed weighted high order fuzzy time series yields more accurate
forecasts than the other methods.
1.1 Fuzzy Time Series. Generally, Song and Chissom [10, 11] described the concepts of
fuzzy time series as follows. Let U be the universe of discourse, where U = {u_1, u_2, ..., u_n} and
U = [begin, end]. A fuzzy set A_i of U is defined as
A_i = f_{A_i}(u_1)/u_1 + f_{A_i}(u_2)/u_2 + ... + f_{A_i}(u_n)/u_n, where f_{A_i} is the membership function of
the fuzzy set A_i, u_k is a generic element of the fuzzy set A_i, and f_{A_i}(u_k) is the
degree of belongingness of u_k to A_i, where f_{A_i}(u_k) ∈ [0, 1] and 1 ≤ k ≤ n.
The first order seasonal fuzzy time series model defined by Song [11] is given as follows:
Definition 5. Let F(t) be a fuzzy time series in which there exists seasonality with period m; then the
FLR is represented by F(t − m) → F(t).
1.2 First Order Weighted Fuzzy Time Series. Based on Song and Chissom [10, 11], Chen
[1] improved the step of establishing fuzzy relationships so that it uses simple operations
instead of complex matrix operations. The algorithm of Chen’s method is as follows:
Chen’s Algorithm
1. Define the universe of discourse (U = [starting, ending]) and intervals for rule
abstraction. Once the length of interval is determined, U can be partitioned into several
equal-length intervals.
2. Define fuzzy sets based on the universe of discourse and fuzzify the historical data.
3. Fuzzify the observed rules.
4. Establish the FLRs and group them based on the current states of the data of the FLRs;
FLRs with the same left-hand side are collected into one Fuzzy Logical Relationship Group (FLRG).
5. Forecast. Let F(t − 1) = A_i; then the forecast is as follows:
i. If the FLRG of A_i is empty (A_i → ∅), then the forecast is A_i,
ii. If there is only one FLR (for example A_i → A_j), then the forecast is A_j,
iii. If A_i → A_{j1}, A_{j2}, ..., A_{jp}, then the forecast is A_{j1}, A_{j2}, ..., A_{jp}.
6. Defuzzify. For example, if the forecast is A_{j1}, A_{j2}, ..., A_{jp}, then the defuzzified forecast is the
average of m_{j1}, m_{j2}, ..., m_{jp}, where m_{jk} is the midpoint of the interval associated with A_{jk}.
The fuzzy time series model proposed by Chen ignores recurrence and does not
properly assign weights to the various fuzzy relationships. This problem was addressed by Yu [12],
Cheng et al. [4], and Lee and Suhartono [7]. The difference between Chen’s algorithm and
the algorithms of Yu, Cheng, and Lee lies after the third step. The models proposed by Yu, Cheng, and Lee
and Suhartono are called Yu’s Algorithm, Cheng’s Algorithm, and Lee’s Algorithm,
respectively. These three algorithms are as follows:
Yu’s Algorithm
1. Define the universe of discourse (U = [starting, ending]) and intervals for rule
abstraction. Once the length of interval is determined, U can be partitioned into several
equal-length intervals.
2. Define fuzzy sets based on the universe of discourse and fuzzify the historical data.
3. Fuzzify the observed rules.
4. Establish the FLRs and FLRGs; all FLRs with the same left-hand side, including recurrent ones,
are kept in the FLRG in order of occurrence.
5. Forecast. Use the same rule as Chen’s algorithm.
6. Defuzzify. The defuzzified matrix is the matrix of the midpoints of the intervals of the fuzzy sets
appearing on the right-hand side of the FLRG, where each midpoint represents the defuzzified
forecast of the corresponding fuzzy set.
7. Assign weights. The weight of the i-th right-hand side of the FLRG is specified as w_i = i, and the
weights are then normalized so that they sum to one; the normalized weights form the weight matrix.
8. Calculate the final forecast value. In the weighted model, the final forecast is the product of the
defuzzified matrix and the transpose of the weight matrix (a small code sketch of steps 6–8 is given
after this list).
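As an illustration of steps 6–8, the following Python sketch computes a weighted forecast in the spirit of Yu's method, assuming the i-th right-hand side of the FLRG receives weight i before normalization; the interval midpoints and FLRG used in the example call are hypothetical.

    import numpy as np

    def yu_weighted_forecast(flrg_rhs, midpoints):
        """Weighted defuzzification in the spirit of Yu's method (steps 6-8):
        the i-th right-hand side of the FLRG gets weight i, the weights are
        normalized, and the forecast is the weighted average of interval midpoints.

        flrg_rhs  : fuzzy-set indices on the RHS of the FLRG, in order of occurrence
        midpoints : dict mapping a fuzzy-set index to the midpoint of its interval
        """
        w = np.arange(1, len(flrg_rhs) + 1, dtype=float)   # w_i = i
        w /= w.sum()                                        # normalize
        m = np.array([midpoints[j] for j in flrg_rhs])
        return float(np.dot(w, m))

    # illustrative call with hypothetical intervals and FLRG A_i -> A2, A1, A1, A3
    mid = {1: 0.3, 2: 0.9, 3: 1.5}
    print(yu_weighted_forecast([2, 1, 1, 3], mid))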
Cheng’s Algorithm
1. Define the universe of discourse (U = [starting, ending]) and intervals for rule
abstraction. Once the length of interval is determined, U can be partitioned into several
equal-length intervals.
2. Define fuzzy sets based on the universe of discourse and fuzzify the historical data.
3. Fuzzify the observed rules.
4. Establish the FLRs and FLRGs and calculate the weights. Each right-hand side of an FLRG is
assigned a weight according to its order of recurrence (the first RHS, the second RHS, and so on),
and these weights form the weight matrix.
5. Calculate the standardized weight matrix by normalizing the weight matrix, i.e. dividing each
weight by the sum of the weights.
Lee’s Algorithm
1. Define the universe of discourse (U = [starting, ending]) and intervals for rule
abstraction. Once the length of interval is determined, U can be partitioned into several
equal-length intervals.
2. Define fuzzy sets based on the universe of discourse and fuzzify the historical data.
3. Fuzzify the observed rules.
4. Establish the FLRs and FLRGs, keeping all recurrent FLRs in order of occurrence.
5. Forecast. Use the same forecast rule as Chen’s algorithm.
6. Defuzzify. The defuzzified matrix is equal to the matrix of the midpoints of the intervals of the
right-hand-side fuzzy sets, where each midpoint represents the defuzzified forecast of the
corresponding fuzzy set.
7. Assign weights. The weights w_1, ..., w_p of the right-hand sides of the FLRG are specified so that
later (more recent) FLRs receive weights at least as large as earlier ones, and are then normalized so
that they sum to one; the normalized weights form the weight matrix.
8. Calculate the final forecast as the product of the defuzzified matrix and the transpose of the
weight matrix.
1.3 High Order Weighted Fuzzy Time Series. High order weighted fuzzy time series, as defined by
Chen [2], can be explained as follows: the FLR of an n-th order model relates the n previous states to
the current state,
F(t − n), ..., F(t − 2), F(t − 1) → F(t).   (1)
Calculations for Chen, Yu and Cheng are all the same as in the first order weighted fuzzy
time series, but there are several rules in the defuzzification step. The rule for each method is given
as follows:
Chen’s rule:
1. If the FLRG has only one value on the RHS, then the defuzzified value is the midpoint of the
interval of that RHS.
2. If the FLRG has more than one value on the RHS, then the defuzzified value is the average of the
midpoints of the intervals of those RHS values.
3. If the FLR of the current state is an empty set, then the defuzzified value is obtained as follows:
i. from the midpoint of the interval of one of the left-hand-side states, or
ii. from the average of the midpoints of the intervals of the left-hand-side states.
Yu’s rule:
1. If the FLRG has only one value on the RHS, then the defuzzified value is obtained as in the first
step of Chen’s rule.
2. If the FLRG has more than one value on the RHS, then the defuzzified value is the weighted
average of the midpoints of those RHS values, with weights assigned as in the first order Yu method.
3. If the FLR of the current state is an empty set, then the defuzzified value is obtained from the
left-hand-side states.   (2)
Cheng’s rule:
1. If the FLRG has only one value on the RHS, then the defuzzified value is obtained as in the first
step of Chen’s rule.
2. If the FLRG has more than one value on the RHS, then the defuzzified value is the weighted
average of the midpoints of those RHS values, with weights assigned as in the first order Cheng
method.   (3)
1. If the FLRG has only one value on the RHS, then the defuzzified value is obtained as in the first
step of Chen’s rule.
2. If the FLRG has more than one value on the RHS, then the defuzzified value is the weighted
average of the midpoints of those RHS values.   (4)
2nd Scheme:
1. If and exist, then the defuzzified is
2. If and exist, then the defuzzified is
3. If and exist, then the defuzzified is
4. If none of the three cases of the second scheme applies, then the defuzzified value is equal to the
fourth rule of the first scheme.
3rd Scheme:
1. If , and exist, then the defuzzified is
Data about general Indonesia’s inflation, food stuffs inflation, and education and
sport inflation are used as case studies, where January 2000–December 2009 is used as the training
dataset and January–December 2010 as the testing dataset. The time series plot of each dataset
is shown in Figure 1.
Figure 1. Time series plot of general Indonesia’s inflation (a), food stuffs inflation (b),
and education and sport inflation (c).
Figure 1.a shows that general Indonesia’s inflation is stationary in mean and
variance and exhibits neither a seasonal nor a trend pattern. The figure also shows an outlier
in October 2005, which was caused by an increase in the price of fuel. Because there is no seasonal pattern in
this data, the orders that can be used are the first order, the second order and the third
order. To demonstrate the proposed algorithm, the general Indonesia’s inflation data is used as a
numerical example with the second order as follows:
Step 1. Define the universe of discourse and partition it into several equal-length intervals.
The universe of discourse is partitioned into 16 intervals, each with length of interval 0.6.
Step 2. Define fuzzy sets based on the universe of discourse and fuzzify the historical data.
In this step, the fuzzy sets A_1, ..., A_16 for the universe of discourse are defined as in Table 1.
Step 6. Defuzzify.
Using the example in step 5, the forecast is obtained and the defuzzified forecast is then computed
from the corresponding interval midpoints.
Table 2. Fuzzy Logical Relationships (entries 1–120; the individual relationships are omitted here).
(3) For , we already have the linguistic of the final forecast for
( is and for ( ) is . Based on Table 3 we have
the forecast of and is
.
Thus the formula to calculate final forecast value is as follows:
The forecasts of the proposed model are verified using the general Indonesia’s
inflation, the food stuffs inflation, and the education and sport inflation data. Three weighted fuzzy
time series models, Chen’s [1], Yu’s [12], and Cheng’s [4], and two classical time series
models, exponential smoothing and ARIMA, are employed as comparison models. To
evaluate the performance of the high order fuzzy time series, the root mean squared error
(RMSE) and the mean absolute percentage error (MAPE) are selected as evaluation indices on the
testing data. The RMSE and MAPE are defined as
RMSE = sqrt( (1/n) Σ_{t=1}^{n} ( Y_t − Ŷ_t )² )   and   MAPE = (100%/n) Σ_{t=1}^{n} | ( Y_t − Ŷ_t ) / Y_t |,
where n is the number of forecasts.
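A minimal Python sketch of these two criteria is given below; the function and variable names are illustrative only.

    import numpy as np

    def rmse(actual, forecast):
        """Root mean squared error of a set of forecasts."""
        actual, forecast = np.asarray(actual, float), np.asarray(forecast, float)
        return np.sqrt(np.mean((actual - forecast) ** 2))

    def mape(actual, forecast):
        """Mean absolute percentage error of a set of forecasts."""
        actual, forecast = np.asarray(actual, float), np.asarray(forecast, float)
        return 100.0 * np.mean(np.abs((actual - forecast) / actual))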
The RMSEs obtained using the three weighted fuzzy time series methods for several
numbers of partitions and orders on general Indonesia’s inflation are listed in Table 4.
Table 4. Accuracy of the proposed method and three fuzzy time series methods for several k (number
of partitions) and orders for general Indonesia’s inflation
Order   Method   RMSE k=16   RMSE k=19   RMSE k=22   MAPE k=16   MAPE k=19   MAPE k=22
First   Chen     0.693       1.193       1.331       318.203     528.283     587.233
        Yu       0.462       0.504       0.513       189.899     115.293     110.247
        Cheng    0.451       0.498       0.532       166.423     116.691     99.705
        Lee      0.457       0.487       0.502       168.026     125.627     115.982
Table 7. Forecast accuracy of all methods for education and sport inflation
Method                             RMSE    MAPE
Winter’s exponential smoothing     0.153   278.143
3. CONCLUDING REMARK
In this paper, we have proposed new rules for high order fuzzy time series based on
Lee’s first order fuzzy time series. Three empirical datasets were used to compare the forecasting
accuracy of exponential smoothing, ARIMA, and weighted fuzzy time series (WFTS)
methods. In general, the results showed that the proposed method (Lee’s method for high order
WFTS) yielded more accurate forecasts than the other methods, particularly the three other WFTS methods.
Specifically, the RMSE results for general Indonesia’s inflation showed that ARIMA with
outlier handling generates more accurate forecasts than the proposed model, the three other fuzzy
time series models, and the single exponential smoothing model. In contrast, the forecasting
accuracy results for food stuffs inflation and education and sport inflation showed that
the proposed models, i.e. Lee’s high order WFTS and Lee’s seasonal order WFTS,
respectively, yielded more accurate forecasts than the three other fuzzy time series
models, exponential smoothing, and ARIMA.
References
[1] CHEN, S.M. 1996. “Forecasting Enrollments Based on Fuzzy Time Series”. Fuzzy Sets and System 81, 3:311-
319.
[2] CHEN, S.M. 2002. “Forecasting Enrollments Based on High-order Fuzzy Time Series”. Cybernetics and
Systems 33, 1:1-16.
[3] CHEN, S.M. AND HWANG, J.R. 2000. “Temperature Prediction Using Fuzzy Time Series”. IEEE Transaction
on Systems, Man, and Cybernetics 30, 2:263-275.
[4] CHENG, C.H., CHEN, T.L., TEOH, H.J., AND CHIANG, C.H. 2008. “Fuzzy Time Series Based on Adaptive
Expectation Model for TAIEX Forecasting”. Expert Systems with Application 34, 2:1126-1132.
[5] HUARNG, K.H. 2001. “Heuristic Models of Fuzzy Time Series for Forecasting”. Fuzzy Sets and Systems 123,
3:369-386.
[6] HWANG, J.R., CHEN, S.M., AND LEE, C.H. 1998. “Handling Forecasting Problems Using Fuzzy Time Series”.
Fuzzy Sets and Systems 100, 2:217–228.
[7] LEE, M.H., AND SUHARTONO. 2010. “A Novel Weighted Fuzzy Time Series Model for Forecasting Seasonal
Data”. Proceedings of the 2nd International Conference on Mathematical Sciences. Kuala Lumpur, 30 November–30
December: 332-340.
[8] SINGH, S.R. 2007. “A Simple Time-Variant Method for Fuzzy Time Series Forecasting”. Cybernetics and
Systems 38, 3:305-321.
[9] SONG, Q., AND CHISSOM, B.S. 1993a. “Forecasting Enrollments with Fuzzy Time Series-part I”. Fuzzy Sets
and System 54, 1-9.
[10] SONG, Q., AND CHISSOM, B.S. 1993b. “Fuzzy Time Series and Its Model”. Fuzzy Sets and System 54, 269-277.
[11] SONG, Q. 1999. “Seasonal Forecasting in Fuzzy Time Series”. Fuzzy Sets and Systems 107, 235-236.
[12] YU, H.K. 2005. “Weighted Fuzzy Time Series Models for TAIEX Forecasting”. Physica A. Statistical
Mechanics and Its Application 349, 609-642.
[13] ZHANG, G.P. 2003. “Time Series Forecasting using A Hybrid ARIMA and Neural Network Model”.
Neurocomputing 50, 159-175.
Edisanter Lo
1. INTRODUCTION
Outlier detection is an important research area in remote sensing using hyperspectral
imaging. In this article an outlier detector is developed in analytical expression for
detecting anomalous objects in a large area in remote sensing using hyperspectral imaging.
Reviews of some common outlier detectors for hyperspectral imagery are discussed
in [1,2]. The conventional outlier detectors for detecting anomalous objects in a large
area are the RX detector [3] and SSRX detector [1,4] so the performance of the new
outlier detector developed in this article is compared with the RX detector and SSRX
detector. The RX detector is a general-purpose outlier detector and is defined as the
Mahalanobis distance of the pixel. The SSRX detector is defined as the Mahalanobis
distance of the pixel in the noise subspace. The SSRX detector has been known to
perform better than the RX detector.
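As a point of reference for the comparisons reported later, the following Python sketch implements the conventional global RX detector as the Mahalanobis distance of each pixel spectrum from the scene mean; the data-cube layout assumed here (rows × columns × bands) is illustrative.

    import numpy as np

    def rx_detector(cube):
        """Global RX anomaly detector: the Mahalanobis distance of each pixel
        spectrum from the scene mean, using the sample covariance of all pixels.

        cube : array of shape (rows, cols, bands)
        returns an array of shape (rows, cols) of detector outputs
        """
        rows, cols, bands = cube.shape
        X = cube.reshape(-1, bands).astype(float)
        mu = X.mean(axis=0)
        Xc = X - mu
        C = np.cov(Xc, rowvar=False)
        Cinv = np.linalg.pinv(C)        # pseudo-inverse for numerical safety
        d = np.einsum('ij,jk,ik->i', Xc, Cinv, Xc)
        return d.reshape(rows, cols)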
The MSM (Maximized Subspace Model) detector in [5] partials out the effect of
the clutter subspace in a pixel by predicting each spectral component of the pixel using
a linear combination of the clutter subspace. Both the SSRX and MSM detectors have
only one user-specified parameter which is the dimension of the deleted clutter subspace.
The maximum number of possible values for this parameter is typically large and this
would result in a large number of images of detector output to be analyzed. This paper
proposes an outlier detector that would result in significantly fewer images of detector
output to be analyzed and the outlier detector is developed in Section 2. The outlier
detector partials out the effect of the unknown clutter subspace in a pixel by modeling
the pixel as a linear transformation of the unknown clutter subspace plus an unknown
error in which the transformation matrix is also unknown. The dimension of the clutter
subspace can vary from one spectral component to another one. The outlier detector
is the Mahalanobis distance of the resulting residual. The performance of the outlier
detector is compared to the RX detector and SSRX detector using a hyperspectral data
cube in the visible and near-infrared range and the results are presented in Section 3.
estimate the population covariance C. The likelihood function for C is the Wishart
density function
L(C) = k |S|^((n−p−1)/2) |C|^(−n/2) e^(−(n/2) tr(C^(−1) S))   (3)
where
k = [ π^(p(p−1)/4) 2^(np/2) Π_{i=1}^{p} Γ((n + 1 − i)/2) ]^(−1)   (4)
and Γ is the gamma function and tr denotes trace. The maximum likelihood estimates
of γ and δ are obtained by maximizing the logarithm of the likelihood function in
(3) subject to the constraint in (2). The constrained maximization problem can be
transformed into an unconstrained maximization problem by substituting the constraint
into the likelihood function. Maximizing the function in (3) is equivalent to minimizing
the following function
φ(γ, δ) = ln |δ + γγᵀ| + tr[ (δ + γγᵀ)^(−1) S ].   (5)
The maximum likelihood estimates for γ and δ are obtained by minimizing the function
φ in (5) numerically using optimization methods.
Quasi-Newton method with inexact line search is used to find a minimum solution
for the function φ. The Quasi-Newton method that has been implemented updates the
Hessian matrix using the BFGS update [6,7] and estimates the step size using three
different inaccurate line search methods (Armijo’s rule, Goldstein test, and Wolfe test).
Quasi-Newton method with inexact line search does not require the computation of the
Hessian matrix in analytic form but it requires the gradient vector of φ in analytic form.
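A minimal Python sketch of this numerical minimization is given below, using the BFGS option of scipy.optimize.minimize with a numerically approximated gradient rather than the analytic gradient in (6); for simplicity it assumes one common clutter dimension q for every spectral band and a log reparametrization to keep the diagonal positive, both of which are choices of this sketch rather than of the paper.

    import numpy as np
    from scipy.optimize import minimize

    def fit_clutter_model(S, q):
        """Minimize phi(gamma, delta) = log|delta + gamma gamma^T|
        + tr((delta + gamma gamma^T)^{-1} S), as in (5), with BFGS.
        S is the p x p sample covariance; q is a common clutter dimension."""
        p = S.shape[0]

        def unpack(z):
            delta = np.exp(z[:p])                 # keep the diagonal positive
            gamma = z[p:].reshape(p, q)
            return np.diag(delta) + gamma @ gamma.T

        def phi(z):
            C = unpack(z)
            sign, logdet = np.linalg.slogdet(C)
            return logdet + np.trace(np.linalg.solve(C, S))

        z0 = np.concatenate([np.log(np.diag(S)), 1e-3 * np.ones(p * q)])
        res = minimize(phi, z0, method='BFGS')    # gradient approximated numerically
        return unpack(res.x), res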
The gradient vector of φ, denoted by g(z), can be derived to be
g(z) = [ ∂φ/∂δ_1 ... ∂φ/∂δ_p  ∂φ/∂γ_{1,1} ... ∂φ/∂γ_{1,q_1} ... ∂φ/∂γ_{p,1} ... ∂φ/∂γ_{p,q_p} ]ᵀ   (6)
where
z = [ δ_1 ... δ_p  γ_{1,1} ... γ_{1,q_1} ... γ_{p,1} ... γ_{p,q_p} ]ᵀ   (7)
∂φ(γ, δ)/∂δ_i = ψ_{i,i}   (8)
∂φ(γ, δ)/∂γ_{i,j} = 2 [ ψ_{i,1} ψ_{i,2} ... ψ_{i,p} ] [ γ_{1,j} γ_{2,j} ... γ_{p,j} ]ᵀ   (9)
ψ = (δ + γγᵀ)^(−1) [ I − S (δ + γγᵀ)^(−1) ]   (10)
where
Q = x − γγᵀ (γγᵀ + δ)^(−1) x.   (12)
A large value in the detector output d(x) would indicate that the pixel x is a potential
outlier.
3. EXPERIMENTAL RESULTS
A relative comparison of the performance between the outlier detector in (11) and
the SSRX detector with respect to the RX detector is presented in this section using
the RIT (Rochester Institute of Technology) data cube [8] which is in the visible and
near-infrared wavelengths with spatial dimensions of 280 by 400 and spectral dimension
of 126. The outlier pixels are selected to be man-made objects that are easy to detect.
The 280x400 RGB image of the data cube using spectral band number 17, 7, and 2
for the red band, green band, and blue band, respectively, is shown in Fig. 1 and the
targets are man-made objects.
The number of high-variance principal components for each spectral component
of the pixel is shown in Fig. 2 for tol = 10−6 , tol = 10−7 , and tol = 10−8 . The
ROC curves for the SSRX detector are shown in Fig. 3 for 1 ≤ q ≤ 12. The ROC
curves for 1 ≤ q ≤ 125 show that the SSRX detector performs like the RX detector
for q = 1 but it performs increasingly worse than the RX detector for 2 ≤ q ≤ 125.
The Quasi-Newton method fails for tol = 10−1 , tol = 10−2 , tol = 10−3 , tol = 10−4 ,
and tol = 10−5 because the objective function is undefined during the line search test.
The Quasi-Newton method requires a significant increase in memory for tol = 10−9 so
the iteration is not carried out. The ROC curves obtained using the Goldstein test are
shown in Fig. 4 for tolq = 10−3. The ROC curves for tolq = 10−3 show that there is
no significant difference among Armijo’s rule, the Goldstein test, and the Wolfe test, of which
the Goldstein test is the most efficient. The outlier detector performs better than the RX
detector for tol = 10−6 , performs like the RX detector for tol = 10−7 , and performs
worse than the RX detector for tol = 10−8 . The outlier detector for tol = 10−6 performs
better than the SSRX detector for 1 ≤ q ≤ 125.
Figure 2. An image of the locations of targets for the data cube in Fig. 1.
4. CONCLUSION
An outlier detector for detecting anomalies in hyperspectral imaging using remote
sensing is defined as the Mahalanobis distance of the residual resulting from partialling
out the effect of the clutter subspace from a pixel. The pixel of known random variables
from a data cube is modeled as a linear transformation of a set of unknown random
[Figure: number of high-variance principal components for each spectral band, for tol = 10−6, 10−7 and 10−8 (horizontal axis: spectral band number; vertical axis: number of PC).]
[Figure: difference in probability of detection relative to the RX detector versus probability of false alarm for the SSRX detector with q = 1, ..., 12.]
variables from the clutter subspace plus an error of unknown random variables in which
the transformation matrix of constants is also unknown. The dimension of the clutter
subspace for each spectral component of the pixel can vary. The experimental results
are obtained by implementing the outlier detector as a global anomaly detector in
unsupervised mode using a hyperspectral data cube with wavelengths in the visible and
near-infrared range. The results show that the best ROC curve of the outlier detector is
better than that of the SSRX detector. Moreover, the outlier detector would generate
significantly fewer images of detector output to be analyzed than the SSRX detector
[Figure: difference in probability of detection relative to the RX detector versus probability of false alarm for the outlier detector with tol = 10−6, 10−7 and 10−8.]
which generates 125 images. Thus, the outlier detector is computationally more efficient
than the SSRX detector.
References
[1] Schaum, A.P., Hyperspectral anomaly detection beyond RX, Proceeding of 13th SPIE Conference
on Algorithms and Technologies for Multispectral, Hyperspectral, and Ultraspectral Imagery 6565,
656502, 2007.
[2] Stein, D.W.J., Beaven, S.G., Hoff, L.E., Winter, E.M., Schaum, A.P., and Stocker, A.D.,
Anomaly detection from hyperspectral imagery, IEEE Signal Processing Magazine 19, 58-69, 2002.
[3] Reed, I.S. and Yu, X., Adaptive multiple-band CFAR detection of an optical pattern with un-
known spectral distribution, IEEE Trans. Acoustics, Speech, Signal Processing, 38, 1760-1770,
1990.
[4] Schaum, A. and Stocker, A. Joint hyperspectral subspace detection derived from a Bayesian
likelihood ratio test, Proceeding of 8th SPIE Conference on Algorithms and Technologies for
Multispectral, Hyperspectral, and Ultraspectral Imagery 4725, 225-233, 2002.
[5] Lo, E., Maximized Subspace Model for Hyperspectral Anomaly Detection, Pattern Analysis and
Applications (Published online March 20, 2011), 1-11, 2011.
[6] Luenberger, D.G, Linear and Nonlinear Programming, 2nd Ed., Addison-Wesley, 1984.
[7] Fletcher, R, Practical Methods of Optimization, 2nd Ed., John Wiley and Sons, 1987.
[8] Kerekes, J.P. and Snyder, D.K., Unresolved target detection blind test project overview, Pro-
ceeding of 16th SPIE Conference on Algorithms and Technologies for Multispectral, Hyperspectral,
and Ultraspectral Imagery 7695, 769521, 2010.
Edisanter Lo
Department of Mathematical Sciences, Susquehanna University,
514 University Avenue, Selinsgrove, Pennsylvania 17870, U.S.A.
e-mail: [email protected]
Proceedings of ”The 6th SEAMS-UGM Conference 2011”
Statistics, pp. 737 - 750.
1. INTRODUCTION
Fault management, as a subsystem of the NMS, has the function of supporting the
administrator in managing problems in the network. Based on RFC-3877, which concerns
network faults, a network administrator will receive an alarm regarding a problem.
Unfortunately, in some cases the administrator is faced with repeated, similar problems with
unknown causes. This kind of problem can be avoided if the exact cause of the problem is
known. However, based on RFC-1157, fault management via the SNMP protocol does not have a
feature to detect the cause of a problem [3].
Performing a prediction of the cause of a problem is one alternative way to support a
network administrator. A prediction system may avoid the occurrence of the same problem in
the future. This prediction might also be used to reduce the network administration cost of
keeping the network at high performance and to maintain QoS as well. Fault prediction will
support the activity of network administration and may help the administrator perform
a specific action to solve a problem before it occurs.
2. BACKGROUND INFORMATION
3.1 Existing Network Fault Management. Traditionally, network management has been
performed using a wide variety of software tools and applications [4] for diagnosing and
isolating network faults, which require human intervention for corrective action, resulting in
most network management solutions specialized for fault management, at the expense of the
other functional areas as specified by the OSI FCAPS model. Over the years, network
management as a specialized field has matured tremendously, with consumers particularly IT
managers demanding high level of sophistication and support for monitoring and managing
newer network technologies. Moreover organizations are now more interested in
infrastructure management in context of business process that it directly or indirectly impacts.
However, existing network management solutions have not kept pace with the changing
requirements of industry providing only partial solutions to these issues [5].
Such an approach can backfire in scenarios of network congestion, where such systems
end up contributing to network congestion while suffering from packet losses and timeouts
which further impacts their effectiveness and efficiency. An improvement over such systems
can be envisaged in terms of dynamically formulated polling strategies, which are able to
pinpoint network faults faster, while being dynamically able to adjust the amount of network
management traffic that is generated based on the state of the network. Such a system could
utilize historical data collected about the network made available from the network base
lining statistics, which could provide some indication on the existing hot-spots in the network
and potential trouble areas which the solution could focus on, resulting in faster RCA and
lower Mean-Time-To-Repair (MTTR) for network faults.
Typical fault management tasks include detecting and identifying faults, isolating faults
and determining the root cause, and recovering from the faults. A manual approach requires
accepting and acting on error detection notifications, maintaining and examining error logs,
tracing and identifying faults by examining environmental changes and database information,
carrying out sequences of diagnostics tests, and correcting faults by reconfiguring/
restarting/replacing network elements. Manual fault management is usually a time-consuming
and tedious task, requiring a human expert to have a thorough knowledge of the network and
to comprehend a large amount of information. [6].
It is desirable to provide Autonomic Fault Management (AFM) for any large-scale
network supporting many users and a diverse set of applications. AFM aims to automate
many of the fault management tasks by continuously monitoring network condition for self-
awareness, analyzing the fault after it is detected for self-diagnosis, and taking adaptation
actions for self-recovery. Thus AFM can reduce potential human errors and can respond to
faults faster, thus effectively reducing the network downtime.
Based on the existing network management systems, especially for fault management, a
new method is needed to minimize the time spent determining the state of all objects in the network. The
new method is also needed to produce faster Root Cause Analysis (RCA) and lower Mean-
Time-To-Repair (MTTR) for network faults.
3.2 Proposed System. The main objective of the NMS is to help reduce the network administration
cost by predicting the congested link in the network. It would be a great help to the
network administrator if there were a new method that extends the capability of the NMS,
especially for fault management. In this paper, a method for network management systems,
especially for network fault management, using Bayesian networks is proposed. The
method aims to minimize the time needed to determine the state of all objects in the network, resulting in faster
RCA, reduced MTTR, and fewer other network faults. A Bayesian network is a method which can
be used as a prediction tool through its causal-relation features. A Bayesian network is a
graphical probability model for representing the probabilistic relationships among a large
number of nodes and for probabilistic inference with them. Bayesian networks provide a
framework for addressing problems that contain uncertainty and complexity [7].
The development of fault management is needed to achieve the objective of the NMS:
keeping the network running optimally and effectively. Considering the reduction of
administration cost and the development of network technology and network complexity, the
features of the proposed system are as follows:
Root-Cause Analysis: the proposed system can detect not only a specified fault but also
the cause of the fault. This feature is a development of the current existing network fault
management methods.
Posterior Fault Prediction: the proposed system can detect a fault early, before it actually
occurs. This is the best way to keep the network running effectively.
Adaptive System: the proposed system has learning capability so that it can update its
resources periodically to keep monitoring the network.
3.3 Methodology. This section describes the methodology of the proposed system. Referring to
Figure 1, the proposed system image, the process starts from data collection. The data
source is the Management Information Base (MIB), obtained by implementing the SNMP protocol in a
real network. The network data can also be collected in a simulation generated by a network
simulator application; there are several network simulator applications that can be used as data
sources, such as Network Simulator 2 (NS2) [8] or OPNET [9]. Next, the collected data are sent
to the proposed system. The proposed system has two main modules, the Learning System and the
Diagnosis System. The Learning System is a module whose function is to build a Bayesian network
model. The Diagnosis System is a module whose function is to monitor faults in the network by using
the constructed Bayesian network model. After that, the monitoring result is sent to the alarm and
recovery system.
4. EXPERIMENTAL CONFIGURATION
4.1 Initialization. The mesh network topology consists of four routers and five links between
routers, as shown in Figure 2.
Each router is connected to nodes that send packets to each other. The bandwidth of the
links between routers is 2.5 Mbps with a 40 ms delay on each link. The bandwidth and delay
between routers and nodes are not measured.
The links between nodes and routers are duplex links with a drop-tail queue type. Each
node may send or receive packets. The purpose of the experiment is to confirm that
congestion occurs because of a link being down at a specified location. Referring to Figure 2, if link 2 is
down, the packets flow through link 0, link 1, link 3, or link 4, so those
links have a higher probability of being congested than when link 2 is not down.
Figure 3.a.
Figure 3.b.
Figure 3.c.
Figure 3.d.
4.3 Constructing Model. The next step is constructing the Bayesian network model. Throughput
and link-down data from the training simulation are collected and then categorized by “D”
for link down and “C” for congestion, followed by the link number; for example, “D1” for
link 1 down, or “C4” for link 4 congested. The data are then sent to B-Course or Bayonet for model
construction. Figure 4 is a Bayesian network model constructed by Bayonet, with its joint
probability distribution.
P(D0, D1, D2, D3, D4, C2, C3, C4) = P(D0) P(D1) P(D2) P(D3) P(D4) × P(C2 | D0, D1, D3) P(C3 | D2, D4) P(C4 | D3).
4.4 Testing Simulation. After the Bayesian network model is built, the next step is to evaluate
the validity of the model by a testing simulation. The mesh topology (Figure 1) is used
as the network topology for the testing simulation, with the following scenario:
• Total simulation time: 100 seconds
• Link 2 down: from second 10 to second 40
• Link 4 down: from second 60 to second 90
• Traffic used: FTP and CBR
The results of the testing simulation are shown below. Link 2 being down from second 10
to second 40 (Figs. 6 and 7) causes congestion in link 3, and link 4 being down from second 60 to second 90
also causes congestion in link 3. Thus the congestion observed in link 3 is caused by links 2 and 4 being down.
Figure 6.a.
Figure 6.b.
Figure 6.c.
4.5 Discussion and Result. The evaluation of the proposed system is performed using data
from the testing simulation. Given the throughput data from link 3 (Figure 6), the proposed
system is tested on whether it can predict that the observed congestion is caused by link 2 and link 4
being down. Applying the reduced Bayesian network in Fig. 5, we can write its joint
probability distribution as follows:
P(D2, D4, C3) = P(D2) P(D4) P(C3 | D2, D4).
To predict the cause of congestion in link 3, the following posterior probabilities are
needed:
P(D2 = true | C3) = P(D2 = true, C3) / P(C3) = [ Σ_{D4} P(D2 = true, D4, C3) ] / [ Σ_{D2, D4} P(D2, D4, C3) ]   (eq.1)
P(D4 = true | C3) = P(D4 = true, C3) / P(C3) = [ Σ_{D2} P(D2, D4 = true, C3) ] / [ Σ_{D2, D4} P(D2, D4, C3) ]   (eq.2)
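A minimal Python sketch of eq. 1 and eq. 2 by direct enumeration over the reduced network is given below; the prior and conditional probability values are illustrative placeholders, not the values learned from the training simulation.

    # Posterior probability that link 2 (or link 4) is down given congestion on link 3.
    # All probability values below are illustrative placeholders.
    p_d2 = {True: 0.1, False: 0.9}                    # P(D2)
    p_d4 = {True: 0.1, False: 0.9}                    # P(D4)
    p_c3 = {                                          # P(C3 = true | D2, D4)
        (True, True): 0.95, (True, False): 0.8,
        (False, True): 0.7, (False, False): 0.05,
    }

    def joint(d2, d4, c3=True):
        """P(D2=d2, D4=d4, C3=c3) from the factorization of the reduced network."""
        pc = p_c3[(d2, d4)] if c3 else 1.0 - p_c3[(d2, d4)]
        return p_d2[d2] * p_d4[d4] * pc

    evidence = sum(joint(d2, d4) for d2 in (True, False) for d4 in (True, False))
    post_d2 = sum(joint(True, d4) for d4 in (True, False)) / evidence   # eq. 1
    post_d4 = sum(joint(d2, True) for d2 in (True, False)) / evidence   # eq. 2
    print(post_d2, post_d4)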
All throughput data at link 3 are analyzed by Eq. 1 and Eq. 2. The results are shown in
Fig. 8 and Fig. 9.
5. CONCLUSION
The conclusions are based on observations made while utilizing a fairly limited set of
scenarios. Development and technology updates for Network Management Systems are
needed, especially in Fault Management Systems, due to the increasing network complexity in the
future. Based on the simulation on the mesh network, the proposed system can detect the
cause of a fault: the diagnosis shows that a congested link is caused by one or
more links being down.
The proposed method can also be implemented to predict a congested link in other
network topologies by following similar training simulation and testing simulation steps.
The experimental results show an accurate prediction when monitoring a link in the
testing simulation. The prediction result can be used as a quick reference to support the
network administrator in taking a specific action and can reduce the network administration
cost.
For future work, some improvements of the proposed system are needed to predict other
failures such as high latency, throughput degradation, or packet loss. The most important part
is the information or data about the causal relation between cause and effect; once the
information on the causal relation is obtained, it becomes easier to construct the Bayesian
network model. Further research should be conducted to compare the performance of Bayesian
networks with other similar prediction methods in the area of network fault management.
Regarding MPLS networks, the congested link given more than one traffic burst should be examined
more deeply, since in some specific situations a traffic burst does not cause congestion
on the link.
Scholarship Program (ADB-JSP). Thanks are addressed to Prof. Hiroaki Nishi for valuable
guidance and support throughout the course of this work and for enriching the contents of this
paper, and to all West-Lab members, Keio University, Japan, for their great help and
research collaboration.
References
ERWIN HARAHAP
Mathematics Dept. Bandung Islamic University, Indonesia.
e-mail: [email protected]
M. YUSUF FAJAR
Mathematics Dept. Bandung Islamic University, Indonesia.
e-mail: [email protected]
HIROAKI NISHI
Integrated Design Engineering Dept., Keio University, Japan.
e-mail: [email protected]
Proceedings of ”The 6th SEAMS-UGM Conference 2011”
Statistics, pp. 751 – 762 .
1. INTRODUCTION
In the most important problem on the financial market is to estimate the price of
underlying asset of an European call option, see [6, 7]. In the central form of SDE for
evolution of a firm stock price is:
Where , ,W (t ) express respectively the drift, volatility and the Wiener process.
Generally, a stochastic differential equation has the form
dX_t = a(t, X_t) dt + b(t, X_t) dW_t.   (2)
In order to construct a Taylor series expansion using Ito calculus, the chain rule for this
calculus must be defined, and this requires Ito’s lemma. Let f : [0, T] × R → R have
continuous partial derivatives f_x, f_t and f_xx. A scalar transformation by f of the
stochastic differential (2) results, after some nontrivial analysis, in the formula
df(t, X_t) = (∂f(t, X_t)/∂t) dt + (∂f(t, X_t)/∂x) dX_t + (1/2) (∂²f(t, X_t)/∂x²) (dX_t)²,   (3)
and in integral form
f(t, X_t) = f(s, X_s) + ∫_s^t ( ∂f/∂t + a ∂f/∂x + (1/2) b² ∂²f/∂x² ) du + ∫_s^t b (∂f/∂x) dW_u,   (4)
w.p.1, for any 0 ≤ s < t ≤ T, where the integrands are evaluated at (u, X_u).
By using Ito’s lemma and the Taylor expansion, we can obtain numerical methods for solving
(2). In this paper, the Euler-Maruyama (EM) and Milstein (MS) methods are used for discretizing
the SDE. The time discretization uses the increments ΔW_n = W_{τ_{n+1}} − W_{τ_n},
for n = 0, 1, 2, ..., N − 1, of the Wiener process W = {W_t, t ≥ 0}. We notice that these
increments are independent Gaussian random variables with mean E(ΔW_n) = 0 and
variance E((ΔW_n)²) = Δ_n.
By including the term
(1/2) b b′ I_{(1,1)} = (1/2) b b′ { (ΔW_n)² − Δ_n },   (6)
from the Ito-Taylor expansion, we obtain the Milstein method as
Y_{n+1} = Y_n + a Δ_n + b ΔW_n + (1/2) b b′ { (ΔW_n)² − Δ_n }.   (7)
We can rewrite this as
Y_{n+1} = Y_n + a̅ Δ_n + b ΔW_n + (1/2) b b′ (ΔW_n)²,   (8)
where a̅ = a − (1/2) b b′.
2.3 Pathwise approximation and strong and weak convergence. We shall say that a
discrete time approximation Y^Δ converges strongly with order γ > 0 to X at time T if
there exists a positive constant C, which does not depend on Δ, and a δ_0 > 0 such that
ε(Δ) = E( |X_T − Y^Δ(T)| ) ≤ C Δ^γ,   (9)
for each Δ ∈ (0, δ_0). We shall investigate the strong convergence of a number of
different discrete time approximations experimentally.
We shall say that a discrete time approximation Y^Δ converges weakly with order
β > 0 to X at time T as Δ → 0 if for each polynomial g there exists a positive constant C,
which does not depend on Δ, and a δ_0 > 0 such that
| E(g(X_T)) − E(g(Y^Δ(T))) | ≤ C Δ^β.   (10)
Under the following conditions the EM approximation converges strongly with order γ = 1/2:
(i) E(|X_0|²) < ∞,
(ii) E( |X_0 − Y_0^Δ|² )^(1/2) ≤ K_1 Δ^(1/2),
(iii) |a(t, x) − a(t, y)| + |b(t, x) − b(t, y)| ≤ K_2 |x − y|,
(iv) |a(t, x)| + |b(t, x)| ≤ K_3 (1 + |x|),
(v) |a(s, x) − a(t, x)| + |b(s, x) − b(t, x)| ≤ K_4 (1 + |x|) |s − t|^(1/2),
for all s, t ∈ [0, T] and x, y ∈ R; then
E( |X_T − Y^Δ(T)| ) ≤ K_5 Δ^(1/2).
Similarly, under the following conditions the Milstein approximation converges strongly with order γ = 1:
(i) E(|X_0|²) < ∞,
(ii) E( |X_0 − Y_0^Δ|² )^(1/2) ≤ K_1 Δ,
(iii) |a(t, x) − a(t, y)| ≤ K_2 |x − y|,
 |b^{j1}(t, x) − b^{j1}(t, y)| ≤ K_2 |x − y|,
 |L^{j1} b^{j2}(t, x) − L^{j1} b^{j2}(t, y)| ≤ K_2 |x − y|,
(iv) |a(t, x)| ≤ K_3 (1 + |x|),
 |b^{j1}(t, x)| ≤ K_3 (1 + |x|),
 |L^{j1} b^{j2}(t, x)| ≤ K_3 (1 + |x|),
(v) |a(s, x) − a(t, x)| ≤ K_4 (1 + |x|) |s − t|^(1/2),
 |b^{j1}(s, x) − b^{j1}(t, x)| ≤ K_4 (1 + |x|) |s − t|^(1/2),
 |L^{j1} b^{j2}(s, x) − L^{j1} b^{j2}(t, x)| ≤ K_4 (1 + |x|) |s − t|^(1/2),
for all s, t ∈ [0, T] and x, y ∈ R; then
E( |X_T − Y^Δ(T)| ) ≤ K_5 Δ.
3. LINEAR INTERPOLATION
The following theorem guarantees existence and uniqueness of the strong solution of (1). A modulus of
continuity condition in the t variable is required to obtain an order of convergence similar to that in the
deterministic case for the EM and MS schemes. The coefficients are assumed to satisfy a linear growth bound,
|a(t, x)|² + |b(t, x)|² ≤ K_2² (1 + |x|²),
and the approximation is computed on a grid with constant step size h = t_{k+1} − t_k. It is clear that the
linear interpolation process for Y_t has the same order of mean square error,
i.e. E( |X_t − Y_t|² ) ≤ C h, t ∈ [t_k, t_k + h], where h is the equidistant step size and C is a positive
constant.
We have a sample set of data representing the evolution of a firm’s stock prices, given in
Table 1, which is used to approximate (1).
Day of the Week   Date   S(i)   R(i)   |   Day of the Week   Date   S(i)   R(i)
We estimate the drift and volatility by using unbiased estimators. In discrete time, the return
of the stock S over a time interval (t_{k−1}, t_k) is
R(t_k) = ( S(t_k) − S(t_{k−1}) ) / S(t_{k−1}),   k ≥ 1.
By using the Ito stochastic integral, the exact solution of (1) has the form [7]
S(t) = S_0 exp( σ W(t) + (μ − σ²/2) t ).
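The following Python sketch compares the Euler-Maruyama approximation of (1) with this exact solution along the same Brownian path; parameter and variable names are illustrative.

    import numpy as np

    def gbm_em_vs_exact(S0, mu, sigma, T, N, seed=0):
        """Euler-Maruyama approximation of dS = mu*S dt + sigma*S dW on [0, T]
        with N steps, compared against the exact solution
        S(t) = S0 * exp(sigma*W(t) + (mu - sigma^2/2)*t) on the same path."""
        rng = np.random.default_rng(seed)
        dt = T / N
        dW = np.sqrt(dt) * rng.standard_normal(N)
        W = np.concatenate(([0.0], np.cumsum(dW)))
        t = np.linspace(0.0, T, N + 1)

        S_em = np.empty(N + 1)
        S_em[0] = S0
        for n in range(N):
            S_em[n + 1] = S_em[n] + mu * S_em[n] * dt + sigma * S_em[n] * dW[n]

        S_exact = S0 * np.exp(sigma * W + (mu - 0.5 * sigma**2) * t)
        return t, S_em, S_exact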
5. NUMERICAL RESULTS
In order to obtain approximate values for (1), we first solve the Black-Scholes equation by the
EM and MS methods. We use linear interpolation to connect the approximated points. Numerical results
are given below. The programs are run in MATLAB.
6. CONCLUSION
In this paper, we used a simple method for solving SDEs. First, the SDE is solved by explicit
finite difference methods, and linear interpolation is used to connect the numerical points.
Convergence and error analysis theorems are given. Implicit methods (for
example Runge-Kutta type schemes) can also be used for solving SDEs. Wavelets provide another useful
and simple method; by applying wavelets (for example the Haar wavelet), good results can be obtained.
References
F.DASTMALCHISAEI
Department of mathematics, Islamic Azad University-Tabriz branch, Tabriz- Iran
e-mail: [email protected]
M. JAHANGIR HOSSEINPOUR
Department of mathematics, Islamic Azad University-Tabriz branch, Tabriz- Iran
e-mail: [email protected]
S.YAGHOUBI
Department of mathematics, Islamic Azad University-Tabriz branch, Tabriz- Iran
e-mail: [email protected]
Proceedings of ”The 6th SEAMS-UGM Conference 2011”
Statistics, pp. 763 - 772.
HERI KUSWANTO
Abstract. Ensemble forecast has been widely used in the Ensemble Prediction System
(EPS) of developed countries to forecast the weather condition. The idea is to use the
Numerical Weather Prediction (NWP) models for generating probabilistic forecast which is
able to cover uncertainty in the atmospheric behavior as well as in the model itself. Indeed,
ensemble forecast is able to generate reliable forecast after calibration. This paper explores
the current status of weather forecast applied in Indonesia and formulates some potential
works dealing with ensemble forecast. It turns to an idea of creating artificial ensemble
forecast. An illustration about the proposed methodology is given.
1. INTRODUCTION
The term “artificial” means that the ensemble forecast data are not generated
from a Numerical Weather Prediction system, but from time series models. In
this case, we propose to generate the forecasts from ARIMA models. The procedure of
making a prediction using several time series models is well known in time series and
statistical modeling, namely model combination (see Karlsson and Eklund (2007),
Kapetanios et al. (2006), and Feldkircher (2011)).
The main difference between artificial ensemble forecasts and model combination
concerns the calibration. Model combination does not involve any calibration of the
collection of forecasts, while artificial ensemble forecasting applies calibration to the
generated forecasts. Nevertheless, they are similar in the sense of averaging the forecasts, i.e.
a weighted average.
The procedure for generating the artificial ensemble forecasts with ARIMA Box-
Jenkins (1970) models is described as follows (a small code sketch is given after this list):
1. Determine the in-sample data for ARIMA modeling.
2. Identify the data from the time series plot and the Autocorrelation (ACF) and
Partial Autocorrelation (PACF) plots.
3. Estimate the order of the ARIMA models from the plots. In this step, we should propose more
than one model, since usually several models can be fitted.
4. Estimate the model parameters and evaluate the goodness of fit of the models
guessed in step 3. It is important to note that the residuals do not
necessarily have to satisfy the required assumptions (normality and white noise). However,
models which do not satisfy the assumptions may be used in the case where
only a few fitted models are obtained. In this case, we relax the
assumptions, as we only need a collection of ensemble data to be calibrated. The
models obtained in this step are hereafter called reference models.
5. Generate the forecasts using the reference models. ARIMA models can be used
to generate forecasts for several lead times. This is one of the advantages of using
ARIMA, since we can obtain a sequence of artificial ensemble data from one
model. From this step, we have ensemble forecasts for a single date.
6. Repeat procedures 1 to 5 by inserting one more actual observation and omitting the
oldest one in order to obtain forecasts for all examined dates. Tabulate all
generated forecasts according to their lead times. These datasets are the artificial
ensemble forecasts.
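A minimal Python sketch of steps 4–5 is given below, using the ARIMA implementation in statsmodels; the candidate orders and the series name in the commented call are illustrative and would come from the identification in steps 2–3.

    import numpy as np
    from statsmodels.tsa.arima.model import ARIMA

    def artificial_ensemble(series, orders, lead=7):
        """Fit each candidate ARIMA order (the 'reference models') to the
        in-sample series and collect their forecasts for several lead times.
        Returns an array of shape (len(orders), lead)."""
        forecasts = []
        for order in orders:
            res = ARIMA(series, order=order).fit()
            forecasts.append(np.asarray(res.forecast(steps=lead)))
        return np.vstack(forecasts)

    # e.g. four reference models giving a four-member ensemble for each lead time
    # (hypothetical in-sample series name)
    # ens = artificial_ensemble(temperature_in_sample, [(1,0,0), (2,0,0), (1,0,1), (0,0,1)])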
Calibration means that there is consistency between the distribution of the
forecasts and the observations. Bayesian Model Averaging (BMA) is one of the most popular
calibration methods in climatology, first introduced by Raftery et al. (2005). Several
calibration methods have been developed as extensions of BMA. The concept of BMA
is to give more weight to the best model, after first removing the bias of the forecasts. The
weight represents the contribution of each model forecast to the predictive distribution. The
procedure of calibration using BMA can be briefly summarized as follows (a small code sketch of the
final mixture is given after this list):
1. Suppose that we have k model outputs (ensemble members) f_1, ..., f_k and observations y, and that
we would like to calibrate the ensemble forecast at a valid date t. Determine the training length for
calibration, i.e. the number of past dates used to estimate the calibration parameters. We denote the
training length by m. Therefore, for calibration of the ensemble forecast at date t, we use the
dataset (both ensemble and observations) from t − m to t − 1.
2. Remove the bias of the forecasts by carrying out a linear regression between y and f_k
using the training data. From this regression, we obtain parameters a_k and b_k.
These parameters are used to remove the bias of the forecasts at date t, so that the bias-corrected
forecast of the k-th member is a_k + b_k f_k.
3. Estimate the variance σ² and the weight w_k of each member by maximum
likelihood, employing the Expectation-Maximization (EM) algorithm. See
Raftery et al. (2005) for more details of the EM algorithm. In a specific case, the
variance can be set to be the same for all ensemble members.
4. Using w_k and σ², we can generate the predictive pdf for each member.
The pdf is chosen to suit the considered variable, for instance a
normal pdf for temperature, a Gamma pdf for wind speed, etc.
5. The calibrated ensemble forecast is obtained by averaging the weighted
distributions of the forecasts (a mixture distribution), so that the predictive pdf is
p(y) = Σ_{k} w_k g_k( y | a_k + b_k f_k, σ² ).
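A minimal Python sketch of the resulting predictive mixture (step 5) is given below, assuming normal kernels with a common variance as in step 3; all parameter values are taken as given from the training window, and the function name is illustrative.

    import numpy as np
    from scipy.stats import norm

    def bma_predictive_pdf(y_grid, forecasts, a, b, weights, sigma):
        """BMA predictive density: a weighted mixture of normal kernels centred
        at the bias-corrected member forecasts a_k + b_k * f_k, with a common
        standard deviation sigma (step 5 above)."""
        forecasts = np.asarray(forecasts, dtype=float)
        centres = np.asarray(a) + np.asarray(b) * forecasts        # bias correction
        pdf = np.zeros_like(np.asarray(y_grid, dtype=float))
        for w, c in zip(weights, centres):
            pdf += w * norm.pdf(y_grid, loc=c, scale=sigma)
        return pdf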
4. APPLICATION
This section discusses the application of the proposed method for generating
reliable forecasts. We apply the method to calibrate the daily temperature forecast observed
at the Bandara International Juanda Station, Surabaya, Indonesia. The dataset used in
this study spans from January 2008 to December 2009. The artificial ensemble
forecast is generated for the last three months, i.e. from October to December 2009;
the remaining data are used to build the ARIMA models.
Figure 1 depicts the time series plot of the daily mean temperature for the
considered case.
Figure 2. Time series plot of generated artificial ensemble for 1 day (upper) and 7 day
(lower) lead forecast
From the figure, we see that the generated ensemble forecasts are
underdispersive for both the one and seven day lead times and hence have to be calibrated. The
temperature ensemble has a normal distribution (Raftery et al. (2005)), and hence the pdf of
the temperature forecast is generated following a normal distribution. The following figures
depict a sample of the normal pdf for the forecast on 31st December 2009. We can see that the
interval forecast is not reliable, as it is unable to catch the observation well.
Figure 3. Illustration of the pdf forecast on 31st December 2009.
The reliability of the forecast is better assessed by the CRPS. The lower the
CRPS, the more reliable the forecast. The CRPS measures the reliability by evaluating
the compactness and validity of the resulted interval.
Table 2. CRPS of calibrated forecast for single date (31st December 2009)
CRPS
Period
m=10 m=15 m=20 m=25
31 December 2009 (lead1) 0.152 0.179 0.370 0.217
31 December 2009 (lead7) 0.340 0.365 0.312 0.225
The CRPS in Table 2 does not have any meaning unless we compare it with another case. In this paper, we compare the CRPS of the uncalibrated ensemble forecast with the calibrated one. The calibration is done by BMA. The following are the parameters of the BMA for the considered forecast date. We present only lead 1 for the sake of space.
Parameter   Model   m=10    m=15    m=20    m=25
Mean        M1      27.733  27.589  27.140  27.478
            M2      27.733  27.607  27.177  27.481
            M3      27.694  27.529  27.071  27.427
            M4      27.698  27.558  27.087  27.447
Variance    M1      0.435   0.470   0.438   0.660
            M2      0.435   0.470   0.438   0.660
            M3      0.435   0.470   0.438   0.660
            M4      0.435   0.470   0.438   0.660
Weight      M1      0.250   0.000   0.000   0.001
            M2      0.250   0.000   0.000   0.999
            M3      0.250   0.999   0.006   0.000
            M4      0.250   0.001   0.994   0.000
We examine four different training windows for the calibration. The variance of the BMA parameters is set to be the same for all members. Using different training lengths leads to different parameters, in particular different weights. We see that using 15 to 25 training windows leads to the domination of one ensemble member as the best model. The predictive model for the calibrated forecast using BMA can be expressed as (Raftery et al., 2005):
$$p(y \mid f_1, \dots, f_K) = \sum_{k=1}^{K} w_k\, g_k(y \mid f_k),$$
where $w_k$ is the weight and $g_k$ the (normal) pdf associated with the $k$-th bias-corrected member $f_k$.
The predictive pdfs using those four training lengths can be seen in Figures 4 and 5. If we compare Figure 3 with Figures 4 and 5, we can clearly see that the interval of the forecast is now adjusted, or moved closer, to the observation. Indeed, the observation (represented by the blue vertical line) is well captured by the forecast interval. The interval of the forecast has a very reasonable range, i.e. between 26 and 29 degrees.
[Figures 4 and 5: calibrated predictive pdfs for training lengths m = 10, 15, 20 and 25 (four panels each).]
We now assess the CRPS of the uncalibrated (denoted ORI) and calibrated (denoted BMA) ensemble forecasts. The evaluation is not done for a single day only; in this case we evaluate the performance of the calibration in the system, i.e. the evaluation is done over all calibration dates by taking the average CRPS.
Let us first compare the CRPS of ORI with BMA. In all cases, we can see that the calibration using BMA reduces the CRPS significantly, in particular for the lead 7 forecast. It means that the calibration can generate a more reliable forecast by creating a more compact interval with a lower forecast bias. The optimal training length is 25 days for lead 1 and 10 days for lead 7.
Another way to show the accuracy of the calibration is by showing the percentage of the observations captured by the forecast interval, as in Table 1, but for the calibrated ensemble forecast.
6. CONCLUSION
References
[1] Box, G.E.P; Jenkins, G.M.: Time Series Analysis - Forecasting and Control, San Francisco: Holden Day,
1970.
[2] Feldkircher, M., Forecast combination and Bayesian Model Averaging: A prior sensitivity analysis. Journal
of Forecasting. Published online DOI: 10.1002/for.1228, 26 March 2011.
[3] Hamill, T. M., and S. J. Colucci, Verification of Eta-RSM short-range ensemble forecasts, Monthly Weather
Review, 125, 1312–1327, 1997.
[4] Hersbach, H. (2000). Decomposition of The Continuous Ranked Probability Score for Ensemble Prediction
System. Weather Forecasting. 15, 559-570, 2000.
[5] Karlsson, S. and Eklund, J., Forecast combination and model averaging using predictive measures. Econometric Reviews 26 (2-4), 329-363, 2007.
[6] Kapetanios, G., Labhard, V. and Price, S., Forecasting using predictive likelihood model averaging.
Economics Letters, 91 (3), 373-379, 2006.
[7] van der Linden P., and J.F.B. Mitchell (eds.). ENSEMBLES: Climate Change and its Impacts: Summary of
research and results from the ENSEMBLES project. Met Office Hadley Centre, FitzRoy Road, Exeter EX1
3PB, UK. 160pp., 2009
[8] Raftery, A. E., T. Gneiting, F. Balabdaoui, and M. Polakowski, Using Bayesian model averaging to calibrate forecast ensembles, Monthly Weather Review, 133, 1155-1174, 2005.
[9] Wang, X. and Bishop, C.H., Improvement of ensemble reliability with a new dressing kernel, Q. J. R.
Meteorol. Soc,131, 965–986, 2005.
[10] Wilks, D. S., and Hamill, T. M., Comparison of Ensemble-MOS Methods Using GFS Reforecasts. Mon.
Wea. Rev., 135, 2379–2390, 2007
[11] Zhu, Y., Ensemble Forecast: A New Approach to Uncertainty and Predictability, Advance in Atmospheric
Science, 22 (6), 781–788, 2005.
HERI KUSWANTO
Department of Statistics, Institut Teknologi Sepuluh Nopember (ITS) Surabaya Indonesia.
e-mail: [email protected]
Proceedings of ”The 6th SEAMS-UGM Conference 2011”
Statistics, pp. 773–780.
Abstract. We propose the second-order least square estimator (SLSE) for ARCH models. This estimator minimizes the quadratic distances of the response variable to its first conditional moment and of the squared response variable to its second conditional moment simultaneously. We prove that this estimator is strongly consistent and asymptotically normal under general regularity conditions. A Monte Carlo simulation study is done to demonstrate the finite sample properties of the proposed estimator.
Keywords and Phrases: time series, ARCH model, SLSE, conditional mean, conditional
variance.
1. INTRODUCTION
Time series models with homoscedastic errors, such as autoregressive, moving av-
erage, or autoregressive moving average models are widely applied in practice. However,
they are not appropriate when dealing with certain financial market variables such as
the stock price indices or currency exchange rates. These financial market variables
typically have three characteristics that standard time series models fail to consider:
(1) the unconditional distribution of the time series has heavier tails than the nor-
mal distribution;
(2) the values of time series Xt at different time points are not strongly correlated,
but the values of Xt2 are strongly correlated; and
(3) the volatilities of Xt tend to be clustered.
Least square and maximum likelihood estimation methods for ARCH models have been widely used (see Weiss [13], Johnston and DiNardo [6], Pantula [7], Bollerslev [2], and Straumann [9]). While the MLE can only be used if the probability distribution of the random error is known, least square estimation is based on minimization of the squared distance of the response variable to its conditional moment given the predictor variable. The LS estimation procedure in the ARCH model consists of two steps. First, the least square estimator of the regression equation is calculated. Second, the parameters of the variance equation are estimated using an ARCH regression model. Weiss [13] and Pantula [7] studied the asymptotic properties of least square estimators which also
do not require a normality assumption. They proved the consistency and asymptotic
normality of such estimators.
In this paper, we propose the second order least square estimator (SLSE) for ARCH models. This estimator is based on the first and second conditional moments of the response variable, which can be computed easily without any further distributional assumptions on the random error term. The SLSE method was first used by Wang [10], [11] to deal with the measurement error problem in nonlinear models. Then, Leblanc and Wang [12] used the estimation method to estimate a nonlinear regression model. They studied an SLSE for a general nonlinear model, where no distributional assumption for the random error is made. The SLS estimator is more efficient than the LS estimator if the optimal weight is used and the random error in the model has a nonzero third moment.
where $\rho_t(v) = \left(Y_t - E(Y_t|I_t),\; Y_t^2 - E(Y_t^2|I_t)\right)'$ and $W_t$ is a weight matrix that is nonnegative definite. Alternatively, we can write
$$\hat{v}_{SLS} = \arg\min_{v} Q_T(v) \qquad (5)$$
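As a rough numerical illustration (not the paper's implementation), the following sketch simulates an ARCH(1) series and minimizes the unweighted second-order least squares objective, i.e. $Q_T(v)$ with $W_t = I$; the sample size, true parameters, and optimizer are arbitrary choices.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)

# Simulate an ARCH(1) model: y_t = x_t*beta + e_t, e_t ~ N(0, h_t), h_t = a0 + a1*e_{t-1}^2
T, beta0, a0, a1 = 2000, 1.0, 0.5, 0.3
x = rng.normal(size=T)
e = np.zeros(T)
for t in range(1, T):
    h = a0 + a1 * e[t - 1] ** 2
    e[t] = np.sqrt(h) * rng.normal()
y = x * beta0 + e

def Q(v):
    """Unweighted SLS objective: squared distances to the first two conditional moments."""
    b, c0, c1 = v
    eps = y - x * b                                           # residuals implied by b
    h = c0 + c1 * np.concatenate(([0.0], eps[:-1] ** 2))      # conditional variances (presample term set to 0)
    m1 = x * b                                                # E(Y_t | I_t)
    m2 = (x * b) ** 2 + h                                     # E(Y_t^2 | I_t)
    return np.sum((y - m1) ** 2 + (y ** 2 - m2) ** 2)

res = minimize(Q, x0=[0.5, 0.2, 0.1], method="Nelder-Mead")
print(res.x)  # estimates of (beta, alpha0, alpha1)
```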
Corollary 3.1. (Hayashi [5]) Let {Zt } be stationary ergodic and f (.) be a continuous
function. Then {f (Zt )} will be stationary and ergodic.
Corollary 3.2. (Hayashi [5]) Let $\{Z_t\}$ be stationary ergodic and let $f(\cdot)$ be a continuous function. Assume that $E(f(Z_t)) = \eta$. Then
$$\frac{1}{T}\sum_{t=1}^{T} f(Z_t) \xrightarrow{a.s.} \eta.$$
Theorem 3.2. For the ARCH model with $v \in \Theta$, under the condition of Lemma 3.2, $Q_T(v) \xrightarrow{a.s.} Q(v)$ for all $v \in \Theta$ and $Q(v)$ attains a unique minimum at $v_0$.
Proof. We first refer to the standard ergodic theorem, Corollary 3.2, so that for any $v \in \Theta$,
$$Q(v) = \lim_{T\to\infty} Q_T(v) = \lim_{T\to\infty}\frac{1}{T}\sum_t \rho_t'(v)\, W_t\, \rho_t(v) = E\left(\rho_t'(v)\, W_t\, \rho_t(v)\right).$$
Furthermore, the expectation can be written as
$$Q(v) = Q(v_0) + 2E\left[\rho_t'(v_0)\, W_t\, (\rho_t(v) - \rho_t(v_0))\right] + E\left[(\rho_t(v) - \rho_t(v_0))'\, W_t\, (\rho_t(v) - \rho_t(v_0))\right].$$
Since $\rho_t(v) - \rho_t(v_0)$ is a function of $J_{t-1}$ and does not depend on $Y_t$, we have
$$E\left[\rho_t'(v_0)\, W_t\, (\rho_t(v) - \rho_t(v_0))\right] = E\left[E\left(\rho_t'(v_0)\mid X_t\right) W_t\, (\rho_t(v) - \rho_t(v_0))\right] = 0.$$
Therefore
$$Q(v) = Q(v_0) + E\left[(\rho_t(v) - \rho_t(v_0))'\, W_t\, (\rho_t(v) - \rho_t(v_0))\right] \geq Q(v_0),$$
with equality holding for $v = v_0$ only.
Corollary 3.3. Under the conditions of Theorem 3.2, $Q(v) = \lim_{T\to\infty} Q_T(v)$ exists a.s. for all $v \in \Theta$ and has a unique minimizer at $v_0$.
Lemma 3.3. Under the condition of Lemma 3.2,
$$B = E\left[\frac{\partial \rho_t'(v)}{\partial v}\, W_t\, \frac{\partial \rho_t(v)}{\partial v'}\right]$$
is finite and $B$ is a nonsingular matrix, where
$$\frac{\partial \rho_t'(v)}{\partial v} = \begin{pmatrix} -X_t & -X_t'\beta\, X_t \\ 0 & \partial h_t/\partial \alpha \end{pmatrix}.$$
We need to find conditions under which there exist consistent roots of the equation $\partial Q_T(v)/\partial v = 0$. By Taylor's expansion, the derivative $\partial Q_T(v)/\partial v$ can be expressed as
$$\frac{\partial Q_T(v)}{\partial v} = \sum_t \frac{\partial q_t(v_0)}{\partial v} + \sum_t \frac{\partial^2 q_t(v_0)}{\partial v\, \partial v'}(v - v_0) + \sum_t \left[\frac{\partial^2 q_t(v^*)}{\partial v\, \partial v'} - \frac{\partial^2 q_t(v_0)}{\partial v\, \partial v'}\right](v - v_0) \qquad (6)$$
where $v^* = v_0 + r(v - v_0)$ with $|r| \leq 1$ and $q_t(v) = \rho_t'(v) W_t \rho_t(v)$. Basawa [1] gives a set of sufficient conditions for the consistency and asymptotic normality of the MLE in ARCH models. These conditions can be adapted to the SLSE and, based on equation (6), these results imply that there exists a consistent root of the equation $\partial Q_T(v)/\partial v = 0$ if
(1) $T^{-1}\sum_t \dfrac{\partial q_t(v_0)}{\partial v} \xrightarrow{p} 0$;
(2) there exists a nonrandom matrix $M(v_0) > 0$ such that for all $\varepsilon > 0$,
$$P\left(-T^{-1}\sum_t \frac{\partial^2 q_t(v_0)}{\partial v\, \partial v'} \geq M(v_0)\right) > 1 - \varepsilon$$
for all $T > T_1(\varepsilon)$; and
(3) there exists a constant $M < \infty$ such that
$$E\left\|\frac{\partial^3 q_t(v_0)}{\partial v\, \partial v'\, \partial v}\right\| < M$$
for all $v \in \Theta$.
Theorem 3.3. (Consistency). In addition to the condition of Lemma 3.2, assume that
v0 lies in the interior of Θ. Then the SLSE of v, v̂SLS , is consistent for v0 .
Proof. We first show that the previous three conditions are satisfied and hence there
exists a consistent root of the equation ∂QT (v)/∂v = 0. To this end we write the
derivative of $q_t(v_0)$ as
$$\frac{\partial q_t(v_0)}{\partial v} = \frac{\partial}{\partial v}\left(\rho_t'(v_0)\, W_t\, \rho_t(v_0)\right)$$
$$= \frac{\partial}{\partial v}\left[\varepsilon_{0t}^2\, w_{11} + \varepsilon_{0t}\left(y_t^2 - E(y_t^2|\Im_t)\right)(w_{12} + w_{21}) + \left(y_t^2 - E(y_t^2|\Im_t)\right)^2 w_{22}\right]$$
$$= 2\varepsilon_{0t}\, w_{11}\frac{\partial \varepsilon_{0t}}{\partial v} + \left[\varepsilon_{0t}\frac{\partial\left(y_t^2 - E(y_t^2|I_t)\right)}{\partial v} + \left(y_t^2 - E(y_t^2|I_t)\right)\frac{\partial \varepsilon_{0t}}{\partial v}\right](w_{12} + w_{21}) + 2\left(y_t^2 - E(y_t^2|I_t)\right)\frac{\partial\left(y_t^2 - E(y_t^2|I_t)\right)}{\partial v},$$
where $w_{ij}$ is the element of $W_t$ for $i, j = 1, 2$. Therefore, $E\left[\frac{\partial q_t(v_0)}{\partial v}\right] = 0$ since $E(\varepsilon_t|\Im_t) = 0$ and $E\left(y_t^2 - E(y_t^2|\Im_t)\right) = 0$. The ergodic theorem then implies that
$$T^{-1}\sum_t \frac{\partial q_t(v_0)}{\partial v} \xrightarrow{p} 0.$$
Then, by the ergodic theorem, for any constant vector $c \neq 0$,
$$T^{-1}\sum_t c'\frac{\partial^2 q_t(v_0)}{\partial v\, \partial v'}\, c \xrightarrow{a.s.} E\left[c'\frac{\partial^2 q_t(v_0)}{\partial v\, \partial v'}\, c\right].$$
Now for the given $c$, let $0 < \delta(c) < -\frac{1}{2}\, c' E\left[\frac{\partial^2 q_t(v_0)}{\partial v\, \partial v'}\right] c$. Then, for all $\varepsilon > 0$, there exists $T_1 = T_1(\varepsilon)$ such that
$$P\left(\left|T^{-1}\sum_t c'\frac{\partial^2 q_t(v_0)}{\partial v\, \partial v'}\, c - E\left[c'\frac{\partial^2 q_t(v_0)}{\partial v\, \partial v'}\, c\right]\right| < \delta\right) > 1 - \varepsilon$$
for all $T > T_1$. Let $M(v_0) = -\frac{1}{2} E\left[\frac{\partial^2 q_t(v_0)}{\partial v\, \partial v'}\right]$. It follows that
$$P\left(-T^{-1}\sum_t c'\frac{\partial^2 q_t(v_0)}{\partial v\, \partial v'}\, c > c' M(v_0)\, c\right) > 1 - \varepsilon$$
for all $T > T_1$. Finally, by differentiating $\frac{\partial^2 q_t(v_0)}{\partial v\, \partial v'}$ we can show that the third derivative of $q_t(v)$ evaluated at $v_0$ is also bounded, which completes the proof.
Then by Lemma 3.3 we can show that $D_0$ is finite. It follows from a martingale central limit theorem that
$$\frac{1}{\sqrt{T}}\frac{\partial Q_T(v_0)}{\partial v} = \frac{1}{\sqrt{T}}\sum_t \frac{\partial q_t(v_0)}{\partial v} \xrightarrow{d} N(0,\, 4A_0).$$
Again by the ergodic theorem, $\frac{1}{T}\sum_t \frac{\partial \rho_t'(v)}{\partial v} W_t \frac{\partial \rho_t(v)}{\partial v'} \xrightarrow{p} E\left[\frac{\partial \rho_t'(v)}{\partial v} W_t \frac{\partial \rho_t(v)}{\partial v'}\right]$. So by equation (8), we obtain
$$\frac{1}{T}\frac{\partial^2 Q_T(\tilde{v})}{\partial v\, \partial v'} \xrightarrow{p} 2B_0,$$
since $B_0 = E\left[\frac{\partial \rho_t'(v_0)}{\partial v} W_t \frac{\partial \rho_t(v_0)}{\partial v'}\right]$ for nonrandom $B_0 > 0$.
5. CONCLUDING REMARKS
We have proposed a second order least square estimator for the ARCH model
$$Y_t = X_t'\beta + \varepsilon_t, \qquad \varepsilon_t \sim (0, h_t)$$
and $h_t = \alpha_0 + \alpha_1\varepsilon_{t-1}^2 + \dots + \alpha_R\varepsilon_{t-R}^2$. We have shown that the proposed estimator is consistent and asymptotically normal under standard regularity conditions. The Monte Carlo simulation studies show that the SLSE performs satisfactorily in finite sample situations.
References
[1] Basawa, I.V., Feigin, P.D., and Heyde, C.C., Asymptotic Properties of Maximum Likelihood
Estimators for Stochastic Processes, The Indian Journal of Statistics28(3), 259-270, 1976.
[2] Bollerslev, T., Generalized Autoregressive Conditional Heteroscedasticity, Journal of Econometrics, 31, 307-327, 1986.
[3] Capinski, M., Kopp, E., Measure, Integral and probability, Springer, New York, 2003.
[4] Hannan, E.J., Multiple Time series, New York: Wiley, 1970.
[5] Hayashi, F., Econometrics, Princeton University Press, 2000.
[6] Johnston, J., DiNardo, J., Econometric Methods, Fourth Edition, New York: McGraw-Hill, 1997.
[7] Pantula, S.G., Estimation of Autoregressive Models with ARCH Errors, The Indian Journal of
Statistics, Series B, 50, 119-138,1988.
[8] Sarkar, N., ARCH model with Box-Cox Transformed Dependent Variable, Statistics and Proba-
bility Letters 50, 365-374, 2000.
[9] Straumann,D., Estimation in Conditionally Heteroscedastic Time Series, New York: Springer,
2005.
[10] Wang, L., Estimation of Nonlinear Berkson-Type Measurement Error Models, Statistica Sinica,
13, 1201-1210, 2003.
[11] Wang, L., Estimation of Nonlinear Models with Berkson Measurement Error, Annals of Statis-
tics, 32, 2559-2579,2004.
[12] Wang, L., Leblanc, A., Second-order nonlinear least square estimation, Ann Inst Math 60,
883-900, 2008.
[13] Weiss, A.A., Asymptotic Theory for ARCH Model: Estimating and Testing, Econometric Theory
2(1), 107-131, 1986.
Herni Utami
Department of Mathematics, Gadjah Mada University.
e-mail: [email protected]
Subanar
Department of Mathematics, Gadjah Mada University.
e-mail: [email protected]
Dedi Rosadi
Department of Mathematics, Gadjah Mada University.
e-mail: [email protected]
Liqun Wang
Department of Statistics, Universityof Manitoba.
e-mail: [email protected]
Proceedings of ”The 6th SEAMS-UGM Conference 2011”
Statistics, pp. 781 - 790.
Abstract. In this paper, the construction of two-dimensional Weibull failure modeling for a system whose degradation is due to age and usage is studied. This failure model is based on the construction of a bivariate Weibull model from the consideration of component failure behaviors of a two-component system. The idea is originally taken from J. Baik et al., whose paper studies two-dimensional failure modeling, and also from Lu and Bhattacharyya, who studied constructions of bivariate Weibull models. Numerical examples are given to obtain the values of the cumulative failure rate function. The data set comes from part of warranty claims data for an automobile component (20 observations out of 497). This numerical example was made to give a brief illustration of how two-dimensional Weibull failure modeling deals with a case where the product has a Weibull failure distribution on its age and usage.
Keywords and Phrases : Weibull distribution, failure modeling, bivariate, reliability, minimal
repair.
1. INTRODUCTION
2. FAILURE MODELING
and Hunter [4], t x ht x . In the second approach, the modeling of system failure
involves a bivariate distribution. This approach was used by Murthy et al. [5], and Hunter, in the context of warranty cost analysis for two-dimensional warranties [2].
2.1. One-Dimensional Failure Modeling. According to Baik J. et al. [2], a system can be either repaired or replaced at each failure, and the durations of all such corrective maintenance actions are assumed to be small compared to the times between failures, so they can be ignored. Let the nonnegative random variable $T_n$ denote the time of the $n$-th system failure, with $n \geq 1$, and let $Y_n = T_n - T_{n-1}$ denote the time between the $(n-1)$-th and $n$-th failures, where $T_0 = 0$.
Suppose that $T$ has survival function
$$S(t) = P(T > t) \qquad (2)$$
density function
$$f(t) = \frac{dF(t)}{dt} \qquad (3)$$
and the hazard or failure rate function is defined as
$$h(t) = \frac{f(t)}{S(t)} \qquad (5)$$
The probability that the system will fail for the first time in the interval $[t, t + dt)$, given that it has not failed prior to $t$, is $h(t)\,dt + o(dt)$.
Successive system failures can be modeled in a point process formulation. Let $N_1(s, t)$ be the number of system failures in the interval $[s, t)$, $0 \leq s < t$, with $N_1(0, t)$ abbreviated to $N_1(t)$. The failure intensity function or rate of occurrence of failures (ROCOF) at $t$ is given by
$$\lambda(t) = \lim_{dt \to 0} \frac{P[N_1(t, t + dt) = 1]}{dt}, \qquad (6)$$
so the probability that a failure will occur in the interval $[t, t + dt)$ is $\lambda(t)\,dt + o(dt)$. Assuming simultaneous failures may not occur, it follows from (6) that $\lambda(t) = dE[N_1(t)]/dt$, or
$$\Lambda(t) = E[N_1(t)] = \int_0^t \lambda(r)\,dr \qquad (7)$$
$\Lambda(t)$ denotes the cumulative intensity function for the failure process.
Let $H_t$ denote the history of the failure process up to, but not including, time $t$ [2]. The conditional failure intensity function is then given by
$$\lambda(t \mid H_t) = \lim_{dt \to 0} \frac{P[N_1(t, t + dt) = 1 \mid H_t]}{dt}, \qquad (8)$$
which implies that $\lambda(t) = E[\lambda(t \mid H_t)]$. Thus, $\lambda(t)$ is the mean of $\lambda(t \mid H_t)$ averaged over all sample paths of the failure process [2].
2.2. Two-Dimensional Failure Modeling. It is now assumed that the degradation of a system depends on its age and usage. Let $T_n$ and $X_n$, $n \geq 1$, denote the time of the $n$-th system failure and the corresponding usage at that time. $Y_n = T_n - T_{n-1}$ denotes the time between the $(n-1)$-th and $n$-th failures, and $Z_n = X_n - X_{n-1}$ is the system usage during this period, where $T_0 = 0$ and $X_0 = 0$.
Referring to Baik J. et al. [2], in the two-dimensional approach to modeling failures, it is assumed that $(T, X)$ is a nonnegative bivariate random variable with survival function
$$S(t, x) = P(T > t, X > x) = \int_t^\infty \int_x^\infty f(u, v)\,dv\,du \qquad (10)$$
and if $F(u, v)$ is differentiable, then the bivariate failure density function is given by
$$f(t, x) = \frac{\partial^2 F(t, x)}{\partial t\,\partial x} \qquad (11)$$
$$h(t, x) = \frac{f(t, x)}{S(t, x)}, \qquad (12)$$
so the probability that the first system failure will occur in $[t, t + dt) \times [x, x + dx)$, given that $T > t$ and $X > x$, is $h(t, x)\,dt\,dx + o(dt\,dx)$.
Successive system failures can be modeled using a two-dimensional point process formulation [2]. Expanding the one-dimensional formulation used in (6), now let $N_2(s, t; w, x)$ denote the number of system failures in the rectangle $[s, t) \times [w, x)$ with $0 \leq s < t$, $0 \leq w < x$, and with $N_2(0, t; 0, x)$ abbreviated to $N_2(t, x)$. The failure intensity or rate of occurrence of failures (ROCOF) at the point $(t, x)$ is given by the function
$$\lambda(t, x) = \lim_{dt \to 0;\, dx \to 0} \frac{P[N_2(t, t + dt; x, x + dx) = 1]}{dt\,dx} \qquad (13)$$
So the probability that a failure may occur in $[t, t + dt) \times [x, x + dx)$ is $\lambda(t, x)\,dt\,dx + o(dt\,dx)$. Assuming simultaneous failures cannot occur, it follows from (13) that $\lambda(t, x) = \partial^2 E[N_2(t, x)]/\partial t\,\partial x$, or
$$H(t, x) = E[N_2(t, x)] = \int_0^t \int_0^x \lambda(u, v)\,dv\,du \qquad (14)$$
Let $H_{t,x}$ denote the history of the failure process up to, but not including, the point $(t, x)$. The conditional failure intensity function is then given by
$$\lambda(t, x \mid H_{t,x}) = \lim_{dt \to 0;\, dx \to 0} \frac{P[N_2(t, t + dt; x, x + dx) = 1 \mid H_{t,x}]}{dt\,dx} \qquad (15)$$
For a nonrepairable system, rectification involves replacing the failed item by a new one. If
the failure is detected immediately and the time to replace is negligible, so that it can be
ignored, then failures over the two-dimensional plane can be modeled by a two-dimensional
renewal process [2].
3.1. One-Dimensional Model. It is assumed that $T$ is the system's age when failure occurs. According to (1), $T$ has a Weibull distribution with
$$F(t) = 1 - e^{-(t/\theta)^{\beta}}, \qquad (16)$$
where $\theta > 0$ and $\beta > 0$. So, the survival distribution function according to (2) can be described as
$$S(t) = 1 - F(t) = e^{-(t/\theta)^{\beta}} \qquad (17)$$
The corresponding hazard or instantaneous failure rate function can then be defined by the following equation
$$h(t) = \frac{f(t)}{S(t)} = \frac{\frac{\beta}{\theta}\left(\frac{t}{\theta}\right)^{\beta-1} e^{-(t/\theta)^{\beta}}}{e^{-(t/\theta)^{\beta}}} = \frac{\beta}{\theta}\left(\frac{t}{\theta}\right)^{\beta-1} \qquad (18)$$
This failure rate is increasing (decreasing) for $\beta > 1$ ($\beta < 1$), and coincides with the exponential distribution for $\beta = 1$ [7].
Under minimal repair, the ROCOF is given by $\lambda(t) = h(t)$ and the resulting point process is referred to as a Weibull process. From (17), the expected number of system failures in the interval $[0, t)$ under minimal repair is given by
$$H(t) = \left(\frac{t}{\theta}\right)^{\beta} \qquad (19)$$
The expected number of system failures in the interval $[0, t)$ under replacement is a renewal function.
3.2. Two-Dimensional Model. Several approaches have been proposed for the construction of bivariate Weibull models by considering the failure behaviour of the systems. This model will be based on the model proposed by Lu and Bhattacharyya [3]. The following theorem, provided by Lu and Bhattacharyya, serves as a general method of constructing bivariate life models with specified marginals.
Theorem. Suppose $S(x, y \mid w) = \exp\{-[H_1(x) + H_2(y)]\,w\}$ represents the conditional survival function given $W = w$, where $W$ is a positive random variable with Laplace transform $\varphi$. Taking
$$H_1(x) = \varphi^{-1}(S_X(x)), \qquad H_2(y) = \varphi^{-1}(S_Y(y)) \qquad (20)$$
yields a bivariate survival model with the specified marginals $S_X$ and $S_Y$.
Proof. Any absolutely continuous and nondecreasing function $H(x)$ on $[0, \infty)$ such that $H(0) = 0$ and $H(x) \to \infty$ as $x \to \infty$ is a valid cumulative failure rate (CFR) function for a univariate life distribution. Letting $y = 0$ in $S(x, y \mid w)$ and taking expectation over $W$, we get the relation
$$S(x) = S(x, 0) = \int \exp\{-H_1(x)\,w\}\,\psi(w)\,dw = \varphi(H_1(x)).$$
Solving for $H_1(x)$ we get $H_1(x) = \varphi^{-1}(S(x))$, which is a valid CFR on $[0, \infty)$ in the light of the assumptions made on $\varphi(t)$. Similarly, $H_2(y)$ is also a valid CFR function. So $S(x, y \mid w)$ is a valid conditional model. The joint distribution of $(X, Y)$ is then
$$S(x, y) = E\left[\exp\{-[H_1(x) + H_2(y)]\,W\}\right] = \varphi(H_1(x) + H_2(y)) \equiv q(x, y).$$
The $X$-marginal of this joint survival function is $\varphi(H_1(x)) = S_X(x)$, which was initially targeted, and likewise for the $Y$-marginal.
Now, assuming that $(T, X)$ is a nonnegative bivariate random variable with a bivariate Weibull distribution function, consider the Weibull marginals
$$S_1(t) = e^{-(t/\theta_1)^{\beta_1}} \quad \text{and} \quad S_2(x) = e^{-(x/\theta_2)^{\beta_2}},$$
with $0 \leq t, x < \infty$, and let $\varphi(u) = e^{-u^{\gamma}}$, with $0 < \gamma \leq 1$. It is the Laplace transform of a positive stable distribution [3] and it satisfies the conditions of the theorem. Since $\varphi^{-1}(y) = (-\log y)^{1/\gamma}$, we obtain
$$H_1(t) = \left(\frac{t}{\theta_1}\right)^{\beta_1/\gamma}, \qquad H_2(x) = \left(\frac{x}{\theta_2}\right)^{\beta_2/\gamma} \qquad (22)$$
$$S(t, x) = \exp\left\{-\left[\left(\frac{t}{\theta_1}\right)^{\beta_1/\gamma} + \left(\frac{x}{\theta_2}\right)^{\beta_2/\gamma}\right]^{\gamma}\right\}, \qquad 0 < \gamma \leq 1. \qquad (23)$$
Obviously, in this end result, the shape parameters and $\gamma$ are not individually identifiable. Combining them into a single dependence parameter $\delta$, with $0 < \delta \leq 1$, the bivariate Weibull model can be described as
$$S(t, x) = \exp\left\{-\left[\left(\frac{t}{\theta_1}\right)^{\beta_1} + \left(\frac{x}{\theta_2}\right)^{\beta_2}\right]^{\delta}\right\} \qquad (24)$$
Note that $X$ and $T$ are not independent. Writing $u = (t/\theta_1)^{\beta_1} + (x/\theta_2)^{\beta_2}$, the corresponding failure density function is
$$f(t, x) = \frac{\beta_1 \beta_2}{\theta_1 \theta_2}\left(\frac{t}{\theta_1}\right)^{\beta_1 - 1}\left(\frac{x}{\theta_2}\right)^{\beta_2 - 1} \delta\, u^{\delta - 2}\left[\delta u^{\delta} + (1 - \delta)\right] \exp\left(-u^{\delta}\right) \qquad (25)$$
and the bivariate hazard/failure rate and cumulative failure rate functions are, respectively,
$$h(t, x) = \frac{f(t, x)}{S(t, x)} = \frac{\beta_1 \beta_2}{\theta_1 \theta_2}\left(\frac{t}{\theta_1}\right)^{\beta_1 - 1}\left(\frac{x}{\theta_2}\right)^{\beta_2 - 1} \delta\, u^{\delta - 2}\left[\delta u^{\delta} + (1 - \delta)\right] \qquad (26)$$
$$H(t, x) = -\ln S(t, x) = \left[\left(\frac{t}{\theta_1}\right)^{\beta_1} + \left(\frac{x}{\theta_2}\right)^{\beta_2}\right]^{\delta} \qquad (27)$$
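As a check on the form of (24)-(27), a minimal numerical sketch is given below; the parameter values are illustrative placeholders only, not the estimates of the case study.

```python
import numpy as np

# Illustrative (placeholder) parameters of the bivariate Weibull model (24)
theta1, beta1 = 150.0, 1.6   # age scale/shape
theta2, beta2 = 230.0, 1.3   # usage scale/shape
delta = 0.4                  # dependence parameter, 0 < delta <= 1

def u(t, x):
    return (t / theta1) ** beta1 + (x / theta2) ** beta2

def survival(t, x):          # S(t, x) in (24)
    return np.exp(-u(t, x) ** delta)

def cum_failure_rate(t, x):  # H(t, x) = -ln S(t, x) in (27)
    return u(t, x) ** delta

def hazard(t, x):            # h(t, x) in (26)
    v = u(t, x)
    return (beta1 * beta2 / (theta1 * theta2)
            * (t / theta1) ** (beta1 - 1) * (x / theta2) ** (beta2 - 1)
            * delta * v ** (delta - 2) * (delta * v ** delta + 1 - delta))

print(survival(100.0, 200.0), cum_failure_rate(100.0, 200.0), hazard(100.0, 200.0))
```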
4. CASE STUDY
The cumulative failure rate values are shown in the following table, with scale parameters $\theta_1 = 154.84$ and $\theta_2 = 23{,}324$, shape parameters $\beta_1 = 1.63$ and $\beta_2 = 1.29$, and $\delta = 0.38$. The parameters were estimated using the maximum likelihood method and computed numerically using the Newton-Raphson method. Computation was done in MATLAB 7.10.0. The results are displayed in the following table.
Age at          Used KM at Failure
Failure     0       100     200     300     400     500     600
0           0       0.335   0.820   1.383   2.005   2.674   3.383
50          0.158   0.352   0.824   1.385   2.006   2.674   3.383
100         0.490   0.552   0.894   1.417   2.023   2.685   3.391
150         0.949   0.972   1.156   1.560   2.107   2.739   3.428
200         1.517   1.528   1.625   1.891   2.327   2.888   3.533
250         2.183   2.189   2.245   2.413   2.729   3.186   3.755
300         2.938   2.942   2.977   3.086   3.308   3.659   4.130
Table 1. Cumulative Failure Rate Values
5. CONCLUDING REMARKS
In this paper, two-dimensional Weibull failure modeling has been studied based on the ideas of Baik J. et al. [2] and Lu and Bhattacharyya [3]. The numerical example shows that the highest value of the cumulative hazard rate is given by the highest KM used and the oldest age at failure. This result indicates that, as age and usage increase, the cumulative hazard rate function also has an increasing pattern.
References
[1] WALPOLE, R. E., MYERS, R. H., AND YE, K., Probability & Statistics for Engineer & Scientists, 7th ed.,Pearson
Prentice Hall, United States , 2007.
[2] BAIK, J., MURTHY, D. N. P., AND JACK, N., Two-Dimensional Failure Modeling with Minimal Repair, Wiley
Periodical Inc., 2003.
[3] LU, J. C. AND BHATTACHARYYA, G. K., Some New Construction of Bivariate Weibull Models, Ann. Inst.
Statist. Math., 1990.
[4] HUSNIAH, H., PASARIBU, U. S., HALIM, A. H., AND ISKANDAR, B. P., A Hybrid Minimal Repair and Age Replacement Policy for Warranted Products, 2nd Pacific Conference on Manufacturing System, 2009.
[5] BLISCHKE, W. R. AND MURTHY, D. N. P., Warranty Cost Analysis, Marcel Dekker, London, 1994.
[6] NAKAGAWA, T., Maintenance Theory of Reliability, Springer-Verlag London Limited, 2005.
[7] BLISCHKE, W. R., KARIM, M. R., MURTHY, D. N. P., Warranty Data Collection and Analysis, Springer Series
in Reliability Engineering, Springer-Verlag London Limited, 2011.
INDIRA P. KINASIH
Master student at the Faculty of Mathematics and Natural Sciences, Bandung Institute of
Technology. She is also a lecturer at the Faculty of Mathematics and Natural Sciences
Education, IKIP Mataram.
e-mail: [email protected]
UDJIANNA S. PASARIBU
Associate Professor and Lecturer at the Faculty of Mathematics and Natural Sciences,
Bandung Institute of Technology. Her research interests include Stochastic Process and
Space Time Analysis.
e-mail: [email protected]
Proceedings of ”The 6th SEAMS-UGM Conference 2011”
Statistics, pp. 791 - 800.
JAKA NUGRAHA
Abstract. We have studied estimator properties of multivariate binary Probit models using a simulation study. Estimation of the parameters is performed by the GEE, MLE and MSLE methods. The statistical software used in the calculations is R 2.8.1. The Probit model can be applied to binary multivariate responses by using the MLE and GEE estimation methods. Based on the simulation data, the MSLE estimator is inappropriate for the multivariate Probit model. We recommend combining GEE and MLE: GEE can be used to estimate the regression parameters, while MLE can be used to estimate the correlation parameters only.
Keywords and Phrases :Discrete Choice Model, MLE, GEE, simulation study.
1. INTRODUCTION
The Discrete Choice Model (DCM) is a model constructed on the assumption that the decision maker faces a choice among a group of alternatives based on their utilities. The alternatives or responses are nominal and one of them has the maximum utility. In this case, the decision maker can be a person, family, company or other decision-making unit. DCM involves two connected activities: determination of the model and calculation of the proportion for each choice. The models that have been widely discussed are the Logit model and the Probit model. The parameter estimation methods used are the Maximum Likelihood Estimation (MLE) method, the method of moments and the Generalized Estimating Equation (GEE) method.
Some researchers have studied similar estimating methods on panel binary responses. GEE estimators are invariant, consistent and asymptotically normal [1, 2]. In the binary panel Probit model, MLE is the best compared to the Solomon-Cox approximation or the Gibbs sampler [3, 4]. The Probit model requires multiple integrals, which can be evaluated using the Geweke-Hajivassiliou-Keane (GHK) simulator [5, 6]. Frequently, several dependent variables are observed for each individual. Because the data include simultaneous measurements on many variables, such data are called multivariate data. Although the applications of multivariate binary response models are extensive, research on multivariate binary response models has received little attention. For binary responses, MLE and GEE are consistent estimators [7] and the estimators
of the regression parameters are not influenced by the correlation [8]. For the multivariate binary Logit model, GEE is more efficient than the univariate approximation, but the estimator of the correlation in GEE tends to be underestimated [9]. The Probit model can be used for multivariate binary responses with several parameter estimation methods, such as GEE, MLE and MSLE based on GHK simulation [10, 11].
Based on the development of binary response models, which is also supported by the computational field, we studied the properties of the estimators using a simulation study on multivariate binary response data. Modeling of the multivariate binary response uses the Probit model, and estimation of the parameters is performed by the GEE, MLE and MSLE methods.
It is assumed that $Y_{it}$ is a binary response, with $Y_{it} = 1$ if subject $i$ at response $t$ chooses alternative 1 and $Y_{it} = 0$ if subject $i$ at response $t$ chooses alternative 2. Each individual has a covariate $X_i$ describing the characteristics of individual $i$ and a covariate $Z_{ijt}$ describing the characteristics of choice/alternative $j$ for individual $i$.
The utility of subject $i$ selecting alternative $j$ on response $t$ is
with
By assuming that the decision maker selects the alternative with the maximum utility value, the model can be expressed in terms of the difference of utilities
with
and $\varepsilon_{it} = (\varepsilon_{i1t} - \varepsilon_{i0t})$. The probability of subject $i$ selecting $(y_{i1} = 1, \dots, y_{iT} = 1)$ is
It is assumed that $R$ is the correlation matrix of $\varepsilon_i$, so $\Omega = Q_i R Q_i$, where $Q_i$ is a diagonal matrix with its
$y_{it} = 0$ if respondent $i$ chooses the first alternative and $y_{it} = 1$ if respondent $i$ chooses the second alternative. So, $\omega_{ts} = (2y_{it} - 1)(2y_{is} - 1)\sigma_{ts}$. To simplify the notation, take $V_{it} = \beta_t' X_{it}$ ($\beta_t$ is the identified parameter; the remaining parameters are not included).
Here $w_i = (w_{i1}, \dots, w_{il}, \dots, w_{iT})'$. The log-likelihood function is
$$LL(\beta; \Omega) = \sum_{i=1}^{n} \ln \Phi_T(w_i; 0; \Omega) \qquad (1)$$
or it can also be represented as
$$LL(\beta; R) = \sum_{i=1}^{n} \ln \Phi_T(w_i; 0; R).$$
Estimation of $\beta$ and $\Omega$ (or $R$) using the MLE method can be derived from the likelihood function in equation (1). Define $\Omega^l$ as the matrix $\Omega$ rearranged so that the $l$-th component is placed last, partitioned as
$$\Omega^l = \begin{pmatrix} \Omega_{11}^l & \Omega_{12}^l \\ \Omega_{21}^l & 1 \end{pmatrix}$$
and define $\Omega^{kl}$ as the matrix $\Omega$ rearranged so that the $k$-th and $l$-th components are placed last, partitioned as
$$\Omega^{kl} = \begin{pmatrix} \Omega_{11}^{kl} & \Omega_{12}^{kl} \\ \Omega_{21}^{kl} & \Omega_{22}^{kl} \end{pmatrix}.$$
The first derivative of the log-likelihood function (1) with respect to the parameter $\beta_{1l}$ is
$$\frac{\partial LL(\beta; \Omega)}{\partial \beta_{1l}} = \sum_{i=1}^{n} \frac{\phi(w_{il}; 0; 1)\,\Phi_{T-1}(w_{i,-l}; M^l; S^l)}{\Phi_T(w_i; 0; \Omega)}\,(2y_{il} - 1)\,x_{il} \qquad (2)$$
with $M^l = \Omega_{12}^l\, w_{il}$, $S^l = \Omega_{11}^l - \Omega_{12}^l \Omega_{21}^l$, and $w_{i,-l} = (w_{i1}, \dots, w_{i(l-1)}, w_{i(l+1)}, \dots, w_{iT})$.
The first derivative of the log-likelihood function (1) with respect to the correlation parameter $\rho_{kl}$ is
$$\frac{\partial LL(\beta, R \mid x)}{\partial \rho_{kl}} = \sum_{i=1}^{N} \frac{\phi_2(w_{ik}, w_{il}; 0, \Omega_{22}^{kl})\,\Phi_{T-2}(w_{i,-kl}; M_i^{kl}; S^{kl})}{\Phi_T(w_i; \Omega)}\,(2y_{ik} - 1)(2y_{il} - 1) \qquad (3)$$
n ( w ;0;1) l l 2
2 LL( ; ) it T 1 ( wi , l ; M ; )[(2 yil 1) xil ]
.
12l i 1 T ( wi ;0; )
1 (w ;0;1)
it T 1 ( wi , l ; M l ; l ) (4)
2 n
LL ( ; ) ( wik ;0;1) ( wil ;0;1)( 2 yil 1)( 2 yik 1) xik xil
.
1k 1l i 1 T ( wi ;0; )
( T 2 (wi , kl ; M kl ; kl ) 2T 1 (wi, k ; M k ; k )T 1 (wi, l ; M l ; l ) (5)
kl kl kl kl
with M kl 12 kl
wik ; 11 12 21
and wi,-kl = (wi1,.., wi(k-1),wi(k+1),wi(l-),wi(l+1), ...,wiT)
kl kl
2 LL( , R | x ) n
T 2 ( wi , kl , M i ; S )
( 2 yik 1)(2 yil 1)
2kl i 1 [ T ( wi ; )]2
(w ; ) A1 (w , w ;0,
T i 2 ik il
kl
22 2
) T 2 ( i , kl ; M kl ; S kl )(2 yik 1)(2 yil 1) (6)
with A1
2 ( wil , wik ; kl )
2 (1 ) 2
(w w )
il ik (1 kl2 ) ln(1 kl2 ) 4 kl
kl
kl kl k
2 LL( , R | x) n (2 yik 1)(2 yil 1)T 2 (wi, kl , Mi ; S ) (wil ;0;1) (wik ; m ;1)(2 yil 1) xil
l kl i1 (wi ; )
n (2 y 1)(2 y 1) 2 x (w , w ;0, kl ) kl kl l l
ik il il 2 ik il 22 T 2 (wi , kl , M i ; S ) ( wil ;0;1).T 1 (wi , l ; M ; ) (7)
i 1 [(wi ; )]2
2.2 MSLE on the Multivariate Binary Probit Model. The Probit model is based on the assumption that the vector $\varepsilon_i = (\varepsilon_{i1}, \dots, \varepsilon_{iT})'$ in equation (5) has a multivariate normal distribution with mean zero and covariance matrix $\Omega$. The marginal probability (for each $t$ and $i$) is
$$\Phi(V_{it}) = \int_{-\infty}^{V_{it}} \frac{1}{(2\pi\sigma_t^2)^{1/2}} \exp\left[-\frac{1}{2\sigma_t^2}\varepsilon_{it}^2\right] d\varepsilon_{it}$$
$$P(Y_{it} = y_{it}) = \Phi_{it}^{y_{it}}\,(1 - \Phi_{it})^{1 - y_{it}} \quad \text{for } y_{it} = 0, 1.$$
From the symmetry of the normal distribution, equation (8) can be expressed as
$$P(Y_{i1} = y_{i1}, \dots, Y_{iT} = y_{iT}) = P\left[\varepsilon_{i1} \leq (2y_{i1} - 1)V_{i1}, \dots, \varepsilon_{iT} \leq (2y_{iT} - 1)V_{iT}\right] = \Phi_T(w_i; 0; \Omega)$$
$$LL(\beta; \Omega) = \sum_{i=1}^{n} \log \Phi_T(w_i; 0; \Omega)$$
Following the GHK approach, write
$$U_{it} = V_{it} + \sum_{l=1}^{t} c_{tl}\,\eta_{li} \quad \text{for } t = 1, \dots, T \text{ and } \eta_i \sim N(0, I) \qquad (9)$$
The GHK simulator for the $r$-th draw is
$$\tilde{\Phi}_i^{(r)} = \prod_{t=1}^{T} \Phi_{it}^{(r)} = \prod_{t=1}^{T} \Phi\left(\frac{(2y_{it} - 1)V_{it} - \sum_{k=1}^{t-1} c_{tk}\,\eta_k^{(r)}}{c_{tt}}\right).$$
Therefore
$$\tilde{\Phi}_i = \frac{1}{R}\sum_{r=1}^{R} \tilde{\Phi}_i^{(r)}$$
and the simulated log-likelihood is
$$\mathrm{sim}\log L(\theta) = \sum_{i=1}^{n} \log\left[\frac{1}{R}\sum_{r=1}^{R} \tilde{\Phi}_i^{(r)}\right] = \sum_{i=1}^{n} \log\left[\frac{1}{R}\sum_{r=1}^{R} \prod_{t=1}^{T} \Phi_{it}^{(r)}\right].$$
If the utility model as represented in equation (9) satisfies the regularity conditions and $\varepsilon_i = (\varepsilon_{i1}, \dots, \varepsilon_{iT})$ has a multivariate normal distribution with mean zero and covariance matrix $\Omega$, then, by using the GHK simulation, the MLE for the parameter $\theta = (\beta, \Omega)$ is the solution of the estimating equation:
(r )
n
1 R ~ (r ) T
a li( r )
1
1
R i
R r 1
li
r
.
0
i 1
R
~
r 1
i
(r ) l 1 li
where
l 1 clh ( r ) (ahi( r ) ) a hi( r ) ( 2 yil 1) Vil
ali( r )
u hi . . , l 1
c ( hi( r ) ) cll
h1 ll
(2 y il 1) Vil
, l 1
c11
where $i = 1, \dots, n$ and $t = 1, 2, 3$; $j = 0, 1$; $\varepsilon_{ijt} \sim N(0, 1)$. Data were generated with the parameter values $\alpha_t = -1$, $\beta_t = 0.5$ and $\gamma_t = 0.3$.
The correlation structure examined is $r_{12} = \rho$ and $r_{13} = r_{32} = 0$. The utility at $t = 1$ is correlated with the utility at $t = 2$ with correlation values $\rho = 0, 0.2, \dots, 0.9$. The values of the observed variables $X_i$ and $Z_{ijt}$ were taken from normal distributions,
$$X_i \sim N(0, 1); \quad Z_{i0t} \sim N(0, 1); \quad Z_{i1t} \sim N(2, 1).$$
The survey of the effect of the correlation on the estimators was conducted for GEE, MLE and MSLE with $n = 1000$. For each sample, 50 iterations were performed. The results of the parameter estimation are presented in Table 1, Table 2 and Table 3.
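For illustration, the sketch below generates trivariate binary Probit data of this kind. The exact utility specification of the paper is not reproduced here; the difference-utility form used below (intercept plus the individual covariate plus the difference of the alternative-specific covariates) is only a plausible reading of the design, and the correlation value is an arbitrary choice.

```python
import numpy as np

rng = np.random.default_rng(1)
n, T = 1000, 3
alpha, beta, gamma = -1.0, 0.5, 0.3           # true parameters (same for every t)
rho = 0.5                                     # correlation between the errors at t=1 and t=2

# Error correlation: r12 = rho, r13 = r23 = 0
R = np.array([[1.0, rho, 0.0],
              [rho, 1.0, 0.0],
              [0.0, 0.0, 1.0]])

X = rng.normal(0, 1, size=n)                  # individual characteristic
Z0 = rng.normal(0, 1, size=(n, T))            # attribute of alternative 0
Z1 = rng.normal(2, 1, size=(n, T))            # attribute of alternative 1
eps = rng.multivariate_normal(np.zeros(T), R, size=n)   # correlated utility-difference errors

# Assumed utility-difference specification (illustrative only)
V = alpha + beta * X[:, None] + gamma * (Z1 - Z0)
Y = (V + eps > 0).astype(int)                 # choose alternative 1 when the difference is positive
print(Y.mean(axis=0))                         # observed choice proportions per response t
```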
Based on Table 1, GEE is a good method for estimating the regression parameters, except the correlation parameter ($\rho_{21}$). The estimators of $\alpha$, $\beta$, and $\gamma$ are close (small bias) to the true parameters at all correlation values. So, the value of the correlation among the utilities does not influence the GEE estimates of the regression parameters.
Utility 1 ($U_{i1}$) is correlated with utility 2 ($U_{i2}$) and both utilities are not correlated with utility 3 ($U_{i3}$). Therefore the correlation value only affects the parameters within $U_{i1}$ and $U_{i2}$. For both utilities, the bias of the estimator is high and proportional to the value of the correlation. The MLE of parameters 2 and 3 is not good because it produces high bias (see Table 2). The estimator of $\rho_{21}$ from MLE is better than from GEE.
The parameter estimation using GHK simulation produces very high bias (see Table 3). On the other side, the value of the estimator is influenced by the initial estimate. A problem encountered in the Probit model is that its log-likelihood function is not globally concave. This makes the global maximum difficult to find, and the computation is not efficient (time consuming) in reaching the convergence point.
[Figure: bias (vertical axis, 0 to 0.35) versus correlation (Korelasi, horizontal axis, 0 to 1) for the GEE and MLE estimators.]
4. CONCLUDING REMARKS
The Probit model can be applied to binary multivariate responses using the MLE and GEE estimation methods. Based on the simulation data,
1. The GEE estimator for the regression coefficients is not affected by the value of the correlation between the responses.
2. The MLE estimator for the regression coefficients is affected by the value of the correlation between the responses.
3. The GEE estimator for the correlation parameter tends to be underestimated, whereas the MLE method is more accurate for estimating the correlation parameter.
4. The MSLE estimator is not appropriate for the multivariate Probit model.
Open Problem
In this research, the parameter estimation methods used are MLE and GEE. It is very possible to use other estimation methods, such as Bayesian methods. From the computational side, a simulation method applicable to the Probit model needs to be developed to overcome the limitations of the GHK method.
References
[1] LIANG, K.Y., AND ZEGER,S.L, Longitudinal Data Analysis Using Generalised Linear Models, Biometrika73, 13-
22, 1986.
[2] PRENTICE, Correlated Binary Regression with Covariates Specific to Each Binary Observation. Biometrics 44,
1043-1048, 1988.
[3] HARRIS, M.N, MACQUARIE L.R AND SIOUCLIS AJ., Comparison of alternative Estimators for Binary Panel Probit
Models, Melbourne Institute Working Paper no 3/00, 2000
[4] CONTOYANNIS P, ANDREW M. J, AND R ICE N, Dynamics of Health in British Household: Simulation-Based
Inference in Panel Probit Model, Working Paper, Department of Economics and Related Studies, University of
York, 2001.
[5] HAJIVASSILIOU, V., D. MCFADDEN, AND R UUD P., Simulation of Multivariate Normal Rectangle Probabilities and
Their derivatives: Theoretical and Computational Results, Journal of Econometrics 72, 85–134, 1996.
[6] GEWEKE J.F., KEANE M.P., ANDRUNKLE D.E., Statistical Inference in The Multinomial MultiperiodeProbit
Model, Journal of Econometrics 80, 125-165, 1997.
[7] NUGRAHA J., GURITNO S., ANDHARYATMI S., Logistic Regression Model on Multivariate Binary Response Using
Generalized Estimating Equation, National Seminar on Math and Education of Math conducted by UNY,
Indonesia, 2006.
[8] NUGRAHA J, H ARYATMI S. AND G URITNOS, A Comparison of MLE and GEE on Modeling Binary Panel
Response, ICoMS3th IPB, 2008.
[9] NUGRAHA J., HARYATMI, ANDGURITNO, Logistic Regression Model on Multivariate Binary Response Using
Generalized Estimating Equation, Proceeding of National Seminar on Mathematicsconducted by UNY,
Indonesia FMIPA UNY, 2006.
[10] NUGRAHA J., GURITNO S.,AND H ARYATMIS., Likelihood Function and its Derivatives of Probit Model on
Multivariate Biner Response, JurnalKalam, Vol. 1 No. 2, Faculty of Science and Technology, Universiti
Malaysia Terengganu, Malaysia, 2008
[11] NUGRAHA J., GURITNO S.,AND HARYATMIS., ProbitModel on Multivariate Binary ResponsUsing SMLE,
JurnalIlmuDasar, FMIPA Univ. Jember, 2010.
[12] LECHNER M, LOLLIVIER S AND MAGNAGT, Parametric Binary Choice Models, Discusion paper no 2005-23,
2005.
[13] HAJIVASSILIOU V., MCFADDEN, AND RUUD P., Simulation of Multivariate Normal Rectangle Probabilities and
Their derivatives: Theoretical and Computational Results, Journal of Econometrics 72, 85–134, 1996.
[14] TRAIN, KENNETH, Discrete Choice Methods with Simulation, UK Press, Cambridge, 2003.
JAKA NUGRAHA
Dept. of Statistics, Islamic University of Indonesia, Kampus Terpadu UII, Jl. Kaliurang
Km.14, Yogyakarta, Indonesia
e-mails: [email protected] or [email protected]
Proceedings of ”The 6th SEAMS-UGM Conference 2011”
Statistics, pp. 801 - 812.
KARIYAM
1. INTRODUCTION
$$S_{ij} = \frac{ad - bc}{\sqrt{(a + b)(c + d)(a + c)(b + d)}} \qquad (1)$$
Moreover, it can also be applied to the similarity of Yule's Q, the similarity of Yule's Y, and the Hamann similarity, as in equations (2), (3), and (4):
$$\text{Yule's Q}: \quad S_{ij} = \frac{ad - bc}{ad + bc} \qquad (2)$$
$$\text{Yule's Y}: \quad S_{ij} = \frac{\sqrt{ad} - \sqrt{bc}}{\sqrt{ad} + \sqrt{bc}} \qquad (3)$$
$$\text{Hamann}: \quad S_{ij} = \frac{(a + d) - (b + c)}{a + b + c + d} \qquad (4)$$
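A minimal sketch of these four similarity measures, computed from the counts a, b, c, d of a 2x2 table for a pair of dichotomous variables, is given below (illustrative code with made-up counts, not the author's implementation).

```python
import math

def similarities(a, b, c, d):
    """Similarity measures (1)-(4) for a 2x2 table of two dichotomous variables.

    a: both variables 1, b: first 1 / second 0, c: first 0 / second 1, d: both 0.
    """
    pearson = (a * d - b * c) / math.sqrt((a + b) * (c + d) * (a + c) * (b + d))
    yule_q = (a * d - b * c) / (a * d + b * c)
    yule_y = (math.sqrt(a * d) - math.sqrt(b * c)) / (math.sqrt(a * d) + math.sqrt(b * c))
    hamann = ((a + d) - (b + c)) / (a + b + c + d)
    return pearson, yule_q, yule_y, hamann

# Example: a hypothetical cross-tabulation of two yes/no quality items
print(similarities(a=50, b=10, c=8, d=32))
```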
A proximity matrix of similarities among variables was used as the basis for the hierarchical grouping analysis. A number of linkage methods can be implemented consistently with the type of dichotomous variables, for example the complete linkage method, the between-group linkage method, the within-group linkage method, or centroid clustering. One problem in variable clustering is that differences in the proximity matrix and differences in the linkage method can result in different group memberships, even though the same number of variable groups is formed [9].
2. RELATED WORK
similarity measures for categorical data [6], clustering binary sequences through a two-step iterative procedure [7], cluster analysis and categorical data [8], and a comparison of different approaches to hierarchical clustering of ordinal data [10]. This paper discusses the application of dichotomous-variable clustering to the quality control of the reconstruction process of small-type houses after the Yogyakarta earthquake.
3.1. Materials. The author uses 8123 records, with the consent of the team, where the author is also a member of the quality assurance team for the rehabilitation and reconstruction of the quake victims' homes. Forty variables were derived from eleven major components of the house-building process, which include the availability of design, map, foundation of the house, sloof, columns, wall, ring beams, reinforcement at the joint of the beam ends and column, the connection reinforcement, bearing wall, and easel. The detailed statements that must be answered with the dichotomous values Yes (1) or No (0) by the quake victims are listed in Table 2 [3].
Component                     Code   Observation                                                  Yes  No
(continuation of preceding row)      ... 2sand : 3gravel
F  Wall                       F20    Broad of wall < 9 m²
                              F21    Existing anchor in wall
                              F22    Composition 1pc : 4sand
G  Ring Beams                 G23    Minimal of size 12 cm x 15 cm
                              G24    Minimal reinforcement in ring beams: 4d12 mm
                              G25    Size of begel in column: d8 mm x 15 cm or d6 mm x 12.5 cm
                              G26    Condition of ring beams concrete (not porous)
                              G27    Composition of ring beams concrete 1pc : 2sand : 3gravel
H  Detail of Reinforcement    H28    Reinforcement in the end of angle with size: length 40 cm
I  Connection of Reinforcement I29   Minimum overlap: 40 cm
J  Bearing Wall               J30    Existing of sloping ring beams
                              J31    Condition of sloof concrete (not porous)
                              J32    Size of sloping ring beams 12 cm x 15 cm
                              J33    Reinforcement of sloping ring beams: 4d12 mm
                              J34    Size of begel in column: d8 mm x 15 cm or d6 mm x 12.5 cm
                              J35    Existing of knot of wind
K  Easel                      K36    Minimal size of wood: 6 cm x 12 cm
                              K37    Hookup with begel
                              K38    Existing of knot of wind
                              K39    Existing of anchor
                              K40    Wood color is dark
linkage. Meanwhile, the distance measures used were the Pearson correlation, Yule's Q, Yule's Y, and Hamann. The outline of the applied procedure is as follows (a small clustering sketch follows the list):
(i) apply a hierarchical cluster analysis based on the Pearson correlation matrix and the complete linkage method;
(ii) apply between-group linkage, within-group linkage, and centroid clustering, and compare the results with the complete linkage;
(iii) apply the similarity of Yule's Q, the similarity of Yule's Y and the Hamann similarity, and compare;
(iv) separate the data into two sets, and compare the level of conformity of the groups of variables using the results of steps (ii) and (iii).
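As an illustration of step (i), the sketch below clusters the columns of a binary data matrix with complete linkage, using one minus the Pearson correlation between variables as the distance; the data here are random placeholders, not the survey data.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import squareform

rng = np.random.default_rng(0)
X = rng.integers(0, 2, size=(200, 12))        # placeholder: 200 houses x 12 yes/no items

corr = np.corrcoef(X, rowvar=False)           # Pearson correlation between variables
dist = 1.0 - corr                             # turn similarity into a dissimilarity
np.fill_diagonal(dist, 0.0)

Z = linkage(squareform(dist, checks=False), method="complete")   # complete linkage
labels = fcluster(Z, t=4, criterion="maxclust")                   # cut into 4 variable groups
print(labels)
```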
Furthermore, cluster analysis was applied using different linkage methods, namely within-group, between-group, and centroid linkage. The agreement in group membership of the variables between complete linkage and within-group linkage was 60%. The agreement between complete linkage and between-group linkage was 80%, while complete and centroid linkage produced an agreement of 83%. Thus, it can be said that the complete linkage method works well for clustering the building-quality variables, as shown in Figure 2.
Furthermore, by using the complete linkage method, the data were analyzed with different similarities, namely Yule's Q, Yule's Y and Hamann, and compared with Pearson. The comparison of the percentage of agreement of the members of the variable groups for these similarities is shown in Figure 3.
It shows that the Pearson correlation can be used to analyze the building-quality variable data. The application of the different distances produced a level of agreement of each group of variables of 95%, except for the Hamann similarity, which obtained 58%. The data were then separated into two data sets, the first containing 6088 records and the second containing 2035 records. Each data set was analyzed by cluster analysis using complete linkage and the Pearson correlation. This step produced a high level of agreement of variable memberships, 95%. The difficulty of such a series of procedures is the calculation needed to obtain the percentage of conformity of the members of the variable groups. This is because the group labels differ when the linkage and similarity methods differ, so the conversion must be done manually with the help of Microsoft Excel.
Furthermore, the groups of variables based on the dendrogram plot in Figure 1, which formed 11 groups, are listed in Table 3.
Code   Name of variable                                             Cluster   Name of cluster
(1)    (2)                                                          (3)       (4)
E16.   Minimal reinforcement in column: 4d12 mm                     6         Reinforcement
G24.   Minimal reinforcement in ring beams: 4d12 mm                 6
J33.   Reinforcement of sloping ring beams: 4d12 mm                 6
D13.   Condition of sloof concrete (not porous)                     7         Quality of concrete
E18.   Condition of column concrete (not porous)                    7
F20.   Broad of wall < 9 m²                                         7
G26.   Condition of ring beams concrete (not porous)                7
H28.   Reinforcement in the end of angle with size: length 40 cm    8         Detail of reinforcement
I29.   Minimum overlap: 40 cm                                       8
J30.   Existing of sloping ring beams                               9         Quality of bearing wall
J31.   Condition of sloof concrete (not porous)                     9
J32.   Size of sloping ring beams 12 cm x 15 cm                     9
J34.   Size of begel in column: d8 mm x 15 cm or d6 mm x 12.5 cm    9
J35.   Existing of knot of wind                                     10        Knot of wind
K38.   Existing of knot of wind                                     10
K36.   Minimal size of wood: 6 cm x 12 cm                           11        Easel
K37.   Hookup with begel                                            11
K39.   Existing of anchor                                           11
K40.   Wood color is dark                                           11
The groups of variables can be named: availability of design and map, foundation of house, foundation of house, stone of foundation, composition of mixture and size of concrete, size of reinforcement, quality of concrete, detail of reinforcement, quality of bearing wall, knot of wind, and quality of easel. Furthermore, as a comparison, factor analysis based on the Pearson correlation matrix has also been applied; based on eigenvalues above one, twelve factors were found. However, in this paper eleven groups of variables or factors are selected, because these results are relatively stable and valid, and easily interpreted in the context of the original problem.
The detail of the analysis, especially the percentage of nonconformities of the buildings with the earthquake-resistant standard, is shown in Figure 4.
5. CONCLUDING REMARK
References
[1] ADELFIO, G., CHIODI, M., AND LUZIO, D., An Algorithm for Earthquakes Clustering Based on Maximum Likelihood, Proceedings of the 6th Conference of the Classification and Data Analysis Group of the Societa Italiana di Statistica, Springer New York, Part II: Cluster Analysis, 25-32, 2010.
[2] BAI, L., LIANG, J., DANG, C., AND CAO, F., A Novel Attribute Weighting Algorithm for Clustering High-Dimensional Categorical Data, Pattern Recognition, 44, 2843-2861, 2011.
[3] DINAS PEKERJAAN UMUM DIY, Laporan Akhir pada pekerjaan Quality Assurance (QA) dan Quality Control (QC) Pelaksanaan Rehabilitasi/Rekonstruksi Pasca Gempa Bumi di D.I. Yogyakarta dan Jawa Tengah, DPU Yogyakarta, 2007.
[4] FINCH, H., Comparison of Distance Measures in Cluster Analysis with Dichotomous Data, Journal of Data Science, Vol. 3, 85-100, 2005.
[5] HARDLE, W., AND SIMAR, L., Applied Multivariate Statistical Analysis, Second Edition, Springer-Verlag, 2007.
[6] KUMAR, V., CHANDOLA, V., AND BORIAH, S., Similarity Measures for Categorical Data: A Comparative Evaluation, SIAM, 2008.
[7] PALUMBO, F., AND D'ENZA, A.I., A Two-Step Iterative Procedure for Clustering of Binary Sequences, Proceedings of the 6th Conference of the Classification and Data Analysis Group of the Societa Italiana di Statistica, Springer New York, Part II: Cluster Analysis, 33-40, 2010.
[8] REZANKOVA, H., Cluster Analysis and Categorical Data, http://panda.hyperlink.cz/cestapdf/pdf09c3/rezankova.pdf, 2010.
[9] TIMM, N.H., Applied Multivariate Analysis, Springer, 2002.
[10] ZIBERNA, A., KEJZAR, N., AND GOLOB, P., A Comparison of Different Approaches to Hierarchical Clustering of Ordinal Data, Journal of Metodoloski zvezki, Vol. 1, No. 1, 57-73, Slovenia, 2004.
KARIYAM
Department of Statistics, Faculty of Mathematics and Natural Science
Islamic University of Indonesia
e-mail: [email protected]
Proceedings of ”The 6th SEAMS-UGM Conference 2011”
Statistics, pp. 813 - 820.
Keywords and Phrases: Employee Stock Options, Hull-White ESO’s model, Monte Carlo
Method
1. INTRODUCTION
Employee Stock Options (ESOs for short) are call options granted by a company to an employee on the stock of the company. ESOs differ from standard traded options in at least the following respects: ESOs can be exercised only after a vesting period; ESOs cannot be transferred; and in case the employee leaves the company during the vesting period the ESOs are forfeited. Regarding these specific features of ESOs, the use of a lattice model in valuing ESOs, such as the Hull-White ESO model [3], is preferable. ESOs encourage the employee to remain in the company and to work towards improvement of the company's earnings and management, which will result in an increase of the share price and an eventual increase in the wealth of the employee (West [4]). Hence the company may use ESOs as a strategy to increase its stock price. In the Hull-White model it is assumed that an employee will exercise their ESOs prior to maturity if the stock price is at least M times the strike price. The binomial method is then used to find its value. In this paper a modification of the Hull-White model concerning the exercise strategy is proposed. First, the case in which an employee may
exercise their ESOs prior to maturity if the stock price reaches a certain value. Second, the case in which an employee may exercise their ESOs prior to maturity if the stock price spends a certain period of time above a certain value. Thus a Parisian-style feature is added to the ESO. Hence a Monte Carlo method is readily applicable for valuing the ESOs (Bernard and Boyle [2]).
2. HULL-WHITE MODEL
Recall that in the Hull-White ESO model [3]: the option can be exercised at any time during its life after a vesting period; a vested option is exercised prior to maturity if the stock price is at least M times the strike price; and there is an employee exit rate, which is the rate of employees leaving the company per year. In case an employee leaves the company during the vesting period, their ESOs will be forfeited. Assume that the probability of an employee leaving the company in each period of time is as given in Ammann and Seiz [1].
Monte Carlo Method for Valuing Hull-White Model.Partition the time to expiration
into times step: Let denote the
corresponding stock prices at these time. Further suppose that is the value of the
option at time . Define as the strike price of the option; as the time when the
vesting period ends; as the risk-free interest rate; as the volatility of the underlying
stock; and as the dividend yield. Assume that the stock price follows a geometric
Brownian motion. First, simulate in terms of using the formula
The equations for describing the backward recurrence through path are:
When
if then
if and then
if and then
Repeat the simulation times, and find as the value of ESO from the
simulation. Then the value of ESO is given by
Table 2 gives the result of the Hull-White ESO model computed using the binomial method and the Monte Carlo method, based on the data given in Table 1. The number of stock price path simulations is 10^5 with 2520 time steps for each simulation. The 95% confidence interval is [12.2968, 12.5015].
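For illustration only, the sketch below prices a vested ESO by Monte Carlo under the Hull-White early-exercise rule (exercise once the price reaches M times the strike after vesting; exercise at expiry if in the money; forfeit, or exercise if vested, on exit with an annual exit rate). All numerical inputs are placeholders, since Table 1 is not reproduced here, and the simplified payoff logic is our reading of the model rather than the authors' code.

```python
import numpy as np

rng = np.random.default_rng(42)

# Placeholder inputs (Table 1 of the paper is not reproduced here)
S0, K, r, sigma, q = 50.0, 50.0, 0.05, 0.30, 0.0
T, vest, M, exit_rate = 10.0, 3.0, 2.0, 0.04
n_steps, n_paths = 2520, 2000          # n_paths kept small so the pure-Python loop stays fast
dt = T / n_steps

payoffs = np.zeros(n_paths)
for p in range(n_paths):
    S = S0
    for i in range(1, n_steps + 1):
        t = i * dt
        S *= np.exp((r - q - 0.5 * sigma**2) * dt + sigma * np.sqrt(dt) * rng.normal())
        if rng.random() < exit_rate * dt:          # employee leaves the company
            value = max(S - K, 0.0) * np.exp(-r * t) if t >= vest else 0.0
            break
        if t >= vest and S >= M * K:               # Hull-White exercise barrier reached
            value = (S - K) * np.exp(-r * t)
            break
    else:                                          # held to expiry
        value = max(S - K, 0.0) * np.exp(-r * T)
    payoffs[p] = value

print(payoffs.mean())                              # Monte Carlo estimate of the ESO value
```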
Monte Carlo Method for Valuing ESOs with Parisian-style. Define the first time that the Parisian condition is met for each simulation. Assume the window period of the ESOs with Parisian-style is given. Figure 1 gives the illustration of ESOs with Parisian-style. If the Parisian condition has not been met, then the ESO is forfeited and its value is zero.
After simulating the stock price, define the exercise time. Define
Using backward recurrence, the value of the ESO with Parisian-style for the simulation at is given by
with
Table 3 gives the result of the ESOs with Parisian-style model computed using the Monte Carlo method, based on the data given in Table 1 and the value of , for two window periods, 0 and 15 days. The number of stock price path simulations is 10^5 with 2520 time steps for each simulation.
The ESOs with Parisian-style model with a window period of 0 days is a special case. It can be viewed as a standard barrier option, and a lattice method can then be implemented to value the ESO with a standard barrier. Using the trinomial lattice method we obtained the value of the ESO with Parisian-style model with a window period of 0 days as $12.2113. But the lattice method is hard to apply for valuing the ESOs with Parisian-style model for other window periods. The Monte Carlo method is flexible and easy to implement to price the ESOs with Parisian-style model.
The following Figures 2, 3 and 4 show the influence of several parameters on the ESO value. The parameters are: the vesting period, the psychological barrier, the real barrier, and the window period.
level. The red line represents the value of ESOs with Parisian-style model with a given window period and different real barrier levels. The value of ESOs with a real barrier is lower than the one with a psychological barrier, because ESOs with a real barrier do not give an exercise opportunity to an employee leaving the company before the Parisian condition is met. As Figure 3 shows, there is a value of the barrier at which the value of ESOs with a real barrier attains a maximum.
Figure 4 gives the ESO with Parisian model valuation using the input parameters , λ = 0.06 and different window periods. It is natural that as the length of the window period becomes longer, the option is more difficult to exercise. Hence, as Figure 4 shows, an increasing window period gives a decreasing ESO value.
4. CONCLUDING REMARK
We have presented a simple Monte Carlo method for valuing employee stock options. In particular, we analyzed the Hull-White model to which we added the Parisian style. As a consequence it is not easy to use a lattice method to perform the valuation. Hence we propose the use of the Monte Carlo method for its valuation, which is easier to implement. A graphical analysis of the influence of several ESO parameters on the option value is also given.
References
[1] AMMANN, M. AND R. SEIZ, Valuing Employee Stock Options: Does the Model Matter?, Financial Analysts Journal, vol. 60, 5, 21-37, 2004.
[2] BERNARD, C. AND P. BOYLE, Monte Carlo Methods for Pricing Discrete Parisian Options, The European Journal of Finance, 1-28, 2010.
[3] HULL, J. AND A. WHITE, How to Value Employee Stock Options, Financial Analysts Journal, 60, 114-119, 2004.
[4] WEST, G., Employee Stock Options. http://www.riskworx.com/pdf/esoPDF5 (accessed: June 5, 2008).
KUNTJORO ADJI SIDARTO
Industrial and Financial Mathematics Group, Faculty of Mathematics and Natural
Sciences, Institute of Technology Bandung, Indonesia.
e-mail: [email protected]
DILA PUSPITA
Industrial and Financial Mathematics Group, Faculty of Mathematics and Natural
Sciences, Institute of Technology Bandung, Indonesia.
e-mail: [email protected]
Proceedings of ”The 6th SEAMS-UGM Conference 2011”
Statistics, pp. 821 - 830.
Abstract. Determining the region in the brain of epileptic foci that need to be removed or isolated is a crucial task. The EEG signals not only have incomplete recorded data but also little a priori information about the unknown source. Thus EEG signals are tackled in a fuzzy environment for classification. Fuzzy clustering is one of the techniques used to determine where the electrical activity in the brain occurs most. In this paper, two fuzzy clustering algorithms, i.e. Fuzzy c-Means and Gustafson-Kessel, are investigated and applied to real epileptic data. The results of each algorithm are compared by referring to the optimal number of clusters. The number of clusters is well spread within the considered time, which gives a hint of generalized epilepsy seizure.
Keywords and Phrases :Fuzzy clustering, EEG signals analysis, Fuzzy c-Means, Gustafson-
Kessel
1. INTRODUCTION
Epilepsy is one of the most common disorders of the brain, characterized by recurrent seizures [1], that affects approximately 1% of the world's population, which is more than 50 million individuals worldwide. Seizures are classified into two major categories, as either partial or generalized [2]. A partial seizure occurs when the initial discharge occurs at a localized focus, while a generalized seizure has multiple foci at various locations throughout the whole brain, in both hemispheres. Not all seizures can be easily defined as either partial or generalized. Some people have seizures that begin as partial seizures but then spread to the entire brain. Others may have both types of seizures but with no clear pattern.
Epileptic seizure, which is caused by abnormal electrical activity in the brain, can be
measured by using Electroencephalogram (EEG). The non-stationary recorded EEG signals
contain information regarding changes in the electrical potential of the brain obtained from a
2010 Mathematics Subject Classification: 92C55, 93A30, 94A12
given set of recording electrodes. Whenever there is a net current flow between two electrodes, a potential difference will develop. These potential differences develop due to volume currents that spread from their source in active neural tissue throughout the conductive media of the head.
These changes appear as wriggling lines along the time axis in a typical EEG recording (Figure 1). The recorded data include the characteristic waveforms with accompanying variations in amplitude and frequency, given as a series of numerical values over time. EEG waveforms are typically recorded at 1 to 50 µV in amplitude with frequencies of 2 to 50 Hz. During a brain disease state, such as an epileptic seizure, the EEG amplitude can shoot up to nearly 1000 µV [2].
Figure 1: Sample of EEG signals from an epileptic patient during seizure attack.
In order to localize the source or strength of EEG signals, classification of the data is
often required [4]. A classification method groups the scattered EEG data into two or more
clusters, each with a corresponding cluster center. The cluster centers give clues as to where
the electrical activities in the brain occur most [5], [6]. They also provide the most likely
location where the epileptic seizure begins and identify the category of the seizure, either
partial or generalized [7], [8].
The EEG signals not only contain incompletely recorded data but also offer little a priori
information about the unknown source [9]. Therefore, in this study, EEG signals are tackled
in a fuzzy environment, which has significant advantages over other approaches. Fuzziness
underlies the remarkable human ability to make rational decisions in an environment of
imprecision, partial knowledge, partial certainty and partial truth [10]. Classification of EEG
recordings based on fuzzy clustering algorithms has been applied to spike detection and to
classifying emotions [11], [12].
In this study, fuzzy partition clustering [13], [14] is applied to real EEG signal data (i.e.
the potential differences) of epilepsy patients. A comparison based on the optimal number of
clusters is then performed.
where P_it captures the amplitude of the raw EEG signal at any fixed time t for the i-th
channel, which is segmented into several consecutive connected points.
These non-overlapping signals have been proven to form a metric space, hence a topological
space, and are finally implemented as a digital space in order to explain the theoretical
background of these particular signals [15]. Since these EEG signals are topological spaces,
they can be stretched to any geometric figure (i.e. fuzzy clustering) in order to identify the
current source where the epileptic seizure originated.
3. FUZZY CLUSTERING
3.1 Fuzzy c-means Algorithm. The FCM algorithm is based on the minimization of the objective function

$$ J(U, c_1, c_2, \ldots, c_c) = \sum_{i=1}^{c} J_i = \sum_{i=1}^{c} \sum_{j=1}^{n} u_{ij}^{m} d_{ij}^{2} $$

where
$u_{ij}$ is between 0 and 1;
$c_i$ is the centroid of cluster $i$;
$d_{ij}$ is the Euclidean distance between the $i$-th centroid $c_i$ and the $j$-th data point;
$m > 1$ is a weighting exponent.
Given the data set Z, choose the number of clusters 1 < c < N, the weighting exponent m > 1,
the termination tolerance ε > 0 and the norm-inducing matrix A. Initialize the partition matrix randomly.
Repeat for l = 1, 2, ...
Step 1: Compute the cluster prototypes (means):
$$ v_i^{(l)} = \frac{\sum_{k=1}^{N} \left(\mu_{ik}^{(l-1)}\right)^{m} z_k}{\sum_{k=1}^{N} \left(\mu_{ik}^{(l-1)}\right)^{m}}, \quad 1 \le i \le c. $$
Step 2: Compute the distances:
$$ D_{ikA}^{2} = \left(z_k - v_i^{(l)}\right)^{T} A \left(z_k - v_i^{(l)}\right), \quad 1 \le i \le c, \ 1 \le k \le N. $$
Step 3: Update the partition matrix: if $D_{ikA} > 0$,
$$ \mu_{ik}^{(l)} = \frac{1}{\sum_{j=1}^{c} \left(D_{ikA}/D_{jkA}\right)^{2/(m-1)}}, $$
otherwise $\mu_{ik}^{(l)} = 0$, with $\mu_{ik}^{(l)} \in [0,1]$ and $\sum_{i=1}^{c} \mu_{ik}^{(l)} = 1$;
until $\|U^{(l)} - U^{(l-1)}\| < \varepsilon$.
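As an illustration of the FCM iteration above, the following minimal R sketch uses the cmeans function of the e1071 package; the random two-column matrix standing in for the EEG points, and the choices c = 4 and m = 2, are assumptions for illustration only, not the authors' MATLAB implementation.

# Minimal FCM sketch (assumed example data; c = 4 clusters, m = 2)
library(e1071)                        # provides cmeans()
set.seed(1)
eeg <- matrix(rnorm(200), ncol = 2)   # placeholder for the (x, y) EEG points
fcm <- cmeans(eeg, centers = 4, m = 2, iter.max = 100, method = "cmeans")
fcm$centers                           # cluster prototypes v_i
head(fcm$membership)                  # partition matrix U = [mu_ik]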
3.2 Gustafson-Kessel Algorithm. The Gustafson-Kessel (GK) algorithm extends FCM by employing an
adaptive distance norm for each cluster. Given the data set Z, choose the number of clusters
1 < c < N, the weighting exponent m > 1 and the termination tolerance ε > 0. Initialize the
partition matrix randomly.
Repeat for l = 1, 2, ...
Step 1: Compute the cluster prototypes (means):
$$ v_i^{(l)} = \frac{\sum_{k=1}^{N} \left(\mu_{ik}^{(l-1)}\right)^{m} z_k}{\sum_{k=1}^{N} \left(\mu_{ik}^{(l-1)}\right)^{m}}, \quad 1 \le i \le c. $$
Step 2: Compute the cluster covariance matrices:
$$ F_i = \frac{\sum_{k=1}^{N} \left(\mu_{ik}^{(l-1)}\right)^{m} \left(z_k - v_i^{(l)}\right)\left(z_k - v_i^{(l)}\right)^{T}}{\sum_{k=1}^{N} \left(\mu_{ik}^{(l-1)}\right)^{m}}, \quad 1 \le i \le c. $$
Step 3: Compute the distances, with cluster norm-inducing matrices $A_i = \rho_i \left[\det(F_i)\right]^{1/n} F_i^{-1}$ (cluster volumes $\rho_i$, usually taken as 1):
$$ D_{ikA_i}^{2} = \left(z_k - v_i^{(l)}\right)^{T} A_i \left(z_k - v_i^{(l)}\right), \quad 1 \le i \le c, \ 1 \le k \le N. $$
Step 4: Update the partition matrix: if $D_{ikA_i} > 0$,
$$ \mu_{ik}^{(l)} = \frac{1}{\sum_{j=1}^{c} \left(D_{ikA_i}/D_{jkA_i}\right)^{2/(m-1)}}, $$
otherwise $\mu_{ik}^{(l)} = 0$, with $\mu_{ik}^{(l)} \in [0,1]$ and $\sum_{i=1}^{c} \mu_{ik}^{(l)} = 1$;
until $\|U^{(l)} - U^{(l-1)}\| < \varepsilon$.
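Because GK only replaces the fixed norm A by the cluster-wise matrix A_i = det(F_i)^{1/n} F_i^{-1}, the iteration can also be sketched in a few lines of plain R. The sketch below follows the steps above under the assumptions rho_i = 1, m = 2 and random initialization; it ignores the degenerate case of a zero distance and is not the authors' implementation.

# Gustafson-Kessel sketch (assumed data Z, K = 4 clusters, m = 2, rho_i = 1)
gk <- function(Z, K = 4, m = 2, tol = 1e-5, max.iter = 100) {
  N <- nrow(Z); n <- ncol(Z)
  U <- matrix(runif(K * N), K, N); U <- sweep(U, 2, colSums(U), "/")
  for (l in 1:max.iter) {
    Um <- U^m
    V  <- (Um %*% Z) / rowSums(Um)                    # Step 1: prototypes v_i
    D  <- matrix(0, K, N)
    for (i in 1:K) {
      Zc <- sweep(Z, 2, V[i, ])                       # z_k - v_i
      Fi <- t(Zc * Um[i, ]) %*% Zc / sum(Um[i, ])     # Step 2: covariance F_i
      Ai <- det(Fi)^(1 / n) * solve(Fi)               # Step 3: norm matrix (rho_i = 1)
      D[i, ] <- rowSums((Zc %*% Ai) * Zc)             # squared distances D^2_ikAi
    }
    Unew <- 1 / (D^(1 / (m - 1)) * rep(colSums(D^(-1 / (m - 1))), each = K))
    if (max(abs(Unew - U)) < tol) { U <- Unew; break }  # Step 4: convergence check
    U <- Unew
  }
  list(centers = V, membership = t(U))
}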
3.3 Cluster Validity Measurement. In order to determine the optimal number of clusters
present in the data, cluster validity needs to be assessed. For this purpose the Xie and Beni
index (XB) has been implemented. The XB index is defined as follows:
$$ XB = \frac{\sum_{i=1}^{c} \sum_{j=1}^{n} \mu_{ij}^{m} \left\| x_j - v_i \right\|^{2}}{n \, \min_{i \ne j} \left\| v_i - v_j \right\|^{2}}. $$
According to [5] and [19], the XB index gives an effective measurement of the number of
clusters compared to other indexes when EEG signals are used.
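The validity measurement can be scripted along the following lines; the helper below computes the XB index from the output of e1071::cmeans and scans c = 2, ..., 10, with a random stand-in data matrix (all of these choices are assumptions made for illustration).

# Xie-Beni index sketch for a cmeans() fit (m = 2 assumed)
library(e1071)
set.seed(1)
eeg <- matrix(rnorm(200), ncol = 2)   # stand-in for the (x, y) EEG points
xb.index <- function(x, fit, m = 2) {
  u  <- fit$membership; v <- fit$centers
  d2 <- as.matrix(dist(rbind(v, x)))[1:nrow(v), -(1:nrow(v))]^2  # ||x_j - v_i||^2
  num <- sum(t(u)^m * d2)                                        # compactness term
  sep <- min(dist(v)^2)                                          # separation of centers
  num / (nrow(x) * sep)
}
xb <- sapply(2:10, function(cc) xb.index(eeg, cmeans(eeg, cc, m = 2)))
which.min(xb) + 1                     # c that minimizes the XB index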
The real EEG data of an epileptic patient are digitized at 256 samples per second using the
Nicolet One EEG software. The software performs a Fast Fourier Transform (FFT) of the raw
signal data. The EEG data are recorded by placing electrodes on the scalp according to the
international 10-20 system. Nineteen channels of EEG are recorded simultaneously. These
channels, with a duration of ten seconds during the seizure attack, are considered.
The FCM and GK algorithms are implemented in MATLAB R2010a with the number of clusters
varying from two to ten for each second. At time t = 1, the values of the validity measure
corresponding to the number of clusters are plotted as depicted in Figure 4 and Figure 5. The
XB index for FCM and GK reaches a local minimum at c = 4. Hence the optimal number of
clusters for both algorithms is four.
Table 1 shows the optimal number of clusters using FCM and GK at each second. At t = 1 and
t = 2, there are 4 cluster centers. However, the positions of the cluster centers at t = 1 in
R² are slightly different for each algorithm (Table 2). The highest number of clusters occurs
at t = 3, which means that at this time the electrical current in the brain has been triggered.
For the rest of the time, the number of cluster centers lies between three and five.
Table 1: Optimal number of cluster for ten seconds using FCM and GK
Algorithm t1 t2 t3 t4 t5 t6 t7 t8 t9 t10
FCM 4 4 6 4 3 3 4 5 3 4
GK 4 4 5 3 5 3 4 5 3 4
Table 2: Positions of the cluster centers (x, y) at t = 1 for FCM and GK.
As shown in Table 2, at one particular time the positions of the cluster centers using FCM and
GK are quite similar. However, the positions of the cluster centers for both algorithms vary
from time to time. These cluster centers are well spread throughout the ten seconds. The
pattern gives a hint of generalized epilepsy since the positions of the cluster centers are
scattered. These positions only show the location of the cluster centers of the EEG signal in
two dimensions. The result needs to be further investigated using an inverse projection in
order to transform the information into the location of the cluster centers inside the brain.
5. CONCLUSION
This study shows that FCM and GK can be used to identify the type of epilepsy, which is either
partial or generalized. The spreading pattern of the positions of the cluster centers in both
algorithms is similar. This means that FCM and GK can be used to cross-reference the individual
results with one another, and both identify generalized epilepsy.
References
[1] SHORVON, S. D., Handbook of Epilepsy Treatment, 2nd ed, Blackwell Publishing Ltd., USA, 2005.
[2] MARKS, D.A., Classification of Seizure Disorders, in Schulder, M. and Gandhi, C.D. (Eds.), Handbook of
Stereotactic and Functional Neurosurgery, Marcel Dekker, Inc., New York, 2003.
[3] KUTZ, M., Standard Handbook of Biomedical Engineering & Design, McGraw-Hill, New York, 2003.
[4] SANEI, S. AND CHAMBERS, J.A., EEG Signal Processing, John Wiley & Sons Ltd., England, 2007.
[5] FAUZIAH, Z., Dynamic Profiling of EEG Data During Seizure Using Fuzzy Information Space, PhD
Thesis,Universiti Teknologi Malaysia, Skudai, 2008.
[6] NAZIHAH, A. AND TAHIR A., Information Granulation in Biomedical Signal, Proceeding of National Seminar
on Fuzzy Theory and Applications, 2008.
[7] HARIKUMAR, R. AND NARAYANAN, B. S., Fuzzy Techniques for Classification of Epilepsy Risk Level
from EEG Signals, Proceedings of the IEEE Conference, 2003.
[8] TAHIR, A., RAJA A. F., FAUZIAH, Z.AND HERMAN, I., Selection of a Subset of EEG Channels of Epileptic
Patient During Seizure Using PCA. Proceeding of the 7th WSEAS International Conference on Signal
Processing, Robotics and Automation, 270-273, 2008.
[9] SOLOMON, E. P.,Introduction to Human Anatomy and Physiology, 2nd ed., Elsevier Science,USA,2003.
[10] ZADEH, L. A.,TowardA Theory of Fuzzy Information Granulation and Its Centrality in Human Reasoning and
Fuzzy Logic,Fuzzy Sets and System, 90: 111-127, 1997.
[11] HILAL, Z. I. AND KUNTALP, M., A Study on Fuzzy C-means Clustering-based Systems in Automatic Spike
Detection, Computers in Biology and Medicine, doi:10.1016/j.compbiomed, 2006.
[12] MURUGAPAN, M., RIZON, M., NAGARAJAN, R., YAACOB, S., ZUNAIDI, I., AND HAZRY, D., EEG
Feature Extraction for Classifying Emotions using FCM and FKM, International Journal of Computers and
Communications, 2(1), 21-25, 2007.
[13] BEZDEK, J.C.,Pattern Recognition with Fuzzy Objective Function Algorithms, Kluwer Academic
Publishers,USA, 1981.
[14] GUSTAFSON, D. E. AND KESSEL, W. C., Fuzzy Clustering with a Fuzzy Covariance Matrix, Proceedings of
the IEEE Conference on Decision and Control, pp. 761-766, San Diego, Calif, USA, 1979.
[15] NAZIHAH, A., TAHIR A. AND HUSSAIN I. H. M. I., Topologizing the Bioelectromagnetic Field, Proceedings
of the 5th Asian Mathematical Conference, PWTC, 2009.
[16] MILLER, D. J., NELSON, C. A., CANNON, M. B., AND CANNON, K. P., Comparison of Fuzzy Clustering
Methods and Their Application to Geophysics Data, Applied Computational Intelligence and Soft Computing,
1-16, 2009.
[17] BARGIELA, A. ANDPEDRYCZ, W., Granular Computing: An Introduction, Kluwer Academic Publishers,
USA, 2003.
[18] KAYMAK, U. ANDSETNES, M., Extended Fuzzy Clustering Algorithm. ERIM Report Series Research in
Management, 1-23, 2000.
[19] CHIANG, W. Y., Establishment and Application of Fuzzy Decision Rules: an Empirical Case of the Air
Passenger Market in Taiwan, Int. J. Tourism Res., doi: 10.1002/jtr.819, 2010.
NAZIHAH AHMAD
Universiti Utara Malaysia.
E-mails: [email protected]
SHARMILA KARIM
Universiti Utara Malaysia.
E-mails: [email protected]
AZIZAN SAABAN
Universiti Utara Malaysia.
E-mails: [email protected]
HAWA IBRAHIM
Universiti Utara Malaysia.
E-mails: [email protected]
RECOMMENDATION ANALYSIS BASED ON SOFT SET FOR PURCHASING PRODUCTS

R.B.F. HAKIM, SUBANAR, AND EDI WINARKO

Abstract. Purchasing a product is a complex decision-making problem under uncertain conditions,
and some decisions have to be taken based on a mathematical method. Soft set theory is a general
mathematical method for dealing with uncertain data, proposed by Molodtsov in 1999. However, much
soft set research produces an exact solution even when the initial description of the data consists
of approximate values, for which it would be more appropriate to give a soft solution or
recommendation. This paper uses the soft set as a generic mathematical tool to describe the objects
or products under consideration in the form of the parameters they need, and multidimensional
scaling techniques to give a recommendation or soft solution for a product-purchasing system. The
proposed purchasing system uses a simple ranking evaluation for each object parameter, filled in by
the customers themselves, and yields a recommendation based on the soft set which can be used as a
suggestion for the customer in taking a decision.
Keywords and Phrases: Soft set theory, multidimensional scaling, recommendation analysis,
clustering.
1. INTRODUCTION
happen also, say, when we have a special guest coming to our house. We need to decorate our
dining room and buy new chairs and dining tables. We just go to a furniture store, ask the
salesman, and describe what we need for the dining room decoration. Rather than mentioning
exact attributes of chairs or tables, we prefer to ask for 'the traditional look', 'the
comfortable chairs' and so on. Molodtsov [2] laid the foundation of a set that can collect
different objects under consideration in the form of the parameters they need. For example,
(F, E) is a soft set that defines the clothes belonging to someone, which are c1, c2, c3, c4
and c5, with {upper wear, lower wear, blue, black, jeans, formal, sporty} as a description of
the clothes, where c represents the clothes. Another example: (F, E) is a soft set that defines
the dining chairs (dc1, dc2, dc3, dc4, dc5) and their parameters = {solid wood, durable, heavy
carved, dark finish, traditional look, luxurious look, easy to clean}. Molodtsov insisted that
a soft set can use any parametrization we prefer, such as words and sentences, real numbers,
functions and so on. This parametrization gives rise to a multidimensional topological space;
this model space requires not only metric spaces but also non-metric spaces. Because the basic
notion of a soft set offers an approximate description of the objects under consideration, the
solution of someone's problem based on a soft set should also be a soft decision. Many research
activities that use soft set theory in the decision-making process give an exact solution
rather than a soft solution. This paper, by contrast, shows a simple ranking evaluation applied
to each object parameter in the soft set, maps the parameter family of the objects (the houses
and their attractiveness parameters of Mr. X in Molodtsov's example) using non-metric
multidimensional scaling, and gives a soft solution or recommendation based on the soft set
that can be used as a suggestion for Mr. X's decision.
The rest of this paper is organized as follows. Section 2 describes the notion of soft set
theory. Section 3 presents a review of soft set-based decision making techniques. Sections 4
and 5 describe soft set-based recommendation systems and propose software for a soft set-based
recommendation system. Finally, the conclusion of this work is described in Section 6.
2. SOFT SET THEORY
Molodtsov [2] first defined a soft set as a family of objects whose definition depends on a set
of parameters. Let U be an initial universe of objects and E be the set of adequate parameters
in relation to the objects in U. Adequate parametrization is desired to avoid some of the
difficulties that arise when using probability theory, fuzzy set theory and interval
mathematics, which are commonly used as mathematical tools for dealing with uncertainties. The
definition of a soft set is given as follows.
Definition 2.1. (Molodtsov [2]). A pair (F, E) is called a soft set over U if and only if F is
a mapping of E into the set of all subsets of the set U.
From the definition, a soft set (F, E) over the universe U is a parameterized family that gives
an approximate description of the objects in U. For any parameter e ∈ E, the subset F(e) ⊆ U
may be considered as the set of e-approximate elements in the soft set (F, E).
Example 1. Let us consider a soft set (F, E) which describes the "attractiveness of houses"
that Mr. X is considering to purchase.
U – is the set of houses under Mr. X's consideration
In this example, to define a soft set means to point out the expensive houses, i.e. the houses
whose dominant parameter is 'expensive', the houses in green surroundings, i.e. the houses whose
surroundings are greener than the others, and so on. Molodtsov [2] also stated that soft set
theory takes the opposite approach to what is usually done in classical mathematics, where one
constructs a mathematical model of an object and defines the notion of an exact solution of this
model. Soft set theory uses an approximate description of the objects under consideration as the
initial description and does not need to define the notion of an exact solution. Common
mathematical tools for solving complicated decision problems with uncertainties are probability
theory, fuzzy theory and interval mathematics, but each has its difficulties: probability theory
must perform a large number of trials; fuzzy theory must set the membership function in each
particular case, and the nature of the membership function is extremely individual; interval
mathematics should construct an interval estimate for the exact solution of a problem but is not
sufficiently adaptable for problems with different uncertainties. To avoid these difficulties, in
soft set theory, when someone is faced with a decision problem with many uncertainties, he or she
can express the problem using objects and any information belonging to those objects. This
relevant information refers to the necessary parameters of the objects. The necessary parameters
can be of particular interest, so that he or she can express preference, knowledge, perception or
common words in a simple way about the objects under consideration. Parameters attached to the
objects are said to be adequate if he or she considers that the information involved in
identifying the problem is sufficient to elucidate the objects, and can give a fair valuation to
each object based on this information and then obtain a suggestion for making a decision. Setting
the objects and their necessary information using words and sentences, real numbers, functions,
mappings, etc., is a parametrization process that makes soft set theory applicable in practice.
Maji et al. [1] extended Example 1 to the decision-making problem of choosing among six houses
based on the attractiveness of the houses as house parameters. Some parameters absolutely belong
to some houses and some parameters absolutely do not. Their example of the house-choosing problem
based on soft sets has initiated much important applied and theoretical research in soft set
decision-making problems. However, soft set theory has not yet found the right format for its
solution, because much research uses binary, fuzzy membership or interval values for the valuation
of object parameters, which should be avoided, as noted by Molodtsov, and will be described in the
next section.
Molodtsov's example gives a soft set (F, E) that describes the attractiveness of the houses which
Mr. X is going to buy, with U the set of houses and E = {expensive, beautiful, wooden, cheap, in
the green surroundings, modern, in good repair, in bad repair} as the set of parameters that
define the houses under consideration.
3.1. Review on soft set-based decision making. Maji et al. [1] applied the theory of soft sets
to solve the decision-making problem encountered by Mr. X. They defined six houses {h1, h2, h3,
h4, h5, h6}, each with its own parameters; for example, h1, h2, h3, h4, h5, h6 are beautiful
houses and h1, h2, h6 are wooden houses, etc. Mr. X is interested in several parameters as
priorities in buying a house, namely 'beautiful', 'wooden', 'cheap', 'in the green surroundings'
and 'in good repair', as a subset of E. His decision is based on the maximum number of parameters
of the soft set. They continued by presenting the soft set of attractiveness of the houses which
Mr. X is going to buy in a tabular representation, with entry 1 if a house has a particular
parameter and zero if it does not. This quantification means, for example, that house h1 is
absolutely beautiful and wooden, or that h3 is not wooden. The choice of Mr. X is then only a
cumulative count over the houses that have all the parameters. At this point, Maji et al. [1][3]
treated the parameters as attributes or features of the objects. This assumption, of course,
needs a process to transform a parameter into an attribute or feature of an object. To handle the
binary valuation, Maji et al. [1] tried to introduce the W-soft set, or weighted soft set, but
their effort did not give a new approach to decision analysis because the weights are multiplied
into each parameter (as attribute) and hence do not change the final result. Later, Herawan and
Mat Deris [4] and Zou and Xiao [5] proved that a soft set can be transformed into a binary
information system.
Maji et al. [1] also used rough set theory to reduce the parameters held by every object in the
universe. Rather than optimizing the worth of the parameters as necessary information, they
preferred to reduce the parameters, so the information involved in the parameters is lost.
Molodtsov insisted on the adequacy of the parametrization of the objects of the universe rather
than on reducing the parameters belonging to every object. Because of the binary values of the
entries, their decision result gives an exact solution rather than a soft solution, which
contradicts the philosophy of Molodtsov's original soft set, which insists on an approximate
result arising from the soft information accepted in a parametrized family of the soft set.
Chen et al. [16] and Kong et al. [17] also wanted to reduce the parameters of the objects, but
Molodtsov had already pointed out that expanding the set of parameters may be useful, since an
expansion of parameters gives a more detailed description of the objects. For example, Mr. X
could add the parameter 'distance to office' to the attractiveness of houses. This parameter
gives a more detailed description of the houses and may help him to re-decide which house to
buy. The reduction of parameters is of little value since the adequacy of the parameters is
crucial in soft set theory for describing the houses. Reducing parameters may remove valuable
information from the objects, and it can be used only in special cases. For example, removing
the parameters 'expensive' and 'cheap' is allowed if Mr. X knows that all houses actually have
the same price.
3.2. Review on fuzzy soft set-based decision making. Roy and Maji [6] combined fuzzy sets and
soft sets, so that fuzzy numbers are used to evaluate the value of the parameter judgment for
each object. This idea developed into the hybrid theory of fuzzy soft sets, also initiated by
Yang et al. [7]. They argued that, rather than using a {0, 1} value to state that an object holds
a parameter, it is better to use a degree of membership to represent the extent to which an
object holds the parameter. Since the parameter value of each object is filled in with fuzzy
numbers, there must be an expert to determine the membership value that represents the matching
number for each house. It becomes even more difficult when the valuations of the parameters of
the objects are interval-valued fuzzy numbers (Feng et al. [15], Jiang et al. [10]). An expert
should give not only the matching number of a parameter but should also determine the lowest and
highest numbers as the value of the object's parameters. Molodtsov stated that these are the
inherent difficulties when dealing with fuzzy numbers and that they should be avoided.
Several researchers can be grouped as following these two main ideas, i.e. treating the soft set
as attributes of an information system (Herawan and Mat Deris [4], Zou and Xiao [5]) and then
using rough sets (soft rough sets) to handle the vagueness in making a decision (Feng et al.
[8]), and the fuzzy soft set (Jun et al. [9], Feng et al. [15] and Jiang et al. [10]).
Both approaches (Sections 3.1 and 3.2) give an exact solution or best decision rather than a
soft solution or recommendation satisfying Molodtsov's soft set philosophy.
This paper uses the soft set as a generic mathematical tool to describe the objects under
consideration in the form of the parameters they need, in order to give a recommendation rather
than an exact solution. It also shows a simple ranking evaluation applied to each object
parameter in the soft set, maps the parameter family of the objects (the houses and their
attractiveness parameters of Mr. X in Molodtsov's example) using non-metric multidimensional
scaling, as Nijkamp and Soffer [11] introduced for soft multicriteria decision models, and gives
a soft solution or recommendation based on the soft set that can be used as a suggestion for
Mr. X's (or anyone's) decision.
4.1. Definition of the Soft Set and Soft Solution. From Definition 2.1, let U be an initial
universe set and let E be a set of parameters. A pair (F, E) is called a soft set over U if and
only if F is a mapping of E into the set of all subsets of the set U. From that definition, a
soft set (F, E) over the universe U is a parameterized family that gives an approximate
description of the objects in U. For any parameter e ∈ E, the subset F(e) ⊆ U may be considered
as the set of e-approximate elements in the soft set (F, E).
As an illustration, let us consider the following example from Molodtsov (1999). A soft set
(F, E) describes the attractiveness of the houses which Mr. X is going to buy.
U – is the set of houses under consideration
E – is the set of parameters. Each parameter is a word or a sentence.
E = {expensive, beautiful, wooden, cheap, in the green surroundings, modern, in good repair,
in bad repair}
In this problem, to define a soft set means to point out expensive houses, beautiful houses and
so on. Expensive houses may show which houses are expensive because the dominating parameter is
'expensive' compared to the other parameters possessed by the house; houses in the green
surroundings are those whose surroundings are greener than the others, and so on. It is worth
noting that the sets F(e) may be arbitrary: some of them may be empty, and some may have
nonempty intersection. That is, the solution of the soft set is a set consisting of a subset of
objects and a subset of parameters that shows the objects and their parameters.
Definition 4.1. (soft solution). A pair (F', E') over U' is said to be a soft solution of the
soft set (F, E) over U if and only if
i) U' ⊆ U
We shall use the notion of the restriction of a parameter e ∈ E' to U' in order to obtain the
parameters which dominate an object compared to the other parameters that may be possessed by
those objects.
We approach the soft solution using information system theory, which has been extensively
presented by Demri and Orlowska [18] and already has an established theoretical foundation. Soft
set theory differs from information systems in that a problem or an object in the soft set is
determined by the person dealing with the problem, and then relies on the ability of that person
to explain various things related to that object. The various related things are referred to as
the object parameters in the soft set. Meanwhile, an information system is a collection of
objects and their properties. That is why a soft set is described as a pair (F, E) over U, with
F a mapping of E into the power set of U, instead of as (U, E), where U is the set of objects
and E the set of attributes, as in the structure of information systems (OB, AT), where OB is
the set of objects and AT is the set of attributes (properties). A formal information system
(Demri and Orlowska [18]) may be presented as a structure of the form (OB, AT, (V_a)_{a ∈ AT}, f),
where OB is a non-empty set of objects, AT is a non-empty set of attributes, V_a is a non-empty
set of values of the attribute a, and f is a total function OB × AT → ∪_{a ∈ AT} P(V_a) such
that for every (x, a) ∈ OB × AT, f(x, a) ⊆ V_a. We often use (OB, AT) as a concise notation
instead of the formal structure.
A soft set (F, E) over U may be considered as an information system (U, AT) such that AT = {F},
and the values of the mapping F at e ∈ E make available the same information about the objects
from U. It is common to identify a wide range of matters (parameters) relating to an object and
then create the collection of objects that possess these parameters. To capture this intuition,
for a given soft set S = (F, E) over U we define a soft set formal context S = (U, E, F), where
U and E are non-empty sets whose elements are interpreted as objects and parameters (features),
respectively, and F ⊆ U × E is a binary relation. If x ∈ U, e ∈ E and (x, e) ∈ F, then the
object x is said to have the feature e. If U is finite, the relation F can be naturally
represented as a matrix with entries (c(x,e)), x ∈ U, e ∈ E, such that the rows and columns are
labeled with objects and object parameters, respectively, and if (x, e) ∈ F then c(x,e) = 1,
otherwise c(x,e) = 0. In this setting, the soft set formal context provides the mapping
ext: P(E) → P(U), which gives extensional information for the objects under consideration: a
set of object parameters can be expanded, in someone's view, into the set of those objects that
possess the parameters.
For all X ⊆ U and e ∈ E we define ext(E) := {x ∈ U | (x, e) ∈ F for every e ∈ E}; ext(E) is
referred to as the extent of E.
Lemma 4.1. For all A1, A2 ⊆ E, if A1 ⊆ A2 then ext(A2) ⊆ ext(A1).
Proof. Let A1, A2 ⊆ E and let x ∈ ext(A2), i.e. (x, e) ∈ F for every e ∈ A2. Since A1 ⊆ A2, it
follows that (x, e) ∈ F for every e ∈ A1, that is, x ∈ ext(A1).
Each soft set formal context could be viewed as an information system. Given a soft set formal
context S = (U, E, F), we define the information system (OB, AT, (V_a)_{a ∈ AT}, f) determined
by S as follows:
- OB := U
- AT := E
- For every a ∈ AT and every x ∈ OB, f(x, a) := {1} if (x, a) ∈ F, otherwise f(x, a) := {0}.
Any soft set formal context S = (U, E, F) which has been viewed as an information system
(OB, AT) can be represented as a soft set information system S = (U, E, F) that contains
information about relationships among the parameters of the objects under consideration. These
relations reflect various forms of indistinguishability, or 'sameness', of objects in terms of
their parameters. Let S = (U, E, F) be a soft set information system. For every A ⊆ E we define
the following binary relations on U:
- The indiscernibility relation ind(A) is a relation such that for all x, y ∈ U,
(x, y) ∈ ind(A) if and only if for all a ∈ A, a(x) = a(y).
- The similarity relation sim(A) is a relation such that for all x, y ∈ U, (x, y) ∈ sim(A) if
and only if for all a ∈ A, a(x) ∩ a(y) ≠ ∅.
Intuitively, two objects are A-indiscernible whenever their sets of a-parameters determined by
the parameters a ∈ A are the same, while objects are A-similar whenever the objects share some
parameters. In addition to indistinguishability, we also introduce a formal distinguishability
relation on a soft set information system:
- The diversity relation div(A) is a relation such that for all x, y ∈ U, (x, y) ∈ div(A) if
and only if for all a ∈ A, a(x) ≠ a(y).
Objects are A-diverse if all the sets of their parameters determined by A are different. The
information relations derived from a soft set formal context (U, E, F) satisfy the properties
below.
Lemma 4.2. For every soft set formal context S = (U, E, F) and every A ⊆ E, the following
assertions hold:
- ind(A) is an equivalence relation.
- sim(A) is reflexive and symmetric.
4.2. Application Using Diversity Relations. Let S = (U, E, F) be a soft set information system
and A ⊆ E. We say that a parameter a ∈ A is indispensable in A if and only if
ind(A) ≠ ind(A − {a}). It follows that if a is indispensable in A, the classification of objects
with respect to the parameters from A is properly finer than the classification based on
A − {a}. The notion of finer here is that the classification based on the indiscernibility of
the A-parameters provides a finer, never coarser, partition of the set of objects than the one
based on the A-parameters without parameter a. The set A of parameters is independent if and
only if every element of A is indispensable in A; otherwise A is dependent. The
indispensable-parameter property plays an important role in the soft solution: the absence of
such a parameter causes a fundamental change to the outcome of the soft solution. In the
Molodtsov houses example, the parameter 'in the green surroundings' is indispensable, while of
the parameters 'expensive' and 'cheap', or 'in good repair' and 'in bad repair', one may be
selected from each pair. The set of all parameters indispensable in A is referred to as the
parameter core of A in the system S:
Core_S(A) := {a ∈ A | ind(A) ≠ ind(A − {a})}.
Let S be a soft set information system. The discernibility matrix of S is the matrix with
entries (c_{x,y})_{x,y ∈ U}, where c_{x,y} = c_{y,x}, c_{x,x} = ∅ and
c_{x,y} = {e ∈ E | (x, y) ∈ div(e)}. The columns and rows of the matrix are labeled with
objects, and its entries are the c_{x,y}.
Lemma 4.3. Let S = (U, E, F) be a soft set information system and let B ⊆ A ⊆ E. Then the
following assertions hold:
i) (x, y) ∈ ind(A) iff c_{x,y} ∩ A = ∅;
ii) ind(B) ⊆ ind(A) iff for all x, y ∈ U, (c_{x,y} ∩ A ≠ ∅ implies c_{x,y} ∩ B ≠ ∅);
iii) if B ⊆ A, then ind(B) = ind(A) iff for all x, y ∈ U, c_{x,y} ∩ A ≠ ∅ implies
c_{x,y} ∩ B ≠ ∅.
Proof (iii). Let B ⊆ A ⊆ E. Suppose that for all x, y ∈ U, c_{x,y} ∩ A ≠ ∅ implies
c_{x,y} ∩ B ≠ ∅. If (x, y) ∈ ind(B), i.e. a0(x) = a0(y) for every a0 ∈ B, then c_{x,y} ∩ B = ∅,
hence c_{x,y} ∩ A = ∅, i.e. a0(x) = a0(y) for every a0 ∈ A; since B ⊆ A always gives
ind(A) ⊆ ind(B), it follows that ind(B) = ind(A).
The above lemma enables us to find the core of a set of parameters; namely, we have the
following theorem.
Theorem 4.1. Let S = (U, E, F) be a soft set information system and let A ⊆ E. Then the
following assertion holds: a ∈ Core_S(A) if and only if there are x, y ∈ U such that
c_{x,y} ∩ A = {a}.
Proof. Let a ∈ A. Since (A − {a}) ⊆ A, by Lemma 4.3 (iii), ind(A − {a}) = ind(A) if for all
x, y ∈ U, c_{x,y} ∩ A ≠ ∅ implies c_{x,y} ∩ (A − {a}) ≠ ∅. Hence ind(A − {a}) ≠ ind(A) iff
there are x0, y0 ∈ U such that c_{x0,y0} ∩ A ≠ ∅ and c_{x0,y0} ∩ (A − {a}) = ∅, i.e. iff there
are x0, y0 ∈ U such that c_{x0,y0} ∩ A = {a}.
This theorem says that a parameter a ∈ Core_S(A) iff there are x, y ∈ U such that a is the only
parameter that allows us to make a distinction between x and y. In other words, the only
division between x and y is provided by their a-parameter.
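To make the constructions of this subsection concrete, the following R sketch builds the discernibility matrix c_{x,y} and the core from a small binary object-by-parameter table; the 0/1 matrix inc is an invented toy example, not the valuation table used later in the paper.

# Discernibility matrix and core of a soft set formal context (toy 0/1 table)
inc <- matrix(c(1, 0, 1,  1, 1, 0,  0, 1, 0), nrow = 3, byrow = TRUE,
              dimnames = list(c("h1", "h2", "h3"), c("e1", "e2", "e3")))
disc  <- function(x, y) colnames(inc)[inc[x, ] != inc[y, ]]   # c_{x,y} = {e | (x,y) in div(e)}
pairs <- t(combn(rownames(inc), 2))
cxy   <- lapply(seq_len(nrow(pairs)), function(i) disc(pairs[i, 1], pairs[i, 2]))
# Core = parameters that are the only discerning parameter for some pair (Theorem 4.1)
core  <- unique(unlist(cxy[lengths(cxy) == 1]))
core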
We have now presented several reasons why a soft set may be treated as an information system.
Other researchers might directly put a value on each parameter by treating an object parameter
as an attribute of an information system and giving a value (binary, fuzzy or interval-valued
numbers) to each attribute. Bringing the soft set into the structure of an information system
allows us to find a soft solution. In this work we use ranking values, much like simple fuzzy
numbers, to valuate the parameters in the soft set information system, which can be done by
anyone, rather than directly setting the membership values, which must be done by an expert for
each object under consideration. A soft solution is expected to come out in this way. This set
of soft solutions will be used in the recommendation system, which is defined as follows.
Definition 4.2. (regions of recommendation). Given a pair (F', E') over U' which is a soft
solution of the soft set (F, E) over U, a set X ⊆ U' of objects and a set A ⊆ E' of parameters,
the regions of recommendation are defined by the A-parameters which are sufficient to classify
all the elements of U' either as members or non-members of X. Since the A-parameters might not
be able to distinguish sufficiently between individual objects, it might be possible to use
clusters of
Example 2. There are six houses under consideration (H1, H2, H3, H4, H5, H6) which Mr. X may
buy. It is clear that his evaluation of each house is not very accurate; it is 'soft' because he
used whatever qualitative and quantitative information or knowledge he had to judge each house.
He must then give a valuation for each house based on the parameters. The usual way of dealing
with the evaluation problem is to use a ranking, rating or membership degree for the objects,
which allows someone to express preference, knowledge, perception or a common impression in an
easy and fairly flexible way. This soft expression affects his evaluation of each house's
parameters. The simplest way, when someone is forced to choose, is to give a ranking or rating
for each house based on each parameter. A tabular representation of the houses and parameters is
useful to describe Mr. X's evaluation. It is filled in using asterisks, with the convention
'more asterisks, better meets the parameter'.
From Table 1 we can read Mr. X's evaluation. H1 is as expensive as H3, even though the actual
prices of the two houses are not exactly the same. H1 and H3 are priced higher than the other
houses, H4 and H6 are in the middle range, while H2 and H5 are the lowest in Mr. X's evaluation.
Some of the parameters are opposites, i.e., expensive and cheap, in good repair
and in bad repair. It is difficult to identify and directly obtain a solution from this table
because of the qualitative evaluation. The main decision problem is the fact that the
differences between the houses are relative evaluations of each other with respect to the
parameters. Collecting the entries of each cell in the table according to the number of
asterisks gives the houses' rankings with respect to each parameter.
From Table 2, H1 and H4, and H2 and H3, have the same highest attractiveness for several
parameters, but it is difficult to compare them and know exactly which parameters these are. H6
has the highest value among the middle preferences. Table 2 can serve as a soft solution or
recommendation to Mr. X for choosing one of them. The soft set ranking solution or
recommendation is {H1 and H4, H2 and H3, H6, H5}. This solution is very general; suppose Mr. X
assigns several parameters as priority parameters that he thinks are most important when buying
a house. To buy a house, Mr. X gives priority to the parameters 'beautiful', 'cheap', 'modern'
and 'in good repair'. All entries of each cell in Table 1 are accumulated according to his
priority parameters and then placed in Table 3.
From Table 3, there are some recommendations for Mr. X: H4, which dominates the other houses
with two priority parameters having three and two asterisks; then H1 and H3, or H2 and H5,
which have the same values in the priority parameters; and last H6. H6 is interesting because,
even though it has no entry with three asterisks, all its priority parameters are in the middle.
The soft solution offered to Mr. X, from which he may choose, can be clustered into four groups
{(H4), (H1 and H3), (H2 and H5), (H6)}.
To better utilize the information from the tables and provide added value to our recommendation
to Mr. X, we use multidimensional scaling techniques. Non-metric multidimensional scaling
techniques are common techniques based on ordinal or qualitative rankings of similarity data
(Kruskal [12]). Therefore, Table 3 needs to be transformed into an ordinal table.
Using the software R (R Development Core Team [13]) with the vegan package and the metaMDS
procedure (Dixon and Palmer [14]), we obtain the mapping of Mr. X's view of the houses and
parameter rankings in Figure 1.
Figure 1. Non-metric MDS map (NMDS1 versus NMDS2) of the houses H1-H6 and the parameters
'wooden', 'expensive', 'modern(P)', 'cheap(P)', 'beautiful(P)' and 'green surroundings'.
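The metaMDS call behind Figure 1 can be sketched as follows; the ranks matrix below is only a random stand-in for Table 3 (whose entries are not reproduced here), so the shape of the call, not the numbers, follows the paper.

# Non-metric MDS of the house-by-parameter ranking table (stand-in data)
library(vegan)
set.seed(1)
ranks <- matrix(sample(1:3, 6 * 8, replace = TRUE), nrow = 6,
                dimnames = list(paste0("H", 1:6),
                                c("expensive", "beautiful", "wooden", "cheap",
                                  "green.surroundings", "modern", "good.repair", "bad.repair")))
mds <- metaMDS(ranks, distance = "bray", k = 2, trace = FALSE)
plot(mds, type = "t")   # text labels for the houses and the parameters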
From Figure 1 we get an illustration of the houses and their parameters, where H3 and H4 are
'in the green surroundings', 'beautiful' and 'in good repair' (two of these parameters being
priorities), H2 tends to the 'cheap' parameter (a priority parameter), H5 is close to the
parameter 'bad repair', while H1 is far from the priority parameters and H6 is in the middle of
the evaluation. From this illustration we can determine a set as a solution of the soft set,
which we call a soft solution. A pair (F', E') is called a soft solution for the soft set (F, E)
if and only if E' is a subset of E and F' is a domination mapping of E' into the set of all
subsets of the set U. Domination is understood in the following way. Say V and W are sets of
parameters, where V = {v1, v2, ...} and W = {w1, w2, ...} are subsets of E'. We say that the set
of parameters V dominates W on a set of all subsets of U if and only if vi ≥ wi for every i and
there exists an index j such that vj > wj.
Houses H3 and H4 are close to the parameters 'green surroundings', 'beautiful' and 'good
repair', but H4 is dominated by 'green surroundings' and 'beautiful' while H3 is dominated
by 'good repair'. This can be explained as follows: let V = {green surroundings, beautiful, good
repair} and W = {good repair}; H4 is not dominated by the whole of V because v3 = 'good repair'
equals w1 = 'good repair', where w1 ∈ W, and it is w1 that dominates H3. House H1 is inclined
towards the parameters 'expensive' and 'modern', and H6 is in the middle of the preferences. H2
is dominated by the parameter 'cheap', while H5 is dominated by the parameter 'bad repair'. The
solution of the soft set for Mr. X's problem is then
Soft solution (F', E') = {(green surroundings, beautiful) = H4, (good repair) = H3,
(expensive, modern) = H1, (cheap) = H2, (bad repair) = H5, ( ) = H6}.
This set of soft solutions is used as a basis for recommendations to Mr. X for choosing a house.
So the recommendation for Mr. X is
{(H1, H2, H3, H4, H5, H6), (expensive, beautiful, wooden, cheap, in the green surroundings,
modern, in good repair, in bad repair), (green surroundings, beautiful) = H4, (good repair) =
H3, (expensive, modern) = H1, (cheap) = H2, (bad repair) = H5, ( ) = H6}.
Say Mrs. X agrees with her husband, Mr. X, about the parameters used and their evaluation for
each house, but she has different priority parameters. Her priority parameters are (expensive,
beautiful, in the green surroundings and in good repair). By compromising between both of their
choices of priority parameters, the recommendations are: choose H4, which has two dominant
priority parameters, i.e. 'green surroundings' and 'beautiful', or H3, which has one dominant
priority parameter, i.e. 'in good repair'.
To see the grouping of the houses based on the parameters in Mr. X's view, we can obtain soft
clusters of the houses. Using three hierarchical clustering techniques, namely single, complete
and average linkage, we can draw the dendrograms of the houses (Figure 2).
Figure 2. Dendrograms of the houses H1-H6 using single, complete and average linkage.
For the single and complete linkage techniques, the houses are separated into three groups,
(H2, H5), (H1, H3, H4) and H6, while for the average linkage technique (H1, H3, H4, H6) and
(H2, H5) fit into two groups. With the help of the functions rect.hclust and cutree from base R
(R Development Core Team [13]), used to visualize the group cutting and to make a classification
vector with 3 classes from the dendrogram of the complete linkage technique, we can see the
result in Figure 3, which shows the separation into three groups. We can also run those functions
for the average or single linkage technique. Figure 3 shows the three soft clusters of the
houses: (H1, H3, H4), (H2, H5) and H6. The first group is dominated by the parameters 'green
surroundings', 'beautiful', 'good repair' and 'expensive', the second group is dominated by the
parameters 'cheap' and 'bad repair', while the last group is in the middle of the preferences.
Figure 3. Three groups of houses using the complete linkage technique.
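The dendrograms of Figures 2 and 3 can be reproduced along the following lines, with the ranks matrix of the NMDS sketch above standing in for Mr. X's evaluation table; the choice of the Bray-Curtis dissimilarity is an assumption, since the paper does not state which dissimilarity was used.

# Hierarchical clustering of the houses (ranks as in the NMDS sketch above)
library(vegan)                                   # for vegdist()
houses.dis <- vegdist(ranks, method = "bray")    # dissimilarities between houses
hc <- hclust(houses.dis, method = "complete")    # also try "single" and "average"
plot(hc, main = "Cluster Dendrogram")
rect.hclust(hc, k = 3)                           # draw the three-group cut
groups <- cutree(hc, k = 3)                      # classification vector with 3 classes
groups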
In this section we illustrate a user interface for a soft set recommendation system for
purchasing furniture products in a furniture store. Buyers are offered assistance from a
furniture expert in choosing products. The system displays all collections of furniture items,
as shown in Figure 4. In this example, the buyer needs to see the collection of dining chairs.
Selected chairs are simply clicked, and the salesman will bring or show the items (as shown in
Figure 5, selected items column). Buyers can try out and feel the comfort of each chosen chair.
Buyers can specify their own requirements for their dining chairs; in this example, someone has
several things which she regards as prerequisites for a dining chair, i.e. 'match with my dining
room decoration', 'fit the space of my dining room', 'cheap', 'comfort', 'classic' and 'bright
wood color'. These needs can be regarded as parameters of each chair. She considers this
information/knowledge to be the necessary parameters for choosing the chair she needs for
inviting a special guest to dinner. She does not need to be an expert before buying a dining
chair, does not need to estimate interval valuations for each chair criterion/requirement, and
does not need to try each chair for a long time before deciding which one should be bought. The
simple way of evaluating the selected items is to compare them in a fairly flexible way by
giving more marks to the chairs that meet her requirements.
Figure 4 and Figure 5. Illustration of the user interface: the dining chairs collection, the
selected items, the customer's evaluation of each item with asterisks ('more asterisks, more
meets the request') for the requested parameters (match the dining room decoration, fit the
space of the dining room, cheap, comfort, classic, wood colour), the resulting recommendation
(e.g. 'tend to match', 'tend to cheap'), and 'people who bought the items also bought'
suggestions.
6. CONCLUSION
In this paper we have introduced an approach different from that of other researchers, who give
binary or fuzzy evaluations to the parameters of soft sets. Binary or fuzzy evaluation of the
parameters of a soft set produces an exact solution for the soft set, whereas Molodtsov
emphasized an approximate solution for the soft set rather than an exact one. A recommendation,
or soft solution, describes the dominant parameters for each choice. This example shows strictly
dominant parameters; we call a parameter strictly dominant when a member of the set of all
subsets of U has a set of parameters with empty intersection with the other sets of parameters.
Another kind of dominant parameter is a weakly dominant parameter, which allows a member of the
set of all subsets of U to have a set of parameters with nonempty intersection with other sets
of parameters. The most important part of a soft set is an adequate set of parameters of the
objects under consideration, determined by the person dealing with the decision-making problem.
An adequacy test is needed to test the adequacy of the parameters of the soft set. The object
parameters of a soft set are called minimal if removing one parameter results in failure to give
a soft solution. A parameter is called adequate if and only if it can describe the objects under
consideration, while a set of parameters may be called inadequate if and only if those
parameters cannot describe the objects under consideration. This paper has presented a useful
study using a simple approximation in soft set theory for a decision-making problem.
References
[1] MAJI, P.K., A.R. ROY AND R. BISWAS, (2002), An Application of Soft Sets in A Decision Making Problem,
Computers and Mathematics with Applications 44: 1077-1083.
[2] MOLODTSOV, D (1999), Soft Set Theory – First Results, Computers and Mathematics with Applications 37: 19
– 31.
[3] MAJI, P.K., A.R. ROY AND R. BISWAS, (2003), Soft Sets Theory, Computers and Mathematics with
Applications 45: 555-562.
[4] HERAWAN, T. AND DERIS, M.M., (2011), A soft set approach for association rules mining, Knowledge Based
Systems 24: 186 – 195
[5] ZOU, YAN AND XIAO, ZHI, (2008) Data Analysis Approaches of Soft Sets under Incomplete Information,
Knowledge Based Systems 21: 941 -945,
[6] ROY, A.R. AND P.K. MAJI(2007), A Fuzzy Soft Set Theoretic Approach to Decision Making Problems,
Computational and Applied Mathematics 203: 412-418
[7] YANG, X; D. YU, J. YANG, C. WU, (2007)Generalization of soft set theory: From crisp to fuzzy case, in Fuzzy
Information and Engineering (ICFIE), ASC 40: 345-354
[8] FENG, FENG, XIAOYAN LIU, VIOLETA LEOREANU-FOTEA, YOUNG BAE JUN, (2011), Soft sets and soft rough sets,
Information Sciences 181: 1125 – 1137
[9] JUN, YOUNG BAE; KYOUNGJA LEE AND CHUL HWAN PARK, (2010) Fuzzy soft sets theory applied to BCK/BCI-
algebras, Computers and Mathematics with Applications 59: 3180-3192
[10] JIANG, YUNCHENG; YONG TANG, QIMAI CHEN (2011) An adjustable approach to intuitionistic fuzzy soft sets
based decision making, Applied Mathematical Modelling 35: 824-836
[11] NIJKAMP, PETER AND ASTRID SOFFER (1979) Soft Multicriteria Decision Models for Urban Renewal Plans,
Research memorandum no. 1979-5, Paper Sistemi Urbani, Torino.
[12] KRUSKAL, J. B. (1964) Nonmetric multidimensional scaling: A numerical method, Psychometrika 29, (1964),
pp. 115-29.
[13] R DEVELOPMENT CORE TEAM (2006), R: A language and environment for statistical computing. R Foundation
for Statistical Computing, Vienna, Austria. URL http://www.r-project.org/.
[14] DIXON, P., PALMER, M.W., (2003). Vegan, a package of R function for community ecology, Journal of
Vegetation Science 14, 927-930.
[15] FENG, F., JUN, Y.B., LIU, X., LI, L. (2010): An adjustable approach to fuzzy soft set based decisionmaking.
Journal of Computational and Applied Mathematics 234, 10–20.
[16] CHEN, D., TSANG, E.C.C., YEUNG, D.S., AND WANG, X. (2004). The Parameterization Reduction of Soft Sets
and its Applications, Computers and Mathematics with Applications 49 (2005) 757-763.
[17] KONG, Z., GAO, L., WANG, L. AND LI, S., (2008), The normal parameter reduction of soft sets and its
algorithm, Comput. Math. Appl. 56 (12): 3029-3037.
[18] DEMRI, S. P. AND ORLOWSKA, E. S., 2002, Incomplete Information: Structure, Inference, Complexity, Springer-
Verlag, Berlin Heidelberg, New York.
SUBANAR
Mathematics Department, Faculty of Mathematics and Natural Sciences,
Universitas Gadjah Mada
Sekip Utara, Jogjakarta, Indonesia 55528
e-mail: [email protected]
EDI WINARKO
Mathematics Department, Faculty of Mathematics and Natural Sciences,
Universitas Gadjah Mada
Sekip Utara, Jogjakarta, Indonesia 55528
e-mail: [email protected]
Proceedings of ”The 6th SEAMS-UGM Conference 2011”
Statistics, pp. 849 - 858.

HETEROSCEDASTIC TIME SERIES MODEL BY WAVELET TRANSFORM

RUKUN SANTOSO, SUBANAR, DEDI ROSADI, AND SUHARTONO

Abstract. The Box-Jenkins approach is a standard method for stationary time series modeling. When the
variance varies over time, the ARCH model is proposed to capture the time series structure (Engle, 1982);
in 1986, Bollerslev generalized it into the GARCH model. The standard approach of these models is to
introduce an exogenous variable along with some assumptions. This paper proposes an alternative solution
when the exogenous variable does not fulfil these assumptions. The discrete wavelet transform can be used
to analyze the time series structure when the sample size is an integer power of 2. When the sample size
is arbitrary, the undecimated wavelet transform is proposed.
Keywords and Phrases :
1. INTRODUCTION
Volatility in time series data is indicated by a variance that changes over time, i.e. there is a
heteroscedasticity property in the data. Under this condition, the data cannot be modeled by the
Box-Jenkins method directly. The earliest heteroscedastic model was proposed by Engle [5] in 1982
and is called the Autoregressive Conditional Heteroscedasticity (ARCH) model. The heteroscedastic
properties of the data are captured by an AR(p) model of the error component:
$$ \epsilon_t = \sigma_t v_t, \qquad \sigma_t^2 = \alpha_0 + \alpha_1 \epsilon_{t-1}^2 + \cdots + \alpha_p \epsilon_{t-p}^2 \qquad (1) $$
where {v_t} is an iid sequence with mean 0 and variance 1, α_0 > 0, and α_i ≥ 0 for i > 0. In
practice, v_t is usually assumed to follow the standard normal or a standardized Student-t
distribution. The model (1) was developed further by Bollerslev [2] in 1986 under the assumption
that σ_t² follows an ARMA(p,q) model. This paper does not cover the complete solution of the
ARCH/GARCH model, but gives an alternative solution when the assumption on v_t is violated. The
wavelet method studied here belongs to nonparametric modeling, which is free of distributional
assumptions.
A wavelet is a small wave function that can build an orthonormal basis for L²(R), so that every
function f ∈ L²(R) can be expressed as a linear combination of wavelets [4]
$$ f(t) = \sum_{k} c_{J,k}\, \phi_{J,k}(t) + \sum_{j=1}^{J} \sum_{k} w_{j,k}\, \psi_{j,k}(t) \qquad (2) $$
where φ and ψ are the father and mother wavelet respectively, with dilation and translation indexes
$$ \phi_{j,k}(t) = 2^{j/2}\, \phi(2^{j} t - k) \qquad (3a) $$
$$ \psi_{j,k}(t) = 2^{j/2}\, \psi(2^{j} t - k). \qquad (3b) $$
In the discrete version the wavelet can construct an orthonormal filter matrix, so that every
discrete realization of f ∈ L²(R) can be decomposed into a scaling (smooth) component (S) and
detail components (D) [8].
Let h = [h_0, h_1, ..., h_{L-1}] be a wavelet filter; then the scaling filter g can be derived
from h by
$$ g_i = (-1)^{i+1} h_{L-1-i}, \qquad i = 0, 1, \ldots, L-1. \qquad (4) $$
For example, the Haar filter h and its scaling filter g are
$$ h = \left[ \tfrac{1}{\sqrt{2}}, -\tfrac{1}{\sqrt{2}} \right] \quad \text{and} \quad g = \left[ \tfrac{1}{\sqrt{2}}, \tfrac{1}{\sqrt{2}} \right]. \qquad (5) $$
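Equation (4) translates directly into R; the short sketch below merely re-derives the Haar scaling filter of (5) and does not depend on any wavelet package.

# Scaling filter g from a wavelet filter h via g_i = (-1)^(i+1) h_(L-1-i)
h <- c(1, -1) / sqrt(2)                       # Haar wavelet filter
L <- length(h)
g <- sapply(0:(L - 1), function(i) (-1)^(i + 1) * h[L - i])   # h[L - i] is h_(L-1-i) in 1-based indexing
g                                             # 0.7071068 0.7071068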
Let {Z_t} be a time series from a discrete-time realization of f ∈ L²(R) with t = 1, 2, ..., N,
N = 2^J. Then the coefficients w_{j,k} and c_{J,k} in equation (2) can be computed by the
discrete wavelet transform (DWT) as shown in (6), where H is an N × N orthonormal filter matrix
whose rows contain the circularly shifted wavelet filters h^{(j)} of levels j = 1, ..., J and the
level-J scaling filter g^{(J)}:
$$ W = H Z. \qquad (6) $$
The up-sampled version of h, denoted h_up, is constructed by inserting zeros between the non-zero
filter values. The filter of a higher level (j = 2, 3, ..., J) is obtained by the convolution of
h_up and g, as shown in (7):
$$ h^{(j)} = \left(h^{(j-1)}\right)_{up} * g. \qquad (7) $$
For example, in the Haar case,
$$ h^{(2)} = \left[ \tfrac{1}{\sqrt{2}}, 0, -\tfrac{1}{\sqrt{2}} \right] * \left[ \tfrac{1}{\sqrt{2}}, \tfrac{1}{\sqrt{2}} \right] = \left[ \tfrac{1}{2}, \tfrac{1}{2}, -\tfrac{1}{2}, -\tfrac{1}{2} \right]. $$
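The cascade (7) can be checked in a few lines of R; the small convolution helper below avoids relying on any particular package convention and reproduces the level-2 Haar filter.

# Level-2 filter from (7): h2 = upsample(h1) * g  (plain convolution helper)
conv <- function(a, b) {                       # full linear convolution of two vectors
  out <- numeric(length(a) + length(b) - 1)
  for (i in seq_along(a)) out[i:(i + length(b) - 1)] <- out[i:(i + length(b) - 1)] + a[i] * b
  out
}
h    <- c(1, -1) / sqrt(2)                     # Haar wavelet filter
g    <- c(1,  1) / sqrt(2)                     # Haar scaling filter
h.up <- c(h[1], 0, h[2])                       # zero inserted between the filter taps
h2   <- conv(h.up, g)                          # 0.5  0.5 -0.5 -0.5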
When the sample size is not of the form 2^J, J ∈ Z, the coefficients w_{j,k} and c_{J,k} can be
computed by the undecimated wavelet transform (UDWT). The scheme of the UDWT for j = 1 is shown
in Figure 1. The wavelet coefficients w_{1,k} result from the convolution of the time series Z
and h. The first detail component D1 results from the convolution of w_{1,k} and h', where h' is
the time-reversed version of h. The scaling coefficients c_{1,k} result from the convolution of
Z and g. The first smooth component S1 results from the convolution of c_{1,k} and g', where g'
is the time-reversed version of g. Furthermore, Ẑ = S1 + D1 equals Z with respect to wavelet
filtering.
Higher levels of the UDWT can be constructed by splitting the scaling coefficients c_{j,k} into
c_{j+1,k} and w_{j+1,k}. The UDWT for j = 2 is shown in Figure 2. Furthermore, Ẑ = S2 + D2 + D1
equals Z with respect to wavelet filtering.
The number of DWT coefficients at level j + 1 is half of that at level j. On the other hand, the
number of UDWT coefficients is always the same for all decomposition levels. This property makes
the UDWT more suitable than the DWT for analyzing the time series. Therefore, this paper
discusses wavelet-based prediction of time series using the UDWT only.
Prediction of Z at time t + 1 is made by referring to realizations of Z in the past and to the
wavelet coefficients resulting from the decomposition. Starck [9] proposes that the wavelet
coefficients needed at each level j for forecasting at time N + 1 have the form
w_{j, N-2^j(k-1)} and c_{J, N-2^J(k-1)}. The forecasting formulation is expressed in equation (8):
$$ \hat{Z}_{N+1} = \sum_{j=1}^{J} \sum_{k=1}^{A_j} \hat{a}_{j,k}\, w_{j,\, N-2^{j}(k-1)} + \sum_{k=1}^{A_{J+1}} \hat{a}_{J+1,k}\, c_{J,\, N-2^{J}(k-1)}. \qquad (8) $$
The highest level of decomposition is indicated by J, and A_j indicates the number of
coefficients chosen at level j. For example, if J = 4 and A_j = 2 for j = 1, 2, 3, 4, then (8)
can be expressed as (9):
$$ \hat{Z}_{N+1} = \hat{a}_{1,1} w_{1,N} + \hat{a}_{1,2} w_{1,N-2} + \hat{a}_{2,1} w_{2,N} + \hat{a}_{2,2} w_{2,N-4} + \hat{a}_{3,1} w_{3,N} + \hat{a}_{3,2} w_{3,N-8} + \hat{a}_{4,1} w_{4,N} + \hat{a}_{4,2} w_{4,N-16} + \hat{a}_{5,1} c_{4,N} + \hat{a}_{5,2} c_{4,N-16}. \qquad (9) $$
Furthermore, the least squares method can be used to estimate the coefficients a_{j,k} in
equations (8) and (9).
The data on the currency exchange rate from USD to IDR are used to implement the proposed
method. The daily IDR equivalent of USD 1 throughout the year 2003 is modeled according to
equation (9). Statistical tests are presented to check that the data are suitable for this
purpose.
The actual data, shown in Figure 4, indicate that the data come from a non-stationary process.
The standard Box-Jenkins method suggests differencing the data. The result of one-lag
differencing is shown in Figure 3 and gives a sign of heteroscedasticity. The ACF and PACF plots
indicate that neither AR nor MA terms are significant. It appears that the data can be modeled
as Z_t = ε_t, where the ε_t are not normally distributed. The Ljung-Box tests for {ε_t} and
{ε_t²} indicate that the {ε_t} are independent but the {ε_t²} are dependent, so it can be
concluded that heterogeneity of variances occurs. The GARCH(1,1) model looks like the nearest
model for {Z_t}, but the Jarque-Bera test does not support the residual normality assumption.
Finally, it is concluded that the standard ARIMA and GARCH models fail to capture the data
pattern.
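The diagnostic checks described in this paragraph could be scripted as follows; kurs2003 is the exchange-rate series listed in the appendix, and the use of the tseries package for the GARCH fit and the Jarque-Bera test is one possible choice rather than the authors' original code.

# Diagnostic checks on the differenced exchange-rate series (one possible scripting)
library(tseries)
z <- diff(kurs2003)                    # one-lag differencing
acf(z); pacf(z)                        # neither AR nor MA terms expected to be significant
Box.test(z,   lag = 12, type = "Ljung-Box")   # independence of {e_t}
Box.test(z^2, lag = 12, type = "Ljung-Box")   # dependence of {e_t^2} -> heteroscedasticity
fit <- garch(z, order = c(1, 1))       # tentative GARCH(1,1) fit
jarque.bera.test(na.omit(residuals(fit)))     # residual normality is rejected in the paper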
As discussed above, it is a long way to a final solution in parametric modeling. Next, a simpler
way to build a prediction model in a nonparametric sense, especially a wavelet-based model, is
presented. Although wavelet computation theory is complicated, there is software that makes it
easier. The step-by-step modeling algorithm can be explained as follows.
The summary of the parameter estimation gives the prediction model in the form (10).
5. CONCLUDING REMARK
Wavelet transforms, especially the UDWT, can be used to produce estimation models of time series. This modeling is simpler and easier to implement, and the graphical views show that the method gives a good approximation. However, a wider comparison with other methods and further analytical study must be done before a comprehensive conclusion can be drawn.
References
[1] Aldrich, E., wavelets: A Package of Functions for Computing Wavelet Filters, Wavelet Transforms and Multiresolution Analyses, http://www.ealdrich.com/wavelets/.
[2] Bollerslev, T., "Generalized autoregressive conditional heteroskedasticity", Journal of Econometrics, Vol. 31, 1986, pp. 307-327.
[3] Ciancio, A., "Analysis of Time Series with Wavelets", International Journal of Wavelets, Multiresolution and Information Processing, Vol. 5 (2007), No. 2, pp. 241-256.
[4] Daubechies, I., Ten Lectures on Wavelets, SIAM, Philadelphia, 1992.
[5] Engle, R.F., "Autoregressive Conditional Heteroscedasticity with Estimates of the Variance of United Kingdom Inflation", Econometrica, Vol. 50 (1982), No. 4, pp. 987-1008.
[6] Jawerth, B. and Sweldens, W., "An Overview of Wavelet Based Multiresolution Analyses", SIAM Review, Vol. 36 (1994), pp. 377-412.
[7] Ogden, R.T., Essential Wavelets for Statistical Applications and Data Analysis, Birkhäuser, Berlin, 1997.
[8] Percival, D.B. and Walden, A.T., Wavelet Methods for Time Series Analysis, Cambridge University Press, 2000.
[9] Starck, J.L., et al., "The Undecimated Wavelet Decomposition and its Reconstruction", IEEE Transactions on Image Processing, Vol. 16 (2007), No. 2.
Rukun Santoso
Program Studi Statistika Universitas Diponegoro
e-mail: [email protected]
Subanar
Jurusan Matematika FMIPA UGM
Dedi Rosadi
Jurusan Matematika FMIPA UGM
Suhartono
Jurusan Statistik ITS Surabaya
R Code Listing
library(wavelets)   # provides modwt(); see Aldrich [1]

# Wavelet-based prediction of equation (9); the function name is added here for clarity.
wavpred <- function (x, wv = 'haar', j = 4)
{
  n <- length(x)
  x.modwt <- modwt(x, wv, j)
  # detail (W) and scaling (V) coefficients of the UDWT/MODWT decomposition
  d1 <- x.modwt@W$W1
  d2 <- x.modwt@W$W2
  d3 <- x.modwt@W$W3
  d4 <- x.modwt@W$W4
  v4 <- x.modwt@V$V4
  # regressors of equation (9): lags N, N-2, N-4, N-8, N-16 at the appropriate levels
  w1 <- w2 <- w3 <- w4 <- w5 <- w6 <- w7 <- w8 <- c9 <- c10 <- NULL
  for (i in 1:(n - 17)) {
    w1  <- c(w1,  d1[i + 16])
    w2  <- c(w2,  d1[i + 14])
    w3  <- c(w3,  d2[i + 16])
    w4  <- c(w4,  d2[i + 12])
    w5  <- c(w5,  d3[i + 16])
    w6  <- c(w6,  d3[i + 8])
    w7  <- c(w7,  d4[i + 16])
    w8  <- c(w8,  d4[i])
    c9  <- c(c9,  v4[i + 16])
    c10 <- c(c10, v4[i])
  }
  z <- x[18:n]
  # least squares estimation of the coefficients a_{j,k} in equation (9)
  lm.z <- lm(z ~ -1 + w1 + w2 + w3 + w4 + w5 + w6 + w7 + w8 + c9 + c10)
  koef <- lm.z$coeff                    # estimated coefficients a_{j,k}
  pred <- c(rep(0, 17), lm.z$fitted)    # fitted values aligned with the original series
  # actual series (solid line) and fitted values (dashed)
  ts.plot(z, xlim = c(0, 250), ylim = c(8500, 9800), xlab = "", ylab = "", type = 'l')
  par(new = TRUE)
  ts.plot(pred, xlim = c(0, 250), xlab = "Daily Time", ylab = "$1 Equivalencies",
          ylim = c(8500, 9800), col = 2, lty = 4)
  return(lm.z)
}
DATA
> kurs2003
9468 9431 9435 9435 9424 9440 9433 9400 9360 9364 9364 9376 9387 9390 9388
9385 9390 9393 9392 9336 9364 9376 9380 9369 9363 9375 9367 9384 9405 9470
9435 9417 9389 9378 9384 9381 9418 9392 9392 9402 9405 9383 9363 9388 9375
9387 9383 9380 9400 9410 9419 9490 9525 9620 9525 9480 9440 9415 9415 9399
9408 9406 9397 9405 9396 9376 9377 9362 9370 9374 9342 9339 9295 9165 9230
9270 9218 9240 9275 9200 9175 9175 9161 9148 9058 9070 9000 9035 8975 8949
8951 8965 8890 8863 8830 8825 8665 8670 8730 8779 8837 8721 8730 8670 8675
8700 8690 8745 8760 8730 8695 8690 8698 8747 8730 8753 8725 8723 8726 8780
8785 8745 8735 8708 8703 8666 8695 8709 8718 8718 8725 8720 8740 8747 8770
8801 8895 9083 9165 9025 8995 9065 9090 9005 8980 9013 8993 9118 9076 9053
9033 9025 9034 9053 9047 8989 8944 8880 8906 8920 8988 8957 9018 9035 8983
8994 8988 8990 9003 9000 8970 8965 8941 8971 8955 8959 8960 8985 8991 8910
8930 8950 8950 8925 8889 8885 8870 8875 8890 8888 8871 8877 8886 8865 8893
8939 8945 8945 8937 8940 8958 8959 8998 9038 9083 9077 9020 8995 9015 9025
8988 8983 8990 8986 8981 8980 8980 9022 8997 8985 8974 8990 9037 9009 8996
8981 8988 9000 8991 8990 8988 8995 8990 8983 8990 8983 8988 8996 8994 8995
8986 8991 8947
Proceedings of ”The 6th SEAMS-UGM Conference 2011”
Statistics, pp. 859–864.
Keywords and Phrases: nonparametric regression, kernel, bandwidth, central limit theorem
1. INTRODUCTION
Consider N time series of observations {X_it}_{t=1}^{T}, i = 1, . . . , N, following the model

X_it = m_i(t/T) + ε_it,   t = 1, . . . , T    (1)

where m_i(u), u ∈ [0, 1], is the mean function of {X_it}_{t=1}^{T} and {ε_it}_{t=1}^{T} is an error process with mean zero and finite variance. It will be investigated whether the shapes of the mean functions m_i(.), i = 1, . . . , N, are identical.
In fact we want to test the null hypothesis that the m_i(.) are parallel, that is, there is a function m(.) such that

H0 : m_i(.) = c_i + m(.),   i = 1, . . . , N    (2)

where the c_i are real constants representing the vertical distances between the curves m_i(.). In testing this hypothesis, the c_i are viewed as nuisance parameters.
Comparison of regression curves is an important problem in regression analysis. For instance, in the study of human growth it is of interest to test whether the growth curves have the same pattern; if a different growth pattern is observed, special attention is needed and the individual concerned needs close monitoring. In longitudinal clinical studies, evaluators are interested in comparing curves corresponding to treatment and control groups.
The comparison problem for mean functions has been discussed widely in the literature. Hardle and Marron (1990) compare the shapes of two regression curves by testing whether one of them is a parametric transformation of the other. King, Hart and Wehrly (1991) used a kernel method to compare two regression curves under independent and identically distributed errors; this method was generalized to several curves by Munk and Dette (1998). Hall and Hart (1990) proposed a bootstrap test to compare two mean functions with independent errors. In the time series setting, Park et al. (2009) propose a graphical device to assess the equality of two mean functions, while Guo and Oyet (2009) apply a wavelet-based method.
The papers above assume that the errors are independent in time and that the number of curves is fixed. In this paper these assumptions are relaxed to allow a dependence structure, and the number of curves can be fixed or tend to infinity. We derive the asymptotic theory of the test statistic based on the L2 distances between the individual trend estimates and the global trend estimate. For implementation of the test, we propose a cross-validation bandwidth selection procedure that accommodates the dependence in the data. Finally, to approximate the finite-sample distribution of the test statistic, a simulation-based method that is more accurate than the normal limiting distribution is presented. Overall, the methodology is fully nonparametric and data-driven.
2. TEST STATISTICS
To ensure model identifiability under the null hypothesis we assume that Σ_{i=1}^{N} c_i = 0. Averaging model (1) over i gives

X̄_t = m(t/T) + ε̄_t    (3)

To test H0 we compare the curves m̂_i estimated under model (1) with the curves ĉ_i + m̂ estimated under H0. To estimate the common trend m under H0 we can use the averaged process X̄_t = Σ_{i=1}^{N} X_it / N, for t = 1, . . . , T. Define also the time averages

X̄_i = Σ_{t=1}^{T} X_it / T    (4)
X̄, ε̄_t and ε̄_i are defined similarly. This paper adopts the local linear smoothing procedure to estimate the trends. To estimate m we use nonparametric regression with a kernel function as the weight. Let K be the kernel function; the estimates of m and of the m_i are, respectively,

m̂(u) = Σ_{t=1}^{T} X̄_t w_h(t, u),   0 ≤ u ≤ 1    (5)
where

$$w_h(t,u) = K\!\left(\frac{u-t/T}{h}\right)\frac{S_{h,2}(u) - (u-t/T)\,S_{h,1}(u)}{S_{h,2}(u)\,S_{h,0}(u) - S_{h,1}^2(u)}$$

and

$$S_{h,j}(u) = \sum_{t=1}^{T} (u-t/T)^j\, K\!\left(\frac{u-t/T}{h}\right), \qquad u \in [0,1]$$
m_i is estimated using the same bandwidth as m, so that the local linear estimate of m_i is

m̂_i(u) = Σ_{t=1}^{T} X_it w_h(t, u),   0 ≤ u ≤ 1.    (6)

The intercepts c_i are estimated by

ĉ_i = (1/T) Σ_{t=1}^{T} [m̂_i(t/T) − m̂(t/T)].    (7)
There are many ways to measure the distance between the curves m̂_i(.) and ĉ_i + m̂(.). This paper uses the L2 distance

$$\Delta^2 = \sum_{i=1}^{N} \int_0^1 \big(\hat m_i(u) - \hat c_i - \hat m(u)\big)^2\, du \qquad (8)$$

as a test statistic. It is clear that Δ² is a natural estimate of the parallelism index

$$\Delta(m_1, m_2, \dots, m_N) = \min_{c_1,\dots,c_N:\ \sum_i c_i = 0}\ \sum_{i=1}^{N} \int_0^1 \big(m_i(u) - c_i - m(u)\big)^2\, du \qquad (9)$$

where m(u) = Σ_{i=1}^{N} m_i(u)/N and c_i = ∫_0^1 (m_i(u) − m(u)) du. A central limit theorem is then obtained for the parallelism index based on the distance between the estimates of the regression curves and their average.
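As an illustration only (not the author's implementation), the estimators (5)-(7) and the statistic (8) can be coded in R as below. The Epanechnikov kernel and the approximation of the integral by an average over the design points u = t/T are assumptions made for this sketch.

K <- function(v) 0.75 * pmax(1 - v^2, 0)     # Epanechnikov kernel (illustrative choice)

# Local linear weights w_h(t, u) from (5)-(6)
wh <- function(u, T, h) {
  d  <- u - (1:T) / T
  Kv <- K(d / h)
  S0 <- sum(Kv); S1 <- sum(d * Kv); S2 <- sum(d^2 * Kv)
  Kv * (S2 - d * S1) / (S2 * S0 - S1^2)
}

# Test statistic Delta^2 of (8); X is a T x N matrix of observations, h the bandwidth
delta2 <- function(X, h) {
  T <- nrow(X); N <- ncol(X)
  W <- sapply((1:T) / T, wh, T = T, h = h)   # T x T matrix; column j holds w_h(., j/T)
  mhat  <- drop(crossprod(W, rowMeans(X)))   # common trend from the averaged series, (5)
  mihat <- crossprod(W, X)                   # individual trend estimates, (6)
  ci    <- colMeans(mihat - mhat)            # intercept estimates, (7)
  resid <- mihat - outer(mhat, rep(1, N)) - outer(rep(1, T), ci)
  mean(rowSums(resid^2))                     # Riemann approximation of the integral in (8)
}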
3. ASYMPTOTIC THEORY
To establish the asymptotic normality of Δ², we impose structural conditions on the error processes (ε_it)_{t=1}^{T}, i = 1, . . . , N, which are assumed to be iid copies of a process (ε_t)_{t=1}^{T} of the form
for all u₁, u₂ ∈ [0, 1]; this condition is written G ∈ SLC. Assuming that E(ε_k) = 0 for all k, let

γ_k(u) = E[G(u; F₀) G(u; F_k)],   0 ≤ u ≤ 1.    (11)

Define the long-run variance function

g(u) = Σ_{k∈Z} γ_k(u)    (12)

Recall that the kernel function K is Lipschitz continuous on its support [0, 1]; let

$$K^*(x) = \int_{-1}^{1-2|x|} K(v)\,K(v+2|x|)\,dv \quad\text{and}\quad K_2^* = \int_{0}^{1} (K^*(v))^2\,dv$$

Theorem 3.1. Let N = N_T be such that either (i) N → ∞ or (ii) N is fixed. Let h = h_T be a bandwidth sequence such that T h^{3/2} → ∞ and h → 0. Further assume that G ∈ SLC and that, for some p > 4, the following short-range dependence condition holds:

Σ_{t=0}^{∞} δ_p(t) < ∞    (14)
              T = 100              T = 300
  b \ N     50    100    150     50    100    150
    1      .94    .93    .96    .91    .95    .96
    2      .95    .93    .96    .92    .98    .93
    3      .93    .94    .94    .93    .94    .95

Table 1. Acceptance probabilities at the 95% level, for different T, N and h
4. SIMULATION STUDY
In this section we present a simulation study to assess the performance of the
testing procedure. Consider the model
X_it = c_i + m(t/T) + ε_it    (18)

where c_i = 2(i/N)² and m(u) = 3 sin(3πu). The error process (ε_it) is generated by ε_it = ψ_{i,t}(t/T), where for all i and u ∈ [0, 1] the process ψ_{i,t}(u) follows the recursion

ψ_{i,t}(u) = ρ(u) ψ_{i,t−1}(u) + σ η_{i,t}    (19)

with the η_{i,t} random variables satisfying P(η_{i,t} = −1) = P(η_{i,t} = 1) = 1/2. Thus, (ε_{i,t})_{t∈Z} for i = 1, . . . , N are iid sequences generated from AR(1) processes with time-varying coefficient. Let ρ(u) = 0.2 − 0.3u and σ = 1. It can easily be shown that E(ψ_{i,t}(u)) = 0, Var(ψ_{i,t}(u)) = σ²/(1 − ρ(u)²), and the long-run variance function is g(u) = σ²/(1 − ρ(u)²). Five estimated regression curves of m_i(u) = c_i + 3 sin(3πu), i = 1, . . . , 5, for five different c_i obtained by adding five different sets of errors ε_i are presented in Figure 1.
In the simulation study the normal kernel is used, and the simulation is repeated 100 times. We are interested in the proportion of realizations for which the null hypothesis is correctly accepted. Acceptance probabilities are presented in Table 1 for different choices of T, N and h.
This suggests that the acceptance probabilities are reasonably close to the 95% nominal level, and become more robust to the size of the bandwidth as the sample size gets bigger.
References
[1] Guo, P. and Oyet, A.J. (2009). On wavelet methods for testing equality of mean response curves. Int. J. Wavelets Multiresolut. Inf. Process. 7, 357-373.
[2] Hall, P. and Hart, J.D. (1990). Bootstrap test for difference between means in nonparametric regression curves. J. Amer. Statist. Assoc. 85, 1039-1049.
[3] Hardle, W. and Marron, J.S. (1990). Semiparametric comparison of regression curves. Ann. Statist. 18, 63-89.
[4] King, E.C., Hart, J.D. and Wehrly, T.E. (1991). Testing the equality of regression curves using linear smoothers. Statist. Probab. Lett. 12, 239-247.
[5] Munk, A. and Dette, H. (1998). Nonparametric comparison of several regression functions: exact and asymptotic theory. Ann. Statist. 26, 2339-2368.
[6] Park, C., Vaughan, A., Hannig, J. and Kang, K.H. (2009). SiZer analysis for the comparison of time series. www.stat.uga.edu
Abstract. The objective of this research is to build a generalized linear model using ordinal
response variable with some covariates. One of the covariates in the model is an indicator
variable of hotspot as the results of the hotspot detection, while the response variable is ordinal
data as the result of ORDIT ranking method. The data in this research is about infant health and
poverty in some districts of East Java; i.e. Blitar, Kediri, and Jember taken from 3 different
infant health district levels. GLM is implemented for 200 villages from each district, so there are
600 villages (sub-districts) as unit observation for modeling. The modeling analysis gives results
that number of farmer families, hotspot, and district are statistically significant as the predictor
to the poverty level of villages.
Keywords and Phrases: generalized linear model, ordering dually in triangles, hotspot detection.
1. INTRODUCTION
2.1 Data. The data in this research concern poverty in East Java. Based on the ORDIT ranking of districts, Blitar, Kediri, and Jember were chosen, one from each ranking group. Figure 1 shows the study areas of this research. The source of the sub-district (village) data is PODES 2008 from BPS. The response variable is constructed by grouping the ranking results based on two poverty indicators. This ordinal response variable has three levels of poverty (good, moderate, and bad). The model is built with this response variable and some covariates; hotspot status is one of the covariates in the GLM.
Figure 1. Map of the study areas (districts of East Java).
2.2.1 Rating Relations/Rules for Ascribing Advantage. This section describes the protocols for comparing cases, or collectives of cases, via ratings, rules and relations that ascribe advantage to some cases over others or fail to do so for particular pairs [1]; a Hasse diagram should be prepared before the protocols are used. There are three possibilities in comparing a pair of cases, where one case is denoted by Э and the other by Є. ЭaaЄ means that Э has ascribed advantage over Є. ЭssЄ means that Э has subordinate status to Є, which implies ЄaaЭ. ЭiiЄ means that the pair is an indefinite instance, without ascribed advantage and without subordinate status, which implies ЄiiЭ. Thus the protocol either designates one member of a pair as having ascribed advantage and the other subordinate status, or designates the pairing as indefinite. ff(aa) is the number of occurrences of ascribed advantage, ff(ss) the number of occurrences of subordinate status, and ff(ii) the number of occurrences of indefinite pairings. Each of the N cases can be compared on this basis to all others in the deleted domain DD = N − 1 of competing cases, with the percent occurrence of these relations tabulated as AA = 100 × ff(aa)/DD, SS = 100 × ff(ss)/DD, II = 100 × ff(ii)/DD. Clearly, AA + SS + II = 100%; for later use let us define CCC = 100 − AA as the complement of case condition relative to ascribed advantage (AA) [1]. All of these computations can be visualized in a Hasse diagram resulting from the partial-order arrangement of a set of subjects with quantitative attributes.
A simple X-shaped Hasse diagram in Figure 2 illustrates the idea: entity A has ascribed advantage over B, D, and E while being indefinite with regard to C. Entity B has subordinate status to A and C, with ascribed advantage over D and E. The deleted domain (DD) is four, since an entity is not paired with itself for present purposes.
2.2.2 Subordination Schematic and Ordering Dually in Triangles (ORDIT). Subordination can be symbolized diagrammatically [1] by the triangle depicted in Figure 3, where the point representing a district divides a triangle into two parts: a 'trapezoidal triplet' (of AA, SS and II) and a topping triangle (of CCC and II). The combination of these two parts forms a right triangle with the 'tip' at AA = 100% in the upper-left and the toe at SS = 100% in the lower-right. The hypotenuse is a right-hand 'limiting line' of plotting positions because AA + SS + II = 100%. The topping triangle provides the basis for an 'ORDIT ordering' of the districts or instances.
According to Figure 3, an idealized district has AA = 100% of the deleted domain (DD) of other districts, that is, the frequency of ascribed advantage equals the number of competing districts; if this ideal actually occurs, the trapezoidal triplet becomes a triangle.
Figure 3. Subordination schematic with plotted instance dividing a right triangle into two
parts, a ‘trapezoidal triplet’ (of AA, SS and II) below, and a ‘topping triangle’ (of CCC, SS
and II) above.
The numbers for ORDIT can be coupled as a decimal value ccc.bbb. The ccc component is obtained by rounding CCC to two decimal places and multiplying by 100. The bbb component is obtained by dividing SS by CCC, imposing 0.999 as an upper limit; these two values are then combined as ccc.bbb. This ordering is assigned the acronym ORDIT, for ORdering Dually In Triangles. It preserves all aspects of AA, SS and II except the actual number of districts. A simple rank ordering of the ORDIT values becomes the salient scaling of the districts [1].
2.2.3 Product-order Rating Regime. A general relational rule for ascribing advantage is
product-order whereby advantage is gained by having all criteria at least as good and at least
one better. Conversely, subordinate status lies with having all criteria at least as poor and at
least one poorer. This relational rule is applicable to all kinds of criteria as long as they have
the same polarity (sense of better and worse).
According to Figure 3 and its computation, ORDIT ordering is the ranking of the instances based on their indicators. ORDITs and salient scaling according to product-order are determined by Scheme 2 [1] in the Function Facilities of R. ORDIT (Ordering Dually In Triangles) corresponds to the topping triangle in Figure 3, and Salient is the ranking of the ORDIT values.
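The following R sketch is an assumed implementation of the computations just described; it is not the Function Facilities code of [1]. It applies the product-order rule of Section 2.2.3 to a matrix of same-polarity criteria and returns AA, SS, II, CCC, the ORDIT value ccc.bbb and the salient scaling.

# 'crit': matrix with one row per case and one column per criterion (larger = better)
ordit <- function(crit) {
  n  <- nrow(crit)
  aa <- ss <- ii <- numeric(n)
  for (i in 1:n) {
    for (j in setdiff(1:n, i)) {
      if (all(crit[i, ] >= crit[j, ]) && any(crit[i, ] > crit[j, ])) {
        aa[i] <- aa[i] + 1               # ascribed advantage
      } else if (all(crit[i, ] <= crit[j, ]) && any(crit[i, ] < crit[j, ])) {
        ss[i] <- ss[i] + 1               # subordinate status
      } else {
        ii[i] <- ii[i] + 1               # indefinite (ties counted here, an assumption)
      }
    }
  }
  DD  <- n - 1                           # deleted domain
  AA  <- 100 * aa / DD; SS <- 100 * ss / DD; II <- 100 * ii / DD
  CCC <- 100 - AA
  bbb <- ifelse(CCC > 0, pmin(SS / CCC, 0.999), 0)
  ORDIT <- round(CCC, 2) * 100 + bbb     # ccc.bbb as described in Section 2.2.2
  data.frame(AA, SS, II, CCC, ORDIT, salient = rank(ORDIT))
}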
2.3 A Spatial Scan Statistic. In this paper a spatial scan statistic is used for hotspot detection. The analysis is always conditioned on the total number of observed points. Windows of a particular size and shape are formed around the regions to capture the highest concentration of cases.
Cases are assumed to follow a Bernoulli model with constant risk over space under the null hypothesis, and with different risk inside and outside at least one circular window under the alternative hypothesis. For each circular window, the numbers of people in poverty inside and outside it are noted, together with the expected number of cases reflecting the population at risk. On the basis of these numbers, the likelihood ratio is calculated for each circular window. The circular window with the maximum likelihood, and with more cases inside than expected, is the most likely cluster (hotspot).
2.3.1 The Bernoulli Model in Hypothesis Statistic. In the Bernoulli model [2], there are cases and non-cases represented by a 0/1 variable; here these variables represent people in poverty or not in poverty. They may reflect cases and controls from a larger population, or they may together constitute the population as a whole. Whatever the situation, these variables are denoted as cases and controls, and their total is denoted as the population. The Bernoulli model requires information about the locations of a set of cases and controls, provided to SaTScan using the case, control and coordinates files. Separate locations may be specified for each case and each control, or the data may be aggregated for states, provinces, counties, parishes, census tracts, postal code areas, school districts, households, etc., with multiple cases and controls at each data location.
Let N denote a spatial point process, where N(A) is the random number of points in a set A ⊂ G. As the window moves over the study area it defines a collection Ƶ of zones Z ⊂ G; Z will be used to denote both a subset of G and a set of parameters defining the zone. For the Bernoulli model, each unit of measure corresponds to an 'entity' or 'individual' who can be in one of two states (yes or no). In the model, there is exactly one zone Z ⊂ G such that each individual within that zone has probability p of being a point, while the probability for an individual outside the zone is q. The probability for any individual is independent of all the others. The null hypothesis is H0 : p = q; the alternative hypothesis is H1 : p > q for some Z ∈ Ƶ. Under H0, N(A) ~ Binomial(μ(A), p) for all sets A, where μ(A) denotes the number of individuals in A. Under H1, N(A) ~ Binomial(μ(A), p) for all sets A ⊂ Z, and N(A) ~ Binomial(μ(A), q) for all sets A ⊂ Z^c [2].
2.3.2 Likelihood Ratio Test. The likelihood function for the Bernoulli model is expressed as

$$L(Z,p,q) = p^{\,n_Z}(1-p)^{\mu(Z)-n_Z}\, q^{\,n_G-n_Z}(1-q)^{(\mu(G)-\mu(Z))-(n_G-n_Z)}$$

where n_Z and n_G denote the numbers of cases in Z and in G, respectively. To detect the zone that is most likely to be a cluster, we find the zone Ẑ that maximizes the likelihood function; in other words, Ẑ is the maximum likelihood estimator of the parameter Z. There are two steps to conclude the hotspot. First, maximize the likelihood function conditional on Z:

$$L(Z) \overset{\mathrm{def}}{=} \sup_{p>q} L(Z,p,q) = \left(\frac{n_Z}{\mu(Z)}\right)^{n_Z}\!\left(1-\frac{n_Z}{\mu(Z)}\right)^{\mu(Z)-n_Z}\!\left(\frac{n_G-n_Z}{\mu(G)-\mu(Z)}\right)^{n_G-n_Z}\!\left(1-\frac{n_G-n_Z}{\mu(G)-\mu(Z)}\right)^{(\mu(G)-\mu(Z))-(n_G-n_Z)} \quad (1)$$

when n_Z/μ(Z) > (n_G − n_Z)/(μ(G) − μ(Z)), and otherwise

$$L(Z) = \left(\frac{n_G}{\mu(G)}\right)^{n_G}\left(1-\frac{n_G}{\mu(G)}\right)^{\mu(G)-n_G} \quad (2)$$

Next, we find the solution Ẑ ∈ {Z : L(Z) ≥ L(Z′) for all Z′ ∈ Ƶ}. The most likely cluster is of interest, and statistical inference should then be carried out for it. Let
$$L_0 \overset{\mathrm{def}}{=} \sup_{p=q} L(Z,p,q) = \left(\frac{n_G}{\mu(G)}\right)^{n_G}\left(1-\frac{n_G}{\mu(G)}\right)^{\mu(G)-n_G} \qquad (3)$$

The likelihood ratio, λ, can be written as

$$\lambda = \frac{\sup_{Z,\,p>q} L(Z,p,q)}{\sup_{p=q} L(Z,p,q)} = \frac{L(\hat Z)}{L_0} \qquad (4)$$

The ratio λ is used as the test statistic, and its distribution is obtained through Monte Carlo simulation [2].
2.4 Generalized Linear Model. The generalized linear model (GLM) is a flexible generalization of ordinary least squares regression. In a GLM, a link function relates the response variable to the covariates by allowing the linear model to be related to the response variable [4].
Generalized linear models were formulated by John Nelder and Robert Wedderburn as
a way of unifying various other statistical models, including linear regression, logistic
regression and Poisson regression [5].
In a GLM, each outcome of the dependent variable, Y, is assumed to be generated from a particular distribution in the exponential family, a large class of probability distributions that includes the normal, binomial, Poisson, and multinomial distributions, among others. The mean, μ, of the distribution depends on the independent variables, X, through

E(Y) = μ = g⁻¹(Xβ)    (5)

where E(Y) is the expected value of Y, Xβ is the linear predictor (a linear combination of unknown parameters β), and g is the link function [4].
The modeling in this paper uses an ordinal-scale response variable with 3 levels and some categorical covariates. One of the covariates is the hotspot indicator, i.e. whether a sub-district is a hotspot or not. The data comprise 3 districts, and every district has 200 sub-districts with a number of covariates. Sub-districts within a district are assumed to be more correlated than sub-districts from different districts; therefore, the parameter estimation method should be able to handle this condition.
Accordingly, model building in this study is based on the spatial concept that the closer the observations, the larger the correlation [10]; this idea is extended here to correlated clustered data. The following section describes the Generalized Estimating Equations (GEE) as a parameter estimation method for correlated clustered data.
2.5 Threshold Model. The threshold is a latent-variable concept that distinguishes linear models with ordinal responses from linear models with non-ordinal responses. The threshold model is explained as follows. In logistic and probit regression models, an unobserved latent variable (y) is assumed to be associated with the actual responses through the concept of thresholds (Hedeker 1994). For a dichotomous model, a single threshold value is assumed, while for an ordinal model with K categories (polytomy), K − 1 threshold values 𝛾₁, 𝛾₂, ⋯, 𝛾_{K−1} are assumed, with 𝛾₀ = −∞ and 𝛾_K = ∞. The response occurs in category k (Y = k) if the latent response y is greater than the threshold 𝛾_{k−1} and does not exceed 𝛾_k.
2.6 Generalized Estimating Equations (GEE) for Ordinal Response. GEE is a method of
parameter estimation for correlated or clustered data [6]. It is a common choice for marginal
modeling of ordinal response if one is interested in the regression parameters rather than
the variance-covariance structure of the longitudinal data. The covariance structure in GEE is regarded as a nuisance. In this regard, the estimators of the regression coefficients and their standard errors based on GEE are consistent even under misspecification of the covariance structure of the data [7].
Generalized linear models were first introduced by Nelder and Wedderburn (1972) and
later expanded by McCullagh and Nelder (1989). The following discussion is based on their
works and an extension of GEE from Liang & Zeger (1986) for ordinal categorical responses
data.
Suppose we have a multinomial response, say z, with K ordered categories and corresponding probabilities π₁, π₂, …, π_K, that is, Pr(z = k) = π_k. The proportional odds model is based on the cumulative probabilities φ_k = π₁ + π₂ + ⋯ + π_k, for k = 1 to K − 1. A logit link function is used to relate φ_k to a linear function of the p covariates X. Now consider the repeated-measures situation. Suppose we have a sample of I subjects, and let z_ij be the ordinal response (with K levels) for the ith subject (i = 1 to I) at point j (j = 1 to n_i); assume n_i = J for all i for simplicity. Form the (K − 1) × 1 vector
𝒚𝑖𝑗 = 𝑦𝑖𝑗 1 , 𝑦𝑖𝑗 2 , ⋯ , 𝑦𝑖𝑗 ,𝐾−1 ′ where yijk = 1 if zij = k, and 0 otherwise. Let’s denote the
expectation of 𝒚𝑖𝑗 as 𝝅𝑖𝑗 = E(𝒀𝑖𝑗 ) = 𝜋𝑖𝑗 1 , 𝜋𝑖𝑗 2 , ⋯ , 𝜋𝑖𝑗 ,𝐾−1 ′ with 𝜋𝑖𝑗𝑘 = Pr(yijk = 1). And
let xij denote a p×1 vector of covariates for subject i at sub subject j.
The objective of this part is to model the 𝜋𝑖𝑗𝑘 as a function of xij and the regression
parameters 𝜽 = 𝛾1 , 𝛾2 , ⋯ , 𝛾𝐾−1 , 𝜷 ′ where 𝛾𝑘 are intercept or cut-point parameters
and 𝜷 is a p×1 vector of regression parameters. Let 𝜑𝑖𝑗𝑘 = 𝜋𝑖𝑗 1 + 𝜋𝑖𝑗 2 + ⋯ + 𝜋𝑖𝑗𝑘
denote the cumulative probabilities. Then the proportional odds model at sub-subject j is

logit(𝜑_ijk) = 𝛾_k + 𝒙′_ij 𝜷

that is,

log[𝜑_ijk / (1 − 𝜑_ijk)] = log[P(z_ij ≤ k) / (1 − P(z_ij ≤ k))] = 𝛾_k + 𝒙′_ij 𝜷

where z_ij ∈ {1, 2, …, K} is transformed to y_ijk ∈ {0, 1} with 𝜋_ijk = Pr(y_ijk = 1); 𝜷 is the vector of fixed effects on the transformed cumulative probabilities; 𝒙_ij is the vector of covariates of district i and sub-district j; and 𝛾_k is a threshold.
To establish notation, let 𝒚_i = (𝒚_i1, ⋯, 𝒚_{i n_i})′ and 𝝅_i = (𝝅_i1, ⋯, 𝝅_{i n_i})′. Then θ can be estimated by solving the estimating equation

$$\psi(\boldsymbol\theta) = \sum_{i=1}^{I}\left(\frac{\partial \boldsymbol\pi_i}{\partial \boldsymbol\theta}\right)'\, \mathbf{V}_i^{-1}\,(\mathbf{y}_i - \boldsymbol\pi_i) = \mathbf{0}_{(K-1+p)\times 1}$$

where ∂𝝅_i/∂𝜽 is the n_i(K − 1) × (K − 1 + p) matrix of partial derivatives

$$\frac{\partial \boldsymbol\pi_i}{\partial \boldsymbol\theta} =
\begin{pmatrix}
\dfrac{\partial \pi_{i1,1}}{\partial \gamma_1} & \cdots & \dfrac{\partial \pi_{i1,1}}{\partial \gamma_{K-1}} & \dfrac{\partial \pi_{i1,1}}{\partial \beta_1} & \cdots & \dfrac{\partial \pi_{i1,1}}{\partial \beta_p} \\
\vdots & & \vdots & \vdots & & \vdots \\
\dfrac{\partial \pi_{i1,K-1}}{\partial \gamma_1} & \cdots & \dfrac{\partial \pi_{i1,K-1}}{\partial \gamma_{K-1}} & \dfrac{\partial \pi_{i1,K-1}}{\partial \beta_1} & \cdots & \dfrac{\partial \pi_{i1,K-1}}{\partial \beta_p} \\
\vdots & & \vdots & \vdots & & \vdots \\
\dfrac{\partial \pi_{i n_i,K-1}}{\partial \gamma_1} & \cdots & \dfrac{\partial \pi_{i n_i,K-1}}{\partial \gamma_{K-1}} & \dfrac{\partial \pi_{i n_i,K-1}}{\partial \beta_1} & \cdots & \dfrac{\partial \pi_{i n_i,K-1}}{\partial \beta_p}
\end{pmatrix}$$

and 𝑽_i^{−1} is a generalized inverse of 𝐕_i. Here

$$\mathbf{V}_i = \phi\, \mathbf{A}_i^{1/2}\, \mathbf{R}_i(\alpha)\, \mathbf{A}_i^{1/2} \qquad (6)$$
Moment methods to estimate correlation parameters α for specific correlation structures have
been proposed in [12].
$$\mathbf{R}_{s(i)}(\alpha) = \frac{1}{\phi}
\begin{pmatrix}
\mathbf{A}_{s(i)1}^{-1/2}\mathbf{V}_{s(i)1}\mathbf{A}_{s(i)1}^{-1/2} & \rho_{s(i)12} & \cdots & \rho_{s(i)1 n_{si}} \\
\rho_{s(i)21} & \mathbf{A}_{s(i)2}^{-1/2}\mathbf{V}_{s(i)2}\mathbf{A}_{s(i)2}^{-1/2} & \cdots & \rho_{s(i)2 n_{si}} \\
\vdots & \vdots & \ddots & \vdots \\
\rho_{s(i)n_{si},1} & \rho_{s(i)n_{si},2} & \cdots & \mathbf{A}_{s(i)n_{si}}^{-1/2}\mathbf{V}_{s(i)n_{si}}\mathbf{A}_{s(i)n_{si}}^{-1/2}
\end{pmatrix}$$
For simplicity, the index j = 1, …, n_si is written without parentheses, but the meaning is the same as before [5][6][11].
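As a hedged sketch of how such a model might be fitted in R, the proportional odds (cumulative logit) part can be estimated with MASS::polr; the data frame podes and the variable names level, district, jktan and hotspot are hypothetical placeholders. polr assumes independent observations, so a GEE fit (for example via geepack::ordgee) would be needed to respect the within-district clustering discussed above.

library(MASS)

# Proportional odds model: ordinal poverty level on district, farmer-family group, hotspot
fit <- polr(factor(level, ordered = TRUE) ~ factor(district) + factor(jktan) + hotspot,
            data = podes, method = "logistic")
summary(fit)                          # cut-points gamma_k and coefficients beta

# Fitted category probabilities for each village
probs <- predict(fit, type = "probs")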
The main purpose of this paper is to exploit the generalization of the linear model to deal with clustered ordinal measurements within the same district (kabupaten) as the subject. The GEE approach is used to estimate the model parameters; comparisons with the maximum likelihood method have been carried out in a simpler case and demonstrate the efficiency of the method and some practical advantages [8]. The data concern poverty and infant health in three districts of East Java. Thirty-eight districts in East Java were ranked based on 5 indicators of infant health, i.e. number of infant deaths, number of low-birth-weight births, number of deliveries in the absence of health personnel, number of people in poverty, and number of education shortfalls. The ranked districts were divided into three groups, not poor, moderate, and poor, containing 8, 7, and 23 districts, respectively. From each infant-health group one district was taken for the modeling analysis: Blitar was chosen from the not-poor group, Kediri from the moderate group, and Jember from the poor group. Sub-districts (villages) from these three districts were ranked based on poverty indicators, and the ordinal response was formed by grouping these ranks into three groups. Two hundred sub-districts were taken from each district to be analyzed in the GLM modeling. The covariates for the modeling are scarcity, percentage of farmer families, number of farmer families, number of Indonesian migrant workers (TKI), number of telephone cables, number of schools, number of health personnel, number of families using electricity (the values of all covariates are divided into three groups based on the number of cases), and the poverty hotspot status. Hotspots of poverty in the three districts were detected by the spatial scan statistic method of Kulldorff (1997). The hotspot status used as a village-level covariate is the significant hotspot (most likely and secondary hotspots); its value is one if a village is a hotspot and zero otherwise.
Table 1 in the Appendix gives the results of the generalized linear modeling with the ordinal response, some covariates and hotspot status. The ordinal response variable is the poverty level of the sub-districts, 1 = not poor, 2 = moderate, and 3 = poor, with 115, 297, and 188 sub-districts in each group, respectively. As mentioned before, this ordinal response is a grouping of the ranking results based on the number of people in poverty, represented by the number of poor statement letters and the number of health insurance cards for people in poverty.
Based on the data in this research, Table 1 shows that some covariates are not significant, while the number of farmer families, number of Indonesian migrant workers (TKI), number of telephone cables, number of schools, and hotspot status are related to poverty.
Table 2 shows the covariates that are significant for the level of poverty. District, number of farmer families and hotspot status are significant for the poverty level of a village. This model has log-likelihood = −456.0614. Interpretations of some model parameters in Table 2:
1. The odds that a village in Blitar is at a better level are exp(1.9344) = 6.92 times those of a village in Jember, other covariates remaining the same.
2. The odds for a village with jktan = 1 (fewer farmer families) are exp(2.1449) = 8.54 times those of a village with jktan = 3 (more farmer families), other covariates remaining the same.
The cumulative predicted probabilities from the logistic model for each case are

Pr(z_ij ≤ k) = 1 / (1 + exp(−(𝛾_k + 𝒙′_ij 𝜷 + 𝛼_ij 𝒅)))

The events in an ordinal logistic model are not individual scores but cumulative scores. First, we calculate the predicted probabilities for a village that is a hotspot (hotspot = 1) in Jember (kab = 3) with the number of farmer families at the highest level category (jktan = 3, the reference level, so its estimated coefficient is 0):

P(score ≤ 1) = 1/(1 + e^{−(−5.2631 + (−1.3014)(1))}) = 1/(1 + 709.457) = 0.00141
P(score ≤ 2) = 1/(1 + e^{−(−1.8998 + (−1.3014)(1))}) = 1/(1 + exp(3.2012)) = 0.039
P(score ≤ 3) = 1
P(score = 3) = 1 − P(score ≤ 2) = 1 − 0.039 = 0.961
P(score = 2) = P(score ≤ 2) − P(score ≤ 1) = 0.039 − 0.00141 = 0.03759
A village with that condition has probability 0.961 of being at the poor level (z_ij = 3).
Table 3 shows some predicted probability values. Phat is the cumulative probability that a village is at the given level or lower. For example, for the village in observation 2 the probability of being at poverty level 2 or lower (i.e. moderate or not poor) is 0.85537. A similar interpretation holds for the other observations.
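The same arithmetic can be reproduced in a few lines of R using the quoted estimates (cut-points −5.2631 and −1.8998, hotspot coefficient −1.3014); this only restates the computation above.

gamma <- c(-5.2631, -1.8998)            # cut-points gamma_1, gamma_2
eta   <- -1.3014                        # linear predictor contribution for this village
cum   <- c(plogis(gamma + eta), 1)      # P(score <= 1), P(score <= 2), P(score <= 3)
prob  <- diff(c(0, cum))                # P(score = 1), P(score = 2), P(score = 3)
round(cbind(cumulative = cum, category = prob), 5)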
4. DISCUSSION
The three methods, ORDIT ranking, hotspot detection, and the generalized linear model, offer many advantages for analyzing the data, especially when the researcher needs to see the relationship between an ordering, hotspots, and some covariates. The results of this research, whose topic is infant health and poverty, could be used by the government as input for decisions or policies on district or sub-district improvement. In conclusion, among the three districts (kabupaten) Blitar, Kediri, and Jember, Jember is the worst in infant health. Based on the GLM, there is a strong relationship between district, number of farmer families, and hotspot status and the poverty level of a village; all of these covariates have p-values < 0.0001. These results can help the Health Department make decisions on improving infant health in districts and sub-districts by giving attention to farmer families. Hopefully, better conditions for farmer families will improve women's education and health, leading to healthier infants and children.
These methods can also be used for other data, such as biodiversity, medical, social, and economic data, among many others.
References
[1] Myers, W.; Patil, G. P. Preliminary Prioritization Based on Partial Order Theory and R Software for
Compositional Complexes in Landscape Ecology, with Applications to Restoration, Remediation, and
Enhancement, Manuscript of Environmental Ecological Statistics. 2009.
[2] Kulldorff M. A Spatial Scan Statistic. Communications in Statistics: Theory and Methods, 26:1481-1496. 1997.
[3] Kulldorff, Martin, SaTScanTM User Guide for version 9.0. 2010.
[4] McCullagh P, Regression Models for Ordinal Data. Journal of the Royal Statistical Society, Series B (Methodological), Vol. 42, No. 2, pp. 109-142, 1980. http://www.jstor.org/pss/2984952 [accessed 25 April 2011].
[5] McCullagh, Peter; Nelder, John. Generalized Linear Models, Second Edition. Boca Raton: Chapman and
Hall/CRC. ISBN 0-412-31760-5. 1989.
[6] Liang KY, Zeger SL, Longitudinal data analysis using generalized linear models, Biometrika, 73, 1, pp. 13-22. 1986.
[7] Yang B, Lilly E. Analyzing Ordinal Repeated Measures Data Using SAS, Paper SP08, Indianapolis, Indiana.
2008.
[8] Clayton, David. Repeated Ordinal Measurement: a Generalized Estimating Equation Approach, MRC
Biostatistics Unit, Cambridge. 1992.
[9] Gortmaker, S.L., Poverty and Infant Mortality in the United States, American Sociological Review, Vol. 44, No. 2
(Apr., 1979), pp. 280-297
[10] Cressie, NAC. Statistics for Spatial Data, Revised Edition. New York: John Wiley & Sons. 1993.
[11] Nelder JA, Wedderburn RWM. Generalized Linear Models. Journal of the Royal Statistical Society Series A,
135, 370–384. 1972.
[12] Lipsitz SR, Kim K, Zhao L, 1994. Analysis of repeated categorical data using generalized estimating equations.
Statist. Med., 13: 1149–1163. doi: 10.1002/sim.4780131106
YEKTI WIDYANINGSIH
Department of Mathematics
Faculty of Mathematics and Natural Sciences, University of Indonesia
e-mail: [email protected]
ASEP SAEFUDDIN
Department of Statistics
Faculty of Mathematics and Natural Sciences, Bogor Agricultural University
e-mail: [email protected]
Appendix
Abstract. This paper studies the behavior of financial time series in three indices of Bursa Malaysia
Index Series namely the FTSE Bursa Malaysia Composite Index (FBM KLCI), the Finance Index and the
Industrial Index from July 1990 until July 2010. We observe that these three i ndices are characterized by
the presence of the stylized facts such as lack of normality, exhibit skewness and excess kurtosis. This
paper provides discussion on how mixture distributions accommodate with non-normality and asymmetry
characteristics of financial time series data. We also present the most commonly used Maximum
Likelihood Estimation (MLE) via the EM algorithm to fit the mixture Normal distributi on and study the
two-component Normal mixtures using data sets on logarithmic stock returns of Bursa Malaysia indices.
Keywords and Phrases : Bursa Malaysia stock market indices; behavior of financial time
series; mixture distributions.
1. INTRODUCTION
It is a stylized fact that the marginal distributions of stock returns are poorly
described by the Normal distribution. It has been established that return distributions have
thick tails, are skewed and leptokurtic relative to the Normal distribution (having more
values near the mean and in the extreme tails; dramatic falls and spectacular jumps appear
with higher frequency than predicted) (Franses and van Dijk [1] and Cont [2]). Mixtures of
Normal distributions have been associated with empirical finance. There exists a long
history of modeling asset returns with a mixture of Normal (see Press [3], Praetz [4], Clark
[5], Blattberg and Gonedes [6] and Kon [7]). One attractive property of the mixtures of
Normal is that it is flexible to accommodate various shapes of continuous distributions, and
able to capture leptokurtic, skewed and multimodal characteristics of financial time series
data. The EM algorithm is a popular tool for simplifying maximum likelihood problems in
the context of a mixture model. The EM algorithm has become the method of choice for
_______________________________
2010 Mathematics Subject Classification: 62-07; 62P05; 65Y04
estimating the parameters of a mixture model, since its formulation leads to straightforward
estimators (Picard [8]). The key property of the EM algorithm has been established by
Dempster et al. [9].
The paper is structured as follows. We start by presenting the case study and their
properties in Section 2. We describe the data and test the normality assumption for the
monthly stock returns. In Section 3, we introduce the statistical distribution to be fitted to the
data. Also we provide discussion on how mixture distributions accommodate the non-
normality and asymmetry characteristics of financial time series. In Section 4, we fit the
specification to the data and present the most commonly used Maximum Likelihood
Estimation (MLE) via the EM algorithm to fit the two-component mixture of Normal
distribution. Lastly, Section 5 concludes. We summarize the main findings of our study.
2. CASE STUDY
2.1 Data. The data sets used in this paper are monthly closing prices covering a twenty-year period from July 1990 to July 2010 for three Malaysian stock market indices, namely the Composite Index, Finance Index and Industrial Index, obtained from DataStream. All three indices are denominated in the local currency, the Malaysian Ringgit (MYR). In total, we have 241 observations per index. The behavior of the three indices during the sample period is shown below. Figure 1 depicts the time series of monthly stock market indices of Bursa Malaysia; it can be seen that prices rise and fall over time.
Figure 1. Monthly closing prices of the Composite Index, Finance Index and Industry Index of Bursa Malaysia, July 1990 - July 2010.
2.2 Return Series. Prior to analysis, all the series are transformed into returns, i.e. the first difference of natural logarithms multiplied by 100, so that quantities are expressed in percentage terms. Let P_it be the observed monthly closing price of market index i at time t, i = 1, …, n and t = 1, …, T. The monthly rate of return is defined as the percentage rate

y_it = 100 log(P_it / P_{i,t−1})    (1)
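In R, for instance, the return series of one index could be obtained from a price vector P (a hypothetical name) with a single line:

y <- 100 * diff(log(P))   # monthly log returns in percent, equation (1)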
2.3 Empirical Properties of Return. Figure 2 depicts the monthly returns of Bursa Malaysia
stock market indices. There are periods of quiet and periods of wild variation in the monthly
returns. The period analyzed can be characterized as a period of market instability as it
reflects the upturn and downturn of Malaysia stock market.
Figure 2. Time series of monthly returns for the three Malaysia stock market indices
The three stock market indices are summarized in Table 1, which reports some descriptive statistics and tests for the monthly stock returns. First, the means of the series are, in general, not significantly different from zero (H0: µ = 0). Second, there is some evidence of negative skewness (β1, defined as the 3rd standardized moment) in the monthly Industry Index, while the return distributions of the Composite Index and Finance Index are positively skewed. Third, stock returns in financial markets are found to have excess kurtosis, i.e. kurtosis significantly greater than 3 (the value for a normal distribution). Kurtosis β2, defined as the ratio of the 4th central moment to the square of the variance, increases with excessive mass in the tails or at the centre of the distribution. Table 1 shows that all three distributions are leptokurtic, thus exhibiting fat tails and high peaks. The Jarque-Bera test rejects the null hypothesis of normality for each of the three stock market indices.
Table 1. Summary Statistics

                     Bursa Malaysia stock market indices
 Statistics       Composite Index   Financial Index   Industrial Index
 Mean                   0.3237            0.6021            0.3233
 Median                 0.6119            0.8037            0.6474
 Maximum               28.2488           40.3887           22.5572
 Minimum              -24.8089          -32.3621          -23.1723
 Std. Dev.              6.8727            9.3636            5.9313
 Skewness               0.0749            0.3479           -0.3980
 Kurtosis               5.2621            6.5048            5.4303
 Jarque-Bera           51.3942          127.6802           65.4024
 P-value                0.0000            0.0000            0.0000
Figure 3 depicts the histograms of Malaysia's stock return rates and the corresponding normal curves with the same mean and standard deviation. The departures from normality can be seen in the histograms displayed in Figure 3, where the normal distribution generated by the sample mean and standard deviation of each index is shown together with the observed histogram. It can easily be seen that the empirical distributions are more peaked and have heavier tails than the normal distribution; the return distributions with thicker tails have a thinner and higher peak in the center compared to the normal distribution.
3. MIXTURE OF DISTRIBUTION
Most financial market returns are both skewed and leptokurtic. Based on the above analysis, the monthly log returns are far from normally distributed; hence, a number of alternative skewed and leptokurtic distributions have been applied. The mixture of normal distributions is by far the most extensively applied, and its simplest case, a mixture of two univariate normal distributions, may be considered the most widely used. A flexible and tractable alternative for modeling departures from normality is a mixture of two normal distributions; a mixture of two lognormal distributions fits financial data better than a single normal distribution. Fama [10] claims that a mixture of several normal distributions with the same mean but different variances is the most popular approach to describing the long-tailed distribution of price changes.
Recent studies of stock returns tend to use mixtures of Normal distributions. Under the assumption of a Normal distribution, the log return is normally distributed with mean μ and variance σ², i.e. r_t ~ N(μ, σ²). Advantages of Normal mixtures include that they maintain the tractability of the normal, have finite higher-order moments, and can capture excess kurtosis (Tsay [11]). Besides, the mixture of Normals has other advantages: it can capture structural change not only in the variance but also in the mean, and it can be asymmetric (Knight and Satchell [12]). It is also believed that Normal mixtures are appropriate for accommodating certain discontinuities in stock returns such as the 'weekend effect', the 'turn-of-the-month effect' and the 'January effect' (Klar and Meintanis [15]). Furthermore, mixtures of normal models are easy to interpret if the asset returns are viewed as generated from different information distributions, where the mixture proportion can accommodate cyclical parameter shifts or switches among a finite number of regimes (Xu and Wirjanto [13]). Another appealing feature of the mixture of normal models for modelling asset returns is its flexibility to approximate various shapes of continuous distributions by adjusting its component weights, means and variances (Tan and Chu [14]).
The general form of the CDF of a Normal mixture can be represented as

$$F(x) = \sum_{i=1}^{K} \lambda_i\, \Phi\!\left(\frac{x-\mu_i}{\sigma_i}\right) \qquad (1)$$

where Φ is the cumulative distribution function of N(0, 1). The probability density function of a mixture of Normals is therefore given by

$$f(x) = \sum_{i=1}^{K} \lambda_i\, \phi(x;\mu_i,\sigma_i) \qquad (2)$$

where

$$\phi(x;\mu_i,\sigma_i) = \frac{1}{\sqrt{2\pi}\,\sigma_i}\, e^{-\frac{(x-\mu_i)^2}{2\sigma_i^2}}, \qquad \sum_{i=1}^{K}\lambda_i = 1 \ \text{ and } \ 0 \le \lambda_i \le 1$$

for i = 1, 2, …, K. Thus, in a Normal mixture, the return distribution is approximated by a mixture of Normals, each of which has its own mean μ_i, standard deviation σ_i and weight (or probability, or mixing parameter) λ_i (Subramanian and Rao [16]).
A mixture of two Normal distributions is given by

$$f(x_t;\lambda,\mu_1,\mu_2,\sigma_1^2,\sigma_2^2) = \lambda\, N(x_t;\mu_1,\sigma_1^2) + (1-\lambda)\, N(x_t;\mu_2,\sigma_2^2) \qquad (3)$$

where N(x; μ_i, σ_i²) is the pdf of a Normal distribution with mean μ_i and variance σ_i². This mixture implies that stock returns are drawn from a normal distribution with mean μ₁ and standard deviation σ₁ with probability λ, and from a normal distribution with mean μ₂ and standard deviation σ₂ with probability 1 − λ.
If X is a mixture of K Normals with pdf (2), then its mean, variance, skewness and kurtosis are

$$\mu = \sum_{i=1}^{K} \lambda_i \mu_i, \qquad
\sigma^2 = \sum_{i=1}^{K} \lambda_i\left(\sigma_i^2 + \mu_i^2\right) - \mu^2,$$

$$\text{skewness} = \frac{1}{\sigma^3} \sum_{i=1}^{K} \lambda_i\left[3\sigma_i^2(\mu_i-\mu) + (\mu_i-\mu)^3\right], \qquad
\text{kurtosis} = \frac{1}{\sigma^4} \sum_{i=1}^{K} \lambda_i\left[3\sigma_i^4 + 6(\mu_i-\mu)^2\sigma_i^2 + (\mu_i-\mu)^4\right].$$

By using mixture distributions, we can therefore obtain densities with greater peakedness and heavier tails than the Normal distribution.
In this section, we describe a simple mixture model for density estimation and the associated EM algorithm for carrying out maximum likelihood estimation.
4.1 The EM Algorithm for the Two-Component Mixture Model. Fitting mixture distributions can be handled by a wide variety of techniques, such as graphical methods, the method of moments, maximum likelihood and Bayesian approaches (see Titterington et al. [17] for an exhaustive review of these methods). Considerable advances have now been made in the fitting of mixture models, especially via the maximum likelihood method, which has attracted much attention, mainly due to the existence of an associated statistical theory.
Remark 1. (Estimation of the mixture Normal pdf). With two mixture components, the log-likelihood is

$$\log L = \sum_{t=1}^{T} \ln f(x_t;\lambda,\mu_1,\mu_2,\sigma_1^2,\sigma_2^2),$$

where f is the pdf in (3). A numerical optimization method could be used to maximize this likelihood function; however, this is tricky, so an alternative approach is often used (Söderlind [18]).
We used a procedure called the EM algorithm, given in Algorithm 1 for the special case
of Normal mixtures (Hastie et al. [19]). In the Expectation (E) step, we do a soft assignment of
each observation to each model: the current estimates of the parameters are used to assign
responsibilities according to the relative density of the training points under each model. Next,
in the Maximization (M) step, these responsibilities are used in weighted maximum-likelihood
fits to update the estimates of the parameters (see Hastie et al. [19]).
Algorithm 1. EM algorithm for the two-component Normal mixture (Hastie et al. [19] and Söderlind [18])
Step 1: Take initial guesses for the parameters μ̂₁, μ̂₂, σ̂₁², σ̂₂² and λ̂. According to Hastie et al. [19], a good way to construct initial guesses for μ̂₁ and μ̂₂ is to pick two of the observations at random; the mixing proportion λ̂ can be started at the value 0.5, and both σ̂₁² and σ̂₂² can be set equal to the overall sample variance
$$\hat\sigma_1^2 = \hat\sigma_2^2 = \frac{1}{T}\sum_{t=1}^{T}(x_t-\bar x)^2.$$

Step 2 (Expectation): compute the responsibilities of the first component,

$$\hat\gamma_t = \frac{\hat\lambda\,\phi(x_t;\hat\mu_1,\hat\sigma_1^2)}{\hat\lambda\,\phi(x_t;\hat\mu_1,\hat\sigma_1^2) + (1-\hat\lambda)\,\phi(x_t;\hat\mu_2,\hat\sigma_2^2)}, \qquad t = 1,\dots,T.$$

Step 3 (Maximization): compute the weighted means, variances and mixing probability,

$$\hat\mu_1 = \frac{\sum_{t=1}^{T}\hat\gamma_t\,x_t}{\sum_{t=1}^{T}\hat\gamma_t}, \qquad
\hat\sigma_1^2 = \frac{\sum_{t=1}^{T}\hat\gamma_t\,(x_t-\hat\mu_1)^2}{\sum_{t=1}^{T}\hat\gamma_t},$$

$$\hat\mu_2 = \frac{\sum_{t=1}^{T}(1-\hat\gamma_t)\,x_t}{\sum_{t=1}^{T}(1-\hat\gamma_t)}, \qquad
\hat\sigma_2^2 = \frac{\sum_{t=1}^{T}(1-\hat\gamma_t)(x_t-\hat\mu_2)^2}{\sum_{t=1}^{T}(1-\hat\gamma_t)}, \qquad
\hat\lambda = \frac{\sum_{t=1}^{T}\hat\gamma_t}{T}.$$
Step 4: Iterate over steps 2 and 3 until the parameter values converge.
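A compact R sketch of Algorithm 1, written directly from the description above (it is not the authors' code), is given below; the initialization follows Step 1, and the convergence check uses the change in log-likelihood.

# Two-component Normal mixture fitted by EM; x is a vector of monthly log returns
em2norm <- function(x, tol = 1e-8, maxit = 1000) {
  # Step 1: initial guesses (two observations at random, overall variance, lambda = 0.5)
  mu <- sample(x, 2); s2 <- rep(mean((x - mean(x))^2), 2); lam <- 0.5
  ll.old <- -Inf
  for (it in 1:maxit) {
    # Step 2 (E): responsibilities of the first component
    d1 <- dnorm(x, mu[1], sqrt(s2[1])); d2 <- dnorm(x, mu[2], sqrt(s2[2]))
    g  <- lam * d1 / (lam * d1 + (1 - lam) * d2)
    # Step 3 (M): weighted means, variances and mixing proportion
    mu[1] <- sum(g * x) / sum(g);             mu[2] <- sum((1 - g) * x) / sum(1 - g)
    s2[1] <- sum(g * (x - mu[1])^2) / sum(g)
    s2[2] <- sum((1 - g) * (x - mu[2])^2) / sum(1 - g)
    lam   <- mean(g)
    # Step 4: iterate until the log-likelihood converges
    ll <- sum(log(lam * dnorm(x, mu[1], sqrt(s2[1])) + (1 - lam) * dnorm(x, mu[2], sqrt(s2[2]))))
    if (abs(ll - ll.old) < tol) break
    ll.old <- ll
  }
  list(weights = c(lam, 1 - lam), means = mu, sds = sqrt(s2), loglik = ll)
}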
For the initial values, we decided to have the following values. Table 2 depicts the initial
values for the EM algorithm.
After running the EM algorithm, our final maximum likelihood estimates of the unknown parameters are as described in Table 3, which summarizes the two-component Normal mixtures fitted by the EM algorithm: for each index there are two components, with two weights, two means, two standard deviations and the overall log-likelihood. Table 3 reports the maximum likelihood estimates obtained by fitting the theoretical distribution described previously to the series of monthly stock returns of the three indices under consideration.
This table provides the estimates of the five parameters of a two component Normal
mixture as well as the Normal mixture model for the three stock market indices of Bursa
Malaysia. For Composite Index, 27.07% of the returns follow the first Normal distribution and
72.92% follow the second Normal distribution. 38.68% of the returns of Finance Index follow
the first Normal distribution and 61.32% follow the second Normal distribution. Meanwhile
for Industry Index, 19.64% of the returns follow the first Normal distribution and 80.36%
follow the second Normal distribution.
Several important observations may be drawn from Table 3. First, the components of the stock return series can clearly be distinguished with respect to their variance. For the monthly data, the first component is always associated with the higher variance except in the monthly Finance Index, and the high-variance component has the smaller probability for these data, again except for the Finance Index.
In order to judge whether the estimated models are compatible with the stylized facts of the data, we compute the implied skewness γ₁ and the implied kurtosis γ₂ of the models from
$$\gamma_1 = \frac{\displaystyle\sum_{i=1}^{I} p_i\left(3\sigma_i^2\delta_i + \delta_i^3\right)}{\left[\displaystyle\sum_{i=1}^{I} p_i\left(\sigma_i^2 + \delta_i^2\right)\right]^{3/2}}$$

$$\gamma_2 = \frac{\displaystyle\sum_{i=1}^{I} p_i\left(3\sigma_i^4 + 6\delta_i^2\sigma_i^2 + \delta_i^4\right)}{\left[\displaystyle\sum_{i=1}^{I} p_i\left(\sigma_i^2 + \delta_i^2\right)\right]^{2}}$$

with δ_i = μ_i − μ, where μ is the overall mean.
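For reference, a small R helper that evaluates these implied moments from estimated weights p, means mu and standard deviations sigma (for example, the estimates reported in Table 3) could look as follows; it is an illustrative sketch only.

implied.moments <- function(p, mu, sigma) {
  m     <- sum(p * mu)                    # overall mean
  delta <- mu - m
  v     <- sum(p * (sigma^2 + delta^2))   # overall variance
  skew  <- sum(p * (3 * sigma^2 * delta + delta^3)) / v^1.5   # implied gamma_1
  kurt  <- sum(p * (3 * sigma^4 + 6 * delta^2 * sigma^2 + delta^4)) / v^2   # implied gamma_2
  c(skewness = skew, kurtosis = kurt)
}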
The results, reported in Table 4, show a rather close agreement between the pattern of skewness and kurtosis in the data and the implied skewness and kurtosis; in particular, there is quite close agreement between implied and actual leptokurtosis for the monthly data.
Figure 4 depicts the fitted mixture Normal distributions for the three indices. It shows that the mixture Normal distribution can accommodate leptokurtosis as well as skewness in the data, as the fitted distribution has thicker tails and a higher peak. From Table 3 and Figure 4, two indices (Composite Index and Industry Index) indicate that the first Normal is a low-mean, high-variance regime and the second Normal is a high-mean, low-variance regime. However, the Finance Index indicates that the first Normal is a low-mean (−1.9839), low-variance (79.9652) regime and the second Normal is a high-mean (2.2331), high-variance (85.0682) regime. Meanwhile, the weights indicate that the second regime is the more prevalent regime for all three stock market indices of Bursa Malaysia.
We also plot the histogram of the data and a non-parametric density estimate (Figure 5a). In Figure 5b we add the density of each component to the plot, scaled by its share in the mixture, so that it is visually comparable to the overall density.
Figure 5a (left). Histograms (grey) of the monthly stock returns of Bursa Malaysia. The dashed line is a kernel density estimate, which is not completely satisfactory. Figure 5b (right). As in the previous figure, plus the components of a mixture of two Normals fitted to the data by the EM algorithm; these are scaled by the mixing weights of the components.
5. CONCLUSION
In this paper we study the behavior of financial time series for three stock market indices of Bursa Malaysia (Composite Index, Finance Index and Industrial Index) over 20 years and characterize the presence of the stylized facts using data sets of logarithmic stock returns. We started by describing the data and testing the hypothesis of normality, and found that these three indices exhibit asymmetry and non-normality. Not surprisingly, the distributions of the monthly stock returns analyzed show fat tails and high peaks, as well as skewness in different directions; these results are fully consistent with those found for many other markets and reported in many other studies. Previous studies indicate that financial data may be successfully modeled by mixture distributions, and we also found that a mixture distribution can accommodate leptokurtosis as well as skewness in the data. Finally, we fit the two-component mixture Normal distribution to the data sets using the EM algorithm.
References
[1] FRANSES, P. H. AND VAN DIJK, D., Non-Linear Time Series Models in Empirical Finance, Cambridge University Press, Cambridge, 2000.
[2] CONT, R., Empirical properties of asset returns: stylized facts and statistical issues, Quantitative Finance, 1,
223-236, 2001.
[3] PRESS, S. J., A compound events model for security prices, J. Bus., 40(3), 317-335, 1967.
[4] PRAETZ, P. D., The distribution of share price changes, J. Bus., 45(1), 49-55, 1972.
[5] CLARK, P. K., A subordinated stochastic process model with finite variance for speculative prices,
Econometrica, 41(1), 135-155, 1973.
[6] BLATTBERG, R. C. AND GONEDES N. J., A comparison of the stable and student distributions as statistical
models for stock prices, J. Bus., 47(2), 244-280, 1974.
[7] KON, S. J., Models of stock returns – a comparison, J. Finance, 39(1), 147-165, 1984.
[8] PICARD, F., An introduction to mixture models, Statistics for Systems Biology Group, Research Report No 7,
2007.
[9] DEMPSTER, A. P., LAIRD, N. M. AND RUBIN D. B., Maximum likelihood from incomplete data via the EM
algorithm, J. Royal Statistical Society Series B, 39, 1-38, 1977.
[10] FAMA, E. F., The behavior of stock-market prices, J. of Business, 38(1), 34-105, 1965.
[11] TSAY, R. S., Analysis of Financial Time Series, Wiley Series in Probability and Statistics, 2005.
[12] KNIGHT, J., AND SATCHELL, S., Return Distributions in Finance, Quantitative Finance Series, 2001.
[13] XU, D., AND WIRJANTO, T., An empirical characteristic function approach to VaR under a mixture-of-
normal distribution with time-varying volatility. J. of Derivatives, 18(1), 39-58, 2010.
[14] TAN, K., AND CHU, M., Estimation of portfolio return and value at risk using a class of Gaussian mixture
distributions. The International Journal of Business and Finance Research, 6(1), 97-107, 2012.
[15] KLAR, B., AND MEINTANIS S. G., Test for normal mixtures based on the empirical characteristics function,
J. Computational Statistics and Data Analysis, 49, 227-242, 2005.
[16] SUBRAMANIAN, S., AND RAO U. S., Sensex and stylized facts an empirical investigation, Social Science
Research Network, id. 962828, 2007.
[17] TITTERINGTON, D. M., SMITH, A. F. M. AND MAKOV U. E., Statistical Analysis of Finite Mixture Distributions, John Wiley & Sons, 2001.
[18] SöDERLIND, P., Lecture Notes in Empirical Finance (PhD): Return Distribution, University of St. Gallen,
2010.
[19] HASTIE, T., TIBSHIRANI, R. AND FRIEDMAN J., The Elements of Statistical Learning: Data Mining, Inference and Prediction, Springer Verlag, 2001.
ZAIDI ISA
Universiti Kebangsaan Malaysia.
e-mail: [email protected]
Abstract. The immune system plays an important role in defending the body against
tumours and other threats. Currently, mechanisms involved in immune system interac-
tions with tumour cells are not fully understood. Here we develop a mathematical tool
that can be used to help address this shortfall in understanding. This paper describes
a hybrid cellular automata model of the interaction between a growing tumour and cells
of the innate and specific immune system, including the effects of chemokines, which
builds on previous models of tumour-immune system interactions. In particular, the
model is focused on the response of immune cells to tumour cells and how the dynamics
of the tumour cells change due to the immune system of the host. We present results and
predictions of in silico experiments including simulations of Kaplan-Meier survival-like
curves.
Keywords and Phrases: hybrid cellular automata, tumour, chemokine, immune, dendritic
cell, cytotoxic T lymphocyte.
1. INTRODUCTION
Cancer is one of the leading causes of death worldwide, with 7.9 million people
dying as a result of cancer in 2007 alone. This is projected to rise to 12 million by 2030
(see http://www.who.int/cancer/en, [13]). A similar report (see http://www.aihw.
gov.au/, [1]) states that in Australia in 2007, cancer was the second most common cause
of death and that 108,368 new cases of cancer were diagnosed. For those diagnosed
with cancer between 1998 and 2004, the 5-year relative survival for all cancers combined
was 61%. Clearly, cancer is a major concern for public health officials around the world
and a greater understanding of cancer has potential to save many lives.
There is strong evidence in the literature for the hypothesis that tumour growth
is directly influenced by the cellular immune system of the human host. For example,
Sandel et al. [12] discuss the influence of dendritic cells in controlling prostate cancer.
Furthermore, tumour infiltrating dendritic cells (DCs) are a key factor at the interface
between the innate and adaptive immune responses in malignant diseases. While the
interactions of a tumour and the host immune system have been modelled previously
by, for example, Mallet and de Pillis [9] and de Pillis et al. [4], here we present the
first multidimensional, hybrid cellular automata model of the process that incorporates
important signalling molecules.
Hart [7] states that dendritic cells (DCs), found in many types of tumours, are
the dominant antigen presenting cells for initiating and maintaining the host immune
response. They are critical in activating, stimulating and recruiting T lymphocytes:
cells with the ability to lyse tumour cells. DCs have numerous states of activation,
maturation and differentiation. Natural killer (NK) cells and cytotoxic T lymphocyte
(CTL) cells also play important roles in the response of the immune system against the
tumour as described in Kindt et al. [8].
The dynamics of tumour growth and the interactions of growing tumours with
the host immune system have been studied using mathematical models over the past
four decades. Most of these models are presented using ordinary differential equations
(ODEs) or partial differential equations (PDEs) that impose restrictions on the modelled
system’s time-scales, as described in Ribba et al. [11]. However, a cellular automata
(CA) model can describe more complex mechanisms in the biological system without
such restrictions by detailing phenomena at the individual cell or particle level. The
classic definition of a CA model holds that it involves only local rules that depend
on the configuration of the spatial neighbourhood of each CA element. Hybrid cellular
automata (HCA), on the other hand, extend the CA to incorporate non-local effects,
often via coupling the CA with PDEs.
The purpose of the model developed in the present research is to investigate the
growth of a small solid tumour, when the growth is affected by the immune system. In
this preliminary study, we present a hybrid cellular automata model of the interaction
between a growing tumour and cells of the innate and specific immune system that also
includes generic signalling molecules known as chemokines. Chemokines are a family
of small cytokines, or proteins secreted by many different cell types, including tumour
cells. They can affect cell-cell interactions and play a fundamental role in recruiting
or attracting cells of the immune system to sites of infection or tumour growth.
To include the effect of a chemokine in this model, we recognise the significantly
smaller size of such molecules compared with biological cells and introduce a partial
differential equation to describe the concentration of chemokine secreted by the tumour.
We combine the analytic solution of the partial differential equation model with a
number of biologically motivated automata rules to form the HCA model. We use
the hybrid cellular automata model to simulate the growth of a tumour in a number
of computational ‘cancer patients’. Each computational patient is distinguished from
others by altering model parameters. We define ‘death’ of a patient as the situation
where the cells of the tumour reach the boundary of our model domain; effectively this
represents tumour metastasis.
In the sections to follow, we present the development of the HCA model before
analysing numerical simulations. We close with a discussion of the results.
2. MATHEMATICAL MODEL
We investigate the growth of a solid tumour and its interaction with the host
immune system and a tumour-secreted chemokine. The model is comprised of a partial
differential equation to describe the chemokine secreted by the tumour, coupled with a
discrete, stochastic cellular automaton describing individual cells. We employ a square-
shaped computational domain of length L, which is partitioned into a regular square
grid. Each square element in the grid represents a location that may contain a healthy
cell, tumour cell or immune cell.
We consider a number of biological cell types including normal healthy cells,
tumour cells (necrotic, dividing and migrating), DCs (mature and immature), NK cells
and CTL cells. To build the CA model, we define ‘rules’ that draw upon the biological
literature to describe cell-cell interactions, cell effects on the environment, and effects
of the environment on cells.
Initially, non-cancerous healthy cells cover the whole of the model domain; the tumour
mass is then allowed to grow from one cancer cell placed at the centre cell of the
grid. Cells of the host immune system are spread randomly over the domain amongst
the other healthy cells. Three separate immune cell populations are considered here –
the NK cells of the innate immune system and cells of the specific immune response,
represented by the CTL cells and DCs.
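A minimal sketch of this initial configuration, assuming an n × n integer grid with one code per cell type; the constant names and the 1% immune-cell seeding density are illustrative assumptions rather than values from the paper.

import numpy as np

# Integer codes for the cell types tracked by the automaton (illustrative).
HEALTHY, TUMOUR, NECROTIC, NK, CTL, IDC, MDC = range(7)

def initialise_grid(n=100, immune_fraction=0.01, rng=None):
    # Healthy cells everywhere, one tumour cell at the centre, and immune cells
    # (NK, CTL, immature DC) scattered at random amongst the healthy cells.
    rng = np.random.default_rng() if rng is None else rng
    grid = np.full((n, n), HEALTHY, dtype=int)
    grid[n // 2, n // 2] = TUMOUR
    for cell_type in (NK, CTL, IDC):
        mask = (grid == HEALTHY) & (rng.random((n, n)) < immune_fraction)
        grid[mask] = cell_type
    return grid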
The model solution is advanced in discrete time steps, at each of which every spatial
location is investigated to determine its contents and whether or not actions will occur.
This is summarised in Algorithm 1.
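Algorithm 1 is not reproduced in this excerpt; purely as an illustration, a time-stepping loop of this kind might be organised as in the sketch below, where apply_rules is a hypothetical placeholder for the cell-level rules of Section 2.1.

import numpy as np

def step(grid, chemokine, rng, apply_rules):
    # One discrete time step: every lattice site is visited and the CA rules
    # (division, lysis, movement, maturation, replenishment) are applied to it.
    n = grid.shape[0]
    for i in range(n):
        for j in range(n):
            apply_rules(grid, chemokine, i, j, rng)
    return grid

def simulate(grid, chemokine, n_steps, apply_rules, rng=None):
    # Advance the model through n_steps discrete time steps.
    rng = np.random.default_rng() if rng is None else rng
    for _ in range(n_steps):
        grid = step(grid, chemokine, rng, apply_rules)
    return grid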
2.1. Cellular Automata Rules. Each particular cell-level action is associated with a
probability of success, Pevent, that is compared with a pseudo-random number, r, drawn
from the uniform distribution on the interval [0, 1] to determine whether or not it is
carried out. To describe the evolution of the cell population, we introduce the cellular
automata rules presented below.
2.1.1. Host cells. As described in the work of Ferreira et al. [5] and of Mallet and de
Pillis [9], we assume that the healthy host cells are effectively passive bystanders in the
interaction. They do not hinder the growth of the tumour cells or the movement of any
cell type.
where θdiv controls the shape of the curve allowing it to capture qualitative understand-
ing of the biology and Tsum is the number of tumour cells in a one cell radius of the cell
of interest. From Figure 1(a), it can be seen that tumour cell division is more likely
when there is space in the neighbourhood for the resulting daughter cell.
The probability of tumour lysis depends on the strength of the immune system
in the neighbourhood of the tumour cell (see Figure 1(b)), and is given by
P^{\mathrm{tmr}}_{\mathrm{lysis}} = 1 - \exp\left( -\left( \theta_{\mathrm{lysis}} I_{\mathrm{sum}} \right)^{2} \right),
where again θlysis controls the shape of the curve allowing it to capture qualitative
understanding of the biology and Isum is the number of immune cells in a one cell
radius of the cell of interest.
Figure 1. (a) Tumour cell division probability P^tmr_div as a function of θdiv Tsum; (b) tumour cell lysis probability P^tmr_lysis as a function of θlysis Isum.
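As a concrete illustration of the lysis rule, the sketch below counts immune cells in the one-cell (Moore) neighbourhood of a site, evaluates the probability above and compares it with a pseudo-random number as described in Section 2.1. The cell codes, helper names and the value of θlysis are illustrative assumptions; the division rule is omitted because its formula is not reproduced in this excerpt.

import numpy as np

NK, CTL, MDC = 3, 4, 6   # immune cell codes, matching the grid sketch above (illustrative)
IMMUNE_TYPES = (NK, CTL, MDC)

def neighbourhood_count(grid, i, j, types):
    # Number of cells of the given types within a one-cell (Moore) radius of (i, j).
    n = grid.shape[0]
    count = 0
    for di in (-1, 0, 1):
        for dj in (-1, 0, 1):
            if di == 0 and dj == 0:
                continue
            ii, jj = i + di, j + dj
            if 0 <= ii < n and 0 <= jj < n and grid[ii, jj] in types:
                count += 1
    return count

def lysis_probability(i_sum, theta_lysis=1.0):
    # P^tmr_lysis = 1 - exp(-(theta_lysis * I_sum)^2); theta_lysis = 1.0 is illustrative.
    return 1.0 - np.exp(-(theta_lysis * i_sum) ** 2)

def tumour_is_lysed(grid, i, j, rng, theta_lysis=1.0):
    # The lysis event is carried out if a pseudo-random r ~ U[0, 1] falls below P^tmr_lysis.
    i_sum = neighbourhood_count(grid, i, j, IMMUNE_TYPES)
    return rng.random() < lysis_probability(i_sum, theta_lysis)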
2.1.3. Immune System. At each time step, the neighbourhood of each immune cell is
surveyed to determine whether tumour cells are present. If tumour cells are present,
the immune system will kill the tumour cells in the manner described above. If there
are no tumour cells in the neighbourhood of the CTL cells, then the CTL cells move
towards areas of higher chemokine concentration.
To control the normal background level of CTL cells, at each time step there is a
chance that healthy cells are replaced (from outside the computational domain) by new
immune cells. This is carried out by imposing a probability of healthy cell replacement
with a CTL, given by
P^{\mathrm{CTL}}_{\mathrm{rep}} = \mathrm{CTL}_{0} - \frac{1}{n^{2}} \sum_{\mathrm{domain}} \mathrm{CTL}_{i,j},    (1)
where CTL0 is the ‘normal’ density of CTL cells and n^2 is the total number of CA
elements.
NK cells and dendritic cells follow rules similar to those for CTL cells, except that NK
cells and mature dendritic cells can lyse a tumour cell only once. When an immature
dendritic cell comes into contact with a tumour cell, it becomes a mature dendritic cell
that has the ability to kill tumour cells.
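A sketch of this replenishment step, following equation (1); the cell codes and the background density CTL0 = 0.05 are illustrative assumptions, and the probability is clamped at zero if the current CTL density exceeds CTL0.

import numpy as np

HEALTHY, CTL = 0, 4   # cell codes matching the grid sketch above (illustrative)

def ctl_replacement_probability(grid, ctl0=0.05):
    # Equation (1): P^CTL_rep = CTL_0 - (1/n^2) * sum of CTL cells over the domain,
    # where n^2 is the total number of CA elements.
    current_density = np.count_nonzero(grid == CTL) / grid.size
    return max(ctl0 - current_density, 0.0)

def replenish_ctls(grid, rng, ctl0=0.05):
    # Each healthy cell is replaced by a new CTL (from outside the domain)
    # with probability P^CTL_rep at every time step.
    p_rep = ctl_replacement_probability(grid, ctl0)
    mask = (grid == HEALTHY) & (rng.random(grid.shape) < p_rep)
    grid = grid.copy()
    grid[mask] = CTL
    return grid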
Figure 2. Spatial distribution of the growing tumour after (a) 50 cell cycles and (b) 100 cell cycles.
Figure 3. Cell populations over 100 cell cycles: (a) tumour (TC) and necrotic (NEC) cells; (b) immature dendritic cells (IDC), mature dendritic cells (MDC), NK cells and CTLs.
3. RESULTS
We combine the solution of the PDE with the CA as described in Section 2 to
simulate the evolution of the growing tumour. A two-dimensional regular 100 × 100
square domain is used with 100 time steps, and a Moore neighbourhood is considered
for the cellular automata rules. In this simulation, the estimated value of the diffusion
coefficient for the chemokine is D = 10^{-4} µm^2 s^{-1}. The distribution of the growing
tumour after 50 and 100 cell cycles is shown in Figure 2, with results qualitatively
matching those of Mallet and de Pillis [9].
Figure 3(a) shows the evolution of the tumour cell and necrotic cell densities over
100 cell cycles. This plot shows the characteristic exponential and linear growth phases
of solid, avascular tumours (see for example, Folkman and Hochberg [6]), as well as a
slower growing population of necrotic cells. In Figure 3(b) we see that initially the
number of mature dendritic cells is zero until immature dendritic cells come into contact
with tumour cells, at which point the newly matured dendritic cells commence killing the
tumour cells. After around 80 cycles, all immature dendritic cells have matured and the
number of mature dendritic cells stabilises. As expected, due to the nature of equation
(1), the populations of NK cells and CTL cells remain approximately steady over the
extent of the tumour growth.
We also use the hybrid cellular automata model to investigate the growth of a
tumour in a number of computational ‘cancer patients’. Each computational patient is
distinguished from others by altering model parameters. We define ‘death’ of a patient
as occurring when the tumour is able to metastasise. Effectively, this is when the cells
of the tumour reach the boundary of our model domain. We present the results of these
simulations using a simulated Kaplan-Meier survival curve, shown in Figure 4. The
figure shows that metastasis sets in for the first patients after 80 cycles. Metastasis
of the simulated tumours occurred in approximately 60% of simulated patients after
250 cycles, after which time most surviving patients exhibited dormant tumours
controlled by the immune system.
Figure 4. Simulated Kaplan-Meier survival curve: percentage of patients surviving as a function of cell cycles.
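Purely as an illustration, the curve in Figure 4 could be assembled from per-patient metastasis times as in the following sketch; the input format (one metastasis time per computational patient, taken as infinite if the tumour never reaches the domain boundary) and the 400-cycle horizon are assumptions made for this sketch.

import numpy as np

def survival_curve(metastasis_times, horizon=400):
    # Percentage of simulated patients surviving at each cell cycle, where a
    # patient 'dies' when the tumour first reaches the boundary of the domain.
    times = np.asarray(metastasis_times, dtype=float)
    cycles = np.arange(horizon + 1)
    surviving = np.array([(times > t).mean() for t in cycles]) * 100.0
    return cycles, surviving

# Usage: cycles, surviving = survival_curve(times_for_each_patient)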
4. CONCLUSION
Duchting and Vogelsaenger [3] pioneered the use of discrete cellular automata for
modelling cancer, investigating the effects of radiotherapy. Ferreira et al. [5] modelled
avascular cancer growth with a CA model based on the fundamental biological processes
of proliferation, motility, and death, including competition for diffusing nutrients among
normal and cancer cells. Mallet and de Pillis [9] constructed a hybrid cellular automata
cancer model that built on the work of Ferreira et al. to include NK cells as the innate
immune system and CTL cells as the specific
immune system. The Mallet and de Pillis model lacked sufficient detail of the immune
system and in this present research, we attempt to improve on their work by explicitly
describing more of the host immune system. While direct comparison of the models
is difficult, the results as described in Figure 3(a) qualitatively reflect the findings of
Mallet and de Pillis and of Ferreira et al.
While models based on differential equations allow for analytical investigations
such as stability and parameter sensitivity analyses, and ease of fitting the model to
experimental data, these types of models cannot capture the detailed cellular and sub-
cellular level complexity of the biological system. On the other hand, HCA models can
describe greater complexity of the biological process such as the interaction between
every single cell. In ongoing work complementary to the research presented in this paper,
we have included greater realism in the modelling of tumour-secreted chemokines by
allowing secretion due to cell-cell interaction. Currently, chemokines and their receptors
in the tumour microenvironment are being extensively investigated to produce thera-
peutic interventions to combat cancer (see, for example, Allavena et al. [2] and Murooka
et al. [10]). Our models currently under development will allow for simulation-based
and theoretical investigations of such interventions.
We have developed a useful model that can be employed as a preliminary inves-
tigative tool for experimentalists who conduct expensive in vitro and in vivo experiments
to test and refine hypotheses prior to entering the lab. With further cross-disciplinary
collaboration, this type of model can be refined to provide a more accurate descrip-
tion of the underlying cancer biology and hence yield more relevant predictions and
tests of hypotheses. Future developments based upon this model will be related to the
specific context of colorectal cancer, and the effect of chemokines on the cell-cell inter-
actions will be deeply investigated. More complex partial differential equations related
to chemokines secretion resulting from cell-cell interactions will be introduced in future
work.
References
[1] Australian Institute of Health & Welfare, Cancer, available at http://www.aihw.gov.au/
cancer/, accessed April 16, 2011.
[2] Allavena, P., Marchesi, F. and Mantovani, A., The role of chemokines and their receptors
in tumor progression and invasion: Potential new target of biological therapy, Current Cancer
Therapy Reviews 1, 81-92, 2005.
[3] Duchting, W. and Vogelsaenger, T., Analysis, forecasting and control of three-dimensional
tumor growth and treatment. J. Med. Syst. 8, 461-475, 1984.
[4] de Pillis, L.G., Mallet, D.G. and Radunskaya, A.E., Spatial Tumor-Immune Modeling. Com-
putational and Mathematical Methods in Medicine 7:2-3, 159-176, 2006.
[5] Ferreira Jr. S.C., Martins, M.L., Vilela, M.J., Reaction diffusion model for the growth of
avascular tumor. Phys. Rev. E 65, 021907, 2002.
[6] Folkman, J. and Hochberg, M., Self regulation of growth in three dimensions. J. Exp. Med.
138, 745-753, 1973.
[7] Hart, D.N., Dendritic cells: Unique leukocyte populations which control the primary immune
response. Blood 90, 3245-3287, 1997.
[8] Kindt, T.J., Goldsby, R.A., and Osborne, B.A., Immunology, W.H. Freeman and Company,
New York, 2007.
[9] Mallet, D.G. and de Pillis, L.G., A cellular automata model of tumour-immune system inter-
actions, J. Theoretical Biology 239, 334-350, 2006.
[10] Murooka, T.T., Ward, S.E., and Fish, E.N., Chemokines and Cancer, in: Cytokines and Cancer,
Springer, New York, 2005.
[11] Ribba, B., Alarkon, T., Marron, K., Maini, P.K. and Agur, Z., The use of hybrid cellular
automata models for improving cancer therapy. ACRI 2004, LNCS 3305, 444-453, 2004.
[12] Sandel, M.H. et al., Prognostic Value of Tumor-Infiltrating Dendritic Cells in Colorectal Cancer:
Role of Maturation Status and Intra-tumoral Localization. Clinical Cancer Research 11:7, 2576-
2582, 2005.
[13] World Health Organization, Programmes and Projects: Cancer, available at http://www.who.
int/cancer/en/, accessed April 16, 2011.
Trisilowati
Mathematical Sciences Discipline, Queensland University of Technology, Brisbane, Australia.
Scott W. McCue
Mathematical Sciences Discipline, Queensland University of Technology, Brisbane, Australia.
e-mail: [email protected]
Dann Mallet
Mathematical Sciences Discipline and Institute of Health and Biomedical Innovation, Queens-
land University of Technology, Brisbane, Australia.
e-mail: [email protected]