
Proceedings of the 6th SEAMS-GMU International Conference on Mathematics and Its Applications

The SEAMS-GMU 2011 International Conference on Mathematics and Its Applications
Yogyakarta, Indonesia, 12th - 15th July 2011

ISBN 978-979-17979-3-1

MATHEMATICS AND ITS APPLICATIONS IN THE DEVELOPMENT OF SCIENCES AND TECHNOLOGY

Department of Mathematics
Faculty of Mathematics & Natural Sciences
Universitas Gadjah Mada
Sekip Utara, Yogyakarta - INDONESIA 55281
Phone: +62-274-552243; 7104933
Fax: +62-274-555131
PROCEEDINGS OF THE 6TH
SOUTHEAST ASIAN MATHEMATICAL SOCIETY
GADJAH MADA UNIVERSITY
INTERNATIONAL CONFERENCE ON MATHEMATICS
AND ITS APPLICATIONS 2011

Yogyakarta, Indonesia, 12th – 15th July 2011

DEPARTMENT OF MATHEMATICS
FACULTY OF MATHEMATICS AND NATURAL SCIENCES
UNIVERSITAS GADJAH MADA
YOGYAKARTA, INDONESIA
2012
Published by

Department of Mathematics
Faculty of Mathematics and Natural Sciences
Universitas Gadjah Mada
Sekip Utara, Yogyakarta, Indonesia
Phone +62 (274) 7104933, 552243
Fax. +62 (274) 555131

PROCEEDINGS OF
THE 6TH SOUTHEAST ASIAN MATHEMATICAL SOCIETY-GADJAH MADA UNIVERSITY
INTERNATIONAL CONFERENCE ON MATHEMATICS AND ITS APPLICATIONS 2011
Copyright © 2012 by Department of Mathematics, Faculty of Mathematics and
Natural Sciences, Universitas Gadjah Mada, Yogyakarta, Indonesia

ISBN 978-979-17979-3-1
PROCEEDINGS OF THE 6TH
SOUTHEAST ASIAN MATHEMATICAL SOCIETY-GADJAH MADA
UNIVERSITY
INTERNATIONAL CONFERENCE ON MATHEMATICS AND ITS
APPLICATIONS 2011

Chief Editor:
Sri Wahyuni

Managing Editors:
Indah Emilia Wijayanti, Dedi Rosadi

Managing Team:
Ch. Rini Indrati, Irwan Endrayanto A.
Herni Utami, Dewi Kartika Sari
Nur Khusnussa’adah, Indarsih
Noorma Yulia Megawati, Rianti Siswi Utami
Hadrian Andradi

Supporting Team:
Parjilan, Warjinah
Siti Aisyah, Emiliana Sunaryani Yuniastuti
Susiana, Karyati
Tutik Kristiastuti, Sudarmanto
Tri Wiyanto, Wira Kurniawan
Sukir Widodo, Sumardi
EDITORIAL BOARDS

Algebra, Graph and Combinatorics


Budi Surodjo
Ari Suparwanto

Analysis
Supama
Atok Zulijanto

Applied Mathematics
Fajar Adi Kusumo
Salmah

Computer Science
Edi Winarko
MHD. Reza M.I. Pulungan

Statistics and Finance


Subanar
Abdurakhman
LIST OF REVIEWERS
Abdurakhman, Universitas Gadjah Mada, Indonesia
Achmad Muchlis, Institut Teknologi Bandung, Indonesia
Adhitya Ronnie Effendi, Universitas Gadjah Mada, Indonesia
Agus Buwono, Institut Pertanian Bogor, Indonesia
Agus Maman Abadi, Universitas Negeri Yogyakarta, Indonesia
Agus Yodi Gunawan, Institut Teknologi Bandung, Indonesia
Ari Suparwanto, Universitas Gadjah Mada, Indonesia
Asep K. Supriatna, Universitas Padjadjaran, Indonesia
Atok Zulijanto, Universitas Gadjah Mada, Indonesia
Azhari SN, Universitas Gadjah Mada, Indonesia
Budi Nurani, Universitas Padjadjaran, Indonesia
Budi Santosa, Institut Teknologi Sepuluh Nopember, Indonesia
Budi Surodjo, Universitas Gadjah Mada, Indonesia
Cecilia Esti Nugraheni, Universitas Katolik Parahyangan, Indonesia
Ch. Rini Indrati, Universitas Gadjah Mada, Indonesia
Chan Basarrudin, Universitas Indonesia
Danardono, Universitas Gadjah Mada, Indonesia
Dedi Rosadi, Universitas Gadjah Mada, Indonesia
Deni Saepudin, Institut Teknologi Telkom, Indonesia
Diah Chaerani, Universitas Padjadjaran, Indonesia
Edy Soewono, Institut Teknologi Bandung, Indonesia
Edy Tri Baskoro, Institut Teknologi Bandung, Indonesia
Edi Winarko, Universitas Gadjah Mada, Indonesia
Endar H Nugrahani, Institut Pertanian Bogor, Indonesia
Endra Joelianto, Institut Teknologi Bandung, Indonesia
Eridani, Universitas Airlangga, Indonesia
Fajar Adi Kusumo, Universitas Gadjah Mada, Indonesia
Frans Susilo, Universitas Sanata Dharma, Indonesia
Gunardi, Universitas Gadjah Mada, Indonesia
Hani Garminia, Institut Teknologi Bandung, Indonesia
Hartono, Universitas Negeri Yogyakarta, Indonesia
Hengki Tasman, Universitas Indonesia
I Wayan Mangku, Institut Pertanian Bogor, Indonesia
Indah Emilia Wijayanti, Universitas Gadjah Mada, Indonesia
Insap Santosa, Universitas Gadjah Mada, Indonesia
Intan Muchtadi-Alamsyah, Institut Teknologi Bandung, Indonesia
Irawati, Institut Teknologi Bandung, Indonesia
Irwan Endrayanto A., Universitas Gadjah Mada, Indonesia
Jailani, Universitas Negeri Yogyakarta, Indonesia
Janson Naiborhu, Institut Teknologi Bandung, Indonesia
Joko Lianto Buliali, Institut Teknologi Sepuluh Nopember, Indonesia
Khreshna Imaduddin Ahmad S., Institut Teknologi Bandung, Indonesia
Kiki Ariyanti Sugeng, Universitas Indonesia
Lina Aryati, Universitas Gadjah Mada, Indonesia
M. Farchani Rosyid, Universitas Gadjah Mada, Indonesia
Mardiyana, Universitas Negeri Surakarta, Indonesia
MHD. Reza M. I. Pulungan, Universitas Gadjah Mada, Indonesia
Miswanto, Universitas Airlangga, Indonesia
Netty Hernawati, Universitas Lampung, Indonesia
Noor Akhmad Setiawan, Universitas Gadjah Mada, Indonesia
Nuning Nuraini, Institut Teknologi Bandung, Indonesia
Rieske Hadianti, Institut Teknologi Bandung, Indonesia
Roberd Saragih, Institut Teknologi Bandung, Indonesia
Salmah, Universitas Gadjah Mada, Indonesia
Siti Fatimah, Universitas Pendidikan Indonesia
Soeparna Darmawijaya, Universitas Gadjah Mada, Indonesia
Sri Haryatmi, Universitas Gadjah Mada, Indonesia
Sri Wahyuni, Universitas Gadjah Mada, Indonesia
Subanar, Universitas Gadjah Mada, Indonesia
Supama, Universitas Gadjah Mada, Indonesia
Suryanto, Universitas Negeri Yogyakarta, Indonesia
Suyono, Universitas Negeri Jakarta, Indonesia
Tony Bahtiar, Institut Pertanian Bogor, Indonesia
Wayan Somayasa, Universitas Haluoleo, Indonesia
Widodo Priyodiprojo, Universitas Gadjah Mada, Indonesia
Wono Setyo Budhi, Institut Teknologi Bandung, Indonesia
Yudi Soeharyadi, Institut Teknologi Bandung, Indonesia
PREFACE

It is an honor and great pleasure for the Department of Mathematics –


Universitas Gadjah Mada, Yogyakarta – INDONESIA, to be entrusted by the
Southeast Asian Mathematical Society (SEAMS) to organize an international
conference every four years. Appreciation goes to those who have developed and
established this tradition of the successful series of conferences. The SEAMS -
Gadjah Mada University (SEAMS-GMU) 2011 International Conference on
Mathematics and Its Applications took place in the Faculty of Mathematics and
Natural Sciences of Universitas Gadjah Mada on July 12th – 15th, 2011. The conference was the follow-up of the successful series of events which had been held in 1989, 1995, 1999, 2003 and 2007.
The conference has achieved its main purposes of promoting the exchange of ideas and the presentation of recent developments, particularly in the areas of pure, applied, and computational mathematics which are represented in Southeast Asian countries. The conference has also provided a forum for researchers, developers, and practitioners to exchange ideas and to discuss future directions of research.
Moreover, it has enhanced collaboration between researchers from countries in the
region and those from outside.
More than 250 participants from all over the world attended the conference. They came from the USA, Austria, The Netherlands, Australia, Russia, South Africa, Taiwan, Iran, Singapore, The Philippines, Thailand, Malaysia, India, Pakistan, Mongolia, Saudi Arabia, Nigeria, Mexico and Indonesia. During the four-day conference, there were 16 plenary lectures and 217 contributed short communication papers. The plenary lectures were delivered by Halina France-Jackson (South Africa), Jawad Y. Abuihlail (Saudi Arabia), Andreas Rauber (Austria), Svetlana Borovkova (The Netherlands), Murk J. Bottema (Australia), Ang Keng Cheng (Singapore), Peter Filzmoser (Austria), Sergey Kryzhevich (Russia), Intan Muchtadi-Alamsyah (Indonesia), Reza Pulungan (Indonesia), Salmah (Indonesia), Yudi Soeharyadi (Indonesia), Subanar (Indonesia), Supama (Indonesia), Asep K. Supriatna (Indonesia) and Indah Emilia Wijayanti (Indonesia). Most of the contributed papers
were delivered by mathematicians from Asia.
We would like to sincerely thank all plenary and invited speakers who
warmly accepted our invitation to come to the Conference and the paper
contributors for their overwhelming response to our call for short presentations.
Moreover, we are very grateful for the financial assistance and support that we
received from Universitas Gadjah Mada, the Faculty of Mathematics and Natural
Sciences, the Department of Mathematics, the Southeast Asian Mathematical
Society, and UNESCO.
We would also like to extend our appreciation and deepest gratitude to all invited speakers, all participants, and referees for the wonderful cooperation, the great coordination, and the tremendous efforts. Appreciation and special thanks are addressed to our colleagues and staff who helped in the editing process. Finally, we acknowledge and express our thanks to all friends, colleagues, and staff of the Department of Mathematics UGM for their help and support in the preparation and running of the conference.

The Editors
October, 2012
CONTENTS

Title i
Publisher and Copyright ii
Managerial Boards iii
Editorial Boards iv
List of Reviewers v
Preface vii
Paper of Invited Speakers

On Things You Can’t Find : Retrievability Measures and What to do with Them ……............... 1
Andreas Rauber and Shariq Bashir

A Quasi-Stochastic Diffusion-Reaction Dynamic Model for Tumour Growth ..……................... 9


Ang Keng Cheng

*-Rings in Radical Theory ...………………………………………………..................................................... 19


H. France-Jackson

Clean Rings and Clean Modules ...………………………………………................................................... 29


Indah Emilia Wijayanti

Research on Nakayama Algebras ……...…………………………………................................................. 41


Intan Muchtadi-Alamsyah

Mathematics in Medical Image Analysis: A Focus on Mammography ...…............................... 51


Murk J. Bottema, Mariusz Bajger, Kenny MA, Simon Williams

The Order of Phase-Type Distributions ..………………………………….............................................. 65


Reza Pulungan

The Linear Quadratic Optimal Regulator Problem of Dynamic Game for Descriptor System… 79
Salmah

Chaotic Dynamics and Bifurcations in Impact Systems ………………........................................... 89


Sergey Kryzhevich

Contribution of Fuzzy Systems for Time Series Analysis ……………….......................................... 121


Subanar and Agus Maman Abadi
Contributed Papers

Algebra

Degenerations for Finite Dimensional Representations of Quivers ……................................... 137


Darmajid and Intan Muchtadi-Alamsyah

On Sets Related to Clones of Quasilinear Operations …...……………………………………………………. 145


Denecke, K. and Susanti, Y.

Normalized H Coprime Factorization for Infinite-Dimensional Systems …………………………… 159


Fatmawati, Roberd Saragih, Yudi Soeharyadi

Construction of a Complete Heyting Algebra for Any Lattice ………………………………………………. 169


Harina O.L. Monim, Indah Emilia Wijayanti, Sri Wahyuni

The Fuzzy Regularity of Bilinear Form Semigroups …………………………………………………..…………. 175


Karyati, Sri Wahyuni, Budi Surodjo, Setiadji

The Cuntz-Krieger Uniqueness Theorem of Leavitt Path Algebras ………………………………………. 183


Khurul Wardati, Indah Emilia Wijayanti, Sri Wahyuni

Application of Fuzzy Number Max-Plus Algebra to Closed Serial Queuing Network with
Fuzzy Activity Time ………………………………………………………………………………………………..…………… 193
M. Andy Rudhito, Sri Wahyuni, Ari Suparwanto, F. Susilo

Enumerating of Star-Magic Coverings and Critical Sets on Complete Bipartite Graphs………… 205
M. Roswitha, E. T. Baskoro, H. Assiyatun, T. S. Martini, N. A. Sudibyo

Construction of Rate s/2s Convolutional Codes with Large Free Distance via Linear System
Approach ………………………………………………………………………………………………........…………………….. 213
Ricky Aditya and Ari Suparwanto

Characteristics of IBN, Rank Condition, and Stably Finite Rings ………....................................... 223
Samsul Arifin and Indah Emilia Wijayanti

The Eccentric Digraph of Pn Pm Graph ………………………………………………………………….….……… 233


Sri Kuntarti and Tri Atmojo Kusmayadi

On M -Linearly Independent Modules ……………………………………………………………………….......... 241


Suprapto, Sri Wahyuni, Indah Emilia Wijayanti, Irawati

The Existence of Moore Penrose Inverse in Rings with Involution …........................................ 249
Titi Udjiani SRRM, Sri Wahyuni, Budi Surodjo
Analysis

An Application of Zero Index to Sequences of Baire-1 Functions ………….................................. 259


Atok Zulijanto

Regulated Functions in the n-Dimensional Space ……………………....................….....................… 267


Ch. Rini Indrati

Compactness Space Which is Induced by Symmetric Gauge ……..........................………………... 275


Dewi Kartika Sari and Ch. Rini Indrati

A Continuous Linear Representation of a Topological Quotient Group …................................ 281


Diah Junia Eksi Palupi, Soeparna Darmawijaya, Setiadji, Ch. Rini Indrati

L
On Necessary and Sufficient Conditions for into 1 Superposition Operator ……...………... 289
Elvina Herawaty, Supama, Indah Emilia Wijayanti

A DRBEM for Steady Infiltration from Periodic Flat Channels with Root Water Uptake ……….. 297
Imam Solekhudin and Keng-Cheng Ang

Boundedness of the Bimaximal Operator and Bifractional Integral Operators in Generalized


Morrey Spaces …………………………………...................................................................................... 309
Wono Setya Budhi and Janny Lindiarni

Applied Mathematics

A Lepskij-Type Stopping-Rule for Simplified Iteratively Regularized Gauss-Newton Method.. 317


Agah D. Garnadi

Asymptotically Autonomous Subsystems Applied to the Analysis of a Two-Predator One-


Prey Population Model ............................................................................................................. 323
Alexis Erich S. Almocera, Lorna S. Almocera, Polly W.Sy

Sequence Analysis of DNA H1N1 Virus Using Super Pair Wise Alignment .............................. 331
Alfi Yusrotis Zakiyyah, M. Isa Irawan, Maya Shovitri

Optimization Problem in Inverted Pendulum System with Oblique Track ……………………………. 339
Bambang Edisusanto, Toni Bakhtiar, Ali Kusnanto

Existence of Traveling Wave Solutions for Time-Delayed Lattice Reaction-Diffusion Systems 347
Cheng-Hsiung Hsu, Jian-Jhong Lin, Ting-Hui Yang
Effect of Rainfall and Global Radiation on Oil Palm Yield in Two Contrasted Regions of
Sumatera, Riau and Lampung, Using Transfer Function .......................................................... 365
Divo D. Silalahi, J.P. Caliman, Yong Yit Yuan

Continuously Translated Framelet............................................................................................ 379


Dylmoon Hidayat

Multilane Kinetic Model of Vehicular Traffic System ............................................................. 386


Endar H. Nugrahani

Analysis of a Higher Dimensional Singularly Perturbed Conservative System: the Basic


Properties ….............................................................................................................................. 395
Fajar Adi Kusumo

A Mathematical Model of Periodic Maintenance Policy based on the Number of Failures for
Two-Dimensional Warranted Product ..…................................................................................. 403
Hennie Husniah, Udjianna S. Pasaribu, A.H. Halim

The Existence of Periodic Solution on STN Neuron Model in Basal Ganglia …………………………. 413
I Made Eka Dwipayana

Optimum Locations of Multi-Providers Joint Base Station by Using Set-Covering Integer


Programming: Modeling & Simulation ….................................................................................. 419
I Wayan Suletra, Widodo, Subanar

Expected Value Approach for Solving Multi-Objective Linear Programming with Fuzzy
Random Parameters ….............................................................................................................. 427
Indarsih, Widodo, Ch. Rini Indrati

Chaotic S-Box with Piecewise Linear Chaotic Map (PLCM) ...................................................... 435
Jenny Irna Eva Sari and Bety Hayat Susanti

Model of Predator-Prey with Infected Prey in Toxic Environment .......................................... 449


Lina Aryati and Zenith Purisha

On the Mechanical Systems with Nonholonomic Constraints: The Motion of a Snakeboard


on a Spherical Arena ………………………………………..……………………................................................ 459
Muharani Asnal and Muhammad Farchani Rosyid

Safety Analysis of Timed Automata Hybrid Systems with SOS for Complex Eigenvalues …….. 471
Noorma Yulia Megawati, Salmah, Indah Emilia Wijayanti

Global Asymptotic Stability of Virus Dynamics Models and the Effects of CTL and Antibody
Responses ………………………………………………………………………………….……………………………………….. 481
Nughtoth Arfawi Kurdhi and Lina Aryati
A Simple Diffusion Model of Plasma Leakage in Dengue Infection …………………………..………… 499
Nuning Nuraini, Dinnar Rachmi Pasya, Edy Soewono

The Sequences Comparison of DNA H5N1 Virus on Human and Avian Host Using Tree
Diagram Method …………………………………..……………………………………………………………..…………….. 505
Siti Fauziyah, M. Isa Irawan, Maya Shovitri

Fuzzy Controller Design on Model of Motion System of the Satellite Based on Linear Matrix
Inequality …………………………………………………………………….…………….…………..…….…………………….. 515
Solikhatun and Salmah

Unsteady Heat and Mass Transfer from a Stretching Surface Embedded in a Porous Medium
with Suction/injection and Thermal Radiation Effects…………………………………………………………. 529
Stanford Shateyi and Sandile S Motsa

Level-Set-Like Method for Computing Multi-Valued Solutions to Nonlinear Two Channels


Dissipation Model ……………………………………………………………………………….....………………………….. 547
Sumardi, Soeparna Darmawijaya, Lina Aryati, F.P.H. Van Beckum

Nonhomogeneous Abstract Degenerate Cauchy Problem: The Bounded Operator on the


Nonhomogen Term ………………………………………………………………………..…………………..……………… 559
Susilo Hariyanto, Lina Aryati,Widodo

Stability Analysis and Optimal Harvesting of Predator-Prey Population Model with Time
Delay and Constant Effort of Harvesting ………………………………………………………..……………………. 567
Syamsuddin Toaha

Dynamic Analysis of Ethanol, Glucose, and Saccharomyces for Batch Fermentation ………….. 579
Widowati, Nurhayati, Sutimin, Laylatusysyarifah

Computer Science, Graph and Combinatorics

Survey of Methods for Monitoring Association Rule Behavior ............................. 589
Ani Dijah Rahajoe and Edi Winarko

A Comparison Framework for Fingerprint Recognition Methods .................... 601
Ary Noviyanto and Reza Pulungan

The Global Behavior of Certain Turing System …………………..................................... 615
Janpou Nee

Logic Approach Towards Formal Verification of Cryptographic Protocol ......... 621
D.L. Crispina Pardede, Maukar, Sulistyo Puspitodjati

A Framework for an LTS Semantics for Promela ….................................. 631
Suprapto and Reza Pulungan
Mathematics Education

Modelling On Lecturers’ Performance with Hotteling-Harmonic-Fuzzy …................................ 647


H. A. Parhusip and A. Setiawan

Differences in Creativity Qualities Between Reflective and Impulsive Students in Solving


Mathematics …………………......................................……...................…………………........................ 659
Warli

Statistics and Finance

Two-Dimensional Warranty Policies Using Copula ...…………...................………………………………. 671


Adhitya Ronnie Effendie

Consistency of the Bootstrap Estimator for Mean Under Kolmogorov Metric and Its
Implementation on Delta Method ……................................………………….................................. 679
Bambang Suprihatin, Suryo Guritno, Sri Haryatmi

Multivariate Time Series Analysis Using RcmdrPlugin.Econometrics and Its Application for
Finance ..................................................................................................................................... 689
Dedi Rosadi

Unified Structural Models and Reduced-Form Models in Credit Risk by the Yield Spreads …. 697
Di Asih I Maruddani, Dedi Rosadi, Gunardi, Abdurakhman

The Effect of Changing Measure in Interest Rate Models …….…....................…....................... 705


Dina Indarti, Bevina D. Handari, Ias Sri Wahyuni

New Weighted High Order Fuzzy Time Series for Inflation Prediction ……................................ 715
Dwi Ayu Lusia and Suhartono

Detecting Outlier in Hyperspectral Imaging Using Multivariate Statistical Modeling and


Numerical Optimization ………………………………………...........................................……………......... 729
Edisanter Lo

Prediction the Cause of Network Congestion Using Bayesian Probabilities ............................. 737
Erwin Harapap, M. Yusuf Fajar, Hiroaki Nishi

Solving Black-Scholes Equation by Using Interpolation Method with Estimated Volatility……… 751
F. Dastmalchisaei, M. Jahangir Hossein Pour, S. Yaghoubi

Artificial Ensemble Forecasts: A New Perspective of Weather Forecast in Indonesia ............... 763
Heri Kuswanto
Second Order Least Square for ARCH Model …………………………....................………………............. 773
Herni Utami, Subanar, Dedi Rosadi, Liqun Wang

Two Dimensional Weibull Failure Modeling ……………………..…..................……….......................... 781


Indira P. Kinasih and Udjianna S. Pasaribu

Simulation Study of MLE on Multivariate Probit Models …......................................................... 791


Jaka Nugraha

Clustering of Dichotomous Variables and Its Application for Simplifying Dimension of


Quality Variables of Building Reconstruction Process ............................................................. 801
Kariyam

Valuing Employee Stock Options Using Monte Carlo Method ……………................................. 813
Kuntjoro Adji Sidarto and Dila Puspita

Classification of Epileptic Data Using Fuzzy Clustering .......................................................... 821


Nazihah Ahmad, Sharmila Karim, Hawa Ibrahim, Azizan Saaban, Kamarun Hizam
Mansor

Recommendation Analysis Based on Soft Set for Purchasing Products ................................. 831
R.B. Fajriya Hakim, Subanar, Edi Winarko

Heteroscedastic Time Series Model by Wavelet Transform ................................................. 849


Rukun Santoso, Subanar, Dedi Rosadi, Suhartono

Parallel Nonparametric Regression Curves ............................................................................ 859


Sri Haryatmi Kartiko

Ordering Dually in Triangles (Ordit) and Hotspot Detection in Generalized Linear Model for
Poverty and Infant Health in East Java ……................................…………………………………………. 865
Yekti Widyaningsih, Asep Saefuddin, Khairil Anwar Notodiputro, Aji Hamim Wigena

Empirical Properties and Mixture of Distributions: Evidence from Bursa Malaysia Stock
Market Indices …………………....................................................................................................... 879
Zetty Ain Kamaruzzaman, Zaidi Isa, Mohd Tahir Ismail
An Improved Model of Tumour-Immune System Interactions …………………………………………….. 895
Trisilowati, Scott W. Mccue, Dann Mallet
Paper of Invited
Speakers
Proceedings of ”The 6th SEAMS-UGM Conference 2011”
pp. 1–8.

ON THINGS YOU CAN’T FIND:


RETRIEVABILITY MEASURES AND
WHAT TO DO WITH THEM

Andreas Rauber and Shariq Bashir

Abstract. Information retrieval systems are commonly evaluated along two criteria,
namely effectiveness and efficiency, i.e. how well and how quickly they retrieve relevant
documents. More recently, another measure referred to as retrievability has been gaining
importance, namely how unbiased an IR system is against documents of certain charac-
teristics, i.e. how well all documents can theoretically be found. This paper provides a short introduction to the concept of retrievability and shows different ways of measuring and estimating it. We further discuss how retrievability can be used to evaluate and tune retrieval systems, and we discuss open challenges in terms of better understanding the mathematical characteristics of similarity computation in high-dimensional feature spaces.

1. INTRODUCTION
In the basic vector space model of information retrieval (IR), both documents as
well as queries are represented as vectors in a high-dimensional feature space. Distance
measures such as cosine distance, any L-norm, and others are used to identify the most
similar document vectors to any query vector, resulting in a ranked list of documents
to be returned to the user. This model poses several interesting challenges and opportunities for improving IR systems, be it in terms of identifying an optimized feature space or selecting a suitable metric that can handle the frequently very high-dimensional (several tens of thousands of dimensions) and very sparse feature spaces.
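As a toy illustration of this vector space setup (not part of the original paper; the document-by-feature matrix and the query vector are assumed to be given), ranking by cosine similarity can be sketched in Python as follows.

```python
import numpy as np

def cosine_rank(doc_matrix, query_vec):
    """Rank documents by cosine similarity to a query vector.

    doc_matrix : (n_docs, n_features) array, one row per document
    query_vec  : (n_features,) array in the same feature space
    Returns indices of documents, most similar first.
    """
    doc_norms = np.linalg.norm(doc_matrix, axis=1)
    q_norm = np.linalg.norm(query_vec)
    # Small epsilon guards against all-zero (empty) documents or queries.
    sims = doc_matrix @ query_vec / (doc_norms * q_norm + 1e-12)
    return np.argsort(-sims)
```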
Evaluation of IR systems is usually performed along two dimensions, namely effec-
tiveness (how good is the system in returning relevant documents) and efficiency (how
time/memory efficient is the system). With the dominant effectiveness-based measures
being precision and recall, Retrievability has recently evolved as an important additional
dimension along which to evaluate information retrieval (IR) systems. It basically eval-
uates, in how far each document can be found by a given system. As each retrieval

model has certain characteristics that determine its behavior against documents of dif-
ferent length, vocabulary size, vocabulary distribution etc., retrievability can be used to
determine a potential bias introduced by a certain retrieval model against specific types
of documents. While retrievability has been primarily used in the context of recall-
oriented application domains (specifically patent retrieval), it may also be applied in a
range of more conventional application domains to understand which documents cannot
be found under certain assumptions, and to tune retrieval systems.
Still, the reasons as to why some documents have low or high retrievability, and
the characteristics of the bias introduced by a retrieval system are not fully understood.
Even more, estimating retrievability without processing prohibitively large numbers of
queries is a challenging task. We thus need to strive for a more systematic approach, re-
lying on the mathematical characteristics of the feature space as resulting from the data
as well as the query space, and their relationship. Similar to the insights gained from
understanding the behavior of distance metrics in very high-dimensional and specifically
very sparse feature spaces [1], we need to obtain a more solid model of the interplay
between data- and query space under a specific distance metric in order to understand
system bias and optimize retrieval. This extended abstract aims at raising a few ques-
tions in this context.
The remainder of this paper is organized as follows. A short introduction to re-
trievability measurement is provided in Section 2. Section 3 then proceeds with describing
some research challenges in the context of retrievability research, before providing some
conclusions in Section 4.

2. RETRIEVABILITY - HOW TO MEASURE?


Conventionally, retrieval systems are evaluated along two dimensions, namely
effectiveness and efficiency. Effectiveness measures capture how good a system is at finding relevant documents for a given set of topics/queries. Typical measures include precision, i.e. the fraction of relevant documents in the set of documents retrieved by the system, as well as recall, i.e. the fraction of relevant documents found. A range of other measures capturing different trade-offs or handling ranked lists, such as Mean Average Precision, Q-measure, Normalized Discounted Cumulative Gain, Rank-Based Precision and Binary Preference (bpref), exist [5]. Efficiency measures the computational performance in terms of speed, memory requirements etc. Analysis along these two dimensions usually allows one to determine the best system for a given setting.
Retrievability is a relatively novel dimension along which to evaluate information
retrieval systems. It basically measures how well a system is able to potentially retrieve (provide access to) all documents in a corpus, or, phrased the other way round, whether it is possible for a user to at least theoretically find every document in a corpus with some reasonable query (i.e. a query that is not a duplicate of the document itself). Retrievability was originally proposed by Azzopardi [4]. This section, adopted from [3], provides a brief introduction to retrievability.

Given a retrieval system RS with a collection of documents D, the concept of


retrievability is to measure how well each document d ∈ D is retrievable within the
top-c rank results of all queries, if RS is presented with a large set of queries q ∈ Q.
Retrievability of a document is essentially a cumulative score that is proportional to
the number of times the document can be retrieved within that cut-off c over the set
Q [4]. A retrieval system is called best retrievable if each document d has nearly the same retrievability score, i.e. is equally likely to be found. More formally, the retrievability r(d) of d ∈ D can be defined as follows.
r(d) = \sum_{q \in Q} f(k_{dq}, c)    (1)

f(k_{dq}, c) is a generalized utility/cost function, where k_{dq} is the rank of d in the result set of query q, and c denotes the maximum rank that a user is willing to proceed down the ranked list. Commonly, the function f(k_{dq}, c) returns a value of 1 if k_{dq} \le c, and 0 otherwise, although other implementations may be used when the impact of the ranked position at which a document is returned can and should be assessed.
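As a concrete reading of Equation (1) with the common choice f(k_{dq}, c) = 1 if k_{dq} ≤ c and 0 otherwise, the following Python sketch (illustrative only; the ranked result lists are assumed to come from some retrieval system) accumulates the cumulative retrievability scores.

```python
from collections import Counter

def retrievability(ranked_lists, c):
    """Cumulative retrievability r(d) over a query set, Eq. (1).

    ranked_lists : iterable of ranked result lists, one per query q in Q;
                   each list holds document ids, best-ranked first.
    c            : rank cut-off a user is willing to inspect.
    """
    r = Counter()
    for results in ranked_lists:
        for doc_id in results[:c]:     # f(k_dq, c) = 1 exactly for these
            r[doc_id] += 1
    return r
```

For example, r = retrievability(results_per_query, c=100) counts, for each document, how many of the issued queries return it within the top 100 ranks; documents absent from r have retrievability 0.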
A variation of this, also proposed by Azzopardi in [4], takes into account the
probability with which a certain query will be issued. This allows for more realistic
estimates when information about the likeliness of certain queries is available, be it
based on query length, query term probability, etc.
Retrievability inequality can further be analyzed using the Lorenz Curve. Docu-
ments are sorted according to their retrievability score in ascending order, plotting a
cumulative score distribution. If the retrievability of documents is distributed equally,
then the Lorenz Curve will be linear. The more skewed the curve, the greater the
amount of inequality or bias within the retrieval system. The Gini coefficient G is used
to summarize the amount of bias in the Lorenz Curve, and is computed as follows.
G = \frac{\sum_{i=1}^{n} (2 \cdot i - n - 1) \cdot r(d_i)}{(n - 1) \sum_{j=1}^{n} r(d_j)}    (2)

where n = |D| is the number of documents in the collection, sorted by r(d).
If G = 0, then no bias is present because all documents are equally retrievable. If
G = 1, then only one document is retrievable and all other documents have r(d) = 0.
By comparing the Gini coefficients of different retrieval methods, we can analyze the
retrievability bias imposed by the underlying retrieval system on the given document
collection.
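A minimal sketch of Equation (2) in Python, assuming the retrievability scores r(d) have already been computed as above (an illustration, not code from the paper):

```python
def gini(scores):
    """Gini coefficient G of a collection of retrievability scores, Eq. (2)."""
    r = sorted(scores)                 # documents in ascending order of r(d)
    n = len(r)
    total = sum(r)
    if n < 2 or total == 0:
        return 0.0                     # degenerate cases: no measurable bias
    num = sum((2 * (i + 1) - n - 1) * ri for i, ri in enumerate(r))
    return num / ((n - 1) * total)
```

For instance, G = gini([r.get(d, 0) for d in all_doc_ids]) (with all_doc_ids an assumed list of every document id, so that never-retrieved documents contribute zeros) yields a value near 0 for an unbiased system and near 1 for a strongly biased one.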
However, the retrievability measure as defined above is a cumulative score over
all queries. Thus, longer documents that contain a larger vocabulary potentially have
a higher retrievability score than shorter documents as they can be found by a larger
number of queries. While this is desirable as a general measure of retrievability, in
settings where the actual set of queries is created directly from the documents to be
found, this may have a negative impact. This is because a larger number of queries are
generated for these longer documents. Thus, the cumulative retrievability score needs
to be normalized by the number of queries that were created from and thus potentially
can retrieve a particular document.

\hat{r}(d) = \frac{\sum_{q \in Q} f(k_{dq}, c)}{|\hat{Q}|}    (3)

where \hat{Q} is the set of queries that can retrieve d when not considering any rank cut-off factor.
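Continuing the earlier sketch (purely illustrative; the per-document query counts |\hat{Q}| are assumed to be known from the query-generation step), the normalization in Equation (3) is a small change:

```python
def normalized_retrievability(r, queries_per_doc):
    """Eq. (3): divide each cumulative score by the number of queries
    generated from (and hence able to retrieve) the document."""
    return {d: r.get(d, 0) / q_count
            for d, q_count in queries_per_doc.items() if q_count > 0}
```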
Numerous experiments have meanwhile been performed on retrievability-related analysis. One of the surprising findings is that most retrieval systems exhibit a rather strong bias, strongly favoring or disfavoring longer or shorter documents that are rather vocabulary-rich or vocabulary-poor. In most corpora there is a surprisingly high number of documents that have a retrievability of 0, i.e. that cannot be found via any query up to a reasonable length and excluding extremely specific queries consisting of very rare terms. For examples of such analyses, see [2, 3, 4].

3. CHALLENGES IN RETRIEVABILITY RESEARCH


While numerous experiments analyzing retrievability for different document col-
lections and query systems have been performed, we still face numerous challenges that
need to be addressed on a fundamental basis. These include
• understanding retrievability: what are the factors that influence retrievability,
and how are they correlated?
• estimating retrievability: how can we estimate retrievability of a system/corpus pairing without having to process prohibitively large sets of queries?
• improving IR systems: can we improve the effectiveness of IR systems by im-
proving their retrievability?
The following subsections take a closer look at these research challenges.

3.1. Understanding retrievability. While we have observed the effect of largely vary-
ing retrievability values for different documents in various retrieval models, we still lack
a solid understanding of the relationship between the document space, the query space
and the distance measures used to retrieve documents that underly all effectiveness
based measures, but also retrievability. On the one hand, in situations where the char-
acteristics of the document space and the query space are highly similar, low-retrievable
documents tend to be outliers in the document space. This seems to be the case in some
multimedia retrieval scenarios such as music retrieval. However, in text retrieval the
situation is somewhat different: while the documents live in a rather continuous space
of weighted term frequency values, the query space is rather discrete, as conventional
queries consist of combinations of unique query terms. Thus, queries basically form
rather discrete sub-spaces of those terms present in the query, ignoring the importance
of each term as expressed in the weighted model used for representing the documents.
As a result of this, also documents that live in rather dense spaces in the document
space may end up having low retrievability as they are never close enough to the dis-
crete query points, leading to a different reason for having low retrievability based on
the document space.

We thus need to develop a better understanding of which characteristics of the


high-dimensional feature spaces (i.e. document corpus characteristics, query formula-
tion characteristics, and the indexing used to create the feature spaces) influence the
retrievability of which types of documents. We also need to analyze structurally which
of these characteristics are corpus-based (densities), and which are document-based
(outliers, being too far away from query points even in dense areas). Last, but most
importantly, we need to obtain a more solid understanding of the mathematical princi-
ples and relationship between the two populations of the feature space, namely the data
points and the query points. These populations, depending on the type of corpus, fea-
ture space used for indexing, and the way queries are formulated and represented, will
differ significantly in terms of sparsity and discreteness. As these two populations of the
feature space are set in relationship to each other via some distance measure, we need
to obtain a solid understanding of the relationship between these three key elements
underlying any similarity computation. Specifically the latter element, i.e. the distance
measure, will further complicate a model-based analysis of the three interplaying con-
stituents, as most retrieval systems deployed now rely on rather complex interactions
between different types of feature spaces that are combined in a dynamic manner de-
pending on an analysis of the query type, with additional parameters modifying the way
similarity is being computed. Yet, solid models of these three elements and an analysis
of their interplay will provide us with a better understanding of how information can
be found, or why information cannot be found in a given representation.

3.2. Estimating retrievability. Retrievability calculation requires the analysis of an almost prohibitively large set of queries. In order to provide realistic estimates with reasonable effort, we need to devise ways to estimate retrievability based on a smaller number of queries. Knowing how to create these queries, and how to provide bounds on the error made in such estimates, would significantly speed up retrievability analysis, and thus would allow it to be applied more intensively in system fine-tuning and evaluation.
Yet, apart from simply reducing the number of queries, we also need to ensure
that the analyses result in realistic estimates with respect to realistic query setting for
a given combination of document corpus and retrieval system. This leads to questions
pertaining to the creation of realistic queries: how can we automatically create query
sets in large numbers that exhibit realistic distributions of query lengths, query term
specificity, and other properties? While this is a challenging problem considering individual query
terms, it becomes even harder when we also want to include realistic phrase-queries.
Furthermore, estimation methods as well as analytical approaches may help in providing
a better basis for determining retrievability - again with the constraint of having a
realistic model of the query space as mapped into the data space.

3.3. Improving IR systems. While retrievability analysis provides valuable informa-


tion on system bias in its own right, experiments seem to indicate that systems that
have a lower retrievability bias also provide better effectiveness as measured in preci-
sion and recall. If this indication should turn out to be true across a range of realistic
side-assumption (such as e.g not simply returning random documents, which would by
definition provide the lowest retrieval bias), the we could use retrievability as a base

measure to fine-tune retrieval systems, using it for optimizing parameters. Contrary


to other effectiveness-based measures, retrievability does not require annotated ground
truth such as which set of documents is relevant for each specific query. As such an-
notations are tremendously expensive up to virtually impossible to create for larger
document collections, having a performance measure based on unsupervised criteria
would offer tremendous potential for system fine-tuning on arbitrary document collec-
tions. Currently, systems can only be evaluated and tuned for document corpora where
such a ground truth is available - these setting are then ported to other corpora. With
retrievability measurement, one could fine-tune the systems specifically for each corpus,
if - as mentioned above - realistic estimates can be determined from realistic query
generation processes, and if the fine-tuning assumption should prove true.

4. CONCLUSIONS
Retrievability measurement has evolved as a promising evaluation measure for the performance of an information retrieval system. It measures in how far a system provides
in principle equal access to all objects, i.e. in how far each object is equally likely to
be found using specific queries. Quite surprisingly, information retrieval systems differ
significantly in the bias they impose on a document corpus, returning some objects for
a large number of potential queries, while other objects are virtually never retrieved
within the top-n documents for all possible queries.
Obtaining a better understanding of this phenomenon will offer a basis for better
understanding and optimizing retrieval systems. It will also allow us to better un-
derstand the limitations of what can and what cannot be found. This is increasingly
essential as the corpora of information that we are searching in become increasingly
large, with the willingness of users to scan long result lists diminishing. Thus, under-
standing which data items cannot be retrieved via any query may provide significant
insights on the features and limitations of a system. In order to achieve this, we need
to more thoroughly investigate the mathematical properties of these high-dimensional
and very sparse feature spaces, the characteristics of both the document as well as the
query feature space and the relationship between these two as influenced by the distance
measures used for retrieval.

References
[1] Aggarwal, C. C. and Hinneburg, A. and Keim, D. A., On the Surprising Behavior of Dis-
tance Metrics in High Dimensional Spaces. In Proceedings of the 8th International Conference
on Database Theory (ICDT ’01), 420–434, Springer, 2001.
[2] Bashir, S. and Rauber, A. Identification of low/high retrievable patents using content-based fea-
tures. In PaIR ’09: Proceeding of the 2nd international workshop on Patent information retrieval,
pages 9–16, 2009.
[3] Bashir, S. and Rauber, A. Improving Retrievability and Recall by Automatic Corpus Partitioning. In: Transactions on Large-Scale Data- and Knowledge-Centered Systems, 2:122-140, 2010.

[4] Azzopardi, L. and Vinay, V. Retrievability: an evaluation measure for higher order information
access tasks. In CIKM ’08: Proceeding of the 17th ACM conference on Information and knowledge
management, pages 561–570, New York, NY, USA, 2008. ACM.
[5] Sakai, T. Comparing metrics across trec and ntcir: the robustness to system bias. In CIKM
’08: Proceeding of the 17th ACM conference on Information and knowledge management, pages
581–590. ACM, 2008.

Andreas Rauber
Vienna University of Technology.
e-mail: [email protected]

Shariq Bashir
Vienna University of Technology.
e-mail: [email protected]
Proceedings of ”The 6th SEAMS-UGM Conference 2011”
pp. 9–18.

A QUASI-STOCHASTIC DIFFUSION-REACTION
DYNAMIC MODEL FOR TUMOUR GROWTH

Ang Keng Cheng

Abstract. In this paper, a quasi-stochastic diffusion-reaction model for avascular tumour


growth is developed. The model is formulated as a set of partial differential equations
describing the dynamics of three compartments of cell populations: the proliferating,
quiescent and necrotic cells. To introduce a more realistic effect of randomness in the
model, a stochastic term in the form of a standard Wiener process is added to one of the
equations. The model is solved numerically and the results obtained indicate that the
model is a reasonable representation of the dynamics of tumour growth. Convergence of
the numerical scheme is also analysed and discussed.

Keywords and Phrases: Tumour growth, quasi-stochastic, diffusion-reaction model

1. INTRODUCTION
Cancer is one of the main causes of death in the world. According to the World
Health Organization, in 2005, out of a total of 58 million deaths worldwide, cancer
accounted for 7.6 million, or around 13%, of all deaths (http://www.who.org/). In Singapore, cancer is the leading cause of death and has been responsible for more than 25% of all deaths in the past few years (http://www.moh.gov.sg/). It is therefore not sur-
prising that cancer research, both theoretical and experimental, has been gaining more
attention in recent years.
Cancer is a generic term for a group of diseases characterized by the abnormal
and uncontrolled growth of cells. Normal and healthy cells can grow, divide, die and
be replaced in a regulated fashion. However, if a cell becomes transformed as a result
of mutations in certain key genes, it can lose its ability to control its growth. This
in turn may lead to excessive proliferation, creating a cluster of cells, better known as
a primary tumour. If the tumour is malignant, it possesses the ability to spread and
invade neighbouring tissues. In fact, what makes cancer so lethal is the spread of cancer,
or metastasis, to other sites of the body because if the spread is not controlled, it can
result in death.

The development of cancer can be divided into three distinct stages: avascular,
vascular and metastatic. When a primary tumour is first formed, it is simply a collection
of cells without any supply of blood vessels. This is the avascular stage and the avascular
tumour can sometimes remain dormant for a long period of time. If the tumour manages
to induce blood vessels to grow towards it and in turn develops its own network of vessels
(vasculature), it becomes a vascular tumour. The vascular tumour enters the metastatic
stage when its cancer cells are able to escape from the primary tumour to invade tissues
in a distant site and form a secondary tumour.
Avascular tumour growth usually begins with the presence of proliferating cells
dividing and reproducing at some rate. Some of these cells will then turn quiescent, that
is, alive but not dividing. However, if conditions are right, quiescent cells may begin
dividing again. Many cancer therapies specifically target dividing cells. Some quiescent
cells may eventually die, forming a central core of necrotic (dead) cells and the typical
three zone tumour spheroid is formed. This is shown schematically in Figure 1. Such
typical structure is evident in experimental culture studies performed by researchers
such as Folkman and Hochberg [6].

Proliferating cells

Quiescent cells

Necrotic cells

Figure 1. Typical structure of a multicellular tumour spheroid

Besides experimental studies, mathematical modelling can also provide useful


insights into the dynamics of tumour growth. Various approaches have been used by
researchers such as Burton [2], Greenspan [7], Adam [1], Ward and King [12], Sherratt
and Chaplain [10] and Tan and Ang [11].
In this paper, we focus on avascular tumour growth and develop a model based on
diffusion-reaction dynamics and the law of mass conservation. In addition, to include a
realistic element of randomness in the deterministic model, a stochastic term is added
to one of the equations. The resulting model is thus quasi-stochastic, and is solved
numerically. Results are discussed and compared with published works, and convergence
of the numerical scheme is also investigated.

2. THE MODEL
The tumour growth model considered here assumes that the cells are divided
into sub-populations of proliferating, quiescent and necrotic cells whose cell densities
are denoted by p(x, t), q(x, t) and n(x, t) respectively where t denotes time and x is
the one-dimensional space coordinate. Like the model first proposed by Sherratt and
Chaplain [10] and later modified by Tan and Ang [11], the current model also consists
of a set of partial differential equations based on the usual diffusion-reaction dynamics.
These equations describe the evolutions of p, q and n with time and along x.
However, unlike these previous models, we do not assume that movements of
proliferating and quiescent cells are inhibited by their proximity to each other. Instead,
the standard diffusion model is used for both cell sub-populations p and q. The resulting
model is a set of governing equations stated as follows.
\frac{\partial p}{\partial t} = \frac{\partial^2 p}{\partial x^2} + g(c)\, p\, (1 - p - q - n) - f(c)\, p    (1)

\frac{\partial q}{\partial t} = \frac{\partial^2 q}{\partial x^2} + f(c)\, p - h(c)\, q    (2)

\frac{\partial n}{\partial t} = h(c)\, q    (3)

c = \frac{c_0 \gamma}{\gamma + p} \left(1 - \alpha (p + q + n)\right)    (4)
The functions g(c), f (c) and h(c) are variable rates of growth, and c(x, t) is a
function representing nutrient concentration. Following Tan and Ang, f (c) and h(c)
are assumed to be decreasing. Proliferating cells become quiescent at the rate of f (c)
and quiescent cells become necrotic at the rate of h(c). In addition, proliferating cells
divide at a rate of g(c) and a Gompertz growth rate is used to represent this rate in this
model. We also assume that as c tends to ∞, both f and h vanish. The growth process is
driven by nutrient supply, whose concentration in this case is governed by Equation (4),
as proposed by Sherratt and Chaplain for in vivo multicellular spheroids. Here, c0 and γ
are constant parameters, with c0 representing the nutrient concentration in the absence
of tumour cell population, and α ∈ (0, 1] represents a constant of proportionality.
Although several functional forms for f (c) and h(c) are possible, in the current
discussion, we take f(c) = \frac{1}{2}(1 - \tanh(4c - 2)) and h(c) = \frac{1}{2} f(c). For a standard Gompertz model, we assume g(c) = \beta e^{\beta c} for some β between 0 and 1. For convenience,
we fix g(0) = 1 and set the total cell density p + q + n to 1 at t = 0 and x = 0 in all
computations.
Equations (1) to (4) and the functions f (c), g(c) and h(c) constitute a model for
growth of tumour cells based on diffusion-reaction dynamics.
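To fix ideas, the functional forms just stated translate directly into code. The following Python sketch is only an illustration of Equations (1) to (4) as written above, with parameter values chosen from the ranges quoted later in the paper (the authors' own computations are in MATLAB):

```python
import numpy as np

# Illustrative parameter values within the ranges used later in the paper.
alpha, beta, gamma, c0 = 0.8, 0.5, 10.0, 1.0

def f(c):                      # rate at which proliferating cells turn quiescent
    return 0.5 * (1.0 - np.tanh(4.0 * c - 2.0))

def h(c):                      # rate at which quiescent cells become necrotic
    return 0.5 * f(c)

def g(c):                      # division rate, Gompertz-type form
    return beta * np.exp(beta * c)

def nutrient(p, q, n):         # Eq. (4): nutrient concentration c(x, t)
    return c0 * gamma / (gamma + p) * (1.0 - alpha * (p + q + n))
```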

3. NUMERICAL SOLUTION
The governing equations (1) to (4) may be solved numerically using standard
finite difference methods. In our case, we employ the forward difference approximation
for time derivatives and central difference approximations for the space derivatives.
Using ∆t and ∆x as the time steps and space intervals respectively in a finite difference
scheme, the following set of discretized equations is obtained.
p_i^{j+1} = p_i^j + \frac{\Delta t}{\Delta x^2} \left( p_{i+1}^j - 2 p_i^j + p_{i-1}^j \right) + \Delta t \left[ g(c_i^j)(1 - r_i^j) - f(c_i^j) \right] p_i^j    (5)

q_i^{j+1} = q_i^j + \frac{\Delta t}{\Delta x^2} \left( q_{i+1}^j - 2 q_i^j + q_{i-1}^j \right) + \Delta t \left[ f(c_i^j) p_i^j - h(c_i^j) q_i^j \right]    (6)

n_i^{j+1} = n_i^j + \Delta t\, h(c_i^j) q_i^j    (7)

c_i^{j+1} = \frac{c_0 \gamma}{\gamma + p_i^j} \left( 1 - \alpha r_i^j \right)    (8)

where r_i^j = p_i^j + q_i^j + n_i^j. In these discretized equations, the superscript and subscript refer to the time level and space position respectively. As an example, p_i^j = p(x_i, t_j) = p(i \Delta x, j \Delta t).
Given the randomness expected in cell mitosis and cell diffusion in proliferation, it
is both reasonable and realistic to include a stochastic term in the form of multiplicative
noise to Equation (1). Following Doering, Sargsyan and Smereka [4], Equation (5) may
be modified as
p_i^{j+1} = p_i^j + \frac{\Delta t}{\Delta x^2} \left( p_{i+1}^j - 2 p_i^j + p_{i-1}^j \right) + \Delta t \left[ g(c_i^j)(1 - r_i^j) - f(c_i^j) \right] p_i^j + \tau\, p_i^j \sqrt{\Delta t}\, \Delta W_i^j    (9)

where \Delta W_i^j are independent Gaussians with mean zero and variance \Delta t, and τ is
a suitably chosen scaling factor used to control the amplitude of the added noise. To
solve the stochastic differential equation numerically, we use the Euler-Maruyama (EM)
method as described by Higham [8]. A discretized Brownian path over a given duration,
say [0, T ] with a chosen incremental value of δt is first constructed. As part of the finite
difference computation, Equation (9) is used with a stepsize of ∆t = Rδt for some
positive integer R. This is so that the stepsize for the finite difference scheme is
always an integer multiple of the increment for the discretized Brownian path. This, in
essence, is the EM method which ensures that the set of points used in the discretized
Brownian path contains the points of the EM computation.
The increment \Delta W_i^j is then computed using

\Delta W_i^j = W_i^{j+1} - W_i^j = \sum_{k=jR+1}^{(j+1)R} dW_k

and W_0 = 0.
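A short sketch of this bookkeeping (our illustration of the Euler-Maruyama construction described above, using the values R = 2, Δt = 0.04 and T = 16 quoted further below; variable names are ours, not the author's MATLAB code):

```python
import numpy as np

rng = np.random.default_rng(0)          # fixed seed so the path is reproducible

T, dt, R = 16.0, 0.04, 2
delta_t = dt / R                        # Brownian path increment, since dt = R * delta_t
n_fine = int(round(T / delta_t))        # number of fine increments dW_k over [0, T]

dW = np.sqrt(delta_t) * rng.standard_normal(n_fine)   # dW_k ~ N(0, delta_t)

# Delta W^j is the sum of the R fine increments falling inside the j-th time step.
Delta_W = dW.reshape(-1, R).sum(axis=1)                # one value per finite difference step
```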
The boundary conditions and initial conditions used in the present discussion, as
well as some suitable parameter values are listed below.

Boundary conditions:
\partial p / \partial x = 0 and \partial q / \partial x = 0 at x = 0 and as x → ∞.
Initial conditions:
q(x, 0) = n(x, 0) = 0 and p(x, 0) = 0.01 exp(−0.1x).
Suitable parameter values:
α ∈ (0.2, 0.9), β ∈ (0.1, 1.0), γ = 10, c0 = 1.
For the purpose of discussion, we have also fixed R = 2 and ∆t = 0.04, and
T = 16. Theoretically, x ∈ [0, ∞); however, in practice, we set a finite upper limit
for the space dimension. This upper limit has to be large enough to capture the main
features we wish to see in the results. After some computational experiments, it is found
that x = 210 is good enough, and that ∆x = 1 is an appropriate space interval.
The finite difference equations, with the initial and boundary conditions, suggested parameter values, time steps and space intervals, are solved using MATLAB. Results are presented and discussed in the next section.
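Putting the pieces together, the explicit scheme (5) to (9) can be sketched as follows. This is our Python rendering under the stated initial and boundary conditions, reusing the rate functions f, g, h and nutrient from the earlier sketch; it is not the author's MATLAB code, and an independent noise increment is drawn at each grid point.

```python
import numpy as np

def simulate(T=16.0, dt=0.04, dx=1.0, L=210.0, tau=0.1, seed=0):
    """One realisation of the quasi-stochastic scheme (5)-(9).

    tau scales the multiplicative noise; tau = 0 gives the deterministic model.
    Returns the final densities p, q, n on the spatial grid.
    """
    rng = np.random.default_rng(seed)
    x = np.arange(0.0, L + dx, dx)
    p = 0.01 * np.exp(-0.1 * x)              # initial proliferating cells
    q = np.zeros_like(x)                      # no quiescent cells at t = 0
    n = np.zeros_like(x)                      # no necrotic cells at t = 0
    lam = dt / dx**2

    for _ in range(int(round(T / dt))):
        c = nutrient(p, q, n)                 # Eq. (8): nutrient from current densities
        r = p + q + n
        # Zero-flux boundaries: copy edge values before taking differences.
        pe, qe = np.pad(p, 1, mode="edge"), np.pad(q, 1, mode="edge")
        lap_p = pe[2:] - 2.0 * pe[1:-1] + pe[:-2]
        lap_q = qe[2:] - 2.0 * qe[1:-1] + qe[:-2]
        dW = np.sqrt(dt) * rng.standard_normal(p.shape)       # Delta W ~ N(0, dt)
        p_new = (p + lam * lap_p
                 + dt * (g(c) * (1.0 - r) - f(c)) * p
                 + tau * p * np.sqrt(dt) * dW)                 # Eq. (9)
        q_new = q + lam * lap_q + dt * (f(c) * p - h(c) * q)   # Eq. (6)
        n_new = n + dt * h(c) * q                               # Eq. (7)
        p, q, n = p_new, q_new, n_new
    return p, q, n
```

For example, simulate(tau=0.0) corresponds to the deterministic case of Figure 2(a), while any tau > 0 produces one realisation of the quasi-stochastic model of Figure 2(b).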

4. RESULTS AND DISCUSSION


Using the functional forms, initial conditions and boundary conditions as men-
tioned in the preceding section, Equations (6) to (9) are solved for various parameter
values. A representative case with α = 0.8, β = 0.5 and γ = 10 produces results which
are shown in Figure 2. Solutions with other parameter values show similar results and
trends. In addition, one should also note that Figure 2(b) presents results obtained us-
ing one instance of a Brownian path. If one were to run another instance, the results are
likely to be different. However, the difference would be quantitative and not qualitative.
For the purpose of the present discussion, it is sufficient to consider this representative
set of results.
Figure 2(a) and (b) show the changes in the distributions of proliferating, qui-
escent and necrotic cells over time for a purely deterministic model and for a quasi-
stochastic model respectively. It is clear that there are distinctive differences between
the two sets of graphs.
In Figure 2(a), the distributions are found to be very similar to the models origi-
nally proposed by Sherratt and Chaplain and later modified by Tan and Ang. Here, we
see that the distributions are unrealistically smooth and symmetrical. In contrast, the
new quasi-stochastic model produces distributions which are still smooth but no longer
symmetric, as shown in Figure 2(b).
Although “noise” was added only to the equation involving p(x, t), the stochastic
effect propagates and is carried over to the dependent variables. From Figure 2(b), we
observe that from t = 0 to t = 8, the proliferating cells appear to increase in density
and generally move away from the core. This results in a higher density of quiescent

[Figure 2: six panels of Proliferating Cells p, Quiescent Cells q and Necrotic Cells n plotted against space x at t = 2 up to t = 16; left column (a) deterministic, without noise; right column (b) quasi-stochastic, with noise.]

Figure 2. Graphs of p(x, t), q(x, t) and n(x, t) against t, at t = 0, 2, . . . , 16, for α = 0.8, β = 0.5, and γ = 10 for deterministic (a) and quasi-stochastic (b) models.

cells. During this period, however, necrosis is limited to the central tumour core. This
is consistent with results obtained experimentally by Nirmala et al [9].
As time passes from t = 8 to t = 16, proliferating cells continue to increase in
numbers and move away from the centre, indicating absence of tumour regression. This
also compares well with results from Nirmala et al, who observed that there was no
limiting spheroid volume. From t = 8 onwards, necrotic cell density begins to build up
and a necrotic core starts to take shape.
Assuming that the model is radially symmetrical, images of tumour growth may
be constructed in two dimensions from the numerical results obtained by randomly dis-
tributing the cells along a circumference fixed by the corresponding value of x. The
result is a series of images simulating tumour growth progressing towards the distinc-
tive three-zone structure as shown in Figure 3. Comparing the last image in the figure
with the schematic diagram in Figure 1, it is evident that the model provides a rea-
sonable representation of tumour growth. Moreover, the patterns demonstrated by the
model are consistent with experimental observations of Dorie et al [5], and Folkman
and Hochberg [6].

Figure 3. Simulated tumour growth based on the quasi-stochastic model at t = 2, t = 8, t = 12 and t = 16, showing the formation of the distinctive three-zone (proliferating, quiescent and necrotic) structure of a tumour spheroid.

5. CONVERGENCE OF NUMERICAL SCHEME


Convergence of the finite difference scheme based on Equations (6) to (9) was
investigated numerically using the method proposed by Davie and Gaines [3]. We
generated 100 sets of results, each of which is a single representation of the noise in
the evolution of p(x, t). Approximations to p(x, T ) were calculated for several different numbers of space intervals, with T = 16 in all cases.
In each case, a weighting function defined by
$$\phi(x) = \frac{1}{\sqrt{210}}\,\frac{1}{2 + \cos(\pi x/105)}$$
was used to find a weighted average of the approximate values found across the space
dimension x.

To investigate the convergence of the numerical solutions for p(x, T ), we let
$$S_k = \sum_{j=1}^{100}\left(\bar{p}_j^{\,k} - \bar{p}_j^{\,k+1}\right)^2$$
for k = 1, 2, 3, where
$$\bar{p}_j^{\,k} = \frac{1}{N_k}\sum_{i=0}^{N_k-1}\phi(x_i^k)\,p_{j,i}^k, \qquad x_i^k = \frac{210\,i}{N_k},$$
and $p_{j,i}^k$ is the approximation to $p(x_i^k, T)$ with $N = N_k$ and the $j$th independent realisation of the Brownian path. In the present convergence study, we used
N1 = 210, N2 = 420, N3 = 840 and N4 = 1680, with ∆t = 0.04. Many runs were
carried out to obtain values of Sk and Table 1 records the results of five such runs.
Results shown in the table indicate that with multiplicative noise, the ratio Si /Si+1 is
about 4 in all cases, suggesting that the current numerical method has achieved weak
convergence. It seems reasonable to say that the finite difference scheme used in the present
study converges reasonably well and is stable.

Table 1. Space averages for Equation (9)

Run   S1 (×10−6)   S2 (×10−6)   S3 (×10−6)   S1/S2    S2/S3
 1      2.2016       0.5225       0.1386     4.2138   3.7703
 2      2.0171       0.5819       0.1362     3.4666   4.2745
 3      1.8706       0.6194       0.1405     3.0200   4.4099
 4      1.8643       0.5423       0.1500     3.4378   3.6150
 5      1.9587       0.6032       0.1375     3.2471   4.3881
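To make the above procedure concrete, the following is a minimal Python sketch of the space-average diagnostic. It is not taken from the paper: the arrays of approximations are assumed to come from the solver for Equations (6) to (9), which is not reproduced here, and the function names and the placeholder data are illustrative only.

import numpy as np

def weighted_space_average(p, x):
    # Weighted average of one profile p(x_i, T) over the grid x, using
    # phi(x) = (1/sqrt(210)) * 1/(2 + cos(pi*x/105)) as in the text.
    phi = (1.0 / np.sqrt(210.0)) / (2.0 + np.cos(np.pi * x / 105.0))
    return np.mean(phi * p)

def convergence_diagnostic(solutions, Ns, L=210.0):
    # solutions[k][j]: profile p(x, T) computed with Ns[k] space intervals
    # for the j-th Brownian path.  Returns S_k = sum_j (pbar_j^k - pbar_j^{k+1})^2.
    S = []
    for k in range(len(Ns) - 1):
        xk = np.arange(Ns[k]) * L / Ns[k]
        xk1 = np.arange(Ns[k + 1]) * L / Ns[k + 1]
        S.append(sum((weighted_space_average(solutions[k][j], xk)
                      - weighted_space_average(solutions[k + 1][j], xk1)) ** 2
                     for j in range(len(solutions[k]))))
    return S

# Placeholder data only illustrates the call pattern; real profiles must come
# from the finite difference / Euler-Maruyama solver described in the paper.
rng = np.random.default_rng(0)
Ns = [210, 420, 840, 1680]
solutions = [[rng.random(N) for _ in range(100)] for N in Ns]
S = convergence_diagnostic(solutions, Ns)
print(S, [S[k] / S[k + 1] for k in range(len(S) - 1)])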

6. CONCLUDING REMARKS
In this paper, we examine a quasi-stochastic dynamic model for tumour growth obtained by adding multiplicative noise to one of the governing equations. The model is solved using a standard finite difference method, together with the Euler–Maruyama method to handle the stochastic component. The result is a more realistic model, producing a
simulated tumour growth that agrees with published experimental observations.
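As a rough illustration of how such a scheme can be organised (a sketch only: the paper's governing equations (6) to (9) and boundary conditions are not reproduced here, so the diffusion coefficient D, the reaction term and the noise intensity sigma below are placeholders), a single explicit finite-difference step with an Euler–Maruyama treatment of multiplicative noise could be written in Python as follows.

import numpy as np

def euler_maruyama_step(p, dt, dx, D, reaction, sigma, rng):
    # One explicit update for a generic 1-D reaction-diffusion equation with
    # multiplicative noise, dp = (D p_xx + f(p)) dt + sigma * p dW.
    # Periodic boundaries are used here purely for brevity; the paper's actual
    # scheme follows its Equations (6)-(9).
    lap = (np.roll(p, -1) - 2.0 * p + np.roll(p, 1)) / dx**2
    dW = rng.normal(0.0, np.sqrt(dt), size=p.shape)   # Brownian increments
    return p + (D * lap + reaction(p)) * dt + sigma * p * dW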
A brief analysis of the numerical method using space averages indicates that
the method is both convergent and stable, although it is possible to achieve better
convergence if implicit or semi-implicit finite difference schemes are used. Nevertheless,
the focus of this paper is on a more realistic tumour growth model that can be solved
using a reasonable numerical scheme.

Acknowledgement. This paper represents an extension of the work carried out by Tan Liang Soon under the supervision of the author. The author acknowledges and thanks Tan for his contribution to the initial part of the work.

References
[1] Adam, J.A., A simplified mathematical model of tumour growth, Mathematical Bioscience, 81,
224-229, 1986.
[2] Burton, A.C., Rate of growth of solid tumours as a problem of diffusion, Growth, 80, 157-176,
1966.
[3] Davie, A.M. and Gaines, J.G., Convergence of numerical schemes for the solution of parabolic
stochastic partial differential equations, Mathematics of Computation, 70, 121-134, 2000.
[4] Doering, C.R., Sargsyan, K.V. and Smereka, P., A numerical method for some stochastic
differential equations with multiplicative noise, Physics Letters A, 344, 149-155, 2005.
[5] Dorie, M., Kallman, R. and Coyne, M., Effect of cytochalasin b, nocodazole and irradiation
on migration and internalization of cells and microspheres in tumour cell spheroids, Experimental
Cell Research, 166, 370-378, 1986.
[6] Folkman, J. and Hochberg, M., Self-regulation of growth in three dimensions, Journal of Ex-
perimental Medicine, 138, 745-753, 1973.
[7] Greenspan, H.P., Models for the growth of a solid tumour by diffusion, Studies in Applied Math-
ematics, 62, 317-340, 1972.
[8] Higham, D.J., An algorithmic introduction to numerical simulation of stochastic differential equa-
tions, SIAM Review, 43, 525-546, 2001.
[9] Nirmala, C., Rao, J.S., Ruifrok, A.C., Langford, L.A. and Obeyesekere, M., Growth char-
acteristics of glioblastoma spheroids, International Journal of Oncology, 19, 1109-1115, 2001.
[10] Sherratt, J.A. and Chaplain, M.A.J., A new mathematical model for avascular tumour growth,
Journal of Mathematical Biology, 43, 291-312, 2001.
[11] Tan, L.S. and Ang, K.C., A numerical simulation of avascular tumour growth, ANZIAM Journal,
46, C902-C917, 2005.
[12] Ward, J.P. and King, J.R., Mathematical modelling of avascular tumour growth, IMA Journal
of Mathematics Applied in Medicine and Biology, 14, 39-69, 1997.

Ang Keng Cheng


National Institute of Education, Nanyang Technological University.
e-mail: [email protected]
Proceedings of ”The 6th SEAMS-UGM Conference 2011”
pp. 19–28.

∗-RINGS IN RADICAL THEORY

Halina France-Jackson

Abstract. A semiprime ring A is called a ∗-ring if A/I is a prime radical ring for every nonzero
ideal I of A. In this survey paper we will show the versatility of ∗-rings by discussing
some open problems of radical theory which were solved with the aid of ∗-rings.
Keywords and Phrases: radical; strongly prime, prime essential, filial rings; atoms

1. INTRODUCTION
In this paper, all rings are associative and all classes of rings are closed under
isomorphisms and contain the one-element ring 0. The fundamental definitions and
properties of radicals can be found in Andrunakievich and Rjabukhin [2], Divinsky [4]
and Gardner and Wiegandt [14]. The notation I C A means that I is a two-sided ideal
of a ring A. For a class µ of rings, an ideal I of a ring A is called a µ-ideal if the
factor ring A/I is in the class µ and U (µ) denotes the class of all rings that cannot be
homomorphically mapped onto a nonzero ring in µ. As usual, for a radical γ, the γ
radical of a ring A is denoted by γ (A) and S (γ) = {A : γ (A) = 0} is the semisimple
class of γ. π denotes the class of all prime rings and β = U (π) denotes the prime
radical. A ring A is simple if it has no nonzero proper ideals, that is, for every I C A, either
I = 0 or I = A.
A ring R is called a ∗-ring France-Jackson [5] if it satisfies one of the following
equivalent conditions:
(1) R ∈ π and R/I ∉ π for every nonzero proper I C R;
(2) R ∈ S (β) and R/I ∈ β for every 0 ≠ I C R.
Example 1.1. Every simple prime ring is a ∗-ring.
Example 1.2. France-Jackson [5] $W = \left\{\frac{2x}{2y+1} : x, y \in \mathbb{Z} \text{ and } \gcd(2x, 2y+1) = 1\right\}$ is a Jacobson radical ∗-ring without minimal ideals.
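For instance (a verification sketch added here, not taken from [5]), every element of W is quasi-regular within W, which is why W coincides with its Jacobson radical: note that 1 ∉ W, and for nonzero w
$$w=\frac{2x}{2y+1}\in W \ \Longrightarrow\ w'=\frac{w}{w-1}=\frac{2x}{2x-(2y+1)}\in W, \qquad w+w'-ww'=\frac{w(w-1)+w-w^{2}}{w-1}=0,$$
so each w has a quasi-inverse w′ in W (the reduced fraction for w′ still has even numerator and odd denominator).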

2010 Mathematics Subject Classification: 16N80


Example 1.3. France-Jackson [9] A nonzero prime heart P H (R) of a prime ring R
is a ∗-ring, where P H (R) = ∩ {I : 0 6= I C R and R/I ∈ π}.
Example 1.4. France-Jackson [5] Let x be a nonzero element of the centre of a semiprime ring A, and let I be an ideal of Ax maximal with respect to having an empty intersection with the set {x^n : n > 1}; then Ax/I is a ∗-ring.
Example 1.5. Let M be the unique maximal ideal of a commutative local principal
ideal domain R with the identity element 1. If M 6= 0, then M is a ∗-ring which is not
a simple ring.

Proof. Since R is a local ring, the set of nonunits of R is the ideal M. Suppose I ⊊ M is a nonzero prime ideal of M. Then I C R and, since R is a commutative principal ideal domain, it follows that I = iR and M = nR for some 0 ≠ i ∈ I and n ∈ M. Then, since i = i1 ∈ I ⊊ M, it follows that i = nr for some r ∈ R. If r ∉ M, then r is a unit of R and then n = ir−1 ∈ I since I C R. But this implies that M ⊆ I ⊊ M, a contradiction. Thus r ∈ M. But then, since M/I is a prime commutative ring and 0 + I = i + I = (n + I)(r + I), it follows that n ∈ I or r ∈ I. But n ∈ I implies M ⊆ I ⊊ M, a contradiction, so we must have r ∈ I. Then r = is for some s ∈ R, which implies i1 = i = nr = n(is) = i(ns) since R is commutative. This implies 1 = ns since R has no zero divisors. This means that 1 ∈ M, which implies that M = R. This contradicts the maximality of M. Thus M has no nontrivial prime ideals. Moreover, since R is a prime ring, so is M, which implies that M is a ∗-ring.
Suppose M is a simple ring and let 0 ≠ m ∈ M. Then 0 ≠ mM C M because R is a commutative ring without zero divisors. But, since M is simple, this implies that mM = M. Then mx = m = m1 for some x ∈ M and, since R has no zero divisors, it follows that 1 = x ∈ M, which implies that M = R, a contradiction. Thus M is not a simple ring. □
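A concrete instance (added here for illustration, not taken from the paper): for a prime p, the localisation
$$R=\mathbb{Z}_{(p)}=\Big\{\tfrac{a}{b}\in\mathbb{Q} : p\nmid b\Big\},\qquad M=p\,\mathbb{Z}_{(p)}\neq 0,$$
is a commutative local principal ideal domain with identity and unique maximal ideal M, so by Example 1.5 the ring $p\,\mathbb{Z}_{(p)}$ is a ∗-ring which is not simple.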

In this survey paper we will show the versatility of ∗-rings by discussing some
open problems of radical theory which were solved with the aid of ∗-rings.

2. THE SECOND SECTION


2.1. Handelman’s and Lawrence’s question. A ring A is prime if for given r, t ∈
A\ {0} there exists s ∈ A such that rst 6= 0.
If for each nonzero t we restrict the choice of the s to a finite set F (independent
of r but dependening on t), then we have a ring stronger than prime. We therefore say
that a ring A is (left) strongly prime (or SP in short) Handelman and Lawrence [15] if
it satisfies one of the equivalent conditions:
(1) For each nonzero a ∈ A there is a finite set F ⊆ A such that xF a = 0 implies
x = 0.
(2) Every nonzero ideal I of A contains a finite subset G such that
{x ∈ A : xG = 0} = {0}.

Examples are domains, prime Goldie rings and simple rings with unity. Every
prime ring can be embedded in an SP ring. All SP rings are coefficient rings for some
primitive group rings and this was the initial motivation for their study.
Handelman and Lawrence [15] observed that if a prime ring A, whose centre is a field F, contains a nonzero nil ideal that is locally nilpotent as an F-algebra, then A is not SP. But, by the Golod–Shafarevich theorem (see Gardner and Wiegandt [14]), there exists a finitely generated nil algebra A which is not nilpotent and so not locally nilpotent. So they asked:
Can an SP ring with unity contain a nonzero nil ideal?
In Korolczuk [16] an essential extension of a ∗-ring which fits the prerequisites
was constructed as follows:
Let A be a finitely generated nil ring which is not nilpotent and let A^1 be the Dorroh extension of A, which is again finitely generated. Since for any natural number n, A^n is finitely generated, by Zorn's Lemma we can find an ideal I of A^1 which is maximal with respect to not containing any A^n. Then A^1/I is an SP ring with unity and, as A ∈ N implies N(A^1) ≠ 0, we have N(A^1/I) ≠ 0.
By the choice of I, for every nonzero prime ideal J/I of A^1/I we have A^n ⊆ J for some n. But, since A^1/J ≅ (A^1/I)/(J/I) ∈ π, then A ⊆ J. Thus 0 ≠ (A + I)/I is contained in the prime heart PH(A^1/I) of A^1/I, that is, the intersection of all nonzero prime ideals of A^1/I. But by France-Jackson [9], a nonzero prime heart of a prime ring is a ∗-ring, so A^1/I is an essential extension of its prime heart.

2.2. Ferrero’s questions. An ideal I of a ring A is essential (written I J A) if I ∩J 6= 0


for every 0 6= J C A. If I J A, we call A an essential extension of I.
A class µ ⊆ π is a special class if it is hereditary (I C A ∈ µ implies I ∈ µ) and
closed under essential extensions (I ∈ µ is an essential ideal of a ring A implies A ∈ µ).
The upper radical U (µ) of a special class µ is called a special radical.
For a homomorphically closed class µ of rings, L (µ) and Lsp (µ) denote the small-
est radical and the smallest special radical containing µ, respectively.
By Divinsky [4], for any partition (η, ς) of simple rings, L (η) ⊆ U (ς). Let (η, ς)
be a partition of simple prime rings. Since L (η) ⊆ Lsp (η) ⊆ U (s (ς)) ⊆ U (ς), where
s (ς) is the class of all prime and subdirectly irreducible rings with heart in ς , Ferrero
asked:
Are Lsp (η) and U (s (ς)) distinct?
By Tumurbat and Wiegandt [26], Lsp (η) = U (ρ ∪ s (ς)), where ρ is the class of
all prime rings without minimal ideals. Since the ∗-ring W is without minimal ideals,
we have W ∈ ρ \ s (ς) and it follows that W ∈ U (s (ς)) and W ∉ U (ρ ∪ s (ς)). So
Lsp (η) ⊊ U (s (ς)).
A radical γ is prime-like Tumurbat and France-Jackson [25] if for every prime ring
A, the polynomial ring A [x] is in S (γ). Examples include β and the smallest special
radical Lsp ({R}) generated by any commutative ∗-ring R which is Jacobson radical.

It is well known Tumurbat and Wiegandt [26] that if γ and δ are special radicals
that coincide on all simple prime rings and on all prime rings without minimal ideals,
then γ = δ. Since polynomial rings have no minimal ideals, Ferrero (see Tumurbat and Wiegandt [26]) asked:

Can two distinct special radicals coincide on all simple rings and on polynomial
rings A [x] for all rings A?

Since a radical γ ⊇ β is prime-like if and only if γ(A[x]) = β(A[x]) for every ring A (Tumurbat and France-Jackson [25]), the prime radical β and the smallest special radical Lsp({W}) generated by the ∗-ring W satisfy Ferrero's requirements (Tumurbat and France-Jackson [25]).

2.3. Special and supernilpotent atoms. It is well known Andrunakievich and


Rjabukhin [2] that the family S of all special radicals and the family K of all supernilpo-
tent (that is, hereditary and containing β) radicals form complete lattices with respect
to inclusion. Minimal elements of S (respectively K) are called special (respectively
supernilpotent) atoms. The smallest special (respectively supernilpotent) radical con-
taining a ring A is denoted by blA (respectively l A ).
The problem of a description of special (respectively supernilpotent) atoms, was
raised in Andrunakievich and Rjabukhin [2]. Then it was studied in Booth and France-
Jackson [3], France-Jackson [5], France-Jackson [6], France-Jackson [9], Korolczuk [16],
Korolczuk [17], Snider [23] and Puczylowski and Roszkowska [21]. Rjabukhin [22]
showed that for every prime simple ring A, blA (respectively l A ) is an atom of S (re-
spectively K). He asked:

Is every atom of the lattice S (respectively K) generated by a single simple idem-


potent ring?

In Korolczuk [17] (respectively France-Jackson [6]) it was proved that blW (respec-
tively l W ) is an atom of S (respectively K) and is not generated by a single simple
idempotent ring.
Since every special (respectively supernilpotent) radical generated by a nonzero
∗-ring is an atom in S Korolczuk [16] (respectively K) France-Jackson [6], Rjabukhin’s
question now becomes:

Can every atom in S (respectively K) be generated by a nonzero ∗-ring?

This question is still open. The following is also an open problem (and a very
difficult one):

Does the upper radical U (∗k ) generated by ∗k = {A : I J A for some I ∈ ∗} coincide with the prime radical β?

It was shown in France-Jackson [5] that the equality of β and U (∗k ) is equivalent
to the lattice S (respectively K) being atomic, with every atom the smallest special
(respectively supernilpotent) radical containing some ∗-ring.

2.4. Extraspecial radicals. Let ρ be a supernilpotent radical with a semisimple class


S. Gardner [13] called a nonzero ring A S-subdirectly irreducible if A is an essential
extension of a nonzero ring S ∈ S such that every proper homomorphic image of S is
in ρ. He called a supernilpotent radical ρ extraspecial if every ring A ∈ S is a subdirect
sum of S-subdirectly irreducible rings. Gardner asked:
Is the prime radical β extraspecial?
Since the class ∗k is precisely the class of all S-subdirectly irreducible rings with S
being the class of all semiprime rings, thus β = U (∗k ) is equivalent to the extraspeciality
of β.
2.5. Matric-extensible atoms. A radical ρ is matric-extensible if for every ring R, R is in ρ if and only if the ring Rn of all n × n matrices with entries from R is in ρ. It was shown in Booth and France-Jackson [3] that the class Lms of all matric-extensible special radicals is a complete sublattice of the lattice S of all special radicals. Minimal elements of Lms are called matric-extensible atoms. A natural question arises:
Does Lms contain atoms?
Let V be a vector space of countably infinite dimension, and let {u1 , v1 , u2 , v2 , ...}
be a basis for V . Let f be the linear self-map of V defined by f (ui ) = vi and f (vi ) = 0
for all natural numbers i. Let S be the ring of all linear self-maps of V of finite rank
and let A be the ring generated by S ∪ {f }. Then A is a subdirectly irreducible ring with the idempotent heart S and (A/S)^2 = 0 (Gardner [12]). Hence A is a nonsimple
∗-ring. It was shown in Booth and France-Jackson [3] that the smallest special radical
generated by A is an atom of Lms .
2.6. Supernilpotent left hereditary atoms and N-atoms. A radical ρ is left hered-
itary if every left ideal L of a ring A ∈ ρ is in ρ. A radical ρ is left strong if L ⊆ ρ (A)
for every left ideal L ∈ ρ of a ring A. A supernilpotent radical ρ which is both left
hereditary and left strong is called an N-radical. The smallest N-radical containing a
ring A is denoted by l nA . Roszkowska [24] studied supernilpotent left hereditary atoms,
that is minimal elements of the lattice of all supernilpotent left hereditary radicals as
well as N-atoms, that is minimal elements of the lattice of all N-radicals. She proved
the following:
Theorem 2.1. [Roszkowska [24], Theorem 4.5.9] Let 0 6= A ∈ π be a commutative ring.
The following conditions are equivalent: (1) l A is a supernilpotent atom. (2) l A is a supernilpotent left hereditary atom. (3) l nA is an N-atom.
Natural questions spring to mind:
Do supernilpotent left hereditary atoms exist?
Do N-atoms exist?
Since for every ∗-ring A, the radical l A is a supernilpotent atom, we have
Corollary 2.1. For every nonzero commutative ∗-ring A, l A is a supernilpotent left
hereditary atom and l nA is an N -atom.

2.7. Prime-like atoms. It was shown in Tumurbat and France-Jackson [25] that the
collection Lspl of all special and prime-like radicals is a complete sublattice of the
lattice S of all special radicals. Minimal elements of Lspl are called prime-like atoms.
It is natural to ask:
Do prime-like atoms exist?
Since blW is a special atom and it is a prime-like radical, it is clearly a prime-like
atom.
2.8. Supernilpotent nonspecial radicals. Almost nilpotent rings are rings whose
every proper homomorphic image is nilpotent. For example, the ∗-ring W is almost
nilpotent. Prime essential rings are semiprime rings whose every nonzero ideal is not
a prime ring.
Example 2.1. France-Jackson [7]
Let A be the ∗-ring W , let κ be an infinite cardinal number greater than the
cardinality of A and let W (κ) be the set of all finite words made from a well-ordered
alphabet of cardinality κ, lexicographically ordered. Then W (κ) is a semigroup with
multiplication defined by xy = max {x, y} and the semigroup ring A (W (κ)) is a nonzero
prime essential ring whose every prime homomorphic image is isomorphic to the ring
A.
Since special radicals are hereditary and they contain the prime radical β, every
special radical is supernilpotent. Therefore Andrunakievich [1] asked:
Is every supernilpotent radical special?
In France-Jackson [7] it was proved that any radical (other than β) whose semisim-
ple class contains all prime essential rings is nonspecial. This yields non-speciality of
certain known radicals such as, the lower radical L2 generated by the class of all almost
nilpotent rings and thus shows that L2 does not coincide with the antisimple radical
which answers a question of van Leeuwen and Heyman [20] in the negative. A natural
question arises:
Can a nonspecial radical contain a nonzero prime essential ring?
It was proved in France-Jackson [7] that the nonhereditary (and therefore non-special) Jenkins radical U ({all prime simple rings}) (Leavitt [18]) contains the nonzero nonsimple prime essential ring A (W (κ)) constructed in Example 2.1.
Let ρ be a supernilpotent radical. Let ρ∗ be the class of all rings A such that
either A is simple in ρ or the factor ring A/I is in ρ for every nonzero ideal I of A and
every minimal ideal M of A is in ρ. Let L (ρ∗ ) be the lower radical determined by ρ∗
and let ρϕ denote the upper radical determined by the class of all subdirectly irreducible
rings with ρ-semisimple hearts. Le Roux and Heyman [19] proved that ρ ⊆ L (ρ∗ ) ⊆ ρϕ
and L (G ∗ ) = G ϕ , where G is the Brown-McCoy radical. They asked:
Is it true that L (ρ∗ ) = ρϕ when ρ is replaced by the prime radical β, the locally
nilpotent radical L, the nil radical N or the Jacobson radical J , respectively?

In France-Jackson [11] it was proved that if ρ is a supernilpotent radical whose


semisimple class contains a nonzero nonsimple ∗-ring without minimal ideals, then L (ρ∗ )
is nonspecial and consequently L (ρ∗ ) 6= ρϕ and the question was answered in the
negative by constructing a nonzero, nonsimple ∗-ring without minimal ideals which is
Jacobson semisimple as follows:
Let F be a field of characteristic 0 which has an automorphism S such that no integral power of S is the identity automorphism. For example, F might be a field generated by the real numbers and an infinite number of independent variables labelled ..., x−2, x−1, x0, x1, x2, ..., with S the automorphism which leaves the real numbers alone and which sends xi into xi+1 for every i. Let R be the set of all polynomials in an indeterminate z of the form a0 + za1 + z^2 a2 + ... + z^n an, where ai ∈ F. Addition and multiplication of such polynomials are defined in the usual way except that z does not commute with the coefficients a. We define az = zS(a), where S(a) is the image of a under the automorphism S. Then az^m = z^m S^m(a) for any positive integer m.
Then this definition, together with the distributive law, makes R into a ring denoted by
F [z, S] and its ideal T = zR is a nonzero nonsimple primitive ∗-ring without minimal
ideals. Thus T ∈ S (J ) ⊆ S (N ) ⊆ S (L) ⊆ S (β).
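From the single relation az = zS(a) one obtains, by induction (a routine check added here for the reader), the multiplication rule for general monomials of R = F[z, S]:
$$a z^{m}=z^{m}S^{m}(a)\qquad\text{and hence}\qquad (z^{m}a)(z^{n}b)=z^{m}(a z^{n})b=z^{m+n}\,S^{n}(a)\,b \qquad (a,b\in F,\ m,n\geq 0).$$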

2.9. Tzintzis' questions. Tzintzis [27] introduced an interesting radical χ which is the upper radical generated by the class of all filial rings, that is, rings R such that
J C I C R implies J C R for all subrings J and I of R. He showed that χ contains the
class α of all idempotent Brown-McCoy radical rings R such that every nonzero prime
homomorphic image R0 of R has a nonzero centre. He asked:
Does χ coincide with α?
Does χ coincide with the class IL of all idempotent Levitzky radical rings?
Since the class of all filial rings is homomorphically closed and every idempotent
ring in β is in χ [27], nonsimple idempotent ∗-rings are contained in χ and therefore
they can be used to answer those questions.
The second question was answered in the negative in France-Jackson [8] by con-
structing a commutative nonsimple idempotent ∗-ring I which is χ-radical but Levitzki
semisimple as follows:
Let F be the field of all real numbers and G the multiplicative group of all
positive real numbers with the natural total ordering ≤. Consider the direct power
F^G regarded as a vector space over F. With every element f ∈ F^G we associate its support, that is, the set D(f) = {s ∈ G : f(s) ≠ 0}. Let A be the subset of F^G consisting of all elements with well-ordered support. Each member of A is a power series $f = \sum f(s)\,s$. The sum of two such power series again belongs to A, as the union of two well-ordered sets is again well-ordered. We define products in A by
$$fg = \Big(\sum f(s)\,s\Big)\Big(\sum g(t)\,t\Big) = \sum_{u}\Big(\sum_{st=u} f(s)g(t)\Big)u.$$
Then A is an algebra over F and I = {f ∈ A : f(s) = 0 for s ≤ 1} is a commutative idempotent and non-simple ∗-ring. So I ∈ α ⊆ χ and L (I) = β (I) = 0.
A negative answer to the first question was given in France-Jackson [10] by show-
ing that if A is a nonsimple idempotent ∗-ring and R = M (A) is the ring of all infinite

matrices with entries from A having only finitely many nonzero entries, then R is a
nonsimple idempotent ∗-ring with zero centre and so R ∈ χ \ α.

3. CONCLUDING REMARKS
Although many problems were solved using ∗ -rings, very little is known about
their structure. Thus there is a strong motivation for studying ∗-rings to determine
their properties.

References
[1] Andrunakievich V. A., Radicals of associative rings I, (in Russian), Mat. Sb. 44, 179-212, 1958.
[2] Andrunakievich V. A. and Rjabukhin Yu. M. , Radicals of algebras and structure theory, (in
Russian), Nauka, Moscow, 1979.
[3] Booth G. L. and France-Jackson H., On the lattice of matric-extensible radicals II, Acta Math.
Hungar. 112 (3), 187-199, 2006.
[4] Divinsky N, Rings and Radicals, Allen & Unwin: London, 1965.
[5] France-Jackson H., ∗-rings and their radicals, Quaestiones Math. 8 (3), 231-239, 1985.
[6] France-Jackson H., On atoms of the lattice of supernilpotent radicals, Quaestiones Math. 10
(3), 251-256, 1987.
[7] France-Jackson H., On prime essential rings, Bull. Austral. Math. Soc. Ser A, 47, 287-290, 1993.
[8] France-Jackson H., On the Tzintzis radical, Acta Math. Hungar. 67 (3), 261-263, 1995.
[9] France-Jackson H., Rings related to special atoms, Quaestiones Math. 24 (1), 105-109, 2001.
[10] France-Jackson H., On a nonsimple idempotent ∗-ring with zero centre, Acta Math. Hungar.
100 (4), 325-327, 2003.
[11] France-Jackson H., On supernilpotent nonspecial radicals, Bull. Austral. Math. Soc., 78, 107-
110, 2008.
[12] Gardner B. J., Small ideals in radical theory, Acta Math. Hungar., 43, 287-294, 1984.
[13] Gardner B. J., Some results and open problems concerning special radicals, in: Radical Theory
(Proceedings of the 1988 Sendai Conference, Sendai 24-30 July 1988), (ed. S. Kyuno) (Uchida
Rokakuho Pub. Co. Ltd, Tokyo, Japan, 1989) 25-26, 1989.
[14] Gardner B. J. and Wiegandt R. Radical Theory of Rings, Marcel Dekker Inc., New York, 2004.
[15] Handelman D. and Lawrence J., Strongly prime rings, Trans. Amer. Math. Soc. 211, 209-223,
1975.
[16] Korolczuk H., Lattices of Radicals of Rings, PhD thesis (in Polish), University of Warsaw, 1982.
[17] Korolczuk H., A note on the lattice of special radicals, Bull. Polish Acad. Sci. Math. 29, 103-104,
1981.
[18] Leavitt W. G. and Jenkins T. L., Non-hereditariness of the maximal ideal radical class, J.
Natur. Sci. Math. 7, 202-205, 1967.
[19] Le Roux H. J. and Heyman G. A. P., A question on the characterization of certain upper radical
classes, Boll. Unione Mat. Ital. Sez. A 17(5), 67-72, 1980.
[20] van Leeuwen L. C. A. and Heyman G. A. P., A radical determined by a class of almost nilpotent
rings, Acta Math. Hungar. 26, 259-262, 1975.
[21] Puczylowski E. R. and Roszkowska E, Atoms of lattices of radicals of associative rings, in
Radical Theory (Proceedings of the 1988 Sendai Conference, Sendai 24-30 July 1988), (ed. S.
Kyuno) (Uchida Rokakuho Pub. Co. Ltd, Tokyo, Japan, 1989) 123-134, 1989.
[22] Rjabukhin J. M., Overnilpotent and special radicals (in Russian), Algebry i moduli Mat. Issled.
48, Kishinev, 80-93, 1978
[23] Snider R. L., Lattices of radicals, Pacific J. Math., 40, 207-220, 1972.

[24] Roszkowska E., Lattices of Radicals of Associative Rings, PhD thesis (in Polish), University of
Warsaw, 1995.
[25] Tumurbat S. and France-Jackson H., On prime-like radicals, Bull. Austral. Math. Soc., 82,
113-119, 2010.
[26] Tumurbat S. and Wiegandt R., A note on special radicals and partitions of simple rings, Comm.
Algebra 30 (4), 1769-1777, 2002.
[27] Tzintzis G., An almost subidempotent radical property, Acta Math. Hungar. 49, 173-184, 1987.

Halina France-Jackson
Nelson Mandela Metropolitan University
e-mail: [email protected]
Proceedings of ”The 6th SEAMS-UGM Conference 2011”
pp. 29–40.

CLEAN RINGS AND CLEAN MODULES

Indah Emilia Wijayanti

Abstract. The intensive investigation of (strongly) n-clean rings and right (left) clean rings has been done by many authors, and in every case these notions display very similar behaviour. The author has generalized those notions by combining the definitions of n-clean rings and right clean rings, i.e. right n-clean rings. In this paper we give an overview of the properties of (strongly) n-clean rings and right n-clean rings, especially products, quotient rings, homomorphic images and matrices over a (strongly) n-clean ring or right n-clean ring. Furthermore, we also give an overview of some properties of (strongly) n-clean modules and right clean modules.

Keywords and Phrases: (strongly) n-clean rings, right clean rings, right n-clean rings,
(strongly) n-clean modules, right clean modules.

1. INTRODUCTION
Throughout, by a ring R we mean an associative ring with identity 1R. An element e in a ring R is called (strongly) clean if e can be decomposed into a sum of a (nonzero) idempotent element and a unit element. An element r in a ring R is called (strongly) n-clean if r can be decomposed into a sum of a (nonzero) idempotent element and n unit elements. A ring R is called strongly n-clean or n-clean if all its elements are strongly n-clean or n-clean, respectively.
Chen and Cui in [4] gave some characterizations of a (strongly) clean ring. Khak-
sari and Moghimi [6] introduced a slight generalization of clean rings to n-clean rings.
Moreover they also presented some properties of clean modules. A clean module is a
module whose endomorphism ring is a clean ring. Wang and Chen in [7] studied 2-clean rings and presented some properties which can be generalized to n-clean rings. Călugăreanu [1] defined a generalization of clean to right clean by replacing the units by right units. Furthermore, Wijayanti [8] proposed a more general definition, i.e.
right n-clean rings. In this paper we give an overview of the properties of strongly or

2010 Mathematics Subject Classification:


right n-clean rings, especially the product of strongly or right n-clean rings, the quo-
tient ring, homomorphic image of a strongly or right n-clean ring and matrices over a
strongly or right n-clean ring. Furthermore, we also give an overview of some properties
of (strongly) n-clean modules and right clean modules, following Camillo et al. [2] and Zhang [9].
A nonzero element s in the ring R is called right invertible if there exists t ∈ R
such that st = 1R . Right invertible elements are sometimes called right units, i.e. elements which have a right inverse but not necessarily a left inverse. An element e in a ring
R is called idempotent if e2 = e. We denote by Ur (R) the right invertible elements of
R and Id(R) the set of idempotent elements in R.
We recall the Pierce Decomposition, which plays an important role in this work. Let R be a ring with identity and {e1, e2, . . . , en} idempotent elements in R such that e1 + e2 + . . . + en = 1R. Then R can be decomposed into a direct sum of the ei Rej for every i, j = 1, 2, . . . , n, denoted by $R = \bigoplus_{i,j=1}^{n} e_iRe_j$. We call this decomposition the Pierce Decomposition of R. Sometimes we denote this decomposition as a generalized matrix:
$$R \simeq \begin{pmatrix} e_1Re_1 & e_1Re_2 & \cdots & e_1Re_n\\ e_2Re_1 & e_2Re_2 & \cdots & e_2Re_n\\ \vdots & \vdots & \ddots & \vdots\\ e_nRe_1 & e_nRe_2 & \cdots & e_nRe_n \end{pmatrix}$$
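As a small concrete illustration (added here, not taken from the paper), take R = M2(K), the ring of 2 × 2 matrices over a field K, and let e = E11 be the matrix unit; then
$$eRe=\begin{pmatrix} K & 0\\ 0 & 0\end{pmatrix},\quad eR(1-e)=\begin{pmatrix} 0 & K\\ 0 & 0\end{pmatrix},\quad (1-e)Re=\begin{pmatrix} 0 & 0\\ K & 0\end{pmatrix},\quad (1-e)R(1-e)=\begin{pmatrix} 0 & 0\\ 0 & K\end{pmatrix},$$
and R is the direct sum of these four subgroups, recovering the usual entrywise decomposition of a matrix ring.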

2. SOME PROPERTIES OF CLEAN RINGS


We recall some definitions which have an important role in our discussion.
Definition 2.1. (i) An element of a ring R is called n-clean if it is a sum of an
idempotent element and n invertible elements. A ring R is called n-clean if all its
elements are n-clean.
(ii) An element of a ring R is called strongly n-clean if it is a sum of a nonzero idempotent
element and n invertible elements. A ring R is called strongly n-clean if all its
elements are strongly n-clean.
(iii) An element of a ring R is called right n-clean if it is a sum of an idempotent
element and n right invertible elements. A ring R is called right n-clean if all its
elements are right n-clean.

We summarize some propositions showing that many properties of clean rings are shared by both of these generalizations and special cases; they are taken from [6] and [8].
Proposition 2.1. Let f : R → S be a ring homomorphism. The following assertions
are satisfied.
(i) If R is a (right) n-clean ring, then Im(f ) is a (right) n-clean ring.
(ii) If R is a strongly n-clean ring and Id(R) ∩ Ker(f ) = 0, then Im(f ) is a strongly
n-clean ring.

Proof. We recall the proof of assertion (i) from Proposition 2 of [8] and the proof of (ii)
from Proposition 2.4 of [6].
(i) For any r ∈ R there is an idempotent element e ∈ Id(R) and right invertible
elements u1 , u2 , . . . , un ∈ Ur (R) such that
r = e + u1 + · · · + un .
It implies
f (r) = f (e) + f (u1 ) + · · · + f (un ).
Since f (e) ∈ Id(S) and f (ui ) ∈ Ur (S) for all i’s, it is clear f (r) is a right n-clean element
and S is right n-clean.
(ii) Let s ∈ Im (f ), so there is r ∈ R such that f (r) = s. But as R is strongly n-clean, we have r = e + u1 + · · · + un , where e is a non-zero idempotent element and the ui are units. Therefore
s = f (r) = f (e + u1 + · · · + un ) = f (e) + f (u1 ) + · · · + f (un ).
Since f (e) 6= 0 and also an idempotent element in Im (f ) and f (ui )s are unit in Im (f ),
we conclude that Im (f ) is a strongly n-clean ring. 
Then we have immediately the following corollary.
Corollary 2.1. Let R, S and T be rings and consider the short exact sequence
0 → S → R → T → 0.
The following assertions are satisfied.
(i) If R is a (right) n-clean ring, then S and T are (right) n-clean rings.
(ii) If R is a strongly n-clean ring and Id(R) ∩ Ker(f ) = 0, then S and T are strongly
n-clean rings.
That is, if R is a (right) n-clean ring, then its ideals and quotient rings are also (right) n-clean rings. For strongly n-clean rings we need an additional condition, namely that no nonzero idempotent element is contained in the kernel. Now we give the following proposition.
Proposition 2.2. Let {Rλ}Λ be a family of rings. The following assertions are satisfied.
(i) The product of rings $\prod_{\Lambda} R_\lambda$ is a right n-clean ring if and only if each Rλ is right n-clean for all λ ∈ Λ.
(ii) If each Rλ is strongly n-clean for all λ ∈ Λ, then the product of rings $\prod_{\Lambda} R_\lambda$ is a strongly n-clean ring.
Proof. We prove assertion (i).
(⇒) It is clear from Proposition 2.1: the projection $p_\mu : \prod_{\Lambda} R_\lambda \to R_\mu$ is an epimorphism, and hence Rμ is right n-clean for all μ ∈ Λ.
(⇐) Assume now every Rλ is right n-clean. Take any (rλ)Λ in $\prod_{\Lambda} R_\lambda$. Since rλ is a right n-clean element, rλ = eλ + u1λ + · · · + unλ, where eλ ∈ Id(Rλ) and uiλ ∈ Ur(Rλ) for all λ ∈ Λ. Then we obtain
(rλ)Λ = (eλ + u1λ + · · · + unλ)Λ = (eλ)Λ + (u1λ)Λ + · · · + (unλ)Λ,
where $(e_\lambda)_\Lambda \in Id(\prod_{\Lambda} R_\lambda)$ and each $(u_{i\lambda})_\Lambda \in U_r(\prod_{\Lambda} R_\lambda)$. □

Thus, if a ring R is a direct sum of (right) n-clean rings, then R is a (right) n-clean ring. Consider the following counterexample showing that the converse of statement (ii) of Proposition 2.2 is not true.
Example 2.1. In Z3 we have 0 = 1 + 1 + 1, 1 = 1 + 1 + 2 and 2 = 1 + 2 + 2, so Z3 is strongly 2-clean. Also, Z3 × Z2 is strongly 2-clean. But Z2 is not strongly 2-clean. □
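Since all the rings in Example 2.1 are finite, the claims can be checked mechanically. The following brute-force Python sketch (added for illustration only, using the paper's convention that the idempotent must be nonzero) confirms them.

def zn(n):
    # The ring Z_n: (elements, addition, multiplication, zero, one) mod n.
    return list(range(n)), lambda a, b: (a + b) % n, lambda a, b: (a * b) % n, 0, 1 % n

def direct_product(R, S):
    # Componentwise ring structure on R x S.
    (eR, aR, mR, zR, oR), (eS, aS, mS, zS, oS) = R, S
    elems = [(r, s) for r in eR for s in eS]
    add = lambda x, y: (aR(x[0], y[0]), aS(x[1], y[1]))
    mul = lambda x, y: (mR(x[0], y[0]), mS(x[1], y[1]))
    return elems, add, mul, (zR, zS), (oR, oS)

def is_strongly_2_clean(ring):
    # Every element = nonzero idempotent + unit + unit (the definition used here).
    elems, add, mul, zero, one = ring
    units = [u for u in elems if any(mul(u, v) == one for v in elems)]
    idems = [e for e in elems if mul(e, e) == e and e != zero]
    return all(any(add(add(e, u), v) == r
                   for e in idems for u in units for v in units)
               for r in elems)

print(is_strongly_2_clean(zn(3)))                         # True
print(is_strongly_2_clean(zn(2)))                         # False
print(is_strongly_2_clean(direct_product(zn(3), zn(2))))  # True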

To obtain a necessary and sufficient condition for a strongly n-clean ring, we need a special condition, taken from Proposition 2.2 of [6] and shown in the following proposition.
Proposition 2.3. Let {Rλ} be a family of rings such that at least one of them is strongly n-clean and the others are n-clean rings; then $\prod_{\lambda\in\Lambda} R_\lambda$ is strongly n-clean.

In fact, if a ring is (right) n-clean, then it is also (right) m-clean for any m ≥ n, as we show in the next proposition. But this property does not hold for strongly n-clean rings, as the following counterexample shows. We already know that Z3 is strongly 2-clean. Since 2 ∈ Z3 cannot be expressed as a sum of an idempotent element and 3 units, Z3 is not strongly 3-clean.
Proposition 2.4. Let R be a ring. If an element r ∈ R is (right) n-clean, then r is
also (right) m-clean for any non-zero integer m ≥ n.

Proof. It is sufficient to prove that for any right n-clean element r in R, it is right
n + 1-clean. Let r be a right n-clean element in R, then r = e + u1 + u2 + · · · + un ,
where e ∈ Id(R) and ui ∈ Ur (R) for all i’s. Consider that e = (1 − e) + (2e − 1). Thus
we obtain
r = (1 − e) + (2e − 1) + u1 + u2 + · · · + un
where 2e − 1 ∈ Ur (R), i.e. (2e − 1)(2e − 1) = 1. 
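The last assertion is immediate (a one-line check added for completeness): since e is idempotent,
$$(2e-1)^{2}=4e^{2}-4e+1=4e-4e+1=1.$$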

The next investigation is to look for the relationship between a strongly or right
n-clean ring with matrices over it.
Proposition 2.5. Let A, B be rings, C an (A, B)-bimodule and
$$R = \begin{pmatrix} A & C\\ 0 & B\end{pmatrix}.$$
R is (right) n-clean if and only if A and B are (right) n-clean.

Proof. (⇒) We can construct epimorphisms f : R → A by
$$f(r) = f\begin{pmatrix} a & c\\ 0 & b\end{pmatrix} = a,$$
and g : R → B by
$$g(r) = g\begin{pmatrix} a & c\\ 0 & b\end{pmatrix} = b,$$
for every r ∈ R; according to Proposition 2.1 it is clear that A and B are right n-clean.
(⇐) Now take any $\begin{pmatrix} a & c\\ 0 & b\end{pmatrix} \in R$, where a ∈ A and b ∈ B. Since A and B are right n-clean, there exist an idempotent $e_a \in A$ and right units $u_{a1}, \ldots, u_{an} \in A$ such that $a = e_a + u_{a1} + \cdots + u_{an}$. Also there exist an idempotent $e_b \in B$ and right units $u_{b1}, \ldots, u_{bn} \in B$ such that $b = e_b + u_{b1} + \cdots + u_{bn}$. Moreover,
$$\begin{pmatrix} a & c\\ 0 & b\end{pmatrix} = \begin{pmatrix} e_a & 0\\ 0 & e_b\end{pmatrix} + \begin{pmatrix} u_{a1} & c\\ 0 & u_{b1}\end{pmatrix} + \begin{pmatrix} u_{a2} & 0\\ 0 & u_{b2}\end{pmatrix} + \cdots + \begin{pmatrix} u_{an} & 0\\ 0 & u_{bn}\end{pmatrix},$$
where $\begin{pmatrix} e_a & 0\\ 0 & e_b\end{pmatrix}$ is idempotent and the $\begin{pmatrix} u_{ai} & 0\\ 0 & u_{bi}\end{pmatrix}$ are right units for all i. Consider now
$$\begin{pmatrix} u_{a1} & c\\ 0 & u_{b1}\end{pmatrix}\begin{pmatrix} u_{a1}^{-1} & -u_{a1}^{-1}c\,u_{b1}^{-1}\\ 0 & u_{b1}^{-1}\end{pmatrix} = \begin{pmatrix} 1 & 0\\ 0 & 1\end{pmatrix},$$
where $u_{a1}^{-1}$ and $u_{b1}^{-1}$ are right inverses of $u_{a1}$ and $u_{b1}$ respectively. We conclude that $\begin{pmatrix} u_{a1} & c\\ 0 & u_{b1}\end{pmatrix}$ is a right unit and R is right n-clean. □
is a right unit and R is right n-clean. 

We now give sufficient conditions for a ring to be n-clean, strongly n-clean or right n-clean, respectively.
Proposition 2.6. Let e be an idempotent element in R. If eRe and (1 − e)R(1 − e)
are strongly or right n-clean, then R is a strongly or right n-clean ring.

Proof. By the Pierce Decomposition, as mentioned in the previous section,
$$R \simeq \begin{pmatrix} eRe & eR(1-e)\\ (1-e)Re & (1-e)R(1-e)\end{pmatrix}.$$
Let $A = \begin{pmatrix} a & x\\ y & b\end{pmatrix}$ in R, where
$$a = f + u_1 + \cdots + u_n,$$
f is an idempotent element in eRe and the $u_i$ are right units in eRe. Consider that $b - yu_1^{-1}x \in (1-e)R(1-e)$. Since (1−e)R(1−e) is right n-clean, there exist an idempotent element g and right units $v_i$ in (1−e)R(1−e) such that
$$b - yu_1^{-1}x = g + v_1 + \cdots + v_n, \qquad b = g + v_1 + \cdots + v_n + yu_1^{-1}x.$$
Hence
$$A = \begin{pmatrix} a & x\\ y & b\end{pmatrix} = \begin{pmatrix} f + u_1 + \cdots + u_n & x\\ y & g + v_1 + \cdots + v_n + yu_1^{-1}x\end{pmatrix} = \begin{pmatrix} f & 0\\ 0 & g\end{pmatrix} + \begin{pmatrix} u_1 & x\\ y & v_n + yu_1^{-1}x\end{pmatrix} + \begin{pmatrix} u_2 & 0\\ 0 & v_1\end{pmatrix} + \cdots + \begin{pmatrix} u_n & 0\\ 0 & v_{n-1}\end{pmatrix}.$$
It is sufficient to show that $\begin{pmatrix} u_1 & x\\ y & v_n + yu_1^{-1}x\end{pmatrix}$ is a right unit. But
$$\begin{pmatrix} u_1 & x\\ y & v_n + yu_1^{-1}x\end{pmatrix}\begin{pmatrix} u_1^{-1} + u_1^{-1}xv_n^{-1}yu_1^{-1} & -u_1^{-1}xv_n^{-1}\\ -v_n^{-1}yu_1^{-1} & v_n^{-1}\end{pmatrix} = 1$$
and
$$\begin{pmatrix} u_1^{-1} + u_1^{-1}xv_n^{-1}yu_1^{-1} & -u_1^{-1}xv_n^{-1}\\ -v_n^{-1}yu_1^{-1} & v_n^{-1}\end{pmatrix}\begin{pmatrix} u_1 & x\\ y & v_n + yu_1^{-1}x\end{pmatrix} = 1$$
as needed. Then R is right n-clean. □
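For the reader's convenience (this verification is not spelled out in the original), the first of the two products above reduces to the identity using only the right-inverse relations $u_1u_1^{-1}=e$ and $v_nv_n^{-1}=1-e$, together with $ex = x$ and $(1-e)y = y$, since $x \in eR(1-e)$ and $y \in (1-e)Re$:
$$
\begin{aligned}
u_1\big(u_1^{-1}+u_1^{-1}xv_n^{-1}yu_1^{-1}\big)+x\big(-v_n^{-1}yu_1^{-1}\big)&=e,\\
u_1\big(-u_1^{-1}xv_n^{-1}\big)+xv_n^{-1}&=0,\\
y\big(u_1^{-1}+u_1^{-1}xv_n^{-1}yu_1^{-1}\big)+\big(v_n+yu_1^{-1}x\big)\big(-v_n^{-1}yu_1^{-1}\big)&=0,\\
y\big(-u_1^{-1}xv_n^{-1}\big)+\big(v_n+yu_1^{-1}x\big)v_n^{-1}&=1-e,
\end{aligned}
$$
so the product is the identity $\mathrm{diag}(e,\,1-e)$ of the generalized matrix ring.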

The situation in Proposition 2.6 can be generalized as follows.


Corollary 2.2. If e1 and e2 are idempotent elements which are orthogonal, e1 + e2 = 1
and each ei Rei is strongly or right n-clean, then R is strongly or right n-clean.
Moreover we have a generalization of Proposition 2.6.
Proposition 2.7. Let ei ’s be idempotent elements which are pairwise orthogonal in R.
If each ei Rei is strongly or right n-clean, i = 1, 2, . . . , n, then generalized matrix T (R)
is strongly or right n-clean.

Proof. Let each ei Rei be strongly or right n-clean, i = 1, 2, . . . , n. We prove it by induction on the matrix order. It is clear for 1 × 1 matrices. Assume the assertion holds for matrices of order k, where k < n. Consider the generalized matrix
$$T(R) = \begin{pmatrix} e_1Re_1 & e_1Re_2 & \cdots & e_1Re_n\\ e_2Re_1 & e_2Re_2 & \cdots & e_2Re_n\\ \vdots & \vdots & \ddots & \vdots\\ e_nRe_1 & e_nRe_2 & \cdots & e_nRe_n\end{pmatrix}.$$
We make a partition of T(R) as we can see below:
$$T(R) = \left(\begin{array}{c|ccc} e_1Re_1 & e_1Re_2 & \cdots & e_1Re_n\\ \hline e_2Re_1 & e_2Re_2 & \cdots & e_2Re_n\\ \vdots & \vdots & & \vdots\\ e_nRe_1 & e_nRe_2 & \cdots & e_nRe_n\end{array}\right).$$
Denote
$$M = \begin{pmatrix} e_1Re_2 & \cdots & e_1Re_n\end{pmatrix}, \qquad N = \begin{pmatrix} e_2Re_1\\ \vdots\\ e_nRe_1\end{pmatrix}, \qquad B = \begin{pmatrix} e_2Re_2 & \cdots & e_2Re_n\\ \vdots & \ddots & \vdots\\ e_nRe_2 & \cdots & e_nRe_n\end{pmatrix},$$
so that $T(R) = \begin{pmatrix} e_1Re_1 & M\\ N & B\end{pmatrix}$. Take any element of T(R), say $r = \begin{pmatrix} a & x\\ y & b\end{pmatrix}$, where a ∈ e1Re1, x ∈ M, y ∈ N and b ∈ B. Since e1Re1 and B are right n-clean, we can find f ∈ Id(e1Re1) and $u_i \in U_r(e_1Re_1)$ such that $a = f + u_1 + u_2 + \cdots + u_n$. Now consider that $yu_1^{-1}x \in B$, for $u_1^{-1} \in e_1Re_1$ satisfies $u_1u_1^{-1} = 1_{e_1Re_1}$. Thus $b - yu_1^{-1}x \in B$ and we have
$$r = \begin{pmatrix} a & x\\ y & b\end{pmatrix} = \begin{pmatrix} f + u_1 + u_2 + \cdots + u_n & x\\ y & g + v_1 + v_2 + \cdots + v_n + yu_1^{-1}x\end{pmatrix} = \begin{pmatrix} f & 0\\ 0 & g\end{pmatrix} + \begin{pmatrix} u_1 & x\\ y & v_n + yu_1^{-1}x\end{pmatrix} + \begin{pmatrix} u_2 & 0\\ 0 & v_1\end{pmatrix} + \cdots + \begin{pmatrix} u_n & 0\\ 0 & v_{n-1}\end{pmatrix},$$
where g ∈ Id(B) and the $v_i$ are in $U_r(B)$. Similarly to the argument in Proposition 2.6 we conclude that $\begin{pmatrix} u_1 & x\\ y & v_n + yu_1^{-1}x\end{pmatrix}$ is right invertible in T(R). □

The converse of Proposition 2.6 is not true, since we cannot always obtain a right n-clean element ere ∈ eRe even though r ∈ R is a right n-clean element. Moreover, necessary and sufficient conditions for clean ideals have been investigated by Chen and Chen in [3].
Proposition 2.8. Let ei ’s be idempotent elements in a ring R, i = 1, 2, . . . , n, which
are pairwise orthogonal and e1 + e2 + · · · + en = 1R . If each ei Rei is strongly or right
n-clean, i = 1, 2, . . . , n, then R is strongly or right n-clean.

The next proposition gives a simpler condition, i.e. the idempotent elements need not be pairwise orthogonal. The sketch of its proof is quite similar to the proof of Proposition 2.7, so we skip the proof in this note.
Proposition 2.9. Let ei ’s be idempotent elements, i = 1, 2, . . . , n. If each ei Rei is
strongly or right n-clean, then generalized matrix T (R) is strongly or right n-clean.

An important result of this work is the following proposition, in which we prove that the clean properties of a ring can be transferred to the matrices over this ring.
Proposition 2.10. If R is a strongly or right n-clean ring, ei ’s are idempotent elements
in R, i = 1, 2, . . . , n, then the n × n matrices over R is also strongly or right n-clean.

Proof. Let ei = 1 for i = 1, 2, . . . , n. According to Proposition 2.9, the generalized matrix
$$T(R) \simeq \bigoplus_{i,j=1}^{n} R \simeq \begin{pmatrix} R & R & \cdots & R\\ R & R & \cdots & R\\ \vdots & \vdots & \ddots & \vdots\\ R & R & \cdots & R\end{pmatrix}$$

is strongly or right n-clean. Then for any n × n matrix A over R, A ∈ T (R). Since
T (R) is strongly or right n-clean, A is also strongly or right n-clean. 

3. CLEAN MODULES
We give some definitions of clean modules as follow.
Definition 3.1. An R-module M is called a strongly or right n-clean module if EndR (M ) is a strongly or right n-clean ring.
If n = 1, then we obtain a strongly or right clean module. One example of a clean module is a continuous module (see Camillo et al. [2] and Haily-Rahnaoui [5]).
Several authors have investigated the properties of strongly clean modules (Khaksari-Moghimi [6], Zhang [9]) and right clean modules (Călugăreanu [1]).
For further investigation of strongly or right clean modules, we refer to the properties of strongly or right clean rings. For example, motivated by Corollary 2.10 of Khaksari-Moghimi [6], we have the following interpretation.
Proposition 3.1. If {Mi}, i = 1, . . . , n, is a family of strongly or right n-clean modules, M = M1 ⊕ M2 ⊕ · · · ⊕ Mn and, for any f : Mi → Mj, f = 0 if i ≠ j, then M is strongly or right n-clean.
Proof. Let M = M1 ⊕ M2 ⊕ · · · ⊕ Mn and consider
EndR (M ) = EndR (M1 ⊕ M2 ⊕ · · · ⊕ Mn ) = EndR (M1 ) ⊕ EndR (M2 ) ⊕ · · · ⊕ EndR (Mn ).
Since every EndR (Mi ) is a strongly or right n-clean ring, according to Proposition 2.2, EndR (M ), as a product of the EndR (Mi ), is a strongly or right n-clean ring. Hence M is a strongly or right n-clean module. □
Proposition 3.2. If M is a strongly or right n-clean module, S = EndR (M ) and S
contains an idempotent element, then n × n matrices over S is also a strongly or right
n-clean ring.
Proof. Since M is a strongly or right n-clean module, S is a strongly or right n-clean ring. Then we apply Proposition 2.6 to obtain that the n × n matrices over S are also strongly or right n-clean. □
We recall the important result of Camillo et. al. [2] (Proposition 2.2 and 2.3),
which gave a necessary and sufficient condition for clean and strongly clean element.
Proposition 3.3. Let M be an R-module, S = EndR (M ) and e ∈ S an idempotent
element. Denote A = Ker(e) and B = Im(e).
(i) An element f ∈ S is clean if and only if there exists a decomposition M = C ⊕ D
such that f (A) ⊆ C, (1 − f )B ⊆ D and both f : A → C and (1 − f ) : B → D are
isomorphisms.
(ii) An element f ∈ S is strongly clean if and only if there exists a decomposition
M = A ⊕ B such that f (A) ⊆ A, (1 − f )B ⊆ B and both f : A → A and
(1 − f ) : B → B are isomorphisms.

Based on these results, in Theorem 6 of [9], Zhang characterized strongly clean modules by direct sum decompositions.
Proposition 3.4. An R-module M is strongly clean if and only if M′ ⊕ B = A1 ⊕ A2, where M′ ≅ M and there exist decompositions M′ = M1 ⊕ M2, B = B1 ⊕ B2 and Ai = Ci ⊕ Di, i = 1, 2, such that M1 ⊕ B1 = C1 ⊕ D2 = M1 ⊕ C1 and M2 ⊕ B2 = D1 ⊕ C2 = M2 ⊕ C2.

Moreover, Călugăreanu in Proposition 8 of [1] gave a result in the right 1-clean (right clean) version, as follows.
Proposition 3.5. Let M be an R-module, S = EndR (M ) and e ∈ S an idempotent element. Denote A = Ker(e) and B = Im(e). An element f ∈ S is right clean if and only if there exists a decomposition M = C ⊕ D such that f (A) ⊆ C, (1 − f )B ⊆ D and both f : A → C and (1 − f ) : B → D are monomorphisms, f (A) ∩ (1 − f )B = 0 and the monomorphism f |A ⊕ (1 − f )|B : A ⊕ B → f (A) ⊕ (1 − f )(B) has a left inverse in S.

Under some conditions, we next show that over a (strongly or right) n-clean ring R certain R-modules M are also strongly or right n-clean.
Proposition 3.6. Let M be a finite rank free R-module. If R is (strongly or right) n-clean, then so is M.
Proof. Let S = EndR (M ). Since M is a finite rank free R-module, S = Rk , where k is the rank of M . Moreover, since R is (strongly or right) n-clean, S = Rk is also (strongly or right) n-clean. Hence M is a (strongly or right) n-clean module. □

Proposition 3.7. If M, N are strongly clean free R-modules and R is a commutative strongly n-clean ring, then HomR (M, N ) is a strongly n-clean module over EndR (M ).
Proof. It is clear that HomR (M, N ) is an EndR (M )-module under the following scalar multiplication:
EndR (M ) × HomR (M, N ) → HomR (M, N ), (f, g) ↦ g ◦ f.
Since M is a strongly n-clean module, EndR (M ) is strongly n-clean. According to a result in [6], HomR (M, N ) is also a strongly n-clean module. □

We recall some results in the paper of Khaksari-Moghimi [6] which are related to strongly n-clean modules.
Proposition 3.8. ([6] Lemma 3.4) Let m be a nonzero positive integer and M a free strongly n-clean module with finite rank m. Then any free module whose finite rank is divisible by m is a strongly n-clean module.
Proposition 3.8 is restricted to finite rank free modules. But this condition can be generalized to free modules of countably infinite rank.

Proposition 3.9. ([6] Theorem 3.4) Let m be a nonzero positive integer and suppose that any free R-module with finite rank m is a strongly n-clean module. If $M = \bigoplus_{i<\omega} Re_i$ is a free R-module with countably infinite rank, then the endomorphism ring of M is a strongly n-clean ring.
According to the definition of strongly n-clean module we obtain the following
conclusion.
Corollary 3.1. Let m be a nonzero positive integer and suppose that any free R-module with finite rank m is a strongly n-clean module. If $M = \bigoplus_{i<\omega} Re_i$ is a free R-module with countably infinite rank, then M is a strongly n-clean module.
Furthermore, the properties in Corollary 3.1 can be generalized as follows.
Proposition 3.10. ([6] Theorem 3.5) If R is a strongly 2-clean ring (n = 2), then every free R-module with uncountably infinite rank is strongly n-clean.
It is well known that $\prod \mathrm{End}_R(M) \simeq \mathrm{End}_R(\prod M)$, so that by Proposition 2.2 we have:
Proposition 3.11. If M is a strongly n-clean module, then a finite product of M, i.e. $\prod_I M$ where I is a finite index set, is also a strongly n-clean module.
Proof. Since M is a strongly n-clean module, EndR (M ) is a strongly n-clean ring. By the fact that $\prod \mathrm{End}_R(M) \simeq \mathrm{End}_R(\prod M)$ and applying Proposition 2.2, we conclude that $\mathrm{End}_R(\prod M)$ is a strongly n-clean ring. Hence $\prod_I M$ is a strongly n-clean module. □
The next proposition gives a more general situation.
Proposition 3.12. If {Mλ} is a family of R-modules and there exists a strongly n-clean module in this family, say Mλ0, then $\prod_\Lambda M_\lambda$, where Λ is a finite index set, is a strongly n-clean module.
Proof. Since there exists a strongly n-clean module Mλ0, EndR (Mλ0 ) is a strongly n-clean ring. Moreover, by the fact that $\prod \mathrm{End}_R(M_\lambda) \simeq \mathrm{End}_R(\prod M_\lambda)$ and by applying Proposition 2.2, we have that $\mathrm{End}_R(\prod M_\lambda)$ is a strongly n-clean ring. Hence $\prod_\Lambda M_\lambda$ is a strongly n-clean module. □
Relating the cleanness of modules to short exact sequences, we obtain the following result.
Proposition 3.13. Any projective module P which is strongly n-clean is a direct summand of a strongly n-clean module.
Proof. A free presentation of a projective module P is
$$0 \to \operatorname{Ker}\psi \to F \xrightarrow{\ \psi\ } P \to 0.$$
Since P is projective, this short exact sequence splits, so we have F ≅ Ker(ψ) ⊕ P. □

4. CONCLUDING REMARKS
There are some open problems related to the cleanness of modules. One of the
crucial investigations is an alternative definition of clean modules which is more natural
than by using the endomorphism ring. In case we still apply the recent definition,
we might observe the necessary and sufficient condition for an n-clean element in an
endomorphism ring. Also the investigation of the properties of free module and n-clean
module has not been done.

References
[1] Călugăreanu, G., One-sided Clean Rings, Studia Universitatis Babes-Bolyai, Vol. 55, No. 3,
2010.
[2] Camillo, V.P., Khurana, D., Lam, T.Y., Nicholson, W.K., Continuous Modules are Clean,
Journal of Algebra, 304 No.1, 94 - 111, 2006.
[3] Chen, H. and Chen, M., On Clean Ideals, International Journal of Mathematics and Mathe-
matical Sciences (IJMMS), 62, 3949 - 3956, 2003.
[4] Chen, W. and Cui, S., On Clean Rings and Clean Elements, Southeast Asian Bulletin of Math-
ematics, 32, 855-861, 2008.
[5] Haily, A. and Rahnaoui, H., Endomorphisms of Continuous Modules with Some Chain Condi-
tions, International Journal of Algebra, 4 (8), 397 - 402, 2010.
[6] Khaksari, A. and Moghimi, G., Some Results on Clean Rings and Modules, World Applied
Sciences Journal, 6 (10), 1384 - 1387, 2009.
[7] Wang, Z. and Chen, J.L., 2-Clean Rings, arXiv:math/0610918v1 [math.RA] 30 Oct 2006.
[8] Wijayanti, I.E., On Right n-Clean Rings, submitted to Jurnal Matematika dan Sains (JMS) ITB,
2011.
[9] Zhang,H., On Strongly Clean Modules, Communications in Algebra, 37(4), 1420-1427, 2009.

Indah Emilia Wijayanti:


Department of Mathematics, Gadjah Mada University,
Sekip Utara, Yogyakarta, 55281, Indonesia.
e-mails: ind [email protected], ind [email protected]
Proceedings of ”The 6th SEAMS-UGM Conference 2011”
pp. 41–50.

RESEARCH ON NAKAYAMA ALGEBRAS

Intan Muchtadi-Alamsyah

Abstract. A Nakayama algebra is an algebra that is both right and left serial. In this
paper we explain some research on Nakayama algebras that has been conducted: a construction of an explicit tilting complex that gives a derived equivalence between symmetric
Nakayama algebras and Brauer tree algebras. Then we explain our ongoing research
on Nakayama algebras in group representation theory based on the above result. For a
class of non-symmetric Nakayama algebras we explain our ongoing research on Nakayama
algebras with mutation.

Keywords and Phrases: Nakayama algebra, Brauer tree algebra, derived equivalence,
cluster tilted algebra, mutation quiver.

1. INTRODUCTION
A Nakayama algebra is an algebra that is both right and left serial. Nakayama
algebras are central in the representation theory of finite dimensional algebras. They
have a well understood module category, with particularly nice combinatorial properties.
For an algebraically closed field k, they are given as path algebras of finite quivers (the
Gabriel quivers) modulo ideals generated by linear combinations of paths. Moreover,
their Gabriel quivers are either of Dynkin type An or a finite oriented cycle.
An explicit tilting complex can be constructed that gives equivalence between the
symmetric Nakayama algebra and the Brauer tree algebra associated to a line without
exceptional vertex. They are derived equivalent based on the method in [19] and also using the Green orders corresponding to these algebras [17, section 4.4]. An application is the result in [20], where this equivalence is used in the representation theory of braid groups. Moreover, by Rickard [22, Theorem 4.2], up to derived equivalence, a Brauer
tree algebra is determined by the number of edges of the Brauer tree and the multiplicity
of the exceptional vertex. Hence, an arbitrary Brauer tree algebra is derived equivalent

2010 Mathematics Subject Classification: 13D09, 16G20, 20C05


to a Brauer star algebra associated with a star having the same number of edges, i.e.
the Nakayama algebra with m simple modules and Loewy length n where m divides n.
In recent years, a major direction within representation theory has been the study
of cluster tilted algebras. These algebras occur as endomorphism algebras of certain
objects in triangulated Hom-finite categories related to derived categories. For these
algebras there is a concept of mutation. This notion is related to mutation of quivers
with (super-)potentials occurring in mathematical physics. For two algebras which
are related by a single mutation, their module categories have a similar structure, the
algebras are said to be ”nearly Morita equivalent”. It is an interesting general problem
to understand and explore algebras which are nearly Morita equivalent to algebras with
a well understood module category such as Nakayama algebras. In our research the
intersection between Nakayama algebras and cluster tilted algebras is explored.
In group representation theory, by a result of Dade [11], blocks with cyclic defect groups are Brauer tree algebras and the Brauer correspondents of these blocks are Nakayama algebras. Hence the derived equivalence between Brauer tree algebras and Nakayama algebras gives a new alternative approach to Broué's conjecture [6] for the case of blocks with cyclic defect groups. Research on invariants of derived equivalences has been conducted, the most recent by Zimmermann [25], and our research in this area concerns the invariance of the p-regular subspace of blocks with abelian defect groups. Fan and Kulshammer [12], by using perfect isometries, have shown the invariance of these subspaces. By using the derived equivalence between Brauer tree algebras and Nakayama algebras, one may expect an alternative proof of this invariance to the method of Fan and Kulshammer.

2. NAKAYAMA ALGEBRAS
A quiver Q is a quadruple (Q0 , Q1 , s, t) where Q0 is the set of vertices (points),
Q1 is the set of arrows and for each arrow α ∈ Q1 , the vertices s(α) and t(α) are the
source and the target of α, respectively (see [3]). If i and j are vertices, an (oriented)
path in Q of length m from i to j is a formal composition of arrows
p = α1 α2 · · · αm
where s(α1 ) = i, t(αm ) = j and t(αk−1 ) = s(αk ), for k = 2, · · · , m. To any vertex i ∈ Q0
we attach a trivial path of length 0, say ei , starting and ending at i such that for any
arrow α (resp. β) such that s(α) = i (resp. t(β) = i) then ei α = α (resp. βei = β). We
identify the set of vertices and the set of trivial paths.
Let KQ be the K-vector space generated by the set of all paths in Q. Then KQ can be endowed with the structure of a K-algebra with multiplication induced by concatenation of paths, that is,

$$(\beta_1\beta_2\cdots\beta_n)(\alpha_1\alpha_2\cdots\alpha_n) = \begin{cases}\beta_1\beta_2\cdots\beta_n\alpha_1\alpha_2\cdots\alpha_n, & \text{if } t(\beta_n) = s(\alpha_1),\\ 0, & \text{otherwise.}\end{cases}$$
KQ is called the path algebra of the quiver Q. The algebra KQ can be graded by
KQ = KQ0 ⊕ KQ1 ⊕ · · · ⊕ KQm ⊕ · · · ,

where Qm is the set of all paths of length m.
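As a small worked example (added here for illustration), consider the linear quiver $1\xrightarrow{\ \alpha\ }2\xrightarrow{\ \beta\ }3$ of type $A_3$. Its paths are $e_1, e_2, e_3, \alpha, \beta$ and $\alpha\beta$, so $\dim_K KQ = 6$, and the multiplication rule above gives, for instance,
$$e_1\alpha=\alpha=\alpha e_2,\qquad \alpha\beta\ \text{is the path of length 2 from }1\text{ to }3,\qquad \beta\alpha=0\ \ (\text{since } t(\beta)=3\neq 1=s(\alpha)).$$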

Definition 2.1. Let Q be a finite connected quiver. The ideal of the path algebra KQ generated by the arrows of Q is called the arrow ideal and denoted by RQ.
Definition 2.2. Let Q be a finite quiver and RQ be the arrow ideal in the path algebra KQ. An ideal I in KQ is admissible if there exists m ≥ 2 such that
$$R_Q^m \subseteq I \subseteq R_Q^2.$$
If I is an admissible ideal in KQ, (Q, I) is called a bound quiver. The quotient algebra KQ/I is called a bound path algebra.

A finite dimensional algebra A over an algebraically closed field K is called basic


if the quotient algebra of A modulo the Jacobson radical is isomorphic to a product of
K as K-algebras. A theorem due to Gabriel says that a basic K-algebra is isomorphic
to the factor algebra of the path algebra KQA by an admissible ideal, where QA is the
quiver of A (see [3], [4]). Since any finite dimensional algebra is Morita equivalent to a
uniquely determined basic algebra, it follows that any finite dimensional algebra A over
an algebraically closed field is Morita equivalent to KQA modulo an admissible ideal.
An algebra A is called a Nakayama algebra if it is both right and left serial. That
is, A is a Nakayama algebra if and only if every indecomposable projective A-module
and every indecomposable injective A-module are uniserial.

Theorem 2.1. [3, Theorem 3.2] A basic and connected algebra A is a Nakayama algebra
if and only if its quiver QA is one of the following quivers:
(1) An quivers

(2) Cyclic quivers

Proposition 2.1. [3, Proposition 3.8] Let A be a basic and connected algebra, which
is not isomorphic to K. Then A is a self-injective Nakayama algebra if and only if
A ≅ KQ/I, where Q is the cyclic quiver with m vertices

with m ≥ 1 and I = Rn for some n ≥ 2, where R denotes the arrow ideal of KQ.

We denote KQ/I in the previous proposition by Nnm . The algebra Nnm is sym-
metric if and only if m divides n. For the Nakayama algebra Nnn, the paths ei of
length 0 are mutually orthogonal idempotents and their sum is the unit element. The
Loewy series of the indecomposable projective modules Pi are as follows:
$$P_i = \begin{matrix} S_i \\ S_{i+1} \\ \vdots \\ S_n \\ S_1 \\ S_2 \\ \vdots \\ S_i \end{matrix}$$

2.1. Brauer Tree Algebras. Let G be a finite connected tree with a cyclic ordering
of the edges adjacent to a given vertex and with a particular vertex v, the exceptional
vertex, and a positive integer m, the multiplicity of the exceptional vertex. To this data
(G, v, m) one associates a finite dimensional symmetric algebra, called a Brauer tree
algebra, characterized up to Morita equivalence by the following properties:
(1) The isomorphism classes of simple modules are parametrized by the edges of
G. Denote by Pj a projective cover of a simple module Sj corresponding to an
edge j. Then rad(Pj ) = soc(Pj ) is the direct sum of two uniserial modules Ua
and Ub where a and b are the vertices of j.
(2) For c in {a, b}, let j = j0 , j1 , . . . , jr be the cyclic ordering of the r+1 edges around c.
Then the composition factors of Uc , starting from the top, are Sj1 , Sj2 , . . . , Sjr , Sj0 , Sj1 , . . . , Sjr ,
where the number of composition factors is m(r + 1) − 1 if c is the exceptional
vertex and r otherwise.
Associated to a Brauer tree algebra are two numerical invariants: the number of edges
of the tree and the multiplicity of the exceptional vertex.
Two examples are

(1) A basic Brauer tree algebra associated to a line with n edges numbered 1, . . . , n
such that i is adjacent to i + 1, and with no exceptional vertex. We assume
n > 1.

The Loewy series of the indecomposable projective modules are as follows:


$$P_1 = \begin{matrix} S_1 \\ S_2 \\ S_1 \end{matrix}, \qquad
P_n = \begin{matrix} S_n \\ S_{n-1} \\ S_n \end{matrix}, \qquad
P_i = \begin{matrix} S_i \\ S_{i-1}\; S_{i+1} \\ S_i \end{matrix}, \quad \text{for } i \neq 1, n.$$
(2) A basic Brauer tree algebra associated to a star with n edges numbered 1, . . . , n,
and with no exceptional vertex. We assume n > 1. We also call it a Brauer star
algebra with no exceptional vertex.

The Loewy series of the indecomposable projective modules are as follows:


$$P_i = \begin{matrix} S_i \\ S_{i+1} \\ \vdots \\ S_n \\ S_1 \\ S_2 \\ \vdots \\ S_i \end{matrix}$$
The symmetric Nakayama algebra Nnn is the Brauer star algebra with n edges
and with no exceptional vertex.

3. DERIVED EQUIVALENCE
In 1989, Rickard [21] and Keller [15] gave a necessary and sufficient criterion
for the existence of derived equivalences between two rings as a generalization of Morita
equivalence. Rickard’s theorem says that for two rings A and B the derived categories
Db (A) and Db (B) of A and B are equivalent as triangulated categories if and only if
there exists an object T in Db (A), called a tilting complex, satisfying similar properties

as those of a progenerator and such that B is isomorphic to the endomorphism ring of


T in Db (A).
By Rickard [22, Theorem 4.2], up to derived equivalence, a Brauer tree algebra
is determined by the number of edges of the Brauer tree and the multiplicity of the
exceptional vertex. Hence, an arbitrary Brauer tree algebra is derived equivalent to a
Brauer star algebra associated with a star having the same number of edges, i.e. the
Nakayama algebra Nnm where m divides n. The following theorem gives an explicit
tilting complex that gives equivalence between the symmetric Nakayama algebra Nnn
and the Brauer tree algebra associated to a line without exceptional vertex.
Theorem 3.1. Let B be the Nakayama algebra Nnn and let A be the Brauer tree algebra
associated to a line without exceptional vertex. Let T be the direct sum of the following
complexes of projective A-modules:
Ti : 0 → Pn → Pn−1 → · · · → Pi+1 → Pi → 0, i = 1, 2, . . . , n
Then $\mathrm{End}_{D^b(A)}(T) \cong B$. Hence there is a derived equivalence between the Nakayama
algebra Nnn and the Brauer tree algebra associated to a line without exceptional vertex.

The proof of this theorem is based on the method in [19] and also uses the Green
order corresponding to these algebras [17, Section 4.4] (see also [19, Example 5.2]). An
application is the result in [20], where this equivalence is used in the representation theory
of braid groups.

4. NAKAYAMA ALGEBRAS WITH MUTATIONS


This is a joint work with Aslak Bakke Buan, Irawati and Faisal.
Cluster categories were introduced in [8] as a framework for a categorification of
Fomin-Zelevinsky cluster algebras [13]. For any finite-dimensional hereditary algebra H
over a field K, the cluster category CH is the quotient of the bounded derived category
Db (H) by the functor τ −1 [1], where τ denotes the AR-translation. The category CH is
canonically triangulated [16], and it has AR-triangles induced by the AR-triangles in
Db (H).
In a cluster category CH, tilting objects are defined as objects which have no self-
extensions, and are maximal with respect to this property. The endomorphism rings of
such objects are called cluster-tilted algebras [7].
Cluster-tilted algebras have several interesting properties. In particular, by [7]
their representation theory can be completely understood in terms of the representation
theory of the corresponding hereditary algebra H. Furthermore, their relationship to
tilted algebras is well understood by [1], [2], see also [24].
For cluster tilted algebras there is a concept of mutation. This notion is re-
lated to mutation of quivers with (super-)potentials occurring in mathematical physics.
Quiver mutation was introduced by Fomin and Zelevinsky [13] as a generalization of
the sink/source reflections used in connection with BGP functors [5].

Any quiver Q with no loops and no cycles of length two can be mutated at a vertex
i to a new quiver Q∗ by the following rules (a small code sketch implementing these rules follows the list):
(1) The vertex i is removed and replaced by a vertex i∗ , all other vertices are kept.
(2) For any arrow i → j in Q there is an arrow j → i∗ in Q∗ .
(3) For any arrow j → i in Q there is an arrow i∗ → j in Q∗ .
(4) If there are r > 0 arrows j1 → i, s > 0 arrows i → j2 and t arrows j2 → j1 in
Q, there are t − rs arrows j2 → j1 in Q∗ . (Here, a negative number of arrows
means arrows in the opposite direction.)
(5) All other arrows are kept.
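The rules above can be phrased on the skew-symmetric matrix encoding of a quiver (b[j][k] > 0 meaning b[j][k] arrows from j to k); the following sketch is one such implementation, with the example A_3 quiver chosen only for illustration.

```python
# A minimal sketch of Fomin-Zelevinsky quiver mutation at a vertex, using the
# skew-symmetric matrix encoding: b[j][k] > 0 means b[j][k] arrows j -> k.

def mutate(b, i):
    """Return the matrix of the quiver mutated at vertex i (0-based index)."""
    n = len(b)
    new = [[0] * n for _ in range(n)]
    for j in range(n):
        for k in range(n):
            if j == i or k == i:
                # rules (2) and (3): arrows into and out of i are reversed
                new[j][k] = -b[j][k]
            else:
                # rule (4): composite arrows through i adjust the count j -> k
                new[j][k] = b[j][k] + (abs(b[j][i]) * b[i][k]
                                       + b[j][i] * abs(b[i][k])) // 2
    return new

# Linear A_3 quiver 0 -> 1 -> 2, mutated at the middle vertex:
b = [[0, 1, 0],
     [-1, 0, 1],
     [0, -1, 0]]
print(mutate(b, 1))
# [[0, -1, 1], [1, 0, -1], [-1, 1, 0]]  -- the oriented 3-cycle
```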
In [9], Buan and Vatne provide an explicit description of the mutation class of
An -quivers, whereas a geometric interpretation of mutation of An -quivers is given by
Caldero, Chapoton and Schiffler [10]. For two algebras which are related by a single
mutation, their module categories have a similar structure; such algebras are said to be
"nearly Morita equivalent".
It is an interesting general problem to understand and explore algebras which
are nearly Morita equivalent to algebras with a well understood module category such
as Nakayama algebras. Therefore, in our research the intersection between Nakayama
algebras and cluster tilted algebras is explored. Ringel in [23] has given the classification
of selfinjective cluster tilted algebras.
Theorem 4.1. [23] The selfinjective cluster tilted algebras are
(1) the Nakayama algebras Nn−2,n where n ≥ 3,
(2) algebras with an even number 2m of simples, where m indecomposable projectives
have length 3 and the remaining m have length m + 1.
Based on results by Ringel [23] and Buan and Vatne [9] we get the following
theorem.
Theorem 4.2. Nakayama algebras that admit some mutations are the selfinjective
Nakayama algebras Nn−2,n and the Nakayama algebras associated to An quivers.
The next step will be to classify the mutation class of these algebras, i.e. to
characterize the algebras which are nearly Morita equivalent to these algebras.

5. NAKAYAMA ALGEBRAS IN GROUP REPRESENTATION THEORY


This is a joint work with Alexander Zimmermann, Pudji Astuti and Aditya Purwa
Santika.
In group representation theory, by a result of Dade [11], blocks with cyclic defect
groups are Brauer tree algebras and the Brauer correspondents of these blocks are
Nakayama algebras. Hence the derived equivalence between Brauer tree algebras and
Nakayama algebras gives a new alternative approach to Broué's conjecture [6] in the case
of blocks with cyclic defect groups.
Research on invariance under derived equivalences has been conducted, the most
recent being by Zimmermann [25]. One of these results is the invariance of the center of the ring:

Theorem 5.1. [17, Proposition 6.3.2] Let R and S be two rings and assume $D^b(R) \cong D^b(S)$
as triangulated categories. Let T be a tilting complex over R with endomorphism
ring S. Then the centers of R and S are isomorphic.

Our research in this area is the invariance of the p-regular subspace of blocks with
abelian defect groups. We start with a prime p, a field F of characteristic p > 0 and
a finite group G with an abelian Sylow p-subgroup. An element of G whose order is not
divisible by p is called a p'-element. A conjugacy class formed by such elements is
called a p-regular class.
The F-subspace of the group algebra F G spanned by all p-regular class sums in
G is denoted by Z_{p'} F G. Meyer showed in [18] that this subspace is a subalgebra of the
center ZF G of F G.
If C is a conjugacy class in G, then a defect group of C is a Sylow p-subgroup of
C_G(g), the centralizer of g, where g belongs to C. A block B of F G is the smallest submodule
which contains indecomposable submodules of F G and simple submodules of F G. Define
Z_{p'} B = B ∩ Z_{p'} F G. Fan and Kulshammer proved in [12] the following result:
Theorem 5.2. [12] If B is a block with abelian defect group, then Z_{p'} B is a subalgebra
of ZB, the center of B. Moreover, Z_{p'} B is invariant under perfect isometries, and hence
under derived equivalences.

By using the derived equivalence between Brauer tree algebras and Nakayama alge-
bras, one may expect an alternative proof of this invariance that does not go through
perfect isometries, as in Fan and Kulshammer's method. Our preliminary result is the
invariance of the ranks of the p-regular subspaces, as follows.
We fix a prime number p and a p-modular system (k, O, F ), that is, O is a complete
discrete valuation ring with field of fractions k of characteristic 0 and residue field F
of characteristic p. The O-algebras we consider will always be free of finite rank as
O-modules.
Fix a finite group G and a block B1 of the group algebra OG, with defect group
D. Denote by G_{p'} the set of p-regular elements in G. Denote by OG_{p'} the O-sublattice
of OG spanned by G_{p'}, and Z_{p'} OG = ZOG ∩ OG_{p'}. We set Z_{p'} B1 = B1 ∩ Z_{p'} OG.
Theorem 5.3. For blocks B1 and B2 with cyclic defect groups of OG and OH, respec-
tively, a derived equivalence between B1 and B2 implies rank Z_{p'} B1 = rank Z_{p'} B2.

Proof. The blocks B1 and B2 are Brauer tree algebras and, by [22, Theorem 4.2], up
to derived equivalence a Brauer tree algebra is determined by the number of edges of
the Brauer tree and the multiplicity of the exceptional vertex. Hence B1 and B2 have
the same number of edges, which is also the same number of simple modules. Since by
[14, Remarks 4.9] the rank of Z_{p'} B1 (resp. Z_{p'} B2) coincides with the number of simple
B1-modules (resp. B2-modules), consequently Z_{p'} B1 and Z_{p'} B2 have the same rank.
QED

For further research, using the derived equivalence between Nakayama algebras
and Brauer tree algebras, we will show that the isomorphism given in Theorem 5.1 maps
Z_{p'} B1 to Z_{p'} B2. This will provide a new invariant of derived equivalence and give an
alternative proof of Fan and Kulshammer's result.

Acknowledgement. The author is supported by Hibah Riset dan Inovasi KK 2011


based on Surat Perjanjian no 214/I.1.C01/PL/2011 and Hibah Bersaing DIKTI 2011
based on SK Dekan FMIPA ITB No. 001c/SK/K01.7/KP/2011.

References
[1] Assem I., Brustle T., Schiffler R. Cluster-tilted algebras as trivial extensions, Bull. London
Math. Soc. 40 (1), 151-162, 2008.
[2] Assem I., Brustle T., Schiffler R. Cluster-tilted algebras and slices, Journal of Algebra Vol-
ume 319, Issue 8, 3464-3479, 2008.
[3] Assem, I., Simson, D., Skowronski, A., Elements of the Representation Theory of Assosiative
Algebras, London Math Soc Student Text 65, Cambridge Univ Press, 2006.
[4] Auslander,M., Reiten,I., Smaloe,S.O. Representation Theory of Artin Algebras, Cambridge
Univ Press, 1995.
[5] Bernstein I. N., Gelfand I. M., Ponomarev V. A. Coxeter functors, and Gabriel's theorem,
Uspehi Mat Nauk 28, no. 2, 19-33, 1973.
[6] Broué, M., Isométries parfaites, types de blocs, catégories dérivées, Astérisque 181-182, 61-92,
1990.
[7] Buan A., Marsh R., Reiten I. Cluster-tilted algebras, Trans. Amer. Math. Soc. 359, no. 1,
323-332, 2007.
[8] Buan A., Marsh R., Reineke M., Reiten I., Todorov G. Tilting theory and cluster combina-
torics, Adv. Math. 204, 572-618, 2006.
[9] Buan A. and Vatne D., Derived Equivalence for Cluster-tilted Algebras of Type An . J. Algebra
319, no. 7, 2723-2738, 2008.
[10] Caldero P., Chapoton F., Schiffler R. Quivers with relations arising from clusters (An case),
Trans. Amer. Math. Soc. 358, no. 3, 1347-1364, 2006.
[11] Dade, E.C., Blocks with cyclic defect groups, Annals of Math. 84, 20-49, 1966.
[12] Fan, Y., and Kulshammer,B., A note on blocks with abelian defect groups, preprint.
[13] Fomin S., Zelevinsky A. Cluster Algebras I: Foundations, J. Amer. Math. Soc. 15, no. 2,
497-529, 2002.
[14] Huppert,B. Character theory of finite groups, Walter de Gruyter - Berlin - New York, 1998.
[15] Keller, B., A remark on tilting theory and DG-algebras, Manuscripta Mathematica 79, 247-253,
1993.
[16] Keller B., On triangulated orbit categories, Documenta Math. 10, 551-581, 2005.
[17] Koenig,S. and Zimmermann, A., Derived Equivalences for Group Rings, Lecture Notes in Math-
ematics 1685, Springer-Verlag, Berlin, 1998.
[18] Meyer,H. On a subalgebra of the centre of a group ring, Journal of Algebra, 295, 293-302, 2006.
[19] Muchtadi-Alamsyah, I., Homomorphisms of complexes via homologies, J.Algebra 294, 321-345,
2005.
[20] Muchtadi-Alamsyah, I., Braid action on derived category Nakayama algebras, Communications
in Algebra 36:7, 2544-2569, 2008.
[21] Rickard, J., Derived equivalences as derived functors, J.London Math.Soc 43, 37-48, 1991.
[22] Rickard, J., Derived categories and stable equivalence, J. Pure Appl. Algebra 61, 303-317, 1989.
[23] Ringel, C.M., The self-injective cluster tilted algebras, to appear in Archiv der Mathematik.
[24] Ringel C.M. Some Remarks Concerning Tilting Modules and Tilted Algebras. Origin. Relevance.
Future, LMS Lecture Notes Series 332(An appendix to the Handbook of Tilting Theory), Cam-
bridge University Press, 2007.
[25] Zimmermann, A., Invariance of generalized Reynolds ideals under derived equivalences, Mathe-
matical Proceedings of the Royal Irish Academy 107A (1), 1-9, 2007.

Intan Muchtadi-Alamsyah
Algebra Research Group
Faculty of Mathematics and Natural Sciences
Institut Teknologi Bandung.
e-mail : [email protected]
Proceedings of ”The 6th SEAMS-UGM Conference 2011”
pp. 51–64.

MATHEMATICS IN MEDICAL IMAGE ANALYSIS: A


FOCUS ON MAMMOGRAPHY

Murk J. Bottema, Mariusz Bajger, Kenny Ma, Simon Williams

Abstract. Automatic detection of breast cancer in screening mammograms is an example of an image analysis task that draws on a wide range of mathematical notions to
understand the issues and address difficulties. The criteria for optimality of solutions are
generally too complex to allow analytic proofs of best practice but the vast repertoire of
mathematical structures and methodology provides the arsenal to move forward. This
paper presents a sample of image analysis methods that have been used in attempts to
improve early detection of breast cancer.
Keywords and Phrases: Image analysis, screening mammography, computer-aided detec-
tion, breast cancer

1. INTRODUCTION
Image analysis refers to the task of automatically extracting information from im-
ages. Examples include automatic tracking of vehicles in video sequences, handwriting
recognition, face recognition, counting cells on microscope slides, predicting crop yield
from aerial photographs, detecting cancer from x-ray images, etc. The applications to
defense, forensics, surveillance, agriculture, science, technology and medicine are vast
and growing. Mathematics and statistics underpins image analysis and much of this
mathematics is common across the various areas of application. At the same time, each
application and each class of images has its own peculiarities that impact some aspects
of how mathematics is used.
This paper examines the peculiarities of medical image analysis and more specif-
ically, mathematical issues arising in automatic detection of breast cancer in screening
mammograms. The examples and methods considered here reflect the current and re-
cent research interests of the authors and are not intended to represent the full range of
activity in this field. Readers interested in a comprehensive account of mammography

2010 Mathematics Subject Classification:


and computer-aided diagnosis of breast cancer are encouraged to consult the collection
of papers edited by Suri and Rangayyan [14].
The present paper is aimed at a mathematical audience and does not presume
any familiarity with either image analysis or the application field of computer-assisted
screening mammography. Accordingly, brief introductions to these topics are provided.

1.1. Screening Mammography. In western countries, breast cancer causes more


deaths among women than any other form of cancer. Early detection of breast cancer re-
sults in reduced morbidity and mortality. Accordingly, many countries have established
breast cancer screening programs. Protocol varies between countries, but typically,
women between the ages of about 50 to 70 are invited to participate in screening pro-
grams every one or two years. The purpose of screening is not to diagnose cancer but
to decide if there is sufficient evidence of cancer to warrant calling the woman back for
further tests such as ultrasound, fine needle aspiration, local x-ray, etc. Between one
and three radiologists or radiographers read the mammograms to decide if the woman
should be called back or not. The true detection rate cannot be known exactly - there
is no way to count the number of cancers that are missed - but evidence indicates that
between 20 and 30 percent of cancers present at screening are not found, usually be-
cause there is no clear visual sign of the cancer. In addition, between 4 and 7 percent
of women are called back while only one percent actually have cancer.
Since the mid 1980’s groups around the world have worked on computer algo-
rithms for automating at least part of the task of reading the screening mammograms.
Currently, detection rates for computer algorithms are quite high (high sensitivity) but
there are many false positive reports (low specificity). Hence current research focuses
on improving the specificity while maintaining high sensitivity. Most algorithms are
designed to work alongside a radiologist and are not expected to replace radiologists.
Studies show that the performance of screening programs is higher if at least two radiolo-
gists read the mammograms instead of just one. Studies also show that the performance
of screening programs in terms of sensitivity, specificity and cost (total impact) of one
radiologist aided by a state-of-the-art computer algorithm for detection of breast cancer
is similar to that of two radiologists [15]. The objective is to improve computer algo-
rithms to the point where one radiologist working with a computer clearly outperforms
two radiologists.

1.2. Detection and Classification of Objects in Images. One fundamental image


analysis task is to decide if a particular object is present in an image or not. A system
on a ship may have to decide if radar images shows the presence of another ship, a
security system may have to decide if a video sequence shows an intruder, a system in
a pathology lab may have to detect cells on a slide, a system in an observatory might have
to decide if a comet is present in an image of the night sky. A detection task may be
viewed as a two class classification task; classify images according to object present or
object absent.
At the simplest level, detection proceeds by constructing a statistical model for
images in which the object is present, constructing a statistical model for images in

which the object is not present and then designing a classification rule that best sepa-
rates the two groups. If the models are normal distributions, for example, an optimal
threshold for classification can be derived from elementary theory.
An image can be viewed as a point in an N dimensional feature space where N
is the number of pixels (picture elements). The number N ranges from a few tens of
thousands for very small images to several million. Without context information to
connect the information in separate pixels, classification in such a high dimensional
feature space is virtually impossible.
Instead, the usual practice is to extract features from the images that are thought
to represent the information of interest more efficiently. Features may include the
strength, orientation and juxtaposition of edges, shapes of regions of high contrast,
spatial distributions of lines, textures, or colors. In this way an image is reduced to
a small number of features and classification is based on these features only. Thus object
detection requires a choice of features that allow good classification and a method for
extracting these features.
The choice of features is usually determined from the context of the application.
An astronomer may provide descriptions of comets that distinguish them from stars
and planets, for example, and these descriptions will form the basis of the choice of
features.
The extraction of these features is often the most crucial task in an image analysis
problem. The detection of the comet may require identifying the tail. Where does the
tail start and end? How can the tail be distinguished from background stars or noise
in the image? Even if edges and lines can be found, how do they connect to form the
objects of interest? These are the issues that are the most difficult to solve.
The process of identifying coherent regions in the image is called segmentation.
Once the image has been segmented, features such as shape, contrast, texture, orienta-
tion, can be measured for each segment separately in order to find which of the regions,
if any, match the expected features of the object of interest.
Segmentation, in turn, depends on the quality and complexity of the image. Many
images include noise or artifacts. One segmentation method may work perfectly well
if edges are sharp and noise is low, but may fail if edges are fuzzy and noise levels
are high. This motivates the use of preprocessing steps such as noise reduction, edge
enhancement, histogram equalization, etc., to give segmentation methods the best chance
of producing good results.
Altogether, a typical detection task involves the following steps:
preprocessing → segmentation → feature extraction → classification
In addition, many tasks require the additional step of image registration - aligning
images to highlight differences or measure similarities. For example, in looking for
comets, comparing images taken at different times (say on consecutive nights) can reveal
the motion of a comet relative to apparently stationary stars far away.
One frustrating aspect of this process is that individual steps cannot be fully
evaluated in isolation. What is the best method for noise reduction for a particular
class of images? One can try several methods, but the difficulty is in judging the

results. Visual assessment is possible but subjective. In the long run, the quality of
noise reduction can only be measured according to the quality of the image segmentation
that follows. The quality of the image segmentation can only be measured according to
the quality of the features that can be extracted and these can be judged only according
to the final accuracy of the classification step.
Fortunately, classification can be measured using methods of machine learning if
there are sufficient examples of images where the true state (object present or object
absent) is known. Classification performance is often reported in terms of sensitivity
and specificity and these are commonly summarized using ROC analysis.
Classification of objects in images refers to the task of assigning a particular (or
previously detected) object in the image to one of two or more classes. For example, a
system might acquire images of vehicles on a road and the task is to classify these as cars,
trucks, motorcycles, or bicycles. The processing steps are very similar to those listed
for detection except that the classification is usually an assignment to many classes. In
some classification tasks, the segmentation is already known but many image analysis
tasks involve both a detection and a classification step.
Methods for feature extraction (including feature selection) and classification are
not very different for general image analysis tasks and medical image analysis. However,
several aspects of preprocessing, segmentation and image registration are very different
in the context of medical images, particularly x-ray images, than in the context of visual
images. These three steps will be discussed in subsequent sections.

1.3. What is Special About X-ray Images? The two main aspects of medical x-
ray images that impact choices of image analysis methods are that the images are
projection images and that the signal to noise ratio is low. The fact that the images are
projection images means that the intensity value of a single pixel reflects the aggregate
attenuation of an x-ray beam through several tissues (skin, heart, rib, lung, for example).
Hence a pixel does not "belong to" any single object. By contrast, a pixel in a visual
image represents either a part of a building or a tree or a bird or someone's face, but,
in any event, only one object. Remarkably, this important distinction between x-ray
images and visual images is often neglected even though everyone is entirely aware of
the distinction. A second consequence of being a projection image is that objects of
uniform x-ray attenuation, but of rounded shape, necessarily have poorly defined edges
in the image. This is because the path length of the x-ray beam through the object
becomes shorter closer to the edge of the object. Since organs in the human body
are nearly always rounded in shape, edges defining objects in medical x-ray images are
seldom sharp (Fig. 1).
Low signal to noise ratios occur in all areas of image analysis but the distinction
is that, once identified, developments in technology offer the possibility of improving
this ratio. In the case of x-ray images, improvements in technology are used to lower of
dosage as a priority over improving signal to noise ratios.
Other peculiarities of x-ray images include the phenomenon of beam hardening
(energy dependent attenuation) and scattering, but these issues will not be considered
in this paper.

Figure 1. A mammogram (left) and intensity plots. The image intensities of pixels on the line across the mammogram are plotted in the top right panel. The boundary of the breast is not a sharp, well defined edge. A small section of this plot, from pixel 1000 to pixel 1100 from the left edge of the image, is shown in the bottom right panel to indicate the variation from one pixel to the next. One hundred pixels equates to 5 mm actual size.

2. PREPROCESSING
The key preprocessing step, and the one that is most sensitive to the special
circumstances of x-ray images, is that of noise reduction. Naive noise reduction can
provide visually satisfying results (Fig. 2), but this kind of noise reduction is not well
suited to mammography. The trouble is that some outlying intensities or a small group
of intensities might be noise due to scattering or photon statistics, but might equally
well represent actual structure such as a narrow ridge of intensities running across the
line along which the intensities were sampled. In mammograms, such ridges may result
from fibers in normal tissue or spicules associated with malignant tumors (Fig. 3).
Noise reduction is needed that eliminates spike noise but retains narrow ridges.
Elegant methods for noise reduction with these properties exist based on "anisotropic
smoothing" introduced by Perona and Malik [12]. The idea is to start with the noisy image

300

Intensity
250

200

1000 1050 1100

Figure 2. Naive noise reduction. The dots show the same intensities as in the lower right panel of Fig. 1. The continuous line shows the intensities at the same pixels after convolving the image with a 3 × 3 averaging filter. The intensities are displayed as a piecewise linear function to distinguish the raw and smoothed values.

I and evolve the image according to the diffusion process
$$\frac{\partial I}{\partial t} = \mathrm{div}(c\,\nabla I). \qquad (1)$$
For c constant, (1) is the usual heat equation, $I' = c\,\Delta I$, and the diffusion is independent
of the local structure in the image. If
$$c = c(x, y, t) = e^{-\frac{\|\nabla I\|}{K}},$$

for example, then the diffusion is strong where the gradient is small and weak where
the gradient is large. Thus edges are preserved while the regions of similar intensity are
smoothed. This method is called anisotropic, even though the action is locally isotropic.
However, truly anisotropic methods based on these general principles do exist [18].
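For concreteness, the following is a minimal sketch of the locally isotropic scheme above with the exponential edge-stopping function; the discretization, step size, value of K and boundary handling are assumptions of the sketch, not the implementation used in the work cited.

```python
import numpy as np

def perona_malik(image, n_iter=20, K=10.0, dt=0.2):
    """Explicit-Euler sketch of the diffusion (1) with c = exp(-||grad I||/K).

    Smoothing is strong in flat regions and weak across strong edges.  The
    step size, iteration count, K and the periodic boundary handling via
    np.roll are assumptions of this sketch.
    """
    I = image.astype(float).copy()
    for _ in range(n_iter):
        # differences towards the four neighbours
        dN = np.roll(I, -1, axis=0) - I
        dS = np.roll(I, 1, axis=0) - I
        dE = np.roll(I, -1, axis=1) - I
        dW = np.roll(I, 1, axis=1) - I
        # edge-stopping coefficient applied to each directional difference
        cN = np.exp(-np.abs(dN) / K)
        cS = np.exp(-np.abs(dS) / K)
        cE = np.exp(-np.abs(dE) / K)
        cW = np.exp(-np.abs(dW) / K)
        # one explicit step of dI/dt = div(c grad I)
        I += dt * (cN * dN + cS * dS + cE * dE + cW * dW)
    return I
```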
Even so, in our own work on mammograms [1], anisotropic smoothing did not
perform as well as a method called neutrosophic image denoising [6]. In this method,
each pixel in the image I is replaced by a vector P (i, j) = (T (i, j), U (i, j), F (i, j)) loosely
representing the membership states of the pixel at location (i, j) in a region. The three
states are True (T ), Undecided (U ), False (F ). The values are computed as
$$T(i, j) = \frac{\bar{I}(i, j) - I_{\min}(i, j)}{I_{\max}(i, j) - I_{\min}(i, j)}, \qquad
U(i, j) = \frac{\delta(i, j) - \delta_{\min}(i, j)}{\delta_{\max}(i, j) - \delta_{\min}(i, j)}, \qquad
F(i, j) = 1 - T(i, j),$$
$$\delta(i, j) = |I(i, j) - \bar{I}(i, j)|,$$

Figure 3. Ridges. Stellate patterns - linear structures emanating from a central point are important indicators of breast cancer. The radiat-
ing pattern at the point in the mammogram indicated by the white
bars stems from the overlapping normal structure and is not related to
cancer. A close up view appears at the right. To distinguish normal
and tumor structures, linear features such as these must survive noise
reduction steps.

where the mean $\bar{I}(i, j)$ and the minima and maxima are computed over neighborhoods of
size w × w centered at (i, j).
For a choice of threshold α, the vector P = P (i, j) is updated according to the
rule
$$\hat{P} = P(\hat{T}, \hat{U}, \hat{F}),$$
where
$$\hat{T} = \begin{cases} T, & I < \alpha\\ \bar{T}, & I \geq \alpha \end{cases}, \qquad
\hat{U} = \frac{\delta_{\hat{T}} - \delta_{\hat{T},\min}}{\delta_{\hat{T},\max} - \delta_{\hat{T},\min}}, \qquad
\delta_{\hat{T}} = |\hat{T} - \bar{\hat{T}}|.$$
This process is repeated until an entropy measure is less than a pre-set threshold.
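A rough sketch of the (T, U, F) maps defined above is given below; the window size and the choice to take the local minima and maxima of the mean image are assumptions made only for illustration.

```python
import numpy as np
from scipy.ndimage import uniform_filter, minimum_filter, maximum_filter

def neutrosophic_maps(I, w=5):
    """Compute the (T, U, F) maps described above for a grayscale image.

    The window size w, and taking the local minima/maxima of the mean
    image, are assumptions of this sketch.
    """
    I = I.astype(float)
    Ibar = uniform_filter(I, size=w)       # local mean over a w x w window
    Imin = minimum_filter(Ibar, size=w)
    Imax = maximum_filter(Ibar, size=w)
    delta = np.abs(I - Ibar)
    dmin = minimum_filter(delta, size=w)
    dmax = maximum_filter(delta, size=w)
    eps = 1e-12                            # avoid division by zero
    T = (Ibar - Imin) / (Imax - Imin + eps)
    U = (delta - dmin) / (dmax - dmin + eps)
    F = 1.0 - T
    return T, U, F
```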
On one hand, from a mathematical point of view, this method is much less elegant
than anisotropic smoothing - there is little understanding of why the method works.
On the other hand, this method illustrates a phenomenon that appears more frequently.
Judicious noise reduction based on local information in an iterative manner can out-
perform methods based on a global perspective of smoothing. Part of the difficulty in

studying this phenomenon is that there is no general criterion for good noise reduction.
In other words, stating the right theorem is not even possible.

3. IMAGE SEGMENTATION
Most segmentation methods used in mammography rely solely on pixel intensity
to distinguish masses or candidate masses from normal tissue. Early papers suggested
linear filters for detection. Significant post processing was required to realise reasonable
detection performance [13].

3.1. Snakes. An example of a mathematically appealing technique is based on the
calculus of variations [8]. The unknown boundary of an object of interest is viewed as a
curve in the plane given parametrically as v(s) = (x(s), y(s)), s ∈ [0, 1]. If the curve fits
along the edges of an object of high intensity contrast, then $\|\nabla I(v)\|$ should be large. If
this is the only criterion for choosing v, then the curve wiggles around every local noise
spike instead of seeking out more substantial edges forming the tumor. To avoid this,
the curve is endowed with tensile strength modeled by $\|v'(s)\|$ and rigidity modeled by
$\|v''(s)\|$. Accordingly, the curve should minimize the functional
$$E(v) = \int_{[0,1]} a\|v'(s)\| + b\|v''(s)\| - \|\nabla I(v(s))\| \, ds,$$

where a and b are parameters (possibly dependent on s) to be determined empirically
for a particular segmentation task. To implement the method, a seed curve must be
supplied. An iterative numerical scheme for solving this variational problem results in
a sequence of curves that slither toward a (local) minimum. Consequently, the curves
are affectionately referred to as "snakes".
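As an illustration, a discrete version of the functional E(v) can be evaluated as follows; the finite-difference approximations of v' and v'' and the closed-curve convention are assumptions of this sketch.

```python
import numpy as np

def snake_energy(curve, grad_mag, a=0.1, b=0.1):
    """Discrete evaluation of E(v) for a closed polygonal curve.

    curve    : (m, 2) array of (x, y) points along the curve
    grad_mag : 2-D array holding ||grad I|| at each pixel
    The finite differences for v' and v'' and the parameter values are
    assumptions of this sketch.
    """
    v = np.asarray(curve, dtype=float)
    d1 = np.roll(v, -1, axis=0) - v                              # ~ v'(s)
    d2 = np.roll(v, -1, axis=0) - 2 * v + np.roll(v, 1, axis=0)  # ~ v''(s)
    tension = a * np.linalg.norm(d1, axis=1)
    rigidity = b * np.linalg.norm(d2, axis=1)
    # image term: sample ||grad I|| at the (rounded, clipped) curve points
    xi = np.clip(np.round(v[:, 0]).astype(int), 0, grad_mag.shape[1] - 1)
    yi = np.clip(np.round(v[:, 1]).astype(int), 0, grad_mag.shape[0] - 1)
    return float(np.sum(tension + rigidity - grad_mag[yi, xi]))
```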
This very popular method has been applied with substantial success in many im-
age analysis tasks, although many implementations have required substantial adapta-
tion from the basic idea described above [4] [19]. Snakes, also known as active contours,
have not played a major role in detection of breast cancers. This is probably due to the
noise characteristics of mammograms discusses in the introduction. The snakes do not
converge if there is too much noise or competing edges from normal breast structure.
Another inconvenience is the requirement to supply initial curves.

3.2. Graph Based Methods. A raft of segmentation methods exists which are either
directly or loosely based on graph theory. The advantage is that the emphasis is away
from detecting edges (as in the case of snakes) but on identifying regions of similar
intensity or regions that are similar according to some other attribute or set of attributes.
The basic setting is to form a graph G = (V, E) where V is a set of vertices and E is
a set of edges. Thus e ∈ E means e = (vi , vj ) for some vi , vj ∈ V. The vertices are the
pixels that comprise the image and the objective is to assign edges so that the connected
components of the graph correspond to regions of interest in the image. These methods
are used to segment the entire image, meaning that the image is decomposed into the
union of disjoint segments. In contrast, many segmentation schemes, including snakes,
are used to delineate one object of interest at a time.

3.2.1. Adaptive Pyramids. The adaptive pyramid [10] [7] starts with every vertex con-
nected to each of its eight neighbors. This neighborhood is called the support set of the
vertex. The method works by selecting a subset of these vertices to ”survive” to the next
level of the pyramid. A rule is used to ensure that if two vertices are connected by an
edge, only one can survive and if a vertex does not survive, there is at least one vertex in
its support which does survive. The rule selects the surviving vertices according to how
well they represent their support region. Often the criterion used is similar intensity
but other attributes could be used instead. A surviving vertex inherits the supports of
vertices in the previous level that did not survive but are more similar to the surviving
vertex than other surviving vertices. If this process continues, eventually there is one
vertex at the top of the pyramid and so the entire image constitutes one segment. In
order to achieve useful segmentation, a rule is used to decide if a non-surviving vertex
at a certain level of the pyramid is not sufficiently similar to any surviving vertex to
be associated to any vertex at the next level. Such a vertex is called a root. All the
vertices in the base of the pyramid (the original image) associated to this vertex form
a separate segment (component) of the image.
3.2.2. Minimum Spanning Trees. This method, based on work by Felzenszwalb and
Huttenlocher [5], starts with a collection of eligible edges E comprising the edges be-
tween pixels and their four nearest neighbors. Each edge is assigned an edge weight
according to some rule, for example,

$$w(e) = w((v_i, v_j)) = \begin{cases} |I(v_i) - I(v_j)|, & (v_i, v_j) \in E\\ \infty, & \text{otherwise.} \end{cases}$$
The edges are sorted according to weight so that w(ei ) ≤ w(ej ) for i < j and the initial
graph is set as H 0 = (V, F 0 ) where F 0 is the empty set. The graph H q = (V, F q ) is
constructed from H q−1 = (V, F q−1 ) as follows. Let eq = (vi , vj ) denote the qth edge. If
vi and vj lie in different components of H q−1 and w(eq ) is small compared to the internal
variation of both components, set F q = F q−1 ∪ {eq }. Otherwise set F q = F q−1 . The
rule for merging or not merging components depends on a single parameter that directly
controls the granularity of the segmentation and provides a handle for automatically
tuning the segmentation to a particular class of images [1].
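A compact sketch of this merging scheme on a grayscale image is given below; the 4-neighbour edge construction follows the text, while the union-find bookkeeping, the threshold form Int(C) + k/|C| of [5], the value of k and the array representation are choices of this sketch.

```python
import numpy as np

def segment(I, k=500.0):
    """Graph-based segmentation of a grayscale image via the merging loop
    described above (a sketch; k and the data layout are assumptions)."""
    h, w = I.shape
    n = h * w
    parent = list(range(n))
    size = [1] * n
    internal = [0.0] * n   # largest MST edge weight inside each component

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # eligible edges: each pixel to its right and lower neighbour
    edges = []
    for y in range(h):
        for x in range(w):
            p = y * w + x
            if x + 1 < w:
                edges.append((abs(float(I[y, x]) - float(I[y, x + 1])), p, p + 1))
            if y + 1 < h:
                edges.append((abs(float(I[y, x]) - float(I[y + 1, x])), p, p + w))
    edges.sort()           # process edges by non-decreasing weight

    for wgt, p, q in edges:
        rp, rq = find(p), find(q)
        if rp == rq:
            continue
        # merge only when the edge is small compared to the internal
        # variation of both components
        if wgt <= min(internal[rp] + k / size[rp], internal[rq] + k / size[rq]):
            parent[rq] = rp
            size[rp] += size[rq]
            internal[rp] = max(internal[rp], internal[rq], wgt)
    return np.array([find(p) for p in range(n)]).reshape(h, w)
```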
3.3. Statistical Region Merging. The image is viewed as the union of disjoint regions
of uniform intensity plus additive, normally distributed noise. Again the process starts
by viewing each pixel as an isolated region in the image. Regions are merged sequentially
according to the likelihood that the regions have the same mean. The image I is realised
as Q independent random variables and the number of gray levels in I is g. According
to [11], for any fixed pair of regions R and R′ in I and for any 0 < δ ≤ 1,
$$\mathrm{Prob}\left( \left|(\bar{R} - \bar{R}') - E(\bar{R} - \bar{R}')\right| \;\geq\; g\sqrt{\frac{1}{2Q}\left(\frac{1}{|R|} + \frac{1}{|R'|}\right)\ln\frac{2}{\delta}} \right) < \delta.$$
Thus R and R′ should be merged if
$$\left|\bar{R} - \bar{R}'\right| \;\leq\; g\sqrt{\frac{1}{2Q}\left(\frac{1}{|R|} + \frac{1}{|R'|}\right)\ln\frac{2}{\delta}}.$$

In practice, Q is selected according to the saliency of the smallest objects the user hopes
to be able to segment [2] and δ is set to be a very small number, for example δ ≈ 1/|I|.
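The merging test itself is a one-line predicate; the sketch below evaluates it for two regions summarized by their means and sizes, with the values of g, Q and δ chosen only for illustration.

```python
import math

def should_merge(mean_R, size_R, mean_Rp, size_Rp, g=256, Q=32, delta=1e-4):
    """Evaluate the merging predicate above for two regions summarized by
    their mean intensity and pixel count (g, Q and delta are illustrative)."""
    b = g * math.sqrt((1.0 / (2.0 * Q)) * (1.0 / size_R + 1.0 / size_Rp)
                      * math.log(2.0 / delta))
    return abs(mean_R - mean_Rp) <= b

print(should_merge(120.0, 400, 123.0, 350))   # True: close means get merged
print(should_merge(120.0, 400, 180.0, 350))   # False: distant means do not
```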
3.4. Mixture Models. So far, our group has found that statistical region merging
provides the best segmentation when judged according to the final performance of the
full breast cancer detection scheme. However, there are plenty of avenues left to explore.
To begin with, the methods presented above are all based on the model that a region
of interest has uniform intensity and that each pixel is associated with exactly one
structure within the breast. Since mammograms are projection images and the structures
comprising the breast are generally rounded in shape, neither of these assumptions is
valid.
Current work is aimed at building a more realistic model of the mammogram by
viewing the image as the sum of bivariate Gaussian distributions. The objective is to
find the number of such distributions together with the set of means and variances that
best explain the content of the image.
3.5. Role of Texture. The discussion so far has focused on using image intensity alone
to segment the image. Two regions may have the same mean intensity but differ in the
variance of the intensity values. More generally, the distribution of edges, lines, bumps,
may vary between regions even if mean intensity does not. These notions have led
to the use of texture in image analysis generally, especially in classification tasks. In
mammography, several studies have considered texture in classifying masses as benign or
malignant [3],[17],[20]. In this context, the region of interest has already been identified
and so texture descriptions can be computed for the region without difficulty.
In segmentation, the use of texture is more difficult because texture cannot be
measured one pixel at a time. Typically, each pixel is assigned a neighborhood and
textures measured over that neighborhood are assigned to the pixel. Many texture
measures may be used resulting in the representation of each pixel by a vector of texture
attributes. If the neighborhood assigned to the pixel spills across two regions of different
texture, then the texture attributes assigned to the pixel will reflect neither region
accurately. Thus regions of uniform texture will have blurred boundaries in this vector
representation.
Another problem is that there is usually no way to know ahead of time which
texture attributes characterize the region of interest. A popular approach is to measure
a large number of feature attributes, say by applying a filter bank, and searching
for clusters, called textons, in the resulting space of output vectors [16]. Each texton
represents a pattern that appears often in the image. Each pixel in the original image is
then assigned to the texton which lies closest to the vector of filter outputs associated
with the pixel.

4. IMAGE REGISTRATION
The literature on image registration is huge because of its fundamental role in
tracking objects in sequences of images such as video streams and in coordinating stereo
views of objects. Most of the work is in the context of visual images and radar images
(signals). In these settings, objects of interest are usually robust in the sense that shapes

Figure 4. An affine map suffices to register these two images

are generally consistent and the relative positions of many objects remain constant.
Even if a car moves with respect to buildings in the background, these objects separately
retain their shapes well so that, locally, the map, T , that matches points in one image
to the next can be modelled as having nice mathematical properties (Fig. 4). A small
patch in the first image can be searched for a match in the second image, testing all
possible translations and rotations, for example. If reliable matches are found this way
for several patches, the map T can be inferred by restricting the class to, say, affine
maps.
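For instance, once several patch correspondences are available, an affine map T(x) = Ax + t can be inferred by linear least squares, as in the following sketch (the matched points shown are illustrative only).

```python
import numpy as np

def fit_affine(src, dst):
    """Fit an affine map dst ~ A @ src + t from matched points (m >= 3)."""
    src = np.asarray(src, float)
    dst = np.asarray(dst, float)
    ones = np.ones((src.shape[0], 1))
    X = np.hstack([src, ones])              # rows [x, y, 1]
    params, *_ = np.linalg.lstsq(X, dst, rcond=None)
    A = params[:2].T                        # 2x2 linear part
    t = params[2]                           # translation
    return A, t

# illustrative patch-centre correspondences between two images
src = [(10, 12), (40, 15), (22, 60), (55, 48)]
dst = [(12, 13), (43, 18), (23, 62), (57, 52)]
A, t = fit_affine(src, dst)
```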
In the case of mammograms, the situation is quite different for several reasons.
First, screening visits are typically one to three years apart. Breast tissue naturally
changes over time. Breasts become more fatty with age, calcium deposits increase, etc.
Second, the positioning of the breast and exposure settings are generally not consistent.
Third, different types of film or detectors may be used as technology advances. Fourth,
and most important, the breast is compressed between two plates at acquisition. The
soft tissue rolls inconsistently so that, in the x-ray image, being a projection image, the
relative positions of objects is not necessarily consistent. The map between x-rays of
the same breast from consecutive visits is not even a function since a single pixel in one
image may correspond to two locations in the second image (Fig. 5).
These considerations may discourage attempts to register mammograms, but there
is a mitigating factor. Aligning all the tissue is not necessary. Only candidate tumors
need to be associated in order to decide which anomalies are new, which have changed
and which have remained essentially unchanged. This view inspired a method to replace
true registration by matching only the information content relevant to cancer [9]. The
first step is to use detection methods to find all possible candidates for masses in both
images. Typically, 40-50 regions are included even though most images have no tumors
and very few have more than one. Next all the candidate masses in one image are
assigned a "mass-like" score that indicates how much these objects resemble true masses.
This score is based on shape, contrast and texture features. The candidate masses are
then viewed as vertices of a graph with attributes assigned to indicate fuzzy descriptions
of relative location. In addition, the breast boundary is included in the list to provide a
fuzzy description of the location within the breast too. The graphs for the two images
are then aligned using a graph matching algorithm.


Figure 5. The circle at the top represents a breast in the normal state.
The small circle and the cross represent anomalies in the breast lying
in the same vertical plane. Compression at the first visit (C1 ) results
in a distorted breast and the circle and cross happen to align. An x-
ray image taken top to bottom of the breast shows these two objects
as superimposed. Once acquisition is complete, the breast resumes its
normal shape so the map C1 can be viewed as invertible. At the second
visit, the compression C2 results in a different relative position of the
two anomalies and the resulting x-ray shows the circle and cross as
separate objects. The induced map C can be modeled as invertible.
However, the map T between the two x-ray images is not even a func-
tion.

If an anomaly is found in the current image with a high mass-like score but is
matched to a mass in the same location with a similar mass-like score, then the anomaly is
rejected as a candidate cancer. This step reduces the number of false positive detections
of cancer. On the other hand, if an anomaly is found in the current image with no match
in the previous image or is matched to a much smaller anomaly in the previous image,
then this will be flagged as likely to be cancer. The key is that anomalies can be matched
even if their location relative to other anomalies are slightly different in the two images
or overlap in one image.

5. CONCLUDING REMARKS
This has been a rather haphazard gallop through mathematical ideas arising in
computer-aided screening mammography. The message is that the most mathematically
appealing solutions do not always provide the best results, but mathematical ideas still
provide the way forward in improving early detection of breast cancer.
Radiologists are very good at spotting breast cancer without computers and
computer-aided diagnosis systems in current use are quite successful. Hence, the best
one can usually hope for is a slight improvement over current performance. The in-
clusion of graph matching to circumvent image registration (Section 4), for example,
resulted in a reduction in the false positive rate from 1.04 false positive reports per
image to 1.00 at a true detection rate of 80 percent. On the other hand, due to the very
large number of women attending screening programs world wide and the high preva-
lence of breast cancer, a small improvement in screening performance has the potential
to save thousands of lives.

Acknowledgement. The authors thank the National Breast Cancer Foundation and
the Flinders Medical Centre Foundation for support and BreastScreen SA for access to
archives of screening mammograms.

References
[1] M. Bajger, F. Ma, and M. J. Bottema. Automatic tuning of MST segmentation of mammograms
for registration and mass detection algorithms. In M. J. Bottema B. C. Lovel A. J. Maeder H. Shi,
Y. Zhang, editor, 2009 Digitial Image Computing Techniques and Applications, Melbourne, Aus-
tralia, Dec. 2009, IEEE Computer Society, pages 400–407, 2009.
[2] M. Bajger, S. Williams, F. Ma, and M. J. Bottema. Mammographic mass detection with statistical
region merging in digital mammography. In Digital Image Computing Techniques and Applica-
tions, Sydney, Australia, Dec. 2010, IEEE Computer Vision Society, pages 27–32, 2010.
[3] H-P. Chan, D. Wei, M. A. Halvie, B. Sahiner, D. D. Adler, M. M. Goodsitt, and N. Petrick.
Computer-aided classification of mammographic masses and normal tissue: Linear discriminant
analysis in texture feature space. Phys. Med. Biol., 40:857–876, 1995.
[4] L. D. Cohen. On active contour models and balloons. Comput. Vision, Graphics, and Image Proc.:
Image Understanding, 53:211–218, 1991.
[5] P. F. Felzenszwalb and D. P. Huttenlocher. Image segmentation using local variation. Proceedings
of IEEE Conference on Computer Vision and Pattern Recognition, pages 98–104, 1998.
[6] Y. Guo and H. D. Cheng. New neutrosophic approach to image segmentation. Int. Jour. Computer
Vision, 42:587–595.
[7] J. M. Jolion and A. Montanvert. The adaptive pyramid: A framework for 2d image analysis.
Computer Vision, Graphics, and Image Processing, 55(3):339–348, May 1992.
[8] M. Kass, A. Witkin, and D. Terzopoulos. Snakes: Active contour models. International Journal
of Computer Vision, 1(4):321–331, 1987.
[9] F. Ma, M. Bajger, , and M. J. Bottema. Temporal analysis of mammograms based on graph
matching. In E. A. Krupinski, editor, Digital Mammography 9th International Workshop, IWDM
2008, Tucson, AZ, USA, July 2008, Proceedings, number 5116 in Lecture notes in computer
science, pages 158–165. Springer, 2008.
[10] A. Montanvert, P. Meer, and A. Rosenfeld. Hierarchical image analysis using irregular tessellations.
IEEE Transactions on Pattern Analysis and Machine Intelligence, 13(4):307–316, April 1991.

[11] R. Nock and F. Nielsen. Statistical region merging. Trans. Pattern Anal. Mach. Intell., 26:1452–
1458, 2007.
[12] P. Perona and J. Malik. Scale-space and edge detection using anisotropic diffusion. IEEE Trans-
actions on Pattern Analysis and Machine Intelligence, 12:629–639, 1990.
[13] B. Sahiner, H-P. Chan, N. Petrick, M. A. Helvie, and L. M. Hadjiiski. Improvement of mam-
mographic mass characterization using spiculation measures and morphological features. Medical
Physics, 28(7):1455–1465, 2001.
[14] J. S. Suri and R. M. Rangayyan. Recent advances in breast imaging, mammography, and computer-
aided diagnosis of breast cancer. SPIE, 2006.
[15] P. Taylor, H. Potts, L. Wilkinson, and R. Givin-Wilson. Impact of CAD with full field digital
mammography on workload and cost. In J. Marti, A. Oliver, J. Freixenet, and R. Marti, editors,
Digital Mammography 10th International Workshop, IWDM 2010, Girona, Spain, June 2010,
Proceedings, number 6136 in Lecture notes in computer science, pages 1–8. Springer, 2010.
[16] M. Varma and A. Zisserman. A statistical approach to texture classification from single images.
Int. Jour. Computer Vision, 62:61–81, 2005.
[17] D. Wei, H-P. Chan, et al. False-positive reduction technique for detection of masses on digital
mammograms: Global and local multiresolution texture analysis. Medical Physics, 24(6):903–914,
1997.
[18] J. Weickert. Anisotropic Diffusion in Image Processing. B. G. Teubner, Stuttgart, 1998.
[19] C. Xu and J. L. Prince. Snakes, shapes and gradient vector flow. IEEE Trans. Image Proc.,
7:359–369, 1998.
[20] R. Zwiggelaar and E. R. E. Denton. Texture base segmentation. In S. M. Astley, M. Brady,
C. Rose, and R. Zwiggelaar, editors, Digital Mammography 8th International Workshop, IWDM
2006, Manchester, UK, June 2006, Proceedings, number 4046 in Lecture notes in computer science,
pages 433–440. Springer, 2006.

M. J. Bottema
Flinders University.
e-mail: [email protected]

M. Bajger
Flinders University.
e-mail: [email protected]

F. Ma
Flinders University.
e-mail: [email protected]

S. Williams
Flinders University.
e-mail: [email protected]
Proceedings of ”The 6th SEAMS-UGM Conference 2011”
pp. 65–78.

THE ORDER OF PHASE-TYPE DISTRIBUTIONS

Reza Pulungan

Abstract. This paper lays out the past and the future of one of the most interesting
research problems in the area of phase-type distributions: the problem of their minimal
representations. We will chronologically present contemporary results, including our own
contributions to the problem, and provide several pointers and possible approaches in
attempting to solve the problem in future work.
Keywords and Phrases: Phase-type distributions, Markov chain, minimal representa-
tions, order.

1. INTRODUCTION
The problem of minimal representations remains one of the open problems in the
research area of phase-type (PH) distributions [16, 18]. Given a phase-type distribution,
a minimal representation is an absorbing Markov chain with the fewest number of
states, whose distribution of time to absorption is governed by the same phase-type
distribution. Obtaining minimal representations is important in various circumstances,
including, but not limited to, modeling formalisms that support compositionality [13,
2, 12]. In such circumstances, models are constructed by composing smaller components
via various operations that usually result in exponential blowups of the state space.
Ensuring that all components and all intermediate results of the composition come in
minimal representations will significantly reduce these blowups.
Previous research [1, 19, 20, 3, 4, 15, 5, 6] has produced several techniques to
obtain these minimal representations. However, the frontier is still limited to acyclic
phase-type distributions [9, 10, 11, 22], namely those phase-type distributions having
at least one Markovian representation that contains no cycle. Even in this case, the
resulting algorithm is not yet satisfactory, for it contains non-linear programming, which
can be inefficient in many cases.

2010 Mathematics Subject Classification: 60J27, 60J28.


This paper lays out the past and the future of one of the most interesting research
problems in the area of phase-type distributions: the problem of their minimal repre-
sentations. We will chronologically present contemporary results, including our own
contributions to the problem, and provide several pointers and possible approaches in
attempting to solve the problem in future work.
The paper is organized as follows: Section 2 introduces phase-type distributions
and other concepts required throughout the paper. In this section, we also formulate
the problem of the order of phase-type distributions. Section 3 lays out previous partial
solutions to the problem. In Section 4, we describe our contribution in solving the prob-
lem by proposing an algorithm to reduce the size of acyclic phase-type representations.
The paper is concluded in Section 5.

2. PRELIMINARIES
2.1. Phase-Type Distributions. Let the stochastic process {X(t) ∈ S | t ∈ R+ } be
a homogeneous Markov process defined on a discrete and finite state space
S = {s1 , s2 , · · · , sn , sn+1 }
and with time parameter t ∈ R+ := [0, ∞). The Markov process is a finite continuous-
time Markov chain (CTMC). We view the structure of such a CTMC as a tuple
M = (S, R), where R is a rate matrix R : S × S → R+. The rate matrix R is related
to the corresponding infinitesimal generator matrix by $Q(s, s') = R(s, s')$ if $s \neq s'$, and
$Q(s, s) = -\sum_{s' \neq s} R(s, s')$, for all $s, s' \in S$. If state $s_{n+1}$ is absorbing (i.e.,

Q(sn+1 , sn+1 ) = 0) and all other states si are transient (i.e., there is a nonzero proba-
bility that the state will never be visited once it is left, or equivalently, there exists at
least one path from the state to the absorbing state), the infinitesimal generator matrix
of the Markov chain can be written as:
$$Q = \begin{pmatrix} A & \vec{A}\\ \vec{0} & 0 \end{pmatrix}.$$
Matrix A is called a PH-generator and it is non-singular because the first n states
in the Markov chain are transient. The vector $\vec{A}$ is a column vector whose component $\vec{A}_i$,
for i = 1, · · · , n, represents the transition rate from state $s_i$ to the absorbing state. The
Markov chain is fully specified by the generator matrix Q and the initial probability
vector $(\vec{\alpha}, \alpha_{n+1})$, where $\vec{\alpha}$ is an n-dimensional row vector corresponding to the initial
probabilities of the transient states and $\alpha_{n+1}$ is the initial probability to be immediately
in the absorbing state. Therefore $\vec{\alpha}\vec{1} + \alpha_{n+1} = 1$, where $\vec{1}$ is an n-dimensional column
vector whose components are all equal to 1.
Definition 2.1 (Phase-Type Distribution [16]). A probability distribution on R+ is
a phase-type (PH) distribution if and only if it is the distribution of the time until
absorption in a Markov process of the type described above.
The pair $(\vec{\alpha}, A)$ is called the representation of the PH distribution, and $PH(\vec{\alpha}, A)$
is used to denote the PH distribution with representation $(\vec{\alpha}, A)$.

The probability distribution of the time until absorption in the Markov chain
(hence of PH distribution) is given by:
$$F(t) = 1 - \vec{\alpha}\exp(At)\vec{1}, \qquad \text{for } t \ge 0. \tag{1}$$
The Laplace-Stieltjes transform (LST) of the PH distribution is given by:
$$\tilde{f}(s) = \int_{-\infty}^{\infty} \exp(-st)\,dF(t) = \vec{\alpha}(sI - A)^{-1}\vec{A} + \alpha_{n+1}, \tag{2}$$
where s ∈ R+ and I is the n-dimensional identity matrix. Consider the LST of the PH
distribution in (2). This transform is a rational function, namely:
$$\tilde{f}(s) = \vec{\alpha}(sI - A)^{-1}\vec{A} + \alpha_{n+1} = \frac{P(s)}{Q(s)},$$
for some polynomials P(s) and Q(s) ≠ 0.
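To make these formulas concrete, here is a minimal Python sketch (our own illustration, not code from the paper) that builds a small PH representation (α⃗, A) with arbitrarily chosen rates, evaluates the distribution function (1) through the matrix exponential, and evaluates the LST (2) directly.

```python
import numpy as np
from scipy.linalg import expm

# Example PH representation (alpha, A): a 3-state acyclic chain (hypothetical values).
alpha = np.array([0.6, 0.3, 0.1])          # initial probabilities of the transient states
A = np.array([[-3.0,  3.0,  0.0],
              [ 0.0, -2.0,  2.0],
              [ 0.0,  0.0, -1.5]])          # PH-generator
A_exit = -A @ np.ones(3)                    # column vector of rates into the absorbing state
alpha_abs = 1.0 - alpha.sum()               # initial mass placed directly in the absorbing state

def cdf(t):
    """F(t) = 1 - alpha exp(At) 1, equation (1)."""
    return 1.0 - alpha @ expm(A * t) @ np.ones(3)

def lst(s):
    """f~(s) = alpha (sI - A)^{-1} A_exit + alpha_{n+1}, equation (2)."""
    return alpha @ np.linalg.solve(s * np.eye(3) - A, A_exit) + alpha_abs

print(cdf(1.0))   # probability of absorption by time 1
print(lst(0.0))   # equals 1, since the LST at s = 0 is the total probability mass
```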
2.2. Acyclic Phase-Type Distributions. An interesting subset of the family of PH
distributions is the family of acyclic PH distributions. The family can be identified
by the fact that they have triangular representations. A triangular representation is a
representation (~ α, A) where matrix A, under some permutation of its components, is
an upper triangular matrix.
In [19], O’Cinneide proved the following theorem, which characterizes acyclic PH
distributions in terms of the properties of their density functions and their LSTs.
Theorem 2.1 ([19]). A probability distribution defined on R+ , which is not the point
mass at zero, is an acyclic PH distribution if and only if (1) its density function is
strictly positive on (0, ∞), and (2) its LST is rational and has only real poles.
Thus, any general PH representation—possibly containing cycles—represents an
acyclic PH distribution (and hence has an acyclic representation) whenever the poles
of its LST are all real numbers.
2.3. Ordered Bidiagonal Representations. Consider the PH-generator
$$\mathrm{Bi}(\lambda_1, \lambda_2, \cdots, \lambda_n) = \begin{pmatrix} -\lambda_1 & \lambda_1 & 0 & \cdots & 0 \\ 0 & -\lambda_2 & \lambda_2 & \cdots & 0 \\ 0 & 0 & -\lambda_3 & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & -\lambda_n \end{pmatrix}.$$
If λn ≥ λn−1 ≥ · · · ≥ λ1 > 0, then a PH representation (β⃗, Bi(λ1, λ2, · · · , λn)) is
called an ordered bidiagonal representation.
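A small helper (a sketch of ours, not taken from [9] or [22]) that constructs Bi(λ1, · · · , λn) as a NumPy array and checks the ordering condition can be written as follows.

```python
import numpy as np

def bidiagonal(lams):
    """Construct the PH-generator Bi(lambda_1, ..., lambda_n)."""
    n = len(lams)
    B = np.zeros((n, n))
    for i, lam in enumerate(lams):
        B[i, i] = -lam
        if i + 1 < n:
            B[i, i + 1] = lam   # rate to the next state; the last state exits to absorption
    return B

def is_ordered(lams):
    """Check lambda_n >= ... >= lambda_1 > 0, the ordering required above."""
    return all(l > 0 for l in lams) and all(a <= b for a, b in zip(lams, lams[1:]))

print(bidiagonal([1.0, 2.0, 3.0]))
print(is_ordered([1.0, 2.0, 3.0]))   # True: rates are non-decreasing
```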
2.4. Size of Representation, Algebraic Degree, and Order. All PH representa-
tions we are dealing with in this paper are assumed to be irreducible. A representation
is irreducible if for the specified initial distribution any transient state is visited with
non-zero probability.
For a PH distribution with an irreducible representation (~ α, A), the size of the
representation is defined as the dimension of matrix A. The degree of the denominator
polynomial of its LST expressed in irreducible ratio is called the algebraic degree—or
simply the degree—of the distribution.
It is known [16, 18] that a given PH distribution has more than one irreducible
representation. The size of a minimal irreducible representation, namely a representa-
tion with the fewest possible number of states, is referred to as the order of the PH
distribution. O’Cinneide in [18] showed that the order of a PH distribution may be
different from, but at least as great as, its algebraic degree. Therefore, the following
theorem is straightforward.

Theorem 2.2. Consider a PH representation whose size n is equal to the algebraic
degree of its PH distribution. Then the order of the PH distribution is n.

2.5. The Problem of the Order of Phase-Type Distributions. The main problem
addressed in this paper is the problem of the order of PH distributions, namely: given a
PH distribution, what is its order? Stated differently, we would like to find the minimal
number of states required to represent a given PH distribution as an absorbing CTMC.
The PH distribution can be given in various ways: as a probability distribution in
mathematical formulas, as a Laplace-Stieltjes transform, or even as a PH representation
of a certain size.
As a byproduct, of course, it would be advantageous to also be able to devise
methods to compute a minimal representation—namely a PH representation whose size
is equal to the order—of the given PH distribution.

3. PREVIOUS RESULTS
In this section, previous results on the partial solutions to the problem of the order
of PH distributions are presented. The first early result is given in Theorem 2.2. This
theorem is a restatement of lemmas found in [16, 18]. The theorem basically establishes
that the lower bound of the order of PH distributions is their respective algebraic degree.
In the following subsections, we present further partial results: first for acyclic
PH distributions, then for general PH distributions and the relationship between
PH-simplicity and order, and, finally, an attempt to find non-minimal but nonetheless
sparse representations.

3.1. Acyclic Phase-Type Distributions. Cumani in [7] presented three canonical


forms of acyclic PH representations. Of particular interest to us, he proved Theorem 3.1.
Aside from the ordered bidiagonal representation, he also provided two other canonical
forms and straightforward procedures to transform one into the others. A similar theorem
was proved by O’Cinneide in [19].

Theorem 3.1 ([7]). Any PH distribution with an acyclic PH representation of a certain
size has an ordered bidiagonal representation of equal or smaller size.

He and Zhang in [9] provided an algorithm, called the spectral polynomial algo-
rithm, to obtain the ordered bidiagonal representation of any given acyclic PH repre-
sentation. The spectral polynomial algorithm has complexity O(n³), where n is the size
of the given acyclic PH representation.
In [19], O’Cinneide formally characterized acyclic PH distributions by proving
Theorem 2.1. The characterization basically relates acyclic PH distributions to the
shape of their density functions and LSTs. The theorem maintains that the LST of any
acyclic PH distribution is a rational function and all of its poles are real. Hence, a PH
representation could be cyclic; but as long as its LST has only real poles, there must
exist an acyclic PH representation that has the same PH distribution.
The following three theorems by Commault and Chemla in [4] specify certain
conditions for acyclic PH representations to be minimal, namely to have their size be
equal to their respective order.
Theorem 3.2 ([4]). The order of a PH distribution with LST f̃(s) = P(s)/Q(s), where
P(s) and Q(s) are co-prime polynomials, such that Q(s) has degree n with n real roots
and P(s) has degree less than or equal to one, is n.
Theorem 3.2 establishes that the convolution of several exponential distributions
always produces minimal PH representations. This means that Erlang representations—
formed by a convolution of several exponential distributions of the same rate—and
hypoexponential representations—formed by a convolution of several exponential dis-
tributions of possibly different rates—are always minimal.
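As a worked instance of Theorem 3.2 (our own illustration), consider the hypoexponential distribution, i.e., the convolution of exponentials with rates λ1, · · · , λn. Its LST is
$$\tilde{f}(s) = \prod_{i=1}^{n}\frac{\lambda_i}{s+\lambda_i} = \frac{P(s)}{Q(s)}, \qquad P(s) = \prod_{i=1}^{n}\lambda_i, \quad Q(s) = \prod_{i=1}^{n}(s+\lambda_i).$$
Here P(s) and Q(s) are co-prime, Q(s) has degree n with the n real roots −λ1, · · · , −λn, and P(s) has degree 0 ≤ 1, so Theorem 3.2 gives order n: the n-state chain realizing the convolution is already minimal.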
Theorem 3.3 ([4]). Consider a PH distribution with LST f̃(s) = P(s)/Q(s), where
P(s) and Q(s) are co-prime polynomials with real roots, such that:
$$P(s) = \left(\frac{s+\mu_1}{\mu_1}\right)\left(\frac{s+\mu_2}{\mu_2}\right), \quad \mu_1 \ge \mu_2 > 0, \qquad\text{and}\qquad Q(s) = \prod_{i=1}^{n}\frac{s+\lambda_i}{\lambda_i}, \quad \lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_n > 0.$$
If µ2 ≥ λn and (µ1 + µ2) ≥ (λn−1 + λn), then the order of the distribution is n.
Theorem 3.4 ([4]). Consider a PH distribution with LST f̃(s) = P(s)/Q(s), where
P(s) and Q(s) are co-prime polynomials with real roots, such that:
$$P(s) = \prod_{i=1}^{m}\frac{s+\mu_i}{\mu_i}, \quad \mu_1 \ge \mu_2 \ge \cdots \ge \mu_m > 0, \qquad\text{and}\qquad Q(s) = \prod_{i=1}^{n}\frac{s+\lambda_i}{\lambda_i}, \quad \lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_n > 0,\; n > m.$$
If µm ≥ λn, µm−1 ≥ λn−1, · · · , µ1 ≥ λn−m+1, then the order of such a PH distribution is n.
Theorems 3.3 and 3.4 provide several conditions under which the convolution of several
exponential distributions of possibly different rates, where the chain may be entered not
only at the first state, is minimal.

So far, the partial results only provide conditions for the order of acyclic PH
distributions to be equal to the size of the representations. This, in itself, is important,
since it provides a means to determine whether an existing PH representation is already
minimal, or we should first try to find a smaller or even a minimal representation before
proceeding to use it. However, algorithmic results would be even more useful. Such results
will allow us to obtain not only the order but also the minimal PH representations
themselves. We shall return to this issue in Section 4, where such algorithmic methods
are described.

3.2. General Phase-Type Distributions. For general PH distributions, namely cyclic
as well as acyclic ones, the following two theorems provide lower bounds on the number
of states required to represent PH distributions.
Let m(µ) be the mean of distribution µ and σ(µ) be its standard deviation. Then
the coefficient of variation of the distribution is defined by:
$$Cv(\mu) = \frac{\sigma(\mu)}{m(\mu)}.$$
Theorem 3.5 ([1]). Consider a PH representation of size n and let µ be its PH distribution. Then:
$$Cv(\mu) \ge \frac{1}{\sqrt{n}}.$$
Moreover, the equality holds only in the case of n-state Erlang representations.
Theorem 3.5 establishes that it requires n states, where n ≥ 1/Cv(µ)², to represent
a PH distribution with coefficient of variation Cv(µ). Hence, in order to obtain a low
coefficient of variation, bigger PH representations are needed.
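The bound can be checked numerically from a representation. The sketch below (ours; the rates are arbitrary) uses the standard PH moment formula E[X^k] = (−1)^k k! α A^{−k} 1 (not restated in this paper, but classical) to compute Cv(µ) for n-state Erlang representations and compare it with 1/√n.

```python
import math
import numpy as np

def ph_moment(alpha, A, k):
    """k-th moment of PH(alpha, A): (-1)^k * k! * alpha @ A^{-k} @ 1 (standard formula)."""
    n = len(alpha)
    A_inv_k = np.linalg.matrix_power(np.linalg.inv(A), k)
    return (-1) ** k * math.factorial(k) * alpha @ A_inv_k @ np.ones(n)

def coeff_of_variation(alpha, A):
    m1 = ph_moment(alpha, A, 1)
    m2 = ph_moment(alpha, A, 2)
    return math.sqrt(m2 - m1 ** 2) / m1

def erlang(n, lam):
    """n-state Erlang representation: start in state 1, identical rates lam."""
    alpha = np.zeros(n); alpha[0] = 1.0
    A = np.diag([-lam] * n) + np.diag([lam] * (n - 1), k=1)
    return alpha, A

for n in (2, 4, 8):
    alpha, A = erlang(n, 3.0)
    print(n, coeff_of_variation(alpha, A), 1.0 / math.sqrt(n))  # equality, as Theorem 3.5 predicts
```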
Theorem 3.6 ([8] in [6]). Let A be a PH-generator of size n. Let −λ1, λ1 > 0, be its
eigenvalue with maximal real part and −λ2 ± iθ, λ2 > 0 and θ > 0, be any pair of its
complex eigenvalues. The following relation is satisfied:
$$\frac{\theta}{\lambda_2 - \lambda_1} \le \cot\frac{\pi}{n}.$$

Theorem 3.6 establishes that it requires n states, where
$$n \ge \frac{\pi}{\arctan\frac{\lambda_2 - \lambda_1}{\theta}},$$
to represent such PH distributions. Since the poles of the LST of a PH distribution
are eigenvalues of the PH-generator of any of its representations, the order of the
representation increases when the angle between the position of any complex poles
and the vertical line passing through the real dominating pole decreases [6]. This the-
orem assures us that finding a PH representation of a size that is exactly equal to the
algebraic degree of its PH distribution is not always possible.
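As a small numerical illustration (with made-up eigenvalues), the lower bound implied by Theorem 3.6 can be tabulated directly; it grows quickly as the complex pair −λ2 ± iθ approaches the vertical line through the dominant real pole −λ1.

```python
import math

def order_lower_bound(lam1, lam2, theta):
    """Smallest integer n satisfying n >= pi / arctan((lam2 - lam1) / theta), from Theorem 3.6."""
    return math.ceil(math.pi / math.atan((lam2 - lam1) / theta))

# Dominant real pole at -1; the complex pair moves closer to the vertical line Re(s) = -1.
for lam2 in (3.0, 1.5, 1.1, 1.01):
    print(lam2, order_lower_bound(1.0, lam2, theta=1.0))
```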

3.3. PH-Simplicity and Order. Let {Xt | t ∈ R≥0 } be an absorbing Markov process
representing a PH distribution and let τ be a random variable denoting its absorption
time.
Definition 3.1 ([14]). The dual or the time-reversal representation of the absorbing
Markov process {Xt | t ∈ R≥0 } is given by an absorbing Markov process {Xτ −t | t ∈
R≥0 }.
The relationship between the two processes can be described intuitively as follows:
the probability of being in state s at time t in one Markov process is equal to the
probability of being in state s at time τ − t in the time-reversal Markov process and
vice versa.
Lemma 3.1 ([3, 5]). Given a PH representation (α⃗, A), its dual representation is (β⃗, B) such that:
$$\vec{\beta} = \vec{A}^{T}M \quad\text{and}\quad B = M^{-1}A^{T}M,$$
where M = diag(m⃗) is a diagonal matrix whose diagonal components are formed by the
components of the vector m⃗ = −α⃗A⁻¹.
Lemma 3.1 provides a recipe to obtain the dual representation of a given PH
representation. It is important to note that a PH representation and its dual are of
equal size.
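Lemma 3.1 translates directly into a few lines of NumPy. The sketch below is our own transcription (the variable names and the example matrices are ours); the final loop checks numerically that the original representation and its dual yield the same distribution function.

```python
import numpy as np
from scipy.linalg import expm

def dual(alpha, A):
    """Dual (time-reversal) representation of (alpha, A), following Lemma 3.1."""
    n = len(alpha)
    A_exit = -A @ np.ones(n)              # exit-rate vector of the original representation
    m = -alpha @ np.linalg.inv(A)         # m = -alpha A^{-1}
    M = np.diag(m)
    beta = A_exit @ M                     # beta = A_exit^T M (A_exit treated as a row vector)
    B = np.linalg.inv(M) @ A.T @ M        # B = M^{-1} A^T M
    return beta, B

alpha = np.array([0.7, 0.3, 0.0])
A = np.array([[-4.0, 4.0, 0.0],
              [ 0.0, -2.0, 1.0],
              [ 0.0,  0.0, -1.0]])
beta, B = dual(alpha, A)

# Same distribution: compare F(t) = 1 - v exp(Gt) 1 for both representations.
for t in (0.5, 1.0, 2.0):
    F1 = 1 - alpha @ expm(A * t) @ np.ones(3)
    F2 = 1 - beta  @ expm(B * t) @ np.ones(3)
    print(round(F1, 10), round(F2, 10))
```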
The notion of PH-simplicity, on the other hand, was first formalized in [17] and
it is closely related to the notion of simplicity in convex analysis.
Definition 3.2. A PH-generator A (of dimension n) is PH-simple if and only if for
any two n-dimensional substochastic vectors α⃗1 and α⃗2 with α⃗1 ≠ α⃗2, PH(α⃗1, A) ≠ PH(α⃗2, A).
Theorem 3.7 ([3]). Given a PH representation of size n. If both PH-generators of the
representation and its dual representation are PH-simple, then the algebraic degree of
the associated PH distribution is n.
Theorem 3.7 establishes the relationship between PH-simplicity and the order of
PH distributions. In particular, the theorem maintains that if the PH-generators of a PH
representation and of its dual representation are both PH-simple, then, no matter their
initial probability distributions, both representations are minimal and the order of the
associated PH distribution is equal to the size of the representations.
3.4. Mixture of Monocyclic Erlang. Figure 1 depicts an example of a monocyclic
Erlang representation in graph form.
The representation has n states and ends in an absorbing state, depicted by the
black circle. The representation is basically formed by a convolution of n exponential
distributions of the same rate λ—hence, Erlang—but with a single cycle from the last
to the first state—hence, monocyclic—with rate µ < λ.
Mocanu and Commault in [15] show that a conjugate pair of complex poles in the
LST of a PH distribution can be represented by a single monocyclic Erlang, and they
proceeded to prove Theorem 3.8.

Figure 1. A Monocyclic Erlang Representation

Theorem 3.8 ([15]). Every PH distribution has a PH representation, which is a mix-


ture of monocyclic Erlangs (MME).
Theorem 3.8 establishes that any PH distribution can be represented by a mixture
of monocyclic Erlang representations. This is done by constructing an MME
PH-generator based on the poles of the LST of the given PH distribution. Then a
“representation”—that is not necessarily Markovian, namely whose initial probability
vector is not substochastic—is formed by using the obtained PH-generator. A proper
PH representation is then looked for by repeatedly constructing Euler approximants in
a suitable space of probability distributions until a representation with a substochastic
vector is obtained. Each approximation adds a new state to the existing, intermediate
“representation” [15].
The procedure to obtain the mixture of monocyclic Erlang representation of a
PH distribution is not guaranteed to end with a minimal PH representation. Hence,
the order of the PH distribution still cannot be determined. However, the resulting
representation is sparse, in the sense that, even though it contains more states, it
contains only a small number of transitions.
In this section, we have described several partial results on the solution to the
problem of the order of PH distributions. In the field of acyclic PH distributions, aside
from the conditions described above, a complete solution has been found, as will be
explained further in the next section.
In the field of general (cyclic) PH distributions, on the other hand, the partial
results are rather limited. Several lower bounds on the order of PH distributions have
been discovered (cf. Theorems 3.5 and 3.6). A condition specifying when a (cyclic) PH
representation is minimal, because its size is equal to the degree of its PH distribution,
has been provided (cf. Theorem 3.7). A complete algorithmic solution to the problem
of the order in the field of general PH distributions, however, does not exist yet. The
frontier in the algorithmic solution is provided in [15] (cf. Theorem 3.8). The proposed
algorithm, nevertheless, only produces PH representations that are sparse but not nec-
essarily minimal. Hence, there is yet no way to determine the order of the associated
PH distributions.

4. OUR CONTRIBUTION
In this section, we explore further the algorithmic solution to the problem
in the field of acyclic PH distributions. In [10], He and Zhang provided an algorithm
for computing minimal ordered bidiagonal representations of acyclic PH distributions.


The algorithm of He and Zhang starts by immediately transforming a given acyclic PH
distribution to a representation that only contains states that represent the poles of the
LST of the distribution. This representation is not necessarily a PH representation, but
it certainly represents a matrix-exponential distribution. If it is not yet a valid PH
representation, another state and its total outgoing rate are determined and appended
to the representation. This is performed one state at a time until a PH representation is
obtained. The first PH representation found is a minimal representation. The algorithm
involves solving systems of non-linear equations when additional states and their total
outgoing rates are to be determined. Since non-linear programming is difficult, the
practicality of this algorithm for large models is not obvious and has not been investigated so far.
In the following, we will describe our contribution to the field, namely an algorithm
to reduce the size of acyclic PH representations. The algorithm is of cubic complexity
in the size of the state space, and only involves standard numerical computations. The
goal is to reduce the state space of the original representation one state at a time.
The algorithm returns a representation of smaller or equal size compared to the original one.
However, unlike the algorithm of HE and Zhang, the result is not guaranteed to be
minimal. The algorithm starts by transforming a given acyclic PH representation to its
ordered bidiagonal representation. This transformation does not increase the number
of states. It then proceeds by removing “unnecessary” states while maintaining the
resulting representation to be phase-type. The removal of a state involves solving a
system of linear equations. This removal is repeated until no more removal is possible.
The algorithm is easy to implement and straightforward to parallelize. It only consists
of vector-matrix multiplications and the solutions of well-conditioned systems of linear
equations. Furthermore, because we are dealing with bidiagonal representations, these
operations can be carried out even more efficiently.
The exposition in the rest of this section is based mainly on [22]. In the following,
we discuss a procedure to reduce the size of acyclic PH representations. The procedure
is roughly as follows: (1) Given an acyclic PH distribution with representation (α⃗, A),
it is transformed into an ordered bidiagonal representation (β⃗, Bi(λ1, λ2, · · · , λn)) by
using the spectral polynomial algorithm [9], without increasing its size. (2) A smaller
representation is obtained by eliminating unnecessary states from the ordered bidiagonal
representation. If successful, the resulting representation is also an ordered bidiagonal
representation with fewer states.

L-terms. The LST of an exponential distribution with rate λ is given by f̃(s) = λ/(s+λ).
Let L(λ) = (s+λ)/λ, i.e., the reciprocal of this LST. We call a single expression of L(·) an
L-term. The LST of an ordered bidiagonal representation (β⃗, Bi(λ1, λ2, · · · , λn)) can
be written as:
$$\tilde{f}(s) = \frac{\beta_1}{L(\lambda_1)\cdots L(\lambda_n)} + \frac{\beta_2}{L(\lambda_2)\cdots L(\lambda_n)} + \cdots + \frac{\beta_n}{L(\lambda_n)} = \frac{\beta_1 + \beta_2 L(\lambda_1) + \cdots + \beta_n L(\lambda_1)\cdots L(\lambda_{n-1})}{L(\lambda_1)L(\lambda_2)\cdots L(\lambda_n)}, \tag{3}$$
but this may not be in irreducible ratio form. Here the denominator polynomial corre-
sponds exactly to the sequence of the transition rates of the ordered bidiagonal represen-
tation, and thus its degree is equal to the size of the ordered bidiagonal representation.
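The numerator polynomial of (3) is easy to manipulate symbolically. The following sketch (ours, using SymPy; the rates and initial vector are illustrative and chosen so that a common L-term exists) forms R(s) and evaluates the divisibility condition R(−λi) = 0 that drives the reduction described next.

```python
import sympy as sp

s = sp.symbols('s')

def L(lam):
    """An L-term: the reciprocal of the LST of an exponential with rate lam."""
    return (s + lam) / lam

# Illustrative ordered bidiagonal representation (beta, Bi(1, 2, 3)).
lams = [1, 2, 3]
beta = [sp.Rational(1, 2), sp.Rational(1, 2), 0]

# Numerator polynomial of (3): R(s) = beta_1 + beta_2 L(l_1) + beta_3 L(l_1) L(l_2).
R = sum(b * sp.Mul(*[L(l) for l in lams[:k]]) for k, b in enumerate(beta))
R = sp.simplify(R)   # here R(s) = (s + 2)/2

for lam in lams:
    print(lam, sp.simplify(R.subs(s, -lam)) == 0)   # divisibility check: only lam = 2 qualifies
```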

Reduction. Observing (3), we see that in order to remove a state from the ordered
bidiagonal representation, we have to find a common L-term in both the numerator and
denominator polynomials. If we find one, we might be able to drop a state from the
representation. Removing a common L-term from the numerator and denominator,
however, involves redistributing the initial probability distribution. This may not be possible,
because the resulting vector δ⃗ may not be substochastic (a vector δ⃗ is substochastic if
δi ≥ 0 for all i and δ⃗·1⃗ ≤ 1). Otherwise, a state can be removed. The procedure of identifying
and properly removing a state from an ordered bidiagonal representation is based on
Lemma 4.1 (see [21] for the proof).
Let ME(α⃗, A, ω⃗) denote the matrix-exponential (ME) distribution of representa-
tion (α⃗, A, ω⃗). The set of PH distributions is a subset of the set of ME distributions. In particular,
the initial distribution vector α⃗ in an ME representation is allowed to be a non-stochastic
vector, as long as 0 < Σi α⃗i ≤ 1. Let 1⃗|x be a vector of dimension x whose components
are all equal to 1.
Lemma 4.1. If for some 1 ≤ i ≤ n, β1 + β2 L(λ1) + · · · + βi L(λ1) · · · L(λi−1) is divisible
by L(λi), then there exists a unique vector δ⃗ such that:
$$PH(\vec{\beta}, \mathrm{Bi}(\lambda_1, \cdots, \lambda_n)) = ME(\vec{\delta}, \mathrm{Bi}(\lambda_1, \cdots, \lambda_{i-1}, \lambda_{i+1}, \cdots, \lambda_n), \vec{1}|_{n-1}).$$
If the vector δ⃗ is substochastic, then
$$PH(\vec{\beta}, \mathrm{Bi}(\lambda_1, \cdots, \lambda_n)) = PH(\vec{\delta}, \mathrm{Bi}(\lambda_1, \cdots, \lambda_{i-1}, \lambda_{i+1}, \cdots, \lambda_n)).$$

If both conditions are fulfilled, then switching from the given representation
(β⃗, Bi(λ1, · · · , λn)) to the smaller representation (δ⃗, Bi(λ1, · · · , λi−1, λi+1, · · · , λn)) means
reducing the size from n to n − 1. Algorithmically, we investigate the two conditions
for a given λi. The divisibility of the numerator polynomial is checked by testing
whether R(−λi) = 0, where R(s) is the numerator polynomial in (3). The substochas-
ticity (i.e., the absence of negative components) of δ⃗ is checked while computing it,
as explained below.
Let Bi1 := Bi(λ1 , · · · , λi ), Bi2 := Bi(λ1 , · · · , λi−1 ). Lemma 4.2 (see [21] for
proof) states that we can simply ignore the last n − i states in both bidiagonal chains.
Lemma 4.2. If δj = βj+1 for i ≤ j ≤ n − 1, then:
$$PH(\vec{\beta}, \mathrm{Bi}(\lambda_1, \cdots, \lambda_n)) = ME(\vec{\delta}, \mathrm{Bi}(\lambda_1, \cdots, \lambda_{i-1}, \lambda_{i+1}, \cdots, \lambda_n), \vec{1}|_{n-1})$$
implies:
$$PH([\beta_1, \cdots, \beta_i], \mathrm{Bi}_1) = ME([\delta_1, \cdots, \delta_{i-1}], \mathrm{Bi}_2, \vec{1}|_{i-1}). \tag{4}$$
From (4), we obtain:
$$[\beta_1, \cdots, \beta_i]\exp(\mathrm{Bi}_1 t)\,\vec{1}|_{i} = [\delta_1, \cdots, \delta_{i-1}]\exp(\mathrm{Bi}_2 t)\,\vec{1}|_{i-1}. \tag{5}$$

Therefore, to compute [δ1 , · · · , δi−1 ] from [β1 , · · · , βi ], i − 1 equations relating their


components are needed. Equation (5) can be evaluated at i − 1 different t values
to obtain such a required system of equations. However, such function evaluations in
practice are costly, because they involve matrix exponentiations. We proceed differently.
For a PH representation (α⃗, A), the j-th derivative of its distribution function,
for j ∈ N0, is:
$$\frac{d^j}{dt^j}F(t) = -\vec{\alpha}A^{j}\exp(At)\vec{1}.$$
Evaluating these derivatives at t = 0 allows us to avoid computing matrix exponentials.
Hence, the components of the vector [δ1, · · · , δi−1] can be computed by solving:
$$[\delta_1, \cdots, \delta_{i-1}]\,\mathrm{Bi}(\lambda_1, \cdots, \lambda_{i-1})^{j}\,\vec{1}|_{i-1} = [\beta_1, \cdots, \beta_i]\,\mathrm{Bi}(\lambda_1, \cdots, \lambda_i)^{j}\,\vec{1}|_{i}, \qquad 0 \le j \le i-2. \tag{6}$$
Once the system of equations (6) is solved, the substochasticity of vector [δ1 , · · · , δi−1 ]
can be determined simply by verifying that all of its entries are nonnegative real num-
bers.
However, we observe that for any bidiagonal PH-generator of dimension d,
Bi[si , si ] = −Bi[si , si+1 ],
for 1 ≤ i < d. Since both PH-generators in the system of equations are bidiagonal, we
can prove the following lemma [21].
Lemma 4.3. Equations (6) can be transformed into:
$$A\,[\delta_1, \cdots, \delta_{i-1}]^{\top} = \vec{b}, \tag{7}$$
where A is an upper triangular matrix of dimension i − 1.

This transformation requires O(i²) multiplications and O(i²) additions.

Algorithm. Lemma 4.1 can thus be turned into an algorithm that reduces the size of
a given acyclic PH representation (α⃗, A), which we give here in an intuitive form.
(1) Use the spectral polynomial algorithm (SPA) to turn (α⃗, A) into (β⃗, Bi(λ1, · · · , λn)), which takes O(n³) time.
(2) Set i to 2.
(3) While i ≤ n:
(a) Check divisibility w.r.t. λi (i.e., whether R(−λi) = 0), which takes O(n) time.
(b) If not divisible (i.e., R(−λi) ≠ 0), continue the while-loop with i set to i + 1.
Otherwise, construct (7), and then solve it by backward substitution. This
takes O(n²) time, and produces (δ⃗, Bi(λ1, · · · , λi−1, λi+1, · · · , λn)). If the vector
δ⃗ is substochastic (which takes O(n) time to check), continue with the
PH representation (δ⃗, Bi(λ1, · · · , λi−1, λi+1, · · · , λn)) and decrease n
to n − 1; otherwise continue the while-loop with (β⃗, Bi(λ1, · · · , λn)) and i
set to i + 1.
(4) Return (β⃗, Bi(λ1, · · · , λn)).
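A compact NumPy transcription of this loop is sketched below. It is our own illustration, not the implementation of [21, 22]: it assumes the input is already an ordered bidiagonal representation (i.e., step (1) has been carried out by the spectral polynomial algorithm), it solves the system (6) with a generic linear solver instead of the specialized back-substitution of Lemma 4.3, and it uses a numerical tolerance for the divisibility and substochasticity checks.

```python
import numpy as np

def bidiagonal(lams):
    n = len(lams)
    B = np.diag(-np.asarray(lams, dtype=float))
    for i in range(n - 1):
        B[i, i + 1] = lams[i]
    return B

def numerator_at(beta, lams, s):
    """R(s) = beta_1 + beta_2 L(l_1) + ... + beta_n L(l_1)...L(l_{n-1}), with L(l) = (s + l)/l."""
    total, prefix = 0.0, 1.0
    for k, b in enumerate(beta):
        total += b * prefix
        prefix *= (s + lams[k]) / lams[k]
    return total

def reduce_bidiagonal(beta, lams, tol=1e-9):
    """Try to remove states from an ordered bidiagonal representation (beta, Bi(lams))."""
    beta, lams = list(beta), list(lams)
    i = 1                                        # 0-based candidate index (i = 2 in the text)
    while i < len(lams):
        if abs(numerator_at(beta, lams, -lams[i])) > tol:
            i += 1
            continue
        # Solve system (6) for the first components of delta.
        Bi1 = bidiagonal(lams[: i + 1])          # Bi(l_1, ..., l_i) in the text's numbering
        Bi2 = bidiagonal(lams[: i])              # Bi(l_1, ..., l_{i-1})
        rows, rhs = [], []
        for j in range(i):                       # j = 0, ..., i - 2 in the text's numbering
            rows.append(np.linalg.matrix_power(Bi2, j) @ np.ones(i))
            rhs.append(np.dot(beta[: i + 1], np.linalg.matrix_power(Bi1, j) @ np.ones(i + 1)))
        delta_head = np.linalg.solve(np.array(rows), np.array(rhs))
        delta = list(delta_head) + beta[i + 1:]  # Lemma 4.2: the tail of beta is kept
        if min(delta) >= -tol and sum(delta) <= 1 + tol:   # substochasticity check
            beta, lams = delta, lams[:i] + lams[i + 1:]    # state removed; n decreases
        else:
            i += 1
    return np.array(beta), lams

beta, lams = reduce_bidiagonal([0.5, 0.5, 0.0], [1.0, 2.0, 3.0])
print(beta, lams)   # roughly [1., 0.] with rates [1.0, 3.0]: the hypoexponential of rates 1 and 3
```

On this illustrative input the three-state representation collapses to a two-state one, which matches the algebraic degree of the distribution.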

In each iteration of the while-loop, either n is decreased, or i is increased. As a


consequence, the reduction algorithm terminates in O(n³) time, and produces a reduced
representation of the original input (α, A). We refer to [21] for a more exhaustive
discussion.

Properties of the Algorithm. In the following, we discuss several properties of the


proposed algorithm:
(1) Non-minimality for the general case. Even though the algorithm reduces the
size of given acyclic PH representations, it does not always produce minimal
representations. An example is provided in [22] to show why this is the case.
(2) Minimality for triangular ideal PH distributions. A PH distribution is called
triangular ideal if it has acyclic PH representations whose size is equal to the
degree of the PH distribution. Theorems 3.5, 3.6, and 3.7 establish conditions
under which an acyclic PH distribution is triangular ideal. We may encounter
such an acyclic PH distribution, however, in a representation having strictly
larger size than the order of its distribution. In this case, verifying the con-
ditions will be difficult. Even more so if we wish to build a representation of
the same size as the order of the distribution. Given an acyclic PH representa-
tion whose PH distribution is triangular ideal—no matter how large the size of
the representation is—the proposed algorithm is certain to produce a minimal
representation [22].
(3) A realistic case study demonstrating the use of the proposed algorithm is
provided in [22]. A further case study in [21] shows the feasibility of using the pro-
posed algorithm to reduce the size of acyclic PH representations from trillions
of states to thousands of states.

5. CONCLUDING REMARKS
This paper has described one of the most interesting research problems in the
area of phase-type distributions: the problem of their order and hence their minimal
representations. Several partial solutions to the problem have been discussed. For
acyclic phase-type distributions, the problem has basically been solved. An algorithm
that is guaranteed to transform any given acyclic phase-type representation to its min-
imal representation has been proposed in [10]. Although the algorithm involves solving
non-linear programming, which can be difficult, highly unstable and prone to numerical
errors, this algorithm is an excellent basis for further developments and improvements.
Based on our own proposed algorithm to reduce the size of acyclic phase-type represen-
tations, we think that to achieve minimality, non-linearity seems to be unavoidable.
For the general phase-type distributions, the problem is still open. Currently
available partial results are restricted to conditions for minimality without algorithmic
possibilities. The only algorithmic method that we are aware of is the algorithm pro-
posed in [15]. However, this algorithm only strives for obtaining sparse representations,
not minimal ones. Nevertheless, we think that this algorithm is also an excellent ba-
sis for further developments and improvements towards an algorithm that can produce
minimal representations, since the output of the algorithm is quite similar to ordered
bidiagonal representations.

References
[1] Aldous, D. and Shepp, L., The least variable phase-type distribution is Erlang, Communications
in Statistics: Stochastic Models, 3, 467-473, 1987.
[2] Bernardo, M. and Gorrieri, R., Extended Markovian process algebra, In CONCUR 96, Con-
currency Theory, 7th International Conference, Pisa, Italy, August 26-29, 1996, Proceedings,
volume 1119 of Lecture Notes in Computer Science, 315-339, Springer, 1996.
[3] Commault, C. and Chemla, J.-P., On dual and minimal phase-type representations, Communi-
cations in Statistics: Stochastic Models, 9(3), 421-434, 1993.
[4] Commault, C. and Chemla, J.-P., An invariant of representations of phase-type distributions
and some applications, Journal of Applied Probability, 33, 368-381, 1996.
[5] Commault, C. and Mocanu, S., A generic property of phase-type representations, Journal of
Applied Probability, 39, 775-785, 2002.
[6] Commault, C. and Mocanu, S., Phase-type distributions and representations: Some results and
open problems for system theory, International Journal of Control, 76(6), 566-580, 2003.
[7] Cumani, A., On the canonical representation of homogeneous Markov processes modelling failure-
time distributions, Microelectronics and Reliability, 22, 583-602, 1982.
[8] Dmitriev, N. and Dynkin, E.B., On the characteristic numbers of a stochastic matrix, Comptes
Rendus (Doklady) de l'Académie des Sciences de l'URSS (Nouvelle Série), 49, 159-162, 1945.
[9] He, Q.-M. and Zhang, H., Spectral polynomial algorithms for computing bi-diagonal representa-
tions for phase-type distributions and matrix-exponential distributions, Stochastic Models, 2(2),
289-317, 2006.
[10] He, Q.-M. and Zhang, H., An Algorithm for Computing Minimal Coxian Representations, IN-
FORMS Journal on Computing, ijoc.1070.0228, 2007.
[11] He, Q.-M. and Zhang, H., On matrix exponential distributions, Advances in Applied Probability,
39(1), 271-292, 2007.
[12] Hermanns, H., Interactive Markov Chains: The Quest for Quantified Quality, Lecture Notes in
Computer Science, 2428, Springer, 2002.
[13] Hillston, J., A compositional approach to performance modelling, Cambridge University Press,
1996.
[14] Kelly, F.P., Reversibility and Stochastic Networks, Wiley, 1979.
[15] Mocanu, S. and Commault, C., Sparse representation of phase-type distributions, Communica-
tions in Statistics: Stochastic Models, 15(4), 759-778, 1999.
[16] Neuts, M. F., Matrix-Geometric Solutions in Stochastic Models: An Algorithmic Approach,
Dover, 1981.
[17] O’Cinneide, C. A., On non-uniqueness of representations of phase-type distributions, Communi-
cations in Statistics: Stochastic Models, 5(2), 247-259, 1989.
[18] O’Cinneide, C. A., Characterization of phase-type distributions, Communications in Statistics:
Stochastic Models, 6(1), 1-57, 1990.
[19] O’Cinneide, C. A., Phase-type distributions and invariant polytopes, Advances in Applied Prob-
ability, 23(43), 515-535, 1991.
[20] O’Cinneide, C. A., Triangular order of triangular phase-type distributions, Communications in
Statistics: Stochastic Models, 9(4), 507-529, 1993.
[21] Pulungan, R., Reduction of acyclic phase-type representations, Ph.D. Dissertation, Saarland
University, Saarbrücken, Germany, 2009.

[22] Pulungan, R. and Hermanns, H., Acyclic minimality by construction—almost, Sixth Interna-
tional Conference on Quantitative Evaluation of Systems, QEST 2009, 61-72, IEEE Computer
Society, 2009.

Reza Pulungan
Department of Computer Science and Electronics,
Faculty of Mathematics and Natural Sciences,
Universitas Gadjah Mada, Yogyakarta, Indonesia.
e-mail : [email protected]
Proceedings of ”The 6th SEAMS-UGM Conference 2011”
pp. 79 - 88

QUADRATIC OPTIMAL REGULATOR PROBLEM OF


DYNAMIC GAME FOR DESCRIPTOR SYSTEM

SALMAH

Abstract. In this paper the noncooperative linear quadratic game problem is
considered. We present necessary and sufficient conditions for the existence of optimal
strategies for linear quadratic continuous non-zero-sum two-player dynamic games for
index-one descriptor systems. The connection of the game solution with the solution of
coupled Riccati equations is studied. In the noncooperative game with open-loop structure,
we study the Nash solution of the game. If the second player is allowed to select his strategy
first, he is called the leader of the game, and the first player, who selects his strategy second,
is called the follower. A Stackelberg strategy is the optimal strategy for the leader under the
assumption that the follower reacts by playing optimally.
Keywords and Phrases: dynamic game, noncooperative, descriptor system

1. INTRODUCTION

Dynamic game theory brings three key ingredients to many situations in economics, ecology, and
elsewhere: optimizing behavior, the presence of multiple agents, and the consequences of decisions.
Therefore this theory has been used to study various policy problems, especially in macro-
economics. In applications one often encounters systems described by differential equations
subject to algebraic constraints. Descriptor systems give a realistic model for such systems.
In policy coordination problems, questions arise as to whether policies are coordinated and which
information the parties have. One scenario is the noncooperative open-loop game. In this
scenario, the parties cannot react to each other's policies, and the only information that the
players know is the model structure and the initial state.
In this paper we consider a linear open-loop dynamic game in which the players are subject to
a linear descriptor system and minimize quadratic objective functions. For the finite-horizon
problem, the solution of a generalized Riccati differential equation is studied. If the
planning horizon is extended to infinity, the differential Riccati equation becomes an
algebraic Riccati equation.


2. PRELIMINARIES

The players are assumed to minimize the performance criteria
$$J_i(u_1, u_2, \ldots, u_n) = \frac{1}{2}x(T)^{T}E^{T}K_{iT}Ex(T) + \frac{1}{2}\int_{0}^{T}\Big(x(t)^{T}Q_i x(t) + \sum_{j=1}^{n} u_j(t)^{T}R_{ij}u_j(t)\Big)dt, \tag{2.1}$$
with all matrices symmetric; furthermore, the state weighting matrices are semi-positive definite
and the control weighting matrices are positive definite. The players apply their control vectors
to the system
$$E\dot{x}(t) = Ax(t) + \sum_{i=1}^{n} B_i u_i(t), \qquad x(0) = x_0, \tag{2.2}$$
where x(t) is the descriptor state vector of dimension n, and ui(t), i = 1, …, n, are the control
vectors chosen by the i-th player. The matrix E is in general singular.
Below is the definition of a Nash equilibrium strategy.

Definition 2.1. The pair (u₁*, u₂*) is called a Nash equilibrium strategy if
$$J_1(u_1^*, u_2^*) \le J_1(u_1, u_2^*) \quad\text{and}\quad J_2(u_1^*, u_2^*) \le J_2(u_1^*, u_2)$$
for all admissible strategies u₁, u₂.

If the second player is allowed to select his strategy first, he is called the leader of the
game, and the first player, who selects his strategy second, is called the follower. A
Stackelberg strategy is the optimal strategy for the leader under the assumption that the
follower reacts by playing optimally.
The assumption which is needed is given below.

Assumption 2.1: The descriptor system (2.2) is regular, impulse controllable and finite
dynamics stabilizable, i.e., it satisfies
(i). |sE − A| ≠ 0, except for a finite number of values of s,
(ii). ( ) ( ) ,
(iii). ( | ) [ ]

3. NASH EQUILIBRIUM OF DESCRIPTOR GAME

To derive the necessary conditions for the optimal Nash solution we need the Hamiltonian
functions as follows.
( ) ( ) ( ),
( ) ( ) ( ),
With the Lagrange multiplier method as in [11], we get that the necessary conditions for the
objective functions to be optimal in the Nash sense are
, , ̇ i=1,2. (3.1)
Substituting these equations into (2.2) yields
̇( ) ( ) ( ) (3.2)
and
with i=1,2. (3.3)
The boundary conditions are
( ) and ( ) ( ). (3.4)
From the necessary conditions for the optimal Nash solution we get that the optimal strategies
for the 2-player dynamic game satisfy (3.2) and the boundary conditions (3.4), i = 1, 2.
We can write in matrix form and get
̇( ) ( )
( ) ( ̇ ( )) ( ) ( ( )) , (3.5)
̇ ( ) ( )
with boundary conditions (3.4). System (3.5) can be written in descriptor form with a matrix
pair (Ẽ, Ã). If (Ẽ, Ã) is regular, system (3.5) will have a solution (see [11]). We need the
following assumption for equation (3.5).

Assumption 3.1: Descriptor system (3.5) is regular and impulse free i.e
̃ ̃( ̃) .

If Assumption 3.1 is satisfied then system (3.5) will be regular and impulse controllable.
For the 2-player linear quadratic dynamic game, define 2 generalized differential Riccati
equations as follows:
̇
̇
with
with boundary condition
( ) (3.6)
The following theorem concerns the relationship between the existence of the dynamic
game solution and the generalized Riccati differential equations (3.6).

Theorem 3.1: The two-player linear quadratic dynamic game (2.1), (2.2) has, for
every consistent initial state, a Nash equilibrium if the set of differential Riccati equations
(3.6) has a set of solutions on [0, T].
Moreover, the optimal feedback Nash equilibrium is given by
( ) ( ) ( ), ( ) ( ) ( ),
where x(t) is a solution of the closed loop system
̇( ) ( ( ) ( )) ( ) ( )

PROOF: Let the players choose the strategies


( ) ( ) ( ), ( ) ( ) ( ),
to control the system (3.1), (3.2), with K₁(t), K₂(t) solutions of (3.6).
Define ( ) ( ) ( ), and ( ) ( ) ( ), we get
E T 1 (t )  E T K 1 (t ) x(t )  E T K1 (t ) x (t ) ,
E T  (t )  E T K (t ) x(t )  E T K (t ) x(t )
2 2 2 .
From (2.2) we get
Ex  Ax(t )  B1 R111 B1T K1 (t ) x(t )  B2 R22


1 T
B2 K 2 (t ) x(t ) ,
Therefore we get
E T 1 (t )   AT K1 x  L1 Ax  Q1 x  L1 B1 R111 B1T K1 x  L1 B2 R22
1 T
B2 K 2 x  E T K1 (t ) x
  AT K1 x  L1 Ax  Q1 x  L1 B1 R111 B1T K1 x  L1 B2 R22 1 T
B2 K 2 x  L1 Ex
  AT K1 x  L1 Ax  Q1 x  L1 B1 R111 B1T K1 x  L1 B2 R22
1 T
B2 K 2 x
 L1 Ax  L1B1R11 B1 K1 x  L1B2 R22 B2 K2 x
1 T 1 T

  AT K1 x(t )  Q1 x(t ) ,
and with same reason we get
ET 2 (t )   AT K2 x(t )  Q2 x(t ) .
These two equations have solutions.

For the 2-player infinite-time linear quadratic dynamic game, the players are subject to
system (2.2). The objective functions to be minimized are of the form
$$J_i(u_1, u_2) = \frac{1}{2}\int_{0}^{\infty}\Big(x^{T}(t)Q_i x(t) + \sum_{j=1}^{2} u_j^{T}(t)R_{ij}u_j(t)\Big)dt, \qquad i = 1, 2, \tag{3.7}$$
with all matrices symmetric; furthermore, the state weighting matrices are semi-positive definite
and the control weighting matrices are positive definite.
The generalized algebraic Riccati equations for the 2-player infinite-time problem that are
related to the Nash equilibrium are

(3.8)
with .
It can be proved that Theorem 3.1 also holds for the optimal control of the infinite-time
problem; therefore the optimal Nash strategies have the same feedback form, with constant
matrices that solve (3.8).
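For numerical experimentation it may help to see the ordinary state-space special case E = I, for which the coupled algebraic Riccati equations associated with the open-loop Nash equilibrium take the standard form 0 = AᵀK₁ + K₁A + Q₁ − K₁S₁K₁ − K₁S₂K₂ (and analogously for K₂), with Sᵢ = BᵢRᵢᵢ⁻¹Bᵢᵀ; see [3, 4]. The sketch below (our own, with illustrative matrices) solves this standard case with a generic root finder; it does not reproduce the descriptor-specific structure of (3.8).

```python
import numpy as np
from scipy.optimize import fsolve

# Illustrative data for the ordinary (E = I) two-player case; not the example of Section 5.
A  = np.array([[0.0, 1.0], [-1.0, -0.5]])
B1 = np.array([[0.0], [1.0]])
B2 = np.array([[1.0], [0.0]])
Q1 = np.eye(2); Q2 = np.diag([0.5, 1.0])
R11 = np.array([[1.0]]); R22 = np.array([[2.0]])
S1 = B1 @ np.linalg.inv(R11) @ B1.T
S2 = B2 @ np.linalg.inv(R22) @ B2.T

def residual(z):
    """Stacked residuals of the coupled open-loop Nash AREs (standard E = I form)."""
    K1 = z[:4].reshape(2, 2)
    K2 = z[4:].reshape(2, 2)
    r1 = A.T @ K1 + K1 @ A + Q1 - K1 @ S1 @ K1 - K1 @ S2 @ K2
    r2 = A.T @ K2 + K2 @ A + Q2 - K2 @ S2 @ K2 - K2 @ S1 @ K1
    return np.concatenate([r1.ravel(), r2.ravel()])

z0 = np.concatenate([np.eye(2).ravel(), np.eye(2).ravel()])   # crude initial guess
K1, K2 = np.split(fsolve(residual, z0), 2)
K1, K2 = K1.reshape(2, 2), K2.reshape(2, 2)
print(K1); print(K2)
print(np.max(np.abs(residual(np.concatenate([K1.ravel(), K2.ravel()])))))  # residual check
```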

4. STACKELBERG EQUILIBRIUM OF DESCRIPTOR GAME

To derive the necessary conditions for the optimal Stackelberg solution we need the Hamiltonian
function for the follower:
( ) ( ) ( ),
With the Lagrange multiplier method as in [11], we get that the necessary conditions for the
follower's objective function to be optimal are
, , ̇ i=1,2. (4.1)
From the first equation of (4.1) we get
̇ ( ) ( ) ( ). (4.2)
From the second equation of (4.1) we get that the optimal control for the follower is
. (4.3)
The boundary conditions are
( ) and ( ) ( ). (4.4)
For the second player, as the leader, define the Hamiltonian
( ) ( ) ( ). (4.5)
Taking the derivative of (4.4) with respect to ( ), we get
. (4.6)
Taking the derivative of (4.4) with respect to ( ), we get
̇ . (4.7)
With the second player as the leader, let the Hamiltonian be
( ) ( ) ( )
( ),
or
( ) ( ) ( )
( ) ( ). (4.8)
Taking the derivative of (4.8) with respect to x, we get
̇ ( ) ( ) ( ) ( ). (4.9)
Taking the derivative of (4.8) with respect to , we get
( ). (4.10)
Taking the derivative of (4.8) with respect to , we get
. (4.11)
Taking the derivative of (4.8) with respect to , we get
̇ . (4.12)
Substituting (4.9) into (4.12), we get
̇ . (4.13)
Substituting (4.3) into (4.13), we get
̇ . (4.14)

From the necessary conditions for the optimal Stackelberg solution we get that the optimal
strategies for the 2-player dynamic game satisfy (4.2), (4.8) and (4.13) and the
boundary conditions (4.4), i = 1, 2.
We can write in matrix form and get
E 0 0 0  x   A  S1  S2 0  x 
     
0 ET 0 0     0 A  S21 S1   

0 0 E T
0  1    Q1 0  AT 0   1 
     
0 E T  2    Q2  AT   2 
 0 0 Q1 0
(4.15)
with boundary conditions (4.4). System (4.15) can be written in descriptor form with a matrix
pair (Ẽ, Ã). If (Ẽ, Ã) is regular, system (4.15) will have a solution. We need the following
assumption for equation (4.15).

Assumption 4.1: Descriptor system (4.15) is regular and impulse free i.e
̃ ̃( ̃) .

If Assumption 4.1 is satisfied then system (4.15) will be regular and impulse
controllable.
For the 2-player Stackelberg linear quadratic dynamic game, define the generalized differential
Riccati equations as follows:
̇
̇
̇
with .
with boundary condition
( ) , ( ) . (4.16)
The following theorem concerns the relationship between the existence of the dynamic
game solution and the generalized Riccati differential equations (4.16).

Theorem 4.1: The two-player linear quadratic dynamic game (2.1), (2.2) has, for
every consistent initial state, a Stackelberg equilibrium if the set of differential Riccati
equations (4.16) has a set of solutions on [0, T].
Moreover, the optimal feedback Stackelberg equilibrium is given by
( ) ( ) ( ), ( ) ( ) ( )
where x(t) is a solution of the closed-loop system
̇( ) ( ( ) ( )) ( ) ( ) .

PROOF: Let , and . We get ̇ ̇ ̇ . Substitute this to


(4.2) and because we get the first equation of (4.16). Because ̇ ̇ ̇
substitute to (4.9) and because we get the second equation of (4.16). Because
̇ ̇ ̇ , substitute to (4.14) we get the third equation of (4.16).

The generalized algebraic Riccati equations for the 2-player infinite-time problem that are
related to the Stackelberg equilibrium are

With . (4.17)

5. NUMERICAL EXAMPLE

We will give a numerical example in which the optimal Nash solution of the game is found by
solving the ARE. Consider the system
 1 0  x1   0 1  x1   0  1
         u1   u 2 ,
 0 0  x 2   1 0  x2   1   0 (4.18)
x1 (0)  x10 , x2 (0)  x20 .
For the cost function, given
 1 0  0 1
Q1   , Q2   , R1  1, R2  2, R21  1.
0 1  1 1
We will find the solution of the generalized algebraic Riccati equations (3.8) to get the optimal
Nash equilibrium of the game.
Because , we have
K1 (1,1)  L1 (1,1) , L1 (2,1)  0 , K1 (1,2)  0 .


Because we have
K 2 (1,1)  L2 (1,1) , L2 (2,1)  0 , K 2 (1,2)  0 .
Substitute the result to first algebraic Riccati equation (3.8) will give the following equations.
1
K1 (2,1)  L1 (1,2)  1  L1 (1,2) K1 (2,1)  L1 (1,1) K 2 (1,1)  0,
2 (4.19)
K1 (2,2)  L1 (1,1)  L1 (1,2) K1 (2,2)  0, (4.20)
K1 (1,1)  L1 (2,2)  L1 (2,2) K1 (2,1)  0, (4.21)
1  L1 (2,2) K1 (2,2)  0 . (4.22)
Take ( ) , we get ( ) and based on (4.21), we get

K1 (2,1)  aK1 (1,1)  1 . (4.23)


Based on (4.22) we get
1
L1 (1,2)  1  K1 (1,1)
a . (4.24)
Substitute (4.23), (4.24) to (4.19) give
1 3
K1 (1,1)  K12 (1,1)  2  0
2 2 . (4.25)
Because of the second algebraic Riccati equation we get
1
K 2 (2,1)  L2 (1,2)  L2 (1,2) K1 (2,1)  L2 (1,1) K 2 (1,1)  0,
2 (4.26)
K 2 (2,2)  L2 (1,1)  L2 (1,2) K1 (2,2)  1  0, (4.27)
K 2 (1,1)  L2 (2,2)  L2 (2,2) K1 (2,1)  1  0, (4.28)
1  L2 (2,2) K1 (2,2)  0. (4.29)
Based on (4.29) and because of ( ) , give ( ) . Based on (4.27) and
(4.28) we get ( ) ( ) ( ) and ( ) ( ) . Let L2(1,2)=b we
get ( ) ( )
Because ( ) ( ) ( ) , ( ) ( ) and based on (4.27)
we get
1 2
K 2 (2,1)  abK1 (1,1)  K 2 (1,1)
2 . (4.30)
Based on (4.29), (4.26) and (4.27) we get

K 2 (2,1)  abK1 (1,1) 


1
 1  K1 (1,1)2
2 . (4.31)
Therefore we have
 K1 (1,1) 0
K1N   ,
 aK1 (1,1)  1 a  (4.32)
Let K2(2,1)=c we can write
  1  K1 (1,1) 0 
K 2 N   ,
 c ab  K1 (1,1) 
(4.33)
where ( ) and ( ) can be found from (4.24) and (4.30). Solution for ( ),
( )are ( ) . Take ( ) we get ( ) .
Optimal Nash control gain for the players is given by
 4   4 
 0  1  0 
K1N  3  K2 N   3 
 a 1 a
4  ab  1 ab  4 
4
   
3  3 18 3  .(4.34)
Now we will find the Stackelberg equilibrium of the game. Because we get
. From the first Riccati equation of (4.17) we get (4.19)-(4.22). From the second
Riccati equation of (4.17) we get
1
K2 (2,1)  L2 (1,2)  P(1,1)  L2 (1,2) K1 (2,1)  L2 (1,1) K2 (1,1)  0,
2 (4.35)
K2 (2,2)  L2 (1,1)  L2 (1,2) K1 (2,2)  1  0, (4.36)
K 2 (1,1)  L2 (2,2)  L2 (2,2) K1 (2,1)  1  0, (4.37)
P(2,2)  1  L2 (2,2) K1 (2,2)  0. (4.38)
From the third Riccati equation (4.17) we get
1
 P(1,1) K1 (1,1)  K1 (1,1)  K1 (2,1)  K 2 (1,1) K2 (2,1)  0,
2 (4.39)
P(1,1)  P(2,2)  K1 (2,2)  K2 (2,2)  0, (4.40)
P(2,2)  P(1,1)  P(2,2) K1 (2,1)  K1 (1,1)  K1 (2,1)  K2 (1,1) K2 (2,1)  0, (4.41)
K1 (2,2)( P(2,2)  2)  K2 (2,2)  0. (4.42)
Take ( ) , we get from (4.22) ( ) . From (4.21) we get
( ) ( ) .
From (4.20) we get
( ) ( ).
Take ( ) From (4.38) and because ( ) we get
( ) .
From (4.42) we get
( ) ( ).
Take ( ) , from (4.36) we get
( ) ( ) .
From (4.42) we get


( )( ) ( ) .
From (4.35) we get
( ) ( ) ( ( )) ( ). (4.43)
From (4.40) we get
( ) ( ) .
From (4.39) and (4.41) we get
( ) .
From (4.39) and (4.43) we get
( )( ( ) ( ( )) ( ))

( ) ( ) ( ) ( ). (4.44)
Therefore we can find ( ) from (4.44). Let ( ) . From (4.39) we can get
( ). Let ( ) .
Then we get that the optimal Stackelberg strategy for the game is given by

( ), ( ).
( ) ( )

6. CONCLUDING REMARK

This paper considers the 2-player non-zero-sum linear quadratic dynamic game with descriptor
systems for the finite-horizon and infinite-horizon cases. Necessary conditions for the existence
of a Nash equilibrium and a Stackelberg equilibrium have been derived with the Hamiltonian method.
The paper also considers the coupled Riccati-type differential equations for the finite-horizon
case and the algebraic Riccati equations for the infinite-horizon case related to the Nash
equilibrium and the Stackelberg equilibrium.

References

[1] BASAR, T., AND OLSDER, G.J., Dynamic Noncooperative Game Theory, second Edition,
Academic Press, London, San Diego, 1995.
[2] DAI, L., Singular Control Systems, Springer Verlag, Berlin, 1989.
[3] ENGWERDA, J., On the Open-loop Nash Equilibrium in LQ-games, Journal of Economic
Dynamics and Control, Vol. 22, 729-762, 1998.
[4] ENGWERDA, J.C., LQ Dynamic Optimization and Differential Games, Chichester: John
Wiley & Sons, 229-260, 2005.
[5] KATAYAMA, T., AND MINAMINO, K., Linear Quadratic Regulator and Spectral
Factorization for Continuous Time Descriptor Systems, Proceedings of the 31st
Conference on Decision and Control, Tucson, Arizona, 967-972, 1992.
[6] LEWIS, F.L., A survey of Linear Singular Systems, Circuits System Signal Process,
vol.5, no.1, 3-36, 1986.
[7] MINAMINO, K., The Linear Quadratic Optimal Regulator and Spectral Factorization for
Descriptor Systems, Master Thesis, Department of Applied Mathematics and Physics,
Faculty of Engineering, Kyoto University, 1992.
[8] MUKAIDANI, H. AND XU, H., Nash Strategies for Large Scale Interconnected Systems,
43rd IEEE Conference on Decision and Control Bahamas, 2004, pp 4862-4867.
[9] SALMAH, BAMBANG, S., NABABAN, S.M., AND WAHYUNI, S., Non-Zero-Sum Linear
Quadratic Dynamic Game with Descriptor Systems, Proceeding Asian Control
Conference, Singapore, pp 1602-1607, 2002.
[10] SALMAH, N-Player Linear Quadratic Dynamic Game for Descriptor System,
Proceeding of International Conference Mathematics and its Applications SEAMS-
GMU, Gadjah Mada University, Yogyakarta, Indonesia, 2007.
[11] XU, H., AND MIZUKAMI, K., Linear Quadratic Zero-sum Differential Games for
Generalized State Space Systems, IEEE Transactions on Automatic Control, Vol. 39,
No. 1, 143-147, 1994.

SALMAH
Department of Mathematics Gadjah Mada University
e-mail: [email protected]
Proceedings of ”The 6th SEAMS-UGM Conference 2011”
pp. 89–120

CHAOTIC DYNAMICS AND BIFURCATIONS IN


IMPACT SYSTEMS

SERGEY KRYZHEVICH

Abstract. Bifurcations of dynamical systems described by several second order differential
equations together with an impact condition are studied. It is shown that the variation of
parameters under which the number of impacts of a periodic solution increases leads to the
occurrence of a hyperbolic chaotic invariant set.

Keywords and Phrases : impacts, grazing, chaos, hyperbolicity, homoclinic points

INTRODUCTION

Vibro-impact systems appear in various mechanical problems (modeling of impact
dampers, clock mechanisms, immersion of constructions, etc.). All impact systems are
strongly nonlinear. Their properties resemble those of classical nonlinear systems.
In particular, chaotic dynamics is possible (Akhmet, 2009 − Chin, Ott, Nusse & Grebogi,
1995; Fredriksson & Nordmark, 1997 – Holmes, 1982; Kryzhevich & Pliss, 2005 –
Nordmark, 1991; Thomson & Ghaffari, 1983; Whiston, 1987).
There is a large number of publications devoted to bifurcations specific to vibro-
impact systems. One of them, the so-called grazing bifurcation, first described by Nordmark
(1991), corresponds to the case when there is a family of periodic solutions which have a finite
number of impacts over the period, and this number increases or decreases as the parameter
changes. For the bifurcation value of the parameter, the periodic solution has an impact with
zero normal velocity. It was shown that this bifurcation implies non-smooth behavior of
solutions, instability of the periodic solution in the parametric neighborhood of grazing and,
under additional assumptions, chaotic dynamics (see Budd, 1995; Chin, Ott, Nusse &
Grebogi, 1995; Fredriksson & Nordmark, 1997 – Ivanov, 1996; Nordmark, 1991; Whiston,
1987 and the references therein).
The approaches to finding chaos in impact systems are very diverse. For example,
topological Li-Yorke chaos was studied by Akhmet (2009) and di Bernardo, Budd,
Champneys & Kowalczyk (2007). Stochastic chaos (existence of SRB measures) was
considered by Bunimovich, Pesin, Sinai & Jacobson (1985); see also Chernov & Markarian (2001).
Devaney's chaos was studied for single degree of freedom systems in the author's
work (Kryzhevich, 2008). In this paper we generalize this result to systems with several degrees
of freedom. The main result of this paper is a method which allows one to find homoclinic
points corresponding to grazing. The main idea of the proof is the nonsmoothness of Perron
surfaces in the neighborhood of the periodic solution. If these manifolds bend in a "good" way
(the corresponding sufficient conditions can be written down explicitly), they can intersect.
This implies chaos. We study the motion of a point mass, described by a system of second order
differential equations of general form and impact conditions of Newtonian type.
Unlike the author's similar paper (Kryzhevich, 2008), here we study systems with
several degrees of freedom. For these systems a near-grazing periodic point is not
automatically hyperbolic, so we need to provide additional conditions to have a classical
Smale horseshoe.
In order to avoid technical troubles we assume that the delimiter is planar, immobile
and slippery. However, there are no obstacles to applying the proposed method to systems with a
mobile delimiter (see, for example, Holmes, 1982), to ones with a non-Newtonian model of
impacts (Babitsky, 1998; Fredriksson & Nordmark, 1997; Ivanov, 1996; Kozlov & Treshev,
1991), and even to some special cases of strongly nonlinear dynamical systems without any
impact conditions.
The paper is organized as follows. In Section 1 we consider the mathematical
model of vibro-impact systems. In Section 2 we define the grazing family of periodic
solutions. In Section 3 the near-grazing behavior of solutions is described. In Section 4 the
main result of the paper is presented. In Sections 5 and 6 the near-impact behavior of
solutions is studied and estimates of Lyapunov exponents are given. In Sections 7 and 8 an
analogue of the Smale horseshoe is constructed and an analogue of the Smale-Birkhoff
theorem is proved. An example illustrating the main result is considered in Section 9. In
Section 10 some practical applications, experiments and simulations related to the results of
the paper are discussed. The results mentioned in Section 10 are not original; we need them
to provide an experimental justification of Theorem 1, which is the main result of the current
paper.

1. MATHEMATICAL MODEL

Consider a segment and a smooth function .

Suppose that . Here the period is a smooth

function of the parameter , and . We may suppose without loss of generality that

does not depend on , making, if necessary, the transformation .


Denote

Consider the system of first order ordinary differential equations


(1.1)

Let be a − smooth function. Let us note by the

column vector, consisting of elements . Suppose that the system (1.1) is defined
for and the following Newtonian impact condition takes place as soon as the first

component of the solution vanishes.

Condition 1. If then ,

.
Consider a vibro-impact system
(1.2)

We say that a function is a

solution of the vibro-impact system (1.2) on an interval , if this interval can be

represented as a disjoint union with the following properties.


1) The set , corresponding to free flight motions, is

open.
2) The set , corresponding to

sliding motions, is closed.


3) All the entries of the vector , except, maybe , are continuous over the

interval , the discontinuity set of the function is .

4) The function vanishes at the points of .



5) The function is a solution of System (1.1) on every subinterval of the set .

6) The set is at most countable. All the limit points of this set belong to .

7) For every Condition 1 is satisfied.

8) The function is a solution of the system

on every subsegment of the set .
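To make the free-flight/impact structure above concrete, the following simulation sketch (our own illustration; the forcing, damping and restitution coefficient are made-up values, not taken from this paper) integrates a single degree of freedom between impacts and, whenever the coordinate reaches the delimiter x = 0, applies a Newtonian impact law that reverses the normal velocity and multiplies it by a restitution coefficient.

```python
import numpy as np
from scipy.integrate import solve_ivp

R = 0.8          # restitution coefficient of the Newtonian impact law

def free_flight(t, y):
    x, v = y
    return [v, -0.1 * v - x + 2.0 * np.cos(1.7 * t)]   # illustrative forced, damped oscillator

def hit_delimiter(t, y):
    return y[0]                  # event: x(t) = 0
hit_delimiter.terminal = True
hit_delimiter.direction = -1     # trigger only while approaching the delimiter (x decreasing)

def simulate(y0, t_end, max_impacts=200):
    t0, y, impacts = 0.0, np.array(y0, dtype=float), []
    ts, xs = [], []
    while t0 < t_end and len(impacts) < max_impacts:
        sol = solve_ivp(free_flight, (t0, t_end), y, events=hit_delimiter,
                        max_step=0.01, rtol=1e-8, atol=1e-10)
        ts.append(sol.t); xs.append(sol.y[0])
        if sol.status == 1:                       # delimiter reached: apply v -> -R v
            t0 = sol.t_events[0][0]
            y = np.array([0.0, -R * sol.y_events[0][0][1]])
            impacts.append(t0)
        else:
            break
    return np.concatenate(ts), np.concatenate(xs), impacts

t, x, impacts = simulate([1.0, 0.0], t_end=50.0)
print(len(impacts), "impacts; first few at", [round(ti, 3) for ti in impacts[:5]])
```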

2. GRAZING FAMILY OF PERIODIC SOLUTIONS

Since the solutions of the vibro-impact systems are discontinuous at impact instants, the
classical results on integral continuity are not applicable. Nevertheless, the following two
statements hold true.

Lemma 2.1. Let be the solution of (1.2)

corresponding to and to the initial data

and . Suppose that this solution is defined


on the segment . Assume that there are exactly zeros

of the function over the segment and

. Then for any there exists a neighborhood of the point

such that for any fixed the

mapping is − smooth with respect to the variables .

These solutions have exactly impacts , over the segment

. These moments and corresponding velocities

smoothly depend on .

PROOF: Let the number be such that (assume, if necessary ,

). The solution of the vibroimpact system (1.2) is also one of (1.1) over the

segment . The impact moments and as well as the impact points



and smoothly depend on their

parameters. Similarly, the values and smoothly depend on

and , as well as and are −


smooth functions of and and so on. □

Condition 2. (Figure 1.). There exists a continuous family of − periodic solutions

of the system (1.2), satisfying the following properties.


1) For all the component has exactly zeros over

the period .

2) The velocities are such that

for all , ,

, (2.1)

, , .

Here

Figure 1. A grazing family of periodic solutions.



3) The instants and the velocities continuously depend on .


Denote

We may suppose without loss of generality that . This

may be obtained by the transformation . Define

Fix a small positive and consider the shift mapping for the system (1.2), given by

the formula . For small positive and θ the mapping

is - smooth in a neighborhood of the point . Without loss of

generality, we may assume that .

3. THE SEPARATRIX

Denote .

Lemma 3.1. There exists a neighborhood of zero such that if the parameters and θ are

small enough, the set is a surface of the dimension , which is the graph

of the smooth function (Figure 2). Moreover,

(3.1)

where is a smooth function such that .



Figure 2. The curve of initial data, corresponding to grazing and a near-grazing stretching of
a small square R in the phase space.

PROOF: Take a point . Let the moment be such that ,

. Let us show that if is close

enough to , we may take so that the function does not have

zeros on , except . Otherwise, there exists a sequence (suppose

without loss of generality, that and the sequence decreases), a sequence

and one, consisting of solutions, uniformly bounded on the segment

of the system (1.2), such that (Figure 3).

Figure 3. An illustration to the proof of Lemma 3.1.



Also, there exist time instants such that and instants

such that . Moreover, , ,

. Without loss of generality, we assume that . Then

. This contradicts to (2.1).

Then for all the function can be presented as series


(3.2)

Differentiating (3.2), we obtain that . On the other


hand,

as . Then

Since , (3.1) is true □.

Take a small parameter such that the sets

are correctly defined and nonempty.

4. THE MAIN RESULT

Consider the matrix

Let . Denote the elements of the matrix by and ones of the matrix by

. Denote the columns of matrices and by and respectively, the strings of the

matrix by . If and consider the ( matrix



, defined by formulae

, .

Similarly, if and and , we define the matrix

Assume that the at least one of the following statements is true.

Condition 3.
1) Either or the matrix does not have eigenvalues on the unit circle in ℂ.
2)

(4.1)

Condition 4.
1) Either or the matrix does not have eigenvalues on the unit circle in ℂ.
2)

(4.2)

From the geometrical point of view, the first items of Conditions 3 and
4 mean that the equilibria are saddle hyperbolic, and (4.1), (4.2) ensure that the
corresponding stable and unstable manifolds intersect. This will be shown below.

From now on we suppose that Condition 3 is satisfied. Otherwise, we consider the mapping
instead of . Then the matrix is replaced with , and condition (4.1) is

replaced with (4.2). Similar reasoning proves the statement of Theorem 1 in that case (see the right part of Figure 4), and all the proofs given below may be repeated for it.

Figure 4. Homoclinic points.

Theorem 1. If Condition 2 and either Condition 3 or Condition 4 are satisfied, there exist
values and such that for all the mapping is chaotic in the

sense of Devaney, 1987. More precisely, there exist an integer m and a compact set

invariant with respect to and such that the following conditions are satisfied.

1) There exists a neighborhood of the set such that the mapping is a

diffeomorphism. The invariant set is hyperbolic.

2) The mapping has infinitely many periodic points.

3) The periodic points of are dense in .

4) The set is transitive i.e. there exists a point such that the orbit

is dense in .

Remark. Similar results may be obtained for systems with any finite number of
grazings over the period.

5. GRAZING

Now we start to prove Theorem 1. Note that all the mappings , corresponding to the same

value of , are conjugate. Fix a small value and a solution

of the corresponding system, having an impact at the moment . Suppose that the

corresponding normal velocity is nonzero. We consider as a

small parameter. Denote . Fix a positive value and consider the


mapping
,

defined in a neighborhood of the point . Here we assume that the point

and the parameter are chosen so that there exists a neighborhood such that

any solution has exactly one impact over the

segment . Denote the corresponding moment by and the

normal velocity, defined similarly to , by . Let be the last


components of the solution at the impact instant. Take the values

so that for all . The mapping is

smooth in the neighborhood of the point , let us estimate the Jacobi matrix .
Denote

.
Similarly, we define the values . Consider the Taylor formula for values

as functions of :

(5.1)

Here all functions, denoted by the letter with different indices, are smooth with respect
to all arguments except Denote

, ,

It follows from (5.1) that

Here is the unit matrix of the corresponding size,

.
Denote

Clearly, . Then, similarly to the results of the paper Ivanov (1996), we obtain

(5.2)

(Figure 2). Here ,

Note that as .

Figure 2 shows how a small neighborhood of the form

is stretched under the action of the mapping , defined by the formula

. Here .

6. LYAPUNOV EXPONENTS

We check that the equilibrium point of the mapping is hyperbolic and estimate the

largest and the smallest absolute values of the eigenvalues of the matrix and

ones of small perturbations of this matrix. The mapping can be represented as the

composition where . The


Jacobi matrix

tends to as .

The matrix

is of the form (5.2), where . Then



Since , if the inequalities (4.1) take

place and if is small enough, one of the eigenvalues of the matrix is

. The corresponding eigenvector equals .

The eigenvalue is of multiplicity 1; the linear space corresponding to the other eigenvalues

tends, as , to the hyperplane given by the condition . Since , the

vector lies outside . The matrix satisfies the following asymptotic estimate

It follows from the form of this matrix, that the matrix has the eigenvalue

. The corresponding eigenvector satisfies

the asymptotic estimate .

If n=1, it is clear that the matrix as well as matrices corresponding to

points of a small neighborhood of are hyperbolic. Otherwise we need the following


lemma.

Lemma 6.1. If Condition 3 is satisfied, there exist positive constants and such that

for any matrix with and any , the matrix does
not have eigenvalues on the unit circle in ℂ.

PROOF: Suppose the statement of Lemma 6.1 is not true. Then there exist a sequence of
matrices and sequences such that all matrices have

eigenvalues of the form . Denote the corresponding eigenvectors by

. Without loss of generality we may assume that for all

and that

as .

Denote the elements of the matrices and by and respectively ( ,

). Let , ,
.

Note that since , and other values are uniformly bounded,

the sequence tends to as and

.
From the definition of eigenvalues and eigenvectors we obtain

(6.1)

It follows from the first equation of (6.1) that

as . Consequently, . Substituting this expression to equations (6.1),

corresponding to , we have

Proceeding to the limit as we obtain

Clearly, at least one of values ( ) is nonzero. Then is an eigenvalue of

the matrix . This contradicts our assumptions. □



7. HOMOCLINIC POINT

The mapping is differentiable at the points of the set . Due to Lemma 6.1 the

eigenvalues of Jacobi matrices are out of the unit circle, provided is small. Then, due to
the Perron theorem, in a small neighborhood of the point there exist the local stable

manifold and the unstable one of the mapping . Both of them are smooth

surfaces. Let , , . Select

the orthonormal bases and in the spaces and , respectively,

so that and . Extend the stable and unstable manifolds up to the

invariant sets and of the mapping . The obtained sets consist, generally
speaking, of a countable number of connected components. Each of these components is
a partially smooth manifold.

Lemma 7.1. The manifolds and intersect transversally at a point (Figure


4).

PROOF: If condition (4.1) is satisfied then, for small values of the parameter , the

manifold intersects the surface transversally. Denote the manifold

obtained in the intersection by . The dimension of equals . Denote

. The neighborhood of the point may be chosen so that both the

manifolds and intersect with and the intersection consists of

two connected components. Denote one, which does not contain the point , by . The

neighborhood may be chosen so that as . Let

. For any the tangent space is the linear hull of

unit orthogonal vectors , which can be chosen so that

for any .

The surface is not smooth in the neighborhood of the manifold . For the points

the tangent space is the linear hull of unit vectors



, where

for all and

It follows from (4.1), that for any the set is linearly

independent. The vectors and lie in different half-spaces, separated by the hyperplane

. Consequently, for all

, and the vectors and lie in


different half-spaces, separated by the linear hull of vectors
. This means that for small values of the

parameter and the neighborhood may be chosen so that the surfaces and
intersect transversally. This proves the lemma for the considered case (see the left part of
Figure 4). □

The Smale–Birkhoff theorem (Smale, 1965) on the existence of a chaotic invariant set
in a neighborhood of a homoclinic point is not applicable in the considered case since the
mapping is discontinuous. However, similar techniques will help us to find a chaotic

set of the mapping .

8. SYMBOLICAL DYNAMICS

Consider the new smooth coordinates in the neighborhood of the point , such that the
following conditions are satisfied.
1) The point corresponds to .

2) The Euclidean norm of any column of the matrix equals 1.

3) The vector can be represented in the form

so that in a small neighborhood of the



stable and the unstable manifolds are given by the conditions and
respectively.
4) The direction of the tangent line to the axis corresponding to the coordinate , taken at

the point , coincides with that of the vector , and the one corresponding to the coordinate

coincides with the vector .

Suppose . Consider the neighborhood of the point ,

defined by the conditions , (Figure 5). For any define

. Denote the parts of the boundary of the set , corresponding to

and by and respectively. Here the positive values and

are chosen so that and there exist natural numbers and such that

(Figure 6). We may take and and, respectively, eigenvalues

so big in the neighborhood that for any .

Figure 5. Domain .

Due to Lemma 7.1 the set contains at least two connected

components. One of them (let us denote it by ) contains the point . Another one,

denoted by , contains the point . Let ,


(Figure 5). Note that m is arbitrarily

big provided μ, and are small.



Figure 6. Smale horseshoe for System (1.2).

Let us show that the set is chaotic. Evidently, the set

is invariant with respect to the mapping , compact and nonempty, since it contains

the point . Moreover, the set intersects neither the inverse images

( ) of the hyperplane nor those of the surface . Consequently,

for any integer there is a neighborhood of the set such that the mapping

is smooth.

Lemma 8.1. For any k∈ ℕ, any set such that for any j=0,...,

k, the set is not empty.

PROOF: Consider an arbitrary arc , joining the parts and of the boundary of the set

, defined as the graph of a smooth function , such that


.

Let us call such arcs admissible. Similarly to the well-known Palis lemma ( -lemma), Palis &

di Melo, 1982, chapter 2, Lemma 7.1, one may show that for small values of , and

there exists an embedding of the curve to the manifold , arbitrarily -close to

the identity one. In particular, this means that the set contains two admissible arcs

and . Fix the index . It follows from what has been proved that the inverse image of

any admissible arc contains an admissible arc . Applying the same procedure

to the curve , we obtain an arc . Finally, we shall get the

curve
.

Then the statement of the lemma follows from the inclusion □

For any point there is a unique sequence ,

such that for any . Due to Lemma 8.1 for any sequence one may

find the corresponding point . Since the diffeomorphism is hyperbolic in a

neighborhood of the set the point is uniquely defined by the sequence .

Therefore the set has the power of the continuum. The mapping corresponds to the unit

shift of the index to the left. The presence of this symbolic dynamics proves

the statement of Theorem 1. □

9. EXAMPLE
Consider the following two-degree-of-freedom system
; . (9.1)

Assume that
(9.2)

and there exists such that


(9.3)

Define . Suppose that System (9.1) is defined for . The impacts

correspond to the zeros of the first component of a solution . Consider the system

(9.4)

Note that the second component of a solution of System (9.1) (or one

of System (9.4)) is of the form , where and are

arbitrary constants. Then System (9.1) can be reduced to the equation

. (9.5)

If Conditions (9.2) are satisfied, this equation has a unique periodic solution

Let . If the function is nonnegative and

has exactly one zero over the period . The general solution of Equation (9.5) is of the
form

Here and are arbitrary constants. All periodic solutions of System (9.4) with one impact

over the period correspond to values and such that there exists (the impact
moment) such that

(9.6)
.

Fixing the values and , consider (9.6) as a system in the variables and . Let

. Suppose

The general solution takes the form

.
The first two equations of (9.6) may be rewritten as follows:

(9.7)

The second equation of (9.7) gives us two possible cases: either or


(9.8)

If , then and the component of the periodic solution has only one
zero (of multiplicity 2) over the period. Otherwise, condition (9.8) uniquely defines the
value ϑ∈ . It follows from the third equation of (9.5) that

where

.
Substituting the obtained expression for to the first equation of (9.7), we obtain
, (9.9)

where

The value can be found from Equation (9.9) if and only if


(9.10)

This inequality is equivalent to . If (9.10) is false, the system (9.6) is

unsolvable. For the graph of the function is tangent to the axis at

exactly one point of the period. If then for every fixed value of the couple
the equation (9.9) has two zeros. Both of them correspond to branches of

periodic solutions, which depend continuously on and . These solutions have exactly one

impact over the period. The corresponding velocity tends to zero as .


Consider the system
, . (9.11)

The Cauchy matrix of (9.11) corresponding to is of the form

The trace of this matrix is and the determinant equals

This means that condition (9.3) coincides in the considered case with
(4.1). Then the conditions of Theorem 1 are satisfied. Therefore, the vibro-impact system (9.4)
has a non-hyperbolic invariant set, which contains invariant subsets of the shift mapping,
described by the symbolic dynamics.
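The grazing scenario of this example can also be explored numerically. The sketch below is not an implementation of system (9.1)–(9.4), whose coefficients are not reproduced here; it integrates a generic harmonically forced, damped linear oscillator between impacts with a stop at x = 0 and applies Newton's restitution law v → −Rv at each impact. All parameter values are illustrative assumptions; impacts with near-zero normal velocity indicate near-grazing motion.

```python
import numpy as np
from scipy.integrate import solve_ivp

# Illustrative parameters (not taken from the paper): damping, stiffness,
# forcing amplitude/frequency and the restitution coefficient at the stop x = 0.
C, K, A, OMEGA, R = 0.05, 1.0, 0.7, 1.2, 0.8

def rhs(t, y):
    # Free flight: x'' + C x' + K x = A cos(OMEGA t), valid while x > 0.
    x, v = y
    return [v, -C * v - K * x + A * np.cos(OMEGA * t)]

def hit_stop(t, y):
    return y[0]            # impact when the first component reaches zero
hit_stop.terminal = True
hit_stop.direction = -1    # only downward crossings count as impacts

def simulate(x0, v0, t_end=200.0):
    """Integrate between impacts, apply v -> -R v at each impact, and record
    the normal impact velocities (small magnitudes indicate near-grazing)."""
    t, state, impacts = 0.0, [x0, v0], []
    while t < t_end and len(impacts) < 500:
        sol = solve_ivp(rhs, (t, t_end), state, events=hit_stop,
                        max_step=0.05, rtol=1e-8, atol=1e-10)
        if sol.t_events[0].size == 0:       # no further impact before t_end
            break
        t = sol.t_events[0][0]
        x_imp, v_imp = sol.y_events[0][0]
        impacts.append((t, v_imp))
        if abs(v_imp) < 1e-10:              # grazing impact: stop the sketch here
            break
        state = [0.0, -R * v_imp]           # Newton restitution law
    return impacts

if __name__ == "__main__":
    for t_imp, v_imp in simulate(0.5, 0.0)[:10]:
        print(f"impact at t = {t_imp:7.3f}, normal velocity = {v_imp:+.4f}")
```

Varying the forcing amplitude A in such a sketch and recording the impact velocities is a simple way to observe the transition from non-impacting to impacting motion through grazing.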

10. GRAZING IN EXPERIMENTS AND APPLICATIONS

In this section we compare the main results of the current paper with experiments and simulations
made by other authors. As was noticed in the introduction, there are hundreds of papers
where a numerical and experimental approach was applied to study vibro-impact systems.
Here we do not try to give a review of all these results; we quote the papers Molenaar,
van de Water & de Weger, 2000 and Ing, Pavlovskaia, Wiercigroch & Banerjee, 2008 in
order to give a confirmation of the results of the current paper. Though both discussed models
describe single degree-of-freedom oscillations, it is impossible to avoid all oscillations in
other dimensions, so one still needs to consider a higher-dimensional model (and, consequently,
Theorem 1) to have a theoretical justification of the mentioned experimental results.
Let us start our mini-review with the paper Molenaar, van de Water & de Weger,

2000. There, an experiment with the mechanical system depicted in Figure 6, modeling
atomic force microscopy, is described.

Figure 6. A mechanical system with grazing.

The aim of the experiment is to construct bifurcation diagrams near grazing impact,
and to explore the geometric convergence of series of period-adding bifurcations. To reach
this goal, a precise control of the frequency and amplitude of the excitation is needed.
The experiment consists of a U-shaped, brass leaf spring that is excited horizontally
by means of a large electromagnetic exciter on which it is mounted. The beam has length 13
cm, width 2 cm, and is made of 0.2 mm thick material; its clamped ends are 2 cm apart. The
U shape suppresses undesired torsional motion of the beam. When the deflection of the beam
is large enough, a ceramic ball that is attached to the beam collides with a hardened steel plate
on the exciter. These materials are chosen such that the wear due to frequent impacts is
negligible, so that the distance between the stop and the equilibrium position of the beam is
constant. A problem in this experiment is the excitation of many higher harmonic modes upon
impact. To increase the damping of these, adhesive tape is glued on the inner side of the
spring and on the side that faces the exciter. The upper side of the spring is kept shiny for the
measurement of the deflection of the spring using a laser beam.
The period of non-excited oscillations of the spring is 41.41 ms. The near-impact
positions of the ceramic ball were fixed by the laser; the error does not exceed 0.3 mm. The
frequency of the exciting force (oscillations of the pendant) is assumed to be constant (around

21.35Hz), the amplitude of steady non-impacting oscillations is considered as a parameter of


the system. Let T be the period, corresponding to this fixed frequency. The parameter at the
diagrams below is the difference between this amplitude and the clearance between the
delimiter and the initial position of the ball.

Figure 7. Bifurcation diagram of the considered mechanical system.

Figure 7 shows how the grazing bifurcation, which takes place for
and transfers the periodic motion with one impact over the period to one
with two impacts, changes the phase portrait of the system. Instead of a stable
periodic solution we see a strange attractor, which persists up to at least.
Another case, corresponding to the excitation frequency equal to 21.97 Hz (Figure
8), is even more interesting. In this case there are at least two grazing bifurcations
responsible for chaotic behavior (for and for ).

Figure 8. Transition to chaos via grazing.

One of the conclusions of the quoted paper was that robust chaotic oscillations may
appear in a neighborhood of grazing. This conclusion, justified by numerical simulations and
bifurcation analysis, shows that the result of Theorem 1 is applicable to real-life systems.
A similar problem has been analyzed by the research group in non-smooth
dynamics at Aberdeen University (see Ing, Pavlovskaia, Wiercigroch, Banerjee, 2008 and
Wiercigroch, Sin, 1998). The main aim of the paper was modeling piecewise smooth
dynamical systems, particularly ones that appear in percussion drilling problems.
The experimental rig consists of a block of mild steel supported by parallel leaf
springs. These provide the primary stiffness while preventing the mass from rotation. The
secondary stiffness consists of a beam, mounted on a separate column, which prevents large
displacement in the positive vertical direction. Contact between the two is controlled via an
adjustable bolt mounted on the beam. Harmonic excitation of the system via the base is then
generated by an electromagnetic shaker. It is assumed that there is no coupling between the
oscillator and the shaker, due to the large mass ratio in favour of the shaker.

Measurements are recorded using an eddy current displacement probe to monitor the
displacement of the mass, and accelerometers to measure the base and mass acceleration. 100
Hz low-pass filtering is performed on the pre-amplified accelerometer signals. The time
history is then plotted, and the Savitzky-Golay algorithm used for polynomial smoothing of
the data. As a by-product of this operation the first derivative is available, which enables direct
plotting of the phase portrait.
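The smoothing-and-differentiation step described above can be reproduced with standard tools. The sketch below assumes a uniformly sampled displacement record and uses scipy's Savitzky-Golay filter; the sampling rate, window length and polynomial order are illustrative choices, not the values used in the cited experiment.

```python
import numpy as np
from scipy.signal import savgol_filter

FS = 1000.0                        # assumed sampling rate in Hz (illustrative)
t = np.arange(0.0, 2.0, 1.0 / FS)
displacement = np.sin(2 * np.pi * 7 * t) + 0.05 * np.random.randn(t.size)

# Polynomial smoothing of the displacement record ...
window, order = 51, 3              # odd window length, polynomial order < window
x_smooth = savgol_filter(displacement, window, order)

# ... and its first derivative as a by-product (delta = sampling period),
# which allows direct plotting of the phase portrait (x, dx/dt).
velocity = savgol_filter(displacement, window, order, deriv=1, delta=1.0 / FS)
```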
Bifurcations are monitored as a function of frequency by incrementing the frequency
a small amount, allowing transients to decay, and then using the base acceleration signal to
construct an appropriate Poincaré stroboscopic map. The system was driven from a
nonimpacting to an impacting response, and care was taken to follow each attractor for as
long as it remained stable, in order to capture bifurcation phenomena in the experimental
system which could be compared to the model.
The experimental system was designed so that it could be described by a simple
mathematical model, shown in Figure 9. The primary system is described very well by a
linear oscillator. The effects of inertia of the secondary beam, and additional damping during
the contact phase, are expected to be small in comparison to that of the main oscillator, and
are neglected. The beam is considered to provide stiffness support only.

.
Figure 9. Physical model of the oscillator.

The experimental results are shown in Figure 10: bifurcation diagrams (a) recorded
experimentally for the mass displacement under varying frequency at Hz;

damping ; ratio of stiffness ; clearance g = 1.26 mm and

excitation amplitude equal to 0.25 mm and (b) the corresponding simulation at .


Additional windows demonstrate the trajectories on the phase plane and obtained for
excitation frequencies (a) 0.847, 0.906 and 0.996 and (b) 0.85 and 0.95, respectively. In
simulation the period-1 response remains stable through the grazing bifurcation. In the
experimental system there appears to be a window of chaotic response.

Figure 10. Experimental results.

These results also confirm the main idea of Theorem 1 that chaotic oscillations
may appear in a neighborhood of grazing.

CONCLUSION
The structure of invariant manifolds has been studied for near-grazing periodic solutions of vibro-
impact systems. The presence of homoclinic points has been established for vibro-impact
systems satisfying some conditions of general type. This provides the existence of a Devaney
chaotic invariant set.
Comparing the results of the current paper with those on the existence of stochastic chaos
or Li-Yorke chaos, one may say that the chaotic invariant set obtained in this paper is always
structurally stable, i.e. it persists if the parameters of the system are slightly perturbed.
Compared with the single-degree-of-freedom case studied in
Kryzhevich, 2008, we needed new techniques, like Lemma 6.1, to estimate Lyapunov

exponents (in the single-degree-of-freedom case they are automatically nonzero). The technique for finding a
homoclinic point (Lemma 7.1) is also quite different in the case of many degrees of freedom.
However, the result on single-degree-of-freedom systems can still be generalized.

ACKNOWLEDGEMENTS

This work was supported by the UK Royal Society (joint research project with Aberdeen
University), by the Russian Federal Program "Scientific and pedagogical cadres", grant no.
2010-1.1-111-128-033 and by the Chebyshev Laboratory (Department of Mathematics and
Mechanics, Saint-Petersburg State University) under the grant 11.G34.31.0026 of the
Government of the Russian Federation.

References

[1] AKHMET, M. U., Li-Yorke chaos in systems with impacts. Journal of Mathematical
Analysis & Applications, 351(2), 804-810, 2009.
[2] BERNARDO, M., BUDD C. J., CHAMPNEYS A. R. AND KOWALCZYK P., Bifurcations and
Chaos in Piecewise-Smooth Dynamical Systems: Theory and Applications. New York,
Springer, 2007.
[3] BABITSKY, V. I. (1998) Theory of Vibro-Impact Systems and Applications, Berlin,
Germany, Springer.

[4] BANERJEE, S., YORKE, J. A. AND GREBOGI, C. (1998) Robust chaos. Physical Review
Letters, 80(14), 3049-3052.

[5] BUDD, C., Grazing in impact oscillators. In: Branner, B. and Hjorth, P. (Eds.), Real
and Complex Dynamical Systems (pp. 47-64). Kluwer Academic Publishers, 1995.

[6] BUNIMOVICH L.A., PESIN YA. G., SINAI YA. G. AND JACOBSON M. V., Ergodic theory of
smooth dynamical systems. Modern problems of mathematics. Fundamental trends, 2, 113-
231, 1985.

[7] CHERNOV N., MARKARIAN R. , Introduction to the ergodic theory of chaotic billiards.
IMCA, Lima, 2001.

[8] CHILLINGWORTH, D.R.J., Dynamics of an impact oscillator near a degenerate graze.


Nonlinearity, 23, 2723-2748, 2010.

[9] CHIN W., OTT E., NUSSE, H. E. AND GREBOGI, C., Universal behavior of impact oscillators
near grazing incidence. Physics Letters A 201(2), 197-204, 1995.

[10] DEVANEY, R. L., An Introduction to Chaotic Dynamical Systems. Redwood City, CA:
Addison-Wesley, 1987.

[11] FREDRIKSSON, M. H. & NORDMARK, A. B., Bifurcations caused by grazing incidence in


many degrees of freedom impact oscillators. Proceedings of the Royal Society, London. Ser.
A. 453, 1261-1276, 1997.

[12] GORBIKOV, S. P. AND MEN'SHENINA, A. V., Statistical description of the limiting set for
chaotic motion of the vibro-impact system. Automation & remote control, 68(10), 1794-1800,
2007.

[13] HOLMES, P. J., The dynamics of repeated impacts with a sinusoidally vibrating table.
Journal of Sound and Vibration 84(2), 173-189, 1982.

[14] ING, J., PAVLOVSKAIA E., WIERCIGROCH M, BANERJEE S., Experimental study of an
impact oscillator with a one-side elastic constraint near grazing. Physica D 239, 312-321,
2008.

[15] Ivanov, A. P., Bifurcations in impact systems. Chaos, Solitons & Fractals 7(10), 1615-
1634, 1996.

[16] KOZLOV, V. V. AND TRESCHEV, D. V., Billiards. A Genetic Introduction to the Dynamics
of Systems with Impacts. Translations of Mathematical Monographs, 89. Providence, RI:
American Mathematical Society, 1991.

[17] KRYZHEVICH, S. G. AND PLISS, V. A., Chaotic modes of oscillations of a vibro-impact


system. Journal of Applied Mathematics & Mechanics, 69(1), 15-29, 2005.

[18] KRYZHEVICH S. G., Grazing bifurcation and chaotic oscillations of single-degree-of-


freedom dynamical systems. Journal of Applied Mathematics & Mechanics, 69(4), 539-556,
2008.

[19] MOLENAAR, J., VAN DE WATER, W. AND DE WEGER, J., Grazing impact oscillations.
Physical Review E, 62(2), 2030-2041, 2000.

[20] NORDMARK, A. B., Non-periodic motion caused by grazing incidence in an impact


oscillator. Journal of Sound & Vibration, 145(2), 279-297, 1991.

[21] PALIS, J., DI MELO, W., Geometric Theory of Dynamical Systems. Springer-Verlag, 1982.

[22] PAVLOVSKAIA, E. ,WIERCIGROCH, M., Analytical drift reconstruction for visco-elastic


impact oscillators operating in periodic and chaotic regimes. Chaos, Solitons & Fractals
19(1), 151-161, 2004.

[23] SMALE, S., Diffeomorphisms with many periodic points. In: Differential and Combinatorial
Topology. Princeton University Press, 63-81, 1965.

[24] THOMPSON, J. M. T., GHAFFARI, R., Chaotic dynamics of an impact oscillator. Physical
Review A, 27(3), 1741-1743, 1983.

[25] WIERCIGROCH, M., SIN, V.W.T., Experimental study of a symmetrical piecewise base-
excited oscillator, J. Appl. Mech.65, 657-663, 1998.

[26] WHISTON, G. S., Global dynamics of a vibro-impacting linear oscillator. Journal of


Sound & Vibration, 118(3), 395-429, 1987.
Proceedings of ”The 6th SEAMS-UGM Conference 2011”
pp. 121 - 136

CONTRIBUTION OF FUZZY SYSTEMS FOR TIME SERIES


ANALYSIS

SUBANAR AND AGUS MAMAN ABADI

Abstract. A time series is a realization or sample function from a certain stochastic process. The
main goals of the analysis of time series are forecasting, modeling and characterizing. Conventional
time series models, i.e. autoregressive (AR), moving average (MA), and hybrid AR and MA (ARMA)
models, assume that the time series is stationary. Other methods to model time series are soft
computing techniques, which include fuzzy systems, neural networks, genetic algorithms and hybrids.
These techniques have been used to model the complexity of relationships in nonlinear time series
because they are universal approximators, capable of approximating any real continuous function on a
compact set to any degree of accuracy. As universal approximators, fuzzy systems have the capability
to model nonstationary time series. Not all kinds of series data can be analyzed by conventional time
series methods. Song & Chissom [19] introduced fuzzy time series as a dynamic process with
linguistic values as its observations. Techniques to model fuzzy time series data are based on fuzzy
systems. In this paper, we apply a fuzzy model to forecast the interest rate of Bank Indonesia
certificates; it gives better prediction accuracy than other fuzzy time series methods and conventional
statistical methods (AR and ECM).
Keywords and Phrases : soft computing, fuzzy systems, time series, fuzzy time series, fuzzy relation

1 INTRODUCTION

A time series is a realization or sample function from a certain stochastic process. To
understand systems based on time series, some researchers adopt time series analysis
methods. Those methods are based on many assumptions. Conventional statistical methods
have been used to analyze time series data, such as in modeling economic problems using
parametric methods. The main goals of the analysis of time series are forecasting, modeling
and characterizing. Conventional statistical models for time series analysis can be classified
into linear models and non-linear models. Linear models are the autoregressive (AR), moving
average (MA), and hybrid AR and MA (ARMA) models. These models assume that the time series
is stationary.
Soft computing, defined in Wikipedia, is a term applied to a field within computer
science which is characterized by the use of inexact solutions to computationally-hard tasks.


Soft computing deals with imprecision, uncertainty, partial truth, and approximation to
achieve tractability, robustness and low solution cost. Soft computing techniques include
fuzzy systems, neural networks, genetic algorithms and hybrids (Zadeh, [25]). These
techniques have been used to model the complexity of relationships in nonlinear time series
because they are universal approximators, capable of approximating any real
continuous function on a compact set to any degree of accuracy (Wang, [22]). Soft computing
techniques, as universal approximators, make no assumptions about the structure of the data.
Fuzzy systems are systems combining a fuzzifier, fuzzy rule bases, a fuzzy inference
engine and a defuzzifier (Wang, [22]). These systems have the advantages that the developed models
are characterized by linguistic interpretability and that the generated rules can be understood,
verified and extended. As universal approximators, fuzzy systems have the capability to model
nonstationary time series; data pre-processing also affects the forecast performance
(Zhang, et al., [26]; Zhang & Qi, [27]). Studies on data pre-processing using soft computing
methods have been done. Popoola [16] analyzed the effect of data pre-processing on the
forecast performance of subtractive clustering fuzzy systems and developed a fuzzy model
for time series using a wavelet-based pre-processing method. Wang
[23] and Tseng, et al. [20] applied fuzzy models to analyze financial time series data.
Not all kinds of series data can be analyzed by conventional time series methods. Song
& Chissom [19] introduced fuzzy time series as a dynamic process with linguistic values as
its observations. Techniques to model fuzzy time series data are based on fuzzy systems.
Some researchers have developed fuzzy time series models. Hwang et al. [12] used data
variations for modeling, and Huarng [11] constructed a fuzzy time series model by determining
effective interval lengths. Sah and Degtiarev [18] and Chen and Hsu [8] established
first-order fuzzy time series models. Lee et al. [14] and Jilani et al. [13] developed high-order
fuzzy time series models. Abadi et al. ([1], [2], [3], [4]) developed a fuzzy model for fuzzy time series data that
optimizes the fuzzy relations. This method was applied to forecast the interest rate of Bank
Indonesia certificates and gave better prediction accuracy than other fuzzy time series
methods and conventional statistical methods (AR and ECM).
The rest of this paper is organized as follows. In section 2, we briefly review the
conventional time series model. In section 3, we introduce fuzzy systems and its properties. In
section 4, construction of fuzzy model for time series data using table lookup scheme
(Wang’s method) is introduced. Optimization of fuzzy model for time series data is discussed
in section 5. We also give example of application of fuzzy systems for forecasting interest
rate of Bank Indonesia Certificate based on time series data in section 6. Finally, some
conclusions are discussed in section 7.

2. MATHEMATICAL MODEL

A time series can be expressed as $\{X_t : t = 1, 2, \ldots, N\}$, where $t$ is the time index, $N$ is the total
number of observations, and $X_t$ is a function of components,
$$X_t = f(T_t, S_t, C_t, I_t),$$
where $T_t, S_t, C_t, I_t$ represent the trend, seasonal, cyclical and irregular components

respectively.
The main goals of the analysis of time series are forecasting, modeling and
characterizing. Conventional statistical models for time series analysis can be classified into
linear models and non-linear models. Linear models are autoregressive (AR), moving average
(MA), hybrid AR and MA (ARMA) models. The linear models assume that the underlying
data generation process is time invariant. The orders of simple autoregressive and moving
average models can be determined by the autocorrelation function (ACF) and partial
autocorrelation function (PACF) plots of the time series data and the models can be identified
from those functions (Makridakis, et al., [15]). Box and Jenkins [6] introduced a model
combining both AR and MA models, called the ARMA model. An ARMA model with order
(p, q) is expressed as ARMA(p, q), where p is the order of the autoregressive (AR) part and q is
the order of the moving average (MA) part. The models assume that the time series data is stationary. If the time
series data is nonstationary, then the modified model, integrated ARMA or ARIMA model, is
used to generate a model (Chatfield, [7]). If the dependence is nonlinear where variance of a
time series increases with time i.e. the time series is heteroskedastic, then the series is
modeled by autoregressive conditional heteroskedastic (ARCH) model (Engle, [9]).
Bollerslev [5] introduced the generalization of ARCH model called generalized ARCH
(GARCH) model.
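As a minimal illustration of the linear models just mentioned, the sketch below fits an AR(p) model by ordinary least squares on lagged values and produces iterated one-step forecasts. It is a generic textbook construction under the stated assumptions, not the estimation procedure used later in this paper.

```python
import numpy as np

def fit_ar(x, p):
    """Least-squares fit of x_t = c + a_1 x_{t-1} + ... + a_p x_{t-p} + e_t."""
    x = np.asarray(x, dtype=float)
    # Design matrix of lagged values plus an intercept column.
    X = np.column_stack([x[p - k - 1:len(x) - k - 1] for k in range(p)])
    X = np.column_stack([np.ones(len(X)), X])
    y = x[p:]
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef                      # [c, a_1, ..., a_p]

def forecast_ar(x, coef, steps=1):
    """One-step-ahead forecasts, iterated `steps` times."""
    p = len(coef) - 1
    hist = list(np.asarray(x, dtype=float)[-p:])
    out = []
    for _ in range(steps):
        nxt = coef[0] + sum(coef[k + 1] * hist[-k - 1] for k in range(p))
        hist.append(nxt)
        out.append(nxt)
    return out

if __name__ == "__main__":
    # Illustrative (made-up) stationary series.
    rng = np.random.default_rng(0)
    series = [0.0, 0.0]
    for _ in range(300):
        series.append(0.6 * series[-1] - 0.2 * series[-2] + rng.normal(scale=0.5))
    coef = fit_ar(series, p=2)
    print("estimated coefficients:", coef)
    print("next 3 forecasts:", forecast_ar(series, coef, steps=3))
```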

3. FUZZY SYSTEMS

In this section, we introduce some basic definitions and properties of fuzzy systems.

Definition 3.1 (Zimmermann, [28]) Let U be a universal set. A fuzzy set A in the universal set U is a
set $A = \{(x, \mu_A(x)) \mid x \in U\}$, where $\mu_A$ is a function from U to [0, 1].

Let $U_1, U_2, \ldots, U_n$ be subsets of $\mathbb{R}$. A classical relation among $U_1, U_2, \ldots, U_n$ is a
subset of $U_1 \times U_2 \times \cdots \times U_n$. The definition of a classical relation can be generalized to a fuzzy
relation in $U_1 \times U_2 \times \cdots \times U_n$ as follows.



Definition 3.2 (Wang, [22]) A fuzzy relation Q in $U_1 \times U_2 \times \cdots \times U_n$ is defined as the fuzzy set
$$Q = \{((u_1, u_2, \ldots, u_n), \mu_Q(u_1, u_2, \ldots, u_n)) \mid (u_1, u_2, \ldots, u_n) \in U_1 \times U_2 \times \cdots \times U_n\},$$
where $\mu_Q : U_1 \times U_2 \times \cdots \times U_n \to [0, 1]$.

Based on the definition of fuzzy relation, the concept of compositions of fuzzy relation can be
generated.

Definition 3.3 (Wang, [21]) Let A and B be fuzzy relations in $U \times V$ and $V \times W$,
respectively. The composition of the fuzzy relations A and B, denoted by $A \circ B$, is defined as the fuzzy
relation in $U \times W$ whose membership function is
$$\mu_{A \circ B}(x, z) = \sup_{y \in V} t[\mu_A(x, y), \mu_B(y, z)] \quad \text{for every } (x, z) \in U \times W.$$

The $l$-th fuzzy rule of a fuzzy rule base can be represented by:
$$Ru^{(l)}: \text{If } x_1 \text{ is } A_1^l \text{ and } \ldots \text{ and } x_n \text{ is } A_n^l, \text{ then } y \text{ is } B^l \qquad (3.1)$$
where $A_i^l$ and $B^l$ are fuzzy sets in $U_i \subset \mathbb{R}$ and $V \subset \mathbb{R}$, respectively, and $x = (x_1, x_2, \ldots, x_n)^T \in U$ and $y \in V$ are linguistic variables.

The fuzzy rule (3.1) can be represented by a fuzzy relation in $U \times V$ whose membership
function is defined by $\mu_{Ru^{(l)}}(x_1, x_2, \ldots, x_n, y) = \mu_{A_1^l}(x_1) \cdots \mu_{A_n^l}(x_n) \cdot \mu_{B^l}(y)$.

A fuzzy system is a system combining a fuzzifier, fuzzy rule bases, a fuzzy inference
engine and a defuzzifier. In the fuzzy inference engine, fuzzy logic principles are used to combine the
fuzzy rules in the fuzzy rule base into a mapping from a fuzzy set A in U to a fuzzy set B in V. In
applications, if the input and output of the fuzzy system are real numbers, then a fuzzifier and a
defuzzifier can be used. Suppose $U \subset \mathbb{R}^n$, A is a fuzzy set in U, and a real input $x^* \in U$ is given. A
fuzzifier is defined as a mapping that maps $x^* \in U$ to a fuzzy set A in U.
There are three kinds of fuzzifiers, i.e. the singleton, Gaussian and triangular
fuzzifiers. A defuzzifier is defined as a mapping from a fuzzy set B in $V \subset \mathbb{R}$, the output of the
fuzzy inference engine, to a real number $y^* \in V$. There are three kinds of defuzzifiers, i.e. center
of gravity, center average and maximum.

Definition 3.4 (Wang, [22]) Let A be a fuzzy set in U. A fuzzy inference engine based on
individual-rule inference with union combination, Mamdani's product implication, algebraic
product for all t-norm operators and maximum for all s-norm operators gives as output the fuzzy set
B in V whose membership function is
$$\mu_B(y) = \max_{l=1}^{K} \left[ \sup_{x \in U} \left( \mu_A(x) \prod_{i=1}^{n} \mu_{A_i^l}(x_i)\, \mu_{B^l}(y) \right) \right]. \qquad (3.2)$$

If the fuzzy set $B^l$ is normal with center $y^l$, then the fuzzy system using the Mamdani product
implication, this fuzzy inference engine, the singleton fuzzifier and the center average defuzzifier is
$$f(x) = \frac{\sum_{l=1}^{K} y^l \left( \prod_{i=1}^{n} \mu_{A_i^l}(x_i) \right)}{\sum_{l=1}^{K} \left( \prod_{i=1}^{n} \mu_{A_i^l}(x_i) \right)} \qquad (3.3)$$
with input $x \in U \subset \mathbb{R}^n$ and output $f(x) \in V \subset \mathbb{R}$.
The advantage of the fuzzy system (3.3) is that its computation is simple.
The fuzzy system (3.3) is a nonlinear mapping that maps $x \in U \subset \mathbb{R}^n$ to $f(x) \in V \subset \mathbb{R}$.

Different membership functions of $A_i^l$ and $B^l$ give different fuzzy systems. If the
membership functions of $A_i^l$ and $B^l$ are Gaussian, then the fuzzy system (3.3) becomes
$$f(x) = \frac{\sum_{l=1}^{M} y^l \prod_{i=1}^{n} a_i^l \exp\left( -\left( \frac{x_i - x_i^l}{\sigma_i^l} \right)^2 \right)}{\sum_{l=1}^{M} \prod_{i=1}^{n} a_i^l \exp\left( -\left( \frac{x_i - x_i^l}{\sigma_i^l} \right)^2 \right)} \qquad (3.4)$$
where $a_i^l \in (0, 1]$, $\sigma_i^l \in (0, \infty)$, $x_i^l, y^l \in \mathbb{R}$.

Theorem 3.5 (Wang, [22]) Let U be a compact set in $\mathbb{R}^n$. For any real continuous function
g(x) on U and for every $\epsilon > 0$, there exists a fuzzy system f(x) in the form of (3.4) such that
$$\sup_{x \in U} |f(x) - g(x)| < \epsilon.$$

Based on Theorem 3.5, a fuzzy system can be used to approximate any real
continuous function on a compact set to any degree of accuracy. In applications, not all
values of the function are known, so it is necessary to construct the fuzzy system based on sample
data. Suppose there are N input-output pairs $(x_0^l, y_0^l)$, $x_0^l \in U \subset \mathbb{R}^s$, $y_0^l \in \mathbb{R}$, $l = 1, 2, 3, \ldots, N$. If
we choose $a_i^l = 1$, $\sigma_i^l = \sigma$ and $x_i^l = x_{0i}^l$, then the fuzzy system (3.4) has the
following property.

Theorem 3.6 (Wang, [22]) For arbitrary $\epsilon > 0$, there exists $\sigma^* > 0$ such that the fuzzy system
(3.4) with $\sigma = \sigma^*$ has the property that $|f(x_0^l) - y_0^l| < \epsilon$ for all $l = 1, 2, \ldots, N$.
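Theorem 3.6 can be seen at work by placing one Gaussian rule at each sample point and evaluating (3.4) with $a_i^l = 1$ and a common small width $\sigma$. The sketch below does exactly that; the target function, the sample grid and the value of $\sigma$ are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def fuzzy_system(x, centers, y_centers, sigma):
    """Evaluate the fuzzy system (3.4) with a_i^l = 1 and a common width sigma:
    a weighted average of the rule centers y^l with Gaussian weights."""
    x = np.atleast_2d(x)                        # shape (n_points, n_inputs)
    diff = x[:, None, :] - centers[None, :, :]  # (n_points, n_rules, n_inputs)
    weights = np.exp(-np.sum((diff / sigma) ** 2, axis=2))
    return weights @ y_centers / np.sum(weights, axis=1)

# Build one rule per sample of an (assumed) target function g on [0, 3].
x0 = np.linspace(0.0, 3.0, 31).reshape(-1, 1)
y0 = np.sin(2.0 * x0[:, 0]) + 0.5 * x0[:, 0]

sigma = 0.05                                    # small width: f nearly interpolates the data
x_test = np.linspace(0.0, 3.0, 7).reshape(-1, 1)
print(fuzzy_system(x_test, x0, y0, sigma))
print(np.sin(2.0 * x_test[:, 0]) + 0.5 * x_test[:, 0])   # compare with g at the test points
```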

4. CONSTRUCTION OF FUZZY MODEL FOR TIME SERIES DATA USING


TABLE LOOKUP SCHEME (WANG’S METHOD)

In this section, the construction of a fuzzy model for time series data using the table lookup scheme
is introduced. Suppose we are given the following N training
data: $(x_{1p}(t-1), x_{2p}(t-1), \ldots, x_{mp}(t-1); x_{1p}(t))$, $p = 1, 2, 3, \ldots, N$. The construction of fuzzy relations
modeling the time series data from the training data based on the table lookup scheme proceeds as
follows (a short code sketch follows the steps):

Step 1. Define the universal sets for the main factor and the secondary factors. Let $U = [\alpha_1, \beta_1] \subset \mathbb{R}$
be the universal set for the main factor, $x_{1p}(t-1), x_{1p}(t) \in [\alpha_1, \beta_1]$, and let $V_i = [\alpha_i, \beta_i] \subset \mathbb{R}$, $i = 2, 3, \ldots, m$,
be the universal sets for the secondary factors, $x_{ip}(t-1) \in [\alpha_i, \beta_i]$.



Step 2. Define fuzzy sets on the universal sets. Let $A_{1,k}(t-i), \ldots, A_{N_i,k}(t-i)$ be $N_i$ fuzzy sets in the
time series $F_k(t-i)$. The fuzzy sets are continuous, normal and complete in $[\alpha_k, \beta_k] \subset \mathbb{R}$, $i = 0, 1$,
$k = 1, 2, 3, \ldots, m$.
Step 3. Set up fuzzy relations using the training data. From this step we obtain the following M
fuzzy relations designed from the training data:
$$(A^l_{j_1^*,1}(t-1), A^l_{j_2^*,2}(t-1), \ldots, A^l_{j_m^*,m}(t-1)) \to A^l_{i_1^*,1}(t), \quad l = 1, 2, 3, \ldots, M. \qquad (4.1)$$

Step 4. Determine the membership function for each fuzzy relation resulting from Step 3. The
fuzzy relation (4.1) can be viewed as a fuzzy relation on $U \times V$ with
$U = U_1 \times \cdots \times U_m \subset \mathbb{R}^m$, $V \subset \mathbb{R}$, and the membership function of the fuzzy relation is
defined by
$$\mu_{R^l}(x_{1p}(t-1), x_{2p}(t-1), \ldots, x_{mp}(t-1); x_{1p}(t)) = \mu_{A_{j_1^*,1}(t-1)}(x_{1p}(t-1))\, \mu_{A_{j_2^*,2}(t-1)}(x_{2p}(t-1)) \cdots \mu_{A_{j_m^*,m}(t-1)}(x_{mp}(t-1))\, \mu_{A_{i_1^*,1}(t)}(x_{1p}(t)).$$

Step 5. For a given fuzzy set input $A'(t-1)$ in the input space U, establish the fuzzy set output
$A'^l(t)$ in the output space V for each fuzzy relation (4.1), defined by
$$\mu_{A'^l(t)}(x_1(t)) = \sup_{x \in U}\left( \mu_{A'(t-1)}(x(t-1))\, \mu_{R^l}(x(t-1); x_1(t)) \right),$$
where $x(t-1) = (x_1(t-1), \ldots, x_m(t-1))$.

Step 6. Determine the fuzzy set $A'(t)$ as the combination of the M fuzzy sets
$A'^1(t), A'^2(t), A'^3(t), \ldots, A'^M(t)$, defined by
$$\mu_{A'(t)}(x_1(t)) = \max\left(\mu_{A'^1(t)}(x_1(t)), \ldots, \mu_{A'^M(t)}(x_1(t))\right) = \max_{l=1}^{M}\left(\sup_{x \in U}\left(\mu_{A'(t-1)}(x(t-1))\,\mu_{R^l}(x(t-1); x_1(t))\right)\right) = \max_{l=1}^{M}\left(\sup_{x \in U}\left(\mu_{A'(t-1)}(x(t-1))\prod_{f=1}^{m}\mu_{A_{i_f^l,f}(t-1)}(x_f(t-1))\,\mu_{A_{i_1^l,1}(t)}(x_1(t))\right)\right).$$

Step 7. Calculate the forecasting outputs. Based on Step 6, if a fuzzy set input $A'(t-1)$ is
given, then the membership function of the forecasting output $A'(t)$ is
$$\mu_{A'(t)}(x_1(t)) = \max_{l=1}^{M}\left(\sup_{x \in U}\left(\mu_{A'(t-1)}(x(t-1))\prod_{f=1}^{m}\mu_{A_{i_f^l,f}(t-1)}(x_f(t-1))\,\mu_{A_{i_1^l,1}(t)}(x_1(t))\right)\right). \qquad (4.2)$$

Step 8. Defuzzify the output of the model. If the aim of the output of the model is a fuzzy set, then
we stop at Step 7. We use this step if we want a real output. If the fuzzy set input $A'(t-1)$ is
given with Gaussian membership function
$$\mu_{A'(t-1)}(x(t-1)) = \exp\left( -\sum_{i=1}^{m} \frac{(x_i(t-1) - x_i^*(t-1))^2}{a_i^2} \right),$$

then the membership function of the forecasting output $A'(t)$ in (4.2) is
$$\mu_{B'}(y) = \max_{l=1}^{K} \left( \exp\left( -\sum_{i=1}^{m} \frac{(x_i^*(t-1) - x_i^l(t-1))^2}{a_i^2 + (\sigma_i^l)^2} \right) \mu_{B^l}(y) \right) \qquad (4.3)$$
with $y \in [\alpha_1, \beta_1]$. If a real input $(x_1(t-1), \ldots, x_m(t-1))$ is given, then the forecasting real output
using Step 7 and the center average defuzzifier is
$$x_1(t) = f(x_1(t-1), \ldots, x_m(t-1)) = \frac{\sum_{j=1}^{M} y^j \exp\left( -\sum_{i=1}^{m} \frac{(x_i(t-1) - x_i^{*j}(t-1))^2}{a_i^2 + \sigma_{i,j}^2} \right)}{\sum_{j=1}^{M} \exp\left( -\sum_{i=1}^{m} \frac{(x_i(t-1) - x_i^{*j}(t-1))^2}{a_i^2 + \sigma_{i,j}^2} \right)} \qquad (4.4)$$
where $y^j$ is the center of the fuzzy set $A_{i_1^j,1}(t)$.
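The steps above amount to: partition the variable ranges into fuzzy sets, keep for each antecedent the rule that the training data fire most strongly, and forecast with a center-average formula as in (4.4). The sketch below is a simplified one-output version of this table lookup scheme with triangular membership functions and a defuzzified (real) output; the lag structure, the number of fuzzy sets and the series itself are made-up illustrations, and it is not the authors' implementation.

```python
import numpy as np

def tri_memberships(v, centers):
    """Triangular fuzzy sets with the given centers, complete on [c_first, c_last]."""
    grades = np.zeros(len(centers))
    for j, c in enumerate(centers):
        left = centers[j - 1] if j > 0 else c
        right = centers[j + 1] if j < len(centers) - 1 else c
        if left < v <= c:
            grades[j] = (v - left) / (c - left)
        elif c <= v < right:
            grades[j] = (right - v) / (right - c)
    return grades

def table_lookup_rules(X, y, in_centers, out_centers):
    """Keep, for each antecedent, the training pair with the highest rule degree."""
    rules = {}                           # antecedent indices -> (consequent index, degree)
    for xp, yp in zip(X, y):
        in_grades = [tri_memberships(v, c) for v, c in zip(xp, in_centers)]
        out_grades = tri_memberships(yp, out_centers)
        ante = tuple(int(np.argmax(g)) for g in in_grades)
        cons = int(np.argmax(out_grades))
        degree = np.prod([g.max() for g in in_grades]) * out_grades.max()
        if ante not in rules or degree > rules[ante][1]:
            rules[ante] = (cons, degree)
    return rules

def forecast(x, rules, in_centers, out_centers):
    """Center-average defuzzification over the stored rules, in the spirit of (4.4)."""
    num = den = 0.0
    for ante, (cons, _) in rules.items():
        w = np.prod([tri_memberships(v, c)[j] for v, c, j in zip(x, in_centers, ante)])
        num += w * out_centers[cons]
        den += w
    return num / den if den > 0 else np.nan

# Illustrative lagged data: predict x(t) from x(t-2), x(t-1) of an assumed series.
series = 20 + 5 * np.sin(0.3 * np.arange(60))
X = np.column_stack([series[:-2], series[1:-1]])
y = series[2:]
centers = np.linspace(series.min(), series.max(), 7)     # 7 fuzzy sets per variable
rules = table_lookup_rules(X, y, [centers, centers], centers)
print(len(rules), "rules; one-step forecast:",
      forecast([series[-2], series[-1]], rules, [centers, centers], centers))
```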

5. OPTIMIZATION OF FUZZY MODELING


FOR TIME SERIES DATA
In this section, a procedure to obtain an optimal time series model is developed. The procedure uses
the following steps: (1) determine significant input variables, (2) construct complete fuzzy
relations, (3) reduce the complete fuzzy relations to get the optimal number of fuzzy
relations. In this paper, the optimality of the fuzzy model is measured by the values of the Mean Squared
Error (MSE) and the Mean Absolute Percent Error (MAPE) on testing data.

5.1 Selection of Input Variables. Given M fuzzy relations, where the l-th fuzzy relation is
expressed by:

"If $x_1$ is $A_1^{j_1}$ and $x_2$ is $A_2^{j_2}$ and ... and $x_n$ is $A_n^{j_n}$, then $y$ is $B^i$",

the output of the fuzzy model is defined by
$$f = \frac{\sum_{r=1}^{M} y^r w_r}{\sum_{r=1}^{M} w_r},$$
where $y^r$ is the center of the fuzzy set $B^r$, $w_r = \mu_{A_1^r}\mu_{A_2^r}\cdots\mu_{A_n^r}$, and $\mu_{A_i^r}(x_i) = \exp\left(-\frac{(x_i - x_i^r)^2}{\sigma_{ir}^2}\right)$.
Saez and Cipriano [17] defined the sensitivity of the input variable $x_i$ by $\xi_i(x) = \frac{\partial f(x)}{\partial x_i}$ with $x = (x_1, x_2, \ldots, x_n)$.

The sensitivity $\xi_i(x)$ depends on the input variable x, and its computation is based on the training data.
Thus we compute $I_i = \mu_i^2 + \sigma_i^2$ for each variable, where $\mu_i$ and $\sigma_i$ are the mean and standard
deviation of the sensitivity of the variable $x_i$, respectively. The input variable with the smallest
value $I_i$ is discarded. Based on this procedure, to choose the important input variables we
take the variables having the biggest values $I_i$.
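The sensitivity-based ranking just described can be sketched as follows. The partial derivatives of the model output are estimated here by finite differences on the training inputs, and the criterion is taken as $I_i = \mu_i^2 + \sigma_i^2$, which follows the formula assumed above; the rule base and the training data are made up purely for illustration.

```python
import numpy as np

def fuzzy_output(x, rule_centers, rule_sigmas, rule_y):
    """f(x): center-average of rule outputs with Gaussian firing strengths."""
    d2 = np.sum(((x - rule_centers) / rule_sigmas) ** 2, axis=1)
    w = np.exp(-d2)
    return np.sum(w * rule_y) / np.sum(w)

def sensitivities(X, rule_centers, rule_sigmas, rule_y, h=1e-4):
    """Central finite-difference estimate of df/dx_i at every training point."""
    n = X.shape[1]
    S = np.zeros_like(X)
    for p, x in enumerate(X):
        for i in range(n):
            e = np.zeros(n); e[i] = h
            S[p, i] = (fuzzy_output(x + e, rule_centers, rule_sigmas, rule_y)
                       - fuzzy_output(x - e, rule_centers, rule_sigmas, rule_y)) / (2 * h)
    return S

# Made-up rule base and training inputs, only to show the ranking step.
rng = np.random.default_rng(0)
rule_centers = rng.uniform(10, 40, size=(20, 3))
rule_sigmas = np.full((20, 3), 5.0)
rule_y = rng.uniform(10, 40, size=20)
X_train = rng.uniform(10, 40, size=(100, 3))

S = sensitivities(X_train, rule_centers, rule_sigmas, rule_y)
I = S.mean(axis=0) ** 2 + S.std(axis=0) ** 2      # criterion I_i = mu_i^2 + sigma_i^2
print("input ranking (largest I_i first):", np.argsort(I)[::-1])
```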

5.2 Construction of Complete Fuzzy Relations Using the Method of Degree of Fuzzy Relation. In
modeling time series, if the number of training data is small, then the resulting fuzzy relations
may not cover all values in the input domain. So in this paper a procedure to construct complete
fuzzy relations is introduced. Given the following N input-output
data $(x_{1p}, x_{2p}, \ldots, x_{np}; y_p)$, $p = 1, 2, 3, \ldots, N$, where $x_{ip} \in [\alpha_i, \beta_i] \subset \mathbb{R}$ and $y_p \in [\alpha_y, \beta_y] \subset \mathbb{R}$,
$i = 1, 2, \ldots, n$, the method to design complete fuzzy relations is given by the following steps:
Step 1. Define fuzzy sets to cover the input and output spaces.
For each space $[\alpha_i, \beta_i]$, $i = 1, 2, \ldots, n$, define $N_i$ fuzzy sets $A_i^j$, $j = 1, 2, \ldots, N_i$,
which are complete and normal in $[\alpha_i, \beta_i]$. Similarly, define $N_y$ fuzzy sets $B^j$, $j = 1, 2, \ldots, N_y$,
which are complete and normal in $[\alpha_y, \beta_y]$.

Step 2. Determine all possible antecedents of fuzzy relation candidates.
Based on Step 1, there are $\prod_{i=1}^{n} N_i$ antecedents of fuzzy relation candidates. Each
antecedent has the form: $x_1$ is $A_1^{j_1}$ and $x_2$ is $A_2^{j_2}$ and ... and $x_n$ is $A_n^{j_n}$, abbreviated as
$A_1^{j_1}$ and $A_2^{j_2}$ and ... and $A_n^{j_n}$.

Step 3. Determine the consequence of each fuzzy relation candidate.
For each antecedent $A_1^{j_1}$ and $A_2^{j_2}$ and ... and $A_n^{j_n}$, the consequence of the fuzzy relation is
determined by the degree of the rule,
$$\mu_{A_1^{j_1}}(x_{1p})\,\mu_{A_2^{j_2}}(x_{2p})\cdots\mu_{A_n^{j_n}}(x_{np})\,\mu_{B^j}(y_p),$$
based on the training data. The consequence is chosen as follows: for any training
data $(x_{1p}, x_{2p}, \ldots, x_{np}; y_p)$ and for any fuzzy set $B^j$, choose $B^{j^*}$ such that
$$\mu_{A_1^{j_1}}(x_{1p^*})\,\mu_{A_2^{j_2}}(x_{2p^*})\cdots\mu_{A_n^{j_n}}(x_{np^*})\,\mu_{B^{j^*}}(y_{p^*}) \geq \mu_{A_1^{j_1}}(x_{1p})\,\mu_{A_2^{j_2}}(x_{2p})\cdots\mu_{A_n^{j_n}}(x_{np})\,\mu_{B^j}(y_p) \quad \text{for some}$$

( x1 p* , x2 p* ,..., xn p* ; y p* ) . If there are at least two B j* such that  A j1 ( x1 p* ) ...A jn ( xn p* )B j* ( y p* ) 


1 n

 A ( x1 p ) ...A ( xn p )B ( y p ) , then choose one of some B . From this step, we have the fuzzy
j1 jn j
j*

1 n

relations in form:
j j
IF x1 is A1 1 and x2 is A2 2 and ... and xn is An n , THEN
j
y is B j*
n
So if this process is continued for every antecedent, there are  N i complete fuzzy relations.
i 1

Theorem 5.1 If A is the set of fuzzy relations constructed by Wang's method and B is the set of
fuzzy relations generated by the method of degree of fuzzy relation, then $A \subseteq B$.

Based on Theorem 5.1, the method of degree of fuzzy relation is a generalization
of Wang's method.

5.3 Reduction of Fuzzy Relations Using the Singular Value Decomposition Method. If the number
of training data is large, then the number of fuzzy relations may be large too, and increasing
the number of fuzzy relations adds to the complexity of the computations. To overcome this, we
apply the singular value decomposition method (Yen, et al. [24]). The reduction of fuzzy
relations is done by the following steps, referring to Abadi, et al. [4] (see the sketch after these steps):
Step 1. Set up the firing strength of each fuzzy relation for each training datum
$(x; y) = (x_1(t-1), x_2(t-1), \ldots, x_m(t-1); x_1(t))$ as follows:
$$L_l(x; y) = \frac{\prod_{f=1}^{m} \mu_{A_{i_f,f}(t-1)}(x_f(t-1))\, \mu_{A_{i_1^l,1}(t)}(x_1(t))}{\sum_{k=1}^{M} \prod_{f=1}^{m} \mu_{A_{i_f,f}(t-1)}(x_f(t-1))\, \mu_{A_{i_1^k,1}(t)}(x_1(t))}.$$

Step 2. Construct the N × M firing strength matrix $L = (L_{ij})$, where $L_{ij}$ is the firing strength of the j-th
fuzzy relation for the i-th datum, i = 1, 2, …, N, j = 1, 2, …, M.
Step 3. Compute the singular value decomposition of L as $L = USV^T$.
Step 4. Determine the r biggest singular values, with $r \leq \operatorname{rank}(L)$.
Step 5. Partition V as $V = \begin{pmatrix} V_{11} & V_{12} \\ V_{21} & V_{22} \end{pmatrix}$, where $V_{11}$ is an r × r matrix and $V_{21}$ is an (M−r) × r matrix,
and construct $V_1^T = (V_{11}^T, V_{21}^T)$.

Step 6. Apply QR-factorization to $V_1^T$ and find an M × M permutation matrix P such that
$V_1^T P = QR$, where Q is an r × r orthogonal matrix, $R = [R_{11}, R_{12}]$, and $R_{11}$ is an r × r upper
triangular matrix.
Step 7. The positions of the ones in the first r columns of the matrix P indicate the
positions of the r most important fuzzy relations.
Step 8. Construct the time series forecasting model (4.3) or (4.4) using the r most important
fuzzy relations.
Step 9. If the model is optimal, stop; if it is not yet optimal, go to Step 4.
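The reduction steps above can be reproduced with numpy and scipy: compute the SVD of the firing-strength matrix, keep the first r right singular vectors, and let a column-pivoted QR factorization play the role of the permutation P. The firing-strength matrix below is random, for illustration only; in practice it would come from Step 1.

```python
import numpy as np
from scipy.linalg import qr

def important_rule_positions(L, r):
    """Return the indices of the r most important fuzzy relations (columns of L),
    selected via the SVD of the firing-strength matrix and QR with column pivoting."""
    U, s, Vt = np.linalg.svd(L, full_matrices=False)
    V1t = Vt[:r, :]                    # first r right singular vectors, as rows (V_1^T)
    # Column-pivoted QR of V_1^T: the pivot order plays the role of the permutation P.
    _, _, piv = qr(V1t, pivoting=True)
    return piv[:r], s

# Illustrative firing-strength matrix: N = 60 training data, M = 49 rules.
rng = np.random.default_rng(1)
L = rng.random((60, 49))
L /= L.sum(axis=1, keepdims=True)      # rows normalised, as in Step 1

positions, singular_values = important_rule_positions(L, r=10)
print("positions of the 10 most important rules:", sorted(positions.tolist()))
```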

6. FORECASTING INTEREST RATE OF BANK INDONESIA


CERTIFICATE USING FUZZY MODEL

In this section, the singular value decomposition method is applied to forecast the interest rate of
Bank Indonesia Certificates (BIC) based on time series data. First, the input sensitivity method is
applied to select input variables. Second, the singular value decomposition method is
applied to select the optimal fuzzy relations. The initial fuzzy model with 8 input variables
(x(k-8), x(k-7), …, x(k-1)) from the data of the interest rate of BIC is considered. The universal
set of the 8 inputs and 1 output is [10, 40], and 7 fuzzy sets $A_1, A_2, \ldots, A_7$ with Gaussian
membership functions are defined on each universal set of input and output. Then the procedure in
Section 5.1 is applied to find significant inputs. The distribution of the sensitivities of the input
variables $I_i$ is shown in Figure 1(a). We consider both the two and the three input variables with
the biggest sensitivities $I_i$; the selected input variables are x(k-8), x(k-1) and
x(k-8), x(k-3), x(k-1), respectively.
The time series model constructed with the two input variables x(k-8) and x(k-1) has
better prediction accuracy than the time series model constructed with the three input variables x(k-8),
x(k-3), x(k-1). So we choose x(k-8) and x(k-1) as input variables to predict the value x(k). Then
the method of degree of fuzzy relation is applied to yield the 49 fuzzy relations shown in Table
1.

The singular value decomposition method of Section 5.3 is applied to get the optimal
fuzzy relations. The singular values of the firing strength matrix are shown in Figure 1(b). There
are 10 optimal fuzzy relations. The 10 most important fuzzy relations are
numbers 1, 2, 8, 9, 10, 15, 17, 29, 37, 44, printed in bold in Table 1. The resulting fuzzy relations
are used to design the time series forecasting models (4.3) and (4.4).

Table 1. Fuzzy relations for interest rate of BIC using the method of
degree of fuzzy relation based on time series data

Rule (x(t-8), x(t-1)) → x(t) | Rule (x(t-8), x(t-1)) → x(t) | Rule (x(t-8), x(t-1)) → x(t)
1  (A1, A1) → A1 | 17 (A3, A3) → A2 | 33 (A5, A5) → A1
2  (A1, A2) → A2 | 18 (A3, A4) → A2 | 34 (A5, A6) → A2
3  (A1, A3) → A2 | 19 (A3, A5) → A2 | 35 (A5, A7) → A2
4  (A1, A4) → A3 | 20 (A3, A6) → A2 | 36 (A6, A1) → A2
5  (A1, A5) → A3 | 21 (A3, A7) → A2 | 37 (A6, A2) → A2
6  (A1, A6) → A3 | 22 (A4, A1) → A1 | 38 (A6, A3) → A2
7  (A1, A7) → A3 | 23 (A4, A2) → A1 | 39 (A6, A4) → A2
8  (A2, A1) → A1 | 24 (A4, A3) → A2 | 40 (A6, A5) → A2
9  (A2, A2) → A2 | 25 (A4, A4) → A2 | 41 (A6, A6) → A2
10 (A2, A3) → A3 | 26 (A4, A5) → A2 | 42 (A6, A7) → A2
11 (A2, A4) → A3 | 27 (A4, A6) → A2 | 43 (A7, A1) → A2
12 (A2, A5) → A3 | 28 (A4, A7) → A2 | 44 (A7, A2) → A2
13 (A2, A6) → A3 | 29 (A5, A1) → A1 | 45 (A7, A3) → A2
14 (A2, A7) → A3 | 30 (A5, A2) → A1 | 46 (A7, A4) → A2
15 (A3, A1) → A1 | 31 (A5, A3) → A1 | 47 (A7, A5) → A2
16 (A3, A2) → A2 | 32 (A5, A4) → A1 | 48 (A7, A6) → A2
                 |                  | 49 (A7, A7) → A2

Figure 1. (a) Distribution of the sensitivities of the input variables $I_i$; (b) distribution of the singular
values of the firing strength matrix based on the time series data of the interest rate of BIC.

Figure 2. Predicted and true values of the interest rate of BIC using: (a) the singular
value decomposition method; (b) the degree of fuzzy relation method.

Table 2. Comparison of MSE and MAPE for forecasting interest rate of
BIC using different methods

Method | Number of fuzzy sets | Number of fuzzy relations | MSE of testing data | MAPE of testing data (%)
Singular value decomposition (selected 2-input variables), Abadi et al. [4] | 7 | 10 | 0.14180 | 1.8787
Degree of fuzzy relation (2-order model), Abadi et al. [3] | 7 | 49 | 0.38679 | 3.7750
Wang (2-order model) [22] | 7 | 12 | 0.55075 | 4.4393
Wang (1-order, 6-factor model) [22] | 16 | 36 | 0.26990 | 3.1256
Song & Chissom (1-order model) [19] | 6 | 8 | 2.62025 | 9.8409
Chen & Hsu (1-order model) [8] | 6 | 8 | 2.62025 | 9.8409
Chen & Hsu (2-order model) [8] | 6 | 10 | 2.99333 | 11.029
Chen & Hsu (1-order model) [8] | 10 | 12 | 1.50179 | 7.3517
Chen & Hsu (2-order model) [8] | 10 | 14 | 1.58718 | 7.7343
Lee et al. (1-order, 6-factor model) [14] | 16 | 36 | 1.07718 | 6.1832
Jilani et al. (1-order, 6-factor model) [13] | 16 | 36 | 0.62164 | 5.0642
AR(7) | – | – | 31.64529 | 35.0970
ECM | – | – | 1.10010 | 6.3881

7. CONCLUSION

In this paper, we have presented the capability of fuzzy systems to model time series data.
As universal approximators, fuzzy systems have the capability to model nonstationary time
series. The uniqueness of fuzzy systems is that they can formulate problems based on
expert knowledge or empirical data. We also presented a method to select input variables and

reduce the fuzzy relations of a time series model based on training data. The method was used to
get significant input variables and an optimal number of fuzzy relations. We applied the
proposed method to forecast the interest rate of BIC. The result was that forecasting the interest
rate of BIC using the proposed method has a higher accuracy than using conventional
time series methods.

REFERENCES

[1] ABADI, A.M., SUBANAR, WIDODO AND SALEH, S., Designing fuzzy time series model and
its application to forecasting inflation rate. 7Th World Congress in Probability and
Statistics. Singapore: National University of Singapore, 2008.
[2] ABADI, A.M., SUBANAR, WIDODO, SALEH, S., Constructing Fuzzy Time Series Model
Using Combination of Table lookup and Singular Value Decomposition Methods
and Its Applications to Forecasting Inflation Rate, Jurnal ILMU DASAR, 10(2), 190-
198, 2009.
[3] ABADI, A.M., SUBANAR, WIDODO, SALEH, S., A New Method for Generating Fuzzy
Rules from Training Data and Its Applications to Forecasting Inflation Rate and
Interest Rate of Bank Indonesia Certificate, Journal of Quantitative Methods, 5(2),
78-83, 2009.
[4] ABADI, A.M., SUBANAR, WIDODO, SALEH, S., Fuzzy Model for Forecasting Interest Rate
of Bank Indonesia Certificate, Proceedings of the 3rd International Conference on
Quantitative Methods Used in Economics and Business, Faculty of Economics,
Universitas Malahayati, Bandar Lampung, June 16-18, 2010.
[5] BOLLERSLEV, T., Generalized Autoregressive Conditional Heteroscedasticity, Journal of
Econometrics, 31, 307-327, 1986.
[6] BOX, G.E.P. AND JENKINS, G.M., Time Series Analysis: forecasting and Control, Holden-
Day, San Francisco, 1970.
[7] CHATFIELD, C., The Analysis of Time Series: An Introduction, Sixth Edition, Chapman &
Hall/CRC Press, Boca Raton, 2004.
[8] CHEN, S.M. AND HSU, C.C., A New Method to Forecasting Enrollments Using Fuzzy
Time Series, International Journal of Applied Sciences and Engineering, 3(2), 234-
244, 2004.
[9] ENGLE, R.F., Autoregressive Conditional Heteroscedasticity with Estimate of Variance of
United Kingdom Inflation, Econometrica, 50, 987-1008, 1982.
[10] GOLUB, G.H., KLEMA, V., STEWART, G.W., Rank Degeneracy and Least Squares
Problems, Technical Report TR-456, Dept. of Computer Science, University of
Maryland, College Park, 1976.
[11] HUARNG, K., Effective Lengths of Intervals to Improve Forecasting in Fuzzy Time
Series, Fuzzy Sets and Systems 123, 387-394, 2001.
[12] HWANG, J.R., CHEN, S.M., LEE, C.H., Handling Forecasting Problems Using Fuzzy
Time Series, Fuzzy Sets and Systems 100, 217-228, 1998.
[13] JILANI, T.A, BURNEY, S.M.A., ARDIL, C., Multivariate High Order Fuzzy Time Series
Forecasting for Car Road Accidents. International Journal of Computational
Intelligence, 4(1), 15-20, 2007.
[14] LEE, L.W., WANG, L.H., CHEN, S.M., LEU, Y.H., Handling Forecasting Problems Based

on Two-factors High Order Fuzzy Time Series, IEEE Transactions on Fuzzy


Systems, 14(3), 468 – 477, 2006.
[15] MAKRIDAKIS, S., WHEELWRIGHT, S.C., HYNDMAN, R.J., Forecasting: Methods and
Applications, Wiley, New York, 1998.
[16] POPOOLA, A.O., Fuzzy-wavelet Method for Time Series Analysis, Dissertation,
Department of Computing, School of Electronics and Physical Sciences, University
of Surrey, Guildford, UK, 2007.
[17] SAEZ, D. AND CIPRIANO, A., A New Method For Structure Identification Of Fuzzy
Models And Its Application To A Combined Cycle Power Plant, Engineering
Intelligent Systems, 2, 101-107, 2001.
[18] SAH, M. AND DEGTIAREV, K.Y., Forecasting Enrollments Model Based on First-Order
Fuzzy Time Series, Transaction on Engineering Computing and Technology IV,
2004.
[19] SONG, Q. AND CHISSOM, B.S., Forecasting Enrollments with Fuzzy Time series Part I,
Fuzzy Sets and Systems 54, 1-9, 1993.
[20] TSENG, F-M, TSENG, G-H, YU, H-C, YUAN, B.J.C., Fuzzy ARIMA Model for Forecasting
The Foreign Exchange Market, Fuzzy Sets and Systems, 118, 9-19, 2001.
[21] WANG L.X., Adaptive Fuzzy Systems and Control: Design and Stability Analysis,
Prentice-Hall, Inc., New Jersey, 1994.
[22] WANG L.X., A Course in Fuzzy Systems and Control, Prentice-Hall, Inc., New Jersey,
1997.
[23] WANG, L.X., The WM Method Completed: A Flexible System Approach to Data
Mining, IEEE Transactions on Fuzzy Systems, 11(6), 768-782, 2003.
[24] YEN, J., WANG, L., GILLESPIE, C.W., Improving the Interpretability of TSK Fuzzy Models
by Combining Global Learning and Local Learning, IEEE Transactions on Fuzzy
Systems, 6(4), 530-537, 1998.
[25] ZADEH, L.A., Soft Computing and Fuzzy Logic, IEEE Software, 11(6), 48-56, 1994.
[26] ZHANG, B-L., COGGINS, R., JABRI, M.A., DERSCH, D., FLOWER, B., Multiresolution
Forecasting for Futures Trading Using Wavelet Decomposition, IEEE Transactions
on Neural Networks, 12(4), 765-775, 2001.
[27] ZHANG, G.P. AND QI, M., Neural Network Forecasting for Seasonal and Trend Time
Series, European Journal of Operational Research, 160(2), 501-514, 2005.
[28] ZIMMERMANN, H.J., Fuzzy Set Theory and Its Applications, Kluwer Academic
Publishers, London, 1991.

SUBANAR
Department of Mathematics, Faculty of Mathematics and Natural Sciences, Gadjah
Mada University, Indonesia
e-mail: [email protected]

AGUS MAMAN ABADI


Department of Mathematics Education, Faculty of Mathematics and Natural Sciences,
Yogyakarta State University, Indonesia
e-mail: [email protected]
Paper of Participants
Proceedings of ”The 6th SEAMS-UGM Conference 2011”
Algebra, pp.137–144.

DEGENERATIONS FOR FINITE DIMENSIONAL REPRESENTATIONS OF QUIVERS

Darmajid and Intan Muchtadi-Alamsyah

Abstract. This is a survey about quivers and representations of quivers from the geometric
point of view. Specifically, we study degenerations of finite dimensional representations
of quivers. In this paper we prove the following result of Brion: if
0 → U → M → N → 0 is an exact sequence of finite dimensional representations of a
bound quiver (Q, I), then M degenerates to U ⊕ N.

Keywords and Phrases: quivers, representations of quivers, exact sequence, degeneration.

1. INTRODUCTION
A quiver is a finite directed (oriented) graph, possibly with multiple arrows and
loops. Using a quiver Q one can define an algebra over an algebraically closed field
k called the path algebra kQ. Conversely, if an algebra A is basic, connected, and finite
dimensional, we may obtain an associated quiver QA [1]. By using quivers, studying algebras
becomes more interesting because we can work with a graphical structure. Similarly,
using a bound quiver (Q, I) associated to an algebra A, we can visualize any (finite
dimensional) A-module as a k-linear representation of (Q, I) [1]. Following [3], we
can show that the set of representations of the bound quiver (Q, I) with a fixed dimension
vector is naturally an affine variety and that the product Gd of general linear groups acts on
the category repk(Q, I) of all representations of the bound quiver (Q, I) in such a way that
the Gd-orbits are the isomorphism classes of representations in repk(Q, I). The points in the
closure of the Gd-orbit of a representation M of the bound quiver (Q, I) may be viewed as
geometric degenerations of the representation M. In this paper we prove the following result
in [2]. Let 0 → U → M → N → 0 be an exact sequence of finite dimensional representations
of the bound quiver (Q, I). Then M degenerates to U ⊕ N.

2010 Mathematics Subject Classification: 16G20, 14L30, 16G60.


2. PRELIMINARIES
Throughout the paper, k denotes an algebraically closed field. A quiver is an
oriented graph. More specifically, a quiver Q = (Q0, Q1, s, t) is a quadruple consisting
of two sets, Q0, whose elements are called vertices, and Q1, whose elements are called
arrows, and two maps s, t : Q1 → Q0 which associate to each arrow α ∈ Q1 its source
s(α) ∈ Q0 and its target t(α) ∈ Q0, respectively. A quiver Q is said to be finite if Q0
and Q1 are finite sets [1]. We will only consider finite quivers Q.
Example 2.1. Let Q be the quiver with Q0 := {1, 2, 3}, Q1 := {α, β, γ, λ}, s(α) := 2, t(α) :=
1, s(γ) := 2, t(γ) := 1, s(β) := 3, t(β) := 2, and s(λ) = t(λ) := 3. Thus Q consists of two
parallel arrows α, γ from 2 to 1, an arrow β from 3 to 2, and a loop λ at 3.

A path of length l ≥ 1 in the quiver Q from a to b is a concatenation of arrows
α_l α_{l−1} · · · α_1 such that a, b ∈ Q0, α_l, α_{l−1}, . . . , α_1 ∈ Q1, s(α_1) = a, t(α_l) = b, and
t(α_j) = s(α_{j+1}) for each j ∈ {1, . . . , l − 1}, so that the concatenation obeys the orientation.
We also agree to associate with each vertex a ∈ Q0 a path of length l = 0, called the
stationary path at a. The path algebra kQ of Q over k is the k-vector space with k-basis all
paths in Q, where the product of two paths β : a′ → b′ and α : a → b is the composed
path βα : a → b′ if a′ = b and zero otherwise [5].
For l ≥ 1, let R_Q^l be the ideal of kQ generated by the set of all paths of length l. A
two-sided ideal I of kQ is said to be admissible if there exists an integer m ≥ 2 such that
R_Q^m ⊆ I ⊆ R_Q^2. If I is an admissible ideal of kQ, the pair (Q, I) is said to be a bound
quiver and the quotient algebra kQ/I is said to be a bound quiver algebra [1].
Example 2.2. (a) For any finite quiver Q and any m ≥ 2, the ideal R_Q^m is admissible.
(b) Let Q be the quiver with vertices 1, 2, 3, 4 and arrows α : 1 → 2, β : 2 → 4, γ : 1 → 3,
δ : 3 → 4 and λ : 1 → 4, so that βα and δγ are parallel paths from 1 to 4.
The ideal I1 = ⟨βα − δγ⟩ of the k-algebra kQ is admissible, but I2 = ⟨βα − λ⟩
is not admissible, since βα − λ ∉ R_Q^2.
A representation V of a quiver Q consists of a family of vector spaces V_a indexed
by the vertices a ∈ Q0, together with a family of linear maps f_α : V_{s(α)} → V_{t(α)}
indexed by the arrows α ∈ Q1. A representation V = (V_a, f_α)_{a∈Q0, α∈Q1} of the quiver Q
is finite dimensional if so are all the vector spaces V_a. Under that assumption, the
family dim V := (dim V_a)_{a∈Q0} is the dimension vector of V. It lies in the additive
group N^{Q0} consisting of all tuples of non-negative integers d = (d_a)_{a∈Q0}. The dimension of
V is the natural number Σ_{a∈Q0} dim V_a. For a path τ = α_l α_{l−1} · · · α_1 : a → b in the quiver
Q, the k-linear map f_τ : V_a → V_b is defined to be the identity map of V_a if l = 0 and the
composition f_{α_l} ∘ f_{α_{l−1}} ∘ · · · ∘ f_{α_1} if l > 0. A representation of the bound quiver (Q, I)
is a representation M = (M_a, f_α)_{a∈Q0, α∈Q1} with the additional property that for each
k-linear combination ρ = Σ_{i=1}^{r} k_i τ_i ∈ I, where every τ_i, i ∈ {1, . . . , r}, is a path from a
to b, the k-linear map f_ρ = Σ_{i=1}^{r} k_i f_{τ_i} is zero. In this paper, we will only consider the
finite-dimensional representations of the bound quiver (Q, I) [5].
Let V = (V_a, f_α)_{a∈Q0, α∈Q1} and W = (W_a, g_α)_{a∈Q0, α∈Q1} be two representations
of the bound quiver (Q, I). A morphism ϕ : V → W is a family ϕ = (ϕ_a)_{a∈Q0} of linear maps
(ϕ_a : V_a → W_a)_{a∈Q0} such that for any arrow α : a → b the equality g_α ∘ ϕ_a = ϕ_b ∘ f_α
holds. In fact, the category of (finite-dimensional) k-representations of the bound quiver
(Q, I), denoted by repk(Q, I), is equivalent to the category of finite dimensional kQ/I-
modules [1].

Example 2.3. Let Q be the quiver on five vertices bound by the commutativity relation
I = ⟨βα − δγ⟩, where βα and δγ are parallel paths and λ is an additional arrow.
[Quiver diagram omitted in this reproduction.]
The representation M of Q given in the original diagram, with one-dimensional or zero
spaces at the vertices and identity or zero maps along the arrows, is not bound by
βα − δγ = 0. [Diagram of M omitted.]
The following representations V and W of the quiver Q are bound by βα − δγ = 0, and
a morphism from V to W is given by the dashed lines in the original figure; it is readily
verified that the defining squares g_α ∘ ϕ_a = ϕ_b ∘ f_α commute. [Diagrams of V and W
and the accompanying matrix verification omitted.]
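To make the notion of a representation being bound by a commutativity relation such as βα − δγ concrete, one can store the linear maps as matrices and compare the two composite maps directly. The following Python sketch does this for a hypothetical commutativity square; the matrices and the quiver shape are made up for illustration and are not the data of Example 2.3.

import numpy as np

# A hypothetical representation of a commutativity square
#     a --alpha--> b --beta--> d
#     a --gamma--> c --delta--> d
# with dimension vector (1, 2, 1, 1).  The relation beta*alpha - delta*gamma
# is satisfied exactly when the two composite matrices agree.
alpha = np.array([[1.0], [0.0]])      # map k^1 -> k^2
beta  = np.array([[1.0, 1.0]])        # map k^2 -> k^1
gamma = np.array([[1.0]])             # map k^1 -> k^1
delta = np.array([[1.0]])             # map k^1 -> k^1

def bound_by_commutativity(alpha, beta, gamma, delta):
    """Return True if the representation satisfies beta*alpha = delta*gamma."""
    return np.allclose(beta @ alpha, delta @ gamma)

print(bound_by_commutativity(alpha, beta, gamma, delta))      # True here
# Replacing delta by the zero map breaks the relation:
print(bound_by_commutativity(alpha, beta, gamma, 0 * delta))  # False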

3. GEOMETRIC STRUCTURE OF REPRESENTATIONS


Following [3], we will give a geometric structure, namely that of an affine variety, to the
category repk(Q, I), together with an action of the product G_d of general linear groups on
repk(Q, I) such that the G_d-orbits are the isomorphism classes of representations in
repk(Q, I). Recall that a representation V = (V_a, f_α) of the bound quiver (Q, I) with
dimension vector d = (d_a)_{a∈Q0} assigns a vector space V_a of dimension d_a to each vertex
a ∈ Q0 and a linear map f_α : V_a → V_b to each arrow α : a → b. By choosing bases, we can
identify every vector space V_a with k^{d_a}, so that every f_α can be identified with a
d_b × d_a matrix V_α. This fact may be used to give repk(Q, I) the structure of an affine
variety by parameterizing repk(Q, I) inside the direct sum ⊕_{α:a→b; α∈Q1} M_{d_b×d_a}(k),
where M_{d_b×d_a}(k) denotes the set of all d_b × d_a matrices. Since
⊕_{α:a→b; α∈Q1} Hom_k(k^{d_a}, k^{d_b}) = ⊕_{α:a→b; α∈Q1} M_{d_b×d_a}(k),
it is clear that

repk(Q, I) = M_d := { (V_α)_{α∈Q1} ∈ ⊕_{α:a→b; α∈Q1} M_{d_b×d_a}(k) |
      V_ρ = Σ_{i=1}^{r} k_i V_{α^i_{s_i}} · · · V_{α^i_1} = O
      for every ρ = Σ_{i=1}^{r} k_i α^i_{s_i} · · · α^i_1 ∈ I,
      with k_i ∈ k and the α^i_j paths from a to b },

and these conditions define simultaneous zeroes of collections of polynomial equations in
the entries of the matrices V_α. Hence repk(Q, I) = M_d is an affine variety. The product
G_d = Π_{a∈Q0} GL_{d_a} of general linear groups GL_{d_a} = {G ∈ M_{d_a×d_a}(k) | det G ≠ 0}
acts on M_d via

(G_a)_{a∈Q0} ∗ (V_α)_{α∈Q1} = (G_b · V_α · G_a^{−1})_{α∈Q1},

for all (G_c)_{c∈Q0} ∈ G_d and all V ∈ repk(Q, I) parameterized by (V_β)_{β∈Q1} ∈ M_d, where
each arrow is α : a → b. The G_d-orbits are the isomorphism classes of representations
in M_d.
Denote by O_M the G_d-orbit of a representation M in repk(Q, I). The main object
of our interest is the closure O̅_M (in the Zariski topology) of the orbit O_M. There
are two main reasons for studying such orbit closures. On the one hand, by inspecting them
as affine varieties with the methods of algebraic geometry, we can achieve a deeper
understanding of the category of representations. On the other hand, orbit closures provide
many interesting examples of affine varieties, whose geometric properties are derived from
known properties of the category of representations.
Example 3.1. Let Q be the quiver 1 ← 2 with a single arrow α : 2 → 1, let I = ⟨0⟩, and let
d = (d1, d2) ∈ N². The group G_d = GL_{d1} × GL_{d2} acts on M_d via

(G1, G2) ∗ V_α = G1 · V_α · G2^{−1}.

The vector space repk(Q, I) = M_d decomposes into a union of (m + 1) orbits O_r,
0 ≤ r ≤ m = min{d1, d2}, consisting of the matrices of rank r. The orbit closure O̅_r
consists of the matrices of rank at most r: O̅_r = ∪_{j≤r} O_j.
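Example 3.1 can also be checked numerically: for the quiver 1 ← 2 a representation is a single matrix, its orbit is determined by its rank, and a matrix of smaller rank is reached as a limit of matrices in a larger-rank orbit. The Python sketch below, with an arbitrarily chosen matrix, exhibits such a one-parameter family and the rank drop at t = 0.

import numpy as np

# For the quiver 1 <- 2 with dimension vector (2, 2), a representation is a
# 2 x 2 matrix V, and G_d = GL_2 x GL_2 acts by V |-> G_1 V G_2^{-1}.
# The orbit of V is determined by rank(V); the closure of the rank-2 orbit
# contains all matrices of rank <= 2, as the family below illustrates.
V = np.array([[1.0, 0.0],
              [0.0, 1.0]])            # rank 2

def family(t):
    """A curve inside the orbit of V for t != 0 (scale the second row by t)."""
    return np.diag([1.0, t]) @ V

for t in [1.0, 0.1, 0.0]:
    print(t, np.linalg.matrix_rank(family(t)))
# The rank is 2 for every t != 0 but drops to 1 at t = 0: the limit point lies
# in the closure of the rank-2 orbit but belongs to the rank-1 orbit.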

Definition 3.1. Let V, W ∈ repk(Q, I) = M_d. We say that V degenerates to W if
O_W ⊆ O̅_V.

Example 3.2. Let Q be the quiver consisting of a single vertex 1 with one loop α, let
d = d ∈ N^1 and let I = ⟨α^r⟩, r ≥ 2. Any representation V ∈ repk(Q, I), parameterized by
V ∈ M_d, is given by a nilpotent endomorphism V of k^d, that is, a square matrix in M_d.
The group G_d = GL_d acts by conjugation G ∗ V = GVG^{−1}, and the orbits are
just the conjugacy classes of matrices in M_d. Any orbit in M_d contains a matrix in
the canonical Jordan form
⊕ J(λ_i, r_i),  λ_i ∈ k,  r_i ≥ 1,  Σ r_i = d,
which is unique up to the order of the blocks J(λ_i, r_i). In particular, there are infinitely
many orbits. However, the characteristic polynomial det(tI_d − V) of a matrix V ∈ M_d
leads to a G_d-invariant regular morphism, and therefore any orbit closure contains only
finitely many orbits. It is known that V in repk(Q, I) (parameterized by V ∈ M_d)
degenerates to W in repk(Q, I) (parameterized by W ∈ M_d) if and only if
rank((λI_d − V)^j) ≥ rank((λI_d − W)^j) for all λ ∈ k and j ≥ 1.
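The rank criterion stated at the end of Example 3.2 is easy to test for small matrices. The following Python sketch (with the 2 × 2 nilpotent Jordan block as a made-up test case) checks the inequalities rank((λI − V)^j) ≥ rank((λI − W)^j) for a supplied list of candidate eigenvalues; for nilpotent matrices only λ = 0 is relevant, and j up to d suffices.

import numpy as np

def degenerates(V, W, eigenvalues, d):
    """Check the rank criterion for the listed candidate eigenvalues."""
    I = np.eye(d)
    return all(
        np.linalg.matrix_rank(np.linalg.matrix_power(lam * I - V, j))
        >= np.linalg.matrix_rank(np.linalg.matrix_power(lam * I - W, j))
        for lam in eigenvalues for j in range(1, d + 1))

J2 = np.array([[0.0, 1.0], [0.0, 0.0]])   # Jordan block J(0, 2)
Z  = np.zeros((2, 2))                     # J(0, 1) + J(0, 1)
print(degenerates(J2, Z, [0.0], 2))       # True:  J(0, 2) degenerates to the zero matrix
print(degenerates(Z, J2, [0.0], 2))       # False: the converse degeneration does not hold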
Before we construct degenerations in algebraic terms, we recall the definition of an
algebraic family of representations and its relationship with degeneration. Let Z be an
affine variety. An indexed set of representations M_z ∈ repk(Q, I) = M_d is called an
algebraic family of representations if the map Z → M_d, z ↦ M_z, is a morphism of
varieties [4]. We use the following lemma about the relationship between algebraic
families and degenerations, whose proof can be found in [4].
Lemma 3.1. Given two representations M and N in repk(Q, I). The representation M
degenerates to the representation N if and only if there is an algebraic family (M_z)_{z∈Z}
of representations in repk(Q, I) such that M_z ≅ M on an open set of Z and M_x ≅ N
for some x ∈ Z.
Note that the general linear group GL_1 = {x ∈ k | x ≠ 0} is an open dense subset of the
affine variety k. The following fundamental result concerns the construction of degenerations
in algebraic terms.
Theorem 3.1. Let
0 → U −ϕ→ M −ψ→ N → 0
be an exact sequence of finite dimensional representations of the bound quiver (Q, I). Then
M degenerates to U ⊕ N.
Proof. Let the representation M = (M_a, f_α)_{a∈Q0, α∈Q1} and dim M := d = (d_a)_{a∈Q0}. By
exactness, ϕ = (ϕ_a)_{a∈Q0} is a monomorphism and ψ = (ψ_a)_{a∈Q0} is an epimorphism. Hence,
without loss of generality, we may assume that U is a subrepresentation of M. This yields
subspaces U_a ⊂ M_a, for all a ∈ Q0, such that f_α(U_a) ⊂ U_b for all α : a → b. Choosing bases
for the U_a and completing them to bases of the M_a, we obtain a point (M_α)_{α∈Q1} ∈ M_d
that parameterizes M ((M_α)_{α∈Q1} ≅ M) such that M_α(k^{r_a}) ⊂ k^{r_b} for all α : a → b. Here,
r = (r_a)_{a∈Q0} denotes the dimension vector of U. The family of restrictions (induced
by the monomorphism ϕ) (U_α : k^{r_a} → k^{r_b})_{α:a→b, α∈Q1} parameterizes U, that is,
(U_α)_{α∈Q1} ≅ U, where U_α ∈ M_{r_b×r_a}(k). Therefore, for every α ∈ Q1,

M_α = ( U_α  X_α
        O    Y_α )

with X_α ∈ M_{r_b×s_a}(k), Y_α ∈ M_{s_b×s_a}(k), and s_a = d_a − r_a for all a ∈ Q0. Moreover,
the vector space k^{d_a} can be decomposed into
k^{d_a} = k^{r_a} ⊕ k^{s_a}
for all a ∈ Q0. Using the surjectivity of ψ_a, we obtain that s = (s_a)_{a∈Q0} is the
dimension vector of N and that the family of quotient maps (induced by the epimorphism
ψ) (N_α : k^{s_a} → k^{s_b})_{α:a→b, α∈Q1} parameterizes N, that is, (N_α)_{α∈Q1} ≅ N, where
N_α ∈ M_{s_b×s_a}(k). Hence, for every α ∈ Q1, we obtain

M_α = ( U_α  X_α
        O    N_α ).
Define a homomorphism of algebraic groups

λ : GL_1 → G_d,   t ↦ (λ_a(t))_{a∈Q0},

where

λ_a(t) = ( tI_{r_a}  0
           0         I_{s_a} )

in the decomposition k^{d_a} = k^{r_a} ⊕ k^{s_a} for all a ∈ Q0. Therefore, we have

λ(t) · M = ( λ_b(t) M_α λ_a(t)^{−1} )_{α:a→b, α∈Q1}
         = ( ( tI_{r_b}  0          ( U_α  X_α     ( tI_{r_a}  0         −1
               0         I_{s_b} )    O    N_α )     0         I_{s_a} )    )_{α:a→b, α∈Q1}
         = ( ( U_α  tX_α
               O    N_α ) )_{α∈Q1}.

Hence, we get a morphism of varieties

λ_M : GL_1 → O_M,   t ↦ λ(t) · M.

As a consequence, the morphism of varieties λ_M can be extended to a morphism
λ_M : k → O̅_M by defining

λ_M(t) := λ(t) · M if t ≠ 0,   and   λ_M(0) := ( ( U_α  O
                                                   O    N_α ) )_{α∈Q1}.

With this definition, λ_M is also continuous at t = 0 ∈ k, and hence λ_M is a morphism
of varieties. It follows that λ_M(0) ∈ O̅_M parameterizes U ⊕ N. Therefore, we obtain
an algebraic family (λ_M(t))_{t∈k} of representations of the bound quiver (Q, I) such that
λ_M(t) ≅ M on the open dense set GL_1 of k and λ_M(0) ≅ U ⊕ N. By Lemma 3.1, M
degenerates to U ⊕ N. 
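The one-parameter family λ(t) · M used in the proof can be illustrated with explicit matrices: conjugating a block upper triangular M_α by λ(t) scales the off-diagonal block X_α by t, and at t = 0 one is left with the block-diagonal representation U ⊕ N. The Python sketch below uses a made-up 2 × 2 example for a single arrow whose source and target carry the same splitting k^1 ⊕ k^1; it is only an illustration of the construction, not part of the proof.

import numpy as np

# M_alpha is block upper triangular with diagonal blocks U_alpha (the
# subrepresentation) and N_alpha (the quotient); X_alpha is the extension data.
U_alpha = np.array([[2.0]])
N_alpha = np.array([[3.0]])
X_alpha = np.array([[5.0]])
M_alpha = np.block([[U_alpha, X_alpha],
                    [np.zeros((1, 1)), N_alpha]])

def lam(t, r, s):
    """The one-parameter subgroup lambda_a(t) = diag(t*I_r, I_s) of GL_{r+s}."""
    return np.diag([t] * r + [1.0] * s)

def act(t, M, r, s):
    """lambda_b(t) * M_alpha * lambda_a(t)^{-1}, assuming both vertices split as (r, s)."""
    return lam(t, r, s) @ M @ np.linalg.inv(lam(t, r, s))

for t in [1.0, 0.1, 0.01]:
    print(t, act(t, M_alpha, 1, 1))
# As t -> 0 the off-diagonal block t*X_alpha vanishes; the limit is the
# block-diagonal matrix diag(U_alpha, N_alpha), i.e. the representation U + N.
print(np.block([[U_alpha, np.zeros((1, 1))], [np.zeros((1, 1)), N_alpha]]))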

4. CONCLUDING REMARKS
We have shown that if 0 → U → M → N → 0 is an exact sequence of finite-dimensional
representations of a bound quiver (Q, I), then M degenerates to U ⊕ N. This means that we
have constructed, in algebraic terms, the degenerations arising geometrically from
representations of a bound quiver.

Acknowledgement. The authors would like to thank I-MHERE FMIPA ITB for
financial support under Surat Perjanjian No. 113/I1.B01/I-MHERE ITB/SPK/2011.

References
[1] I. Assem, D. Simson, and A. Skowronski, Elements of the Representation Theory of Associative
Algebras, in : Techniques of Representation Theory, vol 1, Cambridge University Press, New York,
2006.
[2] M. Brion, Representations of Quivers, Lecture notes available at http://www-fourier.ujf-
grenoble.fr/~mbrion/notes quivers rev.pdf, 2000.
[3] Darmajid, Variety Representasi Aljabar dan Variety Modul, Prosiding Seminar Nasional Aljabar
2011, 18-27, 2011.
[4] H. Kraft, Geometric Methods in Representation Theory, in: Representations of Algebras, Lecture
Notes in Math., 944, 180–258, 1982.
[5] C. Riedtmann, Degenerations for Representations of Quivers with Relations, Ann. Sci. École
Norm. Sup. 4, 275-301, 1986.

Darmajid
Algebra Research Division, Institut Teknologi Bandung.
e-mail: [email protected]

Intan Muchtadi-Alamsyah
Algebra Research Division, Institut Teknologi Bandung.
e-mail: [email protected]
Proceedings of ”The 6th SEAMS-UGM Conference 2011”
Algebra, pp. 145–158.

ON SETS RELATED TO CLONES OF QUASILINEAR OPERATIONS

Denecke, K. and Susanti, Y.

Abstract. For an arbitrary finite commutative group (A; ?), we are interested in defining
a particular subset L^n_{a,b} of the set O^n(A) of all n-ary operations on A. We study its
properties and the connection between this set and the n-clone Pol^n α_A of all n-ary
quasilinear operations in O^n(A). We are also interested in the properties of L^n_{a,b} for
some particular groups.


Keywords and Phrases: n-ary operations, semigroup, finite commutative group, clone of
quasilinear operations.

1. INTRODUCTION
Let A be an arbitrary set and let O(A) be the set of all operations on the set
A. A clone on the set A, i.e. a subset of O(A) that is closed under superposition
and contains all projections, has been widely studied by many authors, see e.g. [2], [3],
[4], [5], [6], [9], [10], [11], [12], [13]. An important clone is Pol α_A, consisting of all
operations which preserve α_A = {(u, v, w, z) | u, v, w, z ∈ A, u ? v = w ? z} for a group
(A; ?). Thus, if f is an n-ary operation for n ≥ 1 in Pol α_A and (u_i, v_i, w_i, z_i) ∈ α_A,
i = 1, . . . , n, then (f(u_1, . . . , u_n), f(v_1, . . . , v_n), f(w_1, . . . , w_n), f(z_1, . . . , z_n)) ∈ α_A, i.e.
f(u_1, . . . , u_n) ? f(v_1, . . . , v_n) = f(w_1, . . . , w_n) ? f(z_1, . . . , z_n). It is easy to see that

Pol α_A = ∪_{n=1}^{∞} {f ∈ O^n(A) | f(u_1 ? v_1, . . . , u_n ? v_n) ? f(e, . . . , e) = f(u_1, . . . , u_n) ? f(v_1, . . . , v_n)}

for the identity element e of the group. It is well-known that if (A; ?)
is an elementary abelian p-group for a prime number p and |A| = p^m, then Pol α_A
is a maximal clone on A ([12]). Now, consider Pol^n α_A = Pol α_A ∩ O^n(A), which we
call an n-clone on A. If (u_i, v_i, w_i, z_i) ∈ α_A, i = 1, . . . , n, then we have u ?_n v = w ?_n z
for u = (u_1, . . . , u_n), v = (v_1, . . . , v_n), w = (w_1, . . . , w_n) and z = (z_1, . . . , z_n), where ?_n is
defined by (x_1, . . . , x_n) ?_n (y_1, . . . , y_n) = (x_1 ? y_1, . . . , x_n ? y_n). Therefore for arbitrary

2010 Mathematics Subject Classification: 08A40, 08A62, 08A99


(ui , vi , wi , zi ) ∈ α, i = 1, . . . , n there exists a ∈ An such that u ?n v = a = w ?n z.


Conversely, for arbitrary u, w, a ∈ An there exist v, z ∈ An such that u?n v = w ?n z = a.
Furthermore, if f ∈ P oln αA , then f (u) ? f (v) = f (w) ? f (z). Thus there exists b ∈ A
such that f (u) ? f (v) = b = f (w) ? f (z). With this background, we are then interested
in the set of all f ∈ On (A) such that f (u) ? f (v) = b when u ?n v = a for arbitrary
a ∈ An and b ∈ A. For further investigation of the properties of this set, we recall the
following concepts.
Let (A; ?, ⁻¹, e) be an arbitrary finite commutative group with ⁻¹ as the inverse
operation and e as the identity element, and let |A| ≥ 2. For a natural number n ≥ 1
consider A^n, the binary fundamental operation ?_n, the unary operation corresponding to
⁻¹ and the nullary operation corresponding to e of the n-th direct power of (A; ?, ⁻¹, e).
Generally, in this paper we use x and x̂ for (x_1, . . . , x_n) and (x, . . . , x), respectively.
With this notation we have x̂ ?_n ŷ = (x ? y, . . . , x ? y) and (x^{-1}, . . . , x^{-1}) = the tuple x̂
with every entry inverted. Moreover, for a
unary operation π ∈ O^1(A) we define C_π := {f ∈ O^n(A) | f(x̂) = π(x), x ∈ A}. By this
definition, we have C_{π1} ∩ C_{π2} = ∅ for π1 ≠ π2. Thus, for every f ∈ O^n(A) there is a
unique πf ∈ O1 (A) such that f ∈ Cπf . We use id for the identity operation on A and
cny for constant n-ary operation, i.e. cny (x) = y for all x ∈ An and y ∈ A. It is clear that
cny ∈ Cc1y . Furthermore, let ∆An = {x̂|x ∈ A}. If f ∈ Cid , then f (x̂) = x and if f ∈ Cc1y ,
then f (x̂) = y for all x̂ ∈ ∆An (see [8]). Now, for the main aim of this paper, we define
Lna,b := {f ∈ On (A)|f (x) ? f (a ?n x−1 ) = b for all x ∈ An } for fixed a ∈ An and b ∈ A.
Lna,b could be the empty set but also it could be equal to On (A). Moreover, in general
P oln αA and Lna,b are not comparable with respect to inclusion. Before we come to the
properties of this set, recall that we obtain in the following way a semigroup of n-ary
operations on A.
Let O^n(A) be the set of all n-ary operations on A. On O^n(A) we define an operation
+ by f + g := f(g, . . . , g) for arbitrary f, g ∈ O^n(A), i.e. (f + g)(x) = f(g(x), . . . , g(x))
for every x ∈ A^n. The operation + is associative, giving a semigroup (O^n(A); +)
(see [1], [2], [6], [7] and [8]). By this definition, generally, if f ∈ C_{π1} and g ∈ C_{π2},
then f + g ∈ C_{π1 ◦ π2} for the composition operation ◦ in O^1(A). Therefore we have
f + g = c_y^n for every f ∈ C_{c_y^1} (see [8]). Moreover, it is clear that if C is an n-clone
on A, then (C; +) is a subsemigroup of (O^n(A); +). In particular, for A = {0, 1} we
have C_4^n = {f ∈ O^n(A) | f(0̂) = 0, f(1̂) = 1}, ¬C_4^n = {f ∈ O^n(A) | f(0̂) = 1, f(1̂) = 0},
K_0^n = {f ∈ O^n(A) | f(0̂) = f(1̂) = 0} and K_1^n = {f ∈ O^n(A) | f(0̂) = f(1̂) = 1}. Clearly,
C_4^n, ¬C_4^n, K_0^n and K_1^n are pairwise disjoint and O^n(A) = C_4^n ∪ ¬C_4^n ∪ K_0^n ∪ K_1^n. By c_0^n and
c_1^n we mean the constant operations with value 0 and 1, respectively. Moreover, the
operation + has the following properties:

f + g = g      if f ∈ C_4^n,
f + g = ¬g     if f ∈ ¬C_4^n,
f + g = c_0^n  if f ∈ K_0^n,
f + g = c_1^n  if f ∈ K_1^n

(for more details see [1], [2] and [6]).



We recall the definition of a four-part semigroup (see [2]). We use four non-empty,
finite and pairwise disjoint sets S_1 = {a_{11}, a_{12}, . . . , a_{1n_r}}, S_2 = {a_{21}, a_{22}, . . . , a_{2n_r}},
S_3 = {a_{31}, a_{32}, . . . , a_{3n_s}}, S_4 = {a_{41}, a_{42}, . . . , a_{4n_s}}, together with two fixed elements
a_* ∈ S_3 and a_{**} ∈ S_4, and define a binary operation ∗ on S = S_1 ∪ S_2 ∪ S_3 ∪ S_4 by

a_{ij} ∗ a_{lk} = a_{lk}           if a_{ij} ∈ S_1,
a_{ij} ∗ a_{lk} = a_{tk}           if a_{ij} ∈ S_2, where t = 2 if l = 1, t = 1 if l = 2,
                                                         t = 4 if l = 3, t = 3 if l = 4,
a_{ij} ∗ a_{lk} = a_*   ∈ S_3      if a_{ij} ∈ S_3,
a_{ij} ∗ a_{lk} = a_{**} ∈ S_4     if a_{ij} ∈ S_4.
The binary operation ∗ is well-defined, and it can be checked that it is associative,
giving us a semigroup (S; ∗) called a four-part semigroup (see [2]). The following theo-
rem gives a characterization of all four-part subsemigroups of the four-part semigroup
(On ({0, 1}); +).
Theorem 1.1. ([7]) A set S ⊆ On ({0, 1}) is the universe of a four-part subsemigroup
of (On ({0, 1}); +) if and only if
(i) S ∩ ¬C4n 6= ∅
(ii) S ∩ ¬C4n = ¬(S ∩ C4n ), S ∩ K1n = ¬(S ∩ K0n ) and
(iii) {cn0 } ⊆ S.
Remark 1.1. From the above characterization we know that if S1 , S2 ⊆ On ({0, 1}) both
form four-part semigroups, then S1 ∩ S2 also forms a four-part semigroup. Moreover, it
can be shown that S ⊆ On ({0, 1}) forms a four-part semigroup if and only if S ∩C4n 6= ∅,
cn0 ∈ S and ¬f ∈ S for all f ∈ S.
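The criterion in Remark 1.1 is easy to test mechanically for concrete sets of Boolean operations. A possible Python sketch (operations encoded as value tables over {0,1}^n; the example sets are made up for illustration) is the following.

from itertools import product

# Remark 1.1: S forms a four-part subsemigroup of (O^n({0,1}); +) iff S meets
# C_4^n, contains the constant c_0^n, and is closed under negation.
n = 2
inputs = list(product((0, 1), repeat=n))

def neg(table):                          # pointwise negation of an operation
    return tuple(1 - v for v in table)

c0 = tuple(0 for _ in inputs)            # the constant operation c_0^n
def in_C4(table):                        # f(0,...,0) = 0 and f(1,...,1) = 1
    return table[inputs.index((0,) * n)] == 0 and table[inputs.index((1,) * n)] == 1

def is_four_part(S):
    """Apply the criterion of Remark 1.1 to a set S of value tables."""
    return any(in_C4(t) for t in S) and c0 in S and all(neg(t) in S for t in S)

e1 = tuple(x[0] for x in inputs)         # the projection e_1^2, which lies in C_4^2
print(is_four_part({e1, neg(e1), c0, neg(c0)}))   # True
print(is_four_part({c0, neg(c0)}))                # False: the set misses C_4^2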
Recall also that a semigroup S = (S; ∗) is called a two-constant semigroup if there
are subsets S_1, S_2 of S such that S = S_1 ∪ S_2, S_1 ∩ S_2 = ∅, S_i ≠ ∅, i = 1, 2, and if there
are two fixed elements b_* ∈ S_1, b_{**} ∈ S_2 such that

a ∗ b = b_*   ∈ S_1  if a ∈ S_1,
a ∗ b = b_{**} ∈ S_2  if a ∈ S_2

(see [2]). The following theorem characterizes the two-constant subsemigroups of the
four-part semigroup (O^n({0, 1}); +).
Theorem 1.2. ([7]) A subset S ⊆ On ({0, 1}) is the universe of a two-constant sub-
semigroup of (On ({0, 1}); +) if and only if S ⊆ K0n ∪ K1n and {cn0 , cn1 } ⊆ S.

2. PROPERTIES OF L^n_{a,b} AND Pol^n α_A FOR AN ARBITRARY FINITE
COMMUTATIVE GROUP (A; ?)
Lemma 2.1. Let (A; ?, ⁻¹, e) be a finite commutative group and let n ≥ 1 be a natural
number. For every a, a′, â ∈ A^n and b, b′, y ∈ A, the following propositions are true.
(i) L^n_{a,b} ≠ ∅ if and only if x ?_n x ≠ a for all x ∈ A^n or there exists y ∈ A such that
b = y ? y.
(ii) L^n_{a,b} ∩ L^n_{a,b′} = ∅ if b ≠ b′.
(iii) If f ∈ L^n_{â,b} and g ∈ L^n_{a′,a} then f + g ∈ L^n_{a′,b}.
(iv) L^n_{a,b} contains the projection e_i^n if and only if a_i = b.
(v) L^n_{a,b} contains the constant element c_y^n if and only if b = y ? y.
(vi) If {x ? x | x ∈ A} = A, then there exists a unique constant element c_y^n ∈ O^n(A)
such that c_y^n ∈ L^n_{a,b}.
(vii) L^n_{â,b} ∩ C_{c_y^1} ≠ ∅ if and only if b = y ? y.
(viii) (∩_{a∈A} L^n_{â,a}) ∩ C_{c_y^1} = ∅ for all y ∈ A.

Proof: (i) Assume that there exists x ∈ An such that x ?n x = a, i.e. a ?n x−1 = x
and b 6= y ? y for all y ∈ A. Then for every f ∈ On (A) we have f (x) ? f (a ?n x−1 ) =
f (x) ? f (x) 6= b, i.e f 6∈ Lna,b . Thus Lna,b = ∅, a contradiction. Conversely, let x ?n x 6= a
for all x ∈ An . Then for every x ∈ An we have x 6= a ?n x−1 . Thus, we can choose f
such that f (x) = e and f (a ?n x−1 ) = b for every x ∈ An and have f ∈ Lna,b . If there
exist y ∈ A such that b = y ? y then cny (x) ? cny (a ?n x−1 ) = y ? y = b, i.e. cny ∈ Lna,b .
Thus Lna,b 6= ∅.
(ii) Let b, b0 ∈ A such that b 6= b0 . Assume that there exists f ∈ Lna,b ∩ Lna,b0 . Then for
all x ∈ An we obtain f (x) ? f (a ?n x−1 ) = b and f (x) ? f (a ?n x−1 ) = b0 and hence b = b0 ,
a contradiction.
(iii) Let f ∈ Lnâ,b and g ∈ Lna0 ,a . Then for every x ∈ An we obtain f (x) ? f (â ?n x−1 ) = b
and g(x) ? g(a0 ?n x−1 ) = a. Therefore

(f + g)(x) ? (f + g)(a′ ?_n x^{-1}) = f(g(x), . . . , g(x)) ? f(g(a′ ?_n x^{-1}), . . . , g(a′ ?_n x^{-1}))
= f(g(x), . . . , g(x)) ? f(a ? g(x)^{-1}, . . . , a ? g(x)^{-1})
= f(g(x), . . . , g(x)) ? f(â ?_n (g(x), . . . , g(x))^{-1})
= b
and hence f + g ∈ L^n_{a′,b}.
(iv) Let Lna,b contain the projection eni . Then for arbitrary x ∈ An we have eni (x) ?
eni (a ?n x−1 ) = b. On the other hand side we have eni (x) ? eni (a ?n x−1 ) = xi ? ai ? x−1
i =
ai and hence ai = b. Conversely, let ai = b. Then for arbitrary x ∈ An we get
eni (x) ? eni (a ?n x−1 ) = xi ? ai ? x−1
i = ai = b and thus eni ∈ Lna,b .
(v) Let cny ∈ Lna,b . Thus for every x ∈ An we have cny (x) ? cny (a ?n x−1 ) = b and
cny (x) ? cny (a ?n x−1 ) = y ? y and hence y ? y = b. Conversely, if y ? y = b, then
cny (x) ? cny (a ?n x−1 ) = y ? y = b, i.e. cny ∈ Lna,b .
(vi) Let a ∈ An and b ∈ A. By assumption, there exists a unique y ∈ A such that
y ? y = b and therefore for every x ∈ An we have cny (x) ? cny (a ?n x−1 ) = y ? y = b, i.e.
cny ∈ Lna,b .
(vii) Let L^n_{â,b} ∩ C_{c_y^1} ≠ ∅ and let f ∈ L^n_{â,b} ∩ C_{c_y^1}. Then for all x̂ ∈ ∆_{A^n} we obtain
f(x̂) ? f(â ?_n x̂^{-1}) = y ? y and f(x̂) ? f(â ?_n x̂^{-1}) = b. Thus b = y ? y. Conversely, let
b = y ? y. By (v) we have c_y^n ∈ L^n_{â,b} and therefore L^n_{â,b} ∩ C_{c_y^1} ≠ ∅.
(viii) is obvious by (vii).
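For a small group, the set L^n_{a,b} and several items of Lemma 2.1 can be checked by brute force directly from the definition. The following Python sketch uses the group Z_m with addition modulo m as an illustrative choice (not required by the paper) and enumerates L^1_{a,b} for Z_3; the printed counts are consistent with Lemma 2.1 (i) and (ii).

from itertools import product

# Brute-force enumeration of L^n_{a,b} for (Z_m; +): here a ?_n x^{-1} is
# simply a - x computed componentwise modulo m.
def L_set(m, n, a, b):
    points = list(product(range(m), repeat=n))            # the domain A^n
    members = []
    for values in product(range(m), repeat=len(points)):  # every f: A^n -> A
        f = dict(zip(points, values))
        if all((f[x] + f[tuple((ai - xi) % m for ai, xi in zip(a, x))]) % m == b
               for x in points):
            members.append(f)
    return members

# For (Z_3; +) and n = 1, every L^1_{a,b} is non-empty, and sets with the same
# a but different b are disjoint, in line with Lemma 2.1 (i) and (ii).
for a in range(3):
    print(a, [len(L_set(3, 1, (a,), b)) for b in range(3)])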

Now, let c ∈ A be arbitrary and ωc be a mapping from A to A defined by


ωc (x) = x−1 ? c for every x ∈ A. It is easy to see that ωc ωc (x) = x. Moreover, by c f
we mean n-ary operations on A mapping x ∈ An to ωc (f (x)), i.e. c f (x) = ωc (f (x)).
Then we have the following properties.
Proposition 2.1. Let (A; ?,−1 , e) be a finite commutative group and let n ≥ 1 be a
natural number. For arbitrary c ∈ A and f ∈ On (A) it follows c c f = f .

The proof is straight forward from definition.


Lemma 2.2. Let (A; ?,−1 , e) be a finite commutative group and let n ≥ 1 be a natural
number. Then for arbitrary a ∈ An and b, c ∈ A, the following propositions hold.
(i) f ∈ Lna,b if and only if c f ∈ Lna,ωc (b)?c .
(ii) Lna,b contains c f for all f ∈ Lna,b if and only if b ? b = c ? c.
(iii) If b ? b = c ? c, then b f, c f ∈ Lnb̂,b ∩ Lnĉ,c if and only if f ∈ Lnb̂,b ∩ Lnĉ,c .

Proof: (i) Let f ∈ Lna,b and x ∈ An . Then we have f (x) ? f (a ?n x−1 ) = b and
hence
c f (x) ? c f (a ?n x−1 ) = ωc (f (x)) ? ωc (f (a ?n x−1 ))
= (f (x))−1 ? c ? (f (a ?n x−1 ))−1 ? c
= (f (x) ? f (a ?n x−1 ))−1 ? c ? c
= b−1 ? c ? c
= ωc (b) ? c.
Thus c f ∈ Lna,ωc (b)?c . Conversely, let c f ∈ Lna,ωc (b)?c . Then c f (x)?c f (a?n x−1 ) =
ωc (b) ? c, i.e. ωc (f (x)) ? ωc (f (a ?n x−1 )) = b−1 ? c ? c. Since ωc ωc (x) = x for every x ∈ An
we obtain
f (x) ? f (a ?n x−1 ) = ωc (ωc (f (x))) ? ωc (ωc (f (a ?n x−1 )))
= (ωc (f (x)))−1 ? c ? (ωc (f (a ?n x−1 )))−1 ? c
= (ωc (f (x)) ? ωc (f (a ?n x−1 )))−1 ? c ? c
= (b−1 ? c ? c)−1 ? c ? c
= b
and hence f ∈ Lna,b .
(ii) Let Lna,b contain c f for all f ∈ Lna,b . Then for all f ∈ Lna,b and x ∈ An we have
f (x) ? f (a ?n x−1 ) = c f (x) ? c f (a ?n x−1 ) = b and therefore we have
b = c f (x) ? c f (a ?n x−1 )
= ωc (f (x)) ? ωc (f (a ?n x−1 ))
= (f (x))−1 ? c ? (f (a ?n x−1 ))−1 ? c
= (f (x) ? f (a ?n x−1 ))−1 ? c ? c
= b−1 ? c ? c,
i.e. b ? b = c ? c. Conversely, let b ∈ A. If b ? b = c ? c, then b = b−1 ? c ? c = ωc (b) ? c
and thus by (i), Lna,b contains c f for all f ∈ Lna,b .
(iii) Let b , c f ∈ Lnb̂,b ∩Lnĉ,c . Then by (ii) and Proposition 2.1, we get f = b b f ∈ Lnb̂,b
and f = c c f ∈ Lnĉ,c , i.e f ∈ Lnb̂,b ∩ Lnĉ,c . Conversely, let f ∈ Lnb̂,b ∩ Lnĉ,c . Then by (ii),

we get b f ∈ Lnb̂,b , b f ∈ Lnĉ,c , c f ∈ Lnb̂,b and c f ∈ Lnĉ,c . Thus b f, c f ∈ Lnb̂,b ∩ Lnĉ,c .

The following result gives a necessary and sufficient condition for Lna,b to be an
n-clone on A.
Theorem 2.1. Let (A; ?,−1 , e) be a finite commutative group and let n ≥ 1 be a natural
number. For every a ∈ An and b ∈ A the following propositions are equivalent.
(i) Lna,b contains all projections.
(ii) Lna,b is an n-clone on A.
(iii) a = b̂.
Proof: (i) ⇔ (iii) is clear by Lemma 2.1 (iv).
(ii)⇔ (iii) Let Lna,b be an n-clone on A. Then Lna,b contains all projections. Therefore
a = b̂. Conversely, let a = b̂, i.e. Lna,b = Lnb̂,b . Thus by Lemma 2.1 (iv), Lna,b contains
all projections eni , i ∈ {1, . . . , n}. Moreover, let f, g1 , . . . , gn be in Lna,b = Lnb̂,b . Then
gi (x) ? gi (b̂ ?n x−1 ) = b, i.e. gi (b̂ ?n x−1 ) = b ? gi (x)−1 for all i = 1, . . . , n. Therefore for
arbitrary x ∈ An and g(x) = (g1 (x), . . . , gn (x)) we have
f (g1 , . . . , gn )(x) ? f (g1 , . . . , gn )(b̂ ?n x−1 )
= f (g1 (x), . . . , gn (x)) ? f (g1 (b̂ ?n x−1 ), . . . , gn (b̂ ?n x−1 ))
= f (g1 (x), . . . , gn (x)) ? f (b ? g1 (x)−1 , . . . , b ? gn (x)−1 )
= f (g1 (x), . . . , gn (x)) ? f (b̂ ?n (g1 (x)−1 , . . . , gn (x)−1 ))
= f (g(x)) ? f (b̂ ?n (g(x))−1 )
= b.
Therefore Lna,b is an n-clone on A.
By the definition of the operation + on On (A), it is clear that if C is an n-clone
on A, then (C; +) is a subsemigroup of (On (A); +). In the following part we will prove
some results on semigroups in On (A) related to Lna,b . A direct consequence of Theorem
2.1 is the following proposition.
Proposition 2.2. Let (A; ?,−1 , e) be a finite commutative group. For every a, b ∈ A
the following four propositions are equivalent.
(i) L1a,b forms a subsemigroup in (O1 (A); ◦).
(ii) L1a,b contains the identity operation.
(iii) L1a,b is a 1-clone.
(iv) a=b.
Proposition 2.3. Let (A; ?,−1 , e) be a finite commutative group and let n ≥ 1 be a
natural number. For every a ∈ An and a, b, c, y ∈ A the following four propositions are
true.
(i) (Lnâ,a ; +) is a subsemigroup of (On (A); +).
(ii) (Lna,y?y ∩ Cc1y ; +) is a subsemigroup of (On (A); +) which is not an n-clone.
(iii) If b ? b = c ? c, then (Lnb̂,b ; +) is a subsemigroup of (On (A); +) containing c f
for all f ∈ Lnb̂,b .

Proof: (i) is clear by Theorem 2.1.


(ii) By Lemma 2.1 (iv), cny ∈ Lna,y?y and since cny ∈ Cc1y then Lna,y?y ∩ Cc1y 6= ∅. Let f
and g be in Lna,y?y ∩ Cc1y . Then for all x ∈ An we have (f + g)(x) = f (g(x))d = y, i.e.
n n
f + g = cy ∈ La,y?y ∩ Cc1y . Moreover, since Cid ∩ Cc1y = ∅ and ei ∈ Cid , then Lna,y?y ∩ Cc1y
n

does not contain the projections and thus it is not an n-clone.


(iii) is clear by (i) and Lemma 2.2 (ii).
A necessary and sufficient condition for Lna,b to form a subsemigroup of (On (A); +)
will be given in Theorem 2.2. For proving this theorem we need the following lemma.
Lemma 2.3. Let (A; ?,−1 , e) be a finite commutative group and let n ≥ 2 be a natural
number. For every a ∈ An and b ∈ A, if Lna,b 6= ∅, then there exists g ∈ Lna,b such that
Img = A.
Proof: We count the number p of unordered pairs (y, y 0 ) such that y ? y 0 = b and
the number q of unordered pairs of (x, x0 ) such that x ?n x0 = a. For y, y 0 ∈ A we have
y = y 0 if and only if b ? y −1 = b ? y 0−1 . Hence for x, x0 ∈ An we have x = x0 ∈ An if
and only if a ?n x−1 = a ?n x0−1 . This means that each y ∈ A and each x ∈ An occur
in a unique pair. Therefore we have ⌈m/2⌉ ≤ p ≤ m and ⌈m^n/2⌉ ≤ q ≤ m^n, where m = |A|.
Moreover, for all m ≥ 2 and n ≥ 2 we have ⌈m^n/2⌉ ≥ m. Furthermore, if L^n_{a,b} ≠ ∅, then by Lemma
2.1 (i), x ?n x 6= a for every x ∈ An or there exists y ∈ A such that b = y ? y. Now,
we define g : An → A as follows: take the first two unordered pairs (x1 , x01 ) and (y1 , y10 )
and then map x1 to y1 and x01 to y10 . Now, take the second two unordered pairs (x2 , x02 )
and (y2 , y20 ) which are different from the first ones and then map x2 to y2 and x02 to y20 .
Continue until all unordered pairs y and y 0 satisfying y ∗ y 0 = b were considered. For
each x and x0 of the remaining unordered pairs, map x to b and x0 to e if a 6= x ?n x
for every x ∈ An and map both x and x0 to y if there exists y ∈ A such that y ? y = b.
Then it is easy to see that g ∈ Lna,b and Img = A.
Theorem 2.2. Let (A; ?,−1 , e) be a finite commutative group and let n ≥ 2 be a natural
number. For every a ∈ An and b ∈ A, if Lna,b 6= ∅, then (Lna,b ; +) is a subsemigroup of
(On (A); +) if and only if πf ∈ L1b,b for all f ∈ Lna,b .

Proof: ⇒ Let Lna,b form a subsemigroup of (On (A); +). Let f ∈ Lna,b and y ∈ A.
Then we can find πf ∈ O1 (A) such that πf (x) = f (x̂) for every x ∈ A. Moreover, by
Lemma 2.3 we can find g ∈ Lna,b and x ∈ An such that y = g(x). Therefore we have
g(x) ? g(a ?n x−1 ) = b, i.e. g(a ?n x−1 ) = b ? (g(x))−1 = b ? y −1 and f + g ∈ Lna,b , i.e.
(f + g)(x) ? (f + g)(a ?n x−1 ) = b. Hence
π_f(y) ? π_f(b ? y^{-1}) = π_f(y) ? π_f(g(a ?_n x^{-1}))
= f(ŷ) ? f(g(a ?_n x^{-1}), . . . , g(a ?_n x^{-1}))
= f(g(x), . . . , g(x)) ? f(g(a ?_n x^{-1}), . . . , g(a ?_n x^{-1}))
= (f + g)(x) ? (f + g)(a ?_n x^{-1})
= b,
i.e. π_f ∈ L^1_{b,b}.
⇐ Let f, g ∈ Lna,b and x ∈ An . By assumption we have πf ∈ L1b,b , i.e. πf (y) ? πf (b ?

y −1 ) = b for all y ∈ A and g(x) ? g(a ?n x−1 ) = b, i.e. g(a ?n x−1 ) = b ? (g(x))−1 = b ? y −1
for y = g(x). Therefore
(f + g)(x) ? (f + g)(a ?_n x^{-1}) = f(g(x), . . . , g(x)) ? f(g(a ?_n x^{-1}), . . . , g(a ?_n x^{-1}))
= f(y, . . . , y) ? f(b ? y^{-1}, . . . , b ? y^{-1})
= π_f(y) ? π_f(b ? y^{-1})
= b,
i.e. f + g ∈ Lna,b and hence (Lna,b ; +) is a subsemigroup of (On (A); +). This completes
the proof.
Now, we come to some properties of P oln αA and the connection between Lna,b
and P oln αA .
Lemma 2.4. Let (A; ?,−1 , e) be a finite commutative group and let n ≥ 1 be a natural
number. Let c ∈ A and f ∈ On (A). Then f ∈ P oln αA if and only if c f ∈ P oln αA .
Proof: Let f ∈ P oln αA . For arbitrary (ui , vi , wi , zi ) ∈ αA , i = 1, 2, . . . , n, put
u = (u1 , . . . , un ), v = (v1 , . . . , vn ), w = (w1 , . . . , wn ) and z = (z1 , . . . , zn ). Then we
have f (u) ? f (v) = f (w) ? f (z). By c f (x) = ωc (f (x)) = (f (x))−1 ? c we get
c f (u) ? c f (v) = ωc (f (u)) ? ωc (f (v))
= (f (u))−1 ? c ? (f (v))−1 ? c
= (f (u) ? f (v))−1 ? c ? c
= (f (w) ? f (z))−1 ? c ? c
= (f (w))−1 ? c ? (f (z))−1 ? c
= ωc (f (w)) ? ωc (f (z))
= c f (w) ? c f (z).
Hence c f ∈ P oln αA . Conversely, let c f ∈ P oln αA and (ui , vi , wi , zi ) ∈ αA , i =
1, 2, . . . , n. Then c f (u) ? c f (v) = c f (w) ? c f (z), i.e. ωc (f (u)) ? ωc (f (v)) =
ωc (f (w)) ? ωc (f (z)). Using the properties ωc ωc (x) = x for all x ∈ A and ωc (f (x)) =
(f (x))−1 ? c we obtain
f (u) ? f (v) = ωc ωc (f (u)) ? ωc ωc (f (v))
= (ωc (f (u)))−1 ? c ? (ωc (f (v)))−1 ? c
= (ωc (f (u)) ? ωc (f (v)))−1 ? c ? c
= (ωc (f (w)) ? ωc (f (z)))−1 ? c ? c
= (ωc (f (w)))−1 ? c ? (ωc (f (z)))−1 ? c
= ωc ωc (f (w)) ? ωc ωc (f (z))
= f (w) ? f (z),
i.e. f ∈ P oln αA .
The following results show the connection between the sets of Lna,b and P oln αA .
Lemma 2.5. Let (A; ?, ⁻¹, e) be a finite commutative group, let n ≥ 1 be a natural
number and let a ∈ A^n be arbitrary. Then Pol^n α_A ⊆ ∪_{b∈A} L^n_{a,b}.

Proof: Let f ∈ Pol^n α_A and let a ∈ A^n. Then for all u, w ∈ A^n we can find
(u_i, v_i, w_i, z_i) ∈ α_A, i = 1, 2, . . . , n, such that u ?_n v = w ?_n z = a, i.e. v = a ?_n u^{-1}
and z = a ?_n w^{-1} for u = (u_1, . . . , u_n), v = (v_1, . . . , v_n), w = (w_1, . . . , w_n) and
z = (z_1, . . . , z_n). Since f ∈ Pol^n α_A we have f(u) ? f(v) = f(w) ? f(z). Hence
f(u) ? f(a ?_n u^{-1}) = f(u) ? f(v) = f(w) ? f(z) = f(w) ? f(a ?_n w^{-1}). Therefore there
exists b ∈ A such that f(u) ? f(a ?_n u^{-1}) = b for all u ∈ A^n, i.e. f ∈ L^n_{a,b} and thus
f ∈ ∪_{b∈A} L^n_{a,b}.

Theorem 2.3. Let (A; ?, ⁻¹, e) be a finite commutative group and let n ≥ 1 be a natural
number. Then Pol^n α_A = ∩_{a∈A^n} ∪_{b∈A} L^n_{a,b}.

Proof: (⊆) By Lemma 2.5, Pol^n α_A ⊆ ∪_{b∈A} L^n_{a,b} for all a ∈ A^n. Therefore
Pol^n α_A ⊆ ∩_{a∈A^n} ∪_{b∈A} L^n_{a,b}.
(⊇) Let f ∈ ∩_{a∈A^n} ∪_{b∈A} L^n_{a,b}. Now, let (u_i, v_i, w_i, z_i) ∈ α_A, i = 1, 2, . . . , n. Putting
a = (u_1 ? v_1, . . . , u_n ? v_n) ∈ A^n, we have v = a ?_n u^{-1} and z = a ?_n w^{-1} for
u = (u_1, . . . , u_n), v = (v_1, . . . , v_n), w = (w_1, . . . , w_n) and z = (z_1, . . . , z_n). It is clear that
f ∈ ∪_{b∈A} L^n_{a,b}. Therefore there exists b ∈ A such that f ∈ L^n_{a,b}, i.e. f(u) ? f(a ?_n u^{-1}) = b
and f(w) ? f(a ?_n w^{-1}) = b. Hence f(u) ? f(v) = f(u) ? f(a ?_n u^{-1}) = b = f(w) ? f(a ?_n
w^{-1}) = f(w) ? f(z), and thus f ∈ Pol^n α_A.

3. PROPERTIES OF L^n_{a,b} FOR PARTICULAR FINITE COMMUTATIVE GROUPS
In this section we will study some properties of the set Lna,b for elementary abelian
p-groups, i.e. abelian groups in which all non-identity elements have order p for prime
numbers p and then for the group (Zm = Z/mZ; +) for arbitrary natural numbers
m ≥ 2.
Lemma 3.1. Let (A; ?,−1 , e) be an elementary abelian 2-group (|A| = 2m ) and let n ≥ 1
be a natural number. For every a, â ∈ An and a, b, y ∈ A the following propositions are
satisfied.
(i) Lnê,e = On (A) and Lnê,b = ∅ for all b 6= e.
(ii) Lna,b contains c f for all f ∈ Lna,b and for all c ∈ A.
(iii) Lna,b contains some constant elements if and only if b = e. Moreover, Lna,e
contains all constant elements of On (A).
(iv) Lnâ,b ∩ Cc1y 6= ∅ if and only if b = e.
(v) If Lna,b ∩ Cc1y 6= ∅, then (Lna,b ∩ Cc1y ; +) is a subsemigroup of (On (A); +) if and
only if b = e. T
Lnâ,a if and only if f ∈
T n
(vi) {c f |c ∈ A} ⊆ Lâ,a .
a∈A a∈A
(vii) {eni |i = 1, 2, . . . , n} ∪ ( {c eni |i = 1, 2, . . . , n}) ⊆
S T n
Lâ,a
c∈A a∈A

Proof: (i) Let (A; ?,−1 , e) be a 2-group, i.e. x−1 = x for all x ∈ A. Then for
every f ∈ On (A) we have f (x) ? f (ê ? x−1 ) = f (x) ? f (ê ? x) = f (x) ? f (x) = e. Therefore
f ∈ Lnê,e , i.e. On (A) ⊆ Lnê,e and thus Lnê,e = On (A). Moreover, by Lemma 2.1 (ii), we

get Lnê,b = ∅ for all b 6= e.


(ii) Let b, c ∈ A. Then b ? b = c ? c and hence by Lemma 2.2 (ii), Lna,b contains c f for
all f ∈ Lna,b .
(iii) By the fact that b ? b = e for all b ∈ A and Lemma 2.1 (v).
(iv) By Lemma 2.1 (vii), Lnâ,b ∩ Cc1y 6= ∅ if and only if b = y ? y. Since y ? y = e for all
y ∈ A, we have Lnâ,b ∩ Cc1y 6= ∅ if and only if b = e.
(v) Let Lna,b ∩ Cc1y 6= ∅ be a semigroup. Then cny ∈ Cc1y must be in Lna,b and thus by (iii),
b = e. Conversely, let b = e. Since x ? x = e for all x ∈ A then by Proposition 2.3 (ii),
Lna,b ∩ Cc1y is a subsemigroup of (On (A); +).
(vi) By Lemma 2.2 (iii) and the assumption.
(vii) By Lemma 2.1 (iv), it is clear that {e_i^n | i = 1, 2, . . . , n} ⊆ ∩_{a∈A} L^n_{â,a}. Moreover,
by the fact that ∪_{c∈A} {c e_i^n | i = 1, 2, . . . , n} = ∪_{i=1}^{n} {c e_i^n | c ∈ A} and by (vi), we have
{e_i^n | i = 1, 2, . . . , n} ∪ (∪_{c∈A} {c e_i^n | i = 1, 2, . . . , n}) ⊆ ∩_{a∈A} L^n_{â,a}.
Let p > 2 be a prime number and let (A; ?, ⁻¹, e) be a commutative p-group. It
Let p > 2 be a prime number and let (A; ?, , e) be a commutative p-group. It
is clear that p + 1 is an even natural number and thus if x ? x = y ? y, then x = e ? x =
xp ? x = xp+1 = y p+1 = y p ? y = e ? y = y. Therefore x ? x = y ? y if and only if x = y,
i.e {x ? x|x ∈ A} = A and hence we have the following properties.

Proposition 3.1. Let n ≥ 1 be a natural number, p > 2 be a prime number and


let (A; ?,−1 , e) be a commutative p-group. For a ∈ An and b, c, y ∈ A the following
propositions are true.

(i) Lna,b 6= ∅.
(ii) Lna,b contains a constant element cny for a unique y ∈ A.
(iii) Lna,b ∩ Cc1y forms a semigroup for a unique y ∈ A.
(iv) If f ∈ Lna,b , then c f ∈ Lna,b if and only if b = c.
(v) If f ∈ ∩_{a∈A} L^n_{â,a}, then f ∈ C_id.

Proof: (i) By the fact that {x ? x|x ∈ A} = A for a p-group A and Lemma 2.1
(i).
(ii) By assumption and by Lemma 2.1 (vi).
(iii) By (ii), there is a unique cny in Lna,b and hence Lna,b ∩ Cc1y 6= ∅. Moreover, for all
f, g ∈ Lna,b ∩ Cc1y we have f + g = cny ∈ Lna,b ∩ Cc1y , i.e. Lna,b ∩ Cc1y forms a subsemigroup
of (On (A); +).
(iv) Let f ∈ L^n_{a,b}. By assumption, b ? b = c ? c if and only if b = c. Therefore, by Lemma
2.2 (ii), c f ∈ L^n_{a,b} if and only if b ? b = c ? c, if and only if b = c.
(v) Let f ∈ ∩_{a∈A} L^n_{â,a}. Assume that f ∉ C_id, i.e. there exists x̂ ∈ ∆_{A^n} such that
f(x̂) ≠ x. Then for â ∈ ∆_{A^n} with a = x ? x (equivalently â = x̂ ?_n x̂) we get
f(x̂) ? f(â ?_n x̂^{-1}) = f(x̂) ? f(x̂) ≠ x ? x = a. Therefore f ∉ L^n_{â,a}, a contradiction.

Recall that for the group (Z_m; +), where + is the usual addition modulo m and
A = Z_m, by f̄ for f ∈ O^n(A) we mean the n-ary operation on A mapping x ∈ A^n to
ω(f(x)) with ω(x) = m − 1 − x ([8]).
The following propositions are true for (Z/mZ; +).
Proposition 3.2. Let n ≥ 1, m ≥ 2 be two natural numbers and let A = Z_m. For every
a ∈ A^n and b ∈ A the following propositions hold.
(i) If m is odd, then L^n_{a,b} ≠ ∅.
(ii) f ∈ L^n_{a,b} if and only if f̄ ∈ L^n_{a,−b−2}.
(iii) L^n_{a,b} contains f̄ for all f ∈ L^n_{a,b} if and only if 2b = −2.
(iv) If 2b = −2, then (L^n_{â,a}; +) is a semigroup containing f̄ for all f ∈ L^n_{â,a}.
(v) If m is even, then L^n_{a,b} contains some constant element c_i^n if and only if b is
even. Moreover, in that case L^n_{a,b} contains exactly two constants.
(vi) If m is odd, then L^n_{a,b} contains a constant element c_i^n, i ∈ A.

Proof: (i) and (vi) Let m be odd. It is easy to check that {a + a|a ∈ Zm } = Zm .
Applying Lemma 2.1 (i) we have Lna,b 6= ∅ for all a ∈ An = Zm n
and b ∈ A = Zm . Thus
we have (i). Moreover, by Lemma 2.1 (vi), we obtain (vi).
(ii), (iii) and (iv) Putting c = m − 1, we see that f̄ is equal to c f. Then applying Lemma
2.2 (i), Lemma 2.2 (ii) and Proposition 2.3 (iii), we obtain (ii), (iii) and (iv), respectively.
(v) Let m be even. Then by Lemma 2.1 (v), it is clear that L^n_{a,b} contains some constant
element c_i^n if and only if b is even. Moreover, for every even number b, 0 ≤ b ≤ m − 1,
the numbers x_b = b/2 and y_b = (m + b)/2 are the only elements of A = Z_m satisfying
x_b + x_b = y_b + y_b = b. Therefore c^n_{x_b}(x) + c^n_{x_b}(a −_n x) = x_b + x_b = b and
c^n_{y_b}(x) + c^n_{y_b}(a −_n x) = y_b + y_b = b for every x ∈ A^n = Z_m^n, i.e. c^n_{x_b}, c^n_{y_b} ∈ L^n_{a,b}.

4. PROPERTIES OF L^n_{a,b} AND Pol^n α_A FOR A = {0, 1}

In this section we consider A = {0, 1} and the 2-group ({0, 1}; +), where + is
addition modulo 2, since in this case our sets Lna,b are sets of Boolean operations and
Boolean operations play an important role in many applications.
Proposition 4.1. For A = {0, 1} the following propositions are true.
(i) L2a,0 forms a subsemigroup of (O2 ({0, 1}); +) for all a ∈ A2 .
(ii) L2a,1 does not form a subsemigroup of (O2 ({0, 1}); +) for all (1, 1) 6= a ∈ A2 .

Proof: Let c_0^2, c_1^2, e_1^2, e_2^2, ¬e_1^2, ¬e_2^2, f_+, ¬f_+ be the following binary operations on
{0, 1}:

         c_0^2  c_1^2  e_1^2  e_2^2  ¬e_1^2  ¬e_2^2  f_+  ¬f_+
(0, 0)     0      1      0      0      1       1      0     1
(0, 1)     0      1      0      1      1       0      1     0
(1, 0)     0      1      1      0      0       1      1     0
(1, 1)     0      1      1      1      0       0      0     1

By simple counting we get precisely the following sets L^2_{a,b}:

L^2_{(0,0),0} = O^2({0, 1})                        L^2_{(0,0),1} = ∅
L^2_{(0,1),0} = {c_0^2, c_1^2, e_1^2, ¬e_1^2}      L^2_{(0,1),1} = {e_2^2, ¬e_2^2, f_+, ¬f_+}
L^2_{(1,0),0} = {c_0^2, c_1^2, e_2^2, ¬e_2^2}      L^2_{(1,0),1} = {e_1^2, ¬e_1^2, f_+, ¬f_+}
L^2_{(1,1),0} = {c_0^2, c_1^2, f_+, ¬f_+}          L^2_{(1,1),1} = {e_1^2, e_2^2, ¬e_1^2, ¬e_2^2}.

Then (i) and (ii) are clear.
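The eight sets L^2_{a,b} listed above can be recovered by brute force from the definition; the following Python sketch (operations encoded as value tables over {0,1}^2) counts the members of each L^2_{a,b} and reproduces the cardinalities in the list.

from itertools import product

# Over A = {0,1} with addition modulo 2, a ?_2 x^{-1} is just a + x, so
# f lies in L^2_{a,b} iff f(x) + f(a + x) = b for every x in A^2.
inputs = list(product((0, 1), repeat=2))

def in_L(table, a, b):
    f = dict(zip(inputs, table))
    return all((f[x] + f[((a[0] + x[0]) % 2, (a[1] + x[1]) % 2)]) % 2 == b
               for x in inputs)

for a in inputs:
    for b in (0, 1):
        members = [t for t in product((0, 1), repeat=4) if in_L(t, a, b)]
        print(a, b, len(members))
# One finds |L^2_{(0,0),0}| = 16 (all of O^2), |L^2_{(0,0),1}| = 0, and
# |L^2_{a,b}| = 4 in the remaining cases, matching the eight sets listed above.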
For A = {0, 1}, Pol^2 α_A = {c_0^2, c_1^2, e_1^2, e_2^2, ¬e_1^2, ¬e_2^2, f_+, ¬f_+}. Considering the list
of all sets L^2_{a,b} we see that these sets are subsets of Pol^2 α_A for all a ≠ (0, 0). But in
general, this is not true, as we can see in the following example.
Example 4.1. Let A = {0, 1} and n = 3. We define a ternary operation on A as
follows:
f: (0, 0, 0) → 0
(0, 0, 1) → 1
(0, 1, 0) → 0
(0, 1, 1) → 1
(1, 0, 0) → 1
(1, 0, 1) → 1
(1, 1, 0) → 0
(1, 1, 1) → 0
f belongs to L3(0,1,1),1 but is not a quasilinear operation, since f ((0, 0, 1) + (1, 0, 0)) +
f (0, 0, 0) = f (1, 0, 1) + f (0, 0, 0) = 1 + 0 = 1 but f (0, 0, 1) + f (1, 0, 0) = 1 + 1 = 0.
Therefore L3(0,1,1),1 6⊆ P ol3 α.
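Example 4.1 can be verified mechanically: one checks membership in L^3_{(0,1,1),1} from the definition and tests quasilinearity by running over all pairs of arguments. A short Python sketch doing exactly this is given below.

from itertools import product

# Verification of Example 4.1: f lies in L^3_{(0,1,1),1} but violates
# quasilinearity, i.e. f(u + v) + f(0,0,0) = f(u) + f(v) fails for some u, v
# (all sums modulo 2).
f = {(0,0,0): 0, (0,0,1): 1, (0,1,0): 0, (0,1,1): 1,
     (1,0,0): 1, (1,0,1): 1, (1,1,0): 0, (1,1,1): 0}
a, b = (0, 1, 1), 1
points = list(product((0, 1), repeat=3))
add = lambda u, v: tuple((ui + vi) % 2 for ui, vi in zip(u, v))

in_L = all((f[x] + f[add(a, x)]) % 2 == b for x in points)   # a - x = a + x over Z_2
quasilinear = all((f[add(u, v)] + f[(0, 0, 0)]) % 2 == (f[u] + f[v]) % 2
                  for u in points for v in points)
print(in_L, quasilinear)    # expected output: True False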

Lemma 4.1. Let A = {0, 1} and let n ≥ 1 be a natural number. Then the following
propositions hold:
(i) Ln0̂,0 = On (A) and Ln0̂,1 = ∅,
(ii) For all a ∈ An and b ∈ A the set Lna,b contains ¬f whenever f ∈ Lna,b ,
(iii) If a ∈ An and 0̂ 6= a 6= 1̂, then Lna,1 ∩(K0n \{cn0 }) 6= ∅ and Lna,1 ∩(K1n \{cn1 }) 6= ∅.

Proof: Let A = {0, 1}.


(i) is clear by Lemma 3.1 (i).
(ii) The unary operation ¬ on A = {0, 1} can be considered as c with c = 1. Then by
Lemma 3.1 (ii), Lna,b contains ¬f whenever f ∈ Lna,b .
(iii) Let 0̂ 6= a 6= 1̂. Then 0̂ 6= a − 1̂ and a − 0̂ 6= 1̂. Thus we can find two elements
f, g ∈ Lna,1 such that f (0̂) = 0, f (a − 0̂) = 1, f (1̂) = 0 and f (a − 1̂) = 1 and g(0̂) = 1,
g(a − 0̂) = 0, g(1̂) = 1 and g(a − 1̂) = 0. It is clear that cn0 6= f ∈ K0n and cn1 6= g ∈ K1n .
Therefore Lna,1 ∩ (K0n \ {cn0 }) 6= ∅ and Lna,1 ∩ (K1n \ {cn1 }) 6= ∅.
Proposition 4.2. Let A = {0, 1} and let n ≥ 1 be a natural number. For every a ∈ An
the following propositions are satisfied.
(i) If a 6= 1̂, then Lna,0 forms a four-part semigroup in (On ({0, 1}); +).
(ii) If a = 1̂, then Lna,0 = C ∪ ¬C for some cn0 ∈ C ⊆ K0n , i.e Lna,0 forms a two-
constant semigroup in (On ({0, 1}); +) containing all negation of its elements.

(iii) Lna,1 forms a subsemigroup of (On ({0, 1}); +) if and only if a = 1̂. Moreover,
Ln1̂,1 = C ∪ ¬C for some C ⊆ C4n .

Proof: Let A = {0, 1} and a ∈ An . By Lemma 4.1 (ii), Lna,b contains ¬f for all
f∈ Lna,b . Moreover, by Lemma 3.1 (iii), Lna,0 contains the two constants cn0 and cn1 .
(i) If a 6= 1̂ then by Lemma 2.1 (iv), Lna,0 contains the projection eni whenever ai = 0.
Since eni (0̂) = 0 and eni (1̂) = 1 we have eni ∈ C4n and thus Lna,0 ∩ C4n 6= ∅. Therefore we
have cn0 , cn1 ∈ Lna,0 , ¬(Lna,0 ∩ C4n ) = Lna,0 ∩ ¬C4n 6= ∅ and ¬(Lna,0 ∩ K0n ) = Lna,0 ∩ K1n and
hence by Theorem 1.1, (Lna,0 ; +) is a four-part semigroup.
(ii) Now, let a = 1̂, i.e. L^n_{a,0} = L^n_{1̂,0}. We show that L^n_{1̂,0} ∩ C_4^n = ∅ and L^n_{1̂,0} ∩ ¬C_4^n = ∅.
Assume that there is f ∈ L1̂,0 ∩ C4n or f ∈ L1̂,0 ∩ ¬C4n . Then we have f (0̂) = 0 and
f (1̂) = 1 or f (0̂) = 1 and f (1̂) = 0. Therefore f (0̂) + f (1̂ −n 0̂) = f (0̂) + f (1̂) = 1
and hence f 6∈ Ln1̂,0 , a contradiction. Thus Ln1̂,0 ⊆ K0n ∪ K1n such that cn0 , cn1 ∈ Ln1̂,0
and Ln1̂,0 contains the negations of its elements, i.e. Ln1̂,0 = C ∪ ¬C for some cn0 ∈ C ⊆
K0n . Therefore by Theorem 1.2, Ln1̂,0 forms a two-constant semigroup containing the
negations of its elements.
(iii) Let Lna,1 form a semigroup. Assume that a 6= 1̂. By Lemma 4.1 (i), â 6= 0̂ and
by Lemma 3.1 (iii), Lna,1 does not contain any constant element. By Lemma 4.1 (iii),
Lna,1 ∩ (K0n \ {cn0 }) 6= ∅ and hence f + g = cn0 6∈ Lna,1 , for every cn0 6= f ∈ Lna,1 ∩ K0n , g ∈
Lna,1 , a contradiction. Thus a = 1̂. Conversely, if a = 1̂, then by Proposition 2.3 (i),
Lna,1 = Ln1̂,1 forms a subsemigroup of (On ({0, 1}); +). Moreover, by Lemma 3.1 (iii),
L^n_{1̂,1} contains neither c_0^n nor c_1^n. Hence L^n_{1̂,1} ∩ K_0^n and L^n_{1̂,1} ∩ K_1^n must be empty
sets. Furthermore, by Lemma 4.1 (ii), Ln1̂,1 contains all the negations of its elements.
Therefore Ln1̂,1 = C ∪ ¬C for some C ⊆ C4n .

Theorem 4.1. Let A = {0, 1} and let n ≥ 1. Then (P oln αA ; +) is a four-part sub-
semigroup of (On (A); +).
Proof: Each n-ary projection eni belongs to P oln αA . Thus P oln αA ∩ C4n 6= ∅.
Moreover, the constant cn0 belongs to P oln αA . The operation ¬ corresponds to c for
c = 1 and then by Lemma 2.4 we have ¬f ∈ P oln αA whenever f ∈ P oln αA . Therefore
by Theorem 1.1, (P oln αA ; +) is a four-part semigroup.

References
[1] Butkote, R., Denecke, K., Semigroup Properties of Boolean Operations, Asian-Eur. J. Math,
Vol. 1, No. 2, 157–176, 2008.
[2] Butkote, R., Universal-algebraic and Semigroup-theoretical Properties of Boolean Operations,
Dissertation, Universität Potsdam, 2009.
[3] Denecke, K., Lau, D., Pöschel, R. and Schweigert, D., Hyperidentities, Hyperequational
Classes and Clone Congruences, Contributions to General Algebra 7, 97-118, Verlag Hölder-
Pichler-Tempsky, Wien, 1991.
[4] Denecke, K., Wismath, S. L., Hyperidentities and Clones, Gordon and Breach Science Publisher,
2000.
[5] Denecke, K., Wismath, S. L., Universal Algebra and Applications in Theoretical Computer
Science, Chapman and Hall, 2002.

[6] Denecke, K., Wismath, S. L., Universal Algebra and Coalgebra, World Scientific, 2009.
[7] Denecke, K., Susanti, Y., Semigroup-theoretical Properties of Boolean Operations, —–, 2009
(submitted).
[8] Denecke, K., Susanti, Y., Semigroups of n-ary Operations on Finite Sets,—–, 2010 (submitted).
[9] Fearnley, A., Clones on Three Elements Preserving a Binary Relation, Algebra Universalis 56,
165-177, 2007.
[10] Lau, D., Function Algebras on Finite Sets, Springer, 2006.
[11] Pöschel, R., Kalužnin, L. A., Funktionen- und Relationenalgebren, VEB Deutscher Verlag der
Wissenschaften, Berlin, 1979.
[12] Rosenberg, I. G., Über die Funktionale Vollständigkeit in den Mehrwertigen Logiken, Rozpravy
Československé Akad. věd, Ser. Math. nat. Sci. 80, 3-93, 1970.
[13] Szendrei, Á., Clones in Universal Algebra, Les Presses de L’ Université de Montréal, 1986.

Denecke, K.
Universität Potsdam, Am Neuen Palais 10, 14469 Potsdam Deutschland.
e-mail: [email protected]

Susanti, Y.
Universitas Gadjah Mada, FMIPA UGM Sekip Utara Bulaksumur 55281 Yogyakarta Indonesia.
e-mail: [email protected]
Proceedings of ”The 6th SEAMS-UGM Conference 2011”
Algebra, pp. 159–168.

NORMALIZED H∞ COPRIME FACTORIZATION FOR INFINITE-DIMENSIONAL SYSTEMS

Fatmawati, Roberd Saragih, Yudi Soeharyadi

Abstract. In this paper, we investigate the existence of normalized H∞ coprime
factorization for infinite-dimensional systems. The systems considered are assumed to be expo-
nentially stabilizable and detectable linear systems with bounded and finite-rank input
and output operators. We construct the normalized left-coprime factorization (NLCF)
systems based on the solutions of Riccati equations of H∞ -type controllers. Furthermore,
we derive the connection between the controllability and observability gramians of NLCF
systems and the solutions of H∞ -Riccati equations.
Keywords and Phrases: Infinite-dimensional systems, coprime factorization, Riccati
equations.

1. INTRODUCTION
Two aspects stand out in system theory: robust stabilization and model reduction.
The robust stabilization problem is the problem of finding a controller that stabilizes
not only the nominal plant but also plants similar to the nominal one. Moreover,
a controller of low order is often desirable in practice. One approach to obtaining
a low-order controller is model reduction. For these purposes, coprime factorization
of the representation of a transfer function has become a powerful tool. The coprime
factorization theory was later generalized to infinite dimensional systems in [1, 3].
Curtain and Opmeer [2, 9] have developed a model reduction method based on
LQG-balanced realizations for infinite dimensional systems. An LQG-balanced realization is a
realization of a transformed system such that the solutions of the corresponding control
and filter Riccati operator equations of Linear Quadratic Gaussian (LQG) controllers
are equal and diagonal. The key step in the analysis of the LQG-balanced realization is to
construct the normalized left-coprime factorization (NLCF) systems using the solutions
of LQG-Riccati equations of infinite dimensional systems [2].

2010 Mathematics Subject Classification: 93A15, 93B05, 93B07, 93C05, 93C20.


During the last decades, there has been much research on the design of H∞ controllers,
which are robust to system uncertainty and disturbances. In [8], it is shown
that the LQG-balanced realization of FDLTI systems can be carried over to an H∞-balanced
realization based on H∞-type controllers. Hence, it is very interesting to generalize the
H∞-balanced realization to infinite dimensional systems. The H∞-balanced realization is
constructed via normalized left-coprime factorization (NLCF) systems using the solutions
of H∞-Riccati equations of infinite dimensional systems [4]. The systems considered are
assumed to be exponentially stabilizable and detectable linear state systems with bounded
and finite-rank input and output operators. Furthermore, we derive the connection between
the controllability and observability gramians of NLCF systems and the solutions
of H∞-Riccati equations.

2. H∞ -CONTROL FOR INFINITE-DIMENSIONAL SYSTEMS


In this section, we summarize without proof the existence of H∞ -control for
infinite-dimensional systems. Readers are referred to reference [3] for further details.
The infinite dimensional systems can be described in abstract form as

ẋ(t) = Ax(t) + Bu(t)


y(t) = Cx(t) + Du(t), (1)

where A is an infinitesimal generator of C0 -semigroup S(t) on Hilbert space X , B is a


bounded linear operator from a Hilbert space U to X , C is a bounded linear operator
from X to a Hilbert space Y, and D is a bounded operator from U to Y. We assume that
the input operator B and the output operator C are of finite rank. Choose U = Cm and
Y = Ck . The signal u(t) ∈ L2 ([0, ∞); Cm ) is control input and y(t) ∈ L2 ([0, ∞); Ck ) is
the output.
We shall denote the state linear system given by (1) as (A, B, C, D) and transfer
function given by G, with realization G(s) = C(sI − A)−1 B + D. We will omit the
operator ”D” if it is not relevant. The adjoint of operator A is written as A∗ and domain
of A is denoted by D(A). A symmetric operator A is self-adjoint if D(A∗ ) = D(A). The
self-adjoint operator A on the Hilbert space X with its inner product h·, ·i is nonnegative
if hAz, zi ≥ 0 for all z ∈ D(A) and positive if hAz, zi > 0 for all nonzero z ∈ D(A). We
shall use the notation A ≥ 0 for nonnegativity of the self-adjoint operator A, and A > 0
for positivity. Let C be the set of complex numbers. For λ ∈ C, we say that λ is in the
resolvent set ρ(A) of A, if (λI − A)−1 is exists and bounded linear operator on X . The
spectrum of A is defined to be σ(A) = C\ρ(A).
In this study, we will assume the system (A, B, C) to be exponentially stabilizable
and detectable. The system (A, B, C) is exponentially stabilizable if there exists an
operator F such that A + BF is exponentially stable and it is exponentially detectable
if there exists an operator L such that A + LC is exponentially stable. The operators
A + BF and A + LC are exponentially stable if they generate the exponentially stable
C0 -semigroup SF (t) and SL (t), respectively, on X . Recall that the C0 -semigroup SF (t)

on X is exponentially stable [3] if there exist positive constants M and α such that
kSF (t)k ≤ M e−αt , for all t ≥ 0.
Next, we review an H∞ -control problem for infinite-dimensional systems. This
problem is concerned with a generalized plant
      
( ẋ(t)  )   ( A  B  0  B ) ( x(t)  )
( z1(t) ) = ( C  0  0  0 ) ( w1(t) )          (2)
( z2(t) )   ( 0  0  0  I ) ( w2(t) )
( y(t)  )   ( C  0  I  0 ) ( u(t)  )
   
z1 w1
Put z := , and w := , where z is the error signals and w is the disturbance
z2 w2
signals. The state space description of the controller with transfer function K is given
by

ẋK (t) = AK xK (t) + BK y(t)


u(t) = CK xK (t). (3)

The closed-loop transfer function from w to z will be denoted by Tzw . The existence
of suboptimal H∞ -control for infinite-dimensional systems is given by the following
theorem.

Theorem 2.1. [10, 7] There exists an admissible controller K for the system (2) such
that ‖Tzw‖∞ < γ, γ > 0, if and only if there exist operators X, Y ∈ L(X) with X =
X∗ ≥ 0, Y = Y∗ ≥ 0 that satisfy
1. Operator X satisfying control Riccati equation

A∗ Xx + XAx − (1 − γ −2 )XBB ∗ Xx + C ∗ Cx = 0, x ∈ D(A) (4)

such that the operator AX = A − (1 − γ−2)BB∗X is exponentially stable.


2. Operator Y satisfying filter Riccati equation

AY x + Y A∗ x − (1 − γ −2 )Y C ∗ CY x + BB ∗ x = 0, x ∈ D(A∗ ) (5)

such that the operator AY = A − (1 − γ−2)Y C∗C is exponentially stable.


3. rσ (Y X) < γ 2 .

Moreover, when these conditions hold, one such controller K of the form (3) can be
constructed with

AK = A − (1 − γ −2 )BB ∗ X − ZC ∗ C,
BK = ZC ∗ ,
CK = −B ∗ X
Z = (I − γ −2 Y X)−1 Y = Y (I − γ −2 XY )−1 .
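For a finite-dimensional (matrix-valued) plant, the operator Riccati equations (4) and (5) reduce to standard algebraic Riccati equations in which BB∗ and C∗C are scaled by (1 − γ⁻²), so the conditions of Theorem 2.1 can be checked numerically. The following Python sketch uses SciPy's solve_continuous_are on arbitrarily chosen matrices A, B, C (purely illustrative, not taken from the paper) and tests the spectral radius condition rσ(Y X) < γ².

import numpy as np
from scipy.linalg import solve_continuous_are

# Illustrative finite-dimensional data and a candidate attenuation level gamma.
A = np.array([[0.0, 1.0], [-2.0, -3.0]])
B = np.array([[0.0], [1.0]])
C = np.array([[1.0, 0.0]])
gamma = 2.0
beta = np.sqrt(1.0 - gamma ** -2)

# A'X + XA - X (beta B)(beta B)' X + C'C = 0   (control Riccati equation (4))
X = solve_continuous_are(A, beta * B, C.T @ C, np.eye(1))
# A Y + Y A' - Y (beta C)'(beta C) Y + B B' = 0 (filter Riccati equation (5))
Y = solve_continuous_are(A.T, beta * C.T, B @ B.T, np.eye(1))

# Condition 3 of Theorem 2.1: the spectral radius of Y X must be below gamma^2.
spectral_radius = max(abs(np.linalg.eigvals(Y @ X)))
print(spectral_radius < gamma ** 2, spectral_radius)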

3. NORMALIZED H∞ COPRIME FACTORIZATION


This section gives a detailed analysis of the normalized H∞ coprime factorization
for infinite-dimensional systems. It will be seen that the normalized H∞ coprime facto-
rization is an extension of the normalized LQG coprime factorization. We will restrict
the discussion to the normalized left-coprime factorization (NLCF) systems. The def-
inition of NLCF of infinite-dimensional systems refers to Curtain and Zwart [3] that
given in the following.
Definition 3.1. [3] Let Ñ ∈ H∞ (L(Cm , Ck )) and M̃ ∈ H∞ (L(Ck )) have the same
number of rows such that M̃(s) has an inverse. The ordered pair [Ñ(s) M̃(s)] repre-
sents a NLCF of the transfer function G if
1. G(s) = M̃(s)−1 Ñ(s),
2. the coprime property is
M̃(s)X(s) − Ñ(s)Y(s) = I, s ∈ C₀⁺,
with X ∈ H∞ (L(Ck )) and Y ∈ H∞ (L(Ck , Cm )).
3. the normalization property is
M̃(iω)M̃(iω)∗ + Ñ(iω)Ñ(iω)∗ = I, ω ∈ R.
Next, we will construct the NLCF system based on the filter H∞ -Riccati equation
(5). Let γ0 denote the smallest γ for which an admissible controller K exists. Define
β := √(1 − γ−2),
where γ > max{1, γ0}, so that 0 < β ≤ 1. Multiplying (5) by β² we obtain
A(β 2 Y )x + (β 2 Y )A∗ x − (β 2 Y )C ∗ C(β 2 Y )x + (βB)(βB)∗ x = 0, x ∈ D(A∗ ). (6)
This shows that the filter Riccati equation (6) is the filter LQG-Riccati equation for the
scaled system (A, βB, C). Meanwhile, the control LQG-Riccati equation for (A, βB, C)
is equal to the control H∞ -Riccati equation (4), which can be rewritten as follows
A∗ Xx + XAx − X(βB)(βB)∗ Xx + C ∗ Cx = 0, x ∈ D(A). (7)
Hence, β²Y is a solution of the filter LQG-Riccati equation that stabilizes the scaled system (A, βB, C), since AY = A − β²Y C∗C is exponentially stable.
Let βG(s) = C(sI − A)−1 βB denote the transfer function of (A, βB, C). Using
the equation (6), we can construct NLCF from the scaled system (A, βB, C). The
following lemma shows that (AY , BY , CY , DY ) is a state-space realization of NLCF of
βG, with
AY = A − β 2 Y C ∗ C, BY = [βB − β 2 Y C ∗ ], CY = C, DY = [0 I]. (8)
The idea of the proof adopts the method used in Curtain and Zwart [3].
Lemma 3.1. (AY , BY , CY , DY ) given by (8) is a state-space realization of the NLCF of βG with transfer function [Ñ(s) M̃(s)], where Ñ(s) and M̃(s) are given by
Ñ(s) = C(sI − AY )−1 βB, M̃(s) = I − C(sI − AY )−1 β 2 Y C ∗ . (9)
such that

1. βG(s) = M̃(s)−1 Ñ(s).


2. There exist
X(s) = I + C(sI − AX )−1 β 2 Y C ∗ and Y(s) = −βB ∗ X(sI − AX )−1 β 2 Y C ∗ ,
with AX = A − β²BB∗X, which satisfy
M̃(s)X(s) − Ñ(s)Y(s) = I, s ∈ C₀⁺.

3. The normalization property is


M̃(iω)M̃(iω)∗ + Ñ(iω)Ñ(iω)∗ = I, ω ∈ R.

Proof. As the first step, we shall use the following identities:


(sI − AY )−1 = (sI − A)−1 − (sI − A)−1 β 2 Y C ∗ C(sI − AY )−1 (10)
−1 −1 −1 2 ∗ −1
(sI − AY ) = (sI − A) − (sI − AY ) β Y C C(sI − A) , (11)
for all s ∈ ρ(A) ∩ ρ(AY ).
1. From (9) and (10), we obtain
I − M̃(s) = C(sI − AY )−1 β²Y C∗
= C[(sI − A)−1 − (sI − A)−1 β²Y C∗C(sI − AY )−1 ]β²Y C∗
= C(sI − A)−1 β²Y C∗ M̃(s).


and so
I = [I + C(sI − A)−1 β²Y C∗] M̃(s).
Similarly, using (11) we can show that
I = M̃(s)[I + C(sI − A)−1 β²Y C∗].
This means that M̃(s) has the inverse
M̃(s)−1 = I + C(sI − A)−1 β²Y C∗.

Then multiplying M̃(s)−1 by Ñ(s) and applying (10), we get
M̃(s)−1 Ñ(s) = [I + C(sI − A)−1 β²Y C∗] C(sI − AY )−1 βB
= C(sI − AY )−1 βB + C(sI − A)−1 β²Y C∗C(sI − AY )−1 βB
= C[(sI − A)−1 − (sI − A)−1 β²Y C∗C(sI − AY )−1 ]βB + C(sI − A)−1 β²Y C∗C(sI − AY )−1 βB
= C(sI − A)−1 βB
= βG(s).
2. In the second step, we show that the factorization of βG(s) is left-coprime. Note that (sI − AX )−1 can be represented by the identity
(sI − AX )−1 = (sI − A)−1 − (sI − A)−1 β 2 BB ∗ X(sI − AX )−1 . (12)

Substituting X(s), Y(s) and using (12), we compute
X(s) − βG(s)Y(s)
= I + C(sI − AX )−1 β²Y C∗ − C(sI − A)−1 βB[−βB∗X(sI − AX )−1 β²Y C∗]
= I + C(sI − A)−1 β 2 Y C ∗ − C(sI − A)−1 β 2 BB ∗ X(sI − AX )−1 β 2 Y C ∗ +


C(sI − A)−1 β 2 BB ∗ X(sI − AX )−1 β 2 Y C ∗
= I + C(sI − A)−1 β 2 Y C ∗
= M̃(s)−1 .
So, we have
X(s) − M̃(s)−1 Ñ(s)Y(s) = M̃(s)−1 . (13)
We see that (13) is equivalent to
M̃(s)X(s) − Ñ(s)Y(s) = I. (14)

Since M̃, Ñ, X and Y are analytic and bounded on C₀⁺, the identity (14) must hold for all s ∈ C₀⁺, where C₀⁺ is the set of complex numbers with positive real part.
3. In the last step, we show that
M̃(iω)M̃(iω)∗ + Ñ(iω)Ñ(iω)∗ = I, ω ∈ R. (15)
Note that for ω ∈ R we obtain
M̃(iω)M̃(iω)∗ + Ñ(iω)Ñ(iω)∗
= [I − C(iωI − AY )−1 β²Y C∗][I − Cβ²Y (−iωI − A∗Y )−1 C∗] + C(iωI − AY )−1 βBβB∗(−iωI − A∗Y )−1 C∗
= I − C(iωI − AY )−1 β²Y C∗ − Cβ²Y (−iωI − A∗Y )−1 C∗ + C(iωI − AY )−1 [β⁴Y C∗CY + β²BB∗](−iωI − A∗Y )−1 C∗
= I + C(iωI − AY )−1 [AY β²Y + β²Y A∗Y + β⁴Y C∗CY + β²BB∗](−iωI − A∗Y )−1 C∗
= I.
To establish (15), we substitute AY = A − β²Y C∗C into the expression
AY β 2 Y + β 2 Y A∗Y + β 4 Y C ∗ CY + β 2 BB ∗
= [A − β 2 Y C ∗ C]β 2 Y + β 2 Y [A − β 2 Y C ∗ C]∗ + β 4 Y C ∗ CY + β 2 BB ∗
= Aβ 2 Y + β 2 Y A∗ − β 2 Y C ∗ Cβ 2 Y + β 2 BB ∗
=0 by (6).
This verifies (15).


Note that the NLCF system (AY , BY , CY , DY ) is exponentially stable. The con-
nection between the controllability and observability gramians of the NLCF system and the solutions of the Riccati equations is given by the following lemma.

Lemma 3.2. The controllability and observability gramians LB and LC , respectively, of the


NLCF system (AY , BY , CY , DY ) satisfy

LB = β 2 Y, LC = X(I + β 2 Y X)−1 , X = (I − LC LB )−1 LC , (16)

where β 2 Y = β 2 Y ∗ ≥ 0 and X = X ∗ ≥ 0 are the stabilizing solutions of the LQG-


Riccati equations (6) and (7), respectively.

Proof. Since the NLCF system (AY , BY , CY , DY ) is exponentially stable, then the con-
trollability gramian LB = L∗B ≥ 0 is the unique solution of the Lyapunov equation [3,
Lemma 4.1.24]

AY LB x + LB A∗Y x = −BY BY∗ x, x ∈ D(A∗ ). (17)

Substitute BY = [βB − β 2 Y C ∗ ], so that (17) can be expressed in the form

AY LB x + LB A∗Y x = −β 2 BB ∗ x − β 4 Y C ∗ CY x, x ∈ D(A∗ ). (18)

Furthermore, we reformulate the filter LQG-Riccati equation (6) for β 2 Y as

AY β 2 Y + β 2 Y A∗Y = (A − β 2 Y C ∗ C)β 2 Y + β 2 Y (A − β 2 Y C ∗ C)∗


= [Aβ 2 Y + β 2 Y A∗ ] − 2β 4 Y C ∗ CY
= [β 4 Y C ∗ CY − β 2 BB ∗ ] − 2β 4 Y C ∗ CY
= −β 2 BB ∗ − β 4 Y C ∗ CY.

For x ∈ D(A∗ ), we have

AY β 2 Y x + β 2 Y A∗Y x = −β 2 BB ∗ x − β 4 Y C ∗ CY x. (19)

We see that LB and β²Y are both solutions of the Lyapunov equation (18). By the uniqueness of the solution, we conclude that LB = β²Y .
Similarly, the observability gramian LC = L∗C ≥ 0 is the unique solution of the
Lyapunov equation

A∗Y LC x + LC AY x = −CY∗ CY x, x ∈ D(A). (20)

By (8), CY = C so that (20) can be rewritten as follows

A∗Y LC x + LC AY x = −C ∗ Cx, x ∈ D(A). (21)

Since the maximum Hankel singular value of the NLCF system (AY , BY , CY , DY ) is less
than one, we have that (I − LC LB ) is invertible and (I − LC LB )−1 : D(A∗ ) → D(A∗ )
[3, Lemma 9.4.7, Lemma 8.3.2].
We verify that Q := (I − LC LB )−1 LC is a solution of the control LQG-Riccati equation (7).

Define N1 := (I − LB LC )−1 . For x ∈ D(A), we have


QAY x + A∗Y Qx
= N1∗ [LC AY N1−1 + N1−∗ A∗Y N1∗ LC N1−1 ] N1 x
= N1∗ [LC AY (I − LB LC ) + (I − LC LB )A∗Y LC ] N1 x


= N1∗ [(LC AY + A∗Y LC ) + LC (−AY LB − LB A∗Y )LC ] N1 x
= N1∗ [−C ∗ C + LC BY BY∗ LC ] N1 x,

where we have used (21) and (17). Hence, Q = (I − LC LB )−1 LC satisfies the Riccati
equation
QAY x + A∗Y Qx = −(I − LC LB )−1 C ∗ C(I − LB LC )−1 x + QBY BY∗ Qx
QAY x + A∗Y Qx − QBY BY∗ Qx = −(I − LC LB )−1 C ∗ C(I − LB LC )−1 x. (22)
Then substituting BY = [βB − β²Y C∗] into (22), we obtain
QAY x + A∗Y Qx − Qβ 2 BB ∗ Qx − Qβ 2 Y C ∗ Cβ 2 Y Qx
= −(I − LC LB )−1 C ∗ C(I − LB LC )−1 x. (23)
Adding C∗C to both sides of (23) and using β²Y = LB , we obtain
QAY x + A∗Y Qx − Qβ 2 BB ∗ Q + C ∗ Cx
= C ∗ Cx + QLB C ∗ CLB Qx − (I − LC LB )−1 C ∗ C(I − LB LC )−1 x
= (I − LC LB )−1 [(I − LC LB )C ∗ C(I − LB LC ) + LC LB C ∗ CLB LC − C ∗ C] ·
(I − LB LC )−1 x
= −(I − LC LB )−1 [LC LB C ∗ C(I − LB LC ) + (I − LC LB )C ∗ CLB LC ] (I − LB LC )−1 x
= −QLB C ∗ C − C ∗ CLB Qx.
Hence, for x ∈ D(A), we have
QAY x + A∗Y Qx − Qβ 2 BB ∗ Qx + C ∗ Cx = −QLB C ∗ Cx − C ∗ CLB Qx. (24)
Moreover, using AY = A − β 2 Y C ∗ C, we obtain
XAY x + A∗Y Xx − Xβ 2 BB ∗ Xx + C ∗ Cx (25)
∗ 2 ∗ ∗ 2 2 ∗ ∗
= XAx + A Xx − Xβ Y C Cx − C Cβ Y Xx − Xβ BB Xx + C Cx.
Note that X is the unique solution of the control LQG-Riccati equation (7). Using
β 2 Y = LB , we can rewrite (25) as
XAY x + A∗Y Xx − Xβ 2 BB ∗ Xx + C ∗ Cx = −XLB C ∗ Cx − C ∗ CLB Xx. (26)
According to (24) and (26), Q and X are both solutions of the control LQG-Riccati equation. So by the uniqueness property, we have
X = Q = (I − LC LB )−1 LC = LC (I − LB LC )−1 . (27)

Now (β 2 Y )X = LB LC (I − LB LC )−1 shows that


(I + β 2 Y X) = I + LB LC (I − LB LC )−1
= [(I − LB LC ) + LB LC ] (I − LB LC )−1
= (I − LB LC )−1 .

Similarly, X(β 2 Y ) = (I − LC LB )−1 LC LB and so (I + Xβ 2 Y ) = (I − LC LB )−1 . Thus


(I + β 2 Y X) and (I + Xβ 2 Y ) are invertible. Applying (27) and LB = β 2 Y we conclude
that
LC = X(I + β 2 Y X)−1 = (I + Xβ 2 Y )−1 X.
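
Continuing the hedged finite-dimensional sketch from Section 2 (matrices in place of operators, made-up data), Lemma 3.2 can be checked numerically: the gramians of the realization (8), obtained from the Lyapunov equations (17) and (20), should satisfy LB = β²Y and X = (I − LC LB)⁻¹LC.

```python
# Hypothetical numerical check of Lemma 3.2 with made-up matrices.
import numpy as np
from scipy.linalg import solve_continuous_are, solve_continuous_lyapunov

A = np.array([[0.0, 1.0], [-2.0, -3.0]])
B = np.array([[0.0], [1.0]])
C = np.array([[1.0, 0.0]])
beta2 = 1.0 - 2.0**-2                                    # beta^2 with gamma = 2

X = solve_continuous_are(A, np.sqrt(beta2) * B, C.T @ C, np.eye(1))      # eq. (7)
Y = solve_continuous_are(A.T, np.sqrt(beta2) * C.T, B @ B.T, np.eye(1))  # eq. (5)

AY = A - beta2 * Y @ C.T @ C                             # realization (8)
BY = np.hstack([np.sqrt(beta2) * B, -beta2 * Y @ C.T])
CY = C

LB = solve_continuous_lyapunov(AY, -BY @ BY.T)           # controllability gramian, (17)
LC = solve_continuous_lyapunov(AY.T, -CY.T @ CY)         # observability gramian, (20)

I = np.eye(2)
print(np.allclose(LB, beta2 * Y))                        # L_B = beta^2 Y
print(np.allclose(X, np.linalg.solve(I - LC @ LB, LC)))  # X = (I - L_C L_B)^{-1} L_C
```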


4. CONCLUDING REMARKS
We have constructed a normalized left-coprime factorization (NLCF) system based on the H∞-Riccati equations. There is a relationship between the controllability and observability gramians of the NLCF system and the solutions of the Riccati equations. In future work, we will develop a model reduction technique based on H∞-balancing via the NLCF system.

References
[1] Curtain, R. F., Robust stabilizability of normalized coprime factors: the infinite-dimensional
case, Int. J. Control 51, 1173-1190, 1990.
[2] Curtain, R. F., Model reduction for control design for distributed parameter systems, in Research
Directions in Distributed Parameter systems, SIAM, Philadelphia, PA, 95-121, 2003.
[3] Curtain, R. F. and Zwart, H. J., An Introduction to Infinite-Dimensional Systems, Springer-
Verlag, New York, 1995.
[4] Fatmawati, Model Reduction and Low Order Controller Design Strategy for Infinite Dimensional
Systems, PhD Dissertation, Institut Teknologi Bandung, Indonesia, 2010.
[5] Glover, K. and McFarlane, D., Robust stabilization of normalized coprime factor plant de-
scriptions with H∞ -bounded uncertainty, IEEE Trans. on Automatic Control AC-34, 821-830,
1989.
[6] Meyer, D. E., Fractional balanced reduction: model reduction via fractional representation, IEEE
Trans. on Automatic Control, 35, 1341-1345, 1990.
[7] Morris, K. A, H∞ -output feedback of infinite-dimensional systems via approximation, Systems
Control Lett. , 44, 211-217, 2001.
[8] Mustafa, D. and Glover, K., Controller reduction by H∞ -balanced truncation, IEEE Trans.
on Automatic Control, 36, 668-682, 1991.
[9] Opmeer, M. R., LQG balancing for continuous-time infinite-dimensional systems, SIAM J. Con-
trol and Optim., 46, 1831-1848, 2007.
[10] van Keulen, B., H∞ -Control for Distributed Parameter Systems: A State-Space Approach, Sys-
tems & Control: Foundation & Applications, Birkhäuser, Boston, 1993.

FATMAWATI
Department of Mathematics, Faculty of Science and Technology, Universitas Airlangga,
Kampus C Jl. Mulyorejo Surabaya, Indonesia.
e-mail: [email protected]

Roberd Saragih
Industrial and Financial Mathematics Group, Faculty of Mathematics and Natural Science,
Institut Teknologi Bandung,
Jl. Ganesa 10 Bandung, Indonesia.
e-mail: [email protected]
Yudi Soeharyadi
Analysis and Geometry Group, Faculty of Mathematics and Natural Science, Institut Teknologi
Bandung,
Jl. Ganesa 10 Bandung, Indonesia.
e-mail: [email protected]
Proceedings of ”The 6th SEAMS-UGM Conference 2011”
Algebra, pp. 169 - 174.

CONSTRUCTION OF A COMPLETE HEYTING


ALGEBRA FOR ANY LATTICE

HARINA O.L. MONIM, INDAH EMILIA WIJAYANTI, SRI WAHYUNI

Abstract. Let  be a function from any set X into any lattice L with the smallest element. The set of all such functions is called the L-power set of X and is denoted by LX. We assume that LX is a complete Heyting algebra containing 0X and 1X. The lattice L can be isomorphically embedded into the lattice LX, so L can be viewed as a subset of LX. The purpose of this note is to construct L as a complete Heyting algebra containing both the smallest and the largest elements. We first show that the smallest element of LX is contained in every nonempty subset of the sublattice of LX. Since every nonempty subset of L also contains the smallest element, it follows that every such subset has a supremum and an infimum in L, so L is a complete lattice. Moreover, since LX is a relatively pseudo-complemented lattice, so is L. Finally, we prove that L is a complete Heyting algebra.
Keywords and Phrases: complete lattice, relatively pseudo-complemented, Heyting algebra, complete Heyting algebra.

1. INTRODUCTION

The purpose of this paper is to construct a lattice L as a complete Heyting algebra containing both the smallest and the largest elements. This work starts from our work in [6], which leads us to consider the converse of what we have done before.
The result in [6], that if L is a complete Heyting algebra then LX is a complete Heyting algebra (obtained by defining a relative pseudo-complement on LX), leads us to the following converse question. Assume that L is a lattice with the smallest element and LX is a complete Heyting algebra. How do we construct a complete Heyting algebra from the lattice L?
We will show that if L is a lattice with the smallest element, then L can be viewed as a subset of LX by an embedding technique. Moreover, since the nonempty subsets of the sublattice of LX contain the smallest element of LX, the corresponding subsets of the lattice L have both a supremum and an infimum in L. This makes it clear that L is a complete lattice. Since L satisfies the infinitely distributive law, according to Birkhoff [1], L is a relatively pseudo-complemented lattice. Using the result in [6], we prove that L is a complete Heyting algebra.


1.1 Fundamental Definitions. The concept of a complete Heyting algebra is connected with particular classes of lattices, such as relatively pseudo-complemented lattices and lattices satisfying the infinitely distributive law. On the other hand, a lattice can be studied both by an algebraic approach and by an ordered-set approach. Accordingly, we may use the notation (V; ∨, ∧) for an algebra with two binary operations ∨ and ∧. In this section, we give some basic definitions and examples related to lattices and Heyting algebras, and we recall some well-known results. For basic notations and results, we refer the reader to [6].

Definition 1.1. [1]


i. A lattice is a partially ordered set P such that any two elements of it possess both a g.l.b.
(greatest lower bound) or “meet” and a l.u.b. (least upper bound) or “join”.
ii. A partially ordered set in which every subset has a g.l.b. and a l.u.b. is called a complete lattice.

Let us see some examples below.


Example 1.2

a) Let n be a natural number and denote by D(n) = {a ∈ N : a | n} the set of all divisors of n in N. Then (D(n); l.c.m., g.c.d.) is a lattice. For n = 24, the elements of D(24) are 1, 2, 3, 4, 6, 8, 12 and 24. Since 8 = 2³, 12 = 2²·3 and 6 = 2·3, the l.c.m. and g.c.d. of some pairs of elements in D(24) are
l.c.m.(8,12) = 2³·3 = 24, g.c.d.(8,12) = 2² = 4,
l.c.m.(6,8) = 2³·3 = 24, g.c.d.(6,8) = 2,
l.c.m.(4,6) = 2²·3 = 12, g.c.d.(4,6) = 2
(a small computational check of these values is sketched after this example).
 
The case n = 30 is analogous. The Hasse diagrams of these lattices are visualized in Figure 1 and Figure 2, respectively.

[Figure 1 and Figure 2: Hasse diagrams of the divisor lattices D(24) and D(30).]
b) The closed interval [0,1] with the usual order is a complete lattice.
c) The set of all subsets of some set, ordered by inclusion, is a complete lattice.

d) The set of all integers with the usual ordering is not a complete lattice.
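
As a quick computational check of the values in Example 1.2(a) (a hedged sketch, not part of the paper), the divisor lattice D(24), with join = l.c.m. and meet = g.c.d., can be computed directly:

```python
# Small check of Example 1.2(a): the divisor lattice D(n) with lcm as join and gcd as meet.
from math import gcd

def lcm(a, b):
    return a * b // gcd(a, b)

def divisors(n):
    return sorted(d for d in range(1, n + 1) if n % d == 0)

D24 = divisors(24)
print(D24)                      # [1, 2, 3, 4, 6, 8, 12, 24]
print(lcm(8, 12), gcd(8, 12))   # 24 4
print(lcm(6, 8), gcd(6, 8))     # 24 2
print(lcm(4, 6), gcd(4, 6))     # 12 2
```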

The following definition explains some notation related to mappings between two lattices.

Definition 1.3. [1] Let φ : L → L∗ be any mapping, where L and L∗ are lattices, and consider the conditions
(i) if x ≤ y in L then φ(x) ≤ φ(y) in L∗,
(ii) φ(x ∧ y) = φ(x) ∧ φ(y),
(iii) φ(x ∨ y) = φ(x) ∨ φ(y),
for any x, y ∈ L. The mapping φ is called:
a) isotone if it satisfies (i),
b) a meet homomorphism if it satisfies (i) and (ii),
c) a join homomorphism if it satisfies (i) and (iii),
d) a lattice homomorphism if it satisfies (i), (ii) and (iii).

We shall use the abbreviation RPC for relatively pseudo-complemented and ID for infinitely distributive.

Definition 1.4. [1] By the pseudo-complement a → b of an element a relative to an element b in a lattice L is meant an element c such that a ∧ x ≤ b if and only if x ≤ c. A lattice in which a → b exists for all a, b ∈ L is called an RPC lattice.
We now present some useful properties of a relatively pseudo-complemented lattice L.

Theorem 1.5. [3] If a, b, c  L , then


1. a  b  1  a  b 7. b  a b
2. a  b  a  b  1  b  a 8. a  b  a  c  a  b  c
3. a  a  1 9. a  b  c   a  b  c  b  a  c 
4. a  1  1
10. a  a  b  a  b
5. a  a  b  b
11. a  b  c   a  a  b  a  c 
6. a  b  b  c  a  c 
12. a  b  a  1 .
 
Definition 1.6. [1] A lattice L satisfies the ID law if
$\bigl(\bigvee_{a\in A} a\bigr)\wedge b=\bigvee_{a\in A}(a\wedge b)$ and $\bigl(\bigwedge_{a\in A} a\bigr)\vee b=\bigwedge_{a\in A}(a\vee b)$,
for all A ⊆ L and for all b ∈ L.
The following definition of a Heyting algebra as a lattice is taken from Katsuya (1981).

Definition 1.7. [4]. A lattice L is a Heyting algebra if it is a bounded distributive lattice and
an RPC lattice.
The connection between a lattice (as a poset) and a Heyting algebra is at the core of this paper.

Proposition 1.8. [3] Let (L, ≤) be a bounded and relatively pseudo-complemented lattice. Then the three binary operations infimum, supremum and relative pseudo-complement make (L, ∧, ∨, →, 0, 1) a Heyting algebra. Conversely, a Heyting algebra is defined as a bounded lattice such that for all a and b in L there is a greatest element x of L for which a ∧ x ≤ b holds.

A complete Heyting algebra is a Heyting algebra whose lattice reduct is a complete lattice.

Definition 1.9. [5] By an L-subset of X, we mean a function from any set X into L where L is a
complete Heyting algebra. The set of all L-subsets of X is called the L-power set of X and is
denoted by LX.

Here are the known results that are taken from [6].

Theorem 1.10. [6] Let L be a complete Heyting algebra. LX is a complete Heyting algebra if
LX is a relatively pseudo-complemented lattice.

Theorem 1.11. [6] The L-power set of X, together with the operations union, intersection and relative pseudo-complement, constitutes a complete Heyting algebra whose partial ordering is . Its maximal and minimal elements are 1X and 0X, respectively. Moreover, the lattice L can be isomorphically embedded into the lattice LX.

2. CONSTRUCTION OF A COMPLETE HEYTING ALGEBRA FOR ANY


LATTICE

From now on, LX is a complete Heyting algebra with the operations union, intersection and relative pseudo-complement, containing 1X and 0X, unless otherwise stated. On the other hand, we assume L is a lattice with the smallest element 0.
The relation between the lattice L and the lattice LX was shown in [6] as follows. We start by recalling the embedding mapping φ : L → LX defined by
$$\varphi(a)(x) = a_Y(x) := \begin{cases} a, & x \in Y,\\ 0, & x \in X\setminus Y,\end{cases}$$
where Y ⊆ X and a ∈ L.

Case 1. For an arbitrary lattice L, the definition of the function unfortunately fails.
Case 2. For a lattice L with a bottom element, the definition of the function is well defined.
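
A hedged illustration of the embedding (not from the paper), assuming for concreteness that L is the finite chain {0, 1, 2} with bottom 0 (Case 2) and X = {'x1', 'x2', 'x3'}; pointwise max and min play the role of the union and intersection on LX mentioned in Theorem 1.11.

```python
# Toy sketch of the embedding a -> a_Y for a chain L with bottom element 0.
X = ['x1', 'x2', 'x3']
Y = ['x1', 'x2']

def embed(a, Y, X, bottom=0):
    """The L-subset a_Y: x -> a if x in Y, bottom otherwise."""
    return {x: (a if x in Y else bottom) for x in X}

a_Y, b_Y = embed(2, Y, X), embed(1, Y, X)
join = {x: max(a_Y[x], b_Y[x]) for x in X}   # pointwise join in L^X
meet = {x: min(a_Y[x], b_Y[x]) for x in X}   # pointwise meet in L^X
print(a_Y)        # {'x1': 2, 'x2': 2, 'x3': 0}
print(join, meet)
```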

We have already established that L ≅ φ(L). Of course, every subset A of L is also a subset of LX, so sup A and inf A exist in LX for any subset A, but there is no guarantee that both of them exist in L. Now we are going to show that L is a complete lattice by considering the image φ(L) of L.

The following lemma which is taken from Davey and Priestly [2] will be used to prove
the main result.

Lemma 2.1 [2] Let L be a lattice.


1. If L satisfies the descending chain condition, then for every nonempty subset A of L there exists a finite subset B of A such that sup A = sup B (which exists in L).
2. If L has the smallest element and satisfies the descending chain condition, then L is a complete lattice.

The main result of this paper is given below.

Proposition 2.2. Let X be any set, let L be any lattice with the smallest element, and let LX be a complete Heyting algebra. Suppose there exists an isomorphic embedding φ : L → LX with the property that the sublattice φ(L) contains every function aY in LX such that a ≥ aY(x) for any x ∈ X, Y ⊆ X and a ∈ L. Then L is a complete Heyting algebra.
Proof: By the definition of φ above, we have a ≥ aY(x) for any x ∈ X, since aY(x) is either a or 0 and a ≥ 0. Thus, there exists at least 0 ∈ L such that φ(L) contains aY in LX satisfying the property. The set of all functions satisfying this property, denoted by Sa := {aY : a ≥ aY(x) for all x ∈ X}, is contained in φ(L), because 0 ∈ L implies 0Y ∈ Sa, which is the smallest element of LX. Thus, every nonempty subset A of L contains the smallest element 0 ∈ L, and L is said to satisfy the minimum condition. This means that L fulfills the descending chain condition. Furthermore, for every nonempty subset A of L there exists a finite subset B of A such that sup A = sup B exists in L. From this we conclude that L is a complete lattice. Moreover, LX satisfies the infinitely distributive law, and so does L. According to Birkhoff, L is a relatively pseudo-complemented lattice. Finally, referring to the properties of relatively pseudo-complemented lattices and the result in [6], we conclude that L is a complete Heyting algebra. □

3. CONCLUDING REMARK

If L is an arbitrary lattice and LX is a complete Heyting algebra, then we cannot construct a complete Heyting algebra from L, because the definition of the embedding mapping φ from L into LX fails. On the other hand, a lattice L with the smallest element can be made into a complete Heyting algebra if there is an isomorphic embedding φ with the property that the sublattice φ(L) contains all functions aY such that a ≥ aY(x) for any x ∈ X, Y ⊆ X and a ∈ L.

References

[1] BIRKHOFF G., Lattice Theory, vol. 25, American Mathematical Society Colloquium Publications, New York, 1984.
[2] DAVEY B. A. AND PRIESTLEY H. A., Introduction to Lattices and Order, Second Edition, Cambridge University Press, 2002.
[3] RASIOWA H. AND SIKORSKI R., The Mathematics of Metamathematics, Warszawa, Poland, 1963.
[4] KATSUYA E., Completion and coproduct of Heyting Algebras, Tsukuba Journal of Mathematics, vol. 5, No. 2, 1981.
[5] MORDESON, J. AND MALIK, D.S., Fuzzy Commutative Algebra, World Scientific Publishing Co. Pte. Ltd, Singapore, 1998.
[6] MONIM, H.O.L., WIJAYANTI I.E. AND WAHYUNI S., Construction a Complete Heyting Algebra-LX, Presented in National Conference on Algebras, Padjajaran University, Bandung, 201

HARINA O.L. MONIM


Ph.D. Student in Mathematics
Department of Mathematics, Faculty of Mathematics and natural Science,
Gadjah Mada University, Yogyakarta-Indonesia.
Department of Mathematics, Faculty of Mathematics and Natural Science,
Papua State University, Manokwari-Indonesia.
e-mail: [email protected]

INDAH EMILIA WIJAYANTI


Department of Mathematics, Faculty of Mathematics and Natural Science,
Gadjah Mada University, Yogyakarta-Indonesia.
e-mail: [email protected]

SRI WAHYUNI
Department of Mathematics, Faculty of Mathematics and Natural Science,
Gadjah Mada University, Yogyakarta-Indonesia.
e-mail: [email protected]
Proceedings of ”The 6th SEAMS-UGM Conference 2011”
Algebra, pp. 175 - 182.

THE FUZZY REGULARITY OF BILINEAR FORM


SEMIGROUPS

KARYATI, SRI WAHYUNI, BUDI SURODJO, SETIADJI

Abstract. A bilinear form semigroup is a semigroup constructed from a bilinear form. In this paper we establish properties of a bilinear form semigroup as a regular semigroup. The
bilinear form semigroup will be denoted by . We get some properties of the fuzzy regular
bilinear form subsemigroup of , i.e: is a fuzzy regular bilinear form subsemigroup of
if and only if , is a regular subsemigroup of , provided . If is a
nonempty subset of , then is a regular subsemigroup of if and only if , the
characteristic function of , is a fuzzy regular bilinear form subsemigroup of . If is a
fuzzy regular bilinear form subsemigroup of , then .
Keywords and Phrases : fuzzy subsemigroup, fuzzy regular bilinear form subsemigroup

1. INTRODUCTION

The basic concept of a fuzzy subset was established by Zadeh in 1965. The concept has been developed in many areas of algebraic structures, such as fuzzy subgroups, fuzzy subrings, fuzzy subsemigroups, etc. Referring to the articles of Asaad (1991), Kandasamy (2003), Mordeson & Malik (1998) and Ajmal (1994), we define a fuzzy subset of a set as a mapping from  into the interval [0, 1], i.e. . Let  be a fuzzy subset of  and ; then the level subset is defined as the set of all elements in  whose images under the fuzzy subset are greater than or equal to , i.e. .
Moreover, let  be a semigroup; then a mapping  is called a fuzzy subsemigroup of the semigroup  if

(1)

The fuzzy subset  is a fuzzy subsemigroup of the semigroup  if and only if every nonempty level subset is a subsemigroup of .
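
As a hedged toy illustration of these definitions (not from the paper): assuming the standard fuzzy subsemigroup condition µ(xy) ≥ min{µ(x), µ(y)}, which condition (1) appears to state, one can verify a small example and its level subsets directly; the semigroup (Z₄, ·) and the membership values below are made up.

```python
# Toy check: a fuzzy subsemigroup and its level subsets on (Z_4, multiplication).
S = [0, 1, 2, 3]
op = lambda x, y: (x * y) % 4           # a finite semigroup
mu = {0: 1.0, 1: 0.9, 2: 0.6, 3: 0.3}   # a fuzzy subset of S

is_fuzzy_subsemigroup = all(mu[op(x, y)] >= min(mu[x], mu[y]) for x in S for y in S)
print(is_fuzzy_subsemigroup)            # True

for t in sorted(set(mu.values()), reverse=True):
    level = [x for x in S if mu[x] >= t]                     # the level subset
    closed = all(op(x, y) in level for x in level for y in level)
    print(t, level, closed)             # every nonempty level subset is a subsemigroup
```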


Let  and  be vector spaces over a field  whose characteristic is zero. A function  is called a bilinear form if it is linear with respect to each variable. Every bilinear form determines two linear transformations, , defined as , and , defined as . In this case,  and  are the dual spaces of  and , respectively. In this paper, we denote by  and  the sets of all linear operators on  and , respectively. For , we can construct the following sets:

The element  is called an adjoint pair with respect to the bilinear form  if  for all  and . Further, we denote the following sets:

Based on these sets, we denote a set as follows:

The structure of the set  is a semigroup with respect to the binary operation defined as . The semigroup  is called a bilinear form semigroup related to the bilinear form .
Let  be a semigroup and ; an element  is called a regular element if there exists  such that . A semigroup is called regular if every element is regular. The element  is called completely regular if there exists  such that  and . A semigroup is called completely regular if every element is a completely regular element.
The purpose of this paper is to define the fuzzy regular bilinear form subsemigroup and to investigate its characteristics.

2. RESULTS

Let be bilinear form semigroups, with respect to the bilinear form


and , respectively. For any , we define the following two sets:

and

We define a fuzzy regular bilinear form subsemigroup of , as follow:



Definition 2.1. If is a fuzzy subsemigroup of and for every there


exists such that:

(2)
then is called a fuzzy regular bilinear form subsemigroup.

Based on the property of a fuzzy subsemigroup, we can investigate the following


proposition:
Proposition 2.1. Fuzzy subsemigroup is a fuzzy regular bilinear form subsemigroup of
if and only if , is a regular subsemigroup of bilinear form semigroup
.

Proof: is a fuzzy subsemigroup if and only if is a subsemigroup of In fact, if is a


fuzzy regular bilinear form subsemigroup of for every , then there
exists such that (2) is fulfilled, i.e. :

Furthermore,

, .

Hence, we obtain i.e. is a regular subsemigroup of bilinear form semigroup


.
Conversely, suppose that ,the level subset is a regular subsemigroup of
bilinear form semigroup , provided . On the other hand there is
such that and , . Set .
Clearly that and for every , . This contradicts the
fact that is a regular subsemigroup of bilinear form semigroup . So, the equation (2)
is satisfied.
The following proposition gives the relation between the nonempty subset of bilinear
form semigroup and the characteristic function of this subset.

Proposition 2.2. If is a nonempty subset of , then is a regular subsemigroup of


bilinear form semigroup if and only if the characteristic function of , is a fuzzy
regular bilinear form subsemigoup of .

Proof: If is a regular subsemigroup of bilinear form semigroup , then the following


equation is fulfilled

and if then . For every , if


i.e. , then . Furthermore, since is a regular

subsemigroup of bilinear form semigroup , then for every there exists


such that or . Therefore,

From this point we have is a fuzzy regular bilinear form subsemigroup of semigroup
.

Conversely, if is a fuzzy regular bilinear form subsemigroup of , then for every


, we get

Hence,

Thus,

Therefore it implies .

In addition, if  then we obtain . Finally, we conclude that . Thus  is a regular subsemigroup of the bilinear form semigroup .

The following proposition gives a condition on a semigroup homomorphism under which the image of a fuzzy regular bilinear form subsemigroup of  is a fuzzy regular bilinear form subsemigroup of  as well, and vice versa.

Proposition 2.3. Let be a semigroup surjective homomorphism from onto .


1. If is fuzzy regular bilinear form subsemigroup of , then is fuzzy regular
bilinear form subsemigroup of
2. If is a fuzzy regular bilinear form subsemigroup of , then is a fuzzy
regular bilinear form subsemigroup of

Proof:
1. For every is a regular subsemigroup bilinear form , provided
. In fact, if there is such that is not a nonempty set
and is not regular semigroup, then there is such that:
, ,i.e. and
, ,

and

Let :

and

Now there is with and


i.e. , .

For any with , we have


or

Clearly, , and so ,
i.e. is not regular. It contradicts that is a fuzzy regular bilinear form
subsemigroup of semigroup
2. If is a fuzzy regular bilinear form subsemigroup of semigroup , then for every
, , there exists such that
. Since is surjective, for every there exists
such that .
For every , , there exists

with

Based on the previous property, we can conclude that for every ,


there exists such that . Furthermore, we
have:

Finally, based on (2), is a fuzzy regular bilinear form subsemigroup of


semigroup .

Proposition 2.4. If is a fuzzy regular bilinear form subsemigroup of semigroup ,


then

Proof: It is always fulfilled . For every , if then

that implies

If , then there exists . Since  is a fuzzy regular bilinear form subsemigroup, we have . Hence

It is proved that , so .

3. CONCLUDING REMARKS

Based on the discussion, by defining a fuzzy regular bilinear form subsemigroup, we


have investigated and proved the following results.
1. The fuzzy semigroup is a fuzzy regular bilinear form subsemigroup of if and
only if , is a regular subsemigroup of bilinear form semigroup
2. If is a nonempty subset of , then is a regular subsemigroup of bilinear form
semigroup if and only if , the characteristic function of , is a fuzzy regular
bilinear form subsemigroup of

3. Let be a semigroup surjective homomorphism from onto


a. If is fuzzy regular bilinear form subsemigroup of , then is fuzzy
regular bilinear form subsemigroup of
b. If is a fuzzy regular bilinear form subsemigroup of , then is a fuzzy
regular bilinear form subsemigroup of
4. If is a fuzzy regular bilinear form subsemigroup of , then

References

[1] AJMAL, NASEEM., Homomorphism of Fuzzy groups, Correspondence Theorem and Fuzzy Quotient Groups.
Fuzzy Sets and Systems 61, p:329-339. North-Holland. 1994
[2] AKTAŞ, HACI, On Fuzzy Relation and Fuzzy Quotient Groups. International Journal of Computational
Cognition Vol 2 ,No 2, p: 71-79. 2004
[3] ASAAD, MOHAMED, Group and Fuzzy Subgroup. Fuzzy Sets and Systems 39, p:323-328. North-Holland. 1991
[4] HOWIE, J.M., An Introduction to Semigroup Theory. Academic Press, Ltd, London, 1976
[5] KANDASAMY, W.B.V., Smarandache Fuzzy Algebra. American Research Press and W.B. Vasantha Kandasamy
Rehoboth. USA. 2003
[6] KARYATI, S.WAHYUNI, B. SURODJO, SETIADJI , Ideal Fuzzy Semigrup. Seminar Nasional MIPA dan Pendidikan
MIPA di FMIPA, Universitas Negeri Yogyakarta, tanggal 30 Mei 2008. Yogyakarta, 2008.
[7] KARYATI, S. WAHYUNI , B. SURODJO, SETIADJI, The Fuzzy Version Of The Fundamental Theorem Of
Semigroup Homomorphism. The 3rd International Conference on Mathematics and Statistics (ICoMS-3)Institut
Pertanian Bogor, Indonesia, 5-6 August 2008. Bogor, 2008.
[8] KARYATI, S. WAHYUNI, B. SURODJO, SETIADJI, Beberapa Sifat Ideal Fuzzy Semigrup yang Dibangun oleh
Subhimpunan Fuzzy. Seminar Nasional Matematika, FMIPA, UNEJ. Jember, 2009.
[9] MORDESON, J.N AND MALIK, D.S. Fuzzy Commutative Algebra. World Scientific Publishing Co. Pte. Ltd.
Singapore, 1998.

KARYATI
Ph. D student, Mathematics Department, Gadjah Mada University
e-mail: [email protected]

SRI WAHYUNI
Mathematics Department, Gadjah Mada University
e-mail: [email protected]

BUDI SURODJO
Mathematics Department, Gadjah Mada University
e-mail: [email protected]

SETIADJI
Mathematics Department, Gadjah Mada University
Proceedings of ”The 6th SEAMS-UGM Conference 2011”
Algebra, pp. 183 – 192.

THE CUNTZ-KRIEGER UNIQUENESS THEOREM


OF LEAVITT PATH ALGEBRAS

KHURUL WARDATI, INDAH EMILIA WIJAYANTI, SRI WAHYUNI

Abstract. Given a directed graph E, we can define a Leavitt path algebra  whose coefficients are in a commutative unital ring R. Leavitt path algebras  generalize Leavitt path algebras over a field K. Generally, both  and  have the same basic properties. However, some differences do exist. In addition to differing in the sufficient condition of the graded uniqueness theorem, they also differ in the sufficient condition of the Cuntz-Krieger uniqueness theorem.
Keywords and Phrases: path algebra, Leavitt path algebra, graded uniqueness theorem, Cuntz-Krieger uniqueness theorem.

1. INTRODUCTION

Any mathematical object involving points and connections between them may be called a graph. Graphs are not only viewed as combinatorial objects that sit at the core of mathematical intuition, but can also be described algebraically. An amalgamation of graph theory and algebra can create new algebras, such as path algebras over a field.
In [1], [2], and [3] the authors explained in detail how to construct Leavitt path algebras over a graph E with coefficients in a field K, denoted . Leavitt path algebras are constructed from path algebras over a field on the Leavitt extended graph with the Cuntz-Krieger conditions.
Leavitt path algebras  over a commutative unital ring R are a generalization of Leavitt path algebras over a field K. Tomforde in [4] constructs  by using the definition of a Leavitt E-family. It is interesting to examine the similarities and differences between the two constructions as well as their properties.
In [5] the two constructions of  and  are compared, and we obtain that the definition of LK(E) can be expressed as the K-algebra constructed by a Leavitt E-family. Conversely,  can be stated as the path R-algebra on the Leavitt extended graph which satisfies certain Cuntz-Krieger conditions. Several properties of LK(E) still hold in LR(E), but many


others are different. One of the differences is the graded
uniqueness theorem of Leavitt path algebras.
This paper is a continuation of the one presented earlier in Bandung. The focus of the previous article was on the graded uniqueness theorem of Leavitt path algebras [5], while the present one is on the Cuntz-Krieger uniqueness theorem for LR(E) and LK(E). Both this paper and the previous one contain not only a review but also detailed proofs. In addition, Tomforde [4] proved that  is a ℤ-graded ring, while in [5] it is described in detail, with a different proof, that  is a ℤ-graded algebra.
The paper is organized into five sections. After the introduction in Section 1, we describe the constructions of Leavitt path algebras in Section 2. In Section 3 we clarify the properties of Leavitt path algebras. The Cuntz-Krieger uniqueness theorem of Leavitt path algebras is explained in Section 4, and finally we conclude the paper in Section 5.

2. CONSTRUCTIONS OF LEAVITT PATH ALGEBRAS

When we mention a graph  in this paper, we mean that a graph  consists of two countable sets  and two functions . The elements of  are called vertices and edges, respectively. For each edge e, s(e) is the source of e and r(e) is the range of e. If s(e) = v and r(e) = w, then we also say that v emits e and that w receives e. The graph E is called row-finite if and only if s−1(v) is finite for every .
Let  be a graph. We recall some terminology for vertices: a vertex  is a sink if  or v ≠ s(e) for all e, is an infinite emitter if , and is a singular vertex if it is a sink or an infinite emitter. The set of singular vertices is denoted . A vertex that is not a singular vertex is called a regular vertex, so that .
A path in a graph E is a sequence of edges  of length  such that  for , and . We consider the vertices in E0 to be paths of length zero and  to be the set of paths of length n. We let  denote the set of paths of finite length. We say that a graph E satisfies Condition (L) if every cycle in E has an exit, where a cycle in E is a path  ∈ E∗\E0 with s() = r(), and an edge f ∈ E1 is called an exit for the path  := e1e2 . . . en iff s(f) = s(ei) but f ≠ ei for some 1 ≤ i ≤ n.
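
The following is a hedged computational sketch of Condition (L) for a finite graph (not from the paper): a cycle has no exit exactly when every vertex on it emits only its cycle edge, so Condition (L) fails precisely when the sub-graph of vertices of out-degree one contains a cycle. The (name, source, range) edge representation and the example graphs are made up for illustration.

```python
# Possible check of Condition (L) for a finite graph E = (E0, E1, r, s).
def satisfies_condition_L(vertices, edges):
    """edges: list of (name, source, range_) triples."""
    out = {v: [e for e in edges if e[1] == v] for v in vertices}
    single = {v: out[v][0] for v in vertices if len(out[v]) == 1}
    for start in single:                 # follow the unique edges from each such vertex
        v, seen = start, set()
        while v in single and v not in seen:
            seen.add(v)
            v = single[v][2]             # range of the unique edge
        if v == start:                   # returned to start: a cycle with no exit
            return False
    return True

# A single loop e at v has no exit; adding a second edge f at v provides an exit.
print(satisfies_condition_L(['v'], [('e', 'v', 'v')]))                        # False
print(satisfies_condition_L(['v', 'w'], [('e', 'v', 'v'), ('f', 'v', 'w')]))  # True
```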
In this section we construct Leavitt path algebras not only over a field, LK(E), but also over a commutative unital ring. These are the following:
2.1. Leavitt Path Algebra over a Field. In [1], [2] and [3] Aranda Pino et al. explained in detail how to construct Leavitt path algebras with coefficients in a field. They begin by reminding the reader of the construction of the standard path algebra of a graph.

Definition 2.1.1. Let K be a field and E = (E0, E1, r, s) be a graph. The path K-algebra over E is
defined as the free K-algebra KE with the relations:

a. vi vj = δij vi for every vi, vj ∈ E0,
b. ei = ei r(ei) = s(ei)ei for every ei ∈ E1.
In addition to the definition of path algebras, the definition of the extended graph is needed to construct Leavitt path algebras. The following are definitions related to the extended graph and Leavitt path algebras.

Definition 2.1.2. Given a graph  we define a new graph called the extended graph, that is , where (E1)∗ := {ei∗ : ei ∈ E1} and the functions r′, s′ satisfy . An edge  is called a real edge and  is called a ghost edge. If the path  is a real path, then  is a ghost path and , for every .

Definition 2.1.3. Let  be a graph. The Leavitt path algebra of E over the field K, denoted LK(E), is defined as the path algebra of the extended graph  which satisfies the Cuntz-Krieger relations:
(CK1) e∗f = δe,f r(e) for every e, f ∈ E1,
(CK2)  for every  that is a regular vertex.

2.2. Leavitt Path Algebra over a Commutative Unital Ring. Tomforde [4] defined Leavitt path algebras over a commutative ring with unit differently. We introduce the Leavitt E-family of a graph E over a ring R to construct the definition of Leavitt path algebras.

Definition 2.2.1. Given a graph and ring R. A set is a


Leavitt E-family over ring R, if :
 consists of pairwise orthogonal idempotents
 the following conditions are satisfied :
1.
2.
3.
4.

Definition 2.2.2. Given a graph E and a commutative unital ring R, the Leavitt path algebra with coefficients in R, denoted LR(E), is the universal R-algebra generated by a Leavitt E-family.
If we examine Definitions 2.1.1, 2.1.2, 2.1.3, 2.2.1 and 2.2.2, that is, if the two constructions are compared, then Definition 2.1.3 of LK(E) can be expressed as the universal K-algebra generated by a Leavitt E-family. On the other hand,  can be stated as the path R-algebra on the Leavitt extended graph which satisfies certain Cuntz-Krieger conditions. Therefore, the following is a brief review of Leavitt path algebras  as a generalization of .

3. THE PROPERTIES OF LEAVITT PATH ALGEBRAS

In [5] the authors reviewed not only how to construct Leavitt path algebras but also their properties. In addition, they reviewed some properties of  without proof and explained in detail several properties of  which still hold in ; these are as follows:
1. All elements in  are non-zero and  for all .
2. Both Leavitt path algebras are unital if  is finite. They both contain local units if  is infinite.
3. Both are ℤ-graded algebras, whose elements are linear combinations of monomials.
Concerning the third property above, there is a small difference between [4] and [5] in stating one of the graded properties, and they prove it differently. Tomforde proved that  is a ℤ-graded ring, while the authors in [5] showed, with a complete proof, that  is a ℤ-graded algebra.
As previously discussed,  is a generalization of . They have some similar properties, but there are also several different properties. One of them is the graded uniqueness theorem of Leavitt path algebras, which was completely reviewed in [5]. The graded uniqueness theorem of  is restated below without proof.

Theorem 3.1. Let E be a graph, and let R be a commutative unital ring. If S is a graded ring
and  : → S is a graded ring homomorphism with the property that  for all v ∈ E0 and for all r ∈ R \ {0}, then  is injective.
In Theorem 3.1 above, if the commutative unital ring R is replaced by a field K, then it is always true that for all r ∈ K\{0} there exists r−1 ∈ K\{0} such that . Therefore the sufficient condition  for all v ∈ E0 and for all r ∈ R\{0} can be replaced by  for all v ∈ E0. Thus we obtain the following corollary.

Corollary 3.2. Let E be a graph, and K be a field. If S is a graded ring and : → S is


a graded ring homomorphism with the property that for all v ∈ E0, then is
injective.
If we compare the graded uniqueness theorems of  and , they differ in the sufficient condition under which the graded ring homomorphism  is injective. For , the sufficient condition is  for all v ∈ E0 and for all r ∈ R \ {0}, while the other is  for all v ∈ E0.
There is still another property of Leavitt path algebras which also differs between  and , namely the Cuntz-Krieger uniqueness theorem. In the following section we discuss this theorem, together with the supporting lemmas and their proofs.

4. THE CUNTZ-KRIEGER UNIQUENESS THEOREM OF


LEAVITT PATH ALGEGRAS

In this section, we consider graphs that satisfy Condition (L). As defined earlier, a graph E is said to satisfy Condition (L) if every cycle in E has an exit, where  is called an exit of the path α if there exists  such that  and . Next we discuss in detail the lemmas that support the proof of the Cuntz-Krieger uniqueness theorem.

Lemma 4.1. Let E is a graph satisfying Condition (L). If F is a finite subset of and
, then there exists a path such that and for every .
Proof : Given and a finite subset , then there are three cases.
CASE 1. : There is a path from v to a sink , i.e. .
Let  be a path with a sink. For any path , we can see that
. It is not possible because is a , and
then and .
CASE 2. : There is no path from v to a sink .
Given . If with , is a path that no
repeated vertices, then for any . But that is
impossible, because this condition will imply that for some . It means
that there is a contradiction with no repetition of vertices. Thus, it is proved that .
CASE 3. : the path whose vertex v is a source has repetition of vertices and there is a path
from v to base point of a cycle in E.
Given any path , which has a repetition of vertices and there exists a
path of the shortest length such that and is the base point of a cycle in E.
Choose a cycle  that has the shortest length with . Because any cycle has an
exit, we can suppose that be an exit for . Let be the segment of from
to .
Consider the following illustration :

 v3
e3 e4 ,
v  e1  e2  v4  f v6
, for every edge in path 
v1 v2 e6 e5 , for every edge at cycle 
 v5

Because of the minimality of length of both  and , the path has the
property that . If we select sufficiently many repetitions of the cycle
, then it can be guaranteed that the length of path is greater than or equal to M, in other
word, . Therefore, we have that that
is impossible since this will result in for some . It is a contradiction
with ). Thus, .

Lemma 4.2. Let E be a graph satisfying Condition (L). For every polynomial x in only real edges with , there exist real paths ,  such that  for some  and some .
Proof: Let E be a graph satisfying Condition (L) and R a commutative ring with unit. Suppose  is a polynomial in only real edges and . We prove that there are paths  such that  for some  and , by mathematical induction on the degree of x (deg x):
 The first step will show that it is correct on deg.x = 0 :
If is a polynomial in only real edges with deg.x = 0 and then
for and , with . Chosen
then .
 The second step, suppose that it is true on deg.x  N – 1, i.e. there exists path
such that for some and , then we will prove that it is true
on deg.x = N : If is a polynomial in only real edges with deg.x = N and
.
 If x does not have a terms of degree 0, then such that each xi is a non
zero polynomial in real edges of degree N–1 or less, and with for
any . By induction hypothesis, is a polynomial non zero of deg. 
N–1, there are such that , for some and .
If we take then .
 If x has a term of degree 0, then we denote with
deg. , and for every . Chosen
, by lemma 4.1. there exists such that and we
have for every . If we take then

Thus, it is proved that that is true for deg.x = N.

Lemma 4.3.Let E be a graph and R be a commutative unital ring. Every polynomial in only
real edges with , if there exists with , then for any edge with
it is the case that .
Proof : let is a non zero polynomial in only real edges with deg.x = k for some
. Since be graded -algebra, i.e. , then ,
where and each , for any . Since , it can be
assummed that . For every with we obtain .
Suppose that then . It contradicts the previous
proposition, that for and . Hence it must be .

Lemma 4.4. Let E be a graph and R be a commutative unital ring. If any non zero polynomial
in only real edges , then there exists E* such that x  0 and x is a polinomial
in only real edges.

Proof : let with , then we can write with ,


for any . We will use mathematical induction to prove that there exists a path
such that , on :
 First step : N = 0. Then is a polynomial in real edges. If we choose
such that then is also a polynomial in real edges. Hence the claim
holds.
 Step of induction : it is assummed that claim holds for case N – 1, i.e., if and
with then there
exists such that and is a polynomial in real edges. The following we
will prove that claim also holds for : Given with
, we may choose then by grouping
of its terms, we can state :
where each is a polynomial in which has N – 1 ghost edges or
less, each with and , and y is a polynomial in real
edges.
 If then , and there exists such that . In
addition, is a polynomial in real edges (inductive hypotesis). If we choose
then and is a polynomial in real edges.
 If then there are three probabilities for :
1) If is a reguler vertex, it means or then
and .
Because of then we obtain
. By inductive hypotesis,
there exists such that is a polynomial in real edges and
. Next, we choose the we find
.
Thus, it is proved that there exists such that and is a polynomial in
real edges.
2) If is a sink, it means then , therefore we can choose
. It seems obviously that and is a polynomial in real
edges.
3) If is an infinite emitter, it means then we may choose such
that If we let then we obtain :
. Since y is a non
zero polynomial in real edges and y = yv  0 then it is clear that and
is also a polynomial in real edges.

Hence the lemma holds for all .


Theorem 4.5. Let E be a graph satisfying Condition (L), and let R be a commutative ring
with unit. If S is a ring and : → S is a ring homomorphism with the property that
for all and for all then is injective.
Proof : Let E be a graph satisfying Condition (L) and R be a commutative ring with unit. Let
: → S is a ring homomorphism with the property that for all and
for all . We will prove that injektif, it means that for any ,
then .
Suppose and . By the lemma 4.4., there exists a path such that
and is a polynomial in real edges. Consequently, based on the lemma 4.2., we obtain that
there are such that for some . Since  is a
homomorphism then . It contradicts with
. Hence we obtain , in other word is injective.
Theorem 4.5 is called the Cuntz-Krieger uniqueness theorem for LR(E). If K is a field, then K is in particular a commutative ring with unit and every non-zero element in K has an inverse ( , ). Based on Theorem 4.5, replacing R with K, the sufficient condition on the ring homomorphism  of , in order that  be injective, is  for . This is because , , . By Lemma 4.2 and the proof of Theorem 4.5, *x = rv or *x = v. Suppose ; then , since  is a ring homomorphism and  or . This contradicts , which is the sufficient condition. This discussion shows that Theorem 4.5 results in the following corollary, which is called the Cuntz-Krieger uniqueness theorem for LK(E).

Corollary 4.6. Let E be a graph satisfying Condition (L) and K be a field. If is a ring
homomorphism from to S with the property that for all then is
injective.

5. CONCLUSION

The Leavitt path algebra LR(E), which is a generalization of the Leavitt path algebra LK(E), has some differences from it in addition to some similarities. One of the differences, besides the graded uniqueness theorem, is the Cuntz-Krieger uniqueness theorem for LR(E) and LK(E). The Cuntz-Krieger uniqueness theorem for LR(E) and LK(E) requires a graph E satisfying Condition (L), i.e. every cycle in E has an exit.
The difference between the two versions lies in the sufficient conditions on the ring homomorphism  such that  is injective. In the Cuntz-Krieger uniqueness theorem for LR(E), it is  for . This sufficient condition can still be applied if the commutative unital ring R is replaced by a field K. Since every non-zero element of the field K has an inverse, the sufficient condition on the ring homomorphism  from LK(E) to S such that  is injective is  for .
Both the graded uniqueness theorem and the Cuntz-Krieger uniqueness theorem have

a similar statement of the theorem, because both have the same sufficient conditions as well.
Ring homomorphism in the Cuntzt-Krieger uniqueness theorem does not require graded
homomorphism but the graph E must satisfy the Condition L. Otherwise, the graded
uniqueness theorem requires the graded homomorphism , but it does not require the graph
satisfying Conditin (L).
As a follow-up, many things can be studied for the Leavitt path algebras LR(E) and LK(E) by developing what the authors have already studied. For example, what are the necessary and sufficient conditions for LR(E) to be simple, semisimple, prime, or semiprime?

References

[1] ABRAMS, G., ARANDA PINO, G., Leavitt Path Algebra of a Graph, J. Algebra 293 (2), 319 – 334, 2005.
[2] ARANDA PINO, G., A Course On Leavitt Path Algebra, ITB, 2010.
[3] ARANDA PINO, G., PERERA, F., MOLINA, M. S., Graph algebras: bridging the gap between analysis and algebra,
University of Malaga Press, Spain, 2007.
[4] TOMFORDE, M, Leavitt Path Algebras With Coefficient In A Commutative Ring, J. Algebra., 2009.
[5] WARDATI, K., ET AL, Teorema Keunikan Graded Aljabar Lintasan Leavitt (The Graded Uniqueness Theorem on
Leavitt Path Algebras), presented at the “National Conference on Algebra 2011” hosted by the Padjadjaran
University, Bandung, on April 30, 2011.

KHURUL WARDATI
Departement of Mathematics Faculty of Science and Technology State Islamic University
Sunan Kalijaga.
e-mail: [email protected]

INDAH EMILIA WIJAYANTI


Departement of Mathematics Faculty of Mathematics and Natural Sciences Gadjah Mada
University.
e-mail: [email protected]

SRI WAHYUNI
Departement of Mathematics, Faculty of Mathematics and Natural Sciences, Gadjah Mada
University.
e-mail: [email protected]
Proceedings of ”The 6th SEAMS-UGM Conference 2011”
Algebra, pp. 193 – 204.

APPLICATION OF FUZZY NUMBER MAX-PLUS ALGEBRA


TO CLOSED SERIAL QUEUING NETWORK WITH FUZZY
ACTIVITY TIME

M. ANDY RUDHITO, SRI WAHYUNI, ARI SUPARWANTO,


F. SUSILO

Abstract. The activity times in a queuing network are seldom precisely known and can therefore be represented by fuzzy numbers, called fuzzy activity times. This paper aims to determine the dynamical model of a closed serial queuing network with fuzzy activity times and its periodic properties using a max-plus algebra approach. The findings show that the dynamics of the network can be modeled as a recursive system of fuzzy number max-plus linear equations. The periodic properties of the network can be obtained from the fuzzy number max-plus eigenvalue and eigenvector of the matrix in the system. In the network, for a given level of risk, the earliest early departure time of a customer can be determined, so that the customer's departure interval times lie in the smallest interval whose lower and upper bounds are periodic.
Keywords and Phrases: max-plus algebra, queuing network, fuzzy activity times, periodic.

1. INTRODUCTION

We discuss a closed serial queuing network of n single servers, with infinite buffer capacity and n customers (Krivulin [4]). The network works on the First-In First-Out (FIFO) principle. In the system, the customers have to pass through the queues consecutively so as to receive service at each server. One cycle of network service is the process from the entry of a customer into the buffer of the 1st server until it leaves the nth server. After completion of service at the nth server, the customer returns to the first queue for a new cycle of network service. Suppose that at the initial time of observation none of the servers is giving service, and the buffer of the ith server contains one customer, for each i = 1, 2, ..., n. It is assumed that the transition of customers from a queue to the next one requires no time.

Figure 1 (Krivulin [5]) gives the initial state of the closed serial queuing network,
where customers are expressed by "•".

   
Figure 1 Closed Serial Queuing Network

The closed serial queuing network can be found in assembly plant systems, such as those assembling cars and electronic goods. Customers in this system are pallets, while the servers are assembly machines. A pallet is a kind of desk or carrier on which components or semi-finished goods are placed and moved between the assembly machines. At first, the 1st pallet enters the buffer of the 1st machine and then enters the 1st machine, while the 2nd pallet enters the buffer of the 1st machine. In the 1st machine, components are placed and prepared for assembly in the next machine. Next, the 1st pallet enters the buffer of the 2nd machine and the 2nd pallet enters the 1st machine. This continues for the n available pallets, until the state in Figure 1 above, the initial state of observation, is reached. After assembly is completed in the nth machine, the assembled goods leave the network, while the pallet goes back to the buffer of the 1st machine to begin a new cycle of network service, and so on.
Max-plus algebra (Baccelli et al. [1]; Heidergott et al. [3]), namely the set of all real numbers R with the operations max and plus, has been used to model a closed serial queuing network algebraically with deterministic activity times (Krivulin [4]; Krivulin [5]). In modeling and analyzing a network, its activity times are sometimes not known, for instance in the design phase, when data on activity times or their distributions are not yet fixed. The activity times can then be estimated based on the experience and opinions of experts and network operators. Such network activity times are modeled using fuzzy numbers and are called fuzzy activity times. Scheduling problems involving fuzzy numbers can be seen in Chanas and Zielinski [2] and Soltoni and Haji [9]. Network models involving fuzzy numbers can be seen in Lüthi and Haring [6].
In this paper we determine the dynamical model of a closed serial queuing network with fuzzy activity times and its periodic properties using a max-plus algebra approach. This approach uses concepts such as fuzzy number max-plus algebra and fuzzy number max-plus eigenvalues and eigenvectors (Rudhito [8]). We discuss a closed serial queuing network as in Krivulin [4] and Krivulin [5], where the crisp activity times are replaced with fuzzy activity times, which can be modeled by fuzzy numbers. The dynamical model of the network can be obtained analogously to the crisp activity time case. The periodic properties of the network can be obtained from the fuzzy number max-plus eigenvalue and eigenvector of the matrix in the system. We will use some concepts and results on max-plus algebra, interval max-plus algebra and fuzzy number max-plus algebra.

2. MAX-PLUS ALGEBRA

In this section we will review some concepts and results of max-plus algebra, matrix
over max-plus algebra and max-plus eigenvalue. Further details can be found in Baccelli, et
al. [1].
Let R : = R { } with R is the set of all real numbers and  : = . Define two
operations  and  such that
a  b := max(a, b) and a  b : = a  b
for every a, b  R  .
We can show that (R , , ) is a commutative idempotent semiring with neutral element  =
 and unity element e = 0. Moreover, (R, , ) is a semifield, that is (R, , ) is a
commutative semiring, where for every a  R there exists a such that a  (a) = 0. Then,
(R, , ) is called max-plus algebra, and is written as Rmax. The relation “  m ” defined on
Rmax as x  m y iff x  y = y. In Rmax, operations  and  are consistent with respect to the
order  m , that is for every a, b, c  Rmax, if a  m b, then a  c  m b  c, and a  c  m b
0 k k 1
x  := 0 , x  := x  x 
k
 c. Define and   : = , for k = 1, 2, ... .

Define R mmax : = {A = (Aij)Aij  Rmax, i = 1, 2, ..., m and j = 1, 2, ..., n}, that is set of
n

all matrices over max-plus algebra. Specifically, for A, B  R n max and   Rmax we define
n

n
(  A)ij =   Aij , (A  B)ij = Aij  Bij and (A  B)ij = A k 1
ik  Bkj .

 0 if i  j
We define matrix E  R n
max , (E )ij : = 
n
and matrix   R n
max , ( )ij :=  for every i
n

 if i  j
0 k k 1
R n 
= En and A = A  A
n
and j . For any matrix A max , one can define A for k = 1,

2, ... . The relation “  m ”defined on R m  m B iff A  B = B. In ( R n n


max , , ),
n
max as A

operations  and  are consistent with respect to the order  m , that is for every A, B, C 

R n n
max , if A
 m B , then A  C  m B  C, and A  C  m B  C .
Define Rmax^n := {x = [x1, x2, ..., xn]^T | xi ∈ Rmax, i = 1, 2, ..., n}. Note that Rmax^n can be
viewed as Rmax^(n×1). The elements of Rmax^n are called vectors over Rmax or shortly vectors. A
vector x ∈ Rmax^n is said to be not equal to the vector ε, written x ≠ ε, if there exists i ∈
{1, 2, ..., n} such that xi ≠ ε.
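As an illustration of the operations above, a small Python sketch follows; the function names and the use of −∞ for ε are choices made only for this example and are not part of the algebraic development.

    import numpy as np

    EPS = -np.inf  # the max-plus neutral element "epsilon"

    def mp_add(A, B):
        """Max-plus matrix 'addition': (A ⊕ B)ij = max(Aij, Bij)."""
        return np.maximum(A, B)

    def mp_mul(A, B):
        """Max-plus matrix 'multiplication': (A ⊗ B)ij = max_k (Aik + Bkj)."""
        m, n = A.shape[0], B.shape[1]
        return np.array([[np.max(A[i, :] + B[:, j]) for j in range(n)] for i in range(m)])

    def mp_eye(n):
        """Max-plus identity E: 0 on the diagonal, epsilon elsewhere."""
        E = np.full((n, n), EPS)
        np.fill_diagonal(E, 0.0)
        return E

    def mp_power(A, k):
        """k-th max-plus power A^(⊗k)."""
        P = mp_eye(A.shape[0])
        for _ in range(k):
            P = mp_mul(P, A)
        return P

    if __name__ == "__main__":
        A = np.array([[1.0, EPS], [3.0, 2.0]])
        B = np.array([[0.0, 4.0], [EPS, 1.0]])
        print(mp_add(A, B))
        print(mp_mul(A, B))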


Let G = (V, A), with V = {1, 2, ..., p} a nonempty finite set whose elements are
called nodes and A a set of ordered pairs of nodes called arcs. A directed graph G is said to be weighted
if every arc (j, i) ∈ A corresponds to a real number Aij. The real number Aij is called the
weight of the arc (j, i), and is written w(j, i). In a pictorial representation of a weighted directed
graph, arcs are labelled by their weights. Define the precedence graph of a matrix A ∈ Rmax^(n×n) as the
weighted directed graph G(A) = (V, A) with V = {1, 2, ..., n} and A = {(j, i) | w(j, i) = Aij ≠ ε}.

Conversely, for every weighted directed graph G = (V, A), one can define a matrix A ∈
Rmax^(n×n), called the weighting matrix of the graph G, where Aij = w(j, i) if (j, i) ∈ A and
Aij = ε if (j, i) ∉ A. The mean weight of a path is defined as the sum of the
weights of the individual arcs of this path, divided by the length of this path. If such a path is
a circuit one talks about the mean weight of the circuit, or simply the cycle mean. It follows
that a formula for the maximum cycle mean λmax(A) in G(A) is
λmax(A) = ⊕_{k=1}^{n} (1/k) ⊗ ⊕_{i=1}^{n} (A^⊗k)ii.

The matrix A ∈ Rmax^(n×n) is said to be irreducible if its precedence graph G(A) = (V, A) is
strongly connected, that is, for every i, j ∈ V, i ≠ j, there is a path from i to j. We can show
that the matrix A ∈ Rmax^(n×n) is irreducible if and only if (A ⊕ A^⊗2 ⊕ ... ⊕ A^⊗(n−1))ij ≠ ε for
every i, j with i ≠ j (Schutter, 1996).
Given A ∈ Rmax^(n×n). A scalar λ ∈ Rmax is called a max-plus eigenvalue of the matrix A if there
exists a vector v ∈ Rmax^n with v ≠ εn×1 such that A ⊗ v = λ ⊗ v. The vector v is called a max-plus
eigenvector of A associated with λ. We can show that λmax(A) is a max-plus eigenvalue
of A. For the matrix B = (−λmax(A)) ⊗ A, let B* = B ⊕ B^⊗2 ⊕ ... ⊕ B^⊗n; if (B*)ii = 0, then the i-th column of B* is an
eigenvector corresponding to the eigenvalue λmax(A). Such eigenvectors are called fundamental
max-plus eigenvectors associated with the eigenvalue λmax(A) (Baccelli, et al. [1]). A linear
combination of fundamental max-plus eigenvectors of A is also an eigenvector
associated with λmax(A). We can show that if the matrix A ∈ Rmax^(n×n) is irreducible, then λmax(A) is
the unique max-plus eigenvalue of A and the max-plus eigenvector v associated with λmax(A) satisfies
vi ≠ ε for every i ∈ {1, 2, ..., n} (Baccelli, et al. [1]).
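The following sketch (an illustration only, reusing the max-plus product of the previous sketch) computes λmax(A) from the cycle-mean formula above and reads off a fundamental eigenvector from a column of B* = B ⊕ B^⊗2 ⊕ ... ⊕ B^⊗n whose diagonal entry is 0.

    import numpy as np

    EPS = -np.inf

    def mp_mul(A, B):
        # (A ⊗ B)ij = max_k (Aik + Bkj)
        m, n = A.shape[0], B.shape[1]
        return np.array([[np.max(A[i, :] + B[:, j]) for j in range(n)] for i in range(m)])

    def max_cycle_mean(A):
        """lambda_max(A) = max over k = 1..n and i of (A^(⊗k))_ii / k."""
        n = A.shape[0]
        best, P = EPS, np.where(np.eye(n) == 1, 0.0, EPS)
        for k in range(1, n + 1):
            P = mp_mul(P, A)                     # P = A^(⊗k)
            best = max(best, np.max(np.diag(P)) / k)
        return best

    def fundamental_eigenvector(A):
        """Column i of B* = B ⊕ ... ⊕ B^(⊗n) with (B*)_ii = 0, where B = (−λ) ⊗ A."""
        n = A.shape[0]
        lam = max_cycle_mean(A)
        B = A - lam                              # entrywise ⊗ by −λ
        P, Bstar = B.copy(), B.copy()
        for _ in range(n - 1):
            P = mp_mul(P, B)
            Bstar = np.maximum(Bstar, P)
        i = int(np.argmax(np.isclose(np.diag(Bstar), 0.0)))
        return lam, Bstar[:, i]

    if __name__ == "__main__":
        A = np.array([[2.0, 5.0], [3.0, 1.0]])
        lam, v = fundamental_eigenvector(A)
        print(lam, v)                            # A ⊗ v equals λ ⊗ v componentwise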

3. INTERVAL MAX-PLUS ALGEBRA

In this section we will review some concepts and results of interval max-plus algebra,
matrix over interval max-plus algebra and interval max-plus eigenvalue. Further details can
be found in Rudhito, et al. [7] and Rudhito [8].
The (closed) max-plus interval x in Rmax is a subset of Rmax of the form
x = [x̲, x̄] = {x ∈ Rmax | x̲ ≼m x ≼m x̄},
which is shortly called an interval. The interval x ⊆ y if and only if y̲ ≼m x̲ ≼m x̄ ≼m ȳ.
In particular, x = y if and only if x̲ = y̲ and x̄ = ȳ. The number x ∈ Rmax can be represented
as the interval [x, x]. Define I(R)ε := {x = [x̲, x̄] | x̲, x̄ ∈ R, ε ≼m x̲ ≼m x̄} ∪ {ε}, where
ε := [ε, ε]. Define x ⊕ y = [x̲ ⊕ y̲, x̄ ⊕ ȳ] and x ⊗ y = [x̲ ⊗ y̲, x̄ ⊗ ȳ] for every x, y
∈ I(R)ε. We can show that (I(R)ε, ⊕, ⊗) is a commutative idempotent semiring with neutral
element ε = [ε, ε] and unity element 0 = [0, 0]. This commutative idempotent semiring
(I(R)ε, ⊕, ⊗) is called interval max-plus algebra, and is written as I(R)max. The relation
"≼Im" defined on I(R)max by x ≼Im y ⟺ x ⊕ y = y is a partial order on I(R)max. Notice that
x ⊕ y = y ⟺ x̲ ≼m y̲ and x̄ ≼m ȳ.
m n
Define I(R) max := {A = (Aij)Aij  I(R)max, i = 1, 2, ..., m, j = 1, 2, ..., n}. The
m n
elements of I(R) max are called matrices over interval max-plus algebra or shortly interval
matrices. The operations on interval matrices can be defined in the same way with the
m n
operations on matrices over max-plus algebra. For any matrix A  I(R) max , Define the

matrix A = ( A ij )  R m n m n
max and A = ( A ij )  R max , which are called lower bound matrix

and upper bound matrix of A, respectively. Define a matrix interval of A, that is [ A , A ] =


m n
{A  R max  A m A m A m n n n
} and I( R max )b = { [ A , A ]  A (R) max }. The matrix
m n
interval [ A , A ] and [ B , B ]  I(R max ) b are equal if A = B and A = B . We can show
m n
that for every matrix interval A  I(R max ) we can determine matrix interval
m n
[ A , A ]I( R max )b and conversely. The matrix interval [ A , A ] is called matrix interval
associated with the interval matrix A, and is written as A  [ A , A ]. Moreover, we have
  A  [   A ,   A ], A  B  [ A  B , A  B ] and A  B  [ A  B , A  B ].
Define I(R)max^n := {x = [x1, x2, ..., xn]^T | xi ∈ I(R)max, i = 1, 2, ..., n}. Note that
I(R)max^n can be viewed as I(R)max^(n×1). The elements of I(R)max^n are called interval vectors over
I(R)max or shortly interval vectors. An interval vector x ∈ I(R)max^n is said to be not equal to the
interval vector ε, written x ≠ ε, if there exists i ∈ {1, 2, ..., n} such that xi ≠ ε.
An interval matrix A ∈ I(R)max^(n×n), where A ≈ [A̲, Ā], is said to be irreducible if every
matrix A ∈ [A̲, Ā] is irreducible. We can show that an interval matrix A ∈ I(R)max^(n×n), where
A ≈ [A̲, Ā], is irreducible if and only if A̲ ∈ Rmax^(n×n) is irreducible (Rudhito, et al. [7]).
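As an illustration, since the interval operations reduce to bound-wise max-plus operations, an interval matrix can simply be stored as a pair of lower and upper bound matrices; the Python sketch below is only an example of this bookkeeping, with made-up entries.

    import numpy as np

    EPS = -np.inf

    def mp_mul(A, B):
        # max-plus product: (A ⊗ B)ij = max_k (Aik + Bkj)
        m, n = A.shape[0], B.shape[1]
        return np.array([[np.max(A[i, :] + B[:, j]) for j in range(n)] for i in range(m)])

    # An interval matrix is represented by its bound matrices (Alow, Aup).
    def imp_add(Alow, Aup, Blow, Bup):
        # A ⊕ B ≈ [Alow ⊕ Blow, Aup ⊕ Bup]
        return np.maximum(Alow, Blow), np.maximum(Aup, Bup)

    def imp_mul(Alow, Aup, Blow, Bup):
        # A ⊗ B ≈ [Alow ⊗ Blow, Aup ⊗ Bup]
        return mp_mul(Alow, Blow), mp_mul(Aup, Bup)

    if __name__ == "__main__":
        Alow = np.array([[1.0, EPS], [2.0, 3.0]]); Aup = np.array([[2.0, EPS], [3.0, 4.0]])
        Blow = np.array([[0.0], [1.0]]);           Bup = np.array([[0.5], [1.5]])
        print(imp_mul(Alow, Aup, Blow, Bup))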

4. FUZZY NUMBER MAX-PLUS ALGEBRA

In this section we will review some concepts and results of fuzzy number max-plus
algebra, matrix over fuzzy number max-plus algebra and fuzzy number max-plus eigenvalue.
Further details can be found in Rudhito [8].
A fuzzy set K̃ in a universal set X is represented as the set of ordered pairs K̃ = {(x,
μK̃(x)) | x ∈ X}, where μK̃ is the membership function of the fuzzy set K̃, which is a mapping
from the universal set X to the closed interval [0, 1]. The support of a fuzzy set K̃ is supp(K̃) = {x ∈ X
| μK̃(x) > 0}. The height of a fuzzy set K̃ is height(K̃) = sup_{x∈X} μK̃(x). A fuzzy set K̃ is
said to be normal if height(K̃) = 1. For a number α ∈ [0, 1], the α-cut of a fuzzy set K̃ is
cutα(K̃) = K^α = {x ∈ X | μK̃(x) ≥ α}. A fuzzy set K̃ is said to be convex if K^α is convex,
that is, contains the line segment between any two points of K^α, for every α ∈ [0, 1].
A fuzzy number ã is defined as a fuzzy set in the universal set R which satisfies the
following properties: i) it is normal, that is a^1 ≠ ∅; ii) for every α ∈ (0, 1], a^α is closed in R, that
is, there exist a̲^α, ā^α ∈ R with a̲^α ≤ ā^α such that a^α = [a̲^α, ā^α] = {x ∈ R | a̲^α ≤ x ≤
ā^α}; iii) supp(ã) is bounded. For α = 0, define a^0 = [inf(supp(ã)), sup(supp(ã))]. Since
every closed interval in R is convex, a^α is convex for every α ∈ [0, 1], hence ã is convex.
Let F(R) ~ := F(R)  { ~ }, where F(R) is set of all fuzzy numbers and ~ : = {}
~ ~
with the -cut of ~ is   = [,]. Define two operations  and  such that for every
~  
a~ , b  F(R) ~ , with a = [ a , a  ]  I(R)max and b = [ b , b  ] I(R)max,
~ and b~ , written a~ ~ ~
i) Maximum of a b , is a fuzzy number whose -cut is interval
 
b , a   b  ] for every   [0, 1]
[a 
~ and b~ , written a~ 
ii) Addition of a
~ ~
b , is a fuzzy number whose -cut is interval
   
[ a  b , a  b ] for every   [0, 1].
~ ~
We can show that (F(R) ~ ,  ,  ) is a commutative idempotent semiring. The
~ ~
commutative idempotent semiring F(R)max := (F(R) ~ ,  ,  ) is called fuzzy number max-
plus algebra, and is written as F(R)max (Rudhito, et al. [8]).
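As an illustration of these α-cut-wise operations, the sketch below stores a triangular fuzzy number by its α-cut bounds on a grid of α values; the grid, the triangular shape and the numerical values are choices made only for this example.

    import numpy as np

    ALPHAS = np.linspace(0.0, 1.0, 11)  # grid of alpha levels

    def tfn(a, b, c):
        """Alpha-cuts of a triangular fuzzy number (a, b, c): [a + α(b−a), c − α(c−b)]."""
        return a + ALPHAS * (b - a), c - ALPHAS * (c - b)

    def f_max(x, y):
        """Fuzzy max-plus 'maximum': alpha-cut-wise maximum of the bounds."""
        return np.maximum(x[0], y[0]), np.maximum(x[1], y[1])

    def f_add(x, y):
        """Fuzzy max-plus 'addition': alpha-cut-wise ordinary addition of the bounds."""
        return x[0] + y[0], x[1] + y[1]

    if __name__ == "__main__":
        t1 = tfn(3, 4, 6)       # e.g. a fuzzy service time "about 4"
        t2 = tfn(2, 5, 7)
        low, up = f_add(t1, t2)
        print(low[0], up[0])    # alpha = 0 cut of t1 ⊗ t2: [5, 13]
        print(low[-1], up[-1])  # alpha = 1 cut: [9, 9]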
m n ~ ~ ~
Define F(R) max := { A = ( A ij) A ij  F(R)max, i = 1, 2, ..., m and j = 1, 2, ..., n }. The
n
elements of F(R) mmax are called matrices over fuzzy number max-plus algebra or shortly
fuzzy number matrices. The operations on fuzzy number matrices can be defined in the same
~ n
way with the operations on matrices over max-plus algebra. Define matrix E  F(R) nmax ,
~
 0 if i  j ~ ~
, and matrix   F(R) max , with (  )ij := ~ for every i and j.
~ n n
with ( E )ij : = 
~
 if i  j
~ ~
For every A  F(R) mmax n and   [0, 1], define -cut matrix of A as the interval
  ~  
matrix A = ( Aij ), with Aij is the -cut of A ij for every i and j. Define matrix A = ( Aij )
mn
 R max and A = ( Aij )  R mn
max which are called lower bound and upper bound of matrix
~ ~ n
A, respectively. We can conclude that the matrices A , B  F(R) m  
max are equal iff A = B ,
Ap p li c a t i on of Fu z z y Nu m b er M a x - Plu s Alg eb ra t o C los ed Seri a l Qu eu i n g Net work … 199

that isAij = Bij for every   [0, 1] and for every i and j. For every fuzzy number matrix
~  ~ ~ ~
A , A  [ A , A  ]. Let   F(R)max, A , B  F(R) mmax n . We can show that   A) 
  

[  A ,  
 A ] and (A  B)  [ A  B , A  B ] for every   [0, 1]. Let
~ ~    
A  F(R) mmax p , A  F(R) max
p n
. We can show that (A  B)  [ A  B , A  B ] for
every   [0, 1].
Define F(R)max^n := {x̃ = [x̃1, x̃2, ..., x̃n]^T | x̃i ∈ F(R)max, i = 1, ..., n}. The
elements of F(R)max^n are called fuzzy number vectors over F(R)max or shortly fuzzy number
vectors. A fuzzy number vector x̃ ∈ F(R)max^n is said to be not equal to the fuzzy number vector
ε̃, written x̃ ≠ ε̃, if there exists i ∈ {1, 2, ..., n} such that x̃i ≠ ε̃.
~ n n n
Fuzzy number matrix A  F(R) nmax is said to be irreducible if A  I(R) max is
~
irreducible for every   [0, 1]. We can show that A is irreducible if and only if A 
0

n
R nmax is irreducible (Rudhito, et al. [7]).
Let Ã ∈ F(R)max^(n×n). A fuzzy number scalar λ̃ ∈ F(R)max is called a fuzzy number max-
plus eigenvalue of the matrix Ã if there exists a fuzzy number vector ṽ ∈ F(R)max^n with ṽ ≠
ε̃n×1 such that Ã ⊗ ṽ = λ̃ ⊗ ṽ. The vector ṽ is called a fuzzy number max-plus
eigenvector of the matrix Ã associated with λ̃. Given Ã ∈ F(R)max^(n×n). We can show that the
fuzzy number scalar λ̃max(Ã) = ⋃_{α∈[0,1]} λ̃max^α, where λ̃max^α is the fuzzy set in R with membership
function μλ̃max^α(x) = α · χλmax^α(x), and χλmax^α is the characteristic function of the set [λmax(A̲^α),
λmax(Ā^α)], is a fuzzy number max-plus eigenvalue of the matrix Ã. Based on the fundamental
max-plus eigenvectors associated with the eigenvalues λmax(A̲^α) and λmax(Ā^α), we can find the
fundamental fuzzy number max-plus eigenvector associated with λ̃max (Rudhito [8]).
Moreover, if the matrix Ã is irreducible, then λ̃max(Ã) is the unique fuzzy number max-plus
eigenvalue of the matrix Ã and the fuzzy number max-plus eigenvector ṽ associated with
λ̃max(Ã) satisfies ṽi ≠ ε̃ for every i ∈ {1, 2, ..., n}.

5. DYNAMICAL MODEL OF A CLOSED SERIAL QUEUING NETWORK


WITH FUZZY ACTIVITY TIME

We discuss the closed serial queuing network of n single servers, with infinite buffer
capacity and n customers, as in Figure 1.
Let ãi(k) = the fuzzy arrival time of the kth customer at server i,
d̃i(k) = the fuzzy departure time of the kth customer at server i,
t̃i = the fuzzy service time of the kth customer at server i,
for k = 1, 2, ... and i = 1, 2, ..., n.
The dynamics of the queue at server i can be written as
d̃i(k) = max(t̃i + ãi(k), t̃i + d̃i(k − 1))   (1)
ãi(k) = d̃n(k − 1) if i = 1, and ãi(k) = d̃i−1(k − 1) if i = 2, ..., n.   (2)
Using fuzzy number max-plus algebra notation, equation (1) can be written as
d̃i(k) = (t̃i ⊗ ãi(k)) ⊕ (t̃i ⊗ d̃i(k − 1)).   (3)
Let d̃(k) = [d̃1(k), d̃2(k), ..., d̃n(k)]^T, ã(k) = [ã1(k), ã2(k), ..., ãn(k)]^T and let T̃ be the
n × n diagonal matrix with diagonal entries t̃1, ..., t̃n and all off-diagonal entries equal to ε̃;
then equations (3) and (2) can be written as
d̃(k) = (T̃ ⊗ ã(k)) ⊕ (T̃ ⊗ d̃(k − 1)),   (4)
ã(k) = G̃ ⊗ d̃(k − 1),   (5)
with G̃ the n × n matrix whose entries are (G̃)1n = 0̃, (G̃)i,i−1 = 0̃ for i = 2, ..., n, and ε̃
elsewhere.

Substituting equation (5) into equation (4), we obtain
d̃(k) = (T̃ ⊗ G̃ ⊗ d̃(k − 1)) ⊕ (T̃ ⊗ d̃(k − 1))
      = T̃ ⊗ (G̃ ⊕ Ẽ) ⊗ d̃(k − 1)
      = Ã ⊗ d̃(k − 1),   (6)
with the fuzzy number matrix Ã = T̃ ⊗ (G̃ ⊕ Ẽ), that is, the n × n matrix with (Ã)11 = (Ã)1n = t̃1,
(Ã)ii = (Ã)i,i−1 = t̃i for i = 2, ..., n, and all other entries equal to ε̃.
Equation (6) is the dynamical model of the closed serial queuing network.
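As an illustration, for a fixed α the α-cut bounds of the fuzzy departure times follow the crisp recursions d̲(k) = A̲ ⊗ d̲(k − 1) and d̄(k) = Ā ⊗ d̄(k − 1). The sketch below runs these recursions; the service time bounds are invented example data.

    import numpy as np

    EPS = -np.inf

    def mp_mul(A, B):
        # max-plus product: (A ⊗ B)ij = max_k (Aik + Bkj)
        m, n = A.shape[0], B.shape[1]
        return np.array([[np.max(A[i, :] + B[:, j]) for j in range(n)] for i in range(m)])

    def queue_matrix(t):
        """Matrix A of model (6) for crisp service times t = [t1, ..., tn]:
        entry (i, i) and entry (i, i-1) (cyclically, so row 0 also uses column n-1) equal t[i]."""
        n = len(t)
        A = np.full((n, n), EPS)
        for i in range(n):
            A[i, i] = t[i]
            A[i, (i - 1) % n] = t[i]
        return A

    if __name__ == "__main__":
        # Example alpha-cut bounds of three fuzzy service times (made-up data).
        t_low, t_up = [3.0, 4.0, 2.0], [5.0, 6.0, 3.0]
        A_low, A_up = queue_matrix(t_low), queue_matrix(t_up)
        d_low = d_up = np.zeros((3, 1))            # crisp early departure times d(0) = 0
        for k in range(1, 6):
            d_low, d_up = mp_mul(A_low, d_low), mp_mul(A_up, d_up)
            print(k, d_low.ravel(), d_up.ravel())  # bounds of the departure times d(k)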

6. PERIODIC PROPERTIES OF A CLOSED SERIAL QUEUING


NETWORK WITH FUZZY ACTIVITY TIME

The recursive equation (6) of the dynamical model of the closed serial queuing network can be
expressed in terms of the early departure time vector d̃(0) of the customers, with α-cut d^α(0) ≈
[d̲^α(0), d̄^α(0)] for every α ∈ [0, 1]. For every α ∈ [0, 1] we have
d^α(k) = A^α ⊗ d^α(k − 1) ≈ [A̲^α ⊗ d̲^α(k − 1), Ā^α ⊗ d̄^α(k − 1)] = [(A̲^α)^⊗k ⊗ d̲^α(0), (Ā^α)^⊗k ⊗ d̄^α(0)] ≈ (A^α)^⊗k ⊗ d^α(0).
Thus, for every α ∈ [0, 1], d^α(k) = (A^α)^⊗k ⊗ d^α(0), and hence d̃(k) = Ã^⊗k ⊗ d̃(0). Since the
early departure times of the customers can be determined exactly, d̃(0) is a crisp time, that is, a
point fuzzy number, with d^α(0) ≈ [d̲^α(0), d̄^α(0)] where d̲^α(0) = d̄^α(0) for every α ∈ [0, 1].
Since the precedence graph of the matrix A̲^0 in the model of the closed serial queuing network
(Figure 1) is strongly connected, the matrix A̲^0 is irreducible. Hence, the matrix Ã in
equation (6) is irreducible. Thus, the matrix Ã has a unique fuzzy number max-plus eigenvalue,
namely λ̃max(Ã), and ṽ is the fundamental fuzzy number max-plus eigenvector associated
with λ̃max(Ã), where ṽi ≠ ε̃ for every i ∈ {1, 2, ..., n}.
We construct a fuzzy number vector ṽ*, whose α-cut vector is v*^α ≈ [v̲*^α, v̄*^α],
using the following steps. For every α ∈ [0, 1] and i = 1, 2, ..., n, form
1. v̲^α := c1 ⊗ v̲^α and v̄^α := c1 ⊗ v̄^α, with c1 = −min_i(v̲i^0);
2. v̲^α := c2(α) ⊗ v̲^α and v̄^α := c2(α) ⊗ v̄^α, with c2(α) = −min_i(v̲i^α − v̲i^0);
3. v̄^α := c3 ⊗ v̄^α, with c3 = −min_i(v̄i^0 − v̲i^0);
4. v̲*^α := v̲^α and v̄*^α := c4(α) ⊗ v̄^α, with c4(α) = −min_i(v̄i^0 − v̄i^α).
The fuzzy number vector ṽ* is also a fuzzy number max-plus eigenvector associated with
λ̃max(Ã). From the construction above, the components of v̲*^0, that is, the v̲*i^0, are all non-negative
and there exists i ∈ {1, 2, ..., n} such that v̲*i^α = 0 for every α ∈ [0, 1]. Meanwhile, its α-cut
vector is the smallest interval, in the sense that min_i(v̄*i^0 − v̲*i^0) = 0, among
all possible fuzzy number max-plus eigenvectors obtained by modifying the fundamental fuzzy
number max-plus eigenvector ṽ so that all the lower bounds of its components are non-
negative and at least one of them is zero.

Since the fuzzy number vector ṽ* is a fuzzy number max-plus eigenvector associated
with λ̃max(Ã), we have
Ã ⊗ ṽ* = λ̃max(Ã) ⊗ ṽ*, that is, A^α ⊗ v*^α = λmax^α(A) ⊗ v*^α, that is,
[A̲^α ⊗ v̲*^α, Ā^α ⊗ v̄*^α] = [λmax(A̲^α) ⊗ v̲*^α, λmax(Ā^α) ⊗ v̄*^α].
Hence A̲^α ⊗ v̲*^α = λmax(A̲^α) ⊗ v̲*^α and Ā^α ⊗ v̄*^α = λmax(Ā^α) ⊗ v̄*^α
for every α ∈ [0, 1].
For some α ∈ [0, 1], we can take the early departure time of the customers d̃(0) = v̲*^α,
that is, the earliest early departure time of the customers, such that the lower bounds of the
customer departure time intervals are periodic. This is because there exists i ∈ {1, 2, ..., n}
such that v̲*i^α = 0 for every α ∈ [0, 1]. Since the operations ⊕ and ⊗ on matrices are consistent
with respect to the order "≼m", we have
(A̲^α)^⊗k ⊗ v̲*^α ≼m (Ā^α)^⊗k ⊗ v̲*^α ≼m (Ā^α)^⊗k ⊗ v̄*^α.
This results in
d^α(k) ≈ [(A̲^α)^⊗k ⊗ v̲*^α, (Ā^α)^⊗k ⊗ v̲*^α] ⊆ [(A̲^α)^⊗k ⊗ v̲*^α, (Ā^α)^⊗k ⊗ v̄*^α]
= [(λmax(A̲^α))^⊗k ⊗ v̲*^α, (λmax(Ā^α))^⊗k ⊗ v̄*^α]
= [(λmax(A̲^α))^⊗k, (λmax(Ā^α))^⊗k] ⊗ [v̲*^α, v̄*^α]
= [λmax(A̲^α), λmax(Ā^α)]^⊗k ⊗ [v̲*^α, v̄*^α] for every k = 1, 2, 3, ... .
Thus for some α ∈ [0, 1], the vector v̲*^α is the earliest early departure time of the customers, so
that the customer departure time intervals lie in the smallest intervals whose lower
bound and upper bound are periodic with periods λmax(A̲^α) and λmax(Ā^α), respectively.
In the same way as above, we can show that for some α ∈ [0, 1], if we take the early
departure time d̃(0) = v, where v̲*^α ≼m v ≼m v̄*^α, then we have
d^α(k) ≈ [(A̲^α)^⊗k ⊗ v, (Ā^α)^⊗k ⊗ v] ⊆ [(A̲^α)^⊗k ⊗ v̲*^α, (Ā^α)^⊗k ⊗ v̄*^α]
= [λmax(A̲^α), λmax(Ā^α)]^⊗k ⊗ [v̲*^α, v̄*^α].
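The periodicity of the bounds can also be checked numerically. The sketch below (an illustration only, with made-up service times) computes a max-plus eigenvector of the lower bound matrix, uses its shifted version as the early departure times, and shows that every departure time then increases by exactly λmax per cycle.

    import numpy as np

    EPS = -np.inf

    def mp_mul(A, B):
        m, n = A.shape[0], B.shape[1]
        return np.array([[np.max(A[i, :] + B[:, j]) for j in range(n)] for i in range(m)])

    def eig(A):
        """Max-plus eigenvalue (max cycle mean) and an eigenvector of an irreducible A."""
        n = A.shape[0]
        lam, P = EPS, np.where(np.eye(n) == 1, 0.0, EPS)
        for k in range(1, n + 1):
            P = mp_mul(P, A)                     # P = A^(⊗k)
            lam = max(lam, np.max(np.diag(P)) / k)
        B = A - lam
        Bplus, Q = B.copy(), B.copy()
        for _ in range(n - 1):
            Q = mp_mul(Q, B)
            Bplus = np.maximum(Bplus, Q)
        col = int(np.argmax(np.isclose(np.diag(Bplus), 0.0)))
        return lam, Bplus[:, [col]]

    if __name__ == "__main__":
        t = [3.0, 4.0, 2.0]                      # lower bound service times (example data)
        n = len(t)
        A = np.full((n, n), EPS)
        for i in range(n):
            A[i, i] = A[i, (i - 1) % n] = t[i]
        lam, v = eig(A)
        d = v - np.min(v)                        # shift so the smallest component is 0
        for k in range(1, 5):
            d = mp_mul(A, d)
            print(k, d.ravel())                  # each cycle adds exactly lam to every entry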

References

[1] BACCELLI, F., COHEN, G., OLSDER, G.J. AND QUADRAT, J.P., Synchronization and Linearity, John Wiley &
Sons, New York, 2001.
[2] CHANAS, S. AND ZIELINSKI, P., Critical path analysis in the network with fuzzy activity times, Fuzzy Sets and
Systems, 122, 195–204, 2001.
[3] HEIDERGOTT, B., OLSDER, J. G AND WOUDE, J., Max Plus at Work, Princeton, Princeton University Press,
2005.
[4] KRIVULIN, N.K., A Max-Algebra Approach to Modeling and Simulation of Tandem Queuing Systems.
Mathematical and Computer Modeling, 22 , N.3, 25-31, 1995.
[5] KRIVULIN, N.K., The Max-Plus Algebra Approach in Modelling of Queueing Networks, Proc. 1996 Summer
Computer Simulation Conf., Portland, OR, July 21-25, 485-490, 1996.
[6] LÜTHI, J. AND HARING, G., Fuzzy Queueing Network Models of Computing Systems, Proceedings of the 13th
UK Performance Engineering Workshop, Ilkley, UK, Edinburgh University Press, July 1997.
[7] RUDHITO, A., WAHYUNI, S., SUPARWANTO, A. AND SUSILO, F., Matriks atas Aljabar Max-Plus Interval. Jurnal
Natur Indonesia 13, No. 2., 94-99, 2011.
[8] RUDHITO, A., Aljabar Max-Plus Bilangan Kabur dan Penerapannya pada Masalah Penjadwalan dan Jaringan
Antrian Kabur, Disertasi: Program S3 Matematika FMIPA Universitas Gadjah Mada, Yogyakarta, 2011.
[9] SOLTONI, A. AND HAJI, R., A Project Scheduling Method Based on Fuzzy Theory, Journal of Industrial and
Systems Engineering, 1, No.1, 70 – 80, 2007.

M. ANDY RUDHITO
Department of Mathematics and Natural Science Education, Sanata Dharma University,
Yogyakarta, Indonesia.
e-mail: [email protected]

SRI WAHYUNI
Department of Mathematics, Gadjah Mada University, Yogyakarta, Indonesia.
e-mail: [email protected]

ARI SUPARWANTO
Department of Mathematics, Gadjah Mada University, Yogyakarta, Indonesia.
e-mail: [email protected]

F. SUSILO
Department of Mathematics, Sanata Dharma University, Yogyakarta, Indonesia.
e-mail: [email protected]
Proceedings of ”The 6th SEAMS-UGM Conference 2011”
Algebra, pp. 205–212.

ENUMERATING OF STAR-MAGIC COVERINGS AND


CRITICAL SETS ON COMPLETE BIPARTITE GRAPHS

M. Roswitha, E. T. Baskoro, H. Assiyatun, T. S. Martini, N. A. Sudibyo

Abstract. A simple graph G = (V, E) admits an H-magic covering if every edge in G
belongs to a subgraph of G that is isomorphic to H. Let Km,n be a complete bipartite
graph on m and n vertices. Then a graph G is K1,n -magic if there is a total labeling
f : V ∪ E → {1, 2, ..., |V | + |E|} such that for each subgraph G′ = (V ′, E ′) of G isomorphic
to K1,n we have Σv∈V ′ f (v) + Σe∈E ′ f (e) = m(f ), where m(f ) is a positive integer
called the magic sum. When f (V ) = {1, 2, ..., |V |} we say that G is K1,n -super magic and
we denote its supermagic sum by s(f ).
In this paper we enumerate some of the possible K1,n -magic coverings of Km,n for
m = n, and we also provide all critical sets for certain K1,n -magic coverings, n = 2, 3, of
Km,n .

Keywords and Phrases: K1,n -magic covering, critical sets, complete bipartite graph.

1. INTRODUCTION
We consider finite and simple graphs. The vertex set of a graph G is denoted
by V (G) while the edge set by E(G). An edge-covering of G is a family of different
subgraphs H1 , . . . Hk such that any edge of E belongs to at least one of the subgraphs
Hi , 1 ≤ i ≤ k. If every Hi is isomorphic to a given graph H, then G admits an H-
covering. A total labeling of G is an injection function f : V ∪ E → {1, 2, . . . |V | + |E|}
such that for each subgraph H 0 = (V 0 , E 0 ) of G isomorphic to H, Σv∈V 0 f (v)+Σe∈E 0 f (e)
is constant. Further, if f (V ) = {1, 2, . . . |V |} then G is called H-supermagic.
The H-supermagic covering was first introduced by Gutiérrez and Lladó [3] in
2005, on which H-supermagic labeling of stars, complete bipartite graphs, paths and
cycles were found. In [4], Lladó and Morragas studied Cn -supermagic labeling of some
graphs, i.e. wheel, windmill, prism and books. They proved that those graphs men-
tioned are Ch -supermagic for some h. In [5], Maryati et al. proved that some classes

2010 Mathematics Subject Classification: 05C78



of trees such as subdivisions of stars, shrubs and banana tree graphs are Ph -supermagic.
Furthermore, cycle-supermagic labelings of chain graphs kCn -snake, triangle ladders
TLn , grids Pm × Pn for n = 2, 3, 4, 5, and also fans Fn and books Bn are found in Ngurah et
al. [7]. For certain schakles and amalgamations of a connected graph, Maryati et al.
[6] showed results on the same topics, and a path-amalgamation of isomorphic
graphs was also treated by Salman and Maryati (see [8]). For some of the labelings,
we number all vertices and all edges, and we call these numbers positions. A graph
labeling can then be represented as a set of ordered pairs of positions and labels.
Baskoro et al. in [1] defined a critical set of a graph G with labeling f as a set
Qf (G) = {(a, b) | a, b ∈ {1, 2, . . . , |V (G) ∪ E(G)|}}, where the ordered pair (a, b) represents
label b in position a, which satisfies:
(1) f is the only labeling of G which has label b in position a;
(2) no proper subset of Qf (G) satisfies (1).
If a critical set has c members, then it has size c.
A critical set can be applied in secret sharing schemes. Secret sharing schemes
were first introduced by Blakley [2] and Shamir [9] in 1979. A secret sharing scheme is
a method of sharing a secret S among a finite set of participants P = {P1 , P2 , . . . , Pn } in
such a way that if the participants in A ⊆ P are qualified to know the secret, then by
pooling together their partial information they can reconstruct the secret S; but any
B ⊂ P which is not qualified to know S cannot reconstruct the secret. The key S is
chosen by a special participant D, called the dealer, and the dealer gives partial information,
called the share, to each participant in order to share the secret S.
In this paper we enumerate all possible K1,n -magic coverings of Km,n for m =
n, n = 2, 3, and some for n = 4. We also provide some of the critical sets for certain
K1,n -magic coverings.

2. K1,n -STAR MAGIC COVERINGS ON COMPLETE BIPARTITE


GRAPH Kn,n
The complete bipartite graph on V1 and V2 has two disjoint sets of vertices, V1
and V2 ; two vertices are adjacent if and only if they lie in different sets. We write Km,n
to mean a complete bipartite graph with m vertices in one set and n in the other. K1,n
in particular is called an n-star; the vertex of degree n is called center (Wallis [10]).
Theorem 2.1. (Gutiérrez and Lladó [3]) The complete bipartite graph Kn,n is K1,n -
magic for n ≥ 1.
Based on Theorem 2.1, K1,n -star magic coverings of the complete bipartite graph
Kn,n can be obtained from a magic square of order n + 1, and it is still possible to find
labelings other than the one proved in [3]; in this paper we list all such possible
coverings on Kn,n for n = 2, 3 and some for n = 4.

3. ENUMERATION OF POSSIBLE MAGIC COVERINGS


Let K2,2 be a complete bipartite graph, T be the sum of labels of all vertices, s
be the sum of labels of all its vertices and edges, and m(f ) be the magic sum of the
star magic covering. In labeling K2,2 , the vertices are labeled by {a, b, c, d} and the
edges are labeled by {e, f, g, h}, ordered from left to right (see Figure 1 ). The magic
constants m(f ) are found by summing all four possible K1,2 -star magic coverings (see
Figure 2).

Figure 1. Labeling on complete bipartite graph K2,2

Figure 2. Four K1,2 -magic coverings on complete bipartite graph K2,2

From Figure 2, we get the magic constant m(f ) as follows.


4 m(f ) = 3(a + b + c + d) + 2(e + f + g + h)
        = (a + b + c + d) + 2(a + b + · · · + g + h)
        = T + 2s,                                         (1)
m(f ) = T /4 + s/2.
By choosing different values of T that are multiples of 4, e.g. 12, 16, 20 and 24 (see Table
1), we obtain different values of m(f ), from 21 to 24.
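The enumeration can be reproduced by brute force. The sketch below assumes the edge incidences e = a-c, f = a-d, g = b-c, h = b-d, which is only our reading of Figure 1 and may differ from the paper's left-to-right ordering; the raw count it prints includes relabelings that are equivalent under the symmetries of K2,2, so it can exceed the six coverings listed in Table 1.

    from itertools import permutations

    # Assumed star structure of K2,2: each star is a center, its two edges and its two neighbours.
    STARS = [
        ("a", "e", "f", "c", "d"),
        ("b", "g", "h", "c", "d"),
        ("c", "e", "g", "a", "b"),
        ("d", "f", "h", "a", "b"),
    ]
    NAMES = ["a", "b", "c", "d", "e", "f", "g", "h"]

    def magic_labelings():
        found = []
        for perm in permutations(range(1, 9)):
            lab = dict(zip(NAMES, perm))
            sums = {sum(lab[x] for x in star) for star in STARS}
            if len(sums) == 1:                    # all K1,2 stars have the same sum
                found.append((lab, sums.pop()))
        return found

    if __name__ == "__main__":
        sols = magic_labelings()
        print(len(sols), "labelings (symmetric relabelings counted separately)")
        for lab, m in sols[:4]:
            T = sum(lab[v] for v in "abcd")
            print("T =", T, "m(f) =", m, lab)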

3.1. Possible K1,2 -magic Coverings on K2,2 . The labels of the vertices and edges are
positive integers between 1 and |V (G)| + |E(G)|. From Equation (1), T , the sum of the
vertex labels, should be divisible by 4, and there are 6 possible K1,2 -magic coverings on K2,2
(see Figure 3).

Figure 3. K1,2 -magic coverings on complete bipartite graph K2,2
Table 1 shows the only six possible K1,2 -magic coverings on K2,2 ; the last
three are the duals of the first three. For the notion of duality, see [10]. Given a star
magic covering λ, its dual labeling λ′ is defined by λ′(x) = v + e + 1 − λ(x) for every vertex
x and λ′(x, y) = v + e + 1 − λ(x, y) for every edge {x, y}, where v and e denote the numbers of vertices and edges.
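A minimal sketch of this duality follows; the example labeling is the third row of Table 1, written as position -> label, and the output is its dual (the fifth row of Table 1).

    def dual_labeling(labels, num_vertices, num_edges):
        """Dual of a star magic covering: every label t is replaced by v + e + 1 - t."""
        c = num_vertices + num_edges + 1
        return {pos: c - t for pos, t in labels.items()}

    if __name__ == "__main__":
        lab = {"a": 6, "b": 2, "c": 5, "d": 3, "e": 1, "f": 7, "g": 8, "h": 4}
        print(dual_labeling(lab, 4, 4))   # every label becomes 9 - t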

Table 1. All possible T and m(f ) of K1,2 -magic coverings on K2,2

T m(f) a b c d e f g h
12 21 1 5 4 2 8 6 7 3
16 22 7 1 2 6 4 3 8 6
6 2 5 3 1 7 8 4
20 23 2 8 7 3 6 6 1 4
3 7 4 6 8 2 1 5
24 24 8 4 5 7 1 3 6 2

3.2. Possible K1,3 -magic Coverings on K3,3 . We label K3,3 in the same order as
in Figure 4: the vertices are labeled by {a, b, c, d, e, f } and the edges are labeled by
{g, . . . , o}, again ordered from left to right. There are 6 K1,3 stars in the
bipartite graph K3,3 . To find the magic constant m(f ), see Equation (2); all admissible T
are vertex label sums divisible by 3. To get K1,3 -star magic coverings on the
bipartite graph K3,3 , we first search all possible combinations of vertex labels summing to T .
As an example, for T = 21 we need only one combination of vertex labels, {1,2,3,4,5,6}.
For T = 24, we have 3 combinations of vertex labels: {1,2,3,4,5,9}, {1,2,3,4,6,8},
{1,2,3,5,6,7}. But there is no guarantee that every combination for a given T has a star magic
covering.

Figure 4. Complete bipartite graph K3,3 with labels a, b, . . . , o
To find all possible coverings, we also use duality, e.g. T = 75 is the dual of
T = 21, and T = 72 is the dual of T = 24. From Table 2, it can be seen that we have
20 labelings for T = 24 and T = 72. The results of the enumeration of the possible
labelings with T = 21 to T = 75 are presented in Table 2. For each odd T we could not
find a K1,3 -magic covering. But there is no guarantee that every even T has a star magic
covering.

Table 2. All possible K1,3 -magic coverings on complete bipartite


graph K3,3

T 21 24 27 30 33 36 39 42 45 48
m(f ) 47 48 49 50 51 52 53 54 55 56
Labelings 0 20 0 52 0 109 0 166 0 224
T 51 54 57 60 63 66 69 72 75
m(f ) 57 58 59 60 61 62 63 64 65
Labelings 0 166 0 109 0 52 0 20 0

Proposition 3.1. If the sum T of the vertex labels is odd, then there is no star magic
covering on the complete bipartite graph K3,3 .

Proof. Suppose there is a star magic covering of K3,3 with vertex label set {a,b,c,d,e,f} as in
Figure 4. Summing the magic constant over all six K1,3 stars, each vertex is counted once as a
center and three times as a leaf, and each edge is counted twice, so that
6 m(f ) = 2 s + 2 T, i.e. s = 3 m(f ) − T.
(2)
Now sum m(f ) over the three stars centered in one partite set, say {a, b, c}: each of a, b, c is
counted once, each of d, e, f three times, and every edge once, so 3 m(f ) = (a + b + c) + 3(d + e + f ) + (s − T ).
Summing over the stars centered at {d, e, f } gives 3 m(f ) = (d + e + f ) + 3(a + b + c) + (s − T ).
Comparing the two expressions yields a + b + c = d + e + f , hence T = 2(a + b + c) is even, a contradiction. 

3.3. K1,4 -magic Coverings on K4,4 . There are hundreds of combinations of label
sets on K4,4 . For example, for T = 56, by using a computer program we found 402
combinations, which yield more than 922 labelings (we are still working on this). Some of the
labelings for certain T have been found, but there is still more work to do. A list of the possible
labelings can be seen in Table 3.

Table 3. Some possible T and m(f ) K1,4 -magic coverings on K4,4

T 40 48 56
m(f ) 90 93 96
Labelings 0 135 922

4. CRITICAL SETS ON K1,2 -MAGIC COVERING


In the six K1,2 -star magic coverings on the bipartite graph K2,2 (see Figure 3) we have
no critical sets of size 1, 86 critical sets of size 2 (see Table 4), and 56 critical sets of size 3
(see Table 5). For each graph, we number all vertices and all edges, and we call these
numbers positions. Thus, a graph labeling can be represented as a set of ordered pairs
of positions and labels. See Figure 3(a) as an example. If the first four labels are
the labels of the vertices and the last four are edge labels, then we write the
labeling of Figure 3(a) as {(1, 1), (2, 5), (3, 4), (4, 2), (5, 8), (6, 6), (7, 7), (8, 3)}, meaning
that label 1 is in position 1, label 5 in position 2, etc.
Table 4. Critical sets size 2 of K1,2 -star magic coverings

{(1,1), (3,4)} {(1,8), (4,3)} {(3,4), (4,2)} {(4,2), (5,8)} {(5,8), (5,6)}
{(1,1), (2,5)} {(1,8), (6,3)} {(3,4), (4,6)} {(4,2), (7,3)} {(5,8), (6,2)}
{(1,1), (2,7)} {(1,8), (6,4)} {(3,4), (6,2)} {(4,3), (6,7)} {(5,8), (6,6)}
{(1,1), (3,6)} {(1,8), (7,5)} {(3,4), (6,6)} {(4,3), (7,8)} {(5,8), (7,7)}
{(1,1), (5,5)} {(1,8), (8,2)} {(3,4), (7,1)} {(4,2), (8,3)} {(6,2), (8,5)}
{(1,1), (6,6)} {(2,2), (6,4)} {(3,4), (7,7)} {(4,3), (8,6)} {(6,3), (7,6)}
{(1,1), (7,7)} {(2,2), (7,5)} {(3,4), (8,5)} {(4,6), (6,2)} {(6,3), (8,2)}
{(1,1), (8,4)} {(2,4), (5,1)} {(3,5), (4,3)} {(4,6), (7,1)} {(6,4), (8,6)}
{(1,3), (5,8)} {(2,4), (7,6)} {(3,5), (4,7)} {(4,7), (5,1)} {(6,6), (7,7)}
{(1,3), (6,2)} {(2,4), (8,2)} {(3,5), (6,3)} {(4,7),(8,2)} {(6,6), (8,3)}
{(1,3), (3,4)} {(2,5), (5,8)} {(3,5), (6,7)} {(5,1), (6,4)} {(6,7), (7,8)}
{(1,6), (3,5)} {(2,5), (6,6)} {(3,5), (7,2)} {(5,1), (6,7)} {(6,7), (8,4)}
{(1,6), (5,1)} {(2,5), (7,7)} {(3,5), (7,6)} {(5,1), (6,3)} {(7,3), (8,4)}
{(1,6), (6,7)} {(2,5), (8,3)} {(3,5), (8,4)} {(5,1), (7,6)} {(7,5), (8,6)}
{(1,8), (2,2)} {(2,7), (5,5)} {(3,6), (7,3)} {(5,1), (8,4)} {(7,6), (8,2)}
{(1,8), (2,4)} {(2,7), (6,2)} {(3,7), (6,4)} {(5,1), (8,6)} {(7,7), (8,3)}
{(1,8), (3,5)} {(2,7), (8,4)} {(4,2), (5,5)} {(5,5), (7,3)}

Open Problem. The enumeration of possible labelings on K4,4 for T ∈ [64, 160] with
99 ≤ k ≤ 135 has not been completed yet. There may be hundreds or even
thousands of them, since we have already found 922 labelings from 280 of the 402 combinations
counted by a computer program.

Acknowledgement. The authors would like to thank the Directorate General of
Higher Education (DIKTI) Indonesia for funding this research under the
Multi-year Research Collaboration Grant (Hibah Pekerti Multi Tahun) 2010-2011.

Table 5. Critical sets size 3 of K1,2 -star magic coverings

{(1,1),(4,2),(7,3)} {(1,6),(4,3),(8,4)} {(2,2),(4,3),(8,4)} {(2,7),(6,8),(7,3)}


{(1,1),(4,2),(5,8)} {(1,8),(3,5),(5,1)} {(2,2),(5,1),(7,8)} {(3,5),(4,3),(5,1)}
{(1,1),(4,2),(8,3)} {(1,8),(5,1).(8,2)} {(2,2),(7,8),(8,4)} {(3,5),(5,1),(7,8)}
{(1,1),(6,8),(7,3)} {(1,8),(3,7),(8,6)} {(2,4),(3,5),(4,7)} {(3,5),(5,1),(8,2)}
{(1,3),(2,1).(6,2)} {(1,8),(3,7).(5,1)} {(2,4),(3,5).(6,3)} {(3,6),(5,5).(8,4)}
{(1,3),(2,7),(4,6)} {(1,8),(4,7),(5,1)} {(2,4),(4,7),(6,3)} {(3,7),(4,3),(7,5)}
{(1,3),(2,7),(3,4)} {(1,8),(4,7),(7,6)} {(2,5),(3,4),(5,8)} {(4,2),(6,8).(8,4)}
{(1,3),(2,7),(7,1)} {(1,8),(5,1),(7,5)} {(2,5),(3,4),(8,3)} {(4,3),(6,4),(7,5)}
{(1,3),(2,7),(8,5)} {(2,2),(3,5),(5,1)} {(2,7),(3,6).(6,8)} {(4,5),(6,2),(7,1)}
{(1,3),(4,6).(8,5)} {(2,2),(3,5),(7,8)} {(2,7),(4,6),(5,8)} {(4,7),(5,1),(6,3)}
{(1,3),(5,8).(7,1)} {(2,2),(3,5),(4,3)} {(2,7),(4,6),(8,5)} {(5,5),(6,8),(8,4)}
{(1,6),(2,2).(4,3)} {(2,2),(3,7),(5,1)} {(2,7),(5,8),(7,1)} {(5,8),(6,2),(7,1)}
{(1,6),(2,2),(8,4)} {(2,2),(3,7),(4,3)} {(2,7),(6,2).(7,1)} {(5,8),(7,1),(8,5)}

References
[1] Baskoro, E. T., Simanjuntak, R. and Adithia, M. T., Secret Sharing Scheme based on magic-
labeling, Proceeding of 12th National Mathematics Conference, 23-27, 2004.
[2] Blakley, G. R., Safeguarding Cryptographic Keys, Proc. AFIPS 12th, New York, (48) 313-317,
1979.
[3] Gutiérrez, A. and Lladó, A., Magic Coverings, JCMCC 55, 43-56, 2005.
[4] Lladó, A. and Moragas, J., Cycle-magic Graphs, Discrete Mathematics 307(23), 2925-2933,
2007.
[5] Maryati, T. K., Baskoro, E. T. and Salman, A. N., M., Ph -supermagic labelings of some trees,
JCMCC 65, 197-204, 2008.
[6] Maryati, T. K., Salman, A. N. M., Baskoro, E. T., Ryan, J.and Miller, M., On H-supermagic
labelings for certain schakles and amalgamations of a connected graph, Utilitas Mathematica 83,
333-342, 2010.
[7] Ngurah, A. A. G., Salman, A. N. M. and Susilowati, L., H-supermagic labeling of graphs,
Discrete Mathematics 310(8), 1293-1300, 2010.
[8] Salman, A. N., M. and Maryati, T. K., On graph-(super)magic labelings of a path-amalgamation
of isomorphic graphs, Proc. of the 6thIMT-GT Conference on Mathematics, Statistics and its
Applications ICMSA2010, Universiti Tunku Abdul Rahman, Kuala Lumpur, Malaysia, 2010.
[9] Shamir, A., How to share a secret, Comm. ACM 22 No.11, 612-613, 1979.
[10] Wallis, W.D., Magic Graph, Birkhäuser, Boston, Basel, Berlin, 2001.

M. Roswitha, T. S. Martini, N. A. Sudibyo


Faculty of Mathematics and Natural Sciences,
Sebelas Maret University, Surakarta, Indonesia.
e-mails: mania [email protected]; [email protected]; nugroho [email protected]

E. T. Baskoro, H. Assiyatun
Combinatorial Mathematics Research Division,
Faculty of Mathematics and Natural Sciences
Institut Teknologi Bandung, Indonesia.
e-mails: [email protected]; [email protected]
Proceedings of ”The 6th SEAMS-UGM Conference 2011”
Algebra, pp. 213–222.

CONSTRUCTION OF RATE s/2s CONVOLUTIONAL


CODES WITH LARGE FREE DISTANCE VIA LINEAR
SYSTEM APPROACH

Ricky Aditya and Ari Suparwanto

Abstract. Convolutional codes are a class of error-correcting codes. Convolutional codes
can be represented as discrete time-invariant linear systems. By defining the system with
some special properties, we can construct convolutional codes with good free distance
properties. The maximum possible free distance of a convolutional code is determined by the
Generalized Singleton Bound. Convolutional codes which have maximum free distance
among all codes of the same rate are called MDS convolutional codes. In this paper, for any
natural number s, we construct a class of rate s/2s convolutional codes by generalizing
the construction of rate 1/2 MDS convolutional codes. The free distance of this class
does not reach the maximum possible value except for s = 1. However, for sufficiently large
complexity, the free distance is close to the maximum possible.

Keywords and Phrases: Error-Correcting Codes, Convolutional Codes, Free Distance,


Generalized Singleton Bound, MDS Codes.

1. BACKGROUNDS AND MOTIVATIONS


Coding theory is a branch of mathematics which is concerned with successfully
transmitting data through a noisy channel and correcting errors in corrupted messages.
Coding theory contains two main parts: encoding and decoding. Encoding is the process
of representing a message as a codeword, and decoding is the process of detecting and
correcting errors which might occur in the transmission.
In coding theory, some concepts of abstract algebra and linear algebra are applied.
One class of codes which is often used is linear block codes. Linear block codes are
finite dimensional subspaces of Fqn over the finite field Fq . A subspace C of Fqn is said
to be a rate k/n linear block code if it has dimension k (over Fq ). A rate k/n linear block code

2010 Mathematics Subject Classification: 11T71.


encodes a k-tuple message to an n-tuple codeword by multiplying the k-tuple message
with a generator matrix. A generator matrix is a k × n matrix whose rows form a basis
of C. The decoding process is also based on concepts of linear algebra, including
orthogonal complements and cosets of vector spaces.
Another property of codes is their (Hamming) distance. The distance between
two codewords is defined as the number of digits in which they differ. Moreover, the (minimum)
distance of a linear block code is defined as the minimum distance between two different
codewords in the code. The distance plays an important role in determining the
number of errors which can be detected and corrected by the code. If a linear block
code has distance d, then it can detect up to d − 1 errors and correct up to ⌊(d − 1)/2⌋ errors.
If a code has a larger distance, then its error-correcting ability is larger too. Therefore, to
transmit a codeword successfully through a noisy channel, a linear block
code with large distance is required.
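These quantities are easy to compute for a small example. The sketch below uses a [6,3] binary code chosen only as an illustration; it enumerates the codewords from the generator matrix, finds the minimum distance and reports the number of correctable errors.

    from itertools import product

    def hamming(u, v):
        """Number of positions in which two words differ."""
        return sum(a != b for a, b in zip(u, v))

    def codewords(G):
        """All codewords of the binary linear block code with generator matrix G (over F2)."""
        k = len(G)
        words = set()
        for msg in product([0, 1], repeat=k):
            words.add(tuple(sum(m * g for m, g in zip(msg, col)) % 2 for col in zip(*G)))
        return words

    if __name__ == "__main__":
        G = [[1, 0, 0, 1, 1, 0],
             [0, 1, 0, 1, 0, 1],
             [0, 0, 1, 0, 1, 1]]
        C = codewords(G)
        d = min(hamming(u, v) for u in C for v in C if u != v)
        print("minimum distance d =", d, "-> corrects up to", (d - 1) // 2, "errors")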
On the other hand, to represent many different messages, we also need a linear block
code which contains many codewords. The number of codewords in a linear block
code is determined by its dimension. So, we need a "good" linear block code, which
can represent many different messages and also correct many errors, i.e. one with large
dimension and large distance. But the Singleton Bound gives the upper bound
(maximum value) of the distance of a rate k/n linear code as n − k + 1. This means that for
a larger dimension, the maximum value of the distance is smaller, and vice versa. So, for
linear block codes, it is not easy to construct a "good" code. This problem motivates
us to build a new concept of linear codes.

2. CONVOLUTIONAL CODES
In this section, we introduce the basic concept of convolutional codes. Most
definitions and theorems in this section can be found in Johanesson and Zigangirov [1].
In convolutional codes, the length of each codeword is not constant. The input message
of a convolutional code will later be generalized to a sequence of messages with finitely
many nonzero terms. The sequence is represented as a polynomial vector. First, we discuss
the definition of convolutional codes from an abstract algebra point of view. We recall
that for any finite field of order q, denoted by Fq , the polynomial ring Fq [z] is a
principal ideal domain (PID). Moreover, Fqn [z] is a finitely generated free module over
Fq [z]. Bases of free modules over a PID have the same properties as bases of vector spaces
over a field.
Definition 2.1. Given a finite field Fq . A convolutional code C over Fq is defined as
an Fq [z]-submodule of Fqn [z].

A convolutional code C ⊆ Fqn [z] has free rank k, where k ∈ {0, 1, · · · , n}. Therefore
there is a set of k vectors which form a basis for C. If G(z) is a k × n polynomial matrix
whose rows form a basis for C, then the mapping ϕ : Fqk [z] → Fqn [z] defined by
ϕ (u(z)) = u(z) · G(z), for all u(z) ∈ Fqk [z], (1)

is an injective module homomorphism. Given a sequence of message u = u0 , u1 , u2 , · · ·


where ut ∈ Fqk , for all t ∈ Z, t ≥ 0. The sequence u can be represented as polynomial
vector u(z) = u0 + u1 · z + u2 · z 2 + · · · ∈ Fqk [z]. Then the mapping (1) will map u(z)
to ϕ (u(z)) ∈ C. If ϕ (u(z)) = v0 + v1 · z + v2 · z 2 + · · · , where vt ∈ Fqn , for all t ∈ Z,
t ≥ 0, then we will get a sequence of codewords v = v0 , v1 , v2 , · · · . So, analogous with
linear block codes, the mapping (1) is the encoding process of convolutional code C.
Moreover, we can define the encoder matrix as below:

Definition 2.2. A polynomial matrix G(z) is called encoder matrix of convolutional


code C if the rows of G(z) form a basis for C. If C ⊆ Fqn [z] and rank(C) = k (size of
G(z) is k × n), then C is called has rate k/n.

In principal, encoding of convolutional codes is analogous with encoding of linear


block codes. In rate k/n linear block codes, every k-tuple message vector is encoded
to n-tuple codeword. In rate k/n convolutional codes, every sequence of message is
represented as k-tuple polynomial vector, then it is encoded to n-tuple polynomial
vector, and represent it as sequence of codewords. There are some additional parameters
in convolutional codes, such as memory and complexity, which are determined by the
entries of encoder matrix G(z).
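A minimal sketch of the encoding map (1) over F2 follows; the encoder G(z) = [1 + z^2, 1 + z + z^2] is a commonly used small rate 1/2 example and is not taken from this paper, and polynomials are stored as coefficient lists.

    def poly_mul_gf2(a, b):
        """Multiply two F2[z] polynomials given as coefficient lists (index = degree)."""
        out = [0] * (len(a) + len(b) - 1)
        for i, ai in enumerate(a):
            for j, bj in enumerate(b):
                out[i + j] ^= ai & bj
        return out

    def encode_rate_1n(u, G_row):
        """Encode a message polynomial u(z) with a 1 x n encoder G(z) = [g1(z), ..., gn(z)]:
        v(z) = u(z) * G(z), computed entrywise."""
        return [poly_mul_gf2(u, g) for g in G_row]

    if __name__ == "__main__":
        G = [[1, 0, 1], [1, 1, 1]]   # G(z) = [1 + z^2, 1 + z + z^2]
        u = [1, 0, 1, 1]             # u(z) = 1 + z^2 + z^3
        print(encode_rate_1n(u, G))  # coefficient sequences of the two codeword polynomials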

Definition 2.3. Given a rate k/n convolutional code C with encoder matrix G(z).
(1) Memory of C is defined as the highest degree of entries of G(z).
(2) Complexity of C is defined as the highest degree of full size minors of G(z).

We also have other parameters such as the row degree and the total degree. The row degree
is the highest degree of the entries in a row of the encoder matrix, and the total degree is the sum
of all row degrees. The encoder matrix of a convolutional code is unique up to matrix
equivalence, as given in the following lemma:

Lemma 2.1. Two polynomial matrices G(z) and G′(z) are encoder matrices of a con-
volutional code C if and only if there exists a unimodular matrix U (z) ∈ GLk (Fq [z]) such that
G(z) = U (z) · G′(z). Then we say that G(z) and G′(z) are equivalent.

An encoder matrix with high memory might make the encoding process more com-
plicated. So, we are concerned with choosing an encoder matrix whose memory
is minimal. We say that an encoder matrix is minimal if its total degree is minimum
among all equivalent encoders. In the later discussion, we restrict ourselves to minimal
encoders. By choosing a minimal encoder matrix, the total degree, and hence the memory, of
the encoder matrix is minimal, so the encoding map is easier to carry out.
As in linear block codes, we can also define a concept of distance in convolutional
codes. Usually the distance of convolutional codes is called free distance.

Definition 2.4. Given a convolutional code C and v(z), w(z) ∈ C. If v(z) and w(z)
represent sequences of codewords v = v1 , v2 , v3 , · · · and w = w1 , w2 , w3 , · · · respectively,
then:
(1) the free distance of v(z) and w(z), denoted by df (v(z), w(z)), is defined as
df (v(z), w(z)) = Σ_{i≥1} d(vi , wi ),
where d(vi , wi ) = 1 if vi ≠ wi and d(vi , wi ) = 0 if vi = wi , for i ∈ N;
(2) the free weight of v(z), denoted by wtf (v(z)), is defined as
wtf (v(z)) = df (v(z), 0);
(3) the free distance of C, denoted by df (C), is defined as
df (C) = min {df (v(z), w(z)) : v(z), w(z) ∈ C, v(z) ≠ w(z)}
       = min {wtf (v(z)) : v(z) ∈ C, v(z) ≠ 0}.

The free distance of a convolutional code is determined from the number of different
digits of the sequences of codewords, not of the polynomial vectors. As in linear block codes, if
a convolutional code C has free distance df (C) = d, then C can detect up to d − 1 errors
and correct up to ⌊(d − 1)/2⌋ errors. We can also generalize the Singleton Bound for linear block
codes to convolutional codes; this is given and proved in Rosenthal and Smarandache [2].
Theorem 2.1. Every rate k/n convolutional code C with complexity δ and encoder
matrix G(z) satisfies
df (C) ≤ (n − k) · (⌊δ/k⌋ + 1) + δ + 1.   (2)
Theorem 2.1 is called the Generalized Singleton Bound. Convolutional codes which
attain equality in (2) are called Maximum Distance Separable (MDS) convolutional
codes. The upper bound on the free distance of convolutional codes might be very large,
depending on the complexity. However, it is not easy to determine the free distance of
a convolutional code. Consequently, constructing a convolutional code with a designed
free distance is not easy either. For an efficient error-correcting process, we need "good"
convolutional codes, which have large free distance. Therefore, we should consider
construction methods for convolutional codes such that the free distance is large enough.
One construction method comes from a linear system approach, which is discussed in the next
sections.

3. REPRESENTATION OF CODES
In this section, we discuss the connection between convolutional codes and
linear systems. First, we discuss the representation of convolutional codes as discrete
time-invariant linear systems. Later, we discuss how to construct codes from
given systems. All proofs of theorems and lemmas in this section can be found in York
[4].

First, we discuss the representation of convolutional codes as linear systems.
The following theorem shows the existence of such a representation:
Theorem 3.1. If C is a rate k/n convolutional code over Fq with complexity δ, then
there exist matrices K, L (of size δ × (δ + n − k)) and M (of size n × (δ + n − k)) over Fq
such that
C = {v(z) ∈ Fqn [z] : ∃ x(z) ∈ Fqδ [z] such that x(z) · (z · K + L) + v(z) · M = 0},   (3)
and the following properties hold:
(1) K is full (row) rank.
(2) The matrix [K; M ] obtained by stacking K on top of M is full (column) rank.
(3) For every z0 ∈ Fq , rank [z0 · K + L; M ] = δ + n − k.

Theorem 3.1 shows the existence of a triple (K, L, M ) which represents the convolutional
code as in (3). Conversely, if we have a triple (K, L, M ) which satisfies (3) and the three
properties in Theorem 3.1, then this triple defines a convolutional code with rate
k/n and complexity δ. The representation (K, L, M ) in Theorem 3.1 is unique up to matrix
equivalence, as given in the following theorem:
Theorem 3.2. Given a convolutional code C. If (K, L, M ) and (K̄, L̄, M̄ ) are two
triples of matrices which satisfy (3) and the properties in Theorem 3.1, then there exist two unique
invertible matrices S and T , of size δ × δ and (δ + n − k) × (δ + n − k) respectively, such
that
(K̄, L̄, M̄ ) = (S −1 KT, S −1 LT, M T ).   (4)
Then we want to transform the (K, L, M ) representation into a linear system form.
From properties (1) and (3) with z0 = 0 in Theorem 3.1, after some row
operations and a column permutation if necessary, the (K, L, M ) representation takes the form
K = [I  O],  L = [−A  C],  M = [O  −I; −B  D]  (written in block rows).   (5)
In (5), the sizes of A, B, C and D are δ × δ, k × δ, δ × (n − k) and k × (n − k),
respectively. Then we partition v(z) ∈ Fqn [z] as v(z) = (y(z)  u(z)), where
y(z) ∈ Fqn−k [z] and u(z) ∈ Fqk [z]. Let x(z) = x0 · z p + x1 · z p−1 + · · · + xp−1 · z + xp and
v(z) = v0 · z p + v1 · z p−1 + · · · + vp−1 · z + vp . After equating the polynomial coefficients,
we obtain from (3) a linear system form in A, B, C and D as follows:
xt+1 = xt · A + ut · B
yt = xt · C + ut · D   (6)
vt = (yt  ut ),  x0 = xp+1 = 0
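Once (A, B, C, D) are given, the recursion (6) can be run directly; the sketch below uses arbitrary toy matrices over F2 (δ = 2, k = 1, n = 2) chosen only for illustration, not from this paper.

    import numpy as np

    def run_system(A, B, C, D, u_seq, p=2):
        """Run recursion (6) over the prime field F_p (row-vector convention):
        x_{t+1} = x_t A + u_t B,  y_t = x_t C + u_t D,  v_t = (y_t, u_t)."""
        delta = A.shape[0]
        x = np.zeros((1, delta), dtype=int)
        codeword = []
        for u in u_seq:
            u = np.atleast_2d(u)
            y = (x @ C + u @ D) % p
            codeword.append(np.hstack([y, u]).ravel())
            x = (x @ A + u @ B) % p
        return codeword, x

    if __name__ == "__main__":
        A = np.array([[0, 1], [0, 0]])
        B = np.array([[1, 0]])
        C = np.array([[1], [1]])
        D = np.array([[1]])
        u_seq = [np.array([1]), np.array([0]), np.array([1]), np.array([0]), np.array([0])]
        v, x_final = run_system(A, B, C, D, u_seq)
        # For a codeword the state must return to 0 at the end; this message does so.
        print([list(vt) for vt in v], "final state:", x_final.ravel())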
System (6) is a discrete time-invariant linear system. The dimension of system (6) is δ,
the complexity of the code. From this representation we see a relation between
convolutional codes and linear systems: any convolutional code can
be represented as a linear system. What about the converse? From a given system, can we
define a convolutional code? The following lemma gives a "converse" of Theorem 3.1:

Lemma 3.1. If the matrices A, B, C and D in system (6) are given, then there exist polynomial
matrices X(z), Y (z) and U (z), of size k × δ, k × (n − k) and k × k respectively, such that
Ker [z · I − A  C; O  −I; −B  D] = Im (X(z)  Y (z)  U (z)).   (7)
Moreover, G(z) = (Y (z)  U (z)) is the encoder matrix of a convolutional code C.
We have seen that from a linear system we can construct an encoder matrix. The rate
and complexity of the constructed code depend on the dimension and the sizes of the matrices in the
system. So, from Theorem 3.1 and Lemma 3.1 we have found a close relation between
convolutional codes and linear systems. In the next section, we construct a convolutional
code by first defining the system (choosing the quadruple (A, B, C, D)). However, if a sys-
tem (A, B, C, D) is transformed to (K, L, M ) form by (5), the three properties in Theorem
3.1 do not always hold. So, we are interested in properties of the system
which guarantee that all properties in Theorem 3.1 hold. One can show that every representation
system of a convolutional code is a controllable system, but not all representation systems
are observable. We say that a convolutional code is an observable code if its representa-
tion system is observable. We can check the controllability and observability of a
system in (A, B, C, D) form by the matrices
 
Φδ (A, B) = [B; BA; BA2 ; ... ; BAδ−1 ]  (the matrix whose block rows are B, BA, ..., BAδ−1)
and Ωδ (A, C) = (C  AC  A2 C  · · ·  Aδ−1 C).
From system theory, a system is controllable (observable) if Φδ (A, B) (respectively Ωδ (A, C)) has full rank.
For a controllable and observable system, we have the following lemma:
Lemma 3.2. If (A, B, C, D) is a quadruple of matrices in system (6) such that
rank (Ωδ (A, C)) = δ and rank (Φδ (A, B)) = δ, then the triple of matrices (K, L, M ) which
is defined by (5) satisfies the properties in Theorem 3.1.
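The rank conditions of Lemma 3.2 can be checked mechanically; the sketch below builds Φδ(A, B) and Ωδ(A, C) and computes their ranks over F_p by Gaussian elimination, using the same toy matrices as in the previous sketch (illustrative values only).

    import numpy as np

    def rank_gfp(M, p):
        """Rank of an integer matrix over the prime field F_p (Gaussian elimination mod p)."""
        M = np.array(M, dtype=int) % p
        rank, rows, cols = 0, M.shape[0], M.shape[1]
        for c in range(cols):
            pivot = next((r for r in range(rank, rows) if M[r, c] != 0), None)
            if pivot is None:
                continue
            M[[rank, pivot]] = M[[pivot, rank]]
            M[rank] = (M[rank] * pow(int(M[rank, c]), -1, p)) % p
            for r in range(rows):
                if r != rank and M[r, c]:
                    M[r] = (M[r] - M[r, c] * M[rank]) % p
            rank += 1
        return rank

    def phi(A, B, p):
        """Controllability matrix Φδ(A, B) = [B; BA; ...; BA^(δ-1)] over F_p."""
        blocks, P = [], B % p
        for _ in range(A.shape[0]):
            blocks.append(P)
            P = (P @ A) % p
        return np.vstack(blocks)

    def omega(A, C, p):
        """Observability matrix Ωδ(A, C) = (C  AC  ...  A^(δ-1)C) over F_p."""
        blocks, Q = [], C % p
        for _ in range(A.shape[0]):
            blocks.append(Q)
            Q = (A @ Q) % p
        return np.hstack(blocks)

    if __name__ == "__main__":
        p = 2
        A = np.array([[0, 1], [0, 0]]); B = np.array([[1, 0]]); C = np.array([[1], [1]])
        print(rank_gfp(phi(A, B, p), p), rank_gfp(omega(A, C, p), p))  # both equal δ = 2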
We can say that controllability and observability of a system are necessary con-
ditions to make a ”converse” of Theorem 3.1. In the later discussion, we construct
convolutional codes by defining a controllable and observable system. In addition, we
choose (A, B, C, D) such that our constructed codes have large free distance.

4. CONSTRUCTION OF RATE s/2s CONVOLUTIONAL CODES


In York [4], the construction of convolutional codes for any rate k/n and complexity δ is
discussed. By choosing some special matrices to define the system, it can
be shown that the free distance of the constructed code is at least δ + 1. In other words,
the free distance can be made larger by choosing a larger complexity. In Smarandache
and Rosenthal [3], a construction method is given for any rate 1/n for which the free
distance is the maximum possible. The case of rate 1/2 is also proved completely. In
this section, we generalize the proof for rate 1/2 convolutional codes to rate s/2s,
for any natural number s.
First, recall that for rate s/2s the matrix D in system (6)
has size s × s. Therefore, we can choose D = Is , the identity matrix of size s × s, and the
input and output spaces have the same dimension. Moreover, we can rewrite system (6) as:
xt+1 = xt · (A − CB) + yt · B
ut = −xt · C + yt · D   (8)
vt = (yt  ut ),  x0 = xp+1 = 0
In other words, we exchange the roles of the input ut and the output yt as in (8). The
problem is to guarantee that both (6) and (8) are controllable and observable. First we
consider the matrices A and B below:
A = diag(αs , α2s , ..., αδs ),
B = the s × δ matrix with entries Bij = α(i−1)j , that is,
B = [1  1  ···  1; α  α2  ···  αδ ; α2  α4  ···  α2δ ; ... ; αs−1  α2(s−1)  ···  αδ(s−1) ],   (9)
where α is a primitive element of the field Fq and q ≥ δ 2 s. One can show that A and B form
a controllable pair. To make system (8) also controllable, we must make A − CB and
B form a controllable pair too. We will consider the following lemma:

Lemma 4.1. Let α be a primitive element of the field Fq and assume that q ≥ max(δ 2 s, 3δs + 1)
and s | q − 1. Let A, B be the matrices defined in (9) and let
A′ = diag(α(δ+1)s , α(δ+2)s , ..., α2δs ).
Then there exists a δ × s matrix C such that
A′ = S −1 (A − CB)S
for some invertible matrix S of size δ × δ.

Proof. We want to show that A′ is similar to A − CB, which is equivalent to showing that
det (x · I − (A − CB)) = det(x · I − A′ ) = (x − α(δ+1)s ) · (x − α(δ+2)s ) · · · · · (x − α2δs ).
Write C = [c1 ; c2 ; ... ; cδ ] (with rows ci ); substituting x with α(δ+1)s , α(δ+2)s , α(δ+3)s , · · · , α2δs , the
equations det (x · I − (A − CB)) = 0 become a system of equations in c1 · B, c2 · B, · · · , cδ · B.
Because B has full row rank, we can uniquely determine c1 , c2 , · · · , cδ . By solving
this system of equations, we get the result. 
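As a small numerical check, the sketch below builds the matrices A and B of (9) over a concrete prime field and verifies that (A, B) is a controllable pair; the parameters q = 13, α = 2, s = 2, δ = 2 satisfy q ≥ max(δ²s, 3δs + 1) and s | q − 1 and are chosen only for this example.

    import numpy as np

    def rank_gfp(M, p):
        """Rank of an integer matrix over F_p via Gaussian elimination mod p."""
        M = np.array(M, dtype=int) % p
        r = 0
        for c in range(M.shape[1]):
            piv = next((i for i in range(r, M.shape[0]) if M[i, c]), None)
            if piv is None:
                continue
            M[[r, piv]] = M[[piv, r]]
            M[r] = (M[r] * pow(int(M[r, c]), -1, p)) % p
            for i in range(M.shape[0]):
                if i != r and M[i, c]:
                    M[i] = (M[i] - M[i, c] * M[r]) % p
            r += 1
        return r

    def construction_matrices(q, alpha, s, delta):
        """A = diag(α^s, ..., α^(δs)) and B with B_ij = α^((i-1)j), as in (9)."""
        A = np.diag([pow(alpha, s * j, q) for j in range(1, delta + 1)])
        B = np.array([[pow(alpha, i * j, q) for j in range(1, delta + 1)] for i in range(s)])
        return A % q, B % q

    if __name__ == "__main__":
        q, alpha, s, delta = 13, 2, 2, 2
        A, B = construction_matrices(q, alpha, s, delta)
        blocks, P = [], B % q
        for _ in range(delta):                 # Φδ(A, B) = [B; BA; ...]
            blocks.append(P)
            P = (P @ A) % q
        Phi = np.vstack(blocks)
        print(A, B, sep="\n")
        print("rank of Phi over F_q:", rank_gfp(Phi, q))   # equals δ for a controllable pair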

From Lemma 4.1, we have an invertible matrix S of size δ × δ such that system
(8) is equivalent to:
xt+1 = xt · A′ + yt · BS
ut = −xt · S −1 C + yt   (10)
vt = (yt  ut ),  x0 = xp+1 = 0.

To simplify, we write BS = B ′ and −S −1 C = C ′ . One can show that system (10) is
controllable and observable. Therefore, from system (10) we can construct convolutional
codes with rate s/2s and complexity δ. The remaining problem is the free distance
of our constructed codes.

Theorem 4.1. The convolutional code C constructed from system (10) has free
distance at least 2(δ + 1).

Proof. For any v(z) ∈ C with v(z) ≠ 0 and deg (v(z)) = γ, let v(z) = v0 · z γ + v1 · z γ−1 +
· · · + vγ−1 · z + vγ and v(z) = (y(z)  u(z)), with y(z) = y0 · z γ + y1 · z γ−1 + · · · + yγ−1 · z + yγ and
u(z) = u0 · z γ + u1 · z γ−1 + · · · + uγ−1 · z + uγ . Because deg (v(z)) = γ, we must have v0 ≠ 0,
so that from the equations of system (6) u0 and y0 cannot both be zero vectors. We
consider two cases, (γ+1)s < q−1 and (γ+1)s ≥ q−1. For (γ+1)s < q−1, from Lemma
6.1.2 and Proposition 6.1.3 in York [4], we have (u0  u1  · · ·  uγ ) · Φγ+1 (A, B) =
0. Because (γ + 1)s < q − 1, the matrix Φγ+1 (A, B) is equivalent to a Vandermonde
matrix of size (γ + 1)s × δ, and so any δ rows of Φγ+1 (A, B) are linearly independent.
Therefore, we get wt((u0  u1  · · ·  uγ )) ≥ δ + 1. Similarly, if we consider system
(10), we have (y0  y1  · · ·  yγ ) · Φγ+1 (A′ , B ′ ) = 0 and any δ rows of Φγ+1 (A′ , B ′ )
are linearly independent. Therefore, wt((y0  y1  · · ·  yγ )) ≥ δ + 1. Moreover, we
get wt((v0  v1  · · ·  vγ )) = wt((y0  y1  · · ·  yγ )) + wt((u0  u1  · · ·  uγ )) ≥ (δ +
1) + (δ + 1) = 2(δ + 1).
For (γ + 1)s ≥ q − 1, from the definition of A in system (8) and the assumption s | q − 1,
we have A^((q−1)/s) = I. If (γ + 1)s ≥ q − 1, then (u0  u1  · · ·  uγ ) · Φγ+1 (A, B) = 0
implies ū · Φ_(q−1)/s (A, B) = 0, where
ū = (u0 + u_(q−1)/s + · · ·   u1 + u_(q−1)/s+1 + · · ·   · · ·   u_((q−1)/s)−1 + u_(2(q−1)/s)−1 + · · ·).
Similarly we define ȳ in system (10). We now consider four cases. If ū ≠ 0 and
ȳ ≠ 0, then, as in the previous case, Φ_(q−1)/s (A, B) is equivalent to a Vandermonde matrix of
size (q − 1) × δ, so that any δ rows of Φ_(q−1)/s (A, B) are linearly independent. There-
fore, we have wt(ū) ≥ δ + 1, which implies wt((u0  · · ·  uγ )) ≥ wt(ū) ≥ δ + 1.
Similarly, wt((y0  · · ·  yγ )) ≥ wt(ȳ) ≥ δ + 1. So we get wt((v0  · · ·  vγ )) =
wt((y0  · · ·  yγ )) + wt((u0  · · ·  uγ )) ≥ (δ + 1) + (δ + 1) = 2(δ + 1). If ū = 0 and
ȳ ≠ 0, from the first equation of system (8) with ū = 0, we have
x1 + x_(q−1)/s+1 + · · · = (x0 + x_(q−1)/s + · · ·) · A,
x2 + x_(q−1)/s+2 + · · · = (x0 + x_(q−1)/s + · · ·) · A2 ,
..
.
x_((q−1)/s)−1 + x_(2(q−1)/s)−1 + · · · = (x0 + x_(q−1)/s + · · ·) · A^((q−1)/s)−1 .
Substituting into the second equation, we have
y0 + y_(q−1)/s + · · · = (x0 + x_(q−1)/s + · · ·) · C,
y1 + y_(q−1)/s+1 + · · · = (x1 + x_(q−1)/s+1 + · · ·) · C = (x0 + x_(q−1)/s + · · ·) · AC,
..
.
y_((q−1)/s)−1 + y_(2(q−1)/s)−1 + · · · = (x_((q−1)/s)−1 + x_(2(q−1)/s)−1 + · · ·) · C
= (x0 + x_(q−1)/s + · · ·) · A^((q−1)/s)−1 C,
which means ȳ = (y0 + y_(q−1)/s + · · ·   · · ·   y_((q−1)/s)−1 + y_(2(q−1)/s)−1 + · · ·) = (x0 + x_(q−1)/s + · · ·) ·
Ωδ (A, C). Because Ωδ (A, C) is a Vandermonde matrix multiplied by a non-
singular diagonal matrix, any δ columns of Ωδ (A, C) are linearly independent. Be-
cause ȳ ≠ 0, we can show that ȳ contains at most δ zero digits. In other words,
wt((y0  · · ·  yγ )) ≥ wt(ȳ) ≥ q − 1 − δ ≥ 3δs − δ ≥ 3δ − δ = 2δ. Because ū = 0
and u(z) ≠ 0, we must have wt((u0  · · ·  uγ )) ≥ 2. Consequently wt((v0  · · ·  vγ )) =
wt((y0  · · ·  yγ )) + wt((u0  · · ·  uγ )) ≥ 2δ + 2 = 2(δ + 1). For ū ≠ 0 and ȳ = 0,
the proof is similar to the previous case. For ū = 0 and ȳ = 0, in a similar way as in
the two previous cases, we can see that the only codeword which satisfies these conditions
is the zero codeword. Therefore, to determine the free distance of the code we only need
to consider the previous cases, and we have proven that df (C) ≥ 2(δ + 1). 
By choosing these special matrices, we can construct convolutional codes with large free distance. The larger the complexity, the larger the free distance of the constructed code. For very large complexity we have

    1 ≥ lim_{δ→∞} d_f(C)/d_{f,max} ≥ lim_{δ→∞} 2(δ+1) / ( s·(⌊δ/s⌋ + 1) + δ + 1 )
      ≥ lim_{δ→∞} (2δ+2) / ( s·(δ/s + 1) + δ + 1 ) = lim_{δ→∞} (2δ+2)/(2δ+s+1) = 2/2 = 1.        (11)

In other words, for very large complexity the free distance of the constructed code is near the maximum possible value of the free distance. For s = 1 the free distance of our construction is the maximum possible, i.e. the construction yields MDS convolutional codes. For s > 1 the constructed free distance is not maximal, but it is near the maximum for very large complexity. These conclusions hold for any rate s/2s.
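To illustrate the convergence in (11), the following short Python sketch (an illustration only, not part of the original construction) evaluates the ratio 2(δ+1) / ( s(⌊δ/s⌋+1) + δ + 1 ) for a fixed rate parameter s and increasing complexity δ; the particular value s = 3 is a hypothetical choice.

```python
# Illustrative sketch: ratio of the constructed free distance bound 2(delta+1)
# to the generalized Singleton bound s*(floor(delta/s)+1) + delta + 1 for rate s/2s.
def distance_ratio(s, delta):
    d_constructed = 2 * (delta + 1)
    d_max = s * (delta // s + 1) + delta + 1
    return d_constructed / d_max

if __name__ == "__main__":
    s = 3  # example rate parameter (hypothetical choice)
    for delta in [3, 30, 300, 3000]:
        print(delta, round(distance_ratio(s, delta), 4))
    # The printed ratios increase towards 1, as predicted by (11).
```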

5. CONCLUDING REMARKS
From our discussion we can draw some important conclusions. First, convolutional codes can be seen as an extension of linear block codes to messages of various lengths, and any sequence of messages can be encoded by a convolutional code using the polynomial representation. From the Generalized Singleton Bound we also see that the free distance of a convolutional code can be very large, depending on its complexity, but it is difficult to construct a convolutional code with a prescribed free distance. We therefore use the representation of convolutional codes as linear systems to simplify the construction: the system is defined first and then used to determine the encoder matrix. For the special case of rate s/2s, where s is a natural number, we can choose special matrices (A, B, C, D) such that the constructed codes have free distance that is large and near the maximum possible. However, large complexity requires a field of large order, which may cause some difficulty in the construction. Therefore, other construction methods over fields of small order should be considered.

References
[1] Johannesson, R. and Zigangirov, K. Sh., Fundamentals of Convolutional Coding, IEEE Press, New York, 1999.
[2] Rosenthal, J. and Smarandache, R., Maximum Distance Separable Convolutional Codes, Ap-
plicable Algebra in Engineering, Communication and Computing, Vol. 10, No. 1, 15-32, 1999.
[3] Smarandache, R. and Rosenthal, J., A State Space Approach for Constructing MDS Rate 1/n
Convolutional Codes, IEEE Information Theory Workshop, 1998.
[4] York, E.V., Ph.D thesis: Algebraic Description and Construction of Error-Correcting Codes, a
Systems Theory Point of View, University of Notre Dame, Indiana, 1997.

Ricky Aditya
Mathematics Graduate Student, Universitas Gadjah Mada, Yogyakarta, Indonesia.
e-mail: [email protected]

Ari Suparwanto
Department of Mathematics, Universitas Gadjah Mada, Yogyakarta, Indonesia.
e-mail: ari [email protected]
Proceedings of ”The 6th SEAMS-UGM Conference 2011”
Algebra, pp. 223 - 232.

CHARACTERISTICS OF IBN, RANK CONDITION, AND STABLY FINITE RINGS

SAMSUL ARIFIN AND INDAH EMILIA WIJAYANTI

Abstract. We discuss the characterization of IBN on a ring and its properties. We also discuss the (strong) rank condition and stable finiteness, together with their properties, which help in understanding IBN. We then examine the relationships among these notions.

Keywords and Phrases: IBN, (strong) rank condition, and stably finite.

1. INTRODUCTION

Throughout, every ring R is assumed to be associative with identity, and every module M is unital. For a vector space V over a field F (or, more generally, over a division ring D) of dimension n, the following three properties hold.
(i) Any basis of V has cardinality n.
(ii) Any generating set of V has cardinality ≥ n.
(iii) Any linearly independent set in V has cardinality ≤ n.
For an arbitrary R-module M, however, M does not necessarily have a basis, so these three conditions need not be satisfied. If these properties are required to hold for every free module M over a ring R with a basis of n elements, then conditions (i), (ii) and (iii) above motivate the conditions called IBN, the rank condition, and the strong rank condition, respectively. [3] is an excellent reference for these three notions, and this paper gives some details of their properties. To understand IBN and the (strong) rank condition further, and to recognize the classes of rings with these properties, we also discuss the condition called stable finiteness, which is stronger than those conditions. The definitions are as follows.

2. IBN, (STRONG) RANK CONDITION AND STABLE FINITENESS

Definition 2.1: R is said to have IBN (invariant basis number) if for any natural numbers n, m, R^m ≅ R^n (as right modules) implies n = m.

These are precisely the rings over which every free right module has a unique rank. Denoting the set of m × n matrices over R by M_{m×n}(R), the following two conditions can be shown to be equivalent.

Proposition 2.2: For any ring R, the following conditions are equivalent:
(i) R has IBN.
(ii) For all X ∈ M_{m×n}(R) and Y ∈ M_{n×m}(R), if XY = I_m and YX = I_n then m = n.

Proof:
(i) ⇒ (ii) Let X ∈ M_{m×n}(R) and Y ∈ M_{n×m}(R) with XY = I_m and YX = I_n. Then there exist module homomorphisms f : R^n → R^m and g : R^m → R^n such that X = [f] and Y = [g]. From XY = I_m we get f∘g = 1_{R^m}, so g is a right inverse of f, and from YX = I_n we get g∘f = 1_{R^n}, so g is also a left inverse of f; hence f is an isomorphism. This implies R^m ≅ R^n, and m = n by (i).
(ii) ⇒ (i) If R^m ≅ R^n, then there is an isomorphism k : R^m → R^n with [k] ∈ M_{n×m}(R), and there exists l : R^n → R^m with [l] ∈ M_{m×n}(R) such that k∘l = 1_{R^n} and l∘k = 1_{R^m}. Then [k][l] = I_n and [l][k] = I_m, so m = n by (ii). We have shown that R has IBN. □
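Over a field such as ℝ (which has IBN), condition (ii) of Proposition 2.2 can be illustrated numerically: for m > n, the product XY of an m × n and an n × m matrix has rank at most n, so XY = I_m is impossible. The following minimal sketch (an illustration only; the matrices are randomly generated and the sizes are an arbitrary choice) checks this.

```python
import numpy as np

# Illustration of Proposition 2.2(ii) over the field R (which has IBN):
# for m > n, the rank of X @ Y is at most n < m, so X @ Y can never equal I_m.
rng = np.random.default_rng(0)
m, n = 4, 2                           # hypothetical sizes with m > n
X = rng.standard_normal((m, n))
Y = rng.standard_normal((n, m))
print(np.linalg.matrix_rank(X @ Y))   # at most n = 2, hence X @ Y != I_4
```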


[7, Ch. 1] is an excellent reference for examples of rings with and without IBN. The following definition and proposition concern the rank condition.

Definition 2.3: R satisfies the rank condition if for any n ∈ ℕ, any set of R-module generators for R^n has cardinality ≥ n.

The rank condition can be characterized by the following three equivalent statements.

Proposition 2.4: The following conditions are equivalent:
(i) R satisfies the rank condition.
(ii) As right free modules, if R^m → R^n → 0 is exact, then m ≥ n.
(iii) For any X ∈ M_{m×n}(R) and Y ∈ M_{n×m}(R), if YX = I_n then m ≥ n.

Proof:
(i) ⇒ (ii) Note that each list v = (v₁, …, v_m) of elements of R^n determines a linear transformation L_v : R^m → R^n by L_v(α) = v₁α₁ + … + v_mα_m for all α = (α₁, …, α_m) ∈ R^m. If R^m → R^n → 0 is exact, the epimorphism R^m → R^n is of this form L_v, so v spans R^n, and by (i) we get m ≥ n, since |v| = m.
(ii) ⇒ (iii) Let X ∈ M_{m×n}(R) and Y ∈ M_{n×m}(R) with YX = I_n. Then there are homomorphisms f : R^n → R^m and g : R^m → R^n such that X = [f] and Y = [g]. If YX = I_n, then g∘f = 1_{R^n}, so g : R^m → R^n is an epimorphism, which means R^m → R^n → 0 is exact, and hence m ≥ n by (ii).
(iii) ⇒ (i) Let n ∈ ℕ and let G be any generating set of the free right module R^n with |G| = m. Then there exists an epimorphism L_G : R^m → R^n given by L_G(α) = r₁α₁ + … + r_mα_m for all α = (α₁, …, α_m) ∈ R^m, where r₁, …, r_m are the elements of G. Recall that every exact sequence of free modules splits, so there is a monomorphism L_G′ : R^n → R^m such that L_G ∘ L_G′ = 1_{R^n}. Then [L_G][L_G′] = I_n for matrices [L_G] ∈ M_{n×m}(R) and [L_G′] ∈ M_{m×n}(R), so m ≥ n by (iii). We have shown that R satisfies the rank condition. □

We now consider a condition stronger than the rank condition, called the strong rank condition.

Definition 2.5: R satisfies the strong rank condition if for any n ∈ ℕ, any set of linearly independent elements in R^n has cardinality ≤ n.

Analogously to the rank condition, the strong rank condition can also be characterized by three equivalent statements.

Proposition 2.6: The following conditions are equivalent:
(i) R satisfies the strong rank condition.
(ii) As right free modules, if 0 → R^m → R^n is exact, then m ≤ n.
(iii) For any right R-module M generated by n elements, any n + 1 elements of M are linearly dependent.
Proof:
(i) ⇒ (ii) As before, each list v = (v₁, …, v_m) of elements of R^n determines a linear transformation L_v : R^m → R^n by L_v(α) = v₁α₁ + … + v_mα_m for all α = (α₁, …, α_m) ∈ R^m. If 0 → R^m → R^n is exact, the monomorphism R^m → R^n is of this form L_v, so v is linearly independent in R^n, and by (i) we get m ≤ n, since |v| = m.
(ii) ⇒ (i) Let n ∈ ℕ and let H be a linearly independent subset of R^n with |H| = m. Then there exists a monomorphism L_H : R^m → R^n given by L_H(α) = r₁α₁ + … + r_mα_m for all α = (α₁, …, α_m) ∈ R^m, where r₁, …, r_m are the elements of H. This implies that 0 → R^m → R^n is exact, and m ≤ n by (ii). We have shown that R satisfies the strong rank condition.
(i) ⇔ (iii) First suppose that in any right R-module M generated by n elements, any n + 1 elements of M are linearly dependent. Then, for any n, R^n_R cannot contain a copy of R^{n+1}_R. This implies that R satisfies the strong rank condition. Conversely, assume that R satisfies the strong rank condition. Let M be any right R-module generated by x₁, …, x_n and let y₁, …, y_{n+1} ∈ M. Let φ : R^n → M be the R-epimorphism defined by φ(e_i) = x_i (where {e_i} is the standard basis of R^n), and choose f_i ∈ R^n (1 ≤ i ≤ n+1) such that φ(f_i) = y_i. By hypothesis, f₁, …, f_{n+1} must be linearly dependent. Applying φ, we see that y₁, …, y_{n+1} are likewise linearly dependent. □

The preceding proposition implies the following result.

Corollary 2.7: A ring R satisfies the strong rank condition if and only if every homogeneous system of n linear equations over R, i.e. { Σ_{j=1}^{m} a_{ij} x_j = 0 | 1 ≤ i ≤ n }, with m > n unknowns has a nontrivial solution over R.
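Every field satisfies the strong rank condition, so Corollary 2.7 can be illustrated concretely over ℚ: a homogeneous system with more unknowns than equations always has a nontrivial solution. The following small SymPy sketch (an illustration only; the coefficient matrix is an arbitrary choice) computes one such solution.

```python
from sympy import Matrix

# Corollary 2.7 over the field Q: n = 2 equations in m = 3 > n unknowns,
# so the system A*x = 0 must have a nontrivial solution.
A = Matrix([[1, 2, 3],
            [4, 5, 6]])          # arbitrary example coefficients
null_basis = A.nullspace()       # basis of the solution space of A*x = 0
print(null_basis[0])             # a nontrivial solution vector
```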

The following observation shows that the strong rank condition is indeed stronger than the rank condition.

Proposition 2.8: If R satisfies the strong rank condition, then it satisfies the rank condition.

Proof: Assume R satisfies the strong rank condition, and consider an epimorphism φ : R^k → R^n. Then φ must split (recall that every exact sequence of free modules splits), so we get a monomorphism ψ : R^n → R^k with φ∘ψ = 1_{R^n}. By the strong rank condition we have n ≤ k. We have shown that R satisfies the rank condition. □
The following definition and proposition concern stable finiteness, which is stronger than IBN and the rank condition.

Definition 2.9: A ring R is Dedekind finite if for any x, y ∈ R, xy = 1 implies yx = 1. A ring R is stably finite if the matrix rings M_n(R) are Dedekind finite for all natural numbers n.
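For instance, matrix rings over a field are Dedekind finite: if XY = I then X is invertible, hence Y = X⁻¹ and YX = I as well. A minimal numerical sketch of Definition 2.9 (an illustration only; X is an arbitrarily chosen invertible matrix):

```python
import numpy as np

# Dedekind finiteness of M_n(R) for R the real numbers:
# if X @ Y = I then X is invertible, so Y = inv(X) and Y @ X = I too.
X = np.array([[2.0, 1.0],
              [1.0, 1.0]])       # arbitrary invertible matrix
Y = np.linalg.inv(X)             # then X @ Y = I
print(np.allclose(X @ Y, np.eye(2)), np.allclose(Y @ X, np.eye(2)))  # True True
```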

[5, Ch. 1 and 2] is a good reference for properties of stable finiteness. Stable finiteness can also be described by the following equivalent statements.

Proposition 2.10: The following conditions are equivalent:
(i) R is stably finite.
(ii) For all n ∈ ℕ, as free right modules, R^n ≅ R^n ⊕ K implies K = 0.
(iii) For all n ∈ ℕ, as free right modules, any epimorphism R^n → R^n is an isomorphism.
(iv) End(R_R) is Dedekind finite.

Proof:
(i) ⇒ (ii) Let n ∈ ℕ and R^n ≅ R^n ⊕ K (as free right R-modules). Then there is an isomorphism f : R^n → R^n ⊕ K with [f] ∈ M_{m×n}(R), where m = n + x for some dimension x of K, and there also exists g : R^n ⊕ K → R^n with [g] ∈ M_{n×m}(R) such that f∘g = 1_{R^m} and g∘f = 1_{R^n}. Now [f][g] = I_m and [g][f] = I_n ∈ M_n(R), so [f][g] = I_n ∈ M_n(R) by (i). On the other hand, [f][g] = I_m ∈ M_m(R) with m = n + x for some x, so we must have K = 0.
(ii) ⇒ (iii) Let n ∈ ℕ and let f : R^n → R^n be an epimorphism. Then the exact sequence 0 → Ker(f) → R^n → R^n → 0 splits, so R^n ≅ R^n ⊕ Ker(f); hence Ker(f) = 0 by (ii), which means that f is a monomorphism. We have shown that f is an isomorphism.
(iii) ⇒ (i) Let n ∈ ℕ and X, Y ∈ M_n(R) with XY = I_n. Then we get maps R^n → R^n → R^n corresponding to some [f] = X, [g] = Y ∈ M_n(R). Because f is an epimorphism, f is an isomorphism by (iii), so there exists Z ∈ M_n(R) with ZX = I_n. But then
    XY = I_n ⇒ Z(XY) = Z I_n ⇒ (ZX)Y = Z ⇒ I_n Y = Z ⇒ Y = Z,
so YX = I_n. We have shown that M_n(R) is Dedekind finite, i.e. R is stably finite.
(iii) ⇔ (iv) Let a, b ∈ E with ab = 1. Then a is surjective, so by (iii) a is an isomorphism, hence there exists c ∈ E such that ca = 1. But then
    c = c(ab) = (ca)b = 1·b = b,
so ba = 1. We have shown that E is Dedekind finite. Conversely, let n ∈ ℕ and let f : R^n → R^n be any epimorphism. Since f splits, there exists g : R^n → R^n with f∘g = 1_{R^n}. Now f, g ∈ E, and by (iv) we get g∘f = 1, so f is an isomorphism. □

Since R ≅ End(R_R), we obtain the following somewhat surprising conclusion.

Corollary 2.11: R is stably finite if and only if it is Dedekind finite.



3. THE RELATIONSHIP BETWEEN IBN, (STRONG) RANK CONDITION AND STABLE FINITENESS

We begin with the behaviour of IBN, the rank condition, the strong rank condition and stable finiteness under ring homomorphisms.

Proposition 3.1: The following statements hold:
1) If f : R → S is a ring homomorphism and S satisfies IBN (resp. the rank condition), then so does R.
2) If f : R → S is a ring homomorphism such that S becomes a flat left R-module under f and S satisfies the strong rank condition, then so does R.
3) If f : R → S is an embedding of the ring R into the ring S and S is stably finite, then so is R.

Proof:
1) Let φ : R^k → R^n be an isomorphism of right free modules. Tensoring with ⊗_R S we get an isomorphism φ ⊗_R S : R^k ⊗_R S → R^n ⊗_R S. But R^k ⊗_R S ≅ S^k and R^n ⊗_R S ≅ S^n, so we get an isomorphism S^k → S^n. By IBN on S we get k = n. The rank condition is handled in the same way: let φ : R^k → R^n be an epimorphism of right free modules. Tensoring with ⊗_R S we get an epimorphism S^k → S^n, so k ≥ n by the rank condition on S.
2) Let φ : R^k → R^n be a monomorphism of right free modules. Tensoring with the flat module _R S, we get a monomorphism φ ⊗_R S : S^k → S^n, so k ≤ n by the strong rank condition on S. We have shown that R satisfies the strong rank condition.
3) Upon identifying R with f(R), the identity e of R is an idempotent in S, with complementary idempotent e′ = 1 − e satisfying Re′ = e′R = 0. Let A, B ∈ M_n(R) be such that AB = eI_n. Then, for A + e′I_n, B + e′I_n ∈ M_n(S),
    (A + e′I_n)(B + e′I_n) = AB + e′²I_n = eI_n + e′I_n = (e + e′)I_n = I_n.
If S is stably finite, this implies that I_n = (B + e′I_n)(A + e′I_n) = BA + e′I_n, so BA = I_n − e′I_n = (1 − e′)I_n = eI_n. This shows that R is stably finite. □

The following result is very interesting.

Theorem 3.2: For any nonzero ring R,
    stable finiteness ⇒ rank condition ⇒ IBN.

Proof: First assume R satisfies the rank condition. If R^n ≅ R^m, then the rank condition gives n ≥ m and m ≥ n, so m = n. Therefore R has IBN. Now assume R does not satisfy the rank condition. Then there exists an epimorphism φ : R^k → R^n with k < n (k, n ∈ ℕ). But then
    R^k ≅ R^n ⊕ ker(φ) ≅ (R^k ⊕ R^{n−k}) ⊕ ker(φ) ≅ R^k ⊕ (R^{n−k} ⊕ ker(φ)),
where R^{n−k} ≠ 0, so R^{n−k} ⊕ ker(φ) ≠ 0. Therefore R is not stably finite. We have shown that stable finiteness implies the rank condition. □

The hypothesis that R is simple will be needed when we show that the rank condition implies stable finiteness.

Proposition 3.3: A simple ring R satisfies the rank condition if and only if it is stably finite.

Proof: The "if" part follows from Theorem 3.2. Conversely, if R satisfies the rank condition, then some quotient R/I ≠ 0 is stably finite for some ideal I. Since R is simple, I = 0, so the projection map R → R/I is an isomorphism, and R ≅ R/I is stably finite. □

The following result concerns the relationship between the ring R and M_n(R), R[[x]], R[x] with respect to IBN, the rank condition and stable finiteness.

Proposition 3.4: For any nonzero ring R, the following properties are satisfied:
(i) R satisfies IBN/the rank condition/stable finiteness iff for all n ∈ ℕ, M_n(R) satisfies IBN/the rank condition/stable finiteness.
(ii) R satisfies IBN/the rank condition/stable finiteness iff R[[x]] satisfies IBN/the rank condition/stable finiteness.
(iii) R satisfies IBN/the rank condition/stable finiteness iff R[x] satisfies IBN/the rank condition/stable finiteness.
Schematically:
    M_n(R) is "IBN / RC / SF"  ⟺  R is "IBN / RC / SF"  ⟺  R[x] is "IBN / RC / SF"  ⟺  R[[x]] is "IBN / RC / SF".
Proof:
(i) The relationship between the ring R and M_n(R) for IBN, the rank condition, and stable finiteness.
a) Suppose M_n(R) has IBN. Since there is a ring homomorphism f : R → M_n(R) defined by r ↦ f(r) = diag(r, …, r), R also has IBN by Proposition 3.1. Conversely, suppose M_n(R) does not have IBN. Then there exist p, q ∈ ℕ with p ≠ q and matrices A ∈ M_{p×q}(M_n(R)), B ∈ M_{q×p}(M_n(R)) with AB = I_p and BA = I_q. On the other hand, we can view A ∈ M_{np×nq}(R) and B ∈ M_{nq×np}(R) with np ≠ nq, so R does not have IBN. We have shown that if R has IBN then M_n(R) has IBN.
b) Suppose M_n(R) satisfies the rank condition. Since there is a ring homomorphism f : R → M_n(R) defined by r ↦ f(r) = diag(r, …, r), R also satisfies the rank condition. Conversely, suppose M_n(R) does not satisfy the rank condition. Then there exist p, q ∈ ℕ with p > q ≥ 1 and matrices A ∈ M_{p×q}(M_n(R)), B ∈ M_{q×p}(M_n(R)) with AB = I_p. On the other hand, we can view A ∈ M_{np×nq}(R) and B ∈ M_{nq×np}(R) with np > nq ≥ 1, so R does not satisfy the rank condition. We have shown that if R satisfies the rank condition then M_n(R) satisfies the rank condition.
c) Suppose M_n(R) is stably finite. Since there is an embedding ε : R → M_n(R) defined by r ↦ ε(r) = diag(r, …, r), R is also stably finite. Conversely, suppose M_n(R) is not stably finite. Then there exist p ∈ ℕ and A, B ∈ M_p(M_n(R)) with AB = I_p ≠ BA. Since M_p(M_n(R)) ≅ M_{pn}(R), we see that M_{pn}(R) is not Dedekind finite, i.e. R is not stably finite.
(ii) The relationship between the ring R and R[[x]] for IBN, the rank condition, and stable finiteness.
a) Suppose R[[x]] has IBN/the rank condition. Since we have the ring homomorphism f : R → R[[x]] given by the inclusion of constant power series, R also has IBN/the rank condition. Conversely, suppose R has IBN/the rank condition. Since we also have the ring homomorphism g : R[[x]] → R defined by Σ_{i≥0} r_i x^i ↦ r₀, we get that R[[x]] also has IBN/the rank condition.
b) Suppose R[[x]] is stably finite. Since R is a subring of R[[x]], R is also stably finite. Conversely, suppose R is stably finite, and consider the ideal (x) of R[[x]] generated by x. Since 1 + (x) consists of units of R[[x]], we get (x) ⊆ rad(R[[x]]). But then R[[x]]/(x) ≅ R, and R[[x]]/(x) is stably finite iff R[[x]] is stably finite, so the fact that R is stably finite implies that R[[x]] is also stably finite.
(iii) The relationship between the ring R and R[x] for IBN, the rank condition, and stable finiteness.
a) For IBN/the rank condition, the proof is the same as in part a) of (ii) above.
b) Suppose R[x] is stably finite. Since R is a subring of R[x], R is also stably finite. Conversely, suppose R is stably finite. From part b) of (ii) above, R[[x]] is stably finite, so the subring R[x] is also stably finite. □

The following result concerns the relationship between the ring R and M_n(R), R[[x]], R[x] for the strong rank condition.

Proposition 3.5: For any nonzero ring R, the following properties are satisfied:
i) R satisfies the strong rank condition iff for all n ∈ ℕ, M_n(R) satisfies the strong rank condition.
ii) If R[[x]] satisfies the strong rank condition, then R satisfies the strong rank condition.
iii) If R[x] satisfies the strong rank condition, then R satisfies the strong rank condition.
Schematically:
    M_n(R) satisfies "SRC"  ⟺  R satisfies "SRC"  ⟸  R[x] satisfies "SRC";   R[[x]] satisfies "SRC"  ⟹  R satisfies "SRC".

Proof:
(i) Suppose M_n(R) satisfies the strong rank condition, and consider the embedding ε : R → M_n(R) defined by r ↦ ε(r) = diag(r, …, r). Viewing M_n(R) as a left R-module via ε, we have M_n(R) ≅ R^{n²}; in particular, M_n(R) is a flat R-module, so R satisfies the strong rank condition. Conversely, suppose M_n(R) does not satisfy the strong rank condition. Then there exists m ∈ ℕ such that there is an embedding (M_n(R))^{m+1} → (M_n(R))^m. Since M_n(R) ≅ R^{n²} as R-modules, this leads to an embedding R^{(m+1)n²} → R^{mn²}. Hence R also does not satisfy the strong rank condition.
(ii) Suppose R[[x]] satisfies the strong rank condition. Viewing R[[x]] as a left R-module via the embedding R → R[[x]], we have R[[x]] ≅ R × R × ⋯, which is a flat module. Therefore R also satisfies the strong rank condition.
(iii) Suppose R[x] satisfies the strong rank condition. Then so does R, since R[x] ≅ R ⊕ R ⊕ ⋯ as a left R-module, which is free and flat. □

References

[1] BERRICK, A. J., KEATING, M. E., An Introduction to Rings and Modules, 2000, Cambridge University Press,
United Kingdom.
[2] BREAZ, S., CALUGAREANU, G., AND SCHULTZ, P., 1991, Modules With Dedekind Finite Endomorphism Rings,
Babes Bolyai University, 1-13.
[3] HAGHANY, A., VARADARAJAN, K., 2002, IBN And Related Properties For Rings, Acta Mathematica
Hungarica, 94, 251-261.
[4] HUNGERFORD, T. W. , Algebra, 2000, Springer Verlag, New York.
[5] LAM, T. Y., A First Course in Noncommutative Rings, 1991, Springer Verlag, New York.
[6] LAM, T. Y., Exercises in Modules and Rings, 2007, Springer Verlag, New York.
[7] LAM, T. Y., Lectures On Modules And Rings, 1999, Springer Verlag, New York.

SAMSUL ARIFIN
Department of Mathematics
Surya College of Teaching and Education
e-mail: [email protected]

INDAH EMILIA WIJAYANTI


Department of Mathematics
Gadjah Mada University
e-mail: [email protected]
Proceedings of ”The 6th SEAMS-UGM Conference 2011”
Algebra, pp. 233 - 240.

THE ECCENTRIC DIGRAPH OF Pn × Pm GRAPH

SRI KUNTARI AND TRI ATMOJO KUSMAYADI

Abstract. Let G be a graph with vertex set V(G) and edge set E(G). The distance from vertex u to vertex v in G, denoted by d(u, v), is the length of a shortest path from vertex u to v. The eccentricity of vertex u in graph G, denoted by e(u), is the maximum distance from vertex u to any other vertex in G. Vertex v is an eccentric vertex of u if d(u, v) = e(u). The eccentric digraph ED(G) of a graph G is the digraph that has the same vertex set as G, with an arc (directed edge) joining vertex u to v if v is an eccentric vertex of u. In this paper, we answer the open problem proposed by Boland and Miller [1] to find the eccentric digraph of various classes of graphs. In particular, we determine the eccentric digraph of the cartesian product graph Pn × Pm, where Pn and Pm are paths with n and m vertices, respectively.

Keywords and Phrases: Eccentricity, eccentric digraph, Pn × Pm graph.

1. INTRODUCTION

Most of the notation and terminology follows that of Chartrand and Oellermann [2] and Gallian [4]. Let G be a graph with vertex set V(G) and edge set E(G). The distance from vertex u to vertex v in G, denoted by d(u, v), is the length of a shortest path from vertex u to v. If there is no path joining vertex u and vertex v, then d(u, v) = ∞. The eccentricity of vertex u in graph G, denoted by e(u), is the maximum distance from vertex u to any other vertex in G, so e(u) = max{d(u, v) | v ∈ V(G)}. The radius of a graph G, denoted by rad(G), is the minimum eccentricity over all vertices of G. The diameter of a graph G, denoted by diam(G), is the maximum eccentricity over all vertices of G. If e(u) = rad(G), then vertex u is called a central vertex. The center of a graph G, denoted by cen(G), is the subgraph induced by the central vertices of G. Vertex v is an eccentric vertex of u if d(u, v) = e(u). The eccentric digraph ED(G) of a graph G is the digraph that has the same vertex set as G, V(ED(G)) = V(G), with an arc (directed edge) joining vertex u to v if v is an eccentric vertex of u. A pair of arcs of a digraph D joining vertex u to v and vertex v to u is called a symmetric arc. Further, Fred Buckley concluded that for almost every graph G, its eccentric digraph is ED(G) = G*, where G* is the complement of G in which every edge is replaced by a symmetric arc.

_____________
2010 Mathematics Subject Classification: 05C99
One of the topics in graph theory is to determine the eccentric digraph of a given
graph. The eccentric digraph of a graph was initially introduced by Fred Buckley (Boland and
Miller [1]). Some authors have investigated the problem of finding the eccentric digraph.
For example, Boland and Miller [1] determined the eccentric digraph of a digraph, while
Gimbert et al. [5] found the characterisation of the eccentric digraphs. Boland and Miller [1]
also proposed an open problem to find the eccentric digraph of various classes of graphs.
Some results related to this open problem can be found in [6, 7].
In this paper, we also answer the open problem proposed by Boland and Miller [1]. In
particular, we determine the eccentric digraph of the cartesian product graph Pn × Pm.

2. MATERIALS AND METHODS

The materials of this research are mostly taken from papers related to the eccentric digraph.
There are three steps to determine the eccentric digraph of a given graph. In the first step, we determine the distance from vertex u to every vertex v in the graph, denoted by d(u, v), using the Breadth First Search (BFS) Moore Algorithm taken from Chartrand and Oellermann [2], as follows.
(1) Take any vertex, say u, and label it 0, stating the distance from u to itself; all other vertices are labelled ∞.
(2) All vertices with label ∞ that are adjacent to u are labelled 1.
(3) All vertices with label ∞ that are adjacent to a vertex labelled 1 are labelled 2, and so on until the required vertex, say v, has been labelled.
In the second step, we determine the eccentricity of vertex u by taking the maximum distance from u, and we obtain the eccentric vertices v of u, i.e. those with d(u, v) = e(u).
In the final step, by joining an arc from each vertex u to each of its eccentric vertices, we obtain the eccentric digraph of the given graph. A short computational sketch of these three steps is given below.
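The following Python sketch (an illustration, not part of the original paper) carries out the three steps above for the cartesian product Pn × Pm: BFS distances from each vertex, eccentricities, and the arcs of the eccentric digraph. The function name and the example sizes are our own choices.

```python
from collections import deque

# Illustrative sketch of the three steps for the grid graph Pn x Pm:
# vertices are pairs (i, j), 1 <= i <= n, 1 <= j <= m, adjacent when they
# differ by 1 in exactly one coordinate.
def eccentric_digraph_of_grid(n, m):
    vertices = [(i, j) for i in range(1, n + 1) for j in range(1, m + 1)]

    def neighbours(v):
        i, j = v
        cand = [(i - 1, j), (i + 1, j), (i, j - 1), (i, j + 1)]
        return [(a, b) for (a, b) in cand if 1 <= a <= n and 1 <= b <= m]

    def bfs_distances(u):          # step 1: BFS (Moore) labelling from u
        dist = {u: 0}
        queue = deque([u])
        while queue:
            v = queue.popleft()
            for w in neighbours(v):
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
        return dist

    arcs = []
    for u in vertices:
        dist = bfs_distances(u)
        ecc = max(dist.values())                              # step 2: e(u)
        arcs += [(u, v) for v in vertices if dist[v] == ecc]  # step 3: arcs
    return arcs

# Example: arcs of the eccentric digraph of P4 x P6 (cf. Figure 1).
print(len(eccentric_digraph_of_grid(4, 6)))
```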

3. RESULTS AND DISCUSSIONS

Let Pn and Pm be two paths with vertex sets V(Pn) = {u₁, u₂, …, u_n}, V(Pm) = {v₁, v₂, …, v_m} and edge sets E(Pn) = {u_i u_{i+1} ; i = 1, 2, …, n−1}, E(Pm) = {v_j v_{j+1} ; j = 1, 2, …, m−1}, respectively. According to Chartrand and Lesniak [3], we can construct the cartesian product Pn × Pm with vertex set V(Pn × Pm) = {α_{ij} ; i = 1, 2, …, n, j = 1, 2, …, m}, where α_{ij} = (u_i, v_j), and edge set E(Pn × Pm) = {α_{ij} α_{i(j+1)}, α_{ij} α_{(i+1)j} ; i = 1, 2, …, n, j = 1, 2, …, m} (whenever both endpoints exist).
The following results give the eccentric digraphs of the cartesian product graph Pn × Pm. There are three cases to consider, based on the parities of n and m.

Theorem 3.1. Let Pn × Pm be the cartesian product graph where n and m are even. Then the eccentric digraph ED(Pn × Pm) is the digraph 2S_{p,p}, where S_{p,p} is a double star with p = (nm − 4)/4.

Proof. Using the BFS Moore Algorithm, we obtain the eccentricities of the vertices of Pn × Pm as in Table 1.

Table 1. Eccentricities of the vertices of Pn × Pm, where n and m are even

vertex of Pn × Pm                                                                       eccentricity
α_{11}, α_{1m}, α_{n1}, α_{nm}                                                          n + m − 2
α_{12}, α_{21}, α_{1(m−1)}, α_{2m}, α_{(n−1)1}, α_{n2}, α_{(n−1)m}, α_{n(m−1)}          n + m − 3
...                                                                                     ...
α_{(n/2)(m/2)}, α_{(n/2+1)(m/2)}, α_{(n/2)(m/2+1)}, α_{(n/2+1)(m/2+1)}                  (n + m)/2

The eccentricities of all vertices from Table 1 are used to determine the eccentric vertices of all vertices of Pn × Pm. The arcs are obtained by joining every vertex to its eccentric vertex. Table 2 shows the eccentric vertices and arcs of the vertices of Pn × Pm.

Table 2. The eccentric vertices and arcs of the vertices of Pn × Pm, where n and m are even

vertex of Pn × Pm                                            eccentric vertex     arc
α_{ij}, i = 1, 2, …, n/2;    j = 1, 2, …, m/2                α_{nm}               α_{ij} → α_{nm}
α_{ij}, i = n/2 + 1, …, n;   j = 1, 2, …, m/2                α_{1m}               α_{ij} → α_{1m}
α_{ij}, i = 1, 2, …, n/2;    j = m/2 + 1, …, m               α_{n1}               α_{ij} → α_{n1}
α_{ij}, i = n/2 + 1, …, n;   j = m/2 + 1, …, m               α_{11}               α_{ij} → α_{11}
From Table 2, each vertex of Pn × Pm is joined by an arc to its eccentric vertex. These arcs are not symmetric, except for the arcs between α_{11} and α_{nm} and between α_{n1} and α_{1m}. Based on these arcs, the vertex set V(ED(Pn × Pm)) can be partitioned into two disjoint subsets, each inducing a double star S_{p,p} with p = (nm − 4)/4. Therefore the eccentric digraph of Pn × Pm, where n and m are even, is the union of two double stars, 2S_{p,p} with p = (nm − 4)/4. □

The following figure shows an example: the cartesian product graph P4 × P6 and its eccentric digraph.

Figure 1. The cartesian product P4 × P6 graph and its eccentric digraph

Theorem 3.2. Let Pn × Pm be the cartesian product graph where n is even and m is odd. Then the eccentric digraph ED(Pn × Pm) is the digraph S¹_{p,p} ∪ S²_{p,p} ∪ 2K_{2,n/2}, where S¹_{p,p} and S²_{p,p} are double stars with p = (nm − n − 4)/4 and K_{2,n/2} = K₂ + K_{n/2}, where K₂ is a set of two cut vertices of S¹_{p,p} and S²_{p,p}.

Proof. Using the BFS Moore Algorithm, we obtain the eccentricities of the vertices of Pn × Pm as in Table 3. The eccentricities of all vertices from Table 3 are used to determine the eccentric vertices of all vertices of Pn × Pm. The arcs are obtained by joining every vertex to its eccentric vertices. Table 4 shows the eccentric vertices and arcs of the vertices of Pn × Pm.

Table 3. Eccentricities of the vertices of Pn × Pm, where n is even and m is odd

vertex of Pn × Pm                                                                                   eccentricity
α_{11}, α_{1m}, α_{n1}, α_{nm}                                                                      n + m − 2
α_{12}, α_{21}, α_{1(m−1)}, α_{2m}, α_{(n−1)1}, α_{n2}, α_{(n−1)m}, α_{n(m−1)}                      n + m − 3
...                                                                                                 ...
α_{(n/2)((m−1)/2)}, α_{(n/2+1)((m−1)/2)}, α_{(n/2)((m+3)/2)}, α_{(n/2+1)((m+3)/2)}                  (n + m + 1)/2
α_{1((m+1)/2)}, α_{n((m+1)/2)}                                                                      n − 1 + (m − 1)/2
...                                                                                                 ...
α_{(n/2)((m+1)/2)}, α_{(n/2+1)((m+1)/2)}                                                            (n + m − 1)/2

Table 4. The eccentric vertices and arcs of the vertices of Pn × Pm, where n is even and m is odd

vertex of Pn × Pm                                                eccentric vertices     arcs
α_{ij}, i = 1, 2, …, n/2;    j = 1, 2, …, (m−1)/2               α_{nm}                 α_{ij} → α_{nm}
α_{ij}, i = n/2 + 1, …, n;   j = 1, 2, …, (m−1)/2               α_{1m}                 α_{ij} → α_{1m}
α_{ij}, i = 1, 2, …, n/2;    j = (m+1)/2 + 1, …, m              α_{n1}                 α_{ij} → α_{n1}
α_{ij}, i = n/2 + 1, …, n;   j = (m+1)/2 + 1, …, m              α_{11}                 α_{ij} → α_{11}
α_{ij}, i = 1, 2, …, n/2;    j = (m+1)/2                        α_{n1}, α_{nm}         α_{ij} → α_{n1}, α_{ij} → α_{nm}
α_{ij}, i = n/2 + 1, …, n;   j = (m+1)/2                        α_{11}, α_{1m}         α_{ij} → α_{11}, α_{ij} → α_{1m}

From Table 4, each vertex of Pn × Pm is joined by arcs to its eccentric vertices. These arcs are not symmetric, except for the arcs between α_{11} and α_{nm} and between α_{n1} and α_{1m}. Therefore the eccentric digraph of Pn × Pm, where n is even and m is odd, is S¹_{p,p} ∪ S²_{p,p} ∪ 2K_{2,n/2}, where S¹_{p,p} and S²_{p,p} are double stars with p = (nm − n − 4)/4 and K_{2,n/2} = K₂ + K_{n/2}, where K₂ is a set of two cut vertices of S¹_{p,p} and S²_{p,p}. □

The following figure shows the cartesian product graph P4 × P5 and its eccentric digraph.

Figure 2. The cartesian product P4 × P5 graph and its eccentric digraph

Theorem 3.3. Let Pn × Pm be the cartesian product graph where n and m are odd. Then the eccentric digraph ED(Pn × Pm) is the digraph S¹_{p,p} ∪ S²_{p,p} ∪ K_{2,(n−1)/2} ∪ K_{2,(m−1)/2} ∪ K_{1,4}, where S¹_{p,p} and S²_{p,p} are double stars with p = ((n−1)(m−1) − 4)/4 and K_{1,4} = K₁ + K₄, where K₄ is a set of all cut vertices of S¹_{p,p} and S²_{p,p}.

Proof. Using the BFS Moore Algorithm, we obtain the eccentricities of the vertices of Pn × Pm as in Table 5. Table 6 shows the eccentric vertices and arcs of the vertices of Pn × Pm.
Table 5. Eccentricities of the vertices of Pn × Pm, where n and m are odd

vertex of Pn × Pm                                                                                       eccentricity
α_{11}, α_{1m}, α_{n1}, α_{nm}                                                                          n + m − 2
α_{12}, α_{21}, α_{1(m−1)}, α_{2m}, α_{(n−1)1}, α_{n2}, α_{(n−1)m}, α_{n(m−1)}                          n + m − 3
...                                                                                                     ...
α_{((n−1)/2)((m−1)/2)}, α_{((n+3)/2)((m−1)/2)}, α_{((n−1)/2)((m+3)/2)}, α_{((n+3)/2)((m+3)/2)}          (n + m)/2 + 1
α_{1((m+1)/2)}, α_{n((m+1)/2)}                                                                          n − 1 + (m − 1)/2
...                                                                                                     ...
α_{((n−1)/2)((m+1)/2)}, α_{((n+3)/2)((m+1)/2)}                                                          (n + m)/2
α_{((n+1)/2)1}, α_{((n+1)/2)m}                                                                          m − 1 + (n − 1)/2
...                                                                                                     ...
α_{((n+1)/2)((m−1)/2)}, α_{((n+1)/2)((m+3)/2)}                                                          (n + m)/2
α_{((n+1)/2)((m+1)/2)}                                                                                  (n + m)/2 − 1

Table 6. The eccentric vertices and arcs of the vertices of Pn × Pm, where n and m are odd

vertex of Pn × Pm                                                      eccentric vertices                     arcs
α_{ij}, i = 1, …, (n−1)/2;       j = 1, …, (m−1)/2                     α_{nm}                                 α_{ij} → α_{nm}
α_{ij}, i = (n+1)/2 + 1, …, n;   j = 1, …, (m−1)/2                     α_{1m}                                 α_{ij} → α_{1m}
α_{ij}, i = 1, …, (n−1)/2;       j = (m+1)/2 + 1, …, m                 α_{n1}                                 α_{ij} → α_{n1}
α_{ij}, i = (n+1)/2 + 1, …, n;   j = (m+1)/2 + 1, …, m                 α_{11}                                 α_{ij} → α_{11}
α_{ij}, i = 1, …, (n−1)/2;       j = (m+1)/2                           α_{n1}, α_{nm}                         α_{ij} → α_{n1}, α_{ij} → α_{nm}
α_{ij}, i = (n+1)/2 + 1, …, n;   j = (m+1)/2                           α_{11}, α_{1m}                         α_{ij} → α_{11}, α_{ij} → α_{1m}
α_{ij}, i = (n+1)/2;             j = 1, …, (m−1)/2                     α_{1m}, α_{nm}                         α_{ij} → α_{1m}, α_{ij} → α_{nm}
α_{ij}, i = (n+1)/2;             j = (m+1)/2 + 1, …, m                 α_{11}, α_{n1}                         α_{ij} → α_{11}, α_{ij} → α_{n1}
α_{ij}, i = (n+1)/2;             j = (m+1)/2                           α_{11}, α_{n1}, α_{1m}, α_{nm}         α_{ij} → α_{11}, α_{ij} → α_{n1}, α_{ij} → α_{1m}, α_{ij} → α_{nm}

From Table 6, each vertex of Pn × Pm is joined by arcs to its eccentric vertices. These arcs are not symmetric, except for the arcs between α_{11} and α_{nm} and between α_{n1} and α_{1m}. Therefore the eccentric digraph of Pn × Pm, where n and m are odd, is S¹_{p,p} ∪ S²_{p,p} ∪ K_{2,(n−1)/2} ∪ K_{2,(m−1)/2} ∪ K_{1,4}, where S¹_{p,p} and S²_{p,p} are double stars with p = ((n−1)(m−1) − 4)/4 and K_{1,4} = K₁ + K₄, where K₄ is a set of all cut vertices of S¹_{p,p} and S²_{p,p}. □

The following figure shows an example: the cartesian product graph P3 × P5 and its eccentric digraph.

Figure 3. The cartesian product P3 × P5 graph and its eccentric digraph

4. CONCLUDING REMARK

The results show that the eccentric digraph of the Pn × Pm graph is the digraph
1. 2S_{p,p}, where S_{p,p} is a double star with p = (nm − 4)/4, for n and m even;
2. S¹_{p,p} ∪ S²_{p,p} ∪ 2K_{2,n/2}, where S¹_{p,p} and S²_{p,p} are double stars with p = (nm − n − 4)/4 and K_{2,n/2} = K₂ + K_{n/2}, where K₂ is a set of two cut vertices of S¹_{p,p} and S²_{p,p}, for n even and m odd;
3. S¹_{p,p} ∪ S²_{p,p} ∪ K_{2,(n−1)/2} ∪ K_{2,(m−1)/2} ∪ K_{1,4}, where S¹_{p,p} and S²_{p,p} are double stars with p = ((n−1)(m−1) − 4)/4 and K_{1,4} = K₁ + K₄, where K₄ is a set of all cut vertices of S¹_{p,p} and S²_{p,p}, for n and m odd.
As mentioned in the previous sections, the main goal of this paper is to find the eccentric digraph of a given class of graphs. Several authors have conducted research on this problem, and most of them have left open problems in their papers for future research. We suggest that readers investigate the problem proposed by Boland and Miller [1] by considering other classes of graphs.

References

[1] BOLAND, J. AND M. MILLER, The Eccentric Digraph of a Digraph, Proceeding of AWOCA’01, Lembang-
Bandung, Indonesia, 2001.
[2] CHARTRAND, G. AND O. R. OELLERMANN, Applied and Algorithmic Graph Theory, International Series in
Pure and Applied Mathematics, McGraw-Hill Inc, California, 1993.
[3] CHARTRAND, G. AND L. LESNIAK, Graphs and Digraphs, Chapman and Hall/CRC, New York, 1996.
[4] GALLIAN, J. A. Dynamic Survey of Graph Labeling, The Electronic Journal of Combinatorics, 2009, #16, 1-
219.
[5] GIMBERT, J., N. LOPEZ, M. MILLER, AND J. RYAN, Characterization of eccentric digraphs, Discrete
Mathematics, 2006, Vol. 306, Issue 2, 210 - 219.
[6] KUSMAYADI, T.A. AND M. RIVAI. The Eccentric Digraph of an Umbrella Graph, Proceeding of INDOMS
International Conference on Mathematics and Its Applications (IICMA), Gadjah Mada University
Yogyakarta, Indonesia 2009, ISBN 978-602-96426-0-5, pp 0627-0638
[7] KUSMAYADI, T.A. AND M. RIVAI. The Eccentric Digraph of a Double Cones Graph, Proceeding of INDOMS
International Conference on Mathematics and Its Applications (IICMA), Gadjah Mada University
Yogyakarta, Indonesia 2009, ISBN 978-602-96426-0-5, pp 0639-0646

SRI KUNTARI
Department of Mathematics Faculty of Mathematics and Natural Sciences Sebelas Maret
University Surakarta.
e-mail: [email protected]

TRI ATMOJO KUSMAYADI


Department of Mathematics Faculty of Mathematics and Natural Sciences Sebelas Maret
University Surakarta.
e-mail: [email protected]
Proceedings of ”The 6th SEAMS-UGM Conference 2011”
Algebra, pp. 241 - 248.

ON -LINEARLY INDEPENDENT MODULES

SUPRAPTO, SRI WAHYUNI, INDAH EMILIA WIJAYANTI ,IRAWATI

Abstract. In this paper, we generalize R-linearly independent to be M-linearly independent, where


M is an arbitrary left R-module. Let { } be a class of R-modules and N be a left
R-module and R is a ring with unity. Then is linearly independent to N (or N is an -linearly
independent) if there exist a monomorphism ∐ . If { } is a singleton, it
means that there is a monomorphism , for index set . The set of a factor modules
of left R-module which is -linearly independent is denoted by .We observe some properties
of -linearly independent modules we obtain that N is -linearly independent if and only if there
exista finite indexed set with ∐ is a monomorphism. Furthermore, an R-
module N is finitely -linearly independent if and only if every , N is finitely -
linearly independent.We observe also some properties of . We obtain that is a
subcategory of category R-Mod, for any submodule and factor module of module in belong to
.

Keywords and Phrases :(finitely) -linearly independent, subcategory .

1. INTRODUCTION

We denote R as an associative ring with unit element , and by module we mean a


left R-module. A subset of left R-module N is called R-linearly independent if there exist a
monomorphism which defined by where with
and . In other hand, we also can call it, R is linearly independent to N (or N is R-linearly
independent).
By notion R-linearly independent, we can generalize the notion of linearly independent
of modules. Let be a class of R-modules and N be a left R-module. Then is linearly
independent to N (or N is an -linearly independent) if there is a monomorphism
∐ , which is defined by ∐ , where for
, with ∐ { }. If { } is a singleton, it means
there is a monomorphism , for any index set . The set of a factor modules of
left R-module which -linearly independent is denoted by .
In section 2, we study definition and observe some properties of -linearly independent

modules, (i.e.; finite indexed set, finitely -linearly independent, and ).We obtain that
N is -linearly independent if and only if there exist finite indexed set with
∐ is a monomorphism. Furthermore, N is finitely -linearly independent if and
only if every , N is a finitely -linearly independent.
In section 3, we study definition of notion subcategory for { } and shown
that is subcategory of category R-Mod. We obtain properties subcategory that are
for any submodule and factor module of modules in belong to .

2. -LINEARLY INDEPENDENT MODULES

The notion of -linearly independent is categorical. So it depends only on the objects


and the morphisms of the category R-Mod, and not on the elements of any module.
Here are shown some examples of module which is linearly independent to other
module.

Example 2.1Let be aring ofreal numbers and be a ring of polynomial over . Then
there exist a monomorphism , which is defined by where
with , and for .

Proof: Let be elements in . Then ∑ . So we


have for all . In other word is a monomorphism, or
is linearly independent to .□
Furthermore, for any ring R certain linearly independent to its formal power series [ ]
where for [ ], ∑ .

Example 2.2 Let { } be a class -modules and be a -module. Formed


a relation where satisfies and
. Then
and { } { } { }. In other word
is a monomorphism, or { }is linearly independent to .

Futhermore, ⁄ ⁄ . Since { } not


generates , then cannot use to form .

Example 2.3 Let { } be a class of -modules and be a -module.


Because , then { } both generates and linearly
independent to . So we have { }.

In studying notion of direct sum and -copies of modules, we need the properties of
morphisms direct sum such as on the following proposition.

Proposition 2.1 [Anderson, 1992, (6.8)] If is a morphism for index , then


∐ is direct sum of morphism with ∑ and
⋂ .

Study the characterization is an important way for observe further some properties. The
following proposition is the characterization of -linearly independent.

Proposition 2.2Let be a class of R-modules and let N be an R-module. Then is


linearly independent to N (or N is -linearly independent) if and only if there exist a finite
indexed set with ∐ is a monomorphism.

Proof: Let ∐ be a monomorphism with and any index set .


Then, by ∐ { } there exist a finite index set
where for and we have ⁄
∐ ∐ ⁄ is a
monomorphism. So ∐ is a monomorphism.
Let ∐ be a monomorphism with any finite index set and
. Suppose ⁄
∐ ∐ ⁄ is a morphism with

{ .

Thus ⁄
∐ ∐ ⁄
∐ is a mono-morphism, with
∐ ⁄ . This means N is -linearly independent. □

The following proposition is property of morphisms,which will be used to prove the


next property.

Proposition 2.3Let be a class of R-modules. For an R-modul N, the following assertions


are equivalent:
a. A module N is an -linearly independent.
b. For every nonzero monomorphism , there exist with
and is a monomorphism.
c. ⋂{ } .

Proof: Let ∐ be a morphism for indexed module in and with


∏ . Since is a monomorphism, by proposition 2.2 there exist a
singleton index with is a monomorphism and . This means for
every we have . So ( ) , this implies
is a monomorphism.
Let ⋂{ } be an intersection of and let
for , and we have is a monomorphism. Then
for -copies of , we have ∏ is a
monomorphism and . By Proposition 2.2, ⋂{ }
.
Let ∏ be a morphism for index set and let
⋂{ } be an intersection of . Then there exist a finite
index set with ⋂ { } . So ∏
is a monomorphism and by proposition 2.2, N is an -linearly independent. □

Moreover, we observe properties of finitely -linearly independent. The following


definition is a notion of finitely -linearly independent.

Definition 2.4 Let be a class of R-modules and let N be an R-module. is finite linearly
independent to N (or N is an finite -linearly independent), there is a monomorphism
∏ , for finite indexed module in .

Corollary 2.5Let be a class of R-modules. For an R-module N, this the following


assertions are equivalent:
a. is finite linearly independent to N.
b. For every { } with ⋂ , there exist a finite index set
with ⋂ .
c. Every { } is finite linearly independent to N.

Proof: This is clear by Proposition 2.3 .


This is clear by Proposition 2.3 .□

Proposition 2.6Let be a class of R-modules and let N be an R-module. An R-module N is


isomorphic with direct sum of modules in if and only if is both linearly independent
to N and generates N.

Proof: Suppose N is isomorphic with direct sum of modules in . Then there exist
isomorphism for . So f is both monomorphism and epimorphism.
This implies is linearly independent to N and is generates N.
Suppose is both linearly independent to N and generates N. Then there exist an
isomorphism for . So N is isomorphic with direct sum of modules in
.□

3. SUBCATEGORY

In this section, it is shown some examples of , with will be use to show that is
a subcategory of a category R-Mod.

Example 3.1 In Example 2.2 ⁄ ⁄ . Since


{ } not generates , then can not used to form .

Example 3.2 In Example 2.3 { }.

Proposition 3.3 Let be a set of factor module of modules which is M-linearly


independent. Then is a subcategory of a category R-Mod.

Proof: Here are shown that is a category, as follow:


i. .
ii. .

iii. Composition morphisms in is the restrictions in R-Mod.


Since set of factor modules contained in R-Mod and set of monomorphisms contained in set
of morphisms, then and .
Also, since composition of morphisms in is defined by composition morphisms in R-
Mod, then composition morphisms in is the restriction in R-Mod. □

The following proposition isinteresting properties of subcategory , which is similar


to properties of subcategory σ[M].

Proposition 3.4Let be a category of factor modules of module which is M-independent.


For any Q in then we have the following assertion;
1. For any submodule of Q in .
2. For any factor module of Q in .
3. in .

Proof:
1. Let N, K and L be R-modules. If , then ⁄ ⁄ . For N is M-linearly
independent, we have ⁄ . So we obtain K is M-linearly independent, this
implies ⁄ .
2. Let N, K and L be R-modules. If , then ⁄ ⁄ . By the modules
⁄ ⁄
isomorphism theorem, (i.e. ⁄ ⁄ ⁄ ), this implies ⁄ ⁄ .
3. Let ⁄ be a factor module in . Then we have ⁄ in . Now we
showed that ( ⁄ ) ⁄ , as follow:
We define ⁄ ( ⁄ ) or ̅̅̅̅̅̅̅ ̅̅̅ , so we have shown:
i. is a mapping. Take any ̅̅̅̅̅̅̅ ̅̅̅̅̅̅̅̅ ⁄ with ̅̅̅̅̅̅̅ ̅̅̅̅̅̅̅̅ , so
we have;

̅̅̅̅̅̅̅ ̅̅̅̅̅̅̅̅

̅̅̅̅̅̅ ̅̅̅̅̅̅
(̅̅̅̅̅̅̅) (̅̅̅̅̅̅̅̅) ,

ii. is a homomorphism. Take any ̅̅̅̅̅̅̅ ̅̅̅̅̅̅̅̅ ⁄ , so we have;


(̅̅̅̅̅̅̅ ̅̅̅̅̅̅̅̅) ( )
( )
( )
( )
(̅̅̅̅̅̅̅̅̅̅̅̅̅)

̅̅̅ ̅̅̅̅
̅̅̅ ̅̅̅̅
̅̅̅ ̅̅̅̅ ,
iii. is injective. Take any ̅̅̅ ̅̅̅̅ ( ⁄ ) with ̅̅̅ ̅̅̅̅ , so
we have;
̅̅̅ ̅̅̅̅

̅̅̅̅̅̅̅ ̅̅̅̅̅̅̅̅ ,

iv. is a surjective. Take any ̅̅̅ ( ⁄ ) , so we have some ̅̅̅̅̅̅̅


⁄ such that (̅̅̅̅̅̅̅) ̅̅̅ .

By (i), (ii), (iii), and (iv), is isomorphism or


( ⁄ ) ⁄
This imply ( ⁄ ) in . □

4. CONCLUDING REMARK

Based on previous sections, by defining of generalize of R-linearly independent to be


(finite) M-linearly independent, and is a set of factor modules of module which is M-
linearly independent, we have investigate some properties as follow;
1. is linearly independent to N (or N is -linearly independent) if and only if there
exist finite indexed set with ∐ is a monomorphism.
2. is linearly independent to N if and only if every nonzero monomorphism
, there exist with and is monomorphism.
3. is linearly independent to N if and only if ⋂{ } .
4. is finite linearly independent to N if an d only if every { } with
⋂ , there exist a finite index set with ⋂ .
5. is finite linearly independent to N if an d only if every { } is a finite
linearly independent to N.
6. is subcategory of category R-Mod.
7. For any submodule and factor module of modules in belong to .

References

[1] ADKINS, W., WEINTRAUB, S.H., Algebra, Springer-Verlag New York Berlin Heidelberg, 1992.
[2] ANDERSON, F.W., FULLER, K.R., Rings and Category of Modules, 2nd edition, Springer-Verlag, 1992.
[3] BEACHY, J. A., M-injective Modules and Prime M-ideals, Article,----
[4] GARMINIA, H., ASTUTI, P., Journal of The Indonesian Mathematical Society, Karakterisasi Modul -
koherediter, Indonesian Mathematical Society, 2006.
[5] MACLANE, S. , BIRKHOFF, G., Algebra, MacMillan Publishing CO., INC., New York, 1979.
[6] PASSMAN, S. DONALD, A Course in Ring Theory, Wadsworth & Brooks, Pacific Grove, California, 1991.
[7] WISBAUER, R., Foundations of Module and Ring Theory, University of Dűsseldorf, Germany, Gordon and
Breach Science Publishers, 1991.

SUPRAPTO
Junior High School of 1 Banguntapan, Bantul, Yogyakarta, Indonesia.
Ph.D. Student of Mathematics, Universitas Gadjah Mada, Yogyakarta, Indonesia.
e-mail : [email protected] HP 0817 273 274

SRI WAHYUNI
Department of Mathematics, Universitas Gadjah Mada, Yogyakarta, Indonesia.
e-mail : [email protected]

INDAH EMILIA WIJAYANTI


Department of Mathematics, Universitas Gadjah Mada, Yogyakarta, Indonesia.
e-mail : [email protected]

IRAWATI
Department of Mathematics, Institute Technology Bandung, Indonesia.
e-mail : [email protected]
Proceedings of ”The 6th SEAMS-UGM Conference 2011”
Algebra, pp. 249 - 258.

THE EXISTENCE OF MOORE PENROSE INVERSE IN RINGS WITH INVOLUTION

TITI UDJIANI SRRM, SRI WAHYUNI, BUDI SURODJO

Abstract. In this paper we discuss the Moore–Penrose inverse in rings with involution. J.J. Koliha and Pedro Patricio formulated the definition of the Moore–Penrose inverse in rings with involution and showed that two one-sided invertibility conditions imply Moore–Penrose invertibility. In this paper a ring means an associative ring with unit 1 ≠ 0.
This paper is motivated by several concepts in rings given by various authors: regularity and group invertibility of elements were given by Pedro Patricio and Roland Puystjens, and Drazin invertibility was given by J.J. Koliha. We use those concepts to investigate the existence of the Moore–Penrose inverse in rings with involution.

Keywords and Phrases :Moore Penrose Inverse, involution .

1. INTRODUCTION

During the decade 1910–1920, E. H. Moore introduced and studied the generalized inverse of an arbitrary complex matrix. The generalized inverse matrix was rediscovered by R. Penrose in 1955. In [5], R. Penrose described a generalization of the inverse of a non-singular matrix as the unique solution of a certain set of equations; see Theorem 1.1 below. This generalized inverse exists for any (possibly rectangular) matrix with complex elements and is nowadays called the Moore–Penrose inverse.

Theorem 1.1: [5] If A and X are matrices (not necessarily square) with complex elements, then the four equations
    AXA = A                      (1.1)
    XAX = X                      (1.2)
    (AX)^H = AX                  (1.3)
    (XA)^H = XA                  (1.4)
have a unique solution X for any A.
The conjugate transpose of the matrix A is written A^H. The unique solution of (1.1), (1.2), (1.3) and (1.4) is called the Moore–Penrose inverse of A and written X = A⁺. (Note that A need not be a square matrix and may even be zero.)
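As a quick numerical illustration (not part of the original paper), the four Penrose equations (1.1)–(1.4) can be checked with NumPy, whose `numpy.linalg.pinv` returns this unique solution; the matrix A below is an arbitrary example.

```python
import numpy as np

# Check the four Penrose equations (1.1)-(1.4) for a rectangular complex matrix.
A = np.array([[1.0 + 1j, 2.0, 0.0],
              [0.0, 1.0, -1j]])          # arbitrary 2x3 example
X = np.linalg.pinv(A)                    # Moore-Penrose inverse A^+

print(np.allclose(A @ X @ A, A))                 # (1.1)
print(np.allclose(X @ A @ X, X))                 # (1.2)
print(np.allclose((A @ X).conj().T, A @ X))      # (1.3)
print(np.allclose((X @ A).conj().T, X @ A))      # (1.4)
```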
In this paper we present the Moore–Penrose inverse in rings with involution. Koliha and Pedro Patricio [3] formulated the definition of the Moore–Penrose inverse in rings with involution and showed that two one-sided invertibility conditions imply Moore–Penrose invertibility.
Definition 1.2: [3] An involution in a ring R is an operation a → a* in R such that
i) (a + b)* = a* + b*,
ii) (ab)* = b*a*,
iii) (a*)* = a,
for each a, b ∈ R.

Definition 1.3: [3] An element a ∈ R is said to be Moore–Penrose invertible if the equations
    axa = a                      (1.5)
    xax = x                      (1.6)
    (ax)* = ax                   (1.7)
    (xa)* = xa                   (1.8)
have a common solution x. Such a solution is unique if it exists, and is usually denoted by a⁺. The set of all Moore–Penrose invertible elements of R will be denoted by R⁺.

From equations (1.6) and (1.7) we have


x = xax = x (ax)* = x x*a* (1.9)
and from equations (1.5) and (1.8) we have
a = axa = a(xa)* = a a* x*
a* = (a a* x*)* = (x*)* (a*)*a* = xaa* (1.10)
Hence, to get x as a solution of equations (1.5)–(1.8), it is enough to get x as a solution of equations (1.9) and (1.10). By equations (1.9) and (1.10) we obtain that
    a* = x x*a* aa*.                      (1.11)
We conclude that such an x exists if there exists d such that
    d a*a a* = a*.                        (1.12)
Further, setting x = d a*, x satisfies (1.10). Similarly x satisfies (1.9), because x x*a* = d a* x*a* = d(axa)* = d a* = x.
Next we show that such a solution is unique if it exists. Suppose x and y are Moore–Penrose inverses of a. Then
axa = a (1.13)
xax = x (1.14)
(ax)* = ax (1.15)
(xa)* = xa. (1.16)
Similarly because y is Moore Penrose inverse of a , then
aya = a (1.17)
yay = y (1.18)
(ay)* = ay (1.19)
(ya)* = ya. (1.20)
From (1.14) and (1.15) we obtain
x x*a* = x. (1.21)
From (1.13) and (1.16) we obtain
xaa* = a*. (1.22)
From (1.18) and (1.20) we obtain
a*y*y = y. (1.23)

From (1.17) and (1.19) we obtain


y* a*a = a (1.24)
a*ay = a*. (1.25)
Further, by (1.21), (1.22), (1.23) and (1.25) we get
    x = x x*a*           (by (1.21))
      = x x*a*ay         (by (1.25))
      = x (ax)*ay        (by the involution)
      = x a x a y        (by Moore–Penrose invertibility)
      = x a y            (by Moore–Penrose invertibility)
      = x a a*y*y        (by (1.23))
      = a*y*y            (by (1.22))
      = y                (by (1.23)).

The next well-known lemma asserts that two one-sided invertibility conditions imply Moore–Penrose invertibility.

Lemma 1.4: [3] Let a ∈ R. Then a ∈ R⁺ if and only if there exist x, y ∈ R such that
    axa = a = aya,   (ay)* = ay,   (xa)* = xa.
In this case a⁺ = xay.

Proof: By (1.20) we obtain a = aa*x* = a(xa)* = axa, using (xa)* = xa, and by (1.25) we have a = y*a*a = (ay)*a = aya, using (ay)* = ay. Similarly we have axa = a = aya. Further, a = aya = axaya = aa⁺a with a⁺ = xay. □

2. THE EXISTENCE OF MOORE PENROSE INVERSE

There are several concepts concerning elements of rings with involution. In this section we discuss the concepts of regular elements and the group inverse, which were given by Pedro Patricio and Roland Puystjens [4]. We also study the Drazin inverse, which was given by Koliha [2].

Definition 2.1: [4] The group inverse of a ∈ R is an element a# ∈ R such that


a a#a = a (2.1)
a#a a#= a# (2.2)
a a#= a# a. (2.3)
The set of all group invertible elements of R will be denoted by R#.
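For instance, every invertible element is group invertible with a# = a⁻¹. A quick numerical check of (2.1)–(2.3) for an arbitrarily chosen invertible matrix (an illustration only):

```python
import numpy as np

# Every invertible matrix a is group invertible with a# = a^{-1};
# check the defining equations (2.1)-(2.3).
a = np.array([[2.0, 1.0],
              [0.0, 3.0]])
a_sharp = np.linalg.inv(a)
print(np.allclose(a @ a_sharp @ a, a))              # (2.1)
print(np.allclose(a_sharp @ a @ a_sharp, a_sharp))  # (2.2)
print(np.allclose(a @ a_sharp, a_sharp @ a))        # (2.3)
```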

Further, if the group inverse exists then it is unique, and a is called group invertible. Suppose x and y are group inverses of a. Then
axa = a (2.4)
xax = x (2.5)
ax = xa (2.6)
and
aya = a (2.7)
yay = y (2.8)
ay = ya . (2.9)
By (2.4)–(2.9) we get
    x = xax = axx = ayaxx = ayxax = ayxayax = yaxayax = yayax = yayxa = yyaxa = yya = yyaya = yyaay = yayay = yay = y.

Definition 2.2: [4] An element a ∈ R is said to have a Drazin inverse if there exists b ∈ R such that
    ab = ba                      (2.10)
    b = ab²                      (2.11)
    aᵏ = aᵏ⁺¹b                   (2.12)
for some nonnegative integer k. The least nonnegative integer k for which these equations hold is the Drazin index i(a) of a. The set of all Drazin invertible elements of R will be denoted by R^D.

Now, given a ∈ R#, there exists b ∈ R such that
(1) ab = ba,
(2) b = bab = bba = b²a,
(3) a = aba, and hence aᵏ = aᵏ⁺¹b for some nonnegative integer k.
It is clear that a ∈ R^D. We conclude that R# ⊆ R^D.

Definition 2.3: [4] An element a ∈ R is regular (in the sense of von Neumann) if it has an inner inverse x, that is, if there exists x ∈ R such that axa = a. Any inner inverse of a will be denoted by . The set of all regular elements of R will be denoted by .

Definition 2.4: [4] Given a ∈ R we define the sets
    aR = { ax : x ∈ R }          (2.13)
    Ra = { xa : x ∈ R }.         (2.14)

Definition 2.5: [3] An element a ∈ R is *-cancellable if a*ax = 0 ⇒ ax = 0 and xaa* = 0 ⇒ xa = 0, for x ∈ R.

Applying the involution to Definition 2.5, we observe that a is *-cancellable if and only if a* is *-cancellable.
Proof:
(⇒) Suppose that a is *-cancellable, that is, a*ax = 0 ⇒ ax = 0 and xaa* = 0 ⇒ xa = 0. Then
    (a*)*a*x* = 0 ⇒ xaa* = 0 ⇒ xa = 0 ⇒ (xa)* = 0 ⇒ a*x* = 0,
    x*a*(a*)* = 0 ⇒ a*ax = 0 ⇒ ax = 0 ⇒ (ax)* = 0 ⇒ x*a* = 0.
(⇐) Conversely,
    a*ax = 0 ⇒ (a*ax)* = 0 ⇒ x*a*(a*)* = 0 ⇒ x*a* = 0 ⇒ (ax)* = 0 ⇒ ax = 0,
    xaa* = 0 ⇒ (xaa*)* = 0 ⇒ (a*)*a*x* = 0 ⇒ a*x* = 0 ⇒ (xa)* = 0 ⇒ xa = 0.
Hence a is *-cancellable if and only if a* is *-cancellable. □
It is often useful to observe that if a is *-cancellable then a*a and aa* are *-cancellable.
Proof: Suppose a is *-cancellable and let (a*a)*(a*a)x = 0. Then
(a*a)*(a*a)x = 0 ⇒ a*aa*ax = 0
⇒ aa*ax = 0
⇒ x*a*(a*)*a* = 0
⇒ x*a*(a*)* = 0
⇒ (a*ax)* = 0,
hence a*ax = 0. Therefore a*a is *-cancellable. In a similar way one can prove that aa* is *-cancellable. ∎
The concepts of elements in rings with involution described above motivate us to investigate the relation between those concepts and the Moore-Penrose inverse. The next theorem shows that these concepts govern the existence of the Moore-Penrose inverse. While J.J. Koliha [3] proved the next theorem using the spectral idempotent element, in this paper we use another approach: the theorem is proved only by means of group invertible, Drazin invertible and regular elements. We also prove several characterizations that were not proven by Koliha.
Theorem 2.6: [3] For a ∈ R the following conditions are equivalent:
(i) a ∈ R+
(ii) a* ∈ R+
(iii) a is *-cancellable and a*a ∈ R+
(iv) a is *-cancellable and aa* ∈ R+
(v) a is *-cancellable and a*a ∈ RD
(vi) a is *-cancellable and aa* ∈ RD
(vii) a is *-cancellable and a*a ∈ R#
(viii) a is *-cancellable and aa* ∈ R#
(ix) a is *-cancellable and a*a and aa* are regular
(x) a ∈ aa*R ∩ Ra*a
(xi) a is *-cancellable and a*aa* is regular.

Proof: First we prove the implications (i) ⇒ (iii) ⇒ (v) ⇒ (vii) ⇒ (i). (*)
(i) ⇒ (iii): We need to prove that a is *-cancellable.
Given a ∈ R+ and a*ax = 0, then ax = aa+ax
= ( a a+)* a x
= (a+)*a* a x
= 0.
Given a ∈ R+ and xaa* = 0, then xa = xaa+a
= x a(a+a ) *
= x a a*(a+) *
= 0.
Next we prove that a*a ∈ R+. The Moore-Penrose invertibility of a*a is obtained by verifying that (a*a)+ = a+(a+)*:
1) a*a(a*a)+a*a = a*a a+(a+)*a*a
= a*a a+(aa+)*a
= a*a a+(aa+)a
= a*a a+a
= a*a
2) (a*a)+a*a(a*a)+ = a+(a+)*a*a a+(a+)*
= a+(aa+)*aa+(a+)*
= a+(aa+)aa+(a+)*
= a+aa+(a+)*
= a+(a+)*
= (a*a)+
3) ((a*a)+a*a)* = (a+(a+)*a*a)*
= a*a a+(a+)*
= a*(aa+)*(a+)*
= a*(a+)*a*(a+)*
= a*(a+)*(a+a)*
= a*(a+)*a+a
= (a+a)*a+a
= a+aa+a
= a+(aa+)*a
= a+(a+)*a*a
= (a*a)+a*a
4) (a*a(a*a)+)* = (a*a a+(a+)*)*
= a+(a+)*a*a
= a+(aa+)*a
= a+aa+a
= (a+a)*(a+a)*
= a*(a+)*a*(a+)*
= a*(aa+)*(a+)*
= a*a a+(a+)*
= a*a(a*a)+
(iii) ⇒ (v):
Given a*a ∈ R+ with (a*a)+ = a+(a+)*. We will prove that a*a ∈ RD.
1) (a*a)+a*a = a+(a+)*a*a
= a+(aa+)*a
= a+(aa+)a
= a+a
a*a(a*a)+ = a*a a+(a+)*
= a*(aa+)*(a+)*
= a*(a+)*a*(a+)*
= (a+aa+a)*
= (a+a)*
= a+a
Hence (a*a)+a*a = a*a(a*a)+.
2) Because of 1) and a*a ∈ R+, we have a*a(a*a)+(a*a)+ = (a*a)+a*a(a*a)+ = (a*a)+.
3) Since a*a ∈ R+ we have a*a(a*a)+a*a = a*a. By using 1), (a*a)^k = (a*a)^{k+1}(a*a)+ holds for k = 1.
Hence a*a ∈ RD is proved and (a*a)D = (a*a)+.
(v) ⇒ (vii):
Since a*a ∈ RD, we have (a*a)D a*a = a*a(a*a)D.
Because (a*a)D a*a = a*a(a*a)D, we get a*a(a*a)D a*a = (a*a)^2 (a*a)D = a*a (by applying (a*a)^k = (a*a)^{k+1}(a*a)D).
Similarly, because (a*a)D a*a = a*a(a*a)D and a*a((a*a)D)^2 = (a*a)D, we get (a*a)D a*a(a*a)D = (a*a)D.
Hence a*a ∈ R# is proved and (a*a)# = (a*a)D.
(vii) ⇒ (i):
Given the group invertibility of a*a with (a*a)# = (a*a)+ = a+(a+)*. We need to prove that a ∈ R+, where a+ = (a*a)#a* = a*(aa*)#.
1) a+aa+ = (a*a)#a*a(a*a)#a*
= (a*a)+a*a(a*a)+a*
= (a*a)+a* , since a*a is group invertible
= (a*a)#a*
= a+
2) (aa+)* = (a(a*a)#a*)*
= (a(a*a)+a*)*
= (aa+(a+)*a*)*
= aa+(a+)*a*
= a(a*a)+a*
= a(a*a)#a*
= aa+
3) (a+a)* = (a*(aa*)#a)*
= (a*(aa*)+a)*
= (a*(a+)*a+a)*
= a*(a+)*a+a
= a*(aa*)+a
= a*(aa*)#a
= a+a.
We have observed that a ∈ R+ if and only if a is *-cancellable and a*a is group invertible.
Note that a ∈ R+ if and only if a* ∈ R+. Indeed, we observed that a ∈ R+ if and only if a is *-cancellable and a*a is group invertible. Further, a*a and aa* are symmetric, and a is *-cancellable if and only if a* is *-cancellable, so that
a ∈ R+ ⟺ a* is *-cancellable and (a*a)* and (aa*)* are group invertible ⟺ a* ∈ R+.
Since a ∈ R+ if and only if a* ∈ R+, (*) gives immediately (ii) ⇒ (iv) ⇒ (vi) ⇒ (viii) ⇒ (ii), and the equivalence of (i)–(viii) is established.
(viii) ⇒ (ix):
By using aa* ∈ R# and a*a ∈ R# (the latter from the equivalence of (vii) and (viii)), we get that a*a and aa* are regular.
(ix) ⇒ (x):
If aa* is regular, then there exists x ∈ R such that aa*xaa* = aa*
⇒ there exists x ∈ R such that aa*xaa* − aa* = 0
⇒ there exists x ∈ R such that (aa*x − 1)aa* = 0
⇒ there exists x ∈ R such that (aa*x − 1)a = 0, by the *-cancellability of a
⇒ aa*xa = a
⇒ a ∈ aa*R , by applying Definition 2.4.
If a*a is regular, then there exists y ∈ R such that a*aya*a = a*a
⇒ there exists y ∈ R such that a*aya*a − a*a = 0
⇒ there exists y ∈ R such that a*a(ya*a − 1) = 0
⇒ there exists y ∈ R such that a(ya*a − 1) = 0, by the *-cancellability of a
⇒ aya*a = a
⇒ a ∈ Ra*a , by applying Definition 2.4.
Hence a ∈ aa*R ∩ Ra*a.
(x) ⇒ (i):
We have a ∈ aa*R ∩ Ra*a, so a = aa*u = va*a with u, v ∈ R. Then
(u*a)* = a*u = (aa*u)*u = u*aa*u = u*a,
(av*)* = va* = v(va*a)* = va*av* = av*.
Next, au*a = aa*u = a and av*a = va*a = a. By Lemma 1.4 this proves that a ∈ R+ with x = u* and y = v*.
(i) ⇒ (xi):
Given a ∈ R+. We observe that there exists x ∈ R such that a*aa*xa*aa* = a*aa*.
Let x = (a*aa*)+ = (a+)*a+(a+)*. Then
a*aa*xa*aa* = a*aa*(a+)*a+(a+)*a*aa*
= a*a(a+a)*a+(aa+)*aa*
= a*aa+aa+aa+aa*
= a*aa*.
(xi) ⇒ (x):
Because a*aa* is regular, there exists x ∈ R such that a*aa*xa*aa* = a*aa*. Then, by using the *-cancellability of a twice, we get
a*aa*xa*aa* − a*aa* = 0
⇒ a*a(a*xa*aa* − a*) = 0
⇒ a(a*xa*aa* − a*) = 0
⇒ aa*xa*aa* − aa* = 0
⇒ (aa*xa* − 1)aa* = 0
⇒ (aa*xa* − 1)a = 0
⇒ aa*xa*a = a
⇒ a ∈ aa*R and a ∈ Ra*a
⇒ a ∈ aa*R ∩ Ra*a. ∎
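As a numerical illustration of the formula a+ = (a*a)#a* appearing in the proof of (vii) ⇒ (i): for a real matrix a of full column rank, a*a is invertible, so its group inverse is its ordinary inverse, and the formula can be checked, for instance, with NumPy (a minimal sketch, assuming real matrices with transposition as the involution):

    import numpy as np

    # Full-column-rank matrix: a*a is invertible, so its group inverse is its ordinary inverse.
    a = np.array([[1.0, 0.0], [1.0, 1.0], [0.0, 2.0]])
    ata = a.T @ a                      # a*a (real case: * is the transpose)
    a_plus = np.linalg.inv(ata) @ a.T  # (a*a)# a*

    assert np.allclose(a_plus, np.linalg.pinv(a))  # agrees with the Moore-Penrose inverse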

References

[1] A. BEN-ISRAEL, The Moore of the Moore–Penrose Inverse, The Electronic Journal of Linear Algebra, 2002.
[2] J.J. KOLIHA, A Generalized Drazin Inverse, Glasgow Math. J., 1996.
[3] J.J. KOLIHA AND PEDRO PATRICIO, Elements of Rings with Equal Spectral Idempotents, J. Australian Math. Soc., 2001.
[4] PEDRO PATRICIO AND ROLAND PUYSTJENS, Drazin–Moore–Penrose Invertibility in Rings, 2004.
[5] R. PENROSE (communicated by J.A. Todd), A Generalized Inverse for Matrices, St. John's College, Cambridge, 1954.

TITI UDJIANI SRRM


Undip
e-mail: [email protected]

BUDI SURODJO
UGM
e-mail: [email protected]

SRI WAHYUNI
UGM
e-mail: [email protected]
Proceedings of ”The 6th SEAMS-UGM Conference 2011”
Analysis, pp. 259–266.

AN APPLICATION OF ZERO INDEX TO SEQUENCES OF BAIRE-1 FUNCTIONS

Atok Zulijanto

Abstract. In this paper, we prove a result that yields an upper bound of the oscillation
index of a function that is a limit of a sequence of Baire-1 functions using the zero index
of related gauges.

Keywords and Phrases: Oscillation index, convergence index, Baire-1 functions, gauges.

1. PRELIMINARIES
Let X be a metrizable space. A real-valued function f is said to be of Baire class
one or simply Baire-1, if there exists a sequence (fn ) of real-valued continuous functions
that converges pointwise to f . The Baire Characterization Theorem [1] states that if
X is a Polish space (i.e., separable completely metrizable), then f : X → R is a Baire-1 function if
and only if for all nonempty closed subset F of X, f |F has a point of continuity. This
leads naturally to an ordinal index for Baire-1 functions, called the oscillation index
(see, [2] and [3]). In [3], Kechris and Louveau also introduced another ordinal index for
Baire-1 functions, called the convergence index. The study of Baire-1 functions in terms
of ordinal indices was continued by several authors (see, e.g.,[4], [6] and [7]).
Let X be a metrizable space and C denote the collection of all closed subsets of X.
Now, let ε > 0 and a function f : X → R be given. For any H ∈ C, let D0 (f, ε, H) = H
and D1 (f, ε, H) be the set of all x ∈ H such that for every open set U containing x,
there are two points x1 and x2 in U ∩ H with |f (x1 ) − f (x2 )| ≥ ε. For all α < ω1 (the
first uncountable ordinal number), set

Dα+1 (f, ε, H) = D1 (f, ε, Dα (f, ε, H)).

2010 Mathematics Subject Classification: 26A21, 03E15, 04A15, 54C30


If α is a countable limit ordinal, let
Dα (f, ε, H) = ⋂α′<α Dα′ (f, ε, H).

The ε-oscillation index of f on H is defined by
βH (f, ε) = the smallest ordinal α < ω1 such that Dα (f, ε, H) = ∅ if such an α exists, and βH (f, ε) = ω1 otherwise.

The oscillation index of f on the set H is defined by
βH (f ) = sup{βH (f, ε) : ε > 0}.
We shall write β(f, ε) and β(f ) for βX (f, ε) and βX (f ) respectively.
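For instance, let f = χ{0} be the characteristic function of {0} on X = R. For 0 < ε ≤ 1 one has D1 (f, ε, R) = {0}, since every open set containing 0 contains points x1 = 0 and x2 ≠ 0 with |f (x1 ) − f (x2 )| = 1 ≥ ε, while D2 (f, ε, R) = D1 (f, ε, {0}) = ∅; for ε > 1, D1 (f, ε, R) = ∅. Hence β(f, ε) ≤ 2 for every ε > 0 and β(f ) = 2.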
It follows from the Baire Characterization Theorem that a real-valued function
on a Polish space is Baire-1 if and only if its oscillation index is countable (see, e.g.,
[3]). Lee, Tang and Zhao [5] gave an equivalent definition of Baire-1 functions in terms
of gauges analogous to the definition of continuity of a function.
Theorem 1.1. ([5]) Suppose that f is a real-valued function on a complete separable
metric space (X, d). Then the following statements are equivalent.
(1) For any ε > 0, there exists a positive function δ on X such that
|f (x) − f (y)| < ε
whenever d(x, y) < min{δ(x), δ(y)}.
(2) The function f is of Baire class one.

We call the positive function δ in Theorem 1.1 an ε-gauge of f .
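For the function f = χ{0} on R considered above, an ε-gauge (for any ε > 0) is given, for instance, by δ(x) = |x| for x ≠ 0 and δ(0) = 1: if d(x, y) < min{δ(x), δ(y)} then x and y are either both 0 or both nonzero, so |f (x) − f (y)| = 0 < ε. Its zero index (defined below) is o(δ) = 2 = β(f ), in accordance with Theorems 1.2 and 1.3.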


Recently, Leung, Tang, and Zulijanto [8] developed a method to compute the
oscillation index of a Baire-1 function using gauge approach. The tool used in [8] for
the gauge approach is called zero index. The zero index measures the accumulation of
the values of a gauge towards zero. The advantage of the method is that it provides an
easy-to-use calculus for the oscillation index. The gauge approach, using the zero index,
allows us to obtain the estimation of the oscillation index of fairly general combinations
of two Baire-1 functions [8]. It was proved in [6] that for any two real functions f and g
on a compact metric space K, if β(f ) ≤ ω ξ1 and β(g) ≤ ω ξ2 on K, then β(f +g) ≤ ω ξ1 ∨ξ2
and β(f g) ≤ ω ξ , where ξ = max{ξ1 + ξ2 , ξ2 + ξ1 }. In [8], using gauge approach, Leung,
Tang and Zulijanto obtain the generalized versions of the results on a Polish space and
improve the result on the product of two Baire-1 functions.
Let X be a Polish space and Bξ1 (X) denote the set of all Baire-1 functions f on X
with β(f ) ≤ ω ξ . In this paper, we give another application of zero index of gauges. We
prove a result yielding an upper bound of the oscillation index of a function that is a
limit of a sequence of Baire-1 functions. More precisely, if (fn ) is a sequence in Bξ1 (X)
that converges to f , then β(f ) is bounded above by the product of convergence index
of (fn ) and ω ξ . We prove the result by computing the zero index of an appropriate

gauge of f . We will recall the definition of convergence index of a sequence in the next
section.
Before we present our results, we recall the definition of zero index of gauges (see,
[8]). Let X be a Polish space and π be a positive function on X. For any closed subset
H of X, let Z 0 (π, H) = H and Z 1 (π, H) be the set of all x ∈ H such that for any
neighborhood U of x it holds that inf{π(y) : y ∈ U ∩ H} = 0. For all α < ω1 , define
Z α+1 (π, H) = Z 1 (π, Z α (π, H)). If α is a limit ordinal, let
Zα (π, H) = ⋂α′<α Zα′ (π, H).

The zero index of π is
oH (π) = the smallest ordinal α < ω1 such that Zα (π, H) = ∅ if such an α exists, and oH (π) = ω1 otherwise.

We shall write o(π) for oX (π). It is easy to prove that for any closed subset H of X,
Z 1 (π, H) is closed.
The following theorems can be found in [8].
Theorem 1.2. (see [8], Proposition 3) Let ε > 0 and a Baire-1 function f : X → R be
given. If δ : X → R+ is an ε-gauge of f , then β(f, ε) ≤ o(δ).
Theorem 1.3. (see [8], Theorem 4) Let f : X → R be a Baire-1 function. Then for
any ε > 0 there exists an ε-gauge δ of f such that o(δ) = β(f, ε).

In order to compute the oscillation index of Baire-1 functions using zero index of
the appropriate gauges, we need the following computational tool that can be found in
[8].
Theorem 1.4. (see [8], Theorem 6) If π1 , π2 : X → R+ are positive functions with
o(π1 ) ≤ ω ξ and o(π2 ) ≤ ω ξ for some ξ < ω1 , then o(π1 ∧ π2 ) ≤ ω ξ .

2. RESULTS
Throughout this section, let X be a Polish space with a compatible metric d.
Before we proceed to our main result, we recall the definition of convergence index of
a sequence. Let (fn ) be a sequence of real-valued functions on X and H be a closed
subset of X. For any ε > 0, let D0 ((fn ), ε, H) = H and D1 ((fn ), ε, H) be the set of
those x ∈ H such that for every neighborhood U of x and any N ∈ N, there are n and
m in N with n > m > N and x0 ∈ U ∩ H such that |fn (x0 ) − fm (x0 )| ≥ ε. For all
countable ordinals α, let
Dα+1 ((fn ), ε, H) = D1 ((fn ), ε, Dα ((fn ), ε, H)).

If α is a countable limit ordinal, set
Dα ((fn ), ε, H) = ⋂α′<α Dα′ ((fn ), ε, H).

Define the ε-convergence index γH ((fn ), ε) of (fn ) on H by
γH ((fn ), ε) = the smallest ordinal α < ω1 such that Dα ((fn ), ε, H) = ∅ if such an α exists, and γH ((fn ), ε) = ω1 otherwise.

Finally, the convergence index of (fn ) on H is
γH ((fn )) = sup{γH ((fn ), ε) : ε > 0}.
We shall write γ((fn ), ε) and γ((fn )) for γX ((fn ), ε) and γX ((fn )) respectively.
The following lemma will be useful to prove the result on an upper bound of the
oscillation index of a function that is a limit of a sequence of Baire-1 functions.
Lemma 2.1. Let φ, ψ : X → R+ , H ⊆ X be a closed set and U be an open set in X.
If φ ≥ ψ on U ∩ H, then
U ∩ Z α (φ, H) ⊆ Z α (ψ, H)
for all α < ω1 .

Proof. The proof is by induction on α. The statement is clear for α = 0 or a limit


ordinal. Suppose that the statement is true for some α < ω1 . Let x ∈ U ∩ Z α+1 (φ, H)
and V be any open neighborhood of x. By the inductive hypothesis, V ∩U ∩Z α (φ, H) ⊆
V ∩ Z α (ψ, H). Therefore
inf{ψ(y) : y ∈ V ∩ Z α (ψ, H)} ≤ inf{ψ(y) : y ∈ V ∩ U ∩ Z α (φ, H)}
≤ inf{φ(y) : y ∈ V ∩ U ∩ Z α (φ, H)}
= 0.
Hence x ∈ Z α+1 (ψ, H). 

Now, we are ready to prove our main result.


Theorem 2.1. Let (fn ) be a sequence in Bξ1 (X) converging pointwise to f with γ((fn )) ≤
γ0 . Then f is a Baire-1 function with β(f ) ≤ ω ξ · γ0 .

Proof. Let ε > 0 be given. By Theorem 1.3, there exists a sequence (δn )n≥1 of positive functions on X such that for each n ∈ N, δn is an ε/3-gauge of fn and o(δn ) = β(fn , ε/3) ≤ ω ξ . By replacing δn+1 with δ′n+1 = δn+1 ∧ δn for each n ∈ N if necessary, we can assume that δn+1 ≤ δn for all n ∈ N.
For all x ∈ Dα ((fn ), ε/3, X) \ Dα+1 ((fn ), ε/3, X), there exist rα (x) > 0 and N (α, x) ∈ N such that whenever n > m ≥ N (α, x) we have
|fn (x0 ) − fm (x0 )| < ε/3
for all x0 ∈ B(x, rα (x)) ∩ Dα ((fn ), ε/3, X). Taking the limit as n → ∞, we have
|f (x0 ) − fm (x0 )| ≤ ε/3   (1)
for all m ≥ N (α, x) and all x0 ∈ B(x, rα (x)) ∩ Dα ((fn ), ε/3, X).
Since {B(x, rα (x)/2) : x ∈ Dα ((fn ), ε/3, X) \ Dα+1 ((fn ), ε/3, X)} is an open cover of the separable (thus Lindelöf) space Dα ((fn ), ε/3, X) \ Dα+1 ((fn ), ε/3, X), there exists (xi )∞i=1 ⊆ Dα ((fn ), ε/3, X) \ Dα+1 ((fn ), ε/3, X) such that
Dα ((fn ), ε/3, X) \ Dα+1 ((fn ), ε/3, X) ⊆ ⋃∞i=1 B(xi , rα (xi )/2).
For all x ∈ Dα ((fn ), ε/3, X) \ Dα+1 ((fn ), ε/3, X), let j(x) = min{j : x ∈ B(xj , rα (xj )/2)}.
For all m, j ∈ N, denote
rαm,j = (1/m) ∧ min1≤i≤j (rα (xi )/2).
Then (rαm,j )m and (rαm,j )j are non-increasing sequences. For all α < γ0 and m ∈ N, let Uαm be the (1/m)-neighborhood of Dα ((fn ), ε/3, X), set mx = min{m : x ∈ Dα ((fn ), ε/3, X) \ Uα+1m } and Nα,j = max1≤i≤j N (α, xi ).
Define δ : X → R+ by
δ(x) = rαmx ,j(x) ∧ δNα,j(x) (x)
whenever x ∈ Dα ((fn ), ε/3, X) \ Dα+1 ((fn ), ε/3, X). Let x, y ∈ X with d(x, y) < min{δ(x), δ(y)}. Suppose that x ∈ Dα ((fn ), ε/3, X) \ Dα+1 ((fn ), ε/3, X). Since y ∈ B(x, δ(x)) and δ(x) ≤ rαmx ,j(x) ≤ 1/mx , we have y ∉ Dα+1 ((fn ), ε/3, X). Therefore y ∈ Dβ ((fn ), ε/3, X) \ Dβ+1 ((fn ), ε/3, X) for some β ≤ α. By symmetry we have β = α.
If j(x) ≤ j(y), then Nα,j(x) ≤ Nα,j(y) . It follows that δNα,j(x) ≥ δNα,j(y) . Therefore
d(x, y) < min{δNα,j(x) (x), δNα,j(y) (y)} ≤ min{δNα,j(x) (x), δNα,j(x) (y)},
which implies that |fNα,j(x) (x) − fNα,j(x) (y)| < ε/3. Also, since x ∈ B(xj(x) , rα (xj(x) )/2) and y ∈ B(x, δ(x)), we see that
d(y, xj(x) ) ≤ d(y, x) + d(x, xj(x) ) < δ(x) + rα (xj(x) )/2 ≤ rα (xj(x) )/2 + rα (xj(x) )/2 = rα (xj(x) ).
Thus, both x and y belong to B(xj(x) , rα (xj(x) )) ∩ Dα ((fn ), ε/3, X) and Nα,j(x) ≥ N (α, xj(x) ). It follows from (1) that |fNα,j(x) (z) − f (z)| ≤ ε/3 for z = x or z = y. Therefore we have
|f (x) − f (y)| ≤ |f (x) − fNα,j(x) (x)| + |fNα,j(x) (x) − fNα,j(x) (y)| + |fNα,j(x) (y) − f (y)| < ε/3 + ε/3 + ε/3 = ε.
Similarly, if we assume that j(x) ≥ j(y), we will also have |f (x) − f (y)| < ε.

We have proved that δ is an ε-gauge of f . It remains to show that o(δ) ≤ ω ξ · γ0 .


To this end, we claim that for all α < ω1 ,
Zωξ·α (δ, X) ⊆ Dα ((fn ), ε/3, X).
We prove the claim by induction on α. The claim is trivial if α = 0 or a limit ordinal. Suppose that the assertion is true for some α < ω1 . Let x ∈ Zωξ·(α+1) (δ, X). Suppose that x ∉ Dα+1 ((fn ), ε/3, X). By the inductive hypothesis, x ∈ Dα ((fn ), ε/3, X) \ Dα+1 ((fn ), ε/3, X). Choose m ∈ N such that 1/m < (1/2) d(x, Dα+1 ((fn ), ε/3, X)). Then
B(x, 1/m) ∩ Zωξ·α (δ, X) ⊆ Dα ((fn ), ε/3, X) \ Dα+1 ((fn ), ε/3, X).
For all y ∈ B(x, 1/m) ∩ B(xj(x) , rα (xj(x) )/2) ∩ Zωξ·α (δ, X), we have
δ(y) = rαmy ,j(y) ∧ δNα,j(y) (y).
Since y ∈ B(xj(x) , rα (xj(x) )/2), j(y) ≤ j(x). It follows that Nα,j(y) ≤ Nα,j(x) , which implies that δNα,j(y) (y) ≥ δNα,j(x) (y) by the monotonicity of (δn ). Also, y ∈ Dα ((fn ), ε/3, X) \ Uα+1m implies that my ≤ m. Therefore,
rαmy ,j(y) ≥ rαm,j(y) ≥ rαm,j(x) .
Thus, δ(·) ≥ rαm,j(x) ∧ δNα,j(x) (·) on B(x, 1/m) ∩ B(xj(x) , rα (xj(x) )/2) ∩ Zωξ·α (δ, X). Let U = B(x, 1/m) ∩ B(xj(x) , rα (xj(x) )/2); then by Lemma 2.1, for all λ < ω1 we have
U ∩ Zλ (δ, Zωξ·α (δ, X)) ⊆ Zλ (δNα,j(x) , Zωξ·α (δ, X)) ⊆ Zλ (δNα,j(x) , X).
Since Zωξ (δNα,j(x) , X) = ∅, applying λ = ωξ to the above, we have
x ∈ U ∩ Zωξ·(α+1) (δ, X) ⊆ Zωξ (δNα,j(x) , X) = ∅.
This is a contradiction. Thus x ∈ Dα+1 ((fn ), ε/3, X).
It follows from the Claim that o(δ) ≤ ω ξ · γ0 . Since δ is an ε-gauge of f , by
Theorem 1.2, β(f, ε) ≤ o(δ) ≤ ω ξ · γ0 . Since this is true for any ε > 0, we conclude that
β(f ) ≤ ω ξ · γ0 . 
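In particular, if (fn ) converges uniformly on X, then (fn ) is uniformly Cauchy, so D1 ((fn ), ε, X) = ∅ for every ε > 0 and γ((fn )) = 1; Theorem 2.1 then gives β(f ) ≤ ω ξ · 1 = ω ξ , i.e. the uniform limit of a sequence in Bξ1 (X) again belongs to Bξ1 (X).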

Acknowledgement. I would like to thank Dr Tang Wee Kee for suggesting this
problem to me.

References
[1] Baire, R., Sur les Fonctions de Variables Réelles, Ann. Mat. Pura ed Appl. 3, 1-122, 1899.
[2] Haydon, R., Odell, E. and Rosenthal, H. P., On Certain Classes of Baire-1 Functions with
Applications to Banach Space Theory, in : Functional Analysis, Lecture Notes in Math., 1470,
1-35, Springer, New York, 1991.
[3] Kechris, A. S. and Louveau, A., A Classification of Baire Class 1 Functions, Trans. Amer.
Math. Soc. 318, 209-236, 1990.
[4] Kiriakouli, P., A Classification of Baire-1 Functions, Trans. Amer. Math. Soc. 351, 4599-4609,
1999.

[5] Lee, P.-Y., Tang, W.-K, and Zhao, D., An Equivalent Definition of Functions of The First Baire
Class, Proc. Amer. Math. Soc. 129, 2273-2275, 2000.
[6] Leung, D. H. and Tang, W.-K., Functions of Baire Class One, Fund. Math. 179, 225-247, 2003.
[7] Leung, D. H. and Tang, W.-K., Extension of Functions with Small Oscillation, Fund. Math.
192, 183-193, 2006.
[8] Leung, D. H., Tang, W.-K. and Zulijanto, A., A gauge Approach to An Ordinal Index of Baire
One Functions, Fund. Math. 210, 99-109, 2010.

Atok Zulijanto
Department of Mathematics Faculty of Mathematics and Natural Sciences
Gadjah Mada University, Indonesia
e-mail: [email protected]
Proceedings of ”The 6th SEAMS-UGM Conference 2011”
Analysis, pp. 267–274.

REGULATED FUNCTIONS IN THE n-DIMENSIONAL SPACE

Ch. Rini Indrati

Abstract. In this paper we discuss some important characterizations of regulated functions in the n-dimensional space. The characterization will be used to prove that a regulated function is Henstock-Stieltjes integrable on a cell E and to obtain a convergence theorem for the Henstock-Stieltjes integral in the n-dimensional space.

Keywords and Phrases: n-dimensional, regulated function, Henstock-Kurzweil integral, convergence theorem.

1. INTRODUCTION

Based on the definition of regulated function that is introduced by Tvrdŷ [5], the
set of all regulated functions has been characterized as the closure of the union of C[a, b]
and BV [a, b], where C[a, b] and BV [a, b] stand for the collection of all continuous func-
tions on [a, b] ⊆ R and the collection of all bounded variation functions on [a, b] ⊆ R,
respectively [2]. Based on the result of the characterization of the regulated function,
some theorems in the Henstock-Stieltjes integral on [a, b] ⊆ R could be improved by
using regulated function. In [3], Indrati used the generalized of the concept of regu-
lated function in the n-dimensional space and used it to have the multiplication of two
Henstock integrable functions still Henstock integrable on a cell E ⊆ Rn . In this paper
we characterize the set of regulated function on a cell E ⊆ Rn . We characterize the
regulated function in the Henstock-Stieltjes integral to have a generalized result in [2].
Furthermore, we build a convergence theorem for the Henstock-Stieltjes integral on a
cell E ⊆ Rn by using regulated function.
Some concepts that will be used in the generalization are stated below [3].
In this discussion, a cell E stands for a non degenerate closed and bounded interval
in the Euclidean space Rn . Its volume will be represented by |E|.

2010 Mathematics Subject Classification: 26 A 39


For x = (x1 , x2 , . . . , xn ) ∈ E, ‖x‖ = max{|xk | : 1 ≤ k ≤ n}. If E is a cell and δ is a positive function on E, an open ball centered at x ∈ E with radius δ(x), in short written B(x, δ(x)), is defined as follows:
B(x, δ(x)) = {y : ‖x − y‖ < δ(x)}.
A collection of cells {Ii : i = 1, 2, . . . , p} is called non-overlapping if Iio ∩ Ijo = ∅ for
i 6= j. A collection of finite non-overlapping cells, D = {I} with ∪I∈D I = E, is
called a partition of E. A collection D = {(I, x)} = {(I1 , x1 ), (I2 , x2 ), . . . , (Ip , xp )} is
called a δ-fine partition of E if E = ∪I∈D I, xi ∈ Ii ⊆ B(xi , δ(xi )), and Iio ∩ Ijo = ∅,
i 6= j, i = 1, 2, . . . , p. Furthermore, (I, x) ∈ D is called a δ-fine interval with associated
point x. A collection D = {(I, x)} = {(I1 , x1 ), (I2 , x2 ), . . . , (Ip , xp )} is called a δ-fine
partial partition in E if ∪I∈D I ⊆ E, xi ∈ Ii ⊆ B(xi , δ(xi )), and Iio ∩ Ijo = ∅, i 6= j,
i = 1, 2, . . . , p.

2. REGULATED FUNCTION IN THE n-DIMENSIONAL SPACE


Definition 2.1. [1] A bounded real valued function ϕ is called a step function on a cell
E ⊆ Rn , if it assumes only a finite number of distinct values in E.
Lemma 2.1. [1] A bounded real valued function ϕ is a step function on a cell E if and
only if there exist a partition D = {I} = {I1 , I2 , . . . , Ip } and ci ∈ R, i = 1, 2, 3, . . . , p,
such that
ϕ(x) = ci , x ∈ Iio .
Definition 2.2. [3] Let E ⊆ Rn be a cell. A function f : E → R is called a regulated function on E, if for every ε > 0, there exists a step function ϕ : E → R such that for every x ∈ E we have
|f (x) − ϕ(x)| < ε.

It is clear that every step function on a cell E is a regulated function on E. Since


every continuous function on a cell E can be uniformly approximated by a step function
on E, then we have fact in Theorem 2.1.
Theorem 2.1. If f is a continuous function on a cell E ⊆ Rn , then f is a regulated
function on E.

Proof. See [3]. 

We start the result of the research in this paper by giving some characterization
below. The characterization of the regulated function has been done based on the
Definition 2.2.
From the definition of regulated function, we have a characteristic of regulated
function in a sequence of step functions in Theorem 2.2.
Theorem 2.2. A function f is regulated on a cell E ⊆ Rn if and only if there exists a
sequence of step functions {ϕk } that converges uniformly to f on E.
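For example, on the cell E = [0, 1] ⊆ R, the function defined by f (0) = 0 and f (x) = 1/k for x ∈ (1/(k + 1), 1/k], k ∈ N, is regulated: given ε > 0, choose K ∈ N with 1/K < ε and let ϕ = f on (1/K, 1] and ϕ = 0 on [0, 1/K]; then ϕ is a step function and |f (x) − ϕ(x)| ≤ 1/K < ε for every x ∈ E. Thus f is regulated although it is neither continuous nor a step function.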

Lemma 2.2. If ϕ and φ are step functions on a cell E, then ϕ + φ and αϕ are step
functions.

As corollary we have Theorem 2.3.


Theorem 2.3. If f1 and f2 are regulated functions on a cell E ⊆ Rn , then f1 + f2 and
αf1 are regulated on E for every α ∈ R.

Let RF (E) stand for the collection of all regulated functions on a cell E ⊆ Rn . From the definition of a regulated function on E, we have that RF (E) is the closure of the set of all step functions on E. Furthermore, by Lemma 2.2, RF (E) is the closed convex hull of the set of all linear combinations of step functions on E. Therefore, RF (E) is closed.
Theorem 2.3 implies that RF (E) is a linear space over R. From Theorem 2.1, it
is clear that C(E) is a subspace of RF (E), where C(E) is a collection of all continuous
functions on a cell E.
The next characteristic of a regulated function will be used to prove the integra-
bility of a regulated function in the sense of Henstock-Stieltjes.
Theorem 2.4. A function f is regulated on a cell E ⊆ Rn if and only if for every ε > 0, there exists a partition D = {I} = {I1 , I2 , . . . , Ip } of E such that for every x, y ∈ Iio , 1 ≤ i ≤ p, we have
|f (x) − f (y)| < ε.

Proof. (⇒) Let ε > 0 be given. There exists a step function ϕ on E such that for every x ∈ E we have
|ϕ(x) − f (x)| < ε/2.
Let D = {I} = {I1 , I2 , . . . , Ip } be the partition of E due to the step function ϕ on E. Therefore, for any x, y ∈ Iio , we have (since ϕ(x) = ϕ(y))
|f (x) − f (y)| ≤ |f (x) − ϕ(x)| + |ϕ(y) − f (y)| < ε.
(⇐) Let ε > 0 be given. From the hypothesis, there is a partition D = {I} = {Ii = [ai , bi ], i = 1, 2, 3, . . . , p} of E such that for every x, y ∈ Iio , 1 ≤ i ≤ p, we have
|f (x) − f (y)| < ε.
We define a function ϕ on E by ϕ(x) = f (ai ) for x ∈ Iio and ϕ(x) = min{f (x) : x ∈ ∂(Ii )} otherwise. Then ϕ is a step function on E and for every x ∈ E,
|ϕ(x) − f (x)| < ε. ∎

3. REGULATED FUNCTION IN THE HENSTOCK-STIELTJES INTEGRAL
In this section we show that a regulated function is Henstock-Stieltjes integrable on a cell E. Moreover, a convergence theorem will be stated in terms of regulated functions.
Definition 3.1. [4] Let E ⊆ Rn be a cell and I(E) be a collection of all intervals
subset of E. A function g : I(E) → R is called additive on E, if for every I, J ∈ I(E),
I o ∩ J o = ∅, and I ∪ J ∈ I(E), we have
g(I ∪ J) = g(I) + g(J).
Definition 3.2. Let E ⊆ Rn be a cell, I(E) be a collection of all intervals subset of E and g : I(E) → R be an additive function on E. The variation of g on E, written Vg (E) or V (g, E), is defined as
Vg (E) = sup ΣD∈D |g(D)|,
where the supremum is taken over all partitions D = {D} of E. We say that g has bounded variation on E if Vg (E) < ∞, i.e., there is a constant M ≥ 0 such that for every partition D = {I} of E, we have
(D) Σ |g(I)| ≤ M.
Throughout this section, we assume that the function g is additive on a cell E.
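A simple example of such a g: for n = 1, E = [a, b] and an increasing function h : [a, b] → R, the interval function g([c, d]) = h(d) − h(c) is additive and has bounded variation on E with Vg (E) = h(b) − h(a).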
3.1. The Henstock-Stieltjes Integrability of the Regulated Functions. In the Riemann-Stieltjes integral on the real line, there is no guarantee of the integrability of a function with respect to another function when they have a common point of discontinuity. The Henstock-Stieltjes integral on the real line gives a guarantee in that case [2]. In this paper we give a generalized result in Theorem 3.4.
Definition 3.3. A function f is said to be Henstock-Stieltjes integrable with respect to a function g on a cell E ⊆ Rn , if there exists a real number A, such that for every ε > 0, there is a positive function δ on E, such that for every δ-fine partition D = {(I, x)} = {(I1 , x1 ), (I2 , x2 ), . . . , (Ip , xp )} of E, we have
|(D) Σ f (x)g(I) − A| < ε,
where (D) Σ f (x)g(I) = Σ_{i=1}^{p} f (xi )g(Ii ).
The real number A in Definition 3.3 is unique and will be called the Henstock-Stieltjes integral value of f with respect to g on E, written
A = (HS) ∫E f dg.
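For instance, when g is the volume function, g(I) = |I|, the sums (D) Σ f (x)g(I) are the usual Henstock-Kurzweil sums, so the Henstock-Stieltjes integral with respect to this g coincides with the Henstock-Kurzweil integral of f on E.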

Here, HS(E; g) represents a collection of all Henstock-Stieltjes integrable func-


tions with respect to a function g on a cell E ⊆ Rn . Some basic properties of the
Henstock-Stieltjes integral in one-dimensional space could be generalized in the n-
dimensional space.

(i) HS(E; g) is a linear space over R.
(ii) If f ∈ HS(E; g) and f ∈ HS(E; h), then f ∈ HS(E; g + h). Moreover, we have
(HS) ∫E f d(g + h) = (HS) ∫E f dg + (HS) ∫E f dh.
(iii) Let E1 and E2 be non-overlapping cells in Rn with E1 ∪ E2 a cell E. If f ∈ HS(E1 ; g) and f ∈ HS(E2 ; g), then f ∈ HS(E; g). Moreover, we have
(HS) ∫E f dg = (HS) ∫E1 f dg + (HS) ∫E2 f dg.

Sometimes it is easier to use Cauchy's Criterion to check the integrability of a function than to find the integral value of the function. Cauchy's Criterion still holds in the n-dimensional space, as stated in Theorem 3.1.
Theorem 3.1. Cauchy’s Criterion
A function f is Henstock-Stieltjes integrable with respect to g on a cell E ⊆ Rn if and
only if for every  > 0 there exists a positive function δ on E such that for every two
δ-fine partitions D1 = {(I, x)} and D2 = {(I, x)} of E, we have
X X
|(D1 ) f (x)g(I) − (D2 ) f (x)g(I)| < .

Proof. (⇒) Use Definition 3.3.
(⇐) For every k ∈ N, there is a positive function δk∗ on E such that for any two δk∗ -fine partitions D1 = {(I, x)} and D2 = {(I, x)} of E, we have
|(D1 ) Σ f (x)g(I) − (D2 ) Σ f (x)g(I)| < 1/k.
For every k ∈ N, we define a positive function δk by
δk (x) = min{δ1∗ (x), δ2∗ (x), . . . , δk∗ (x)},
for every x ∈ E. Therefore, for every k, every δk -fine partition of E is a δk∗ -fine partition of E. For every k, fix one δk -fine partition Dk of E and put Ak = (Dk ) Σ f (x)g(I). By the fact that for m, k ∈ N with m ≥ k, every δm -fine partition of E is a δk -fine partition of E, we can prove that {Ak } is a Cauchy sequence in R. There exists a real number A as the limit of {Ak }. Finally, it can be proved that A is the Henstock-Stieltjes integral value of f with respect to g on E. ∎

By Cauchy’s Criterion, we have Theorem 3.2 and 3.3.


Theorem 3.2. If f ∈ HS(E; g), then f ∈ HS(I; g) for every cell I ⊆ E.
Theorem 3.3. If f is a continuous function on a cell E ⊆ Rn and g is a bounded
variation function on E, then f is Henstock-Stieltjes integrable with respect to g on E.

Proof. Since g has bounded variation, there exists a constant M > 0, such that for
every partition D = {I} of E, we have
(D) Σ |g(I)| ≤ M.

Let ε > 0 be given. Since f is continuous on the cell E, f is uniformly continuous on E. There exists a positive constant δ such that for every x, y ∈ E with ‖x − y‖ < δ we have
|f (x) − f (y)| < ε/M.
Put δ(x) = 2−(n+1) δ for every x ∈ E. As a consequence, for every two δ-fine partitions D1 = {(I, x)} = {(I1 , x1 ), (I2 , x2 ), . . . , (Ip , xp )} and D2 = {(J, y)} = {(J1 , y1 ), (J2 , y2 ), . . . , (Jm , ym )} of E, we have
|(D1 ) Σ f (x)g(I) − (D2 ) Σ f (y)g(J)| = |Σ_{i=1}^{p} Σ_{j=1}^{m} (f (xi ) − f (yj )) g(Ii ∩ Jj )|
≤ Σ_{i=1}^{p} Σ_{j=1}^{m} |f (xi ) − f (yj )| |g(Ii ∩ Jj )|
< (ε/M) Σ_{i=1}^{p} Σ_{j=1}^{m} |g(Ii ∩ Jj )|
≤ (ε/M) Vg (E) = (ε/M) M = ε.
By Cauchy's Criterion, f is Henstock-Stieltjes integrable with respect to g on E. ∎

Now we prove that a regulated function is Henstock-Stieltjes integrable with respect to a function of bounded variation on E.

Theorem 3.4. If f is a regulated function on a cell E ⊆ Rn and g is a bounded


variation function on E, then f is Henstock-Stieltjes integrable with respect to g on E.

Proof. Let ε > 0 be given. Since f is regulated on E, by Theorem 2.4 there exists a partition D = {I} = {I1 , I2 , . . . , Ip } of E such that for every x, y ∈ Iio , 1 ≤ i ≤ p, we have
|f (x) − f (y)| < ε/M,
where M = Vg (E). We define a positive function δ on E such that for every x ∈ Iio , B(x, δ(x)) ⊆ Iio , for i = 1, 2, 3, . . . , p. Therefore, for any two δ-fine partitions D1 = {(I, x)} = {(I1 , x1 ), (I2 , x2 ), . . . , (Ip , xp )} and D2 = {(J, y)} = {(J1 , y1 ), (J2 , y2 ), . . . , (Jm , ym )} of E, we have
|(D1 ) Σ f (x)g(I) − (D2 ) Σ f (y)g(J)| = |Σ_{i=1}^{p} Σ_{j=1}^{m} (f (xi ) − f (yj )) g(Ii ∩ Jj )|
≤ Σ_{i=1}^{p} Σ_{j=1}^{m} |f (xi ) − f (yj )| |g(Ii ∩ Jj )|
< (ε/M) Σ_{i=1}^{p} Σ_{j=1}^{m} |g(Ii ∩ Jj )|
≤ (ε/M) Vg (E) = (ε/M) M = ε.
By Cauchy's Criterion, f is Henstock-Stieltjes integrable with respect to g on E. ∎

3.2. Regulated Functions in the Convergence Theorem of the Henstock-Stieltjes Integral.
Theorem 3.5. Let {fk } be a sequence of regulated functions on a cell E ⊆ Rn . If {fk }
uniformly converges to a function f on E, then f is regulated on E.

Proof. Let ε > 0 be given. For every k, there is a step function ϕk : E → R such that for every x ∈ E we have
|fk (x) − ϕk (x)| < ε.
Since {fk } uniformly converges to f on E, there exists a positive integer ko such that for any k ∈ N, k ≥ ko , we have
|fk (x) − f (x)| < ε,
for every x ∈ E. Put ϕ = ϕko ; then for every x ∈ E, we have
|f (x) − ϕ(x)| ≤ |f (x) − fko (x)| + |fko (x) − ϕ(x)| < 2ε.
That means f is regulated on E. ∎
Theorem 3.6. Let {fk } be a sequence of regulated functions on a cell E ⊆ Rn and
g ∈ BV (E). If {fk } uniformly converges to a function f on E, then f is Henstock-
Stieltjes integrable with respect to g on E. Furthermore,
limk→∞ (HS) ∫E fk dg = (HS) ∫E limk→∞ fk dg.

Proof. By Theorem 3.5, f is regulated on E; then by Theorem 3.4, f and fk are Henstock-Stieltjes integrable with respect to g on E for every k. For every k ∈ N, put Ak = (HS) ∫E fk dg. We can prove that {Ak } is a Cauchy sequence in R. Therefore, there exists A ∈ R such that {Ak } converges to A. The number A is the Henstock-Stieltjes integral value of f with respect to g on E. Moreover, we have
limk→∞ (HS) ∫E fk dg = (HS) ∫E limk→∞ fk dg. ∎


4. CONCLUDING REMARKS
We have given a characterization of regulated functions in Section 2, especially Theorem 2.4. The space of all regulated functions includes the space of all continuous functions on a cell E ⊆ Rn . Using Theorem 2.4, we proved in Theorem 3.4 that every regulated function on a cell E ⊆ Rn is Henstock-Stieltjes integrable with respect to an additive bounded variation function. The convergence theorem involving regulated functions is stated in Theorem 3.6. These results open opportunities for solving differential equation problems with discontinuities.

References
[1] Bartle, R.G. and Sherbert, D.R., Introduction to Real Analysis, Third Edition, J. Wiley & Sons, USA, 2000.
[2] Indrati, Ch. R., On the Regulated Function, Proceeding of National Seminar on Mathematics, Surabaya, Indonesia, June 20, 2009, pp. 8-24, 2009.
[3] Indrati, Ch. R., The Application of Regulated Function on the Multiplication of Two Henstock Integrable Functions, Proceeding of ICMSA IMT-GT, Padang, Indonesia, 2009.
[4] Pfeffer, W.F., The Riemann Approach to Integration, Cambridge University Press, New York, USA, 1993.
[5] Tvrdý, M., Linear Bounded Functionals on the Space of Regular Regulated Functions, Tatra Mountains Mathematical Publications 8, 203-210, 1996.

Ch. Rini Indrati


Department of Mathematics, Gadjah Mada University,
Sekip Utara, Yogyakarta, Indonesia.
e-mails: [email protected] or ch.rini [email protected]
Proceedings of ”The 6th SEAMS-UGM Conference 2011”
Analysis, pp. 275 - 280.

COMPACTNESS SPACE WHICH IS INDUCED BY SYMMETRIC GAUGE

DEWI KARTIKA SARI AND CH. RINI INDRATI

Abstract. In this paper we define a uniformly continuous mapping which is induced by a symmetric
gauge, called an Ms-uniformly continuous mapping. By means of Ms-uniformly continuous mappings, we characterize a new compactness notion which is induced by symmetric gauges, called symmetric gauge compactness.
Keywords and Phrases: Symmetric gauge, M s -uniformly continuous mapping, Symmetric
gauge compact.

1. INTRODUCTION

A gauge on a metric space (X, d) is understood to be a positive real valued function δ : X → R+. A gauge is used to define the Henstock-Kurzweil integral [1] and to provide a Cauchy type characterization for Baire class one functions [4]. For any positive valued function δ on the metric space (X, d), we can define an open neighbourhood, i.e. for each x ∈ X,
δ(x) = B(x, δ(x)) = {y ∈ X : d(x, y) < δ(x)}.
Since for an arbitrary topological space we do not use a metric, and consequently we cannot define an open neighbourhood as in a metric space, we define a gauge on a topological space (X, τ) as a function from the space X to the topology of the space itself. In 2005, Zhao introduced a binary relation RM induced by a gauge δ on the topological space X [5], i.e. for x, y ∈ X, x RM y if x ∈ δ(y) or y ∈ δ(x). By this relation he created M-uniformly continuous mappings and gauge compact spaces [5].

In this paper we introduce Ms-uniformly continuous mappings and generalize the concept of gauge compactness to symmetric gauge compactness by means of a gauge which has the symmetry property. Using Ms-uniformly continuous mappings, we study characterizations of symmetric gauge compact spaces. We prove that gauge compactness implies symmetric gauge compactness and that every M-uniformly continuous image of a symmetric gauge compact space is symmetric gauge compact if the codomain is an R0 space.

2. SYMMETRIC GAUGE AND


M s -UNIFORMLY CONTINUOUS MAPPINGS

2.1. Symmetric Gauge. Let (X, d) be a metric space and a be any positive number. We define a gauge δa on X, where δa (x) = {z ∈ X : d(z, x) < a}, for every x ∈ X. The gauge δa has the following property:
x ∈ δa (y) if and only if y ∈ δa (x).
In [6], Zhao defined a gauge on a topological space which has such a property to be a symmetric gauge, as follows.

Definition 1. Let X be a topological space. A gauge δ on X is said to be symmetric if for any x, y ∈ X, x ∈ δ(y) iff y ∈ δ(x).

Example 2. Let X = [−5, 2] ∪ [7, 10] be a subset of R equipped with the usual topology. Define a gauge δ on X by
δ(x) = (x − 2, x + 2) ∩ [−5, 2] for x ∈ [−5, 2], and δ(x) = (x − 3, x + 3) ∩ [7, 10] for x ∈ [7, 10].
The gauge δ is symmetric on X.

Next we give some properties of gauge symmetry. According to [2], if f : X → Y is a continuous function between topological spaces and δ is a gauge on the topological space Y, then f −1 (δ) is a gauge on X, where f −1 (δ)(x) = f −1 (δ(f (x))) for each x ∈ X.
Consequently, we have

Lemma 3. Let X and Y be topological spaces and f : X → Y be a continuous mapping. If δ is a symmetric gauge on Y, then f −1 (δ) is a symmetric gauge on X.
Proof: Since δ is a symmetric gauge on Y, for each y1 , y2 ∈ Y,
y1 ∈ δ(y2 ) ⟺ y2 ∈ δ(y1 ).
Furthermore, since f is a mapping from X to Y, for any x1 , x2 ∈ X there exist y1 , y2 ∈ Y such that f (x1 ) = y1 and f (x2 ) = y2 . As a consequence, we have
x1 ∈ f −1 (δ)(x2 ) ⟺ x2 ∈ f −1 (δ)(x1 ).
Therefore, f −1 (δ) is a symmetric gauge on X. ∎

Based on [2], the intersection of two gauges is a gauge. So, we have
Lemma 4. Let X be a topological space. If δ and λ are symmetric gauges, then δ ∩ λ is also symmetric, where (δ ∩ λ)(x) = δ(x) ∩ λ(x), x ∈ X.
Proof: For any x, y ∈ X, we will prove x ∈ (δ ∩ λ)(y) ⟺ y ∈ (δ ∩ λ)(x).
Since δ and λ are symmetric, we have
x ∈ δ(y) ⟺ y ∈ δ(x) and x ∈ λ(y) ⟺ y ∈ λ(x).
Consequently,
x ∈ (δ ∩ λ)(y) ⟺ x ∈ δ(y) and x ∈ λ(y) ⟺ y ∈ δ(x) and y ∈ λ(x) ⟺ y ∈ (δ ∩ λ)(x).
We can conclude that δ ∩ λ is symmetric on X. ∎

2.2. Ms-Uniformly Continuous Mappings. A function f : X → Y is an M-uniformly continuous mapping between topological spaces if for every gauge λ on Y there exists a gauge δ on X such that for any x1 , x2 ∈ X, x1 RM x2 (with respect to δ) implies f (x1 ) RM f (x2 ) (with respect to λ), see [5]. In this subsection we generalize the concept of M-uniformly continuous mapping to Ms-uniformly continuous mapping by replacing the gauge δ with a symmetric gauge.

Definition 5. Let X and Y be topological spaces. A mapping f : X → Y is said to be Ms-uniformly continuous if for any symmetric gauge λ on Y there exists a symmetric gauge δ on X such that for any x1 , x2 ∈ X, x1 ∈ δ(x2 ) ⟺ x2 ∈ δ(x1 ) implies
f (x1 ) ∈ λ(f (x2 )) ⟺ f (x2 ) ∈ λ(f (x1 )).

It is clear that the composition of two Ms-uniformly continuous mappings is Ms-uniformly continuous. Next we prove that every continuous mapping is Ms-uniformly continuous.

Theorem 6. Let X and Y be topological spaces. Every continuous mapping is Ms-uniformly continuous.
Proof: Let f : X → Y be a continuous mapping and λ be a symmetric gauge on Y. By Lemma 3, δ = f −1 (λ) is symmetric on X. For each x, y ∈ X, x ∈ δ(y) ⟺ y ∈ δ(x) implies
f (x) ∈ λ(f (y)) ⟺ f (y) ∈ λ(f (x)).
We can conclude that f is Ms-uniformly continuous. ∎

The converse of Theorem 6 is not always true, as the following example shows.
Example 7. Let X = Y = {1, 2, 3, 4, . . .} be topological spaces equipped with the topologies {∅, X} and {∅} ∪ {A : Ac is countable}, respectively. We define f to be the inclusion mapping from X to Y. It is clear that f is Ms-uniformly continuous, but f is not continuous.

Theorem 8. Let X and Y be topological spaces. Every M-uniformly continuous mapping is Ms-uniformly continuous.
Proof: Let f : X → Y be a mapping and let λ be a symmetric gauge on Y. Assume that f is M-uniformly continuous; then there exists a gauge δ on X such that for each x, y ∈ X, x RM y implies f (x) RM f (y). Since λ is symmetric, we obtain
f (x) ∈ λ(f (y)) ⟺ f (y) ∈ λ(f (x)).
Hence, f is Ms-uniformly continuous. ∎

Based on Theorem 2 in [5] and Theorem 8, we obtain that an Ms-uniformly continuous mapping f : X → Y is continuous if and only if Y is an R0-space.
Recall from [3] that a space X is called an R0-space if for any two points x, y ∈ X, x ∈ cl{y} if and only if y ∈ cl{x}, where cl{x} denotes the closure of {x}.

Corollary 9 A space Y is an R0 -space if and only if for any space X, every M s - uniformly
continuous mapping is continuous.

3. SYMMETRIC GAUGE COMPACT SPACES

In this section we use symmetric gauges to generalize gauge compactness. Recall that a topological space X is called gauge compact if for every gauge δ on X there exist finitely many points x1 , x2 , . . . , xn in X such that for every x ∈ X there is xi ∈ {x1 , x2 , . . . , xn } so that x RM xi , see [5].

Definition 10. A topological space X is said to be symmetric gauge compact if for every symmetric gauge δ on X there exist finitely many points x1 , x2 , . . . , xn in X such that for every x ∈ X there is xi ∈ {x1 , x2 , . . . , xn } so that
x ∈ δ(xi ) ⟺ xi ∈ δ(x).
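As a simple illustration of the definition for one particular gauge: let X = [0, 5] with the usual topology and the symmetric gauge δ(x) = (x − 1, x + 1) ∩ X. The finitely many points x1 = 0, x2 = 1, . . . , x6 = 5 work for this δ, since every x ∈ X lies at distance less than 1 from some integer point xi , and then x ∈ δ(xi ) and xi ∈ δ(x).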

From Definition 5 in [5] and Definition 10, we obtain
Theorem 11. If X is a gauge compact space, then X is symmetric gauge compact.
Proof: Let δ be a symmetric gauge on X. Since X is gauge compact, there exist finitely many points x1 , x2 , . . . , xn in X such that for each x ∈ X there is xi ∈ {x1 , x2 , . . . , xn } so that x RM xi . Furthermore, since δ is a symmetric gauge, x ∈ δ(xi ) ⟺ xi ∈ δ(x). So for every symmetric gauge δ on X there exist finitely many points x1 , x2 , . . . , xn in X such that for each x ∈ X there is xi ∈ {x1 , x2 , . . . , xn } with x ∈ δ(xi ) ⟺ xi ∈ δ(x). We can conclude that X is symmetric gauge compact. ∎

Theorem 12. If f : X → Y is an Ms-uniformly continuous mapping onto a space Y and X is symmetric gauge compact, then Y is symmetric gauge compact.
Proof: Let λ be a symmetric gauge on Y. Since f is Ms-uniformly continuous, there exists a symmetric gauge δ on X such that for each u, v ∈ X, u ∈ δ(v) ⟺ v ∈ δ(u) implies
f (u) ∈ λ(f (v)) ⟺ f (v) ∈ λ(f (u)).
Furthermore, X is symmetric gauge compact, so there exist finitely many points x1 , x2 , . . . , xn ∈ X such that for every x ∈ X there is xi ∈ {x1 , x2 , . . . , xn } with x ∈ δ(xi ) ⟺ xi ∈ δ(x) for some i. Now for every y ∈ Y, since f is surjective, there exists x ∈ X such that f (x) = y. Suppose x ∈ δ(xi ) ⟺ xi ∈ δ(x); then
y ∈ λ(f (xi )) ⟺ f (xi ) ∈ λ(y).
Hence Y is symmetric gauge compact. ∎

4. CONCLUDING REMARK

An Ms-uniformly continuous mapping is an extension of an M-uniformly continuous mapping. This mapping is obtained by replacing the gauge which induces M-uniform continuity with a symmetric gauge. Imposing gauge symmetry on a gauge compact space produces a symmetric gauge compact space.
Since every continuous mapping is Ms-uniformly continuous, every continuous image of a symmetric gauge compact space is symmetric gauge compact. Also, M-uniformly continuous mappings maintain symmetric gauge compactness.

References

[1] BARTLE, R. G. AND SHERBERT, D. R., Introduction to Real Analysis, 3rd edition, John Wiley and Sons, Inc., New York, 2000.
[2] ENGELKING, R., General Topology, revised and completed edition, Heldermann Verlag, Berlin, 1989.
[3] PREUSS, G., Theory of Topological Structures: An Approach to Categorical Topology, D. Reidel Publishing Company, 1988.
[4] LEE, P.Y., TANG, W. K., AND ZHAO, D., An equivalent definition of functions of the first Baire class, Proc. Amer. Math. Soc., 129, 2273-2275, 2001.
[5] ZHAO, D., A New Compactness Type Topological Property, Quaestiones Mathematicae, 28, 1-11, 2005.

DEWI KARTIKA SARI
DEPARTMENT OF MATHEMATICS, GADJAH MADA UNIVERSITY
e-mail: [email protected]

CH. RINI INDRATI
DEPARTMENT OF MATHEMATICS, GADJAH MADA UNIVERSITY
e-mail: [email protected]
Proceedings of ”The 6th SEAMS-UGM Conference 2011”
Analysis, pp. 281 - 288.

A CONTINUOUS LINEAR REPRESENTATION OF A TOPOLOGICAL QUOTIENT GROUP

DIAH JUNIA EKSI PALUPI, SOEPARNA DARMAWIJAYA, SETIADJI, CH RINI INDRATI

Abstract. Let (G,  ) and (Lc(V),  ) be a topological group and a topological vector space,
respectively. A continuous linear representation  c of (G,  ) into (Lc(V),  ) is a
homomorphism of (G,  ) into (GLc(V),  GL). Therefore the Ker(  c) is a normal subgroup of G.
In this paper, we prove that the (Ker(  c),  Kr) is a topological normal subgroup of the
topological group (G,  ), where  Kr is a topology induced by  . Futhermore, we also prove
that the set G = G/Ker(  c) = {g + Ker(  c)│g  G}is a topological quotient group and there is

a continuous linear representation of G into (Lc(V),  ).


Keywords and Phrases: Topological group, topological quotient group,homomorphism and
continuous linear representatition.

1. INTRODUCTION

Let G be a finite group, V be a vector space, and GL(V) be a group of all


isomorphisms of V onto itself . A linear representation of a group G in a vector space V is a
homomorphism from G into GL(V) ( [3], [5] and [10]). Based on the concept of the
topological group and the topological vector space ([2] and [9]), Palupi [6] has constructed a topology  on Lc(V), the collection of all continuous linear transformations from V onto itself. Palupi et al. [7] have also defined a continuous linear representation of the topological group (G,  ) into the topological vector space (V,  ). In this situation, (V,  ) is called a representation space. From here, Palupi [7] has continued to investigate the

generalization of some properties of linear representations of finite groups to continuous linear representations of topological groups, such as invariant subspaces, irreducible and completely reducible representations, and the decomposition of the representation space into a direct sum of minimal topological invariant subspaces ([8]). In this paper, we investigate the topological quotient group, a continuous linear representation of it, and their properties.
The theorem and the definition of a continuous linear representation that we have already obtained are stated below.
Theorem.[6] Let (G,  ) be a topological group and let (GLc(V),  GL) be a topological

subspace of (Lc(V),  ). If  c : (G,  ) → (GLc(V),  GL) is a map defined by  c(x) = Tx for


 x  G, Tx is a bijective linear continuous transformations from V onto itself in GL c(V), then
c is a continuous homomorphism .

Definition.[7]Let (G,  ) be a topological group and let (Lc(V),  ) be a topological vector

space. A continuous homomorphism  c : (G,  ) → (GLc(V),  GL), defined by  c(x) = Tx,


for  x  G, is called a continuous linear representation from topological group
(G,  )into topological vector space (V,  ), if for every x,y  G, we have
(i) Tx+y = Tx ○ Ty
(ii) T-x = (Tx)-1

Let G and G* be groups and  be a homomorphism from G into G*. Ker(  ) =


{g  G│  (g) = eG* , the identity element of G*} is a normal subgroup of G. Hence Ker(  c)
is a normal subgroup of G, we have a quotient group G/Ker(  c), written G = G/Ker(  c),
and there is an epimorphism  from G into G . We will show that G is a topological group.
Furthermore, we construct a topology on G , which is a quotient topology. Let X be
a topological space, A be a set and p be a surjection mapping from X into A. A topology T
on A relative to p is called the quotient topology induced by p [1,4]. Here, we define the
topology T as a collection of subsets U of A such that p -1(U) is open in X.
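For example, for X = R with the usual topology and p : R → R/Z the canonical surjection onto the set A = R/Z of cosets, the quotient topology induced by p is the usual topology of the circle.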

2. MAIN RESULTS

Let (G,  ) be a topological group, (V,  ) be a topological vector space and  c be


a continuous linear representation of (G,  ) into (V,  ). So,  c is a homomorphism from
(G,  ) into (GLc(V),  GL), that is a topological vector subspace of (Lc(V),  ), that is
topological vector spaces by constructed from (V,  ) and Ker(  c) is a normal subgroup of
group G. Let  Kr be a topology induced by  on Ker(  c), (Ker(  c),  Kr) is a
topological space. If g1 and g2 are continuous functions on G such that (G,  ) is a topological
group then by the restrictions g1 and g2 on Ker(  c), we have (Ker(  c),  Kr) is a
topological subgroup of topological group (G,  ). By a normal subgroup Ker(  c) of G, we
have a quotient group G = G/Ker(  c) = {g + Ker(  c) │g  G}. We will show G is a
topological group as stated in the proposition below.

PROPOSITION 1. If  c is a continuous linear representation from a topological group


(G,  ) into a topological vector space (V,  ) then G = G/Ker(  c) is a topological
quotient group.
Proof. Since  c is a continuous linear representation then  c is a homomorphism of a group
G into GLc(V). The Ker(  c) is a normal subgroup of G and
G = G/ Ker(  c) = {g + Ker(  c) │g  G}
is a quotient group. Therefore there is an epimorphism  from G into G .
Furthermore, since (G,  ) is a topological group, we have Ker(  c)  G, which
as a topological space with topology  Kr induced by  . Let g1 and g2 be continuous
mappings such that (G,  ) is a topological group. The restriction of g 1 and g2 on Ker(  c)
are continuous. Therefore (Ker(  c),  Kr) is a topological normal subgroup of G. On the
other hand, G = G/Ker(  c) is a quotient group and  from the group G into G is a group epimorphism, so  is a surjection. Because we can construct a quotient topology TG induced by  , ( G , TG) is a topological space. That means G is a quotient group which is a topological space.
We prove that ( G , TG) is a topological quotient group, where the topology TG is the
set{ U  G │  -1( U ) open in G}. If (G,  ) is a topological group then g1 : G  G  G
is defined by g1( x, y )= x  y  (G,  ), for any ( x, y )  (G,  )  (G,  ), and
g2 : G  G is defined by g2( x ) = - x  (G,  ), for any x  (G,  ), are continuous
mapping.
Suppose that f1 : G  G → G is defined by f1 ( x , y ) = x + y  G , for any
( x , y ) G  G , (1)
where x + y = ( x + y ) + Ker(  c) and x + y = g1( x , y )  (G,  )
Suppose that f2 : G → G is defined by f2( x ) = - x  G , for any x  G (2)
where - x = - x + Ker(  c) and - x = g2 ( x )  (G,  ).

We show f1 and f2 are continuous as follows.


We show f1 : G  G → G is continuous.
Let consider the diagram GG  G
  
G  G  G
We have g1: G  G  G and  : G  G are continuous, then there is a continuous
mapping  =   g1 from G  G into G . Let f1: G  G  G be a mapping defined as
(1). For arbitrary ( x, y )  G  G we have (f1   )( x, y ) = f1(  ( x, y ) = f1( x , y ) =
x y = xy . On the other hand,  ( x, y ) = (   g1)( x, y ) =  (g1( x, y )) =  ( xy ) = xy .
So, there is a mapping  : G  G  G  G such that f1   =  .
Let U be an open set in G that is U  TG which  -1( U ) be an open set in G. Because
 is a continuous mapping,  -1( U ) is an open set in G  G, where
 -1( U ) = {( x, y )  G  G│  ( x, y )  U }
= {( x, y )  G  G│(   g1)( x, y )  U }
= {( x, y )  G  G│  (g1)( x, y )  U }
= {( x, y )  G  G│  ( xy )  U }
= {( x, y )  G  G│ xy  U }.
For any ( x, y )  G  G we have  ( x, y ) = ( x , y )  G  G So{( x , y )  G  G │
xy  U }= f1-1( U ) is open set in G  G , in other word f1 is continuous.

We show f2 : G → G is a continuous mapping.


Let consider the diagram G  G
  
G  G
We have g2: G  G and  : G  G are continuous. Then there is a continuous mapping
 =   g2 from G into G . By f2: G  G as in (2), there is a mapping  * : G  G
such that f2   * =  . Since, for every x  G, we have (f2   *)( x ) = f2(  * ( x )) =
f2( x ) = - x . On the other hand  ( x ) = (   g2)( x ) =  (g2( x )) =  (- x ) = - x .
Let V be an open set in G , that is V  TG which  -1( V ) be an open in G. Since  is a
continuous on G ,  -1( V ) is an open set in G, where
 -1( V ) = { x  G │  ( x )  V }
= { x  G │(   g2)( x )  V }
= { x  G │  ((g2)( x ))  V }
= { x  G │  (- x )  V }
= { x  G │- x  V }.
For every x  G we have  *( x ) = x  G .

So, we have an open set{ x  G │- x  V }= f2-1( V ) in G , in other word f2 is continuous


We have a topological quotient group ( G , TG) as desired ■

Furthermore, we will find the connection between G and the continuous linear representation  c.
Let G be a topological group, G be a topological quotient group and GLc(V) be a topological
subspace of Lc(V). If we have the diagram
G  GLc(V)

G
where  c : G  GLc(V) and  :G  G are continuous homomorphism. Then there
exists a map  : G  GLc(V) such that  ◦  =  c. This is stated in the following proposition.

PROPOSITION 2. If  c is a continuous linear representation from a topological group


(G,  ) into a topological vector space (V,  ) then there is a continuous homomorphism
 : G  GLc(V) such that  ◦  =  c.
Proof. Let consider the diagram
G  GLc(V)

G
where  c : G  GLc(V) and  : G  G are continuous homomorphism. There is a map
 : G  GLc(V) such that  ◦  =  c.
We will show  is a homomorphism, that is for any x , y  G ,
 ( x + y ) =  ( x ) ◦  ( y ).
Let x , y  G ,  ( x + y ) =  ( x  y ) because x + y = ( x  y ) in G . Since
 ◦ =  c and  is surjective, for every ( x  y )  G there is ( x  y )  G such that
 ( x  y )=  (  ( x  y )) =  c( x  y ) =  c( x ) ◦  c( y ) =  (  (x)) ◦  (  (y)) =
 ( x ) ◦  ( y ). So  ( x + y ) =  ( x ) ◦  ( y ), in other word  is a homomorphism.
We show  is continuous. Let W be an open set in GLc(V). That means W is an element of
 GL. Since  c is a continuous mapping,  c-1(W) is open in G, where
 c -1(W) = { x  G │  c ( x )  W}
= { x  G │(    )( x )  W}
= { x  G │(  (  ))( x ))  W}
= { x  G │  (  ( x ))  W}
= { x  G │  ( x )  W}.
For every x  G we have  ( x ) = x  G .

Hence, we have an open set { x  G │  ( x )  V }=  -1(W) in G . So  is continuous


on G .■
By the two propositions above, we have a continuous homomorphism  from G into GLc(V). Furthermore, we will show that  is a continuous linear representation.

THEOREM. A continuous homomorphism  from G into GLc(V) as in Proposition 2 is a


G into (Lc(V),  ).
continuous linear representation from
Proof. Because  ◦  =  c , for any x , y  G we can define  ( x ) = Tx , where  : G
 GLc(V). Then  ( x y ) =  ( x )  ( y ) = Tx ◦ Ty and  (- x ) = T-x ■

Corollary. For any x,y  G, if  c(x) =  c(y) then there exists an isomorphism from G or
G into Im(  c) such that the diagram
G  Im(  c)
 
G
is commutative.
Proof. If  c(x) =  c(y) then(  ◦  )(x) = (  ◦  )(y) that is  (  (x)) =  (  (y)). So
 ( x ) =  ( y ) that is Tx = Ty for every x,y  G or x , y  G . ■

3. CONCLUSION
Let  c be a continuous linear representation from a topological group (G,  ) into a
topological vector space (V,  ) then
1. the set G = G/Ker(  c) is a topological quotient group
2. there is a continuous linear representation  from G into (V,  ), that is a
homomorphism  : G  GLc(V) such that  ◦ = c
3. for any x,y  G, if  c(x) =  c(y) then G isomorphic to Im(  c) and Im(  c)
isomorphic to G

REFERENCES

[1] Bourbaki, N., General Topology, Part 1, Addison-Wesley Publishing Company, 1966.
[2] Husain, T., Introduction to Topological Groups, W. B. Saunders Company, 1966.
[3] Ledermann, W., Group Representations and Characters, Interscience Publishers, Inc., 1965.
[4] Munkres, J. R., Topology, Pearson Prentice Hall, America, 2000.
[5] Serre, J.-P., Linear Representations of Finite Groups, Springer-Verlag, New York, 1977.
[6] Palupi, Indrati, Darmawijaya, “Konstruksi Topologi Pada Lc(V,W)”, Pros. Seminar Nasional
“Penelitian, Pendidikan dan Penerapan MIPA”, Universitas Negeri Yogyakarta, 2009.
[7] Palupi, Indrati, Darmawijaya, “Representasi Linear Kontinu Dari Grup Topologis Ke Dalam Ruang
Vektor Topologis”, Seminar Nasional “Matematika dan Pendidikan Matematika”, Universitas
Negeri Malang, 2009.
[8] Palupi, Indrati, Darmawijaya, “Dekomposisi Ruang Representasi Linear Kontinu”, Konferensi Nasional
Matematika XV, Universitas Negeri Manado, 2010.
[9] Schaefer, H. H., Topological Vector Spaces, The Macmillan Company, New York, 1966.
[10] Vinberg, E. B., Linear Representations of Groups, Birkhäuser Verlag, Basel-Boston-Berlin, 1989.
Proceedings of ”The 6th SEAMS-UGM Conference 2011”
Analysis, pp. 289 - 296.

ON NECESSARY AND SUFFICIENT CONDITIONS FOR INTO

SUPERPOSITION OPERATOR

ELVINA HERAWATY, SUPAMA , INDAH EMILIA WIJAYANTI

Abstract. Let L be a Banach lattice and 𝜙 be a weight function that satisfies the Δ2-condition; we
define a space of L-valued sequences. In this paper we discuss necessary and sufficient conditions for
the superposition operator to map this space to the target sequence space.
Keywords and Phrases : Banach lattice, weight function, Δ2-condition, superposition operator.

1. INTRODUCTION

Let ℝ be the set of all real numbers, L be a Banach lattice and  be the space of all L-
valued sequences. We denote the k-th term of a sequence  by . Any vector
subspace of  is called an L-valued sequence space.
An L-valued sequence space X is called a BK-space if it is a Banach space and the
canonical function defined by  is continuous for all k ∈ ℕ. For a
sequence  and N ∈ ℕ, the finite sections are defined by  and
zero otherwise. A Banach space X is said to have the AK property if X contains all finite sequences
and for every sequence , the limit  as N → ∞ holds.
We write  for the real sequence space of all sequences associated with absolutely
convergent series. It is known that  is a BK-space with the norm defined by

Let X and Y be two sequence spaces and let g be a function with

g(k, 0) = 0 for every k ∈ ℕ. The superposition operator Pg generated by g is defined by
(Pg x̄)(k) = g(k, x̄(k))
for every x̄ ∈ X.
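To make the definition concrete, the following is a minimal sketch (not from the paper) of a superposition operator acting on a truncated scalar sequence; the generating function g used here is a hypothetical example chosen only so that g(k, 0) = 0.

```python
def superposition(g, x):
    """Apply the superposition operator induced by g to the finite sequence x:
    (P_g x)_k = g(k, x_k)."""
    return [g(k, xk) for k, xk in enumerate(x, start=1)]

# Hypothetical generating function g(k, t) = t**2 / k, which satisfies g(k, 0) = 0.
g = lambda k, t: t ** 2 / k

x = [1.0, 0.5, 0.25, 0.125]      # a truncated sequence
print(superposition(g, x))       # [1.0, 0.125, 0.0208..., 0.0039...]
```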


A characterization of superposition operators on Orlicz spaces was given by Robert and Šragin. A complete
investigation of superposition operators on the sequence spaces of all bounded and null sequences and of
p-absolutely convergent series (0 < p < ∞), respectively, was given by Dedagich and Zabreĭko. The acting
condition for Pg into ℓ1 was proved by Chew Tuan Seng under the assumption that the functions g(k, ⦁) are
continuous. The results of Šragin contain characterizations of superposition operators on spaces generated by
a sequence of 𝜙-functions. The main aim of the present paper is to introduce the
L-valued sequence space , where 𝜙 is a weight function and L is a Banach lattice. For a
function g such that
(1) g(k, 0) = 0 for every k ∈ ℕ and
(2) g(k, ⦁) is continuous on L for every k ∈ ℕ,
necessary and sufficient conditions are given for the superposition operator to map
 to .

2. SOME LEMMAS AND DEFINITIONS

Lemma 2.1. Let the L-valued sequence space X be a BK-space with the AK property. If  and
N ∈ ℕ, then for any number ε > 0 there exists a number δ > 0 so that for every  with
, we have
 for all k ∈ ℕ.
A sufficient condition for a functional F : X → ℝ to be continuous on the normed space X is
presented in the following theorem.

Theorem 2.2. Let the L-valued sequence space X be a BK-space with the AK property. If the
function g satisfies the condition g(k, 0) = 0 and g(k, ⦁) is continuous on L for
each k ≥ 1, then for N ∈ ℕ the functional  defined by

is continuous on X.
Proof. Suppose F is not continuous on X. Then there are  and real numbers  such
that for every δ > 0 there exists  so that if  then  holds
for every N ∈ ℕ.
If  there exists  such that if  then
.
As a result, we can choose  with  such that

Since  is continuous on L for each k ≥ 1, there is a real number δ > 0 so that for
every  with  we have:

(1)

By Lemma 2.1, there exists  such that for every k ≤ , whenever


, so
(2)
On the other hand, there exists  such that if  then
.
By taking , we obtain from (2):

So

Furthermore, we can again choose  with  such that

and so on.
Continuing in this way, we obtain a sequence  and a sequence  with , together with an
increasing sequence of natural numbers such that for  and  there is  such
that for every  with  we obtain:

(3)
Define  for each k ∈ ℕ. This gives a sequence . Let ;
then for natural numbers n ≥ m we have:

So .

This means that  is a Cauchy sequence in X. Because X is complete, there is  so that
. Equivalently, . Since x, z ∈ X, then

Consequently  exists in X, and hence


exists.

Therefore  exists. So there is p ∈ ℕ such that


,

which contradicts (3). Thus F is continuous on X. ∎

For a Banach lattice L and the set of real numbers ℝ, a function 𝜙 : L ⟶ ℝ is a weight
function if it is non-decreasing on L+, continuous, even, and it satisfies the
doubling condition (Δ2-condition), that is, there exists a real number M > 0 such that
𝜙(2u) ≤ M 𝜙(u) for every u ∈ L+.
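As a simple illustration (not taken from the paper), the lattice norm already provides a weight function of this kind:

```latex
% Illustrative example, not from the paper: \phi(u) = \|u\|^{p} is a weight function.
% It is even, continuous, non-decreasing on L^{+} (lattice norms are monotone on
% positive elements), and it satisfies the \Delta_2-condition with M = 2^{p}, since
\phi(2u) \;=\; \|2u\|^{p} \;=\; 2^{p}\,\|u\|^{p} \;=\; 2^{p}\,\phi(u),
\qquad u \in L^{+},\; p \ge 1 .
```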
For a weight function 𝜙 that satisfies the Δ2-condition, the function  is
defined by . It follows that the sequence  is
monotone non-decreasing on ℝ. The space  is defined by

For an L-valued sequence space X, the function  is called a modular if


(i)
(ii)
(iii)
(iv) If
Since  is non-decreasing on ℝ, if we define
 then ρ is a modular on . The sequence space  is a BK-space with
respect to the norm defined by

3. MAIN RESULT
The main result of this research is Theorem 3.3. The following lemmas are
required to prove it.

Lemma 3.1. For any  ∈ , and for every real number β > 0, there is a real number
α > 0 such that if 𝜙  for every k, then

Proof. Take any real number β > 0; then there is  so that . Since 𝜙 satisfies the

Δ2-condition, for each k,  holds for a real number M > 0.

Furthermore, choosing α ≤ , if  then , so that

Since , we have  whenever

for every k. ∎

Lemma 3.2. Let  be any function that satisfies the condition g(k, 0) = 0 and such that
 is continuous on L for each k ≥ 1. If there are real numbers α, β > 0 such that
 for each  with , then
for every real number ɛ > 0 there is  with  for each  with
 and  for .
Proof. Since  is continuous on L, if  then  for some δ > 0.
Also, since 𝜙 is continuous on L, for this δ > 0 we have . So for each k ∈ ℕ, .
Therefore, for each k, the set
is bounded above. Furthermore, define
.
It can be seen that  for each . Because the function 𝜙 is continuous on L, the set
is closed and finite. It follows that .
Next, take any real number ε > 0 with 0 < ε ≤ 1; then there is  so that

For any m we make the decomposition

so that
 for i = 1, 2, ..., l–1 with , and

As a result,
 for i = 1, 2, ..., l,
and we obtain

≤ lβ – (l – 1)β + ε = β + ε, for each m.
As a result,  ≤ β + ε.
So  for each k with . ∎

The following theorem gives a necessary and sufficient condition for the superposition
operator  to map  to .

Theorem 3.3. Let  satisfy (1) and (2). The superposition operator  acts from
 to  if and only if the following condition is satisfied:
there exist real numbers 𝛼, 𝛽 > 0 and  with  for each  so that
 whenever .
Proof. Sufficiency. Take any ; then . So there
exists N ∈ ℕ such that .
As a result
for each k
Using the hypothesis, for each k we have

So for every k

Because

and

then

It means .
Necessity. Let  be a superposition operator, and let the functional  be
given by .
Then, by Lemma 2.1, the functional  is continuous on ; in particular it is continuous at 0.
This means that for each number ε > 0 there is a number η > 0 such that
if  then .
From Lemma 3.1, there exists a number 𝛼 > 0 such that if 
then .

So there exist real numbers α, β > 0 such that if  then . So, by
Lemma 3.2, there exists  with  for each , and
 for . ∎

References

[1] Appell, J. and Zabrejko, P. P., Nonlinear Superposition Operators, Cambridge University Press, 1990.
[2] Chew, T. S., Superposition Operators on ω0 and W0, Comment. Math. (2) 29, 149-153, 1990.
[3] Dedagich, F. and Zabreĭko, P. P., On Superposition Operators in ℓp Spaces, Sibirsk. Mat. Zh. 28, no. 1, 86-98
(Russian); English translation: Siberian Math. J. 28, no. 1, 63-73, 1987.
[4] Kamthan, P. K. and Gupta, M., Sequence Spaces and Series, Marcel Dekker, Inc., 1981.
[5] Meyer-Nieberg, P., Banach Lattices, Springer-Verlag, 1991.
[6] Paredes, L. I., Boundedness of Superposition Operators on w0, SEA Bull. Math., Vol. 15, no. 2, 145-151,
1991.
[7] Paredes, L. I., Orthogonally Additive Functionals and Superposition Operators on w0(ф), Ph.D. dissertation,
University of The Philippines, 1993.
[8] Petranuarat, S. and Kemprasit, Y., Superposition Operators on ℓp and c0 into ℓq (1 ≤ p, q < ∞), SEA
Bull. Math., Vol. 21, 139-147, 1997.
[9] Rao, M. M. and Ren, Z. D., Applications of Orlicz Spaces, Marcel Dekker, Inc., N.Y., 2002.
[10] Robert, I. J., Continuité d'un opérateur non linéaire sur certains espaces de suites, C.R. Acad. Sci. Paris, Ser. A 259
(1964), 1287-1290.
[11] Šragin, I. V., Conditions for imbedding of classes and their consequences (Russian), Mat. Zametki (5)
20, 681-692, 1976; English translation: Math. Notes 20, 942-948.
[12] Sri Daru, U., Operator Superposisi Terbatas Pada Beberapa Ruang Barisan, Disertasi Doktor,
Universitas Gadjah Mada, 1998.

ELVINA HERAWATY
Department of Mathematics, FMIPA USU, Medan
e-mail : [email protected]

SUPAMA
Department of Mathematics, FMIPA UGM, Yogyakarta, 55281
e-mail : [email protected]

INDAH EMILIA WIJAYANTI
Department of Mathematics, FMIPA UGM, Yogyakarta, 55281
e-mail : [email protected]
Proceedings of ”The 6th SEAMS-UGM Conference 2011”
Analysis, pp. 297–308.

A DRBEM FOR STEADY INFILTRATION FROM


PERIODIC FLAT CHANNELS WITH ROOT WATER
UPTAKE

Imam Solekhudin and Keng-Cheng Ang

Abstract. Water flow in unsaturated soils that is induced by infiltration and plant
root water uptake processes is governed by Richard’s equation. To study the governing
equation more conveniently, it is transformed into a form of Helmholtz equation using
Kirchhoff transformation with dimensionless variables. It may be difficult or even impos-
sible to obtain the analytical solutions of boundary value problems involving this type
of Helmholtz equation. In this study, we employ the dual reciprocity boundary element
method (DRBEM) to solve these problems numerically. The proposed method is tested
through an example involving infiltration from periodic flat channels with root water
uptake.

Keywords and Phrases: Richard’s equation, Helmholtz equation, DRBEM, infiltration,


root water uptake.

1. INTRODUCTION
The study of infiltration through soils has been considered by numerous re-
searchers. Philips [10], Batu [3], Azis et al [2], Lobo et al [9], and Clements et al
[4] are some of such researchers.
In this paper, we investigate solutions to time independent water flow problems
in unsaturated soils involving processes of infiltration and absorption by plant roots.
The process of water absorption by the plant roots is also known as root water uptake
process. A set of transformations is employed to obtain the governing equation in
the form of a Helmholtz equation. An integral formulation is used to construct a
numerical scheme based on the dual reciprocity boundary element method or DRBEM,
for obtaining numerical solutions of the Helmholtz equation. An example of infiltration
from periodic flat channels with root water uptake is considered to test the scheme.
The solutions obtained are compared with solutions of problems involving infiltration
from the same flat channels without root water uptake process.


2. ROOT WATER UPTAKE


Root water uptake is one of the most important processes in subsurface, as it can
largely control water fluxes to the atmosphere and the groundwater [13]. To improve
our understanding of the magnitude of these fluxes, an estimation of the spatial root
water uptake pattern is needed.
Many models of root water uptake have been proposed. The basis of some of
these proposed models is one developed by Raats [12], and written as
β(Z) = (1/λ) e^(−Z/λ) ,   Z ≥ 0,   (1)
where β(Z) is the spatial root water uptake distribution with depth (L−1 ), λ (L) is a
positive number such that at Z = λ the cumulative root water uptake is 63% of the
total uptake over the whole root zone, and Z is the depth in the soil (L).
Vrugt et al [13] and Vrugt et al [14] developed a model based on the Raats model.
The model is given by
 
β(Z) = (1 − Z/Zm) e^(−(pZ/Zm)|Z* − Z|) ,   Z ≥ 0,   (2)
where β(Z) is the dimensionless spatial root distribution with depth, Zm is the maxi-
mum rooting depth (L), and pZ (-) and Z ∗ (L) are empirical parameters. When Zm = 1
m, Z ∗ = 0.20 and pZ = 1.00, the model is similar to that the one proposed by Hoffman
and van Genuchten [7]. If Zm = 1 m, Z ∗ = 1.00 and pZ = 0.01, then the model is
similar to the one proposed by Prasad [11].
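The following small sketch (not the authors' code) evaluates the one-dimensional root distributions in equations (1) and (2); the parameter values for λ are assumptions chosen only for illustration, while Zm = 1 m, Z* = 0.20 and pZ = 1.00 are the values quoted above.

```python
import numpy as np

lam = 0.3                         # assumed lambda in the Raats model (m)
Zm, Z_star, pZ = 1.0, 0.20, 1.0   # Vrugt-model parameters quoted in the text

def beta_raats(Z):
    """Raats (1974) exponential root distribution, equation (1)."""
    return np.exp(-Z / lam) / lam

def beta_vrugt(Z):
    """Vrugt et al. (2001) one-dimensional root distribution, equation (2)."""
    return (1.0 - Z / Zm) * np.exp(-(pZ / Zm) * np.abs(Z_star - Z))

Z = np.linspace(0.0, Zm, 5)       # a few depths inside the root zone
print(beta_raats(Z))
print(beta_vrugt(Z))
```

The two-dimensional distribution of equation (3) below is built in the same way, with an additional horizontal factor.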
We assume that crops are planted on the soil surface between and equidistance
from two channels. The root zone width is equal to the distance between two channels,
2D, and its depth is Zm . This is illustrated in Figure 1
The two-dimensional model proposed in this study is based on the model proposed
by Vrugt et al, and is given as
  
β(X, Z) = (X/D − 1)(1 − Z/Zm) e^(−((pZ/Zm)|Z* − Z| + (pX/Xm)|X* − X|)) ,
L ≤ X ≤ L + D, Z ≥ 0,   (3)
where β(X, Z) is the dimensionless two-dimensional spatial root distribution, X is the
distance from the middle of a channel in the X direction (L), L is the half width of the
channels (L), D is the half width of the soil surface outside the channels (L), Zm is the
depth of the root zone, and pZ , pX , X ∗ , and Z ∗ are empirical parameters.
Since the potential cumulative root water uptake equals to the potential tran-
spiration rate (Tpot ) [13], the normalized root water uptake distribution, Sm (T−1 ), is
computed from
Sm (X, Z) = Lt β(X, Z) Tpot / ( ∫_L^(L+D) ∫_0^(Zm) β(X, Z) dZ dX ) ,   (4)
where Lt is the width of the soil surface associated with the transpiration process.

Figure 1. Geometry of periodic flat-channels with root zones

The actual root uptake, S, depends on the potential cumulative root water uptake
and soil water pressure head [5], and is modelled as
S(X, Z) = γ(ψ)Sm (X, Z), (5)
where γ is the dimensionless water stress response function, which takes values from
zero to one, and ψ is the suction potential (L). In this study we assume that the root
water uptake is under the condition of no stress, γ = 1.

3. BASIC EQUATION AND PROBLEM FORMULATION


Steady infiltration with root water uptake is governed by the equation
   
∂/∂X (K ∂ψ/∂X) + ∂/∂Z (K ∂ψ/∂Z) − ∂K/∂Z = S(X, Z),   (6)
where K is the hydraulic conductivity, ψ is the suction potential, and S is the root
water uptake function.
The Matric Flux Potential (MFP) introduced by Gardner [6]
Θ = ∫_{−∞}^{ψ} K ds,   (7)

and an exponential relation


K = Ks eαψ , (8)

where α is an empirical parameter (L−1 ), transform equation (6) to the equation


∂²Θ/∂X² + ∂²Θ/∂Z² − α ∂Θ/∂Z = S(X, Z).   (9)
The horizontal and vertical components of the flux in terms of the MFP are
U = −∂Θ/∂X,   (10)
and
V = αΘ − ∂Θ/∂Z,   (11)
respectively. The flux normal to the surface with outward pointing normal n = (nX , nZ )
is given by
 
F = −(∂Θ/∂X) nX + (αΘ − ∂Θ/∂Z) nZ .   (12)
We consider an array of equally spaced irrigation flat channels, each of width 2L
and the spacing is 2D as can be seen in Figure 1. The fluxes on the channels are v0 , and
the fluxes on the soil surface outside the channels are 0. Since the problem is symmetric
about lines X = 0 and X = L+D, there is no flux across these two lines. The boundary
conditions for this problem are [3]
V = v0 , 0 ≤ X ≤ L and Z = 0, (13)
V = 0, L < X ≤ L + D and Z = 0, (14)
∂Θ/∂X = 0, X = 0 and Z ≥ 0,   (15)
∂Θ/∂X = 0, X = L + D and Z ≥ 0,   (16)
and
∂Θ/∂X = 0,  ∂Θ/∂Z = 0,  0 ≤ X ≤ L + D and Z = ∞.   (17)
Using dimensionless variables
x = (α/2) X,   z = (α/2) Z,   Φ = πΘ/(v0 L),
u = (2π/(v0 αL)) U,   v = (2π/(v0 αL)) V,   and   f = (2π/(v0 αL)) F,   (18)
and the transformation
Φ = ez φ, (19)
equation (9) may be written as
∂²φ/∂x² + ∂²φ/∂z² = φ + s(x, z)e^(−z) ,   (20)

where
s(x, z) = (2π/(αL)) · ( lt β*(x, z) / ∫_a^b ∫_0^(zm) β*(x, z) dz dx ) · (Tpot/v0),   (21)
β*(x, z) = (x/xm − 1)(1 − z/zm) e^(−((pZ/Zm)|Z* − 2z/α| + (pX/Xm)|X* − 2x/α|)) ,   0 ≤ x ≤ b,
z ≥ 0,   (22)
and
lt = (α/2) Lt ,   xm = (α/2) Xm ,   zm = (α/2) Zm ,   a = (α/2) L,   b = (α/2)(L + D).   (23)
Boundary conditions (13) to (17) can be written as
∂φ/∂n = 2π/(αL) − φ,   0 ≤ x ≤ (α/2) L and z = 0,   (24)
∂φ/∂n = −φ,   (α/2) L < x ≤ (α/2)(L + D) and z = 0,   (25)
∂φ/∂n = 0,   x = 0 and z ≥ 0,   (26)
∂φ/∂n = 0,   x = (α/2)(L + D) and z ≥ 0,   (27)
and
∂φ/∂n = −φ,   0 ≤ x ≤ (α/2)(L + D) and z = ∞.   (29)
Here, ∂φ/∂n = (∂φ/∂x)nx + (∂φ/∂z)nz is the normal derivative of φ.

4. METHOD OF SOLUTION
According to Ang [1], an integral equation to solve equation (20) is
λ(ξ, η)φ(ξ, η) = ∫∫_R ϕ(x, z; ξ, η)[φ(x, z) + s(x, z)e^(−z)] dx dz
+ ∫_C [ φ(x, z) ∂/∂n (ϕ(x, z; ξ, η)) − ϕ(x, z; ξ, η) ∂/∂n (φ(x, z)) ] ds(x, z),   (30)
where
λ(ξ, η) = 1/2 if (ξ, η) lies on a smooth part of C, and λ(ξ, η) = 1 if (ξ, η) ∈ R,   (31)
and
ϕ(x, z; ξ, η) = (1/(4π)) ln[(x − ξ)² + (z − η)²]   (32)

is the fundamental solution of the Laplace’s equation.


Equation (30) may not be solved analytically, and hence a numerical method is
needed to solve the integral equation approximately. In this paper, we employ the
DRBEM. To apply the DRBEM, the boundary for the problem must be a simple closed
curve. Therefore, in this case, we set the domain to lie between z = 0 and z = c, for a
positive number c such that it is assumed that ∂φ/∂n = −φ at z = c. The boundary is
discretized by a number of line segments joined end to end, and the mid point of every
segment is taken as a collocation point. A number of interior points is chosen, such
that the points are well spaced in the domain. An illustration for these can be seen in
Figure 2.

Figure 2. Plot of line segments and collocation points

Let C(1), C(2), C(3), ..., C(N) be the line segments on the boundary, points (a(1), b(1)),
(a(2), b(2)), ..., (a(N), b(N)) be the midpoints of the segments, and points
(a(N+1), b(N+1)), (a(N+2), b(N+2)), (a(N+3), b(N+3)), ..., (a(N+M), b(N+M)) be the cho-
sen interior points.
The value of φ(x, z) + s(x, z)e−z in (30) may be approximated by
φ(x, z) + s(x, z)e^(−z) ≈ Σ_{i=1}^{N+M} δ(i) ρ(x, z; a(i), b(i)),   (33)

where δ(i) are the coefficients to be determined and
ρ(x, z; a(i), b(i)) = 1 + ((x − a(i))² + (z − b(i))²) + ((x − a(i))² + (z − b(i))²)^(3/2)   (34)
is a radial function on R².

From (33), we obtain

∫∫_R ϕ(x, z; ξ, η)[φ(x, z) + s(x, z)e^(−z)] dx dz ≈ Σ_{i=1}^{N+M} δ(i) Υ(ξ, η; a(i), b(i)),   (35)

where

Υ(ξ, η; a(i), b(i)) = λ(ξ, η)χ(ξ, η; a(i), b(i)) + ∫_C [ ϕ(x, z; ξ, η) ∂/∂n (χ(x, z; a(i), b(i)))
− χ(x, z; a(i), b(i)) ∂/∂n (ϕ(x, z; ξ, η)) ] ds(x, z),   (36)

and

χ(x, z; a(i), b(i)) = (1/4)[(x − a(i))² + (z − b(i))²] + (1/16)[(x − a(i))² + (z − b(i))²]²
+ (1/25)[(x − a(i))² + (z − b(i))²]^(5/2).   (37)

The line integral in (36) can be approximated by

Σ_{j=1}^{N} ∂/∂n (χ(x, z; a(i), b(i)))|_{(x,z)=(a(j),b(j))} ∫_{C(j)} ϕ(x, z; ξ, η) ds(x, z)

+ Σ_{j=1}^{N} χ(a(j), b(j); a(i), b(i)) ∫_{C(j)} ∂/∂n ϕ(x, z; ξ, η) ds(x, z).   (38)

To compute δ(i) in (35), the point (x, z) in (33) is substituted by the collocation
points. From this, we obtain the system of linear equations

φ(a(k), b(k)) + s(a(k), b(k)) e^(−b(k)) = Σ_{i=1}^{N+M} δ(i) ρ(a(k), b(k); a(i), b(i)),
k = 1, 2, ..., N + M.   (39)

The system of linear equations (39) can be inverted, and we obtain

δ(i) = Σ_{k=1}^{N+M} ω(ik) [φ(a(k), b(k)) + s(a(k), b(k)) e^(−b(k))],   i = 1, 2, ..., N + M,   (40)

where

[ω(ik)] = [ρ(a(k), b(k); a(i), b(i))]^(−1).   (41)
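The following is a minimal numerical sketch (assumed layout, not the authors' code) of how the dual reciprocity coefficients δ(i) in (39)–(41) can be computed once the collocation points and the values φ + s e^(−z) at those points are known; the sample points at the end are hypothetical.

```python
import numpy as np

def rho(x, z, a, b):
    """Radial function of equation (34): 1 + r^2 + r^3, r the distance to (a, b)."""
    r2 = (x - a) ** 2 + (z - b) ** 2
    return 1.0 + r2 + r2 ** 1.5

def dual_reciprocity_coefficients(points, rhs):
    """Solve the collocation system (39) for the coefficients delta^(i).

    points : array of shape (N+M, 2), boundary midpoints and interior points
    rhs    : array of shape (N+M,), values phi(a^k, b^k) + s(a^k, b^k) e^{-b^k}
    """
    a, b = points[:, 0], points[:, 1]
    # Collocation matrix [rho(a^k, b^k; a^i, b^i)]; its inverse is [omega^(ik)] of (41).
    R = rho(a[:, None], b[:, None], a[None, :], b[None, :])
    return np.linalg.solve(R, rhs)   # same as omega @ rhs, but better conditioned

# Hypothetical usage with a few arbitrary points:
pts = np.array([[0.1, 0.0], [0.3, 0.0], [0.2, 0.5], [0.2, 1.0]])
delta = dual_reciprocity_coefficients(pts, np.ones(len(pts)))
print(delta)
```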



Now, (ξ, η) in equation (30) is taken to be (a(n), b(n)). The value of φ can be
approximated by

λ(a(n), b(n)) φ(n) = Σ_{k=1}^{N+M} ν(nk) [φ(k) + s(a(k), b(k)) e^(−b(k))]
+ Σ_{m=1}^{N} [φ(k) F1(m)(a(n), b(n)) − p(k) F2(m)(a(n), b(n))],
n = 1, 2, ..., N + M,   (42)
where

ν(nk) = Σ_{i=1}^{N+M} Υ(a(n), b(n); a(i), b(i)) ω(ik),   (43)

F1(m)(a(n), b(n)) = ∫_{C(m)} ϕ(x, z; a(n), b(n)) ds(x, z),   (44)

and

F2(m)(a(n), b(n)) = ∫_{C(m)} ∂/∂n (ϕ(x, z; a(n), b(n))) ds(x, z).   (45)

5. RESULTS AND DISCUSSION


The DRBEM described above is tested on the problem illustrated in Figure 3. The
geometry of the problem after transformation using the dimensionless variables (18)
is shown in Figure 4.

Figure 3. Geometry of the problem (channel flux 15 cm/d, Tpot = 0.4 cm/d, dimensions of 100 cm, α = 0.01 cm−1).
Figure 4. Transformed geometry of the problem.

In the present discussion, the value of potential transpiration rate, Tpot , is set to
0.4 cm/day, which is the same value assumed by Li et al in their experiments [8]. The
value of α is taken to be 0.01 cm−1 , a typical value for homogeneous soil. The domain
in Figure 4 lies between z = 0 and z = 4.
Numerical values of Φ on lines x = l for various values of l are presented in Figures
5, 6, 7, 8, and 9. In order to make comparisons, graphs of Φ for the condition of no
root water uptake process are also given in the same figures accordingly.

Figure 5. Values of Φ along z at x = 0.05 (with and without root water uptake)

Figure 6. Values of Φ along z at x = 0.15 (with and without root water uptake)

Values of Φ for problems involving infiltration with and without root water uptake
process along x = 0.05 and x = 0.15 are respectively shown in Figures 5 and 6. These
two lines are under a channel. Since the lines are under the channel, the maximum
values of Φ for both problems are achieved at z = 0. As z increases the values of Φ
decrease and eventually the values of Φ converge to some constants between 3 and 3.2.
It can be seen that there are decreases in Φ when there is root water extraction.
The drop in Φ increases as z increases from 0 to 0.5, and seems to remain the same for

Figure 7. Values of Φ along x = 0.35 (with and without root water uptake)

Figure 8. Values of Φ along x = 0.45 (with and without root water uptake)

z ≥ 0.5. In other words, the drop seems to increase except in the region deeper than
the depth of the root zone.
Figures 7 and 8 show the values of Φ for problems involving infiltration with and
without root water extraction along x = 0.35 and x = 0.45 respectively. These two lines
are not under the channels. Since there is no flux at the surface of the soil, the minimum
values of Φ for both problems are at z = 0. The values of Φ increase as z increases and
converge to some constants. There are drops in the values of the dimensionless MFP of
infiltration with root water uptake process compared with those of infiltration without
root water uptake extraction. As before, the drop increases in the region shallower than
the depth of the root zone, and seems to be constant in the region deeper than the root
zone.

Figure 9. Values of Φ along x = 0.25 (with and without root water uptake)

From Figure 9, it can be seen that the values of Φ for the two problems, with
and without root water uptake process, are constants except at z = 0. This is probably
because of the singularity at the point (0.25, 0).

6. CONCLUDING REMARKS
A study of a model incorporating a new factor, the root water uptake process,
has been made. A numerical method, the DRBEM, is employed to obtain numerical
solutions of the dimensionless MFP. Numerical solutions of the dimensionless MFP
obtained from this problem are compared with those obtained from a problem involving
infiltration from the same periodic flat channels without root water uptake process.
The results are presented in graphs for some lines along the z axis. They illustrate
the effect of root water extraction on the dimensionless MFP. The root water extraction
process reduces the dimensionless MFP. This implies declines in water contents in the
soil. The results shown are reasonable as roots absorb water from soil, and therefore the
model is acceptable. However, verifying this quantitatively is not a trivial task.

Acknowledgement. Imam Solekhudin wishes to thank the Directorate General of the


Higher Education of the Republic of Indonesia (DIKTI) for providing financial support
for this research.

References
[1] Ang, W. T., A Beginner's Course in Boundary Element Methods, Universal Publishers, Boca Raton,
Florida, 2007.
[2] Azis, M. I., Clements, D. L. and Lobo, M., A Boundary Element Method for Steady Infiltration
from Periodic Channels, ANZIAM J., 44(E), C61 - C78, 2003.

[3] Batu, V., Steady Infiltration from Single and Periodic Strip Sources, Soil Sci. Soc. Am. J. 42,
544 - 549, 1978.
[4] Clements, D. L., Lobo, M. and Widana, N., A Hypersingular Boundary Integral Equation for
a Class of Problems Concerning Infiltration from Periodic Channels, El. J. of Bound. Elem., 5, 1
- 16, 2007.
[5] Feddes, R. A., Kowalik, P. J., Zaradny, H., Simulation of Field Water Use and Crop Yield,
John Wiley & Sons, New York, 1978.
[6] Gardner, W. R., Some Steady State Solutions of the Unsaturated Moisture Flow Equation with
Application to Evaporation from a Water Table, Soil Sci., 85, 228 - 232, 1958.
[7] Hoffman, G. J. and van Genuchten, M. T., Soil Properties and Efficient Water Use, Water
Management for Salinity Control, 73 - 85. In H. M. Taylor et al. (eds) Limitation to Efficient
Water Use in Crop Production. Am. Soc. Agron., Madison, WI, 1983.
[8] Li, K. Y., De Jong, R. and Boisvert J. B., An Exponential Root-Water-Uptake Model with
Water Stress Compensation, J. Hydrol., 252, 189 - 204, 2001.
[9] Lobo, M., Clements, D. L. and Widana, N., Infiltration from Irrigation Channels in a Soil with
Impermeable Inclussion, ANZIAM J., 46(E), C1055 - C1068, 2005.
[10] Philips, J. R., Flow in Porous Media, Annu. Rev. Fluid Mechanics 2, 177 - 204, 1970.
[11] Prasad, R., A Linear Root Water Uptake Model, J. Hydrol. 99, 297 - 306, 1988.
[12] Raats, P. A. C, Steady Flows of Water and Salt in Uniform Soil Profiles with Plant Roots, Soil
Sci. Soc. Am. Proc. 38, 717 - 722, 1974.
[13] Vrugt, J. A., Hopmans, J. W. and Šimunek, J., Calibration of a Two-Dimensional Root Water
Uptake Model, Soil Sci. Soc. Am. J. 65, 1027 - 1037, 2001.
[14] Vrugt, J. A., van Wijk, M. T., Hopmans, J. W. and Šimunek, J., One-, Two-, and Three-
Dimensional Root Water Uptake Functions for Transient Modeling, Water Resources Res. 37,
2457 - 2470, 2001.

Imam Solekhudin
Mathematics and Mathematics Education, National Institute of Education,
Nanyang Technological University, Singapore.
Permanently at Department of Mathematics, Faculty of Mathematics and Natural Sciences,
Gadjah Mada University, Yogyakarta-Indonesia.
e-mail: [email protected]

Keng-Cheng Ang
Mathematics and Mathematics Education, National Institute of Education,
Nanyang Technological University, Singapore.
e-mail: [email protected]
Proceedings of ”The 6th SEAMS-UGM Conference 2011”
Analysis, pp. 309–316.

BOUNDEDNESS OF THE BIMAXIMAL OPERATOR AND


BIFRACTIONAL INTEGRAL OPERATORS IN
GENERALIZED MORREY SPACES

Wono Setya Budhi and Janny Lindiarni

Abstract. In this paper we study about the bimaximal operator as an extension of


the Hardy-Littlewood maximal function for two independent functions. In the first part
of the paper we prove the boundedness of the operator from the cross product of two
generalized Morrey spaces Mp,φ × Mq,φ into another Morrey space Ms,φ2 where p, q be
two real numbers greater than 1, and s be the harmonic means of p and q.
In the second part of the paper we prove the bilinear fractional integral operators from
the cross product of two generalized Morrey spaces Mp,φ × Mq,φ into another Morrey
space Mr,φs/r for a particular real number r.

Keywords and Phrases: bifractional integral operators, generalized Morrey spaces

1. INTRODUCTION
The determinant provides a mechanism for measuring subsets in several dimensions.
For example, a second-order determinant gives a way of computing the measure of a
linear shape in the plane. Also, an operator acting on function spaces may not only
depend on a main variable but also on several other function-variables that are often
treated as parameters.
for example multiplier operators, the Calderon commutators, and the Cauchy integral
along Lipschitz curves. These kind of operators can be treated as multilinear operators.
In this article we study the boundedness of bilinear generalized fractional
integral operators of the form

Iα (f, g)(x) = ∫_{Rⁿ} ( f(x + y) g(x − y) / |y|^(n−α) ) dy

2010 Mathematics Subject Classification: primary 46E30; secondary 43A15


where 0 < α < n [2]. There are many similar discussions of bilinear operators. In 1997,
M.T. Lacey solved the long standing Calderón conjecture on the bilinear Hilbert transform
[3]. Calderón studied the bilinear Hilbert transform, which is intimately related to
Carleson's theorem asserting the pointwise convergence of Fourier series [1]. L. Grafakos
and N. Kalton studied multilinear fractional integral operators in classical Lebesgue
spaces [2].
Recently the study of one variable version of the above operator has been extended
from the classical Lebesgue spaces to Morrey spaces. Let 1 ≤ p < ∞ and 0 ≤ λ ≤ n,
the classical Morrey space Lp,λ = Lp,λ (Rn ) is defined to be the space of all functions
f ∈ Lploc (Rn ) for which
‖f‖_{p,λ} = sup_{x∈Rⁿ, r>0} ( (1/r^λ) ∫_{B(x,r)} |f(y)|^p dy )^(1/p) < ∞

where B(x, r) is the open ball centered at x ∈ Rⁿ with radius r > 0 [5]. Many results
are known about the Morrey space, as well as about the boundedness of the classical
fractional operator for one function. For example, Adams and Chiarenza-Frasca proved that
‖Iα f‖_{q,λ} ≤ C_{p,λ} ‖f‖_{p,λ}
for 1 < p < n/α, 0 ≤ λ < n − αp and 1/q = 1/p − α/(n − λ).
In [5] E. Nakai defined the generalized Morrey spaces. For 1 ≤ p < ∞ and
a suitable function φ : (0, ∞) → (0, ∞), he defined the (generalized) Morrey space
Mp,φ = Mp,φ (Rn ) to be the space of all functions f ∈ Lploc (Rn ) for which
‖f‖_{p,φ} = sup_{x∈Rⁿ, r>0} ( (1/φ(r)) (1/|B(x, r)|) ∫_{B(x,r)} |f(y)|^p dy )^(1/p) < ∞

Notice that for φ (t) = t(λ−n)/p , 0 ≤ λ ≤ n we have Mp,φ = Lp,λ , the classical Morrey
space. The function φ satisfies the two conditions:
(1) There exists C1 such that for all r, s with 1/2 ≤ r/s ≤ 2, we have 1/C1 ≤ φ(r)/φ(s) ≤ C1.
(2) There exists C2 such that ∫_r^∞ (φ^p(t)/t) dt ≤ C2 φ^p(r) for all 1 < p < ∞.
The condition (1) is known as the doubling condition. For more detail about this
information, see the works of H. Gunawan and Eridani [4], and the references therein.
In order to prove boundedness of the fractional integral operators, one usually first proves
boundedness of the Hardy-Littlewood maximal operator for one function, defined by
the formula
M f(x) = sup_{r>0} (1/|B(x, r)|) ∫_{B(x,r)} |f(y)| dy

The boundedness of this operator in the classical Morrey space was proved by Chiarenza-
Frasca, that is for p > 1 and 0 ≤ λ < n the following
kM f kp,λ ≤ Cp,λ kf kp,λ

holds. Then in [5], Nakai proved the extension of this for generalized Morrey space,
that is the inequality

kM f kp,φ ≤ Cp,φ kf kp,φ (1)

holds for 1 < p < ∞.


In this paper, we will prove a similar inequality for the bilinear maximal function
[2]

M(f, g)(x) = sup_{r>0} (1/|B(0, r)|) ∫_{B(0,r)} f(x − y) g(x + y) dy   (2)

and then we will use it to prove the boundedness of Iα (f, g). In the proof we will use
the results of one variable maximal Hardy-Littlewood operators.
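For intuition only (this is not part of the paper), the following rough one-dimensional sketch approximates the bilinear maximal function (2) by averaging f(x − y) g(x + y) over intervals |y| < r and maximising over a finite set of radii; the sample functions and radii are arbitrary.

```python
import numpy as np

def bilinear_maximal(f, g, x, radii, n=2001):
    """Approximate M(f, g)(x) of equation (2) in one dimension."""
    best = 0.0
    for r in radii:
        y = np.linspace(-r, r, n)
        avg = np.trapz(f(x - y) * g(x + y), y) / (2.0 * r)   # |B(0, r)| = 2r in 1-D
        best = max(best, avg)
    return best

f = lambda t: np.exp(-t ** 2)        # two sample nonnegative functions
g = lambda t: 1.0 / (1.0 + t ** 2)
print(bilinear_maximal(f, g, x=0.5, radii=[0.1, 0.5, 1.0, 2.0, 5.0]))
```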

2. BOUNDEDNESS OF THE BILINEAR MAXIMAL FUNCTION


Following the maximal function of a one-variable function, the bilinear maximal
function is defined by (2). In the first part of the paper, we would like to show the boundedness
of M2 (f, g)(x) in Morrey spaces.

Theorem 2.1. Let p, q be the real numbers with p, q > 1 and s be the harmonic mean
of p, q then for f ∈ Mp,φ and g ∈ Mq,φ

kM2 (f, g)ks,φ2 ≤ C kf kp,φ kgkq,φ (3)

Proof. In order to prove the inequality, we may assume that f, g > 0; then we use Holder's
inequality to obtain

(1/|B(0, r)|) ∫_{B(0,r)} f(x − y) g(x + y) dy
≤ (1/|B(0, r)|) ( ∫_{B(x,r)} f(y)^(p/s) dy )^(s/p) ( ∫_{B(x,r)} g(y)^(q/s) dy )^(s/q)
= ( (1/|B(0, r)|) ∫_{B(x,r)} f(y)^(p/s) dy )^(s/p) ( (1/|B(0, r)|) ∫_{B(x,r)} g(y)^(q/s) dy )^(s/q)

Using the definition of the maximal function for one component, we then have a
relation between the bilinear maximal function and the classical Hardy-Littlewood maximal
function. In this case we get

M2 (f, g)(x) ≤ [M1 (f^(p/s))(x)]^(s/p) [M1 (g^(q/s))(x)]^(s/q)

The next step is computing the norm

(1/φ²(r)) (1/|B(x, r)|) ∫_{B(x,r)} M2 (f, g)(x)^s dx
≤ (1/φ²(r)) (1/|B(x, r)|) ∫_{B(x,r)} [M1 (f^(p/s))(x)]^(s²/p) [M1 (g^(q/s))(x)]^(s²/q) dx
≤ ( (1/φ(r)) (1/|B(x, r)|) ∫_{B(x,r)} [M1 (f^(p/s))(x)]^((s²/p)(p/s)) dx )^(s/p)
  × ( (1/φ(r)) (1/|B(x, r)|) ∫_{B(x,r)} [M1 (g^(q/s))(x)]^((s²/q)(q/s)) dx )^(s/q)
= ( (1/φ(r)) (1/|B(x, r)|) ∫_{B(x,r)} [M1 (f^(p/s))(x)]^s dx )^(s/p)
  × ( (1/φ(r)) (1/|B(x, r)|) ∫_{B(x,r)} [M1 (g^(q/s))(x)]^s dx )^(s/q)

With inequality (1) in hand, we can conclude that the inequality (3) holds. 

3. BOUNDEDNESS OF BILINEAR FRACTIONAL INTEGRAL


OPERATOR
In this part, we will show the main result of the paper, that is boundedness of
bifractional integral operators

Theorem 3.1. Let p, q be real numbers greater than 1 and let s be their harmonic
mean, and let φ be a positive function satisfying the doubling condition on its domain
(0, ∞) and also φ(t) ≤ Ct^β for −n/s ≤ β < −α, 1 < s < n/α. Then, for r = 2βs/(α + 2β), the
bifractional integral operators satisfy

kIα (f, g)kr,φ2s/r ≤ Cs,β kf kp,φ kgkq,φ (4)

Proof. Let x ∈ Rn and R > 0 be any real numbers. In the proof, we will write C for
any constant bound. Then for any f ∈ Lp and g ∈ Lq , we can write the operator as

Iα (f, g)(x) = ∫_{|y|<R} ( f(x − y) g(x + y) / |y|^(n−α) ) dy + ∫_{|y|>R} ( f(x − y) g(x + y) / |y|^(n−α) ) dy

For the first integral Iα^(1) (f, g)(x), we can write it as

Iα^(1) (f, g)(x) = ∫_{|y|<R} ( f(x − y) g(x + y) / |y|^(n−α) ) dy
= Σ_{k=−∞}^{−1} ∫_{2^k R ≤ |y| < 2^(k+1) R} ( |f(x − y) g(x + y)| / |y|^(n−α) ) dy
≤ Σ_{k=−∞}^{−1} (2^k R)^(α−n) ∫_{2^k R ≤ |y| < 2^(k+1) R} |f(x − y)| |g(x + y)| dy
≤ 2^n Σ_{k=−∞}^{−1} (2^k R)^α (1/(2^(k+1) R)^n) ∫_{|y|<2^(k+1) R} |f(x − y)| |g(x + y)| dy

The last integral can be written as a maximal function, to get

Iα^(1) (f, g)(x) ≤ C Σ_{k=−∞}^{−1} (2^k R)^α M2 (f, g)(x) ≤ C R^α M2 (f, g)(x)

For the second integral Iα^(2) (f, g)(x), we write it as

Iα^(2) (f, g)(x) = ∫_{|y|>R} ( f(x − y) g(x + y) / |y|^(n−α) ) dy
≤ Σ_{k=0}^{∞} ∫_{2^k R ≤ |y| < 2^(k+1) R} ( |f(x − y)| |g(x + y)| / |y|^(n−α) ) dy
≤ Σ_{k=0}^{∞} (2^k R)^(α−n) ∫_{|y|<2^(k+1) R} |f(x − y)| |g(x + y)| dy

As before, we use Holder's inequality for three functions to obtain

Iα^(2) (f, g)(x) ≤ Σ_{k=0}^{∞} (2^k R)^(α−n) (2^(k+1) R)^(n(1 − 1/p − 1/q)) ×
( ∫_{B(x,2^(k+1) R)} |f(y)|^p dy )^(1/p) ( ∫_{B(x,2^(k+1) R)} |g(y)|^q dy )^(1/q)
≤ C Σ_{k=0}^{∞} (2^k R)^α φ²(2^(k+1) R) ‖f‖_{p,φ} ‖g‖_{q,φ}
= C Σ_{k=0}^{∞} (2^k R)^α φ²(2^k R) ‖f‖_{p,φ} ‖g‖_{q,φ}

In the last line, we use the property that φ is a doubling function. Finally, because of the
order condition on φ, we have

Iα^(2) (f, g)(x) ≤ C ‖f‖_{p,φ} ‖g‖_{q,φ} Σ_{k=0}^{∞} (2^k R)^(α+2β) ≤ C R^(α+2β) ‖f‖_{p,φ} ‖g‖_{q,φ}

if α + 2β < 0. Note that if ‖f‖_{p,φ} = 0 or ‖g‖_{q,φ} = 0, then M(f, g)(x) = 0 and there is nothing
to prove; otherwise we can set R = ( M(f, g)(x) / (‖f‖_{p,φ} ‖g‖_{q,φ}) )^(1/2β). Combining the two inequalities, we have

|Iα (f, g)(x)|
≤ C ( M(f, g)(x) / (‖f‖_{p,φ} ‖g‖_{q,φ}) )^(α/2β) M(f, g)(x)
  + C ‖f‖_{p,φ} ‖g‖_{q,φ} ( M(f, g)(x) / (‖f‖_{p,φ} ‖g‖_{q,φ}) )^((α+2β)/2β)
= C (M(f, g)(x))^((α+2β)/2β) ‖f‖_{p,φ}^(−α/2β) ‖g‖_{q,φ}^(−α/2β)
≤ C (M(f, g)(x))^(s/r) ‖f‖_{p,φ}^(1−s/r) ‖g‖_{q,φ}^(1−s/r)

Now for the norm, we have

∫_{B(x,r)} |Iα (f, g)(x)|^r dx ≤ C ‖f‖_{p,φ}^(r−s) ‖g‖_{q,φ}^(r−s) ∫_{B(x,r)} (M(f, g)(x))^s dx
≤ C ‖f‖_{p,φ}^(r−s) ‖g‖_{q,φ}^(r−s) φ^(2s)(r) |B(x, r)| ‖M(f, g)‖_{s,φ²}^s
≤ C ‖f‖_{p,φ}^r ‖g‖_{q,φ}^r φ^(2s)(r) |B(x, r)|

In the last inequality, we use (3). Finally we have

( (1/φ^(2s/r)(r)) (1/|B(x, r)|) ∫_{B(x,r)} |Iα (f, g)(x)|^r dx )^(1/r) ≤ C ‖f‖_{p,φ} ‖g‖_{q,φ}

and the inequality (4) follows. 

Acknowledgement. Thanks to I. Sihwaningrum and H. Gunawan for introducing the


topic to the first author. Also to ITB for supporting the first author with Research
Grant No. 229/I.1.C01/PL/2011. Finally, my thanks go to the anonymous reviewer for the
comments.

References
[1] A. Calderon, Commutators of singular integral operators, Proc. Natl. Acad. Sci. USA 53 (1965),
1092-1099.
[2] L. Grafakos and N Kalton, Some Remarks on Multilinear Maps and Interpolation, Mathema-
tische Annalen 319 (2001), no. 1, 151–180.
[3] M. Lacey and C Thiele, Lp estimates for the bilinear Hilbert transform, Proc. Natl. Acad. Sci.
USA 94 (1997), 33-35
[4] H. Gunawan and Eridani, Fractional Integrals and Generalized Olsen Inequalities, Kyungpook
Mathematical Journal 49, (2009) 31-39
[5] E. Nakai, Hardy-Littlewood maximal operator, singular integral operators, and the Riesz potentials
on generalized Morrey spaces”, Math. Nachr. 166 (1994), 95-103.

[6] I. Sihwaningrum, Operator Integral Fraksional dan Ruang Morrey Tak Homogen yang Diperumum,
Disertasi, Institut Teknologi Bandung, 2010 (in Indonesian).

Wono Setya Budhi


FMIPA Institut Teknologi Bandung.
e-mail: [email protected]

Janny Lindiarni
FMIPA Institut Teknologi Bandung.
e-mail: [email protected]
Proceedings of ”The 6th SEAMS-UGM Conference 2011”
Applied Mathematics, pp. 317–322.

A LEPSKIJ-TYPE STOPPING-RULE FOR SIMPLIFIED


ITERATIVELY REGULARIZED GAUSS-NEWTON
METHOD

Agah D. Garnadi

Abstract. Iterative regularization methods for nonlinear ill-posed equations of the form
F (a) = y, where F : D(F ) ⊂ X → Y is an operator between Hilbert spaces X and
Y , usually involve calculation of the Fréchet derivatives of F at each iterate and at the
unknown solution a] . A modified form of the generalized Gauss-Newton method which
requires the Fréchet derivative of F only at an initial approximation a0 of the solution
a] was studied by Mahale and Nair [11]. This work studies an a posteriori stopping rule
of Lepskij type for that method. A numerical experiment on an inverse source potential
problem is presented.

Keywords and Phrases: Nonlinear ill-posed problem, a posteriori stopping rule, regular-
ized Gauss-Newton.

1. INTRODUCTION
A nonlinear ill-posed problem is usually posed as a non-linear operator equation
F (a) = y,   F : D(F ) ⊂ X → Y,
between Hilbert spaces X and Y. We assume that F is one-to-one and Fréchet differ-
entiable on its domain D(F ) and denote the derivative at a point a ∈ D(F ) by F 0 [a].
Since it is ill-posed, F does not have a bounded inverse.
In this work we will focus on the simplified iteratively regularized Gauss-Newton
method (sIRGNM) which is a variant of iteratively regularized Gauss-Newton method
(IRGNM), one of the most attractive iterative regularization methods. For an overview
on iterative regularization methods for non-linear ill-posed problem, we refer to the
monograph by Kaltenbacher, Neubauer, and Scherzer [10] or Bakushinsky, Kokurin,
Kokurin, and Smirnova [1]. At the (n + 1)-st iteration of the IRGNM, the iterate aδ(n+1) ∈ X

2010 Mathematics Subject Classification: 47A52, 65J22


is defined as the unique global minimizer of the quadratic functional a 7→ kF 0 [aδn ](a −
aδn ) + F (aδn ) − y δ k2Y + αn k(a − a0 )k2X , n ∈ IN0 . Where a0 ∈ D(F ) is some initial guess,
and αn is a regularization parameter, here we chose αn = α0 q n , for some 0 < q < 1.
The (n + 1)−st iterate aδ(n+1) can be expressed in a closed form
aδ(n+1) := an + (F 0 [aδn ]∗ F 0 [aδn ] + αn I)−1 F 0 [aδn ]∗ (y δ − F (aδn ) + F 0 [aδn ](aδn − a0 )). (1.1)
In a variant of the IRGNM, we approximate F 0 [aδn ] by a fixed linear operator A,
typically by F 0 [a0 ]. Hence, instead of the previous formula, at the (n + 1)-st iteration we use
aδ(n+1) := aδn + (F 0 [a0 ]∗ F 0 [a0 ] + αn I)−1 F 0 [a0 ]∗ (y δ − F (aδn ) + F 0 [a0 ](aδn − a0 )). (1.2)
This variant is called the simplified IRGNM; it is widely used in practice but has weaker
theoretical foundations. Kaltenbacher [9] initiated the study of the method, and it has recently
been studied in detail by Mahale & Nair [11], Jin [8] and George [5].
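The following is a minimal finite-dimensional sketch of one step of the update (1.2), with the derivative frozen at the initial guess a0. The toy operator, data and parameter values are placeholders invented for illustration, not from the paper; only the algebra of the update is shown.

```python
import numpy as np

def sirgnm_step(A, F, y_delta, a_n, a0, alpha_n):
    """a_{n+1} = a_n + (A^T A + alpha_n I)^{-1} A^T (y_delta - F(a_n) + A (a_n - a0))."""
    m = A.shape[1]
    rhs = A.T @ (y_delta - F(a_n) + A @ (a_n - a0))
    return a_n + np.linalg.solve(A.T @ A + alpha_n * np.eye(m), rhs)

# Hypothetical linear toy problem F(a) = A a (so the "frozen" derivative is exact):
A = np.array([[1.0, 2.0], [0.0, 1.0], [1.0, 0.0]])
F = lambda a: A @ a
a0 = np.zeros(2)
y_delta = F(np.array([1.0, -0.5])) + 1e-3   # noisy data
print(sirgnm_step(A, F, y_delta, a0, a0, alpha_n=0.1))
```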
One important issue during the iteration is when to terminate the steps, as the
error kaδn − a] k deteriorates as n → ∞ in the presence of noise. One widely used
rule is the discrepancy principle, in which the iteration is terminated at the first
index N (δ, y δ ) for which the criterion kF (aN ) − y δ k ≤ τ δ is satisfied, with some
parameter τ > 1. In [2, 3] the authors studied a Lepskij-type stopping rule for the IRGNM
with deterministic and random noise; their studies showed, both theoretically and
numerically, that compared to the discrepancy principle the proposed stopping rule yields
results that are at least as good, and in some cases even better.
In this work, we examine this stopping rule for the sIRGNM, filling a gap left
by the work of [2] on the IRGNM and complementing the extensive survey of
Bauer & Lukas [4] on stopping criteria for linear inverse problems.

2. LEPSKIJ STOPPING RULE FOR DETERMINISTIC NOISE.


To analyze the convergence of the sIRGNM, we follow the analysis for the IRGNM
closely. After (n + 1) iterations, the total error en+1 = aδn+1 − a] is decomposed into
three components which are estimated one by one: the (modified) approximation error
eapp_n+1 := αn (F 0 [a0 ]∗ F 0 [a0 ] + αn I)−1 e0 , the (modified) propagated data noise error enoi_n+1 ,
and a non-linearity component.
The non-linearity error is always bounded by the two other error compo-
nents up to an index Kmax , which is known a priori in principle. A 'non-linearity dominance
(blow-up)' may happen after that index. The optimal stopping index N is roughly
situated at the step where eapp_n and enoi_n are of the same order.
The (modified) propagated data noise error in sIRGNM can be bounded a-priori
by
kenoi_n k ≤ δ/(2 √αn ),   (2.1)
which is obtained by carefully studying the works of Mahale & Nair [11], Jin [8] and George [5].
This bound is also the same expression as for the propagated noise error in the IRGNM,
where the rate of decay of the approximation error, eapp_n , depends on the smoothness
of the unknown solution a] , to be precise on the smoothness of (a] − a0 ). With a slight

modification, this is also true for the sIRGNM, by utilizing the properties of F 0 [a0 ], which is
likely known a priori. The essence of the Lepskij stopping rule is to extract information
from the a-priori bound (2.1) to detect the point after which the propagated data error
becomes dominant. In the following theorem, as given in [2], this situation is stated precisely;
as the proof is quite illustrative and short, we reproduce it here.
Theorem 2.1. [2] Let aδn be the sequence of iterates produced by an iterative regulariza-
tion method for an initial guess a0 from some admissible set and data (δ, y δ ) satisfying
uobs := y δ = F (a] ) + δξ. (2.2)
We assume that
• There exists an a-priori known index Kmax = Kmax (δ) such that aδn is well
defined for 0 ≤ n ≤ Kmax .
• There exists an ’optimal’ stopping index N = N (δ, y δ , a] ) ∈ {0, 1, · · · , Kmax },
and a known increasing function Φ : IN0 → [0, ∞) such that
kaδn − a] k ≤ Φ(n)δ, n = N, · · · , Kmax . (2.3)
Then the error at the Lepskij stopping index n∗ = n∗ (δ, y δ ) defined by
n∗ := min{n ∈ {0, · · · , Kmax (δ)} : kaδn − aδm k ≤ 2Φ(m)δ, ∀m = n + 1, · · · , Kmax },
is bounded by
kaδn∗ − a] k ≤ 3Φ(N )δ.

Proof. Since Φ is increasing, we have


kaδm − aδN k ≤ kaδm − a] k + kaδN − a] k ≤ Φ(m)δ + Φ(N )δ ≤ 2Φ(m)δ
for m = N + 1, · · · , Kmax (δ). Consequently, n∗ ≤ N. Therefore,
kaδn∗ − a] k ≤ kaδN − a] k + kaδN − aδn∗ k ≤ Φ(N )δ + 2Φ(N )δ ≤ 3Φ(N )δ,
hence the assertion follows. 
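Using the notation of Theorem 2.1, the stopping index n∗ can be computed directly from the iterates, as in the following sketch (not the authors' code); the function Φ and the dummy iterates at the end are illustrative assumptions.

```python
import numpy as np

def lepskij_index(iterates, Phi, delta):
    """Return n* = min{n : ||a_n - a_m|| <= 2 Phi(m) delta for all m = n+1, ..., Kmax}.

    iterates : list of arrays a_0, ..., a_Kmax
    Phi      : increasing function on the iteration index
    """
    K = len(iterates) - 1
    for n in range(K + 1):
        if all(np.linalg.norm(iterates[n] - iterates[m]) <= 2.0 * Phi(m) * delta
               for m in range(n + 1, K + 1)):
            return n
    return K

# Hypothetical usage with Phi(n) = kappa * alpha_n^{-1/2}, alpha_n = alpha_0 * q**n:
alpha0, q, kappa, delta = 1.0, 0.5, 1.1, 1e-3
Phi = lambda n: kappa / np.sqrt(alpha0 * q ** n)
iterates = [np.array([1.0 / (k + 1), 0.0]) for k in range(8)]   # dummy iterates
print(lepskij_index(iterates, Phi, delta))
```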

Some observations should be made in order to highlight the above theorem.


• The bound (2.1) and the remark following it lead us to choose
Φ(n) := κ αn^(−1/2) , for some constant κ > 1.
• Within the algorithm, the optimal stopping index N never appears explicitly.
In the case of the IRGNM, the explicit appearance of the optimal stopping index
for prescribed data is not necessary, since it is determined by some rule that
depends a priori on the smoothness of a] . The smoothness of the solution with
respect to the smoothing properties of the non-linear operator F is usually expressed
in terms of source conditions
a0 − a] = Λ(F 0 [a] ]∗ F 0 [a] ])w, kwk ≤ ρ.
Analogously to this result for the IRGNM, in the case of the sIRGNM we replace the
above source condition by the modified source condition
a0 − a] = Λ(F 0 [a0 ]∗ F 0 [a0 ])w, kwk ≤ ρ.

• If Λ(t) = tµ , we refer to a Hölder-type source condition; if Λ(t) = (log(t))−p ,
we mean a logarithmic source condition.
• The maximum number of iterations is chosen as Kmax = CF + s logq (δ/α0 ), for some constant
CF ∈ IR, and s = 2 if F satisfies a stronger non-linearity condition.

3. Inverse Source Problem.


We consider the identification of the support of an external force acting over
Ω ⊂ G from measurement of potential flux ∂u/∂n and the potential u on the boundary
Γ = ∂G. The situation can be rearranged such that u|Γ = 0, by subtracting a solution
of the Laplace equation. Then the forward problem is described by the boundary value
problem
∆u = χΩ ,   u = 0 on Γ,
where Ω = supp(H) is the support of the external force H, and χΩ denotes the characteristic
function of Ω, which we assume to be star-shaped with respect to the origin. Then
∂Ω := {q(t)(cos t, sin t) : t ∈ [0, 2π]} for a positive, 2π-periodic function q.
The inverse problem consists in identifying the shape of Ω given the Neumann data ∂u/∂n
of the solution on Γ. Therefore, we define F as the operator mapping q to ∂u/∂n.
In this case we consider the identification of the shape of a unit constant external
force on Ω ⊂ G from measurement of potential flux ∂u/∂n and the potential u on the
boundary Γ = ∂G.
It has been shown in [7] that logarithmic source conditions are equivalent to
smoothness conditions in terms of Sobolev spaces if ∂Ω and Γ are concentric circles.

Numerical Result. We assume that the data are n noisy measurements of g ] =
F (q ] )(t) at equidistant points t_j^(n) := j/n,

Y_j^(n) = g ] (t_j^(n) ) + δξj ,   j = 1, · · · , n,   (3.1)

where δ is the error level, and ξj is randomly generated with kξk ≤ 1.


For the numerical test, we have chosen Γ := {x : |x| = 1}, and we used the exact
solution a] := q ] (t) := 0.5 ∗ (1 + 0.9 cos(t) + 0.1 sin(t))/(1 + 0.75 cos(t)), a bean-shaped
inclusion.
As initial guess we choose a0 := q (0) (t) = 1, a unit circular inclusion, and we fixed
F 0 [a0 ].
We tested the rates of convergence with the balancing principle for the exact
solution a] , by fixing n = 64.
The results in Figure 1 show the Lepskij rule based on the worst-case bound
(2.1) and the predicted optimal index (2.3). The performance of the balancing principle was
tested for κ = 1.1 and κ = 0.3.

Figure 1. Lepskij rule based on the worst-case bound, for κ = 1.1 (top row) and κ = 0.3 (bottom row). (Left column) Lepskij and optimal stopping indices: L2-error versus noise level; (right column) reconstruction results with 5% noise.

Acknowledgement. This work was funded by The Directorate General of Higher
Education, Ministry of Education, The Government of Indonesia, under the staff develop-
ment programme in higher education establishments, years 2008-2010. The support
by the Graduiertenkolleg 1023 'Identification of Mathematical Models' at the Georg-August
University of Goettingen is gratefully acknowledged.

References
[1] Bakushinsky,A.B. and Kokurin,M.Y. and Kokurin,M.I.U. and Smirnova,A., Iterative Meth-
ods for Ill-Posed Problems: An Introduction,(De Gruyter, Berlin, 2010)
[2] Bauer,F. and Hohage,T., A Lepskij-type stopping-Rule for regularized Newton-type methods,
Inv. Problems, 21,1975, 2005.
[3] Bauer,F. and Hohage,T., A Lepskij-type Stopping-Rule for Newton-type Methods with Random
Noise, PAMM, 5,15, 2005.
[4] Bauer,F. and Lukas,M.A., Comparing parameter choice methods for regularization of ill-posed
problems, Mathematics and Computers in Simulation,81(9),1795, 2011.
[5] George,S., On convergence of regularized modified Newton’s method for nonlinear ill-posed prob-
lems, J. Inv. and Ill-posed Prob.,18(2), 133, 2010.
[6] Bauer,F. and Hohage,T. and Munk,A. Iteratively Regularized Gauss-Newton Methods for Non-
linear Inverse Problem with Random Noise, SIAM J. Num.An., 47,1827,2009.
[7] Hohage,T., On the numerical solution of a three dimensional inverse medium scattering prob-
lem,Inv.Problem, 17, 1743, 2001.

[8] Jin, Q.N. On a class of frozen regularized Gauss-Newton methods for nonlinear inverse prob-
lems,Math. Comp., 79, 2191, 2010.
[9] Kaltenbacher,B., A posteriori parameter choice strategies for some Newton type methods for
the regularization of nonlinear ill-posed problems, Num. Math., v 79, 501, 1998.
[10] Kaltenbacher,B. and Neubauer,A. and Scherzer,O., Iterative Regularization Methods for
Nonlinear Ill-Posed Problems, (de Gruyter, Berlin, 2008).
[11] Mahale, P. and Nair,M.T., A simplified generalized Gauss-Newton method for nonlinear ill-
posed problems, Math. Comp., 78, 171-184, 2009.

Agah D. Garnadi
Dept. of Mathematics
Institut Pertanian Bogor.
e-mail: [email protected]
Proceedings of ”The 6th SEAMS-UGM Conference 2011”
Applied Mathematics, pp. 323–330.

ASYMPTOTICALLY AUTONOMOUS SUBSYSTEMS


APPLIED TO THE ANALYSIS OF A TWO-PREDATOR
ONE-PREY POPULATION MODEL

Alexis Erich S. Almocera, Lorna S. Almocera, Polly W. Sy

Abstract. We investigate the limiting behavior of solutions to a population model,


where two competitive predators exploit their consumption of a single renewable prey.
The predator-prey interaction is modeled by a Beddington-DeAngelis functional response.
By analyzing one of the subsystems of that model, we look at how the prey becomes the
sole survivor, given the extinction of both competitors.

Keywords and Phrases: Two-Predator, One-Prey; Beddington-DeAngelis; Limiting Be-


havior; Subsystem; Global Stability.

1. INTRODUCTION
Let i = 1, 2, and s and xi be variables dependent on time t, that denote the
densities of a single renewable prey and the ith exploitative competitor respectively.
These competitors contend by exploiting their consumption of the prey. We study the
population model

ds 1 1
= s (1 − s) − · F1 x 1 − · F2 x2 , s(0) = s0 ,


dt b1 b2

(E)3
dx
 i = Fi xi − di xi ,

 xi (0) = xi0 , i = 1, 2,
dt
where s0 and xi0 are initial densities, ai , bi , di and mi are parameters, and
mi s
Fi = , i = 1, 2,
a i + s + xi

2010 Mathematics Subject Classification: 92D25, 34D05, 34D23


represents the functional response of the predator-prey interaction, which is of the


Beddington-DeAngelis type [1]. For simplicity, we assume that the parameters and the
initial densities are fixed and positive.
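The following is a small numerical sketch of the model (E)3 with the Beddington-DeAngelis responses Fi = mi s/(ai + s + xi); the parameter values and initial densities are illustrative assumptions only, since the paper keeps them general.

```python
import numpy as np
from scipy.integrate import solve_ivp

m = (1.2, 1.0); a = (0.5, 0.4); b = (1.0, 1.0); d = (0.8, 0.7)   # assumed parameters

def rhs(t, u):
    """Right-hand side of (E)_3: prey s and two exploitative competitors x1, x2."""
    s, x1, x2 = u
    F1 = m[0] * s / (a[0] + s + x1)
    F2 = m[1] * s / (a[1] + s + x2)
    ds = s * (1.0 - s) - F1 * x1 / b[0] - F2 * x2 / b[1]
    return [ds, (F1 - d[0]) * x1, (F2 - d[1]) * x2]

sol = solve_ivp(rhs, (0.0, 200.0), [0.5, 0.1, 0.1])
print(sol.y[:, -1])   # densities (s, x1, x2) at the final time
```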
By considering at least one of the species to be absent, the model (E)3 reduces
to a subsystem with a known dynamical behavior. For instance, assuming the absence
of both competitors (x1 = x2 = 0), the model (E)3 becomes the s-subsystem, a logistic
growth model for the prey
ds/dt = s (1 − s) ,   s(0) = s0 ,   (1)
where the carrying capacity K = 1 is locally asymptotically stable. If y(t) is the unique
solution of (1), with y(0) = s0 , then limt→∞ y(t) = K = 1. By analyzing (E)3 with
the subsystem (1), our aim is to investigate on the limiting behavior of the solution
(s(t) , x1 (t) , x2 (t)), given
lim xi (t) = 0, i = 1, 2, (2)
t→∞

that is, both competitors become extinct.


This paper is presented as follows: in Section 2, we present the background ma-
terial, which will be used in our analysis in Section 3. In the analysis, we first look
at points in the vector field of (E)3 before we look at the solution of (E)3 itself. We
summarize our discussion in Section 4.

2. MARKUS’ THEORY OF ASYMPTOTICALLY AUTONOMOUS


DIFFERENTIAL SYSTEMS
Our analysis assumes that some, but not all, of the component functions in the
solution (s(t) , x1 (t) , x2 (t)) have limiting values, thus representing a given biological
behavior. Hsu, Hubbell, and Waltman [2] did the same kind of analysis by making use
of analytical tools developed by Markus [3].
Consider two functions f (x, t) and g(x) that are vector fields of a non-autonomous
and an autonomous system of ordinary differential equations respectively. Letting D
be an open set in Rn , assume that f (x, t) and g(x) are:
• Continuous in (x, t) for all x ∈ D and for sufficiently large t; and,
• Continuously differentiable for all x ∈ D.
Definition 2.1. [3, 4] Given that the convergence f (x, t) → g(x), as t → ∞, is local
uniform at each x ∈ D, we say that the non-autonomous system
dx
= f (x, t) , x ∈ D, (3a)
dt
is asymptotically autonomous in D, with limit system
dx
= g(x) , x ∈ D. (3b)
dt
This is denoted by “ (3a) → (3b) in D”.

The analysis of asymptotically autonomous systems for limiting behavior involves


the omega limit set of a given solution.

Definition 2.2. [5] Let x(t) be a given solution of (3a) defined for all t > 0. We say
that p is an omega limit point of x(t) if there exists a sequence htn i ↑ ∞ such that
hx(tn )i → p. The set of all omega limit points of x(t) is called the omega limit set of
x(t).

If x(t) is forward bounded, that is, x(t) stays inside a fixed compact subset of D
for sufficiently large t, then as one of the main results in [3], the omega limit set of x(t)
is nonempty. Moreover, the following result has been used in many population models
to show that the solution tends to an equilibrium point [4].

Theorem 2.1. (Markus [3, 4]) Let (3a) → (3b) in D, and x(t) be a forward bounded
solution of (3a), with a nonempty omega limit set Ω. Suppose that P is a locally
asymptotically stable fixed point of (3b). If there is an omega limit point y0 ∈ Ω such
that the solution y(t) of (3b), with y(0) = y0 , has limt→∞ y(t) = P , then limt→∞ x(t) =
P.

3. ANALYSIS
Now, we are given equation (2) where limt→∞ xi (t) = 0 for i = 1, 2. Following
Hsu, Hubbell, and Waltman [2], we investigate the limiting behavior of s(t), one of
the coordinates of the solution (s(t) , x1 (t) , x2 (t)). To facilitate our analysis, we let
ε > 0 be arbitrarily small. Then (2) implies that for i = 1, 2, there is a τi > 0 large
enough such that
 
1 bi
xi (t) = |xi (t) − 0| < min , a1 , a2 · ε, for all t > τi , (4a)
2 mi
which yields
bi ai bi
xi (t) < ε, xi (t) < ε, for all t > τ = max(τ1 , τ2 ) . (4b)
2mi mi

3.1. Asymptotically Autonomous Subsystem. Let us consider first the points


(s, x1 , x2 ) of the vector field of (E)3 . To apply Markus’ theory, we treat each xi (t)
as a function that is independent of any value of s and only dependent on its argu-
ment t. This allows us to construct the following functions:
2
X 1 mi s xi (t)
G(s, t) = s (1 − s) − · ,
b ai + s + xi (t)
i=1 i
G∞ (s) = s (1 − s) .
326 A. E. S. Almocera, L.S. Almocera, P.W. Sy


Note that our treatment of xi (t) yields the partial derivative ∂s xi (t) = 0, since
holding t fixed makes xi (t) constant. Moreover, we have the following partial derivatives:
2
∂G X mi (ai + xi (t)) xi (t)
= 1 − 2s − · 2,
∂s b
i=1 i (ai + s + xi (t))
∂G∞
= 1 − 2s.
∂s
If we let D = (0, ∞), so that ai + s + xi (t) > 0, for i = 1, 2, for each s ∈ D, and for all
t > 0, then it follows that G(s, t) and G∞ (s) are:
• Continuous in (s, t) for all s ∈ D and for all t > 0; and,
• Continuously differentiable for all s ∈ D.
Furthermore, G(s, t) and G∞ (s) are respectively the vector fields of the following ordi-
nary differential equations:
2
ds X 1 mi s xi (t)
= s (1 − s) − · , (5a)
dt b ai + s + xi (t)
i=1 i
ds
= s (1 − s) , (5b)
dt
where (5b) corresponds to the s-subsystem (1). The following result enables us to
compare (5a) with (5b).
Lemma 3.1. If equation (2) holds, then (5a) → (5b) in D = (0, ∞).

Proof. Let E be any compact subset of D. Then for i = 1, 2, for all s ∈ E, and
for all t > τ ,

s s
xi (t) > 0, ai + s + xi (t) > 0, ai + s + xi (t) = ai + s + xi (t) < 1.
(6)

Thus, applying (6) for t > τ ,



2
X mi s xi (t)
|G(s, t) − G∞ (s)| = ·

b a + s + x (t)


i=1 i i i
2
X mi s
≤ · · |xi (t)| ,
i=1
b i ai + s + x i (t)
2
X mi
< · |xi (t)| ,
i=1
bi

bi
and by (4b), where |xi (t)| < 2mi ε,

2
X mi bi ε
|G(s, t) − G∞ (s)| < · = ε.
i=1
bi 2mi
Asymptotically Autonomous Subsystems 327

Therefore, |G(s, t) − G∞ (s)| < ε for all time t > τ , and for all s ∈ E. That is,
G(s, t) → G∞ (s) locally uniformly in s ∈ D, as t → ∞. Consequently, we have
(5a) → (5b) in D. 

3.2. Forward Bounded Solution. Now we return to the solution of (E)3 , which
is (s(t) , x1 (t) , x2 (t)). Our next result relates one of the component functions s(t)
with (5a).
Lemma 3.2. If (s(t) , x1 (t) , x2 (t)) is a solution of (E)3 , then s(t) is a forward bounded
solution of (5a) for all time t ≥ 0.

Proof. From the fundamental properties of an initial value problem, the solution
(s(t) , x1 (t) , x2 (t))
exists for all t ≥ 0, where the component function s(t) is:
(1) Continuously differentiable, that is, s(t) and its derivative s0 (t) are both con-
tinuous;
(2) Positive, so that s(t) ∈ (0, ∞) = D; and,
(3) Defined by the equation
2
X 1 mi s(t) xi (t)
s0 (t) = s(t) [1 − s(t)] − · .
b ai + s(t) + xi (t)
i=1 i

With each other coordinate xi (t) assumed to be an explicit function, it follows that
s0 (t) = G(s(t) , t). Therefore, s(t) is a solution of (5a) for all t ≥ 0.
It can be shown that there is a sufficiently large T > τ such that (4b) holds and
s(t) ≤ 1 + ε for all t ≥ T . Assuming further that 4ε  1, this yields,
2
0
X mi s(t) xi (t)
s (t) = s(t) [1 − s(t)] − ·
i=1
bi ai + s(t) + xi (t)
2
X mi
> s(t) [1 − s(t)] − · s(t) · xi (t) , since ai + s(t) + xi (t) > ai ,
a b
i=1 i i
2
X 1 ai bi ai bi
> s(t) [1 − s(t)] − · s(t) , since xi (t) < ε< ,
i=1
4 mi 4mi
 
1 s(t)
= s(t) 1 − 1 .
2 2
Letting sT = s(T ), we have the following
 
0 1 s(t)
s (t) ≥ s(t) 1 − 1 , t ≥ T,
2 2
from which, by the theory of differential inequalities,
  1
1 sT
min sT , ≤ 2  ≤ s(t) ≤ 1 + ε, (7)
sT + 2 − sT exp − 21 (t − T )
1 
2
328 A. E. S. Almocera, L.S. Almocera, P.W. Sy

or s(t) ∈ min sT , 12 , 1 + ε ⊂ (0, ∞) = D, for all t ≥ T . This shows that s(t) is a


  

forward bounded solution of (5a). 

3.3. Limiting Behavior. We are now ready to prove our main result.
Theorem 3.1. Let (s(t) , x1 (t) , x2 (t)) be the solution of (E)3 . If limt→∞ x1 (t) = 0
and limt→∞ x2 (t) = 0, then limt→∞ s(t) = 1.

Proof. Suppose that limt→∞ x1 (t) = 0 and limt→∞ x2 (t) = 0, that is, equation (2)
holds. Then, (5a) → (5b) in (0, ∞) by Lemma 3.1. Since Lemma 3.2 states that s(t)
is a forward bounded solution of (5a), it follows that s(t) has a nonempty omega limit
set Ω. With s(t) differentiable for all t ≥ 0, one can find a sequence htm i ↑ ∞ and
some y0 such that hs(tm )i → y0 and y0 ∈ Ω. Furthermore, with inequality (7) true for
sufficiently large t, we must have y0 > 0.
Since the limiting system (5b) corresponds to the s-subsystem (1), it has a locally
asymptotically stable equilibrium point K = 1, from which the unique solution y(t)
of (5b), with y(0) = y0 , satisfies limt→∞ y(t) = 1. Therefore, limt→∞ s(t) = 1 by
Theorem 2.1. 

4. CONCLUSION
To summarize, we analyzed the limiting behavior of (s(t) , x1 (t) , x2 (t)) and in
particular one of its components s(t), by considering the s-subsystem (1), which is
a logistic growth model. To apply Markus’ theory to (E)3 under the assumption of
equation (2), we constructed a non-autonomous system (5a) that can be compared to
the s-subsystem, such that s(t) is a forward bounded solution of (5a).
Our main result Theorem 3.1 implies that when both competitors do not survive,
the prey saturates. That is, not only the prey becomes the sole survivor, but also, its
density tends to the carrying capacity. The key to this result is the global attractiveness
of the carrying capacity of a logistic growth model, aside from being a locally asymptot-
ically stable equilibrium point. Theorem 3.1 also demonstrates a relationship between
a given system and one of its subsystems, where under (2) the model (E)3 eventually
behaves more like that of (1).

Acknowledgement. The authors would like to thank the Department of Science and
Technology for their financial support of this research, through the Accelerated Science
and Technology Human Resource Development Program (ASTHRDP). The authors are
grateful for the helpful comments of Prof. Jean-Stephane Dhersin, from Université Paris
13, as well as the insights of S. B. Hsu, S. P. Hubbell and P. Waltman on their paper
[2].
Asymptotically Autonomous Subsystems 329

References
[1] DeAngelis, D. L., Goldstein, R. A., and ONeill, R. V., A Model for Tropic Interaction,
Ecology 56, 881-892, 1975.
[2] Hsu, S. B., Hubbell, S. P., and Waltman, P., Competing Predators, SIAM Journal on Applied
Mathematics 35, 617-625, 1978.
[3] Markus, L., Asymptotically autonomous differential systems, in: Contributions to the Theory of
Nonlinear Oscillations, Vol. 3, Princeton University Press, 1956.
[4] Thieme, H. R., Asymptotically Autonomous Differential Equations in the Plane, Rocky Mountain
Journal of Mathematics 24, 351-380, 1993.
[5] Wiggins, S., Introduction to Applied Nonlinear Dynamical Systems and Chaos, Texts in Applied
Mathematics, No. 2, Springer-Verlag, New York, 2003.

Alexis Erich S. Almocera


University of the Philippines, Diliman
e-mail: [email protected]

Lorna S. Almocera
University of the Philippines, Cebu
e-mail: [email protected]

Polly W. Sy
University of the Philippines, Diliman
e-mail: [email protected]
330 A. E. S. Almocera, L.S. Almocera, P.W. Sy
Proceedings of ”The 6th SEAMS-UGM Conference 2011”
Applied Mathematics, pp. 331 - 338.

SEQUENCE ANALYSIS OF DNA H1N1 VIRUS


USING SUPER PAIRWISE ALIGNMENT

ALFI YUSROTIS ZAKIYYAH, M.ISA IRAWAN, MAYA SHOVITRI

Abstract. Sequence analysis is one of methods to assign function, structure evolution and
features from sequence and one of kind this methods is a Super Pairwise Alignment. This
methods was assigned homologous sequences DNA H1N1 Virus.
Keywords and Phrases : DNA, Homologous, Super Pairwise Aligment

1. INTRODUCTION

DNA sequencing methodology was developed in the late 1970 and has become one of
the most widely uses technique in moleculer biology. The importance of this technique is
underlined by the volume research funds now being invested in development of outomated
sequencers and sequence analysis system. Sequence analysis in moleculer biology includes a
very wide range of relevan topics like construction of map, translation, protein analysis,
similarty search, alignment with a similar sequence and submission and retrievel. The most
task sequence analysis is alignment with similar sequence. An optimal alignment is achieved
between two similar sequences (DNA or amino acid) and the percent or similarity calculated
[1].
One of methods in sequence alignment is dynamic programming. The widely used
alignment, dynamic programming though generating optimal alignment, takes too much due
to its high computation complexity O(N2). A majority of sequence alignment software utilizes
dynamic programming, such as global Needleman Wunsch and local alignment use Smith
Waterman [3]. Both methods were a classical algorithm in sequence alignment. Based on the
result of the research of Shen et all, both methods have disadvantage of which one is the
speed of computation. To solve this problem, Shen et all found a new methods that is a Super
Pairwise alignment. This methods combines the analysis of the methods combinatorics and
probability[2,3].
________________________________
2011 Mathematics Subject Classification : Computer Science For Biology and other naturall science

331
332 A.Y. ZAKIYYAH, M . I. IRAWAN, M . SHOVITRI

The management of the new virus can be analyzed by homologous. On the problem of
the identification of disease, DNA virus mutates so that it may give rise to new viruses. The
case H1N1 is one example of mutation. The H1N1 virus mutates quickly enough.
Hemaglutinin virus transmission from aspartic acid (D) to Glynine (G) on the line 222 [4].
This becomes the reference to the writer for further consideration applied method Super
pairwise Alignment to analyze sequence DNA H1N1 virus.

1.1 Super Pairwise Alignment. The mathematical methods used to study mutation and
alignment fall mainly into three groups there are stochastics analysis, modulus structure and
combinatorial graph theory [2]. At first glance, a DNA sequence structure may seem
disorderly and unsystematics and nucleotides at each positions (or a group of positions) are
not fixed. That is to say that biological sequence analysis is stochastics sequence. statistically,
we may find that frequency of observing all molecules or segment changes based o different
dataset of biological sequences. Therefore, we may use stochastics model to describes
biological sequences

1.1.1 Parameter Estimation.The key to solving the uniform alignment of the pairwise
sequences is knowing how to estimate the parameters in the mutation mode T based on
sequence (A,B). then T is a group of statistical parameter and

T iˆk , ˆ k  , k  1, 2,...kˆa 
is a set of statistics determined by (A,B), and estimate of the parameter set T. the vital
problem of uniform alignment of pairwise sequences is the estimate of the parameter in T.
The approach to solving this problem is briefly described below :

a. To estimate the parameters in T alternately, we estimate ( , ), = 1, 2, … , one


after the other, that is, we estimate ( , ) based on ( , ), = 1, 2, … , ′ − 1.
b. To estimate each ( , ), we need not have the entire data of sequence (A,B), but
depends on only part of the data. Therefore, choosing the data to use becomes one of
the most important aspect of the statistical decision algorithm
c. The estimate of the parameter set T includes an estimate of the parameter ka

1.1.2 Algorithm. Let (A,B) be two fixed sequences. This algorithm based on Shen et all
research [2]. Specifically, we first select the importance parameters n, h ,  ,  ',  . Here, n
is selected according to the convergence of the law of large numbers or the central limit
'
theorem. Typically, we choose n = 20,50, 100, 150 etc.  ,  are selected based on the error
rate of mutation and error rate of the indepently random variables. Thus, we choose
0     '  0.75 . For the parameters h, as two local modifications, we choose them to
be proportional to n; typically,    n , h   n , 0   ,   0.5 etc. The SPA
algorithm is described below:

a. Estimate the first mutation position i1 in T. i is position of mutation and  is


length of mutation.
(1). Initialize i = j = 0 and calculate w( A, B; i, j , n ) . If
S e q u e n c e An a l y si s o f D N A H 1 N1 Vi ru s u si n g Su p e r P a i r wi se A l i g n me n t 333

w( A, B; i, j, n)  w   '
then let iˆ1  0 . This means the shifting mutations occurs at the beginning of [1, n].
Otherwise, go to step (2)

(2). In step a, procedure (1), if w   , meaning no shifting mutations occurs in


[1,n], we put the starting point forward and consider i  j  n   . Next we
calculate the corresponding w( A, B; i, j , n ) . If

w( A, B; i, j, n)  w  

Then let w( A, B; i, j, n)  w   and repeat step a procedure (2) until


w( A, B; i, j , n)   . Let k1 be the integer satisfying the following requirement:

w( A, B; i, j, n)  w  

If, i  j  k1 ( n   ) and w( A, B; i, j, n)  w   if i  j  (k1  1)(n   ) .


Proceed to step a procedure (3) or procedure (4)

(3). For i  j  ( k1  1)( n   ) , if w( A, B; i, j , n)  w   ' then set


iˆ1  (k1  1)(n   ) . Otherwise, go to step a procedure (4).

(4). Following step a, procedure (1) - (3), we have   w   ' if


i  j  (k1  1)(n   ) . Therefore, for the same n, compute
h 3 
w '  w( A, B; i  h, j  h, n) . If w '  w , calculate iˆ1  L    w .
w ' w  4 
Otherwise, repeat step a procedures (1) – (4) for the larger h and n, until w '  w

More, through the use of step a, we may estimate iˆ1 of i1

b. Estimate 1 based on the estimation iˆ1 of the first mutation position in T. typically,
w( A, B; iˆ1  , iˆ1 , n) , w( A, B; iˆ1 , iˆ1  , n) ,   1, 2,3,...
If pair (iˆ1  , iˆ1 ) or pair (iˆ1 , iˆ1  ) satisfies w    0.3 or 0.4, where w is it
corresponding sliding window function, then this  is the length of the shifting
mutation, specifically:
- If w( A, B; iˆ1  , iˆ1 , n)   , we note that ˆ1   and we insert  virtual
symbols into sequence B following the positions iˆ1 , while keeping sequence A
invariant
334 A.Y. ZAKIYYAH, M . I. IRAWAN, M . SHOVITRI

- If w( A, B; iˆ1 , iˆ1  , n)   , we note that ˆ1   and we insert  virtual

symbols into sequence A B following the positions iˆ1 , while keeping sequence
B invariant.
Through the use of these two steps we may estimate the local mutations mode
T1   i1 , 1  and its corresponding locally uniform alignment  C1 , D1  . It is
decomposed as follows
C1  C1,1 , A2,1 , D1  D1,1 , B2,1
Denote the length of vector C1,1 and D1,1 by iˆ1  ˆ 1 . Since there is no shifting

mutation occurring in the first n positions of A2,1 , B2,1 , we let


L1  iˆ1  ˆ 1  n be starting point for next alignment.
c. After obtaining the estimation iˆ , ˆ  , we continue to estimate
1 1 i2 based on
 C1 , D1  . We initialize i  j  L1 and we calculate w( A, B; i, j , n) by repeating
step a procedures (1) - (4) to obtain the estimation iˆ2 for i2

d. Estimate ̂ 2 based on the estimation iˆ1 , ˆ 1 , iˆ2 . Here we calculate


w(C1 , D1 ; iˆ2  , iˆ2 , n) , w( A, B; iˆ2 , iˆ2  , n) ,   1, 2,3...
We repeat step b to get ̂ 2 and the local alignment  C2 , D2 

e. 
Continuing the above process, we find the sequence iˆk , ˆ k  and the corresponding
sequence  Ck , Dk  for all k  1, 2,3... . The process will terminate at some k0
.
The process will be terminate at some Ck0  C1,k0 , A2, k0 and Dk0  D1, k0 , B2,k0
have shifting mutation occurring in ( A2,k0 , B2,k0 ) .

1.2. DNA Virus H1N1. In this research, we get sequences of H1N1 from database The
national Center for Biotechnology Information [5]. For preliminary research, we use
ClustalW software to alignment several strain DNA H1N1 virus. The output from this
software is phylogenetics trees which showing the inferred evolutionary relationships among
various biological spesies or other entities based upon similarities and differences in their
physical and genetic characteristics. The result of running the program Clustal W to
determine the phylogenetics tree as follows :
S e q u e n c e An a l y si s o f D N A H 1 N1 Vi ru s u si n g Su p e r P a i r wi se A l i g n me n t 335

gi|284999378|gb|GU576514.1|

gi|284999370|gb|GU576506.1|

gi|283831864|gb|GU451262.1|

gi|283831900|gb|GU451280.1|

gi|284999362|gb|GU576500.1|

gi|284999366|gb|GU576502.1|

gi|284999368|gb|GU576504.1|

gi|284999372|gb|GU576508.1|

gi|284999374|gb|GU576510.1|

gi|284999376|gb|GU576512.1|

Figure 1. Phylogenetic Tree


336 A.Y. ZAKIYYAH, M . I. IRAWAN, M . SHOVITRI

2.SEQUENCE ALIGNMENT DNA H1N1 VIRUS USING SPA

In this section discussion about alignment of several DNA H1N1 virus using Super
Pairwise Aligment (SPA). One of pairwise alignment are DNA GU451262 and GU451280
sequences which each length of character 923 bp and 909 respectively. These DNA taken
from NCBI [5]. For the first step, take local similarity (n) and for this this problem this study

GU4512 G T A G A C A C A G T A C T A G A A A A
62
GU4512 T A G A C A C A G T A C T A G A A A A G
80
'
use value of n0  20 and   0.6 and this study obtain the result of local similarity from
20 character of sequences are :

Figure 2.1 The result of local similarity

From the list 2.1, There are three pair sequences in the same character from 17nd up to 19nd.
 '  0.6 . If the value
Further identify is determined sliding window ( w ) and compared by
'
of w   and its can be assumed the shifting Mutation in i  j  0 Otherwise, meaning
.
no shifting in [1,n] and it can be continued to estimate iˆ . in this case, alignment GU451262
'
dan GU451280, similarity = 3 and w  0.85    0.6 . Second step, after it determine
local mutation, next step is estimate (  ) based on iˆ and criterion of    . The result from
running program is gived in the figure below :

 -5 -4 -3 -2 -1 0 1 2 3 4 5
w 0.7 0.6 0.7 0.6 0.6 0.8 0.0 0.8 0.6 0.6 0.7
5 5 5 5 5 5

Figure 2.2 Determine local mutation

From the list above can be determined the suitable  value. The suitable  =1 and it is
related by w  0    0.4 . This study assign shifting distance   1 and insert one ‘-’
into the first point of second sequence . so the sequence GU451262 and Sequence
GU451280 change into

Figure 2.3 The alignment result from running program


S e q u e n c e An a l y si s o f D N A H 1 N1 Vi ru s u si n g Su p e r P a i r wi se A l i g n me n t 337

In this case, determination one gap in the first position, it doesn’t alignment again
because fom this step it get optimal alignment. The output result are 14 gap and homolog
99,56%. In the other result alignment using BLAST software, it is obtain 98% homolog with
same value of gap. The alignment both sequence, it is related by Clustal W, GU451262 and
GU451280 sequence is located in one subdivision.

List 2.3 Comparing result of SPA with BLAST software

Sequence Alignment BLAST SPA


GU451262.1 =923 bp Similarity : 906 Similarity : 906
GU451280.= 909 bp Homolog : 98% Homolog : 99.56%
GU576500 = 1701 bp
Similarity : 1699 Similarity : 1699
GU576502.= 1701bp
Homolog :99% Homolog : 99,88%
GU576502= 1701 bp
Similarity : 1698 Similarity : 1698
GU576504.= 1701bp
Homolog : 98% Homolog : 85%
GU576510 = 1701 bp
Similarity : 1701 Similarity : 1701
GU576512.= 1701 bp
Homolog :100% Homolog :100%

In the apply algoritm of Super Pairwise Alignment, some parameter need adjustment
as the value of the decision local similarity (n). Inaccuracy when taking value of local
similarity (n) influence in to optimization of the sequence alignment. Taking same value of
the parameter alignment sequences before, it is used to align GU576500 and GU451280 and
it doesn’t obtain optimal alignment. These DNA have each character 1701 bp and 909 bp
respectively. The alignment both sequences can be obtained the result of local similarity
from 20 character of sequences are :

GU57650 A T G A A G G C A A T A C T A G T A G T
0
GU45128 T A G A C A C A G T A C T A G A A A A G
0
Figure 2.4 The result of local similarity

From the Figure 2.4, there are three pair sequences in the same character at 3nd, 4nd and 18nd
'
position. The result of sliding windows w  0.85    0.6 so it can be conclude the
shifting mutation in i = j = 0. Next, this study determinate the value of  . The result of  , it
same with the value of aligment betwen GU451262 and GU251280 that value of
w  0    0.4 . This study know shifting distance   1 and insert one ‘-’ into the first
point of second sequence . By addition with one gap at the first point, aligment both
sequences have 231 same characters. Comparing with the alignment by BLAST software
there is difference. Using BLAST software, there are 97 gap at the first point and the final
338 A.Y. ZAKIYYAH, M . I. IRAWAN, M . SHOVITRI

result of similarity is 907.

3. CONCLUDING REMARK

Super Pairwise Alignment get optimal alignment. This study found that sequences
GU451262 and GU451280 have 99,56% homologous. When both sequences alignt with
BLAST software have result about 85%. This study still faced several problem for example
how to determinate parameter of local similarity. Determination of local similarity influence
the value of optimal alignment.
Acknowledgement. I would like to express my gratitude to Department for
Higher Education with scholarship. It is ver y useful to me to continue my study
and research in Institute Teknologi Sepuluh Nopember (ITS) Surabaya.

References

[1] G IFFIN, H UGH G AND ANNETTE M. GRIFFIN., Computer Analysis of Sequence Data.Humana Press, Totowa,
1994.
[2] PUZELLI, SIMONA. MARCIA FACCHININ, D OMENICO SPAGNOLO, MARIA A.DE MARCO, LAURA C ALZONETTI,
ALESSANDRO ZANETTI, R OBERTO FUMAGGALLI, MARIA L.TANZI, ANTONIO C ASSONE,G IOVANNIE REZZA,
ISABELLA DONATELLI, AND THE SURVEILLANCE GROUP FOR PANDEMIC A (H1N1) 2009., Transmission of
Hemaglutinin D222G Mutant Strain of Pandemic (H1N1) 2009 Virus. Vol t6. No,5 May 2010.
[3] SHEN, SHI Y I N ANKAI AND TUSZYNKI., Theory and Mathematical methods for Bioinformatics. Springer,New
York,2008
[4] SHEN, SHI Y I., J UN Y ANG, ADAM Y AO, PEI I NG H WANG. Super Pairwise Alignment (SPA) :An Efficient
Approach to Global Alignment For Homologous Sequences. Journal of Computational Biology Volume 9,
Number 3,2002@Mary Ann Liebert inc Pp477-486
[5] Database Sequences DNA H1N1 Virus, National Center for Biotechnology Information (2011)
www.ncbi.nlm.nih.gov

ALFI YUSROTIS ZAKIYYAH


Graduate Student of Mathematics Department at Institut Teknologi Sepuluh Nopember.(ITS)
e-mail: [email protected]

M.ISA IRAWAN
Supervisor, Lecturer of Mathematics Department at Institut TeknologiSepuluh Nopember
(ITS)
e-mail: [email protected]

MAYA SHOVITRI
co.Supervisor, Lecturer of Biology Department at Institut Teknologi Sepuluh Nopember
(ITS)
e-mail: [email protected]
Proceedings of The 6th SEAMS-GMU Conference 2011
Applied Mathematics, pp. 339–346.

OPTIMIZATION PROBLEM IN INVERTED PENDULUM


SYSTEM WITH OBLIQUE TRACK

Bambang Edisusanto, Toni Bakhtiar, Ali Kusnanto

Abstract. This paper studies an optimization problem, i.e., the optimal tracking error
control problem, on an inverted pendulum model with oblique track. We characterize
the minimum tracking error in term of pendulum’s parameters. Particularly, we derive
the closed form expression for the pendulum length which gives minimum error. It is
shown that the minimum error can always be accomplished as long as the ratio between
the mass of the pendulum and that of the cart satisfies a certain constancy, regardless
the type of material we use for the pendulum.

Keywords and Phrases: inverted pendulum, tracking error, optimal pendulum length.

1. INTRODUCTION
Direct pendulum as well as inverted pendulum models are important devices in
supporting education and research activities in the field of control system as they have
distinct characteristics such as nonlinear and unstable systems thus can be linearized
around fixed points, its complexity can be modified, and they can easily be applied in
actual systems. In the field of engineering, direct and inverted pendulums are utilized
to monitor displacement of foundation of structures such as dam, bridge, and pier.
Cranes work based on pendulum principles. In geology, inverted pendulum system aids
us in detecting seismic noise due to macro-seismic, oceanic, and atmospheric activities
[11]. In physiology we may employ the pendulum laws to study the human balancing
[8, 9, 10]. The theoretical studies of pendulum systems are some. An analytical treat-
ment of the stability problem in the context of delayed feedback control of the inverted
pendulum can be found in [1], while a discussion on the limitations of controlling an
inverted pendulum system in term of the Poisson integral formula and the complemen-
tary sensitivity integral is presented in [13]. Further, an H2 control performance limits

2010 Mathematics Subject Classification: 34H05, 49K05.

339
340 B. Edisusanto, T. Bakhtiar , A. Kusnanto

Figure 1. Inverted pendulum system with oblique track.

of single-input multiple-output system applied to a class of physical systems including


cranes and inverted pendulums is provided by [6]. The performance limits of pendulum
systems are characterized by unstable zeros/poles location. Recent theoretical study on
factors limiting controlling of an inverted pendulum is carried-out in [7]. Calculation
based on symbolic programming is performed to determine the admissible pendulum
angle.
This paper examines an optimization problem on inverted pendulum systems
with oblique track. We consider an optimal tracking error control problem, where our
primary objective is to identify parameters which affect the stability of the pendulum
system. Particularly, we aim to determine the optimal pendulum length which provides
minimal tracking error. Our study is facilitated by the availability of analytical closed-
form expression on optimal tracking error solution provided by previous researches
[2, 3, 4, 5].
The rest of this paper is organized as follows. In Section 2, we consider the inverted
pendulum system with oblique track including the equations of motion. Sections 3
briefly describes the tracking error control problem, in which the optimization problem
is formulated. In Section 4 we apply the tracking error control problem into pendulum
system and derive the optimal pendulum length which provides the lowest possible
tracking error. We conclude in Section 5.

2. INVERTED PENDULUM SYSTEM


In this work we consider an inverted pendulum system as shown in Figure 1,
where an inverted pendulum is mounted on a motor driven-cart. We assume that the
pendulum moves only in the vertical plane, i.e., two dimensional control problem, on
an oblique track of elevation α. We denote respectively by M , m, and 2`, the mass of
the cart, the mass of the pendulum, and the length of the pendulum. Friction between
the track and the cart is denoted by µ and that between the cart and the pendulum by
Optimization Problem in Inverted Pendulum System 341

η. We consider an uniform pendulum so that its inertia is given by I = 13 m`2 . Position


and angle displacement of the pendulum are denoted by x and θ, respectively, and the
force to the cart as the control variable by u. The coefficient of gravity is denoted by g.
Based on the law of energy conservation, the equations of motion of the pendulum
can be written in the following nonlinear model:
(M + m)ẍ + m`θ̈ cos(θ − α) − m`θ̇2 sin(θ − α) + µẋ = u − (M + m)g sin α,
4 2
3 m` θ̈ + η θ̇ + m`ẍ cos(θ − α) − mg` sin(θ − α) = 0.
To linearize the model, we assume that there exists a small angle displacement of
the pendulum, i.e., θ is small enough and thus sin θ ≈ θ, cos θ ≈ 1, θ̇2 θ ≈ 0, and ẍθ ≈ 0.
Additionally, we also assume that the cart starts in motionless from the origin as well
as the pendulum, i.e., it is supposed that x(0) = 0, ẋ(0) = 0, θ(0) = 0, and θ̇(0) = 0.
The following linear model is then obtained:
(M + m)ẍ + m`θ̈ cos α + µẋ = u − (M + m)g sin α,
4 2
3 m` θ̈ + η θ̇ + m`ẍ cos α − mg`θ cos α + mg` sin α = 0.
For simplicity in the further analysis, we shall assume that there are no frictions,
i.e., µ = η = 0. Thus, the application of Laplace transform enables us to convert the
model from time domain into frequency domain as follows:
4 2 2
3 m` s − mg` cos α
Px (s) = , (1)
[ 43 (M + m)m`2 − m ` cos2 α]s4 − [(M
2 2 + m)mg` cos α]s2
−m` cos α
Pθ (s) = . (2)
[ 43 (M + m)m`2 − m2 `2 cos2 α]s2 − (M + m)mg` cos α
Plants Px in (1) and Pθ in (2) represent the transfer functions from force input u to the
cart position x and the pendulum angle θ, respectively.

3. OPTIMAL TRACKING ERROR CONTROL PROBLEM


The considered inverted pendulum system can be represented in frequency domain
as a simple feedback control system in Figure 2, where P is the plant to be controlled,
K is the controller to be designed such that producing a certain control action, F is a
sensor to measure the system output y which feeds-back to the system, r is the reference
signal, d is the disturbance which exogenously enters the system, and e is the tracking
error between reference input and sensor output, i.e., e = r − F y.
A number z ∈ C is said to be zero of P if P (z) = 0 holds. In addition, if z is lying
in C+ , i.e., right half plane, then z is said to be a non-minimum phase zero. P is said
to be minimum phase if it has no non-minimum phase zero; otherwise, it is said to be
non-minimum phase. A number p ∈ C is said to be a pole of P if P (p) is unbounded. If
p is lying in C+ , then p is an unstable pole of P . We say P is stable if it has no unstable
pole; otherwise, unstable.
In classic paradigm, the central problem of a feedback system is to manipulate
control input u or equivalently to design a controller K which stabilizes the system.
342 B. Edisusanto, T. Bakhtiar , A. Kusnanto

Figure 2. Feedback control system.

If the stability control problem is carried-out under the constraint of minimizing the
tracking error then it refers to the optimal tracking error control problem. Formulating
in time domain, the problem is to achieve minimal tracking error E ∗ , where
Z ∞
E ∗ := inf |e(t)|2 dt (3)
K∈K 0

with K is a set of all stabilizing controllers. In modern paradigm, however, the primary
interest is not on how to find the optimal controller, which commonly represented
as Youla’s parameterizations [12]. Rather, we are interesting in relating the optimal
performance with some simple characteristics of the plant to be controlled. In other
words, we provide the analytical closed-form expressions of the optimal performance in
terms of dynamics and structure of the plant [3, 4, 5, 6]. From (1) and (2) we may
construct a single-input and single-output (SISO) plant by selecting either P = Px or
P = Pθ , or alternatively a single-input and two-output (SITO) plant by selecting both
plants, i.e., P = (Px , Pθ )T .
Theorem 3.1. Let P be an SISO plant which has non-minimum phase zeros zi (i =
1, . . . , nz ) and unstable poles pj (k = 1, . . . , np ). Then the analytical closed-form expres-
sion of (3) is given by
nz np

X 2 Re zi X 4 Re pj Re pk (1 − φ(p̄j ))(1 − φ(pk ))
E = + ,
i=1
|zi |2 (p̄j + pk )p̄j pk σ̄j σk
k,`=1

where
nz
Y zi + s
φ(s) := ,
i=1
z̄i − s

 1Y ; np = 1
σj := pk − pj
; np ≥ 2.
 p̄k + pj
k6=j

Theorem 3.1 shows that the minimum tracking error is mainly determined by
non-minimum phase zeros and unstable poles of the plant. In particular, it is clear
that non-minimum phase zeros close to the imaginary axis contribute more detrimental
effect. Moreover, unstable poles and unstable zeros close each other will deteriorate the
minimum tracking error as revealed by following corollaries.
Optimization Problem in Inverted Pendulum System 343

Corollary 3.1. If P has only one non-minimum phase zero z and unstable pole p, both
are real, then
2 8p
E∗ = + .
z (z − p)2
Corollary 3.2. If P has only one non-minimum phase zero z and two unstable poles
p1 and p2 , then
 
2 8(p1 + p2 ) p1 (p1 + p2 ) 2p1 p2 p2 (p1 + p2 )
E∗ = + − + .
z (p1 − p2 )2 (z − p1 )2 (z − p1 )(z − p2 ) (z − p2 )2

4. OPTIMAL PENDULUM LENGTH


We focus our analysis in controlling the cart position only. In other words we
consider only an SISO plant Px in (1). It is easy to verify that Px has one non-minimum
phase zero z and one unstable pole p as follows:
r
3g cos α
z = , (4)
4`
s
3g(M + m) cos α
p = . (5)
`[4(M + m) − 3m cos2 α]
From the perspective of modern paradigm, the minimal tracking error of inverted
pendulum system with oblique track can explicitly be expressed in term of pendulum
parameters by substituting (4) and (5) into Corollary 3.1:
q √ 2
M + m − 34 m cos2 α + M + m
s
∗ `
E =4 q

 , (6)
3g cos α M + m − 3 m cos2 α − M + m
4

By imposing a simple differential calculus on (6) we determine the length which provides
the lowest possible tracking error

M (3 cos2 α − 8 + 1024 − 768 cos2 α + 9 cos4 α)
`∗ = , (7)
(8 − 6 cos2 α)ϕ
where ϕ is the ”length density” constant which represents the ratio between mass and
length of the pendulum, i.e., ϕ := m/`. We can see from (7) that the optimal length
can be reduced by decreasing the mass of the cart or by selecting the material of the
pendulum with bigger length density.
By reformulating (7) we may have

m 3 cos2 α − 8 + 1024 − 768 cos2 α + 9 cos4 α
= , (8)
M 8 − 6 cos2 α
which suggests that, for a given elevation α, the minimum error can always be ac-
complished as long as the ratio between the mass of the pendulum and that of the cart
344 B. Edisusanto, T. Bakhtiar , A. Kusnanto

80
α = 0o
70 α = 15o
α = 30o
60

50

tracking error 40

30

20

10

0
0 0.5 1 1.5 2 2.5 3
pendulum length (m)

Figure 3. The minimum tracking error with respect to the pendulum


length and the track elevation.

satisfies a certain constancy in the right-hand side of (8), regardless the type of material
we use for the pendulum. In particular, for α = 0 we have a kind of magic number

m 265 − 5
= .
M 2
To illustrate our result, we consider a rod-shaped pendulum made from platina
mounted on a cart of weight 1 kg. We assume that the base of pendulum is fixed at
radius of 1 cm and the length is varied. We may find that the pendulum has a length
density of 3.3677 kg/m. Figure 3 depicts the minimum tracking error calculated in
(6) with respect to the pendulum length ` and the track elevation α. It is endorsed
that more elevation needs more control effort. Figure 4 plots the relation between the
track elevation and the optimal pendulum length based on (7). It is shown that `∗ is
a decreasing function of α, indicates that more elevation, and thus more control effort,
can be compensated by selecting a shorter pendulum.

5. CONCLUSION
We have examined a simple but interesting optimization problem that arises in
the field of control engineering. In the perspective of tracking error control problem of a
pendulum, it has been shown that the lowest possible tracking error is solely dependent
on the pendulum parameters. In particular, we provide the analytical closed-form ex-
pression of the optimal pendulum length. The approach adopted in this paper, however,
enables us to design an apparatus that optimally accomplishes a certain objective.
Optimization Problem in Inverted Pendulum System 345

1.8

1.7

1.6

optimal pendulum length (m)


1.5

1.4

1.3

1.2

1.1

0.9

0.8
0 10 20 30 40 50 60 70 80 90
track elevation (deg)

Figure 4. The optimal pendulum length with respect to the track elevation.

References
[1] Atay, F.M., ”Balancing the Inverted Pendulum Using Position Feedback,” Applied Mathematics
Letters, 12, 51–56, 1999.
[2] Chen, G., Chen, J., and Middleton, R., ”Best Tracking and Regulation Performance under
Control Energy Constraint,” IEEE Transactions on Automatic Control, 48:8, 1320–1336, 2003.
[3] Chen, J., Hara, S., and Chen, G., ”Optimal Tracking Performance for SIMO systems,” IEEE
Transactions on Automatic Control, 47:10, 1770–1775, 2002.
[4] Chen, J., Qiu, L., and Toker, O., ”Limitations on Maximal Tracking Accuracy,” IEEE Trans-
actions on Automatic Control, 45:22, 326–331, 2000.
[5] Hara, S., Bakhtiar, T., and Kanno, M., ”The Best Achievable H2 Tracking Performances for
SIMO Feedback Control Systems,” Journal of Control Science and Engineering, 2007, 2007.
[6] Hara, S. and Kogure, C., ”Relationship between H2 Control Performance Limits and RHP
Pole/Zero Locations,” Proceedings of the 2003 SICE Annual Conference, Fukui, Japan, 1242–
1246, 2003.
[7] Lazar, T. and Pastor, P., ”Factors Limiting Controlling of an Inverted Pendulum,” Acta Poly-
technica Hungarica, 8:4, 23–34, 2011.
[8] Loram, I.D. and Lakie, M., ”Human Balancing of an Inverted Pendulum: Position Control by
Small, Ballistic-Like, Throw and Catch Movements,” Journal of Physiology, 540:3, 1111–1124,
2002.
[9] Loram, I.D., Gawthrop, P.J. and Lakie, M., ”The Frequency of Human, Manual Adjustments
in Balancing an Inverted Pendulum is Constrained by Intrinsic Physiological Factors,” Journal of
Physiology, 577:1, 417–432, 2006.
[10] Loram, I.D., Kelly, S.M., and Lakie, M., ”Human Balancing of an Inverted Pendulum: Is Sway
Size Controlled by Ankle Impedance?” Journal of Physiology, 532:3, 879–891, 2001.
[11] Taurasi, I., Inverted Pendulum Studies for Seismic Attenuation, SURF Final Report LIGO
T060048-00-R, California Institute of Technology, USA, 2005.
[12] Vidyasagar, M., Control System Synthesis: A Factorization Approach, MIT Press, Cambridge,
MA, 1985.
346 B. Edisusanto, T. Bakhtiar , A. Kusnanto

[13] Woodyatt, A.R., Middleton, R.H., and Freudenberg, J.S., Fundamental Constraints for the
Inverted Pendulum Problem, Technical Report EE9716, Department of Electrical and Computer
Engineering, the University of Newcastle, Australia, 1997.

Bambang Edisusanto
Madrasah Tsanawiyah Negeri Pakem,
Jl. Cepet Purwobinangun (PKM), Sleman 55582, Yogyakarta.
e-mail: [email protected]

Toni Bakhtiar
Departemen Matematika, Institut Pertanian Bogor,
Jl. Meranti, Kampus IPB Darmaga, Bogor 16880.
e-mail: [email protected]

Ali Kusnanto
Departemen Matematika, Institut Pertanian Bogor,
Jl. Meranti, Kampus IPB Darmaga, Bogor 16880.
e-mail: [email protected]
Proceedings of ”The 6th SEAMS-UGM Conference 2011”
Applied Mathematics, pp. 347–364.

EXISTENCE OF TRAVELING WAVE SOLUTIONS FOR


TIME-DELAYED LATTICE REACTION-DIFFUSION
SYSTEMS

Cheng-Hsiung Hsu, Jian-Jhong Lin, Ting-Hui Yang

Abstract. In this work the existence of traveling wave solutions of some time-delayed
lattice reaction-diffusion systems is studied. Employing iterative method coupled with
the explicit construction of upper and lower solutions in the theory of quasi-monotone
dynamical systems, we obtain a critical speed, c∗ , and show the existence of traveling
wave solutions connecting the trivial solution and the coexistnece state when the wave
speed c is larger than c∗ .

Keywords and Phrases: Traveling Wave Solution, Delayed LDE, Upper And Lower So-
lutions.

1. INTRODUCTION
The purpose of this work is to investigate the existence of traveling wave solutions
for the following time-delayed lattice reaction-diffusion systems:
d  
un,i (t) = dn un,i−1 (t) − 2un,i (t) + un,i+1 (t) + un,i fn (ui )t (−τn ) , (1)
dt
where t ∈ R, dn > 0, un,i ∈ C 1 (R, R), fn ∈ C 1 (RN , R), τn = (τn,1 , · · · , τn,N ) for some
nonnegative constants τn,1 , · · · , τn,N and

ui )t (−τn ) = (u1,i (t − τn,1 ), · · · , uN,i (t − τn,N ) ,
for i ∈ Z and 1 ≤ n ≤ N . We assume fn (0, · · · , 0) > 0 and there exists positive
numbers k1 , · · · , kN such that fn (k1 , · · · , kN ) = 0 for each n. Then it is obvious that
0 := (0, · · · , 0) and K := (k1 , · · · , kN ) are equilibria of systems (1).

2010 Mathematics Subject Classification: Primary: 34A33, 34C37, 34K10, 35C07; Secondary: 92B20.

347
348 Cheng-Hsiung Hsu, Jian-Jhong Lin, Ting-Hui Yang

The systems of lattice differential equations (1) can be seen as the discrete version
of the following time-delayed reaction-diffusion systems:
∂ 
un (x, t) = dn ∆un (x, t) + un (x, t)fn u1 (x, t − τn,1 ), · · · , uN (x, t − τn,N ) . (2)
∂t
The systems (1) or (2) describe the dynamical interaction of N species distributed in
the one-dimensional integer lattice Z1 or R1 respectively. In recent years, the study of
the existence of traveling wave solutions for systems (1) and (2) has attracted a lot of
attention. For the related results of lattice differential equations, we refer the readers
to Bates et. al. [1], Chen et. al. [2], Chow at. al. [3], Hsu et. al. [4, 5, 6], Huang et.
al. [7], Keener [8], Ma et. al. [11], Mallet-Paret [12], Wu et. al. [13, 14] and Zinner et.
al. [16, 15]; and the references cited therein.
Motivated by the works of Lin et. al. [10] and Hsu et. al. [6], we extend the
results of Lin et. al. [10] to the lattice systems (1) in this work. A traveling wave
solution of systems (1) is a solution of the form
un,i (t) = φn (i + ct), for all i ∈ Z; t ∈ R; 1 ≤ n ≤ N,
where each φn ∈ C 1 (R, R) and c ∈ R is called the wave speed. Under the moving
coordinate s = i + ct, we then have the profile equation,
 
cφ0n (s) = dn φn (s − 1) − 2φn (s) + φn (s + 1) + φn (s)fn Φs (−cτn ) , (3)
for 1 ≤ n ≤ N and Φs (−cτn ) = (φ1 (s − cτn,1 ), · · · , φN (s − cτn,N )). Our purpose is
to find positive traveling solutions of (1) connecting the equilibria 0 = (0, · · · , 0) and
K = (k1 , · · · , kN ), i.e., the function φn is positive for all 1 ≤ n ≤ N , and satisfy the
following asymptotically boundary conditions,
lim (φ1 (s), · · · , φN (s)) = 0 and lim (φ1 (s), · · · , φN (s)) = K. (4)
s→−∞ s→∞

We make the following assumptions:


∂fn (X)
(A1) < 0 for all 1 ≤ n ≤ N and X ∈ RN .
∂xn
(A2) Given any 1 ≤ n, j ≤ N with n 6= j, we have either
∂fn (X) ∂fn (X)
≥ 0 for all X ∈ RN or ≤ 0 for all X ∈ RN .
∂xj ∂xj
(A3) For any 1 ≤ n ≤ N, we have
∂fn (X) X ∂fn (X) X ∂fn (X)
kn + kj < kj ,
∂xn +
∂xj −
∂xj
j∈In j∈In

for all X ∈ [0, 3k1 ] × · · · × [0, 3kN ], where


∂fn
In+ = {1 ≤ j ≤ N | j 6= n and ≥ 0},
∂xj
∂fn
In− = {1 ≤ j ≤ N | j =
6 n and ≤ 0}. (5)
∂xj
Traveling Wave Solutions for Time-Delayed Lattice Reaction-Diffusion Systems 349

(A4) Assume that the set


N
∩ (λn,1 (c), min {λn,1 (c) + λj,1 (c), λn,2 (c)})
n=1 1≤j≤N

is not empty, where λn,1 (c) and λn,2 (c) are roots of the following function
∆n (λ, c) = −cλ + dn (e−λ + eλ − 2) + fn (0, · · · , 0).
Basing on the assumptions (A1), (A2) and (A3), we will show that the functions
f1 , · · · , fN satisfy the special condition called (EMQM). Then we can use the assump-
tion (A4) and apply the iteration method to obtain the traveling wave solutions of
systems (1). Our main results are stated as follows.
Theorem 1.1. Assume (A1)∼(A4) hold. Then there exist a positive number c∗ such
that if c > c∗ then there is a δ > 0 such that equation (3) has positive solutions satisfying
(4) when max{τn,n | 1 ≤ n ≤ N } < δ.
Note that although we apply the techniques similar to those in Lin et. al. [10],
there are some differences. First, the authors in the paper establish three pairs of
upper-lower solutions for three specific models to derive the existence of traveling wave
solutions, respectively. Compared to the results of the paper, we establish the pair
of upper-lower solutions of systems (1) explicitly and generally. Then we obtain the
existence of traveling wave solutions for more general models. Moreover, our results
also generalize the results in Hsu et.al. [6].
The remainder of this paper is organized as follows. In Section 2, we introduce
some definitions as well as notations, and show that (1) of condition (EMQM) holds
under some sufficient conditions. Next, we define the solution operator for equation
(3) and examine its properties in Section 3. In Section 4, we introduce the character-
istic function for equation (3) and use its roots to establish the upper-lower solutions.
According to the results in Sections 2, 3 and 4, we then use the iteration scheme and
Schauder’s fixed point theorem to prove our main results in Section 5. In the final sec-
tion, we illustrate some well-known models and using our results to derive the existence
of traveling wave solutions.

2. PRELIMINARIES
In this section, we will introduce some notations, a terminology and a lemma
which will be used in the proof of the main theorem. First of all, some definitions are
given as follows.

Definition 2.1. Let Φ = (φ1 , · · · , φN ), Ψ = (ψ1 , · · · , ψN ) ∈ C(R, RN ).


(1) The notation Φ  Ψ means φn (s) ≤ ψn (s) for all s ∈ R and 1 ≤ n ≤ N.
(2) Let Cb (R, RN ) and C3K (R, RN ) be the spaces defined by
Cb (R, RN ) := {Φ | Φ ∈ C(R, RN ) is bounded and uniformly continuous},
C3K (R, RN ) := {Φ | Φ ∈ Cb (R, RN ) with 0  Φ  3K},
350 Cheng-Hsiung Hsu, Jian-Jhong Lin, Ting-Hui Yang

where 3K = (3k1 , · · · , 3kN ). Then Cb (R, RN ) is a Banach space with the norm
kΦk := sups∈R;1≤n≤N |φn (s)|.
(3) Let Φ̂ = (φ̂1 , · · · , φ̂N ) and Φ̌ = (φ̌1 , · · · , φ̌N ) be two functions in C3K (R, RN )
which are continuously differentiable for all except finite t. They are called a pair
of upper-lower solutions of (3), respectively, if Φ̌  Φ̂ and satisfy the following
inequalities for all except finite t:
− cφ̌0n (s) + dn (φ̌n (s − 1) − 2φ̌n (s) + φ̌n (s + 1)) + φ̌n (s)fn (Φs (−cτn )) ≥ 0, (6)
and
− cφ̂0n (s) + dn (φ̂n (s − 1) − 2φ̂n (s) + φ̂n (s + 1)) + φ̂n (s)fn (Ψs (−cτn )) ≤ 0, (7)

where Φ(s) = (φ1 (s), · · · , φN (s)), Ψ(s) = (ψ1 (s), · · · , ψN (s)),


 
 ∂fn (X) 
 ∂fn (X)

 φ̂ j (s), if ≤ 0, 
 φ̂j (s), if ≥ 0,

 ∂x j 
 ∂xj
φj (s) = φ̌j (s), if j = n, ψj (t) = φ̂j (s), if j = n,

 


 ∂fn (X) 
 ∂fn (X)
 φ̌j (s), if ≥ 0, 
 φ̌j (s), if ≤ 0.
∂xj ∂xj

(4) Let Φ̂ and Φ̌ be a pair of upper-lower solutions of (3). We define Γ(Φ̌, Φ̂) as
the set of all functions Φ ∈ C3K (R, RN ) satisfying Φ̌  Φ  Φ̂ such that

eβn s [φ̂n (s) − φn (s)] and eβn s [φn (s) − φ̌n (s)]
for some βn > 0, 1 ≤ n ≤ N , are nondecreasing for all s ∈ R.

Definition 2.2. The functions f1 , · · · , fN of systems (1) are said satisfying conditions
(EMQM) if the following conditions hold:
(1) There exist positive real numbers, β1 , · · · , βN , such that given any 1 ≤ n ≤ N
and s ∈ R,
ψn (s)fn (Φ̄s (−cτn )) − φn (s)fn (Φs (−cτn )) + (cβn − 2dn )(ψn (s) − φn (s)) ≥ 0,
for all Φ̄ = (φ1 , · · · , ψn , · · · , φN ), Φ = (φ1 , · · · , φN ) ∈ C 1 (R, RN ) with 0 ≤
φn ≤ ψn ≤ 3kn , 0 ≤ φj ≤ 3kj for j ∈ {1, · · · , N }\{n} and eβn s (ψn (s) − φn (s))
is nondecreasing for s ∈ R.
(2) fn (x1 , · · · , xN ) is monotone with respect to xj for 1 ≤ n, j ≤ N with n 6= j.

Next, we show that part (1) of condition (EMQM) holds when all the delay times
τn,n are small enough.
Lemma 2.1. Let c > 0 be fixed. There exists some δ > 0 such that if τn,n < δ for all
1 ≤ n ≤ N, then (1) of condition (EMQM) holds.
Traveling Wave Solutions for Time-Delayed Lattice Reaction-Diffusion Systems 351

3. SOME PROPERTIES OF SOLUTIONS OPERATOR


In this section, we define the solutions operator of equation (3) and investigate
its properties which will help us to prove the existence of traveling wave solutions.

First, we define the operators H = (H1 , · · · , HN ) and G = (G1 , · · · , GN ) on


C3K (R, RN ) by
1
Hn (Φ)(s) := {dn (φn (s − 1) − 2φn (s) + φn (s + 1)) + φn (s)fn (Φs (−cτn ))} + βn φn (s),
c Z s
Gn (Φ)(s) := e−βn s eβn z Hn (Φ)(z)dz,
−∞

where s ∈ R, 1 ≤ n ≤ N and Φ = (φ1 , · · · , φN ) ∈ C3K (R, RN ). Then it is clear that


the profile equation (3) can be represented as

φ0n (s) + βn φn (s) − Hn (Φ)(s) = 0

for 1 ≤ n ≤ N , and a fixed point of the operator G is a solution of (3). Some properties

of operator G are established in the following lemmas.

Lemma 3.1. The operator G is continuous with respect to the norm k · k.

Lemma 3.2. Let Φ = (φ1 , · · · , φN ) and Φ̄ = (φ1 , · · · , ψn , · · · , φN ), where 1 ≤ n ≤ N,


φn (s) ≤ ψn (s) and

eβn s (φn (s) − ψn (s)) is nondecreasing for all s ∈ R.

If τn,n is small enough, then Gn (Φ)(s) ≤ Gn (Φ̄)(s) for all s ∈ R.

Lemma 3.3. Assume (EMQM) holds. Then G : Γ(Φ̌, Φ̂) → Γ(Φ̌, Φ̂) is a compact
operator.

Proof. Let Φ = (φ1 , · · · , φN ) ∈ Γ(Φ̌, Φ̂). First of all, we claim that eβn s [φ̂n (s) −
Gn (Φ)(s)] and eβn s [Gn (Φ)(s) − φ̌n (s)] are nondecreasing for all s ∈ R and 1 ≤ n ≤ N.
Here we only prove the former case. The later case can also be shown by the same way.

For any 1 ≤ n ≤ N, by the definition of the upper solutions and the condition (1)
in (EMQM), it is easy to see that
d βn s
e [φ̂n (s) − G(Φ)(s)]
ds  
=eβn s βn φ̂n (s) + φ̂0n (s) − Hn (Φ)(s)
 
≥eβn s βn φ̂n (s) + φ̂0n (s) − Hn (Ψ)(s) + Hn (Ψ)(s) − Hn (Φ)(s)
≥0
352 Cheng-Hsiung Hsu, Jian-Jhong Lin, Ting-Hui Yang

for all except finite s, where Ψ(s) = (ψ1 (s), · · · , ψN (s)) and


 ∂fn (X)

 φ̂j (s), if ≥ 0,

 ∂xj
ψj (s) = φ̂j (s), if j = n,



 ∂fn (X)

 φ̌j (s), if ≤ 0.
∂xj
Then the continuity of φ̂n (s) and Gn (Φ)(s) implies that eβn s [φ̂n (s) − Gn (Φ)(s)] is non-
decreasing for all s ∈ R. Hence, the assertion of our claim follows.
Next, we prove that Φ̌(s)  G(Φ)(s)  Φ̂(s), for all s ∈ R. To this end, we first
show that
φ̌n (s) ≤ Gn (φ1 , φ2 , · · · , φ̌n , · · · , φN )(s), for all s ∈ R, 1 ≤ n ≤ N. (8)
Without loss of generality, we may assume that n = 1. By (A1) and (A2), we have
Z s
−β1 s
G1 (φ̌1 , φ2 , · · · , φN )(s) =e eβ1 z H1 (φ̌1 , φ2 , · · · , φN )(z)dz
−∞
Z s
−β1 s
≥e eβ1 z H1 (φ̌1 , ψ2 , · · · , ψN )(z)dz,
−∞
where 
 ∂fn (X)

 φ̂j (s), if ≤ 0,
∂xj
ψj (s) =

 ∂fn (X)
 φ̌j (s), if ≥ 0.
∂xj
Note that φ̌01 maybe does not exist at finite real numbers. If s ∈ R and φ̌01 exists on
(−∞, s), then we have
Z s Z s
e−β1 s eβ1 z H1 (φ̌1 , ψ2 , · · · , ψN )(z)dz ≥ e−β1 s eβ1 z (φ̌01 (z) + β1 φ̌1 (z))dz = φ̌1 (s),
−∞ −∞
(9)
since Φ̆ is a lower solution. On the other hand, if s ∈ R and φ̌01 does not exist at finite
points of (−∞, s), by improper integral, integration by parts and similar arguments as
above, one can also easily to check that (9) is also true. Hence the inequality (8) follows.
By the same way, we can also obtain that
Gn (φ1 , φ2 , · · · , φ̂n , · · · , φN )(s) ≤ φ̂n (s) for all s ∈ R, 1 ≤ n ≤ N. (10)
From (8), (10) and Lemma 3.2, we know that
φ̌n (s) ≤ Gn (φ1 , φ2 , · · · , φ̌n , · · · , φN )(s) ≤ Gn (φ1 , φ2 , · · · , φn , · · · , φN )(s)
≤ Gn (φ1 , φ2 , · · · , φ̂n , · · · , φN )(s) ≤ φ̂n (s).
for s ∈ R and 1 ≤ n ≤ N. Therefore, G(Φ) ∈ Γ(Φ̌, Φ̂).
The proof of compactness of operator G is similar the proof in Li et. al. [9] and
omit here. 
Traveling Wave Solutions for Time-Delayed Lattice Reaction-Diffusion Systems 353

4. EXISTENCE OF TRAVELING WAVE SOLUTIONS


4.1. Construction Of Upper And Lower Solutions. In this subsection, we first
investigate some properties of the characteristic function of (3) and then use its positive
roots to construct a pair of upper-lower solutions.
First, let us define the characteristic functions ∆n (λ, c) of (3) at (0, · · · , 0) by
∆n (λ, c) = −cλ + dn (e−λ + eλ − 2) + fn (0, · · · , 0) (11)
for λ, c ∈ [0, ∞), 1 ≤ n ≤ N. Some properties of each function ∆n (λ, c) are stated as
follows.
Lemma 4.1. Let ∆n (λ, c) be the characteristic function defined by (11). For any
1 ≤ n ≤ N, there exists some cn > 0 such that the following statements hold.
(1) If 0 < c < cn then ∆n (λ, c) has no real roots;
(2) If c > cn then ∆n (λ, c) has two real positive roots λn,1 (c), λn,2 (c) with

 = 0,
 if λ = λn,1 (c) or λn,2 (c),
∆n (λ, c) < 0, if λn,1 (c) < λ < λn,2 (c), (12)


> 0, if λ < λn,1 (c) or λ > λn,2 (c).
To construct a pair of upper-lower solution, let us first recall that assumption
(A4) assume that the set
N
∩ (λn,1 (c), min {λn,1 (c) + λj,1 (c), λn,2 (c)})
n=1 1≤j≤N

is not empty. We start to establish the upper and lower solutions of (3) by using the
properties of the characteristic functions.
First, we define the function hn,q (t) = eλn,1 t − qeηλn,1 t , for 1 ≤ n ≤ N , where
q > 1 and η is a real number satisfying
λn,2 λn,1 + λm,1
1 < η < min{ , | 1 ≤ n, m ≤ N }.
λn,1 λn,1
Direct computation implies that hn,q (t) has a unique global maximum mn (q) at t =
tn (q), where
1 1 1 1 1
mn (q) = (1 − )( ) η−1 and tn (q) = ln( ).
η qη λn,1 (η − 1) qη
It is clear that limq→∞ tn (q) = −∞ and limq→∞ mn (q) = 0+ . Let σ(q) > 1 with
mn (q)/σ(q) < kn for all 1 ≤ n ≤ N and set
t∗n (q) := max{t | hn,q (t) = mn (q)/σ(q)}.
1 1
Note that hn,q (t) = 0 at t = ln , and
λn,1 (η − 1) q
1 1
tn (q) < t∗n (q) < ln .
λn,1 (η − 1) q
354 Cheng-Hsiung Hsu, Jian-Jhong Lin, Ting-Hui Yang

Then the fact q > 1 implies that t∗n (q) < 0. Hence we can choose a number δq > 0 and
small enough such that
mn (q) mn (q)  −γt∗n (q) mn (q)
< kn − kn − e < ,
2σ(q) σ(q) σ(q)
for all γ ∈ (0, δq ) and any 1 ≤ n ≤ N. Then there exists a number tn (γ, q) satisfying
1 1
t∗n (q) < tn (γ, q) < ln (13)
λn,1 (η − 1) q
and
mn (q) −γtn (γ,q)
kn − (kn − )e = hn (tn (γ, q)).
σ(q)
N
Next, let κ be a positive number in ∩ (λn,1 , min1≤j≤N {λn,1 + λj,1 , λn,2 }). Then
n=1
we consider the functions ĥn,q (t) = eλn,1 t + qkn eκt for 1 ≤ n ≤ N . For each n, one can
easily check that there exists a unique number b tn (q) such that

ĥn,q (b
tn (q)) = 3kn and lim b
tn (q) = −∞.
q→∞

Since lim ĥn,q (t) = 0, there exists some number denoted by b


t∗n (q) such that
t→−∞


ĥn,q (b
t∗n (q)) < kn < kn + kn e−γ tn (q) for any γ > 0. (14)
b

There also exists some δ̂q > 0 such that if 0 < γ < δ̂q , then

ĥn,q (b
tn (q)) = 3kn > kn + kn e−γ tn (q) (15)
b

From (14) and (15), there is a unique number denoted by b


tn (γ, q) such that

b
t∗n (q) < b
tn (γ, q) < b
tn (q) and ĥn,q (b
tn (γ, q)) = kn + kn e−γ tn (γ,q) ,
b

where 0 < γ < δ̂q . For convenience, let us replace tn (γ, q), mn (q), σ(q) and b
tn (γ, q) by
tn , mn , σ and b
tn respectively, and define
 λ t
e n,1 − qeηλn,1 t , if t < tn ,
φ̌n (t) = (16)
kn − (kn − mσn )e−γt , if t ≥ tn .
 λ t
e n,1 + qkn eκt , if t < b tn ,
φ̂n (t) = (17)
kn + kn e−γt , if t ≥ b
tn .

From the definitions of each φ̂n and each φ̌n , it is easy to see that

lim φ̌n (t) = lim φ̂n (t) = 0 and lim φ̌n (t) = lim φ̂n (t) = kn
t→−∞ t→−∞ t→∞ t→∞

for all 1 ≤ n ≤ N.
Traveling Wave Solutions for Time-Delayed Lattice Reaction-Diffusion Systems 355

kn + kn e−γt
kn

eλn,1 t + qkn eκt


mn
kn − (kn − σ
)e−γt

eλn,1 t − qeηλn,1 t

0
tn t̂n

Figure 1. Graphs of upper solution Φ+ and lower solution Φ− .


Lemma 3.2. Assume that δ is large enough, 0 < γ ≤ min1≤n≤N {λn,1eλn,1 tn /εn }
Figure
and small enough, and there1. The
exists graphs
positive of φ̌{b
numbers nε(t)N and φ̂n (t)
n }n=1 satisfy (A4 ). Then
Φ− and Φ+ are lower and upper solutions of (1.4) respectively.
Proof. Our purpose is to show that Φ− and Φ+ satisfy the differential inequal-
ities of Definition 3.1 for t ∈weR\{t n, b | n = 1,that
tnshow ··· ,N
Φ̂}.=To(φ̂simplify the computa-
In the rest of the section, will 1 , · · · , φ̂N ) and Φ̌ = (φ̌1 , · · · , φ̌N )
tions, we introduce the notation (d¯n,0 , d̄n,1, d¯n,2, d¯n,3, d¯n,4 ) := (−dn,0 , dn,1, dn,2, dn,3, dn,4 )
is a pair upper-lower
and (α0 , α1 , α2solution
, α3 , α4 ) := of (3)1, −c
(0, −c respectively. Before doing this, we provide a simple
2 , c1 , c2 ). Then the function ∆n (λ, c) in the
lemma. characteristic equations can be rewritten as
P
∆n (λn,1 , c) = −λn,1 + i d¯n,i eλn,1 αi + fn (0, 0) = 0. (3.18)
Lemma 4.2. Assume (A3) holds. Then for any 1 ≤ n ≤ N, there exist ξn,1 , · · · , ξn,N , Ln >
0 such thatNote that
P ¯ ηλ α
∂fn (X) ∆n (ηλn,1, c)X = −ηλ + dn,i e X
+ fn (0, 0) < 0. (3.19)
n,1 i
∂fn,1
n (X) i −∂f n (X)
kn ξn,n + kj ξn,j + kj ξn,j < −Ln , (18)
Now
∂xwen start the proof of +the first
∂xjdifferential inequality of Definition
∂xj 3.1.

j∈I
If t ≤ tn , then (φ− ′ n
n ) (t) = λn,1 e
λn,1 t
−δηλn,1 eηλn,1j∈I
t n equations (3.18), (3.19)
. By
and the Mean Value Theorem, + we have −
for all 0  X  3K, where In as well as In are defined by (5) and
Ln [φ− ](t) + φ− + −
n (t)fn ([Φ |φn ](t),
b
+ n
X n  (Φt ) )
≥ d¯n,i (eλ (t+α )
n,1
− δeiηλ (t+α )
)(0,
+ φ1),
n,1 −
if +j|φ=
n (t)fn ([Φ
i −
n,(Φ+t )nb )
n ](t),
ξn,j ∈
i
− ′ ηλ t
(1,

∞), +if −j 6= n.+ nb 
= (φn ) (t) − δe n,1
∆(ηλn,1, c) + φn (t) fn ([Φ |φn ](t), (Φt ) ) − fn (0, 0)

= (φ−
n ) ′
(t) − δeηλn,1 t
∆(ηλn,1, c) + φ− + − + nb
n (t)Dfn (Ψ1 , Ψ2 ) · [Φ |φn ](t), (Φt ) , (3.20)

Lemma 4.3. Assume (A1)∼(A4) hold. 12Then there is a c∗ > 0 such that for each c > c∗
there exists a δ > 0 such that the functions Φ̂ = (φ̂1 , · · · , φ̂N ) and Φ̌ = (φ̌1 , · · · , φ̌N )
defined by (16) and (17) respectively is a pair of the upper-lower solutions of (3) if q is
large enough, γ is small enough and τn,n < δ for all 1 ≤ n ≤ N .

Proof. Let c > c∗ ≡ max{c1 , · · · , cN }, where c1 , · · · , cN are defined in Lemma 4.1.


Suppose Φ̂ = (φ̂_1, · · · , φ̂_N) and Φ̌ = (φ̌_1, · · · , φ̌_N) are the functions defined by (17) and (16) respectively. Then it is obvious that φ̌_n(t) ≤ φ̂_n(t) for all 1 ≤ n ≤ N and t ∈ R. In addition, note that (18) holds by Lemma 4.2, and for convenience we denote

max_{0 ≪ X ≪ 3K} |∂f_n(X)/∂x_j| = P_{n,j}   for all 1 ≤ n, j ≤ N.

Now we show that (6) and (7) hold. To prove (6), let us recall that the function Φ = (φ_1, · · · , φ_N) in (6) is defined by

φ_j(t) = φ̂_j(t)  if ∂f_n(X)/∂x_j ≤ 0,
φ_j(t) = φ̌_j(t)  if j = n,
φ_j(t) = φ̌_j(t)  if ∂f_n(X)/∂x_j ≥ 0,

and then we consider the following two cases: (i) t < t_n and (ii) t ≥ t_n.

(i) Assume t < t_n.

By direct computation and (16), we know that

−cφ̌′_n(t) + d_n(φ̌_n(t−1) − 2φ̌_n(t) + φ̌_n(t+1)) + φ̌_n(t) f_n(Φ_t(−cτ_n))
  ≥ −cλ_{n,1}(e^{λ_{n,1}t} − qηe^{ηλ_{n,1}t}) + d_n(e^{λ_{n,1}(t−1)} − qe^{ηλ_{n,1}(t−1)}) − 2d_n(e^{λ_{n,1}t} − qe^{ηλ_{n,1}t}) + d_n(e^{λ_{n,1}(t+1)} − qe^{ηλ_{n,1}(t+1)}) + φ̌_n(t) f_n(Φ_t(−cτ_n))
  = −q∆_n(ηλ_{n,1}, c)e^{ηλ_{n,1}t} + φ̌_n(t)(f_n(Φ_t(−cτ_n)) − f_n(0)).

Applying the Mean Value Theorem, we have

f_n(Φ_t(−cτ_n)) − f_n(0) = (∂f_n(X(t))/∂x_n) φ̌_n(t − cτ_{n,n}) + Σ_{j∈I_n^+} (∂f_n(X(t))/∂x_j) φ̌_j(t − cτ_{n,j}) + Σ_{j∈I_n^−} (∂f_n(X(t))/∂x_j) φ̂_j(t − cτ_{n,j}),   (19)

for some X(t) ∈ R^N. Since φ̌_n(t) ≥ 0 and φ̂_j(t) ≤ e^{λ_{j,1}t} + qk_n e^{κt} for all j, then

φ̌_n(t)(f_n(Φ_t(−cτ_n)) − f_n(0)) ≥ φ̌_n(t)(∂f_n(X(t))/∂x_n) φ̌_n(t − cτ_{n,n}) + φ̌_n(t) Σ_{j∈I_n^−} (∂f_n(X(t))/∂x_j)(e^{λ_{j,1}(t−cτ_{n,j})} + qk_n e^{κ(t−cτ_{n,j})}).

Moreover, the fact φ̌_n(t) ≥ 0 for all t implies that qe^{ηλ_{n,1}t} ≤ e^{λ_{n,1}t} for all t < t_n. Then it is easy to see that |φ̌_n(t)| ≤ 2e^{λ_{n,1}t} and

|(∂f_n(X(t))/∂x_n) φ̌_n(t − cτ_{n,n}) + Σ_{j∈I_n^−} (∂f_n(X(t))/∂x_j)(e^{λ_{j,1}(t−cτ_{n,j})} + qk_n e^{κ(t−cτ_{n,j})})|
  ≤ P_{n,n}(e^{λ_{n,1}(t−cτ_{n,n})} − qe^{ηλ_{n,1}(t−cτ_{n,n})}) + Σ_{j∈I_n^−} P_{n,j}(e^{λ_{j,1}(t−cτ_{n,j})} + qk_n e^{κ(t−cτ_{n,j})}).

Thus, if t_n is small enough, one can easily verify that

−q∆_n(ηλ_{n,1}, c)e^{ηλ_{n,1}t} + φ̌_n(t)(∂f_n(X(t))/∂x_n) φ̌_n(t − cτ_{n,n}) + φ̌_n(t) Σ_{j∈I_n^−} (∂f_n(X(t))/∂x_j)(e^{λ_{j,1}(t−cτ_{n,j})} + qk_n e^{κ(t−cτ_{n,j})}) > 0.

Hence the inequality (6) holds.

(ii) Assume t ≥ t_n.

By equation (16), we have

−cφ̌′_n(t) + d_n(φ̌_n(t−1) − 2φ̌_n(t) + φ̌_n(t+1)) + φ̌_n(t) f_n(Φ_t(−cτ_n))
  ≥ −cγ(k_n − m_n/σ)e^{−γt} + d_n(k_n − (k_n − m_n/σ)e^{−γ(t−1)}) − 2d_n(k_n − (k_n − m_n/σ)e^{−γt}) + d_n(k_n − (k_n − m_n/σ)e^{−γ(t+1)}) + φ̌_n(t) f_n(Φ_t(−cτ_n))
  = −(k_n − m_n/σ)e^{−γt}(cγ + d_n e^{γ} − 2d_n + d_n e^{−γ}) + φ̌_n(t) f_n(Φ_t(−cτ_n)).

Applying the Mean Value Theorem again, we derive

f_n(Φ_t(−cτ_n)) = f_n(Φ_t(−cτ_n)) − f_n(K)
  = (∂f_n(X(t))/∂x_n)(φ̌_n(t − cτ_{n,n}) − k_n) + Σ_{j∈I_n^−} (∂f_n(X(t))/∂x_j)(φ̂_j(t − cτ_{n,j}) − k_j) + Σ_{j∈I_n^+} (∂f_n(X(t))/∂x_j)(φ̌_j(t − cτ_{n,j}) − k_j).   (20)

Since φ̂_j(t) ≤ k_j + k_j e^{−γt} and φ̌_j(t) ≥ k_j − (k_j − m_j/σ)e^{−γt} for all t ∈ R and 1 ≤ j ≤ N, it is clear that

f_n(Φ_t(−cτ_n)) ≥ (∂f_n(X(t))/∂x_n)(φ̌_n(t − cτ_{n,n}) − k_n) + Σ_{j∈I_n^−} (∂f_n(X(t))/∂x_j) k_j e^{−γ(t−cτ_{n,j})} + Σ_{j∈I_n^+} (−∂f_n(X(t))/∂x_j)(k_j − m_j/σ)e^{−γ(t−cτ_{n,j})}.

By (13), if t ≥ t_n then

φ̌_n(t) = k_n − (k_n − m_n/σ)e^{−γt} ≥ k_n − (k_n − m_n/σ)e^{−γ t*_n(q)}.   (21)
Suppose t ≥ t_n + cτ_{n,n}. By (16), we have

(∂f_n(X(t))/∂x_n)(φ̌_n(t − cτ_{n,n}) − k_n) + Σ_{j∈I_n^−} (∂f_n(X(t))/∂x_j) k_j e^{−γ(t−cτ_{n,j})} + Σ_{j∈I_n^+} (−∂f_n(X(t))/∂x_j)(k_j − m_j/σ)e^{−γ(t−cτ_{n,j})}
  = ( (−∂f_n(X(t))/∂x_n)(k_n − m_n/σ)e^{γcτ_{n,n}} + Σ_{j∈I_n^−} (∂f_n(X(t))/∂x_j) k_j e^{γcτ_{n,j}} + Σ_{j∈I_n^+} (−∂f_n(X(t))/∂x_j)(k_j − m_j/σ)e^{γcτ_{n,j}} ) e^{−γt}.

According to (21) and (18), it is easy to see that (6) holds by taking all m_j/σ small enough and letting γ → 0.
On the other hand, if t_n < t < t_n + cτ_{n,n}, we have

φ̌_n(t − cτ_{n,n}) = k_n − (k_n − m_n/σ)e^{−γ t_n} + ε(t),

where ε(t) → 0 as τ_{n,n} → 0, and

( Σ_{j∈I_n^−} (∂f_n(X(t))/∂x_j) k_j e^{γcτ_{n,j}} + Σ_{j∈I_n^+} (−∂f_n(X(t))/∂x_j)(k_j − m_j/σ)e^{γcτ_{n,j}} ) e^{−γ t_n} + (∂f_n(X(t))/∂x_n)(φ̌_n(t − cτ_{n,n}) − k_n)
  ≥ ( Σ_{j∈I_n^−} (∂f_n(X(t))/∂x_j) k_j e^{γcτ_{n,j}} + Σ_{j∈I_n^+} (−∂f_n(X(t))/∂x_j)(k_j − m_j/σ)e^{γcτ_{n,j}} ) e^{−γ t_n} + (∂f_n(X(t))/∂x_n) ε(t) + (−∂f_n(X(t))/∂x_n)(k_n − m_n/σ)e^{−γ t_n}.

Note that e^{−γt} < e^{−γ t_n}. According to equations (13), (21) and (18), the inequality (6) holds by taking all m_j/σ small enough and letting γ as well as τ_{n,n} be small.
Next, we prove that (7) holds for all 1 ≤ n ≤ N. To this end, let us recall that the function Ψ = (ψ_1, · · · , ψ_N) in (7) is defined by

ψ_j(t) = φ̂_j(t)  if ∂f_n(X)/∂x_j ≥ 0,
ψ_j(t) = φ̂_j(t)  if j = n,
ψ_j(t) = φ̌_j(t)  if ∂f_n(X)/∂x_j ≤ 0,

and then we also consider the following two cases: (i) t < t̂_n and (ii) t ≥ t̂_n.

(i) Assume t < t̂_n.

By (17), direct computation implies that

−cφ̂′_n(t) + d_n(φ̂_n(t−1) − 2φ̂_n(t) + φ̂_n(t+1)) + φ̂_n(t) f_n(Ψ_t(−cτ_n))
  ≤ −cλ_{n,1}e^{λ_{n,1}t} + k_n q(−c)κe^{κt} + d_n(e^{λ_{n,1}(t−1)} + k_n qe^{κ(t−1)} + e^{λ_{n,1}(t+1)} + k_n qe^{κ(t+1)}) − 2d_n(e^{λ_{n,1}t} + k_n qe^{κt}) + φ̂_n(t) f_n(Ψ_t(−cτ_n))
  = qk_n ∆_n(κ, c)e^{κt} + φ̂_n(t)(f_n(Ψ_t(−cτ_n)) − f_n(0)).

From the choice of κ and (12), it is clear that ∆_n(κ, c) < 0. Similar to equation (19), we can get the following inequality

f_n(Ψ_t(−cτ_n)) − f_n(0) ≤ (∂f_n(X(t))/∂x_n) φ̂_n(t − cτ_{n,n}) + Σ_{j∈I_n^+} (∂f_n(X(t))/∂x_j) φ̂_j(t − cτ_{n,j})
  ≤ (∂f_n(X(t))/∂x_n) e^{λ_{n,1}(t−cτ_{n,n})} + Σ_{j∈I_n^+} (∂f_n(X(t))/∂x_j) e^{λ_{j,1}(t−cτ_{n,j})} + qe^{κt}( (∂f_n(X(t))/∂x_n) k_n e^{−κcτ_{n,n}} + Σ_{j∈I_n^+} (∂f_n(X(t))/∂x_j) k_j e^{−κcτ_{n,j}} ),

for some X(t) ∈ R^N. By (18), we know that if τ_{n,n} is small enough then

(∂f_n(X(t))/∂x_n) k_n e^{−κcτ_{n,n}} + Σ_{j∈I_n^+} (∂f_n(X(t))/∂x_j) k_j e^{−κcτ_{n,j}} ≤ −L_n.

Furthermore, we have

|φ̂_n(t)( (∂f_n(X(t))/∂x_n) e^{λ_{n,1}(t−cτ_{n,n})} + Σ_{j∈I_n^+} (∂f_n(X(t))/∂x_j) e^{λ_{j,1}(t−cτ_{n,j})} )|
  ≤ (e^{λ_{n,1}t} + qk_n e^{κt})(P_{n,n} e^{λ_{n,1}t} + Σ_{j∈I_n^+} P_{n,j} e^{λ_{j,1}t}).

Then it is easy to see that

qk_n ∆_n(κ, c)e^{κt} + φ̂_n(t)(f_n(Ψ_t(−cτ_n)) − f_n(0)) < 0

if t̂_n and τ_{n,n} are small enough. Hence the inequality (7) holds.

(ii) Assume t ≥ t̂_n.

By (17) again, we have

−cφ̂′_n(t) + d_n(φ̂_n(t−1) − 2φ̂_n(t) + φ̂_n(t+1)) + φ̂_n(t) f_n(Ψ_t(−cτ_n))
  ≤ e^{−γt}k_n(cγ + d_n(e^{γ} − 2 + e^{−γ})) + φ̂_n(t) f_n(Ψ_t(−cτ_n)).

Similar to equation (20), we also obtain the following inequality

f_n(Ψ_t(−cτ_n)) ≤ (∂f_n(X(t))/∂x_n)(φ̂_n(t − cτ_{n,n}) − k_n) + ( Σ_{j∈I_n^+} (∂f_n(X(t))/∂x_j) k_j e^{γcτ_{n,j}} + Σ_{j∈I_n^−} (−∂f_n(X(t))/∂x_j)(k_j − m_j/σ)e^{γcτ_{n,j}} ) e^{−γt}.

Note that φ̂_n(t) > k_n for all t > t̂_n. If t > cτ_{n,n} + t̂_n and γ is small enough, then (18) implies that

φ̂_n(t) f_n(Ψ_t(−cτ_n)) ≤ φ̂_n(t)( Σ_{j∈I_n^+∪{n}} (∂f_n(X(t))/∂x_j) k_j e^{γcτ_{n,j}} + Σ_{j∈I_n^−} (−∂f_n(X(t))/∂x_j)(k_j − m_j/σ)e^{γcτ_{n,j}} ) e^{−γt} ≤ −L_n k_n e^{−γt}.

Thus, if γ is small enough, we have

e^{−γt}k_n(cγ + d_n(e^{γ} − 2 + e^{−γ})) + φ̂_n(t) f_n(Ψ_t(−cτ_n)) ≤ 0.

By (18), if t̂_n < t < cτ_{n,n} + t̂_n and γ is small enough then

f_n(Ψ_t(−cτ_n)) ≤ (−∂f_n(X(t))/∂x_n) k_n + (∂f_n(X(t))/∂x_n) φ̂_n(t − cτ_{n,n}) + ( Σ_{j∈I_n^+} (∂f_n(X(t))/∂x_j) k_j e^{γcτ_{n,j}} + Σ_{j∈I_n^−} (−∂f_n(X(t))/∂x_j)(k_j − m_j/σ)e^{γcτ_{n,j}} ) e^{−γt}
  = (−∂f_n(X(t))/∂x_n) k_n + (∂f_n(X(t))/∂x_n)(k_n + k_n e^{−γ t̂_n} + ε(cτ_{n,n})) + ( Σ_{j∈I_n^+} (∂f_n(X(t))/∂x_j) k_j e^{γcτ_{n,j}} + Σ_{j∈I_n^−} (−∂f_n(X(t))/∂x_j)(k_j − m_j/σ)e^{γcτ_{n,j}} ) e^{−γt}
  ≤ (∂f_n(X(t))/∂x_n)(k_n e^{−γ t̂_n} + ε(cτ_{n,n})) + ( Σ_{j∈I_n^+} (∂f_n(X(t))/∂x_j) k_j e^{γcτ_{n,j}} + Σ_{j∈I_n^−} (−∂f_n(X(t))/∂x_j)(k_j − m_j/σ)e^{γcτ_{n,j}} ) e^{−γ t̂_n}
  ≤ (∂f_n(X(t))/∂x_n) ε(cτ_{n,n}) − L_n e^{−γ t̂_n},

where ε(t) → 0 as τ_{n,n} → 0. Note that e^{γ} + e^{−γ} > 2 for all γ > 0. Therefore, if γ and τ_{n,n} are small enough then

e^{−γt}k_n(cγ + d_n(e^{γ} − 2 + e^{−γ})) + φ̂_n(t) f_n(Ψ_t(−cτ_n))
  ≤ e^{−γ t̂_n}k_n(cγ + d_n(e^{γ} − 2 + e^{−γ})) − 3L_n k_n e^{−γ t̂_n} + 3k_n |ε(t) ∂f_n(X(t))/∂x_n| ≤ 0.

The proof is complete. 

4.2. Existence Of Traveling Wave Solutions. Now we prove the results of Theorem
1.1 in this subsection.
Proof of Theorem 1.1. Let c* > 0 be defined as in Lemma 4.3. For c > c*, we can choose q large enough and γ small enough. Then, by Lemma 4.3, there exists a δ > 0 such that if |τ_{n,n}| < δ for all 1 ≤ n ≤ N then the functions Φ̂ = (φ̂_1, · · · , φ̂_N) and Φ̌ = (φ̌_1, · · · , φ̌_N) defined by (17) and (16) form a pair of upper-lower solutions of (3). One can verify that e^{β_n s}(φ̂_n(s) − φ̌_n(s)) is nondecreasing for all s ∈ R. This implies that Γ(Φ̌, Φ̂) is a non-empty, convex, bounded and closed set with respect to the supremum norm ‖·‖. Therefore, by Lemma 3.1, Lemma 3.3 and Schauder's fixed point theorem, equation (3) has a solution Y(t) = (y_1(t), · · · , y_N(t)) ∈ Γ(Φ̌, Φ̂) satisfying the inequalities

φ̌_i(t) ≤ y_i(t) ≤ φ̂_i(t)

for all t ∈ R and 1 ≤ i ≤ N. Hence by the equations (16) and (17) it is easy to see that lim_{t→−∞} y_i(t) = 0 and lim_{t→∞} y_i(t) = k_i, that is, Y(t) satisfies condition (4). We complete the proof.

5. APPLICATIONS
In this section, we will apply our main theorem to show the existence of traveling
wave solutions for various types of lattice reaction-diffusion systems.
Example 5.1 (N Species Delayed Lotka-Volterra Ecological Models).

The N species delayed Lotka-Volterra ecological models can be described by the


following equations:

u′_{n,i}(t) = d_n(u_{n,i−1}(t) − 2u_{n,i}(t) + u_{n,i+1}(t)) + u_{n,i}(t)(r_n + a_{nn}u_{n,i}(t − τ_{nn}) + Σ_{m≠n} a_{nm}u_{m,i}(t − τ_{nm})),   (22)

where i ∈ Z, d_n, r_n > 0, τ_{nm} ≥ 0 and a_{nn} < 0 for 1 ≤ m, n ≤ N. If a_{nm} are positive for all n ≠ m then system (22) is called a cooperative model; if a_{nm} are negative for n ≠ m then system (22) is called a competitive model; and if a_{nm}a_{kℓ} < 0 for some n ≠ m and k ≠ ℓ then system (22) is called a predator-prey system.

For systems (22), we assume that there is a positive equilibrium (k1 , · · · , kN ), i.e.,
the equations
r_n + a_{nn}k_n + Σ_{m≠n} a_{nm}k_m = 0   (23)

hold for some ki > 0, i = 1, · · · , N . Let Φ(t) = (φ1 (t), · · · , φN (t)) be a traveling wave
solution of (22), then the corresponding profile equations are
−cφ′_n(t) = d_n(φ_n(t − 1) − 2φ_n(t) + φ_n(t + 1)) + φ_n(t)(r_n + a_{nn}φ_n(t − cτ_{nn}) + Σ_{m≠n} a_{nm}φ_m(t − cτ_{nm}))   (24)

with the asymptotical boundary conditions,


lim Φ(t) = (0, · · · , 0) and lim Φ(t) = (k1 , · · · , kN ).
t→−∞ t→∞

It is clear that the conditions (A1) and (A2) hold for (22). By elementary com-
putation, one can see that condition (A3) holds when
−a_{nn}k_n > Σ_{m≠n} |a_{nm}|k_m.   (25)

Therefore, if the conditions (23) and (25) hold, then we obtain the same results stated
in Theorem 1.1 for systems (22).
Note that the equations (23) can be rewritten as
−a_{nn}k_n = r_n + Σ_{m≠n} a_{nm}k_m,

then the condition (A3) (or (25)) always holds for cooperative systems.
However, it is not easy to verify condition (23) for general systems. Here we only consider the following two-species ecological system:

u′_i(t) = d_1(u_{i−1}(t) − 2u_i(t) + u_{i+1}(t)) + u_i(t)(r_1 + a_{11}u_i(t − τ_{11}) + a_{12}v_i(t − τ_{12})),
v′_i(t) = d_2(v_{i−1}(t) − 2v_i(t) + v_{i+1}(t)) + v_i(t)(r_2 + a_{22}v_i(t − τ_{22}) + a_{21}u_i(t − τ_{21})).   (26)
If a_{11}a_{22} − a_{12}a_{21} ≠ 0, then it is obvious that the equilibrium (k_1, k_2) can be expressed explicitly by

(k_1, k_2) = ( (−r_1a_{22} + r_2a_{12})/(a_{11}a_{22} − a_{12}a_{21}), (−r_2a_{11} + r_1a_{21})/(a_{11}a_{22} − a_{12}a_{21}) ).
It is required that k1 and k2 should be positive. Hence we assume
a11 a22 − a12 a21 > 0, −r1 a22 + r2 a12 > 0 and − r2 a11 + r1 a21 > 0. (27)
Under the above assumptions, (k_1, k_2) is the unique positive equilibrium of (26). Note that assumption (27) also implies that the equilibrium (k_1, k_2) of (26) is linearly stable for the following ODEs:

u′_i(t) = u_i(t)(r_1 + a_{11}u_i(t) + a_{12}v_i(t)),
v′_i(t) = v_i(t)(r_2 + a_{22}v_i(t) + a_{21}u_i(t)).
Now we only need to verify the condition (A3). By (26), the inequalities in (A3)
can be stated as the following:
−a11 k1 > |a12 |k2 and − a22 k2 > |a21 |k1 . (28)
Then we consider the following two cases for competitive and predator-prey systems
respectively.
◦ Assume a12 > 0 and a21 < 0.
By the formula of (k1 , k2 ), the condition (28) is equivalent to
2r1 a21 a22 < r2 (a11 a22 + a12 a21 ). (29)

◦ Assume a12 < 0 and a21 < 0.


By the formula of (k1 , k2 ) again, the condition (28) is equivalent to
2a_{11}a_{12}/(a_{11}a_{22} + a_{12}a_{21}) < r_1/r_2 < (a_{11}a_{22} + a_{12}a_{21})/(2a_{21}a_{22}).   (30)
Therefore the existence results for traveling wave solutions of system (26) are stated as follows.
Theorem 5.1. Assume r_i and a_{mn} satisfy the conditions (27). Then the statements
of Theorem 1.1 hold for systems (26) if one of the following conditions holds:
(1) a12 > 0 and a21 > 0;
(2) a12 > 0, a21 < 0 and condition (29) holds;
(3) a12 < 0, a21 < 0 and condition (30) holds.
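
The conditions of Theorem 5.1 are easy to check numerically for a concrete set of coefficients. The sketch below (in Python; not part of the original paper, and the function name and sample coefficients are illustrative) computes the equilibrium (k_1, k_2) of (26), tests the positivity condition (27), and then checks (29) or (30) to decide which case of Theorem 5.1, if any, applies.

```python
# Sketch: classify a two-species system (26) according to Theorem 5.1.
# The function name and the sample coefficients are illustrative choices.

def check_two_species_conditions(r1, r2, a11, a12, a21, a22):
    det = a11 * a22 - a12 * a21
    if det == 0:
        return "degenerate: a11*a22 - a12*a21 = 0"
    k1 = (-r1 * a22 + r2 * a12) / det
    k2 = (-r2 * a11 + r1 * a21) / det
    # Condition (27): positivity of the equilibrium (k1, k2).
    cond27 = det > 0 and (-r1 * a22 + r2 * a12) > 0 and (-r2 * a11 + r1 * a21) > 0
    if not cond27:
        return f"(k1,k2)=({k1:.3f},{k2:.3f}): condition (27) fails"
    if a12 > 0 and a21 > 0:                      # case (1) of Theorem 5.1
        return f"(k1,k2)=({k1:.3f},{k2:.3f}): case (1) of Theorem 5.1"
    if a12 > 0 and a21 < 0:                      # case (2): needs condition (29)
        ok = 2 * r1 * a21 * a22 < r2 * (a11 * a22 + a12 * a21)
        return f"(k1,k2)=({k1:.3f},{k2:.3f}): case (2), condition (29) {'holds' if ok else 'fails'}"
    if a12 < 0 and a21 < 0:                      # case (3): needs condition (30)
        lhs = 2 * a11 * a12 / (a11 * a22 + a12 * a21)
        rhs = (a11 * a22 + a12 * a21) / (2 * a21 * a22)
        ok = lhs < r1 / r2 < rhs
        return f"(k1,k2)=({k1:.3f},{k2:.3f}): case (3), condition (30) {'holds' if ok else 'fails'}"
    return f"(k1,k2)=({k1:.3f},{k2:.3f}): configuration not covered by Theorem 5.1"

print(check_two_species_conditions(r1=1.0, r2=1.0, a11=-2.0, a12=-0.5, a21=-0.5, a22=-2.0))
```

For the sample coefficients above, the equilibrium is (0.4, 0.4) and condition (30) holds, so case (3) of Theorem 5.1 applies.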

References
[1] Bates, P. W., Chen, X. and Chmaj, A. J. J. , Traveling Waves of Bistable Dynamics on a
Lattice, SIAM J. Math. Anal. 35(2), 520-546, 2003.
[2] Chen, X. and Guo, J.-S., Existence and Asymptotic Stability of Traveling Waves of Discrete
Quasilinear Monostable Equations, Journal of Differential Equations 184(2), 549-569, 2002.
[3] Chow, S.-N., Mallet-Paret, J. and Shen, W., Traveling Waves in Lattice Dynamical Systems,
Journal of Differential Equations 149(2), 248-291, 1998.
[4] Hsu, C.-H. and Lin, S.-S., Existence and Multiplicity of Traveling Waves in a Lattice Dynamical
System, Journal of Differential Equations 164(2), 431-450, 2000.
[5] Hsu, C.-H., Lin, S.-S. and Shen, W., Traveling Waves in Cellular Neural Networks, Internat. J.
Bifur. Chaos Appl. Sci. Engrg. 9(7), 1307-1319, 1999.
[6] Hsu, C.-H. and Yang, T.-H., Traveling Plane Wave Solutions of Delayed Lattice Differential
Systems in Competitive Lotka-Volterra Type, Discrete Contin. Dyn. Syst. Ser. B 14(1), 111-128,
2010.
[7] Huang, J., Lu, G. and Ruan, S., Traveling Wave Solutions in Delayed Lattice Differential Equa-
tions with Partial Monotonicity, Nonlinear Anal. 60(7), 1331-1350, 2005.
[8] Keener, J. P., Propagation and Its Failure in Coupled Systems of Discrete Excitable Cells, SIAM
Journal on Applied Mathematics 47(3), 556-572, 1987.
[9] Li, W.-T., Lin, G. and Ruan, S., Existence of Travelling Wave Solutions in Delayed Reaction-
diffusion Systems with Applications to Diffusion-competition Systems, Nonlinearity 19(6), 1253-
1273, 2006.
[10] Lin, G., Li, W.-T. and Ma, M., Traveling Wave Solutions in Delayed Reaction Diffusion Systems
with Applications to Multi-species Models, Discrete Contin. Dyn. Syst. Ser. B 13(2), 393-414,
2010.
[11] Ma, S., Liao, X. and Wu. J., Traveling Wave Solutions for Planar Lattice Differential Systems
with Applications to Neural Networks, Journal of Differential Equations, 182(2), 269-297, 2002.
[12] Mallet-Paret, J., The Global Structure of Traveling Waves in Spatially Discrete Dynamical
Systems, Journal of Dynamics and Differential Equations 11(1), 49-127, 1999.
[13] Wu, J. and Zou, X., Asymptotic and Periodic Boundary Value Problems of Mixed Fdes and
Wave Solutions of Lattice Differential Equations, Journal of Differential Equations 135(2), 315-
357, 1997.
[14] Wu, J. and Zou, X., Traveling Wave Fronts of Reaction-diffusion Systems with Delay, Journal of
Dynamics and Differential Equations 13(3), 651-687, 2001.
[15] Zinner, B., Harris, G. and Hudson, W., Traveling Wavefronts for the Discrete Fishers Equation,
Journal of Differential Equations 105(1), 46-62, 1993.
[16] Zinner, B. Existence of Traveling Wavefront Solutions for the Discrete Nagumo Equation, Journal
of Differential Equations, 96(1), 1-27, 1992.

Cheng-Hsiung Hsu
Department of Mathematics, National Central University,
Chung-Li 32001, Taiwan.
e-mail: [email protected]

Jian-Jhong Lin
Department of Mathematics, National Tsing Hua University,
Hsinchu 30013, Taiwan.
e-mail: [email protected]

Ting-Hui Yang
Department of Mathematics, Tamkang University,
Tamsui, Taipei County 25137, Taiwan.
e-mail: [email protected]
Proceedings of ”The 6th SEAMS-UGM Conference 2011”
Applied Mathematics, pp. 365 - 378.

EFFECT OF RAINFALL AND GLOBAL RADIATION ON OIL PALM


YIELD IN TWO CONTRASTED REGIONS OF SUMATERA, RIAU
AND LAMPUNG, USING TRANSFER FUNCTION

DIVO D. SILALAHI, J.P. CALIMAN, YONG YIT YUAN

Abstract. The paper attempts to study the relationship between rainfall and global
radiation on oil palm yield in Riau and Lampung. The method is based on multi -input
transfer function analysis, which is a multivariate time series analysis. This method
combines several properties of univariate ARIMA models and multiple linear regression
analysis. Based on the model obtained, in Riau oil palm yield is affected by last yield at t -
1, t-2, t-3 (1, 2, 3 months before harvest). Rainfall affects yi eld at t (actual), t-2, t-11, t-13
(2, 11, 13 months before harvest). While the global radiation affects yield at t (actual), t -1,
t-8, t-9 (1, 8, 9 months before harvest). In Lampung, oil palm yield is affected by last yield
at t-1, t-2 (1, 2 months before harvest). Rainfall affects yield at t (actual), t-1, t-7, t-8 (1, 7,
(1, 6, 7 months before harvest). The Mean Absolute Percentage Error (MAPE) and Mean Absolute Deviation (MAD) values in Lampung are smaller than those in Riau, so this model can be used to predict the future level of oil palm yield in Lampung. In Riau, an ARIMA model can alternatively be used to predict and explain the future level of oil palm yield.
Keywords and Phrases: Multi-input transfer function, Correlation, Oil Palm Yield

I. INTRODUCTION

Climatic conditions such as rainfall and global radiation are uncontrollable parameters
of the environment. In Libo Estate of Riau, rainfall follows a seasonal pattern with rainy and
dry seasons. Divo, D. S [1] has observed that high rainfall in Libo Estate occurs in October,
November and December, while low rainfall occurs in February and June.
Climatic conditions have a relationship with oil palm yield. Chow [2] showed that yield
and rainfall effects of a number of fields were significant in practically all field analyzed.
Ochs and Daniel [3] described an empirical relationship between soil water deficit and yield,
which could be used to predict yield from rainfall data. Goh [4] compared data on rainfall and
________________________________
2010 Mathematics Subject Classification: STATISTICS Applications to biology (62P10)


FFB yield from a number of countries, but the relationship was only moderately good.
Various methods have been developed for forecasting oil palm yield. Ahmad Alwi [5] used an ARIMA model that gave good precision for forecasting oil palm yield. The weakness of this method, however, was that the forecasts did not consider the effects of uncontrollable parameters such as rainfall and global radiation. For this reason, another statistical analysis was needed that can reflect the combined effects of many different types of causal phenomena.
This paper attempts to study the relationship of rainfall and global radiation with oil palm yield in Riau and Lampung. The research was carried out in oil palm plantations of PT. SMART Tbk, one of the largest private oil palm companies in Indonesia. The method used in this paper is multi-input transfer function analysis, a multivariate time series analysis that combines several properties of univariate ARIMA models and multiple linear regression analysis. The model obtained is used to determine the relationship between rainfall, global radiation and oil palm yield, and to predict the future level of oil palm yield.

2. MATERIAL AND METHOD

Yield and climate data were recorded monthly. In Riau the data cover the period 1998 to 2010 and were collected at Libo Estate (LIBE); in Lampung the data cover the period 2000 to 2010 and were collected at Sungai Buaya Estate (SBYE).
The climate parameters, rainfall and global radiation, were provided by the meteorological stations at both locations. Oil palm yield data were taken from the SMARTRI fertilizer trials LIBE-14 and SBYE-01. For the purpose of this study, we focused only on the optimum level in each experiment.

Figure 1. Trend average per month: rainfall, global radiation, and oil palm yield in Riau (LIBE) and Lampung (SBYE). (Panels: monthly rainfall in LIBE and SBYE (mm), monthly global radiation in LIBE and SBYE (MJ/M2), and monthly yield in LIBE and SBYE (ton/ha/month), January to December.)

The figure shows that rainfall at LIBE and SBYE follows different patterns: the pattern at LIBE is closer to an equatorial type, while that at SBYE is closer to a monsoon type. The same holds for global radiation: at LIBE global radiation is higher than at SBYE and its pattern also differs between the two sites. Based on these conditions, we investigate the contribution of rainfall and global radiation to oil palm yield.

2.1. Time Series Analysis. In time series analysis, one important step is to identify and build models. The aim is to find out whether the time series data follow a stochastic time series model, which is usually presented as the ARIMA (p, d, q) model (Hamilton [6]):

(1 − φ_1B − · · · − φ_pB^p)(1 − B)^d Z_t = (1 − θ_1B − · · · − θ_qB^q) a_t   (1)

or

φ_p(B)(1 − B)^d Z_t = θ_q(B) a_t,

where
φ_p(B) : autoregressive (AR(p)) operator,
θ_q(B) : moving average (MA(q)) operator,
a_t : random noise.
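
As an illustration of how a model of the form (1) can be fitted in practice, the following sketch uses Python's statsmodels package (the original study does not state its software; the file name, column name and the order (1, 1, 1) are placeholder choices).

```python
# Sketch: fit an ARIMA(p, d, q) model of the form (1) to a monthly yield series.
# The file name "yield_libe.csv" and the column "yield" are hypothetical placeholders.
import pandas as pd
from statsmodels.tsa.arima.model import ARIMA

y = pd.read_csv("yield_libe.csv", index_col="month", parse_dates=True)["yield"]

model = ARIMA(y, order=(1, 1, 1))          # p=1, d=1, q=1 chosen only as an example
fit = model.fit()
print(fit.summary())                       # estimated phi, theta and diagnostics
print(fit.forecast(steps=12))              # 12-month-ahead forecast
```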

2.2. Transfer Function. The transfer function model is a multivariate time series analysis which combines several properties of the univariate ARIMA model and multiple linear regression analysis. The general model of the transfer function (Wei [7]) is:

y_t = v_0x_t + v_1x_{t−1} + · · · + n_t   (2)

where x_t is the input series, y_t the output series, and n_t the representation of the error component (noise series), which follows a particular ARIMA model. Writing

v(B)x_t = v_0x_t + v_1x_{t−1} + · · · + v_nx_{t−n},

then v(B) = ω_s(B)B^b / δ_r(B) and n_t = (θ_q(B)/φ_p(B)) a_t; substituted into (2), this gives

y_t = (ω_s(B)B^b / δ_r(B)) x_{t−b} + (θ_q(B)/φ_p(B)) a_t   (Wei [7]),

where
ω_s(B) = ω_0 − ω_1B − ω_2B^2 − · · · − ω_sB^s,
δ_r(B) = 1 − δ_1B − δ_2B^2 − · · · − δ_rB^r,
φ_p(B) = 1 − φ_1B − φ_2B^2 − · · · − φ_pB^p,
θ_q(B) = 1 − θ_1B − θ_2B^2 − · · · − θ_qB^q.

The general model of the multi-input transfer function is:

y_t = Σ_{j=1}^{m} (ω_j(B)/δ_j(B)) x_{j,t−b_j} + (θ(B)/φ(B)) a_t   (3)

where
ω_j(B) : MA operator of order s_j for variable j,
δ_j(B) : AR operator of order r_j for variable j.

2.3. Methodology of Analysis. The methodology of analysis in multi-input transfer function


analysis is as follows:

Figure 2. Flowchart of the methodology of analysis: model identification and parameter estimation of the transfer function, followed by the final model and forecasting.
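
The multi-input structure of equation (3) can be approximated in standard software by regressing the output on lagged exogenous inputs with ARMA errors. The sketch below is only an illustration of that idea, not the authors' procedure: statsmodels has no dedicated transfer-function routine, so SARIMAX with lagged regressors is used as a stand-in, and the file name, column names and lag choices are hypothetical.

```python
# Sketch: approximate a multi-input transfer function such as (3) by a regression
# with lagged inputs (rainfall, radiation) and ARMA errors. Illustrative only.
import pandas as pd
from statsmodels.tsa.statespace.sarimax import SARIMAX

data = pd.read_csv("libe_monthly.csv", index_col="month", parse_dates=True)
y = data["yield"]
exog = pd.concat(
    {
        "rain_t": data["rainfall"],
        "rain_t11": data["rainfall"].shift(11),   # example lag for rainfall
        "rad_t": data["radiation"],
        "rad_t8": data["radiation"].shift(8),     # example lag for radiation
    },
    axis=1,
).dropna()
y = y.loc[exog.index]

model = SARIMAX(y, exog=exog, order=(1, 1, 1), seasonal_order=(0, 0, 1, 12))
fit = model.fit(disp=False)
print(fit.summary())
```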



3. RESULTS AND DISCUSSION

3.1. Description of Record Data

Table 1. Description of record data

Location         Variable            Mean               St.Dev    Min.     Max.
Riau (LIBE)      Rainfall            202,4 mm            97,6      18,6    494,7
                 Global radiation    514,6 MJ/M2         47,6     392,9    643,4
                 Yield               2,3 ton/ha/month     0,7       1,0      5,1
Lampung (SBYE)   Rainfall            182,8 mm           128,1       0,0    452,0
                 Global radiation    467,6 MJ/M2         42,6     352,7    603,9
                 Yield               1,5 ton/ha/month     1,0       0,1      5,5

Based on description of record data (Table.1) the average of rainfall that occurred in
Riau (LIBE) from 1998 to 2010 was 202,46 mm per month with a standard deviation
97,61.The average of global radiation was 514,68 MJ/M 2 per month with a standard deviation
47,6. And the average of oil palm yield was 2,34 ton/ha with a standard deviation 0,70. The
average of rainfall that occurred in Lampung (SBYE) from 2000 to 2010 was 182,8 mm per
month with a standard deviation 128,1.The average of global radiation was 467,6 MJ/M2 per
month with a standard deviation 42,6. And the average of oil palm yield was 1,5 ton/ha with a
standard deviation 1,0.

Figure 3. Correlation between input and output variables in Riau (LIBE) and Lampung (SBYE). (Panels: Pearson correlations of oil palm yield with rainfall in LIBE and SBYE and with global radiation in LIBE and SBYE, at lags t, t-1, ..., t-36.)

Based on the correlation values (Figure 3), in Riau rainfall was significantly and positively correlated with oil palm yield at t-9 and t-21 (9 and 21 months before harvest), while global radiation was significantly and negatively correlated with yield at t-20 (20 months before harvest). In Lampung, rainfall was significantly and positively correlated with yield at t-8 and t-9 (8 and 9 months before harvest), while global radiation was significantly and negatively correlated with yield at t-5 (5 months before harvest). The rainfall, global radiation and oil palm yield series were differenced once so that the input and output variables were stationary in mean and variance.

3.2. Prewhitening on Input and Output Series. Prewhitening sequence was carried out on
input and output series to identify the transfer function model parameters.

a. Prewhitening on rainfall and yield series.

o Riau

x 1t y1t
αt  ; βt 
(1  0,781 B)(1  0,690B )
12
(1  0,781B)(1  0,690B12 )

o Lampung

x 1t y1t
αt  ; βt 
(1  0,890B)(1  0,687B12 ) (1  0,890B)(1  0,687B12 )

b. Prewhitening on global radiation and yield series.

o Riau

x 2t y 2t
αt  ; βt 
(1  0,737B)(1  0,663B )
12
(1  0,737B)(1  0,663B12 )

o Lampung

x 2t y 2t
αt  ; βt 
(1  0,947B)(1  0,667B )
12
(1  0,947B)(1  0,667B12 )
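
Prewhitening filters both the input and the output series with the operator fitted to the input, so that the cross-correlations of the filtered series reveal the transfer-function lags. A minimal sketch of this step is given below (illustrative only: the coefficients are those quoted for the Riau rainfall filter, the denominator convention (1 − φ_1B)(1 − φ_12B^12) is assumed, and the helper names are hypothetical).

```python
# Sketch: prewhiten an input series x and filter the output series y with the same
# operator, then inspect their cross-correlation. The filter form 1/((1-phi1*B)(1-phi12*B^12))
# is an assumption; the function names are illustrative.
import numpy as np
from scipy.signal import lfilter

def prewhiten(series, phi1, phi12):
    # Denominator polynomial (1 - phi1*B)(1 - phi12*B^12) expanded in powers of B.
    a = np.zeros(14)
    a[0], a[1], a[12], a[13] = 1.0, -phi1, -phi12, phi1 * phi12
    return lfilter([1.0], a, series)        # applies series divided by the polynomial

def cross_correlation(alpha, beta, max_lag=24):
    alpha = (alpha - alpha.mean()) / alpha.std()
    beta = (beta - beta.mean()) / beta.std()
    n = len(alpha)
    return [np.mean(alpha[: n - k] * beta[k:]) for k in range(max_lag + 1)]

# x1: differenced rainfall, y1: differenced yield (random placeholders here).
x1 = np.random.randn(156)
y1 = np.random.randn(156)
alpha = prewhiten(x1, 0.781, 0.690)
beta = prewhiten(y1, 0.781, 0.690)
print(np.round(cross_correlation(alpha, beta), 2))
```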

3.3. Single-Input Transfer Function on Rainfall and Oil Palm Yield. After identification
of the ARIMA model, transfer function parameters and diagnostic test model, the final model
obtained on a single-input transfer function between the rainfalls with yield variable was
given as:

o Riau
yt  0,388 yt 1  0,0009 x1t  0,0098 x1t 11 
at  1,238at 1  0,3298at  2  0,661at 12 
0,817at 13  0,217at 14

It was seen that in Riau rainfall affects the oil palm yield at actual t (actual) and t-11
(11 month before harvest).
o Lampung
yt  0,682 yt 1  0,009 x1t  0,001x1t 7 
at  1,125at 1  0,302at  2  0,632at 12 
0,712at 13  0,191at 14

It was seen that rainfall in Lampung affects the oil palm yield at actual t (actual) and t-
7 (7 month before harvest).

3.4. Single-Input Transfer Function on Global Radiation and Oil Palm Yield. After
identification of the ARIMA model, transfer function parameters and diagnostic test model,
the final model obtained on a single-input transfer function between the global radiations with
yield variable was given as:
o Riau
yt   1.081yt 2  0,001x2t  0,0019 x2t 8 
at  0,834at 1  1,081at 2  0,901at 3 
0,64at 12  0,53at 13  0,69at 14  0,93at 15

It was seen that global radiation in Riau affects the oil palm yield at t (actual) and t-8
(8 month before harvest).
o Lampung
yt   0,7127 yt 1  0,0021x 2t  0,0025 x 2t 6 
at  0,420at 1  0,208at 2  0,611at 12 
0,175at 13  0,436at 14

It was seen that global radiation in Lampung affects the oil palm yield at t (actual) and
t-6 (6 month before harvest).

3.5. Multi-Input Transfer Function: Rainfall, Global Radiation and Oil Palm Yield. The
final multi-input transfer function model obtained was given as:

o Riau

y t  0,380 y t 1 0,957 y t  2  0,363 y t 3 


0,0008 x1t  0,0007 x1t  2  0,0012 x1t 11 
0,0010 x1t 13  0,0015 x 2t  0,0005 x 2t 1 
0,0013 x 2t 8  0,0004 x 2t 9  at  0,67 at 
1,380at 1  0,577 at  2  1,320at 3 
0,363at  4  0,919at 13  0,384at 14 
0,879at 15  0,242at 16

From the model obtained, it was known that the actual oil palm yield in Riau is
affected by last yield at t-1, t-2, t-3 (1, 2, 3 months before harvest). Rainfall affects yield at t
(actual), t-2, t-11, t-13 (2, 11, 13 months before harvest). While the global radiation affects
yield at t (actual), t-1, t-8, t-9 (1, 8, 9 months before harvest).

o Lampung

yt  0,066 yt 1  0,510 yt  2  0,0094 x1t 


0,0707 x1t 1  0,0016 x1t 7  0,0012 x1t 8 
0,0019 x2t  0,0013 x2t 1  0,002 x2t 6 
0,0018 x2t 7  at  0,299at 1  0,534at  2 
0,183at 3  0,650at 12  0,190at 13 
0,347 at 14  0,119at 15

In Lampung, oil palm yield is affected by last yield at t-1, t-2 (1, 2 months before
harvest).Rainfall affects yield at t (actual), t-1, t-7, t-8 (1, 7, 8 months before harvest). While
the global radiation affects yield at t (actual), t-1, t-6, t-7 (1, 6, 7 months before harvest).

3.6. Forecasting. Based on the final model obtained, the oil palm yield from January-
December 2011 was forecasted and presented in Table 2.

Table 2. Forecasting on oil palm yield (ton/ha/month) in Riau and Lampung on January to
December 2011.
Yield 2011
Month
Riau (ton/ha) Lampung (ton/ha)
Jan 2,4 1,4
Feb 2,0 0,8
Mar 2,1 0,7
Apr 2,2 0,8
May 1,8 1,1
Jun 2,4 0,5
Jul 2,2 1,7
Aug 2,1 1,2
Sep 2,3 2,2
Oct 2,9 2,9
Nov 2,1 2,8
Dec 2,4 1,8
From the forecasting results, the yield (ton/ha) in Riau is larger than in Lampung, with a total forecast yield in 2011 of 26,9 ton/ha/year for Riau and 17,9 ton/ha/year for Lampung.
Figure 4. Oil palm yield, actual vs forecast: (a) Riau (LIBE), (b) Lampung (SBYE). (Panels: forecast vs actual yield potential in ton/ha/month over the observation period.)

o Mean Absolute Percentage Error (MAPE):

Riau:     (1/n) Σ_{t=1}^{n} |y_t − ŷ_t| / y_t × 100% = 22%
Lampung:  (1/n) Σ_{t=1}^{n} |y_t − ŷ_t| / y_t × 100% = 18%

o Mean Absolute Deviation (MAD):

Riau:     (1/n) Σ_{t=1}^{n} |y_t − ŷ_t| = 0,095
Lampung:  (1/n) Σ_{t=1}^{n} |y_t − ŷ_t| = 0,024

Comparing the actual and forecast yields, the MAPE in Riau was 22% and the MAD was 0,095; that is, yield forecasts in Riau have a 22% average error and an average deviation of 0,095 from the actual values. In Lampung the MAPE was 18% and the MAD was 0,024, so yield forecasts in Lampung using this method have an 18% average error and an average deviation of 0,024. Since the MAPE in Riau exceeds 20%, an ARIMA model can also be applied in this region to estimate and explain the future level of oil palm yield.
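
For reference, the two accuracy measures used above can be computed directly from the actual and forecast series; a minimal sketch (the numbers below are placeholders, not the study data) is:

```python
# Sketch: MAPE and MAD for a forecast series; the values are placeholders.
import numpy as np

actual = np.array([2.1, 2.3, 1.9, 2.5, 2.0, 2.2])
forecast = np.array([2.0, 2.4, 2.1, 2.2, 1.9, 2.3])

mape = np.mean(np.abs(actual - forecast) / actual) * 100   # in percent
mad = np.mean(np.abs(actual - forecast))
print(f"MAPE = {mape:.1f}%, MAD = {mad:.3f}")
```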

4. CONCLUSION

In this paper we have discussed the application of the multi-input transfer function to predict the future level of oil palm yield, which in general has given good results. We still consider, however, that building a good forecasting model requires long-term data. Based on these results, it remains necessary to model this relationship with other statistical methods in order to obtain a better model with smaller MAPE and MAD.

References

[1] DIVO, D .S. 2010. Probability Analysis of Rainy Event With the Weilbull Distribution as a Basic Management
in Oil Palm Plantation. Proceeding of 2010 Conference on Industrial and Applied Mathematics, ITB Bandung.
[2] CHOW, C.S. 1987. The Seasonal and Rainfall Effects on Palm Oil in Peninsular Malaysia. Proceeding of 1987
Oil Palm/Palm Oil Conference, pp.46-55, Kuala Lumpur.
[3] OCHS R. AND DANIEL C. 1976. Research on techniques adapted to dry regions. In: Oil palm Research (Ed. By
R.H.V. Corley, J.J. Hardon & B.J. Wood), pp. 315-330, Elsevier, Amsterdam.
[4] GOH K.J.2000. Climatic requirements of the oil palm for high yields. In: Managing oil palm for high yields:
agronomic principles (Ed. By Goh K.J.), pp. 1-17, Malaysian Soc.Soil Sci. and Param Agric. Surveys, Kuala
Lumpur.
[5] AHMAD ALWI AND CHAN, K.W.1990.The Future of Oil Palm Yield Forecasting: Guthrie’s Autoregressive
Integrated Moving Average Methode. In: Proc.1989 Int. Palm Oil Dev. Conf. Agriculture (Ed. By. B. S. Jalani
et al.), PP.144-150 Palm Oil Rest. Inst. Malaysia, Kuala Lumpur.
[6] HAMILTON, JAMES.D. 1994.Time Series Analysis. Princeton University Press, 41 William St.Princeton. New
Jersey.
[7] WEI, W.W.S. 1990. Time Series Analysis, Univariate and Multivariate Methods, Canada. Addison Wesley
Publishing Company.

DIVO D. SILALAHI
SMART Research Institute (SMARTRI), PT. SMART Tbk, Indonesia
e-mail: [email protected]

J.P. CALIMAN
SMART Research Institute (SMARTRI), PT. SMART Tbk, Indonesia
e-mail: [email protected]

YONG YIT YUAN


SMART Research Institute (SMARTRI), PT. SMART Tbk, Indonesia
e-mail: [email protected]
Proceedings of ”The 6th SEAMS-UGM Conference 2011”
Applied Mathematics, pp. 379 - 386.

CONTINUOUSLY TRANSLATED FRAMELET

DYLMOON HIDAYAT

Abstract. Continuously Translated Framelet (CTF) is a frame generated by a single function in a


separable Hilbert space by continuous translations and discrete dilations. Continuously Translated
Framelet can be associated with its CTF operator. Decomposing the operator into its spectral family
leads to defining new operators. Applying this new operator on the generator function of the CTF
results new family of CTF.
Keywords and Phrases : Frame, wavelet, framelet, continuous, translated.

1. INTRODUCTION

Frames were first introduced by Duffin and Schaeffer to study an irregular sampling problem in the context of non-harmonic Fourier series [1]. Frames are different from bases: even though both can represent functions as series, frames may be linearly dependent, which implies that the representation by a frame is not unique. In some applications, such as signal processing, the redundancy of the representation plays an important role. The redundancy leads to robustness: the presence of uncorrelated noise is less destructive to the quality of the signal [2]. We develop a Continuously Translated Framelet (CTF) as a continuous translation and a discrete dilation of a framelet. We show that a CTF can be associated with an operator that will be called the continuously translated framelet operator (CTF operator). Given a CTF, the corresponding CTF operator is self-adjoint, bounded, and positive. It is well known that every bounded self-adjoint operator has a spectral decomposition [3]. Using the spectral decomposition of the CTF operator, we define a family of new CTFs.

1.1. Frame and Framelet. Let H be a separable Hilbert space with inner product ⟨·,·⟩ and norm ‖·‖. A sequence {f_k} in H is called a frame if there are two positive constants m and M such that

m‖f‖^2 ≤ Σ_k |⟨f, f_k⟩|^2 ≤ M‖f‖^2   for all f ∈ H.


The constant m and M are called the frame bound. If m = M then the frame is called tight.
Given a fixed function we mean byDj, Tkand are the following:

(1)
(2)

(3)

Definition 1 A family is called (discrete) framelet if there exist two positive constantsm
and M such that
for all (4)
The constant m and M are called the framelet bound. If m = M then the framelet is called
tight.
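
For a finite frame in R^n the frame bounds can be computed explicitly: m and M are the smallest and largest eigenvalues of the frame operator S = Σ_k f_k f_k^T. The sketch below is a finite-dimensional illustration of Definition 1 only (the three vectors chosen are a standard tight frame in R^2, not an example from this paper).

```python
# Sketch: frame bounds of a finite frame {f_k} in R^2 as the extreme eigenvalues
# of the frame operator S = sum_k f_k f_k^T (finite-dimensional illustration).
import numpy as np

frame = np.array([
    [1.0, 0.0],
    [-0.5, np.sqrt(3) / 2],
    [-0.5, -np.sqrt(3) / 2],
])                                   # three unit vectors at 120 degrees

S = frame.T @ frame                  # frame operator S = sum_k f_k f_k^T
eigvals = np.linalg.eigvalsh(S)
m, M = eigvals.min(), eigvals.max()
print(f"frame bounds: m = {m:.3f}, M = {M:.3f}")   # m = M = 1.5, a tight frame

# Check m*||f||^2 <= sum_k |<f, f_k>|^2 <= M*||f||^2 for a random f.
f = np.random.randn(2)
total = np.sum((frame @ f) ** 2)
print(m * (f @ f) <= total + 1e-12, total <= M * (f @ f) + 1e-12)
```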

1.2. Continuously Translated Framelet (CTF)

Definition 2 A family of function in is called a Continuously


Translated Framelet (CTF) if there exist two positive numbers m and M such that
for all (5)
The constant m and M are called the CTF bound. If m = M then the CTF is called tight.
Unless otherwise stated all summation is over the integer and the integral is over real
line . The function will be called the generator of the CTF. This definition is a
generalization of that of in [4].

Theorem 3 Every framelet is a CTF.


We will use the following Lemma to prove the theorem.

Lemma 4 [5]Let { be a framelet satisfying the Equation (4). Then for every positive odd
integer n the family { } remain a frame with the same bounds.
Proof of Theorem 3
FromLemma 4 we can see that the frame condition is true for all positive odd integern

If we let n run over odd integer infinitely, we may have



Hence { } satisfies equation (4), so it is a CTF.

2. THE CTF OPERATOR

2.1 Definition and Properties

Definition 5 Let be a CTF as in (3). An operator that will be called


continuously translated framelet operator (CTF Operator), F is defined by:
(6)
We will prove that CTF operator is self-adjoint, bounded, and positive.

Theorem 6 The following statements are equivalent:


(i) is a CTF(i.e. satisfying equation 5)
(ii) The operator F in (6) is a bounded operator with
The operator F is called the frame operator for CTF or shortly the CTF operator. Note that
F is automatically self adjoint since it is positive.
Proof. We introduce an operator G from to L2 ( ) with . The
operator G is bounded since

by the CTF condition. Therefore we can find the adjointG* of G as follows. Let us write g(j,z)
as a function in L2 ( ) Then

For all f, therefore we get

in the weak sense.


Let . Then

So F = G*G. Therefore

By the CTF condition on equation (5), the expression is bounded and we have

Where I is the Identity operator.


(ii). The converse is immediate since from (ii) we have

Since a positive operator is always self adjoint, then it is clear that the following corollary is
true.

Corollary 7 The CTF operator F is self adjoint bounded positive operator.

We know that a self adjoint operator can be decomposed by its spectral family as stated in the
following theorem:

Theorem 8 Let F be a bounded self adjoint operator in a Hilbert space with inf F = m and sup F = M. Then there exists a spectral family { } on the interval [m,M] such that

Proof: See [3].


Since for every non negative integer s,

and the fact that the spectral family satisfy , it implies that are
pairwise orthogonal projections, therefore we have the following definition:

Definition 9 For each non negative integer s,


(7)
As the limit of S when

2.2 Properties of the CTF Operator

Theorem 10 Let F be as in Equation (7). For each non negative integer s, Then is
bounded positive self adjoint and admits the following properties:

(i)
and
(ii)
Proof. (i) since and for a positive integers, then , then

and finally

similar computation works for s < 0


(ii) The proof is merely based on the fact that are pairwise orthogonal
projection. If we define the sum as in the proof of Theorem 8 and similarly , then

If we let then we get (ii).


The following corollary is just a direct implication of Theorem 10

Corollary 11
for
and
for

The following corollary is also an implication of Theorem 8 and Theorem 10

Corollary 12 Let be as in Definition 9. Then


(i)
(ii)
(iii)

3. FAMILY OF FRAMELETS

Now we will show that the CTF operator commutes with the dilation and translation operators.

Theorem 13 If Fis the CTF operator and Dm is the dilation operator then
FDm = DmF
Proof. Using a simple substitution, it is not hard to show that
(8)
and

(9)
More over

by equation (8)
by equation (9)

for all f.

Theorem 14 If F is the CTF operator and Tl is the translation operator then


FTl = TlF
Proof. A simple computation show
(10)
and
(11)
Therefore

by equation (10)

by
equation (11)

for every f.

Finally we prove that the CTF operator preserves dilations and translations

Corollary 15 For each integer s, the CTF operator preserves dilations and translations.
Proof. Let Dm and Tl be a dilation as in (1) and a translation operators as in (2) respectively.
We will show that and .
(0)
It is clear that for s = 0, F = I commutes with Dm and Tl. For s = 1
FDm = DmF andFTl = TlF
by Theorem 13 and Theorem 14. Hence by induction

Theorem 16 Let be a CTF as in Definition 2 and F(s) be as in the Definition 9. Define



(s)
then is a CTF.
(s)
Proof. That is a family of translates and dilates follows from Corollary 15. By Theorem 6 it
suffices to show that the CTF operator G(s) for the CTF (s)
is positive and bounded. Where

Consider

since is self adjoint.


By Corollary 11 for

soG(s) is bounded. The following corollary is a direct implication of Corollary 15.


(s)
Corollary 17 Let be a CTF with bounds m and M. then is a CTF with bounds
for
and
for
Moreover, we have an even more general result:

Corollary 18
(s)
is a CTF is a CTF for any

References

[1] DUFFIN, R. AND SCHAEFER, A.,AClass of non Harmonic Fourier Series, Trans. Amer..Math. Soc. 72, 341 – 366,
1952
[2] GRIBONWAL, R., DEPALLE, P., RODET, X., AND MALLAT, S., Sound Signal Decomposition using a high
Resolution matching pursuit, In Proc. Int. Computer Music Conf. 293 – 296, 1996
[3] RIESZ, F. AND SZ – NAGY, B., Functional Analysis, p 2, Translation from French by Leo F.Boron. New York:
FrederickUngar Pub., 1955
[4] DAUBECHIES, I., GROSSMANN, A., AND MEYER, Y., Painless Non Orthogonal Expansions. J. Math. Physics
27, 1271 – 1276, 1986
[5] CHUI, C. K. AND SHI, X. L., n x oversampling Preserves any Tight Affine Frame for Odd n,Proc. Of the AMS,
121 no. 2, 511 – 517, 1994.

DYLMOON HIDAYAT
Universitas Pelita Harapan, Department of Mathematics Education.
e-mail: [email protected]
Proceedings of ”The 6th SEAMS-UGM Conference 2011
Applied Mathematics, pp. 389 - 394.

MULTILANE KINETIC MODEL OF VEHICULAR TRAFFIC


SYSTEM

ENDAR H. NUGRAHANI

Abstract. A system of vehicular traffic can be modeled in various ways, such as


microscopic, macroscopic, as well as mesoscopic or kinetic models. A kinetic model
resembles the traffic as a system of interacting gas particles described by a distribution
function with a time evolution, formulated in a Boltzmann -like equation. This paper is
intended to derive multilane kinetic model, which is obtained from the formulation of
corresponding gain and loss terms, which are determined by using microscopic
interactions. Some numerical simulations are also presented .
Keywords and Phrases: vehicular traffic, multilane kinetic model.

1. INTRODUCTION
There are essentially three types of models which can be used to examine a system of
vehicular traffic, namely: microscopic, macroscopic, and kinetic models [1,2]. Microscopic
models focus on the modeling of individual cars with their deterministic or stochastic
interactions. Macroscopic models have the form of partial differential equations of
conservation type, which determine the relation of density and flux of a vehicular system.
Kinetic models predict the statistical distribution of cars with respect to their location and
velocity. Relevant literatures show that since the first continuum model describing traffic
flow given by Lighthill and Whitham [3], much progress has been made in the development
of microsocopic (follow-the-leader) models on the one hand, and of macroscopic (fluid-type)
models on the other hand [4].
A kinetic model, which is also known as mesoscopic model, describes the traffic as a
system of interacting gas particles, which is formulated as a Boltzmann-like equation. The
model is based on the time evolution of a one-vehicle probabilty distribution function in a
phase-space [5]. The first mesoscopic (or gas-kinetic) traffic flow model just appeared in
1960, when Prigogine and Andrews [6] wrote a Boltzmann-like equation to describe the time
evolution of a one-vehicle distribution function in a phase-space, where the position and the
velocity of the vehicles plays a central role [4].


However, until the 1990s mesoscopic traffic models did not get much attention from scientists, due to their limited ability to describe traffic operations outside the free-flow regime. Additionally, compared to macroscopic traffic flow models, gas-kinetic traffic models have a large number of independent variables, which increases the computational complexity. Nevertheless, in the last decades the scientific interest in mesoscopic traffic models has been renewed by the publication of works that apply these models to derive macroscopic traffic models. In fact, macroscopic equations for the relevant traffic variables can be derived from a Boltzmann-like traffic equation by averaging over the instantaneous velocity of the vehicles. This is a well-known procedure in kinetic theory [1,4,6,7].
The organization of this paper is as follows: Section 2 presents the underlying
multilane microscopic traffic model by considering some reaction thresholds. In Section 3,
the corresponding multilane kinetic model is discussed. Moreover, the results of some
numerical simulations are presented in Section 4. Finally, Section 5 gives a concluding
remark.

2. MULTILANE MICROSCOPIC MODEL

Consider a multilane road to be modelled. Let the car under consideration be denoted by c, and the leading and following cars be denoted by c and c, respectively, whereas the corresponding cars on the left and right lanes are cl, cl, and cr, cr. The velocities before and after an interaction are given by v and v′. The maximal velocity is denoted by w, such that all velocities are between 0 and w.

Let H_0 be the minimum headway between cars. The following are the thresholds for lane changing to the right (H_R), lane changing to the left (H_L), braking (H_B), accelerating (H_A), and free driving (H_F):

H_R = H_0 + vT_R
H_L = H_0 + vT_L
H_B = H_0 + vT_B
H_A = H_0 + δ + vT_A
H_F = H_0 + δ + wT_F

with T_R, T_L, T_B, T_A, T_F denoting the reaction times of each interaction, such that T_F ≤ T_A ≤ T_R ≤ T_L ≤ T_B, and δ denoting a time delay in the case of acceleration or free driving. Moreover, lane changing of cars to the right or left needs some additional space according to the following thresholds:

H_RS = H_0 + vT_RS   and   H_LS = H_0 + vT_LS

with T_RS, T_LS ≤ T_B. Therefore, the interactions under consideration can now, according to [1], be formulated as follows.

Interaction 1 (Lane changing to the right). If v  v and H R (v) is satisfied, then the car
will change lane to the right. Thus, a car will be able to pass the car ahead, only if there is
sufficient space on the right lane, i.e. if

xr  x  H RS (v) and x  xr  H RS (vr ) .

Moreover, c and c  will accelerate after lane changing with new velocity of

v~ , if xr   x  H F v~ , if x  x  H F
v   v  
v , otherwise v , otherwise

with v~, v~ according to a desired probability distribution function with density f D , i.e.
choose v ~  F 1 ( ) , with  a uniform random variable on the interval (0,1) with
D
v
FD (v)   f D (vˆ) dvˆ .
0

Interaction 2 (Lane changing to the left). If v  v and if H L (v ) is crossed, then the car
will change lane to the left, only if there is enough space available, i.e. if

xl  x  H LS (v) and x  xl  H LS (vl ) .

As before, c and c  will accelerate after lane changing with new velocity of

v~ , if xl   x  H F v~ , if x  x  H F
v   v  
v , otherwise v , otherwise

with v~, v~ defined above.

v  v and the braking threshold H B (v) is satisfied, then the car


Interaction 3 (Braking). If
will need to brake on the interval [  v, v ] to have actual speed lower than v . The new
velocity is given by

v   v   (v   v) ,    ,

with  uniformly distributed on [0, 1]. Braking will take place only under condition when

acceleration is still possible, i.e. for any v , v the following conditions should be satisfied:

TB 
H A (v)  H B (v) or     1.
TA w TA

Interaction 4 (Acceleration I, Follower). If v  v and acceleration threshold H A (v) is


reached, then the car will accelerate on the interval [ v, v ] at a speed above the actual speed
v . The new velocity is given as

v  v   (min( w, v )  v) ,   1 .

Acceleration is allowed only if there is still possibility of braking, i.e. the velocities v, v
should satisfy

 TA
H B (v)  H A (v) or 1     .
wTB TB

Interaction 5 (Acceleration II, Free driving). If v  v and free driving threshold H F (v) is
satisfied, then the car will accelerate freely to the desired velocity. The new velocity will be
distributed according to certain probability distribution function with density f D , i.e.

1
v  FD ( ) .
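
As a small illustration of how the thresholds of this section partition the headway axis for a car with velocity v, the following sketch evaluates H_R, H_L, H_B, H_A and H_F and lists the thresholds that a given headway lies below. It is illustrative only: the parameter values are arbitrary, and the full rules of Interactions 1-5, which also involve the neighbouring cars, are not implemented.

```python
# Sketch: evaluate the interaction thresholds of Section 2 for a car with velocity v
# and headway h. Parameter values are arbitrary illustrations, and the complete
# interaction rules (Interactions 1-5) also depend on the neighbouring cars.
H0 = 5.0                     # minimum headway
delta = 1.0                  # extra term for acceleration / free driving
T = {"F": 0.5, "A": 0.8, "R": 1.0, "L": 1.2, "B": 1.5}   # reaction times, TF<=TA<=TR<=TL<=TB
w = 30.0                     # maximal velocity

def thresholds(v):
    return {
        "H_R": H0 + v * T["R"],
        "H_L": H0 + v * T["L"],
        "H_B": H0 + v * T["B"],
        "H_A": H0 + delta + v * T["A"],
        "H_F": H0 + delta + w * T["F"],
    }

v, h = 20.0, 30.0            # current velocity and headway of the car
th = thresholds(v)
below = [name for name, value in sorted(th.items(), key=lambda kv: kv[1]) if h < value]
print(th)
print("thresholds the current headway lies below:", below)
```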

3. MULTILANE KINETIC MODEL

In a mesoscopic description, in analogy with the kinetic theory of gases, a single


vehicle distribution function f(x,c,t) can be defined in such a way that f(x,c,t)dxdc gives at
time t the number of vehicles in the road interval between x and x + dx and in the velocity
interval between c and c + dc. For a closed uni-directional single-lane road, the one-vehicle
distribution function satisfies the following kinetic traffic equation [1,7].

f f   c 
c   f   Q f , f 
t x c  t 

where the interaction term on the right hand side


M u lt i lan e Ki n et i c M od el of Veh i c u la r Tra ffi c S ys t em
391

 c
Q f , f    1  p c  c  f x, c, t  f x, c, t dc   1  p c  c f x, c, t  f x, c, t dc
c 0

describes the decelaration processes due to slower vehicles which cannot be immediately
overtaken. The first part of the interaction term corresponds to situations where a vehicle with
velocity c′ must decelerate to velocity c causing an increase of the one-vehicle distribution
function, while the second one describes the decrease of the one-vehicle distribution function
due to situations in which vehicles with velocity c must decelerate to even slower velocity c′.
The positive part of this interaction term is also known as the gain term, and the negative part
is the lost term [1].

From another point of view, let f ( x, v) be a probability density function of the car
( 2)
in lane α, and the probability density function for the car ahead is f ( x, v, h, v ) . Those
functions are defined as follows.
w
f ( x, v)   f( 2) ( x, v, h, v ) dh dv .
0 0

f( 2) ( x, v, h, v )  q(h ; v, f ( x, .)) F ( x  h, v ) f ( x, v)


where

F (v ; h, v, x) : Probability distribution function of the car with velocity v at
distance h to the car at position x with velocity v ;

q(h ; v, f ) : Probability of the car ahead of the car with velocity v , which has
velocity distribution function f.

The kinetic equation for the distribution functions ( f1 ,..., f N ) on N lane is established
through defining gain (G) and loss (L) terms of the interaction. Furthermore, the interaction
can be defined in the following equation
~
 t f  v  x f  C ( f1( 2) ,..., f N( 2) , f1 ,..., f N ) .
~
C is the interaction term, which is defined as follows.
~
C ( f1( 2) ,..., f N( 2) , f1 ,..., f N )
~ ~ ~ ~ ~ ~
 (GB  LB )( f 1 , f( 2) , f 1 )  (GA  LA  GF  LF )( f( 2) )
~ ~

 GR ( f(21) , f )  LL ( f 1 , f( 2) , f 1 ) (1   ,1 )
~ ~

 GL ( f , f(21) , f 2 )  LR ( f( 2) , f 1 ) (1   , N ),

where  i, j denotes the Kronecker symbol.



4. NUMERICAL SIMULATION

A numerical study has been carried out for the multilane microscopic model based on the above-mentioned interaction thresholds, following [8]. The simulation assumes a 3-lane highway with corresponding threshold parameters. The computed quantity is the average velocity of the cars in the system, denoted by u(t). The simulation is carried out for various values of the density parameter, namely 0.1, 0.2, 0.4, 0.6. The result is given in Figure 1.

Figure 1. Average velocity under multilane microscopic model.

This result shows that cars will move in lower velocity when the density becomes
higher, as a result of more limited space for the vehicles on the highway. This also resembles
quite well to traffic operations in real-life traffic.

5. CONCLUDING REMARK

Vehicular traffic system can be explained quite well using Boltzmann like kinetic
equation, which originally models the phenomena of interacting gas particles. In the case of
multilane highway, the microscopic interactions between vehicles can be used to resemble the
interacting gas particles in the model construction of kinetic model of vehicular traffic.

Acknowledgement. The author would like to thank the Department of Mathematics IPB
for its financial support. The author also thanks her former student Desyarti Safarini TLS for providing the simulation data and results.

References
[1] KLAR, A. AND WEGENER, R., A hierarchy of models for multilane vehicular traffic I: modeling, SIAM J. on
App. Math. 59:983-1001, 1998.
[2] ILLNER, R., BOHUN, C.S., MCCOLLUM, S. AND VAN ROODE, T. Mathematical Modelling: A Case Studies
Approach. AMS, Providence, Rhode Island, 2005.
[3] LIGHTHILL, M. J. AND WHITHAM, G. B., On kinematic waves: II. A theory of traffic flow on long crowded
roads. Proceedings of the Royal Society, Series A 229, 317-345, 1955.
[4] MARQUES JR., W. AND MENDEZ, A. R., On the kinetic theory of vehicular traffic flow: Chapman-Enskog
expansion versus Grad’s moment method, arXiv:1011.6603v1 [math-ph], 2010.
[5] HERTY, M., KLAR, A., PARESCHI, L. General kinetic models for vehicular traffic flow and Monte Carlo
methods, Preprint, TU Kaiserslautern - Germany, 2005.
[6] PRIGOGINE, I. AND ANDREW, F., A Boltzmann like approach for traffic flow, Oper. Res. 8: 789, 1960.
[7] HELBING, D. AND TREIBER, M., Enskog equations for traffic flow evaluated up to Navier-Stokes order,
Granular Matter 1, 21, 1998.
[8] KLAR, A. AND WEGENER, R., A hierarchy of models for multilane vehicular traffic II: numerical
investigations, SIAM J. on App. Math. 59:1002-1011, 1998.

ENDAR H. NUGRAHANI
Bogor Agricultural University.
e-mails: [email protected] / [email protected]
Proceedings of ”The 6th SEAMS-UGM Conference 2011”
Applied Mathematics, pp. 395–402.

ANALYSIS OF A HIGHER DIMENSIONAL SINGULARLY


PERTURBED CONSERVATIVE SYSTEM : THE BASIC
PROPERTIES

Fajar Adi-Kusumo

Abstract. We consider a 5-dimensional system of ODE which are conservative and


singularly perturbed. The system comes up from the normalization of a three-coupled
oscillators system which is motivated by the Ultra Low Frequency Variability (ULFV)
model in atmospheric research. The nonlinear terms of the system are preserved the
energy. It means that the total energy of the system is constant and the system is called
conservative. We assume that the linear terms of the system has small magnitude. With
these terms, the conservative system is linearly perturbed. In this case, the total energy of
the system is no longer constant. In this paper we present analysis of the 5-dimensional
system for the unperturbed case especially for the properties of the system near the
invariant solutions.

Key words and Phrases: unperturbed, singularly perturbed, invariant, conservative.

1. INTRODUCTION
This paper is a sequel of [2]. In [2], a 6-dimensional system of ODE which repre-
sented a three-coupled oscillators system was studied. The nonlinearity of the system
was quadratic, preserved the energy and perturbed by its linear term. The system is
a generalization of the system in [3] in the sense of the dimension and resonance. In
[3], the system is 4-dimensional system and reduced into 3-dimensional system in the
normal form. Analysis of the normal form of the 3-dimensional system was done in
[3, 4, 1]. In [2] the author combines two class of resonance i.e. the widely-separated
frequencies which is the extreme type of the higher order resonance and the lowest order
resonance in the system.

2010 Mathematics Subject Classification: 34C14, 34C15, 34C30, 37G05,


In application, the systems in [2] and [3] are motivated by an atmospheric model which is called the Ultra Low Frequency Variability (ULFV) model. The model represents the long-time behavior of the interaction patterns in the atmosphere.
Using the averaging method and some additional coordinate transformations, the 6-dimensional system was transformed into a 5-dimensional system, see [2]. The transformed system is called the normal form. The normal form preserves the general properties of the original system, but reduces its dimension. In this paper we analyze the basic properties of the normal form.

2. PROBLEM FORMULATION
Let us now consider a system of ODE in R5 :

ṙ = δ2 xr + δ1 pr + µ1 r
ṗ = (2δ1 q + ω1 )q − δ1 r2 + γxp + µ2 p
q̇ = −(2δ1 q + ω1 )p + γxq + µ2 q (1)
ẋ = (αx + βy + ω2 )y − δ2 r2 − γ(p2 + q 2 ) + µ3 x
ẏ = −(αx + βy + ω2 )x + µ3 y.

We assume that the parameters µi, i = 1, 2, 3, are small (0 < µi ≪ 1). For µi = 0 the system is called the unperturbed system, which is conservative. In this case, the solution of System (1) lies on an invariant hyper-sphere (a sphere in R5).
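To make the conservation property concrete, the following is a minimal numerical sketch (an illustration, not part of the analysis below) that integrates System (1) with µi = 0 and monitors V = ½(r² + p² + q² + x² + y²); the parameter values and the initial condition are arbitrary illustrative choices.

# Minimal check that, for mu_i = 0, solutions of System (1) stay on a hyper-sphere.
# Parameter values and initial condition are arbitrary illustrative choices.
import numpy as np
from scipy.integrate import solve_ivp

d1, d2, w1, w2, alpha, beta, gamma = 0.3, -0.4, 1.0, 1.5, 0.2, -0.5, 0.7
mu1 = mu2 = mu3 = 0.0

def rhs(t, state):
    r, p, q, x, y = state
    return [d2*x*r + d1*p*r + mu1*r,
            (2*d1*q + w1)*q - d1*r**2 + gamma*x*p + mu2*p,
            -(2*d1*q + w1)*p + gamma*x*q + mu2*q,
            (alpha*x + beta*y + w2)*y - d2*r**2 - gamma*(p**2 + q**2) + mu3*x,
            -(alpha*x + beta*y + w2)*x + mu3*y]

s0 = np.array([0.5, 0.2, -0.1, 0.3, 0.4])
sol = solve_ivp(rhs, (0.0, 200.0), s0, rtol=1e-10, atol=1e-12)
V = 0.5 * np.sum(sol.y**2, axis=0)
print("max |V(t) - V(0)| =", np.max(np.abs(V - V[0])))   # should be essentially zero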

Theorem 2.1. (1) If µ1 = µ2 = µ3 = 0 then the solutions of System (1) are invariant on a hyper-sphere in R5.
(2) If µi < 0, i = 1, 2, 3, then System (1) has only one invariant structure, that is the trivial equilibrium, which is globally asymptotically stable.
(3) If µi > 0, i = 1, 2, 3, then System (1) has only one invariant structure, that is the trivial equilibrium, which is globally unstable.

To prove the theorem, we use the first integral of System (1) for µi = 0, that is V = ½(r² + p² + q² + x² + y²). For µi ≠ 0, the function V is a Lyapunov function of System (1), which proves the global stability of the trivial equilibrium.
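The computation behind this argument can be verified symbolically. The following sketch (an illustration of the check, with symbol names chosen to mirror System (1)) differentiates V along the vector field and shows that V̇ reduces to µ1 r² + µ2 (p² + q²) + µ3 (x² + y²), which vanishes for µi = 0 and is sign-definite when all µi have the same sign.

# Symbolic check that dV/dt along System (1) equals mu1*r^2 + mu2*(p^2+q^2) + mu3*(x^2+y^2).
import sympy as sp

r, p, q, x, y = sp.symbols('r p q x y')
d1, d2, w1, w2, a, b, g, m1, m2, m3 = sp.symbols('delta1 delta2 omega1 omega2 alpha beta gamma mu1 mu2 mu3')

f = [d2*x*r + d1*p*r + m1*r,
     (2*d1*q + w1)*q - d1*r**2 + g*x*p + m2*p,
     -(2*d1*q + w1)*p + g*x*q + m2*q,
     (a*x + b*y + w2)*y - d2*r**2 - g*(p**2 + q**2) + m3*x,
     -(a*x + b*y + w2)*x + m3*y]

V = sp.Rational(1, 2)*(r**2 + p**2 + q**2 + x**2 + y**2)
Vdot = sum(sp.diff(V, s)*fs for s, fs in zip((r, p, q, x, y), f))
print(sp.simplify(Vdot - (m1*r**2 + m2*(p**2 + q**2) + m3*(x**2 + y**2))))   # prints 0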
The other interesting parameter is γ. In [2], this parameter shows the interaction between the system in the extreme type of higher order resonance class and the one in the lowest order resonance class. In Sections 3 and 4 we will show that the parameter influences the complexity of the equilibrium points of the system and also the existence of their bifurcation values.
The analysis in this paper is focused on exploring some properties of the conservative system, that is System (1) with µ1 = µ2 = µ3 = 0. In this case, we show the existence of the manifolds of equilibria and the dynamics of the system which lies on the hyper-sphere.

Symmetries in the system. For µ1 = µ2 = µ3 = 0 we have symmetries in the system. We define two types of transformation, namely the transformation of the phase space Φi : R5 → R5, i = 1, 2, 3, 4, and the transformation of the parameter space Ψj : R7 → R7, j = 1, 2, 3. Suppose that ξ = (r, p, q, x, y) and ζ = (δ1, δ2, ω1, ω2, β, γ, α).
The phase space of System (1) can be reduced to D = {ξ ∈ R5 | r > 0} when we apply the transformation Φ1(ξ) = (−r, p, q, x, y). In this case, the dynamics of the system in r < 0 is the mirror symmetry of the dynamics in D.
The combination of the phase-space transformation Φ2(ξ) = (r, −p, −q, −x, −y) and the parameter-space transformation Ψ1(ζ) = (−δ1, −δ2, ω1, ω2, −β, −γ, −α) does not change the dynamics of System (1). The dynamics of the system for β < 0 and α > 0 is equal to the dynamics for β > 0 and α < 0, but with the signs of δ1 and δ2 reversed. Related to the results in [1] and [3], we assume that β < 0.
The last symmetry is the combination of the transformation Φ3(ξ) = (r, p, −q, x, y) and the transformation Ψ2(ζ) = (δ1, δ2, −ω1, ω2, β, γ, α), which preserves the dynamics of the system. We have a similar situation for the combination of Φ4(ξ) = (r, p, q, x, −y) and Ψ3(ζ) = (δ1, δ2, ω1, −ω2, β, γ, α). By the last symmetry, we assume that ω1 > 0 and ω2 > 0.

3. THE EXISTENCE OF MANIFOLD OF EQUILIBRIA


A manifold of equilibria is a set of equilibria which can be seen as a manifold. There are four types of manifolds of equilibria in the conservative system. Two of them are in the r = 0 space. In this space, there is one equilibrium which is the trivial solution (the zero equilibrium) and there are two manifolds of equilibria. The trivial solution is a Lyapunov-stable equilibrium which is a center point. The other manifolds of equilibria are in the (r, q, y)-space and in the (r, p, x, y)-space.

3.1. Manifold of equilibria in r = 0. One of the manifolds of equilibria in the r = 0 space is the line l, i.e. αx + βy + ω2 = 0, which lies on the (x, y)-plane. The other manifold is

4δ1²γp² − 4βδ1²y² − 4ω2δ1²y + γω1² = 0.    (2)

The manifold lies on the plane which is parallel to the (p, y)-plane at q = −ω1/(2δ1). It is an ellipse for γ > 0 and a hyperbola for γ < 0. For γ = 0, the manifold vanishes.
Let V(ξ) = ½(r² + p² + q² + x² + y²). We define S(C) = {ξ | V(ξ) = C²}, which is a hyper-sphere in R5 with radius C. If the set of equilibria (2) is parameterized by y = y◦, we have an equilibrium

(r, p, q, x, y) = ( 0, ± √(γ(4βδ1²y◦² + 4ω2δ1²y◦ − γω1²)) / (2δ1γ), −ω1/(2δ1), 0, y◦ ),    (3)

which is a member of the manifold of equilibria (2). The equilibrium (3) lies on the intersection point between the manifold of equilibria (2) and the hyper-sphere. The
equilibrium exists for

−ω2/(2β) + √(ω2²δ1² + βγω1²)/(2βδ1) ≤ y◦ ≤ −ω2/(2β) − √(ω2²δ1² + βγω1²)/(2βδ1)

and ω2²δ1² + βγω1² ≥ 0.

3.2. Manifold of equilibria in (r, q, y)-space. The manifold of equilibria in this space is a combination of two equations, i.e. δ1r² − 2δ1q² − ω1q = 0, which is a hyperbola, and

δ1βy² − δ1(γ + 2δ2)q² + δ1ω2y − δ2ω1q = 0.    (4)

Equation (4) is an ellipse for γ > −2δ2, a hyperbola for γ < −2δ2, and a parabola for γ = −2δ2. Furthermore, we parameterize q = q◦ and obtain the equilibrium point

(r, p, q, x, y) = ( √(δ1q◦(2δ1q◦ + ω1))/δ1, 0, q◦, 0, −ω2/(2β) ± √H/(2δ1β) ),    (5)

with H = 4δ1²β(γ + 2δ2)q◦² + 4βδ1δ2ω1q◦ + ω2²δ1², which lies on the hyper-sphere. For δ1 > 0, the value of r in Equilibrium (5) exists for q◦ ≤ −ω1/(2δ1) or q◦ ≥ 0. Otherwise, for δ1 < 0, the value of r of the equilibrium exists for q◦ ≤ 0 or q◦ ≥ −ω1/(2δ1).
Furthermore, the value of y in Equilibrium (5) exists for

4δ1²β(γ + 2δ2)q◦² + 4βδ1δ2ω1q◦ + ω2²δ1² ≥ 0.

The solution of the inequality is as follows:

(1) if γ + 2δ2 > 0 and δ1 > 0, then
    −δ2ω1/(2δ1(γ + 2δ2)) + √G/(2δ1β(γ + 2δ2)) ≤ q◦ ≤ −δ2ω1/(2δ1(γ + 2δ2)) − √G/(2δ1β(γ + 2δ2)),
(2) if γ + 2δ2 > 0 and δ1 < 0, then
    −δ2ω1/(2δ1(γ + 2δ2)) − √G/(2δ1β(γ + 2δ2)) ≤ q◦ ≤ −δ2ω1/(2δ1(γ + 2δ2)) + √G/(2δ1β(γ + 2δ2)),
(3) if γ + 2δ2 < 0 and δ1 > 0, then
    q◦ ≤ −δ2ω1/(2δ1(γ + 2δ2)) − √G/(2δ1β(γ + 2δ2)) or q◦ ≥ −δ2ω1/(2δ1(γ + 2δ2)) + √G/(2δ1β(γ + 2δ2)),
(4) if γ + 2δ2 < 0 and δ1 < 0, then
    q◦ ≤ −δ2ω1/(2δ1(γ + 2δ2)) + √G/(2δ1β(γ + 2δ2)) or q◦ ≥ −δ2ω1/(2δ1(γ + 2δ2)) − √G/(2δ1β(γ + 2δ2)),

with G = β(βδ2²ω1² − ω2²δ1²γ − 2ω2²δ1²δ2).




3.3. Manifold of equilibria in (r, p, x, y)-space. The manifold of equilibria in this space lies on q = −ω1δ2/(δ1(2δ2 + γ)). Parameterization of the variable r by r◦ produces the equilibrium

ξ = ( r◦, ϑ/(γδ1(2δ2 + γ)), −ω1δ2/(δ1(2δ2 + γ)), ϑ/(γδ2(2δ2 + γ)), αϑ/(γδ2β(2δ2 + γ)) − ω2/β )    (6)

with

ϑ = √( −γδ2 ( δ1²(4δ2² + 4δ2γ + γ²)r◦² + ω1²δ2γ ) ).

Equilibrium (6) exists for −γδ2(δ1²(4δ2² + 4δ2γ + γ²)r◦² + ω1²δ2γ) ≥ 0, that is, when γδ2 < 0. In this case we have

r◦² ≥ −ω1²γδ2/(δ1²(2δ2 + γ)²) > 0,

that is,

r◦ ≥ ω1√(−γδ2)/(δ1(2δ2 + γ)) or r◦ ≤ −ω1√(−γδ2)/(δ1(2δ2 + γ)).

4. BIFURCATION ON THE HYPER-SPHERE


Analysis in this section is focused on the existence of the equilibria related to the
change of the radius of the hyper-sphere. We consider a hyper-sphere with radius R, i.e.
R² = r² + p² + q² + x² + y². By applying the transformation r² = R² − (p² + q²) − (x² + y²) to the conservative system (System (1) with µi = 0), we obtain a new 4-dimensional system of ODEs, i.e.
ṗ = Ω1q − δ1(R² − (p² + q²) − (x² + y²)) + γxp
q̇ = −Ω1p + γxq                                                    (7)
ẋ = Ω2y − δ2(R² − (p² + q²) − (x² + y²)) − γ(p² + q²)
ẏ = −Ω2x,

with Ω1 = 2δ1q + ω1 and Ω2 = αx + βy + ω2. The phase space of System (7) is the hyper-sphere with radius R. When the radius of the hyper-sphere is varied, we find that the existence of the equilibria of the conservative system changes. We show the situation in Theorem 4.1.
Theorem 4.1. Consider System (7) and the hyper-sphere R² = r² + p² + q² + x² + y² in R5. If the value of R is varied, then System (7) has two bifurcation values in the (p, x, y)-space, i.e.

R1 = ω2/√(α² + β²)   and   R2 = (1/δ1)√( (3ω2²δ1² − ω1²γ(β + γ)) / (4γ(β + γ)) ).

Proof. Firstly, we consider the manifold of equilibria in the (x, y)-plane, which is the line l (Ω2 = 0). The value of R1 is computed by calculating the distance between the line and the trivial equilibrium. For R < R1 there is no intersection point between the hyper-sphere and the line, so the nontrivial equilibrium on the line Ω2 = 0 does not exist. The nontrivial equilibrium on the line l exists only for R ≥ R1. For R = R1 the hyper-sphere has only one intersection point with the line l, and for R > R1 the hyper-sphere intersects the line l at two points. Secondly, on the (p, y)-plane we have a manifold of equilibria, see Equation (2), and for γ ≠ 0 we have

4δ1²γp² − 4βδ1²y² − 4ω2δ1²y + γω1² = 0.    (8)

The intersection between the hyper-sphere and the (p, y)-plane is the circle R² = p² + y². If we substitute the equation of the circle into Equation (8) then we have

R² = ((β + γ)/γ)y² + (ω2/γ)y − ω1²/(4δ1²).    (9)

Equation (9) has a maximum value at y = −ω2/(2(β + γ)), so we obtain the bifurcation value R2 by substituting this value of y into Equation (9). For γ > 0, the manifold of equilibria (2) is an ellipse. In this case, the hyper-sphere has no intersection point with the ellipse for R > R2, one intersection point for R = R2, and two intersection points for R < R2. □

By Theorem 4.1 we know that the existence of the nontrivial equilibria of System (7) depends on the radius of the hyper-sphere. In application, the radius of the hyper-sphere represents the energy of the system, and a nontrivial equilibrium can be interpreted as a structure in the atmosphere which does not change in time.

5. CONCLUDING REMARKS
Analysis in this paper is focused on the basic properties of System (1), those
are the invariant properties, the symmetries, and also the conservative properties. For
the conservative properties of the system, we show the existence of the equilibrium
points and the bifurcation related to the radius of the hyper-sphere. We still left some
complicated problem in this case, i.e. more complicated bifurcation values related to
the radius of the hyper-sphere which are at the intersection point between the hyper-
sphere and (r, q, y)-space and between the hyper-sphere and the (r, p, x, y)-space, and
the stability of the equilibria.
The other interesting problem which are still open is the dynamics of System
(1) for µi 6= 0. For p = 0, q = 0, and δ1 = 0, the dimension of the System (1)can
be reduced into 3-dimensional system. In [4] and [1] the authors found that there are
chaotic solution of the system.

Acknowledgement. The author wishes to thank the Department of Mathematics, Gadjah Mada University, for the grant supporting this research (Hibah Penelitian 2010 No. 29/JO1.1.28/PL.06.02/2010). He also thanks his wife Juwairiah, his sons Rizky and Radhya, and his daughter Azkia for their support during the research.

References
[1] Adi-Kusumo, F., Tuwankotta, J. M., and Setya-Budhi, W., Chaos and Strange Attractors in
Coupled Oscillators with Energy-preserving Nonlinearity, J. Phys. A: Math. Theor. 41, 255101
(17pp), 2008
[2] Adi-Kusumo, F., Normalisation of A Coupled-Three Oscillator with Energy-Preserving Quadratic
Nonlinearity Near 1 : 2 : ε - Resonance, Proceedings of IICMA 2009, Applied Mathematics, pp.
335-340, 2010.
[3] Tuwankotta, J. M., Widely Separated Frequencies in Coupled Oscillators with Energy-preserving
Quadratic Nonlinearity, Physica D 182, p.125-149, 2003.
[4] Tuwankotta, J. M., Chaos in coupled oscillators with widely separated frequencies and energy-preserving nonlinearity, Int. Journal on Nonlinear Mechanics 41, p. 180-191, 2006.

Fajar Adi-Kusumo
Applied Mathematics Group, Department of Mathematics, Gadjah Mada University, INDONE-
SIA.
e-mails: f [email protected]
Proceedings of ”The 6th SEAMS-UGM Conference 2011”
Applied Mathematics, pp. 403 – 412.

A MATHEMATICAL MODEL OF PERIODIC MAINTENANCE POLICY BASED ON THE NUMBER OF FAILURES FOR TWO-DIMENSIONAL WARRANTED PRODUCT

HENNIE HUSNIAH, UDJIANNA S. PASARIBU, A. H. HALIM, B. P. ISKANDAR

Abstract. This paper deals with a periodic replacement policy based on the number of failures after the expiry of warranty. The product is sold with a two-dimensional non-renewing failure replacement warranty. We model product failures using the one-dimensional approach, which allows modeling the effect of age and usage on the product's degradation. Under the periodic replacement policy, for a given usage rate y, the product is repaired with minimal repair when it fails at the first n − 1 failures and it is replaced with a new one at the nth failure. We obtain the global optimal value of n which minimizes the expected cost per unit time to the buyer and give a numerical example to illustrate the optimal solution.
Keywords and Phrases: periodic replacement with n-failure policy, two-dimensional warranty, expected cost per unit time.

1. INTRODUCTION

A non-renewing failure replacement warranty (NFRW) is a warranty policy in which all product failures under this warranty are rectified by the manufacturer at no cost to the buyer. From the buyer's perspective, when the product fails under warranty, the buyer only incurs the cost of being unable to use the product while the failed product is restored. Nevertheless, after the warranty expires, all maintenance costs, such as the costs of each repair and replacement, are borne by the buyer. As a result, determining the optimal maintenance policy after the expiry of a warranty, which minimizes the expected cost per unit time, is of great interest to the buyer. This paper deals with a periodic replacement with nth failure policy after the expiry of warranty for a repairable product sold with a two-dimensional warranty.
At the present time, all durable products are sold with warranty, and hence the study of maintenance policies for such products needs to consider the warranty aspect. Reviews of the various maintenance policies developed are abundant; examples can be found in [21] and [24]. Maintenance policies following the expiry of a warranty have been studied by [23], [9], [25] and [10]. However, most maintenance policies studied are characterized by one time scale, e.g. age.
A maintenance policy characterized by two time scales, i.e. age and usage, has received less attention in the literature (see [5]), despite the fact that in reality many products are sold under a two-time-scale warranty scheme. In the automotive industry, many products are sold with a two-dimensional warranty. For example, a dump truck is warranted for 36 months or 30000 miles, whichever comes first. Recently, [6] introduced a periodic replacement policy for a two-dimensional non-renewing failure replacement warranty, and they extended the model into a hybrid policy [7]. Those papers dealt with replacement in terms of the product's age. In practice, there are alternative ways of deciding when replacement should be carried out. [18] pointed out that for a large and complex system one could make only minimal repairs at each failure, and make a planned replacement at periodic times. Another alternative is to make a planned replacement when the nth failure has occurred.
In this paper, we consider a replacement policy in which minimal repair is carried out for every failure up to the penultimate failure (the (n − 1)th failure), and at the last predetermined failure (the nth failure) the product is replaced with a new one. Different from earlier papers addressing the same problems [2], [18], [13], the present paper takes a two-dimensional warranty policy into consideration. The paper is organized as follows. In Section 2 we give the model formulation. The periodic replacement policy considered depends on the usage rate and is characterized by one parameter. Section 3 deals with the analysis of the optimal replacement policy. Section 4 presents numerical examples for the case where the product has a Weibull failure distribution. Finally, in Section 5, we conclude with a brief discussion of future research.

2. MODEL FORMULATION

2.1 Warranty Policy and Coverage. The product is sold with a two-dimensional non-renewing failure replacement warranty (NFRW) with warranty region Ω, the rectangle [0, W) × [0, U), where W is the time limit and U is the usage limit. With NFRW, all failures under warranty are rectified at no cost to the buyer. It is assumed that the rectification is done through a minimal repair and the repaired product comes with the original warranty. The warranty ceases at the first instance when the age of the product reaches W or its usage reaches U, whichever occurs first. Wy denotes the warranty expiry time when the usage rate is y (see Fig. 1).
We assume that the usage rate Y (e.g., the annual distance travelled for an automobile) varies across the customer population but does not change for a given consumer. Here Y is a random variable with density function g(y), 0 ≤ y < ∞. Conditional on Y = y, the total usage at age x is a linear function of x and is given by

u = y x.    (3)

Wy = W   if y ≤ U/W,    (1)
Wy = U/y   if y > U/W.    (2)

Fig. 1. Two-dimensional warranty region Ω = [0, W) × [0, U).

2.2 Failure Modelling.


A) Approaches to Modelling Failures: Three approaches can be used to modelling
failures for products sold with two-dimensional warranties (See [20]). We use Approach 3
introduced by [16] to modelling failures. This approach assumes that the usage rate Y varies
from customer to customer but is constant for a given customer. For a given usage rate y ,
failures over time are modelled by a one-dimensional counting process. If failed products are
replaced by new ones, then this counting process is a renewal process associated with the
conditional distribution F ( x y) . If failed products are repaired then the counting process is
characterized by a conditional intensity function  ( x y) which is a non-decreasing function
of x and y . Moreover if all repairs are „minimal‟ [1] and repair times are negligible,
then  ( x y).  h( x y). The authors in [16], [8], [17] and [4] assume a linear relationship of the
form
 ( x y)  ax  byx  ax  bu , (4)
with a, b ≥ 0. The authors in [11] have developed a different method, applying concepts from the accelerated failure time and proportional hazards models (see [11] and [3]) to represent the effect of usage rate on reliability degradation. Conditional on the usage rate, the time to first failure has intensity function

λ(x | y) = y λ0(x y),    (5)

where λ0(x) is the base intensity function.

B) Modelling First Failure: For a product sold with a two-dimensional warranty, one needs to model the product's degradation taking into account both age and usage. The authors in [20] have introduced a more appropriate model which uses the accelerated failure time (AFT) formulation to represent the effect of usage rate on degradation. Let y0 denote the nominal usage rate associated with the component reliability. When the actual usage rate is different from this nominal value, the component reliability can be affected and this in turn affects the product reliability. As the usage rate increases above the nominal value, the rate of
degradation increases and this, in turn, accelerates the time to failure. Consequently, the product reliability decreases [increases] as the usage rate increases [decreases]. Using the AFT formulation, if T0 [Ty] denotes the time to first failure under usage rate y0 [y], then we have

Ty = T0 (y0 / y)^γ.    (6)

Furthermore, if the distribution function for T0 is given by F0(x; θ0), where θ0 is the scale parameter, then the distribution function for Ty is the same as that for T0 but with scale parameter given by

θ(y) = (y0 / y)^γ θ0,    (7)

with γ ≥ 1. Hence, we have

F(x; θ(y)) = F0((y / y0)^γ x; θ0).    (8)

The hazard and the cumulative hazard functions associated with F(x; θ(y)) are given by

h(x; θ(y)) = f(x; θ(y)) / (1 − F(x; θ(y))),    (9)

H(x; θ(y)) = ∫0^x h(u; θ(y)) du,    (10)

where f(x; θ(y)) is the associated density function.
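For concreteness, the following is a small sketch of these AFT quantities for a Weibull base distribution, as used later in Section 4. The parameter values, including y0, are assumed for illustration; the symbol names follow the reconstruction above.

# Sketch: hazard and cumulative hazard under the AFT model (6)-(10) for a Weibull base
# distribution F0(x; theta0) = 1 - exp(-(x/theta0)^beta). Values are illustrative only.
theta0, beta_, gamma_, y0 = 1.0, 2.0, 1.5, 2.0   # scale, shape, AFT exponent, nominal usage (assumed)

def theta(y):
    """Scale parameter under usage rate y, eq. (7)."""
    return (y0 / y) ** gamma_ * theta0

def hazard(x, y):
    """h(x; theta(y)) for the Weibull case, eq. (9)."""
    return (beta_ / theta(y)) * (x / theta(y)) ** (beta_ - 1)

def cum_hazard(x, y):
    """H(x; theta(y)) = (x/theta(y))^beta for the Weibull case, eq. (10)."""
    return (x / theta(y)) ** beta_

print(cum_hazard(2.0, 1.4))   # expected number of minimal repairs up to age 2 at usage rate 1.4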

2.3 The Expected Cost Per Unit Time. We first consider a periodic replacement policy for a given usage rate y and then the policy for various values of the usage rate. Let ny denote the failure-number parameter of the periodic replacement policy for a given usage rate. The periodic replacement policy is defined as follows.

For a given usage rate y, the product is repaired with minimal repair when it fails at the first n − 1 failures and it is replaced with a new one at the nth failure.

We seek the optimal value of ny which minimizes the expected cost per unit time to the buyer; let n*y denote the optimal value as y varies.
For a given usage rate y, the expected cost per unit time is obtained as follows. Let J(ny) denote the expected cost per unit time, which is given by

J(ny) = E[cost per cycle] / E[cycle length]    (11)

as in [22]. Let Cm and Cr (Cm < Cr) denote the cost of each minimal repair and the cost of a replacement, respectively. Since all failures during a cycle are rectified by minimal repair, failures occur according to a non-homogeneous Poisson process in (0, x) with intensity function λy(x) = h(x; θ(y)). Let Cd denote the cost incurred by the buyer at each minimal repair. Under NFRW, the cost per cycle depends on whether the replacement is performed within or outside the warranty region. Therefore, from the buyer's perspective, the cost
model can be formulated in two cases: T ≥ Wy and T < Wy.
Then, for T ≥ Wy, the expected cost per cycle is given by

Cd H(Wy) + [ny − H(Wy) − 1](Cd + Cm) + Cr,    (12)

where H(Wy) = ∫0^Wy λy(x) dx. This cost is composed of the downtime cost at each minimal repair within the warranty region, the downtime and minimal repair cost at each minimal repair after the warranty expires, and the replacement cost.
The expected cycle length is

lim_{T→∞} [ T P(Y_{ny} > T) + ∫0^T x dP(Y_{ny} ≤ x) ].    (13)

As a result, the expected cost per unit time is given by

J(ny) = { Cd H(Wy) + [ny − H(Wy) − 1](Cd + Cm) + Cr } / { Σ_{j=0}^{ny−1} ∫0^∞ ([H(x)]^j / j!) e^{−H(x)} dx }.    (14)

Likewise, for T < Wy, the expected cost per cycle is given by

Cd H(Wy) + Cr,    (15)

where H(Wy) = ∫0^Wy λy(x) dx. The expected cycle length is given by

lim_{W→∞} [ W P(Y_{ny} > W) + ∫0^W x dP(Y_{ny} ≤ x) ].    (16)

As a result, the expected cost per unit time is given by

J(ny) = { Cd H(Wy) + Cr } / { Σ_{j=0}^{ny−1} ∫0^∞ ([H(x)]^j / j!) e^{−H(x)} dx }.    (17)

3. GLOBAL OPTIMAL POLICY

In this section we obtain an optimal ny that minimizes J(ny). By using the inequality J(ny + 1) ≥ J(ny), we have, for xy ≥ Wy,

Σ_{j=0}^{ny−1} ∫0^∞ pj(x) dx / ∫0^∞ p_{ny}(x) dx − [ny − H(Wy) − 1] ≥ [Cd H(Wy) + Cr] / (Cd + Cm),    (18)

where pj(x) = ([H(x)]^j / j!) e^{−H(x)}.
Next, let L(ny) denote the left-hand side of (18), i.e.

L(ny) = Σ_{j=0}^{ny−1} ∫0^∞ pj(x) dx / ∫0^∞ p_{ny}(x) dx − [ny − H(Wy) − 1];    (19)

then we have the following theorem, as in [13] and [18] for the case of a non-warranted product.

Theorem 1. Suppose that the product is sold with a two-dimensional NFRW, λy(·) is IFR, and λy(∞) ≥ [Cd H(Wy) + Cr]/(Cd + Cm). Then there exists a finite and unique solution n*y if

L(ny) ≥ [Cd H(Wy) + Cr] / (Cd + Cm),   ny = 1, 2, ...,    (20)

where L(ny) is stated in eq. (19). The corresponding expected cost per unit time is given by eq. (14).

Proof: From the inequality J(ny + 1) ≥ J(ny) we have (18). If lim_{ny→∞} L(ny) ≥ [Cd H(Wy) + Cr]/(Cd + Cm), then, since L(ny + 1) − L(ny) is non-negative and ∫0^∞ p_{ny}(x) dx is a decreasing function of ny, there is a finite and unique optimal solution n*y, and hence L(n*y) ≥ [Cd H(Wy) + Cr]/(Cd + Cm). The corresponding expected cost per unit time is given by eq. (14).

By using the inequality J(ny + 1) ≥ J(ny), we have, for xy < Wy,

Σ_{j=0}^{ny−1} ∫0^∞ pj(x) dx / ∫0^∞ p_{ny}(x) dx ≥ Cr / Cd,    (21)

where pj(x) = ([H(x)]^j / j!) e^{−H(x)}.
Next, let L(ny) denote the left-hand side of (21), i.e.

L(ny) = Σ_{j=0}^{ny−1} ∫0^∞ pj(x) dx / ∫0^∞ p_{ny}(x) dx;    (22)

then we have the following theorem.

Theorem 2. Suppose that the product is sold with a two-dimensional NFRW, λy(·) is IFR, and λy(∞) ≥ Cr/Cd. Then there exists a finite and unique solution n*y if

L(ny) ≥ Cr / Cd,   ny = 1, 2, ...,    (23)

where L(ny) is stated in eq. (22). The corresponding expected cost per unit time is given by eq. (17).

Proof: From the inequality J(ny + 1) ≥ J(ny) we have (21). If lim_{ny→∞} L(ny) ≥ Cr/Cd, then, since L(ny + 1) − L(ny) is non-negative and ∫0^∞ p_{ny}(x) dx is a decreasing function of ny, there is a finite and unique optimal solution n*y, and hence L(n*y) ≥ Cr/Cd. The corresponding expected cost per unit time is given by eq. (17).

In the previous derivations, N*y = n is obtained separately on the intervals xy ≥ Wy and xy < Wy. However, in reality the periodic replacement time should be given without any limitation regarding those intervals. As a consequence, we need to derive T_y^{G*} (called the global optimum) without any pre-determined interval. Combining Theorems 1 and 2, the following corollary is obtained.

Corollary 1. Suppose that the product is sold with a two-dimensional NFRW, h(x; θ(y)) is IFR with xy ∈ (0, ∞), and Cr, Cd > 0:

If L(ny) ≥ [Cd H(Wy) + Cr]/(Cd + Cm), then there exists a finite and unique solution N_y^{G*} in xy ≥ Wy with the corresponding expected cost per unit time as in (14).

If [Cd H(Wy) + Cr]/(Cd + Cm) > L(ny) ≥ Cr/Cd, then N_y^{w*} lies in xy < Wy with the corresponding expected cost per unit time as in (17).

If L(ny) < Cr/Cd, then n_y^{G*} lies in xy < Wy with the corresponding expected cost per unit time as in (17).

Proof: This is clear as a consequence of Theorems 1 and 2.

4. NUMERICAL EXAMPLES

We consider a special case where the failure distribution function with nominal design usage rate y0 and scale parameter θ0 is F0(x; θ0) = 1 − exp(−(x/θ0)^β), where β is the shape parameter. The conditional failure distribution function, given the usage rate y, is F(x; θ(y)) = 1 − exp(−(x/θ(y))^β), with θ(y) given by (7). The hazard function associated with F(x; θ(y)) is h(x; θ(y)) = (β/θ0)(y/y0)^{γβ}(x/θ0)^{β−1}, and its cumulative hazard function is H(Wy) = (Wy (y/y0)^γ / θ0)^β. In what follows we obtain the optimal
solution n*y, provided that

n*y = [ ( Cr / ((β − 1) Cd) )^{1/β} ] + 1   and   nw = [ ( Wy (y/y0)^γ / θ0 )^β ] + 1,

where [k] is defined as the greatest integer contained in k.
For the numerical examples we consider the following values: (1) warranty policy: W = 2 (years) and U = 2 (10^4 km); (2) design reliability: nominal usage rate y0 (10^4 km per year), θ0 = 1 (year) and β = 2; (3) AFT model: γ = 1.5; costs: Cm = Cd = 1, Cr = 5. The numerical computation is done with Maple 9.5. The results in Table 1 indicate that an increase of the usage rate makes the replacement, triggered by the number of failures, occur earlier. Furthermore, Table 2 shows that if the costs of downtime and minimal repair are higher, due to the number of failures (and the increase of the usage rate), then the replacement period is shorter than in the opposite case. This is a realistic conclusion, since buyers would normally undertake replacement early to avoid a high total maintenance cost due to high penalties for downtime and repair costs.
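As a small illustration of how the expected cost per unit time can be evaluated and minimized numerically, the following sketch implements the post-warranty case, eq. (14), with the Weibull cumulative hazard reconstructed above. It is not the authors' Maple code; y0 and the other inputs are assumed values, so the numbers it produces are indicative only.

# Sketch: evaluate J(n) of eq. (14) for the Weibull/AFT case and search for the minimising n.
import math
from scipy.integrate import quad

W, U = 2.0, 2.0                        # warranty limits (years, 10^4 km)
theta0, beta_, gamma_, y0 = 1.0, 2.0, 1.5, 2.0   # y0 assumed
Cm, Cd, Cr = 1.0, 1.0, 5.0

def H(x, y):                           # cumulative hazard under usage rate y
    return (x * (y / y0) ** gamma_ / theta0) ** beta_

def p_j(x, j, y):                      # H(x)^j / j! * exp(-H(x)), computed via logs for stability
    Hx = H(x, y)
    if Hx == 0.0:
        return 1.0 if j == 0 else 0.0
    return math.exp(j * math.log(Hx) - Hx - math.lgamma(j + 1))

def expected_cycle_length(n, y):       # sum_{j=0}^{n-1} int_0^inf p_j(x) dx
    return sum(quad(p_j, 0.0, math.inf, args=(j, y))[0] for j in range(n))

def J(n, y):                           # expected cost per unit time, eq. (14)
    Wy = W if y <= U / W else U / y
    HW = H(Wy, y)
    return (Cd * HW + (n - HW - 1) * (Cd + Cm) + Cr) / expected_cycle_length(n, y)

y = 1.0
best_n = min(range(1, 12), key=lambda n: J(n, y))
print(best_n, round(J(best_n, y), 3))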
Table 1: n*y for 0.8 ≤ y ≤ 3.0

            W = [0,2) × [0,∞)           W = [0,2) × [0,2)
  y         n*y          J(n*y)         n*y          J(n*y)
  0.80      3 (>nw=1)    5.26           –            –
  1.00      3 (>nw=1)    5.12           3 (>nw=1)    5.12
  1.20      3 (>nw=1)    4.90           3 (>nw=1)    5.21
  1.40      2 (=nw=2)    4.23           3 (>nw=1)    5.40
  1.60      1 (<nw=3)    3.33           3 (>nw=1)    5.69
  1.80      1 (<nw=3)    2.35           3 (>nw=1)    6.09
  2.20      –            –              2 (=nw=2)    7.62
  2.60      –            –              1 (<nw=3)    9.92
  3.00      –            –              1 (<nw=3)    13.16
Table 2: n*y for y0 = 2, W = [0,2) × [0,∞)

  β      Cd       y        n*y           J(n*y)
  1.5    1.00     0.60     5 (>nw=1)     4.48
  1.5    2.00     0.60     3 (>nw=1)     5.39
  2.0    1.00     0.60     3 (>nw=1)     5.35
  2.0    2.00     0.60     2 (>nw=1)     5.94
  1.5    1.00     1.00     5 (>nw=1)     4.34
  1.5    2.00     1.00     3 (>nw=1)     5.19
  2.0    1.00     1.00     3 (>nw=1)     5.12
  2.0    2.00     1.00     2 (>nw=1)     5.64
  1.5    1.00     1.40     3 (>nw=1)     3.85
  1.5    2.00     1.40     1 (<nw=2)     4.13
  2.0    1.00     1.40     2 (=nw=2)     4.23
  2.0    2.00     1.40     1 (<nw=2)     4.09

5. CONCLUDING REMARK

In this paper, we have studied a periodic replacement with nth failure policy for a product sold with a two-dimensional warranty. For the case of two-dimensional warranties, one can also study other replacement policies, such as a replacement based on the cumulative repair cost [15], [16], [19]. These topics are currently under investigation.

Acknowledgement. Part of this work is funded by the Indonesian government through


Hibah Doktor to the first author.

References
[1] BARLOW, R.E., AND HUNTER, L., “Optimal preventive maintenance policies,” Operations Research, vol. 8, pp.
90–100, 1960.
[2] BARLOW, R.E., PROSCHAN, F., HUNTER, C.H., Mathematical Theory of Reliability, New York: John Wiley, 1965.
[3] BLISCHKE, W.R., AND MURTHY, D.N.P., Reliability: Modeling,Prediction, and Optimisation, New York: John
Wiley, 2000.
[4] CHUN, Y.H., AND TANG, K., “Cost analysis of two-attribute warranty policies based on the product usage rate”,
IEEE Transactions in Engineering Management, vol. 46, pp. 201-209, 1999.
[5] FRICKENSTEIN, S.G., AND WHITAKER, L.R, “Age replacement policies in two time scales,” Naval Research
Logistics, vol. 50, pp. 592-613, 2003.
[6] HUSNIAH, H., AND ISKANDAR, B.P., An Optimal Periodic Replacement Policy for a Product Sold with a Two-
Dimensional Warranty. Proceedings of the 9th Asia Pacific Industrial Engineering & Management Systems,
pp. 232-238, 2008.
[7] HUSNIAH, H., U. PASARIBU, A.H. HAKIM AND B.P. ISKANDAR. A Hybrid Minimal Repair and Age Replacement
Policy for Warranted Products. Proceedings of the 2nd Asia Pacific Conference on Manufacturing System, pp.
8.25-8.30, 2009.
[8] ISKANDAR, B.P., WILSON R.J., AND MURTHY D.N.P. Two-dimensional combination warranty policies. RAIRO
Operational Research, 28: 57-75, 1994.
[9] JUNG, G..M., AND PARK, D.H., Optimal maintenance policies during the post-warranty period. Reliability
Engineering and System Safety, 82:173-185. 2003.
[10] JUNG, K.M., HAN, S.S., AND PARK, D.H. , Optimization of cost and downtime for replacement model following
the expiration of warranty, Reliability Engineering and System Safety, 93:995-1003, 2008.

[11] LAWLESS, J. Statistical Models and Methods for Lifetime Data, Wiley, New York, 1982.
[12] LAWLESS J., HU, J., AND CAO, J., Methods for estimation of failure distributions and rates from automobile
warranty data, Lifetime Data Analysis , 1, 227-240, 1995.
[13] MAMABOLO R.M., AND BEICHELT F.E., Maintenance policies with minimal repair, Economic Quality Control,
19, p.143-166, 2004.
[14] MURTHY, D.N.P., AND ISKANDAR B.P., A New shock damage model: Part I-Model formulation and analysis,
Reliability Engineering and System Safety, 31 p. 191-208, 1991.
[15] MURTHY, D.N.P., AND ISKANDAR B.P., A New shock damage model: Part II-Optimal maintenance policies,
Reliability Engineering and System Safety, 31 p. 211-231, 1991.
[16] MURTHY, D.N.P., AND WILSON, R.J., Modelling two-dimensional warranties, In: Proceedings of the Fifth
International Symposium on Applied Stochastic Models and Data Analysis, Granada, Spain, 481-492, 1991.
[17] MOSKOWITZ, H., AND CHUN, Y.H. ,A Poisson regression model for two-attribute warranty policies.Naval
Research Logistics, 41,355-375, 1994.
[18] NAKAGAWA, T., Maintenance Theory of Reliability. Springer-Verlag, London, 2005
[19] NAKAGAWA, T., Shock and Damage Models. Springer-Verlag, London, 2007.
[20] NAT, J., ISKANDAR, B.P., MURTHY, D.N.P. , A repair-replace strategy based on usage rate for items sold with a
two-dimensional warranty. Reliability Engineering and System Safety, 94, 611-617, 2009.
[21] PIERSKALLA, W.P., AND VOELKER, J .A. , A survey of maintenance models: the control and surveillance of
deteriorating systems, Naval Research Logistics, 23, 353-388, 1976.
[22] ROSS, S.M., Stochastic Processes, John Wiley & Sons, Inc., Canada, 1996.
[23] SAHIN, I.,AND POLATOGLU, H., Maintenance strategies following the expiration of warranty. IEEE Transactions
on Reliability, 45(2), 220-228, 1996.
[24] VALDEZ-FLORES, C., AND FELDMAN, R.M., A survey of preventive maintenance models for stochastically
deteriorating single-unit systems. Naval Research Logistics, 36, 419-446, 1989.
[25] YEH, R.H., CHEN, M.Y., AND LIN, C.Y., Optimal periodic replacement policy for repairable products under free-repair warranty, EJOR, 176, 1678-1686, 2007.

HENNIE HUSNIAH
Department of Industrial Engineering Institut Teknologi Bandung, Jalan Ganesa 10, Bandung
40132, Indonesia
e-mail: [email protected]

UDJIANNA PASARIBU
Department of Mathematics Institut Teknologi Bandung, Jalan Ganesa 10, Bandung 40132,
Indonesia.
e-mail: [email protected]

A. HAKIM HALIM
Department of Mathematics Institut Teknologi Bandung, Jalan Ganesa 10, Bandung 40132,
Indonesia.
e-mail: [email protected]

BERMAWI. P. ISKANDAR
Department of Mathematics Institut Teknologi Bandung, Jalan Ganesa 10, Bandung 40132,
Indonesia.
e-mails: [email protected]
Proceedings of ”The 6th SEAMS-UGM Conference 2011”
Applied Mathematics, pp. 413 - 418.

THE EXISTENCE OF PERIODIC SOLUTION ON STN NEURON


MODEL IN BASAL GANGLIA

I MADE EKA DWIPAYANA

Abstract. The basal ganglia are the part of the brain which balances the motoric pathways of neuronal signals for movement. Parkinson's disease is one of the diseases caused by basal ganglia disability. This neurodegenerative disease is caused by a deficit of dopamine produced by the Substantia Nigra pars Compacta (SNPc) in the basal ganglia. This deficit reduces the activity of the Globus Pallidus externa (GPe) and increases the activity of the Subthalamic Nucleus (STN). The pattern of the neuron signal is almost periodic. For that reason, a periodic solution must exist in the solution of the STN neuron model constructed by considering the dynamics of ions within the cells. The STN itself receives stimulus input from the GPe. This input varies, so the periodic solution must exist for various values of the input current to the STN neuron. Using Matcont, the existence of Hopf bifurcations in the STN neuron model guarantees the existence of periodic solutions for various values of the input current.
Keywords and Phrases: basal ganglia, Parkinson, STN, Matcont.

1. INTRODUCTION
Neuron cells communicate with each other by inhibitory or excitatory stimulation. The STN neurons are located in the basal ganglia, which has the function of balancing body movement. The GPe neurons stimulate the STN cells by inhibition. A basal ganglia disorder may lead to Parkinson's disease.
The membrane potential is caused by differences in ion concentrations inside and outside the cell. The neuron membrane is selectively permeable to some ions such as Na+, K+, and Ca2+. When an ion concentration changes suddenly, it results in an action potential.

2. METHODS

Recent neuron models are derivatives of the Hodgkin-Huxley model developed from the single squid neuron system. Our model is extended from the Hodgkin-Huxley model and includes five compartments as a dynamical system. Those compartments are the membrane potential of the STN neuron (v), the slow gating variables (n, h, r) and the calcium ion concentration ([Ca]) (Terman, 2002). The system of differential equations for the STN neuron is:

Cm dv/dt = −IL − IK − INa − IT − ICa − IAHP − IG→S
dn/dt = φn [ (n∞(v) − n) / τn(v) ]
dh/dt = φh [ (h∞(v) − h) / τh(v) ]
dr/dt = φr [ (r∞(v) − r) / τr(v) ]
d[Ca]/dt = ε( −ICa − IT − kCa [Ca] ).

The leakage current is given by IL = gL(v − vL) and the other currents follow the Hodgkin-Huxley formulation (Beuter, 2003):

IK = gK n⁴ (v − vK)
INa = gNa m∞³(v) h (v − vNa)
IT = gT a∞³(v) b∞²(r) (v − vCa)
ICa = gCa s∞²(v) (v − vCa)
IAHP = gAHP (v − vK) [Ca] / ([Ca] + k1),

with τX(v) = τX0 + τX1 / (1 + exp(−(v − θXτ)/σXτ)), where X can be n, h or r. The steady state
voltage was formulated as follows: Y∞(v) = 1/(1 + exp(−(v − θY)/σY)), where Y can be n, m, h, a, r or s. The inactivation of the T current was determined by

b∞(r) = 1/(1 + exp((r − θb)/σb)) − 1/(1 + exp(−θb/σb)).

Here [Ca] is the concentration of Ca²⁺ ions in the cell. The current IG→S represents the stimulus input from the GPe neurons (Guyton, 1993). This input varies depending on the GPe stimulus to the STN neurons. For that reason, it is important to know the impact of different input currents on the STN cell.
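The structure of the model can be summarized in a short computational sketch. The conductances, reversal potentials, gating parameters and time constants below are illustrative placeholders only (the values used in the paper are those of Appendix I and Terman et al., 2002), so this is a sketch of the model's form rather than a reproduction of the simulations.

# Structural sketch of the STN single-cell model above; all parameter values are placeholders.
import numpy as np
from scipy.integrate import solve_ivp

def x_inf(v, theta, sigma):            # steady-state function 1/(1+exp(-(v-theta)/sigma))
    return 1.0 / (1.0 + np.exp(-(v - theta) / sigma))

def tau(v, tau0, tau1, theta, sigma):  # voltage-dependent time constant
    return tau0 + tau1 / (1.0 + np.exp(-(v - theta) / sigma))

def stn_rhs(t, state, I_GtoS=0.0):
    v, n, h, r, Ca = state
    gL, gK, gNa, gT, gCa, gAHP = 2.25, 45.0, 37.5, 0.5, 0.5, 9.0      # placeholders
    vL, vK, vNa, vCa = -60.0, -80.0, 55.0, 140.0                      # placeholders
    phi_n, phi_h, phi_r, eps, kCa, k1 = 0.75, 0.75, 0.2, 5e-5, 22.5, 15.0
    m = x_inf(v, -30.0, 15.0); a = x_inf(v, -63.0, 7.8); s = x_inf(v, -39.0, 8.0)
    b = x_inf(r, 0.4, -0.1) - x_inf(0.0, 0.4, -0.1)                   # b_inf(r) with offset, as above
    IL   = gL * (v - vL)
    IK   = gK * n**4 * (v - vK)
    INa  = gNa * m**3 * h * (v - vNa)
    IT   = gT * a**3 * b**2 * (v - vCa)
    ICa  = gCa * s**2 * (v - vCa)
    IAHP = gAHP * (v - vK) * Ca / (Ca + k1)
    dv = -IL - IK - INa - IT - ICa - IAHP - I_GtoS                    # Cm = 1 assumed
    dn = phi_n * (x_inf(v, -32.0, 8.0) - n) / tau(v, 1.0, 100.0, -80.0, -26.0)
    dh = phi_h * (x_inf(v, -39.0, -3.1) - h) / tau(v, 1.0, 500.0, -57.0, -3.0)
    dr = phi_r * (x_inf(v, -67.0, -2.0) - r) / tau(v, 40.0, 17.5, 68.0, -2.2)
    dCa = eps * (-ICa - IT - kCa * Ca)
    return [dv, dn, dh, dr, dCa]

sol = solve_ivp(stn_rhs, (0.0, 1000.0), [-60.0, 0.1, 0.5, 0.2, 0.3], max_step=0.1)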

3. RESULTS
First, the stability of the system above is calculated to find the critical points and the eigenvalues related to each critical point. This step is done by searching for the values of v, n, h, r, and [Ca] which make the system constant in time. Using the software Matcont, the equilibrium curve can be seen in Figure 1:

Figure 1: Equilibrium curve for a single STN neuron cell. There are two Hopf bifurcation points.
There are four regions with different dynamics. In Region I all the eigenvalues of the system have negative real part, so the critical points are stable in this region. In Region II and Region III the eigenvalues have positive real part, so the critical points are not stable and the solutions are periodic. This periodicity is guaranteed by the existence of the two Hopf bifurcation points. The last region is Region IV; in this region all eigenvalues have negative real part, which means the critical points are stable.
A positive value of IG→S means the STN cell receives an excitatory stimulus from GPe, a negative value means the STN cell receives an inhibitory stimulus, and IG→S = 0 means the STN cell has no stimulus. The interesting part is that the inhibition from GPe in this range cannot cancel the spiking of the STN cell, because the capability of the cell itself to spike is higher than the inhibition from the GPe cells. The periodic solution can be seen in Figure 2:

Figure 2: Amplitude for STN spike.


The amplitude is higher when the value of IG→S becomes smaller. The STN cell without stimulation from other cells has a repetitive spiking pattern. This repetitive pattern is shown in Figure 3. The period of the repetitive spiking is 367 ms with an amplitude of around 120 mV.

Figure 3: Regular spiking for STN cell.

The amount of inhibition received by the STN cell can be defined by a step function as follows:

The value of c is a constant which represents the amount of inhibition received by the STN cell. Figure 4 shows that a large enough current from GPe will stop the repetitive spiking of the STN cell, which then remains constant as if without any activity.
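The explicit step function is not reproduced here, so the following is only a hedged sketch of one consistent with the description of Figure 4 (inhibition of size c applied after 500 ms of regular spiking, for 450 ms); the onset time, duration and sign convention are assumptions.

# Hypothetical step-function inhibition consistent with the Figure 4 description.
# With the minus sign in the voltage equation, a positive value here acts as inhibition.
def I_GtoS(t, c=10.0, t_on=500.0, duration=450.0):
    return c if t_on <= t < t_on + duration else 0.0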

Figure 4: After 500 ms of regular spiking, the STN cell receives the inhibitory current for 450 ms.

The STN cell produces repetitive spiking in region I. When the inhibitory current is received for 450 ms, the system jumps to the fourth region. In region IV, the STN cell cannot produce repetitive spiking because all the eigenvalues there have negative real part, which means the solution converges to the critical point. At the end of the inhibition, the spikes become dense. This situation is called bursting. After the burst, the repetitive spiking appears again.

4. DISCUSSION

Bursting occurs within a very short time interval. The ionic dynamics of the cell are generally chaotic. Therefore, it is interesting to observe the behavior of every ion. The factors that influence the length of the bursting also need to be investigated. The inhibition itself, which makes the STN cell appear to stand still, should cause the ion flows into and out of the cell to be balanced.

Appendix I
The parameter values for the model discussed above are those given in (Terman, 2002); among them φh = 0.75, φn = 0.75, φr = 0.2.

References

[1]BEUTER, A., GLASS, L., MACKEY, M.C., TITCOMBE, M.S., Nonlinear Dynamics in Physiology and medicine, vol
25, Springer-Verlag, New York, 2003.
[2]DHOOGE, A., GOVAERTS, W., KUZNETZOV, Y.A., SAUTOIS, B., Matcont: A Matlab Package for Dynamical System
with Applications to Neural Activity. 2006.
[3]GUYTON, Buku Ajar Fisiologi Kedokteran, vol. 7, Penerbit Buku Kedokteran, 1993.
[4]SHERWOOD, L., Fisiologi Manusia: dari Sel ke Sistem, vol. 2, Penerbit Buku Kedokteran, 1996.
[5]TERMAN, D., RUBIN, J.E., YEW, A.C., WILSON, C.J., Activity Patterns in a Model for the Subthalamopallidal
Network of the Basal Ganglia, The journal of Neuroscience, April 1, 2002, 22(7):2963-2976.
[6]TERMAN, D., An Introduction to Dynamical Systems and Neuronal Dynamics. 2004.
[7]VERHULST, F., Nonlinear Differential Equations and Dynamical Systems. Springer-Verlag, Inc, New York. 1996.

I Made Eka Dwipayana


Mathematics Department, Faculty of Mathematics and Natural Science, Udayana
University
e-mail :[email protected]
Proceedings of ”The 6th SEAMS-UGM Conference 2011”
Applied Mathematics, pp. 419 - 426.

OPTIMUM LOCATIONS OF MULTI-PROVIDERS JOINT BASE STATION BY USING SET-COVERING INTEGER PROGRAMMING: MODELING & SIMULATION

I WAYAN SULETRA, WIDODO, SUBANAR

Abstract. A base station is very important to cellular telecommunication. It provides service to cellular subscribers. Because of the limited radio spectrum capacity, providers must build several base stations at different locations to give cellular service to their subscribers in the targeted area. However, it is not efficient if a base station site serves only one telecommunication provider, because its construction cost is very high. Optimization of joint base station locations is discussed in this paper. Every base station site will serve several GSM and CDMA providers. An integer programming model based on the set-covering problem is developed to determine the minimum number of joint base stations and their optimum locations. A branch-and-bound algorithm is used to find the optimum solution.
Keywords and Phrases : base station, set-covering, integer programming, cellular
telecommunication

1. INTRODUCTION
Cellular telecommunication technology has been advancing very fast up to now.
This brings many benefits both to the telecommunication providers and the cellular
subscribers. The providers can give subscribers a telecommunication service with a better
performance (in speed and reliability) and wider capacity by using newer technology.
However, because of the limited radius of base station coverage, advances in cellular technology cannot benefit many people unless the providers build a number of base stations in many areas.
Base Station is a site where the basic components of cellular telecommunication
network are located. These components are cellular tower, radio bases station (RBS), power
supply, sectoral antennas, microwave dish, baseband microwave processing, and shelter to
protect the equipment from the weather (http://www.withoutthecat.com). The base station


connects a subscriber in one location to the others in a long-distance location. Therefore, the
base station is very important to the cellular telecommunication.
However, it is not efficient if a base station site serves only one cellular provider, because the construction cost of a cellular tower (macro cell) is very high. In addition to the cellular tower cost, the provider must also fund the cost of the leased site area and the cost of hiring maintenance and security personnel to maintain and protect the base station site. All those site costs can be reduced if several providers use the same base station site. Those shared costs can then reduce the cellular service tariff imposed on the subscribers.
A macro cell tower can serve a number of providers according to its size. Each cellular provider, GSM or CDMA, operates in its own specific spectrum, which is different from the others'. No interference problem will exist across providers (across spectra) if the position of each sectoral antenna is adjusted correctly (see Laiho et al. [6]). Therefore, sharing a cellular tower among several providers is possible.
To our knowledge, the available literature on the base station placement problem implicitly assumes one provider for each base station (see e.g. Mathar and Niessen [8], Amaldi et al. [1-4], Chen and Yuan [5]). Because the traffic density or the number of subscribers of each provider in the same target area is not always the same, the number of base stations needed by each provider may be different. Therefore, it is not realistic to assume that each provider needs the same number of base stations. We propose an integer linear programming model based on the set-covering problem to fill this gap.

2. MODEL ASSUMPTIONS
To understand the system described by the model developed in this study, the
context and the assumptions used in developing the integer programming model will be
explained in advance in this section.
First, the subscriber location can be at any point in the area of cellular service coverage because of the mobility factor. The subscribers located in one particular area (one RW, rukun warga) are represented by a single point that is called a demand point. Many previous studies assume that a demand point only represents the subscribers of a single provider. This assumption is contrary to reality, because at one demand point there are subscribers of multiple providers. In this study, one point represents the demand of multiple providers and is called a multi-mode demand point.
Second, the joint base station (JBS) which serves the subscribers located at the same point is most likely different if the providers are different. This happens because the JBS consists of a number of radio base stations (RBS) with distinct coverage radii. Each RBS in one JBS belongs to a distinct provider and each has a distinct coverage radius according to the demand traffic density of the corresponding provider. A different coverage radius per traffic density is intended to maintain the quality of service in an area.
Third, each RBS is assumed to have sufficient capacity to serve the subscribers in its service territory. The coverage radius of an RBS can be adjusted through the transmitter power adjustment, where the signal is channeled through the sectoral antennas. An RBS with a shorter coverage radius is set for denser traffic areas and, vice versa, an RBS with a longer coverage radius is set for less dense traffic areas such as suburban or rural areas. The coverage radius adjustment is expected to meet the level of GOS (grade of service) specified by the provider.
Fourth, the alternative locations of the JBS are assumed to have met the requirements for the location of a base station, such as no radio wave interference from other sources, no blocking hills, no tall buildings, and no large trees. In addition, the land area is sufficient and meets the eligibility requirements for establishing a tower, and the locations meet the rules set by the local government. Comprehensive requirements for a base station site have been discussed by Freeman (2007, pp. 38-43).
The fifth, RBS that is assumed in this study is the RBS for GSM (2G), RBS CDMA
2000 1x for CDMA provider, and multi-standard or multi-mode RBS. A multi-standard RBS
is capable of serving multiple systems simultaneously. For example, a family of RBS 6000 by
Ericsson can provide GSM services, WCDMA service, and LTE service at the same time
(Ericsson, 2009). Placement of RBS which is devoted only to Internet data service such as
EV-DO, WCDMA, or LTE, is not discussed in this study.
The sixth, inter-provider inter-cell interference, intra-provider inter-cell interference,
and intra-provider intra-cell interference are not addressed in this study. Each provider has its
own spectrum which is different from one provider to another provider. Therefore, there is no
interference between provider. Spectrum of cellular telecommuication is allocated by
Ditjenpostel Kementerian Kominfo (FCC in the USA, and ITU worldwide) to avoid
interference between cellular providers.
Intra-provider inter-cell interference in the GSM system is minimized by the frequency division mechanism. The spectrum of each GSM provider is divided into several frequency slots, and a distinct frequency slot is allocated to each pair of adjacent cells. In addition, a so-called guard band (in kHz) is usually sacrificed as the gap between two consecutive frequency slots (see Weicker et al. [10]). Interference is most likely to occur between two cells that use the same frequency slot (co-channel) if the coverage radii of the two cells overlap. Fortunately, this interference can be ignored because two co-channel cells are separated by a considerable distance, with several other cells between them. For systems that do not use frequency division, such as CDMA 2000, WCDMA, and LTE, interference between cells is prevented by minimizing the overlap between two cells.
Interference from other sources that generate electromagnetic waves at radio
frequency is not discussed. This kind of interference is included in the terms of the
assumption of a base station locations that have been described previously.

3. SET COVERING INTEGER LINEAR PROGRAM


The model developed in this study is an integer programming model based on the set-covering problem. The basic set-covering model requires that each demand point is served by at least one facility (supply point). In this study the requirement is the same: each subscriber located at a demand point must be covered by at least one JBS. The goal is to minimize the number of JBSs that can serve all demand points of all providers. Because the number of JBSs required by each provider differs according to the demand traffic, the model also optimizes the JBS site selection for each provider simultaneously.
The following is the integer linear programming model of the basic set-covering problem as reformulated by Owen & Daskin [9]:

Minimize  Σ_j cj Xj    (1)

Subject to:  Σ_{j∈Ni} Xj ≥ 1   ∀i,    (2)

             Xj ∈ {0, 1}   ∀j,    (3)

where
cj = construction cost of a facility at point j,
S = coverage radius of the facility,
Ni = the set of facility locations j that can serve demand point i,

Ni = { j | dij ≤ S }   ∀i,    (4)

dij = the distance from demand point i to facility location j,
Xj = 1 if point j is selected as a facility location, and 0 otherwise.
The model (1)-(3) describes a single type of demand spread over the demand points i and a single type of facility that can be placed at points j to serve the demand. The goal is to minimize the number of facilities such that every demand point is served by at least one facility. This standard model is not suitable for determining the locations of cellular JBSs. If applied to one provider, it gives optimum facility locations for that provider, but not necessarily for the others. Therefore, the model should be extended to determine the optimum facility locations for many providers simultaneously. A single location housing the RBSs of several providers is more efficient than one facility location per provider, as described previously.
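As a small illustration of how the coverage sets Ni of eq. (4) can be constructed from the distance data, the following sketch uses made-up coordinates and an assumed radius S; it is not the Surakarta data set used later.

# Build the coverage sets N_i = { j : d_ij <= S } of eq. (4) from point coordinates.
import numpy as np

demand = np.array([[0.0, 0.0], [1.0, 2.0], [3.0, 1.0]])     # demand points i (made-up)
sites  = np.array([[0.5, 0.5], [2.5, 1.5], [4.0, 4.0]])     # candidate facility points j (made-up)
S = 2.0                                                      # coverage radius (assumed)

d = np.linalg.norm(demand[:, None, :] - sites[None, :, :], axis=2)   # distance matrix d_ij
N = {i: set(np.flatnonzero(d[i] <= S)) for i in range(len(demand))}
print(N)   # {0: {0}, 1: {0, 1}, 2: {1}}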
In this study, the basic set-covering model is extended by introducing the concepts of multiple-type demand and multiple-type facility. The proposed model describes the context of determining the locations of cellular JBSs. There are many types of demand (i.e. subscribers of multiple providers) at one demand point and many types of facility (RBSs of many providers with distinct coverage radii) placed at one JBS location. Let Xj and Yrj be two binary variables which respectively represent whether alternative location j is selected as a JBS location or not, and whether provider r places its RBS at location j or not. The goal is to minimize the number of JBSs, each of which is shared by at least two providers (in accordance with the regulation of Menkominfo). Here is the formulation of the proposed integer linear program.

Minimize  Σ_j cj Xj + Σ_r Σ_j wrj Yrj    (5)

Subject to:  Σ_r Σ_j tikrj Yrj ≥ 1   ∀i, k,    (6)

             Yrj − Xj ≤ 0   ∀r, j,    (7)

             Σ_r Yrj ≥ 2 Xj   ∀j,    (8)

             Xj ∈ {0, 1}   ∀j,    (9)

             Yrj ∈ {0, 1}   ∀r, j,    (10)

where
cj = fixed construction cost of a JBS at location j,
wrj = the cost incurred by provider r if it puts an RBS at location j,
Yrj = 1 if alternative location j is selected as a location of an RBS of provider r, and 0 otherwise,
Xj = 1 if alternative location j is selected as a JBS location, and 0 otherwise,
tikrj = 1 if a subscriber of type k at demand point i is served by the RBS of provider r at location j, and 0 otherwise.

Equation (5) is the objective of the model, i.e. to minimize the total cost of the JBSs. Equation (6) is the requirement that each subscriber is served by at least one corresponding RBS. Equation (7) maintains model consistency, i.e. if provider r places its RBS at location j, then alternative location j must be selected as a JBS location. If Yrj = 1 for any r, then Xj = 1; otherwise, if Yrj = 0 for all r, then Xj = 0 because of the minimization in the objective function. Equation (8) meets the requirement of the Kepmenkominfo, i.e. each JBS is used by at least two providers. Equations (9) and (10) are the binary variable constraints. The parameter wrj represents the provider's preference; a greater value of wrj reflects that provider r does not like location j.
The binary parameter tikrj represents the relationship between the demand points and the facility points. Index i (i = 1, 2, ..., M) is the demand point, index k (k = 1, 2, ..., K) is the type of subscriber, index r is the type of facility, i.e. the RBS of a distinct provider (r = 1, 2, ..., K), and index j (j = 1, 2, ..., N) is the location point of the facility, i.e. the alternative JBS location. The value of the parameter tikrj = 1 if subscriber type k at demand point i can be served by RBS r at location j; on the contrary, tikrj = 0 if k at location i is not covered by r at location j.
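A minimal sketch of the model (5)-(10), written with the open-source PuLP modeller rather than the Lingo implementation used in the next section, is given below. The tiny data set (costs, coverage parameter) is made up purely to show the structure of the formulation.

# Sketch of model (5)-(10) with PuLP; data are tiny made-up values, not the Surakarta data.
import pulp

M, K, N = 3, 2, 3                    # demand points, providers, candidate JBS sites
c = {j: 10.0 for j in range(N)}      # fixed JBS construction cost c_j (assumed)
w = {(r, j): 1.0 for r in range(K) for j in range(N)}    # provider preference cost w_rj (assumed)
t = {(i, k, r, j): 1 if k == r else 0                    # toy coverage: provider r covers k == r
     for i in range(M) for k in range(K) for r in range(K) for j in range(N)}

prob = pulp.LpProblem("joint_base_station", pulp.LpMinimize)
X = pulp.LpVariable.dicts("X", range(N), cat="Binary")
Y = pulp.LpVariable.dicts("Y", [(r, j) for r in range(K) for j in range(N)], cat="Binary")

prob += pulp.lpSum(c[j] * X[j] for j in range(N)) + \
        pulp.lpSum(w[r, j] * Y[r, j] for r in range(K) for j in range(N))            # (5)
for i in range(M):
    for k in range(K):
        prob += pulp.lpSum(t[i, k, r, j] * Y[r, j]
                           for r in range(K) for j in range(N)) >= 1                 # (6)
for r in range(K):
    for j in range(N):
        prob += Y[r, j] - X[j] <= 0                                                  # (7)
for j in range(N):
    prob += pulp.lpSum(Y[r, j] for r in range(K)) >= 2 * X[j]                        # (8)

prob.solve(pulp.PULP_CBC_CMD(msg=False))       # branch-and-bound via the bundled CBC solver
print([j for j in range(N) if X[j].value() == 1])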

4. NUMERICAL SIMULATION
To demonstrate the computation and applicability of the proposed model, we use the data on the cellular tower distribution (Figure 1) and the map of the demand point distribution (RW points, rukun warga, Figure 2), both for the City of Surakarta, Central Java. Of the 105 tower locations in use today, only 43 points are eligible to serve as alternative JBS locations (j = 1, 2, ..., 43). The population of Surakarta is around 600,000 citizens, spread over 598 RW. The center point of each RW serves as the location of a demand point (i = 1, 2, ..., 598). There are 11 cellular service providers in Surakarta, consisting of five GSM providers and six CDMA providers (k = 1, 2, ..., 11) (r = 1, 2, ..., 11). The binary values of the parameter tikrj are hypothetical data, with reference to the approximate coverage radius of each RBS of each provider. The RBS coverage radius for the GSM system in urban areas ranges from 0.5 km to 2 km, while for CDMA it ranges between 3 and 5 km due to the large capacity of the CDMA system and its smaller number of subscribers. The values of the parameters cj and wrj are also hypothetical.

Figure 1. Alternative locations of JBS

Figure 2. Map of demand point distribution

By using the branch-and-bound method in the Lingo software with the @OLE interface to MS Excel,
the proposed model is applied to the numerical data. It yields 34 locations as optimum JBS
locations out of the 43 available alternatives. The optimum number of RBSs for each provider is
also obtained from the simulation results, as depicted in Figure 3. In this simulation, Tsel and
Isat are the two biggest providers with the largest numbers of subscribers.

Figure 3. Optimum number of RBS for each provider

5. CONCLUDING REMARK
We propose a multiple-type-demand and multiple-type-facility location model applied
to the JBS location problem. The problem is formulated as a single-objective integer linear
program. The standard set covering model is a special case of our model, obtained by setting it to a
single demand type and a single facility type. We impose many assumptions about interference
in this study and use the non-polynomial branch-and-bound algorithm to solve the integer
model. It would be interesting for future study to accommodate a multiple-objective approach
to represent a more realistic problem, i.e., a problem with three different stakeholders (provider,
subscriber, and government). Relaxation of the interference assumptions may also be interesting.

Reference
[1] AMALDI, E., CAPONE, A., AND MALUCELLI, F., Discrete Models and Algorithms for the Capacitated Location
Problem Arising in UMTS Network Planning. Proceedings of the 5th International Workshop on Discrete
Algorithms and Methods for Mobile Computing and Communications (DIAL-M), ACM, pp.1–8, 2001.
[2] AMALDI, E., CAPONE, A., AND MALUCELLI, F., Planning UMTS Base Station Location: Optimization Models
With Power Control and Algorithms. IEEE Transactions on Wireless Communications, vol.2, no.5, pp.939-952.,
2003.
[3] AMALDI, E., BELOTTI, P., CAPONE, A., AND MALUCELLI, F., Optimizing base station location and configuration in
UMTS networks. Annals of Operations Research, 146, pp.135–151, 2006.
[4] AMALDI, E., CAPONE, A., AND MALUCELLI, F., Radio planning and coverage optimization of 3G cellular
networks. Wireless Networks, 14, pp.435–447, 2008.
[5] CHEN, L. AND YUAN, D., Solving a minimum-power covering problem with overlap constraint. European
Journal of Operational Research, 203, pp.714–723, 2010.
[6] LAIHO, J., WACKER, A., AND NOVOSAD, T., Radio Network Planning and Optimisation for UMTS, 2nd edition.
John Wiley & Sons Ltd, Chichester, England, 2006.
[7] Learn about what is on a cell tower: Without the Cat—Sponsored by MD7 (http://www.withoutthecat.com), URL
retrieved 12 May 2011.
[8] MATHAR, R. AND NIESSEN, T., Optimum positioning of base stations for cellular radio networks. Wireless
Networks, 6, pp.421–428, 2000.


[9] OWEN, S. H. AND M. S. DASKIN. Strategic Facility Location: A Review, European Journal of Operational
Research, 111 , 423-447,1998.
[10] WEICKER, N., SZABO, G., WEICKER, K., AND WIDMAYER, P. Evolutionary Multiobjective Optimization for Base
Station Transmitter Placement With Frequency Assignment. IEEE transactions on evolutionary computation, 7,
2, pp.189-203, 2003.

I Wayan Suletra
Industrial Engineering Department, Sebelas Maret University, Solo.
e-mail: [email protected]

Widodo
Department of Mathematics Gadjah Mada University
e-mail: [email protected]

Subanar
Department of Mathematics Gadjah Mada University
e-mail: [email protected]
Proceedings of ”The 6th SEAMS-UGM Conference 2011”
Applied Mathematics, pp. 427–434.

EXPECTED VALUE APPROACH FOR SOLVING


MULTI-OBJECTIVE LINEAR PROGRAMMING WITH
FUZZY RANDOM PARAMETERS

Indarsih, Widodo, Ch. Rini Indrati

Abstract. In this paper we present the expected value approach for solving multi-
objective linear programming (MOLP) with fuzzy random parameters. There are two
MOLPs. The first is an MOLP with fuzzy random objective function coefficients and the
second is an MOLP with fuzzy random right hand sides. The expected value approach
transforms the problems into MOLPs with fuzzy parameters. In this paper we use triangular
fuzzy numbers for the fuzzy random objective function coefficients and non decreasing linear
fuzzy numbers for the fuzzy random right hand sides. We also introduce a method to solve
the resulting MOLP with fuzzy parameters.

Keywords and Phrases: Expected value approach, fuzzy random parameters.

1. INTRODUCTION AND PRELIMINARIES


1.1. Introduction. In real problems, the values of the parameters of a mathematical
model are sometimes uncertain or imprecise. An MOLP with uncertain parameters is
known as a probabilistic MOLP. One method to solve the probabilistic MOLP is the ex-
pected value approach. The approach transforms the problem into a deterministic problem
[2]. An MOLP with imprecise parameters is known as a fuzzy MOLP. Many approaches
have been proposed to solve the fuzzy MOLP. The approach depends on the parameters and
the type of fuzzy number. Jana and Roy [4] proposed a method to solve MOLP with
fuzzy resources and MOLP with fuzzy coefficients and fuzzy resources. Maleki et al. [6]
proposed a method for solving linear programming in which all parameters are fuzzy numbers,
and linear programming with fuzzy variables.
An MOLP with uncertain and imprecise parameters is known as an MOLP with
fuzzy random parameters. In this paper we investigate two kinds of MOLPs with fuzzy
random parameters: MOLP with fuzzy random objective function coefficients
and MOLP with fuzzy random right hand sides (rhs). An rhs is a resource in a constraint
of the problem. Many definitions of fuzzy random variables have been proposed. Nevertheless,
we choose the definition of fuzzy random variables from Luhandjula and Gupta [5].

The probability concept is preserved in this definition. We propose the expected value
approach to transform the problems to MOLP with fuzzy parameters. The approach
preserves the linear properties. Next, we solve the MOLP by a new method.

1.2. Preliminaries. We give a brief summary of the basic theory from Bector and
Chandra [1] and a short review of our previous results [3].

1.2.1. Fuzzy Number and Ranking Fuzzy.


Definition 1.1. The fuzzy number $\bar A$ is called a non decreasing linear fuzzy number if its
membership function $\mu_{\bar A}$ has the following form:
$$\mu_{\bar A}(x) = \begin{cases} 0 & \text{for } x < a_l,\\[2pt] \dfrac{x-a_l}{a-a_l} & \text{for } a_l < x \le a,\\[4pt] 1 & \text{for } a \le x \le b, \end{cases}$$
where $a_l$, $a$ and $b$ are given. The non decreasing linear fuzzy number is denoted by
$\bar A^{ltt} = (a_l, a, b)$.
Definition 1.2. The fuzzy number $\bar A$ is called a non increasing linear fuzzy number if its
membership function $\mu_{\bar A}$ has the following form:
$$\mu_{\bar A}(x) = \begin{cases} 1 & \text{for } c \le x \le a,\\[2pt] \dfrac{a_u - x}{a_u - a} & \text{for } a < x \le a_u,\\[4pt] 0 & \text{for } x > a_u, \end{cases}$$
where $c$, $a$ and $a_u$ are given. The non increasing linear fuzzy number is denoted by
$\bar A^{ltn} = (c, a, a_u)$.

In this paper we will solve the fuzzy MOLP with fuzzy objective function coef-
ficients by modified simplex method. Because we have fuzzy parameters, we need to
use ranking fuzzy to obtain optimal criteria. There are many methods proposed to
determine the ranking of fuzzy numbers. In this paper, we choose the ranking fuzzy
number in the following definition.
Definition 1.3. The ranking of a fuzzy number $\bar A$ is defined by
$$R(\bar A) = \frac{\int_{a_l}^{a_u} x\,\mu_{\bar A}(x)\,dx}{\int_{a_l}^{a_u} \mu_{\bar A}(x)\,dx},$$
where $a_l$ and $a_u$ are the lower and the upper limits of the support of $\bar A$.

The formula $R(\bar A)$ above represents the centroid of $\bar A$. In the MOLP with fuzzy
random objective function coefficients, we have triangular fuzzy numbers. For a trian-
gular fuzzy number (TFN) $\bar A = (a_l, a, a_u)$, it can be verified that $R(\bar A) = \tfrac{1}{3}(a_l + a + a_u)$.
Given two TFNs $\bar A = (a_l, a, a_u)$ and $\bar B = (b_l, b, b_u)$, we define $\bar A \le_R \bar B$ if $(a_l + a + a_u) \le (b_l + b + b_u)$.
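As a quick illustration of Definition 1.3 and the ordering $\le_R$, the following minimal Python sketch computes the centroid ranking of a TFN; the numerical triples are hypothetical examples.

```python
def rank_tfn(a_l, a, a_u):
    """Centroid ranking R of a triangular fuzzy number (a_l, a, a_u) = (a_l + a + a_u) / 3."""
    return (a_l + a + a_u) / 3.0

def leq_R(A, B):
    """A <=_R B for two TFNs given as (a_l, a, a_u) triples."""
    return rank_tfn(*A) <= rank_tfn(*B)

# hypothetical TFNs: R((1, 2, 4)) = 7/3 <= R((2, 3, 3)) = 8/3
print(rank_tfn(1, 2, 4), leq_R((1, 2, 4), (2, 3, 3)))
```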

1.2.2. Expected Value of Fuzzy Random Variables. One of many methods to solve prob-
abilistic programming is an expected value approach. Based on this idea, we want to
solve fuzzy random programming by expected value approach. The first, we should
define an expected value in the fuzzy random variables(FRV). In [3], we have defined
an expected value of discrete and continuous FRV.
Definition 1.4. The expected value of $n$ discrete fuzzy random variables $\tilde{\bar A}$ on the
$\alpha$-level is $\bar E(\alpha) = \sum_{i=1}^{n} \mu_i^{-1}(\alpha)\,p_i$, for $\alpha \in [0,1]$, where $\mu_i^{-1}$ and $p_i$ are the pre-image and the
probability of the fuzzy random variable $\tilde{\bar A}_i$, respectively.
Furthermore, we have obtained some theorems in our previous results [3]. Some of them are the
following:
Theorem 1.1. If all discrete fuzzy random variables have continuous membership functions,
then the expected value is a fuzzy number with a continuous membership function.
Corollary 1.1. If all discrete fuzzy variables on FRV are non decreasing linear fuzzy
number, then the expected value is a non decreasing linear fuzzy number.
Corollary 1.2. If all discrete fuzzy variables on FRV are TFN, then the expected value
is a TFN.

2. MAIN RESULTS
We have two kinds of MOLPs with fuzzy random parameters: MOLP
with fuzzy random objective function coefficients and MOLP with fuzzy random rhs.
We solve them by the expected value approach, which transforms them into fuzzy MOLPs. Finally, we solve
the MOLPs by a new method.
2.1. MOLP with Fuzzy Random Objective Function Coefficients. Consider
the multi-objective linear programming problem with fuzzy random objective function coeffi-
cients below:
$$\min\ \bigl(f_1(x,\tilde{\bar c}_1),\, f_2(x,\tilde{\bar c}_2),\, \cdots,\, f_k(x,\tilde{\bar c}_k)\bigr) \qquad (1)$$
$$\text{subject to } Ax \ge b,\ x \ge 0,$$
where $\tilde{\bar c}_K$ is a vector of fuzzy random parameters in the probability space $K$,
$\tilde{\bar c}_K = (\tilde{\bar c}_{K1}, \ldots, \tilde{\bar c}_{Kn})$. We let $\bar c_{Kj}$ be a triangular fuzzy number, for $K = 1,2,\ldots,k$,
$j = 1,2,\ldots,n$.

By the expected value approach the problem becomes a fuzzy MOLP,
$$\min\ \bigl(E(f_1(x,\tilde{\bar c}_1)),\, E(f_2(x,\tilde{\bar c}_2)),\, \cdots,\, E(f_k(x,\tilde{\bar c}_k))\bigr)$$
$$\text{subject to } Ax \ge b,\ x \ge 0.$$
Based on Corollary 1.2, we have $E(f_K(x,\tilde{\bar c}_K)) = f_K(x, \bar c_K)$, where $\bar c_K$ is a vector
of triangular fuzzy numbers, for $K = 1,2,\ldots,k$.

We denote the set of all feasible solutions of (1) by $D$. By the expected value approach,
we define two solution concepts for (1): the E-optimal solution and the E-efficient solution.
Definition 2.1. A feasible solution $x^*$ is an E-optimal solution for (1) if
$E(f_K(x^*,\tilde{\bar c}_K)) \le E(f_K(x,\tilde{\bar c}_K))$ for all $K = 1,\ldots,k$ and $x \in D$.
Definition 2.2. A feasible solution $x^*$ is an E-efficient solution for (1) if there
exists no $x \in D$ such that $E(f_K(x,\tilde{\bar c}_K)) \le E(f_K(x^*,\tilde{\bar c}_K))$ for all $K = 1,\ldots,k$ and
$E(f_l(x,\tilde{\bar c}_l)) < E(f_l(x^*,\tilde{\bar c}_l))$ for some $l = 1,\ldots,k$.

We choose the weighting method to get a single linear programming problem with fuzzy objective
function coefficients,
$$\min\ w_1 E(f_1) + w_2 E(f_2) + \cdots + w_k E(f_k) \qquad (2)$$
$$\text{subject to } Ax \ge b,\ x \ge 0.$$

Using the properties of fuzzy number arithmetic, model (2) is a linear programming
problem with triangular fuzzy numbers as objective function coefficients. The problem can
be solved by a modified simplex method. We choose the fuzzy ranking R of Definition
1.3 to obtain the optimality criterion; the objective values that we compare are thus fuzzy.
We rewrite the model (2) below,
min z̄ = c̄x (3)
subject to Ax ≥ b, x ≥ 0.

We start the standard linear programming method with a basis matrix $B$ corresponding to a
basic feasible solution $x_B = B^{-1}b$ and the current objective value $\bar z = \bar c_B x_B$. We define $\bar z_j = \bar c_B Y_j$,
$Y_j = B^{-1}N_j$. The ratio test to choose the leaving basic variable is the same as in the simplex
method. However, the criterion to insert a non-basic variable $x_j$ into the basis is that the value of
$R(\bar c_j - \bar z_j)$ is the largest positive one. The optimal solution is reached when $R(\bar c_j - \bar z_j) \le 0$
for all $j$.
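As a small illustration of this entering rule, the following sketch (reusing the centroid ranking of Definition 1.3) selects the entering column among hypothetical fuzzy reduced costs $\bar c_j - \bar z_j$ given as TFN triples; the data are invented for illustration only.

```python
def rank_tfn(a_l, a, a_u):
    """Centroid ranking of a TFN, R = (a_l + a + a_u) / 3."""
    return (a_l + a + a_u) / 3.0

# hypothetical fuzzy reduced costs c_j - z_j, indexed by column j
reduced = {1: (-2.0, -1.0, 0.5), 2: (0.2, 1.0, 1.8), 3: (-0.5, 0.1, 0.4)}
ranks = {j: rank_tfn(*t) for j, t in reduced.items()}
entering = max(ranks, key=ranks.get)            # column with the largest positive R(c_j - z_j)
optimal = all(r <= 0 for r in ranks.values())   # optimality: R(c_j - z_j) <= 0 for all j
```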
Due to the fuzzy ranking in our method, we define the optimal solution for (3) in
Definition 2.3. The relation between the solutions of problems (1) and (3) is given in Theorem 2.1.
Definition 2.3. A feasible solution $x^*$ is an R-optimal solution for (3) if $\bar c x^* \le_R \bar c x$ for
all feasible solutions $x$.
Theorem 2.1. If $x^*$ is an R-optimal solution for (3), then $x^*$ is a solution for (1) with
respect to $w_1, w_2, \cdots, w_k$.

Proof. From (2) we have
$$\bar c = (w_1 \bar c_{11} + \cdots + w_k \bar c_{k1},\ \cdots,\ w_1 \bar c_{1n} + \cdots + w_k \bar c_{kn}).$$
Let $x^*$ be an R-optimal solution for (3); then we have
$$\bar c x^* \le_R \bar c x \qquad (4)$$

for all feasible solutions $x$.
We apply $\bar c$ in (4), so we have
$$w_1 E(f_1(x^*,\bar c_1)) + \cdots + w_k E(f_k(x^*,\bar c_k)) \le w_1 E(f_1(x,\bar c_1)) + \cdots + w_k E(f_k(x,\bar c_k)).$$
Since the weights $w_i$ on the left side and on the right side are the same for all $i$, we have
$$E(f_1(x^*,\bar c_1)) + \cdots + E(f_k(x^*,\bar c_k)) \le E(f_1(x,\bar c_1)) + \cdots + E(f_k(x,\bar c_k)).$$
We have two cases. If $E(f_K(x^*,\bar c_K)) \le E(f_K(x,\bar c_K))$ for all $K = 1,2,\ldots,k$, then $x^*$ is an E-
optimal solution for (1). Otherwise, if $E(f_l(x^*,\bar c_l)) > E(f_l(x,\bar c_l))$ for some $l = 1,2,\ldots,k$,
then $x^*$ is an E-efficient solution for (1). $\square$
2.2. MOLP with Fuzzy Random Right Hand Sides. Consider the multi-
objective linear programming problem with fuzzy random right hand sides below:
$$\min\ (f_1(x,c_1),\, f_2(x,c_2),\, \cdots,\, f_k(x,c_k)) \qquad (5)$$
$$\text{subject to } Ax \ge \tilde{\bar b},\ x \ge 0,$$
where $\tilde{\bar b} = (\tilde{\bar b}_1, \cdots, \tilde{\bar b}_m)$ and $\tilde{\bar b}_i$ is a fuzzy random parameter for rhs $i$. Here, we take
$\bar b_i$ to be a non decreasing linear fuzzy number, for $i = 1,2,\cdots,m$.
By the expected value approach the problem becomes a fuzzy MOLP,
$$\min\ (f_1(x,c_1),\, f_2(x,c_2),\, \cdots,\, f_k(x,c_k)) \qquad (6)$$
$$\text{subject to } Ax \ge E(\tilde{\bar b}),\ x \ge 0,$$
where $E(\tilde{\bar b}) = (E(\tilde{\bar b}_1), \ldots, E(\tilde{\bar b}_m))^T$. By Corollary 1.1, $E(\tilde{\bar b}_i)$ is a non decreasing
linear fuzzy number, for all $i$. We write $E(\tilde{\bar b}_i)$ as $\bar b_i^{ltt} = (b_{il}, b_i, b_{iu})$.

We solve (6) by a new method which transforms problem (6) into a crisp model. The
method is different from [4]. In our problem, we assume that the value of $f_K$ is non-decreasing
in the value of the rhs of the constraints, for all $K$; hence we have one type
of membership function for the objective functions. Since the rhs is fuzzy, the value of
the objective function is also fuzzy. We distinguish the degrees of the membership functions
of the constraints and of the objective functions, and denote them by $\alpha_1$ and $\alpha_2$, respectively. The value of the
rhs determines the value of the objective functions, so we let $\alpha_1 \ge \alpha_2$. We want to maximize
the sum of $\alpha_1$ and $\alpha_2$.

The steps of our method:


Step 1. Solve the MOLP (6) as single-objective linear programs (LP), using each
time only one objective and ignoring all others. Here, we split each LP into two problems:
first, we use the rhs $b_{il}$ and get $f_K = z_K(b_{il})$; second, we use the rhs $b_i$ and we
get $f_K = z_K(b_i)$. Thus we have $2k$ LPs and the set of optimal solutions $x^*_l$, $l = 1,2,\ldots,2k$.
Step 2. Compute $L_K = \min_l z_K(x^*_l)$ and $U_K = \max_l z_K(x^*_l)$.
Step 3. Formulate the membership function for constraint $i$ below,
$$\alpha_1 = \mu_{\bar C_i}(b_i) = \begin{cases} 1 & \text{for } b_i \le A_i x - p_i,\\[2pt] \dfrac{A_i x - b_i}{p_i} & \text{for } A_i x - p_i \le b_i \le A_i x,\\[4pt] 0 & \text{for } b_i > A_i x, \end{cases}$$
where $A_i$ is row $i$ of the matrix $A$ and $p_i = b_i - b_{il}$ for all $i$.


Formulate the membership function for goal $K$ below,
$$\alpha_2 = \mu_{\bar G_K}(f_K) = \begin{cases} 1 & \text{for } f_K \le L_K,\\[2pt] \dfrac{U_K - f_K}{U_K - L_K} & \text{for } L_K \le f_K \le U_K,\\[4pt] 0 & \text{for } f_K > U_K. \end{cases}$$
Step 4. The crisp model for (6) is
$$\max\ \alpha_1 + \alpha_2 \qquad (7)$$
$$\text{subject to } f_K + \alpha_2 (U_K - L_K) \le U_K, \quad K = 1,2,\ldots,k,$$
$$A_i x - \alpha_1 p_i \ge b_i, \quad i = 1,2,\ldots,m,$$
$$\alpha_1 \ge \alpha_2,$$
$$0 \le \alpha_1 \le 1,\ 0 \le \alpha_2 \le 1,\ x \ge 0.$$
By the expected value approach, we transform model (5) to model (6). Then we
solve model (6) by the method above. Finally, we have crisp model (7).
We give an example for problem (5).
Example 2.1. Given the MOLP with fuzzy random right hand sides below:
$$\min\ f_1 = 50x_1 + 30x_2$$
$$\min\ f_2 = 20x_1 + 70x_2$$
$$\text{subject to } 2x_1 + 4x_2 \ge \tilde{\bar b}_1,\quad x_1 + x_2 \ge \tilde{\bar b}_2,\quad x_1, x_2 \ge 0,$$
where
$$\bar b_{11}^{ltt} = (17, 18, 19),\quad \bar b_{12}^{ltt} = (19, 21, 22),\quad \bar b_{21}^{ltt} = (7, 8, 9),\quad \bar b_{22}^{ltt} = (15, 16, 17),$$
$$Pr(\bar b_{11}) = Pr(\bar b_{12}) = 0.5,\quad Pr(\bar b_{21}) = 0.75 \text{ and } Pr(\bar b_{22}) = 0.25.$$
We compute the expected values of the rhs by Definition 1.4 and Corollary 1.1 and
obtain $E(\tilde{\bar b}_1) = \bar b_1^{ltt} = (18, 20, 21)$ and $E(\tilde{\bar b}_2) = \bar b_2^{ltt} = (9, 10, 11)$.
The results of our method are $x_1^* = 4.789$, $x_2^* = 5.251$ and $f_1 = 194.978$,
$f_2 = 462.555$ with $\alpha_1^* = 1$, $\alpha_2^* = 0.4566$.
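A minimal Python sketch of Steps 1–4 for the data of Example 2.1 is given below, assuming SciPy's linprog is available. It builds the payoff table for $L_K$ and $U_K$ and then solves the crisp model (7) with the decision vector $(x_1, x_2, \alpha_1, \alpha_2)$; numerical results depend on the solver and may differ slightly from the rounded values reported above.

```python
import numpy as np
from scipy.optimize import linprog

# data of Example 2.1
c = np.array([[50.0, 30.0],        # objective f1
              [20.0, 70.0]])       # objective f2
A = np.array([[2.0, 4.0],          # constraint rows A_i
              [1.0, 1.0]])
b_l = np.array([18.0, 9.0])        # lower ends b_il of E(b_i) = (b_il, b_i, b_iu)
b   = np.array([20.0, 10.0])       # modal values b_i
p   = b - b_l                      # tolerances p_i = b_i - b_il

# Steps 1-2: payoff table over the 2k single-objective LPs
sols = []
for rhs in (b_l, b):
    for k in range(2):
        res = linprog(c[k], A_ub=-A, b_ub=-rhs, bounds=[(0, None)] * 2)
        sols.append(res.x)
L = np.array([min(c[k] @ x for x in sols) for k in range(2)])
U = np.array([max(c[k] @ x for x in sols) for k in range(2)])

# Step 4: crisp model (7), decision vector z = (x1, x2, alpha1, alpha2)
obj = np.array([0.0, 0.0, -1.0, -1.0])                     # maximise alpha1 + alpha2
A_ub = np.vstack([
    np.hstack([c, np.zeros((2, 1)), (U - L)[:, None]]),    # f_K + a2 (U_K - L_K) <= U_K
    np.hstack([-A, p[:, None], np.zeros((2, 1))]),         # A_i x - a1 p_i >= b_i
    np.array([[0.0, 0.0, -1.0, 1.0]]),                     # alpha2 <= alpha1
])
b_ub = np.concatenate([U, -b, [0.0]])
res = linprog(obj, A_ub=A_ub, b_ub=b_ub,
              bounds=[(0, None), (0, None), (0, 1), (0, 1)])
x1, x2, a1, a2 = res.x
```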

3. CONCLUDING REMARKS
In this paper, the expected value approach for solving MOLP with fuzzy random
objective function coefficients and MOLP with fuzzy random right hand sides has been
presented. The approach transforms the problems into MOLPs with fuzzy parameters, and
we solve them by different methods. The modified simplex method can solve the MOLP with
fuzzy objective function coefficients. The solution depends on the fuzzy ranking, so we
define the E-efficient and R-optimal solutions. The second method can solve the MOLP
with fuzzy rhs.

Acknowledgement. The authors would like to thank the Department of Mathematics,
Gadjah Mada University, for supporting this research.

References
[1] Bector, C.R. and Chandra, S., Fuzzy Mathematical Programming and Fuzzy Matrix Games,
Springer, Germany, 2005.
[2] Caballero, R., Cerda, E., Munoz, M.M. and Rey, L., Stochastic Approach versus Mul-
tiobjective Approach for Obtaining Efficient Solutions in Multi Objective Programming Prob-
lem,European Journal of Operational Research, 158(3), 633-648, 2004.
[3] Indarsih, Widodo and Rini, Expected Value of Fuzzy Random Variables, Submitted to IJQM.
[4] Jana, B. and Roy, T.K., Multi-objective Fuzzy Linear Programming and Its application in
Transportation Model,Tamsui Oxford Journal of Mathematical Sciences,2,243-268, 2005.
[5] Luhandjula, M.K. and Gupta, M.M., On Fuzzy Stochastic Optimization, Fuzzy Set and System
81, 47-55, 1996.
[6] Maleki, H.R., Tata, M. and Mashinchi,M., Linear Programming with Fuzzy Variables, Fuzzy
Sets and Systems, 109, 21-33, 2000.

Indarsih
Department of Mathematics, Gadjah Mada University.
e-mail: [email protected]

Widodo
Department of Mathematics, Gadjah Mada University.
e-mail: widodo [email protected]

Ch. Rini Indrati


Department of Mathematics, Gadjah Mada University.
e-mail: [email protected]
Proceedings of ”The 6th SEAMS-UGM Conference 2011”
Applied Mathematics, pp. 435 - 448.

CHAOTIC S-BOX WITH PIECEWISE LINEAR CHAOTIC


MAP (PLCM)

JENNY IRNA EVA SARI AND BETY HAYAT SUSANTI

Abstract. As a ubiquitous phenomenon in nature, chaos is a kind of deterministic random-like
process generated by nonlinear dynamic systems. The properties of chaos include
sensitivity to tiny changes in initial conditions and parameters, random-like behavior,
ergodicity, and unstable periodic orbits with long periods, properties which are much
the same as those required of cryptographic primitives, such as "diffusion" and
"confusion". In cryptography, one of the many uses of chaotic functions is to generate
substitution boxes (S-boxes). In this paper, we discuss how to generate an S-box using the
Piecewise Linear Chaotic Map (PLCM), the characteristics of the PLCM, and the analysis of the
generated S-boxes against good design criteria, namely the Strict Avalanche Criterion (SAC), the Bit Independence
Criterion (BIC), and Nonlinearity.

Keywords and Phrases: Cryptography, Chaos, S-box, PLCM.

1. INTRODUCTION

One of the most intense areas of research in the field of symmetric cryptosystems is
S-box design [13]. S-boxes are quite important components of modern cryptosystems
(especially block ciphers) in the sense that S-boxes bring nonlinearity to block ciphers and
strengthen their cryptographic security [7]. They are typically used to obscure the relationship
between the key and the ciphertext – Shannon's property of confusion. In many cases, S-
boxes are carefully chosen to resist cryptanalysis.
According to [4], there are a number of criteria for S-box design, i.e., an S-box should
satisfy the Avalanche Criterion (AC), the Strict Avalanche Criterion (SAC), the Bit Independence
Criterion (BIC), the XOR Table Distribution, the Avalanche Weight Distribution (AWD),
and Nonlinearity. AWD is a criterion for testing the whole block cipher algorithm.
The construction of S-boxes can be done in various ways. The most common methods
for constructing S-boxes are based on random generation, testing against a set of design
criteria, algebraic constructions having good properties, or a combination of these [10]. One of
the random functions used to generate S-boxes is a chaotic dynamic function. The
characteristics of chaos, with the properties of ergodicity, mixing and exactness, and sensitivity to


initial conditions, make such functions attractive for use in cryptography. These characteristics have much in
common with the confusion and diffusion properties in cryptography. The ergodicity of
chaotic systems hides the relationship between plaintext, key, and ciphertext through a uniform
distribution of the output for each input, while the sensitivity to initial conditions is
equivalent to the concept of diffusion in cryptography, by spreading a single input bit to all
outputs [6].
There are four points to be considered in an S-box construction process using chaotic
iteration functions: 1) how to divide the output range into regions, 2) what kind of chaotic
function is used, 3) the number of iterations, and 4) the initial value [3].
In this paper, we present a method for dynamically obtaining cryptographically
strong substitution boxes (S-boxes) based on the Piecewise Linear Chaotic Map (PLCM), which
has the confusion and diffusion properties. The PLCM uses the mixing property to get the values of the S-
box. In addition, the PLCM requires two inputs as initial values that will produce the output
values. The sensitivity also results in random output, where tiny changes of the input
will produce significant changes of the output.
We construct S-boxes of dimension 8 x 8, and there are a number of S-boxes with
different input values. We decided to use the 8 x 8 dimension with reference to the S-box dimension
used in the Advanced Encryption Standard (AES) algorithm, a well accepted standard
block cipher algorithm. The cryptographic properties of these S-boxes, such as the strict avalanche criterion,
output bit independence, and nonlinearity, are analyzed in detail. From this
study, we will conclude whether S-boxes generated by the PLCM function have good
criteria for S-boxes with certain input values. The expected result of this study is to
expand knowledge of S-box construction with the PLCM function. Such S-boxes could also be
used in a particular encryption algorithm.

2. THEORETICAL BACKGROUND

2.1. Substitution Box (S-box). In general, an S-box takes some number of input bits, m,
and transforms them into some number of output bits, n: an m×n S-box can be implemented
as a lookup table with 2^m words of n bits each [11]. Fixed tables are normally used, as in the
Data Encryption Standard (DES), but in some ciphers the tables are generated dynamically
from the key, e.g., the Blowfish and Twofish encryption algorithms. Bruce Schneier
describes the International Data Encryption Algorithm (IDEA) modular multiplication step as a
key-dependent S-box [14].
Definition 1 [15]: An n x n S-box is a mapping function $f: \{0,1\}^n \to \{0,1\}^n$, which
maps n-bit input strings, $X = x_1, x_2, \ldots, x_n$, to n-bit output strings, $Y = y_1, y_2, \ldots, y_n$,
where $Y = f(X)$. Figure 1 shows the S-box scheme.
Figure 1. Substitution Box (S-box) scheme [5]


Mister and Adams [9] explained that an S-box can be represented in three ways:
1. An n x m S-box S is a mapping $S: \{0,1\}^n \to \{0,1\}^m$. S can be represented as $2^n$ m-bit
numbers, denoted $r_0, \ldots, r_{2^n - 1}$, in which case $S(x) = r_x$, $0 \le x \le 2^n - 1$, and the $r_i$ are the
rows of the S-box.
2. $S(x) = [c_{m-1}(x)\; c_{m-2}(x) \cdots c_0(x)]$, where the $c_i$ are fixed Boolean functions
$c_i: \{0,1\}^n \to \{0,1\}$ for each $i$; these are the columns of the S-box.
3. S can be represented by a $2^n$ x m binary matrix M with the (i, j) entry being bit j of row
i.

2.2. Strict Avalanche Criterion (SAC). According to [16], the Strict Avalanche Criterion
(SAC) is the combination of the concepts of completeness and the avalanche effect. If a
cryptographic function $f: \{0,1\}^n \to \{0,1\}^n$ satisfies the SAC, then for all $i, j \in \{1,2,\ldots,n\}$ each
output bit should change with a probability of one half whenever a single input bit is
complemented, formulated as follows:
$$\frac{1}{2^n}\, W\!\left(a_j^{e_i}\right) = \frac{1}{2} \qquad \text{for all } i, j \qquad (1)$$
We can modify equation (1) to define the SAC parameter $k_{SAC}(i,j)$ as follows:
$$k_{SAC}(i,j) = \frac{1}{2^n}\, W\!\left(a_j^{e_i}\right) \qquad (2)$$
$k_{SAC}(i,j)$ lies in the range $[0,1]$ and can be interpreted as the probability of a change in the $j$-th
output bit when the $i$-th input bit changes. If $k_{SAC}(i,j)$ is not equal to $\tfrac{1}{2}$ for every pair $(i,j)$, then
the S-box does not satisfy the SAC exactly.
The relative error of the SAC results can be obtained by the formula:
$$\epsilon = \max_{1 \le i \le n,\; 1 \le j \le n} \bigl|2\,k_{SAC}(i,j) - 1\bigr| \qquad (3)$$
An S-box satisfies the SAC within the range $\pm\epsilon_S$ if for every $i$ and $j$ the following
inequality holds:
$$\frac{1}{2}\,(1 - \epsilon_S) \le k_{SAC}(i,j) \le \frac{1}{2}\,(1 + \epsilon_S) \qquad (4)$$
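The SAC parameter of equation (2) and the relative error of equation (3) can be computed directly from an S-box table. The following minimal Python sketch assumes the 8 x 8 S-box is given as a list `sbox` of 256 integers (e.g., Table 3 read row by row); the function names are our own.

```python
def sac_matrix(sbox, n=8):
    """k_SAC(i, j): probability that output bit j flips when input bit i is flipped, eq. (2)."""
    size = 1 << n
    k = [[0.0] * n for _ in range(n)]
    for i in range(n):                                # flipped input bit i
        for x in range(size):
            diff = sbox[x] ^ sbox[x ^ (1 << i)]       # avalanche vector for input x
            for j in range(n):                        # affected output bit j
                k[i][j] += (diff >> j) & 1
    return [[v / size for v in row] for row in k]

def sac_relative_error(k):
    """Relative error of equation (3): max |2 k_SAC(i, j) - 1|."""
    return max(abs(2 * v - 1) for row in k for v in row)
```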

2.3. Bit Independence Criterion (BIC). A function $f: \{0,1\}^n \to \{0,1\}^n$ satisfies the
BIC if, for all $i, j, k \in \{1,2,\ldots,n\}$ with $j \ne k$, inverting input bit $i$ causes output bits $j$ and $k$ to
change independently [16].
To measure the bit independence concept, one needs the correlation coefficient between
the $j$-th and $k$-th components of the output difference string, which is called the avalanche
vector $A^{e_i}$. The bit independence parameter corresponding to the effect of the $i$-th input bit change
on the $j$-th and $k$-th bits of $A^{e_i}$ is defined as:
$$BIC(a_j, a_k) = \max_{1 \le i \le n} \bigl|corr\bigl(a_j^{e_i}, a_k^{e_i}\bigr)\bigr| \qquad (5)$$
Kwangjo Kim [5] explains that in order to find the correlations between pairs of
avalanche variables, the correlation coefficient can be calculated as follows:
$$\rho(A,B) = \frac{cov(A,B)}{\sigma(A)\,\sigma(B)}, \qquad (6)$$
where $\rho(A,B)$ is the correlation coefficient between $A$ and $B$; $cov(A,B) = E(AB) - E(A)E(B)$ is the
covariance of $A$ and $B$; $\sigma^2(A) = E(A^2) - \{E(A)\}^2$ and $\sigma^2(B) = E(B^2) - \{E(B)\}^2$ are the variances of $A$ and $B$;
and $E(A)$, $E(B)$, $E(AB)$ are the expectation values (means) of $A$, $B$, and of the product $AB$.
From equation (6), the avalanche variables yield a correlation coefficient whose absolute value lies in the
range $[0, 1]$, which means:
a. If the value is 1 then the avalanche variables are always identical or complements of
one another;
b. If the value is 0 then the avalanche variables are independent.
In the BIC analysis, the value $BIC(f)$ is taken as the relative error $\epsilon_B$. Thus, for an $n \times n$
S-box, the maximum value of $\epsilon_B = BIC(f)$ over all S-boxes is the maximum relative error of the BIC results,
denoted by $\epsilon_{BIC}$,
$$\epsilon_{BIC} = \max_{\text{over all S-boxes}} \{\epsilon_B\} \qquad (7)$$
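A minimal sketch for estimating the maximum correlation of equation (5) over all input-bit flips and output-bit pairs is shown below, assuming NumPy and a 256-entry S-box list `sbox`; it simply applies the sample correlation coefficient of equation (6) to the avalanche bits, and it assumes no avalanche bit is constant.

```python
import numpy as np

def bic_max_corr(sbox, n=8):
    """Maximum |corr| between two different avalanche output bits, over all flipped input bits."""
    size = 1 << n
    worst = 0.0
    for i in range(n):                                            # flipped input bit i
        diffs = [sbox[x] ^ sbox[x ^ (1 << i)] for x in range(size)]
        bits = np.array([[(d >> j) & 1 for j in range(n)] for d in diffs], dtype=float)
        corr = np.corrcoef(bits.T)                                # n x n correlation matrix, eq. (6)
        np.fill_diagonal(corr, 0.0)                               # ignore the pairs j = k
        worst = max(worst, float(np.max(np.abs(corr))))           # eq. (5), maximised over i, j, k
    return worst
```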

2.4. Nonlinearity. The nonlinearity of the function $f = (f_1\, f_2 \cdots f_m): Z_2^n \to Z_2^m$,
where $f_i: Z_2^n \to Z_2$, $i = 1,2,\ldots,m$, is defined as the minimum
Hamming distance between the set of affine functions and every nonzero linear combination
of the output coordinates of $f$, i.e.,
$$\mathcal{NL}_f = \min_{b,c,w} \#\{x \in Z_2^n \mid c \cdot f(x) \ne w \cdot x \oplus b\} \qquad (8)$$
where $w \in Z_2^n$, $c \in Z_2^m \setminus \{0\}$, $b \in Z_2$, $x \in Z_2^n$, and $w \cdot x$ denotes the dot product between $w$ and
$x$ over $Z_2$,
$$c \cdot f(x) = \bigoplus_{i=1}^{m} c_i f_i(x), \qquad (9)$$
where $c = \{c_1, c_2, \ldots, c_m\} \in Z_2^m$.
For a cryptosystem not to be susceptible to linear cryptanalysis, $\mathcal{NL}_f$ is required to be
as close as possible to its maximum value (perfect nonlinearity). The maximum nonlinearity
value (perfect nonlinearity) of a Boolean function is given by $N_f \le 2^{n-1} - 2^{\frac{n}{2}-1}$ [4].
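Equation (8) can be evaluated through the Walsh transform of each nonzero linear combination of the output bits. The following unoptimized Python sketch returns the minimum nonlinearity of an 8 x 8 S-box given as a 256-entry list `sbox`; it is a straightforward (slow) illustration of the definition, not the authors' implementation.

```python
def parity(v):
    """Parity (XOR of the bits) of the integer v."""
    return bin(v).count("1") & 1

def min_nonlinearity(sbox, n=8):
    size = 1 << n
    best = size
    for c in range(1, size):                               # nonzero combination c of output bits
        g = [parity(c & sbox[x]) for x in range(size)]     # c . f(x), eq. (9)
        max_walsh = 0
        for w in range(size):                              # all linear functions w . x
            s = sum(1 if g[x] == parity(w & x) else -1 for x in range(size))
            max_walsh = max(max_walsh, abs(s))
        # Hamming distance to the nearest affine function for this c
        best = min(best, size // 2 - max_walsh // 2)
    return best
```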

2.5. Piecewise Linear Chaotic Map (PLCM). The PLCM is a piecewise function that has
a uniform invariant density and a good correlation function, so that it can be used in
cryptography. PLCM functions have been used in encryption functions, random number generators,
and f functions.
Given a real interval $X = [\alpha, \beta] \subset \mathbb{R}$, a PLCM $F: X \to X$ is a multi-segmental
mapping [6]:
$$F(x)\big|_{C_i} = F_i(x) = a_i x + b_i, \qquad i = 1 \sim m, \qquad (10)$$
where $\{C_i\}_{i=1}^{m}$ is a partition of $X$, satisfying $\bigcup_{i=1}^{m} C_i = X$ and $C_i \cap C_j = \emptyset$, $\forall i \ne j$.
Equation (10) has the surjectivity property if every linear segment is mapped onto $X$ by
$F_i$: $\forall i = 1 \sim m$, $F_i(C_i) = X$. If $X = [0,1]$ then the map is said to be a normalized PLCM. We can
transform a general PLCM into the normalized form by the linear change of variables
$$F_{[0,1]}\!\left(\frac{x-\alpha}{\beta-\alpha}\right) = \frac{F(x)-\alpha}{\beta-\alpha}. \qquad (11)$$

The PLCM has the following properties on its definition interval $X$ [6]:
1. It is chaotic: its Lyapunov exponent $\lambda = -\sum_{i=1}^{m} \mu(C_i) \ln \mu(C_i)$ satisfies $0 < \lambda < \ln m$,
where $\mu(C_i) = \dfrac{\|C_i\|}{\beta - \alpha}$.
2. It is exact, mixing, and ergodic, since $\forall x \in X$, $|F'(x)| = |a_i| > 1$.
3. It has the uniform invariant density function $f(x) = \dfrac{1}{\|X\|} = \dfrac{1}{\beta - \alpha}$.
4. Its auto-correlation function $\tau(n) = \dfrac{1}{\sigma^2} \lim_{N \to \infty} \dfrac{1}{N} \sum_{i=0}^{N-1} (x_i - \bar x)(x_{i+n} - \bar x)$ goes to zero
as $n \to \infty$, where $\bar x, \sigma^2$ are the mean value and the variance of $x$. Furthermore, if
$\sum_{i=1}^{m} \mathrm{sign}(a_i)\cdot \|C_i\|^2 = 0$ is satisfied, then $\tau(n) = \delta(n)$.
A uniform invariant density function means that uniform input will generate uniform
output, and that the chaotic orbit from almost every initial condition will lead to the same
uniform distribution,
$$f(x) = 1/(\beta - \alpha). \qquad (12)$$
However, the above facts are not true for digital chaotic maps, because a digital map is realized in
a discrete space with $2^n$ finite states. The number of different outputs after one digital chaotic
iteration will be smaller than $2^n$, since the PLCM is a multi-to-one map. Because of that, for the
PLCM, a discrete uniform input cannot generate a discrete uniform output, i.e., a uniform random
variable becomes nonuniform after a digital chaotic iteration. The PLCM function which will be
used in this research is:
$$F(x,p) = \begin{cases} \dfrac{x}{p}, & x \in [0, p),\\[6pt] \dfrac{x - p}{\tfrac{1}{2} - p}, & x \in \left[p, \tfrac{1}{2}\right),\\[6pt] F(1-x,\, p), & x \in \left[\tfrac{1}{2}, 1\right]. \end{cases} \qquad (13)$$

3. CHAOTIC S-BOX

In this study, we construct 8 x 8 S-boxes using the chaotic PLCM function, which needs two
input parameters, i.e., IC and p, with $0 < p < \tfrac{1}{2}$ and $0 \le IC \le 1$. According to [1], the input parameters of the PLCM can be selected arbitrarily
from the possible values of IC and p. For this study, we select 4 pairs of values that together
exercise every equation of the PLCM function. In the PLCM function there are 3 equations with
different ranges of IC, from which we take pairs of input values of IC and p at random.
However, it should be noted whether the chosen input value pairs allow the function to iterate
and generate the S-box. For example, if the value of IC or p is zero then the next iterations
result in zero, so that we cannot generate the S-box.
Therefore, based on the experiment, we select 4 pairs of input values, as seen in
Table 1.
Table 1. Sample values of PLCM

  Input value               Values
  Initial Condition (IC)    0,125    0,425    0,750
  Parameter value (p)       0,150    0,275    0,450    0,125
In Table 1, we can see that there are 3 values of IC and 4 values of p. From these values, we
generate S-boxes with 4 pairs of input values of IC and p, i.e., (0,125; 0,150), (0,425; 0,275),
(0,750; 0,450), and (0,750; 0,125). Each pair falls into one of the ranges of IC: the pair
(0,125; 0,150) falls into the first range of IC, the pair (0,425; 0,275) falls into the
second range, and the pairs (0,750; 0,450) and (0,750; 0,125) fall into the third range
of IC, for which the next iteration of the function is governed by the first and the second equation, respectively.

Table 2. List of S-boxes with different inputs and parameters

  No.   S-box     Parameter (p)   Initial Condition (IC)   Number of Iterations
  1     S-box1    0,150           0,125                    1
  2     S-box2    0,275           0,425                    1
  3     S-box3    0,450           0,750                    1
  4     S-box4    0,125           0,750                    1
  5     S-box5    0,150           0,125                    50
  6     S-box6    0,275           0,425                    50
  7     S-box7    0,450           0,750                    50
  8     S-box8    0,125           0,750                    50
  9     S-box9    0,150           0,125                    100
  10    S-box10   0,275           0,425                    100
  11    S-box11   0,450           0,750                    100
  12    S-box12   0,125           0,750                    100

The number of iterations of the chaos function is one of the factors that influence the results
[3]. According to [2], values with minimum error and significant differences are obtained after
the 100th iteration. In this study, we use 1, 50, and 100 iterations to determine the influence of
the number of iterations on the resulting S-box output.
Furthermore, we perform the S-box analysis based on S-box testing criteria. The
criteria that will be evaluated are the SAC, the BIC, and nonlinearity. There are 12 S-boxes,
generated with different input values and numbers of iterations, as shown in Table 2.

3.1. Process of S-box Construction. The first step is to establish regions in accordance
with the dimensions of the S-box used. A region is a part of the output range, which is divided into several
sections of the same size. The range will be divided into 256 regions, labelled 0 to 255. The
detailed process of the S-box generation is described as follows:
a) Select the values of p and IC, input these values into the PLCM function, and store the
output value. Consider the limit values of p and IC when using the equations of the PLCM
function.
b) Iterate the PLCM function using the selected initial condition. There are three numbers of
warm-up iterations performed on the PLCM function, namely 1, 50, and 100, before the
output values of the PLCM are collected.
c) The output of the PLCM function will occupy one of the regions that meets the
coverage limits; check the label of that region. If the output is not included in
any region, then ignore the value and make the next iteration. If the region
has already been visited, then make the next iteration to obtain a region not visited yet.
d) Define an array S of the required size to store the labels of the regions.
e) The labels of the region values are stored in the array S consecutively, in accordance with
the iteration order and with no repeated region values.
f) After the whole array is filled, form the 8 x 8 S-box table from the first region to the last
region. This table becomes an S-box for the given inputs.
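A minimal Python sketch of steps (a)–(f) is given below. The PLCM of equation (13) and the region labelling follow the description above; the warm-up count corresponds to the 1, 50 or 100 iterations of Table 2, while the boundary handling and the helper names are our own assumptions. Degenerate inputs such as those listed in Section 3.4 must be avoided, otherwise the loop never fills all 256 regions.

```python
def plcm(x, p):
    """One iteration of the PLCM of equation (13); boundary conventions are an assumption."""
    if x < p:
        return x / p
    if x < 0.5:
        return (x - p) / (0.5 - p)
    return plcm(1.0 - x, p)

def generate_sbox(ic, p, warmup=1, size=256):
    """Steps (a)-(f): collect 256 distinct region labels from the PLCM orbit."""
    x = ic
    for _ in range(warmup):                     # step (b): 1, 50 or 100 warm-up iterations
        x = plcm(x, p)
    sbox, seen = [], set()
    while len(sbox) < size:                     # steps (c)-(e)
        x = plcm(x, p)
        region = min(int(x * size), size - 1)   # step (c): region label of this output
        if region not in seen:                  # keep only regions not visited yet
            seen.add(region)
            sbox.append(region)
    return sbox                                 # step (f): read off as a 16 x 16 hexadecimal table

# example with the inputs of S-box1 in Table 2 (IC = 0.125, p = 0.150, 1 warm-up iteration)
sbox1 = generate_sbox(0.125, 0.150, warmup=1)
```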
After the generation process is done, we produce 12 S-boxes, which are then
tested to see whether they satisfy the cryptographic criteria. We convert the output of each S-box
into hexadecimal form because, in cryptosystems, S-boxes are usually read in
hexadecimal. Here is one of the generated S-boxes (Table 3).

Table 3.S-Box1
0 1 2 3 4 5 6 7 8 9 A B C D E F
0 EA 43 06 9D 0B 13 5C B8 9C C0 77 E0 7B FB D9 05
1 45 72 DD CB 99 22 BA 56 CF 69 84 A6 2D 38 1C 98
2 79 F8 15 47 24 14 E5 A9 3D EE 33 B2 34 53 0D DA
3 63 E8 42 7D 19 39 9F AA 62 DF 44 11 7F A1 68 D5
4 CE 6B 6F 12 92 55 C5 A7 7E 2E 58 76 1A 32 2F 01
5 41 E4 F0 10 EC A5 59 AE 03 37 FE DB AB F1 C7 6E
6 6C B3 35 2A CA B4 DC 9A 3E 25 F9 9E A3 BF A2 AF
7 E7 8D 4B 2C 89 8A 8C D3 CD 1E 0A 8E 6A 71 73 31
8 7C FA 88 30 91 B0 FC 5D 6D EF 8B F5 E2 F6 C6 96
9 29 D8 86 3C 1D 95 94 BC 4D 9B FF E6 00 48 09 75
A 28 23 54 5E DE E1 02 74 90 D0 E9 93 07 BD 8F 2B
B 26 80 A4 BE 0C 08 C3 FD 36 3A C9 49 83 04 F4 17
C 20 78 B7 87 CC 21 5F C2 4F B1 4A C1 1B D6 F7 57
D 0E 66 81 70 46 4E 60 61 50 EB 85 65 64 E3 5A B9
E 18 D7 A8 F3 1F ED C4 C8 AC 3B 52 D4 D1 3F B6 27
F 7A 0F AD 82 97 16 5B 51 A0 BB 40 D2 4C 67 B5 F2

3.2. Experimental Results

3.2.1. Experimental Results of SAC. The SAC test yields the value of $k_{SAC}$, which is
satisfied when $k_{SAC} = 0{,}5$. In our results, however, we cannot obtain this exact value of
$k_{SAC}$.
Based on the calculation of the relative error of the SAC for each S-box, we can see in Table
4 that the maximum value of the relative error of the SAC is attained by Sbox5, Sbox7, Sbox9, and
Sbox10, with an error value of 0,25. In addition, the maximum relative error values of the S-boxes
are quite evenly distributed and fall in the same range of $k_{SAC}$.

Table 4.Relative error of SAC


No Sbox Value of ϵ
1 Sbox1 0,21875
2 Sbox2 0,21875
3 Sbox3 0,1875
4 Sbox4 0,21875
5 Sbox5 0,25
6 Sbox6 0,21875
7 Sbox7 0,25
8 Sbox8 0,21875
9 Sbox9 0,25
10 Sbox10 0,25
11 Sbox11 0,21875
12 Sbox12 0,21875

3.2.2. Experimental Results of BIC. The results of the BIC criterion test for each S-box
are summarized in Table 5. The value of each entry in the table is the correlation coefficient
between different output components $j, k$, i.e., $corr(a_j^{e_i}, a_k^{e_i})$. From the calculation of the
BIC criterion for all S-boxes, we find the maximum correlation value $\epsilon_{max}$ and the
average of the correlation values.

Table 5.Correlation value and Mean of BIC


No Sbox Corr Max Mean
1 Sbox1 0,2919183 0,062907
2 Sbox2 0,2191785 0,072626
3 Sbox3 0,4056696 0,07396
4 Sbox4 0,25257 0,070928
5 Sbox5 0,3138824 0,07103
6 Sbox6 0,2260449 0,069942
7 Sbox7 0,316386 0,068591
8 Sbox8 0,2805474 0,079021
9 Sbox9 0,2574946 0,072197
10 Sbox10 0,2436982 0,067238
11 Sbox11 0,28125 0,073988
12 Sbox12 0,2511059 0,070517

In Table 5, we can see that the extreme values of the maximum correlation coefficient are attained by Sbox2
and Sbox3. Meanwhile, the minimum mean value is attained by Sbox1 and the maximum mean
value by Sbox8.
For the overall analysis of the S-boxes, we can conclude for the BIC criterion that if
the maximum correlation obtained is close to 0, then the avalanche variables of the outputs are
independent. The Chaotic S-boxes produce good correlation values, as the average of the
resulting correlations is around or below 0,1. However, the overall maximum correlation and
maximum error value generated by the Chaotic S-boxes is 0,4056696.
Figure 2 shows the maximum correlation coefficient of each S-box.


Figure 2. Maximum Correlation Coefficient of S-box

3.2.3. Experimental Results of Nonlinearity. An ideal nonlinearity value must reach or be
close to the perfect nonlinearity value; for an 8 x 8 S-box the perfect nonlinearity value is close to
$$2^{n-1} - 2^{\frac{n}{2}-1} = 2^7 - 2^3 = 120.$$
In Table 6, we can see that the minimum nonlinearity values of the Chaotic S-boxes range
from 88 to 96. The minimum value of $\mathcal{NL}_f$ is thus not yet close to the perfect nonlinearity
value.

Table 6. Minimum value ofNonlinearity


NO Sbox NLM (min) Probability
1 Sbox1 94 162/256
2 Sbox2 94 162/256
3 Sbox3 90 166/256
4 Sbox4 96 160/256
5 Sbox5 96 160/256
6 Sbox6 88 168/256
7 Sbox7 92 164/256
8 Sbox8 94 162/256
9 Sbox9 96 160/256
10 Sbox10 90 166/256
11 Sbox11 88 168/256
12 Sbox12 92 164/256

Based on the minimum nonlinearity values, the number of vectors, and the probabilities,
it can be concluded that the Chaotic S-boxes do not satisfy the ideal (perfect) nonlinearity,
and with probabilities far from one half (ranging between 0,625 and 0,65625), we
can say that the Chaotic S-boxes are susceptible to linear cryptanalysis.


Figure 3. NLM Minimum Value

3.3. Analysis of S-Boxes. In this section, we compare the maximum values for each S-box
according to the test criteria considered in this research: SAC, BIC, and nonlinearity. From
Table 7 and Table 8, we can conclude that all S-boxes satisfy the SAC and BIC criteria,
but do not satisfy the nonlinearity criterion.

Table 7. Test results of Input1 and Input2

                  Input1                            Input2
                  1 iter.    50 iter.   100 iter.   1 iter.    50 iter.   100 iter.
                  Sbox1      Sbox5      Sbox9       Sbox2      Sbox6      Sbox10
  SAC             0,21875    0,25       0,25        0,21875    0,21875    0,25
  BIC             0,29192    0,31388    0,25749     0,21918    0,22604    0,24369
  Nonlinearity    94         96         96          94         88         90

Table 8. Test results of Input3 and Input4

                  Input3                            Input4
                  1 iter.    50 iter.   100 iter.   1 iter.    50 iter.   100 iter.
                  Sbox3      Sbox7      Sbox11      Sbox4      Sbox8      Sbox12
  SAC             0,1875     0,25       0,21875     0,21875    0,21875    0,21875
  BIC             0,40567    0,31638    0,28125     0,25257    0,28055    0,25111
  Nonlinearity    90         92         88          96         94         92

In general, the test results of the 12 S-boxes indicate that the differences are not
significant. Based on Table 7 and Table 8, the minimum relative error values of SAC and BIC are
generated by the input values of input1 and input4. Cumulatively, over the results of the testing criteria
of the S-boxes, the values of input1 and input4 have better outcomes compared with the values of
input2 and input3. Input1, with the values $x=0.125$ and $p=0.15$, and input4, with the values
$x=0.75$ and $p=0.125$, share the characteristic that the $x$ and $p$ values approach 0, while for
input2 and input3, whose input values of $p$ and $x$ approach $\tfrac{1}{2}$, the test results show a
fairly high relative error. However, this conclusion cannot be used for all input values close to 0 or
$\tfrac{1}{2}$, because 4 input values are not representative of the whole population of
possibilities.
From Table 7 and Table 8, the test results for a single iteration, 50, and 100 iterations show
no significant difference or improvement. This is due to the limited sampling (input values). In
addition, in the concept of the S-box generation, the tested S-box values are the region labels and are
not based on changes of the output of the PLCM function. Another factor is the limited number of
iteration settings, consisting of only three kinds, which cannot represent the possible influence of the number of
iterations.
In addition, we also make a comparison between the Chaotic S-box, Vergili's random S-
box, and the AES S-box. In this analysis, the values to be compared among the three S-boxes are
the maximum error values of the AC, SAC, and BIC. We can see the comparison
between them in Table 9.

Table 9. Comparison of test results of AC, SAC, and BIC

            AES's S-box    Vergili's Random S-box    Chaotic S-box
  ϵSAC      0,125          0,42188                   0,25
  ϵBIC      0,1341         0,4426                    0,4056696

From Table 9, we can conclude that the Chaotic S-boxes have higher error values than the
AES S-box, the standard algorithm. However, when compared with Vergili's random
S-boxes, the Chaotic S-boxes have smaller error values for all parameters except the maximum
value of the BIC, which approaches the maximum value of Vergili's random S-box. Therefore,
we can conclude that the AES S-box is still the best S-box among the three, but the Chaotic S-box is
better than Vergili's random S-box. Moreover, the test results indicate that the Chaotic S-box is
nearly similar to Vergili's S-box, whose generation process is random.

3.4. Analysis of the PLCM Function. The PLCM function used to iterate the generation of the S-
box has three equations, depending on the value of its inputs. We select the input values from
each range of the region. After the generation and analysis process, we find that there are a few
input values that make the PLCM iterate repeatedly without producing any new output. The following
values should not be used in the selection of input values, to avoid bad results:
1. $x = 0$: if the value of $x = 0$, or an iterate reaches $x = 0$, then the map keeps producing the
value $F(x) = 0$.

2. $x = \tfrac{1}{2}$: if the value of $x = \tfrac{1}{2}$, then the second equation of the PLCM yields the value 1,
and the iteration then gives $F(x) = F(1-1, p) = F(0, p)$, which produces the value $x = 0$.
3. $x = p$: if the value of $x = p$, then we use the first equation of the PLCM function,
so that the output becomes $F(x) = \tfrac{p}{p} = 1$.
4. $x = 1$: if the value of $x = 1$, then we use the third equation of the PLCM function,
so that the output becomes $F(x) = F(1-1, p) = F(0, p)$, which falls under the first equation and
produces the value $x = 0$.

4. CONCLUSION

We have shown that the chaotic PLCM function can be used to generate random S-box
outputs. There is no significant effect on the output of an S-box generated from the same IC and p
with different numbers of iterations; this is because the values of the test results are
distributed in the same range.
In this study, only the SAC and BIC criteria are satisfied by the test results, but not
the nonlinearity criterion. When using the PLCM function as a generator of S-box
outputs, we have to choose the parameters IC and p carefully to avoid bad results.

Open Problems
There are some problems that we have not addressed yet:
 choosing more values of the parameters IC and p, covering all possible input values;
 generating S-boxes with many different numbers of iterations;
 applying other approaches related to dynamical systems, such as spread and
dispersion, to analyze the Chaotic S-box;
 generating S-boxes with different chaos functions, such as 2-dimensional and 3-dimensional
chaos functions.

References

[1] ASIM, M., AND VARUN JEOTI. 2008. Efficient and Simple Method for Designing Chaotic S-boxes. ETRI Journal,
Volume 30 Number 1.
[2] BOSE, R AND BANNERJEE, A. 1999. Implementing symmetric (single-key) cryptography using Chaos
Functions.7th Int. Conf. on Advanced Computing and Communications.Roorkee. India.
[3] JAKIMOSKI, G. AND KOCAREV, L. 2001. Chaos and Cryptography: Block Encryption Ciphers Based on Chaotic
Maps. IEEE Trans. Circuits Syst. I, Volume 48, Number 2.
[4] KAVUT, S. AND YUCEL, M.D. 2004. On Some Cryptographic Properties of Rijndael.
[5] KWANGJO, KIM. 1990. A Study on The Construction and Analysis of S-box for Symmetric Cryptosystem.
Yokohama National University.
[6] LI, SHUJUN, GONZALO ALVAREZ. 2005. Some Basic Cryptographic Requirements for Chaos-Based Cryptosystem.
Hongkong Polytechnic University: China.
[7] MAR, PHYU PHYU AND KHIN MAUNG LATT. 2008. New Analysis Methods on Strict Avalanche Criterion of S-boxes.
World Academy of Science, Engineering and Technology, Number 48.
[8] MENEZES, ALFRED J. PAUL C. VAN OORSCHOT, SCOTT A. VANSTONE. 1997. Handbook of Applied Cryptography.
CRC Press LLC. Boca Raton.

[9] MISTER, S. AND ADAMS, C. Practical S-box Design. Nortel. Station C, Ottawa. Canada.
[10] NYBERG, K. 1991. Perfect Nonlinear S-boxes. Advances in Cryptology, Proceedings of EUROCRYPT’91.
Berlin: Springer-Verlag.
[11]SCHNEIER, BRUCE. 1996. Applied Cryptography: Protocols, Algorithms, and Source Code in C, Second Edition.
New York: John Wiley & Sons, Inc.
[12]SHANNON, CLAUDE. 1949. Communication Theory of Secrecy Systems.
[13] STALLINGS, WILLIAM. 2011. Cryptography and Network Security: Principles and Practices. Fifth edition.
Prentice Hall
[14] YOUSSEF, AMR M. 1997. Analysis and Design of Block Cipher. Phd Thesis. Queen’s University. Canada.
[15] YUCEL, M.D. AND VERGILI I. 2001. Avalanche and Bit Independence Properties for the Ensembles of Randomly
Chosen nxn S-boxes. EE Department of METU, Turkey.
[16] WEBSTER, A.F. AND TAVARES, S.E.. 1989. On the design of S-boxes. Department of Electrical Engineering.
Queen’s University.

JENNY IRNA EVA SARI


National Cryptography Institute
e-mail: [email protected]

BETY HAYAT SUSANTI


National Cryptography Institute
e-mails: [email protected] , [email protected]
Proceedings of ”The 6th SEAMS-UGM Conference 2011”
Applied Mathematics, pp. 449 - 458

MODEL OF PREDATOR-PREY WITH INFECTED PREY IN


TOXIC ENVIRONMENT

LINA ARYATI AND ZENITH PURISHA

Abstract. A model of predator-prey interaction with infected prey in a toxic environment is discussed in this
article. The objectives of this research are to find out whether each of the populations goes extinct or not
and to establish the existence of the toxicant concentration. In this paper, a model of prey and predator with
infected prey in a toxic environment is constructed. The toxicant affects the growth rates of the susceptible
and infected prey but does not affect the growth rate of the predator population. There are four
equilibrium points. With appropriate parameter values and any initial value near the equilibrium
points, in the long run there are three possibilities: only the infected population
goes extinct, only the predator population goes extinct, or both the infected prey and
the predator populations go extinct. Furthermore, with appropriate parameter values, in the long run the
susceptible and infected prey populations and the predator population persist for any initial
value. A numerical simulation is given to illustrate the stability behaviour of an equilibrium point.

Keywords and Phrases: predator-prey, toxic environment, equilibrium points, infected prey,
stability.

1. INTRODUCTION

It is well known that the pollution of the environment is a very serious problem in the
world today, because it is a threat to organisms. Organisms are often exposed to a
polluted environment and take up toxicant. In order to use and regulate toxic substances
wisely, we must assess the risk for the populations exposed to the toxicant. Therefore, it is
important to study the effects of toxicants on populations and to find a theoretical threshold
value, which determines the permanence or extinction of a population. With the rapid expansion of
agriculture and modern industry, much toxicant contaminates the ecosystem.
Contact between species and the environment happens frequently. The change of the
environment caused by pollution affects a variety of life. One example is the use of
pesticides. Pesticides are useful tools in agriculture and forestry because they quickly kill
a significant portion of a pest population. Pesticides can be sprayed instantaneously and
regularly, and all available evidence suggests that pesticides pose potential health hazards


not only to livestock and wildlife but also to mammals and human beings.
Modelling the depletion and conservation of forestry resources (effects of population and
pollution) is studied in Shukla and Dubey [5], and nonlinear models for the survival of two
competing species dependent on a resource in industrial environments have been investigated
by Dubey and Hussain [2].
Pollution can cause the depletion of species growth rates. As species do not exist
alone in nature, it is of more biological significance to study the permanence-extinction
threshold of each population for two or more interacting species subjected to toxicant in
a polluted environment. In an ecological system there are biological interactions, that is, interactions of
two or more species in an ecosystem. Modelling the interaction of two biological species
in a polluted environment is studied in Dubey and Hussain [1], without the spreading of
infection among the species. Modelling of populations with the spreading of a disease is studied by
Mena-Lorca and Hetchote [3].
We propose a mathematical model, built on the model constructed by Sinha et al.
[4], combining three basic models dealing with prey-predator interaction, disease spread,
and the effect of environmental pollution on a single prey species. We consider that the prey
population is affected by an infectious disease. From the model, we found a new equilibrium
point which is locally asymptotically stable.

2. MODEL

We consider a general prey-predator system in which the prey is subjected to some
disease. The prey is divided into two parts, the susceptible prey population and
the infected prey population, represented by the state variables x1(t) and x2(t),
respectively. The predator population, y(t), is assumed to feed on both the susceptible and
infected prey populations with different predation rates. We thus study the effect of the spread
of the disease on the prey population, which is also assumed to be predated by some predator
species. Let U(t) be the toxicant concentration in the susceptible and infected prey
populations, and let C(t) be the environmental concentration of the toxicant.

Transfer diagram:

Now we want to study the effects of environmental pollution on the prey-predator system
when the prey population is already subjected to some disease. After incorporating the effect
of the toxicant in Model 1, we get the following model:
$$\frac{dx_1}{dt} = \theta - \beta x_1 x_2 - \alpha_1 x_1 y - r_1 U x_1 - d_1 x_1, \qquad (2.1)$$
$$\frac{dx_2}{dt} = \beta x_1 x_2 - \alpha_2 x_2 y - r_2 U x_2 - d x_2, \qquad (2.2)$$
$$\frac{dy}{dt} = \alpha_1 x_1 y + \alpha_2 x_2 y - d_3 y, \qquad (2.3)$$
$$\frac{dC}{dt} = Q - \alpha C - \delta C (x_1 + x_2), \qquad (2.4)$$
$$\frac{dU}{dt} = \delta C (x_1 + x_2) - m U. \qquad (2.5)$$
Initial conditions:
$$x_1(0) = x_{10} > 0,\quad x_2(0) = x_{20} > 0,\quad y(0) = y_0 > 0,\quad U(0) = U_0 > 0,\quad C(0) = C_0 > 0,$$
where
θ : recruitment rate,
α1 : predation rate of the susceptible prey,
α2 : predation rate of the infected prey,
β : disease contact rate,
d1 : natural death rate of the susceptible prey population,
d2 : natural death rate of the infected prey population,
d3 : natural death rate of the predator population,
h : disease-induced death rate of the infected prey population,
d = (d2 + h) : net death rate of the infected prey population,
m : natural washout rate of the toxicant from the organism,
r1 : the rate at which the susceptible prey decreases due to the toxicant,
r2 : the rate at which the infected prey decreases due to the toxicant,
δ : uptake rate of the toxicant by the organism,
α : natural depletion rate of the environmental toxicant,
Q : exogenous input rate of the toxicant into the environment.
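To make the model concrete, the system (2.1)–(2.5) can be integrated numerically. The following is a minimal Python sketch using SciPy, with the parameter values and one of the initial values used in the numerical simulation at the end of this paper; it assumes the form of equations (2.1)–(2.5) as written above, in particular that the predator gains from predation at the rates α1 and α2.

```python
import numpy as np
from scipy.integrate import solve_ivp

# parameter values used in the numerical simulation below
theta, a1, a2, beta = 40.0, 0.01, 0.02, 0.04
r1, r2, d1, d, d3 = 0.005, 0.001, 0.01, 0.05, 0.5
Q, alpha, delta, m = 10.0, 0.2, 0.1, 0.3

def rhs(t, s):
    x1, x2, y, C, U = s
    return [theta - beta*x1*x2 - a1*x1*y - r1*U*x1 - d1*x1,   # (2.1)
            beta*x1*x2 - a2*x2*y - r2*U*x2 - d*x2,            # (2.2)
            a1*x1*y + a2*x2*y - d3*y,                         # (2.3)
            Q - alpha*C - delta*C*(x1 + x2),                  # (2.4)
            delta*C*(x1 + x2) - m*U]                          # (2.5)

# initial value (x1, x2, y, C, U) = (40, 10, 70, 5, 35), integrated over 50 time units
sol = solve_ivp(rhs, (0.0, 50.0), [40.0, 10.0, 70.0, 5.0, 35.0],
                t_eval=np.linspace(0.0, 50.0, 501))
print(sol.y[:, -1])    # approximate long-run state
```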

The following lemma shows that all the solutions of the model are bounded in
$\mathbb{R}^5_+ = \{(x_1, x_2, y, C, U) : x_1, x_2, y, C, U \ge 0\}$ as $t \to \infty$.

Lemma 2.1. As $t \to \infty$, all the solutions of Model (2.1)–(2.5) will lie in the region
$$B_2 = \bigl\{(x_1, x_2, y, C, U) \in \mathbb{R}^5_+ :\ 0 \le x_1 + x_2 + y \le M_1,\ m_3 \le C \le M_2,\ m_4 \le U \le M_3,\ x_1 \ge m_1,\ x_1 + x_2 \ge m_2 \bigr\},$$
where $M_1 = \theta/\eta_1$ with $\eta_1 = \min\{d_1, d, d_3\}$, $M_2 = Q/\alpha$, $M_3 = \delta M_1 M_2/m$, and
$m_1, m_2, m_3, m_4$ are positive constants determined by the model parameters.

Proof. Let $W(t) = x_1(t) + x_2(t) + y(t)$; then $W'(t) \le \theta - \eta_1 W(t)$, so that
$$\limsup_{t \to \infty} W(t) \le \frac{\theta}{\eta_1} = M_1, \qquad W(0) = x_{10} + x_{20} + y_0,\ (x_{10}, x_{20}, y_0) \in \mathbb{R}^3_+.$$
Analogously, from (2.4) we get $C' \le Q - \alpha C$ and hence $\limsup_{t\to\infty} C(t) \le Q/\alpha = M_2$, and
from (2.5) we get $U' \le \delta M_1 M_2 - m U$ and hence $\limsup_{t\to\infty} U(t) \le \delta M_1 M_2/m = M_3$.
Similar differential inequalities from below for $x_1$, $x_1 + x_2$, $C$ and $U$ give the positive lower
bounds $m_1$, $m_2$, $m_3$ and $m_4$. Then, as $t \to \infty$, all the solutions tend to $B_2$. ∎

In this section, we consider the stability of the equilibrium points of the model.
Consider $J(x_1, x_2, y, C, U)$ as the Jacobian matrix of
$$h(x_1, x_2, y, C, U) = \begin{pmatrix} \theta - \beta x_1 x_2 - \alpha_1 x_1 y - r_1 U x_1 - d_1 x_1 \\ \beta x_1 x_2 - \alpha_2 x_2 y - r_2 U x_2 - d x_2 \\ \alpha_1 x_1 y + \alpha_2 x_2 y - d_3 y \\ Q - \alpha C - \delta C (x_1 + x_2) \\ \delta C (x_1 + x_2) - m U \end{pmatrix}.$$

Theorem 2.2. With appropriate parameter values, the equilibrium point
$$E_1 = \left(\frac{d_3}{\alpha_1},\; 0,\; \frac{\theta}{d_3} - \frac{r_1 \delta Q d_3}{\alpha_1 m(\alpha\alpha_1 + \delta d_3)} - \frac{d_1}{\alpha_1},\; \frac{Q\alpha_1}{\alpha\alpha_1 + \delta d_3},\; \frac{\delta Q d_3}{m(\alpha\alpha_1 + \delta d_3)}\right)$$
is locally asymptotically stable.
Proof. Setting $\det(\lambda I - J(E_1)) = 0$ and expanding the characteristic polynomial, the
Routh–Hurwitz criterion shows that, with appropriate parameter values, all of its roots have
negative real parts. Hence $E_1$ is locally asymptotically stable. ∎

Explanation: With appropriate parameter values and any initial value near this equilibrium
point, in the long run only the infected prey population goes extinct.

Theorem 2.3. With appropriate parameter values, the equilibrium point
$$E_2 = \left(\frac{\theta}{r_1 U + d_1},\; 0,\; 0,\; \frac{Q - mU}{\alpha},\; U\right)$$
is locally asymptotically stable.
Proof. Setting $\det(\lambda I - J(E_2)) = 0$ and applying the Routh–Hurwitz criterion to the
resulting characteristic polynomial shows that, with appropriate parameter values, all of its roots
have negative real parts. Hence $E_2$ is locally asymptotically stable. ∎

Explanation: With appropriate parameter values and any initial value near this equilibrium
point, in the long run both the infected prey and the predator populations go extinct.

Theorem 2.4. With appropriate parameter values, the equilibrium point
$$E_3 = \left(\frac{r_2 U + d}{\beta},\; \bar x_2,\; 0,\; \frac{Q - mU}{\alpha},\; U\right), \quad \text{with } \bar x_2 = \frac{\theta\beta - (r_1 U + d_1)(r_2 U + d)}{\beta\,(r_2 U + d)},$$
is locally asymptotically stable.
Proof. Setting $\det(\lambda I - J(E_3)) = 0$ and applying the Routh–Hurwitz criterion to the
resulting characteristic polynomial shows that, with appropriate parameter values, all of its roots
have negative real parts. Hence $E_3$ is locally asymptotically stable. ∎

Explanation: With appropriate parameter values and any initial value near this equilibrium
point, in the long run only the predator population goes extinct.

The global stability of the equilibrium point

        E₄ = ( (d₃ − α₂x̂₂)/α₁ , x̂₂ , βx̂₁/α₂ − r₂Û/α₂ − d/α₂ , Q̂/(δ + γ(x̂₁ + x̂₂)) , Û )

is determined by the following theorem.

Theorem 2.5. If d₃ > α₂x₂, βx₁ − r₂U − d > 0, U > 0, and the parameters α, α₁, α₂, β, r₁, r₂, Q,
and m satisfy the threshold conditions determined by α̂, α̂₁, α̂₂, β̂, r̂₁, r̂₂, Q̂, and m̂ given below,
then the equilibrium point

        E₄ = ( (d₃ − α₂x̂₂)/α₁ , x̂₂ , βx̂₁/α₂ − r₂Û/α₂ − d/α₂ , Q̂/(δ + γ(x̂₁ + x̂₂)) , Û )

of the model is globally asymptotically stable.
Proof: Consider the Lyapunov function

        V₂(x₁, x₂, y, C, U) = ½(x₁ − x̂₁)² + ½(x₂ − x̂₂)² + ½(y − ŷ)² + ½(C − Ĉ)² + ½(U − Û)².

Then

        V̇₂ = −a₁₁(x₁ − x̂₁)² + a₁₂(x₁ − x̂₁)(x₂ − x̂₂) + a₁₃(x₁ − x̂₁)(y − ŷ) + a₁₅(x₁ − x̂₁)(U − Û)
             − a₂₂(x₂ − x̂₂)² + a₂₃(x₂ − x̂₂)(y − ŷ) + a₂₅(x₂ − x̂₂)(U − Û) − a₃₃(y − ŷ)²
             − a₄₄(C − Ĉ)² + a₁₄(x₁ − x̂₁)(C − Ĉ) + a₂₄(x₂ − x̂₂)(C − Ĉ) − a₅₅(U − Û)²
             + a₄₅(C − Ĉ)(U − Û),
where
        a₁₁ = βx₂ + α₁y + r₁U + d₁,
        a₁₂ = β(x̂₁ − x̂₂),
        a₁₃ = α₁(x̂₁ − ŷ),
        a₁₄ = γĈ,
        a₁₅ = r₁x̂₁ − γĈ,
        a₂₂ = d + α₂y + r₂U − βx₁,
        a₂₃ = α₂(x̂₂ − ŷ),
        a₂₄ = γĈ,
        a₂₅ = r₂x̂₂ − γĈ,
        a₃₃ = d₃ − α₁x₁ − α₂x₂,
        a₄₄ = δ + γ(x₁ + x₂),
        a₄₅ = γ(x₁ + x₂),
        a₅₅ = m.
If
        (i)    4a₁₂² < a₁₁a₂₂,
        (ii)   2a₁₃² < a₁₁a₃₃,
        (iii)  3a₁₄² < a₁₁a₄₄,
        (iv)   3a₁₅² < a₁₁a₅₅,
        (v)    2a₂₃² < a₂₂a₃₃,
        (vi)   3a₂₄² < a₂₂a₄₄,
        (vii)  3a₂₅² < a₂₂a₅₅,
        (viii) 3a₄₅² < a₄₄a₅₅,
then V̇₂ is negative definite.
These conditions hold whenever α, α₁, β, α₂, r₁, r₂, Q, and m satisfy the threshold relations
determined by α̂, α̂₁, β̂, α̂₂, r̂₁, r̂₂, Q̂, and m̂, where α̂, α̂₁, β̂, and α̂₂ are the positive roots of
certain quadratic equations whose coefficients depend on the model parameters and on the
bounds obtained earlier, and where r̂₁, Q̂, r̂₂, and m̂ are explicit expressions in those same
quantities; all of them follow directly from conditions (i)–(viii). Hence V̇₂ < 0 and, by
Lyapunov's theorem, E₄ is globally asymptotically stable. ∎
We consider the following set of parameter values from Sinha [4]:
α = 40, α₁ = 0.01, α₂ = 0.02, β = 0.04, r₁ = 0.005, r₂ = 0.001, d₁ = 0.01, d = 0.05, d₃ = 0.5, Q = 10, δ = 0.2, γ = 0.1, m = 0.3. Then we obtain the following results (time in years).
Figure 1. Solution curves x₁(t), x₂(t), y(t), C(t), and U(t) of the model for 0 ≤ t ≤ 50 (t in years).

For a long time, with (x₁(0), x₂(0), y(0), C(0), U(0)) = (40, 10, 70, 5, 35), the susceptible prey
population approaches 35, the infected prey population approaches 8, the predator population
approaches 67, the toxicant concentration in the environment approaches 2.5, and the toxicant
concentration in the prey population (susceptible and infected) approaches 32.
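As a cross-check of these long-run values, the simulation can be reproduced with a few lines of Python. This is only a sketch under the assumption that the model has the form reconstructed above (the vector field h) with the parameter values of Sinha [4]; it is not the authors' code.

    import numpy as np
    from scipy.integrate import solve_ivp

    # parameter values quoted in the text (Sinha [4])
    alpha, a1, a2, beta = 40.0, 0.01, 0.02, 0.04
    r1, r2, d1, d, d3 = 0.005, 0.001, 0.01, 0.05, 0.5
    Q, delta, gamma, m = 10.0, 0.2, 0.1, 0.3

    def rhs(t, s):
        x1, x2, y, C, U = s
        return [alpha - beta*x1*x2 - a1*x1*y - r1*U*x1 - d1*x1,
                beta*x1*x2 - a2*x2*y - r2*U*x2 - d*x2,
                a1*x1*y + a2*x2*y - d3*y,
                Q - delta*C - gamma*C*(x1 + x2),
                gamma*C*(x1 + x2) - m*U]

    sol = solve_ivp(rhs, (0, 50), [40, 10, 70, 5, 35], rtol=1e-8)
    print(sol.y[:, -1])   # expected to settle near (35, 8, 67, 2.5, 32)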
Figure 2. Solution curves x₁(t), x₂(t), y(t), C(t), and U(t) of the model for 0 ≤ t ≤ 100 (t in years), for larger initial values.

For a long time, with (x₁(0), x₂(0), y(0), C(0), U(0)) = (250, 150, 550, 80, 300), the susceptible prey
population again approaches 35, the infected prey population approaches 8, the predator population
approaches 67, the toxicant concentration in the environment approaches 2.5, and the toxicant
concentration in the prey population (susceptible and infected) approaches 32.
Thus, for the equilibrium point E₄ of the model, with appropriate parameter values and for any
initial values of the susceptible prey population, infected prey population, predator population,
toxicant concentration in the environment, and toxicant concentration in the prey population,
in the long run the susceptible prey population approaches 35, the infected prey population
approaches 8, the predator population approaches 67, the toxicant concentration in the
environment approaches 2.5, and the toxicant concentration in the prey population approaches 32.

3. CONCLUSION

The prey–predator system in this case has a globally asymptotically stable equilibrium, so that with
appropriate parameter values and for any initial values of the susceptible prey population, infected
prey population, predator population, toxicant concentration in the environment, and toxicant
concentration in the prey population, in the long run the susceptible prey population approaches
(d₃ − α₂x̂₂)/α₁, the infected prey population approaches x̂₂, the predator population approaches
βx̂₁/α₂ − r₂Û/α₂ − d/α₂, the toxicant concentration in the environment approaches
Q̂/(δ + γ(x̂₁ + x̂₂)), and the toxicant concentration in the prey population approaches Û.

References

[1] DUBEY, B. AND HUSSAIN, J., Modelling the Interaction of Two Biological Species in a Polluted
Environment, Journal of Mathematical Analysis and Applications 246, 58-79, 1998.
[2] DUBEY, B. AND HUSSAIN, J., Nonlinear Models for The Survival of Two Competing Species Dependent
on Resource in Industrial Environments, Nonlinear Analysis: Real World Applications 4, 21-44, 2001.
[3] MENA-LORCA, J. AND HETHCOTE, W., Dynamic Models of Infectious Diseases as Regulators of
Population Sizes, J.Math Biol. 30: 693-716,1991.
[4] SINHA, S., MISRA, O.P., AND DHAR, J., Study of a Prey-Predator Dynamics Under the Simultaneous
Effect of Toxicant and Disease, J. Nonlinear Sci. Appl.1, no. 2, 102-117, 2008.
[5] SHUKLA, J. B. AND DUBEY, B., Modelling the Depletion and Conservation Forestry Resources: Effects
of Population and Pollution, J. Math. Biol. 36: 71-94, 1997.

LINA ARYATI
Mathematics Department, Gadjah Mada University.
e-mail: [email protected]

ZENITH PURISHA
Mathematics Department, Gadjah Mada University.
e-mail: [email protected]
Proceedings of ”The 6th SEAMS-UGM Conference 2011”
Applied Mathematics, pp. 459 - 470.

ON THE MECHANICAL SYSTEMS WITH


NONHOLONOMIC CONSTRAINTS: THE MOTION
OF A SNAKEBOARD ON A SPHERICAL ARENA

MUHARANI ASNAL AND MUHAMMAD FARCHANI ROSYID

Abstract. Mathematically, a constraint in a mechanical system living on a Riemannian
manifold is represented by a set of 1-forms which are non-degenerate. The constraint then
induces a distribution of vector fields on which the values of all constraint 1-forms vanish.
When the associated distribution is not involutive (or integrable), the constraint is said to
be nonholonomic. The motion of a snakeboard on a (curved) arena is an instance of a
mechanical system with a nonholonomic constraint. We have studied the motion of a
snakeboard on the internal surface of a sphere, in which we assumed that the radius of the
spherical arena is large and the involved energy is small enough so that the motion of the
snakeboard is limited to a neighbourhood of the nadir point of the sphere. We have derived
the constraint 1-forms of the system. The dynamical equations of the system were then
derived by making use of the so-called Port-Controlled Hamiltonian System (PCHS) method
as well as the constrained Levi-Civita connection method on the configuration manifold.
The PCHS method faced some difficulties concerning the determination of a basis vanishing
the constraint one-forms and diagonalizing the inertia metric. On the other hand, the
constrained Levi-Civita connection method worked systematically, resulting in the equations
of motion of the snakeboard on the spherical arena.

Keywords and Phrases: geometric mechanics, constraints, dynamics, snakeboard.

1. INTRODUCTION

A snakeboard is a modified version of a skateboard in which the front and back pairs
of wheels are independently actuated. The extra degree of freedom enables the rider to
generate forward motion by twisting the body back and forth, while simultaneously
moving the wheels with the proper phase relationship. As with a skateboard, a snakeboard
can also be used on flat or curved surfaces.
The motion of the snakeboard was first investigated in detail by Lewis et al. [1]. That work
discussed geometrically the nonholonomic constraints, the dynamics, the control, and the
forces of the snakeboard. After that, research on the snakeboard was developed by Ostrowski et al. [2]


in relation to the mechanics of undulatory locomotion. This type of locomotion is
generated by coupling internal shape changes to external nonholonomic constraints.
Employing methods from geometric mechanics, they used dynamic symmetries and
kinematic constraints to develop a specialized form of dynamic equations which govern
undulatory systems. In their article, Koon and Marsden [3] establish necessary conditions
for optimal control using the idea of Lagrangian reduction in the sense of reduction under a
symmetry group. The techniques they developed are designed for Lagrangian mechanical
control systems with symmetry, optimal control, and Lagrangian reduction for holonomic
systems. Blangkenstein [4] analyses a 2D mechanical network, the snakeboard. The
mechanism's dynamics defines a Port-Hamiltonian system and is shown to possess an
SE(2) symmetry group, consisting of rigid body motions in the plane. The study of
controlled mechanical systems with nonholonomic constraints and symmetry is carried
out by Bullo and Zefran [5] by making use of the constrained Levi-Civita connection.
They discussed simple control systems, systems with nonholonomic constraints and
vector fields, external forces, and symmetry. The study resulted in the equations of
kinematics and dynamics of the snakeboard. To obtain the motion of a snakeboard played on
the internal surface of a sphere, Bullo and Zefran [5] use the method of Krishnaprasad and
Tsakiris [6]. The Port-Controlled Hamiltonian System (PCHS) method proposed by Duindam
and Stramigioli [7] shows explicitly the energy structure of the system. In their article, they
show how this method is related to the existing literature on Hamiltonian reduction as well
as to the nonholonomic momentum map. Recently, a new port-based method was proposed
that shows explicitly the energy structure of the system. This is the method used by the
authors in studying the snakeboard motion on the internal surface of the sphere.
All the previous studies mentioned above discussed the dynamics of a snakeboard played
on a flat surface as the arena and therefore involved the Lie group SE(2), i.e. the special
Euclidean group, as a part of the configuration manifold. In this research we study the motion
of a snakeboard on an arena in the form of the internal surface of a sphere. In this case, the
Lie group involved is no longer SE(2), but SO(3) × S¹. Here we derive the equations of
motion of the snakeboard on the internal surface of the sphere with the PCHS method. We also
derive the equations of motion of the snakeboard on the internal surface of the sphere by
making use of the so-called constrained Levi-Civita connection. In this research, we
consider the dynamics of the snakeboard on the internal surface of a large enough sphere. The
energy of the system under consideration is assumed to be small enough that the motion of the
system is limited to a neighbourhood of the nadir point of the sphere. The motion of the
snakeboard is assumed to be classical and non-relativistic.

2. THE MECHANICS OF SNAKEBOARD

2.1. The Mechanics of the Snakeboard Derived via the PCHS Method. A conservative
unconstrained Port-Controlled Hamiltonian System (PCHS) is described by the equations
(see Duindam and Stramigioli [7])

        ẋ = J(x) ∂H/∂x (x) + b(x)u,          (1)
        y = bᵀ(x) ∂H/∂x (x).                 (2)

Here, x ∈ 𝒳 is the state of the system, with the state space 𝒳 being a smooth manifold, and
J(x): T*ₓM → TₓM is a skew-symmetric vector bundle map. The map has locally the matrix
representation

        J = ( 0    Iₙ ; −Iₙ   0 ).           (3)

The input vector fields are of the form b(q) = (0, Bᵀ(q))ᵀ. The Hamiltonian H is a smooth
function on 𝒳 representing the total energy of the system. The state space of the system is
𝒳 = T*Q, the cotangent bundle of the n-dimensional configuration manifold Q. Local
coordinates for T*Q are denoted by (q, p) with q ∈ Q. The kinetic energy is defined by a
Riemannian metric g on Q and defines a quadratic form in p, i.e. ⟨p, p⟩_{g⁻¹(q)} = g^{ij}(q) pᵢ pⱼ.
The Hamiltonian is then given by

        H(q, p) = ½ ⟨p, p⟩_{g⁻¹(q)} + V(q).          (4)
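As an illustration of equations (1)–(4) (not taken from the paper), the following Python sketch integrates a hypothetical unconstrained PCH system, a planar pendulum, with state x = (q, p); the pendulum data are assumed.

    import numpy as np
    from scipy.integrate import solve_ivp

    m_, l_, g_ = 1.0, 1.0, 9.81                      # assumed pendulum data
    J = np.array([[0.0, 1.0], [-1.0, 0.0]])          # skew-symmetric map (3) for n = 1
    b = np.array([0.0, 1.0])                         # input vector field b(q) = (0, B(q))^T

    def dH(x):                                       # gradient of H = p^2/(2 m l^2) + m g l (1 - cos q)
        q, p = x
        return np.array([m_*g_*l_*np.sin(q), p/(m_*l_**2)])

    def f(t, x, u=0.0):
        return J @ dH(x) + b*u                       # equation (1)

    sol = solve_ivp(f, (0.0, 10.0), [0.5, 0.0], rtol=1e-9)
    print(sol.y[:, -1], b @ dH(sol.y[:, -1]))        # final state and collocated output (2)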
The constraints can be described as

        ω^j(q) q̇ = ω_i^j(q) q̇^i = 0,   j = 1, …, k,          (5)

where ω¹, ⋯, ω^k are independent 1-forms on Q. They define a distribution 𝒟 ⊂ TQ such
that v_q ∈ 𝒟_q ⟺ ω_i^j(q) v_q^i = 0, j = 1, ⋯, k, called the constraint distribution. The
constraint is said to be nonholonomic when the distribution is not involutive. Nonholonomic
constraints are very important in the description of robotic locomotion systems, such as the
snakeboard considered later in this research. The constraints also introduce constraint forces
of the form λ_j ω^j. The complete set of equations of the PCHS is then given by

        [ q̇^i ; ṗ_i ] = [ 0  1 ; −1  0 ] [ ∂H/∂q^i ; ∂H/∂p_i ] + [ 0 ; λ_j ω^j(q)_i ] + [ 0 ; B_i^l(q) ] u_l,          (6)

        y^l = [ 0   B_i^l(q) ] [ ∂H/∂q^i ; ∂H/∂p_i ],          (7)

        0 = ω_i^j(q) ∂H/∂p_i   for j = 1, …, k.          (8)
Here, λ_j are functions of both q and p, called Lagrange multipliers, B_i^l(q) are input
vector fields and u_l ∈ ℝ². Because the constraints are satisfied at all times, equations
(6), (7), and (8) are called a constrained Port-Controlled Hamiltonian System.
The snakeboard is a modified version of a skateboard in which the front and back
pairs of wheels are independently actuated. The extra degree of freedom enables the rider to
generate forward motion by twisting the body back and forth, while simultaneously moving
the wheels with the proper phase relationship. The configuration manifold S² × T³ is the
configuration space for a snakeboard played on the curved surface (the internal surface of
the sphere). The snakeboard scheme can be seen in Figure 1.

Figure 1. Scheme of Snakeboard

Let {Θ, Φ, θ} represent the position and orientation of the center of the board, 𝜓 the
relative angle between the main body and the rotor, and 𝜙 the relative angle between the
main body and the back wheels. The distance between the center of the board and the wheels
is l.
The inertia tensor of the snakeboard is

            ( mR²       0          0     0     0  )
            (  0    mR² sin²Θ      0     0     0  )
        M = (  0        0         ml²   J_t    0  )          (9)
            (  0        0         J_t   J_t    0  )
            (  0        0          0     0    J_w )

The Hamiltonian of the snakeboard is obtained by making use of equation (4), namely

        H = ½ [ P_Θ²/(mR²) + P_Φ²/(mR² sin²Θ) + (P_θ² − 2P_θP_ψ)/(ml² − J_t) + ml² P_ψ²/(J_t(ml² − J_t)) + P_φ²/J_w ] + mgR cos Θ.          (10)
The constraints of the snakeboard on the flat plane can be expressed via the one-forms

        ω₁ = −sin(θ + φ) dx + cos(θ + φ) dy − l cos φ dθ = 0,          (11)
        ω₂ = −sin(θ − φ) dx + cos(θ − φ) dy + l cos φ dθ = 0.          (12)

If the energy of the snakeboard played on the arena in the form of the internal surface of a
sphere is small enough, the above constraints can be implemented. Therefore, the constraints
of the snakeboard played on the internal surface of the sphere can be written in the spherical
coordinate system as the one-forms

        ω₁ = −R cos Θ sin(θ + φ) dΘ + R sin Θ cos(θ + φ) dΦ − l cos φ dθ,          (13)
        ω₂ = −R cos Θ sin(θ − φ) dΘ + R sin Θ cos(θ − φ) dΦ + l cos φ dθ.          (14)

From equation (5) we obtain the following equations of motion

    P 
   
   mR 2 
    P 
   mR sin 
2 2 
   
   1   
2 P  P 
    2  
(15)
   2  ml  J t  
   2
ml P 
     2
P
 
and the output variables   ml  J t J t  ml 2  J t  
   
    P 
   
   Jw 
 P   1 R cos  sin        2 R cos  sin       
   
 P   1 R sin  cos        2 R sin  cos       
 P    l cos    l cos  
   1 2

 P   U 
    
 P   U 

P P ml 2 (16)
y1   
ml 2  J t J t  ml 2  J t 
(17)
P
y 2

Jw
While the constraints are given by
P P 1  2 P  P 
0 cos  sin        cos         2  l cos  (18)
mR 2
mR sin 
2 2
2  ml  J t 
P P 1  2 P  P  (19)
0    2 cos  sin        cos           l cos 
mR mR 2 sin 2  2  ml 2  J t 
In the above equations of motion, the constraint forces still appear and the constraints do not
yet restrict the momentum phase space. The PCHS method requires a basis consisting of
vector fields on which all of the constraint one-forms vanish and which diagonalizes the
inertia metric of the system (the snakeboard). This calculation, however, is not simple and
not every case gives an analytic answer.

2.2. The Mechanics of the Snakeboard Derived via the Constrained Levi-Civita Connection.


In general, a simple mechanical control system can be formally described by the following
objects:
[1]. an 𝑛-dimensional configuration manifold 𝑄 with coordinate system {𝑞1 , ⋯ , 𝑞 𝑛 },
[2]. an inertia tensor 𝑀 = {𝑀𝑖𝑗 } describing the kinetic energy and defining an inner
product ∙ ,∙ between vector fields on 𝑄, and
[3]. 𝑚 one-forms 𝐹1 , ⋯ , 𝐹𝑚 describing 𝑚 external control forces.

The Christoffel symbols {Γ^i_jk : i, j, k ∈ {1, ⋯, n}} of the inertia tensor M are defined by

        Γ^i_jk = ½ M^{li} ( ∂M_lj/∂q^k + ∂M_lk/∂q^j − ∂M_kj/∂q^l ),          (20)

where M^{li} are the components of M⁻¹ (the summation convention is assumed throughout
the paper). All relevant quantities are assumed to be smooth. In coordinates the equations
of motion are

        q̈^k + Γ^k_ij q̇^i q̇^j = M^{kj} Σ_{a=1}^{m} F_{a j} u^a,          (21)

where F_{a j} is the jth component of F_a.

To formulate these equations in a coordinate-free setting, it is useful to introduce
some geometric concepts. Given two vector fields X and Y, the covariant derivative of Y
with respect to X is the vector field ∇_X Y with coordinates

        (∇_X Y)^i = ∂Y^i/∂q^j X^j + Γ^i_jk X^j Y^k,          (22)

where X^i and Y^i are the ith components of X and Y. The operator ∇ is called an affine
connection and it is determined by the functions Γ^i_jk.
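A symbolic computation of the Christoffel symbols (20) can be sketched as follows; the two-degree-of-freedom inertia tensor used here is only an assumed illustration, not the snakeboard tensor (9).

    import sympy as sp

    q1, q2 = sp.symbols('q1 q2')
    q = [q1, q2]
    M = sp.Matrix([[1 + q2**2, 0], [0, 1]])     # assumed inertia tensor M(q)
    Minv = M.inv()

    def Gamma(i, j, k):
        # equation (20): Gamma^i_{jk} = (1/2) M^{li} (dM_lj/dq^k + dM_lk/dq^j - dM_kj/dq^l)
        return sp.simplify(sum(
            sp.Rational(1, 2) * Minv[l, i] *
            (sp.diff(M[l, j], q[k]) + sp.diff(M[l, k], q[j]) - sp.diff(M[k, j], q[l]))
            for l in range(2)))

    print([[[Gamma(i, j, k) for k in range(2)] for j in range(2)] for i in range(2)])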
Let ℒ_X f be the Lie derivative of a scalar function f with respect to the vector field X.
Given a scalar function f, its gradient grad f is the unique vector field defined implicitly by

        ⟨grad f, X⟩ = ℒ_X f.          (23)

Then it is easy to show the following fact.

Lemma (Constrained Levi-Civita Connection). The equations of motion

        ∇_{q̇} q̇ = Σ_{a=1}^{m} M⁻¹ F_a u_a          (24)

can be written as

        ∇̃_{q̇} q̇ = Σ_{a=1}^{m} P( M⁻¹ F_a u_a ),          (25)

where ∇̃ is the constrained Levi-Civita connection given by

        ∇̃_X Y = ∇_X Y + (∇_X P′)(Y)          (26)

for all vector fields X and Y, with P the orthogonal projection onto 𝒟 and P′ its complement.
Furthermore, for all Y ∈ 𝒟,

        ∇̃_X Y = P( ∇_X Y ).          (27)

Now let us state, without proof, the main theorem of [5] on which our study is based.

Theorem. Let {X₁, …, X_{n−p}} be an orthogonal basis of vector fields for 𝒟. The
generalized Christoffel symbols of ∇̃ are

        Γ̃^k_ij = (1/‖X_k‖²) ⟨ ∇̃_{X_i} X_j , X_k ⟩          (28)

and the equations of motion (7) read

        v̇^k + Γ̃^k_ij v^i v^j = Σ_{a=1}^{m} Y_a^k u_a,          (29)

where v^i are the components of q̇ along {X₁, …, X_{n−p}}, i.e. q̇ = v^i X_i, and where the
coefficients of the control forces are

        Y_a^k = (1/‖X_k‖²) ⟨ F_a , X_k ⟩.          (30)

Furthermore, if the control forces are differentials of functions, that is, if F_a = dφ_a for
some a ∈ {1, …, m}, then

        Y_a^k = (1/‖X_k‖²) ℒ_{X_k} φ_a.          (31)

The number of Christoffel symbols calculated according to equation (8) is 125, which is not
a small number. Fortunately, there are only three symbols which do not vanish.
They are explicitly given by

(32)
and

(33)

Furthermore, we must find an orthogonal basis spanning 𝒟, i.e. a basis on which all of the
constraint one-forms vanish. Initially, we presume a vector field X_q of the form

        X_q = v₁ ∂/∂Θ + v₂ ∂/∂Φ + v₃ ∂/∂θ + v₄ ∂/∂ψ + v₅ ∂/∂φ.          (34)

This vector field must annihilate the constraint one-forms ω₁ and ω₂. The resulting vector
fields for the snakeboard are
l sin   l cos   R cos      
X1    ,
cos   sin   cos   (35)

X '2  ,
 ' (36)

X '3  .
 '

(37)

Note that the field X′₃ is perpendicular to both X₁ and X′₂. A direct way of computing an
orthogonal basis {X₁, X₂, X₃} from the basis {X₁, X′₂, X′₃} is to define

        X₂ = X′₂ − ( ⟨X′₂, X₁⟩ / ⟨X₁, X₁⟩ ) X₁          (38)

so that

J t cos      cos  cos  sin   J t cos      cos 2  cos 2  


X2  
ml 2 Rf1  , ,  ,    ml 2 R sin f1  , ,  ,   
J t cos 2      cos 2   
 
ml 3 f1  , ,  ,     (39)

in which we have defined define the following shorthands:


1
f1  , , ,    (sin 2  cos 2   cos 4  cos 2   cos 2      cos 2 )
l
f 2  , , ,    J t cos 4      cos 4   cos 2 J t (cos 4  cos 2   2lf1  , ,  ,  
 sin 2  cos 2  ) cos 2       ml 4 f12  , , ,  
f3  , , ,    (cos 2      cos 4  sin      R sin   cos      cos 4 
 cos 2  l sin         cos 4  sin   1 cos 2   sin 2  sin 2  
sin 2  cos3  l
f 4  , , ,    (cos      J t2 cos3  (cos 2 ( R cos 2 (cos 2  cos 2   3cos 2  )
sin       3l sin 3  cos  sin ) sin  cos3       l cos 4 
cos 2  sin       cos 2  cos 2   3sin 2   cos 2       cos 2 
((cos 6 R cos   sin 4  R cos 2 ) sin       3cos  sin 3  sin 
1 
l  sin 2   cos 2  cos 2   sin  cos       (cos 6  cos8  l
 3 
cos 2  sin 4  cos 4  ) sin      cos 4  ) sin      cos 2 
f 5  , , ,    (cos 2  (cos 4  cos 2   sin 2  cos 2   3cos 2      cos 2 )(sin 
( R cos 2   sin 2   cos 2   sin       l sin 3  cos  sin )
cos       l sin      cos 2  cos 2  (sin 2   cos 2  cos 2  ) 2 )) J t
f 6  , , ,    J t cos      (sin 3  cos3  l sin 2   cos      sin     
cos 4  cos 2  l  cos 2      sin      cos 4 R sin )
f 7  , , ,    l 3 R( cos3      cos 4  sin      R sin   cos 2      cos 4 
sin      l cos 2   l (sin      cos 2 f1  , ,  ,   R sin 
 ( cos 2   1  sin   cos 4   sin 2  sin 2 ) cos3  sin  )
cos       sin      cos 2 f1  , , ,   l 2 cos 2  )m
f1  , , ,  )

1
f8  , ,  ,    (3(cos 2  sin ( cos 2 R(3sin 2   cos 2  cos 2  ) sin       l
3
(40)
1
sin  cos  sin ) cos3       cos 4  cos 2  sin      l (3sin 2 
3

3
1
 cos  cos  ) cos       sin ( cos 2 R(sin 2   cos 2 
2 2 2

3
cos 2  )(cos 4  cos 2   sin 2  cos 2   2lf1  , ,  ,  ) sin     
1 2
 cos  sin l (cos 4  cos 2   sin 2  cos 2   lf1  , , ,  )
3 3
1
sin  ) cos       cos  cos l sin      (sin 2   cos 2 
3 2 2

3
cos 2  )(cos 4  cos 2   sin 2  cos 2   2lf1  , ,  ,  )) cos 2 
cos 2  J t f1  , , ,   cos 2     )
f 9  , ,  ,    (cos      J t (3cos 2      cos 2   2lf1  , ,  ,    sin 2 
cos 2   cos 4  cos 2 ) f1  , , ,   cos  (( R cos 2 (sin 2   cos 2 
cos 2  ) sin       l sin 3  cos  sin ) sin  cos       l cos 2 
sin      cos 2  (sin 2   cos 2  cos 2  )))
f10  , ,  ,    ( f1  , ,  ,   J t cos      ( cos3      cos 4 ) sin      R
sin   cos 2      cos 4  sin      cos 2  l  sin l (sin  cos 2 
sin 3   Rf1  , , ,   cos 2  sin     ) cos       sin     
cos 2 l 2 f1  , ,  ,   cos 2  )
f11  , ,  ,    (l 3 sin  f1  , , ,   cos      Rm( cos 2      cos 2   (1
 cos 2 ) cos 4   lf1  , ,  ,  ))
f12  , ,  ,    ( J t cos 2  cos  ( cos 2   2 cos 2  cos 2   sin 2  ) sin  ( cos 4 
cos 2   cos 2      cos 2   sin 2  cos 2   2lf1  , ,  ,  )
cos 2      f1  , ,  ,  )
According to equation (6), the only non-vanishing Chistoffel symbols are

f 3  , ,  ,  
 111 
cos 2  cos  sin f1  , ,  ,  

f 4  , ,  ,  
 122 
m 2 l 6 sin f1  , ,  ,  

f 5  , ,  ,  
 112 
ml R sin   f1  , ,  ,   
3 3

f 6  , ,  ,  
 121 
ml R sin   f1  , ,  ,   
3 2

tan   cos 2      cos 2   cos 4   cos 4  cos 2  


 131 
f1  , , ,  

J t sin  cos      cos 2  cos 2    cos 2   2 cos 2  cos 2   sin 2  


 132 
ml 3 R  f1  , ,  ,   
2

f 7  , , ,  
 112 
cos 2  cos 2  sin f 2  , , ,  

f8  , ,  ,  
 222 
ml R sin   f1  , , ,    f 2  , , ,  
3 3

f 9  , , ,  
 122 
sin   f1  , ,  ,    f 2  , ,  ,  
2

f10  , ,  ,  
 221 
sin f1  , ,  ,   f 2  , ,  ,  

f11  , ,  ,  
 31
2

cos 2  f 2  , ,  ,  

f12  , ,  ,  
 32
2

 f1  , ,  ,    f 2  , ,  ,  
2

(41)
We also compute the three norms

(42)

(43)

(44)

The next step is to calculate all of the general external forces acting on the snakeboard, by
using the following equation

(45)

From (8), the only general external force acting on the snakeboard is

        (46)

Finally, in our coordinate system, the kinematic equations are given by

  J cos      cos  cos  sin  


   l sin    t 
     l 2 mRf1  , ,  ,   
cos  
   
l cos    J t cos      cos 2  cos 2  
    l 2 mR sin f  , ,  ,   
v 
   sin  (47)
     R cos        
1


     J cos 2
     cos 2  
 t
  
 
sin    ml 3 f1  , ,  ,  

 

The dynamic equations are

v   111vv   122
    12
1
v   121 v   131v   132

g tan  sin 
 U
lRf1  , , ,  

2  2 
(48)
   112 vv   22
2
    12
 2
v   21
2
 v   31 v   32
l 2 mf1  , ,  ,    gJ t sin  cos      cos  cos  sin   l 2 f1  , ,  ,   
 U
J t f 2  , , ,  

1
  U
Jw

The dynamics and kinematics of the snakeboard obtained by making use of the method of the
constrained Levi-Civita connection do not contain Lagrange multipliers expressing the
constraint forces. The forces are therefore hidden by the method, in accordance with the Lemma.
The nonholonomic constraints of the snakeboard do not restrict the direction of the motion of
the snakeboard, as shown by the kinematic equations, while all the forces that appear in the
dynamic equations of the snakeboard are represented by the right-hand side of equation (12).
Therefore, the equations of motion of the snakeboard obtained by making use of the method
of the constrained Levi-Civita connection are well defined.
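The orthogonalization step (38) is easy to reproduce numerically with respect to an inertia metric; in the following sketch the matrices are placeholder values, not the snakeboard data.

    import numpy as np

    M = np.diag([2.0, 1.0, 3.0])                      # assumed inertia metric at one configuration
    X1 = np.array([1.0, 0.0, 1.0])                    # assumed value of X1 at that configuration
    X2p = np.array([0.0, 1.0, 1.0])                   # assumed value of X2'

    inner = lambda a, b: a @ M @ b                    # <a, b> taken with respect to M
    X2 = X2p - (inner(X2p, X1) / inner(X1, X1)) * X1  # the projection of equation (38)
    print(X2, inner(X2, X1))                          # second number is ~0: X2 is M-orthogonal to X1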

3. CONCLUDING REMARK

The formulation of the equations of motion of the snakeboard on a plane and on the internal
surface of the sphere can be obtained either by using the PCHS method or the constrained
Levi-Civita connection method. However, in the formulation of the equations of motion of the
snakeboard on the internal surface of the sphere using the PCHS method, the Lagrange
multipliers still appear in the equations of motion, while in the equations of motion derived
with the constrained Levi-Civita connection the Lagrange multipliers do not appear (they can
be hidden).

References

[1]. LEWIS, A., OSTROWSKI, J., MURRAY, R. AND BURDICK, J., Nonholonomic Mechanics and Locomotion: the
Snakeboard Example, In Proceedings of the IEEE Conference on Robotics and Automation, 3, 2391-2400, San
Diego, CA, USA, 1994.
[2]. OSTROWSKI, J. P., BURDICK, J. W., LEWIS, A. D. AND MURRAY, R. M., The Mechanics of Undulatory
Locomotion: The Mixed Kinematic and Dynamic Case, In IEEE International Conference on Robotics and
Automation, 2, 1945-1951, Nagoya, Japan, 1995.
[3]. KOON, W. S. AND MARSDEN, J. E., Optimal Control for Holonomic and Nonholonomic Mechanical Systems
with Symmetry and Lagrangian Reduction, SIAM J. of Control and Optimization, 35, 901-929, 1997.
[4]. BLANGKEINSTEIN, G., Symmetries and Locomotion of a 2D Mechanical Network: The Snakeboard,
European Sponsored Project GeoPlex, IST-2001-34166, 2001.
[5]. BULLO, F. AND ZEFRAN, M., On Mechanical Control Systems with Nonholonomic Constraints and
Symmetries, Systems and Control Letters, 45, 133-143, 2001.
[6]. KRISHNAPRASAD, P. S. AND TSAKIRIS, D. P., Oscillations, SE(2)-Snakes and Motion Control: Study
of the Roller Racer, Center for Dynamics and Control of Smart Structures (CDCSS), Technical Report,
University of Maryland, College Park, 1998.
[7]. DUINDAM, V. AND STRAMIGIOLI, S., Energy-Based Model-Reduction and Control of Nonholonomic
Mechanical Systems, In Proceedings of the IEEE International Conference on Robotics and Automation,
4584-4589, New Orleans, LA, 2004.

Muharani Asnal
Kelompok Penelitian Kosmologi, Astrofisika dan Fisika Matematik (KAM),
Laboratorium Fisika Atom dan Inti, Universitas Gadjah Mada Yogyakarta.
e-mail: [email protected]

Muhammad Farchani Rosyid


Kelompok Penelitian Kosmologi, Astrofisika dan Fisika Matematik (KAM),
Laboratorium Fisika Atom dan Inti,Jurusan Fisika, Fakultas Matematika dan Ilmu
Pengetahuan Alam , Universitas Gadjah Mada Yogyakarta.
e-mail: [email protected]
Proceedings of ”The 6th SEAMS-UGM Conference 2011”
Applied Mathematics, pp. 471 - 480.

SAFETY ANALYSIS OF TIMED AUTOMATA HYBRID SYSTEMS


WITH SOS FOR COMPLEX EIGENVALUES

NOORMA YULIA MEGAWATI, SALMAH, INDAH EMILIA WIJAYANTI

Abstract. In this paper we investigate the safety analysis, or reachability, of timed automata hybrid
systems as an extension of the safety analysis of linear systems. The safety verification problem of
linear systems with complex eigenvalues will be converted to an emptiness problem for a semi-
algebraic set. The sum of squares (SOS) decomposition will be used to check the emptiness of the set.
Suppose we are given a set of initial states of mode i and of final states at mode i+1. The safety analysis
is carried out by determining the solution of the differential system in mode i and evaluating it at the
time the discrete transition occurs. This value then becomes the initial state at mode i+1, which we
analyze using the safety analysis of linear systems.
Keywords: hybrid systems, sum of squares (SOS), safety, reachability, complex eigenvalues.

1. INTRODUCTION

Hybrid systems are used for modeling and analyzing systems which have interacting
continuous-valued and discrete-valued state variables. The continuous state variable may be
the value of the state in continuous time, discrete time or a mixture of the two.
A method which provides safety certificates in realistic computation time has been
developed for linear systems with polyhedral sets of initial and final states using geometric
programming in Yazarel and Pappas [6]. The method was extended to timed automata hybrid systems
by using a geometric programming approach in Megawati et al. [8].
In Yazarel et al. [7], the SOS method is used to analyze safety problems with a certain eigen-
structure. The safety verification problem of a linear system with a certain eigen-structure is
converted to an emptiness problem for a semi-algebraic set. The sum of squares (SOS)
decomposition is used to check the emptiness of the set.
The safety analysis of linear systems with sum of squares (SOS) amounts to determining the
feasibility of a boolean combination of sets defined by polynomial equalities and inequalities. The
polynomial equalities describe the trajectories of the system and the polynomial inequalities
describe the regions of initial and final states. The tool used to check the emptiness of the set
of polynomial equalities and inequalities is the Positivstellensatz Theorem.


Following [7], in this paper we discuss the safety predicate for timed automata hybrid
systems, especially for the case of two-mode hybrid systems with purely complex eigenvalues.

2. SUM OF SQUARES
The safety analysis of linear systems with sum of squares (SOS) is to determine the feasibility
of a boolean combination of sets defined by polynomial equalities and inequalities. The polynomial
equalities describe the trajectories of the system and the polynomial inequalities describe the
regions of initial and final states. The tool used to check the emptiness of the set of polynomial
equalities and inequalities is the Positivstellensatz theorem, which is discussed in [1].

Definition 2.1. The monomial m_α associated to the n-tuple α = (α₁, …, αₙ) has the form
        m_α(x) = x^α = x₁^{α₁} x₂^{α₂} ⋯ xₙ^{αₙ},
where α ∈ ℤ₊ⁿ.

Definition 2.2. A multivariate polynomial f(x) is a finite linear combination of monomials
        f = Σ_α c_α m_α = Σ_α c_α x^α,
where c_α ∈ ℝ.

Definition 2.3. A multivariate polynomial f(x) is a sum of squares (SOS) if there exist some
polynomials p_i(x), i = 1, 2, …, m such that
        f(x) = Σ_{i=1}^{m} p_i²(x).

The condition that f(x) = Σ_{i=1}^{m} p_i²(x) is equivalent to the existence of a positive
semidefinite matrix Q such that f = Zᵀ(x) Q Z(x) for some vector of monomials Z(x). For
presenting the method that can prove emptiness of semialgebraic sets, [1] needs the following
definitions.

Definition 2.4. Given a finite set of polynomials {p_i(x)}, with p_i ∈ ℝ[x₁, x₂, …, xₙ], the ideal
I(p_i) generated by {p_i(x)} is
        I(p_i) = { Σ_i a_i p_i : a_i are polynomials for all i }.

Definition 2.5. Given a finite set of polynomials {p_i(x)}, with p_i(x) ∈ ℝ[x₁, x₂, …, xₙ], the
multiplicative monoid generated by {p_i(x)}, which is denoted by M(p_i), is the set of finite
products of the elements p_i, including the empty product (the identity).

Definition 2.6. Given a finite set of polynomials {p_i(x)}, with p_i(x) ∈ ℝ[x₁, x₂, …, xₙ], the
cone generated by {p_i(x)}, which is denoted by P(p_i), is
        P(p_i) = { a + Σ_{j=1}^{k} b_j q_j : a, b_j are sums of squares, q_j ∈ M(p_i) for j = 1, …, k }.

The above definitions will be used in the Positivstellensatz lemma. The following lemma from
[7] provides a characterization of infeasibility certificates for real solutions of systems of
polynomial equalities and inequalities.

Lemma 2.7. Let {f_j}, {h_l} be finite sets of polynomials in x. Then the following statements
are equivalent.
1. The following set is empty:
        { x ∈ ℝⁿ : f_j(x) ≥ 0, h_l(x) = 0, ∀ j, l }.          (1)
2. There exist f ∈ P(f_j), h ∈ I(h_l) such that
        f + h + 1 = 0.          (2)

In this paper, for the safety problem we compute suitable polynomials f ∈ P(f_j) and h ∈ I(h_l)
such that the set (1) is empty; the computation is carried out with SOSTOOLS.
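For instance, a hand-made certificate of the form (2) for the (assumed) toy set {x ∈ ℝ : x − 2 ≥ 0, x + 1 = 0}, which is clearly empty, can be verified symbolically:

    import sympy as sp

    x = sp.symbols('x')
    f = sp.Rational(1, 3) * (x - 2)      # element of the cone P(x - 2): SOS constant 1/3 times the generator
    h = -sp.Rational(1, 3) * (x + 1)     # element of the ideal I(x + 1)
    print(sp.simplify(f + h + 1))        # 0, so the certificate f + h + 1 = 0 of Lemma 2.7 holds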

3. SAFETY ANALYSIS OF LINEAR SYSTEMS

Consider the linear system
        ẋ = Ax,          (3)
where x(t) ∈ ℝⁿ is the state of the system at time t and A ∈ ℝ^{n×n} is the system matrix. Given
the initial state x₀ = x(0), the trajectory of equation (3) for t ≥ 0 is
        x(t) = e^{At} x₀.          (4)
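Numerically, the trajectory (4) is just a matrix exponential; a minimal sketch (the matrix is the one reused in the example of Section 5, the initial state is assumed):

    import numpy as np
    from scipy.linalg import expm

    A = np.array([[-2.0, 2.0], [1.0, -3.0]])
    x0 = np.array([1.0, 1.0])                  # assumed initial state
    for t in (0.5, 1.0, 2.0):
        print(t, expm(A * t) @ x0)             # x(t) = e^{At} x0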
Given the sets of initial states X₀ and of final or unsafe states X_f defined as

        X₀ = { x₀ ∈ ℝⁿ : ⋀_{i=1}^{m} p_i(x₀) ≤ 0 },          (5)

        X_f = { x_f ∈ ℝⁿ : ⋀_{i=m+1}^{m+k} p_i(x_f) ≤ 0 },          (6)

where p_i(x) are polynomial functions with rational coefficients.


According to [7], the forward and backward reachable sets of the linear system (3) are defined
as follows.

Definition 3.1. Given a set of initial states X₀, the forward reachable set Post(A, X₀) and
the backward reachable set Pre(A, X₀) of the linear system (3) are defined as

        Post(A, X₀) = { x_f ∈ ℝⁿ : ∃t ∃x₀ : t ≥ 0 ∧ x₀ ∈ X₀ ∧ x_f = e^{At} x₀ },          (7)

        Pre(A, X₀) = { x_f ∈ ℝⁿ : ∃t ∃x₀ : t ≥ 0 ∧ x₀ ∈ X₀ ∧ x_f = e^{−At} x₀ }.          (8)

The forward and backward safety predicates of the linear system are defined in the following
definition.

Definition 3.2. Given a set of final or unsafe states X_f, the forward safety predicate
Safe⁺(A, X₀, X_f) and the backward safety predicate Safe⁻(A, X₀, X_f) are defined as

        Safe⁺(A, X₀, X_f) = 1 if Post(A, X₀) ∩ X_f = ∅, and 0 otherwise,          (9)

        Safe⁻(A, X₀, X_f) = 1 if Pre(A, X₀) ∩ X_f = ∅, and 0 otherwise.           (10)

We say Safe(A, X₀, X_f) = 1 if Safe⁺(A, X₀, X_f) = 1 and Safe⁻(A, X₀, X_f) = 1. In this paper
we discuss, given a linear system (A, X₀, X_f), how to determine whether Safe⁺(A, X₀, X_f) = 1
and Safe⁻(A, X₀, X_f) = 1.

4. SAFETY ANALYSIS FOR LINEAR SYSTEMS WITH PURELY
IMAGINARY EIGENVALUES

Let A ∈ ℝ^{2m×2m} be decomposable into block diagonal form by an invertible matrix
T ∈ ℚ^{2m×2m}. If we define a new state vector z ∈ ℝ^{2m}, z = T⁻¹x, then we obtain the
equivalent linear system
        ż = Λz,          (11)
with Λ = T⁻¹AT, where Λ ∈ ℚ^{2m×2m} is the block diagonal matrix

        Λ = blockdiag( [ 0  ω₁ ; −ω₁  0 ], …, [ 0  ω_m ; −ω_m  0 ] ),          (12)

whose eigenvalues are ±iω_i, ω_i ∈ ℚ. The differential equation in each 2-dimensional
subspace takes the form

        [ ż_{2i−1} ; ż_{2i} ] = [ 0  ω_i ; −ω_i  0 ] [ z_{2i−1} ; z_{2i} ],   i = 1, …, m.          (13)

Equation (13) has the solution

        z_{2i−1}(t) = cos(ω_i t) z_{0,2i−1} + sin(ω_i t) z_{0,2i},
        z_{2i}(t)   = −sin(ω_i t) z_{0,2i−1} + cos(ω_i t) z_{0,2i}.          (14)
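That (14) indeed solves the 2-D block (13) can be confirmed symbolically; a sketch, not part of the paper:

    import sympy as sp

    t, w = sp.symbols('t omega', positive=True)
    z01, z02 = sp.symbols('z01 z02')

    z1 = sp.cos(w*t)*z01 + sp.sin(w*t)*z02     # z_{2i-1}(t) from (14)
    z2 = -sp.sin(w*t)*z01 + sp.cos(w*t)*z02    # z_{2i}(t) from (14)
    print(sp.simplify(sp.diff(z1, t) - w*z2))  # 0
    print(sp.simplify(sp.diff(z2, t) + w*z1))  # 0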
Let the sets of initial and final (unsafe) states be given in the eigenspace, where x₀ and x_f are
states of the sets X₀ and X_f defined in (5) and (6). Since T is invertible, Z₀ and Z_f become

        Z₀ = { z₀ ∈ ℝⁿ : ⋀_{i=1}^{m} p_{zi}(z₀) ≤ 0 },

        Z_f = { z_f ∈ ℝⁿ : ⋀_{i=1}^{m} p_{zi}(z_f) ≤ 0 },

where p_{zi}(z) are polynomials with rational coefficients. The safety analysis for linear systems
with purely imaginary eigenvalues in [7] is given in the following theorem.
Theorem 4.1. Given a linear system (A, X₀, X_f) where A is a diagonalizable matrix with
purely imaginary eigenvalues, the following statements are equivalent.
1. Safe(A, X₀, X_f) = 1.
2. Safe(Λ, Z₀, Z_f) = 1.
3. The following set defined by polynomial equalities and inequalities is empty for the
system in modal coordinates:
        −z_{f,2i}   − y_i z_{0,2i−1} + w_i z_{0,2i} = 0,   i = 1, …, m,
        −z_{f,2i−1} + w_i z_{0,2i−1} + y_i z_{0,2i} = 0,   i = 1, …, m,
        w_i − f_i(w, y) = 0,   i = 1, …, m,
        y_i − g_i(w, y) = 0,   i = 1, …, m,          (15)
        w² + y² − 1 = 0,
        p_{zi}(z₀) ≤ 0,   i = 1, …, m,          (16)
        p_{zi}(z_f) ≤ 0,   i = 1, …, m,          (17)
where ω_i ∈ ℚ, n_i ∈ ℤ, d_i ∈ ℤ₊, d = Π_{i=1}^{n} d_i, s_i = ω_i d, c = gcd(s₁, s₂, …, s_n)
(gcd denotes the greatest common divisor), and f(w, y) and g(w, y) are the polynomial functions
defined by
        cos(σt) = f(cos t, sin t),   σ ≥ 1,          (18)
        sin(σt) = g(cos t, sin t),   σ ≥ 1.          (19)

5. SAFETY ANALYSIS OF TIMED AUTOMATA HYBRID SYSTEMS

In this part, we discuss the safety predicate of timed automata hybrid systems,
especially for the case of two-mode hybrid systems.

Figure 1. Hybrid automaton with two modes

Let us start with the system in mode 1 (q₁), with dynamic equation ẋ = A₁x. Assume that at time t
the mode jumps from mode 1 to mode 2 (q₂) with new dynamic equation ẋ = A₂x, where
x(t) ∈ ℝⁿ is the state at time t and A₁, A₂ ∈ ℝ^{n×n} are the system matrices. The sets of initial
states (q₁, X₀) and final states (q₂, X_f) are defined as follows:

        X₀ = { x₀ ∈ ℝⁿ : ⋀_{i=1}^{m} p_i(x₀) ≤ 0 },

        X_f = { x_f ∈ ℝⁿ : ⋀_{i=m+1}^{m+k} p_i(x_f) ≤ 0 },

where p_i(x) are polynomials with rational coefficients. Following [2], we define the reachable
set of a timed automata hybrid system.

Definition 5.1. A state (q̂, x̂) is reachable if there exists a finite execution (τ, q, x) with
τ = ⋃_{i=0}^{N} [τ_i, τ_i′] such that (q(τ_N′), x(τ_N′)) = (q̂, x̂). The collection of all reachable states
is denoted by the set Reach ⊆ Q × X.



If the unsafe state of the timed automata hybrid system is denoted by X_f, the safety predicate of
the timed automata hybrid system is defined as follows.

Definition 5.2. If X_f at mode q₂ is the set of final or unsafe states, then the forward safety
predicate is defined as
        Safe⁺(A, (q₁, X₀), (q₂, X_f)) = 1 if the unsafe states are unreachable, and 0 otherwise.

The problem in the safety analysis of a timed automata hybrid system is: given the two sets of
initial and final (unsafe) states, determine whether a trajectory starting from the initial set can
reach the final set, i.e. whether Safe⁺(A, (q₁, X₀), (q₂, X_f)) = 1.
From Definition 5.1, the following steps are used to evaluate the safety predicate of the
timed automata hybrid system.
1. Determine the solution of the differential equation in mode 1 (q₁) with x₀ ∈ X₀. Find the
state value x(T), where T denotes the time of occurrence of the discrete transition. The state
x(T) becomes the initial state of mode 2 (q₂), denoted by x̂₀ = x̂(0) = x(T).
2. Carry out the safety analysis for mode 2 (q₂) using the same steps, with initial state
x̂₀ = x̂(0) = x(T).
Example. Consider a hybrid automaton with mode-change time T = 2 and system matrices

        A₁ = [ −2  2 ; 1  −3 ],     A₂ = [ 0  1 ; −9  0 ].

The set of initial states in mode q₁ is (3x₀₁ − 1)² + (x₀₂ − 2)² − 1 ≤ 0, and the set of final
(unsafe) states in mode q₂ is (3x_{f1} − 4)² + (x_{f2} − 4)² − 1 ≤ 0. The trajectory of mode q₁
at time T = 2 is

        x(2) = [ ⅔e⁻² + ⅓e⁻⁸   ⅔e⁻² − ⅔e⁻⁸ ; ⅓e⁻² − ⅓e⁻⁸   ⅓e⁻² + ⅔e⁻⁸ ] x₀
             ≈ [ 0.0903  0.0900 ; 0.0450  0.0453 ] x₀.

Next, we analyze the safety predicate at mode q₂. The initial state of mode q₂ is
x̂₀ = x(2) ≈ [ 0.0903  0.0900 ; 0.0450  0.0453 ] x₀, where x₀ ∈ X₀. The eigenvalues of the
matrix A₂ are λ₁ = 3i and λ₂ = −3i.

We define w = cos 3t and y = sin 3t; then equation (15) becomes

        h₁: −z_{f1} + w z₀₁ + y z₀₂ = 0,
        h₂: −z_{f2} − y z₀₁ + w z₀₂ = 0,          (20)
        h₃: w² + y² − 1 = 0.

Equation (16) becomes

        f₁: (ẑ₀₁ − 1)² + (ẑ₀₂ − 2)² − 1 ≤ 0,          (21)

and equation (17) becomes

        f₂: (z_{f1} − 4)² + (z_{f2} − 4)² − 1 ≤ 0,          (22)

where ẑ₀₁ = 0.1249 z₀₁ + 0.0935 z₀₂ and ẑ₀₂ = 0.0138 z₀₁ + 0.0107 z₀₂.
The SOSTOOLS test returns Safe⁺(A, (q₁, X₀), (q₂, X_f)) = 1. It can be concluded that if
x₀ ∈ X₀ is selected at mode q₁, then x_f ∉ X_f, i.e. the system is safe.

6. CONCLUSION

In this paper, the safety analysis of linear systems with purely complex eigenvalues using
SOS has been extended to analyze the safety of a timed automata hybrid system with two
modes. We assumed that the system matrix of each mode of the timed automata hybrid system
is diagonalizable. The safety verification for the hybrid timed automaton, given the two sets of
initial and final (unsafe) states, is carried out as follows: find the solution of the differential
equation of mode 1 with x₀ ∈ X₀ and determine the state x(T) at the mode-change time T.
Take this terminal state as the initial state of the next mode and then carry out the safety
analysis in mode 2 as in the safety analysis of linear systems.

References
[1] BOCHANAK, J., COSTE, M., AND ROY, M.F., 1998, “Real Algebraic Geometry”, Springer – Verlag, Berlin
[2] BEMPORAD, A.,DE SCHUTTER, B. AND. HEEMELS, W.M.P.H., 2003,”Modelling and Control of Hybrid
Systems”, Lecture Notes of the DISC Course.
[3] BRANICKY, M.S. V.S. BORKAR AND S.K. MITTER, ”A Unified Framework for Hybrid Control: Model and
Optimal Control”, IEEE Trans on Automatic Control, Vol. 43, pp. 31-45, 1998.
[4] YAZAREL, H. AND PAPPAS,G.J., “Geometric Programming Relaxations for Linear System Reachability”,
Proceeding American Control Conference, pp. 553 – 559, 2004.
[5] YAZAREL,H. ,PRAJNA, S., AND PAPPAS,G.J., “SOS for Safety”, Proceeding 43rd IEEE Conference on Decision
and Control, vol.1, pp 461-466, 2004.
[6] MEGAWATI, N.Y., SUTARTO, H.Y., SALMAH, SUPARWANTO, A., WIJAYANTI, I.E., SOLIKHATUN, BUDIYONO, A.,

JOELIANTO, E., 2009, “Safety Analysis of a Class of Timed Automata Hybrid Systems”, Proceeding International
Conference of Instrument, Control and Automation, ICA, 20 – 22 Oktober 2009, pp. 277 – 282.

Noorma Yulia Megawati


Department of Mathematics, Gadjah Mada University, Yogyakarta, Indonesia
e-mail : [email protected]

Salmah
Department of Mathematics, Gadjah Mada University, Yogyakarta, Indonesia
e-mail :[email protected]

Indah Emilia Wijayanti


Department of Mathematics, Gadjah Mada University, Yogyakarta, Indonesia
e-mail : [email protected]
Proceedings of ”The 6th SEAMS-UGM Conference 2011”
Applied Mathematics, pp. 481–498.

GLOBAL ASYMPTOTIC STABILITY OF VIRUS


DYNAMICS MODELS AND THE EFFECTS OF
CTL AND ANTIBODY RESPONSES

Nughthoh Arfawi Kurdhi and Lina Aryati

Abstract. Various viruses have attacked humans and caused diseases. In the human
body, a virus requires a host cell to reproduce and live on. The existence of virus
particles activates the immune system, in which CTL and antibody responses play a role.
In this paper, we present the virus dynamics with CTL and antibody responses. The global
stability of the equilibrium points for virus dynamics models with CTL and antibody responses
is explored by using appropriate Lyapunov functions. We derive the basic reproduction
number of the virus R0 and the immune response reproduction numbers R1 and R2 for the
virus infection model. The global dynamics of the model are completely determined by
the value of R0 . If R0 < 1 then the virus-free equilibrium is globally asymptotically stable,
and in case R0 > 1 there is a unique endemic equilibrium which takes over this property. In
addition, we show that the CTL and antibody responses have an important role in controlling
the density of free virus particles and of infected cells.
Keywords and Phrases: Virus, CTL and antibody immune responses, reproduction num-
ber, global stability, Lyapunov function.

1. INTRODUCTION
A virus is one of the intracellular pathogens which disturb cell growth. For
reproduction, the virus has to enter a cell and use the cell's metabolic machinery. Each
virus has an affinity for a particular type of cell. For example, the HIV virus recognizes
CD4+ T cells, white blood cells which are also known as helper T cells. A virus particle
encounters a noninfected cell, which becomes an infected cell. Infected cells then produce
more virus over time. In the course of viral infection, the host cell is changed or even
killed. On the other hand, virus particles activate the immune system. The role of the
immune system is to fight off invasion by foreign pathogens such as viruses. Note that

2010 Mathematics Subject Classification: 93A30, 37N25, 92Bxx


the immune response after viral infection is universal and necessary to eliminate or
control the disease. During the process of viral infection, the host response that is induced
is initially rapid and nonspecific (natural killer cells, macrophage cells, etc.) and later
specific (CTLs, antibodies). However, in most virus infections, CTL and antibody play
a critical part in antiviral defense.
In fact, the specific immune system has three major branches. Two of them are
mainly effector responses; that is, they directly fight the virus. The third branch is
mainly a regulatory response that helps the effector responses to become established.
The two effector responses are CTL and antibody. There is a salient distinction between
their respective roles. That is, while antibodies attach to the free virus and neutralize
it, CTL identify and destroy infected cells. In addition, CTL can secrete substances
that trigger a reaction inside the infected cells that prevents the viral genome from being
expressed. The helping branch of the immune system consists of the so-called CD4+ T helper
cells. They help the induction of antibody and CTL responses.
In their books, Nowak and May [5] and Wodarz [10] have developed several
mathematical models (systems of differential equations) describing the dynamics of
viruses and the responsiveness of the immune system. They discussed equilibrium points of the
models and the basic reproduction number of the virus. Furthermore, Korobeinikov [2]
has analyzed the global properties of the basic virus dynamics model, whereas Kurdhi and
Aryati [3] and Pruss et al. [8] analyzed the global stability of the equilibrium points of the model
for virus dynamics with cytotoxic T lymphocytes (CTL). Moreover, Yousfi et al. [9] and
Wodarz [10] analyzed the model of virus dynamics with CTL and antibody responses.
However, they did not consider the effects of the immune responses in controlling the virus
infection and the interaction between CTL and antibody when they work together as a
part of the immune system. In this paper, we analyze the global stability of the model
for virus dynamics with CTL and antibody responses by using appropriate Lyapunov
functions. In the global stability theorems, we consider the values of R0 , R1 , and R2
as the basic reproduction numbers for the virus, CTL, and antibody responses, respectively. In
particular, the role of the immune responses and the interaction between CTL and antibody
can be described through the equilibrium points of each model and their stability.
The paper is organized as follows. In section 2 we formulate the virus dynamics
models with and without CTL. In section 3 we formulate and analyze the global stability
of the virus dynamics model with CTL and antibody. The numerical simulation of the
models is presented in section 4. Finally, the conclusions are summarized in section 5.

2. VIRUS DYNAMICS MODEL WITH AND WITHOUT CTL


The simplest model of virus dynamics has three state variables, namely noninfected
cells (Z), infected cells (I), and free viruses (V ). It is assumed that the noninfected
cells are produced at a constant rate α and that noninfected cells encounter free virus
and become infected cells at rate rZV . It is also assumed that each infected cell produces
new viruses at rate kI. The death rates of noninfected cells, infected cells, and free
viruses are mZ, µI, and cV , respectively. Thus, the average respective life times of non-
infected cells, infected cells, and free virus particles are 1/m, 1/µ, and 1/c, respectively. The
interaction among these population variables is described by a system of differential
equations (Nowak and May [5]; Wodarz [10])
Ż = α − mZ − rV Z, t ≥ 0,
I˙ = rV Z − µI, t ≥ 0,
(1)
V̇ = kI − cV, t ≥ 0,
Z(0) = Z0 , I(0) = I0 , V (0) = V0 ,
with given constants α, m, r, µ, k, c > 0 and initial values Z0 , I0 , V0 ≥ 0. The basic
reproduction number
        R₀ = αrk/(mµc)          (2)
is the average number of newly infected cells produced by a single infected cell at the
beginning of the infection when almost all cells are still noninfected. Persistence or
extinction of an infection depends on the quantity of R0 . To describe it, we first note
that the system (1) has two equilibrium points:
        Q̄ = (Z̄, Ī, V̄) = (α/m, 0, 0),          (3)

        Q* = (Z*, I*, V*) = ( α/(mR₀), mc(R₀ − 1)/(rk), m(R₀ − 1)/r ).          (4)
The first one is the virus-free equilibrium point, which is always positive, whereas the second
one is the endemic equilibrium point and has positive value if and only if R₀ > 1. In fact, it
can be proved that if R₀ < 1 then the equilibrium Q̄ is globally asymptotically stable. If
R₀ > 1 then Q̄ becomes unstable, whereas the equilibrium Q* is globally asymptotically
stable. (See Korobeinikov [2, Thm. 1.1]).
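A quick numerical illustration of this dichotomy for system (1); the parameter values below are assumed for illustration only, while the formulas for R₀ and Q* are the ones above.

    import numpy as np
    from scipy.integrate import solve_ivp

    alpha, m, r, mu, k, c = 10.0, 0.1, 0.001, 0.5, 50.0, 3.0   # assumed values
    R0 = alpha*r*k/(m*mu*c)                                    # formula (2); here R0 > 1

    def rhs(t, s):
        Z, I, V = s
        return [alpha - m*Z - r*V*Z, r*V*Z - mu*I, k*I - c*V]

    sol = solve_ivp(rhs, (0, 500), [alpha/m, 0.0, 1.0], rtol=1e-8)
    Q_star = (alpha/(m*R0), m*c*(R0 - 1)/(r*k), m*(R0 - 1)/r)
    print(R0, sol.y[:, -1], Q_star)   # since R0 > 1, the trajectory approaches Q*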
One of the important roles in the human immune response is played by CTLs. Their number
is denoted by T . Following Nowak and May [5] and Pruss et al. [8], we assume that
infected cells I are destroyed at a rate sIT by CTLs and that the CTL proliferation rate is
proportional to the abundance of infected cells and CTLs, dIT . Hence, model (1) can be
modified to
Ż = α − mZ − rV Z, t ≥ 0,
I˙ = rV Z − µI − sIT, t ≥ 0,
V̇ = kI − cV, t ≥ 0, (5)
Ṫ = dIT − nT, t ≥ 0,
Z(0) = Z0 , I(0) = I0 , V (0) = V0 , T (0) = T0 ,
with given constants α, m, r, µ, k, c, s, d, n > 0 and initial values Z0 , I0 , V0 , T0 ≥ 0.
In this model, the interaction between infected cells and CTLs is very similar to the
dynamics between predator and prey in ecology. The CTLs are predators that grow on
and kill their prey (the infected cells).

The system (5) always has virus free equilibrium point


        P̄ = (Z̄, Ī, V̄, T̄) = (α/m, 0, 0, 0),          (6)

and two endemic equilibrium points

        P* = (Z*, I*, V*, T*) = ( µc/(rk), mc(R₀ − 1)/(rk), m(R₀ − 1)/r, 0 ),          (7)

        P̂ = (Ẑ, Î, V̂, T̂) = ( αdc/(mdc + rkn), n/d, kn/(dc), αdrk/(s(mdc + rkn)) − µ/s ),          (8)
where P∗ is positive and different from P̄ if and only if R0 >1; whereas P̂ is strictly
positive if and only if the second threshold condition R1 < R0 holds, where
        R₁ = 1 + rkn/(mdc).          (9)
Here, R₁ is called the basic reproduction number for the CTL response of system (5). It is easy
to see that R₁ > 1 always holds. Prior to activation of the CTL response, the spread of
a virus infection depends on R₀ given by (2). If R₀ > 1, then the infection will spread
initially. The existence of infected cells triggers the activation of the CTL response. At
this stage, persistence or extinction of the CTL response depends on R₁ given by (9).
In fact, it can be proved that if R₀ < 1 then the equilibrium P̄ is globally asymptotically
stable. If 1 < R₀ < R₁ then the equilibrium P̄ becomes unstable and P* is globally asymp-
totically stable. If R₁ < R₀ then the equilibria P̄ and P* become unstable, whereas P̂
is globally asymptotically stable. (See Kurdhi [3, Thm. 2.4] and Pruss [8, Thm. 2.1]).

3. VIRUS DYNAMICS MODEL WITH CTL AND ANTIBODY


In this section, we consider the virus dynamics model with CTL and antibody
responses presented by Yousfi et al. in [9] and Wodarz in [10]. The model contains
five variables: noninfected cell Z, infected cell I, free virus V , a CTL response T and
an antibody response A. The model is given by the following nonlinear system of
differential equations
Ż = α − mZ − rV Z, t ≥ 0,
I˙ = rV Z − µI − sIT, t ≥ 0,
V̇ = kI − cV − pV A, t ≥ 0,
(10)
Ṫ = dIT − nT, t ≥ 0,
Ȧ = gV A − hA, t ≥ 0,
Z(0) = Z0 , I(0) = I0 , V (0) = V0 , T (0) = T0 , A(0) = A0 ,
with given constants α, m, r, µ, k, c, s, d, n, p, g, h > 0 and initial values Z0 , I0 ,
V0 , T0 , A0 ≥ 0. Susceptible host cells are produced at a rate α, die at a rate mZ and
become infected by virus at a rate rV Z. Infected cells die at a rate µI and are killed
by the CTL response at a rate sIT . Free virus is produced by infected cells at a rate

kI, decays at a rate cV , and is neutralized by antibodies at a rate pV A. CTL expand


in response to viral antigen derived from infected cells at a rate dIT , and decay in the
absence of antigenic stimulation at a rate nT . Antibodies develop in response to free
virus at a rate gV A and decay at a rate hA.

3.1. Some Preliminary Results. It is important to show positivity and boundedness
for the system (10), as the variables represent populations. Positivity implies that the
populations survive, and boundedness may be interpreted as a natural restriction to growth
as a consequence of limited resources. In this section, we present some basic results, such
as the existence and uniqueness of solutions, the positive invariance of the system (10), the
existence of equilibrium points, and the boundedness of solutions.

3.1.1. Existence and uniqueness of solutions. We denote by C¹[I] the set of continuously
differentiable functions defined on the interval I and taking values in ℝ⁵. The model (10) can
be written in the form
        Ẋ(t) = F (X(t)),          (11)
where X(t) = (x₁, x₂, x₃, x₄, x₅)ᵀ := (Z, I, V, T, A)ᵀ, X(0) = (Z₀, I₀, V₀, T₀, A₀)ᵀ ∈ ℝ⁵₊
and

        F(X) = (F₁(X), F₂(X), F₃(X), F₄(X), F₅(X))ᵀ
             = (α − mZ − rV Z, rV Z − µI − sIT, kI − cV − pV A, dIT − nT, gV A − hA)ᵀ.
It is easy to check that Fi (X) ∈ C 1 [0, +∞), i = 1, 2, 3, 4, 5. Due to the fundamental
theorem of existence and uniqueness for initial value problems in ordinary differential
equations by Perko [7], there exists a unique solution to model (10).

3.1.2. Positive invariance. It is easy to check that Fᵢ(X)|_{xᵢ=0} ≥ 0, i = 1, 2, 3, 4, 5.
Due to the well known theorem by Nagumo [4], any solution of (10) with initial point
X₀ ∈ ℝ⁵₊, say X(t) = X(t; X₀), is such that X(t) ∈ ℝ⁵₊ for all t ∈ [0, +∞).
3.1.3. Boundedness. We set U = Z + I + (µ/(2k))V + (s/d)T + (µp/(2kg))A and
q = min{m, µ/2, c, n, h} > 0. Then system (10) yields the inequality

        U̇ = α − mZ − (µ/2)I − (µc/(2k))V − (sn/d)T − (µph/(2kg))A ≤ α − qU,

so that U(t) ≤ U(0) + α/q by integration. Therefore the solutions are bounded for all t ≥ 0.
This shows that the system (10) is dissipative.

3.2. Analysis of the model. In this subsection, we will study the global asymptotic
stability of model (10). This model reduces to (5) if A0 = 0. Then we obtain three
equilibrium points
    Ē = (Z̄, Ī, V̄, T̄, Ā) = (α/m, 0, 0, 0, 0),                                                      (12)
    E∗ = (Z∗, I∗, V∗, T∗, A∗) = (µc/(rk), mc(R0 − 1)/(rk), m(R0 − 1)/r, 0, 0),                     (13)
    Ê = (Ẑ, Î, V̂, T̂, Â) = (αdc/(mdc + rkn), n/d, kn/(dc), αdrk/(s(mdc + rkn)) − µ/s, 0),           (14)
where Ē is always positive; whereas E∗ and Ê are positive if and only if R0 > 1 and
R1 < R0, respectively. There are exactly two more endemic equilibrium points, namely
    Ẽ = (Z̃, Ĩ, Ṽ, T̃, Ã) = (αg/(mg + rh), αrh/(µ(mg + rh)), h/g, 0, αrkg/(µp(mg + rh)) − c/p),      (15)
    Ĕ = (Z̆, Ĭ, V̆, T̆, Ă) = (αg/(mg + rh), n/d, h/g, αrhd/(ns(mg + rh)) − µ/s, (kng − cdh)/(dhp)),   (16)
where Ẽ and Ĕ are strictly positive if and only if R0 > R2 and R2 < R1 < R0,
respectively, where
    R2 = 1 + rh/(mg).                                                                              (17)
Here, R2 is called the basic reproduction number for the antibody response of system (10).
Prior to activation of the CTL and antibody responses, the spread of a virus infection
depends on the value of R0. If R0 > 1, then the infection will spread initially. The
existence of infected cells and virus particles triggers the activation of the CTL and
antibody responses. At this stage, persistence or extinction of the CTL and antibody
responses depends on the values of R1 and R2. The next theorems describe the global
asymptotic stability of the equilibrium points completely in terms of the numbers R0,
R1, and R2.
Theorem 3.1. Let R0 <1. Then Ē is the unique positive equilibrium of (10). It is
globally asymptotically stable.
Proof. We introduce the function Φ0 : Ω0 → ℝ, where Ω0 = {(Z, I, V, T, A) ∈ ℝ⁵ : Z > 0, I, V, T, A ≥ 0} and
    Φ0(Z, I, V, T, A) = Z̄(Z/Z̄ − ln(Z/Z̄)) + I + (r/c)Z̄(V + (p/g)A) + (s/d)T.
Clearly, Φ0 ∈ C¹(Ω0) and Φ0(Z, I, V, T, A) is positive definite with respect to Ω0. Calculating the time derivative of Φ0(Z, I, V, T, A) along the positive solutions of the model (10), we obtain
    Φ̇0(Z, I, V, T, A) = α(2 − Z/Z̄ − Z̄/Z) + µI(R0 − 1) − (sn/d)T − (αrph/(mcg))A.
Using the arithmetic-geometric inequality, we have that
    Z/Z̄ + Z̄/Z ≥ 2
for all Z > 0, and the equality holds only for Z = Z̄. Furthermore, since R0 < 1 and
I, T, A ≥ 0, we obtain that Φ̇0(Z, I, V, T, A) ≤ 0 for all (Z, I, V, T, A) ∈ Ω0. Thus
Φ0 is a Lyapunov function, and Φ̇0(Z, I, V, T, A) = 0 when Z = Z̄, I = 0, T = 0, and
A = 0. Let M be the largest invariant set in the set
    H = {(Z, I, V, T, A) ∈ Ω0 : Φ̇0(Z, I, V, T, A) = 0}
      = {(Z, I, V, T, A) ∈ Ω0 : Z = Z̄, I = 0, T = 0, A = 0}.

We have from the first equation of (10) that M ={Ē}. It follows from La Salle’s principle
that the equilibrium Ē is globally asymptotically stable on Ω0 . 

Theorem 3.2. Let 1 < R0 < R1 and R0 < R2. Then Ē and E∗ are the positive equilibria
of (10). The equilibrium E∗ is globally asymptotically stable.

Proof. We introduce the function Φ∗ : Ω∗ → ℝ, where Ω∗ = {(Z, I, V, T, A) ∈ ℝ⁵ : Z, I, V > 0, T, A ≥ 0} and
    Φ∗(Z, I, V, T, A) = Z∗(Z/Z∗ − ln(Z/Z∗)) + I∗(I/I∗ − ln(I/I∗))
                        + (r/c)Z∗( V∗(V/V∗ − ln(V/V∗)) + (p/g)A ) + (s/d)T.
Clearly, Φ∗ ∈ C¹(Ω∗) and Φ∗(Z, I, V, T, A) is positive definite with respect to Ω∗. Calculating the time derivative of Φ∗(Z, I, V, T, A) along the positive solutions of the model (10), we obtain
    Φ̇∗(Z, I, V, T, A) = α( 3(1 − 1/R0) + 2/R0 ) − ( mZ + α²/(R0² mZ) ) − (α²/(R0 mZ))(1 − 1/R0)
                        − (αr/µ)(1 − 1/R0)(VZ/I) − (mµR0/r)(1 − 1/R0)(I/V)
                        + (smc/(rk))(R0 − R1)T + (pmµ/(rk))(R0 − R2)A.
Since R0 > 1 and the arithmetic mean is greater than or equal to the geometric mean, it is clear that
    (α²/(R0 mZ))(1 − 1/R0) + (αr/µ)(1 − 1/R0)(VZ/I) + (mµR0/r)(1 − 1/R0)(I/V) ≥ 3α(1 − 1/R0),
and
    mZ + α²/(R0² mZ) ≥ 2α/R0
for all Z, I, V > 0, and the equalities hold only for Z = Z∗, I = I∗ and V = V∗. Furthermore,
since R0 < R1 and R0 < R2, and T, A ≥ 0, we obtain that Φ̇∗(Z, I, V, T, A) ≤ 0 for
all (Z, I, V, T, A) ∈ Ω∗. Thus Φ∗ is a Lyapunov function, and Φ̇∗(Z, I, V, T, A) = 0
when Z = Z∗, I = I∗, V = V∗, T = 0, and A = 0. The largest compact invariant set in
H = {(Z, I, V, T, A) ∈ Ω∗ : Φ̇∗(Z, I, V, T, A) = 0} is the singleton {E∗}. Therefore,
the endemic equilibrium E∗ is globally asymptotically stable on Ω∗ by LaSalle's
principle. 

Theorem 3.3. Let R1 < R0 and R1 < R2. If R0 < R2, then Ē, E∗, and Ê are the positive
equilibria of (10); whereas if R2 < R0, then Ē, E∗, Ê, and Ẽ are the positive equilibria
of (10). The equilibrium Ê is globally asymptotically stable.

Proof. We introduce the function Φ̂ : Ω̂ → ℝ, where Ω̂ = {(Z, I, V, T, A) ∈ ℝ⁵ : Z, I, V > 0, T, A ≥ 0} and
    Φ̂(Z, I, V, T, A) = Ẑ(Z/Ẑ − ln(Z/Ẑ)) + Î(I/Î − ln(I/Î))
                        + (r/c)Ẑ( V̂(V/V̂ − ln(V/V̂)) + (p/g)A ) + (s/d)T̂(T/T̂ − ln(T/T̂)).
Clearly, Φ̂ ∈ C¹(Ω̂) and Φ̂(Z, I, V, T, A) is positive definite with respect to Ω̂. Calculating
the time derivative of Φ̂(Z, I, V, T, A) along the positive solutions of the model (10), we
obtain
    dΦ̂/dt = α( 3(1 − 1/R1) + 2/R1 ) − ( mZ + α²/(R1² mZ) ) − (α²/(R1 mZ))(1 − 1/R1)
             − (mc/k)R1(1 − 1/R1)(VZ/I) − (αk/c)(1 − 1/R1)(I/V) + (αp/c)((R1 − R2)/R1)A.
Using the arithmetic-geometric inequality, since R1 > 1, we have that
    (α²/(R1 mZ))(1 − 1/R1) + (mc/k)R1(1 − 1/R1)(VZ/I) + (αk/c)(1 − 1/R1)(I/V) ≥ 3α(1 − 1/R1),
and
    mZ + α²/(R1² mZ) ≥ 2α/R1
for all Z, I, V > 0, and the equalities hold only for Z = Ẑ, I = Î and V = V̂. Furthermore,
since R1 < R2 and A ≥ 0, we obtain that dΦ̂/dt ≤ 0 for all (Z, I, V, T, A) ∈ Ω̂. Thus
Φ̂ is a Lyapunov function, and dΦ̂/dt = 0 when Z = Ẑ, I = Î, V = V̂, and A = 0.
Let M be the largest invariant set in the set
    H = {(Z, I, V, T, A) ∈ Ω̂ : dΦ̂/dt = 0}
      = {(Z, I, V, T, A) ∈ Ω̂ : Z = Ẑ, I = Î, V = V̂, A = 0}.
We have from the second equation of (10) that M = {Ê}. It follows from LaSalle's
principle that the equilibrium Ê is globally asymptotically stable on Ω̂. 

Theorem 3.4. Let R2 < R0 < R1. Then Ē, E∗, and Ẽ are the positive equilibria of
(10). The equilibrium Ẽ is globally asymptotically stable.

Proof. We introduce the function Φ̃ : Ω̃ → ℝ, where Ω̃ = {(Z, I, V, T, A) ∈ ℝ⁵ : Z, I, V > 0, T, A ≥ 0} and
    Φ̃(Z, I, V, T, A) = Z̃(Z/Z̃ − ln(Z/Z̃)) + Ĩ(I/Ĩ − ln(I/Ĩ))
                        + (r/(c + pÃ))Z̃( Ṽ(V/Ṽ − ln(V/Ṽ)) + (p/g)Ã(A/Ã − ln(A/Ã)) ) + (s/d)T.
Clearly, Φ̃ ∈ C¹(Ω̃) and Φ̃(Z, I, V, T, A) is positive definite with respect to Ω̃. Calculating
the time derivative of Φ̃(Z, I, V, T, A) along the positive solutions of the model (10), we
obtain
    dΦ̃/dt = α( 3(1 − 1/R2) + 2/R2 ) − ( mZ + α²/(R2² mZ) ) − (α²/(R2 mZ))(1 − 1/R2)
             − (αr/µ)(1 − 1/R2)(VZ/I) − (mµ/r)R2(1 − 1/R2)(I/V)
             + (αs/µ)( rh/(mg + rh) − µn/(αd) )T.
Since R2 > 1 and the arithmetic mean is greater than or equal to the geometric mean, it is clear that
    (α²/(R2 mZ))(1 − 1/R2) + (αr/µ)(1 − 1/R2)(VZ/I) + (mµ/r)R2(1 − 1/R2)(I/V) ≥ 3α(1 − 1/R2),
and
    mZ + α²/(R2² mZ) ≥ 2α/R2
for all Z, I, V > 0, and the equalities hold only for Z = Z̃, I = Ĩ, and V = Ṽ. Furthermore,
since R2 < R0 < R1 we have
    R0/R2 < (R1 − 1)/(R2 − 1) ⇐⇒ αrk/(mµc) < (kng/(cdh))(1 + rh/(mg)) ⇐⇒ rh/(mg + rh) < µn/(αd),
so, together with T ≥ 0, we obtain that dΦ̃/dt ≤ 0 for all (Z, I, V, T, A) ∈ Ω̃. Thus Φ̃ is
a Lyapunov function, and dΦ̃/dt = 0 when Z = Z̃, I = Ĩ, V = Ṽ, and T = 0.
Let M be the largest invariant set in the set
    H = {(Z, I, V, T, A) ∈ Ω̃ : dΦ̃/dt = 0}
      = {(Z, I, V, T, A) ∈ Ω̃ : Z = Z̃, I = Ĩ, V = Ṽ, T = 0}.
We have from the third equation of (10) that M = {Ẽ}. It follows from LaSalle's
principle that the equilibrium Ẽ is globally asymptotically stable on Ω̃. 

Theorem 3.5. Let R2 < R1 < R0. Then Ē, E∗, Ê, Ẽ, and Ĕ are the positive equilibria
of (10). The equilibrium Ĕ is globally asymptotically stable.

Proof. We introduce the function Φ̆ : Ω̆ → ℝ, where Ω̆ = {(Z, I, V, T, A) ∈ ℝ⁵ : Z, I, V > 0, T, A ≥ 0} and
    Φ̆(Z, I, V, T, A) = Z̆(Z/Z̆ − ln(Z/Z̆)) + Ĭ(I/Ĭ − ln(I/Ĭ))
                        + (r/(c + pĂ))Z̆( V̆(V/V̆ − ln(V/V̆)) + (p/g)Ă(A/Ă − ln(A/Ă)) ) + (s/d)T̆(T/T̆ − ln(T/T̆)).
Clearly, Φ̆ ∈ C¹(Ω̆) and Φ̆(Z, I, V, T, A) is positive definite with respect to Ω̆. Calculating
the time derivative of Φ̆(Z, I, V, T, A) along the positive solutions of the model (10), we
obtain
    dΦ̆/dt = α( 3(1 − 1/R2) + 2/R2 ) − ( mZ + α²/(R2² mZ) ) − (α²/(R2 mZ))(1 − 1/R2)
             − (αr/µ)(R3/R0)(1 − 1/R2)(VZ/I) − (mµ/r)(R0R2/R3)(1 − 1/R2)(I/V),
where R3 = (kng/(cdh))R2. Using the arithmetic-geometric inequality, since R2 > 1, we have
that
    (α²/(R2 mZ))(1 − 1/R2) + (αr/µ)(R3/R0)(1 − 1/R2)(VZ/I) + (mµ/r)(R0R2/R3)(1 − 1/R2)(I/V) ≥ 3α(1 − 1/R2),
and
    mZ + α²/(R2² mZ) ≥ 2α/R2
for all Z, I, V > 0, and the equalities hold only for Z = Z̆, I = Ĭ, and V = V̆. Then, we
obtain that dΦ̆/dt ≤ 0 for all (Z, I, V, T, A) ∈ Ω̆. Thus Φ̆ is a Lyapunov
function, and dΦ̆/dt = 0 when Z = Z̆, I = Ĭ, and V = V̆. Let M be the
largest invariant set in the set
    H = {(Z, I, V, T, A) ∈ Ω̆ : dΦ̆/dt = 0}
      = {(Z, I, V, T, A) ∈ Ω̆ : Z = Z̆, I = Ĭ, V = V̆}.
We have from the second and third equations of (10) that M is the singleton {Ĕ}. It
follows from LaSalle's principle that the equilibrium Ĕ is globally asymptotically stable
on Ω̆. 

We observe that for R0 < R1 and R0 < R2, the CTL and antibody responses have
no influence for large t, insofar as the solution of (10) converges to the equilibrium given
by T = 0 and A = 0 and the equilibrium of the basic virus system (1). In case R0 > 1,
notice that the threshold conditions R0 < R1 and R0 < R2 are equivalent to I∗ < n/d and
V∗ < h/g, respectively, and that the numbers of CTLs and antibodies decrease strictly if
and only if I < n/d and V < h/g, respectively, due to (10) (where T, A > 0, say). Therefore,
since I and V converge to I∗ and V∗ for large t, respectively, the numbers of CTLs and
antibodies converge to zero as t → ∞. Hence, in order to trigger a significant immune
response the reproduction rate must be large enough to push I and V over the critical
values n/d and h/g, respectively. In this case, the following three outcomes can be observed.
(i) If R1 < R0 and R1 < R2, the CTL response develops and the antibody response
    cannot become established. This is because the CTL response is strong and
    reduces virus load to levels that are too low to stimulate the antibody response.
    In this case, Î = n/d, whereas the threshold condition R1 < R2 is equivalent to
    V̂ < h/g. Therefore, the number of antibodies converges to zero as t → ∞.

(ii) If R2 < R0 < R1, the antibody response develops and a sustained CTL response fails.
    This is because the antibody response is strong relative to the CTL response
    and reduces virus load to levels that are too low to stimulate the CTL. In this
    case, we have Ṽ = h/g. Moreover, since R2 < R0 and R2 < R1, it is clear that
        R2(R2 − 1) < R0(R2 − 1) = R2(R2 − 1) + (R2 − 1)(R0 − R2),
        R2(R2 − 1) < R2(R1 − 1) = R2(R2 − 1) + R2(R1 − R2).
    Furthermore, since R2 < R0 < R1, we have that (R2 − 1)(R0 − R2) < R2(R1 − R2).
    Therefore,
        R0(R2 − 1) < R2(R1 − 1) ⇐⇒ (R2 − 1)/R2 < (R1 − 1)/R0
                                 ⇐⇒ (α/(µR2))(R2 − 1) < n/d
                                 ⇐⇒ Ĩ < n/d.
    Thus the number of CTLs converges to zero as t → ∞.
(iii) If R2 < R1 < R0, both CTL and antibody responses develop. This is attained
    because in this case Ĭ = n/d and V̆ = h/g.
These outcomes are thus governed by competition between CTL and antibody
responses for the virus population. This is because the virus population is a resource
that both CTL and antibody require for survival.
The role of the CTL and antibody immune responses can be described through the equilibrium
points of each model. Comparing the model (10) with the basic model (1), we
have that
    V∗/V̆ = (R0 − 1)/(R2 − 1) > 1,   I∗/Ĭ = (R0 − 1)/(R1 − 1) > 1,   Z∗/Z̆ = R2/R0 < 1.
Thus the CTL and antibody immune responses decrease the densities of free viruses and
of infected cells and increase the density of noninfected cells. In addition, the effect of the
antibody response is shown by the following comparison:
    V̂/V̆ = (R1 − 1)/(R2 − 1) > 1,   Î/Ĭ = 1,   Ẑ/Z̆ = R2/R1 < 1.
Compared with the model (5), the antibody response thus decreases the density of
free viruses and increases the density of noninfected cells. However, the antibody
response has no influence on the density of infected cells.

4. NUMERICAL SIMULATION
In this section, we perform some numerical simulations to demonstrate the theoretical
results obtained in Section 3, using Mathematica 7.0. We present the numerical
simulations to observe the dynamics of system (10) with the set of parameter values in
Table 1. We have seen in the previous sections that the values of R0, R1, and R2 play a
decisive role in determining the virus and immune response dynamics. We can compute

that R0 = 5.71429, R1 = 1.05714, and R2 = 1.01333, so that R2 < R1 < R0. Hence,
the solutions of model (10) with two different initial conditions converge to the endemic
equilibrium Ĕ = (986.842, 0.143, 0.667, 22.105, 16.429) (see Figure 1).

Table 1. Parameter values used for simulation

Parameter (description)                                        Range                                 Source           Value
α (noninfected cell production rate)                           0-10 cells mm⁻³ day⁻¹                 [1, 5, 6, 10]    10
m (noninfected cell death rate)                                0.01-0.02 day⁻¹                       [1, 5, 6, 10]    0.01
µ (infected cell death rate)                                   0.2398-0.7 day⁻¹                      [1, 5, 6, 10]    0.7
r (rate at which noninfected cells become infected)            2.4×10⁻⁵-2×10⁻⁴ mm³ cell⁻¹ day⁻¹      [1, 5, 6, 10]    2×10⁻⁴
k (free virus production rate)                                 3-100 day⁻¹                           [1, 5, 6, 10]    100
c (free virus death rate)                                      3.33-12.5 day⁻¹                       [1, 5, 6, 10]    5
s (rate at which infected cells are eliminated by the CTL)     10⁻⁵-1 mm³ cell⁻¹ day⁻¹               [1, 6, 10]       10⁻²
d (CTL production rate)                                        0.1-1 mm³ cell⁻¹ day⁻¹                [1, 6, 10]       0.7
n (CTL death rate)                                             0.05-0.25 day⁻¹                       [1, 6, 10]       0.1
p (rate at which free virus is deactivated by antibodies)      1 mm³ cell⁻¹ day⁻¹                    [10]             1
g (antibody production rate)                                   0.5-1.5 mm³ cell⁻¹ day⁻¹              [10]             1.5
h (antibody death rate)                                        1 mm³ cell⁻¹ day⁻¹                    [10]             1
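The numbers quoted above can be reproduced directly from Table 1 and equation (16). The short Python check below uses the expressions for R0 and R1 implied by the threshold conditions of Section 3 (an assumption about their exact written form) together with (17) and (16); it returns the values 5.71429, 1.05714, 1.01333 and the components of Ĕ listed in the text.

    alpha, m, mu, r = 10.0, 0.01, 0.7, 2e-4
    k, c, s, d, n, p, g, h = 100.0, 5.0, 1e-2, 0.7, 0.1, 1.0, 1.5, 1.0

    R0 = alpha * r * k / (m * mu * c)        # ~5.71429
    R1 = 1 + r * k * n / (m * c * d)         # ~1.05714
    R2 = 1 + r * h / (m * g)                 # ~1.01333  (equation (17))
    print(R2 < R1 < R0)                      # True, so Theorem 3.5 applies

    # Coexistence equilibrium E-breve from equation (16).
    E_breve = (alpha * g / (m * g + r * h),                              # Z ~ 986.842
               n / d,                                                    # I ~ 0.143
               h / g,                                                    # V ~ 0.667
               alpha * r * h * d / (n * s * (m * g + r * h)) - mu / s,   # T ~ 22.105
               (k * n * g - c * d * h) / (d * h * p))                    # A ~ 16.429
    print([round(x, 3) for x in E_breve])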

In Theorem 3.5, we proved that the equilibrium point Ĕ is globally asymptotically
stable. Figure 2 illustrates the global attractivity of the endemic equilibria in a
simulation of the model with and without immune responses. This figure shows that the
solutions (Z(t), I(t), V(t)) of models (1), (5), and (10) corresponding to different initial
values converge to the equilibria Q∗ = (175, 11.786, 235.714), P̂ = (945.946, 0.143,
2.857, 308.378), and Ĕ = (986.842, 0.143, 0.667, 22.105, 16.429), respectively.
To understand the role of the immune responses numerically, in Figure 3 we illustrate
the comparison of the noninfected cell, infected cell, and virus populations between
models (1), (5), and (10). From the densities of noninfected cells, infected cells and virus
particles in the endemic equilibria Q∗, P̂, and Ĕ, we can conclude that the CTL and
antibody immune responses decrease the densities of free viruses and of infected
cells and increase the density of noninfected cells. Furthermore, without the immune
responses, the peak density of infected cells reaches 460 cells/mm³, while with the
CTL response it reaches 21 cells/mm³, and with the CTL and antibody responses
it reaches only 1.5 cells/mm³. Moreover, without the immune responses, the peak
density of free viruses reaches 9.1×10³ viruses/mm³, while with the CTL response it reaches
320 viruses/mm³, and with the CTL and antibody responses it reaches only 14 viruses/mm³.
Thus, it can be seen that the CTL and antibody immune responses exert substantial
control over the populations of infected cells and virus particles.
We have seen in the previous section that the interaction between CTLs and antibodies
is a competition for the virus population. Figure 4 illustrates the competition between both

Figure 1. The density of noninfected cells, infected cells, free virus
particles, CTLs, and antibodies for the parameter values in Table 1 and
initial values (Z0, I0, V0, T0, A0) = (1000, 0, 0.001, 0.001, 0.001) (red
line) and (500, 10, 10, 10, 10) (blue line).

the immune responses during acute infection. As the virus population grows, both the CTL
and antibody responses start to expand. The outcome of the dynamics in acute
infection depends on the relative strengths of the responses. According to the numerical
results, three outcomes are possible.

(i) The CTL response is strong relative to the antibody response. Thus, the CTL
    response develops while the antibody response does not become fully established,
    and the CTL response clears the infection.

Figure 2. State trajectories for (i) system (1), (ii) system (5), and
(iii) system (10), starting from different initial conditions.

(ii) The CTL response is weak relative to the antibody response. However, the
    antibody response is unlikely to clear the infection. The reason is that, while
    free virus particles are removed, a relatively large pool of infected cells remains
    because they are not killed. Hence, the result is persistent infection in
    the presence of an ongoing antibody response.
(iii) Both the CTL and antibody responses are sufficiently strong to become fully
    established, and the outcome is virus clearance.

Figure 3. The density of noninfected cells, infected cells, and free


virus particles for system (1) (red line), (5) (blue line), and (10) (green
line).

5. CONCLUDING REMARKS
In this paper, we have studied the global dynamics of a virus dynamics model with
CTL and antibody immune responses. By constructing suitable Lyapunov functions,
sufficient conditions have been derived for the global stability of five equilibrium points.
If the basic reproduction number for virus infection satisfies R0 < 1, the virus-free equilibrium is
globally asymptotically stable, and in case R0 > 1 there is a unique endemic equilibrium
which takes over this property. The stability of the four endemic equilibrium points also
depends on the basic reproduction numbers for the CTL response, R1, and for the
antibody response, R2, which determine the persistence or extinction of the CTL and
antibody responses. If 1 < R0 < R1 and R0 < R2, the equilibrium E∗ is globally asymptotically
stable and the infection becomes chronic but without CTL and antibody responses; if
R1 < R0 and R1 < R2, the equilibrium Ê is globally asymptotically stable and the infection
turns chronic with a CTL response but without an antibody response; if R2 < R0 < R1,
the equilibrium Ẽ is globally asymptotically stable and the infection becomes chronic
with an antibody response but without a CTL response; if R2 < R1 < R0, the equilibrium
Ĕ is globally asymptotically stable and the infection turns chronic with both CTL and
antibody responses.
The interaction between the CTL and antibody responses is a competition for the virus population.
This is because both CTLs and antibodies proliferate in response to stimulation

Figure 4. Dynamics during acute infection. The graphs of the level
of immunity show the density of CTLs (blue line) and of antibodies
(red line), whereas the graphs of the level of viral infection show the
density of infected cells (blue line) and of free virus particles (red line).
Parameters are chosen as follows: α = 10, m = 0.1, r = 0.01, µ =
0.1, s = 1, k = 1, c = 1, p = 1, n = 0.1, h = 0.1, for (i) g = 0.5, d = 1,
(ii) g = 1.5, d = 1, and (iii) g = 1, d = 0.1.

by the same virus. We have shown this by analysis and by numerical simulation. For
example, if the CTL response suppresses virus load to levels that are too low to stimulate
the antibody response, then a successful antibody response might not be generated.
Conversely, if the antibodies reduce virus load to levels that are too low to stimulate the
CTL, a successful CTL response will not be established.

From the numerical simulations, we see that the persistence of the CTL and antibody
responses decreases the densities of infected cells and of free virus particles, both
at equilibrium and at the peak of infection. Hence, the CTL and antibody responses
play an important role in the reduction of the virus infection.

Acknowledgement. We are grateful to the anonymous referees for their extremely insightful
comments, which improved our paper.

References
[1] Adams, B. M., Banks, H. T., Davidian, M., Hee-Dae Kwon, and Tran, H. T., Dynamic
Multidrug Therapies for HIV: Optimal and STI Control Approaches, Mathematical Biosciences
and Engineering 1, 223-241, 2004.
[2] Korobeinikov, A., Global properties of Basic Virus Dynamics Models, Bull. Math. Biol. 68,
615-626, 2009.
[3] Kurdhi, N. A. and Aryati, L., Global Stability of Virus Dynamics Model with CTL Response,
Department of Mathematics UGM, 2010.
[4] Nagumo, N., Uber die lage der integralkurven gewohnlicher differential gleichungen, Proc. Phys-
Math. Soc. Japan 24, 551-559, 1942.
[5] Nowak, M. A. and May, R., Virus Dynamics, Oxford University Press, Inc., New York, 2000.
[6] Perelson, A. S., Kirschner, D. E., and Boer, R. D., Dynamics of HIV infection of CD4+ T
Cells, Mathematical Biosciences 114, 81-125, 1993.
[7] Perko, L., Differential Equations and Dynamical Systems, Springer-Verlag, New York, 1991.
[8] Pruss, J., Zacher, R., and Schnaubelt, R., Global Asymptotic Stability of Equilibria in Models
for Virus Dynamics, Math. Model. Nat. Phenom 3, 126-142, 2008.
[9] Yousfi, N., Hattaf, K., and Rachik, M., Analysis of a HCV Model with CTL and Antibody
Responses, Applied Mathematical Sciences 3, 2835-2846, 2009.
[10] Wodarz, D., Killer Cell Dynamics: Mathematical and Computational Approaches to Immunology,
Springer-Verlag, New York, 2007.

Nughthoh Arfawi Kurdhi


Department of Mathematics, Sebelas Maret University.
e-mail: math [email protected]

Lina Aryati
Department of Mathematics, Gadjah Mada University.
e-mail: [email protected]
Proceedings of ”The 6th SEAMS-UGM Conference 2011”
Applied Mathematics, pp. 499 - 504

A SIMPLE DIFFUSION MODEL OF PLASMA LEAKAGE IN


DENGUE INFECTION

NUNING NURAINI, DINNAR RACHMI PASYA, EDY SOEWONO

Abstract. In this paper we present a mathematical model of diffusion within the vascular wall in
dengue infection that may capture the plasma leakage phenomenon. The aim of this model is to
analyze the relation between an increased cytokine level and the blood concentration within the vascular
wall. Numerical solutions of the diffusion model are obtained by the finite difference method.
Simulations of the model indicate the effect of the cytokine for various parameter values. It is
shown that a highly increased cytokine level causes a high blood concentration in the vascular wall, which
may contribute to plasma leakage.
Keywords and Phrases: diffusion model, dengue viruses, cytokine level, plasma leakage.

1. INTRODUCTION
Dengue virus infection is an acute febrile disease that has become a major public
health problem in many tropical and subtropical regions of the world. One of the forms of
the illness is Dengue Haemorrhagic Fever (DHF). DHF is characterized by plasma leakage
that may lead to death. Once inoculated into a human host, dengue has an incubation period
of 3-14 days (average 4-7 days), while viral replication takes place in target dendritic cells
[3,4].
The virus infects target cells, primarily those of the reticuloendothelial system, such as
dendritic cells, hepatocytes, and endothelial cells. After a person is infected with dengue, the
person develops an immune response to that dengue subtype. The immune response produces
specific antibodies to that subtype's surface proteins that prevent the virus from binding to
macrophage cells (the target cells infected by dengue viruses). However, if another
subtype of dengue virus infects the individual, the virus will activate the immune system to
attack the first subtype.
The immune system is tricked because the four dengue subtypes (DEN 1, DEN 2,
DEN 3 and DEN 4) have very similar surface antigens. The antibodies bind to the surface
_______________________________
2010 Mathematics Subject Classification :34C60, 92D30


proteins but do not inactivate the virus. The immune response attracts numerous macrophages,
which the virus proceeds to infect because it has not been inactivated. This makes the viral
infection much more acute [6].
The infected macrophages then give signals to the immune system. This
phenomenon is called antigen presentation. As a result, it may lead to activation of the T
cells. Activated T cells produce a range of cytotoxic factors, which lyse infected
monocytes-macrophages and some uninfected target cells, and cytokines that regulate or
"help" the immune response, such as gamma interferon (IFN-γ), IL-2, IL-4, IL-5, IL-6, IL-8,
IL-10, IL-12, TNF-α, and TNF-β [2,3,4]. Macrophages or monocytes which are infected by
dengue viruses also produce TNF-α, TNF-β, IL-1, IL-1B, IL-6, and platelet activating factor
(PAF). Based on Kurane and Ennis's hypothesis, the rapid increase in the levels of TNF-α, IL-2,
IL-6, IFN-γ, and PAF induces increased vascular permeability, plasma leakage, shock and
malfunction of the coagulation system, which may lead to hemorrhage [3].
Activation of complement is another important clinical manifestation in DHF. It was
reported that the levels of C3a and C5a, complement activation products, are correlated with the
severity of DHF, and that the levels of C3a and C5a reach their peak at the time of defervescence,
when plasma leakage becomes most apparent [5].
In summary, there are several factors that increase vascular permeability, such as
cytokines, chemical mediators (PAF) and complement activation. Increased vascular
permeability may lead to plasma leakage. One of the manifestations that indicates plasma
leakage is an increase in the hematocrit level of up to 20% above the normal condition [1].
Models for dengue transmission among populations are well established; some of them
are explained in [7]. However, mathematical modelling of dengue infection within a host is quite
rare; modeling of the virus dynamics for this infection using differential equations is studied in
[4,5,6]. The model in [4] includes an immune response, whereas those in [5,6] do not.
In this paper we discuss a simple model to capture the plasma leakage phenomenon in dengue
infection within a host. We use a one-dimensional diffusion equation with a cytokine effect to
see the dynamics of the blood concentration in the vascular wall.

2. MODEL FORMULATION

In this section we describe the one-dimensional diffusion equation that we use to
represent the diffusion process from the inner radius, denoted by a, to the outer radius, b, of the
vascular wall, as in Figure 1.
In this model we develop two scenarios. In the first scenario, the diffusion model within the
vascular wall has no highly increased cytokine level. In the second scenario, the diffusion model
within the vascular wall has a highly increased cytokine level. In the model equation, the
increased cytokine level is represented by K(t).

Figure 1. Cross-section artery illustration (inner radius a, outer radius b)



We assume that the blood concentration representing plasma leakage is affected only
by the cytokine. Suppose that C is the blood concentration (density in %), r (in mm) is the
radial coordinate in the vascular wall, D is the diffusion coefficient of blood, and t is time. The
model equation is as follows,

for the diffusion process from the inner radius a to the outer radius b of the vascular
wall, equation (1) is transformed by r → (b − a)r + a:

by substituting equations (2) and (3) into equation (1), we have

where D is the blood diffusion coefficient, r is the scaled radius of the vascular wall with 0 ≤ r ≤ 1,
C is the blood concentration, and t ≥ 0 is time.

In this model we assume that the blood is pumped through the body periodically, C̃(1 + e^{iωt}), with
angular velocity ω rad/s. Then we have the boundary condition for the inner vascular wall as

We assume that there is no blood leakage at the outer boundary of the vascular wall, so the
boundary condition for the outer part of the vascular wall is

To solve equation (1) numerically, we use a finite difference scheme and approximate the
derivatives to determine the values of the unknown function at grid points in its domain.

3. NUMERICAL SOLUTION

In this numerical simulation, we use hypothetical data to present the trend of the blood
concentration during the viremia phase (days 1-7), because it is difficult to obtain real data for this
phenomenon. Table 1 gives the parameter values used in the simulation.

Table 1 Parameter Values


We choose an artery radius of about 3 mm because, in general, artery diameters are 0.1-
10 mm. The artery wall thickness is taken as 0.58 mm [2,4]. A normal pulse rate for a
healthy adult is 60 to 100 beats per minute (BPM), corresponding to an angular velocity of 2π rad/s in resting
condition. The diffusion coefficient is assumed to be a constant 20 mm²/s to simplify the
model.
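Since the displayed model equations did not survive extraction above, the following is only a minimal Python sketch of the kind of finite difference (FTCS) scheme described in Section 2. It assumes the standard one-dimensional diffusion equation ∂C/∂t = (D/(b−a)²) ∂²C/∂r² on the rescaled radius r ∈ [0, 1], the periodic inner-wall condition taken as the real part of C̃(1 + e^{iωt}) with C̃ = 1, a zero-flux outer boundary, and the cytokine level K(t) entering as a source term; all of these modelling choices, and the grid sizes, are assumptions made for illustration only.

    import numpy as np

    # Illustrative parameters: D = 20 mm^2/s, wall thickness b - a = 0.58 mm.
    D, a, b = 20.0, 3.0, 3.58          # mm^2/s, mm, mm
    omega = 2 * np.pi                  # rad/s (60 BPM)
    nr, dt, t_end = 21, 1e-5, 1.0      # grid points, time step (s), end time (s)
    dr = 1.0 / (nr - 1)

    def K(t):                          # hypothetical cytokine level, e.g. 2e-4*t + 60 under infection
        return 0.0                     # normal condition (no infection)

    C = np.zeros(nr)                   # blood concentration on the scaled radius r in [0, 1]
    t = 0.0
    while t < t_end:
        C[0] = 1.0 + np.cos(omega * t)          # periodic inner-wall condition (assumed form)
        C[-1] = C[-2]                           # zero-flux outer boundary (no leakage)
        lap = (C[2:] - 2 * C[1:-1] + C[:-2]) / dr**2
        C[1:-1] += dt * (D / (b - a)**2 * lap + K(t))   # FTCS update; K(t) as a source (assumption)
        t += dt

    print(C)

The explicit FTCS update is only stable when dt ≤ dr²(b−a)²/(2D), which the grid above satisfies; a finer grid would require a smaller time step.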
The numerical result in Figure 2 shows the blood concentration over one period on the first day
under normal conditions (i.e., no infection by dengue virus). The blood concentration
varies periodically because the blood is pumped by the heart periodically. Moreover, Figure 2
shows the movement of the blood concentration from the inner wall, r = 0, to the outer wall, r = 1. It
also shows that the blood concentration decreases continuously with the radius. This simulation uses a
"zoomed-in" time scale in seconds to show the decrease of the blood concentration over the radius. At the
outer vascular wall the blood concentration is zero, which means there is no
plasma leakage. For the next simulation, in Figure 3, we fix the time and plot the dynamics
of the blood concentration over the radius in two dimensions.

Figure 2 Graph of blood concentration for K(t)=0

Figure 3 presents the comparison between the blood concentrations in the normal condition and
under infection. The first condition is represented by K(t) = 0; in the second, K(t) is nonzero. Figure 3a
(left) simulates the dengue infection before plasma leakage takes place: for K(t) = 0, on the
first, fourth, and seventh day there is no change in the blood concentration. On the contrary, in
Figure 3b (right), when K(t) = 2·10⁻⁴t + 60, on the first, fourth, and seventh day the blood
concentration changes along the radius of the vascular wall.

Figure 4 simulates the situation in which an increasing cytokine level increases the blood
concentration. It can be seen in Figure 4a, where K(t) = 2·10⁻⁴t + 60, that the blood concentration
increases by about 1% from the normal condition. In Figure 4b, where K(t) = 5·10⁻⁴t + 60, the blood
concentration increases by 20% or more from the normal condition, which may lead
to plasma leakage. The simulation confirms the medical information that one of the manifestations
indicating plasma leakage is an increase in the hematocrit level of up to 20% above the normal
condition [1].

Figure 3. Simulation of blood concentration movement for different K(t):
Figure 3a (left) K(t) = 0; Figure 3b (right) K(t) = 2·10⁻⁴t + 60.

Figure 4. Simulation of blood concentration movement for K(t) ≠ 0:
Figure 4a (left) K(t) = 2·10⁻⁴t + 60; Figure 4b (right) K(t) = 5·10⁻⁴t + 60.

4. CONCLUDING REMARK
In this paper we formulated a mathematical model to capture the plasma leakage phenomenon in
the vascular wall caused by dengue infection. Numerical solutions of the diffusion model were
obtained by the finite difference method. Numerical simulations of the model indicate the effect
of the cytokine for various parameter values. It is shown that a highly increased cytokine level
causes a high blood concentration in the vascular wall, which may contribute to plasma
leakage.

Acknowledgement. This research is funded by Riset Kelompok Keahlian ITB 2011
FMIPA PN-6-25-2011. The authors would like to thank Dr. Agus Yudi Gunawan for
discussions during the final project of Dinnar Rachmi Pasya.

References

[1] BAILEY, N.T.J., The Mathematical Theory of Infectious Diseases and its Application, Griffin, London, 1975.

[2] KURANE, I., Dengue Hemorrhagic Fever with Special Emphasis on Immunopathogenesis, National Institute
of Infectious Diseases, Tokyo, Japan, 2006.

[3] MAZUMDAR, J, An Introduction to Mathematical Physiology and Biology, Cambridge University Press 1999.

[4] NURAINI, N., TASMAN, H., SOEWONO, E., and KUNTJORO, A.S., A with-in host Dengue infection model with immune
response, Mathematical and Computer Modelling 49, pp. 1148-1155, 2009.

[5] NURAINI, N, SOEWONO, E, KUNTJORO, AS Mathematical Model of Dengue Internal Transmission Process,
Journal on Indonesian Mathematical Society (MIHMI) Vol 13, pp.123-132, 2007.

[6] NURAINI, N, ARI, Y, KUNTJORO, AS Model Matematik Penyebaran Internal Demam Berdarah dalam Tubuh
Manusia, Prosiding Konferensi Nasional Matematika XIII, UNNES. 2006.

[7] SUPRIATNA, A.K, NURAINI, N, SOEWONO, E, Mathematical Model of Dengue Transmission and control,
Dengue Virus: Detection, Diagnosis and Control. , Basak Ganim and Adam Reis, Nova Science
Publishers, New York pp.187 – 208, 2010.

NUNING NURAINI
Institut Teknologi Bandung.
e-mail: [email protected]

DINNAR RACHMI P ASYA


Institut Teknologi Bandung.

EDY SOEWONO
Institut Teknologi Bandung.
e-mail: [email protected]
Proceedings of ”The 6th SEAMS-UGM Conference 2011
Applied Mathematics, pp. 505 – 514.

THE SEQUENCES COMPARISON OF DNA H5N1 VIRUS ON


HUMAN AND AVIAN HOST USING TREE DIAGRAM
METHOD

SITI FAUZIYAH , M. ISA IRAWAN , MAYA SHOVITRI

Abstract. This research compares DNA sequences of the H5N1 virus and analyzes them with a tree
diagram method. We use the tree diagram method to analyze the similarity level of nucleotides. The tree
diagram method is one of several methods for aligning a pair of DNA sequences over their entire length.
The method uses the concept of a general tree data structure with post-order traversal. Furthermore, we
obtain the mutation level of the nucleotides. The scoring equation and its parameters are also determined.
Keywords and Phrases: Global Alignment, Pair of DNA Sequences, Tree Diagram Method.

1. INTRODUCTION

The comparison of existing sequences is a modern method for studying the
evolutionary interaction between genes. It is based on alignment: the process of arranging
two or more sequences to achieve the maximum level of identity (for evaluation purposes),
the degree of similarity and eventual homology. Sequence alignment is an important method
in DNA and protein analysis [6,3]. The growing number of new biological sequences is the basis of any
sequence analysis [4]. Bioinformatics is a collection of mathematical, statistical and
computational methods for analyzing biological sequences, that is, DNA, RNA and amino
acid (protein) sequences.
We compare DNA sequences of H5N1 to analyze their similarity level. H5N1
is an influenza A virus that has a segmented, single-stranded, negative-sense, linear RNA genome. Inside
the host cell, the viral RNA undergoes reverse transcription into an RNA-DNA hybrid and
eventually forms DNA. Furthermore, the viral DNA enters the nucleus of the host cell.
The viral DNA damages the host's DNA and forms mRNA (messenger RNA); the mRNA
is then translated to produce viral envelope proteins to form new viruses. The virus has a very high
mutation rate, so it can create different variations of the virus [5].
In previous research, the similarity level in the HA and NA protein segments was

determined by [1] using tools EMBOSS, this tools applied “needle” algorithm. This research
use tree diagram method to align a pair of DNA sequences [7]. This method consists of
three parts: (i) simple alignment algorithm, (ii) extension algorithm, (iii) Graphical Simple
Alignment tree (GSA tree). These theories will be explained as follows

1.1 Sequence Alignment of DNA. In this section we give an introduction to DNA
sequences and sequence alignment, which are used in the subsequent discussion.

1.1.1 DNA Sequences. DNA (deoxyribonucleic acid) sequences are associated with the four-
letter DNA alphabet {A, C, G, T}, where A, C, G and T stand for the nucleic acids or
nucleotides Adenine, Cytosine, Guanine and Thymine, respectively. Most DNA sequences
currently being studied come from DNA molecules found in chromosomes, which
are located in the nuclei of the cells of living organisms. More information about DNA can be
found in [4].
DNA sequences are strings of letters from the four-letter alphabet of nucleotides (A,
C, G, T). The length of a sequence is variable and not all sequences are of the same length.
Generally, we use the following description of DNA sequences:
    A = (a1 a2 … am),   B = (b1 b2 … bn),                                       (1)
where the capital letters A, B represent the sequences and ai, bi represent the basic
units of the sequences at position i, whose elements are taken from the set {A, C, G, T}. For
instance, a DNA sequence can undergo substitutions (change of one nucleotide to another),
insertions and deletions (gain or loss of one or more nucleotides), and therefore algorithms
should include the possibility of gaps [9]. Some biological assumptions about the
beginning and the end of a sequence are useful for the development of the algorithm [2].

1.1.2 Sequence Alignment. An alignment between two sequences is simply a pairwise match
between the characters of each sequence. A true alignment of DNA sequences is one that
reflects the evolutionary relationship between two or more homologs (sequences that share a
common ancestor).
Given two sequences (X and Y) as in equation (1) with lengths m and n, respectively, let
c be the total length of the alignment; then
    max(n, m) ≤ c ≤ n + m.
The alignment is represented by a matrix M(X, Y) of size 2 × c (2 rows and c columns). The
rows are the sequences and the columns are matches, mismatches, insertions and deletions
(the latter two are called "indels") [2].
Example: Given two sequences X = GAATTAGTTA and Y = GGATCGA, with lengths
m = 10 and n = 7, respectively, one possible arrangement is
                G A A T T − A G T T A
    M(X, Y) =
                G G A − T C − G − − A
Since there are different ways of arranging the sequences and of placing mismatches and
indels, we are interested in the best possible arrangement. Furthermore, the "best"
alignment depends on how matches, mismatches and gaps are scored. Since we are
interested in the similarity of two sequences, we reward a match and penalize a
mismatch or gap. Thus, the first step is to define an appropriate scoring equation in order to
quantify the sequence alignments.
A scoring equation can be designed to quantify the edit distance (mutations,
insertions and deletions):

    Alignment score:  S = pα − qβ − r(o + ke),                                   (2)

where α is the score of each match, β is the penalty of each mismatch and (o + ke) is the penalty of each
gap. The parameters p, q and r denote the total number of matches, the total number of
mismatches and the total number of gaps, respectively [7].
For nucleotide sequences, sequence similarity and sequence identity are
synonymous [10]. So the similarity or identity level for nucleotide sequences can be computed by
    S = ( 2 Ls / (La + Lb) ) × 100,                                              (3)
where S is the percentage sequence similarity, Ls is the number of aligned residues with
similar characteristics, and La and Lb are the total lengths of each individual sequence in the
alignment.
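A small Python sketch of equations (2) and (3) follows. How the gap total r and the gap length k enter the gap term of (2) is interpreted here as a sum over gap runs, and the example counts for the alignment M(X, Y) above are worked out by hand; both are assumptions made for illustration, not part of the paper's algorithm.

    # Sketch of the scoring equation (2) and similarity percentage (3);
    # alpha, beta, o, e are the values quoted in the text (5, 4, 10, 0.5).
    def alignment_score(p, q, gap_runs, alpha=5, beta=4, o=10, e=0.5):
        """S = p*alpha - q*beta - sum over gap runs of (o + k*e), k = run length (assumed reading)."""
        return p * alpha - q * beta - sum(o + k * e for k in gap_runs)

    def similarity_percent(Ls, La, Lb):
        """Equation (3): percentage similarity of an alignment."""
        return 100.0 * 2 * Ls / (La + Lb)

    # Example M(X, Y): 5 matches, 1 mismatch, gap runs of lengths 1, 1, 1, 2.
    print(alignment_score(p=5, q=1, gap_runs=[1, 1, 1, 2]))
    print(similarity_percent(Ls=5, La=10, Lb=7))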

1.2 Tree Diagram Method. In this section we discuss the tree diagram method and the
theory related to the problem discussed in this research. The tree diagram method is a
method to align a pair of DNA sequences over their entire length (global alignment). The method
consists of three parts: (1) the improved simple alignment algorithm, (2) the extension algorithm, and
(3) the GSA tree [7].

1.2.1 Improved Simple Alignment Algorithm. Given two sequences X = {x1, x2, ..., xm} and
Y = {y1, y2, ..., yn}, where m and n denote the lengths of X and Y, respectively, the
improved simple alignment algorithm can be defined as the following steps for aligning X and Y.
In the initial position the first base (x1) of X overlaps with the first base (y1) of Y. Then the
sliding process is carried out to the left and to the right, respectively. These steps are given by [7] as
follows:
(1) Initial position
x1 x2 x3 … xm
y1 y2 y3 … ym … yn
(2) Each time, X moves one base position in the right direction

x1 x2 x3 … xm
y1 y2 y3 … ym … yn
(3) Each time, X moves one base position in the left direction

x1 x2 x3 … xm

y1 y2 y3 … ym … yn

Every alignment has a score S according to the scoring equation (2). In steps (2) and (3) we
also define scores S′i and S′j, respectively, given by

S′i = αp − βq − r(o + ke) and S′j = αp − βq − r(o + ke), used as stopping conditions in steps
(2) and (3). Here, we choose typical parameter values α = 5, β = 4, o = 10 and
e = 0.5. We take these parameters from the elements of the DNAFULL matrix (i.e., α, β) and
from one standard combination in the EMBOSS tools (i.e., o and e). A more complete
explanation of this algorithm can be found in [7]. From this algorithm we choose the best
alignment (R), determined by the maximum alignment score and the longest common substring, at each
step.
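The following is a hedged Python sketch of the sliding step of this algorithm: X is slid base by base against Y and each ungapped overlap is scored with the match/mismatch part of equation (2); the best-scoring offset is kept. Details such as the exact stopping rules S′i and S′j are simplified here, so this is an illustration of the idea rather than the full procedure of [7].

    def overlap_score(X, Y, shift, alpha=5, beta=4):
        """Score of the ungapped overlap obtained by shifting X by 'shift' along Y."""
        p = q = 0
        for i, x in enumerate(X):
            j = i + shift
            if 0 <= j < len(Y):
                if x == Y[j]:
                    p += 1
                else:
                    q += 1
        return p * alpha - q * beta          # no gaps inside an ungapped overlap

    def best_simple_alignment(X, Y):
        shifts = range(-(len(X) - 1), len(Y))      # slide left and right of the initial position
        return max(shifts, key=lambda s: overlap_score(X, Y, s))

    X, Y = "GAATTAGTTA", "GGATCGA"
    s = best_simple_alignment(X, Y)
    print(s, overlap_score(X, Y, s))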

1.2.2 Extension Algorithm. This algorithm is used to protect common substrings of R longer than C
from being split by C.
Let C be the set of longest common substrings in R, so that C = ∅ or C = {C1, C2, ..., Cm}. If C = ∅, then
no longest common substring in C can be extended. If C = {C1, C2, ..., Cm}, there are m
longest common substrings, where |C1| = |C2| = ... = |Cm| = k, with k denoting the number of
matches within a longest common substring. The following are the steps of the extension algorithm for
finding the longest common substrings [7]:
(1) Let K be the length of the longest common substring of X and Y.
(2) When k = K, none of the longest common substrings Ci of R can be extended into a
longer common substring.
(3) When k < K, there exists at least one common substring longer than Ci. There
are several sub-steps to find the longer common substrings, as follows:
(a) Let LL be the number of mismatches from the right end of C i-1 to the left
end of Ci. When i = 1, LL denotes the number of mismatches from the left
end of X and Y to the left end of C1. Similarly, let LR denotes the number of
mismatches from the right end of Ci to the left of Ci+1. When i= 1, LR
denotes the number of mismatches from the right end of Ci to the right end
of X and Y.
(b) When K < LL, the K mismatches are extracted from the left of Ci.
Otherwise, the LL mismatches are extracted from the left of Ci. Similarly,
when K < LR, the K mismatches are extracted from the right of C i.
Otherwise, the LR mismatches are extracted from the right of Ci. Then the
sequences extracted from left of Ci , Ci and from the right of Ci are
connected into two new sub-sequences Si1 and Si2.
(c) Apply the simple alignment algorithm to Si1 and Si2. If there exists a new
common substring within Si1 and Si2 longer than Ci, we have several choices,
i.e., a new longer common substring or Ci. If there is an increment of the score
when the new longer common substring comes into being, we replace
the original Ci with the new substring, also called Ci.
(4) For every Ci of R, the original Ci is replaced by a new longest common
substring if such a new substring exists.
The output of this algorithm is the data C and U from R (or the repaired R), which have
been extended if an extension occurred.

1.2.3 Graphical Simple Alignment Tree. This algorithm is used to explore the appropriate gaps in
U1j. It has the following steps [7].

(1) Compute the scores of all simple alignments of U1j by the simple alignment
algorithm. A good simple alignment R1j of U1j is generated when its score is
maximal.
(2) If there is an increment of the score due to appropriate gaps within U1j, U1j can be
further divided into second-level sub-alignments. When and how should the gaps
be added to the sequences? Let C2i be the set of longest common substrings of R1j, so that C2i = ∅
or C2i = {C2i,1, C2i,2, ..., C2i,m}. Then there are two sub-steps, as follows:
(a) If C2i = {C2i,1, C2i,2, ..., C2i,m}, there are m longest common substrings. Then U1j
can be further divided into second-level sub-alignments by C2i. Let U2j be
the set of substrings spaced by C2i, where U2j = {U2j,1, U2j,2, ..., U2j,n}.
Then every substring C2i,k (k = 1, 2, ..., m) of C2i becomes a leaf node in the GSA
tree; there are no gaps within them. For every substring U2j,k (k = 1, 2, ..., n) of
U2j, the operation flow goes back to step (1), and the level of sub-alignment
advances to the next one.
(b) If C2i = ∅, there is no longest common substring in R1j. Then U1j cannot
be further broken down. The good simple alignment R1j of U1j becomes a leaf
node in the GSA tree. The two sequences of R1j might be entirely overlapping,
partially overlapping, or one sequence might be aligned entirely internally to the
other. When the two sequences of R1j are entirely overlapping, there are no gaps
within R1j. Otherwise, the hanging ends of the overlap become the gaps
of R1j. The relative positions of these gaps are fixed, and they become gaps within the
final global alignment.

The above steps are repeated until all U of the last-level sub-alignment cannot be
further decomposed by the improved simple alignment algorithm and the extension
algorithm. Then we obtain a Graphical Simple Alignment tree (GSA tree) for the strings X
and Y, consisting of a series of substrings.

1.2.4 General Tree Using Post-order Traversal. The global alignment in this method is formed
by the GSA tree. In this case, the GSA tree is a general tree, i.e., a tree structure whose nodes may have
any number of children. An example of a general tree is shown in Figure 1.

Figure 1. The general tree

The general tree in Figure 1 is constructed from two sequences X and Y. The root contains the best
alignment R obtained from the improved simple alignment algorithm and the extension algorithm. From
R, two kinds of children are generated at each level, namely C and U: C is a longest common substring of R
and U consists of the substrings spaced by C. There are two types of nodes: inner nodes and leaf nodes. The
global alignment of the strings X and Y is formed by all leaf nodes. To obtain the global alignment, the GSA tree
is traversed by a post-order traversal: a post-order traversal of a general tree performs a
post-order traversal of the root's sub-trees from left to right, then visits the root [8]. All
inner nodes are then deleted from the result of the post-order traversal [7].
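A small Python sketch of reading the global alignment off a GSA tree by post-order traversal follows: sub-trees are visited left to right, then the root, and only the leaf nodes are kept. The Node class and the toy tree (loosely modeled on the shape of Figure 1) are illustrative, not taken from [7] or [8].

    class Node:
        def __init__(self, label, children=None):
            self.label = label
            self.children = children or []

    def postorder_leaves(node):
        """Return the labels of the leaf nodes in post-order."""
        if not node.children:                      # a leaf node contributes to the alignment
            return [node.label]
        leaves = []
        for child in node.children:                # visit sub-trees from left to right
            leaves.extend(postorder_leaves(child))
        return leaves                              # inner nodes are dropped

    # Toy tree: root (X,Y) with children C11, U21, C31, where the inner node U21
    # has children U12, C22, U32 (an assumed shape for illustration).
    root = Node("X,Y", [Node("C11"),
                        Node("U21", [Node("U12"), Node("C22"), Node("U32")]),
                        Node("C31")])
    print(postorder_leaves(root))   # ['C11', 'U12', 'C22', 'U32', 'C31']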

2. GLOBAL ALIGNMENT OF H5N1 VIRUS DNA AND ITS ANALYSIS

In this section, we discuss the analysis of DNA sequences of the H5N1 virus using
the tree diagram method. The method is described in Figure 2.

Figure 2. Construction of the global alignment using the tree diagram method:
two input sequences are processed by the improved simple alignment algorithm,
the extension algorithm (producing C and U), and the GSA tree, giving as output
the alignment length, similarity, gaps and score.

As shown in Figure 2, after we input a pair of DNA sequences of the H5N1 virus, the
alignment of the two sequences is processed in three parts, i.e., the improved simple alignment,
the extension algorithm and the GSA tree. The scoring equation is S = pα − qβ − r(o + ke), and we choose the
parameters α = 5, β = 4 (from the DNAFULL matrix) and o = 10, e = 0.5 (the default gap-open
and gap-extension penalties in the EMBOSS tools). Then we obtain the percentage of similarity and of gaps to
analyze the mutation level of the nucleotides.
We obtained the DNA sequences for this research from the GenBank database. The data consist of 2
sequences from human hosts and 4 sequences from avian hosts. The alignment results obtained with
this method are given in Table 1.

Table 1. The alignment results for H5N1 virus DNA sequences on the HA segment using the tree diagram
method

No.  Pair of sequences          Alignment length   Similarity       Gaps
1    CY088769 and HQ200596      1774               1583 (89.2%)     68 (3.8%)
2    CY088769 and CY091956      1746               1590 (91.1%)     39 (2.2%)
3    CY088769 and HM172081      1709               1554 (90.9%)      7 (0.4%)
4    CY088769 and AB569353      1745               1560 (89.4%)     41 (2.3%)
5    CY088769 and AB629698      1749               1562 (89.3%)     49 (2.8%)
6    HQ200596 and CY091956      1775               1631 (91.9%)     31 (1.7%)
7    HQ200596 and HM172081      1773               1550 (87.4%)     69 (3.9%)
8    HQ200596 and AB569353      1774               1606 (90.5%)     33 (1.9%)
9    HQ200596 and AB629698      1774               1601 (90.2%)     33 (1.9%)
10   CY091956 and HM172081      1747               1566 (89.6%)     44 (2.5%)
11   CY091956 and AB569353      1747               1601 (91.6%)      6 (0.3%)
12   CY091956 and AB629698      1747               1600 (91.6%)      6 (0.3%)
13   HM172081 and AB569353      1746               1542 (88.3%)     46 (2.6%)
14   HM172081 and AB629698      1745               1541 (88.3%)     44 (2.5%)
15   AB569353 and AB629698      1742               1714 (98.4%)      0 (0.0%)

From Table 1, we obtain the similarity level within hosts (human-human and avian-
avian) and between hosts (human-avian). The average similarity level for human-human pairs is 89.2%,
for avian-avian pairs 91.3%, and for human-avian pairs 90.1%. Furthermore, we compute the mutation level of
the nucleotides using the similarity and gap information in Table 1. We obtain a mutation level of 7% for
human-human, 7.3% for avian-avian and 7.8% for human-avian pairs. These results are the same
as the alignment results from the EMBOSS tools (a tool for pairwise alignment using the "needle"
algorithm).
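The group-level numbers above can be reproduced from Table 1 with a few lines of Python. The grouping of accessions into human hosts (CY088769, HQ200596) and avian hosts, and the reading "mutation level = 100% − similarity − gaps", are inferences that reproduce the quoted values (7%, 7.3%, 7.8%); they are not formulas stated explicitly in the text.

    human = {"CY088769", "HQ200596"}
    rows = [  # (seq1, seq2, similarity %, gaps %) from Table 1
        ("CY088769", "HQ200596", 89.2, 3.8), ("CY088769", "CY091956", 91.1, 2.2),
        ("CY088769", "HM172081", 90.9, 0.4), ("CY088769", "AB569353", 89.4, 2.3),
        ("CY088769", "AB629698", 89.3, 2.8), ("HQ200596", "CY091956", 91.9, 1.7),
        ("HQ200596", "HM172081", 87.4, 3.9), ("HQ200596", "AB569353", 90.5, 1.9),
        ("HQ200596", "AB629698", 90.2, 1.9), ("CY091956", "HM172081", 89.6, 2.5),
        ("CY091956", "AB569353", 91.6, 0.3), ("CY091956", "AB629698", 91.6, 0.3),
        ("HM172081", "AB569353", 88.3, 2.6), ("HM172081", "AB629698", 88.3, 2.5),
        ("AB569353", "AB629698", 98.4, 0.0),
    ]

    def group(a, b):
        n = (a in human) + (b in human)
        return {2: "human-human", 1: "human-avian", 0: "avian-avian"}[n]

    for label in ("human-human", "avian-avian", "human-avian"):
        sel = [(s, g) for a, b, s, g in rows if group(a, b) == label]
        sim = sum(s for s, _ in sel) / len(sel)
        gap = sum(g for _, g in sel) / len(sel)
        print(label, round(sim, 1), round(100 - sim - gap, 1))  # average similarity, mutation level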

3. CONCLUDING REMARK

According to these results, we can conclude that the tree diagram method is sufficient to
align the sequences by applying the concept of a tree data structure that contains the simple
alignment algorithm, the extension algorithm and the GSA tree. This method can improve the
alignment of two DNA sequences by exploring the appropriate gaps gradually, based on the
simple alignment algorithm and the extension algorithm. Validation against the EMBOSS tools
shows that the optimal alignment is generated with match score, mismatch penalty,
gap-open penalty and gap-extension penalty α = 5, β = 4, o = 10 and e = 0.5, respectively.
Furthermore, based on the similarity levels of the H5N1 virus DNA sequences on the
HA segment within and between hosts, we conclude that, biologically, this is an indication
that they are different species. The mutation level results show that the mutation level of
this virus is high, the highest being the mutation level between human and avian hosts, 7.8%.

Acknowledgement. I wish to express my gratitude to the Ministry of Religious Affairs of the
Republic of Indonesia, which has granted me a scholarship for the master's program at
Institut Teknologi Sepuluh Nopember (ITS) Surabaya.

References
[1] CHEN, G.W, CHANG, S.C, MOK, C.K, LO, Y.L, KUNG, Y.N, HUANG, J.H, SHIH, Y.H, WANG, J.Y, CHIANG, C,
CHEN, C.J, SHIH, S.R., Genomic signature of Human versus Avian Influenza A viruses. Emerging
infectious Diseases. www.cdc. Vol. 12, no.9, September 2006.
[2] ESCARINO, CLAUDIA-RANGEL: A two-base encoded DNA sequences alignment problem in computational
biology. Math-In-Industry Project, National Institute Of Genomic Medicine, Mexico, 2009.
[3] I. EIDHAMMER. Protein Bioinformatics: an algorithmic to sequences and structure analysis’ John Wiley &
Sons, Ltd ISBN: 0-470-84839-1. 2004
[4] ISAEV, ALEXANDER: Introduction to Mathematical Methods in Bioinformatics, Springer-Verlag Berlin
Heidelberg, Germany, 2004,.
[5] PETERSON A. TOWNSEND, SARAH E. BUSH, ERICA SPACKMAN, DAVID E. SWAYNE, AND HON S: Influenza
A Virus Infections in Land Birds, People’s Republic of China”, Emerging Infectious Diseases •
www.cdc.gov/eid • Vol. 14, No. 10, 2008.
[6] PEVSNER, JONATHAN. Bioinformatics and Functional Genomics . Department of Neurology, Kennedy
Krieger Institut & department of neuroscience and division of health Sciences informatics, The John
Hopkins School of Medicine, Baltimore: Maryland, 2009.
[7] QI, Z.H, QI, X.Q., New method for alignment 2 DNA sequences by tree data structure. Journal of
theoretical Biology 263, 227-236, 2009.
[8] SHAFFER, A. Clifford: A Practical Introduction to Data Structures and Algorithm Analysis Edition 3.2
(C++ Version), Department of Computer Science Virginia Tech Blacksburg, VA 24061, 2011.
[9] SHEN, S. and TUSZYNSKI, J.A., Theory and Mathematical Methods for Bioinformatics, Springer-
Verlag, San Francisco, 2008.
[10] XIONG, JIN: Essential Bioinformatics, CAMBRIDGE University Press, United States Of America, 2006.

SITI FAUZIYAH
Graduate Student of Mathematics Department at Institut Teknologi Sepuluh Nopember (ITS)
Surabaya.
e-mail: [email protected]

M. ISA IRAWAN:
supervisor, lecturer of Mathematics Department at Institut Teknologi Sepuluh Nopember
(ITS) Surabaya.
e-mail: [email protected]

MAYA SHOVITRI
co. supervisor, lecturer of Biology Department at Institut Teknologi Sepuluh Nopember (ITS)
Surabaya.
e-mail: [email protected]
Proceedings of ”The 6th SEAMS-UGM Conference 2011”
Applied Mathematics, pp. 515 – 528.

REGULA FUZZY CONTROLLER DESIGN ON MODEL OF


MOTION SYSTEM OF THE SATELLITE BASED ON LINEAR
MATRIX INEQUALITY

SOLIKHATUN AND SALMAH

Abstract. In this paper, a linear state feedback controller for a fuzzy model is designed by using
linear matrix inequalities (LMIs). The fuzzy model is described as a weighted sum of r
subsystems. The controller design guarantees the stability of the system and satisfies the desired
transient responses of the system. If r approaches infinity, the existence of a controller that stabilizes
the system becomes difficult to establish, because one must find a solution of a set of LMIs satisfying
several conditions. By relaxing the stability conditions we formulate the controller
design problem as an LMI feasibility problem.
Keywords and Phrases: fuzzy control, feedback, linear matrix inequalities, Lyapunov stability,
pole placement, Takagi-Sugeno model, relaxed stability conditions.

1. INTRODUCTION

In recent years, there have been many research efforts on these issues based on
Takagi-Sugeno (TS) model based fuzzy control. For this TS model based fuzzy control
system, Wang et al. [14] proved stability by finding a common symmetric positive
definite matrix P for the r subsystems in general and suggested the idea of using Linear
Matrix Inequalities (LMIs). The controller design involves an iterative process,
that is, for each rule a controller is designed based on consideration of local performance
only, and then LMI-based stability analysis is carried out to check the global stability condition.
In case the stability conditions are not satisfied, the controller for each rule should be
redesigned.
Olsder [9] and Ogata [8] presented the basic theories of systems and control.
Boyd et al. [1] discussed linear matrix inequalities in system and control theory.
Lam et al. [4] designed controllers for fuzzy systems by a linear matrix inequality
approach. Mastorakis [5] discussed the modeling of dynamic systems by TS fuzzy
models. Messousi et al. [6] discussed pole placement for fuzzy models by an

LMI approach. Tanaka and Sugeno [12] analyzed the stability of and designed fuzzy
control systems.
Motivated by the LMI formulation of the pole placement constraint of conventional
state feedback in Chilali [2] and Hong and Nam [3], Solikhatun [11] modified the
formulation and applied it to the multi-objective TS model based fuzzy logic controller design
problem. The design of the fuzzy controller system that simultaneously guarantees
global stability and adequate transient behavior (pre-specified transient performance) is
formulated. Tanaka and Wang [13] noted that if r approaches infinity then the existence
of a controller that stabilizes the system becomes difficult to establish, because one must find the
solution of a set of LMIs satisfying several conditions. By relaxing the stability
conditions we formulate the controller design problem as an LMI feasibility problem.
Polderman [10] presented the dynamics of the satellite motion system and the
factors that affect it, and derived the equations of the satellite motion dynamics without
disturbance. Finally, we present simulation results by applying the proposed methodology
to a model of the motion system of the satellite. The following definitions, lemmas and theorems
are used in the main results.
Definition 1. The matrix P ∈ R^{n×n} is called a positive definite matrix if uᵀPu > 0 for all nonzero u ∈ Rⁿ.
Lemma 2. Suppose Q, R ∈ R^{n×n} are symmetric matrices and S ∈ R^{n×n}. The condition
    [ Q    S ]
    [ Sᵀ   R ]  > 0
is equivalent to
    R > 0,   Q − S R⁻¹ Sᵀ > 0.
Lemma 2 is known as the Schur complement.
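Lemma 2 can be illustrated numerically. The Python sketch below builds a random symmetric positive definite block matrix and checks both sides of the equivalence; the construction and sizes are arbitrary and serve only as an example.

    import numpy as np

    rng = np.random.default_rng(0)
    n = 3
    M = rng.standard_normal((2 * n, 2 * n))
    M = M @ M.T + 2 * n * np.eye(2 * n)          # symmetric positive definite by construction
    Q, S, R = M[:n, :n], M[:n, n:], M[n:, n:]

    def is_pos_def(X):
        return np.all(np.linalg.eigvalsh(X) > 0)

    print(is_pos_def(M))                                                # block matrix > 0
    print(is_pos_def(R), is_pos_def(Q - S @ np.linalg.inv(R) @ S.T))    # R > 0 and Q - S R^{-1} S^T > 0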
Definition 3. Consider the linear system ẋ = Ax, A ∈ R^{n×n}, x ∈ Rⁿ, x(0) = x₀.
An equilibrium point x̄ is stable if for all ε > 0 there exists δ = δ(ε) > 0 such that for each solution
x(t, x₀):
    if ‖x₀ − x̄‖ < δ then ‖x(t, x₀) − x̄‖ < ε for all t ≥ t₀.
An equilibrium point x̄ is asymptotically stable if x̄ is stable and there exists δ₀ > 0 such that for each
solution x(t, x₀):
    if ‖x₀ − x̄‖ < δ₀ then lim_{t→∞} ‖x(t, x₀) − x̄‖ = 0.
The stability of linear system can be formulated by LMI. It is known by Lyapunov
theorem about stability. In stabilization, we need a controller such that the state feedback
fuzzy control system is asymptotically stable, i.e.
lim x(t )  0
t 

for initial condition x(0) and x  0 .


Theorem 4. Consider the linear system ẋ = Ax, A ∈ R^{n×n}, x ∈ R^n. The matrix A is stable if there exists a positive definite matrix P such that
A^T P + P A < 0.
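Theorem 4 turns the stability of ẋ = Ax into an LMI feasibility test that can be solved with off-the-shelf semidefinite programming tools. The following is a minimal sketch, assuming the CVXPY package is available; the matrix A and the margin eps are illustrative choices, not part of the paper.

import numpy as np
import cvxpy as cp

A = np.array([[0.0, 1.0],
              [-2.0, -3.0]])          # an illustrative Hurwitz matrix
n = A.shape[0]

P = cp.Variable((n, n), symmetric=True)
eps = 1e-6
constraints = [P >> eps * np.eye(n),                    # P > 0
               A.T @ P + P @ A << -eps * np.eye(n)]     # A^T P + P A < 0
prob = cp.Problem(cp.Minimize(0), constraints)
prob.solve(solver=cp.SCS)
print(prob.status)   # "optimal" indicates the LMI is feasible, i.e. A is stable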

Theorem 5. Consider an invariant set S ⊆ R^n and let x̄ = 0 be an equilibrium point of the system ẋ = Ax, A ∈ R^{n×n}, x ∈ R^n. If V : S → R is a continuously differentiable Lyapunov function, then
1. V(0) = 0 and V(x) > 0 for all x ∈ S \ {0};
2. if V̇(x) ≤ 0 for all x ∈ S, then x̄ = 0 is stable;
3. if V̇(x) < 0 for all x ∈ S \ {0}, then x̄ = 0 is asymptotically stable.
Definition 6. Consider a set X and a subset M ⊆ X. The membership function μ_M is defined as
μ_M : X → [0, 1],
which assigns to each element x ∈ X a real number μ_M(x) in [0, 1]; μ_M(x) represents the membership grade of x in M. The fuzzy set M ⊆ X is defined by {(x, μ_M(x)) : x ∈ X}.
Definition 7. A subset D of the complex plane is called an LMI-D region if there exist a symmetric matrix α = [α_kl] ∈ R^{m×m} and a matrix β = [β_kl] ∈ R^{m×m} such that
D = { z ∈ C : f_D(z) < 0 },
where the characteristic function f_D is given by
f_D(z) = [ α_kl + β_kl z + β_lk z̄ ]_{1≤k,l≤m}.
Example 8. Consider the circular LMI region
D = { x + iy ∈ C : (x + q)² + y² < r² },
centered at (−q, 0) with radius r > 0, whose characteristic function is given by
f_D(z) = [ −r , z + q ; z̄ + q , −r ].
As shown in Figure 2, we can place the poles in the region D so that the desired transient response of the system is achieved.
Fig. 2. Circular region D for pole placement

Definition 9. Consider a circular region D in the left half of the complex plane. The system ẋ = Ax, x ∈ R^n, A ∈ R^{n×n}, is said to be D-stable if all of its poles lie in the LMI-D region.

1.1 Affine Fuzzy Model. The problem of LMI-based fuzzy state feedback controller design becomes yet more complex if some of the model parameters are unknown. By using a Takagi-Sugeno (TS) fuzzy model, a nonlinear model can be expressed as a weighted sum of r simple subsystems. The inference performed via the Takagi-Sugeno model is an interpolation of all the relevant linear models. Takagi and Sugeno define the inference in the rule base as the weighted average of each rule's consequents:
ẋ(t) = ( Σ_{i=1}^{r} ω_i ( A_i x(t) + B_i u(t) + d_i ) ) / ( Σ_{i=1}^{r} ω_i ).   (1)
The TS fuzzy model consists of an if-then rule base. The rule antecedents partition a subset of the model variables into fuzzy sets. The consequent of each rule is a simple functional expression. The i-th rule of the Takagi-Sugeno fuzzy model is of the following form:
If x_1(t) is L_{i1} and ... and x_n(t) is L_{in} and u(t) is M_i
then ẋ(t) = A_i x(t) + B_i u(t) + d_i,
where i = 1, 2, ..., r, r is the number of rules, and L_{ij}, j = 1, 2, ..., n, and M_i are fuzzy sets centered at the i-th operating point. The categories of the fuzzy sets are expressed as N_Left, Z_Equal and P_Right, where N_Left represents negative, Z_Equal zero and P_Right positive (Figure 1).
Fig. 1. Membership functions N_Left, Z_Equal and P_Right over the normalized range [−1, 1].

The truth value ω_i of the i-th rule is obtained as the product of the membership function grades:
ω_i(x, u) = μ_{L_{i1}}(x_1) ··· μ_{L_{in}}(x_n) · μ_{M_i}(u),
where μ_{L_{ij}}(x_j) is the membership grade of x_j in L_{ij}. Consider the linearized state space form with the bias term d induced by the model linearization:
ẋ(t) = A x(t) + B u(t) + d,   (2)
with x = (x_1, x_2, ..., x_n) ∈ R^n, A ∈ R^{n×n}, B ∈ R^{n×m}, u ∈ R^m and d ∈ R^n. The model (2) is known as an affine model. When d = 0, this model is called a linear model.
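The weighted-average inference in (1) is straightforward to implement once the membership grades are available. Below is a minimal sketch of the blending step, assuming triangular membership functions like those in Figure 1; the function names, shapes and the two local models are illustrative placeholders only, not data from the paper.

import numpy as np

def tri(x, center, width=1.0):
    # Triangular membership function centered at `center` (illustrative shape).
    return max(0.0, 1.0 - abs(x - center) / width)

def ts_blend(x, u, models, centers):
    # models: list of (A_i, B_i, d_i); centers: operating points of the rules.
    w = np.array([tri(x[0], c) for c in centers])      # rule firing strengths omega_i
    h = w / w.sum()                                    # normalized weights h_i, summing to 1
    xdot = sum(h_i * (A @ x + B @ u + d)
               for h_i, (A, B, d) in zip(h, models))   # weighted sum as in (1)
    return xdot

# Two illustrative first-order local models.
models = [(np.array([[-1.0]]), np.array([[1.0]]), np.array([0.0])),
          (np.array([[-3.0]]), np.array([[1.0]]), np.array([0.5]))]
print(ts_blend(np.array([0.4]), np.array([0.2]), models, centers=[0.0, 1.0]))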

1.2 Design Controller. Linear control theory can be used to design the consequent parts of the fuzzy control rules because they are described by linear state equations. Suppose that the control input is taken as
u_i(t) = ū_i(t) + k_{0i}
in order to cancel the bias term d_i. Then the Takagi-Sugeno fuzzy model is described by
ẋ(t) = A_i x(t) + B_i u_i(t),   i = 1, 2, ..., r.
Hence, the state feedback controller is described by
u_i(t) = K_i x(t),
where K_i ∈ R^n is the vector of feedback gains to be chosen for the i-th operating point. Therefore the set of r control rules takes the following form:
If x_1(t) is L_{i1} and ... and x_n(t) is L_{in} and u(t) is M_i
then u_i(t+1) = K_i x(t) + k_{0i},
where the index t+1 in the consequent part is introduced to distinguish the control action from the previous control action in the antecedent part, in order to avoid algebraic loops.
The resulting total control action is
u = ( Σ_{i=1}^{r} ω_i ( K_i x + k_{0i} ) ) / ( Σ_{i=1}^{r} ω_i ).   (3)
Substituting (3) into (1), the state feedback fuzzy control system can be represented by
ẋ(t) = ( Σ_{i=1}^{r} Σ_{j=1}^{r} ω_i ω_j ( A_i + B_i K_j ) x ) / ( Σ_{i=1}^{r} Σ_{j=1}^{r} ω_i ω_j ).   (4)
Define h_i = ω_i(z) / Σ_{i=1}^{r} ω_i(z); then Σ_{i=1}^{r} h_i(z) = 1. The system (4) can be written as
ẋ(t) = Σ_{i=1}^{r} h_i² G_{ii} x + 2 Σ_{i<j} h_i h_j G_{ij} x,
where G_{ii} = A_i + B_i K_i, i = 1, 2, ..., r, and G_{ij} = ( (A_i + B_i K_j) + (A_j + B_j K_i) ) / 2, i < j ≤ r.
Corollary 10. Σ_{i=1}^{r} h_i(z)² − (1/(r−1)) Σ_{i<j} 2 h_i(z) h_j(z) ≥ 0, where Σ_{i=1}^{r} h_i(z) = 1 and h_i(z) ≥ 0 for all i.
Proof. It holds since
Σ_{i=1}^{r} h_i(z)² − (1/(r−1)) Σ_{i<j} 2 h_i(z) h_j(z) = (1/(r−1)) Σ_{i<j} ( h_i(z) − h_j(z) )² ≥ 0. ■
Corollary 11. If the number of rules that fire for all t is less than or equal to s, where 1 < s ≤ r, then
Σ_{i=1}^{r} h_i(z)² − (1/(s−1)) Σ_{i<j} 2 h_i(z) h_j(z) ≥ 0,
where Σ_{i=1}^{r} h_i(z) = 1 and h_i(z) ≥ 0 for all i.
Proof. It follows directly from Corollary 10. ■

LMI Formulation for Stability Requirement

The stability of the feedback system can be formulated as an LMI according to Theorem 4. A sufficient quadratic stability condition, derived by Tanaka [12] for ensuring stability of (4), is given by Theorem 12 as follows:
Theorem 12. The fuzzy control system (4) is asymptotically stable for feedback gains K_j if there exists a common positive definite matrix P such that
( A_i + B_i K_j )^T P + P ( A_i + B_i K_j ) < 0,   i, j = 1, 2, ..., r.   (5)
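For given local models and gains, condition (5) is a set of r² LMIs in the single unknown P, so checking it is again a plain feasibility problem. The following is a minimal sketch, assuming CVXPY is available; the matrices A_i, B_i and gains K_j are illustrative placeholders, not data from the paper.

import numpy as np
import cvxpy as cp

# Illustrative local models and gains (placeholders).
A = [np.array([[0., 1.], [-1., -1.]]), np.array([[0., 1.], [-2., -1.]])]
B = [np.array([[0.], [1.]]),           np.array([[0.], [1.]])]
K = [np.array([[-1., -2.]]),           np.array([[-1., -3.]])]

n, r = 2, 2
P = cp.Variable((n, n), symmetric=True)
eps = 1e-6
cons = [P >> eps * np.eye(n)]
for i in range(r):
    for j in range(r):
        G = A[i] + B[i] @ K[j]                              # closed-loop matrix of rule pair (i, j)
        cons.append(G.T @ P + P @ G << -eps * np.eye(n))    # condition (5)
prob = cp.Problem(cp.Minimize(0), cons)
prob.solve(solver=cp.SCS)
print(prob.status)   # "optimal" means a common P exists for these gains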

LMI Formulation for Pole Placement Requirement

In the synthesis of a control system, meeting desired performance specifications should be considered along with stability. Generally, a stability condition does not directly deal with the transient response of the closed loop system. In contrast, a satisfactory transient response can be guaranteed by confining the closed-loop poles to a prescribed region. To this purpose, we introduce the following LMI-based representation of the stability region. The pole placement problem is to design a controller such that the poles of the state feedback fuzzy control system are located in a prescribed subregion D of the left half plane, to prevent too fast controller dynamics and to achieve the desired transient behavior, i.e.
σ( A_i + B_i K_j ) ⊂ D
for every initial condition x(0).
Motivated by Chilali [2], an extended Lyapunov theorem for system (4) is developed with the above definition of an LMI-based circular pole region, as below.
Theorem 13. The fuzzy control system (4) is D-stable (all the closed-loop poles lie in the LMI region D) if and only if there exists a positive definite matrix Q such that
[ −rQ , qQ + Q( A_i + B_i K_j )^T ; qQ + ( A_i + B_i K_j )Q , −rQ ] < 0,   i, j = 1, 2, ..., r.
Proof. Let λ ∈ D be an eigenvalue of A_i + B_i K_j, i, j = 1, 2, ..., r, and let v ∈ C^n be a corresponding left eigenvector, so that v*( A_i + B_i K_j ) = λ v*. For each such v ∈ C^n,
[ v* , 0 ; 0 , v* ] [ −rQ , qQ + ( A_i + B_i K_j )Q ; qQ + Q( A_i + B_i K_j )^T , −rQ ] [ v , 0 ; 0 , v ] < 0,
that is,
[ −r v*Qv , q v*Qv + v*( A_i + B_i K_j )Qv ; q v*Qv + v*Q( A_i + B_i K_j )^T v , −r v*Qv ] < 0,
which equals
v*Qv [ −r , q + λ ; q + λ̄ , −r ] < 0.
Because Q > 0 we have v*Qv > 0, so that
f_D(λ) = [ −r , q + λ ; q + λ̄ , −r ] < 0.
In other words the fuzzy control system (4) is D-stable.
Conversely, suppose the fuzzy control system (4) is D-stable. We distinguish several cases. For the case that A_i + B_i K_j, i, j = 1, 2, ..., r, is a diagonal matrix Λ = Diag(λ_l), l = 1, 2, ..., n, with λ_l ∈ D, suppose that
[ −rI , qI + Λ^T ; qI + Λ , −rI ] ≥ 0;
then, according to Lemma 2, we obtain
−rI ≥ 0,   −rI − ( qI + Λ^T )( −rI )^{−1}( qI + Λ ) = −rI + (1/r)( qI + Λ^T )( qI + Λ ) ≥ 0.
Because r > 0 we have −rI < 0, which contradicts the assumption.
For the case that A_i + B_i K_j, i, j = 1, 2, ..., r, is a diagonalizable matrix, there exists an invertible matrix T such that
T^{−1}( A_i + B_i K_j )T = Λ,   i, j = 1, 2, ..., r,
with Λ a diagonal matrix whose elements are the eigenvalues of A_i + B_i K_j. Then
[ −rI , qI + ( T^{−1}( A_i + B_i K_j )T )^T ; qI + T^{−1}( A_i + B_i K_j )T , −rI ] = [ −rI , qI + Λ^T ; qI + Λ , −rI ] < 0.
For the case that A_i + B_i K_j is not diagonalizable, there exists an invertible matrix T such that
T^{−1}( A_i + B_i K_j )T = J,   i, j = 1, 2, ..., r,
with J a Jordan matrix. Then
[ −rI , qI + ( T^{−1}( A_i + B_i K_j )T )^T ; qI + T^{−1}( A_i + B_i K_j )T , −rI ] = [ −rI , qI + J^T ; qI + J , −rI ] < 0.
Let Q = TT*, so that Q > 0. Then
[ T , 0 ; 0 , T ] [ −rI , qI + ( T^{−1}( A_i + B_i K_j )T )^T ; qI + T^{−1}( A_i + B_i K_j )T , −rI ] [ T* , 0 ; 0 , T* ]
= [ −rTT* , qTT* + TT*( T^{−1}( A_i + B_i K_j )T )^T ; qTT* + ( T^{−1}( A_i + B_i K_j )T )TT* , −rTT* ] < 0,
that is,
[ −rQ , qQ + Q( T^{−1}( A_i + B_i K_j )T )^T ; qQ + ( T^{−1}( A_i + B_i K_j )T )Q , −rQ ] < 0.
Furthermore, if we take Re(Q), the matrix whose elements are the real parts of the elements of Q, then
[ −r Re(Q) , q Re(Q) + Re(Q)( T^{−1}( A_i + B_i K_j )T )^T ; q Re(Q) + ( T^{−1}( A_i + B_i K_j )T ) Re(Q) , −r Re(Q) ] < 0.
Because Q ∈ R^{n×n} and T^{−1}( A_i + B_i K_j )T is similar to A_i + B_i K_j, we conclude
[ −rQ , qQ + Q( A_i + B_i K_j )^T ; qQ + ( A_i + B_i K_j )Q , −rQ ] < 0,   i, j = 1, 2, ..., r. ■

2. THE MAIN RESULTS

Formulation for synthesis

We formulate a problem for the design of a fuzzy state feedback control system that guarantees stability and satisfies the desired transient response by using LMI constraints. The LMI formulation of the fuzzy state feedback synthesis problem is as follows:
Theorem 14. The fuzzy control system (4) can be stabilized in the LMI-D region if there exist a common positive definite matrix Q and matrices Y_i such that the following conditions hold:
A_iQ + QA_i^T + B_iY_i + Y_i^T B_i^T < 0,
( A_iQ + QA_i^T + B_iY_j + Y_j^T B_i^T ) / 2 + ( A_jQ + QA_j^T + B_jY_i + Y_i^T B_j^T ) / 2 < 0,   (6)
[ −rQ , qQ + QA_i^T + Y_i^T B_i^T ; qQ + A_iQ + B_iY_i , −rQ ] < 0,   i, j = 1, 2, ..., r.
Given a solution (Q, Y_i), the fuzzy state feedback gains are obtained as K_i = Y_i Q^{−1}.
Proof. System (4) can be written as
ẋ(t) = (1/W) ( Σ_{i=1}^{r} ω_i² G_{ii} + 2 Σ_{i<j} ω_i ω_j G_{ij} ) x,
where G_{ij} = ½( A_i + B_i K_j ) + ½( A_j + B_j K_i ), i < j, G_{ii} = ½( A_i + B_i K_i ) + ½( A_i + B_i K_i ), and W = Σ_{i=1}^{r} Σ_{j=1}^{r} ω_i ω_j. Defining Q = P^{−1}, the Lyapunov stability LMI (5) can be rewritten as
Q G_{ii}^T + G_{ii} Q < 0,   i = 1, 2, ..., r,
Q G_{ij}^T + G_{ij} Q < 0,   i < j ≤ r.
This is equivalent to
Q( A_i + B_i K_i )^T + ( A_i + B_i K_i )Q < 0,
(Q/2)( A_i + B_i K_j )^T + ½( A_i + B_i K_j )Q + (Q/2)( A_j + B_j K_i )^T + ½( A_j + B_j K_i )Q < 0.
Defining Y_i = K_i Q, we have
A_iQ + QA_i^T + B_iY_i + Y_i^T B_i^T < 0,
( A_iQ + QA_i^T + B_iY_j + Y_j^T B_i^T ) / 2 + ( A_jQ + QA_j^T + B_jY_i + Y_i^T B_j^T ) / 2 < 0.
The last sufficient condition can be derived immediately from Theorem 13. ■
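Since the conditions of Theorem 14 are linear in (Q, Y_i), the synthesis itself can be posed directly as an LMI feasibility problem. Below is a minimal sketch, assuming CVXPY is available; the two local models and the disk parameters q and rad are illustrative placeholders, not the satellite model of the next section, and the feedback is written as u = K x as in the reconstruction above.

import numpy as np
import cvxpy as cp

# Illustrative local models (placeholders) and a disk D centered at (-q, 0) with radius rad.
A = [np.array([[0., 1.], [3., 0.]]), np.array([[0., 1.], [12., 0.]])]
B = [np.array([[0.], [1.]]),          np.array([[0.], [1.]])]
q, rad = 5.0, 4.0
n, m, r = 2, 1, 2

Q = cp.Variable((n, n), symmetric=True)
Y = [cp.Variable((m, n)) for _ in range(r)]
eps = 1e-6
cons = [Q >> eps * np.eye(n)]
for i in range(r):
    Xi = A[i] @ Q + Q @ A[i].T + B[i] @ Y[i] + Y[i].T @ B[i].T
    cons.append(Xi << -eps * np.eye(n))                       # first condition of (6)
    M = cp.bmat([[-rad * Q, q * Q + Q @ A[i].T + Y[i].T @ B[i].T],
                 [q * Q + A[i] @ Q + B[i] @ Y[i], -rad * Q]])
    cons.append(M << -eps * np.eye(2 * n))                    # pole-placement condition of (6)
    for j in range(i + 1, r):
        Xij = A[i] @ Q + Q @ A[i].T + B[i] @ Y[j] + Y[j].T @ B[i].T
        Xji = A[j] @ Q + Q @ A[j].T + B[j] @ Y[i] + Y[i].T @ B[j].T
        cons.append(Xij / 2 + Xji / 2 << -eps * np.eye(n))    # cross condition of (6)

prob = cp.Problem(cp.Minimize(0), cons)
prob.solve(solver=cp.SCS)
if prob.status in ("optimal", "optimal_inaccurate"):
    K = [Yi.value @ np.linalg.inv(Q.value) for Yi in Y]       # K_i = Y_i Q^{-1}
    print([np.round(Ki, 3) for Ki in K])
else:
    print("LMIs infeasible for this data:", prob.status)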

The fuzzy controller model (4) is described as a weighted sum of r subsystems. The controller design guarantees the stability of the system and satisfies the desired transient response. As r grows large, a controller that stabilizes the system becomes difficult to obtain, because a solution of the set of LMIs satisfying all of the conditions must be found.
There are two approaches to relaxing the stability conditions according to [7], namely the global and the regional Membership Function Shape Dependent (MFSD) approaches. In this paper we use the regional MFSD approach. The operating region of the membership functions is divided into r regions. Each region has individual constraints that bring regional information into the relaxation of the stability conditions through slack matrices T. We formulate the problem of controller design as an LMI feasibility problem as follows:

Theorem 15. Assume that the number of rules that fire for all t is less than or equal to s, where 1 < s ≤ r. The fuzzy control system (4) can be stabilized in the specified region D if there exist a common positive definite matrix Q, matrices Y_i and a common positive semidefinite matrix T such that the following conditions hold:
A_iQ + QA_i^T + B_iY_i + Y_i^T B_i^T + (s − 1)T < 0,
( A_iQ + QA_i^T + B_iY_j + Y_j^T B_i^T ) / 2 + ( A_jQ + QA_j^T + B_jY_i + Y_i^T B_j^T ) / 2 − T < 0,   (7)
[ −rQ , qQ + QA_i^T + Y_i^T B_i^T ; qQ + A_iQ + B_iY_i , −rQ ] < 0,   i < j = 1, 2, ..., r,   h_i ∩ h_j ≠ ∅,
where s > 1. Given a solution (Q, Y_i), the fuzzy state feedback gains are obtained as K_i = Y_i Q^{−1}.
Proof. Consider a candidate Lyapunov function V(x(t)) = x^T(t) P x(t), P > 0. Then
V̇(x(t)) = Σ_{i=1}^{r} h_i(z)² x^T(t)( G_{ii}^T P + P G_{ii} ) x(t)
          + Σ_{i<j} 2 h_i(z) h_j(z) x^T(t) ( ( (G_{ij} + G_{ji})/2 )^T P + P (G_{ij} + G_{ji})/2 ) x(t),
where G_{ij} = ½( A_i + B_i K_j ) + ½( A_j + B_j K_i ), i < j, and G_{ii} = ½( A_i + B_i K_i ) + ½( A_i + B_i K_i ).
From the second LMI condition of (7) and Corollary 11, we have
V̇(x(t)) ≤ Σ_{i=1}^{r} h_i(z)² x^T(t)( G_{ii}^T P + P G_{ii} ) x(t) + Σ_{i<j} 2 h_i(z) h_j(z) x^T(t) T x(t)
        ≤ Σ_{i=1}^{r} h_i(z)² x^T(t)( G_{ii}^T P + P G_{ii} ) x(t) + (s − 1) Σ_{i=1}^{r} h_i(z)² x^T(t) T x(t)
        = Σ_{i=1}^{r} h_i(z)² x^T(t)( G_{ii}^T P + P G_{ii} + (s − 1)T ) x(t).
Because the first LMI condition of (7) holds with Q = P^{−1} and Y_i = K_i Q, we conclude
V̇(x(t)) < 0 for all x(t) ≠ 0. ■

Simulation Results on the Motion System of a Satellite

Finally, simulation results are presented by applying the proposed methodology to a model of the motion system of a satellite. Consider the model in state space form:
ẋ(t) = [ 0 1 0 0 ; 3ω² 0 0 2ω ; 0 0 0 1 ; 0 −2ω 0 0 ] x(t) + [ 0 0 ; 1/m 0 ; 0 0 ; 0 1/m ] u(t),
where x(t) = ( r, ṙ, θ − ωt, θ̇ − ω )^T and u(t) = ( u_r(t), u_θ(t) )^T. Here r(t) is the distance between the earth and the satellite, which varies with time, and θ(t) − ωt is the difference between the angular position and the nominal angle ωt, which also varies with time. Taking x_1(t) as the fuzzy variable, there exist three rules, as follows:
If x_1(t) is ZE then ẋ(t) = A_1 x(t) + B_1 u(t),
If x_1(t) is PO (or NE) and u(t) is ZE then ẋ(t) = A_2 x(t) + B_2 u(t),
If x_1(t) is PO (or NE) and u(t) is NE (or PO) then ẋ(t) = A_3 x(t) + B_3 u(t).
The fuzzy rules for the controller are given by
If x_1(t) is ZE then u_i(t+1) = K_i x(t),
If x_1(t) is PO (or NE) and u(t) is ZE then u_i(t+1) = K_i x(t),
If x_1(t) is PO (or NE) and u(t) is NE (or PO) then u_i(t+1) = K_i x(t).
Suppose the matrices are
A_1 = [ 0 1 0 0 ; 3 0 0 2 ; 0 0 0 1 ; 0 −2 0 0 ],   A_2 = [ 0 1 0 0 ; 12 0 0 4 ; 0 0 0 1 ; 0 −4 0 0 ],   A_3 = [ 0 1 0 0 ; 48 0 0 8 ; 0 0 0 1 ; 0 −8 0 0 ],
and B_1 = B_2 = B_3 = [ 0 0 ; 1/1620 0 ; 0 0 ; 0 1/1620 ].
The simulation results are shown in the impulse response plots below.
[Figure: impulse responses of outputs Out(1) and Out(2) over 0-5 s, before the controller is applied and after the controller designed by Theorem 14 is applied.]
[Figure: impulse responses of outputs Out(1) and Out(2) over 0-5 s, after the controller obtained with the relaxed stability conditions of Theorem 15 is applied.]
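Closed-loop responses such as the ones plotted above can in principle be reproduced by simulating the blended system (4) once gains K_i are available. The following is a minimal sketch, not the authors' code, using the subsystem matrices A_1, A_2, A_3 and B listed above (m = 1620); the membership functions, the stabilizing gains K_i and the impulse-like initial condition are illustrative assumptions, standing in for the K_i = Y_i Q^{-1} obtained from the LMIs.

import numpy as np
from scipy.integrate import solve_ivp

# Subsystem matrices from the satellite example (m = 1620).
A = [np.array([[0, 1, 0, 0], [3, 0, 0, 2], [0, 0, 0, 1], [0, -2, 0, 0]], float),
     np.array([[0, 1, 0, 0], [12, 0, 0, 4], [0, 0, 0, 1], [0, -4, 0, 0]], float),
     np.array([[0, 1, 0, 0], [48, 0, 0, 8], [0, 0, 0, 1], [0, -8, 0, 0]], float)]
B = np.zeros((4, 2)); B[1, 0] = B[3, 1] = 1.0 / 1620.0

# Illustrative gains (placeholders); the same gain is reused for every rule here.
K1 = 1620.0 * np.array([[-50., -10., 0., -2.],
                        [  0.,   2., -3., -3.]])
K = [K1, K1, K1]

def weights(x1):
    # Illustrative triangular memberships for the ZE / PO / NE regions of x1.
    w = np.array([max(0., 1. - abs(x1)),
                  max(0., min(1., x1)),
                  max(0., min(1., -x1))]) + 1e-9
    return w / w.sum()

def rhs(t, x):
    h = weights(x[0])
    u = sum(h_i * (K_i @ x) for h_i, K_i in zip(h, K))              # blended control, as in (3)
    return sum(h_i * (A_i @ x + B @ u) for h_i, A_i in zip(h, A))   # blended dynamics, as in (4)

x0 = np.array([0.5, 0., 0., 0.])      # impulse-like initial perturbation (illustrative)
sol = solve_ivp(rhs, (0., 5.), x0, max_step=0.01)
print(sol.y[[0, 2], -1])              # final values of the two position-like states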

3. CONCLUSIONS
In this paper, a linear state feedback fuzzy controller with guaranteed stability and pre-specified transient performance is presented. By formulating the system as a TS fuzzy model and recasting the constraints as LMIs, we formulate an LMI feasibility problem for the design of the fuzzy state feedback control system. As the number of rules r grows large, a controller that stabilizes the system becomes difficult to obtain, because a solution of the set of LMIs satisfying all of the conditions must be found. By relaxing the stability conditions we have formulated the controller design problem as an LMI feasibility problem.

Acknowledgement. This project was supported by Research Grant of Mathematics


Department for 2010-2011, Faculty of Mathematics and Natural Sciences, Gadjah Mada
University.

References

[1] BOYD, S., EL GHAOUI, L., FERON, E. AND BALAKRISHNAN, V., Linear Matrix Inequalities in System and Control Theory,
SIAM, Philadelphia, 1994.
[2] CHILALI, M., AND GAHINET, P., H∞ Design with Pole Placement Constraints: An LMI Approach,
IEEE Trans. Automatic Control, Vol 41, No 3, pp 358-367, 1996.
[3] HONG, S.K. AND NAM, Y., Stable Fuzzy Control System Design with Pole Placement Constraint: An
LMI Approach, Computer in Industry, Elsevier Science, 2003.
[4] LAM, H.K., LEUNG, F.H.F. AND TAM, P.K.S., A LMI Approach for Control of Uncertain Fuzzy
System, IEEE Control Systems Magazine, August 2002.
[5] MASTORAKIS, N.E., Modeling Dynamical Systems via the Takagi-Sugeno Fuzzy Model, Department of
Electrical Engineering and Computer Science, Hellenic Naval Academy Piraeus, Greece, 2004.
[6] MESSOUSI, W.E., PAGES, O. AND HAJJAJI, A.E., Robust Pole Placement for Fuzzy Models with
Parametric Uncertainties: An LMI Approach, University of Picardie Jules Verne, France, 2005.
[7] NARIMANI, M and LAM, HK., Relaxed LMI-Based Stability Conditions for Takagi-Sugeno Fuzzy
Control Systems Using Regional-Membership-Function-Shape-Dependent Analysis Approach, IEEE
Transaction on Fuzzy Systems, Vol 17, No 5, 2009.
[8] OGATA, K, Modern Control Engineering, 2nd ed. Englewood Cliffs, N.J,: Prentice Hall, Inc, 1990.
[9] OLSDER, G.J., Mathematical System Theory, Faculty of Technical Mathematics and Informatics, Delft University of Technology, Netherlands, 1994.


[10] POLDERMAN J.W., Applied Mathematical Sciences (AMS) series, Enscheda, Groningen, 2006.
[11] SOLIKHATUN, Fuzzy State feedback Control with Multiobjectives, Proceeding IndoMS International
Conference Mathematics Applied, Gadjah Mada University, Yogyakarta, 2009.
[12] TANAKA, K., AND SUGENO, M., Stability Analysis and Design of Fuzzy Control System, Fuzzy Set
System, Vol 45, No 2, pp135-156, 1992.
[13] TANAKA, K. AND WANG, H., Fuzzy Control System Design and Analysis: A Linear Matrix Inequality
Approach, John Wiley & Sons, Inc, 2001.
[14] WANG, H. O., TANAKA, K., AND GRIFFIN, M. F., An Approach to Fuzzy Control of Nonlinear
Systems: Stability and Design Issues, IEEE Trans. On Fuzzy System, Vol 4. No 1 pp 14-23, 1996.

Solikhatun
Department of Mathematics
Faculty of Mathematics and Natural Sciences
Gadjah Mada University, Yogyakarta
e-mail : [email protected]

Salmah
Department of Mathematics
Faculty of Mathematics and Natural Sciences
Gadjah Mada University, Yogyakarta
e-mail : [email protected]
Proceedings of ”The 6th SEAMS-UGM Conference 2011”
Applied Mathematics, pp. 529–546.

UNSTEADY HEAT AND MASS TRANSFER FROM A


STRETCHING SURFACE EMBEDDED IN A POROUS
MEDIUM WITH SUCTION/INJECTION AND THERMAL
RADIATION EFFECTS

Stanford Shateyi and Sandile S Motsa

Abstract. This work investigates the application of the new successive linearisation
method (SLM) to the problem of unsteady heat and mass transfer from a stretching sur-
face embedded in a porous medium with suction/injection and thermal radiation effects.
The governing non-linear momentum, energy and mass transfer equations are success-
fully solved numerically using the SLM approach coupled with the spectral collocation
method for iteratively solving the governing linearised equations. Comparison of the SLM
results for various flow parameters against numerical results and other published results,
obtained using the Homotopy Analysis Method and Runge-Kutta methods, for related
problems indicates that the SLM is a very powerful tool which is much more accurate
and efficient than other methods. The SLM converges much faster than the traditional
methods like the Homotopy Analysis Method and is very easy to implement.
Keywords and Phrases: successive linearization, heat & mass transfer, porous medium,
thermal radiation.

1. INTRODUCTION
The study of heat and mass transfer over a stretching surface is important in many
industrial applications such as hot rolling and wire drawing, glass fibre production, the
aerodynamic extrusion of plastic sheets, the continuous casting, paper production, glass
blowing and metal spinning. The quality of the final product depends on the rate of
heat transfer at the surface. In the pioneering work of Crane [1], the flow of Newtonian
fluid over a linearly stretching surface was studied. Subsequently, the pioneering works
of Crane are extended by many authors to explore various aspects of the flow and heat
transfer occurring in an infinite domain of the fluid surrounding the stretching sheet

2010 Mathematics Subject Classification: 76


(e.g Laha et al. [2]; Afzal [3]; Prasad et al. [4]; Abel and Mahesha [5]; Abel et al. [6];
Cortell [7]).
Physically, the problem of natural/mixed convection flow past a stretching sheet
embedded in porous medium arises in some metallurgical processes which involve the
cooling of continuous strips or filaments by drawing them through quiescent fluid. Draw-
ing the strips through porous media allows to control the rate of cooling better and the
final product of desired characteristics can be achieved. Abdou [8] developed a numeri-
cal model to study the effect of thermal radiation on unsteady boundary layer flow with
temperature dependent viscosity and thermal conductivity due to a stretching sheet in
porous media. Pal and Mondal [9] performed a boundary layer analysis to study the
influence of thermal radiation and buoyancy force on two-dimensional magnetohydrody-
namic flow of an incompressible viscous and electrically conducting fluid over a vertical
stretching sheet embedded in a porous medium in the presence of inertia effect.
Motivated by the previous works and the vast possible industrial applications, it
is of interest in this article to analyze unsteady heat and mass transfer from a stretching
permeable surface embedded in a porous medium with suction/injection and thermal
radiation. The governing partial differential equations are transformed into ordinary
differential equations using the similarity transformation, before being solved by a new
technique called successive linearization method (SLM).

2. MATHEMATICAL FORMULATION
We consider an unsteady boundary-layer flow due to a stretching permeable sur-
face embedded in a uniform porous medium and issuing from a slot as shown in Figure
1. We assume that equal and opposite forces are applied along the x− axis so that the
sheet is stretched, keeping the origin fixed in the fluid of the ambient temperature T∞
and concentration C∞ . At t = 0, the sheet is impulsively stretched with the variable velocity
Uw (x, t), while the temperature distribution Tw (x, t) and the concentration distribution
Cw (x, t) vary both along the sheet and with time. We assume the fluid properties to
be constant. The radiative heat flux in the x− direction is negligible in comparison with
that in the y−direction. The fluid flow over the unsteady stretching sheet carries a diffusing
species concentration. Under these assumptions, with the usual Boussinesq approximation,
the governing boundary-layer equations for this investigation are given by:
∂u ∂v
+ = 0, (1)
∂x ∂y
∂u ∂u ∂u ∂2u ν
+u +v = ν 2
+ gβT (T − T∞ ) + gβC (C − C∞ ) − u, (2)
∂t ∂x ∂y ∂y k
∂T ∂T ∂T ∂2T 1 ∂qr
+u +v = α 2 − , (3)
∂t ∂x ∂y ∂y ρcp ∂y
∂C ∂C ∂C ∂2C
+u +v = D . (4)
∂t ∂x ∂y ∂y 2

The boundary conditions for this problem can be written as:


u(x, 0) = Uw (x, t), v(x, 0) = Vw , T (x, 0) = Tw (x, t), C(x, 0) = Cw (x, t), (5)
u(x, ∞) = 0, T (x, ∞) = T∞ , C(x, ∞) = C∞ . (6)

[Sketch: fluid issues from a slit; the sheet is stretched along the x-axis over a solid block by an applied force, with y measured normal to the sheet.]
Figure 1. The physical configuration for the boundary layer flow.

By using the Rosseland diffusion approximation (Hossain et al. [12], Seddeek [13])
and following Raptis [14] among other researchers, the radiative heat flux, qr is given
by
4σ ∗ ∂T 4
qr = − , (7)
3Ks ∂y

where σ ∗ and Ks are the Stefan-Boltzmann constant and the Rosseland mean absorption
coefficient, respectively. We assume that the temperature differences within the flow
are sufficiently small such that T 4 may be expressed as a linear function of temperature.
T^4 ≈ 4T_∞^3 T − 3T_∞^4 . (8)
Using (7) and (8) in the last term of equation (3), we obtain
∂qr/∂y = − (16σ ∗ T_∞^3 / 3Ks) ∂²T/∂y² . (9)

The stretching velocity Uw (x, t), the surface temperature Tw (x, t) and the surface
concentration Cw (x, t) are assumed to be of the form:
ax bx bx
Uw (x, t) = , Tw (x, t) = T∞ + , Cw (x, t) = C∞ + , (10)
1 − ct 1 − ct 1 − ct
where a, b and c are positive constants with dimension reciprocal time and ct <
1. The effective stretching rate a/(1 − ct) increases with time. In the context of
polymer extrusion the material properties may vary with time even though the sheet is
being pulled by a constant force. With unsteady stretching (i.e. c ≠ 0), however, a−1
becomes the representative time scale of the resulting unsteady boundary layer problem.
The expressions for the temperature Tw (x, t) and concentration Cw (x, t) of the sheet
represent a situation in which the sheet temperature and concentration increase (reduce)
if b is positive (negative) from T∞ and C∞ , respectively at the leading edge (x = 0) in
proportion to x and such that the amount of temperature and concentration increase
(reduction) along the sheet increase with time. Further, in equation (5) Vw is the
suction/injection parameter, Vw > 0 (injection) and Vw < 0 (suction).
We introduce the following self-similar transformation (see Ishak et al. [15] and
Anderson et al. [16], among others):
 1/2
Uw 1 T − T∞ C − C∞
η= y, ψ = (νxUw ) 2 f (η), θ = , φ= , (11)
νx Tw − T∞ Cw − C∞
where ψ(x, y) is the physical stream function which automatically satisfies the conti-
∂ψ
nuity equation (1). The velocity components are then given as: u = ∂Ψ ∂y , v = − ∂x .
Governing equations are then transformed into a set of ordinary differential equations
and associated boundary conditions as given below:
η
f 000 + f f 00 − (f 0 )2 − Kf 0 − A(f 0 + f 00 ) + Grθ + Gcφ = 0, (12)
  2
1+R 1
θ00 + f θ0 − f 0 θ − A(θ + ηθ0 ) = 0, (13)
Pr 2
1 00 1
φ + f φ0 − f 0 φ − A(φ + ηφ0 ) = 0, (14)
Sc 2
where the prime indicates differentiation with respect to η.
3
c 16σT∞
A= , R= , Gr = gβT (Tw − T∞ )x3 /ν 2 ,
a 3Ks αρcp
νx ν
Gc = gβC (Cw − C∞ )x3 /ν 2 , K = , P r = , Sc = ν/D, (15)
kUw α
In view of equations (11), the boundary conditions (5) to (6), transform into

f (0) = fw , f 0 (0) = 1, θ(0) = φ(0) = 1, (16)


u=0, θ = 0, φ = 0 as η → ∞, (17)
q
2x
where fw = −Vw νU ∞
is the mass transfer coefficient such that fw > 0 indicates
suction and fw < 0 indicates blowing at the surface. The quantities of physical interest

are the skin-friction coefficients Cf , local Nusselt number N ux and local Sherwood
number Shx which are defined as follows:

(1/2) Re_x^{1/2} C_f = −f''(0), (18)
Nu_x Re_x^{−1/2} = −θ'(0), (19)
Sh_x Re_x^{−1/2} = −φ'(0), (20)

where Re_x = xU∞/ν is the Reynolds number.

3. SUCCESSIVE LINEARISATION METHOD (SLM)


In this section we apply the proposed linearisation method of solution, hereinafter
referred to as the successive linearisation method (SLM), to solve the governing equa-
tions (12 - 14). The SLM is based on the assumption that the unknown functions f (η),
θ(η) and φ(η) can be expanded as

i−1
X i−1
X i−1
X
fi (η) = Fi (η) + fm (η), θi (η) = Gi (η) + θm (η), φi (η) = Hi (η) + φm (η),
m=0 m=0 m=0
(21)
where i = 1, 2, 3, . . . ; Fi , Gi , Hi are unknown functions and fm , θm , φm (m ≥ 1) are
the successive approximations which are obtained by recursively solving the linear part
of the equation system that results from substituting (21) in the governing equations
(12 - 14). Substituting (21) in the governing equations gives

F_i''' + a_{1,i−1} F_i'' + a_{2,i−1} F_i' + a_{3,i−1} F_i + Gr G_i + Gc H_i + F_i F_i'' − F_i' F_i' = r_{i−1},
((1 + R)/Pr) G_i'' + b_{1,i−1} G_i' + b_{2,i−1} G_i + b_{3,i−1} F_i' + b_{4,i−1} F_i + F_i G_i' − F_i' G_i = s_{i−1},
(1/Sc) H_i'' + c_{1,i−1} H_i' + c_{2,i−1} H_i + c_{3,i−1} F_i' + c_{4,i−1} F_i + F_i H_i' − F_i' H_i = t_{i−1}   (22)

where the coefficient parameters ak,i−1 , bk,i−1 , ck,i−1 (k = 1, .., 4), ri−1 , si−1 and ti−1
are defined as
i−1 i−1 i−1
X η X
0
X
00
a1,i−1 = fm − A, a2,i−1 = −2 fm − K − A, a3,i−1 = fm ,
m=0
2 m=0 m=0
i−1
X i−1
X i−1
X
0 0
b1,i−1 = a1,i−1 , b2,i−1 = − fm − A, b3,i−1 = − θm , b4,i−1 = θm ,
m=0 m=0 m=0
i−1
X i−1
X
c1,i−1 = a1,i−1 , c2,i−1 = b2,i−1 c3,i−1 = − φm , c4,i−1 = φ0m ,
m=0 m=0

i−1 i−1 i−1 i−1
!2 i−1
X X X X X
000 00 0 0
ri−1 = − fm + fm fm − fm − (K + A) fm
m=0 m=0 m=0 m=0 m=0
i−1 i−1 i−1
#
η X X X
− A f 00 + Gr θm + Gc φm ,
2 m=0 m m=0 m=0
"  i−1 i−1 i−1 i−1 i−1
1 + R X 00 X
0
X X
0
X
si−1 = − θm + θm fm − fm θm
Pr m=0 m=0 m=0 m=0 m=0
i−1 i−1
!#
X η X 0
− A θ+ θ ,
m=0
2 m=0 m
" i−1 i−1 i−1 i−1 i−1
1 X 00 X X X X
ti−1 = − φm + φ0m fm − 0
fm φm
Sc m=0 m=0 m=0 m=0 m=0
i−1 i−1
!#
X η X 0
− A φ+ φ ,
m=0
2 m=0 m

We choose the following functions, as initial approximations of the SLM algorithm

f0 (η) = fw + 1 − e−η , θ0 (η) = e−η , φ0 (η) = e−η , (23)

which are chosen to satisfy the boundary conditions (15) and (16). The solutions
for fm , θm , φm for m ≥ 1 are obtained by successively solving the linearised form
of equations (22) - (22) which are given as

f_i''' + a_{1,i−1} f_i'' + a_{2,i−1} f_i' + a_{3,i−1} f_i + Gr θ_i + Gc φ_i = r_{i−1},   (24)
((1 + R)/Pr) θ_i'' + b_{1,i−1} θ_i' + b_{2,i−1} θ_i + b_{3,i−1} f_i' + b_{4,i−1} f_i = s_{i−1},   (25)
(1/Sc) φ_i'' + c_{1,i−1} φ_i' + c_{2,i−1} φ_i + c_{3,i−1} f_i' + c_{4,i−1} f_i = t_{i−1},   (26)
with boundary conditions

fi (0) = fi0 (0) = fi0 (∞) = θi (0) = θi (∞) = φi (0) = φi (∞) = 0. (27)

Once each solution for fi , θi and φi for (i ≥ 1) has been found from iteratively solving
equations (24) - (26) for each i, the approximate solutions for f (η), θ(η) and φ(η) are
obtained as

M
X M
X M
X
f (η) ≈ fm (η), θ(η) ≈ θm (η), φ(η) ≈ φm (η), (28)
m=0 m=0 m=0

where M is the order of SLM approximation. In arriving at (28), it is assumed that Fi ,


Gi and Hi become increasingly smaller when i becomes large, that is

lim Fi = lim Gi = lim Hi = 0. (29)


i→∞ i→∞ i→∞

Since the coefficient parameters and the right hand side of equations (24) - (26), for i =
1, 2, 3, . . ., are known (from previous iterations), the equation system (24 - 27) can easily
be solved using analytical means (whenever possible) or any numerical methods such as
finite differences, finite elements, Runge-Kutta based shooting methods or collocation
methods. In this work, equations (24 - 27) are solved using the Chebyshev spectral
collocation method.
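Each SLM iteration therefore only requires the solution of a linear two-point boundary value problem such as (24)-(26). The following is a minimal sketch of the Chebyshev collocation building block, not the authors' code: it constructs the standard Chebyshev differentiation matrix on [-1, 1] and solves an illustrative linear model problem u'' + u' = 1 with u(±1) = 0; mapping to a truncated [0, η∞] domain and assembling the coupled SLM system are omitted.

import numpy as np

def cheb(N):
    # Chebyshev differentiation matrix D and Gauss-Lobatto points x on [-1, 1]
    # (Trefethen, "Spectral Methods in MATLAB", cheb.m).
    if N == 0:
        return np.zeros((1, 1)), np.array([1.0])
    x = np.cos(np.pi * np.arange(N + 1) / N)
    c = np.hstack([2.0, np.ones(N - 1), 2.0]) * (-1.0) ** np.arange(N + 1)
    X = np.tile(x, (N + 1, 1)).T
    dX = X - X.T
    D = np.outer(c, 1.0 / c) / (dX + np.eye(N + 1))
    D -= np.diag(D.sum(axis=1))          # diagonal entries via negative row sums
    return D, x

# Illustrative linear BVP: u'' + u' = 1 on [-1, 1], u(-1) = u(1) = 0.
N = 32
D, x = cheb(N)
L = D @ D + D                            # discrete operator for u'' + u'
rhs = np.ones(N + 1)
# Impose Dirichlet conditions by replacing the first and last collocation rows.
L[0, :] = 0.0; L[0, 0] = 1.0; rhs[0] = 0.0
L[N, :] = 0.0; L[N, N] = 1.0; rhs[N] = 0.0
u = np.linalg.solve(L, rhs)
print(float(u[np.argmin(np.abs(x))]))    # value of u near x = 0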

4. Results and Discussion


In order to check the accuracy of the successive linearisation approach used, a
comparison of the skin friction f 00 (0) and heat transfer rate at the wall θ0 (0) for Gr =
0, Gc = 0, K = 0, R = 0 with various values of fw , A and P r is made with the results
reported in Ziabakhsh et al.[10]. A comparison of the present results for the skin friction
f 00 (0) is also made with the results of Dulal and Hiremath [11] for various values of A
and K when Gr = 0, Gc = 0, fw = 0, R = 0. Validation of the SLM is also done by
comparing it to the MATLAB boundary value problem solver called bvp4c numerical
method.
It should be noted that default choices were made for the intervals within which the respective physical quantities discussed here were varied. The intervals mainly depend on the convergence of the numerical methods used.
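The role played by MATLAB's bvp4c as a benchmark can be mimicked with open-source tools. The following is a minimal sketch, not the authors' code, using SciPy's solve_bvp on the system (12)-(14) rewritten in first-order form, with the semi-infinite domain truncated at an assumed η∞ = 10 and an illustrative parameter set (Gr = Gc = 0, K = 1, R = 0, Pr = Sc = A = 1, fw = 0).

import numpy as np
from scipy.integrate import solve_bvp

# Illustrative parameters matching one of the tabulated cases.
Gr, Gc, K, R, Pr, Sc, A, fw = 0., 0., 1., 0., 1., 1., 1., 0.
eta_inf = 10.0

def odes(eta, y):
    # y = [f, f', f'', theta, theta', phi, phi']
    f, fp, fpp, th, thp, ph, php = y
    fppp = -(f * fpp - fp**2 - K * fp - A * (fp + 0.5 * eta * fpp) + Gr * th + Gc * ph)
    thpp = -Pr / (1 + R) * (f * thp - fp * th - A * (th + 0.5 * eta * thp))
    phpp = -Sc * (f * php - fp * ph - A * (ph + 0.5 * eta * php))
    return np.vstack([fp, fpp, fppp, thp, thpp, php, phpp])

def bc(y0, yinf):
    # f(0) = fw, f'(0) = 1, theta(0) = phi(0) = 1; f'(inf) = theta(inf) = phi(inf) = 0.
    return np.array([y0[0] - fw, y0[1] - 1., y0[3] - 1., y0[5] - 1.,
                     yinf[1], yinf[3], yinf[5]])

eta = np.linspace(0., eta_inf, 200)
guess = np.zeros((7, eta.size))
guess[0] = fw + 1. - np.exp(-eta)        # initial guesses based on (23)
guess[1] = np.exp(-eta)
guess[3] = np.exp(-eta)
guess[5] = np.exp(-eta)
sol = solve_bvp(odes, bc, eta, guess, max_nodes=20000)
print(sol.status, sol.y[2, 0])           # f''(0); compare with Table 2 (K = 1, A = 1)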

Table 1. Comparison between the present successive linearisation


results and the bvp4c numerical results with previous results of Zi-
abakhsh et.al.[10] for f 00 (0) using various values of fw , A and P r when
Gr = 0, Gc = 0, K = 0 and R = 0.

Present Results f 00 (0)


fw A 3rd order 4th order bvp4c Numerical[10] HAM [10]
-1.5 1.0 -0.8095115 -0.8095115 -0.8095115 -0.8095115 -0.8095291
0.0 1.0 -1.3205221 -1.3205221 -1.3205221 -1.3205220 -1.3203526
5/6 1.0 -1.7706579 -1.7706579 -1.7706579 -1.7706579 -1.7706091
1.5 1.0 -2.2223550 -2.2223553 -2.2223554 -2.2223554 -2.2222147
2.1 1.0 -2.6853982 -2.6853998 -2.6853998 -2.6853999 -2.6853923
1.5 0.0 -1.9999953 -2.0000000 -2.0000000 -2.0000001 -1.9989219
1.5 1.0 -2.2223550 -2.2223553 -2.2223554 -2.2223554 -2.2222147
1.5 1.5 -2.3310176 -2.3310177 -2.3310177 -2.3310177 -2.3309709
1.5 2.0 -2.4360794 -2.4360794 -2.4360794 -2.4360794 -2.4360762
1.5 2.5 -2.5371027 -2.5371027 -2.5371027 -2.5371027 -2.5372085

Table 2. Comparison between the present successive linearisation re-


sults and the bvp4c numerical results against previous results of Dulal
and Hiremath [11] for f 00 (0) using various values of A and K when
Gr = 0, Gc = 0, fw = 0 and R = 0.
Present Results f 00 (0)
K A 2nd order 3rd order 4th order bvp4c Dulal and Hiremath [11]
0 0.5 -1.1672111 -1.1672115 -1.1672115 -1.1672115 -1.167221
1.0 -1.3205203 -1.3205221 -1.3205221 -1.3205221 -1.320540
1.5 -1.4596629 -1.4596659 -1.4596659 -1.4596659 -1.459687
2.0 -1.5873623 -1.5873661 -1.5873661 -1.5873661 -1.587403
0.5 0.5 -1.3662303 -1.3662375 -1.3662375 -1.3662375 -1.366245
1.0 -1.4983871 -1.4983943 -1.4983943 -1.4983943 -1.498408
1.5 -1.6214874 -1.6214944 -1.6214944 -1.6214944 -1.621517
2.0 -1.7366949 -1.7367016 -1.7367016 -1.7367016 -1.736733
1.0 0.5 -1.5390352 -1.5390514 -1.5390514 -1.5390514 -1.539056
1.0 -1.6571174 -1.6571302 -1.6571302 -1.6571302 -1.657140
1.5 -1.7687139 -1.7687246 -1.7687246 -1.7687246 -1.768747
2.0 -1.8744531 -1.8744623 -1.8744623 -1.8744623 -1.874492
1.5 0.5 -1.6940627 -1.6940851 -1.6940851 -1.6940851 -1.694089
1.0 -1.8018426 -1.8018594 -1.8018594 -1.8018594 -1.801870
1.5 -1.9046672 -1.9046807 -1.9046807 -1.9046807 -1.904669
2.0 -2.0029245 -2.0029356 -2.0029356 -2.0029356 -2.002957
2.0 0.5 -1.8359363 -1.8359622 -1.8359622 -1.8359622 -1.835965
1.0 -1.9357356 -1.9357550 -1.9357550 -1.9357550 -1.935763
1.5 -2.0315838 -2.0315990 -2.0315990 -2.0315990 -2.031611
2.0 -2.1237414 -2.1237538 -2.1237538 -2.1237538 -2.123771

Table 1 indicates that the SLM is very accurate and rapidly converges to the
numerical results generated by bvp4c. The numerical results obtained by Ziabakhsh et
al. [10] are comparable with the present numerical results of bvp4c. The HAM results
are not so accurate. We also observe in Table 1 that the local skin friction for the flow
is increased by increasing values of fw and A.
Table 2 depicts the comparison between the present successive linearisation results
and the bvp4c numerical results against results of Dulal and Hiremath [11]. It can be
clearly seen in this table that the SLM is very accurate and rapidly converges to the
numerical results generated by bvp4c. At 3rd order of approximation the solution has
already converged. The numerical results obtained by Dulal and Hiremath [11] using
the Runge-Kutta-Fehlberg with shooting technique are not as accurate as our present
results. In Table 2 we also observe that the skin friction increases as the values of the
permeability parameter K increase.
The results of both the SLM and numerical bvp4c computations are displayed in
Figures 2 - 5 for non-dimensional velocity f 0 (η), temperature θ(η) and concentration
φ(η). As expected, the velocity f 0 (η) is also a decreasing function of A as clearly shown
in Figure 2. As can be seen in this Figure 2, increasing the unsteadiness parameter A
reduces the flow properties such as velocity, temperature θ(η) and concentration φ(η).
As evidenced in Figure 2, the boundary layer thickness decreases with increasing values
of A. As a consequence the transition of the boundary layer to turbulent flow conditions
occurs farther downstream. This confirms that the stretching of surfaces can be used
as a flow stabilizing mechanism.
Figure 3 depicts the effects of varying the permeability parameter K of the porous
medium on the velocity f 0 (η), temperature θ(η) and concentration φ(η). The parameter
K as defined in equation 15 is inversely proportional to the actual permeability k of the
porous medium. As the permeability of the porous medium represents resistance to the
flow since it restricts the motion of the fluid along the surface, the stream function and
subsequently the velocity f 0 (η) decreases as K increases (as the permeability physically
becomes less with increasing k). With increasing K, the thickness of the boundary
layer increases so the velocity decreases with the increase of K. This is by virtue of the
fact that the effect of porous medium which opposes the flow also increases and leads
to enhanced deceleration of the flow. We observe in this Figure 3 that both the fluid
temperature and concentration are increasing functions of the permeability parameter
K. This is consistent with the fact that the increase of porosity parameter causes
the fluid velocity to decrease and due to which there is rise in the temperature and
concentration in the boundary layer.
Figure 4 represents the velocity, temperature and concentration profiles for fw =
0, 1, 2, and 3. We see that the effect of suction is to decrease the horizontal velocity f 0 (η).
The physical explanation for such behaviour is that while stronger suction is provided,
the heated fluid is sucked through the wall where buoyancy forces can act to decelerate
the flow with more influence of viscosity. Sucking decelerated fluid particles through
the porous wall reduce the growth of the fluid boundary layer as well as thermal and
concentration. From Figure 4 it is clear that the dimensionless temperature and con-
centration decrease due to suction. The physical interpretation of this is that the fluid
[Three panels: profiles of f'(η), θ(η) and φ(η) for A = 0, 1, 2, 5.]
Figure 2. The comparison between the bvp4c numerical results (solid


line) and 4th order successive linearisation solution (squares) for
f 0 (η), θ(η) and φ(η), for different values of A, when Gr = 0, Gc = 0,
fw = 0, R = 0, P r = 1, Sc = 1, K = 1.
[Three panels: profiles of f'(η), θ(η) and φ(η) for K = 0, 2, 4, 8.]
Figure 3. The comparison between the bvp4c numerical results (solid


line) and 4th order successive linearisation solution (squares) for
f 0 (η), θ(η) and φ(η), for different values of K, when Gr = 0, Gc = 0,
fw = 0, R = 0, P r = 1, Sc = 1, A = 1.

at the ambient conditions is brought closer to the surface and reduces the thermal and
solutal boundary layer thicknesses. But the temperature and concentration increase due
to injection. The thermal and solutal boundary layer thicknesses increase (decrease)
with injection (suction) which cause decreases (increases) in the rates of heat and mass
transfer.
Figure 5 shows the thermal radiation effect on the velocity, temperature and concen-
tration profiles. Increasing the thermal radiation parameter produces increases in the
stream function, velocity and temperature of the fluid. This can be explained by the
fact that the effect of radiation R is to increase the rate of energy transport to the fluid
and accordingly to increase the fluid temperature. In Figure 5 we see that radiation
has no significant effect on the concentration of the flow.
[Three panels: profiles of f'(η), θ(η) and φ(η) for fw = 0, 1, 2, 3.]
Figure 4. The comparison between the bvp4c numerical results (solid


line) and 4th order successive linearisation solution (squares) for
f 0 (η), θ(η) and φ(η), for different values of f w, when Gr = 0, Gc = 0,
K = 1, R = 0, P r = 1, Sc = 1, A = 1
[Three panels: profiles of f'(η), θ(η) and φ(η) for R = 0, 1, 2, 5.]
Figure 5. The comparison between the bvp4c numerical results (solid


line) and 4th order successive linearisation solution (squares) for
f 0 (η), θ(η) and φ(η), for different values of R, when Gr = 0, Gc = 0,
K = 1, f w = 0, P r = 1, Sc = 1, A = 1.

Nomenclature

a, b, c constants
A unsteadiness parameter
C species concentration at any point in the flow field
Cf skin-friction coefficient
Cw species concentration at the wall
cp specific heat at constant pressure
C∞ species concentration at the free stream
D molecular diffusivity of the species concentration
f dimensionless stream function
f0 dimensionless velocity
fw mass transfer coefficient
g acceleration due to gravity
Gc Concentration buoyancy parameter
Gr Grashof number
k Darcy permeability
K permeability parameter
Ks mean-absorption coefficient
N ux local Nusselt number
Pr Prandtl number
qr Rosseland approximation
R thermal radiation parameter
Rex Reynolds number
Sc Schmidt number
Shx local Sherwood number
T fluid temperature at any point
Tw fluid temperature at the wall
T∞ free stream temperature
u streamwise velocity
Uw velocity of the stretching sheet
v normal velocity
Vw Suction/injection velocity
x streamwise coordinate axis
y normal coordinate axis
Greek Symbol
α thermal conductivity
ν kinematic viscosity
βC volumetric coefficient expansion with concentration
βT volumetric coefficient of thermal expansion
ρ density of the fluid
σ∗ Stefan-Boltzman constant
η similarity variable
θ dimensionless temperature
φ Dimensionless concentration

5. CONCLUDING REMARKS
In this work, we employed a very powerful new linearisation technique, known
as the Successive Linearisation Method (SLM), to study the unsteady heat and mass
transfer from a stretching surface embedded in a porous medium with suction/injection
and thermal radiation effects. The SLM results for the governing flow properties, such
as velocity profiles, temperature profiles, concentration profiles, wall heat transfer and
mass transfer, were compared with results obtained using MATLAB’s bvp4c function
and excellent agreement was observed. The SLM was found to converge very rapidly
to the numerical results and accuracy to 10−7 was achieved only after two or three
iterations when varying all the governing physical parameters.

Acknowledgement. The authors acknowledge the financial support from the National
Research Foundation (NRF) and University of Venda.

References
[1] Crane, L. J., Flow past a stretching plate, Z. Angew. Math. Physc. 12, 645-647, 1970.
[2] Laha, M. K., P. S Gupta and A. S Gupta, Heat transfer characteristics of the flow of an
incompressible viscous fluid over a stretching sheet, Warme-und Stoffubertrag, 24, 151-153, 1989.
[3] Afzal, N., Heat transfer from a stretching surface, Int. J. Heat Mass Trans. 36, 1128-1131, 1993.
[4] Prasad, K. V., A. S Abel, and P. S. Datti., Diffusion of chemically reactive species of non-
Newtonian fluid immersed in a porous medium over a stretching sheet, Int.J. Non-Linear Mech.
38, 651-657, 2003.
[5] Abel, M.S. and Mahesha N., Heat transfer in MHD viscoelastic fluid over a stretching sheet
with variable thermal conductivity, non-uniform heat source and radiation, Appl. Math. Modell.
32, 1965-1983, 2008.
[6] Abel, P. G. Siddheshwar and Mahantesh M. Nandeppanava, Heat transfer in a viscoelastic
boundary layer flow over a stretching sheet with viscous dissipation and non-uniform heat source,
Int. J. Heat Mass Trans. 50, 960-966, 2007.
[7] Cortell, R., Viscoelastic fluid flow and heat transfer over a stretching sheet under the effects of
a non-uniform heat source, viscous dissipation and thermal radiation, Int. J. Heat Mass Trans.
50, 3152-3162, 2007.
[8] Abdou, M.M.M., Effect of radiation with temperature dependent viscosity and thermal
conductivity on an unsteady stretching sheet through porous media, Nonlinear Analysis:
Modelling and Control, 15(3), 257-270, 2010.
[9] Pal, D., and S. Chatterjee, Heat and mass transfer in MHD non-Darcian flow of a
micropolar fluid over a stretching sheet embedded in a porous media with non-uniform
heat source and thermal radiation,Commun Nonlinear Sci Numer Simulat, 15(7), 1843-1857,
2010.
[10] Ziabakhsh, Z, Domairry, G. Mozaffari, M. Mahbobifar, M., Analytical solution of heat
transfer over an unsteady stretching permeable surface with prescribed wall temper-
ature, J. Taiwan Inst. Chem. Eng., 41 (2), 169-177, 2010.
[11] Pal, D., and P. S. Hiremath, Computational modeling of heat transfer over an unsteady
stretching surface embedded in a porous medium, Meccanica, 45(3), 415-524, 2009.
[12] Hossain, M. A., M.A. Alim, and D. A. S. Rees, The effect of radiation on free convection
from a porous vertical plate,Int. J. Heat Mass Transfer, 42, 181 - 191, 1999.
[13] Seddeek, M. A.M., Thermal radiation and buoyancy effects on MHD free convection
heat generation flow over an accelerating permeable surface with temperature de-
pendent viscosity, Canadian Journal of Physics, 79(4), 725-732, 2001.

[14] Raptis, A., Flow of a micropolar fluid past a continuously moving plate by the pres-
ence of radiation, Int. J. Heat Mass Transfer, 41, 2865-2866, 1998.
[15] Ishak, A., R. Nazar, and I. Pop, Heat transfer over an unsteady stretching permeable
surface with prescribed wall temperature, Nonlinear Analysis: Real World Applications,
10, 2909-2913, 2009.
[16] Andersson, H. I., J. B. Aarseth, N. Braud, and B. S. Dandapat, Flow of a power-law
fluid film on an unsteady stretching surface, J.Non-Newtonian Fluid Mech. 62, 1-8, 1996.

Stanford Shateyi: University of Venda, Department of Mathematics, P Bag X5050,


Thohoyandou 0950, South Africa.
e-mail: [email protected]

Sandile S Motsa: University of Swaziland, Private Bag 4, Kwaluseni, Swaziland.


e-mail: [email protected]
Proceedings of ”The 6th SEAMS-UGM Conference 2011”
Applied Mathematics, pp. 547 - 558.

LEVEL-SET-LIKE METHOD FOR COMPUTING MULTI-


VALUED SOLUTIONS TO NONLINEAR TWO CHANNELS
DISSIPATION MODEL

SUMARDI, SOEPARNA DARMAWIJAYA, LINA ARYATI,


F.P .H. VAN BECKUM

Abstract. We develop a level set method for computing multi-valued solutions to quasi-linear hyperbolic partial differential equations. Here we apply the method to a nonlinear two-channel dissipation model. The model does not have a level set equation, so we find level set equations that approximate the model. We explain the difference between the level set and level-set-like equations. The multi-valued solutions of the model are approximated as the zeros of a set of scalar functions that solve initial value problems of a time dependent linear partial differential equation in an augmented space.
Keywords and Phrases: Hyperbolic PDEs, Multi-valued solutions, Level set method

1. INTRODUCTION
The phenomenon of wave breaking occurs in many physical systems. Where the physics of one system may dictate that a shock develops after the wave breaking event, the physics of another system may dictate that the formation of multi-valued solutions is appropriate after the wave breaking event. Physical systems where multi-valued solutions may be appropriate include geometric optics (Osher et al. [10]), arrival times in seismic imaging (Formel et al. [5]), nonlinear plasma waves, stellar dynamics and galaxy formation, multi-lane traffic flows, and multi-phase fluids. So we need to compute multi-valued solutions.
There are two classes of methods used to compute multi-valued solutions of nonlinear partial differential equations. The first class consists of Lagrangian methods, which solve a set of ordinary differential equations in order to trace the wavefronts (Benamou [2], [3]). The second class involves Eulerian methods, which solve partial differential equations on a fixed grid. The methods based in physical space often use the classical Liouville equations. For the computation of wavefronts, a Liouville-equation-based phase technique can be used, either with the
______________________________________________
2010 Mathematics Subject Classification :35F06,35L06,65M06

segmentation projection method Engquist et. al. [4], or using level set method Osher et.
al.[10], Jin et. al. [6] [7]. There are nonlinear partial differential that cannot transform in
Liouville equations, which solved by level set method, for example: multidimensional
hyperbolic partial differential equations Liu et. al. [8] [9].
In this paper we develop the level set method to solve system nonlinear partial
differential that has not level set equation. The system is
u t  uu x   (v  u )
(1)
vt  vvx   (u  v).
We could consider them as inviscid Burgers equations, in two parallel channels, with a
dissipative exchange to one another that called the nonlinear two channels dissipation model.
This system was presented by Van Beckum [1], who also derived a travelling wave solution,
i.e. a solution that travels at constant speed and undisturbed in shape. Here we find the level
set equation that approximate the nonlinear two channels dissipation model. The method is
named the level-set-like method.

2. THE LEVEL SET EQUATION FOR MULTI-VALUED SOLUTION

In this section we discuss level set equations for the computation of multi-valued solutions to nonlinear differential equations. We formulate level set equations for a first order ordinary differential equation, a first order partial differential equation and a system of ordinary differential equations. In the last example we give level set equations that approximate a system of ordinary differential equations. We present these in the following lemmas.

Lemma 1. If φ(x, t) is the solution of the initial value problem
∂φ(x,t)/∂t + f(x,t) ∂φ(x,t)/∂x = 0,
φ(x, 0) = x − x_0,   (2)
then φ(x, t) = 0 is the solution of the initial value problem
dx/dt = f(x, t),
x(0) = x_0.   (3)
Proof. Consider the initial value problem
∂φ/∂t + f(x,t) ∂φ/∂x = 0,   φ(x, 0) = x − x_0.
Using the method of characteristics we have
dx(t)/dt = f(x, t),   dφ(x(t), t)/dt = 0,
t = 0:  x = ξ,  φ = ξ − x_0.
The characteristic equations are equivalent to
dx(t)/dt = f(x, t),   t = 0:  x = ξ,  φ = ξ − x_0.
Because φ(x, t) is the solution and φ(x, t) = 0, we have ξ = x_0. Hence x(0) = x_0 and we obtain
dx(t)/dt = f(x, t),   t = 0:  x = x_0.
The proof is complete.
A simpler proof: differentiating φ(x, t) = 0 with respect to t leads to
∂φ(x,t)/∂t + (dx/dt) ∂φ(x,t)/∂x = 0.   (4)
Comparing equation (4) with equation (2) and using the initial condition in (2), then substituting into φ(x, t) = 0, we get (3). □
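Lemma 1 can also be checked numerically: advect φ with a first-order upwind scheme and compare the zero crossing of φ(·, t) with the trajectory of the ODE (3). The following is a minimal sketch under the illustrative choices f(x, t) = sin(x) and x_0 = 1; it is not the authors' code.

import numpy as np

f = lambda x, t: np.sin(x)           # illustrative velocity field
x0, T = 1.0, 1.0

# Grid and initial level set function phi(x, 0) = x - x0.
x = np.linspace(-2.0, 4.0, 1201); dx = x[1] - x[0]
phi = x - x0
dt = 0.4 * dx                        # CFL-limited time step for |f| <= 1
t = 0.0
while t < T:
    h = min(dt, T - t)
    a = f(x, t)
    # First-order upwind differences for phi_t + a phi_x = 0.
    dminus = np.diff(phi, prepend=phi[0]) / dx
    dplus = np.diff(phi, append=phi[-1]) / dx
    phi = phi - h * (np.maximum(a, 0) * dminus + np.minimum(a, 0) * dplus)
    t += h

# Zero crossing of phi(., T) located by linear interpolation.
k = np.where(np.sign(phi[:-1]) * np.sign(phi[1:]) <= 0)[0][0]
x_zero = x[k] - phi[k] * dx / (phi[k + 1] - phi[k])

# Reference: integrate dx/dt = f(x, t) with classical RK4.
xr, n = x0, 1000
for i in range(n):
    h = T / n; tt = i * h
    k1 = f(xr, tt); k2 = f(xr + h/2*k1, tt + h/2)
    k3 = f(xr + h/2*k2, tt + h/2); k4 = f(xr + h*k3, tt + h)
    xr += h/6 * (k1 + 2*k2 + 2*k3 + k4)
print(x_zero, xr)                    # the two values agree to grid accuracy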

Lemma 2. If φ(u, x, t) is the solution of the initial value problem
∂φ/∂t + u ∂φ/∂x − u ∂φ/∂u = 0,
φ(u, x, 0) = u − f(x),   (5)
then φ(u, x, t) = 0 is the solution of the initial value problem
∂u(x,t)/∂t + u(x,t) ∂u(x,t)/∂x = −u(x,t),
u(x, 0) = f(x).   (6)
Proof. Suppose φ(u, x, t) is the solution of the initial value problem
∂φ/∂t + u ∂φ/∂x − u ∂φ/∂u = 0,   φ(u, x, 0) = u − f(x).
Using the method of characteristics we have
dx/dt = u,   du/dt = −u,   dφ/dt = 0,
t = 0:  x = ξ,  u = η,  φ = η − f(ξ).
The solution of the characteristic system is
φ = η − f(ξ),
u = η e^{−t},
x = η(1 − e^{−t}) + ξ.
If φ(u, x, t) = 0, then η = f(ξ) and
u = f(ξ) e^{−t},
x = f(ξ)(1 − e^{−t}) + ξ.
The last two equations are the characteristic solution of the initial value problem
∂u/∂t + u ∂u/∂x = −u,   u(x, 0) = f(x).
The proof is complete. □
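The explicit characteristic formulas in the proof already allow one to visualise a multi-valued solution: for fixed t, the curve (x(ξ), u(ξ)) may fold over in x while remaining a perfectly good zero level set in the (x, u) plane. The following is a minimal sketch with an illustrative steep initial profile f(x); it is not the authors' code.

import numpy as np

f = lambda x: -np.tanh(5.0 * x)          # illustrative initial profile with a steep front
t = 2.0

# Parametrize the zero level set phi(u, x, t) = 0 by the foot point xi,
# using the characteristic solution from the proof of Lemma 2:
#   u = f(xi) e^{-t},   x = xi + f(xi) (1 - e^{-t}).
xi = np.linspace(-3.0, 3.0, 2001)
u = f(xi) * np.exp(-t)
x = xi + f(xi) * (1.0 - np.exp(-t))

# Count how many branches the "solution" u(x) has at a few sample locations
# (expected output: 1, 3, 1 branches for these points).
for xs in (-1.0, 0.1, 1.0):
    crossings = np.sum((x[:-1] - xs) * (x[1:] - xs) < 0)
    print(f"x = {xs:+.1f}: {crossings} branch(es)")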
Lemma 3. If the pair φ_1(x, y, t) and φ_2(x, y, t) is the solution of the initial value problem
∂φ_1/∂t + f(x,y) ∂φ_1/∂x + g(x,y) ∂φ_1/∂y = 0,
∂φ_2/∂t + f(x,y) ∂φ_2/∂x + g(x,y) ∂φ_2/∂y = 0,   (7)
φ_1(x, y, 0) = x − x_0,
φ_2(x, y, 0) = y − y_0,
then the intersection of φ_1(x, y, t) = 0 and φ_2(x, y, t) = 0 is the solution of the initial value problem
dx/dt = f(x, y),
dy/dt = g(x, y),   (8)
x(0) = x_0,
y(0) = y_0.
Proof. Consider the initial value problem
∂φ_1/∂t + f(x,y) ∂φ_1/∂x + g(x,y) ∂φ_1/∂y = 0,
∂φ_2/∂t + f(x,y) ∂φ_2/∂x + g(x,y) ∂φ_2/∂y = 0,
φ_1(x, y, 0) = x − x_0,   φ_2(x, y, 0) = y − y_0.
Using the method of characteristics we have two systems of ordinary differential equations:
dx/dt = f(x, y),   dy/dt = g(x, y),   dφ_1/dt = 0,
t = 0:  x = ξ_1,  y = μ_1,  φ_1 = ξ_1 − x_0;
dx/dt = f(x, y),   dy/dt = g(x, y),   dφ_2/dt = 0,
t = 0:  x = ξ_2,  y = μ_2,  φ_2 = μ_2 − y_0.
These characteristics are equivalent to
dx/dt = f(x, y),   dy/dt = g(x, y),   t = 0:  x = ξ_1,  y = μ_1,  φ_1 = ξ_1 − x_0;
dx/dt = f(x, y),   dy/dt = g(x, y),   t = 0:  x = ξ_2,  y = μ_2,  φ_2 = μ_2 − y_0.
Because φ_1(x, y, t) = 0 and φ_2(x, y, t) = 0, we have
dx/dt = f(x, y),   dy/dt = g(x, y),   t = 0:  x = x_0,  y = μ_1;
dx/dt = f(x, y),   dy/dt = g(x, y),   t = 0:  x = ξ_2,  y = y_0.   (9)
The intersection of the two systems above is
dx/dt = f(x, y),   dy/dt = g(x, y),   x(0) = x_0,   y(0) = y_0,
which is the characteristic system of the initial value problem (8). □

Lemma 3 describes the exact level set equations; in the following Lemma 4 we give an approximation of the level set equations for the initial value problem (8).

Lemma 4. If the pair φ_1(x, y, t) and φ_2(x, y, t) is the solution of the initial value problem
∂φ_1/∂t + f(x,y) ∂φ_1/∂x = 0,
∂φ_2/∂t + g(x,y) ∂φ_2/∂y = 0,   (10)
φ_1(x, y, 0) = x − x_0,
φ_2(x, y, 0) = y − y_0,
then the intersection of φ_1(x, y, t) = 0 and φ_2(x, y, t) = 0 approximates the solution of the initial value problem (8). The order of the error is O(Δt²).
Proof. Consider the initial value problem
∂φ_1/∂t + f(x,y) ∂φ_1/∂x = 0,
∂φ_2/∂t + g(x,y) ∂φ_2/∂y = 0,
φ_1(x, y, 0) = x − x_0,   φ_2(x, y, 0) = y − y_0.
Using the method of characteristics and applying φ_1(x, y, t) = 0 and φ_2(x, y, t) = 0 yields
dx/dt = f(x, μ_1),   t = 0:  x = x_0,
dy/dt = g(ξ_2, y),   t = 0:  y = y_0.
Suppose the solution of the system above at Δt is (x̄, ȳ) and the exact solution of system (8) at Δt is (x_1, y_1). The error can be computed using a Taylor series as follows:
x_1 − x̄ = [ x(0) + f(x_0, y_0)Δt + O(Δt²) ] − [ x(0) + f(x_0, ȳ)Δt + O(Δt²) ]
        = [ f(x_0, y_0) − f(x_0, ȳ) ] Δt + O(Δt²).
Applying the mean value theorem, we obtain
f(x_0, y_0) − f(x_0, ȳ) = (∂f(x_0, y*)/∂y)(y_0 − ȳ), with y* between y_0 and ȳ,
                        = −(∂f(x_0, y*)/∂y) ẏ(θ) Δt,   0 < θ < Δt.
So we have x_1 − x̄ = O(Δt²). Similarly we can prove the result for the variable y. □

3. LEVEL-SET-LIKE-METHOD

In this section we state a theorem on the level-set-like method for the nonlinear two-channel dissipation model without boundary conditions, using the idea of Lemma 4. We then obtain a corollary of the level-set-like method for the boundary value problem of the nonlinear two-channel dissipation model.

Theorem 1. If a pair φ₁(x, t, u, v) and φ₂(x, t, u, v) is the solution of the initial value problem

 ∂φ₁/∂t + u ∂φ₁/∂x + ε(v − u) ∂φ₁/∂u = 0
 ∂φ₂/∂t + v ∂φ₂/∂x + ε(u − v) ∂φ₂/∂v = 0          (11)
 φ₁(x, t₀, u, v) = u − f(x)
 φ₂(x, t₀, u, v) = v − g(x)

then the intersection of φ₁(x, t₀+Δt, u, v) = 0 and φ₂(x, t₀+Δt, u, v) = 0 approximates the solution of the initial value problem

 u_t + u u_x = ε(v − u)
 v_t + v v_x = ε(u − v)          (12)
 u(x, t₀) = f(x)
 v(x, t₀) = g(x)

at t₀ + Δt with error of order O(Δt²).

Proof: The first equation of (11) has the characteristic equations

 dx/dt = u,         x(t₀) = ξ₁,
 du/dt = ε(v − u),  u(t₀) = η₁,
 dv/dt = 0,         v(t₀) = ω₁,
 dφ₁/dt = 0,        φ₁(t₀) = η₁ − f(ξ₁),

and using the zero level φ₁(x, t, u, v) = 0 we have the solution

 v = ω₁
 u = ω₁ + (f(ξ₁) − ω₁) e^{−ε(t−t₀)}          (13)
 x = ω₁(t − t₀) − ((f(ξ₁) − ω₁)/ε) e^{−ε(t−t₀)} + (f(ξ₁) − ω₁)/ε + ξ₁.

Similarly, for the second equation of (11) we have the solution

 u = ω₂
 v = ω₂ + (g(ξ₂) − ω₂) e^{−ε(t−t₀)}          (14)
 x = ω₂(t − t₀) − ((g(ξ₂) − ω₂)/ε) e^{−ε(t−t₀)} + (g(ξ₂) − ω₂)/ε + ξ₂.

The intersection of equations (13) and (14) for (u, v) is

 u = [f(ξ₁) + g(ξ₂) − g(ξ₂) e^{−ε(t−t₀)}] / (2 − e^{−ε(t−t₀)})
 v = [g(ξ₂) + f(ξ₁) − f(ξ₁) e^{−ε(t−t₀)}] / (2 − e^{−ε(t−t₀)}).
Suppose (x̄, ū, v̄) approximates the exact solution (x, u, v) of the initial value problem (12) at t₀ + Δt, so we have

 ū = [f(ξ₁) + g(ξ₂) − g(ξ₂) e^{−εΔt}] / (2 − e^{−εΔt})
 v̄ = [g(ξ₂) + f(ξ₁) − f(ξ₁) e^{−εΔt}] / (2 − e^{−εΔt}).          (15)

We have the characteristic form of the initial value problem (12) as follows:

 du/dt = ε(v − u) on dx/dt = u with initial condition t = t₀, x = ξ₁, u = f(ξ₁),          (16)
 dv/dt = ε(u − v) on dx/dt = v with initial condition t = t₀, x = ξ₂, v = g(ξ₂).          (17)

The error in u at t₀ + Δt on the characteristic dx/dt = u can be computed using a Taylor series as follows:
 ū(t₀+Δt) − u(t₀+Δt) = [ū(t₀) + (dū(t₀)/dt)Δt + (d²ū(t₀)/dt²)(Δt²/2) + O(Δt³)]
                      − [u(t₀) + (du(t₀)/dt)Δt + (d²u(t₀)/dt²)(Δt²/2) + O(Δt³)]
                     = [f(ξ₁) + ε(g(ξ₂) − f(ξ₁))Δt + (d²ū(t₀)/dt²)(Δt²/2) + O(Δt³)]
                      − [f(ξ₁) + ε(g(ξ₁) − f(ξ₁))Δt + (d²u(t₀)/dt²)(Δt²/2) + O(Δt³)]
                     = ε(g(ξ₂) − g(ξ₁))Δt + [d²ū(t₀)/dt² − d²u(t₀)/dt²](Δt²/2) + O(Δt³).

It is clear that if g is a constant function, then the error is O(Δt²). Since ξ₂ − ξ₁ = O(Δt), we can write g(ξ₂) − g(ξ₁) = g′(ζ)(ξ₂ − ξ₁) = O(Δt), ξ₁ < ζ < ξ₂. So we have

 ū(t₀+Δt) − u(t₀+Δt) = ε(g(ξ₂) − g(ξ₁))Δt + [d²ū(t₀)/dt² − d²u(t₀)/dt²](Δt²/2) + O(Δt³)          (18)
                     = O(Δt²).
Similarly, we can prove that v̄(t₀+Δt) − v(t₀+Δt) = O(Δt²).
The error in x at t₀ + Δt on the characteristic dx/dt = u can be computed using a Taylor series as follows:

 x̄(t₀+Δt) − x(t₀+Δt) = [x̄(t₀) + (dx̄(t₀)/dt)Δt + (d²x̄(t₀)/dt²)(Δt²/2) + O(Δt³)]
                      − [x(t₀) + (dx(t₀)/dt)Δt + (d²x(t₀)/dt²)(Δt²/2) + O(Δt³)]
                     = [ū(t₀)Δt + (d²x̄(t₀)/dt²)(Δt²/2) + O(Δt³)]
                      − [u(t₀)Δt + (d²x(t₀)/dt²)(Δt²/2) + O(Δt³)]
                     = [d²x̄(t₀)/dt² − d²x(t₀)/dt²](Δt²/2) + O(Δt³)
                     = O(Δt²).
The proof is complete. 
This theorem assumes initial values that are single-valued functions. We can also apply it when the initial values of (12) are multivalued functions; for example, the values of the function f at x = x₀ are f₁, f₂, …, f_m, which is denoted
 f(x₀) = {f₁, f₂, …, f_m}.
Then the level set equation has the initial value
 φ₁(x₀, t₀, u, v) = u − f_i  if |u − f_i| = min{ |u − f_j| : j = 1, 2, …, m }.
We can generalize Theorem 1 to approximate the values obtained on a set of grid times
 t₀ < t₁ < t₂ < ⋯ < t_n < ⋯.
The approximate value at each t_n is obtained by using some of the values obtained in the previous step. We call this method the level-set-like method, because the idea is the same as in the level set method, but the system does not have an exact level set equation. The method has local errors of the form
 ū(t_{n+1}) − u(t_{n+1}) = O(Δt²)
 v̄(t_{n+1}) − v(t_{n+1}) = O(Δt²)          (19)
 x̄(t_{n+1}) − x(t_{n+1}) = O(Δt²)
where (x̄, ū, v̄) is the approximate solution and (x, u, v) is the exact solution. The global error is still difficult to find.

Corollary: If a pair φ₁(x, u, v) and φ₂(x, u, v) is the solution of the steady state of the initial value problem (8), then the intersection of φ₁(x, u, v) = 0 and φ₂(x, u, v) = 0 is the solution of the steady state of the initial value problem (9).


Proof: The steady state solution of the initial value problem (8) is the solution of

 u ∂φ₁/∂x + ε(v − u) ∂φ₁/∂u = 0
 v ∂φ₂/∂x + ε(u − v) ∂φ₂/∂v = 0.          (17)

For u and v not zero, equation (17) can be written as

 ∂φ₁/∂x + (ε(v − u)/u) ∂φ₁/∂u = 0
 ∂φ₂/∂x + (ε(u − v)/v) ∂φ₂/∂v = 0.          (18)

Differentiating φ₁(x, u, v) and φ₂(x, u, v) with respect to x and comparing with equation (18), we obtain the steady state of the initial value problem (9).

Corollary: Theorem 1, given for problems on the whole x-axis, is also valid for problems
with boundary conditions.

4. CONCLUDING REMARK

The level set method changes nonlinear differential equations into linear differential equations in a higher dimension. The level-set-like method, in contrast, approximates the solution of nonlinear differential equations by linear differential equations in higher dimensions. The multivalued solutions of the two-channel dissipation model are computed as the zeros of the level set equations.

References

[1] VAN BECKUM F.P.H., Travelling wave solution of a coastal zone non-Fourier dissipation model, Proceedings of
the Symposium on Coastal Zone Management, 2003.
[2] BENAMOU J,-D., Big ray tracing: Multi-valued travel time field computation using viscosity solution of the
eikonal equation, J. Comp. Phys. 128, 463-474, 1996.
[3] BENAMOU J.-D., Direct computation of multivalued phase space solutions for Hamilton-Jacobi Equations,
Commun. Pure. Appl. Math. 52(11), 1443-1475, 1999.
[5] ENGQUIST B., RUNBORG O. AND TORBERG K, High frequency wave propagation by the segment projection, J.
Comp. Phys. 178, 373-390, 2002
[5] FORMEL S. AND SETHIAN J.A., Fast phase space computation of multiple arrivals, Proc. Natl. Acad. Sci.,
99(11) , 7329-7334, 2002
[6] JIN S., AND OSHER S., A level set method for the computation multivalued solution to quasilinear hyperbolic
PDE’s and Hamilton-Jacobi equations, Comm. Math. Sci. 1(3) 575-591, 2003
[7] JIN S., LIU H., OSHER S. AND TSAI R., Computing multi-valued physical observables for the high frequency
limit of symmetric hyperbolic systems, J. of Comp. Physics, 2005
[8] LIU H. L, AND WANG Z. M., Computing multivalued velocity and electrical fields for 1D Euler-Poisson, App.
Num. Math, 2005
[9] LIU H., CHENG L.T. AND OSHER S, A level set framework for capturing multi-valued solutions of nonlinear
first-order equations, J. Sci. Comput,2005
[10] OSHER S., CHENG L.T., KANG M., SHIM H. AND TSAI Y. H., Geometric Optics in a phase space based
level set and Eulerian framework, J. Comp. Phys., 79, 622-648, 2002

SUMARDI
Department of Mathematics, Gadjah Mada University, Indonesia
e-mails: [email protected], [email protected]

SOEPARNA DARMAWIJAYA
Gadjah Mada University, Indonesia
E-mails:

LINA ARYATI
Department of Mathematics, Gadjah Mada University, Indonesia
e-mail: [email protected]

F. P. H. VAN BECKUM:
e-mail: [email protected]
Proceedings of ”The 6th SEAMS-UGM Conference 2011”
Applied Mathematics, pp. 559 - 564.

NONHOMOGENEOUS ABSTRACT DEGENERATE


CAUCHY PROBLEM: THE BOUNDED OPERATOR ON
THE NONHOMOGEN TERM

SUSILO HARIYANTO, LINA ARYATI, WIDODO

Abstract. We discuss how to solve a nonhomogeneous abstract degenerate Cauchy problem with a bounded linear operator B on the nonhomogen term. The problem is formulated in a Hilbert space which can be written as an orthogonal direct sum of Ker M and Ran M*. Under certain assumptions, we can reduce the problem to a nondegenerate problem. Finally, we apply this method to a descriptor system.

Keywords and Phrases : Degenerate Cauchy problem, descriptor system.

1. INTRODUCTION

In this paper, we investigate how to solve the nonhomogeneous abstract degenerate Cauchy problem with a bounded operator on the nonhomogen term:

 (d/dt) M z(t) = A z(t) + B f(t),          (1)

where M and A are linear operators between the Hilbert spaces H and W. Here M is not invertible. The operator B is a bounded linear operator between the Hilbert spaces U and W. If the operator B is the identity operator, then we have the nonhomogeneous abstract degenerate Cauchy problem:

 (d/dt) M z(t) = A z(t) + f(t),  z(0) = z₀.          (2)

2010 Mathematics Subject Classification: 47A68

559
560 S. HARIYANTO, L. ARYATI, WIDODO

In the finite dimensional case, the nonhomogeneous abstract degenerate Cauchy problem (2) is completely understood in the book [2] by Dai, where the matrices M and A can be transformed to a common normal form. Many examples, applications to control theory, and references to the earlier literature can also be found in his book.
In the infinite dimensional case, Carrol et al. [1] treat singular and degenerate Cauchy problems. In [3, 4], Favini investigates degenerate Cauchy problems in Hilbert spaces. In his articles, the problem is also treated under the assumption that the Hilbert space of the system can be written as a direct sum of the kernel of M (Ker M) and the range of the adjoint of M (Ran M*).
B. Thaller et al. [5] emphasize the possibility of factorization and the relation of the factorized problem with the original degenerate system, without assuming parabolicity. Making use of the decomposition of the Hilbert space into a direct sum of Ker M and Ran M* (resp. Ker M* and Ran M), they formulate conditions which allow us to obtain an equivalent but nondegenerate Cauchy problem in the factor space H/Ker M and give the explicit form of the generators of the factorized problem. The crucial assumption is that the restriction of A to a mapping from Ker M to Ker M* is well defined and invertible. This allows us to define a factorization operator Z_A which maps the solutions of the factorized system to solutions of the original degenerate system. Following Thaller [5], we will derive a method to solve problem (1) and then use it to solve a descriptor system.

2.THE SOLUTION OF NONHOMOGENEOUS ABSTRACT DEGENERATE


CAUCHY PROBLEM

In this section, we discuss a method to solve problem (1). In this paper, D(T) denotes the domain of an operator T. First we make the following assumptions.

Assumption 2.1: The operators M: D(M) ⊂ H → W and A: D(A) ⊂ H → W are closed, densely defined linear operators.

By Assumption 2.1, the Hilbert space H can be written as
 H = Ker M ⊕ (Ker M)^⊥.
Since the adjoint M* of a closed operator is again densely defined in H, we have the relations
 PH = Ker M and P^⊥H = (1 − P)H = Ran M*,
where P is the orthogonal projection onto Ker M. For the orthogonal projection Q onto Ker M* in the Hilbert space W, we also have the relations
 QW = Ker M* and Q^⊥W = (1 − Q)W = Ran M.

A function z: [0, ∞) → H is a strict solution of the nonhomogeneous abstract degenerate Cauchy problem with the bounded operator on the nonhomogen term if z(t) ∈ D(A) ∩ D(M) for all t ≥ 0, Mz(t) is continuously differentiable, and (1) holds.

Assumption 2.2: The operator B: D(B) ⊂ U → W is a bounded linear operator with Ran B ⊂ Ran M.
Non h om ogen eou s Ab s t ra c t Degen era t e C a u c h y Prob lem . . . . 561

If D_{A,B} = { z(t) ∈ D(A) | Az(t) + Bf(t) ∈ Ran M }, then D_{A,B} is a subspace of H and any strict solution z(t) of the degenerate Cauchy problem (1) is clearly in D_{A,B}. Obviously we have Ker A ⊂ D_{A,B}, and the operator A|_{D_{A,B}} is closed.
The operator M is injective if and only if Ker M = {0} (see B. Thaller et al. [5]). In order to transform problem (1) to a nondegenerate problem we restrict the domain D(M) of the operator M to (Ker M)^⊥ ∩ D(M) = D(M_r). Therefore the restriction of the operator M to D(M_r), defined by
 M_r = M|_{D(M_r)}, where D(M_r) = (Ker M)^⊥ ∩ D(M),
is an invertible operator.
Next, we define an operator A₀ which is the restriction of A to (Ker M)^⊥:
 A₀ x(t) = A[ (P^⊥)^{-1}{x(t)} ∩ D_{A,B} ] ⊂ Ran M, for every x(t) ∈ D(A₀),
where D(A₀) = { x(t) ∈ (Ker M)^⊥ | (P^⊥)^{-1}{x(t)} ∩ D_{A,B} ≠ ∅ }.
The operator A, however, may become a multivalued operator A₀ on (Ker M)^⊥. So we need the following assumption.

Assumption 2.3: PD_{A,B} ⊂ D_{A,B} and the operator (QAP)|_{PD_{A,B}} has a bounded inverse.

Under Assumptions 2.1, 2.2, and 2.3, a vector z(t) ∈ H is in the subspace D_{A,B} if and only if z(t) ∈ D(A) and Pz(t) = −(QAP)^{-1}QAP^⊥z(t). Therefore any x(t) ∈ P^⊥D_{A,B} ⊂ (Ker M)^⊥ uniquely determines z(t) ∈ D_{A,B} such that x(t) = P^⊥z(t) and z(t) = (1 − (QAP)^{-1}QA)x(t). Hence the set (P^⊥)^{-1}x(t) ∩ D_{A,B} contains precisely one element. We then define the operator Z_A by
 Z_A = P^⊥ − (QAP)^{-1}QAP^⊥.
This operator is defined on D(Z_A) = P^⊥D_{A,B}. The restriction Z_A|_{P^⊥D_{A,B}}, given by 1 − (QAP)^{-1}QA on P^⊥D_{A,B}, is the inverse of the projection P^⊥|_{D_{A,B}}:
 Z_A P^⊥ = 1 on D_{A,B} and P^⊥ Z_A = 1 on P^⊥D_{A,B}.
Hence, the operator A₀ can be determined by
 A₀ = A Z_A on D(A₀) = P^⊥D_{A,B}.
For every z(t) ∈ D_{A,B} we can write A₀x(t) = Az(t), where x(t) = P^⊥z(t). Since Az(t) = Q^⊥Az(t) for all z(t) ∈ D_{A,B}, the operator A₀ can be written in the more symmetric form
 A₀ = Q^⊥AP^⊥ − Q^⊥AP(QAP)^{-1}QAP^⊥.          (3)
1
To factorize A0 , we define an operator YA  Q  Q AP(Q A )P Q . Therefore
T T
562 S. HARIYANTO, L. ARYATI, WIDODO

YA AP  0 , then YA APT  YA A and A0  YA A on D( A0 )  P T DA, B .

Assumption 2.4: The operator A has a bounded inverse.

This assumption implies that A|_{D_{A,B}} has a bounded inverse, and
 A|_{D_{A,B}}: D_{A,B} → Q^⊥W;  (A|_{D_{A,B}})^{-1}: Q^⊥W → D_{A,B}.
Hence, the operator A₀^{-1} = (AZ_A)^{-1} = P^⊥A^{-1}|_{Q^⊥W} is bounded and densely defined on Q^⊥W. By Assumptions 2.2, 2.3 and 2.4, we have
 Bf(t) = (Q^⊥ − Q^⊥AP(QAP)^{-1}Q)Bf(t) = Y_A Bf(t), for every f(t) ∈ D(B).          (4)

Finally, by Assumptions 2.1, 2.2, 2.3 and 2.4, the degenerate Cauchy problem (1) can be reduced to the nondegenerate problem
 (d/dt) M_r x(t) = A₀ x(t) + Y_A B f(t),  x(0) = P^⊥ z₀,          (5)
where M_r is invertible.

Assumption 2.5: Let D_{A,B} ⊂ D(M) and let the operator M have a closed domain.

Remark 2.6: If M is closed and densely defined (Assumption 2.1), by using the closed graph theorem we have the equivalent formulation: M_r is bounded and defined on all of P^⊥H.

By Assumptions 2.1, 2.2, 2.3, 2.4 and 2.5, the nonhomogeneous abstract degenerate Cauchy problem with the bounded operator on the nonhomogen term can be written in the normal form
 (d/dt) x(t) = A₁ x(t) + (M_r)^{-1} Y_A B f(t), where A₁ = (M_r)^{-1} A₀.          (6)
The operator A₁ = (M_r)^{-1}A₀ on the natural domain
 D(A₁) = { x ∈ P^⊥D_A | A₀x ∈ Ran M } = A₀^{-1} Ran M ∩ P^⊥D_A
is closed, because it is the product of the boundedly invertible operator (M_r)^{-1} with the closed operator A₀. It is also densely defined in the Hilbert space H₀, the closure of P^⊥D_{A,B}.

Assumption 2.7: A1 generates a strongly continuous semigroup in H 0 .


By Assumptions 2.1, 2.2, 2.3, 2.4, 2.5 and 2.7, the solution of (6), which is also the solution of the nonhomogeneous abstract nondegenerate Cauchy problem with bounded operator on the nonhomogen term, is
 x(t) = e^{A₁t} P^⊥ z₀ + A₁ ∫₀ᵗ e^{A₁(t−s)} g(s) ds, where g(t) = P^⊥ A^{-1} B f(t).          (7)

Moreover, a necessary condition for z(t) to be a solution of (1) is now given by
 z(t) = Z_A x(t) − (QAP)^{-1} Q B f(t), for all t ≥ 0.          (8)

Theorem 2.8: Let Bf(t) be a continuously differentiable function with values in Ran M. Under Assumptions 2.1, 2.2, 2.3, 2.4, 2.5 and 2.7, the degenerate Cauchy problem (1) has, for each initial value z₀ ∈ D_{A,B}, a unique strict solution
 z(t) = Z_A x(t) − (QAP)^{-1} Q B f(t),
where x(t) = e^{A₁t} P^⊥ z₀ + A₁ ∫₀ᵗ e^{A₁(t−s)} g(s) ds and g(t) = P^⊥ A^{-1} B f(t).
Proof: We know that x(t) = e^{A₁t} P^⊥ z₀ + A₁ ∫₀ᵗ e^{A₁(t−s)} g(s) ds, where g(t) = P^⊥A^{-1}Bf(t), is the solution of the nonhomogeneous abstract nondegenerate Cauchy problem with bounded operator on the nonhomogen term (5). By Assumption 2.7, the function t → x(t) is continuously differentiable and lies in D(A₁) for all t ≥ 0. Therefore z(t) = Z_A x(t) is well defined. Using the continuity of M_r we find that Mz(t) = M_r x(t) is continuously differentiable with
 (d/dt) M z(t) = (d/dt) M_r x(t) = M_r (d/dt) x(t) = M_r [ A₁ x(t) + (M_r)^{-1} Y_A B f(t) ]
              = A₀ x(t) + Y_A B f(t) = A z(t) + B f(t).
Hence z(t) is a strict solution of the degenerate Cauchy problem (1).
Having discussed how to solve the nonhomogeneous abstract degenerate Cauchy problem with bounded operator on the nonhomogen term, we now apply this method to solve a descriptor system in the following example.

Example 2.9: We will solve the following descriptor system:
 [1 0; 0 0] ż = [−1 0; 0 −1] z + [1; 0] f,  y = [0 1] z,  where z = [z₁; z₂].
If we let M = [1 0; 0 0], A = [−1 0; 0 −1], B = [1; 0], and z₀ ∈ D_{A,B}, then the descriptor system can be expressed as a nonhomogeneous abstract degenerate Cauchy problem with the bounded operator on the nonhomogen term:
 (d/dt) M z(t) = A z(t) + B f(t),  z(0) = z₀.
In this case, we can define the orthogonal projection operators
 P = Q = [0 0; 0 1] and P^⊥ = Q^⊥ = [1 0; 0 0].
Under Assumptions 2.2, 2.3, 2.4, 2.5 and 2.7, we can reduce the problem to
 (d/dt) M_r x(t) = A₀ x(t) + Y_A B f(t),  x(0) = P^⊥ z₀ on D(A₀) = P^⊥ D_{A,B},
where M_r = M P^⊥ = [1 0; 0 0];  A₀ = Y_A A = A P^⊥ = [−1 0; 0 0];  and Y_A B = [1; 0].
By restriction to D(A₀) = P^⊥ D_{A,B} the problem can be expressed as the nondegenerate problem
 [1 0; 0 0] ẋ = [−1 0; 0 0] x + [1; 0] f.
According to (6) we have
 A₁ = [−1 0; 0 0].
So the solution of the descriptor system above on P^⊥ D_{A,B} = (Ker M)^⊥ ∩ D_{A,B} is
 x(t) = e^{A₁t} P^⊥ z₀ + ∫₀ᵗ e^{A₁(t−s)} A₁ g(s) ds = e^{A₁t} P^⊥ z₀ + A₁ ∫₀ᵗ e^{A₁(t−s)} g(s) ds,
where g(t) = P^⊥ A^{−1} B f. According to Theorem 2.8, any solution of the nondegenerate problem can be mapped to a solution of the degenerate problem by using the particular operator Z_A. Hence, the solution of the descriptor system is y = [0 1] z(t), where
 z(t) = Z_A ( e^{A₁t} P^⊥ z₀ + A₁ ∫₀ᵗ e^{A₁(t−s)} g(s) ds ) and Z_A = [1 0; 0 0].

3. CONCLUDING REMARK

There are several stages in solving the nonhomogeneous degenerate Cauchy problem with a bounded linear operator B on the nonhomogen term. First, under certain assumptions we reduce the original problem to a nondegenerate Cauchy problem. Second, we transform the nondegenerate problem to normal form and then solve it; the solution of the normal form is the solution of the nondegenerate problem. Third, the solution of the nondegenerate problem is mapped to the solution of the nonhomogeneous degenerate Cauchy problem by the operator Z_A. In this way we obtain the solution of the nonhomogeneous degenerate Cauchy problem with a bounded linear operator B on the nonhomogen term.

References

[1] CARROL, R.W. AND SHOWALTER, R.E., Singular and Degenerate Cauchy Problems:Math. Sci.
Engrg., vol 127, Academic Press, New York-San Fransisco-London, 1976.
[2] DAI, L. A., A Singular Control Systems, Lecture Notes in Control and Inform, Sci., vol.118, Springer-
Verlag, Berlin-Heidelberg-New York, 1989.
[3] FAVINI, A., Laplace Tranform Method for a Class of Degenerate Evolution Problems, Rend. Mat. Appl.
12(2): 511-536.

[4] FAVINI, A. 1981. Abstract Potential Operator and Spectral Method for a Class of Degenerate Evolution
Problems. J. Differential Equations. 39: 212-225.
[5] THALLER, B. AND THALLER, S., Factorization of Degenerate Cauchy Problems : The Linear Case. J.
Operator Theory. 36:121-146, 1996.
[6] THALLER, B. AND THALLER, S., Approximation of Degenerate Cauchy Problems. SFB F0003
”Optimierung und Kontrolle” 76. University of Graz.
[7] WEIDMAN, J., Linear Operators in Hilbert Spaces, Springer-Verlag, Berlin-Heidelberg- New York,
1980.
[8] ZEIDLER, E., Nonlinear Functional Analysis and Its Applications II/A. Springer-Verlag, Berlin-
Heidelberg- New York, 1990.

SUSILO HARIYANTO
Department of Mathematics, Faculty of Mathematics and Natural Sciences, Diponegoro
University, Semarang, Indonesia.
e-mail: [email protected]

LINA ARYATI
Department of Mathematics, Faculty of Mathematics and Natural Sciences, Gadjah Mada
University, .Yogyakarta, Indonesia.
e-mail:[email protected]

WIDODO
Department of Mathematics, Faculty of Mathematics and Natural Sciences, Gadjah Mada
University, Yogyakarta, Indonesia.
e-mail: [email protected]
Proceedings of ”The 6th SEAMS-GMU Conference 2011”
Applied Mathematics, pp. 567 - 578.

STABILITY ANALYSIS AND OPTIMAL HARVESTING OF


PREDATOR-PREY POPULATION MODEL WITH TIME
DELAY AND CONSTANT EFFORT OF HARVESTING

SYAMSUDDIN TOAHA

Abstract. This paper studies the effect of time delay and harvesting on the dynamics of the
predator-prey which is based on the Lotka-Volterra model. The time delay is incorporated in the
growth rate of the prey equation. In the delayed predator-prey model, the predator and prey are
then harvested with constant efforts. It is shown that the time delay can induce instability, Hopf
bifurcation and stability switches. The constant efforts do not affect the stability of the stable
equilibrium point when the positive equilibrium point exists and is stable. For the model with
constant effort, we found that there exists a critical value of the efforts for a certain value of time
delay that maximizes the profit function and present value. This means that the predator and prey
populations can live in coexistence and also give maximum profit and present value.

Keywords and Phrases: Predator-prey, time delay, stability, optimal harvesting.

1. INTRODUCTION

Predator-prey population model based on Lotka-Volterra model is one of the most


popular models in mathematical ecology. Luckinbill [7] has considered a predator-prey
population model and the result showed that the prey and predator can coexist by reducing
the frequency of contact between them.
Kar and Chaudhuri [6] have studied the predator-prey model based on Lotka-
Volterra model with harvesting. They discussed about the possibility of existence of
bionomic equilibrium and optimal harvesting. The effect of constant effort of harvesting
has been studied by Holmberg [5] and the results showed that the constant catch quota can
lead to both oscillations and chaos and an increased risk for over exploitation.
The predator-prey models play a crucial role in bioeconomics, that is the
management of renewable resources. The management of renewable resources has been
based on the maximum sustainable yields (MSY). The MSY is a simple way to manage
the resources. According to Clark [2], the MSY level has been found to be situated

2010 Mathematics Subject Classification: 92B, 34C23, 49J15

567

between 40% and 60%. The main problem with the MSY is that it is economically irrelevant, because it considers only the benefits of exploitation but disregards the operating cost of resource exploitation. Confronted with the inadequacy of the MSY, people tried to replace it by the optimum sustainable yield (OSY), or maximum profit.
In this paper we present a deterministic and continuous model for a predator-prey population based on the Lotka-Volterra model and then extend the model by incorporating time delay and harvesting. The same models were considered in [10], which discussed the effect of time delay on the stability of the equilibrium point related to the maximum profit problem. The objective of this paper is to study the combined effects of harvesting and time delay on the dynamics of the predator-prey model. Besides that, for the model with constant effort of harvesting we relate the stable equilibrium point to the maximum profit and to the present value of a continuous time-stream of revenue by using Pontryagin's maximum principle.

2. PREDATOR –PREY MODEL WITH TIME DELAY

We consider a predator-prey model based on the Lotka-Volterra model with one predator and one prey population. The model for the rate of change of the prey population (x) and the predator population (y) is
 dx/dt = rx(1 − x/K) − αxy
 dy/dt = −cy + βxy.          (1)
The model includes the parameter K, the carrying capacity of the prey population in the absence of the predator. The parameter r is the intrinsic growth rate of the prey, c is the mortality rate of the predator in the absence of prey, α measures the rate of consumption of prey by the predator, and β measures the conversion of consumed prey into the predator reproduction rate. All parameters are assumed to be positive.
The equilibrium points of model (1) are (0, 0), (K, 0) and E₀ = (c/β, r(βK − c)/(αβK)). In order to get a positive equilibrium point we assume that βK − c > 0. The characteristic equation of the Jacobian matrix evaluated at the equilibrium point E₀ is
 f(λ) = λ² + (cr/(βK))λ + (cr/(βK))(βK − c),
whose eigenvalues have negative real parts. This means that the equilibrium point E₀ is locally asymptotically stable. Furthermore, since βK − c > 0, the equilibrium point E₀ is also globally asymptotically stable; see Ho and Ou [4].
Starting from Hutchinson's delayed logistic model, May [8] proposed the following system
 dx(t)/dt = rx(t)(1 − x(t−τ)/K) − αx(t)y(t)
 dy(t)/dt = −cy(t) + βx(t)y(t).          (2)
Model (2) contains a single discrete delay. The term (1 − x(t−τ)/K) in model (2) denotes a density-dependent feedback mechanism which takes τ units of time to respond to changes in the population density. If we regard τ as the gestation period of the prey, then the per capita growth rate function should carry a time delay τ.

3. MODEL WITH TIME DELAY AND CONSTANT EFFORTS

We consider model (2) where the two populations are subjected to constant harvesting efforts. The model with harvesting is as follows:
 dx(t)/dt = rx(t)(1 − x(t−τ)/K) − αx(t)y(t) − q_x E_x x(t)
 dy(t)/dt = −cy(t) + βx(t)y(t) − q_y E_y y(t).          (3)
Here q_x and q_y are the catchability coefficients of the prey and predator populations, respectively. The constants E_x and E_y are the harvesting efforts for the prey and predator populations. For the analysis, we set q_x = q_y = 1. Model (3) can be rewritten as
 dx(t)/dt = r₁x(t)(1 − x(t−τ)/K₁) − αx(t)y(t)
 dy(t)/dt = −c₁y(t) + βx(t)y(t),          (4)
where r₁ = r − E_x, K₁ = (r − E_x)K/r, and c₁ = c + E_y.
The equilibrium point of model (4) is E₁ = (x₁*, y₁*) = (c₁/β, r₁(βK₁ − c₁)/(αβK₁)). In order to have a positive equilibrium point we assume that r > E_x, βK − c > 0, and βK₁ − c₁ > 0, or equivalently (E_x, E_y) ∈ Ω, where
 Ω = { (E_x, E_y) | (βK/r)E_x + E_y < βK − c, E_x ≥ 0, E_y ≥ 0 }.
To linearize the model about the equilibrium point E₁ of model (4), let u(t) = x(t) − x₁* and v(t) = y(t) − y₁*. We then obtain the linearized model
 u̇(t) = −(r₁/K₁)x₁* u(t−τ) − αx₁* v(t)
 v̇(t) = βy₁* u(t).
From the linearized model we have the characteristic equation
 λ² + P₁λe^{−λτ} + Q₁ = 0,          (5)
where P₁ = (r₁/K₁)x₁* and Q₁ = αβx₁*y₁*.
K1
For τ = 0, the characteristic equation (5) becomes
 λ² + P₁λ + Q₁ = 0,          (6)
which has the roots λ₁,₂ = (−P₁ ± √(P₁² − 4Q₁))/2. Since P₁ and Q₁ are both positive, the characteristic equation has roots with negative real parts. Hence, for τ = 0 and (E_x, E_y) ∈ Ω, the equilibrium point E₁ is globally asymptotically stable.
Now for τ > 0, if λ = iω, ω > 0, is a root of the characteristic equation (5), then we have
 −ω² + iP₁ω e^{−iωτ} + Q₁ = 0,
 −ω² + iP₁ω cos(ωτ) + P₁ω sin(ωτ) + Q₁ = 0.
Separating the real and imaginary parts, we have
 −ω² + P₁ω sin(ωτ) + Q₁ = 0,
 P₁ω cos(ωτ) = 0.          (7)
Squaring both sides of the equations in (7) gives
 P₁²ω² sin²(ωτ) = ω⁴ − 2Q₁ω² + Q₁²,
 P₁²ω² cos²(ωτ) = 0.
Adding both equations and regrouping by powers of ω, we obtain the fourth degree polynomial
 ω⁴ − (P₁² + 2Q₁)ω² + Q₁² = 0,          (8)
from which we have
 ω±² = ½ ( P₁² + 2Q₁ ± √(P₁⁴ + 4P₁²Q₁) ).          (9)
From (9) we can see that there are two positive solutions ω±² for ω². We can now find the corresponding delay values by substituting ω±² into equation (7) and solving for τ. We obtain
 τ_k = (π/2 + 2kπ)/ω₊,  σ_k = (3π/2 + 2kπ)/ω₋,  k = 0, 1, 2, ….          (10)

 
Theorem 1. Let K  c  0 , E x , E y   and  k be defined by equation (10). Then
there exists a positive integer m such that there are m switches from stability to instability
and to stability. In other words, when   [0,  0 )  ( 0 , 1 )    ( m1 ,  m ) , the
equilibrium point E1 of model (4) is stable, and when
  ( 0 ,  0 )  (1 , 1 )    ( m1 ,

 m1 ) , the equilibrium point E1 is unstable.
Therefore, there are bifurcations at the equilibrium point E1 for
  k , k  0, 1, 2, .
Proof. From (6) we know that the equilibrium point E₁ is stable for τ = 0. Then to prove the theorem we need only verify the transversality conditions (see Cushing [3])
 d(Re λ)/dτ |_{τ=τ_k} > 0 and d(Re λ)/dτ |_{τ=σ_k} < 0.
Differentiating equation (5) with respect to τ, we obtain
 2λ (dλ/dτ) + P₁ e^{−λτ} (dλ/dτ) + P₁ λ e^{−λτ} ( −λ − τ (dλ/dτ) ) = 0,
 [ 2λ + (1 − λτ) P₁ e^{−λτ} ] (dλ/dτ) = P₁ λ² e^{−λτ}.
For convenience, we study (dλ/dτ)^{−1} instead of dλ/dτ. Then we have
 (dλ/dτ)^{−1} = [ 2λ e^{λτ} + (1 − λτ)P₁ ] / (λ² P₁) = 2 e^{λτ}/(λ P₁) + 1/λ² − τ/λ.
From the characteristic equation (5) we know that e^{λτ} = −P₁ λ / (λ² + Q₁).
Then we have
 (dλ/dτ)^{−1} = −(λ² − Q₁) / ( λ²(λ² + Q₁) ) − τ/λ.
Therefore
 sign{ d(Re λ)/dτ }_{λ=iω} = sign{ Re (dλ/dτ)^{−1} }_{λ=iω}
  = sign{ Re[ −(λ² − Q₁)/(λ²(λ² + Q₁)) ]_{λ=iω} + Re[ −τ/λ ]_{λ=iω} }
  = sign{ (ω² + Q₁) / ( ω²(ω² − Q₁) ) }
  = sign{ ω⁴ − Q₁² }.
From equation (8) we know that ω⁴ − Q₁² = 2ω⁴ − (P₁² + 2Q₁)ω². Then we have
 sign{ d(Re λ)/dτ }_{λ=iω} = sign{ 2ω⁴ − (P₁² + 2Q₁)ω² } = sign{ 2ω² − (P₁² + 2Q₁) }.
By substituting the expression (9) for ω², it is easy to see that the sign is positive for ω₊² and negative for ω₋². Therefore, crossing of the imaginary axis from left to right with increasing τ occurs for the values of τ corresponding to ω₊, and crossing from right to left occurs for the values of τ corresponding to ω₋. From (9) and the last result, we can verify that the transversality conditions are satisfied. Therefore the τ_k and σ_k are Hopf bifurcation values. □


Example 1. Consider model (4) with parameters r = 1.1, K = 110, α = 0.2, c = 0.8, β = 0.1, E_x = 0.1, and E_y = 0.2. The equilibrium point of the model in the positive quadrant is (10, 4.5). For τ = 0, the Jacobian matrix of the model associated with the equilibrium point has eigenvalues −0.05000 ± 0.94736i. This means that the equilibrium point of the model without time delay is stable. Following Theorem 1 we have τ₀ = 1.57080, σ₀ = 5.23599, τ₁ = 7.85398, σ₁ = 12.21730, τ₂ = 14.13717, σ₂ = 19.19862, τ₃ = 20.42035, σ₃ = 26.17994, τ₄ = 26.70354, σ₄ = 33.16126, τ₅ = 32.98672 and σ₅ = 40.14257. Then we have 4 stability switches from stability to instability and back to stability.
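The switch delays listed in Example 1 follow directly from (9) and (10). The short Python sketch below reproduces them for these parameter values; it merely evaluates the formulas above, with variable names chosen here for readability.

```python
import numpy as np

# parameters of Example 1
r, K, alpha, c, beta = 1.1, 110.0, 0.2, 0.8, 0.1
Ex, Ey = 0.1, 0.2

r1, K1, c1 = r - Ex, (r - Ex) * K / r, c + Ey
x1 = c1 / beta                                   # equilibrium E1 = (x1, y1) = (10, 4.5)
y1 = r1 * (beta * K1 - c1) / (alpha * beta * K1)

P1 = r1 * x1 / K1                                # coefficients of equation (5)
Q1 = alpha * beta * x1 * y1

disc = np.sqrt(P1**4 + 4 * P1**2 * Q1)           # equation (9)
w_plus = np.sqrt(0.5 * (P1**2 + 2 * Q1 + disc))
w_minus = np.sqrt(0.5 * (P1**2 + 2 * Q1 - disc))

k = np.arange(6)                                 # equation (10)
tau = (np.pi / 2 + 2 * k * np.pi) / w_plus       # 1.57080, 7.85398, ..., 32.98672
sigma = (3 * np.pi / 2 + 2 * k * np.pi) / w_minus  # 5.23599, 12.21730, ..., 40.14257
```

For these data ω₊ = 1 and ω₋ = 0.9, which gives exactly the values of τ_k and σ_k listed above.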

4. BIONOMIC EQUILIBRIUM

Bionomic equilibrium is a concept that integrates the biological equilibrium and the economic equilibrium, Bhattacharya and Begum [1]. As discussed before, the biological equilibrium is found by solving dx/dt = 0 and dy/dt = 0 simultaneously. The economic equilibrium is reached when the total revenue from selling the harvested biomass equals the total cost of the harvesting efforts.
Let c_x = harvesting cost per unit effort of the prey population, c_y = harvesting cost per unit effort of the predator population, c_f = fixed cost of harvesting, p_x = price per unit biomass of the prey population, and p_y = price per unit biomass of the predator population.
The profit function is given by
 π = p_x x E_x + p_y y E_y − c_f − c_x E_x − c_y E_y.
The harvesting cost per unit effort is actually not constant, but here we assume the cost is constant in order to simplify the analysis. The bionomic equilibrium (x_∞, y_∞, E_{x∞}, E_{y∞}) is found by solving the simultaneous equations
 r₁(1 − x/K₁) − αy = 0
 −c₁ + βx = 0          (11)
and
 p_x x E_x + p_y y E_y − c_f − c_x E_x − c_y E_y = 0.          (12)
By solving equation (11) we get x = c₁/β and y = r₁(βK₁ − c₁)/(αβK₁). Substituting x = c₁/β and y = r₁(βK₁ − c₁)/(αβK₁) into equation (12) we get
 π = −(p_y r/(αβK)) E_y² + [ (p_x/β − p_y/α)E_x + p_y r/α − p_y rc/(αβK) − c_y ] E_y + (p_x c/β − c_x) E_x − c_f,
with critical point (E_{x∞}, E_{y∞}) determined by ∂π/∂E_x = 0 and ∂π/∂E_y = 0.
In order to have a real critical point we assume that αp_x − βp_y ≠ 0. The profit function π has a saddle point at the critical point (E_{x∞}, E_{y∞}). The critical point may or may not belong to Ω. When the critical point belongs to Ω, it is still not considered, because we only consider the critical point that maximizes the profit function.
The maximum value of the profit function occurs on the boundary of Ω; it occurs on the line segment AB, where A = ((βKr − cr)/(βK), 0) and B = (0, βK − c). From the linear equation which passes through the points A and B, we have E_y = −(βK/r)E_x + βK − c, and substituting E_y into the profit function we obtain
 π = π(E_x) = −(p_x K/r) E_x² + ((p_x Kr − c_x r + c_y βK)/r) E_x + c_y c − c_y βK − c_f
with critical point E_{x1} = (p_x Kr − c_x r + c_y βK)/(2p_x K). If
 0 < (p_x Kr − c_x r + c_y βK)/(2p_x K) < (βKr − cr)/(βK)
holds, then the critical point lies in the segment AB. We assume that this condition holds. Further we obtain
 E_{y1} = (βp_x Kr − c_y β²K + βc_x r − 2crp_x)/(2rp_x),
and the critical point (E_{x1}, E_{y1}) maximizes the profit. Although (E_{x1}, E_{y1}) maximizes the profit function, it will not be applied, since this situation implies that the equilibrium point does not occur in the positive quadrant, i.e., y₁* = r₁(βK₁ − c₁)/(αβK₁) = 0. This situation leads to the extinction of the predator population.
Under ecological considerations we may think that there is a certain allowed minimum value for the population y to prevent extinction. Let y ≥ y_min > 0, where y_min is the constant allowed minimum size of population y. Under this consideration, it follows that y₁* = r₁(βK₁ − c₁)/(αβK₁) ≥ y_min. The inequality can be rewritten as αβy_min K₁ − r₁(βK₁ − c₁) ≤ 0, and after simplification we have
 rE_y + αβy_min K − βKr + βKE_x + cr ≤ 0.
The efforts now should belong to Ω₁, where
 Ω₁ = { (E_x, E_y) | rE_y + βKE_x + αβy_min K − βKr + cr ≤ 0, E_x ≥ 0, E_y ≥ 0 }.
The critical point occurs on the boundary of Ω₁; it occurs on the line rE_y + αβy_min K − βKr + βKE_x + cr = 0, from which we have
 E_y = −(αβy_min K − βKr + βKE_x + cr)/r.
In order to get a positive value of E_y, we choose the value of y_min which satisfies y_min < (βKr − βKE_x − cr)/(αβK). Since E_x ≥ 0, y_min should satisfy y_min < (βKr − cr)/(αβK). Substituting E_y into the profit function π we have
 π = π(E_x) = −(p_x K/r) E_x² + (A₁/r) E_x + B₁/r,
where
 A₁ = −αp_x K y_min + p_x Kr − βp_y K y_min − c_x r + c_y βK
and
 B₁ = −αβp_y K y_min² + βp_y Kr y_min − p_y cr y_min + αβc_y K y_min − c_f r − c_y βKr + c_y cr.
The critical point of the profit function is E_{x2} = A₁/(2p_x K). Substituting E_{x2} we get
 E_{y2} = (−2αβp_x K y_min + 2βp_x Kr − βA₁ − 2p_x cr)/(2p_x r).
The critical point (E_{x2}, E_{y2}) maximizes the profit function and also prevents the predator population from extinction. When the efforts E_{x2} and E_{y2} are applied in the model, the equilibrium point E₁ is in the positive quadrant and stable.

Example 2. Consider model (4) with parameters r = 1, K = 100, α = 0.2, c = 1, and β = 0.1. Take p_x = 1, p_y = 1, c_f = 2, c_x = 0.5, and c_y = 0.5. The equilibrium point of the model is (x₁*, y₁*), where x₁* = 10 + 10E_y and y₁* = 4.5 − 5E_x − 0.5E_y. The equilibrium point still depends on E_x and E_y. Then we have
 Ω = { (E_x, E_y) | 10E_x + E_y < 9, E_x ≥ 0, E_y ≥ 0 }.
The profit function becomes π = π(E_x) = −100E_x² + 104.5E_x − 6.5 with critical point E_{x1} = 0.5225 and E_{y1} = 3.7750. The critical point (E_{x1}, E_{y1}) belongs to Ω and the maximum profit is π_max = 20.800625. After we substitute E_{x1} = 0.5225 and E_{y1} = 3.7750, the equilibrium point becomes (x₁*, y₁*) = (47.75, 0).
When we apply the critical effort (E_{x1}, E_{y1}) = (0.5225, 3.7750), the profit function is at its maximum level but the predator population becomes extinct. This policy will not be considered. If we take, for example, y_min = 1, we have
 Ω₁ = { (E_x, E_y) | 10E_x + E_y − 7 ≤ 0, E_x ≥ 0, E_y ≥ 0 }.
The profit function becomes π = π(E_x) = −100E_x² + 74.5E_x + 1.5 and the critical point of the profit function is E_{x2} = 0.3725 and E_{y2} = 3.2750. The critical point (E_{x2}, E_{y2}) belongs to Ω₁ and the maximum profit becomes π_max = 15.375625. Substituting E_{x2} = 0.3725 and E_{y2} = 3.2750, we obtain the equilibrium point (x₁*, y₁*) = (42.7500, 1.0000), which is also stable. This policy is reasonable to consider since the predator population will not become extinct and the profit function is also at its maximum level.
Following Theorem 1, we get the time delay margin τ₀ = 1.35088, for which the equilibrium point (42.7500, 1.0000) is asymptotically stable for τ ∈ [0, 1.35088) and the profit function is also at its maximum level. We can verify that the maximum profit depends on the given value of y_min. If y_min approaches zero, the maximum profit approaches π_max = 20.800625.
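The figures in Example 2 can be checked with a few lines of code. The following Python sketch evaluates the profit along the boundary of Ω₁ for y_min = 1 and locates its maximum; it is only a numerical check of the example, with an arbitrarily chosen grid resolution.

```python
import numpy as np

r, K, alpha, c, beta = 1.0, 100.0, 0.2, 1.0, 0.1
px, py, cf, cx, cy = 1.0, 1.0, 2.0, 0.5, 0.5

def equilibrium(Ex, Ey):
    # equilibrium of model (4): here x* = 10 + 10*Ey, y* = 4.5 - 5*Ex - 0.5*Ey
    x_star = (c + Ey) / beta
    y_star = (r - Ex) / alpha - r * (c + Ey) / (alpha * beta * K)
    return x_star, y_star

def profit(Ex, Ey):
    x_star, y_star = equilibrium(Ex, Ey)
    return px * x_star * Ex + py * y_star * Ey - cf - cx * Ex - cy * Ey

# boundary of Omega_1 for y_min = 1:  10*Ex + Ey = 7, i.e. y* = y_min
Ex = np.linspace(0.0, 0.7, 70001)
Ey = 7.0 - 10.0 * Ex
pi = profit(Ex, Ey)
i = np.argmax(pi)
print(Ex[i], Ey[i], pi[i])    # approximately 0.3725, 3.2750, 15.375625
```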

5. HARVESTING OPTIMAL POLICY

The objective is to maximize the present value J of a continuous time-stream of revenues given by
 J = ∫₀^∞ e^{−δt} [ (p_x q_x x − c_x)E_x(t) + (p_y q_y y − c_y)E_y(t) − c_f ] dt,          (13)
where δ denotes the instantaneous annual rate of discount. We maximize J subject to the constraint equation (3) by using Pontryagin's maximum principle, Pontryagin et al. [9]. The variables x and y in equation (13) refer to the stable equilibrium point of equation (3). When we set equation (3) equal to zero, the time delay does not change the value of the stable equilibrium point. The control variables E_x(t) and E_y(t) are subject to the constraints 0 ≤ E_x(t) ≤ (E_x)_max and 0 ≤ E_y(t) ≤ (E_y)_max.
The Hamiltonian for this problem is given by
 H = e^{−δt}[ (p_x q_x x − c_x)E_x + (p_y q_y y − c_y)E_y − c_f ]
   + λ_x [ rx(1 − x/K) − αxy − q_x E_x x ] + λ_y [ −cy + βxy − q_y E_y y ],          (14)
where λ_x(t) and λ_y(t) denote the adjoint variables.
The control variables E_x and E_y appear linearly in the Hamiltonian H. Therefore, the necessary conditions for the control variables to be optimal are ∂H/∂E_x = 0 and ∂H/∂E_y = 0. From the Hamiltonian (14), we have
 ∂H/∂E_x = e^{−δt}(p_x q_x x − c_x) − λ_x q_x x = 0, from which we get λ_x = e^{−δt}(p_x q_x x − c_x)/(q_x x).
Again from equation (14) we have
 ∂H/∂E_y = e^{−δt}(p_y q_y y − c_y) − λ_y q_y y = 0, and then we get λ_y = e^{−δt}(p_y q_y y − c_y)/(q_y y).
From the Hamiltonian we also have
 ∂H/∂x = e^{−δt} p_x q_x E_x + λ_x [ r(1 − x/K) − rx/K − αy − q_x E_x ] + λ_y βy
and
 ∂H/∂y = e^{−δt} p_y q_y E_y − λ_x αx + λ_y ( −c + βx − q_y E_y ).
From Pontryagin's maximum principle, λ̇_x = −∂H/∂x, we get
 −δ e^{−δt}(p_x q_x x − c_x)/(q_x x) + e^{−δt} p_x q_x E_x + λ_x [ r(1 − x/K) − rx/K − αy − q_x E_x ] + λ_y βy = 0,
or equivalently
 −δ e^{−δt}(K p_x q_x x − K c_x) + e^{−δt} p_x q_x² E_x x K + λ_x q_x x r K − 2λ_x q_x x² r − λ_x α q_x x y K − λ_x q_x² x E_x K + λ_y β y q_x x K = 0.          (15)
Again, since λ̇_y = −∂H/∂y, we get
 −δ e^{−δt}(p_y q_y y − c_y)/(q_y y) + e^{−δt} p_y q_y E_y − λ_x αx + λ_y ( −c + βx − q_y E_y ) = 0,
or equivalently
 −δ e^{−δt}(p_y q_y y − c_y) + e^{−δt} p_y q_y² E_y y − λ_x α x q_y y + λ_y q_y y ( −c + βx − q_y E_y ) = 0.          (16)

By substituting λ_x = e^{−δt}(p_x q_x x − c_x)/(q_x x) and λ_y = e^{−δt}(p_y q_y y − c_y)/(q_y y) into equations (15) and (16) and solving simultaneously, we get
 E_x = (1/(q_x K q_y c_x)) [ δK p_x q_x x q_y − δK c_x q_y − rK q_y p_x q_x x + rK q_y c_x + 2x² r q_y p_x q_x − 2x r q_y c_x + αyK q_y p_x q_x x − αyK q_y c_x − βq_x x K p_y q_y y + βq_x x K c_y ]          (17)
and
 E_y = (1/(q_y q_x c_y)) [ δp_y q_y y q_x − δc_y q_x + αq_y y p_x q_x x − αq_y y c_x + cq_x p_y q_y y − cq_x c_y − βx q_x p_y q_y y + βx q_x c_y ].          (18)
By substituting x = (c + q_y E_y)/β and y = (βKr − cr − βK q_x E_x − q_y E_y r)/(αβK), which constitute the stable equilibrium point, into equations (17) and (18), and then solving simultaneously, we get the values of the control variables E_x and E_y. Therefore, the values of E_x, E_y, x, and y maximize the present value (13).

Example 3. Consider the parameter values r = 1, K = 100, α = 0.2, c = 1, and β = 0.1. Take p_x = 1, p_y = 1, c_x = 0.5, c_y = 0.5, c_f = 2.0, q_x = 1, q_y = 1, and δ = 0.005. From these values we have the optimal equilibrium point (x, y) = (10 + 10E_y, 4.5 − 5E_x − 0.5E_y) with the adjoint variables
 λ_x = e^{−0.005t}(9.5 + 10E_y)/(10 + 10E_y) and λ_y = e^{−0.005t}(4 − 5E_x − 0.5E_y)/(4.5 − 5E_x − 0.5E_y).
Further, we get the optimal harvesting efforts E_x = 0.52491 and E_y = 3.75004 associated with the optimal equilibrium point. From the optimal harvesting efforts we get the optimal equilibrium point (x, y) = (47.50042, 0.00044) and the values of the adjoint variables λ_x = 0.9895e^{−0.005t} and λ_y = −1130.0081e^{−0.005t}. Therefore, we get the maximum value of
 J = ∫₀^∞ 20.7975 e^{−0.005t} dt = 4159.5009.
Following Theorem 1, we get the time delay margin τ₀ = 3.30078, for which the equilibrium point (47.50042, 0.00044) is asymptotically stable for τ ∈ [0, 3.30078) and the present value J is also at its maximum level.
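Because the efforts in Example 3 are constant, the integrand of (13) is a constant profit discounted at the rate δ, so J is that profit divided by δ. The sketch below re-evaluates the reported figures; it is only a check of the numbers above, not part of the optimization itself.

```python
# numerical check of Example 3 (constant-effort present value)
px, py, cx, cy, cf = 1.0, 1.0, 0.5, 0.5, 2.0
qx, qy, delta = 1.0, 1.0, 0.005
Ex, Ey = 0.52491, 3.75004             # optimal efforts reported in Example 3

x_star = 10.0 + 10.0 * Ey             # equilibrium of (3) for these parameter values
y_star = 4.5 - 5.0 * Ex - 0.5 * Ey

profit = (px * qx * x_star - cx) * Ex + (py * qy * y_star - cy) * Ey - cf
J = profit / delta                    # integral of profit * exp(-delta*t) over [0, inf)
print(x_star, y_star, profit, J)      # approx. 47.50042, 0.00044, 20.80, 4159.5
```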

6. CONCLUSIONS

The prey-predator population model with and without constant harvesting efforts may have a positive equilibrium point. When the positive equilibrium point exists, it is globally asymptotically stable. When the time delay is considered in the equation of the prey growth rate, the previously stable equilibrium point may remain stable or become unstable, and there also exist Hopf bifurcation values. The stability of the equilibrium point depends on the value of the time delay.
For the model with constant harvesting efforts and time delay, we found a condition on the values of the efforts and the time delay that maximizes the profit function without altering the stability of the stable equilibrium point. When we introduce a minimum value for the predator population and a small value of the time delay, the two populations continue to exist and give a maximum profit.
By using Pontryagin's maximum principle, we maximize the present value of revenues with the constraint that equation (3) equals zero. From the analysis, we found values of the harvesting efforts and a stable equilibrium point that maximize the present value of revenues. With a small value of the time delay, the stable equilibrium point that maximizes the present value also remains stable.

References

[1] BHATTACHARYA, D. K. AND BEGUM, S., Bionomic Equilibrium of Two-Species System, Mathematical
Biosciences, 135(2), 111-127, 1996.
[2] CLARK, C. W., Mathematical Bioeconomics, The Optimal Management of Renewable Resources, 2nd Ed.,
John Wiley & Sons, New York-Toronto, 1990.


[3] CUSHING, J. M., Integrodifferential Equations and Delay Models in Population Dynamics, Heidelberg:
Springer-Verlag, 1977.
[4] HO, C.P. AND OU, Y.L., Influence of Time Delay on Local Stability for a Predator-Prey system, Journal of
Tunghai Science, 4, 47-62, 2002.
[5] HOLMBERG, J., Socio-Ecological Principles and Indicators for Sustainability, PhD Thesis, Goteborg
University, Sweden, 1995.
[6] KAR, T.K. AND CHAUDHURI, K.S., Harvesting in a Two-Prey One Predator Fishery: A Bioeconomic Model.
J. ANZIAM, 45, 443-456, 2004.
[7] LUCKINBILL, L.S., Coexistence in Laboratory Populations of Paramecium Aurelia and Its Predator
Didinium Nasutum, Journal of Ecology, 54(6), 1320-1327, 1973.
[8] MAY, R. M., Stability and Complexity of Model Ecosystems, Princeton, New Jersey: Princeton University
Press, 1974.
[9] PONTRYAGIN, L. S., BOLTYANSKII, V. S., GAMKRELIDRE, R. V., AND MISHCHENKO, E. F., The Mathematical
Theory of Optimal Processes. Wiley, New York, 1962.
[10] TOAHA, S., Stability Analysis and Maximum Profit of Predator – Prey Population Model with Time Delay
and Constant Effort of Harvesting. MJMS, 2(2), 147-159, 2008.

SYAMSUDDIN TOAHA
Department of Mathematics, Hasanuddin University, Makassar, Indonesia.
e-mail: [email protected]
Proceedings of ”The 6th SEAMS-UGM Conference 2011”
Applied mathematics, pp. 579–588.

DYNAMIC ANALYSIS OF ETHANOL, GLUCOSE, AND


SACCHAROMYCES FOR BATCH FERMENTATION

WIDOWATI, NURHAYATI, SUTIMIN, LAYLATUSYSYARIFAH

Abstract. In this paper, a dynamic model is presented to describe the behavior of the glucose concentration, Saccharomyces, and ethanol concentration during a batch fermentation process. The desired product of batch alcohol fermentation is ethanol. The mathematical model takes the form of a system of nonlinear differential equations. The stability of the equilibrium point of the dynamic model is discussed. Further, numerical simulation based on experimental data is used to analyze the stability of the dynamic model. From the simulation results, the behavior of glucose, Saccharomyces, and ethanol reaches steady states on the 3rd day.

Keywords and Phrases : Dynamic model, ethanol, glucose, Saccharomyces, steady state

1. INTRODUCTION

A model for predicting alcoholic fermentation behaviour, which would be a valuable instrument for tequila research due to its technical and economical implications, has been proposed by Arellano-Plaza et al. [1]. Other researchers [8] have presented a biochemically structured model for the aerobic growth of Saccharomyces cerevisiae on glucose and ethanol. Using a bifurcation analysis with the two external variables, the dilution rate and the inlet concentration of glucose, as parameters, they showed that a fold bifurcation occurs close to the critical dilution rate, resulting in multiple steady states. Further, Wei and Chen [9] investigated a mathematical model for ethanol fermentation with gas stripping. They studied the existence and local stability of two equilibrium points of the subsystem. Cheng-Che Li [3] observed oscillations during the production of ethanol by fermentation with Zymomonas mobilis. The focus of that paper is to help understand which inhibitory mechanism may cause the oscillatory phenomena in the fermentation process. Furthermore, Widowati et al. [11] proposed a stability analysis of a dynamic model of alcoholic fermentation.
In comparison with the previous papers [3, 9, 11], this paper investigates the dynamic analysis of batch alcoholic fermentation, whereas [3, 9, 11] discussed the dynamic
_________________________________
2010 Mathematics Subject Classification : 92B99

579

analysis of continuous alcoholic fermentation. In this paper, we present the stability analysis of a dynamic model of the ethanol concentration, glucose concentration, and Saccharomyces during a batch fermentation process. The local stability can be determined from the eigenvalues of the Jacobian matrix of the linearized system.
The paper is organized as follows. Section 2 describes the mathematical modeling of alcoholic fermentation. Results concerning the stability analysis of the dynamic model are discussed in Section 3. In Section 4 the validity of the proposed method is demonstrated using experimental data and numerical simulation. Finally, concluding remarks are given in Section 5.

2. MATHEMATICAL MODELING

The process of fermentation is crucial in ethanol production. The mathematical model is proposed to represent the dynamics of batch alcoholic fermentation. The model variables are the glucose concentration, Saccharomyces, and ethanol concentration. The following system of differential equations describes a batch alcoholic fermentation process [5]:

 dP/dt = αX,
 dS/dt = −qX,          (1)
 dX/dt = (VS/(K + S)) X,

where α is the growth rate of ethanol (mg/ml), q is the rate of glucose consumption (mg/ml), P is the concentration of ethanol (ml/ml), S is the concentration of glucose (mg/ml), X is the Saccharomyces wet weight (mg/ml), V is the maximal growth rate of Saccharomyces (mg/ml), and K is the Michaelis-Menten constant.
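As a concrete illustration of system (1), the following Python sketch integrates it numerically. The parameter values, the initial state, and the nonnegativity guard on S are assumptions made here for the illustration only; they are not the fitted values or the exact formulation used in [10].

```python
import numpy as np
from scipy.integrate import solve_ivp

# hypothetical parameter values, chosen only to illustrate the dynamics of (1)
alpha, q, V, K = 0.4, 0.9, 1.2, 2.0

def batch_fermentation(t, y):
    P, S, X = y
    S = max(S, 0.0)                      # numerical guard: glucose is nonnegative
    growth = V * S / (K + S) * X         # Michaelis-Menten growth of Saccharomyces
    return [alpha * X,                   # dP/dt: ethanol production
            -q * X if S > 0 else 0.0,    # dS/dt: glucose consumption until depletion
            growth]                      # dX/dt

y0 = [0.0, 10.0, 0.1]                    # initial ethanol, glucose, yeast (illustrative)
sol = solve_ivp(batch_fermentation, (0.0, 6.0), y0,
                t_eval=np.linspace(0.0, 6.0, 121), max_step=0.05)
P, S, X = sol.y                          # trajectories; S is depleted in finite time
```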
Let (P*, S*, X*) be the equilibrium of the ethanol, glucose, and Saccharomyces model system (1). The equilibrium point can be obtained by setting
 dP/dt = 0,  dS/dt = 0,  dX/dt = 0.
System of equations (1) at the point can be written

(2)

Further, we obtain equilibrium point, , of alcoholic fermentation model as


follows

, , , (3)

where , .

3. STABILITY ANALYSIS

Stability analysis of the dynamic model (1) is carried out through the linearized system around the equilibrium point using a Taylor series [2, 4, 7]. The local stability of the system around the equilibrium point can be determined by the eigenvalues of the Jacobian matrix of the linearized system.

Consider,

(4)

where ̅ ̅ and ̅ .
Linearization model (4) in the equilibrium point using Taylor series are as
follows

̅
̅ ̅ ̅

̅ (5)
̅ ̅ ̅

̅
̅ ̅ ̅

Substitution (4) into the equation (5), so that we find


̅
̅ (6)

̅
̅

̅
̅ ̅

System (6) in matrix form will become

̅
̅
[ ] [ ̅] (7)
̅ ̅
[ ]

with Jacobian matrix is

J (8)

[ ]

The behavior of the system (4) around the equilibrium point ( ) can be seen from the
Jacobian matrix as follows:

J ( ) [ ]

Characteristic matrix equation J ( ) can be found by

(J ( ) ) where I is the identity matrix, so that we get

J ( ) | |

The characteristic equation of matrix J ( ) is

(9)

The behaviour of the system around the equilibrium point is stable if all solutions of the characteristic equation (9) have real parts less than or equal to zero, and it is unstable if there exists a solution with real part greater than zero. From the above results, the stability of the equilibrium point is determined by the eigenvalues of the Jacobian matrix.

4. NUMERICAL SIMULATION

As a verification of the proposed method, numerical simulations are given using data from laboratory experiments. Here, we check the optimal concentration of ethanol and the stability of the system. We use data from Widowati et al. [10], who conducted experiments in the Microbiology Laboratory for batch systems. The data consist of the concentration of ethanol, 10% glucose, and the Saccharomyces wet weight. Further, by using the least squares method and some algebraic manipulation, we estimate the model parameters. Substituting these parameters into equation (1), we have

(10)

Linearization of the system (10) around equilibrium point


, , , using Taylor series results the following equations

̅
̅

̅
̅ (11)

̅
̅ ̅

Based on the data of ethanol, glucose, and Saccharomyces wet weight [10], we obtain the
Jacobian matrix,

J [ ]

Eigenvalues of the matrix is as follows

and

Solving the differential equation systems (11) using initial values,

̅ ̅ ̅

are obtained particular solutions as follows

̅
̅
̅

Furthermore, we evaluate the behavior of the dynamic model around the equilibrium point.

Figure 1. Ethanol concentration P (ml/ml) vs time (days)



Figure 2. Glucose concentration S (mg/ml) vs time (days)

Figure 3. Saccharomyces X (mg/ml) vs time (days)

From Figures 1-3, it can be seen that the ethanol concentration, glucose concentration, and Saccharomyces reach steady states. By the 3rd day, the glucose concentration has decreased and the number of Saccharomyces cells declines because no nutrients remain to be consumed. This indicates that the fermentation process stops and the optimal concentration of ethanol is achieved on the 3rd day.

5. CONCLUDING REMARK

A dynamic model has been proposed to describe the behavior of the glucose concentration, Saccharomyces, and ethanol concentration during a batch fermentation process. The dynamical behavior around the equilibrium point is stable if all the eigenvalues of the Jacobian matrix of the linearized system have real parts less than or equal to zero, while the dynamical behavior of the system around the equilibrium point is unstable if there exists an eigenvalue of the Jacobian matrix with real part greater than zero.
From the simulation results it is found that the ethanol concentration, glucose concentration, and Saccharomyces reach steady states. This means that the dynamic model is stable. In this case, the optimal concentration of ethanol is achieved on the 3rd day with a glucose concentration of 10%.

Acknowledgement. This study is a part of research (ST No. 485/H7.3.8/AK/2011). The authors are grateful to the Dean of the Faculty of Mathematics and Natural Sciences for the support of this research. We also would like to thank the reviewers for their comments and suggestions.

References

[1] ARELLANO-PLAZA, M., et.al., Unstructured kinetic model for tequila batch fermentation International Journal of
Mathematics and Computers in Simulation, 1, 1, 1-6, 2007.
[2] BOYCE, W. E. AND DIPRIMA, R. C., Elementary Differential Equation and Boundary Value Problem, John Wiley
& Sons, Inc, New York, 1992.
[3] CHENG-CHE LI, Mathematical models of ethanol inhibition effect during alcohol fermentation, Nonlinear
Analysis, 71, e1608-e1619, 2009.
[4] CRONIN, J., Differential Equations: Introduction and Qualitative Theory, New York: Marcel Dekker, Inc., 1994.
[5] HISAHARU, T., MICHIMASA, DIMITAR, SUBHABRATA, AND TOSHIOMA, Application of the Fuzzy Theory to Simulation of Batch Fermentation, Japan, 1985.
[6] JAMES, L., Kinetics of ethanol inhibition in alcohol fermentation, Biotechnol. Bioeng., 280-285, 1985.
[7] LEDDER, G., Differential Equations: A Modeling Approach, The McGraw-Hill, New York, 2005.
[8] LEI, F., ROTBOLL, M. JORGENSEN, S.B., A biochemically structured model for saccharomyces cerevisiae, Journal
of Biotechnology, 88, 205-221, 2001.
[9] WEI, C. AND CHEN, L. Dynamics analysis of mathematical model of ethanol fermentation with gas stripping,
Journal of Nonlinear Dynamic, 77,. 13-23, 2009.
[10] WIDOWATI, NURHAYATI, DAN SUTIMIN, Laporan Penelitian: Model dinamik fermentasi alkohol untuk
menentukan optimasi produk etanol: Studi Kasus Industri alkohol Sukoharjo, FMIPA Universitas Diponegoro,
Semarang, 2011.
[11] WIDOWATI, NURHAYATI, LAILATUSYSYARIFAH, Kestabilan model dinamik fermentasi alkohol secara kontinu,
Prosiding Seminar Nasional Statistika, ISBN: 978-979-097-142-4, Mei 2011.

WIDOWATI
Mathematics Department, Diponegoro University
Jl. Prof. H. Soedarto, S.H., Semarang, 50275, Indonesia.
e-mail: [email protected]

NURHAYATI
Biology Department, Diponegoro University
Jl. Prof. H. Soedarto, S.H., Semarang, 50275, Indonesia.

SUTIMIN
Mathematics Department, Diponegoro University
Jl. Prof. H. Soedarto, S.H., Semarang, 50275, Indonesia.

LAILATUSYSYARIFAH
Mathematics Department, Diponegoro University
Jl. Prof. H. Soedarto, S.H., Semarang, 50275, Indonesia.
Proceedings of ”The 6th SEAMS-UGM Conference 2011”
Computer, Graph and Combinatorics, pp. 589–600.

SURVEY OF METHODS FOR MONITORING


ASSOCIATION RULE BEHAVIOR

Ani Dijah Rahajoe and Edi Winarko

Abstract. Much of the existing research in data mining has focused on how to generate rules efficiently from static databases. However, the database from which the rules are generated has often been collected over a considerable period of time, making it more likely than not to change. As a result, the underlying rules may also change as a function of time. As an example, consider the personalization of web sites: changes in the topology of a web site may result in different user navigation behaviour. Observing how the rules change can provide more useful information than the rules themselves. In this paper, we survey different methods for monitoring the behaviour of association rules over periods of time.

Keywords and Phrases: Data mining, Association rules, rule evolution, rule monitoring.

1. INTRODUCTION
Much of the existing research in data mining has focused on how to generate rules efficiently from static datasets. However, the dataset used for mining has often been collected over a considerable period of time, making it more likely than not to change. Moreover, consumer behavior and preferences also change over time. As a result, the underlying rules may also change as a function of time. As an example, consider the personalization of web sites: changes in the topology of a web site may result in different user navigation behaviour.
Research associated with mining rule changes has focused on several areas. In classification mining, there has been work on concept drift; see [1, 2]. Here, concept drift refers to the phenomenon that some or all of the rules defining classes change over time. In association rule mining, a number of works on mining rule changes have been done with the objective of observing the behaviour of association rules over several time periods [3, 4, 5, 6, 7]. In addition, other works focus on the changes that occur between two periods of time [8, 9, 10]. Apart from these, a line of research called incremental mining has emerged. Its objective is to update previously discovered rules incrementally when the underlying dataset is updated. By doing this, they can

avoid scanning the whole dataset again. In incremental mining the statistical properties
of known patterns are updated, instead of being recorded over time as in mining rule
change. Incremental mining algorithms designed for maintaining discovered association
rules can be found in [11, 12, 13, 14, 15].
In this paper, we survey different methods for monitoring the behaviour of associ-
ation rules. We first give an overview of research in mining association rule changes. In
this context, mining rule change generally consists of three steps. The most important
step is the monitoring step, in which different methods to monitor the rules have been
proposed. We will describe each step briefly, then present the classification of meth-
ods currently used in the monitoring. Finally, we describe how each method in each
category is used to monitor the behaviour of association rules.

1.1. Overview of Research in Mining Rule Change. Based on their mining platform, research on mining association rule changes has been done on both temporal databases and non-temporal databases. However, only a few of these works use temporal databases as a platform for the data mining task, for example [3]. Saraee et al. [16] are the first to introduce a framework for mining association rules and sequential patterns from a temporal database, using the ORES temporal database management system. However, the framework does not consider rule changes. This work focuses on two areas and their integration, i.e., data mining as a technique to increase the quality of data, and temporal databases as a technique to keep the history of data. A number of enhancements to the basic algorithms for mining association rules and sequential patterns are introduced. One of them is a new measure for mining association rules, called time confidence. Tansel et al. [3] study the problem of discovering association rules and their evolution from temporal databases. The proposed approach allows the user to observe the changes in association rules that occur over periods of time. The observed changes include a decrease/increase in the support/confidence of an association rule and the addition/removal of items from a particular itemset.
The problem of monitoring the support and confidence of association rules from non-temporal databases has been addressed in [4, 5, 6, 7]. Agrawal et al. [4] propose a method to monitor rules from different time periods. The rules discovered from different time periods are collected into a rule base. Ups and downs in support or confidence over time are represented and defined using shape operators. The user can then query the rule base by specifying some history specifications. In addition, the user can specify triggers over the rule base in which the triggering condition is a query on the shape of the history. In [5], Liu et al. propose a technique that uses statistical methods to analyze the behavior of association rules over time. They focus on determining rules that are semi-stable, stable, or show trends over several time periods. [6] proposes visualization techniques that allow the user to visually analyze association rules and their changing behaviours over a number of time periods. Baron et al. [7] introduce the GRM (General Rule Model) to model both the content and the statistics of a rule as a temporal object. Based on these two components of a rule, different types of pattern evolution are defined, such as changes of statistics or content, the disappearance of a rule, and the correlated changes of pairs of rules. In [17], Baron et al. study the evolution of web usage patterns using PAM (PAttern Monitor). The association rules that show which

pages tend to be visited within the same user session are generated from a web server.
They demonstrate how the mechanisms implemented by PAM can be used to identify
interesting changes in the usage behaviour. In most of these works, the behaviour of
rules is based on the behaviour of the rules' statistics, i.e., the changes in their support and confidence values. They do not consider changes in the rule contents.
Other works in mining association rule changes focus on detecting the changes
from two datasets, i.e., to find rule changes that occur from one dataset to another
[8, 9, 10]. Ganti et al. [8] present a general framework for measuring changes or
differences in two sets of association rules from two datasets. They compute a deviation measure, called FOCUS, which makes it possible to quantify the difference between two datasets in terms of the models they induce. In [9], Dong and Li introduce a new kind of pattern, called an emerging pattern. The support differences of association rules mined from two datasets are used to detect the emerging patterns. Liu et al. [10] study the discovery of fundamental rule changes. They consider rules of the form r1 , . . . , rm−1 → rm and detect changes in support or confidence between two consecutive time periods by applying a chi-square test.
The rest of the paper is organized as follows. Section 2 describes three basic steps
in mining association rule changes. Section 3 describes several monitoring methods
based on statistical test, while Section 4 describes monitoring methods based on visu-
alization. Section 5 presents methods to monitor rule behaviour in two datasets. The
conclusion and future work are given in Section 6.

2. BASIC STEPS IN MINING RULE BEHAVIOR


Most works on mining association rule changes divide the process into three steps. The first step is partitioning the dataset to extract the portion of the data for each time period. The second step is generating the rules, and the final step is monitoring the rules. This section briefly describes each of these steps.

Step 1: Partitioning the dataset. In this step, the important parameters are the
length of the interval during which data is accumulated, and the number of such inter-
vals. Then, we walk over the dataset D to extract a subset of the dataset for each time
period. Let ti = [bi , fi ) be a time period where bi denotes its starting time point and
fi denotes its end. Time periods t1 , t2 , . . . , tn are consecutive, non overlapping fixed
length time periods. Di denotes the portion of dataset that is valid during the time
period ti .
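To make this step concrete, the following Python sketch partitions a timestamped transaction list into consecutive, non-overlapping, fixed-length time periods. The (timestamp, item-set) representation and the function name are illustrative assumptions, not part of any surveyed method.

    from datetime import datetime, timedelta

    def partition_dataset(transactions, start, interval_days, n_periods):
        """Split timestamped transactions into consecutive, non-overlapping,
        fixed-length time periods t_1, ..., t_n (each t_i = [b_i, f_i))."""
        partitions = [[] for _ in range(n_periods)]
        length = timedelta(days=interval_days)
        for ts, items in transactions:          # assumed (timestamp, item set) pairs
            idx = int((ts - start) / length)    # index of the period the record falls into
            if 0 <= idx < n_periods:
                partitions[idx].append(items)
        return partitions

    # Example: two weekly periods starting on 2011-01-01
    D = [(datetime(2011, 1, 2), {"a", "b"}), (datetime(2011, 1, 9), {"a", "c"})]
    D1, D2 = partition_dataset(D, datetime(2011, 1, 1), 7, 2)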

Step 2: Mining rules from sub-datasets. Two different approaches are generally
used to mine the rules from the set of sub-datasets. The first approach is to mine
each sub-dataset Di in sequence. Let Ri be the set of temporal association rules from
Di , then after the mining we have the rule sets of R1 , R2 , R3 , . . ., accordingly. If R is
the set of rules that will be monitored in the next step, R is defined as R = {r|r ∈
(R1 ∪ R2 . . . ∪ Rq )}. It is possible for a rule r ∈ R to appear in Ri but not in Rj (i ≠ j)

because r may not satisfy minsup and/or minconf in Dj . This approach is used in
[3, 4, 5, 7, 17].
The second approach is to mine the rules from one sub-dataset and to apply the resulting rules to the other sub-datasets in order to calculate their support and confidence values there. This means that only an initial mining session is launched (on D1). At each later time period, an instance of each existing rule is created, and its statistics are computed from the sub-dataset of the corresponding time period. If R is the set of rules that will be monitored in the next step, R is defined as R = {r | r ∈ R1}. Thus, for each rule, we get a sequence of support and confidence values. This approach is used in [6].
The first approach results in a larger number of rules than the second one. However, users may find it more useful, as it gives a more detailed view of the whole data. In the second approach, since the monitoring is focused only on the rules generated from the first time period, it cannot be used to detect new rules that appear in the following time periods. It can only detect rules that disappear in the following time periods.
A variation of the second approach is proposed in [18], which selects a subset of the rules generated in the first time period. If R is the set of rules that will be monitored, then R is a subset of R1. This reduces the computational effort to a minimum while focusing only on interesting rules. If the user chooses to monitor all rules in R1, this variation is similar to the second approach described above. Choosing the rules to be monitored is generally user and application dependent: the rules that are interesting to one user may be of no interest to another user, and the interestingness of patterns varies from application to application. Regardless of which approach is used, the number of discovered rules can still be large. Several methods have been proposed to reduce the number of generated rules, for example by pruning [19, 20] or by using templates [21].
For the monitoring purpose, we need support and confidence values of every rule
r ∈ R, in all time periods. Therefore, we need to obtain the missing support and
confidence values in certain time periods for each r ∈ R. This can be done by rescanning
the corresponding sub-dataset to calculate the support and confidence values. If a rule
does not appear in a sub-dataset Dk , we set its support and confidence values in a time
period tk to zero.
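A minimal sketch of this rescanning step is given below, assuming transactions are held in memory as Python sets of items; a rule whose antecedent does not occur in a sub-dataset simply receives zero support and confidence for that period.

    def support(itemset, dataset):
        """Fraction of transactions (Python sets of items) that contain `itemset`."""
        if not dataset:
            return 0.0
        return sum(1 for t in dataset if itemset <= t) / len(dataset)

    def rule_history(rule, partitions):
        """Support and confidence of a rule (lhs, rhs) on every sub-dataset D_1, ..., D_n;
        periods in which the antecedent never occurs get zero values."""
        lhs, rhs = rule
        history = []
        for Dk in partitions:
            sup = support(lhs | rhs, Dk)
            sup_lhs = support(lhs, Dk)
            conf = sup / sup_lhs if sup_lhs > 0 else 0.0
            history.append((sup, conf))
        return history

    # Example: the rule {a} -> {b} over two small sub-datasets
    print(rule_history((frozenset("a"), frozenset("b")), [[{"a", "b"}, {"a"}], [{"c"}]]))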

Step 3: Monitoring rules over time. The direct and simple approach is to monitor a rule across time periods by comparing its support and confidence over all time periods. This can be done using a graph, where the x-axis represents the time line and the y-axis represents the support of a large itemset or the support/confidence of a particular rule. It is useful when the user wants to see the fluctuation of a particular rule. This method is used in [3, 22]. However, it has two drawbacks. First, it often reports far too many changes, and most of them are simply the snowball effect of some fundamental changes. Second, analysing the difference in supports/confidences may miss some interesting changes [10].
In this paper, we describe monitoring methods which are more advanced than the above method. We classify these methods into three categories: statistical-based methods, visualization-based methods, and methods to monitor rules over two

Statistical based methods
  Method description                            Statistical test            Authors
  1. Detecting semi-stable rules                z test                      Liu et al. 2001 [5]
  2. Detecting stable rules                     Chi-square test             Liu et al. 2001 [5]
  3. Detecting rules that exhibit trends        Run test                    Liu et al. 2001 [5]
  4. Detecting rules with significant changes   Two-tailed binomial test    Baron & Spiliopoulou 2003 [17]

Visualization based methods
  Method description                            On display                  Author
  1. Visualize similar rules                    A group of rules            Zhao & Liu 2001 [6]
  2. Visualize neighbour rules                  A group of rules            Zhao & Liu 2001 [6]
  3. Visualize a permanent rule                 One rule                    Baron et al. 2003 [18]

Monitoring from two datasets
  Method description                            Detection tool              Author
  1. Detecting emerging patterns                Border-based algorithm      Dong & Li 1999 [9]
  2. Detecting fundamental rule changes         Quantitative analysis,      Liu et al. 2001 [10]
                                                Qualitative analysis

Table 1. Classification of methods to monitor rule behaviour

datasets, as shown in Table 1. Statistical-based methods use statistical tests, while visualization-based methods rely on visualization.

3. STATISTICAL BASED METHOD


The statistical test used in the following methods is applied to an individual rule.
The test is performed on either the support or confidence of the rule. The focus of our
discussion is on the test applied to the rule’s confidence. The test on the rule support
is analogous.

3.1. Detecting Semi-Stable Rules. A rule r ∈ R is a semi-stable rule if none of


its confidences (supports) in the time periods t1 , t2 , . . . , tn is statistically below min-
conf (minsup). Its formal definition is given below.

Definition 3.1. Semi-stable confidence rules. Let minsup and minconf be the
minimum support and confidence, supD and confD be the support and confidence of a
rule r from the whole dataset D, confi be the confidence of the rule in the time period
ti , and α be a specified significance level. The rule r is a semi-stable confidence rule
over the time periods t1 , t2 , . . . , tn , if the following two conditions are met:
1. supD ≥ minsup and confD ≥ minconf
2. for each time period ti , we fail to reject the following null hypothesis at significance
level α: Ho : confi ≥ minconf

The first condition is used to ensure that the confidence of a rule r satisfies the
minimum confidence threshold in the whole dataset. The second condition is tested
using the z test.
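The following sketch illustrates how the second condition can be checked with a standard one-proportion z test; the exact statistic used in [5] may differ in detail, and the per-period counts (x_i, n_i) are our assumed bookkeeping (x_i transactions matching the whole rule out of n_i matching its antecedent).

    from math import sqrt
    from statistics import NormalDist

    def conf_below_minconf(x_i, n_i, minconf, alpha=0.05):
        """One-sided z test of H0: conf_i >= minconf, where conf_i = x_i / n_i is
        treated as a binomial proportion. Returns True when H0 is rejected, i.e.
        the confidence is statistically below minconf in that period."""
        conf_i = x_i / n_i
        z = (conf_i - minconf) / sqrt(minconf * (1.0 - minconf) / n_i)
        return z < NormalDist().inv_cdf(alpha)      # critical value -z_alpha

    def is_semi_stable(counts, minconf, alpha=0.05):
        """Condition 2 of Definition 3.1: H0 is not rejected in any time period.
        Condition 1 (on the whole dataset) is assumed to be checked separately."""
        return not any(conf_below_minconf(x, n, minconf, alpha) for x, n in counts)

    # Confidence stays around 0.62-0.68 over four periods, minconf = 0.6
    print(is_semi_stable([(62, 100), (65, 100), (68, 100), (63, 100)], 0.6))   # True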

3.2. Detecting Stable Rules. A semi-stable rule only requires that its confidences (supports) over time are not statistically below minconf (minsup). However, the confidences (supports) of the rule may still vary a great deal, and hence its behaviour can be unpredictable. A stable rule is a semi-stable rule whose confidences (supports) are homogeneous.
Definition 3.2. Stable confidence rules. Let minsup and minconf be the minimum
support and confidence, supD and confD be the support and confidence of a rule r from
the whole dataset D, confi be the confidence of the rule in the time period ti , and α be
a specified significance level. The rule r is a stable confidence rule over the time periods
t1 , t2 , . . . , tn , if the following two conditions are met:
1. r is a semi-stable confidence rule
2. we fail to reject the following null hypothesis at significance level α: Ho : conf1 =
conf2 = . . . = confn
The second condition is tested using the χ2 test for homogeneity of multiple proportions.
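A sketch of this homogeneity check using SciPy's chi-square test on a 2 × n contingency table is shown below; the per-period counts are assumed inputs and SciPy itself is an assumption, not a tool mentioned in the surveyed work.

    from scipy.stats import chi2_contingency

    def confidences_homogeneous(counts, alpha=0.05):
        """Chi-square test of homogeneity for H0: conf_1 = ... = conf_n, where
        `counts` lists (x_i, n_i) per period: x_i transactions satisfying the
        whole rule out of n_i satisfying its antecedent. Returning True (fail to
        reject H0) is the second condition for a stable confidence rule."""
        table = [[x for x, n in counts],        # satisfy the rule
                 [n - x for x, n in counts]]    # satisfy only the antecedent
        _, p_value, _, _ = chi2_contingency(table)
        return p_value > alpha

    print(confidences_homogeneous([(62, 100), (65, 100), (68, 100), (63, 100)]))  # True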
3.3. Detecting Rules that Exhibit Trends. Sometimes users are more interested in knowing whether changes in the support or confidence of a rule are random or whether there is an underlying trend. In this case, a statistical test called the run test can be used to detect whether a rule's support or confidence values exhibit a trend. The run test can find those rules that exhibit trends, but it does not tell the type of trend.
3.4. Detecting Significant Change. In [17] a mechanism called change detector is
used to identify significant changes. In this mechanism a two-tailed binomial test is
utilized to verify whether an observed change is statistically significant or not.
For a rule r and a statistical measure s, it is tested at each time point ti whether r.s(ti−1) = r.s(ti) at a significance level α. The test is applied to the subset of data Di accumulated between ti−1 and ti, so the null hypothesis states that Di−1 is drawn from the same population as Di, where Di−1 and Di have an empty intersection by definition. Then, for a rule r, an alert is raised at each time point ti at which the null hypothesis is rejected.
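A hedged sketch of such a change detector using SciPy's exact binomial test is shown below; the parameter names and the way the previous value is passed in are our own choices, not the interface of PAM.

    from scipy.stats import binomtest

    def significant_change(p_prev, x_i, n_i, alpha=0.05):
        """Two-tailed binomial test of H0: the statistic observed at t_{i-1}
        (p_prev, e.g. the rule's previous support) still describes D_i, in which
        x_i of the n_i transactions support the rule. An alert is raised when
        H0 is rejected."""
        return binomtest(x_i, n_i, p_prev, alternative='two-sided').pvalue < alpha

    # Support drops from 8% to 50/1000 = 5%: the change is flagged as significant
    print(significant_change(0.08, 50, 1000))   # True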

4. VISUALIZATION BASED METHOD


4.1. Visualizing Similar Rules. This method is used to visualize rules that have
similar behaviour to a rule r. A commonly used method for searching similarity is to
map objects into points in a high dimensional space, so that each point is represented
as a series of values. Similarity of two objects is defined as the distance between their
respective values.
In [6], the Euclidean distance is used as the distance function between two series. The distance d(X, Y) between two series X = ⟨x1, x2, . . . , xn⟩ and Y = ⟨y1, y2, . . . , yn⟩ is
$$ d(X, Y) = \sqrt{\sum_{i=1}^{n} (x_i - y_i)^2}, $$

where xi and yi are the values of X and Y in the i-th time period, respectively. Given a similarity threshold ε, if this distance is below ε, we say that the two series are similar.
The parameter ε is a distance parameter that controls when two series should be considered similar. It can be either user-defined or determined automatically. For a rule r with n time periods, ε is calculated as
$$ \varepsilon = \frac{\sum_{i=1}^{n-1} |z_{i+1} - z_i|}{n-1}, $$
where zi is the support (confidence) value of r at period i. The bigger the value of ε, the more rules will be included as similar rules with respect to r.
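The two formulas above translate directly into code; the sketch below assumes each rule's history is a plain list of per-period support (or confidence) values.

    from math import sqrt

    def euclidean_distance(x, y):
        """Distance between two equal-length series of support (or confidence) values."""
        return sqrt(sum((xi - yi) ** 2 for xi, yi in zip(x, y)))

    def auto_epsilon(z):
        """Automatically determined threshold for a reference rule r: the mean
        absolute change of its series between consecutive periods."""
        return sum(abs(z[i + 1] - z[i]) for i in range(len(z) - 1)) / (len(z) - 1)

    def similar_rules(reference, candidates):
        """Rules whose series lies within epsilon of the reference series;
        `candidates` maps rule identifiers to series (a hypothetical structure)."""
        eps = auto_epsilon(reference)
        return [rule for rule, series in candidates.items()
                if euclidean_distance(reference, series) <= eps]

    r_series = [0.30, 0.32, 0.31, 0.33]
    print(similar_rules(r_series, {"r1": [0.31, 0.32, 0.30, 0.33],
                                   "r2": [0.10, 0.40, 0.15, 0.45]}))   # ['r1']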

4.2. Visualizing Neighbour Rules. This method is used to visualize the neighbour
rules of a rule r. A rule r1 : lhs1 → rhs1 is a neighbour of a rule r2 : lhs2 → rhs2 if the following two conditions are met:
(i). rhs1 = rhs2
(ii). lhs1 ⊇ lhs2 or lhs1 ⊆ lhs2
As an example, take a rule r : A, B → D; then the rules r1 : A → D, r2 : B → D, and r3 : A, B, C → D are neighbours of r. But r4 : A, C → D is not, because {A, B} ⊉ {A, C} and {A, B} ⊈ {A, C}.
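A small helper that checks the two neighbour conditions, assuming rules are represented as pairs of frozensets; the example reproduces the case discussed above.

    def is_neighbour(rule1, rule2):
        """Neighbour test: same consequent and one antecedent contains the other.
        Rules are (frozenset_of_lhs_items, frozenset_of_rhs_items) pairs."""
        lhs1, rhs1 = rule1
        lhs2, rhs2 = rule2
        return rhs1 == rhs2 and (lhs1 <= lhs2 or lhs2 <= lhs1)

    r  = (frozenset("AB"), frozenset("D"))      # r : A, B -> D
    r1 = (frozenset("A"),  frozenset("D"))
    r4 = (frozenset("AC"), frozenset("D"))
    print(is_neighbour(r, r1), is_neighbour(r, r4))   # True False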

4.3. Visualizing a Permanent Rule. A permanent rule is a rule that is permanently supported by the data. All rules produced from the first partition D1 are potential candidates for permanent rules and can be monitored. In real applications the number of rules may be huge. However, it is shown in [18] that the number of permanent rules decreases relatively fast with the number of periods.
It may also be possible that there are no more permanent rules at one point in
time. In order to avoid this problem, the approach used in [18] is as follows. Depending
on the number of initially discovered rules, rules that disappear from the rule base in
subsequent periods could be kept in memory until their support values violate a given
threshold, or they are absent for more than a given number of periods. For example, a
rule would be removed from memory if it appears in less than 50% of the periods, or if it
is absent for more than three periods. Depending on the number of occurrences different
rule groups could be defined (e.g., 100%, 90%, 75%, and 50% rules). For example, using
these groups, a rule change could be considered interesting if it leads to a group change.

5. MONITORING RULE BEHAVIOR OVER TWO DATASETS


5.1. Detecting Emerging Patterns. Emerging patterns (EPs) are defined as itemsets whose support increases significantly from one dataset, D1, to another, D2. EPs can also be defined as itemsets whose ratio of the two supports is larger than a given threshold ρ. EPs can be large in size and may have very small support. Since the Apriori property no longer holds for EPs and there are too many candidates, naive algorithms such as the Apriori algorithm [23] are not efficient for discovering EPs. Therefore, an efficient algorithm for discovering EPs was proposed in [9].
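For illustration only, the naive ratio test behind the definition can be written as below; this enumerates candidate itemsets explicitly, which is exactly what the border-based algorithm of [9] is designed to avoid.

    def emerging_patterns(supports_d1, supports_d2, rho):
        """Itemsets whose support grows from D1 to D2 by a factor larger than rho.
        Both arguments map itemsets to support values; itemsets missing from D1
        are treated as having support 0 (infinite growth ratio)."""
        result = []
        for itemset, s2 in supports_d2.items():
            s1 = supports_d1.get(itemset, 0.0)
            if (s1 == 0.0 and s2 > 0.0) or (s1 > 0.0 and s2 / s1 > rho):
                result.append(itemset)
        return result

    d1 = {frozenset("ab"): 0.02, frozenset("c"): 0.10}
    d2 = {frozenset("ab"): 0.08, frozenset("c"): 0.11, frozenset("de"): 0.05}
    print(emerging_patterns(d1, d2, rho=2.0))   # {'a','b'} and {'d','e'} are EPs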

5.2. Detecting Fundamental Rule Changes. Fundamental rule changes are defined as changes that cannot be explained by other changes (a formal definition is given below). To detect fundamental rule changes, two techniques are used: quantitative analysis and qualitative analysis. Quantitative analysis measures the magnitude of a change, while qualitative analysis finds the direction of a change. We describe each analysis by focusing our discussion on changes of a rule's support; changes of a rule's confidence can be handled in the same way. This method considers rules r of the form r : a1 , a2 , . . . , an → y.

5.2.1. Quantitative Analysis. In order to perform this analysis, we need to calculate the
expected support of a rule r in t2 , which is defined as follows:
1. If r is a 1-condition rule, its expected support in t2 is its support in t1
2. If r is a k-condition rule (k > 1) of the form r : a1 , a2 , . . . , an → y, then r can be considered as a combination of two rules, a 1-condition rule rone and a (k−1)-condition rule rrest, where
rone : ai → y    and    rrest : a1 , a2 , . . . , aj → y,
and {a1 , a2 , . . . , aj} = {a1 , a2 , . . . , an} − {ai}. Let supt(x) be the support of a rule x at a time period t. The expected supports of r in t2 with respect to rone and rrest are
$$ E_{r_{one}}(\sup{}_{t_2}(r)) = \min\left( \frac{\sup{}_{t_1}(r)}{\sup{}_{t_1}(r_{one})} \times \sup{}_{t_2}(r_{one}),\; 1 \right) \qquad (1) $$
$$ E_{r_{rest}}(\sup{}_{t_2}(r)) = \min\left( \frac{\sup{}_{t_1}(r)}{\sup{}_{t_1}(r_{rest})} \times \sup{}_{t_2}(r_{rest}),\; 1 \right) \qquad (2) $$
After we know the expected support of a rule r in t2 , we can check if the change
in support of a rule r from t1 to t2 is fundamental or not. The change is fundamental
if:
1. r is a 1-condition rule and its support is significantly different from its expected
support, or
2. r is a k-condition rule (k > 1), and Erone (supt2 (r)), Errest (supt2 (r)) and supt2 (r)
are significantly different, for all rone and rrest combinations.
Then, the χ2 test is used to check whether the support is significantly different from the expected support. That is, if r is a 1-condition rule, the difference is significant if we reject the null hypothesis Ho : E(supt2(r)) = supt2(r) at significance level α. If r is a k-condition rule, the difference is significant if we reject the null hypothesis Ho : Erone(supt2(r)) = Errest(supt2(r)) = supt2(r) at significance level α, for all rone and rrest combinations.
As an example, consider the example shown in table 2 (adopted from [10]). Since
r1 is a 1-condition rule, to test if the change in support of r1 is fundamental, we have
null hypothesis Ho : E(supt2 (r1 )) = supt2 (r1 ), where the E(supt2 (r1 )) = supt1 (r1 ) =

Rule Support at t1 Support at t2


r1 : a → y 5.2% 4.4%
r2 : b → y 6.0% 4.3%
r3 : a, b → y 2.1% 4.2%
Table 2. Rule support at two time periods

E(supt2 (r1 )) supt2 (r1 )


satisfy r1 52 44
do not satisfy r1 948 956
Table 3. The 2x2 contingency table of r1 support

Figure 1. Explainable change combinations used in qualitative analysis

0.052. Assuming that the size of the dataset at each period is 1000 tuples, we obtain the 2 × 2 contingency table in which each cell contains the observed frequency of tuples that satisfy r1 and that do not satisfy r1, for each time period, as shown in Table 3. From this table, we can compute the chi-square value, which is equal to 0.7. Using a significance level of 5% and 1 degree of freedom, the critical value is 3.84. Since the chi-square value is smaller than the critical value, we do not reject Ho and conclude that the support of r1 is not significantly different1. This means that r1 does not show a fundamental change in support.
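The chi-square computation of this example can be reproduced as follows (SciPy is assumed; Yates' continuity correction is disabled so that the statistic matches the value of 0.7 quoted above).

    from scipy.stats import chi2, chi2_contingency

    # Contingency table of Table 3: expected vs. observed support counts of r1
    # out of 1000 tuples per period (52 expected, 44 observed).
    table = [[52, 44],
             [948, 956]]
    stat, p, dof, _ = chi2_contingency(table, correction=False)
    critical = chi2.ppf(0.95, dof)              # 3.84 for 1 degree of freedom
    print(round(stat, 1), round(critical, 2), stat < critical)   # 0.7 3.84 True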

5.2.2. Qualitative Analysis. In qualitative analysis the magnitude of change is ignored and only the direction of change is considered, i.e., increase, drop or noChange from t1 to t2. Given a rule r, Definition 5.1 is used to determine whether the change of its support or confidence is fundamental or not.
Definition 5.1. Fundamental rule change in support/confidence via qualitative analysis. The support (or confidence) change in a rule r from t1 to t2, where r is a k-condition rule (k > 1), is said to be a fundamental support (or confidence) change if, for all rone and rrest combinations, the directions of the support (or confidence) changes of rone, rrest and r in time period t2 do not belong to any of the 7 cases in Figure 1.

1 The computation is performed using a chi-square calculator, which is available on the web at http://schnoodles.com/cgi-bin/web_chi.cgi

6. CONCLUSION
All of the proposed methods, whether statistical- or visualization-based, consider only the statistical properties of a rule, i.e., its support or confidence. In [7], a first step toward an integrated treatment of the two aspects of a rule, its content and its statistics, has been made by proposing the GRM, which models both the content and the statistics of a rule as a temporal object. In our next research, we will combine statistical and visualization methods for observing the evolution of temporal association rules generated from interval sequence data.

References
[1] Hembold, H.M., Long, P.M.: Tracking drifting concepts by minimizing disagreements. Machine
Learning 114 (1996) 27–45
[2] Hickey, R.J., Black, M.M.: Refined time stamps for concept drift detection during mining for
classification rules. In Roddick, J.F., Hornsby, K., eds.: Proceedings of the 1st International
Workshop, TSDM 2000. Volume 2007 of LNAI., Lyon, France, Springer (2000) 20–30
[3] Tansel, A.U., Ayan, N.F.: Discovery of association rules in temporal databases. In: Proceedings
of the 4th International Conference on Knowledge Discovery and Data Mining, Distributed Data
Mining Workshop, New York City, New York, USA (1998)
[4] Agrawal, R., Psaila, G.: Active data mining. In Fayyad, U.M., Uthurusamy, R., eds.: Proceed-
ings of the First International Conference on Knowledge Discovery and Data Mining (KDD’95),
Montréal, Québec, Canada (1995) 3–8
[5] Liu, B., Ma, Y., Lee, R.: Analyzing the interestingness of association rules from the temporal
dimension. In: Proceedings of IEEE International Conference on Data Mining (ICDM’01), Silicon
Valley, USA (2001) 377–384
[6] Zhao, K., Liu, B.: Visual analysis of the behavior of discovered rules. In: Workshop Notes in
ACM-SIGMOD 2001 Workshop on Visual Data Mining, San Fransisco, CA, USA (2001)
[7] Baron, S., Spiliopoulou, M.: Monitoring change in mining results. In Kambayashi, Y., Winiwarter,
W., Arikawa, M., eds.: Proceedings of the 3rd International Conference on Data Warehousing and
Knowledge Discovery (DaWak’01). Volume 2114 of LNCS., Munich, Germany, Springer (2001)
51–60
[8] Ganti, V., Gehrke, J., Ramakrishnan, R.: A framework for measuring changes in data character-
istics. In: Proceedings of the 18th ACM SIGACT-SIGMOD-SIGART Symposium on Principles of
Database Systems, Philadelphia, Pennsylvania (1999) 126–137
[9] Dong, G., Li, J.: Efficient mining of emerging patterns: Discovering trends and differences. In:
Knowledge Discovery and Data Mining. (1999) 43–52
[10] Liu, B., Hsu, W., Ma, Y.: Discovering the set of fundamental rule changes. In: Proceedings of
the 7th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San
Francisco, CA, USA (2001) 335–340
[11] Cheung, D.W.L., Ng, V., Tam, B.W.: Maintenance of discovered knowledge: A case in multi-level
association rules. In: Knowledge Discovery and Data Mining. (1996) 307–310
[12] Cheung, D.W.L., Lee, S.D., Kao, B.: A general incremental technique for maintaining discovered
association rules. In: Database Systems for Advanced Applications. (1997) 185–194
[13] Ayan, N.F., Tansel, A.U., Arkun, M.E.: An efficient algorithm to update large itemsets with
early pruning. In: Proceedings of the 5th ACM SIGKDD International Conference on Knowledge
Discovery and Data Mining, San Diego, CA, USA (1999) 287–291
[14] Omiecinski, E., Savasere, A.: Efficient mining of association rules in large dynamic databases.
In Embury, S.M., Fiddian, N.J., Gray, W.A., Jones, A.C., eds.: Proceedings of the 16th British
National Conference on Databases (BNCOD’98). Volume 1405 of LNCS., Cardiff, Wales, U.K.,
Springer (1998) 49–63

[15] Ganti, V., Gehrke, J., Ramakrishnan, R.: DEMON: Mining and monitoring evolving data. In:
Proceedings of the 16th International Conference on Data Engineering (ICDE’00), San Diego,
CA, USA, IEEE Computer Society Press (2000) 439–448
[16] Saraee, M.H., Theodoulidis, B.: Knowledge discovery in temporal databases. In: IEE Colloquium
on ’Knowledge Discovery in Databases’, IEE, London (1995) 1–4
[17] Baron, S., Spiliopoulou, M.: Monitoring the evolution of web usage patterns. In: Proceedings of
the First European Web Mining Forum (EMWF 2003), Cavtat-Dubrovnik, Croatia (2003) 181–200
[18] Baron, S., Spiliopoulou, M., Günther, O.: Efficient monitoring of patterns in data mining envi-
ronments. In Kalinichenko, L.A., Manthey, R., Thalheim, B., Wloka, U., eds.: Proceedings of the
7th East European Conference on Advances in Databases and Information Systems (ADBIS’03).
Volume 2798 of LNCS., Dresden, Germany, Springer (2003) 253–265
[19] Toivonen, H., Klemettinen, M., Ronkainen, P., Hatonen, K., Mannila, H.: Pruning and grouping
of discovered association rules. In: ECML-95 Workshop on Statistics, Machine Learning, and
Knowledge Discovery in Databases, Heraklion, Greece (1995) 47–52
[20] Liu, B., Hsu, W., Ma, Y.: Pruning and summarizing the discovered associations. In: Proceedings
of the Fifth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining,
San Diego, CA, USA (1999) 125–134
[21] Klemettinen, M., Mannila, H., Ronkainen, P., Toivonen, H., Verkamo, A.: Finding interesting
rules from large sets of discovered association rules. In Adam, N., Bhargava, B., Yesha, Y., eds.:
Proceedings of the 3rd International Conference on Information and Knowledge Management,
Gaithersburg, Maryland, ACM Press (1994) 401–407
[22] Koundourakis, G., Theodoulidis, B.: Association rules and evolution in time. In: Proceedings
of Methods and Applications of Artificial Intelligence, Second Hellenic Conference on AI, SETN
2002, Thessaloniki, Greece (2002) 261–272
[23] Agrawal, R., Srikant, R.: Fast algorithms for mining association rules. In: Proceedings of the 20th
International Conference on Very Large Data Bases. (1994) 487–499

Ani Dijah Rahajoe


Faculty of Informatic Engineering,
Universitas Bhayangkara, Surabaya.
e-mail: [email protected]

Edi Winarko
Faculty of Mathematics and Natural Sciences,
Universitas Gadjah Mada, Jogjakarta.
e-mail: [email protected]
Proceedings of ”The 6th SEAMS-UGM Conference 2011”
Computer, Graph and Combinatorics, pp.601–614.

A Comparison Framework for


Fingerprint Recognition Methods

Ary Noviyanto and Reza Pulungan

Abstract. Many fingerprint recognition methods have been proposed and the need arises
for a methodology to compare these methods, in order to be able to decide whether a
particular method is better than another. In this paper, we report on our effort to
develop a methodology to compare the robustness of fingerprint recognition methods. As
a case study, we apply this methodology to compare two recent fingerprint recognition
algorithms proposed by Chikkerur (2005) and Wibowo (2006). We are able to conclude
that, overall, Chikkerur’s algorithm performs better than Wibowo’s.
Keywords and Phrases: Fingerprint recognition methods, comparison framework.

1. INTRODUCTION
The need for biometrics that can be used to recognize people based on their bodily characteristics has existed for a long time. Biometric recognition is associated with identification (“Who is X?”) and verification (“Is this X?”) [13]. Alphonse Bertillon, chief of the criminal identification division of the police department in Paris, conceived the idea that body measurements can be used to identify criminals, and this changed major law enforcement departments in the mid-19th century [3].
Not all body measurements are eligible to be a biometric. Human fingerprint,
which has been used for authentication purposes for more than 100 years [3, 4, 7], is
one of the most well-known biometrics. Fingerprints can be a biometric because they
have characteristics that are feasible to measure, distinct, permanent, accurate, reliable,
and acceptable [7]. There are three levels of fingerprints’ features that can be used in
recognition processes [5]:
(1) Global level: the ridge flows of fingerprints create particular patterns, such as
shown in Figure 1.

2010 Mathematics Subject Classification: 68T10


(2) Local level: there are 150 different patterns or forms of ridges in fingerprints.
These patterns are called minutiae (see Figure 2). The most popular minutiae
are ridge endings and ridge bifurcations.
(3) Very-fine level: at this level, we look at the deeper levels of detail in the ridges.
The most important feature is the finger sweat pore, which can be observed using
a high resolution sensor (1000 dpi) (see Figure 2).


Figure 1. Global level of fingerprints’ features [5]: (A) Left-loop, (B)


Right-loop, (C) Whorl, (D) Arch, and (E) Tented-arch

Figure 2. Black solid circles are minutiae and circles with holes are sweat pores [5]

Several researchers have proposed fingerprint recognition methods (FRMs), such as Chikkerur [1] and Wibowo [12]. Methodologies and techniques to compare those FRMs are required. In this paper, we report on our effort to build a comparison framework for FRMs that considers minutiae as the distinguishing features. The comparison framework measures the quality of the methods based on their robustness. An FRM is robust if it can distinguish every input properly. The robustness of an FRM can also be analyzed from the robustness of each phase involved in the FRM. If each phase of the fingerprint recognition method is robust, we can then confidently conclude that the FRM itself is robust.

2. PRELIMINARIES
2.1. Failures in Biometric System. There are two possible errors in biometric sys-
tems [2], namely:
(1) α-error, which is a failure occurring when comparison results reject or conclude
as different, things which are the same. Hence, this is also called a false non-
match. The ratio of this failure is called false non-match rate (FNMR) or false
reject rate (FRR).
(2) β-error, which is a failure occurring when comparison results accept or conclude
as the same, things which are different. Hence, this is also called a false match.
The ratio of this failure is called false match rate (FMR) or false accept rate
(FAR).

2.2. False Non-Match (FNM) and False Match (FM). In order to define FNM and FM, we first define feature extraction, matching and the process of making conclusions. We denote a biometric sample by Sik, where i is the individual the biometric sample belongs to and k denotes the index of the successful acquisition process (different biometric samples can be acquired from the same individual). The features of every biometric sample Sik, denoted by Xik, are then extracted. The matching result, denoted by Yik,i′k′, is obtained from matching the biometric samples Sik and Si′k′. The next process is to decide whether the two biometric samples represent the same biometric. Given a threshold τ, two biometrics are not similar if Yik,i′k′ > τ, and two biometrics are similar if Yik,i′k′ ≤ τ.
We define Dii′j, a binary function that represents the j-th conclusion taken from comparing the biometrics of the i-th and i′-th individuals. If the value of Dii′j is one, then the system has made a mistake, and if the value of Dii′j is zero, then the system was right. Dii′j is defined as follows:
$$ D_{ii'j} = \begin{cases} 1 & \text{if } i = i' \text{ and } Y_{ik,ik'} > \tau,\\ 0 & \text{if } i = i' \text{ and } Y_{ik,ik'} \le \tau,\\ 0 & \text{if } i \ne i' \text{ and } Y_{ik,i'k'} > \tau,\\ 1 & \text{if } i \ne i' \text{ and } Y_{ik,i'k'} \le \tau. \end{cases} \qquad (1) $$

From Dii′j we can compute the False Match Rate (FMR) and the False Non-Match Rate (FNMR) as follows:
$$ \mathrm{FMR} = \frac{\sum_i \sum_{i' \ne i} \sum_j D_{ii'j}}{\sum_i \sum_{i' \ne i} n_{ii'}}, \qquad (2) $$
$$ \mathrm{FNMR} = \frac{\sum_i \sum_j D_{iij}}{\sum_i n_{ii}}, \qquad (3) $$
where nii′ is the number of comparisons between two individuals, and nii is the number of times an individual is compared with himself. If i = i′ then the comparison is called genuine matching, and if i ≠ i′ then the comparison is called imposter matching.
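The two rates can be computed directly from the recorded decisions; the bookkeeping format below (one triple per comparison) is an assumption made for illustration.

    def error_rates(decisions):
        """FMR and FNMR of Eqs. (2)-(3) from recorded comparison outcomes.
        `decisions` lists (i, i_prime, d) triples, d being the error indicator of Eq. (1)."""
        genuine_err = genuine_n = imposter_err = imposter_n = 0
        for i, i_prime, d in decisions:
            if i == i_prime:                    # genuine matching
                genuine_err += d
                genuine_n += 1
            else:                               # imposter matching
                imposter_err += d
                imposter_n += 1
        fmr = imposter_err / imposter_n if imposter_n else 0.0
        fnmr = genuine_err / genuine_n if genuine_n else 0.0
        return fmr, fnmr

    # Two genuine comparisons (one falsely rejected) and two imposter comparisons (none accepted)
    print(error_rates([(1, 1, 1), (1, 1, 0), (1, 2, 0), (2, 1, 0)]))   # (0.0, 0.5)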

2.3. Sensitivity and Specificity. Sensitivity and specificity are used to measure the success of an algorithm in detecting minutiae [8]. They are defined as follows:
$$ \text{Sensitivity} = 1 - \frac{\textit{missed minutiae}}{\textit{ground truth}}, \qquad (4) $$
$$ \text{Specificity} = 1 - \frac{\textit{false minutiae}}{\textit{ground truth}}, \qquad (5) $$
where missed minutiae is the number of genuine minutiae that are not detected, false minutiae is the number of false minutiae that are detected, and ground truth is the number of minutiae defined by fingerprint experts.
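Equations (4) and (5) amount to the small helper below; the example values are taken from row 1 of Table 3 later in the paper.

    def sensitivity_specificity(missed, false, ground_truth):
        """Eqs. (4)-(5). Note that specificity becomes negative whenever the
        number of false minutiae exceeds the number of ground-truth minutiae."""
        return 1.0 - missed / ground_truth, 1.0 - false / ground_truth

    # Row 1 of Table 3, Chikkerur's method: 13 missed, 21 false, 38 ground-truth minutiae
    sens, spec = sensitivity_specificity(13, 21, 38)
    print(round(100 * sens, 2), round(100 * spec, 2))   # 65.79 44.74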

2.4. Equal Error Rate (EER). The Equal Error Rate (EER) is an objective evaluation criterion for classifier performance testing. It is objective in the sense that the rejection threshold is selected independently. The EER is defined by the intersection point of the FNMR and FMR curves viewed as functions of the rejection threshold [6]. In other words, the EER is the value at which the FNMR is equal to the FMR, as shown in Figure 3.

Figure 3. The relationship between FNMR, FMR and EER
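In practice the EER can be approximated by locating the crossing of the two curves on a common grid of thresholds, as in the sketch below; the toy numbers follow the dissimilarity-score convention of Section 2.2 (FNMR falls and FMR rises as the threshold grows) and are not measured data.

    def equal_error_rate(thresholds, fnmr, fmr):
        """Approximate the EER as the point where the FNMR and FMR curves cross,
        given both rates sampled on a common grid of rejection thresholds."""
        k = min(range(len(thresholds)), key=lambda j: abs(fnmr[j] - fmr[j]))
        return thresholds[k], (fnmr[k] + fmr[k]) / 2.0

    # Toy curves: the crossing, and hence the EER, is near tau = 3
    taus = [1, 2, 3, 4, 5]
    fnmr = [0.40, 0.20, 0.10, 0.05, 0.01]
    fmr  = [0.01, 0.04, 0.11, 0.25, 0.50]
    print(equal_error_rate(taus, fnmr, fmr))    # (3, 0.105)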

2.5. Mean and Standard Deviation. The mean is used to represent the typical value of a collection of values. The mean of N data points Xi, denoted by X̄, is defined by [9]:
$$ \bar{X} = \frac{\sum_{i=1}^{N} X_i}{N}. \qquad (6) $$
Besides the mean, we need a way to measure the spread of the collection of values around its mean. The standard deviation of N data points Xi, denoted by s, is defined by [9]:
$$ s = \sqrt{\frac{\sum_{i=1}^{N} (X_i - \bar{X})^2}{N-1}}. \qquad (7) $$

3. THE COMPARISON FRAMEWORK


We divide the whole fingerprint recognition process into three parts, namely en-
hancement process, feature extraction process and matching process. To get a complete
view of the quality of two fingerprint recognition methods or more, we need to com-
pare the three parts separately. The results of the partial comparisons will inform us
about the relative quality of the given fingerprint recognition methods. We therefore de-
fine a comparison framework that contains testings for each part, namely enhancement
testing, feature extraction testing, and matching testing.

3.1. Enhancement Testing. The enhancement testing is a testing that compares only
the enhancement process of fingerprint recognition methods. The aim of this testing is to
compare the success rate of each enhancement process. Figure 4 shows how enhancement
testings are carried out. To fairly compare the quality of enhancement processes, we use
third-party software that does not contain any enhancement process whatsoever. We use
MINDTCT [11] as feature extraction software and BOZORTH3 [10] as matcher software
to perform verification. The quality of each enhancement process is then represented
by its FNMR and FMR.

Figure 4. The Flow Diagram of the Enhancement Testing

3.2. Feature Extraction Testing. The feature extraction testing is a testing that
compares only the feature extraction process of fingerprint recognition methods. The
difficulty of this testing lies in that a particular feature extraction method might be
linked with a particular enhancement method during analysis. Therefore, we need to
pass all necessary parameters from the enhancement process to the feature extraction
process, if any. We have to make sure that the image is not changed after the en-
hancement process. Figure 5 shows that the enhancement process is retained in the
testing, but filtering process that modifies the raw image is removed. With this scheme,
parameters that are required during the feature extraction process can be passed on
without changing the input image, and hence the input images before and after the
enhancement process are the same.
The result as shown in Figure 5 is a fingerprint image with additional feature
points. We then compute the values of sensitivity and specificity to determine the
performance of the feature extraction process. Ideally we need fingerprint experts to

create a standard template of the genuine fingerprint features, so that we can compute
the values of sensitivity and specificity precisely.

Figure 5. The Flow Diagram of the Feature Extraction Testing

3.3. Matching Testing. The matching testing is a testing that compares only the
matching process of fingerprint recognition methods. The difficulty of this testing lies in
the differences of features’ representation. A particular matcher might be related with a
particular features’ representation. Therefore, features’ representations are converted to
a particular format that conforms with the matchers. To compare fairly, we have to make
sure that the features are the same although they might have different representations.
In this testing, we compute mean and standard deviation of matched feature
points (minutiae) in genuine matching and imposter matching. Using the combination
of the values of the mean and standard deviation of matched minutiae, the performance of a matcher in distinguishing fingerprint images through their features can be observed. The
distance of the mean ± standard deviation between genuine matching and imposter
matching is required to determine the threshold. The greater the distance, the easier it
is to determine the threshold. An overlap of the mean ± standard deviation between
genuine matching and imposter matching—i.e., their mean ± standard deviation inter-
sect each other—means that there must have been mistakes or failures in the matching
process. If such overlap exists, the threshold cannot be determined precisely.
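The overlap criterion can be checked mechanically as below; the example values are taken from Table 4 and are only meant to illustrate the test.

    def bands_overlap(genuine, imposter):
        """`genuine` and `imposter` are (mean, std) pairs of matched-minutiae counts.
        Returns True when the mean +/- std bands intersect, i.e. when a clean
        rejection threshold cannot be chosen."""
        (gm, gs), (im, ist) = genuine, imposter
        return max(gm - gs, im - ist) <= min(gm + gs, im + ist)

    # Data set 1 of Table 4, Wibowo's matcher: the bands overlap
    print(bands_overlap((24.28, 11.73), (20.14, 10.30)))   # True
    # Data set 2, Chikkerur's matcher: the bands are well separated
    print(bands_overlap((12.58, 6.94), (3.20, 0.73)))      # False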

Figure 6. The Flow Diagram of the Matching Testing

3.4. Overall Testing. Beside the partial testings (i.e., enhancement testing, feature
extraction testing and matching testing), we also perform an overall testing that com-
pares the whole process of fingerprint recognition methods. This testing is used to
compare the evaluation results of the fingerprint recognition methods.
EER is used as the evaluation measure for this overall testing. The EER characterizes the best operating point of the FRM, namely the point at which the FMR is equal to the FNMR.

4. EXPERIMENTAL DATA
We collect our data set using a 500 dpi resolution fingerprint sensor that can
produce images of size 280 × 360 pixels. Figure 7 shows several examples of the obtained
fingerprint images. We also have a particular naming scheme: each fingerprint image is
named according to format: name.fingerprint-code.index-of-acquisition.bmp.


Figure 7. Several fingerprint images in 500 dpi resolution:


(A) agung.0.03.bmp, (B) ata.1.06.bmp (C) kartika.2.01.bmp, (D)
christ.3.03.bmp, and (E) illy.4.01.bmp

From 16 volunteers, a total of 640 fingerprint images have been collected. From
each volunteer we took 40 images; eight different images for each finger.

5. EXPERIMENTAL RESULTS
To demonstrate the use of the framework, we will use and compare the imple-
mentations of two fingerprint recognition methods based on Chikkerur’s [1] and Wi-
bowo’s [12].

5.1. Enhancement Testing Result. In this phase, we compare two enhancement


methods: STFT analysis method [1] and Gabor filter method [12]. The result of the
enhancement testing is shown in Table 1 and Table 2. Table 1 shows the comparison
of FNMR and FMR of both Chikkerur’s and Wibowo’s methods for each data set. The
values of the mean and the corresponding standard deviation of data in Table 1 are
presented in Table 2.
The values of FNMR and FMR from Table 1 are also depicted in Figure 8. From
Figure 8a, we can observe that Chikkerur’s method performs better than Wibowo’s
method; Chikkerur’s values of FNMR are less than Wibowo’s. In Figure 8b, Wibowo’s
method performs better than Chikkerur’s. However, as can be observed, the y-axis of
the graph in Figure 8b ranges only between 0 and 1 (while the y-axis of the graph in
Figure 8a ranges between 0 and 16); hence the difference in the FMR results is not

Table 1. Enhancement Testing Result

Data set   Comparison   Chikkerur's (%)   Wibowo's (%)
1 FNMR 3.57 10.71
FMR 0.03 0.07
2 FNMR 0.36 0.36
FMR 0.00 0.00
3 FNMR 5.54 7.14
FMR 0.87 0.83
4 FNMR 1.07 5.54
FMR 0.00 0.00
5 FNMR 2.14 7.14
FMR 0.00 0.00
6 FNMR 7.86 15.36
FMR 0.16 0.00
7 FNMR 0.36 0.36
FMR 0.05 0.00
8 FNMR 3.04 10.71
FMR 0.54 0.07

Table 2. Mean and STD of FNMR and FMR from Table 1

Parameter   Comparison   Chikkerur's (%)   Wibowo's (%)
Mean FNMR 2.99 7.17
STD FNMR 2.64 5.18
Mean FMR 0.21 0.12
STD FMR 0.32 0.29

as significant as that of FNMR results. Overall, we can conclude that Chikkerur’s


enhancement method performs better than Wibowo’s enhancement method.

5.2. Feature Extraction Testing Result. In the second testing, we compare two fea-
ture extraction methods: chain code based method [1] and templating based method [12].
The result of feature extraction testing is shown in Table 3. The columns of Table 3
are as follows:
(1) File Name is name of the file of the fingerprint image.
(2) Ground truth is the number of the genuine minutiae based on benchmark.
(3) Total is the total number of minutiae that can be extracted.
(4) Missed is the number of genuine minutiae that cannot be extracted.
(5) False is the number of minutiae that are extracted but not genuine.
(6) Match is the number of minutiae that are extracted and genuine.


Figure 8. Enhancement testing result: (A) the values of FNMR for


each data set, and (B) the values of FMR for each data set. The
continuous line represents Chikkerur's enhancement method, while the dashed line represents Wibowo's

(7) Sens. and Spec. are the values of sensitivity and specificity, respectively.
The associated sensitivity and specificity of the two methods from Table 3 are
shown in Figure 9. From Figure 9, we observe that both sensitivity and specificity of
Chikkerur’s method are higher than those of Wibowo’s. This means that Chikkerur’s
feature extraction method performs better than Wibowo's feature extraction method.


Figure 9. Feature extraction result: (A) the values of sensitivity, and


(B) the values of specificity. The continuous line represents Chikkerur's feature extraction method, while the dashed line represents Wibowo's

5.3. Matching Testing Result. In this testing, we compare two matching methods:
graph-based [1] and point pattern matching based on alignment methods [12]. The result
Table 3. Feature Extraction Testing Result
No   File Name   Ground truth   Total (C / W)   Missed (C / W)   False (C / W)   Match (C / W)   Sens. % (C / W)   Spec. % (C / W)
1 aaron.0.01.bmp 38 46 77 13 19 21 58 25 19 65.79 50.00 44.74 -52.63
2 arief.0.01.bmp 23 52 84 10 11 39 72 13 12 56.52 52.17 -69.57 -213.04
3 ata.0.01.bmp 24 60 96 12 17 48 89 12 7 50.00 29.17 -100.00 -270.83
4 beny.0.01.bmp 29 38 85 10 15 19 71 19 14 65.52 48.28 34.48 -144.83
5 christ.0.01.bmp 17 35 70 10 9 28 62 7 8 41.18 47.06 -64.71 -264.71
6 danu.0.01.bmp 28 53 27 8 13 33 12 20 15 71.43 53.57 -17.86 57.14
7 danu.1.08.bmp 27 41 57 13 14 27 44 14 13 51.85 48.15 0.00 -62.96
8 danu.4.07.bmp 23 48 32 10 16 35 25 13 7 56.52 30.43 -52.17 -8.70
9 firsty.0.01.bmp 27 35 82 12 15 20 70 15 12 55.56 44.44 25.93 -159.26
10 ilham.0.01.bmp 23 38 97 9 15 24 89 14 8 60.87 34.78 -4.35 -286.96
11 illy.0.01.bmp 29 44 106 13 16 28 93 16 13 55.17 44.83 3.45 -220.69
12 reza.0.01.bmp 30 33 111 15 14 18 95 15 16 50.00 53.33 40.00 -216.67
13 ria.0.01.bmp 25 54 200 9 14 38 189 16 11 64.00 44.00 -52.00 -656.00
14 riza.0.01.bmp 41 67 72 22 19 48 50 19 22 46.34 53.66 -17.07 -21.95
15 sigit.0.01.bmp 31 34 66 19 17 22 52 12 14 38.71 45.16 29.03 -67.74
16 willy.0.01.bmp 37 43 74 21 23 27 60 16 14 43.24 37.84 27.03 -62.16

Average 28.25 45.06 83.50 12.88 15.44 29.69 70.69 15.38 12.81 54.54 44.80 -10.82 -165.75
STD 6.26 9.99 38.71 4.32 3.29 9.65 39.27 4.08 4.17 9.46 7.91 44.91 167.91
Table 4. Matching Testing Result
Data set   Chikkerur's: Mean Gen., STD Gen., Mean Imp., STD Imp.   Wibowo's: Mean Gen., STD Gen., Mean Imp., STD Imp.
1 7.29 3.43 3.45 0.79 24.28 11.73 20.14 10.30
2 12.58 6.94 3.20 0.73 25.63 13.47 10.40 3.95
3 12.76 6.39 3.13 0.55 22.90 11.73 8.05 2.87
4 10.95 5.15 2.96 0.63 19.53 9.69 7.57 4.51
5 8.97 5.32 3.13 0.69 21.47 11.12 11.23 5.44
6 11.98 5.62 2.90 0.61 18.68 9.60 6.22 2.76
7 6.32 3.15 3.15 0.60 22.49 10.83 17.31 10.45
8 10.90 6.03 3.16 0.74 20.78 9.93 8.84 4.03
Avg. 10.22 5.25 3.14 0.67 21.97 11.01 11.22 5.54
Table 5. The Overall Testing Result
Data set   Chikkerur's: EER (%), FNMR/FMR, Mean Gen. (%), STD Gen. (%), Mean Imp. (%), STD Imp. (%)   Wibowo's: EER (%), FNMR/FMR, Mean Gen. (%), STD Gen. (%), Mean Imp. (%), STD Imp. (%)
1 7.60 0.20 18.54 12.65 6.48 3.26 19.51 0.43 22.74 9.84 18.24 7.42
2 10.00 0.10 28.08 15.20 7.64 1.79 19.10 0.23 38.07 18.70 15.45 5.20
3 11.30 0.05 27.91 12.90 7.95 1.87 17.60 0.18 38.83 18.13 13.66 4.51
4 14.50 0.04 38.61 14.64 9.48 2.46 18.70 0.20 37.74 17.77 13.83 6.16
5 9.60 0.14 21.40 12.77 7.60 1.86 19.77 0.30 33.42 18.17 16.27 6.73

6 13.55 0.05 32.09 13.81 9.31 2.48 17.20 0.20 40.15 19.28 13.30 5.13
7 6.95 0.28 21.47 18.08 6.26 2.27 21.38 0.40 25.62 10.86 18.39 8.46
8 12.20 0.09 31.35 16.02 8.64 2.52 18.50 0.24 36.53 18.50 14.64 5.75
Avg. 10.71 0.12 27.43 14.51 7.92 2.31 18.97 0.27 34.14 16.41 15.47 6.17
STD 2.69 0.08 1.31 0.10

of the matching testing is shown in Table 4. Table 4 shows the comparison of the mean and the standard deviation of genuine and imposter matchings. The values of the mean
and the standard deviation of both genuine and imposter matchings of both methods
are plotted in Figure 10.


Figure 10. Matching testing result: (A) Wibowo’s matching method,


and (B) Chikkerur’s matching method. The continuous line represents
genuine matching, while the dashed line represents imposter matching

From Figure 10, we observe that Wibowo’s method produces more overlaps than
Chikkerur’s method (seven overlaps compared to three). This means that Chikkerur’s
matcher has better ability in distinguishing fingerprint images based on their features
than Wibowo’s matcher.
5.4. Overall Testing Result. The overall testing result is shown in Table 5. Table 5
shows the comparison of EER with the corresponding FNMR/FMR and also the com-
parison of the mean and the standard deviation of both genuine and imposter matchings
(Mean Gen., STD Gen., Mean Imp. and STD Imp.). The values of the mean and the
standard deviation of both genuine and imposter matchings are the measure of the sim-
ilarity between two fingerprint images. The comparison of the mean and the standard
deviation of both genuine and imposter matchings is depicted in Figure 11 and the
comparison of EER with the corresponding FNMR/FMR is depicted in Figure 12.
From Figure 11, we observe that Chikkerur's method produces greater gaps be-
tween genuine matching and imposter matching than Wibowo’s method. This means
that Chikkerur’s method is able to distinguish fingerprint images better than Wibowo’s.
This result can also be confirmed in Figure 12: Chikkerur’s method produces higher
values of FNMR/FMR than Wibowo’s method.


Figure 11. Overall testing result: the similarity value of (A) Wi-
bowo’s method, and (B) Chikkerur’s method. The continuous line rep-
resents genuine matching, while the dashed line represents imposter
matching


Figure 12. Overall testing result: the value of (A) EER, and (B)
FNMR/FMR. The continuous line represents Chikkerur’s FRM, while
the dashed line represents Wibowo’s FRM

6. CONCLUDING REMARKS
Our experiments show that, overall, Chikkerur's FRM is better than Wibowo's FRM. This conclusion is based on the partial comparison results. First, Chikkerur's enhancement method performs better than Wibowo's, which is shown by the fact that Chikkerur's method has a smaller false non-match rate in the accuracy testing. Chikkerur's feature extraction method also performs better than Wibowo's, which is shown by its higher values of sensitivity and specificity. For the matching method, Chikkerur's matcher can distinguish fingerprints based on their features better than Wibowo's matcher. In addition, we estimate the classification accuracy of the whole FRM in the overall testing. In this testing, Chikkerur's method has a higher accuracy than Wibowo's. Hence, the partial testings and the overall testing bring us to the same conclusion: Chikkerur's method is better than Wibowo's.
In this paper, we have developed a framework that can be used to compare fin-
gerprint recognition methods. We have also demonstrated the use of the proposed
framework by comparing two recent methods. The experiments showed that the com-
parison framework performs well in measuring the relative quality of the two fingerprint
recognition methods. Since a fingerprint recognition method can usually be divided into
the three processes—i.e., enhancement, feature extraction and matching processes—the
proposed comparison framework provides specific and detailed information in each pro-
cess. The comparison results of each process enable us to investigate the performance
of a fingerprint recognition method in more detail. This framework provides a
basis to compare other fingerprint recognition methods.

References
[1] Chikkerur, S.S.: Online Fingerprint Verification System. Master’s thesis, State University of New
York at Buffalo, Buffalo, New York, June 2005.
[2] Dunstone, T., Yager, N.: Biometric System and Data Analysis Design, Evaluation, and Data
Mining. Springer, 2009.
[3] Jain, A.K., Ross, A., Prabhakar, S.: An introduction to biometric recognition. IEEE Transac-
tions on Circuits and Systems for Video Technology, 14(1), 4–20, 2004, http://dx.doi.org/10.1109/TCSVT.2003.818349.
[4] Komarinski, P.: Automated Fingerprint Identification Systems (AFIS). Academic Press, 2004.
[5] Maltoni, D., Maio, D., Jain, A.K., Prabhakar, S.: Handbook of Fingerprint Recognition.
Springer Publishing Company, Incorporated, 2009.
[6] Poh, N., Bengio, S.: Evidences of equal error rate reduction in biometric authentication fusion.
Idiap-RR Idiap-RR-43-2004, IDIAP, 2004.
[7] Ravi, J., Raja, K.B., Venugopal, K.R.: Fingerprint Recognition Using Minutia Score Matching.
CoRR abs/1001.4186, 2010.
[8] Sherlock, B., Monro, D., Millard, K.: Fingerprint enhancement by directional Fourier fil-
tering. IEEE Proceedings - Vision, Image, and Signal Processing, 141(2), 87–94, 1994, http://link.aip.org/link/?IVI/141/87/1.
[9] Stockburger, D.W.: Introductory statistics: Concepts, models and applications, 1998, http://business.clayton.edu/arjomand/book/sbk00.html.
[10] Watson, C.I., Garris, M.D., Tabassi, E., Wilson, C.L., Mccabe, R.M., Janet, S., Ko, K.:
User’s Guide to Export Controlled Distribution of NIST Biometric Image Software (NBIS-EC),
2007.
[11] Watson, C.I., Garris, M.D., Tabassi, E., Wilson, C.L., Mccabe, R.M., Janet, S., Ko, K.:
User’s Guide to NIST Biometric Image Software (NBIS), 2007.
[12] Wibowo, M.E.: Sistem Identifikasi Sidik Jari Berdasarkan Minutiae. Master’s thesis, Universitas
Gadjah Mada, Yogyakarta, Indonesia, Oktober 2006.
[13] Woodward, J.D., Orlans, N.M.: Biometrics. McGraw-Hill, Inc., New York, NY, USA 2002.

Ary Noviyanto
Faculty of Computer Science, Universitas Indonesia, Indonesia
e-mail: [email protected]

Most of this work was done when the first author was with Universitas Gadjah Mada.

Reza Pulungan
Department of Computer Science and Electronics
Faculty of Mathematics and Natural Sciences, Universitas Gadjah Mada, Indonesia
e-mail: [email protected]
Proceedings of ”The 6th SEAMS-UGM Conference 2011”
Computer, Graph and Combinatorics, pp. 615–620.

THE GLOBAL BEHAVIOR OF CERTAIN TURING SYSTEM

Janpou Nee

Abstract. We first extend general maximum principle of Lou and Ni [6] of elliptic
equations to parabolic equations then we show that certain Turing system has a global
attractor when the coefficient of the diffusion is closed to 1. Then we show the existence
of periodic solution due to Hopf’s bifurcation.
Keywords and Phrases: Global existence, global attractor, periodic solution.

1. INTRODUCTION
Recently, much research has engaged in the study of the instability of steady states of reaction-diffusion systems [3, 5, 7, 8] that is induced by the difference of the diffusion coefficients. Such a phenomenon is the signature of a Turing system [11]. The importance of such instability of Turing systems is that it corresponds to models of morphogenetic development in biology. Mathematically, such instability corresponds to Hopf's bifurcation.
In this paper, we will study the Brusselator equation, which is a model of an autocatalytic reaction in which one of the reactants is also a product. This model is characterized by the reactions:
    A → X,   B + X → C + Y,   2X + Y → 3X,   X → E.
Moreover, it is a model of an activation-depletion mechanism of a chemical (or biological chemical) reaction [1, 2, 5, 8] and a Turing system as well. The behavior of the solution of such a system is complicated because the system contains many parameters. To simplify the parameters of the equation, we rescale the parameters and change variables. Eventually, the Brusselator model takes the form

    u_t = D∆u + α − (1 + β)u + u²v,
    v_t = ∆v + βu − u²v,                x ∈ Ω,    (1)

2010 Mathematics Subject Classification: 35K41,35K55,


with Neumann boundary condition

    ∂u/∂n = ∂v/∂n = 0,    x ∈ ∂Ω,    (2)

and initial data

    u_0 = u(0, x),    v_0 = v(0, x).    (3)
Here u, v, α and β stand for the concentrations of the reactants. Furthermore, we assume µ(Ω) = 1, where Ω ⊂ Rⁿ is a smooth bounded domain.
In this paper, we first extend the generalized maximum principle for elliptic equations of Lou and Ni [6] to parabolic equations. By this result, we deduce that system (1) has a global attractor in C¹((0, ∞), C(Ω)) ∩ C([0, ∞), C(Ω)), and then we show the existence of a time-periodic solution and the occurrence of Hopf's bifurcation in Section 3. At the end, we will discuss the blow-up behavior induced by the diffusion coefficient D.

2. EXISTENCE OF GLOBAL ATTRACTOR


To prove the existence of a time-global solution of equation (1), we first extend the general maximum principle of Lou and Ni [6] to parabolic equations. The next two results can be found in Nee [9]. However, to keep this paper self-contained, we state and prove them below.
Proposition 2.1. Let g ∈ C(Ω̄ × (0, T) × R) and w(x, 0) ≥ 0.
(i) If w ∈ C^{2,1}(Ω × (0, T)) ∩ C^{1,0}(Ω̄ × [0, T]) satisfies

    −w_t + ∆w + g(x, t, w) ≥ 0 in E_T,    ∂w/∂n ≤ 0 on ∂Ω,    (4)

and w(x₀, t₀) = max_{Ω̄×[0,T]} w(x, t), then g(x₀, t₀, w(x₀, t₀)) ≥ 0.
(ii) If w ∈ C^{2,1}(Ω × (0, T)) ∩ C^{1,0}(Ω̄ × [0, T]) satisfies

    −w_t + ∆w + g(x, t, w) ≤ 0 in E_T,    ∂w/∂n ≥ 0 on ∂Ω,    (5)

and w(x₀, t₀) = min_{Ω̄×[0,T]} w(x, t), then g(x₀, t₀, w(x₀, t₀)) ≤ 0.

Proof. We only prove (i), since (ii) can be derived similarly. Let (x₀, t₀) be an interior point of the domain Ω × (0, T) such that w(x₀, t₀) = max_{ξ∈Ω×(0,T)} w(ξ). Then (4) implies
    g(x₀, t₀, w(x₀, t₀)) ≥ w_t(x₀, t₀) − ∆w(x₀, t₀) ≥ 0.
Now let (x₀, t₀) ∈ ∂Ω̄ × [0, T]; then we argue by contradiction (cf. [10]) and assume that g(x₀, t₀, w(x₀, t₀)) < 0, where w(x₀, t₀) = max_{ξ∈∂Ω̄×[0,T]} w(ξ). By (4), g(x₀, t₀, w(x₀, t₀)) + ∆w(x₀, t₀) ≥ w_t, so w_t < 0; thus w is larger at some earlier time. This yields a contradiction, and hence the result of (i) holds. □

With the help of maximum principle above, the following results hold.

Theorem 2.1. If u₀, v₀ are non-negative and the diffusion coefficient satisfies |D − 1| ≤ ε for some ε > 0, then (1) has a unique positive classical solution with global attractor

    α/(1 + β) ≤ u ≤ α + β(β + 1)/α,    αβ/(α² + β(β + 1)) ≤ v ≤ β(β + 1)/α.    (6)
Proof. We first assume that D = 1, and let u(p_n) = min_{ξ∈Ω̄×(0,T)} u(ξ); then, by (ii) of Proposition 2.1, we have

    0 ≥ α − (1 + β)u(p_n) + u²(p_n)v(p_n) ≥ α − (1 + β)u(p_n),

thus

    u(p_n) ≥ α/(1 + β).    (7)

Let v(q_m) = max_{ξ∈Ω×(0,T)} v(ξ); then, by (i) of Proposition 2.1 and (7), we have

    v(q_m) ≤ β(β + 1)/α.    (8)

We define w = u + v; then

    0 = −w_t + ∆w + α − u.    (9)

Let w(r_m) = max_{ξ∈Ω×(0,T)} w(ξ); then, by (i) of Proposition 2.1, u(r_m) ≤ α. Hence,

    u(x, t) ≤ w(x, t) ≤ w(r_m) = u(r_m) + v(r_m) ≤ α + β(β + 1)/α.    (10)

Let v(q_n) = min_{ξ∈Ω×(0,T)} v(ξ); then, by (ii) of Proposition 2.1, we have βu(q_n) ≤ u²(q_n)v(q_n). Hence,

    v(x, t) ≥ v(q_n) ≥ β/u(q_n) ≥ αβ/(α² + β(β + 1)).    (11)

Let L_D = diag(D∆, ∆) and consider the Banach space X = C(Ω) × C(Ω). We denote U = (u, v) and F(α, β, u, v) = F(α, β, U); then (1) may be rewritten as

    U_t = L_D U + F(α, β, U),    U₀ = (u₀, v₀) ∈ X.    (12)

Since ∆ generates a contraction semigroup on C(Ω), so does the direct sum L_D. Thus the existence of a unique local solution of equation (12) is established. By standard results on parabolic equations [4], the solution of (1) is classical.
In our proof, w = u + v is crucial. For D ≠ 1, w satisfies

    0 = −w_t + ∆w − (1 − D)∆u + α − u.

If ε is small enough and |1 − D| < ε, then our results remain true. □
The existence of a global attractor is, in fact, contrary to the nature of the first equation of system (1), which we will discuss in a later section.
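To make the size of the attracting region concrete, the following small computation (not taken from the theorem itself, just an instance of the bounds in (6)) evaluates the box for one admissible parameter choice, namely α = 1 and β = 2:

    α/(1 + β) = 1/3,    α + β(β + 1)/α = 1 + 6 = 7,
    αβ/(α² + β(β + 1)) = 2/(1 + 6) = 2/7,    β(β + 1)/α = 6,

so the attractor is contained in 1/3 ≤ u ≤ 7 and 2/7 ≤ v ≤ 6 whenever |D − 1| ≤ ε.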

3. EXISTENCE OF PERIODIC SOLUTIONS


It is well known that system (1) has a Hopf bifurcation, which occurs at β = α² + 1. Thus, to study the existence of a time-periodic solution, we define s = µt and y = µx, and we scale u′ = µu, v′ = µv. Writing t, x and u, v again instead of s, y and u′, v′, equation (1) becomes

    u_t = D∆u + α − ((1 + β)/µ)u + (1/µ³)u²v,
    v_t = ∆v + (β/µ)u − (1/µ³)u²v,                x ∈ Ω,    (13)

and a µ-periodic solution is now a 1-periodic solution satisfying u(t, x) = u(t + 1, x) and v(t, x) = v(t + 1, x).
To show the existence of a 1-periodic solution of equation (13), we again denote L_D = diag(D∆, ∆). We consider the Banach space X = C(Ω) × C(Ω), let U = (u, v), and denote F(α, β, µ, u, v) = F(α, β, µ, U); then (13) may be rewritten as

    U_t = L_D U + F(α, β, µ, U),    U(t) = U(t + 1) ∈ X,    (14)

and the solution satisfies

    U(t) = e^{tL_D} U₀ + ∫_0^t e^{(t−s)L_D} F(α, β, µ, U(s)) ds.    (15)

Theorem 3.1. If |D − 1| ≤ ε, then equation (13) has a periodic solution.

Proof. We consider the Banach spaces X^i = C^i(Ω) × C^i(Ω) and the ellipsoid E ⊂ X²,

    E = {(x, y) ∈ X : α/(1 + β) ≤ x ≤ α + µ³β(β + 1)/α,  αβ/(α² + µ³β(β + 1)) ≤ y ≤ µ³β(β + 1)/α}.

By (15), we define the Poincaré map

    P(U(0)) = U(1).    (16)

Since |D − 1| ≤ ε, Theorem 2.1 implies that equation (14) has a global attractor

    α/(1 + β) ≤ u ≤ α + µ³β(β + 1)/α,    αβ/(α² + µ³β(β + 1)) ≤ v ≤ µ³β(β + 1)/α.    (17)

Thus P(X⁰) ⊂ E ∩ X². By the Arzelà-Ascoli theorem, the Poincaré map has a fixed point and hence (1) has a periodic solution. □

4. BLOWUP OF SINGLE EQUATION


At the end of this paper, we discuss how the solution evolves according to the initial data. We consider only the first equation of system (1), as follows:

    u_t = D∆u + α − (β + 1)u + a(t)u²,    x ∈ Ω,
    ∂u/∂n = 0,    u(0, x) = u₀ > 0.    (18)
The following result can be found in [?]. Again, for the purpose of completeness, we demonstrate the proof below.
Lemma 4.1. If a(t) > v₀ ≥ 0 is a positive continuous function, then there exists some constant k such that the solution u of (18) blows up in finite time provided u₀ > k, where

    k ≥ [(1 + β − Dλ₁) + √((1 + β − Dλ₁)² − 4v₀α)] / (2v₀).
Proof. We let

    U(t) = ∫_Ω ψ(x)u(t, x) dx,    (19)

where ψ = φ₁ + k and φ₁ is the first non-constant eigenfunction of ∆ satisfying the Neumann boundary condition. Without loss of generality, we may assume ψ ≥ 0. Thus U satisfies

    dU/dt = Du + α − (1 + β)U + f(u),    (20)

where

    f(u) = ∫_Ω ψ(x)u² a(t) dx > v₀ ∫_Ω ψ(x)u² dx.    (21)

Here a(t) ≥ v₀ > 0 and

    Du = D ∫_Ω ∆u ψ(x) dx = −Dλ₁U + Dkλ₁.

By the Hölder inequality,

    ∫_Ω ψ(x)u² dx ≥ [∫_Ω ψ(x)u dx]² / [∫_Ω ψ(x) dx].    (22)

Thus

    dU/dt ≥ (α + Dkλ₁) − (Dλ₁ + β + 1)U + v₀U².

Let Ū be the solution of

    dŪ/dt = (α + Dkλ₁) − (Dλ₁ + β + 1)Ū + v₀Ū²,    (23)

then Ū satisfies

    ∫_{u₀}^{Ū} ds/F(s) = ∫_0^t dτ,    (24)

where F(s) = (α + Dkλ₁) − (Dλ₁ + β + 1)s + v₀s² and Ū(0) = u₀. Ū must blow up in finite time; otherwise the right-hand side of equation (24) would go to infinity, which contradicts the left-hand side of equation (24). The lemma is proved. □

5. CONCLUDING REMARKS
The solution of this system is rather complicated. If the condition |D − 1| ≤ ε of the theorems could be removed, this would be a major improvement. However, the blow-up result of the last section indicates that this task could be difficult.
A lot of research subjects on this system remain open, for example, the threshold of the activation and depletion. In [9], we only observed a global condition for it, global in the sense of the concentration of the reaction, but not local. It would be interesting if one could find the local condition of the threshold in any Turing system.

References
[1] K.J. Brown and F.A. Davidson, Global bifurcation in the Brusselator system, Nonlinear Anal-
ysis, 24, 1713-1725, 1995.
[2] T. Erneux and E. Reiss, Brusselator isolas, SIAM J. Appl. Math., 43, 1240-1246, 1983.
[3] P. Fife, Mathematical Aspects of Reacting and Diffusing Systems, Lecture Notes in Biomathematics 28, Springer-Verlag, Berlin, Heidelberg, New York, 1979.
M. Ghergu, Non-constant steady-state solutions for Brusselator type systems, Nonlinearity, 21, 2331-2345, 2008.
[4] Daniel Henry, Geometric Theory of Semi-linear Parabolic Equations, Lecture notes in Math.
840, Springer-Verlag, New York, 1981.
[5] T. Kolokolnikov, T. Erneux, J. Wei, Mesa-type patterns in the one-dimensional Brusselator and their stability, Physica D, 214, 63-77, 2006.
[6] Y. Lou and W.-M. Ni, Diffusion, self-diffusion and cross-diffusion, J. Diff. Eqns., 131, 79-131, 1996.
[7] J. D. Murray, Mathematical Biology, Springer-Verlag, Berlin, Heidelberg, New York 1993.
[8] G. Meinhardt, Models of Biological Pattern Formation, London Academic Press 1982.
[9] J. Nee, On Brusselator Equation, to appear.
[10] M. H. Protter and H. F. Weinberger, Maximum Principles in Differential Equations, 2nd ed., Springer-Verlag, New York, 1984.
[11] A.M. Turing, The chemical basis of morphogenesis, Phil. Trans. R. Soc. London B, 237, 37-72, 1952.

Janpou Nee
Chieng-Kuo Technology University
e-mail: [email protected]
Proceedings of ”The 6th SEAMS-UGM Conference 2011”
Computer, Graph and Combinatorics, pp. 621 - 630

LOGIC APPROACH TOWARDS FORMAL VERIFICATION OF CRYPTOGRAPHIC PROTOCOL

D. L. CRISPINA PARDEDE, MAUKAR, SULISTYO PUSPITODJATI

Abstract. A number of techniques based on logic have been developed to provide formal verification of cryptographic protocols. Some of them are based on logic of belief, others on logic of knowledge, or even combine the logic of knowledge and the logic of belief. This paper discusses two logical approaches towards the verification of security protocols. We apply the BAN logic approach and the CS logic approach to verify a security protocol. A case study incorporating the verification of an authentication protocol is presented. The results of the analysis highlight the advantages and limitations of these approaches, comparing their specifications and capabilities.

Keywords: formal method, modal logic, security protocols, authentication.

1. INTRODUCTION

A protocol is a sequence of communication and computation steps defined in order. A communication step moves messages from one principal (the sender) to another principal (the recipient), while a computation step updates the internal state of a principal. Cryptography is the science of information security based on a set of techniques that involve a transformation of clear text into an encrypted text (ciphertext). The reverse process is called decryption.
A cryptographic protocol is also referred to as a security protocol. Cryptographic functions are used to achieve security goals. Security is becoming an increasingly important issue in computing because of the extraordinary development of concurrent and distributed systems, such as databases, the world wide web, electronic mail, electronic trade, and others. In that context, information must be protected from confusion,

2010 Mathematics Subject Classification : 03B42


destruction, and disclosure. Therefore, much attention is directed to the development and use of cryptographic protocols.
In the literature, two main classes of cryptographic protocols have been proposed, namely authentication protocols and key distribution protocols. The main purpose of an authentication protocol is to allow principals to identify themselves to each other. A cryptographic key distribution protocol aims to distribute keys among the principals.
The design of cryptographic protocols is very difficult and complicated. If a protocol is not designed carefully enough, it will contain weaknesses that can be an ideal starting point for various attacks. In that context, it is not surprising that we encounter several examples of cryptographic protocols which were believed to be good and were then shown to have security flaws. It is also well known that the design of cryptographic protocols is prone to error. Formal methods seem appropriate to resolve this issue. The application of formal methods to cryptographic protocols became widespread in the early 1990s. Indeed, the search for undiscovered security flaws in cryptographic protocols encouraged the use of formal analysis techniques. This fact further stimulated research into the development of several different formal methods to detect weaknesses in protocols.
This paper discusses two logical approaches towards the verification of security protocols. We apply the BAN logic approach and the CS logic approach to verify a security protocol. A case study incorporating the verification of an authentication protocol is presented. The results of the analysis highlight the advantages and limitations of these approaches, comparing their specifications and capabilities.

2. APPROACHES TO CRYPTOGRAPHIC PROTOCOL ANALYSIS

Verification of cryptographic protocols is done through formal techniques. The variety of formal approaches for the verification of security protocols can be classified into four types, namely general-purpose tools, expert systems, algebraic approaches, and logic-based approaches. Among these, the most prominent are the logic-based techniques.
Burrows, Abadi and Needham developed a modal logic known as BAN logic [1]. The logic is built on statements about the messages sent and received via a protocol. BAN logic is the most famous of the modal logics developed for the analysis of security protocols.
BAN logic has proved to be more widely used than any other such logic. The strength of BAN logic lies in the simplicity of its rules of inference, its ease of use and its intuitive nature. BAN logic does not attempt to represent knowledge and can only work at a high abstraction level. Therefore, BAN logic can only be used to prove authentication properties; it cannot be used to prove the secrecy properties of a security protocol.
Many attempts have been made to extend BAN logic into a more effective formal method for the verification of security protocols. GNY logic is an example of such an extension. GNY logic was developed by Gong, Needham and Yahalom [2]. GNY uses the constructs that exist in BAN logic, and its inference rules add up to a total of 55. This number of inference rules makes GNY logic impractical to implement. Syverson and van Oorschot [3] combined four logics in the BAN family to obtain a more systematic approach than GNY. Those logics are GNY logic, BAN logic, the model and semantics of the BAN logic reformulation of Abadi and Tuttle [4], and the logic of belief of van Oorschot [5].

Coffey and Saidha [6] developed a logic for the verification of public-key security protocols. The logic is referred to as CS logic. CS logic is based on both logic of belief and logic of knowledge. The inference rules used are those of natural deduction. Knowledge operators and a belief operator are used in this logic.
2.1. The BAN Logic BAN logic proposed by Mike Burrows, Martin Abadi, and Roger
Needham [1] is one approach based on logic of belief. It allows the assumptions and goals of
a protocol to be stated abstractly in logic of belief. BAN logic provides notations, constructs,
inference rules, and postulates. Using BAN Logic, the analysis of a protocol is performed as
follows:
1. Idealization of the original protocol using BAN logic notations and constructs.
2. Assumptions about the initial state are written
3. Logical formulas are attached to the statements of the protocol as assertions about the
state of the system after each statement
4. The logical postulates are applied to the assumptions and the assertions in order to
discover the beliefs held by the parties in the protocol
This procedure may be repeated as new assumptions are found to be necessary and as the
idealized protocol is refined.
The BAN logic formalism is built on three kinds of objects: the principals involved in security protocols, the encryption/decryption and signing/verification keys held by the principals, and the messages exchanged between principals. The symbols P, Q, and R denote specific principals; X, Y range over statements; K ranges over (encryption) keys; K⁻¹ ranges over the corresponding decryption keys; and Np, Nq, Nc denote specific statements. All of these are used in writing a propositional logic whose conjunction is denoted by a comma. In forming a proposition, BAN logic uses several constructs (Table 1).

Table 1. Constructs in BAN Logic

P |≡ X        Principal P believes statement X.
P |~ X        Principal P once said statement X.
P |⇒ X        Principal P has jurisdiction (control) over statement X.
P ⊳ X         Principal P sees statement X.
P ←K→ Q       K is a good key for communication between principals P and Q.
{X}K          X is encrypted under key K.
#(X)          Statement X is fresh.
K↦ P          Principal P has K as a public key.

At the idealization stage, the protocol is rewritten using BAN logic notation and
constructs. Initial assumptions will be needed in the analysis and expressed using the
notation and constructs that exist. BAN logic has 19 inference rules. These rules are used in
security protocol analysis to examine whether the ultimate goal of security protocols is
achieved. Inference rules in BAN logic are:
• Message meaning (shared key):      P |≡ Q ←K→ P,  P ⊳ {X}K   ⟹   P |≡ Q |~ X
• Message meaning (shared secret):   P |≡ Q ⇌Y P,  P ⊳ ⟨X⟩Y   ⟹   P |≡ Q |~ X
• Nonce verification rule:           P |≡ #(X),  P |≡ Q |~ X   ⟹   P |≡ Q |≡ X
• Jurisdiction rule:                 P |≡ Q |⇒ X,  P |≡ Q |≡ X   ⟹   P |≡ X
• Decomposition:
    P |≡ (X, Y) ⟹ P |≡ X,    P |≡ Q |≡ (X, Y) ⟹ P |≡ Q |≡ X,    P |≡ Q |~ (X, Y) ⟹ P |≡ Q |~ X
• Decomposition in sees:
    P ⊳ (X, Y) ⟹ P ⊳ X,    P ⊳ ⟨X⟩Y ⟹ P ⊳ X,    P |≡ Q ←K→ P, P ⊳ {X}K ⟹ P ⊳ X,
    P |≡ K↦ P, P ⊳ {X}K ⟹ P ⊳ X,    P |≡ K↦ Q, P ⊳ {X}K⁻¹ ⟹ P ⊳ X
• Nonce influence:                   P |≡ #(Y) ⟹ P |≡ #(Y, X)
• Shared-key commutativity:          P |≡ R ←K→ R′ ⟹ P |≡ R′ ←K→ R,    P |≡ Q |≡ R ←K→ R′ ⟹ P |≡ Q |≡ R′ ←K→ R
• Shared-secret commutativity:       P |≡ R ⇌X R′ ⟹ P |≡ R′ ⇌X R,    P |≡ Q |≡ R ⇌X R′ ⟹ P |≡ Q |≡ R′ ⇌X R
Here P ⇌Y Q denotes that Y is a secret shared between P and Q, and ⟨X⟩Y denotes X combined with the secret Y.

2.2. The CS Logic Another logic approach towards formal verification of cryptographic
protocols is CS Logic that was proposed by Coffey and Saidha [6]. CS logic is built on logic
of belief and logic of knowledge. It provides belief operators and knowledge operators. One
knowledge operator is propositional and deals with the knowledge of statements or facts. The
other knowledge operator is a predicate and deals with the knowledge of objects (e.g.
cryptographic keys, ciphertext data, etc.). The inference rules provided are the standard
inferences required for natural deduction. The axioms of the logic are sufficiently low level
to express the fundamental properties of cryptographic protocols.

The language of the CS logic, as shown in Table 2, is used to formally express the logical postulates and protocol facts. The language includes the classical logical connectives of conjunction '∧', disjunction '∨', complementation '¬', and material implication '→'. The symbol '∀' denotes universal quantification and '∃' denotes existential quantification. Membership of a set is denoted by the symbol '∈', and set exclusion by '/'. The symbol '├' denotes a logical theorem.

Table 2. Language of CS Logic

a, b, c       Variables.
Φ             Arbitrary statement.
Σ and Ψ       Arbitrary entities.
i, j          Range over entities.
ENT           Set of all possible entities.
k             A cryptographic key.
t, t′, t″     Time.
e(x, k)       Encryption function; encryption of x using key k.
d(x, k⁻¹)     Decryption function; decryption of x using key k⁻¹.
K             Propositional knowledge operator; KΣ,t Φ means Σ knows statement Φ at time t.
L             Knowledge predicate; LΣ,t x means Σ knows and can reproduce object x at time t.
B             Belief operator; BΣ,t Φ means Σ believes at time t that statement Φ is true.
C             Contains operator; C(x, y) means that the object x contains the object y.
S             Emission operator; S(Σ, t, x) means Σ sends message x at time t.
R             Reception operator; R(Σ, t, x) means Σ receives message x at time t.

The rules of inference incorporated in CS logic are:
R1: ├ p, ├ (p → q) ⇒ ├ q
R2: (a) ├ p ⇒ ├ KΣ,t p.   (b) ├ p ⇒ ├ BΣ,t p
R3: (p ∧ q) ⇒ p
R4: p, q ⇒ (p ∧ q)
R5: p ⇒ (p ∨ q)
R6: ¬(¬p) ⇒ p
R7: (p ⇒ q) ⇒ (p → q)
In the CS logic, the postulates that will be used in the analysis of security protocols are defined as axioms. The set of axioms is divided into two groups, namely the logical axioms and the non-logical axioms:

• Logical Axioms
A1. (a) ∀t ∀p ∀q (KΣ,t p ∧ KΣ,t(p → q) → KΣ,t q)
    (b) ∀t ∀p ∀q (BΣ,t p ∧ BΣ,t(p → q) → BΣ,t q)
A2. ∀t ∀p (KΣ,t p → p)
A3. (a) ∀t ∀x ∀i, i ∈ {ENT} (Li,t x → ∀t′, t′ ≥ t, Li,t′ x)
    (b) ∀t ∀x ∀i, i ∈ {ENT} (Ki,t x → ∀t′, t′ ≥ t, Ki,t′ x)
    (c) ∀t ∀x ∀i, i ∈ {ENT} (Bi,t x → ∀t′, t′ ≥ t, Bi,t′ x)
A4. ∀t ∀x ∀y (∃i, i ∈ {ENT}, Li,t y ∧ C(y, x) → ∃j, j ∈ {ENT}, Lj,t x)

• Non-logical Axioms
A5. ∀t ∀x (S(Σ, t, x) → LΣ,t x ∧ ∃i, i ∈ {ENT/Σ}, ∃t′, t′ > t, R(i, t′, x))
A6. ∀t ∀x (R(Σ, t, x) → LΣ,t x ∧ ∃i, i ∈ {ENT/Σ}, ∃t′, t′ < t, S(i, t′, x))
A7. (a) ∀t ∀x ∀i, i ∈ {ENT} (Li,t x ∧ Li,t k → Li,t(e(x, k)))
    (b) ∀t ∀x ∀i, i ∈ {ENT} (Li,t x ∧ Li,t k⁻¹ → Li,t(d(x, k⁻¹)))
A8. (a) ∀t ∀x ∀i, i ∈ {ENT} (¬Li,t k ∧ ∀t′, t′ < t, ¬Li,t′(e(x, k)) ∧ ¬(∃y (R(i, t, y) ∧ C(y, e(x, k)))) → ¬Li,t(e(x, k)))
    (b) ∀t ∀x ∀i, i ∈ {ENT} (¬Li,t k⁻¹ ∧ ∀t′, t′ < t, ¬Li,t′(d(x, k⁻¹)) ∧ ¬(∃y (R(i, t, y) ∧ C(y, d(x, k⁻¹)))) → ¬Li,t(d(x, k⁻¹)))
A9. ∀t (∃i, i ∈ {ENT}, Li,t k⁻¹ ∧ ¬∃j, j ∈ {ENT/i}, Lj,t k⁻¹)
A10. ∀t ∀x (∃i, i ∈ {ENT}, Li,t d(x, k⁻¹) → Li,t x)

The verification process using CS logic involves the following steps:


1. Formalisation of the protocol messages,
2. Specification of the initial assumption,
3. Specification of the protocol goals,
4. Application of the axioms and inference rules.
The objective of the verification process is to establish whether the desired goals of the protocol can be derived from the initial assumptions and protocol steps. If such a derivation exists, the protocol is successfully verified; otherwise the verification fails.

3. NEEDHAM-SCHROEDER SHARED KEY PROTOCOL

The Needham-Schroeder shared key (NSSK) protocol was proposed by Roger M. Needham and Michael D. Schroeder [7]. This protocol assumes a shared-key setting and allows a principal A to establish a session with another principal B. An authentication server S generates a fresh session key Kab and distributes it to A and B.
    Message 1   A → S : A, B, Na.
    Message 2   S → A : {Na, B, Kab, {Kab, A}Kbs}Kas.
    Message 3   A → B : {Kab, A}Kbs.
    Message 4   B → A : {Nb}Kab.
    Message 5   A → B : {Nb − 1}Kab.
In the protocol run, only A makes contact with the server, which provides A with the session key Kab and a certificate encrypted with B's key conveying the session key and A's identity to B. B then decrypts the certificate and carries out a nonce handshake with A to be assured that A is currently present. The use of Nb − 1 in the last message is conventional. Subtraction is used to indicate that the origin of the message is A, rather than B.

4. PROTOCOL VERIFICATION

4.1. Verification Using BAN Logic  The original NSSK protocol without idealization has been presented in the previous section. The corresponding idealized protocol, using BAN logic notations and constructs, is as follows:
    Message 2   S → A : {Na, (A ←Kab→ B), #(A ←Kab→ B), {A ←Kab→ B}Kbs}Kas
    Message 3   A → B : {A ←Kab→ B}Kbs
    Message 4   B → A : {Nb, (A ←Kab→ B)}Kab from B
    Message 5   A → B : {Nb, (A ←Kab→ B)}Kab from A

The initial assumptions are
    A |≡ A ←Kas→ S ;    B |≡ B ←Kbs→ S ;
    S |≡ A ←Kas→ S ;    S |≡ B ←Kbs→ S ;    S |≡ A ←Kab→ B ;
    A |≡ (S |⇒ A ←K→ B) ;    B |≡ (S |⇒ A ←K→ B) ;    A |≡ (S |⇒ #(A ←K→ B)) ;
    A |≡ #(Na) ;    B |≡ #(Nb) ;
    S |≡ #(A ←Kab→ B) ;    B |≡ #(A ←Kab→ B).

Once all the assumptions have been written, verification is conducted to prove that some formulas hold as conclusions. These conclusions describe the goals of authentication protocols. The authentication is complete between A and B if there is a K such that the following beliefs are attained:
    A |≡ A ←Kab→ B ;    B |≡ A ←Kab→ B ;    A |≡ B |≡ A ←Kab→ B ;    B |≡ A |≡ A ←Kab→ B.

From Message 2 we obtain
    A ⊳ {Na, (A ←Kab→ B), #(A ←Kab→ B), {A ←Kab→ B}Kbs}Kas,
which is then decrypted by A, since A |≡ A ←Kas→ S. Since A |≡ #(Na), the nonce verification rule can be applied, and it leads us to A |≡ S |≡ A ←Kab→ B and A |≡ S |≡ #(A ←Kab→ B). Applying the jurisdiction rule infers A |≡ A ←Kab→ B and A |≡ #(A ←Kab→ B).
Partially, A ⊳ {A ←Kab→ B}Kbs, and A can send the message {A ←Kab→ B}Kbs to B, who is then able to decrypt the message since B |≡ B ←Kbs→ S. By applying the appropriate message meaning postulate, we obtain B |≡ S |~ A ←Kab→ B, and then by the nonce verification and the jurisdiction postulates, we immediately obtain B |≡ A ←Kab→ B.
In Message 4, B sends {Nb, (A ←Kab→ B)}Kab to A, who is also in possession of the key Kab. A can then deduce that B believes in the key, A |≡ B |≡ A ←Kab→ B. In Message 5, A replies similarly, and then B can deduce that A also believes in the key, B |≡ A |≡ A ←Kab→ B.

The proof leads us to the following beliefs:
    A |≡ A ←Kab→ B ;    B |≡ A ←Kab→ B ;    A |≡ B |≡ A ←Kab→ B ;    B |≡ A |≡ A ←Kab→ B.
This result shows that the NSSK protocol attained its objective. The verification using BAN logic proved that there is no flaw in the protocol.

4.2. Protocol Verification Using CS Logic  The first step of the verification procedure is the formalization of the protocol using CS logic notations. The NSSK protocol is then rewritten into
    Message 2   KA,t2(R(A, t2, {Na, B, Kab, {Kab, A}Kbs}Kas))
    Message 3   KB,t3(R(B, t3, {Kab, A}Kbs))
    Message 4   KA,t4(R(A, t4, {Nb}Kab))
    Message 5   KB,t5(R(B, t5, {Nb}Kab)).
The goals of this protocol are specified as follows:
    Goal 1 (Message 2) : KA,t2(∃t, t0 < t < t2, S(S, t, {Na, B, Kab, {Kab, A}Kbs}Kas))
    Goal 2 (Message 3) : KB,t3(∃t, t2 < t < t3, S(A, t, {Kab, A}Kbs))
    Goal 3 (Message 4) : KA,t4(∃t, t3 < t < t4, S(B, t, {Nb}Kab))
    Goal 4 (Message 5) : KB,t5(∃t, t4 < t < t5, S(A, t, {Nb}Kab)).
The initial assumptions are
    LA,t0(Kas) ;   LB,t0(Kbs) ;   LS,t0(Kas) ;   LS,t0(Kbs) ;
    KA,t0(∀i, i ∈ {ENT}, ∀t′, t′ < t0, ¬Li,t′(Na)) ;
    KB,t0(∀i, i ∈ {ENT}, ∀t′, t′ < t0, ¬Li,t′(Nb)).
Message 2 of the protocol is analyzed in order to determine whether Goal 1 can be derived using the axioms and the inference rules of the CS logic. By using axiom A6 and inference rule R3 on Message 2, KA,t2(R(A, t2, {Na, B, Kab, {Kab, A}Kbs}Kas)), we obtain ∃i, i ∈ {ENT/A}, ∃t′, t′ < t2, S(i, t′, {Na, B, Kab, {Kab, A}Kbs}Kas). By A9, only S knows the key Kas, so that KA,t2(∃t′, t′ < t2, S(S, t′, {Na, B, Kab, {Kab, A}Kbs}Kas)). This can be compared to Goal 1, except that the time range is not restricted to being after t0. This shows that there is a flaw in the NSSK protocol.

5. BAN LOGIC VS CS LOGIC

BAN logic and CS logic are two logic approaches for the formal verification of cryptographic security protocols. BAN logic is built on logic of belief, while CS logic is built on logic of belief and logic of knowledge. The analysis of a protocol using BAN logic consists of four stages: idealization; definition of assumptions about the initial states; attachment of logical formulas to the statements of the protocol; and application of the logical postulates to the assumptions and the assertions in order to discover the beliefs held by the parties in the protocol. The verification process using CS logic involves four steps: formalisation of the protocol messages; specification of the initial assumptions; specification of the protocol goals; and application of the axioms and inference rules. The objective of the verification is to prove whether the desired goals of the protocol can be derived from the initial assumptions and protocol steps. The difference between BAN logic and CS logic is that the latter employs
timelines in the verification. The time range of every step in the protocol can be verified using CS logic. The verification of the NSSK protocol using CS logic shows that the protocol has a flaw, while BAN logic failed to show that the protocol has a flaw.

6. CONCLUSION

This paper discusses two logic-based approaches for verifying cryptographic protocols, the BAN logic and the CS logic. BAN logic analyzes the protocol one message at a time. CS logic analyzes a protocol step to verify that the step reaches its goal. CS logic allows the analysis of each step independently, while BAN logic analyzes the protocol steps in order to determine if the ultimate goal is reached.
A case study incorporating the verification of an authentication protocol was presented in this paper. We showed the application of BAN logic and CS logic to the Needham-Schroeder shared key protocol. CS logic verifies that there is a flaw in the protocol in question, while BAN logic reports that the protocol has no flaw. This fact shows the weakness of BAN logic in the verification of cryptographic protocols, yet it is widely used because of its simplicity.

References

[1]. BURROWS, M., ABADI, M., AND NEEDHAM, R., A Logic of Authentication, DEC System Research Centre
Report, 39, February 1989.
[2]. GONG, L., NEEDHAM, R., AND YAHALOM, R., Reasoning about Belief in Cryptographic Protocols, Proceedings
1990 IEEE Symposium on Research in Security and Privacy, IEEE Computer Society Press, 234-248. 1990.
[3]. SYVERSON, P. F. AND VAN OORSCHOT, P. C., On Unifying Some Cryptographic Protocol Logics, IEEE
Symposium on Research in Security and Privacy, 14-28, 1990.
[4]. ABADI, M., AND TUTTLE, M., A Semantics for A Logic of Authentication, Proceedings of the ACM Symposium
of Principles of Distributed Computing, ACM Press, 201-216, 1991.
[5]. VAN OORSCHOT, P. C., Extending Cryptographic Logics of Belief to Key Agreement Protocols, Proceedings of
The 1st ACM Conference on Communications and Computer Security, 1993.
[6]. COFFEY, T., AND SAIDHA, P. Logic for Verifying Public-Key Cryptographic Protocols, IEEE Proceedings
Online, 1996.
[7]. NEEDHAM, R. M., AND SCHROEDER, M. D., Using Encryption for Authentication in Large Networks of
Computers. Commun. ACM., 1978, 21 (12), pp.993-999.
[8]. LAL, S., JAIN, M., AND CHAPLOT, V., Approaches to Formal Verification of Security Protocols. http://arxiv.org/ftp/arxiv/papers/1101/1101.1815.pdf.

D. L. CRISPINA PARDEDE
Gunadarma University
e-mail: [email protected]

MAUKAR
Gunadarma University
e-mail: [email protected]

SULISTYO PUSPITODJATI
Gunadarma University
e-mail: [email protected]
Proceedings of ”The 6th SEAMS-UGM Conference 2011”
Computer, Graph and Combinatorics, pp. 631–646.

A FRAMEWORK FOR AN LTS SEMANTICS FOR PROMELA

Suprapto And Reza Pulungan

Abstract. A high-level specification language PROMELA can be used not only to model
interactions that occur in distributed or reactive systems, but also to express requirements
of logical correctness about those interactions. Several approaches to a formal semantics
for PROMELA have been presented, ranging from the less complete formal semantics to
the more complete ones. This paper presents a significantly different approach to provide
a formal semantics for PROMELA model, namely by an operational semantics given as
a set of Structured Operational Semantics (SOS) rules. The operational semantics of a
PROMELA statement with variables and channels is given by a program graph. The
program graphs for the processes of a PROMELA model constitute a channel system.
Finally, the transition system semantics for channel systems yields a transition system
that formalizes the stepwise behavior of the PROMELA model.

Keywords and Phrases: PROMELA, formal semantics, SOS rules, program graphs, chan-
nel systems, transition systems.

1. PRELIMINARIES
It is still a challenging problem to build automated tools to verify systems especially
reactive ones and to provide simpler formalisms to specify and analyze the system’s
behavior. Such specification languages should be simple and easy to understand, so that
users do not require a steep learning curve in order to be able to use them [1]. Besides,
they should be expressive enough to formalize the stepwise behavior of the processes and
their interactions. Furthermore, they must be equipped with a formal semantics which
renders the intuitive meaning of the language constructs in an unambiguous manner.
The objective of this work is to assign to each model built in the specification language
PROMELA a (labeled) transition system that can serve as a basis for further automated
analysis, e.g., simulation or model checking against temporal logical specifications.
A PROMELA model consists of a finite number of processes to be executed
concurrently. PROMELA supports communications over shared variables and message
passing along either synchronous or buffered FIFO-channels. The formal semantics of
a PROMELA model can be provided by means of a channel system, which then can
be unfolded into a transition system [1]. In PROMELA, the stepwise behavior of the
processes is specified using a guarded command language with several features of classical
imperative programming languages (variable assignments, conditional and repetitive
commands, and sequential composition), communication actions where processes may
send and receive messages from the channels, and atomic regions that avoid undesired
interleaving [1].
Several researches in formal semantics of PROMELA have been carried out with
various approaches [2, 3, 4]. By considering previous researches, the derivation approach
of formal semantics of PROMELA presented here consists of three phases of transfor-
mation. First, a PROMELA model is transformed into the corresponding program
graphs; then the generated program graphs will constitute a channel system. Finally,
the transition system semantics for the resulting channel systems then produces a tran-
sition system of the PROMELA model that formalizes the operational behavior of that
model. The rules of transitions presented here are operational semantics given as a set
of Structured Operational Semantics (SOS) rules. SOS rules are a standard way to
provide formal semantics for process algebra, and are useful for every kind of operational
semantics.
The discussion of the LTS semantics of PROMELA with this approach should be considered as an initial version that covers only a small part of the PROMELA features. Therefore, in order to handle more features, further research will be required in the near future. One thing that can be noted as a contribution of this research is the initial use of LTS to explain the behavior of a PROMELA model.
1.1. PROMELA. PROMELA is a modeling language equipped with communication
primitives that facilitate an abstraction of the analyzed systems, suppressing details that
are not related to the characteristics being modeled. A model in PROMELA consists of
a collection of processes that interact by means of message channels and shared variables.
All processes in the model are global objects. In a PROMELA model, initially only
one process is executed, while all other processes are executed after a run statement [6].
A process of type init must be declared explicitly in every PROMELA model and it
can contain run statements of other processes. The init process is comparable to the
function main() of a standard C program. Processes can also be created by adding
active in front of the proctype declaration as shown in Figure 1.
Line (1) defines a process named Bug with a formal parameter x of type byte; the body of the process is in between { and }. Line (2) defines the init process containing two run statements of process Bug with different actual parameters; the processes start executing after the run statement. Line (3) creates three processes named Bar, and the processes are executed immediately.
Communications among processes are modeled by using message channels that
are capable of describing data transfers from one process to another; they can be either
buffered or rendezvous. For buffered communications, a channel is declared with the
maximum length no less than one, for example, chan buffname = [N] of byte, where
N is a positive constant that defines the size of the buffer. The policy of a channel
communication in messages passing is FIFO (first-in-first-out). PROMELA also allows

(1) ..... proctype Bug(byte x) {


...
}
(2) ..... init {
int pid = run Bug(2);
run Bug(27);
}
(3) ..... active[3] proctype Bar() {
...
}

Figure 1. A general example of PROMELA model

(1) ........ chan <name> = [<cap>] of {<t1>, <t2>, ..., <tn>};


(2) ........ chan ch = [1] of {bit};
(3) ........ chan toR = [2] of {int, bit};
(4) ........ chan line[2] = [1] of {mtype, Msg};

Figure 2. Examples of channel declaration

rendezvous communication, which is realized as a logical extension of buffered communication by permitting the declaration of zero-length channels, for example, chan rendcomm = [0] of byte. A channel with zero length means that the channel can pass, but cannot store, messages. This implies that message interactions via such rendezvous channels are by definition synchronous. Rendezvous communication is binary: only two processes, a sender and a receiver, can be synchronized in a rendezvous handshake. The declaration syntax and declaration examples of channels are shown in Figure 2.
Figure 2.
Line (1) declares a channel named name, with capacity cap, and <t1>, . . . , <tn> denote the types of the elements that can be transmitted over the channel. Lines (2) and (3) are obvious, and line (4) declares an array of channels of size two, where each has capacity one.
Message channels are used to model the transfer of data from one process to
another. Similar to the variables of the basic data types, they are declared either locally
or globally. The statement qname!expr sends the value of expression expr via the channel
(qname), that is, it appends the value of expr to the tail of the channel qname. The
statement qname?msg, on the other hand, retrieves a message from the head of the
channel qname, and assigns it to variable msg. In this case, there must be compatibility
between the type of the value being stored and the type of variable msg. Instead of
sending a single value through the channel, it is allowed to send multiple values per
message. If the number of parameters to be sent per message exceeds the number the message channel can store, the redundant parameters will be lost. On the other hand, if the number of parameters to be sent is less than the number the message channel can store, the values of the remaining parameters will be undefined. Similarly, if the receive operation tries to

proctype One(chan q1) { chan q2; q1?q2; q2!123 }

proctype Two(chan qforb) { int x;


qforb?x;
printf("x = %d\n", x) }

init{ chan qname[2] = [1] of { chan };


chan qforb = [1] of { int };

run One(qname[0]);
run Two(qforb);
qname[0] ! qforb
}

Figure 3. An example of data communication using message channel

retrieve more parameters than are available, the values of the extra parameters will be
undefined; on the other hand, if it retrieves fewer than the number of parameters that
was sent, the extra values will be lost.
The send operation is executable only when the channel being used is not full, while the receive operation is executable only when the channel (for storing values) is not empty. Figure 3 shows an example that uses some of these mechanisms of data communication using message channels. The process of type One has two channels q1 and q2; they are a parameter and a local channel, respectively; while the process of type Two has only one channel qforb as a parameter. Channel qforb is not declared as an array and therefore does not need an index in the send operation at the end of the initial process. The value printed by the process of type Two will be 123.
The discussion so far is about asynchronous communications between processes
via message channels created in statements such as chan qname = [N] of { byte },
where N is a positive constant that defines the buffer size. A channel size of zero, as
in chan port = [0] of { byte }, defines a rendezvous port that can only pass, but
not store, single-byte messages. Message interactions via such rendezvous ports are
synchronous, by definition. Figure 4 gives an example to illustrate this situation. The
two run statements are placed in an atomic sequence to enforce the two processes to start
simultaneously. They do not need to terminate simultaneously, nor to complete running before the atomic sequence terminates. Channel name is a global rendezvous
port. The two processes synchronously execute their first statement : a handshake on
message msgtype and a transfer of the value 124 to local variable state. The second
statement in process of type XX is not executable, since there is no matching receive
operation in process of type YY.
PROMELA allows several general structures of control flow, namely atomic sequences, conditionals (if-statement), repetition (do-statement), and unconditional jumps (goto) [7]. The if-statement has a positive number of choices (guards). If at least one choice is executable, the if-statement is executable, and when several choices are executable one of them is chosen non-deterministically.

#define msgtype 33
chan name = [0] of { byte, byte };
byte name;

proctype XX() {
name!msgtype(124);
name!msgtype(121) }

proctype YY() {
byte state;
name?msgtype(state) }

init { atomic { run XX(); run YY() } }

Figure 4. An example of data communication using message channel

if
:: (n % 2 != 0) -> n = 1
:: (n >= 0) -> n = n-2
:: (n % 3 == 0) -> n = 3
:: else -> skip
fi;

Figure 5. Example of the modified if-statement structure

Otherwise, it is blocked if there is no choice executable. The structure of the if-statement may be modified by replacing a choice with an else guard, as shown in Figure 5. When none of the guards is executable, the else guard becomes executable. In this example, statement skip will be executed when n%2 = 0 and n < 0 and n%3 ≠ 0. Hence, by adding the else guard, the if-statement will never block.
With respect to the choices, a do-statement has the same syntax as the if-statement and behaves in the same way as an if-statement. However, instead of ending the statement at the end of the chosen list of statements, a do-statement repeats the choice selection. Only one option can be selected for execution at a time, and after the option completes, the execution of the structure is repeated. When no choice is executable, the do-statement blocks; the break statement, which is always executable, transfers control to the end of the loop and can be used to exit a do-statement, as illustrated by the sketch below.
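As a small illustration of the do-statement semantics just described, the following fragment (our own sketch, not taken from [7]; the process name CountDown and the variable n are arbitrary) repeatedly decrements a counter and uses break to leave the loop:

byte n = 5;

active proctype CountDown() {
    do
    :: (n > 0) -> n = n - 1   /* executable while n is positive      */
    :: (n == 0) -> break      /* leaves the loop when n reaches zero */
    od;
    printf("done, n = %d\n", n)
}

Exactly one option is executable in each state here, so the loop terminates deterministically; with overlapping guards the choice would be non-deterministic, just as for the if-statement.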
Besides, in PROMELA there are also some interesting predefined statements, such
as timeout and assert. The timeout statement is used to model a special condition
that allows a process to abort the waiting for a condition that may never become true,
for example an input from an empty channel. The timeout keyword is a modeling
feature in PROMELA that provides an escape from a hang state. It becomes true only
when no other statement within the distributed system is executable. Figure 6 shows

proctype keeper()
{
do
:: timeout -> guard!reset
od
}

Figure 6. The example of timeout usage

proctype monitor () {
(1) ............ assert (n <= 3);
}

proctype receiver () {
...
toReceiver ? msg;
(2) ............ assert (msg != ERROR);
...
}

Figure 7. The examples of assert statement usage in process

the definition of the process that will send a reset message to a channel named guard
whenever the system is blocked.
The assert statement, i.e., assert(any boolean condition) is always executable. If
the specified boolean condition holds (true), the statement has no effect, otherwise,
the statement will produce an error report during the verification process. The assert
statement is often used within PROMELA models to check whether certain properties
are valid in a state. Figure 7 shows that assert statement in line (1) will have no effect
whenever the value of variable n is less than or equal to 3, otherwise it will produce an
error report during the verification process; and similarly to assert statement in line (2).
Another interesting statement in PROMELA is the unless statement, with syntax
{statement1 }unless{statement2 }. The mechanism of execution of unless statement
might be explained as follows. The start point of the execution is in statement1 , but
before each statement in statement1 is executed, enabledness of statement2 is checked. If
statement2 is enabled then statement1 is aborted and statement2 is executed, otherwise
statement1 is executed. Figure 8 illustrates the use of unless statement in a fragment
of codes. The result of the statement execution depends on the value of c: if c is equal
to 4 then x will be equal to 0. Since then statement {x! = 4; x = 1} is not enabled,
statement {x > 3; x = 0} is executed. In case the value of c is 5 then x is equal to 1,
since statement {x! = 4; x = 1} is enabled. This means that statement {x > 3; x = 0} is
aborted and statement {x! = 4; x = 1} is executed.

byte x = c;
{ x > 3; x = 0 }
unless
{ x != 4; x = 1 }

Figure 8. An example of unless statement usage

1.2. Labeled Transition System. Labeled Transition System or Transition System


(TS) for short is a model utilized to describe the behavior of systems. TS is represented
as a directed graph consisting of a set of nodes and a set of edges or arrows; nodes denote
states and arrows model transitions (the changes of states). A states describes some
information about the system at a certain moment of the system’s behavior, whereas
transition specifies how the system evolves from one state to another. In the case of
a sequential program a transition system describes the sequence of the execution of
statements and may involve the changes of the values of some variables and the program
counter [1].
There have been many different types of transition systems proposed, however,
action names and atomic propositions always denote the transitions (or state changes)
and the states respectively in TS. Moreover, action names are used to describe the
mechanisms of communication between processes, and the early letters of the Greek
alphabet (such as α, β, γ, and so on) are used to denote actions. Besides, atomic
propositions formalize temporal characteristics : they express intuitively simple known
facts about each state of the system under consideration. The following is the formal
definition of transition system [1].
Definition 1.1. A Transition System TS is a tuple (S, Act, →, I, AP, L), where S is a set of states, Act is a set of actions, → ⊆ S × Act × S is a transition relation, I ⊆ S is a set of initial states, AP is a set of atomic propositions, and L : S → 2^AP is a labeling function. TS is called finite if S, Act, and AP are all finite.
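As a small illustrative instance of Definition 1.1 (our own example, not taken from [1]), consider a light switch with a single toggle action:

    S = {off, on},   Act = {toggle},   I = {off},   AP = {on},
    → = {(off, toggle, on), (on, toggle, off)},
    L(off) = ∅,   L(on) = {on}.

This finite transition system alternates between the two states on every toggle action, and the labeling records in which state the proposition on holds.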

1.3. Program Graph. A program graph (PG) over a set of typed variables is a digraph (directed graph) whose arrows are labeled with conditions on these variables and actions. The effect of the actions is formalized by means of a mapping Effect : Act × Eval(Var) → Eval(Var), which indicates how the evaluation η of the variables is changed by performing an action. The formal definition of a program graph is as follows.
Definition 1.2. A Program Graph PG over a set Var of typed variables is a tuple (Loc, Act, Effect, ↪, Loc0, g0), where Loc is a set of locations, Act is a set of actions, Effect : Act × Eval(Var) → Eval(Var) is the effect function, ↪ ⊆ Loc × Cond(Var) × Act × Loc is the conditional transition relation, Loc0 ⊆ Loc is a set of initial locations, and g0 ∈ Cond(Var) is the initial condition.

We write ℓ --g:α--> ℓ′ as shorthand for (ℓ, g, α, ℓ′) ∈ ↪. The condition g is called the guard of the conditional transition ℓ --g:α--> ℓ′; if g is a tautology, the conditional transition is simply written ℓ --α--> ℓ′. The behavior in a location ℓ ∈ Loc depends on the current variable evaluation η. A nondeterministic choice is made between all transitions ℓ --g:α--> ℓ′ which satisfy condition g in evaluation η (i.e., η ⊨ g). The execution of α changes the variable evaluation according to Effect(α, η). The system subsequently changes into ℓ′; otherwise, the system stops.
Each program graph can be interpreted as a transition system. The underlying transition system of a program graph results from unfolding (or flattening). Its states consist of a control component, i.e., a location ℓ of the program graph, together with an evaluation η of the variables. States are thus pairs of the form ⟨ℓ, η⟩. An initial state is an initial location paired with an evaluation that satisfies the initial condition g0. To formulate properties of the system described by a program graph, the set AP of propositions comprises the locations ℓ ∈ Loc and the Boolean conditions over the variables.
Definition 1.3. Transition System Semantics of a Program Graph. The transition system TS(PG) of a program graph PG = (Loc, Act, Effect, ↪, Loc0, g0) over a variable set Var is (S, Act, →, I, AP, L), where
• S = Loc × Eval(Var)
• → ⊆ S × Act × S is defined by the rule

        ℓ --g:α--> ℓ′  ∧  η ⊨ g
    ----------------------------------
    ⟨ℓ, η⟩ --α--> ⟨ℓ′, Effect(α, η)⟩

• I = {⟨ℓ, η⟩ | ℓ ∈ Loc0, η ⊨ g0}
• AP = Loc ∪ Cond(Var)
• L(⟨ℓ, η⟩) = {ℓ} ∪ {g ∈ Cond(Var) | η ⊨ g}.
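The following small worked instance (our own, under the obvious assumptions about the guard and the action) shows how the unfolding of Definition 1.3 produces states of the form ⟨ℓ, η⟩. Take a program graph with a single location ℓ0, one integer variable x with initial condition x = 0, and one conditional transition ℓ0 --(x < 2) : x := x + 1--> ℓ0. Unfolding yields the transition system

    ⟨ℓ0, x = 0⟩ --x:=x+1--> ⟨ℓ0, x = 1⟩ --x:=x+1--> ⟨ℓ0, x = 2⟩,

and ⟨ℓ0, x = 2⟩ has no outgoing transition, since the guard x < 2 no longer holds there.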

1.4. Parallelism and Communications. The mechanisms to provide operational models for parallel systems by means of transition systems range from simple ones, where no communication between the participating transition systems takes place, to more advanced (and realistic) schemes, where messages can be transferred either synchronously (by means of handshaking) or asynchronously (by buffers with a positive capacity). Given the operational (stepwise) behavior of the processes that run in parallel, with transition systems TS1, TS2, . . . , TSn respectively, the purpose is to define an operator ‖ such that

    TS = TS1 ‖ TS2 ‖ . . . ‖ TSn

is a transition system that specifies the behavior of the parallel composition of the transition systems TS1 through TSn. The operator ‖ is assumed to be commutative and associative, and of course the nature of ‖ will depend on the kind of communication that is supported. TSi may again be a transition system that is composed of several transition systems (TSi = TSi,1 ‖ TSi,2 ‖ . . . ‖ TSi,ni).
Definition 1.4. Interleaving of Transition Systems. Let TSi = (Si, Acti, →i, Ii, APi, Li), i = 1, 2, be two transition systems. The transition system TS1 ||| TS2 is defined by

    TS1 ||| TS2 = (S1 × S2, Act1 ∪ Act2, →, I1 × I2, AP1 ∪ AP2, L),

where the transition relation → is defined by the following rules:

       s1 --α-->1 s1′                            s2 --α-->2 s2′
    ----------------------------    and    ----------------------------
    ⟨s1, s2⟩ --α--> ⟨s1′, s2⟩               ⟨s1, s2⟩ --α--> ⟨s1, s2′⟩

and the labeling function is defined by L(⟨s1, s2⟩) = L1(s1) ∪ L2(s2).



1.5. Communication via Shared Variables. The interleaving operator ||| can be used to model asynchronous concurrency in which the subprocesses act completely independently of each other, i.e., without any form of message passing or contention on shared variables. The interleaving operator for transition systems is, however, too simplistic for most parallel systems with concurrent or communicating components.
In order to deal with parallel programs with shared variables, an interleaving operator will be defined on the level of program graphs (instead of directly on transition systems). The interleaving of program graphs PG1 and PG2 is denoted PG1 ||| PG2. The underlying transition system of the resulting program graph PG1 ||| PG2, i.e., TS(PG1 ||| PG2), describes a parallel system whose components communicate via shared variables. In general, TS(PG1 ||| PG2) ≠ TS(PG1) ||| TS(PG2).
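Before the formal definition, a concrete PROMELA-flavored illustration of communication via a shared variable may help (our own sketch; the process names and the variable x are arbitrary and not part of the formal development): two processes read and update the same global variable, and the interleaving of their statements determines the execution.

byte x = 0;

active proctype IncOne() { x = x + 1 }   /* updates the shared global x          */
active proctype IncTwo() { x = x + 2 }   /* runs interleaved with process IncOne */

Each assignment is executed atomically in PROMELA, so after both processes terminate x equals 3 in every interleaving; the point of the example is only that both processes access the same shared global variable.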
Definition 1.5. Interleaving of Program Graphs. Let PGi = (Loci, Acti, Effecti, ↪i, Loc0,i, g0,i), for i = 1, 2, be two program graphs over the variables Vari. The program graph PG1 ||| PG2 over Var1 ∪ Var2 is defined by

    PG1 ||| PG2 = (Loc1 × Loc2, Act1 ⊎ Act2, Effect, ↪, Loc0,1 × Loc0,2, g0,1 ∧ g0,2),

where ↪ is defined by the rules:

       ℓ1 --g:α-->1 ℓ1′                              ℓ2 --g:α-->2 ℓ2′
    ------------------------------    and    ------------------------------
    ⟨ℓ1, ℓ2⟩ --g:α--> ⟨ℓ1′, ℓ2⟩               ⟨ℓ1, ℓ2⟩ --g:α--> ⟨ℓ1, ℓ2′⟩

The program graphs PG1 and PG2 have the variables Var1 ∩ Var2 in common; these are the shared (sometimes also called global) variables. The variables in Var1 \ Var2 are the local variables of PG1, and similarly, those in Var2 \ Var1 are the local variables of PG2.

1.6. Handshaking. The term handshaking means that concurrent processes that want to interact have to do so in a synchronous fashion. Hence, processes can interact only if they are both participating in this interaction at the same time, that is, they shake hands [1].
Definition 1.6. Handshaking (Synchronous Message Passing). Let TSi = (Si, Acti, →i, Ii, APi, Li), i = 1, 2, be transition systems and H ⊆ Act1 ∩ Act2 with τ ∉ H. The transition system TS1 ‖H TS2 is defined as follows:

    TS1 ‖H TS2 = (S1 × S2, Act1 ∪ Act2, →, I1 × I2, AP1 ∪ AP2, L),

where L(⟨s1, s2⟩) = L1(s1) ∪ L2(s2), and the transition relation → is defined by the rules:
• interleaving for α ∉ H:

       s1 --α-->1 s1′                            s2 --α-->2 s2′
    ----------------------------            ----------------------------
    ⟨s1, s2⟩ --α--> ⟨s1′, s2⟩               ⟨s1, s2⟩ --α--> ⟨s1, s2′⟩

• handshaking for α ∈ H:

    s1 --α-->1 s1′  ∧  s2 --α-->2 s2′
    ------------------------------------
        ⟨s1, s2⟩ --α--> ⟨s1′, s2′⟩

Notation: TS1 ‖ TS2 abbreviates TS1 ‖H TS2 for H = Act1 ∩ Act2.

1.7. Channel Systems. Intuitively, a channel system consists of n (data-dependent) processes P1 through Pn. Each Pi is specified by a program graph PGi which is extended with communication actions. Transitions of these program graphs are either the usual conditional transitions (labeled with guards and actions), or one of the communication actions with their respective intuitive meaning:
    c!v   transmit the value v along channel c,
    c?x   receive a message via channel c and assign it to variable x.
Let Comm = {c!v, c?x | c ∈ Chan, v ∈ dom(c), x ∈ Var with dom(x) ⊇ dom(c)} denote the set of communication actions, where Chan is a finite set of channels with typical element c.

Definition 1.7. Channel System. A program graph over (Var, Chan) is a tuple PG = (Loc, Act, Effect, ↪, Loc0, g0), where
    ↪ ⊆ Loc × Cond(Var) × (Act ∪ Comm) × Loc.
A channel system CS over (Var, Chan) consists of program graphs PGi over (Vari, Chan) (for 1 ≤ i ≤ n) with Var = ∪_{1≤i≤n} Vari. A channel system is denoted by
    CS = [PG1 | PG2 | . . . | PGn].

The transition relation ↪ of a program graph over (Var, Chan) consists of two
types of conditional transitions. Conditional transitions ℓ --g:α--> ℓ' are labeled with
guards and actions. These conditional transitions can happen if the guard holds.
Alternatively, conditional transitions may be labeled with communication actions. This
yields conditional transitions of type ℓ --g:c!v--> ℓ' (for sending v along c) and
ℓ --g:c?x--> ℓ' (for receiving a message along c).

Definition 1.8. Transition System Semantics of a Channel System
Let CS = [PG1 | PG2 | . . . | PGn] be a channel system over (Chan, Var) with
PGi = (Loci, Acti, Effecti, ↪i, Loc0,i, g0,i), for 0 < i ≤ n. The transition system of CS,
denoted TS(CS), is the tuple (S, Act, →, I, AP, L) where:

• S = (Loc1 × . . . × Locn) × Eval(Var) × Eval(Chan)
• Act = ⊎_{0<i≤n} Acti ⊎ {τ}
• → is defined by the following rules:
  – interleaving for α ∈ Acti:

         ℓi --g:α-->i ℓi'  ∧  η ⊨ g
    --------------------------------------------------------------
    ⟨ℓ1,...,ℓi,...,ℓn,η,ξ⟩ --α--> ⟨ℓ1,...,ℓi',...,ℓn,η',ξ⟩

    where η' = Effect(α, η).
  – asynchronous message passing for c ∈ Chan, cap(c) > 0:
    - receive a value along channel c and assign it to variable x:

         ℓi --g:c?x-->i ℓi'  ∧  η ⊨ g  ∧  len(ξ(c)) = k > 0  ∧  ξ(c) = v1 . . . vk
    --------------------------------------------------------------------------------
    ⟨ℓ1,...,ℓi,...,ℓn,η,ξ⟩ --τ--> ⟨ℓ1,...,ℓi',...,ℓn,η',ξ'⟩

    where η' = η[x := v1] and ξ' = ξ[c := v2 . . . vk].
    - transmit value v ∈ dom(c) over channel c:

         ℓi --g:c!v-->i ℓi'  ∧  η ⊨ g  ∧  len(ξ(c)) = k < cap(c)  ∧  ξ(c) = v1 . . . vk
    --------------------------------------------------------------------------------------
    ⟨ℓ1,...,ℓi,...,ℓn,η,ξ⟩ --τ--> ⟨ℓ1,...,ℓi',...,ℓn,η,ξ'⟩

    where ξ' = ξ[c := v1 v2 . . . vk v].
  – synchronous message passing over c ∈ Chan, cap(c) = 0:

         ℓi --g1:c?x-->i ℓi'  ∧  η ⊨ g1  ∧  η ⊨ g2  ∧  ℓj --g2:c!v-->j ℓj'  ∧  i ≠ j
    --------------------------------------------------------------------------------------
    ⟨ℓ1,...,ℓi,...,ℓj,...,ℓn,η,ξ⟩ --τ--> ⟨ℓ1,...,ℓi',...,ℓj',...,ℓn,η',ξ⟩

    where η' = η[x := v].
• I = { ⟨ℓ1, . . . , ℓn, η, ξ0⟩ | ∀ 0 < i ≤ n. (ℓi ∈ Loc0,i ∧ η ⊨ g0,i) }
• AP = ⊎_{0<i≤n} Loci ⊎ Cond(Var)
• L(⟨ℓ1, . . . , ℓn, η, ξ⟩) = {ℓ1, . . . , ℓn} ∪ { g ∈ Cond(Var) | η ⊨ g }.
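A minimal sketch (not part of the paper) of the asynchronous message-passing rules in
Definition 1.8: a FIFO channel of capacity cap(c) > 0 on which c!v is enabled only when
the buffer is not full and c?x removes the front element and assigns it to a variable.
The class and method names (Channel, send, receive) are illustrative only.

from collections import deque

class Channel:
    def __init__(self, capacity):
        assert capacity > 0                 # cap(c) > 0: asynchronous channel
        self.capacity = capacity
        self.buffer = deque()               # xi(c) = v1 ... vk

    def send(self, v):
        """c!v: enabled only if len(xi(c)) < cap(c)."""
        if len(self.buffer) >= self.capacity:
            return False                    # transition not enabled
        self.buffer.append(v)               # xi(c) := v1 ... vk v
        return True

    def receive(self, var_env, x):
        """c?x: enabled only if len(xi(c)) > 0; assigns the front value to x."""
        if not self.buffer:
            return False
        var_env[x] = self.buffer.popleft()  # eta := eta[x := v1], xi(c) := v2 ... vk
        return True

env = {}
c = Channel(capacity=2)
print(c.send(7), c.send(8), c.send(9))      # third send is not enabled: buffer full
print(c.receive(env, "x"), env)             # x receives 7 (FIFO order)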

2. TRANSFORMATION
PROMELA is a descriptive language used especially to model concurrent systems.
A PROMELA model P mostly consists of a finite number of processes P1, . . . , Pn
to be executed concurrently. PROMELA supports communication over shared variables
and message passing along either synchronous or asynchronous (buffered) FIFO-channels.
The formal semantics of a PROMELA program can be provided by means of a channel
system, which can then be unfolded into a transition system.
As already mentioned in the previous section, the discussion here only covers a
small part of PROMELA's features and primarily concentrates on the basic elements of
PROMELA. A basic PROMELA model consists of statements that represent
the operational behavior of the processes P1, P2, . . . , Pn together with a Boolean condition
on the final values of the program variables. It is represented as P = [P1 | P2 | . . . | Pn],
where each process Pi is normally built from one or more statements, so that the
statements formalize the operational behavior of the process Pi. The main elements
of the statements are the atomic command (skip), variable assignment (x := expr),
communication activities: reading a value for variable x from channel c (c?x) and sending
the current value of expression expr over channel c (c!expr), conditional commands
(if..fi), and repetitive commands (do..od). The syntax of basic PROMELA statements
is shown in Figure 9.

stmnt ::= skip | x := expr | c?x | c!expr | stmnt1; stmnt2
        | atomic{assignments}
        | if :: g1 -> stmnt1 ... :: gm -> stmntm fi
        | do :: g1 -> stmnt1 ... :: gm -> stmntm od

Figure 9. Syntax of basic PROMELA-statements

Since a PROMELA statement is itself built from variables, expressions and channels,
it is worth briefly discussing these before proceeding with the statements themselves.
The variables in a basic PROMELA model P are used to store either global information
about the system as a whole or information that is local to one specific process Pi,
depending on where the variable declaration takes place. They may be of the basic types
bit, Boolean, byte, short, integer and channel. Similarly, the data domains for the channels
must be specified: channels must also be declared to be synchronous or FIFO-channels of
predefined capacity. Furthermore, a variable can be formally defined as

  (name, scope, domain, inival, curval),

where name is the variable name, scope is either global or local to a specific process,
domain is a finite set of integers, inival is the initial value of the variable, and curval is
the current value of the variable. It is assumed that the expressions used in assignments
to a variable x are built from constants in the domain of x, dom(x), variables y of the same
type as x (or a subtype of x), and operators on dom(x), such as the Boolean connectives
∧, ∨, and ¬ for dom(x) = {0, 1} and arithmetic operators +, -, ∗, etc. for dom(x) = ℝ
(the set of real numbers). Examples of Boolean expressions are guards that determine
conditions on the values of the variables (guards ∈ Cond(Var)). In accordance with
Figure 9, x is a variable in Var, expr is an expression, and chan is a channel of arbitrary
capacity. Type compatibility of the variable x and the expression expr in an assignment
x := expr is required. Similarly, the actions c?x and c!expr require that dom(c) ⊆ dom(x)
and that the type of the expression expr corresponds to dom(c). The gi's in both the
if..fi and do..od commands are guards, and gi ∈ Cond(Var) by assumption.
The body assignments of an atomic region is a nonempty sequential composition of
assignments of the form

  x1 := expr1; x2 := expr2; . . . ; xm := exprm

where m ≥ 1, x1, . . . , xm are variables and expr1, . . . , exprm expressions such that
the types of xi and expri are compatible. The intuitive meaning of the statements in
Figure 9 can be explained as follows. skip represents a process that terminates in one
step and affects neither the values of the variables nor the contents of the channels.
Variable x in the assignment x := expr is assigned the value of the expression expr under
the current variable evaluation. stmnt1; stmnt2 denotes sequential composition, i.e.,
stmnt1 is executed first and, after the execution of stmnt1 terminates, stmnt2 is executed.
In basic PROMELA, the concept of an atomic region is realized by statements of the form
atomic{stmnt}; the execution of stmnt is treated as an atomic step that cannot be
interleaved with the activities of other processes. Statements of the form

  if :: g1 → stmnt1 . . . :: gm → stmntm fi

stand for a non-deterministic choice between the statements stmnti for which the
guard gi is satisfied in the current state, i.e., gi holds for the current valuation of the
variables. The if..fi command blocks if none of the guards gi holds. This blocking can
be ended by other processes running in parallel that alter the shared variable values so
that one or more of the guards eventually become true. Similarly, statements of the form

  do :: g1 → stmnt1 . . . :: gm → stmntm od

represent the iterative execution of the non-deterministic choice among the guarded
commands gi → stmnti whose guard gi holds in the current state. A do..od loop does not
block in a state where all guards are violated; instead, the do..od loop is aborted.
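As a small illustration of this contrast (a blocking if..fi versus a terminating do..od),
the following Python sketch models guards and bodies as callables; it is not part of the
paper and all names in it are illustrative.

import random

def run_if(guarded_cmds, state):
    """if :: g1 -> s1 ... :: gm -> sm fi  --  blocks if no guard holds."""
    enabled = [body for guard, body in guarded_cmds if guard(state)]
    if not enabled:
        return "blocked"                    # must wait for other processes
    random.choice(enabled)(state)           # non-deterministic choice
    return "done"

def run_do(guarded_cmds, state):
    """do :: g1 -> s1 ... :: gm -> sm od  --  exits when no guard holds."""
    while True:
        enabled = [body for guard, body in guarded_cmds if guard(state)]
        if not enabled:
            return "exited"                 # loop aborts instead of blocking
        random.choice(enabled)(state)

state = {"x": 3, "y": 0}
cmds = [
    (lambda s: s["x"] > 1, lambda s: s.update(y=s["y"] + s["x"], x=s["x"] - 1)),
    (lambda s: s["y"] < 0, lambda s: s.update(x=0)),
]
print(run_do(cmds, state), state)           # loops while x > 1, then exits
print(run_if(cmds, state))                  # now no guard holds: "blocked"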
2.1. Semantics. The operational semantics of a basic PROMELA statement with variables
and channels from (Var, Chan) is given by a program graph over (Var, Chan). The program
graphs PG1, . . . , PGn for the processes P1, . . . , Pn of a basic PROMELA model
P = [P1 | . . . | Pn] constitute a channel system over (Var, Chan). The transition system
semantics for channel systems then yields a transition system TS(P) that formalizes
the operational behavior of P [1].

Figure 10. Program graph for the conditional command if..fi

The program graph associated with a basic PROMELA statement stmnt formalizes
the control flow of stmnt execution. It means that the sub-statements play the role
of the locations. For instance, in modeling termination, a special location exit is used.
Thus in a program graph, any guarded command g → stmnt corresponds to an edge
with the label g : α where α represents the first action of stmnt. For example, consider
the statement :
conditional_command = if
:: x > 1 -> y := x + y
:: true -> x := 0; y := x
fi
The program graph associated with this command may be explained as follows.
conditional command is viewed as an initial location of the program graph; from this
location there are two outgoing edges : one with the guard x > 1 and action y := x +
y leading to (location) exit, and the other edge with the guard true and action x := 0
resulting in the location for the statement y := x. Since y := x is deterministic there
is a single edge with guard true and action y := x leading to location exit. Figure 10
shows the program graph for conditional command if..fi.

2.2. Substatement. The set of sub-statements of a basic PROMELA statement stmnt
is defined recursively. For statements stmnt ∈ {skip, x := expr, c?x, c!expr} the set of
sub-statements is sub(stmnt) = {stmnt, exit}. For example, sub(x := expr) = {x :=
expr, exit}, sub(c?x) = {c?x, exit}, etc. For sequential composition:
  sub(stmnt1; stmnt2) = { stmnt; stmnt2 | stmnt ∈ sub(stmnt1) \ {exit} } ∪ sub(stmnt2).
For example, sub(x := expr; skip) = {x := expr; skip, skip, exit}, where
{ stmnt; stmnt2 | stmnt ∈ sub(stmnt1) \ {exit} } is {x := expr; skip} and
sub(stmnt2) = sub(skip) = {skip, exit}.
644 Suprapto And Reza Pulungan

For conditional commands, the set of sub-statements is defined as the set consisting
of the if..fi statement itself and the sub-statements of its guarded commands. Thus, its
sub-statements are defined as:
sub(conditional_command) = {conditional_command} ∪ sub(stmnt1) ∪ . . . ∪ sub(stmntn).
For example, sub(if :: x > 1 → y := x + y :: true → x := 0; y := x f i) = {if ::
x > 1 → y := x + y :: true → x := 0; y := x f i, y := x + y, x := 0; y := x, y :=
x, exit}.
The sub-statements of a loop command (loop = do :: g1 → stmnt1 . . . :: gn → stmntn od)
are defined as: sub(loop) = {loop, exit} ∪ { stmnt; loop | stmnt ∈ sub(stmnt1) \ {exit} } ∪
. . . ∪ { stmnt; loop | stmnt ∈ sub(stmntn) \ {exit} }. For example, sub(do :: x > 1 → y :=
x + y :: y < x → x := 0; y := x od) = {do :: x > 1 → y := x + y :: y < x →
x := 0; y := x od, y := x + y; do :: x > 1 → y := x + y :: y < x → x := 0; y :=
x od, x := 0; y := x; do :: x > 1 → y := x + y :: y < x → x := 0; y := x od, y :=
x; do :: x > 1 → y := x + y :: y < x → x := 0; y := x od, exit}.
For atomic regions atomic{stmnt}, the sub-statement is defined as:
sub(atomic{stmnt}) = {atomic{stmnt}, exit}.
Then, for example the sub-statements of atomic{b1 := true; x := 2} is sub(atomic
{b1 := true; x := 2}) = {atomic{b1 := true; x := 2}, exit}.
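A minimal sketch (not from the paper) of this recursive definition of sub: statements are
modelled as nested Python tuples, "exit" marks termination, and only the constructs
discussed above are covered.

def sub(stmnt):
    if stmnt == "exit":
        return {"exit"}
    kind = stmnt[0]
    if kind in ("skip", "assign", "recv", "send", "atomic"):
        return {stmnt, "exit"}
    if kind == "seq":                                # stmnt1 ; stmnt2
        _, s1, s2 = stmnt
        left = {("seq", s, s2) for s in sub(s1) if s != "exit"}
        return left | sub(s2)
    if kind == "if":                                 # if :: g -> s ... fi
        _, *branches = stmnt
        return {stmnt}.union(*(sub(s) for _, s in branches))
    if kind == "do":                                 # do :: g -> s ... od
        _, *branches = stmnt
        result = {stmnt, "exit"}
        for _, s in branches:
            result |= {("seq", t, stmnt) for t in sub(s) if t != "exit"}
        return result
    raise ValueError("unknown statement: " + repr(stmnt))

# sub(x := expr ; skip) = {x := expr ; skip, skip, exit}, as in the text.
example = ("seq", ("assign", "x", "expr"), ("skip",))
print(sub(example))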

3. INFERENCE RULES
The inference rules for the atomic commands, such as skip, assignment and the commu-
nication actions, and for sequential composition, conditional and repetitive commands give
rise to the edges of a large program graph in which the set of locations agrees with the
set of basic PROMELA statements [1]. Thus, the edges have the form:

  stmnt --g:α--> stmnt'    or    stmnt --g:comm--> stmnt'

where stmnt is a basic PROMELA statement, stmnt' is a sub-statement of stmnt, g is
a guard, α is an action, and comm is a communication action c?x or c!expr. The
subgraph consisting of the sub-statements of Pi then yields the program graph PGi
of process Pi as a component of the model P.
(1) The semantics of skip is given by a single axiom formalizing that the execution
    of skip terminates in one step without affecting the variables:

      skip --true:id--> exit

    where id denotes an action that does not change the values of the variables, i.e.,
    for all variable evaluations η, Effect(id, η) = η.
(2) Similarly, the execution of a statement consisting of an assignment x := expr
    has the trivial guard (true) and terminates in one step:

      x := expr --true:assign(x,expr)--> exit

    where assign(x, expr) denotes the action that changes the value of variable x
    according to the assignment x := expr and does not affect the other variables,
    i.e., for all variable evaluations η (η ∈ Eval(Var)) and y ∈ Var with y ≠ x,
    Effect(assign(x, expr), η)(y) = η(y), while Effect(assign(x, expr), η)(x)
    is the value of expr when evaluated over η.
(3) For the communication actions c!expr and c?x the following axioms apply:

            cap(c) ≠ 0                                      len(c) < cap(c)
    --------------------------------    and    ------------------------------------------
    c?x --dom(c)⊆dom(x):c?x--> exit            c!expr --dom(Eval(expr))⊆dom(c):c!expr--> exit

    where cap(c) is the maximum capacity of channel c, len(c) is the current number of
    messages in channel c, dom(·) denotes the type (domain) of its argument, and
    Eval(expr) is the value of the expression expr after evaluation.
(4) For an atomic region atomic{x1 := expr1; . . . ; xm := exprm}, the effect is defined
    as the cumulative effect of the assignments xi := expri. It can be defined by the rule:

      atomic{x1 := expr1; . . . ; xm := exprm} --true:αm--> exit

    where α0 = id and αi = Effect(assign(xi, expri), Effect(αi−1, η)) for 1 ≤ i ≤ m.
(5) Two rules are defined for sequential composition stmnt1; stmnt2, which distinguish
    whether stmnt1 terminates in one step. If stmnt1 does not terminate in one step,
    then the following rule applies:

         stmnt1 --g:α--> stmnt1' ≠ exit
    -----------------------------------------
    stmnt1; stmnt2 --g:α--> stmnt1'; stmnt2

    Otherwise, if stmnt1 terminates in one step by executing action α, then the
    control of stmnt1; stmnt2 moves to stmnt2 after executing α. The rule is:

         stmnt1 --g:α--> exit
    ---------------------------------
    stmnt1; stmnt2 --g:α--> stmnt2

(6) The effect of a conditional command conditional_command = if :: g1 → stmnt1
    . . . :: gn → stmntn fi is formalized by the rule:

            stmnti --h:α--> stmnti'
    ----------------------------------------------
    conditional_command --gi∧h:α--> stmnti'

(7) For the repetition command loop = do :: g1 → stmnt1 . . . :: gn → stmntn od,
    three rules are defined. The first two rules are similar to the rule for the
    conditional command, but the control returns to loop after the execution of the
    guarded command completes. This corresponds to the following rules:

       stmnti --h:α--> stmnti' ≠ exit                   stmnti --h:α--> exit
    -----------------------------------    and    ----------------------------------
    loop --gi∧h:α--> stmnti'; loop                   loop --gi∧h:α--> loop

    The third rule applies when none of the guards g1, g2, . . . , gn holds in the current
    state:

      loop --¬g1∧...∧¬gn--> exit

So far, inference rules (axioms) have been defined for the basic statements of PROMELA
(skip, assignment, communication actions, atomic region, sequential composition, conditional
command, and repetition command). However, covering the full language for the sake of
completeness will require considerably more effort; this is left for further consideration.
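Read operationally, the axioms and rules above can be seen as a recursive procedure that
emits program-graph edges. The following Python sketch (not part of the paper; the tuple
encoding of statements is an assumption) covers only rules (1), (2) and (5), with statements
reused as locations as described in the text.

def edges(stmnt):
    kind = stmnt[0] if stmnt != "exit" else "exit"
    if kind == "skip":                               # rule (1)
        return {(stmnt, "true", "id", "exit")}
    if kind == "assign":                             # rule (2)
        _, x, expr = stmnt
        return {(stmnt, "true", "assign(%s,%s)" % (x, expr), "exit")}
    if kind == "seq":                                # rule (5)
        _, s1, s2 = stmnt
        result = set(edges(s2))
        for (loc, g, a, nxt) in edges(s1):
            target = s2 if nxt == "exit" else ("seq", nxt, s2)
            result.add((("seq", loc, s2), g, a, target))
        return result
    return set()                                     # other constructs omitted

prog = ("seq", ("assign", "x", "0"), ("skip",))
for e in sorted(edges(prog), key=repr):
    print(e)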
4. CONCLUDING REMARKS
A new approach to a formal semantics for PROMELA has been presented. This
approach first derives a channel system from a PROMELA model P which consists of a finite
number of processes P1, P2, . . . , Pn to be executed concurrently (P = [P1 | P2 | . . . | Pn]).
The channel system is CS = [PG1 | PG2 | . . . | PGn], where PGi corresponds to process
Pi, so that the transition system of CS is built from the transition systems of the PGi. Each
program graph PGi can be interpreted as a transition system, and the underlying transition
system of a program graph results from unfolding (flattening). The transition
system semantics for channel systems then yields a transition system TS(P) that formalizes the stepwise
(operational) behavior of the PROMELA model P. This approach is modular, which makes
reasoning about and understanding the semantics easier. It is more practical and fundamental
than the one given in [8]. Therefore, this approach should be more suitable for reasoning
about the implementation of a PROMELA interpreter.

References
[1] Baier, C., and Katoen, J. P., Principles of Model Checking, The MIT Press, Cambridge,
Massachusetts, 2008.
[2] Bevier, W. R., Toward an Operational Semantics of PROMELA in ACL2, Proceedings of the
Third SPIN Workshop, SPIN97, 1997.
[3] Natarajan, V. and Holzmann, G. J., Outline for an Operational Semantic of PROMELA,
Technical report, Bell Laboratories, 1993.
[4] Ruys, T., SPIN and Promela Model Checking, University of Twente, Department of Computer
Science, Formal Methods and Tools, 1993.
[5] Shin, H., Promela Semantics, presentation from The SPIN Model Checker by G. J. Holzmann,
2007.
[6] Spoletini, P., Verification of Temporal Logic Specification via Model Checking, Politecnico Di
Milano Dipartimento di Elettronicae Informazione, 2005.
[7] Vielvoije, E., Promela to Java, Using an MDA approach, TUDelft Software Engineering Research
Group Department of Software Technology Faculty EEMCS, Delft University of Technology Delft,
the Netherlands, 2007.
[8] Weise, C., An Incremental formal semantics for PROMELA, Prentice Hall Software Series,
Englewood Cliffs, 1991.

Suprapto
Universitas Gadjah Mada.
e-mail: [email protected]

Reza Pulungan
Universitas Gadjah Mada.
e-mail: [email protected]
Proceedings of ”The 6th SEAMS-UGM Conference 2011”
Mathematics Education, pp. 647 - 658.

MODELLING ON LECTURERS’ PERFORMANCE WITH


HOTTELING-HARMONIC-FUZZY

H.A PARHUSIP AND A. SETIAWAN

Abstract. This paper presents a brief study on modelling lecturers' performance with a modified
Hotelling-Fuzzy approach. The observation data are considered as a fuzzy set obtained from a
student survey at the Mathematics Department of the Science and Mathematics Faculty of SWCU in the
years 2008-2009. The modified Hotelling relies on a harmonic mean instead of the arithmetic mean that
is normally used in the literature. The characteristic function used is an exponential function which
identifies the fuzziness. The result is a general measurement of lecturers' performance. Based on
the 4 variables used in the analysis (Utilizes content scope and sequence planning, Clearness of
assignments and evaluations, Systematical in lecturing, and Encourages students attendance) we
obtain that the lecturers' performance is fair, poor, fair, and poor respectively.

Keywords and Phrases: Hotelling, harmonic mean, fuzzy, exponential function, characteristic
function.

1. INTRODUCTION

University education is one of the major investments in human development, and most of its
activities are based on the learning process that takes place in a university. This process relies on
the interaction between lecturers and students. After some period of time, students acquire
knowledge which may determine their futures. Therefore lecturers are required to transfer
their knowledge such that students are capable of reproducing the knowledge obtained at their
universities as a support for themselves in their working places.
Since a lecturer is an important agent in the effort of human development in a
university, some necessary conditions must be satisfied by a lecturer. This paper proposes
some results on lecturers' performance based on a student survey. Students evaluate a lecturer
by means of some questions (some adopted from a literature source available on the web), which are
listed in Table 1; the same items, in Indonesian, have been used as instruments in the observation
for this paper.

2010 Mathematics Subject Classification: 03B50, 03C65, 62H25.


According to Yusrizal [1], such items should be validated, which is not done in this paper. We
have used these items to evaluate lecturers' performance. Each item is quantified as 0,
1, 2 or 3, which means poor, fair, satisfactory and good respectively. Since not all students
met all lecturers in one semester, the research is simply made as a general evaluation for all
lecturers in the Department.

TABLE I
QUESTIONS FOR STUDENTS' OBSERVATION TO EVALUATE A LECTURER DURING A CLASS
No Questions

1 Utilizes content scope and sequence planning


2 Clearness of assignments and evaluations
3 Systematical in lecturing
4 Encourages students attendance
5 Demonstrates accurate and current knowledge in subject field
6 Creates and maintains an environment that supports learning
7 Clearness to answer the question from a student
8 Attractiveness of the lecture
9 Clearness of purpose on each assignment
10 Lecturer's competence on stimulating students' excitement
11 Lecturer's competence on delivering ideas
12 Does the teacher have available time to help students
   with the related topic outside the class?
13 Effectiveness of time used in the class
14 Provides relevant examples and demonstrations to illustrate concepts and skills
15 The quality of the lecturer's feedback on the given assignments
16 Overall score of the lecturer's performance


The data used here are obtained from a student survey in the Mathematics Department of the
Science and Mathematics Faculty, Satya Wacana Christian University, in the years 2008-2009.
Furthermore, a lecturer's performance cannot be located exactly at one of the values 0, 1, 2 or 3.
Therefore the data can be considered non-precise and hence we refer to fuzzy sets in presenting the
analysis.
The remainder of the paper is organized as follows. The theoretical background used is
presented in Section 2. Procedures to analyze the obtained data are given in Section 3. The
analysis is then presented in Section 4, and finally some conclusions and remarks are given in the
last section of this paper.

2. MODELLING HOTELLING-HARMONIC-FUZZY
ON LECTURERS' PERFORMANCE

2.1 Review Stage. Some authors have proposed quality evaluations based on mathematical
approaches such as fuzzy sets. Teacher performance is studied with a fuzzy
system in [2] to generate assessment criteria. In complex systems such as the cooling process of a
metal, a combined PCA (Principal Component Analysis) algorithm and Hotelling's T² are
used to define a quality index after all data are normalized to the range -1 to 1 [3]. Other
instruments for lecturers' performance have been developed and validated using variable
analysis. Lecturers' strictness has also been estimated using intuitionistic fuzzy sets [4]. There are
also several standards that can be used to indicate the quality of a teacher; these are more
qualitative approaches and are easily found on the internet. In strategic planning, SWOT
is one of the most cited references. In a SWOT analysis some uncertainties can be
encountered, and therefore using fuzzy sets renews the classical SWOT by presenting the
internal and external variables in the fuzzy sense, as shown by Ghazinoory et al. [5].
It is assumed that the given data contain non-precision. Thus the data will be represented
as a fuzzy set [6]. Fuzzy data are indicated by a characteristic function. There are
some well-known characteristic functions. In this research, we apply an exponential
characteristic function, i.e.
  ξ(x) = L(x) for x < m1,   ξ(x) = 1 for m1 ≤ x ≤ m2,   ξ(x) = R(x) for x > m2,     (1)

  with L(x) = exp( -((m1 - x)/a1)^q1 ) and R(x) = exp( -((x - m2)/a2)^q2 ).

There are 6 parameters in Eq. (1), i.e. m1, a1, q1 and m2, a2, q2, that must be determined. An
illustration of this characteristic function is depicted in Figure 1.
Figure 1 shows that a single value x* = 0.5 can be considered as a value that may vary
in the interval 0.25 ≤ x* ≤ 1.5. This allows us to tolerate that the given value is not exactly 0.5
but may lie in this interval, which indicates its non-precision. However, if this tolerance is
too big, one introduces a parameter α which varies in 0 ≤ α ≤ 1 such that the tolerance can
be chosen freely. For example, if α = 0.2 then the tolerance for saying 0.25 ≤ x* ≤ 1.5 is
reduced, as illustrated in Figure 1, and we may say that 0.35 ≤ x* ≤ 0.85.

Figure 1. ξ(x) with the given datum 0.5 (denoted by a star on the peak); the values of the
parameters m1, a1, q1 are 0.4980, 0.1, 2 respectively and the values of the parameters
m2, a2, q2 are 0.2, 0.502, 0.8 respectively. The tolerance for x* is reduced. The
corresponding α-cut is also shown as a horizontal line for α = 0.2.
A problem appears here: the determination of the parameters becomes time consuming
for each value. Up to now, there exists no optimization procedure to find the best values of the
parameters here. Therefore we may vary the parameters based on the given data. Let us consider
the following example.

Example 1: Let the given data be the vector x = [0.5 0.67 0.83 1.0 1.17 1.33 1.5 1.67
1.83 2]^T. Let the vector x be an element of the observation space M_X. We want to
express this vector in the fuzzy sense. By trial and error we choose the value of each
parameter. This means that each value in the vector has its own characteristic function. We
will also define the average value of this vector by the harmonic mean, together with the
characteristic function of this average value. The parameters used for all characteristic functions
are a1 = 0.1, q1 = q2 = 2, a2 = 0.2. These parameters are chosen freely. The illustration of the
characteristic functions is depicted in Figure 2a. The values of m1 and m2 are taken from
the given vector. The characteristic function of the harmonic average is obtained as

  ξ(x) = L(x) for x < 1.039,   ξ(x) = 1 for x = 1.039,   ξ(x) = R(x) for x > 1.039,

  with L(x) = exp( -((x - 1.039)/0.1)^2 ) and R(x) = exp( -((x - 1.039)/0.2)^2 ).

Thus the characteristic function is defined after the harmonic mean is obtained.
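The exact form above is a reconstruction; as a rough numerical illustration, the following
Python sketch (not from the paper; numpy and the function name characterizing are
assumptions) evaluates such an exponential characteristic function and the harmonic mean
of the Example 1 vector.

import numpy as np

def characterizing(x, m1, m2, a1, a2, q1, q2):
    """Exponential characteristic function: L(x) left of m1, 1 on [m1, m2],
    R(x) right of m2 (assuming the LR-form sketched in Eq. (1))."""
    x = np.asarray(x, dtype=float)
    left = np.exp(-((m1 - x) / a1) ** q1)
    right = np.exp(-((x - m2) / a2) ** q2)
    return np.where(x < m1, left, np.where(x > m2, right, 1.0))

data = np.array([0.5, 0.67, 0.83, 1.0, 1.17, 1.33, 1.5, 1.67, 1.83, 2.0])
h_mean = len(data) / np.sum(1.0 / data)          # harmonic mean, about 1.04
print(round(h_mean, 3))
print(characterizing([0.9, h_mean, 1.3], h_mean, h_mean,
                     a1=0.1, a2=0.2, q1=2, q2=2))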
The statistical test for multivariate data is based on the T² statistic, named after Hotelling [7];
in the classical sense,

  χ² = n (X̄ - μ0)^T S^{-1} (X̄ - μ0)                                                (2)

where X̄ = (x̄1, x̄2, . . . , x̄p)^T is the vector of arithmetic means of the variables and
x̄k = (1/n) Σ_{i=1}^{n} x_{ki} is the arithmetic mean of the k-th item. It is also known from Wikipedia or [7]
TABLE II
STUDENT SURVEY FOR 6 LECTURERS BASED ON 4 VARIABLES.
THE SYMBOL Dj INDICATES THE NAME OF THE j-TH LECTURER.

  No  Question                                       D1    D2    D3    D4    D5    D6    Row Harmonic Average
  1   Utilizes content scope and sequence planning   3.0   1.67  1.89  1.71  1.62  2.2   1.92
  2   Clearness of assignments and evaluations       3.0   1.51  2.00  1.57  1.58  2.05  1.84
  3   Systematical in lecturing                      3.0   1.72  1.11  1.29  2.01  1.83  1.65
  4   Encourages students attendance                 2.0   1.41  1.78  1.14  2.48  1.83  1.69
  Column-Harmonic Average                            2.75  1.57  1.61  1.39  1.86  1.97  1.77 (TOTAL AVERAGE)
that χ² ~ T²_{p,n-1}, or equivalently

  ((n - p) / (p(n - 1))) χ² ~ F_{p,n-p}.                                            (3)

Figure 2. The illustration of all characteristic functions with a1, q1 equal to 0.1 and 2 and
a2, q2 equal to 0.2 and 2. The values of m1 and m2 are taken from the given vector.

Since χ² is distributed as ((n - 1)p / (n - p)) F_{p,n-p}(α), χ² can be used for hypothesis
testing. For a test of the hypothesis H0 : μ = μ0 versus H1 : μ ≠ μ0 at the α level of
significance, reject H0 in favor of H1 if [7]

  χ² > ((n - 1)p / (n - p)) F_{p,n-p}(α).                                           (4)

Thus the control limits of the χ² control chart can be formed as [8]

  UCL = ((n - 1)p / (n - p)) F_{p,n-p}(α)   and   LCL = 0.                          (5)

There is a reason proposed in [8] for LCL = 0, but it is ignored here. It is convenient to refer to
the simultaneous intervals for the confidence interval of each μj, j = 1, . . . , p, as ([7], page 193)

  x̄j - sqrt( ((n - 1)p / (n - p)) F_{p,n-p}(α) ) sqrt( s_jj / n )
      ≤ μj ≤ x̄j + sqrt( ((n - 1)p / (n - p)) F_{p,n-p}(α) ) sqrt( s_jj / n ),   j = 1, . . . , p.

Equation (5) allows us to simplify the statement above as

  x̄j - sqrt( UCL · s_jj / n ) ≤ μj ≤ x̄j + sqrt( UCL · s_jj / n ),   j = 1, . . . , p.   (6)
In this paper, we replace the arithmetic mean by a harmonic mean in the fuzzy sense.
The harmonic mean is used to increase the precision of the average quantity (since we
know that arithmetic mean ≥ harmonic mean) and it is formulated as

  x̄_H = n / ( Σ_{i=1}^{n} 1/x_i ).

The harmonic mean is useful if we have highly oscillating data. Additionally, there exists
no literature so far that proceeds in this way, which shows the originality of this paper, even
though the number of data used is considerably small. The covariance matrix S used is the
usual one, i.e. the M × M matrix S = (S_pq) with entries

  S_pq = (1/n) Σ_{i=1}^{n} (x_{pi} - x̄_p)(x_{qi} - x̄_q),   for p, q = 1, 2, . . . , M.
Example 2 shows the idea of Hotelling-harmonic-fuzzy. Note that the number of samples
for each item (variable) is 6.

Example 2. Suppose the given data are as shown in Table 2, where each entry is an
average over the nj students evaluating each lecturer. We assume that the result is
independent of the number of students in each class.
If each number in Table 2 were presented as a characteristic function, then many
parameters would have to be determined. Hence we define the characteristic function of the
harmonic mean of each item. The resulting characteristic function of the harmonic mean is
depicted in Figure 2.
Furthermore, the harmonic means of each lecturer over all items and of each item
over all lecturers are shown in Table 2. Based on the total average, one has 1.77, which is
not precisely one of the original values (0, 1, 2, and 3). Since it is close to 2, the lecturers'
performance could be considered satisfactory. This paper evaluates this more rigorously.
The covariance matrix in the classical sense can be obtained with the function cov in
MATLAB, which yields

  S   = [ 0.2776  0.2921  0.2748  0.0939
          0.2921  0.3180  0.2759  0.1185
          0.2748  0.2759  0.4447  0.2008
          0.0939  0.1185  0.2008  0.2422 ].

Furthermore, using the harmonic mean, the sample covariance matrix S_h is obtained as

  S_h = [ 0.2396  0.2531  0.2449  0.0889
          0.2531  0.2765  0.2488  0.1114
          0.2449  0.2488  0.4015  0.1881
          0.0889  0.1114  0.1881  0.2157 ].

In the remaining paragraphs we compute all related formulas using the harmonic mean.

One may also observe that the covariance matrix S_h is positive definite by computing its
eigenvalues, which are all positive (0.0015, 0.0717, 0.1710 and 0.8891). We have also
computed the inverse of S_h as

  S_h^{-1} = [  395.0665  -324.4680   -71.3350    67.0196
               -324.4680   274.6743    53.3587   -54.7142
                -71.3350    53.3587    20.4292   -15.9823
                 67.0196   -54.7142   -15.9823    19.2192 ].

Finally one needs to consider the Hotelling statistic involved in this paper, which is mostly
taken from the literature (Johnson and Wichern, 2007) and reformulated using the harmonic mean.
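As a cross-check of this idea, the following Python sketch (not the authors' code; numpy and
the function names are assumptions) centers the Table 2 data at the harmonic means, forms the
covariance matrix with the 1/n convention used above, and evaluates the statistic of Eq. (2).
Depending on rounding and normalization conventions, the statistic need not coincide exactly
with the χ² value quoted below.

import numpy as np

def harmonic_mean(rows):
    return rows.shape[1] / np.sum(1.0 / rows, axis=1)

def hotelling_harmonic(X, mu0):
    n = X.shape[1]                                  # number of observations
    xh = harmonic_mean(X)                           # harmonic mean per variable
    centered = X - xh[:, None]
    S_h = centered @ centered.T / n                 # covariance with 1/n, as in the text
    stat = n * (xh - mu0) @ np.linalg.inv(S_h) @ (xh - mu0)
    return xh, S_h, stat

# Table 2 data (4 variables x 6 lecturers), mu0 = (3, 2, 2, 1)^T.
X = np.array([[3.0, 1.67, 1.89, 1.71, 1.62, 2.2],
              [3.0, 1.51, 2.00, 1.57, 1.58, 2.05],
              [3.0, 1.72, 1.11, 1.29, 2.01, 1.83],
              [2.0, 1.41, 1.78, 1.14, 2.48, 1.83]])
xh, S_h, stat = hotelling_harmonic(X, np.array([3, 2, 2, 1.0]))
print(np.round(xh, 2))        # compare the row harmonic means in Table 2
print(np.round(S_h, 4))       # compare the matrix S_h given above
print(np.round(stat, 2))      # statistic of Eq. (2)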

The hypothesis test with H0 : μ0 = (3, 2, 2, 1)^T against H1 : μ ≠ (3, 2, 2, 1)^T at the level of
significance α = 0.10 is employed here. The observed statistic is χ² = 238.9666. We compare the
observed χ² = 238.9666 with the critical value (using Eq. (4))

  ((n - 1)p / (n - p)) F_{p,n-p}(0.10) = (5 · 4 / 2) F_{4,2}(0.10) = 10 (9.2434) = 92.434.

Thus we get χ² > 92.434 and consequently we reject H0 at the 10% level of significance.
For completeness, the value of F_{4,2}(0.10) is computed using the MATLAB function finv,
typed as finv(0.90,4,2). Finally, we let fuzziness take place in this hypothesis.
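The paper computes this critical value with MATLAB's finv; a minimal equivalent sketch in
Python (assuming scipy is available, which the paper does not use) is:

from scipy.stats import f

n, p, alpha = 6, 4, 0.10
F_crit = f.ppf(1 - alpha, p, n - p)          # upper 10% point of F(4, 2)
ucl = (n - 1) * p / (n - p) * F_crit         # (n-1)p/(n-p) * F_{p,n-p}(alpha)
print(round(F_crit, 4), round(ucl, 3))       # about 9.2434 and 92.434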

2.2 Hotelling-fuzzy. We agree that the hypothesis is proposed by assuming that the given
μ0 is known. In this case we present the example μ0 = (3, 2, 2, 1)^T with its
characteristic functions; using the procedures shown in Example 1, we directly obtain the
characteristic functions depicted in Figure 3. Since we allow μ0 as in Figure 3, we will
also have as many values of χ² as the number of points used to define each characteristic
function. Thus we have a vector of values of χ², which is shown in Figure 3.

Figure 3. The illustration of all characteristic functions of μ0 with a1 = 0.1, q1 = 2 and
a2 = 0.2, q2 = 2. The values of m1 and m2 are taken from the harmonic mean of each item in
Table 2. Note that there are 4 characteristic functions, but the second and the third lie on the
same curve.

Be aware that the given μ0 in the fuzzy sense means all points x on the horizontal axis
with ξ(x) > 0. Practically, this means that we need to search the index of each characteristic
function (since each function is discretized) and hence find the corresponding value of x. What
will be a problem here? We always need a set of μ0 (with the number of elements the
same as the predicted one). Let us study ξ(x) = 0.6 for one of the possible memberships of
μ0. We draw a horizontal line that denotes ξ(x) = 0.6 to find intersection points such that we
can draw the vertical lines (denoted by an arrow in Figure 4).

The vertical lines denote the possible new sets of μ0, and we always have two values for
each given characteristic value (membership) that may act as the bounds of each fuzzy μ0.
Let us denote these intervals by I_{0,j}, where the index j refers to the j-th component of μ0.
Thus, to make the computation simpler, we choose inf(I_{0,j}) and sup(I_{0,j}) to continue our
hypothesis test. The number of points used will influence the obtained fuzzy μ0.
Figure 4. The illustration of all μ0 taken from the characteristic functions with a1 = 0.1, q1 = 2
and a2 = 0.2, q2 = 2. The values of m1 and m2 are taken from the harmonic mean of each item in
Table 2.

Due to numerical technicalities, the idea of using inf(I_{0,j}) and sup(I_{0,j}) cannot yet
be implemented automatically. Instead, we can do it manually, but this becomes time consuming
for a large number of points. We therefore leave this problem for future research. In this paper we
apply μ0 in the hypothesis test following the usual procedure in the classical sense.
Example 3. Suppose that we have μ0 = (3, 2, 2, 1)^T and the resulting characteristic functions are
shown in Figure 4. If ξ(x) = 0.6, we get intervals for each value of μ0 = (3, 2, 2, 1)^T as shown in
Figure 4. Practically, there exists no value with ξ(x) = 0.6 exactly, since the function is
constructed by Equation (1). Manually, by drawing the horizontal line and the vertical lines as
shown in Figure 4, we obtain approximations of inf(I_{0,j}) and sup(I_{0,j}). We propose
inf(I_{0,j}) = [0.8, 1.9, 1.9, 2.9] and sup(I_{0,j}) = [1.1, 2.1, 2.1, 3.2]. These results lead to the
same conclusion: H0 : μ0 = (3, 2, 2, 1)^T is again rejected at the 10% level of significance.
On the other hand, one may determine all sets of μ0 to find a 100(1 - α)% confidence
region for the mean of a p-dimensional normal distribution such that

  χ² = n (X̄ - μ0)^T S^{-1} (X̄ - μ0) ≤ ((n - 1)p / (n - p)) F_{p,n-p}(α)             (7)

provided the covariance matrix S is positive definite. The complete observation will be
examined in Section 4. One may introduce Principal Component Analysis (PCA) or Variable
Analysis to reduce the number of variables if necessary.

3. RESEARCH METHOD

The data used are taken from the student survey. We assume that the observations
are independent of the given lecturers and of the number of students in each class. Table 1
contains the questions which are used to evaluate lecturers' performance. The results are shown
in Table 3.
TABLE III
STUDENTS SURVEY FOR 6 LECTURERS BASED ON 16 VARIABLES
THE SYMBOL D j INDICATES THE NAME OF J-TH LECTURER.
No D1 D2 D3 D4 D5 D6
1 4.00 2.67 2.89 2.71 2.62 3.20
2 4.00 2.51 3.00 2.57 2.58 3.05
3 4.00 2.72 2.11 2.29 3.01 2.83
4 3.20 2.41 2.78 2.14 3.48 2.83
5 4.00 2.59 2.11 2.29 3.14 2.88
6 3.20 2.79 2.89 2.29 3.48 2.85
7 4.00 2.41 3.11 2.43 3.35 3.12
8 3.73 2.33 2.33 2.14 2.88 2.67
9 4.00 2.21 3.00 2.71 2.75 3.02
10 3.73 2.33 2.89 2.29 2.67 3.03
11 3.73 2.62 3.00 2.71 3.40 2.92
12 4.00 2.31 2.56 2.86 3.18 2.43
13 3.47 2.51 3.56 2.86 3.05 3.38
14 3.47 2.23 3.22 2.71 3.05 3.20
15 3.73 2.36 3.11 2.29 2.92 2.92
16 3.73 2.64 2.78 2.57 3.35 3.12

3.2 Calculate the harmonic mean of each variable.
The harmonic mean of each variable is computed using the formula in Section 2.
3.3 Compute the covariance matrix, in which the average is taken to be the harmonic
mean. If the covariance matrix is singular, one needs to reduce the number of variables by
principal component analysis.
3.4 Given the predicted μ0, define the characteristic function of each element of μ0.
3.5 Compute the value of χ² for each set of μ0 taken from the characteristic functions.

4. RESULT AND ANALYSIS

The evaluation is based on Table 3, which means that we have p = 16 as the number of
variables. Unfortunately, the covariance matrix is almost singular, which causes the inverse
matrix to be badly obtained. One may observe this by computing its determinant, which tends
to 0 (of order 10^{-170}). According to [7] (page 110), this is caused by the number of samples
being less than the number of variables. This means that some variables
should be removed from the study. The corresponding reduced data matrix will then lead to a
covariance matrix of full rank and nonzero generalized variance. Thus, as mentioned in Section
2, we need to use principal component analysis (PCA).

The standard PCA was employed. The proportion of total variance accounted for by the
first principal component is λ1 / Σ_{j=1}^{p} λj = 0.7943. Continuing, the first two
components account for a proportion (λ1 + λ2) / Σ_{j=1}^{p} λj = 0.8933 of the sample variance.
After the sixth eigenvalue, we observe that only six variables have nonzero proportions. Thus we
will only consider 6 variables in the analysis. Additionally, we also observe that
λj ≈ 0 for j = 7, . . . , p, whereas λ1 = 3.1702, λ2 = 0.3948, λ3 = 0.2859, λ4 = 0.1110,
λ5 = 0.0274, and λ6 = 0.0016. It is therefore reasonable to choose the first six variables.
Unfortunately, the covariance matrix is then still singular. To handle this problem, the pseudo-
inverse matrix is applied. Again, one needs to note from Equation (2) that the number of
samples is not allowed to be the same as the number of variables, which would lead to division
by zero. Since the eigenvalues represent the variances of the variables, we select only the first
four variables such that n > p. In this case, we are led to the result in Section 2.
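As a small illustration of this eigenvalue-based variance check (not the authors' code; numpy
is assumed), the following sketch reuses the 4 × 4 matrix S_h from Section 2, since the 16 × 16
covariance matrix itself is not reproduced in the paper; the eigenvalues should be close to the
values quoted there, up to rounding.

import numpy as np

S_h = np.array([[0.2396, 0.2531, 0.2449, 0.0889],
                [0.2531, 0.2765, 0.2488, 0.1114],
                [0.2449, 0.2488, 0.4015, 0.1881],
                [0.0889, 0.1114, 0.1881, 0.2157]])

eigvals = np.linalg.eigvalsh(S_h)[::-1]      # eigenvalues, largest first
proportions = eigvals / eigvals.sum()        # variance explained per component
print(np.round(eigvals, 4))
print(np.round(np.cumsum(proportions), 4))   # cumulative proportion of variance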
The control limits of the χ² control chart can be obtained using Eq. (4), which yields

  UCL = ((6 - 1) · 4 / (6 - 4)) F_{4,2}(0.10) = 92.4342   and   LCL = 0.

Section 2 suggests computing the confidence intervals of the reduced data, and we have the
following results:

  1.0453 ≤ μ1 ≤ 4.8552 ;   0.8337 ≤ μ2 ≤ 4.9181 ;
  0.2776 ≤ μ3 ≤ 5.1428 ;   0.9465 ≤ μ4 ≤ 4.5201 .
Up to now, a qualitative conclusion on the lecturers' performance, which is what is practically
useful for application, has not yet been drawn. These intervals show us that all observed values
lie in the intervals. Finally, we suggest that the evaluation proceeds as follows:
1. Compute UCL = ((n - 1)p / (n - p)) F_{p,n-p}(α), where n is the number of observed lecturers
   and p is the number of variables used in the evaluation (the number of survey items).
2. Compute the harmonic mean of each variable.
3. Compute the confidence interval of the harmonic mean of each variable using
   Equation (6).
4. Divide each interval into as many subintervals as there are quality categories, so that each
   subinterval corresponds to one quality. These subintervals can be presented by their
   characteristic functions as shown in Figure 5. Since we have 4 categories of quality
   (poor, fair, satisfactory and good), each interval is divided into four subintervals. Thus we
   have categories for each variable in the intervals given in Table 4.
TABLE IV
CONFIDENCE INTERVAL FOR EACH VARIABLE
VS EACH QUALITY OF LECTURERS' PERFORMANCE

TABLE V
QUALITY OF LECTURERS' PERFORMANCE BASED ON THE SURVEY DATA 2008-2009.

  Name of parameter:  Utilizes content      Clearness of         Systematical     Encourages
                      scope and sequence    assignments and      in lecturing     students
                      planning (2.75)       evaluations (1.57)   (1.61)           attendance (1.39)
  Quality:            Fair                  Poor                 Fair             Poor

Fig. 5. Characteristic function of each subinterval of μ1.


5. CONCLUSION

This paper proposes an evaluation of lecturers' performance by a Hotelling-harmonic-
fuzzy approach. The harmonic mean is used to increase the precision of averaging and replaces
the arithmetic mean in the covariance matrix. The Hotelling statistic is employed here to find
the confidence interval of the mean value of each variable. Using 16 variables, we obtain a
singular covariance matrix, so that only 4 variables can be considered in the analysis after
principal component analysis, since the number of variables (p) must be less than the number
of observations (n).
The studied data come from the student survey in the Mathematics Department of the Science
and Mathematics Faculty, Satya Wacana Christian University, in the years 2008-2009. Based on
the 4 variables used in the analysis, we obtain that the lecturers' performance is only fair or poor.

Acknowledgement. The authors gratefully acknowledge T. Mahatma,SPd, M.Kom for


assistance with English in this manuscript, and the Research Center in Satya Wacana
Christian University for financial support of this work.

References

[1] YUSRIZAL AND HALIM, A.,”Development and Validation of an Intrument to Access the Lecturers‟ Performance
in the Education and Teaching Duties”, Jurnal Pendidikan Malaysia 34(2) 33-47,2009.
[2] JIAN MA AND ZHOU, D. „‟Fuzzy Set Approach to the Assessment of Student-Centered Learning‟‟, IEEE
Transaction on Education, Vo. 43, No. 2,2000.
[3] BOUHOUCHE, S., BOUHOUCHE, M. LAHRECHE, , A. MOUSSAOUI, AND J. BAST, „‟Quality Monitoring Using
Principal Component Analysis and Fuzzy Logic Application in Continuous Casting Process‟‟, American
Journal of Applied Sciences, 4 (9):637-644,ISSN 1546-9239, 2007.
[4] SHANNON, A, E. SOTIROVA, K. ATANASSOV, M. KRAWCZAK, P.M, PINTO, T. KIM, „‟Generalized Net Model of
Lecturers‟s Evaluation of Student Work With Intuitionistic Fuzzy Estimations‟‟, Second International
Workshop on IFSs Banska Bystrica, Slovakia, NIFS 12,4,22-28,2006.
[5] GHAZINOORY, A., ZADEH, A.E., “Memariani, Fuzzy SWOT analysis, Journal of Intelligent & Fuzzy Systems”
18, 99-108, IOP Press, 2007.
[6] VIERTL, R., “Statistical Methods for Non-Precise Data”, CRC Press,Tokyo, 1996.
[7] JOHNSON, R.A. AND WICHERN,D.W Applied Multivariate Statistical Analysis, 6th ed. Prentice Hall, ISBN 0-13-
187715-1, 2007.
[8] KHALIDI, M.S.A, „‟Multivariate Quality Control, Statistical Performance and Economic Feasibility‟‟,
Dissertation, Wichita State University, Wichita-Kansas ,2007.

HANNA ARINI PARHUSIP


FSM-Satya Wacana Christian University.
e-mail: [email protected]

ADI SETIAWAN
FSM-Satya Wacana Christian University.
e-mail: [email protected]
Proceedings of ”The 6th SEAMS-UGM Conference 2011”
Mathematics Education, pp. 659 - 670.

DIFFERENCES IN CREATIVITY QUALITIES BETWEEN


REFLECTIVE AND IMPULSIVE STUDENTS IN SOLVING
MATHEMATICS PROBLEMS

WARLI

Abstract. This research aims at describing differences in creativity between reflective and impulsive
students in solving mathematics problems. The analyses and results are based largely on written tests
and task-based interviews. This research is of an explorative and qualitative nature. The subjects are
junior high school students of reflective and impulsive cognitive styles as measured by the MFFT
(Matching Familiar Figures Test). Method triangulation was used to validate the collected data by
comparing written test and interview results. The research results in the following. Differences in
creativity qualities between reflective and impulsive students in solving problems are as follows. In
the planning phase, reflective students perform a little better than impulsive ones. In the executing
phase, for the novelty and flexibility qualities, reflective students perform better than impulsive
students. In the looking-back phase, reflective students tend to look back at their work, while
impulsive students do not.

Keywords and Phrases: creativity, problem solving, reflective, impulsive, and cognitive styles.

1. INTRODUCTION
From the 1980s to the present, problem solving has been recognized as a difficult task in
learning mathematics. LeBlanc, Proudfit and Putt [1] said that developing skill in problem
solving has long been recognized as one of the important goals of the elementary school
mathematics program, and that instruction in problem solving has also been recognized as a
difficult task. Suryadi et al. [2] also said that problem solving is still considered the most
difficult topic for students to learn and for teachers to teach. This is reinforced by the results
of the tests conducted by PISA (Programme for International Student Assessment): Indonesian
students' achievement has not been satisfactory. The mode of the mathematical problem-solving
ability of Indonesian students is located at Level 1; that is, about 49.7% of students are at the
lowest level. At Level 1, students are only able to solve mathematical problems that can be
solved in one step [3]. In addition, research by Siswono [4] showed that one of the problems in
learning mathematics in junior high school is students' low ability in problem solving (word
problems), in particular for problems that are not routine or are open ended. One cause of low
problem-solving skills is that planning does not consider varied problem-solving strategies or
strategies that encourage creative thinking skills to find answers to problems.
Polya [5] says that real problem-solving skill lies in the idea of devising a plan.
Likewise, Orton [6] states that the crucial and sometimes very difficult stages are the middle
two, particularly stage 2 (devising a plan), for which creativity, inventiveness and insight might
be required. Based on these opinions, it is clear that creativity is an important asset for
improving students' problem-solving abilities. Developing problem-solving ability indirectly
also develops creative solutions to problems.
Creativity in problem solving is the individual's ability to generate ideas that are
"new" in finding a way or tool to obtain answers to a question (problem) fluently and flexibly.
Fluency in problem solving refers to the diversity (variety) of correct answers that students give
to the problem. Flexibility in problem solving refers to the ability of students to solve
problems in many different ways; students are able to change an approach to a problem into a
different solution. "New" in problem solving refers to the ability of students to answer a
problem with a few different but correct answers, or with an answer that is not usually
given by individuals (students) at their stage of intellectual development or their
knowledge level.
Sternberg [7] states that creativity is a unique meeting point between three psychological
attributes: intelligence, cognitive style, and personality/motivation. Keeping these three
attributes in mind helps to understand what lies behind the creative individual. Woodman and
Schoenfeldt [7] say that creativity can be investigated from the perspectives of: 1) personality
differences, 2) differences in cognitive style or ability, and 3) social psychology, which combines
individually creative behavior and socially creative behavior. Based on these opinions, creativity
and cognitive style have a close relationship. Creativity can be investigated from
the perspective of cognitive style differences. Cognitive style is an important part of assessing
creativity, because cognitive style is one of the psychological attributes of creativity.
Therefore, creative problem solving can be assessed based on different cognitive styles.
Cognitive style is a characteristic of individuals in remembering, organizing, processing,
and problem solving, in an attempt to distinguish, understand, store, create, and use
information. The cognitive style examined in this study is the one proposed by Jerome
Kagan [19], the reflective vs. impulsive cognitive style. Impulsive students are students who are
characteristically quick in answering a problem but not, or less, accurate, so that their
answers tend to be wrong. Reflective students are students who are characteristically slow in
responding to problems but accurate, so that their answers tend to be correct. The reasons for
choosing the reflective vs. impulsive cognitive style (Kagan) include: a) the object studied is
creative problem solving, which requires divergent and reflective thinking skills; b) people make
errors in thinking, among others by being hasty, disorganized, unfocused, and narrow, tendencies
which impulsive children have, while the slow thinking which reflective children tend to have may
either hinder or support creative problem solving; and c) the reflective vs. impulsive cognitive
style has not been much studied or developed in depth in Indonesia.
Different cognitive styles of students may cause differences in creativity in solving
problems. Nietfeld and Bosma [8] describe impulsives as individuals who act without much
forethought, are spontaneous, and take more risks in everyday activities, and reflectives as
more cautious, intent upon correctness or accuracy, and taking more time to ponder situations.
According to Kozhevnikov [9], researchers did find, however, that impulsive children displayed
more aggression than reflective children and also that reflective children exhibited more
advanced moral judgment than impulsive ones. The differences in executive functioning and
attentional control would be reflected in cognitive style dimensions located on the control
allocation metadimension, with reflective-impulsive individuals differing primarily in the
allocation of their attentional resources when performing simple perceptual tasks and with
constricted-flexible individuals differing in their level of self-monitoring when carrying out
complex thinking and reasoning processes. Such an approach, relating information processing
theories and intelligence components to different cognitive style dimensions, could provide a
general research model, which could be more fully adapted by investigators concerned with
the specific relations among learning, memory, attention, and cognitive style. Students' gender
may also cause differences in the reflective vs. impulsive cognitive style.
Norton [10] noted that boys tended to be more impulsive, had a willingness to take risks, and
were happier to launch into practical work even though they did not know what they were doing.
In contrast, girls were more reflective and felt inhibited about commencing the task.
Regarding the relationship between the reflective vs. impulsive cognitive style and problem
solving, McKinney [11] explains that the data show that reflective children process information
on tasks/problems more efficiently than impulsive children and use more systematic or forward
strategies. According to Landry [12], student problem-solving behavior
can be classified along a continuum from impulsive to reflective. Reflective problem solving
is characterized by thoughtfulness and looking back. Impulsive problem solving is
characterized by a precipitous jump to an implemented solution without either sufficient
reflection or thought. Although correct solutions can be conceived from an intuitive leap,
standard practice prescribes reflective approaches to software development. In Orton
[6], Suydam and Weaver also noted that more impulsive students are often poor problem
solvers, while more reflective students are likely to be good problem solvers. According to
these opinions, it can be said that children with impulsive or reflective cognitive styles differ
in their problem-solving strategies. This suggests that children who have different cognitive
styles will have different problem-solving profiles as well.
Several previous studies have related reflective or impulsive children to
creativity. Fareer [13] investigated the relationship between the impulsive-reflective dimension
and creative thinking, critical thinking, and intelligence, and found no difference between
reflective students and impulsive students on cognitive factors (action recognition,
interpretation, fluency in skills, general thinking ability, spontaneous flexibility, and
general ability to think critically). Fuqua, Bartsch, and Phye [14] found a significant effect of
cognitive tempo: reflective subjects had higher scores than impulsive subjects on each
measure of creativity. But Ward [14] found no significant relationship between students'
reflectiveness-impulsiveness and Wallach-Kogan divergent scores, whereas Kagan and Kogan [15]
explain that the reflective-impulsive dimension also affects the quality of inductive
reasoning. Garrett [16] found that reflectives performed better on logical deductive
reasoning tasks than impulsives.
This study examines the differences in creativity between students with these
distinct cognitive styles (reflective and impulsive) in solving mathematical problems. The
purpose of this study is to describe in detail the differences in creativity between reflective
and impulsive students in solving mathematics problems.
2. METHOD
This study intends to describe in detail the differences in mathematical problem-
solving creativity, in particular on geometry, of the research subjects. To obtain these
descriptions, creative problem-solving tests were administered. This type of research is
exploratory qualitative research with primary data in the form of writing (the written test) and
words from the task-based interviews.
The research subjects are grade VII junior high school students of reflective and
impulsive cognitive styles. The subjects were 8 students: four reflective and four impulsive.
The instrument to determine the reflective-impulsive cognitive style was developed
from the test made by Jerome Kagan, the MFFT (Matching Familiar Figures Test). The
reasons include: 1) the MFFT is a typical instrument for assessing the reflective-impulsive
cognitive style [17]; 2) the MFFT is an instrument that is widely used to measure cognitive
tempo [18].
The main instrument is the researcher himself, aided by auxiliary instruments,
including: 1) the problem-solving task, 2) the interview guide, and 3) the MFFT (for
determination of the subjects). The PST (problem-solving task) instrument is used to obtain
data on the creative geometry problem solving of reflective and impulsive students. The
interview guide instrument is used to explore the students' creative solving of geometry
problems.
The process of collecting data in this study uses two methods, i.e. the problem-solving
task and the interview. The given PST is a geometry task consisting of two problems, namely:
1) the area of a rectangle, 2) the perimeter of a rectangle. The problem-solving task is
followed by clarification, to clarify written answers that are unclear or answers that are not
written. The triangulation used in this study is method triangulation. The data are said to be
valid if the written test results equal the data from the interviews.
Furthermore, the data are analyzed to obtain valid conclusions as the outcome of the research.
For the valid data, scoring (coding) is performed with reference to each indicator of creativity
and each problem-solving phase. Scoring is performed twice, namely achievement scores and
weighted scores. Achievement scores are the scores achieved by students in solving the
problem-solving task. The weighted score is obtained by multiplying the achievement score
by the weight of each indicator of creativity. The weights are determined based on the
quality of each indicator of creativity: novelty is given weight 3, flexibility is given weight 2,
and fluency is given weight 1. The quality of creativity is the sum of the weighted scores for
fluency, flexibility, and novelty. If each indicator reaches the highest score, which is 3, then
the highest weighted score obtained is 18 ((3 x 1) + (3 x 2) + (3 x 3) = 18), by multiplying each
score by the weight of each indicator. The following criteria determine the quality of creativity.

Table 1. Quality of Creativity

  Quality of Creativity      Weighted Score (ws)
  high creativity            ws ≥ 16
  creativity is quite        10 ≤ ws ≤ 15
  low creativity             4 ≤ ws ≤ 9
  creativity is very low     0 ≤ ws ≤ 3
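A minimal sketch (not part of the paper) of this weighting scheme in Python; the function
name and dictionary layout are illustrative only, while the weights, thresholds and labels
follow Table 1 above.

WEIGHTS = {"fluency": 1, "flexibility": 2, "novelty": 3}

def creativity_quality(scores):
    """Map achievement scores (0-3 per indicator) to a weighted score and label."""
    ws = sum(WEIGHTS[k] * scores[k] for k in WEIGHTS)   # weighted score, max 18
    if ws >= 16:
        label = "high creativity"
    elif ws >= 10:
        label = "creativity is quite"        # wording as in Table 1
    elif ws >= 4:
        label = "low creativity"
    else:
        label = "creativity is very low"
    return ws, label

print(creativity_quality({"fluency": 3, "flexibility": 3, "novelty": 3}))  # (18, high)
print(creativity_quality({"fluency": 2, "flexibility": 2, "novelty": 1}))  # (9, low)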
The following describes the coding/scoring guidelines for every indicator of creativity in each
problem-solving phase (especially the planning and executing phases).
Table 2. Scoring (Coding) Guidelines at the Planning Phase

Fluency Indicators
Code/Score 3 (Very fluent): The student shows diverse problem-solving plans (more than two kinds) and all of them are correct.
Code/Score 2 (Fluent): The student shows one or two kinds of problem-solving plan, all correct; or shows more than two kinds of plans, but some plans are written/stated incorrectly.
Code/Score 1 (Less fluent): The student shows one or two kinds of problem-solving plan, but only part of them can be written/stated correctly.
Code/Score 0 (Not fluent): The student does not show diverse problem-solving plans, or shows one or two plans but they are false.

Flexibility Indicators
Code/Score 3 (Very flexible): The student can change the problem-solving plan into more than two different plans, all correct.
Code/Score 2 (Flexible): The student can change the problem-solving plan into one or two different plans; or can change it into more than two different plans, but some plans are written/stated incorrectly.
Code/Score 1 (Less flexible): The student can change the problem-solving plan into one or two different plans, but only some of them can be written/stated correctly.
Code/Score 0 (Not flexible): The student cannot change the problem-solving plan into a different plan; or can change it into different plans, but they are incorrect.

Novelty Indicators
Code/Score 3 (Very novel): The student can demonstrate more than two different problem-solving plans and write/state them correctly; or shows one or more problem-solving plans that are unusual for individuals (students) at their stage of development or knowledge.
Code/Score 2 (Novel): The student can demonstrate one or two different problem-solving plans and write/state them correctly; or can demonstrate more than two different plans, but some plans are written/stated incorrectly.
Code/Score 1 (Less novel): The student can demonstrate one or two different problem-solving plans, but can only write/state some of them correctly; or can demonstrate a problem-solving plan that is unusual for individuals (students) at their stage of development or knowledge, but can only write/state part of it correctly.
Code/Score 0 (Not novel): The student cannot demonstrate a different problem-solving plan, or cannot show a plan that is unusual for individuals (students) at their stage of development or knowledge; or can show diverse problem-solving plans, but they are false.

The description of the creativity indicators at the executing phase is analogous to Table 2,
replacing the problem-solving plan with the way the plan is carried out.
Furthermore, a student's creativity in solving a problem is determined by the quality of
the creativity at each stage, especially the planning phase (phase 2) and the executing phase
(phase 3).

Table 3. The Quality of Problem Solving Creativity
(each pair gives the creativity quality at Phase 2 / Phase 3)

PSC very high : high/high, high/moderate, moderate/high
PSC high      : high/low, high/very low, low/high, very low/high
PSC moderate  : moderate/moderate, moderate/low, low/moderate
PSC low       : moderate/very low, very low/moderate, low/low
PSC very low  : low/very low, very low/low, very low/very low

3. RESULTS AND DISCUSSION

To answer the research objective, namely the differences in creative problem solving
between reflective and impulsive students, an analysis of the differences in problem-solving
creativity was carried out. The analysis was done by looking at the tendencies of the two
groups of subjects in the creativity of solving problems, and was performed for each indicator
of creativity. The valid score data follow.

Table 4. Distribution of Valid Score/Code Data for Reflective and Impulsive Students

REFLECTIVE SUBJECT IMPULSIVE SUBJECT


P C KP CP BU TP DK AE SAJ ER
S I P1 P P P2 P1 P2 P1 P2 P1 P2 P1 P2 P1 P2 P1 P2
2 1
L K 3 2 2 2 2 2 2 2 2 2 2 2 1 1 1 2
2 N 9 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
F 4 0 0 4 0 0 0 0 0 0 0 0 0 0 0 0
L K 2 3 3 3 3 3 3 3 3 2 3 3 3 3 3 3

3 N 9 9 0 9 0 0 0 9 0 0 0 0 0 0 0 0
F 6 2 0 6 0 0 - 0 2 2 0 0 0 0 - 0
L 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 -
4
Notes: PS = phase of problem solving; CI = creativity indicator; Pi = problem i, i = 1, 2;
K = fluency; B = novelty; F = flexibility.
Based on the valid score (code) data for reflective and impulsive students in Table 4, we
explain the differences in creative problem solving at every stage of problem solving and for
each indicator of creativity (fluency, novelty, and flexibility).

3.1 Creativity Differences between Reflective and Impulsive Students in the Devising-a-Plan
Phase
3.1.1. Fluency Indicator. Based on Table 4, both reflective and impulsive subjects showed a
tendency to have two plans, namely to draw first and then determine the sizes, or vice versa.
However, there is a reflective subject who was able to make as many as three plans, while
there are impulsive subjects who showed only one problem-solving plan. The following is an
excerpt of the interview with KP (a reflective subject).
R : For question (b), what is your plan?
KP : ... (silence) the same as (a), first I draw the picture and then count its size, but
sometimes I calculate or determine the size and after that I make the picture
R : What is your plan to solve question (c)?
KP : ... (silence) I cut it into pieces, then the pieces are connected to form a new shape
Referring to these facts it can be concluded that, in the fluency of planning the problem
solving, both reflective and impulsive students tend to be fluent, having two problem-solving
plans, namely to draw first and then determine the sizes or vice versa. However, reflective
students tend to be very fluent more often than impulsive students.

3.1.2. Novelty Indicator. Based on Table 4, both reflective and impulsive subjects tend not to
meet novelty in the problem-solving plan. However, there is one reflective subject who meets
novelty in planning the problem solving, while the others do not. As in the interview excerpt
above, "... (silence) I cut it into pieces, then the pieces are connected to form a new shape" is a
"new" plan, because it is beyond the usual ability of a child at her age.
Referring to these facts it can be concluded that both reflective and impulsive subjects
tend not to meet novelty in the problem-solving plan. However, reflective students are better
than impulsive students in novelty.

3.1.3. Flexibility Indicator. Based on Table 4, there are two reflective subjects who meet
flexibility in planning the problem solving, while the others do not. Among the impulsive
subjects there is none who meets flexibility. Referring to these facts it can be concluded that,
in the flexibility of planning the problem solving, reflective students are better than impulsive
students.
To find out the creativity differences between reflective and impulsive students in the
problem-solving planning stage, the weighted scores of the three indicators, namely fluency,
novelty, and flexibility, are summed. The authors present the summation results in the
diagram below.

P-i = Problem i, i = 1, 2
S.i = Subject i, i = 1, 2, 3, 4
Figure 1. Profile of the differences between reflective and impulsive children's creativity in
the devising-a-plan phase
Based on Figure 1, the differences in students' creativity show that reflective students'
creativity in problem-solving planning tends to be low, while impulsive students' creativity in
problem-solving planning tends to be very low.

3.2. Creativity Differences between Reflective and Impulsive Students in the Carrying-out-
the-Plan Phase
3.2.1. Fluency Indicator. Based on Table 4, it was found that in general both reflective and
impulsive students tend to meet fluency in carrying out the plan. Likewise, there are reflective
as well as impulsive subjects who are only able to carry out two ways of solving. Referring to
these facts it can be concluded that reflective and impulsive students are relatively the same in
the fluency of carrying out the problem solving: both tend to be very fluent in carrying out the
plan.

3.2.2. Novelty Indicator. Based on Table 4, it was found that there are three reflective subjects
who meet novelty in carrying out the plan, whereas no impulsive subject meets novelty. The
following is an excerpt of the interview with KP (a reflective subject).
R : Are you able to transform it again into another form?
KP : ... (still) it becomes a rectangle
R : How do you do it?
KP : ... (silence) I move the pieces
R : Do you still have any other ideas?
KP : ... (still thinking) I do not know its name
R : How?
KP : ... (drawing) this, Sir!

Referring to these facts it can be concluded that, in the novelty of carrying out the plan,
reflective students are better than impulsive students.

3.2.3. Flexibility Indicator
Based on Table 4, it was found that both reflective and impulsive subjects tend not to
meet flexibility in carrying out the plan. However, there are two reflective subjects who meet
flexibility in carrying out the plan, and there is one impulsive subject who meets it; the
reflective students showed better scores. The following is an excerpt of the interview with CP
(a reflective subject); the subject changed the rectangle into a different form with the same
perimeter.

R : Do you have another way?
CP : ... (silence) Well ... if I cut off a corner of the picture on this side (left)
R : How? Try to draw it.
CP : ... (still drawing) this, Sir!

Referring to these facts it can be concluded that reflective and impulsive students tend not to
meet flexibility in the executing phase; however, reflective students are slightly more flexible
than impulsive students in the executing phase.
To find out the creativity differences between reflective and impulsive students in the
executing phase, the weighted scores of the three indicators, namely fluency, novelty, and
flexibility, are summed. The authors present the summation results in the diagram below.

[Bar chart: summed weighted creativity scores (vertical axis 0-20) of the reflective
(REFLEKTIF) and impulsive (IMPULSIF) subjects S.1-S.4 for problems P-1 and P-2.]

P-i = Problem i, i = 1, 2
S.i = Subject i, i = 1, 2, 3, 4

Figure 2. Profile of the differences between reflective and impulsive children's creativity in
the executing phase
Based on Figure 2, the creativity differences indicate that reflective students' creativity in the
executing phase tends to be high, while impulsive students' creativity in the executing phase
tends to be low.
Referring to the planning and executing phases, reflective students' creativity in solving
geometry problems tends to be high, while impulsive students' creativity in solving geometry
problems tends to be very low.

3.3. Creativity Differences between Reflective and Impulsive Students in the Looking-Back
Phase
In the looking-back phase, reflective students tend to examine the results of their work.
Although there is one subject who did not check his work, he checked it before writing it on
the answer sheet; answers are written on the answer sheet only when he is convinced they are
true. Impulsive students, in contrast, tend not to examine the results of their work. A common
reason is that they check while writing an answer: if something is wrong it is immediately
crossed out or corrected. Based on this we can conclude that impulsive students tend not to
examine the results of their work, while reflective students tend to check the results of their
work.

4. CONCLUDING REMARKS

Based on the analysis, the results can be summarized as follows. The creativity profile of
reflective students in solving geometric problems tends to be high, whereas the creativity
profile of impulsive students in solving geometric problems tends to be very low. In detail,
this can be described as follows.
1. In the planning phase, the creativity of reflective students in planning the problem
solving tends to be low, while the creativity of impulsive students in planning the
problem solving tends to be very low.
2. In the executing phase, the creativity of reflective students in carrying out the problem
solving tends to be high, whereas the creativity of impulsive students in carrying out the
problem solving tends to be low.
3. In the looking-back phase, reflective students were more cautious in the executing phase
(more trials first) and considered various aspects, so that they obtained fewer answers,
but correct ones. Impulsive students were less accurate in the executing phase (fewer
trials) and rushed through the problems, so that they had more answers, but often wrong
ones. Impulsive students tend not to examine the results of their work, while reflective
students tend to check the results of their work.

References

[1] LEBLANC, JOHN F., PROUDFIT, LINDA & PUTT, IAN J. Teaching in Problem Solving in the Elementary
School. In Krulik, Stephen & Reys, Robert E. (Ed) Problem Solving in School Mathematics. Reston,
Virginia: NCTM Yearbook 1980.
[2] SUHERMAN, ERMAN DKK. Strategi Pembelajaran Matematika Kontemporer. Bandung: Jurusan
Pendidikan Matematika FPMIPA UPI Bandung. 2001
[3] BALITBANG-DEPDIKNAS. Rembug Nasional Pendidikan Tahun 2007. Jakarta: Badang Penelitian dan
Pengembangan, Departemen Pendidikan Nasional. 2007.
[4] SISWONO, TATAG YE. Desain Tugas untuk Mengidentifikasi Kemampuan Berpikir Kreatif dalam
Matematika. Pancaran Pendidikan Tahun XIX, No. 63 April 2006. Jember: FKIP Universitas Jember.
[5] POLYA, G. How to Solve It. Second Edition. Princeton, New Jersey: Princeton University Press. 1973.
[6] ORTON, ANTHONY. Learning Mathematics. Issues, Theory and Classroom Practice. Second Edition.
Printed and bound in Great Britain by Dotesios Ltd. Trowbrigde, Wilts. 1992.
[7] STERNBERG, ROBERT J. & LUBART, TODD I. Defying the Crowd Cultivating Creativity in a Culture of
Conformity. The Free Press. New York. 1995.
[8] NIETFELD, JOHN & BOSMA, ANTON. Examining the Self-Regulation of Impulsive and Reflective Response
Style on Academic Tasks. Journal of Research in Personality. Vol. 32, (2003) 118 – 140,
[9] KOZHEVNIKOV, MARIA. Cognitive Styles in the Context of Modern Psychology: Toward an Integrated
Framework of Cognitive Style. Psychological Bulletin. 2007, Vol. 133, No. 3, (2007) 464–481. Retrieved
April 1, 2011, from http://www.nmr.mgh.harvard.edu.cognitive/styles2007
[10] NORTON, STEPHEN. Pedagogies for the Engagement of Girls in the Learning of Proportional Reasoning
through Technology Practice. Mathematics Education Research Journal 2006, Vol. 18, No. 3, (2006) 69–
99. Retrieved April 1, 2011, from http://www.merga.net.auMERJ/18/3/Norton.
[11] MCKINNEY, JAMES D. Problem Solving Strategies in Reflective and Impulsive Children. Journal of
Educational Psychology Vol. 67 No.6. (1976) 807-820.
[12] LANDRY, JEFFREY P., PARDUE, J. HAROLD, DORAN, MICHAEL V. AND DAIGLE, ROY J. Encouraging
Students to Adopt Software Engineering Methodologies: The Influence of Structured Group Labs on
Beliefs and Attitudes. Journal of Engineering Education, (2002), 103-107. Retrieved April 23, 2011,
from http://jee.org
[13] AL-SHARKAWY, ANWAR M. Cognitive Style Research in The Arab World. Psychology in The Arab
Countries. Cairo, Egypt: Menoufia University Press, 1998. Retrieved June 15, 2005.
[14] KOGAN, NATHAN. Creativity and Cognitive Style: A Life-Span Perspective. In Baltes, R.B. & Schaie,
K.W. Life-Span Developmental Psychology. Academic Press. London. 1973.
[15] KAGAN, J. AND KOGAN, N. Individual Variation in Cognitive Process. In Mussen, P. (Ed.)
Carmichael’s Manual of Child Psychology (3rd ed., Vol. 1), New York: Wiley. 1970
[16] GARRETT, ROGER M. Problem-solving and Cognitive Style. Research in Science & Technological
Education, Vol. 7: No. 1, (1989) 27 — 44. Retrieved April 23, 2011, from http://pdfserve.informaword
[17] ROZENCWAJG, PAULETTE & CORROYER, DENIS. Cognitive Processes in the Reflective-Impulsive
Cognitive Style. The Journal of Genetic Psychology, 166(4), (2005) 451 – 463.
[18] KENNY, ROBERT F. Digital Narrative as a Change Agent to Teach Reading to Media-Centric Students.
International Journal of Social Sciences, Volume 2, Number 3, 2007.
[19] KAGAN, JEROME. Impulsive and Reflective Children: Significance of Conceptual Tempo. In Krumboltz,
J.D. (Ed.) Learning and the Educational Process. Chicago: Rand McNally & Company. 1965.

WARLI
Department of Mathematics Education, UNIROW Tuban, East Java
e-mail: [email protected]
Proceedings of ”The 6th SEAMS-UGM Conference 2011”
Statistics, pp. 671 – 678.

TWO-DIMENSIONAL WARRANTY POLICIES USING


COPULA

ADHITYA RONNIE EFFENDIE

Abstract. In a two-dimensional warranty model, we include two factors that
simultaneously affect the warranty cost, i.e. the usage and the age of the product. The warranty is
characterized by a region in a two-dimensional plane rather than an interval as in the one-
dimensional approach. Most results in two-dimensional warranty models come
from analytical or numerical solutions for a given two-variable joint distribution function.
This paper gives an alternative way by using the copula method, combining functions that are
already known. We apply this method to the Free Replacement Warranty (FRW) policy using
motorcycle warranty data from Indonesia.
Keywords and Phrases : Warranty Cost Model, Copula, Free Replacement Warranty Policy

1. INTRODUCTION

Many products are sold with a warranty policy under which the manufacturer agrees to repair
or provide a replacement for failed items free of charge up to a time W or up to a usage U,
whichever occurs first, from the time of the initial purchase. For example, a certain motor
company in Indonesia gives its customers a 3-year warranty or 100,000 km of usage,
whichever occurs first. This is an example of a two-dimensional warranty policy, because two
factors affect the warranty coverage: the age of the product and its usage. An approach to
modeling this type of warranty by a one-dimensional model was clearly explained in Blischke
and Murthy [2]; it was originally proposed by Moskowitz and Chun [3]. Another approach to
modeling two-dimensional warranty policies is the two-dimensional renewal process, which
will be investigated thoroughly in this paper.
To lessen the difficulties of finding the renewal function, which is the main topic in warranty
cost estimation, some estimators were proposed by Brown, Solomon and Stephens [1] via
Monte Carlo simulation. These estimators are unbiased and compete favorably with the naive
estimator N(t). In this paper we give an alternative method for finding the renewal function by
using one of those estimators and applying the copula method as a replacement for the joint

distribution function of the two random variables W, the time until the product fails, and U,
the usage of the product.

2. TWO-DIMENSIONAL RENEWAL PROCESS AND ITS APPLICATION IN


WARRANTY MODELING

A two-dimensional renewal process is a natural extension of a one-dimensional renewal
process. Here events occur on a two-dimensional plane as opposed to a line in the one-
dimensional case.

Definition A counting process defined on the positive quadrant of the two-dimensional
plane is an ordinary two-dimensional renewal process if the following hold:

______________________________
2010 Mathematics Subject Classification :

1.
2. is a sequence of independent and identically distributed
nonnegative bivariate random variables with a common joint distribution function
.

3. where and

To apply the two-dimensional renewal process in warranty modeling, we consider a two-
dimensional free-replacement policy defined by a rectangle in a two-dimensional plane, with
the horizontal axis representing time and the vertical axis representing usage. The item is
nonrepairable, and all failed items under warranty are replaced by new ones.
Let the failure ages and usages form a sequence of independent and identically distributed
nonnegative bivariate random variables with a common joint distribution, and let the number
of failures over the rectangle be given by a two-dimensional renewal process.
Define and , then it is

easily seen that and as a result, we have:

Since involves sums of n bivariate, independent and identically distributed


random variables with common joint density function , clearly is given by
the n-fold convolution of with itself, or in mathematical notation:

for .
Let the renewal function denote the expected number of renewals over the rectangle.
We have, from Blischke and Murthy [2],

(1)

The last expression is quite difficult to solve, because the renewal function appears both
inside and outside the integration, which leads to an integral equation. One way to solve this
equation is by using an estimator. Brown et al. [1] give an estimator of (1), but in the
one-dimensional case, by using a weighted average. After some modification in order to fit
the two-dimensional case, we have the following Brown et al. estimator for the
two-dimensional renewal function:

(2)

3. COPULA METHOD AND ITS APPLICATION IN TWO-DIMENSIONAL


WARRANTY
A copula can be used to describe the dependence between random variables. The
cumulative distribution function of a random vector can be written in terms of marginal
distribution functions and a copula. The marginal distribution functions describe the marginal
distribution of each component of the random vector, and the copula describes the dependence
structure between the components. Copulas are popular in statistical applications as they
allow one to easily model and estimate the distribution of random vectors by estimating the
marginals and the copula separately. There are many parametric copula families available,
which usually have parameters that control the strength of dependence.
Suppose that we can find a copula function $C$ such that
$H(x_1, \ldots, x_n) = C(F_1(x_1), \ldots, F_n(x_n))$, where $H$ is an $n$-dimensional
distribution function of a random vector with marginal distribution functions
$F_1, \ldots, F_n$. According to the famous Sklar's theorem, such a copula function can
always be found.
Theorem (Sklar) Let H be a two-dimensional distribution function with marginal
distribution functions F and G. Then there exists a copula C such that
$$H(x, y) = C(F(x), G(y)).$$
Conversely, for any univariate distribution functions F and G and any copula C, the function
H is a two-dimensional distribution function with marginals F and G. Furthermore, if F and
G are continuous, then C is unique.

Using this theorem we can propose the following corollary:

Corollary (Main result) We can obtain a new estimator from (2) using the copula method as
follows:

(3)

Proof: This result is easily found by substituting Sklar's theorem into equation (2).

4. APPLICATION TO WARRANTY MODEL

We observe 33 failure-time data points for the most frequently claimed component from a
Japanese motorcycle importer in Indonesia. The claims include failures of the cylinder, piston
and piston ring. We use these data to estimate the marginal distributions and the copula
parameters. Furthermore, we find the copula function which best fits the data and use it in
simulating the warranty cost estimation. Some statistics of the claim data are given below:

Table: Statistics of the claim data


Statistics Age (W) Usage (U)
Mean 0.7178182 0.8802121
Standard deviation 0.5325039 0.7511111
Variance 0.2835604 0.5641679
Linear correlation 0.5758115 0.5758115
Kendall’s correlation 0.539924 0.539924

4.1 Marginal distribution


Using the Kolmogorov-Smirnov test, we find that the Weibull and the exponential
distribution fit the age variable (W) and the usage variable (U), respectively. The parameters
of these distributions are estimated using the Maximum Likelihood Estimation (MLE)
method, which yields:
Age  and Usage  .
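A minimal R sketch of this fitting step (not the authors' code; `age` and `usage` are assumed to be numeric vectors holding the 33 observed claim ages and usages) could look as follows.

library(MASS)
fit.w <- fitdistr(age,   "weibull")       # MLE of the Weibull marginal for age (W)
fit.e <- fitdistr(usage, "exponential")   # MLE of the exponential marginal for usage (U)
ks.test(age,   "pweibull", shape = fit.w$estimate["shape"],
                           scale = fit.w$estimate["scale"])
ks.test(usage, "pexp", rate = fit.e$estimate["rate"])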

4.2 Finding the best Copula


We investigate 6 copula families of the non-negative Archimedean copula class. We
show the estimated parameter values for each family in the next table.

Table: Parameter estimation values


Family No. alpha Valid alpha range
2 4.34711 Yes [1,Inf)
4 2.17355 Yes [1,Inf)
12 1.44904 Yes [1,Inf)
14 1.67355 Yes [1,Inf)
15 2.67355 Yes [1,Inf)
18 2.89807 Yes [2,Inf)

To find the best copula among these 6 families, we use the Kolmogorov-Smirnov test; the
results are as follows:

Table: P-value of Goodness of fit test


family no. p-value
2 0
4 0.99229
12 0.99524
14 0.99785
15 0.79188
18 0.00077
676 ADHITYA RONNIE EFFENDIE

From this table we see that the goodness-of-fit test gives a good result for all families
except families no. 2 and 18 (p-value < 0.05). The bigger the p-value, the better the model fits
the data, and the best is family no. 14 with estimated parameter 1.67355.
Thus we have the following model:
where , and .
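The copula fit itself can be sketched in R with the `copula` package; the Gumbel family below is used purely as an illustrative Archimedean copula, since the exact functional form of family no. 14 is not reproduced here, and `age`/`usage` are the hypothetical claim vectors of Section 4.1.

library(copula)
u <- pobs(cbind(age, usage))                           # pseudo-observations on [0,1]^2
fit.c <- fitCopula(gumbelCopula(dim = 2), u, method = "mpl")
coef(fit.c)                                            # estimated dependence parameter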

4.3 Warranty cost estimation


We simulate the successive failures in age and usage (usage in units of 10,000 km) and then
determine the expected number of claims for a single motorcycle item sold. We can see from
the appendix tables that, according to the simulation we have made, the effective warranty
period lies on the grey cells, which have not yet reached one expected failure. Thus we can
estimate the warranty cost as expected. We present the results in the appendix tables.
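A minimal Monte Carlo sketch of this step, under stated assumptions: `rfail()` is a hypothetical function returning one simulated (age, usage) failure pair from the fitted joint model (e.g. drawn via the copula above), and W and U are the warranty limits in years and in units of 10,000 km.

expected.claims <- function(W, U, rfail, nsim = 5000) {
  counts <- numeric(nsim)
  for (b in 1:nsim) {
    t <- 0; m <- 0; k <- 0
    repeat {
      f <- rfail()                     # next failure: c(age increment, usage increment)
      t <- t + f[1]; m <- m + f[2]
      if (t > W || m > U) break        # failure falls outside the warranty rectangle
      k <- k + 1                       # one more free replacement under warranty
    }
    counts[b] <- k
  }
  mean(counts)                         # Monte Carlo estimate of the expected claim number
}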

Acknowledgement. We thank our students, Reny Apprilia and Lina Istiqomah, for their help
during this research.

References

[1] BROWN, M., SOLOMON, H., AND STEPHENS, M., Monte Carlo simulation of the renewal function, J.
Appl. Prob., 18, 426-434, 1981.
[2] BLISCHKE, W.R., MURTHY, D.N.P., Warranty Cost Analysis, Marcel Dekker, Inc., New York, 1994.
[3] MOSKOWITZ, H., CHUN, Y.H., A Bayesian Approach to the Two-attribute Warranty Policy, Paper No. 950,
Krannert Graduate School of Management, Purdue University, 1988.
[4] BLISCHKE, W.R., KARIM, M.R., AND MURTHY, D.N.P., Warranty Data Collection and Analysis, Springer, 2011.

Adhitya Ronnie Effendie


Department of Mathematics, Gadjah Mada University Yogyakarta Indonesia
e-mail: [email protected]

Table: Estimated expected number of claims for a single motorcycle item for various warranty periods
U\W 0.5 1 1.5 2 2.5 3 3.5 4 4.5 5 5.5 6
0.3 0.012 0.026 0.043 0.045 0.048 0.050 0.052 0.054 0.050 0.050 0.052 0.052
0.6 0.029 0.092 0.122 0.153 0.180 0.182 0.184 0.190 0.193 0.194 0.195 0.200
0.9 0.042 0.148 0.251 0.303 0.362 0.368 0.376 0.385 0.386 0.397 0.397 0.412
1.2 0.062 0.202 0.340 0.509 0.549 0.575 0.589 0.591 0.593 0.631 0.633 0.636
1.5 0.069 0.261 0.444 0.619 0.721 0.807 0.828 0.843 0.872 0.873 0.880 0.897
1.8 0.079 0.306 0.588 0.757 0.922 1.068 1.122 1.140 1.158 1.183 1.199 1.205
2.1 0.087 0.331 0.629 0.893 1.143 1.287 1.314 1.341 1.485 1.497 1.506 1.566
2.4 0.090 0.366 0.696 1.091 1.323 1.409 1.596 1.617 1.669 1.683 1.752 1.792
2.7 0.093 0.378 0.794 1.158 1.406 1.659 1.834 1.936 2.004 2.006 2.041 2.110
3 0.101 0.393 0.837 1.203 1.572 1.781 2.003 2.160 2.261 2.337 2.348 2.411

Table: Estimated warranty cost per motorcycle item for various warranty periods
U\W 0.5 1 1.5 2 2.5 3 3.5 4 4.5 5 5.5 6
0.3 39145 8395 13809 14495 15541 16065 16860 17452 16088 16132 16779 16895
0.6 9394 29584 39265 49500 58194 58795 59461 61254 62299 62529 63098 64452
0.9 13687 47671 81217 98027 117053 118723 121384 124241 124600 128089 128130 133195
1.2 19944 65252 109932 164441 177282 185782 190370 190881 191623 203660 204416 205271
1.5 22149 84395 143498 200015 232741 260661 267364 272194 281716 281947 284295 289772
1.8 25632 98960 189795 244637 297816 344924 362538 368285 373937 382209 387267 389096
2.1 27940 106782 203015 288598 369193 415638 424311 433116 479759 483449 486422 505958
2.4 29040 118366 224873 352337 427355 455216 515450 522137 538986 543542 565906 578791
2.7 30048 122083 256553 373927 454056 535940 592391 625408 647149 648005 659122 681571
3 32501 126780 270378 388686 507724 575171 647109 697709 730399 754854 758415 778646
Proceedings of ”The 6th SEAMS-UGM Conference 2011”
Statistics, pp. 679 – 688.

CONSISTENCY OF THE BOOTSTRAP ESTIMATOR FOR MEAN


UNDER KOLMOGOROV METRIC
AND ITS IMPLEMENTATION ON DELTA METHOD

BAMBANG SUPRIHATIN, SURYO GURITNO, SRI HARYATMI

Abstract. It is known that, by the Strong Law of Large Numbers, the sample mean $\bar X$ converges
almost surely to the population mean $\mu$. The Central Limit Theorem asserts that the distribution of
$\sqrt{n}(\bar X - \mu)$ converges to a normal distribution with mean 0 and variance $\sigma^2$ as $n \to \infty$. In the
bootstrap view, the key piece of bootstrap terminology says that the population is to the sample as the
sample is to the bootstrap samples. Therefore, when we want to investigate the consistency of the
bootstrap estimator for the sample mean, we investigate the distribution of $\sqrt{n}(\bar X^* - \bar X)$ in contrast to
$\sqrt{n}(\bar X - \mu)$, where $\bar X^*$ is the bootstrap version of $\bar X$ computed from a bootstrap sample $X^*$.
The asymptotic theory of the bootstrap sample mean is useful for studying the consistency of many other
statistics; for this reason some authors call $\sqrt{n}(\bar X - \mu)$ a pivotal statistic. Here are two out of several
ways of proving the consistency of the bootstrap estimator. First, consistency under the Mallows-
Wasserstein metric was studied by Bickel and Freedman [2]. The other consistency result uses the
Kolmogorov metric and is part of the paper of Singh [9]. In this paper we investigate the
consistency of the bootstrap estimator for the mean under the Kolmogorov metric comprehensively and
use this result to study the consistency of the bootstrap variance estimator via the delta method. The
accuracy of the bootstrap estimator using an Edgeworth expansion is discussed as well. Results of
simulations show that the bootstrap gives good estimates of the standard error, in agreement with the
theory. All results of the Monte Carlo simulations are presented so as to yield clear conclusions.

Keywords and phrases: Bootstrap, consistency, Kolmogorov metric, delta method, Edgeworth
expansion, Monte Carlo simulations

1. INTRODUCTION

Some questions usually arise in the study of the estimation of an unknown parameter $\theta$:
(1) what estimator $\hat\theta$ should be used or chosen? (2) having chosen a particular $\hat\theta$, is this
estimator consistent for the population parameter $\theta$? (3) how accurate is $\hat\theta$ as an estimator
of $\theta$? The bootstrap is a general methodology for answering the second and third questions.
Consistency theory is needed to ensure that the estimator is consistent for the actual parameter,
as desired.
Consider the parameter $\theta$ equal to the population mean. A consistent estimator for $\theta$ is the
sample mean $\hat\theta = \bar X = \frac{1}{n}\sum_{i=1}^{n} X_i$. The consistency theory is then extended to the
consistency of the bootstrap estimator for the mean. According to the bootstrap terminology, if we
want to investigate the consistency of the bootstrap estimator for the mean, we investigate the
distributions of $\sqrt{n}(\bar X - \mu)$ and $\sqrt{n}(\bar X^* - \bar X)$. We investigate the consistency of the
bootstrap under the Kolmogorov metric, which is defined as
$$\sup_x \left| P_F\left( \sqrt{n}(\bar X - \mu) \le x \right) - P_{F_n}\left( \sqrt{n}(\bar X^* - \bar X) \le x \right) \right|.$$

The consistency of the bootstrap estimator for the mean is then applied to study the
consistency of the bootstrap estimate for the variance using the delta method. We describe the
consistency of the bootstrap estimates for the mean and the variance. Section 2 reviews the consistency
of the bootstrap estimate for the mean under the Kolmogorov metric. Section 3 deals with the consistency
of the bootstrap estimate for the variance using the delta method. Section 4 discusses the results of Monte
Carlo simulations involving bootstrap standard errors and density estimation for the mean and the
variance. Section 5, the last section, briefly gives the concluding remarks of the paper.

2. CONSISTENCY OF BOOTSTRAP ESTIMATOR FOR MEAN

Let X 1, X 2 ,, X n  be a random sample of size n from a population with common


distribution F and let T X 1, X 2 ,, X n ; F  be the specified random variable or statistic of
interest, possibly depending upon the unknown distribution F. Let Fn denote the empirical
distribution function of X 1, X 2 ,, X n  , i.e., the distribution putting probability 1/n at each
of the points X 1, X 2 ,, X n . The bootstrap method is to approximate the distribution of
T X 1, X 2 ,, X n ; F  under F by that of 
T X 1* , X 2* ,, X n* ; Fn  under Fn whrere
X 1* , X 2* ,, X n*  denotes a bootstrapping random sample of size n from F . n
We start with definition of consistency. Let F and G be two distribution functions on
sample space X. Let  F, G  be a metric on the space of distribution on X. For
X 1, X 2 ,, X n i.i.d from F, and a given functional T X 1, X 2 ,, X n ; F  , let
H n ( x)  PF T X 1, X 2 ,, X n ; F   x ,
  
H Boot ( x)  P* T X 1* , X 2* ,, X n* ; Fn  x .
We say that the bootstrap is consistent (strongly) under  for T if
CONSISTENCY OF THE BOOTSTRAP ESTIMATOR FOR MEAN... 681

 H n , H Boot   0 a.s.
Let functional T is defined as T  X 1, X 2 ,, X n ; F   n X    where X and  are
sample mean and population mean respectively. Bootstrap version of T is
T 
X 1* , X 2* ,, X n* ; Fn   n X *

 X , where X *
is boostrapping sample mean. Bootstrap

method is a device for estimating PF  n X     x by PFn  n X *


 
 X  x . We will
investigate the consistency of bootstrap under Kolmogorov metric which is defined as
K F, G   sup F ( x)  G( x) = sup PF
x x
 n X     x P  n X Fn
*
 
X x .

We state some theorems and lemma which are needed to show that
K H n , H Boot   0 a.s.

Theorem 1 (KHINTCHINE-KOLMOGOROV CONVERGENCE THEOREM) Suppose
$X_1, X_2, \ldots$ are independent with mean 0 such that $\sum_{n} \mathrm{var}(X_n) < \infty$. Then $\sum_{n \ge 1} X_n$
converges a.s., i.e. $S_n = \sum_{i=1}^{n} X_i$ converges a.s. to $\sum_{n \ge 1} X_n$.

Kronecker Lemma Suppose a n  0 and a n   . Then  n


X n a n   implies


n
X j an  0 .
j 1


n
Proof. Set bn  X i ai and a 0  b0  0. Then, bn  b   and
i 1

X n  a n bn  bn1 . Write


1
an 
n
j 1
Xj 
1
an 
n
j 1

a j b j  b j 1 

 
1 
a j b j 1 
n n
=  a jb j 
an  j 1 j 1 

 
1 
a j b j 1 
n 1 n
= bn   a jb j 
an  j 1 j 1 

 
1 
a j b j 1 
n n
= bn   a j 1b j 1 
an  j 1 j 1 

= bn 
1
an 
n
j 1
 
b j 1 a j  a j 1  b  b  0. □

Theorem 2 (POLYA'S THEOREM) If $F_n \xrightarrow{d} F$, where $F$ is a continuous distribution
function, then $\sup_x |F_n(x) - F(x)| \to 0$ as $n \to \infty$.

Theorem 3 (BERRY-ESSEEN) Let $X_1, X_2, \ldots, X_n$ be i.i.d. with $E(X_1) = \mu$,
$\mathrm{Var}(X_1) = \sigma^2$, and $E|X_1 - \mu|^3 < \infty$. Then there exists a universal constant $C$, not
depending on $n$ or the distribution of the $X_i$, such that
$$\sup_x \left| P\left( \frac{\sqrt{n}(\bar X - \mu)}{\sigma} \le x \right) - \Phi(x) \right|
\le \frac{C\, E|X_1 - \mu|^3}{\sigma^3 \sqrt{n}}.$$

Theorem 4 (ZYGMUND-MARCINKIEWICZ SLLN) Suppose $X, X_1, X_2, \ldots$ are i.i.d. and
$E|X|^p < \infty$ for some $0 < p < 1$. Then $S_n / n^{1/p} \to 0$ a.s.
Proof. This is a consequence of the corollary following Theorem 1 and the Kronecker lemma, as
desired. $\Box$

Now we show the consistency of $H_{Boot}$ under the Kolmogorov metric, following Singh [9]
and DasGupta [4]. We can write
$$K(H_n, H_{Boot}) = \sup_x \left| P_F(T_n \le x) - P_*(T_n^* \le x) \right|
= \sup_x \left| P_F\left( \frac{T_n}{\sigma} \le \frac{x}{\sigma} \right)
- P_*\left( \frac{T_n^*}{s} \le \frac{x}{s} \right) \right|$$
$$\le \sup_x \left| P_F\left( \frac{T_n}{\sigma} \le \frac{x}{\sigma} \right) - \Phi\left( \frac{x}{\sigma} \right) \right|
+ \sup_x \left| \Phi\left( \frac{x}{\sigma} \right) - \Phi\left( \frac{x}{s} \right) \right|
+ \sup_x \left| \Phi\left( \frac{x}{s} \right) - P_*\left( \frac{T_n^*}{s} \le \frac{x}{s} \right) \right|
= A_n + B_n + C_n, \text{ say}.$$
By Polya's theorem, we conclude that $A_n \to 0$. Also, by the SLLN we obtain $s^2 \to \sigma^2$
a.s., and by the continuous mapping theorem $s \to \sigma$ a.s.; hence $B_n \to 0$ a.s. Finally, by the
Berry-Esseen theorem applied to the bootstrap sample,
$$C_n \le \frac{C\, E_{F_n} \left| X_1^* - \bar X_n \right|^3}{\sqrt{n}\, \left( \mathrm{var}_{F_n} X_1^* \right)^{3/2}}
= \frac{C \sum_{i=1}^{n} \left| X_i - \bar X_n \right|^3}{n^{3/2} s^3}
\le \frac{4C \left( \sum_{i=1}^{n} |X_i - \mu|^3 + n\, |\mu - \bar X_n|^3 \right)}{n^{3/2} s^3}$$
(using $|a + b|^3 \le 4(|a|^3 + |b|^3)$)
$$= \frac{4C}{s^3} \left[ \frac{1}{n^{3/2}} \sum_{i=1}^{n} |X_i - \mu|^3
+ \frac{|\mu - \bar X_n|^3}{\sqrt{n}} \right].$$
Since $\bar X \to \mu$ a.s., it is clear that $|\mu - \bar X_n|^3 / \sqrt{n} \to 0$ a.s. For the first term, let
$Y_i = |X_i - \mu|^3$ and take $p = 2/3$; since $E|Y_i|^p = E(X_i - \mu)^2 = \sigma^2 < \infty$, the
Zygmund-Marcinkiewicz SLLN yields
$$\frac{1}{n^{3/2}} \sum_{i=1}^{n} |X_i - \mu|^3 = \frac{1}{n^{1/p}} \sum_{i=1}^{n} Y_i \to 0
\quad \text{a.s. as } n \to \infty.$$
Thus $A_n + B_n + C_n \to 0$ a.s. and hence $K(H_n, H_{Boot}) \to 0$ a.s.
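As a quick numerical illustration of this convergence (not part of the original paper), the following minimal R sketch approximates the Kolmogorov distance between $H_n$ and $H_{Boot}$ by Monte Carlo for one observed sample, assuming $F$ is the exponential distribution with mean 1.

set.seed(1); n <- 50; B <- 2000
x.obs  <- rexp(n)                                            # the observed sample from F
T.star <- replicate(B, sqrt(n) * (mean(sample(x.obs, n, replace = TRUE)) - mean(x.obs)))
T.true <- replicate(B, sqrt(n) * (mean(rexp(n)) - 1))        # draws of sqrt(n)(Xbar - mu)
grid   <- seq(-4, 4, by = 0.05)
max(abs(ecdf(T.true)(grid) - ecdf(T.star)(grid)))            # approximate K(H_n, H_Boot)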

Since $\sqrt{n}(\bar X - \mu) \xrightarrow{d} N(0, \sigma^2)$ and $K(H_n, H_{Boot}) \to 0$ a.s., we can infer that
$\sqrt{n}(\bar X^* - \bar X) \xrightarrow{d} N(0, \sigma^{*2})$, where $\sigma^{*2}$ is the bootstrap version of $\sigma^2$. On the other
hand, according to the bootstrap terminology, we conclude that $\bar X^* \to \bar X$ almost surely since
$\bar X \to \mu$ a.s. Moreover, by Theorem 2.7 of van der Vaart [10] we obtain the crux result, i.e.
$\bar X^* \to \mu$. A question then arises about the use of the bootstrap: does the bootstrap have any
advantage when a Central Limit Theorem is already available? For our case, suppose
$T(X_1, X_2, \ldots, X_n; F) = \sqrt{n}(\bar X - \mu)$. Then $\sqrt{n}(\bar X - \mu) \xrightarrow{d} N(0, \sigma^2)$ and, under the
Kolmogorov metric, $K(H_n, H_{Boot}) \to 0$ almost surely. So we have two approximations to
$P_F(\sqrt{n}(\bar X - \mu) \le x)$, namely $\Phi(x/\sigma)$ and $P_{F_n}(\sqrt{n}(\bar X^* - \bar X) \le x)$. The bootstrap
approximation is theoretically more accurate than the approximation provided by the Central
Limit Theorem. This is caused by the fact that the normal distribution is symmetric, so the
Central Limit Theorem cannot capture information about the skewness of the finite-sample
distribution of $T(X_1, X_2, \ldots, X_n; F)$, whereas the bootstrap approximation does so. Thus the
bootstrap can be used to correct for skewness, as an Edgeworth expansion would do. Babu and
Singh [1] discussed the accuracy of the bootstrap using a one-term Edgeworth correction.
Hutson and Ernst [8] studied the exact bootstrap mean and variance of an L-estimator.
Since $T = \sqrt{n}(\bar X - \mu) \xrightarrow{d} N(0, \sigma^2)$, the Edgeworth expansion for $T$ is
$$H(x) = P(T \le x) = \Phi(x/\sigma) + n^{-1/2}\, p(x/\sigma)\, \phi(x/\sigma) + O_p(n^{-1}),$$
where $\Phi$ is the standard normal distribution function, $\phi$ is the standard normal density, and
$p$ is a polynomial with coefficients depending on the cumulants of $X - \mu$. In a
comprehensive study, Hall [7] showed that $p(x)$ denotes a function whose Fourier-Stieltjes
transform satisfies $\int e^{itx}\, dp(x) = r(it)\, e^{-t^2/2}$, where $r(it)$ can be derived from the Hermite
polynomials, which satisfy $H_n(x) = (-1)^n e^{x^2/2}\, \frac{d^n}{dx^n} e^{-x^2/2}$. The bootstrap estimate of $H$
admits an analogous expansion
$$\hat H(x) = P\left( T^* \le x \mid \mathbf{X} \right)
= \Phi(x/\hat\sigma) + n^{-1/2}\, \hat p(x/\hat\sigma)\, \phi(x/\hat\sigma) + O_p(n^{-1}),$$
where $\hat p$ is obtained from $p$ by replacing the unknowns by their bootstrap estimates.
According to Davison and Hinkley [5], the estimates of the coefficients of $\hat p$ are typically
distant $O_p(n^{-1/2})$ from their respective values in $p$, and so $\hat p = p + O_p(n^{-1/2})$. Hall [7] also
showed that $\hat\sigma = \sigma + O_p(n^{-1/2})$, whence
$\hat H(x) - H(x) = \Phi(x/\hat\sigma) - \Phi(x/\sigma) + O_p(n^{-1})$. Thus we can deduce that
$\Phi(x/\hat\sigma) - \Phi(x/\sigma)$ is generally of size $n^{-1/2}$, not $n^{-1}$. Hence
$P(T^* \le x \mid \mathbf{X}) - P(T \le x) = O_p(n^{-1/2})$. Consistency of the bootstrap sample mean is
useful for studying the consistency of many other statistics; see e.g. van der Vaart [10] and
Cheng and Huang [3].

3. CONSISTENCY OF THE BOOTSTRAP ESTIMATE FOR VARIANCE USING THE
DELTA METHOD

The delta method consists of using a Taylor expansion to approximate a random vector of
the form $\phi(T_n)$ by the polynomial $\phi(\theta) + \phi'(\theta)(T_n - \theta) + \cdots$ in $T_n - \theta$. This method is useful
to deduce the limit law of $\phi(T_n) - \phi(\theta)$ from that of $T_n - \theta$. The method is also valid in the
bootstrap view, as given in the following theorem.

Theorem 5 (DELTA METHOD FOR BOOTSTRAP) Let $\phi : \mathbb{R}^k \to \mathbb{R}^m$ be a measurable
map, defined and continuously differentiable in a neighborhood of $\theta$. Let $\hat\theta_n$ be random
vectors taking their values in the domain of $\phi$ that converge almost surely to $\theta$. If
$\sqrt{n}(\hat\theta_n - \theta) \xrightarrow{d} T$ and $\sqrt{n}(\hat\theta_n^* - \hat\theta_n) \xrightarrow{d} T$ conditionally almost surely, then both
$\sqrt{n}\left( \phi(\hat\theta_n) - \phi(\theta) \right) \xrightarrow{d} \phi'_{\theta}(T)$ and
$\sqrt{n}\left( \phi(\hat\theta_n^*) - \phi(\hat\theta_n) \right) \xrightarrow{d} \phi'_{\theta}(T)$ conditionally almost surely.

Let $\theta = \mu$ be the population mean; then $\hat\theta_n = \bar X$ is the sample mean. The SLLN
asserts that $\hat\theta_n \to \mu$ a.s. and $\sqrt{n}(\bar X - \mu) \xrightarrow{d} N(0, \sigma^2)$. The result of Section 2 shows
that $\sqrt{n}(\bar X^* - \bar X) \xrightarrow{d} N(0, s^2)$. Based on the consistency of the bootstrap for the sample
mean, we investigate the consistency of the bootstrap for the unbiased sample variance using
the delta method. Again, the SLLN asserts that the unbiased sample variance
$s^2 = \frac{1}{n-1} \sum_{i=1}^{n} (X_i - \bar X)^2$ converges almost surely to $\sigma^2$. Let
$s^{2*} = \frac{1}{n-1} \sum_{i=1}^{n} (X_i^* - \bar X^*)^2$ be the bootstrap estimate of the sample variance, the
counterpart of $s^2$. Set
$$s^{2*} = \frac{n}{n-1} \left[ \frac{\sum_{i=1}^{n} X_i^{*2}}{n}
- \left( \frac{\sum_{i=1}^{n} X_i^*}{n} \right)^{2} \right].$$
The question is whether $s^{2*}$ converges a.s. to $s^2$. We see that $s^2$ equals $\phi\!\left( \bar X, \overline{X^2} \right)$ and
$s^{2*}$ equals $\phi\!\left( \bar X^*, \overline{X^{*2}} \right)$ for the map $\phi(x, y) = \frac{n}{n-1}\left( y - x^2 \right)$. Thus, according to
Theorem 5, we conclude that $s^{2*}$ converges to $s^2$ conditionally almost surely. Furthermore,
$\sqrt{n}\left( s^{2*} - s^2 \right) \xrightarrow{d} T$, where $T$ has a normal distribution.

4. RESULTS OF MONTE CARLO SIMULATIONS

The simulation is conducted using S-Plus, and the sample is the set of statistics test marks
of 20 students, as follows: 80, 90, 75, 50, 85, 85, 45, 65, 50, 95, 70, 90, 35, 45,
50, 75, 70, 95, 60, 70. It is obvious that the sample mean is $\bar X = 69.0$ with standard error 18.4.
Efron and Tibshirani [6] suggested conducting simulations using at least $B = 50$ bootstrap
samples for standard errors and $B = 1000$ for confidence intervals in order to give good
approximations. Using the number of bootstrap samples $B = 2000$, the simulation gives
$\bar X^* = 69.12$ with an estimated standard error of 18.1, which is a good approximation.
Figure 1 depicts the density estimates for the distributions of $\sqrt{n}(\bar X^* - \bar X)$ and
$\sqrt{n}(s^{2*} - s^2)$, respectively. From the figure, we can infer that the distributions of both
statistics are approximately normal.

[Figure 1. Left panel: plot of the density estimate for $\sqrt{n}(\bar X^* - \bar X)$ (horizontal axis:
sqrt(n) * (mean.boot - mean.sample)). Right panel: plot of the density estimate for
$\sqrt{n}(s^{2*} - s^2)$ (horizontal axis: sqrt(n) * (var.boot - var.sample)).]
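A minimal R sketch of the Monte Carlo step (the original computations were done in S-Plus; this is not the authors' code) that produces the bootstrap replicates behind Figure 1 is given below.

marks <- c(80, 90, 75, 50, 85, 85, 45, 65, 50, 95, 70, 90, 35, 45,
           50, 75, 70, 95, 60, 70)
n <- length(marks); B <- 2000
stat <- replicate(B, {
  x.star <- sample(marks, n, replace = TRUE)          # one bootstrap resample
  c(mean = mean(x.star), var = var(x.star))
})
mean.boot <- stat["mean", ]; var.boot <- stat["var", ]
c(mean(mean.boot), sd(mean.boot))                     # bootstrap mean and its spread
plot(density(sqrt(n) * (mean.boot - mean(marks))))    # left panel of Figure 1
plot(density(sqrt(n) * (var.boot  - var(marks))))     # right panel of Figure 1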

5. CONCLUDING REMARKS

A number of points arise from the considerations of Sections 2, 3, and 4, amongst which
we note the following.
1. Since $\bar X \to \mu$ a.s. and $\bar X^* \to \bar X$ a.s., according to the bootstrap terminology we
conclude that $\bar X^*$ is a consistent estimator for $\mu$.
2. So far, by using the delta method we have shown that the unbiased bootstrap sample
variance satisfies $s^{2*} \to s^2$ a.s., and it is obvious that the same holds for the biased
version $\hat s^{2*} = \frac{1}{n} \sum_{i=1}^{n} (X_i^* - \bar X^*)^2$. Accordingly, both $s^{2*}$ and $\hat s^{2*}$ are consistent
estimators for $\sigma^2$.
3. The results of the Monte Carlo simulation show that the bootstrap estimators are good
approximations, as represented by their standard errors and the plots of the density
estimates.

References

[1] BABU, G. J. AND SINGH, K. On one term Edgeworth correction by Efron’s bootstrap, Sankhya, 46, 219-232,
1984.
[2] BICKEL, P. J. AND FREEDMAN, D. A. Some asymptotic theory for the bootstrap, Ann. Statist., 9, 1196-1217,
1981.
[3] CHENG, G. AND HUANG, J. Z. Bootstrap consistency for general semiparametric M-estimation, Ann. Statist.,
5, 2884-2915, 2010.

[4] DASGUPTA, A. Asymptotic Theory of Statistics and Probability, Springer, New York, 2008.
[5] DAVISON, A. C. AND HINKLEY, D. V. Bootstrap Methods and Their Application, Cambridge University
Press, Cambridge, 2006.
[6] EFRON, B. AND TIBSHIRANI, R. Bootstrap methods for standard errors, confidence intervals, and others
measures of statistical accuracy, Statistical Science, 1, 54-77, 1986.
[7] HALL, P. The Bootstrap and Edgeworth Expansion, Springer-Verlag, New York, 1992.
[8] HUTSON, A. D. AND ERNST, M. D. The exact bootstrap mean and variance of an L-estimator, J. R. Statist.
Soc, 62, 89-94, 2000.
[9] SINGH, K. On the asymptotic accuracy of Efron’s bootstrap, Ann. Statist., 9, 1187-1195, 1981.
[10] VAN DER VAART, A. W. Asymptotic Statistics, Cambridge University Press, Cambridge, 2000.

BAMBANG SUPRIHATIN
University of Sriwijaya
e-mail: [email protected]

SURYO GURITNO
Gadjah Mada University
e-mail: [email protected]

SRI HARYATMI
Gadjah Mada University
e-mail: [email protected]
Proceedings of ”The 6th SEAMS-UGM Conference 2011”
Statistics, pp. 689 – 696.

MULTIVARIATE TIME SERIES ANALYSIS USING


RCMDRPLUGIN.ECONOMETRICS AND ITS APPLICATION FOR
FINANCE

DEDI ROSADI

Abstract. In this paper, we discuss the application of our new R-GUI software, which we call
RcmdrPlugin.Econometrics (Rosadi [6],[7]), especially for multivariate time series analysis.
We show an empirical application of this software to finance modeling, especially for forecasting
the yield curve of Indonesian government bonds using a multivariate time series model, namely the
VAR (Vector Autoregressive) model. Further detailed applications of RcmdrPlugin.Econometrics
to various problems of econometrics and time series analysis in business, economics and finance
are discussed in Rosadi [8].

Keywords : R-GUI, multivariate time series analysis, yield-curve forecasting

1. INTRODUCTION

R (R Development Core Team [5]), an open source programming environment for data
analysis and graphics, has become the 'lingua franca' of data analysis and statistical
computing. The functionality of R is based on add-on packages/libraries (similar to
toolboxes in MATLAB). A default installation of R will automatically install and load several
basic packages. Beyond these packages, there are thousands of contributed packages, available
on CRAN and related websites. For time series analysis, there are various R packages, available
under the task views Econometrics, Finance and TimeSeries on CRAN, with the
main user interaction via a Command Line Interface (CLI). Unfortunately, e.g. for teaching
purposes, the R CLI seems to be less user friendly and relatively difficult to use, especially if we
compare it with commercial econometrics software with extensive GUI
capabilities, such as EViews. To solve this problem, Hodgess and Vobach [3] introduced
RcmdrPlugin.epack and Rosadi [6],[7] introduced RcmdrPlugin.Econometrics. In this paper,
we discuss the latest development of RcmdrPlugin.Econometrics, not yet reported in
Rosadi [6],[7], especially for the purpose of multivariate time series modeling.


The rest of this paper is organized as follows. In the second section, we briefly review
R-GUIs and R Commander and discuss the design philosophy of RcmdrPlugin.Econometrics. In
section three, we discuss VAR modeling and its computation using the R CLI and
RcmdrPlugin.Econometrics. In section four, we provide an empirical application of
RcmdrPlugin.Econometrics in finance, especially for forecasting the yield curve using a
VAR model. The final section concludes.

2. R-GUI AND RCMDRPLUGIN.ECONOMETRICS

The main user interaction with R is via the CLI (Command Line Interface). For the purpose
of improving the user-friendliness of R, some statisticians and programmers have
developed R-GUI versions; see http://www.sciviews.org for more information on R-GUIs.
One of the most popular R-GUI packages is R Commander (Rcmdr). It provides a point-and-
click GUI for doing basic statistical analysis, and it can be easily extended using
suitable plug-ins (Fox [1],[2]). Currently, on the CRAN server, there are several Rcmdr plug-ins.
For time series and econometrics analysis, Hodgess and Vobach [3] introduced
RcmdrPlugin.epack and Rosadi [6],[7] introduced RcmdrPlugin.Econometrics. Compared to
RcmdrPlugin.epack, it is shown in Rosadi [7] that RcmdrPlugin.Econometrics is easier
to use, has better input-output dialogs, a more comprehensive GUI layout (it is more compatible
with EViews and has a better input dialog for forecasting purposes than EViews) and better
menu coverage.
In this paper, we discuss the latest development of RcmdrPlugin.Econometrics, which
is not yet reported in Rosadi [6],[7], especially for the purpose of multivariate time
series modeling. For multivariate time series analysis, RcmdrPlugin.Econometrics can
currently be used for the Granger causality and cointegration tests, the Johansen test, VAR
(Vector Autoregressive), ECM and VECM modeling, dynamic linear (ADL) modeling and
linear panel modeling (see Figure 1). These types of analysis are not available in
RcmdrPlugin.epack.
In the following sections, we describe in detail the application of
RcmdrPlugin.Econometrics, especially for VAR modeling. We provide empirical examples
in finance showing unique features of RcmdrPlugin.Econometrics. Further detailed
applications of the software for econometrics and time series analysis are discussed in Rosadi
[8].

3. VAR MODELING

3.1. Introduction. The VAR(p) model with $k$ endogenous variables $y_t = (y_{1t}, \ldots, y_{kt})'$
can be written as
$$y_t = A_1 y_{t-1} + \cdots + A_p y_{t-p} + C D_t + u_t$$

where Ai , i  1, ,p is the coefficient matrix with dimension (k  k ) , u t is white


noise process with dimension k and the time invariant and positive definite matriks
Eut ut'  u . Matrix C is the coefficient matrix of m-independent random variables Dt with
dimension k m, where the matrix Dt contains all possible random variables, such as
constanta, trend components, dummy variables and/or seasonal dummy variables. The
parameters of the VAR model can be estimated using ordinary least square method, where the
optimal order can be found using information criteria, such as AIC (Akaike Information
Criteria), HQC (Hannan Quinn Criteria), SBC (Schwarz Bayesian Criteria). See e.g.,
Lütkepohl [4] for further discussion on VAR modeling.

3.2. VAR Analysis using the R CLI. For illustrative purposes, we provide an example of
the computation of a VAR model using the R CLI. We simulate the following VAR(2) model
$$\begin{pmatrix} y_{1} \\ y_{2} \end{pmatrix}_{t}
= \begin{pmatrix} 5 \\ 10 \end{pmatrix}
+ \begin{pmatrix} 0.5 & 0.2 \\ -0.2 & -0.5 \end{pmatrix}
\begin{pmatrix} y_{1} \\ y_{2} \end{pmatrix}_{t-1}
+ \begin{pmatrix} -0.3 & -0.7 \\ -0.1 & 0.3 \end{pmatrix}
\begin{pmatrix} y_{1} \\ y_{2} \end{pmatrix}_{t-2}
+ \begin{pmatrix} u_{1} \\ u_{2} \end{pmatrix}_{t}$$
All computation stages (simulation, identification, estimation, diagnostic checking and
forecasting) are illustrated in the following R script.

Figure 1. RcmdrPlugin.Econometrics menus for multivariate time series analysis



> ## Simulate VAR(2)-data


> library(dse1)
> library(vars)
> ## Setting the lag-polynomial A(L)
> Apoly <- array(c(1.0, -0.5, 0.3, 0,
0.2, 0.1, 0, -0.2,
0.7, 1, 0.5, -0.3) ,
c(3, 2, 2))
> B <- diag(2)
> TRD <- c(5, 10)
> ## Generating the VAR(2) model
> var2 <- ARMA(A = Apoly, B = B, TREND = TRD)
> ## Simulating 500 observations
> varsim <- simulate(var2, sampleT = 500, noise = list(w =
matrix(rnorm(1000),nrow = 500, ncol = 2)), rng =
list(seed = c(123456)))
> ## Obtaining the generated series
> vardat <- matrix(varsim$output, nrow = 500, ncol = 2)
> colnames(vardat) <- c("y1", "y2")
> ## Plotting the series
> plot.ts(vardat, main = "", xlab = "")
> ## Determining an appropriate lag-order
> infocrit <- VARselect(vardat, lag.max = 3,type = "const")
> infocrit # the selected order matches the simulated model, i.e. p = 2
> ## Estimating the model
> varsimest <- VAR(vardat, p = 2, type = "const", season =
NULL, exogen = NULL)
> ## diagnostic check, testing serial correlation
> ## Portmanteau-Test show no correlation until lag h=16
> var2c.serial <- serial.test(varsimest, lags.pt = 16,
type = "PT.asymptotic")
> var2c.serial
>## Forecasting until n=25 step a head
> predictions <- predict(varsimest, n.ahead = 25, ci = 0.95)
> ## Plot of predictions for y1
> plot(predictions, names = "y1")
> plot(predictions, names = "y2")

3.3. VAR Analysis using RcmdrPlugin.Econometrics. In the latest version of
RcmdrPlugin.Econometrics, we have developed an R-GUI version of VAR modeling.
Figure 1 shows that there is a menu for VAR modeling. From Figure 3, it can be seen that the
VAR modeling interface in RcmdrPlugin.Econometrics is more comprehensive than the
corresponding interface available in EViews version 4 (see Figure 2). In Figure 3, we see that
the RcmdrPlugin.Econometrics interface covers all modeling stages (identification,
estimation, diagnostic checking and forecasting) of VAR modeling, whereas the corresponding
menu in EViews only contains the interface for the estimation of the model.

4. EMPIRICAL APPLICATION FOR FINANCE

For an illustration of the empirical application of RcmdrPlugin.Econometrics for VAR
modeling, we show how to apply the package to forecasting the yield curve. The term
structure of interest rates, or yield curve, is the relation between the yield and the maturity of
default-free zero-coupon securities, and provides a measure of the returns that an investor
might expect for different investment periods in a fixed income market. There are various
models of the yield curve available in the literature; see e.g. Stander [11] for an overview. One
of the most important yield curve models is the Nelson-Siegel (NS) model. The NS yield
curve is given by
$$R(t, m) = \beta_0
+ \beta_1 \left[ \frac{1 - \exp(-m/\tau)}{m/\tau} \right]
+ \beta_2 \left[ \frac{1 - \exp(-m/\tau)}{m/\tau} - \exp(-m/\tau) \right].$$

Figure 2. Eviews Input dialogue for VAR modeling



Figure 3. RcmdrPlugin.Econometrics Input dialogue for VAR modeling

The rate of exponential decay is governed by the parameter $\tau$, and $\beta_0$, $\beta_1$, $\beta_2$ are
interpreted as the long-term (level) component, the short-term component, and the
medium-term component, respectively; $m$ denotes the time to maturity. The parameters $\beta_0$
and $\tau$ need to be positive and $\beta_0 + \beta_1 > 0$. The NS model is a nonlinear regression model
and requires numerical optimization methods for estimation (see e.g. Rosadi [9] for a further
detailed discussion of this aspect).
In this paper, for forecasting the NS yield curve, we apply a VAR model to the
endogenous variables $(\tau, \beta_0, \beta_1, \beta_2)$ of the NS model. For the empirical analysis, using a data
set of Indonesian government bonds, we estimate the NS curve using data from April 2 until
June 30, 2008. Using this set of $(\tau, \beta_0, \beta_1, \beta_2)$ data, we fit a VAR(p) model using
RcmdrPlugin.Econometrics. We find that the best model for the data according to AIC is a
VAR(2) model. Based on the forecasting results of this model, we find that the accuracy of the
VAR-based predictions is better for forecasting the yields of long-term bonds.
See Rosadi, Nugraha, and Dewi [10] for a further detailed discussion.
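A minimal R CLI sketch of this forecasting step (not the package's GUI code): `ns.factors` is a hypothetical T x 4 matrix holding the daily estimates of (tau, beta0, beta1, beta2) obtained from the bond data.

library(vars)
p.opt   <- VARselect(ns.factors, lag.max = 5, type = "const")$selection["AIC(n)"]
ns.var  <- VAR(ns.factors, p = p.opt, type = "const")   # the paper reports AIC selects p = 2
ns.pred <- predict(ns.var, n.ahead = 10)                # forecasts of the NS factors
# Plugging each forecast factor vector back into the NS formula gives the forecast yield curve:
ns.yield <- function(beta, m, tau) {
  h <- (1 - exp(-m / tau)) / (m / tau)
  beta[1] + beta[2] * h + beta[3] * (h - exp(-m / tau))
}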

5. CONCLUDING REMARKS

In this paper, we have discussed the latest development of RcmdrPlugin.Econometrics
for the purpose of multivariate time series modeling. We have provided an empirical
example of the package for finance modeling, showing unique features of the package.
Further detailed applications of RcmdrPlugin.Econometrics to various problems of
econometrics and time series analysis in business, economics and finance are discussed in
Rosadi [8].

References

[1] FOX, J., The R Commander: A Basic-Statistics Graphical User Interface to R, Journal of Statistical Software,
Vol. 14, Issue 9, 2005.
[2] FOX, J. , RcmdrPlugin.TeachingDemos. 2009 [Online] Available at www.cran.r-project.org
[3] HODGESS, E. AND VOBACH, C., RcmdrPlugin.epack: A Menu Driven Package for Time Series in R. Paper
presented at the annual meeting of the The Mathematical Association of America MathFest, TBA, Madison,
Wisconsin, Jul 28, 2008
[4] LÜTKEPOHL, H., New Introduction to Multiple Time Series Analysis, Springer, New York, 2006.
[5] R DEVELOPMENT CORE TEAM, R: A language and environment for statistical computing. R Foundation for
Statistical Computing, Vienna, Austria, 2011. ISBN 3-900051-00-3.
[6] ROSADI, D., Rplugin.Econometrics: R-GUI for teaching Time Series Analysis, Proceeding of CompStat 2010,
1st Edition., ISBN: 978-3-7908-2603-6, Springer Verlag, Paris, 2010
[7] ROSADI, D., Teaching Time Series analysis course using RcmdrPlugin.Econometrics, Proceeding USER 2010,
Gaithersburg, Washington DC, USA, 2010
[8] ROSADI, D., Analisa Ekonometrika dan Runtun Waktu dengan R : Aplikasi untuk bidang Ekonomi, Bisnis dan
Keuangan, Andi Offset, Yogyakarta (in Bahasa Indonesia), 2011
[9] ROSADI, D., Pemodelan Kurva Imbal Hasil dan Komputasinya dengan Paket Software
RcmdrPlugin.Econometrics, Prosiding Seminar Nasional Statistika, 21 Mei 2011, Program Studi Statistika,
Universitas Diponegoro (in Bahasa Indonesia)
[10] ROSADI, D., NUGRAHA, Y. A. AND DEWI, R.K., 2011, Forecasting the Indonesian Government Securities
Yield Curve using Neural Networks and Vector Autoregressive Model, Bulletin of the International Statistical
Institute 58th Session, 21-26 August 2011, Dublin, Ireland
[11] STANDER, S.Y., Yield Curve Modeling, Palgrave Macmillan, New York, 2005.

DEDI ROSADI
Dept. of Mathematics, Faculty of Math. and Natural Sciences, Gadjah Mada University
e-mail: [email protected]
Proceedings of ”The 6th SEAMS-UGM Conference 2011”
Statistics, pp. 697 – 704.

UNIFIED STRUCTURAL MODELS AND REDUCED-FORM


MODELS IN CREDIT RISK BY THE YIELD SPREADS

DI ASIH I MARUDDANI, DEDI ROSADI, GUNARDI, ABDURAKHMAN

Abstract. In recent years a lot of work has been done to bridge the gap between the two main approaches in credit risk modeling: structural models and reduced-form models. Many papers attempt this using special assumptions about the problem. This paper is a literature study of unified models. We use a method to unify these two models by removing the discrepancy in yield spreads between structural models and reduced-form models. We show the equivalence of the yield spreads of the structural model and the reduced-form model, and the unified model is obtained.

Keywords and Phrases : Credit Risk, Structural Models, Reduced Form Models, Yield Spreads

1. INTRODUCTION

In credit risk modeling, there are two broad classes of models: structural models and reduced-form models. Structural models use the capital structure to find the
default probability and the mean recovery rate (Merton [16]). Reduced-form models use the
market spread to find the default probability and the mean recovery rate (Jarrow & Turnbull
[8], Jarrow, Lando, & Turnbull [9]).
Structural models are based on the information set available to the firm’s
management, which includes continuous-time observations of both asset values and
liabilities. Reduced-form models are based on the information set available to the market,
typically including only partial observations of both the firm’s asset values and liabilities.
The main distinguishing characteristic of structural models with respect to reduced-
form models is the link the former provide between the probability of default and the firms’
fundamental financial variables: assets and liabilities. Reduced form models use market
prices of the firms’ defaultable instruments (such as bonds) to extract both their default
probabilities and their credit risk dependencies, relying on the market as the only source of
information regarding the firms’ credit risk structure. Although easier to calibrate, reduced
form models lack the link between credit risk and the information regarding the firms’
financial situation incorporated in their assets and liabilities structure.


Merton [16] first built a model based on the capital structure of the firm, which became the basis of the structural approach. In his approach, the company defaults at the bond maturity time T if its asset value falls below some fixed barrier at time T. Thus the default time is a discrete random variable which takes the value T if the company defaults and infinity if the company does not default. As a result, the equity of the firm becomes a contingent claim on the firm's asset value. Black and Cox [3] extended the definition of the default event and generalized Merton's method into the first-passage approach. In Black and Cox [3], the firm defaults when the historical low of the firm's asset value falls below some barrier D. Thus, the default event can take place before the maturity date T.
These models ignore the possibility of bankruptcy of the underlying firm, while in the real world firms have a positive probability of default in finite time (Mendoza [15]). Empirical studies of Indonesian structural models of credit risk have been carried out in Maruddani et al. [12] and Maruddani et al. [13].
The intensity-based approach, also known as the reduced-form model, as a counterpart of the structural model, was introduced by Artzner & Delbaen [2], Jarrow & Turnbull [8] and Duffie & Singleton [5]. In this approach, the default event is modelled as either a stopped Poisson process or a stopped Cox process with intensity h_t. The intensity h_t is then called the hazard rate in the reduced-form approach, since the product of h_t and an infinitesimal time period dt is the default probability of the firm in that infinitesimal time period dt, given that the firm has not defaulted before time t. It was shown in Lando [11] and Duffie & Singleton [5] that defaultable bonds can be priced as if they were default-free, using an interest rate that is the risk-free rate adjusted by the intensity. The results of the two models are usually different.
These credit models ignore the information in the stock option market, and there is no connection between the equity model and the credit model (Mendoza [15]). An empirical study of Indonesian reduced-form models of credit risk has been carried out in Maruddani et al. [14].
In recent years, some papers have tried to bridge the gap between these two models. Many papers attempt this using special assumptions about the problem (Elizalde [6], Chen & Panjer [4], Akat [1]). Elizalde [6] argues that the key element linking both approaches lies in the models' information assumptions. Using a specification of a structural model in which investors do not have complete information about the dynamics of the processes which trigger the firm's default, these reconciliation models derive a cumulative rate of default consistent with a reduced-form model. Akat [1] proposed a model where the credit default event is defined as the minimum of two default times, one from the structural default and the other from the exogenous intensity.
This paper is a literature study of the unified model. We study a method for unifying structural models and reduced-form models by showing the equivalence of the two yield spreads, from which the unified model is obtained (Chen & Panjer [4]).

2. YIELD SPREADS

Yields on corporate bonds exceed those on equivalent government bonds by an


amount known as the yield spread or credit spread. The factors which affect the spread are:
the probability of default, the loss if there is a default, the taxation of corporate bonds relative
to government bonds, and the systematic risk of corporate bonds.
In interest rate theory, we use PG(t, T) to denote the price of a default-free zero
coupon bond paying $1 at maturity date T. Defaultable bonds are added into our consideration

when credit risk is involved. Empirically, we assume that treasury bonds issued by the government are default-free, while corporate bonds issued by firms are defaultable. A defaultable bond is always mentioned together with its issuer. Thus we use P_C(t, T) to denote the price of a defaultable zero-coupon bond issued by a firm paying $1 at maturity date T, given there is no default. We will see in this section that credit risk theory shares many similarities with interest rate theory.
The term structure of the defaultable bond price P_C(t, T) is a function of T, for fixed t. We only consider the bond prices when T ≥ t. We assume that both bond prices vanish to 0 as T goes to infinity (Yi [17]).

Definition 1. The default-free yield Y_G(t, T) is defined by

Y_G(t,T) = -\frac{\ln P_G(t,T)}{T-t}.    (1)

Definition 2. The defaultable yield Y_C(t, T) of a firm is defined by

Y_C(t,T) = -\frac{\ln P_C(t,T)}{T-t}.    (2)

Definition 3. The yield spread R(t, T) for the defaultable bond P_C(t, T) is defined by the difference of the two yields mentioned above,

R(t,T) = Y_C(t,T) - Y_G(t,T).    (3)
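The three definitions translate directly into code; the following R lines are a small sketch with invented bond prices:

  # zero-coupon prices at time t for maturity Tm (face value 1)
  P_G <- 0.95; P_C <- 0.93
  t <- 0; Tm <- 1
  Y_G <- -log(P_G) / (Tm - t)   # default-free yield, Definition 1
  Y_C <- -log(P_C) / (Tm - t)   # defaultable yield, Definition 2
  R   <- Y_C - Y_G              # yield spread, Definition 3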

3. STRUCTURAL MODEL

The structural model was first pioneered by Merton [16], who used a diffusion process to model the evolution of the firm value. Consider a firm that issues equity and bonds of amount D maturing at time T. The total market value of the firm V_t at time t is assumed to follow a diffusion process as

(4)

where the coefficients are, respectively, the instantaneous variance, the instantaneous return, and the total payout including dividends or coupons, and the driving noise is a standard Brownian motion.

A default event can only occur at the bond's maturity, and it occurs only if the value of the firm V_T at the bond maturity T falls below the debt obligation D. The default-free interest rate is assumed to be a positive constant r; therefore the risk-free bond prices are given by

P_G(t,T) = e^{-r(T-t)}.    (5)

and the defaultable bond price at time t is

(6)

where the put value is given by the Black-Scholes formula for put options.

The yield spread in Merton's model at time t of a risky debt that matures at time T, as given by Merton, is

(7)

Where

(8)

Let the equity of the firm at time t, or the stock price, be denoted by S_t. The price of the equity is considered as the price of a European call option on the value of the firm. Black and Scholes [3] show that the price of the equity is

(9)

where

(10)

(11)

and \Phi[\cdot] is the cumulative distribution function of the standard normal distribution.

Assume that the company has survived t years and denote its random failure time by \tau. At the current time t, the probability that the firm will default at time T is

(12)

The recovery rate at time T is the proportion V_T/D if the firm defaults at time T. The formal definition of the recovery rate can be expressed as

(13)
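To make the structural quantities described in this section concrete, the following R sketch computes the textbook Merton-model outputs: the risk-neutral default probability P(V_T < D), the defaultable bond price written as the riskless bond minus a Black-Scholes put on the firm value, and the implied yield spread. It is a generic illustration with invented inputs, not the authors' calculation:

  merton <- function(V, D, r, sigma, tau) {
    d1 <- (log(V / D) + (r + 0.5 * sigma^2) * tau) / (sigma * sqrt(tau))
    d2 <- d1 - sigma * sqrt(tau)
    PG  <- exp(-r * tau)                                     # default-free bond, face 1
    put <- D * exp(-r * tau) * pnorm(-d2) - V * pnorm(-d1)   # Black-Scholes put on V
    PC  <- PG - put / D                                      # defaultable bond, face 1
    list(default_prob = pnorm(-d2),                          # risk-neutral P(V_T < D)
         PC = PC,
         spread = -log(PC) / tau - r)                        # yield spread R(t, T)
  }

  merton(V = 120, D = 100, r = 0.05, sigma = 0.25, tau = 1)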

4. REDUCED-FORM MODEL

The reduced-form approach or intensity-based approach goes back to Artzner & Delbaen [2], Jarrow & Turnbull [8], Lando [11], and Duffie & Singleton [5]. The basic idea is to model the default process as a stopped Poisson process (Yi [17]).
Reduced-form models use a hazard rate framework to model default. Based on the models of Jarrow & Turnbull [8] and Jarrow, Lando, & Turnbull [9], we assume that the financial market is frictionless with a finite time horizon. The price of a riskless bond can be written as

(14)

where r(s) is the spot rate at time s, and the price of a risky bond can be written as

(15)

If the default-free spot rates and the default process are independent, the price of a risky bond is

(16)

Lemma 1 (Chen and Panjer [4])

From Lemma 1, the yield spread is

(17)

where the recovery rate term represents the mean recovery rate if the maturity is time T.

5. UNIFIED MODEL

Up to now, there have been two main quantitative approaches to analyzing credit risk: the structural approach and the reduced-form approach. Both classes have their own advantages and disadvantages. For example, although reduced-form models leave a lot of room for calibration to historical data, they lack the financial ingredients behind the model parameters. On the other hand, structural models have a nice explanation in financial terms and are rather intuitive, but they have difficulty capturing short-term credit risk in particular and are much harder to apply when more than one name is involved. Hence, a framework that combines the two general classes of models in a way such that none of the above criticisms apply would be the ideal framework to model credit risk.

Another critical assumption of the structural model is that the evolution of the firm value follows a diffusion process. As a consequence, the yield spreads of corporate bonds, especially those with short maturities, are difficult to explain in this context (Jones et al. [10], Fons [7]).
In the reduced-form model, since the hazard rate of default is modeled as an exogenous process, it is unclear what economic mechanism lies behind the default process. According to Duffie & Singleton [5], the parameters of reduced-form models are unstable when the models are applied to fit observed yield spreads.
In this paper, we unify the structural model with the reduced-form model by showing that the yield spread of the Merton model is equivalent to the yield spread of a reduced-form model. After rearranging the yield spread of the Merton structural model, the price of a risky bond in the structural model can be rewritten as the price of a contingent claim under risk-neutral valuation, paying the full obligation if there is no default and paying the recovery rate if default happens at maturity, just like in a reduced-form model.
We start with a reduced-form model under risk-neutral valuation. The corresponding value-of-the-firm process defines the default probability and the mean recovery rate. We now show that the yield spread of the Merton model is equivalent to the yield spread of a reduced-form model. The yield spread of the Merton model is

(18)

which is the yield spread of a reduced-form model in equation (17).
In general, the structural model and reduced-form model can be unified by

(19)

Structural models usually impose assumptions on the value of the firm Vt while
reduced-form models usually impose assumptions on the components related to the equation

(20)

References

[1] AKAT, M., 2007, A Unified Credit Risk Model, Dissertation, Department of Mathematics, Stanford University.
[2] ARTZNER, P. AND DELBAEN, F., 1995, Default Risk Insurance and Incomplete Markets, Mathematical Finance,
5:3, 187-195.
[3] BLACK, F. AND SCHOLES, M., 1973, The Pricing of Options and Corporate Liabilities, Journal of Political
Economy, 81, 637-654.
[4] CHEN, C. AND PANJER, H., 2007, Unifying Discrete Structural Models and Reduced Form Models in Credit
Risk using a Jump Diffusion Process, Insurance: Mathematics and Economics, 33, 357-380.
[5] DUFFIE, D. AND SINGLETON, K., 1999, Modeling Term Structure of Defaultable Bonds, Review of Financial
Studies, 12, 687 – 720.
[6] ELIZALDE, A., 2005, Credit Risk Models III: Reconciliation Reduced-Structural Models, www.abelelizalde.com
[7] FONS, J.S., 1994, Using Default Rates to Model the Term Structures of Credit Risk, Financial Analysts Journal,
25-32.
[8] JARROW, R.A. AND TURNBULL, S.M., 1995, Pricing Derivatives on Financial Securities Subject to Credit Risk,
The Journal of Finance, 50, 53-85.
[9] JARROW, R.A., LANDO, D., AND TURNBULL, S.M., 1997, A Markov Model for the Term Structure of Credit
Risk Spreads, The Review of Financial Studies, 10, 481-523.
[10] JONES, E.P., MASON, S.P., AND ROSENFELD, E., 1984, Contingent Claims Analysis of Corporate Capital
Structures: An Emprical Investigation, Journal of Finance, 39, 611 – 627.
[11] LANDO, D., 1998, On Cox processes and credit risky securities. Review of Derivatives Research, 2, 99-120.
[12] MARUDDANI, D.A.I., 2011a, Pengukuran Risiko Kredit Obligasi dengan Model Merton, Jurnal Ekonomi
Manajemen dan Akuntansi, Fakultas Ekonomi Universitas Mercu Buana Yogyakarta, Vol. 1, No. 1, 123-141.
[13] MARUDDANI, D.A.I., ROSADI, D., GUNARDI, AND ABDURAKHMAN, 2011b, Credit Spreads Obligasi Korporasi
dengan Model Merton, Prosiding Seminar Nasional Statistika Universitas Diponegoro, ISBN: 978-979-097-
142-4.
[14] MARUDDANI, D.A.I., ROSADI, D., GUNARDI, AND ABDURAKHMAN, 2011c, Credit Spreads pada Reduced-Form
Model, Jurnal Media Statistika, Universitas Diponegoro, Vol. 4., No. 1, 57-63.
[15] MENDOZA, N, 2009, Unified Credit-Equity Modeling, Recent Advancements in the Theory and Practice of
Credit Derivatives, The University of Texas at Austin.
[16] MERTON, R., 1974, On the Pricing of Corporate Debt: The Risk Structure of Interest Rates, Journal of Finance,
29, 449–470.
[17] YI, C., 2005, Credit Risk from Theory to Application, Thesis, McMaster University.

DI ASIH I MARUDDANI
Department of Mathematics, Faculty of Mathematics and Natural Science,
Diponegoro University, Semarang, Indonesia
e-mail: [email protected]

DEDI ROSADI
Department of Mathematics, Faculty of Mathematics and Natural Science,
Gadjah Mada University, Yogyakarta, Indonesia
e-mail: [email protected]

GUNARDI
Department of Mathematics, Faculty of Mathematics and Natural Science,
Gadjah Mada University, Yogyakarta, Indonesia
e-mail: [email protected]

ABDURAKHMAN
Department of Mathematics, Faculty of Mathematics and Natural Science,
Gadjah Mada University, Yogyakarta, Indonesia
e-mail: [email protected]
Proceedings of ”The 6th SEAMS-UGM Conference 2011”
Statistics, pp. 705 - 714.

THE EFFECT OF CHANGING MEASURE IN


INTEREST RATE MODELS

DINA INDARTI, BEVINA D. HANDARI, AND IAS SRI WAHYUNI

Abstract. Parameter estimation is one of the important steps in interest rate modeling. The interest rate model is defined under the risk-neutral measure, while real-world data characterize the distribution of the interest rate model under the actual measure. In this case, a change of measure is needed and is carried out by using Girsanov's theorem. This paper investigates the effect of this change of measure on Vasicek's model and the Cox, Ingersoll and Ross model. The implementation shows that the simulations of the Vasicek and CIR models obtained via the change of measure have a smaller mean absolute error than the simulations of those models without the change of measure.
Keywords and Phrases: changing measure, Girsanov's theorem, interest rate model, Vasicek's model, Cox, Ingersoll and Ross' model.

1. INTRODUCTION

The dynamics of interest rates are an important factor to consider in pricing derivative products such as bonds, options, and equities. As the value of the interest rate is always changing over time, the interest rate is considered as a stochastic process. The dynamics of the interest rate can be represented by stochastic differential equations (SDEs). This paper discusses two interest rate models, the Vasicek and the Cox, Ingersoll and Ross (CIR) models. Based on Dominedo [2], the Vasicek and CIR models have unique solutions. Based on Indarti [5] and Wahyuni [10], both are stable models. To determine the models' parameters, real/historical data are needed. According to Brigo [1], data are collected in the real world, and their statistical properties characterize the distribution of the interest rate process under the actual measure, while the models are defined under the risk-neutral measure. An interest rate model under the risk-neutral measure guarantees that, if the model is used for asset pricing, it satisfies the martingale property so that there is no arbitrage. Since the models are implemented on real data, a transformation of the model from the risk-neutral measure to the actual measure is needed. For this, Girsanov's theorem will be used, and the parameter estimation process can proceed under the actual measure. To obtain the model's parameters under the risk-neutral measure, market prices of risk will be used. These market prices of risk relate the model's parameters under the actual measure to the parameters under the risk-neutral measure. Hence, the model's parameters can be obtained and simulations of the two models can be generated. By comparing the simulations, the effect of the change of measure will be investigated. The content of this paper is as follows. First, the change of measure in the Vasicek and CIR models is discussed, followed by parameter estimation of the Vasicek and CIR models. Next, the effect of the change of measure is discussed, followed by a concluding remark and acknowledgements.

2. CHANGING MEASURE IN VASICEK AND CIR MODELS USING


GIRSANOV THEOREM

2.1 Changing Measure in Vasicek Model. The Vasicek model is an interest rate model introduced by Oldrich Vasicek in 1977. The interest rate in the Vasicek model has the mean reversion property, i.e. the interest rate appears to be pulled back to some long-run average over time. The Vasicek model under the risk-neutral measure P is defined as

dr_t = \alpha(\beta - r_t)\,dt + \sigma\,dW_t,    (1)

with r_0 as an initial value, where r_0, \alpha, \beta, and \sigma are positive constants, r_t is the interest rate at time t, \beta is the long-run level (mean reversion level), \alpha is the speed of adjustment of the interest rate towards its long-run level \beta, \sigma^2 is the variance rate, and W_t is a Brownian motion on a probability space (\Omega, \mathcal{F}, P).
In order to obtain the analytic solution of (1), set r_t = X_t + \beta; then we get dX_t = -\alpha X_t\,dt + \sigma\,dW_t, which is an Ornstein-Uhlenbeck process. To solve this Ornstein-Uhlenbeck process, we set Y_t = e^{\alpha t} X_t. Then, by using the Ito-Doeblin formula, we get

Y_t = Y_0 + \int_0^t \left( \alpha e^{\alpha s} X_s - \alpha X_s e^{\alpha s} + \tfrac{1}{2}\sigma^2 \cdot 0 \right) ds + \sigma \int_0^t e^{\alpha s}\,dW_s = \sigma \int_0^t e^{\alpha s}\,dW_s,

so we obtain

X_t = e^{-\alpha t} X_0 + \sigma e^{-\alpha t} \int_0^t e^{\alpha s}\,dW_s.

Since r_t = X_t + \beta, the solution of (1) is

r_t = r_0 e^{-\alpha t} + \beta\left(1 - e^{-\alpha t}\right) + \sigma \int_0^t e^{-\alpha(t-s)}\,dW_s.

For 0 < u < t, the solution of (1) can be written as

r_t = r_u e^{-\alpha(t-u)} + \beta\left(1 - e^{-\alpha(t-u)}\right) + \sigma e^{-\alpha t} \int_u^t e^{\alpha s}\,dW_s.    (2)

The conditional mean and variance of r_t for 0 < u < t are, respectively,

E[r_t \mid \mathcal{F}_u] = r_u e^{-\alpha(t-u)} + \beta\left(1 - e^{-\alpha(t-u)}\right),
Var[r_t \mid \mathcal{F}_u] = \frac{\sigma^2}{2\alpha}\left(1 - e^{-2\alpha(t-u)}\right).

Thus, the distribution of r_t in the Vasicek model is given by

r_t \sim N\left( r_u e^{-\alpha(t-u)} + \beta\left(1 - e^{-\alpha(t-u)}\right),\ \frac{\sigma^2}{2\alpha}\left(1 - e^{-2\alpha(t-u)}\right) \right).    (3)

Since the Vasicek model is defined under the risk-neutral measure, while the historical data characterize the distribution of the interest rate under the actual measure, a transformation of the model is needed. The following theorem will be used to change the model under the risk-neutral measure to a model under the actual measure [9].

Theorem (Girsanov's theorem). Let W_t, 0 ≤ t ≤ T, be a Brownian motion on a probability space (\Omega, \mathcal{F}, P) and let \mathcal{F}_t, 0 ≤ t ≤ T, be a filtration for this Brownian motion. Let \theta_t, 0 ≤ t ≤ T, be an adapted process. Define

Z_t = \exp\left( -\int_0^t \theta_u\,dW_u - \frac{1}{2}\int_0^t \theta_u^2\,du \right),
\tilde{W}_t = W_t + \int_0^t \theta_u\,du,

and assume that

E\left[ \int_0^T \theta_u^2 Z_u^2\,du \right] < \infty.

Set Z = Z(T). Then E[Z] = 1 and, under the probability measure \tilde{P}, the process \tilde{W}_t, 0 ≤ t ≤ T, is a Brownian motion.

By using Girsanov's theorem and letting \theta_t = \lambda r_t, we have

Z_t = \exp\left( -\int_0^t \lambda r_u\,dW_u - \frac{1}{2}\int_0^t \lambda^2 r_u^2\,du \right),
\tilde{W}_t = W_t + \int_0^t \lambda r_u\,du = W_t + \lambda r_t\,t,

and then we have

W_t = \tilde{W}_t - \lambda r_t\,t.    (4)

By substituting equation (4) into the Vasicek model (1), we obtain the Vasicek model under the actual measure as

dr_t = \left[ \alpha\beta - (\alpha + \lambda\sigma) r_t \right] dt + \sigma\,d\tilde{W}_t,    (5)

which can also be written as

dr_t = (\alpha + \lambda\sigma)\left[ \frac{\alpha\beta}{\alpha + \lambda\sigma} - r_t \right] dt + \sigma\,d\tilde{W}_t.    (6)

By the Girsanov theorem, P and \tilde{P} are equivalent, which means both measures agree on what is possible and what is impossible, although they may disagree on how probable the possibilities are. Thus, changing from the risk-neutral to the actual probability measure changes the distribution of interest rates without changing the interest rates themselves [9].

2.2 Changing Measure in CIR Model. The Cox-Ingersoll-Ross (CIR) model is an interest rate model constructed by Cox, Ingersoll, and Ross in 1980 and published in 1985. Like the Vasicek model, the CIR model also has the mean reversion property. The CIR model eliminates the main drawback of the Vasicek model, namely a positive probability of getting a negative interest rate, so the interest rate in the CIR model is always positive. The CIR model under the risk-neutral measure is defined as

dr_t = \alpha(\beta - r_t)\,dt + \sigma\sqrt{r_t}\,dW_t, \qquad r(0) = r_0,    (7)

with r_0 as an initial value, where r_0, \alpha, \beta, \sigma are positive constants, r_t is the interest rate at time t, \beta is the mean reversion level, \alpha is the speed of adjustment of the interest rate towards its long-run level \beta, \sigma^2 is the variance rate, and W_t is a Brownian motion on a probability space (\Omega, \mathcal{F}, P).
The CIR model has no analytic solution. However, its mean and variance can be calculated analytically [8]. By using the Ito-Doeblin formula with Y_t = e^{\alpha t} X_t, for 0 < u < t one obtains

E[r_t \mid \mathcal{F}_u] = \beta + (r_u - \beta)\,e^{-\alpha(t-u)},    (8)

and

Var[r_t \mid \mathcal{F}_u] = \frac{r_u\,\sigma^2}{\alpha}\left( e^{-\alpha(t-u)} - e^{-2\alpha(t-u)} \right) + \frac{\beta\,\sigma^2}{2\alpha}\left( 1 - e^{-\alpha(t-u)} \right)^2.    (9)

In the CIR model, the interest rate admits a noncentral chi-square distribution [1].
By using Girsanov's theorem and letting \theta_t = \lambda\sqrt{r_t}, we have

Z_t = \exp\left( -\int_0^t \lambda\sqrt{r_u}\,dW_u - \frac{1}{2}\int_0^t \lambda^2 r_u\,du \right),
\tilde{W}_t = W_t + \int_0^t \lambda\sqrt{r_u}\,du = W_t + \lambda\sqrt{r_t}\,t,

so

W_t = \tilde{W}_t - \lambda\sqrt{r_t}\,t.    (10)

Then, by substituting (10) into the CIR model (7), we obtain the CIR model under the actual measure as

dr_t = \left[ \alpha\beta - (\alpha + \lambda\sigma) r_t \right] dt + \sigma\sqrt{r_t}\,d\tilde{W}_t.    (11)

3. PARAMETER ESTIMATION OF VASICEK AND CIR MODELS

3.1 Parameter Estimation of Vasicek Model. After obtaining the Vasicek model under the actual measure, the parameter estimation is performed by using the maximum likelihood method. To simplify the estimation process, the Vasicek model under the actual measure (6) can also be expressed as

dr_t = \alpha^*(\beta^* - r_t)\,dt + \sigma^*\,d\tilde{W}_t,    (12)

where

\alpha^* = \alpha + \lambda\sigma, \qquad \beta^* = \frac{\alpha\beta}{\alpha + \lambda\sigma}, \qquad \sigma^* = \sigma.    (13)

The market price of risk \lambda relates the model's parameters under the actual measure to the parameters under the risk-neutral measure, as shown in equations (13).
The analytic solution of the Vasicek model under the actual measure (12) can be obtained in a similar way as the analytic solution of the Vasicek model under the risk-neutral measure. It can be shown that the analytic solution of the Vasicek model under the actual measure has the form

r_t = r_u e^{-\alpha^*(t-u)} + \beta^*\left(1 - e^{-\alpha^*(t-u)}\right) + \sigma^* \int_u^t e^{-\alpha^*(t-s)}\,d\tilde{W}_s,

with distribution

r_t \sim N\left( r_u e^{-\alpha^*(t-u)} + \beta^*\left(1 - e^{-\alpha^*(t-u)}\right),\ \frac{\sigma^{*2}}{2\alpha^*}\left( 1 - e^{-2\alpha^*(t-u)} \right) \right).    (14)

Equations (3) and (14) show that the Vasicek model has different distributions under the two measures.
According to Brigo [1], for equally spaced observations with time step \delta, let

\pi = \exp(-\alpha^*\delta), \qquad V^2 = \frac{\sigma^{*2}}{2\alpha^*}\left( 1 - \exp(-2\alpha^*\delta) \right),    (15)

such that (14) can be written as

r_t \sim N\left( r_u\,\pi + \beta^*(1 - \pi),\ V^2 \right).    (16)

By using the maximum likelihood method, the formulas for the parameter estimates are given by

\hat{\pi} = \frac{n\sum_{i=1}^{n} r_i r_{i-1} - \sum_{i=1}^{n} r_i \sum_{i=1}^{n} r_{i-1}}{n\sum_{i=1}^{n} r_{i-1}^2 - \left(\sum_{i=1}^{n} r_{i-1}\right)^2}, \qquad \hat{\beta}^* = \frac{\sum_{i=1}^{n}\left( r_i - \hat{\pi} r_{i-1} \right)}{n(1-\hat{\pi})}, \qquad \hat{V}^2 = \frac{1}{n}\sum_{i=1}^{n}\left[ r_i - \hat{\pi} r_{i-1} - \hat{\beta}^*(1-\hat{\pi}) \right]^2.

Then, by substituting the above equations into equation (15), we get the estimated parameters of the Vasicek model under the actual measure as

\hat{\alpha}^* = -\frac{\ln\hat{\pi}}{\delta}, \qquad \hat{\beta}^* = \frac{\sum_{i=1}^{n}\left( r_i - \hat{\pi} r_{i-1} \right)}{n(1-\hat{\pi})}, \qquad \hat{\sigma}^* = \sqrt{\frac{2\hat{\alpha}^* \hat{V}^2}{1 - \exp(-2\hat{\alpha}^*\delta)}}.    (17)
Since the model under the actual measure in equation (12) has the same form as the model under the risk-neutral measure (1), the parameter estimates of the model under the risk-neutral measure (without changing measure) using the maximum likelihood method can also be stated as

\hat{\alpha} = -\frac{\ln\hat{\pi}}{\delta}, \qquad \hat{\beta} = \frac{\sum_{i=1}^{n}\left( r_i - \hat{\pi} r_{i-1} \right)}{n(1-\hat{\pi})}, \qquad \hat{\sigma} = \sqrt{\frac{2\hat{\alpha} \hat{V}^2}{1 - \exp(-2\hat{\alpha}\delta)}}.    (18)

From equation (13), we have

\alpha = \alpha^* - \lambda\sigma^*, \qquad \beta = \frac{\alpha^*\beta^*}{\alpha^* - \lambda\sigma^*}, \qquad \sigma = \sigma^*.    (19)

Finally, by substituting equation (17) into equation (19), the formulas for the parameter estimates of the Vasicek model under the risk-neutral measure can be stated as

\hat{\alpha} = \hat{\alpha}^* - \lambda\hat{\sigma}^*, \qquad \hat{\beta} = \frac{\hat{\alpha}^*\hat{\beta}^*}{\hat{\alpha}^* - \lambda\hat{\sigma}^*}, \qquad \hat{\sigma} = \hat{\sigma}^*.    (20)

According to Brigo [1] and Zeytun [11], \lambda is known as the market price of risk, which is assumed to be constant. In this paper, \lambda is chosen so that the parameters satisfy the stability condition of the Vasicek model [5].
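A compact R implementation of the closed-form estimators above (written under the assumption of equally spaced observations with time step delta; a sketch, not the authors' MATLAB code) is:

  vasicek_mle <- function(r, delta) {
    n <- length(r) - 1
    x <- r[1:n]           # r_{i-1}
    y <- r[2:(n + 1)]     # r_i
    pi_hat    <- (n * sum(x * y) - sum(x) * sum(y)) / (n * sum(x^2) - sum(x)^2)
    beta_hat  <- sum(y - pi_hat * x) / (n * (1 - pi_hat))
    V2_hat    <- mean((y - pi_hat * x - beta_hat * (1 - pi_hat))^2)
    alpha_hat <- -log(pi_hat) / delta
    sigma_hat <- sqrt(2 * alpha_hat * V2_hat / (1 - exp(-2 * alpha_hat * delta)))
    c(alpha = alpha_hat, beta = beta_hat, sigma = sigma_hat)
  }

  # these are the actual-measure estimates; the risk-neutral parameters then
  # follow from equation (20) for a chosen market price of risk lambda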

3.2 Parameter Estimation of CIR Model. The parameters of the CIR model under the actual measure are estimated using the Generalized Method of Moments (GMM) based on Matyas [7]. GMM is an estimation technique which suggests that the unknown parameters should be estimated by matching population (or theoretical) moments, which are functions of the unknown parameters, with the appropriate sample moments. Parameter estimation does not require knowledge of the data distribution.
The CIR model under the actual measure (11) can also be expressed as

dr_t = (c_1 + c_2 r_t)\,dt + \sigma^*\sqrt{r_t}\,d\tilde{W}_t,    (21)

where

c_1 = \alpha\beta, \qquad c_2 = -(\alpha + \lambda\sigma), \qquad \sigma^* = \sigma.    (22)

In equation (21), there are three parameters to be estimated, \hat{c}_1, \hat{c}_2, and \hat{\sigma}^{*2}, which can be obtained by using the GMM method [10].

The relation between the model's parameters under the actual measure (21) and the model's parameters under the risk-neutral measure (7) can be written as

c_1 = \alpha\beta, \qquad c_2 = -\alpha, \qquad \sigma^* = \sigma,    (23)

while the parameter estimates obtained by applying the GMM method to the model under the risk-neutral measure (without changing measure) can be stated as

\hat{\alpha} = -\hat{c}_2, \qquad \hat{\beta} = -\frac{\hat{c}_1}{\hat{c}_2}, \qquad \hat{\sigma} = \hat{\sigma}^*.    (24)

By equation (22), the formulas for the parameter estimates of the model under the risk-neutral measure (with changing measure) using the GMM method can be stated as

\hat{\alpha} = -(\hat{c}_2 + \lambda\hat{\sigma}^*), \qquad \hat{\beta} = -\frac{\hat{c}_1}{\hat{c}_2 + \lambda\hat{\sigma}^*}, \qquad \hat{\sigma} = \hat{\sigma}^*.    (25)
As before, according to Brigo [1] and Zeytun [11], \lambda in the CIR model is a market price of risk, which is chosen as a constant. Wahyuni [10] calculates the mean absolute error for several values of \lambda, where the mean absolute error is the mean absolute value of the difference between the historical data and the approximate solution of the CIR model. Here we choose the \lambda with the smallest mean absolute error.
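As a rough command-line counterpart of this estimation step, the drift and diffusion parameters of (21) can be sketched in R by simple conditional moment matching on the Euler discretization (a simplified substitute for, not a reproduction of, the full GMM of Matyas); all names are ours:

  cir_moments <- function(r, delta) {
    n  <- length(r) - 1
    x  <- r[1:n]
    dr <- diff(r)
    fit <- lm(dr ~ x)                               # drift: E[dr | r] ~ (c1 + c2 r) delta
    c1 <- unname(coef(fit)[1]) / delta
    c2 <- unname(coef(fit)[2]) / delta
    sigma2 <- mean(residuals(fit)^2 / (x * delta))  # Var[dr | r] ~ sigma^2 r delta
    c(c1 = c1, c2 = c2, sigma = sqrt(sigma2))
  }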

4. DISCUSSION

The effect of the change of measure in the interest rate models is shown through simulations using MATLAB software. In the case of the Vasicek model, the simulations are performed by using the Monte Carlo method [3], whereas the simulations of the CIR model are performed by using the Milstein method, since the CIR model does not have an analytic solution. The simulations use daily interest rate data of the zero-coupon bond with five-year maturity, which can be downloaded from www.bankofengland.co.uk.
There are two simulations for each model: one using the model's parameters with changing measure and one without changing measure. By assuming that the daily interest rate data and the interest rate model have the same measure (both satisfy the risk-neutral measure), the model's parameters without changing measure can be obtained directly from the corresponding risk-neutral model. The model's parameters with changing measure are obtained by first changing the measure of the model to the actual measure, then estimating the model's parameters under the actual measure using the daily interest rate data, and finally substituting them back into the equations that relate the parameters under the risk-neutral and actual measures in order to get the model's parameters under the risk-neutral measure.
Based on the data, by equation (18) the parameter estimates of the Vasicek model without changing measure are \hat{\alpha} = 0.2439, \hat{\beta} = 0.0479, \hat{\sigma} = 0.002743, and by equation (20) the parameter estimates with changing measure are \hat{\alpha} = 0.238463, \hat{\beta} = 0.048988, \hat{\sigma} = 0.002743.

The Monte Carlo iteration scheme for r_t on 0 = t_0 < t_1 < ... < t_n can be expressed as

r(t_{i+1}) = e^{-\alpha(t_{i+1}-t_i)} r(t_i) + \beta\left( 1 - e^{-\alpha(t_{i+1}-t_i)} \right) + \sigma\sqrt{ \frac{1 - e^{-2\alpha(t_{i+1}-t_i)}}{2\alpha} }\, Z_{i+1},

where Z_1, ..., Z_n are independent random variables with the N(0,1) distribution. This scheme is used to generate the simulation of the Vasicek model.
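The iteration above takes only a few lines of R (a sketch of the same exact-transition scheme, with our own function and variable names):

  simulate_vasicek <- function(r0, alpha, beta, sigma, Tend = 1, n = 250) {
    dt <- Tend / n
    r  <- numeric(n + 1); r[1] <- r0
    e  <- exp(-alpha * dt)
    s  <- sigma * sqrt((1 - exp(-2 * alpha * dt)) / (2 * alpha))
    for (i in 1:n) r[i + 1] <- e * r[i] + beta * (1 - e) + s * rnorm(1)
    r
  }

  # e.g. with the changed-measure estimates reported above:
  # path <- simulate_vasicek(0.05, alpha = 0.238463, beta = 0.048988, sigma = 0.002743)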
The simulation of the Vasicek model is shown in Figure 1.

Figure 1. The Comparison of Interest Rate Data with Simulations of Vasicek Model with and
without Changing Measure

Figure 1 shows that the data and its simulations differ. On the time interval t ≤ 0.2 the simulations are quite close to the data, but on longer time intervals the simulations and the data differ considerably. In other words, the Vasicek model is only reasonably good for analyzing interest rates over short time intervals. This agrees with [5], which states that the Vasicek model is a short rate model.
The simulations in Figure 1 show that the mean absolute error of the simulation without changing measure is 0.003389 and the mean absolute error of the simulation with changing measure is 0.003319.
Using the same data, by equation (24) the parameter estimates of the CIR model without changing measure are \hat{\alpha} = 1.55770, \hat{\beta} = 0.04095, and \hat{\sigma} = 0.03292, and by equation (25) the parameter estimates with changing measure are \hat{\alpha} = 1.49187, \hat{\beta} = 0.04276, \hat{\sigma} = 0.03292.
The Milstein scheme for the CIR model on the interval [0, T] is

r_j = r_{j-1} + \alpha(\beta - r_{j-1})\,\Delta t + \sigma\sqrt{r_{j-1}}\,(W_j - W_{j-1}) + \sigma\sqrt{r_{j-1}}\,\frac{\sigma}{2\cdot 2\sqrt{r_{j-1}}}\left[ (W_j - W_{j-1})^2 - \Delta t \right]
    = r_{j-1} + \alpha(\beta - r_{j-1})\,\Delta t + \sigma\sqrt{r_{j-1}}\,(W_j - W_{j-1}) + \frac{\sigma^2}{4}\left[ (W_j - W_{j-1})^2 - \Delta t \right],

where j = 1, 2, ..., N, and \Delta t = T/N.
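In code, the Milstein step reads as below (an R sketch with our own names; the paper's simulations were carried out in MATLAB):

  simulate_cir_milstein <- function(r0, alpha, beta, sigma, Tend = 1, n = 250) {
    dt <- Tend / n
    r  <- numeric(n + 1); r[1] <- r0
    for (j in 1:n) {
      dW <- sqrt(dt) * rnorm(1)                 # Brownian increment W_j - W_{j-1}
      r[j + 1] <- r[j] + alpha * (beta - r[j]) * dt +
        sigma * sqrt(max(r[j], 0)) * dW +
        0.25 * sigma^2 * (dW^2 - dt)            # Milstein correction term
    }
    r
  }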

The simulation of the CIR model is shown in Figure 2.


Figure 2. The Comparison of Interest Rate Data with Simulations of CIR Model with and
without Changing Measure

The simulations of the CIR model show that the mean absolute error of the simulation without changing measure is 0.0028, while the mean absolute error of the simulation with changing measure is 0.0016.
Both simulations thus show that, by applying the change of measure in both models, the mean absolute error of the simulation with changing measure is less than the mean absolute error of the simulation without changing measure.

5. CONCLUDING REMARK

The mean absolute error of the simulations of the Vasicek and CIR models using parameter estimates with changing measure is less than the mean absolute error of the simulations of the Vasicek and CIR models using parameter estimates without changing measure. Therefore, the change of measure is one factor to consider in interest rate modeling.

Acknowledgements: The authors thank the Directorate of Research and Community Service (DRPM) UI for supporting this research through the RUUI-Utama 2010 grant.

References

[1] BRIGO, DAMIANO AND FABIO MERCURIO, Interest Rate Models Theory and Practice, Springer Finance, New
York, 2001.
[2] DOMINEDO. Affine jump-diffusion interest rate models and application to insurance, thesis, Universitas Roma
Tor Vergata. 2009.
[3] GLASSERMAN, PAUL, Monte Carlo Methods in Financial Engineering : Applications of Mathematics, Springer,
New York, 2004.
[4] HIGHAM, DESMOND J, An Algorithmic Introduction to Numerical Simulation Differential Equations, SIAM
Review Vol.43, No.3, 525-546, 2001.
[5] HULL, J.C, Options, futures, and other derivatives (5th ed), Prentice-Hall, New Jersey, 2003.
[6] INDARTI, DINA. Analisis Stabilitas dan Implementasi Model Vasicek, thesis, Universitas Indonesia. 2011.
[7] KLOEDEN, P.E. AND PLATEN, E., Numerical Solution of Stochastic Differential Equations, Applications of
Mathematics, Vol. 23, Springer-Verlag, Berlin Heidelberg, 1991.
[8] MATYAS, LASZLO. Generalized Method of Moments Estimation, Cambridge
University Press, 1999.
[9] MISHRA, RAJA KUMAR, Study of Positivity Preserving Numerical Methods for Cox-Ingersoll-Ross Interest Rate
Model, project report, Indian Institute of Science, Bangalore, 2010.
[10] SHREVE, STEVEN E, Stochastic Calculus for Finance II Continuous-Time Models, Springer Finance, New
York, 2004.
[11] WAHYUNI, IAS SRI. Kajian Stabilitas dan Implementasi Model Cox, Ingersoll, and Ross, thesis, Universitas
Indonesia. 2011.
[12] ZEYTUN, S. AND GUPTA, A., A Comparative Study of the Vasicek and the CIR Model of the Short Rate, Institut
für Techno- und Wirtschaftsmathematik (ITWM), Fraunhofer, 2007.

DINA INDARTI
University of Indonesia.
e-mail: [email protected]

BEVINA D. HANDARI
University of Indonesia.
e-mail: [email protected]

IAS SRI WAHYUNI


University of Indonesia.
e-mail: [email protected]
Proceedings of ”The 6th SEAMS-UGM Conference 2011”
Statistics, pp. 715 - 728.

NEW WEIGHTED HIGH ORDER FUZZY TIME


SERIES FOR INFLATION PREDICTION

DWI AYU LUSIA AND SUHARTONO

Abstract. Weighted fuzzy time series, developed based on concepts from fuzzy theory, is a relatively new method for time series forecasting. Up to now, a single order of weighted fuzzy time series, either non-seasonal or seasonal, has mostly been used for time series forecasting. In practice, many time series forecasting problems also deal with more than a single order, known as a high order model. This paper focuses on the development of a new weighted fuzzy time series method for the high order model. New rules to find the forecast in the high order weighted fuzzy time series model are also proposed. Three datasets on Indonesia's inflation are used as case studies. The root mean squared error on the testing datasets is used for evaluating forecast accuracy. The results are compared to three other weighted fuzzy time series methods (i.e. Chen's, Yu's, and Cheng's methods) and two classical statistical methods, namely ARIMA and exponential smoothing models. The results show that the proposed weighted high order fuzzy time series yields more accurate forecasts than the other methods on two datasets, whereas ARIMA yields the best forecast on one dataset.

Keywords and Phrases: fuzzy time series, high order, inflation, new weight.

1. INTRODUCTION

Fuzzy time series is a concept which can be used to deal with forecasting problems in
which historical data are linguistic values. Fuzzy time series was firstly introduced by Song
and Chissom [10, 11]. Song and Chissom [11] stated that fuzzy time series divided into two
types, namely time variant and time-invariant. If the relations are all same between time t and
its prior time t – k (where k = 1,2, ... , m), it is a time-invariant fuzzy time series; likewise, if
the relations are not the same, then it is time variant. This paper discusses about the time
invariant fuzzy time series. First order time invariant fuzzy time series was studied by Chen
[1]. The fuzzy time series model proposed by Chen were ignoring of recurrence and not
properly handle weighted to various fuzzy relationships. The problem was solved by Yu [12],
Cheng et al. [4], and Lee and Suhartono [7]. Futhermore, Chen [2] also has studied about high
order time invariant fuzzy time series.
2010 Mathematics Subject Classification: 62A86 (Fuzzy analysis in statistics)


In this paper, the first order method introduced by Lee and Suhartono [7] is developed into a high order model. First, new rules to find the forecast in the high order weighted fuzzy time series model are proposed. Then, an empirical study is carried out using data on Indonesia's inflation as a case study. The root mean squared error is used for evaluating the forecast accuracy, particularly on the testing datasets. The results are compared to three other weighted fuzzy time series methods (i.e. Chen's, Yu's, and Cheng's methods) and two classical statistical methods, namely ARIMA and exponential smoothing models. The results show that the proposed weighted high order fuzzy time series yields more accurate forecasts than the other methods.

1.1 Fuzzy Time Series. Generally, Song and Chissom [10, 11] described the concepts of fuzzy time series as follows. Let U be the universe of discourse, where U = and = [begin, end]. A fuzzy set of U is defined as , where is the membership function of the fuzzy set , is a generic element of the fuzzy set , and is the degree of belongingness of to , where and .

Definition 1. Fuzzy time series. Let ( ), a subset of real numbers R, be


the universe of discourse by which fuzzy sets are defined. If is a collection of
then is called a fuzzy time series defined on .

Definition 2. If there exist a fuzzy relationship , such that


, where is an arithmetic operator, then is said to be
caused by . The relationship between and can be denoted by
.

Definition 3. Suppose is only calculated by , and


. For any t, if is dependent of then is called a
time-invariant fuzzy time series. Otherwise, is time-variant.

Definition 4. Suppose and , a Fuzzy Logical Relationship (FLR)


can be defined as , where and are called Left Hand Side (LHS) and Right-Hand
Side (RHS) of FLR.

First order seasonal fuzzy time series model defined by Song [11] is given as follows:
Definition 5. Let is a fuzzy time series which there exist seasonality with m period, then
FLR is represented by .

1.2 First Order Weighted Fuzzy Time Series. Based on Song and Chissom [10, 11], Chen [1] improved the step of establishing the fuzzy relationships, using simple operations instead of complex matrix operations. The algorithm of Chen's method is as follows:

Chen’s Algorithm

1. Define the universe of discourse (U = [starting, ending]) and intervals for rule
abstraction. As the length of interval is determined U can be partitioned into several
equally length intervals.
2. Define fuzzy sets based on the universe of discourse and fuzzify the historical data.
3. Fuzzify observed rules.
4. Establish FLR and group them based on the current states of the data of the FLR. For
example FLR are , , , , , then the Fuzzy
Logical Relationship Group (FLRG) of FLR is
5. Forecast . Let , then the forecast is as follows:
i. If the FLR of is empty , then ,
ii. If there is only one FLR (for example ), then ,
iii. If then .
6. Defuzzify. For example , then , where is the
defuzzified value and is a midpoint of .
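A bare-bones R sketch of Chen's first-order procedure above (equal-width partition, midpoint defuzzification) is given below; it is our own simplified illustration, not the authors' implementation:

  chen_fts <- function(y, k = 7) {
    breaks <- seq(min(y), max(y), length.out = k + 1)
    mids   <- (breaks[-1] + breaks[-(k + 1)]) / 2
    # steps 1-3: fuzzify each observation to the interval (fuzzy set) it falls in
    state  <- cut(y, breaks, labels = FALSE, include.lowest = TRUE)
    n <- length(y)
    # step 4: fuzzy logical relationship groups A_i -> {A_j}
    flrg <- lapply(1:k, function(i) unique(state[which(state[-n] == i) + 1]))
    # steps 5-6: forecast = mean of midpoints of the RHS states, or own midpoint if empty
    f <- sapply(state, function(i) {
      rhs <- flrg[[i]]
      if (length(rhs) == 0) mids[i] else mean(mids[rhs])
    })
    c(NA, f[-n])   # one-step-ahead forecasts aligned with y
  }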

The fuzzy time series model proposed by Chen ignores recurrence and does not properly assign weights to the various fuzzy relationships. This problem was solved by Yu [12], Cheng et al. [4], and Lee and Suhartono [7]. The difference between Chen's algorithm and the algorithms of Yu, Cheng, and Lee appears after the third step. The models proposed by Yu, Cheng, and Lee and Suhartono are called Yu's Algorithm, Cheng's Algorithm, and Lee's Algorithm, respectively. These three algorithms are as follows:

Yu’s Algorithm
1. Define the universe of discourse (U = [starting, ending]) and intervals for rule
abstraction. As the length of interval is determined U can be partitioned into several
equally length intervals.
2. Define fuzzy sets based on the universe of discourse and fuzzify the historical data.
3. Fuzzify observed rules.
4. Establish FLR and FLRG. For example the FLR are , , ,
, , then the FLRG is .
5. Forecast. Use the same rule as Chen’s algorithm.
6. Defuzzy. Suppose , then the defuzzified matrix is equal to a
matrix of the midpoints of is represented by ,
where represents the defuzzified forecast of .
7. Assigning weights. Suppose the weights of are
which it specified as , where and for .
Then the weight matrix can be written as
.
8. Calculating the final forecast value. In the weighted model, the final forecast is

Cheng’s Algorithm

1. Define the universe of discourse (U = [starting, ending]) and intervals for rule
abstraction. As the length of interval is determined U can be partitioned into several
equally length intervals.
2. Define fuzzy sets based on the universe of discourse and fuzzify the historical data.
3. Fuzzify observed rules.
4. Establish FLR, FLRG, and calculated the weighted. For example if the FLR are
, , , , , then the FLRG is
which it have weighted (the first RHS of ),
(the first RHS of ), (the second RHS of ), (the first RHS of ),
(the third RHS of ). So the weight matrix can be written as
.
5. Calculate the standardized matrix ( ). The standardized matrix is calculated using

6. Calculate the forecast value following , where

is the defuzzified matrix and is the weight matrix. For example, if is ,
, , , then the forecast value is

7. Employ the adaptive forecasting equation is defined as


, where is the actual data on
and is a weighted parameter.

Lee’s Algorithm
1. Define the universe of discourse (U = [starting, ending]) and intervals for rule
abstraction. As the length of interval is determined U can be partitioned into several
equally length intervals.
2. Define fuzzy sets based on the universe of discourse and fuzzify the historical data.
3. Fuzzify observed rules.
4. Establish FLR and FLRG. For example if the FLR are , , ,
, , then the FLRG is
5. Forecast. Use the same forecast rule as Chen’s algorithm.
6. Defuzzify. If , then defuzzified matrix is equal to a matrix of the
midpoint of can be written as , where is
represented the defuzzified forecast of .
7. Assigning weight. The weight of are which it
specified as , where and for and . Then
the weight matrix can be written as
.
8. Calculate final forecast using

1.3 High Order Weighted Fuzzy Time Series. High order weighted fuzzy time series, as defined by Chen [2], can be explained as follows:

Definition 6. Suppose is the fuzzy time series. If is calculated by

, then the FLR can be represented as follows:

(1)
The calculations for Chen, Yu and Cheng are all the same as in the first order weighted fuzzy time series, but there are several rules in the defuzzification step. The rule for each method is given as follows:

Chen’s rule:
1. If only have one value in RHS. For example
and in FLR have one value of RHS
called then defuzzified value are
2. If have more than one value in RHS. For example
then the defuzzified value are
3. If FLR of is empty set. For example
then the defuzzified value is as follows:
i. or
ii.

Yu’s rule:
1. If then the defuzzified value or is equal to the first step of Chen’s rule.
2. If have more than one value in RHS. For example
then the defuzzified value is
3. If FLR of is empty set. For example
then the defuzzified value is as follows:

(2)

Cheng’s rule:
1. If then the defuzzified value or is equal to the first step of Chen’s rule.
2. If have more than one value in RHS. For example
then the defuzzified value is

(3)

3. If FLR of is empty set. For example


and then the defuzzified value is . Other example
and then the defuzzified value is .

2. PROPOSED MODEL AND RULE


Based on the first order method introduced by Lee and Suhartono [7], a high order model is developed in this paper. New rules to find the forecast in the high order weighted fuzzy time series model are proposed. The rules for Lee's method are proposed as follows:

1. If then the defuzzified value or is equal to the first step of Chen’s rule.
2. If have more than one value in RHS. For example
then the defuzzified value is

(4)

3. If the FLR of is an empty set or . For
example, in the third order, the FLR is .
Suppose that ; then there are several proposed schemes introduced in
this paper. Some definitions will be used in the proposed schemes, i.e.:
i. is the defuzzified of the first order:
ii. is the defuzzified of the first order:
iii. is the defuzzified of the first order:
iv. is the defuzzified of the second order:
v. is the defuzzified of the second order:
vi. is the defuzzified of the second order: .
The several proposed schemes are as follows:
1st Scheme:
1. If exist, then the defuzzified is .
2. If do not exist but exist, then the defuzzified is
3. If and do not exist but exist, then the defuzzified is
4. If and do not exist, then the defuzzified is

2nd Scheme:
1. If and exist, then the defuzzified is
2. If and exist, then the defuzzified is
3. If and exist, then the defuzzified is
4. If the third rule of the second scheme does not exist, then the defuzzified value is equal to the fourth
rule of the first scheme.
3rd Scheme:
1. If , and exist, then the defuzzified is

2. If , and do not exist, then the defuzzified is equal to the fourth


rule of the first scheme.
4th Scheme
1. If exist, then the defuzzified is
2. If do not exist but exist, then the defuzzified is
3. If and do not exist but exist, then the defuzzified is

4. If , and do not exist, then the defuzzified is



Data on general Indonesia's inflation, food stuffs inflation, and education and sport inflation are used as case studies, where January 2000 - December 2009 is used as the training dataset and January - December 2010 as the testing dataset. The time series plot of each dataset is shown in Figure 1.


Figure 1. Time series plot of general Indonesia’s inflation (a), food stuffs inflation (b),
and education and sport inflation (c).

Figure 1(a) shows that general Indonesia's inflation is stationary in mean and variance and that there is neither a seasonal nor a trend pattern. The figure also shows an outlier in October 2005, which was caused by the increase in the fuel price. Because there is no seasonal pattern in these data, the model orders that can be used are the first, second, and third order. To demonstrate the proposed algorithm, the general Indonesia's inflation data is used as a numerical example for the second order as follows:
Step 1. Define the universe of discourse and partition it into several equally long intervals.
We partition the universe of discourse into 16 intervals, each of length 0.6. The 16 intervals are ,
, , , , until
.
Step 2. Define fuzzy sets based on the universe of discourse and fuzzify the historical data
In this step, the fuzzy sets, for the universe of discourse are defined
as in Table 1.

Table 1. Fuzzy sets for 16 linguistic variables

Step 3. Fuzzify observed rules


This step is used to find the best relationship between the LHS and the RHS. In this
numerical example, the LHS are and and the RHS is .
Step 4. Establish FLR and FLRG.
From LHS and RHS in step 3, we can make FLR and FLRG in the second order as

shown in Table 2 and Table 3.


Step 5. Forecast .
The forecast rule is equal to the forecast in Chen's algorithm. For example, if
and , then the forecast value is (see Table 3).

Step 6. Defuzzify.
Using the example in step 5, the forecasts are ; then the defuzzified forecasts of
are .
Table 2. Fuzzy Logical Relationship

1
2
3 
4 
5 
6 
7 

117 
118 
119 
120 

Table 3. Fuzzy Logical Relationship Group




















Step 7. Assigning weight.


The weight matrices for are . From our
research we conclude that the second scheme with optimizes the RMSE on the
testing datasets.
Step 8. Calculate final forecast.
In the weighted model, the final forecast is equal to the product of the defuzzified
matrix and the transpose of the weight matrix. For example in step 6 and 7 we have
and then the final forecast
value is determined as

To illustrate how to calculate the final forecast values at training ( and


testing ( and ) data of the second order fuzzy time series models, we use ,
and as examples.
(1) For , the linguistic value at is and at is , and based on
the second order in Table 3 we have the forecast of and is
. Thus the formula to calculate the final
forecast value at or for the proposed fuzzy time series is as follows:

(2) For , the linguistic value at is and at is , and based on


the second order in Table 3 we have the forecast of and is

. Thus the formula to calculate the final forecast


value at or for the proposed fuzzy time series is as follows:

0.585
(3) For , we already have the linguistic of the final forecast for
( is and for ( ) is . Based on Table 3 we have
the forecast of and is
.
Thus the formula to calculate final forecast value is as follows:

The forecasting performance of the proposed model is verified by using the general Indonesia's inflation, the food stuffs inflation, and the education and sport inflation data; three weighted fuzzy time series models, Chen's [1], Yu's [12], and Cheng's [4], as well as two classical time series models, exponential smoothing and ARIMA, are employed as comparison models. To evaluate the performance of the high order fuzzy time series, the root mean squared error (RMSE) and the mean absolute percentage error (MAPE) are selected as evaluation indices on the testing data. The RMSE and MAPE are defined as

RMSE = \sqrt{\frac{1}{n}\sum_{t=1}^{n}\left( Y_t - \hat{Y}_t \right)^2} \quad and \quad MAPE = \frac{100}{n}\sum_{t=1}^{n}\left| \frac{Y_t - \hat{Y}_t}{Y_t} \right|,

where n is the number of forecasts.
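In R, these two criteria can be computed directly (a small sketch, with y the actual values and yhat the forecasts of the testing period):

  rmse <- function(y, yhat) sqrt(mean((y - yhat)^2))
  mape <- function(y, yhat) 100 * mean(abs((y - yhat) / y))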
The RMSE values obtained by using the weighted fuzzy time series methods for several numbers of partitions and orders for general Indonesia's inflation are listed in Table 4.

Table 4. Accuracy of the proposed method and three fuzzy time series methods for several k (number of partitions) and orders for general Indonesia's inflation

                          RMSE                       MAPE
Order   Method    k = 16  k = 19  k = 22    k = 16   k = 19   k = 22
First   Chen      0.693   1.193   1.331     318.203  528.283  587.233
        Yu        0.462   0.504   0.513     189.899  115.293  110.247
        Cheng     0.451   0.498   0.532     166.423  116.691   99.705
        Lee       0.457   0.487   0.502     168.026  125.627  115.982
Second  Chen      0.503   0.570   0.619     128.842   98.516   84.736
        Yu        0.503   0.574   0.619     128.842  103.146   84.736
        Cheng     0.474   0.519   0.530     121.426   95.434   88.955
        Lee       0.474   0.479   0.473     135.107  130.981  136.021
Third   Chen      0.503   0.570   0.463     128.842   98.053  156.678
        Yu        0.503   0.570   0.618     128.842   98.053   82.745
        Cheng     0.461   0.516   0.440     120.251   97.396  147.955
        Lee       0.532   0.473   0.473     114.407  150.014  125.900
Based on RMSE, these results show that the high order fuzzy time series, i.e. the third order model proposed by Cheng with 22 partitions of the universe of discourse, yields the best forecast, whereas based on MAPE the high order fuzzy time series proposed by Yu yields the best forecast. The RMSE values obtained by using single exponential smoothing ( ), the three weighted fuzzy time series models, and ARIMA are listed in Table 5. These results show that MA(1) with outlier generates more accurate forecasts than the proposed model, the three other fuzzy time series models, and the single exponential smoothing model. In contrast, for the food stuffs inflation and the education and sport inflation, the proposed model (Lee's for high order) generates more accurate forecasts than the three other fuzzy time series models, exponential smoothing, and ARIMA (see Tables 6 and 7).
Table 5. Forecast accuracy of all methods for general Indonesia’s inflation


Method RMSE MAPE
Single exponential smoothing ( ) 0.456 169.541
ARIMA: MA (1) with outlier 0.324 91.963
WFTS: 1. Chen with order (1,2,3) and k = 7 0.463 156.678
2. Yu with order (1) and k = 5 0.462 189.899
3. Cheng with order (1,2,3) and k = 7 0.440 147.955
4. Lee with order (1,2,3) and k = 5 0.457 168.026

Table 6. Forecast accuracy of all methods for food stuffs inflation


Method RMSE MAPE
Single exponential smoothing ( ) 1.658 86.471
ARIMA: MA ([1,12]) with outlier 1.710 86.807
WFTS: 1. Chen with order (12) and k = 11 1.568 115.272
2. Yu with order (1) and k = 8 1.639 91.373
3. Cheng with order (1) and k = 7 1.653 107.015
4. Lee with order (1,2,3) and k = 8 1.377 87.726

Table 7. Forecast accuracy of all methods for education and sport inflation

Method                                            RMSE    MAPE
Winter's exponential smoothing                    0.153   278.143
ARIMA([5],0,[3,12])(0,1,0)12 with outlier         0.802   126.639
WFTS: 1. Chen with order (12,24) and k = 20       0.281   311.339
      2. Yu with order (12) and k = 20            0.132   190.101
      3. Cheng with order (12) and k = 20         0.161    80.668
      4. Lee with order (12) and k = 20           0.125    65.276

3. CONCLUDING REMARK

In this paper, we have proposed new rules for high order fuzzy time series based on Lee's first order fuzzy time series. Three empirical datasets were used to compare the forecasting accuracy of exponential smoothing, ARIMA, and weighted fuzzy time series methods. In general, the results showed that the proposed method (Lee's high order WFTS) yielded more accurate forecasts than the other methods, particularly the three other WFTS methods. Specifically, the RMSE results for general Indonesia's inflation showed that ARIMA with outlier generates more accurate forecasts than the proposed model, the three other fuzzy time series models, and the single exponential smoothing model. In contrast, the forecasting accuracy results for food stuffs inflation and education and sport inflation showed that the proposed models, i.e. Lee's high order WFTS and Lee's seasonal order WFTS, respectively, yielded more accurate forecasts than the three other fuzzy time series models, exponential smoothing, and ARIMA.

References

[1] CHEN, S.M. 1996. “Forecasting Enrollments Based on Fuzzy Time Series”. Fuzzy Sets and System 81, 3:311-
319.
[2] CHEN, S.M. 2002. “Forecasting Enrollments Based on High-order Fuzzy Time Series”. Cybernetics and
Systems 33, 1:1-16.
[3] CHEN, S.M. AND HWANG, J.R. 2000. “Temperature Prediction Using Fuzzy Time Series”. IEEE Transaction
on Systems, Man, and Cybernetics 30, 2:263-275.
[4] CHENG, C.H., CHEN, T.L., TEOH, H.J., AND CHIANG, C.H. 2008. “Fuzzy Time Series Based on Adaptive
Expectation Model for TAIEX Forecasting”. Expert Systems with Application 34, 2:1126-1132.
[5] HUARNG, K.H. 2001. “Heuristic Models of Fuzzy Time Series for Forecasting”. Fuzzy Sets and Systems 123,
3:369-386.
[6] HWANG, J.R., CHEN, S.M., AND LEE, C.H. 1998. “Handling Forecasting Problems Using Fuzzy Time Series”.
Fuzzy Sets and Systems 100, 2:217–228.
[7] LEE, M.H., AND SUHARTONO. 2010. “An Novel Weighted Fuzzy Time Series Models for Forecasting Seasonal
Data”. Proceeding 2nd International Conference on Mathematical Sciences. Kuala Lumpur, 30 November-30
Desember: 332-340.
[8] SINGH, S.R. 2007. “A Simple Time-Variant Method for Fuzzy Time Series Forecasting”. Cybernetics and
Systems 38, 3:305-321.
[9] SONG, Q., AND CHISSOM, B.S. 1993a. “Forecasting Enrollments with Fuzzy Time Series-part I”. Fuzzy Sets
and System 54, 1-9.

[10] SONG, Q., AND CHISSOM, B.S. 1993b. “Fuzzy Time Series and Its Model”. Fuzzy Sets and System 54, 269-277.
[11] SONG, Q. 1999. “Seasonal Forecasting in Fuzzy Time Series”. Fuzzy Sets and Systems 107, 235-236.
[12] YU, H.K. 2005. “Weighted Fuzzy Time Series Models for TAIEX Forecasting”. Physica A. Statistical
Mechanics and Its Application 349, 609-642.
[13] ZHANG, G.P. 2003. “Time Series Forecasting using A Hybrid ARIMA and Neural Network Model”.
Neurocomputing 50, 159-175.

DWI AYU LUSIA


Department of Statistics from Institut Teknologi Sepuluh Nopember, Indonesia.
e-mail: [email protected]
SUHARTONO
Department of Statistics from Institut Teknologi Sepuluh Nopember, Indonesia.
e-mail: [email protected]
Proceedings of ”The 6th SEAMS-UGM Conference 2011”
Statistics, pp. 729–736.

DETECTING OUTLIER IN HYPERSPECTRAL IMAGING


USING MULTIVARIATE STATISTICAL MODELING AND
NUMERICAL OPTIMIZATION

Edisanter Lo

Abstract. An analytic expression of an algorithm for the detection of outliers in hyper-


spectral imaging using remote sensing is developed in this article. The pixel from a data
cube which is a vector of observable random variables is modeled as a linear transforma-
tion of a vector of unobservable random variables from the clutter subspace plus a vector
of unobservable random variables from the error in which the transformation matrix is
unknown. The dimension of the clutter subspace for each spectral component of the
pixel can vary. The outlier detection algorithm is defined as the Mahalanobis distance
of the residual. The experimental results are obtained by implementing the outlier de-
tection algorithm as a global detection algorithm in unsupervised mode with background
statistics computed from hyperspectral data cubes with wavelengths in the visible and
near-infrared range.
Keywords and Phrases: outlier, multivariate statistical analysis, hyperspectral imaging,
numerical optimization.

1. INTRODUCTION
Outlier detection is an important research area in remote sensing using hyperspec-
tral imaging. In this article an outlier detector is developed in analytical expression for
detecting anomalous objects in a large area in remote sensing using hyperspectral imag-
ing. Reviews of some common outlier detectors for hyperspectral imagery are discussed
in [1,2]. The conventional outlier detectors for detecting anomalous objects in a large
area are the RX detector [3] and SSRX detector [1,4] so the performance of the new
outlier detector developed in this article is compared with the RX detector and SSRX
detector. The RX detector is a general-purpose outlier detector and is defined as the
Mahalanobis distance of the pixel. The SSRX detector is defined as the Mahalanobis
distance of the pixel in the noise subspace. The SSRX detector has been known to
perform better than the RX detector.

The MSM (Maximized Subspace Model) detector in [5] partials out the effect of
the clutter subspace in a pixel by predicting each spectral component of the pixel using
a linear combination of the clutter subspace. Both the SSRX and MSM detectors have
only one user-specified parameter which is the dimension of the deleted clutter subspace.
The maximum number of possible values for this parameter is typically large and this
would result in a large number of images of detector output to be analyzed. This paper
proposes an outlier detector that would result in significantly fewer images of detector
output to be analyzed and the outlier detector is developed in Section 2. The outlier
detector partials out the effect of the unknown clutter subspace in a pixel by modeling
the pixel as a linear transformation of the unknown clutter subspace plus an unknown
error in which the transformation matrix is also unknown. The dimension of the clutter
subspace can vary from one spectral component to another one. The outlier detector
is the Mahalanobis distance of the resulting residual. The performance of the outlier
detector is compared to the RX detector and SSRX detector using a hyperspectral data
cube in the visible and near-infrared range and the results are presented in Section 3.

2. OUTLIER DETECTION ALGORITHM


The analytical expression of the outlier detector for detecting outliers in hy-
perspectral imagery is developed in this section. Denote the demeaned pixel from a data cube by a $p \times 1$ vector $x = (x_1, x_2, \ldots, x_p)^T$ of observable random variables with expected value $E(x) = 0_p$ and covariance $\mathrm{Var}(x) = C$. Suppose there exist a $q \times 1$ vector $y = (y_1, y_2, \ldots, y_q)^T$ of unobservable random variables from the clutter subspace for $q \leq p$, a $p \times 1$ vector $\epsilon = (\epsilon_1, \epsilon_2, \ldots, \epsilon_p)^T$ of unobservable random variables of the error, and a $p \times q$ matrix $\gamma$ of unknown coefficients with $(\gamma_{i,1}, \gamma_{i,2}, \ldots, \gamma_{i,q_i}, 0, 0, \ldots, 0)$ as row $i$ of $\gamma$ such that the pixel $x$ is a linear transformation of $y$ plus the error $\epsilon$, i.e.
$$x = \gamma y + \epsilon \qquad (1)$$
where $E(y) = 0_q$, $\mathrm{Var}(y) = I_q$, $\mathrm{Cov}(y, \epsilon^T) = E\left[(y - E(y))(\epsilon - E(\epsilon))^T\right] = \emptyset$, $E(\epsilon) = 0_p$, and $\mathrm{Var}(\epsilon) = \delta$. The notation $0_q$ is a $q \times 1$ zero vector, $0_p$ is a $p \times 1$ zero vector, $\emptyset$ is a $p \times p$ zero matrix, $I_q$ is a $q \times q$ identity matrix, $\delta$ is a $p \times p$ diagonal matrix with diagonal elements $\delta_1, \delta_2, \ldots, \delta_p$, and $q$ is the maximum of $q_i$ for $1 \leq i \leq p$. By substituting (1) into the definition of the covariance of $x$, the covariance matrix for $x$ can be written in factored form as
$$C = \gamma\gamma^T + \delta. \qquad (2)$$

The unknown coefficient γ and covariance δ can be estimated using maximum


likelihood estimation by assuming that the random variables $x_1, x_2, \ldots, x_p$ have a non-singular multivariate normal distribution with zero mean and covariance $C$, the random variables $y_1, y_2, \ldots, y_q$ are independently and normally distributed with zero means and unit variances, the random variables $\epsilon_1, \epsilon_2, \ldots, \epsilon_p$ are independently and normally distributed with zero means and variances $\delta_1, \delta_2, \ldots, \delta_p$, and $y$ and $\epsilon$ are independently distributed. The sample covariance $S$ from a random sample of $n$ pixels is used to

estimate the population covariance $C$. The likelihood function for $C$ is the Wishart density function
$$L(C) = k\,|S|^{(n-p-1)/2}\,|C|^{-n/2}\,e^{-(n/2)\,\mathrm{tr}(C^{-1}S)} \qquad (3)$$
where
$$k = \left(\pi^{p(p-1)/4}\, 2^{np/2} \prod_{i=1}^{p} \Gamma\!\left(\frac{n+1-i}{2}\right)\right)^{-1} \qquad (4)$$
and $\Gamma$ is the gamma function and $\mathrm{tr}$ denotes trace. The maximum likelihood estimates of $\gamma$ and $\delta$ are obtained by maximizing the logarithm of the likelihood function in (3) subject to the constraint in (2). The constrained maximization problem can be transformed into an unconstrained maximization problem by substituting the constraint into the likelihood function. Maximizing the function in (3) is equivalent to minimizing the following function
$$\phi(\gamma, \delta) = \ln\left|\delta + \gamma\gamma^T\right| + \mathrm{tr}\left[\left(\delta + \gamma\gamma^T\right)^{-1} S\right]. \qquad (5)$$

The maximum likelihood estimates for γ and δ are obtained by minimizing the function
φ in (5) numerically using optimization methods.
Quasi-Newton method with inexact line search is used to find a minimum solution
for the function φ. The Quasi-Newton method that has been implemented updates the
Hessian matrix using the BFGS update [6,7] and estimates the step size using three
different inexact line search methods (Armijo's rule, the Goldstein test, and the Wolfe test).
Quasi-Newton method with inexact line search does not require the computation of the
Hessian matrix in analytic form but it requires the gradient vector of φ in analytic form.
The gradient vector of φ, denoted by g(z), can be derived to be
$$g(z) = \left[\tfrac{\partial\phi(\gamma,\delta)}{\partial\delta_1}\ \ldots\ \tfrac{\partial\phi(\gamma,\delta)}{\partial\delta_p}\ \ \tfrac{\partial\phi(\gamma,\delta)}{\partial\gamma_{1,1}}\ \ldots\ \tfrac{\partial\phi(\gamma,\delta)}{\partial\gamma_{1,q_1}}\ \ldots\ \tfrac{\partial\phi(\gamma,\delta)}{\partial\gamma_{p,1}}\ \ldots\ \tfrac{\partial\phi(\gamma,\delta)}{\partial\gamma_{p,q_p}}\right]^T \qquad (6)$$
where
$$z = \left[\delta_1\ \ldots\ \delta_p\ \ \gamma_{1,1}\ \ldots\ \gamma_{1,q_1}\ \ldots\ \gamma_{p,1}\ \ldots\ \gamma_{p,q_p}\right]^T \qquad (7)$$
$$\frac{\partial\phi(\gamma, \delta)}{\partial\delta_i} = \psi_{i,i} \qquad (8)$$
$$\frac{\partial\phi(\gamma, \delta)}{\partial\gamma_{i,j}} = 2\left[\psi_{i,1}\ \psi_{i,2}\ \ldots\ \psi_{i,p}\right] \times \left[\gamma_{1,j}\ \gamma_{2,j}\ \ldots\ \gamma_{p,j}\right]^T \qquad (9)$$
$$\psi = \left(\delta + \gamma\gamma^T\right)^{-1}\left[I - S\left(\delta + \gamma\gamma^T\right)^{-1}\right] \qquad (10)$$
and $\psi_{i,i}$ is the element in row $i$ and column $i$ of the matrix $\psi$.


The iteration in the BFGS Quasi-Newton method is terminated when the ratio
of the 2-norm of the difference between two successive solutions to the 2-norm of the
current solution is within a prescribed tolerance tolq. The starting values for γ are the
values of $\gamma$ that maximize the squared correlation between each spectral component of the pixel and a linear combination of the first $q_i$ principal components of the sample covariance $S$, in which the coefficients of the linear combination are $\gamma_{i,1}, \gamma_{i,2}, \ldots, \gamma_{i,q_i}$ for $i = 1, 2, \ldots, p$. The starting values for $\delta$ are obtained by satisfying the constraint in (2).
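For concreteness, the objective $\phi$ in (5) and its gradient in (6)-(10) can be evaluated in closed form and handed to an off-the-shelf quasi-Newton routine. The sketch below is a minimal Python/NumPy illustration, assuming for simplicity that every row of $\gamma$ has the same number of free coefficients $q$ (the paper allows a different $q_i$ per row), that positivity of $\delta$ is not enforced, and that SciPy's built-in BFGS line search stands in for the Armijo/Goldstein/Wolfe variants compared in the paper; it is not the author's implementation.

```python
import numpy as np
from scipy.optimize import minimize

def phi_and_grad(z, S, p, q):
    """Objective (5) and its gradient (6)-(10) for the factored covariance C = gamma gamma^T + delta."""
    delta = z[:p]                          # diagonal of the error covariance
    gamma = z[p:].reshape(p, q)            # loading matrix (same q for every row in this sketch)
    C = np.diag(delta) + gamma @ gamma.T
    Cinv = np.linalg.inv(C)
    _, logdet = np.linalg.slogdet(C)
    phi = logdet + np.trace(Cinv @ S)                      # eq. (5)
    psi = Cinv @ (np.eye(p) - S @ Cinv)                    # eq. (10)
    grad = np.concatenate([np.diag(psi),                   # eq. (8): d phi / d delta_i
                           (2.0 * psi @ gamma).ravel()])   # eq. (9): d phi / d gamma_{i,j}
    return phi, grad

def fit_clutter_model(S, q, z0):
    """BFGS quasi-Newton minimization of phi; z0 packs the starting values [delta, gamma]."""
    p = S.shape[0]
    res = minimize(phi_and_grad, z0, args=(S, p, q), jac=True,
                   method="BFGS", options={"gtol": 1e-6})
    return res.x[:p], res.x[p:].reshape(p, q)              # estimated delta and gamma
```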

The proposed criterion for selecting qi , the number of high-variance principal


components for the ith spectral component yi of the pixel y, is based on the maximized
squared correlation between each spectral component of the pixel and its linear com-
bination of the high-variance principal components. The value of qi is selected such
that the absolute difference between the maximized squared correlation for the next
value of qi and the maximized squared correlation for the current value of qi is within
a prescribed tolerance.
By treating the converged values of γ and δ from the BFGS Quasi-Newton method
as the true values of γ and δ and by estimating the values of y using the conditional
distribution of y given x, the Mahalanobis distance of the residual can be derived to be
$$d(x) = Q^T\left[\mathrm{diag}\left(\delta - \gamma\gamma^T + \gamma\gamma^T\left(\gamma\gamma^T + \delta\right)^{-1}\gamma\gamma^T\right)\right]^{-1} Q \qquad (11)$$
where
$$Q = x - \gamma\gamma^T\left(\gamma\gamma^T + \delta\right)^{-1} x. \qquad (12)$$

A large value in the detector output d(x) would indicate that the pixel x is a potential
outlier.
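Once $\gamma$ and $\delta$ have been estimated, the residual and the diagonal Mahalanobis weighting in (11)-(12) are cheap to apply to every pixel. The following Python/NumPy sketch scores a stack of demeaned pixels; it is an illustrative reading of (11)-(12), not the author's code.

```python
import numpy as np

def residual_detector(X, gamma, delta):
    """Detector output d(x) of eq. (11) for each row of X (demeaned pixels, shape (n, p))."""
    G = gamma @ gamma.T                                   # gamma gamma^T
    C = G + np.diag(delta)                                # factored covariance, eq. (2)
    Cinv = np.linalg.inv(C)
    Q = X - X @ (Cinv @ G)                                # residuals, eq. (12), row-wise
    var_q = np.diag(np.diag(delta) - G + G @ Cinv @ G)    # diagonal of Var(Q) in eq. (11)
    return np.sum(Q**2 / var_q, axis=1)                   # large values flag potential outliers
```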

3. EXPERIMENTAL RESULTS
A relative comparison of the performance between the outlier detector in (11) and
the SSRX detector with respect to the RX detector is presented in this section using
the RIT (Rochester Institute of Technology) data cube [8] which is in the visible and
near-infrared wavelengths with spatial dimensions of 280 by 400 and spectral dimension
of 126. The outlier pixels are selected to be man-made objects that are easy to detect.
The 280x400 RGB image of the data cube using spectral band number 17, 7, and 2
for the red band, green band, and blue band, respectively, is shown in Fig. 1 and the
targets are man-made objects.
The number of high-variance principal components for each spectral component of the pixel is shown in Fig. 3 for tol = 10−6 , tol = 10−7 , and tol = 10−8 . The ROC curves for the SSRX detector are shown in Fig. 4 for 1 ≤ q ≤ 12. The ROC curves for 1 ≤ q ≤ 125 show that the SSRX detector performs like the RX detector for q = 1 but it performs increasingly worse than the RX detector for 2 ≤ q ≤ 125. The Quasi-Newton method fails for tol = 10−1 , tol = 10−2 , tol = 10−3 , tol = 10−4 , and tol = 10−5 because the objective function is undefined during the line search test. The Quasi-Newton method requires a significant increase in memory for tol = 10−9 so the iteration is not carried out. The ROC curves obtained using the Goldstein test are shown in Fig. 5 for tolq = 10−3 . The ROC curves for tolq = 10−3 show that there is no significant difference among Armijo's rule, the Goldstein test, and the Wolfe test, of which the Goldstein test is the most efficient. The outlier detector performs better than the RX detector for tol = 10−6 , performs like the RX detector for tol = 10−7 , and performs worse than the RX detector for tol = 10−8 . The outlier detector for tol = 10−6 performs better than the SSRX detector for 1 ≤ q ≤ 125.

Figure 1. An RGB image of the 280x400 data cube using spectral


band number 17, 7, and 2 for the red band, green band, and blue band,
respectively.

Figure 2. An image of the locations of targets for the data cube in Fig. 1.

4. CONCLUSION
An outlier detector for detecting anomalies in hyperspectral imaging using remote
sensing is defined as the Mahalanobis distance of the residual resulting from partialling
out the effect of the clutter subspace from a pixel. The pixel of known random variables
from a data cube is modeled as a linear transformation of a set of unknown random


Figure 3. Number of principal components selected for each spectral


band number for the outlier detector for tol = 10−6 , tol = 10−7 , and
tol = 10−8 .


Figure 4. Difference in probability of detection between the SSRX


detector and RX detector (SSRX-RX) for q = 1, 2, . . . , 12.

variables from the clutter subspace plus an error of unknown random variables in which
the transformation matrix of constants is also unknown. The dimension of the clutter
subspace for each spectral component of the pixel can vary. The experimental results
are obtained by implementing the outlier detector as a global anomaly detector in
unsupervised mode using a hyperspectral data cube with wavelengths in the visible and
near-infrared range. The results show that the best ROC curve of the outlier detector is
better than that of the SSRX detector. Moreover, the outlier detector would generate
significantly fewer images of detector output to be analyzed than the SSRX detector


Figure 5. Difference in probability of detection between the outlier


detector and RX detector using Quasi-Newton method with tolq =
10−3 and Goldstein test.

which generates 125 images. Thus, the outlier detector is computationally more efficient than the SSRX detector.

References
[1] Schaum, A.P., Hyperspectral anomaly detection beyond RX, Proceeding of 13th SPIE Conference
on Algorithms and Technologies for Multispectral, Hyperspectral, and Ultraspectral Imagery 6565,
656502, 2007.
[2] Stein, D.W.J., Beaven, S.G., Hoff, L.E., Winter, E.M., Schaum, A.P., and Stocker, A.D.,
Anomaly detection from hyperspectral imagery, IEEE Signal Processing Magazine 19, 58-69, 2002.
[3] Reed, I.S. and Yu, X., Adaptive multiple-band CFAR detection of an optical pattern with un-
known spectral distribution, IEEE Trans. Acoustics, Speech, Signal Processing, 38, 1760-1770,
1990.
[4] Schaum, A. and Stocker, A. Joint hyperspectral subspace detection derived from a Bayesian
likelihood ratio test, Proceeding of 8th SPIE Conference on Algorithms and Technologies for
Multispectral, Hyperspectral, and Ultraspectral Imagery 4725, 225-233, 2002.
[5] Lo, E., Maximized Subspace Model for Hyperspectral Anomaly Detection, Pattern Analysis and
Applications (Published online March 20, 2011), 1-11, 2011.
[6] Luenberger, D.G, Linear and Nonlinear Programming, 2nd Ed., Addison-Wesley, 1984.
[7] Fletcher, R, Practical Methods of Optimization, 2nd Ed., John Wiley and Sons, 1987.
[8] Kerekes, J.P. and Snyder, D.K., Unresolved target detection blind test project overview, Pro-
ceeding of 16th SPIE Conference on Algorithms and Technologies for Multispectral, Hyperspectral,
and Ultraspectral Imagery 7695, 769521, 2010.

Edisanter Lo
Department of Mathematical Sciences, Susquehanna University,
514 University Avenue, Selinsgrove, Pennsylvania 17870, U.S.A.
e-mail: [email protected]
Proceedings of ”The 6th SEAMS-UGM Conference 2011”
Statistics, pp. 737 - 750.

PREDICTION THE CAUSE OF NETWORK CONGESTION USING


BAYESIAN PROBABILITIES

ERWIN HARAHAP , M. YUSUF FAJAR, HIROAKI NISHI

Abstract. A Network Management System (NMS) is a service that employs a variety of tools, applications, and devices to assist network administrators in monitoring and maintaining a network. Keeping the network at a high quality of service is the main purpose of an NMS. In this paper, we propose a system to predict congestion on the network by using a probability model based on the Bayesian network method. The probability model is constructed from training data of network congestion, by analyzing the joint probability distribution of a certain Bayesian network structure. A congested-network simulation is conducted to demonstrate how to collect training and testing data, and to evaluate the validity of the probability model. Based on the simulation results, we conclude that the model can predict congestion on the network caused by a certain problem.

Keywords and Phrases :Bayesian network, fault management, congestion, prediction

1. INTRODUCTION

A Network Management System (NMS) is a service that employs a variety of tools, applications, and devices to assist network administrators in monitoring and maintaining a network. A large scale network, such as an enterprise network, requires network tools to monitor and maintain it. Keeping the network at high performance is the main purpose of network services and administration. However, with the increasing complexity of networks, the probability of the occurrence of a network fault, such as congestion, packet loss, or any other network trouble, is increasing. This kind of problem may lead the network into a degradation of the network Quality of Service (QoS).
An NMS is an essential factor in successfully operating and maintaining a network. As a company becomes increasingly dependent on networking services, keeping a network management system running is synonymous with keeping the business running [1]. In advanced predictive maintenance, the key challenge is precise failure prediction, which could avoid great operation losses at the upfront stage [2].


Fault management, as a subsystem of an NMS, has the function of supporting the administrator in managing problems in the network. Based on RFC 3877, which concerns network faults, a network administrator will get an alarm regarding the problems. Unfortunately, in some cases, the administrator is faced repeatedly with similar problems whose causes are unknown. This kind of problem can be avoided if the exact cause of the problem is known. However, based on RFC 1157, fault management via the SNMP protocol does not have a feature to detect the cause of a problem [3].
Performing a prediction of the cause of a problem is one alternative way to support a network administrator. A prediction system may avoid the occurrence of the same problem in the future. This prediction might also be used to reduce the network administration cost of keeping the network at high performance and to maintain QoS as well. Fault prediction will support the activity of network administration and may help the administrator in performing a certain action to solve the problem that will occur.

2. BACKGROUND INFORMATION

A Network Management system (NMS) is a combination of hardware and software


system to monitor and administer a network. A network management system requires a
number of network management tasks be folded in a single software solution. The network
management system should discover the network inventory, monitor the health and status of
devices and provide alerts to conditions that impact system performance.
NMS systems make use of various protocols for the purpose they serve. For example,
SNMP protocol allows them to simply gather the information from the various devices down
the network hierarchy. NMS software is responsible for identification of the problem, the
exact source of the problem and solving them. The NMS systems not only are responsible in
collecting data for monitoring purposes, but also for detection of faults and collecting the
statistics of the devices over a period of time. They may include a library where the previous
network statistical data over a period of time is stored along with the problems and the
solutions that worked in the past. This library can be useful when a fault is found: the NMS software can consult the library and search for the best possible method to resolve the problem.
The Network Management System manages the network elements. These elements or devices are managed by the NMS, so they are referred to as managed devices. The management of a device includes Fault, Accounting, Administration, Configuration and Security management. Each of these five functionalities is specific to some organizations or companies.
The goal of fault management is to recognize, isolate, correct and log faults that occur
in the network. Errors primarily occur in the areas of fault management and configuration
management. Fault management is concerned with detecting network faults, logging this
information, contacting the appropriate person, and ultimately fixing a problem. A common
fault management technique is to implement an SNMP (simple network management
Protocol) based network management system to collect information about network devices. In
turn, the management station can be configured to make a network administrator aware of
problems (by email, paging, or on-screen messages), allowing appropriate action to be taken.

3. PROPOSED SYSTEM AND METHODOLOGY

3.1 Existing Network Fault Management. Traditionally, network management has been
performed using a wide variety of software tools and applications [4] for diagnosing and
isolating network faults, which require human intervention for corrective action, resulting in
most network management solutions specialized for fault management, at the expense of the
other functional areas as specified by the OSI FCAPS model. Over the years, network
management as a specialized field has matured tremendously, with consumers particularly IT
managers demanding high level of sophistication and support for monitoring and managing
newer network technologies. Moreover organizations are now more interested in
infrastructure management in context of business process that it directly or indirectly impacts.
However, existing network management solutions have not kept pace with the changing
requirements of industry providing only partial solutions to these issues [5].
Such an approach can backfire in scenarios of network congestion, where such systems
end up contributing to network congestion while suffering from packet losses and timeouts
which further impacts their effectiveness and efficiency. An improvement over such systems
can be envisaged in terms of dynamically formulated polling strategies, which are able to
pinpoint network faults faster, while being dynamically able to adjust the amount of network
management traffic that is generated based on the state of the network. Such a system could
utilize historical data collected about the network made available from the network base
lining statistics, which could provide some indication on the existing hot-spots in the network
and potential trouble areas which the solution could focus on, resulting in faster RCA and
lower Mean-Time-To-Repair (MTTR) for network faults.
Typical fault management tasks include detecting and identifying faults, isolating faults
and determining the root cause, and recovering from the faults. A manual approach requires
accepting and acting on error detection notifications, maintaining and examining error logs,
tracing and identifying faults by examining environmental changes and database information,
carrying out sequences of diagnostics tests, and correcting faults by reconfiguring/
restarting/replacing network elements. Manual fault management is usually a time-consuming
and tedious task, requiring a human expert to have a thorough knowledge of the network and
to comprehend a large amount of information. [6].
It is desirable to provide Autonomic Fault Management (AFM) for any large-scale
network supporting many users and a diverse set of applications. AFM aims to automate
many of the fault management tasks by continuously monitoring network condition for self-
awareness, analyzing the fault after it is detected for self-diagnosis, and taking adaptation
actions for self-recovery. Thus AFM can reduce potential human errors and can respond to
faults faster, thus effectively reducing the network downtime.
Based on the existing network management system, especially for fault management, a new method is needed to minimize the time spent on determining all objects in the network. The new method is also needed to yield faster Root Cause Analysis (RCA) and lower Mean-Time-To-Repair (MTTR) for network faults.

3.2 Proposed System. The main objective of the NMS is to help reduce the network administration cost by predicting the congested link in the network. It would be a great help to the network administrator if there were a new method that develops the capability of the NMS, especially for fault management. In this paper, a method for the network management system, especially for network fault management, using a Bayesian network is proposed. The method aims to minimize the time spent on determining all objects in the network, yielding faster

RCA, reduced MTTR, and fewer other network faults. A Bayesian network is a method which can be used as a prediction tool through its causal-relation features. A Bayesian network is a graphical probability model for representing the probabilistic relationships among a large number of nodes and for probabilistic inference with them. A Bayesian network provides a framework for addressing problems that contain uncertainty and complexity [7].
The development of fault management is needed to achieve the objective of the NMS: keeping the network running optimally and effectively. Concerning the reduction of administration cost and the development of network technology and network complexity, the features of the proposed system are the following:
Root-Cause Analysis: the proposed system can detect not only a specified fault but the cause of the fault as well. This feature is a development of the current existing network fault management method.
Posterior Fault Prediction: the proposed system can detect a fault earlier, before it actually occurs. This is the best way to keep the network running effectively.
Adaptive System: the proposed system has learning capability so that it can update its resources periodically to keep monitoring the network.

3.3 Methodology. This section describes the methodology of the proposed system. Referring to Figure 1, the proposed system image, the process starts from data collection. The data source is the Management Information Base (MIB), obtained by implementing the SNMP protocol in the real network. The network data can also be collected in a simulation generated by a network simulator application. There are several network simulator applications that can be used as a data source, such as Network Simulator 2 (NS2) [8] or OPNET [9]. Next, the collected data is sent to the proposed system. The proposed system has two main modules, the Learning System and the Diagnosis System. The Learning System is a module whose function is to build a Bayesian network model. The Diagnosis System is a module whose function is to monitor faults in the network by using the constructed Bayesian network model. After that, the monitoring result is sent to the alarm and recovery system.

Figure 1. Proposed system for fault prediction

4. EXPERIMENTAL CONFIGURATION

4.1 Initialization. The mesh network topology consists of four routers and five links between routers, as provided in Figure 2.

Figure 2. Topology of Mesh Network



Each router is connected with nodes that send packets to each other. The bandwidth of the links between routers is 2.5 Mbps with a 40 ms delay on each link. The bandwidth and delay between routers and nodes are not specified.
The links between nodes and routers are duplex links with a drop-tail queue type. Each node may send or receive packets. The purpose of the experiment is to confirm that the congestion that occurs is caused by a link going down at a specified location. Referring to Figure 2, if link 2 goes down, the flowing packets will go through link 0, link 1, link 3, or link 4, so that these related links have a higher possibility of becoming congested than when link 2 is not down.

4.2 Training Simulation. A training simulation is conducted to generate the Bayesian network model for congested-link prediction with initial values [10]. In the training simulation, some faults are implemented. The faults consist of disconnecting certain links at certain times, so that congestion will occur in one or more links. The link-down schedule is applied as follows:
• From 10 second to 20 second: link 0 down
• From 30 second to 40 second: link 2 down
• From 50 second to 60 second: link 3 down
• From 70 second to 80 second: link 1 down
• From 90 second to 100 second: link 4 down
After the training simulation is conducted, all data are collected and analyzed. The resulting throughput traffic from link 0 to link 4 is shown in Figure 3.

Figure 3.a.

Figure 3.b.

Figure 3.c.

Figure 3.d.

Figure 3. (a, b, c, d) The traffic of throughput on Mesh Network

4.3 Constructing Model. The next step is constructing the Bayesian network model. Throughput and link-down data from the training simulation are collected and then categorized by "D" for link down and "C" for congestion, followed by the link number; for example, "D1" for link 1 down, or "C4" for link 4 congested. The data are then sent to B-Course or Bayonet for model construction. Figure 4 shows a Bayesian network model constructed by Bayonet together with its joint probability distribution.

Figure 4. Bayesian network model constructed by Bayonet

$$P(D_0, D_1, D_2, D_3, D_4, C_2, C_3, C_4) = P(D_0)\,P(D_1)\,P(D_2)\,P(D_3)\,P(D_4)\,P(C_2 \mid D_0, D_1, D_3)\,P(C_3 \mid D_2, D_4)\,P(C_4 \mid D_3)$$

The complexity of the Bayesian network can be reduced by using the Markov-blanket theorem. Given the posterior probability to be calculated, the joint probability can be reduced according to the Bayesian network model.
The congested link that will be analyzed is link 3 (C3). Therefore, based on the Markov-blanket theorem, the structure of the Bayesian network (Fig. 4) can be reduced. The random variables that are not related directly to "C3" can be removed, as shown in Figure 5.

Figure 5. Reduced Bayesian Network model

4.4 Testing Simulation. After the Bayesian network model is built, the next step is to evaluate the validity of the model by a testing simulation. The mesh topology (Figure 2) will be used as the network topology for the testing simulation. The scenario for the testing simulation is as follows:
• Total time simulation : 100 second
• Link 2 down : from 10 second to 40 second
• Link 4 down : from 60 second to 90 second
• Traffic used : FTP and CBR
The results of the testing simulation are shown below. Link 2 being down from second 10 to second 40 (Figs. 6 and 7) causes congestion in link 3, and link 4 being down from second 60 to second 90 also causes congestion in link 3.

Figure 6.a.

Figure 6.b.

Figure 6.c.

Figure 6 (a, b, c). Experiment result of links down and congestions


on testing simulation

4.5 Discussion and Result. The evaluation of the proposed system is performed using data from the testing simulation. Given the throughput data from link 3 (Figure 6), the proposed system is tested on whether it can predict that the congestion that occurs is caused by link 2 and link 4 being down. Applying the reduced Bayesian network of Fig. 5, we can write its joint probability distribution as follows:

$$P(D_2, D_4, C_3) = P(D_2)\,P(D_4)\,P(C_3 \mid D_2, D_4)$$

To predict the cause of congestion in link 3, the following posterior probabilities are
needed:

$$P(D_2^{\mathrm{true}} \mid C_3) = \frac{P(D_2^{\mathrm{true}}, C_3)}{P(C_3)} = \frac{\sum_{D_4} P(D_2^{\mathrm{true}}, D_4, C_3)}{\sum_{D_2, D_4} P(D_2, D_4, C_3)} \qquad \text{(eq. 1)}$$

$$P(D_4^{\mathrm{true}} \mid C_3) = \frac{P(D_4^{\mathrm{true}}, C_3)}{P(C_3)} = \frac{\sum_{D_2} P(D_2, D_4^{\mathrm{true}}, C_3)}{\sum_{D_2, D_4} P(D_2, D_4, C_3)} \qquad \text{(eq. 2)}$$

$P(D_2^{\mathrm{true}} \mid C_3)$ is the predicted probability that the congestion occurring in link 3 is caused by disconnection at link 2, and $P(D_4^{\mathrm{true}} \mid C_3)$ is the predicted probability that the congestion occurring in link 3 is caused by disconnection at link 4.
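To make the use of eq. 1 and eq. 2 concrete, the sketch below evaluates both posteriors from the factorization $P(D_2, D_4, C_3) = P(D_2)P(D_4)P(C_3 \mid D_2, D_4)$ by brute-force summation. The prior and conditional probability values are placeholders for illustration only; the actual values come from the Bayonet model learned from the training data.

```python
# Hypothetical CPT values (not the values learned in the paper).
p_d2 = 0.10                                   # P(D2 = true)
p_d4 = 0.10                                   # P(D4 = true)
p_c3 = {(True, True): 0.95, (True, False): 0.80,
        (False, True): 0.70, (False, False): 0.05}   # P(C3 = true | D2, D4)

def joint(d2, d4):
    """P(D2 = d2, D4 = d4, C3 = true) from the reduced network of Fig. 5."""
    return ((p_d2 if d2 else 1 - p_d2) *
            (p_d4 if d4 else 1 - p_d4) *
            p_c3[(d2, d4)])

evidence = sum(joint(d2, d4) for d2 in (True, False) for d4 in (True, False))   # P(C3 = true)
p_d2_given_c3 = sum(joint(True, d4) for d4 in (True, False)) / evidence         # eq. 1
p_d4_given_c3 = sum(joint(d2, True) for d2 in (True, False)) / evidence         # eq. 2
print(p_d2_given_c3, p_d4_given_c3)
```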

Figure 7. Experiment result of links down and congestions

All throughput data at link 3 are analyzed using eq. 1 and eq. 2. The results are shown in Fig. 8 and Fig. 9.

Figure 8. Prediction of congestion at link 3 caused by link 2 down



Figure 9. Prediction of congestion at link 3 caused by link 4 down

5. CONCLUSION

The conclusion is based on observations made while utilizing a fairly limited set of scenarios. Development and technology updates for the Network Management System are needed, especially in the Fault Management System, due to increasing network complexity in the future. Based on the simulation on the mesh network, the proposed system can detect the cause of a fault: the diagnosis showed that a congested link is caused by one or more links being down.
The proposed method can also be implemented to predict a congested link in other network topologies, based on similar steps for the training simulation and the testing simulation. The experimental results show an accurate prediction when monitoring a link in the testing simulation. The prediction result can be used as a quick reference to support the network administrator in taking a specified action and can reduce the network administration cost.
For future work, some improvements of the proposed system are needed to predict other failures such as high latency, throughput degradation, or packet loss. The most important part is the information or data about the causal relation between cause and effect. Once the information on the causal relation is obtained, it becomes easier to construct the Bayesian network model. Some research should be conducted to compare the performance of the Bayesian network with other similar prediction methods in the network fault management area. Regarding MPLS networks, the congested link should be examined more deeply when more than one burst is given; in some specific situations, a traffic burst does not cause congestion on the link.

Acknowledgement. This work was partially supported by the National Institute of Information and Communications Technology (NICT) Japan and the Asian Development Bank - Japan Scholarship Program (ADB-JSP). Thanks are addressed to Prof. Hiroaki Nishi for valuable guidance and support throughout the course of this work and for enriching knowledge about the paper contents, and to all West-Lab members, Keio University, Japan, for great help and research collaborations.

References

[1] CLEMM ALEXANDER, Network Management Fundamentals, Cisco Press, 2006.


[2] ZHIQIANG CAI; SHUDONG SUN; SHUBIN SI; NING WANG; Industrial Engineering and Engineering
Management, 2009. IE&EM '09. 16th International Conference on Digital Object Identifier, 2009, Page(s):
2021-2025.
[3] ERWIN HARAHAP. A Study on Network Management System with Fault Prediction Function by using
Bayesian Network to Reduce Administration Cost. In master course’s thesis, Hiroaki Nishi Lab., Dept. of
Integrated Design Engineering, Keio University, Japan. 2010.
[4] LAWRENCE BERNSTEIN AND C.M. YUHAS. Basic Concepts for Managing Telecommunications Networks-
Copper to Sand to Glass to Air. Kluwer Academic/Plenum Publishers, 1999.
[5] ANKUR GUPTA. Network Management : Current Trends and Future Perspectives. Journal of Network and
System Management Vol. 14, No.4, December 2006. Page(s): 483-491.
[6] NAN LI, GUANLING CHEN, AND MEIYUAN ZHAO. Autonomic Fault Management For Wireless Mesh
Network. The Electric Journal for Emerging Tools & Applications. Volume 2, Issue 4. January 2009.
[7] RICHARD E. NEAPOLITAN, Learning Bayesian Networks, Prentice Hall, 2003.
[8] NS2. http://www.isi.edu/nsnam/ns/. Last Access, July 26, 2010.
[9] OPNET. http://www.opnet.com/solutions/network_rd/modeler.html, last access, july 26, 2010.
[10] ERWIN HARAHAP, WATARU SAKAMOTO, HIROAKI NISHI. Failure Prediction Method for Network
Management System by using Bayesian Network and Shared Database. In Proceeding of the APSITT
2010. IEEE Asia-Pasific Symposium on Information and Telecommunication Technologies.

ERWIN HARAHAP
Mathematics Dept. Bandung Islamic University, Indonesia.
e-mail: [email protected]

M. YUSUF FAJAR
Mathematics Dept. Bandung Islamic University, Indonesia.
e-mail: [email protected]

HIROAKI NISHI
Integrated Design Engineering Dept., Keio University, Japan.
e-mail: [email protected]
Proceedings of ”The 6th SEAMS-UGM Conference 2011”
Statistics, pp. 751 – 762 .

SOLVING BLACK-SCHOLES EQUATION BY


USING INTERPOLATION METHOD WITH
ESTIMATED VOLATILITY

F. DASTMALCHISAEI, M. JAHANGIRHOSSEINPOUR, S. YAGHOUBI

Abstract. We seek good numerical approximations of solutions of a stochastic differential equation with multiplicative noise for the stock price of the underlying asset of a European call option. We estimate the drift and volatility parameters by using a sample set of data and apply the Euler-Maruyama and Milstein schemes for solving the Black-Scholes equation. The convergence analysis for the proposed methods is given. Numerical examples verify the accuracy and efficiency of the method.

Keywords and Phrases: Black-Scholes equation, stochastic differential equation, European


option, divided difference, interpolation.

1. INTRODUCTION

The most important problem on the financial market is to estimate the price of the underlying asset of a European call option, see [6, 7]. The central form of the SDE for the evolution of a firm's stock price is
$$dS = \mu S\,dt + \sigma S\,dW, \qquad (1)$$
where $\mu$, $\sigma$, $W(t)$ denote respectively the drift, the volatility and the Wiener process. Generally, a stochastic differential equation has the form
$$dX_t = a(t, X_t)\,dt + b(t, X_t)\,dW, \qquad (2)$$


with $X(\cdot) : [0, \infty) \to \mathbb{R}$ a random function and $a : [0, T] \times \mathbb{R} \to \mathbb{R}$, $b : [0, T] \times \mathbb{R} \to \mathbb{R}$. The common numerical methods employed include the binomial scheme, finite differences and Monte Carlo simulation [2]. The binomial schemes are the most widely used in the finance community for the valuation of a wide variety of option models. Monte Carlo methods simulate the stochastic movement of the price of securities and provide a probabilistic solution. The finite difference method seeks a discretization of the differential operators in the continuous Black-Scholes models. In this paper, we estimate the drift and volatility parameters from a sample set of data by an unbiased method and use the Euler-Maruyama (EM) and Milstein (MS) iteration schemes for solving (1), and show that the EM and MS methods have strong orders of convergence $\gamma = 0.5$ and $\gamma = 1$, respectively [4]. We apply linear interpolation for showing the approximate path. In Section 2, we introduce finite difference methods for SDEs and convergence theorems. In Section 3, we apply the linear interpolation method for the approximate solution. In Section 4, the drift and volatility parameters are estimated by using a sample data set. The numerical results for solving the Black-Scholes equation with a European option are given in Section 5. The conclusion is in Section 6.

2. FINITE DIFFERENCE METHODS FOR SDE

In order to construct a Taylor series expansion using Ito calculus, the chain rule for this calculus must be defined, and this requires Ito's lemma. Let $f : [0, T] \times \mathbb{R} \to \mathbb{R}$ have continuous partial derivatives $f_x$, $f_t$ and $f_{xx}$. A scalar transformation by $f$ of the stochastic differential (2) results, after some nontrivial analysis, in the formula
$$df(t, X_t) = \frac{\partial f(t, X_t)}{\partial t}\,dt + \frac{\partial f(t, X_t)}{\partial x}\,dX_t + \frac{1}{2}\,\frac{\partial^2 f(t, X_t)}{\partial x^2}\,(dX_t)^2, \qquad (3)$$
and, in another way,
$$f(t, X_t) = f(s, X_s) + \int_s^t \left(\frac{\partial f}{\partial u} + a\,\frac{\partial f}{\partial x} + \frac{1}{2}\,b^2\,\frac{\partial^2 f}{\partial x^2}\right) du + \int_s^t b\,\frac{\partial f}{\partial x}\, dW_u, \qquad (4)$$
w.p.1, for any $0 \leq s \leq t \leq T$, where the integrands are evaluated at $(X_u, u)$.
By using Ito's lemma and a Taylor expansion, we can obtain numerical methods for solving (2). In this paper, the Euler-Maruyama (EM) and Milstein (MS) methods are used for discretizing the SDE.

2.1 Euler-Maruyama method. One of the simplest time discrete approximations of an Ito process is the EM approximation. We shall consider an Ito process $X = \{X_t,\ t_0 \leq t \leq T\}$ satisfying the scalar stochastic differential equation (2) on $t_0 \leq t \leq T$ with the initial value $X_{t_0} = X_0$. For a given discretization $t_0 = \tau_0 < \tau_1 < \ldots < \tau_N = T$ of the time interval $[t_0, T]$, an EM approximation is a continuous time stochastic process $Y = \{Y(t),\ t_0 \leq t \leq T\}$ satisfying the iterative scheme
$$Y_{n+1} = Y_n + a(\tau_n, Y_n)(\tau_{n+1} - \tau_n) + b(\tau_n, Y_n)\left(W_{\tau_{n+1}} - W_{\tau_n}\right) \qquad (5)$$
for $n = 0, 1, 2, \ldots, N-1$ with initial value $Y_0 = X_0$, where we have written $Y_n = Y(\tau_n)$ for the value of the approximation at the discretization time $\tau_n$. We shall consider equidistant discretization times $\tau_n = t_0 + n\Delta$ with step size $\Delta = (T - t_0)/N$ for some integer $N$. Now we need to generate the random increments $\Delta W_n = W_{\tau_{n+1}} - W_{\tau_n}$, for $n = 0, 1, 2, \ldots, N-1$, of the Wiener process $W = \{W_t,\ t \geq 0\}$. These increments are independent Gaussian random variables with mean $E(\Delta W_n) = 0$ and variance $E\left((\Delta W_n)^2\right) = \Delta$.
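To make the recursion (5) concrete, the sketch below implements one Euler-Maruyama path for a general scalar SDE $dX_t = a(t, X_t)\,dt + b(t, X_t)\,dW_t$ on an equidistant grid in Python/NumPy; it is an illustration of the scheme, not the MATLAB code used for the experiments in Section 5.

```python
import numpy as np

def em_path(a, b, x0, t0, T, N, seed=0):
    """Euler-Maruyama recursion (5) on the equidistant grid tau_n = t0 + n*Delta, Delta = (T - t0)/N."""
    rng = np.random.default_rng(seed)
    dt = (T - t0) / N
    t = t0 + dt * np.arange(N + 1)
    dW = rng.normal(0.0, np.sqrt(dt), size=N)   # independent N(0, Delta) Wiener increments
    Y = np.empty(N + 1)
    Y[0] = x0
    for n in range(N):
        Y[n + 1] = Y[n] + a(t[n], Y[n]) * dt + b(t[n], Y[n]) * dW[n]
    return t, Y

# Example call for an SDE of the form (1) with placeholder drift and volatility values:
# t, Y = em_path(lambda t, x: 0.01 * x, lambda t, x: 0.06 * x, x0=2.0, t0=0.0, T=33.0, N=2**8)
```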

2.2 Milstein method. If we add to the EM method the term
$$\frac{1}{2}\,b\,b'\, I_{(1,1)} = \frac{1}{2}\,b\,b'\left\{(\Delta W_n)^2 - \Delta_n\right\}, \qquad (6)$$
from the Ito-Taylor expansion, then we obtain the Milstein method as
$$Y_{n+1} = Y_n + a\,\Delta_n + b\,\Delta W_n + \frac{1}{2}\,b\,b'\left\{(\Delta W_n)^2 - \Delta_n\right\}. \qquad (7)$$
We can rewrite this as
$$Y_{n+1} = Y_n + \underline{a}\,\Delta_n + b\,\Delta W_n + \frac{1}{2}\,b\,b'\,(\Delta W_n)^2, \qquad (8)$$
where $\underline{a} = a - \frac{1}{2}\,b\,b'$.
2.3 Pathwise approximation and strong and weak convergence. We shall say that a discrete time approximation $Y^{\Delta}$ converges strongly with order $\gamma > 0$ to $X$ at time $T$ if there exists a positive constant $C$, which does not depend on $\Delta$, and a $\Delta_0 > 0$ such that
$$\epsilon(\Delta) = E\left(\left|X_T - Y^{\Delta}(T)\right|\right) \leq C\,\Delta^{\gamma}, \qquad (9)$$
for each $\Delta \in (0, \Delta_0)$. We shall investigate the strong convergence of a number of different discrete time approximations experimentally.
We shall say that a discrete time approximation $Y^{\Delta}$ converges weakly with order $\beta > 0$ to $X$ at time $T$ as $\Delta \downarrow 0$ if for each polynomial $g$ there exists a positive constant $C$, which does not depend on $\Delta$, and a finite $\Delta_0 > 0$ such that
$$\left|E\left(g(X_T)\right) - E\left(g(Y^{\Delta}(T))\right)\right| \leq C\,\Delta^{\beta}. \qquad (10)$$

2.4 Convergence theorems.

Theorem 1: Suppose that
(i) $E\left(|X_0|^2\right) < \infty$,
(ii) $E\left(|X_0 - Y_0^{\Delta}|^2\right)^{1/2} \leq K_1\,\Delta^{1/2}$,
(iii) $|a(t, x) - a(t, y)| + |b(t, x) - b(t, y)| \leq K_2\,|x - y|$,
(iv) $|a(t, x)| + |b(t, x)| \leq K_3\,(1 + |x|)$,
(v) $|a(s, x) - a(t, x)| + |b(s, x) - b(t, x)| \leq K_4\,(1 + |x|)\,|s - t|^{1/2}$,
for all $s, t \in [0, T]$ and $x, y \in \mathbb{R}$, where the constants $K_1, K_2, K_3, K_4$ do not depend on $\Delta$. Then, for the EM approximation $Y^{\Delta}$, the estimate
$$E\left(\left|X_T - Y^{\Delta}(T)\right|\right) \leq K_5\,\Delta^{1/2}$$
holds, where the constant $K_5$ does not depend on $\Delta$.

Proof: refer to [4].

Theorem 2: Suppose that
(i) $E\left(|X_0|^2\right) < \infty$,
(ii) $E\left(|X_0 - Y_0^{\Delta}|^2\right)^{1/2} \leq K_1\,\Delta^{1/2}$,
(iii) $|a(t, x) - a(t, y)| \leq K_2\,|x - y|$,
$\left|b^{j_1}(t, x) - b^{j_1}(t, y)\right| \leq K_2\,|x - y|$,
$\left|L^{j_1} b^{j_2}(t, x) - L^{j_1} b^{j_2}(t, y)\right| \leq K_2\,|x - y|$,
(iv) $\left|L^{j} a(t, x) - L^{j} a(t, y)\right| \leq K_3\,|x - y|$,
$\left|L^{j} b^{j_1}(t, x) - L^{j} b^{j_1}(t, y)\right| \leq K_3\,|x - y|$,
$\left|L^{j} L^{j_1} b^{j_2}(t, x)\right| \leq K_3\,(1 + |x|)$,
(v) $|a(s, x) - a(t, x)| \leq K_4\,(1 + |x|)\,|s - t|^{1/2}$,
$\left|b^{j_1}(s, x) - b^{j_1}(t, x)\right| \leq K_4\,(1 + |x|)\,|s - t|^{1/2}$,
$\left|L^{j_1} b^{j_2}(s, x) - L^{j_1} b^{j_2}(t, x)\right| \leq K_4\,(1 + |x|)\,|s - t|^{1/2}$,
for all $s, t \in [0, T]$, $x, y \in \mathbb{R}$, $j = 0, \ldots, m$ and $j_1, j_2 = 1, \ldots, m$, where the constants $K_1, K_2, K_3, K_4$ do not depend on $\Delta$. Then for the MS approximation $Y^{\Delta}$, the estimate
$$E\left(\left|X_T - Y^{\Delta}(T)\right|\right) \leq K_5\,\Delta$$
holds, where the constant $K_5$ does not depend on $\Delta$.

Proof: refer to [4, 5].

3. LINEAR INTERPOLATION

Let $[0, T]$ be a finite time horizon. Consider a discretization $0 = t_0 < t_1 < \ldots < t_N = T$ of $[0, T]$. Let $Y_k$ be the numerical solution of (2) at the points $t_k$ $(k = 0, 1, \ldots, N-1)$ defined by (5), (7). A corresponding continuous solution to (1) is given by the linear interpolation [1],
$$Y_t = Y_k + a(t_k, Y_k)(t - t_k) + b(t_k, Y_k)\left(W_t - W_{t_k}\right), \qquad Y(t_k) = Y_k. \qquad (11)$$
The following theorem guarantees existence and uniqueness of the strong solution of (1). A modulus of continuity condition in the $t$ variable is required to obtain an order of convergence similar to the deterministic case for the EM and MS schemes.

Theorem 3: Assume that
(i) $|a(t, x) - a(t, y)| + |b(t, x) - b(t, y)| \leq K_1\,|x - y|$,
(ii) $|a(t, x)|^2 + |b(t, x)|^2 \leq K_2\,(1 + |x|^2)$,
(iii) $|a(s, x) - a(t, x)|^2 + |b(s, x) - b(t, x)|^2 \leq K_3\,|t - s|$, for $s, t \in [0, T]$.
Then $E\left(|X_{t_k} - Y_{t_k}|^2\right) = O(h^2)$, $k = 1, 2, \ldots, N$, $t \in [0, T]$, where $h = t_k - t_{k-1}$ is the constant step size. It is clear that the linear interpolation process for $Y_t$ has the same order of mean square error, i.e. $E\left(|X_t - Y_t|^2\right) \leq C h^2$, $t \in [t_k, t_k + h]$, where $h$ is the equidistant step size and $C$ is a constant independent of $h$; see [4].

4. ESTIMATE VOLATILITY AND DRIFT PARAMETERS

We have a sample set of data that represents the evolution of a firm's stock prices, given in Table 1, in order to approximate (1).

Table 1: Evolution of a firm's stock prices and estimated rentabilities

Day of the Week   Date   S(i)   R(i)   Day of the Week   Date   S(i)   R(i)

We 01.03.95 2.11 Fr 24.03.95 2.73 0.0706

Th 02.03.95 1.9 -0.0995 Mo 27.03.95 2.91 0.0656

Fr 03.03.95 2.18 0.1474 Tu 28.03.95 2.92 0.0034

Mo 06.03.95 2.16 -0.0092 We 29.03.95 2.92 0.000

Tu 07.03.95 1.91 -0.1157 Th 30.03.95 3.12 0.0685



We 08.03.95 1.86 -0.0262 Fr 31.03.95 3.14 0.0064

Th 09.03.95 1.97 0.0591 Mo 03.04.95 3.13 -0.0032

Fr 10.03.95 2.27 0.1523 Tu 04.04.95 3.24 0.0351

Mo 13.03.95 2.49 0.0969 We 05.04.95 3.25 0.0031

Tu 14.03.95 2.76 0.1084 Th 06.04.95 3.28 0.0031

We 15.03.95 2.61 -0.0543 Fr 07.04.95 3.21 -0.0213

Th 16.03.95 2.67 0.0230 Mo 10.04.95 3.02 -0.0592

Fr 17.03.95 2.64 -0.0112 Tu 11.04.95 3.08 0.0199

Mo 20.03.95 2.6 -0.0152 We 12.04.95 3.19 0.00357

Tu 21.03.95 2.59 -0.0038 Mo 17.04.95 3.21 0.0063

We 22.03.95 2.59 0.000 Tu 18.04.95 3.17 -0.0125

Th 23.03.95 2.55 -0.0154 We 19.04.95 3.24 0.0221

We estimate the drift and volatility by using unbiased estimators. In discrete time the rentability of the stock $S$ over a time interval $(t_{k-1}, t_k)$ is
$$R(t_k) = \frac{S(t_k) - S(t_{k-1})}{S(t_{k-1})}, \qquad k \geq 1,$$
and in continuous time the rentability of the stock at time $t$ is $R(t) = dS(t)/S(t)$.

We use a Kolmogorov-Smirnov test to verify that the values of $R$ follow a normal probability distribution. We find
$$E(R) = 0.01474, \qquad \mathrm{Var}(R) = 0.0035084, \qquad \sigma = \sqrt{\mathrm{Var}(R)} = 0.05923196.$$
Thus (1) can be written as
$$\frac{dS}{S} = \mu\,dt + \sigma\,dW(t), \qquad S(0) = S_0,$$
with $\mu = E(R) = 0.01474$, $\sigma = 0.05923$, $S_0 = 2.11$, $T = 33$, and $W(t) \sim N(0, t)$.



By using the Ito stochastic integral, the exact solution of (1) has the form [7]
$$S(t) = S_0\, \exp\!\left(\sigma W(t) + \left(\mu - \frac{\sigma^2}{2}\right) t\right).$$
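The estimates above can be reproduced directly from the prices $S(i)$ in Table 1. The short Python sketch below computes the discrete rentabilities $R(t_k)$ and their sample mean and unbiased sample variance; the price list is transcribed from Table 1, and the code is only an illustration, not part of the original MATLAB programs.

```python
import numpy as np

# Stock prices S(i) from Table 1 (01.03.95 - 19.04.95)
S = np.array([2.11, 1.90, 2.18, 2.16, 1.91, 1.86, 1.97, 2.27, 2.49, 2.76, 2.61,
              2.67, 2.64, 2.60, 2.59, 2.59, 2.55, 2.73, 2.91, 2.92, 2.92, 3.12,
              3.14, 3.13, 3.24, 3.25, 3.28, 3.21, 3.02, 3.08, 3.19, 3.21, 3.17, 3.24])

R = (S[1:] - S[:-1]) / S[:-1]        # discrete rentabilities R(t_k)
mu_hat = R.mean()                    # drift estimate, approximately 0.0147
var_hat = R.var(ddof=1)              # unbiased sample variance, approximately 0.0035
sigma_hat = np.sqrt(var_hat)         # volatility estimate, approximately 0.059
```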

5. NUMERICAL RESULTS

In order to obtain approximate values of (1), we first solve the Black-Scholes equation by the EM and MS methods. We use linear interpolation between the approximate points. The numerical results are given below. The programs are run in MATLAB.
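For readers without access to the original MATLAB programs, the following Python sketch reproduces the type of comparison shown in Figures 1-7: it simulates one Brownian path, computes the exact solution $S(t) = S_0 e^{\sigma W(t) + (\mu - \sigma^2/2)t}$ on that path, and runs the EM and Milstein recursions with the parameters estimated in Section 4 ($\mu = 0.01474$, $\sigma = 0.05923$, $S_0 = 2.11$, $T = 33$). The random path, and hence the plotted curves, will of course differ from the figures in the paper.

```python
import numpy as np

mu, sigma, S0, T = 0.01474, 0.05923, 2.11, 33.0   # parameters estimated in Section 4
N = 2 ** 8                                        # number of steps, e.g. N = 2^8 as in Figure 2
dt = T / N
rng = np.random.default_rng(1)
dW = rng.normal(0.0, np.sqrt(dt), size=N)
W = np.concatenate([[0.0], np.cumsum(dW)])
t = dt * np.arange(N + 1)

exact = S0 * np.exp(sigma * W + (mu - 0.5 * sigma**2) * t)   # exact solution of (1)

em = np.empty(N + 1); ms = np.empty(N + 1)
em[0] = ms[0] = S0
for n in range(N):
    em[n + 1] = em[n] + mu * em[n] * dt + sigma * em[n] * dW[n]                  # EM scheme (5)
    ms[n + 1] = (ms[n] + mu * ms[n] * dt + sigma * ms[n] * dW[n]
                 + 0.5 * sigma**2 * ms[n] * (dW[n]**2 - dt))                     # Milstein scheme (7)

print("end-point errors:", abs(em[-1] - exact[-1]), abs(ms[-1] - exact[-1]))
```

Because both recursions are driven by the same increments dW, the two approximate paths can be compared along the same Brownian path, as in Figures 5-7.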

Figure 1. EM method(N=2^4) and exact solution

Figure 2. EM method (N=2^8) and exact solution



Figure 3.MS method(N=2^6) and exact solution

Figure 4. MS method (N=2^10) and exact solution

Figure 5.EM and MS method (N=2^4) and exact solution



Figure 6. EM and MS method (N=2^6) and exact solution

Figure 7.EM and MS method (N=2^8) and exact solution

6. CONCLUSION

In this paper, we used a simple method for solving SDEs. First, the SDE is solved by explicit finite difference methods, and linear interpolation is used to connect the numerical points. Convergence and error analysis theorems are given. One can also use implicit methods (for example Runge-Kutta) for solving SDEs. Wavelets are a very useful and simple tool; by applying wavelets (for example the Haar wavelet), one can obtain good results.

References

[1] S.D. CONTE, C. DE BOOR, Elementary Numerical Analysis, McGraw-Hill, Inc., 1980.
[2] D. DUFFIE, P. GLYNN, Efficient Monte Carlo simulation of security prices, Ann. Appl. Probab. 5 (1995), 895-905.
[3] D.J. HIGHAM, An algorithmic introduction to numerical simulation of stochastic differential equations, SIAM Review, Vol. 43, No. 3 (2001), 525-546.
[4] P.E. KLOEDEN, E. PLATEN, Numerical Solution of Stochastic Differential Equations, Springer-Verlag, Berlin Heidelberg, 1992.
[5] G.N. MILSTEIN, A method of second-order accuracy integration of stochastic differential equations, Theory Probab. Appl. 23, 1978.
[6] S. SONDERMANN, Introduction to Stochastic Calculus for Finance, Springer-Verlag, Berlin Heidelberg, 2006.
[7] P. WILMOTT, Derivatives: The Theory and Practice of Financial Engineering, John Wiley and Sons, Chichester, 1998.

F.DASTMALCHISAEI
Department of mathematics, Islamic Azad University-Tabriz branch, Tabriz- Iran
e-mail: [email protected]

M. JAHANGIR HOSSEINPOUR
Department of mathematics, Islamic Azad University-Tabriz branch, Tabriz- Iran
e-mail: [email protected]

S.YAGHOUBI
Department of mathematics, Islamic Azad University-Tabriz branch, Tabriz- Iran
e-mail: [email protected]
Proceedings of ”The 6th SEAMS-UGM Conference 2011”
Statistics, pp. 763 - 772.

ARTIFICIAL ENSEMBLE FORECASTS:


A NEW PERSPECTIVE OF WEATHER FORECAST IN
INDONESIA

HERI KUSWANTO

Abstract. Ensemble forecasting has been widely used in the Ensemble Prediction Systems (EPS) of developed countries to forecast the weather condition. The idea is to use Numerical Weather Prediction (NWP) models for generating a probabilistic forecast which is able to cover uncertainty in the atmospheric behavior as well as in the model itself. Indeed, an ensemble forecast is able to generate a reliable forecast after calibration. This paper explores the current status of weather forecasting applied in Indonesia and formulates some potential work dealing with ensemble forecasts. This leads to the idea of creating an artificial ensemble forecast. An illustration of the proposed methodology is given.

Keywords and Phrases : Ensemble forecast, calibration, probabilistic,uncertainty,reliable.

1. INTRODUCTION

Ensemble forecasting is a method for forecasting the atmospheric behavior by utilizing several models, denoted hereafter as ensemble members. This kind of forecast has been widely applied in the EPS of many developed countries, e.g. NCEP, ECMWF, MetOffice, etc. (van der Linden and Mitchell, 2009). With an ensemble forecast, it is expected that the uncertainty due to the atmospheric behavior as well as the model uncertainty can be captured in a model which yields a probabilistic forecast. The ensemble members are generated by a numerical procedure called NWP. It has to be noted that the process of generating an NWP ensemble is complex work requiring supercomputers. Hence, ensemble weather forecasting is rarely, or has not been, applied in developing countries such as Indonesia.


Currently, BMKG, as the representative institution which has the authority to issue weather forecasts, is still using deterministic forecasts for the daily weather forecast. Among the employed forecasting methods are ARIMA, ANFIS, the wavelet method and Artificial Neural Networks (ANN). These methods yield deterministic forecasts, which means that they are unable to explain uncertainty (Zhu, 2005). This paper proposes an idea for developing weather forecasts using an artificial ensemble.
The term "artificial" is adopted by considering the fact that the ensemble members are not generated through an NWP process, but are generated by conventional time series models, i.e. the ARIMA models of Box-Jenkins (1970). The steps of the artificial ensemble data generation as well as its calibration are described as follows: determine a number of reference models, and then use the models to generate the forecasts. The criterion for determining the reference models could be based on the k models with minimum MSE. It is important to note that the generation process should be done by an updating process depending on the lead time of interest. It means that, for different days, a new observation is added and the oldest observation is omitted in the modeling process. Finally, there will be a collection of artificial ensembles, but they are uncalibrated.
Ensemble forecasts are typically underdispersive and hence have to be calibrated (Hamill and Colucci, 1997). Several calibration methods have been proposed to calibrate the ensemble forecast in order to generate a reliable forecast. Among the methods are BMA (Raftery et al., 2005), kernel dressing (Wang and Bishop, 2005), MOS (Wilks and Hamill, 2007), etc. BMA is currently the most commonly applied in the prediction centers. The idea of BMA is to remove the bias by OLS regression using m training windows, and then adjust the forecast variance. Each member has a corresponding weight proportional to its contribution to the mixture model. The estimation of the weights and variance can be done by the EM algorithm or by MCMC (proposed by Vrugt et al., 2008). The proposed method in this paper is applied to forecast the daily mean temperature observed at the Bandara Intl. Juanda Surabaya station. It shows that the method can calibrate the temperature well and provide a more reliable forecast.

2.ARTIFICIAL ENSEMBLE FORECAST

The term "artificial" means that the ensemble forecast data are not generated from a numerical weather system, but from time series models. In this case, we propose to generate the forecasts from ARIMA models. The procedure of making predictions using several time series models is well known in time series or statistical modeling, namely model combination (see Karllson and Eklund (2007), Kapetanios et al. (2006), and Feldkircher (2011)).
The main difference between artificial ensemble forecasts and model combination concerns the calibration. Model combination does not involve any calibration of the collection of forecasts, while the artificial ensemble forecast employs a calibration of the generated forecasts. Nevertheless, they are similar in the sense of averaging the forecasts, i.e. a weighted average.
The procedures of generating the artificial ensemble forecasts by ARIMA Box-
Jenkins (1970) are described as follows:
1. Determine the in-sample data for ARIMA modeling.

2. Identify the data from the time series plot and the autocorrelation (ACF) and partial autocorrelation (PACF) plots.
3. Estimate the order of the ARIMA model from the plots. In this step, we should propose more than one model, since usually several models can be fitted.
4. Estimate the model parameters and evaluate the goodness of fit of the models proposed in step 3. It is important to note that the residuals do not necessarily have to satisfy the required assumptions (normality and white noise); models that do not satisfy the assumptions may still be used when only a few fitted models are available. In this case, we relax the assumptions, since we only need a collection of ensemble data to be calibrated. The models obtained in this step are hereafter called the reference models.
5. Generate the forecasts using the reference models. ARIMA models can be used to generate forecasts for several lead times; this is one of the advantages of ARIMA, as we can obtain a sequence of artificial ensemble data from one model. From this step, we have ensemble forecasts for a single date.
6. Repeat steps 1 to 5, inserting one more actual observation and omitting the oldest one, in order to obtain forecasts for all examined dates. Tabulate all generated forecasts according to their lead times. These datasets are the artificial ensemble forecasts. A minimal sketch of this rolling generation scheme is given below.
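To make the rolling scheme concrete, the following is a minimal R sketch of steps 1 to 6. It is only an illustration: the data vector name tmean, the test length and the four ARIMA orders (taken from Section 4) are assumptions, not part of the procedure itself.

ref.orders <- list(c(2, 0, 1), c(1, 0, 2), c(1, 1, 1), c(1, 1, 2))   # reference models M1-M4 (assumed)
n.test <- 92; leads <- c(1, 7)                                       # forecast dates and lead times (assumed)
ensemble <- array(NA, dim = c(n.test, length(ref.orders), length(leads)),
                  dimnames = list(NULL, paste0("M", seq_along(ref.orders)), paste0("lead", leads)))
for (i in 1:n.test) {
  # rolling in-sample window: drop the oldest observation, add the newest one
  y.in <- tmean[i:(length(tmean) - n.test + i - 1)]
  for (m in seq_along(ref.orders)) {
    fit <- try(arima(y.in, order = ref.orders[[m]]), silent = TRUE)
    if (inherits(fit, "try-error")) next                             # keep NA when estimation fails
    fc <- predict(fit, n.ahead = max(leads))$pred
    ensemble[i, m, ] <- fc[leads]                                    # store the 1- and 7-step-ahead forecasts
  }
}
# ensemble[, , "lead1"] now holds the (uncalibrated) artificial ensemble for lead 1.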

3. CALIBRATION USING BAYESIAN MODEL AVERAGING (BMA)

Calibration means that there is consistency between the distribution of the forecasts and that of the observations. Bayesian Model Averaging (BMA) is one of the most popular calibration methods in climatology, first introduced by Raftery et al. (2005), and several calibration methods have been developed as extensions of BMA. The concept of BMA is to give more weight to the best model, after first removing the bias of the forecasts. The weight represents the contribution of the model forecast to the predictive distribution. The calibration procedure using BMA can be briefly summarized as follows:
1. Suppose that we have K model outputs (ensemble members), here K = 4, denoted f_1t, ..., f_Kt, and the observation y_t. We would like to calibrate the ensemble forecast at valid date t. Determine the training length for the calibration, i.e. the number of past dates used to estimate the calibration parameters, denoted m. For the calibration of the ensemble forecast at date t, we then use the dataset (both ensemble and observations) from t − m to t − 1.
2. Remove the bias of the forecasts by carrying out a linear regression between y and f_k using the training data. From this regression we obtain parameters a_k and b_k, which are used to remove the bias of the forecasts at date t, so that f̃_kt = a_k + b_k f_kt, where f̃_kt is the bias-corrected forecast of the k-th member.
3. Estimate the variance σ_k² and the weight w_k of each member by maximum likelihood, employing the Expectation-Maximization (EM) algorithm; see Raftery et al. (2005) for the details of the EM algorithm. In specific cases, the variance can be set equal for all ensemble members.
4. Using w_k and σ_k², we can generate the predictive pdf of each member. The pdf is chosen according to the variable considered, for instance a normal pdf for temperature, a Gamma pdf for wind speed, etc.
5. The calibrated ensemble forecast is obtained by averaging the weighted distributions of the forecasts (a mixture distribution), such that
p(y_t | f_1t, ..., f_Kt) = Σ_{k=1}^{K} w_k g_k(y_t | f̃_kt),
where g_k is the predictive pdf of the k-th member.
The performance or reliability of the calibrated forecast is assessed by the Continuous Ranked Probability Score (CRPS) of Hersbach (2000). A minimal numerical sketch of this calibration is given below.
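The following R sketch illustrates steps 2 to 5 with normal kernels and a common variance estimated by a simple EM loop. The object names (ens.train, an m x K matrix of training forecasts; obs.train, the m training observations; ens.now, the K raw forecasts to be calibrated) are placeholders, and the equal-variance EM is a simplification of the full scheme in Raftery et al. (2005).

bma.calibrate <- function(ens.train, obs.train, ens.now, n.iter = 200) {
  K <- ncol(ens.train)
  # step 2: member-specific linear bias correction
  bias.fit <- lapply(1:K, function(k) lm(obs.train ~ ens.train[, k]))
  f.corr   <- sapply(1:K, function(k) fitted(bias.fit[[k]]))
  f.now    <- sapply(1:K, function(k) sum(c(1, ens.now[k]) * coef(bias.fit[[k]])))
  # step 3: EM iterations for the weights and a common variance
  w  <- rep(1 / K, K)
  s2 <- var(obs.train - rowMeans(f.corr))
  for (it in 1:n.iter) {
    dens <- sapply(1:K, function(k) dnorm(obs.train, f.corr[, k], sqrt(s2)))
    z    <- sweep(dens, 2, w, "*")
    z    <- z / pmax(rowSums(z), 1e-300)               # E-step: membership probabilities
    w    <- colMeans(z)                                # M-step: weights
    s2   <- sum(z * (obs.train - f.corr)^2) / sum(z)   # M-step: common variance
  }
  # steps 4-5: the calibrated predictive pdf is the weighted mixture of normal kernels
  list(weights = w, sigma2 = s2,
       pdf = function(y) sapply(y, function(yy) sum(w * dnorm(yy, f.now, sqrt(s2)))))
}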

4. APPLICATION
This section discusses the application of the proposed method for generating
reliable forecast. We apply the method to calibrate the daily temperature forecast observed
over the Bandara International Juanda Station, Surabaya Indonesia. The dataset used in
this study is spanning from January 2008 to December 2009. The artificial ensemble
forecast will be generated for the last three months ie. from October to December 2009.
Therefore, the remaining datasets are used to build the ARIMA models.

Figure 1. depict the time series plot of the daily mean temperature for the
considered case

Figure 1. Time series plot of daily mean temperature

Having carefully implemented the Box-Jenkins method, we find four feasible models for this case, i.e. ARMA(2,1), ARMA(1,2), ARIMA(1,1,1) and ARIMA(1,1,2), hereafter denoted M1, M2, M3 and M4 respectively. From these models, we generated artificial ensemble forecasts for the 1-day and 7-day leads following the steps explained in Section 2. The ensemble data are plotted in Figure 2.

Figure 2. Time series plot of generated artificial ensemble for 1 day (upper) and 7 day
(lower) lead forecast

From the figure, we see that the generated ensemble forecasts are underdispersive for both the one- and seven-day lead times, and hence they have to be calibrated. The temperature ensemble is normally distributed (Raftery et al., 2005), and hence the pdf of the temperature forecast is generated following a normal distribution. The following figure depicts a sample of the normal pdfs for the forecast on 31st December 2009. We can see that the interval forecast is not reliable, as it is unable to catch the observation well.

[Figure 3 shows two normal predictive pdfs for 31 December 2009 (left panel: mean 27.38, st. dev. 0.067; right panel: mean 28.25, st. dev. 0.222) together with the observation of 27.7.]
Figure 3. Illustration of the pdf forecasts on 31st December 2009

A summary of the uncalibrated forecasts' performance is given in Table 1. From the table, only a small percentage of the observations lie within the intervals created by the uncalibrated forecasts.

Table 1. Percentage of observations captured by the interval forecast (uncalibrated)

Data     Number of predictions   Number of obs. covered by the interval   Percentage
Lead 1   92                      10                                       10.87 %
Lead 7   86                      20                                       23.26 %

The reliability of the forecast is better assessed by the CRPS: the lower the CRPS, the more reliable the forecast. The CRPS measures reliability by evaluating the compactness and validity of the resulting interval.
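For a normal predictive distribution the CRPS has a well-known closed form, which makes such comparisons easy to reproduce; for the BMA mixture it can be evaluated numerically from the mixture cdf. The R sketch below is illustrative only, and the two example calls reuse the pdf parameters read from Figure 3.

crps.normal <- function(y, mu, sigma) {
  # closed-form CRPS of N(mu, sigma^2) evaluated at the observation y
  z <- (y - mu) / sigma
  sigma * (z * (2 * pnorm(z) - 1) + 2 * dnorm(z) - 1 / sqrt(pi))
}
crps.normal(27.7, 27.38, 0.067)   # left panel of Figure 3 (uncalibrated member)
crps.normal(27.7, 28.25, 0.222)   # right panel of Figure 3 (uncalibrated member)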

Table 2. CRPS of the calibrated forecast for a single date (31st December 2009)

Period                         m=10    m=15    m=20    m=25
31 December 2009 (lead 1)      0.152   0.179   0.370   0.217
31 December 2009 (lead 7)      0.340   0.365   0.312   0.225

The CRPS values in Table 2 are not meaningful unless we compare them with another case. In this paper, we compare the CRPS of the uncalibrated ensemble forecast with that of the calibrated one, where the calibration is done by BMA. The following are the BMA parameters for the considered forecast date; we present only lead 1 to save space.

Table 3. The parameters of BMA for calibration on 31st December 2009

Parameter   Model   m=10     m=15     m=20     m=25
Mean        M1      27.733   27.589   27.140   27.478
            M2      27.733   27.607   27.177   27.481
            M3      27.694   27.529   27.071   27.427
            M4      27.698   27.558   27.087   27.447
Variance    M1      0.435    0.470    0.438    0.660
            M2      0.435    0.470    0.438    0.660
            M3      0.435    0.470    0.438    0.660
            M4      0.435    0.470    0.438    0.660
Weight      M1      0.250    0.000    0.000    0.001
            M2      0.250    0.000    0.000    0.999
            M3      0.250    0.999    0.006    0.000
            M4      0.250    0.001    0.994    0.000

We examine four different training windows for the calibration. The variance of the BMA parameters is set to be the same for all members. Using different training lengths leads to different parameters, in particular different weights; we see that using 15 to 25 training windows leads to the domination of one ensemble member as the best model. The predictive model for the calibrated forecast using BMA can be expressed as (Raftery et al., 2005)
p(y_t | f_1t, ..., f_Kt) = Σ_{k=1}^{K} w_k g_k(y_t | f̃_kt).
The predictive pdfs using those four training lengths can be seen in Figures 4 and 5. If we compare Figure 3 with Figures 4 and 5, we can clearly see that the interval of the forecast is now adjusted, or moved closer, to the observation. Indeed, the observation (represented by the blue vertical line) is well captured by the forecast interval. The forecast interval has a very reasonable range, i.e. between 26 and 29 degrees.

(panels: m = 10, m = 15, m = 20, m = 25)

Figure 4. Predictive pdf of calibrated forecast on 31st December 2009



(panels: m = 10, m = 15, m = 20, m = 25)

Figure 5. Predictive pdf of calibrated forecast on 31st December 2009

We now assess the CRPS of the uncalibrated (denoted ORI) and calibrated (denoted BMA) ensemble forecasts. The evaluation is not done for a single day; rather, we evaluate the performance of the calibration for the whole system, i.e. the evaluation is done over all calibration dates by taking the average CRPS.

Let us first compare the CRPS of ORI with that of BMA. In all cases, we can see that calibration using BMA reduces the CRPS significantly, in particular for the lead-7 forecast. This means that the calibration generates a more reliable forecast, by creating a more compact interval with a smaller forecast bias. The optimum training length is 25 days for lead 1 and 10 days for lead 7.

Table 4. CRPS of uncalibrated (ORI) and calibrated (BMA) ensembles

          m=10            m=15            m=20            m=25
          ORI     BMA     ORI     BMA     ORI     BMA     ORI     BMA
Lead 1    0.672   0.535   0.681   0.566   0.679   0.529   0.696   0.510
Lead 7    0.882   0.495   0.889   0.529   0.917   0.544   0.938   0.584

Another way to show the accuracy of the calibration is to report the percentage of observations captured by the forecast interval, as in Table 1, but now for the calibrated ensemble forecasts.

Table 5. Percentage of observations captured by the interval forecast (calibrated)

Data     Number of predictions   Number of obs. within the prediction interval   Percentage
Lead 1   67                      60                                              89.55 %
Lead 7   76                      65                                              85.53 %

After calibration, there is a significant improvement in the forecast performance: over 80% of the observations are now captured by the interval forecasts. Of course, using only this information may lead to a misleading conclusion, since we do not know the width of the intervals; Table 5 only provides additional information. However, the CRPS confirms that the resulting intervals are reasonable and reliable enough.

5. CONCLUSION

We have shown that a probabilistic forecast, or predictive pdf, can be generated in a very simple way, called the artificial ensemble, utilizing ARIMA models. The ensemble is underdispersive and has to be calibrated. The BMA method performs well in calibrating the forecasts and is capable of generating reliable forecasts, as shown by the CRPS values. The performance of the calibration is influenced by the choice of the training window. The method has also been successfully applied to calibrate the wind speed observed at the same weather station; however, that is beyond the scope of this paper.

References

[1] Box, G.E.P; Jenkins, G.M.: Time Series Analysis - Forecasting and Control, San Francisco: Holden Day,
1970.
[2] Feldkircher, M., Forecast combination and Bayesian Model Averaging: A prior sensitivity analysis. Journal
of Forecasting. Published online DOI: 10.1002/for.1228, 26 March 2011.
[3] Hamill, T. M., and S. J. Colucci, Verification of Eta-RSM short-range ensemble forecasts, Monthly Weather
Review, 125, 1312–1327, 1997.
[4] Hersbach, H., Decomposition of the Continuous Ranked Probability Score for Ensemble Prediction Systems, Weather Forecasting, 15, 559-570, 2000.
[5] Karllson, S. and Eklund, J. Forecast combination and model averaging using predictive measures.
Econometrics Review 26 (2-4), 329-363, 2007.
[6] Kapetanios, G., Labhard, V. and Price, S., Forecasting using predictive likelihood model averaging.
Economics Letters, 91 (3), 373-379, 2006.

[7] van der Linden P., and J.F.B. Mitchell (eds.). ENSEMBLES: Climate Change and its Impacts: Summary of
research and results from the ENSEMBLES project. Met Office Hadley Centre, FitzRoy Road, Exeter EX1
3PB, UK. 160pp., 2009
[8] Raftery, A. E., T. Gneiting, F. Balabdaoui, and M. Polakowski, Using Bayesian model averaging to calibrate forecast ensembles, Monthly Weather Review, 133, 1155-1174, 2005.
[9] Wang, X. and Bishop, C.H., Improvement of ensemble reliability with a new dressing kernel, Q. J. R.
Meteorol. Soc,131, 965–986, 2005.
[10] Wilks, D. S., and Hamill, T. M., Comparison of Ensemble-MOS Methods Using GFS Reforecasts. Mon.
Wea. Rev., 135, 2379–2390, 2007
[11] Zhu, Y., Ensemble Forecast: A New Approach to Uncertainty and Predictability, Advance in Atmospheric
Science, 22 (6), 781–788, 2005.

HERI KUSWANTO
Department of Statistics, Institut Teknologi Sepuluh Nopember (ITS) Surabaya Indonesia.
e-mail: [email protected]
Proceedings of ”The 6th SEAMS-UGM Conference 2011”
Statistics, pp. 773–780.

SECOND ORDER LEAST SQUARE FOR ARCH MODEL

Herni Utami, Subanar, Dedi Rosadi, Liqun Wang

Abstract. We propose the second-order least squares estimator (SLSE) for ARCH models. This estimator simultaneously minimizes the quadratic distances of the response variable to its first conditional moment and of the squared response variable to its second conditional moment. We prove that this estimator is strongly consistent and asymptotically normal under general regularity conditions. A Monte Carlo simulation study is carried out to demonstrate the finite sample properties of the proposed estimator.

Keywords and Phrases: time series, ARCH model, SLSE, conditional mean, conditional
variance.

1. INTRODUCTION
Time series models with homoscedastic errors, such as autoregressive, moving average, or autoregressive moving average models, are widely applied in practice. However,
they are not appropriate when dealing with certain financial market variables such as
the stock price indices or currency exchange rates. These financial market variables
typically have three characteristics that standard time series models fail to consider:
(1) the unconditional distribution of the time series has heavier tails than the nor-
mal distribution;
(2) the values of time series Xt at different time points are not strongly correlated,
but the values of Xt2 are strongly correlated; and
(3) the volatilities of Xt tend to be clustered.
Least squares and maximum likelihood estimation methods for ARCH models have been widely used (see Weiss [13], Johnston and DiNardo [6], Pantula [7], Bollerslev [2], and Straumann [9]). While the MLE can only be used if the probability distribution of the random error is known, least squares estimation is based on minimizing the squared distance of the response variable to its conditional moment given the predictor variables. The LS estimation procedure for the ARCH model consists of two steps: first, the least squares estimator of the regression equation is calculated; second, the parameters of the variance equation are estimated using an ARCH regression model. Weiss [13] and Pantula [7] studied the asymptotic properties of least squares estimators, which also do not require a normality assumption, and proved the consistency and asymptotic normality of such estimators.
In this paper, we propose the second order least squares estimator (SLSE) for ARCH models. This estimator is based on the first and second conditional moments of the response variable, which can be computed easily without any further distributional assumptions on the random error term. The SLSE method was first used by Wang [10], [11] to deal with the measurement error problem in nonlinear models. Leblanc and Wang [12] then used this estimation method for nonlinear regression models; they studied the SLSE for a general nonlinear model, where no distributional assumption on the random error is made. The SLS estimator is more efficient than the LS estimator if the optimal weight is used and the random error in the model has a nonzero third moment.

2. THE ARCH MODEL


An ARCH process is represented by
Y_t = X_t'β + ε_t,    ε_t ~ (0, h_t)                                        (1)
and
h_t = α_0 + α_1 ε_{t-1}^2 + ... + α_R ε_{t-R}^2,                            (2)
where α_0 > 0 and α_i ≥ 0, i = 1, 2, ..., R (so that the conditional variance is strictly positive), X_t is a vector of p independent variables at time t, which may include lagged values of the dependent variable Y_t, and β' = (β_1, β_2, ..., β_p) is the vector of associated regression coefficients. For the ARCH equation, assume
E(ε_t | ℱ_{t-1}) = 0,                                                       (3)
where ℱ_{t-1} = σ{Y_s, X_s, s ≤ t − 1} is the information set containing all the information about the process up to and including time t − 1. Then the conditional variance of ε_t is given by
h_t = E(ε_t^2 | ℱ_{t-1}) = α_0 + α_1 ε_{t-1}^2 + ... + α_R ε_{t-R}^2.
In this model, α' = (α_1, α_2, ..., α_R) is the vector of parameters of the ARCH or variance equation and β is the vector of parameters of the mean equation.

3. SECOND ORDER LEAST SQUARES ESTIMATION


In this section we first outline the second order least squares estimation (SLSE) procedure and then prove its asymptotic properties. The second order least squares estimator of v' = (β', α') is defined as the measurable function that minimizes
Q_T(v) = (1/T) Σ_{t=1}^{T} ρ_t'(v) W_t ρ_t(v)                                (4)
where ρ_t(v) = (Y_t − E(Y_t | I_t), Y_t^2 − E(Y_t^2 | I_t))' and W_t is a nonnegative definite weight matrix. Alternatively, we can write
v̂_SLS = arg min_{v ∈ Θ} Q_T(v)                                              (5)
where the parameter space Θ ⊂ R^{p+R+1} is assumed to be compact. The true parameter value of the model is denoted v_0' = (β_0', α_0'). A numerical sketch of this criterion is given below.
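As an informal illustration of criterion (4)-(5), the R sketch below evaluates Q_T(v) for an ARCH(1) model with mean X_t'β and the identity weight W_t = I, and minimizes it with a generic optimizer. The data objects y and X are assumed given, and the optimal weight matrix is not implemented here.

sls.objective <- function(v, y, X) {
  p <- ncol(X)
  beta <- v[1:p]; a0 <- v[p + 1]; a1 <- v[p + 2]
  if (a0 <= 0 || a1 < 0) return(1e10)              # keep the conditional variance positive
  eps <- as.vector(y - X %*% beta)
  Tn  <- length(y)
  h   <- a0 + a1 * eps[-Tn]^2                      # h_t for t = 2, ..., T
  m1  <- as.vector(X %*% beta)[-1]                 # E(Y_t | I_t)
  m2  <- m1^2 + h                                  # E(Y_t^2 | I_t)
  rho1 <- y[-1] - m1
  rho2 <- y[-1]^2 - m2
  mean(rho1^2 + rho2^2)                            # Q_T(v) with W_t = I
}
# v.hat <- optim(c(rep(0, ncol(X)), 0.1, 0.1), sls.objective, y = y, X = X)$par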
Lemma 3.1. (Weiss [13]) For all v ∈ Θ, there exists M < ∞ not depending on v, such that
E[ (∂ε_t/∂β)(∂ε_t/∂β') ] < M.
Further, for all v ∈ Θ, it holds that
det E[ (∂ε_t/∂β)(∂ε_t/∂β') ] > 0.
The equivalent result for h_t is:

Lemma 3.2. (Weiss [13]) Assume that E(ε_t^4) < ∞. Then for all v ∈ Θ, there exists M_1 < ∞ not depending on v, such that
E[ (∂h_t/∂α)(∂h_t/∂α') ] < M_1.
Further, for all v ∈ Θ,
det E[ (∂h_t/∂α)(∂h_t/∂α') ] > 0.
The following results give the ergodic theorem and some properties of ρ_t(θ) in equation (4).

Theorem 3.1. (Hayashi [5]) Let {Z_t} be stationary and ergodic with E(Z_t) = μ. Then
(1/T) Σ_{t=1}^{T} Z_t → μ   a.s.

Corollary 3.1. (Hayashi [5]) Let {Z_t} be stationary ergodic and f(·) a continuous function. Then {f(Z_t)} is stationary and ergodic.

Corollary 3.2. (Hayashi [5]) Let {Z_t} be stationary ergodic and let f(·) be a continuous function. Assume that E(f(Z_t)) = η. Then
(1/T) Σ_{t=1}^{T} f(Z_t) → η   a.s.

Theorem 3.2. For the ARCH model with v ∈ Θ, under the conditions of Lemma 3.2, Q_T(v) → Q(v) a.s. for all v ∈ Θ and Q(v) attains a unique minimum at v_0.

Proof. We first note, by the standard ergodic theorem (Corollary 3.2), that for any v ∈ Θ,
Q(v) = lim_{T→∞} Q_T(v) = lim_{T→∞} (1/T) Σ ρ_t'(v) W_t ρ_t(v) = E[ρ_t'(v) W_t ρ_t(v)].
Furthermore, the expectation can be written as
Q(v) = E[ρ_t'(v) W_t ρ_t(v)]
     = Q(v_0) + 2E[ρ_t'(v_0) W_t (ρ_t(v) − ρ_t(v_0))] + E[(ρ_t(v) − ρ_t(v_0))' W_t (ρ_t(v) − ρ_t(v_0))].
Since ρ_t(v) − ρ_t(v_0) is a function of the information set and does not depend on Y_t, we have
E[ρ_t'(v_0) W_t (ρ_t(v) − ρ_t(v_0))] = E[ E(ρ_t'(v_0) | X_t) W_t (ρ_t(v) − ρ_t(v_0)) ] = 0.
Therefore
Q(v) = Q(v_0) + E[(ρ_t(v) − ρ_t(v_0))' W_t (ρ_t(v) − ρ_t(v_0))] ≥ Q(v_0),
with equality holding for v = v_0 only. □

Corollary 3.3. Under the conditions of Theorem 3.2, Q(v) = lim_{T→∞} Q_T(v) exists a.s. for all v ∈ Θ and has a unique minimizer at v_0.

Lemma 3.3. Under the conditions of Lemma 3.2,
B = E[ (∂ρ_t'(v)/∂v) W_t (∂ρ_t(v)/∂v') ]
is finite and B is a nonsingular matrix, where
∂ρ_t'(v)/∂v = [ −X_t    −X_t'β X_t ;  0    ∂h_t/∂α ].
We need to find conditions under which there exist consistent roots of the equation ∂Q_T(v)/∂v = 0. By Taylor expansion, the derivative ∂Q_T(v)/∂v can be expressed as
∂Q_T(v)/∂v = Σ ∂q_t(v_0)/∂v + Σ [∂²q_t(v_0)/∂v∂v'] (v − v_0) + Σ [∂²q_t(v*)/∂v∂v' − ∂²q_t(v_0)/∂v∂v'] (v − v_0)     (6)
where v* = v_0 + r(v − v_0) with |r| ≤ 1 and q_t(v) = ρ_t'(v) W_t ρ_t(v). Basawa [1] gives a set of sufficient conditions for the consistency and asymptotic normality of the MLE in ARCH models. These conditions can be adapted to the SLSE and, based on equation (6), they imply that there exists a consistent root of the equation ∂Q_T(v)/∂v = 0 if
(1) T^{-1} Σ ∂q_t(v_0)/∂v → 0 in probability;
(2) there exists a nonrandom matrix M(v_0) > 0 such that for all ε > 0,
P( −T^{-1} Σ ∂²q_t(v_0)/∂v∂v' ≥ M(v_0) ) > 1 − ε
for all T > T_1(ε); and
(3) there exists a constant M < ∞ such that
E | ∂³q_t(v_0)/∂v∂v'∂v | < M
for all v ∈ Θ.
Theorem 3.3. (Consistency) In addition to the conditions of Lemma 3.2, assume that v_0 lies in the interior of Θ. Then the SLSE of v, v̂_SLS, is consistent for v_0.

Proof. We first show that the previous three conditions are satisfied, and hence that there exists a consistent root of the equation ∂Q_T(v)/∂v = 0. To this end we write the derivative of q_t(v_0) as
∂q_t(v_0)/∂v = ∂/∂v [ρ_t'(v_0) W_t ρ_t(v_0)]
= ∂/∂v [ ε_{0t}^2 w_{11} + ε_{0t}(y_t^2 − E(y_t^2 | I_t))(w_{12} + w_{21}) + (y_t^2 − E(y_t^2 | I_t))^2 w_{22} ]
= 2 ε_{0t} w_{11} ∂ε_{0t}/∂v
  + [ ε_{0t} ∂(y_t^2 − E(y_t^2 | I_t))/∂v + (y_t^2 − E(y_t^2 | I_t)) ∂ε_{0t}/∂v ] (w_{12} + w_{21})
  + 2 (y_t^2 − E(y_t^2 | I_t)) w_{22} ∂(y_t^2 − E(y_t^2 | I_t))/∂v,
where w_{ij} is the (i, j) element of W_t, i, j = 1, 2. Therefore E[∂q_t(v_0)/∂v] = 0, since E(ε_t | I_t) = 0 and E[y_t^2 − E(y_t^2 | I_t)] = 0. The ergodic theorem then implies that T^{-1} Σ ∂q_t(v_0)/∂v → 0 in probability.
Then, by the ergodic theorem, for any constant vector c ≠ 0,
T^{-1} Σ c' [∂²q_t(v_0)/∂v∂v'] c → E( c' [∂²q_t(v_0)/∂v∂v'] c )   a.s.
Now, for the given c, let 0 < δ(c) < −(1/2) c' E[∂²q_t(v_0)/∂v∂v'] c. Then, for all ε > 0, there exists T_1 = T_1(ε) such that
P( | T^{-1} Σ c' [∂²q_t(v_0)/∂v∂v'] c − E( c' [∂²q_t(v_0)/∂v∂v'] c ) | < δ ) > 1 − ε
for all T > T_1. Let M(v_0) = −(1/2) E[∂²q_t(v_0)/∂v∂v']. It follows that
P( −T^{-1} Σ c' [∂²q_t(v_0)/∂v∂v'] c > c' M(v_0) c ) > 1 − ε
for all T > T_1. Finally, by differentiating ∂²q_t(v_0)/∂v∂v' we can show that the third derivative of q_t(v) evaluated at v_0 is also bounded, which completes the proof. □

Theorem 3.4. (Asymptotic normality) In addition to the conditions of Theorem 3.3, assume that det B_0 > 0. Then for T → ∞,
√T (v̂_SLS − v_0) → N(0, B_0^{-1} A_0 B_0^{-1})   in distribution,
where
A_0 = E[ (∂ρ_t'(v_0)/∂v) W_t ρ_t(v_0) ρ_t'(v_0) W_t (∂ρ_t(v_0)/∂v') ]
and
B_0 = E[ (∂ρ_t'(v_0)/∂v) W_t (∂ρ_t(v_0)/∂v') ].

Proof. Since v_0 is an interior point of Θ, by the mean value theorem we have
∂Q_T(v̂)/∂v = ∂Q_T(v_0)/∂v + [∂²Q_T(ṽ)/∂v∂v'] (v̂ − v_0)                      (7)
where ||v̂ − ṽ|| ≤ ||v̂ − v_0|| and ∂Q_T(v)/∂v = 2 Σ (∂ρ_t'(v)/∂v) W_t ρ_t(v).
The second derivative of Q_T(v) in the above equation is given by
∂²Q_T(v)/∂v∂v' = 2 Σ [ (∂ρ_t'(v)/∂v) W_t (∂ρ_t(v)/∂v') + (ρ_t'(v) W_t ⊗ I_{p+R+1}) ∂vec(∂ρ_t'(v)/∂v)/∂v' ]     (8)
where ∂vec(∂ρ_t'(v)/∂v)/∂v' has all elements equal to zero except for the blocks −∂²h_t/(∂α ∂α') and −2 ∂(X_t'β X_t)/∂β'.
Since v̂ = arg min_v Q_T(v), and hence ∂Q_T(v̂)/∂v = 0, we have
√T (v̂ − v_0) = −[ (1/T) ∂²Q_T(ṽ)/∂v∂v' ]^{-1} (1/√T) ∂Q_T(v_0)/∂v.          (9)
It follows from equation (9) that the asymptotic distribution of √T (v̂ − v_0) is normal if
(1) (1/√T) ∂Q_T(v_0)/∂v = (1/√T) Σ ∂q_t(v_0)/∂v → N(0, 4A_0) in distribution, for nonrandom A_0 > 0; and
(2) (1/T) ∂²Q_T(ṽ)/∂v∂v' → 2B_0 in probability, for nonrandom B_0 > 0.
From the proof of Theorem 3.3 we have E[∂q_t(v_0)/∂v] = 0. Further, let
D_0 = E[ (∂q_t(v_0)/∂v)(∂q_t(v_0)/∂v)' ]
    = 4 E[ (∂ρ_t'(v_0)/∂v) W_t ρ_t(v_0) ρ_t'(v_0) W_t (∂ρ_t(v_0)/∂v') ]
    = 4A_0.
Then by Lemma 3.3 we can show that D_0 is finite. It follows from a martingale central limit theorem that
(1/√T) ∂Q_T(v_0)/∂v = (1/√T) Σ ∂q_t(v_0)/∂v → N(0, 4A_0)   in distribution.
Again by the ergodic theorem, (1/T) Σ (∂ρ_t'(v)/∂v) W_t (∂ρ_t(v)/∂v') → E[ (∂ρ_t'(v)/∂v) W_t (∂ρ_t(v)/∂v') ] in probability. So by equation (8) we obtain
(1/T) ∂²Q_T(ṽ)/∂v∂v' → 2B_0   in probability,
since B_0 = E[ (∂ρ_t'(v_0)/∂v) W_t (∂ρ_t(v_0)/∂v') ] is nonrandom and positive definite. □

4. MONTE CARLO SIMULATIONS


In this section, we carry out simulation studies to examine the finite sample behavior of the SLSE. In particular, we consider the model
(1 − 0.9B − 0.5B²) y_t = ε_t,   ε_t ~ (0, h_t)                                (10)
with h_t = 0.2 + 0.1 ε_{t-1}².
The following configurations are used in the simulations (a data-generating sketch is given after this list):
(1) the sample sizes are T = 50, 150, 200, and the number of replications is n = 100;
(2) ε_1 ~ NID(0, 1);
(3) ε_t ~ NID(0, 0.2 + 0.1 ε_{t-1}²) for t = 2, 3, ..., T.
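A minimal R sketch of one replication of this design is given here; the presample value y_0 = 0 is an assumption made only to start the recursion.

simulate.ar2.arch1 <- function(T) {
  y <- numeric(T); eps <- numeric(T)
  eps[1] <- rnorm(1)                               # eps_1 ~ N(0, 1), as in configuration (2)
  y[1]   <- eps[1]
  for (t in 2:T) {
    h      <- 0.2 + 0.1 * eps[t - 1]^2             # ARCH(1) conditional variance
    eps[t] <- rnorm(1, 0, sqrt(h))
    y2     <- if (t > 2) y[t - 2] else 0           # presample y_0 taken as 0 (assumption)
    y[t]   <- 0.9 * y[t - 1] + 0.5 * y2 + eps[t]
  }
  y
}
set.seed(1); y <- simulate.ar2.arch1(200)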
The numerical computation is done using the statistical computing language R 2.14.0. The simulation results are presented in Table 1. The results show a clear pattern of significant variance reduction of the SLSE and convergence of the mean SLSE to the true value as the sample size T increases.

Table 1. Simulation of the SLSE for the model (1 − 0.9B − 0.5B²)y_t = ε_t with h_t = 0.2 + 0.1 ε_{t-1}² (n = 100 replications; true values β_1 = 0.9, β_2 = 0.5, α_0 = 0.2, α_1 = 0.1)

T     mean β̂_1   MSE(β̂_1)   mean β̂_2   MSE(β̂_2)   mean α̂_0   MSE(α̂_0)   mean α̂_1   MSE(α̂_1)
50    0.868      0.026       0.499      0.063       0.216      0.147       0.009      0.038
150   0.890      0.021       0.501      0.022       0.212      0.021       0.048      0.022
200   0.890      0.002       0.501      0.013       0.205      0.002       0.070      0.013

5. CONCLUDING REMARKS

We have proposed a second order least squares estimator for the ARCH model
Y_t = X_t'β + ε_t,   ε_t ~ (0, h_t),
with h_t = α_0 + α_1 ε_{t-1}² + ... + α_R ε_{t-R}². We have shown that the proposed estimator is consistent and asymptotically normal under standard regularity conditions. The Monte Carlo simulation studies show that the SLSE performs satisfactorily in finite sample situations.

References
[1] Basawa, I.V., Feigin, P.D., and Heyde, C.C., Asymptotic Properties of Maximum Likelihood Estimators for Stochastic Processes, The Indian Journal of Statistics, 28(3), 259-270, 1976.
[2] Bollerslev, T., Generalized Autoregressive Conditional Heteroscedasticity, Journal of Econometrics, 31, 307-327, 1986.
[3] Capinski, M., Kopp, E., Measure, Integral and Probability, Springer, New York, 2003.
[4] Hannan, E.J., Multiple Time Series, New York: Wiley, 1970.
[5] Hayashi, F., Econometrics, Princeton University Press, 2000.
[6] Johnston, J., DiNardo, J., Econometric Methods, Fourth Edition, New York: McGraw-Hill, 1997.
[7] Pantula, S.G., Estimation of Autoregressive Models with ARCH Errors, The Indian Journal of Statistics, Series B, 50, 119-138, 1988.
[8] Sarkar, N., ARCH Model with Box-Cox Transformed Dependent Variable, Statistics and Probability Letters, 50, 365-374, 2000.
[9] Straumann, D., Estimation in Conditionally Heteroscedastic Time Series, New York: Springer, 2005.
[10] Wang, L., Estimation of Nonlinear Berkson-Type Measurement Error Models, Statistica Sinica, 13, 1201-1210, 2003.
[11] Wang, L., Estimation of Nonlinear Models with Berkson Measurement Error, Annals of Statistics, 32, 2559-2579, 2004.
[12] Wang, L., Leblanc, A., Second-order nonlinear least squares estimation, Annals of the Institute of Statistical Mathematics, 60, 883-900, 2008.
[13] Weiss, A.A., Asymptotic Theory for ARCH Models: Estimating and Testing, Econometric Theory, 2(1), 107-131, 1986.

Herni Utami
Department of Mathematics, Gadjah Mada University.
e-mail: [email protected]

Subanar
Department of Mathematics, Gadjah Mada University.
e-mail: [email protected]

Dedi Rosadi
Department of Mathematics, Gadjah Mada University.
e-mail: [email protected]

Liqun Wang
Department of Statistics, University of Manitoba.
e-mail: [email protected]
Proceedings of ”The 6th SEAMS-UGM Conference 2011”
Statistics, pp. 781 - 790.

TWO-DIMENSIONAL WEIBULL FAILURE


MODELING

INDIRA P. KINASIH AND UDJIANNA S. PASARIBU

Abstract. In this paper, the construction of a two-dimensional Weibull failure model for a system whose degradation is due to age and usage is studied. The failure model is based on the construction of a bivariate Weibull model from consideration of the component failure behaviour of a two-component system. The idea is originally taken from J. Baik et al., whose paper studies two-dimensional failure modeling, and from Lu and Bhattacharyya, who studied constructions of bivariate Weibull models. Numerical examples are given to obtain values of the cumulative failure rate function. The data sets come from part of the warranty claims data for an automobile component (20 observations out of 497). This numerical example gives a brief illustration of how two-dimensional Weibull failure modeling deals with a case where the product has a Weibull failure distribution in its age and usage.

Keywords and Phrases : Weibull distribution, failure modeling, bivariate, reliability, minimal
repair.

1. INTRODUCTION

Modern technology has enabled us to design many complicated systems whose operation, or perhaps safety, depends on the reliability of the various components making up the systems. For example, a fuse may burn out, a steel column may buckle, or a heat-sensing device may fail. Identical components subjected to identical environmental conditions will fail at different and unpredictable times. Another distribution, besides the gamma and exponential distributions, that has been used extensively in recent years to deal with such problems is the Weibull distribution, introduced by the Swedish physicist Waloddi Weibull in 1939 [1]. The Weibull distribution is a versatile family of life distributions in view of its physical interpretation and its flexibility for empirical fit. It has also been applied to the analysis of life data concerning many types of manufactured items.
For many systems, the degradation is a function of age and usage [2]. For example, in the case of manufacturing machine tools, usage may correspond to the number of products they have produced; in the case of military arms, a machine gun for example, usage corresponds to the number of bullets that have been fired. Husniah et al. [4] describe a two-dimensional problem concerning automobile maintenance policy by reducing the dimension to a one-dimensional problem, treating usage as a random function of age; they also give numerical examples for the case where the product has a Weibull failure distribution. This "one-dimensional" approach was also used by Blischke and Murthy [5] in the context of warranty cost analysis. Baik et al. [2] extend the concept of minimal repair for the one-dimensional case to the two-dimensional case using a bivariate failure distribution. They also give a brief explanation and some examples of the Weibull failure model.
Our work is specialized to a specific bivariate distribution. Here, a two-dimensional Weibull failure model is constructed and discussed for a system whose degradation is due to age and usage. This failure model is based on the construction of a bivariate Weibull model from consideration of the failure behaviour of the components of a two-component system. The bulk of the literature on bivariate distributions deals with the case where X and T are viewed as the random lifetimes of two different items; the bivariate distribution then models the statistical dependence between the two lifetimes. In contrast, following Baik et al. [2], we deal with a single item, with T and X denoting the age and usage at the first failure. This paper then focuses on obtaining cumulative failure rate (CFR) values of the bivariate Weibull model through a case study using warranty claims data from an automobile company.

2. FAILURE MODELING

Failure refers to the state or condition of not meeting a desirable or intended objective, and may be viewed as the opposite of success. Failure can be defined in many ways, and usually means mechanical breakdown, deterioration beyond a threshold level, the appearance of certain defects in system performance, or a decrease in system performance below a critical level [6]. The notion of aging, which describes how a unit improves or deteriorates with its age, plays a central role in reliability theory. Aging is usually measured in terms of the failure rate function. The failure rate is the most important quantity in maintenance theory, and it is also important in many other fields, e.g., statistics, the social sciences, the biomedical sciences, and finance [6]. The failure rate is a good measure for representing the operating characteristics of a unit that tends to fail more frequently as it ages.
In the automotive industry, many products are sold with a two-dimensional warranty. For example, a dump truck is warranted for 36 months or 30,000 miles, whichever occurs first [4]. Two approaches have been proposed for modeling failures. In the first approach, the two-dimensional problem is effectively reduced to a one-dimensional problem by treating usage as a random function of age [2]. This approach was used and developed by Husniah et al. [4] in the context of warranty cost analysis. It assumes that the usage rate X varies from customer to customer but is constant for a given customer; X is a random variable that can be modeled by a density function f(x), 0 < x < ∞. Conditional on X = x, the total usage u at age t is given by u = xt.
For a given usage rate X, the conditional hazard function for the time to first failure is given by h(t | x) ≥ 0, which is a non-decreasing function of the product age t and the product usage rate x. Failures over time are modeled by a one-dimensional counting process. If failed products are replaced by new ones, this counting process is characterized by a conditional intensity function λ(t | x), which is a non-decreasing function of t and x. Moreover, if all repairs are minimal and the repair time is negligible, then according to Barlow and Hunter [4], λ(t | x) = h(t | x). In the second approach, the modeling of system failure involves a bivariate distribution. This approach was used by Murthy et al. [5], and by Hunter, in the context of warranty cost analysis for two-dimensional warranties [2].

2.1. One-Dimensional Failure Modeling. According to Baik et al. [2], a system can be either repaired or replaced at each failure, and the durations of all such corrective maintenance actions are assumed to be small compared to the times between failures, so they can be ignored. Let the nonnegative random variable T_n denote the time of the n-th system failure, with n ≥ 1, and let Y_n = T_n − T_{n−1} denote the time between the (n−1)-th and n-th failures, where T_0 = 0.
Suppose that T has a distribution function
F(t) = P(T ≤ t),  t ≥ 0                                                     (1)
and the survival distribution of T_1 is defined as
S(t) = P(T > t).                                                            (2)
When F(t) is differentiable, the failure density function is given by
f(t) = dF(t)/dt                                                             (3)
and the hazard or failure rate function is defined as
h(t) = f(t)/S(t).                                                           (5)
The probability that the system will fail for the first time in the interval [t, t + dt), given that it has not failed prior to t, is h(t)dt + o(dt).
Successive system failures can be modeled through a point process formulation. Let N_1(s, t) be the number of system failures in the interval [s, t), 0 ≤ s < t, with N_1(0, t) abbreviated to N_1(t). The failure intensity function, or rate of occurrence of failures (ROCOF), at t is given by
λ(t) = lim_{dt→0} P[N_1(t, t + dt) ≥ 1] / dt,                               (6)
so the probability that a failure will occur in the interval [t, t + dt) is λ(t)dt + o(dt). Assuming simultaneous failures may not occur, it follows from (6) that λ(t) = dE[N_1(t)]/dt, or
Λ(t) = E[N_1(t)] = ∫_0^t λ(r) dr,                                           (7)
where Λ(t) denotes the cumulative intensity function for the failure process.
Let H_t denote the history of the failure process up to, but not including, time t [2]. The conditional failure intensity function is then given by
λ(t | H_t) = lim_{dt→0} P[N_1(t, t + dt) ≥ 1 | H_t] / dt,                   (8)
which implies that λ(t) = E[λ(t | H_t)]. Thus, λ(t) is the mean of λ(t | H_t) averaged over all sample paths of the failure process [2].

2.2. Two-Dimensional Failure Modeling. It is now assumed that the degradation of a system depends on its age and usage. Let T_n and X_n, n ≥ 1, denote the time of the n-th system failure and the corresponding usage at that time. Y_n = T_n − T_{n−1} denotes the time between the (n−1)-th and n-th failures, and Z_n = X_n − X_{n−1} is the system usage during this period, where T_0 = 0 and X_0 = 0.
Referring to Baik et al. [2], in the two-dimensional approach to modeling failures it is assumed that (T, X) is a nonnegative bivariate random variable with distribution function
F(t, x) = P(T ≤ t, X ≤ x),  t ≥ 0, x ≥ 0.                                   (9)
The survival function is given by
S(t, x) = P(T > t, X > x) = ∫_t^∞ ∫_x^∞ f(u, v) dv du,                      (10)
and if F(u, v) is differentiable, then the bivariate failure density function is given by
f(t, x) = ∂²F(t, x) / ∂t∂x.                                                 (11)
The bivariate hazard function can be defined as
h(t, x) = f(t, x) / S(t, x),                                                (12)
so the probability that the first system failure will occur in [t, t + dt) × [x, x + dx), given that T > t and X > x, is h(t, x) dt dx + o(dt dx).
Successive system failures can be modeled using a two-dimensional point process formulation [2]. Expanding the one-dimensional formulation used in (6), let N_2(s, t; w, x) denote the number of system failures in the rectangle [s, t) × [w, x), with 0 ≤ s < t and 0 ≤ w < x, and abbreviate N_2(0, t; 0, x) to N_2(t, x). The failure intensity, or rate of occurrence of failure (ROCOF), at the point (t, x) is given by the function
λ(t, x) = lim_{dt→0, dx→0} P[N_2(t, t + dt; x, x + dx) ≥ 1] / (dt dx),      (13)
so the probability that a failure may occur in [t, t + dt) × [x, x + dx) is λ(t, x) dt dx + o(dt dx). Assuming simultaneous failures cannot occur, it follows from (13) that λ(t, x) = ∂²E[N_2(t, x)] / ∂t∂x, or
H(t, x) = E[N_2(t, x)] = ∫_0^t ∫_0^x λ(u, v) dv du.                          (14)
Let H_{t,x} denote the history of the failure process up to, but not including, the point (t, x). The conditional failure intensity function is then given by
λ(t, x | H_{t,x}) = lim_{dt→0, dx→0} P[N_2(t, t + dt; x, x + dx) ≥ 1 | H_{t,x}] / (dt dx).   (15)
For a nonrepairable system, rectification involves replacing the failed item by a new one. If the failure is detected immediately and the time to replace is negligible, so that it can be ignored, then failures over the two-dimensional plane can be modeled by a two-dimensional renewal process [2].

3. WEIBULL FAILURE MODEL

In this section, the failure model for the two-dimensional Weibull distribution is described, following the constructions of Baik et al. [2] and of Lu and Bhattacharyya [3]. To give a better understanding, we first give a brief review of the failure model for the one-dimensional Weibull distribution.

3.1. One-Dimensional Model. Let T be the system's age when failure occurs. According to (1), T has a Weibull distribution with
F(t) = 1 − exp[−(t/θ)^β],                                                   (16)
where θ > 0 and β > 0. The survival distribution function, according to (2), can be described as
S(t) = 1 − F(t) = exp[−(t/θ)^β].                                            (17)
The corresponding hazard, or instantaneous failure rate, function can then be defined as
h(t) = f(t)/S(t) = (β/θ)(t/θ)^{β−1}.                                        (18)
This failure rate is increasing (decreasing) for β > 1 (β < 1), and coincides with that of the exponential distribution for β = 1 [7].
Under minimal repair, the ROCOF is given by λ(t) = h(t) and the resulting point process is referred to as a Weibull process. From (17), the expected number of system failures in the interval [0, t) under minimal repair is given by
H(t) = (t/θ)^β.                                                             (19)
The expected number of system failures in the interval [0, t) under replacement is a renewal function.
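The one-dimensional quantities (17)-(19) are easy to tabulate; the short R sketch below does so for illustrative parameter values that are not taken from the paper.

theta <- 150; beta <- 1.6                          # illustrative scale and shape (assumed)
S.weib <- function(t) exp(-(t / theta)^beta)       # survival function (17)
h.weib <- function(t) (beta / theta) * (t / theta)^(beta - 1)   # failure rate (18)
H.weib <- function(t) (t / theta)^beta             # expected failures under minimal repair (19)
t.grid <- c(50, 100, 200, 300)
cbind(t = t.grid, S = S.weib(t.grid), h = h.weib(t.grid), H = H.weib(t.grid))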

3.2. Two-Dimensional Model. Several approaches have been proposed for the construction of bivariate Weibull models by considering the failure behaviour of the systems. This model will be based on the one proposed by Lu and Bhattacharyya [3]. The following theorem, provided by Lu and Bhattacharyya, serves as a general method for constructing bivariate life models with specified marginals.

Theorem. Suppose S(x, y | w) = exp{−[H̃(x) + H̃*(y)] w} represents the conditional survival function of (X, Y) given W = w > 0, and assume that the Laplace transform φ(t) of W exists on [0, ∞), is strictly decreasing, φ(t) → 0 as t → ∞, and φ^{−1}(u) is absolutely continuous on (0, 1]. Let
H̃(x) = φ^{−1}(S_X(x)),   H̃*(y) = φ^{−1}(S_Y(y)),                             (20)
q(x, y) = H̃(x) + H̃*(y),   S(x, y) = φ(q(x, y)).                              (21)
Then S(x, y) is a bivariate survival function with the marginals S_X and S_Y.

Proof. Any absolutely continuous and nondecreasing function H̃(x) on [0, ∞) such that H̃(0) = 0 and H̃(x) → ∞ as x → ∞ is a valid cumulative failure rate (CFR) function for a univariate life distribution. Letting y = 0 in S(x, y | w) and taking the expectation over W, we get the relation
S_X(x) = S(x, 0) = ∫_0^∞ exp{−H̃(x) w} ψ(w) dw = φ(H̃(x)).
Solving for H̃(x) we get H̃(x) = φ^{−1}(S_X(x)), which is a valid CFR on [0, ∞) in the light of the assumptions made on φ(t). Similarly, H̃*(y) is also a valid CFR function, so S(x, y | w) is a valid conditional model. The joint distribution of (X, Y) is then
S(x, y) = E[ exp{−(H̃(x) + H̃*(y)) W} ] = φ(q(x, y)).
The X-marginal of this joint survival function is φ(H̃(x)) = S_X(x), which was initially targeted, and likewise for the Y-marginal. □
Now, assuming that T, X  is a nonnegative bivariate random variable with bivariat
1
 t 
Weibull distribution function, consider the Weibull marginals S t   e  1 
and
2
 x 
S x   e , with 0  t, x   , and let  u   e u , with 0    1 . It is the Laplace
 2 

transform of a positive stable distribution [3] and it is satisfies the conditions of the theorem.
 y    log y   , so
1
Since 
1

1 2
 t    x  
H t    

, H x   

 (22)
 1  2 


  1 2
 
  t   
S t , x   exp  
 x 
    , 0    1, 0    1
 
(23)
  1  2 
   

Obviously, in this end results, the parameters  and  are not individually identifiable.
Combining them into a single parameter    , with 0    1 , then the bivariate
Weibull model can be describe as


  1 2
 
  t    x  
  

S t , x   exp     
  (24)
   1  2   
 
 


Note that X and T are not independent. Writing u(t, x) = (t/θ_1)^{β_1/δ} + (x/θ_2)^{β_2/δ}, the corresponding failure density function obtained by differentiating (24) is
f(t, x) = (β_1β_2/(θ_1θ_2)) (t/θ_1)^{β_1/δ − 1} (x/θ_2)^{β_2/δ − 1} [u(t, x)]^{2δ − 2} [1 + (1/δ − 1)[u(t, x)]^{−δ}] exp{−[u(t, x)]^{δ}},   (25)
and the bivariate hazard (failure rate) and cumulative failure rate functions are, respectively,
h(t, x) = f(t, x)/S(t, x) = (β_1β_2/(θ_1θ_2)) (t/θ_1)^{β_1/δ − 1} (x/θ_2)^{β_2/δ − 1} [u(t, x)]^{2δ − 2} [1 + (1/δ − 1)[u(t, x)]^{−δ}]   (26)
and
H(t, x) = −ln S(t, x) = [ (t/θ_1)^{β_1/δ} + (x/θ_2)^{β_2/δ} ]^{δ}.              (27)
4. CASE STUDY

Two-dimensional failure modeling can be described more clearly by considering a numerical example using a data collection that satisfies our distributional assumptions. For this paper, the data sets come from part of the warranty claims data for an automobile component (20 observations out of 497), which appear as an appendix of the book Warranty Data Collection and Analysis by Blischke et al. [7]. The data contain failure modes, the type of automobile using the component, and the usage zone/region, shown in codes. The age at failure (in days) and the usage at failure (in KM) are the two data sets used in this numerical example, describing T and X respectively. Both random variables have already been fitted with Weibull distributions.
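As a rough indication of how such marginal fits can be obtained, the R sketch below maximizes the Weibull log-likelihood with a generic optimizer (the paper itself uses Newton-Raphson in MATLAB). The vectors age and usage are placeholders for the observed age (days) and usage (KM) at failure.

weib.negloglik <- function(par, x) {
  shape <- par[1]; scale <- par[2]
  if (shape <= 0 || scale <= 0) return(1e10)       # keep parameters in the valid region
  -sum(dweibull(x, shape = shape, scale = scale, log = TRUE))
}
fit.age   <- optim(c(1, mean(age)),   weib.negloglik, x = age)$par     # shape, scale of the age marginal
fit.usage <- optim(c(1, mean(usage)), weib.negloglik, x = usage)$par   # shape, scale of the usage marginal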

The cumulative failure rate values are shown in the following table, with scale parameters θ_1 = 154.84 and θ_2 = 23,324, shape parameters β_1 = 1.63 and β_2 = 1.29, and δ = 0.38. The parameters were estimated using the maximum likelihood method and computed numerically using the Newton-Raphson method; the computation was done in MATLAB 7.10.0. The results are displayed in the following table.

Used KM at Failure
Age at Failure
0 100 200 300 400 500 600
0 0 0.335 0.820 1.383 2.005 2.674 3.383
50 0.158 0.352 0.824 1.385 2.006 2.674 3.383
100 0.490 0.552 0.894 1.417 2.023 2.685 3.391
150 0.949 0.972 1.156 1.560 2.107 2.739 3.428
200 1.517 1.528 1.625 1.891 2.327 2.888 3.533
250 2.183 2.189 2.245 2.413 2.729 3.186 3.755
300 2.938 2.942 2.977 3.086 3.308 3.659 4.130
Table 1. Cumulative Failure Rate Values

5. CONCLUDING REMARKS

In this paper, two-dimensional Weibull failure modeling has been studied based on the ideas of Baik et al. [2] and of Lu and Bhattacharyya [3]. The numerical example shows that the highest value of the cumulative hazard rate occurs for the highest KM used and the oldest age at failure. This tells us that, as age and usage increase, the cumulative hazard rate function also increases.

References

[1] WALPOLE, R. E., MYERS, R. H., AND YE, K., Probability & Statistics for Engineers & Scientists, 7th ed., Pearson Prentice Hall, United States, 2007.
[2] BAIK, J., MURTHY, D. N. P., AND JACK, N., Two-Dimensional Failure Modeling with Minimal Repair, Wiley Periodicals Inc., 2003.
[3] LU, J. C. AND BHATTACHARYYA, G. K., Some New Constructions of Bivariate Weibull Models, Ann. Inst. Statist. Math., 1990.
[4] HUSNIAH, H., PASARIBU, U. S., HALIM, A. H., AND ISKANDAR, B. P., A Hybrid Minimal Repair and Age Replacement Policy for Warranted Products, 2nd Pacific Conference on Manufacturing System, 2009.
[5] BLISCHKE, W. R. AND MURTHY, D. N. P., Warranty Cost Analysis, Marcel Dekker, London, 1994.
[6] NAKAGAWA, T., Maintenance Theory of Reliability, Springer-Verlag London Limited, 2005.
[7] BLISCHKE, W. R., KARIM, M. R., MURTHY, D. N. P., Warranty Data Collection and Analysis, Springer Series
in Reliability Engineering, Springer-Verlag London Limited, 2011.

INDIRA P. KINASIH
Master student at the Faculty of Mathematics and Natural Sciences, Bandung Institute of
Technology. She is also a lecturer at the Faculty of Mathematics and Natural Sciences
Education, IKIP Mataram.
e-mail: [email protected]

UDJIANNA S. PASARIBU
Associate Professor and Lecturer at the Faculty of Mathematics and Natural Sciences,
Bandung Institute of Technology. Her research interests include Stochastic Process and
Space Time Analysis.
e-mail: [email protected]
Proceedings of ”The 6th SEAMS-UGM Conference 2011”
Statistics, pp. 791 - 800.

SIMULATION STUDY OF MLE ON MULTIVARIATE


PROBIT MODELS

JAKA NUGRAHA

Abstract. We have studied the estimator properties of multivariate binary Probit models using a simulation study. Parameter estimation is performed by the GEE, MLE and MSLE methods. The statistical software used in the calculations is R 2.8.1. The Probit model can be applied to multivariate binary responses using the MLE and GEE estimation methods. Based on the simulated data, the MSLE estimator is inappropriate for the multivariate Probit model. We recommend combining GEE and MLE: GEE can be used to estimate the regression parameters, while MLE can be used to estimate the correlation parameters only.

Keywords and Phrases :Discrete Choice Model, MLE, GEE, simulation study.

1. INTRODUCTION

Discrete Choice Models (DCM) are models constructed on the assumption that the decision maker faces a choice among a group of alternatives based on their utilities. The alternatives or responses are nominal and one of them has the maximum utility. The decision maker can be a person, a family, a company or another decision-making unit. DCM is associated with two connected activities: determination of the model and calculation of the proportion for each choice. The models that have been widely discussed are the Logit model and the Probit model. The parameter estimation methods used are the Maximum Likelihood Estimation (MLE) method, the method of moments and the Generalized Estimating Equations (GEE) method.
Some researchers have studied these estimation methods for panel binary responses. GEE estimators have invariance properties and are consistent and asymptotically normal [1, 2]. In the binary panel Probit model, MLE is the best compared to the Solomon-Cox approximation or the Gibbs sampler [3, 4]. The Probit model requires multiple integrals, which can be evaluated using the Geweke-Hajivassiliou-Keane (GHK) simulator [5, 6]. Frequently, several dependent variables are observed for each individual; because the data include simultaneous measurements on many variables, such data are called multivariate data. Although the applications of multivariate binary response models are extensive, research on multivariate binary response models has still received little attention. For binary responses, MLE and GEE are consistent estimators [7] and the estimators of the regression parameters are not influenced by the correlation [8]. For the multivariate binary Logit model, GEE is more efficient than the univariate approximation, but the estimator of the correlation in GEE tends to be underestimated [9]. The Probit model can be used for multivariate binary responses with several parameter estimation methods, such as GEE, MLE and MSLE based on the GHK simulation [10, 11].
Based on these developments in binary response modeling, supported by advances in computation, we studied the properties of the estimators through a simulation study of multivariate binary response data. The modeling of the multivariate binary responses uses the Probit model, and parameter estimation is performed by the GEE, MLE and MSLE methods.
2. DCM ON THE MULTIVARIATE BINARY RESPONSE

It is assumed that Y_it is a binary response: Y_it = 1 if subject i, on response t, chooses alternative 1, and Y_it = 0 if subject i, on response t, chooses alternative 2. Each individual has covariates X_i (characteristics of individual i) and Z_ijt (characteristics of choice/alternative j for individual i on response t).
The utility of subject i selecting alternative j on response t is
U_ijt = V_ijt + ε_ijt   for t = 1, 2, ..., T;  i = 1, 2, ..., n;  j = 0, 1,
with
V_ijt = α_jt + β_jt X_i + γ_t Z_ijt.
By assuming that the decision maker selects the alternative with the maximum utility value, the model can be expressed in terms of the utility difference
U_it = U_i1t − U_i0t = V_it + ε_it
with
V_it = V_i1t − V_i0t = (α_1t − α_0t) + (β_1t − β_0t) X_i + γ_t (Z_i1t − Z_i0t) = α_t + β_t X_i + γ_t Z_it
and ε_it = ε_i1t − ε_i0t. The probability that subject i selects (y_i1 = 1, ..., y_iT = 1) is
P(y_i1 = 1, ..., y_iT = 1) = ∫ Π_t I(−V_it < ε_it) f(ε_i) dε_i1 ... dε_iT,
where ε_i = (ε_i1, ..., ε_iT)'. The value of this probability is calculated by a T-fold multiple integral and depends on the parameter θ = (θ_1', θ_2', ..., θ_T') as well as on the distribution of ε. A rough Monte Carlo evaluation of this choice probability is sketched below.

2.1 MLE on Multivariate Binary Probit Model. The vector ε_i = (ε_i1, ..., ε_iT)' is normally distributed with mean zero and covariance matrix Σ, and each ε_it has a standard normal distribution:
ε_i' ~ N(0, Σ)   and   ε_it ~ N(0, 1) for t = 1, ..., T.
Assume that R is the correlation matrix of ε_i, so Σ_i = Q_i R Q_i, where Q_i is the diagonal matrix with diagonal elements (2y_it − 1):
Σ = [ 1  σ_12  ...  σ_1T ;  σ_12  1  ...  σ_2T ;  ...  ;  σ_1T  σ_2T  ...  1 ],
R = [ 1  ρ_12  ...  ρ_1T ;  ρ_12  1  ...  ρ_2T ;  ...  ;  ρ_1T  ρ_2T  ...  1 ],
Q_i = diag(2y_i1 − 1, 2y_i2 − 1, ..., 2y_iT − 1).
Here y_it = 0 if respondent i chooses the first alternative and y_it = 1 if respondent i chooses the second alternative, so that σ_ts = (2y_it − 1)(2y_is − 1)ρ_ts. To simplify the notation, take V_it = θ_t X_it (θ_t is the identified parameter; the parameters α and β are not written separately). Then
Φ_T(w_i; 0, Σ) = ∫_{−∞}^{w_iT} ... ∫_{−∞}^{w_i1} φ_T(ε_i; 0, Σ) dε_i = ∫_{D(Y_i)} φ_T(ε_i; 0, Σ) dε_i,
with w_it = (2y_it − 1) x_it θ_t and D(Y_i) = [−∞, w_i1] × ... × [−∞, w_iT]. Here φ_T(·) is the T-variate normal density. The log-likelihood equation is
LL(θ; Σ) = Σ_{i=1}^{n} ln Φ_T(w_i; 0, Σ)                                     (1)
or, equivalently,
LL(θ; R) = Σ_{i=1}^{n} ln Φ_T(w_i; 0, R).
Estimation of θ and Σ (or R) using the MLE method can be derived from the likelihood function in equation (1). Define
 1 ...  1T  1l 
 
 ... ...  l l
 11 12 
 l    1T 
... 1  Tl   l  
   21 1 
 
 ...  Tl 1 
 1l

 1 ...  1T  1k  1l 
 
 ... ... ... ... ... 
 ... 1  Tk  Tl     11  12 
kl kl
 kl   1L  kl kl 
 .    21  22 
 ..  Tk 1  kl 
 1k ...  Tl  kl 1 
 1l

First derivative of log likelihood function (1) with respect to the parameter  is

n  ( w ;0;1) l l
LL( ; ) il T 1 ( wi ,  l ; M ; S )(2 yil  1) xil
 (2)
1l i 1  T ( wi ;0;  )

l l l
with M l  12
l
wil ; S  11  12 l21 and wi,-l = (wi1,..,wi(l-1),wi(l+1),...,wiT).
First derivative of log-likelihood function (1) with respect to parameter  is

kl kl kl
LL( , R | x) N 2 ( wik , wil ;0,  22 )T 2 ( wi , kl , M i ; S )
 (2 yik  1)(2 yil  1) (3)
kl i 1 (wi ; )

with M ikl  12


kl
( kl22 ) 1 ( wil , wik )' ; S kl  11
kl kl
 12 ( kl22 )1  kl21
and wi,-kl= (w1,...,wk-1,wk+1,...,wl-1,wl+1,...,wT)
Second derivative of log-likelihood functions (1) with respect to parameter  and 
are

n  ( w ;0;1) l l 2
 2 LL( ;  ) it T 1 ( wi ,  l ; M ;  )[(2 yil  1) xil ]
 .
 12l i 1  T ( wi ;0;  )
1   (w ;0;1)
it T 1 ( wi , l ; M l ; l )  (4)
2 n
 LL ( ; )  ( wik ;0;1) ( wil ;0;1)( 2 yil  1)( 2 yik  1) xik xil
 .
1k 1l i 1  T ( wi ;0; )
( T 2 (wi , kl ; M kl ; kl )  2T 1 (wi, k ; M k ; k )T 1 (wi, l ; M l ; l ) (5) 
kl kl kl kl
with M kl  12 kl
wik ;   11  12 21
and wi,-kl = (wi1,.., wi(k-1),wi(k+1),wi(l-),wi(l+1), ...,wiT)

kl kl
 2 LL(  , R | x ) n 
T  2 ( wi ,  kl , M i ; S )
 ( 2 yik  1)(2 yil  1)
 2kl i 1 [ T ( wi ;  )]2
 (w ; ) A1   (w , w ;0, 
T i 2 ik il
kl
22 2
) T 2 ( i , kl ; M kl ; S kl )(2 yik  1)(2 yil  1)  (6)

with A1 
2 ( wil , wik ;  kl )
2 (1   ) 2
 (w w  )
il ik (1   kl2 ) ln(1   kl2 )  4 kl 
kl

kl kl k
2 LL( , R | x) n (2 yik 1)(2 yil 1)T 2 (wi, kl , Mi ; S ) (wil ;0;1) (wik ; m ;1)(2 yil 1) xil
 
l kl i1 (wi ; )
n (2 y  1)(2 y  1) 2 x  (w , w ;0, kl ) kl kl l l
ik il il 2 ik il 22 T 2 (wi ,  kl , M i ; S ) ( wil ;0;1).T 1 (wi , l ; M ;  ) (7)

i 1 [(wi ; )]2

withmk = klwil and wil = (2yil-1)xil1l.


MLE can be solved by Newton-Raphson iteration method based on first and second
derivation of equation (2) to (7). Estimation by using MLE also needs multiple integral
calculation T. Calculation value of P(Yi) over Gaussian Square calculation can not be done
for T more than four [12]. P(Yi) can be solved only by simulation. GHK is found to be the
most efficient simulation method and the unbiased estimator [13]. Therefore it is also called
as maximum simulated likelihood estimator (MSLE) method. For getting MSLE in similar
properties with MLE, high simulations are needed [14].

2.2 MSLE on Multivariate Binary Probit Model. The Probit model is based on the assumption that the vector ε_i = (ε_i1, ..., ε_iT)' in equation (5) has a multivariate normal distribution with mean zero and covariance matrix Σ. The marginal probability (for t and i) is
π_it = P(y_it = 1 | X_i, Z_i) = P(−V_it < ε_it) = 1 − Φ(−V_it),              (8)
where
Φ(V_it) = ∫_{−∞}^{V_it} (2πσ_t^2)^{−1/2} exp[ −ε_it^2 / (2σ_t^2) ] dε_it
and
P(Y_it = y_it) = π_it^{y_it} (1 − π_it)^{1 − y_it}   for y_it = 0, 1.

From the symmetry of the normal distribution, equation (8) can be expressed as
π_it = P(y_it = 1 | X_i, Z_i) = P(−V_it < ε_it) = P(ε_it < V_it) = Φ(V_it).
The marginal probability can also be represented by
P(Y_it = y_it) = Φ[(2y_it − 1)V_it]
and the joint probability is
P(Y_i1 = y_i1, ..., Y_iT = y_iT) = P(ε_i1 ≤ (2y_i1 − 1)V_i1, ..., ε_iT ≤ (2y_iT − 1)V_iT) = Φ(w_i; 0, Σ),
where w_i = ((2y_i1 − 1)V_i1, ..., (2y_iT − 1)V_iT) and Φ denotes the T-variate normal distribution function. The log-likelihood function is
LL(θ; Σ) = Σ_{i=1}^{n} log Φ(w_i; 0, Σ).

Φ(w_i; 0, Σ) is determined by the GHK simulation using the Cholesky factor C. The estimated parameter is therefore θ = (β, c), where c collects the elements of the matrix C. The utility equation becomes
U_it = V_it + Σ_{l=1}^{t} c_tl η_li   for t = 1, ..., T and η_i ~ N(0, I).   (9)
By using the GHK simulation algorithm, the result is
π̃_i^{(r)} = Π_{t=1}^{T} π_it^{(r)} = Π_{t=1}^{T} Φ( [ (2y_it − 1)V_it − Σ_{k=1}^{t−1} c_tk η_k^{(r)} ] / c_tt ),
and therefore
π̃_i = (1/R) Σ_{r=1}^{R} π̃_i^{(r)},
where the index r denotes the r-th simulation draw. The simulated log-likelihood function is
sim log L(θ) = Σ_{i=1}^{n} log( (1/R) Σ_{r=1}^{R} π̃_i^{(r)} ) = Σ_{i=1}^{n} log( (1/R) Σ_{r=1}^{R} Π_{t=1}^{T} π_it^{(r)} ).
If the utility model represented in equation (9) satisfies the regularity conditions and ε_i = (ε_i1, ..., ε_iT) is multivariate normally distributed with mean zero and covariance matrix Σ, then, using the GHK simulation, the MLE of the parameter θ = (β, Σ) is the solution of the estimating equation:

\sum_{i=1}^{n} \frac{\dfrac{1}{R}\sum_{r=1}^{R}\tilde{\pi}_i^{(r)}\sum_{l=1}^{T} a_{li}^{(r)}}{\dfrac{1}{R}\sum_{r=1}^{R}\tilde{\pi}_i^{(r)}} = 0,

where

a_{li}^{(r)} =
\begin{cases}
  \displaystyle\sum_{h=1}^{l-1}\frac{c_{lh}}{c_{ll}}\,\frac{\phi(a_{hi}^{(r)})}{\Phi(a_{hi}^{(r)})}\,u_{hi}
    + \frac{(2y_{il}-1)V_{il}}{c_{ll}}, & l \neq 1,\\[2mm]
  \dfrac{(2y_{il}-1)V_{il}}{c_{11}}, & l = 1.
\end{cases}

The index r represents the r-th simulation draw, r = 1,...,R.



3. SIMULATION STUDY AND DISCUSSION

Multivariate analysis is a generalization of univariate analysis. When the data have low correlation, a multivariate problem can be handled by univariate analysis. However, when the correlation between the responses is strong, the univariate approximation to the multivariate case yields an estimator that is underestimated. We therefore studied the effect of the correlation level on estimation accuracy, based on data simulated at fixed parameter values and several levels of correlation. The statistical software used for the calculations was R 2.8.1, and the three estimation methods compared were:
a. GEE: estimation using the GEE of Liang and Zeger;
b. MLE: estimation using the MLE method;
c. MSLE: estimation using the MSLE method based on GHK simulation.
The case of T = 3 is taken. The utility of subject i selecting alternative j on decision t is

U_{it} = V_{it} + \varepsilon_{it},
V_{it} = (V_{i1t} - V_{i0t}) = \alpha_t + \beta_t X_i + \gamma_t Z_{it},
Z_{it} = (Z_{i1t} - Z_{i0t}); \quad \varepsilon_{it} = \varepsilon_{i1t} - \varepsilon_{i0t}; \quad \alpha_t = \alpha_{0t} - \alpha_{1t}; \quad \gamma_t = (\gamma_{0t} - \gamma_{1t}),

where i = 1,...,n, t = 1,2,3 and j = 0,1, with \varepsilon_{ijt} ~ N(0,1). Data were generated at the parameter values \alpha_t = -1, \beta_t = 0.5 and \gamma_t = 0.3.
The correlation structure examined is r_12 = \rho and r_13 = r_32 = 0. The utility at t = 1 is correlated with the utility at t = 2 with correlation values \rho = 0, 0.2, ..., 0.8, 0.9. The values of the observed variables X_i and Z_{ijt} were drawn from normal distributions:
X_i ~ N(0,1); Z_{i0t} ~ N(0,1); Z_{i1t} ~ N(2,1).
The effect of the correlation on the estimators was examined for GEE, MLE and MSLE with n = 1000. For each sample, 50 iterations were performed. The results of the parameter estimation are presented in Tables 1, 2 and 3.
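A minimal Python sketch of the data-generating step just described (our own illustration; the study itself used R). For simplicity the differenced errors are drawn directly with unit variance and correlation rho between t = 1 and t = 2, which is an assumption of this sketch rather than a detail taken from the paper:

import numpy as np

def generate_panel(n=1000, rho=0.4, alpha=-1.0, beta=0.5, gamma=0.3, rng=None):
    """Generate (X, Z, Y) for T = 3 binary probit responses in which the latent
    errors of t = 1 and t = 2 are correlated and t = 3 is independent."""
    rng = np.random.default_rng(rng)
    T = 3
    X = rng.normal(0.0, 1.0, size=n)            # subject-level covariate
    Z0 = rng.normal(0.0, 1.0, size=(n, T))      # attribute of alternative j = 0
    Z1 = rng.normal(2.0, 1.0, size=(n, T))      # attribute of alternative j = 1
    Z = Z1 - Z0
    Sigma = np.array([[1.0, rho, 0.0],
                      [rho, 1.0, 0.0],
                      [0.0, 0.0, 1.0]])         # corr(eps_1, eps_2) = rho
    eps = rng.multivariate_normal(np.zeros(T), Sigma, size=n)
    V = alpha + beta * X[:, None] + gamma * Z   # latent utility difference
    Y = (V + eps > 0).astype(int)               # y_it = 1 if U_i1t > U_i0t
    return X, Z, Y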

Table 1. Average of the estimators using GEE.

                                 ρ21
              0        0.2      0.4      0.6      0.8      0.9
α1 = -1    -0.9852  -1.0363  -1.0125  -1.0685  -1.0284  -1.0600
α2 = -1    -1.0145  -1.0058  -1.0780  -1.0516  -1.0135  -1.0175
α3 = -1    -0.9875  -1.0250  -1.0804  -1.0981  -1.0085  -1.0481
β1 = 0.5    0.4740   0.5127   0.4585   0.5391   0.4692   0.5387
β2 = 0.5    0.5446   0.5530   0.4697   0.5212   0.4707   0.4716
β3 = 0.5    0.4948   0.5147   0.5684   0.4924   0.5532   0.4889
γ1 = 0.3    0.2915   0.3078   0.3047   0.3219   0.3180   0.3181
γ2 = 0.3    0.3109   0.2996   0.3204   0.3239   0.3194   0.2930
γ3 = 0.3    0.2890   0.3177   0.3183   0.3490   0.2918   0.3175
ρ21        -0.0223   0.0861   0.2152   0.3526   0.5182   0.5672

Based on Table 1, GEE is a good method for estimating the regression parameters but not the correlation parameter (ρ21). The estimators of α, β and γ are close to the true parameters (small bias) at all correlation values, so the correlation among the utilities does not influence the GEE estimator of the regression coefficients; however, ρ̂21 is always underestimated.

Table 2. Average of the estimators using MLE.

                                 ρ21
              0        0.2      0.4      0.6      0.8      0.9
α1 = -1    -0.9829  -1.0331  -1.0108  -1.0710  -1.0159  -1.0455
α2 = -1    -0.0296   0.0274  -0.0648   0.0330   0.0006   0.0297
α3 = -1    -0.0025   0.0105  -0.0668  -0.0250   0.0092  -0.0010
β1 = 0.5    0.4737   0.5115   0.4583   0.5415   0.4636   0.5287
β2 = 0.5    0.5442   0.5515   0.4670   0.5190   0.4777   0.4775
β3 = 0.5    0.4940   0.5135   0.5674   0.4919   0.5529   0.4885
γ1 = 0.3    0.2900   0.3060   0.3038   0.3221   0.3127   0.3146
γ2 = 0.3    0.3096   0.2990   0.3190   0.3185   0.3184   0.2914
γ3 = 0.3    0.2878   0.3166   0.3167   0.3478   0.2907   0.3161
ρ21        -0.0417   0.1584   0.3799   0.5996   0.8139   0.8915

Utility 1 (U_i1) is correlated with utility 2 (U_i2), and both are uncorrelated with utility 3 (U_i3). Therefore the correlation value only affects the parameters of U_i1 and U_i2. For both utilities the bias of the estimator is high and proportional to the value of the correlation; the MLE of the parameters α2 and α3 is not good because it produces high bias (see Table 2). The estimator of ρ21 from MLE is better than that from GEE.

Table 3. Average of the estimators using MSLE.

                                 ρ21
              0        0.2      0.4      0.6      0.8      0.9
α1 = -1    -0.3499  -0.6201  -0.2942  -0.1758  -0.3694  -0.1985
α2 = -1     0.3167   1.0063   0.5998   0.3530   0.5352   0.7078
α3 = -1     0.1514  -0.0859   0.0188  -0.0055  -0.1044   0.1018
β1 = 0.5   -0.0966  -0.3595  -0.4914  -0.7192  -0.2877  -0.3125
β2 = 0.5    0.3391   0.5567   0.3735   0.5089   0.4776   0.5845
β3 = 0.5   -0.0084  -0.1065   0.1623   0.1867  -0.0379   0.0977
γ1 = 0.3   -0.3336  -0.0899  -0.6505  -0.6186  -0.4069  -0.2815
γ2 = 0.3    0.4304   0.8490   0.5980   0.6814   0.4885   0.4382
γ3 = 0.3   -0.0789  -0.1702   0.2369  -0.2030   0.1326   0.1235
ρ21 = 0     0.9876   0.9900   0.7483   0.4796   0.9476   0.9900

Parameter estimation using the GHK simulation produces very high bias (see Table 3). In addition, the value of the estimator is influenced by the initial value. A problem encountered in the probit model is that its log-likelihood function is not globally concave. This makes the global maximum difficult to locate, and the computation is inefficient (time consuming) in reaching the convergence point.

Figure 1. Comparison of the bias of the correlation estimator for GEE and MLE (vertical axis: bias, 0 to 0.35; horizontal axis: correlation, 0 to 1).

The comparison of the bias of the correlation estimators is shown in Figure 1. The parameter estimator using MLE is better (lower bias) than that using GEE. The estimator of the correlation parameter in GEE tends to be underestimated in proportion to the magnitude of the correlation.

4. CONCLUDING REMARKS

The probit model can be applied to multivariate binary responses using the MLE and GEE estimation methods. Based on the simulated data:
1. The GEE estimator of the regression coefficients is not affected by the value of the correlation between the responses.
2. The MLE estimator of the regression coefficients is affected by the value of the correlation between the responses.
3. The GEE estimator of the correlation parameter tends to be underestimated, whereas the MLE method is more accurate for estimating the correlation parameter.
4. The MSLE estimator is not appropriate for the multivariate probit model.

Open Problem

In this research, the estimation methods used were MLE and GEE. It is quite possible to use other estimation methods, such as Bayesian methods. From the computational side, a simulation method applicable to the probit model needs to be developed to overcome the limitations of the GHK method.

References

[1] LIANG, K.Y. AND ZEGER, S.L., Longitudinal Data Analysis Using Generalised Linear Models, Biometrika 73, 13-22, 1986.
[2] PRENTICE, R.L., Correlated Binary Regression with Covariates Specific to Each Binary Observation, Biometrics 44, 1043-1048, 1988.
[3] HARRIS, M.N., MACQUARIE, L.R. AND SIOUCLIS, A.J., Comparison of Alternative Estimators for Binary Panel Probit Models, Melbourne Institute Working Paper no. 3/00, 2000.
[4] CONTOYANNIS, P., JONES, A.M. AND RICE, N., Dynamics of Health in British Household: Simulation-Based Inference in Panel Probit Model, Working Paper, Department of Economics and Related Studies, University of York, 2001.
[5] HAJIVASSILIOU, V., MCFADDEN, D. AND RUUD, P., Simulation of Multivariate Normal Rectangle Probabilities and Their Derivatives: Theoretical and Computational Results, Journal of Econometrics 72, 85-134, 1996.
[6] GEWEKE, J.F., KEANE, M.P. AND RUNKLE, D.E., Statistical Inference in the Multinomial Multiperiod Probit Model, Journal of Econometrics 80, 125-165, 1997.
[7] NUGRAHA, J., GURITNO, S. AND HARYATMI, S., Logistic Regression Model on Multivariate Binary Response Using Generalized Estimating Equation, National Seminar on Math and Education of Math conducted by UNY, Indonesia, 2006.
[8] NUGRAHA, J., HARYATMI, S. AND GURITNO, S., A Comparison of MLE and GEE on Modeling Binary Panel Response, ICoMS 3rd, IPB, 2008.
[9] NUGRAHA, J., HARYATMI, S. AND GURITNO, S., Logistic Regression Model on Multivariate Binary Response Using Generalized Estimating Equation, Proceeding of National Seminar on Mathematics conducted by UNY, FMIPA UNY, Indonesia, 2006.
[10] NUGRAHA, J., GURITNO, S. AND HARYATMI, S., Likelihood Function and its Derivatives of Probit Model on Multivariate Biner Response, Jurnal Kalam, Vol. 1 No. 2, Faculty of Science and Technology, Universiti Malaysia Terengganu, Malaysia, 2008.
[11] NUGRAHA, J., GURITNO, S. AND HARYATMI, S., Probit Model on Multivariate Binary Response Using SMLE, Jurnal Ilmu Dasar, FMIPA Univ. Jember, 2010.
[12] LECHNER, M., LOLLIVIER, S. AND MAGNAC, T., Parametric Binary Choice Models, Discussion paper no. 2005-23, 2005.
[13] HAJIVASSILIOU, V., MCFADDEN, D. AND RUUD, P., Simulation of Multivariate Normal Rectangle Probabilities and Their Derivatives: Theoretical and Computational Results, Journal of Econometrics 72, 85-134, 1996.
[14] TRAIN, K., Discrete Choice Methods with Simulation, Cambridge University Press, Cambridge, UK, 2003.

JAKA NUGRAHA
Dept. of Statistics, Islamic University of Indonesia, Kampus Terpadu UII, Jl. Kaliurang
Km.14, Yogyakarta, Indonesia
e-mails: [email protected] or [email protected]
Proceedings of ”The 6th SEAMS-UGM Conference 2011”
Statistics, pp. 801 - 812.

CLUSTERING OF DICHOTOMOUS VARIABLES AND


ITS APPLICATION FOR SIMPLIFYING DIMENSION
OF QUALITY VARIABLES OF BUILDING
RECONSTRUCTION PROCESS

KARIYAM

Abstract. Clustering of dichotomous variables is usually realized by applying hierarchical cluster analysis to a proximity matrix. A problem in variable clustering is that different proximity matrices and different linkage methods can produce different cluster memberships, even when the resulting groups of variables have the same sizes.
This paper discusses the selection of clustering parameters based on the maximum level of agreement among the variable-cluster memberships obtained from hierarchical cluster analysis with several linkage and similarity methods, for the case of simplifying the quality variables of the post-earthquake building reconstruction process in Yogyakarta. Validation of the variable clustering was done in several stages. First, using the Pearson correlation matrix, the furthest-neighbour (complete), centroid, between-group and within-group linkage methods were applied and the rates of agreement of the cluster memberships were compared; this process gave sufficiently high conformity. Second, the furthest-neighbour method was combined with Yule's Q similarity, Yule's Y similarity and the Hamann similarity, and this also produced a high rate of agreement of the cluster memberships. The stability of the group memberships was also examined by analysing subgroups of data separated from the original data, which again gave a high level of agreement of the variable-cluster memberships.
Application of the variable clustering procedure to the dimension reduction of the process quality variables of the post-earthquake building reconstruction in Yogyakarta produced eleven groups from forty variables, with excellent results that are easily interpreted in the original problem context. A suitable set of parameters for this case is hierarchical cluster analysis combining the Pearson correlation similarity with the complete linkage method.

Keywords and Phrases: dichotomous, clustering, quality, reconstruction, building

1. INTRODUCTION

Clustering of dichotomous variables is usually realized by applying hierarchical cluster analysis to a proximity matrix. Similarity measures for dichotomous variables are usually based on the frequencies of the following contingency table:

Table 1. Two-way Contingency Table for Dichotomous Variables

                              Category of variable x_i
  Category of variable x_j        1        0       Total
              1                   a        b        a+b
              0                   c        d        c+d
            Total                a+c      b+d    a+b+c+d = n

The similarity between variables x_i and x_j, S_ij, can be measured by the Pearson correlation, given in equation (1):

S_{ij} = \frac{ad - bc}{\sqrt{(a+b)(c+d)(a+c)(b+d)}}   (1)
Moreover, the similarity can also be measured by Yule's Q, Yule's Y and the Hamann similarity, as in equations (2), (3) and (4):

Yule's Q:  S_{ij} = \frac{ad - bc}{ad + bc}   (2)

Yule's Y:  S_{ij} = \frac{\sqrt{ad} - \sqrt{bc}}{\sqrt{ad} + \sqrt{bc}}   (3)

Hamann:    S_{ij} = \frac{(a + d) - (b + c)}{a + b + c + d}   (4)
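The four similarity measures above can be computed directly from the 2 x 2 table. The following short Python sketch is our own illustration (the function name dichotomous_similarities is ours):

import numpy as np

def dichotomous_similarities(xi, xj):
    """Pearson (phi), Yule's Q, Yule's Y and Hamann similarities between
    two 0/1 vectors, computed from the 2 x 2 contingency table of Table 1."""
    xi, xj = np.asarray(xi), np.asarray(xj)
    a = np.sum((xi == 1) & (xj == 1))
    b = np.sum((xi == 0) & (xj == 1))
    c = np.sum((xi == 1) & (xj == 0))
    d = np.sum((xi == 0) & (xj == 0))
    n = a + b + c + d
    pearson = (a * d - b * c) / np.sqrt((a + b) * (c + d) * (a + c) * (b + d))
    yule_q = (a * d - b * c) / (a * d + b * c)
    yule_y = (np.sqrt(a * d) - np.sqrt(b * c)) / (np.sqrt(a * d) + np.sqrt(b * c))
    hamann = ((a + d) - (b + c)) / n
    return {"pearson": pearson, "yule_q": yule_q, "yule_y": yule_y, "hamann": hamann}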

A proximity matrix of similarities among the variables is used as the basis for hierarchical clustering. A number of linkage methods can be implemented consistently with dichotomous variables, for example the complete linkage method, between-group linkage, within-group linkage, or centroid clustering. A problem in variable clustering is that differences in the proximity matrix and in the linkage method can result in different group memberships, even though the groups formed have the same sizes [9].

2. RELATED WORK

A number of researchers have examined this issue: an algorithm for clustering earthquakes based on maximum likelihood [1], a novel attribute weighting algorithm for clustering high-dimensional categorical data [2], and a comparison of distance measures in cluster analysis with dichotomous data [4].
Other studies have also discussed this issue: a comparative evaluation of similarity measures for categorical data [6], clustering of binary sequences through a two-step iterative procedure [7], cluster analysis and categorical data [8], and a comparison of different approaches to hierarchical clustering of ordinal data [10]. This paper discusses the application of dichotomous variable clustering to the quality control of the reconstruction process of small-type houses after the Yogyakarta earthquake.

3. MATERIALS AND METHODOLOGY

3.1. Materials. The author uses 8123 records, collected with the consent of the quality assurance team, of which the author was also a member, during the rehabilitation and reconstruction of earthquake victims' homes. Forty variables were derived from eleven major components of the house-building process, which include the availability of a design, the map, the house foundation, the sloof, the columns, the wall, the ring beam, the reinforcement at the joints of the beam ends and columns, the connection reinforcement, the bearing wall, and the easel. The detailed statements, which had to be answered by the earthquake victims with the dichotomous values Yes (1) or No (0), are listed in Table 2 [3].

Table 2. Quality Variables of the Building (each statement is answered Yes or No)

Component                       Code  Statement
A  Design                       A1    Building based on design
B  Map                          B2    Symmetric map
                                B3    Nothing bump > 25% from big map
C  Foundation                   C4    Deepness > 60 cm
                                C5    Wide of foundation > 60 cm
                                C6    Reinforcement in foundation > 40 cm
                                C7    River stone or hard white stone
                                C8    Composition 1pc : 4 sand
D  Sloof                        D9    Minimal of measure 15 cm x 20 cm
                                D10   Minimal of reinforcement: 4d12 mm
                                D11   Size of begel in sloof: d8 mm x 15 cm or d6 mm x 12,5 cm
                                D12   Existing of anchor in foundation
                                D13   Condition of sloof concrete (not porous)
                                D14   Composition of sloof concrete 1pc : 2 sand : 3 gravel
E  Column                       E15   Minimal of size: 15 cm x 15 cm
                                E16   Minimal reinforcement in column: 4d12 mm
                                E17   Size of begel in column: d8 mm x 15 cm or d6 mm x 12,5 cm
                                E18   Condition of column concrete (not porous)
                                E19   Composition of column concrete 1pc : 2 sand : 3 gravel
F  Wall                         F20   Broad of wall < 9 m²
                                F21   Existing anchor in wall
                                F22   Composition 1pc : 4 sand
G  Ring Beams                   G23   Minimal of size 12 cm x 15 cm
                                G24   Minimal reinforcement in ring beams: 4d12 mm
                                G25   Size of begel in column: d8 mm x 15 cm or d6 mm x 12,5 cm
                                G26   Condition of ring beams concrete (not porous)
                                G27   Composition of ring beams concrete 1pc : 2 sand : 3 gravel
H  Detail of Reinforcement      H28   Reinforcement in the end of angle with size: length 40 cm
I  Connection of Reinforcement  I29   Minimum overlap: 40 cm
J  Bearing Wall                 J30   Existing of sloping ring beams
                                J31   Condition of sloof concrete (not porous)
                                J32   Size of sloping ring beams 12 cm x 15 cm
                                J33   Reinforcement of sloping ring beams: 4d12 mm
                                J34   Size of begel in column: d8 mm x 15 cm or d6 mm x 12,5 cm
                                J35   Existing of knot of wind
K  Easel                        K36   Minimal size of wood: 6 cm x 12 cm
                                K37   Hookup with begel
                                K38   Existing of knot of wind
                                K39   Existing of anchor
                                K40   Wood color is dark

3.2. Methodology. The algorithm for the hierarchical clustering of variables is as follows [5]:
(i) Start with p clusters, each containing a single variable, and a p x p symmetric matrix of similarities or distances;
(ii) Search the distance matrix for the nearest (most similar) pair of clusters. Let the distance between the most similar clusters U and V be d(UV);
(iii) Merge clusters U and V. Label the newly formed cluster (UV). Update the entries in the distance matrix by deleting the rows and columns corresponding to clusters U and V, and adding a row and column giving the distances between cluster (UV) and the remaining clusters;
(iv) Repeat (ii) and (iii) a total of p - 1 times. (All variables will be in a single cluster at the termination of the algorithm.)
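As one way to carry out these steps in practice, the variables can be clustered with SciPy's agglomerative routines on a Pearson-based dissimilarity. This Python sketch is our own illustration under our own choices (the function name cluster_variables and the use of 1 minus the correlation as the dissimilarity are assumptions, not the author's implementation):

import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import squareform

def cluster_variables(X, n_clusters=11, method="complete"):
    """Agglomerative clustering of the columns (variables) of a 0/1 data
    matrix X, using 1 - Pearson correlation as the dissimilarity."""
    R = np.corrcoef(X, rowvar=False)      # p x p correlation between variables
    D = 1.0 - R                           # dissimilarity between variables
    np.fill_diagonal(D, 0.0)
    Z = linkage(squareform(D, checks=False), method=method)
    return fcluster(Z, t=n_clusters, criterion="maxclust")

Other linkage choices ("average", "centroid", and so on) can be passed through the method argument for comparison.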
The linkage methods used include the complete, centroid, between-group and within-group linkage. The distance measures used were the Pearson correlation, Yule's Q, Yule's Y and Hamann. The outline of the applied procedure is as follows:
(i) apply hierarchical cluster analysis based on the Pearson correlation matrix and the complete linkage method;
(ii) apply between-group linkage, within-group linkage and centroid clustering, and compare the results with complete linkage;
(iii) apply Yule's Q similarity, Yule's Y similarity and the Hamann similarity, and compare;
(iv) separate the data into two data sets and compare the level of conformity of the groups of variables with the results of steps (ii) and (iii) (see the sketch after this list for one way to compute such an agreement rate).
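Because cluster labels are arbitrary, comparing two clusterings requires a label-free measure of agreement. The paper matched memberships manually; the following Python sketch (ours) computes one convenient label-free alternative, the percentage of variable pairs on which two clusterings agree, and is not necessarily the exact matching rate used by the author:

import numpy as np
from itertools import combinations

def comembership_agreement(labels1, labels2):
    """Percentage of variable pairs on which two clusterings agree
    (both place the pair in one cluster, or both place it in different clusters)."""
    labels1, labels2 = np.asarray(labels1), np.asarray(labels2)
    agree = total = 0
    for i, j in combinations(range(len(labels1)), 2):
        same1 = labels1[i] == labels1[j]
        same2 = labels2[i] == labels2[j]
        agree += int(same1 == same2)
        total += 1
    return 100.0 * agree / total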

4. RESULT AND DISCUSSION

The application of clustering of dichotomous variables to the building characteristics, using the Pearson distance and complete linkage, produced the dendrogram plot shown in Figure 1.

Figure 1. The Dendrogram Plot

Furthermore, cluster analysis was applied using different linkage methods, namely the within-group, between-group and centroid linkage. The agreement of the cluster memberships between complete linkage and within-group linkage was 60%. The agreement between complete linkage and between-group linkage was 80%, while complete and centroid linkage produced an agreement of 83%. Thus, it can be said that the complete linkage method works well for clustering the quality variables of the building, as shown in Figure 2.
Figure 2. Percentage of suitability of the members of variable groups based on different linkage methods

Furthermore, using the complete linkage method, the data were analyzed with different similarities, namely Yule's Q, Yule's Y and Hamann, and compared with Pearson. The comparison of the percentage of agreement of the members of the variable groups for these similarities is shown in Figure 3.

Figure 3. Percentage of suitability of the members of variable groups based on the distance measure

This shows that the Pearson correlation can be used to analyze the building quality variables. Applying the different distances produced a level of agreement of the variable groups of about 95%, except for the Hamann similarity, which gave 58%. The data were then separated into two data sets, the first containing 6088 records and the second 2035 records. Each data set was analyzed by cluster analysis using complete linkage and the Pearson correlation. This step produced a high level of agreement of the variable memberships, 95%. The difficulty of this series of procedures lies in the calculation needed to obtain the percentage of conformity of the members of the variable groups: because the cluster labels differ across linkage and similarity methods, the conversion had to be done manually with the help of Microsoft Excel.
Furthermore, the groups of variables based on the dendrogram plot in Figure 1, which form 11 groups, are listed in Table 3.

Table 3. Names of the Variable Clusters

Cluster 1 - Design and map:
  A1  Building based on design
  B2  Symmetric map
  B3  Nothing bump > 25% from big map
Cluster 2 - Foundation of house:
  C4  Deepness > 60 cm
  C5  Wide of foundation > 60 cm
Cluster 3 - Size of begel:
  C6  Reinforcement in foundation > 40 cm
  D11 Size of begel in sloof: d8 mm x 15 cm or d6 mm x 12,5 cm
  D12 Existing of anchor in foundation
  E17 Size of begel in column: d8 mm x 15 cm or d6 mm x 12,5 cm
  F21 Existing anchor in wall
  G25 Size of begel in column: d8 mm x 15 cm or d6 mm x 12,5 cm
Cluster 4 - Stone of foundation:
  C7  River stone or hard white stone
Cluster 5 - Composition mixture and size of concrete:
  C8  Composition 1pc : 4 sand
  D9  Minimal of measure 15 cm x 20 cm
  D14 Composition of sloof concrete 1pc : 2 sand : 3 gravel
  E15 Minimal of size: 15 cm x 15 cm
  E19 Composition of column concrete 1pc : 2 sand : 3 gravel
  F22 Composition 1pc : 4 sand
  G23 Minimal of size 12 cm x 15 cm
  G27 Composition of ring beams concrete 1pc : 2 sand : 3 gravel
Cluster 6 - Size of reinforcement:
  D10 Minimal of reinforcement: 4d12 mm
  E16 Minimal reinforcement in column: 4d12 mm
  G24 Minimal reinforcement in ring beams: 4d12 mm
  J33 Reinforcement of sloping ring beams: 4d12 mm
Cluster 7 - Quality of concrete:
  D13 Condition of sloof concrete (not porous)
  E18 Condition of column concrete (not porous)
  F20 Broad of wall < 9 m²
  G26 Condition of ring beams concrete (not porous)
Cluster 8 - Detail of reinforcement:
  H28 Reinforcement in the end of angle with size: length 40 cm
  I29 Minimum overlap: 40 cm
Cluster 9 - Quality of bearing wall:
  J30 Existing of sloping ring beams
  J31 Condition of sloof concrete (not porous)
  J32 Size of sloping ring beams 12 cm x 15 cm
  J34 Size of begel in column: d8 mm x 15 cm or d6 mm x 12,5 cm
Cluster 10 - Knot of wind:
  J35 Existing of knot of wind
  K38 Existing of knot of wind
Cluster 11 - Easel:
  K36 Minimal size of wood: 6 cm x 12 cm
  K37 Hookup with begel
  K39 Existing of anchor
  K40 Wood color is dark

The groups of variables can be named: availability of design and map, foundation of the house, size of begel, stone of the foundation, composition mixture and size of concrete, size of reinforcement, quality of concrete, detail of reinforcement, quality of the bearing wall, knot of wind, and quality of the easel. Further, as a comparator, factor analysis was also applied using the Pearson correlation matrix as its basis; based on eigenvalues above one, twelve factors were found. However, in this paper eleven groups of variables (factors) are selected, because these results are relatively stable and valid, and easily interpreted in the context of the original problem.
The detailed analysis, especially the percentage of nonconformity of the buildings with the earthquake-resistance standard, is shown in Figure 4.
Figure 4. Average percentage of houses not conforming with the standard of earthquake-resistant buildings

Based on Figure 4, the average percentage of houses built without a design is 6.5%, the average percentage of houses with a foundation depth of less than 60 cm is 4.5%, and 1% of the houses have foundation stone that is not river stone. Meanwhile, the average percentage of houses whose begel size does not conform to the standard is around 6%. The percentage of houses whose mixture composition and concrete sizes do not conform to the earthquake-resistance standard is 16.5%, and the percentage of houses with porous concrete is 2.5%. More than 60% of the houses do not have reinforcement that meets the standard. In addition, in 20% of the houses the reinforcement detail is unsuitable. The percentage of houses whose bearing-wall quality does not meet the standard is 16.5%, around 50% of the houses do not have a knot of wind, and around 12% of the easel conditions do not conform to the standard. This shows that the poor condition of the houses lies mainly in the condition of the reinforcement and the existence of the knot of wind. In general, it can be concluded that a number of houses do not conform to the standard of earthquake-resistant buildings. These results are expected to be used as a basis for suggesting house renovations, which will reduce the risk from earthquakes.

5. CONCLUDING REMARK

Suitability of variable cluster members which resulted from combination between


distance similarity and linkage method can be used as parameter goodness from a
dichotomous variables clustering. The research about dimension reduction of quality
variables of building reconstruction process can be solved based on its parameter with the
excellent result. Conclusion fantastic and easily interpreted according to the early context on
this problem was done by variable cluster analysis using the complete linkage method and
Pearson correlation matrix.

References

[1] ADELFIO, G., CHIODI, M., AND LUZIO, D., An Algorithm for Earthquakes Clustering Based on Maximum Likelihood, Proceedings of the 6th Conference of the Classification and Data Analysis Group of the Societa Italiana di Statistica, Springer New York, Part II: Cluster Analysis, 25 - 32, 2010.
[2] BAI, L., LIANG, J., DANG, C., AND CAO, F., A Novel Attribute Weighting Algorithm for Clustering High-Dimensional Categorical Data, Pattern Recognition, 44, 2843 - 2861, 2011.
[3] DINAS PEKERJAAN UMUM DIY, Laporan Akhir pada pekerjaan Quality Assurance (QA) dan Quality Control (QC) Pelaksanaan Rehabilitasi/Rekonstruksi Pasca Gempa Bumi di D.I. Yogyakarta dan Jawa Tengah, DPU Yogyakarta, 2007.
[4] FINCH, H., Comparison of Distance Measures in Cluster Analysis with Dichotomous Data, Journal of Data Science, Vol. 3, 85 - 100, 2005.
[5] HARDLE, W. AND SIMAR, L., Applied Multivariate Statistical Analysis, Second Edition, Springer-Verlag, 2007.
[6] KUMAR, V., CHANDOLA, V., AND BORIAH, S., Similarity Measures for Categorical Data: A Comparative Evaluation, SIAM, 2008.
[7] PALUMBO, F., AND D'ENZA, A.I., A Two-Step Iterative Procedure for Clustering of Binary Sequences, Proceedings of the 6th Conference of the Classification and Data Analysis Group of the Societa Italiana di Statistica, Springer New York, Part II: Cluster Analysis, 33 - 40, 2010.
[8] REZANKOVA, H., Cluster Analysis and Categorical Data, http://panda.hyperlink.cz/cestapdf/pdf09c3/rezankova.pdf, 2010.
[9] TIMM, N.H., Applied Multivariate Analysis, Springer, 2002.
[10] ZIBERNA, A., KEJZAR, N., AND GOLOB, P., A Comparison of Different Approaches to Hierarchical Clustering of Ordinal Data, Journal of Metodoloski zvezki, Vol. 1, No. 1, 57 - 73, Slovenia, 2004.

KARIYAM
Department of Statistics, Faculty of Mathematics and Natural Science
Islamic University of Indonesia
e-mail: [email protected]
Proceedings of ”The 6th SEAMS-UGM Conference 2011”
Statistics, pp. 813 - 820.

VALUING EMPLOYEE STOCK OPTIONS USING MONTE


CARLO METHOD

KUNTJORO ADJI SIDARTO AND DILA PUSPITA

Abstract. As a part of their compensation package, many companies grant employee stock options (ESOs for short) to their employees. These are call options granted by a company to an employee on the stock of the company. ESOs differ from standard traded options in at least the following respects: ESOs can be exercised only after the vesting period, ESOs cannot be transferred, and in case the employee leaves the company during the vesting period, the ESOs are forfeited. While the Black-Scholes model is effective in valuing standard traded options, the use of a lattice model in valuing ESOs, such as the Hull-White ESO model, is preferable given the specific features of ESOs. In this paper, a modification of the Hull-White ESO model is made: first, by assuming that the option is exercised when a barrier condition on the stock price is met after the vesting period; second, by assuming that the option is exercised when the stock price spends a certain period of time above a certain barrier level after the vesting period. The values of the ESOs are then computed using the Monte Carlo method.

Keywords and Phrases: Employee Stock Options, Hull-White ESO’s model, Monte Carlo
Method

1. INTRODUCTION

Employee Stock Options (ESOs for short) are call options granted by a company to an employee on the stock of the company. ESOs differ from standard traded options in at least the following respects: ESOs can be exercised only after the vesting period; ESOs cannot be transferred; and in case the employee leaves the company during the vesting period, the ESOs are forfeited. Given these specific features of ESOs, the use of a lattice model in valuing ESOs, such as the Hull-White ESO model [3], is preferable. ESOs encourage the employee to remain in the company and to work towards improving the company's earnings and management, which results in an increase of the share price and eventually an increase in the wealth of the employee (West [4]). Hence the company may use ESOs as a strategy to increase its stock price. In the Hull-White model it is assumed that an employee will exercise their ESOs prior to maturity if the stock price is at least M times the strike price; a binomial method is then used to find the value. In this paper a modification of the Hull-White model concerning the exercise strategy is proposed. First, we consider the case in which an employee may exercise their ESOs prior to maturity if the stock price reaches a certain value. Second, we consider the case in which the employee may exercise their ESOs prior to maturity if the stock price spends a certain period of time above a certain value; thus a Parisian-style feature is added to the ESO. A Monte Carlo method is then readily applicable for valuing the ESOs (Bernard and Boyle [2]).

2. HULL-WHITE MODEL

Recall that in the Hull-White ESO model [3]: the option can be exercised at any time during its life after a vesting period; a vested option is exercised prior to maturity if the stock price is at least M times the strike price; and there is an employee exit rate, the rate at which employees leave the company per year. In case an employee leaves the company during the vesting period, their ESOs are forfeited. The probability of an employee leaving the company in each period of time is assumed as in Ammann and Seiz [1].

Monte Carlo Method for Valuing the Hull-White Model. Partition the time to expiration into time steps and let the corresponding stock prices at these times be simulated, together with the value of the option at each time step. Define the strike price of the option, the time when the vesting period ends, the risk-free interest rate, the volatility of the underlying stock, and the dividend yield. Assume that the stock price follows a geometric Brownian motion. First, simulate the stock price step by step using the geometric Brownian motion updating formula.
The equations for describing the backward recurrence through path are:

When

 if then

 if and then

 if and then

Repeat the simulation a number of times and find the value of the ESO from each simulation. The value of the ESO is then given by the average of these simulated values.

Table 2 gives the result of the Hull-White ESO model computed using the binomial method and the Monte Carlo method, based on the data given in Table 1. The number of stock price path simulations is 10^5 with 2520 time steps for each simulation. The 95% confidence interval is [12.2968, 12.5015].

Table 1. ESO data

S0 = $50, K = $50, option life = 10 years, vesting period = 3 years, risk-free rate = 5%, dividend yield = 2.4%, volatility = 30%, employee exit rate = 6%, M = 1.5

Method Value of ESO


Binomial $ 12.5421
Monte Carlo $ 12.3991
Table 2 Hull-White ESO’s value
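The valuation just described can be sketched in a few lines of Python. This is our own illustrative reading of the model, not the authors' code: the GBM discretization, the per-step exit probability approximated as the exit rate times the time step, and the assignment of the Table 1 values to parameter names are all assumptions of this sketch.

import numpy as np

def hull_white_eso_mc(S0=50.0, K=50.0, T=10.0, vest=3.0, r=0.05, q=0.024,
                      sigma=0.30, exit_rate=0.06, M=1.5,
                      n_steps=2520, n_paths=10_000, rng=None):
    """Monte Carlo value of a Hull-White style ESO: exercised at the first time
    after vesting at which S >= M*K; forfeited if the employee leaves during
    vesting; exercised (if in the money) or forfeited when the employee leaves
    after vesting; exercised at maturity if still held and in the money."""
    rng = np.random.default_rng(rng)
    dt = T / n_steps
    p_exit = exit_rate * dt                      # per-step exit probability (assumption)
    total = 0.0
    for _ in range(n_paths):
        S, value = S0, 0.0
        for step in range(1, n_steps + 1):
            t = step * dt
            S *= np.exp((r - q - 0.5 * sigma**2) * dt
                        + sigma * np.sqrt(dt) * rng.standard_normal())
            leaves = rng.random() < p_exit
            if t < vest:
                if leaves:                       # forfeited during vesting
                    break
            else:
                if S >= M * K:                   # voluntary early exercise
                    value = np.exp(-r * t) * (S - K); break
                if leaves:                       # exit: exercise if in the money, else forfeit
                    value = np.exp(-r * t) * max(S - K, 0.0); break
                if step == n_steps:              # maturity
                    value = np.exp(-r * t) * max(S - K, 0.0)
        total += value
    return total / n_paths

With a large number of paths this kind of sketch reproduces values of the same order as those in Table 2, though the exact figure depends on the discretization choices made here.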

3. ESO WITH PARISIAN-STYLE

In this section, we replace the 'psychological barrier' MK of the Hull-White model by a real barrier. ESOs under the Hull-White model can then be exercised only after the stock price has spent a certain period of time (the window period) above the barrier level. If an employee leaves the company before the stock price has stayed above the barrier for the window period, their ESOs are forfeited. It is assumed that employees exercise the option as soon as the Parisian condition is met. The Parisian condition is met if the closing asset price stays above the barrier each day for the specified number of consecutive days (Bernard and Boyle [2]).
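A small Python helper (ours, with assumed argument names) for locating the first day on a simulated daily closing-price path at which the Parisian condition is met, given a barrier level and a window of consecutive days:

def first_parisian_time(S, barrier, window_days, vest_index=0):
    """Return the index of the first day (not earlier than vest_index) by which the
    closing price has stayed >= barrier for `window_days` consecutive days, or None
    if the Parisian condition is never met. A 0-day window is treated as the first
    day above the barrier (the standard-barrier special case). The consecutive-day
    count may start before vesting in this sketch, which is a simplification."""
    run = 0
    for i, price in enumerate(S):
        run = run + 1 if price >= barrier else 0
        if i >= vest_index and run >= max(window_days, 1):
            return i
    return None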

Monte Carlo Method for Valuing ESOs with Parisian Style. Define the first time at which the Parisian condition is met for each simulation, and let the window period of the ESO with Parisian style be given. Figure 1 illustrates an ESO with Parisian style.

Figure 1. Illustration of ESOs with Parisian style

The employee will exercise their ESOs at the first time the Parisian condition is met after the vesting period. If the Parisian condition has not been met, the ESO is forfeited and its value is zero. After simulating the stock price, the exercise time is determined. The value of the ESO for each simulation is then obtained by backward recurrence, and the value of the ESO is given by the average over all simulations.

Table 3 gives the result of the ESO with the Parisian-style model computed using the Monte Carlo method, based on the data given in Table 1 and the chosen barrier level, for two window periods, 0 and 15 days. The number of stock price path simulations is 10^5 with 2520 time steps for each simulation.

Window period    Value of ESO    95% confidence interval
0 days           $12.2736        [12.1689, 12.3783]
15 days          $12.4836        [12.3751, 12.5921]

Table 3. Value of ESO with Parisian style

The ESO with the Parisian-style model and a window period of 0 days is a special case: it can be viewed as a standard barrier option, and a lattice method can then be used to value it. Using the trinomial lattice method we obtain the value of the ESO with the Parisian-style model with a 0-day window as $12.2113. However, the lattice method is hard to apply for valuing the ESO with the Parisian-style model for other window periods. The Monte Carlo method is flexible and easy to implement for pricing the ESO with the Parisian-style model.

4. INFLUENCE OF SEVERAL PARAMETERS ON ESOS VALUATION



The following Figures 2, 3 and 4 show the influence of several parameters on the ESO value. The parameters are: the vesting period, the psychological barrier, the real barrier, and the window period.

Figure 2. Vesting Period vs. ESO

Figure 2 gives the ESO values using the input parameters given earlier, with λ = 0.06, and different vesting periods. As Figure 2 shows, there is a value of the vesting period at which the ESO has a maximum value.

Figure 3. ESO vs. Barrier

Figure 3 gives the ESO values using the input parameters given earlier, with λ = 0.06. The blue line represents the value of ESOs under the Hull-White model using different psychological barrier levels. The red line represents the value of ESOs under the Parisian-style model with a fixed window period and different real barrier levels. The value of ESOs with a real barrier is lower than that with a psychological barrier because ESOs with a real barrier do not give an exercise opportunity to an employee leaving the company before the Parisian condition is met. As Figure 3 shows, there is a barrier value at which the value of the ESO with a real barrier has a maximum value.

Figure 4. ESO vs. Window period

Figure 4 gives the valuation of the ESO with the Parisian model using the input parameters given earlier, with λ = 0.06, and different window periods. It is natural that as the window period becomes longer, the option becomes more difficult to exercise. Hence, as Figure 4 shows, an increasing window period gives a decreasing ESO value.

5. CONCLUDING REMARK

We have presented a simple Monte Carlo method for valuing employee stock options. In
particular, we analyzed the Hull-White model where we added the Parisian style to the
model. As a consequence it is not easy to use a lattice method to perform valuation. Hence
we propose the use of Monte Carlo method for its valuation, which is easier to implement.
A graphical analysis of several ESOs parameter’s influence on the option value is also
given.

References

[1] AMMANN, M. AND SEIZ, R., Valuing Employee Stock Options: Does the Model Matter?, Financial Analysts Journal, vol. 60, 5, 21-37, 2004.
[2] BERNARD, C. AND BOYLE, P., Monte Carlo Methods for Pricing Discrete Parisian Options, The European Journal of Finance, 1-28, 2010.
[3] HULL, J. AND WHITE, A., How to Value Employee Stock Options, Financial Analysts Journal, 60, 114-119, 2004.
[4] WEST, G., Employee Stock Options, http://www.riskworx.com/pdf/esoPDF5 (accessed: June 5, 2008).

KUNTJORO ADJI SIDARTO
Industrial and Financial Mathematics Group, Faculty of Mathematics and Natural
Sciences, Institute of Technology Bandung, Indonesia.
e-mail: [email protected]

DILA PUSPITA
Industrial and Financial Mathematics Group, Faculty of Mathematics and Natural
Sciences, Institute of Technology Bandung, Indonesia.
e-mail: [email protected]
Proceedings of ”The 6th SEAMS-UGM Conference 2011”
Statistics, pp. 821 - 830.

CLASSIFICATION OF EPILEPTIC DATA USING


FUZZY CLUSTERING

NAZIHAH AHMAD, SHARMILA KARIM, HAWA IBRAHIM , AZIZAN SAABAN,


KAMARUN HIZAM MANSOR

Abstract. Determining the region of the epileptic foci in the brain that needs to be removed or isolated is a crucial task. The EEG signals not only contain incompletely recorded data but also provide little a priori information about the unknown source. Thus the EEG signals are treated in a fuzzy environment for classification. Fuzzy clustering is one of the techniques used to determine where the electrical activity in the brain occurs most. In this paper, two fuzzy clustering algorithms, Fuzzy c-Means and Gustafson-Kessel, are investigated and applied to real epileptic data. The results of the algorithms are compared with reference to the optimal number of clusters. The clusters are well spread within the considered time, which gives a hint of a generalized epileptic seizure.

Keywords and Phrases :Fuzzy clustering, EEG signals analysis, Fuzzy c-Means, Gustafson-
Kessel

1. INTRODUCTION

Epilepsy is one of the most common disorders of the brain, characterized by recurrent seizures [1], and affects approximately 1% of the world's population, more than 50 million individuals worldwide. Seizures are classified into two major categories, partial or generalized [2]. A partial seizure occurs when the initial discharge occurs at a localized focus, while a generalized seizure has multiple foci at various locations throughout the whole brain, in both hemispheres. Not all seizures can be easily defined as either partial or generalized. Some people have seizures that begin as partial seizures but then spread to the entire brain. Others may have both types of seizures but with no clear pattern.
Epileptic seizures, which are caused by abnormal electrical activity in the brain, can be measured using the electroencephalogram (EEG). The non-stationary recorded EEG signals contain information regarding changes in the electrical potential of the brain obtained from a

________________________________
2010 Mathematics Subject Classification : 92C55, 93A30, 94A12

given set of recording electrodes. Whenever there is a net current flow between two electrodes, a potential difference will develop. These potential differences develop due to volume currents that spread from their source in active neural tissue throughout the conductive media of the head.
These changes appear as wriggling lines along the time axis in a typical EEG recording (Figure 1). The recorded data include characteristic waveforms with accompanying variations in amplitude and frequency, given as a series of numerical values over time. EEG waveforms are typically recorded from 1 to 50 µV in amplitude with frequencies of 2 to 50 Hz. During a brain disease state, such as an epileptic seizure, the EEG amplitude can shoot up to nearly 1000 µV [2].

Figure 1: Sample of EEG signals from an epileptic patient during seizure attack.

In order to localize the source or strength of the EEG signals, classification of the data is often required [4]. A classification method groups the scattered data of the EEG signals into two or more clusters, each with a corresponding cluster centre. The cluster centre gives some clues as to where the electrical activity in the brain occurs most [5], [6]. It also provides the most likely location where the epileptic seizure begins and identifies the category of the seizure, either partial or generalized [7], [8].
The EEG signals not only contain incompletely recorded data but also provide little a priori information about the unknown source [9]. Therefore, in this study, the EEG signals are treated in a fuzzy environment, which has significant advantages over other approaches. Fuzzy methods have the capability of underlying the remarkable human ability to make rational decisions in an environment of imprecision, partial knowledge, partial certainty and partial truth [10]. Classification of EEG recordings based on fuzzy clustering algorithms has been applied in spike detection and in classifying emotions [11], [12].
In this study, fuzzy partition clustering [13], [14] is implemented on the real EEG data (i.e. the potential differences) of epilepsy patients. Then a comparison based on the optimal number of clusters is performed.

2. SEGMENTATION OF EEG SIGNALS

The structure of the recorded potential differences in Figure 1 can be partitioned into disjoint and non-overlapping regions [4]. The basic idea of EEG signal segmentation can be described as follows. Let P be a set of recorded potential differences of EEG signals. Then a set P_i = {P_{i1}, P_{i2}, ..., P_{it}, P_{i(t+1)}, ..., P_{in}} is a finite ordered partition of P for the i-th channel at any time t, where t_i < t_{i+1}, such that
(i) P_{it} ∩ P_{i(t+1)} = ∅ for all t (well-partitioned);
(ii) ∪_{t=1}^{n} P_{it} = P,
where P_{it} captures the amplitude of the raw EEG signal at any fixed time t for the i-th channel, segmented into several consecutive connected points.
These non-overlapping signals have been proven to form a metric space, hence a topological space, and finally were implemented as a digital space in order to explain the theoretical background of these particular signals [15]. Since the EEG signals form topological spaces, they can be stretched into any geometric figure (i.e. fuzzy clusters) in order to identify the current source where the epileptic seizure originated.

3. FUZZY CLUSTERING

A fuzzy clustering algorithm is an extension of a classical clustering algorithm to the fuzzy domain [10]. This approach is advantageous when there exists a dataset with subgroupings of points having indistinct boundaries and overlap between the clusters [16]. Various degrees of membership are assigned to each point, so that the membership of a point is shared among the clusters.
There are different types of fuzzy clustering [17]. In this study, we consider the Fuzzy c-Means (FCM) algorithm and the Gustafson-Kessel (GK) algorithm. These two algorithms have been widely applied to different tasks such as pattern recognition, data mining, image processing, signal processing and fuzzy modelling [18].

3.1 Fuzzy c-Means Algorithm. The FCM algorithm is based on the minimization of the objective function

J(U, c_1, c_2, \ldots, c_c) = \sum_{i=1}^{c} J_i = \sum_{i=1}^{c}\sum_{j=1}^{n} u_{ij}^{m} d_{ij}^{2},

where u_{ij} is between 0 and 1; c_i is the centroid of cluster i; d_{ij} is the Euclidean distance between the i-th centroid c_i and the j-th data point; and m \in [1, \infty) is a weighting exponent.

The steps of the FCM algorithm are given in Figure 2.

Given the data set Z, choose the number of clusters 1 < c < N, the weighting exponent m > 1, the termination tolerance ε > 0 and the norm-inducing matrix A.
Repeat for l = 1, 2, ...

Step 1: Compute the cluster prototypes (means):

  v_i^{(l)} = \frac{\sum_{k=1}^{N}(\mu_{ik}^{(l-1)})^{m} z_k}{\sum_{k=1}^{N}(\mu_{ik}^{(l-1)})^{m}}, \quad 1 \le i \le c.

Step 2: Compute the distances:

  D_{ikA}^{2} = (z_k - v_i^{(l)})^{T} A (z_k - v_i^{(l)}), \quad 1 \le i \le c, \; 1 \le k \le N.

Step 3: Update the partition matrix:

  if D_{ikA} > 0 for 1 \le i \le c, 1 \le k \le N,

    \mu_{ik}^{(l)} = \frac{1}{\sum_{j=1}^{c}(D_{ikA}/D_{jkA})^{2/(m-1)}},

  otherwise

    \mu_{ik}^{(l)} = 0 if D_{ikA} > 0, and \mu_{ik}^{(l)} \in [0,1] with \sum_{i=1}^{c}\mu_{ik}^{(l)} = 1,

until ||U^{(l)} - U^{(l-1)}|| < ε.

Figure 2: FCM algorithm
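For concreteness, the FCM iteration of Figure 2 can be written in a few lines of NumPy. This Python sketch is our own illustration with the Euclidean norm (A = I); the study itself used MATLAB:

import numpy as np

def fuzzy_c_means(Z, c, m=2.0, eps=1e-3, max_iter=100, rng=None):
    """Fuzzy c-means on the rows of Z (N x d). Returns (centres V, partition U)."""
    rng = np.random.default_rng(rng)
    N = Z.shape[0]
    U = rng.random((c, N))
    U /= U.sum(axis=0)                           # each data point's memberships sum to one
    for _ in range(max_iter):
        Um = U ** m
        V = (Um @ Z) / Um.sum(axis=1, keepdims=True)                # Step 1: prototypes
        D = np.linalg.norm(Z[None, :, :] - V[:, None, :], axis=2)   # Step 2: distances
        D = np.fmax(D, 1e-12)                    # guard against division by zero
        U_new = 1.0 / np.sum((D[:, None, :] / D[None, :, :]) ** (2.0 / (m - 1.0)), axis=1)
        if np.linalg.norm(U_new - U) < eps:      # Step 3 + termination test
            U = U_new
            break
        U = U_new
    return V, U

Calling fuzzy_c_means(Z, c=4) on the two-dimensional feature matrix of one time window yields cluster centres of the kind reported in Table 2.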



3.2 Gustafson-Kessel Algorithm. The GK algorithm is an extension of FCM clustering and uses the covariance matrix. Gustafson and Kessel [14] extended the FCM algorithm to an inner-product matrix norm, where a positive definite matrix is adapted according to the actual shapes of the individual clusters, described approximately by the cluster covariance matrices. Figure 3 presents the GK algorithm.
In this study, m is typically set equal to 2. If the error never falls below the termination tolerance ε, the maximum number of iterations, set to 100, is used as a second termination criterion.

Given the data set Z, choose the number of clusters 1 < c < N, the weighting exponent m > 1, the termination tolerance ε > 0 and the norm-inducing matrix A.
Repeat for l = 1, 2, ...

Step 1: Compute the cluster prototypes (means):

  v_i^{(l)} = \frac{\sum_{k=1}^{N}(\mu_{ik}^{(l-1)})^{m} z_k}{\sum_{k=1}^{N}(\mu_{ik}^{(l-1)})^{m}}, \quad 1 \le i \le c.

Step 2: Compute the cluster covariance matrices:

  F_i = \frac{\sum_{k=1}^{N}(\mu_{ik}^{(l-1)})^{m}(z_k - v_i^{(l)})(z_k - v_i^{(l)})^{T}}{\sum_{k=1}^{N}(\mu_{ik}^{(l-1)})^{m}}, \quad 1 \le i \le c.

Step 3: Compute the distances:

  D_{ikA_i}^{2} = (z_k - v_i^{(l)})^{T}\left[\rho_i \det(F_i)^{1/n} F_i^{-1}\right](z_k - v_i^{(l)}), \quad 1 \le i \le c, \; 1 \le k \le N.

Step 4: Update the partition matrix:

  if D_{ikA_i} > 0 for 1 \le i \le c, 1 \le k \le N,

    \mu_{ik}^{(l)} = \frac{1}{\sum_{j=1}^{c}(D_{ikA_i}/D_{jkA_i})^{2/(m-1)}},

  otherwise

    \mu_{ik}^{(l)} = 0 if D_{ikA_i} > 0, and \mu_{ik}^{(l)} \in [0,1] with \sum_{i=1}^{c}\mu_{ik}^{(l)} = 1,

until ||U^{(l)} - U^{(l-1)}|| < ε.

Figure 3: Gustafson-Kessel algorithm.

3.3 Cluster Validity Measurement. In order to determine the optimal number of clusters present in the data, cluster validity measurement needs to be done, and the Xie and Beni (XB) index has been implemented. The XB index is defined as follows:

  XB = \frac{\sum_{i=1}^{c}\sum_{j=1}^{n}\mu_{ij}^{m}\,\|x_j - v_i\|^{2}}{n\,\min_{i,j}\|x_j - v_i\|^{2}}.

According to [5] and [19], the XB index gives an effective measurement of the number of clusters compared to other indexes when EEG signals are used.
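Given the partition matrix and centres from a clustering run, the XB index above can be evaluated directly. The following Python sketch (ours) writes the denominator exactly as in the formula above; note that the more common variant of the Xie-Beni index instead uses the minimum squared distance between cluster centres:

import numpy as np

def xie_beni_index(Z, V, U, m=2.0):
    """XB validity index for a fuzzy partition U (c x N) of the data Z (N x d)
    with cluster centres V (c x d), following the formula given in the text."""
    d2 = np.linalg.norm(Z[None, :, :] - V[:, None, :], axis=2) ** 2   # ||x_j - v_i||^2
    numerator = np.sum((U ** m) * d2)
    denominator = Z.shape[0] * d2.min()
    return numerator / denominator

The optimal number of clusters is chosen where this index attains a (local) minimum over the candidate values of c.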

4. DATA ANALYSIS AND RESULT

The real EEG data of an epileptic patient were digitized at 256 samples per second using the Nicolet One EEG software. The software performed a Fast Fourier Transform (FFT) of the raw signal data. The EEG data were recorded by placing electrodes on the scalp according to the international 10-20 system. Nineteen channels of EEG were recorded simultaneously. These channels, with a duration of ten seconds during the seizure attack, are considered.
The FCM and GK algorithms were implemented using MATLAB R2010a with the number of clusters varying from two to ten for each second. At time t = 1, the values of the validity measure corresponding to the number of clusters are plotted in Figure 4 and Figure 5. The XB index for FCM and GK reaches a local minimum at c = 4. Hence the optimal number of clusters for both algorithms is four.

Figure 4: Validity measure for FCM at t=1



Figure 5: Validity measure for GK at t=1

Table 1 shows the optimal number of clusters using FCM and GK at each second. At t = 1 and t = 2 there are four cluster centres; however, the positions of the cluster centres at t = 1 in R² are slightly different for each algorithm (Table 2). The highest number of clusters occurs at t = 3, which means that at this time the electrical activity in the brain has been triggered. For the rest of the time, the number of cluster centres lies within three to five.

Table 1: Optimal number of cluster for ten seconds using FCM and GK

Algorithm t1 t2 t3 t4 t5 t6 t7 t8 t9 t10
FCM 4 4 6 4 3 3 4 5 3 4
GK 4 4 5 3 5 3 4 5 3 4

Table 2:Position of the optimal cluster center at t=1

FCM GK
( x, y) ( x, y)

(0.4642 , 0.7590) (0.4637 , 0.7567)


t1 (0.1510 , 0.5480) (0.1493 , 0.5488)
(0.6088 , 0.3135) (0.5773 , 0.2898)
(0.6888 , 0.1509) (0.7130 , 0.1880)
(0.5109 , 0.8720) (0.4356 , 0.8545)
(0.3169 , 0.6751) (0.5109 , 0.6886)
t2
(0.6557 , 0.5933) (0.6046 , 0.5363)
(0.4121 , 0.4027) (0.3743 , 0.3717)
(0.5971 , 0.5135) (0.6309 , 0.6368)
(0.3762 , 0.1804) (0.3415,0.7277)
(0.4150 , 0.7226) (0.5300 , 0.2840)
t3
(0.3504 , 0.5725) (0.4587 , 0.6528)
(0.8083 , 0.6035) (0.8130 , 0.5804)
(0.6881 , 0.7724)

(0.6975 , 0.6089) (0.4550 , 0.2702)


(0.4930 , 0.7738) (0.5267 , 0.5950)
t4
(0.4893 , 0.3162) (0.7617 , 0.6278)
(0.8478 , 0.1713)
(0.9036 , 0.3893) (0.9586 , 0.5462)
(0.0979 , 0.7262) (0.0728 , 0.6185)
t5 (0.5218 , 0.6569) (0.4231 , 0.6587)
(0.1207 , 0.8342)
(0.8254 , 0.3454)
(0.7633 , 0.3849) (0.7995 , 0.3782)
t6 (0.1377 , 0.3820) (0.1235 , 0.4691)
(0.3003 , 0.6647) (0.4386 , 0.5280)
(0.6573 , 0.7635) (0.6541 , 0.7294)
(0.5374 , 0.5434) (0.5226 , 0.4763)
t7
(0.8615 , 0.5080) (0.8661 , 0.5039)
(0.3191 , 0.3812) (0.2609 , 0.4378)
(0.7120 , 0.7322) (0.7109 , 0.7315)
(0.2686 , 0.3280) (0.2679 , 0.3248)
t8 (0.7035 , 0.2977) (0.5605 , 0.3932)
(0.2003 , 0.5440) (0.2048 , 0.5457)
(0.5739 , 0.4174) (0.7210 , 0.3342)
(0.7032 , 0.5028) (0.3366 , 0.2520)
t9 (0.2576 , 0.2607) (0.2820 , 0.6196)
(0.3195 , 0.6484) (0.7048 , 0.5406)
(0.7675 , 0.7575) (0.7548 , 0.7842)
(0.2187 , 0.5604) (0.2201 , 0.5650)
t10
(0.5294 , 0.6958) (0.5623 , 0.6627)
(0.6151 , 0.2818) (0.5505 , 0.2867)

As shown in Table 2, at any particular time the positions of the cluster centres obtained by FCM and GK are quite similar. However, the positions of the cluster centres for both algorithms vary from time to time. These cluster centres are well spread throughout the ten seconds. The pattern gives a hint of the generalized case of epilepsy since the positions of the cluster centres are scattered. These positions only show the location of the cluster centres of the EEG signal in two dimensions; the result needs to be investigated further using an inverse projection in order to transform the information into the location of the cluster centres inside the brain.

5. CONCLUSION

This study shows that FCM and GK can be used in identifying the type of epilepsy, either partial or generalized. The spreading pattern of the positions of the cluster centres in both algorithms is similar. This means that FCM and GK can be used to cross-reference the individual results with one another, and both identify generalized epilepsy.

Acknowledgement. This research is supported by LEADS, Universiti Utara Malaysia, with S/O code 12032. We would like to thank Prof. Dr. Tahir Ahmad and Dr. Amidora Idris from Theoretical & Computational Modeling for Complex Systems (TCM), UTM, for providing the EEG data as well as giving ideas and support.

References

[1] SHORVON, S. D., Handbook of Epilepsy Treatment, 2nd ed, Blackwell Publishing Ltd., USA, 2005.
[2] MARKS, D.A.. Classification of Seizure Disorders.In Schulder, M., and Gandhi, C.D. (Ed).Handbook of
Stereotactic and Functional Neurosurgery. New York. Marcel Dekker, Inc, 2003
[3]KUTZ, M., Standard Handbook of Biomedical Engineering & Design, McGraw- Hill.,New York,2003.
[4] SANEI, S ANDCHAMBERS, J.A.,EEG Signal Processing, John Wiley & Sons Ltd.,England, 2007.
[5] FAUZIAH, Z., Dynamic Profiling of EEG Data During Seizure Using Fuzzy Information Space, PhD
Thesis,Universiti Teknologi Malaysia, Skudai, 2008.
[6] NAZIHAH, A. AND TAHIR A., Information Granulation in Biomedical Signal, Proceeding of National Seminar
on Fuzzy Theory and Applications, 2008.
[7] HARIKUMAR, R.AND NARAYANAN, B. S., Fuzzy Techniques for Classicification of Epilepsy Risk Level
from EEG Signals, Proceedings of the IEEE Conference, 2003.
[8] TAHIR, A., RAJA A. F., FAUZIAH, Z.AND HERMAN, I., Selection of a Subset of EEG Channels of Epileptic
Patient During Seizure Using PCA. Proceeding of the 7th WSEAS International Conference on Signal
Processing, Robotics and Automation, 270-273, 2008.
[9] SOLOMON, E. P.,Introduction to Human Anatomy and Physiology, 2nd ed., Elsevier Science,USA,2003.
[10] ZADEH, L. A.,TowardA Theory of Fuzzy Information Granulation and Its Centrality in Human Reasoning and
Fuzzy Logic,Fuzzy Sets and System, 90: 111-127, 1997.
[11] HILAL, Z. I. ANDKUNTALP, M., A Study on Fuzzy C-means Clustering-based Systems in Automatic Spike
Detection, Computers in Biology and Medicine, doi:10.1016/j.compbiomed. 2006.
[12] MURUGAPAN, M., RIZON, M., NAGARAJAN, R., YAACOB, S., ZUNAIDI, I., ANDHAZRY., D., EEG
Feature Extraction for Classifying Emotions using FCM and FKM, International Journal of Computers and
Communications, 2(1), 21-25, 2007.
[13] BEZDEK, J.C.,Pattern Recognition with Fuzzy Objective Function Algorithms, Kluwer Academic
Publishers,USA, 1981.
[14] GUSTAFSON, D. E. AND KESSEL, W. C., Fuzzy Clustering with a Fuzzy Covariance Matrix, Proceedings of
the IEEE Conference on Decision and Control, pp. 761-766, San Diego, Calif, USA, 1979.
[15] NAZIHAH, A., TAHIR A.AND HUSSAIN I. H. M. I., Topologizing The Bioelectromagnetic Field Proceeding
of the 5th Asian Mathematical Conference, PWTC, 2009.
[16] MILLER, D. J., NELSON, C. A., CANNON, M. B., ANDCANNON, K. P., Comparison of Fuzzy Clustering
Methods and Their Application to Geophysis Data. Applied Computational Intelligence and Soft Computing, 1-
16, 2009.
[17] BARGIELA, A. ANDPEDRYCZ, W., Granular Computing: An Introduction, Kluwer Academic Publishers,
USA, 2003.
[18] KAYMAK, U. ANDSETNES, M., Extended Fuzzy Clustering Algorithm. ERIM Report Series Research in
Management, 1-23, 2000.
[19] CHIANG, W. Y., Establishment and Application of Fuzzy Decision Rules: an Empirical Case of the Air
Passenger Market in Taiwan, Int. J. Tiurism Res., doi: 10.1002/jtr.819, 2010.

NAZIHAH AHMAD
Universiti Utara Malaysia.
E-mails: [email protected]

SHARMILA KARIM
Universiti Utara Malaysia.
E-mails: [email protected]

AZIZAN SAABAN
Universiti Utara Malaysia.
E-mails: [email protected]

HAWA IBRAHIM
Universiti Utara Malaysia.
E-mails: [email protected]

KAMARUN HIZAM MANSOR


Universiti Utara Malaysia.
E-mails: [email protected]
Proceedings of ”The 6th SEAMS-UGM Conference 2011”
Statistics, pp. 831 - 848.

RECOMMENDATION ANALYSIS BASED ON SOFT SET FOR


PURCHASING PRODUCTS

R.B. FAJRIYA HAKIM, SUBANAR, EDI WINARKO

Abstract. Purchasing a product is a complex piece of knowledge in decision-making problems under uncertain conditions. Some decisions have to be taken based on a mathematical method. Soft set theory is a new general mathematical method for dealing with uncertain data, proposed by Molodtsov in 1999. However, much soft set research produces an exact solution even when the initial description of the data is an approximate value, in which case it would be better to give a soft solution or recommendation. This paper uses the soft set as a generic mathematical tool to describe the objects or products under consideration in the form of the parameters they need, and multidimensional scaling techniques to give a recommendation or soft solution for a product purchasing system. The proposed product purchasing system uses a simple form of ranking evaluation for each object parameter, which is filled out by the customers themselves, and yields a recommendation based on the soft set which can be used as a suggestion for the customer in taking a decision.

Keywords and Phrases: Soft set theory, multidimensional scaling, recommendation analysis,
clustering.

1. INTRODUCTION

Purchasing products is a complex knowledge-based decision-making problem under uncertain conditions. Before purchasing products or any important things we should conduct some research to get the kind of information we need. A lot of information is produced by magazines or newspaper columns which evaluate or give a rating for some products. On the other side, we also purchase a product based on our own knowledge, preferences, finances or anything else we have that supports us in taking a decision to buy a product. Some decisions may be helped by a method based on mathematical theory. Soft set theory is a new general mathematical theory that deals with decision problems under uncertain data. There are a lot of real-world decision problems that could be defined by a soft set. In our daily lives, deciding what clothes should be worn, what shoes should be used or what books we need to read, up to the government's choice among some decision plans, needs a set as a state of the art. Usually we do not remember the exact attributes of each piece of our clothing; we only remember "the blue one" or "the blue one that makes me look trendy". The same things

happen also, say, when we have a special guest coming to our house. We need to decorate our dining room and buy new chairs and a dining table. We simply go to a furniture store, ask the salesman, and describe what we need for the dining room decoration. Rather than mentioning exact attributes of chairs or tables, we prefer to ask for 'the traditional look', 'the comfortable chairs' and so on. Molodtsov [2] has laid the foundation of a set that can collect different objects under consideration in the form of the parameters they need. For example, (F, E) is a soft set that defines the clothes belonging to someone, which are c1, c2, c3, c4 and c5, with {upper wear, lower wear, blue, black, jeans, formal, sporty} as descriptions of the clothes, where c represents a piece of clothing. Another example: (F, E) is a soft set that defines the dining chairs (dc1, dc2, dc3, dc4, dc5) and their parameters = {solid wood, durable, heavy carved, dark finish, traditional look, luxurious look, easy to clean}. Molodtsov insisted that a soft set can use any parametrization we prefer, such as words and sentences, real numbers, functions and so on. This parametrization leads to a multidimensional topological space; this model space requires not only metric spaces but also non-metric spaces. Because the basic notion of a soft set offers an approximate description of the objects under consideration, the solution of someone's problem based on a soft set should also be a soft decision. However, much of the research using soft set theory in decision-making processes gives an exact solution rather than a soft solution. This paper shows a simple ranking evaluation applied to each object's parameters in the soft set, maps that parametrized family of objects (the houses and their attractiveness parameters of Mr. X in Molodtsov's example) using non-metric multidimensional scaling, and gives a soft solution, or recommendation, based on the soft set that can be used as a suggestion for Mr. X's decision.
The rest of this paper is organized as follows. Section 2 describes the notion of soft set theory. Section 3 presents a review of soft set-based decision-making techniques. Sections 4 and 5 describe the soft set-based recommendation system and the proposed software for it. Finally, the conclusion of this work is given in Section 6.

2. SOFT SET THEORY

Molodtsov [2] first defined a soft set as a family of objects whose definition depends on a set of parameters. Let U be an initial universe of objects and E be the set of adequate parameters in relation to the objects in U. Adequate parametrization is desired to avoid some of the difficulties that arise when using probability theory, fuzzy set theory and interval mathematics, which are commonly used as mathematical tools for dealing with uncertainties. The definition of a soft set is given as follows.

Definition 2.1. (Molodtsov [2]). A pair (F, E) is called a soft set over U if and only if F is a mapping of E into the set of all subsets of the set U.

From the definition, a soft set (F, E) over the universe U is a parameterized family that gives an approximate description of the objects in U. For any parameter e ∈ E, the subset F(e) ⊆ U may be considered as the set of e-approximate elements in the soft set (F, E).

Example 1. Let us consider a soft set (F, E) which describes the "attractiveness of houses" that Mr. X is considering to purchase.
U – is the set of houses under Mr. X's consideration

E – is the set of parameters. Each parameter is a word or a sentence


E = {expensive, beautiful, wooden, cheap, in the green surroundings,
modern, in good repair, in bad repair}

In this example, to define a soft set means to point out the expensive houses, i.e. the houses whose dominant parameter is 'expensive', the houses in green surroundings, i.e. the houses whose surroundings are greener than the others', and so on. Molodtsov [2] also stated that soft set theory takes the opposite approach to what is usually done in classical mathematics, where one constructs a mathematical model of an object and defines the notion of the exact solution of this model. Soft set theory uses an approximate description of the objects under consideration as its starting point and does not need to define the notion of an exact solution. The common mathematical tools for solving complicated decision problems with uncertainties are probability theory, fuzzy set theory and interval mathematics, but each has its difficulties: probability theory requires a large number of trials, fuzzy set theory must set the membership function in each particular case and the nature of the membership function is extremely individual, and interval mathematics constructs an interval estimate of the exact solution of a problem but is not sufficiently adaptable to problems with different kinds of uncertainty. To avoid these difficulties in soft set theory, when someone faces a decision problem with many uncertainties, he or she can express the problem using objects and any information belonging to those objects. This relevant information refers to the necessary parameters of the objects. The necessary parameters may reflect a particular interest, so that he or she can express preference, knowledge, perception or common words about the objects under consideration in a simple way. Parameters attached to the objects are said to be adequate if he or she considers that the information involved is sufficient to elucidate the objects, to give a fair valuation to each object, and then to obtain a suggestion for making a decision. Setting up the objects and their necessary information using words and sentences, real numbers, functions, mappings, etc., is a parametrization process that makes soft set theory applicable in practice.
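As a small illustration of this parametrization, a soft set can be stored directly as a mapping from parameters to subsets of objects. The sketch below is written in R (the software used later in this paper) and is only illustrative: the subsets for 'beautiful' and 'wooden' follow the Maji et al. example quoted in Section 3.1, while the other subsets and all object names are hypothetical.

# A soft set (F, E) over U stored as a named list: each parameter e in E
# is mapped to F(e), the subset of houses that are e-approximate elements.
U <- paste0("H", 1:6)
F.soft <- list(
  beautiful            = c("H1", "H2", "H3", "H4", "H5", "H6"),
  wooden               = c("H1", "H2", "H6"),
  cheap                = c("H2", "H5"),          # hypothetical subset
  "green surroundings" = c("H3", "H4")           # hypothetical subset
)
F.soft[["wooden"]]                                   # the 'wooden'-approximate houses
Reduce(intersect, F.soft[c("beautiful", "wooden")])  # houses holding both parameters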
Maji et al. [1] extended Example 1 to the decision-making problem of choosing among six houses based on the attractiveness of the houses as house parameters. Some parameters absolutely belong to some houses and some parameters absolutely do not. Their example of the house-choosing problem based on soft sets has initiated many important applied and theoretical results in soft set decision-making. However, soft set theory has not yet found the right format for its solutions, because much research uses binary, fuzzy-membership or interval-valued parameter valuations of the objects, which should be avoided as noted by Molodtsov; this is described in the next section.

3. A REVIEW ON DECISION MAKING TECHNIQUES


INVOLVING SOFT SET THEORY

 
Molodtsov's example gives a soft set (F, E) that describes the attractiveness of the houses which Mr. X is going to buy, with U the set of houses and E = {expensive, beautiful, wooden, cheap, in the green surroundings, modern, in good repair, in bad repair} the set of parameters describing the houses under consideration.

3.1. Review on soft set-based decision making. Maji et al. [1] applied the theory of soft sets to solve a decision-making problem encountered by Mr. X. They defined six houses {h1, h2, h3, h4, h5, h6}, each of which has its own parameters; for example h1, h2, h3, h4, h5, h6 are beautiful houses and h1, h2, h6 are wooden houses, etc. Mr. X is interested in several parameters as priorities for buying a house, namely 'beautiful', 'wooden', 'cheap', 'in the green surroundings' and 'in good repair', a subset of E. His decision is based on the maximum number of parameters of the soft set. They then presented the soft set of the attractiveness of the houses which Mr. X is going to buy in a tabular representation, with entry 1 if a house has a particular parameter and 0 if it does not. This quantification means, for example, that house h1 is absolutely beautiful and wooden, or that h3 is absolutely not wooden. The choice of Mr. X is then just a cumulative count over the houses that have all the parameters. At this point, Maji et al. [1][3] treated the parameters as attributes or features of the objects; this assumption, of course, requires a process for transforming a parameter into an attribute or feature of an object. To handle the binary valuation, Maji et al. [1] tried to introduce the W-soft set, or weighted soft set, but this effort did not give a new approach to decision analysis, because the weights are multiplied into each parameter (as an attribute) and hence do not change the final result. Later, Herawan and Mat Deris [4] and Zou and Xiao [5] proved that a soft set can be transformed into a binary information system.
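A minimal sketch of this tabular representation in R: the 0/1 entries below are hypothetical except for the 'beautiful' and 'wooden' columns, which follow the example above, and the choice value is simply the row sum over the parameters Mr. X is interested in.

# Tabular representation of a soft set restricted to Mr. X's parameters:
# entry 1 if the house possesses the parameter, 0 otherwise (values illustrative).
tab <- matrix(c(1,1,0,0,1,    # h1
                1,1,1,0,0,    # h2
                1,0,1,1,1,    # h3
                1,0,0,1,0,    # h4
                1,0,1,0,1,    # h5
                1,1,0,1,1),   # h6
              nrow = 6, byrow = TRUE,
              dimnames = list(paste0("h", 1:6),
                              c("beautiful", "wooden", "cheap",
                                "green surroundings", "good repair")))
choice <- rowSums(tab)       # Maji et al.'s choice value: number of parameters held
names(which.max(choice))     # the single 'exact' choice this paper argues against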
Maji et al. [1] also used rough set theory to reduce the parameters held by every object in the universe. Rather than exploiting the worth of the parameters as necessary information, they preferred to reduce them, and the information carried by the reduced parameters is lost. Molodtsov insisted on the adequacy of the parametrization of the objects of the universe rather than on reducing the parameters that belong to every object. Due to the binary values of the entries, their decision result gives an exact solution rather than a soft solution, which contradicts the philosophy of Molodtsov's original soft set, which insists on an approximate result arising from the soft information accepted in a parametrized family of a soft set.
Chen et al. [16] and Kong et al. [17] also wanted to reduce the parameters of the objects, but Molodtsov had already pointed out that expanding the set of parameters may be useful, since more parameters give a more detailed description of the objects. For example, Mr. X could add the parameter 'distance to office' to the attractiveness of houses; this parameter gives a more detailed description of the houses and may help him to re-decide which house to buy. The reduction of parameters is of little value, since the adequacy of the parameters is crucial in soft set theory for describing the houses. Reducing parameters may remove valuable information from the objects, and it can be used only in special cases; for example, removing the parameters 'expensive' and 'cheap' is allowed when Mr. X knows that all the houses actually have the same price.

3.2. Review on fuzzy soft set-based decision making. Roy and Maji [6] combined fuzzy sets and soft sets, so that fuzzy numbers are used to evaluate the value of the parameter judgment for each object. This idea developed into the hybrid theory of fuzzy soft sets, also initiated by Yang et al. [7]. They argued that, rather than using a {0, 1} value to state that an object holds a parameter, it is better to use a degree of membership to represent the extent to which the object holds the parameter. Since the parameter values of each object are filled in with fuzzy numbers, an expert is needed to determine the membership value that represents the matching number for each house. It becomes even more difficult when the valuations of the parameters of the objects are interval-valued fuzzy
numbers (Feng et al. [15], Jiang et al. [10]). An expert should then give not only the matching number of a parameter but also the lowest and highest numbers as the value of an object's parameters. Molodtsov had stated that these are the natural difficulties when dealing with fuzzy numbers and should be avoided.
Several researchers can be grouped as following these two main ideas: treating the soft set as the attributes of an information system (Herawan and Mat Deris [4], Zou and Xiao [5]) and then using rough sets (soft rough sets) to handle the vagueness in making a decision (Feng et al. [8]), or using fuzzy soft sets (Jun et al. [9], Feng et al. [15] and Jiang et al. [10]). Both groups (Sections 3.1 and 3.2) give an exact solution or best decision rather than a soft solution or recommendation satisfying Molodtsov's soft set philosophy.
This paper uses the soft set as a generic mathematical tool for describing the objects under consideration in the form of the parameters they need, in order to give a recommendation rather than an exact solution. It also shows a simple ranking evaluation applied to each object's parameters in the soft set, maps that parametrized family of objects (the houses and their attractiveness parameters of Mr. X in Molodtsov's example) using non-metric multidimensional scaling, as Nijkamp and Soffer [11] introduced for soft multicriteria decision models, and gives a soft solution, or recommendation, based on the soft set that can be used as a suggestion for Mr. X's, or anyone's, decision.

4. SOFT SET BASED RECOMMENDATION SYSTEM

4.1. Definition of the Soft Set and Soft Solution. From Definition 2.1, let U be an initial universe set and let E be a set of parameters. A pair (F, E) is called a soft set over U if and only if F is a mapping of E into the set of all subsets of the set U. From that definition, a soft set (F, E) over the universe U is a parameterized family that gives an approximate description of the objects in U. For any parameter e ∈ E, the subset F(e) ⊆ U may be considered as the set of e-approximate elements in the soft set (F, E).
As an illustration, let us consider the following example from Molodtsov [2]. A soft set (F, E) describes the attractiveness of the houses which Mr. X is going to buy.
U – is the set of houses under consideration
E – is the set of parameters. Each parameter is a word or a sentence.
E = {expensive, beautiful, wooden, cheap, in the green surroundings, modern, in good repair,
in bad repair}

In this problem, to define a soft set means to point out expensive houses, beautiful houses and so on. The expensive houses are those for which the dominating parameter is 'expensive' compared with the other parameters possessed by the house; the houses in green surroundings are those whose surroundings are greener than the others'; and so on. It is worth noting that the sets F(e) may be arbitrary: some of them may be empty, and some may have nonempty intersections. The solution of the soft set is then a set consisting of a subset of the objects and a subset of the parameters, showing the objects together with their parameters.

Definition 4.1. (soft solution). A pair (F', E') over U' is said to be a soft solution of the soft set (F, E) over U if and only if
i) U' ⊆ U;
ii) {e|U' | e ∈ E} = E', where e|U' is the restriction of the parameter e to U';
iii) F' is a mapping of E' into the set of all subsets of the set U'.

We shall use the notion of the restriction of a parameter e ∈ E' to U' in order to obtain the parameters which dominate an object compared with the other parameters that may be possessed by that object.
We approach the soft solution using the theory of information systems, which has been widely developed by Demri and Orlowska [18] and already has an established theoretical foundation. Soft set theory differs from information systems in that a problem or an object in a soft set is determined by the person dealing with the problem, and then relies on the ability of that person to explain the various things related to the object; those various things are referred to as the object's parameters in the soft set. Meanwhile, an information system is a collection of objects and their properties. That is why a soft set is described as a pair (F, E) over U, with F a mapping of E into the power set of U, instead of as (U, E) with U the set of objects and E the set of attributes, following the structure of information systems (OB, AT), where OB is the set of objects and AT the set of attributes (properties). A formal information system (Demri and Orlowska [18]) may be presented as a structure of the form (OB, AT, (Va)a∈AT, f), where OB is a non-empty set of objects, AT is a non-empty set of attributes, Va is a non-empty set of values of the attribute a, and f is a total function f : OB × AT → ∪a∈AT P(Va) such that for every (x, a) ∈ OB × AT, f(x, a) ⊆ Va. We often use (OB, AT) as a concise notation for the formal structure.
A soft set (F, E) over U might be considered as an information system (U, AT) such that AT = {F} and the values of the mapping F(e), e ∈ E, make available the same information about the objects from U. It is a common thing to identify a wide range of matters (parameters) relating to an object and then create the collection of objects that possess these parameters. To formalize this intuition, for a given soft set S = (F, E) over U we define a soft set formal context S = (U, E, F), where U and E are non-empty sets whose elements are interpreted as objects and parameters (features), respectively, and F ⊆ U × E is a binary relation. If x ∈ U, e ∈ E and (x, e) ∈ F, then the object x is said to have the feature e. If U is finite, the relation F can be naturally represented as a matrix with entries (c(x,e)), x ∈ U, e ∈ E, such that the rows and columns are labeled with objects and object parameters, respectively, and c(x,e) = 1 if (x, e) ∈ F, otherwise c(x,e) = 0. In this setting the soft set formal context provides the mapping ext : P(E) → P(U), which gives extensional information about the objects under consideration: a set of object parameters is expanded, from someone's view, into the set of those objects that possess the parameters.
For every A ⊆ E we define ext(A) = {x ∈ U | (x, e) ∈ F for every e ∈ A}; ext(A) is referred to as the extent of A.
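A small R sketch of the extent mapping, assuming the relation F is stored as the 0/1 context matrix just described (the entries of the example matrix are hypothetical):

# ext(A): objects possessing every parameter in A, computed from a 0/1
# context matrix with rows = objects and columns = parameters.
ext <- function(ctx, A) {
  rownames(ctx)[rowSums(ctx[, A, drop = FALSE]) == length(A)]
}
ctx <- matrix(c(1,0,1,  1,1,0,  0,1,1,  1,1,1), nrow = 4, byrow = TRUE,
              dimnames = list(paste0("x", 1:4), c("e1", "e2", "e3")))
ext(ctx, c("e1", "e2"))   # extent of {e1, e2}
ext(ctx, "e3")            # extent of a single parameter; note Lemma 4.1: larger A, smaller extent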

Lemma 4.1. For all A1, A2 ⊆ E, if A1 ⊆ A2, then ext(A2) ⊆ ext(A1).

Proof. Let A1, A2 ⊆ E with A1 ⊆ A2 and let x ∈ ext(A2), i.e. (x, e) ∈ F for every e ∈ A2. Since A1 ⊆ A2, we have (x, e) ∈ F for every e ∈ A1, that is, x ∈ ext(A1).

Each soft set formal context can be viewed as an information system. Given a soft set formal context S = (U, E, F), we define the information system (OB, AT, (Va)a∈AT, f) determined by S as follows:
- OB = U
- AT = E
- for every a ∈ AT and every x ∈ OB, f(x, a) = {1} if (x, a) ∈ F, and f(x, a) = {0} otherwise.

Any soft set formal context S = (U, E, F) which has been viewed as an information system (OB, AT) can be represented as a soft set information system S = (U, E, F) that contains information about relationships among the parameters of the objects under consideration. These relations reflect various forms of indistinguishability or 'sameness' of objects in terms of their parameters. Let S = (U, E, F) be a soft set information system. For every A ⊆ E we define the following binary relations on U:
- The indiscernibility relation ind(A): for all x, y ∈ U, (x, y) ∈ ind(A) if and only if a(x) = a(y) for all a ∈ A.
- The similarity relation sim(A): for all x, y ∈ U, (x, y) ∈ sim(A) if and only if a(x) ∩ a(y) ≠ ∅ for all a ∈ A.
Intuitively, two objects are A-indiscernible whenever their sets of a-values determined by each parameter a ∈ A are the same, while objects are A-similar whenever they share some parameters. In addition to indistinguishability, we also use a formal distinguishability relation on a soft set information system:
- The diversity relation div(A): for all x, y ∈ U, (x, y) ∈ div(A) if and only if a(x) ≠ a(y) for all a ∈ A.
Objects are A-diverse if all the sets of their parameter values determined by A are different. The information relations derived from a soft set formal context (U, E, F) satisfy the properties below.

Lemma 4.2. For every soft set formal context S = (U, E, F) and every A ⊆ E, the following assertions hold:
- ind(A) is an equivalence relation;
- sim(A) is reflexive and symmetric.

4.2. Application Using Diversity Relations. Let S = (U, E, F) be a soft set information system and A ⊆ E. We say that a parameter a ∈ A is indispensable in A if and only if ind(A) ≠ ind(A − {a}). If a is indispensable in A, the classification of objects with respect to the parameters in A is properly finer than the classification based on A − {a}: the partition based on the indiscernibility of the A-parameters is finer, and never coarser, than the one based on the A-parameters without a. The set A of parameters is independent if and only if every element of A is indispensable in A; otherwise A is dependent. Indispensable parameters play an important role in the soft solution, since the absence of such a parameter causes a fundamental change in the outcome of the soft solution. In Molodtsov's houses example, the parameter 'in the green surroundings' is indispensable, while from each of the pairs 'expensive'/'cheap' and 'in good repair'/'in bad repair' one parameter may be selected. The set of all parameters indispensable in A is referred to as the parameter core of A in the system S:

Core_S(A) = {a ∈ A | ind(A) ≠ ind(A − {a})}.

Let S be a soft set information system. The discernibility matrix of S is the matrix (c_{x,y})_{x,y∈U} with entries c_{x,y} = {e ∈ E | (x, y) ∈ div(e)}, so that c_{x,y} = c_{y,x} and c_{x,x} = ∅. The columns and rows of the matrix are labeled with objects, and its entries are the sets c_{x,y}.
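The discernibility matrix and the parameter core can be computed directly from a table of the parameter values a(x). The sketch below is ours, not part of the paper: it assumes the values are stored with one row per object and one column per parameter, and the small example data are hypothetical.

# Discernibility matrix: c[x, y] = set of parameters on which x and y differ.
disc_matrix <- function(vals) {
  n <- nrow(vals); params <- colnames(vals)
  out <- matrix(vector("list", n * n), n, n,
                dimnames = list(rownames(vals), rownames(vals)))
  for (x in 1:n) for (y in 1:n)
    out[[x, y]] <- if (x == y) character(0) else
      params[as.vector(vals[x, ] != vals[y, ])]
  out
}
# Core (Theorem 4.1): parameters that appear alone in some entry c[x, y].
core <- function(vals) {
  cm <- disc_matrix(vals)
  unique(unlist(Filter(function(s) length(s) == 1, cm)))
}
vals <- data.frame(expensive = c(3, 1, 3), cheap = c(1, 3, 3), modern = c(2, 2, 2),
                   row.names = c("H1", "H2", "H3"))   # hypothetical values
core(vals)   # parameters that alone distinguish some pair of objects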

Lemma 4.3. Let S = (U, E, F) be a soft set information system and let B ⊆ A ⊆ E. Then the following assertions hold:
i) (x, y) ∈ ind(A) iff c_{x,y} ∩ A = ∅;
ii) ind(B) ⊆ ind(A) iff for all x, y ∈ U, c_{x,y} ∩ A ≠ ∅ implies c_{x,y} ∩ B ≠ ∅;
iii) if B ⊆ A, then ind(B) = ind(A) iff for all x, y ∈ U, c_{x,y} ∩ A ≠ ∅ implies c_{x,y} ∩ B ≠ ∅.

Proof (iii). Let B ⊆ A ⊆ E. Since B ⊆ A we always have ind(A) ⊆ ind(B). By (ii), the stated condition is equivalent to ind(B) ⊆ ind(A), and hence to ind(B) = ind(A).

The above lemma enables us to find the core of a set of parameters; namely, we have the following theorem.

Theorem 4.1. Let S = (U, E, F) be a soft set information system and let A ⊆ E. Then the following assertion holds:

Core_S(A) = {a ∈ A | c_{x,y} ∩ A = {a} for some x, y ∈ U}.

Proof. Let a ∈ A. Since (A − {a}) ⊆ A, by Lemma 4.3(iii), ind(A − {a}) = ind(A) iff for all x, y ∈ U, c_{x,y} ∩ A ≠ ∅ implies c_{x,y} ∩ (A − {a}) ≠ ∅. Hence ind(A − {a}) ≠ ind(A) iff there are x0, y0 ∈ U such that c_{x0,y0} ∩ A ≠ ∅ and c_{x0,y0} ∩ (A − {a}) = ∅, i.e. iff there are x0, y0 ∈ U such that c_{x0,y0} ∩ A = {a}.

This theorem says that a parameter a ∈ Core_S(A) iff there are x, y ∈ U such that a is the only parameter that allows us to make a distinction between x and y; in other words, the only division between x and y is provided by their a-values.
We have already presented several reasons why a soft set might be treated as an information system. Other researchers directly put a value on each parameter, by treating the object parameters as attributes of an information system and giving a value (binary, fuzzy or interval-valued numbers) to each attribute. Bringing the soft set into the structure of an information system allows us to find a soft solution. In this work we use ranking values, much like simple fuzzy numbers, to evaluate the parameters in the soft set information system; this can be done by anyone, rather than directly setting membership values, which may require an expert, for each object under consideration. A soft solution is expected to come out in this way. This set of soft solutions will be used in the recommendation system, which is defined as follows.

Definition 4.2. (regions of recommendation). Given a pair (F', E') over U' which is a soft solution of the soft set (F, E) over U, a set X ⊆ U' of objects and a set A ⊆ E' of parameters, the regions of recommendation are defined as the A-parameters which are sufficient to classify all the elements of U' either as members or non-members of X. Since the A-parameters might not be able to distinguish sufficiently between individual objects, clusters of objects may be used rather than individual objects.


Definition 4.3. (recommendation systems). A recommendation system (R-system) of a soft set (F, E) over U which has a soft solution (F', E') over U', where U' ⊆ U, E' ⊆ E and F' ⊆ F, is a structure of the form R = (U', F', E'), where U' is a non-empty set of objects under consideration, E' is the restriction of the parameters E, and F' is a mapping of E' into the set of all subsets of the set U'.
In the example of Mr. X's houses, the attractiveness of the houses is Mr. X's consideration in choosing a house to buy. To describe the attractiveness of each house under consideration he considers several parameters. Those parameters are, from his point of view, adequate for evaluating the houses, say {expensive, beautiful, wooden, cheap, in the green surroundings, modern, in good repair, in bad repair}. The adequate parameters used to portray the houses form a parametrized family; this means that the objects in the set U are defined by the set of parameters. The number of parameters that represent and portray the objects under consideration depends on someone's view; the number of parameters of the houses that Mr. X is attracted to depends on him. A set of parameters is regarded as minimal if and only if it can still describe the objects under consideration, while a set of parameters is not minimal if those parameters cannot describe the objects under consideration at all.

Example 2. There are six houses under consideration (H1, H2, H3, H4, H5, H6) which Mr. X may buy. It is clear that his evaluation of each house is not very accurate; it is 'soft', because he uses whatever qualitative and quantitative information or knowledge he has to judge each house. He must then give a valuation for each house based on those parameters. The usual way of dealing with such an evaluation problem is to use a ranking, a rating or a membership degree for the objects, which allows someone to express preference, knowledge or perception in an easy and fairly flexible way. This soft expression affects his evaluation of each house parameter. The simplest option, when someone is forced to choose, is to give a ranking or rating to each house for each parameter. A tabular representation of the houses and parameters is useful for describing Mr. X's evaluation; it is filled in with asterisks, with the convention 'more asterisks, better fit to the parameter'.

Table 1. Evaluation of object parameters (ratings from * to ***; more asterisks means the house better meets the parameter)

U/E   expensive   beautiful   wooden   cheap   in the green surroundings   modern   in good repair   in bad repair
H1    ***         **          ***      *       **                          **       ***              *
H2    *           **          *        ***     **                          *        *                ***
H3    ***         **          *        *       ***                         **       ***              *
H4    **          ***         *        **      ***                         **       ***              *
H5    *           *           *        ***     **                          **       *                ***
H6    **          **          *        **      *                           **       **               **

From Table 1 we can read Mr. X's evaluation. H1 is as expensive as H3, even though the actual prices of the two houses are not exactly the same; H1 and H3 have the highest prices, H4 and H6 are in the middle range, and H2 and H5 are the lowest in Mr. X's evaluation. Some of the parameters are opposites, i.e. expensive and cheap, in good repair and in bad repair. It is difficult to identify and directly obtain a solution from this table because the evaluation is qualitative; the main decision problem is that the differences between the houses are relative evaluations of how well each house meets the parameters. Collecting the entries of each cell in the table according to the number of asterisks gives the number of each house's rankings that meet the parameters.

Table 2. Cumulative evaluation (total number of parameters rated ***, ** and * for each house)

U     ***   **   *
H1    3     3    2
H2    3     2    3
H3    3     2    3
H4    3     3    2
H5    2     2    4
H6    0     6    2

From Table 2, H1 and H4, and likewise H2 and H3, have the same highest attractiveness over several parameters, but it is difficult to compare them and to know exactly which parameters are involved. H6 has the highest value in the middle preference. Table 2 can serve as a soft solution, or recommendation, for Mr. X to choose one of them; the soft set ranking solution, or recommendation, is {H1 and H4, H2 and H3, H6, H5}. This solution is very general. Suppose Mr. X assigns several parameters as priority parameters, the ones he thinks are most important when buying a house: say he gives priority to the parameters 'beautiful', 'cheap', 'modern' and 'in good repair'. Accumulating all the entries of each cell of Table 1 according to his priority parameters gives Table 3.

Table 3. Accumulation table based on priority parameters

        Priority (P) parameters      Normal priority (N) parameters
U       ***    **    *               ***    **    *
H1      1      2     1               2      1     1
H2      1      1     2               2      1     1
H3      1      2     1               2      0     2
H4      2      2     0               1      1     2
H5      1      1     2               1      1     2
H6      0      4     0               0      2     2

From Table 3 there are some recommendations for Mr. X: H4, which dominates the other houses with two priority parameters at three asterisks and two at two asterisks; then H1 and H3, or H2 and H5, which have the same values on the priority parameters; and finally H6, which is interesting because, even though it has no three-asterisk ratings, all its priority parameters are in the middle. The soft solution offered to Mr. X, from which he may choose, can thus be clustered into four groups {(H4), (H1 and H3), (H2 and H5), (H6)}.
To make better use of the information in the tables and provide added value to our recommendation to Mr. X, we use multidimensional scaling techniques. Non-metric multidimensional scaling is a common technique based on ordinal or qualitative rankings of similarity data (Kruskal [12]). Therefore, Table 1 needs to be transformed into an ordinal table.

Table 4. Ordinal numbers of Table 1

U/E   expensive   beautiful   wooden   cheap   in the green surroundings   modern   in good repair   in bad repair
H1    3           2           3        1       2                           2        3                1
H2    1           2           1        3       2                           1        1                3
H3    3           2           1        1       3                           2        3                1
H4    2           3           1        2       3                           2        3                1
H5    1           1           1        3       2                           2        1                3
H6    2           2           1        2       1                           2        2                2

Using the software R (R Development Core Team [13]) with the vegan package and the metaMDS procedure (Dixon and Palmer [14]), we obtain the mapping of Mr. X's view of the houses and parameter rankings shown in Figure 1.
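The ordination of Figure 1 can be reproduced along the following lines; this is only a sketch (the authors' exact script is not given), re-entering the ordinal scores of Table 4 into a matrix we call houses and using the metaMDS procedure of the vegan package cited above.

library(vegan)   # provides metaMDS (Dixon and Palmer [14])

houses <- matrix(c(3,2,3,1,2,2,3,1,
                   1,2,1,3,2,1,1,3,
                   3,2,1,1,3,2,3,1,
                   2,3,1,2,3,2,3,1,
                   1,1,1,3,2,2,1,3,
                   2,2,1,2,1,2,2,2),
                 nrow = 6, byrow = TRUE,
                 dimnames = list(paste0("H", 1:6),
                                 c("expensive", "beautiful", "wooden", "cheap",
                                   "green surroundings", "modern",
                                   "good repair", "bad repair")))

fit <- metaMDS(houses, k = 2)   # non-metric MDS of houses and parameters
plot(fit, type = "t")           # text labels for houses (sites) and parameters (species)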

[Figure 1 is a non-metric MDS ordination (axes NMDS1, NMDS2) showing the houses H1-H6 together with the parameters expensive, beautiful, wooden, cheap, green surroundings, modern, good repair and bad repair; priority parameters are marked (P).]

Figure 1. Multidimensional scaling plot of houses and its parameters

From Figure 1 we get an illustration of the houses and their parameters: H3 and H4 lie near 'in the green surroundings', 'beautiful' and 'in good repair', two of which are priority parameters; H2 tends towards the 'cheap' parameter (a priority parameter); H5 is close to the parameter 'bad repair'; while H1 is far from the priority parameters and H6 sits in the middle of the evaluation. From this illustration we can determine a set as a solution of the soft set, which we call a soft solution. A set (F', E') is called a soft solution for the soft set (F, E) if and only if E' is a subset of E and F' is a domination mapping of E' into the set of all subsets of the set U. Domination is understood in the following way. Let V and W be sets of parameter values, where V = {v1, v2, ...} and W = {w1, w2, ...} are subsets of E'. We say that a set of parameters V dominates W on a set of all subsets of U if and only if vi ≥ wi for every i and there exists an index j such that vj > wj.
Houses H3 and H4 are close to the parameters 'green surroundings', 'beautiful' and 'good repair', but H4 is dominated by 'green surroundings' and 'beautiful' while H3 is dominated by 'good repair'. This can be explained as follows: let V = {green surroundings, beautiful, good repair} and W = {good repair}; H4 is not dominated by the whole of V because there is v3 = 'good repair', equal to w1 = 'good repair' with w1 ∈ W, which dominates H3. House H1 inclines to the parameters 'expensive' and 'modern', H6 is in the middle of the preferences, H2 is dominated by the parameter 'cheap', and H5 is dominated by the parameter 'bad repair'. The solution of the soft set for Mr. X's problem is then
Soft solution (F', E') = {(green surroundings, beautiful) = H4, (good repair) = H3, (expensive, modern) = H1, (cheap) = H2, (bad repair) = H5, ( ) = H6}.
This set of soft solutions is used as a basis for the recommendations to Mr. X for choosing a house. So the recommendation for Mr. X is
{(H1, H2, H3, H4, H5, H6), (expensive, beautiful, wooden, cheap, in the green surroundings, modern, in good repair, in bad repair), (green surroundings, beautiful) = H4, (good repair) = H3, (expensive, modern) = H1, (cheap) = H2, (bad repair) = H5, ( ) = H6}.
Say Mrs. X agrees with her husband, Mr. X, about the parameters used and their evaluation for each house, but she has different priority parameters, namely (expensive, beautiful, in the green surroundings, in good repair). Compromising between their two sets of priority parameters, the recommendations are: choose H4, which has two dominant priority parameters, 'green surroundings' and 'beautiful', or H3, which has one dominant priority parameter, 'in good repair'.
To see the grouping of the houses based on the parameters in Mr. X's view, we can form soft clusters of houses. Using three hierarchical clustering techniques, namely single, complete and average linkage, we draw the dendrograms of the houses (Figure 2).

[Figure 2 shows three dendrograms of the houses H1-H6 ('houses cluster', height versus houses.dis) obtained with hclust using single, complete and average linkage.]

Figure 2. Single, complete and average linkage hierarchical clusterings

With single and complete linkage the houses separate into three groups, (H2, H5), (H1, H3, H4) and (H6), while with average linkage they fit into two groups, (H1, H3, H4, H6) and (H2, H5). With the help of the functions rect.hclust and cutree from base R (R Development Core Team [13]), which visualize the cut of the dendrogram and produce a classification vector with three classes for the complete linkage dendrogram, we obtain the result in Figure 3, which shows the separation into three groups; the same functions can be applied to the average or single linkage dendrograms. Figure 3 shows the three soft clusters of houses, (H1, H3, H4), (H2, H5) and (H6). The first group is dominated by the parameters 'green surroundings', 'beautiful', 'good repair' and 'expensive', the second group is dominated by the parameters 'cheap' and 'bad repair', while the last group is in the middle of the preferences.
[Figure 3 shows the complete linkage dendrogram (height versus houses.dis, hclust with method "complete") cut into the three groups (H1, H3, H4), (H2, H5) and (H6).]

Figure 3. Three groups of houses using complete linkage technique
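The clustering step can be sketched with base R as follows, reusing the houses matrix from the MDS sketch above. The text does not state which dissimilarity was used for houses.dis, so the default Bray-Curtis dissimilarity of vegan is assumed here purely for illustration.

library(vegan)

houses.dis <- vegdist(houses)                 # assumed dissimilarity between houses
hc <- hclust(houses.dis, method = "complete") # also try "single" and "average"
plot(hc, main = "houses cluster")
rect.hclust(hc, k = 3)                        # draw the three-group cut of Figure 3
groups <- cutree(hc, k = 3)                   # classification vector with 3 classes
split(names(groups), groups)                  # soft clusters, e.g. (H1, H3, H4), (H2, H5), (H6)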

5. THE PROPOSED SOFTWARE


SOFT SET RECOMMENDATION SYSTEM

In this section we illustrate a user interface for a soft set recommendation system for purchasing furniture products in a furniture store. Buyers are offered assistance from a furniture expert in choosing products. The system displays all collections of furniture items, as shown in Figure 4. In this example, a buyer wants to see the collection of dining chairs; she simply clicks the chairs of interest and the salesman brings or shows the items (see the selected items column in Figure 5). Buyers can try out and feel the comfort of each chosen chair. Buyers can also state their own requirements for their dining chairs; in this example, someone has several things she regards as prerequisites for a dining chair, i.e. 'match with my dining room decoration', 'fit the space of my dining room', 'cheap', 'comfort', 'classic' and 'bright wood color'. Those necessities can be regarded as parameters of each chair. She considers this information, knowledge or set of parameters sufficient for choosing the chair she needs for inviting a special guest to dinner. She does not need to become an expert before buying a dining chair, does not need to estimate interval valuations for each chair criterion, and does not need to try each chair for a long time before deciding which one to buy. The simple way to evaluate the selected items is to compare them in a fairly flexible way by giving more marks to the chairs that better meet her requirements.

[Figure 4 shows the first page of the proposed system ('Raden Furniture # Recommendation System Purchasing #'): a customer identification panel (name, address, need for expert assistance, special request, collection to buy) next to the list of collections (dining chairs, table dining chairs, sofa, dressing table).]

Figure 4. Soft Set Recommendation System first page


[Figure 5 shows the customer evaluation page: the selected items are scored with asterisks ('more asterisks, more meets the request') against the customer's requested parameters (match the dining room decoration, fit the space of the dining room, cheap, comfort, classic, wood colour), the expert's suggested parameters are listed, and the system returns recommendations such as 'tend to match' or 'tend to cheap and match', together with a 'people who bought the items also bought' list.]

Figure 5. Evaluation Page by Customer and its Recommendation

6. CONCLUSION

In this paper we have introduced an approach different from that of other researchers, who give binary or fuzzy evaluations to the parameters of soft sets. Binary or fuzzy evaluation of the parameters of a soft set produces an exact solution, whereas Molodtsov emphasized an approximate solution for soft sets rather than an exact one. The recommendation, or soft solution, describes the dominant parameters for each choice. Our example shows strictly dominant parameters; we call a dominant parameter strict when a set of all subsets of U has a set of parameters with empty intersection with the other sets of parameters. Another kind of dominant parameter is a weak dominant parameter, which allows a set of all subsets of U to have a set of parameters with nonempty intersection with other sets of parameters. The most important part of a soft set is an adequate set of parameters for the objects under consideration, determined by the person dealing with the decision-making problem. An adequacy test is needed to test the adequacy of the parameters of a soft set. The object parameters of a soft set are called minimal if removing one parameter results in failure to give a soft solution. A parameter is called adequate if and only if it can describe the objects under consideration, while a set of parameters may be called inadequate if and only if those parameters cannot describe the objects under consideration. This paper has shown a useful study using a simple approximation in soft set theory for a decision-making problem.

References

[1] MAJI, P.K., A.R. ROY AND R. BISWAS, (2002), An Application of Soft Sets in A Decision Making Problem,
Computers and Mathematics with Applications 44: 1077-1083.
[2] MOLODTSOV, D (1999), Soft Set Theory – First Results, Computers and Mathematics with Applications 37: 19
– 31.
[3] MAJI, P.K., A.R. ROY AND R. BISWAS, (2003), Soft Sets Theory, Computers and Mathematics with
Applications 45: 555-562.
[4] HERAWAN, T. AND DERIS, M.M., (2011), A soft set approach for association rules mining, Knowledge Based
Systems 24: 186 – 195
[5] ZOU, YAN AND XIAO, ZHI, (2008) Data Analysis Approaches of Soft Sets under Incomplete Information,
Knowledge Based Systems 21: 941 -945,
[6] ROY, A.R. AND P.K. MAJI(2007), A Fuzzy Soft Set Theoretic Approach to Decision Making Problems,
Computational and Applied Mathematics 203: 412-418
[7] YANG, X., D. YU, J. YANG AND C. WU, (2007), Generalization of soft set theory: From crisp to fuzzy case, in Fuzzy Information and Engineering (ICFIE), ASC 40: 345-354
[8] FENG, FENG, XIAOYAN LIU, VIOLETA LEOREANU-FOTEA AND YOUNG BAE JUN, (2011), Soft sets and soft rough sets, Information Sciences 181: 1125 – 1137
[9] JUN, YOUNG BAE; KYOUNGJA LEE AND CHUL HWAN PARK, (2010) Fuzzy soft sets theory applied to BCK/BCI-
algebras, Computers and Mathematics with Applications 59: 3180-3192
[10] JIANG, YUNCHENG; YONG TANG, QIMAI CHEN (2011) An adjustable approach to intuitionistic fuzzy soft sets
based decision making, Applied Mathematical Modelling 35: 824-836
[11] NIJKAMP, PETER AND ASTRID SOFFER (1979) Soft Multicriteria Decision Models for Urban Renewal Plans, Research Memorandum no. 1979-5, Paper Sistemi Urbani, Torino.
[12] KRUSKAL, J. B. (1964) Nonmetric multidimensional scaling: A numerical method, Psychometrika 29, (1964),
pp. 115-29.
[13] R DEVELOPMENT CORE TEAM (2006), R: A language and environment for statistical computing. R Foundation
for Statistical Computing, Vienna, Austria. URL http://www.r-project.org/.
[14] DIXON, P., PALMER, M.W., (2003). Vegan, a package of R function for community ecology, Journal of
Vegetation Science 14, 927-930.
[15] FENG, F., JUN, Y.B., LIU, X., LI, L. (2010): An adjustable approach to fuzzy soft set based decision making. Journal of Computational and Applied Mathematics 234, 10–20.
[16] CHEN, D., TSANG, E.C.C., YEUNG, D.S., AND WANG, X. (2004). The Parameterization Reduction of Soft Sets
and its Applications, Computers and Mathematics with Applications 49 (2005) 757-763.
[17] KONG, Z., L. GAO, L. WANG AND S. LI, (2008), The normal parameter reduction of soft sets and its algorithm, Comput. Math. Appl. 56 (12): 3029-3037.
[18] DEMRI, S. P. AND ORLOWSKA, E. S., 2002, Incomplete Information: Structure, Inference, Complexity, Springer-
Verlag, Berlin Heidelberg, New York.

R.B. FAJRIYA HAKIM


Statistics Department, Faculty of Mathematics and Natural Sciences
Universitas Islam Indonesia,
Jalan Kaliurang KM 14.5 Sleman, Jogjakarta, Indonesia, 55584.
e-mail: [email protected]

SUBANAR
Mathematics Department, Faculty of Mathematics and Natural Sciences,
Universitas Gadjah Mada
Sekip Utara, Jogjakarta, Indonesia 55528
e-mail: [email protected]

EDI WINARKO
Mathematics Department, Faculty of Mathematics and Natural Sciences,
Universitas Gadjah Mada
Sekip Utara, Jogjakarta, Indonesia 55528
e-mail: [email protected]
Proceedings of ”The 6th SEAMS-UGM Conference 2011”
Statistics, pp. 849 - 858.

HETEROSCEDASTIC TIME SERIES MODEL


BY WAVELET TRANSFORM

RUKUN SANTOSO, SUBANAR, DEDI ROSADI, SUHARTONO

Abstract. Box-Jenkins modeling is the standard method for stationary time series. When the variance varies over time, the ARCH model is proposed to capture the time series structure (Engle, 1982); in 1986, Bollerslev generalized it to the GARCH model. The standard approach of these models introduces an exogenous variable along with some assumptions. This paper proposes an alternative solution for the case where the exogenous variable does not fulfil these assumptions. The discrete wavelet transform can be used to analyze the time series structure when the sample size is an integer power of 2; when the sample size is arbitrary, the undecimated wavelet transform is proposed.
Keywords and Phrases :

1. INTRODUCTION

Volatility in time series data is indicated by a variance that changes over time, i.e. there is heteroscedasticity in the data. Under this condition the data cannot be modeled by the Box-Jenkins method directly. The earliest heteroscedastic model was proposed by Engle [5] in 1982 and is called the Autoregressive Conditional Heteroscedasticity (ARCH) model. The heteroscedastic properties of the data are captured by an AR(p) model of the error component,

ε_t = σ_t v_t,   σ_t² = α_0 + α_1 ε_{t−1}² + ... + α_p ε_{t−p}²   (1)

where {v_t} is an iid sequence with mean 0 and variance 1, α_0 > 0, and α_i ≥ 0 for i > 0. In practice, v_t is usually assumed to follow the standard normal or a standardized Student-t distribution. Model (1) was developed further by Bollerslev [2] in 1986 under the assumption that σ_t² follows an ARMA(p, q) model. This paper does not cover the complete solution of ARCH/GARCH models, but gives an alternative solution for when the assumption on v_t is violated. The wavelet method studied here belongs to nonparametric modeling, which is free of distributional assumptions.
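For concreteness, an ARCH(1) path following model (1) can be simulated in R as below; this is only an illustrative sketch with hypothetical parameter values, not part of the original analysis.

# Simulate ARCH(1): e_t = s_t * v_t,  s_t^2 = a0 + a1 * e_{t-1}^2
set.seed(1)
n  <- 500
a0 <- 0.1; a1 <- 0.6              # hypothetical parameters, a0 > 0, a1 >= 0
v  <- rnorm(n)                    # iid innovations, mean 0 and variance 1
e  <- numeric(n); s2 <- numeric(n)
s2[1] <- a0 / (1 - a1)            # start at the unconditional variance
e[1]  <- sqrt(s2[1]) * v[1]
for (t in 2:n) {
  s2[t] <- a0 + a1 * e[t - 1]^2   # conditional variance
  e[t]  <- sqrt(s2[t]) * v[t]
}
plot.ts(e, main = "Simulated ARCH(1) errors")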

2. WAVELET AND FILTERING

A wavelet is a small wave function that can build an orthonormal basis for L²(R), so that every function f ∈ L²(R) can be expressed as a linear combination of wavelets [4]

f(t) = Σ_{k∈Z} c_{J,k} φ_{J,k}(t) + Σ_{j≤J} Σ_{k∈Z} w_{j,k} ψ_{j,k}(t)   (2)
     = S_J + D_J + D_{J−1} + ... + D_1

where φ and ψ are the father and mother wavelet, respectively, with dilation and translation indexes

φ_{j,k}(t) = 2^{−j/2} φ(2^{−j} t − k)   (3a)
ψ_{j,k}(t) = 2^{−j/2} ψ(2^{−j} t − k).  (3b)

In the discrete version, wavelets can be used to construct an orthonormal filter matrix so that every discrete realization of f ∈ L²(R) can be decomposed into a scaling or smooth component (S) and detail components (D) [8].
Let h = [h_0, h_1, ..., h_{L−1}] be a wavelet filter; then the scaling filter g can be derived from h by

g_i = (−1)^{i+1} h_{L−1−i},  i = 0, 1, ..., L−1.   (4)

For example, the Haar wavelet filter h and its scaling filter g are

h = [1/√2, −1/√2]  and  g = [1/√2, 1/√2].   (5)
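Relation (4) and the Haar example (5) can be checked with a few lines of R; this is a sketch only, with h taken as the Haar wavelet filter in the convention assumed above.

# Scaling filter from a wavelet filter via (4): g_i = (-1)^(i+1) * h_(L-1-i)
h <- c(1, -1) / sqrt(2)                  # Haar wavelet filter
L <- length(h)
i <- 0:(L - 1)
g <- (-1)^(i + 1) * h[(L - 1 - i) + 1]   # +1 because R indexes from 1
g                                        # 1/sqrt(2), 1/sqrt(2), as in (5)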
Let {Z_t} be the time series given by a discrete-time realization of f ∈ L²(R) with t = 1, 2, ..., N, N = 2^J. Then the coefficients w_{j,k} and c_{J,k} in equation (2) can be computed by the discrete wavelet transform (DWT) as shown in (6), where H is an N×N filter matrix built from the level-j wavelet filters h^(j) and the level-J scaling filter g^(J):

            [ h_1^(1)       h_0^(1)       0        ...   0   h_{L−1}^(1)  ...  h_2^(1) ]   [ Z_1 ]
            [ h_3^(1)       h_2^(1)       h_1^(1)  h_0^(1)  0   ...  h_5^(1)   h_4^(1) ]   [ Z_2 ]
            [    ...                                                                   ]   [  .  ]
W = H Z  =  [ 0   0   ...   h_{L−1}^(1)   h_{L−2}^(1)   ...   h_1^(1)          h_0^(1) ] . [  .  ]   (6)
            [ h_3^(2)       h_2^(2)       h_1^(2)  h_0^(2)  ...  h_{3L−2}^(2) ... h_4^(2) ] [  .  ]
            [    ...                                                                   ]   [  .  ]
            [ h_{2^J−1}^(J)  h_{2^J−2}^(J)   ...                 h_1^(J)       h_0^(J) ]   [  .  ]
            [ g_{2^J−1}^(J)  g_{2^J−2}^(J)   ...                 g_1^(J)       g_0^(J) ]   [ Z_N ]
The up-sampled version of h, denoted h_up, is constructed by inserting a zero between the non-zero filter values. The filter at a higher level (j = 2, 3, ..., J) is obtained by convolving h_up with g, as shown in (7):

h^(j) = (h^(j−1))_up * g.   (7)

For example, in the Haar case, h^(2) = [1/√2, 0, −1/√2] * [1/√2, 1/√2] = [1/2, 1/2, −1/2, −1/2].
When the sample size is not of the form 2^J, J ∈ Z, the coefficients w_{j,k} and c_{J,k} can be computed by the undecimated wavelet transform (UDWT). The scheme of the UDWT for j = 1 is shown in Figure 1. The wavelet coefficients w_{1,k} result from the convolution of the time series Z with h, and the first detail component D_1 from the convolution of w_{1,k} with h', where h' is the time-reversed version of h. The scaling coefficients c_{1,k} result from the convolution of Z with g, and the first smooth component S_1 from the convolution of c_{1,k} with g', where g' is the time-reversed version of g. Furthermore, Ẑ = S_1 + D_1 equals Z with respect to this wavelet filtering.

Figure 1. Algorithm of UDWT at level j=1

A higher level of the UDWT is constructed by splitting the scaling coefficients c_{j,k} into c_{j+1,k} and w_{j+1,k}. The UDWT for j = 2 is shown in Figure 2. Furthermore, Ẑ = S_2 + D_2 + D_1 equals Z with respect to this wavelet filtering.

Figure 2. Algorithm of UDWT at level j=2

The number of DWT coefficients at level j + 1 is half the number at level j. On the other hand, the number of UDWT coefficients is the same at every decomposition level. This property makes the UDWT more suitable than the DWT for analyzing time series. In the remainder of this paper we therefore discuss wavelet-based prediction of time series using the UDWT only.

3. WAVELET-BASED PREDICTION MODEL

The prediction of Z at time t + 1 is made from the realization of Z in the past and the wavelet coefficients resulting from the decomposition. Starck [9] proposes that the wavelet coefficients needed at each level j for forecasting at time t + 1 have the form w_{j, N−2^j(k−1)} and c_{J, N−2^J(k−1)}. The forecasting formula is expressed in equation (8):

Ẑ_{N+1} = Σ_{j=1}^{J} Σ_{k=1}^{A_j} â_{j,k} w_{j, N−2^j(k−1)} + Σ_{k=1}^{A_{J+1}} â_{J+1,k} c_{J, N−2^J(k−1)}   (8)

The highest level of the decomposition is denoted by J, and A_j indicates the number of coefficients chosen at level j. For example, if J = 4 and A_j = 2 for j = 1, 2, 3, 4, then (8) can be written as (9):

Ẑ_{N+1} = â_{1,1} w_{1,N} + â_{1,2} w_{1,N−2} + â_{2,1} w_{2,N} + â_{2,2} w_{2,N−4}
        + â_{3,1} w_{3,N} + â_{3,2} w_{3,N−8} + â_{4,1} w_{4,N} + â_{4,2} w_{4,N−16}   (9)
        + â_{5,1} c_{4,N} + â_{5,2} c_{4,N−16}

Furthermore, the least squares method can be used to estimate the coefficients a_{j,k} in equations (8) and (9).

4. IMPLEMENTATION AND RESULT

The USD-to-IDR currency exchange data are used to implement the proposed method. The daily equivalent value of $1 in IDR over the year 2003 is modeled according to equation (9). Statistical tests are performed to check that the data are suitable for this purpose.
The actual data, shown in Figure 4, indicate that the series comes from a non-stationary process. The standard Box-Jenkins method proposes differencing the data; the result of one-lag differencing is shown in Figure 3, which gives a sign of heteroscedasticity. The ACF and PACF plots give the indication that neither AR nor MA terms are significant, so it appears that the data can be modeled as Z_t = ε_t, where the ε_t are not normally distributed. The Ljung-Box tests for {ε_t} and {ε_t²} indicate that the {ε_t} are independent but the {ε_t²} are dependent, so it can be concluded that heterogeneity of variances occurs. GARCH(1,1) appears to be the closest model for {Z_t}, but the Jarque-Bera test does not support the residual normality assumption. Finally, it is concluded that the standard ARIMA and GARCH models fail to capture the data pattern.
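These diagnostics can be reproduced along the following lines (a sketch, not the authors' script): kurs2003 is the exchange-rate series listed at the end of the paper, the lag of 12 in the Ljung-Box tests is an arbitrary choice of ours, and jarque.bera.test requires the tseries package.

library(tseries)                      # for jarque.bera.test

d <- diff(kurs2003)                   # one-lag differenced series (Figure 3)
acf(d); pacf(d)                       # no clearly significant AR or MA structure

Box.test(d,   lag = 12, type = "Ljung-Box")   # test for dependence in {e_t}
Box.test(d^2, lag = 12, type = "Ljung-Box")   # dependence in {e_t^2} signals ARCH effects

jarque.bera.test(d)                   # normality check (the paper applies it to the GARCH residuals)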

Figure 3. One Lag Differencing Value

As discussed above, it is a long way to a final solution with parametric modeling. Next we present a simpler way to build a prediction model in the nonparametric sense, specifically a wavelet-based model. Although wavelet computation theory is complicated, there is software that makes it easy to apply. The step-by-step modeling algorithm is as follows.

1. Exploring wavelet coefficients

The wavelets R package by Aldrich [1] is used to compute the UDWT wavelet coefficients, with sample size N = 243 and decomposition level J = 4.

2. Collecting the selected coefficients

For i ranging from 1 to N − 17, the following vectors of selected coefficients are defined.
w1=vector of wavelet coefficients at scale 1 with index of the form i+16
w2= vector of wavelet coefficients at scale 1 with index of the form i+14
w3= vector of wavelet coefficients at scale 2 with index of the form i+16
w4= vector of wavelet coefficients at scale 2 with index of the form i+12
w5=vector of wavelet coefficients at scale 3 with index of the form i+16
w6= vector of wavelet coefficients at scale 3 with index of the form i+8
w7= vector of wavelet coefficients at scale 4 with index of the form i+16
w8= vector of wavelet coefficients at scale 4 with index of the form i
c9= vector of scaling coefficients at scale 4 with index of the form i+16
c10= vector of scaling coefficients at scale 4 with index of the form i

3. Calculating the parameter estimates of the model

The coefficients of equation (8) are computed by minimizing the sum of squared errors. This is solved by a linear model in R, z ~ w1 + ... + w8 + c9 + c10, where z is the time series data from index 18 onwards. The fitted values and residuals can then be computed, and hence the MSE as well.

Figure 4. Actual Time Series Data and Estimation

The summary of the parameter estimates gives the prediction model (10):

ẑ_{n+1} = 1.18 w_{1,n} + 0.1469 w_{1,n−2} + 0.7868 w_{2,n} + 0.1006 w_{2,n−4}
        + 0.9919 w_{3,n} + 0.0240 w_{3,n−8} + 1.152 w_{4,n} + 0.0632 w_{4,n−16}   (10)
        + 0.9841 c_{4,n} + 0.0157 c_{4,n−16}
The time series plot of the data and the fitted values is shown in Figure 4; the black line shows the actual data and the red dashed line the fitted values. The mean squared error at this level is 4.14.

5. CONCLUDING REMARK

The wavelet transform, especially the UDWT, can be used to produce an estimation model for time series. This modeling is simpler and easier to implement, and the graphical views show that the method gives a good approximation. However, a wider comparison with other methods and further analytical study must be done to reach a comprehensive conclusion.

References

[1] Aldrich, E., A package of Functions for Computing Wavelet Filters, Wavelet Transforms and Multiresolution
Analyses, http://www.ealdrich.com/wavelets/
[2] Bollerslev, T. “Generalized autoregressive conditional heteroskedasticity”, Journal of Econometrics, Vol.
31,1986, pp. 307–327.
[3] Ciancio, A., “Analysis of Time Series with Wavelets”, International Journal of Wavelets, Multiresolution and
Information Processing, Vol. 5(2007), No. 2, pp. 241-256.
[4] Daubhechies, I., Ten Lecture on Wavelets, SIAM, Philadelphia, 1992.
[5] Engle, R.F., “Autoregressive Conditional Heteroscedasticity with Estimates of the Variance of United Kingdom Inflation”, Econometrica, Vol. 50 (1982), No. 4, pp. 987-1008

[6] Jawerth, B. and Sweldens, W., “An Overview of Wavelet Based Multiresolution Analyses”, SIAM Review, Vol.
36 (1994), pp. 377-412.
[7] Ogden, R.Todd, Essential Wavelets for Statistical Applications and Data Analysis, Birkhäuser: Berlin, 1997.
[8] Percival, D. B. and Walden, A. T., Wavelet Methods for Time Series Analysis Cambridge University, 2000.
[9] Starck, J.L., et al., “The Undecimated Wavelet Decomposition and its Reconstruction”, IEEE Transactions on Image Processing, Vol. 16 (2007), No. 2.

Rukun Santoso
Program Studi Statistika Universitas Diponegoro
e-mail: [email protected]

Subanar
Jurusan Matematika FMIPA UGM

Dedi Rosadi
Jurusan Matematika FMIPA UGM

Suhartono
Jurusan Statistik ITS Surabaya

R Code Listing

function (x, wv = 'haar', j = 4)
{
  # Requires the 'wavelets' package for modwt()
  n <- length(x)
  x.modwt <- modwt(x, wv, j)          # undecimated (maximal overlap) DWT
  d1 <- x.modwt@W$W1                  # wavelet coefficients, levels 1..4
  d2 <- x.modwt@W$W2
  d3 <- x.modwt@W$W3
  d4 <- x.modwt@W$W4
  v4 <- x.modwt@V$V4                  # scaling coefficients, level 4
  w1 <- w2 <- w3 <- w4 <- w5 <- w6 <- w7 <- w8 <- c9 <- c10 <- NULL
  for (i in 1:(n - 17)) {             # collect the selected coefficients of (9)
    w1 <- c(w1, d1[i + 16]);  w2 <- c(w2, d1[i + 14])
    w3 <- c(w3, d2[i + 16]);  w4 <- c(w4, d2[i + 12])
    w5 <- c(w5, d3[i + 16]);  w6 <- c(w6, d3[i + 8])
    w7 <- c(w7, d4[i + 16]);  w8 <- c(w8, d4[i])
    c9 <- c(c9, v4[i + 16]);  c10 <- c(c10, v4[i])
  }
  z <- x[18:n]                        # one-step-ahead targets
  lm.z <- lm(z ~ -1 + w1 + w2 + w3 + w4 + w5 + w6 + w7 + w8 + c9 + c10)
  koef <- lm.z$coeff                  # was lm.y$coeff in the original listing
  pred <- c(rep(0, 17), lm.z$fitted)  # was lm.y$fitted in the original listing
  ts.plot(z, xlim = c(0, 250), ylim = c(8500, 9800), xlab = "", ylab = "",
          type = 'l')
  par(new = TRUE)
  ts.plot(pred, xlim = c(0, 250), xlab = "Daily Time",
          ylab = "$1 Equivalencies", ylim = c(8500, 9800), col = 2, lty = 4)
  return(lm.z)
}
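A hypothetical call of the listing above, assuming it has been saved under the name wavelet.predict and that the wavelets package providing modwt() is installed:

library(wavelets)                 # provides modwt() used inside the listing

fit <- wavelet.predict(kurs2003, wv = "haar", j = 4)   # hypothetical function name
summary(fit)                      # estimated coefficients of model (9)/(10)
mean(residuals(fit)^2)            # in-sample mean squared error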

DATA
> kurs2003
9468 9431 9435 9435 9424 9440 9433 9400 9360 9364 9364 9376 9387 9390 9388
9385 9390 9393 9392 9336 9364 9376 9380 9369 9363 9375 9367 9384 9405 9470
9435 9417 9389 9378 9384 9381 9418 9392 9392 9402 9405 9383 9363 9388 9375
9387 9383 9380 9400 9410 9419 9490 9525 9620 9525 9480 9440 9415 9415 9399
9408 9406 9397 9405 9396 9376 9377 9362 9370 9374 9342 9339 9295 9165 9230
9270 9218 9240 9275 9200 9175 9175 9161 9148 9058 9070 9000 9035 8975 8949
8951 8965 8890 8863 8830 8825 8665 8670 8730 8779 8837 8721 8730 8670 8675
8700 8690 8745 8760 8730 8695 8690 8698 8747 8730 8753 8725 8723 8726 8780
8785 8745 8735 8708 8703 8666 8695 8709 8718 8718 8725 8720 8740 8747 8770
8801 8895 9083 9165 9025 8995 9065 9090 9005 8980 9013 8993 9118 9076 9053
9033 9025 9034 9053 9047 8989 8944 8880 8906 8920 8988 8957 9018 9035 8983
8994 8988 8990 9003 9000 8970 8965 8941 8971 8955 8959 8960 8985 8991 8910
8930 8950 8950 8925 8889 8885 8870 8875 8890 8888 8871 8877 8886 8865 8893
8939 8945 8945 8937 8940 8958 8959 8998 9038 9083 9077 9020 8995 9015 9025
8988 8983 8990 8986 8981 8980 8980 9022 8997 8985 8974 8990 9037 9009 8996
8981 8988 9000 8991 8990 8988 8995 8990 8983 8990 8983 8988 8996 8994 8995
8986 8991 8947
Proceedings of ”The 6th SEAMS-UGM Conference 2011”
Statistics, pp. 859–864.

PARALLEL NONPARAMETRIC REGRESSION CURVES

Sri Haryatmi Kartiko

Abstract. Consider N time series observations {X_it}_{t=1}^T, i = 1, ..., N, according to the model X_it = m_i(t/T) + ε_it, t = 1, ..., T, where m_i(u), u ∈ [0, 1], is the mean function of {X_it}_{t=1}^T and {ε_it}_{t=1}^T is an error process with mean zero and finite variance. It will be investigated whether the shapes of the mean functions m_i(·), i = 1, ..., N, are identical. In fact we want to test the null hypothesis that the m_i(·) are parallel, that is, that there is a function m(·) such that
H_0 : m_i(·) = c_i + m(·), i = 1, ..., N,
where the c_i are real constants representing the distance of the curve m_i(·). In the testing hypothesis, the c_i are viewed as nuisance parameters.
Comparison of regression curves is an important problem in regression analysis. For instance, in the study of human growth, it is of interest to test whether the growth curves have the same pattern. If a different growth pattern is observed, then special attention is needed and the individual having it needs close monitoring. In longitudinal clinical studies, evaluators are interested in comparing curves corresponding to treatment and control groups.
To test H_0 is to compare the curves m̂_i estimated under the stated model to the curves ĉ_i + m̂ estimated under H_0. To estimate the common trend m under H_0 we can use the averaged process X̄_t = Σ_{i=1}^N X_it/N = m(t/T) + ε̄_t, t = 1, ..., T. To estimate m we use nonparametric regression with a kernel function as the weight.
There are many ways to measure the distance between the curves m̂_i(·) and ĉ_i + m̂(·). This paper uses the L2 distance Δ² = Σ_{i=1}^N ∫_0^1 (m̂_i(u) − ĉ_i − m̂(u))² du as a test statistic, and it is a natural estimate of the parallelism index. A central limit theorem is then obtained for the parallelism index based on the distance between the estimates of the regression curves and their average.

Keywords and Phrases: nonparametric regression, kernel, bandwidth, central limit theorem

1. INTRODUCTION
Consider N time series observations {X_it}_{t=1}^T, i = 1, ..., N, according to the model
X_it = m_i(t/T) + ε_it,  t = 1, ..., T,   (1)
where m_i(u), u ∈ [0, 1], is the mean function of {X_it}_{t=1}^T and {ε_it}_{t=1}^T is an error process with mean zero and finite variance. It will be investigated whether the shapes of the mean functions m_i(·), i = 1, ..., N, are identical.

In fact we want to test the null hypothesis that the m_i(·) are parallel, that is, that there is a function m(·) such that
H_0 : m_i(·) = c_i + m(·),  i = 1, ..., N,   (2)
where the c_i are real constants representing the distance of the curve m_i(·). In the testing hypothesis, the c_i are viewed as nuisance parameters.
Comparison of regression curves is an important problem in regression analysis. For instance, in the study of human growth, it is of interest to test whether the growth curves have the same pattern. If a different growth pattern is observed, then special attention is needed and the individual having it needs close monitoring. In longitudinal clinical studies, evaluators are interested in comparing curves corresponding to treatment and control groups.
The comparison problem for mean functions has been discussed in several places in the literature. Härdle and Marron (1990) compare the shapes of two regression curves by testing whether one of them is a parametric transformation of the other. King, Hart and Wehrly (1991) used a kernel method to compare two regression curves under independent and identically distributed errors. This method was generalized by Munk and Dette (1998) to several curves. Hall and Hart (1990) proposed a bootstrap test to compare two mean functions with independent errors. In the time series setup, Park et al. (2009) propose a graphical device to assess the equality of two mean functions, while Guo and Oyet (2009) apply a wavelet-based method.
The aforementioned papers used assumptions such as errors that are independent in time and a fixed number of curves. In this paper, the assumptions are relaxed to allow a dependence structure, and the number of curves can be fixed or tend to infinity. We derive the asymptotic theory of the test statistic based on the L2 distances between the individual trend estimates and the global trend estimate. For implementation of the test, we propose a cross-validation bandwidth selection procedure that accommodates the dependence in the data. Finally, to approximate the finite sample distribution of the test statistic, a simulation-based method that is more accurate than the normal limiting distribution is presented. Overall the methodology is fully nonparametric and data driven.

2. TEST STATISTICS
To ensure model identifiability under the null hypothesis we assume that Σ_{i=1}^N c_i = 0. Averaging the model over i gives
X̄_t = m(t/T) + ε̄_t.   (3)
To test H_0 is to compare the curves m̂_i estimated under the model (3) to the curves ĉ_i + m̂ estimated under H_0.
To estimate the common trend m under H_0 we can use the averaged process X̄_t = Σ_{i=1}^N X_it/N, t = 1, ..., T. Define also the row averages
X̄_i = Σ_{t=1}^T X_it/T.   (4)

X̄, ε̄_t and ε̄_i are defined similarly. This paper adopts the local linear smoothing procedure to estimate the trends. To estimate m we use nonparametric regression with a kernel function as the weight. Let K be the kernel function; then the estimates of m and m_i are, respectively,
m̂(u) = Σ_{t=1}^T X̄_t w_h(t, u),  0 ≤ u ≤ 1,   (5)
where
w_h(t, u) = K((u − t/T)/h) [S_{h,2}(u) − (u − t/T) S_{h,1}(u)] / [S_{h,2}(u) S_{h,0}(u) − S_{h,1}²(u)]
and
S_{h,j}(u) = Σ_{t=1}^T (u − t/T)^j K((u − t/T)/h),  u ∈ [0, 1].
m_i is estimated using the same bandwidth as m, so that the local linear estimate of m_i is
m̂_i(u) = Σ_{t=1}^T X_it w_h(t, u),  0 ≤ u ≤ 1.   (6)
The intercepts c_i are estimated by
ĉ_i = (1/T) Σ_{t=1}^T [m̂_i(t/T) − m̂(t/T)].   (7)

There are many ways to measure the distance between the curves m̂_i(·) and ĉ_i + m̂(·). This paper uses the L2 distance
Δ² = Σ_{i=1}^N ∫_0^1 (m̂_i(u) − ĉ_i − m̂(u))² du   (8)
as a test statistic. It is clear that Δ² is a natural estimate of the parallelism index
Δ(m_1, m_2, ..., m_N) = min_{c_1,...,c_N : Σ_i c_i = 0} Σ_{i=1}^N ∫_0^1 (m_i(u) − c_i − m(u))² du,   (9)
where m(u) = Σ_{i=1}^N m_i(u)/N and c_i = ∫_0^1 (m_i(u) − m(u)) du. A central limit theorem is then obtained for the parallelism index based on the distance between the estimates of the regression curves and their average.
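As an illustration, the following R sketch (all names are illustrative; the integral in (8) is approximated by an average over a uniform grid, and an Epanechnikov kernel is used as a stand-in) computes m̂, the m̂_i, the ĉ_i and the statistic Δ² from an N × T data matrix.

# Sketch of the test statistic (8); all names are illustrative.
# X: N x T data matrix, h: bandwidth (not too small), K: kernel with support [-1, 1].
parallelism.stat <- function(X, h, K = function(v) 0.75 * pmax(1 - v^2, 0)) {
  N <- nrow(X); T <- ncol(X)
  tt <- (1:T) / T
  u.grid <- seq(0, 1, length.out = 101)          # grid for the integral in (8)
  w.h <- function(u) {                           # local linear weights w_h(t, u)
    Kv <- K((u - tt) / h)
    S0 <- sum(Kv); S1 <- sum((u - tt) * Kv); S2 <- sum((u - tt)^2 * Kv)
    Kv * (S2 - (u - tt) * S1) / (S2 * S0 - S1^2)
  }
  W <- sapply(u.grid, w.h)                       # T x 101 weight matrix
  Xbar.t <- colMeans(X)                          # averaged process, eq. (3)
  m.hat  <- as.vector(Xbar.t %*% W)              # eq. (5)
  mi.hat <- X %*% W                              # eq. (6), one row per series
  W.d <- sapply(tt, w.h)                         # weights at the design points t/T
  ci.hat <- rowMeans(X %*% W.d - matrix(as.vector(Xbar.t %*% W.d), N, T, byrow = TRUE))  # eq. (7)
  # L2 distance (8), approximated by an average over u.grid
  sum(rowMeans((mi.hat - ci.hat - matrix(m.hat, N, length(u.grid), byrow = TRUE))^2))
}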

3. ASYMPTOTIC THEORY
To establish the asymptotic normality of Δ², we impose structural conditions on the error processes: (ε_it)_{t=1}^T, i = 1, ..., N, are iid copies of a process (ε_t)_{t=1}^T of the form ε_t = G(t/T; F_t), where F_t = (..., η_{t−1}, η_t) and (η_j)_{j∈Z} is an iid innovation process. G(u, F_t) is Stochastically Lipschitz Continuous (SLC), that is, there exists a constant C such that
||G(u_1; F_t) − G(u_2; F_t)||_p ≤ C |u_1 − u_2|   (10)
for all u_1, u_2 ∈ [0, 1]; we write G ∈ SLC. Assuming that E ε_k = 0 for all k, let
γ_k(u) = E[G(u; F_0) G(u; F_k)],  0 ≤ u ≤ 1.   (11)
Define the long-run variance function
g(u) = Σ_{k∈Z} γ_k(u)   (12)
and the square integral
σ² = ∫_0^1 g²(u) du.   (13)
Recall that the kernel function K is Lipschitz continuous on its support [0, 1], and let
K*(x) = ∫_{−1}^{1−2|x|} K(v) K(v + 2|x|) dv  and  K_2* = ∫_0^1 (K*(v))² dv.
We have the following result.

Theorem 3.1. Let N = N_T be such that either (i) N → ∞ or (ii) N is fixed. Let h = h_T be a bandwidth sequence such that T h^{3/2} → ∞ and h → 0. Further assume that G ∈ SLC and that, for some p > 4, the following short-range dependence condition holds:
Σ_{t=0}^∞ δ_p(t) < ∞.   (14)
Then, under the null hypothesis H_0, we have
T h^{1/2} (N − 1)^{−1/2} (Δ² − EΔ²) ⇒ N(0, σ² K_2*).   (15)
Outline of proof: First define
A_{0i} = ∫_0^1 ( Σ_{t=1}^T w_b(t; u) e_it )² du.   (16)
By Theorem 1 in Zhang and Wu (2011), under the bandwidth conditions T b^{3/2} → ∞ and b → 0 and the short-range dependence condition, we have
T b^{3/2} (A_{0i} − E A_{0i}) →_d N(0, σ² K_2*).   (17)
Using some modifications, the normality result is concluded.



                 T = 100                 T = 300
 b \ N      50     100     150      50     100     150
   1       .94     .93     .96     .91     .95     .96
   2       .95     .93     .96     .92     .98     .93
   3       .93     .94     .94     .93     .94     .95

Table 1. Acceptance probabilities at the 95% level, for different T, N and bandwidth b

4. SIMULATION STUDY
In this section we present a simulation study to assess the performance of the testing procedure. Consider the model
X_it = c_i + m(t/T) + ε_it,   (18)
where c_i = 2(i/N)² and m(u) = 3 sin(3πu). The error process (ε_it) is generated by ε_it = ψ_{i,t}(t/T), where for all i and u ∈ [0, 1] the process ψ_{i,t}(u) follows the recursion
ψ_{i,t}(u) = ρ(u) ψ_{i,t−1}(u) + σ η_{i,t},   (19)
with the η_{i,t} random variables satisfying P(η_{i,t} = −1) = P(η_{i,t} = 1) = 1/2. Thus, (ε_{i,t})_{t∈Z}, i = 1, ..., N, are iid sequences of AR(1) processes with time-varying coefficients. Let ρ(u) = 0.2 − 0.3u and σ = 1. It can be shown easily that E(ψ_{i,t}(u)) = 0, Var(ψ_{i,t}(u)) = σ²/(1 − ρ(u)²), and the long-run variance function is g(u) = σ²/(1 − ρ(u))². Five estimated regression lines of m_i(u) = c_i + 3 sin(3πu), i = 1, ..., 5, obtained for five different c_i by adding five different sets of errors ε_i, are presented in Figure 1.
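A minimal R sketch of this data-generating process (function and variable names are illustrative; a short burn-in approximates stationarity of the recursion) is the following.

# Simulate X_it = c_i + m(t/T) + eps_it with eps_it from (19); names are illustrative.
simulate.panel <- function(N = 5, T = 100, sigma = 1, burn = 50) {
  m   <- function(u) 3 * sin(3 * pi * u)
  rho <- function(u) 0.2 - 0.3 * u
  X <- matrix(0, N, T)
  for (i in 1:N) {
    c.i <- 2 * (i / N)^2
    eta <- sample(c(-1, 1), burn + T, replace = TRUE)   # +/-1 innovations
    eps <- numeric(T)
    for (t in 1:T) {
      psi <- 0
      # AR(1) recursion (19) at the local parameter rho(t/T),
      # driven by the innovations up to time t (after a burn-in)
      for (s in 1:(burn + t)) psi <- rho(t / T) * psi + sigma * eta[s]
      eps[t] <- psi
    }
    X[i, ] <- c.i + m((1:T) / T) + eps
  }
  X
}
set.seed(1)
X <- simulate.panel()   # N x T matrix of simulated series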

Figure 1. Five estimated regression lines of m_i(u) = c_i + 3 sin(3πu), i = 1, ..., 5

In the simulation study, the normal kernel is used and the simulation is repeated 100 times. We are interested in the proportion of realizations in which the null hypothesis is correctly accepted. Acceptance probabilities are presented in Table 1 for different choices of T, N and the bandwidth. They suggest that the acceptance probabilities are reasonably close to the 95% nominal level and become more robust to the size of the bandwidth as the sample size gets bigger.

References
[1] Guo, P. and Oyet, A.J. (2009). On wavelet methods for testing equality of mean response curves. Int. J. Wavelets Multiresolut. Inf. Process. 7, 357-373.
[2] Hall, P. and Hart, J.D. (1990). Bootstrap test for difference between means in nonparametric regression curves. J. Amer. Statist. Assoc. 85, 1039-1049.
[3] Härdle, W. and Marron, J.S. (1990). Semiparametric comparison of regression curves. Ann. Statist. 18, 63-89.
[4] King, E.C., Hart, J.D. and Wehrly, T.E. (1991). Testing the equality of regression curves using linear smoothers. Statist. Probab. Lett. 12, 239-247.
[5] Munk, A. and Dette, H. (1998). Nonparametric comparison of several regression functions: exact and asymptotic theory. Ann. Statist. 26, 2339-2368.
[6] Park, C., Vaughan, A., Hannig, J. and Kang, K.H. (2009). SiZer analysis for the comparison of time series, www.stat.uga.edu

Sri Haryatmi Kartiko


Department of Mathematics, Gadjah Mada University,
Sekip Utara, Yogyakarta, Indonesia.
Proceedings of ”The 6th SEAMS-UGM Conference 2011”
Statistics, pp. 865 - 878.

ORDERING DUALLY IN TRIANGLES (ORDIT) AND HOTSPOT DETECTION IN GENERALIZED LINEAR MODEL FOR POVERTY AND INFANT HEALTH IN EAST JAVA

YEKTI WIDYANINGSIH, ASEP SAEFUDDIN, KHAIRIL ANWAR NOTODIPUTRO, AJI HAMIM WIGENA

Abstract. The objective of this research is to build a generalized linear model using an ordinal response variable with some covariates. One of the covariates in the model is an indicator variable of hotspot status obtained from hotspot detection, while the response variable is ordinal data obtained from the ORDIT ranking method. The data in this research concern infant health and poverty in some districts of East Java, i.e. Blitar, Kediri, and Jember, taken from 3 different infant health district levels. The GLM is implemented for 200 villages from each district, so there are 600 villages (sub-districts) as observation units for modeling. The modeling analysis shows that the number of farmer families, hotspot status, and district are statistically significant predictors of the poverty level of villages.

Keywords and Phrases: generalized linear model, ordering dually in triangles, hotspot detection.

1. INTRODUCTION

Modeling, ranking, and hotspot detection are important statistical methods for evaluating and scrutinizing events of interest in daily life in a particular study area. The event of interest can be the number of disease cases, the number of people in poverty, the number of animals or plants related to biodiversity or ecology, the number of catastrophes, and many others.
Ranking of individuals or areas is usually founded on multiple scores of the subjects to be ranked. Obtaining the rank of subjects with multiple scores is not easy. In an ordinary approach, one combines or averages all scores of a subject, but the result is not always appropriate. ORDIT (Ordering Dually in Triangles) is a ranking method that uses a Hasse diagram as a tool to obtain the ranking. This method can be used to rank subjects with more than two indicators and to obtain the true ranking. In this paper, ORDIT is used to obtain a ranking of districts (kabupaten) based on 5 infant health

indicators, and of sub-districts/villages (desa) inside a kabupaten based on 2 poverty indicators. Besides ranking, hotspot areas of poverty have been detected and analyzed together in a linear model.
A hotspot means an unusual phenomenon: anomalies, aberrations, outbreaks, elevated clusters, or critical areas. Hotspot detection is a method to detect such unusual phenomena geographically. There are several methods to detect hotspot areas; one of them is the Scan Statistic of Kulldorff, which is used in this research. The results of hotspot detection are the statuses of areas as hotspot or not hotspot.
In statistics, modeling is a technique to investigate the relationship between a response variable and covariate(s). Various modeling approaches are still being developed by many researchers to provide the best analysis of the data. This research builds a generalized linear model using an ordinal response and some covariates. The grouping of the ORDIT ranking is used as the ordinal response, and the result of hotspot detection is used as one of the covariates. The data in this research are PODES (Potensi Desa) 2008 for East Java, from BPS; ranking groups of the villages are used as the ordinal response variable, and the number of people in poverty, the number of farmer families, and hotspot status are used as covariates. Generalized Estimating Equations (GEE) is the method used to estimate the parameters of the model.

2. MATERIALS AND METHODS

2.1 Data. The data in this research concern poverty in East Java. Based on the ORDIT ranking of districts, Blitar, Kediri, and Jember are the districts chosen from each group of the ranking. Figure 1 shows the study areas in this research. The source of the sub-district (village) data is PODES 2008 from BPS. The response variable is constructed by grouping the ranking result based on two poverty indicators. This ordinal response variable has three levels (good, moderate, and bad) as levels of poverty. The model is built with this response variable and some covariates. Hotspot status is one of the covariates in the GLM.
Figure 1. Study Area: Blitar, Kediri, and Jember



2.2 Ordering Dually in Triangles

2.2.1 Rating Relations/Rules for Ascribing Advantage. This section describes the protocols for comparing cases, or collectives of cases, via ratings, rules and relations that ascribe advantage to some cases over some others, or fail to do so for particular pairs [1]; a Hasse diagram should be prepared before the protocols are used. There are three possibilities in comparing a pair of cases, where one case is denoted by Э and the other by Є: ЭaaЄ, wherein Э has ascribed advantage over Є; ЭssЄ, wherein Э has subordinate status to Є, which implies ЄaaЭ; and ЭiiЄ, whereby the pair is an indefinite instance without ascribed advantage and without subordinate status, which implies ЄiiЭ. Thus the protocol either designates one member of a pair as having ascribed advantage and the other subordinate status, or designates the pairing as indefinite. ff(aa) is the number of occurrences of ascribed advantage, ff(ss) is the number of occurrences of subordinate status, and ff(ii) is the number of occurrences of being indefinite. Each of the N cases can be compared on this basis to all others in the deleted domain DD = N − 1 of competing cases, with the percent occurrence of these relations tabulated as AA = 100 × ff(aa)/DD, SS = 100 × ff(ss)/DD, II = 100 × ff(ii)/DD. Clearly, AA + SS + II = 100%; and for later use let us define CCC = 100 − AA as the complement of case condition relative to ascribed advantage (AA) [1]. All of these computations can be visualized in a Hasse diagram as the result of a partially ordered set arrangement for a set of subjects with a number of quantities as their attributes.
A simple X-shaped Hasse diagram in Figure 2 is used as an illustration, wherein entity A has ascribed advantage over B, D, and E while being indefinite with regard to C. Entity B has subordinate status to A and C, with ascribed advantage over D and E. The deleted domain (DD) is four, since an entity is not paired with itself for present purposes.

Figure 2. X-shaped Hasse diagram of five entities labeled as A, B, C, D and E.

2.2.2 Subordination Schematic and Ordering Dually in Triangles (ORDIT). Subordination can be symbolized diagrammatically [1] in a triangle depicted in Figure 3, where the point representing a district divides a triangle into two parts: a ‘trapezoidal triplet’ (of AA, SS, and II) and a topping triangle (of CCC and II). The combination of these two parts forms a right triangle with the ‘tip’ at AA = 100% in the upper-left and the toe at SS = 100% in the lower-right. The hypotenuse is a right-hand ‘limiting line’ of plotting positions because AA + SS + II = 100%. The topping triangle provides the basis for an ‘ORDIT ordering’ of the districts or instances.
According to Figure 3, an idealized district has AA = 100% of the deleted domain (DD) of other districts, that is, the frequency of ascribed advantage equals the number of competing districts; if the ideal actually occurs, then the trapezoidal triplet becomes a triangle.

Figure 3. Subordination schematic with plotted instance dividing a right triangle into two
parts, a ‘trapezoidal triplet’ (of AA, SS and II) below, and a ‘topping triangle’ (of CCC, SS
and II) above.
The numbers for ORDIT can be coupled as a decimal value ccc.bbb. The ccc component is obtained by rounding CCC to two decimal places and then multiplying by 100. The bbb component is obtained by dividing SS by CCC and imposing 0.999 as an upper limit. These two values are then added as ccc.bbb. This ordering is assigned the acronym ORDIT for ORdering Dually In Triangles. It preserves all aspects of AA, SS and II except for the actual number of districts. Simple rank ordering of the ORDIT values becomes salient scaling of the districts [1].

2.2.3 Product-order Rating Regime. A general relational rule for ascribing advantage is product-order, whereby advantage is gained by having all criteria at least as good and at least one better. Conversely, subordinate status lies with having all criteria at least as poor and at least one poorer. This relational rule is applicable to all kinds of criteria as long as they have the same polarity (sense of better and worse).
According to Figure 3 and its computation, the ORDIT ordering is the ranking of the instances based on their indicators. ORDITs and salient scaling according to product-order are determined by Scheme 2 [1], implemented as an R function; a small illustrative sketch is given below. ORDIT (Ordering Dually In Triangles) corresponds to the topping triangle in Figure 3, and Salient is the rank ordering of the ORDIT values.
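The following R sketch (not the Scheme 2 code of [1]; all names are illustrative) computes, for a matrix of criteria with a common polarity (larger = better), the AA, SS and II percentages under the product-order rule and the resulting ORDIT values ccc.bbb and salient ranks.

# ORDIT-style ordering under the product-order rule; illustrative sketch only.
# crit: n x p matrix of criteria, all oriented so that larger values are better.
ordit <- function(crit) {
  n <- nrow(crit); DD <- n - 1
  aa <- ss <- numeric(n)
  for (i in 1:n) for (j in (1:n)[-i]) {
    geq <- all(crit[i, ] >= crit[j, ]); gt <- any(crit[i, ] > crit[j, ])
    leq <- all(crit[i, ] <= crit[j, ]); lt <- any(crit[i, ] < crit[j, ])
    if (geq && gt) aa[i] <- aa[i] + 1          # i has ascribed advantage over j
    else if (leq && lt) ss[i] <- ss[i] + 1     # i has subordinate status to j
    # otherwise the pair is indefinite
  }
  AA <- 100 * aa / DD; SS <- 100 * ss / DD; II <- 100 - AA - SS
  CCC <- 100 - AA
  ccc <- round(CCC, 2) * 100                            # ccc component
  bbb <- pmin(ifelse(CCC > 0, SS / CCC, 0), 0.999)      # bbb component, capped at 0.999
  ORDIT <- ccc + bbb
  data.frame(AA, SS, II, ORDIT, Salient = rank(ORDIT))
}

For the X-shaped example of Figure 2, one would supply a 5-row matrix of ratings for the entities A–E; the salient ranks then follow from simple rank ordering of the ORDIT values.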

2.3 A Spatial Scan Statistic. In this paper a spatial scan statistic is used. The analysis is always conditioned on the total number of observed points. Windows of a particular size and shape are formed around the regions to capture the highest concentration of cases under study.
Cases are assumed to follow a Bernoulli model with constant risk over space under the null hypothesis, and with different risks inside and outside at least one circular window under the alternative hypothesis. For each circular window, the numbers of people in poverty inside and outside it are noted, together with the expected number of cases reflecting the population at risk. On the basis of these numbers, the likelihood ratio is calculated for each circular window. The circular window with the maximum likelihood, and with more than its expected number of cases, is denoted as the most likely cluster [2].
Significance is evaluated with Monte Carlo simulation, where the null hypothesis of no cluster is rejected at an α level of 0.05 exactly if the simulated p-value is less than or equal to 0.05 for the most likely cluster [3].

2.3.1 The Bernoulli Model and Test Hypotheses. With the Bernoulli model [2], there are cases and non-cases represented by a 0/1 variable. These variables may represent people in poverty or not in poverty. They may reflect cases and controls from a larger population, or they may together constitute the population as a whole. Whatever the situation may be, these variables will be denoted as cases and controls, and their total will be denoted as the population. The Bernoulli model requires information about the locations of a set of cases and controls, provided to SaTScan using the case, control and coordinates files. Separate locations may be specified for each case and each control, or the data may be aggregated for states, provinces, counties, parishes, census tracts, postal code areas, school districts, households, etc., with multiple cases and controls at each data location.
Let N denote a spatial point process where N(A) is the random number of points in the set A ⊂ G. As the window moves over the study area it defines a collection Z of zones Z ⊂ G. Z will be used to denote both a subset of G and the set of parameters defining the zone.
For the Bernoulli model, each unit of measure corresponds to an ‘entity’ or ‘individual’ who could be in either one of two states (yes or no). In the model, there is exactly one zone Z ⊂ G such that each individual within that zone has probability p of being a point, while the probability for an individual outside the zone is q. The probability for any individual is independent of all the others. The null hypothesis is H0 : p = q. The alternative hypothesis is H1 : p > q, Z ⊂ G. Under H0, N(A) ~ Binomial(μ(A), p) for all sets A. Under H1, N(A) ~ Binomial(μ(A), p) for all sets A ⊂ Z, and N(A) ~ Binomial(μ(A), q) for all sets A ⊂ Z′ [2].

2.3.2 Likelihood Ratio Test. The likelihood function for the Bernoulli model is expressed as
L(Z, p, q) = p^{n_Z} (1 − p)^{μ(Z) − n_Z} q^{n_G − n_Z} (1 − q)^{(μ(G) − μ(Z)) − (n_G − n_Z)},
where n_Z and n_G denote the numbers of cases in the zone Z and in the whole study area G, respectively. To detect the zone that is most likely to be a cluster, we find the zone Ẑ that maximizes the likelihood function; in other words, Ẑ is the maximum likelihood estimator of the parameter Z. There are two steps to identify the hotspot. First, maximize the likelihood function conditional on Z:
L(Z) := sup_{p>q} L(Z, p, q)
      = (n_Z/μ(Z))^{n_Z} (1 − n_Z/μ(Z))^{μ(Z) − n_Z} ((n_G − n_Z)/(μ(G) − μ(Z)))^{n_G − n_Z} (1 − (n_G − n_Z)/(μ(G) − μ(Z)))^{(μ(G) − μ(Z)) − (n_G − n_Z)}   (1)
when n_Z/μ(Z) > (n_G − n_Z)/(μ(G) − μ(Z)), and otherwise
L(Z) = (n_G/μ(G))^{n_G} ((μ(G) − n_G)/μ(G))^{μ(G) − n_G}.   (2)
Next, we find the solution Ẑ = {Z : L(Z) ≥ L(Z′) for all Z′}. The most likely cluster is of interest and is carried forward for statistical inference. Let
L_0 := sup_{p=q} L(Z, p, q) = (n_G/μ(G))^{n_G} ((μ(G) − n_G)/μ(G))^{μ(G) − n_G}.   (3)
The likelihood ratio, λ, can be written as
λ = sup_{Z, p>q} L(Z, p, q) / sup_{p=q} L(Z, p, q) = L(Ẑ)/L_0.   (4)
The ratio λ is used as the test statistic, and its distribution is obtained through Monte Carlo simulation [2].
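As a small illustration (not the SaTScan implementation; names are illustrative), the log-likelihood ratio of a single candidate zone under the Bernoulli model can be computed as follows.

# Bernoulli-model log-likelihood ratio for one candidate zone; illustrative only.
# nZ, muZ: cases and population inside the zone; nG, muG: totals for the study area.
scan.llr <- function(nZ, muZ, nG, muG) {
  pin  <- nZ / muZ                       # estimated risk inside the zone
  pout <- (nG - nZ) / (muG - muZ)        # estimated risk outside the zone
  p0   <- nG / muG                       # common risk under H0
  xlogy <- function(x, y) ifelse(x == 0, 0, x * log(y))
  if (pin <= pout) return(0)             # only zones with elevated risk are of interest
  l1 <- xlogy(nZ, pin) + xlogy(muZ - nZ, 1 - pin) +
        xlogy(nG - nZ, pout) + xlogy((muG - muZ) - (nG - nZ), 1 - pout)
  l0 <- xlogy(nG, p0) + xlogy(muG - nG, 1 - p0)
  l1 - l0                                # log of the ratio (4) for this zone
}
# Hypothetical example: 40 of 200 people in poverty inside the window, 100 of 1000 overall
scan.llr(nZ = 40, muZ = 200, nG = 100, muG = 1000)

In the full scan statistic this quantity is maximized over all candidate windows and its significance is assessed by Monte Carlo replication of the data under H0.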

2.4 Generalized Linear Model. The generalized linear model (GLM) is a flexible generalization of ordinary least squares regression. In the GLM, a link function is needed to link the response variable to the covariates by allowing the linear model to be related to the response variable [4].
Generalized linear models were formulated by John Nelder and Robert Wedderburn as a way of unifying various other statistical models, including linear regression, logistic regression and Poisson regression [5].
In a GLM, each outcome of the dependent variable, Y, is assumed to be generated from a particular distribution in the exponential family, a large range of probability distributions that includes the normal, binomial, Poisson, and multinomial distributions, among others. The mean, μ, of the distribution depends on the independent variables, X, through
E(Y) = μ = g⁻¹(Xβ),   (5)
where E(Y) is the expected value of Y, Xβ is the linear predictor (a linear combination of unknown parameters β), and g is a link function [4].
Modeling in this paper uses an ordinal scale with 3 levels as the response variable and some categorical variables as covariates. One of the covariates is the indicator of hotspot status, i.e. whether a sub-district is a hotspot or not. The data comprise 3 districts, and every district has 200 sub-districts with a number of covariates. It is assumed that sub-districts within a district are more correlated than sub-districts from different districts; therefore, the parameter estimation method should be able to handle this condition.
According to the data, model building in this study is based on the spatial concept that the closer the observations, the larger the correlation [10]. Based on this concept, the idea is extended to correlated clustered data. The following sections describe Generalized Estimating Equations (GEE) as a parameter estimation method for correlated clustered data.

2.5 Threshold Model. A threshold is a latent variable in the model that distinguishes linear models with an ordinal response from linear models with a non-ordinal response. The threshold model is explained as follows. In logistic and probit regression models, there is an assumption about an unobserved latent variable (y) associated with the actual responses through the concept of thresholds (Hedeker 1994). For a dichotomous model, it is assumed there is one threshold value, while for an ordinal model with K categories (polytomy), it is assumed there are K − 1 threshold values, namely γ_1, γ_2, ..., γ_{K−1}, with γ_0 = −∞ and γ_K = ∞. The response occurs in category k (Y = k) if the latent response y lies between the thresholds γ_{k−1} and γ_k. For the latent response with K categories, assume Y_i is unobserved and the i-th observation is in a category, say category Z_i, i = 1, ..., N. The relationship between Y_i and Z_i is taken to be
γ_{k−1} < Y_i ≤ γ_k  ⇔  Z_i = k,
where k ∈ {1, ..., K}, γ_0 = −∞, γ_K = +∞, and γ_1, γ_2, ..., γ_{K−1} are unknown boundary points that define a partitioning of the real line into K intervals. Thus, when the realized value of Y_i belongs to the k-th interval, we observe that Z_i = k. Under these assumptions, the probability mass function of Z_1, ..., Z_N is
P(z_1, ..., z_N) = Pr(Z_i = z_i; i = 1, ..., N) = Pr(γ_{z_i − 1} < Y_i ≤ γ_{z_i}; i = 1, ..., N).
This model is called the threshold model (Harville 1984).

2.6 Generalized Estimating Equations (GEE) for Ordinal Response. GEE is a method of parameter estimation for correlated or clustered data [6]. It is a common choice for marginal modeling of an ordinal response if one is interested in the regression parameters rather than in the variance-covariance structure of the longitudinal data. The covariance structure in GEE is regarded as a nuisance. In this regard, the estimators of the regression coefficients and their standard errors based on GEE are consistent even under mis-specification of the covariance structure for the data [7].
Generalized linear models were first introduced by Nelder and Wedderburn (1972) and later expanded by McCullagh and Nelder (1989). The following discussion is based on their work and on the extension of GEE by Liang and Zeger (1986) to ordinal categorical response data.
Suppose we have a multinomial response, say z, with K ordered categories and corresponding probabilities π_1, π_2, ..., π_K, that is, Pr(z = k) = π_k. The proportional odds model is based on the cumulative probabilities φ_k = π_1 + π_2 + ... + π_k, for k = 1 to K − 1. A logit link function is used to relate φ_k to a linear function of p covariates X. Now consider the repeated-measures situation. Suppose we have a sample of I subjects. Let z_ij be the ordinal response (with K levels) for the i-th subject (i = 1 to I) at point j (j = 1 to n_i); assume n_i = J for all i, for simplicity. Form a (K−1)×1 vector y_ij = (y_ij1, y_ij2, ..., y_ij,K−1)′, where y_ijk = 1 if z_ij = k, and 0 otherwise. Denote the expectation of y_ij by π_ij = E(Y_ij) = (π_ij1, π_ij2, ..., π_ij,K−1)′ with π_ijk = Pr(y_ijk = 1), and let x_ij denote a p×1 vector of covariates for subject i at sub-subject j.
The objective of this part is to model π_ijk as a function of x_ij and the regression parameters θ = (γ_1, γ_2, ..., γ_{K−1}, β′)′, where the γ_k are intercept or cut-point parameters and β is a p×1 vector of regression parameters. Let φ_ijk = π_ij1 + π_ij2 + ... + π_ijk denote the cumulative probabilities. Then the proportional odds model at sub-subject j is
logit(φ_ijk) = γ_k + x_ij′β,
that is,
log[φ_ijk/(1 − φ_ijk)] = log[P(z_ij ≤ k)/(1 − P(z_ij ≤ k))] = γ_k + x_ij′β,
where z_ij ∈ {1, 2, ..., K} is transformed to y_ijk ∈ {1, 0} with π_ijk = Pr(y_ijk = 1); β is a vector of fixed effects on the transformed cumulative probabilities; x_ij is the vector of covariates of district i and sub-district j; and γ_k is a threshold.

To establish notation, let y_i = (y_i1′, ..., y_in_i′)′ and π_i = (π_i1′, ..., π_in_i′)′. Then θ can be estimated by solving the estimating equation
ψ(θ) = Σ_{i=1}^I (∂π_i/∂θ)′ V_i^{−1} (y_i − π_i) = 0_{(K−1+p)×1},
where ∂π_i/∂θ is the n_i(K−1) × (K−1+p) matrix of partial derivatives whose rows are
(∂π_{ij,k}/∂γ_1, ..., ∂π_{ij,k}/∂γ_{K−1}, ∂π_{ij,k}/∂β_1, ..., ∂π_{ij,k}/∂β_p),  j = 1, ..., n_i,  k = 1, ..., K−1,
and V_i^{−1} is a generalized inverse of V_i. Here
V_i = φ A_i^{1/2} R_i(α) A_i^{1/2},   (6)

where A_i = diag(A_i1, ..., A_in_i) and
A_ij = (1/ω_ij) diag( π_ij,1(1 − π_ij,1), ..., π_ij,K−1(1 − π_ij,K−1) ).
ω_ij is a known prior weight which varies from case to case. V_i = V_i(θ, α) denotes the working covariance matrix of Y_i, and α is a q×1 vector of correlation parameters. One gains efficiency in estimating θ by selecting V_i close to the true covariance.
Since Y_ij is multinomial, the (K−1)×(K−1) diagonal blocks of V_i are the multinomial covariance matrices
V_ij = Diag(π_ij) − π_ij π_ij′,
where Diag(π_ij) denotes a diagonal matrix with the elements of π_ij on the main diagonal. The other elements of V_i are associated with the correlations ρ_ijt(α) between Y_ij and Y_it, j ≠ t, in which the correlation parameters α are involved. From equation (6), the working correlation matrix is
R_i(α) = (1/φ) A_i^{−1/2} V_i A_i^{−1/2},
a block matrix whose j-th diagonal block is A_ij^{−1/2} V_ij A_ij^{−1/2} and whose off-diagonal (j, t) entries are the correlations ρ_ijt, j ≠ t.
Moment methods to estimate the correlation parameters α for specific correlation structures have been proposed in [12].

The same structure applies cluster-wise:
R_{s(i)}(α) = (1/φ) A_{s(i)}^{−1/2} V_{s(i)} A_{s(i)}^{−1/2},
with j-th diagonal block A_{s(i)j}^{−1/2} V_{s(i)j} A_{s(i)j}^{−1/2} and off-diagonal entries ρ_{s(i)jt}, j, t = 1, ..., n_si, j ≠ t. For simplicity the index j = 1, ..., n_si is not written inside parentheses, but the meaning is the same as before [5][6][11].

3. APPLICATIONS AND RESULTS

The main purpose of this paper is to exploit the generalized linear model to deal with clustered ordinal measurements within the same district (kabupaten) as subject. The GEE approach is used to estimate the model parameters. Comparisons with the method of maximum likelihood have been carried out in simpler cases and serve to demonstrate the efficiency of the method and some practical advantages [8]. The data concern poverty and infant health in three districts of East Java. Thirty-eight districts of East Java were ranked based on 5 indicators of infant health, i.e. the number of infant deaths, the number of low birth weights, the number of deliveries in the absence of health personnel, the number of people in poverty, and the number of education shortfalls. The ranked districts were divided into three groups, not poor, moderate, and poor, containing 8, 7, and 23 districts, respectively. From each group of infant health level, one district was taken for the modeling analysis: Blitar, Kediri, and Jember were chosen from these three different groups, Blitar from the not-poor group, Kediri from the moderate group, and Jember from the poor group. Sub-districts (villages) from these three districts were ranked based on poverty indicators, and the ordinal response was formed by grouping these ranks into three groups. Two hundred sub-districts were taken from each district to be analyzed in the GLM modeling. The covariates for modeling are scarcity, percentage of farmer families, number of farmer families, number of Indonesian workers abroad (TKI), number of telephone cables, number of schools, number of health personnel, number of families using electricity (the values of all covariates are divided into three groups based on the number of cases), and hotspot-of-poverty status. Hotspots of poverty in these three districts were detected by the scan statistic method of Kulldorff (1997). The hotspot status used as a covariate of the villages is the significant hotspot status (most likely and secondary hotspots); its value is one if a village is a hotspot and zero if it is not.
Table 1 in the Appendix gives the result of the generalized linear model with an ordinal response, some covariates and hotspot status. The ordinal response variable is the poverty level of the sub-districts, 1 = not poor, 2 = moderate, and 3 = poor, with 115, 297, and 188 sub-districts in each group, respectively. As mentioned before, this ordinal response is a grouping of the ranking result based on the number of people in poverty, represented by the number of poor-statement letters and the number of health insurance cards for people in poverty.
Based on the data in this research, Table 1 shows that some covariates are not significant, while the number of farmer families, the number of Indonesian workers abroad (TKI), the number of telephone cables, the number of schools, and hotspot status are related to poverty.
Table 2 shows the covariates that are significant for the level of poverty: district, number of farmer families, and hotspot status are significant for the poverty level of a village. This model has log-likelihood = -456.0614. Interpretations of some of the model parameters in Table 2:
1. The probability that a village in Blitar has a better level is exp(1.9344) = 6.92 times that of a village in Jember, other covariates remaining the same.
2. The probability that a village with jktan = 1 (fewer farmer families) has a better level is exp(2.1449) = 8.54 times that of a village with jktan = 3 (more farmer families), other covariates remaining the same.
The cumulative predicted probabilities from the logistic model for each case are
Pr(z_ij ≤ k) = 1 / (1 + exp(−(γ_k + x_ij′β + α_ij d))).
The events in an ordinal logistic model are not individual scores but cumulative scores. First, we calculate the predicted probabilities for a village that is a hotspot (hotspot = 1) in Jember (kab = 3), with the number of farmer families in the highest level category (jktan = 3); for these reference levels the corresponding coefficient estimates are 0:
P(score ≤ 1) = 1/(1 + e^{−(−5.2631 + (−1.3014)(1))}) = 1/(1 + 709.457) = 0.00141
P(score ≤ 2) = 1/(1 + e^{−(−1.8998 + (−1.3014)(1))}) = 1/(1 + exp(3.2012)) = 0.039
P(score ≤ 3) = 1
P(score = 3) = 1 − P(score ≤ 2) = 1 − 0.039 = 0.961
P(score = 2) = P(score ≤ 2) − P(score ≤ 1) = 0.039 − 0.00141 = 0.03759
A village with these conditions has probability 0.961 of being in the poor level (Y_ij = 3).
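The calculation above can be reproduced with a short R sketch (coefficients taken from Table 2; the function name is illustrative).

# Cumulative and category probabilities for the ordinal logistic model,
# using the Table 2 estimates; illustrative sketch.
ordinal.probs <- function(eta) {                # eta = covariate part of the linear predictor
  gamma <- c(-5.2631, -1.8998)                  # Intercept1, Intercept2 from Table 2
  cum   <- c(1 / (1 + exp(-(gamma + eta))), 1)  # P(score <= 1), <= 2, <= 3
  cat   <- diff(c(0, cum))                      # P(score = 1), = 2, = 3
  list(cumulative = cum, category = cat)
}
# Hotspot village (hotspot = 1) in Jember (kab = 3), jktan = 3 (reference levels contribute 0)
ordinal.probs(eta = -1.3014 * 1)
# $category is approximately (0.00141, 0.0377, 0.961), matching the text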
Table 3 shows some values of the predicted probabilities. Phat is the cumulative probability that a village is at the given level or lower. For example, the village in observation 2 has probability 0.85537 of being at poverty level 2 or lower, i.e. moderate or not poor. Similar interpretations hold for the other observations.

4. DISCUSSION

The generalized linear model with an ordinal response and hotspot status as a covariate is an improved model for scrutinizing the relationship between the quality of an area in infant health and poverty. The model shows that there is a strong relationship between the village level and the district the village comes from (p-value < 0.0001). The hotspot of poverty is statistically significantly related (p-value < 0.0001) to the level of villages. This suggests that the scan statistic method is a powerful tool for detecting hotspots. Villages in a hotspot area tend to have a higher (poorer) level of poverty. According to the results, one question remains. Even though the districts have been ranked, so that Blitar (kab = 1), Kediri (kab = 2), and Jember (kab = 3) are in order from better to worse, the coefficient for Kediri is greater than the coefficient for Blitar, which would mean Kediri is better than Blitar. This contradicts the result of the ranking method, and the situation needs more investigation and deeper analysis. Generally, however, the modeling results make sense for the poverty case in East Java.

5. CONCLUSION AND RECOMMENDATION

The three methods, ORDIT ranking, hotspot detection, and the generalized linear model, give many advantages for analyzing the data, especially when the researcher needs to see the relationship between ordering, hotspots, and some covariates. The results of this research, whose topic is infant health and poverty, could be used by the government as an input for making decisions or policies for district or sub-district improvement. As a conclusion, among the three districts (kabupaten) Blitar, Kediri, and Jember, Jember is the worst in infant health. Based on the GLM, there is a strong relationship between district, number of farmer families, and hotspot status and the poverty level of a village; all of these covariates have p-values < 0.0001. These results are useful for the Health Department in making decisions on district and sub-district improvement in infant health, with attention given to farmer families. Hopefully, better conditions for farmer families will improve women's education and health, leading to healthier infants and children.
This method can also be used for other data, such as biodiversity, medical, social, economic, and many other data.

Acknowledgement. Valuable guidance on the ORDIT ranking method from Prof. G.P. Patil, Center for Statistical Ecology and Environmental Statistics, Department of Statistics, The Pennsylvania State University, and Prof. Wayne Myers, Penn State Institutes of Energy and the Environment, is gratefully acknowledged. Many thanks to the Directorate of Depdiknas for financial assistance through the S3 Sandwich Program 2010.

References

[1] Myers, W.; Patil, G. P. Preliminary Prioritization Based on Partial Order Theory and R Software for
Compositional Complexes in Landscape Ecology, with Applications to Restoration, Remediation, and
Enhancement, Manuscript of Environmental Ecological Statistics. 2009.
[2] Kulldorff M. A Spatial Scan Statistic. Communications in Statistics: Theory and Methods, 26:1481–1496. 1997.
[3] Kulldorff, Martin, SaTScanTM User Guide for version 9.0. 2010.
[4] McCullagh P, Regression Models for Ordinal Data. Journal of the Royal Statistical Society, Series B
(Methodological), Vol. 42, No. 2, pp. 109-142. http://www.jstor.org/pss/2984952 [25 April 2011, 6:30]. 1980.
[5] McCullagh, Peter; Nelder, John. Generalized Linear Models, Second Edition. Boca Raton: Chapman and
Hall/CRC. ISBN 0-412-31760-5. 1989.
[6] Liang KY, Zeger SL, Longitudinal data analysis using generalized linear models, Biometrika, 73(1), pp. 13-22.
1986.
[7] Yang B, Lilly E. Analyzing Ordinal Repeated Measures Data Using SAS, Paper SP08, Indianapolis, Indiana.
2008.
[8] Clayton, David. Repeated Ordinal Measurement: a Generalized Estimating Equation Approach, MRC
Biostatistics Unit, Cambridge. 1992.
[9] Gortmaker, S.L., Poverty and Infant Mortality in the United States, American Sociological Review, Vol. 44, No. 2
(Apr., 1979), pp. 280-297
[10] Cressie, NAC. Statistics for Spatial Data, Revised Edition. New York: John Wiley & Sons. 1993.
[11] Nelder JA, Wedderburn RWM. Generalized Linear Models. Journal of the Royal Statistical Society Series A,
135, 370–384. 1972.
[12] Lipsitz SR, Kim K, Zhao L, 1994. Analysis of repeated categorical data using generalized estimating equations.
Statist. Med., 13: 1149–1163. doi: 10.1002/sim.4780131106

YEKTI WIDYANINGSIH
Department of Mathematics
Faculty of Mathematics and Natural Sciences, University of Indonesia
e-mail: [email protected]

ASEP SAEFUDDIN
Department of Statistics
Faculty of Mathematics and Natural Sciences, Bogor Agricultural University
e-mail: [email protected]

KHAIRIL ANWAR NOTODIPUTRO


Department of Statistics
Faculty of Mathematics and Natural Sciences, Bogor Agricultural University

AJI HAMIM WIGENA


Department of Statistics
Faculty of Mathematics and Natural Sciences, Bogor Agricultural University

Appendix

Table 1. Result of the generalized linear model with ordinal response.


Standard 95% Confidence
Category Parameter Estimate Error Limits Z Pr > |Z|
Intercept1 -10.0672 0.707 -11.4529 -8.6815 -14.24 <.0001
Intercept2 -6.1814 0.7515 -7.6542 -4.7086 -8.23 <.0001
kab 1 3.197 0.5408 2.1371 4.2569 5.91 <.0001
kab 2 3.595 0.4558 2.7016 4.4885 7.89 <.0001
kab 3 0 0 0 0 . .
< 1 gz 1 0.0549 0.6277 -1.1753 1.2852 0.09 0.9303
2 to 6 gz 2 -0.0577 0.4771 -0.9927 0.8773 -0.12 0.9037
>= 6 gz 3 0 0 0 0 . .
< 45 ptan 1 0.3766 0.3143 -0.2394 0.9926 1.2 0.2308
45 to 80 ptan 2 0.2983 0.338 -0.3642 0.9608 0.88 0.3776
>= 80 ptan 3 0 0 0 0 . .
< 500 jktan 1 1.7383 0.1741 1.3971 2.0795 9.99 <.0001
500-1000 jktan 2 1.7019 0.1093 1.4877 1.9161 15.57 <.0001
>= 1000 jktan 3 0 0 0 0 . .
< 100 jtk 1 0.4326 0.0974 0.2417 0.6235 4.44 <.0001
100 - 200 jtk 2 0.0994 0.2556 -0.4017 0.6004 0.39 0.6975
>= 200 jtk 3 0 0 0 0 . .
< 100 telpon 1 2.1349 0.6005 0.958 3.3119 3.56 0.0004
100 - 200 telpon 2 1.502 0.6532 0.2218 2.7822 2.3 0.0215
>= 200 telpon 3 0 0 0 0 . .
< 5 sekolah 1 1.8475 0.2257 1.4052 2.2898 8.19 <.0001
5 to 10 sekolah 2 0.9655 0.1471 0.6772 1.2539 6.56 <.0001
>= 10 sekolah 3 0 0 0 0 . .
< 5 medis 1 0.4661 0.3386 -0.1975 1.1296 1.38 0.1686
5 to 10 medis 2 0.3391 0.2586 -0.1677 0.8459 1.31 0.1898
>= 10 medis 3 0 0 0 0 . .
< 50 pln 1 -0.896 0.7902 -2.4449 0.6528 -1.13 0.2569
50 to 90 pln 2 -0.5129 0.2355 -0.9745 -0.0513 -2.18 0.0294
>= 90 pln 3 0 0 0 0 . .
hotspot -1.5512 0.2462 -2.0339 -1.0686 -6.3 <.0001

Table 2. Result of the generalized linear model with ordinal response.

Analysis Of GEE Parameter Estimates


Empirical Standard Error Estimates

Standard 95% Confidence


Parameter Estimate Error Limits Z Pr > |Z|

Intercept1 -5.2631 0.2545 -5.7619 -4.7643 -20.68 <.0001


Intercept2 -1.8998 0.0469 -1.9917 -1.8079 -40.51 <.0001
kab 1 1.9344 0.2028 1.5370 2.3318 9.54 <.0001
kab 2 2.8865 0.3087 2.2814 3.4917 9.35 <.0001
kab 3 0.0000 0.0000 0.0000 0.0000 . .
jktan 1 2.1449 0.1315 1.8870 2.4027 16.31 <.0001
jktan 2 1.7428 0.0592 1.6267 1.8588 29.43 <.0001
jktan 3 0.0000 0.0000 0.0000 0.0000 . .
hotspot -1.3014 0.2764 -1.8431 -0.7598 -4.71 <.0001

Table 3. Some values of predictions

Obs kab desa _ORDER_ _LEVEL_ phat lower upper clogit

1 1 1 1 1 0.16995 0.15143 0.19023 -1.58596


2 1 1 2 2 0.85537 0.77949 0.90821 1.77734
3 1 2 1 1 0.23436 0.17759 0.30260 -1.18386
4 1 2 2 2 0.89839 0.86926 0.92161 2.17944
5 1 3 1 1 0.23436 0.17759 0.30260 -1.18386
6 1 3 2 2 0.89839 0.86926 0.92161 2.17944

Proceedings of ”The 6th SEAMS-UGM Conference 2011”
Statistics, pp. 879 - 894.

EMPIRICAL PROPERTIES AND MIXTURE OF DISTRIBUTIONS: EVIDENCE FROM BURSA MALAYSIA STOCK MARKET INDICES

ZETTY AIN KAMARUZZAMAN, ZAIDI ISA, MOHD TAHIR ISMAIL

Abstract. This paper studies the behavior of financial time series in three indices of the Bursa Malaysia Index Series, namely the FTSE Bursa Malaysia Composite Index (FBM KLCI), the Finance Index and the Industrial Index, from July 1990 until July 2010. We observe that these three indices are characterized by the presence of stylized facts such as lack of normality, skewness and excess kurtosis. This paper provides a discussion of how mixture distributions accommodate the non-normality and asymmetry characteristics of financial time series data. We also present the most commonly used Maximum Likelihood Estimation (MLE) via the EM algorithm to fit the mixture Normal distribution, and we study two-component Normal mixtures using data sets on logarithmic stock returns of Bursa Malaysia indices.

Keywords and Phrases : Bursa Malaysia stock market indices; behavior of financial time
series; mixture distributions.

1. INTRODUCTION

It is a stylized fact that the marginal distributions of stock returns are poorly
described by the Normal distribution. It has been established that return distributions have
thick tails, are skewed and leptokurtic relative to the Normal distribution (having more
values near the mean and in the extreme tails; dramatic falls and spectacular jumps appear
with higher frequency than predicted) (Franses and van Dijk [1] and Cont [2]). Mixtures of
Normal distributions have been associated with empirical finance. There exists a long
history of modeling asset returns with a mixture of Normal (see Press [3], Praetz [4], Clark
[5], Blattberg and Gonedes [6] and Kon [7]). One attractive property of the mixtures of
Normal is that it is flexible to accommodate various shapes of continuous distributions, and
able to capture leptokurtic, skewed and multimodal characteristics of financial time series
data. The EM algorithm is a popular tool for simplifying maximum likelihood problems in
the context of a mixture model. The EM algorithm has become the method of choice for
_______________________________
2010 Mathematics Subject Classification: 62-07; 62P05; 65Y04


estimating the parameters of a mixture model, since its formulation leads to straightforward
estimators (Picard [8]). The key property of the EM algorithm has been established by
Dempster et al. [9].
The paper is structured as follows. We start by presenting the case study and their
properties in Section 2. We describe the data and test the normality assumption for the
monthly stock returns. In Section 3, we introduce the statistical distribution to be fitted to the
data. Also we provide discussion on how mixture distributions accommodate the non-
normality and asymmetry characteristics of financial time series. In Section 4, we fit the
specification to the data and present the most commonly used Maximum Likelihood
Estimation (MLE) via the EM algorithm to fit the two-component mixture of Normal
distribution. Lastly, Section 5 concludes. We summarize the main findings of our study.

2. CASE STUDY

2.1 Data. The data sets used in this paper are monthly closing prices covering a twenty-year period from July 1990 to July 2010 for three Malaysian stock market indices, namely the Composite Index, the Finance Index and the Industrial Index, as obtained from DataStream. All three indices are denominated in the local currency, the Malaysian Ringgit (MYR). In total, we have 241 observations per index. The behavior of the three indices over the sample period is shown below. Figure 1 depicts the time series of the monthly stock market indices of Bursa Malaysia. It can be seen that prices rise and fall over time.

Figure 1. Time series plot of monthly stock market indices

2.2 Return Series. Prior to analysis, all the series are converted to returns, i.e. the first difference of natural logarithms multiplied by 100, so that changes are expressed in percentage terms. Let P_it be the observed monthly closing price of market index i at time t, i = 1, ..., n and t = 1, ..., T. The monthly rates of return are defined as the percentage rate of return
y_it = 100 × log( P_it / P_i,t−1 ).   (1)

2.3 Empirical Properties of Return. Figure 2 depicts the monthly returns of Bursa Malaysia
stock market indices. There are periods of quiet and periods of wild variation in the monthly
returns. The period analyzed can be characterized as a period of market instability as it
reflects the upturn and downturn of Malaysia stock market.
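The return transformation in (1) can be computed directly in R (the price vector below is hypothetical and only illustrates the call):

# Percentage log returns from a vector of monthly closing prices; values are illustrative.
price <- c(505.92, 512.30, 498.77, 503.10)   # hypothetical closing prices
y <- 100 * diff(log(price))                  # y_t = 100 * log(P_t / P_{t-1}), as in (1)
y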

Figure 2. Time series of index rate for three Malaysia stock markets

The three stock market indices are listed in Table 1, which summarizes some relevant information about their empirical distributions. The table reports descriptive statistics and tests for the monthly stock returns. First, the means of the series are, in general, not significantly different from zero (H0: µ = 0). Second, there is some evidence of negative skewness (β1, defined as the 3rd standardized moment) in the monthly Industrial Index, while the Composite Index and Financial Index return distributions are positively skewed. Third, it has been found that stock returns in financial markets have excess kurtosis, i.e. kurtosis significantly greater than 3 (the value for a normal distribution). Kurtosis β2, defined as the ratio of the 4th central moment to the square of the variance, increases with excessive mass either in the tails or at the centre of the distribution. Table 1 shows that all three distributions are leptokurtic, thus exhibiting fat tails and high peaks. The Jarque-Bera test rejects the null hypothesis of normality for each of the three stock market indices.
Table 1. Summary statistics of the Bursa Malaysia stock market indices
Statistics      Composite Index   Financial Index   Industrial Index
Mean                 0.3237            0.6021            0.3233
Median               0.6119            0.8037            0.6474
Maximum             28.2488           40.3887           22.5572
Minimum            -24.8089          -32.3621          -23.1723
Std. Dev.            6.8727            9.3636            5.9313
Skewness             0.0749            0.3479           -0.3980
Kurtosis             5.2621            6.5048            5.4303
Jarque-Bera         51.3942          127.6802           65.4024
P-value              0.0000            0.0000            0.0000
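These summary measures can be reproduced with a short R sketch for a return vector y as defined in (1); the Jarque-Bera statistic is computed from its textbook formula rather than from a package (names are illustrative).

# Descriptive statistics and the Jarque-Bera normality test for a return series y; sketch only.
describe.returns <- function(y) {
  n  <- length(y)
  m  <- mean(y)
  s2 <- mean((y - m)^2)                          # variance used for the standardized moments
  sk <- mean((y - m)^3) / s2^1.5                 # skewness (beta_1)
  ku <- mean((y - m)^4) / s2^2                   # kurtosis (beta_2)
  jb <- n * (sk^2 / 6 + (ku - 3)^2 / 24)         # Jarque-Bera statistic
  c(mean = m, sd = sd(y), skewness = sk, kurtosis = ku,
    jarque.bera = jb, p.value = 1 - pchisq(jb, df = 2))
}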

Figure 3 depicts the histograms of the Malaysian stock return rates and the corresponding normal curves with the same means and standard deviations. The departures from normality can be seen in the histograms displayed in Figure 3, where the normal distribution generated by the sample mean and standard deviation of each index is shown together with the observed histogram. It can easily be seen that the empirical distributions are more peaked and have heavier tails than the normal distribution. Note also that the return distributions with thicker tails have a thinner and higher peak in the center compared to the normal distribution.

Figure 3. Empirical distribution of the Malaysian indices

3. MIXTURE OF DISTRIBUTION

Most financial market returns are both skewed and leptokurtic. Based on the above analysis, the monthly log returns are far from normally distributed. Hence, a number of alternative skewed and leptokurtic distributions have been applied. The mixture of normal distributions is by far the most extensively applied, and its simplest case, a mixture of two univariate normal distributions, may be considered the most widely used. A flexible and tractable way of modeling departures from normality is a mixture of two normal distributions; a mixture of two lognormal distributions also fits financial data better than a single normal distribution. Fama [10] claims that a mixture of several normal distributions with the same mean but different variances is the most popular approach to describe the long-tailed distribution of price changes.
Recent studies of stock returns tend to use mixtures of Normal distributions. Under the assumption of a Normal distribution, the log return is normally distributed with mean μ and variance σ², i.e. r_t ~ N(μ, σ²). Advantages of the mixture of Normals include that it maintains the tractability of the normal, has finite higher-order moments, and can capture excess kurtosis (Tsay [11]). Besides, the mixture of Normals has other advantages. One is that it can capture structural change not only in the variance but also in the mean; another is that it can be asymmetric (Knight and Satchell [12]). It is also believed that Normal mixtures are appropriate for accommodating certain discontinuities in stock returns such as the ‘weekend effect’, the ‘turn-of-the-month effect’ and the ‘January effect’ (Klar and Meintanis [13]). Besides, mixtures of normal models are easy to interpret if the asset returns are viewed as generated from different information distributions, where the mixture proportions can accommodate cyclical parameter shifts or switches among a finite number of regimes (Xu and Wirjanto [14]). Another appealing feature of mixtures of normal models for modeling asset returns is their flexibility to approximate various shapes of continuous distributions by adjusting the component weights, means and variances (Tan and Chu [15]).
The general form of the CDF of a Normal mixture can be represented as
F(x) = Σ_{i=1}^K π_i Φ((x − μ_i)/σ_i),   (1)
where Φ is the cumulative distribution function of N(0, 1). The probability density function of a mixture of Normals is therefore given by
f(x) = Σ_{i=1}^K π_i φ(x; μ_i, σ_i),   (2)
where
φ(x; μ_i, σ_i) = (1/√(2πσ_i²)) exp( −(x − μ_i)² / (2σ_i²) ),
Σ_{i=1}^K π_i = 1 and 0 ≤ π_i ≤ 1 for i = 1, 2, ..., K.
Thus, in a Normal mixture, the return distribution is approximated by a mixture of Normals, each of which has its own mean μ_i, standard deviation σ_i and weight (or probability, or mixing parameter) π_i (Subramanian and Rao [16]).
A mixture of two Normal distributions is given by
f(x_t; μ_1, μ_2, σ_1², σ_2²) = π N(x_t; μ_1, σ_1²) + (1 − π) N(x_t; μ_2, σ_2²),   (3)
where N(x; μ_i, σ_i²) is the pdf of a Normal distribution with mean μ_i and variance σ_i². This mixture implies that stock returns are drawn from a normal distribution with mean μ_1 and standard deviation σ_1 with probability π, and from a normal distribution with mean μ_2 and standard deviation σ_2 with probability (1 − π).
If X is a mixture of K Normals with pdf as in (2), then its mean, variance, skewness and kurtosis are
μ = Σ_{i=1}^K π_i μ_i,
σ² = Σ_{i=1}^K π_i (σ_i² + μ_i²) − μ²,
skewness = (1/σ³) Σ_{i=1}^K π_i [ (μ_i − μ)³ + 3σ_i²(μ_i − μ) ],
kurtosis = (1/σ⁴) Σ_{i=1}^K π_i [ 3σ_i⁴ + 6(μ_i − μ)²σ_i² + (μ_i − μ)⁴ ].
The five parameters (π, μ_1, μ_2, σ_1², σ_2²) of the two-component mixture distribution allow a very flexible description of departures from symmetry and normality. Therefore, mixture distributions have the ability to deal with skewness and kurtosis in analyzing financial time series. By using mixture distributions, we can obtain densities with a higher peak and with heavier tails than the Normal distribution.
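A brief R sketch (illustrative names) evaluating the mixture density (2) and the moment formulas above:

# Density and moments of a K-component normal mixture; illustrative sketch.
dnormmix <- function(x, w, mu, sigma)          # mixture pdf, eq. (2); w = weights
  Reduce(`+`, lapply(seq_along(w), function(i) w[i] * dnorm(x, mu[i], sigma[i])))

mixmoments <- function(w, mu, sigma) {
  m  <- sum(w * mu)                                        # mean
  v  <- sum(w * (sigma^2 + mu^2)) - m^2                    # variance
  sk <- sum(w * ((mu - m)^3 + 3 * sigma^2 * (mu - m))) / v^1.5
  ku <- sum(w * (3 * sigma^4 + 6 * (mu - m)^2 * sigma^2 + (mu - m)^4)) / v^2
  c(mean = m, variance = v, skewness = sk, kurtosis = ku)
}
# Example: equal-weight mixture with a common mean but different variances
mixmoments(w = c(0.5, 0.5), mu = c(0, 0), sigma = c(1, 3))   # symmetric but leptokurtic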

4. FITTING A MIXTURE NORMAL DISTRIBUTION TO DATA

In this section, we describe a simple mixture model for density estimation, and the
associated EM algorithm for carrying out maximum likelihood estimation.

4.1 The EM Algorithm for Two Component Mixture Model. Fitting mixture distributions
can be handled by a wide variety of techniques, such as graphical methods, the method of
moments, maximum likelihood and Bayesian approaches (see Titterington et al. [17] for an
exhaustive review of these methods). Considerable advances have now been made in the
fitting of mixture models, especially via the maximum likelihood method. The maximum
likelihood method has attracted much attention, mainly due to the existence of an
associated statistical theory.

Remark 1. (Estimation of the mixture Normal pdf). With two mixture components, the log-likelihood is
$$\log L = \sum_{t=1}^{T} \ln f\left(x_t; \mu_1, \mu_2, \sigma_1^2, \sigma_2^2, \pi\right)$$
where $f(\cdot)$ is the pdf in (3). A numerical optimization method could be used to maximize
this likelihood function. However, this is tricky, so an alternative approach is often used
(Söderlind [18]).
We used a procedure called the EM algorithm, given in Algorithm 1 for the special case
of Normal mixtures (Hastie et al. [19]). In the Expectation (E) step, we do a soft assignment of
each observation to each model: the current estimates of the parameters are used to assign
responsibilities according to the relative density of the training points under each model. Next,
in the Maximization (M) step, these responsibilities are used in weighted maximum-likelihood
fits to update the estimates of the parameters (see Hastie et al. [19]).

Algorithm 1. EM algorithm for two-component Normal mixture (Hastie et al. [19] and
Söderlind [18])
Step 1: Take initial guesses for the parameters $\hat\mu_1, \hat\mu_2, \hat\sigma_1^2, \hat\sigma_2^2$ and $\hat\pi$.
According to Hastie et al. [19], a good way to construct initial guesses for $\hat\mu_1$ and
$\hat\mu_2$ is to choose two of the $x_t$ at random. However, we can also choose $\hat\mu_1 = x_1$ and
$\hat\mu_2 = x_2$ as the initial guesses for $\hat\mu_1$ and $\hat\mu_2$ (Söderlind [18]). The mixing
proportion $\hat\pi$ can be started at the value 0.5. Both $\hat\sigma_1^2$ and $\hat\sigma_2^2$ can be set equal to
the overall sample variance
$$\frac{1}{T}\sum_{t=1}^{T}\left(x_t - \bar{x}\right)^2 .$$
Step 2: Expectation (E) Step. Calculate the responsibilities
$$\hat\gamma_t = \frac{\hat\pi\, N(x_t; \hat\mu_2, \hat\sigma_2^2)}{(1-\hat\pi)\, N(x_t; \hat\mu_1, \hat\sigma_1^2) + \hat\pi\, N(x_t; \hat\mu_2, \hat\sigma_2^2)} \qquad \text{for } t = 1, \ldots, T.$$
Step 3: Maximization (M) Step. Calculate the weighted means and variances
$$\hat\mu_1 = \frac{\sum_{t=1}^{T}(1-\hat\gamma_t)\, x_t}{\sum_{t=1}^{T}(1-\hat\gamma_t)}, \qquad \hat\sigma_1^2 = \frac{\sum_{t=1}^{T}(1-\hat\gamma_t)\,(x_t-\hat\mu_1)^2}{\sum_{t=1}^{T}(1-\hat\gamma_t)},$$
$$\hat\mu_2 = \frac{\sum_{t=1}^{T}\hat\gamma_t\, x_t}{\sum_{t=1}^{T}\hat\gamma_t}, \qquad \hat\sigma_2^2 = \frac{\sum_{t=1}^{T}\hat\gamma_t\,(x_t-\hat\mu_2)^2}{\sum_{t=1}^{T}\hat\gamma_t}, \qquad \text{and} \qquad \hat\pi = \frac{\sum_{t=1}^{T}\hat\gamma_t}{T}.$$
Step 4: Iterate over Steps 2 and 3 until the parameter values converge.
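For concreteness, Algorithm 1 can be coded directly. The Python sketch below is our own illustration (not the authors' implementation); the initial values are supplied by the caller, for example those listed in Table 2.

import numpy as np

def em_two_component(x, mu1, mu2, var1, var2, pi, max_iter=500, tol=1e-8):
    # EM algorithm for a two-component Normal mixture, mirroring Algorithm 1
    x = np.asarray(x, dtype=float)

    def normal_pdf(x, mu, var):
        return np.exp(-(x - mu) ** 2 / (2 * var)) / np.sqrt(2 * np.pi * var)

    for _ in range(max_iter):
        # E step: responsibilities of the second component for each observation
        num = pi * normal_pdf(x, mu2, var2)
        gamma = num / ((1 - pi) * normal_pdf(x, mu1, var1) + num)

        # M step: weighted means, variances and mixing proportion
        w1, w2 = (1 - gamma), gamma
        new_mu1 = np.sum(w1 * x) / np.sum(w1)
        new_mu2 = np.sum(w2 * x) / np.sum(w2)
        new_var1 = np.sum(w1 * (x - new_mu1) ** 2) / np.sum(w1)
        new_var2 = np.sum(w2 * (x - new_mu2) ** 2) / np.sum(w2)
        new_pi = np.mean(gamma)

        # Stop when the parameters no longer change appreciably
        change = max(abs(new_mu1 - mu1), abs(new_mu2 - mu2),
                     abs(new_var1 - var1), abs(new_var2 - var2), abs(new_pi - pi))
        mu1, mu2, var1, var2, pi = new_mu1, new_mu2, new_var1, new_var2, new_pi
        if change < tol:
            break
    return pi, mu1, var1, mu2, var2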

For the initial values, we decided on the values shown in Table 2.

Table 2. Initial values for the EM algorithm

Parameter     Composite Index    Finance Index    Industry Index
π(0)          0.5                0.5              0.5
μ1(0)         -10.7651           -10.7736         -11.9791
μ2(0)         -1.8481            -6.0206          0.5584
σ1²(0)        47.2335            87.6776          35.1808
σ2²(0)        47.2335            87.6776          35.1808

After running the EM algorithm, our final maximum likelihood estimates of the unknown
parameters are as described in Table 3, which summarizes the two-component Normal mixture
fitted by the EM algorithm. For each index, there are two components, with two weights, two
means, two standard deviations and the overall log-likelihood. We report in Table 3 the
maximum-likelihood estimates resulting from fitting the theoretical distributions described
previously to the series of monthly stock returns of the three indices under consideration.
This table provides the estimates of the five parameters of a two-component Normal
mixture, as well as the fitted Normal mixture model, for the three stock market indices of Bursa
Malaysia. For the Composite Index, 27.08% of the returns follow the first Normal distribution and
72.92% follow the second Normal distribution. For the Finance Index, 38.68% of the returns follow
the first Normal distribution and 61.32% follow the second. Meanwhile, for the Industry Index,
19.64% of the returns follow the first Normal distribution and 80.36% follow the second.
Several important observations may be drawn from Table 3. First, the components of the
stock price series can clearly be distinguished with respect to their variance. For these
medium-term (monthly) data, the first component is associated with the higher variance except
for the monthly Finance Index, and the high-variance component has the smaller probability
except for the Finance Index.

Table 3. EM estimation for Normal mixtures

Composite Index
            Weight, π    Mean, μ    Variance, σ²
Normal 1    0.2708       -4.4542    43.4384
Normal 2    0.7292       2.0979     36.7481
LogL        -789.6809
Model       f(x_t) = 0.2708 N(-4.4542, 43.4384) + 0.7292 N(2.0979, 36.7481)

Finance Index
            Weight, π    Mean, μ    Variance, σ²
Normal 1    0.3868       -1.9839    79.9652
Normal 2    0.6132       2.2331     85.0682
LogL        -854.8642
Model       f(x_t) = 0.3868 N(-1.9839, 79.9652) + 0.6132 N(2.2331, 85.0682)

Industry Index
            Weight, π    Mean, μ    Variance, σ²
Normal 1    0.1964       -5.9841    35.8676
Normal 2    0.8036       1.8652     22.7276
LogL        -754.8154
Model       f(x_t) = 0.1964 N(-5.9841, 35.8676) + 0.8036 N(1.8652, 22.7276)

In order to judge whether the estimated models are compatible with the stylized facts of
the data, we compute the implied skewness $\gamma_1$ and the implied kurtosis $\gamma_2$ of the models
from
$$\gamma_1 = \frac{\sum_{i=1}^{I} p_i\left(3\delta_i\sigma_i^2 + \delta_i^3\right)}{\left[\sum_{i=1}^{I} p_i\left(\sigma_i^2 + \delta_i^2\right)\right]^{3/2}}, \qquad
\gamma_2 = \frac{\sum_{i=1}^{I} p_i\left(3\sigma_i^4 + 6\delta_i^2\sigma_i^2 + \delta_i^4\right)}{\left[\sum_{i=1}^{I} p_i\left(\sigma_i^2 + \delta_i^2\right)\right]^{2}},$$
with $\delta_i = \mu_i - \mu$, where $\mu$ is the overall mean.

Table 4. The implied skewness and implied kurtosis of the models

       Composite Index    Finance Index    Industry Index
γ1     -0.1020            0.0616           -0.3650
γ2     5.7435             6.8628           5.7651

The results, reported in Table 4, show a rather close agreement between the pattern of
skewness and kurtosis in the data and the implied skewness and kurtosis; in particular, there
is quite close agreement between implied and actual leptokurtosis for the monthly data.
Figure 4 depicts the mixture Normal distribution for the three indices. It shows that the
mixture Normal distribution can accommodate leptokurtosis as well as skewness in the data, as
the distribution has thicker tails and a higher peak. From Table 3 and Figure 4, two indices
(Composite Index and Industry Index) indicate that the first Normal is a low-mean, high-variance
regime and the second Normal is a high-mean, low-variance regime. However, the Finance Index
indicates that the first Normal is a low-mean (-1.9839), low-variance (79.9652) regime and the
second Normal is a high-mean (2.2331), high-variance (85.0682) regime.
Meanwhile, the weights indicate that the second regime is the more prevalent regime for the
three stock market indices of Bursa Malaysia.

Figure 4. Mixture of Normal distributions



We also plot the histogram of the data together with a non-parametric density estimate (Figure
5a). In Figure 5b, we add the density of each component to the plot, scaled by the share it has
in the mixture, so that it is visually comparable to the overall density.

[Figure 5 panels: Monthly Composite Index, Monthly Finance Index and Monthly Industry Index; horizontal axis: Return, vertical axis: Density.]

Figure 5a (left). Histogram (grey) for monthly stock returns of Bursa Malaysia. The dashed
line is a kernel density estimate, which is not completely satisfactory. Figure 5b (right). As in
the previous figure, plus the components of a mixture of two Normals fitted to the data by the
EM algorithm. These are scaled by the mixing weights of the components.
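A figure of this kind can be reproduced along the following lines. The Python sketch below is our own illustration (the paper's figures were not necessarily produced this way), and the fitted parameters would be those reported in Table 3.

import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import gaussian_kde, norm

def plot_mixture_fit(returns, pi, mu1, sigma1, mu2, sigma2, title):
    # Histogram, kernel density estimate, and mixture components scaled by their weights
    grid = np.linspace(returns.min(), returns.max(), 400)
    fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))
    for ax in (ax1, ax2):
        ax.hist(returns, bins=30, density=True, color="lightgrey")
        ax.plot(grid, gaussian_kde(returns)(grid), "k--")   # kernel density estimate
        ax.set_xlabel("Return")
        ax.set_ylabel("Density")
        ax.set_title(title)
    # Component densities scaled by the mixing weights, as in Figure 5b
    ax2.plot(grid, pi * norm.pdf(grid, mu1, sigma1))
    ax2.plot(grid, (1 - pi) * norm.pdf(grid, mu2, sigma2))
    plt.tight_layout()
    plt.show()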

5. CONCLUSION

In this paper, we focus on studying the behaviour of financial time series for three stock
market indices of Bursa Malaysia (Composite Index, Finance Index and Industry Index) over
20 years, and we characterize the presence of the stylized facts using data sets of
logarithmic stock returns. We started by describing the data and testing the hypothesis of
normality. We found that these three indices exhibit asymmetry and non-normality. Not
surprisingly, the distributions of the monthly stock returns analyzed show fat tails and high
peaks, as well as skewness in different directions. The results are in fact fully consistent
with those found for many other markets and reported in many other studies. From previous
studies we found that financial data may be successfully modelled by mixture distributions.
We also found that a mixture distribution can accommodate leptokurtosis as well as skewness
in the data. Lastly, we fitted the two-component mixture Normal distribution to the data sets
using the EM algorithm.

Acknowledgement. This work is supported by the Ministry of Higher Education, Malaysia


under the Fundamental Research Grant Scheme (UKM-ST-06-FRGS0102-2010).

References

[1] FRANSES, P. H. AND VAN DIJK, D., Non-Linear Time Series Models in Empirical Finance, Cambridge
University Press, Cambridge, 2000.
[2] CONT, R., Empirical properties of asset returns: stylized facts and statistical issues, Quantitative Finance, 1,
223-236, 2001.
[3] PRESS, S. J., A compound events model for security prices, J. Bus., 40(3), 317-335, 1967.
[4] PRAETZ, P. D., The distribution of share price changes, J. Bus., 45(1), 49-55, 1972.
[5] CLARK, P. K., A subordinated stochastic process model with finite variance for speculative prices,
Econometrica, 41(1), 135-155, 1973.
[6] BLATTBERG, R. C. AND GONEDES N. J., A comparison of the stable and student distributions as statistical
models for stock prices, J. Bus., 47(2), 244-280, 1974.
[7] KON, S. J., Models of stock returns – a comparison, J. Finance, 39(1), 147-165, 1984.
[8] PICARD, F., An introduction to mixture models, Statistics for Systems Biology Group, Research Report No 7,
2007.
[9] DEMPSTER, A. P., LAIRD, N. M. AND RUBIN D. B., Maximum likelihood from incomplete data via the EM
algorithm, J. Royal Statistical Society Series B, 39, 1-38, 1977.
[10] FAMA, E. F., The behavior of stock-market prices, J. of Business, 38(1), 34-105, 1965.
[11] TSAY, R. S., Analysis of Financial Time Series, Wiley Series in Probability and Statistics, 2005.
[12] KNIGHT, J., AND SATCHELL, S., Return Distributions in Finance, Quantitative Finance Series, 2001.
[13] XU, D., AND WIRJANTO, T., An empirical characteristic function approach to VaR under a mixture-of-
normal distribution with time-varying volatility. J. of Derivatives, 18(1), 39-58, 2010.
[14] TAN, K., AND CHU, M., Estimation of portfolio return and value at risk using a class of Gaussian mixture
distributions. The International Journal of Business and Finance Research, 6(1), 97-107, 2012.
[15] KLAR, B., AND MEINTANIS, S. G., Tests for normal mixtures based on the empirical characteristic function,
J. Computational Statistics and Data Analysis, 49, 227-242, 2005.
[16] SUBRAMANIAN, S., AND RAO U. S., Sensex and stylized facts an empirical investigation, Social Science
Research Network, id. 962828, 2007.
[17] TITTERINGTON, D. M., SMITH, A. F. M. AND MAKOV, U. E., Statistical Analysis of Finite Mixture
Distributions, John Wiley & Sons, 2001.
[18] SöDERLIND, P., Lecture Notes in Empirical Finance (PhD): Return Distribution, University of St. Gallen,
2010.
[19] HASTIE, T., TIBSHIRANI, R. AND FRIEDMAN, J., The Elements of Statistical Learning: Data Mining,
Inference and Prediction, Springer Verlag, 2001.

ZETTY AIN KAMARUZZAMAN


Universiti Kebangsaan Malaysia.
e-mail: [email protected]

ZAIDI ISA
Universiti Kebangsaan Malaysia.
e-mail: [email protected]

MOHD TAHIR ISMAIL


Universiti Sains Malaysia.
e-mail: [email protected]
Proceedings of ”The 6th SEAMS-UGM Conference 2011”
Applied Mathematics, pp. 895–904.

AN IMPROVED MODEL OF TUMOUR-IMMUNE SYSTEM


INTERACTIONS

Trisilowati, Scott W. McCue, Dann Mallet

Abstract. The immune system plays an important role in defending the body against
tumours and other threats. Currently, mechanisms involved in immune system interac-
tions with tumour cells are not fully understood. Here we develop a mathematical tool
that can be used in aiding to address this shortfall in understanding. This paper de-
scribes a hybrid cellular automata model of the interaction between a growing tumour
and cells of the innate and specific immune system including the effects of chemokines
that builds on previous models of tumour-immune system interactions. In particular, the
model is focused on the response of immune cells to tumour cells and how the dynamics
of the tumour cells change due to the immune system of the host. We present results and
predictions of in silico experiments including simulations of Kaplan-Meier survival-like
curves.
Keywords and Phrases: hybrid cellular automata, tumour, chemokine, immune, dendritic
cell, cytotoxic T lymphocyte.

1. INTRODUCTION
Cancer is one of the leading causes of death worldwide, with 7.9 million people
dying as a result of cancer in 2007 alone. This is projected to rise to 12 million by 2030
(see http://www.who.int/cancer/en, [13]). A similar report (see http://www.aihw.gov.au/, [1])
states that in Australia in 2007, cancer was the second most common cause
of death and that 108,368 new cases of cancer were diagnosed. For those diagnosed
with cancer between 1998 and 2004, the 5-year relative survival for all cancers combined
was 61%. Clearly, cancer is a major concern for public health officials around the world
and a greater understanding of cancer has potential to save many lives.
There is strong evidence in the literature for the hypothesis that tumour growth
is directly influenced by the cellular immune system of the human host. For example,

2010 Mathematics Subject Classification: 92C50


Sandel et al. [12] discuss the influence of dendritic cells in controlling prostate cancer.
Furthermore, tumour infiltrating dendritic cells (DCs) are a key factor at the interface
between the innate and adaptive immune responses in malignant diseases. While the
interactions of a tumour and the host immune system have been modelled previously
by, for example, Mallet and de Pillis [9] and de Pillis et al. [4], here we present the
first multidimensional, hybrid cellular automata model of the process that incorporates
important signalling molecules.
Hart [7] states that dendritic cells (DCs), found in many types of tumours, are
the dominant antigen presenting cells for initiating and maintaining the host immune
response. They are critical in activating, stimulating and recruiting T lymphocytes:
cells with the ability to lyse tumour cells. DCs have numerous states of activation,
maturation and differentiation. Natural killer (NK) cells and cytotoxic T lymphocyte
(CTL) cells also play important roles in the response of the immune system against the
tumour as described in Kindt et al. [8].
The dynamics of tumour growth and the interactions of growing tumours with
the host immune system have been studied using mathematical models over the past
four decades. Most of these models are presented using ordinary differential equations
(ODEs) or partial differential equations (PDEs) that impose restrictions on the modelled
system’s time-scales, as described in Ribba et al. [11]. However, a cellular automata
(CA) model can describe more complex mechanisms in the biological system without
such restrictions by detailing phenomena at the individual cell or particle level. The
classic definition of a CA model holds that they involve only local rules that depend
on the configuration of the spatial neighbourhood of each CA element. Hybrid cellular
automata (HCA), on the other hand, extend the CA to incorporate non-local effects,
often via coupling the CA with PDEs.
The purpose of the model developed in the present research is to investigate the
growth of a small solid tumour, when the growth is affected by the immune system. In
this preliminary study, we present a hybrid cellular automata model of the interaction
between a growing tumour and cells of the innate and specific immune system that also
includes generic signalling molecules known as chemokines. Chemokines are a family
of small cytokines, or proteins secreted by many different cell types, including tumour
cells. They can affect cell-cell interactions and play a fundamental role in the recruiting
or attracting cells of the immune system to sites of infection or tumour growth.
To include the effect of a chemokine in this model, we recognise the significantly
smaller size of such molecules compared with biological cells and introduce a partial
differential equation to describe the concentration of chemokine secreted by the tumour.
We combine the analytic solution of the partial differential equation model with a
number of biologically motivated automata rules to form the HCA model. We use
the hybrid cellular automata model to simulate the growth of a tumour in a number
of computational ‘cancer patients’. Each computational patient is distinguished from
others by altering model parameters. We define ‘death’ of a patient as the situation
where the cells of the tumour reach the boundary of our model domain; effectively this
represents tumour metastasis.

In the sections to follow, we present the development of the HCA model before
analysing numerical simulations. We conclude with a discussion of the results and
conclusion.

2. MATHEMATICAL MODEL
We investigate the growth of a solid tumour and its interaction with the host
immune system and a tumour-secreted chemokine. The model is comprised of a partial
differential equation to describe the chemokine secreted by the tumour, coupled with a
discrete, stochastic cellular automata describing individual cells. We employ a square-
shaped computational domain of length L, which is partitioned into a regular square
grid. Each square element in the grid represents a location that may contain a healthy
cell, tumour cell or immune cell.
We consider a number of biological cell types including normal healthy cells,
tumour cells (necrotic, dividing and migrating), DCs (mature and immature), NK cells
and CTL cells. To build the CA model, we define ‘rules’ that draw upon the biological
literature to describe cell-cell interactions, cell effects on the environment, and effects
of the environment on cells.
Initially, non-cancerous healthy cells cover the whole of the model domain, then
the tumour mass is allowed to grow from one cancer cell placed at the centre cell of the
grid. Cells of the host immune system are spread randomly over the domain throughout
the other healthy cells. Three separate immune cell populations are considered here –
the NK cells of the innate immune system and cells of the specific immune response,
represented by the CTL cells and DCs.
The model solutions are progressed via discrete time steps, at which each spatial
location is investigated to determine its contents and whether or not actions will occur.
This is summarised in Algorithm 1.

Algorithm 1 Brief pseudocode for the overall algorithm.


Draw parameters for current computational patient
Initialise domain
for each time step do
for each CA element do
Determine cell type in element
Characterise neighbourhood of element
Test whether event will occur and update state
end for
end for
Export data

2.1. Cellular Automata Rules. Each particular cell-level action is associated with a
probability of success, Pevent , that is compared with a pseudo-random number, r, drawn

from the uniform distribution on the interval [0, 1] to determine whether or not it is
carried out. To describe the evolution of the cell population, we introduce the general
algorithm of cellular automata rules as presented below.

Algorithm 2 Pseudocode for testing occurrence of individual events.


Draw r ∼ U [0, 1]
Calculate Pevent using current state of CA
if r < Pevent then
update state (the event occurs)
end if
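In Python, this stochastic rule test amounts to a one-line comparison; the sketch below is our own illustration rather than the authors' code.

import numpy as np

rng = np.random.default_rng()

def event_occurs(p_event):
    # Algorithm 2: the event happens when a U[0, 1] draw falls below its probability
    return rng.random() < p_event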

2.1.1. Host cells. As described in the work of Ferreira et al. [5] and of Mallet and de
Pillis [9], we assume that the healthy host cells are effectively passive bystanders in the
interaction. They do not hinder the growth of the tumour cells or the movement of any
cell type.

2.1.2. Tumour cells. In this model, we consider tumour growth to be influenced by


the immune system via NK cells, CTL cells and DCs. The tumour cells undergo the
processes of division, migration and lysis resulting from interaction with the immune
system. We assume that NK cells, CTL cells and mature dendritic cells can directly kill
the tumour cells. At each time step, the neighbourhood of each tumour cell is surveyed
to determine whether the cells of the immune system are present or not. If they are,
the tumour cell will be killed by the immune system whereas if there are no immune
system cells in the neighbourhood then the tumour cell is marked for potential division
or migration. Following this, a stochastic rule is checked to determine whether or not
the action will be carried out. The probability of tumour division, which depends on the
density of tumour cells in the neighbourhood of the dividing cell, has the form
$$P^{\mathrm{tmr}}_{\mathrm{div}} = \exp\left(-\left(\theta_{\mathrm{div}}\, T_{\mathrm{sum}}\right)^2\right),$$
where θdiv controls the shape of the curve allowing it to capture qualitative understand-
ing of the biology and Tsum is the number of tumour cells in a one cell radius of the cell
of interest. From Figure 1(a), it can be seen that tumour cell division is more likely
when there is space in the neighbourhood for the resulting daughter cell.
The probability of tumour lysis depends on the strength of the immune system
in the neighbourhood of the tumour cell (see Figure 1(b)), and is given by
$$P^{\mathrm{tmr}}_{\mathrm{lysis}} = 1 - \exp\left(-\left(\theta_{\mathrm{lysis}}\, I_{\mathrm{sum}}\right)^2\right),$$

where again θlysis controls the shape of the curve allowing it to capture qualitative
understanding of the biology and Isum is the number of immune cells in a one cell
radius of the cell of interest.
Figure 1. The form of the curves used to determine the probability of (a) tumour cell division and (b) tumour cell lysis, given different neighbourhood conditions.
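As an illustration of how these rules combine with the stochastic test of Algorithm 2, the following Python sketch evaluates the two probabilities; it is our own example, and the θ values are placeholders rather than calibrated parameters.

import numpy as np

rng = np.random.default_rng()

def p_division(theta_div, tumour_neighbours):
    # Probability that a tumour cell divides, given the tumour cells within one cell radius
    return np.exp(-(theta_div * tumour_neighbours) ** 2)

def p_lysis(theta_lysis, immune_neighbours):
    # Probability that a tumour cell is lysed, given the immune cells within one cell radius
    return 1.0 - np.exp(-(theta_lysis * immune_neighbours) ** 2)

# Example: placeholder theta values, two tumour neighbours and one immune neighbour
divides = rng.random() < p_division(0.4, 2)
lysed = rng.random() < p_lysis(0.4, 1)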

2.1.3. Immune System. At each time step, the neighbourhood of each immune cell is
surveyed to determine whether the tumour cells are present. If tumour cells are present,
the immune system will kill the tumour cells in the manner described above. If there
are no tumour cells in the neighbourhood of the CTL cells, then the CTL cells move
towards areas of higher chemokine concentration.
To control the normal background level of CTL cells, at each time step there is a
chance that healthy cells are replaced (from outside the computational domain) by new
immune cells. This is carried out by imposing a probability of healthy cell replacement
with a CTL, given by
$$P_{\mathrm{rep}}^{\mathrm{CTL}} = \mathrm{CTL}_0 - \frac{1}{n^2}\sum_{\mathrm{domain}} \mathrm{CTL}_{i,j}, \qquad (1)$$
where $\mathrm{CTL}_0$ is the ‘normal’ density of CTL cells and $n^2$ is the total number of CA
elements.
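A direct transcription of equation (1) in Python (our own sketch; the grid representation is an assumption):

import numpy as np

def p_ctl_replacement(ctl_grid, ctl0):
    # Equation (1): replacement probability drops as the domain-averaged CTL count approaches CTL0
    n_squared = ctl_grid.size                 # total number of CA elements, n^2
    return ctl0 - ctl_grid.sum() / n_squared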
NK cells and dendritic cells have similar rules to CTL cells, except that NK cells
and mature dendritic cells can lyse a tumour cell only once. When an immature dendritic
cell comes in contact with a tumour cell, it becomes a mature dendritic cell that has the
ability to kill tumour cells.

2.2. Chemokine Equation. To include the effect of a chemokine in this model, we


use a partial differential equation to describe the evolution of the concentration of
chemokine throughout the model domain. We combine the analytic solution of the
partial differential equation with a number of biologically motivated automata rules as
described above. The equation for the concentration of chemokine is given by
$$\frac{\partial C}{\partial t} = D\left(\frac{\partial^2 C}{\partial x^2} + \frac{\partial^2 C}{\partial y^2}\right), \qquad (2)$$

Figure 2. The growing tumour and host immune system after (a) 50 cell cycles and (b) 100 cell cycles.

where C(x, y, t) represents the chemokine concentration, D is the diffusion coefficient of


the chemokine, x and y represent the spatial variables along the horizontal and vertical
axes, and t represents time. The initial condition is given by
C(x, y, 0) = f (x, y), 0 < x < b, 0 < y < c,
where b = c = 1 mm. For all four boundaries of the domain, we set C = 0.
Using the boundary conditions as presented above along with an unspecified initial
condition, equation (2) is exactly solvable using separation of variables, with a solution
given by
$$C(x, y, t) = \sum_{m=1}^{\infty}\sum_{n=1}^{\infty} A_{m,n} \exp\left(-D\left(\frac{m^2\pi^2}{b^2} + \frac{n^2\pi^2}{c^2}\right) t\right) \sin\left(\frac{m\pi x}{b}\right) \sin\left(\frac{n\pi y}{c}\right),$$
where
$$A_{m,n} = \frac{4}{bc}\int_0^c\!\!\int_0^b f(x, y)\, \sin\left(\frac{m\pi x}{b}\right) \sin\left(\frac{n\pi y}{c}\right) dx\, dy.$$
For the present study, a tumour cell is initially placed in the middle of the grid and
is assumed to secrete a chemokine. The initial value of the chemokine concentration is
therefore a function of the form
$$f(x, y) = \exp\left(-0.5\left((x - b/2)^2 + (y - c/2)^2\right)\right). \qquad (3)$$
Chemokines then start to diffuse from the centre to the whole domain and attract the
immune system to the site of the tumour. This represents the behaviour of chemokines, such
as is described by Allavena et al. [2] and Murooka et al. [10]. For the initial condition
given in equation (3) we have numerically integrated using MATLAB's in-built adaptive
Simpson quadrature to obtain $A_{m,n}$.
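The same coefficients and truncated series can also be evaluated in Python. The sketch below is our own illustration, using scipy's dblquad in place of the MATLAB quadrature, with placeholder values for D, b and c.

import numpy as np
from scipy.integrate import dblquad

D, b, c = 1e-4, 1.0, 1.0   # diffusion coefficient and domain size (illustrative values)

def f(x, y):
    # Initial chemokine concentration, equation (3): a bump centred on the initial tumour cell
    return np.exp(-0.5 * ((x - b / 2) ** 2 + (y - c / 2) ** 2))

def fourier_coefficient(m, n):
    # A_{m,n} = (4 / (b c)) * double integral of f(x, y) sin(m pi x / b) sin(n pi y / c)
    integrand = lambda y, x: f(x, y) * np.sin(m * np.pi * x / b) * np.sin(n * np.pi * y / c)
    value, _ = dblquad(integrand, 0, b, lambda x: 0, lambda x: c)
    return 4.0 / (b * c) * value

def chemokine(x, y, t, modes=10):
    # Truncated separation-of-variables solution of the diffusion equation (2)
    total = 0.0
    for m in range(1, modes + 1):
        for n in range(1, modes + 1):
            decay = np.exp(-D * (m**2 * np.pi**2 / b**2 + n**2 * np.pi**2 / c**2) * t)
            total += (fourier_coefficient(m, n) * decay
                      * np.sin(m * np.pi * x / b) * np.sin(n * np.pi * y / c))
    return total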

Figure 3. Total cell counts after 100 cell cycles: (a) tumour and necrotic cells; (b) immature and mature dendritic cells, NK cells and CTL cells.

3. RESULTS
We combine the solution of the PDE with the CA as described in Section 2 to
simulate the evolution of the growing tumour. A two-dimensional regular 100 × 100
square domain is used with 100 time steps, and a Moore neighbourhood is considered
for the cellular automata rules. In this simulation, the estimated value of the diffusion
coefficient for the chemokine is $D = 10^{-4}\ \mu\mathrm{m}^2\,\mathrm{s}^{-1}$. The distribution of the growing tumour
after 50 and 100 cell cycles is shown in Figure 2, with results qualitatively matching
those of Mallet and de Pillis [9].
Figure 3(a) shows the evolution of the tumour cell and necrotic cell densities over
100 cell cycles. This plot shows the characteristic exponential and linear growth phases
of solid, avascular tumours (see for example, Folkman and Hochberg [6]), as well as a
slower growing population of necrotic cells. In Figure 3(b) we see that initially the number
of mature dendritic cells is zero until immature dendritic cells come in contact with
tumour cells, at which point the matured dendritic cells commence killing the tumour
cells. After around 80 cycles, all immature dendritic cells have matured and the number
of mature dendritic cells stabilises. As expected, due to the nature of equation (1), the
populations of NK cells and CTL cells remain approximately steady over the extent of
the tumour growth.
We also use the hybrid cellular automata model to investigate the growth of a
tumour in a number of computational ‘cancer patients’. Each computational patient is
distinguished from others by altering model parameters. We define ‘death’ of a patient
as occurring when the tumour is able to metastasise. Effectively, this is when the cells
of the tumour reach the boundary of our model domain. We present the results of these
simulations using a simulated Kaplan-Meier survival curve, shown in Figure 4. The
figure shows that metastasis sets in for the first patients after 80 cycles. Metastasis
of the simulated tumours occurred in approximately 60% of simulated patients after
Figure 4. Simulated Kaplan-Meier curve (percentage of patients surviving versus cell cycles).

250 cycles after which time most surviving patients exhibited dormant tumours being
controlled by the immune system.

4. CONCLUSION
Duchting and Vogelsaenger [3] pioneered the use of discrete cellular automata for
modelling cancer, investigating the effects of radio-therapy. Ferreira et al. [5] modelled
avascular cancer growth with a CA model based on the fundamental biological process
of proliferation, motility, and death, including competition for diffusing nutrients among
normal and cancer cells. Based on the Ferreira et al. model, Mallet and de Pillis [9]
constructed a hybrid cellular automata cancer model that built on the work of Ferreira
et al. to include NK cells as the innate immune system and CTL cells as the specific
immune system. The Mallet and de Pillis model lacked sufficient detail of the immune
system and in this present research, we attempt to improve on their work by explicitly
describing more of the host immune system. While direct comparison of the models
is difficult, the results described in Figure 3(a) qualitatively reflect the findings of
Mallet and de Pillis and of Ferreira et al.
While models based on differential equations allow for analytical investigations
such as stability and parameter sensitivity analyses, and ease of fitting the model to
experimental data, these types of models cannot capture the detailed cellular and sub-
cellular level complexity of the biological system. On the other hand, HCA models can
describe greater complexity of the biological process such as the interaction between
every single cell. In current work complementary to the research presented in this paper,
we have included greater realism in the modelling of tumour-secreted chemokines by
allowing secretion due to cell-cell interaction. Currently, chemokines and their receptors
in the tumour microenvironment are being extensively investigated to produce thera-
peutic interventions to combat cancer (see, for example, Allavena et al. [2] and Murooka
et al. [10]). Our models currently under development will allow for simulation-based
and theoretical investigations of such interventions.
We have developed a useful model that can be employed as a preliminary inves-
tigative tool for experimentalists who conduct expensive in vitro and in vivo experiments
to test and refine hypotheses prior to entering the lab. With further cross disciplinary
collaboration, this type of model can be refined to provide a more accurate descrip-
tion of the underlying cancer biology and hence yield more relevant predictions and
tests of hypotheses. Future developments based upon this model will be related to the
specific context of colorectal cancer, and the effect of chemokines on the cell-cell inter-
actions will be deeply investigated. More complex partial differential equations related
to chemokines secretion resulting from cell-cell interactions will be introduced in future
work.

Acknowledgement. The first author acknowledges QUT for conference funding as


well as DIKTI for a scholarship for postgraduate study.

References
[1] Australian Institute of Health & Welfare, Cancer, available at http://www.aihw.gov.au/cancer/, accessed April 16, 2011.
[2] Allavena, P., Marchesi, F. and Mantovani, A., The role of chemokines and their receptors
in tumor progression and invasion: Potential new target of biological therapy, Current Cancer
Therapy Reviews 1, 81-92, 2005.
[3] Duchting, W. and Vogelsaenger, T., Analysis, forecasting and control of three-dimensional
tumor growth and treatment. J. Med. Syst. 8, 461-475, 1984.
[4] de Pillis, L.G., Mallet, D.G. and Radunskaya, A.E., Spatial Tumor-Immune Modeling. Com-
putational and Mathematical Methods in Medicine 7:2-3, 159-176, 2006.
[5] Ferreira Jr. S.C., Martins, M.L., Vilela, M.J., Reaction diffusion model for the growth of
avascular tumor. Phys. Rev. E 65, 021907, 2002.
[6] Folkman, J. and Hochberg, M., Self regulation of growth in three dimensions. J. Exp. Med.
138, 745-753, 1973.
[7] Hart, D.N., Dendritic cells: Unique leukocyte populations which control the primary immune
response. Blood 90, 3245-3287, 1997.
[8] Kindt, T.J., Goldsby, R.A., and Osborne, B.A., Immunology, W.H. Freeman and Company,
New York, 2007.
[9] Mallet, D.G. and de Pillis, L.G., A cellular automata model of tumour-immune system inter-
actions, J. Theoretical Biology 239, 334-350, 2006.
[10] Murooka, T.T., Ward, S.E., and Fish, E.N., Chemokines and Cancer, in: Cytokines and Cancer,
Springer, New York, 2005.
[11] Ribba, B., Alarcon, T., Marron, K., Maini, P.K. and Agur, Z., The use of hybrid cellular
automata models for improving cancer therapy. ACRI 2004, LNCS 3305, 444-453, 2004.
[12] Sandel, M.H. et al., Prognostic Value of Tumor-Infiltrating Dendritic Cells in Colorectal Cancer:
Role of Maturation Status and Intra-tumoral Localization. Clinical Cancer Research 11:7, 2576-
2582, 2005.
[13] World Health Organization, Programmes and Projects: Cancer, available at http://www.who.int/cancer/en/, accessed April 16, 2011.

Trisilowati
Mathematical Sciences Discipline, Queensland University of Technology, Brisbane, Australia.

Mathematics Department, Brawijaya University, Indonesia.


e-mail: [email protected]

Scott W. McCue
Mathematical Sciences Discipline, Queensland University of Technology, Brisbane, Australia.
e-mail: [email protected]

Dann Mallet
Mathematical Sciences Discipline and Institute of Health and Biomedical Innovation, Queens-
land University of Technology, Brisbane, Australia.
e-mail: [email protected]
