Computational Intelligence in Control Engineering

Foreword

In the last 50 years, Automatic Control Theory has developed into a well-established engineering discipline that has found application in space technology, industry, household appliances and other technological implementations. It was designed to monitor and correct the performance of systems without the intervention of a human operator. Lately, with the growth of digital computers and the universal acceptance of systems theory, it has been discovered and used in softer fields of human interest such as ecology, economics and biology. In the meantime, being a dynamic discipline, Automatic Control with the aid of the digital computer has evolved from simple servomechanisms to an autonomous self-organizing decision-making methodology that has been given the name of Intelligent Control. Several manifestations of Intelligent Control have been proposed by various scientists in the literature. Fuzzy, Neural, Hierarchical Intelligent, Cerebellar and Linguistic control systems are typical examples of such theoretically developed Intelligent Controls. However, the application of such sophisticated methodologies to real-life problems lags far behind the theory. The areas with the highest need and the smallest tolerance for adopting the techniques resulting from such theoretical research are the industrial complexes. The main reason is the lack of suitable intelligent computational algorithms and interfaces designed especially for their needs. This book attempts to correct this by first presenting the theory and then developing various computational algorithms to be adapted for the various industrial applications that require Intelligent Control for efficient production.

The author, who was one of the first to actually implement Intelligent Control in industry, accomplishes this goal by developing step by step some of the most important Intelligent Computational Algorithms. His industrial experience, coupled with a strong academic background, has been channeled into creating a book that is suitable for graduate academic education and as a manual for the practicing industrial engineer. Such a book fills a major gap in the global literature on Computational Intelligence and could serve as a text for the developing areas of biological, societal and ecological systems. I am very proud to introduce such an important work.

George N. Saridis
Professor Emeritus
Rensselaer Polytechnic Institute
Troy, New York, 1999

Preface
Conventional control techniques based on industrial three-term controllers are almost universally used in industry and manufacturing today, despite their limitations. Modern control techniques have proved difficult to apply because of the problems of establishing faithful microscopic models of the processes under control. It is not surprising, therefore, that manual control constitutes the norm in industry. In the early 1970s Intelligent Control techniques, which emulate by machine the processing of human knowledge about controlling a process, appeared and a new era of control was born. Intelligent Control has come a long way since then, breaking down the barriers of industrial conservatism with impressive results. Intelligent Control, which includes Fuzzy, Neural, Neuro-fuzzy and Evolutionary Control, is the result of applying Computational Intelligence to the control of complex systems. This class of unconventional control systems differs radically from conventional (or hard) control systems that are based on classical and modern control theory. The techniques of Intelligent Control are being applied increasingly to industrial control problems and are leading to solutions where conventional control methods have proved unsuccessful. The outcome of their application to industry and manufacturing has been a significant improvement in productivity, reduced energy consumption and improved product quality, factors that are of paramount importance in today's global market.

The first Chapter presents an introduction to Computational Intelligence, the branch of Soft Computing which includes Expert Systems, Fuzzy Logic, Artificial Neural Networks and Evolutionary Computation (Genetic Algorithms and Simulated Annealing), with special emphasis on its application to Control Engineering. The theoretical background required to allow the reader to comprehend the underlying principles has been kept to a minimum. The reader is expected to possess a basic familiarity with the fundamentals of conventional control, since it is inconceivable that unconventional control techniques could be applied without an understanding of conventional ones. The book is written at a level suitable for both undergraduate and graduate students as well as for practicing engineers who are interested in learning about the unconventional control systems that they are likely to see in increasing numbers in the next millennium. The primary objective of the book is to show the reader how the fusion of the techniques of Computational Intelligence can be applied to the design of Intelligent Systems that, unlike conventional control systems, can learn, remember and make decisions.

After many years of teaching in higher education, the author took leave to work in industry, only to face the technology gap between control theory and practice firsthand. He is one of that rare breed of academics who had a free hand to experiment on-line on large-scale chemical processes. He spent considerable time trying to apply conventional modern control techniques but, frustrated with the outcome, sought unconventional techniques that could and did yield solutions to the difficult practical control problems that he faced. His search led him first to fuzzy control and later to neural control, which he applied to the process industry with considerable success. Those pioneering years in industry proved critical to his thinking about control practice and the use of Computational Intelligence, which is proving to be a powerful tool with which to bridge the technology gap.
After some ten years in industry, the author returned to academe, applying reverse technology transfer by instructing his students in Intelligent Control techniques that have proved effective in industry. This book is the result of the experience he gained during those years in industry and of teaching this material to his graduate class on Intelligent Control; many of the examples presented in the book derive from this experience.

Chapter 1 is an introduction to the techniques of Computational Intelligence, their origins and application to Control Engineering. Conventional and Intelligent Control are compared, with a view to focusing on the differences which led to the need for Intelligent Control in industry and manufacturing. Chapter 2 discusses Expert Systems with reference to their engineering applications and presents some common applications in industry and manufacturing. Chapter 3 discusses Intelligent Control Systems, their goals and objectives, while Chapter 4 discusses their principal components. The elements of Fuzzy Logic on which Fuzzy Controllers are based are presented in Chapter 5, while Chapter 6 discusses the mechanisms of Fuzzy Reasoning, i.e., the inference engine that is the kernel of every fuzzy controller. Chapter 7 defines the fuzzy algorithm, methods of fuzzification and de-fuzzification, and outlines the principal fuzzy controller design considerations. Chapter 8 presents fuzzy three-term industrial controllers, which are replacing many conventional three-term controllers in the industrial environment. The requirements for real-time fuzzy controllers, both supervisory and embedded, are discussed in Chapter 9, which also includes examples of industrial applications. Chapter 10 outlines the Takagi-Sugeno Model-Based Fuzzy Controller design technique and fuzzy gain-scheduling, which fuse conventional and fuzzy control. Neural Control, the second important technique of Intelligent Control, is presented in Chapter 11. The elemental artificial neuron and the multi-layer artificial neural networks that form the kernel of neural controllers are introduced in this Chapter. The delta and back-propagation algorithms, two of the most common algorithms for training neural networks, are described in Chapter 12. Chapter 13 discusses how neural controllers can be trained from linguistic control rules identical to those used in fuzzy control. Finally, the result of fusing the fuzzy and neural techniques of Computational Intelligence in the design of hybrid neuro-fuzzy controllers is discussed in Chapter 14.
Evolutionary Computation, the latest entrant in the field of Computational Intelligence, and Genetic Algorithms, the best known example of stochastic numerical optimization techniques, are presented in Chapter 15. Chapter 16 introduces Simulated Annealing, a stochastic technique that has found considerable application in engineering optimization. Finally, Chapter 17 demonstrates how these two techniques can be used to advantage in the design of conventional and intelligent controllers. An extensive Bibliography on Computational Intelligence and its applications is presented in Chapter 18.


Appendix A offers a step-by-step study of the design of a fuzzy controller for a realistic non-linear dynamic plant using MATLAB and its Fuzzy Toolbox. Appendices B and C offer listings of the MATLAB m-files of the Genetic and Simulated Annealing Algorithms. Finally, Appendix D presents a listing of a MATLAB m-file for training industrial neural controllers using the Neural Toolbox.

Acknowledgments
This book would not have been written had it not been for two people: an anonymous kidney donor and Mark Hardy, M.D., Auchincloss Professor of Surgery in the Department of Surgery at the College of Physicians & Surgeons of Columbia University in New York, who performed the transplant. Together, they gave the author that most precious gift: life. He is forever indebted to them. The author also gratefully acknowledges the contributions of his colleagues and former students at the University of Patras in Greece: N. Antonopoulos to Chapter 2, K. Kouramas to Chapters 2 and 10, P. Skantzakis to Chapter 11, G. Tsitouras and G. Nikolopoulos to Chapter 13, and V. Goggos to Chapters 15, 16 and 17.

Robert E. King
[email protected]
October 2004

Series Introduction
Many textbooks have been written on control engineering, describing new techniques for controlling systems, or new and better ways of mathematically formulating existing methods to solve the increasingly complex problems faced by practicing engineers. However, few of these books fully address the applications aspects of control engineering. It is the intention of this new series to redress this situation. The series will stress applications issues, and not just the mathematics of control engineering. It will provide texts that not only contain an exposé of both new and well-established techniques, but also present detailed examples of the application of these methods to the solution of real-world problems. The authors will be drawn from both the academic world and the relevant applications sectors. There are already many exciting examples of the application of control techniques in the established fields of electrical, mechanical (including aerospace), and chemical engineering. We have only to look around in today's highly automated society to see the use of advanced robotics techniques in the manufacturing industries; the use of automated control and navigation systems in air and surface transport systems; the increasing use of intelligent control systems in the many artifacts available to the domestic consumer market; and the reliable supply of water, gas, and electrical power to the domestic consumer and to industry. However, there are currently many challenging problems that could benefit from wider exposure to the applicability of control methodologies, and the systematic systems-oriented basis inherent in the application of control techniques.

This new series will present books that draw on expertise from both the academic world and the applications domains, and will be useful not only as academically recommended course texts but also as handbooks for practitioners in many applications domains.

Neil Munro

Contents
Series Introduction by Neil Munro
Foreword by George N. Saridis
Preface
1. Introduction
  1.1 Conventional Control
  1.2 Intelligent Control
  1.3 Computational Intelligence in Control
2. Expert Systems in Industry
  2.1 Elements of an Expert System
  2.2 The Need for Expert Systems
  2.3 Stages in the Development of an Expert System
  2.4 The Representation of Knowledge
  2.5 Expert System Paradigms
    2.5.1 Expert systems for product design
    2.5.2 Expert systems for plant simulation and operator training
    2.5.3 Expert supervisory control systems
    2.5.4 Expert systems for the design of industrial controllers
    2.5.5 Expert systems for fault prediction and diagnosis
    2.5.6 Expert systems for the prediction of emergency plant conditions
    2.5.7 Expert systems for energy management
    2.5.8 Expert systems for production scheduling
    2.5.9 Expert systems for the diagnosis of malfunctions
3. Intelligent Control
  3.1 Conditions for the Use of Intelligent Control
  3.2 Objectives of Intelligent Control
4. Techniques of Intelligent Control
  4.1 Unconventional Control
  4.2 Autonomy and Intelligent Control
  4.3 Knowledge-Based Systems
    4.3.1 Expert systems
    4.3.2 Fuzzy control
    4.3.3 Neural control
    4.3.4 Neuro-fuzzy control
5. Elements of Fuzzy Logic
  5.1 Basic Concepts
  5.2 Fuzzy Algorithms
  5.3 Fuzzy Operators
  5.4 Operations on Fuzzy Sets
  5.5 Algebraic Properties of Fuzzy Sets
  5.6 Linguistic Variables
  5.7 Connectives
6. Fuzzy Reasoning
  6.1 The Fuzzy Algorithm
  6.2 Fuzzy Reasoning
    6.2.1 Generalized Modus Ponens (GMP)
    6.2.2 Generalized Modus Tollens (GMT)
    6.2.3 Boolean implication
    6.2.4 Lukasiewicz implication
    6.2.5 Zadeh implication
    6.2.6 Mamdani implication
    6.2.7 Larsen implication
    6.2.8 GMP implication
  6.3 The Compositional Rules of Inference
7. The Fuzzy Control Algorithm
  7.1 Controller Decomposition
  7.2 Fuzzification
    7.2.1 Steps in the fuzzification algorithm
  7.3 De-fuzzification of the Composite Controller Output Membership Function
    7.3.1 Center of area (COA) de-fuzzification
    7.3.2 Center of gravity (COG) de-fuzzification
  7.4 Design Considerations
    7.4.1 Shape of the fuzzy sets
    7.4.2 Coarseness of the fuzzy sets
    7.4.3 Completeness of the fuzzy sets
    7.4.4 Rule conflict
8. Fuzzy Industrial Controllers
  8.1 Controller Tuning
  8.2 Fuzzy Three-Term Controllers
    8.2.1 Generalized three-term controllers
    8.2.2 Partitioned controller architecture
    8.2.3 Hybrid architectures
    8.2.4 Generic two-term fuzzy controllers
  8.3 Coarse-Fine Fuzzy Control
9. Real-time Fuzzy Control
  9.1 Supervisory Fuzzy Controllers
  9.2 Embedded Fuzzy Controllers
  9.3 The Real-time Execution Scheduler
10. Model-Based Fuzzy Control
  10.1 The Takagi-Sugeno Model-Based Approach to Fuzzy Control
  10.2 Fuzzy Variables and Fuzzy Spaces
  10.3 The Fuzzy Process Model
  10.4 The Fuzzy Control Law
  10.5 The Locally Linearized Process Model
    10.5.1 Conditions for closed system stability
  10.6 The Second Takagi-Sugeno Approach
  10.7 Fuzzy Gain-Scheduling
11. Neural Control
  11.1 The Elemental Artificial Neuron
  11.2 Topologies of Multi-layer Neural Networks
  11.3 Neural Control
  11.4 Properties of Neural Controllers
  11.5 Neural Controller Architectures
    11.5.1 Inverse model architecture
    11.5.2 Specialized training architecture
    11.5.3 Indirect learning architecture
12. Neural Network Training
  12.1 The Widrow-Hoff Training Algorithm
  12.2 The Delta Training Algorithm
  12.3 Multi-layer ANN Training Algorithms
  12.4 The Back-propagation (BP) Algorithm
13. Rule-Based Neural Control
  13.1 Encoding Linguistic Rules
  13.2 Training Rule-Based Neural Controllers
14. Neuro-Fuzzy Control
  14.1 Neuro-Fuzzy Controller Architectures
  14.2 Neuro-Fuzzy Isomorphism
15. Evolutionary Computation
  15.1 Evolutionary Algorithms
  15.2 The Optimization Problem
  15.3 Evolutionary Optimization
  15.4 Genetic Algorithms
    15.4.1 Initialization
    15.4.2 Decoding
    15.4.3 Evaluation of the fitness
    15.4.4 Recombination and mutation
    15.4.5 Selection
    15.4.6 Choice of parameters of a GA
  15.5 Design of Intelligent Controllers Using GAs
    15.5.1 Fuzzy controllers
    15.5.2 Neural controllers
16. Simulated Annealing
  16.1 The Metropolis Algorithm
  16.2 Application Examples
17. Evolutionary Design of Controllers
  17.1 Qualitative Fitness Function
  17.2 Controller Suitability
18. Bibliography
  A. Computational Intelligence
  B. Intelligent Systems
  C. Fuzzy Logic and Fuzzy Control
  D. Fuzzy Logic and Neural Networks
  E. Artificial Neural Networks
  F. Neural and Neuro-Fuzzy Control
  G. Computer and Advanced Control
  H. Evolutionary Algorithms
  I. MATLAB and its Toolboxes
Appendix A  Case Study: Design of a Fuzzy Controller Using MATLAB
  A.1 The Controlled Process
  A.2 Basic Linguistic Control Rules
  A.3 A Simple Linguistic Controller
  A.4 The MATLAB fuzzy Design Tool
  A.5 System Stabilization Rules
  A.6 On the Universe of Discourse of the Fuzzy Sets
  A.7 On the Choice of Fuzzy Sets
  A.8 Compensation of Response Asymmetry
  A.9 Conclusions
Appendix B  Simple Genetic Algorithm
Appendix C  Simulated Annealing Algorithm
Appendix D  Network Training Algorithm
Index

Chapter 1

Introduction
Modern control theory, which has contributed so significantly to the exploration and conquest of space, has not had similar success in solving the control problems of industry and manufacturing. Despite the progress in the field since the 1950s, the chasm between theory and practice has been widening and many of the needs of industry remain unsolved. Industry has had little choice, therefore, but to rely heavily on conventional (sometimes termed hard) control techniques that are based on industrial three-term controllers. Unfortunately, these simple and ubiquitous devices cannot always cope with the demands and complexity of modern manufacturing systems. The chasm between theory and practice has led to a search for new and unconventional techniques, not subject to the constraints and limitations of modern control theory, to solve the control problems faced by industry and manufacturing. The breakthrough came in the mid-1960s with the introduction of Fuzzy Logic by Zadeh. The application of Zadeh's theory to control was to come almost ten years later and it was to take even more years before it received the respect and acceptance that it rightly deserved. At about the same time, Widrow demonstrated the use of ADALINEs (Adaptive Linear Networks), a primitive form of Artificial Neural Networks (ANNs), in control. This was a radical departure from conventional control, since a generic controller was trained to perform a specific task instead of being designed.
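The training procedure behind the ADALINE, the Widrow-Hoff least-mean-squares (LMS) rule, is simple enough to sketch. The following Python fragment is purely illustrative (the book's own listings use MATLAB); the training data, learning rate and target mapping are assumptions chosen for the example, not taken from the text:

```python
import numpy as np

def train_adaline(X, d, lr=0.1, epochs=100):
    """Widrow-Hoff (LMS) training of a single linear neuron (ADALINE).

    X: input patterns (n_samples, n_inputs); d: desired outputs.
    Returns the learned weight vector, with the bias as the last element.
    """
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])  # append a constant bias input
    w = np.zeros(Xb.shape[1])
    for _ in range(epochs):
        for x, target in zip(Xb, d):
            y = w @ x                       # linear output (no threshold)
            w += lr * (target - y) * x      # Widrow-Hoff delta rule
    return w

# Train the ADALINE to reproduce a hypothetical linear mapping y = 2*x1 - x2 + 0.5
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(200, 2))
d = 2.0 * X[:, 0] - X[:, 1] + 0.5
w = train_adaline(X, d)
print(np.round(w, 2))   # converges toward [2, -1, 0.5]
```

Because the target here is exactly linear and realizable, the weights converge to the generating coefficients; with noisy plant data the same rule converges in the mean toward the least-squares solution.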


The two approaches were developed independently and it was to take many years before these concepts were applied to any degree. The application of Fuzzy Logic to Control Engineering was first demonstrated in Europe and Japan in the mid-1970s. Mamdani presented the first demonstration of Fuzzy Logic in 1974 on an experimental process. This demonstration of Fuzzy Logic Control (FLC) gave the impetus for a seemingly endless series of applications, which continues unabated to this day. With a few notable exceptions, Zadeh's theory of Fuzzy Logic went unnoticed in the West for many years while, in the meantime, there was a frenzy of activity in Japan applying the theory to such varied fields as home appliances, cameras and transportation systems. Not until the early 1980s did industries in the West seriously consider applying fuzzy control. At the forefront of this thrust was the process industry and in particular the cement industry, which was the first to apply the new technique to control large-scale processes. The developments in the field since then have been impressive and today there are hundreds of plants worldwide being successfully controlled by such techniques. The field of Artificial Neural Networks, which evolved quite separately, has had a difficult history. Appearing in the 1970s as a field that offered much promise and potential, it was thwarted by inadequate computational facilities and a lack of effective network training algorithms. The field re-emerged in the 1980s, by which time significant progress had been made in both training algorithms and computer hardware, and research and development have advanced rapidly since. Artificial Neural Networks can be found today in a host of applications ranging from communications to speech analysis and synthesis, control and more.

1.1 Conventional Control


Despite the advances in the theory of automatic control, most industrial plants, even to this day, are under the exclusive supervision and control of human operators. Their observations on the state of the plant, from a host of measurements taken from sensors in the plant, coupled with their knowledge and experience of the plant, lead them to decide what control strategy to adopt in order to achieve the desired product quality and production specifications. In the past, industry has had little option but to use Classical Control theory, based on macroscopic models of the plant, in designing appropriate conventional controllers. These methods depend on empirical knowledge of the dynamic behavior of the controlled plant, derived from measurements of the control and manipulated variables of that plant. Traditionally, industry has relied heavily on three-term (PID) controllers, which are incorporated today in most Remote Terminal Units (RTUs) and Programmable Logic Controllers (PLCs). The ubiquitous three-term controller is used to control all kinds of devices, industrial processes and manufacturing plants. Its tuning is based on simple approximants of the controlled plant dynamics and on design methods such as the classical ones of Ziegler and Nichols or more modern techniques such as those of Persson and Åström. Most often in practice, tuning is performed heuristically by expert tuners in situ. Without doubt, these simple industrial controllers have offered sterling service for many decades and will continue to do so for many more, wherever simplicity and robustness are essential and control specifications permit. However, three-term controllers cannot always satisfy the increasing complexity of modern industrial plants and the demands for high flexibility, productivity and product quality, which are essential in today's very competitive global market. The problem is further aggravated by the increasing environmental restrictions being placed on industry and manufacturing. Modern Control was introduced in the early 1960s and is a rigorous methodology that has proved invaluable for finding solutions to well-structured control problems. With a few notable exceptions, however, its application to industry has been disappointing and few industrial controllers are designed with this methodology.
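The three-term law itself is compact enough to sketch in full. The following Python fragment is illustrative only (the book's own listings use MATLAB, and the first-order plant model and gains below are assumptions for the example, not a tuned industrial design):

```python
class PID:
    """Discrete three-term (PID) controller in positional form."""
    def __init__(self, kp, ki, kd, dt):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral = 0.0
        self.prev_error = 0.0

    def update(self, setpoint, measurement):
        error = setpoint - measurement
        self.integral += error * self.dt                   # integral (reset) term
        derivative = (error - self.prev_error) / self.dt   # derivative (rate) term
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative

# Close the loop around a hypothetical first-order plant dy/dt = (u - y)/tau,
# regulating its output y toward a set point of 1.0 with Euler integration.
dt, tau = 0.01, 0.5
pid = PID(kp=2.0, ki=1.0, kd=0.05, dt=dt)
y = 0.0
for _ in range(int(20.0 / dt)):       # simulate 20 s
    u = pid.update(1.0, y)
    y += dt * (u - y) / tau
print(round(y, 2))   # → 1.0 (integral action removes the steady-state error)
```

The integral term is what eliminates steady-state offset here; proportional-only control of the same plant would settle below the set point.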
The reasons for this discrepancy are the complexity, uncertainty and vagueness with which industrial processes are characterized: conditions that do not allow for ready modeling of the controlled plant, which is essential to the application of modern control methodologies. Despite more than five decades of research and development in the theory and practice of Control Engineering, most industrial processes are by and large still controlled manually. Today, Supervisory Control And Data Acquisition (SCADA) systems and Distributed Control Systems (DCS) make the operator's task considerably easier. A partial schematic of such an information system, using a distributed architecture, is shown in Figure 1.1.


[Figure 1.1 Distributed Control System architecture: raw materials, fuel and air feed a kiln producing clinker; RTUs connect over a LAN to the operator consoles]

The operator console possesses one or more screens that display the essential variables of the plant through a graphical user interface by which the operator interacts with the plant. A typical example of such a display is shown in Figure 1.2. In plants where the various sub-processes interact, it is clear that the control problem can be severe, requiring operator skills that can only be acquired after years of experience. Today, Multimedia and Virtual Reality are finding their way into the control room, improving the man-machine interface and making decision-making considerably easier and the work environment more tolerable.


One or more human operators normally supervise a cluster of sub-processes, receiving data on the state of the plant and sending corrections to the set points of the local controllers which are distributed throughout the plant so that the plant remains at its nominal state despite external disturbances. These local controllers are often embedded in RTUs that are also capable of performing sequential switching control, data acquisition and communications with the Supervisory Control System and the operators consoles via a local area network.
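The hierarchy just described, a supervisory layer trimming the set points of local controllers embedded in RTUs, can be caricatured in a few lines. This Python sketch is purely illustrative; the function names, the single quality variable and all gains are assumptions, not part of the book:

```python
def local_controller(setpoint, measurement, gain=0.5):
    """Local proportional controller, as might be embedded in an RTU."""
    return gain * (setpoint - measurement)

def supervisor(nominal_setpoint, quality_error, trim_gain=0.1):
    """Supervisory layer: trims the local set point to hold product quality."""
    return nominal_setpoint + trim_gain * quality_error

# One pass through the hierarchy: the supervisor corrects the set point of a
# local loop, and the RTU's controller acts on the corrected set point.
setpoint = supervisor(nominal_setpoint=100.0, quality_error=-5.0)
u = local_controller(setpoint, measurement=98.0)
print(setpoint, u)   # 99.5 0.75
```

In a real DCS both layers run continuously, with the supervisory corrections arriving over the LAN at a much slower rate than the local control loops execute.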

Figure 1.2 A typical graphical user interface

In most industrial applications, human operators close the loop between the controlled and the control variables of the controlled plant. Operators respond to observations of the principal variables of the plant and continuously strive to satisfy often-conflicting objectives, e.g., maximizing productivity and profit while minimizing energy demand. Proper operation of a process is thus very much dependent on the experience of the operator, his knowledge about the process and its dynamics and the speed with which he responds to plant disturbances, malfunctions and disruptions. The yield of a process can vary quite significantly from operator to operator, and less experienced operators are often unable to control a plant effectively, particularly in abnormal situations which they have never met before. The control actions of a human operator are subjective, frequently incomprehensible and often prone to errors, particularly when he is under stress. Indeed, in the case of abnormal operating (i.e., alarm) conditions, operators' actions may be potentially dangerous and there is little margin for error. Delays in making decisions can lead to disastrous results, as was amply demonstrated in the Chernobyl nuclear reactor disaster. Thus in modern complex plants there exists a very real need to assist operators in their decision-making, particularly in abnormal situations in which they are often bombarded with conflicting signals. The advent of Computational Intelligence and unconventional control frees operators of many of the tedious and complex chores of monitoring and controlling a plant, assuring them of fast and consistent support in their decision-making.

1.2 Intelligent Control


During the past twenty years or so, a major effort has been under way to develop new and unconventional control techniques that can often augment or replace conventional control techniques. A number of unconventional control techniques have evolved, offering solutions to many difficult control problems in industry and manufacturing. This is the essence of what has been termed Practical Control: a collection of techniques that practicing engineers have found effective and easy to use in the field. It is true to say that virtually none of the techniques of unconventional control would have been possible but for the availability of computationally powerful and high-speed computers. Significant research has been carried out in understanding and emulating human intelligence while, in parallel, developing inference engines for processing human knowledge. The resultant techniques incorporate notions gathered from a wide range of specializations such as neurology, psychology, operations research, conventional control theory, computer science and communications theory. Many of the results of this effort have migrated to the field of Control Engineering, and their fusion has led to a rapid growth of new techniques such as inductive reasoning, connectionism and parallel distributed processing for dealing with vagueness and uncertainty. This is the domain of Soft Computing, which focuses on stochastic, vague, empirical and associative situations, typical of the industrial and manufacturing environment. Intelligent Controllers (sometimes termed soft controllers) are derivatives of Soft Computing, characterized by their ability to establish the functional relationship between their inputs and outputs from empirical data, without recourse to explicit models of the controlled process. This is a radical departure from conventional controllers, which are based on explicit functional relations. Unlike their conventional counterparts, intelligent controllers can learn, remember and make decisions. The functional relationship between the inputs and outputs of an intelligent controller can be specified either indirectly, by means of a relational algorithm, relational matrix or knowledge base, or directly, from a specified training set.

The first category belongs to the domain of Fuzzy Systems, while Artificial Neural Networks belong to the second. Generality, whereby similar inputs to a plant produce similar outputs so that sensitivity to perturbations in the plant inputs is minimized, is an inherent feature of such systems. Generality implies that the controller is capable of operating correctly on information beyond the training set. Intelligent controllers, whatever form they may take, share the following properties: they use the same process states, use parallel distributed associative processors, assure generality, and are capable of codifying and processing vague data.

The principal medium of intelligent control is Computational Intelligence, the branch of Soft Computing which includes Expert Systems, Fuzzy Logic, Artificial Neural Networks and their derivatives. Evolutionary Computation (Genetic Algorithms and Simulated Annealing) is a very recent addition to this rapidly evolving field.


1.3 Computational Intelligence in Control


The field of Expert Systems, the first class of systems that this book discusses, is the precursor to Computational Intelligence and the most successful outgrowth of Artificial Intelligence. Expert systems use linguistic rules to specify domain knowledge and are used extensively today in industry in such diverse applications as fault prediction, fault diagnosis, energy management, production management and supervisory control, among others. Chronologically, fuzzy logic was the first technique of intelligent control. Neural, neuro-fuzzy and evolutionary control and their derivatives followed later, each technique offering new possibilities and making intelligent control even more versatile and applicable in an ever-increasing range of industrial applications. The third technique of intelligent control considered in this book appeared towards the end of the 1980s and is based on Artificial Neural Networks. Neural networks have had a varied history, progress having remained stagnant until the mid-1980s, when efficient training algorithms were developed and fast computational platforms became readily available. Since then, neural networks have had a remarkable resurgence and are successfully used in a wide range of applications such as communications, speech analysis and synthesis, pattern recognition, system identification and control. Finally, in the mid-1990s Evolutionary Control, an outgrowth of Evolutionary Computing, emerged as a viable method for optimum control. This technique, which is possible only because of the rapid developments in computer hardware and software, uses stochastic methods. Since the early 1990s a major effort has been under way to develop derivatives of these techniques in order to exploit the best features of each in the design of intelligent controllers. These new techniques have revolutionized the field of Control Engineering, offering new hope in solving many of the difficult control problems of industry and manufacturing.
Computational Intelligence is based on concepts that practicing control engineers use on a daily basis and has played a major role in reducing the chasm between advanced control and engineering practice. The new control techniques based on Computational Intelligence no longer face the barrier of disbelief that they faced when they first appeared. Numerous successful applications in a variety of fields attest to the usefulness and power of these techniques. Computational Intelligence uses numerical representation of knowledge, in contrast to Artificial Intelligence, which uses symbolic representation. This feature is exploited in Control Engineering, which deals with numerical data since control and controlled variables are both defined numerically. Computational Intelligence adapts naturally to the engineering world, requiring no further data conversion. The techniques of Computational Intelligence share the following properties: they use a numerical representation of knowledge, demonstrate adaptability, have an inherent tolerance to errors, and possess speeds comparable to those of humans.

Intelligent controllers infer the control strategy that must be applied to a plant in order to satisfy specific design requirements. This action can be the result of operations on a set of pre-specified linguistic control rules, as in the case of Fuzzy Controllers, or of training an artificial neural network with numerically coded rules as in the case of Neural Controllers. In either case, the primary objective is to generate control actions which closely match those of an expert human operator. In this manner, the controller can assist the human operator to maintain the plant under his supervision at its nominal operating state while simultaneously compensating for his inconsistency and unreliability brought about by fatigue, boredom and difficult working conditions. Intelligent controllers can be trained to operate effectively in conditions of vagueness and uncertainty of both the plant state and plant environment and can respond to unforeseen situations autonomously, i.e., without intervention from the plant operator. They differ, however, from their human counterpart in their ability to learn new control rules or to adapt to new situations for which they have not been trained. Selforganizing controllers that have the ability to learn new rules on-line have been variously proposed in the literature and tried out in the laboratory, but none has been commissioned so far in a manufacturing plant. The main reason is that this class of controllers assumes extended testing and experimentation on the controlled plant under normal operating conditions, a situation that few plant managers are likely to entertain.


A variety of architectures have been proposed for the design and implementation of high-level intelligent controllers for large-scale systems. One of the most useful is the hierarchical architecture proposed by Saridis in the mid-1970s. In this architecture, information from the controlled plant flows with decreasing frequency from the lowest to the highest layer of the hierarchy. In contrast, management directives (on such matters as production quotas, product qualities, etc.) flow in the reverse direction with increasing frequency as they descend the hierarchy, leading ultimately to selection of the best control strategy that must be imposed on the plant. Saridis' principle, on which a number of successful intelligent hierarchical process management and control systems have been developed, can be paraphrased as: increasing precision is accompanied by decreasing intelligence, and vice versa. It is useful, finally, to note the features that every Intelligent System involving clusters of intelligent controllers must support:

Correctness, i.e., the ability to operate correctly for specific sets of commands and plant safety constraints.

Robustness, i.e., the ability to operate acceptably despite wide variations in plant parameters. The higher layers of the hierarchy must possess an inherent ability to deal with unforeseen variations.

Extendibility, i.e., the ability to accept extensions to both hardware and software without the necessity for major modifications to either. Extendibility implies modularity, i.e., the partitioning of the system into easily modifiable software and hardware modules.

Reusability, i.e., the ability to use the same software in different applications. To possess this feature, the system must be general or possess an open architecture.
The field of intelligent control is one of the most exciting and promising new directions of automatic control, opening up new frontiers for research and development of radical solutions to the control of industrial systems in the new millennium.


Chapter 2

Expert Systems in Industry


Expert Systems, which are the most commercially successful result of research in Artificial Intelligence, are software entities that emulate the cognitive abilities of human experts in complex decision-making situations. As one of the primary activities of Computer Science, depending heavily on the rapid developments in computer technology, Expert Systems have been eagerly adopted by industry and applied to a wide range of applications. Expert Systems belong to the field of Intelligent Knowledge-Based Systems, which constitutes one of the principal fields of activity of Computational Intelligence, a field that has been referred to as the science that attempts to reproduce human intelligence using computational means. Computational Intelligence has also been referred to as the science that attempts to make computers perform tasks at which humans, for now at least, are better! Computational Intelligence has many branches, one of the earliest and most important of which is Expert Systems. The other branches of Computational Intelligence are shown in Figure 2.1 and are introduced in subsequent chapters. Expert Systems use a variety of methods to represent knowledge and derive decisions, and are able to manage knowledge from different sources of human thought and activity. The manner in which this knowledge is represented in the computational environment depends on the nature of the knowledge and the field of expertise. In an industrial environment knowledge is typically represented in the form of linguistic rules that describe the actions that must be taken in response to specified excitations.

Figure 2.1 Branches of Computational Intelligence: Expert Systems, Fuzzy Systems, Neural Systems, Neuro-fuzzy Systems and Evolutionary Computing

There are many techniques for representing knowledge and each one has its advantages and disadvantages. The principal theoretical research issue is how to give Expert Systems the ability to search through the domain knowledge systematically and arrive at decisions rapidly. The following are techniques commonly used for representing knowledge:

predicate logic, semantic networks, procedural representation, production systems, and frames.


2.1 Elements of an Expert System


Expert Systems are the outcome of a major effort in computer science to emulate the cognitive faculties of humans. Artificial Intelligence is the basis for this field of endeavor, which includes such areas as pattern recognition, artificial speech and artificial vision, among others. Conventional computer software can be viewed as the synergy:

Software = Data + Algorithm

Here, the algorithm processes the data in a top-down, sequential manner until the result is arrived at. In contrast, the computer software used in Expert Systems can be described as the synergy:

System = Knowledge + Inference

In this case the system structure differs radically: the principal elements are the knowledge base, which is a repository of all the available domain-specific knowledge, and the inference engine, the software whose function is to infer decisions. An Expert System can be characterized as an intelligent knowledge-based system provided it represents knowledge in the form of rules. The most significant characteristic of this class of systems is that it draws on human knowledge and emulates human experts in the manner in which they arrive at decisions. One definition of an Expert System is thus: an Expert System is the embodiment of knowledge elicited from human experts, suitably encoded so that the computational system can offer intelligent advice and derive intelligent conclusions about the operation of a system. Production rules are a convenient form by which to represent the knowledge of domain experts. Before describing this method, we note some alternative methods for representing knowledge that have been found useful in industrial applications. In general, knowledge that is useful in solving real industrial problems has two components:


facts, which constitute ephemeral information subject to change with time (e.g., plant variables), and procedural knowledge, which refers to the manner in which experts in the specific field of application arrive at their decisions. Procedural knowledge (e.g., information flows, control sequences and actions, etc.), together with the step-by-step procedure that must be followed in the specific manufacturing plant, is known by production engineers and is the result of years of experience of working with the plant or process. This is one of the principal reasons why Expert Systems have attracted so much attention in the industrial world. The use of rules is the simplest way to describe a manufacturing procedure, and linguistic rules of the classical if ... then ... else form are the most commonly used by humans.
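The division into facts and rules can be made concrete with a toy production system. The sketch below is a minimal forward-chaining rule interpreter; the plant facts and rules are invented for illustration and do not come from this book.

```python
# Minimal forward-chaining production system: facts + rules -> new facts.
# The boiler-plant facts and rules below are hypothetical.

def forward_chain(facts, rules):
    """Repeatedly fire rules whose premises all hold until no new facts appear."""
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if conclusion not in facts and all(p in facts for p in premises):
                facts.add(conclusion)   # fire the rule
                changed = True
    return facts

# Hypothetical rule base: if pressure and temperature are high,
# infer a risk condition; if a risk condition holds, recommend an action.
rules = [
    ({"pressure high", "temperature high"}, "risk of overpressure"),
    ({"risk of overpressure"}, "open relief valve"),
]

derived = forward_chain({"pressure high", "temperature high"}, rules)
print("open relief valve" in derived)  # True
```

Real Expert System shells add conflict resolution, explanation facilities and uncertainty handling on top of this basic match-fire cycle.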

Figure 2.2 Basic elements of an Expert System: domain experts, knowledge acquisition system, knowledge base, inference engine, explanation sub-system and man-machine interface


The basic elements of an Expert System are shown in Figure 2.2. An Expert System includes the following elements:

the knowledge base, which comprises the facts and rules with which to control a plant,

the inference engine, which processes the data in the knowledge base in order to arrive at logical conclusions,

the explanation sub-system, which is capable of giving a rational explanation of how a decision was arrived at,

the knowledge acquisition system, which is used by knowledge engineers to help them analyze and test the knowledge elicited from human domain experts, and

the man-machine or user interface system, through which the human operator interacts with the system.

2.2 The Need for Expert Systems


The developments in the field of Expert Systems rapidly found proponents in industry despite the inherent reluctance to adopt new technology. The application of Expert Systems in industry and manufacturing was left to innovative manufacturers who were sufficiently broad-minded to take the risk in the expectation that the outcome would improve their competitive position and their market share. Despite some early failures of Expert Systems, which were touted for what they were supposed to do but didn't, the positive results that were reported motivated more manufacturers to invest in knowledge-based technology. This in turn led to further research and development in the field of Expert Systems in universities and research establishments. The reasons that have motivated industry to adopt knowledge-based techniques are the following: the lack of an explicit quantitative description of the physical plant, the existence of the knowledge and experience to control the plant, and the ability of a class of knowledge-based systems to deal with the vagueness and uncertainty that are characteristic of many industrial plants.

A common feature of industrial and manufacturing systems is that the quantitative models that are supposed to predict their dynamic behavior are either unknown or do not possess sufficient fidelity. This is particularly true in the case of large-scale industrial plants, whose quantitative description is a difficult, tedious and occasionally impossible task for lack of sufficient deep knowledge. Deep knowledge is the result of microscopic knowledge of the physical laws that govern the behavior of a plant. In contrast, shallow knowledge is the result of holistic or macroscopic knowledge and is readily available from human domain experts. This knowledge is acquired after years of experience in operating the plant and observing its peculiarities and nuances.

2.3 Stages in the Development of an Expert System


In developing a knowledge-based system using an Expert System, it is essential to concentrate first on the objectives of the Expert System and not on how these objectives can be met. Great effort must therefore be made to specify these objectives and constrain the domain of the Expert System. Inadequate specification of the domain over which the Expert System is expected to function, together with unwarranted expectations, were the basic reasons why many early Expert Systems failed to meet user requirements. Once the domain of the Expert System has been specified, we are in a position to select the tools and methods with which to design the Expert System. During this phase of development, the knowledge engineer elicits from domain experts the rules by which the plant is to be controlled. Following interviews that invariably include questionnaires on what variables are observed and what controlling actions the domain experts would take in every conceivable situation, the knowledge so acquired is stored in suitably coded form in the knowledge base of an Expert System shell. An Expert System shell is nothing more than a collection of software elements that perform all the tasks of an Expert System. While in the past Expert Systems were developed in programming languages such as LISP, it is inconceivable today to develop such a system without a shell. It should be noted that knowledge elicitation is one of the most painstaking tasks in the design procedure. Human domain experts are often reluctant to part with their knowledge, fearful that divulging knowledge gained after years of experience may lead to their redundancy and termination.

In the first stage of development of any Expert System, it is very useful to implement a rapid prototype. The objective here is not the development of a complete Expert System, but of a prototype that will form the basis of the final system under development. Once the knowledge engineer has a thorough understanding of the rules elicited from the domain experts and the manner in which decisions are arrived at and justified, he must then encode the knowledge in a form suitable for processing by the Expert System. The rapid prototype need not possess all the features of the end product, but should incorporate the basic features that can be evaluated by both the domain experts and the end users. Should the prototype system demonstrate deficiencies and difficulties in inferring decisions, it is clearly preferable to make corrections and improvements at this stage rather than in the end product, when doing so may be very difficult and costly. In the implementation stage of the Expert System, the knowledge elicited from the domain experts is transferred to the Expert System, which runs on a suitable platform. Early Expert Systems invariably ran on powerful workstations or special-purpose computers (such as the short-lived LISP machines), which were subsequently superseded by common microcomputers. Today, most Expert Systems can run on high-end PCs or workstations. Once completed, the Expert System is tested off-line until the end users are convinced of its ability to infer correct results and support its decisions.
It is noted that it is often difficult and uneconomic to test the Expert System exhaustively, i.e., for all possible conditions, in practice. For these reasons, end users must develop a close liaison with the Expert System designer, assisting him whenever some discrepancy is observed between their decision and that of the Expert System. Such discrepancies arise from rule conflict, misunderstandings or errors in the knowledge base.


2.4 The Representation of Knowledge


Simplicity in representing knowledge in an Expert System is essential, and a variety of techniques have been proposed to this end. One of the most common representations of domain knowledge is the decision tree, each branch of which represents some action. Every branch of the tree emanates from a node at which a condition is examined. Depending on the outcome of this condition, a specific branch of the tree is traversed until the next node. The tree may have many branches and nodes. Some early Expert Systems proposed for diagnosis in medicine had thousands of branches and nodes, making them cumbersome, slow and difficult to use. Implementation of an Expert System can be either direct, using a programming language such as LISP, Prolog, C++, Visual Basic, Visual C++, Visual J++, Visual Fortran, etc., or, more conveniently, using an Expert System shell such as G2, NEXPERT, etc. As noted earlier, the rule base of an Expert System contains linguistic rules of the classical if ... then ... else form with which plant operators are trained and which they subsequently use to justify their decisions. These rules may appear as strings in their original form or encoded into numerical form. Quantitative descriptions of a plant are not always straightforward, particularly when only incomplete and vague data on the plant are available. To make descriptions possible in such cases, special techniques such as Fuzzy Logic (which is introduced in Chapter 5) or probabilistic methods are used.
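A decision tree of the kind described above can be sketched as nested nodes, each testing one condition. The conditions, thresholds and actions below are hypothetical, chosen only to show the traversal from root to leaf.

```python
# A decision tree as nested dicts: each node tests a condition on the
# plant state and each branch leads to another node or a leaf action.
# The thresholds and actions are purely illustrative.

tree = {
    "test": lambda plant: plant["temp"] > 90,
    "yes": {
        "test": lambda plant: plant["pressure"] > 5,
        "yes": "shut down",
        "no": "reduce heating",
    },
    "no": "continue",
}

def decide(node, plant):
    """Traverse the tree from the root until a leaf action is reached."""
    while isinstance(node, dict):
        node = node["yes"] if node["test"](plant) else node["no"]
    return node

print(decide(tree, {"temp": 95, "pressure": 6}))  # shut down
```

Large medical diagnosis trees of the kind mentioned above are structurally the same, just with thousands of such nodes.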

2.5 Expert System Paradigms


Expert Systems have been diffusing rapidly into industry since their introduction in the mid-1970s. Today, Expert Systems can be found in a variety of industrial applications, the most successful examples of which are described briefly below. The typical stages of a manufacturing plant are shown in Figure 2.3, which indicates where Expert Systems can benefit production.


2.5.1 Expert systems for product design


Modern Flexible Manufacturing Systems (FMS) produce specialized products of high quality, limited production runs and short life cycles, i.e., lean production. These products undergo changes often and their design must be completed in very short times, imposing considerable stress on product designers. Expert computer-aided-design systems are now available to assist the designer, permitting him to exploit his creative abilities to the utmost while advising him on design and materials constraints following extensive background computations.

Figure 2.3 The manufacturing environment: from product specifications through design, programming, interpretation, control, diagnosis, supervision and prediction of the production process


While conventional Computer Aided Design (CAD) software can process geometric shapes rapidly, the designer also needs to know quickly certain characteristics of the product being designed, such as strength, thermal distribution, cost, etc. Expert CAD systems provide all this information while, in addition, advising the designer of alternative shapes from a priori experience with similar designs. The trend in product design today does not yet permit total design with expert CAD systems, since design normally depends on the designer's intuition and aesthetic knowledge, the prehistory of the product and economic factors that are difficult to incorporate in a knowledge base. The final product is a set of diagrams or plans, design specifications and various documents on which manufacturing will then proceed, as shown in Figure 2.3.

2.5.2 Expert systems for plant simulation and operator training


The training of operators to control modern industrial plants is important, time-consuming and very expensive when performed on the actual physical plant. Apart from the dangers involved should some wrong control action be taken during on-line training, the unevenness of production and the uncertain quality of the product produced during operator training make this procedure undesirable and very costly. Plant simulators, which simulate the plant and can be programmed to take into account faults and malfunctions in the plant (quite similar, in fact, to flight simulators), are today being used extensively to train new plant operators. Usually an instructor, unseen by the trainee operator, enters malfunctions and observes the trainee's reactions and performance. The role of the instructor can be taken by Expert Systems, which can tirelessly repeat plant malfunctions and, like their human counterparts, examine and instruct the trainee operators. The knowledge with which to operate the plant is embedded in a set of if ... then ... else rules. Multimedia and Virtual Reality can be used in the man-machine interface when training plant operators, even before the plant has been commissioned. The same system can also be used to refresh experienced operators' knowledge, much as pilots must undergo periodic training and certification using flight simulators.


2.5.3 Expert supervisory control systems


Reference was made in Section 2.3 to the use of Expert Systems for the supervision and control of Computer Integrated Manufacturing (CIM) systems. The primary objective of any Supervisory Control And Data Acquisition (SCADA) system, which constitutes the kernel of any CIM system, is data acquisition, the overall supervision of the health of the plant, prompt alarming of out-of-range variables and control of the principal variables of the plant under control. Supervisory control systems have revolutionized production plants, increasing productivity while significantly reducing production costs. The next stage in their evolution was the introduction of Computational Intelligence techniques, which broadened their abilities significantly. The new generation of supervisory control systems exploits the knowledge and experience of domain experts in automatically correcting for plant malfunctions and discrepancies. New and advanced intelligent control techniques that were inconceivable until recently are now commonly incorporated into most commercially available SCADA systems, further improving product quality and productivity while simultaneously reducing production costs. Expert Systems are being used in industrial control, which is an integral part of any SCADA system, in the following fields: the design of industrial controllers, and the supervision and control of manufacturing plants.

One of the major difficulties in the design of plant controllers using conventional control techniques, particularly in the case of large-scale multivariable plants, is the unavailability of explicit models of the plants. For this reason industrial automation leans towards the use of three-term (PID) controllers, and various empirical and semi-empirical design techniques have been proposed to determine the parameters of these controllers. Examples of these design techniques are the well-known methods of Ziegler and Nichols and modern variants due to Persson and Åström. In contrast, expert controller techniques, which can exploit the knowledge of expert controller tuners, can often offer superior results. A number of vendors currently offer such software products. The use of Expert Systems in the design of industrial controllers has two aspects. The first involves rules on the most appropriate design technique to use in order to achieve the desired result. These rules are dependent on the specific plant to be controlled and the criteria by which the control quality, i.e., the performance of the closed-loop plant, is judged. The second aspect involves rules that specify the best control strategy to follow in any situation, given as advice to the operator.
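As a concrete instance of the empirical tuning rules mentioned above, the classic closed-loop Ziegler-Nichols formulas compute PID settings from the ultimate gain Ku and the oscillation period Tu observed at that gain. The numerical values in the example are arbitrary.

```python
def ziegler_nichols_pid(Ku, Tu):
    """Classic closed-loop Ziegler-Nichols PID settings, computed from the
    ultimate gain Ku and the period Tu of the sustained oscillation."""
    Kp = 0.6 * Ku        # proportional gain
    Ti = 0.5 * Tu        # integral (reset) time
    Td = 0.125 * Tu      # derivative time
    return Kp, Ti, Td

# Example with arbitrary ultimate gain and period.
Kp, Ti, Td = ziegler_nichols_pid(Ku=2.0, Tu=4.0)
print(Kp, Ti, Td)  # 1.2 2.0 0.5
```

An expert tuning system encodes rules such as these together with knowledge about when each tuning method is appropriate for the plant at hand.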

2.5.4 Expert systems for the design of industrial controllers


Human operators are trained to use linguistic rules that involve the principal measured plant variables in order to maintain the plant at the desired state following some exogenous disturbance. The operator's speed of reaction is critical in achieving high control quality and satisfactory product quality. Inaction, delays and inconsistencies in the actions of the operator, due to fatigue or pressure, invariably lead to uneconomic operation and, in the worst case, to disastrous results, a prime example being the Chernobyl nuclear power plant. The need to assist the operator in his routine tasks and advise him on the best strategy to follow in extreme cases, or in rare situations that he may not have met before, was the motivation for the development of a new class of expert supervisory control systems. Coupled with the rapid developments in computer technology, this new generation of control systems is a reality that is finally diffusing into manufacturing. Expert supervisory control systems do not require deep knowledge of the plant to be controlled, but are based on shallow knowledge of the form normally used by human operators. The fundamental requirement is the existence of a conventional supervisory control system to which the Expert System is appended.

2.5.5 Expert systems for fault prediction and diagnosis


A very significant field of application of Expert Systems has been equipment fault prediction and diagnosis, sometimes termed equipment health monitoring. Many such systems have augmented existing data acquisition systems and have proved invaluable for the prediction of faults in equipment. Examples of faults that are important to predict are increased wear of the bearings of rotating machinery due to vibrations, or excess friction due to overheating. This class of on-line, real-time Expert Systems is giving new meaning to the field of predictive maintenance. Productivity benefits through improved estimates of the time-to-go before catastrophic failure of the equipment is likely to occur. This is particularly important in the case of large equipment for which expensive spare parts have to be kept in stock for use in case of a breakdown. Expert Systems can minimize and even eliminate such stocks through timely procurement. The use of Expert Systems for fault prediction leads to a drastic reduction in the mean time to repair equipment, a corresponding increase in the availability of the equipment and, most importantly, an increase in plant productivity.

In predictive maintenance, historical data are gathered from suitable sensors attached to the equipment (e.g., temperatures, pressures, vibrations, etc.). Real-time measurements of critical variables are compared with expected or desired values, and any discrepancy is used to diagnose its possible cause from rules embedded in the Expert System. Following spectral analysis of such measurements, for example of bearing sounds, by standard signal processing techniques, the Expert System suggests what maintenance will be required and when best to perform it. Expert Systems for fault diagnosis can be either off-line or on-line. In the former case, maintenance personnel enter into a dialog with the Expert System, supplying answers to questions posed by the Expert System on the health of the equipment. The Expert System then gives instructions on what further measurements should be taken and what actions should be followed to focus on the source of the problem, and finally gives advice on how to repair it. It is obvious that rapid fault diagnosis is of paramount importance in a manufacturing environment, where every minute of lost production results in a loss of profit.
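A minimal sketch of the condition-monitoring idea: compare the RMS vibration level of a bearing signal against a healthy baseline and flag when it exceeds a threshold ratio. The signals, the 2x alarm ratio and the diagnosis text are all hypothetical; a real system would work on spectral features rather than a single RMS figure.

```python
import math

def rms(signal):
    """Root-mean-square level of a sampled signal."""
    return math.sqrt(sum(x * x for x in signal) / len(signal))

def diagnose(signal, baseline_rms, ratio_alarm=2.0):
    """Flag excess vibration relative to a healthy baseline (hypothetical rule)."""
    if rms(signal) > ratio_alarm * baseline_rms:
        return "schedule bearing inspection"
    return "normal"

# Synthetic signals: a 'worn' bearing vibrates five times harder.
healthy = [0.1 * math.sin(0.3 * n) for n in range(200)]
worn    = [0.5 * math.sin(0.3 * n) for n in range(200)]
base = rms(healthy)
print(diagnose(worn, base))  # schedule bearing inspection
```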
It should be evident why expert fault prediction and diagnosis systems have been the subject of considerable commercial interest and have found such extensive application.


2.5.6 Expert systems for the prediction of emergency plant conditions


Effective control of large, complex industrial systems, such as nuclear reactors, power distribution networks and aircraft, is critically important, since breakdowns can lead to unforeseen and potentially disastrous results. In recent history, the Chernobyl disaster stands out as a leading example of human error. Likewise, power blackouts over large areas of the power distribution network are often the result of human error. But the most visible example is pilot error, when hundreds of lives are lost because of wrong pilot decisions made in situations of immense pressure. A human operator has great difficulty in making decisions when facing conflicting or excessive information, particularly when under stress. Real-time Expert Systems, using data from the plant and rules derived from logical reasoning and prior experience, advise the plant operator on the best course of action to take in order to avert a catastrophe and return the plant to its nominal operating state as quickly as possible, with minimal disruption of production and damage to the equipment.

2.5.7 Expert systems for energy management


With the ever-increasing cost of energy, the management of energy in large industrial plants is of major concern, and means to contain energy costs are actively sought. Energy-intensive industries, such as the metallurgical, cement and petrochemical industries, have a very real need to contain their energy demand, and most nowadays use some form of energy management system. In large manufacturing plants the electric energy pricing policy is dependent on the power absorbed over, for instance, each 15-minute period. The power provider and the manufacturer agree on a pricing policy for every 15-minute period during the day, the cost of energy being significantly lower in off-peak periods and prohibitively high in peak periods. In turn, the consumer agrees to restrict his energy intake to these limits. A significant penalty must be paid if the contractual limits are exceeded in any period. Such additional costs can make production non-competitive.


It is therefore necessary to predict accurately the power that will be absorbed over each period and to manage the energy demand by shedding loads in time to avoid exceeding the contractual energy limit. The decision on which loads to shed, and when to do so without disrupting production, is a very difficult and tiring task for a human who would have to make this decision every 15 minutes throughout the day and night. The operator has to know which equipment can be shut down, which must at all costs be left running in order to avoid major disruption of the production line or manufacturing plant, and how long it must be before each piece of equipment can be restarted without causing it excess wear. In a large plant this is normally performed by shedding auxiliary equipment that is not considered absolutely essential to the manufacturing plant (e.g., circulation pumps, conveyor belts) and, in the worst case, by a total stoppage of production in periods of high energy cost. Many electric-energy-intensive plants today are forced to shut down production during peak hours in order to conserve energy. Real-time expert energy management systems have been developed and have been very successful in containing energy costs, replacing the human operator in this arduous task. Indeed, avoiding just one or two overload penalties often pays for the cost of the Expert System! The rules specifying which equipment can be shed, the order in which it may be shed, and when and how many times per day it can be restarted are elicited from human operators and embedded in the Expert System rule base. The real-time expert energy management system is then executed every few seconds, following prediction of the energy that will have been absorbed by the end of the timing period.
Naturally, the magnitude of the load that must be shed depends critically on the time-to-go before the end of the period: the shorter the time left, the larger the load that must be shed and the greater the disruption that is incurred. Accurate prediction and effective, fast decisions from the Expert System are essential to proper operation.
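The prediction-and-shedding logic can be sketched as follows. This is a minimal illustration, not the system described in the text; the load names, contractual limit and priorities are invented for the example.

```python
# Sketch of demand prediction and priority-ordered load shedding.
# All names, limits and priorities below are illustrative assumptions.

def predict_period_energy(energy_so_far_kwh, elapsed_min, period_min=15):
    """Linearly extrapolate the energy that will have been absorbed by period end."""
    return energy_so_far_kwh * period_min / elapsed_min

def loads_to_shed(predicted_kwh, limit_kwh, time_to_go_min, sheddable):
    """sheddable: list of (name, kw, priority); lower priority numbers shed first.
    Shedding a load of P kW for the remaining t minutes saves P * t / 60 kWh,
    so the shorter the time-to-go, the larger the load that must be shed."""
    excess_kwh = predicted_kwh - limit_kwh
    shed = []
    for name, kw, _prio in sorted(sheddable, key=lambda load: load[2]):
        if excess_kwh <= 0:
            break                        # predicted demand now within the limit
        shed.append(name)
        excess_kwh -= kw * time_to_go_min / 60.0
    return shed
```

For instance, 60 kWh absorbed after 10 minutes extrapolates to 90 kWh for the period; with an 80 kWh contractual limit and 5 minutes to go, shedding a 200 kW circulation pump already covers the predicted excess.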

2.5.8 Expert systems for production scheduling


Production scheduling in manufacturing plants with multiple parallel production lines is essential in order to maintain high product throughput despite changes in production priorities, equipment malfunctions and variations in the raw materials. In deciding which production line can be


used to manufacture a specific product, the production manager must know the production capacity and limitations of each production line, the overall production schedule, equipment and storage capabilities, etc. When a production line is disrupted for whatever reason, it is often necessary to switch production lines and change the priorities with which the product is produced, permitting high-priority items to be completed first while lower-priority items are placed in a queue. The long-term production schedule is normally produced on a weekly or monthly basis, but changes to it may be necessary due to equipment failures. When these failures are serious enough to cause extended production disruption, it is necessary to re-compute the production schedule. Operational research techniques based on linear integer or mixed-integer programming are the conventional approach to this problem, but these techniques are time-consuming. An alternative way to reschedule production is through the use of the empirical rules that are followed by production management. Expert scheduling systems that capture this knowledge and experience are considerably simpler to use, lead to equally feasible schedules much faster, and have been applied with excellent results.
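A minimal sketch of such an empirical rescheduling rule follows, assuming jobs carry numeric priorities (lower number means higher priority) and each line advertises the products it can make. All line names, products and data are illustrative, not drawn from any real plant.

```python
# Sketch of an empirical rescheduling rule: when a line fails, its queued jobs
# are reassigned, highest priority first, to the compatible line with the
# shortest queue. All names and data are illustrative assumptions.

def reschedule(failed_line, queues, capabilities):
    """queues: {line: [(priority, product)]} with lower numbers = higher priority;
    capabilities: {line: set of products the line can make}.
    Returns queues with the failed line's jobs redistributed."""
    displaced = sorted(queues.pop(failed_line, []))   # highest priority first
    for prio, product in displaced:
        candidates = [l for l in queues if product in capabilities[l]]
        if not candidates:
            continue                  # no compatible line: left for the manager
        best = min(candidates, key=lambda l: len(queues[l]))
        queues[best].append((prio, product))
        queues[best].sort()           # keep each queue in priority order
    return queues
```

The point of the sketch is that the rule is a few lines of heuristics rather than a mixed-integer program, mirroring the trade-off described above.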

2.5.9 Expert systems for the diagnosis of malfunctions


This application involves Expert Systems for the diagnosis of malfunctions in the sub-systems of a manufacturing system. The method requires decomposition of the manufacturing system into a set of interacting subsystems and leads to the development of a knowledge base from which the cause of the malfunction can be inferred. The method is particularly useful in Flexible Manufacturing Systems (FMS) - discrete production lines, typical examples of which are beverage bottling, cigarette packing and food canning lines. A malfunction in any sub-system can lead to a total shutdown of the production line and it is therefore critically important to diagnose the source of the malfunction as quickly as possible in order that the malfunctioning equipment be repaired rapidly and be put on-line once more. Thus sensors (photocells, inductive detectors, etc.) are placed at critical points along the production line, from which flow rates can be estimated continuously. It is obvious that rapid reinstatement of the production line


is of paramount importance in order to maintain high equipment availability and meet production schedules. In large bottling or canning plants, for instance, the sensors are linked to the Factory Data Acquisition (FDA) system and measurements are continuously compared with the desired values. Should some unit along the line malfunction, then clearly both the preceding and succeeding units will suffer the consequences. Due to the interactive nature of most production systems and work cells, it is obvious that when any sub-system malfunctions, the sub-systems up-stream and down-stream will be affected sooner or later. Up-stream units must thus be stopped in time to avoid strangulation as a consequence of the accumulation of partially finished products, which may exceed the capacity of the silos or queues if the malfunction persists for some time, while down-stream units must be stopped because of starvation. Expert systems for the diagnosis of equipment malfunctions contain, embedded in their knowledge base, the rules by which a malfunction is transmitted to adjacent units. The Expert System continuously monitors the material flows and, as long as the mass balance for each unit is essentially satisfied, no alarm is issued. However, when some malfunction occurs, the Expert System is executed with the object of determining the source of the fault. Timing is clearly of the essence.
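The mass-balance monitoring described above can be sketched for a serial line as follows; the 5% tolerance and the unit names are illustrative assumptions, and a real system would also apply the propagation rules to distinguish the faulty unit from its blocked or starved neighbours.

```python
# Sketch of mass-balance fault localization on a serial production line.
# Sensors report the flow rate after each unit; the suspect unit is the first
# one whose output flow deviates from its input flow by more than a tolerance.
# Unit names and the 5% tolerance are illustrative assumptions.

def locate_fault(units, flow_after, flow_into_line, tol=0.05):
    """units: ordered list of unit names, up-stream first;
    flow_after: {unit: measured flow rate after that unit}.
    Returns the first unit whose mass balance is violated, or None."""
    upstream = flow_into_line
    for unit in units:
        downstream = flow_after[unit]
        if abs(upstream - downstream) > tol * max(upstream, 1e-9):
            return unit          # accumulation or loss localized at this unit
        upstream = downstream    # this unit's output feeds the next one
    return None                  # mass balance holds everywhere: no alarm
```

When every unit's balance holds the function returns None and no alarm is issued, matching the behaviour described in the text.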

Chapter 3

Intelligent Control
Intelligent control takes a radically different approach to the control of industrial processes and plants from conventional control. The knowledge and experience of human operators constitute the basis for this new approach to Control Engineering, for which Computational Intelligence provides the theoretical foundation. In this chapter we summarize the potential and some limitations of intelligent control and we attempt to address the questions of how, where, when and under what conditions intelligent control can be applied in practice. Intelligent control seeks solutions to the problem of controlling plants from the viewpoint of the human operator. In other words, the technique seeks to establish some kind of cognitive model of the human operator and not of the plant under his control. This is the point at which intelligent control departs from conventional control, and it is undoubtedly true that the technique could not have been possible but for the rapid progress in computer technology. Computational Intelligence provides the tools with which to make intelligent control a reality. The reproduction of human intelligence and the mechanisms for inferring decisions on the appropriate control actions, strategy or policy that must be followed are embedded in these tools. Figure 3.1 shows how Computational Intelligence can be classified according to the form of the knowledge (i.e., structured or unstructured) and the manner in which this knowledge is processed (i.e., symbolic or numerical). For control applications, knowledge can be structured or not, but processing is invariably numerical. Fuzzy and neural control form the core of intelligent control and are the principal components of computational intelligence.

                          PROCESSING
                    Symbolic          Numerical
KNOWLEDGE
  Structured        Expert Systems    Fuzzy Systems
  Unstructured      -                 Neural Systems

Figure 3.1 Classification of Computational Intelligence

In contrast to conventional control, intelligent control is based on advanced computational techniques for reproducing human knowledge and experience. Thus in intelligent control the focus of interest moves away from the tedious task of establishing an explicit, microscopic model of the controlled plant and the subsequent design of a corresponding hard controller, to the emulation of the cognitive mechanisms used by humans to infer and support control decisions. Intelligent control has been applied with considerable success in the process industry. Examples can be found in the petrochemical, cement, paper, fertilizer and metals industries. With time, it is predicted that intelligent control will diffuse into most branches of industry and manufacturing and be adopted by progressive organizations that seek to improve their strategic position in the global market through improved productivity and product quality.


3.1 Conditions for the Use of Intelligent Control


Instead of relying on explicit mathematical models of the plant, intelligent controllers use empirical models that describe how, and not why, the controlled plant behaves in a particular manner. The fundamental problem in developing an intelligent controller is the elicitation and representation of the knowledge and experience of human operators in a manner that is amenable to computational processing. Intelligent systems invariably use a collection of heuristic and non-heuristic facts of common logic, as well as other forms of knowledge, in combination with inference mechanisms in order to arrive at and support their decisions. A basic characteristic of this class of systems is that they are able to infer decisions from incomplete, inaccurate and uncertain information, typical of many industrial and manufacturing environments. As noted in chapter 2, intelligent systems can be applied off-line, e.g., for controller design, fault diagnosis and production management, or on-line, e.g., for fault prediction and supervisory control. In conventional control, the knowledge about the controlled plant (i.e., its model) is used implicitly in designing the controller, whereas in intelligent control the knowledge about the plant is distinct from the mechanism that infers the desired control actions. This is shown schematically in Figure 3.2. It follows that an intelligent controller can easily be configured for any plant by simply modifying its knowledge base. The most common means of reproducing knowledge are linguistic rules, sometimes termed control protocols, which relate the state of the plant to the corresponding desired control actions. These rules may represent shallow empirical knowledge, deep knowledge or model-based knowledge about the plant.


Figure 3.2 Basic elements of an intelligent controller: the real-time data base, the inference engine and the knowledge base
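The separation shown in Figure 3.2 can be sketched in a few lines: the knowledge base is plain data, the inference engine is a generic loop, and reconfiguring the controller for another plant means replacing the data, not the engine. The rules below are invented for a hypothetical temperature loop and are not taken from the text.

```python
# Sketch of knowledge kept distinct from inference. The knowledge base is a
# list of (condition, action) pairs -- linguistic rules in executable form --
# while the inference engine is a generic loop that knows nothing about the
# plant. Rule contents are illustrative assumptions for a temperature loop.

KNOWLEDGE_BASE = [
    (lambda m: m["error"] > 5,         "increase fuel feed"),
    (lambda m: -5 <= m["error"] <= 5,  "hold fuel feed"),
    (lambda m: m["error"] < -5,        "decrease fuel feed"),
]

def inference_engine(measurements, rules=KNOWLEDGE_BASE):
    """Fires the first rule whose condition matches the real-time data.
    Configuring the controller for another plant means only swapping `rules`."""
    for condition, action in rules:
        if condition(measurements):
            return action
    return "no applicable rule"
```

The design choice mirrors the text: the plant-specific knowledge lives entirely in `KNOWLEDGE_BASE`, so the same engine serves any plant.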

3.2 Objectives of Intelligent Control


The principal objectives of intelligent control are the maximization of the strategic success (i.e., profit) of the business and of productivity. In order to succeed in these objectives, it is presumed that an intelligent controller is technically correct. The success of any application rests on both technical and social factors that must co-exist and cooperate in order to bring about the desired results. The distinct characteristics of each must therefore be respected, otherwise conflicts will arise which are bound to minimize its impact and effectiveness. Thus continuing cooperation among plant management, production management and plant operators is essential to the acceptance of any such system. Doubts and differences of opinion, for whatever reason, by any one of those involved may easily doom any system unless it has been studied in depth. The optimum solutions using social criteria are by no means easy to determine or measure. For an intelligent system to be technically correct, it is necessary that its technical specifications meet the demands of the particular application. There are a number of vendors that offer intelligent controllers for the process industry. All claim to meet the demands of industry, and the choice of which one is the most suitable is not always


easy. Cost is one factor, but prior experience with a similar plant is considered the most important factor in deciding which system to purchase. The socially optimum solution is determined by the support that plant management, production management and plant operators are prepared to give a particular system to make it successful. In considering intelligent control seriously for solving production control problems which conventional control is unable to solve, answers must be sought to the following questions. Will the proposed system:

- repay its cost in finite time?
- decrease the cost of production?
- increase productivity?
- lead to savings in energy?
- improve equipment availability?
- be simple to use, or require specialized knowledge that is not available in-house?
- decrease the workload of plant operators?

It is implicitly assumed that the knowledge needed to control the plant is available from the plant operators who have spent years operating it. It should be obvious that intelligent control is not a candidate when this knowledge is absent or incomplete, as in the case of an entirely new plant for which there is no prior experience. In practice this situation is unlikely to occur, since most new plants are based on earlier designs for which some prior knowledge exists. It is difficult to state all the technical properties of a successful intelligent system, since they vary with the application. The success of an intelligent system is very much dependent on the support of the users of the system. Intelligent controllers are underutilized, and even ignored, when user support is undermined. Assuming that plant management has been convinced that intelligent control could lead to an improvement in the strategic position of the business, improve productivity and lead to a reduction in production costs, it is still necessary to convince plant operators to use the system. This is not always an easy task as, by tradition, operators are fearful of any new technology that may undermine their post, their future and their usefulness to the business. These are natural feelings that have to be taken into account when any new system is introduced. This inbred fear can be


greatly reduced by including the plant operators in the system development process and providing adequate training to alleviate their fears. The older generation of plant operators spent years controlling plants from central control areas with classical instrumentation, adjusting the setpoints of conventional three-term controllers and tediously logging plant activity by hand. Today, the new generation of plant operators has been brought up in the era of computers, consoles with graphical user interfaces and all the benefits of Computer Integrated Manufacturing systems. Even the older plant operators have adapted, sometimes reluctantly, to the new environment. New plant operators no longer view the introduction of advanced technology as a threat but, on the contrary, show great interest and an enviable ability to assimilate it and use it effectively to improve their working conditions. This is especially true where management has had the foresight to provide the necessary training in advance. The days of pulling down control switches and turning control knobs are gone, replaced by the touch of a light pen or a finger on a screen, or the click of a keyboard or a mouse. Report generation is a matter of seconds instead of hours, days or even weeks. Information is power, and this power can undoubtedly be enhanced through the use of intelligent techniques. From the viewpoint of management, the success of an intelligent control system is judged solely on how rapidly the system repays its investment. This is measured by the observed (and not assumed) increase in productivity, the reduction in energy consumption and the improvement in the mean-time-between-failures of the plant. Improvements of the order of 5-10% are not uncommon in the process industry. History has shown that manufacturers that have taken advantage of the new control techniques since their introduction have benefited significantly on all counts.
The specialization required to develop intelligent systems is Knowledge Engineering. It would be very wrong to conclude, however, that no knowledge of conventional control and system theory is necessary to design such systems. On the contrary, a very thorough knowledge of the abilities and limitations of classical and modern control techniques must constitute the background of the knowledge engineer. The most successful intelligent control systems that have been commissioned have been designed by control engineers with a very thorough background in conventional control techniques. Knowledge Engineering requires the cooperation of knowledge engineers, domain experts and plant operators in the design phase, commissioning and operation of an intelligent system. Characteristics such as the quality and depth of the knowledge, the effectiveness of the inference engine and the suitability of the man-machine interface are important to the efficiency and acceptance of an intelligent system. An intelligent system based on computational intelligence uses linguistic rules with which to describe the knowledge about controlling the plant. Before eliciting the rules from human operators, it is very important to stipulate the bounds of this knowledge, otherwise the system is likely to be unwieldy. It should be obvious, furthermore, that the intelligent system software can be written in any high-level language or be developed on some expert system shell that simplifies the design process significantly.

Chapter 4

Techniques of Intelligent Control


Conventional control systems design relies on the existence of an adequate macroscopic model of the physical plant to be controlled. The first stage in the analysis of such a system is therefore the development of an explicit mathematical model that reproduces the characteristics of the plant with fidelity. The model can be determined either from first physical principles or by some technique of identification applied to operating data mined from the plant. A variety of design techniques can then be used to design an appropriate hard controller. Today, this task is made simpler through the use of computer-aided control systems design software, with which the performance of the closed system can be rapidly evaluated and optimized. The ultimate objective is clearly the development of a hard controller that satisfies specific performance specifications. On completion of the design, the hard controller is implemented in hardware or software that will execute in real-time on a process computer. The design of a conventional controller, particularly in the case of multivariable plants, is a tedious and painstaking process that requires repeated cycles of analysis, synthesis and testing. The design hopefully converges to an acceptable solution and ultimately to commissioning.


In order to use conventional design techniques, it is essential that the model of the plant be simplified, yet be sufficiently comprehensive to reproduce the essential dynamic features of the physical plant. Modern manufacturing plants have to meet increasing demands for more flexible production and improved quality while striving to meet stringent environmental constraints. Though there were high expectations that modern control theory would meet these demands, it has by and large failed to do so to any significant degree in industry and manufacturing, which thus far have had to be content with conventional industrial three-term controllers. The design of simple, practical and robust controllers for industry is usually based on low-order holistic models of the physical plant. These approximants form the basis for the design of industrial controllers that satisfy relaxed performance criteria. Three-term controllers are the backbone of industrial control and these ubiquitous, simple and robust controllers have offered sterling service. However, these controllers can only perform at their best at the nominal operating point of the plant about which the approximant holds. When the operating point moves away from the nominal point, their performance is invariably degraded due to the inherent non-linearity of the physical plant. A number of techniques have been proposed to overcome this problem, the most common of which is gain-scheduling, a variant of which is considered in a later chapter. The objective here is to extend the domain over which satisfactory controller performance is maintained. Adaptive controllers are another class of controllers whose parameters can be varied to track changes in the operating point. Here, periodic identification is required in order to follow the changes in the plant dynamics.
The degree of autonomy of a controller is closely related to the range of operation of the controller and consequently to its robustness. The degree of autonomy of a gain-scheduled controller is higher than that of a fixed controller but lower than that of an adaptive controller whose range of operation is correspondingly greater.
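The gain-scheduling idea can be sketched as follows, assuming a simple PI (two-term) controller and a made-up three-region schedule; none of the numbers are tuned for any real plant.

```python
# Sketch of gain scheduling: a PI controller whose gains are switched
# according to the current operating point, extending the domain over which
# acceptable performance is maintained. The schedule values are illustrative
# assumptions, not tuned for a real plant.

SCHEDULE = [            # (upper bound of operating region, Kp, Ki)
    (0.3, 2.0, 0.10),   # low-load region
    (0.7, 1.2, 0.05),   # nominal region
    (1.0, 0.6, 0.02),   # high-load region: plant gain rises, so gains drop
]

class ScheduledPI:
    def __init__(self):
        self.integral = 0.0

    def gains(self, operating_point):
        """Select the gain pair for the region containing the operating point."""
        for upper, kp, ki in SCHEDULE:
            if operating_point <= upper:
                return kp, ki
        return SCHEDULE[-1][1], SCHEDULE[-1][2]

    def update(self, setpoint, measurement, operating_point, dt=1.0):
        """One PI step with the currently scheduled gains."""
        kp, ki = self.gains(operating_point)
        error = setpoint - measurement
        self.integral += error * dt
        return kp * error + ki * self.integral
```

A fixed controller corresponds to a one-entry schedule; adding regions is what buys the wider range of operation, and hence the greater degree of autonomy, discussed above.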


4.1 Unconventional Control


There are many situations in practice where, due to unforeseen changes in the controlled plant and its operational environment, a greater degree of autonomy is required. Conventional control techniques often fail to provide the autonomy which these situations demand, and this has led to an extensive search for new advanced control techniques which offer high autonomy and robustness despite the unfavorable operating conditions, uncertainty and vagueness that characterize the plant and its environment. This is typically the domain of industry and manufacturing. Coincidentally, it is also the domain of Intelligent Control. This recognition was not long in coming, and the beneficial results of applying Intelligent Control to industry soon became evident. Many industrial processes are so complex that any attempt at describing them analytically is often futile. Even if much effort is expended in determining some form of explicit model, it is usually so complex as to be of little use for the design of a suitable controller. To apply modern control design techniques it is necessary to simplify the model through linearization and then model reduction before proceeding to determine an appropriate linear controller. This design procedure, used often in design exercises, leaves much to be desired in practice, as the original control problem is no longer attacked. Instead, some idealized controller for an idealized plant is determined, and the probability that this controller is applicable to the real problem is small indeed. Invariably, resort must be made to on-line parameter tuning in order to obtain acceptable performance. Techniques for designing and analyzing controllers for nonlinear plants are virtually non-existent, and what techniques are available apply only to very restricted situations.
Thus the designer invariably falls back on simulation to design an acceptable controller, which must then be tested exhaustively in the field, an iterative and time-consuming procedure. The field of Industrial Automation, which relies largely on Programmable Logic Controllers (PLCs) and industrial three-term controllers, is now being confronted with new control techniques which find their origins in Soft Computing and Computational Intelligence. The need to maintain tight production and quality control in large-scale industrial and manufacturing plants producing products with high specifications, and the inability of conventional control techniques to satisfy these requirements, has led to the emergence of a new class of automatic
control techniques. What sets these new unconventional techniques apart is their ability to arrive at control decisions and control strategies in ways that are radically different from those of conventional control. It is not unreasonable, therefore, that this new class of unconventional control techniques has aroused considerable interest in industrial and manufacturing circles and has led to innovative controllers which have been applied to many difficult problems in industry. In this new class of controllers, the primary objective is minimization of the uncertainty and vagueness with which industrial processes are shrouded, leading to controllers with high autonomy and robustness. The reproduction of the cognitive and decision-making processes of a human operator of an industrial plant executing his control task has been the subject of intense research since the 1950s, reaching fruition in the 1970s with the implementation of the first experimental rule-based control system. The first practical unconventional industrial controllers were commissioned in the early 1980s in the cement industry, an industry with many difficult problems, particularly in the critical kilning process. The development of unconventional controllers since then has been very rapid and they are to be found today not only in most process industries but in all kinds of household appliances as well. The new field of unconventional control, which is based on the knowledge and experience of human operators, is better known as Intelligent Control and is supported by fuzzy, neural, neuro-fuzzy and evolutionary control techniques. This book is devoted exclusively to these techniques and their application to practical industrial problems. In line with the use of the term Soft Computing, this field of Control is sometimes known as Soft Control.
Modern control theory, which is based on a microscopic description of the controlled plant using differential or difference equations, can, in contrast, be described as Hard Control, since it uses inflexible algorithmic computing techniques. There are fundamental differences between conventional and intelligent control techniques, and these differences are highlighted in this chapter. It is important to note that intelligent control in no way supersedes conventional control, but rather augments it in an effort to resolve some of the difficult and unsolved control problems which industry and manufacturing face. A fundamental difference between conventional and Intelligent Control is the manner in which the plant and controller are viewed. This is seen schematically in Figure 4.1. In conventional control, the plant and
controller are viewed as distinct entities. The plant is assumed invariant and is designed to perform a specific task without the controller in mind. The controller is subsequently designed independently, after the plant has been completed. In contrast, in intelligent control the plant and the controller are viewed as a single entity to be designed and commissioned simultaneously. Thus plants which are unstable by nature may be stable when this unified approach is taken. A typical example is a helicopter, which is by nature an unstable plant but which, so long as its controller is operational, is perfectly stable.

Figure 4.1 (a) Conventional and (b) Intelligent control configurations: in (a) the plant and controller are distinct entities, in (b) they form a single entity

The generalized closed control system is shown in Figure 4.2. Here block P depicts the controlled plant, block C the controller and block S the desired closed-system performance specifications. In conventional control, blocks P and C are assumed linear (or linearized) and block S defines the cost function, or criterion of performance, following some exogenous disturbance, e.g., stability margin, rise time, settling time, overshoot, steady-state error, integral squared error, etc. The following characteristics apply to industrial processes: the physical process P is so complex that it is either not known explicitly or is very difficult to describe in analytical terms, and the specifications S ideally demand a high degree of system autonomy, so that the closed system can operate satisfactorily and without intervention despite faults in the system.
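The P-C-S formulation can be illustrated with a toy closed-loop simulation; the first-order plant model, the proportional controller and the error tolerance are all illustrative assumptions, chosen only to make the three blocks concrete.

```python
# Toy illustration of the generalized closed system: a first-order plant P,
# a proportional controller C, and a specification S checked on the simulated
# response. The plant model, gains and tolerance are illustrative assumptions.

def simulate(kc=2.0, a=0.5, setpoint=1.0, steps=200, dt=0.1):
    """Plant P: dy/dt = -a*y + u; controller C: u = kc*(setpoint - y).
    Returns the output y after `steps` Euler steps of length dt."""
    y = 0.0
    for _ in range(steps):
        u = kc * (setpoint - y)        # control action from C
        y += (-a * y + u) * dt         # plant update (forward Euler)
    return y

def meets_spec(y_final, setpoint=1.0, max_steady_error=0.25):
    """Specification S: steady-state error within a tolerance."""
    return abs(setpoint - y_final) <= max_steady_error
```

With these numbers the loop settles at y = kc/(a + kc) = 0.8, so the steady-state error of 0.2 satisfies the (deliberately relaxed) specification; a pure proportional controller cannot drive it to zero.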


Figure 4.2 Closed control system variables: the plant P with its input, output and disturbance, the controller C generating the control action, and the performance specifications S

The intelligent control problem can be stated in precisely the same way: given the plant P, find a controller C that will satisfy the specifications S, which may be either qualitative or quantitative. The basic difference here is that in intelligent control it is not necessary to have an explicit description of the plant P. Furthermore, it may not always be possible to separate the plant and controller. Since intelligent control is often rule-based, the control rules are embedded in the system and form an integral element of it. This structure presents new possibilities for improved integrated manufacturing systems. As illustrated in Figure 4.3, Intelligent Control is the fusion of Systems Theory, Computer Science and Operations Research, with Computational Intelligence as the bond between them. These techniques are now being called upon to solve problems of control engineering which were hitherto unsolvable. Intelligent control has been claimed to reproduce human-like properties such as adaptation and learning under unfavorable and uncertain conditions. There is considerable debate on this matter and many have refuted these claims. No one has refuted the fact, however, that intelligent control is capable of controlling, very successfully, large-scale industrial processes that have hitherto been controlled only manually. Industry has only to take advantage of this fact to benefit significantly.


Figure 4.3 Intelligent Control: the fusion of Systems Theory, Computer Science and Operations Research, bonded by Computational Intelligence

4.2 Autonomy and Intelligent Control


An intelligent system is designed to function in an uncertain and vague industrial or manufacturing environment with a view to increasing the success of the business. Success is defined as satisfaction of the system objectives, primarily the profit margin. Intelligent control systems are required to sense the environment and to infer and justify the appropriate control actions on the system so as to improve productivity and reduce energy consumption and labor costs. In advanced forms of autonomous systems, intelligence implies the faculties of perception and thought, the ability to make wise decisions and to act correctly under a large range of unforeseeable conditions in order to survive and thrive in a complex and often hostile environment. In order to relieve the human operator of his often boring and tiresome control task, a high degree of autonomy is required in intelligent supervisory control systems. In the case of humans, these faculties are credited to natural intelligence. In order to approach this level of intelligence
under conditions of uncertainty and vagueness by mechanistic means, it is necessary to develop advanced inference and decision support techniques. Autonomy of operation is the objective and intelligent control is the means to this objective. The theory of intelligent systems, which was developed by Saridis, fuses the powerful techniques for decision support of Soft Computing and advanced techniques of analysis and synthesis of conventional Systems Theory. The fusion of Computational Intelligence, Operations Research, Computer Science and Systems Theory offers a unified approach to the design of intelligent control systems. The result of this fusion is Soft Control, which today is one of the most interesting areas of research and development in Control Engineering.

Figure 4.4 Hierarchical structure of an Intelligent System: the Organization, Coordination and Execution layers, with intelligence increasing toward the top of the hierarchy and precision toward the bottom

Intelligence is distributed hierarchically, in accordance with Saridis' principle of increasing precision with decreasing intelligence, as depicted in Figure 4.4. Most practical hierarchical intelligent control systems have three layers:


- the Organization layer, in which high-level management decisions, e.g., production scheduling, are made,
- the Coordination layer, which coordinates the tasks that have been decided upon at the Organization layer. Like the uppermost layer of the hierarchy, this layer normally possesses intelligence, and
- the Execution layer, which has little or no intelligence and in which the commands of the higher layers are executed. This layer involves low-level controllers embedded in the plant Remote Terminal Units (RTUs). Recently, some vendors have added some degree of intelligence to this layer.

Figure 4.5 Distributed architecture of an intelligent system: the Organizer connected over a LAN to Coordinators, which in turn serve a number of Executors

The uppermost Organization layer of the hierarchy is activated rarely, as need requires, and uses qualitative reasoning to arrive at its decisions. It is activated by production management and has a low repetition frequency, perhaps once or twice daily. In this layer, management arrives at a long-term production policy that must be followed to achieve production goals. The production policy decided in the highest layer is relayed to the Coordination layer, which is responsible for carrying it out. In this intermediate layer, decisions on changes to the policy can be made should, for instance, a serious malfunction or breakdown occur in a production line, raw material shortages be ascertained, or customer priorities change. The Coordination layer is also responsible for product quality control, for maximization of the productivity of each unit in the factory and for coordination between the various manufacturing units. Even though the structure of an intelligent system is normally represented vertically, as in Figure 4.4, to indicate its hierarchical architecture, in practice such systems use a distributed architecture based on a client/server structure, as shown in Figure 4.5. Here the Organizer acts as the server, while the Coordinators and Executors are clients of the system.
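The flow of decisions down the three layers might be caricatured as follows; the policy contents, the cement product and the two-line split are invented purely for the example.

```python
# Caricature of the three-layer hierarchy: the Organizer issues a production
# policy, a Coordinator translates it into unit-level tasks, and Executors
# turn tasks into setpoint commands. All contents are illustrative assumptions.

def organizer():
    """Organization layer: infrequent, qualitative policy decisions."""
    return {"product": "cement", "daily_target_t": 1200}

def coordinator(policy):
    """Coordination layer: splits the policy into tasks for each unit
    (two parallel production lines are assumed here)."""
    per_line = policy["daily_target_t"] / 2
    return [{"line": i, "target_t": per_line} for i in (1, 2)]

def executor(task):
    """Execution layer: little intelligence, just setpoint commands."""
    return f"line {task['line']}: run at {task['target_t']:.0f} t/day"

commands = [executor(task) for task in coordinator(organizer())]
```

Note how intelligence decreases and precision increases down the chain: the Organizer reasons about goals, while each Executor merely emits a numeric setpoint.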

4.3 Knowledge-Based Systems


Intelligent controllers use a qualitative description of how a process operates instead of an explicit quantitative description of the physical principles that relate the causes to the effects of the process. An intelligent controller is therefore based on knowledge, stated linguistically in the form of production rules, which are elicited from human experts. Appropriate inference mechanisms must then be used to process this knowledge in order to arrive at suitable control decisions. One or more of the following techniques of Computational Intelligence may be used to this end:

- Expert Systems
- Fuzzy Logic
- Artificial Neural Networks
- Neuro-fuzzy Systems
- Evolutionary Computation

Techniques of Intelligent Control

49

4.3.1 Expert systems


The objective of an expert system is to permit a non-expert user to exploit the accumulated knowledge and experience of an expert in a specific field of expertise. Knowledge-based expert systems use rules, data, events, simple logic and any other form of knowledge in conjunction with search techniques to arrive at decisions. Where an element of knowledge is missing, the expert system cannot but return a "don't know" response, implying an inability to arrive at a decision. An expert system cannot extrapolate data and infer from similar or adjacent knowledge. Expert systems operate either on-line, when decisions are required in real-time, as in the cases of fault prediction, energy management and supervisory control, or off-line, as in the cases of interactive, dialog-based systems for fault diagnosis and production management. In dialog-based expert systems using decision trees, forward and backtracking mechanisms for searching the tree have proved successful in industrial applications. In the category of on-line expert systems, a number of vendors supply real-time expert supervisory control systems for industrial applications. Where a complete decision tree that can account for every possible situation that may arise in practice is available, such knowledge-based systems offer a simple and effective solution to the unconventional control problem. Expert systems can be developed using languages such as LISP, Prolog and C++ or any of the available expert system shells, such as G2, NEXPERT, etc. It should be obvious that situations involving uncertainty and vagueness cannot be treated effectively using conventional decision-tree based expert systems.
In contrast, when insufficient or incomplete knowledge about a process is available and when uncertainty and vagueness characterize the plant and its measurements, such knowledge-based expert systems are not always able to arrive at decisions and are consequently unacceptable for real-time control purposes. It would be extremely frustrating to receive a "don't know" or a conflicting decision in the case of an unforeseen emergency when immediate action is required! In situations of uncertainty and vagueness, more effective mechanisms, capable of inferring decisions from incomplete data, are necessary. Fuzzy logic and artificial neural networks and their hybrids are the primary examples of
techniques that possess appropriate mechanisms to deal with uncertainty and vagueness.

4.3.2 Fuzzy control


The assumptions that industrial processes can be modeled by sets of algebraic or differential equations and that the measurements from sensors are noise-free and exact rarely hold in practice. Likewise, the degree of uncertainty in the controlled process, which is characterized by the completeness and vagueness of the information that is mined from the process, plays a major part in determining the behavior of the closed-loop industrial system.

Many industrial processes can be controlled adequately by human operators whose knowledge of mathematical models, algorithms and the underlying physical principles on which the process operates is limited or non-existent. Human operators have an inherent ability to decide on acceptable or satisfactory actions to follow from qualitative information received from a variety of apparently disparate sources. Naturally, where control of a process or plant is in the hands of human operators, it is inappropriate to talk about optimum control decisions, but rather acceptable control decisions. It is unlikely that two operators will make identical decisions in any given situation, and the only way to evaluate the quality of their decisions is to evaluate the productivity of the process under their control. Experience has shown that there is always a best operator who is consistently capable of increased productivity. This is the operator whose knowledge should be elicited, since he clearly knows how to get the most out of the process!

Fuzzy control requires some qualitative description of the rules with which a human operator can control a process. Zadeh has noted that for many complex processes a high level of precision is not possible or even necessary in order to provide acceptable control. This is a pivotal concept in Computational Intelligence and forms the basis of unconventional control.
Fuzzy logic, which in no way replaces probability theory, is the underlying theory for dealing with approximate reasoning in uncertain situations where truth is a matter of degree.


4.3.3 Neural control


Artificial Neural Networks (ANNs) were originally proposed in the 1940s, but limitations in the hardware and in training methods at the time did not permit their practical application. Extensive and often disheartening research followed over the next decades, and it was not until the 1980s that ANNs became established as a viable technique of Computational Intelligence. ANNs are made up of densely interconnected parallel layers of simple non-linear neurons (or nodes) whose varying interconnection weights (the synaptic weights) completely specify the behavior of the network once it has been trained. Various network architectures have been proposed and their application to unconventional control was simply a matter of time. ANNs may be constructed with both analog and digital hardware and exhibit the properties of massive parallelism, robustness and fault tolerance, properties which make them ideally suited to control applications. Furthermore, ANNs possess:

- the ability to learn from experience instead of from models,
- the ability to generalize and relate similar inputs to similar outputs,
- the ability to generate any arbitrary non-linear functional relationship between their inputs and outputs,
- a distributed architecture which is eminently suited to parallel computation,
- inherent stability,
- the ability to absorb new knowledge without destroying existing knowledge, and
- the ability to learn on-line, despite disturbances in the process.

4.3.4 Neuro-fuzzy control


Traditionally, fuzzy and neural systems were considered as distinct, since their origins are very different. This restrictive viewpoint has changed radically since the late 1980s when it was realized that both fields belonged to Computational Intelligence and had much in common. Fuzzy systems are very powerful for representing linguistic and structured knowledge by means of fuzzy sets, but it is up to the experts
to establish the knowledge base that is used in the system. Execution of the fuzzy algorithm is performed as a sequence of logical steps at the end of which a full explanation of the steps that were taken and the rules that were used in arriving at the conclusion is made available. In contrast, ANNs are ideal for the representation of an arbitrary nonlinear functional relationship with parallel processing methods, can be trained from training sets but do not offer any mechanism for giving explanations on the decisions at which they arrive. It is natural, therefore, to consider the advantages and benefits that a fusion of the two methods may present. This possibility is discussed at length in a later chapter and it is shown how fuzzy control and neural control can be combined with interesting results.

Chapter 5

Elements of Fuzzy Logic


Uncertainty can be traced to Heisenberg's Uncertainty Principle, which led to the development of multi-valued logic in the 1920s. In the late 1930s the mathematician Max Black applied continuous logic to sets of elements and symbols and named this property vagueness. The developments that followed allowed for degrees of uncertainty, with truth and falsity lying at the extremes of a continuous spectrum of uncertainty. In his seminal publication entitled Fuzzy Sets, Zadeh presented in 1965 a theory of multi-valued logic which he termed Fuzzy Set Theory. Zadeh coined the term Fuzzy Logic and established the foundations of a new area of scientific endeavor that continues to this day. Initially, many were critical of Zadeh's theory, claiming that Fuzzy Logic was nothing but probability theory in disguise. Zadeh went on to develop his theory into Possibility Theory, which differs significantly from probability theory. In contrast, in Japan the theory of Fuzzy Logic was rapidly assimilated into a host of applications that have yielded enormous profits! Kosko conjectures that the principles of Fuzzy Logic are much closer to the Far Eastern concept of logic than to the Aristotelian logic which the West espouses, and that this is the reason why the Japanese were more appreciative of Fuzzy Logic. The theory of Fuzzy Logic establishes the basis for representing knowledge and developing the mechanisms essential to infer decisions on the appropriate actions that must be taken to control a plant. Since the late 1970s, Fuzzy Logic has found increasing application in the process
industry, in traffic and train control systems and most notably in household appliances. The fundamental elements of Fuzzy Logic necessary to understand the techniques of Fuzzy Control are presented in this chapter. For further in-depth study of the theory of Fuzzy Sets, the reader is referred to the numerous books and papers on the subject given in the Bibliography in chapter 18.

5.1 Basic Concepts


In classical set theory, a set consists of a finite or infinite number of elements belonging to some specified set termed the universe of discourse. The elements of the universe of discourse may or may not belong to the set A, as shown in Figure 5.1.


Figure 5.1 Characteristic function of a classical Boolean set

The crisp or Boolean characteristic function fA(x) in Figure 5.1 is expressed as the discontinuous function:

fA(x) = 1 if x ∈ A
      = 0 if x ∉ A


Vagueness can be introduced in the theory of sets if the characteristic function is generalized to permit an infinite number of values between 0 and 1, as shown in Figure 5.2.


Figure 5.2 Examples of triangular and trapezoidal fuzzy sets

If X is the universe of discourse with elements x (i.e., the region to which the physical variable is confined), then we may state X = {x}. A fuzzy set A on the universe of discourse X can be expressed symbolically as the set of ordered pairs

A = ∫ μA(x)/x or A = Σ μA(x)/x for x ∈ X

for the continuous and discrete cases respectively. Here μA(x) is termed the membership function of x on the set A and is a mapping of the universe of discourse on the closed interval [0,1]. The membership function is simply a measure of the degree to which x belongs to the set A, i.e.,

μA(x) : X → [0,1]

It is noted that the symbols ∫ and Σ here imply a fuzzy set and bear no relation to integration and summation.


The support set of a fuzzy set is a subset of the universe of discourse for which μA(x) > 0. Thus a fuzzy set is a mapping of the support set on the closed interval [0,1]. As an example, consider the temperature of water at some point in a plant, and the fuzzy variable Low. This can be described in terms of a set of positive integers in the range [0,100] and defined as A = {Low}. This set expresses the degree to which the temperature is considered Low over the range of all possible temperatures. Here, the membership function μA(x) has discrete values specified in degrees Centigrade by the set:

μA(0) = μA(5) = μA(10) = μA(15) = μA(20) = 1.0, μA(25) = 0.9, μA(30) = 0.8, μA(35) = 0.6, μA(40) = 0.3, μA(45) = 0.1, μA(50) = μA(55) = ... = μA(100) = 0

More compactly, this set can be expressed as:

μA(x) = {1/0 + 1/5 + 1/10 + 1/15 + 1/20 + 0.9/25 + 0.8/30 + 0.6/35 + 0.3/40 + 0.1/45 + 0/50 + 0/55 + ... + 0/100}

The symbol + represents the union operator in set theory and must not be confused with arithmetic addition. A graphical representation of the corresponding fuzzy membership function μA(x) is shown in Figure 5.3.

Figure 5.3 Discrete membership function of the fuzzy set A={Low}
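Such a discrete membership function amounts to a finite lookup table. A minimal sketch in Python (the names are illustrative, not from the text):

```python
# Discrete membership function of the fuzzy set A = {Low}, tabulated
# over water temperatures 0..100 degrees C in steps of 5 degrees.
low = {0: 1.0, 5: 1.0, 10: 1.0, 15: 1.0, 20: 1.0,
       25: 0.9, 30: 0.8, 35: 0.6, 40: 0.3, 45: 0.1}

def mu_low(t):
    """Degree to which a temperature t (a multiple of 5) is Low."""
    return low.get(t, 0.0)  # grades from 50 C upward are 0

print(mu_low(20))  # 1.0
print(mu_low(35))  # 0.6
print(mu_low(70))  # 0.0
```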


A fuzzy variable is one whose values can be considered labels of fuzzy sets. Thus TEMPERATURE can be considered a fuzzy variable which can take on linguistic values such as Low, Medium, Normal, High and Very_High. This is precisely the way that human operators refer to plant variables in relation to their nominal values. It is shown in the following that fuzzy variables can be readily described by fuzzy sets. In general, any fuzzy variable can be expressed in terms of phrases that combine fuzzy variables, linguistic descriptors and hedges. Thus the values of the fuzzy variable TEMPERATURE in the foregoing example can be described as High, NOT High, rather_High, NOT Very_High, extremely_High, quite_High, etc., by combining labels such as High, the negation NOT, connectives such as AND, and hedges such as extremely, rather, quite, etc. Figure 5.4 shows the variable TEMPERATURE with a few of its values.


Figure 5.4 The linguistic variable TEMPERATURE and some of its values


The dependence of a linguistic variable on another can be described by means of a fuzzy conditional statement of the form:

R : IF S1 THEN S2

or symbolically as:

S1 → S2

where S1 and S2 are fuzzy statements of the general form S : X is A, in which X is a linguistic variable and A a fuzzy subset. A linguistic meaning can be given to the fuzzy subset A to specify the value of X, for example:

IF the LOAD is Small THEN TORQUE is Very_High

or

IF the ERROR is Negative_Large THEN OUTPUT is Negative_Large.

Two or more fuzzy conditional statements can be combined (or included in another) so as to form a composite conditional statement such as:

R : IF S1 THEN (IF S2 THEN S3).

It should be obvious that the composite statement can be decomposed into the two simpler conditional statements:

R1 : IF S1 THEN R2 and R2 : IF S2 THEN S3

The composite statement (or rule):

IF the ERROR is Negative_Large THEN (IF CHANGE_IN_ERROR is Positive_Large THEN OUTPUT is Positive_Large)

can be written more simply as a pair of rules:


R1 : IF ERROR is Negative_Large THEN R2
R2 : IF CHANGE_IN_ERROR is Positive_Large THEN OUTPUT is Positive_Large

This is the most useful rule structure in practice. Human operators are invariably taught to control plants with linguistic rules of this type, rather than composite rules, which appear far too complicated. The number of rules required to control a plant varies enormously and depends on the complexity of the plant. Human plant operators rarely use more than approximately 30 rules for routine control tasks, since rules that are rarely used tend to be quickly forgotten. To control a complex plant like a rotary kiln, as many as 60-80 rules may be required, but as few as 5 are necessary to control simple appliances like washing machines or cameras.

5.2 Fuzzy Algorithms


Two or more fuzzy conditional statements can be combined with the OR (or ELSE) connective to form a fuzzy algorithm R of the form:

R : R1 OR R2 OR R3 OR ... OR RN

For example, a subset of the linguistic rules used to control a steam engine can be written as:

IF SPE is Negative_Large THEN (IF CSPE is NOT (Negative_Large OR Negative_Medium) THEN CFUEL is Positive_Large)
OR
IF SPE is Negative_Small THEN (IF CSPE is (Positive_Large OR Positive_Small) THEN CFUEL is Positive_Small)
OR
IF SPE is Negative_Small THEN (IF CSPE is Positive_Medium THEN CFUEL is Positive_Small)

where


SPE = SPeed Error
CSPE = Change in SPeed Error
CFUEL = Change in Fuel Intake

It is noted that linguistic control rules are of the familiar if ... then ... else form, where else is replaced by the connective OR.

5.3 Fuzzy Operators


The operators min (for minimum, denoted ∧) and max (for maximum, denoted ∨) can be used on either one or two elements. The operations min and max on two elements a and b are defined respectively as:

a ∧ b = min(a,b) = a if a ≤ b
                 = b if a > b

a ∨ b = max(a,b) = a if a ≥ b
                 = b if a < b

Well-known applications of these operators are found in programmable logic controllers (PLCs) for logical switching operations, where they implement the AND (∧) and OR (∨) functions on binary signals. The operators min and max of two sets A and B result in the sets C and D as follows:

D = A ∪ B = {max(a,b)} for a ∈ A, b ∈ B
C = A ∩ B = {min(a,b)} for a ∈ A, b ∈ B


These operations are shown in Figure 5.5. The AND operator is therefore synonymous with the min operation and the OR operator with the max operation; this is worth remembering in the following chapters. When the operators are applied to a single set, they imply the minimum (inf, or infimum) or maximum (sup, or supremum) of all the elements of the set, thus:

∧a = ∧A = inf(A) for a ∈ A and ∨a = ∨A = sup(A) for a ∈ A

The operators can also be used as functions on the elements of discrete sets, e.g.,

∧a = ∧(a1, a2, ..., am) = a1 ∧ a2 ∧ ... ∧ am = ∧k(ak)
∨A = ∨(A1, A2, ..., Am) = A1 ∨ A2 ∨ ... ∨ Am = ∨k(Ak)

When the elements of the set are functions of a variable, the operators are expressed as:

∧a = ∧x(a(x)) for x ∈ X

It is noted, finally, that expressions involving the min and max operators obey rules identical to those of arithmetic multiplication and addition respectively.
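These operators map directly onto the built-in min and max functions, both for pairs of elements and for whole sets, where they yield the infimum and supremum. A brief illustration (the values are arbitrary):

```python
a, b = 0.3, 0.8
print(min(a, b))  # 0.3, i.e., a AND b
print(max(a, b))  # 0.8, i.e., a OR b

# Applied to a whole set, the operators give its infimum and supremum.
A = [0.2, 0.9, 0.5]
print(min(A), max(A))  # 0.2 0.9
```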



Figure 5.5 Graphical representation of the min and max operations


5.4 Operations on Fuzzy Sets


A fuzzy set A on X is said to be a null set ∅ if its membership function is zero everywhere, i.e.,

A = ∅ if μA(x) = 0 for all x ∈ X

The complement of a fuzzy set A is simply:

μ¬A(x) = 1 - μA(x) for all x ∈ X

Two fuzzy sets are considered identical if their membership functions are identical everywhere on the universe of discourse, i.e.,

A = B if μA(x) = μB(x) for all x ∈ X

A fuzzy set B is a subset of A if the membership function of B is smaller than or equal to that of the fuzzy set A everywhere on X, i.e.,

B ⊂ A if μB(x) ≤ μA(x) for all x ∈ X

The union of two fuzzy sets A and B on X is defined as:

μA∪B(x) = μA(x) ∨ μB(x) for all x ∈ X

while the intersection of two fuzzy sets A and B on X is defined as:

μA∩B(x) = μA(x) ∧ μB(x) for all x ∈ X

Finally, the product of two fuzzy sets A and B on X is defined as:

μAB(x) = μA(x)μB(x) for all x ∈ X
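These pointwise definitions translate directly into code. A short sketch over a common discrete universe, using the grades of the small and medium sets of Figure 5.11 (the variable names are illustrative):

```python
# Membership grades of two discrete fuzzy sets over the universe 0..6.
small  = [0.3, 0.7, 1.0, 0.7, 0.3, 0.0, 0.0]
medium = [0.0, 0.0, 0.3, 0.7, 1.0, 0.7, 0.3]

complement   = [1 - a for a in small]                      # NOT small
union        = [max(a, b) for a, b in zip(small, medium)]  # max, pointwise
intersection = [min(a, b) for a, b in zip(small, medium)]  # min, pointwise
product      = [a * b for a, b in zip(small, medium)]      # algebraic product

print(union)         # [0.3, 0.7, 1.0, 0.7, 1.0, 0.7, 0.3]
print(intersection)  # [0.0, 0.0, 0.3, 0.7, 0.3, 0.0, 0.0]
```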


5.5 Algebraic Properties of Fuzzy Sets


The standard definitions for the union, intersection and complement of classical logic can readily be extended to fuzzy sets, and the classical properties of sets can be expressed in terms of membership functions. For instance, the distribution property in terms of fuzzy sets becomes:

(A ∪ B) ∩ C = (A ∩ C) ∪ (B ∩ C)

while De Morgan's theorem becomes:

¬(A ∪ B) = ¬A ∩ ¬B

The following properties apply to fuzzy sets:

∅ ∩ A = ∅ or 0 ∧ μA = 0
∅ ∪ A = A or 0 ∨ μA = μA
A ∩ E = A or 1 ∧ μA = μA
A ∪ E = E or 1 ∨ μA = 1

where E is the unit set, specified by μE(x) = 1 for all x ∈ X, and ∅ is the null set.
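De Morgan's theorem can be verified numerically on arbitrary membership grades; a minimal check (the grades are illustrative):

```python
# complement of (A union B) equals (complement of A) intersect (complement of B)
A = [0.3, 0.7, 1.0, 0.7, 0.3]
B = [0.0, 0.3, 0.7, 1.0, 0.7]

lhs = [1 - max(a, b) for a, b in zip(A, B)]
rhs = [min(1 - a, 1 - b) for a, b in zip(A, B)]
assert lhs == rhs  # holds pointwise for any grades in [0,1]
print("De Morgan holds")
```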

5.6 Linguistic Variables


As was noted earlier, a linguistic variable can take on values that are statements of a natural or artificial language. In general, the value of a linguistic variable is specified in terms of the following:

- primary terms, which are labels of fuzzy sets, such as High, Low, Small, Medium and Zero,
- the negation NOT and the connectives AND and OR,
- hedges such as very, nearly and almost, and
- markers such as parentheses ( ).


The primary terms may have either continuous or discrete membership functions. Continuous membership functions are normally defined by analytic functions. The Danish company F. L. Smidth, in its fuzzy controllers designed for the cement industry, uses Gaussian-like membership functions of the type shown in Figure 5.7, given by an expression of the form:

μA(x) = exp(-(|x - γ|/α)^β)

Figure 5.7 Examples of membership functions used in the F. L. Smidth fuzzy controller

The triplet (α, β, γ) defining the shape of the F. L. Smidth fuzzy sets shown in Figure 5.7 is given in the following table:


Linguistic Variable   Acronym   α      β     γ
Positive_Large        LP        0.25   2.5    1
Positive_Medium       MP        0.25   2.5    0.7
Positive_Small        SP        0.25   2.5    0.4
Positive_Zero         ZP        0.1    6      0.1
Zero                  ZE        0.25   6      0
Negative_Zero         ZN        0.1    6     -0.1
Negative_Small        SN        0.25   2.5   -0.4
Negative_Medium       MN        0.25   2.5   -0.7
Negative_Large        LN        0.25   2.5   -1
Large                 HIGH      0.5    6      1
Normal                OK        0.6    8      0
Low                   LOW       0.5    6     -1
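Assuming the Gaussian-like reading μA(x) = exp(-(|x - γ|/α)^β) for the F. L. Smidth expression (a reconstruction; the exact form used in the original controllers may differ), the table entries can be evaluated directly:

```python
import math

def mu(x, alpha, beta, gamma):
    """Gaussian-like membership grade with center gamma, width alpha and
    shape exponent beta (assumed form, see the note above)."""
    return math.exp(-((abs(x - gamma) / alpha) ** beta))

# Zero (ZE) from the table: alpha = 0.25, beta = 6, gamma = 0.
print(mu(0.0, 0.25, 6, 0))             # 1.0 at the center
print(round(mu(0.25, 0.25, 6, 0), 3))  # 0.368, one width from the center
```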

An alternative way of defining continuous membership functions is through the generic S and Π functions shown in Figures 5.8(a) and 5.8(b). The first is monotonic and is specified by:

S(x, α, β, γ) = 0 for x ≤ α
              = 2[(x - α)/(γ - α)]² for α < x ≤ β
              = 1 - 2[(x - γ)/(γ - α)]² for β < x ≤ γ
              = 1 for x > γ

where β = (α + γ)/2 is the crossover point at which S = 0.5.

Figure 5.8(a) The generic membership function S(x, α, β, γ)


The second generic membership function is the Π function, which changes monotonicity at one point only and can be defined in terms of S functions. In this case the parameter β represents the width of the function between the median points, where the membership function has a value of 0.5. This function is given by:

Π(x, β, γ) = S(x, γ - β, γ - β/2, γ) for x ≤ γ
           = 1 - S(x, γ, γ + β/2, γ + β) for x > γ

Continuous fuzzy sets can also be constructed from standardized trapezoidal or triangular functions. Three examples of trapezoidal functions that represent the primary sets (Small, Medium and Large) are shown in Figure 5.9. These fuzzy sets can be uniquely defined using four parameters: the inflexion points b and c, and the left and right extinction points a and d that define the support set. Figure 5.10 shows examples of some of the membership functions available in the MATLAB Fuzzy Toolbox. Finally, discrete fuzzy sets are sets of singletons on a finite universe of discourse. For example, if the universe of discourse is given by the finite set:

X = {0 + 1 + 2 + 3 + 4 + 5 + 6}

Figure 5.8(b) The generic membership function Π(x, β, γ)
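The S and Π functions translate directly into code; the sketch below assumes the standard Zadeh forms, with the crossover β taken as the midpoint (α + γ)/2:

```python
def S(x, a, b, c):
    """S-function: 0 below a, 1 above c, crossover grade 0.5 at the
    midpoint b = (a + c) / 2."""
    if x <= a:
        return 0.0
    if x <= b:
        return 2 * ((x - a) / (c - a)) ** 2
    if x <= c:
        return 1 - 2 * ((x - c) / (c - a)) ** 2
    return 1.0

def Pi(x, width, center):
    """Pi-function: rises to 1 at `center` and falls symmetrically;
    `width` separates the two points where the grade equals 0.5."""
    if x <= center:
        return S(x, center - width, center - width / 2, center)
    return 1 - S(x, center, center + width / 2, center + width)

print(S(0.5, 0, 0.5, 1))   # 0.5 at the crossover point
print(Pi(1.0, 1.0, 1.0))   # 1.0 at the center
print(Pi(0.5, 1.0, 1.0))   # 0.5 half a width below the center
```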



Figure 5.9 Standardized trapezoidal membership functions

then the fuzzy sets for the linguistic variables small, medium and large as shown in Figure 5.11 could be defined as the sets:

μsmall(x) = {0.3/0 + 0.7/1 + 1/2 + 0.7/3 + 0.3/4 + 0/5 + 0/6}
μmedium(x) = {0/0 + 0/1 + 0.3/2 + 0.7/3 + 1/4 + 0.7/5 + 0.3/6}
μlarge(x) = {0/0 + 0/1 + 0/2 + 0/3 + 0.3/4 + 0.7/5 + 1/6}

Figure 5.10 Examples of membership functions provided in the MATLAB/Fuzzy_Toolbox


5.7 Connectives

Negation (NOT) and the connectives AND and OR can be defined in terms of the complement, intersection and union operations respectively. Usually the connective AND is used for fuzzy variables which have different universes of discourse. If

A = {μA(x)/x} for x ∈ X
B = {μB(y)/y} for y ∈ Y

it follows that

A AND B = {μA(x) ∧ μB(y)/(x,y)} = {μA×B(x,y)/(x,y)} for x ∈ X, y ∈ Y

The connective OR connects linguistic values of the same variable; both operands must therefore belong to the same universe of discourse.


Figure 5.11 Examples of discrete membership functions made up of singletons

Thus if

A = {μA(x)/x} for x ∈ X
B = {μB(x)/x} for x ∈ X

then

A OR B = {μA(x) ∨ μB(x)/x} = {μA∪B(x)/x} for x ∈ X


The connective OR cannot be used when the fuzzy variables have different universes of discourse, except in the case where the variables appear on the same side of a conditional if ... then statement, as in the rule: IF the PRESSURE is High OR the SPEED is Low THEN FUEL_FEED must be Zero. The NOT operator is synonymous with negation in a natural language. Thus if:

A = {μA(x)/x} for x ∈ X

then

NOT A = ¬A = {(1 - μA(x))/x}


The statement PRESSURE is NOT High is clearly identical to the statement PRESSURE is Not_High. Linguistic hedges are useful in generating a larger set of linguistic values from a smaller set of primary terms. Thus using the hedge very, the negation NOT and the connective AND with the primary term Large, we may generate the new fuzzy sets very_Large, NOT_very_Large, Large_AND_NOT_very_Large, etc. In this manner it is possible to compute the membership function of complex terms such as NOT_Small AND NOT_Large, whose membership function is:

μ(x) = [1 - μSmall(x)] ∧ [1 - μLarge(x)]
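The membership function of such a complex term is computed pointwise; a sketch over a small discrete universe (the grades are illustrative, and modeling the hedge very by squaring the grades is a common convention assumed here, not stated above):

```python
# Illustrative grades of Small and Large over a five-element universe.
small = [1.0, 0.7, 0.3, 0.0, 0.0]
large = [0.0, 0.0, 0.3, 0.7, 1.0]

# NOT_Small AND NOT_Large, computed pointwise with complement and min.
z = [min(1 - s, 1 - l) for s, l in zip(small, large)]
print(z[2])  # 0.7, the peak in the middle of the universe

# The hedge "very" is commonly modeled by squaring the grades
# (Zadeh's concentration operator, assumed here as a convention).
very_large = [l ** 2 for l in large]
```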

Chapter 6

Fuzzy Reasoning
At the core of every fuzzy controller is the inference engine, the computational mechanism with which decisions can be inferred even though the knowledge may be incomplete. It is this very mechanism that gives linguistic controllers the power to reason, by being able to extrapolate knowledge and search for rules which only partially fit any given situation for which a rule does not exist. Unlike expert systems, which depend on a variety of techniques to search decision trees, fuzzy inference engines perform an exhaustive search of the rules in the knowledge base to determine the degree of fit of each rule for a given set of causes. The contribution to the final decision of rules that exhibit a small degree of fit is clearly small and may even be ignored, while rules with a high degree of fit are dominant. It is clear that a number of rules may contribute to the final result to varying degrees. A degree of fit of unity means that only one rule has fired and only one unique rule contributes to the final decision, while a degree of fit of zero implies that the rule does not contribute to the final decision. Inference engines can take different forms depending on the manner in which inference is defined. It is therefore prudent to review the fundamentals of fuzzy logic that will allow us to understand how inference works and how it may be implemented. A fuzzy propositional implication defines the relationship between the linguistic variables of a fuzzy controller. Given two fuzzy sets A and B that belong to the universes of discourse X and Y respectively, we define the fuzzy propositional implication as:

R : IF A THEN B = A × B

where A × B is the Cartesian product of the two fuzzy sets A and B. The Cartesian product is an essential operation of all fuzzy inference engines. Using the conjunctive operator ∧ (min), the Cartesian product is defined as:

A × B = ∫∫ (μA(x) ∧ μB(y))/(x,y) = ∫∫ min(μA(x), μB(y))/(x,y)

while for the case of an algebraic product the Cartesian product is:

A × B = ∫∫ (μA(x)μB(y))/(x,y)

Thus, for example, given the discrete fuzzy sets A = {1 + 2 + 3} and B = {1 + 2 + 3 + 4}, whose corresponding discrete membership functions (sometimes termed the grades of membership) are:

{μA(x)/x} = (1/1 + 0.7/2 + 0.2/3)
{μB(y)/y} = (0.8/1 + 0.6/2 + 0.4/3 + 0.2/4)

the Cartesian product using the conjunctive (min) operator is:

R = A × B = {min(1, 0.8)/(1,1), min(1, 0.6)/(1,2), min(1, 0.4)/(1,3), min(1, 0.2)/(1,4),
min(0.7, 0.8)/(2,1), min(0.7, 0.6)/(2,2), min(0.7, 0.4)/(2,3), min(0.7, 0.2)/(2,4), ...}

= {0.8/(1,1) + 0.6/(1,2) + 0.4/(1,3) + 0.2/(1,4) +
0.7/(2,1) + 0.6/(2,2) + 0.4/(2,3) + 0.2/(2,4) +


0.2/(3,1) + 0.2/(3,2) + 0.2/(3,3) + 0.2/(3,4)}

This Cartesian product can be conveniently represented by means of the relational matrix:

x\y    1     2     3     4
1      0.8   0.6   0.4   0.2
2      0.7   0.6   0.4   0.2
3      0.2   0.2   0.2   0.2

which is shown in graphical form in Figure 6.1.

Figure 6.1 Graphical representation of the relational matrix R

Likewise, the Cartesian algebraic product is computed as follows:

R* = A × B = {0.80/(1,1) + 0.60/(1,2) + 0.40/(1,3) + 0.20/(1,4) +
0.56/(2,1) + 0.42/(2,2) + 0.28/(2,3) + 0.14/(2,4) +
0.16/(3,1) + 0.12/(3,2) + 0.08/(3,3) + 0.04/(3,4)}

whose relational matrix is:


x\y    1      2      3      4
1      0.8    0.6    0.4    0.2
2      0.56   0.42   0.28   0.14
3      0.16   0.12   0.08   0.04

which is shown graphically in Figure 6.2. It is observed that the graphical representations of the relational matrices of the two Cartesian products have similarities. The Cartesian product based on the conjunctive operator min is much simpler and more efficient to implement computationally and is therefore generally preferred in fuzzy controller inference engines. Most commercially available fuzzy controllers in fact use this method.


Figure 6.2 Graphical representation of the relational matrix R*
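Both relational matrices can be reproduced in a few lines; a sketch using the grades of the running example:

```python
# R uses the conjunctive operator min; R* uses the algebraic product.
A = [1.0, 0.7, 0.2]        # grades of A over x = 1, 2, 3
B = [0.8, 0.6, 0.4, 0.2]   # grades of B over y = 1, 2, 3, 4

R      = [[min(a, b) for b in B] for a in A]
R_star = [[a * b for b in B] for a in A]

print(R[0])  # [0.8, 0.6, 0.4, 0.2]
print(R[2])  # [0.2, 0.2, 0.2, 0.2]
```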

6.1 The Fuzzy Algorithm


The membership function that specifies the fuzzy implication is given in terms of the individual membership functions μA(x) and μB(y) of the sets A and B in different ways, as described below. Assume that:

μR(x,y) = Φ(μA(x), μB(y)) for x ∈ X and y ∈ Y


where Φ is some implication operator and R = {μR(x,y)/(x,y)}. In general, if A1, A2, ... AN are fuzzy subsets of X and B1, B2, ... BN are subsets of Y (corresponding to the antecedents or causes and the consequents or effects respectively), then the fuzzy algorithm is defined as the set of rules:

R : IF A1 THEN B1
OR IF A2 THEN B2
OR ...
IF AN THEN BN

This form is typically used in fuzzy control and is identical to the manner and terms in which human operators think. The connective OR, abbreviated as ⊕, depends on the fuzzy implication operator Φ. Thus the membership function for N rules in a fuzzy algorithm is given by:

μR(x,y) = ⊕(μR1(x,y), μR2(x,y), ..., μRN(x,y)) = ⊕((μA1(x) Φ μB1(y)), (μA2(x) Φ μB2(y)), ...)

The foregoing relations apply to simple variables x and y. In general, the IF part of a conditional statement of the form IF A THEN B ELSE C involves more than one variable and can be expressed as a series of nested statements of the form:

IF A1 THEN (IF A2 THEN ... (IF AN THEN B))

or as a statement where the antecedents are related through the connective AND, i.e.,

IF (A1 AND A2 AND ... AN) THEN B


whereupon:

μR(x1, x2, ..., xN, y) = Φ{μA1(x1), Φ(μA2(x2), ..., Φ(μAN(xN), μB(y)))}

for x1, x2, ..., xN ∈ X1, X2, ..., XN and y ∈ Y, or

μR(x1, x2, ..., xN, y) = μA1(x1) ∧ μA2(x2) ∧ ... ∧ Φ(μAN(xN), μB(y)) = ∧k (μAk(xk) Φ μB(y)) for k = 1, 2, ..., N

6.2 Fuzzy Reasoning


As noted earlier, the knowledge necessary to control a plant is usually expressed as a set of linguistic rules of the form IF (cause) THEN (effect). These are the rules with which new operators are trained to control a plant and they constitute the knowledge base of the system. In practice, it is possible that not all the rules necessary to control a plant have been elicited, or are known. It is therefore essential to use some technique capable of inferring the control action in this case from the available rules. In classical propositional calculus the conditional statement (or rule) IF A THEN B, expressed symbolically as A → B, is equivalent to the operation ¬A ∪ B, where A and B are subsets of the universes of discourse X and Y respectively and ¬A = NOT A. In fuzzy logic and approximate reasoning there are two fuzzy implication inference rules:


6.2.1 Generalized Modus Ponens (GMP)


In fuzzy logic there exist two principal categories of inference. The first is the Generalized Modus Ponens (GMP), defined as follows:

GMP: Premise 1: x is A′
     Premise 2: IF x is A THEN y is B
     Consequence: y is B′

GMP is related to forward, data-driven inference, which finds use in all fuzzy controllers. The objective here is to infer the effect given the cause. For the special case where A′ = A and B′ = B, GMP reduces to the classical Modus Ponens.

6.2.2 Generalized Modus Tollens (GMT)


The second category is the Generalized Modus Tollens (GMT), for which the following holds:

GMT: Premise 1: y is B′
     Premise 2: IF x is A THEN y is B
     Consequence: x is A′

GMT is directly related to backward, goal-driven inference mechanisms, which find application in expert systems. In contrast to GMP, the objective here is to infer the cause that leads to a particular effect. For the special case where B′ = NOT B and A′ = NOT A, GMT reduces to the classical Modus Tollens. The most common inference relations with which fuzzy inference engines can be constructed are given in the following. It will be observed that they have differing degrees of complexity and ease of use, and only the last two have found extensive application in practice.
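In fuzzy controllers GMP is commonly realized with the compositional rule of inference, B′ = A′ ∘ R, using max-min composition; this construction is standard in the fuzzy control literature and is assumed here rather than stated in the text. A sketch with the relational matrix of the earlier example:

```python
# Forward, data-driven inference: given observed input grades A' and a
# relational matrix R, the inferred output is B' = A' o R (max-min).
A_prime = [0.6, 1.0, 0.4]      # observed grades over x = 1, 2, 3
R = [[0.8, 0.6, 0.4, 0.2],
     [0.7, 0.6, 0.4, 0.2],
     [0.2, 0.2, 0.2, 0.2]]

B_prime = [max(min(a, row[j]) for a, row in zip(A_prime, R))
           for j in range(len(R[0]))]
print(B_prime)  # [0.7, 0.6, 0.4, 0.2]
```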


6.2.3 Boolean implication


The classical binary implication due to Boole uses the union operator in addition to negation and is defined as:

RBoole = (Ā × Y) ∪ (X × B) and μR(x,y) = (1 − μA(x)) ∨ μB(y)

For the case of N rules, use is made of the connective AND, in which case:

RN = ∩k Rk for k = 1, 2, …, N and μRN(x,y) = ∧k ((1 − μAk(x)) ∨ μBk(y))

6.2.4 Lukasiewicz implication


The Lukasiewicz implication (also known as Zadeh's arithmetic rule of fuzzy implication) is chronologically one of the first and is based on multi-valued logic. It is similar to Boole's implication, but in this case the union is replaced by bounded arithmetic addition, i.e.:

RL = (Ā × Y) ⊕ (X × B) and μR(x,y) = 1 ∧ (1 − μA(x) + μB(y))

Likewise, for the case of N rules connected by AND, the implication is given by:

RN = ∩k Rk for k = 1, 2, …, N and μRN(x,y) = ∧k (1 ∧ (1 − μAk(x) + μBk(y)))


6.2.5 Zadeh implication


The Zadeh max-min implication involves the max and min operators and is defined as:

RZadeh = (A × B) ∪ (Ā × Y) and μR(x,y) = (μA(x) ∧ μB(y)) ∨ (1 − μA(x))

Zadeh's fuzzy implication rule is difficult to apply in practice and it took several years before Mamdani proposed a simplification that made it especially useful for control applications.

6.2.6 Mamdani implication


The Mamdani implication rule is a simplification of the implication proposed by Zadeh. It uses only the min operator and is defined as:

RMamdani = A × B and μR(x,y) = μA(x) ∧ μB(y) = min(μA(x), μB(y))

For a fuzzy algorithm comprising N rules, use is made of the connective OR, in which case the implication is:

RN = ∪k Rk for k = 1, 2, …, N and μRN(x,y) = ∨k (μAk(x) ∧ μBk(y))


6.2.7 Larsen implication


Finally, the Larsen implication rule uses arithmetic multiplication in the computation of the Cartesian product and is defined as:

RLarsen = A × B and μR(x,y) = μA(x) · μB(y)

For a fuzzy algorithm comprising N rules, use is made of the connective OR, i.e.:

RN = ∪k Rk for k = 1, 2, …, N and μRN(x,y) = ∨k (μAk(x) · μBk(y))

Both Mamdani's and Larsen's implications have found extensive application in practical control engineering due to their computational simplicity. Nearly all industrial fuzzy controllers use one or the other of these two fuzzy implications in their inference engines. Mamdani's implication, being computationally faster, is found most often.
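In code, these implication operators reduce to pointwise functions of the membership values a = μA(x) and b = μB(y). A minimal sketch in Python (the function names are ours, not the book's):

```python
# Pointwise fuzzy implication operators of Sections 6.2.3-6.2.7.
# a = mu_A(x), b = mu_B(y); each returns mu_R(x, y) in [0, 1].

def boole(a, b):        # (1 - a) OR b
    return max(1.0 - a, b)

def lukasiewicz(a, b):  # 1 AND (1 - a + b)
    return min(1.0, 1.0 - a + b)

def zadeh(a, b):        # (a AND b) OR (1 - a)
    return max(min(a, b), 1.0 - a)

def mamdani(a, b):      # a AND b
    return min(a, b)

def larsen(a, b):       # a * b
    return a * b

for f in (boole, lukasiewicz, zadeh, mamdani, larsen):
    print(f.__name__, f(0.7, 0.3))
```

Note that Mamdani and Larsen need only one min or one multiplication per grid point, which is why they dominate in industrial inference engines.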

6.2.8 GMP implication


For completeness, it is useful to revisit Generalized Modus Ponens with a view to comparing it with the implication rules previously defined. Here RGMP corresponds to the implication A → B, whereupon, using the bounded product:


μR(x,y) = 1 if μA(x) ≤ μB(y)
μR(x,y) = 0 if μA(x) > μB(y)

Alternatively, using the algebraic product:

μR(x,y) = 1 if μA(x) ≤ μB(y)
μR(x,y) = μB(y)/μA(x) if μA(x) > μB(y)

6.3 The Compositional Rules of Inference


Fuzzy controllers invariably involve numerous rules and it is essential, therefore, to establish suitable mechanisms with which to process these rules in order to arrive at some consequent. Given, for instance, two sequential fuzzy conditional rules:

R1: IF A THEN B
R2: IF B THEN C

it is possible to combine the two rules into a single rule by absorbing the intermediate result B and find the relationship between the antecedent A and the ultimate consequent C directly, i.e.:

R12: IF A THEN C

The composition of these two rules into one can be expressed as:

R12 = R1 o R2


where o implies rule composition. In terms of the max-min operators used by Mamdani, the membership function of the resultant compositional rule of inference is:

μR12(x,z) = ∨y (μR1(x,y) ∧ μR2(y,z)) = maxy (min(μR1(x,y), μR2(y,z)))

and in the case of the Larsen implication rule, using the max-product operators:

μR12(x,z) = ∨y (μR1(x,y) · μR2(y,z)) = maxy (μR1(x,y) · μR2(y,z))

When discrete membership functions are used, the compositional rule of inference in the case of the Mamdani implication is analogous to the inner product of two matrices in which multiplication and addition are replaced by the min and max operators, respectively. For the Larsen implication, addition is replaced by the max operator while multiplication remains arithmetic. In the following, we discuss the procedure for determining the consequent (or effect), given the antecedent (or cause). Given:

A = {μA(x)/x} for x ∈ X
B = {μB(y)/y} for y ∈ Y

and the compositional rule of inference:

R = {μR(x,y)/(x,y)} for x ∈ X and y ∈ Y

we wish to infer the consequent B′ when the antecedent is modified slightly to A′, i.e.:

A′ = {μA′(x)/x} for x ∈ X


Making use of the fuzzy compositional rule of inference with the max-min operators, for instance:

B′ = A′ o R = ∨x (μA′(x) ∧ μR(x,y)) for x ∈ X and y ∈ Y

or, using the max-product operators:

B′ = A′ o R = ∨x (μA′(x) · μR(x,y)) for x ∈ X and y ∈ Y

By way of example, consider the rule: IF x is Slow THEN y is Fast, where the fuzzy sets Slow and Fast are given by the discrete membership functions:

μSlow(x) = {1 + 0.7 + 0.3 + 0 + 0 + 0}
μFast(y) = {0 + 0 + 0.3 + 0.7 + 1 + 1}

Figure 6.3 The fuzzy sets Slow and Fast in the example

on the universes of discourse X = Y = {0, 1, 2, 3, 4, 5}. The discrete membership functions are shown in Figure 6.3. We wish to determine the outcome if A′ = slightly Slow, for which no rule exists.


The procedure is straightforward, though tedious. The first step is to compute the Cartesian product R = Slow × Fast and, using the min operator, this is simply R = {min[μSlow(xi), μFast(yj)]}. With rows indexed by x and columns by y:

R =
0     0     0.3   0.7   1     1
0     0     0.3   0.7   0.7   0.7
0     0     0.3   0.3   0.3   0.3
0     0     0     0     0     0
0     0     0     0     0     0
0     0     0     0     0     0

Thus, if the antecedent A is modified somewhat by displacing the peak of the discrete fuzzy set to represent the fuzzy set A′ = slightly Slow, the original discrete membership function:

μSlow(x) = {1 + 0.7 + 0.3 + 0 + 0 + 0}

becomes:

μA′(x) = {0.3 + 0.7 + 1 + 0.7 + 0.3 + 0}

which is shown in Figure 6.4. Using the fuzzy compositional inference rule B′ = A′ o R and the max-min operators (i.e., the Mamdani compositional rule):

μB′(y) = maxx (min(μA′(x), μR(x,y)))


Figure 6.4 The compositional inference rule using the max-min operators

then the discrete membership function of the new consequent can readily be computed. The relational matrix {min(μA′(x), μR(x,y))} combines the elements of the matrix R with the discrete membership function μA′(x) and, with rows indexed by x and columns by y, is given by:

0     0     0.3   0.3   0.3   0.3
0     0     0.3   0.7   0.7   0.7
0     0     0.3   0.3   0.3   0.3
0     0     0     0     0     0
0     0     0     0     0     0
0     0     0     0     0     0

The final operation in determining the membership function μB′(y) of the new consequent is selection of the largest element in each column, which is equivalent to applying the max operator to each column. The result of the procedure is shown in Figure 6.4 and is:

μB′(y) = {0 + 0 + 0.3 + 0.7 + 0.7 + 0.7}


It is noted that the displacement of the membership function for the new condition must be small, otherwise all the elements of the relational matrix will be zero and no conclusion can be drawn. By way of comparison, the corresponding result using the max-product rule of compositional inference:

μB′(y) = maxx (μA′(x) · μR*(x,y))

has a relational matrix R* = {μSlow(xi) · μFast(yj)} which, with rows indexed by x and columns by y, is:

R* =
0     0     0.3    0.7    1     1
0     0     0.21   0.49   0.7   0.7
0     0     0.09   0.21   0.3   0.3
0     0     0      0      0     0
0     0     0      0      0     0
0     0     0      0      0     0

whereupon the relational matrix {μA′(xi) · μR*(xi,yj)} is:

0     0     0.09    0.21    0.3     0.3
0     0     0.147   0.343   0.49    0.49
0     0     0.09    0.21    0.3     0.3
0     0     0       0       0       0
0     0     0       0       0       0
0     0     0       0       0       0

The maximum elements of each column are therefore:

μB′(y) = {0 + 0 + 0.147 + 0.343 + 0.49 + 0.49}

which are shown graphically in Figure 6.5.
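The worked example above can be checked numerically. A sketch reproducing both compositions (the variable names are ours):

```python
# Discrete example of Section 6.3: rule "IF x is Slow THEN y is Fast".
slow = [1.0, 0.7, 0.3, 0.0, 0.0, 0.0]   # mu_Slow(x)
fast = [0.0, 0.0, 0.3, 0.7, 1.0, 1.0]   # mu_Fast(y)
a_mod = [0.3, 0.7, 1.0, 0.7, 0.3, 0.0]  # mu_A'(x), "slightly Slow"

# Cartesian products: Mamdani uses min, Larsen uses the arithmetic product.
r_min = [[min(a, b) for b in fast] for a in slow]
r_prod = [[a * b for b in fast] for a in slow]

# Compositional rule of inference B' = A' o R, taking the max over x.
b_maxmin = [max(min(a_mod[i], r_min[i][j]) for i in range(6)) for j in range(6)]
b_maxprod = [max(a_mod[i] * r_prod[i][j] for i in range(6)) for j in range(6)]

print(b_maxmin)                           # [0.0, 0.0, 0.3, 0.7, 0.7, 0.7]
print([round(v, 3) for v in b_maxprod])   # [0.0, 0.0, 0.147, 0.343, 0.49, 0.49]
```

The max-min result matches the column maxima found above, and the max-product result reproduces the second composition exactly.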

Figure 6.5 Compositional inference using the max-product

In general, industrial fuzzy controllers are multivariable, involving m inputs and p outputs. It is, however, simpler to think in terms of p parallel fuzzy controllers, each with a single output. In practice this is achieved by multiplexing a multi-input single-output controller, since the antecedents of the rules are the same and only the consequents change in each case. For the case of a fuzzy controller with m inputs and one output:

RN = {μRN(x1, x2, …, xm, y)/(x1, x2, …, xm, y)} for xk ∈ Xk and y ∈ Y

Now, given:

A′k = {μA′k(xk)/xk} for xk ∈ Xk, k = 1, 2, …, m


the consequent is given by the relationship:

B′ = (A′1 × A′2 × … × A′m) o RN = {μB′(y)/y} for y ∈ Y

where:

μB′(y) = ∨x1 ∨x2 … ∨xm [∧k μA′k(xk) ∧ μRN(x1, x2, …, xm, y)] for k = 1, 2, …, m

Finally, for completeness, if the max-product is used in the compositional inference rule, the corresponding expression for the resultant consequent is given by:

μB′(y) = ∨x1 ∨x2 … ∨xm [∏k μA′k(xk) · μRN(x1, x2, …, xm, y)] for k = 1, 2, …, m

where now:

μRN(x1, x2, …, xm, y) = ∨j [∧k μAkj(xk) ∧ μBj(y)] for k = 1, 2, …, m and j = 1, 2, …, n

Chapter 7

The Fuzzy Control Algorithm


Industrial processes are invariably multi-variable and considerable research has gone into developing conventional analytical techniques for multi-variable plants in both the time and frequency domains. All of these techniques assume that the plant is linear and, with rare exceptions, they have not found their way into industry because of their complexity and restrictions. There is no comparable theory for nonlinear processes and thus industry has had to be content with various configurations of conventional industrial three-term controllers. It is obvious that this arrangement leaves much to be desired, leaving a gap between control theory and practice that has been waiting to be bridged. Soft control is proving to be this very bridge. Fuzzy control is by nature eminently suited to multi-variable and non-linear processes. The fuzzy linguistic rules required to control a multivariable plant have already been considered in earlier chapters. It was observed there that whereas the consequents of each rule differ for each output variable, the antecedents are identical. This suggests that a multivariable fuzzy controller can conveniently be decomposed into a set of single-output controllers equal in number to the outputs. In practice, the measurements of the plant variables (even if contaminated by noise) and the control actions applied to the plant actuators are crisp. In order to apply the fuzzy algorithm it is therefore necessary first to fuzzify the measured plant variables and, following completion of the fuzzy algorithm, to de-fuzzify the result and thereby return to the engineering world. This chapter discusses the procedures of fuzzification and de-fuzzification as they apply to practical control.

7.1 Controller Decomposition


Figure 7.1 shows the decomposition of a fuzzy multi-input multi-output (MIMO) controller into a set of multi-input single-output (MISO) fuzzy controllers (FC1, FC2, …, FCn). The outputs of a multi-variable fuzzy controller can be computed in parallel in a multi-processor or sequentially, by multiplexing the MISO controllers, when computational times are not critical.

Figure 7.1 Decomposition of a multi-input multi-output incremental fuzzy controller into a set of multi-input single-output incremental controllers


7.2 Fuzzification
The algorithm for computing the crisp output of a fuzzy controller involves the following three steps: (1) fuzzification, (2) inference and (3) de-fuzzification. To make these steps easier to understand, consider a fuzzy controller with three inputs and a single output. It should be obvious that the procedure that follows can be generalized to any number of inputs. Given a MISO controller with inputs x1, x2, x3 and output y, and assuming that the linguistic control rules are of the form:

IF (x1 is A1j AND x2 is A2j AND x3 is A3j) THEN y is Bj

then the membership function of the output of the controller is given by:

μRj(x1, x2, x3, y) = (μA1j(x1) ∧ μA2j(x2) ∧ μA3j(x3)) ⊗ μBj(y) for j = 1, 2, …, N

where the operator ⊗ implies max-min or max-product. Using the intersection (min) operator, the degree of fulfillment of the j-th rule, βj(k) ∈ [0,1], is defined by:

βj(k) = μA1j(x1) ∧ μA2j(x2) ∧ … ∧ μANj(xN)
and is computed every time the algorithm is executed. The degrees of fulfillment are thus a measure of how closely the inputs to the controller match the control rules. They can conveniently be viewed as weights that are assigned to every rule. In practice, only a small number of the rules in the rule-base (typically fewer than 6) will exhibit non-zero degrees of fulfillment at any instant. The expression for the degree of fulfillment given above applies when the Mamdani max-min implication rule is used. For the case of max-product implication, the corresponding expression for the degree of fulfillment is:

βj(k) = μA1j(x1) · μA2j(x2) · … · μANj(xN)

The fuzzy implication rules yield the membership function of the output of the controller from knowledge of the current instantaneous measurements of the inputs to the controller x1(k), x2(k), x3(k). Thus at any instant k the membership function of the output of the controller using the max-min fuzzy implication rule is:

μY(k) = (X1(k) × X2(k) × X3(k)) o RN

with the corresponding product composition for the max-product implication rule, where X1, X2 and X3 are the corresponding fuzzy sets of the controller inputs. These computations are simplified significantly if the fuzzy sets of the inputs to the controller are taken as singletons, defined as:

μXi(k) = 1 if xi = xi(k) and 0 otherwise

whereupon:

μY(y) = ∨x1 ∨x2 ∨x3 (μX1(x1) ∧ μX2(x2) ∧ μX3(x3) ∧ μRN(x1, x2, x3, y)) = μRN(x1(k), x2(k), x3(k), y)
Example 7.1 illustrates the procedure.


Example 7.1 Graphical interpretation of fuzzification


The various operations required to establish the fuzzy set of the output of a fuzzy controller are illustrated graphically in this example. For simplicity assume that the controller has two inputs and a single output. Assume that the first input to the controller Input_1 (x1) is specified by 5 fuzzy sets, while Input_2 (x2) is specified by 3 fuzzy sets. The linguistic variables are assumed to be VL=Very_Low, LO=LOw, ZO=ZerO, LH=Little_High, MH=Medium_High, and VH=Very_High. Assume that the following 15 control rules constitute the rule base:
R1: If Input_1 is LO and Input_2 is VL then Output is LO
R2: If Input_1 is ZO and Input_2 is VL then Output is ZO
R3: If Input_1 is LH and Input_2 is VL then Output is LH
R4: If Input_1 is MH and Input_2 is VL then Output is LH
R5: If Input_1 is VH and Input_2 is VL then Output is LH
R6: If Input_1 is LO and Input_2 is ZO then Output is LO
R7: If Input_1 is ZO and Input_2 is ZO then Output is LH
R8: If Input_1 is LH and Input_2 is ZO then Output is MH
R9: If Input_1 is MH and Input_2 is ZO then Output is MH
R10: If Input_1 is VH and Input_2 is ZO then Output is VH
R11: If Input_1 is LO and Input_2 is VH then Output is LH
R12: If Input_1 is ZO and Input_2 is VH then Output is MH
R13: If Input_1 is LH and Input_2 is VH then Output is MH
R14: If Input_1 is MH and Input_2 is VH then Output is VH
R15: If Input_1 is VH and Input_2 is VH then Output is VH

which can be depicted more compactly by means of the rule matrix:


x2\x1   LO   ZO   LH   MH   VH
VL      LO   ZO   LH   LH   LH
ZO      LO   LH   MH   MH   VH
VH      LH   MH   MH   VH   VH

For simplicity, assume, furthermore, that the fuzzy sets of the inputs and outputs are triangular and are as shown in Figure 7.2.


The universes of discourse of Input_1 and Input_2 are assumed symmetric and are expressed as percentages of their maximum permissible values. Output y involves 5 fuzzy sets and is assumed asymmetric. Thus the inputs to the controller can take any value between ±100% of their maximum permissible values, while the output can take any value between 0 and 100% of its maximum permissible value. For example, the output y could represent the opening of a servo-valve, while Input_1 could be a pressure deviation and Input_2 a temperature deviation about their nominal values.
Figure 7.2 Fuzzy sets of inputs and outputs

The first five rules in the rule base are depicted graphically in Figure 7.3. Assume, furthermore, that at the instant of execution of the algorithm, the instantaneous inputs to the controller are -20% and -50% respectively.


Every rule is now examined with a view to determining the degree to which it contributes to the final decision. This measure is termed the degree of fulfillment βj. The computational time required for this determination clearly depends on the number of rules in the rule base. Fortunately, rarely are more than 20 to 50 rules required in practice and consequently the computational time is minimal.
Figure 7.3 Graphical representation of the first five rules in the rule base

For the given values of the inputs, it is clear that rules R1, R4 and R5 play no part in the final decision (and consequently the output) since these rules have not fired. The degrees of fulfillment of the non-fired rules are consequently zero. The intercepts of the vertical lines corresponding to the instantaneous values of Input_1 and Input_2 with the corresponding fuzzy sets specify the membership values:


μ11 = 0, μ21 = 0.66, μ31 = 0.33, μ41 = 0, μ51 = 0
and
μ12 = 0.5, μ22 = 0.5, μ32 = 0.5, μ42 = 0.5, μ52 = 0.5

respectively.
The degree of fulfillment βj for every rule is computed from the membership values using the operation min(μj1, μj2). Here:

R1: β1 = min(μ11, μ12) = min(0, 0.5) = 0
R2: β2 = min(μ21, μ22) = min(0.66, 0.5) = 0.5
R3: β3 = min(μ31, μ32) = min(0.33, 0.5) = 0.33
R4: β4 = min(μ41, μ42) = min(0, 0.5) = 0
R5: β5 = min(μ51, μ52) = min(0, 0.5) = 0

etc.
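The degree-of-fulfillment computation for the five depicted rules can be sketched as follows (the membership intercepts are those read off Figure 7.3):

```python
# Degrees of fulfillment for the five rules depicted in Figure 7.3,
# at the instant x1 = -20%, x2 = -50% (Example 7.1).
mu1 = {"LO": 0.0, "ZO": 0.66, "LH": 0.33, "MH": 0.0, "VH": 0.0}  # Input_1 intercepts
mu_vl = 0.5                                                      # Input_2 VL intercept

# Rules R1..R5 all share the clause "Input_2 is VL"; their Input_1 sets are:
antecedents = ["LO", "ZO", "LH", "MH", "VH"]

# beta_j = min(mu_j1, mu_j2); only rules with beta_j > 0 have fired.
betas = [min(mu1[a], mu_vl) for a in antecedents]
print(betas)  # [0.0, 0.5, 0.33, 0.0, 0.0]
```

Only R2 and R3 of the depicted rules fire, with weights 0.5 and 0.33, exactly as computed by hand above.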

7.2.1 Steps in the fuzzification algorithm


The first step in the fuzzy controller algorithm is determination of the intercepts of each input with the fuzzy sets of every rule, i.e., their memberships. If any rule has a zero intercept then it is discarded in subsequent computations, as it does not contribute to the final decision. The second step involves determination of the degrees of fulfillment of every rule. Using the min operator, this is equivalent to scanning the intercepts of every rule horizontally; should any intercept be zero then that rule clearly does not contribute to the final conclusion. The third step involves determination of the composite membership function of the output of the controller. This operation is dependent on the choice of fuzzy implication rule. Thus in the case of the Larsen implication, the membership function of the output of the controller is the union of the individual membership functions of each rule that has fired, weighted with the corresponding degree of fulfillment, i.e.:

μ(y) = β1·μ1(y) ∨ β2·μ2(y) ∨ β3·μ3(y) ∨ β4·μ4(y) ∨ β5·μ5(y) = β2·μ2(y) ∨ β3·μ3(y)


This is shown in graphical form in Figure 7.4, with the resultant composite membership function of the output of the controller in the lower right-hand diagram. Using the Mamdani implication, the composite membership function of the controller output is the union of the membership functions of each rule that has fired, each clipped at the corresponding degree of fulfillment, i.e.:

μ(y) = min(β1, μ1(y)) ∨ min(β2, μ2(y)) ∨ min(β3, μ3(y)) ∨ min(β4, μ4(y)) ∨ min(β5, μ5(y))
     = min(β2, μ2(y)) ∨ min(β3, μ3(y))

This procedure is shown in Figure 7.5 for comparison.
Figure 7.4 Determination of the composite membership function of the controller output using the Larsen implication


7.3 De-fuzzification of the Composite Controller Output Fuzzy Set


The final step in the fuzzy controller design procedure is de-fuzzification of the fuzzy set of the output to yield a single crisp value that uniquely specifies the desired control action. It must be noted that there is no theoretical basis for deciding which is the best manner in which to perform de-fuzzification, and a number of schemes exist, each presenting a different degree of computational complexity. Simplicity and speed of computation are invariably the primary requirements in industrial controllers. Below, we note those de-fuzzification techniques that have found use in practical control:

Figure 7.5 Determination of the fuzzy set of the controller output using the Mamdani implication

7.3.1 Center of area (COA) de-fuzzification


In this method the center of the area under the composite membership function of the output of the controller Y(y) is taken as the final output of the controller:


yCOA = ∫S y μY(y) dy / ∫S μY(y) dy

where S is the support set of μY(y). In the case where the composite membership function is discrete with I elements, this becomes:

yCOA = Σi=1..I yi μY(yi) / Σi=1..I μY(yi)

This method is sensitive to changes in the shape of the membership functions of the output. Because it yields intuitive results, this method has found extensive use in practical fuzzy control.

7.3.2 Center of gravity (COG) de-fuzzification


In this method, the centers of gravity of the individual components of the composite fuzzy set for the rules that have fired, are combined by weighting them in accordance with their degrees of fulfillment. The outcome is thus:

yCOG = Σj βj ȳj / Σj βj

where ȳj is the center of gravity of the membership function of the consequent of the j-th rule that has fired.

On concluding de-fuzzification, the crisp output of the controller is applied to the plant actuators.
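For discrete membership functions both schemes reduce to a few lines. A sketch (the sampled composite membership values and consequent centers below are illustrative, not taken from the example):

```python
# Discrete center-of-area (COA) and weighted-center (COG) de-fuzzification.

def coa(ys, mus):
    """y_COA = sum(y_i * mu(y_i)) / sum(mu(y_i)) over the output samples."""
    return sum(y * m for y, m in zip(ys, mus)) / sum(mus)

def cog(betas, centers):
    """Weighted average of the individual consequent centers of gravity."""
    return sum(b * c for b, c in zip(betas, centers)) / sum(betas)

ys = [0, 20, 40, 60, 80, 100]          # output universe (% of full scale)
mus = [0.0, 0.5, 0.5, 0.33, 0.0, 0.0]  # illustrative composite membership

print(coa(ys, mus))                    # crisp COA output in percent
print(cog([0.5, 0.33], [25.0, 50.0]))  # COG from two fired consequents
```

COA integrates over the whole composite set and is therefore sensitive to its shape; COG needs only the degrees of fulfillment and the pre-computed centers, which makes it cheaper per cycle.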


7.4 Design Considerations


The principal factors that must be considered prior to implementation of the fuzzy control algorithm are the following:

7.4.1 Shape of the fuzzy sets


In the continuous case, the fuzzy sets are uniquely defined by some analytic function, while in the discrete case they can be defined by arrays whose size depends on the number of quantization levels. The shape of the fuzzy sets used in the design of a fuzzy controller has been the subject of considerable research and it is fair to state that at this time there is no theory to guide the designer on the best shape to use for a specific application. Experience with similar controllers and computational ease appear to be the basic criteria for selecting the shapes of the membership functions. In practice, triangular and trapezoidal functions appear to be used most often, though Gaussian-like functions are used by a number of vendors of fuzzy controllers for the process industry. It has been claimed that the output of the controller is rather insensitive to the shape of the membership functions. A comparison of the performance of a simple control system under changes in the membership functions of both the inputs and outputs of the fuzzy controller is made in Appendix A.

7.4.2 Coarseness of the fuzzy sets


The number of fuzzy sets that are required to specify a variable is termed the coarseness of the controller and determines the accuracy of the controller. High accuracy requires a large number of fuzzy sets, with corresponding memory requirements. In some cases it is desirable to use two or more levels of coarseness to control a process. Thus, when the process variables are at some distance from the desired operating point, coarse control is applied through the use of very few fuzzy sets. As the trajectory of the process approaches the operating point, the number of fuzzy sets is increased. This results in finer control with increased accuracy. This technique is not unlike the coarse-fine control used in classical control. An example of a pair of fuzzy sets for coarse-fine control is shown in Figure 7.6. Here, three fuzzy sets (NE - Negative, ZE - Zero and PO - Positive) are used for coarse control and five for fine control. This technique has been applied with success in a number of processes requiring high terminal accuracy.

Figure 7.6 Coarse-fine fuzzy sets: (a) coarse sets NE, ZE, PO; (b) fine sets NB, NM, NS, ZE, PS, PM, PB

7.4.3 Completeness of the fuzzy sets


The fuzzy control algorithm must lead to a unique control action for any set of inputs. This property is termed completeness and depends on the contents of the knowledge-base as well as on the number and shape of the fuzzy sets used to describe the inputs and outputs of the controller. The manner in which the fuzzy sets are defined on the universe of discourse, as well as the degree to which they overlap, specifies the integrity of the controller, i.e., its ability to infer a plausible outcome under all circumstances. In the example of Figure 7.6, the overlap, defined as the intersection of the fuzzy sets, is at least 50%. In this case there will always be a dominant rule with membership in excess of 0.5, so that an outcome will always be forthcoming. In the worst case, two rules at most will fire with an equal membership of 0.5, but still there will be no ambiguity as to the final result. In contrast, the fuzzy sets in Figure 7.7 possess points on the universe of discourse where the intersection is less than 0.5. This leads to highly uneven control surfaces and irregular control actions. Indeed, there are regions on the universe of discourse where membership is zero, whereupon if the instantaneous value of the input falls in these regions no rule can be fired, with the result that the controller is unable to infer any control action. This is clearly undesirable and indicates that fuzzy sets must overlap sufficiently in order to obtain a continuous output.
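A completeness check over a sampled universe can be sketched as follows (the triangular partitions and the 0.5 dominance threshold are illustrative):

```python
# Check completeness of a fuzzy partition: every point of the universe
# should have a dominant membership of at least 0.5 (50% overlap).

def tri(x, a, b, c):
    """Triangular membership with feet a, c and peak b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

def is_complete(sets, lo, hi, steps=200, threshold=0.5):
    for i in range(steps + 1):
        x = lo + (hi - lo) * i / steps
        if max(tri(x, *s) for s in sets) < threshold:
            return False  # a point with no dominant rule was found
    return True

# Triangles with 50% overlap: complete over [-100, 100].
good = [(-150, -100, 0), (-100, 0, 100), (0, 100, 150)]
# A gap between the second and third sets: incomplete.
bad = [(-150, -100, -50), (-100, 0, 50), (50, 100, 150)]

print(is_complete(good, -100, 100))  # True
print(is_complete(bad, -100, 100))   # False
```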

Figure 7.7 Fuzzy set overlap

7.4.4 Rule conflict


The set of linguistic rules used to control a plant or process is normally elicited from expert human operators or domain experts. It is an undisputed fact that the knowledge elicited from two human operators is rarely the same. Though they may have been trained with the same rules, with experience and time they have learned to modify them, believing that in this manner they can control the process better. Of course, better is clearly a subjective criterion, since it may imply increased productivity, reduced energy costs or even less trouble for the operator. Many of these criteria are conflicting. Thus, in eliciting the knowledge for a controller it is advisable to restrict interviews to one human operator, e.g., the supervisor, whose knowledge of how to control the plant efficiently and effectively is undisputed. If this is not possible, then there is little recourse but for the plant engineer to state the rules that he wishes to be followed. Even so, rule conflict is a common phenomenon and some means must be found for resolving it. By simply writing the rules in sequential order it is virtually impossible to do so. Graphical means are very effective when the number of inputs to the controller is three or fewer, whereupon the rules can be represented in the form of tiles whose colors specify the control action required. This is none other than the Fuzzy Associative Memory or FAM, and Figure 7.8 shows an example of this simple technique, which has proved very useful in practice. The human eye is an excellent detector of abnormal color changes and thus, by simply looking at the manner in which the colors of the tiles vary in control space, it is possible to identify possible conflicts.

Figure 7.8 Representation of the knowledge base in tile form or Fuzzy Associative Memory (FAM)

Chapter 8

Fuzzy Industrial Controllers


The industrial three-term controller without doubt constitutes the backbone of industrial control, having been in use for over a century. Three-term controllers can take a number of forms, from the early mechanical to the later hydraulic, pneumatic, analog and digital versions. The modern version takes the form of multi-tasking discrete-time three-term algorithms embedded in almost all industrial PLCs and RTUs. Undisputedly, industrial progress would have been greatly limited but for these ubiquitous controllers. To this very day, the majority of industrial plants rely almost exclusively on this inexpensive, robust and easy-to-use conventional controller. The output of a conventional industrial controller normally involves three terms: the first is proportional to its input (the P term), the second is proportional to the integral of the input (the I term) and the third is proportional to the derivative of the input (the D term). In most practical applications the first two terms are sufficient, and only a small fraction of industrial controllers make use of the derivative term. Three-term controllers can be configured in a variety of ways, from the simplest autonomous single-loop controller to cascade control, used when a single controller is insufficient to provide the necessary control due to interactions in the controlled variables of the plant. Three-term controllers can also be configured so as to provide ratio or blending control when quantities must be maintained as percentages, as is typically found in materials blending and mixing processes. Productivity is closely related to the quality of control. Low quality of control implies poor product quality, with products that cannot meet standards, reduced productivity, loss of competitiveness and ultimately the collapse of the manufacturer. Effective control is thus of vital importance where high product quality and productivity are essential, high standards are to be maintained and market share assured. Effective operation of a plant implies correct tuning of the controllers to meet the product specifications, while the efficiency of a plant is critically dependent on specifying the correct parameters of these controllers. Traditionally, three-term controllers are tuned on-line by human experts who excite the plant by injecting appropriate disturbances at the set-points and then systematically adjust the parameters (i.e., gain constants) of the controller until the plant meets its design specifications. Re-tuning is normally necessary whenever the operating conditions of the plant change. Controller tuning requires considerable expertise and patience and takes time to master. The way in which a human tunes a control loop or plant is based on heuristic rules which involve such factors as the rise time, settling time and steady-state error of the closed-loop system. Indeed, as will be seen later in this chapter, these rules are put to good use in the design of the expert controller tuners which a number of vendors offer today.

8.1 Controller Tuning


Manual tuning can be supported by simple analytical techniques, which systematically determine the best (in some sense) controller parameters based on a simple macroscopic model (or approximant) of the controlled plant. Ziegler and Nichols offered design procedures for tuning industrial controllers as far back as the 1940s that are still in use to this day. By observing the initial slope of the response of the plant when subjected to a step disturbance, and the dead time of the plant before it responds, a complex plant can be approximated by the classical first-order-lag-plus-dead-time approximant. This simple but very approximate approach has formed the basis for improved variants of the Ziegler-Nichols approach. Modern tuning techniques, such as those of Persson and Åström, require additional information from the response of the controlled plant and require more computational effort, but yield vastly improved plant responses. All these techniques assume that the controlled plant is scalar, i.e., has a single input and a single output, and are not applicable to multivariable plants. Multivariable plants, for which three-term controllers do not find ready application, require an entirely different approach to controller design.

For best performance, three-term controllers must be tuned for all operating conditions. Unfortunately, the dynamic characteristics of most industrial plants depend on their operating state and production rates, and these are far from linear or stationary. A three-term controller is normally tuned for best (note that use of the word optimum is tactfully avoided) performance at a specific operating state. When the operating conditions of the plant change, so does the operating state, whereupon the parameters of the controller may no longer be the best and performance is consequently degraded. The degree to which such degradation is acceptable clearly depends on the nature of the controlled plant, plants that are highly nonlinear being the most difficult to control effectively. The robustness of a controller is a measure of its ability to operate acceptably despite changes in the operating state of the plant. Where the variations in the plant are severe, a three-term controller with fixed parameters is no longer effective and alternative techniques capable of tracking the changes in the plant must be employed. Such techniques as gain-scheduling, auto-tuning and adaptive control are commonly used to extend the domain of effectiveness and thereby the robustness of the controller. Fuzzy logic can likewise be used to extend the domain of effectiveness of a three-term controller.
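The Ziegler-Nichols reaction-curve procedure mentioned above reduces to a small table of formulas: given the reaction rate R (steepest slope of the unit-step response) and the apparent dead time L, the classical rules give the controller parameters directly. The helper below encodes the standard published table; the function name and mode flags are my own.

```python
def ziegler_nichols_open_loop(R, L, mode="pid"):
    """Ziegler-Nichols reaction-curve (open-loop) tuning.

    R: steepest slope of the plant's unit-step response (reaction rate)
    L: apparent dead time before the plant responds
    Returns (Kp, Ti, Td); Ti or Td is None where the term is unused.
    """
    if mode == "p":
        return 1.0 / (R * L), None, None
    if mode == "pi":
        return 0.9 / (R * L), L / 0.3, None
    if mode == "pid":
        return 1.2 / (R * L), 2.0 * L, 0.5 * L
    raise ValueError(f"unknown mode: {mode}")

# Example: slope 0.5 per second and 2 s dead time -> Kp=1.2, Ti=4.0, Td=1.0
Kp, Ti, Td = ziegler_nichols_open_loop(0.5, 2.0)
```

Note that these rules target roughly quarter-amplitude damping and are only a starting point for the manual refinement described above.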

8.2 Fuzzy Three-Term Controllers


The advent of fuzzy control motivated many researchers to reconsider the familiar robust three-term controller in the hope that fuzzifying it would improve its domain of effectiveness. This is a case of a technology retrofit, in which Computational Intelligence is applied to improve a well-known device. The result has been a new generation of intelligent three-term controllers with increased robustness that a number of vendors currently offer. Fuzzy logic can be applied to three-term controllers in a number of ways. One obvious approach is to fuzzify the gains of the three-term


controller by establishing rules whereby these gains are varied in accordance with the operating state of the closed system. In this case the controller output has the generalized form

u = fP(e, ∫e dt, Δe) + fI(e, ∫e dt, Δe) + fD(e, ∫e dt, Δe)

which, when fuzzified, can be expressed as the weighted sum

u = fuzzy(kP) e + fuzzy(kI) ∫e dt + fuzzy(kD) Δe

Hybrid fuzzy three-term controllers, in which only the proportional and derivative terms are fuzzified while the integral term remains conventional, have also been used. In this case the controller output is

u = uPD + kI ∫e dt

An alternative class of fuzzy controllers, which possesses the characteristics of a two-term PI controller, is the generic fuzzy controller, which has been used extensively in practice. Generic fuzzy controllers are very simple and require few rules to operate effectively. Using only the closed-system error and its derivative, this class of fuzzy controllers is normally incremental with output

Δu = f(e, Δe)

which must subsequently be integrated (or accumulated) to generate the final controller output.

8.2.1 Generalized three-term controllers


The output of a conventional three-term controller contains a term proportional to the error e between the desired and the actual output of the controlled plant, a term proportional to the derivative of the error Δe = de/dt and a term proportional to the integral of the error ∫e dt. In practice it is often preferable to use the derivative and the integral of the output of the process instead of the error, in order to avoid sudden changes in the output when changing the setpoint, i.e., bump-less operation. If the fuzzy sets of the error, derivative and integral terms are E, ΔE and IE respectively, then the control rules of a generalized fuzzy three-term controller can be expressed as:

Rr: IF e is Er AND Δe is ΔEr AND ∫e dt is IEr THEN u is Ur

Further, if the union operator relates the control rules, then the fuzzy algorithm reduces to the fuzzy implication rule

R = R1 ∪ R2 ∪ … ∪ Rn = ∪r (Er × ΔEr × IEr × Ur)

The fuzzy set of the output of the generalized fuzzy three-term controller is thus given by

U = (E × ΔE × IE) ∘ R

whose membership function is consequently

μU(u) = ∨ ( μE(e) ∧ μΔE(Δe) ∧ μIE(∫e dt) ∧ μR(e, Δe, ∫e dt, u) )

for e ∈ E, Δe ∈ ΔE and ∫e dt ∈ IE. A graphical display of the parameter surface of such a three-term controller would have to be three-dimensional and it would be difficult to comprehend the effect of each parameter on the controller output.
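On discretized universes the implication and composition above reduce to a few array operations. The sketch below is a toy illustration with two rules on coarse 5-point universes for e, Δe and u; all set shapes are assumed, not from the text.

```python
import numpy as np

# Membership vectors of two fuzzy sets on each 5-point universe (assumed)
NE = np.array([1.0, 0.5, 0.0, 0.0, 0.0])
PO = NE[::-1].copy()
E, DE, U = {"NE": NE, "PO": PO}, {"NE": NE, "PO": PO}, {"NE": NE, "PO": PO}

# Each rule contributes the relation Er x DEr x Ur (min); the union of the
# rules is taken with max, yielding R on the product space.
rules = [("PO", "NE", "PO"), ("NE", "PO", "NE")]
R = np.zeros((5, 5, 5))
for er, der, ur in rules:
    rel = np.minimum(np.minimum(E[er][:, None, None], DE[der][None, :, None]),
                     U[ur][None, None, :])
    R = np.maximum(R, rel)

def infer(mu_e, mu_de):
    """Sup-min composition: mu_U(u) = max over (e, de) of
    min(mu(e), mu(de), mu_R(e, de, u))."""
    ant = np.minimum(mu_e[:, None], mu_de[None, :])           # joint input set
    return np.minimum(ant[:, :, None], R).max(axis=(0, 1))    # compose with R

# Crisp inputs: e at the peak of PO (index 4), de at the peak of NE (index 0)
mu_u = infer(np.eye(5)[4], np.eye(5)[0])    # recovers the PO output set
```

With three inputs the relation R would be four-dimensional, which is exactly the storage burden that motivates the partitioned architecture of the next section.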

8.2.2 Partitioned controller architecture


It is often more convenient to partition the three-term controller into two independent fuzzy sub-controllers that separately generate the signals uPD and uI, corresponding to the proportional-plus-derivative term and the integral term respectively. The result is a fuzzy proportional-plus-derivative sub-controller FPD in parallel with a fuzzy integral sub-controller FPI, as shown in Figure 8.1. Both sub-controllers are fed with the error and its derivative. The second sub-controller requires an integrator (or accumulator) to generate the integral term. Both this controller and the generic two-term fuzzy controller discussed later in this chapter require the error derivative Δe. In situations where the signal contains high-frequency extraneous noise, there may be misgivings about generating the derivative or difference term from the error, since differentiation aggravates the noise. In this case some form of low-pass filtering or signal processing is necessary to reduce the effect of high-frequency noise.

Figure 8.1 Decomposition of a three-term controller into two sub-controllers (e and Δe feed FPD, whose output is uPD, and FPI, whose output uI is generated through an integrator)

The rule matrix or Fuzzy Associative Matrix (FAM) for the FPD fuzzy sub-controller is:

e\Δe   NB   NM   NS   ZO   PS   PM   PB
NB     NB   NB   NB   NB   NM   NS   ZO
NM     NB   NB   NM   NS   NS   ZO   PS
NS     NB   NB   NM   NS   ZO   PS   PM
ZO     NM   NM   NS   ZO   PS   PM   PM
PS     NM   NS   ZO   PS   PM   PB   PB
PM     NS   ZO   PS   PM   PB   PB   PB
PB     ZO   PS   PM   PB   PB   PB   PB

This FAM contains 7×7 = 49 rules, a number that must be compared with the 7×7×7 = 343 rules and corresponding memory storage locations that a full three-term controller would require in its knowledge base. It is noted that the FAM shown above is symmetric about the principal diagonal. It is possible to take advantage of this fact and store only half the rules if memory must be conserved. Rule pruning, in which adjacent rules are systematically eliminated, can further reduce the number of rules that have to be stored to about a quarter of the original number, i.e., about a dozen. Applying the Mamdani compositional rule, confining the controller inputs e and Δe to the closed interval [-3, 3] and quantizing the two inputs into 7 levels results in the relational matrix shown below. The entries in the matrix are the numerical values of uPD that are stored in the controller memory: this form of look-up table control is very simple to implement and has been applied extensively in practice. By altering some of the entries in the matrix it is possible to trim the performance of the controller further, intentionally warping the control surface in order to compensate for inherent non-linearities in the plant characteristics. In the design study presented in Appendix A, it is shown how step-response asymmetry may be compensated for using this approach.
e\Δe   -3   -2   -1    0    1    2    3
-3     -3   -3   -3   -3   -2   -1    0
-2     -3   -3   -2   -1   -1    0    1
-1     -3   -3   -2   -1    0    1    2
 0     -2   -2   -1    0    1    2    2
 1     -2   -1    0    1    2    3    3
 2     -1    0    1    2    3    3    3
 3      0    1    2    3    3    3    3

The relational matrix is shown graphically in Figure 8.2: this is none other than the control surface of the FPD sub-controller. Owing to the quantization of the controller inputs, the controller output is not continuous and smooth, and such a controller is sometimes referred to as a multilevel relay controller. Clearly the control surface can be smoothed and the controller performance improved by using interpolation techniques, in which case the control actions are defined everywhere in the bounded control space.
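The look-up table control described above is straightforward to sketch. The matrix below is the one given in the text; the function names and the rounding-based quantizer are my own assumptions.

```python
import numpy as np

# Relational (look-up) matrix from the text: rows index the quantized error
# e, columns the quantized error derivative, both confined to [-3, 3].
U_PD = np.array([
    [-3, -3, -3, -3, -2, -1,  0],
    [-3, -3, -2, -1, -1,  0,  1],
    [-3, -3, -2, -1,  0,  1,  2],
    [-2, -2, -1,  0,  1,  2,  2],
    [-2, -1,  0,  1,  2,  3,  3],
    [-1,  0,  1,  2,  3,  3,  3],
    [ 0,  1,  2,  3,  3,  3,  3],
])

def quantize(x):
    """Clamp to [-3, 3] and round to the nearest of the 7 levels."""
    return int(round(max(-3.0, min(3.0, x))))

def fpd_lookup(e, de):
    """Multilevel-relay FPD output via table look-up (+3 offsets the index)."""
    return int(U_PD[quantize(e) + 3, quantize(de) + 3])

fpd_lookup(0.0, 0.0)     # nominal state -> 0
fpd_lookup(2.6, -1.2)    # e ~ 3, de ~ -1 -> 2
```

Replacing the nearest-level rounding with bilinear interpolation between the four surrounding entries gives the smoothed control surface mentioned above.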

Figure 8.2 Control surface of the FPD sub-controller (uPD plotted against the quantized inputs e and Δe)
8.2.3 Hybrid architectures


Various hybrid architectures that combine fuzzy and deterministic elements have been proposed for the three-term controller. One such architecture is shown in Figure 8.3. Here the P+D sub-controller FPD is identical to the one considered earlier, but the integral term involves a gain coefficient obtained from the simple rules given below. The proportional and derivative terms are determined from the set of rules
e     uPD
NB    NB
NM    NM
NS    NS
ZO    ZO
PS    PS
PM    PM
PB    PB

while the integral gain is varied according to the following rules


e     kI
NB    PS
NM    PM
NS    PM
ZO    PB
PS    PM
PM    PM
PB    PS

Fuzzy Industrial Controllers

113

The final output of this hybrid fuzzy controller is the sum

u = uPD + uI = uPD + fuzzy(kI) ∫e dt

Figure 8.3 Hybrid three-term controller (e and Δe feed FPD to give uPD, while the integrated error is scaled by fuzzy(kI) in a multiplier to give uI)

Extending the method further, it is possible to design a three-term controller using the configuration shown in Figure 8.4, with each of the controller parameters (kP, kI and kD) specified by independent rules, i.e.,

u = fuzzy(kP) e + fuzzy(kI) ∫e dt + fuzzy(kD) Δe
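The fully fuzzified form above amounts to interpolating each gain from a small rule table. A minimal sketch follows, assuming triangular sets on a normalized error universe and invented gain values; none of the numbers come from the book.

```python
def tri(x, a, b, c):
    """Triangular membership function with feet a, c and peak b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

def error_sets(e):
    """Membership of the error in three coarse sets on [-1, 1]."""
    return {"NE": tri(e, -2.0, -1.0, 0.0),
            "ZO": tri(e, -1.0, 0.0, 1.0),
            "PO": tri(e, 0.0, 1.0, 2.0)}

# Rule consequents: one crisp gain per fuzzy label (assumed values); high
# gain far from the setpoint, lower gain near it.
KP_TABLE = {"NE": 2.0, "ZO": 1.0, "PO": 2.0}

def fuzzy_gain(e, table):
    """Weighted average of the rule consequents: fuzzy(k) in the text."""
    mu = error_sets(e)
    den = sum(mu.values())
    return sum(mu[l] * table[l] for l in mu) / den if den else table["ZO"]

fuzzy_gain(0.0, KP_TABLE)   # -> 1.0 (pure ZO)
fuzzy_gain(0.5, KP_TABLE)   # -> 1.5 (halfway between ZO and PO)
```

Separate tables for kI and kD would be evaluated the same way, giving the three independently scheduled gains of Figure 8.4.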

8.2.4 Generic two-term fuzzy controllers


The phase plane, which portrays the trajectory of the error between the desired and actual output of a process, is a useful domain in which to specify the linguistic control rules of the process. Figure 8.5(a) shows a typical trajectory in phase-space, a state space in which one of the states is the derivative of the other. Figure 8.5(b) shows the error e(t) and the corresponding rate of change of error Δe(t) in response to a step excitation of the plant.


Figure 8.4 Fuzzy controller with independent fuzzy parameters (e and Δe drive the fuzzy(kP), fuzzy(kI) and fuzzy(kD) blocks, with an integrator in the integral path)

The temporal error response to a step excitation and its derivative can be broken up into regions defined by their zero crossover points. Thus the error may be coarsely described as positive, zero at the crossover, or negative. Assume, therefore, that the three fuzzy sets PO (POsitive), ZO (ZerO) and NE (NEgative) suffice to describe each control variable. In Region 1 we can thus write the first generic control rule:

R1: IF e is PO AND Δe is NE THEN u is PO

The objective of this rule is to apply maximum positive control action (e.g., torque in the case of a servomotor) to the controlled process in order to force it to accelerate to its final value with a minimum rise time. In Region 2 the corresponding generic control rule is:

R2: IF e is ZO AND Δe is NE THEN u is NE

Here the objective is to apply maximum negative control action to decelerate the process in order to minimize overshoot. Using similar reasoning, the 11-rule Fuzzy Associative Memory that follows is derived.


Figure 8.5 (a) Phase-space trajectory of the error and its derivative, and (b) the corresponding time-domain response; the points a to l and the regions i to ix mark the zero crossings referred to in the rule tables that follow


Rule   e    Δe   u    Points
1      PO   ZO   PO   a, e, i
2      ZO   NE   NE   b, f, j
3      NE   ZO   NE   c, g, k
4      ZO   PO   PO   d, h, l
5      ZO   ZO   ZO   origin
6      PO   NE   PO   i, v
7      NE   NE   NE   ii, vi
8      NE   PO   NE   iii, vii
9      PO   PO   PO   iv, viii
10     PO   NE   ZO   ix
11     NE   PO   ZO   ix
It is obvious that finer control can be achieved if the number of control rules is increased. The number of rules is consequently a measure of the granularity of the controller.
Rule   e    Δe   u    Points
1      PL   ZO   PL   a
2      PM   ZO   PM   e
3      PS   ZO   PS   i
4      ZO   NL   NL   b
5      ZO   NM   NM   f
6      ZO   NS   NS   j
7      NL   ZO   NL   c
8      NM   ZO   NM   g
9      NS   ZO   NS   k
10     ZO   PL   PL   d
11     ZO   PM   PM   h
12     ZO   PS   PS   l
13     ZO   ZO   ZO   origin
14     PL   NS   PM   i
15     PS   NL   NM   i
16     NL   PS   NM   iii
17     NS   PL   PM   iii
18     PS   NS   ZO   ix
19     NS   PS   ZO   xi

Increasing the number of control rules to 19 and assigning seven fuzzy sets to each control variable, namely PL for Positive_Large, PM for Positive_Medium, PS for Positive_Small, ZO for ZerO, NS for Negative_Small, NM for Negative_Medium and NL for Negative_Large, leads to the FAM given above.

Figure 8.7 Generic two-term incremental fuzzy controller (e and Δe feed the generic fuzzy controller, whose incremental output Δu is integrated and added to the nominal output u0 to form u)

Figure 8.7 shows a schematic of the generic two-term architecture. This incremental fuzzy controller is generic in the sense that it can be applied to any dynamic process that exhibits under-damped behavior. Because the controller is incremental, the nominal controller output u0 must be supplied in order to generate the total control variable.
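The generic two-term scheme of Figure 8.7 can be sketched end to end: fuzzify e and Δe, fire a few of the generic rules, defuzzify to obtain Δu and accumulate. The set shapes, the singleton outputs and the rule subset below are assumptions for illustration.

```python
def tri(x, a, b, c):
    """Triangular membership function with feet a, c and peak b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

def mu(x):
    """NE / ZO / PO sets on a normalized universe [-1, 1]."""
    return {"NE": tri(x, -2.0, -1.0, 0.0),
            "ZO": tri(x, -1.0, 0.0, 1.0),
            "PO": tri(x, 0.0, 1.0, 2.0)}

# A few of the 11 generic rules (antecedents on e, de; singleton du outputs)
RULES = [("PO", "ZO", +1.0), ("NE", "ZO", -1.0),
         ("ZO", "NE", -1.0), ("ZO", "PO", +1.0), ("ZO", "ZO", 0.0)]

def delta_u(e, de):
    """Min AND for the antecedents, weighted-average defuzzification."""
    me, mde = mu(e), mu(de)
    fired = [(min(me[a], mde[b]), out) for a, b, out in RULES]
    den = sum(w for w, _ in fired)
    return sum(w * out for w, out in fired) / den if den else 0.0

# Incremental operation: accumulate du onto the nominal output u0
u = u0 = 0.0
for e, de in [(1.0, 0.0), (0.5, -0.5), (0.0, 0.0)]:
    u += delta_u(e, de)
```

Note how the ZO/NE rule already brakes the response while the error is still positive, which is what gives this structure its PI-like damping.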

8.3 Coarse-Fine Fuzzy Control


When a controller is required to operate under conditions of both large and small excursions of its inputs from their nominal values, it is convenient to use two or more sets of fuzzy rules to effect improved control. For large excursions of the controller input variables, coarse control is applied with the objective of forcing the plant to return to its nominal operating point as rapidly as possible. Accuracy of control is of secondary importance under these circumstances and only a few rules are required. When the plant variables reach some small region about the nominal operating point, fine control is applied: a new set of control rules is used to effect the desired fine control actions, and these involve a larger number of rules and fuzzy sets. Under normal operating conditions the controller uses fine control for small excursions about the nominal operating point.

An alternative way of achieving coarse-fine control is through zooming of the universe of discourse of each controller input variable. In this case the universe of discourse is varied, either in discrete regions of the control space or smoothly as the plant approaches the desired operating point (see fuzzy gain-scheduling in chapter 10). This approach has been used to great effect in the control of high-precision mechatronic devices.
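Zooming the universe of discourse can be sketched as a simple rescaling of the controller inputs before fuzzification. The spans and the switching threshold below are invented values, purely for illustration.

```python
# Coarse-fine control by zooming the universe of discourse: pick a wide
# universe for large excursions and a narrow one near the setpoint.
COARSE_SPAN, FINE_SPAN, FINE_REGION = 10.0, 1.0, 0.5

def scaled_inputs(e, de):
    """Select the universe span from the size of the excursion, then
    normalize both inputs to [-1, 1] for the rule base."""
    span = FINE_SPAN if max(abs(e), abs(de)) < FINE_REGION else COARSE_SPAN
    clip = lambda x: max(-1.0, min(1.0, x / span))
    return clip(e), clip(de), span

scaled_inputs(4.0, 0.0)   # large excursion -> coarse universe, e -> 0.4
scaled_inputs(0.2, 0.0)   # small excursion -> fine universe, e -> 0.2
```

Smooth zooming replaces the hard threshold with a span that shrinks continuously as the plant approaches the operating point, avoiding chatter at the switching boundary.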

Chapter 9

Real-time Fuzzy Control


Chronologically, fuzzy control was the first form of intelligent control, and it radically altered the course of Control Engineering with seemingly limitless applications. Fuzzy control appears today in real-time applications ranging from domestic appliances (such as air-conditioners and refrigerators, where it has become a major selling point) to high-speed trains, automobiles and process control. Zadeh's theory of fuzzy sets laid the foundations of Fuzzy Control and the revolution in Control Engineering that continues unabated to this day. Whereas the foundations of fuzzy set theory are on solid ground, fuzzy control is still very much an art, requiring knowledge and experience to implement correctly. Soft control is not a panacea for all control problems and it would be wrong to claim that it will ultimately replace conventional control. There are numerous cases where conventional control results in excellent quality of control and there is no need to replace it. There are, however, many cases, especially in industry, that have defied solution using any of the conventional control techniques, and here fuzzy control has proved invaluable, offering solutions where none existed previously. The control strategy of an industrial plant may often be described in linguistic terms by way of control rules or control protocols that relate the control actions to the controlled variables. The resultant fuzzy controller must faithfully replicate the actions of the human operator from whom the control rules were elicited.


As noted earlier, the rules (sometimes referred to as production rules) by which a plant can be controlled are normally expressed as a set of relational statements of the form:

IF (Current_State_of_Plant) THEN (Control_Action)

These rules, suitably encoded, are stored in the knowledge base of the fuzzy controller and are processed by a fuzzy inference engine based on one of the implication functions presented in chapter 6.

9.1 Supervisory Fuzzy Controllers


The elements (i.e., building blocks) of a fuzzy controller are shown schematically in Figure 9.1 and are:

- a real-time database RTDB, where the current and previous values of the control variables of the physical plant obtained from the local RTUs are deposited (usually after some form of signal processing to remove extraneous signals). The real-time database also contains the values of the controlled variables (i.e., control actions) and the time when the previous action was taken on the plant. The real-time database is the interface between the physical plant and the controller,
- the knowledge base KB, in which the control rules, suitably coded, are stored,
- a database FS, in which the fuzzy sets of the inputs and outputs of the controller are stored,
- a fuzzifier FZ, where the inputs to the controller are transformed into the domain of fuzzy sets,
- an inference engine IE, the kernel of the fuzzy controller: the software with which the fuzzy sets of the controller outputs are computed,
- a de-fuzzifier DF, where the fuzzy sets of the outputs are transformed back into the engineering domain and the numerical control actions are deposited in the real-time database RTDB for transmittal to the RTUs, which subsequently enforce these control actions on the physical plant and, finally,
- a development system DS, through which the engineer interacts with the fuzzy controller during development or whenever modifications have to be made. This module is removed once the controller is finalized.

Figure 9.1 Elements of a fuzzy controller (causes and effects pass through the real-time database RTDB; the fuzzifier FZ, inference engine IE, knowledge base KB, fuzzy sets FS and de-fuzzifier DF form the fuzzy controller kernel, to which the development system DS is attached)


In modern industrial plants, the link between the supervisory control system (whether SCADA or DCS) and the physical plant being controlled is through a Local Area Network (LAN) to which remote terminal units (RTUs) are attached. The RTUs include analog-to-digital converters for the conversion of the physical variables at rates that can reach thousands of samples per second. These units invariably include software with which three-term (PID) controllers can be implemented, as well as software with which the RTUs can communicate with the host computer (i.e., the server) via the LAN. The client/server architecture shown in Figure 9.2 is one of the most common architectures in use today. It is noted that in the new generation of industrial control systems, smart sensors and actuators with built-in micro-controllers that can be directly connected to the LAN are gradually replacing RTUs.

Figure 9.2 Architecture of a distributed supervisory control system (RTUs and client PCs attached to a common LAN)

Distributed supervisory control systems involve a cluster of industrial grade microcomputers, RTUs, micro-controllers and peripherals connected to a LAN. Each component of the cluster performs real-time control of a specific sub-process of the plant. The host computer usually


contains the real-time database where all current data on the state of the plant is stored and can serve this data to any of the clients (sometimes referred to as agents) in the cluster on demand. An alternative architecture involves a distributed real-time database in which each client retains the data pertinent to the tasks it is assigned. The client can transmit this data to any other client on demand. This leads to considerably more data having to be transmitted over the LAN between clients, resulting in data transmission delays. Finally, a hybrid architecture in which the local real-time databases are mirrored in the server can be used. In this case, each client must continuously update the data that has changed since the last transmission in the master database in the host computer. The advantage of this last architecture is that in the unlikely case that the host computer fails and its data is lost or corrupted, then the master database can be restored from the local databases in the clients when the host becomes operational again.

9.2 Embedded Fuzzy Controllers


Fuzzy controllers are increasingly being embedded in micro-controllers for operation at the lowest level of the control hierarchy. Chip-size fuzzy micro-controllers have found their way into a variety of products such as disk drives, printers, video cameras, cars, household appliances and so forth. The use of real-time embedded micro-controllers is spreading rapidly for two reasons: convenience and cost. Except for memory restrictions, embedded controllers can be programmed to do just about anything their larger counterparts can do. Unlike their larger counterparts, which operate under such real-time operating systems as Unix or Windows NT, there is no standardized real-time operating system for micro-controllers. Windows CE, which features a deterministic scheduler, may change this situation, however. The programming language of choice for micro-controllers in the foreseeable future is likely to be Java, which is processor independent. Micro-controller hardware has generally been restricted to 16- and 32-bit processors, but 4-bit and even 64-bit processors are in use, the former in domestic appliances. The architecture of embedded intelligent micro-controllers closely follows that presented in the previous section. Only the development system is independent and resides in a host PC, where all development is performed off-line. Once the design has been completed, the executable software, which includes the knowledge base, fuzzy sets, inference engine, fuzzifier and de-fuzzifier, is downloaded to the micro-controller via a local link or is burned into an erasable programmable read-only memory (EPROM) which is then plugged into the micro-controller.

A new class of intelligent industrial three-term controllers is gradually replacing conventional industrial controllers in a number of critical applications that require increased autonomy. These controllers find use in situations where the operational demands of the application do not permit fixed or programmed gains. Intelligent industrial controllers are implemented in software in micro-controllers (MCs), programmable logic controllers (PLCs) or remote terminal units (RTUs), and a number of vendors today offer appropriate software for their development. In new plants it would be wise to consider incorporating intelligent controllers that have been shown to enhance control over conventional industrial controllers. In order to minimize memory requirements as well as accelerate computation in embedded fuzzy controllers, many vendors restrict both the number and the shape of the permissible fuzzy sets for both inputs and outputs. Triangular fuzzy sets are almost universally used for the inputs and outputs of the controller. Rules are coded numerically and the number of fuzzy sets is restricted. Singletons are often used to define the fuzzy sets of the outputs of the controllers as they simplify defuzzification considerably.
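The restrictions mentioned above, triangular input sets and singleton output sets, make the embedded inference loop very cheap, as the following sketch shows; the set positions and labels are assumed, not vendor-specific.

```python
def tri(x, a, b, c):
    """Triangular membership function with feet a, c and peak b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

# Output singletons: each label is a single crisp value, so defuzzification
# reduces to a weighted average of the fired singletons (no centroid of
# areas to integrate).
SINGLETONS = {"NE": -1.0, "ZO": 0.0, "PO": 1.0}

def defuzzify(firing):
    """firing: dict of label -> firing strength."""
    den = sum(firing.values())
    if den == 0.0:
        return 0.0
    return sum(w * SINGLETONS[lab] for lab, w in firing.items()) / den

defuzzify({"PO": 0.75, "ZO": 0.25})   # -> 0.75
```

Both functions use only a handful of additions, multiplications and one division, which is why this combination dominates in memory- and cycle-constrained micro-controllers.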

9.3 The Real-time Execution Scheduler


Intelligent controllers reside in a dedicated intelligent agent or are distributed throughout the clients in the cluster. They can be executed either at regular intervals (analogous to discrete-time control) or on demand. In the case of a dedicated intelligent agent, the executable program of the fuzzy controllers is scheduled to run at fixed time intervals. The execution interval clearly depends on the dynamics of the plant under control and is typically set equal to one-tenth to one-twentieth of the smallest time constant of the plant. In some cases, particularly in process control where static control is acceptable, the execution interval is set equal to the settling time of the plant as determined from experimental observations.


An alternative technique for scheduling the intelligent controller, particularly useful when dealing with fast dynamic plants, is to have a control scheduler that continuously tracks the inputs to the fuzzy controller and their changes. Based on some measure of the rates of change (or differences) of these inputs, the control scheduler is programmed to decide whether or not the controller should execute. Clearly, if the inputs to the controller (i.e., the plant variables) do not change significantly with time, implying that the plant is stationary and operating at its nominal state, then there is no reason to effect control and the controller algorithm is not executed. Conversely, if the inputs are changing significantly, implying that the plant is in transition, then immediate action must be taken in order to restore it to its nominal state. By continuously following the changes of the controller inputs, the control scheduler is therefore in a position to issue commands on when to execute the controller. If, following execution and after some minimum time interval, the control scheduler continues to observe that the rates of the controller inputs exceed pre-specified limits, then it sends another command to execute the controller, and continues to do so until the plant returns to its nominal state. This scheme leads to a controller that executes at irregular intervals and on demand. The control scheduler is shown schematically in Figure 9.3.

Figure 9.3 The real-time execution scheduler (a differencer tracks the controller inputs and an execution scheduler decides when the controller runs)
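The on-demand scheduling logic described above can be sketched as follows. The class, threshold and interval names are assumptions; the book specifies only the behavior (execute when input rates exceed pre-specified limits, subject to a minimum interval).

```python
class ExecutionScheduler:
    """Run the controller only when the inputs change fast enough."""

    def __init__(self, rate_limit, min_interval, controller):
        self.rate_limit = rate_limit        # pre-specified rate-of-change limit
        self.min_interval = min_interval    # minimum time between executions
        self.controller = controller        # callable: inputs -> control action
        self.prev = None
        self.last_exec = float("-inf")

    def step(self, t, inputs):
        """Called at every sampling instant; returns an action or None."""
        if self.prev is None:               # first sample: no differences yet
            self.prev = inputs
            return None
        rates = [abs(x - p) for x, p in zip(inputs, self.prev)]
        self.prev = inputs
        # Execute only if some input difference exceeds the limit and the
        # minimum interval since the last execution has elapsed.
        if max(rates) > self.rate_limit and t - self.last_exec >= self.min_interval:
            self.last_exec = t
            return self.controller(inputs)
        return None

sched = ExecutionScheduler(0.1, 1.0, controller=lambda x: -sum(x))
sched.step(0.0, (0.0, 0.0))   # first sample: no history yet -> None
sched.step(1.0, (0.5, 0.0))   # large change -> controller executes
```

A stationary plant therefore consumes essentially no controller cycles, while a plant in transition is serviced immediately, which is the point of the scheme.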

On executing the fuzzy algorithm and following defuzzification, the new control variables are deposited in the real-time database. At regular intervals (measured in milliseconds), and provided there has been a significant change in the control variables since the last time the algorithm was executed, this information is transmitted via the LAN to the appropriate RTUs for immediate action on the plant. This final action closes the control loop. The structure of a fuzzy controller is thus very general and applicable to a wide variety of plants and processes. All that is required to change the controller is to modify the number of inputs and outputs, the shape and number of the fuzzy sets and, most importantly, the rule base.

Example 9.1 Waste-water treatment control


An interesting application of real-time fuzzy control to an environmental problem is outlined below. The system has been used in a municipal wastewater-treatment plant, achieving significantly improved treatment over existing methods, and was implemented on a commercial PLC. Through suitable modification of the biological treatment process, stabilization of the nitrogen compounds (nitrates, ammonia) can also be achieved. Biological treatment is classified by the type of micro-organisms used in the removal of the organic load and is either aerobic, anaerobic or both. Because of the anoxic zone in aerobic treatment, the simultaneous removal of both nitrates and ammonia is possible. Interest is normally focused on secondary biological treatment, and on aerobic treatment in particular. The wastewater treatment plant considered here involves extended aeration and is a variation of the activated sludge method. A schematic of the plant is shown in Figure 9.6. The biological reactor possesses a tank with two zones as well as a secondary settling tank. Wet sludge is fed into the anoxic zone, in which low-oxygen conditions prevail and de-nitrification initially takes place. To operate satisfactorily, this stage requires low oxygen concentrations in the presence of nitrates.


Sludge is transferred to a second aerated zone where the organic load is removed and nitrification takes place. A fraction of this sludge is then returned to the anoxic zone while the remainder is fed to the secondary settling tank. The quantity of sludge returned for further treatment depends on factors that affect nitrification and de-nitrification and is therefore one of the process variables that must be controlled. Finally, a fraction of the sludge in the secondary settling tank is returned to the biological reactor, while the rest is removed and fed to the fusion stage.

Figure 9.5 Schematic of a typical wastewater treatment plant (bar racks, grit chamber, skimming tank, primary settling tank, biological treatment, secondary settling tank, sludge fusion, sludge drying and solids storage)

The fraction of sludge fed back to the biological reactor is another variable that must be controlled. In all wastewater treatment plants it is necessary to control the oxygen content in the aerated zone of the reactor. The oxygen content depends on the removal of the organic load and nitrification. The removal of organic load, nitrification and de-nitrification are the three principal quantities in a wastewater treatment plant that must be controlled. This is achieved by a suitable control strategy for the following three manipulated variables:


1. the oxygen supply to the aerated zone (O2Feed),
2. the mixed liquid returns rate from the aerated zone to the anoxic zone of the biological reactor (R_ml) and
3. the sludge returns rate from the secondary settling tank to the biological reactor (R_sludge).

Figure 9.6 The treatment process and its principal variables (BOD is measured at the reactor inlet and at the settling-tank outlet; temperature, DO and MLSS are measured in the reactor; N-NO3 and N-NH3 at the reactor exit; oxygen is fed to the aerated zone, liquid is returned from the aerated to the anoxic zone and sludge is returned from the secondary settling tank)

The controlled variables of the plant and inputs to the controller are:

1. the ammonia concentration in the reactor (N-NH3),
2. the nitrate concentration in the reactor (N-NO3),
3. the dissolved oxygen in the reactor (DO),
4. the temperature in the reactor (TEMP),
5. the mixed liquid suspended solids concentration in the reactor (MLSS) and
6. the difference in biochemical oxygen demand D(BOD) between the entrance and exit of the secondary settling tank.


The integrity of the controller is directly related to the number of fuzzy variables used. Increasing the number of fuzzy variables, however, increases the memory requirements of the controller exponentially. For wastewater plant control, the three fuzzy variables HI (HIgh), OK and LO (LOw) normally suffice to characterize the controller inputs. Trapezoidal fuzzy sets (membership functions) are computationally simple and effective in practice. Similarly, the manipulated variables or controller outputs must be allocated an appropriate number of linguistic descriptors. Five fuzzy variables, i.e., VH (VeryHigh), HI (HIgh), OK, LO (LOw) and VL (VeryLow), provide sufficiently fine control. Finally, for computational simplicity, singletons are used to describe the fuzzy sets of the controller outputs, leading to a particularly simple and fast procedure for de-fuzzification.

The knowledge with which a given wastewater plant is controlled must first be elicited from plant operators. This is a painstaking task and one that is critical for proper operation. If the 6 input variables have 3 descriptors each, then the theoretical maximum number of rules is 3^6 = 729, a number which is clearly unmanageable and practically unnecessary. In practice, 50 rules suffice to provide control of a wastewater treatment plant. Of these, some 27 are required to stabilize the organic load (BOD), 11 to stabilize the nitrification process, while 12 rules similar to those for nitrification are necessary to stabilize de-nitrification. The 50 rules, which form the knowledge base of the controller, are considered the minimum necessary to achieve acceptable control of a wastewater treatment plant under most operating conditions. A subset of these rules is shown in the Table on p. 130.
The control rules have the standard form:

R: IF (D(BOD) is Y1) AND (MLSS is Y2) AND (TEMP is Y3) AND (DO is Y4) AND (N-NH3 is Y5) AND (N-NO3 is Y6)
   THEN (O2Feed is U1) AND (R_Sludge is U2) AND (R_ml is U3)

The controller outputs are all normalized to the range [0, 100]%. Under normal operating conditions the plant outputs have nominal values of 50% and the corresponding levels of reactor stabilization, nitrification and de-nitrification are 90%, 70% and 60% respectively.
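The trapezoidal input sets and singleton output sets described above can be sketched in a few lines of code. The breakpoints, rule pairings and singleton positions below are illustrative assumptions, not values taken from the plant.

```python
# Sketch of the fuzzification/de-fuzzification scheme described above:
# trapezoidal input sets and singleton output sets with weighted-average
# de-fuzzification. All numbers are invented for illustration.

def trapezoid(x, a, b, c, d):
    """Trapezoidal membership function with feet at a, d and shoulders at b, c."""
    if x <= a or x >= d:
        return 0.0
    if b <= x <= c:
        return 1.0
    if x < b:
        return (x - a) / (b - a)
    return (d - x) / (d - c)

# Hypothetical LO / OK / HI sets for one input normalized to [0, 100]:
LO = lambda x: trapezoid(x, -1, 0, 20, 40)
OK = lambda x: trapezoid(x, 20, 40, 60, 80)
HI = lambda x: trapezoid(x, 60, 80, 100, 101)

def defuzzify_singletons(firings):
    """Weighted average of output singletons: [(strength, singleton), ...]."""
    num = sum(w * s for w, s in firings)
    den = sum(w for w, _ in firings)
    return num / den if den > 0 else 0.0

# Two toy rules firing on a crisp reading x = 70:
x = 70.0
firings = [(OK(x), 50.0),   # IF input is OK THEN output singleton at 50%
           (HI(x), 80.0)]   # IF input is HI THEN output singleton at 80%
u = defuzzify_singletons(firings)   # (0.5*50 + 0.5*80) / 1.0 = 65.0
```

Because the outputs are singletons, de-fuzzification reduces to this single weighted average, which is what makes the scheme fast enough for real-time use.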


Chapter 9

L=Low, OK= normal, H=High, VH=VeryHigh, B=Big, VB=VeryBig, S=Small, VS=VerySmall

Figure 9.7. Variations in oxygen feed (O2Feed) in response to triangular perturbations in D(BOD) and MLSS

Finally, Figure 9.7 shows the effect on the oxygen feed to the reactor of a triangular perturbation of 50% on D(BOD) and MLSS about their nominal values.


Example 9.2 Rotary kiln control


The rotary kiln is the basic process in cement production. Raw feed is processed in a cyclone pre-heater comprising the kiln and precalciner strings, both of which supply pre-heated feed to the rotary kiln, a long tubular structure with a refractory lining. Fuel, either coal dust or fuel oil, is burnt at the exit of the kiln in a long flame that extends along most of the length of the kiln. Throughout the length of the kiln there is a hot airflow induced by an induced-draft fan at the top of the precalciner. As a result of this hot air mass, there is a transfer of heat between the hot air and the raw feed. The raw feed undergoes crystallization as it reaches the end of the kiln, where the temperatures are of the order of 1400 °C. The kiln is rotated at a few revolutions per minute to allow the lava-like fluid to exit the kiln and flow to the grate cooler, where it solidifies into clinker. Clinker, when added to gypsum and ground to a fine powder in a finish-grinding mill, results in the final product, cement.

The cement production process is very energy-intensive, requiring large quantities of fuel in the kilning stage and electric power in the finish-grinding process. As the cost of energy constitutes the largest component of the total cost of production, it is clear that every effort must be made to keep this cost to a minimum. The kilning process is one of the most difficult processes to control. It is significant that this process is one of the few remaining in the process industries that still rely on human operators. Many attempts have been made to model the process and then apply modern control theory to control it. All these attempts have failed dismally due to the complexity, non-linearity and distributed nature of the process. Until the advent of fuzzy kiln control, first introduced by F. L. Smidth of Denmark in the early 1980s, all cement kilns were manually controlled.
A kiln operator's objective is to produce clinker of the desired specifications and production rate while minimizing fuel feed and at all times maintaining the process in a stable state. Under-burning the raw material (i.e., using less fuel, which results in lower kiln temperatures) leads to clinker that is less reactive, because the raw material undergoes an incomplete change of phase. The lava-like material, which slowly slides down the kiln, may then solidify, a catastrophic situation that results in kiln shut-down and repairs to the kiln lining. Over-burning, on the other hand, leads to excessive fuel consumption, less reactive clinker and high refractory lining wear.


Most kiln operators prefer to operate the kiln in the region of over-burning, however, as it is more stable. In between the two extremes there is a narrow region of stable operation in which high productivity and product quality can be achieved. Kiln operators find this state very difficult to maintain for long periods of time, as the controls must be continuously manipulated to account for small changes in the operating state of the process. Kiln operators learn to control their plant from linguistic rules of the type:

IF the Kiln_KW demand is HIgh AND the outlet oxygen content O2 is LOw AND the Previous_Fuel_Feed is OK
THEN make a SMall reduction to the Fuel_Feed AND make NO change to the induced draft fan speed VIDF

Here, Kiln_KW, the outlet oxygen content O2 and Previous_Fuel_Feed constitute the input fuzzy variables, while the current Fuel_Feed and the induced draft fan speed VIDF are the output fuzzy variables. Approximately 50-60 linguistic rules are used to provide very good control of the kiln under normal operating conditions. For start-up and shut-down conditions, as well as abnormal situations, different sets of rules may be used.

The fuzzy controller can control this most difficult of processes as well as, and certainly more consistently than, a human operator, by observing the same control variables and adjusting the same manipulated variables. Fuzzy kiln controllers can easily maintain control of the process in this stable intermediate state, consistently achieving marked fuel economy and high productivity. Fuzzy kiln controllers normally reside in a client on a client-server network, receiving information on the current values of the control variables and returning the manipulated variables to the real-time database on the file server for transmittal to the local RTUs. The fuzzy controller can be executed either at regular intervals or following an interrupt from the control scheduler, which monitors the temporal changes in the control variables.
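A minimal sketch of how a kiln rule of the form quoted above could be evaluated: the AND of the antecedent clauses is taken as the minimum of the membership grades, and the consequent singletons are combined by a weighted average. All membership values and singleton positions are invented for illustration; they are not taken from a real kiln rule base.

```python
# Toy evaluation of two kiln rules: min-composition for the conjunctive
# antecedent, weighted-average aggregation of output singletons. All
# numbers are hypothetical.

def rule_strength(memberships):
    """Degree of fulfillment of a conjunctive (AND) rule antecedent."""
    return min(memberships)

# Hypothetical fuzzified readings for one scan of the kiln:
mu_kw_high = 0.8   # Kiln_KW is HIgh
mu_o2_low  = 0.6   # outlet O2 is LOw
mu_ff_ok   = 0.9   # Previous_Fuel_Feed is OK

# Two toy rules: (antecedent grades, Fuel_Feed change singleton in %)
rules = [
    ((mu_kw_high, mu_o2_low, mu_ff_ok), -2.0),       # SMall reduction
    ((1.0 - mu_kw_high, mu_o2_low, mu_ff_ok), 0.0),  # NO change
]

w = rule_strength(rules[0][0])                       # min(0.8, 0.6, 0.9) = 0.6
num = sum(rule_strength(ms) * c for ms, c in rules)
den = sum(rule_strength(ms) for ms, _ in rules)
fuel_feed_change = num / den                         # weighted-average de-fuzzification
```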
Using fuzzy kiln control, fuel costs have been reduced by up to 5% while productivity has been increased by an equal amount. Today there is a very large number of kilns worldwide under fuzzy control. Similar fuzzy controllers have been used to control all of the processes associated with cement production.


Hierarchical intelligent control has also been used to control a cluster of fuzzy controllers, each controlling a different sub-process of the kilning process. The cement industry was the first process industry to adopt fuzzy control and today a number of vendors supply such controllers. There is certainly no doubt that fuzzy control has resulted in a major change in the cement industry and many process industries are now following suit, encouraged by the progress in the field.

Figure 9.10 The rotary kiln intelligent control system and the principal control and controlled variables (Raw Feed, Air flow and Fuel enter the kiln; the controller receives Kiln KW, O2 and Prev_Fuel_Feed and sets the Fuel Feed and the induced draft fan speed VIDF)

Chapter 10

Model-Based Fuzzy Control


Two distinct trends in the design of fuzzy controllers have evolved in the last decade or so. The first, based on heuristic knowledge of the control actions necessary to control a process, has been dealt with at length in the preceding chapters. Heuristic fuzzy control does not require deep knowledge of the controlled process in order to be applied successfully. This feature has made heuristic fuzzy control very popular in the industrial and manufacturing environments, where such knowledge is often lacking. In general, soft control is possible only if heuristic knowledge of the control policy is known a priori. Thus for new processes, where there is no such prior knowledge, heuristic soft control is not a candidate. It is also clear that the heuristic approach cannot resolve such issues as overall system stability and system performance. This implies that this approach is case-dependent and it is therefore not possible to generalize the performance of fuzzy controllers from knowledge of their behavior in other applications.

The limitations of the heuristic approach led to a search for more rigorous methods that combine fuzzy logic and the theory of modern control. The result was Model-based Fuzzy Control, which has attracted considerable attention and has been applied successfully in a number of applications, particularly in Japan, for the control of high-speed trains, helicopters, robotic arms and more. The technique assumes the existence of an explicit microscopic model of the controlled process of sufficient fidelity from which a series of linearized models can be derived for each

nominal operating state. The model-based fuzzy control methodology is thus a fusion of soft and hard control techniques and offers advantages in situations where stability and transient behavior must be guaranteed. In conventional gain-scheduling control, the selection of the scheduling procedure, i.e., the control law to use, is dependent on some exogenous variable. In contrast, fuzzy gain scheduling, which is a special form of model-based fuzzy control, uses linguistic rules and fuzzy reasoning to determine the corresponding control law. Issues of stability, pole placement and closed loop dynamic behavior are resolved using conventional modern control techniques.

10.1 The Takagi-Sugeno Model-Based Approach to Fuzzy Control


In the mid-1980s Takagi and Sugeno proposed the use of fuzzy reasoning to specify the control law of a conventional state feedback controller so that the overall system would have guaranteed properties. The original controller that Takagi and Sugeno proposed is characterized by a set of fuzzy rules that relate the current state of the process to its process model and the corresponding control law. These composite rules have the form: R: IF (state) THEN (fuzzy process model) AND (fuzzy control law). Consider a physical process described by its non-homogeneous state equations:

\dot{x} = f(x, u), \quad x(0) = x_0

where x \in R^n and u \in R^m are the crisp n-dimensional process state vector and m-dimensional control vector respectively, and x_0 is the initial state. This explicit description of the process may be the result of deep knowledge about the process or the result of identification of the process from experimental data using any of the well-known system identification techniques. One of the interesting features of the first Takagi and Sugeno technique for designing model-based fuzzy controllers is that under certain conditions that are, unfortunately, not always easy to satisfy, this technique guarantees stability of the closed system while specifying the transient behavior of the closed system through pole-placement. These are properties that are inconceivable with the heuristic fuzzy controller design technique. The difficulties in meeting the conditions for stability of the first method proposed by Takagi and Sugeno were eliminated in the second version that uses state differences. Both techniques are outlined in this chapter.

10.2 Fuzzy Variables and Fuzzy Spaces


Consider the crisp process state vector x defined on a closed, real set X. The fuzzy state variables x_i are fuzzy sets defined on X. The values of these variables are termed the fuzzy values of the fuzzy state variable and are written as X_{ij}. For every fuzzy value of x_i there exists a corresponding membership function \mu_{X_{ij}}(x), which specifies the membership value of the crisp value x_i^* of this variable. In general we define:

X_{ij} = \int_X \mu_{X_{ij}}(x) / x

for the continuous case and

X_{ij} = \sum \mu_{X_{ij}}(x) / x

for the discrete case. The universe of discourse is given as the set:

T_{X_i} = \{ X_{i1}, X_{i2}, ..., X_{ik_i} \}

where k_i is the number of fuzzy values of x_i. In order to simplify the analysis that follows, it will be assumed that the shapes of the fuzzy sets of X_i are identical for all i, that the numbers of fuzzy values k_1 = k_2 = ... = k_n, and that X_{1j} = X_{2j} = ... = X_{nj}.

Examples of such fuzzy sets are shown in Figure 10.1.

Figure 10.1 Fuzzy sets of the state x_i

The state vector x of a process is defined over some space. Every crisp value x* of the state vector corresponds to a specific state in state space. In the case of fuzzy controllers based on the Takagi-Sugeno model, the states take on fuzzy values and consequently the concept of state space must be modified to account for the fuzzy values of the state vector. Knowing that every fuzzy variable has a finite number of fuzzy values, we can then generate a finite number of fuzzy vectors that result from the combinations of the fuzzy values. Each element of the crisp state variable x is fuzzified as in the heuristic fuzzy control case. In each fuzzy region of fuzzy state space, a rule uniquely defines the local process model in that region, e.g.,

R_S^i: IF x = X^i THEN \dot{x} = f_i(x, u)    (10.1)

The symbolism x = X^i implies that the state of the process x belongs to the fuzzy region X^i. The consequent of each rule describes an explicit local model of the process in the corresponding fuzzy region X^i. Simple examples of fuzzy process rules for both continuous and discrete-time processes are:

IF u is Low AND x is High THEN the process model is \dot{x} = -0.7x - 0.2x^3 - 3.1u
ELSE IF u is High AND x is Low THEN the process model is \dot{x} = -0.6x - 0.3x^3 - 2.8u

and, given the parametric process model

Pressure(k+1) = a0 Pressure(k) + b1 Valve(k) + b2 Input_Flow_Rate(k)

IF Pressure(k+1) is High AND Valve(k) is Closed AND Input_Flow_Rate(k) is Very_High THEN the process parameters are a0 = -3, b1 = 2 and b2 = -1
ELSE IF Pressure(k+1) is Low AND Valve(k) is OK AND Input_Flow_Rate(k) is High THEN the process parameters are a0 = -4, b1 = 1 and b2 = -2
ELSE etc.

10.3 The Fuzzy Process Model


It is observed that whereas the antecedents of the fuzzy process rules are similar to those used in the heuristic fuzzy control case, the consequents are analytical expressions that describe a process model. Process rules can be expressed in terms of the elements of the crisp process state as:

R_S^i: IF x_1 = X_{i1} AND x_2 = X_{i2} AND ... AND x_n = X_{in} THEN \dot{x} = f_i(x, u)


In any fuzzy region X^i the process can thus be specified by the state equation:

\dot{x}^i = S^i(x) f_i(x, u)    (10.2)

where

S^i(x) = \mu_{X_1^i}(x_1) \wedge \mu_{X_2^i}(x_2) \wedge ... \wedge \mu_{X_n^i}(x_n)
       = \min(\mu_{X_1^i}(x_1), \mu_{X_2^i}(x_2), ..., \mu_{X_n^i}(x_n))    (10.3)

are the degrees of fulfillment of the local models of the process using Mamdani's fuzzy compositional rule. For each nominal state of the process, a state equation of the form of (10.2) is determined. Using the set of fuzzy process rules (10.1) we establish the fuzzy open-loop model of the process, which is the weighted sum of the local models f_i(x, u), i.e.

\dot{x} = \sum_i w_S^i(x) f_i(x, u)    (10.4)

where

w_S^i(x) = S^i(x) / \sum_i S^i(x) \in [0, 1]    (10.5)

are the normalized degrees of fulfillment or process function weights. Clearly the sum of the weights is unity, i.e.

\sum_i w_S^i(x) = 1
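As a concrete sketch of Equations (10.2)-(10.5), the toy code below blends two hypothetical scalar local models by their normalized degrees of fulfillment. The membership grades and local models are invented for illustration.

```python
# Toy sketch of the fuzzy open-loop model of Equations (10.2)-(10.5): the
# global state derivative is the weighted sum of local models f_i, with
# weights given by the normalized degrees of fulfillment.

def degree_of_fulfillment(memberships):
    """S^i(x): min-composition of the membership grades of one region."""
    return min(memberships)

def blended_dynamics(x, u, regions):
    """regions: list of (membership grades of region i, local model f_i)."""
    strengths = [degree_of_fulfillment(ms) for ms, _ in regions]
    total = sum(strengths)                      # normalization, Eq. (10.5)
    return sum((s / total) * f(x, u)
               for s, (_, f) in zip(strengths, regions))

# Two hypothetical scalar local models in overlapping fuzzy regions:
f_low  = lambda x, u: -0.5 * x + 0.5 * u
f_high = lambda x, u: -1.0 * x + 2.0 * u

x_dot = blended_dynamics(1.0, 0.0, [([0.75], f_low), ([0.25], f_high)])
# 0.75 * (-0.5) + 0.25 * (-1.0) = -0.625
```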


10.4 The Fuzzy Control Law


Similarly, for every antecedent there exist two consequents, the second of which specifies the state feedback control law that must be applied in order to stabilize the process. This, of course, assumes that all states are measurable, a restrictive limitation of the technique in practice. Where it is not possible to measure all the state variables, observers or other forms of state estimators may be used to provide the missing states, provided an explicit microscopic model of the process is available. The second consequent of every rule in the fuzzy region X^j has the form:

R_C^j: IF x = X^j THEN u = g_j(x)

while the control law that must be applied to the process so that stability is assured at any nominal state is derived in an analogous manner and is:

u^j = C^j(x) g_j(x)    (10.6)

where

C^j(x) = \wedge_k \mu_{X_k^j}(x_k) = \min(\mu_{X_1^j}(x_1), \mu_{X_2^j}(x_2), ..., \mu_{X_n^j}(x_n))    (10.7)

The overall control law is thus the weighted sum:

u = \sum_j w_C^j(x) g_j(x)    (10.8)

where w_C^j are the control weights, computed as in Equation (10.5).


10.5 The Locally Linearized Process Model


When the nonlinear model of the process is linearized (also termed Lyapunov linearization) about each nominal state of the process, the result is a set of locally linearized models of the controlled process described by state equations:

\dot{x}_i = f_i(x_i, u_i) = A_i x_i + B_i u_i    (10.9)

where

A_i = \left. \frac{\partial f_i}{\partial x} \right|_{(x_i, u_i)}, \quad B_i = \left. \frac{\partial f_i}{\partial u} \right|_{(x_i, u_i)}

are the usual partial derivatives of the nonlinear functions f_i(·) evaluated at the nominal conditions (x_i, u_i). The set of locally linearized models at these nominal states defines the state equations of the controlled process at those states and is termed the overall fuzzy open-loop model of the process. For each linearized local model there must be a corresponding linear control law that guarantees closed system stability while satisfying time-domain criteria. The complete set of control laws constitutes the global control law of the closed system. In the Takagi-Sugeno approach, the decision on the process model and the control law to use clearly depends on the nominal state of the process and is based on fuzzy linguistic rules. The linearized process model rule (10.2) now becomes:

R_S^i: IF x = X^i THEN \dot{x} = A_i x + B_i u    (10.10)
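The Jacobians in Equation (10.9) can be approximated numerically when an analytical derivative is inconvenient. The sketch below uses central differences about a nominal point for the illustrative scalar model \dot{x} = -0.7x - 0.2x^3 - 3.1u quoted earlier; at x_i = 1, u_i = 0 the analytical values are A_i = -0.7 - 0.6 x_i^2 = -1.3 and B_i = -3.1.

```python
# Central-difference approximation of the Jacobians A_i = df_i/dx and
# B_i = df_i/du of Equation (10.9) at a nominal point (x_i, u_i), for a
# scalar model taken from the process-rule example in Section 10.2.

def jacobians(f, x0, u0, eps=1e-6):
    """Scalar Jacobians of x_dot = f(x, u) at (x0, u0) by central differences."""
    A = (f(x0 + eps, u0) - f(x0 - eps, u0)) / (2.0 * eps)
    B = (f(x0, u0 + eps) - f(x0, u0 - eps)) / (2.0 * eps)
    return A, B

f = lambda x, u: -0.7 * x - 0.2 * x**3 - 3.1 * u

A, B = jacobians(f, x0=1.0, u0=0.0)
# Analytically: A = -0.7 - 0.6 * x0**2 = -1.3 and B = -3.1
```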

Modern control theory assumes that all the states of the process are measurable or can be estimated, in which case a constant state feedback control law of the form:

u_i = K_i x    (10.11)

can be applied, where K_i is a suitable gain matrix. Using any pole-placement technique, the elements of the gain matrix are chosen so that the poles of the closed system yield the desired transient behavior. The fuzzy control rule is therefore:

R_C^j: IF x = X^j THEN u_j = K_j x    (10.12)

Now, from the definitions of w_S^i(x) and w_C^j(x) it is seen that these two are in fact identical, whereupon, to simplify notation, let w_S^i(x) = w^i(x) and w_C^j(x) = w^j(x). Substituting Equation (10.5) in (10.4), we obtain the Overall Closed System (linearized) Fuzzy Model:

\dot{x} = \sum_i w^i(x) (A_i x + B_i u)    (10.13)

and likewise the overall control law:

u = \sum_j w^j(x) K_j x    (10.14)

Finally, substituting Equation (10.14) in (10.13) yields the overall homogeneous closed system state equations:

\dot{x} = \sum_i \sum_j w^i(x) w^j(x) (A_i + B_i K_j) x    (10.15)

or

\dot{x} = \sum_i \sum_j w^i(x) w^j(x) A_{ij} x    (10.16)

where

A_{ij} = A_i + B_i K_j


is a Hurwitz matrix. It should be noted that with this technique, even though linearized models of the process have been used in the development, both the overall system state equations and the overall control law are not linear, because of the nonlinearity of the normalized membership functions w^i(x) and w^j(x). This leads us to the conclusion that asymptotic stability of the closed system can be guaranteed only locally, i.e., around each nominal state, and cannot be inferred elsewhere.

10.5.1 Conditions for closed system stability


The conditions for stability of the Takagi-Sugeno model-based fuzzy controller can be stated as follows:

The closed system (10.15) is asymptotically stable and stabilizable at the origin x = 0, if and only if:

1. the matrix A_{ij} is Hurwitz, i.e., all its eigenvalues possess negative real parts, and
2. there exists a positive definite matrix P such that

A_{ij}^T P + P A_{ij} < 0 \quad \forall i, j    (10.17)

Even though these stability conditions are necessary and sufficient, it is very difficult to determine a suitable positive definite matrix P. Furthermore, if the matrix A_{ij} is not Hurwitz, then a matrix P which satisfies Equation (10.17) does not exist and this technique is not applicable.
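For a given rule set, the two conditions can at least be verified numerically. The sketch below checks, for two illustrative 2x2 closed-loop matrices A_ij and a hand-picked candidate P, that every A_ij is Hurwitz and that A_ij^T P + P A_ij is negative definite; it uses the trace/determinant tests that are equivalent to the eigenvalue conditions in the 2x2 case. The matrices and P are invented for the example — as noted above, finding a common P is the hard part in general.

```python
# Check of the stability conditions (1) and (2) above for 2x2 matrices
# A_ij = A_i + B_i K_j, using trace/determinant criteria. All matrices are
# illustrative, not from a real design.

def is_hurwitz_2x2(A):
    """A 2x2 matrix is Hurwitz iff trace < 0 and determinant > 0."""
    tr = A[0][0] + A[1][1]
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return tr < 0 and det > 0

def is_negative_definite_2x2(M):
    """A symmetric 2x2 M is negative definite iff M[0][0] < 0 and det(M) > 0."""
    return M[0][0] < 0 and (M[0][0] * M[1][1] - M[0][1] * M[1][0]) > 0

def lyapunov_condition(A, P):
    """Form Q = A^T P + P A and test Q < 0, i.e. Equation (10.17)."""
    Q = [[sum(A[m][k] * P[m][l] + P[k][m] * A[m][l] for m in range(2))
          for l in range(2)] for k in range(2)]
    return is_negative_definite_2x2(Q)

A11 = [[0.0, 1.0], [-2.0, -3.0]]   # eigenvalues -1, -2 (Hurwitz)
A12 = [[0.0, 1.0], [-1.0, -2.0]]   # eigenvalue -1, double (Hurwitz)
P   = [[3.0, 1.0], [1.0, 1.0]]     # candidate positive definite matrix

hurwitz_ok = all(is_hurwitz_2x2(A) for A in (A11, A12))
lyap_ok    = all(lyapunov_condition(A, P) for A in (A11, A12))
```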

10.6 The Second Takagi-Sugeno Approach


To overcome these limitations, Takagi and Sugeno proposed a second approach that follows the one already outlined in principle, but differs in the formulation of the fuzzy rules. Thus the rule which defines the fuzzy open-loop model is modified to:

R_S^i: IF x^d = X^i THEN \dot{x} = f_i(x, u)    (10.18)

where x^d is the desired state of the process. The modified model-based fuzzy control approach determines which fuzzy region the desired state of the process belongs to before deciding what control law to follow. It is noted that the desired state need not be stationary but may change continuously without the need to re-compute the control law. The desired state x^d and the control input u^d are the result of solving the steady state equations:

f_i(x^d, u^d) = 0

Assuming that the desired state x^d is constant or varies very slowly with time so that \dot{x}^d \approx 0, the nonlinear system can then be linearized about (x^d, u^d) to yield:

\dot{x} = A^d (x - x^d) + B^d (u - u^d)    (10.19)

which are linear state equations involving the state and control input deviations from their nominal values x^d and u^d respectively, while

A^d = A(x^d, u^d) = \left. \frac{\partial f_i(x, u)}{\partial x} \right|_{x = x^d, u = u^d}

and

B^d = B(x^d, u^d) = \left. \frac{\partial f_i(x, u)}{\partial u} \right|_{x = x^d, u = u^d}

The modified technique has the following limitations that may make application difficult:

- it is valid only for small deviations in the state and input vectors from their nominal values, and
- for every change in the nominal state, the rules by which decisions are made must be changed and the linearization procedure repeated.


10.7 Fuzzy Gain-Scheduling


Gain-scheduling is a well-known technique of conventional industrial control that is used when the plant is subject to changes in its operating state. Here the gains of the controller are varied in accordance with some exogenous variable. A typical example is aircraft control: as the altitude of the aircraft increases, the influence of the control surfaces decreases due to the thinner air. This in turn requires a greater control action to achieve a particular result. If altitude (as measured by an altimeter) is therefore used as the exogenous variable to adjust the controller gains at given altitudes, then the controlling effect can be scheduled to be essentially the same whatever the altitude. The adjustments to the controller gains are step-wise, resulting in bumpy control when the adjustments are effected. Automatic transmission in vehicles is another well-known example: the distinct changes in the gear ratios while accelerating cause a jerky motion that is undesirable. To avoid these sudden changes, some manufacturers offer infinitely variable ratio transmissions that lead to smooth motion.

Referring to the problem described in the previous section, if it were possible to generate a set D of known nominal states x^i, for which a corresponding set S of linearized process models is computed, then fuzzy gain-scheduling can offer distinct advantages. In this scheme, the transition from one nominal state to another is smooth, since the system parameters can be made to vary smoothly along the trajectory. These two sets are clearly invariant with changes in the nominal states. For any nominal state that does not belong to the set D, an approximating model can be derived from models belonging to the set without recourse to linearization. For each nominal state x^d the locally linearized model is stored in the model base S while the corresponding control law is stored in the control law base U.
The nominal states x^i in the nominal states base D can be conveniently chosen to be at the centers of the fuzzy regions X^i, i.e., the states x^i \in X^i at which:

\mu_{X^i}(x^i) = \min(1, 1, ..., 1) = 1
Thus at the centers of each fuzzy region, the linearized systems in the consequents of the fuzzy rules are given by Equation (10.19) with x^i in place of x^d. The set of rules which describes the fuzzy open-loop model reduces to:

R_S^i: IF x^d = X^i THEN \dot{x} = A_i (x - x^d) + B_i (u - u^d)    (10.20)

and the control law rules are defined by:

R_C^j: IF x^d = X^j THEN u = K_j (x - x^d) + u^d    (10.21)

where the gain matrix K_j = K(x^j, u^j) is computed on the basis of the linearized closed system defined in the corresponding fuzzy region X^j. The fuzzy open-loop model is now specified in terms of the deviations of the state and control actions from the nominal values, i.e.:

\dot{x} = \sum_i w^i(x^d) [A_i (x - x^d) + B_i (u - u^d)]    (10.22)

The overall control law is now given by:

u = \sum_j w^j(x^d) K_j (x - x^d) + u^d    (10.23)

It is observed that both the closed system model and the control law are linear, since the normalized membership functions w^i(x^d) and w^j(x^d) are constants less than unity. As in the previous case, the closed system is now given by:

\dot{x} = \sum_i \sum_j w^i(x^d) w^j(x^d) (A_i + B_i K_j)(x - x^d)    (10.24)

It follows that

\sum_i w^i(x^d) = \sum_j w^j(x^d) = \sum_i \sum_j w^i(x^d) w^j(x^d) = 1

Here

A^d = \sum_i w^i(x^d) A_i, \quad B^d = \sum_i w^i(x^d) B_i

and

K^d = \sum_j w^j(x^d) K_j

In an analogous way

\dot{x} = A^d (x - x^d) + B^d (u - u^d)

and

u = K^d (x - x^d) + u^d
The overall closed system is stable at the nominal state x^d if and only if:

Re\{\lambda_m(A^*)\} < 0, \quad m = 1, 2, ..., n

where \lambda_m are the eigenvalues of the matrix

A^* = A^d + B^d K^d

If these conditions hold, then we can state the following:

The fuzzy gain-scheduled closed system (10.24) is asymptotically stable if and only if positive definite matrices P and Q exist such that

A^{*T} P + P A^* = -Q    (10.25)

In summary, the advantages of the second Takagi-Sugeno approach are:

- the approach leads to linear descriptions of the open and closed systems and of the desired control law,
- computation of the control law and the conditions for stability are well established, and
- the determination of the matrix P is almost always possible.

The second Takagi-Sugeno technique combines the simplicity of heuristic fuzzy logic with rigorous hard control techniques and offers the possibility of specifying the desired closed system response as well as the conditions for stability. It suffers, however, from the fact that an explicit analytical model of the process must be known.

Example 10.1 Fuzzy gain-scheduling of a simple process


The second Takagi-Sugeno approach is illustrated by way of an example of fuzzy gain-scheduling. Aircraft exhibit dynamic characteristics which vary significantly with altitude: if a controller is tuned at low altitude, then control performance is grossly degraded with altitude due to the thinning air. Thus gain-scheduling is resorted to in order to adapt the parameters of the controller with altitude. The example presented here assumes a linearized scalar model of the dynamics of the aircraft or plant. The explicit model of the plant is assumed known at ground level x^d = 0 and at some other specified altitude x^d = 1. The process rules are assumed, for simplicity, to be:

R_0: IF x^d = 0 THEN \dot{x} = f_1(x, u) = -0.5x + 0.5u    (a)
R_1: IF x^d = 1 THEN \dot{x} = f_2(x, u) = -x + 2u    (b)

At low altitude, case (a), the dynamics of the plant are slow, with a normalized time constant of T = 2, whereas at high altitude, case (b), the response of the plant is faster, with a time constant T = 1, and is considerably more sensitive to control actions. The corresponding step responses are shown in Figure 10.2.

Figure 10.2 Step responses of the plant at the two altitudes

The transition of the dynamics of the plant with altitude is assumed to vary smoothly. Consider the use of a fuzzy gain-scheduling controller whose control rules are as follows:

R_0: IF x^d = 0 THEN u_1 = g_1(x, x^d) = k_1 (x - x^d)
R_1: IF x^d = 1 THEN u_2 = g_2(x, x^d) = k_2 (x - x^d)

In order to obtain the desired closed system response at both altitude extremes, the state feedback gains k_1 = 0.6 and k_2 = 0.4 are selected so that the eigenvalue of the closed system is at (-0.2, 0) in both cases. To simplify the analysis, let the fuzzy membership functions shown in Figure 10.3 indicate how the transition of the dynamics of the plant follows the altitude. It is clear that at very low altitudes the mathematical model of the plant is predominantly of type (a), but as altitude increases the type (a) model increasingly fades into type (b) until x = x^d, at which point the model is entirely type (b).

Figure 10.3 Fuzzy sets of gain-scheduling

The gain-scheduling fuzzy sets can be described by the expressions \mu_1(x) = 1 - x and \mu_2(x) = x. It is noted that \mu_1 + \mu_2 = 1 for all x. The overall fuzzy process model is therefore given by the weighted sum:

\dot{x} = w_1 (-0.5(x - x^d) + 0.5u) + w_2 (-(x - x^d) + 2u)

where w_1 and w_2 are the normalized membership functions:

w_i = \mu_i / (\mu_1 + \mu_2) = \mu_i

The overall fuzzy control law is therefore the weighted sum:

u = 0.6 w_1 (x - x^d) + 0.4 w_2 (x - x^d)

The fuzzy gain-scheduling controller provides the control actions to force the closed system to follow the desired response for all values of altitude. The response of the closed system to a very large step demand in altitude from x = 0 to x = x^d = 1 is shown in Figure 10.4(b). Figure 10.4(a) shows, for comparison, the response of an invariant system governed by:

\dot{x} = -0.2x + 0.2u

Figure 10.4 Responses of (a) an invariant system with the desired dynamic characteristics and (b) the fuzzy gain-scheduled plant

Since the closed system with the fuzzy gain scheduler is clearly nonlinear, it would be unreasonable to expect the two responses shown in Figure 10.4 to be the same. It is noted that in the proximity of x = 0 and x = x^d the response of the closed system approaches the response of the invariant system, as desired.
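The closed-loop behavior of this example can be reproduced with a few lines of forward-Euler simulation. The sketch below uses the local models f_1 = -0.5x + 0.5u and f_2 = -x + 2u with gains k_1 = 0.6 and k_2 = 0.4, and takes the scheduling weights as functions of the current altitude x, following the description that the model fades from type (a) to type (b) as altitude increases; the step size and horizon are arbitrary choices.

```python
# Forward-Euler sketch of Example 10.1: the blended plant model and the
# fuzzy gain-scheduled control law are integrated for a step demand in
# altitude from x = 0 to x_d = 1. Weights follow the current altitude x.

def x_dot(x, xd):
    """Blended closed-loop dynamics of the fuzzy gain-scheduled plant."""
    w1, w2 = 1.0 - x, x                    # normalized weights, w1 + w2 = 1
    u = 0.6 * w1 * (x - xd) + 0.4 * w2 * (x - xd)
    f1 = -0.5 * (x - xd) + 0.5 * u         # type (a) local model
    f2 = -1.0 * (x - xd) + 2.0 * u         # type (b) local model
    return w1 * f1 + w2 * f2

def simulate(xd, x=0.0, dt=0.01, t_end=40.0):
    """Euler integration of the closed system up to t_end."""
    for _ in range(int(t_end / dt)):
        x += dt * x_dot(x, xd)
    return x

x_final = simulate(xd=1.0)   # settles close to the demanded altitude x_d = 1
```

At both extremes the effective closed-loop pole is -0.2, matching the invariant reference system; in between, the blended dynamics are somewhat slower, which is consistent with the difference between the two responses in Figure 10.4.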

Chapter 11

Neural Control
The emulation of human cognitive ability is one of the primary fields of research interest in Computational Intelligence. The human is nowhere near as fast or as accurate in calculating as a modern computer, yet even at a very early age one can easily recognize objects and relate them to their natural environment even if they are distorted or partially hidden. Exactly the opposite is true of computers: they are far more capable than humans in performing tedious tasks that involve extensive numerical computations, yet they have difficulty performing cognitive tasks despite the impressive progress made in Artificial Intelligence.

The ability to learn through experience is one of the principal characteristics of humans. Humans have the capacity to store huge quantities and types of information, recall this data and process it in a very short time with little difficulty. One possible explanation for this superiority is the lack of suitable computer software to emulate the human's ability to process information. A second explanation is the fact that the human brain and computers work in radically different ways, the brain being much more efficient at processing information. The human brain is extremely powerful, comprising a huge number (of the order of 10^9) of simple processing elements or neurons, each one of which is capable of communicating with the others. This is massive parallelism, and it is interesting that modern computer designs are increasingly following this architecture.


The neurons of an Artificial Neural Network (ANN) are arranged in such a manner as to process information in parallel and simultaneously. Each neuron sends activation or de-activation signals to other neurons, while its activation depends on the signals received from the other neurons to which it is connected. The term synapses is commonly used for these connections. Suitable interconnection of these simple elements can yield powerful networks with the ability to learn, adapt and infer. The use of artificial neural networks is sometimes known as connectionism, and ANNs can therefore be viewed as generalized connectionist machines or generalized function mappers.

Since their re-emergence in the early 1980s, there has been an explosion of interest in the application of ANNs for qualitative reasoning, which is at the core of the fields of Soft Computing and Intelligent Systems. This interest has been encouraged by the increasing availability of powerful parallel computing platforms capable of very high computational speeds and parallel programming techniques. Multi-layer ANNs are finding use in an ever-increasing range of applications, from image recognition, voice analysis and synthesis to system identification and industrial control. This chapter, and those that follow, present the most commonly used ANN architectures that have found application in Control Engineering, the basic network-learning algorithms and examples of industrial applications.

The origins of ANNs are to be found some fifty years ago in the work of McCulloch and Pitts. The first contribution in the area of network learning was due to Hebb in 1949, who showed that learning in complex networks can be achieved by adapting the strengths of the synapses. Rosenblatt introduced the Perceptron, an early form of neuron, in the late 1950s. The operation of multi-layer ANNs was not fully understood in those early days and research was restricted to structured perceptrons.
Nilsson followed in the mid-1960s with learning machines made up of clusters of threshold logic units. In 1969, Minsky and Papert published their seminal work on Perceptrons, in which they proved that Perceptrons are limited in their ability to learn, pointing to their inability to represent even a simple XOR logic element. This work dampened enthusiasm for ANNs, and research in the field was virtually paralyzed for almost a decade. Fortunately, ANNs re-emerged in the early 1980s, due mainly to the work of Hopfield, who continued his research on network training methods. New ANN architectures and powerful learning algorithms were introduced in the mid-1980s, rekindling interest in ANNs and their application. The rapid progress in very large-scale integration (VLSI) and parallel computers aided these developments, and today the field of neural networks constitutes a thriving area of research and development. In a very comprehensive paper, Hunt et al. in 1992 (see the Bibliography in chapter 18) presented a host of applications of neural networks in Control Engineering, and the reader is well advised to refer to this work. The properties that make ANNs particularly applicable to control applications are the following:
- being non-linear by nature, they are eminently suited to the control of non-linear plants,
- they are directly applicable to multi-variable control,
- they are inherently fault tolerant due to their parallel structure, and
- faced with new situations, they have the ability to generalize and extrapolate.

These properties satisfy the fundamental requirements for their use in Intelligent Control; neural controllers and fuzzy controllers thus constitute the core of intelligent control. An ANN is essentially a cluster of suitably interconnected nonlinear elements of very simple form that possess the ability to learn and adapt. These networks are characterized by their topology, the way in which they communicate with their environment, the manner in which they are trained and their ability to process information. ANNs are classified as:
- static, when they do not contain any memory elements and their input-output relationship is some non-linear instantaneous function, or
- dynamic, when they involve memory elements and their behavior is the solution of some differential equation.


11.1 The Elemental Artificial Neuron


ANNs are constructed from elemental artificial neurons (which vaguely approximate physical neurons) that are suitably interconnected via branches. The synaptic weights are the gains or multipliers of these branches and uniquely specify the input-output transfer function (functional relationship or mapping) of the neural network. The synaptic weights of an untrained network are initially unknown and are determined by training with some network training law which aims at minimizing some measure of the error between the actual and the desired outputs of the network. A model of an elemental artificial neuron (also referred to as a node) is shown in Figure 11.1. A static neuron has a summer or linear combiner whose output ν is the weighted sum of its inputs, i.e.:

ν = w1x1 + w2x2 + … + wnxn + b

where w and x are the synaptic weight and input vectors of the neuron respectively, while b is the bias or offset. A positive synaptic weight implies activation, whereas a negative weight implies de-activation of the input. The absolute value of the synaptic weight defines the strength of the connection. The weighted sum ν activates a distorting (or compression) element f(ν). One form that this element can take is the threshold logic unit, in which the output of the neuron is triggered when the inner product ⟨w, x⟩ exceeds the bias b. There are many variations of the nonlinear distorting element, the most common of which are:

1) Linear (ADALINE):
   f(ν) = ν

2) Binary (Threshold Logic Unit or TLU):
   f(ν) = 1 if ν > 0
        = 0 if ν ≤ 0

3) Sigmoid:
   f(ν) = 1/(1 + e^-ν) ∈ [0, 1]

4) Hyperbolic tangent:
   f(ν) = (1 − e^-ν)/(1 + e^-ν) = tanh(ν/2) ∈ [−1, 1]

5) Perceptron:
   f(ν) = ν if ν > 0
        = 0 if ν ≤ 0

Figure 11.1 The elemental artificial neuron or node
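The elemental neuron and the five compression elements listed above can be sketched in a few lines of Python (a minimal illustration of ours, not code from the book):

```python
import math

# Elemental neuron: y = f(<w, x> + b), for one of the compression elements f.
def weighted_sum(w, x, b):
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def linear(nu):                     # 1) ADALINE
    return nu

def tlu(nu):                        # 2) Threshold Logic Unit
    return 1.0 if nu > 0 else 0.0

def sigmoid(nu):                    # 3) range [0, 1]
    return 1.0 / (1.0 + math.exp(-nu))

def bipolar_sigmoid(nu):            # 4) (1 - e^-nu)/(1 + e^-nu) = tanh(nu/2), range [-1, 1]
    return (1.0 - math.exp(-nu)) / (1.0 + math.exp(-nu))

def perceptron(nu):                 # 5) nu if nu > 0, else 0
    return nu if nu > 0 else 0.0

# One neuron with two inputs (weights and inputs are arbitrary example values)
nu = weighted_sum([0.5, -0.25], [1.0, 2.0], b=0.5)   # 0.5 - 0.5 + 0.5 = 0.5
y = sigmoid(nu)
```

Swapping `sigmoid` for any of the other four functions changes only the compression applied to the same weighted sum.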

The input to the compression element may take either of the following forms, depending on whether the neuron is static or dynamic:

weighted sum (for the case of static, memoryless neurons):

ν = Σi=1..n wi xi + b = Σi=1..n+1 wi xi ;  xn+1 = 1, b = wn+1

accumulated output (for the case of dynamic neurons with memory):

ν(k) = ν(k−1) + Σi=1..n+1 wi xi(k)

Here k is the time index, and it is necessary to store the previous value ν(k−1) of the weighted sum.
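The two input forms can be contrasted in a short sketch (our own illustration; the weight values are arbitrary):

```python
# Static neuron: nu is recomputed from scratch at every step.
def static_nu(w, x):
    # nu = sum_{i=1}^{n+1} w_i x_i, with x_{n+1} = 1 and w_{n+1} = b
    return sum(wi * xi for wi, xi in zip(w, x))

# Dynamic neuron: nu(k) = nu(k-1) + sum_i w_i x_i(k); the previous
# value nu(k-1) must be stored between steps.
class DynamicNeuron:
    def __init__(self, w):
        self.w = w
        self.nu = 0.0                         # stored nu(k-1)

    def step(self, x):
        self.nu += sum(wi * xi for wi, xi in zip(self.w, x))
        return self.nu

w = [1.0, -0.5, 0.2]                          # last weight is the bias (x3 = 1)
neuron = DynamicNeuron(w)
out1 = neuron.step([1.0, 1.0, 1.0])           # accumulates 0.7
out2 = neuron.step([1.0, 1.0, 1.0])           # accumulates a further 0.7
```

Repeating the same input drives the dynamic neuron's accumulator upward, whereas the static neuron would return the same value each time.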

11.2 Topologies of Multi-layer Neural Networks


As noted earlier, ANNs are clusters of neurons structured hierarchically in a multiplicity of layers. The neurons of a network, depicted as circles in Figures 11.2 and 11.3, are normally structured in layers, resulting in multi-layered ANNs. The input layer of the network is at the lowest level of the hierarchy, while the highest level corresponds to the output layer and yields the output of the network. Feed-forward ANNs are networks where information flows successively from the lowest to the highest layers of the network and no feedback is involved. Figure 11.2 shows an example of a multi-layered ANN belonging to this class. It is observed that the internal or hidden layers of the network communicate with the environment only through the input and output layers. Though an ANN may, in principle, have any number of hidden layers, it has been proved that one hidden layer suffices to generate any arbitrary mapping between inputs and outputs. Depending on the manner in which the various neurons in the network are connected, i.e., the network topology or network architecture, the following constitute the principal classes of ANNs:
- Hopfield recurrent networks, where the nodes of one layer interact with nodes of the same, lower and higher layers,
- feed-forward networks, in which information flows from the lowest to the highest layers,
- feedback networks, in which information from any node can return to that node through some closed path, including paths from the output layer to the input layer, and
- symmetric auto-associative networks, whose connections and synaptic weights are symmetric.

Figure 11.2 Feed-forward network
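A forward pass through such a layered feed-forward network amounts to repeated weighted sums and compressions, one layer at a time. The sketch below is our own (arbitrary weights, sigmoid compression):

```python
import math

def sigmoid(nu):
    return 1.0 / (1.0 + math.exp(-nu))

def layer(W, b, x):
    # each row of W holds the synaptic weights of one neuron in this layer
    return [sigmoid(sum(wij * xj for wij, xj in zip(row, x)) + bi)
            for row, bi in zip(W, b)]

def feedforward(x, layers):
    for W, b in layers:          # information flows only upward, no feedback
        x = layer(W, b, x)
    return x

# 2-2-1 topology: two inputs, one hidden layer of two neurons, one output
W1, b1 = [[0.5, -0.3], [0.8, 0.2]], [0.1, -0.1]
W2, b2 = [[1.0, -1.0]], [0.0]
y = feedforward([1.0, 0.5], [(W1, b1), (W2, b2)])
```

The hidden layer communicates with the environment only through the input vector it receives and the output vector it passes upward.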

Figure 11.2 shows an example of a multi-layer feed-forward ANN involving an input layer, a single hidden layer and an output layer. This is a very common feed-forward network topology. Figure 11.3 shows a single-layered Hopfield network, which involves feedback. In contrast to feed-forward networks, every node of a Hopfield network is connected to all the others. These ANNs can be useful in Soft Control because they possess the following three properties: they can learn from experience rather than programming, they can generalize and generate arbitrary non-linear input-output mappings, and they are distributed and inherently parallel.

Figure 11.3 Hopfield network with feedback
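The auto-associative behavior of such a network can be illustrated with a minimal Hopfield-style sketch (our own illustration, using Hebbian outer-product weights and synchronous sign updates; the book does not give this code):

```python
# Hebbian outer-product weights store the pattern; no self-connections.
def train_hopfield(patterns):
    n = len(patterns[0])
    W = [[0.0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:
                    W[i][j] += p[i] * p[j]
    return W

# Synchronous recall: repeatedly take the sign of the weighted sums.
def recall(W, x, steps=5):
    x = list(x)
    for _ in range(steps):
        x = [1 if sum(wij * xj for wij, xj in zip(row, x)) >= 0 else -1
             for row in W]
    return x

stored = [1, -1, 1, -1, 1, -1]                 # bipolar pattern to memorize
W = train_hopfield([stored])
noisy = [1, -1, -1, -1, 1, -1]                 # third element flipped
restored = recall(W, noisy)
```

Recall succeeds here because the noisy probe lies in the basin of attraction of the stored pattern; the feedback loop drives the state back to the memory.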

11.3 Neural Control


In order to compare any unconventional control method with conventional methods, it is necessary to enumerate the basic characteristics of each method and to specify a measure of comparison. Thus neural control:
- is directly applicable to non-linear systems, because of the ability of ANNs to map any arbitrary transfer function,
- has a parallel structure, thereby permitting high computational speeds; the parallel structure also implies that neural controllers have a much higher reliability and fault tolerance than conventional controllers,
- can be trained from prior operational data and can generalize when subjected to causes that it was not trained with, and
- has the inherent ability to process multiple inputs and generate multiple outputs simultaneously, making it ideal for multivariable intelligent control.

From the control viewpoint, the ability of neural networks to cope with non-linear phenomena is particularly significant. As is well known, there is no unified theory of non-linear control, only a host of scattered techniques capable of giving solutions to specific cases. ANNs can therefore be used to advantage in the design of non-linear controllers for non-linear plants, particularly since the design is the result of learning.

11.4 Properties of Neural Controllers


Modern intelligent control systems are capable of some degree of autonomy, and the fusion of modern control techniques and Computational Intelligence assures them increased efficiency in changing, vague and uncertain environments. The ultimate objective is autonomous intelligent systems capable of understanding and adapting to changes in both the operating conditions and the environment in order to consistently maximize performance. There are many situations where autonomous intelligent systems are considered essential, typically in cases of high danger such as nuclear reactor and armament inspection, and space and underwater exploration, where intelligent robots are already playing their part. This objective is feasible using the techniques of Computational Intelligence. Both man and machine can control an industrial plant. There is, however, a fundamental difference in the manner in which they do so. Man processes large and seemingly disparate quantities of stimuli, whereas the stimuli available to machines are usually highly limited. The reason for this is not so much the absence of sensors as the manner in which these stimuli are processed. Humans have an inherent ability to process massive quantities of data quickly and efficiently, sorting what is relevant and ignoring what is not, while fusing information from a variety of sources before arriving at a conclusion. Machines do not have this ability, and it is uncertain whether they will in the foreseeable future.


Finally, neural networks have the following properties that make them particularly useful for control: they possess a collective processing ability, are inherently adaptable, are easily implemented, achieve their behavior through training, can process large numbers of inputs and outputs, making them suitable for non-linear and multi-variable control, are relatively immune to noise, are very fast in computing the desired control action due to their parallel nature, and do not require an explicit model of the controlled process.

11.5 Neural Controller Architectures


Widrow and Smith demonstrated the first application of a neural network in control in the mid-1960s. They used a single ADALINE (ADaptive LInear NEuron) to control an inverted pendulum and showed that, after training, this elementary trainable controller was just as effective as a conventional controller. The increasing demands for improved productivity, product quality and plant efficiency, coupled with the increasing complexity of industrial plants, have led inevitably to a search for more advanced control techniques. Since the late 1980s, significant activity in the use of ANNs for the identification and control of systems has been observed. Their ease of use and their inherent reliability and fault tolerance have made ANNs a viable medium for control. Many architectures for the control of plants with ANNs have been proposed since the mid-1980s, and the subject still presents considerable interest, not only in research but also in practice. Neural controllers are an alternative to fuzzy controllers in many cases, sharing with them the aim of replacing hard controllers with intelligent controllers in order to increase control quality. The problem of macroscopic identification of physical plants from normal operational data using multi-layer ANNs reduces to one of finding a dynamic functional relationship between the plant inputs and outputs. The method is well established and is not limited to linear approximants. By way of example, consider the case of a SISO discrete-time system that is to be identified by the ANN shown in Figure 11.4. The input to the ANN is fed from the input to the physical plant. If, following training, the output of the ANN is identical to that of the plant, then we say that the plant has been exactly identified. In practice, perfect identification is unlikely and the plant can be identified only approximately. The fidelity of the approximation depends on the complexity of the network and is directly related to the order of the neural network and the number of past samples of the input that are used.

Figure 11.4 Multi-layer dynamic neural network model of a SISO plant

By placing a series of n delayors D (i.e., elements that delay the signal by one sample period) in series as shown in Figure 11.4, in effect a tapped delay line, we obtain the input signal x(k) and n delayed versions of it: x(k−1), x(k−2), …, x(k−n). These signals from the transversal delay line are then fed to a multi-layer ANN that generates the scalar signal y. This signal is compared with the desired signal d from the physical plant. The object of identification is to minimize some measure of the difference (d − y) by suitably adjusting (i.e., network training) the synaptic weights of the ANN. As will be seen in the following, a similar technique is used for the control of plants using ANNs.
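The identification scheme of Figure 11.4 can be sketched as follows. As a simplifying assumption (ours, not the book's), the multi-layer ANN is replaced by a single linear neuron trained with the LMS rule of Chapter 12; since the unknown plant in this toy example is itself linear, the model can match it exactly:

```python
import random

random.seed(0)

n = 2                                   # number of delayors in the tapped delay line
w = [0.0] * (n + 1)                     # model weights, one per tap
mu = 0.05                               # learning-rate constant

def plant(u0, u1, u2):
    # "unknown" SISO plant: d(k) = 0.6 u(k) + 0.3 u(k-1) - 0.1 u(k-2)
    return 0.6 * u0 + 0.3 * u1 - 0.1 * u2

u_hist = [0.0] * (n + 1)                # [u(k), u(k-1), u(k-2)]
for _ in range(2000):
    u_hist = [random.uniform(-1, 1)] + u_hist[:n]    # shift the delay line
    d = plant(*u_hist)                  # desired signal from the plant
    y = sum(wi * ui for wi, ui in zip(w, u_hist))    # model output
    e = d - y
    w = [wi + 2 * mu * e * ui for wi, ui in zip(w, u_hist)]   # LMS update
```

After training, w approximates the plant coefficients (0.6, 0.3, −0.1), i.e., the plant has been identified from its operating record.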

11.5.1 Inverse model architecture


The success of the back-propagation training algorithm for multi-layer neural networks, which is presented in some detail in the next chapter, was instrumental in opening the gates to a flood of applications of ANNs for control.

Figure 11.5 Inverse model structure

A neural controller that has found use in practice due to its simplicity uses an inverse model of the plant. The method has much in common with conventional auto-tuning. During the training phase of the ANN, which is performed off-line with known training sets, the objective is to establish the inverse relationship P⁻¹ between the output(s) and the input(s) of the physical plant, in effect the inverse transfer function if the plant is linear. Thus, if the physical plant is characterized by the mapping y = P(u), then following training the ANN will ideally generate the inverse mapping u = P⁻¹(y), so that the overall relationship between the input and the output of the closed controlled system is unity, i.e., perfect tracking! In keeping with accepted symbolism, the output of the neural controller, now the control signal to the plant, has been renamed u, whereas y refers to the plant output. Network training is based on some measure of the open-system error eo = d − y between the desired and actual outputs of the closed system. A flow diagram of the method is shown in Figure 11.5. For all its simplicity, however, the method is questionable: in theory it should work, but in practice it may not, because identification of an inverse mapping can never be exact, in which case the overall transfer relationship of the system is not unity. In order to work, even partially, the method requires repeated identification at regular intervals, a fact that makes it impractical for most industrial applications.
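The idealised case described above can be made concrete with a toy example (entirely ours): a trivially invertible static plant y = P(u) = 2u + 1 and a controller that has learned its exact inverse. With the exact inverse the closed chain tracks d perfectly; any identification error in P⁻¹ would appear directly as tracking error:

```python
def plant(u):
    # toy static plant y = P(u) = 2u + 1
    return 2.0 * u + 1.0

def learned_inverse(d):
    # stands in for the trained ANN: here, the exact inverse P^-1
    return (d - 1.0) / 2.0

d = 5.0                        # desired output
u = learned_inverse(d)         # control signal from the controller
y = plant(u)                   # plant output
tracking_error = d - y         # zero only because the inverse is exact
```

Replacing `learned_inverse` with an imperfect approximation immediately makes `tracking_error` nonzero, which is precisely the practical weakness of the architecture.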

Figure 11.6 Specialized training architecture

11.5.2 Specialized training architecture


The lack of robustness of the previous architecture can be compensated for by using what has been referred to as the specialized training architecture, whose flow diagram is shown in Figure 11.6 and in which the closed-system error becomes the driving force. In this architecture, the ANN is placed in series with the plant, as in the previous case, but within the closed loop. The result is increased robustness coupled with the advantages of conventional feedback, since training is now based on some measure of the closed-system error ec = d − y. Training is considerably more difficult with this structure, however, due to the feedback action.

11.5.3 Indirect learning architecture


The indirect training architecture is more complicated than either of the preceding methods since it involves not one but two dynamic ANNs and its training is considerably more difficult. Here, one ANN is trained to model the physical plant following identification while the second ANN performs the controlling task using a feed-forward network. Both ANNs are trained on-line from normal operating records. A flow diagram of the architecture is shown in Figure 11.7.

Figure 11.7 Indirect learning architecture

During the training phase, the simulator ANN learns the functional relationship between the input(s) and output(s) (i.e., the transfer function) of the physical plant. This is the identification phase, which is based on some measure of the error e* = y − y* between the output of the plant and that of the plant-model simulator ANN. Training can be either off-line or on-line with random or pseudo-random signals. Simultaneously, the overall error e = d − y is used to train the controller ANN. The advantage of this architecture is that it permits easier on-line training of the controller ANN, since the error can be propagated backwards through the simulator ANN at every sampling instant.

Chapter 12

Neural Network Training


Heuristic knowledge of how to control a plant can be used to train an artificial neural network, provided this knowledge is suitably encoded. The resultant neural controller is thus simply an ANN whose synaptic weights are trained with this encoded knowledge using an appropriate training algorithm. Network training is basic to establishing the functional relationship between the inputs and the outputs of any neural network, and considerable effort has been spent on finding faster and more efficient training algorithms which will reduce the time required to train a network. There are two basic classes of network training: supervised learning, which involves an external source of knowledge about the system, and unsupervised learning, which involves no external source of knowledge and relies on local information and internal data. In the case of supervised learning, the desired outputs of the network for every given input condition are specified, and the network learns the appropriate functional relationship between them following repeated application of training sets of input-output pairs. The popular back-propagation algorithm, which is used in many applications, belongs to this class. This algorithm gets its name from the fact that the synaptic weights of a multi-layer network are adapted iteratively by propagating some measure of the error between the desired and actual outputs of the network from its output back to its input. In the case of unsupervised learning, no information on the desired output of the network that corresponds to a particular input is available. Here, the network is auto-associative, learning to respond to different inputs in different ways. Typical applications of this class are feature detection and data clustering. Hebb's algorithm and competitive learning are two examples of unsupervised learning algorithms. A wide range of network topologies, such as those due to Hopfield, Hamming and Boltzmann, also use this method of learning. In general, these networks, with their ability to generate arbitrary mappings between their inputs and outputs, are used as associative memories and classifiers.

12.1 The Widrow-Hoff Training Algorithm


A basic adaptive element is the ADAptive LInear NEuron (ADALINE) that was introduced by Widrow and Hoff in the 1960s. This element has a linear input-output relationship and is useful in introducing the fundamental concepts of network training. Clusters of ADALINEs can be assembled into complex networks termed MADALINEs (for Multiple ADALINEs), which were used in early classifiers and learning machines. Network training can be conveniently explained by considering first how the synaptic weights of a single ADALINE may be adapted to generate a given functional relationship between its inputs and its outputs. It will be seen that the same principle can be used for adapting nonlinear neurons and, finally, multi-layer ANNs. Consider a single neuron with inputs

x = {x1, x2, …, xn, xn+1} ;  xn+1 = b

which, weighted by the array of synaptic weights

w = {w1, w2, …, wn, wn+1} ;  wn+1 = 1

results in the weighted sum ν. An ADALINE has the linear functional relationship f(ν) = ν and can be trained by adapting its synaptic weights, provided the desired output d is known. Training is based on some measure of the discrepancy or error e = d − y between the desired and actual outputs of the element. Widrow and Hoff presented the first successful training algorithm for ADALINEs in the 1960s. They named their training procedure the LMS algorithm (for Least Mean Squares), using the squared error as the objective function, i.e.

J = Σi ei²

which is computed for all the elements of the training set. We seek the values of the synaptic weights that minimize this objective function. The inputs to the ADALINE may be either binary or real, whereas the values of the synaptic weights are always real and may be positive or negative, signifying activation and de-activation respectively. In terms of the inputs and the synaptic weights of the element, the error is

e = d − ⟨w, x⟩

where ⟨w, x⟩ is the inner product of the synaptic weight vector and the input vector, and thus the squared error is

e² = d² − 2d⟨w, x⟩ + ⟨w, (xxᵀ)w⟩

whose expected value is

E(e²) = E(d²) − 2E(d⟨w, x⟩) + E(⟨w, (xxᵀ)w⟩)


The partial derivative of the expected value with respect to the synaptic weights is:


∂E(e²)/∂w = −2p + 2Rw    (12.1)

where

p = E(dx)  and  R = E(xxᵀ)

R is the autocorrelation matrix of the input signal, while p is the expectation of the product of the desired output and the input vector. Setting the partial derivative in Equation 12.1 to zero yields the optimum synaptic weight vector

w* = R⁻¹p
which is known as the Wiener solution. If the matrix R and the vector p are known explicitly, then the optimum synaptic weight vector w* can be computed directly. In practice, however, R and p are not known and the Wiener solution unfortunately cannot be used. Widrow and Hoff observed that using the partial derivative of the squared instantaneous error

∇k = ∂(ek²)/∂wk

instead of its expectation made little difference to the final result. This observation simplified network training significantly, since the squared error sensitivity can now be expressed directly in terms of the error and the input, i.e.,

∇k = 2ek ∂ek/∂wk = 2ek ∂(dk − ⟨wk, xk⟩)/∂wk = −2ek xk

The Widrow-Hoff algorithm can be expressed as the iteration:

wk+1 = wk + Δwk = wk − μ∇k = wk + 2μ ek xk    (12.2)

where μ is some positive constant that defines the rate of convergence of the algorithm. This expression shows how the synaptic weights at the


following iteration of the training procedure must be adapted. The correction follows the negative of the gradient, so the search trajectory tends towards a minimum of the squared error. The algorithm is initialized with an arbitrary initial synaptic weight vector w0, and a new training set is applied to the element. The iteration is terminated once the squared error falls below some acceptable value. At each iteration the change in the error is

Δek = Δ(dk − ⟨wk, xk⟩) = −⟨Δwk, xk⟩

and using the fact that the change in the synaptic weights is

Δwk = 2μ ek xk

it follows that

Δek = −2μ⟨xk, xk⟩ek = −α ek ;  α = 2μ⟨xk, xk⟩

This means that the error is reduced by the factor α at each iteration. This factor thus plays a major role in the stability of convergence of the algorithm and must lie in the range 0 < α < 2. In practice, values of α in the range 0.1 < α < 1 are preferred. Too small a value of α leads to very slow convergence, while too large a value leads to oscillations and ultimately instability. The Widrow-Hoff training algorithm has the advantage of simplicity but is plagued by slow convergence. Training algorithms are the subject of considerable research, and algorithms with vastly increased training rates have been developed. The back-propagation algorithm, the major breakthrough that opened up the field of ANNs, is one of the most popular, though by no means the fastest or most reliable. This algorithm is presented later in this chapter.
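The error-reduction factor can be checked numerically. In the sketch below (our own illustration), a single training pair is presented repeatedly, so every Widrow-Hoff step (12.2) scales the error by exactly 1 − α:

```python
def inner(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

x, d = [1.0, 2.0, 1.0], 3.0
mu = 0.05
alpha = 2 * mu * inner(x, x)           # 0.6, inside the stable range 0 < alpha < 2

w = [0.0, 0.0, 0.0]
errors = []
for _ in range(5):
    e = d - inner(w, x)                # e_k = d - <w_k, x_k>
    errors.append(e)
    w = [wi + 2 * mu * e * xi for wi, xi in zip(w, x)]   # iteration (12.2)

ratios = [errors[k + 1] / errors[k] for k in range(len(errors) - 1)]
```

Each ratio equals 1 − α = 0.4, confirming the geometric error decay; choosing μ so that α falls outside (0, 2) makes the same loop diverge.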

12.2 The Delta Training Algorithm


The original Widrow-Hoff algorithm must be modified when the neural element contains a nonlinear element. However, in order to develop a systematic training procedure for the general case, it is necessary to place


certain restrictions on the non-linear element. As will be seen below, the fundamental condition on the non-linear element is that it must be uniformly differentiable, i.e., the non-linearity must not exhibit discontinuities. Given a neuron with ν = ⟨w, x⟩, output y = f(ν) and error e = d − y, the error sensitivity is given by the partial derivative

∂e/∂w = −∂y/∂w = −∂f(ν)/∂w = −(∂f(ν)/∂ν)(∂ν/∂w) = −g(ν)x

It is obvious that for the error sensitivity to exist everywhere, it is necessary that the gradient g(ν) = f′(ν) of the nonlinear function exist for all values of ν. This implies that the nonlinear function must be differentiable, i.e., smooth. In the special case of a linear neuron, clearly g(ν) = 1. The development of the Delta algorithm follows that of the Widrow-Hoff algorithm closely. Here, the squared error sensitivity is

∇k = ∂(ek²)/∂wk = 2ek ∂ek/∂wk = −2ek (∂f/∂ν)(∂ν/∂wk) = −2ek g(νk) xk

which shows how the gradient g of the nonlinear function f enters the training algorithm. This expression is very similar to that of the Widrow-Hoff algorithm, except for the additional factor g. The Delta training algorithm is given by the iteration

wk+1 = wk + Δwk = wk − μ∇k = wk + 2μ ek gk xk    (12.3)

In order to accelerate convergence, it is possible to vary the value of μ according to the progress achieved in training. At each iteration, for example, the value of μ may be made a function of the norm of the change in the synaptic weights: if the rate of change of the norm of the synaptic weights drops below some pre-specified lower limit, the value of μ is doubled, whereas if the error norm rises above some upper pre-specified limit, the value of μ is halved. This is perhaps the simplest possible approach to adapting the multiplier μ; much more sophisticated methods are available but are beyond the scope of this book.
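The Delta iteration (12.3) can be sketched for a single sigmoid neuron. The example below is our own: the neuron is taught the logical AND function, which a single neuron can represent; the training set, rate and epoch count are arbitrary choices:

```python
import math

def f(nu):                         # sigmoid compression
    return 1.0 / (1.0 + math.exp(-nu))

def g(nu):                         # its derivative, g = f'
    s = f(nu)
    return s * (1.0 - s)

def train(samples, mu=0.5, epochs=2000):
    w = [0.0, 0.0, 0.0]            # two input weights plus a bias weight (x3 = 1)
    for _ in range(epochs):
        for x, d in samples:
            nu = sum(wi * xi for wi, xi in zip(w, x))
            e = d - f(nu)
            # iteration (12.3): w <- w + 2 mu e g(nu) x
            w = [wi + 2 * mu * e * g(nu) * xi for wi, xi in zip(w, x)]
    return w

samples = [([0, 0, 1], 0), ([0, 1, 1], 0), ([1, 0, 1], 0), ([1, 1, 1], 1)]
w = train(samples)
outputs = [f(sum(wi * xi for wi, xi in zip(w, x))) for x, _ in samples]
predictions = [round(o) for o in outputs]
```

Note that the update differs from the Widrow-Hoff step only by the factor g(ν), and that the sigmoid is smooth everywhere, satisfying the differentiability condition stated above.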


12.3 Multi-layer ANN Training Algorithms


The training algorithm for a single nonlinear neuron is straightforward and is a variation of the original Widrow-Hoff algorithm. A multi-layer ANN comprises an input layer, one or more hidden layers and an output layer, each of which involves many nonlinear neurons. Indeed, some ANNs used in pattern recognition and in speech analysis and synthesis involve thousands of neurons. Often one or more of the layers of an ANN involves linear neurons. Unfortunately, however, there is no way to know a priori how many neurons are required in each layer to generate a specified input-output mapping. At this time, the choice is arbitrary and a matter of experimentation. Training is performed first for a given number of neurons; the number is then reduced and the ANN is re-trained. The number of neurons is further reduced at each successive training session until the measure of the error starts to increase. For control applications, it is essential that the size of the ANN in the neural controller be limited, both in the number of layers and in the number of neurons per layer. Fortunately, minimal neural networks involving an input layer, a single hidden layer and an output layer, with a total of fewer than 10 nodes, have proved quite successful in practical neural controllers. This small size implies fast training and re-training, should the circumstances require this. The problem of supervised learning in a complex multi-layer ANN must be viewed as an iterative algorithm for systematically minimizing some measure of the discrepancy between the desired and actual outputs of the network by adapting the synaptic weights and biases in each layer of the network. The structure of the iterative training algorithm for adapting the synaptic weights of the network is shown in Figure 12.1. The basic training algorithm for a multi-layer ANN is a variation of the Delta algorithm for a single neuron. In this case, vectors are replaced by matrices whose dimensions depend on the layer number and the number of neurons in each layer. In the next section, back-propagation, the most popular network-training algorithm, is presented.


12.4 The Back-propagation (BP) Algorithm


A multi-layer neural network is characterized by the number of neurons it possesses in each layer. Thus, for instance, a 30-50-10 ANN has 30 neurons in the input layer, 50 in the second or hidden layer and 10 neurons in the output layer. An analysis of the back-propagation algorithm is presented below in simplified form so that it can be easily understood. Generalization of the algorithm to multi-layered ANNs with many neurons is relatively straightforward but requires multiple indices on the various synaptic weight vectors for each layer being considered. In order to avoid unnecessary complications we will consider a simple 2-2-1 ANN involving two neurons in the input layer, two in the hidden layer and one in the output layer. This minimal ANN has, incidentally, been used successfully in a neural controller for a large mill used for pulverizing coal for a cement kiln as well as a controller for a finish cutting lathe.

Figure 12.1 The structure of the network training procedure


The signals at the inputs and outputs of the neurons of the first layer are, respectively,

ν1 = w11 x1 + w21 x2 + w13 ;  y1 = f(ν1)
ν2 = w12 x1 + w22 x2 + w23 ;  y2 = f(ν2)

while the weighted signal at the output of the second (output) layer is

ν3 = v1 y1 + v2 y2 + v3 ;  y3 = f(ν3)

The synaptic weights of the first layer are collected in the matrix W = {wij} and those of the second layer in the vector v. The error is

e = d − y3 = d − f(ν3)

Figure 12.2 An example of a simple 2-2-1 ANN


The partial derivatives of the squared error with respect to the synaptic weights in each layer are therefore

∂(e²)/∂wij = 2e ∂e/∂wij ;  i = 1, 2;  j = 1, 2, 3

∂(e²)/∂vi = 2e ∂e/∂vi ;  i = 1, 2, 3

Following the steps taken in developing the Delta training algorithm presented in the previous section, the synaptic weights of the first layer must be adapted according to

Wk+1 = Wk + ΔWk = Wk − 2μ ek ∂e/∂W

where W = {wij}, and the weights of the output layer according to

vk+1 = vk + Δvk = vk − 2μ ek ∂e/∂v

The training algorithm thus requires evaluation of the partial derivatives of the error with respect to the synaptic weights of both layers, i.e., the error sensitivities

∂e/∂W  and  ∂e/∂v
Dropping the iteration index k to simplify notation, the chain rule is used to derive the first three partial derivatives for the first layer, which are:


e y f ( 3 ) 3 f ( 3 ) 3 f ( 1 ) 1 = 3 = = = g ( 3 ) g ( 1 )v1 x1 w11 w11 3 w11 3 f ( 1 ) 1 w11 e y f ( 3 ) w3 f ( 3 ) 3 f ( 1 ) 1 = 3 = = = g ( 3 ) g ( 1 )v1 x2 w12 w12 3 w12 3 f ( 1 ) 1 w12 f ( 3 ) 3 f ( 1 ) 1 y f ( 3 ) 3 e = = g ( 3 ) g ( 1 )v1 = 3 = 3 f ( 1 ) 1 w13 w13 w13 3 w13

The reader is encouraged to derive the remaining three partial derivatives following the same procedure. It is noted that given the input-output relationship of the compression function analytically, the derivative g can be readily derived. Thus for the tanh non-linearity, used very commonly as the distorting or compression function of the node:

f(s) = (1 - e^-s)/(1 + e^-s)   and   g(s) = f'(s) = 2e^-s/(1 + e^-s)^2

In the same manner, the partial derivatives for the neuron in the second layer are

∂e/∂v1 = -∂y3/∂v1 = -(∂f(s3)/∂s3)(∂s3/∂v1) = -g(s3) y1

∂e/∂v2 = -∂y3/∂v2 = -(∂f(s3)/∂s3)(∂s3/∂v2) = -g(s3) y2

∂e/∂v3 = -∂y3/∂v3 = -(∂f(s3)/∂s3)(∂s3/∂v3) = -g(s3)
These expressions are especially simple because the propagation is through one layer only. Thus, given some random values for the synaptic weights in all layers of the ANN and the known input vector, the first pass is made, during which the signals s1, s2 and s3 are computed. Computation of the neuron outputs y1, y2 and y3 then follows, from which the various nonlinear functions and their derivatives (f, g) are computed. At every iteration of the algorithm, the corrections to the synaptic weight matrix Wk for the first layer and the weight vector vk for the second layer are computed. These corrections are then added to the previous synaptic weights and the algorithm is repeated until convergence,


defined as satisfaction of some error measure, is achieved, whereupon the algorithm is terminated. It is obvious that for complex ANNs containing many layers and many neurons in each layer, many thousands of iterations may be necessary before convergence is achieved, and computers with high computational speed are clearly essential. Finally, it must be noted that the back-propagation algorithm, though convenient and used extensively for training ANNs, is by no means the fastest training algorithm. Many variations of the algorithm have been developed by adding terms to the synaptic weight adaptation law that take into account such factors as the momentum of learning. Such an additional term can result in a significant increase in the rate of convergence of the training algorithm.
Figure 12.3 Flow chart for the back-propagation algorithm for the 2-2-1 ANN
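The forward pass and weight-adaptation loop described above can be sketched in a few lines of code. The following is an illustrative Python rendering of the 2-2-1 network (the book's own programs are in MATLAB); the compression function f, its derivative g and the update law follow the equations of this section, while the toy training set, the learning-rate value and the function names are assumptions made for the demonstration.

```python
import math
import random

# f(s) = (1 - e^-s)/(1 + e^-s) and its derivative g(s) = 2 e^-s/(1 + e^-s)^2
def f(s):
    return (1.0 - math.exp(-s)) / (1.0 + math.exp(-s))

def g(s):
    return 2.0 * math.exp(-s) / (1.0 + math.exp(-s)) ** 2

def make_221(seed=1):
    """Random initial synaptic weights for the 2-2-1 network of Figure 12.2
    (w13, w23 and v3 play the role of bias weights)."""
    r = random.Random(seed)
    return {k: r.uniform(-0.5, 0.5)
            for k in ("w11", "w21", "w13", "w12", "w22", "w23", "v1", "v2", "v3")}

def predict(p, x1, x2):
    """Forward pass: s1, s2, y1, y2, s3 and y3 as defined in the text."""
    y1 = f(p["w11"] * x1 + p["w21"] * x2 + p["w13"])
    y2 = f(p["w12"] * x1 + p["w22"] * x2 + p["w23"])
    return f(p["v1"] * y1 + p["v2"] * y2 + p["v3"])

def train(p, samples, eta=0.25, epochs=3000):
    """Back-propagation: W <- W + 2*eta*e*(dy3/dW), i.e. gradient descent
    on the squared error e^2, since de/dW = -dy3/dW."""
    for _ in range(epochs):
        for x1, x2, d in samples:
            s1 = p["w11"] * x1 + p["w21"] * x2 + p["w13"]
            s2 = p["w12"] * x1 + p["w22"] * x2 + p["w23"]
            y1, y2 = f(s1), f(s2)
            s3 = p["v1"] * y1 + p["v2"] * y2 + p["v3"]
            e = d - f(s3)
            g3, g1, g2 = g(s3), g(s1), g(s2)
            grad = {  # dy3/dw for every weight, by the chain rule above
                "w11": g3 * g1 * p["v1"] * x1, "w21": g3 * g1 * p["v1"] * x2,
                "w13": g3 * g1 * p["v1"],
                "w12": g3 * g2 * p["v2"] * x1, "w22": g3 * g2 * p["v2"] * x2,
                "w23": g3 * g2 * p["v2"],
                "v1": g3 * y1, "v2": g3 * y2, "v3": g3,
            }
            for k, dk in grad.items():
                p[k] += 2.0 * eta * e * dk
    return p

# A toy training set (an assumption, for illustration only)
samples = [(-1, -1, -1), (-1, 1, 0), (1, -1, 0), (1, 1, 1)]
```

Training reduces the sum of squared errors over the four pairs; the convergence behaviour depends on the random initialization and on the learning rate, exactly as discussed above.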

Chapter 13

Rule-Based Neural Control


Neural controllers that can be trained from linguistic rules are a relatively new concept in Soft Control. As in the case of heuristic fuzzy controllers, it is expected that rule-based neural controllers will play their part in furthering the diffusion of Computational Intelligence into Control Engineering. As noted earlier, Soft Control is based on the knowledge and experience of human operators who are trained to control a plant using linguistic rules of the classical IF (cause) THEN (effect) ELSE form. In the previous chapters, methods were presented for processing heuristic linguistic control rules of this form using Fuzzy Logic. In this chapter we discuss a simple, yet effective, technique with which an artificial neural network embedded in a neural controller can be trained from linguistic rules. This technique can be considered a special case of the neuro-fuzzy control that is examined in the next chapter. The neural networks used in control are either static, i.e., they contain no elements with memory, or dynamic, containing delayors that take account of the plant's past history. Furthermore, these neural networks are characterized by the fact that they are small, involving only a few neurons and at most one hidden layer. Simplicity being of the essence in industrial controllers, where high reliability and robustness are of primary importance, it is not uncommon to find neural controllers with as few as 10 neurons.

13.1 Encoding Linguistic Rules


The training of a neural network is normally performed with a numerical training set. There are numerous commercially available software packages from a number of vendors, notably MATLAB and its Neural Toolbox, Neuralware, etc., which make this task a relatively simple matter. As control rules involving linguistic variables cannot be used directly for network training, it is necessary to encode the linguistic rules into numerical form prior to using them in existing network training packages. The problem reduces to one of finding a suitable mapping T: R(L) → R(Q) that maps the linguistic (i.e., qualitative) rule set R(L) into a corresponding numerical (i.e., quantitative) training set R(Q). This procedure has proved very effective in training neural controllers for a number of applications, from large-scale industrial plants to mechatronic systems, using even simple transformations. The following simple example shows the procedure for transforming linguistic rules into numerical data. Defining the linguistic variables

NL = Negative_Large, NSM = Negative_Small, OK = Normal, PSM = Positive_Small, PL = Positive_Large

then a one-to-one transformation of the linguistic variables into their numerical equivalents, defined in the normalized range [-1, 1], could be as follows:

[NL | NSM | OK | PSM | PL] → [-1.0 | -0.5 | 0 | 0.5 | 1.0]

This is an example of a linear mapping that follows human intuition. A non-linear mapping may be used when necessary to compensate for any non-linear behavior of the plant being controlled. Thus a rule with three input variables, INPUT_1, INPUT_2 and INPUT_3, and output variable CONTROL_VARIABLE:

IF INPUT_1 is Negative_Small AND INPUT_2 is Positive_Small AND INPUT_3 is OK


THEN CONTROL_VARIABLE must be Positive_Large

will appear as the training string [-0.5, 0.5, 0] → [1]. This training string can be used directly to form the training set of a neural network of specified topology. The corresponding MATLAB Neural Toolbox statement is simply:
P = [-0.5 0.5 0.0]; % inputs

while the corresponding output is specified by the statement:


T = [1.0]; % output

The training set is the collection of all the (P,T) pairs corresponding to every linguistic rule in the rule base.
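The mapping T: R(L) → R(Q) can be sketched in a few lines. The following Python rendering is illustrative (the book's own examples use MATLAB); the table of numerical equivalents is the one given above, while the names LEVELS and encode_rule are assumptions.

```python
# Numerical equivalents of the linguistic variables, as given in the text
LEVELS = {"NL": -1.0, "NSM": -0.5, "OK": 0.0, "PSM": 0.5, "PL": 1.0}

def encode_rule(antecedents, consequent):
    """Map one linguistic rule onto a numerical (P, T) training pair."""
    P = [LEVELS[a] for a in antecedents]   # rule causes  -> network inputs
    T = [LEVELS[consequent]]               # rule effect  -> network target
    return P, T

# IF INPUT_1 is NSM AND INPUT_2 is PSM AND INPUT_3 is OK THEN ... is PL
P, T = encode_rule(["NSM", "PSM", "OK"], "PL")
# P = [-0.5, 0.5, 0.0] and T = [1.0], the training string given in the text
```

Applying encode_rule to every rule in the rule base and collecting the results yields the complete (P, T) training set.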

13.2 Training Rule-Based Neural Controllers


The training of rule-based neural controllers of specified topology is performed off-line. The results of this training are the optimum values of the synaptic weights and biases of the neural network that yield the required control surface. These weights are subsequently downloaded to the real-time version of the controller. Should it prove necessary to modify or add any rule, the network must be re-trained off-line. Fortunately, small changes require very short training times, since the network parameters are initialized with their previously known values rather than with entirely random values, as in the case of a new controller. In order to train the network it is necessary to map all N linguistic rules contained in the rule base in accordance with the chosen mapping. For the earlier example, the numerical strings for each rule (i.e., the (P, T) pairs) are concatenated to form the training string:
P = [-1.0 -1.0 -0.5 ...; -0.5 0 0.5 ...; -1.0 0.5 -0.5 ...]; % string of N triplets


In a like manner, the corresponding output training set of a multi input single output neural controller can be stated as follows:
T = [0 0.5 -0.5 ... -1.0]; % string of N corresponding outputs

This training set is presented to the neural network hundreds or even thousands of times, in random order, until the synaptic weights converge. The initial estimates of the unknown synaptic weights are taken randomly. At every epoch (i.e., iteration of the training algorithm), the synaptic weights are updated in accordance with the training algorithm used. The back-propagation (BP) algorithm is by far the most popular training algorithm, though it often converges slowly, particularly when the network contains many neurons and layers. Fortunately, in the majority of control applications neural controllers contain very few neurons and training is fast. Learning is considered complete when some measure of the error between the desired and actual outputs of the network reaches an acceptable limit, or when the number of epochs reaches some upper limit.

Example 13.1 Design of a rule-based neural controller for a mechatronic system


A schematic diagram of a mechatronic device involving a finish lathe is shown in Figure 13.1. The cutting tool, to which a force sensor is attached, is required to follow a desired force on the object being turned on the lathe. The quality of finish of the object is critically dependent on the rate at which the metal is cut. Where a high-quality finish is required, it is important that the depth of cut be controlled with great accuracy. In most existing cutting lathes this is achieved with conventional two-term (PI) controllers. Here we show how an unconventional controller may be used with equal or better results.


The minimal neural network shown in Figure 13.2 will be used in the neural controller, and will be trained using linguistic rules. The controller has two inputs: the error between the desired and actual force, ek, and the change in error, Δek = ek - ek-1. The incremental output of the network is Δuk = F(gi ek, gp Δek), while the output of the neural controller is simply uk = uk-1 + Δuk = uk-1 + F(gi ek, gp Δek).

Figure 13.1 A force-cutting lathe

Finally, the output of the controller is weighted before being applied to the plant as ck = gc uk. The parameters gi, gp and gc are the normalizing gains of the controller, necessary to scale the controller's inputs into the range [-1, 1]. The value of gc is (ck/uk)max, while the pair of controller parameters (gi, gp) is tuned on-line or obtained in a manner analogous to the Ziegler-Nichols method. The linguistic control rules required to control the lathe cutting process are shown in the form of the checkerboard pattern in the linguistic rule matrix in Figure 13.3.
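The incremental control law just described can be sketched as follows. This Python rendering is illustrative only: the class name and the clipped-sum stand-in for the trained network F are assumptions, not the controller of the text.

```python
# Sketch of the incremental neural control law: Du_k = F(g_i e_k, g_p De_k),
# u_k = u_{k-1} + Du_k, c_k = g_c u_k (gains g_i, g_p, g_c as in the text).
class NeuralPI:
    def __init__(self, F, g_i, g_p, g_c):
        self.F = F                 # trained network (here any 2-input function)
        self.g_i, self.g_p, self.g_c = g_i, g_p, g_c
        self.u = 0.0               # accumulated output u_{k-1}
        self.e_prev = 0.0          # previous error e_{k-1}

    def step(self, desired, measured):
        e = desired - measured                     # e_k
        de = e - self.e_prev                       # De_k = e_k - e_{k-1}
        self.e_prev = e
        self.u += self.F(self.g_i * e, self.g_p * de)   # u_k = u_{k-1} + Du_k
        return self.g_c * self.u                   # c_k applied to the plant

# Toy stand-in for the trained network: a sum clipped to [-1, 1] (assumption)
F = lambda a, b: max(-1.0, min(1.0, a + b))
ctrl = NeuralPI(F, g_i=1.0, g_p=1.0, g_c=1.0)
```

With unit gains and the clipped-sum stand-in, successive calls to step() accumulate the incremental corrections exactly as the equations above prescribe.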


Figure 13.2 Minimal neural controller for the mechatronic system

Both the ERROR (e) and the CHANGE_OF_ERROR (Δe) are quantized into a number of elements equal to the number of linguistic variables assigned to each controller input, over the normalized range [-1, 1]. Here each controller input has been assigned seven linguistic variables, and consequently the Linguistic Rule Matrix (LRM), or Fuzzy Associative Memory (FAM), contains 49 possible rules, not every one of which is specified. Typically some 20-30% of the possible elements of the FAM are specified. In the example shown in Figure 13.3, the FAM contains 13 of the 49 possible rules.


Figure 13.3 Linguistic Rule Matrix (LRM) or Fuzzy Associative Memory (FAM)


Here PL = Positive_Large, PM = Positive_Medium, PMS = Positive_Medium_Small, OK = Normal, NMS = Negative_Medium_Small, NM = Negative_Medium and NL = Negative_Large. If the linguistic variables for the inputs to the controller are assigned the numerical values

NL = -0.8, NMS = -0.3, OK = 0, PMS = 0.3, PL = 0.8

and those for the controller output

NL = -1, NM = -0.55, NMS = -0.35, OK = 0, PMS = 0.35, PM = 0.55, PL = 1

then the numerical fuzzy associative memory (NFAM) matrix is as shown in Figure 13.4. The magnitude of the control variable is specified in the darkened elements of the matrix; all other elements are left blank. The neural network will perform the necessary interpolation.

Figure 13.4 Numerical FAM



The training set comprising 13 rules is given below:


Rule     e      Δe      Δu
 1      0.8    0.8     1
 2      0.8    0       0.55
 3      0.8   -0.8     0
 4      0.3    0.3     0.35
 5      0.3   -0.3     0
 6      0      0.8     0.55
 7      0      0       0
 8      0     -0.8    -0.55
 9     -0.3    0.3     0
10     -0.3   -0.3    -0.35
11     -0.8    0.8     0
12     -0.8    0      -0.55
13     -0.8   -0.8    -1

The complete network training program in MATLAB is given in Appendix D. On executing this program the following synaptic weights are obtained:

w11 = 0.7233, w12 = 0.2674, w21 = 1.0764, w22 = -0.9706, v1 = 0.798, v2 = 0.2237

where wij are the synaptic weights of the first layer and vi the synaptic weights of the second layer. The corresponding biases are:

b1 = 0.6312, b2 = 0.166, b3 = -0.5403

Figure 13.5 shows the Network Learning Rate and the Network Error (i.e., the sum of squared errors) as functions of the number of epochs. It is noted that fewer than 100 epochs are required to bring about convergence. This translates into a few seconds of computational time, depending on the platform used. The performance of the closed-loop system with the neural controller compares favorably with that of a conventional two-term controller in Figure 13.6, showing that unconventional control can be as good as conventional control and often superior.



Figure 13.5 Network Errors and Network Learning Rates as functions of epochs


Figure 13.6 Comparison of step responses of the mechatronic system with the conventional controller (continuous line) and the neural controller (crosses)


Example 13.2 Rule-based neural control of a coal mill


Consider a large vertical mill used to pulverize coal at one of Europe's largest cement manufacturers. The pulverized coal is injected into a rotary kiln, where it self-ignites; the kiln produces clinker, the principal constituent of cement (see also Example 9.2). Every cluster of cement kilns has one or more adjoining coal mills that feed them with pulverized coal. The coal mills normally operate discontinuously and on demand. When in operation, a coal mill normally produces more pulverized coal than a kiln can consume, and an intermediate storage silo is therefore used to store the excess production. When the level of coal in the silo drops below some predetermined level, the coal mill is automatically started. Due to the danger of fire arising from the combustibility of the coal, special care must be taken to maintain the coal mill environment inert. Conventional multivariable controllers have not been successful in controlling this process, a task normally left to human operators. The process and the principal control and controlled variables are shown in Figure 13.7. The operator manipulates three principal variables: the raw feed to the coal mill (FEED), the air feed to the mill (HADR) and the pulverized coal returns (RDPR); these decisions are based on measurements of the Differential Pressure in the mill (DP), the EXit Temperature of the pulverized coal (EXT) and the Under-Pressure in the mill (UP). The operator sets the nominal values for both the inputs and the outputs of the controller, and the objective of the controller is to maintain the process in the desired state despite disturbances to the operation of the process from external sources. Human operators are trained to control this process using a set of linguistic rules. It is interesting to note, however, that despite this common training set, human operators develop their own control strategies over time, which often vary from operator to operator.
It is not surprising, therefore, that plant performance and productivity vary accordingly. Operational consistency with human operators is desirable but rarely obtained in practice. Intelligent controllers, in contrast, assure such consistency on a 24-hour basis. In this specific example, a set of 125 rules (5 × 5 × 5) was elicited from human operators, examined for conflicts and consistency, and formed the knowledge from which the controllers were trained.


Five linguistic variables are used for each controller input, sufficient to yield the desired accuracy, whereupon the overall FAM has 3 × 5 × 5 × 5 = 375 rules. In practice this number is unwieldy, and it is simpler to design three independent three-input, single-output sub-controllers which are executed sequentially. Here the sub-controllers have identical causes but different effects, and can consequently be trained independently. Fewer than 125 rules are necessary in practice to achieve satisfactory control. Rule pruning was systematically performed with a view to reducing the number of control rules without loss of controller performance. This led to a reduced training set of 65 rules, an example of which is given below:

R: IF DP is Very_High AND EXT is High AND UP is OK
   THEN FEED is Very_High AND HADR is OK AND RDPR is OK


Figure 13.7 Rule-based neural control of a coal mill

The controller algorithm is resident in the supervisory control system and is executed every 10 seconds. If the resultant incremental control actions are small, indicative of stable operation, they are ignored and the process is maintained in its previous state. Two such neural coal mill controllers have been in continuous operation since 1992 and have consistently yielded an energy demand reduction of approximately 5% and a comparable increase in productivity: food for thought for anyone still doubting the economics of intelligent control!

Chapter 14

Neuro-Fuzzy Control
Neuro-fuzzy controllers constitute a class of hybrid Soft Controllers that fuse fuzzy logic and artificial neural networks. Though the principles of fuzzy logic and artificial neural networks are very different, the two techniques share a common objective: fuzzy logic aims at reproducing the mechanisms of the human cognitive faculty, while neural networks attempt to emulate the human brain at the physiological level. In fuzzy controllers, linguistic rules embody the knowledge of how to control a physical plant; in a neural controller this knowledge is embedded in the structure and the synaptic weights of the network. Feedforward processing in ANNs is analogous to the inference engine in fuzzy logic. Fuzzy controllers use fuzzy compositional rules to arrive at their decisions and require fuzzification of the input variables and defuzzification of the composite output fuzzy set in order to obtain a crisp output from the controller. In contrast, neural controllers use simple arithmetic techniques, operating directly in the physical world. In both cases, current data from the physical plant being controlled are stored in a real-time database and then processed by an appropriate algorithm; only in the manner in which the two techniques arrive at the control action do they differ radically. The ability to generalize, i.e., to extrapolate when faced with a new situation, is a feature common to both. Evolving from very different origins, fuzzy and neural controllers were developed independently by researchers with very different backgrounds and very different objectives. Scant thought was given to the

possibility of combining the two at the time. It did not take much time, however, for control engineers to realize that the operations of a fuzzy controller could be implemented to advantage with artificial neural networks, which, because of their inherent parallelism and their superior computational speed, could lead to controllers capable of significantly higher bandwidth as required in a number of critical situations. Neuro-fuzzy control has been the subject of numerous books and it is beyond the scope of this book to delve in depth into the various architectures that have been proposed. This chapter presents a brief introduction on how the two techniques can be combined and how the fuzzy controller algorithm can be implemented with artificial neural networks.

14.1 Neuro-Fuzzy Controller Architectures


It is logical to examine the fusion of fuzzy logic and ANNs with a view to developing hybrid neuro-fuzzy controllers that possess the best attributes of both techniques, in the hope that this will lead to a superior class of intelligent controller. In principle it should be possible to neuralize a fuzzy controller or to fuzzify a neural controller. It is useful, therefore, to state the principal characteristics of hybrid neuro-fuzzy controllers: they possess an architecture derived from both techniques; they have elements of both fuzzy and neural controllers, each of which performs a separate task; and their design methodology is a combination of the two techniques.

A number of neuro-fuzzy controller architectures have been proposed, each with features that make them suitable for specific applications. Considering fuzzy and neural elements as distinct entities, it is possible to construct a controller structured in layers, some of which are implemented with neural elements and others with fuzzy elements. A fuzzy element can, for instance, act as supervisor to a neural element that controls some conventional industrial three-term controller. The following characteristics of ANNs are useful in implementing fuzzy controllers: they


use a distributed representation of knowledge, are macroscopic estimators, are fault-tolerant and can deal with uncertainty and vagueness.

14.2 Neuro-Fuzzy Isomorphism


The inference mechanism in a fuzzy controller follows a series of systematic steps (the fuzzy algorithm) and in the process justifies its decisions by identifying which rules have been fired and what contribution each rule had on the final decision. There is no similar mechanism in a neural controller, which is consequently unable to justify its decisions. In this sense a neural controller is little more than a black box capable of performing an arbitrary functional mapping of its inputs to its outputs. With the term neuralization we imply the use of ANNs in implementing fuzzy controllers. It is possible, for instance, to represent each linguistic rule with a separate ANN, maintaining thereby an isomorphism (i.e., the same form) between the two techniques. Alternatively, it is possible to represent each linguistic rule with many neurons and synaptic weights using only one ANN. This destroys the isomorphism, but leads to a much simpler implementation. In the first case, the resultant controller is unduly complicated, whereas in the second the resultant controller has a compact structure with few neurons and very few layers. With the term fuzzification, concepts of fuzzy logic are introduced in ANNs in which case the resultant controller will have neural equivalents of the basic fuzzy operators, e.g., min, max, max product and others. The hybrid neuro-fuzzy controller that is described in the following comprises a multi-layered feed-forward network that implements the elements of a fuzzy controller using connectionist techniques. This example of neuralized design shows how it is possible, in principle, to substitute the elements (or building blocks) of a fuzzy controller with neural equivalents. The resultant hybrid neuro-fuzzy controller uses a variety of dissimilar neurons and is certainly not the most practical or economical solution to neuro-fuzzy control. 
The structure of this neuro-fuzzy controller, though impractical, is nevertheless instructive, because it retains the


flow of operations of a fuzzy controller while emphasizing the concept of isomorphism.

Figure 14.1 Multi-layered hybrid neuro-fuzzy controller (for simplicity not all branches are shown)

Figure 14.1 shows a multi-layer neural network of the hybrid neuro-fuzzy controller. It is observed that each layer of the neural network has a fuzzy equivalent. The causes and effects (i.e., inputs and outputs) of the network correspond to the input and output nodes of the network, while the hidden layers perform the intermediate operations on the fuzzy sets while embedding the knowledge base. Every node in the second layer of the network performs a nonlinear mapping of the membership functions of the input variables. The second layer involves a cluster of neurons that have been trained a priori


to map the desired membership functions. The nodes in the third layer perform the same function as the knowledge base in a fuzzy controller, while the connections between the second and third layers correspond to the inference engine in a fuzzy controller. The nodes in the third layer map the membership functions of the output variables. In the fourth and final layer, there is only one node for the output of the network and a node from which the training sets are introduced to the network. The various neurons of each ANN are shown as nodes in Figure 14.1 and have properties which depend on the layer to which they belong. The relation between the inputs and output of an elemental neuron is, as before, simply:

s = Σ (i = 1 to p) wi ui ,   y = f(s)

where y is the output, ui are the inputs, s is the weighted sum, wi are the synaptic weights, and f(s) is the non-linear (compression) function of the elemental neuron. There are p inputs and p+1 synaptic weights for each neuron to account for the bias term. In the first layer, each neuron distorts the weighted sum of the inputs so that it corresponds to the membership function. For example, if the fuzzy sets are Gaussian, then
s = (xi - μij)² / (2 sij²)   and   f(s) = e^-s

where μij are the centers and sij the standard deviations of the membership functions. Here the synaptic weights of the first layer of the network must be equal to the centers of the membership functions, i.e., wij = μij. Using Mamdani's fuzzy compositional rule, for instance, the nodes in the second layer of the controller perform the AND operator (i.e., min), in which s = min(u1, u2, ..., up) and f(s) = s. The neurons in the third layer perform the OR operator required in the final stage of the fuzzy compositional rule. Here the synaptic weights in this layer are unity and


s = Σi ui   and   f(s) = min(1, s).

The fourth layer of the network possesses two sets of nodes. The first set is required for de-fuzzification and in the case where the fuzzy sets of the output are also Gaussian then the synaptic weights of these nodes and the non-linear compression function are defined by

wij = μij sij   and   f(s) = s / Σi sij ui

The second set of nodes in this layer transfers the elements of the training sets to the network in the reverse direction, i.e., from the output to the input of the network; here yi = di. The training of the network is performed in two phases, at the end of which the parameters (μij, sij) of the neurons in the first and third layers are determined. During this phase the network also learns the control rules, storing this knowledge in the synaptic weights of the connections between the second and third layers.
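The four node types just described can be sketched as follows. This Python rendering is illustrative (the function names are assumptions, and only scalar nodes are shown); it assumes Gaussian sets of the form e^-(x - μ)²/(2s²) and the Mamdani min/bounded-sum/centroid operators named above.

```python
import math

def gauss_node(x, mu, sigma):
    """Layer 1: s = (x - mu)^2 / (2 sigma^2), f(s) = e^-s (membership value)."""
    s = (x - mu) ** 2 / (2.0 * sigma ** 2)
    return math.exp(-s)

def and_node(*memberships):
    """Layer 2: AND operator (min) -> degree of fulfillment of one rule."""
    return min(memberships)

def or_node(strengths):
    """Layer 3: OR operator with unit weights, f(s) = min(1, sum of inputs)."""
    return min(1.0, sum(strengths))

def defuzz_node(strengths, centers, sigmas):
    """Layer 4: centroid de-fuzzification over Gaussian output sets,
    y = sum(mu_j s_j u_j) / sum(s_j u_j)."""
    num = sum(m * s * u for m, s, u in zip(centers, sigmas, strengths))
    den = sum(s * u for s, u in zip(sigmas, strengths))
    return num / den if den else 0.0
```

Chaining these nodes (memberships, then min per rule, then combination and de-fuzzification) reproduces the flow of operations of a fuzzy controller with purely connectionist elements; for a single fired rule, defuzz_node simply returns that rule's output center.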

Example 14.1 Design of a simple 2-input 1-output neuro-fuzzy controller


A simple example of a hybrid neuro-fuzzy controller is considered here. The controller has two inputs and one output. It will be assumed that each input is assigned three fuzzy variables: LOw, OK and HIgh. Assume, furthermore, that the fuzzy sets of the input variables are triangular, as shown in Figure 14.2. Each input to the network is fed to three ANNs, each of which has an input-output relationship (i.e., mapping) approximating the corresponding fuzzy set. The output of each ANN thus represents the corresponding membership function μ1i, μ2i and μ3i.


Figure 14.2 Fuzzy sets of the input variables


Furthermore, assume that the knowledge base comprises five rules distributed in control space (i.e., FAM) as shown in the tile diagram of Figure 14.3.

Figure 14.3 Rules in control space



Figure 14.4 The inference mechanism

Given the membership values for each input, the nodes in the second layer compute the degree of fulfillment of each rule, as shown in Figure 14.4. For the given input values, only rules R3 and R5 are fired and the corresponding degrees of fulfillment are 0.2 and 0.5. Figure 14.5 shows the neuro-fuzzy controller with two inputs, five rules and one output. The five rules in the knowledge base are embodied in the second layer. The connections between the nodes in this layer and the next are determined by the rules. Thus, for example, the node corresponding to rule R3 has connections from the nodes representing the fuzzy sets OK and OK respectively while the node for rule R5 is linked to the nodes HI and LO of the first layer.


Figure 14.5 The simple neuro-fuzzy controller with 2-inputs, 1-output and 5 rules

The outputs of the nodes of the second layer are the degrees of fulfillment of each rule. The fuzzy sets of the contributions from each rule that has fired are combined (using the union operator) in the third layer, and finally the fourth layer performs the task of de-fuzzifying the output fuzzy set to yield a crisp output from the neuro-fuzzy controller.
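The inference pass of this example can be sketched in Python. The triangular sets of Figure 14.2 are not dimensioned in the text, so the breakpoints below are assumptions, as are the antecedent pairs of rules R1, R2 and R4; R3 = (OK, OK) and R5 = (HI, LO) are as stated above.

```python
def tri(x, a, b, c):
    """Triangular membership with feet at a and c and peak at b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

# Assumed (illustrative) fuzzy sets for each input on the normalized range
SETS = {"LO": (-1.5, -1.0, 0.0), "OK": (-1.0, 0.0, 1.0), "HI": (0.0, 1.0, 1.5)}

def mu(label, x):
    return tri(x, *SETS[label])

# The five rules of Figure 14.3 as (set for u1, set for u2); R3 and R5 are
# from the text, the remaining antecedent pairs are illustrative assumptions
RULES = {"R1": ("LO", "HI"), "R2": ("HI", "HI"), "R3": ("OK", "OK"),
         "R4": ("LO", "LO"), "R5": ("HI", "LO")}

def fulfillment(u1, u2):
    """Second-layer pass: degree of fulfillment of each rule (min operator)."""
    return {r: min(mu(a, u1), mu(b, u2)) for r, (a, b) in RULES.items()}
```

With these assumed breakpoints, the inputs (u1, u2) = (0.8, -0.5) fire only R3 and R5, with degrees of fulfillment 0.2 and 0.5, the values quoted in the example.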

Chapter 15

Evolutionary Computation
The design of intelligent controllers based on unconventional control techniques will undoubtedly become increasingly common in the near future, and these developments will rely heavily on the use of the stochastic methods of Soft Computing in seeking optimum results. These hybrid methods offer a new and very exciting prospect for Control Engineering, leading to solutions to problems that cannot be solved by conventional analytical or numerical optimization methods. Although stochastic methods of optimization are computer-intensive, the impressive progress in computer hardware over the past decades has led to the ready availability of extremely fast and powerful computers that make stochastic techniques very attractive. One of the ascending techniques of Intelligent Control is the fusion of Fuzzy and Neural Control with Evolutionary Computation. Evolutionary Computation is a generic term for computational methods that use models of biological evolutionary processes for the solution of complex engineering problems. The techniques of Evolutionary Computation have in common the emulation of the natural evolution of individual structures through processes inspired by natural selection and reproduction. These processes depend on the fitness of the individuals to survive and reproduce in a hostile environment. Evolution can be viewed as an optimization process that can be emulated by a computer. Evolutionary Computation is essentially a stochastic search technique with a remarkable ability to locate global solutions.


There has been a dramatic increase in interest in the techniques of Evolutionary Computation since their introduction in the mid-1970s. Many applications of the technique have been reported, including the solution of problems of numerical and combinatorial optimization, the optimum placement of components in VLSI devices, the design of optimum control systems, economics, the modeling of ecological systems, the study of evolutionary phenomena in social systems and machine learning, among others. The idea behind Evolutionary Computation is best explained by the example quoted in Michalewicz (1992): Do what nature does. Let us take rabbits as an example: at any given time there is a population of rabbits. Some of them are faster and smarter than other rabbits. These faster, smarter rabbits are less likely to be eaten by foxes, and therefore more of them survive to do what rabbits do best: make more rabbits. Of course, some of the slower, dumber rabbits will survive just because they are lucky. This surviving population of rabbits starts breeding. The breeding results in a good mixture of rabbit genetic material: some slow rabbits breed with fast rabbits, some fast with fast, some smart rabbits with dumb rabbits, and so on. And on the top of that, nature throws in a wild hare every once in a while by mutating some of the rabbit genetic material. The resulting baby rabbits will (on average) be faster and smarter than those in the original population because more faster, smarter parents survived the foxes... By analogy, in Evolutionary Computation, solutions that maximize some measure of fitness (the criterion or cost function) have a higher probability of participating in the reproduction of new solutions, and it is likely that these new solutions are better than the previous ones. This is a fundamental premise of Evolutionary Computation. Solutions of an optimization problem evolve by following the well-known Darwinian principle of the survival of the fittest.
The basic principles, the principal techniques and operators of Evolutionary Computation are introduced in this chapter and an example illustrates how an Evolutionary Algorithm can be used to determine the global optimum parameters of a complex problem. In Chapter 17 the technique is applied to the design of optimized control systems.

Evolutionary Computation


15.1 Evolutionary Algorithms


Historically, Evolutionary Algorithms were introduced at the end of the 1950s, but due to the lack of readily available fast computers, the whole field did not become known for many years. Initially, Evolutionary Programming was proposed as an alternative method of Artificial Intelligence, while Evolutionary Strategies were designed and used for the solution of difficult optimization problems. The seminal book by Holland entitled Adaptation in Natural and Artificial Systems, published in 1975, laid the foundations of the technique and introduced the Genetic Algorithm, which is the most popular Evolutionary Algorithm. Genetic Algorithms (GAs) derive their name from the genetic processes of natural evolution. They were developed by Holland in the mid-1960s and have been implemented successfully in a broad range of control applications, e.g., the design of neural and fuzzy controllers, the tuning of industrial controllers and the creation of hybrid fuzzy/evolutionary and neural/evolutionary controllers, etc. During the 1980s, the rapid progress in computer technology permitted the use of Evolutionary Algorithms in difficult large-scale optimization problems and the method rapidly diffused into the scientific community. Today, new applications of Evolutionary Algorithms are being reported in large numbers and the field has finally achieved general acceptance. The terminology in the field of Evolutionary Computation is derived from Biology and Genetics. This terminology may cause some confusion for the engineer, who must associate the new terminology with known engineering terms. Here, the terms used will be adapted to make them easier to comprehend. Although Evolutionary Algorithms appear to be extremely simple compared with their biological counterparts, they are, however, sufficiently complicated so as to yield solutions where conventional numerical methods have been known to fail.
Evolutionary Algorithms are a subset of Evolutionary Computation and belong to the generic fields of the Simulated Evolution and Artificial Life. The search for an optimum solution is based on the natural processes of biological evolution and is accomplished in a parallel manner in the parameter search space. The terminology used in Evolutionary Computation is familiar. Thus, candidate solutions of an optimization problem are termed individuals. The population of solutions evolves in accordance with the laws of natural evolution. After initialization, the population undergoes selection, recombination and mutation repeatedly

until some termination condition is satisfied. Each iteration is termed a generation, while the individuals that undergo recombination and mutation are named parents that yield offspring. Selection aims at improving the average quality of the population, giving the individuals with higher quality increased chances for replication in the next generation of solutions. Selection has the feature of focusing the search in promising areas of the parameter search space. The quality of every individual is evaluated by means of a fitness function, which is analogous to an objective function. The assumption that better individuals have increased chances to reproduce even better offspring is based on the fact that there is a strong correlation between the fitness of the parents and that of their offspring. In Genetics this correlation is termed heredity. Through selection, exploitation of the numerical/genetic information is thereby achieved. Through recombination, two parents exchange their characteristics through random partial exchange of their numerical/genetic information. The recombination of the characteristics of two parents of high fitness assumes that if a portion of the numerical/genetic information responsible for high values of fitness recombines with an equivalent portion from the other parent, then the chances that their offspring will have as high or even higher fitness values are correspondingly increased. Recombination is also referred to as Crossover. Likewise, through mutation, an individual undergoes a random change in one of its characteristics, i.e., in a specific section of its structure. Mutation aims at introducing new characteristics to the population that do not necessarily exist in the parents, leading thereby to an increase in the variance of the population.
Exploration of the search space is achieved through the operators of recombination and mutation. The cornerstone of Evolutionary Algorithms is the iterative procedure of exploring the search space while simultaneously exploiting the information that is being accumulated during the search. This is, in fact, where their functionality lies. Through exploration, a systematic sampling of the search space is achieved, while through exploitation the information that has been accumulated during exploration is used to search for new areas of interest in which exploration can be continued. Unlike exploitation, exploration includes random steps. It should be emphasized that random exploration does not mean exploration without direction, since the technique focuses on the most promising directions.

The most common types of Evolutionary Algorithms are:

- Genetic Algorithms
- Evolutionary Strategies
- Evolutionary Programming
- Classifier Systems
- Genetic Programming

The first three are used extensively in optimization problems, while Classifier Systems are used in machine learning. Finally, Genetic Programming is used in the automatic production of computer programs. It is noted that Genetic Programming and Classifier Systems are often considered as special cases of Genetic Algorithms and not as special cases of Evolutionary Algorithms.

15.2 The Optimization Problem


In the generalized optimization problem, the objective is a search for the values of a vector x ∈ M whose cost or objective function f is to be minimized, i.e.:

f(x) → min

The solution of the problem requires determination of the optimum solution vector x*, for which:

∀x ∈ M : f(x*) ≤ f(x)

In practice, the space of feasible solutions is often bounded, i.e., G ⊆ M, by constraint functions of the form gj : M → ℝ, and an analytical solution to the problem is unlikely. Only when f and the gj are particularly simple functions can conventional numerical optimization methods such as linear and non-linear programming be used. Also, optimization problems met in practice often require simplification, resulting thereby in solutions that do not correspond to the original problem. This is one of the principal reasons for the adoption of unconventional stochastic optimization

methods, such as Genetic Algorithms that are not bound by such constraints.

15.3 Evolutionary Optimization


The field of Evolutionary Optimization has generated considerable interest and excitement in the engineering community and is one of the ascending fields of Control Engineering. The technique can yield solutions to optimization problems that cannot be solved otherwise. In Soft Control, in particular, after the initial period of enthusiasm, it became necessary to search for optimum solutions where heuristics and prior knowledge could not be applied or were not always useful. For example, questions concerning the best form of the fuzzy sets to use in a Fuzzy Controller, or the best topology and the number of neurons and layers of the ANN being used in Neural Control, are now being answered using Evolutionary Optimization. The well-known numerical optimization methods of nonlinear programming do not always lead to acceptable solutions in practical problems, often becoming entrapped in local minima instead of yielding global solutions. Evolutionary Algorithms, in combination with local hill-climbing techniques, are in most cases able to locate global optimum solutions and are rapidly superseding classical techniques. What then are the benefits of Evolutionary Computation and what are the reasons for the intense interest in this field? The answer is simple: Evolutionary Algorithms are robust, flexible and adaptable and they can yield global solutions to any problem, whatever the form of the objective function. Conventional numerical optimization methods, in contrast, can yield excellent results in a specific class of problems and fail in all others. The main differences between Evolutionary Algorithms and conventional numerical optimization methods are the following. Evolutionary Algorithms:

- seek the optimum solution by searching a population of points of the search (solution) space in parallel, not a single isolated point,
- do not require derivative information or any other auxiliary knowledge; the direction of search is influenced only by the evaluations of the objective function and of the respective fitness function,
- use stochastic (probabilistic) transition rules and not deterministic rules in the optimization procedure,
- are simple to implement and apply, and
- can yield a population of optimum feasible solutions to a problem and not a unique one. The choice of the best solution is then left to the user. This is very useful in practical problems where multiple solutions exist, as well as in multiobjective optimization problems.

Figure 15.1 Structure of a simple Evolutionary Algorithm

The search process followed by a simple Evolutionary Algorithm for the solution of an optimization problem is shown in the flow chart of Fig. 15.1 and is summarized as follows:

1. a population of initial solutions is created heuristically or randomly,
2. the fitness of every individual-solution is evaluated, using the fitness function, which depends strongly on the corresponding value of the objective function of every candidate solution,
3. the selection operator gives the better solutions improved chances of survival in the next generation,
4. using the recombination operator, two parents, which have been chosen randomly by the selection operator, exchange numerical information according to a pre-defined probability of recombination,
5. using the mutation operator, partial numerical information is perturbed according to a pre-defined probability of mutation,
6. the fitness values of the new population are re-evaluated,
7. if the termination criterion (statistical or temporal) is not satisfied, a return is made to step 3; otherwise the algorithm is terminated, and
8. the best solution from the set of optimum solutions is selected.

If the optimization problem has constraints, then there are two alternative methods of approach: the first uses penalties (solutions which violate the constraints are penalized and their respective fitness function values are reduced) and the second uses a mapping of the candidate solutions with simultaneous use of the exploration operators (recombination and mutation).
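The penalty approach to constraints can be sketched in a few lines of Python; the helper name, the constraint form gj(x) ≤ 0 and the penalty weight are illustrative assumptions, not the book's code. For a minimization objective, adding the weighted violation to the cost is equivalent to reducing the fitness of an infeasible solution.

```python
def penalized_fitness(x, objective, constraints, weight=100.0):
    """Hypothetical helper: the cost of a candidate that violates a
    constraint g_j(x) <= 0 is increased in proportion to the violation."""
    penalty = sum(max(0.0, g(x)) for g in constraints)
    return objective(x) + weight * penalty

# Minimize f(x) = x^2 subject to x >= 1, i.e. g(x) = 1 - x <= 0.
f = lambda x: x * x
g = lambda x: 1.0 - x

feasible = penalized_fitness(1.5, f, [g])    # no violation: just f(x) = 2.25
infeasible = penalized_fitness(0.5, f, [g])  # violation of 0.5 adds 50 to the cost
```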

The pseudo-code of a simple Evolutionary Algorithm is shown below:


k = 0;                          % initialization of the iteration index
init_population P(k);           % initialization of the random population
evaluate P(k);                  % evaluation of the fitness of the initial population
until (done)                    % iterate until the termination criterion is satisfied
    k := k + 1;                 % increment iteration index or epoch
    P' := select_parents P(k);  % selection of the sub-population for the reproduction of the offspring
    recombine P'(k);            % recombination
    mutate P'(k);               % mutation
    evaluate P'(k);             % evaluation of the fitness of the new population
    P := survive P, P'(k);      % selection of survivors
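The pseudo-code above can be turned into a minimal runnable sketch, given here in Python rather than the book's MATLAB. The bit-counting fitness function, the binary-tournament selection and all parameter values are illustrative assumptions, not choices made in the text:

```python
import random

random.seed(1)

L, N, P_CROSS, P_MUT, GENS = 20, 30, 0.8, 0.01, 60

def fitness(ind):                  # stand-in objective: number of 1-bits
    return sum(ind)

def select(pop):                   # binary tournament, one common choice
    a, b = random.sample(pop, 2)
    return a if fitness(a) >= fitness(b) else b

def crossover(a, b):               # one-point crossover with probability P_CROSS
    if random.random() < P_CROSS:
        cut = random.randint(1, L - 1)
        return a[:cut] + b[cut:], b[:cut] + a[cut:]
    return a[:], b[:]

def mutate(ind):                   # flip each bit with probability P_MUT
    return [bit ^ 1 if random.random() < P_MUT else bit for bit in ind]

pop = [[random.randint(0, 1) for _ in range(L)] for _ in range(N)]
for k in range(GENS):
    nxt = []
    while len(nxt) < N:
        c1, c2 = crossover(select(pop), select(pop))
        nxt += [mutate(c1), mutate(c2)]
    pop = nxt[:N]

best = max(pop, key=fitness)       # after a few generations, close to all ones
```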

15.4 Genetic Algorithms


Here, we will limit our interest to the most popular Evolutionary Algorithm, the Genetic Algorithm. As was noted above, Holland proposed Genetic Algorithms in the 1970s, having studied the adaptability of organisms as a natural phenomenon. Since the early 1980s, Genetic Algorithms have been used extensively in optimization problems. Meanwhile, the research efforts during the 1980s and 1990s resulted in the development of many new forms of Genetic Algorithms, which differ significantly from the initial version proposed by Holland. It is beyond the scope of this book to describe these variants and the reader is referred to the Bibliography in Chapter 18 for further study. The differences between Genetic Algorithms, Evolutionary Strategies and Evolutionary Programming lie in the operations that candidate solutions are subjected to during the evolutionary procedure. These differences may be critical in a successful implementation of an Evolutionary Algorithm but in practice they are of lesser importance compared with the properties that they share. It is worthy of note that since the early 1980s each technique has borrowed principles from the others and their differences have tended to disappear. In the design and use of a Genetic Algorithm, the following issues must be considered:

- the creation of the initial population (initialization),
- the representation/mapping of the candidate solutions,
- the evaluation of the fitness of every candidate solution,
- the implementation of the exploration operators, recombination and mutation,
- the selection of the parents for the reproduction of the offspring (selection) and
- the choice of the parameters of the Genetic Algorithm.

The stages of a simple Genetic Algorithm are discussed in some detail below.

15.4.1 Initialization
An initial population of N candidate solutions is created and for every solution/individual/chromosome xi the corresponding objective function value f(xi) is evaluated. Alternative methods for the creation of the initial population, or part of it, are statistical analysis of the search space or heuristic reasoning.

15.4.2 Decoding
The N candidate solutions of the optimization problem are converted into binary strings of length L that are used to represent real numbers as follows:

00...000 = minimum value of the parameter
00...001 = minimum value of the parameter + q·2⁰
00...010 = minimum value of the parameter + q·2¹
11...111 = maximum value of the parameter

where q = (maximum value − minimum value)/(2ᴸ − 1). Clearly the discretization step q specifies the precision of the representation, while the length L of each representation need not be equal for all candidate solutions. When the optimization problem is multidimensional, the partial strings are concatenated as shown in Figure 15.2 in order to create a single binary string. Other mappings that are commonly used are representations with real numbers and the Gray code. By way of example, consider a simple two-dimensional optimization problem whose objective function is quadratic: f(x) = x1² + x2², where the xi are integers with 0 ≤ xi ≤ 7. Here we select the length of the binary string to be L = 3, since 2³ = 8 is sufficient to represent the integer values of the solutions. Thus the string 000 000 represents the solution (0, 0), the string 000 001 the solution (0, 1), the string 111 111 the solution (7, 7), etc. Following accepted terminology, the binary string is named a genotype, the decoded information the phenotype, while every individual solution is a chromosome.
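The decoding rule above can be sketched in a few lines of Python (the helper name is illustrative):

```python
def decode(bits, lo, hi):
    """Map a binary string of length L onto [lo, hi] with discretization
    step q = (hi - lo) / (2**L - 1)."""
    q = (hi - lo) / (2 ** len(bits) - 1)
    return lo + q * int(bits, 2)

# The 3-bit example: integers 0..7, so q = 1.
# A two-dimensional chromosome is the concatenation of the partial strings:
# the string 000 001 decodes to the solution (0, 1).
x1, x2 = decode('000', 0, 7), decode('001', 0, 7)
```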

Fig. 15.2 Creation of the string in a multidimensional problem of optimization

15.4.3 Evaluation of the fitness


Evaluation of the fitness of every candidate solution is fundamental in Genetic Algorithms. When a Genetic Algorithm is used for the optimization of an objective function, the fitness is analogous to the objective function. If the objective function takes positive values only, then the fitness value and the objective function value of an individual are synonymous. If not, then a transformation is necessary to reflect the qualitative difference of the candidate solutions while ensuring that the fitness values are always positive. Usually, we use the relative fitness fi of the individual-solutions, defined as:

fi = Fi / Σ(j=1..N) Fj

where Fi is the (positive) fitness value of the ith individual and N is the population size.
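A minimal sketch of the relative-fitness computation fi = Fi/ΣFj in Python, using the raw fitness values 25, 9, 52 and 13 that appear in the roulette-wheel example of Figure 15.4:

```python
def relative_fitness(raw):
    """Relative fitness f_i = F_i / sum_j F_j (raw values assumed positive)."""
    total = sum(raw)
    return [F / total for F in raw]

rel = relative_fitness([25, 9, 52, 13])   # the population of Figure 15.4
# rel[2] is about 0.5253, i.e. the 52.53% sector of the roulette wheel
```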
15.4.4 Recombination and mutation


The recombination and mutation operators perform the necessary exploration of the search space. Through recombination (or crossover), numerical information (i.e., the genetic material) is exchanged between two random individuals, while through mutation a bit-digit is perturbed and its value is changed. For example, consider the following two six-digit genotypes:

Q1 = 1 0 0 1 1 1   and   Q2 = 0 1 1 0 0 1

The crossover point is chosen randomly and in the example the result of recombination after the third gene (or digit) of the two binary strings is the new pair of strings:

Q1* = 1 0 0 0 0 1   and   Q2* = 0 1 1 1 1 1

In string Q1 the digits to the right of the crossover point are exchanged with those of the second string Q2, while the opposite is done with the digits of the second string. Crossover can be performed at one or more points that are selected randomly. An alternative form of crossover is uniform crossover, where every digit of every offspring has equal probability of being taken from either parent. The frequency with which the crossover operator acts on the candidate solutions depends on some pre-defined crossover probability pcross ∈ [0, 1]. Practical values of the crossover probability are in the range [0.6, 0.95], while techniques have been proposed whereby the search process itself adapts the crossover probability. During mutation, a random bit that is selected with some pre-defined mutation probability pmut has its value flipped. This operation results in an increase in the variability of the population. Mutation is necessary since potentially useful genetic material may be lost at specific locations of the generation during previous operations. For instance, after mutation of the second digit/gene, which is chosen randomly, the string

Q1 = 1 0 0 0 0 1

is transformed into:

Q1* = 1 1 0 0 0 1
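The two operators can be sketched directly on bit strings; the snippet below reproduces the worked example (one-point crossover after the third gene, followed by mutation of the second digit):

```python
def one_point_crossover(a, b, point):
    # exchange the digits to the right of the crossover point
    return a[:point] + b[point:], b[:point] + a[point:]

def mutate_bit(s, i):
    # flip the bit at position i
    return s[:i] + ('1' if s[i] == '0' else '0') + s[i + 1:]

q1, q2 = '100111', '011001'
q1_new, q2_new = one_point_crossover(q1, q2, 3)   # crossover after the 3rd gene
q1_mut = mutate_bit(q1_new, 1)                    # mutate the 2nd digit
```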

Figure 15.3 depicts the operations of crossover and mutation in genotype space and in phenotype space, together with their effect on the objective function and the corresponding fitness function.

15.4.5 Selection
During selection, N individuals are chosen for survival in the next generation according to their fitness values from a population of candidate solutions-individuals. Individuals with high fitness values have an increased probability of survival in the next generation/iteration, compared to those with low fitness values. Many methods of selection have been proposed but here only the popular roulette-wheel method is considered. Consider the surface of a roulette wheel that is divided according to the fitness values, i.e., the angles of every sector of the roulette wheel are set proportional to the fitness. If the relative fitness values are equal (an unlikely situation in practice), then the roulette sectors will have

equal angles, implying that there is an equal probability that the roulette ball will stop in any of the sectors.

Figure 15.3 Representation of the Crossover and Mutation Operators

The hypothetical case of the ball stopping on a dividing line is discounted! In reality, the fitness values are unequal, in which case the sectors of the roulette wheel will also be unequal, implying that the probability of the ball stopping in any given sector increases with the angle of the sector. It is not improbable, however, that the ball will stop in a sector with a small angle. Imagine now that the ball is rolled and finally stops in one of the N sectors of the roulette wheel. The sector where the ball stops defines the chromosome that will undergo evolution. It is obvious that the ball may fall two or more times in a sector which corresponds to a high fitness value, in which case the corresponding solution/chromosome is selected an equal number of times for survival in the next generation. A solution with low fitness may not be selected for survival in the next generation.

A/A   Chromosome   Fitness   % of Total
1     011100          25       25.25%
2     000011           9        9.09%
3     110100          52       52.53%
4     010011          13       13.13%
TOTAL                 99      100%

Figure 15.4 Roulette-wheel selection for the objective function f(x1, x2) = x1² + x2² and a population of N = 4.
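A short simulation in Python (the helper name is illustrative) shows that roulette-wheel selection picks each chromosome with a frequency proportional to its fitness:

```python
import random

random.seed(0)

def spin(fitnesses):
    """One spin of the roulette wheel: returns the index of the selected
    chromosome, with probability proportional to its fitness."""
    r = random.uniform(0, sum(fitnesses))
    acc = 0.0
    for i, F in enumerate(fitnesses):
        acc += F
        if acc >= r:
            return i
    return len(fitnesses) - 1

fit = [25, 9, 52, 13]              # the four chromosomes of Figure 15.4
counts = [0, 0, 0, 0]
for _ in range(10000):
    counts[spin(fit)] += 1
# chromosome 3 (fitness 52, a 52.53% sector) is selected most often
```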

15.4.6 Choice of the parameters of a GA


The main parameters that must be considered in the design of a Genetic Algorithm are the population size N of the solutions-individuals and the values of the probabilities of recombination/crossover and mutation. There are no general rules for selecting the appropriate probabilities but some general guidelines on values that give acceptable results have been established from experience. These guidelines suggest the following:

N ∈ [60, 100], pcross ∈ [0.6, 0.9] and pm ∈ [0.001, 0.01]


It is noted that the probability of recombination refers to the population of N individuals, while the probability of mutation refers to all the digits of the population. It is also worthy of note that Genetic Algorithms have been used for the optimization of the parameters of Genetic Algorithms themselves (i.e., meta-GAs).

Example 15.1 Constrained optimization of a complex function


A difficult objective function is chosen to demonstrate the potential of Genetic Algorithms. Assume that the following objective function must be minimized:

f(x1, x2) = x1² + x2² − 0.3cos(3πx1) − 0.4cos(4πx2) + 0.7

where x1, x2 ∈ [−1, 1]. As shown in Figure 15.5, the function f(x1, x2) has multiple minima, which would be difficult to locate using conventional numerical methods of optimization. Even if the initial starting state in the search space were good, it would be difficult to avoid entrapment in some local optimum. Figure 15.6 shows the two-dimensional contour of f(x1, x2). Note that the minimum of the function is at (0, 0), while the search space is the square with axes (−1, 1) and (−1, 1). In Appendix C a program for the solution of this optimization problem is presented. The program is written in MATLAB and is based on routines that can be found at the Mathworks web site www.mathworks.com. Executing the Genetic Algorithm to optimize the objective function f(x1, x2) with random initial conditions leads to the results depicted in Figures 15.7 and 15.8. Figure 15.7 shows the evolution of the trajectory of the optimum solution in every generation, while Figure 15.8 shows the evolution of the objective function in every generation (i.e., iteration). The upper, middle and lower curves correspond to the worst, average and minimum or optimum solution respectively. It is noted that the Genetic Algorithm implemented in this example follows an elitist strategy, i.e., it preserves the best solution found so far in every generation. Finally, the other parameters of the Genetic Algorithm are: population size = 10, probability of mutation pm = 0.01, probability of crossover pcross = 0.5 and maximum number of generations maxgen = 40. The length of the binary string for every parameter is 12, in which case the concatenated string has 24 digits.
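A quick numerical check of the objective function in Python; the cosine arguments are read here as 3πx1 and 4πx2, the standard form of this multimodal test function, which is an assumption about the typeset equation:

```python
import math

def f(x1, x2):
    return (x1 ** 2 + x2 ** 2
            - 0.3 * math.cos(3 * math.pi * x1)
            - 0.4 * math.cos(4 * math.pi * x2) + 0.7)

# The global minimum is f(0, 0) = 0; a coarse grid search over [-1, 1]^2
# confirms that no grid point does better.
pts = [(-1 + i / 50, -1 + j / 50) for i in range(101) for j in range(101)]
best = min(pts, key=lambda p: f(*p))
```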

Figure 15.5 The objective function f(x1, x2)


Figure 15.6 Contour graph of the objective function f(x1, x2)


Figure 15.7 Evolution of the trajectory of the optimum solution in every generation

Figure 15.8 Evolution of the best/average/worst solution for every epoch


15.5 Design of Intelligent Controllers Using GAs


15.5.1 Fuzzy controllers
The traditional design of fuzzy controllers is based primarily on heuristic techniques. The main problem in fuzzy control design is the difficulty in defining a host of parameters, such as the number and shape of the fuzzy sets of the inputs and outputs, the form of the inference engine and the de-fuzzification mechanism. Their choice has considerable influence on the overall behavior of the controller. Unfortunately, at present the theoretical foundations for determining the optimum values of these parameters do not exist and consequently only experience with similar problems can be used in their design. Often, the result is acceptable, but there is no guarantee that a better solution does not exist. The flexibility of Evolutionary Algorithms is the principal motivating factor for their use in determining the optimum values of the parameters of a fuzzy controller. As was mentioned in previous chapters, the linguistic values of a fuzzy variable are defined by their membership functions. When the membership functions are triangular, the three parameters a, b and c shown in Figure 15.9 uniquely specify the fuzzy set. The parameters of every fuzzy set are encoded into binary strings of sufficient length to give the desired precision. For n fuzzy sets and m variables and encoding with p digits, the total length of the chromosome is clearly nmp digits.

Figure 15.9 Parameters of a triangular fuzzy set
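A sketch of the triangular membership function of Figure 15.9 and of the resulting chromosome length, in Python; the particular values of n, m and p are illustrative assumptions:

```python
def tri_mf(x, a, b, c):
    """Triangular membership function with feet a, c and peak b (Fig. 15.9)."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)   # rising edge
    return (c - x) / (c - b)       # falling edge

# chromosome length n*m*p for, say, n = 5 fuzzy sets per variable,
# m = 2 variables and p = 12 encoding digits
n_sets, m_vars, p_bits = 5, 2, 12
chromosome_length = n_sets * m_vars * p_bits   # 120 digits
```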

If the dynamic behavior of the process is known and there exists a macroscopic model of the process, then we may use a single objective function (or criterion) to evaluate the performance of the closed-loop system to a step disturbance:

J = ∫₀ᵀ t|e(t)| dt    or    J = ∫₀ᵀ e²(t) dt

where 0 ≤ t ≤ T. These are the familiar ITAE and ISE criteria respectively. Alternatively some criterion which involves overshoot, steady-state error and rise time could be used.
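Both criteria are easy to approximate numerically from a sampled error signal; the trapezoidal sketch below in Python (helper names are illustrative) checks itself against the analytic values for e(t) = e^(−t):

```python
import math

def itae(times, errors):
    """Trapezoidal approximation of the ITAE criterion, integral of t*|e(t)|."""
    vals = [t * abs(e) for t, e in zip(times, errors)]
    return sum((vals[i] + vals[i + 1]) / 2 * (times[i + 1] - times[i])
               for i in range(len(times) - 1))

def ise(times, errors):
    """Trapezoidal approximation of the ISE criterion, integral of e(t)^2."""
    vals = [e * e for e in errors]
    return sum((vals[i] + vals[i + 1]) / 2 * (times[i + 1] - times[i])
               for i in range(len(times) - 1))

# decaying error e(t) = exp(-t) sampled on [0, 10];
# analytically ISE = (1 - e^-20)/2 and ITAE = 1 - 11*e^-10
ts = [i * 0.01 for i in range(1001)]
es = [math.exp(-t) for t in ts]
```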

15.5.2 Optimum neural controllers


The performance of a neural controller depends critically on the architecture of the ANN used, i.e., the number of neurons in every layer, the number of layers and the topology of the network, the form of the compression function and the algorithm used to train the network. The determination of these parameters is traditionally based on the knowledge and experience of the designer, so such a design can make no claim to optimality. The determination of the parameters of a neural controller can, however, be transformed into an optimization problem for which Evolutionary Algorithms appear very attractive. Genetic Algorithms are considered very efficient in rapidly finding the approximate optimum solution of an optimization problem, but they are generally slow in finding the precise solution. For better convergence, Genetic Algorithms can be combined with local-search techniques, such as hill climbing, which are ideally suited to finding optimum solutions in the small. It is noted that for an Artificial Neural Network with m layers and ni neurons in the ith layer, the total number of parameters in the optimization problem is:
N = Σ(i=0..m−1) (ni + 1)·ni+1
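The count can be sketched as follows in Python; the 3-10-1 layer sizes are an illustrative assumption:

```python
def num_weights(layers):
    """N = sum over i = 0..m-1 of (n_i + 1) * n_{i+1}: weights plus biases
    of a fully connected feed-forward network with layer sizes n_0 ... n_m."""
    return sum((layers[i] + 1) * layers[i + 1] for i in range(len(layers) - 1))

# e.g. a small 3-10-1 controller network: (3+1)*10 + (10+1)*1 = 51 parameters
N = num_weights([3, 10, 1])
```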

If every weight is decoded into a binary string with L digits, the total length of the chromosome is clearly NL. It is obvious that the use of Genetic Algorithms in problems involving ANNs with thousands of neurons becomes difficult and extremely time-consuming. However, the ANNs used in Neural Controllers usually have a very simple architecture, rarely with more than 30 neurons or more than one hidden layer, in which case GAs are very efficient. The second application of Evolutionary Algorithms in the field of Neural Control has to do with the evolution of the topology of the network (i.e., the manner in which the neurons of the network are interconnected), with or without parallel evolution of the weights of the network. This problem is practically unsolvable with conventional optimization methods. The main characteristic that makes Evolutionary Algorithms attractive for a broad class of optimization problems is their robustness, because:

- they do not require specific knowledge or derivative information of the objective function,
- discontinuities, noise or other unpredictable phenomena have little impact on the performance of the method,
- they work on a population of points of the solution space in parallel, exploring the search space with simultaneous exploitation of the information derived, and they do not become entrapped in local optima,
- they perform well in multidimensional large-scale optimization problems and
- they can be applied to many different optimization problems without big changes in their algorithmic structure.
The main disadvantages of GAs are that they:

- face some difficulties in locating the precise global optimum, although it is easy for them to locate the vicinity in which the global optimum lies, and
- require a great number of evaluations of the objective function and therefore considerable computational power.

With regard to the first disadvantage, many hybrid evolutionary algorithms, which embed local search techniques, have been proposed. Methods like hill-climbing or Simulated Annealing (presented in the next chapter) in combination with Evolutionary Algorithms have been developed in order to determine the exact global optimum and to improve the overall performance of the search algorithm. A notable genetic algorithm is that due to Moed and Saridis for global optimization. Concerning the second disadvantage, the dramatic evolution in computer technology, in combination with the progress in parallel computing machines, is tending to minimize this disadvantage.

Chapter 16

Simulated Annealing
Simulated Annealing, which has much in common with Evolutionary Computation, is a derivative-free stochastic search method for determining the optimum solution in an optimization problem. The method was proposed by Kirkpatrick et al. in 1983 and has since been used extensively to solve large-scale problems of combinatorial optimization, such as the well-known traveling salesman problem (TSP), the design of VLSI circuitry and the design of optimum controllers. The main difference between Evolutionary Computation and Simulated Annealing is that the latter is inspired by the annealing process of metals during cooling, while the former is based on evolutionary processes. The principle of annealing is simple: at high temperatures the molecules in a metal move freely, but as the metal is cooled gradually this movement is reduced and atoms align to form crystals. This crystalline form actually constitutes a state of minimum energy. Metals that are cooled gradually reach a state of minimum energy naturally, while if they are forcibly cooled they reach a polycrystalline or amorphous state whose energy level is significantly higher. Metals that are annealed are pliable, while the latter are brittle. However, even at low temperatures there exists a small but finite probability that the metal will enter a state of higher energy. This implies that it is possible for the metal to leave the state of minimum energy for a new state where the energy is increased. During the cooling process, the intrinsic energy may rise or drop, but as the temperature is lowered the probability that the energy level will increase suddenly is reduced. The probability of a change in the state of the metal at some temperature T from an initial energy level E1 to some other state with energy level E2 is given by:

p = e^(−(E2 − E1)/(κT))   if E2 > E1
p = 1                     otherwise

where κ is Boltzmann's constant. This thermodynamic principle was adapted to numerical analysis by Metropolis et al. in 1953, giving rise to the term Simulated Annealing. Simulated Annealing attempts to minimize energy; this is similar to minimizing a Lyapunov function in modern control theory. In implementing the Metropolis algorithm the following must be known:

- the objective function (by analogy with the energy E of the metal) whose minimum is sought and
- a control parameter T (the simulated temperature) whose temporal strategy defines the changes in the simulated temperature at every iteration of the algorithm.
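The acceptance rule can be sketched in Python as follows; Boltzmann's constant is folded into the temperature, a common simplification in the numerical setting:

```python
import math

def acceptance_probability(e1, e2, T):
    """Probability of moving from energy e1 to e2 at temperature T
    (Boltzmann's constant absorbed into T)."""
    if e2 <= e1:
        return 1.0            # downhill moves are always accepted
    return math.exp(-(e2 - e1) / T)

# uphill moves become ever less likely as the temperature drops
hot = acceptance_probability(1.0, 2.0, 10.0)    # close to 1
cold = acceptance_probability(1.0, 2.0, 0.1)    # close to 0
```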

16.1 The Metropolis Algorithm


Although the analogy between the physical annealing process and Simulated Annealing is far from perfect, there is clearly much in common. All stochastic optimization algorithms attempt to guide the solution from an initial random point to an optimum solution as rapidly as possible. This can lead to entrapment in some local optimum from which it may be difficult, if not impossible, to escape. Simulated Annealing is less prone to this problem, since the technique searches the solution space randomly and, while the simulated temperature is high, can accept solutions that temporarily increase the objective function. For the solution of an optimization problem with Simulated Annealing, the following steps are required:

1. Select an initial random solution vector x1 in the bounded parameter space and compute its objective function φ(x1).
2. Specify an initial temperature T(0) = Tinit.
3. Using some stochastic or heuristic strategy, select a new solution vector x2 and evaluate the corresponding objective function value φ(x2).
4. Compute the difference of the objective function Δφ = φ(x2) - φ(x1).
5. If Δφ < 0, accept the solution vector x2; otherwise, if Δφ > 0, accept the solution vector with the probability of acceptance

   p(k) = exp(-Δφ/T(k))

   and if it is rejected go to step 7.
6. Set x1 = x2 and φ(x1) = φ(x2), and weight the current simulated temperature with the coefficient γ, where 0 < γ < 1, decreasing the simulated temperature successively at every iteration, so that at the (k+1)st iteration T(k+1) = γT(k), where k is the iteration index.
7. If the current simulated temperature is lower than or equal to the final temperature, i.e., T(k) ≤ Tfinal, accept the current solution vector as optimum; otherwise return to step 3 and repeat the process.

If the Simulated Annealing algorithm is to succeed, it is important that the temporal annealing strategy that is followed, i.e., the simulated temperature profile, be suitable. The rate at which the simulated temperature is decreased depends on the weighting coefficient γ. Too high a simulated cooling rate leads to non-minimum-energy solutions, while too low a cooling rate leads to excessively long computation times. The closer the value of γ is to unity, the more slowly the simulated temperature decreases. Figure 16.1 shows the probability of acceptance p(Δφ) as a function of the iteration index for different values of γ. In order to achieve effective exploration of the search space, it is advisable to use 0.95 < γ < 0.98. Finally, as in Evolutionary Computation, the trajectory of an optimization problem is critically dependent on the initial estimates of the optimum solutions, which are heuristic or the result of statistical analysis.
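The steps above can be sketched as a short routine. This is an illustrative sketch, not the author's implementation: the perturbation strategy (uniform steps of size `step`), the function names and the default temperatures are all assumptions.

```python
import math
import random

def simulated_annealing(objective, x0, t_init=10.0, t_final=1e-3,
                        gamma=0.97, step=0.1, seed=None):
    """Minimal Simulated Annealing sketch following steps 1-7 of the text.
    gamma is the cooling coefficient (0 < gamma < 1)."""
    rng = random.Random(seed)
    x, fx = list(x0), objective(x0)     # steps 1-2: initial solution and T
    t = t_init
    while t > t_final:                  # step 7: stop once T(k) <= T_final
        # step 3: perturb the current solution stochastically
        cand = [xi + rng.uniform(-step, step) for xi in x]
        fc = objective(cand)
        dphi = fc - fx                  # step 4: objective difference
        # step 5: always accept improvements; accept deteriorations
        # with probability exp(-dphi/T)
        if dphi < 0 or rng.random() < math.exp(-dphi / t):
            x, fx = cand, fc            # step 6: adopt the new solution
        t *= gamma                      # step 6: geometric cooling schedule
    return x, fx
```

Minimizing, say, φ(x) = x² from x0 = 5 with a fixed seed typically returns a point near the origin; the geometric schedule T(k+1) = γT(k) is what makes the early search exploratory and the late search greedy.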


Figure 16.1 Variation of the acceptance probability p(Δφ) as a function of the iteration index k, for γ = 0.5, 0.8 and 0.95

16.2 Application Examples


Two examples of the application of simulated annealing are presented below. In the first, the two-dimensional parameter optimization problem that was presented in the previous chapter on Evolutionary Computation is used. The second application example shows how Simulated Annealing can be used to advantage to obtain the optimum coefficients of a two-term (PI) industrial controller.


Example 16.1 Constrained optimization of a complex function


The problem statement in this example is identical to that of Example 15.1 in the previous chapter and refers to the problem of finding the parameters x1 and x2 which minimize the multi-peaked objective function:

φ(x1, x2) = x1² + x2² - 0.3 cos(3πx1) - 0.4 cos(4πx2) + 0.7

The parameter search space is bounded by the square x1, x2 ∈ [-1, 1]. Figure 15.5 showed that the objective function has multiple minima in the permissible parameter search space. An example of the trajectory of the solutions in the bounded search space using the simulated annealing algorithm is shown in Figure 16.2, and the corresponding evolution of the objective function φ(x1, x2) is shown in Figure 16.3. During the initial iterations of the algorithm, and particularly while the simulated temperature is high, the trajectory appears random, with large fluctuations. With successive decreases in the simulated temperature, however, the algorithm is seen to converge. Without a systematic decrease of the simulated temperature, the solution at every iteration would be totally random, analogous to the Monte Carlo method, and convergence would be unlikely. Finally, Figures 16.4 and 16.5 show the trajectories of the parameters x1 and x2 as they converge to the null state (0,0), at which the objective function has its minimum. This is the state of minimum energy. Convergence is achieved in approximately 150 iterations. It is noted that convergence is achieved without knowledge of the derivative of the objective function. This is particularly useful when the derivative of the objective function, required by most non-linear programming methods, is difficult or impossible to obtain analytically.
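As a quick check of the minimum-energy state, the objective function can be evaluated at the origin. The π factors in the cosine arguments are an assumption here (the symbols were lost in extraction), consistent with the multiple minima the text reports inside [-1, 1]².

```python
import math

def phi(x1, x2):
    """Multi-peaked objective of Example 16.1 (pi factors in the cosine
    arguments assumed, since the original symbols were garbled)."""
    return (x1 ** 2 + x2 ** 2
            - 0.3 * math.cos(3 * math.pi * x1)
            - 0.4 * math.cos(4 * math.pi * x2)
            + 0.7)

# At the null state both cosines equal 1, so
# phi(0, 0) = -0.3 - 0.4 + 0.7 = 0, the minimum-energy state of the text.
```

Any point away from the origin gives a strictly positive value, which is why the trajectories of x1 and x2 settle at (0, 0).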


Figure 16.2 Trajectory of the solution in the search space (x1, x2)

Figure 16.3 Evolution of the objective function φ(x1, x2) as a function of the iteration index


Figure 16.4 Evolution of x1 as a function of the iteration index

Figure 16.5 Evolution of x2 as a function of the iteration index


Example 16.2 Optimization of a two-term industrial controller


Consider a simple SISO (Single Input - Single Output) plant whose step response, derived from experimental data, is shown in Figure 16.6. The plant is to be controlled by a classic industrial two-term (PI) controller, whose input-output relationship is simply:

u(t) = Kp e(t) + Ki ∫₀ᵗ e(τ) dτ

Here e is the error between the desired and the actual output and u is the control variable. The unknown parameters of the industrial controller are the gains Kp and Ki, whose optimum values are sought. One way of determining the best gain pair is to use classical tuning methods, such as those of Ziegler and Nichols, or modern tuning techniques, such as those of Persson and Astrom (see the Bibliography in chapter 18). These methods (i) assume a simplified dynamic model of the plant and (ii) use heuristics to arrive at the best parameters instead of an analytical error criterion. Here, in contrast, an analytical criterion is used directly and the optimum pair is determined using stochastic techniques.
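A discrete-time sketch of this PI law follows; the rectangular approximation of the integral and the closure-based interface are illustrative choices, not the book's implementation.

```python
def make_pi_controller(kp, ki, dt):
    """Discrete PI law u = Kp*e + Ki * (integral of e), with the integral
    approximated by a running rectangular sum over samples of width dt."""
    state = {"integral": 0.0}

    def control(error):
        state["integral"] += error * dt   # accumulate the error integral
        return kp * error + ki * state["integral"]

    return control
```

For a constant error of 1 with Kp = 2, Ki = 0.5 and dt = 0.1, the output starts at Kp·e and then grows by Ki·e·dt per sample as the integral term builds up.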

Figure 16.6 Discrete-time step response of the controlled process


Using the ITAE criterion, it is desired to obtain the values of the parameter pair (Kp, Ki) which minimize the objective function (i.e., the error criterion):

φ(Kp, Ki) = ITAE = ∫₀ᵀ t |e(t)| dt

where T → ∞. An example of the evolution of the criterion as a function of the iteration index is depicted in Figure 16.7. Convergence is achieved in about 120 epochs (iterations).
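In practice the ITAE integral is approximated from a sampled error signal; the trapezoidal rule below is an assumed discretization, with T taken as the last sample instant.

```python
def itae(times, errors):
    """Approximate ITAE = integral of t*|e(t)| dt with the trapezoidal
    rule over sampled (t, e) data; a sketch, not the book's routine."""
    total = 0.0
    for k in range(1, len(times)):
        f0 = times[k - 1] * abs(errors[k - 1])   # integrand t*|e| at t[k-1]
        f1 = times[k] * abs(errors[k])           # integrand at t[k]
        total += 0.5 * (f0 + f1) * (times[k] - times[k - 1])
    return total
```

The time weighting t penalizes errors that persist late in the response, which is why ITAE tuning favors well-damped, fast-settling closed loops.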

Figure 16.7 Evolution of the objective function

Figure 16.8 Evolution of the parameters of the industrial controller


Figure 16.8 shows the evolution of the parameter pair (Kp,Ki) of the industrial controller. It is noted that convergence is achieved, whatever the initial values of the unknown parameters. Finally, the step response of the closed-loop system is shown in Figure 16.9. This response must be compared with that of Figure 14.6 for the non-optimized neural controller. A cursory glance will confirm that the response of this system is superior.

Figure 16.9 Step response of the closed-loop system with minimum ITAE

Chapter 17

Evolutionary Design of Controllers


The criteria used in the design of control systems, better known as objective or cost functions, are normally formulated as analytical functions. In practice, multiple engineering objectives are often difficult to express in closed analytical form, and great effort is spent trying to find suitable expressions that result in acceptable system performance. In industrial three-term controller design, for instance, the optimum parameters of the controller must satisfy not only specific closed-system specifications but may also involve economic, ecological, production and other engineering factors. These multiple objectives are difficult to formulate analytically, are often conflicting, and may be impossible to satisfy simultaneously. Controller design with multiple objectives can be attempted using a composite objective function in which the various objectives are assigned weights or penalties, so that the design problem is transformed into a nonlinear optimization problem. Nonlinear programming is the traditional method for obtaining numerical solutions to this problem, but there is no guarantee that the procedure will reach a global optimum. The problem is compounded if the parameter search space is bounded. Stochastic optimization using Evolutionary Programming or Simulated Annealing is an attractive alternative when numerical solutions are required, and significant inroads in its use have been made in recent years.


Stochastic optimization with a qualitative measure of the performance of the closed system, instead of a quantitative one, offers the Control Engineer distinct advantages in controller design, since he is able to relate to the design problem directly in linguistic terms rather than through some abstract analytical formulation.

17.1 Qualitative Fitness Function


Evolutionary Algorithms are used mainly in optimization problems, especially constrained problems in which the objective function is complicated. This class of problems is usually difficult to solve with conventional numerical methods. In the stochastic approach using evolutionary programming, the multiple objective function is transformed into a composite fitness function, which then constitutes the driving force of the evolutionary search procedure. Human knowledge and engineering objectives can normally be expressed in qualitative terms when a quantitative formulation is not possible. Considering the flexibility of Evolutionary Computation, it is evident that comparison of the candidate solutions can be performed using fuzzy linguistic rules. The formulation of the optimization problem in such a way as to use linguistic objectives is what distinguishes the stochastic technique using Evolutionary Algorithms with fuzzy fitness criteria from conventional forms of Evolutionary Algorithms. In the qualitative design technique, fitness is expressed in terms of fuzzy linguistic rules that are processed using an inference mechanism whose outcome is de-fuzzified to yield crisp values for the fitness. It is recalled that in the classical Evolutionary Algorithms described in Chapter 15, the fitness values are computed from an analytical objective function. In the generalized optimization problem, the objective is a search for the values of a vector x ∈ M whose objective function is minimized, i.e.,

φ(x) → min

The function φ(x) expresses the engineering objectives in functional form and is some measure of the behavior of the system being studied. In the stochastic design technique, a Genetic Algorithm is used to search for the optimum values of the unknown vector x ∈ M, in the same manner


that was described in Chapter 15. The main difference here is that the evaluation of fitness does not follow the classical approaches, but is derived from a set of linguistic rules that express the multiple engineering objectives of the problem. The consequent of each fuzzy rule is taken as the objective function φ, which can take m fuzzy linguistic values from the set:

w = {w1, w2, ..., wm}

Assume also that a = {a1, a2, ..., ak} is the set of antecedent variables and that vi = {vi1, vi2, ..., viq} is the set of q fuzzy values of the variable ai. Then the linguistic rules that specify the fitness function in the Evolutionary Algorithm take the form:

R: IF a1 is v1 AND a2 is v2 AND ... AND ak is vk THEN φ is w

The total number of design rules is thus the product of the numbers of fuzzy values of the antecedent variables, i.e., q1·q2·...·qk rules when the variable ai takes qi fuzzy values.

17.2 Controller Suitability


The knowledge about controlling a system exists in the form of linguistic rules whose antecedents (i.e., controller attributes) and consequents (i.e., controller suitability) uniquely specify a particular design. What are required, therefore, are rules that describe, in linguistic terms, how the controller attributes affect the performance of the closed system. Once this is done, the stochastic design technique reduces to one of using fuzzy reasoning to infer the suitability of any design. De-fuzzification of the membership function of the suitability yields a crisp value that is subsequently used as the fitness measure in the stochastic optimization procedure. A Genetic Algorithm is then used to determine the global optima of the parameters of the controller.


The controller attributes are described by fuzzy sets, three to five being sufficient for most practical purposes. These fuzzy sets describe the desired attributes of the design. The specifications normally used in designing industrial controllers are overshoot, rise time (i.e., the time for the closed-loop response to first reach some specified percentage of its final value) and settling time (i.e., the time after which the closed-loop response remains within some specified band about its final value). More design specifications can be added as necessary, e.g., steady-state error and the maximum permissible control action, in which case the complexity of the computational problem increases proportionally. Consider, for example, the fuzzy sets for Rise_Time, Overshoot and Settling_Time to be Small, Medium and Large, while the fuzzy sets of the resultant Fitness are Very_Small, Small, Negative_Medium, Medium, Positive_Medium, Large and Very_Large. A sample of suitability rules can therefore be stated as follows:

R1: IF (Rise_Time is Small) AND (Overshoot is Small) AND (Settling_Time is Small) THEN (Fitness is Very_Large)
R2: IF (Rise_Time is Medium) AND (Overshoot is Small) AND (Settling_Time is Small) THEN (Fitness is Large)
R3: IF (Rise_Time is Medium) AND (Overshoot is Medium) AND (Settling_Time is Large) THEN (Fitness is Negative_Medium)
R4: IF (Rise_Time is Large) AND (Overshoot is Medium) AND (Settling_Time is Large) THEN (Fitness is Small)
R5: IF (Rise_Time is Large) AND (Overshoot is Large) AND (Settling_Time is Large) THEN (Fitness is Very_Small)

Examples of membership functions that can be used in the qualitative design technique are shown in Figure 17.1. Assuming that the settling time of the closed system remains constant, the variation of the fitness with rise time and overshoot is shown in Figure 17.2.

The complete set of linguistic rules, containing 3³ = 27 rules, constitutes the rule-base of the design procedure and is given in the Table that follows. These rules may be modified to satisfy any control objective, and MATLAB and its Fuzzy Logic Toolbox can be used to implement the design technique.


Figure 17.1 Fuzzy sets used in the qualitative design technique: (a) Rise_Time, (b) Overshoot, (c) Settling_Time


Figure 17.1 (continued) Fuzzy sets used in the qualitative design technique: (d) Fitness

Figure 17.2 The fitness surface


1. IF (Rise_Time is Small) AND (Overshoot is Small) AND (Settling_Time is Small) THEN (Fitness is Very_Large)
2. IF (Rise_Time is Small) AND (Overshoot is Small) AND (Settling_Time is Medium) THEN (Fitness is Large)
3. IF (Rise_Time is Small) AND (Overshoot is Small) AND (Settling_Time is Large) THEN (Fitness is Positive_Medium)
4. IF (Rise_Time is Small) AND (Overshoot is Medium) AND (Settling_Time is Small) THEN (Fitness is Large)
5. IF (Rise_Time is Small) AND (Overshoot is Medium) AND (Settling_Time is Medium) THEN (Fitness is Positive_Medium)
6. IF (Rise_Time is Small) AND (Overshoot is Medium) AND (Settling_Time is Large) THEN (Fitness is Medium)
7. IF (Rise_Time is Small) AND (Overshoot is Large) AND (Settling_Time is Small) THEN (Fitness is Positive_Medium)
8. IF (Rise_Time is Small) AND (Overshoot is Large) AND (Settling_Time is Medium) THEN (Fitness is Medium)
9. IF (Rise_Time is Small) AND (Overshoot is Large) AND (Settling_Time is Large) THEN (Fitness is Negative_Medium)
10. IF (Rise_Time is Medium) AND (Overshoot is Small) AND (Settling_Time is Small) THEN (Fitness is Large)
11. IF (Rise_Time is Medium) AND (Overshoot is Small) AND (Settling_Time is Medium) THEN (Fitness is Positive_Medium)
12. IF (Rise_Time is Medium) AND (Overshoot is Small) AND (Settling_Time is Large) THEN (Fitness is Medium)
13. IF (Rise_Time is Medium) AND (Overshoot is Medium) AND (Settling_Time is Small) THEN (Fitness is Positive_Medium)
14. IF (Rise_Time is Medium) AND (Overshoot is Medium) AND (Settling_Time is Medium) THEN (Fitness is Medium)
15. IF (Rise_Time is Medium) AND (Overshoot is Medium) AND (Settling_Time is Large) THEN (Fitness is Negative_Medium)
16. IF (Rise_Time is Medium) AND (Overshoot is Large) AND (Settling_Time is Small) THEN (Fitness is Medium)
17. IF (Rise_Time is Medium) AND (Overshoot is Large) AND (Settling_Time is Medium) THEN (Fitness is Negative_Medium)
18. IF (Rise_Time is Medium) AND (Overshoot is Large) AND (Settling_Time is Large) THEN (Fitness is Small)
19. IF (Rise_Time is Large) AND (Overshoot is Small) AND (Settling_Time is Small) THEN (Fitness is Positive_Medium)
20. IF (Rise_Time is Large) AND (Overshoot is Small) AND (Settling_Time is Medium) THEN (Fitness is Medium)
21. IF (Rise_Time is Large) AND (Overshoot is Small) AND (Settling_Time is Large) THEN (Fitness is Negative_Medium)
22. IF (Rise_Time is Large) AND (Overshoot is Medium) AND (Settling_Time is Small) THEN (Fitness is Medium)
23. IF (Rise_Time is Large) AND (Overshoot is Medium) AND (Settling_Time is Medium) THEN (Fitness is Negative_Medium)
24. IF (Rise_Time is Large) AND (Overshoot is Medium) AND (Settling_Time is Large) THEN (Fitness is Small)
25. IF (Rise_Time is Large) AND (Overshoot is Large) AND (Settling_Time is Small) THEN (Fitness is Negative_Medium)
26. IF (Rise_Time is Large) AND (Overshoot is Large) AND (Settling_Time is Medium) THEN (Fitness is Small)
27. IF (Rise_Time is Large) AND (Overshoot is Large) AND (Settling_Time is Large) THEN (Fitness is Very_Small)

Rule-Base for the qualitative evaluation of controller fitness using Rise_Time, Overshoot and Settling_Time.
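A minimal sketch of how such a rule base can yield a crisp fitness value is given below. For brevity it uses crisp singleton consequents and a firing-strength weighted average, a Sugeno-style stand-in for the Mamdani min-max inference and COG de-fuzzification used in the text; the normalized universes, triangle parameters and singleton values are illustrative assumptions.

```python
def tri(x, a, b, c):
    """Triangular membership with peak at b and support (a, c)."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

# Hypothetical normalized universes: each attribute scaled to [0, 1],
# with Small/Medium/Large triangles; the fitness labels are mapped to
# crisp singletons standing in for the Gaussian sets of Figure 17.1.
SETS = {"Small": (-0.5, 0.0, 0.5),
        "Medium": (0.0, 0.5, 1.0),
        "Large": (0.5, 1.0, 1.5)}
FITNESS = {"Very_Small": 0.0, "Small": 0.15, "Negative_Medium": 0.3,
           "Medium": 0.5, "Positive_Medium": 0.7, "Large": 0.85,
           "Very_Large": 1.0}

def fitness(rise, overshoot, settling, rules):
    """Evaluate a rule base of ((rise_set, over_set, settle_set), label)
    pairs: AND by min, aggregation by weighted average of singletons."""
    num = den = 0.0
    for (r_set, o_set, s_set), label in rules:
        w = min(tri(rise, *SETS[r_set]),
                tri(overshoot, *SETS[o_set]),
                tri(settling, *SETS[s_set]))
        num += w * FITNESS[label]
        den += w
    return num / den if den else 0.0
```

With only the best-corner and worst-corner rules of the table, an ideal response (all attributes Small) scores 1.0 and the worst response scores 0.0, exactly the monotone behavior the fitness surface of Figure 17.2 displays.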


Example 17.1 Design of an optimum two-term controller for temperature control of a greenhouse
The qualitative controller design technique is applied to the design of a two-term (PI) controller to control the environment inside a greenhouse. Following a step demand in the reference temperature Tref, the temperature T at some point in the greenhouse, shown schematically in Figure 17.3, exhibits the characteristic dead time followed by an exponential rise shown in Figure 17.4.

Figure 17.3 Schematic of the controlled greenhouse

The proposed technique is model-free, and no attempt is made to obtain a low-order approximant of the plant in order to tune the closed system; it is sufficient to know only the continuous or discrete step response. The design objective is to find the optimum parameters (Kp, Ki)* of a two-term controller that lead to a closed-system step response with a nominal rise time Trise = 26 units, a nominal overshoot of p = 10% and a nominal settling time Ts = 20 units. Instead of defining some quantitative criterion with penalty functions, qualitative measures are used to describe the desired attributes of the closed system. Each attribute is therefore assigned a fuzzy variable, and together these define the suitability of the overall system.


The fuzzy sets of the three attributes that form the inputs to the inference engine are assumed to be triangular, while the fuzzy sets of the suitability function are Gaussian; both are shown in Figure 17.1. In deriving the control surface, use is made of the Mamdani min-max compositional inference rule, and de-fuzzification uses the Center of Gravity (COG) method. A simple Genetic Algorithm, which follows an elitist strategy, is finally used to obtain the global optima of the controller parameters. It is observed that fitness decreases with increasing overshoot and increasing rise time.
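For a membership function sampled on a discrete universe, Center-of-Gravity de-fuzzification reduces to a weighted average; a minimal sketch:

```python
def cog(universe, membership):
    """Center-of-Gravity de-fuzzification of a sampled membership
    function: sum(x * mu(x)) / sum(mu(x)); returns 0 if nothing fires."""
    num = sum(x * m for x, m in zip(universe, membership))
    den = sum(membership)
    return num / den if den else 0.0
```

A symmetric aggregated membership function de-fuzzifies to its center, which is the property that makes COG a natural choice for turning the inferred suitability into a crisp fitness value.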

Figure 17.4 Normalized step response of the greenhouse

The step response of the optimum system designed with the proposed hybrid Evolutionary - Fuzzy (E-F) technique is compared in Figure 17.5 with that of the same controller designed for minimum ITAE. It is evident that the response of the E-F design is superior, being better damped, and that it reaches the steady state faster than the ITAE design. It is interesting to note that the E-F design has a higher ITAE index than the ITAE design itself. That the E-F design is superior to the ITAE design is no surprise, since a multiple criterion was used in the former.


Figure 17.5 Comparison of closed-system step responses for the optimum E-F and minimum ITAE controllers

Example 17.2 Design of an optimum neural controller for a lathe cutting process
A rule-based neural controller for a cutting lathe was described earlier in chapter 16, and a schematic of the lathe cutting process was shown in Figure 16.7. The response of the process to a small step demand in feed rate was shown in Figure 16.8. The objective in this case study is a controller with a rise time less than some specified value Trise, an overshoot that does not exceed p% of the steady-state value, and a settling time Tset less than some specified value. These design objectives can be achieved by (i) the proper choice of the control rules, (ii) the inference mechanism and (iii) optimization of the free parameters of the controller. Assume that the first part of the controller design procedure, which involves rule elicitation and rule-encoding, has been completed and that a suitable neural network of specified topology has been designed and trained to generate the desired control surface, as in chapter 16. Our concern here is the second part of the design, i.e., the optimization of the free controller parameters.


Twenty-seven rules relating the design attributes were necessary to specify the desired properties of the closed system completely; they are displayed in the rule-base table of the previous section. The fuzzy sets of the three controller design attributes constitute the inputs to the inference engine and are assumed to be triangular, while the fuzzy set for the fitness function is taken as Gaussian, as shown in Figure 17.1.

Figure 17.6 Step responses of the closed system for the optimum E-F and minimum ITAE controllers

The step response of the optimum controller designed with the qualitative Evolutionary - Fuzzy (E-F) design technique is compared with that of the same controller designed for minimum ITAE in Figure 17.6. The response of the qualitative design is superior, being better damped and faster in reaching the steady state than the ITAE design, presumably because a multi-objective criterion has been used.

Chapter 18

Bibliography
A. Computational Intelligence
Eberhart R. C., Dobbins R. C. and Simpson P. K. (1996) : Computational Intelligence PC Tools, AP Professional.
Kaynak O. (1998) : Computational Intelligence: Soft Computing and Fuzzy-Neuro Integration with Applications, Springer-Verlag, Berlin.
Palaniswami M., Attikiouzel Y. and Marks R. (Eds) (1996) : Computational Intelligence - a dynamic systems perspective, IEEE Press, NY.
Pedrycz W. (1997) : Computational Intelligence - an Introduction, CRC Press.
Poole D., Mackworth A. and Goebel R. (1998) : Computational Intelligence, Oxford University Press, Oxford.
Reusch B. (Ed.) (1999) : Computational Intelligence: Theory and Applications, Springer-Verlag, Berlin.
Tzafestas S. G. (Ed.) (1999) : Computational Intelligence in Systems and Control: Design and Applications, Kluwer Academic Publishers, Hingham, Ma.


B. Intelligent Systems
Antsaklis P. J., Passino K. M. and Wang S. J. (1989) : Towards intelligent autonomous control systems: architecture and fundamental issues, Journal of Intelligent and Robotic Systems, Vol. 1, pp. 315-342.
Bernard J. (1988) : Use of rule-based systems for process control, IEEE Control Systems Magazine, Vol. 8, No. 5, pp. 3-13.
Bigger C. J. and Coupland J. W. (1982) : Expert Systems: a bibliography, IEE Publications, London.
Chiu S. (1997) : Developing Commercial Applications of Intelligent Control, IEEE Control Systems Magazine, April, pp. 94-97.
Francis J. C. and Leitch R. R. (1985) : Intelligent knowledge based process control, Proc. IEE Conference on Control, London.
Harris C. J., Moore C. G. and Brown M. (1993) : Intelligent Control, Aspects of Fuzzy Logic and Neural Nets, World Scientific, Singapore.
Saridis G. N. (1979) : Towards the realization of intelligent control, Proc. IEEE, Vol. 67, No. 8, pp. 1115-1133.
Saridis G. N. and Valavanis K. P. (1988) : Analytical design of intelligent machines, Automatica, Vol. 24, No. 2, pp. 123-133.
Saridis G. N. (1996) : On the theory of intelligent machines: a comprehensive analysis, Int. Journal of Intelligent Control and Systems, Vol. 1, No. 1, pp. 3-14.
Tzafestas S. G. (Ed.) (1993) : Expert Systems in Engineering Applications, Springer-Verlag, Berlin.
Tzafestas S. G. (Ed.) (1997) : Methods and Applications of Intelligent Control, Kluwer Academic Publishers, Hingham, Ma.
Tzafestas S. G. (Ed.) (1997) : Knowledge Based Systems Control, World Scientific, Singapore.
Valavanis K. and Saridis G. N. (1992) : Intelligent robotic system theory: design and applications, Kluwer Academic Publishers, Hingham, Ma.


C. Fuzzy Logic and Fuzzy Control


Astrom K. J., Anton J. J. and Arzen K.-E. (1986) : Fuzzy Control, Automatica, Vol. 22, No. 3.
Berkan R. C. (1997) : Fuzzy systems design principles, IEEE Press, NY.
Boverie S., Demaya B. and Titli A. (1991) : Fuzzy logic control compared with other automatic control approaches, Proc. 30th CDC Conference, Brighton.
Chen C-L., Chen P-C. and Chen C-K. (1993) : Analysis and design of fuzzy control systems, Fuzzy Sets and Systems, Vol. 57, pp. 125-140.
He S.-Z., Tan S., Xu F.-L. and Wang P.-Z. (1993) : Fuzzy self-tuning of PID Controllers, Fuzzy Sets and Systems, Vol. 56, pp. 37-46.
Jamshidi M., Vadiee N. and Ross T. J. (Eds.) (1994) : Fuzzy Logic and Control, Prentice Hall, NY.
Jamshidi M., Titli A., Zadeh L. A. and Boverie S. (Eds.) : Applications of Fuzzy Logic, Prentice Hall, Upper Saddle River, NJ.
Kickert W. J. M. and Mamdani E. H. (1978) : Analysis of fuzzy logic controller, Fuzzy Sets and Systems, Vol. 1, pp. 29-44.
King P. J. and Mamdani E. H. (1975) : The application of fuzzy control systems to industrial processes, Proc. IFAC World Congress, MIT Press, Boston.
King R. E. and Karonis F. C. (1986) : Rule-based systems in the process industry, Proc. 25th IEEE CDC, Athens.
King R. E. and Karonis F. C. (1988) : Multi-layer expert control of a large-scale industrial process, Chapter in Fuzzy Computing, Gupta M. M. and Yamakawa T. (Eds.), Elsevier Science Publishers, NY.
King R. E. (1992) : Expert supervision and control of a large-scale plant, J. Intelligent and Robotic Systems, Vol. 2, No. 3.
King R. E. (1996) : Synergistic fuzzy control, Chapter in Applications of Fuzzy Logic, Jamshidi M., Titli A., Zadeh L. A. and Boverie S. (Eds.), Prentice Hall, Upper Saddle River, NJ.
Klir G. J. and Yuan B. (1995) : Fuzzy Sets and Fuzzy Logic - Theory and Applications, Prentice Hall, NY.
Klir G. J., St Clair U. and Yuan B. (1997) : Fuzzy Set Theory - Foundations and Applications, Prentice Hall, NY.
Kosko B. (1994) : Fuzzy Thinking, Hyperion.
Kosko B. (1996) : Fuzzy Engineering, Prentice Hall, NY.


Larsen P. M. (1980) : Industrial applications of fuzzy control, Int. Journal of Man-Machine Studies, Vol. 12 (one of the first publications on applications of fuzzy control).
Lee C. C. (1990a) : Fuzzy logic control systems: fuzzy logic controller - Part I, IEEE Transactions on Systems, Man and Cybernetics, Vol. SMC-20, No. 2, pp. 404-418 (contains an extensive bibliography on fuzzy control).
Lee C. C. (1990b) : Fuzzy logic control systems: fuzzy logic controller - Part II, IEEE Transactions on Systems, Man and Cybernetics, Vol. SMC-20, No. 2, pp. 419-435.
Li Y. and Lau C. (1989) : Development of fuzzy algorithms for fuzzy systems, IEEE Control Systems Magazine, Vol. 9, No. 3, pp. 65-77.
Mamdani E. H. and Gaines B. R. (Eds) (1981) : Fuzzy Reasoning and its Applications, Academic Press, NY.
Mamdani E. H. (1974) : An application of fuzzy algorithms for the control of a dynamic plant, Proc. IEE, Vol. 121, No. 12.
Mendel J. M. (1995) : Fuzzy logic systems for Engineering: a tutorial, Proceedings IEEE, Vol. 83, No. 3, Mar 1995, pp. 345-377 (contains an extensive list of references on Fuzzy Logic).
Negoita C. V. (1981) : Fuzzy Systems, Gordon and Brown.
Negoita C. V. (1985) : Expert systems and fuzzy systems, Benjamin Cummings Publishing Co.
Mlynek D. M. and Patyra M. J. (Eds.) (1996) : Fuzzy logic implementation and application, J. Wiley, NY.
Pedrycz W. (1989) : Fuzzy control and fuzzy systems, J. Wiley and Sons, NY.
Ross T. J. (1995) : Fuzzy logic with engineering applications, McGraw Hill, NY.
Rutherford D. and Bloore G. (1976) : The implementation of fuzzy algorithms for control, Proc. IEEE, Vol. 64, No. 4, pp. 572-573.
Sugeno M. (1985) : Industrial Applications of Fuzzy Control, Elsevier Science Publishers, North Holland.
Sugeno M. and Kang G. T. (1986) : Fuzzy modelling and control of multilayer incinerator, Fuzzy Sets and Systems, Vol. 18, pp. 329-346.
Sugeno M. and Yasukawa T. (1993) : A fuzzy-logic-based approach to qualitative modeling, IEEE Trans. on Fuzzy Systems, Vol. 1, No. 1, pp. 7-31.


Takagi T. and Sugeno M. (1985) : Fuzzy identification of systems and its application to modeling and control, IEEE Trans. on Systems, Man and Cybernetics, Vol. SMC-15, pp. 116-132.
Terano T., Asai K. and Sugeno M. (1989) : Applied fuzzy systems, Academic Press, NY.
Tzafestas S. G. and Venetsanopoulos A. N. (Eds) (1994) : Fuzzy Reasoning in Information, Decision and Control Systems, Kluwer Academic Publishers, Hingham, Ma.
Wang P. P. and Tyan C-Y. (1994) : Fuzzy dynamic system and fuzzy linguistic controller classification, Automatica, Vol. 30, No. 11, pp. 1769-1774.
Wang L-X. (1994) : Adaptive fuzzy systems and control - design and stability analysis, Prentice Hall, NY.
Yen J., Langari R. and Zadeh L. A. (Eds.) (1995) : Industrial applications of fuzzy logic and intelligent systems, IEEE Press, NY.
Zadeh L. A. (1965) : Fuzzy Sets, Information and Control, Vol. 8, pp. 338-353 (the definitive paper which laid out the foundations of Fuzzy Logic).
Zadeh L. A. (1972) : A rationale for fuzzy control, Trans. ASME, Journal of Dynamic Systems and Control, Vol. G-94, pp. 3-4.
Zadeh L. A. (1973) : Outline of a new approach to the analysis of complex systems and decision processes, IEEE Trans. on Systems, Man and Cybernetics, Vol. SMC-3, No. 1, pp. 28-44.
Zadeh L. A. (1988) : Fuzzy logic, IEEE Computer, Vol. 21, No. 4, pp. 83-93.
Zimmermann H. J. (1996) : Fuzzy set theory and its applications, Kluwer Academic Publishers, Hingham, Ma.

D. Fuzzy Logic and Neural Networks


Chen C. H. (Ed.) (1995) : Fuzzy logic and neural network handbook, IEEE Press, NY.
Ishibuchi H., Fujioka R. and Tanaka H. (1993) : Neural networks that learn from fuzzy if-then rules, IEEE Trans. on Fuzzy Systems, Vol. 1, No. 2, pp. 85-97.
Kartalopoulos S. V. (1996) : Understanding neural networks and fuzzy logic, IEEE Press, NY.


Chapter 18

Kosko B. (1992) : Neural Networks and Fuzzy Systems: a dynamic systems approach to machine intelligence, Prentice Hall, NY.
Yager R. R. (1992) : Implementing fuzzy logic controllers using a neural network framework, Fuzzy Sets and Systems, Vol. 48, pp. 53-64.

E. Artificial Neural Networks


Agarwal M. (1997) : A systematic classification of neural-network-based control, IEEE Control Systems Magazine, April, pp. 75-93. (contains some 170 references on neural control)
Dayhoff J. (1990) : Neural Network Architectures, Van Nostrand Reinhold, NY.
Haykin S. (1994) : Neural Networks, IEEE Press, NY.
Hebb D. (1988) : The organization of behavior, in Neurocomputing: Foundations and Research (Eds. Anderson A. and Rosenfeld E.), MIT Press, Boston, Mass.
Marcos S., Macchi O., Vignat C., Dreyfus G., Personnaz L. and Roussel-Ragot P. (1992) : A unified framework for gradient algorithms used for filter adaptation and neural network training, Int. Journal of Circuit Theory and Applications, Vol. 20, pp. 159-200.
McCulloch W. and Pitts W. (1943) : A logical calculus of the ideas immanent in nervous activity, Bulletin of Mathematical Biophysics, Vol. 5.
Minsky M. and Papert S. (1969) : Perceptrons: an introduction to computational geometry, MIT Press, Boston.
Nahas E. P., Henson M. A. and Seborg D. E. (1992) : Nonlinear internal model control strategy for neural network models, Computers in Chemical Engineering, Vol. 16, No. 12, pp. 1039-1057.
Narendra K. S. and Mukhopadhyay S. (1992) : Intelligent control using neural networks, IEEE Control Systems Magazine, Apr., pp. 11-18.
Patterson D. W. (1996) : Artificial Neural Networks - Theory through Backpropagation, Prentice Hall, NY.
Rosenblatt F. (1958) : The Perceptron: a probabilistic model for information storage and organization in the brain, Psychological Review, Vol. 65. (the definitive work on the Perceptron)


Widrow B., Winter R. G. and Baxter R. A. (1988) : Layered neural nets for pattern recognition, IEEE Trans. Acoustics, Speech and Signal Processing, Vol. 36, No. 7, pp. 1109-1118.

F. Neural and Neuro-Fuzzy Control


Corneliusen A., Terdal P., Knight T. and Spencer J. (1990) : Computation and control with neural networks, Nuclear Instruments and Methods in Physics Research, A293, pp. 507-516.
Guez A., Eilbert J. E. and Kam M. (1988) : Neural network architecture for control, IEEE Control Systems Magazine, April, pp. 22-25.
Hunt K. J., Sbarbaro D., Zbikowski R. and Gawthrop P. J. (1992) : Neural networks for control systems - a survey, Automatica, Vol. 28, No. 6, pp. 1083-1112.
Jang J.-S. R., Sun C.-T. and Mizutani E. (1996) : Neuro-Fuzzy and Soft Computing, Prentice Hall, Upper Saddle River, NJ.
Kohonen T. (1988) : Self-Organization and Associative Memory, Springer-Verlag, Berlin.
Lin C. T. and Lee C. S. G. (1991) : Neural-network based fuzzy logic control and decision system, IEEE Trans. on Computers, Vol. C-40, No. 12, pp. 1320-1336.
Lin C. T. (1994) : Neural Fuzzy Control Systems with Structure and Parameter Learning, World Scientific, Singapore.
Lin C. T. and Lee C. S. G. (1996) : Neural Fuzzy Systems, Prentice Hall, NY.
Miller W. T., Sutton R. S. and Werbos P. J. (Eds) (1990) : Neural Networks for Control, MIT Press, Boston, Mass.
Nauck D. and Kruse R. (1996) : Designing neuro-fuzzy systems through backpropagation, in Fuzzy Modeling: Paradigms and Practice (Ed. Pedrycz W.), Kluwer Academic Publishers, Hingham, Ma.
Nie J. and Linkens D. (1995) : Fuzzy-Neural Control, Prentice-Hall Intl., UK.
Omatu S., Khalid M. and Yusof R. (1995) : Neuro-Control and its Applications, Springer-Verlag, Berlin.
Palm R., Driankov D. and Hellendoorn H. (1996) : Model Based Fuzzy Control, Springer, Berlin.


Tsitouras G. S. and King R. E. (1997) : Rule-based neural control of mechatronic systems, Int. Journal of Intelligent Mechatronics, Vol. 2, No. 1, pp. 1-11.
Von Altrock C. (1995) : Fuzzy Logic and Neuro-Fuzzy Applications, Prentice Hall, NY.
Wu Q. H., Hogg B. W. and Irwin G. W. (1992) : A neural network regulator for turbo-generators, IEEE Trans. on Neural Networks, Vol. 3, No. 1, pp. 95-100.

G. Computer and Advanced Control


Astrom K. and Hagglund T. (1995) : PID Controllers: Theory, Design and Tuning, Instrument Society of America, Research Triangle Park, NC.
Harmon Ray W. (1981) : Advanced Process Control, McGraw Hill, NY.
Olsson G. and Piani G. (1992) : Computer Systems for Automation and Control, Prentice-Hall, Hemel Hempstead, Herts.
Popovic D. and Bhatkar V. P. (1990) : Distributed Computer Control for Industrial Automation, Marcel Dekker, NY.
Tzafestas S. G. (Ed) (1993) : Applied Control, Marcel Dekker, NY.

H. Evolutionary Algorithms
Angeline P. J., Saunders G. M. and Pollack J. B. (1994) : An evolutionary algorithm that constructs recurrent neural networks, IEEE Trans. on Neural Networks, Vol. 5, No. 1, pp. 54-65.
Back T., Hammel U. and Schwefel H-P. (1997) : Evolutionary computation: comments on the history and current state, IEEE Trans. on Evolutionary Computation, Vol. 1, No. 1, pp. 3-17. (contains over 220 references on Evolutionary and Genetic Algorithms)
Dasgupta D. and Michalewicz Z. (Eds) (1997) : Evolutionary Algorithms in Engineering, Springer Verlag, Berlin.
Davis L. (1991) : Handbook of Genetic Algorithms, Van Nostrand, NY.
DeGaris H. (1991) : GenNETS - genetically programmed neural nets, Proc. IEEE Intl. Joint Conf. on Neural Networks, Singapore.


DeJong K. A. (1985) : Genetic Algorithms - a 10 year perspective, Proc. First Intl. Conf. on Genetic Algorithms, Hillsdale, NJ, pp. 169-177.
Fogel D. B. (1994) : An introduction to simulated evolutionary optimization, IEEE Trans. on Neural Networks, Vol. 5, No. 1, pp. 3-15.
Fogel D. B. (1995) : Evolutionary Computation: Towards a New Philosophy of Machine Intelligence, IEEE Press, NY.
Fogel D. B. (Editor) (1997) : Handbook of Evolutionary Computation, IOP Publishing, Oxford.
Fogel D. B. (Editor) (1998) : Evolutionary Computation - the Fossil Record, IEEE Press, NY. (selected readings on the history of Evolutionary Algorithms)
Goggos V. and King R. E. (1996) : Evolutionary predictive control, Computers and Chemical Engineering, Supplement B on Computer Aided Process Engineering, pp. S817-822.
Goldberg D. E. (1985) : Genetic algorithms and rule learning in dynamic systems control, Proc. 1st Int. Conf. on Genetic Algorithms and their Applications, Hillsdale, NJ, pp. 5-17.
Goldberg D. E. (1989) : Genetic Algorithms in Search, Optimization and Machine Learning, Addison-Wesley, Reading, Mass. (seminal book on Genetic Algorithms)
Grefenstette J. (1986) : Optimization of control parameters for Genetic Algorithms, IEEE Trans. on Systems, Man and Cybernetics, Vol. 16, No. 1, pp. 122-128.
Holland J. H. (1975) : Adaptation in Natural and Artificial Systems, University of Michigan Press, Michigan.
Holland J. H. (1990) : Genetic Algorithms, Scientific American, July, pp. 44-50.
Karr C. L. and Gentry E. J. (1993) : Fuzzy control of pH using Genetic Algorithms, IEEE Trans. on Fuzzy Systems, Vol. 1, No. 1, pp. 46-53.
Kim J., Moon Y. and Zeigler P. (1995) : Designing fuzzy net controllers using Genetic Algorithms, IEEE Control Systems, pp. 66-72.
Lin F. T., Kao C. Y. and Hsu C. J. C. (1993) : Applying the genetic approach to simulated annealing in solving some NP-hard problems, IEEE Trans. on Systems, Man and Cybernetics, Vol. 23, No. 6, pp. 1752-1767.


Man K. F., Tang T. S., Kwong S. and Halang W. A. (1997) : Genetic Algorithms for Control and Signal Processing, Springer-Verlag, Berlin.
Maniezzo V. (1994) : Genetic evolution of the topology and weight distribution of neural networks, IEEE Trans. on Neural Networks, Vol. 5, No. 1, pp. 39-53.
Marti L. (1992) : Genetically generated neural networks, Proc. IEEE Intl. Joint Conf. on Neural Networks, pp. IV-537-542.
Michalewicz Z. (1992) : Genetic Algorithms + Data Structures = Evolution Programs, Springer-Verlag, Berlin.
Michalewicz Z. and Schoenauer M. (1996) : Evolutionary algorithms for constrained parameter optimization problems, Evolutionary Computation, Vol. 4, No. 1, pp. 1-32.
Moed M. C. and Saridis G. N. (1990) : A Boltzmann machine for the organization of intelligent machines, IEEE Trans. on Systems, Man and Cybernetics, Vol. 20, No. 5, pp. 1094-1102.
Muhlenbein H., Schomisch M. and Born J. (1991) : The parallel Genetic Algorithm as function optimizer, Parallel Computing, Vol. 17, pp. 619-632.
Park D., Kandel A. and Langholz G. (1994) : Genetic-based new fuzzy reasoning models with application to fuzzy control, IEEE Trans. on Systems, Man and Cybernetics, Vol. 24, No. 1, pp. 39-47.
Spears W. M., DeJong K. A., Back T., Fogel D. B. and DeGaris H. (1993) : An overview of evolutionary computation, Proc. European Conference on Machine Learning.
Srinivas M. and Patnaik L. M. (1991) : Learning neural network weights using Genetic Algorithms - improving performance by search-space reduction, Proc. IEEE Int. Joint Conf. on Neural Networks, Singapore.
Varsek A., Urbancic T. and Filipic B. (1993) : Genetic Algorithms in controller design and tuning, IEEE Trans. on Systems, Man and Cybernetics, Vol. 23, No. 5, pp. 1330-1339.
Zhang J. (1993) : Learning diagnostic knowledge through neural networks and genetic algorithms, Studies in Informatics and Control, Vol. 2, No. 3, pp. 233-252.

I. MATLAB and its Toolboxes


Bishop R. H. (1997) : Modern Control Systems Analysis and Design using MATLAB and Simulink, Addison Wesley, Reading, Mass.
Cavallo A. (1996) : Using MATLAB, Simulink and Control System Toolbox: A Practical Approach, Prentice Hall, NY.
Dabney J. and Harman T. L. (1998) : Mastering Simulink 2: Dynamic System Simulation for MATLAB, Prentice Hall, NY.
Djaferis T. E. (1997) : Automatic Control: the Power of Feedback using MATLAB, PWS Publishers.
Dorf R. C. (1997) : Modern Control Systems Analysis and Design using MATLAB and Simulink, Addison Wesley, Reading, Mass.
Gulley N. and Roger Jang J.-S. (1995) : Fuzzy Logic Toolbox for use with MATLAB, MathWorks, Boston, Mass.
Moscinski J. and Ogonowski Z. (1995) : Advanced Control with MATLAB and Simulink, Ellis Horwood, Hertfordshire.

Appendix A

Case Study: Design of a Fuzzy Controller using MATLAB


This Appendix considers a case study for the design of a generic fuzzy controller for the control of a simple, yet representative process. The dynamic process exhibits nonlinear behavior and response asymmetry, for which it is desired to compensate. The objective is to show the reader the step-by-step procedure that must be followed and the ease with which it is possible to examine the performance of the resultant fuzzy controller using MATLAB and its Simulink and Fuzzy Toolboxes. The steps in the design procedure are explained in sufficient detail so that the advantages and disadvantages of fuzzy control over conventional three-term control can be evaluated. The case study includes a comprehensive study of the effects of the choice of fuzzy sets on the performance of the closed system. The case study is based on the example given in the book by Gulley and Roger Jang (1995) on the use of the MATLAB Fuzzy Toolbox.

A.1 The Controlled Process


The controlled process comprises a single storage tank, shown in Figure A.1. Inflow is controlled by a flow control valve, while outflow is through a fixed opening at the bottom of the tank. Outflow depends on the pressure in the tank and consequently on the level of the fluid in the storage tank. The control objective is to maintain the level of the fluid in the storage tank h(t) constant, so that the outflow rate remains constant despite perturbations in the inflow rate. In order to maintain the storage tank level constant, it is necessary to know the level of the fluid in the tank at all times. This can be achieved by using an inductive, capacitive or ultrasonic sensor, for instance. Even a floating ball connected to a potentiometer will do if the variations in the fluid level are relatively small. The input to the controller is the error e(t) = hd - h(t), where hd and h(t) are the desired and actual levels of the fluid in the tank respectively, and its output is the rate u(t) at which the valve is opened or closed.
Figure A.1 The controlled process


A.2 Basic Linguistic Control Rules


Without knowing anything about the controlled process characteristics and using only basic qualitative knowledge, it is possible to control the level of fluid in the storage tank by following three intuitive rules:
R1: IF the error_in_the_fluid_level is ZEro (i.e., the actual level is equal to the desired level) THEN the control_valve_rate must be ZEro (i.e., do nothing).
R2: IF the error_in_the_fluid_level is LOw (i.e., the actual level is less than the desired level) THEN the control_valve_rate must be POsitive_large (so that the tank will fill as quickly as possible).
R3: IF the error_in_the_fluid_level is HIgh (i.e., the actual level is higher than the desired level) THEN the control_valve_rate must be NEgative_large (so that the tank will empty as fast as possible).
Since the controller must send a quantitative action to the control valve, the following additional information is required: the maximum permissible change (also termed the dead-zone) in the nominal value of the fluid level, and the maximum rate of change of the fluid level (which can be computed from the dimensions of the storage tank and the maximum inflow rate, or by experimental measurements).

A.3 A Simple Linguistic Controller


A very simple linguistic controller can be implemented very easily using the three rules presented above. Thus, for example, let the linguistic variable NORmal be assigned a tolerable discrepancy of ±δ meters from the desired level. If the level is more than δ meters below the desired level, then the valve must open at its highest rate POsitive_Large. By contrast, if the level of fluid in the tank exceeds the desired level by δ, then the rate of valve closure must be NEgative_Large. It is noted that this simple


principle is used in simple on-off controllers using relays, for example in building thermostats, for which no knowledge of the room dynamics or characteristics is required! Unfortunately, this simple controller leads to undamped oscillations in the level of fluid in the tank with a peak-to-peak amplitude of 2δ meters, as seen in Figures A.2(a) and A.2(b). Furthermore, the step response, i.e., the fluid level as a function of time in response to a sudden change in the desired level, is asymmetric: an increase in the desired fluid level leads to a different response than a decrease in the desired level. This is characteristic of fluid systems, since fluid pressure and level are nonlinearly related. There is no way of estimating the frequency of oscillation, since it is assumed that there is no knowledge of the dynamics of the controlled process. It is observed, however, that reducing the dead-zone 2δ reduces the amplitude and increases the frequency of the oscillation. This is observed in Figures A.2(a) and A.2(b). Reducing the dead-zone to zero will, in theory, lead to an infinite frequency of oscillation. In practice this is unlikely because of the inherent delays in the relay. In any case this continuous chattering is undesirable, since it shortens relay life.

Figure A.2(a) Step response of the linguistic controller with δ = 0.1


Figure A.2(b) Step response of the linguistic controller with δ = 0.025

The system can be modeled using MATLAB/Simulink/Fuzzy Toolbox by simply typing the instruction sltank, at which point the flow diagram shown in Figure A.3 appears. In this file we are given the opportunity to compare the step responses of the fuzzy and conventional controllers under the same conditions. The fuzzy sets in this simple case have the Boolean form shown in Figure 5.1. As a consequence the output of the controller can take only one of three possible values, i.e., POL, ZER or NEL. The qualitative controller is equivalent to a threshold logic unit with dead-zone, as shown in Figure A.4. If the fluid level error is within the dead-zone then the controller gives no output and the process coasts freely. However, when the absolute error exceeds the dead-zone the controller gives a command of sgn(e).
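The dead-zone relay law described above can be sketched in a few lines of Python. This is only an illustrative stand-in for the Simulink block; the function name and the normalized output values are assumptions, not part of the Toolbox:

```python
import math

def relay_controller(error, deadzone):
    """Three-valued qualitative controller: ZER inside the dead-zone,
    otherwise sgn(e): POL (+1) to fill the tank, NEL (-1) to empty it."""
    if abs(error) <= deadzone:
        return 0.0                      # coast freely inside the dead-zone
    return math.copysign(1.0, error)    # command sgn(e) outside it
```

Because the output can only jump between -1, 0 and +1, the closed loop chatters around the set point, which is exactly the limit-cycle behaviour discussed in the text.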


Figure A.3 Simulink flow diagram using sltank



Figure A.4 Input-output relationship of qualitative controller

A.4 The Design Procedure


With the simple linguistic controller, the oscillations (or more precisely, limit cycles) of the fluid in the storage tank about its desired value are not acceptable except where coarse control suffices. The constant chattering of the controller relay inevitably leads to wear of the control valve, which is in a state of continuous motion. We must find a suitable control strategy that will dampen these oscillations and ultimately eliminate them. It is also desirable to find some technique to compensate for the response asymmetry. We could, of course, follow the road of conventional control: identifying the process dynamics by applying step disturbances to it, determining a simple approximant and using the cookbook techniques of Ziegler and Nichols, or more sophisticated methods, in order to establish the parameters of the required three-term controller. This procedure is quite tedious and time-consuming, and it is doubtful that all objectives can be satisfied simultaneously. It is certain, however, that a conventional three-term controller cannot compensate for the observed asymmetric response. If we could change the control strategy by adding some additional stabilizing rules and some advanced inference mechanism, so that the control action changes smoothly instead of abruptly as in the case of the simple controller, then we could achieve stabilization and significant improvement in the closed system response. The solution is to turn to unconventional control techniques and design a fuzzy controller with one of the available computer aided design packages. This case study uses MATLAB Simulink and the Fuzzy Toolbox. In the following, we present the procedure that must be followed in the design of a simple generic fuzzy controller. Our objective is a robust fuzzy controller which can be described by simple linguistic rules but which will not exhibit the problems that were encountered with the simple linguistic controller described earlier. The next step in the design procedure, following encoding of the control rules, is to decide on the number of fuzzy sets for every input and output variable of the controller. We must also specify the shape of the corresponding fuzzy sets, some of the available choices being shown in Figure 11.5. In this study we experiment with a number of shapes in an attempt to find acceptable overall performance. The FIS Editor in the Fuzzy Toolbox, shown in Figure A.5, is a simple and effective tool that makes designing fuzzy controllers straightforward. As was noted in earlier chapters of this book, it is not necessary to specify many fuzzy sets for the inputs to the controller in order to obtain acceptable accuracy. Most industrial applications use 3 to 5 fuzzy sets. Where fine control is essential, many more fuzzy sets are required. In the F. L. Smidth kiln controller, for instance, more than a dozen fuzzy sets are used. In the fuzzy controller proposed here, only


three fuzzy sets LOw, OK and HIgh are used for the input variables for simplicity.

A.5 System Stabilization Rules


The oscillations in the fluid level in the storage tank which were observed with the simple qualitative controller, resulting from the use of only three rules, can be damped by adding two more rules which involve the derivative of the level error, de(t)/dt. This is a classical procedure even in conventional control. However, in order to avoid any problem from using derivatives of the set point, we prefer to use the derivative of the measured fluid level in the tank, dh(t)/dt, as the second input to the controller, in addition to the level error e(t) = hd - h(t). In practice the derivative is replaced by the first backward difference of the level, Δhk = hk - hk-1.
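As a minimal illustration of the two controller inputs just described (the function name is hypothetical, not Toolbox code):

```python
def controller_inputs(h_desired, h_now, h_prev):
    """Form the two controller inputs: the level error e(t) = hd - h(t)
    and the first backward difference of the measured level."""
    e = h_desired - h_now     # level error
    dh = h_now - h_prev       # backward difference h_k - h_{k-1}, approximating dh/dt
    return e, dh
```

Differentiating the measured level rather than the error avoids the output spike (derivative kick) that a step change in the set point would otherwise inject into the controller.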

Figure A.5. The FIS Editor in the fuzzy design tool

Adding two rules can, indeed, stabilize the closed system:

Design of a Fuzzy Controller Using MATLAB

267

R4: IF the error_in_the_fluid_level is ZEro AND the rate_of_change_of_level is POsitive_Small THEN the control valve must be closed slowly, i.e., the control_valve_rate must be NEgative_Small.
R5: IF the error_in_the_fluid_level is ZEro AND the rate_of_change_of_level is NEgative_Small THEN the control valve must be opened slowly, i.e., the control_valve_rate must be POsitive_Small.
These 5 control rules may be specified in many ways, the most convenient of which is linguistic, as shown in Figure A.6.
1. If (level is OK) then (valve is no_change)
2. If (level is low) then (valve is open_fast)
3. If (level is high) then (valve is close_fast)
4. If (level is OK) and (rate_of_change_of_level is positive) then (valve is close_slow)
5. If (level is OK) and (rate_of_change_of_level is negative) then (valve is open_slow)

Figure A.6. The control rules in linguistic form
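A minimal sketch of how these five rules could be evaluated, using min for the fuzzy AND and a weighted average of output singletons to defuzzify. All membership breakpoints and singleton positions here are illustrative assumptions, not those of the Toolbox demo:

```python
def tri(x, a, b, c):
    """Triangular membership function with feet a, c and peak b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

# Input sets on normalized universes; e > 0 means the level is low (tank below set point).
err_sets = {
    'low':  lambda e: tri(e, 0.0, 1.0, 2.0),
    'OK':   lambda e: tri(e, -1.0, 0.0, 1.0),
    'high': lambda e: tri(e, -2.0, -1.0, 0.0),
}
rate_sets = {
    'negative': lambda r: tri(r, -2.0, -1.0, 0.0),
    'positive': lambda r: tri(r, 0.0, 1.0, 2.0),
}
# The five rules: (error set, rate set or None) -> valve action
rules = [
    (('OK',   None),       'no_change'),
    (('low',  None),       'open_fast'),
    (('high', None),       'close_fast'),
    (('OK',   'positive'), 'close_slow'),
    (('OK',   'negative'), 'open_slow'),
]
# Output actions as singletons on the valve-rate universe [-1, 1]
actions = {'close_fast': -0.9, 'close_slow': -0.3, 'no_change': 0.0,
           'open_slow': 0.3, 'open_fast': 0.9}

def fuzzy_valve_rate(e, dh):
    """Mamdani-style min for AND; weighted average of singletons to defuzzify."""
    num = den = 0.0
    for (e_lbl, r_lbl), action in rules:
        w = err_sets[e_lbl](e)
        if r_lbl is not None:
            w = min(w, rate_sets[r_lbl](dh))
        num += w * actions[action]
        den += w
    return num / den if den else 0.0
```

With the level at the set point but rising (e = 0, dh > 0), only rules 1 and 4 fire and the valve closes slowly; this is precisely the damping that the two added rules contribute.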

A.6 On the Universe of Discourse of the Fuzzy Sets


The universe of discourse of every input to the controller depends on the permissible range of that variable. Errors are confined to e ∈ [-α, α] and the rate of change of the level to dh(t)/dt ∈ [-β, β]. The two parameters α and β thus uniquely specify the universes of discourse of the two controller inputs. The output of the controller is the rate with which the flow control valve is opened or closed. Here we specify 5 fuzzy variables, i.e.,


NEgative_Large, NEgative_Small, ZEro, POsitive_Small and POsitive_Large for finer control. The universe of discourse in this case is normalized to [-1, 1], where +1 indicates that the valve is entirely open (i.e., 100% open) and -1 that it is closed (i.e., 0% open). The median 0 implies that the valve is at the center of its range.

A.7 On the Choice of Fuzzy Sets


The fuzzy sets of the controller inputs and outputs can be given any of the most common shapes offered in the Fuzzy Toolbox, such as triangular, trapezoidal, Gaussian, etc. (see Figure 5.12). Various combinations are tried and the resultant control surface and corresponding dynamic behavior of the closed system are compared. The control surface is a graphical interpretation of the control actions in input space, i.e., of how the outputs of the controller vary as functions of the controller inputs. For a given set of control rules, the control surface depends on the shape of the fuzzy sets of both the inputs and the outputs of the controller, but this does not greatly affect the dynamic performance of the closed system. The control surface of a generic controller with two inputs gives us an immediate indication of the magnitude of the control action in control space, but clearly when the controller has more than two inputs the control surface becomes a manifold which is impossible to visualize. Should the performance of the closed system prove unsatisfactory, the control surface can be examined to see how the rules must be modified, or what new rules must be added, to bring about the desired behavior. In the design example, the control surfaces were computed using an inference engine based on Mamdani's compositional rule of inference, while COG (Center Of Gravity) was used to de-fuzzify the fuzzy set of the controller output. The Fuzzy Toolbox provides alternative forms for both the inference engine and de-fuzzification. The following five cases were considered in the design study:
1. Case A - Figure A.7: Inputs with 3 triangular fuzzy sets and outputs with 5 symmetric triangular fuzzy sets with large support (50%) and no overlap.


2. Case B - Figure A.8: Inputs with 3 triangular fuzzy sets and outputs with 5 symmetric triangular fuzzy sets with small support and no overlap.
3. Case C - Figure A.9: Inputs with 3 Gaussian fuzzy sets and outputs with 5 symmetric triangular fuzzy sets with small support and no overlap.
4. Case D - Figure A.10: Inputs with 3 Gaussian fuzzy sets and outputs with 5 asymmetric triangular fuzzy sets with small support and no overlap.
5. Case E - Figure A.11: Inputs with 3 Gaussian fuzzy sets and outputs with 5 asymmetric triangular fuzzy sets with small support and some overlap.
It is noted that triangular fuzzy sets with small support, i.e., with support tending to 0, approximate the singletons (see Chapter 5) that have been used in industrial fuzzy controllers. The advantage of using singletons is the simplicity and speed of the computations for de-fuzzification. A number of vendors use singletons in their products, e.g., S5-Profuzzy and S7-Fuzzy_Control by Siemens and FuzzyTech by Allen Bradley. For every one of the five cases we present the control surface and the corresponding step response. Figure A.13 shows the computer screen from the Fuzzy Toolbox, which shows which rules are fired for given controller input values and the corresponding controller output for Case E.
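The computational appeal of singleton output sets noted above can be seen by comparing full centre-of-gravity defuzzification over a sampled output set with the one-line weighted average that singletons permit. Both functions are illustrative sketches, not vendor code:

```python
def cog(xs, mus):
    """Full centre of gravity of a sampled fuzzy set: sum(x*mu) / sum(mu)."""
    s = sum(mus)
    return sum(x * m for x, m in zip(xs, mus)) / s if s else 0.0

def singleton_defuzz(weights, positions):
    """With singleton outputs, COG collapses to a single weighted average:
    no output set needs to be aggregated or sampled."""
    s = sum(weights)
    return sum(w * p for w, p in zip(weights, positions)) / s if s else 0.0
```

The full COG loops over every sample of the aggregated output set on each control cycle, whereas the singleton form needs only one multiply-accumulate per rule, which is why it is favoured in the embedded products mentioned above.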

A.8 Compensation of Asymmetry


Asymmetry in the step response of a closed system is not an uncommon phenomenon in practice, due to the inherent nonlinearities in the controlled plants. In these cases, the step response for a positive demand differs significantly from the response for a negative demand. A typical example is in the control of pressure, which is related to the square root of the observed height, as shown in the example analyzed in this Appendix. A conventional industrial three-term controller cannot easily compensate for this asymmetry. In contrast, asymmetry can be compensated for relatively easily in soft controllers (both fuzzy and neural) by warping the control surface appropriately. This can be achieved easily by displacing one or more of the fuzzy sets laterally on the universe of discourse. In


Figure A.10(c) for instance, the fuzzy set Positive_Small has been shifted to the left with the consequence that the control action for small positive errors in the liquid level is greater than that for small negative errors. Thus for small discrepancies in the liquid level about the nominal level, the rate with which the valve is changed is increased when the tank is filling and decreased when it empties. This leads to compensation of response asymmetry.
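How displacing one output set warps the control surface can be shown with a toy two-rule controller; all numbers and names are illustrative assumptions, not the case-study values:

```python
def tri(x, a, b, c):
    """Triangular membership function with feet a, c and peak b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

def valve_rate(e, pos_small):
    """Two-rule controller; pos_small is the position of the Positive_Small
    output singleton, while Negative_Small stays fixed at -0.3."""
    w_open = tri(e, 0.0, 0.5, 1.0)     # small positive error: tank below set point
    w_close = tri(e, -1.0, -0.5, 0.0)  # small negative error: tank above set point
    den = w_open + w_close
    return (w_open * pos_small + w_close * (-0.3)) / den if den else 0.0
```

With symmetric singleton positions (±0.3) the filling and emptying actions are equal in magnitude; displacing the Positive_Small singleton to 0.5 makes the filling correction stronger than the emptying one, which is the kind of lateral shift used here to offset the tank's asymmetric response.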

A.9 Conclusions
It should have become clear that altering the shape of the fuzzy sets of the controller does not lead to major changes in the control surface. This is evident in Figures A.7(c) to A.11(c). In cases A and B, the control surfaces shown in Figures A.7(c) and A.8(c) are almost flat in the dead-zone, due to the triangular nature of the fuzzy sets of both the inputs and the output. The corresponding step responses, shown in Figures A.12(a) and A.12(b), are unsatisfactory because their steady-state error is non-zero and the response asymmetry is severe. It is noted, also, that in these two cases, both the triangular fuzzy sets with a large support set and the singletons have comparable step responses. In contrast, in cases C, D and E the control surfaces are smooth because of the smooth (Gaussian) shape of the fuzzy sets of the controller inputs. The corresponding step responses are seen to be superior. In cases D and E (see Figures A.12(d) and A.12(e)) the step responses are almost symmetric. Finally, in the last three cases the steady-state errors are essentially zero and overshoot is negligible. Case E appears to be the best, as it demonstrates symmetric response, zero steady-state error and no overshoot. As was seen in all five cases, the closed system step response is influenced significantly by the shape of the fuzzy sets of the inputs, and to a lesser extent by the fuzzy sets of the controller output. In general, Gaussian fuzzy sets yield smoother control surfaces, implying smoother control actions and improved responses. The best shapes of the fuzzy sets are not generally known and are the subject of ongoing research.

CASE A

Figure A.7(a) Input fuzzy sets

Figure A.7(b) Output fuzzy sets

Figure A.7(c) Control surface

CASE B

Figure A.8(a) Input fuzzy sets

Figure A.8(b) Output fuzzy sets

Figure A.8(c) Control surface

CASE C

Figure A.9(a) Input fuzzy sets

Figure A.9(b) Output fuzzy sets

Figure A.9(c) Control surface

CASE D

Figure A.10(a) Input fuzzy sets

Figure A.10(b) Output fuzzy sets

Figure A.10(c) Control surface


CASE E

Figure A.11(a) Input fuzzy sets

Figure A.11(b) Output fuzzy sets

Figure A.11(c) Control surface


Figure A.12(a) Step response for Case A

Figure A.12(b) Step response for Case B


Figure A.12(c) Step response for Case C

Figure A.12(d) Step response for Case D


Figure A.12(e) Step response for Case E

Figure A.13 Computations for case E

Appendix B

A Simple Genetic Algorithm


The m-files given in the Appendices that follow can be downloaded from the author's web site www.lar.ee.upatras.gr/reking

% Main_Program - ga.m
popsize=10;    % Population Size
maxgen=50;     % Number of Generations (iterations)
length=12;     % Length of genotype (i.e. number of bits in the binary array)
pcross=0.8;    % Probability of Crossover
pm=0.01;       % Probability of Mutation
bits=[length length];
vlb=[-1 -1];
vub=[1 1];
phen=init(vlb,vub,popsize,2);    % Initialization of the population of phenotypes
[gen, lchrom, coarse, nround] = encode(phen, vlb, vub, bits);    % Conversion of phenotypes to binary strings
[fitness, object]=score(phen,popsize);    % Evaluation of Fitness & Objective Function
x=-1:0.1:1;    % Display of the contour graph of the objective function
y=-1:0.1:1;
[x1,y1]=meshgrid(x,y);
z=x1.^2+y1.^2-0.3*cos(3*pi*x1)-0.4*cos(4*pi*y1)+0.7;
figure(1);
contour(x,y,z);


[best_obj(1), index] = min(object);      % Store the best candidate of the initial population
best_gen=gen(index,:);
best_phen=phen(index,:);
[worst_obj(1), index1] = max(object);    % Store the worst candidate of the initial population
worst_cur_gen=gen(index1);
worst_cur_phen=phen(index1);
avg_obj(1)=0;                            % Calculate the average performance of the population
for k=1:popsize
  avg_obj(1)=avg_obj(1)+object(k);
end;
avg_obj(1)=avg_obj(1)/popsize;
best_x(1)=best_phen(1);
best_y(1)=best_phen(2);
for i1=1:2
  fprintf(1,'%f ',best_phen(i1));        % print both coordinates of the best phenotype
end;
fprintf('\n');
fprintf(1,'BEST : %f WORST : %f AVG : %f \n',best_obj(1),worst_obj(1),avg_obj(1));
for i=1:maxgen                           % Start of the Genetic Loop
  newgen=reproduc(gen,fitness);          % Reproduction (fitness-proportionate selection)
  gen=mate(newgen);                      % Mate members of the selected population
  gen=xover(gen, pcross);                % Crossover Operation
  gen=mutate(gen, pm);                   % Mutation Operation
  [phen, coa] = decode(gen, vlb, vub, bits);    % Decode the genotypes of the new population to phenotypes
  [fitness, object]=score(phen,popsize);        % Evaluation of the Fitness & Objective Functions
  [best_cur_obj, index] = min(object);          % Store the best candidate of the current population
  best_cur_gen=gen(index,:);
  best_cur_phen=phen(index,:);
  [worst_obj(i+1), index1] = max(object);       % Store the worst candidate of the current population
  worst_cur_gen=gen(index1);
  worst_cur_phen=phen(index1);
  avg_obj(i+1)=0;                               % Average performance of the current population
  for k=1:popsize
    avg_obj(i+1)=avg_obj(i+1)+object(k);

  end;
  avg_obj(i+1)=avg_obj(i+1)/popsize;

  if(best_cur_obj > best_obj(i))          % Apply Elitist Strategy
    phen(index1,:) = best_phen;
    gen(index1,:) = best_gen;
    object(index1) = best_obj(i);
    best_obj(i+1) = best_obj(i);
  elseif(best_cur_obj <= best_obj(i))
    best_phen = best_cur_phen;
    best_gen = best_cur_gen;
    best_obj(i+1) = best_cur_obj;
  end;
  best_x(i+1)=best_phen(1);               % Display evolution of the best solution on the contour graph
  best_y(i+1)=best_phen(2);
  hold;
  line(best_x,best_y);
  for i1=1:2
    fprintf(1,'%f ',best_phen(i1));
  end;
  fprintf(1,'---> %f\n',best_obj(i+1));
  fprintf('\n');
  fprintf(1,'BEST : %f WORST : %f AVG : %f \n',best_obj(i+1),worst_obj(i+1),avg_obj(i+1));
end

xx=1:maxgen+1;    % Display evolution of objective functions for the worst, average and best solutions
figure(2);
plot(xx,best_obj,xx,worst_obj,xx,avg_obj);
grid;

% File init.m - This function creates a random population
function phen=init(vlb,vub,siz,sea)
for i=1:siz
  phen(i,:)=(vub-vlb).*rand(1,sea) + vlb;
end

% File score.m - This function computes the fitness and the objective function values of a population
function [fitness, object]=score(phen,popsize)
for i=1:popsize

282

Appendix B object(i)=phen(i,1)^2+phen(i,2)^20.3*cos(3*pi*phen(i,1))0.4*cos(4*pi*phen(i,2))+0.7; fitness(i)=1/(object(i)+1);

end
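The elitist strategy in the main loop guarantees that the best objective value found so far is never lost between generations: if the best member of the new population is worse than the stored elite, the elite overwrites the worst new member. The following Python sketch (the function name and toy data are ours, not from the book) isolates that replacement rule for a minimization problem:

```python
def elitist_replace(old_best, new_population, objective):
    """If the best member of the new population is worse than the best
    solution found so far, overwrite the worst new member with the
    stored elite; otherwise the elite is updated."""
    scores = [objective(p) for p in new_population]
    best_idx = min(range(len(scores)), key=scores.__getitem__)
    worst_idx = max(range(len(scores)), key=scores.__getitem__)
    if scores[best_idx] > objective(old_best):
        new_population[worst_idx] = old_best   # elite survives unchanged
        return old_best, new_population
    return new_population[best_idx], new_population

# Toy usage: minimize f(x) = x^2; the stored elite 0.1 beats the new population
f = lambda x: x * x
best, pop = elitist_replace(0.1, [0.5, -0.3, 0.9], f)
```

With this rule the best-so-far objective value is monotonically non-increasing, which is exactly the behavior of the `best_obj` curve plotted by the main program.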

The following m-files, called by the main program, can be downloaded directly from the MathWorks web site (www.mathworks.com) and bear the indication:

% Copyright (c) 1993 by the MathWorks, Inc.
% Andrew Potvin 1-10-93.

% File encode.m - This function converts a variable from real to binary
function [gen,lchrom,coarse,nround] = encode(x,vlb,vub,bits)
lchrom = sum(bits);
coarse = (vub-vlb)./((2.^bits)-1);
[x_row,x_col] = size(x);
gen = [];
if ~isempty(x),
    temp = (x - ones(x_row,1)*vlb)./(ones(x_row,1)*coarse);
    b10 = round(temp);
    nround = find(b10-temp > 1e-4);
    gen = b10to2(b10,bits);
end

% File reproduc.m - This function selects individuals in accordance with their fitness
function [new_gen,selected] = reproduc(old_gen,fitness)
norm_fit = fitness/sum(fitness);
selected = rand(size(fitness));
sum_fit = 0;
for i = 1:length(fitness),
    sum_fit = sum_fit + norm_fit(i);
    index = find(selected < sum_fit);
    selected(index) = i*ones(size(index));
end
new_gen = old_gen(selected,:);

% File mate.m - This function mates two members of the population

function [new_gen,mating] = mate(old_gen)
[junk,mating] = sort(rand(size(old_gen,1),1));
new_gen = old_gen(mating,:);
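The cumulative-sum scan in reproduc.m is the classical roulette-wheel (fitness-proportionate) selection scheme: each uniform draw in [0, 1) selects the first individual whose cumulative normalized fitness exceeds it, so fitter individuals are copied more often. A Python sketch of the same idea (the function name is ours, not from the toolbox):

```python
import random

def roulette_select(population, fitness):
    """Fitness-proportionate (roulette-wheel) selection: each uniform
    draw picks the first individual whose cumulative normalized
    fitness exceeds the draw."""
    total = sum(fitness)
    selected = []
    for _ in population:                 # one draw per population slot
        u = random.random()
        cum = 0.0
        for ind, fit in zip(population, fitness):
            cum += fit / total
            if u < cum:
                selected.append(ind)
                break
    return selected

random.seed(0)
# 'c' has half the total fitness, so it occupies about half the slots on average
new_pop = roulette_select(['a', 'b', 'c'], [1.0, 1.0, 2.0])
```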

% File xover.m - This function performs the crossover operation
function [new_gen,sites] = xover(old_gen,Pc)
lchrom = size(old_gen,2);
sites = ceil(rand(size(old_gen,1)/2,1)*(lchrom-1));
sites = sites.*(rand(size(sites)) < Pc);
for i = 1:length(sites),
    new_gen([2*i-1 2*i],:) = [old_gen([2*i-1 2*i],1:sites(i)) ...
        old_gen([2*i 2*i-1],sites(i)+1:lchrom)];
end

% File mutate.m - This function performs the mutation operation
function [new_gen,mutated] = mutate(old_gen,Pm)
mutated = find(rand(size(old_gen)) < Pm);
new_gen = old_gen;
new_gen(mutated) = 1 - old_gen(mutated);
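xover.m swaps the tails of two parent bit strings beyond a random cut site (with probability Pc per pair), and mutate.m flips each bit independently with probability Pm. The two operators can be sketched in Python as follows (function names and test data are ours, not from the toolbox):

```python
import random

def one_point_xover(p1, p2, pc):
    """Single-point crossover: with probability pc, pick a cut site in
    1..L-1 and swap the tails of the two bit strings."""
    if random.random() < pc:
        site = random.randint(1, len(p1) - 1)   # inclusive bounds
        return p1[:site] + p2[site:], p2[:site] + p1[site:]
    return p1[:], p2[:]

def bit_mutate(bits, pm):
    """Bit-flip mutation: each bit flips independently with probability pm."""
    return [1 - b if random.random() < pm else b for b in bits]

random.seed(1)
# pc = 1.0 forces a crossover; the children exchange complementary tails
c1, c2 = one_point_xover([0, 0, 0, 0], [1, 1, 1, 1], pc=1.0)
# pm = 0.0 leaves the string unchanged
m = bit_mutate([0, 1, 0], 0.0)
```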

% File decode.m - This function converts a variable from binary to real
function [x,coarse] = decode(gen,vlb,vub,bits)
bit_count = 0;
two_pow = 2.^(0:max(bits))';
for i = 1:length(bits),
    pow_mat((1:bits(i))+bit_count,i) = two_pow(bits(i):-1:1);
    bit_count = bit_count + bits(i);
end
gen_row = size(gen,1);
coarse = (vub-vlb)./((2.^bits)-1);
inc = ones(gen_row,1)*coarse;
x = ones(gen_row,1)*vlb + (gen*pow_mat).*inc;

% File b10to2.m - This function converts a variable from base 10 to base 2
function b2 = b10to2(b10,bits)
bit_count = 0;
b2_index = [];


bits_index = 1:length(bits);
for i = bits_index,
    bit_count = bit_count + bits(i);
    b2_index = [b2_index bit_count];
end
for i = 1:max(bits),
    r = rem(b10,2);
    b2(:,b2_index) = r;
    b10 = fix(b10/2);
    tbe = find( all(b10==0) | (bits(bits_index)==i) );
    if ~isempty(tbe),
        b10(:,tbe) = [];
        b2_index(tbe) = [];
        bits_index(tbe) = [];
    end
    if isempty(bits_index),
        return
    end
    b2_index = b2_index - 1;
end
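Together, encode.m and decode.m implement a uniform fixed-point coding: a real value in [vlb, vub] maps to an integer on a grid with resolution coarse = (vub - vlb)/(2^bits - 1), and back. The scalar Python sketch below (function names ours) shows the round trip and the half-step quantization error bound:

```python
def encode(x, vlb, vub, bits):
    """Real -> integer code on a uniform grid; the resolution
    ('coarse' in encode.m) is (vub - vlb)/(2**bits - 1)."""
    coarse = (vub - vlb) / (2 ** bits - 1)
    return round((x - vlb) / coarse)

def decode(code, vlb, vub, bits):
    """Integer code -> real value on the same grid, as in decode.m."""
    coarse = (vub - vlb) / (2 ** bits - 1)
    return vlb + code * coarse

# A 10-bit coding of [-1, 1]: values round-trip to within half a grid step
code = encode(0.3, -1.0, 1.0, 10)
x = decode(code, -1.0, 1.0, 10)
```

The binary chromosome produced by b10to2 is just this integer code written in base 2, one field per decision variable.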

Appendix C

Simulated Annealing Algorithm


% Main_Program sa.m
Tinit = 120;                 % Initial simulated temperature
l = 0.98;                    % Temperature decrement parameter
Tfinal = 0.00001;            % Final temperature
% Computation and display of the objective function contour
x = -1:0.1:1;
y = -1:0.1:1;
[x1,y1] = meshgrid(x,y);
z1 = x1.^2 + y1.^2 - 0.3*cos(3*pi*x1) - 0.4*cos(4*pi*y1) + 0.7;
figure(1);
contour(x,y,z1);
Tcur = Tinit;                % Initialize the simulated temperature
x1 = -1 + 2*rand;            % Select the first solution (x1, x2)
x2 = -1 + 2*rand;
z(1) = x1^2 + x2^2 - 0.3*cos(3*pi*x1) - 0.4*cos(4*pi*x2) + 0.7;
r1(1) = x1;
r2(1) = x2;
i = 1;
while Tcur > Tfinal          % Start of the simulated annealing loop


    x_1 = -1 + 2*rand;       % Select a new solution (x_1, x_2)
    x_2 = -1 + 2*rand;
    % Evaluate the new objective function value
    z_1 = x_1^2 + x_2^2 - 0.3*cos(3*pi*x_1) - 0.4*cos(4*pi*x_2) + 0.7;
    g = exp(-(z_1 - z(i))/Tcur);   % Acceptance probability
    if ((z_1 < z(i)) | (rand < g))
        x1 = x_1;
        x2 = x_2;
        z(i+1) = z_1;
    else
        z(i+1) = z(i);
    end
    Tcur = Tcur*l;           % New simulated temperature
    r1(i+1) = x1;
    r2(i+1) = x2;
    title('Search for the global optimum point');
    xlabel('x axis');
    ylabel('y axis');
    i = i + 1;
end                          % End of the simulated annealing loop
f = find(z == min(z));
fprintf(1,'The minimum value of the objective function observed so far is %f, found in iteration %d\n', min(z), f(1));
fprintf(1,'x=%f, y=%f\n', r1(f(1)), r2(f(1)));
hold on;
line(r1,r2);
title('Movement of the x-y parameters in the search space');
xlabel('x parameter');
ylabel('y parameter');
figure(2);
plot(z);
title('Objective function values versus iterations');
xlabel('Iterations');
ylabel('Objective function');
figure(3);
plot(r1);
title('Movement of the x parameter');

ylabel('x parameter');
xlabel('Iterations');
figure(4);
plot(r2);
title('Movement of the y parameter');
ylabel('y parameter');
xlabel('Iterations');
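The heart of the loop is the Metropolis acceptance criterion: an improving move is always accepted, while a worsening move of size delta is accepted with probability exp(-delta/T), which shrinks as the temperature T falls. The Python sketch below (names ours) demonstrates this temperature dependence:

```python
import math
import random

def accept(delta, T):
    """Metropolis criterion: always accept an improvement (delta < 0);
    accept a worsening move with probability exp(-delta/T)."""
    return delta < 0 or random.random() < math.exp(-delta / T)

# The same worsening move (delta = 0.5) is almost always accepted when
# T is high (120, the initial temperature above) and essentially never
# when T is low (0.01)
random.seed(0)
hot = sum(accept(0.5, 120.0) for _ in range(1000))
cold = sum(accept(0.5, 0.01) for _ in range(1000))
```

This is why the algorithm explores broadly at the start and settles into a local refinement as the geometric cooling schedule Tcur = Tcur*l drives the temperature toward Tfinal.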


Appendix D

Network Training Algorithm


% Main_Program net.m
P = [0.8 0.8; 0.8 0; 0.8 -0.8; 0.3 0.3; 0.3 -0.3; 0 0.8; 0 0; 0 -0.8; -0.3 0.3; -0.3 -0.3; -0.8 0.8; -0.8 0; -0.8 -0.8];
T = [1 0.55 0 0.35 0 0.55 0 -0.55 0 -0.35 0 -0.55 -1];
[R,Q] = size(P);
S1 = 2;
[S2,Q] = size(T);
[W10,B10] = rands(S1,R);          % Randomize network parameters
W20 = rands(S2,S1)*0.5;
B20 = rands(S2,1)*0.5;
disp_freq = 20;                   % Display frequency
max_epoch = 9999;                 % Epoch limit
err_goal = 0.01;                  % Error measure limit
lr = 0.01;                        % Learning rate
lr_inc = 1.05;                    % Learning rate increment
lr_dec = 0.7;                     % Learning rate decrement
err_ratio = 1.04;                 % Error ratio
mom_const = 0.95;                 % Momentum constant
TP = [disp_freq max_epoch err_goal lr lr_inc lr_dec mom_const err_ratio];
% Training algorithm with linear neurons in both layers
[W10,B10,W20,B20,epochs,TR] = trainbpx(W10,B10,'purelin',W20,B20,'purelin',P,T,TP);
plottr(TR);                       % Plot results
W10                               % Print synaptic weights of the first layer
W20                               % Print synaptic weights of the second layer
B10                               % Print bias of the first layer
B20                               % Print bias of the second layer
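Because both layers of the network in net.m use linear (purelin) neurons, the overall input-output map is affine, and back-propagation reduces to Widrow-Hoff (LMS) adaptation of that map. The Python sketch below (all names and data are illustrative, not from the toolbox) shows LMS training of a single linear neuron on an exactly realizable target:

```python
def lms_train(samples, targets, lr=0.01, epochs=2000):
    """Widrow-Hoff (LMS) training of one linear neuron
    y = w1*x1 + w2*x2 + b: each sample's error drives a
    gradient-descent update of the weights and bias."""
    w = [0.0, 0.0]
    b = 0.0
    for _ in range(epochs):
        for (x1, x2), t in zip(samples, targets):
            y = w[0] * x1 + w[1] * x2 + b
            e = t - y                  # instantaneous error
            w[0] += lr * e * x1        # gradient-descent updates
            w[1] += lr * e * x2
            b += lr * e
    return w, b

# Learn an exactly linear target y = 0.5*x1 + 0.25*x2 on a few grid points
pts = [(0.8, 0.8), (0.8, -0.8), (-0.8, 0.8), (-0.8, -0.8), (0.3, 0.3)]
tgt = [0.5 * a + 0.25 * c for a, c in pts]
w, b = lms_train(pts, tgt)
```

Since the target is realizable by a linear map, the weights converge to the true coefficients; trainbpx adds adaptive learning-rate and momentum heuristics (lr_inc, lr_dec, mom_const) on top of this basic rule.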

Index

Adaptive linear networks (ADALINEs), 156, 162, 170
Artificial intelligence, 13
Artificial neural networks, 1, 51, 153
  autoassociative, 159
  feedback, 159
  feed-forward networks, 159
  generalized function mapping, 154
  Hopfield recurrent networks, 158
  multi-layer network topologies, 158
Artificial neurons, 153, 156
  dynamic, 158
  static, 156
Cartesian product, 72
Classical control, 2
Compositional rules of inference, 81
Computational intelligence, 6, 9, 13, 23, 31, 41, 153

Computer integrated manufacturing (CIM), 23, 36
Control:
  protocols, 119
  rules, 119
Conventional control, 39
Deep knowledge, 18, 32
De-fuzzification, 98
  center of area (COA), 98
  center of gravity (COG), 98
Elemental artificial neuron, 156
Embedded fuzzy controllers, 123
Evolutionary:
  algorithms, 20
  computation, 203
  control, 8
  controller suitability, 237
  decoding, 212
  design of conventional controllers, 235
  design of intelligent controllers, 221

  operations, 205
    crossover, 206, 214
    mutation, 206, 214
    recombination (see crossover)
    selection, 206, 215
    simulated evolution, 205
  optimization, 205, 208
  programming, 211
  strategies, 211
Expert systems, 13, 49
  classification, 32
  development, 18
  diagnosis of malfunctions, 28
  elements, 15
  energy management, 26
  fault diagnosis, 20, 24
  fault prediction, 24
  implementation, 19
  industrial controller design, 24
  LISP machines, 19
  need, 17
  operator training, 22
  paradigms, 20
  plant simulation, 22
  prediction of emergency plant conditions, 26
  predictive maintenance, 25
  product design, 21
  production scheduling, 27
  representation of knowledge, 20
  shells, 18, 20
  supervisory control systems, 23
  tools and methods, 18
Flexible manufacturing systems (FMS), 21, 28
Fuzzification, 91, 96

  degree of fulfillment, 91, 96
  graphical interpretation of, 93
Fuzzy:
  algorithm, 59
  associative memory (FAM), 103, 115
  conditional statements, 58
  control, 54, 89
    algorithm, 89
  controllers, 105
    coarse-fine, 117
    decomposition, 90
    embedded, 123
    gain-scheduled, 40, 136
    generalized three-term, 108
    generic two-term, 113
    hybrid architectures, 112
    integrity, 101
    optimization using genetic algorithms, 221
    partitioned architecture, 109
    real-time, 119
    robustness, 107
    Takagi-Sugeno, 136, 144
    three-term, 107
  fitness criteria, 236
  gain-scheduling, 136, 146
  implications, 78
    Boolean, 78
    GMP, 80
    Larsen, 80
    Lukasiewicz, 78
    Mamdani, 79
    Zadeh, 79
  inference engine, 71
    degree of fit, 71
  linguistic variables, 64
  logic, 1, 7, 8
    algorithm, 59, 74
    basic concepts, 54
  logic control (FLC) (see also Fuzzy controllers), 2

  operators, 60
    conjunctive, 72
    propositional implication, 71
  reasoning, 71, 76
  relational matrix, 73
  sets, 55, 100
    algebraic properties of, 64
    choice of, 268
    coarseness of, 100
    completeness of, 101
    linguistic descriptors of, 57
    membership function of, 55
    operations on, 63
      complement, 63, 64
      connectives, 69
      DeMorgan's theorem, 64
      intersection, 63, 64
      product, 63
      union, 63, 64
    shape of, 100
    support set of, 56
  singletons, 67, 92
  systems, 32
  variable, 57
    universe of discourse of, 55
Gain-scheduled controllers, 40
Generalized:
  Modus Ponens (GMP), 77
  Modus Tollens (GMT), 77
Genetic algorithms (GAs), 8, 205, 211
  fitness functions, 203, 206, 213
  initialization, 212
  parameters, 217
Hard control, 42
Human:
  intelligence, 6
  operators, 59

Industrial:
  control, 23
  controller optimization, 232
Inference engine, 6, 15, 17
  effectiveness, 37
  quality, 37
Intelligent:
  agents, 124
  control, 6, 11, 31, 41, 43
    autonomy, 45
    basic elements, 34
    conditions for use, 33
    objectives, 34
    techniques, 39
  controllers, 7, 35
    correctness, 10
    extendibility, 10
    precision, 10
    reusability, 10
    robustness, 10, 40
  systems, 10
    acceptance, 37
    architecture, 46
    design tools and methods, 18
    distributed architecture, 47
    efficiency, 37
    hierarchical structure, 46
Knowledge:
  base, 76
  based systems, 13, 48
  embedded, 193
  empirical, 33
  engineering, 37
  and experience, 31
  heuristic, 135
Learning machines, 154
Linguistic:
  controller, 261
  descriptors, 57

  rule matrix, 186
  rules, 8, 14, 16, 20, 32, 59
  values, 57
  variables, 57
Mamdani, 1, 79, 84, 111
Membership function, 55
  generic S, 66
  generic Π, 67
Model-based fuzzy control (see also Takagi-Sugeno controllers), 135, 136
Modern control, 3
Multi-level relay controller, 111
Neural:
  control, 51, 153, 160
    learning and adaptation, 161
    parallel processing, 161
    rule-based, 181
  controllers, 160
    architectures, 162
      indirect learning, 166
      inverse model, 164
      specialized training, 165
    design using genetic algorithms, 222
    fidelity, 163
    indirect training of, 166
    inverse model of, 164
    multi-variable, 161
    properties of, 161
    rule-based, 181
  network training algorithms, 169
    back-propagation (BP), 169, 176
      flow chart, 180
    Delta, 173
    least mean squares (LMS), 171

    multi-layer, 175
    supervised learning, 169
    unsupervised learning, 169
    Widrow-Hoff, 170
Neuro-fuzzy control, 8, 51, 193
  architectures, 194
    isomorphism, 195
  fuzzification of neural controllers, 195
  neuralization of fuzzy controllers, 195
Numerical Fuzzy Associative Memory (NFAM), 187
Perceptron, 154
Procedural knowledge, 16
Real-time:
  expert systems, 26
  execution scheduler, 124
  fuzzy control, 119
Relational algorithm, 7
Representation of knowledge, 20
Response asymmetry compensation, 269
Rule:
  composition, 82
  conflict, 102
  encoding, 182
  granularity, 116
Rule-based:
  neural control, 181
  network training, 183
Saridis' principle, 10, 46
Shallow knowledge, 18, 24, 32
Simulated annealing, 8, 225
  Metropolis algorithm, 226
  optimization:
    constrained, 229
    industrial controller, 232

Soft:
  computing, 7, 41, 42, 154, 203
  control, 42
Supervisory fuzzy controllers, 120
  de-fuzzifier, 120
  fuzzifier, 120
  inference engine, 120
  knowledge-base, 120
  real-time data base, 120
Symbolic representation, 9
Synaptic weights, 156
Takagi-Sugeno controllers (see also Model-based controllers), 136, 144
  first approach, 136
  fuzzy control law, 141
  fuzzy process models, 139
  fuzzy variables and fuzzy spaces, 137
  locally linearized process model, 142
  second approach, 144
  stability conditions, 144


Uncertainty and vagueness, 7, 53
Unconventional control, 6, 40
Universe of discourse, 55
Waste-water treatment control, 126
Widrow-Hoff training algorithm, 170, 172, 173
Zadeh, 1, 50, 53, 119
