Computational Intelligence in Control Engineering

Foreword
In the last 50 years, Automatic Control Theory has developed into a well-established engineering discipline that has found application in space technology, industry, household appliances and other technological implementations. It was designed to monitor and correct the performance of systems without the intervention of a human operator. Lately, with the growth of digital computers and the universal acceptance of systems theory, it has also been applied to softer fields of human interest such as ecology, economics and biology. In the meantime, being a dynamic discipline, Automatic Control has, with the aid of the digital computer, evolved from simple servomechanisms to an autonomous, self-organizing, decision-making methodology that has been given the name Intelligent Control. Several manifestations of Intelligent Control have been proposed by various scientists in the literature. Fuzzy, Neural, Hierarchical Intelligent, Cerebellar and Linguistic control systems are typical examples of such theoretically developed Intelligent Controls. However, the application of such sophisticated methodologies to real-life problems lags far behind the theory. The areas with the greatest need for, and the least tolerance of, the techniques resulting from such theoretical research are industrial complexes. The main reason is the lack of suitable intelligent computational algorithms and interfaces designed especially for their needs. This book attempts to remedy this by first presenting the theory and then developing computational algorithms that can be adapted to the various industrial applications that require Intelligent Control for efficient production.
The author, who was one of the first to actually implement Intelligent Control in industry, accomplishes this goal by developing, step by step, some of the most important Intelligent Computational Algorithms. His industrial experience, coupled with a strong academic background, has been channeled into creating a book that is suitable both for graduate academic education and as a manual for the practicing industrial engineer. Such a book fills a major gap in the global literature on Computational Intelligence and could also serve as a text for the developing areas of biological, societal and ecological systems. I am very proud to introduce such an important work.

George N. Saridis
Professor Emeritus
Rensselaer Polytechnic Institute
Troy, New York, 1999
Preface
Conventional control techniques based on industrial three-term controllers are almost universally used in industry and manufacturing today, despite their limitations. Modern control techniques have proved difficult to apply because of the difficulty of establishing faithful microscopic models of the processes under control. It is not surprising, therefore, that manual control constitutes the norm in industry. In the early 1970s Intelligent Control techniques, which emulate by machine the way humans process knowledge about controlling a process, appeared and a new era of control was born. Intelligent Control has come a long way since then, breaking down the barriers of industrial conservatism with impressive results. Intelligent Control, which includes Fuzzy, Neural, Neuro-fuzzy and Evolutionary Control, is the result of applying Computational Intelligence to the control of complex systems. This class of unconventional control systems differs radically from conventional (or hard) control systems that are based on classical and modern control theory. The techniques of Intelligent Control are being applied increasingly to industrial control problems and are leading to solutions where conventional control methods have proved unsuccessful. The outcome of their application to industry and manufacturing has been a significant improvement in productivity, reduced energy consumption and improved product quality, factors that are of paramount importance in today's global market. The first chapter presents an introduction to Computational Intelligence, the branch of Soft Computing which includes Expert Systems,
Fuzzy Logic, Artificial Neural Networks and Evolutionary Computation (Genetic Algorithms and Simulated Annealing), with special emphasis on its application to Control Engineering. The theoretical background required to allow the reader to comprehend the underlying principles has been kept to a minimum. The reader is expected to possess a basic familiarity with the fundamentals of conventional control, since it is inconceivable that unconventional control techniques can be applied without an understanding of conventional ones. The book is written at a level suitable for both undergraduate and graduate students as well as for practicing engineers who are interested in learning about unconventional control systems that they are likely to see in increasing numbers in the next millennium. The primary objective of the book is to show the reader how the techniques of Computational Intelligence can be fused and applied to the design of Intelligent Systems that, unlike conventional control systems, can learn, remember and make decisions. After many years of teaching in higher education, the author took leave to work in industry, only to face the technology gap between control theory and practice firsthand. He is one of that rare breed of academics who had a free hand to experiment on-line on large-scale chemical processes. He spent considerable time trying to apply conventional modern control techniques but, frustrated with the outcome, sought unconventional techniques that could and did yield solutions to the difficult practical control problems that he faced. His search led him first to fuzzy control and later to neural control, which he applied to the process industry with considerable success. Those pioneering years in industry proved critical to his thinking about control practice and the use of Computational Intelligence, which is proving to be a powerful tool with which to bridge the technology gap. After some ten years in industry, the author returned to academe, applying reverse technology transfer by instructing his students in the Intelligent Control techniques that have proved effective in industry. This book is the result of the experience he gained during those years in industry and of teaching this material to his graduate class on Intelligent Control; many of the examples presented in the book are drawn from that experience. Chapter 1 is an introduction to the techniques of Computational Intelligence, their origins and application to Control Engineering. Conventional and Intelligent Control are compared, with a view to focusing
on the differences which led to the need for Intelligent Control in industry and manufacturing. Chapter 2 discusses Expert Systems with reference to their engineering applications and presents some common applications in industry and manufacturing. Chapter 3 discusses Intelligent Control Systems, their goals and objectives, while Chapter 4 discusses their principal components. The elements of Fuzzy Logic on which Fuzzy Controllers are based are presented in Chapter 5, while Chapter 6 discusses the mechanisms of Fuzzy Reasoning, i.e., the inference engine that is the kernel of every fuzzy controller. Chapter 7 defines the fuzzy algorithm, methods of fuzzification and de-fuzzification and outlines the principal fuzzy controller design considerations. Chapter 8 presents fuzzy three-term industrial controllers, which are replacing many conventional three-term controllers in the industrial environment. The requirements for real-time fuzzy controllers, both supervisory and embedded, are discussed in Chapter 9, which also includes examples of industrial applications. Chapter 10 outlines the Takagi-Sugeno model-based fuzzy controller design technique and fuzzy gain-scheduling, which fuse conventional and fuzzy control. Neural Control, the second important technique of Intelligent Control, is presented in Chapter 11. The elemental artificial neuron and multi-layer artificial neural networks that form the kernel of neural controllers are introduced in this chapter. The delta and back-propagation algorithms, two of the most common algorithms for training neural networks, are described in Chapter 12. Chapter 13 discusses how neural controllers can be trained from linguistic control rules identical to those used in fuzzy control. Finally, the result of fusing fuzzy and neural techniques of Computational Intelligence in the design of hybrid neuro-fuzzy controllers is discussed in Chapter 14. Evolutionary Computation, the latest entrant in the field of Computational Intelligence, and Genetic Algorithms, the best known example of stochastic numerical optimization techniques, are presented in Chapter 15. Chapter 16 introduces Simulated Annealing, a stochastic technique that has found considerable application in engineering optimization. Finally, Chapter 17 demonstrates how these two techniques can be used to advantage in the design of conventional and intelligent controllers. An extensive bibliography on Computational Intelligence and its applications is presented in Chapter 18.
Appendix A offers a step-by-step case study of the design of a fuzzy controller for a realistic non-linear dynamic plant using MATLAB and its Fuzzy Toolbox. Appendices B and C offer listings of MATLAB m-files for the Genetic and Simulated Annealing Algorithms. Finally, Appendix D presents a listing of a MATLAB m-file for training industrial neural controllers using the Neural Toolbox.
Acknowledgments
This book would not have been written had it not been for two people: an anonymous kidney donor and Mark Hardy, M.D., Auchinloss Professor of Surgery in the Department of Surgery at the College of Physicians & Surgeons of Columbia University in New York, who performed the transplant. Together, they gave the author that most precious gift: life. He is forever indebted to them. The author also gratefully acknowledges the contributions of his colleagues and former students at the University of Patras in Greece: N. Antonopoulos to Chapter 2, K. Kouramas to Chapters 2 and 10, P. Skantzakis to Chapter 11, G. Tsitouras and G. Nikolopoulos to Chapter 13, and V. Goggos to Chapters 15, 16 and 17.

Robert E. King
[email protected]
October 2004
Series Introduction
Many textbooks have been written on control engineering, describing new techniques for controlling systems, or new and better ways of mathematically formulating existing methods to solve the increasingly complex problems faced by practicing engineers. However, few of these books fully address the applications aspects of control engineering. It is the intention of this new series to redress this situation. The series will stress applications issues, and not just the mathematics of control engineering. It will provide texts that not only contain an exposé of both new and well-established techniques, but also present detailed examples of the application of these methods to the solution of real-world problems. The authors will be drawn from both the academic world and the relevant applications sectors. There are already many exciting examples of the application of control techniques in the established fields of electrical, mechanical (including aerospace), and chemical engineering. We have only to look around in today's highly automated society to see the use of advanced robotics techniques in the manufacturing industries; the use of automated control and navigation systems in air and surface transport systems; the increasing use of intelligent control systems in the many artifacts available to the domestic consumer market; and the reliable supply of water, gas, and electrical power to the domestic consumer and to industry. However, there are currently many challenging problems that could benefit from wider exposure to the applicability of control methodologies, and the systematic, systems-oriented basis inherent in the application of control techniques.
This new series will present books that draw on expertise from both the academic world and the applications domains, and will be useful not only as academically recommended course texts but also as handbooks for practitioners in many applications domains.

Neil Munro
Contents
Series Introduction by Neil Munro
Foreword by George N. Saridis
Preface
1. Introduction
   1.1 Conventional Control
   1.2 Intelligent Control
   1.3 Computational Intelligence in Control
2. Expert Systems in Industry
   2.1 Elements of an Expert System
   2.2 The Need for Expert Systems
   2.3 Stages in the Development of an Expert System
   2.4 The Representation of Knowledge
   2.5 Expert System Paradigms
       2.5.1 Expert systems for product design
       2.5.2 Expert systems for plant simulation and operator training
       2.5.3 Expert supervisory control systems
       2.5.4 Expert systems for the design of industrial controllers
       2.5.5 Expert systems for fault prediction and diagnosis
       2.5.6 Expert systems for the prediction of emergency plant conditions
       2.5.7 Expert systems for energy management
       2.5.8 Expert systems for production scheduling
       2.5.9 Expert systems for the diagnosis of malfunctions
3. Intelligent Control
   3.1 Conditions for the Use of Intelligent Control
   3.2 Objectives of Intelligent Control
4. Techniques of Intelligent Control
   4.1 Unconventional Control
   4.2 Autonomy and Intelligent Control
   4.3 Knowledge-Based Systems
       4.3.1 Expert systems
       4.3.2 Fuzzy control
       4.3.3 Neural control
       4.3.4 Neuro-fuzzy control
5. Elements of Fuzzy Logic
   5.1 Basic Concepts
   5.2 Fuzzy Algorithms
   5.3 Fuzzy Operators
   5.4 Operations on Fuzzy Sets
   5.5 Algebraic Properties of Fuzzy Sets
   5.6 Linguistic Variables
   5.7 Connectives
6. Fuzzy Reasoning
   6.1 The Fuzzy Algorithm
   6.2 Fuzzy Reasoning
       6.2.1 Generalized Modus Ponens (GMP)
       6.2.2 Generalized Modus Tollens (GMT)
       6.2.3 Boolean implication
       6.2.4 Lukasiewicz implication
       6.2.5 Zadeh implication
       6.2.6 Mamdani implication
       6.2.7 Larsen implication
       6.2.8 GMP implication
   6.3 The Compositional Rules of Inference
7. The Fuzzy Control Algorithm
   7.1 Controller Decomposition
   7.2 Fuzzification
       7.2.1 Steps in the fuzzification algorithm
   7.3 De-fuzzification of the Composite Controller Output Membership Function
       7.3.1 Center of area (COA) de-fuzzification
       7.3.2 Center of gravity (COG) de-fuzzification
   7.4 Design Considerations
       7.4.1 Shape of the fuzzy sets
       7.4.2 Coarseness of the fuzzy sets
       7.4.3 Completeness of the fuzzy sets
       7.4.4 Rule conflict
8. Fuzzy Industrial Controllers
   8.1 Controller Tuning
   8.2 Fuzzy Three-Term Controllers
       8.2.1 Generalized three-term controllers
       8.2.2 Partitioned controller architecture
       8.2.3 Hybrid architectures
       8.2.4 Generic two-term fuzzy controllers
   8.3 Coarse-Fine Fuzzy Control
9. Real-time Fuzzy Control
   9.1 Supervisory Fuzzy Controllers
   9.2 Embedded Fuzzy Controllers
   9.3 The Real-time Execution Scheduler
10. Model-Based Fuzzy Control
   10.1 The Takagi-Sugeno Model-Based Approach to Fuzzy Control
   10.2 Fuzzy Variables and Fuzzy Spaces
   10.3 The Fuzzy Process Model
   10.4 The Fuzzy Control Law
   10.5 The Locally Linearized Process Model
       10.5.1 Conditions for closed system stability
   10.6 The Second Takagi-Sugeno Approach
   10.7 Fuzzy Gain-Scheduling
11. Neural Control
   11.1 The Elemental Artificial Neuron
   11.2 Topologies of Multi-layer Neural Networks
   11.3 Neural Control
   11.4 Properties of Neural Controllers
   11.5 Neural Controller Architectures
       11.5.1 Inverse model architecture
       11.5.2 Specialized training architecture
       11.5.3 Indirect learning architecture
12. Neural Network Training
   12.1 The Widrow-Hoff Training Algorithm
   12.2 The Delta Training Algorithm
   12.3 Multi-layer ANN Training Algorithms
   12.4 The Back-propagation (BP) Algorithm
13. Rule-Based Neural Control
   13.1 Encoding Linguistic Rules
   13.2 Training Rule-Based Neural Controllers
14. Neuro-Fuzzy Control
   14.1 Neuro-Fuzzy Controller Architectures
   14.2 Neuro-Fuzzy Isomorphism
15. Evolutionary Computation
   15.1 Evolutionary Algorithms
   15.2 The Optimization Problem
   15.3 Evolutionary Optimization
   15.4 Genetic Algorithms
       15.4.1 Initialization
       15.4.2 Decoding
       15.4.3 Evaluation of the fitness
       15.4.4 Recombination and mutation
       15.4.5 Selection
       15.4.6 Choice of parameters of a GA
   15.5 Design of Intelligent Controllers Using GAs
       15.5.1 Fuzzy controllers
       15.5.2 Neural controllers
16. Simulated Annealing
   16.1 The Metropolis Algorithm
   16.2 Application Examples
17. Evolutionary Design of Controllers
   17.1 Qualitative Fitness Function
   17.2 Controller Suitability
18. Bibliography
   A. Computational Intelligence
   B. Intelligent Systems
   C. Fuzzy Logic and Fuzzy Control
   D. Fuzzy Logic and Neural Networks
   E. Artificial Neural Networks
   F. Neural and Neuro-Fuzzy Control
   G. Computer and Advanced Control
   H. Evolutionary Algorithms
   I. MATLAB and its Toolboxes
Appendix A  Case Study: Design of a Fuzzy Controller Using MATLAB
   A.1 The Controlled Process
   A.2 Basic Linguistic Control Rules
   A.3 A Simple Linguistic Controller
   A.4 The MATLAB fuzzy Design Tool
   A.5 System Stabilization Rules
   A.6 On the Universe of Discourse of the Fuzzy Sets
   A.7 On the Choice of Fuzzy Sets
   A.8 Compensation of Response Asymmetry
   A.9 Conclusions
Appendix B  Simple Genetic Algorithm
Appendix C  Simulated Annealing Algorithm
Appendix D  Network Training Algorithm
Index
Chapter 1
Introduction
Modern control theory, which has contributed so significantly to the exploration and conquest of space, has not had similar success in solving the control problems of industry and manufacturing. Despite the progress in the field since the 1950s, the chasm between theory and practice has been widening and many of the needs of industry remain unmet. Industry has had little choice, therefore, but to rely heavily on conventional (sometimes termed hard) control techniques that are based on industrial three-term controllers. Unfortunately, these simple and ubiquitous devices cannot always cope with the demands and complexity of modern manufacturing systems. The chasm between theory and practice has led to a search for new and unconventional techniques, not subject to the constraints and limitations of modern control theory, with which to solve the control problems faced by industry and manufacturing. The breakthrough came in the mid-1960s with the introduction of Fuzzy Logic by Zadeh. The application of Zadeh's theory to control was to come almost ten years later and it was to take even more years before it received the respect and acceptance that it rightly deserved. At about the same time, Widrow demonstrated the use of ADALINEs (Adaptive Linear Networks), a primitive form of Artificial Neural Networks (ANNs), in control. This was a radical departure from conventional control since a generic controller was trained to perform a specific task instead of being designed.
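To illustrate the idea of training rather than designing a controller, the following sketch trains a single ADALINE-style linear element with the least-mean-squares (Widrow-Hoff) rule. The training data, learning rate and the simple proportional control task it learns are hypothetical choices made for this illustration; they are not taken from the book.

```python
import numpy as np

def train_adaline(inputs, targets, lr=0.05, epochs=50):
    """Train a single linear element y = w.x + b with the LMS (Widrow-Hoff) rule."""
    w = np.zeros(inputs.shape[1])
    b = 0.0
    for _ in range(epochs):
        for x, target in zip(inputs, targets):
            y = np.dot(w, x) + b        # linear output of the element
            error = target - y          # deviation from the desired response
            w += lr * error * x         # LMS weight update
            b += lr * error
    return w, b

# Hypothetical task: learn the proportional control action u = 2*e from examples alone
e = np.linspace(-1.0, 1.0, 21).reshape(-1, 1)
u = 2.0 * e.ravel()
w, b = train_adaline(e, u)
print(w, b)   # the weight approaches 2.0 and the bias approaches 0.0
```

The point of the sketch is simply that the element acquires the desired input-output behavior from examples, rather than from an explicit design procedure.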
The two approaches were developed independently and it was to take many years before these concepts were applied to any degree. The application of Fuzzy Logic to Control Engineering was first demonstrated in Europe and Japan in the mid-1970s. Mamdani presented the first demonstration of Fuzzy Logic in 1974 on an experimental process. This demonstration of Fuzzy Logic Control (FLC) gave the impetus for a seemingly endless series of applications, which continues unabated to this day. With a few notable exceptions, Zadeh's theory of Fuzzy Logic went unnoticed in the West for many years while, in the meantime, there was a frenzy of activity in Japan applying the theory to such varied fields as home appliances, cameras and transportation systems. Not until the early 1980s did industries in the West seriously consider applying fuzzy control. At the forefront of this thrust was the process industry and in particular the cement industry, which was the first to apply the new technique to control large-scale processes. The developments in the field since then have been impressive and today there are hundreds of plants worldwide being successfully controlled by such techniques. The field of Artificial Neural Networks, which developed quite separately, has had a difficult evolution. Appearing in the 1970s as a field that offered much promise and potential, it was thwarted by inadequate computational facilities and a lack of effective network training algorithms. Re-emerging in the 1980s, by which time significant progress had been made in both training algorithms and computer hardware, research and development in the field has evolved rapidly. Artificial Neural Networks can be found today in a host of applications, including communications, speech analysis and synthesis, and control.
In practice, empirical methods are generally used for designing appropriate conventional controllers. These methods depend on empirical knowledge of the dynamic behavior of the controlled plant, derived from measurements of the controlled and manipulated variables of that plant. Traditionally, industry has relied heavily on three-term (PID) controllers, which are incorporated today in most Remote Terminal Units (RTUs) and Programmable Logic Controllers (PLCs). The ubiquitous three-term controller is used to control all kinds of devices, industrial processes and manufacturing plants. Their tuning is based on simple approximants of the controlled plant dynamics and on design methods such as the classical ones of Ziegler and Nichols or more modern techniques such as those of Persson and Astrom. Most often in practice, tuning is performed heuristically by expert tuners in situ. Without doubt, these simple industrial controllers have offered sterling service for many decades and will continue to do so for many more, wherever simplicity and robustness are essential and control specifications permit. However, three-term controllers cannot always satisfy the increasing complexity of modern industrial plants and the demands for high flexibility, productivity and product quality, which are essential in today's very competitive global market. The problem is further aggravated by the increasing environmental restrictions being placed on industry and manufacturing. Modern Control was introduced in the early 1960s and is a rigorous methodology that has proved invaluable for finding solutions to well-structured control problems. With a few notable exceptions, however, its application to industry has been disappointing and few industrial controllers are designed with this methodology. The reasons for this discrepancy are the complexity, uncertainty and vagueness that characterize industrial processes, conditions that do not allow for ready modeling of the controlled plant, which is essential to the application of modern control methodologies. Despite more than five decades of research and development in the theory and practice of Control Engineering, most industrial processes are by and large still controlled manually. Today, Supervisory Control And Data Acquisition (SCADA) systems and Distributed Control Systems (DCS) make the operator's task considerably easier. A partial schematic of such an information system, using a distributed architecture, is shown in Figure 1.1.
[Figure 1.1: A distributed plant information system linking RTUs and operator consoles over a LAN.]
The operator console has one or more screens that display the essential variables of the plant through a graphical user interface, by which the operator interacts with the plant. A typical example of such a display is shown in Figure 1.2. In plants where the various sub-processes interact, it is clear that the control problem can be severe, requiring operator skills that can only be acquired after years of experience. Today, multimedia and virtual reality are finding their way into the control room, improving the man-machine interface, making decision-making considerably easier and the work environment more tolerable.
One or more human operators normally supervise a cluster of sub-processes, receiving data on the state of the plant and sending corrections to the set points of the local controllers, which are distributed throughout the plant so that the plant remains at its nominal state despite external disturbances. These local controllers are often embedded in RTUs that are also capable of performing sequential switching control, data acquisition and communications with the supervisory control system and the operators' consoles via a local area network.
In most industrial applications, human operators close the loop between the controlled and the control variables of the controlled plant. Operators respond to observations of the principal variables of the plant and continuously strive to satisfy often-conflicting objectives, e.g., maximizing productivity and profit while minimizing energy demand. Proper operation of a process is thus very much dependent on the experience of the operator, his knowledge of the process and its dynamics and the speed with which he responds to plant disturbances, malfunctions and disruptions. The yield of a process can vary quite significantly from operator to operator, and less experienced operators are often unable
to control a plant effectively, particularly under abnormal situations that they have never met before. The control actions of human operators are subjective, frequently incomprehensible and often prone to errors, particularly when the operators are under stress. Indeed, in the case of abnormal operating (i.e., alarm) conditions, their actions may be potentially dangerous and there is little margin for error. Delays in making decisions can lead to disastrous results, as was amply demonstrated in the Chernobyl nuclear reactor disaster. Thus in modern complex plants there exists a very real need to assist operators in their decision-making, particularly in abnormal situations in which they are often bombarded with conflicting signals. The advent of Computational Intelligence and unconventional control frees operators of many of the tedious and complex chores of monitoring and controlling a plant, assuring them of fast and consistent support in their decision-making.
Unconventional control draws instead on techniques such as connectionism and parallel distributed processing for dealing with vagueness and uncertainty. This is the domain of Soft Computing, which focuses on the stochastic, vague, empirical and associative situations typical of the industrial and manufacturing environment. Intelligent Controllers (sometimes termed soft controllers) are derivatives of Soft Computing, characterized by their ability to establish the functional relationship between their inputs and outputs from empirical data, without recourse to explicit models of the controlled process. This is a radical departure from conventional controllers, which are based on explicit functional relations. Unlike their conventional counterparts, intelligent controllers can learn, remember and make decisions. The functional relationship between the inputs and outputs of an intelligent controller can be specified either indirectly, by means of a relational algorithm, relational matrix or knowledge base, or directly, from a specified training set.
The first category belongs to the domain of Fuzzy Systems, while Artificial Neural Networks belong to the second. Generality, in which similar inputs to a plant produce similar outputs so that sensitivity to perturbations in the plant inputs is minimized, is an inherent feature of such systems. Generality implies that the controller is capable of operating correctly on information beyond the training set. Intelligent controllers, whatever form they may take, share the following properties: they use the same process states, they use parallel distributed associative processors, they assure generality, and they are capable of codifying and processing vague data.
The principal medium of intelligent control is Computational Intelligence, the branch of Soft Computing which includes Expert Systems, Fuzzy Logic, Artificial Neural Networks and their derivatives. Evolutionary Computation (Genetic Algorithms and Simulated Annealing) is a very recent addition to this rapidly evolving field.
These techniques appeared only relatively recently. Numerous successful applications in a variety of fields attest to their usefulness and power. Computational Intelligence uses numerical representations of knowledge, in contrast to Artificial Intelligence, which uses symbolic representations. This feature is exploited in Control Engineering, which deals with numerical data, since the control and controlled variables are both defined numerically. Computational Intelligence thus adapts naturally to the engineering world, requiring no further data conversion. The techniques of Computational Intelligence share the following properties: they use a numerical representation of knowledge, demonstrate adaptability, have an inherent tolerance to errors, and possess speeds comparable to those of humans.
Intelligent controllers infer the control strategy that must be applied to a plant in order to satisfy specific design requirements. This action can be the result of operations on a set of pre-specified linguistic control rules, as in the case of Fuzzy Controllers, or of training an artificial neural network with numerically coded rules, as in the case of Neural Controllers. In either case, the primary objective is to generate control actions that closely match those of an expert human operator. In this manner, the controller can assist the human operator to maintain the plant under his supervision at its nominal operating state while simultaneously compensating for the inconsistency and unreliability brought about by fatigue, boredom and difficult working conditions. Intelligent controllers can be trained to operate effectively in conditions of vagueness and uncertainty of both the plant state and the plant environment and can respond to unforeseen situations autonomously, i.e., without intervention from the plant operator. They differ, however, from their human counterparts in their inability to learn new control rules or to adapt to new situations for which they have not been trained. Self-organizing controllers that have the ability to learn new rules on-line have been variously proposed in the literature and tried out in the laboratory, but none has so far been commissioned in a manufacturing plant. The main reason is that this class of controllers presupposes extended testing and experimentation on the controlled plant under normal operating conditions, a situation that few plant managers are likely to entertain.
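As a rough illustration of how a controller can infer a control action from pre-specified linguistic rules, here is a minimal single-input sketch. The membership functions, rule base and output centres are hypothetical and far simpler than anything used in practice; the defuzzification step is a plain weighted average.

```python
def tri(x, a, b, c):
    """Triangular membership function with feet at a and c and peak at b."""
    left = (x - a) / (b - a) if b != a else 1.0
    right = (c - x) / (c - b) if c != b else 1.0
    return max(0.0, min(left, right))

# Hypothetical fuzzy sets for the error signal and crisp centres for the output action
error_sets = {"negative": (-1.0, -1.0, 0.0),
              "zero":     (-1.0,  0.0, 1.0),
              "positive": ( 0.0,  1.0, 1.0)}
output_centres = {"decrease": -1.0, "hold": 0.0, "increase": 1.0}

# Linguistic rules of the form: IF error IS <set> THEN output IS <action>
rules = {"negative": "increase", "zero": "hold", "positive": "decrease"}

def fuzzy_control(error):
    """Fire every rule to its degree of membership and defuzzify by weighted average."""
    num = den = 0.0
    for label, (a, b, c) in error_sets.items():
        weight = tri(error, a, b, c)                 # degree to which this rule fires
        num += weight * output_centres[rules[label]]
        den += weight
    return num / den if den else 0.0

print(fuzzy_control(-0.4))   # a moderate corrective increase (0.4)
```

The controller never consults a model of the plant; it merely interpolates between the linguistic rules, which is the essence of the rule-based approach described above.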
A variety of architectures have been proposed for the design and implementation of high-level intelligent controllers for large-scale systems. One of the most useful is the hierarchical architecture proposed by Saridis in the mid-1970s. In this architecture, information from the controlled plant flows with decreasing frequency from the lowest to the highest layer of the hierarchy. In contrast, management directives (on such matters as production quotas, product qualities, etc.) flow in the reverse direction with increasing frequency as they descend the hierarchy, leading ultimately to selection of the best control strategy to impose on the plant. Saridis' principle, on which a number of successful intelligent hierarchical process management and control systems have been developed, can be paraphrased as: increasing/decreasing precision is accompanied by decreasing/increasing intelligence. It is useful, finally, to note the features that every Intelligent System involving clusters of intelligent controllers must support:

Correctness, i.e., the ability to operate correctly for specific sets of commands and plant safety constraints.

Robustness, i.e., the ability to operate acceptably despite wide variations in plant parameters. The higher layers of the hierarchy must possess an inherent ability to deal with unforeseen variations.

Extendibility, i.e., the ability to accept extensions to both hardware and software without the need for major modifications to either. Extendibility implies modularity, which is the partitioning of the system into easily modifiable software and hardware modules.

Reusability, i.e., the ability to use the same software in different applications. To possess this feature, the system must be general or possess an open architecture.

The field of intelligent control is one of the most exciting and promising new directions of automatic control, opening up new frontiers for research and development in radical solutions to the control of industrial systems in the new millennium.
Chapter 2
Expert Systems in Industry
In an Expert System, the knowledge elicited from human experts is typically represented in the form of linguistic rules that describe the actions that must be taken in response to specified excitations.
[Figure: the branches of Computational Intelligence, comprising Expert Systems, Fuzzy Systems, Neural Systems, Neuro-fuzzy Systems and Evolutionary Computing.]
There are many techniques for representing knowledge and each one has its advantages and disadvantages. The principal theoretical research issue is how to give Expert Systems the ability to search through the domain knowledge systematically and arrive at decisions rapidly. The following are techniques commonly used for representing knowledge:
predicate logic, semantic networks, procedural representation, production systems, and frames.
The knowledge base of an Expert System contains two classes of knowledge: facts, which constitute ephemeral information subject to change with time (e.g., plant variables), and procedural knowledge, which refers to the manner in which experts in the specific field of application arrive at their decisions. Procedural knowledge (e.g., information flows, control sequences and actions, etc.), together with the step-by-step procedure that must be followed in the specific manufacturing plant, is known by production engineers and is the result of years of experience of working with the plant or process. This is one of the principal reasons why Expert Systems have attracted so much attention in the industrial world. The use of rules is the simplest way to describe a manufacturing procedure, and linguistic rules of the classical if-then-else form are the ones most commonly used by humans.
[Figure 2.2: The basic elements of an Expert System: domain experts, knowledge base, inference engine, explanation sub-system and man-machine interface.]
The basic elements of an Expert System are shown in Figure 2.2. An Expert System includes the following elements:

the knowledge base, which comprises the facts and rules with which to control a plant,

the inference engine, which processes the data in the knowledge base in order to arrive at logical conclusions,

the explanation sub-system, which is capable of giving a rational explanation of how a decision was arrived at,

the knowledge acquisition system, which is used by the knowledge engineers to help them analyze and test the knowledge elicited from human domain experts, and

the man-machine (or user) interface, through which the human operator interacts with the system.
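A minimal sketch of how a knowledge base and an inference engine interact is given below, using simple forward chaining. The facts, rules and plant conditions are hypothetical and are not taken from the book; the trace list stands in for a very crude explanation sub-system.

```python
# Knowledge base: facts (ephemeral plant data) and rules (IF conditions THEN conclusion)
facts = {"kiln_temperature_high", "feed_rate_normal"}
rules = [
    ({"kiln_temperature_high"}, "reduce_fuel"),
    ({"reduce_fuel", "feed_rate_normal"}, "expect_temperature_drop"),
]

def forward_chain(facts, rules):
    """Inference engine: fire every rule whose conditions are all present and add its
    conclusion to the fact base, repeating until nothing new can be inferred."""
    facts = set(facts)
    trace = []                          # explanation sub-system: why each fact was added
    changed = True
    while changed:
        changed = False
        for conditions, conclusion in rules:
            if conditions <= facts and conclusion not in facts:
                facts.add(conclusion)
                trace.append(f"{conclusion} inferred from {sorted(conditions)}")
                changed = True
    return facts, trace

derived, explanation = forward_chain(facts, rules)
print(derived)
print(explanation)
```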
One of the major attractions of Expert Systems in industry is the ability of this class of knowledge-based systems to deal with the vagueness and uncertainty that are characteristic of many industrial plants. A common feature of industrial and manufacturing systems is that the quantitative models that are supposed to predict their dynamic behavior are either unknown or do not possess sufficient fidelity. This is particularly true in the case of large-scale industrial plants, whose quantitative description is a difficult, tedious and occasionally impossible task for lack of sufficient deep knowledge. Deep knowledge is the result of microscopic knowledge of the physical laws that govern the behavior of a plant. In contrast, shallow knowledge is the result of holistic or macroscopic knowledge and is readily available from human domain experts. This knowledge is acquired after years of experience in operating the plant and observing its peculiarities and nuances.
Whereas early Expert Systems were written in Artificial Intelligence-oriented languages, notably LISP, it is inconceivable today to develop such a system without a shell. It should be noted that knowledge elicitation is one of the most painstaking tasks in the design procedure. Human domain experts are often reluctant to part with their knowledge, fearful that divulging knowledge gained after years of experience may lead to their redundancy and termination. In the first stage of development of any Expert System, it is very useful to implement a rapid prototype. The objective here is not the development of a complete Expert System, but a prototype that will form the basis of the final system under development. Once the knowledge engineer has a thorough understanding of the rules elicited from the domain experts and the manner in which decisions are arrived at and justified, he must then encode the knowledge in a form suitable for processing by the Expert System. It is noted that the rapid prototype need not possess all the features of the end product, but should incorporate the basic features that can be evaluated by both the domain experts and the end users. Should the prototype system demonstrate deficiencies and difficulties in inferring decisions, it is clearly preferable to make corrections and improvements at this stage rather than in the end product, when doing so may be very difficult and costly. In the implementation stage of the Expert System, the knowledge elicited from the domain experts is transferred to the Expert System, which runs on a suitable platform. Early Expert Systems invariably ran on powerful workstations or special-purpose computers (such as the short-lived LISP machines), which were subsequently superseded by common microcomputers. Today, most Expert Systems can run on high-end PCs or workstations. Once completed, the Expert System is tested off-line until the end users are convinced of its ability to infer correct results and support its decisions. It is noted that it is often difficult and uneconomic to test the Expert System exhaustively, i.e., for all possible conditions, in practice. For these reasons, end users must develop a close liaison with the Expert System designer, assisting him whenever some discrepancy is observed between their decision and that of the Expert System. Such discrepancies arise from rule conflicts, misunderstandings or errors in the knowledge base.
[Figure 2.3: Expert knowledge in the product cycle; panel labels include product specifications, design, knowledge, interpretation, desired properties, control, diagnosis, supervision, prediction and the production process.]
While conventional Computer Aided Design (CAD) software can process geometric shapes rapidly, the designer also needs to know rapidly certain characteristics of the product being designed, such as strengths, thermal distributions, costs, etc. Expert CAD systems provide all this information while, in addition, advising the designer of alternative shapes from a priori experience with similar designs. The trend in product design today does not yet permit total design with expert CAD systems, since design normally depends on the designer's intuition and aesthetic knowledge, the prehistory of the product and economic factors that are difficult to incorporate in a knowledge base. The final product is a set of diagrams or plans, design specifications and various documents on which manufacturing will then proceed, as shown in Figure 2.3.
One of the major difficulties in the design of plant controllers using conventional control techniques, particularly in the case of large-scale multivariable plants, is the unavailability of explicit models of the plants. For this reason industrial automation leans towards the use of three-term (PID) controllers, and various empirical and semi-empirical design techniques have been proposed to determine the parameters of these controllers. Examples of these design techniques are the well-known methods of Ziegler and Nichols and modern variants due to Persson and Astrom. In contrast, expert controller techniques, which can exploit the knowledge of expert controller tuners, can often offer superior results. A number of vendors currently offer such software products. The use of Expert Systems in the design of industrial controllers has two aspects. The first involves rules on the most appropriate
design technique to use in order to achieve the desired result. These rules depend on the specific plant to be controlled and on the criteria by which the control quality, i.e., the performance of the closed system, is judged. The second aspect involves rules that specify the best control strategy to follow in any situation, given as advice to the operator.
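As an illustration of the first aspect, the classical Ziegler-Nichols ultimate-cycle formulas can be encoded directly as design rules of this kind; the plant values used in the example below are hypothetical.

```python
def ziegler_nichols(Ku, Pu, controller="PID"):
    """Classical Ziegler-Nichols ultimate-cycle tuning rules.
    Ku is the ultimate gain, Pu the period of sustained oscillation in seconds."""
    if controller == "P":
        return {"Kp": 0.5 * Ku}
    if controller == "PI":
        return {"Kp": 0.45 * Ku, "Ti": Pu / 1.2}
    if controller == "PID":
        return {"Kp": 0.6 * Ku, "Ti": Pu / 2.0, "Td": Pu / 8.0}
    raise ValueError("unknown controller type")

# Hypothetical plant: sustained oscillation observed at Ku = 4.0 with period Pu = 30 s
print(ziegler_nichols(4.0, 30.0))   # {'Kp': 2.4, 'Ti': 15.0, 'Td': 3.75}
```

An expert tuning system would wrap rules such as these, together with the heuristics of experienced tuners, in its knowledge base and select among them according to the plant and the performance criteria.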
Expert systems for fault prediction can, for example, detect increased wear of the bearings of rotating machinery due to vibrations, or excess friction due to overheating. This class of on-line, real-time Expert Systems is giving new meaning to the field of predictive maintenance. Productivity benefits through improved estimates of the time-to-go before a catastrophic failure of the equipment is likely to occur. This is particularly important in the case of large equipment for which expensive spare parts have to be kept in stock for use in case of a breakdown; Expert Systems can minimize and even eliminate such stocks through timely procurement. The use of Expert Systems for fault prediction leads to a drastic reduction in the mean time to repair equipment and a corresponding increase in the availability of the equipment and, most importantly, an increase in plant productivity. In preventive maintenance, historical data are gathered from suitable sensors attached to the equipment (e.g., temperatures, pressures, vibrations, etc.). Real-time measurements of critical variables are compared with expected or desired values and any discrepancy is used to diagnose its possible cause from rules embedded in the Expert System. Following spectral analysis of such measurements, of bearing sounds for instance, by standard signal processing techniques, the Expert System suggests what maintenance will be required and when best to perform it. Expert systems for fault diagnosis can be either off-line or on-line. In the former case, maintenance personnel enter into a dialog with the Expert System, supplying answers to questions posed by the Expert System on the health of the equipment. The Expert System then gives instructions on what further measurements and actions should follow in order to focus on the source of the problem, and then gives advice on how to repair it. It is obvious that rapid fault diagnosis is of paramount importance in a manufacturing environment where every minute of lost production results in a loss of profit. It should be evident why expert fault prediction and diagnosis systems have been the subject of considerable commercial interest and have found such extensive application.
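A minimal sketch of the kind of spectral check described above, assuming vibration data sampled from a bearing; the sampling rate, fault frequency and alarm threshold are hypothetical.

```python
import numpy as np

def bearing_alarm(vibration, fs, fault_freq, threshold):
    """Raise an alarm when the spectral amplitude near a known fault frequency
    exceeds a threshold learned from healthy operation."""
    spectrum = np.abs(np.fft.rfft(vibration)) / len(vibration)
    freqs = np.fft.rfftfreq(len(vibration), d=1.0 / fs)
    band = (freqs > fault_freq - 2.0) & (freqs < fault_freq + 2.0)
    peak = spectrum[band].max()
    return peak > threshold, peak

# Hypothetical measurement: 1 kHz sampling, a 157 Hz fault component buried in noise
fs, fault_freq = 1000.0, 157.0
t = np.arange(0, 2.0, 1.0 / fs)
signal = 0.05 * np.random.randn(t.size) + 0.3 * np.sin(2 * np.pi * fault_freq * t)
alarm, amplitude = bearing_alarm(signal, fs, fault_freq, threshold=0.1)
print(alarm, round(float(amplitude), 3))
```

In a real predictive-maintenance system the alarm would trigger the rule base that recommends what maintenance to perform and when.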
It is therefore necessary to predict accurately the power that will be absorbed over each period and to monitor the energy demand, shedding loads in time to avoid exceeding the contractual energy limit. The decision on which loads to shed, and when to do so without disrupting production, is a very difficult and tiring task for a human who would have to make this decision every 15 minutes throughout the day and night. The operator has to know which equipment can be shut down and which must, at all costs, be left running in order to avoid major disruption of the production line or manufacturing plant, and how long it must be before each piece of equipment can be restarted without causing it excess wear. In a large plant this is normally performed by shedding auxiliary equipment that is not considered absolutely essential to the manufacturing plant (e.g., circulation pumps, conveyor belts) and, in the worst case, by a total stoppage of production in periods of high energy cost. Many energy-intensive plants today are forced to shut down production during peak hours in order to conserve energy. Real-time expert energy management systems have been developed and have been very successful in containing energy costs, replacing the human operator in this arduous task. Indeed, avoiding just one or two overload penalties often pays for the cost of the Expert System! The rules on which equipment can be operated, the order in which loads may be shed, and when and how many times per day they can be restarted are elicited from human operators and embedded in the Expert System rule base. The real-time expert energy management system is then executed every few seconds, following prediction of the energy that will have been absorbed by the end of the timing period. Naturally, the magnitude of the load that must be shed is critically dependent on the time-to-go before the end of the period: the shorter the time left, the larger the load that must be shed and the greater the disruption that is incurred. Accurate prediction and effective, fast decisions from the Expert System are essential to proper operation.
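A bare-bones sketch of the shedding decision described above; the demand forecast, the list of sheddable loads and the contractual limit are hypothetical.

```python
def loads_to_shed(predicted_kwh, contract_kwh, sheddable_loads):
    """Return the loads to shed so that the energy predicted for the current demand
    period stays below the contractual limit. Each load is (name, kwh_saved, priority),
    where a higher priority means 'shed last'."""
    excess = predicted_kwh - contract_kwh
    shed = []
    for name, saving, _ in sorted(sheddable_loads, key=lambda load: load[2]):
        if excess <= 0:
            break
        shed.append(name)
        excess -= saving
    return shed

# Hypothetical 15-minute period: 520 kWh predicted against a 500 kWh contractual limit
loads = [("circulation pump A", 8, 1), ("conveyor belt 3", 15, 2), ("auxiliary fan", 5, 1)]
print(loads_to_shed(520, 500, loads))   # low-priority auxiliaries are shed first
```

The priorities and restart constraints are exactly the kind of knowledge that is elicited from the operators and held in the rule base.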
To schedule the production lines used to manufacture a specific product, the production manager must know the production capacity and limitations of each production line, the overall production schedule, equipment and storage capabilities, etc. When a production line is disrupted for whatever reason, it is often necessary to switch production lines and change the priorities with which products are produced, permitting high-priority items to be completed first while lower-priority items are placed in a queue. The long-term production schedule is normally produced on a weekly or monthly basis, but changes to it may become necessary due to equipment failures. When these failures are serious enough to cause extended production disruption, it is necessary to re-compute the production schedule. Operational research techniques based on linear integer or mixed-integer programming are the conventional approach to this problem, but these techniques are time-consuming. An alternative way to reschedule production is through the use of the empirical rules that are followed by production management. Expert scheduling systems using this knowledge and experience are considerably simpler to use, lead to equally feasible schedules much faster, and have been used with excellent results.
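The rule-based alternative to integer programming mentioned above can be sketched as a simple priority re-ordering; the orders, line availability and the rules themselves are hypothetical.

```python
def reschedule(orders, available_lines):
    """Empirical rescheduling rule: keep only orders that can run on an available line,
    then serve higher-priority orders first and earlier due dates first."""
    feasible = [order for order in orders if order["line"] in available_lines]
    return sorted(feasible, key=lambda order: (-order["priority"], order["due_day"]))

orders = [
    {"id": "A17", "line": "line2", "priority": 3, "due_day": 5},
    {"id": "B02", "line": "line1", "priority": 1, "due_day": 2},   # line1 is down
    {"id": "C44", "line": "line3", "priority": 2, "due_day": 4},
]
print([o["id"] for o in reschedule(orders, available_lines={"line2", "line3"})])
# ['A17', 'C44'] -- B02 waits in the queue until line1 is repaired
```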
The rapid diagnosis of equipment malfunctions is of paramount importance in order to maintain high equipment availability and meet production schedules. In large bottling or canning plants, for instance, the sensors are linked to the Factory Data Acquisition (FDA) system and measurements are continuously compared with the desired values. Should some unit along the line malfunction, then clearly both the preceding and succeeding units will suffer the consequences. Due to the interactive nature of most production systems and work cells, it is obvious that when any sub-system malfunctions, the sub-systems up-stream and down-stream will be affected sooner or later. Up-stream units must be stopped in time to avoid strangulation, a consequence of the accumulation of partially finished products that may exceed the capacity of the silos or queues if the malfunction persists for some time, while down-stream units must be stopped because of starvation. Expert systems for the diagnosis of equipment malfunctions contain, embedded in their knowledge base, the rules by which a malfunction can be transmitted to adjacent units. The Expert System continuously monitors the material flows; as long as the mass balance for each unit remains essentially constant, no alarm is issued. However, when some malfunction occurs, the Expert System is executed with the object of determining the source of the fault. Timing is clearly of the essence.
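A minimal sketch of the mass-balance check described above; the unit names, flow figures and tolerance are hypothetical.

```python
def diagnose(flows, tolerance=0.05):
    """Flag the first unit whose mass balance (inflow versus outflow) is violated.
    flows maps each unit to (inflow, outflow), listed in line order."""
    for unit, (inflow, outflow) in flows.items():
        if inflow <= 0:
            continue
        imbalance = abs(inflow - outflow) / inflow
        if imbalance > tolerance:
            return unit, imbalance
    return None, 0.0

# Hypothetical bottling line: the filler passes far fewer units than it receives
flows = {"depalletizer": (100, 99), "filler": (99, 62), "capper": (62, 61)}
unit, imbalance = diagnose(flows)
print(unit, round(imbalance, 2))   # filler 0.37 -> up-stream units risk strangulation
```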
Chapter 3
Intelligent Control
Intelligent control takes a radically different approach to the control of industrial processes and plants from that of conventional control. The knowledge and experience of human operators constitute the basis for this new approach to Control Engineering, for which Computational Intelligence provides the theoretical foundation. In this chapter we summarize the potential and some limitations of intelligent control and attempt to address the questions of how, where, when and under what conditions intelligent control can be applied in practice. Intelligent control seeks solutions to the problem of controlling plants from the viewpoint of the human operator. In other words, the technique seeks to establish some kind of cognitive model of the human operator and not of the plant under his control. This is the point at which intelligent control departs from conventional control, and it is undoubtedly true that the technique would not have been possible but for the rapid progress in computer technology. Computational Intelligence provides the tools with which to make intelligent control a reality. The reproduction of human intelligence and the mechanisms for inferring decisions on the appropriate control actions, strategy or policy to follow are embedded in these tools. Figure 3.1 shows how Computational Intelligence can be classified according to the form of the knowledge (i.e., structured or unstructured) and the manner in which this knowledge is processed (i.e., symbolic or numerical). For control applications, knowledge may be
structured or unstructured, but processing is invariably numerical. Fuzzy and neural control form the core of intelligent control and are the principal components of Computational Intelligence.
Figure 3.1 Classification of Computational Intelligence by the form of knowledge and the manner of processing:

                              PROCESSING
                         Symbolic            Numerical
KNOWLEDGE
     Structured          Expert Systems      Fuzzy Systems
     Unstructured                            Neural Systems
In contrast to conventional control, intelligent control is based on advanced computational techniques for reproducing human knowledge and experience. Thus in intelligent control the focus of interest moves away from the tedious task of establishing an explicit, microscopic model of the controlled plant and the subsequent design of a corresponding hard controller, towards the emulation of the cognitive mechanisms that humans use to infer and support control decisions. Intelligent control has been applied with considerable success in the process industry. Examples can be found in the petrochemical, cement, paper, fertilizer and metals industries. With time, it is predicted that intelligent control will diffuse into most branches of industry and manufacturing and be adopted by progressive organizations that are seeking to improve their strategic position in the global market through improved productivity and product quality.
The choice of a system is not easy. Cost is one factor, but prior experience with a similar plant is considered the most important factor in making a decision on which system to purchase. The socially optimal solution is determined by the support that plant management, production management and plant operators are prepared to give to a particular system to make it successful. In considering intelligent control seriously for solving production control problems that conventional control is unable to solve, answers must be sought to the following questions. Will the proposed system:
repay its cost in finite time?
decrease the cost of production?
increase productivity?
lead to savings in energy?
improve equipment availability?
be simple to use, or will it require specialized knowledge that is not available in-house?
decrease the workload of plant operators?

It is implicitly assumed that the knowledge needed to control the plant is available from the plant operators, who have spent years operating the plant. It should be obvious that intelligent control is not a candidate when this knowledge is absent or incomplete, as in the case of an entirely new plant for which there is no prior experience. In practice this situation is not very likely to occur, since most new plants are based on earlier designs for which some prior knowledge exists. It is difficult to state all the technical properties of a successful intelligent system, since they vary according to the application. The success of an intelligent system is very much dependent on the support of the users of the system. Intelligent controllers are underutilized and even ignored when user support is lacking. Assuming that plant management has been convinced that intelligent control could lead to an improvement in the strategic position of the business, improve productivity and lead to a reduction in production costs, it is still necessary to convince the plant operators to use the system. This is not always an easy task as, by tradition, operators are fearful of any new technology that may undermine their post, future and usefulness in the business. These are natural feelings that have to be taken into account whenever any new system is introduced. This inbred fear can be
greatly reduced by including the plant operators in the system development process and providing adequate training to alleviate their fears. The older generation of plant operators spent years controlling plants from central control rooms with classical instrumentation, adjusting the setpoints of conventional three-term controllers and tediously logging plant activity manually. Today, the new generation of plant operators has been brought up in the era of computers, consoles with graphical user interfaces and all the benefits of Computer Integrated Manufacturing systems. Even the older plant operators have adapted, if sometimes reluctantly, to the new environment. New plant operators no longer view the introduction of advanced technology as a threat but, on the contrary, show great interest and an enviable ability to assimilate it and use it effectively to improve their working conditions. This is especially true where management has had the foresight to provide the necessary training in advance. The days of pulling down control switches and turning control knobs are gone, replaced by the touch of a light pen or a finger on a screen or the click of a keyboard or a mouse. Report generation is a matter of seconds instead of hours, days or even weeks. Information is power, and this can undoubtedly be enhanced through the use of intelligent techniques. From the viewpoint of management, the success of an intelligent control system is judged solely on how rapidly the system will repay its investment. This is measured from the observed (and not assumed) increase in productivity, the reduction in energy consumption and the improvement in the mean time between failures of the plant. Improvements of the order of 5-10% are not uncommon in the process industry. History has shown that, since their introduction, manufacturers that have taken advantage of the new control techniques have benefited significantly on all counts. The specialization required to develop intelligent systems is Knowledge Engineering. It would be very wrong to conclude, however, that no knowledge of conventional control and systems theory is necessary to design such systems. On the contrary, a very thorough knowledge of the abilities and limitations of classical and modern control techniques must constitute the background of the knowledge engineer. The most successful intelligent control systems that have been commissioned have been designed by control engineers with a very thorough background in conventional control techniques. Knowledge Engineering requires the cooperation of knowledge engineers, domain experts and plant operators in the design phase, commissioning and operation of an intelligent system.
Characteristics such as the quality and depth of the knowledge, the effectiveness of the inference engine and the suitability of the man-machine interface are important to the efficiency and acceptance of an intelligent system. An intelligent system based on Computational Intelligence uses linguistic rules with which to describe the knowledge about controlling the plant. Before eliciting the rules from the human operators, it is very important to stipulate the bounds of this knowledge, otherwise the system is likely to become unwieldy. It should be obvious, furthermore, that the intelligent system software can be written in any high-level language or be developed with an expert system shell that simplifies the design process significantly.
Chapter 4
Techniques of Intelligent Control
In order to use conventional design techniques, it is essential that the model of the plant be simplified yet remain sufficiently comprehensive to reproduce the essential dynamic features of the physical plant. Modern manufacturing plants have to meet increasing demands for more flexible production and improved quality while striving to meet stringent environmental constraints. Though there were high expectations that modern control theory would meet these demands, it has by and large failed to do so to any significant degree in industry and manufacturing, which thus far have had to be content with conventional industrial three-term controllers. The design of simple, practical and robust controllers for industry is usually based on low-order holistic models of the physical plant. These approximants form the basis for the design of industrial controllers that satisfy relaxed performance criteria. Three-term controllers are the backbone of industrial control, and these ubiquitous, simple and robust controllers have offered sterling service. However, these controllers can only perform at their best at the nominal operating point of the plant about which the approximant holds. When the operating point moves away from the nominal point, their performance is invariably degraded due to the inherent non-linearity of the physical plant. A number of techniques have been proposed to address this problem, the most common example of which is gain-scheduling, a variant of which is considered in a later chapter. The objective here is to extend the domain over which satisfactory controller performance is maintained. Adaptive controllers are another class of controllers whose parameters can be varied to track changes in the operating point. Here, periodic identification is required in order to follow the changes in the plant dynamics. The degree of autonomy of a controller is closely related to the range of operation of the controller and consequently to its robustness. The degree of autonomy of a gain-scheduled controller is higher than that of a fixed controller but lower than that of an adaptive controller, whose range of operation is correspondingly greater.
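A rough sketch of the gain-scheduling idea mentioned above, in which controller gains tuned at several operating points are switched in as the plant moves between them; the operating-point ranges and gain sets are hypothetical.

```python
# Hypothetical gain schedule: PID gains tuned at three operating regions (plant load in %)
GAIN_SCHEDULE = [
    (0.0,  40.0,  {"Kp": 1.2, "Ki": 0.10, "Kd": 0.05}),   # low load
    (40.0, 75.0,  {"Kp": 0.9, "Ki": 0.08, "Kd": 0.04}),   # nominal load
    (75.0, 101.0, {"Kp": 0.6, "Ki": 0.05, "Kd": 0.02}),   # high load
]

def scheduled_gains(operating_point):
    """Select the set of controller gains tuned nearest to the current operating point."""
    for low, high, gains in GAIN_SCHEDULE:
        if low <= operating_point < high:
            return gains
    raise ValueError("operating point outside the scheduled range")

print(scheduled_gains(82.0))   # returns the gains tuned for the high-load region
```

An adaptive controller, by contrast, would re-estimate the plant dynamics periodically and recompute the gains rather than look them up.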
control techniques. What sets these new unconventional techniques apart is their ability to arrive at control decisions and control strategies in ways that are radically different from those of conventional control. It is not unreasonable, therefore, that the new class of unconventional control techniques has aroused considerable interest in industrial and manufacturing circles and has led to innovative controllers which have been applied to many difficult problems in industry. In this new class of controllers, the primary objective is minimization of the uncertainty and vagueness with which industrial processes are shrouded, leading to controllers with high autonomy and robustness.
The reproduction of the cognitive and decision-making processes of a human operator of an industrial plant executing his control task has been the subject of intense research since the 1950s, reaching fruition in the 1970s with the implementation of the first experimental rule-based control system. The first practical unconventional industrial controllers were commissioned in the early 1980s in the cement industry, an industry with many difficult problems, particularly in the critical kilning process. The development of unconventional controllers since then has been very rapid and they are to be found today not only in most process industries but in all kinds of household appliances as well.
The new field of unconventional control, which is based on the knowledge and experience of human operators, is better known as Intelligent Control and is supported by fuzzy, neural, neuro-fuzzy and evolutionary control techniques. This book is devoted exclusively to these techniques and their application to practical industrial problems. In line with the use of the term Soft Computing in control, this field is sometimes known as Soft Control. Modern control theory, which is based on a microscopic description of the controlled plant using differential or difference equations, can, in contrast, be described as Hard Control since it uses inflexible algorithmic computing techniques.
There are fundamental differences between conventional and intelligent control techniques. These differences are highlighted in this chapter. It is important to note that Intelligent Control in no way supersedes conventional control, but rather augments it in an effort to resolve some of the difficult and unsolved control problems which industry and manufacturing face. A fundamental difference between conventional and Intelligent Control is the manner in which the plant and controller are viewed. This is seen schematically in Figure 4.1. In conventional control, the plant and controller are viewed as distinct entities. The plant is assumed invariant and is designed to perform a specific task without the controller in mind. The controller is subsequently designed independently, after the plant has been completed. In contrast, in intelligent control, the plant and the controller are viewed as a single entity to be designed and commissioned simultaneously. Thus plants that are unstable by nature can be rendered stable when such a unified approach is taken. A typical example is a helicopter, which is by nature an unstable plant but which is perfectly stable so long as its controller is operational.
[Figure 4.1: (a) in conventional control the plant and controller are viewed as distinct entities; (b) in intelligent control the plant and controller are viewed as a single entity]
The generalized closed control system is shown in Figure 4.2. Here block P depicts the controlled plant, block C depicts the controller and block S specifies the desired closed system performance specifications. In conventional control blocks P and C are assumed linear (or linearized) and the block S defines the cost function, or criterion of performance, e.g., stability margin, rise time, settling time, overshoot, steady state error, integral squared error, etc. following some exogenous disturbance. The following characteristics apply to industrial processes: the physical process P is so complex that it is either not known explicitly or is very difficult to describe in analytical terms, and the specifications S ideally demand a high degree of system autonomy so that the closed system can operate satisfactorily and without intervention despite faults in the system.
[Figure 4.2: the generalized closed control system, comprising the plant P, the controller C and the closed system performance specifications S]
The intelligent control problem can be stated in precisely the same way: given the plant P, find a controller C that will satisfy specifications S, which may be either qualitative or quantitative. The basic difference here is that in intelligent control it is not necessary to have an explicit description of the plant P. Furthermore, it may not always be possible to separate the plant and controller. Since intelligent control is often rule-based, the control rules are embedded in the system and form an integral element of it. This structure presents new possibilities for improved integrated manufacturing systems. As illustrated in Figure 4.3, Intelligent Control is the fusion of Systems Theory, Computer Science and Operations Research, with Computational Intelligence as the bond between them. These techniques are now being called upon to solve problems of control engineering which were hitherto unsolvable. Intelligent control has been claimed to be able to reproduce human-like properties such as adaptation and learning under unfavorable and uncertain conditions. There is considerable debate on this matter and many have refuted these claims. No one has refuted the fact, however, that intelligent control is capable of controlling, very successfully, large-scale industrial processes that have hitherto been controlled only manually. Industry has only to take advantage of this fact to benefit significantly.
[Figure 4.3: Intelligent Control as the fusion of Systems Theory, Computer Science and Operations Research, bonded by Computational Intelligence]
To operate under conditions of uncertainty and vagueness by mechanistic means, it is necessary to develop advanced inference and decision support techniques. Autonomy of operation is the objective and intelligent control is the means to this objective. The theory of intelligent systems, which was developed by Saridis, fuses the powerful decision support techniques of Soft Computing with the advanced techniques of analysis and synthesis of conventional Systems Theory. The fusion of Computational Intelligence, Operations Research, Computer Science and Systems Theory offers a unified approach to the design of intelligent control systems. The result of this fusion is Soft Control, which today is one of the most interesting areas of research and development in Control Engineering.
[Figure 4.4: the hierarchical layers of Organization, Coordination and Execution, with intelligence increasing and precision decreasing towards the top of the hierarchy]
Intelligence is distributed hierarchically and in accordance with Saridis' principle of increasing precision with decreasing intelligence, as depicted in Figure 4.4. Most practical hierarchical intelligent control systems have three layers:
- the Organization layer, in which high-level management decisions, e.g., production scheduling, are made,
- the Coordination layer, which coordinates the tasks that have been decided on at the Organization layer. As in the case of the uppermost layer of the hierarchy, this layer normally possesses intelligence, and
- the Execution layer, which has little or no intelligence and in which the commands of the higher layers are executed. This layer involves low-level controllers embedded in the plant Remote Terminal Units (RTUs). Recently, some vendors have added some degree of intelligence to this layer.
[Figure 4.5: distributed client/server implementation of the hierarchy, with the Organizer, the Coordinators and the Executors connected over a LAN]
The uppermost layer of the hierarchy is activated rarely and at random as need requires and uses qualitative reasoning to arrive at its decisions. The intermediate Organization level is activated by production management and has low repetition frequency, perhaps once or twice
daily. In this layer, management arrives at a long-term production policy that must be followed to achieve production goals. The production policy that has been decided in the highest layers is relayed to the Coordination layer, which is responsible for carrying out this policy. In this intermediate layer, decisions on changes in the policy can be made should, for instance, a serious malfunction or breakdown occur in a production line, if raw material shortages are ascertained, or if changes are made in customer priorities. The Coordination layer is also responsible for product quality control, maximization of productivity of each unit in the factory and for coordination between the various manufacturing units. Even though the structure of an intelligent system is normally represented vertically in Figure 4.4 to indicate its hierarchical architecture, in practice such systems use a distributed architecture based on a client/server structure, as shown in Figure 4.5. Here, the Organizer acts as the server while the Coordinators and Executors are clients of the system.
techniques that possess appropriate mechanisms to deal with uncertainty and vagueness.
to establish the knowledge base that is used in the system. Execution of the fuzzy algorithm is performed as a sequence of logical steps at the end of which a full explanation of the steps that were taken and the rules that were used in arriving at the conclusion is made available. In contrast, ANNs are ideal for the representation of an arbitrary nonlinear functional relationship with parallel processing methods, can be trained from training sets but do not offer any mechanism for giving explanations on the decisions at which they arrive. It is natural, therefore, to consider the advantages and benefits that a fusion of the two methods may present. This possibility is discussed at length in a later chapter and it is shown how fuzzy control and neural control can be combined with interesting results.
Chapter 5
industry, in traffic and train control systems and most notably in household appliances. The fundamental elements of Fuzzy Logic necessary to understand the techniques of Fuzzy Control are presented in this chapter. For further in-depth study of the theory of Fuzzy Sets, the reader is referred to the numerous books and papers on the subject given in the Bibliography in chapter 18.
The crisp or Boolean characteristic function f_A(x) in Figure 5.1 is expressed as the discontinuous function
f_A(x) = 1 if x ∈ A
       = 0 if x ∉ A
Vagueness can be introduced in the theory of sets if the characteristic function is generalized to permit an infinite number of values between 0 and 1, as shown in Figure 5.2.
If X is the universe of discourse with elements x (i.e., the region to which the physical variable is confined), then we may state X = {x}. A fuzzy set A on the universe of discourse X can be expressed symbolically as the set of ordered pairs
A = {μ_A(x)/x} = ∫_X μ_A(x)/x  or  A = Σ_X μ_A(x)/x  for x ∈ X
for the continuous and discrete cases respectively. Here μ_A(x) is termed the membership function of x on the set A and is a mapping of the universe of discourse on the closed interval [0,1]. The membership function is simply a measure of the degree to which x belongs to the set A, i.e.,
μ_A(x) : X → [0,1]
It is noted that the symbols ∫ and Σ here imply a fuzzy set and bear no relation to integration and summation.
The support set of a fuzzy set is a subset of the universe of discourse for which μ_A(x) > 0. Thus a fuzzy set is a mapping of the support set on the closed interval [0,1]. As an example, consider the temperature of water at some point in a plant and the fuzzy variable Low. This can be described in terms of a set of positive integers in the range [0,100] and defined as A = {Low}. This set expresses the degree to which the temperature is considered Low over the range of all possible temperatures. Here, the membership function μ_A(x) has discrete values specified in degrees Centigrade by the set:
μ_A(0) = μ_A(5) = μ_A(10) = μ_A(15) = μ_A(20) = 1.0, μ_A(25) = 0.9, μ_A(30) = 0.8, μ_A(35) = 0.6, μ_A(40) = 0.3, μ_A(45) = 0.1, μ_A(50) = μ_A(55) = ... = μ_A(100) = 0
More compactly, this set can be expressed as:
μ_A(x) = {1/0 + 1/5 + 1/10 + 1/15 + 1/20 + 0.9/25 + 0.8/30 + 0.6/35 + 0.3/40 + 0.1/45 + 0/50 + 0/55 + ... + 0/100}
The symbol + represents the union operator in set theory and must not be confused with arithmetic addition. A graphical representation of the corresponding fuzzy membership function μ_A(x) is shown in Figure 5.3.
[Figure 5.3: the discrete membership function μ_A(x) of the fuzzy set A = {Low}]
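By way of illustration (this sketch is not part of the original text), the discrete fuzzy set Low defined above can be represented and queried in a few lines of Python; the dictionary layout and the helper name are arbitrary choices:

# A discrete fuzzy set as a mapping from elements of the universe of
# discourse to membership grades in [0, 1].
low = {0: 1.0, 5: 1.0, 10: 1.0, 15: 1.0, 20: 1.0, 25: 0.9,
       30: 0.8, 35: 0.6, 40: 0.3, 45: 0.1}   # all other temperatures: 0.0

def membership(fuzzy_set, x):
    """Degree to which x belongs to the fuzzy set (0.0 outside the support set)."""
    return fuzzy_set.get(x, 0.0)

print(membership(low, 20))   # 1.0 -> 20 degrees C is fully 'Low'
print(membership(low, 40))   # 0.3 -> 40 degrees C is only marginally 'Low'
print(membership(low, 70))   # 0.0 -> 70 degrees C is not 'Low' at all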
A fuzzy variable is one whose values can be considered labels of fuzzy sets. Thus TEMPERATURE can be considered as a fuzzy variable which can take on linguistic values such as Low, Medium, Normal, High and Very_High. This is precisely the way that human operators refer to plant variables in relation to their nominal values. It is shown in the following that fuzzy variables can be readily described by fuzzy sets. In general, any fuzzy variable can be expressed in terms of phrases that combine labels of fuzzy sets such as High, the negation NOT, connectives such as AND, and hedges such as extremely, rather and quite. Thus the values of the fuzzy variable TEMPERATURE in the foregoing example can be described as High, NOT High, rather_High, NOT Very_High, extremely_High, quite_High, etc. Figure 5.4 shows the variable TEMPERATURE with a few of its values.
Figure 5.4 The linguistic variable TEMPERATURE and some of its values
The dependence of a linguistic variable on another can be described by means of a fuzzy conditional statement of the form:
R : IF S1 THEN S2
or symbolically as:
S1 → S2
where S1 and S2 are fuzzy conditional statements of the general form S : X is A, in which X is a linguistic variable and A is a fuzzy subset. A linguistic meaning can be given to the fuzzy subset A to specify the value of X, for example:
IF the LOAD is Small THEN TORQUE is Very_High
OR
IF the ERROR is Negative_Large THEN OUTPUT is Negative_Large.
Two or more fuzzy conditional statements can be combined (or included in one another) so as to form a composite conditional statement such as:
R : IF S1 THEN (IF S2 THEN S3).
It should be obvious that the composite statement can be decomposed into the two simpler conditional statements:
R1 : IF S1 THEN R2   AND   R2 : IF S2 THEN S3
The composite statement (or rule):
IF the ERROR is Negative_Large THEN (IF CHANGE_IN_ERROR is Positive_Large THEN OUTPUT is Positive_Large)
can be written more simply as a pair of rules:
R1 : IF ERROR is Negative_Large THEN R2
R2 : IF CHANGE_IN_ERROR is Positive_Large THEN OUTPUT is Positive_Large
This is the most useful rule structure in practice. Human operators are invariably taught to control plants with linguistic rules of this type, rather than composite rules that appear to be far too complicated. The number of rules that are required to control a plant varies enormously and depends on the complexity of the plant. Human plant operators rarely use more than approximately 30 rules for routine control tasks, since rules that are rarely used tend to be quickly forgotten. To control a complex plant like a rotary kiln as many as 60-80 rules may be required, but as few as 5 are necessary to control simple appliances like washing machines or cameras.
SPE = SPeed Error
CSPE = Change in SPeed Error
CFUEL = Change in FUEL Intake
It is noted that linguistic control rules are of the familiar if-then-else form, where else is replaced by the connective OR.
OR ≡ (max, ∨)    AND ≡ (min, ∧)
The operators min and max applied to two sets A and B result in the sets D and C as follows:
D = A ∪ B = {max(a,b)} for a ∈ A, b ∈ B
C = A ∩ B = {min(a,b)} for a ∈ A, b ∈ B
and are shown in Figure 5.5. The AND operator is therefore synonymous with the min operation and the OR operator with the max operation. It is worth remembering this in the following chapters. When the operators are applied to a single set they imply the minimum (inf or infimum) or maximum (sup or supremum) of all the elements of the set, thus:
∨a = sup(A) and ∧a = inf(A) for a ∈ A
When the elements of the set are functions of a variable, the operators are expressed as:
∨_x a(x) = sup_{x∈X} a(x) and ∧_x a(x) = inf_{x∈X} a(x)
It is noted, finally, that expressions that involve the min and max operators obey rules analogous to those of arithmetic multiplication and addition respectively.
The following properties apply only to fuzzy sets:
∅ ∩ A = ∅  or  0 ∧ μ_A = 0
∅ ∪ A = A  or  0 ∨ μ_A = μ_A
A ∩ E = A  or  1 ∧ μ_A = μ_A
A ∪ E = E  or  1 ∨ μ_A = 1
where E is the unit set specified by μ_E(x) = 1 for all x ∈ X and ∅ is the null set.
primary terms, which are labels of fuzzy sets such as High, Low, Small, Medium and Zero, the negation NOT and the connectives AND and OR, hedges such as very, nearly and almost, and markers such as parentheses ( ).
The primary terms may have either continuous or discrete membership functions. Continuous membership functions are normally defined by analytic functions. The Danish company F. L. Smidth in its fuzzy controllers designed for the cement industry uses Gaussian-like membership functions of the type shown in Figure 5.7 given by the expression:
μ_A(x) = e^(−(|x − γ|/β)^α)
Figure 5.7 Examples of membership functions used in the F. L. Smidth fuzzy controller
The triplet (α, β, γ) defining the shape of the F. L. Smidth fuzzy sets shown in Figure 5.7 is given in the following table:
Linguistic Variable    β
Positive_Large         0.25
Positive_Medium        0.25
Positive_Small         0.25
Positive_Zero          0.1
Zero                   0.25
Negative_Zero          0.1
Negative_Small         0.25
Negative_Medium        0.25
Negative_Large         0.25
Large                  0.5
Normal                 0.6
Low                    0.5
An alternative way of defining continuous membership functions is through the generic S and Π functions shown in Figures 5.8(a) and 5.8(b). The first is monotonic and is specified by:
S(x; α, β, γ) = 0                         for x ≤ α
              = 2[(x − α)/(γ − α)]²       for α < x ≤ β
              = 1 − 2[(x − γ)/(γ − α)]²   for β < x ≤ γ
              = 1                         for x > γ
where β = (α + γ)/2.
The second generic membership function is the Π function, which changes monotonicity at one point only. This function can be defined in terms of S functions. In this case the parameter β represents the width of the function between the median points where the membership function has a value of 0.5. The function is given by:
Π(x; β, γ) = S(x; γ − β, γ − β/2, γ)        for x ≤ γ
           = 1 − S(x; γ, γ + β/2, γ + β)    for x > γ
Continuous fuzzy sets can also be constructed from standardized trapezoidal or triangular functions. Three examples of trapezoidal functions that represent the primary sets (Small, Medium and Large) are shown in Figure 5.9. These fuzzy sets can be uniquely defined using four parameters: the inflexion points b and c and the left and right extinction points a and d that define the support set. Figure 5.10 shows examples of some of the membership functions available in the MATLAB Fuzzy Logic Toolbox.
Finally, discrete fuzzy sets are sets of singletons on a finite universe of discourse. For example, if the universe of discourse is given by the finite set:
X = {0 + 1 + 2 + 3 + 4 + 5 + 6}
then the fuzzy sets for the linguistic variables small, medium and large as shown in Figure 5.11 could be defined as the sets:
μ_small(x)  = {0.3 + 0.7 + 1 + 0.7 + 0.3 + 0 + 0}
μ_medium(x) = {0 + 0 + 0.3 + 0.7 + 1 + 0.7 + 0.3}
μ_large(x)  = {0 + 0 + 0 + 0 + 0.3 + 0.7 + 1}
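The standardized shapes described above are straightforward to evaluate in code. The short Python sketch below is illustrative only; it implements the four-parameter trapezoid and the S function as defined above:

def trapezoid(x, a, b, c, d):
    """Trapezoidal membership: support (a, d), plateau [b, c]."""
    if x <= a or x >= d:
        return 0.0
    if b <= x <= c:
        return 1.0
    if x < b:
        return (x - a) / (b - a)
    return (d - x) / (d - c)

def s_function(x, alpha, beta, gamma):
    """Zadeh S-function, monotonic from 0 (x <= alpha) to 1 (x >= gamma)."""
    if x <= alpha:
        return 0.0
    if x >= gamma:
        return 1.0
    if x <= beta:
        return 2.0 * ((x - alpha) / (gamma - alpha)) ** 2
    return 1.0 - 2.0 * ((x - gamma) / (gamma - alpha)) ** 2

# Example: a 'Medium' trapezoid on a 0-100% universe of discourse
print(trapezoid(35.0, 20.0, 40.0, 60.0, 80.0))   # 0.75
print(s_function(0.5, 0.0, 0.5, 1.0))            # 0.5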
5.7 Connectives
Negation (NOT) and the connectives AND and OR can be defined in terms of the complement, intersection and union operations respectively. Usually the connective AND is used for fuzzy variables which have different universes of discourse. If
A = {μ_A(x)/x} for x ∈ X and B = {μ_B(y)/y} for y ∈ Y
it follows that
A AND B = {μ_A(x) ∧ μ_B(y)/(x,y)} for x ∈ X, y ∈ Y
[Figure 5.11: the discrete fuzzy sets small, medium and large]
Thus if A = small and B = large, then
C = NOT_Small AND NOT_Large
whose membership function is
μ_C(x) = (1 − μ_small(x)) ∧ (1 − μ_large(x)) = {0.7 + 0.3 + 0 + 0.3 + 0.7 + 0.3 + 0}
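As a minimal illustration, not taken from the book, these connectives reduce to element-wise operations on the membership grades. A possible Python sketch over the seven-point universe of discourse used above:

# Discrete fuzzy sets on X = {0, 1, ..., 6}, as lists of membership grades
small = [0.3, 0.7, 1.0, 0.7, 0.3, 0.0, 0.0]
large = [0.0, 0.0, 0.0, 0.0, 0.3, 0.7, 1.0]

def f_not(a):
    """Complement: 1 - membership."""
    return [1.0 - m for m in a]

def f_and(a, b):
    """Intersection (AND): element-wise min."""
    return [min(m, n) for m, n in zip(a, b)]

def f_or(a, b):
    """Union (OR): element-wise max."""
    return [max(m, n) for m, n in zip(a, b)]

c = f_and(f_not(small), f_not(large))
print(c)   # [0.7, 0.3, 0.0, 0.3, 0.7, 0.3, 0.0] -> NOT small AND NOT large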
Chapter 6
Fuzzy Reasoning
At the core of every fuzzy controller is the inference engine, the computational mechanism with which decisions can be inferred even though the knowledge may be incomplete. It is this very mechanism that gives linguistic controllers the power to reason, by being able to extrapolate knowledge and search for rules which only partially fit a given situation for which no exact rule exists. Unlike expert systems, which depend on a variety of techniques to search decision trees, fuzzy inference engines perform an exhaustive search of the rules in the knowledge base to determine the degree of fit of each rule for a given set of causes. The contribution to the final decision of rules that exhibit a small degree of fit is clearly small and may even be ignored, while rules with a high degree of fit are dominant. It is clear that a number of rules may contribute to the final result to varying degrees. A degree of fit of unity means that only one rule has fired and only one unique rule contributes to the final decision, while a degree of fit of zero implies that the rule does not contribute to the final decision. Inference engines can take different forms depending on the manner in which inference is defined. It is therefore prudent to review the fundamentals of fuzzy logic that will allow us to understand how inference works and how it may be implemented. A fuzzy propositional implication defines the relationship between the linguistic variables of a fuzzy controller. Given two fuzzy sets A and B that belong to the universes of discourse X and Y respectively, we define the fuzzy propositional implication as:
R : IF A THEN B = A × B
where A × B is the Cartesian product of the two fuzzy sets A and B. The Cartesian product is an essential operation of all fuzzy inference engines. Using the conjunctive operator (min), the Cartesian product is defined as:
A × B = {μ_A(x) ∧ μ_B(y)/(x,y)}
while for the case of an algebraic product the Cartesian product is:
A × B = {μ_A(x) · μ_B(y)/(x,y)}
Thus, for example, given the discrete fuzzy sets:
A = {1 + 2 + 3} and B = {1 + 2 + 3 + 4}
whose corresponding discrete membership functions (sometimes termed the grades of membership) are:
{μ_A(x)/x} = (1/1 + 0.7/2 + 0.2/3) and {μ_B(y)/y} = (0.8/1 + 0.6/2 + 0.4/3 + 0.2/4),
the Cartesian product using the conjunctive (min) operator is:
R = A × B = {min(1, 0.8)/(1,1), min(1, 0.6)/(1,2), min(1, 0.4)/(1,3), min(1, 0.2)/(1,4),
             min(0.7, 0.8)/(2,1), min(0.7, 0.6)/(2,2), min(0.7, 0.4)/(2,3), min(0.7, 0.2)/(2,4), ...}
  = {0.8/(1,1) + 0.6/(1,2) + 0.4/(1,3) + 0.2/(1,4) +
     0.7/(2,1) + 0.6/(2,2) + 0.4/(2,3) + 0.2/(2,4) +
     0.2/(3,1) + 0.2/(3,2) + 0.2/(3,3) + 0.2/(3,4)}
This Cartesian product can be conveniently represented by means of the relational matrix:
x\y    1     2     3     4
1      0.8   0.6   0.4   0.2
2      0.7   0.6   0.4   0.2
3      0.2   0.2   0.2   0.2
Likewise, the Cartesian algebraic product is computed as follows:
R* = A × B = {0.80/(1,1) + 0.60/(1,2) + 0.40/(1,3) + 0.20/(1,4) +
              0.56/(2,1) + 0.42/(2,2) + 0.28/(2,3) + 0.14/(2,4) +
              0.16/(3,1) + 0.12/(3,2) + 0.08/(3,3) + 0.04/(3,4)}
whose relational matrix is:
x\y    1      2      3      4
1      0.80   0.60   0.40   0.20
2      0.56   0.42   0.28   0.14
3      0.16   0.12   0.08   0.04
which is shown graphically in Figure 6.2. It is observed that the graphical representations of the relational matrices of the two Cartesian products have similarities. The Cartesian product based on the conjunctive operator min is much simpler and more efficient to implement computationally and is therefore generally preferred in fuzzy controller inference engines. Most commercially available fuzzy controllers in fact use this method.
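The two relational matrices above are easy to verify programmatically. The following Python fragment is an illustrative sketch, not part of the original text; it reproduces both matrices:

mu_a = {1: 1.0, 2: 0.7, 3: 0.2}
mu_b = {1: 0.8, 2: 0.6, 3: 0.4, 4: 0.2}

# Cartesian product with the conjunctive (min) operator
R_min = {(x, y): min(ma, mb) for x, ma in mu_a.items() for y, mb in mu_b.items()}

# Cartesian product with the algebraic product
R_prod = {(x, y): ma * mb for x, ma in mu_a.items() for y, mb in mu_b.items()}

for x in mu_a:
    print([round(R_min[(x, y)], 2) for y in mu_b])    # rows of the min relational matrix
for x in mu_a:
    print([round(R_prod[(x, y)], 2) for y in mu_b])   # rows of the product relational matrix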
More generally, the rule R : IF A THEN B defines the relation A → B = {(μ_A(x) → μ_B(y))/(x,y)}, where → is some implication operator and R = {μ_R(x,y)/(x,y)}. In general, if A1, A2, ..., AN are fuzzy sub-sets of X and B1, B2, ..., BN are fuzzy sub-sets of Y (corresponding to the antecedents or causes and the consequents or effects respectively), then the fuzzy algorithm is defined as the set of rules:
R :    IF A1 THEN B1
  OR   IF A2 THEN B2
  OR   ...............................
  OR   IF AN THEN BN
This form is typically used in fuzzy control and is identical to the manner and terms in which human operators think. The connective OR, abbreviated as ∨, depends on the fuzzy implication operator →. Thus the membership function for N rules in a fuzzy algorithm is given by:
μ_R(x,y) = ∨(μ_R1(x,y), μ_R2(x,y), ..., μ_RN(x,y)) = ∨((μ_A1(x) → μ_B1(y)), (μ_A2(x) → μ_B2(y)), ...)
The foregoing relations apply to simple variables A and B. In general, the antecedent part of a conditional statement of the form IF A THEN B ELSE C involves more than one variable and can be expressed as a series of nested statements of the form:
IF A1 THEN (IF A2 THEN .... (IF AN THEN B))
or as a statement where the antecedents are related through the connective AND, i.e.,
IF (A1 AND A2 AND .... AN) THEN B
whereupon:
μ_R(x1,x2,...,xN,y) = μ_A1(x1) → (μ_A2(x2) → (... (μ_AN(xN) → μ_B(y))))
for x1, x2, ..., xN ∈ X1, X2, ..., XN and y ∈ Y, or
μ_R(x1,x2,...,xN,y) = μ_A1(x1) ∧ μ_A2(x2) ∧ ... ∧ (μ_AN(xN) → μ_B(y))
For the case of N rules, use is made of the connective AND, in which case:
μ_RN(x,y) = ∧_{k=1..N} μ_Rk(x,y) = ∧_{k=1..N} [(μ_Ak(x) ∧ μ_Bk(y)) ∨ (1 − μ_Ak(x))]
Zadeh's fuzzy implication rule is difficult to apply in practice and it took several years before Mamdani proposed a simplification that made it especially useful for control applications:
μ_RN(x,y) = ∨_{k=1..N} (μ_Ak(x) ∧ μ_Bk(y))
Larsen's implication rule uses the algebraic product in place of the min operator:
μ_RN(x,y) = ∨_{k=1..N} (μ_Ak(x) · μ_Bk(y))
Both Mamdani's and Larsen's implications have found extensive application in practical control engineering due to their computational simplicity. Nearly all industrial fuzzy controllers use one or the other of these two fuzzy implications in their inference engines. Mamdani's implication, being computationally faster, is found most often.
μ_R(x,y) = 1 if μ_A(x) ≤ μ_B(y)
         = 0 if μ_A(x) > μ_B(y)
Alternatively, using the algebraic product:
μ_R(x,y) = 1 if μ_A(x) ≤ μ_B(y)
         = μ_B(y)/μ_A(x) if μ_A(x) > μ_B(y)
Two fuzzy relations R1 and R2 may be combined by composition, written R1 ∘ R2, where ∘ implies rule composition. In terms of the max-min operators used by Mamdani, the membership function of the resultant compositional rule of inference is:
μ_{R1∘R2}(x,z) = ∨_y (μ_R1(x,y) ∧ μ_R2(y,z)) = max_y (min (μ_R1(x,y), μ_R2(y,z)))
and in the case of the Larsen implication rule, using the max-product operators:
μ_{R1∘R2}(x,z) = ∨_y (μ_R1(x,y) · μ_R2(y,z)) = max_y (μ_R1(x,y) · μ_R2(y,z))
When discrete membership functions are used, the compositional rule of inference in the case of the Mamdani implication is analogous to the inner product of two matrices in which multiplication and addition are replaced by the min and max operators respectively. For the Larsen implication, addition is replaced by the max operator while multiplication remains arithmetic.
In the following, we discuss the procedure for determining the consequent (or effect), given the antecedent (or cause). Given
A = {μ_A(x)/x} for x ∈ X and B = {μ_B(y)/y} for y ∈ Y
and the compositional rule of inference:
R = {μ_R(x,y)/(x,y)} for x ∈ X and y ∈ Y
we wish to infer the consequent B′ if the antecedent is modified slightly to A′, i.e.,
A′ = {μ_A′(x)/x} for x ∈ X
Making use of the fuzzy compositional rule of inference with the max-min operators, for instance:
B′ = A′ ∘ R
whose membership function is
μ_B′(y) = ∨_x (μ_A′(x) ∧ μ_R(x,y))
By way of example consider the rule:
IF x is Slow THEN y is Fast
where the fuzzy sets Slow and Fast are given by the discrete membership functions:
μ_A(x) = {1 + 0.7 + 0.3 + 0 + 0 + 0}
μ_B(y) = {0 + 0 + 0.3 + 0.7 + 1 + 1}
Figure 6.3 The fuzzy sets Slow and Fast in the example
on the universes of discourse X = Y = {0, 1, 2, 3, 4, 5}. The discrete membership functions are shown in Figure 6.3. We wish to determine the outcome if A′ = slightly Slow, for which no rule exists.
The procedure is straightforward, though tedious. The first step is to compute the Cartesian product R = A × B and, using the min operator, this is simply:
R = A × B = {min[μ_A(x_i), μ_B(y_j)]} =
0     0     0.3   0.7   1     1
0     0     0.3   0.7   0.7   0.7
0     0     0.3   0.3   0.3   0.3
0     0     0     0     0     0
0     0     0     0     0     0
0     0     0     0     0     0
where the rows correspond to x = 0, 1, ..., 5 and the columns to y = 0, 1, ..., 5.
Thus if the antecedent A is modified somewhat, by displacing the elements of the discrete fuzzy set to represent the fuzzy set A′ = slightly Slow, the original discrete membership function
μ_A(x) = {1 + 0.7 + 0.3 + 0 + 0 + 0}
becomes
μ_A′(x) = {0.3 + 0.7 + 1 + 0.7 + 0.3 + 0}
which is shown in Figure 6.4. Using the fuzzy compositional inference rule
B′ = A′ ∘ R
and the max-min operators (i.e., the Mamdani compositional rule):
μ_B′(y) = max_x (min (μ_A′(x), μ_R(x,y)))
Figure 6.4 The compositional inference rule using the max-min operators
then the discrete membership function of the new consequent can be readily computed. The relational matrix {min(μ_A′(x), μ_R(x,y))} combines the elements of the matrix R and the discrete membership function μ_A′(x) and is given by:
0     0     0.3   0.3   0.3   0.3
0     0     0.3   0.7   0.7   0.7
0     0     0.3   0.3   0.3   0.3
0     0     0     0     0     0
0     0     0     0     0     0
0     0     0     0     0     0
The final operation in determining the membership function μ_B′(y) of the new consequent is selection of the largest element in each column, which is equivalent to applying the max operator to each column. The result of the procedure is shown in Figure 6.4 and is:
μ_B′(y) = {0 + 0 + 0.3 + 0.7 + 0.7 + 0.7}
It is noted that the displacement of the membership function for the new condition must be small, otherwise all the elements of the relational matrix will be zero and no conclusion can be drawn. By way of comparison, the corresponding result using the max-product rule of compositional inference
μ_B′(y) = max_x (μ_A′(x) · μ_R*(x,y))
has the relational matrix:
R* = A × B = {μ_A(x_i) · μ_B(y_j)} =
0     0     0.3    0.7    1      1
0     0     0.21   0.49   0.7    0.7
0     0     0.09   0.21   0.3    0.3
0     0     0      0      0      0
0     0     0      0      0      0
0     0     0      0      0      0
The matrix {μ_A′(x_i) · μ_R*(x_i, y_j)} is then:
0     0     0.09    0.21    0.3     0.3
0     0     0.147   0.343   0.49    0.49
0     0     0.09    0.21    0.3     0.3
0     0     0       0       0       0
0     0     0       0       0       0
0     0     0       0       0       0
The maximum elements of each column are therefore:
μ_B′(y) = {0 + 0 + 0.15 + 0.35 + 0.49 + 0.49}
which are shown graphically in Figure 6.5.
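The worked example is compact enough to check in a few lines of Python. The sketch below, provided only for illustration, computes B′ = A′ ∘ R with both the max-min and the max-product compositions:

slow = [1.0, 0.7, 0.3, 0.0, 0.0, 0.0]           # mu_A(x),  x = 0..5
fast = [0.0, 0.0, 0.3, 0.7, 1.0, 1.0]           # mu_B(y),  y = 0..5
slightly_slow = [0.3, 0.7, 1.0, 0.7, 0.3, 0.0]  # mu_A'(x), the modified antecedent

# Relational matrices R(x,y) for the rule IF Slow THEN Fast
R_minimum = [[min(a, b) for b in fast] for a in slow]
R_product = [[a * b for b in fast] for a in slow]

def compose(a_prime, R, combine):
    """B'(y) = max over x of combine(A'(x), R(x,y))."""
    return [round(max(combine(a_prime[x], R[x][y]) for x in range(len(a_prime))), 3)
            for y in range(len(R[0]))]

print(compose(slightly_slow, R_minimum, min))                 # B' = {0, 0, 0.3, 0.7, 0.7, 0.7}
print(compose(slightly_slow, R_product, lambda a, r: a * r))  # {0, 0, 0.147, 0.343, 0.49, 0.49}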
In general, industrial fuzzy controllers are multivariable, involving m inputs and p outputs. It is, however, simpler to think in terms of p parallel fuzzy controllers, each with one output only. In practice this is achieved by multiplexing a multi-input single-output controller, since the antecedents are the same and only the consequents change in each case. For the case of a fuzzy controller with m inputs and one output only:
RN = {μ_RN(x1,x2,...,xm,y)/(x1,x2,...,xm,y)} for xk ∈ Xk and y ∈ Y.
Now, given
A′k = {μ_A′k(xk)/xk} for xk ∈ Xk, k = 1, 2, ..., m
the consequent is given by the relationship:
B′ = (A′1 × A′2 × ... × A′m) ∘ RN = {μ_B′(y)/y} for y ∈ Y
where
μ_B′(y) = ∨_{x1} ∨_{x2} ... ∨_{xm} [μ_A′1(x1) ∧ μ_A′2(x2) ∧ ... ∧ μ_A′m(xm) ∧ μ_RN(x1,x2,...,xm,y)]
Finally, for completeness, if the max-product is used in the compositional inference rule, the corresponding expression for the resultant consequent is:
μ_B′(y) = ∨_{x1} ∨_{x2} ... ∨_{xm} [μ_A′1(x1) · μ_A′2(x2) · ... · μ_A′m(xm) · μ_RN(x1,x2,...,xm,y)]
Chapter 7
the measured plant variables and following completion of the fuzzy algorithm to de-fuzzify the result and thereby return to the engineering world. This chapter discusses the procedure of fuzzification and de-fuzzification as they apply to practical control.
Figure 7.1 Decomposition of a multi-input multi-output incremental fuzzy controller into a set of multi-input single-output incremental controllers
7.2 Fuzzification
The algorithm for computing the crisp output of a fuzzy controller involves the following three steps:
(1) fuzzification
(2) inference
(3) de-fuzzification
To make these steps easier to understand, consider a fuzzy controller with three inputs and a single output. It should be obvious that the procedure that follows can be generalized for any number of inputs. Given a MISO controller with inputs x1, x2, x3 and output y, and assuming that the linguistic control rules are of the form:
IF (A1 AND A2 AND A3) THEN B
then the membership function of the output of the controller is given by:
μ_R(x1,x2,x3,y) = μ_A1(x1) ⊗ (μ_A2(x2) ⊗ (μ_A3(x3) ⊗ μ_B(y)))
where the operator ⊗ implies max-min or max-product. Using the intersection operator, the degree of fulfillment of the j-th rule, α_j(k) ∈ [0,1], is defined by:
α_j(k) = μ_A1(x1) ∧ μ_A2(x2) ∧ ... ∧ μ_AN(xN)
and is computed every time the algorithm is executed. The degrees of fulfillment are thus a measure of how closely the inputs to the controller match the control rules. They can be viewed conveniently as weights that are assigned to every rule. In practice only a small number of the rules in the rule-base (typically less than 6) will exhibit non-zero degrees of fulfillment at any instant. The expression for the degree of fulfillment given above applies when the Mamdani max-min implication rule is used. For the case of
max-product implication, the corresponding expression for the degree of fulfillment is:
α_j(k) = μ_A1(x1) · μ_A2(x2) · ... · μ_AN(xN)
The fuzzy implication rules yield the membership function of the output of the controller from knowledge of the current instantaneous measurements of the inputs to the controller x1(k), x2(k), x3(k). Thus at any instant k the membership function of the output of the controller is:
Y(k) = (X1(k) × X2(k) × X3(k)) ∘ RN
where the composition is carried out with the max-min or the max-product operators and X1, X2 and X3 are the corresponding fuzzy sets of the controller inputs. These computations are simplified significantly if the fuzzy sets of the inputs to the controller are taken as singletons defined as:
μ_Xi(xi) = 1 if xi = xi(k)
         = 0 otherwise
whereupon:
μ_Y(y) = ∨_{x1} ∨_{x2} ∨_{x3} [μ_X1(x1) ∧ μ_X2(x2) ∧ μ_X3(x3) ∧ μ_RN(x1,x2,x3,y)] = μ_RN(x1(k), x2(k), x3(k), y)
For simplicity, assume, furthermore, that the fuzzy sets of the inputs and outputs are triangular and are as shown in Figure 7.2.
The universes of discourse of Input_1 and Input_2 are assumed symmetric and are expressed as percentages of their maximum permissible values. Output y involves 5 fuzzy sets and is assumed asymmetric. Thus the inputs to the controller can take any value between -100% and +100% of their maximum permissible values while the output can take any value between 0 and 100% of its maximum permissible value. For example, the Output y could represent the opening of a servo-valve, while Input_1 could be a pressure deviation and Input_2 the temperature deviation about their nominal values.
[Figure 7.2: the fuzzy sets of the controller inputs and output. Input_1 has the sets LO, ZO, LH, MH and VH on -100% to +100%, Input_2 has the sets VL, ZO and VH on -100% to +100%, and the Output has the sets LO, ZO, LH, MH and VH on 0 to 100%]
The first five rules in the rule base are depicted graphically in Figure 7.3. Assume, furthermore, that at the instant of execution of the algorithm, the instantaneous inputs to the controller are -20% and -50% respectively.
Every rule is now examined with a view to determining the degree to which it contributes to the final decision. This measure is termed the degree of fulfillment α_j. The computational time required for this determination is clearly dependent on the number of rules in the rule base. Fortunately, rarely are more than 20 to 50 rules required in practice and consequently the computational time is minimal.
Figure 7.3 Graphical representation of the first five rules in the rule base
For the given values of the inputs, it is clear that rules R1, R4 and R5 have no part in the final decision (and consequently the output) since these rules have not fired. The degrees of fulfillment of the non-fired rules are consequently zero. The intercepts of the vertical lines corresponding to the instantaneous values of Input_1 and Input_2 with the corresponding fuzzy sets give the membership values:
μ_11 = 0, μ_21 = 0.66, μ_31 = 0.33, μ_41 = 0, μ_51 = 0
and
μ_12 = 0.5, μ_22 = 0.5, μ_32 = 0.5, μ_42 = 0.5, μ_52 = 0.5
respectively. The degree of fulfillment α_j for every rule is computed from these membership values through the operation min(μ_j1, μ_j2). Here:
R1 : α_1 = min(μ_11, μ_12) = min(0, 0.5) = 0
R2 : α_2 = min(μ_21, μ_22) = min(0.66, 0.5) = 0.5
R3 : α_3 = min(μ_31, μ_32) = min(0.33, 0.5) = 0.33
R4 : α_4 = min(μ_41, μ_42) = min(0, 0.5) = 0
R5 : α_5 = min(μ_51, μ_52) = min(0, 0.5) = 0
etc.
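The fuzzification step can also be expressed compactly in code. The Python sketch below is purely illustrative: the triangular set parameters and the rule list are invented for the purpose of the example (the actual sets are defined graphically in Figure 7.2), but the mechanics of computing membership values and degrees of fulfillment follow the procedure described above:

def triangle(x, a, b, c):
    """Triangular membership function with peak at b and support (a, c)."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

# Hypothetical triangular sets for Input_1 and Input_2 (percent of full scale)
input1_sets = {"LO": (-150, -100, -50), "ZO": (-100, -50, 0),
               "LH": (-50, 0, 50), "MH": (0, 50, 100), "VH": (50, 100, 150)}
input2_sets = {"VL": (-200, -100, 0), "ZO": (-100, 0, 100), "VH": (0, 100, 200)}

# A rule is (Input_1 set, Input_2 set, Output set); these rules are invented
rules = [("LO", "VL", "LO"), ("ZO", "ZO", "LH"), ("LH", "ZO", "MH"),
         ("MH", "VH", "VH"), ("VH", "VH", "VH")]

x1, x2 = -20.0, -50.0   # instantaneous controller inputs
for name1, name2, out in rules:
    dof = min(triangle(x1, *input1_sets[name1]), triangle(x2, *input2_sets[name2]))
    print(name1, name2, "->", out, "degree of fulfillment =", round(dof, 2))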
This is shown in graphical form in Figure 7.4. The resultant composite membership function of the output of the controller is shown in the lower right-hand diagram of Figure 7.5. Using the Mamdani implication, the composite membership function of the controller output is the union of the membership functions of the rules that have fired, each weighted (clipped) by its degree of fulfillment, i.e.,
μ_Y(y) = max [min(α_1, μ_1(y)), min(α_2, μ_2(y)), min(α_3, μ_3(y))]
Figure 7.4 Determination of the composite membership function of the controller output using the Larsen implication
Figure 7.5 Determination of the fuzzy set of the controller output using the Mamdani implication
y_COA = ∫_S y μ_Y(y) dy / ∫_S μ_Y(y) dy
where S is the support set of μ_Y(y). In the case where the composite membership function is discrete with I elements, this becomes:
y_COA = Σ_{i=1..I} y_i μ_Y(y_i) / Σ_{i=1..I} μ_Y(y_i)
This method is sensitive to changes in the shape of the membership functions of the output. Because it yields intuitive results, this method has found extensive use in practical fuzzy control.
y_COG = Σ_{i=1..I} y_i μ(y_i) / Σ_{i=1..I} μ(y_i)
On concluding de-fuzzification, the crisp output of the controller is applied to the plant actuators.
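For the discrete case, center-of-area de-fuzzification is a one-line weighted average. A minimal Python sketch, with made-up numbers purely for illustration, follows:

def center_of_area(y_values, memberships):
    """Discrete center-of-area de-fuzzification: weighted average over the output universe."""
    total = sum(memberships)
    if total == 0.0:
        raise ValueError("no rule has fired; the composite membership function is empty")
    return sum(y * m for y, m in zip(y_values, memberships)) / total

# Composite output membership function sampled over 0-100% (hypothetical values)
y_universe = [0, 20, 40, 60, 80, 100]
mu_y       = [0.0, 0.2, 0.5, 0.5, 0.33, 0.0]
print(round(center_of_area(y_universe, mu_y), 1))   # crisp controller output, about 52.5%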
POsitive) are used for coarse control and five for fine control. This technique has been applied with success in a number of processes requiring high terminal accuracy.
[Figure: (a) the three coarse fuzzy sets NE, ZE and PO; (b) the seven fine fuzzy sets NB, NM, NS, ZE, PS, PM and PB]
There are regions on the universe of discourse where membership is zero, whereupon, if the instantaneous value of the input falls in these regions, no rule can be fired, with the result that the controller is unable to infer any control action. This is clearly undesirable and indicates that fuzzy sets must overlap in order to obtain a continuous output.
whereupon rules can be represented in the form of tiles whose colors specify the control action required. This is none other than the Fuzzy Associative Memory or FAM and Figure 7.8 shows an example of this simple technique that has proved very useful in practice. The human eye is an excellent detector of abnormal color changes and thus by simply looking at the manner in which the colors of the tiles vary in control space it is possible to identify possible conflict.
Figure 7.8 Representation of the knowledge base in tile form or Fuzzy Associative Memory (FAM)
Chapter 8
quantities must be maintained as percentages, typically found in materials blending and mixing processes.
Productivity is closely related to the quality of control. Low quality of control implies poor product quality with products that cannot meet standards, reduced productivity, loss of competitiveness and ultimately the collapse of the manufacturer. Effective control is thus of vital importance where high product quality and productivity are essential, high standards are to be maintained and market share assured. Effective operation of a plant implies correct tuning of the controllers to meet the product specifications, while the efficiency of a plant is critically dependent on specifying the correct parameters of these controllers.
Traditionally, three-term controllers are tuned on-line by human experts who excite the plant by injecting appropriate disturbances into the set-points and then systematically adjust the parameters (i.e., gain constants) of the controller until the plant meets its design specifications. Re-tuning is normally necessary whenever the operating conditions of the plant are changed. Controller tuning requires considerable expertise and patience and takes time to master. The way in which a human tunes a control loop or plant is based on heuristic rules which involve such factors as the rise time, settling time and steady state error of the closed system. Indeed, as will be seen later in this chapter, these rules are put to good use in the design of expert controller tuners, which a number of vendors offer today.
require more computational effort but yield vastly improved plant responses. All these techniques assume that the controlled plant is scalar, i.e., has a single input and a single output, and are not applicable to multivariable plants. Multivariable plants, for which three-term controllers do not find ready application, require an entirely different approach to controller design.
For best performance three-term controllers must be tuned for all operating conditions. Unfortunately, the dynamic characteristics of most industrial plants depend on their operating state and production rates and these are far from linear or stationary. A three-term controller is normally tuned for best (note that use of the word optimum is tactfully avoided) performance at a specific operating state. When the operating conditions of the plant change, so does the operating state, whereupon the parameters of the controller may no longer be the best and as a consequence performance is degraded. The degree to which such degradation is acceptable clearly depends on the nature of the controlled plant, plants that are highly nonlinear being the most difficult to control effectively. The robustness of a controller is a measure of its ability to operate acceptably despite changes in the operating state of the plant. Where the variations in the plant are severe, a three-term controller with fixed parameters is no longer effective and alternative techniques, which are capable of tracking the changes in the plant, must be employed. Such techniques as gain-scheduling, auto-tuning and adaptive control are commonly used to extend the domain of effectiveness and thereby the robustness of the controller. Fuzzy logic can likewise be used to extend the domain of effectiveness of a three-term controller.
controller by establishing rules whereby these gains are varied in accordance with the operating state of the closed system. In this case the controller output has the generalized form
u = f_P(e, ∫e dt, Δe) + f_I(e, ∫e dt, Δe) + f_D(e, ∫e dt, Δe)
which, when fuzzified, can be expressed as the weighted sum
u = fuzzy(k_P) e + fuzzy(k_I) ∫e dt + fuzzy(k_D) Δe
Hybrid fuzzy three-term controllers, in which only the proportional and derivative terms are fuzzified while the integral term remains conventional, have also been used. In this case the controller output is
u = u_PD + k_I ∫e dt
An alternative class of fuzzy controllers, which possesses the characteristics of a two-term PI controller, is the generic fuzzy controller, which has been used extensively in practice. Generic fuzzy controllers are very simple and require few rules to operate effectively. Using the closed system error and its derivative only, this class of fuzzy controllers is normally incremental with output
Δu = f(e, Δe)
which must subsequently be integrated (or accumulated) to generate the final controller output.
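As a rough sketch of the incremental generic controller idea, and not a reproduction of any specific controller discussed in this book, the Python skeleton below accumulates the inferred increment Δu into the controller output; the gains and the placeholder rule surface are invented purely for illustration:

class IncrementalFuzzyPI:
    """Generic incremental fuzzy controller: du = f(e, de), accumulated into u."""
    def __init__(self, rule_surface, u_min=0.0, u_max=100.0, u0=50.0):
        self.f = rule_surface          # callable mapping (e, de) -> du
        self.u = u0                    # nominal (initial) controller output
        self.u_min, self.u_max = u_min, u_max
        self.prev_e = 0.0

    def update(self, e):
        de = e - self.prev_e           # difference of the error (discrete derivative)
        self.prev_e = e
        du = self.f(e, de)             # inferred output increment
        self.u = min(self.u_max, max(self.u_min, self.u + du))
        return self.u

# Placeholder rule surface: in a real controller this would be the de-fuzzified
# output of the fuzzy rule base, not a simple linear expression.
controller = IncrementalFuzzyPI(lambda e, de: 0.5 * e + 0.2 * de)
for error in [10.0, 8.0, 5.0, 2.0, 0.5]:
    print(round(controller.update(error), 2))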
ΔE and IE respectively, then the control rules of a generalized fuzzy three-term controller can be expressed as:
R_r : IF e is E_r AND Δe is ΔE_r AND ∫e dt is IE_r THEN u is U_r
Further, if the union operator relates the control rules, then the fuzzy algorithm reduces to the fuzzy implication rule
R = R1 ∪ R2 ∪ ... ∪ Rn = ∪_r (E_r × ΔE_r × IE_r × U_r)
The fuzzy set of the output of the generalized fuzzy three-term controller is thus given by
U = (E × ΔE × IE) ∘ R
whose membership function is consequently
μ_U(u) = ∨ [μ_E(e) ∧ μ_ΔE(Δe) ∧ μ_IE(∫e dt) ∧ μ_R(e, Δe, ∫e dt, u)]
for e ∈ E, Δe ∈ ΔE and ∫e dt ∈ IE. A graphical display of the parameter surface of such a three-term controller would have to be three-dimensional and it would be difficult to comprehend the effect of each parameter on the controller output.
contains high frequency extraneous noise, there may be certain misgivings in generating the derivative or difference term from the error since noise aggravates the situation. In this case it is clear that some form of low pass filtering or signal processing is necessary to reduce the effect of high frequency noise.
[Figure: structure of the hybrid fuzzy controller, comprising an FPD sub-controller with inputs e and Δe and output uPD, and an FPI sub-controller followed by an integrator with output uI]
The rule matrix or Fuzzy Associative Matrix for the FPD fuzzy subcontrollers is:
e\Δe   NB   NM   NS   ZO   PS   PM   PB
NB     NB   NB   NB   NB   NM   NS   ZO
NM     NB   NB   NM   NS   NS   ZO   PS
NS     NB   NB   NM   NS   ZO   PS   PM
ZO     NM   NM   NS   ZO   PS   PM   PM
PS     NM   NS   ZO   PS   PM   PB   PB
PM     NS   ZO   PS   PM   PB   PB   PB
PB     ZO   PS   PM   PB   PB   PB   PB
This FAM contains 7 × 7 = 49 rules, a number that may be compared with the case of a fuzzy three-term controller, which would require 7 × 7 × 7 = 343 rules and corresponding memory storage locations in its knowledge base. It is noted that the FAM shown above is symmetric about the principal diagonal. It is possible to take advantage of this fact and store only half the rules if memory must be conserved. Rule pruning, in which adjacent rules are systematically eliminated, can further reduce the number of rules that have to be stored to about a quarter of the original number, or about a dozen.
Applying the Mamdani compositional rule and confining the controller inputs e and Δe to the closed interval [-3,3] and quantizing the two inputs into 7 levels results in the relational matrix shown below. The entries in the matrix are the numerical values of uPD that are stored in the controller memory: this form of look-up table control is very simple to implement and has been applied extensively in practice. By altering some of the entries in the matrix it is possible to trim the performance of the controller further, intentionally warping the control surface in order to compensate for inherent non-linearities in the plant characteristics. In the design study presented in Appendix A, it is shown how step-response asymmetry may be compensated for using this approach.
e\Δe   -3   -2   -1    0    1    2    3
-3     -3   -3   -3   -3   -2   -1    0
-2     -3   -3   -2   -1   -1    0    1
-1     -3   -3   -2   -1    0    1    2
 0     -2   -2   -1    0    1    2    2
 1     -2   -1    0    1    2    3    3
 2     -1    0    1    2    3    3    3
 3      0    1    2    3    3    3    3
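Look-up table control of this kind reduces to indexing the matrix above with the quantized inputs. A possible Python sketch, given here only for illustration, is:

# Relational matrix of the FPD sub-controller: rows indexed by quantized e,
# columns by quantized de, both on the closed interval [-3, 3].
U_PD = [
    [-3, -3, -3, -3, -2, -1,  0],
    [-3, -3, -2, -1, -1,  0,  1],
    [-3, -3, -2, -1,  0,  1,  2],
    [-2, -2, -1,  0,  1,  2,  2],
    [-2, -1,  0,  1,  2,  3,  3],
    [-1,  0,  1,  2,  3,  3,  3],
    [ 0,  1,  2,  3,  3,  3,  3],
]

def quantize(v):
    """Clamp to [-3, 3] and round to the nearest integer level."""
    return int(round(max(-3.0, min(3.0, v))))

def fpd_output(e, de):
    """Look-up table control action for error e and error difference de."""
    return U_PD[quantize(e) + 3][quantize(de) + 3]

print(fpd_output(1.4, -0.6))   # quantizes to (1, -1) -> output 0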
The relational matrix is shown graphically in Figure 8.2. This is none other than the control surface of the FPD sub-controller. Due to quantization of the controller inputs the controller output is not continuous and smooth, and such a controller is sometimes referred to as a multilevel relay controller. Clearly the control surface can be smoothed and the controller performance improved by using extrapolation techniques, in which case the control actions are defined everywhere in the bounded control space.
Figure 8.2 Control surface of the FPD sub-controller
The final output of this hybrid fuzzy controller is the sum
u = u_PD + u_I = u_PD + fuzzy(k_I) ∫e dt
Extending the method further, it is possible to design a three-term controller using the configuration shown in Figure 8.4, with each of the controller parameters (k_P, k_I and k_D) specified by independent rules, i.e.:
u = fuzzy(k_P) e + fuzzy(k_I) ∫e dt + fuzzy(k_D) Δe
The temporal error response to a step excitation and its derivative can be broken up into regions defined by their zero crossover points. Thus the error may be coarsely described as positive, zero at the crossover or negative. Assume, therefore, that the three fuzzy sets (POsitive, ZerO and NEgative) suffice to describe each control variable. In Region 1 we can thus write the first generic control rule:
R1: IF e is PO AND Δe is NE THEN u is PO
The objective of this rule is to apply maximum positive control action (e.g., torque in the case of a servomotor) to the controlled process in order to force it to accelerate to its final value with a minimum rise time. In Region 2 the corresponding generic control rule is:
R2: IF e is ZO AND Δe is NE THEN u is NE
Here the objective is to apply maximum negative control action to decelerate the process in order to minimize overshoot. Using similar reasoning, the 11-rule Fuzzy Associative Memory that follows is derived.
[Figure: the temporal error response e and its derivative Δe following a step excitation, with the points a through l marking the crossover regions]
Rule    e     Δe    u     Points
1       PO    ZO    PO    a, e, i
2       ZO    NE    NE    b, f, j
3       NE    ZO    NE    c, g, k
4       ZO    PO    PO    d, h, l
5       ZO    ZO    ZO    origin
6       PO    NE    PO    i, v
7       NE    NE    NE    ii, vi
8       NE    PO    NE    iii, vii
9       PO    PO    PO    iv, viii
10      PO    NE    ZO    ix
11      NE    PO    ZO    ix
It is obvious that finer control can be achieved if the number of control rules is increased. The number of rules is consequently a measure of the granularity of the controller.
Rule    e     Δe    u     Points
1       PL    ZO    PL    a
2       PM    ZO    PM    e
3       PS    ZO    PS    i
4       ZO    NL    NL    b
5       ZO    NM    NM    f
6       ZO    NS    NS    j
7       NL    ZO    NL    c
8       NM    ZO    NM    g
9       NS    ZO    NS    k
10      ZO    PL    PL    d
11      ZO    PM    PM    h
12      ZO    PS    PS    l
13      ZO    ZO    ZO    origin
14      PL    NS    PM    i
15      PS    NL    NM    i
16      NL    PS    NM    iii
17      NS    PL    PM    iii
18      PS    NS    ZO    ix
19      NS    PS    ZO    xi
Increasing the number of fuzzy sets assigned to each control variable to seven, i.e., PL for Positive_Large, PM for Positive_Medium, PS for Positive_Small, ZO for ZerO, NS for Negative_Small, NM for Negative_Medium and NL for Negative_Large, leads to the 19-rule FAM given above.
Figure 8.7 shows a schematic of the generic two-term architecture. This incremental fuzzy controller is generic since it can be applied to any dynamic process that exhibits under-damped behavior. Being incremental, it is necessary to supply the nominal controller output u0 to the controller in order to generate the total control variable.
this case the universe of discourse is varied, either in discrete regions in control space or smoothly as the plant approaches the desired operating point (see fuzzy gain-scheduling in chapter 10). This approach has been used to great effect for the control of high precision mechatronic devices.
Chapter 9
As noted earlier, the rules (sometimes referred to as production rules) by which a plant can be controlled are normally expressed as a set of relational statements of the form: IF (Current_State_of_Plant) THEN (Control_Action) These rules, which are suitably encoded, are stored in the knowledge base of the fuzzy controller and are processed by a fuzzy inference engine based on one of the implication functions presented in chapter 6.
a development system DS through which the engineer interacts with the fuzzy controller during the development or whenever modifications have to be made. This module is removed once the controller is finalized.
[Figure: structure of a fuzzy controller. The real-time database (RTDB) links the causes and effects to the fuzzifier (FZ), the inference engine (IE), the de-fuzzifier, the knowledge base (KB), the fuzzy sets (FS) and the development system (DS)]
In modern industrial plants, the link between the supervisory control system (whether SCADA or DCS) and the physical plant being controlled is through a Local Area Network (LAN) to which remote terminal units (RTUs) are attached. The RTUs include analog to digital converters for the conversion of the physical variables at rates that can reach thousands of samples per second. These units invariably include software with which three-term controllers (PID) can be implemented, as well as software with which the RTUs can communicate with the host computer (i.e., the server) via the LAN. The client/server architecture is shown in Figure 9.2 and is one of the most common architectures in use today. It is noted that in the new generation of industrial control systems, smart sensors and actuators with built-in micro-controllers that can be directly connected to the LAN are gradually replacing RTUs.
Distributed supervisory control systems involve a cluster of industrial grade microcomputers, RTUs, micro-controllers and peripherals connected to a LAN. Each component of the cluster performs real-time control of a specific sub-process of the plant. The host computer usually
contains the real-time database where all current data on the state of the plant is stored and can serve this data to any of the clients (sometimes referred to as agents) in the cluster on demand. An alternative architecture involves a distributed real-time database in which each client retains the data pertinent to the tasks it is assigned. The client can transmit this data to any other client on demand. This leads to considerably more data having to be transmitted over the LAN between clients, resulting in data transmission delays. Finally, a hybrid architecture in which the local real-time databases are mirrored in the server can be used. In this case, each client must continuously update the data that has changed since the last transmission in the master database in the host computer. The advantage of this last architecture is that in the unlikely case that the host computer fails and its data is lost or corrupted, then the master database can be restored from the local databases in the clients when the host becomes operational again.
Executable software, which includes the knowledge base, fuzzy sets, inference engine, fuzzifier and de-fuzzifier, is downloaded to the micro-controller via a local link or is burned into an erasable programmable read-only memory (EPROM) which is then plugged into the micro-controller. A new class of intelligent industrial three-term controllers is gradually replacing conventional industrial controllers in a number of critical applications that require increased autonomy. These controllers find use in situations where the operational demands of the application do not permit fixed or programmed gains. Intelligent industrial controllers are implemented in software in micro-controllers (MCs), programmable logic controllers (PLCs) or remote terminal units (RTUs), and a number of vendors today offer appropriate software for their development. In new plants it would be wise to consider incorporating intelligent controllers, which have been shown to enhance control over conventional industrial controllers. In order to minimize memory requirements as well as accelerate computation in embedded fuzzy controllers, many vendors restrict both the number and the shape of the permissible fuzzy sets for both inputs and outputs. Triangular fuzzy sets are almost universally used for the inputs and outputs of the controller. Rules are coded numerically and the number of fuzzy sets is restricted. Singletons are often used to define the fuzzy sets of the outputs of the controllers as they simplify de-fuzzification considerably.
An alternative technique for scheduling the intelligent controller, which is particularly useful when dealing with fast dynamic plants, is to have a control scheduler that continuously tracks the inputs to the fuzzy controller and their changes. Based on some measure of the rates of change (or differences) of these inputs, the control scheduler is programmed to decide whether or not the controller should execute. Clearly, if the inputs to the controller (i.e., the plant variables) do not change significantly with time, implying that the plant is stationary and operating at its nominal state, then there is no reason to effect control and the controller algorithm is not executed. Conversely, if the inputs are changing significantly, implying that the plant is in transition, then immediate action must be taken in order to restore it to its nominal state. By continuously following the changes of the controller inputs, the control scheduler is therefore in a position to issue commands on when to execute the controller. If, following execution and after some minimum time interval, the control scheduler continues to observe that the rates of the controller inputs exceed pre-specified limits, then it sends another command to execute the controller and continues to do so until the plant returns to its nominal state. This scheme leads to a controller that executes at irregular intervals and on demand. The control scheduler is shown schematically in Figure 9.3.
[Figure 9.3: the control scheduler, comprising a differencer and an execution scheduler inserted between the controller input and output]
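A bare-bones sketch of such an execute-on-demand scheduler is given below in Python; the class, threshold and timing values are invented purely for illustration:

import time

class ControlScheduler:
    """Executes the controller only when the inputs are changing significantly."""
    def __init__(self, controller, rate_threshold, min_interval_s=0.5):
        self.controller = controller            # callable: inputs -> control action
        self.rate_threshold = rate_threshold    # significant rate of change of any input
        self.min_interval_s = min_interval_s    # minimum time between executions
        self.prev_inputs = None
        self.last_run = 0.0

    def step(self, inputs):
        now = time.monotonic()
        if self.prev_inputs is None:
            self.prev_inputs = list(inputs)
            return None
        rates = [abs(a - b) for a, b in zip(inputs, self.prev_inputs)]
        self.prev_inputs = list(inputs)
        # Execute only if the plant is in transition and enough time has elapsed
        if max(rates) > self.rate_threshold and now - self.last_run >= self.min_interval_s:
            self.last_run = now
            return self.controller(inputs)
        return None   # plant near its nominal state: no control action issued

Returning None when the plant is quiescent corresponds to the controller executing at irregular intervals and on demand, as described above.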
On executing the fuzzy algorithm and following de-fuzzification, the new control variables are deposited in the real-time database. At regular intervals (measured in milliseconds), and provided there has been a significant change in the control variables since the last time the algorithm was executed, this information is transmitted via the LAN to the appropriate RTUs for immediate action on the plant. This final action closes the control loop. The structure of a fuzzy controller can be seen to be very general and applicable to a wide variety of plants and processes. The only changes required to adapt the controller to a new application are to the number of its inputs and outputs, the shape and number of the fuzzy sets and, most importantly, the rule base.
Sludge is transferred to a second aerated zone where the organic load is removed and nitrification takes place. A fraction of this sludge is then returned to the anoxic zone while the remainder is fed to the secondary settling tank. The quantity of sludge returned for further treatment depends on factors that affect nitrification and de-nitrification and is therefore one of the process variables that must be controlled. Finally, a fraction of the sludge in the secondary settling tank is returned to the biological reactor, while the rest is removed and fed to the fusion stage.
Figure 9.5 Schematic of a typical wastewater treatment plant (bar racks, grit chamber, skimming tank, primary settling tank, biological treatment and secondary settling tank, with sludge returns and solids storage)
The fraction of sludge fed back to the biological reactor is another variable that must be controlled. In all wastewater treatment plants it is necessary to control the oxygen content in the aerated zone of the reactor. The oxygen content depends on the removal of the organic load and nitrification. The removal of organic load, nitrification and de-nitrification are the three principal quantities in a wastewater treatment plant that must be controlled. This is achieved by a suitable control strategy for the following three manipulated variables:
1. the oxygen supply to the aerated zone (O2Feed),
2. the mixed liquid returns rate from the aerated zone to the anoxic zone of the biological reactor (R_ml), and
3. the sludge returns rate from the secondary settling tank to the biological reactor (R_sludge).
Figure 9.6 The treatment process and its principal variables (biological reactor with anoxic and aerated zones, secondary settling tank, oxygen feed, sludge feed and BOD measurements)
The controlled variables of the plant and inputs to the controller are:
1. the ammonia concentration in the reactor (N-NH3),
2. the nitrate concentration in the reactor (N-NO3),
3. the dissolved oxygen in the reactor (DO),
4. the temperature in the reactor (TEMP),
5. the mixed liquid suspended solids concentration in the reactor (MLSS), and
6. the difference in biochemical oxygen demand D(BOD) between the entrance and exit of the secondary settling tank.
The integrity of the controller is directly related to the number of fuzzy variables used. Increasing the number of fuzzy variables, however, increases the memory requirements of the controller steeply, since the rule base grows as a power of the number of fuzzy values. For wastewater plant control, the three fuzzy variables HI (HIgh), OK and LO (LOw) normally suffice to characterize the controller inputs. Trapezoidal fuzzy sets (membership functions) are computationally simple and effective in practice. Similarly, the manipulated variables or controller outputs must be allocated an appropriate number of linguistic descriptors. Five fuzzy variables, i.e., VH (VeryHigh), HI (HIgh), OK, LO (LOw) and VL (VeryLow), provide sufficiently fine control. Finally, for computational simplicity, singletons are used to describe the fuzzy sets of the controller outputs, leading to a particularly simple and fast procedure for de-fuzzification.
The knowledge with which a given wastewater plant is controlled must first be elicited from plant operators. This is a painstaking task and one that is critical for proper operation. If the 6 controlled variables have 3 descriptors each, then the theoretical maximum number of rules is 3^6 or 729, a number which is clearly unmanageable and practically unnecessary. In practice some 50 rules suffice to provide control of a wastewater treatment plant. Of these, some 27 are required to stabilize the organic load (BOD), 11 to stabilize the nitrification process, while 12 rules similar to those for nitrification are necessary to stabilize de-nitrification. The 50 rules, which form the knowledge base of the controller, are considered to be the minimum necessary to achieve acceptable control of a wastewater treatment plant under most operating conditions. A subset of these rules is shown in the Table on p. 130. The control rules have the standard form:
R: IF (D(BOD) is Y1) AND (MLSS is Y2) AND (TEMP is Y3) AND (DO is Y4) AND (N-NH3 is Y5) AND (N-NO3 is Y6)
THEN (O2Feed is U1) AND (R_Sludge is U2) AND (R_ml is U3)
The controller outputs are all normalized to the range [0,100]%. Under normal operating conditions the plant outputs have nominal values of 50% and the corresponding levels of reactor stabilization, nitrification and de-nitrification are 90%, 70% and 60% respectively. A sketch of how such rules are evaluated is given below.
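As an illustration of how rules of this form are evaluated with trapezoidal input sets and singleton outputs, consider the following MATLAB sketch. The membership breakpoints, the measurements and the singleton values are assumed purely for illustration and do not correspond to any particular plant; only two antecedents and two rules are shown to keep the example short.

% Two illustrative rules with trapezoidal input sets and singleton outputs,
% combined by weighted-average de-fuzzification (all numbers are assumed).
trap = @(x,a,b,c,d) max(min(min((x-a)/(b-a), (d-x)/(d-c)), 1), 0);   % trapezoidal MF

dbod = 0.65;  nh3 = 0.40;                        % normalized measurements (assumed)

% rule 1: IF D(BOD) is HI AND N-NH3 is OK THEN O2Feed is VH (singleton 0.9)
w1 = min(trap(dbod, 0.5, 0.7, 0.9, 1.0), trap(nh3, 0.3, 0.45, 0.55, 0.7));
% rule 2: IF D(BOD) is OK AND N-NH3 is OK THEN O2Feed is OK (singleton 0.5)
w2 = min(trap(dbod, 0.3, 0.45, 0.55, 0.7), trap(nh3, 0.3, 0.45, 0.55, 0.7));

O2Feed = (w1*0.9 + w2*0.5)/(w1 + w2 + eps);      % weighted average of the singletons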
Figure 9.7 Variations in the oxygen feed (O2Feed) in response to triangular perturbations in D(BOD) and MLSS
Finally, Figure 9.7 shows the effect on the oxygen feed to the reactor of a triangular perturbation of 50% on D(BOD) and MLSS about their nominal values.
Most kiln operators prefer to operate the kiln in the region of overburning, however, as it is more stable. In between the two extremes there is a narrow region of stable operation in which high productivity and product quality can be achieved. Kiln operators find this state very difficult to maintain for long periods of time, as the controls must be continuously manipulated to account for small changes in the operating state of the process. Kiln operators learn to control their plant from linguistic rules of the type:
IF the Kiln_KW demand is HIgh AND the outlet oxygen content O2 is LOw AND the Previous_Fuel_Feed is OK
THEN make a SMall reduction to the Fuel_Feed AND make NO change to the induced draft fan speed VIDF
Here, Kiln_KW, the outlet oxygen content O2 and Previous_Fuel_Feed constitute the input fuzzy variables, while the current Fuel_Feed and the induced draft fan speed VIDF are the output fuzzy variables. Approximately 50-60 linguistic rules are used to provide very good control of the kiln under normal operating conditions. For start-up and shut-down conditions, as well as abnormal situations, different sets of rules may be used.
The fuzzy controller can control this most difficult of processes as well as, and certainly more consistently than, a human operator, by observing the same control variables and adjusting the same manipulated variables. Fuzzy kiln controllers can easily maintain control of the process in the stable intermediate state, consistently achieving marked fuel economy and high productivity. Fuzzy kiln controllers normally reside in a client on a client-server network, receiving information on the current values of the control variables and returning the manipulated variables to the real-time database on the file server for transmittal to the local RTUs. The fuzzy controller can be executed either at regular intervals or following an interrupt from the control scheduler, which monitors the temporal changes in the control variables. Using fuzzy kiln control, fuel costs have been reduced by up to 5% while productivity has been increased by an equal amount. Today a very large number of kilns worldwide are under fuzzy control. Similar fuzzy controllers have been used to control all of the processes associated with cement production.
Hierarchical intelligent control has also been used to control a cluster of fuzzy controllers, each controlling a different sub-process of the kilning process. The cement industry was the first process industry to adopt fuzzy control and today a number of vendors supply such controllers. There is certainly no doubt that fuzzy control has resulted in a major change in the cement industry and many process industries are now following suit, encouraged by the progress in the field.
Figure 9.10 The rotary kiln intelligent control system and the principal control and controlled variables
Chapter 10
Model-Based Fuzzy Control
nominal operating state. The model-based fuzzy control methodology is thus a fusion of soft and hard control techniques and offers advantages in situations where stability and transient behavior must be guaranteed. In conventional gain-scheduling control, the selection of the scheduling procedure, i.e., the control law to use, is dependent on some exogenous variable. In contrast, fuzzy gain scheduling, which is a special form of model-based fuzzy control, uses linguistic rules and fuzzy reasoning to determine the corresponding control law. Issues of stability, pole placement and closed loop dynamic behavior are resolved using conventional modern control techniques.
The process to be controlled is assumed to be described by the non-linear state equations:
\dot{x} = f(x, u), \quad x(0) = x_0
where x \in \mathbb{R}^n and u \in \mathbb{R}^m are the crisp n-dimensional process state vector and m-dimensional control vector respectively, and x_0 is the initial state. This explicit description of the process may be the result of deep knowledge about the process or the result of identification of the process from experimental data using any of the well-known system identification techniques. One of the interesting features of the first Takagi and Sugeno technique for designing model-based fuzzy controllers is that, under certain conditions that are unfortunately not always easy to satisfy, it guarantees stability of the closed system while specifying its transient behavior through pole-placement. These are properties that are unattainable with the heuristic fuzzy controller design technique. The difficulties in meeting the conditions for stability of the first method proposed by Takagi and Sugeno were eliminated in the second version, which uses state deviations. Both techniques are outlined in this chapter.
X_{ij} = \sum_{x \in X} \mu_{X_{ij}}(x)/x
for the discrete case. The universe of discourse of X_i is given as the set T_{X_i} = \{X_{i1}, X_{i2}, \ldots, X_{ik_i}\}, where k_i is the number of fuzzy values of x_i. In order to simplify the analysis that follows, it will be assumed that the shapes of the fuzzy sets of X_i are identical for all i, that the numbers of fuzzy values k_1 = k_2 = \ldots = k_n, and that X_{1j} = X_{2j} = \ldots = X_{nj}.
The state vector x of a process is defined over some state space. Every crisp value x* of the state vector corresponds to a specific point in that space. In the case of fuzzy controllers based on the Takagi-Sugeno model, the states take on fuzzy values and consequently the concept of state space must be modified to account for the fuzzy values of the state vector. Knowing that every fuzzy variable has a finite number of fuzzy values, we can then generate a finite number of fuzzy vectors that result from the combinations of these fuzzy values. Each element of the crisp state variable x is fuzzified as in the heuristic fuzzy control case. In each fuzzy region of fuzzy state space, a rule uniquely defines the local process model in that region, i.e.,
R_S^i: \text{IF } x = X^i \text{ THEN } \dot{x}^i = f_i(x^i, u^i) \qquad (10.1)
The symbolism x = X^i implies that the state of the process x belongs to the fuzzy region X^i. The consequent of each rule describes an explicit local model of the process in the corresponding fuzzy region X^i. Simple examples of fuzzy process rules for continuous and discrete-time processes, respectively, are:
IF u is Low AND x is High THEN the process model is \dot{x} = -0.7x - 0.2x^3 - 3.1u
ELSE IF u is High AND x is Low THEN the process model is \dot{x} = -0.6x - 0.3x^3 - 2.8u
and, given the process parametric model Pressure(k+1) = a_0 Pressure(k) + b_1 Valve(k) + b_2 Input_Flow_Rate(k):
IF Pressure(k+1) is High AND Valve(k) is Closed AND Input_Flow_Rate(k) is Very_High THEN the process parameters are a_0 = -3, b_1 = 2 and b_2 = -1
ELSE IF Pressure(k+1) is Low AND Valve(k) is OK AND Input_Flow_Rate(k) is High THEN the process parameters are a_0 = -4, b_1 = 1 and b_2 = -2
ELSE ...
In any fuzzy region X^i the process can thus be specified by the state equation:
\dot{x}^i = \mu_S^i(x) f_i(x, u) \qquad (10.2)
where
\mu_S^i(x) = \mu_{X_{i1}}(x_1) \wedge \mu_{X_{i2}}(x_2) \wedge \ldots \wedge \mu_{X_{in}}(x_n) \qquad (10.3)
are the degrees of fulfillment of the local models of the process, using Mamdani's fuzzy compositional rule. For each nominal state of the process, a state equation of the form of (10.2) is determined. Using the set of fuzzy process rules (10.1), we establish the fuzzy open-loop model of the process, which is the weighted sum of the local models f_i(x, u), i.e.,
\dot{x} = \sum_i w_S^i(x) f_i(x, u) \qquad (10.4)
where
w_S^i(x) = \frac{\mu_S^i(x)}{\sum_i \mu_S^i(x)} \in [0, 1] \qquad (10.5)
are the normalized degrees of fulfillment or process function weights. Clearly the sum of the weights is unity, i.e.,
\sum_i w_S^i(x) = 1 \qquad (10.6)
In an analogous manner, each fuzzy control rule defines a local control law g_j(x) in its fuzzy region, with degrees of fulfillment
\mu_C^j(x) = \bigwedge_k \mu_{X_{jk}}(x_k) \qquad (10.7)
so that the overall control action is the weighted sum
u = \sum_j w_C^j(x) g_j(x) \qquad (10.8)
where w_C^j(x) are the control weights, computed from \mu_C^j(x) in the same manner as Equation (10.5). A numerical sketch of this blending of local models and local control laws follows.
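The following MATLAB sketch illustrates the blending of Equations (10.4), (10.5) and (10.8) for a scalar process with two fuzzy regions. The local models are those of the rule example given earlier in the chapter, while the local control laws and the membership functions are assumptions made solely for this illustration.

% Blended Takagi-Sugeno open-loop model and control action for a scalar
% process with two fuzzy regions (local laws and memberships are assumed).
f  = {@(x,u) -0.7*x - 0.2*x^3 - 3.1*u,  @(x,u) -0.6*x - 0.3*x^3 - 2.8*u};  % local models
g  = {@(x) -0.5*x,                      @(x) -1.2*x};                      % local control laws
mu = {@(x) exp(-(x - 1)^2),             @(x) exp(-(x + 1)^2)};             % degrees of fulfillment

x = 0.3;                                 % current crisp state
m = cellfun(@(h) h(x), mu);              % mu_S^i(x)
w = m/sum(m);                            % normalized weights, Equation (10.5)

u    = w(1)*g{1}(x) + w(2)*g{2}(x);      % blended control action, Equation (10.8)
xdot = w(1)*f{1}(x,u) + w(2)*f{2}(x,u);  % blended process model, Equation (10.4)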
\dot{x}^i = f_i(x^i, u^i) = A_i x^i + B_i u^i \qquad (10.9)
where
A_i = \left. \frac{\partial f_i}{\partial x} \right|_{x^i, u^i}, \qquad B_i = \left. \frac{\partial f_i}{\partial u} \right|_{x^i, u^i}
are the usual partial derivatives of the non-linear functions f_i(\cdot) evaluated at the nominal conditions (x^i, u^i). The set of locally linearized models at these nominal states defines the state equations of the controlled process at those states and is termed the overall fuzzy open-loop model of the process. For each linearized local model there must be a corresponding linear control law that guarantees closed system stability while satisfying time-domain criteria. The complete set of control laws constitutes the global control law of the closed system. In the Takagi-Sugeno approach, the decision on the process model and the control law to use clearly depends on the nominal state of the process and is based on fuzzy linguistic rules. The linearized process model rule (10.1) now becomes:
R_S^i: \text{IF } x = X^i \text{ THEN } \dot{x} = A_i x + B_i u \qquad (10.10)
Modern control theory assumes that all the states of the process are measurable or can be estimated, in which case a constant state feedback control law of the form
u^j = K_j x \qquad (10.11)
can be applied, where K_j is a suitable gain matrix. Using any pole-placement technique, the elements of the gain matrix are chosen so that the poles of the closed system yield the desired transient behavior. The corresponding fuzzy control rule is therefore:
R_C^j: \text{IF } x = X^j \text{ THEN } u = K_j x \qquad (10.12)
Now, from the definitions of w_S^i(x) and w_C^j(x) it is seen that the two are in fact identical, whereupon, to simplify notation, let w_S^i(x) = w^i(x) and w_C^j(x) = w^j(x). Substituting the linearized local models (10.10) into (10.4), we obtain the overall (linearized) fuzzy model of the process:
\dot{x} = \sum_i w^i(x) (A_i x + B_i u) \qquad (10.13)
while, from (10.8) and (10.12), the overall fuzzy control law is:
u = \sum_j w^j(x) K_j x \qquad (10.14)
Finally, substituting Equation (10.14) in (10.13) yields the overall homogeneous closed system state equations:
\dot{x} = \sum_i \sum_j w^i(x) w^j(x) (A_i + B_i K_j) x \qquad (10.15)
or
\dot{x} = \sum_i \sum_j w^i(x) w^j(x) A_{ij} x \qquad (10.16)
where A_{ij} = A_i + B_i K_j is a Hurwitz matrix. It should be noted that with this technique, even though linearized models of the process have been used in the development, both the overall system state equations and the overall control law are non-linear because of the non-linearity of the normalized membership functions w^i(x) and w^j(x). This leads to the conclusion that asymptotic stability of the closed system can be guaranteed only locally, i.e., around each nominal state, and cannot be inferred elsewhere.
The closed system (10.15) is asymptotically stable and stabilizable at the origin x = 0 if and only if:
1. the matrix A_{ij} is Hurwitz, i.e., all its eigenvalues possess negative real parts, and
2. there exists a positive definite matrix P such that
A_{ij}^T P + P A_{ij} < 0 \quad \text{for all } i, j \qquad (10.17)
Even though these stability conditions are necessary and sufficient, it is very difficult to determine a suitable positive definite matrix P. Furthermore, if the matrix A_{ij} is not Hurwitz, then a matrix P which satisfies Equation (10.17) does not exist and this technique is not applicable.
R_S^i: \text{IF } x^d = X^i \text{ THEN } \dot{x}^i = f_i(x^i, u^i) \qquad (10.18)
where x^d is the desired state of the process. The modified model-based fuzzy control approach determines which fuzzy region the desired state of the process belongs to before deciding which control law to follow. It is noted that the desired state need not be stationary but may change continuously without the need to re-compute the control law. The desired state x^d and the corresponding control u^d are the result of solving the steady-state equations:
f_i(x^d, u^d) = 0
Assuming that the desired state x^d is constant or varies very slowly with time, so that \dot{x}^d \approx 0, the non-linear system can then be linearized about (x^d, u^d) to yield:
\dot{x} = A^d (x - x^d) + B^d (u - u^d) \qquad (10.19)
which are linear state equations involving the state and control input deviations from their nominal values x^d and u^d respectively, where
A^d = A(x^d, u^d) = \left. \frac{\partial f_i(x, u)}{\partial x} \right|_{x = x^d, u = u^d}
and
B^d = B(x^d, u^d) = \left. \frac{\partial f_i(x, u)}{\partial u} \right|_{x = x^d, u = u^d}
The modified technique has the following limitations that may make application difficult:
- it is valid only for small deviations of the state and input vectors from their nominal values, and
- for every change in the nominal state, the rules by which decisions are made must be changed and the linearization procedure repeated.
\mu_{L^i}(x^i) = \min(1, 1, \ldots, 1) = 1
Thus, at the center of each fuzzy region, the linearized systems in the consequents of the fuzzy rules are given by Equation (10.19) with x^i in place of x^d. The set of rules which describes the fuzzy open-loop model reduces to:
R_S^i: \text{IF } x^d = L^i \text{ THEN } \dot{x} = A^i (x - x^d) + B^i (u - u^d) \qquad (10.20)
while the corresponding fuzzy control rules are:
R_C^j: \text{IF } x^d = L^j \text{ THEN } u = K^j (x - x^d) + u^d \qquad (10.21)
where the gain matrix K^j = K(x^j, u^j) is computed on the basis of the linearized closed system defined in the corresponding fuzzy region L^j. The fuzzy open-loop model is now specified in terms of the deviations of the state and control actions from their nominal values, i.e.:
\dot{x} = \sum_i w^i(x^d) \left[ A^i (x - x^d) + B^i (u - u^d) \right] \qquad (10.22)
u = \sum_j w^j(x^d) K^j (x - x^d) + u^d \qquad (10.23)
It is observed that both the closed system model and the control law are linear, since the normalized membership functions w^i(x^d) and w^j(x^d) are constants. As in the previous case, the closed system is now given by:
\dot{x} = \sum_i \sum_j w^i(x^d) w^j(x^d) \left[ A^i + B^i K^j \right] (x - x^d) \qquad (10.24)
It follows that
\sum_i w^i(x^d) = \sum_j w^j(x^d) = \sum_i \sum_j w^i(x^d) w^j(x^d) = 1
Here
A^d = \sum_i w^i(x^d) A^i; \qquad B^d = \sum_i w^i(x^d) B^i
and
K^d = \sum_j w^j(x^d) K^j
In an analogous way,
\dot{x} = A^d (x - x^d) + B^d (u - u^d)
and
u = K^d (x - x^d) + u^d
The overall closed system is stable at the nominal state x^d if and only if the matrix A^* = A^d + B^d K^d is Hurwitz. If these conditions hold, then we can state the following:
The fuzzy gain-scheduled closed system (10.24) is asymptotically stable if and only if positive definite matrices P and Q exist such that
A^{*T} P + P A^* = -Q \qquad (10.25)
The advantages of the modified technique are that:
- the approach leads to linear descriptions of the open and closed systems and of the desired control law,
- computation of the control law and the conditions for stability are well established, and
- the determination of the matrix P is almost always possible.
The second Takagi-Sugeno technique combines the simplicity of heuristic fuzzy logic with rigorous hard control techniques and offers the possibility of specifying the desired closed system response as well as the conditions for stability. It suffers, however, from the fact that an explicit analytical model of the process must be known.
At low altitude, case (a), the dynamics of the plant are slow, with a normalized time constant of T = 2, whereas at high altitude, case (b), the response of the plant is faster, with a time constant of T = 0.5, and is considerably more sensitive to control actions. The corresponding step responses are shown in Figure 10.2.
Figure 10.2 Step responses of the plant at the two altitudes
The transition of the dynamics of the plant with altitude is assumed to vary smoothly. Consider the use of a fuzzy gain-scheduling controller whose control rules are as follows:
R0: IF x^d = 0 THEN u_1 = g_1(x, x^d) = k_1 (x - x^d)
R1: IF x^d = 1 THEN u_2 = g_2(x, x^d) = k_2 (x - x^d)
In order to obtain the desired closed system response at both altitude extremes, the state feedback gains k_1 = 0.6 and k_2 = 0.4 are selected so that the eigenvalue of the closed system is at (-0.2, 0) in both cases. To simplify the analysis, let the fuzzy membership functions shown in Figure 10.3 indicate how the transition of the dynamics of the plant follows the altitude. It is clear that at very low altitudes the mathematical model of the plant is predominantly of type (a), but as altitude increases the type (a) model increasingly fades into type (b) until x = x^d, at which point the model is entirely of type (b).
The gain-scheduling fuzzy sets can be described by the membership functions \mu_1(x) = 1 - x and \mu_2(x) = x. It is noted that \mu_1 + \mu_2 = 1 for all x, so that the normalized weights are simply w_1 = \mu_1 and w_2 = \mu_2. The overall fuzzy process model is therefore given by the weighted sum:
\dot{x} = w_1 \left[ -0.5(x - x^d) + 0.5u \right] + w_2 \left[ -(x - x^d) + 2u \right]
The fuzzy gain-scheduling controller provides the control actions that force the closed system to follow the desired response for all values of altitude. The response of the closed system to a very large step demand in altitude from x = 0 to x = x^d = 1 is shown in Figure 10.4(b). Figure 10.4(a) shows, for comparison, the response of an invariant system governed by:
\dot{x} = -0.2x + 0.2u
Figure 10.4 Responses of (a) an invariant system with the desired dynamic characteristics and (b) the fuzzy gain-scheduled plant
Since the closed system with the fuzzy gain scheduler is clearly non-linear, it would be unreasonable to expect the two responses shown in Figure 10.4 to be the same. It is noted that in the proximity of x = 0 and x = x^d the response of the closed system approaches the response of the invariant system, as desired. A simulation sketch of this example follows.
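A minimal simulation of this example can be written in a few lines of MATLAB, using forward-Euler integration of the blended model given above. The gains, memberships and local models are those quoted in the text; the integration step and simulation horizon are arbitrary choices made for the sketch.

% Simulation sketch of the fuzzy gain-scheduled altitude example.
h  = 0.01;  t = 0:h:60;                  % time base (arbitrary step and horizon)
xd = 1;                                  % step demand in altitude
k1 = 0.6;  k2 = 0.4;                     % local state-feedback gains
x  = zeros(size(t));                     % plant state, starting at x = 0
for k = 1:length(t)-1
    w1 = 1 - x(k);  w2 = x(k);           % gain-scheduling memberships mu1, mu2
    u  = w1*k1*(x(k) - xd) + w2*k2*(x(k) - xd);                    % blended control law
    dx = w1*(-0.5*(x(k) - xd) + 0.5*u) + w2*(-(x(k) - xd) + 2*u);  % blended plant model
    x(k+1) = x(k) + h*dx;                % forward-Euler step
end
plot(t, x), xlabel('time'), ylabel('altitude x')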
Chapter 11
Neural Control
The emulation of human cognitive ability is one of the primary fields of research interest in Computational Intelligence. The human is nowhere near as fast or as accurate in calculating as a modern computer, yet even at a very early age one can easily recognize objects and relate them to their natural environment, even if they are distorted or partially hidden. Exactly the opposite is true of computers: they are far more capable than humans in performing tedious tasks that involve extensive numerical computations, yet they have difficulty performing cognitive tasks, despite the impressive progress made in Artificial Intelligence. The ability to learn through experience is one of the principal characteristics of humans. Humans have the capacity to store huge quantities and types of information, recall this data and process it in a very short time with little difficulty. One possible explanation for this superiority is the lack of suitable computer software to emulate the human's ability to process information. A second explanation is the fact that the human brain and computers work in radically different ways, the brain being much more efficient at processing information. The human brain is extremely powerful, comprising a huge number (of the order of 10^9) of simple processing elements or neurons, each one of which is capable of communicating with the others. This is massive parallelism, and it is interesting that modern computer designs are increasingly following this architecture.
The neurons of an Artificial Neural Network (ANN) are arranged in such a manner as to process information in parallel and simultaneously. Each neuron sends activation or de-activation signals to other neurons, while its activation depends on the signals received from the other neurons to which it is connected. The term synapses is commonly used for these connections. Suitable interconnection of these simple elements can yield powerful networks with the ability to learn, adapt and infer. The use of artificial neural networks is sometimes known as connectionism, and ANNs can therefore be viewed as generalized connectionist machines or generalized function mappers. Since their re-emergence in the early 1980s, there has been an explosion of interest in the application of ANNs for qualitative reasoning, which is at the core of the fields of Soft Computing and Intelligent Systems. This interest has been encouraged by the increasing availability of powerful parallel computing platforms capable of very high computational speeds and by parallel programming techniques. Multi-layer ANNs are finding use in an ever-increasing range of applications, from image recognition, voice analysis and synthesis to system identification and industrial control. This chapter, and those that follow, present the most commonly used ANN architectures that have found application in Control Engineering, the basic network-learning algorithms and examples of industrial applications.
The origins of ANNs are to be found some fifty years ago in the work of McCulloch and Pitts. The first contribution in the area of network learning was due to Hebb, who showed in 1949 that learning in complex networks can be achieved by adapting the strengths of the synapses. Rosenblatt introduced the Perceptron, an early form of neuron, in the late 1950s. The operation of multi-layer ANNs was not fully understood in those early days and research was restricted to structured perceptrons. Nilsson followed in the mid-1960s with learning machines and machines that were made up of clusters of threshold logic units. In 1969 Minsky and Papert published their seminal work on Perceptrons, in which they proved that Perceptrons are limited in their ability to learn, pointing to their inability to represent a simple XOR logic element. This work was to dampen the enthusiasm for ANNs, and research in the field was virtually paralyzed for almost a decade. Fortunately, ANNs re-emerged in the early 1980s, due mainly to the work of Hopfield, who continued his research on network training
methods. New ANN architectures and powerful learning algorithms were introduced in the mid-1980s, rekindling interest in ANNs and their application. The rapid progress in very large-scale integration (VLSI) and parallel computers aided the developments in ANNs, and today the field of neural networks constitutes a thriving area of research and development. In a very comprehensive paper, Hunt et al. in 1992 (see the Bibliography in Chapter 18) presented a host of applications of neural networks in Control Engineering and the reader is well advised to refer to this work. The properties that make ANNs particularly applicable to control applications are the following:
- being non-linear by nature, they are eminently suited to the control of non-linear plants,
- they are directly applicable to multi-variable control,
- they are inherently fault tolerant due to their parallel structure, and
- faced with new situations, they have the ability to generalize and extrapolate.
These properties satisfy the fundamental requirements for their use in Intelligent Control. Neural controllers and fuzzy controllers thus constitute the core of intelligent control. An ANN is essentially a cluster of suitably interconnected non-linear elements of very simple form that possess the ability to learn and adapt. These networks are characterized by their topology, the way in which they communicate with their environment, the manner in which they are trained and their ability to process information. ANNs are classified as:
- static, when they do not contain any memory elements and their input-output relationship is some non-linear instantaneous function, or
- dynamic, when they involve memory elements and their behavior is the solution of some differential equation.
\nu = w_1 x_1 + w_2 x_2 + \ldots + w_n x_n + b
where w and x are the synaptic weight and input vectors of the neuron respectively, while b is the bias or offset. A positive synaptic weight implies activation, whereas a negative weight implies de-activation of the input. The absolute value of the synaptic weight defines the strength of the connection. The weighted sum \nu activates a distorting (or compression) element f(\nu). One form that this element can take is the threshold logic unit, in which the output of the neuron is triggered when the inner product <w, x> exceeds the bias b. There are many variations of the non-linear distorting element, the most common of which are:
1) Linear (ADALINE): f(\nu) = \nu
2) Binary (Threshold Logic Unit or TLU): f(\nu) = 1 if \nu > 0; f(\nu) = 0 if \nu \le 0
3) Sigmoid: f(\nu) = \frac{1}{1 + e^{-\nu}} \in [0, 1]
4) Hyperbolic tangent: f(\nu) = \frac{1 - e^{-\nu}}{1 + e^{-\nu}} = \tanh(\nu/2) \in [-1, 1]
5) Perceptron: f(\nu) = \nu if \nu > 0; f(\nu) = 0 if \nu \le 0
The input to the compression element may take on either of the following forms, depending on whether the neuron is static or dynamic:
- the weighted sum (for the case of static, memoryless neurons):
\nu = \sum_{i=1}^{n} w_i x_i + b = \sum_{i=1}^{n+1} w_i x_i; \qquad x_{n+1} = 1, \; b = w_{n+1}
- the accumulated weighted sum (for the case of dynamic neurons):
\nu(k) = \nu(k-1) + \sum_{i=1}^{n+1} w_i x_i(k)
Here k is the time index and it is necessary to store the previous value of the weighted sum \nu(k-1). A short sketch of a static neuron with the compression elements listed above is given below.
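A static neuron with the compression elements listed above can be sketched in MATLAB as follows; the function and case names are local to this sketch and are not part of any library.

% Sketch of a static neuron: weighted sum followed by a selectable
% compression element.
function y = static_neuron(w, x, b, kind)
    nu = w(:)'*x(:) + b;                             % weighted sum nu = <w,x> + b
    switch kind
        case 'linear',     y = nu;                            % ADALINE
        case 'binary',     y = double(nu > 0);                % threshold logic unit
        case 'sigmoid',    y = 1/(1 + exp(-nu));              % output in [0,1]
        case 'tanh',       y = (1 - exp(-nu))/(1 + exp(-nu)); % output in [-1,1]
        case 'perceptron', y = max(nu, 0);                    % nu if nu > 0, else 0
    end
end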
In terms of their topology, ANNs are classified as:
- feed-forward networks, in which information flows from the lowest to the highest layers,
- feedback networks, in which information from any node can return to this node through some closed path, including that from the output layer to the input layer, and
- symmetric auto-associative networks, whose connections and synaptic weights are symmetric.
Figure 11.2 A multi-layer feed-forward ANN with an input layer, a hidden layer and an output layer
Figure 11.2 shows an example of a multi-layer feed-forward ANN involving an input layer, a single hidden layer and an output layer. This is a very common feed-forward network topology. Figure 11.3 shows a single-layered Hopfield network, which involves feedback. In contrast to feed-forward networks, every node of a Hopfield network is connected to all others. These ANNs can be useful in Soft Control because they possess the following properties: they
- can learn from experience rather than programming,
- have the ability to generalize,
- can generate arbitrary non-linear input-output mappings, and
- are distributed and inherently parallel.
Figure 11.3 A single-layered Hopfield network (inputs and outputs connected via buses)
can be trained from prior operational data and can generalize when subjected to causes that they were not trained with, and have the inherent ability to process multiple inputs and generate multiple outputs simultaneously, making them ideal for multivariable intelligent control. From the control viewpoint, the ability of neural networks to cope with non-linear phenomena is particularly significant. As is well known, there is no unified theory of non-linear control and there exists a host of scattered techniques capable of giving solutions only to specific cases. ANNs can therefore be used to advantage in the design of nonlinear controllers for nonlinear plants, particularly since the design is the result of learning.
Finally, neural networks have the following properties that make them particularly useful for control: they
- possess a collective processing ability,
- are inherently adaptable,
- are easily implemented,
- achieve their behavior following training,
- can be used for plants that are non-linear and multivariable,
- can process large numbers of inputs and outputs, making them suitable for multi-variable control,
- are relatively immune to noise,
- are very fast in computing the desired control action due to their parallel nature, and
- do not require an explicit model of the controlled process.
outputs. The method is well established and is not limited to linear approximants. By way of example, consider the case of a SISO discrete-time system that is to be identified by the ANN shown in Figure 11.4. The input to the ANN is fed from the input to the physical plant. If, following training, the output of the ANN is identical to that of the plant, then we say that the plant has been exactly identified. In practice, perfect identification is unlikely and the plant can be identified only approximately. The fidelity of the approximation depends on the complexity of the network and is directly related to the order of the neural network and the number of past samples of the input that are used.
Figure 11.4 Identification of a plant by an ANN fed from a tapped delay line of delayors D
By placing a series of n delayors D (i.e., elements that delay the signal by one sample period) in series as shown in Figure 11.4, in effect a tapped delay line, we obtain the input signal x(k) and n delayed versions of it, x(k-1), x(k-2), ..., x(k-n). The n delayed signals from the transversal delay line are then fed to a multi-layer ANN that generates the scalar signal y. This signal is then compared with the desired signal d from the physical plant. The object of identification is to minimize some measure of the difference (d - y) by suitably adjusting (i.e., network training) the synaptic weights of the ANN. As will be seen in the following, a similar technique is used for the control of plants using ANNs. A sketch of how the delayed-signal training set is assembled follows.
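The assembly of the delayed-signal training set can be sketched as follows in MATLAB. The signal names x (plant input) and d (plant output) follow the text, while the number of delayors n and the length of the data record are assumptions of the sketch.

% Assembling the tapped-delay-line training set for identification.
n = 4;                                   % number of delayors (assumed)
N = length(x);                           % x: recorded plant input signal
X = zeros(n+1, N-n);                     % each column is [x(k); x(k-1); ...; x(k-n)]
for k = n+1:N
    xk = x(k:-1:k-n);
    X(:, k-n) = xk(:);
end
D = d(n+1:N);                            % plant outputs aligned with the columns of X
% X and D form the training set; the ANN is trained to minimize some measure
% of (D - y), where y is the network output for each column of X.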
A neural controller that has found use in practice, due to its simplicity, uses an inverse model of the plant. The method has much in common with conventional auto-tuning. During the training phase of the ANN, which is performed off-line with known training sets, the objective is to establish the inverse relationship P^{-1} between the output(s) and the input(s) of the physical plant, in effect the inverse transfer function if the plant is linear. Thus, if the physical plant is characterized by the mapping y = P(u), then following training the ANN will ideally generate the inverse mapping u = P^{-1}(y), so that the overall relationship between the input and the output of the closed controlled system is unity, i.e., perfect tracking! In keeping with accepted symbolism, the output of the neural controller, now the control signal to the plant, has been renamed u, whereas y refers to the plant output. Network training is based on some measure of the open system error e_o = d - y between the desired and the actual outputs of the closed system. A flow diagram of the method is shown in Figure 11.5. For all its simplicity, however, the method is questionable. In theory it should work, but in practice it may not, because identification of an inverse mapping can never be exact, in which case the overall transfer relationship of the system is not unity. In order to work, even partially, the method requires repeated identification at regular intervals, a fact that makes it impractical for most industrial applications.
Training is considerably more difficult, however, with this structure, due to the feedback action.
During the training phase, the simulator ANN learns the functional relationship between input(s) and output(s) (i.e., the transfer function) of the physical plant. This is the identification phase, which is based on some measure of the error between the output of the plant and that of the plant model simulator ANN, i.e., e*=y-y*.
Training can be either off-line or on-line, with random or pseudo-random signals. Simultaneously, the overall error e = d - y is used to train the controller ANN. The advantage of this architecture is that it permits easier on-line training of the controller ANN, since the error can be propagated backwards through the simulator ANN at every sampling instant.
Chapter 12
propagating some measure of the error between the desired and actual output of the network from its output back to its input. In the case of unsupervised learning, no information on the desired output of the network that corresponds to a particular input is available. Here, the network is auto-associative, learning to respond to different inputs in different ways. Typical applications of this class are feature detection and data clustering. Hebb's algorithm and competitive learning are two examples of unsupervised learning algorithms. A wide range of network topologies, such as those due to Hopfield, Hamming and Boltzmann, also use the same method of learning. In general, these networks, with their ability to generate arbitrary mappings between their inputs and outputs, are used as associative memories and classifiers.
\nu = w_1 x_1 + w_2 x_2 + \ldots + w_n x_n + w_{n+1} x_{n+1} = <w, x>
An ADALINE has a linear functional relationship given by f(\nu) = \nu and can be trained by adapting its synaptic weights, provided the desired output d is known. Training is based on some measure of the discrepancy or error e = d - y between the desired and actual output of the element. Widrow and Hoff presented the first successful training algorithm for ADALINEs in the 1960s. They named their training procedure the LMS algorithm (for Least Mean Squares), using the squared error over the training set as the objective function, i.e.,
J = \sum_i e_i^2
which is computed for all the elements of the training set. We seek the values of the synaptic weights that minimize the objective function. The inputs to the ADALINE may be either binary or real, whereas the values of the synaptic weights are always real and may be positive or negative, signifying activation and de-activation respectively. In terms of the inputs and the synaptic weights of the element, the error is:
e = d - <w, x>
where <w, x> is the inner product of the synaptic weight vector and the input vector, and thus the squared error is:
e^2 = d^2 - 2d<w, x> + <w, <x, x>w>
The gradient of its expected value with respect to the weights is:
\frac{\partial E(e^2)}{\partial w} = -2p + 2Rw \qquad (12.1)
where
p = E(dx) \quad \text{and} \quad R = E(<x, x>)
R is the autocorrelation matrix (the power) of the input signal, while p is the expectation of the product of the desired output and the input vector. Setting the partial derivative in Equation (12.1) to zero yields the optimum synaptic weight vector:
w^* = R^{-1} p
which is known as the Wiener solution. If the matrix R and the vector p are known explicitly, then the optimum synaptic weight vector w^* can be computed directly. In practice, however, R and p are not known and the Wiener solution unfortunately cannot be used. Widrow and Hoff observed that using the partial derivative of the squared instantaneous error,
\tilde{\nabla}_k = \frac{\partial e_k^2}{\partial w_k}
instead of its expectation made little difference to the final result. This observation simplified network training significantly, since the squared error sensitivity could now be expressed in terms of the synaptic weights and the error directly, i.e.,
\tilde{\nabla}_k = -2 e_k x_k
so that the synaptic weights are adapted according to
w_{k+1} = w_k + \Delta w_k = w_k - \mu \tilde{\nabla}_k = w_k + 2\mu e_k x_k \qquad (12.2)
where \mu is some positive constant that defines the rate of convergence of the algorithm. This expression shows how the synaptic weights must be adapted at the following iteration of the training procedure. The search direction that must be followed is dependent on the sign of the gradient: thus, if the gradient is negative, then clearly the search trajectory is tending towards some minimum. The algorithm is initialized with an arbitrary initial synaptic weight vector w_0 and a new training set is applied to the element. The iteration is terminated once the squared error in that iteration falls below some acceptable value. At each iteration the change in the weights is
\Delta w_k = 2\mu e_k x_k
and it follows that the corresponding change in the error is
\Delta e_k = -2\mu <x_k, x_k> e_k = -\alpha e_k
This means that the error is reduced by the factor \alpha = 2\mu <x_k, x_k> at each iteration. This factor thus plays a major role in the stability of convergence of the algorithm and must lie in the range 0 < \alpha < 2. In practice, values of \alpha in the range 0.1 < \alpha < 1 are preferred. Too small a value of \alpha leads to very slow convergence, while a large value leads to oscillations and ultimately instability. The Widrow-Hoff training algorithm has the advantage of simplicity but is plagued by slow convergence. Training algorithms are the subject of considerable research and algorithms with vastly increased training rates have been developed. The back-propagation algorithm, which was the major breakthrough that opened up the field of ANNs, is one of the most popular, though by no means the fastest or most reliable. This algorithm is presented later in this chapter. A minimal sketch of the Widrow-Hoff iteration is given below.
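A minimal MATLAB sketch of the Widrow-Hoff iteration (12.2) follows. The training data X (one input pattern per column, with a trailing 1 for the bias), the desired outputs D, the step size and the number of passes are assumptions of the sketch.

% Widrow-Hoff (LMS) training of an ADALINE, Equation (12.2).
mu = 0.05;                               % convergence constant (assumed)
w  = zeros(size(X,1), 1);                % arbitrary initial synaptic weights
for pass = 1:200                         % repeated passes over the training set
    for k = 1:size(X,2)
        e = D(k) - w'*X(:,k);            % error for this training pair
        w = w + 2*mu*e*X(:,k);           % w(k+1) = w(k) + 2*mu*e_k*x_k
    end
end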
certain restrictions on the non-linear element. As will be seen below, the fundamental condition on the non-linear element is that it must be uniformly differentiable, i.e., the non-linearity must not exhibit discontinuities. Given a neuron with \nu = <w, x>, output y = f(\nu) and error e = d - y, the error sensitivity is given by the partial derivative:
\frac{\partial e}{\partial w} = -\frac{\partial y}{\partial w} = -\frac{\partial f(\nu)}{\partial w} = -\frac{\partial f(\nu)}{\partial \nu} \frac{\partial \nu}{\partial w} = -g(\nu) x
It is obvious that for the error sensitivity to exist everywhere, it is necessary that the gradient of the non-linear function, g(\nu) = f'(\nu), exist for all values of \nu. This implies that the non-linear function must be differentiable or smooth. In the special case of a linear neuron, clearly g(\nu) = 1. The development of the Delta algorithm follows that of the Widrow-Hoff algorithm closely. Here, the squared error sensitivity is
\tilde{\nabla}_k = \frac{\partial e_k^2}{\partial w_k} = 2 e_k \frac{\partial e_k}{\partial w_k} = -2 e_k \frac{\partial f}{\partial w_k} = -2 e_k g_k x_k
which shows how the gradient g of the non-linear function f enters the training algorithm. This expression is very similar to that of the Widrow-Hoff algorithm except that an additional term has been added. The Delta training algorithm is given by the iteration
w_{k+1} = w_k + \Delta w_k = w_k - \mu \tilde{\nabla}_k = w_k + 2\mu e_k g_k x_k \qquad (12.3)
In order to accelerate convergence, it is possible to vary the value of \mu according to the progress achieved in training. Thus, at each iteration, the value of \mu may be made a function of the norm of the change in the synaptic weights. If the rate of change of the norm of the synaptic weights drops below some pre-specified lower limit, then the value of \mu is doubled, whereas if the error norm rises above some upper pre-specified limit then the value of \mu is halved. This is perhaps the simplest possible approach to adapting the value of the multiplier \mu; much more sophisticated ways are available but are beyond the scope of this book. A one-step sketch of the Delta rule follows.
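A single Delta-rule update, Equation (12.3), differs from the Widrow-Hoff update only in the factor g(ν). The short sketch below assumes a tanh-type compression function; w, x, d and mu are as in the previous sketch.

% One iteration of the Delta rule (12.3) for a single non-linear neuron.
f  = @(nu) (1 - exp(-nu))./(1 + exp(-nu));       % compression function (assumed)
g  = @(nu) 2*exp(-nu)./(1 + exp(-nu)).^2;        % its derivative f'(nu)
nu = w'*x;                                       % weighted sum for the input x
e  = d - f(nu);                                  % error against the desired output d
w  = w + 2*mu*e*g(nu)*x;                         % Equation (12.3)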
The signals at the inputs and outputs of the neurons of the first layer are respectively:
\nu_1 = w_{11} x_1 + w_{12} x_2 + w_{13}; \quad y_1 = f(\nu_1)
\nu_2 = w_{21} x_1 + w_{22} x_2 + w_{23}; \quad y_2 = f(\nu_2)
while for the single neuron of the second layer:
\nu_3 = v_1 y_1 + v_2 y_2 + v_3; \quad y_3 = f(\nu_3)
The synaptic weights of the first layer are given by the matrix W and those of the second layer by the vector v. The error is
e = d - y_3 = d - f(\nu_3)
The partial derivatives of the squared error with respect to the synaptic weights in each layer are therefore
\frac{\partial e^2}{\partial w_{ij}} = 2e \frac{\partial e}{\partial w_{ij}}; \quad i = 1, 2; \; j = 1, \ldots, 3
\frac{\partial e^2}{\partial v_i} = 2e \frac{\partial e}{\partial v_i}; \quad i = 1, \ldots, 3
Following the steps taken in developing the Delta training algorithm presented in the previous section, the synaptic weights of the two layers must be adapted according to
W_{k+1} = W_k + \Delta W_k = W_k - 2\mu e_k \frac{\partial e}{\partial W}
v_{k+1} = v_k + \Delta v_k = v_k - 2\mu e_k \frac{\partial e}{\partial v}
The training algorithm thus requires evaluation of the partial derivatives of the error with respect to the synaptic weights of both layers, i.e., the error sensitivities
\frac{\partial e}{\partial W} \quad \text{and} \quad \frac{\partial e}{\partial v}
Dropping the iteration index k to simplify notation, the chain rule is used to derive the first three partial derivatives for the first layer, which are:
\frac{\partial e}{\partial w_{11}} = -\frac{\partial y_3}{\partial w_{11}} = -\frac{\partial f(\nu_3)}{\partial \nu_3} \frac{\partial \nu_3}{\partial w_{11}} = -\frac{\partial f(\nu_3)}{\partial \nu_3} \frac{\partial f(\nu_1)}{\partial \nu_1} \frac{\partial \nu_1}{\partial w_{11}} v_1 = -g(\nu_3) g(\nu_1) v_1 x_1
\frac{\partial e}{\partial w_{12}} = -\frac{\partial y_3}{\partial w_{12}} = -\frac{\partial f(\nu_3)}{\partial \nu_3} \frac{\partial \nu_3}{\partial w_{12}} = -\frac{\partial f(\nu_3)}{\partial \nu_3} \frac{\partial f(\nu_1)}{\partial \nu_1} \frac{\partial \nu_1}{\partial w_{12}} v_1 = -g(\nu_3) g(\nu_1) v_1 x_2
\frac{\partial e}{\partial w_{13}} = -\frac{\partial y_3}{\partial w_{13}} = -\frac{\partial f(\nu_3)}{\partial \nu_3} \frac{\partial \nu_3}{\partial w_{13}} = -\frac{\partial f(\nu_3)}{\partial \nu_3} \frac{\partial f(\nu_1)}{\partial \nu_1} \frac{\partial \nu_1}{\partial w_{13}} v_1 = -g(\nu_3) g(\nu_1) v_1
The reader is encouraged to derive the remaining three partial derivatives following the same procedure. It is noted that given the input-output relationship of the compression function analytically, the derivative g can be readily derived. Thus for the tanh non-linearity, used very commonly as the distorting or compression function of the node:
f(\nu) = \frac{1 - e^{-\nu}}{1 + e^{-\nu}} \quad \text{and} \quad g(\nu) = f'(\nu) = \frac{2 e^{-\nu}}{(1 + e^{-\nu})^2}
In the same manner, the partial derivatives for the neuron in the second layer are
\frac{\partial e}{\partial v_1} = -\frac{\partial y_3}{\partial v_1} = -\frac{\partial f(\nu_3)}{\partial \nu_3} \frac{\partial \nu_3}{\partial v_1} = -g(\nu_3) y_1
\frac{\partial e}{\partial v_2} = -\frac{\partial y_3}{\partial v_2} = -\frac{\partial f(\nu_3)}{\partial \nu_3} \frac{\partial \nu_3}{\partial v_2} = -g(\nu_3) y_2
\frac{\partial e}{\partial v_3} = -\frac{\partial y_3}{\partial v_3} = -\frac{\partial f(\nu_3)}{\partial \nu_3} \frac{\partial \nu_3}{\partial v_3} = -g(\nu_3)
These expressions are especially simple since the propagation is through one layer only. Thus, given some random values for the synaptic weights in all layers of the ANN and the known input vector, a first (forward) pass is made during which the signals \nu_1, \nu_2, \nu_3 are computed. Computation of the neuron outputs y_1, y_2, y_3 then follows, from which the various non-linear functions and their derivatives (f, g) are computed. At every iteration of the algorithm, the corrections to the synaptic weight matrix W_k for the first layer and the weight vector v_k for the second layer are computed. These corrections are then added to the previous synaptic weights and the algorithm is repeated until convergence, defined as satisfaction of some error measure, is achieved, whereupon the algorithm is terminated. It is obvious that for complex ANNs containing many layers and many neurons in each layer, it may be necessary to perform many thousands of iterations before convergence is achieved, and clearly computers with high computational speeds are essential. Finally, it must be noted that the back-propagation algorithm, though convenient and used extensively for training ANNs, is by no means the fastest training algorithm. Many variations of this algorithm have been developed by adding terms to the synaptic weight adaptation vector which take into account such factors as the momentum of learning. Such additional terms can result in a significant increase in the rate of convergence of the training algorithm. A compact sketch of the procedure for the 2-2-1 network follows.
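The complete procedure for the 2-2-1 network can be sketched compactly in MATLAB. The training pairs P (2-by-N inputs) and T (1-by-N targets), the use of the tanh-type node function in both layers, the step size and the number of iterations are all assumptions of the sketch.

% Back-propagation sketch for the 2-2-1 ANN (biases carried as extra weights).
f = @(nu) (1 - exp(-nu))./(1 + exp(-nu));   g = @(nu) 2*exp(-nu)./(1 + exp(-nu)).^2;
W  = 0.1*randn(2,3);  v = 0.1*randn(3,1);   % random initial synaptic weights
mu = 0.2;
for it = 1:5000
    for k = 1:size(P,2)
        x   = [P(:,k); 1];                  % input vector with appended bias input
        nu1 = W(1,:)*x;      y1 = f(nu1);   % first-layer neuron 1
        nu2 = W(2,:)*x;      y2 = f(nu2);   % first-layer neuron 2
        nu3 = v'*[y1;y2;1];  y3 = f(nu3);   % second-layer (output) neuron
        e   = T(k) - y3;                    % output error
        W(1,:) = W(1,:) + 2*mu*e*g(nu3)*v(1)*g(nu1)*x';   % first-layer corrections
        W(2,:) = W(2,:) + 2*mu*e*g(nu3)*v(2)*g(nu2)*x';
        v = v + 2*mu*e*g(nu3)*[y1;y2;1];                  % second-layer correction
    end
end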
Figure 12.3 Flow chart for the back-propagation algorithm for the 2-2-1 ANN
Chapter 13
THEN CONTROL_VARIABLE must be Positive_Large will appear as the training string: [-0.5, 0.5, 0] [1]. This training string can be used directly to form the training set of a neural network of specified topology. The corresponding MATLAB Neural Toolbox statement is simply:
P=[-0.5 0.5 0.0];   % inputs
The training set is the collection of all the (P,T) pairs corresponding to every linguistic rule in the rule base.
In a like manner, the corresponding output training set of a multi-input, single-output neural controller can be stated as follows:
T=[0 0.5 -0.5 ...... -1.0];   % string of N corresponding outputs
This training set is used to train the neural network, hundreds or even thousands of times in random order, until the synaptic weights converge. The initial estimates of the unknown synaptic weights are chosen randomly. At every epoch (i.e., iteration of the training algorithm), the synaptic weights are updated in accordance with the training algorithm used. The back-propagation algorithm (BP) is by far the most popular training algorithm, though it often converges slowly, particularly when the network contains many neurons and layers. Fortunately, in the majority of control applications neural controllers normally contain very few neurons and training is fast. Learning is considered complete when some measure of the error between the desired and actual outputs of the network reaches some acceptable limit, or the number of epochs reaches some upper limit.
The minimal neural network shown in Figure 13.2 will be used in the neural controller. The network will be trained using linguistic rules. The controller has two inputs: the error between the desired and actual force, e_k, and the change in error, De_k = e_k - e_{k-1}. The incremental output of the network is Du_k = F(g_i e_k, g_p De_k), while the output of the neural controller is simply:
u_k = u_{k-1} + Du_k = u_{k-1} + F(g_i e_k, g_p De_k)
Finally, the output of the controller is weighted before being applied to the plant as:
c_k = g_c u_k
The parameters g_i, g_p and g_c are the normalizing gains of the controller, necessary to bring the controller inputs into the range [-1, 1] and to scale its output. The value of g_c = (c_k/u_k)_max, while the pair of controller parameters (g_i, g_p) are tuned on-line or obtained in a manner identical to the Ziegler-Nichols method. A sketch of this incremental control law is given below. The linguistic control rules required to control the lathe cutting process are shown in the form of the checkerboard pattern in the linguistic rule matrix in Figure 13.3.
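The incremental control law around the trained network can be sketched as follows. F_net stands for the trained two-input network of Figure 13.2; the signal and gain names (Fd, F_meas, gi, gp, gc) and their numerical values are illustrative assumptions, not plant data.

% One sample of the incremental neural control law.
gi = 0.05;  gp = 0.2;  gc = 10;          % normalizing and output gains (assumed values)
e  = Fd - F_meas;                        % force error at this sample
De = e - e_prev;                         % change in error
Du = F_net(gi*e, gp*De);                 % incremental output of the neural network
u  = u_prev + Du;                        % u(k) = u(k-1) + Du(k)
c  = gc*u;                               % weighted output applied to the plant
e_prev = e;  u_prev = u;                 % stored for the next sample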
Figure 13.2 Minimal neural controller for the mechatronic system
Both the ERROR (e) and the CHANGE_OF_ERROR (De) are quantized over the normalized range [-1, 1] into as many elements as there are linguistic variables assigned to each controller input. Here, each controller input has been assigned seven linguistic variables and consequently the Linguistic Rule Matrix (LRM), or Fuzzy Associative Memory (FAM), contains 49 possible rules, not every one of which is specified. Typically some 20-30% of the possible elements of the FAM are specified. In the example shown in Figure 13.3, the FAM contains 13 of the 49 possible rules.
Figure 13.3 Linguistic Rule Matrix (LRM) or Fuzzy Associative Memory (FAM); both axes (e and De) carry the linguistic values NL, NM, NMS, OK, PMS, PM and PL
Here PL = Positive_Large, PM = Positive_Medium, OK = Normal, NS = Negative_Small, PS = Positive_Small, NM = Negative_Medium, PMS = Positive_Medium_Small and NMS = Negative_Medium_Small. If the linguistic variables for the inputs to the controller are assigned the numerical values NM = -0.8, PM = 0.8, OK = 0, NS = -0.3, PS = 0.3, NM = -0.8, PM = 0.8, and the corresponding controller outputs the values NM = -1, PM = 1, OK = 0, NS = -0.35, PS = 0.35, NM = -0.55, PM = 0.55, then the resulting numerical fuzzy associative memory (NFAM) matrix is that shown in Figure 13.4. The magnitude of the control variable is specified in the darkened elements of the matrix. All other elements are left blank; the neural network will perform the necessary interpolation.
Figure 13.4 Numerical FAM (the axes e and De are each normalized to the range [-1, +1])
The complete network training program in MATLAB is given in Appendix D. On executing this program the following synaptic weights are obtained:
w11 = 0.7233,  w12 = 0.2674,  w21 = 1.0764,  w22 = -0.9706
v1 = 0.798,  v2 = 0.2237
where w_ij are the synaptic weights of the first layer and v_i the synaptic weights of the second layer. The corresponding biases are:
b1 = 0.6312,  b2 = 0.166,  b3 = -0.5403
A sketch of how the trained network is evaluated follows.
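With these values, the trained controller can be evaluated directly. The sketch below assumes tanh-type hidden neurons and a linear output neuron; the activation functions actually used are those defined in the Appendix D program.

% Evaluating the trained 2-2-1 controller (assumed tanh hidden nodes,
% linear output node).
w11 = 0.7233;  w12 = 0.2674;  w21 = 1.0764;  w22 = -0.9706;
v1  = 0.798;   v2  = 0.2237;
b1  = 0.6312;  b2  = 0.166;   b3  = -0.5403;
Du  = @(e, De) v1*tanh(w11*e + w12*De + b1) + v2*tanh(w21*e + w22*De + b2) + b3;
Du(0.5, -0.1)                            % incremental control action for a sample (e, De)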
Figure 13.5 shows the Network Learning Rate and the Network Error (i.e., the sum of squared errors) as functions of the number of epochs. It is noted that fewer than 100 epochs are required to bring about convergence. This translates into a few seconds of computational time, depending on the platform used. The performance of the closed system with the neural controller compares favorably with that of a conventional two-term controller in Figure 13.6, demonstrating that unconventional control can be as good as conventional control and is often superior.
Figure 13.5 Network Errors and Network Learning Rates as functions of epochs
Figure 13.6 Comparison of step responses of the mechatronic system with the conventional controller (continuous line) and the neural controller (crosses)
Five linguistic variables are used for each controller input, sufficient to yield the desired accuracy, whereupon the overall FAM has 3 x 5 x 5 x 5 = 375 rules. In practice this number is unwieldy and it is simpler to design three independent three-input, single-output sub-controllers which are executed sequentially. Here the sub-controllers have identical causes but different effects. The three sub-controllers can consequently be trained independently. Fewer than 125 rules are necessary in practice to achieve satisfactory control. Rule pruning was systematically performed with a view to reducing the number of control rules without loss of controller performance. This led to a reduced training set of 65 rules, an example of which is given below:
R: IF DP is Very_High AND EXT is HIgh AND UP is OK
THEN FEED is Very_High AND HADR is OK AND RDPR is OK
The controller algorithm is resident in the supervisory control system and is executed every 10 seconds. If the resultant incremental control actions are small, indicative of stable operation, they are ignored and the process is maintained at its previous state. Two such neural coal mill controllers have been in continuous operation since 1992 and have consistently led to an energy demand reduction of approximately 5% and a comparable increase in productivity, food for thought for anyone still doubting the economics of intelligent control!
Chapter 14
Neuro-Fuzzy Control
Neuro-fuzzy controllers constitute a class of hybrid Soft Controllers that fuse fuzzy logic and artificial neural networks. Though the principles of fuzzy logic and artificial neural networks are very different, the two techniques have important complementary features: fuzzy logic aims at reproducing the mechanism of the human cognitive faculty, while neural networks attempt to emulate the human brain at the physiological level. In fuzzy controllers, linguistic rules embody the knowledge on how to control a physical plant. In a neural controller this knowledge is embedded in the structure and the synaptic weights of the network. Feed-forward processing in ANNs is analogous to the inference engine in fuzzy logic. Fuzzy controllers use fuzzy compositional rules to arrive at their decisions and require fuzzification of the input variables and de-fuzzification of the composite fuzzy set of the output in order to obtain a crisp output from the controller. In contrast, neural controllers use simple arithmetic techniques, operating directly in the physical world. In both cases, current data from the physical plant being controlled is stored in a real-time database and then processed by an appropriate algorithm. Only in the manner with which the two techniques arrive at this control action do they differ radically. The ability to generalize, i.e., to extrapolate when faced with a new situation, is a feature common to both.
Evolving from very different origins, fuzzy and neural controllers were developed independently by researchers with very different backgrounds and very different objectives. Scant thought was given at the time to the possibility of combining the two. It did not take long, however, for control engineers to realize that the operations of a fuzzy controller could be implemented to advantage with artificial neural networks which, because of their inherent parallelism and their superior computational speed, could lead to controllers capable of significantly higher bandwidth, as required in a number of critical situations. Neuro-fuzzy control has been the subject of numerous books and it is beyond the scope of this book to delve in depth into the various architectures that have been proposed. This chapter presents a brief introduction on how the two techniques can be combined and how the fuzzy controller algorithm can be implemented with artificial neural networks.
A number of neuro-fuzzy controller architectures have been proposed, each with features that make it suitable for specific applications. Considering fuzzy and neural elements as distinct entities, it is possible to construct a controller structured in layers, some of which are implemented with neural elements and others with fuzzy elements. A fuzzy element can, for instance, act as supervisor to a neural element that controls some conventional industrial three-term controller. The following characteristics of ANNs are useful in implementing fuzzy controllers: they
- use a distributed representation of knowledge,
- are macroscopic estimators,
- are fault-tolerant, and
- can deal with uncertainty and vagueness.
Figure 14.1 Multi-layered hybrid neuro-fuzzy controller (for simplicity not all branches are shown)
Figure 14.1 shows the multi-layer neural network of the hybrid neuro-fuzzy controller. It is observed that each layer of the neural network has a fuzzy equivalent. The causes and effects (i.e., the inputs and outputs) of the network correspond to the input and output nodes of the network, while the hidden layers perform the intermediate operations on the fuzzy sets and embed the knowledge base. Every node in the second layer of the network performs a non-linear mapping of the membership functions of the input variables. The second layer involves a cluster of neurons that have been trained a priori to map the desired membership functions. The nodes in the third layer perform the same function as the knowledge base in a fuzzy controller, while the connections between the second and third layers correspond to the inference engine in a fuzzy controller. The nodes in the third layer also map the membership functions of the output variables. In the fourth and final layer there is one node for the output of the network and a node through which the training sets are introduced to the network. The various neurons of the ANN are shown as nodes in Figure 14.1 and have properties which depend on the layer to which they belong. The relation between the inputs and the output of an elemental neuron is, as before, simply:
p i=1
wi ui where y = f()
where y is the output, u_i are the inputs, σ is the weighted sum, w_i are the synaptic weights and f(σ) is the non-linear (compression) function of the elemental neuron. There are p inputs and p+1 synaptic weights for each neuron, to account for the bias term. In the first layer, each neuron distorts the weighted sum of the inputs so that it corresponds to the membership function. For example, if the fuzzy set is Gaussian, then

σ = (x_i − μ_ij)² / s_ij²    and    f(σ) = e^(−σ)
where μ_ij are the centers and s_ij the standard deviations of the membership functions. Here the synaptic weights of the first layer of the network must be equal to the centers of the membership functions, i.e., w_ij = μ_ij. Using Mamdani's fuzzy compositional rule, for instance, the nodes in the second layer of the controller perform the AND operator (i.e., min), in which

σ = min(u_1, u_2, …, u_p)    and    f(σ) = σ

The neurons in the third layer perform the OR operator required in the final stage of the fuzzy compositional rule. Here the synaptic weights of this layer are unity and

σ = Σ_{i=1}^{p} u_i    and    f(σ) = min(1, σ).
The fourth layer of the network possesses two sets of nodes. The first set is required for de-fuzzification; in the case where the fuzzy sets of the output are also Gaussian, the synaptic weights of these nodes and the non-linear compression function are defined so that the crisp output is the center of gravity of the composite output set, i.e.,

σ = Σ_i s_ij u_i    and    y = f(σ) = (Σ_i μ_ij s_ij u_i) / σ
The second set of nodes in this layer transfers the elements of the training sets to the network in a reverse direction, i.e., from the output to the input of the network, and here y_i = d_i. The training of the network is performed in two phases, at the end of which the parameters (μ_ij, s_ij) of the neurons in the first and third layers are determined. During this phase, the network also learns the control rules, storing this knowledge in the synaptic weights of the connections between the second and third layers.
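By way of illustration, the fragment below sketches one forward pass through such a network in MATLAB, the language used for the listings in Appendix B. It is only a sketch under stated assumptions (Gaussian input sets, the min operator for the AND of each rule, and a center-of-gravity style de-fuzzification with Gaussian output sets), and all numerical values and variable names (x, mu, s, rules, muo, so) are illustrative rather than taken from the text.

% Sketch of one forward pass through a hybrid neuro-fuzzy network (illustrative data)
x   = [0.3 0.7];                    % crisp inputs u1 and u2
mu  = [0 0.5 1; 0 0.5 1];           % centers of the input membership functions
s   = 0.2*ones(2,3);                % widths of the input membership functions
% Input membership stage: f(sigma) = exp(-sigma), sigma = (x - mu)^2 / s^2
m = zeros(2,3);
for i = 1:2
  for j = 1:3
    sigma  = (x(i) - mu(i,j))^2 / s(i,j)^2;
    m(i,j) = exp(-sigma);
  end
end
% Rule stage: AND of the antecedent memberships using the min operator
rules = [1 1; 2 2; 2 3; 3 2; 3 1];  % antecedent set indices (u1,u2) of five rules
w = zeros(1,5);                     % degrees of fulfillment of the rules
for r = 1:5
  w(r) = min(m(1,rules(r,1)), m(2,rules(r,2)));
end
% De-fuzzification stage: center-of-gravity with Gaussian output sets
muo = [-1 -0.5 0 0.5 1];            % centers of the output sets
so  = 0.2*ones(1,5);                % widths of the output sets
y = sum(muo.*so.*w) / sum(so.*w);   % crisp controller output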
[Figures omitted: the fuzzy sets LO, OK and HI of the two inputs u1 and u2, the rule table relating them to the five rules R1-R5, and the diagram (Figure 14.4) showing the degrees of fulfillment of the rules for a given pair of input values.]
Given the membership values for each input, the nodes in the second layer compute the degree of fulfillment of each rule, as shown in Figure 14.4. For the given input values, only rules R3 and R5 are fired and the corresponding degrees of fulfillment are 0.2 and 0.5. Figure 14.5 shows the neuro-fuzzy controller with two inputs, five rules and one output. The five rules in the knowledge base are embodied in the second layer. The connections between the nodes in this layer and the next are determined by the rules. Thus, for example, the node corresponding to rule R3 has connections from the nodes representing the fuzzy sets OK and OK respectively while the node for rule R5 is linked to the nodes HI and LO of the first layer.
Figure 14.5 The simple neuro-fuzzy controller with 2-inputs, 1-output and 5 rules
The outputs of the nodes of the second layer are the degrees of fulfillment of each rule. The fuzzy sets of the contributions from each rule that has fired are combined (using the union operator) in the third layer, and the fourth and final layer performs the task of de-fuzzifying the output fuzzy set to yield a crisp output from the neuro-fuzzy controller.
Chapter 15
Evolutionary Computation
The design of intelligent controllers based on unconventional control techniques will undoubtedly become increasingly common in the near future and these developments will rely heavily on the use of the stochastic methods of Soft Computing in seeking optimum results. These hybrid methods offer a new and very exciting prospect for Control Engineering, leading to solutions to problems that cannot be solved by conventional analytical or numerical optimization methods. Although stochastic methods of optimization are computer-intensive, the impressive progress that has been observed in computer hardware over the past decades has led to the ready availability of extremely fast and powerful computers that make stochastic techniques very attractive. One of the emerging techniques of Intelligent Control is the fusion of Fuzzy and Neural Control with Evolutionary Computation. Evolutionary Computation is a generic term for computational methods that use models of biological evolutionary processes for the solution of complex engineering problems. The techniques of Evolutionary Computation have in common the emulation of the natural evolution of individual structures through processes inspired by natural selection and reproduction. These processes depend on the fitness of the individuals to survive and reproduce in a hostile environment. Evolution can be viewed as an optimization process that can be emulated by a computer. Evolutionary Computation is essentially a stochastic search technique with remarkable abilities for locating global solutions.
There has been a dramatic increase in interest in the techniques of Evolutionary Computation since their introduction in the mid-1970s. Many applications of the technique have been reported, including solving problems of numerical and combinatorial optimization, the optimum placing of components in VLSI devices, the design of optimum control systems, economics, modeling ecological systems, the study of evolutionary phenomena in social systems, and machine learning, among others. The idea behind Evolutionary Computation is best explained by the example quoted in Michalewicz (1992): Do what nature does. Let us take rabbits as an example: at any given time there is a population of rabbits. Some of them are faster and smarter than other rabbits. These faster, smarter rabbits are less likely to be eaten by foxes, and therefore more of them survive to do what rabbits do best: make more rabbits. Of course, some of the slower, dumber rabbits will survive just because they are lucky. This surviving population of rabbits starts breeding. The breeding results in a good mixture of rabbit genetic material: some slow rabbits breed with fast rabbits, some fast with fast, some smart rabbits with dumb rabbits, and so on. And on the top of that, nature throws in a wild hare every once in a while by mutating some of the rabbit genetic material. The resulting baby rabbits will (on average) be faster and smarter than those in the original population because more faster, smarter parents survived the foxes ... By analogy, in Evolutionary Computation, solutions that maximize some measure of fitness (the criterion or cost function) have a higher probability of participating in the reproduction process for new solutions, and it is likely that these solutions are better than the previous ones. This is a fundamental premise in Evolutionary Computation. Solutions of an optimization problem evolve by following the well-known Darwinian principle of the survival of the fittest. The basic principles, the principal techniques and the operators of Evolutionary Computation are introduced in this chapter, and an example illustrates how an Evolutionary Algorithm can be used to determine the global optimum parameters of a complex problem. In Chapter 17 the technique is applied to the design of optimized control systems.
The evolutionary cycle of selection, recombination and mutation is repeated until some termination condition is satisfied. Each iteration is termed a generation, while the individuals that undergo recombination and mutation are named parents and yield offspring. Selection aims at improving the average quality of the population, giving the individuals with higher quality increased chances of replication in the next generation of solutions. Selection has the feature of focusing the search on promising areas of the parameter search space. The quality of every individual is evaluated by means of a fitness function, which is analogous to an objective function. The assumption that better individuals have increased chances of reproducing even better offspring is based on the fact that there is a strong correlation between the fitness of the parents and that of their offspring; in Genetics this correlation is termed heredity. Through selection, exploitation of the numerical/genetic information is thereby achieved. Through recombination, two parents exchange their characteristics through a random partial exchange of their numerical/genetic information. The recombination of the characteristics of two parents of high fitness assumes that if a portion of the numerical/genetic information responsible for high values of fitness recombines with that of an equivalent parent, then the chances that their offspring will have as high or even higher fitness values are correspondingly increased. Recombination is also referred to as Crossover. Likewise, through mutation, an individual undergoes a random change in one of its characteristics, i.e., in a specific section of its structure. Mutation aims at introducing new characteristics into the population that do not necessarily exist in the parents, leading thereby to an increase in the variance of the population. Exploration of the search space is achieved through the operators of recombination and mutation. The cornerstone of Evolutionary Algorithms is the iterative procedure of exploring the search space while simultaneously exploiting the information that is being accumulated during the search. This is, in fact, where their power lies. Through exploration, a systematic sampling of the search space is achieved, while through exploitation the information that has been accumulated during exploration is used to search for new areas of interest in which exploration can be continued. Unlike exploitation, exploration includes random steps. It should be emphasized that random exploration does not mean exploration without direction, since the technique focuses on the most promising directions.
The most common types of Evolutionary Algorithms are:

- Genetic Algorithms
- Evolutionary Strategies
- Evolutionary Programming
- Classifier Systems
- Genetic Programming
The first three are used extensively in optimization problems, while Classifier Systems are used in machine learning. Finally, Genetic Programming is used in the automatic production of computer programs. It is noted that Genetic Programming and Classifier Systems are often considered as special cases of Genetic Algorithms and not as special cases of Evolutionary Algorithms.
methods, such as Genetic Algorithms, that are not bound by such constraints. Evolutionary Algorithms:

- seek the optimum solution by searching a population of points of the search (solution) space in parallel, not a single isolated point,
- do not require derivative information or any other auxiliary information; the direction of search is influenced only by the evaluations of the objective function and of the respective fitness function,
- use stochastic (probabilistic) transition rules rather than deterministic rules in the optimization procedure,
- are simple to implement and apply, and
- can yield a population of optimum feasible solutions to a problem and not a unique one; the choice of the best solution is then left to the user. This is very useful in practical problems where multiple solutions exist, as well as in multi-objective optimization problems.
The search process followed by a simple Evolutionary Algorithm for the solution of an optimization problem is shown in the flow chart of Figure 15.1 and is summarized as follows:
1. a population of initial solutions is created heuristically or randomly,
2. the fitness of every individual solution is evaluated using the fitness function, which depends strongly on the corresponding value of the objective function of every candidate solution,
3. the selection operator gives the better solutions improved chances of survival in the next generation,
4. using the recombination operator, two parents, which have been chosen randomly using the selection operator, exchange numerical information according to a pre-defined probability of recombination,
5. using the mutation operator, partial numerical information is perturbed according to a pre-defined probability of mutation,
6. the fitness values of the new population are re-evaluated,
7. if the termination criterion (statistical or temporal) is not satisfied, a return is made to step 3, otherwise the algorithm is terminated, and
8. the best solution from the set of optimum solutions is selected.

If the optimization problem has constraints, there are two alternative methods of handling them: the first uses penalties (solutions which violate the constraints are penalized and their respective fitness function values are reduced) and the second uses a mapping of the candidate solutions with simultaneous use of the exploration operators (recombination and mutation).
% reproduction of the offspring
recombine P'(k);          % Recombination
mutate P'(k);             % Mutation
evaluate P'(k);           % Evaluation of the fitness of the new population
P := survive P, P'(k);    % Selection of survivors
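The fragment above can be read alongside the complete MATLAB program reproduced in Appendix B. The condensed sketch below shows only its genetic loop; the operator functions reproduc, mate, xover, mutate, decode and score are those of the Appendix, and feeding the selected parents straight back into gen is a simplification made here for readability.

% Condensed sketch of the genetic loop of the Appendix B program
[phen, coa] = decode(gen, vlb, vub, bits);    % initial population: genotype -> phenotype
[fitness, object] = score(phen, popsize);     % fitness and objective function values
for i = 1:maxgen
  gen = reproduc(gen, fitness);               % selection of parents
  gen = mate(gen);                            % pairing of the parents
  gen = xover(gen, pcross);                   % recombination (crossover)
  gen = mutate(gen, pm);                      % mutation
  [phen, coa] = decode(gen, vlb, vub, bits);  % decode the new population
  [fitness, object] = score(phen, popsize);   % re-evaluate fitness and objective
end
[best, k] = min(object);                      % best (minimum-objective) solution found
x_best = phen(k, :);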
The stages of a simple Genetic Algorithm are discussed in some detail below.
15.4.1 Initialization
An initial population of N candidate solutions (each a vector of the unknown parameters) is created, and for every solution/individual/chromosome x_i the corresponding objective function value φ_i is evaluated. Alternative methods for the creation of the initial population, or part of it, are through statistical analysis of the search space or through heuristic reasoning.
15.4.2 Decoding
The N candidate solutions of the optimization problem are converted into binary strings of length L that are used to represent real numbers as follows:

00…00 = minimum value of the parameter
00…01 = minimum value of the parameter + q·2⁰
00…10 = minimum value of the parameter + q·2¹
…
11…11 = maximum value of the parameter

where q = (maximum value − minimum value)/(2^L − 1). Clearly the discretization step q specifies the precision of the representation, while the length L of each representation need not be equal for all candidate solutions. When the optimization problem is multidimensional, the partial strings are concatenated as shown in Figure 15.2 in order to create a single binary string. Other mappings that are commonly used are representations with real numbers and the Gray code. By way of example, consider a simple two-dimensional optimization problem whose objective function is quadratic: φ(x) = x1² + x2², where the xi are integers with 0 ≤ xi ≤ 7. Here we select the length of the binary string to be L = 3, since 2³ = 8 is sufficient to represent the integer values of the solutions. Thus the string 000 000 represents the solution (0, 0), the string 000 001 the solution (0, 1), the string 111 111 the solution (7, 7), etc. Following accepted terminology, the binary string is named the genotype, the decoded information the phenotype, while every individual solution is a chromosome.
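The decoding step translates directly into a few lines of MATLAB. The sketch below uses the 3-bit example above; the variable names are illustrative.

% Decode a binary string of length L into a real parameter value
% x = xmin + q * (decimal value of the string), q = (xmax - xmin)/(2^L - 1)
b    = [1 0 1];                              % example genotype, most significant bit first
L    = length(b);
xmin = 0;  xmax = 7;                         % parameter bounds of the example
q    = (xmax - xmin)/(2^L - 1);              % discretization step
x    = xmin + q * sum(b .* 2.^(L-1:-1:0));   % phenotype: here x = 5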
f_i = φ_i / Σ_{j=1}^{N} φ_j    (the relative fitness of the i-th individual)
The crossover point is chosen randomly; in the example, the result of recombination at the third gene (digit) of the two binary strings is the following pair of new strings:
[Diagram omitted: the two parent strings Q1 and Q2 with the randomly selected crossover point marked.]
In string Q1 the digits to the right of the crossover point are exchanged with those of the second string Q2, while the opposite is done with the digits of the second string. Crossover can be performed at one or more points that are selected randomly. An alternative form of crossover is uniform crossover, where every digit of every offspring has equal probability of being taken from either parent. The frequency with which the crossover operator acts on the candidate solutions depends on some pre-defined crossover probability p_cross ∈ [0, 1]. Practical values of the crossover probability are in the range [0.6, 0.95], while techniques have been proposed in which the search process itself adapts the crossover probability. During mutation, a random bit that is selected with some pre-defined mutation probability p_mut takes on the value 0 or 1 at random. This operation results in an increase in the variability of the population. Mutation is necessary since potentially useful genetic material may be lost at specific locations of the generation during previous operations. For instance, after mutation of the second digit/gene, which is chosen randomly, the string Q1 is transformed into:

Q1* = 1 1 0 0 0 1 1 0 0 0 0 1
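A minimal MATLAB sketch of the two operators is given below. The parent strings and the mutation rate are illustrative, the crossover-probability test is omitted, and the selected bits are simply flipped rather than reset at random.

% Single-point crossover of two parent bit strings (illustrative parents)
Q1 = [1 1 0 1 0 0 1 0 1 1 0 0];
Q2 = [0 0 1 0 1 1 0 1 0 0 1 1];
cp = ceil(rand*(length(Q1)-1));        % random crossover point
child1 = [Q1(1:cp) Q2(cp+1:end)];      % exchange the digits to the right of cp
child2 = [Q2(1:cp) Q1(cp+1:end)];
% Bit mutation with probability pmut per digit
pmut = 0.05;
mask = rand(size(child1)) < pmut;      % digits selected for mutation
child1(mask) = ~child1(mask);          % flip the selected digits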
Figure 15.3 depicts the operations of crossover and mutation in genotype space, in phenotype space on the objective function, and their effect on the corresponding fitness function.
15.4.5 Selection
During selection, N individuals are chosen for survival in the next generation according to their fitness values from a population of candidate solutions-individuals. Individuals with high fitness values have an increased probability of survival in the next generation/iteration, compared to those with low fitness values. Many methods of selection have been proposed but here only the popular roulette-wheel method is considered. Consider the surface of a roulette wheel that is divided according to the fitness values, i.e., the angles of every sector of the roulette wheel are set proportional to the fitness. If the relative fitness values are equal (an unlikely situation in practice), then the roulette sectors will have
equal angles, implying that there is an equal probability that the roulette ball will stop in any of the sectors.
The hypothetical case of the ball stopping on a dividing line is discounted! In reality, the fitness values are unequal, in which case the sectors of the roulette will also be unequal, implying that the probability of the ball stopping in any given sector increases with the angle of the sector. It is not improbable, however, that the ball will stop in a sector with a small angle. Imagine now that the ball is rolled and finally stops in one of the N sectors of the roulette wheel. The sector where the ball stops defines the chromosome that will undergo evolution. It is obvious that the ball may fall two or more times in a sector which corresponds to a high fitness value, in which case the corresponding solution/chromosome is selected an equal number of times for survival in the next generation. A solution with low fitness may not be selected for survival in the next generation.
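In code, the roulette wheel reduces to sampling from the cumulative sum of the relative fitness values. The MATLAB sketch below uses the fitness values of Figure 15.4 purely as an example.

% Roulette-wheel selection of N survivors from the fitness values f
f = [25 9 52 13];                 % fitness values (example of Figure 15.4)
N = length(f);
p = f / sum(f);                   % sector angles as fractions of the wheel
c = cumsum(p);                    % cumulative probabilities
selected = zeros(1, N);
for i = 1:N
  r = rand;                       % "roll the ball"
  selected(i) = find(r <= c, 1);  % index of the sector in which it stops
end
% 'selected' may contain the same high-fitness chromosome more than once

Repeating the draw N times yields the parents of the next generation, possibly with repetitions, exactly as described above.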
A/A   Chromosome   Fitness   % of Total
 1    011100         25       25.25%
 2    000011          9        9.09%
 3    110100         52       52.53%
 4    010011         13       13.13%
TOTAL                99      100.00%

Figure 15.4 Roulette-wheel selection for the objective function φ(x1, x2) = x1² + x2² and a population of N = 4
Figure 15.7 Evolution of the trajectory of the optimum solution in every generation
If the dynamic behavior of the process is known and there exists a macroscopic model of the process, then we may use the single objective function (or criterion) to evaluate the performance of the closed system to a step disturbance:
ITAE = ∫₀ᵀ t·|e(t)| dt    or    ISE = ∫₀ᵀ e²(t) dt

where T → ∞. These are the familiar ITAE and ISE criteria, respectively. Alternatively, some criterion which involves overshoot, steady-state error and rise time could be used.
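For a sampled response the two integrals are approximated by sums. A minimal MATLAB sketch, with an illustrative error signal standing in for a recorded step response, is:

% ITAE and ISE criteria computed from a sampled error signal
dt = 0.01;                           % sampling interval (illustrative)
t  = 0:dt:10;                        % time vector
e  = exp(-t).*cos(5*t);              % illustrative error signal of a step response
ITAE = sum(t .* abs(e)) * dt;        % approximates the integral of t*|e(t)| dt
ISE  = sum(e.^2) * dt;               % approximates the integral of e(t)^2 dt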
If every weight is decoded into a binary string with L digits, the total length of the chromosome is clearly NL. It is obvious that the use of Genetic Algorithms in problems involving ANNs with thousands of neurons becomes difficult and extremely time-consuming. However, ANNs which are used in Neural Controllers usually have a very simple architecture with rarely more than 30 neurons or more than one hidden layer, in which case GAs are very efficient. The second application of Evolutionary Algorithms in the field of Neural Control has to do with the evolution of the topology of the network (i.e., the manner in which the neurons of the network are interconnected), with or without parallel evolution of the weights of the network. This problem is practically unsolvable with conventional optimization methods. The main characteristic that makes Evolutionary Algorithms attractive for a broad class of optimization problems is their robustness, because:
- they do not require specific knowledge of, or derivative information about, the objective function,
- discontinuities, noise and other unpredictable phenomena have little impact on the performance of the method,
- they work in parallel on a population of points in the solution space, exploring the search space while simultaneously exploiting the information derived, and they do not become entrapped in local optima,
- they perform well in multidimensional, large-scale optimization problems, and
- they can be applied to many different optimization problems without big changes in their algorithmic structure.
The main disadvantages of GAs are that they:
- face some difficulties in locating the precise global optimum, although it is easy for them to locate the vicinity in which the global optimum lies, and
- require a great number of evaluations of the objective function and therefore considerable computational power.
With regard to the first disadvantage, many hybrid evolutionary algorithms, which embed local search techniques, have been proposed. Methods like hill-climbing or Simulated Annealing (presented in the next chapter) in combination with Evolutionary Algorithms have been developed in order to determine the exact global optimum and to improve the overall performance of the search algorithm. A notable genetic algorithm is that due to Moed and Saridis for global optimization. Concerning the second disadvantage, the dramatic evolution in computer technology, in combination with the progress in parallel computing machines, is tending to minimize this disadvantage.
Chapter 16
Simulated Annealing
Simulated Annealing, which has much in common with Evolutionary Computation, is a derivative-free stochastic search method for determining the optimum solution in an optimization problem. The method was proposed by Kirkpatrick et al. in 1983 and has since been used extensively to solve large-scale problems of combinatorial optimization, such as the well-known traveling salesman problem (TSP), the design of VLSI circuitry and the design of optimum controllers. The main difference between Evolutionary Computation and Simulated Annealing is that the latter is inspired by the annealing process for metals during cooling, while the former is based on evolutionary processes. The principle of annealing is simple: at high temperatures the molecules in a metal move freely, but as the metal is cooled gradually this movement is reduced and atoms align to form crystals. This crystalline form actually constitutes a state of minimum energy. Metals that are cooled gradually reach a state of minimum energy naturally, while if they are forcibly cooled they reach a polycrystalline or amorphous state whose energy level is significantly higher. Metals that are annealed are pliable, while forcibly cooled metals are brittle. However, even at low temperatures there exists a small but finite probability that the metal will enter a state of higher energy. This implies that it is possible that the metal will leave the state of minimum energy for a new state where the energy is increased. During the cooling process, the intrinsic energy may rise or drop, but as the temperature is lowered the probability that the energy level will increase suddenly is reduced. The probability of a change in the state of the metal at some temperature T from an initial energy level E1 to some other state with energy level E2 is given by:
p = e^(−(E2 − E1)/(κT))    if E2 > E1
p = 1                       otherwise
where κ is Boltzmann's constant. This thermodynamic principle was adapted to numerical analysis by Metropolis et al. in 1953, giving rise to the term Simulated Annealing. Simulated Annealing attempts to minimize energy; this is similar to minimizing a Lyapunov function in modern control theory. In implementing the Metropolis algorithm the following must be known:

- the objective function (by analogy with the energy E of the metal) whose minimum is sought, and
- a control parameter T (the simulated temperature) whose temporal strategy defines the changes in the simulated temperature at every iteration of the algorithm.
3. using some stochastic or heuristic strategy, a new solution vector x2 is selected and the corresponding objective function value φ(x2) is evaluated,
4. the difference of the objective function Δφ = φ(x2) − φ(x1) is computed,
5. if Δφ < 0, then the solution vector x2 is accepted; otherwise, if Δφ > 0, the solution vector is accepted according to the probability of acceptance

p(k) = e^(−Δφ/T(k))

otherwise a jump is made to step 7,
6. x1 = x2 and φ(x1) = φ(x2) are set and the current simulated temperature is weighted with the coefficient λ, where 0 < λ < 1, decreasing the simulated temperature successively at every iteration, so that at the (k+1)st iteration T(k+1) = λT(k), where k is the iteration index,
7. if the current simulated temperature is lower than or equal to the final temperature, i.e., T(k) ≤ T_final, then the current solution vector is accepted as being optimum; otherwise a return is made to step 3 and the process is repeated.

If the Simulated Annealing algorithm is to succeed, it is important that the temporal annealing strategy that is followed, i.e., the simulated temperature profile, be suitable. The rate at which the simulated temperature is decreased depends on the weighting coefficient λ. Too high a simulated cooling rate leads to non-minimum-energy solutions, while too low a cooling rate leads to excessively long computation times. The closer the value of λ is to unity, the more slowly the simulated temperature decreases. Figure 16.1 shows the probability of acceptance of the solution, p(Δφ), as a function of the iteration index for different values of λ. In order to achieve effective exploration of the search space, it is advisable to use 0.95 < λ < 0.98. Finally, as in Evolutionary Computation, the trajectory of an optimization problem is critically dependent on the initial estimates of the optimum solutions, which are heuristic or the result of statistical analysis.
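Steps 3 to 7 translate into a short loop. The MATLAB sketch below assumes an objective function phi (here the quadratic example used earlier in the book) and a simple Gaussian perturbation for generating the new solution vector; both choices, like the annealing-schedule values, are illustrative rather than prescribed by the text.

% Skeleton of the Simulated Annealing iteration (steps 3-7)
phi = @(x) x(1)^2 + x(2)^2;                 % illustrative objective function
T = 10; Tfinal = 1e-3; lambda = 0.97;       % annealing schedule (illustrative)
x1 = [0.8 -0.6]; phi1 = phi(x1);            % initial estimate of the solution
while T > Tfinal
  x2   = x1 + 0.1*randn(size(x1));          % step 3: new candidate solution vector
  phi2 = phi(x2);
  dphi = phi2 - phi1;                       % step 4: change in the objective
  if dphi < 0 || rand < exp(-dphi/T)        % step 5: Metropolis acceptance test
    x1 = x2;  phi1 = phi2;                  % step 6: accept the new solution
  end
  T = lambda * T;                           % step 6: lower the simulated temperature
end
% step 7: x1 now holds the (approximately) optimum solution vector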
Figure 16.1 Variation of the acceptance probability p(Δφ) as a function of the iteration index k
Figure 16.2 Trajectory of the solution in the search space (x1, x2)
Figure 16.3 Evolution of the objective function φ(x1, x2) as a function of the iteration index
Here e is the error between the desired and the actual output and u is the control variable. The unknown parameters of the industrial controller are the gains Kp and Ki, whose optimum values are sought. One way of determining the best gain pair is to use classical tuning methods such as those of Ziegler and Nichols, or modern tuning techniques such as those of Persson and Astrom (see the Bibliography in Chapter 18). These methods (i) assume a simplified dynamic model of the plant and (ii) use heuristics to arrive at the best parameters instead of an analytical error criterion. Here, in contrast, an analytical criterion is used directly and the optimum pair is determined using stochastic techniques.
Using the ITAE criterion, it is desired to obtain the values of the parameter pair (Kp, Ki) which minimize the objective function (i.e., error criterion):

φ(Kp, Ki) = ITAE = ∫₀ᵀ t·|e(t)| dt

where T → ∞. An example of the evolution of the criterion as a function of the iterations is depicted in Figure 16.7. Convergence is achieved in about 120 epochs (iterations).
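To make the link with the stochastic search concrete, the objective φ(Kp, Ki) can be written as a small closed-loop simulation. The sketch below uses a purely illustrative first-order-plus-dead-time plant and forward-Euler integration; it is not the model of the process treated in the text, whose dynamics are assumed unknown, and the file name and parameter values are hypothetical.

% File itae_pi.m - ITAE objective for a PI controller (illustrative plant model)
function J = itae_pi(Kp, Ki)
dt = 0.01; Tsim = 50;                     % integration step and simulation horizon
tau = 5; K = 1; Ld = 1;                   % illustrative plant: K e^(-Ld s)/(tau s + 1)
nd = round(Ld/dt);                        % dead time in samples
n  = round(Tsim/dt);
y = 0; xi = 0; ubuf = zeros(1, nd);       % plant output, error integral, delay buffer
J = 0;
for k = 1:n
  e  = 1 - y;                             % unit step demand
  xi = xi + e*dt;
  u  = Kp*e + Ki*xi;                      % two-term (PI) control law
  ud = ubuf(1); ubuf = [ubuf(2:end) u];   % apply the delayed control action
  y  = y + dt*(K*ud - y)/tau;             % forward-Euler step of the plant
  J  = J + (k*dt)*abs(e)*dt;              % accumulate the ITAE criterion
end

A Simulated Annealing or Genetic search then simply calls itae_pi(Kp, Ki) in place of the generic objective function.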
Figure 16.8 shows the evolution of the parameter pair (Kp,Ki) of the industrial controller. It is noted that convergence is achieved, whatever the initial values of the unknown parameters. Finally, the step response of the closed-loop system is shown in Figure 16.9. This response must be compared with that of Figure 14.6 for the non-optimized neural controller. A cursory glance will confirm that the response of this system is superior.
Figure 16.9 Step response of the closed-loop system with minimum ITAE
Chapter 17
Stochastic optimization with a qualitative measure of the performance of the closed system, instead of a quantitative one, offers distinct advantages in controller design to the Control Engineer, who is able to relate to the design problem directly in linguistic terms rather than through some abstract analytical formulation.
that was described in Chapter 15. The main difference here is that the evaluation of fitness does not follow the classical approaches, but is derived from a set of linguistic rules that express the multiple engineering objectives of the problem. The consequent of each fuzzy rule is taken as the objective function φ, which can take m fuzzy linguistic values from the set w = {w1, w2, …, wm}. Assume also that a = {a1, a2, …, ak} is the set of parameters which constitute the antecedents, and vi = {vi1, vi2, …, viq} is the set of q fuzzy values of the variable ai. Then the linguistic rules that specify the fitness function in the Evolutionary Algorithm take the form:

R: IF a1 is v1i AND a2 is v2i AND a3 is v3i THEN φ is wi

The total number of design rules is thus equal to the product of the numbers of fuzzy values of the individual antecedents.
The controller attributes are described by fuzzy sets, three to five being sufficient for most practical purposes. These fuzzy sets describe the desired attributes of the design. The specifications normally used in designing industrial controllers are the overshoot, the rise time (i.e., the time for the closed-loop response to reach some specified percentage of its final value) and the settling time (i.e., the time required for the closed-loop response to remain within some specified percentage of its final value). More design specifications can be added as necessary, e.g., the steady-state error and the maximum permissible control actions, in which case the complexity of the computational problem increases proportionally. Consider, for example, the fuzzy sets for the Rise_Time, Overshoot and Settling_Time to be Small, Medium and Large, while the fuzzy sets of the resultant Fitness are Very_Small, Small, Negative_Medium, Medium, Positive_Medium, Large and Very_Large. A sample of suitability rules can therefore be stated as follows:

R1: IF (Rise_Time is Small) AND (Overshoot is Small) AND (Settling_Time is Small) THEN (Fitness is Very_Large)
R2: IF (Rise_Time is Medium) AND (Overshoot is Small) AND (Settling_Time is Small) THEN (Fitness is Large)
R3: IF (Rise_Time is Medium) AND (Overshoot is Medium) AND (Settling_Time is Large) THEN (Fitness is Negative_Medium)
R4: IF (Rise_Time is Large) AND (Overshoot is Medium) AND (Settling_Time is Large) THEN (Fitness is Small)
R5: IF (Rise_Time is Large) AND (Overshoot is Large) AND (Settling_Time is Large) THEN (Fitness is Very_Small)

Examples of membership functions that can be used in the qualitative design technique are shown in Figure 17.1. Assuming that the settling time of the closed system is to remain constant, the variation of the fitness with rise time and overshoot is shown in Figure 17.2. The complete set of linguistic rules, containing 3³ = 27 rules, constitutes the rule-base of the design procedure and is given in the Table that follows. These rules may be modified to satisfy any control objective, and MATLAB and its Fuzzy Toolbox can be used to implement the design technique.
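A few lines of MATLAB indicate how such a rule-base can be entered with the Fuzzy Toolbox mentioned above, using the classic newfis/addvar/addmf/addrule/evalfis functions. The universes, the membership-function parameters and the three rules entered are illustrative and are not those of Figure 17.1 or of the complete 27-rule Table.

% Qualitative fitness evaluation with the Fuzzy Toolbox (illustrative parameters)
fis = newfis('fitness');
fis = addvar(fis, 'input', 'Rise_Time',     [0 1]);
fis = addvar(fis, 'input', 'Overshoot',     [0 1]);
fis = addvar(fis, 'input', 'Settling_Time', [0 1]);
fis = addvar(fis, 'output', 'Fitness',      [0 1]);
for i = 1:3                                   % Small, Medium, Large for each input
  fis = addmf(fis, 'input', i, 'Small',  'trimf', [0 0 0.5]);
  fis = addmf(fis, 'input', i, 'Medium', 'trimf', [0 0.5 1]);
  fis = addmf(fis, 'input', i, 'Large',  'trimf', [0.5 1 1]);
end
names = {'Very_Small','Small','Negative_Medium','Medium', ...
         'Positive_Medium','Large','Very_Large'};
for j = 1:7                                   % seven Gaussian fitness sets
  fis = addmf(fis, 'output', 1, names{j}, 'gaussmf', [0.07 (j-1)/6]);
end
% Rules R1, R2 and R5 of the text: [in1 in2 in3 out weight AND]
fis = addrule(fis, [1 1 1 7 1 1;  2 1 1 6 1 1;  3 3 3 1 1 1]);
fitness = evalfis([0.3 0.1 0.2], fis);        % fitness of one candidate design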
Figure 17.1 Fuzzy sets used in the qualitative design technique: (a) Rise_Time, (b) Overshoot, (c) Settling_Time, (d) Fitness
20. IF (Rise_Time is Large) AND (Overshoot is Small) AND (Settling_Time is Medium) THEN (Fitness is Medium)
21. IF (Rise_Time is Large) AND (Overshoot is Small) AND (Settling_Time is Large) THEN (Fitness is Negative_Medium)
22. IF (Rise_Time is Large) AND (Overshoot is Medium) AND (Settling_Time is Small) THEN (Fitness is Medium)
23. IF (Rise_Time is Large) AND (Overshoot is Medium) AND (Settling_Time is Medium) THEN (Fitness is Negative_Medium)
24. IF (Rise_Time is Large) AND (Overshoot is Medium) AND (Settling_Time is Large) THEN (Fitness is Small)
25. IF (Rise_Time is Large) AND (Overshoot is Large) AND (Settling_Time is Small) THEN (Fitness is Negative_Medium)
26. IF (Rise_Time is Large) AND (Overshoot is Large) AND (Settling_Time is Medium) THEN (Fitness is Small)
27. IF (Rise_Time is Large) AND (Overshoot is Large) AND (Settling_Time is Large) THEN (Fitness is Very_Small)
Rule-Base for the qualitative evaluation of controller fitness using Rise_Time, Overshoot and Settling_Time.
Example 17.1 Design of an optimum two-term controller for temperature control of a greenhouse
The qualitative controller design technique is applied to the design of a two-term (PI) controller to control the environment inside a greenhouse. Following a step demand in the reference temperature Tref, the temperature T at some point in the greenhouse shown in Figure 17.3, has the characteristic dead time followed by an exponential rise shown in Figure 17.4.
Figure 17.3 Schematic of the controlled greenhouse

The proposed technique is model-free and no attempt is made to obtain a low-order approximant of the plant in order to tune the closed system. Here it is sufficient to know only the continuous or discrete step response. The design objective is to find the optimum parameters (Kp, Ki)* of a two-term controller that will lead to a closed-system step response with a nominal rise time Trise = 26 units, a nominal overshoot of p = 10% and a nominal settling time of Ts = 20 units. Instead of defining some quantitative criterion with penalty functions, qualitative measures are used to describe the desired attributes of the closed system. Each attribute is therefore assigned a fuzzy variable, and together these variables define the suitability of the overall system.
The fuzzy sets of the three attributes that form the inputs to the inference engine are assumed to be triangular while the fuzzy sets of the suitability function are Gaussian. The fuzzy sets are shown in Figure 17.1. In deriving the control surface, use is made of the Mamdani min-max compositional inference rule. De-fuzzification uses the Center of Gravity (COG). A simple Genetic Algorithm, which follows an elitist strategy, is finally used to obtain the global optima of the controller parameters. It is observed that fitness decreases with increasing overshoot and increasing rise time.
Figure 17.4 Normalized step response of the greenhouse

The step response of the optimum system designed with the proposed hybrid Evolutionary-Fuzzy (E-F) technique is compared in Figure 17.5 with that of the same controller designed for minimum ITAE. It is evident that the response of the E-F design is superior, being better damped, and reaches the steady state faster than the ITAE design. It is interesting to note that the E-F design has a higher ITAE index than that of the ITAE design. That the E-F design is superior to the ITAE design is no surprise, since a multi-objective criterion was used in the former.
Figure 17.5 Comparison of closed system step responses for the optimum E-F and minimum ITAE controllers
Example 17.2 Design of an optimum neural controller for a lathe cutting process
A rule-based neural controller for a cutting lathe was described earlier in chapter 16 and a schematic of the lathe cutting process was shown in Figure 16.7. The response of the process to a small step demand in feed rate was shown in Figure 16.8. The objective in this case study is a controller with a rise time of less than some specified value Trise, an overshoot that does not exceed p% of the steady state value and a settling time Tset less than some specified value. These design objectives can be achieved by (i) the proper choice of the control rules, (ii) the inference mechanism and (iii) optimization of the free parameters of the controller. Assume that the first part of the controller design procedure that involves rule elicitation and rule-encoding, has been completed and that a suitable neural network of specified topology has been designed and trained to generate the desired control surface, as in chapter 16. Our concern here is the second part of the design, i.e., the optimization of the free controller parameters.
Twenty-seven rules relating the design attributes were necessary to specify the desired properties of the closed system completely and are displayed in the rule-base Table given earlier in this chapter. The fuzzy sets of the three controller design attributes constitute the inputs to the inference engine and are assumed to be triangular, while the fuzzy set for the fitness function is taken as Gaussian, as shown in Figure 17.1.
Figure 17.6 Step responses for the optimum E-F and minimum ITAE controllers

The step response of the optimum controller designed with the qualitative Evolutionary-Fuzzy (E-F) design technique is compared with that of the same controller designed for minimum ITAE in Figure 17.6. The response of the qualitative design is superior, being better damped and faster in reaching the steady state than the ITAE design, presumably because a multi-objective criterion has been used.
Chapter 18
Bibliography
A. Computational Intelligence
Eberhart R. C., Dobbins R. C. and Simpson P. K. (1996) : Computational Intelligence PC Tools, AP Professional. Kaynak O. (1998) : Computational Intelligence: Soft Computing and Fuzzy-neuro Integration with Applications, Springer-Verlag, Berlin. Palaniswami M., Attikiouzel Y. and Marks R. (Eds) (1996) : Computational Intelligence - a dynamic systems perspective, IEEE Press, NY. Pedrycz W. (1997) : Computational Intelligence - an Introduction, CRC Press. Poole D., Mackworth A. and Goebel R. (1998) : Computational Intelligence, Oxford University Press, Oxford. Reusch B. (Ed.) (1999) : Computational Intelligence: Theory and Applications, Springer-Verlag, Berlin. Tzafestas S. G. (Ed.) (1999) : Computational Intelligence in Systems and Control: Design and Applications, Kluwer Academic Publications, Hingham, Ma.
247
248
Chapter 18
B. Intelligent Systems
Antsaklis P. J., Passino K. M. and Wang S. J. (1989) : Towards intelligent autonomous control systems: architecture and fundamental issues, Journal of Intelligent and Robotic Systems, Vol. 1, pp. 315-342. Bernard J. (1988) : Use of rule-based systems for process control, IEEE Control Systems Magazine, Vol. 8, No. 5, pp. 3-13. Bigger C. J. and Coupland J. W. (1982) : Expert Systems : a bibliography, IEE Publications, London. Chiu S. (1997) : Developing Commercial Applications of Intelligent Control, IEEE Control Systems Magazine, April, pp. 94-97. Francis J. C. and Leitch R. R. (1985) : Intelligent knowledge based process control, Proc IEE Conference on Control, London. Harris C. J., Moore C. G. and Brown M. (1993) : Intelligent Control, Aspects of Fuzzy Logic and Neural Nets, World Scientific, Singapore. Saridis G. N. (1979) : Towards the realization of intelligent control, Proc IEEE, Vol. 67, No. 8, pp. 1115-1133. Saridis G. N. and Valavanis K. P. (1988) : Analytical design of intelligent machines, Automatica, Vol. 24, No. 2, pp. 123-133. Saridis G. N. (1996) : On the theory of intelligent machines: a comprehensive analysis, Int. Journal of Intelligent Control and Systems, Vol. 1, No. 1, pp. 3-14. Tzafestas S. G. (Ed.) (1993) : Expert Systems in Engineering Applications, Springer-Verlag, Berlin. Tzafestas S. G. (Ed.) (1997) : Methods and Applications of Intelligent Control, Kluwer Academic Publishers, Hingham, Ma. Tzafestas S. G. (Ed.) (1997) : Knowledge Based Systems Control, World Scientific, Singapore. Valavanis K. and Saridis G. N. (1992) : Intelligent robotic system theory: design and applications, Kluwer Academic Publishers, Hingham, Ma.
Bibliography
249
250
Chapter 18
Larsen P. M. (1980) : Industrial applications of fuzzy control, Int. Journal of Man-Machine Studies, Vol. 12 (one of the first publications on applications of fuzzy control). Lee C. C. (1990a) : Fuzzy logic control systems: fuzzy logic controller - Part I, IEEE Transactions on Systems, Man and Cybernetics, Vol. SMC-20, No. 2, pp. 404-418 (contains an extensive Bibliography on fuzzy control). Lee C. C. (1990b) : Fuzzy logic control systems: fuzzy logic controller - Part II, IEEE Transactions on Systems, Man and Cybernetics, Vol. SMC-20, No. 2, pp. 419-435. Li Y. and Lau C. (1989) : Development of fuzzy algorithms for fuzzy systems, IEEE Control Systems Magazine, Vol. 9, No. 3, pp. 65-77. Mamdani E. H. and Gaines B. R. (Eds) (1981) : Fuzzy Reasoning and its Applications, Academic Press, NY. Mamdani E. H. (1974) : An application of fuzzy algorithms for the control of a dynamic plant, Proc. IEE, Vol. 121, No. 12. Mendel J. M. (1995) : Fuzzy logic systems for Engineering: a tutorial, Proceedings IEEE, Vol. 83, No. 3, Mar 1995, pp. 345-377 (contains an extensive list of references on Fuzzy Logic). Negoita C. V. (1981) : Fuzzy Systems, Gordon and Brown. Negoita C. V. (1985) : Expert systems and fuzzy systems, Benjamin Cummings Publishing Co. Mlynek D. M. and Patyra M. J. (Eds.) (1996) : Fuzzy logic implementation and application, J. Wiley, NY. Pedrycz W. (1989) : Fuzzy control and fuzzy systems, J. Wiley and Sons, NY. Ross T. J. (1995) : Fuzzy logic with engineering applications, McGraw Hill, NY. Rutherford D. and Bloore G. (1976) : The implementation of fuzzy algorithms for control, Proc. IEEE, Vol. 64, No. 4, pp. 572-573. Sugeno M. (1985) : Industrial Applications of Fuzzy Control, Elsevier Science Publishers, North Holland. Sugeno M. and Kang G. T. (1986) : Fuzzy modelling and control of multilayer incinerator, Fuzzy Sets and Systems, Vol. 18, pp. 329-346. Sugeno M. and Yasukawa T. (1993) : A fuzzy-logic-based approach to qualitative modeling, IEEE Trans. on Fuzzy Systems, Vol. 1, No. 1, pp. 7-31.
Bibliography
251
Takagi T. and Sugeno M. (1985) : Fuzzy identification of systems and its application to modeling and control, IEEE Trans on Systems, Man and Cybernetics, Vol. SMC-15, pp. 116-132. Terano T., Asai K. and Sugeno M. (1989) : Applied fuzzy systems, Academic Press, NY. Tzafestas S. G. and Venetsanopoulos A. N. (Eds) (1994) : Fuzzy Reasoning in Information, Decision and Control Systems, Kluwer Academic Publishers, Hingham, Ma. Wang P. P. and Tyan C-Y. (1994) : Fuzzy dynamic system and fuzzy linguistic controller classification, Automatica, Vol. 30, No. 11, pp. 1769-1774. Wang L-X. (1994) : Adaptive fuzzy systems and control - design and stability analysis, Prentice Hall, NY. Yen J., Langari R. and Zadeh L.A. (Eds.) (1995) : Industrial applications of fuzzy logic and intelligent systems, IEEE Press, NY. Zadeh L. A. (1965) : Fuzzy Sets, Information and Control, Vol. 8, pp. 3-11. (the definitive paper which laid out the foundations of Fuzzy Logic). Zadeh L. A. (1972) : A rationale for fuzzy control, Trans. ASME, Journal of Dynamic Systems and Control, Vol. G-94, pp. 3-4. Zadeh L. A. (1973) : Outline of a new approach to the analysis of complex systems and decision processes, IEEE Trans. n Systems, Man and Cybernetics, Vol. SMC-3, No. 1, pp. 28-44. Zadeh L. A. (1988) : Fuzzy logic, IEEE Computer, Vol. 21, No. 4, pp. 83-93. Zimmermann H. J. (1996) : Fuzzy set theory and its applications, Kluwer Academic Publishers, Hingham, Ma.
252
Chapter 18
Kosko B. (1992) : Neural Networks and Fuzzy Systems, a dynamic systems approach to machine intelligence, Prentice Hall, NY. Yager R. R. (1992) : Implementing fuzzy logic controllers using a neural network framework, Fuzzy Sets and Systems, Vol. 148, pp. 53-64.
Bibliography
253
Widrow B., Winter R. G. and Baxter R. A. (1988) : Layered neural nets for pattern recognition, IEEE Trans. Acoustics, Speech and Signal Processing, Vol. 36, No. 7, pp. 1109-1118.
254
Chapter 18
Tsitouras G. S. and King R. E. (1997) : Rule-based neural control of mechatronic systems, Proc. Int. Journal of Intelligent Mechatronics, Vol. 2, No. 1, pp. 1-11. Von Altrock C. (1995) : Fuzzy Logic and Neuro-fuzzy applications, Prentice Hall, NY. Wu Q. H., Hogg. B. W. and Irwin G. W. (1992) : A neural network regulator for turbo-generators, IEEE Trans. on Neural Networks, Vol. 3, No.1, pp. 95-100.
H. Evolutionary Algorithms
Angeline P. J., Saunders G. M. and Pollack J. B. (1994) : An evolution algorithm that constructs recurrent neural networks, IEEE Trans on Neural Networks, Vol. 5, No. 1, pp. 54-65. Back T., Hammel U. and Schwefel H-P. (1997) : Evolutionary Computation: Comments on the History and Current State, IEEE Trans on Evolutionary Computation, Vol. 1, No. 1, pp 3 - 17. (contains over 220 references on Evolutionary and Genetic Algorithms). Dasgupta D. and Michalewicz Z. (Eds) (1997) : Evolutionary Algorithms in Engineering, Springer Verlag, Berlin. Davis L. (1991) : Handbook of Genetic Algorithms, Van Nostrand, NY. DeGaris H. (1991) : GenNETS - Genetically programmed neural nets, Proc. IEEE Intl. Joint Conf. on Neural Networks, Singapore.
Bibliography
255
DeJong K. A. (1985) : Genetic Algorithms - a 10 year perspective, Proc. First Intl. Conf. on Genetic Algorithms, Hillsdale, NJ., pp. 169-177. Fogel D. B. (1994) : An Introduction to Simulated Evolutionary Optimization, IEEE Trans. Neural Networks, Vol. 5, No. 1, pp. 3-15. Fogel D. B. (1995) : Evolutionary Computation: Towards a new philosophy of machine intelligence, IEEE Press, NY. Fogel D. B. (Editor) (1997) : Handbook of Evolutionary Computation, IOP Publishing, Oxford. Fogel D. B. (Editor) (1998) : Evolutionary Computation- the Fossil Record, IEEE Press, NY. (Selected readings on the history of Evolutionary Algorithms). Goggos V. and King R. E. (1996) : Evolutionary Predictive Control, Computers and Chemical Engineering, Supplement B on Computer Aided Process Engineering, pp. S817-822. Goldberg D. E. (1985) : Genetic algorithms and rule learning in dynamic systems control, Proc 1st Int. Conf. on Genetic Algorithms and their Applications, Hillsdale, N.J., pp. 5-17. Goldberg D. E. (1989) : Genetic Algorithms in Search, Optimization and Machine Learning, Addison-Wesley, Reading, Mass. (seminal book on Genetic Algorithms). Greffenstette J. (1986) : Optimization of control parameters for Genetic Algorithms, IEEE Trans. on Systems, Man and Cybernetics, Vol. 16, No. 1, pp. 122-128. Holland J. H. (1975) : Adaptation in Natural and Artificial Systems, University of Michigan Press, Michigan. Holland J. H. (1990) : Genetic Algorithms, Scientific American, July, pp. 44-50. Karr C. L. and Gentry E. J. (1993) : Fuzzy Control of pH using Genetic Algorithms, IEEE Trans. on Fuzzy Systems, Vol. 1, No. 1, pp. 46-53. Kim J., Moon Y. and Zeigler P. (1995) : Designing Fuzzy Net Controllers using Genetic Algorithms, IEEE Control Systems, pp. 66-72. Lin F. T., Kao C. Y., Hsu C. J. C. (1993): Applying the genetic approach to Simulated Annealing in Solving some NP-Hard Problems, IEEE Trans. on Systems Man and Cybernetics, Vol. 23, No. 6, pp. 1752-1767.
256
Chapter 18
Man K. F., Tang T. S., Kwong S. and Halang W. A. (1997) : Genetic Algorithms for Control and Signal Processing, Springer-Verlag, Berlin. Maniezzo V. (1994) : Genetic evolution of the topology and weight distribution of neural networks, IEEE Trans. on Neural Networks, Vol. 5, No. 1, pp 39-53. Marti L (1992) : Genetically generated neural networks, Proc. IEEE Intl. Joint Conf. on Neural Networks, pp. IV-537-542. Michalewicz Z. (1992) : Genetic Algorithms + Data Structures = Evolution Programs, Springer-Verlag, Berlin. Michalewicz Z. and Schoenauer M. (1996) : Evolutionary Algorithms for constrained parameter optimization problems, Evolutionary Computation, Vol. 4, No. 1, pp. 1-32. Moed M. C. and Saridis G. N. (1990) : A Boltzmann machine for the organization of intelligent machines, IEEE Trans. on SMC, Vol. 20, No. 5, pp. 1094-1102. Mohlenbain H., Schomisch M. and Born J. (1991) : The parallel Genetic Algorithm as function optimizer, Parallel Computing, Vol. 17, pp. 619-632. Park D., Kandel A. and Langholz G. (1994) : Genetic-based new fuzzy r reasoning models with application to fuzzy control, IEEE Trans on Systems, Man and Cybernetics, Vol. 24, No. 1, pp. 39-47. Spears W. M, DeJong K. A., Bock T., Fogel D. B. and DeGaris H. (1993) : An Overview of Evolutionary Computation, Proc. European Conference on Machine Learning. Srinivas M. and Patnaik L. M. (1991) :Learning neural network weights using Genetic Algorithms - improving performance by searchspace reduction, Proc IEEE Int. Joint Conf. on Neural Networks, Singapore. Varsek A., Urbancic T. and Filipic B. (1993) : Genetic Algorithms in Controller Design and Tuning, IEEE Trans on Systems, Man and Cybernetics, Vol. 23, No. 5, pp. 1330-1339. Zhang J. (1993) : Learning diagnostic knowledge through neural networks and genetic algorithms, Studies in Informatics and Control, Vol. 2, No. 3, pp. 233-252.
Bibliography
257
Bishop R. H. (1997) : Modern Control Systems Analysis and Design using MATLAB and Simulink, Addison Wesley, Reading, Mass. Cavallo A. (1996) : Using MATLAB, Simulink and Control System Toolbox : A Practical Approach, Prentice Hall, NY. Dabney J. and Harman T. L. (1998) : Mastering Simulink 2 : Dynamic System Simulation for MATLAB, Prentice Hall, NY. Djaferis T. E. (1997) : Automatic Control : the power of feedback using MATLAB, PWS Publishers. Dorf R. C. (1997) : Modern Control Systems Analysis and Design using MATLAB and Simulink, Addison Wesley, Reading, Mass. Gulley N. and Roger Jang J.-S. (1995) : Fuzzy Logic Toolbox for use with MATLAB, Mathworks, Boston, Mass. Moscinski J. and Ogonowski Z. (1995) : Advanced Control with MATLAB and Simulink, Ellis Horwood, Hertfordshire.
Appendix A
principle is used in simple on-off controllers using relays in building thermostats, for which no knowledge of the room dynamics or characteristics is required! Unfortunately, this simple controller leads to undamped oscillations in the level of fluid in the tank with a peak-to-peak amplitude of 2 meters, as seen in Figures A.2(a) and A.2(b). Furthermore, the step response, i.e., the fluid level as a function of time in response to a sudden change in the desired level, is asymmetric. As seen, an increase in the desired fluid level leads to a different response than a decrease in the desired level. This is characteristic of fluid systems, since fluid pressure and level are nonlinearly related. There is no way of estimating the frequency of oscillation, since it is assumed that there is no knowledge of the dynamics of the controlled process. It is observed, however, that reducing the dead-zone reduces the amplitude and increases the frequency of the oscillation. This is observed in Figures A.2(a) and A.2(b). Reducing the dead-zone to zero will, in theory, lead to an infinite frequency of oscillation. In practice this is unlikely because of the inherent delays in the relay. In any case this continuous chattering is undesirable since it shortens relay life.
The system can be modeled using MATLAB/Simulink and the Fuzzy Toolbox simply by typing the instruction sltank, at which the flow diagram shown in Figure A.3 appears. In this file we are given the opportunity to compare the step responses of the two controllers under the same conditions. The fuzzy sets in this simple case have the Boolean form shown in Figure 5.1. As a consequence the output of the controller can only take one of three possible values, i.e., POL, ZER or NEL. The qualitative controller is equivalent to a threshold logic unit with dead-zone, as shown in Figure A.4. If the fluid level error is within the dead-zone then the controller gives no output and the process coasts freely. However, when the absolute error exceeds the dead-zone the controller gives a command of sgn(e).
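Before building a controller from scratch, the rule base and control surface of the Toolbox example can be inspected from the command line. The short sketch below assumes that the tank FIS supplied with the Fuzzy Toolbox demonstration is available on the MATLAB path.

% Inspect the fuzzy controller used by the sltank demonstration
fis = readfis('tank');        % load the FIS file supplied with the Toolbox
showrule(fis)                 % list the control rules in linguistic form
plotmf(fis, 'input', 1)       % membership functions of the first input
figure; gensurf(fis)          % control surface of the two-input controller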
It is desirable to eliminate these oscillations. It is also desirable to find some technique to compensate for the response asymmetry. We could, of course, follow the road of conventional control: identifying the process dynamics by applying step disturbances to it, determining a simple approximant and using the cookbook techniques of Ziegler and Nichols, or more sophisticated methods, in order to establish the parameters of the required three-term controller. This procedure is quite tedious and time-consuming and it is doubtful that all objectives can be satisfied simultaneously. It is certain, however, that a conventional three-term controller cannot compensate for the observed asymmetric response. If we could change the control strategy by adding some additional stabilizing rules and some advanced inference mechanism so that the control action changes smoothly, instead of abruptly as in the case of the simple controller, then we could achieve stabilization and a significant improvement in the closed-system response. The solution is to turn to unconventional control techniques and design a fuzzy controller with one of the available computer-aided design packages. This case study uses MATLAB Simulink and the Fuzzy Toolbox. In the following, we present the procedure that must be followed in the design of a simple generic fuzzy controller. Our objective is a robust fuzzy controller which can be described by simple linguistic rules but which will not exhibit the problems that were encountered with the simple linguistic controller described earlier. The next step in the design procedure, following the encoding of the control rules, is to decide on the number of fuzzy sets for every input and output variable of the controller. We must also specify the shape of the corresponding fuzzy sets, some of the available choices being shown in Figure 11.5. In this study we experiment with a number of shapes in an attempt to find acceptable overall performance. The FIS Editor in the Fuzzy Toolbox, shown in Figure A.5, is a simple and effective tool that makes designing fuzzy controllers simple. As was noted in earlier chapters of this book, it is not necessary to specify many fuzzy sets for the inputs to the controller in order to obtain acceptable accuracy. Most industrial applications use 3 to 5 fuzzy sets. Where fine control is essential, many more fuzzy sets are required. In the F. L. Smidth kiln controller, for instance, more than a dozen fuzzy sets are used. In the fuzzy controller proposed here, only
three fuzzy sets LOw, OK and HIgh are used for the input variables for simplicity.
R4: IF error_in_the_fluid_level is ZEro AND the rate_of_change_of_level is NEgative_Small THEN the control valve must be closed slowly, i.e., the control_valve_rate must be NEgative_Small

R5: IF error_in_the_fluid_level is ZEro AND the rate_of_change_of_level is POsitive_Small THEN the control valve must be opened slowly, i.e., the control_valve_rate must be POsitive_Small

These 5 control rules may be specified in many ways, the most convenient of which is linguistic, as shown in Figure A.6.
1. If (level is OK) then (valve is no_change)
2. If (level is low) then (valve is open_fast)
3. If (level is high) then (valve is close_fast)
4. If (level is OK) and (rate_of_change_of_level is positive) then (valve is close_slow)
5. If (level is OK) and (rate_of_change_of_level is negative) then (valve is open_slow)

Figure A.6 The control rules in linguistic form
For the output variable, five fuzzy sets are used: NEgative_Large, NEgative_Small, ZEro, POsitive_Small and POsitive_Large, for finer control. The universe of discourse in this case is normalized to [-1, 1], where +1 indicates that the valve is entirely open (i.e., 100% open) and -1 that it is closed (i.e., 0% open). The median value 0 implies that the valve is at the center of its range.
2. Case B - Figure A.8: Inputs with 3 triangular fuzzy sets and outputs with 5 symmetric triangular fuzzy sets with small support and no overlap.
3. Case C - Figure A.9: Inputs with 3 Gaussian fuzzy sets and outputs with 5 symmetric triangular fuzzy sets with small support and no overlap.
4. Case D - Figure A.10: Inputs with 3 Gaussian fuzzy sets and outputs with 5 asymmetric triangular fuzzy sets with small support and no overlap.
5. Case E - Figure A.11: Inputs with 3 Gaussian fuzzy sets and outputs with 5 asymmetric triangular fuzzy sets with small support and some overlap.

It is noted that triangular fuzzy sets with small support, i.e., with support tending to zero, approximate the singletons (see Chapter 5) that have been used in industrial fuzzy controllers. The advantage of using singletons is the simplicity and speed of the computations for de-fuzzification. A number of vendors use singletons in their products, e.g., S5-Profuzzy and S7-Fuzzy_Control by Siemens and FuzzyTech by Allen Bradley. For every one of the five cases we present the control surface and the corresponding step response. Figure A.12 shows the computer screen from the Fuzzy Toolbox, which shows which rules are fired for given controller input values and the corresponding controller output for Case E.
In Figure A.10(c), for instance, the fuzzy set Positive_Small has been shifted to the left, with the consequence that the control action for small positive errors in the liquid level is greater than that for small negative errors. Thus, for small discrepancies in the liquid level about the nominal level, the rate with which the valve is changed is increased when the tank is filling and decreased when it empties. This leads to compensation of the response asymmetry.
A.9 Conclusions
It should have become clear that altering the shape of the fuzzy sets of the controller does not lead to major changes in the control surface. This is evident in Figures A.7(c) to A.11(c). In cases A and B, the control surfaces shown in Figures A.7(c) and A.8(c) are almost flat in the dead-zone due to the triangular nature of the fuzzy sets of both the inputs and the output. The corresponding step responses, shown in Figures A.11(a) and A.11(b), are unsatisfactory because their steady-state error is non-zero and the response asymmetry is severe. It is noted, also, that in these two cases both the triangular fuzzy sets with a large support set and the singletons give comparable step responses.

In contrast, in cases C, D and E the control surfaces are smooth because of the smooth (Gaussian) shape of the fuzzy sets of the controller inputs, and the corresponding step responses are seen to be superior. In cases D and E (see Figures A.11(d) and A.11(e)) the step responses are almost symmetric. Finally, in the last three cases the steady-state errors are essentially zero and the overshoot is negligible. Case E appears to be the best, as it demonstrates a symmetric response, zero steady-state error and no overshoot.

As was seen in all five cases, the closed-system step response is influenced significantly by the shape of the fuzzy sets of the inputs and, to a lesser extent, by the fuzzy sets of the controller output. In general, Gaussian fuzzy sets yield smoother control surfaces, implying smoother control actions and improved responses. The best shapes for the fuzzy sets are not known in general and remain the subject of ongoing research.
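The comparison summarized above is easily repeated for any candidate design by plotting its control surface and exercising it in the Simulink loop. The fragment below is a minimal sketch of the surface comparison, assuming the five designs have been saved under the hypothetical file names caseA.fis to caseE.fis.

% Compare the control surfaces of the five candidate controllers
names = {'caseA', 'caseB', 'caseC', 'caseD', 'caseE'};   % hypothetical file names
for k = 1:length(names)
    fis_k = readfis(names{k});          % load the k-th design
    figure(k);
    gensurf(fis_k);                     % valve rate as a function of (level error, level rate)
    title(['Control surface for ' names{k}]);
end

The closed-loop step responses are then obtained by substituting each design into the Simulink model of the level loop.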
[Figures A.7 to A.12: Case A to Case E fuzzy sets, control surfaces and step responses, and the Fuzzy Toolbox rule-firing screen for Case E]
Appendix B: Simple Genetic Algorithm
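The listing below begins inside the main program, after the initial population has been created, encoded and scored. The fragment that follows is a minimal sketch of the kind of set-up that precedes it, using the parameter and function names that appear in the listing; the numerical values and the choice of a two-parameter search space are illustrative assumptions.

popsize = 30;                  % population size (illustrative)
maxgen  = 50;                  % number of generations (illustrative)
pcross  = 0.8;                 % crossover probability (illustrative)
pm      = 0.01;                % mutation probability (illustrative)
vlb     = [-1 -1];             % lower bounds of the two search parameters
vub     = [ 1  1];             % upper bounds of the two search parameters
bits    = [12 12];             % bits per parameter in the binary genotype

phen = init(vlb, vub, popsize, length(vlb));   % random initial population (phenotypes)
gen  = encode(phen, vlb, vub, bits);           % corresponding binary genotypes
[fitness, object] = score(phen, popsize);      % fitness and objective function values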
[best_obj(1), index] = min(object);        % Store the best candidate of the initial population
best_gen  = gen(index,:);
best_phen = phen(index,:);
[worst_obj(1), index1] = max(object);      % Store the worst candidate of the initial population
worst_cur_gen  = gen(index1,:);
worst_cur_phen = phen(index1,:);
avg_obj(1) = 0;                            % Calculate the average performance of the population
for k = 1:popsize
    avg_obj(1) = avg_obj(1) + object(k);
end;
avg_obj(1) = avg_obj(1)/popsize;
best_x(1) = best_phen(1);
best_y(1) = best_phen(2);
for i1 = 1:2
    fprintf(1,'%f ', best_phen(i1));
end;
fprintf('\n');
fprintf(1,'BEST : %f WORST : %f AVG : %f \n', best_obj(1), worst_obj(1), avg_obj(1));

for i = 1:maxgen                           % Start of the Genetic Loop
    gen = reproduc(gen, fitness);          % Reproduction (fitness-proportional selection)
    gen = mate(gen);                       % Mate two members of the population
    gen = xover(gen, pcross);              % Crossover operation
    gen = mutate(gen, pm);                 % Mutation operation
    [phen, coa] = decode(gen, vlb, vub, bits);    % Decode the genotype of the new population to phenotype
    [fitness, object] = score(phen, popsize);     % Evaluation of the fitness and objective functions
    [best_cur_obj, index] = min(object);   % Store the best candidate of the current population
    best_cur_gen  = gen(index,:);
    best_cur_phen = phen(index,:);
    [worst_obj(i+1), index1] = max(object);       % Store the worst candidate of the current population
    worst_cur_gen  = gen(index1,:);
    worst_cur_phen = phen(index1,:);
    avg_obj(i+1) = 0;                      % Average performance of the current population
    for k = 1:popsize
        avg_obj(i+1) = avg_obj(i+1) + object(k);
    end;
    avg_obj(i+1) = avg_obj(i+1)/popsize;
    if (best_cur_obj > best_obj(i))        % Apply elitist strategy
        phen(index1,:) = best_phen;
        gen(index1,:)  = best_gen;
        object(index1) = best_obj(i);
        best_obj(i+1)  = best_obj(i);
    elseif (best_cur_obj <= best_obj(i))
        best_phen     = best_cur_phen;
        best_gen      = best_cur_gen;
        best_obj(i+1) = best_cur_obj;
    end;
    best_x(i+1) = best_phen(1);            % Display evolution of the best solution on the contour graph
    best_y(i+1) = best_phen(2);
    hold;
    line(best_x, best_y);
    for i1 = 1:2
        fprintf(1,'%f ', best_phen(i1));
    end;
    fprintf(1,'---> %f\n', best_obj(i+1));
    fprintf('\n');
    fprintf(1,'BEST : %f WORST : %f AVG : %f \n', best_obj(i+1), worst_obj(i+1), avg_obj(i+1));
end

xx = 1:maxgen+1;                           % Display evolution of the objective function for the worst, average and best solutions
figure(2);
plot(xx, best_obj, xx, worst_obj, xx, avg_obj);
grid;

% File init.m - This function creates a random population
function phen = init(vlb, vub, siz, sea)
for i = 1:siz
    phen(i,:) = (vub-vlb).*rand(1, sea) + vlb;
end

% File score.m - This function computes the fitness and the objective function values of a population
function [fitness, object] = score(phen, popsize)
for i = 1:popsize
end
The following m-files, which are called by the main program, can be downloaded directly from the MathWorks web site (www.mathworks.com) and bear the indication:

% Copyright (c) 1993 by the MathWorks, Inc.
% Andrew Potvin 1-10-93

% File encode.m - This function converts a variable from real to binary
function [gen, lchrom, coarse, nround] = encode(x, vlb, vub, bits)
lchrom = sum(bits);
coarse = (vub-vlb)./((2.^bits)-1);
[x_row, x_col] = size(x);
gen = [];
if ~isempty(x),
    temp = (x - ones(x_row,1)*vlb)./ ...
           (ones(x_row,1)*coarse);
    b10 = round(temp);
    nround = find(b10-temp > 1e-4);
    gen = b10to2(b10, bits);
end

% File reproduc.m - This function selects individuals in accordance with their fitness
function [new_gen, selected] = reproduc(old_gen, fitness)
norm_fit = fitness/sum(fitness);
selected = rand(size(fitness));
sum_fit = 0;
for i = 1:length(fitness),
    sum_fit = sum_fit + norm_fit(i);
    index = find(selected < sum_fit);
    selected(index) = i*ones(size(index));
end
new_gen = old_gen(selected,:);

% File mate.m - This function mates two members of the population
function [new_gen, mating] = mate(old_gen)
[junk, mating] = sort(rand(size(old_gen,1),1));
new_gen = old_gen(mating,:);
% File xover.m - This function performs the Crossover operation
function [new_gen, sites] = xover(old_gen, Pc)
lchrom = size(old_gen,2);
sites = ceil(rand(size(old_gen,1)/2,1)*(lchrom-1));
sites = sites.*(rand(size(sites))<Pc);
for i = 1:length(sites);
    new_gen([2*i-1 2*i],:) = [old_gen([2*i-1 2*i],1:sites(i)) ...
                              old_gen([2*i 2*i-1],sites(i)+1:lchrom)];
end

% File mutate.m - This function performs the Mutation operation
function [new_gen, mutated] = mutate(old_gen, Pm)
mutated = find(rand(size(old_gen))<Pm);
new_gen = old_gen;
new_gen(mutated) = 1-old_gen(mutated);
% File decode.m - This function converts a variable from binary to real
function [x, coarse] = decode(gen, vlb, vub, bits)
bit_count = 0;
two_pow = 2.^(0:max(bits))';
for i = 1:length(bits),
    pow_mat((1:bits(i))+bit_count, i) = two_pow(bits(i):-1:1);
    bit_count = bit_count + bits(i);
end
gen_row = size(gen,1);
coarse = (vub-vlb)./((2.^bits)-1);
inc = ones(gen_row,1)*coarse;
x = ones(gen_row,1)*vlb + (gen*pow_mat).*inc;

% File b10to2.m - This function converts a variable from base 10 to base 2
function b2 = b10to2(b10, bits)
bit_count = 0;
b2_index = [];
bits_index = 1:length(bits);
for i = bits_index,
    bit_count = bit_count + bits(i);
    b2_index = [b2_index bit_count];
end
for i = 1:max(bits),
    r = rem(b10,2);
    b2(:,b2_index) = r;
    b10 = fix(b10/2);
    tbe = find( all(b10==0) | (bits(bits_index)==i) );
    if ~isempty(tbe),
        b10(:,tbe) = [];
        b2_index(tbe) = [];
        bits_index(tbe) = [];
    end
    if isempty(bits_index),
        return
    end
    b2_index = b2_index - 1;
end
Appendix C: Simulated Annealing Algorithm
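The listing below starts inside the main iteration loop of the program. The fragment that follows is a minimal sketch of the kind of initialization the loop presupposes, using the variable names that appear in the listing (Tcur is the current simulated temperature and l the cooling factor); all numerical values are illustrative assumptions and the plotting set-up of the original program is omitted.

Tcur    = 1.0;                 % initial simulated temperature (illustrative)
l       = 0.99;                % cooling factor applied at every iteration (illustrative)
maxiter = 1000;                % number of iterations (illustrative)

x1 = -1 + 2*rand;              % initial solution in the search space [-1,1] x [-1,1]
x2 = -1 + 2*rand;
z(1)  = x1^2 + x2^2 - 0.3*cos(3*pi*x1) - 0.4*cos(4*pi*x2) + 0.7;   % initial objective value
r1(1) = x1;                    % history of the accepted x parameter
r2(1) = x2;                    % history of the accepted y parameter

i = 1;
while i <= maxiter             % the body of the loop is the listing that follows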
x_1 = -1 + 2*rand;                         % Select new solution (x_1, x_2)
x_2 = -1 + 2*rand;
                                           % Evaluate new objective function value
z_1 = x_1^2 + x_2^2 - 0.3*cos(3*pi*x_1) - 0.4*cos(4*pi*x_2) + 0.7;
g = exp(-((z_1 - z(i))/Tcur));             % Acceptance probability
if ((z_1 < z(i)) | (rand < g))
    x1 = x_1;
    x2 = x_2;
    z(i+1) = z_1;
else
    z(i+1) = z(i);
end
Tcur = Tcur*l;                             % New simulated temperature
r1(i+1) = x1;
r2(i+1) = x2;
title('Search for the global optimum point');
xlabel('x axis');
ylabel('y axis');
i = i + 1;
end                                        % End of loop

f = find(z == min(z));
fprintf(1,'The minimum value of the objective function observed so far is : %f in the %d iteration\n', min(z), f(1));
fprintf(1,'x=%f, y=%f\n', r1(f(1)), r2(f(1)));
hold;
line(r1, r2);
title('Movement of x-y parameters in the search space');
xlabel('x parameter');
ylabel('y parameter');
figure(2);
plot(z);
title('Objective function values versus iterations');
xlabel('Iterations');
ylabel('Objective Function');
figure(3);
plot(r1);
title('Movement of the x parameter');
ylabel('x parameter');
xlabel('Iterations');
figure(4);
plot(r2);
title('Movement of the y parameter');
ylabel('y parameter');
xlabel('Iterations');
Appendix D
% ... both layers
plottr(TR);    % plot results
W10            % print synaptic weights of first layer
W20            % print synaptic weights of second layer
B10            % print bias of first layer
B20            % print bias of second layer
Index
Adaptive linear networks (ADALINEs), 156, 162, 170 Artificial intelligence, 13 Artificial neural networks, 1, 51, 153 autoassociative, 159 feedback, 159 feed-forward networks, 159 generalized function mapping, 154 Hopfield recurrent networks, 158 multi-layer network topologies, 158 Artificial neurons, 153, 156 dynamic, 158 static, 156 Cartesian product, 72 Classical control, 2 Compositional rules of inference, 81 Computational intelligence, 6, 9, 13, 23, 31, 41, 153
Computer integrated manufacturing (CIM), 23, 36 Control: protocols, 119 rules, 119 Conventional control, 39 Deep knowledge, 18, 32 De-fuzzification, 98 center of area (COA), 98 center of gravity (COG), 98 Elemental artificial neuron, 156 Embedded fuzzy controllers, 123 Evolutionary: algorithms, 20 computation, 203 control, 8 controller suitability, 237 decoding, 212 design of conventional controllers, 235 design of intelligent controllers, 221
[Evolutionary] operations, 205 crossover, 206, 214 mutation, 206, 214 recombination (see crossover) selection, 206, 215 simulated evolution, 205 optimization, 205, 208 programming, 211 strategies, 211 Expert systems, 13, 49 classification, 32 development, 18 diagnosis of malfunctions, 28 elements, 15 energy management, 26 fault diagnosis, 20, 24 fault prediction, 24 implementation, 19 industrial controller design, 24 LISP machines, 19 need, 17 operator training, 22 paradigms, 20 plant simulation, 22 prediction of emergency plant conditions, 26 predictive maintenance, 25 product design, 21 production scheduling, 27 representation of knowledge, 20 shells, 18, 20 supervisory control systems, 23 tools and methods, 18 Flexible manufacturing systems (FMS), 21, 28 Fuzzification, 91, 96
degree of fulfillment, 91, 96 graphical interpretation of, 93 Fuzzy: algorithm, 59 associative memory (FAM), 103, 115 conditional statements, 58 control, 54, 89 algorithm, 89 controllers, 105 coarse-fine, 117 decomposition, 90 embedded, 123 gain-scheduled, 40, 136 generalized three-term, 108 generic two-term, 113 hybrid architectures, 112 integrity, 101 optimization using genetic algorithms, 221 partitioned architecture, 109 real-time, 119 robustness, 107 Takagi-Sugeno, 136, 144 three-term, 107 fitness criteria, 236 gain-scheduling, 136, 146 implications, 78 Boolean, 78 GMP, 80 Larsen, 80 Lukasiewicz, 78 Mamdani, 79 Zadeh, 79 inference engine, 71 degree of fit, 71 linguistic variables, 64 logic, 1, 7, 8 algorithm, 59, 74 basic concepts, 54 logic control (FLC) (see also Fuzzy controllers), 2
operators, 60 conjunctive, 72 propositional implication, 71 reasoning, 71, 76 relational matrix, 73 sets, 55, 100 algebraic properties of, 64 choice of, 268 coarseness of, 100 completeness of, 101 linguistic descriptors of, 57 membership function of, 55 operations on, 63 complement, 63, 64 connectives, 69 DeMorgan's theorem, 64 intersection, 63, 64 product, 63 union, 63, 64 shape of, 100 support set of, 56 singletons, 67, 92 systems, 32 variable, 57 universe of discourse of, 55 Gain-scheduled controllers, 40 Generalized: Modus Ponens (GMP), 77 Modus Tollens (GMT), 77 Genetic algorithms (GAs), 8, 205, 211 fitness functions, 203, 206, 213 initialization, 212 parameters, 217 Hard control, 42 Human: intelligence, 6 operators, 59
Industrial: control, 23 controller optimization, 232 Inference engine, 6, 15, 17 effectiveness, 37 quality, 37 Intelligent: agents, 124 control, 6, 11, 31, 41, 43 autonomy, 45 basic elements, 34 conditions for use, 33 objectives, 34 techniques, 39 controllers, 7, 35 correctness, 10 extendibility, 10 precision, 10 reusability, 10 robustness, 10, 40 systems, 10 acceptance, 37 architecture, 46 design tools and methods, 18 distributed architecture, 47 efficiency, 37 hierarchical structure, 46 Knowledge: base, 76 based systems, 13, 48 embedded, 193 empirical, 33 engineering, 37 and experience, 31 heuristic, 135 Learning machines, 154 Linguistic: controller, 261 descriptors, 57
[Linguistic] rule matrix, 186 rules, 8, 14, 16, 20, 32, 59 values, 57 variables, 57 Mamdani, 1, 79, 84, 111 Membership function, 55 generic S, 66 generic π, 67 Model-based fuzzy control (see also Takagi-Sugeno controllers), 135, 136 Modern control, 3 Multi-level relay controller, 111 Neural: control, 51, 153, 160 learning and adaptation, 161 parallel processing, 161 rule-based, 181 controllers, 160 architectures, 162 indirect learning, 166 inverse model, 164 specialized training, 165 design using genetic algorithms, 222 fidelity, 163 indirect training of, 166 inverse model of, 164 multi-variable, 161 properties of, 161 rule-based, 181 network training algorithms, 169 back-propagation (BP), 169, 176 flow chart, 180 Delta, 173 least mean squares (LMS), 171
multi-layer, 175 supervised learning, 169 unsupervised learning, 169 Widrow-Hoff, 170 Neuro-fuzzy control, 8, 51, 193 architectures, 194 isomorphism, 195 fuzzification of neural controllers, 195 neuralization of fuzzy controllers, 195 Numerical Fuzzy Associative Memory (NFAM), 187 Perceptron, 154 Procedural knowledge, 16 Real-time: expert systems, 26 execution scheduler, 124 fuzzy control, 119 Relational algorithm, 7 Representation of knowledge, 20 Response asymmetry compensation, 269 Rule: composition, 82 conflict, 102 encoding, 182 granularity, 116 Rule-based: neural control, 181 network training, 183 Saridis' principle, 10, 46 Shallow knowledge, 18, 24, 32 Simulated annealing, 8, 225 Metropolis algorithm, 226 optimization: constrained, 229 industrial controller, 232
Soft: computing, 7, 41, 42, 154, 203 control, 42 Supervisory fuzzy controllers, 120 de-fuzzifier, 120 fuzzifier, 120 inference engine, 120 knowledge-base, 120 real-time data base, 120 Symbolic representation, 9 Synaptic weights, 156 Takagi-Sugeno controllers (see also Model-based controllers), 136, 144 first approach, 136 fuzzy control law, 141 fuzzy process models, 139 fuzzy variables and fuzzy spaces, 137 locally linearized process model, 142 second approach, 144 stability conditions, 144
Uncertainty and vagueness, 7, 53 Unconventional control, 6, 40 Universe of discourse, 55 Waste-water treatment control, 126 Widrow-Hoff training algorithm, 170, 172, 173 Zadeh, 1, 50, 53, 119