Computer-Aided Control System Design Using Optimization Methods

A.C.W. Grace

Doctor of Philosophy
Bangor University, 1989
Acknowledgments
A research project of this nature demands the help and support of many people. In
particular, I would like to thank my academic supervisor, Prof. P.J.Fleming, for his guidance
throughout this project. Not only for his help in the initial stages of the project, with
numerous lectures and discussions in both control theory and optimization, but also for his
enthusiasm and motivation in the later stages of the project. I am also indebted to him for his
assistance with the preparation of this thesis.
I would also like to express my thanks to Mr. P.Smith and Mr. S.Winter of Royal
Aerospace Establishment (RAE), Bedford who provided me with insights into non-linear
simulation and the design of flight control systems.
I gratefully acknowledge the financial support of the Science and Engineering Research
Council (SERC) and to RAE as part of a Cooperative Award in Science and Engineering
(CASE). I would also like to thank SERC for the funding of numerous Vacation Schools and
workshops which I found most useful.
Contents
OVERVIEW
1. CACSD
1.1 INTRODUCTION 1-1
1.2 INTEGRATED DESIGN ENVIRONMENTS 1-1
1.3 MATLAB 1-3
1.3.1 Possible Improvements to MATLAB 1-4
1.3.2 Data Structures 1-5
1.3.3 Databases 1-7
1.3.4 Help and Error Diagnostics 1-8
1.3.5 Compilation 1-8
1.3.6 Linking to Other Numerical Libraries 1-9
1.3.7 Modern Computing Approaches 1-11
1.4 CACSD ENVIRONMENT FOR OPTIMIZATION 1-12
1.4.1 Integrating Optimization Software 1-12
1.4.2 The Design Environment 1-12
1.4.3 ADS Optimization Package 1-14
1.4.4 Present Software Status 1-15
1.5 REVIEW 1-15
1.6 REFERENCES 1-16
2. OPTIMIZATION
2.1 INTRODUCTION 2-1
2.1.1 Parametric Optimization 2-1
2.2 UNCONSTRAINED OPTIMIZATION 2-2
2.2.1 Quasi-Newton Methods 2-3
2.2.2 Line Search 2-5
2.3 QUASI-NEWTON IMPLEMENTATION 2-7
2.3.1 Hessian Update 2-7
2.3.2 Line Search Procedures 2-8
2.3.3 Comparison of Methods 2-12
2.4 LEAST SQUARES OPTIMIZATION 2-14
2.4.1 Gauss-Newton Method 2-15
2.4.2 Levenberg-Marquardt Method 2-16
2.5 LEAST SQUARES IMPLEMENTATION 2-17
2.5.1 Gauss-Newton Implementation 2-17
2.5.2 Levenberg-Marquardt Implementation 2-17
2.6 CONSTRAINED OPTIMIZATION 2-19
2.6.1 Sequential Quadratic Programming (SQP) 2-20
2.7 SQP IMPLEMENTATION 2-22
2.7.1 Updating The Hessian Matrix 2-22
2.7.2 Quadratic Programming Solution 2-23
2.7.3 Line Search and Merit Function 2-26
2.7.4 Constrained Example 2-27
2.8 MULTI-OBJECTIVE OPTIMIZATION 2-29
2.8.1 Introduction to Multi-Objective Optimization 2-29
2.8.2 Goal Attainment Method 2-33
2.8.3 Algorithm Improvements 2-34
2.9 REVIEW 2-35
2.10 REFERENCES 2-36
3. CONTROL SYSTEM DESIGN
3.1 CONTROL IN PERSPECTIVE 3-1
3.1 INTRODUCTION 3-2
3.2 INTEGRAL QUADRATIC MEASURES OF CONTROL 3-3
3.3 CONTROLLER STRUCTURES 3-4
3.3.1 Full State Feedback 3-4
3.3.2 Output Feedback 3-5
3.3.3 Dynamic Output Feedback 3-5
3.3.4 General LQR Problem Solution 3-7
3.4 DISTURBANCE REJECTION 3-8
3.4.1 Impulse Disturbances 3-9
3.4.2 Stochastic Problem (LQG) 3-10
3.4.3 Disturbance Modelling 3-11
3.4.4 Choosing Initial Conditions 3-11
3.4.5 Canonical Form 3-13
3.4.6 Evolutionary Controller Mapping 3-13
3.5 ADDITIONAL DESIGN OPTIONS 3-15
3.5.1 Control Derivative Measures 3-15
3.5.2 Sensitivity Measures 3-17
3.6 SERVOMECHANISMS 3-18
3.6.1 Two-Degree-of-Freedom (2DF) Control Structure 3-19
3.6.2 Design Cycle 3-19
3.6.3 Servo Derivative Measures 3-24
3.7 OBSERVER DESIGN 3-25
3.8 MULTI-OBJECTIVE CONTROL SYSTEM DESIGN 3-26
3.9 DESIGN BY EVOLUTION 3-27
3.10 DESIGN EXAMPLES 3-28
3.10.1 Simple Tracking Example 3-28
3.10.2 Generic VSTOL (GVAM) Tracking Example 3-33
3.10.3 F4C Multi-Model Example 3-40
3.11 REVIEW 3-44
3.12 REFERENCES 3-45
CONCLUSIONS
"CACSD using Optimization Methods", PhD Thesis, A.C.W.Grace, UCNW, Bangor, UK, 1989
PART 1: CACSD
Summary - Computer-Aided Control System Design (CACSD) plays a critical role in the implementation and application of control theory. Trends and future directions within this field are discussed, looking in particular at integrated design environments and possible further evolutionary changes to the MATLAB package. An integrated design environment for optimization applications of Control System Design, built from a FORTRAN version of MATLAB and the ADS optimization package, is presented.
1.1 INTRODUCTION
THE INCREASING speed of modern computers and the evolving trend towards powerful computing systems, such as networked workstations, have resulted in a shift of attention within
engineering software away from algorithm efficiency towards aspects of functionality and user-
friendliness. Issues such as inter-package communication, data structures and user-interfaces are now
important considerations for any CACSD environment.
Control System Design is by its nature a complex procedure which has prompted a plethora of
control design methods and associated software packages. Many of these methods have been developed by
academia where the resulting user-interfaces, reliability, maintenance and applicability of the software
have been key issues in the poor integration of these packages within the control community and into
industry. This has been one of the factors in the continued use of often heuristic methods for Control
System Design within industry and the development of mathematically tractable but sometimes
impractical methods of Control System Design within academia.
This part does not intend to give a full review of existing software and the reader is referred to
Fleming and De Oliveira [1] and [2-6] for useful introductions to the subject. References [14-39] give
specific details with respect to a large number of existing packages.
Over the last three decades a great deal of progress has been made in the development, refinement
and optimization of engineering software in the form of numerical subroutines, libraries and packages.
Much of this software has been written in FORTRAN which still remains the fastest compiled
language for numerical applications on most machines. Attempts have been made to harness these
powerful numerical routines through the use of integrated environments which provide, among other
things, a user-friendly interface to the numerical routines, database handling facilities, graphics
facilities and high-level command language capabilities.
The MATLAB package [18-20] is an example of such an integrated environment, in which the numerical libraries LINPACK [36] and EISPACK [37] for linear algebra have been integrated into a user-friendly environment. Although such environments are a significant improvement over numerical libraries, they are limited by the number of facilities that can reasonably be programmed into one
package. Attempts are now being made to develop environments which offer further integration by the
inter-linking of other packages as well as numerical subroutines. Such environments provide a common
software base providing tools and libraries of numerical algorithms and the ability to link to existing
packages. This avoids excessive duplication of software and helps to provide a common base to facilitate
the transfer of design methods into industry. The advantage to the control engineer is that the
environment gives him the ability to work at a high-level of abstraction so that he can concentrate on
important aspects of the design problem.
1.2 INTEGRATED DESIGN ENVIRONMENTS

An integrated design environment has been developed as part of the SERC's Special Initiative in the
field of Computing and Design Techniques for Control Engineering (CDTCE). This is called
ECSTASY [15] (Environment for Control System Theory, Analysis and Synthesis). The aim of the
"CACSD using Optimizati on Methods" PhD Thesis A.C.W.Grace Unii. of Wales, Bangor. 1989 1-1
PART 1: CACSD INTEGRATED DESIGN ENVIRONMENTS
project is to provide an embryo infrastructure which consists of six major design tools as shown in Fig. 1.1. Further facilities are provided by packages which are linked to the infrastructure and database
through inter-process communication mechanisms. For a fuller description of the ECSTASY
environment, see [16].
ECSTASY aims to provide a flexible environment capable of covering features such as system
identification, simulation and Control System Design. It is one step towards a totally integrated design
environment which would allow, for instance, control implementation details to be taken into
consideration at the system design level. MATRIX-X [24], CTRL-C [21] and DELIGHT [28] are other packages which aim to provide an integrated design environment, incorporating a wider range of functions by linking to other packages. It is hoped that ECSTASY will provide better flexibility in
terms of data communication enabling a wider range of facilities to be linked into the environment.
Whilst the aims and objectives of the ECSTASY environment are recognized and fully supported, it may nevertheless have several shortcomings. The cost of ECSTASY is likely to be high due to the
necessity of having to buy supporting packages and because the infrastructure itself contains a number of
expensive packages (PA Set Tools [17], PRO-MATLAB [20]). The environment will also have large
"CACSD using Optimization Methods" PhD Thesis A.C.W.Grace Univ. of Wales, Bangor. 1989 1-2
PART 1: CACSD MATLAB
memory requirements which may result in a degradation of computational speed and limit its use to
mainframes and workstations. The database aspects of the package will also tend to slow the package
down due to type checking, parsing, handshaking and general housekeeping.
Depending on the operating system being used there are many different inter-process communication
mechanisms which can be used. Data requirements and other issues will dictate the type of inter-process
communication mechanism to be used. For instance, for optimization applications a fast data link with
high precision is beneficial due to the iterative nature of the method. In such cases it may be more
appropriate to directly link the application software to the source package. It is therefore important for
integrated environments such as ECSTASY to provide a choice of inter-package communication methods
so that a particular method can be chosen depending on the application being considered. The actual
mechanism could then be chosen to create a balance between precision, speed and ease of use.
Many new features, such as the incorporation of new data structures and modern computing
approaches cannot be wholly achieved through inter-communication between packages. Further
advancements in CACSD will also involve the evolution of design packages such as MATLAB.
1.3 MATLAB
MATLAB is increasingly becoming the de facto standard for linear control system design. Released in 1980 by Cleve Moler and originally intended for linear algebra, it was quickly seized upon by the control community for control design, and a number of packages appeared as derivatives of MATLAB, such as CTRL-C, MATRIX-X and IMPACT [22].
The original version of MATLAB [18] is public-domain software written in FORTRAN.
Subsequently, it has been rewritten in C, offering improved graphics facilities, faster computation and
improved programming facilities including user-defined functions. It has been marketed commercially as
PC-MATLAB [19] and PRO-MATLAB, the former targeted for implementation on IBM-PCs and the
latter for mainframes and workstations, although both offer almost identical facilities.
PRO-MATLAB follows an "open system" philosophy by which it provides the user with a set of
low-level functions and the means to construct higher-level functions by enabling the user to create M-
files (distinguished by having the extension ".m"), which subsequently are added to the set of available
commands and user-functions.
M-files have led to the provision of Toolboxes, which are marketed with PRO-MATLAB. These
consist of M-files, with specific objectives. CACSD-oriented Toolboxes include a Control System
Toolbox, an Identification Toolbox, a Signal Processing Toolbox, a Robust Control Toolbox, a State
Space Identification Toolbox and a Multivariable Frequency Domain (MFD) Toolbox.
The success and wide-spread use of MATLAB can be attributed to a number of features which make
the package easy-to-use and highly functional. They are as follows:
• MATLAB is interpretive so that there is no wasted time in compilation. This allows for
better interaction on the part of the user. PRO-MATLAB is especially applicable to a window-
based environment where one window can be used for editing a user-defined function while
another can be used for testing it. Although interpretive languages are slower than compiled
languages, MATLAB is not wholly interpretive, as user-defined functions are semi-compiled
on their first call, offering some speed-up on subsequent calls.
• Data management, storage and housekeeping is external to the user, allowing the user to
concentrate at a higher level of abstraction than most so-called high-level languages such as C
and FORTRAN.
"CACSD using Optimization Methods" PhD Thesis A.C.W.Grace Univ. of Wales, Bangor. 1989 1-3
PART 1: CACSD MATLAB
• MATLAB has at its core a large library of numerically reliable and optimized software based
on the LINPACK and EISPACK subroutine libraries.
• A major feature of MATLAB is that it is extendable allowing the user to write routines
which automatically become part of the command set. This is especially important in the realm
of Control System Design where individual methods and applications are prevalent.
• The simple syntax and the use of default commands reduces unnecessary coding.
One criticism of MATLAB is that it is not a typed language. That is, variables do not have to be given a type and are all assumed to be subsets of the complex double-precision matrix. It has, however, been indicated [10] that typing may be a severe restriction on flexibility. Strong typing may not be appropriate to a language such as MATLAB, where the high-level commands and data structure restrict program size and the number of variables. Variable confusion is further reduced if the programmer uses sensible naming conventions. Further, since MATLAB is interpretive, typing problems are easily resolved, unlike compiled languages where complex debugging procedures may be necessary.
1.3.1 Possible Improvements to MATLAB

Although MATLAB has proved a highly successful and popular package, there are areas in which improvements would enhance the package and help to maintain its position as a state-of-the-art software package. The next few sub-sections address how such improvements could be achieved within MATLAB.
A main criticism of MATLAB is its restrictive data structures which do not allow any data types
other than subsets of the complex matrix and character strings. Lack of database management has also
been cited as a weakness of MATLAB where the commands load and save are deemed inadequate for
handling large amounts of data.
Other possibilities for improvement are an increase in run-time efficiency by further compilation of
the MATLAB M-files. Error diagnostics and help facilities are also areas in which improvements
would help the user.
Inter-process communication mechanisms within the MATLAB infrastructure have the potential to
make MATLAB a highly functional integrated design environment which can link to a whole range of
facilities in a flexible way.
Modern computing approaches such as graphical input, symbolic processing, Intelligent Knowledge Based Systems (IKBS), and object-oriented, functional and logic programming concepts are also deemed possibilities for further investigation.
The realization of these improvements can be achieved in a number of ways. Besides rewriting
MATLAB from scratch, which would be an expensive solution and contravene the principles of
evolutionary change and reusability within software, the alternatives are as follows:
• Link to other packages which have the desired facilities.
• Try to simulate improvements within the constraints of the MATLAB syntax and data structures by creating extra M-files which perform the necessary tasks.
• Add improvements to the source code of MATLAB.
Each of these options has its advantages and disadvantages. Linking to other packages preserves
modularity but is also likely to be slow due to communication overheads. This may not resolve
problems such as restrictive data structures.
Simulating improvements through the use of additional M-files , where possible, is an attractive
solution since it requires no re-release of the package though this is likely to be less efficient and more
restrictive than incorporating new features within the source code.
"CACSD using Optimization Methods" PhD Thesis A.C.W.Grace Univ. of Wales. Bangor. 1989 1-4
PART 1: CACSD Data Structures
Adding new features to the source code seems an attractive solution though this may slow the
interpreter down and upgrading might make new software un-executable on old versions of the package.
The following sections discuss aspects of MATLAB which could be improved and changes which might
be implemented within MATLAB.
1.3.2 Data Structures

The only data structure which exists within MATLAB is the double-precision complex matrix, although strings may be thought of as a different data type. Mansour et al [11] (see also Maciejowski [9]) have outlined a proposed list of data structures which have been implemented in the IMPACT package and might be used within control. They are as follows:
• Polynomials.
• Transfer Functions.
• Complex Matrices.
• Linear Systems Descriptions in the Time Domain.
• Linear Systems Descriptions in the Frequency Domain.
• Non-linear Systems Descriptions.
• Domains (1-D structures containing ordered discrete values).
• Trajectories (e.g. time and frequency data).
• Non-numeric Structures.
Many of these structures can be simulated within MATLAB. For instance, a polynomial can be
simulated using a row vector and a transfer function can be simulated using one row for the numerator
and the other for the denominator. This situation, however, is far from satisfactory as there is no
indication from the display or operation of the data what kind of data structure is being used.
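As an illustration of this point, the following sketch (modern MATLAB syntax, with invented example values) holds a polynomial and a transfer function in the only representation available; nothing in the data itself says which interpretation is intended.
p = [1 3 2];              % intended as the polynomial s^2 + 3s + 2
G = [0 2.1234 4.3333;     % row 1: numerator   2.1234s + 4.3333
     1 3.0000 2.0000];    % row 2: denominator s^2 + 3s + 2
roots(p)                  % MATLAB simply treats p as a row vector of coefficients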
Rimvall has also suggested the overloading of operators such as *, /, + and - so that they have different meanings depending on the data structure being used. This has been implemented, together with extensions on data structures, in IMPACT [22], a MATLAB-based package. However, such a feature is not as yet commercially available in a MATLAB-based package.
Some data structures are more difficult to incorporate than others. For instance, non-linear
descriptions and symbolic processing may best be left to specialist packages such as simulation packages
(ACSL [32], SIMNON [33], TSIM [34] etc.) and symbolic processing packages (MACSYMA [27],
REDUCE [26]).
MATLAB has proved very successful within the realm of linear control design. Data structures
such as transfer functions, pole-zero representations and state-space systems should therefore be
inherently part of the package.
A first step for providing new data structures within MATLAB would be to provide flexible
input/output format. For example, this would allow polynomials to be entered and displayed as
polynomials and not simply as row or column vectors.
"CACSD using Optimization Methods" PhD Thesis A.C.W.Grace Univ. of Wales, Bangor. 1989 1-5
PART 1: CACSD Data Structures
In order to incorporate additional data structures within MATLAB the source code must be changed
to some extent. One desirable feature would be to allow variables to be labelled with information
concerning how they are to be displayed on the screen. This, although not a complete solution, would be
one step in the right direction. An example of a command defining the output format of variable
containing the state-space representation of a controller (named CONT1) might be as follows:
label(CONT1, 'format short variable s transfer function', 'Controller 1')
This would mean that the variable CONT1 would always be labelled as a transfer function. An example of such an output might be:
CONT1 = (Controller 1)
2.1234s + 4.3333
This format is much easier to understand than an output consisting of a matrix of numbers. However, problems may arise in user-defined functions if it is required to write flexible code which allows state-space or polynomial descriptions to be used interchangeably. There should therefore be rules so that variables can inherit output formats. For example, the command:
TF1=feedback(SYSTEM,CONT1)
would need to ensure that the variable TF1 has the same format as the variables SYSTEM and CONT1.
This might be achieved by a simple command such as
TF1=inherit(SYSTEM)
within the .m file feedback.
All data elements within MATLAB are treated as matrices, the elements of which are stored in a large
data stack of high precision numbers. A stack containing the position of the first element of each
variable in the main data stack (i.e. an array of pointers) has reference to other stacks which refer to the
variable names, row sizes, column sizes and a flag for indicating whether the matrix is real or complex.
This type of database resembles to some extent that of a relational database with columns of data
relating to each other. To incorporate new data structures and other facilities in MATLAB other
columns of data could be added to the database with information regarding further attributes and
relational aspects of the data. In this way pointers could be set up to display formats and other related
data so that clustering of data could be implemented. This would then allow structured information to
be defined. This might take the form of the structure command as in C. An example of this might be to
define a state space system in the following way:
structure(SYSTEM)
{A; B; C; D}
A structure would become a data type whose elements are a series of pointers to other data elements.
Referencing elements in the structure could be done as in C. Thus SYSTEM.A would refer to the A
matrix of structure SYSTEM.
When functions are called, the structures could behave like macros so that stepr(SYSTEM) is
equivalent to stepr(A,B,C,D). Grouping of data in this way should also serve to reduce complexity of
calling functions where there are a lot of variables. One problem encountered in trying to implement
"CACSD using Optimization Methods" PhD Thesis A.C.W.Grace Univ. of Wales, Bangor. 1989 1-6
PART 1: CACSD Databases
optimization algorithms is the large amount of data that must be temporarily stored (about 10-30
variables). If one uses the principle of not having any global variables, the long argument lists that must be passed to functions can become cryptic, and the principle of information hiding intended for user-defined functions is lost. Grouping the data into one matrix is complicated and time-inefficient, as is
temporarily storing to files. A good solution would be to use pre-defined structures.
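For illustration only, the kind of grouping proposed here can be sketched in modern MATLAB struct syntax (not available in the package discussed), reusing the stepr example from above.
% Hypothetical illustration: the whole state-space system travels as one variable.
SYSTEM.A = [0 1; -2 -3];
SYSTEM.B = [0; 1];
SYSTEM.C = [1 0];
SYSTEM.D = 0;
% A call such as stepr(SYSTEM) could then expand internally to
% stepr(SYSTEM.A, SYSTEM.B, SYSTEM.C, SYSTEM.D), as described above.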
There are many other relational aspects of data that could be incorporated within the MATLAB
data-structure. For a more detailed discussion of relational database aspects in control the reader is
referred to Breuer et al [7] [8].
1.3.3 Databases
One of the attractions of MATLAB is the internal database handling of the matrices. All
alterations and housekeeping of the database are maintained within the MATLAB operating system.
However, MATLAB offers very few facilities on a wider scale for data management. Although the
requirements for data management in control are relatively small compared with many applications
there are several areas where database management is an important issue.
One of these areas is in the modelling process where changes to the model may be made by a number
of workers and at intermittent intervals of time. If other design processes are being carried out in
conjunction with the modelling process or when a number of co-workers are working on the same
problem it becomes important to document details of changes in terms of time, date, type of change and
by whom the change was made.
Another area where data management may be a problem is within the realm of control system
design when a large number of linear models are obtained from the non-linear model at various
operating points in preparation for control system design. The control design, may, also create data
problems if different controllers and control structures are used. This might be the case if a control
designer were experimenting with different controllers in order to view trade-offs between them or to
implement them in a gain-scheduled controller.
It is clear that for the proper management of such data, data management systems should be
inherent within the data model. Non-linear modelling is really within the realms of system
identification work and database management aspects of this should be contained within the modelling
and simulation package being used.
The data management of linear models and controllers can be simulated within the constraints of
the MATLAB command language using M-files. These would perform the necessary housekeeping using
the commands load and save.
The policy which is used in the software used for Control System Design examples given in Part 3
is to use a different directory for each linearized model, under a main directory for the overall model.
Each directory contains files which contain the data from different designs resulting from changes in
design features or controller configuration. Files and directories are created automatically at the end of
each design and named depending on characteristics of the design. In this way an update of all designs is
maintained. In each file the data is recorded for further reference.
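A minimal sketch of such housekeeping, written as a modern MATLAB M-file, is given below; the directory layout, file naming and variable names are assumptions and not the thesis software.
function save_design(modeldir, designname, K, COST)
% SAVE_DESIGN  Hypothetical sketch: one directory per linearized model,
%              one file per design, recording the controller and final cost.
if ~exist(modeldir, 'dir')
   mkdir(modeldir);                                         % create the model directory on first use
end
save(fullfile(modeldir, [designname '.mat']), 'K', 'COST');  % record the design data for later comparison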
Although this method provides some form of data management additional features would greatly
facilitate and enhance the handling and storing of data. One such improvement would be to have flexible
data memory handling. At present when a file is to be examined the data is loaded into the system
overwriting any existing data with the same name. When examining differences between designs it
would be an advantage to just switch database from, for example, internal memory to a file.
"CACSD using Optimization Methods" PhD Thesis A.C.W.Grace Univ. of Wales, Bangor. 1989 1-7
1.3.4 Help and Error Diagnostics
MATLAB has a clever way of incorporating the documentation accompanying the M-files with the
help diagnostics which appear under the help command. However, the help command gives the user no
control over how much or how little information is received. Further, the help is in no way structured, making it difficult to get an overview of related commands.
Error diagnostics are generally not clear. For instance, if a matrix is of the wrong size or an
element which does not exist has been addressed, then the name of the matrix should be displayed along
with its contents. At present the only output which is provided is the line number where the error
occurred.
Another problem involves correction of syntax errors. These are often hard to find due to lack of
information and thus tedious to correct. A solution to this might be to allow syntax errors to be
corrected semi-automatically. This could be achieved by displaying the line at fault at the cursor so that
if the user wishes to make any changes he can do so without having to use the editor. The user would
then have the option of typing a carriage return, in which case no change would be made. Otherwise the
file would be rewritten with the correction made.
More information should be made available when an error condition occurs and all associated
variables should be displayed. A query feature could be made available giving progressively more detail
of the error to avoid clogging the screen.
Due to the ever-growing number of user-defined functions the chances of giving a variable the same
name as a user-defined function become increasingly high. This can result in errors which are difficult to
debug and may not be apparent until later use of a user-defined function. A useful feature would be to
give a warning whenever a variable is created which conflicts with a user-defined function name. A good naming convention also relieves this problem. Such a naming convention might be as follows:
• user-defined functions - lower case letters, having at least three letters.
• script files - first letter upper case.
• global variables & function arguments - upper case.
• temporary miscellaneous variables - lower case, maximum of two letters.
• integers - begin variable name with i or I.
• strings - begin variable name with s or S.
• polynomials - begin variable name with p or P.
An additional useful feature would be the facility to call a user-defined function without parameters and have MATLAB prompt for the variables whilst in interactive mode. This overcomes having to remember the order of often long lists of variables.
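A hypothetical sketch of this suggestion follows, using an invented function name and a placeholder computation; only the prompting pattern is the point.
function y = lqrcost(Q, R)
% LQRCOST  Hypothetical example: prompt for any argument not supplied, so the
%          user need not remember the argument order in interactive use.
if nargin < 1, Q = input('Enter state weighting matrix Q: '); end
if nargin < 2, R = input('Enter control weighting matrix R: '); end
y = trace(Q) + trace(R);     % placeholder computation for illustration only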
1.3.5 Compilation
Because MATLAB is basically an interpretive language, the run time efficiency is slow when
compared to compiled languages. Whilst this is not a problem in most applications, when iteration is
demanded MATLAB is particularly slow. One such application is optimization. It would therefore be of great advantage if, once a function has been tried and tested, it could be compiled and linked into the source code.
"CACSD using Optimization Methods" PhD Thesis A.C.W.Grace Univ. of Wales, Bangor. 1989 1-8
PART 1: CACSD Linking To Other Numerical Libraries
Most of the constructs within MATLAB (IF,WHILE,FOR) are contained within the source code
(C or FORTRAN), so that a translator would not be difficult to implement. However, where
compilation is likely to prove difficult is in the database handling of user-defined functions. For
instance, a matrix of arbitrary size can be created within a user-defined function. However dynamic
creation of variables like this is not generally or easily performed in high-level compiled languages.
Therefore, some form of kernel database handling system must be inherent within the compiled code.
Compiled routines must be linked into the main database either by communication mechanisms (e.g.
UNIX pipes) or by directly linking the compiled routines into the source code.
1.3.6 Linking to Other Numerical Libraries

Several lessons can be learned from the enormous popularity of MATLAB. Essentially, MATLAB is a package which has linked to a library of routines (LINPACK [36] and EISPACK [37,38]) and
provided a user-friendly and easy-to-use environment. One of the criticisms of libraries such as the
NAG library [35] is that there is no way to easily and quickly write application software or test
programs. A framework such as that provided by MATLAB would be an ideal environment for such
experimentation.
A possibility for incorporating libraries such as NAG into an interactive design environment such
as MATLAB might be to write an auto-linking program which reduces the amount of mundane tasks
concerned with aspects of memory allocation, compilation and coding. Linking directly to whole
libraries as has been done with LINPACK and EISPACK would be prohibitive due to large memory
requirements. However, an auto-linking compilation process could allow commands to be directly typed
in using a MATLAB command which would invoke a process to link with the required subroutine.
As an example of how this could be achieved, an example will be taken from the NAG library. The
MATLAB command which initiates the link will be performed using only constructs which are
available within the MATLAB command set. The example chosen is a subroutine which involves finding
a solution of a set of real linear equations using Crout's factorization method.
The command would be carried out using a MATLAB function of the form:
<>[X,A]=NAGCF04AAF,A,int(rows(A)),B,int(rows(B)),int(cols(A)),int(cols(B)),X,..
int(rows(X)),WSPACE,int(IFAIL))
Associated with this command would be the user-defined function NAG.m which would first store
the variables to a named file and then invoke the auto-link and compilation process. Once compiled the
function would become part of the command set so that no future compilation would be necessary until
the compiled subroutine is deleted. A user-defined function could then be set up to simplify the syntax
of such a command. For the above function the user-defined function would be:
"CACSD using Optimization Methods" PhD Thesis A.C.W.Grace Univ. of Wales. Bangor. 1989 1-9
PART 1: CACSD Linking To Other Numerical Libraries
function [X,U]=solve(A,B)
% SOLVE  Calculates the approximate solution for a set of real
%        linear equations with multiple right-hand sides by
%        Crout's factorization method.
%        X contains the solution vectors.
%        U contains the Crout factorization.
X=zeros(B);
WKSPACE=zeros(cols(A));
IFAIL=0;
[X,U]=NAG('F04AAF',A,int(rows(A)),B,int(rows(B)),int(cols(A)),int(cols(B)),...
          X,int(rows(X)),WKSPACE,int(IFAIL))
The general purpose MATLAB file for invoking the auto-linking program to any of the library
routines is:
function [b1,b2,b3,b4,b5,b6,b7,b8,b9,b10]=NAG(a1,a2,a3,a4,a5,a6,a7,a8,a9,a10,...
          a11,a12,a13,a14,a15,a16,a17,a18,a19,a20,a21,a22)
% NAG  Invokes auto-linking compilation of NAG subroutines.
cmd='save LINKFILE';
% save the correct number of arguments
for i=1:nargin
   cmd=[cmd,' a',num2str(i)];
end
eval(cmd)
% invoke the auto-linker/compiler
! autolink
% load back the variables returned from the subroutine and contained in the file LINKFILE
load LINKFILE
The command autolink invokes a process to write a program which links with the NAG routine
and stores the returned variables in a file LINKFILE.
Since MATLAB is not a typed language it is necessary to provide an integer function to indicate
when a variable is of type integer - this would be done by conversion of the integer to a character string
which is preceded with a marker (e.g. "*"). Thus the file would be of the form:
function in=int(a)
A=round(a)
in=['*',num2str(A)]
The commands rows and cols simply return the number of rows or columns of a matrix using the MATLAB command size.
"CACSD using Optimization Methods" PhD Thesis A.C.W.Grace Univ. of Wales, Bangor. 1989 1-10
PART 1: CACSD Modern Computing Approaches
This proposal could form a useful connection to a whole range of FORTRAN or other high-level language libraries. However, such a connection, due to its dependence on files as the communication mechanism, would be slow. For faster implementation, other inter-process communication mechanisms need to be explored. NAG optimization routines would also not be able to be implemented in this way, since they require a separate FORTRAN subroutine to calculate the cost functional and constraints, which must be called at every iteration.
1.3.7 Modern Computing Approaches

Modern computing approaches and the trend towards data-driven languages have been noted by some authors as the future for CACSD. Shepherd [42], for example, has highlighted the need for CACSD databases to support objects and graphical input/output. An object-oriented approach (see, for example, [44]) lends itself particularly to modern graphical input/output techniques such as iconic and pictorial
representation. Although the introduction of objects requires a restructuring of the data structure
aspects of MATLAB, as outlined in Section 1.3.3, some principles of object-oriented programming
could be implemented without recourse to rewriting the package.
Objects are groups of procedures and data which communicate via message passing. One important
aspect of object-oriented programming is the concept of class and inheritance. Class refers to protocols
and characteristics of an object. Inheritance allows one to create objects which can inherit characteristics
from other classes of objects. Variables are created by making instances of a class. A totally object-
oriented approach may not be appropriate for a package such as MATLAB, however the concept could be
applied by building up objects from user-defined functions and data and thus used as part of an iconic
graphical interface.
Consider for example, the construction of a control configuration in preparation for parameter
optimization. The problem here is data handling since there may be a number of control configurations
which need to be tested. To invoke the necessary control structure requires the execution of both
procedures and data to construct the necessary matrices into a general problem formulation. Under an
object-oriented system the control configuration could be thought of as an object (as opposed to both
procedures and data). By interfacing to a graphical input/output facility, icons could then be used to
represent abstract data types such as controller configuration, model choice and design method.
"CACSD using Optimization Methods" PhD Thesis A.C.W.Grace Univ. of Wales, Bangor. 1989
1.4 CACSD ENVIRONMENT FOR OPTIMIZATION
Fig. 1.2 shows how the design environment is structured. The FORTRAN version of MATLAB was
upgraded and called MATLANG because of its language capabilities. MATLANG parallels to a large
extent the development of PRO-MATLAB offering the following additional features to the
FORTRAN version of MATLAB.
• User-defined functions called as in PRO-MATLAB.
• IF, FOR, WHILE loops can be written over more than one line.
• Variables can have up to 10 characters (instead of 4).
• Graphics facility for TEK4010.
• On-line Programming without recourse to a text editor.
"CACSD using Optimization Methods" PhD Thesis A.C.W.Grace Univ. of Wales, Bangor. 1989 1-12
[Fig. 1.2: The MATLANG design environment. MATLANG (MATLAB plus programming facilities) is linked to FORTRAN subroutines, user-defined functions, the ADS optimization package and the TSIM non-linear simulator, exchanging performance measures and linearized models.]
This version of MATLAB has the advantage that FORTRAN subroutines can be directly linked into the
package. This offers the advantage of speed up over PRO-MATLAB user-defined functions. Several
FORTRAN subroutines were written and added to the package so that responses for time and frequency
could be obtained quickly for inclusion into performance indices for optimization problems. The package
has been linked to a non-linear dynamic simulator, TSIM [34] for access to non-linear simulations and
direct downloading of linearized models. To preserve modularity, this link is performed through files.
The key features of the design environment are that optimization may be easily and interpretively
entered into the program. The user has a number of options in the choice of optimization algorithm. The
optimization may be interrupted during execution and the user may change algorithms or reconfigure the
optimization problem in a flexible manner. For alternative descriptions of the environment see also
Grace and Fleming [40,41].
An example of an optimization program to minimize Rosenbrock's function (see also Part 2) is shown below.
"CACSD using Optimization Methods" PhD Thesis A.C.W.Grace Univ. of Wales, Bangor. 1989 1-13
PART 1: CACSD MATLANG CACSD ENVIRONMENT
Notice the call to the optimizer <X,PARA>=optim(F,X,PARA); which calls the ADS optimization
package on an iterative basis. The variable PARA contains information regarding algorithm choices and
relevant optimization parameters.
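The program listing itself does not survive in this copy. Purely as a hypothetical sketch, in modern MATLAB syntax (square-bracket output lists rather than MATLANG's <X,PARA> form), the kind of iterative loop described might look as follows; the optim interface details and the completion test are assumptions.
X    = [-1.9; 2];                             % starting point for Rosenbrock's function
PARA = [];                                    % algorithm choices and parameters (assumed format)
done = 0;
while ~done
   F = 100*(X(2) - X(1)^2)^2 + (1 - X(1))^2;  % Rosenbrock's function (Eq. 2.2)
   [X, PARA] = optim(F, X, PARA);             % iterative call into the ADS link
   done = isempty(PARA) || PARA(1) == 0;      % assumed completion signal
end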
1.4.3 ADS Optimization Package

ADS (Automated Design Synthesis) is an optimization suite of programs written in FORTRAN and developed by G.N. Vanderplaats at the Naval Postgraduate School, Monterey, California. The suite offers two significant advantages over existing optimization libraries such as NAG: flexibility of algorithm choice and improved program control.
The program has been modularized so that the user has options over the algorithms to be used at
three levels of calculation. These levels have been named Strategy, Optimizer and One-Dimensional
Search.
At the Strategy level the user decides whether to use, for instance, Sequential Unconstrained
Minimization [46,47], Sequential Linear Programming [48,49] or Sequential Quadratic Programming (see
Section 2.6.1) to solve a constrained optimization problem. Essentially, this either converts a
constrained optimization problem into an unconstrained optimization problem or tries to deal with the
constraints directly. If the problem is unconstrained the user may ignore the Strategy algorithm. The
Strategy algorithm may also be ignored when the algorithm to be used does not first convert a
constrained optimization problem to an unconstrained optimization problem. This may be the case using
either the Method of Feasible Directions [50,51] or a Random Search which are chosen at the Optimizer
level.
At the Optimizer level the method for solution of an easier sub-problem is chosen. This may be an unconstrained optimization method, such as a quasi-Newton method (cf. Section 2.2.1), including the Broyden-Fletcher-Goldfarb-Shanno (BFGS) method. Alternatively, a constrained sub-problem is chosen, such as a Quadratic Programming [52] method.
Having found a search direction at the Strategy and Optimizer levels, a line search is performed to
try to locate the minimum along the line determined by the current point and the search direction. This
algorithm may be chosen at the One-Dimensional Search algorithm level. The algorithms available
consist of interpolation and extrapolation methods, and a number of search algorithms (see Section 2.3.3
for comparison of methods).
Although not every permutation of algorithms within each of the three levels is possible, the user
still has a wide choice of combinations available. In this way, different combinations can be tested to
find the most suitable for the class of problem being considered.
ADS is called on an iterative basis by a controlling program. This means that ADS can be
interactively interrupted and the algorithm changed or the problem formulation updated. This contrasts with many optimization subroutines, such as those provided by NAG, where the function to be minimized
must be evaluated in a user-supplied FORTRAN subroutine. The subroutine name is passed as a
parameter to the optimizer and is called when needed from the optimizer. This means that the control
resides totally within the optimizer itself. The disadvantage of this is that it makes interaction
difficult. Further, if the performance indices are functions of variables other than purely the design
"CACSD using Optimization Methods" PhD Thesis A.C.W.Grace Univ. of Wales, Bangor. 1989 1-14
PART 1: CACSD MATLANG CACSD ENVIRONMENT
variables, then they must be stored and loaded by file or hard coded into the compiled function. This is
the case in control where the cost will usually be a function of both the design parameters (controller)
and the system parameters (plant).
1.4.4 Present Software Status

The package has proved a useful test-bed for optimization algorithms and for the development of
Control System Design methodologies. However, PRO-MATLAB was thought to offer improved
facilities in terms of graphics, software support and speed due to improved numerical routines. Software
development has therefore continued with PRO-MATLAB and the optimization algorithms have been
developed as MATLAB user-defined functions. MATLANG is not however totally redundant and is
proving a useful test-bed for optimization algorithms and for direct linking with existing FORTRAN
and C subroutines.
1.5 REVIEW
The emergence of packages such as MATLAB and the trend towards integration of existing software
within CACSD has been highlighted. Improvements to the MATLAB package have been suggested in the
form of simple realizable changes to an already useful package. An interactive design environment has
been described for optimization applications of Control System Design by the integration of an
optimization package, ADS, and an upgraded FORTRAN version of MATLAB. In the next part we will
examine how optimization algorithms can be directly implemented in the MATLAB command language.
"CACSD using Optimization Methods" PhD Thesis A.C.W.Grace Univ. of Wales, Bangor. 1989 1-15
PART 1: CACSD REFERENCES
1.6 REFERENCES
"CACSD using Optimization Methods" PhD Thesis A.C.W.Grace Univ. of Wales, Bangor. 1989 1-16
PART 1: CACSD REFERENCES
[23] Shah S.C., Floyd M.A. and Lehman L.L., "MATRIX-X: Control Design and Model Building CAE Capabilities," Computer-Aided Control Systems Engineering (M. Jamshidi and C.J. Herget, eds.), pp. 181-207, North-Holland Elsevier Science Publishers, Amsterdam, 1985.
[24] "MATRIX-X:User's Guide, MATRIX-X:Reference Guide, MATRIX-X: Training Guide,
Command Summary and On-line Help," Integrated Systems, Inc., Palo Alto, CA 94031.
[25] SLICE (Principal Developers: Control Systems Research Group, Kingston Polytechnic), NAG
Central Office, Oxford.
[26] Hearn A.C.,"REDUCE User's Manual," The Rand Corporation, Santa Monica, 1985.
[27] MACSYMA: Symbolics Inc., Cambridge, MA 02143, 1986.
[28] Polak E., Siegel P., Wuu T., Nye W.T. and Mayne D.Q., "DELIGHT.MLMO: An Interactive
Optimization-Based Multivariable Control System Design Package," IEEE Control Systems
Magazine, pp. 9-14, Dec 1982
[29] Nye W.T.,"DELIGHT: An Interactive System for Optimization-Based Engineering Design,"
Ph.D. Thesis, Dept. EECS, Berkeley, Univ. California, 1983.
[30] Wieslander, J., "IDPAC Commands - User's Guide," Department of Automatic Control, Lund Institute of Technology, Lund, Sweden, Report CODEN: LUTFD2/(TFRT-3157)/1-108/1980.
[31] Fleming P.J., "SUBOPT - A CAD Program for Suboptimal Regulators," Proc. Inst. Meas.
Control Workshop on "Computer Aided Control System Design," 19-21 September, 1984,
Sussex, U.K., pp.13-20.
[32] "ACSL Users' Guide," Mitchell and Guathier Associates, Inc., Concord, MA 01742.
[33] "SIMNON User's Guide," Dept. of Automatic Control, Lund Inst. of Technology, Lund,
Sweden, 1988.
[34] "TSIM Users' Guide and Reference Manual," Cambridge Control Ltd, High Cross, Madingley
Road, Cambridge. 1986.
[35] "The NAG Fortran Library Manual," The Numerical Algorithms Group, Mayfield House, 256
Banbury Road, Oxford, U.K.
[36] Dongarra J.J. et al., "LINPACK Users' Guide," Society for Industrial and Applied Mathematics,
Philadelphia, 1979.
[37] Smith B.T. et al., "Matrix Eigensystem Routines - EISPACK Guide," Lecture Notes in
Computer Science, Vol. 6, second edition, Springer Verlag, 1976.
[38] Garbow B.S. et al., "Matrix Eigensystem Routines - EISPACK Guide Extension," Vol. 51,
Springer Verlag, 1977.
[39] Vanderplaats, G.N., "ADS - A Fortran Program for Automated Design Synthesis," Naval
Postgraduate School, Monterey,CA, USA, 1983.
[40] Grace, A.C.W. and Fleming P.J., "A Design Environment for Control System Design via Multi-
Objective Optimization", Proc. 12th IMACS World Congress, Paris, Vol.2, pp.572-574, July
1988.
[41] Grace, A.C.W., Fleming, P.J. "Design Environment for Optimization Applications of Control
System Design", in "Advanced Computing Concepts and Techniques in Control Engineering"
NATO ASI Series(F) Vol 47, Springer Verlag, pp.495-512, 1988
Computing Aspects
[42] Shepherd, D., "Alternatives to ECSTASY," IEE Colloquium Digest No. 1987/97, pp. 5/1-5/4.
[43] Maskell K. "Building Software Bridges," Systems International, pp.63-64, Jan 1987.
[44] Bell D., Murray I., and Pugh I., "Software Engineering - A Programming Approach," Prentice-
Hall, 1987.
[45] "Software Development Environments: A Tutorial," IEEE Computer Society, Los Alamitos, CA,
USA, (Wasserman A.I. ed.), Catalog No. EHO 187-5, 1981.
"CACSD using Optimization Methods" PhD Thesis A.C.W.Grace Univ. of Wales. Bangor. 1989 1-17
PART 1: CACSD REFERENCES
"CACSD using Optimization Methods" PhD Thesis A.C.W.Grace Univ. of Wales, Bangor. 1989 1-18
PART 2: OPTIMIZATION
2.1 INTRODUCTION
ADVANCES in modern computing and the ability to describe increasingly complex systems have
prompted the development and use of optimization techniques as a way of performing complex
design tasks. This has enabled the improvement of cost, reliability and performance in a wide range of
applications. As engineering design is automated to an ever-increasing level, realistic measures of
performance, which may be in the form of multiple non-linear objectives, need to be incorporated into
the design. In tandem with this requirement, is the need to model the system accurately so that reliable
measures of system performance can be obtained. These measures may be functions of complex
relationships such as the results of expensive computer simulations. It therefore becomes profitable to
invest in careful problem formulation and algorithm efficiency. Hence, attention will be focussed on
efficient solution procedures for a number of representative optimization design problems.
In Part 1 the difficulty of linking existing FORTRAN optimization code into an interactive environment was outlined. A number of optimization methods have therefore been programmed in the
MATLAB command language which form an Optimization Toolbox (a users' manual is given in
Appendix A). These routines are easy to use and avoid the inherent difficulties associated with interfacing existing optimization software to MATLAB. The disadvantage is that, due to the
interpretive nature of MATLAB, the code is slower to execute than, for instance, a FORTRAN
realization of the same algorithm. For large problems, where matrix calculations dominate the CPU
time, the optimization routines, which use the efficient MATLAB matrix algorithms, are able to
approach the speed of FORTRAN implementations. Besides, since cost functionals and constraints are
often computationally expensive to evaluate, slower optimization execution time is compensated for by
the iterative efficiency of the methods employed. Moreover, the ease of programming using MATLAB
and the associated optimization routines, considerably reduces the time to code and refine optimization
problems. This allows a high-level of interaction on the part of the user.
Each of the routines will be presented in terms of a short general introduction, followed by a more
detailed description about practical aspects of the implementation. For an introduction to the subject,
the books by Gill and Murray [1], and Fletcher [2] are recommended. Brayton et al [3] give a good
overview of the state of the art in optimization while Mayne et al [4], and Nye and Tits [5] give useful
insights into the subject.
"CACSD using Optimization Methods" PhD Thesis A.C.W.Grace Univ. of Wales. Bangor. 1989 2-1
2.1.1 Parametric Optimization

The general parametric optimization problem may be stated as

$$\text{GP:}\qquad \min_{x \in \mathbb{R}^n} f(x) \quad \text{subject to} \quad g(x) \le 0,$$

where $x$ is the vector of design parameters ($x \in \mathbb{R}^n$), $f$ the objective function ($f: \mathbb{R}^n \to \mathbb{R}$) and $g$ the vector of equality and inequality constraints ($g: \mathbb{R}^n \to \mathbb{R}^m$).
The efficient and accurate solution of this problem is not only dependent on the size of the problem
in terms of the number of constraints and design variables but also on characteristics of the objective
function and constraints. When both the objective function and the constraints are linear functions of
the design variable the problem is known as a Linear Programming problem (LP). Quadratic
Programming (QP) concerns the minimization or maximization of a quadratic objective function which
is linearly constrained. For both the LP and QP problems reliable solution procedures are readily
available. More difficult to solve is the Non-linear Programming (NP) problem in which the objective
function and constraints may be non-linear functions of the design variables. Solution of the NP
problem generally requires an iterative procedure to establish a direction of search, at each major
iteration. This is generally achieved by the solution of an LP, a QP or an unconstrained sub-problem.
2.2 UNCONSTRAINED OPTIMIZATION

A classic example used to illustrate unconstrained optimization methods is Rosenbrock's function,
$$f(x) = 100\,(x_2 - x_1^2)^2 + (1 - x_1)^2. \qquad (2.2)$$
The minimum of this function is at $x = (1, 1)$, where $f(x) = 0$. A contour map is shown in Fig. 2.1, which shows the solution path to the minimum for a steepest descent implementation starting at the point $(-1.9, 2)$. The optimization was terminated after 1000 iterations, still a considerable distance from the minimum. The black areas are where the method is continually zig-zagging from one side of the valley to the other.
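As a hedged illustration of the behaviour just described, the following sketch (modern MATLAB, with an invented backtracking rule; not the thesis implementation) runs a simple steepest-descent iteration on Rosenbrock's function from the same starting point.
f = @(x) 100*(x(2) - x(1)^2)^2 + (1 - x(1))^2;                        % Rosenbrock's function (Eq. 2.2)
g = @(x) [-400*x(1)*(x(2) - x(1)^2) - 2*(1 - x(1)); 200*(x(2) - x(1)^2)];  % its gradient
x = [-1.9; 2];                                                        % starting point used in the text
for k = 1:1000
   d = -g(x);                                                         % steepest-descent direction
   a = 1;
   while f(x + a*d) >= f(x) && a > 1e-12
      a = a/2;                                                        % crude backtracking line search
   end
   x = x + a*d;                                                       % progress is a characteristic zig-zag
end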
"CACSD using Optimization Methods" PhD Thesis A.C.W.Grace Univ. of Wales. Bangor. 1989 2-2
PART 2: Optimization Quasi-Newton Methods
This type of function (Eq. 2.2), also known as the banana function, is notorious in unconstrained examples because of the way the curvature bends around the origin. Eq. 2.2 will be used throughout
this part to illustrate the use of a variety of optimization techniques. The contours have been plotted in
exponential increments due to the steepness of the slope surrounding the U-shaped valley.
2.2.1 Quasi-Newton Methods

These methods are based on a local quadratic model of the objective function,
$$f(x) = \tfrac{1}{2}\,x^{T} H x + b^{T} x + c, \qquad (2.3)$$
where the Hessian matrix, $H$, is a positive definite symmetric matrix, $b$ is a constant vector and $c$ is a constant. The optimal solution of this model occurs when the partial derivatives with respect to $x$ go to zero, i.e. at
$$x^{*} = -H^{-1} b. \qquad (2.5)$$
"CACSD using Optimization Methods" PhD Thesis A.C.W.Grace Univ. of Wales, Bangor. 1989 2-3
PART 2: Optimization Quasi-Newton Methods
Newton-type methods (as opposed to quasi-Newton methods) calculate H directly and proceed in a
direction of descent using a line search method to locate the minimum after a number of iterations.
Calculating H numerically involves a large amount of computation. Quasi-Newton methods avoid this,
by using the observed behaviour of f(x) and Vf(x) to build up curvature information, in order to make an
approximation to H using an appropriate updating technique.
A large number of Hessian updating methods have been developed. It is now generally recognized
that the formula of Broyden [6], Fletcher [7], Goldfarb [8] and Shanno [9] (BFGS) is the most effective
for use in a general purpose method. The formula is given by
BFGS
$$H_{k+1} = H_k + \frac{q_k q_k^{T}}{q_k^{T} s_k} - \frac{H_k s_k s_k^{T} H_k}{s_k^{T} H_k s_k}, \qquad (2.6)$$
where
$$s_k := x_{k+1} - x_k, \qquad q_k := \nabla f(x_{k+1}) - \nabla f(x_k).$$
As a starting point, $H_0$ can be set to any symmetric positive definite matrix, for example the unit matrix, $I$. To avoid the inversion of the Hessian $H$, it is possible to derive an updating method in which the direct inversion of $H$ is avoided by using a formula which approximates the inverse Hessian, $H^{-1}$, at each update. A well-known procedure is the DFP formula of Davidon [10], Fletcher and Powell [11]. This uses the same formula as the BFGS method (Eq. 2.6), except that $q_k$ is substituted for $s_k$.
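A minimal sketch of the BFGS update above, written as a modern MATLAB function, is given below; the function name and the placement of the curvature test are assumptions rather than the thesis implementation.
function H = bfgs_update(H, s, q)
% BFGS_UPDATE  Minimal sketch of the BFGS update (Eq. 2.6); s and q are the
%              column vectors s_k and q_k defined above.
if q'*s > 0                              % keep H positive definite (see Section 2.3.1)
   H = H + (q*q')/(q'*s) - (H*(s*s')*H)/(s'*H*s);
end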
The gradient information is either supplied through analytically calculated gradients, or derived by
partial derivatives using a numerical differentiation method via finite differences. This involves
perturbing each of the design variables, x, in turn and calculating the rate of change in the objective
function.
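As an illustration of the finite-difference approach just described, a minimal forward-difference sketch in modern MATLAB might look as follows; the function and variable names are assumptions, and no attempt is made to choose the perturbation size carefully.
function g = fd_gradient(fcn, x, h)
% FD_GRADIENT  Minimal sketch of a forward finite-difference gradient; fcn is a
%              handle to a scalar objective and h the perturbation size.
f0 = fcn(x);
n  = length(x);
g  = zeros(n, 1);
for i = 1:n
   xp    = x;
   xp(i) = xp(i) + h;                    % perturb one design variable at a time
   g(i)  = (fcn(xp) - f0)/h;             % rate of change of the objective function
end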
At each major iteration, $k$, a line search is performed in the direction
$$d = -H_k^{-1}\,\nabla f(x_k). \qquad (2.7)$$
The quasi-Newton method is illustrated by the solution path on Rosenbrock's function (Eq. 2.2) in
Fig. 2.2. The method is able to follow the shape of the valley and converges to the minimum after 140
function evaluations using only finite difference gradients.
"CACSD using Optimization Methods" PhD Thesis A.C.W.Grace Univ. of Wales, Bangor. 1989 2-4
2.2.2 Line Search

Each major iteration generates a new iterate by a search along a line of the form
$$x_{k+1} = x_k + \alpha^{*} d,$$
where $x_k$ denotes the current iterate, $d$ the search direction obtained by an appropriate method, and $\alpha^{*}$ is a scalar step-length parameter giving the distance to the minimum along the line.
"CACSD using Optimization Methods" PhD Thesis A.C.W.Grace Univ. of Wales, Bangor. 1989 2-5
PART 2: Optimization Line Search
Quadratic Interpolation

Quadratic interpolation involves a data fit to a univariate function of the form $a\alpha^{2} + b\alpha + c$, whose extremum occurs at
$$\alpha^{*} = -\frac{b}{2a}. \qquad (2.10)$$
This point may be a minimum or a maximum. It is a minimum when interpolation is performed (i.e. using a bracketed minimum) or when $a$ is positive. The coefficients, $a$ and $b$, can be determined using any combination of three gradient or function evaluations. It may also be carried out with just two gradient evaluations. The coefficients are determined through the formulation and solution of a linear set of simultaneous equations. Various simplifications in the solution of these equations can be achieved when particular characteristics of the points are used. For instance, the first point can generally be taken at $\alpha = 0$. Other simplifications can be achieved when the points are evenly spaced. A general problem formula is as follows:
Given three unevenly spaced points {x_1, x_2, x_3} and their associated function values {f(x_1), f(x_2), f(x_3)}, the minimum resulting from a second-order fit is given by

Quadratic Interpolation

x_{k+1} = \frac{1}{2} \, \frac{ \beta_{23} f(x_1) + \beta_{31} f(x_2) + \beta_{12} f(x_3) }{ \gamma_{23} f(x_1) + \gamma_{31} f(x_2) + \gamma_{12} f(x_3) }

where \beta_{ij} = x_i^2 - x_j^2 and \gamma_{ij} = x_i - x_j, which is valid for a bracketed minimum, i.e. when

f(x_2) < f(x_1)  and  f(x_2) < f(x_3).
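A small sketch of this three-point formula follows; the helper name quadmin is an assumption.

% Minimum of the quadratic fitted through (x1,f1), (x2,f2), (x3,f3).
function xm = quadmin(x1, f1, x2, f2, x3, f3)
b23 = x2^2 - x3^2;  b31 = x3^2 - x1^2;  b12 = x1^2 - x2^2;   % beta terms
g23 = x2  - x3;     g31 = x3  - x1;     g12 = x1  - x2;      % gamma terms
xm  = 0.5*(b23*f1 + b31*f2 + b12*f3)/(g23*f1 + g31*f2 + g12*f3);

For example, for f(x) = (x-2)^2 with the bracket {0, 1, 3} the formula returns the exact minimizer x = 2.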
Cubic Interpolation
Cubic interpolation is useful when gradient information is readily available or when more than three function evaluations have been calculated. It involves a data fit to the univariate function

m_c(\alpha) = a\alpha^3 + b\alpha^2 + c\alpha + d

In order to find the minimum extremum, the root should be taken which gives 6a\alpha + 2b positive. The coefficients can be determined using any combination of four gradient or function evaluations, or alternatively with just three gradient evaluations. The coefficients are calculated by the formulation
"CACSD using Optimization Methods" PhD Thesis A.C.W.Grace Univ. of Wales, Bangor. 1989 2-6
PART 2: Optimization QUASI-NEWTON IMPLEMENTATION
and solution of a linear set of simultaneous equations. A general formula, given two points {x_1, x_2}, their corresponding gradients with respect to x, {\nabla f(x_1), \nabla f(x_2)}, and associated function values, {f(x_1), f(x_2)}, is given in terms of

Cubic Interpolation

\beta_1 = \nabla f(x_1) + \nabla f(x_2) - 3 \, \frac{ f(x_1) - f(x_2) }{ x_1 - x_2 }

\beta_2 = \left[ \beta_1^2 - \nabla f(x_1) \nabla f(x_2) \right]^{1/2}

(2.14)
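A companion sketch of a cubic interpolation step follows; \beta_1 and \beta_2 are computed as in Eq. 2.14, while the closing expression for the interpolated minimum is the standard two-point cubic step and is an assumption here, since the corresponding equation is not reproduced above.

% Cubic interpolation of the minimum from two points with function
% values f1, f2 and univariate gradients g1, g2 (cf. Eq. 2.14).
function xm = cubicmin(x1, f1, g1, x2, f2, g2)
b1 = g1 + g2 - 3*(f1 - f2)/(x1 - x2);       % beta1 of Eq. 2.14
b2 = sqrt(b1^2 - g1*g2);                    % beta2 of Eq. 2.14
xm = x2 - (x2 - x1)*(g2 + b2 - b1)/(g2 - g1 + 2*b2);   % standard cubic step (assumed)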
Hessian Update
The Hessian update of Eq. 2.6 maintains a positive definite approximation provided that q_k^T s_k is positive. The term q_k^T s_k is a product of the line search step length parameter, \alpha_k, and a combination of the search direction, d, with past and present gradient evaluations:

q_k^T s_k = \alpha_k \left( \nabla f(x_{k+1})^T d - \nabla f(x_k)^T d \right)

The condition that q_k^T s_k is positive is always achieved by ensuring that a sufficiently accurate line search is performed. This is because the search direction, d, is a descent direction, so that \alpha_k and -\nabla f(x_k)^T d are always positive. Thus, the possibly negative term \nabla f(x_{k+1})^T d can be made as small
"CACSD using Optimization Methods" PhD Thesis A.C.W.Grace Univ. of Wales, Bangor. 1989 2-7
PART 2: Optimization Line Search Procedures
At each iteration a step, \alpha_k, is attempted to form a new iterate of the form

x_{k+1} = x_k + \alpha_k d

If this step does not satisfy the condition (Eq. 2.16) then \alpha_k is reduced to form a new step, \alpha_{k+1}. The usual method for this reduction is bisection (i.e. continually halving the step length until a reduction is achieved in f(x)). However, it has been found, through numerical experimentation, that this procedure is slow when compared to an approach which uses gradient and function evaluations together with cubic interpolation/extrapolation to identify estimates of the step length.
When a point is found which satisfies the condition (Eq. 2.16), an update is performed if q_k^T s_k is positive. If it is not, then further cubic interpolations are performed until the univariate gradient term \nabla f(x_{k+1})^T d is sufficiently small so that q_k^T s_k is positive.
It is usual practice to reset \alpha_k to unity after every iteration. However, it should be noted that the quadratic model (Eq. 2.3) is generally only a good one near to the solution point. Therefore, \alpha_k is modified at each major iteration to compensate for the case when the approximation to the Hessian is monotonically increasing or decreasing. To ensure that, as x_k approaches the solution point, the procedure reverts to a value of \alpha_k close to unity, the values of q_k^T s_k, -\nabla f(x_k)^T d and \alpha_{k-1} are used to estimate the closeness to the solution point and thus to control the variation in \alpha_k.
After each update procedure, a step length \alpha_k is attempted, following which a number of scenarios are possible. Consideration of all the possible cases is quite complicated and so they are represented pictorially in Fig. 2.3, where the left hand point on the graphs represents the point x_k. The slope of the line bisecting each point represents the slope of the univariate gradient, \nabla f(x)^T d, which is always negative for the left hand point. The right hand point is the point x_{k+1} after a step of \alpha_k is taken in the direction d. Cases 1 and 2 show the procedures which are performed when \nabla f(x_{k+1})^T d is positive. Cases 3 and 4 show the procedures performed when \nabla f(x_{k+1})^T d is negative. The notation min{a,b,c} refers to the minimum of the three values a, b and c.
"CACSD using Optimization Methods" PhD Thesis A.C.W.Grace Univ. of Wales. Bangor. 1989 2-8
PART 2: Optimization Line Search Procedures
f(x)
Reduce step length.
,0(
E>
a/2
ac if ak <0.1
0 ak+i ak a CCk+1 =
otherwise
f(x)
Update H.
E> Reset a Reduce step length.
0 a ak+1
k
ak÷ i = min(2, p, 1.2*a c ) ak+1= min(2, 1.5*ak, ac)
where p=l+q kTs k—Vf(x k+i )Td +minfO,ak-1)
f(x)
11, Reduce step length.
E>
Set aic+1 = min (a
( ac , a//22 }
0 ak+1 ak a
"CACSD using Optimization Methods" PhD Thesis A.C.W.Grace Univ. of Wales. Bangor. 1989 2-9
PART 2: Optimization Line Search Procedures
At each iteration a cubically interpolated step length \alpha_c is calculated which is used to adjust the step length parameter \alpha_{k+1}. Occasionally, for very non-linear functions, \alpha_c may be negative, in which case \alpha_c is given a value of 2\alpha_k. The methods for changing the step length have been refined over a period of time by considering a large number of test problems.
Certain robustness measures have also been included so that, even in the case when false gradient information is supplied, a reduction in f(x) may be achieved by taking a negative step. This is realized by setting \alpha_{k+1} = -\alpha_k/2 when \alpha_k falls below a certain threshold value (e.g. 1e-8). This is important when extremely high precision is required and only finite difference gradients are available. A further robustness measure is incorporated for use on discrete functions. This is used when the step length falls below a second threshold (e.g. 1e-12) and consists of calculating a random search direction, d, and step length, \alpha_{k+1}.
"CACSD using Optimization Methods" PhD Thesis A.C.W.Gracc Univ. of Wales, Bangor. 1989 2-10
PART 2: Optimization Line Search Procedures
Fig. 2.4 Line Search Procedures With Only Gradient For The First Point
"CACSD using Optimization Methods PhD Thesis A.C.W.Grace Univ. of Wales, Bangor. 1989 2-11
PART 2: Optimization Comparison of Methods
The number of function and gradient evaluations for each problem are shown in Table 2.1, with starting values appearing in the first column. The first two problems used both gradient and function evaluations and the number of evaluations of each is shown in separate columns. The last two problems used finite difference gradients and thus only the number of function evaluations has been entered. The finite difference interval was set to 1e-8 for each package.
The results are split into columns, where the first four columns represent the ADS implementation
using different line search strategies, and the last two columns represent the MATLAB implemented
strategies. The line search strategies are as follows:
(1) Golden Section search.
(2) Golden Section search followed by Cubic Interpolation.
(3) Polynomial interpolation after bracketing the minimum.
(4) Polynomial interpolation/extrapolation without bracketing the minimum.
(5) MATLAB implemented Cubic Polynomial Interpolation/Extrapolation method.
(6) MATLAB implemented Mixed Polynomial Interpolation/Extrapolation method.
ADS reported a number of failures, in which case f has been entered in the appropriate column (failure is taken to mean that the number of function evaluations exceeded 2000). In order to make a fair comparison, only those problems for which all the routines were successful have been used for the total and the average.
The MATLAB routines showed the least number of failures with the greatest efficiency being
achieved by the mixed polynomial method.
"CACSD using Optimization Methods" PhD Thesis A.C.W.Grace Univ. of Wales, Bangor. 1989 2-12
PART 2: Optimization Comparison of Methods
Table 2.1 Function and gradient evaluations for two BFGS implementations

                  ADS                                                                              MATLAB
                  (1) Golden    (2) Golden Section   (3) Bracketed     (4) Unbracketed    (5) Cubic        (6) Mixed Polynomial
Problem    x      Section       + Cubic Inter.       Polynomial Inter. Polynomial         Inter/Extrap     Inter/Extrap
No.                                                                    Inter/Extrap
                  func  grad    func  grad           func  grad        func  grad         func  grad       func  grad
1  (-1.2,1)   546 26   313 25   112 27   108 27   40 39   76 23
   (2,2)      261 13   f   f    61  14   56  14   28 27   35 10
   (-2,-2)    573 28   354 27   129 31   120 31   45 44   90 27
   (2,-2)     170 9    138 12   37  8    45  11   20 19   15 4
   (-2,2)     734 35   511 40   150 36   139 35   52 51   93 28
   (0,0)      287 14   176 14   60  15   60  15   25 24   50 16
(10,10) 518 24 353 24 88 25 78 25 96 95 76 22
(-10,10) 1406 63 849 61 270 64 312 79 109 108 169 50
(10,-10) 670 27 353 24 88 25 81 18 90 89 138 41
(-10,-10) 1467 66 796 58 286 68 336 83 104 103 47 14
(-1,1) 128 6 85 6 28 6 28 7 11 10 18 6
2 (1,1) 101 5 67 5 26 5 23 6 16 15 21 6
(1,-1) 95 5 62 5 26 5 20 5 8 7 13 3
(-2,-2) 213 9 141 9 39 9 73 16 22 21 54 18
(2,-2) 110 5 80 5 28 5 33 7 8 7 20 5
(0,0) 104 5 70 5 25 5 24 6 8 7 12 4
(10,10) 469 18 333 18 93 19 84 22 58 57 50 15
(5,5) 288 11 194 11 63 13 59 15 22 21 38 11
(3,3) 137 6 98 6 48 10 57 14 18 17 48 14
(5,-5) 419 17 232 14 50 10 40 10 18 17 38 11
3  (-1.2,1,0)   f     f     f     f     89    96
   (1,1,0)      f     f     f     640   53    59
   (1,-1,0)     f     f     f     f     77    80
   (-2,-2,2)    f     f     f     f     173   146
   (2,-2,2)     f     f     f     f     73    106
   (0,0,0)      f     f     f     f     81    74
   (10,10,10)   f     f     1273  374   341   127
   (5,5,5)      649   424   215   211   189   111
   (3,3,3)      413   276   146   f     169   106
   (5,-5,-5)    f     f     f     f     97    103
4 (-1,1) 112 78 34 27 16 16
   (1,1) 111 75 34 28 22 23
(1,-1) 113 78 34 29 16 16
(-2,-2) 138 96 42 41 64 35
(2,-2) 112 77 36 32 25 23
(10,10) 103 104 41 60 82 61
(5,5) 103 75 38 45 61 48
(3,3) 98 99 48 59 43 33
(5,-5) 126 102 42 60 37 36
(3,0.1) 122 87 41 46 52 29
TOTAL 9803 374 6500 369 2251 386 2358 432 1377 752 1496 318
AVERAGE 327.7 17.2 216.7 19.4 75.0 20.3 78.6 22.7 45.9 39.5 49.9 16.7
"CACSD using Optimization Methods" PhD Thesis A.C.W.Grace Univ. of Wales. Bangor. 1989 2-13
PART 2: Optimization LEAST SQUARES OPTIMIZATION
LS

minimize_{x \in \Re^n}  F(x) = \frac{1}{2} \sum_{i=1}^{m} f_i^2(x) = \frac{1}{2} f(x)^T f(x)
(2.18)
Problems of this type occur in a large number of practical applications, especially when fitting model functions to data, i.e. non-linear parameter estimation. They are also prevalent in control where it is desired that the output, y(x,t), follows some continuous model trajectory, \phi(t). This problem can be expressed as

minimize_{x \in \Re^n}  \int_{t_0}^{t_1} ( y(x,t) - \phi(t) )^2 \, dt   (2.19)

When the integral is discretized using a suitable quadrature formula, Eq. 2.19 can be formulated as a least squares problem

minimize_{x \in \Re^n}  F(x) = \sum_{i=1}^{m} ( y(x,t_i) - \phi(t_i) )^2   (2.20)
In problems of this kind the residual, \|f(x)\|, is likely to be small at the optimum since it is general practice to set realistically achievable target trajectories. Although the function in LS (Prob. 2.18) can be minimized using a general unconstrained minimization technique, as described in Section 2.2, certain characteristics of the problem can often be exploited to improve the iterative efficiency of the solution procedure. In particular, the gradient and Hessian matrix of LS (Prob. 2.18) have a special structure. Denoting the m x n Jacobian matrix of f(x) as J(x), the gradient, \nabla F(x), and the Hessian, H(x), of F(x) are defined as

\nabla F(x) = J(x)^T f(x)
H(x) = J(x)^T J(x) + Q(x)   (2.21)

The matrix Q(x) has the property that, when the residual \|f(x)\| tends to zero as x_k approaches the solution, it also tends to zero. When \|f(x)\| is small at the solution, a very effective method is to use the Gauss-Newton direction as a basis for an optimization procedure.
"CACSD using Optimization Methods" PhD Thesis A.C.W.Grace Univ. of Wales, Bangor. 1989 2-14
PART 2: Optimization Gauss-Newton Method
In the Gauss-Newton method a search direction, d_k, is obtained at each major iteration which solves the linear least squares problem

minimize_{d_k \in \Re^n}  \| J(x_k) d_k + f(x_k) \|^2   (2.22)

The direction derived from this method is equivalent to the Newton direction when the terms of Q(x) can be ignored. The search direction d_k can be used as part of a line search strategy to ensure that at each iteration the function F(x) decreases.
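As a sketch (not the thesis implementation), a Gauss-Newton iteration can be written as follows, where resfun is an assumed user function returning the residual vector f(x) and Jacobian J(x); the backslash operator performs a QR-based least squares solve of Eq. 2.22 without forming J^T J explicitly.

% Gauss-Newton sketch: unit step taken for brevity; a line search
% would normally guard descent of F(x) = f(x)'*f(x).
function x = gaussnewton(resfun, x, maxit)
for k = 1:maxit
   [f, J] = feval(resfun, x);
   d = -(J\f);                          % Gauss-Newton direction (Eq. 2.22)
   x = x + d;
   if norm(J'*f) < 1e-8, break, end     % first-order optimality measure (cf. Eq. 2.21)
end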
To illustrate the efficiency gains that are possible with the Gauss-Newton method, Fig. 2.5 shows the path to the minimum on Rosenbrock's function (Eq. 2.2) when posed as a least squares problem. The Gauss-Newton method converged after only 48 function evaluations using finite difference gradients, compared to 140 function evaluations using an unconstrained BFGS method.
The Gauss-Newton method often encounters problems when the second order term Q(x) in Eq. 2.21 is significant. A method which overcomes this problem is the Levenberg-Marquardt method.
"CACSD using Optimization Methods" PhD Thesis A.C.W.Grace Univ. of Wales, Bangor. 1989 2-15
PART 2: Optimization Levenberg-Marquardt Method
In the Levenberg-Marquardt method the search direction, d_k, is obtained at each major iteration from a solution of the equations

( J(x_k)^T J(x_k) + \lambda_k I ) d_k = -J(x_k)^T f(x_k)   (2.23)

where the scalar, \lambda_k, controls both the magnitude and direction of d_k. When \lambda_k is zero, the direction d_k is identical to that of the Gauss-Newton method. As \lambda_k tends to infinity, d_k tends towards a vector of zeros and a steepest descent direction. This implies that for some sufficiently large \lambda_k the term F(x_k + d_k) < F(x_k). The term \lambda_k can therefore be controlled to ensure descent even when second order terms, which restrict the efficiency of the Gauss-Newton method, are encountered.
The Levenberg-Marquardt method therefore uses a search direction which is a cross between the
Gauss-Newton direction and the steepest descent. This is illustrated in Fig. 2.6 below. The solution for
Rosenbrock's function (Eq. 2.2) converged after 90 function evaluations compared to 48 for the Gauss-
Newton method. The poorer efficiency is partly because the Gauss-Newton method is generally more
effective when the residual is zero at the solution. However, such information is not always available
beforehand, and occasional poorer efficiency of the Levenberg-Marquardt method is compensated for by
its increased robustness.
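A hedged sketch of the Levenberg-Marquardt direction follows; a simple increase/decrease rule for \lambda_k is used in place of the interpolation-based control described in the implementation section below, and the function names are assumptions.

% Levenberg-Marquardt sketch: the direction solves Eq. 2.23; lambda is
% increased when a step fails and decreased when it succeeds.
function x = levmarq(resfun, x, lambda, maxit)
[f, J] = feval(resfun, x);
F = f'*f;                                        % sum of squares objective
for k = 1:maxit
   d = -(J'*J + lambda*eye(length(x)))\(J'*f);   % Eq. 2.23
   [fnew, Jnew] = feval(resfun, x + d);
   if fnew'*fnew < F                             % descent achieved: accept the step
      x = x + d;  f = fnew;  J = Jnew;  F = f'*f;
      lambda = lambda/10;                        % move towards the Gauss-Newton direction
   else
      lambda = lambda*10;                        % move towards steepest descent
   end
   if norm(J'*f) < 1e-8, break, end
end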
"CACSD using Optimization Methods" PhD Thesis A.C.W.Grace Univ. of Wales, Bangor. 1989 2-16
PART 2: Optimization LEAST SQUARES IMPLEMENTATION
Gauss-Newton Implementation
The Gauss-Newton implementation solves the linear least squares problem of Eq. 2.22 directly, avoiding the formation of the explicit matrix, J(x_k)^T J(x_k), which can cause unnecessary errors to occur.
Robustness measures are included in the method which consist of changing the algorithm to the Levenberg-Marquardt method when either the step length falls below a threshold value (in this implementation 1e-15) or when the condition number of J(x_k) exceeds 1e10 (the condition number is the ratio of the largest singular value to the smallest).
Levenberg-Marquardt Implementation
The term F_k(x^*) is obtained by cubically interpolating the points F(x_k) and F(x_{k+1}). A step length parameter \alpha^* is also obtained from this interpolation, which is the estimated step to the minimum. If F_p(x_k) is greater than F_k(x^*) then \lambda_k is reduced, otherwise it is increased. The justification for this is that the difference between F_p(x_k) and F_k(x^*) is a measure of the effectiveness of the Gauss-Newton method and the linearity of the problem. This determines whether to use a direction approaching the
"CACSD using Optimization Methods" PhD Thesis A.C.W.Grace Univ. of Wales, Bangor. 1989 2-17
PART 2: Optimization Levenberg-Marquardt Implementation
steepest descent direction or the Gauss-Newton direction. The formulas for the reduction and increase in \lambda_k, which have been developed through consideration of a large number of test problems, are shown in Fig. 2.7 below.

Fig. 2.7 Formulas for the reduction and increase in \lambda_k (in terms of the interpolated values F_k(x^*), F_p(x_k) and the step estimate \alpha^*).
Following the update of \lambda_k, a solution of Eq. 2.23 is used to obtain a search direction, d_k. A step length of unity is then taken in the direction d_k, followed by a line search procedure similar to that discussed for the unconstrained implementation. The line search procedure ensures that F(x_{k+1}) < F(x_k) at each major iteration and the method is therefore a descent method.
The implementation has been successfully tested on non-linear problems with up to 58 design
variables and 100 sums of squares. It has been found that the method is considerably more robust than
the Gauss-Newton method, and iteratively more efficient than an unconstrained method.
"CACSD using Optimization Methods" PhD Thesis A.C.W.Grace Univ. of Wales, Bangor. 1989 2-18
PART 2: Optimization CONSTRAINED OPTIMIZATION
KT

\nabla f(x^*) + \sum_{i=1}^{m} \lambda_i^* \nabla g_i(x^*) = 0

\lambda_i^* g_i(x^*) = 0,   i = 1, ..., m

\lambda_i^* \geq 0,   i = m_e+1, ..., m
(2.25)
The first equation describes a cancelling of the gradients between the objective function and the active constraints at the solution point. In order for the gradients to be cancelled, Lagrangian multipliers (\lambda_i, i = 1, ..., m) are necessary to balance the deviations in magnitude of the objective function and constraint gradients. Since only active constraints are included in this cancelling operation, constraints which are not active are given Lagrangian multipliers equal to zero. This is stated implicitly in the last two equations of Eq. 2.25.
The solution of the KT equations forms the basis of many Non-linear Programming algorithms.
These algorithms attempt to compute directly the Lagrangian multipliers. Constrained quasi-Newton
methods, in particular, guarantee superlinear convergence by accumulating second order information
regarding the KT equations using a quasi-Newton updating procedure. These methods are commonly
referred to as Sequential Quadratic Programming (SQP) methods since a QP sub-problem is solved at
each major iteration (also known as Iterative Quadratic Programming, Recursive Quadratic Programming
and Constrained Variable Metric methods).
"CACSD using Optimization Methods" PhD Thesis A.C.W.Grace Univ. of Wales. Bangor. 1989 2-19
PART 2: Optimization SEQUENTIAL QUADRATIC PROGRAMMING
Here Prob. 2.1 is simplified by assuming that bound constraints have been expressed as inequality
constraints. The QP sub-problem is obtained by linearizing the non-linear constraints:
QP sub-problem

minimize_{d \in \Re^n}  \frac{1}{2} d^T H_k d + \nabla f(x_k)^T d

subject to  \nabla g_i(x_k)^T d + g_i(x_k) = 0,   i = 1, ..., m_e
            \nabla g_i(x_k)^T d + g_i(x_k) \leq 0,   i = m_e+1, ..., m
This sub-problem may be solved using any QP algorithm (see, for instance, Section 2.7.2 below).
The solution is used to form a new iterate:

x_{k+1} = x_k + \alpha_k d_k

The step length parameter, \alpha_k, is determined by an appropriate line search procedure so that a sufficient decrease in a merit function is obtained (see Section 2.7.3 below). The matrix H_k is a positive definite approximation of the Hessian matrix of the Lagrangian function (Eq. 2.26). H_k can be updated by any of the quasi-Newton methods, although the BFGS method (see Section 2.7.1 below) appears to be the most popular.
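The structure of a major SQP iteration might be sketched as follows for inequality constraints only; quadprog from the Optimization Toolbox is used purely as a stand-in for the active set QP solver of Section 2.7.2, the merit function with a fixed penalty r is a simplification of Section 2.7.3, and objfun/confun are assumed user functions.

% SQP sketch: objfun returns [f, gradf]; confun returns [g, G] with one
% column of G per inequality constraint g_i(x) <= 0.
function x = sqpsketch(objfun, confun, x, maxit)
n = length(x);  H = eye(n);  r = 10;
[f, gf] = feval(objfun, x);
[g, G]  = feval(confun, x);
for k = 1:maxit
   % QP sub-problem: min 0.5*d'*H*d + gf'*d  subject to  G'*d <= -g
   [d, ~, ~, ~, lm] = quadprog(H, gf, G', -g);
   lambda = lm.ineqlin;                         % multiplier estimates
   m0 = f + r*sum(max(0, g));                   % merit value at x
   a  = 1;                                      % try a unit step first
   [fn, gfn] = feval(objfun, x + a*d);  [gn, Gn] = feval(confun, x + a*d);
   while fn + r*sum(max(0, gn)) > m0 && a > 1e-8   % crude backtracking on the merit
      a = a/2;
      [fn, gfn] = feval(objfun, x + a*d);  [gn, Gn] = feval(confun, x + a*d);
   end
   s = a*d;
   q = (gfn + Gn*lambda) - (gf + G*lambda);     % Lagrangian gradient difference (cf. Eq. 2.29)
   if q'*s > 0                                  % BFGS update of the Hessian approximation
      H = H + (q*q')/(q'*s) - (H*(s*s')*H)/(s'*H*s);
   end
   x = x + s;  f = fn;  gf = gfn;  g = gn;  G = Gn;
end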
"CACSD using Optimization Methods" PhD Thesis A.C.W.Grace Univ. of Wales, Bangor. 1989 2-20
PART 2: Optimization SOP IMPLEMENTATION
A non-linearly constrained problem can often be solved in fewer iterations using SQP than an
unconstrained problem. One of the reasons for this is that, due to limits on the feasible area, the
optimizer can make well informed decisions regarding directions of search and step length.
Consider Rosenbrock's function (Eq. 2.2) with an additional non-linear inequality constraint.
This was solved by an SQP implementation in 96 iterations compared to 140 for the unconstrained case.
Fig. 2.8 shows the path to the solution point, x=(0.9072,0.8228), starting at x=(-1.9, 2).
"CACSD using Optimization Methods" PhD Thesis A.C.W.Grace Univ. of Wales, Bangor. 1989 2-21
PART 2: Optimization Updating The Hessian Matrix
H_{k+1} = H_k + \frac{ q_k q_k^T }{ q_k^T s_k } - \frac{ H_k s_k s_k^T H_k^T }{ s_k^T H_k s_k }

where  s_k := x_{k+1} - x_k

       q_k := \nabla f(x_{k+1}) + \sum_{i=1}^{m} \lambda_i \nabla g_i(x_{k+1}) - \left( \nabla f(x_k) + \sum_{i=1}^{m} \lambda_i \nabla g_i(x_k) \right)

(2.29)
where \lambda_i (i = 1, ..., m) is an estimate of the Lagrangian multipliers. Powell [15] recommends keeping the Hessian positive definite even though it may be positive indefinite at the solution point. A positive definite Hessian is maintained provided q_k^T s_k is positive at each update and H is initialized with a positive definite matrix. When q_k^T s_k is not positive, q_k is modified on an element by element basis so that q_k^T s_k > 0. The general aim of this modification is to distort the elements of q_k which contribute to a positive definite update as little as possible. Therefore, in the initial phase of the modification, the most negative element of the element-wise product of q_k and s_k is repeatedly halved. This procedure is continued until this minimum element is greater than or equal to -1e-5 or until q_k^T s_k becomes positive. If, after this procedure, q_k^T s_k is still not positive, q_k is modified by adding a vector, v, multiplied by a constant scalar, w, so that

q_k = q_k + w v
(2.30)

where
"CACSD using Optimization Methods" PhD Thesis A.C.W.Grace Univ. of Wales, Bangor. 1989 2-22
PART 2: Optimization Quadratic Programming Solution
QP

minimize_{x \in \Re^n}  \frac{1}{2} x^T H x + c^T x

subject to  A_i x = b_i,   i = 1, ..., m_e
            A_i x \leq b_i,   i = m_e+1, ..., m
(2.31)
A number of methods have been implemented and tested including one using Wolfe's procedure [28]
which uses a Simplex [27] (or other) Linear Programming algorithm. The method which was finally
adopted due to its fast convergence was an active set strategy (also known as a projection method)
similar to that of Gill et al, described in [1] and [29]. It has been modified for both LP and QP
problems.
The solution procedure involves two phases: the first phase involves the calculation of a feasible
point (if one exists), the second phase involves the generation of an iterative sequence of feasible points
which converge to the solution. In this method an active set is maintained, Ak, which is an estimate of
the active constraints (i.e. which are on the constraint boundaries) at the solution point. Virtually all
QP algorithms are active set methods. This point is emphasized because there exist many different
methods which are very similar in structure but which are described in widely different terms.
Ak is updated at each iteration, k, and this is used to form a basis for a search direction, d k. It
should be noted that equality constraints always remain in the active set, Ak. The notation for the
non-italicized variables, d k and k, is used here in order to distinguish them from dk and k in the major
iterations of the SQP method. The search direction, d k , is calculated which minimizes the objective
function while remaining on any active constraint boundaries. The feasible subspace for d k is formed
from a basis, Zk, whose columns are orthogonal to the estimate of the active set, Ak (i.e. AkZk=0). Thus
a search direction, which is formed from a linear summation of any combination of the columns of Zk, is
guaranteed to remain on the boundaries of the active constraints.
The matrix Z_k is formed from the last n-l columns of the QR decomposition of the matrix A_k^T, where l is the number of active constraints and l < n, i.e. Z_k is given by

Z_k = Q(:, l+1:n),   where   Q^T A_k^T = \begin{bmatrix} R \\ 0 \end{bmatrix}   (2.32)

Once Z_k is found, the objective function is minimized at the next iterate,

x_{k+1} = x_k + \alpha d_k   (2.33)

where d_k is the search direction formed from a linear combination of the columns of Z_k, i.e. d_k = Z_k p, where p is a vector of constants.
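As a small illustration (with an invented active constraint matrix), Z_k can be obtained in MATLAB from the QR decomposition as follows.

% Null space basis for the active set: rows of A are the l active
% constraint gradients; Z spans {d : A*d = 0}.
A = [1 1 0; 0 1 1];          % example: l = 2 active constraints, n = 3 variables
[l, n] = size(A);
[Q, R] = qr(A');             % QR decomposition of A'
Z = Q(:, l+1:n);             % last n-l columns of Q
disp(norm(A*Z))              % verification: should be (numerically) zero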
"CACSD using Optimization Methods" PhD Thesis A.C.W.Grace Univ. of Wales, Bangor. 1989
PART 2: Optimization Quadratic Programming Solution
The value of the objective function at the next iterate can thus be written as a function of p:

f(p)_{k+1} = \frac{1}{2} ( x_k + Z_k p )^T H ( x_k + Z_k p ) + c^T ( x_k + Z_k p ).   (2.34)

Differentiating with respect to p yields

\nabla f(p)_{k+1} = Z_k^T H Z_k p + Z_k^T ( H x_k + c ).   (2.35)

\nabla f(p)_{k+1} is referred to as the projected gradient of the objective function since it is the gradient projected in the subspace defined by Z_k. The term Z_k^T H Z_k is called the projected Hessian. Assuming the Hessian matrix, H, is positive definite (which is the case in this implementation of SQP), the minimum of the function f(p)_{k+1} in the subspace defined by Z_k occurs when \nabla f(p)_{k+1} = 0, which is the solution of the system of linear equations

Z_k^T H Z_k p = -Z_k^T ( H x_k + c )
At each iteration, because of the quadratic nature of the objective function, there are only two
choices of step length. A step of unity along dk is the exact step to the minimum of the function
restricted to the null space of A k. If such a step can be taken, without violation of the constraints, then
this is the solution to QP (Prob. 2.31). Otherwise, the step along d k to the nearest constraint is less
than unity and a new constraint will be included in the active set at the next iterate. The distance to the
constraint boundaries in any direction, d_k, is given by

\alpha_i = \frac{ -( A_i x_k - b_i ) }{ A_i d_k },   i = 1, ..., m   (2.38)

which is defined for constraints not in the active set and where the direction, d_k, is towards the constraint boundary, i.e. A_i d_k > 0, i = 1, ..., m.
When n independent constraints have been included in the active set without location of the minimum, Lagrange multipliers, \lambda_k, are calculated which satisfy the non-singular set of linear equations

A_k^T \lambda_k = H x_k + c

If all elements of \lambda_k are positive, x_k is the optimal solution of QP (Prob. 2.31). However, if any component of \lambda_k is negative, and it does not correspond to an equality constraint, then the corresponding element is deleted from the active set and a new iterate is sought.
"CACSD using Optimization Methods" PhD Thesis A.C.W.Grucv Univ. of Wales, Banger. 1989 2-94
PART 2: Optimization Quadratic Programming Solution
Initialization
The algorithm requires a feasible point to start. If the current point from the SQP method is
infeasible then a feasible point can be found by solving the linear programming problem
LP Feasibility Phase

minimize_{\gamma \in \Re, \; x \in \Re^n}  \gamma

subject to  A_i x = b_i,   i = 1, ..., m_e
            A_i x - \gamma \leq b_i,   i = m_e+1, ..., m
(2.40)
where, again, the notation A_i indicates the ith row of the matrix A. A feasible point (if one exists) to Prob. 2.40 can be found by setting x to a value which satisfies the equality constraints. This can be achieved by solving an under- or over-determined set of linear equations formed from the set of equality constraints. If there is a solution to this problem, then the slack variable, \gamma, is set to the maximum inequality constraint violation at this point.
The above QP algorithm is modified for LP problems by setting the search direction to the projected steepest descent direction at each iteration,

d_k = -Z_k Z_k^T g_k

where g_k is the gradient of the objective function (equal to the coefficients of the linear objective function).
If a feasible point is found using the above LP method, the main QP phase is entered. The search direction, d_k, is initialized with a search direction, d_1, found from solving the set of linear equations

H d_1 = -g_k   (2.42)

where g_k is the gradient of the objective function at the current iterate x_k (i.e. H x_k + c). A step of unity along d_1 is the exact step to the unconstrained minimum of f(x).
If a feasible solution is not found for the QP problem, the direction of search for the main SQP routine, d_k, is taken as one which minimizes \gamma.
"CACSD using Optimization Methods" PhD Thesis A.C.W.Grace Univ. of Wales, Bangor. 1989 2-25
PART 2: Optimization Line Search and Merit Function
At each major iteration a step is taken of the form

x_{k+1} = x_k + \alpha_k d_k.   (2.43)

The step length parameter, \alpha_k, is determined in order to produce a sufficient decrease in a merit function. The merit function used by Han [15] and Powell [15], of the form

Merit Function

\psi(x) = f(x) + \sum_{i=1}^{m_e} r_i g_i(x) + \sum_{i=m_e+1}^{m} r_i \max\{ 0, g_i(x) \}

has been used in this implementation. Powell recommends setting the penalty parameter

r_i = ( r_k )_i = \max_i \left\{ \lambda_i, \; \frac{1}{2} \left( ( r_{k-1} )_i + \lambda_i \right) \right\},   i = 1, ..., m   (2.45)

This allows a positive contribution from constraints which are inactive in the QP solution but were recently active. In this implementation the penalty parameter is initially set to

r_i = \frac{ \| \nabla f(x) \| }{ \| \nabla g_i(x) \| },   i = 1, ..., m   (2.46)
"CACSD using Optimization Methods" PhD Thesis A.C.W.Grace Univ. of Wales, Bangor. 1989 2-26
PART 2: Optimization Constrained Example
Example Problem

minimize  f(x) = (x_1 - 1)^2 + (x_1 - x_2)^2 + (x_2 - x_3)^3 + (x_3 - x_4)^4 + (x_4 - x_5)^4

g_3(x):  0 \leq x_1 x_5 \leq 2
(2.47)
"CACSD using Optimization Methods" PhD Thesis A.C.W.Grace Univ. of Wales, Bangor. 1989 2-27
PART 2: Optimization Constrained Example
The optimization routine (constr) is called on an iterative basis following calculation of the
objective function and gradients. In the above example, default optimization parameters have been used
which can be overridden by entering the appropriate values in the vector PARA. In this example,
gradients are calculated using a finite difference method.
Results

The optimization terminated after 68 function evaluations.
Comparison of Results
The NAG results given in the reference manual are as follows:
After 160 function evaluations the estimate of the solution is:
x = (1.2264, 1.4150, 1.4445, 1.5000, 1.5000)
Objective function value:
f(x) = 8.6812e-02
Constraint values:
g_1(x) = 9.8595e-5
g_2(x) = -2.3336e-5
g_3(x) = 1.8396
The NAG routine uses a sequential augmented Lagrangian method; the minimization sub-problem is solved using a quasi-Newton method.
The method used in the MATLAB implementation is Sequential Quadratic Programming. The
equality constraint has been met to a greater degree and the objective function value is less than that of
the NAG routine. The number of function evaluations is significantly less than that of the NAG routine
which demonstrates the iterative efficiency of the SQP method.
The MATLAB implementation does not have the speed advantage of a compiled language. However, the flexibility of an interpretive language more than compensates for the loss in computing speed.
For example, the FORTRAN implementation for this problem in the NAG user manual has nearly 200
lines of source code compared to the 14 lines of MATLAB code.
"CACSD using Optimization Methods" PhD Thesis A.C.W.Grace Univ. of Wales, Bangor. 1989 2-28
PART 2: Optimization MULTI-OBJECTIVE OPTIMIZATION
MO

minimize_{x \in \Re^n}  f(x),   f(x) = [ f_1(x), f_2(x), ..., f_m(x) ]^T   (2.48)

It is important to note that since f(x) is a vector, then, if any of the components of f(x) are competing, there is no unique solution to this problem. Instead, the concept of non-inferiority [47] (also called Pareto optimality [46], [48]) must be used to characterize the objectives. A non-inferior solution is one in which an improvement in one objective requires a degradation of another. To define this concept more precisely, consider a feasible region, \Omega, in the parameter space x \in \Re^n which satisfies all the constraints, i.e.

\Omega = \{ x \in \Re^n : g_i(x) = 0, \; i = 1, ..., m_e; \;\; g_i(x) \leq 0, \; i = m_e+1, ..., m \}   (2.49)

This allows us to define the corresponding feasible region for the objective function space, \Lambda,

\Lambda = \{ y : y = f(x), \; x \in \Omega \}.   (2.50)
"CACSD using Optimization Methods" PhD Thesis A.C.W.Grace Univ. of Wales, Bangor. 1989 2-29
PART 2: Optimization Introduction To Multi-Objective Optimization
The performance vector, f (x) , therefore maps parameter space into objective function space as is
represented for a two-dimensional case in Fig. 2.9 below.
Fig. 2.9 Mapping from parameter space into objective function space.
Definition: A point x^* \in \Omega is a non-inferior solution if and only if for some neighbourhood of x^* there does not exist a \Delta x such that (x^* + \Delta x) \in \Omega and

f_i(x^* + \Delta x) \leq f_i(x^*),   i = 1, ..., m, with
f_j(x^* + \Delta x) < f_j(x^*)   for at least one j.
In the two dimensional representation of Fig. 2.10 the set of non-inferior solutions lie on the curve
between C and D. Points A and B represent specific non-inferior points.
Fig. 2.10 The set of non-inferior solutions in objective function space.
A and B are clearly non-inferior solution points since an improvement in one objective, f_1, requires a degradation in the other objective, f_2; i.e. f_{1B} < f_{1A}, f_{2B} > f_{2A}.
Since any point in \Omega which is not a non-inferior point represents a point at which improvement can be attained in all the objectives, it is clear that such a point is of no value. Multi-objective optimization is therefore concerned with the generation and selection of non-inferior solution points. The techniques for multi-objective optimization are wide and varied and not all of them can be covered within the scope of this thesis. However, some of the techniques are described below.
"CACSD using Optimization Methods" PhD Thesis A.C.W.Grace Univ. of Wales, Bangor. 1989 2-30
PART 2: Optimization Weighted Sum Strategy
Weighted Sum

minimize_{x \in \Omega}  F(x) = \sum_{i=1}^{m} w_i \cdot f_i^2(x)
(2.52)
The problem can then be optimized using a standard unconstrained optimization algorithm. The problem
here is in attaching weighting coefficients to each of the objectives. The weighting coefficients do not
necessarily correspond directly to the relative importance of the objectives or allow trade-offs between
the objectives to be expressed. Further, the non-inferior solution boundary may be non-convex so that
certain solutions would not be accessible.
This can be illustrated geometrically. Consider the two objective case in Fig. 2.11. In the objective function space a line, L, w^T f(x) = c, is drawn. The minimization of Prob. 2.52 can be interpreted as finding the value of c for which L just touches the boundary of \Lambda as it proceeds outwards from the origin. Selection of the weights w_1 and w_2 therefore defines the slope of L, which in turn leads to the solution point where L touches the boundary of \Lambda.

Fig. 2.11 Geometrical Representation of the Weighted Sum Method.
The aforementioned convexity problem arises when the lower boundary of \Lambda is non-convex, as shown in Fig. 2.12. In this case the set of non-inferior solutions between A and B is not available.

Fig. 2.12 Non-convex Solution Boundary.
"CACSD using Optimization Methods" PhD Thesis A.C.W.Grace Univ. of Wales, Bangor. 1989 2-31
PART 2: Optimization e-constraint method
6-constraint method
A procedure which overcomes some of the convexity problems of the weighted sum technique is the \epsilon-constraint method. This involves minimizing a primary objective, f_p, and expressing the other objectives in the form of inequality constraints

\epsilon-constraint

minimize_{x \in \Omega}  f_p(x)

subject to  f_i(x) \leq \epsilon_i,   i = 1, ..., m,  i \neq p
(2.53)
Fig. 2.13 shows a two-dimensional representation of the \epsilon-constraint method for a two objective problem.
This approach is able to identify a number of non-inferior solutions on a non-convex boundary that would not be obtainable using the weighted sum technique, for instance at the solution point f_1 = f_{1s} and f_2 = \epsilon_2. A problem with this method is, however, the suitable selection of \epsilon to ensure a feasible solution.
A further disadvantage of this approach is that the use of hard constraints is rarely adequate for
expressing true design objectives. Similar methods exist, such as that of Waltz [55] which prioritize the
objectives. The optimization proceeds with reference to these priorities and allowable bounds of
acceptance. Here the difficulty is in expressing such information at early stages of the optimization
cycle.
In order for the designer's true preferences to be put into a mathematical description, the designer would be required to express a full table of preferences and satisfaction levels for a range of objective value combinations. A procedure must then be realized which is able to find a solution with reference to this. Such methods have been derived for discrete functions using the branches of statistics known as decision theory and game theory (for a basic introduction, see [52]). Implementation for continuous functions requires suitable discretization strategies and complex solution methods. Since it is rare for the designer to know such detailed information anyway, this method is deemed impractical for most practical design problems; however, it is seen as a possible area for further research.
What is required is a formulation which is simple to express, which retains the designer's preferences and which is numerically tractable.
"CACSD using Optimization Methods" PhD Thesis A.C.W.Grace Univ. of Wales, Bangor. 1989 2-32
PART 2: Optimization Goal Attainment Method
Goal Attainment

A set of design goals, f^* = \{ f_1^*, f_2^*, ..., f_m^* \}, is associated with the set of objectives, f(x) = \{ f_1(x), f_2(x), ..., f_m(x) \}, and the problem is expressed as

minimize_{\gamma \in \Re, \; x \in \Omega}  \gamma

subject to  f_i(x) - w_i \gamma \leq f_i^*,   i = 1, ..., m
(2.54)
The term wi y introduces an element of slackness into the problem which would otherwise impose
that the goals should be rigidly met. The weighting vector, w, enables the designer to express a measure
of the relative trade-offs between the objectives. For instance, setting the weighting vector, w, equal to
the initial goals indicates that the same percentage under or over-attainment of the goals, f*, will be
achieved. Hard constraints may be incorporated into the design by setting a particular weighting factor to zero (i.e. w_i = 0). The Goal Attainment method therefore provides a convenient, intuitive
interpretation of the design problem that is solvable using standard optimization procedures.
Illustrative examples of the use of Goal Attainment method in Control System Design can be found in
Fleming [53,54].
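As an illustration of how the formulation of Eq. 2.54 can be passed to a general constrained optimizer, the following sketch treats \gamma as an additional variable; fmincon (from a later Optimization Toolbox release) is used purely as a stand-in for the SQP routine described above, and the objectives, goals and weights are invented for the example.

% Goal attainment posed as a standard constrained problem: the variable
% vector is z = [x; gamma] and the constraints are F_i(x) - w_i*gamma <= Fstar_i.
Fstar = [1; 1];  w = [1; 1];                       % goals and weights (assumed)
F   = @(x) [x(1)^2 + x(2)^2; (x(1)-2)^2];          % two competing objectives (assumed)
obj = @(z) z(3);                                   % minimize gamma
con = @(z) deal(F(z(1:2)) - w*z(3) - Fstar, []);   % inequalities, no equalities
z0  = [0; 0; 10];                                  % starting point
z   = fmincon(obj, z0, [], [], [], [], [], [], con);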
The Goal Attainment method is represented geometrically in Fig. 2.14 for the two-dimensional problem.

Fig. 2.14 Geometrical Representation of the Goal Attainment Method

Specification of the goals, \{ f_1^*, f_2^* \}, defines the goal point, P. The weighting vector defines the direction of search from P to the feasible function space, \Lambda(\gamma). During the optimization \gamma is varied, which changes the size of the feasible region. The constraint boundaries converge to the unique solution point ( f_{1s}, f_{2s} ).
"CACSD using Optimization Methods" PhD Thesis A.C.W.Grace Univ. of Wales, Bangor. 1989 2-33
PART 2: Optimization Algorithm Improvements For Goal Attainment Method
Following the argument of Brayton et al [56] for minimax optimization using SQP, using the merit function of Eq. 2.45 for the Goal Attainment problem of Eq. 2.55 gives

\psi(x, \gamma) = \gamma + \sum_{i=1}^{m} r_i \cdot \max\{ 0, \; f_i(x) - w_i \gamma - f_i^* \}.   (2.56)
When the merit function of Eq. 2.56 is used as the basis of a line search procedure then, although \psi(x, \gamma) may decrease for a step in a given search direction, the function max\{\Lambda_i\} may paradoxically increase. This amounts to accepting a degradation in the worst case objective. Since the worst case objective is responsible for the value of the objective function \gamma, a step would be accepted which would ultimately increase the objective function to be minimized. Conversely, \psi(x, \gamma) may increase when max\{\Lambda_i\} decreases, implying a rejection of a step which improves the worst case objective.
Following the lines of Brayton et al [56], a solution is therefore to set the merit function equal to the worst case objective, i.e.

\psi(x, \gamma) = \max_i \{ \Lambda_i \},   where   \Lambda_i = \frac{ f_i(x) - f_i^* }{ w_i },   i = 1, ..., m.   (2.57)
A problem in the Goal Attainment method is that it is common to use a weighting coefficient equal to zero in order to incorporate hard constraints. The merit function of Eq. 2.57 then becomes infinite for arbitrary violations of the constraints. To overcome this problem, while still retaining the features of Eq. 2.57, the merit function is combined with that of Eq. 2.45, giving

\psi(x, \gamma) = \max_i \{ \Lambda_i \} + \sum_{i : w_i = 0} r_i \cdot \max\{ 0, \; f_i(x) - f_i^* \},   i = 1, ..., m.   (2.58)
Another feature which can be exploited in SQP is the objective function, \gamma. We know from the KT equations (Eq. 2.25) that

\nabla \gamma + \sum_{i=1}^{m} \lambda_i^* \nabla ( f_i(x) - w_i \gamma - f_i^* ) = 0   (2.59)
"CACSD using Optimization Methods" PhD Thesis A.C.W.Grace Univ. of Wales, Bangor. 1989 2-34
PART 2: Optimization Algorithm Improvements For Goal Attainment Method
The gradient vector \nabla\gamma is a vector of zeros, except for the element corresponding to \gamma, which is equal to unity. Also, the elements of \nabla( f_i(x) - w_i \gamma - f_i^* ) corresponding to the gradient of \gamma are all less than or equal to zero. Therefore, assuming w > 0, we can conclude that

\sum_{i=1}^{m} \lambda_i^* w_i = 1   (2.60)

and the Lagrangian function takes the form

L(x, \lambda) = f(x) + \sum_{i=1}^{m} \lambda_i g_i(x) = \gamma + \sum_{i=1}^{m} \lambda_i ( f_i(x) - w_i \gamma - f_i^* )   (2.61)
It follows that the approximation to the Hessian of the Lagrangian, H, should have zeros in the
rows and columns associated with the variable y. By initializing H as the identity matrix this property
would not appear. H is therefore initialized and maintained to have zeros in the rows and columns
associated with y.
These changes make the Hessian, H, indefinite; therefore H is set to have zeros in the rows and columns associated with \gamma, except for the diagonal element, which is set to a small positive number (e.g. 1e-10). This allows the fast converging positive definite QP method described in Section 2.7.2 to be used.
The above modifications have been implemented as part of the Optimization Toolbox, described in Appendix A. It has been found that the above modifications make the method more robust. However, due to the rapid convergence of the SQP method, the requirement that the merit function (Eq. 2.58) strictly decreases sometimes requires more function evaluations than an implementation of SQP using the merit function of Eq. 2.45. The choice of which merit function to use is therefore left as an option for the user.
2.9 REVIEW
A number of different optimization strategies have been discussed. The algorithms used (e.g. BFGS,
Levenberg-Marquardt and SQP) have been chosen for their robustness and iterative efficiency. The choice
of problem formulation (e.g. unconstrained, least squares, constrained, minimax or multi-objective)
depends on the problem being considered and the required execution efficiency. The overall aim of this
part of the research has been to develop a set of tools which are readily accessible to control engineers
(and other workers) and which allow optimization problems to be coded in a way which is natural to
the problem at hand. The MATLAB environment has been used to implement the programs due to its
Control System Design and other utilities. This has enabled the development of an Optimization
Toolbox in which problems can be coded easily and efficiently. In Part 3 it is seen how these utilities
may be used, within the context of Control System Design, to design robust and effective controllers.
"CACSD using Optimization Methods" PhD Thesis A.C.W.Grace Univ. of Wales, Bangor. 1989 2-35
PART 2: Optimization REFERENCES
2.10 REFERENCES
Overviews of Optimization, General References
[1] Gill, P.E., Murray, W. and Wright, M.H., "Practical Optimization", Academic Press, London,
1981.
[2] Fletcher, R., "Practical Methods of Optimization", Vol. 1, Unconstrained Optimization, and
Vol. 2, Constrained Optimization, John Wiley and Sons. 1980.
[3] Brayton, R.K., Hachtel, G.D. and Sangiovanni-Vincentelli, A.L., "A survey of optimization
techniques for integrated-circuit design," Proc. of IEEE, Vol.69, No.10, pp.1334-1363, 1981.
[4] Mayne, D., Polak, E. and Sangiovanni-Vincentelli, "Computer-Aided Design via optimization: A
review," Automatica, Vol.18, pp.881-907, 1980.
[5] Nye, W.T. and Tits, A.L. "An application-oriented, optimization-based methodology for
interactive design of engineering systems," Int. J. Control, Vol.43, No.6, pp.1693-1721, 1986.
"CACSD using Optimization Methods" PhD Thesis A.C.W.Grace Univ. of Wales, Bangor. 1989 2-36
PART 2: Optimization REFERENCES
[21] Hock, W. and Schittkowski, K., "A comparative performance evaluation of 27 nonlinear
programming codes," Computing, Vol.30, pp.335, 1983.
[22] Crowder, H., and Saunders, P.B., "Reporting computational experiments with mathematical
software," Math. Programming, Vol.5, pp.316-319, 1978. (Also in A.C.M. Trans. Math.
Software, Vol.5, pp.193-203, 1979.)
[23] Lenard, M.L., and Minkoff, M. "Randomly generated test problems for positive definite
quadratic programming," A.C.M. Trans. Math. Software, Vol.10, pp.86-96, 1984.
[24] Lootsma, F.A., "Comparative performance evaluation: Experimental design and generation of test
problems in non-linear optimization," Computational Mathematical Programming (NATO ASI
Series),pp, 249-260, K.Schittowski (ed.) Springer, 1985.
[25] Minkoff, M. "Methods of evaluating nonlinear programming," in Nonlinear Programming 4, ed.
O.L. Mangasarian, Academic Press, pp.519-548, 1981.
[26] Mulvey, J.M. (ed.), "Evaluating Mathematical Programming Techniques," Springer Verlag,
1982.
LP and QP methods
[27] Dantzig, G., "Linear programming and extensions", Princeton University Press, Princeton, 1963.
[28] Wolfe, P., "The simplex method for quadratic programming," Econometrica, Vol.27 pp.382-398,
1959.
[29] Gill, P.E., Murray, W., Saunders, M.A. and Wright, M.H. "Procedures for optimization
problems with a mixture of bounds and general linear constraints," ACM Trans. Math.
Software, Vol.10, pp.282-298, 1984.
"CACSD using Optimization Methods" PhD Thesis A.C.W.Grace Univ. of Wales, Bangor. 1989 2-37
PART 2: Optimization REFERENCES
[41] Drud, A., "CONOPT: A GRG code for large sparse dynamic nonlinear optimization problems",
Math. Program., Vol.27, pp.153-191, 1985.
[42] Ecker, J.G., and Kupferschmid, "An ellipsoid algorithm for nonlinear programming," Math.
Program., Vol.27, pp.83-106, 1983.
[43] Lasdon, L.S., and Waren, A.D., "GRG2 User's Guide," CIS-86-01. Dept. of Computer
Information Science, Cleveland State Univ., Cleveland, Ohio, USA, 1986.
[44] Nye, W., Polak, E., Sangiovanni-Vincentelli, A. and Tits, A., "DELIGHT: An optimization-
based Computer-Aided-Design system," Proc. IEEE Int. Symp. on Circuits and Systems,
Chicago, 1981.
[45] NAG Fortran Library Manual, Mark 12, Vol.4 E04UAF pp.16
Multi - Objective Optimization
[46] Censor, Y., "Pareto optimality in multiobjective problems," Appl. Math. Optimiz., vol. 4,
41-59, 1977.
[47] Zadeh, L.A., "Optimality and non-scalar-valued performance criteria," IEEE Trans. Automat.
Contr., vol. AC-8, pp. 1, 1963.
[48] Da Cunha, N.O., and Polak, E., "Constrained minimization under vector-valued criteria in finite
dimensional spaces," J. Math. Anal. Appl., Vol. 19, pp. 103-124, 1967.
[49] Mukai, H., "Algorithms for multicriterion optimization," IEEE Trans. Autom. Contr., Vol. AC-
25, pp.421-432, 1980.
[50] Gembicki, F.W., "Vector optimization for control with performance and parameter sensitivity
indices," Ph.D. Dissertation, Case Western Reserve Univ., Cleveland, Ohio, USA., 1974.
[51] Lightner, M.R., and Director S.W., "Multiple criterion optimization for the design of electronic
circuits," IEEE Trans. Circuits and Systems, Vol. CAS-28, No.3, pp.169-179, 1981.
[52] Hollingdale S.H., "Methods of operational analysis," in Newer Uses of Mathematics (James
Lighthill, ed.), Penguin Books, 1978
[53] Fleming, P.J., "Application of multi-objective optimisation to compensator design for SISO
control systems," Electronics Letters, Vol.22, No.5, pp.258-259, 1986.
[54] Fleming, P.J., "Computer-Aided Control System Design of regulators using a Multiobjective
Optimization approach," Proc. IFAC Control Applications of Nonlinear Prog. and Optim.,
Capri, Italy, pp.47-52, 1985.
[55] Waltz, F.M., "An engineering approach: Hierarchical optimization criteria," IEEE Trans., Vol.
AC-12, April, 1967, pp.179-180.
Minimax Optimization
[56] R.K.Brayton, S.W.Director, G.D.Hachtel, and L.Vidigal, "A new algorithm for statistical circuit
design based on quasi-Newton methods and function splitting," IEEE Trans. Circuits Syst., Vol.
CAS-26, pp. 784-794, Sept. 1979.
[57] Hald J., and Madsen K., "Combined LP and quasi-Newton methods for minimax optimization,"
Math. Program., Vol. 20, No. 1, pp. 49-62, 1981.
[58] Madsen K., Schjaer-Jacobsen H., and Voldby J. "Automated minimax design of networks," IEEE
Trans. Circuits and Systems, Vol. CAS-22, pp. 791-795, Oct. 1975.
[59] Madsen,K. and Schjaer-Jacobsen, H., "Singularities in minimax optimization of networks", IEEE
Trans. Circuits and Systems, Vol. CAS-23, no. 7,456-460, 1976.
[60] Madsen, K. and Schjaer-Jacobsen, H., "Algorithms for worst case tolerance optimization", IEEE
Trans. Circuits and Systems, Vol. CAS-26, Sept 1979.
Miscellaneous
[61] Nelder, J.A., and Mead, R., "A simplex method for function minimization," Computer Journal,
Vol.7, pp.308-313, 1965.
"CACSD using Optimization Methods" PhD Thesis A.C.W.Grace Univ. of Wales, Bangor. 1989 2-38
P r3
(11
Summary - Control System Design (CSD) methods which are used in conjunction with the
optimization techniques described in Part 2 are presented. Integral quadratic measures are used
as the primary measure of system performance. Problem formulations have been developed to
cover a large number of design options and disturbance types. In particular, a method of
incorporating control derivative energy terms is used in order to avoid excessive actuator rate
saturation. A method for the design of servomechanisms is also presented using a general feed-
forward/feedback two-degree-of-freedom control structure. Application of multi-objective
optimization to CSD is presented as part of an evolutionary and interactive design process.
Examples are given which demonstrate these techniques.
"CACSD using Optimization Methods" PhD Thesis A.C.W.Gratx Univ. of Wales, Bangor. 1989 3-1
PART 3: Control System Design INTRODUCTION
3.1 INTRODUCTION
We consider here systems or sub-systems which can be described by continuous variables and modelled by differential equations. CSD will therefore be considered in terms of a mathematical description or model. The advantage of being able to model the system, over an on-line technique such as self-tuning or adaptive control, is that the CSD procedure may be tested and tuned within the safety of the model before implementation. The speed advantage over on-line adaptive controllers allows a wider set of control parameters to be considered and additional system information to be utilized. A drawback of basing the design on a modelled system arises when the system is subject to uncertain change, such as aging effects. However, if bounds can be expressed on these uncertainties then they can be compensated for by the incorporation of robustness measures or sensitivity reduction in the CSD procedure.
Handling uncertainty and improving system performance further requires the use of better control structures and design techniques. To achieve this, it seems likely that there will be a merging of off-line and on-line techniques. This means that multi-loop adaptive control might form the basis of an otherwise fixed-gain or gain-scheduled global controller. This requires incorporation of
effective control algorithms and more realistic design criteria. Thus a control strategy, utilizing
improved control structures such as a feed-forward/feedback dynamic controller, and incorporating
multi-objective performance objectives, is the means to providing robust, efficient and effective control.
This part will therefore focus on the improvement of control structure and the use of multiple
performance objectives in the design method. It is felt that off-line linear CSD strategies will continue
to form an integral part of future CS]) methods as the foundation for gain-scheduled and adaptive
algorithms which can tackle non-linearities and uncertainties. Linear CSD will therefore be considered
and envisaged as part of a gain-scheduled or adaptive control strategy which is capable of handling non-
linearities, uncertainties, multiple modes of operation and differing operating conditions.
Integral quadratic measures of performance will be used extensively. This is motivated by the well-
established numerical solution of such problems and the flexibility with which this criterion can be
transformed to incorporate broader control objectives such as stability, speed of response, reduction of
interaction in multivariable systems, sensitivity reduction, actuator limits, integrity with respect to sensor
or actuator failure, as well as differing operating conditions and disturbance types. It is felt that due to
the plethora of design choices in terms of controller structures, design objectives and operating
conditions it is important to base the Control Design Method on a well established and numerically
robust framework in which the problem can be solved. Having laid the foundation for this framework,
techniques will be developed for covering other performance objectives, and for incorporating these in a
multiple-objective design strategy.
In order to tackle complex problems, the concept of design by evolution will be introduced. In this
application, this concerns the systematic increase in both control order and problem complexity in order
to incorporate more appropriate control structures and a wider set of design objectives. This facilitates a
structured approach to CSD and has distinct computational advantages. Moreover, insight is gained with
regard to the trade-offs, associated with control order, and conflicting requirements for control energy
restrictions and performance improvement. A number of design examples are given which illustrate this
approach focussing on servo design using a two-degree-of-freedom control structure and multi-objective
performance criteria.
"CACSD using Optimization Methods" PhD Thesis A.C.W.Grace Univ. of Wales, Bangor. 1989 3-2
PART 3: Control System Design LINEAR QUADRATIC REGULATOR
The plant to be controlled is assumed to be described by the linear time-invariant state space equations

\dot{x}_p = A_p x_p + B_p u_p
y_p = C_p x_p
(3.1)

where A_p, B_p, C_p are real constant matrices and x_p (x_p(t) \in \Re^n) is termed the state vector of the system, and u_p (u_p(t) \in \Re^m) and y_p (y_p(t) \in \Re^r) are the inputs and outputs of the system respectively.
The plant is represented pictorially in Fig. 3.1 below. As in Part 2, the notation of boldface lower-
case letters for vectors and boldface upper-case letters for matrices will be used where possible.
Fig. 3.1 The plant, with inputs u_p and outputs y_p.
In this Section we will be concerned with parametric Linear Quadratic problems in which a parametrized controller (i.e. a controller composed of a number of design parameters) is optimized with respect to a performance objective consisting of integral quadratic measures. A common form of this is the basic Linear Quadratic Regulator (LQR) problem, which is concerned with finding a control u_p to minimize a cost function, J, composed of integral squared error terms, i.e.

Linear Quadratic Regulator Problem

minimize_{u_p \in \Re^m}  J,   where   J = \int_0^\infty ( x_p^T Q_p x_p + u_p^T R_p u_p ) \, dt
(3.2)
"CACSD using Optimization Methods" PhD Thesis A.C.W.Grace Univ. of Wales, Bangor. 1989 3-3
PART 3: Control System Design CONTROLLER STRUCTURES
The introduction of the term R_p limits the control energy to the input of the system. If this were not present then the real parts of the eigenvalues of the closed loop system might tend to -\infty, making the control unrealizable.
In order to make the control realizable for a number of initial conditions the control u is
generally parametrized as a function of the outputs or states using a matrix of control gains. The way in
which the control gains are parameterized gives rise to a number of different control structures and
problem formulations.
For full state feedback the control is parametrized as

u_p^* = K_s x_p   (3.3)

The solution to this problem is well known and is found, for example, in [3]:

K_s = -R_p^{-1} B_p^T P   (3.4)

where P is an n x n symmetric matrix which is the unique positive semi-definite solution of the algebraic Riccati equation

A_p^T P + P A_p - P B_p R_p^{-1} B_p^T P + Q_p = 0   (3.5)
"CACSD using Optimization Methods" PhD Thesis A.C.W.Grace Univ. of Wales, Bangor. 1989 3-4
PART 3: Control System Design CONTROLLER STRUCTURES
It should be noted that the optimal controller, K_s, is not initial condition dependent, that is, K_s will be optimal for any set of initial conditions. This point is emphasized because other forms of controller are initial condition dependent (including observer based designs and output feedback based designs). The implementation of the full state controller has the disadvantage, however, that the states must be available for feedback, which is not always possible. The following formulations overcome this. For output feedback the control is parametrized as

u_p = K_o y_p   (3.6)

which minimizes the cost function in Prob. 3.2. The controller and plant are represented in Fig. 3.3 below. For a dynamic output feedback controller the control takes the form

u_p = D_c y_p + C_c x_c
\dot{x}_c = B_c y_p + A_c x_c
(3.7)

where x_c is the compensator state vector (x_c(t) \in \Re^{n_c}). The controller is represented in Fig. 3.4.
"CACSD using Optimization Methods" PhD Thesis A.C.W.Grace Univ. of Wales, Bangor. 1989 3-5
PART 3: Control System Design CONTROLLER STRUCTURES
This can be converted to an equivalent output feedback problem by forming the following set of closed loop equations:

A = \begin{bmatrix} A_p & 0 \\ 0 & 0 \end{bmatrix},  B = \begin{bmatrix} B_p & 0 \\ 0 & I \end{bmatrix},  K = \begin{bmatrix} D_c & C_c \\ B_c & A_c \end{bmatrix},  C = \begin{bmatrix} C_p & 0 \\ 0 & I \end{bmatrix},  x = \begin{bmatrix} x_p \\ x_c \end{bmatrix}
(3.8)
The cost function to be minimized is taken as the same as in Prob. 3.2, i.e.

J = \int_0^\infty ( x_p^T Q_p x_p + u_p^T R_p u_p ) \, dt.   (3.9)

In the original problem formulation Johnson and Athans [8] considered a different cost function of the form

J = \int_0^\infty \left( x_p^T Q_p x_p + y_p^T ( D_c^T R_p D_c + B_c^T R_c B_c ) y_p + x_c^T ( C_c^T R_p C_c + A_c^T R_c A_c ) x_c \right) dt.   (3.10)
The inclusion of the second and third terms in Eq. 3.10 penalizes the elements of the controller in order to limit the control gains. However, since the object of limiting the control gains is to limit the control energy in order to avoid excessive actuator saturation, it is argued that the term \int_0^\infty u_p^T R_p u_p \, dt in Prob. 3.2 is sufficient to minimize this. Obviously this condition may not be sufficient if the elements of R_p are negative or are all zero. It is therefore necessary to impose the conditions that the elements of R_p are positive and that the diagonal elements are finite. The system should also be controllable and observable. There may be some rare cases where, although the integral
should also be controllable and observable. There may be some rare cases where, although the integral
quadratic control energy is low, the elements of the controller are apparently high; however, the
realization of large control values is not usually a problem using modern electronics. Further, since the
state-space description is non-unique, scaling of the matrices may be applied. The advantage of using the
cost function in Eq. 3.9 is that it simplifies the equations. Further, Eq. 3.9 can be related to the control
energy in more direct terms, giving the designer more insight into the precise functioning of the weighting
matrices. This is especially important in the light of the criticism of LQR methods that the weighting
matrix coefficients are difficult to choose with respect to actual design requirements.
J = ∫_0^∞ x^T ( Q + C^T K^T R K C ) x dt = ∫_0^∞ x^T Q̄ x dt.   (3.12)
The construction of the matrices A, B, C, Q, R and x is dependent on the control structure being used
(cf. Eq. 3.6 and Eq. 3.8).
The solution to this problem is a set of non-linear matrix equations which can be derived along the
lines of Levine and Athans [6] and Kosut [7]. The preferred approach is to use a method originally employed
by Mendel [17] and Newmann [14]. This involves calculating the cost function and gradients explicitly.
From Eq. 3.11 and Eq. 3.12 it is well known that the cost functional, J, is given by:
J = tr(PX),   (3.13)
where X_0 = E(x_0 x_0^T) and X = X_0 when there are no disturbances acting on the system.
Here P satisfies the Lyapunov matrix equation:
P Ā + Ā^T P + Q̄ = 0   (3.14)
(i.e. P(A + BKC) + (A + BKC)^T P + Q + C^T K^T R K C = 0).
"CACSD using Optimization Methods" PhD Thesis A.C.W.Grace Univ. of Wales, Bangor. 1989 3-7
PART 3: Control System Design DISTURBANCE REJECTION
Gradient Calculation
While Eq. 3.13 and Eq. 3.14 allow the explicit calculation of J, gradient calculations enable the
efficient solution of Prob. 3.2 using an unconstrained gradient optimization method. The problem can be
considered as an equality constrained problem composed of the cost function in Prob. 3.2 and a set of
equality constraints (Eq. 3.14). Forming the Lagrangian function, where Λ is a symmetric matrix of
Lagrangian multipliers, the necessary conditions for a minimum are given by (cf. Mendel [17]):
∇L(Λ) = ∇L(P) = ∇L(K) = 0.   (3.16)
The partial derivatives ∇L(Λ), ∇L(P), ∇L(K) can be calculated using gradient matrix operations (see
[25-27] and Appendix B), giving:
∇L(Λ) = Ā Λ + Λ Ā^T + X.   (3.17)
It should be noted that ∇L(P) is identical to Eq. 3.14; also, ∇L(Λ) is a Lyapunov matrix equation which
can be calculated efficiently using the Schur form of Ā used in the calculation of Eq. 3.14 (see, for
instance, [23]). The gradient matrix ∇L(K) is equivalent to ∇J(K) since Eq. 3.14 is solved at each
iteration.
The advantages of calculating the cost, J, and gradient, ∇J(K), explicitly are twofold. Firstly, it
facilitates the use of a well-established gradient optimization method (e.g. BFGS). Secondly, the cost function
may be used in another problem formulation, such as in a multi-objective strategy.
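As an illustration, the MATLAB sketch below evaluates J = tr(PX) of Eq. 3.13 for a given gain K by solving the Lyapunov equation (3.14); the names A, B, C, Q, R and X0 are assumptions standing in for the augmented matrices of Eq. 3.8 and Eq. 3.12. The gradient ∇J(K) can then be obtained from the expressions above or, as noted later in this part, by finite differences.

% Minimal sketch (A, B, C, Q, R, X0 assumed available; K is the current gain).
Abar = A + B*K*C;              % closed-loop matrix
Qbar = Q + C'*K'*R*K*C;        % augmented weighting, Eq. 3.12
P    = lyap(Abar', Qbar);      % solves Abar'*P + P*Abar + Qbar = 0, i.e. Eq. 3.14
J    = trace(P*X0);            % cost functional, Eq. 3.13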
"CACSD using Optimization Methods PhD Thesis A.C.W.Grace Univ. of Wales, Bangor. 1989 3-8
PART 3: Control System Design DISTURBANCE REJECTION
Output Disturbance
A disturbance at the output as shown in Fig. 3.5 cannot be directly handled by setting constant
initial conditions on the plant. This is because the disturbance acts initially on the controller so that the
initial conditions on the plant and controller states are dependent on the controller itself. We must
therefore add terms to the gradient, ∇J(K), and to the initial condition matrix X_0.
"CACSD using Optimization Methods" PhD Thesis A.C.W.Grace Univ. of Wales, Bangor. 1989 3-9
PART 3: Control System Design Stochasitc Problem
X_d = E(x_d x_d^T),   Y_d = E(y_d y_d^T),   (3.21)
where x_d = [ x_d0 ; 0_{n_c} ],   y_d = [ 0_n ; y_d0 ].
Since the disturbances are independent, X in Eq. 3.13 can be augmented as follows:
X = X_0 + X_d + B K Y_d K^T B^T.   (3.22)
Thus, using gradient matrix operations and the formulas given in Appendix B, a new gradient expression
can be found.
This problem has been considered by Kuhn and Schmidt [20] using a different approach but resulting in
the same equations.
y_p = C_p x_p + y_d0,   (3.25)
where in this case x_d0 and y_d0 (cf. Eq. 3.21) are zero-mean white-noise processes with
E{ x_d(t) x_d^T(t+τ) } = X_s δ(τ),
E{ y_d(t) y_d^T(t+τ) } = Y_s δ(τ),
E{ x_d(t) y_d^T(t+τ) } = 0,   (3.26)
where x_d = [ x_d0 ; 0_{n_c} ],   y_d = [ 0_n ; y_d0 ].
The cost function of concern is of the form
J_s = lim_{t→∞} E( x^T Q x ),   (3.27)
which can be evaluated as
J_s = tr( P_s Q ),   (3.28)
where P_s is the solution of the corresponding Lyapunov matrix equation.
There is thus a close correspondence between the stochastic problem and the deterministic problem. The
problem may be solved using the same procedures as described for the deterministic problem. By
augmenting the matrices Q and X it is even possible to consider a plant with both stochastic and
deterministic disturbances.
¹ Set X_p0 = I to minimize the cost for an average set of initial conditions (see Kleinman and Athans (1968) [26]).
"CACSD using Optimization Methods" PhD Thesis A.C.W.Grace Univ. of Wales, Bangor. 1989 3-11
PART 3: Control System Design Disturbance Modeling
xdO o, Yd0
Fig. 3.6 Augmenting The Plant Matrices To Include Disturbance Models and a plant D matrix
"CACSD using Optimization Methods" PhD Thesis A.C.W.Grace Univ. of Wales. Bangor. 1989 3-12
Canonical Form
where the matrices C_c and D_c are free to vary. For examples of this canonical form, for both SISO and
MIMO systems, refer to the design examples in Section 3.10. In order to fix the necessary elements of
the control matrices a set of masking matrices is used, so that any elements of A_c, B_c, C_c and D_c can
be fixed or free to vary. This allows the controller to be set to any appropriate form, such as a
decentralized control structure with each output feeding back independently. The reason for using this
canonical description, as opposed to any other, is to facilitate the mapping of other controllers into
this form using similarity transformations. This enables other controllers to be used as starting values
for the optimization procedure, which may assist in fast convergence and the avoidance of local minima.
[Fig. 3.7: matrix augmentations of A_c, B_c, C_c and D_c mapping a controller of order n_c to order n_c+1, using the row vector E_m and the scalar p]
"CACSD using Optimization Methods" PhD Thesis A.C.W.Grace Univ. of Wales, Bangor. 1989 3-13
PART 3: Control System Design Evolutionary Controller Mapping
where E_m is a row vector of ones, (1, 1, ..., 1), of size m. The scalar p should be chosen as any finite
negative number (e.g. −10); it represents the position of the additional pole on the real axis. In the
s-domain, the above mapping corresponds to making a series connection of a unity gain controller whose
poles and zeros are cancelled.
Having performed the mapping, it is then necessary to transform the new controller to the required
canonical form prior to optimization. This can be achieved using similarity transformations based on
forming the controllability matrix of A_c, B_c:
C_o = [ B_c  A_c B_c  A_c^2 B_c  ...  A_c^{n_c-1} B_c ].   (3.32)
A canonical form can then be obtained using the following similarity transformations:
A_t = C_o\(A_c C_o),   B_t = C_o\B_c,   C_t = C_c C_o,   D_t = D_c,   (3.33)
where the operator C_o\A_c indicates a non-zero solution, F, to the equation C_o F = A_c. In the case
when m = 1 (i.e. SISO or SIMO systems) the matrices A_t, B_t, C_t, D_t will be in the required canonical
form; otherwise certain rows and columns of the matrices must be removed, corresponding to the
rows in the matrix A_t which contain all zero elements.
When m>1 (i.e. MIMO or MISO system), it has been found, for some systems, that the required
canonical form cannot always be achieved using this procedure. In this case different values for the
variable p in Fig. 3.7 are tried until the required canonical form is found. Further research is necessary
to discover whether a more robust method can be found for performing this task.
To summarize, incrementing the controller matrices A_c, B_c, C_c and D_c involves the following
steps:
(1) Map the lower-order controller into the higher-order controller using the matrix augmentations given in
Fig. 3.7.
(2) Using similarity transformations (Eq. 3.32 and Eq. 3.33), put A_c, B_c and C_c in the required
canonical form (a sketch of this step is given below).
(3) For MIMO or MISO controllers, remove the rows and columns of A_t, B_t, C_t (Eq. 3.33) which
correspond to rows of zeros in A_t. If the controller is still not in the required form, try a different
value for p in Fig. 3.7 and go to Step (1).
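A MATLAB sketch of Step (2) is given below; it uses the Control System Toolbox function ctrb (not the thesis software) for the controllability matrix of Eq. 3.32, assumes Ac, Bc, Cc and Dc are the mapped controller matrices from Step (1), and does not show the MIMO row/column removal of Step (3).

% Hedged sketch of the similarity transformation of Eq. 3.33.
Co = ctrb(Ac, Bc);        % controllability matrix of (Ac, Bc), Eq. 3.32
At = Co\(Ac*Co);          % transformed controller matrices
Bt = Co\Bc;
Ct = Cc*Co;
Dt = Dc;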
"CACSD using Optimization Methods" PhD Thesis A.C.W.Cgace Univ. of Wales, Bangor. 1989 3-14
PART 3: Control System Design Control Derivative Measures
Introduction
The incorporation of the integral term ∫_0^∞ u_p^T R_p u_p dt in Prob. 3.2 reduces the amount of
control energy applied to the actuators or inputs of the system. This is required since all actuators or
inputs are limited in terms of the amount of energy that can be transferred to the system.
In a practical system the actuator limits generally occur in the form of magnitude and rate limits.
Magnitude limits are caused by a maximum absolute value of the input that can be applied to the system
before saturation occurs. For instance, an accelerator pedal in a car cannot be pressed past a fixed point.
Rate limits are caused by a maximum rate of change that can be applied to the system before saturation
occurs. In the car example, rate limits occur during acceleration when pressing the accelerator down
faster than a certain rate does not make the car accelerate any faster.
Actuator saturation of this type can cause non-linear effects in the system which can lead to
instability and performance degradation. Whilst in many cases driving the actuators to these limits may
produce a system with better system characteristics than a system whose actuation is under-utilized
(i.e. which never saturates) the amount of saturation should be controlled so that non-linear effects do
not undermine the performance characteristics of the linear model.
J_d = ∫_0^∞ ( u̇_p^T S_p u̇_p ) dt,   (3.34)
where u̇_p indicates the rate of change of the control input with respect to time.
The addition of this term in the cost function has the effect of limiting the derivative control
energy and can be used to reduce the amount of actuator rate limit saturation. Eq. 3.34 is used to form
the augmented cost function:
J = ∫_0^∞ ( x_p^T Q_p x_p + u_p^T R_p u_p + u̇_p^T S_p u̇_p ) dt.   (3.35)
The addition of this extra term in the cost function requires additional terms in the general LQR
"CACSD using Optimization Methods" PhD Thesis A.C.W.Grace Univ. of Wales, Bangor. 1989 3-15
PART 3: Control System Design Control Derivative Measures
solution of Section 3.4.4. The additional terms make use of the equivalence:
J_d = ∫_0^∞ x^T ( C^T K^T (A + BKC)^T S (A + BKC) K C ) x dt.   (3.37)
Eq. 3.37 is still of the same form as Eq. 3.11 and Eq. 3.12, so that Q can be augmented to include
the additional terms. Alternatively, we can express the cost function explicitly for use in other problem
formulations (e.g. multi-objective), i.e.
J_d = tr( P_d X ),   (3.38)
where P_d is a solution of the corresponding Lyapunov matrix equation. The partial derivative of J_d can
be calculated using the method of Lagrangian multipliers and matrix gradient operations (see [27-29]
and Appendix B):
∇J_d(K) = 2( B^T P_d Λ_d C^T + D_d ),   (3.39)
where D_d collects the additional terms involving S_p, K, C, the closed-loop matrix (A + BKC) and Λ_d.
The above cost function may also be used in the standard problem using the following augmentation:
∇J(K) = ∇J(K) + D̄,   (3.42)
where D̄ is the corresponding additional term, formed with the weighting S and the multiplier Λ of the
standard problem.
"CACSD using Optimization Methods" PhD Thesis A.C.W.Grace Univ. of Wales, Bangor. 1989 3-16
PART 3: Control System Design Sensitivity Measures
J = tr{ S^T E S },   (3.45)
where S is a matrix of sensitivity terms and E is a weighting matrix which can be used to address
individual elements or to weight particular elements which have more importance.
The sensitivity matrices (S in Eq. 3.45) that we will be considering are with respect to the initial
condition matrix X_0, the disturbance matrices X_d, Y_d, and the plant matrices A_p, B_p, C_p. They will
be denoted ∇J(X_0), ∇J(X_d), ∇J(Y_d), ∇J(A_p), ∇J(B_p) and ∇J(C_p) respectively. The sensitivity
matrices, corresponding to S in Eq. 3.45, can be calculated using gradient operations (see Appendix B),
giving:
∇J(X_0) = P
∇J(X_d) = P
∇J(Y_d) = K^T B^T P B K
∇J(A_p) = 2 P Λ
∇J(B_p) = 2 P Λ C^T K^T + 2 P B K Y_d K^T
∇J(C_p) = 2 K^T B^T P Λ.
(3.46)
Thus, using Eq. 3.45, the cost functions are of the form
J_X0 = tr{ P E P }
J_Xd = tr{ P E P }
J_Yd = tr{ (K^T B^T P B K) E (K^T B^T P B K) }
J_A  = tr{ 4 Λ P E P Λ }
J_B  = tr{ 4 (P Λ C^T K^T + P B K Y_d K^T) E (P Λ C^T K^T + P B K Y_d K^T)^T }
J_C  = tr{ 4 K^T B^T P Λ E Λ P B K }
(3.47)
By augmenting these equations to the general problem formulation of Prob. 3.2 and using a further set
of gradient operations, it is possible to generate the gradients with respect to K, i.e. ∇J(K), where the
cost is augmented to J + J_s. Alternatively, by forming a new Lagrangian, it is possible to find ∇J_s(K)
explicitly for use, for instance, in a multi-objective problem formulation. It is also possible to derive
sensitivity matrices with respect to the derivative control measures (Eq. 3.34) and the stochastic problem
(Section 3.4.2). These are derived in a straightforward manner using gradient matrix operations (cf.
[27-29] and Appendix B).
It should be noted that when a number of cost functions are being calculated, it is sometimes more
computationally efficient to calculate gradients using a finite difference method. For poorly conditioned
matrices, this approach may also be more accurate.
"CACSD using Optimization Methods" PhD Thesis A.C.W.Grace Univ. of Wales, Bangor. 1989 3-17
PART 3: Control System Design Servomechanisms
3.6 SERVOMECHANISMS
The purpose of this section is to show the proper formulation of a class of linear control problems
known as tracking or servomechanism problems using integral quadratic measures of control.
Athans [30] has considered the design of PID controllers. Bernstein and Haddad (1987) [36] have
considered tracking for constant-gain output feedback controllers. We extend this here to the
generalized two-degree-of-freedom feedforward/feedback controller (with or without integral action).
Davison [32-34] has also considered the servomechanism problem with Full State Feedback but without
consideration to limiting the control energy. We show here how control energy may be limited in order
to avoid actuator saturation by the inclusion of special terms in the cost function.
Recently Arstein and Leizarowitz [35], and Choek et al. [37] have also considered the
servomechanism problem for the full state feedback case (or using state estimators). We assume here
that not all the states are available for feedback.
The general problem can be described as follows:
Problem Definition: Given the plant description of Eq. 3.1, a reference input y_ref, which is in the
form of a set of step responses (or filtered steps), must be tracked by the outputs y_p. The problem can
be stated as follows:
LQ Servo Problem
minimize (over the free controller gains) the cost
J = ∫_0^∞ [ (y_ref − y_p)^T Q (y_ref − y_p) + (u_ref − u_p)^T R (u_ref − u_p) ] dt,   (3.48)
where y_ref is the reference input signal to be tracked and u_ref is the corresponding input which is
required to maintain tracking of y_ref in steady state.
We do not use the term u_p^T R_p u_p in the integral since this would tend to ∞ as t → ∞. This is because
for most systems it is necessary to apply a constant reference input to the control inputs in order to
maintain tracking.
The approach we adopt here is to consider the problem in two parts: first, to design a controller to
maintain u_ref so that y_ref is tracked as t → ∞. This requires the control structure to include either some
form of integral action or a constant-gain precompensator applied to the system so that the d.c. gain
of the closed-loop system is unity (or, for multivariable systems, the unit matrix I). Second, to
minimize Eq. 3.48 with respect to the control structure being considered.
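For the precompensator option a minimal sketch is given below; Gcl is assumed to be a (stable) LTI model of the closed-loop system from the reference input to y_p, built for example with the Control System Toolbox, which is not the software used in the thesis.

% Hedged sketch: constant-gain precompensator giving unity (identity) d.c. gain.
Pp = inv(dcgain(Gcl));      % requires dcgain(Gcl) to be square and non-singular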
"CACSD using Optimization Methods" PhD Thesis A.C.W.Grace Univ. of Wales. Bangor. 1989 3-18
PART 3: Control System Design Feedback/Feedforward Controller
ẋ_c = A_c x_c + B_cfb y_p + B_cff y_ref
u_p = C_c x_c + D_cff y_ref + D_cfb y_p   (3.49)
"CACSD using Optimization Methods" PhD Thesis A.C.W.Grace Univ. of Wales. Bangor. 1989 3-19
PART 3: Control System Design Integral Action
Integral Action
Integrators have long been used in control systems as a means of ensuring robust steady-
state tracking. The addition of integral action to a system ensures that the system tracks any constant
reference input provided that the closed-loop system remains stable. For the multivariable case, a set of
m integrators (where m is the number of reference inputs) will eliminate steady state interaction
between inputs.
The addition of integral action does have some disadvantages, however, since it may have a
destabilizing effect on the plant and cause a degradation of time response or frequency characteristics.
There is generally a trade-off between the amount of integral action that is applied to the system and
the transient performance that is required.
In order to incorporate integral action the control structure shown in Fig. 3.9 is used.
Fig. 3.9 Two Degree of Freedom Control Structure with Integral Action
To incorporate integral action into the generalized 2DF controller description of Eq. 3.49 the
control matrices A_c, B_cfb, B_cff and C_c are replaced by the augmented matrices
[ A_c  0 ; 0  0_m ],   [ B_cfb ; -I_m ],   [ B_cff ; I_m ],   [ C_c  C_ci ],   (3.50)
respectively.
When used as part of an optimization procedure the elements corresponding to the integral feedback
are not free and must be fixed.
"CACSD using Optimization Methods" PhD Thesis A.C.W.Grace Univ. of Wales. Bangor. 1989 3-20
PART 3: Control System Design SERVOMECHANISMS
The amount of integral action is controlled by the matrix C_ci. This matrix can be allowed to vary
freely with respect to any performance criteria being used, e.g. Eq. 3.48. However, C_ci should have
bounds on the absolute value of its diagonal elements so that they do not go below a certain threshold
value. This threshold value should be chosen to ensure adequate tracking is achieved even in the presence
of modelling errors, non-linearities or aging effects.
The closed-loop system may be written as ẋ = Ā x + B̄ y_ref, where Ā and B̄ are the closed-loop system
matrices: Ā = A + BKC, B̄ = [ B_p D_cff ; B_cff ], and y_ref is a set of step responses applied at t = 0.
In order to minimize the cost function of Eq. 3.48 an equivalent deterministic (LQR) regulator
problem is found by considering final values and initial values of the system. Consider initially the
first part of the integral:
∫_0^∞ (y_ref − y_p)^T Q (y_ref − y_p) dt.   (3.52)
The aim is to find the initial and final values of the closed loop system and to rewrite the problem
in terms of an equivalent LQR problem. Assuming that the closed-loop system is stable, then as
t → ∞, ẋ → 0. Therefore, the final value of x (as t → ∞), x_fv, can be defined by:
0 = Ā x_fv + B̄ y_ref.   (3.53)
J = ∫_{0+}^∞ ( Ā^{-1} B̄ y_ref − x_p )^T C_p^T Q C_p ( Ā^{-1} B̄ y_ref − x_p ) dt.   (3.54)
"CACSD using Optimization Methods" PhD Thesis A.C.W.Grace Univ. of Wales. Bangor. 1989 3-21
-
▪
Proof
It is proved here that the cost functions Eq. 3.52 and Eq. 3.54 are identical. An equivalent
deterministic problem is also derived.
First define a state vector of the form:
x_s = Ā^{-1} B̄ y_ref − x_p.   (3.55)
The cost function may then be written as
J = ∫_0^∞ x_s^T C_p^T Q C_p x_s dt,   (3.56)
and x_s → 0 as t → ∞.
Since y_ref is a step response, ẏ_ref = 0 for all t > 0, and the state equation may be rewritten in terms of x_s.
The cost function is therefore equivalent to a system which is released from rest and whose initial
conditions are given by:
x_0 = −Ā^{-1} B̄ y_ref.   (3.60)
Now consider the second part of the integral:
∫_0^∞ (u_ref − u_p)^T R (u_ref − u_p) dt,   (3.61)
which can be treated in the same way, since u_p may be expressed as a function of x_p and x_c. It should
be noted that u_ref is arbitrary here and is included in the cost function so that (u_ref − u_p) → 0 as t → ∞.
"CACSD using Optimization Methods" PhD Thesis A.C.W.Grace Univ. of Wales, Bangor. 1989 3-22
PART 3: Control System Design SERVOMECHANISMS
Summary
The evaluation of the cost function (Eq. 3.48) for the plant description (Eq. 3.1) is therefore
equivalent to a deterministic problem in which the system is released from rest and whose initial
conditions are given by Eq. 3.60.
Since the initial conditions are dependent on the augmented closed-loop plant matrix, Ā, where
Ā = A + BKC, the cost functional, J = tr(PX), is dependent on K. Assuming that the step responses
occur independently, i.e. E(y_ref) = 0, and that there are independent plant disturbances, E(x_d0) = 0, and
output disturbances, E(y_d0) = 0, then the matrix X in the general problem formulation (Section 3.3.4)
should be set according to the corresponding augmentation (cf. Eq. 3.22).
Using gradient matrices and the method of Lagrangian multipliers, the corresponding augmented gradient
term can also be derived.
Typically for a multivariable system the step response matrix, Y_ref = E(y_ref y_ref^T), should be set
to the identity matrix, I. This assumes that the step responses occur at intermittent intervals of
time and after the transients from previous step responses have died away.
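A hedged MATLAB sketch of this construction is given below. Consistent with Eq. 3.60 and Eq. 3.13, it assumes that the reference contribution to X is formed from the equivalent initial condition with Yref = E(yref yref') taken as the identity; the exact augmentation used in the thesis may include further disturbance terms, so this is a sketch under stated assumptions only.

% Sketch only: Abar, Bbar are the closed-loop matrices defined earlier,
% P solves Eq. 3.14 and Yref = eye(m) for independent unit steps.
x0 = -(Abar\Bbar);            % equivalent initial condition per unit step, cf. Eq. 3.60
X  = x0*Yref*x0';             % reference contribution to the matrix X
J  = trace(P*X);              % cost functional, Eq. 3.13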
"CACSD using Optimization Methods" PhD Thesis A.C.W.Grace Univ. of Wales, Bangor. 1989 3-23
PART 3: Control System Design Servo Control Derivative Measures
The addition of the term
J_d = ∫_0^∞ u̇_p^T S_p u̇_p dt   (3.65)
helps to minimize actuator rate saturation. When applying this to the servomechanism problem there is
an additional term in the solution procedure caused by direct coupling of the input to the plant through
the component D_cff, since u_p is given by:
u_p = C_c x_c + D_cff y_ref + D_cfb y_p.   (3.66)
Assuming y_ref is defined in terms of the expected value of a set of step responses, i.e.
Y_ref = E(y_ref y_ref^T), then the contribution of D_cff to the cost function (3.65) is given by
tr( Y_ref D_cff^T S_p D_cff ).   (3.67)
"CACSD using Optimization Methods" PhD Thesis A.C.W.Grace Univ. of Wales. Bangor. 1989 3-24
PART 3: Control System Design OBSERVER DESIGN
[Fig. 3.10: 2DF observer structure, showing the plant with disturbances x_d0 and y_d0, the plant state x_p and the estimation error x_err]
The aim is to minimize x_err, given statistical information about the disturbances x_d0 and y_d0. The
disturbances x_d0 and y_d0 may be modelled as Gaussian or impulse type disturbances as for the LQR and
LQG design approaches. If the disturbances are modelled as filtered disturbances then the matrix
augmentations given in Section 3.5.3 can be used. The observer may also be designed with respect to
statistical information regarding the input, u_p, and plant initial conditions, X_p0. The measure to be
minimized is an integral quadratic function of the estimation error.
"CACSD using Optimization Methods" PhD Thesis A.C.W.Grace • Univ. of Wales, Bangor. 1989 3-25
PART 3: Control System Design MULTI-OBJECTIVE CSD
Using the principles developed for the design of output feedback controllers it is possible to design the
observers for a range of disturbances (e.g. Gaussian or impulse) and input types (e.g. impulse, Gaussian
or step response models). Since, in the design of a steady-state Kalman-Bucy filter, no information
regarding the characteristics of the input is used, the 2DF observer in Fig. 3.10 may give improved
performance. This is because B_cff is generally set to B and D_cff is set to a matrix of zeros in the
Kalman-Bucy filter, which may not be the optimal settings given information regarding the
characteristics of u_p.
Further research is necessary to develop the problem solution strategies and matrix augmentations
for a range of disturbances and input characteristics.
"CACSD using Optimization Methods" PhD Thesis A.C.W.Grace Univ. of Wales, Bangor. 1989 3-26
PART 3: Control System Design DESIGN BY EVOLUTION
The optimization process is aided by using results from previous design phases as starting values for
the next phase. To accommodate an increase in control order and to exploit the solution points from
previous design phases, a method of pole/zero cancellation is used to map the low order controller onto
a higher order controller (see Section 3.4.6). This is then used to restart the optimization cycle.
Through this evolutionary design process the control engineer gains insight and a growing
appreciation of the system's best capabilities and of the trade-offs associated with control order and competing
design requirements. Further, since computationally expensive design goals are added later in the design
cycle, this approach serves to reduce the computational burden. Using results from previous design
phases in this way, as opposed to using arbitrary starting values, is likely to improve execution speed
and reduce the likelihood of encountering local minima.
"CACSD using Optimization Methods" PhD Thesis A.C.W.Grace Univ. of Wales, Bangor. 1989 3-27
PART 3: Control System Design DESIGN EXAMPLES: Simple Tracking Example
Consider the plant
G_p(s) = 3 / ((1+s)(3+s)).   (3.70)
We demonstrate here an evolutionary design procedure with a number of design phases. Fig. 3.12 shows
how both the controller complexity and performance measures are systematically evolved in order to
view the trade-offs between different performance measures and control structures. The design approach
which we will demonstrate here has 7 design phases. In the first phase a controller is designed using a
proportional output feedback controller using only a scalar performance measure of control and output
energy terms. In the later stages of the design a high order feedforward controller is designed using a
multi-objective optimization strategy.
[Fig. 3.12: evolution of controller complexity (proportional feedback, PI, PID and feedforward controllers) against performance measures through the design phases]
The precompensator P_p is required in order to make the closed-loop d.c. gain equal to unity so that
the output tracks the input. Using a cost function of the form
J = ∫_0^∞ [ (y_ref − y_p)^2 + (u_ref − u_p)^2 ] dt,   (3.71)
where y_ref was taken to be a unit step response and u_ref is arbitrary to ensure the integral is finite.
Starting at values of K_p = 0 and P_p = 1, a controller of K_p = −0.1973, P_p = 1.1973 was found after 8
function evaluations and 3 gradient evaluations, giving J = 1.3001. The output and step responses are shown
in Fig. 3.14 below.
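Design 1 can be checked with a few lines of MATLAB. The sketch below uses Control System Toolbox functions (not the thesis software) and assumes the control law u = K_p*y + P_p*y_ref with the plant of Eq. 3.70; the sign convention is an assumption.

% Hedged sketch of Design 1 (assumed sign convention).
Gp = tf(3, conv([1 1], [1 3]));    % plant of Eq. 3.70
Kp = -0.1973;  Pp = 1.1973;        % gains found above
T  = Pp*feedback(Gp, -Kp);         % closed loop from yref to y
dcgain(T)                          % equals 1, so the step input is tracked
step(T)                            % cf. Fig. 3.14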
It should be noted that while this controller appears to give a good output response, it demands
that the actuators reach a level of 1.1973 at t=0. Obviously in this situation such a demand would cause
actuator saturation which might cause unwanted non-linear effects. Further, the tracking depends on the
model of the plant being totally accurate since any deviation of the plant or controller parameters will
result in a loss of tracking. A better control regime is one that uses integral action.
[Fig. 3.14: output and step responses for Design 1]
Design 2 PI Controller
We now consider the PI controller, K_p + K_I/s, shown in Fig. 3.15 below.
[Fig. 3.15: PI controller in series with the plant G_p(s)]
The proportional and integral parameters K_p and K_I were found in order to minimize the cost function
in Eq. 3.71. Since the integral action has the effect of adding an open-loop pole at the origin, it was
necessary to find a stabilizing controller before the optimization could commence. This was achieved by
using a minimax optimization routine in which the eigenvalues of the closed-loop system are moved a
sufficient distance into the left half-plane. A stabilizing controller was found to be K_p = −0.2644,
K_I = 0.2807, giving J = 4.747. The optimal controller was found after 13 function evaluations and 5
gradient evaluations to be K_p = −1.7746, K_I = 1.1973 and J = 2.175. The step response is shown as the
solid line in Fig. 3.16.
Design 3 Derivative Measures
We can see that integral control places fewer demands on the actuators than Design 1.
A further measure which can be used to stop actuator rate saturation is to add derivative control
measures to the cost function. Consider a cost function of the form
J = ∫_0^∞ [ (y_ref − y_p)^2 + (u_ref − u_p)^2 + u̇_p^2 ] dt.   (3.72)
An optimal controller for this cost function was found after 14 function evaluations and 5 gradient
evaluations to be K_p = −1.1290 and K_I = 0.7804 with J = 3.1645. The step response is shown as the dashed
line in Fig. 3.16. We see how this response places less demand on the actuator rates although there is a
trade-off in the speed of response.
[Fig. 3.16: step and control responses for Designs 2 (solid) and 3 (dashed)]
Feedforward
[Block diagram: PI controller with an additional feedforward gain, F, from y_ref to the plant input]
The optimal control values, starting with F = 0 and the K_p = −1.7746 and K_I = 1.1973 of Design 2, for
the two cost functions were found to be:
Cost function 1 (Eq. 3.71): K_p = −0.24551, K_I = 2.1516e-4, F = 1.2452, J = 1.2815 after 122 function
evaluations and 60 gradient evaluations; this is shown as the solid line in Fig. 3.18. Note that this design
is approaching that of the output feedback design. This causes the slow convergence in the optimization
routine as the pole corresponding to the integral action moves towards the origin, creating a marginally
stable system. This stresses the importance of using derivative control measures or of ensuring lower
bounds on the variable K_I.
Cost function 2 (Eq. 3.73): K_p = −0.53387, K_I = 0.41699, F = 0.66392, J = 2.1006 after 17 function
evaluations and 6 gradient evaluations; this is shown as the dashed line in Fig. 3.18.
Although this controller demands that the actuators react quickly initially, it does not have the
disadvantage of the output feedback controller, in that y_ref is always tracked even in the presence of
model uncertainties and non-linear effects.
[Fig. 3.18: step and control responses for the feedforward designs (solid: cost function 1; dashed: cost function 2)]
The objectives used in the multi-objective designs are
f_1 = ∫_0^∞ (y_ref − y_p)^2 dt,   f_2 = ∫_0^∞ (u_ref − u_p)^2 dt,   f_3 = ∫_0^∞ u̇_p^2 dt.   (3.73)
For higher-order controllers we consider relaxing the control energy terms (f_2, f_3) in order to achieve a
faster speed of response. We therefore set the goals f_1* = 1, f_2* = 10, f_3* = 10. In order to achieve the
same percentage under- or over-attainment of the active objectives we set the weights w_1 = 1, w_2 = 10,
w_3 = 10 (see Eq. 2.54), resulting in the following NP problem:
Goal Attainment Formulation
minimize γ   subject to   f_i − w_i γ ≤ f_i*,   i = 1, 2, 3.
In order to stop the integral term, C_ci, tending to zero it was bounded with the constraint C_ci ≥ 0.1.
A 2DF PID controller gave a solution of f_1 = 0.8871, f_2 = 1.4691, f_3 = 8.8710 with γ = −0.1129, indicating
that at least an 11.29% improvement on the original goals was achieved. The active constraints were
reported to be f_1 and f_3, showing that the restriction on faster control is the derivative control energy term.
The solid line in Fig. 3.19 shows the time response for the 2DF PID controller.
Using the evolutionary mapping technique and canonical form described in Sections 3.5.5 and 3.5.6,
higher-order controllers were designed. Only very small improvements were achieved for each increment
in control order. For instance, a fourth-order controller gave a result of f_1 = 0.8735, f_2 = 1.5528,
f_3 = 8.7352 with γ = −0.1269, superimposed as the dashed line in Fig. 3.19. Only a 1.4% improvement
was obtained using the fourth-order controller over the PID controller; this is due to the restriction of
control derivative energy and also because the plant is non-oscillatory and of relatively low order. The
controller matrices are given in Appendix B.
Fig. 3.19 Step Responses for 2DF PID and 2DF Fifth Order Controllers
Design Examples: GVAM Design Problem
[Figure: GVAM design phases, showing controller complexity (PI, 2DF PI, 2DF PID, partial state feedback) against performance measures (control/output measures, control derivative measures, multi-objective measures)]
Design 1: PI Controller
In the first design phase a PI controller was designed using a cost function of the form
J = J_1 + J_2,   (3.75)
where J_1 is the cost function for a step demand in VKD (y_1), given by
J_1 = ∫_0^∞ [ (1 − y_1)^2 + (u_ref1 − u_1)^2 + (y_2)^2 + (u_ref2 − u_2)^2 ] dt,   (3.76)
and J_2 is the cost function for a step demand in VTKT (y_2), given by
[Figure: GVAM step and control responses]
It is clear that the addition of feedforward considerably reduces the value of the cost function. The step
responses are plotted in Fig. 3.22. We see that the addition of feedforward has the effect of increasing
the speed of response. The trade-off is in the actuator rate demands. In the next design we penalize the
rate demands by adding a control derivative term to the cost function.
Fig. 3.22 GVAM Responses Using 2DF PI Controller
"CACSD using Optimization Method? PhD Thesis A.C.W.Grace Univ. of Wales, Bangor. 1989 3-35
PART 3: Control System Design DESIGN EXAMPLES: GVAM Design Porblem
A control derivative term was added to both J_1 in Eq. 3.75 and J_2 in Eq. 3.76. Using the same 2DF PI
control structure as in Design 2, a controller was designed giving J = 2.2692.
The step responses are plotted in Fig. 3.23. There is less actuator rate demand although only a marginal
difference in the step response characteristics.
Fig. 3.23 GVAM Responses Using 2DF PI Controller and Derivative Control Measures
The step responses are plotted in Fig. 3.24. The effect of increasing the order of the controller only
marginally improves the cost function. This is because of the restriction in the control derivative
measures. In the next design we consider relaxing the control constraints by considering a multi-
objective formulation. In this design each of the eight responses in Fig. 3.24 is treated independently.
Fig. 3.24 GVAM Responses Using 2DF PID Controller
The value of γ at the solution was γ = 0.060, indicating a 6% under-achievement in the active
objectives. The active objectives are emboldened in Table 3.1 and are the barriers to further improvement
in the objectives. Step responses are plotted below in Fig. 3.25.
Fig. 3.25 GVAM Responses Using 2DF PID Controller (step responses from inputs 1 and 2, and control responses)
"CACSD using Optimization Methods" PhD Thesis A.C.W.Grace Univ. of Wales, Bangor. 1989 3-38
PART 3: Control System Design DESIGN EXAMPLES: GVAM Design Porblem
Fig. 3.26 GVAM Responses Using 2DF PID Controller
"CACSD using Optimization Methods" PhD Thesis A.C.W.Grace Univ. of Wales, Bangor. 1989 3-39
PART 3: Control System Design DESIGN EXAMPLES: F4C Nonlinear Problem
[Eq. 3.87: linearized longitudinal aircraft dynamics, with entries a_11, a_12, a_21, a_22, b_1, b_2 depending on the flight condition, driven by the control input η_c]
[Figure: flight envelope, altitude (feet) versus Mach number, showing the four flight conditions]
We consider designing a 2DF controller with integral action as depicted in Fig. 3.9. Three
performance objectives for each of the four flight conditions were used of the form
f_1 = ∫_0^∞ (θ_ref − θ)^2 dt,   f_2 = ∫_0^∞ (η_cref − η_c)^2 dt,   f_3 = ∫_0^∞ η̇_c^2 dt,   (3.88)
where f_1 is a measure of the speed of response, f_2 is a measure of actuator magnitude demands and f_3 is a
measure of actuator rate demands. Initially we design a controller using a scalar weighted sum of the
objectives for flight condition 1.
"CACSD using Optimization Methods" PhD Thesis A.C.W.Grace Univ. of Wales, Bangor. 1989 3-40
PART 3: Control System Design DESIGN EXAMPLES: F4C Nonlinear Problem
Problem: minimize { J = f_1 + f_2/10 + f_3/10 } for flight condition 1. The weights were chosen to reflect the
subsequent weights to be used in the multi-objective design.
[Table 3.3 (key to Fig. 3.24): operating point, f_1 (output), f_2 (control), f_3 (derivative control)]
A solution to this problem was found, giving J = 0.5563 and the corresponding controller gains.
[Table 3.4: objective values f_1, f_2, f_3 at each operating point]
The multi-objective design has reduced the magnitude actuator saturation for flight condition 4.
The trade-off is a degradation in the output response as shown in Fig. 3.29 and an increase in actuator
rate demands for Design 1. The active objectives are emboldened in Table 3.4; these are the barriers to
further improvement in the goals. The optimization routine reported a value of γ = 0.5814 (a 58% under-
achievement in the active objectives relative to the original goals). In order to gain improvement in the
objectives higher-order 2DF controllers were designed using the evolutionary mapping technique
described in Section 3.5.6.
Fig. 3.29 Output Responses for MO Designed PI Controller (pitch rate output response to step input)
The active objectives are emboldened in Table 3.5, indicating the barriers to further improvement in the
objectives. The optimization gave a value of γ = −0.0548, showing that at least a 5.48% improvement has
been achieved over the original goals. The controller gain matrices corresponding to Fig. 3.9 are:
A_c = [ 0 0 −11.4783 ; 1 0 −107.3589 ; 0 1 −20.2031 ],   B_cfb = [ −0.5469 ; 1.4632 ; 0.2469 ],
D_cfb = −0.6844,   D_cff = 3.7731.
Increasing the controller order again gave marginal improvement in the objectives (a 6% improvement
over the original goals was achieved using a sixth-order controller). The small improvement in
performance would not, in practice, justify the use of such a high-order controller.
Fig. 3.30 Output Response for MO Designed 3rd Order 2DF I Controller
"CACSD using Optimization Methods" PhD Thesis A.C.W.Grace Univ. of Wales. Bangor. 1989 3-43
PART 3: Control System Design REVIEW
3.11 REVIEW
The aim of this part has been to present a control design methodology which can produce practical
realizable controllers. In order to achieve this a theoretical basis has been established which incorporates
a number of design options, disturbance types and controller configurations. The design of a 2DF
controller with integral action has been focussed on. This has been used to design controllers for a
number of different examples. For more efficient solution and better understanding of the problem an
evolutionary design technique has been used. In the later stages of this process multi-objective
optimization has been used to address problems such as reduction of interaction in multivariable
systems, actuator rate limits and the design of fixed gain robust controllers for nonlinear systems
using multiple operating points.
"CACSD using Optimization Methods" PhD Thesis A.C.W.Grace Univ. of Wales, Bangor. 1989 3-44
PART 3: Control System Design REFERENCES
3.12 REFERENCES
General References
[1] Mäkilä, P.M. and Toivonen, H.T., "Computational methods for parametric LQ problems - a
survey," IEEE Trans. on Autom. Control, Vol.AC-32, No.8, 1987
[2] Anderson B.D.0 and Moore J.B., "Linear optimal control," Prentice-Hall, 1971
[3] Patel R.V. and Munro N., "Multivariable system theory and design," Pergamon Press,
International Series on Control, Vol.4, 1982
[4] Friedland, B., "Control System Design: An introduction to state-space methods," McGraw-Hill,
Series in Electrical Engineering, 1987
[5] MacFarlane, A.G.J., "The calculation of the time and frequency response of a linear constant
coefficient dynamical system," Quart. Journ. Mech. and Applied Math., Vol.16, Pt.2, 1963
Output Feedback
[6] Levine, W.S. and Athans M.,"On the determination of the optimal constant output feedback gains
for linear multivariable systems," IEEE Trans. on Autom. Control, Vol. AC-15, No.1, pp.44-
48, 1970
[7] Kosut, R.L., "Suboptimal time-invariant systems subject to control structure constraints," IEEE
Trans. on Autom. Control, Vol.AC-15, No.5, pp.557-563 1970
[8] Johnson, T.L. and Athans, M., "On the design of optimal constrained compensators for linear
control systems," IEEE Trans. on Autom. Control, Vol. AC-15, No.1, pp.658-660, 1970
[9] Levine W.S., Johnson, T.L. and Athans, M., "Optimal limited state variable feedback for linear
systems," IEEE Trans. on Autom. Control, Vol. AC-16, No.1, pp.785, 1971
[10] Basuthaker, S. and Knapp, C.H., "Optimal constant controllers for stochastic linear systems,"
IEEE Trans. on Autom. Control, Vol. AC-20, No.5, pp.664-666, 1975
[11] Wenk C.J. and Knapp C.H., "Parameter optimization in linear systems with arbitrarily
constrained controller structure," IEEE Trans. on Autom. Control, Vol. AC-25, No.1, pp.44-
48, 1980
[12] Horisberger H.P. and Belanger P.R., "Solution of the optimal constant output feedback problem
by conjugate gradients," IEEE Trans. on Autom. Control, Vol. AC-19, pp.434-435, 1974
[13] Choi S.S and Sirisena H.R., "Computation of optimal output feedback gains for linear
multivariable systems," IEEE Trans. on Autom. Control, Vol. AC-19, pp.257, 1974
[14] Newmann, M.H., "Specific optimal control of the linear regulator using a dynamical controller
based on the minimal-order Luenberger observer," Int. J. Control, Vol.12, No.1, pp.33-48, 1970
[15] Weston J.E. and Bongiorno Jr., "Extension of analytical techniques to multivariable feedback
control systems," IEEE Trans. on Autom. Control, Vol.AC-17, No.5, pp.613-620, 1972
[16] Sirisena, H.R. and Choi, S.S, "Design of optimal constrained dynamic compensators for linear
stationary stochastic servomechanisms", Int J. Control, Vol.20, No.3, pp.363-369, 1974.
[17] Mendel J.M., "A concise derivation of optimal constant limited state feedback gains," IEEE
Trans. on Autom. Control, Vol.AC-19, No.5, pp.447-448, 1974
[18] Fleming P.J., "A CAD Program For Suboptimal Control Linear Regulators," Proc. IFAC
Symposium on "Computer-Aided Design of Control Systems", Zurich, 1979.
[19] Fleming P.J., "SUBOPT - A CAD Program for Suboptimal Regulators," Proc. Inst. Meas.
Control Workshop on "Computer Aided Control System Design," 19-21 September, 1984,
Sussex, U.K., pp.13-20.
[20] Kuhn, U. and Schmidt, G., "Fresh look into the design and computation of optimal output
feedback controls for linear multivariable systems," Int. J. Control, Vol.46, pp.75-95,
1987.
[21] Martin, G.D., and Bryson, A.E., "Attitude control of a flexible spacecraft," A.I.A.A. J.
Guidance, Control and Dynamics, Vol. 3, pp. 37 1980.
"CACSD using Optimization Methods" PhD Thesis A.C.W.Grace Univ. of Wales, Bangor. 1989 3-45
PART 3: Control System Design REFERENCES
[22] Fleming, P.J. "A non-linear programming approach to the computer-aided design of regulators
using a linear-quadratic formulation," Int. J. Control, Vol.42, No.1,pp.257-268, 1985
[23] Kleinman, D.L. and Rao, P.K., "Extensions to the Bartels-Stewart Algorithm for Linear Matrix
Equations," IEEE Trans. on Autom. Control, Vol AC-23„ pp.85, 1978
[24] Kwakernaak, H.and Sivan, R. "Linear Optimal Control Systems," John Wiley and Sons, 1972.
[25] Fleming, P.J., "Trajectory sensitivity reduction in the optimal linear regulator," PhD Thesis,
Queen's University, Belfast, 1973.
[26] Kleinman, D.L. and Athans, M., "The design of suboptimal linear time-varying systems," IEEE
Trans. on Autom. Control, Vol. AC-13, pp.150-159, 1968
Gradient Matrices
[27] Athans M. and Levine W.S., "Gradient matrices and matrix calculations," MIT Lincoln Labs.,
Lexington, Mass., Tech.Note 1965-53, 1965
[28] Athans M., "The matrix minimum principle," Inform. Contr., Vol.11, 1967
[29] Geering H.P., "On calculating gradient matrices," IEEE Trans. on Autom. Control, Vol. AC-21,
No. 1, pp.615-616, 1976
Servomechanisms
[30] Athans M., "On the design of PID controllers using optimal linear regulator theory,"
Automatica, Vol.7, pp.643-647, 1971
[31] Sandell N.Jr, and Athans M.,"Brief paper on 'Type L' multivariable linear systems,",
Automatica, Vol.9 pp.131-136, Pergamon Press, 1973
[32] Davison E.J., 1976, "The robust control of a servomechanism problem for linear time-invariant
multivariable systems," IEEE Trans. on Autom. Control, Vol AC-21, No. 1, pp.25-34., 1976
[33] Davison, E.J., and Ferguson, I.J., "The design of controllers for the multivariable robust
servomechanism problem using parameter optimization methods," IEEE Trans. on Autom.
Control, Vol AC-26, No. 1, pp.93-109., 1981
[34] Davison, E.J. and Scherzinger, B.M., "Perfect control of the robust servomechanism problem,"
IEEE Trans. on Autom. Control, Vol AC-32, No. 8, pp.689-702., 1987
[35] Arstein, Z. and Leizarowitz A., "Tracking periodic signals with the overtaking criterion," IEEE
Trans. on Autom. Control, Vol. AC-30, pp.1123-1126, 1985
[36] Bernstein D.S., and Haddad W.M, "Optimal output feedback for nonzero setpoint regulation,"
IEEE Trans. on Autom. Control, Vol. AC-32, No.7, pp.641-645, 1987
[37] Choek K.C., Loh N.K. and Ho J.B., "Continuous-time optimal robust servo-controller with
internal model principle," Int. J. Control, Vol.48, No.5, pp.1993-2010, 1988
"CACSD using Optimization Method? PhD Thesis A.C.W.Grace Univ. of Wales, Bangor. 1989 3-46
PART 3: Control System Design REFERENCES
Sensitivity Reduction
[44] Fleming, P.J., "Desensitizing constant gain feedback linear regulators," IEEE Trans. on Autom.
Control, AC-23, pp.933-936, 1978
[45] Subbayyan R., Sarma V.V.S. and Vaithilingam M.C., "An approach for sensitivity-reduced design
of linear regulators," Int. J. Systems, Vol.9, No.1, pp.65-74, 1978
[46] Yahagi, T., "Optimal output feedback control with reduced performance index sensitivity," Int.
J. Control, Vol.25, No.5, pp.769-783, 1977
Design Examples
[47] Eitelberg E., "A regulating and tracking PI(D) controller", Int. J. Control, Vol. 45, No. 1, pp.
91-95, 1987.
[48] Nishikawa, Y., Sannomiya, N., Ohta, T. and Tanaka, H., Automatica, Vol.20, pp.321
[49] Muir, E.A.M., Kellett, M.G., "The RAF Generic VSTOL Aircraft Model: GVAM 87 User's
Guide," RAE Technical Report in Preparation, RAE, Bedford, UK.
[50] Heffley, R.K., and Jewell, W.F., "Aircraft handling qualities data,", NASA CR-2144
[51] Kreisselmeier G. and Steinhauser R., "Application of vector performance optimization to a
robust control loop design for a fighter aircraft," Int. J. Control, Vol.37, No.2, pp.251-284,
1983.
"CACSD using Optimization Methods" PhD Thesis A.C.W.Grace Univ. of Wales. Bangor. 1989 3-47
CONCLUSIONS
This thesis has presented a diverse yet unifying approach to Control System Design, ending with
control examples which demonstrate how an effective control structure together with multi-objective
design criteria is capable of producing practically realizable controllers covering a wide range of
performance specifications. As this thesis has been fairly broad in nature, conclusions and suggestions
for further research are given for each of the three main subjects.
CACSD
In Part 1 further evolution of a package such as MATLAB was considered. Various changes to the
package were suggested including data structure aspects, inter-process communication facilities,
compilation facilities, etc. It is apparent that MATLAB is a very powerful language for many types of
mathematical problems and it has the potential to replace languages such as C and FORTRAN as a high-
level language for numerical development. Already MATLAB is proving a very useful tool for
numerical algorithm development, however there is a trade-off in computational efficiency when
compared to C or FORTRAN. If MATLAB is to become the next major numerical programming
language, it is imperative that a compiler is written for it. The compiler will be in the form of a
translation facility to C or FORTRAN code which will require a kernel database handler as well as a
communication link to the MATLAB environment. Although a compilation facility is important, it
requires a joint effort on the part of software companies and other workers to set standards and to
undertake the necessary development.
Another very important concept which was discussed was the desirability of the integration of
software. Ideally, all packages would be able to communicate freely with each other and on all machines.
Much of current software development has been concerned with meeting this ideal with techniques such
as cross-compilation, data transfer mechanisms and inter-process communication methods.
Unfortunately in this domain, there is no panacea to resolve the fundamental problems of
communication between programs and computers. Integration of software needs to be tackled using a
pragmatic approach so that special requirements can be addressed.
One such area in which integration of software could be tackled would be for MATLAB to link
easily and simply to other numerical libraries. A solution to this problem was suggested which required
the development of a simple linking and compilation program. Such a project would provide MATLAB
with a powerful interface to FORTRAN and C subroutines, and numerical libraries.
An example of a utility in which integration is troublesome is in the area of optimization software
where the optimization program needs to communicate freely with the routine supplying objective
functions and constraints. The problem here is one of speed since optimization programs often require
large amounts of data to be transferred at high rates of transmission. This is due to the iterative nature
in which the routines are called. It is therefore difficult to link optimization code to design
environments such as MATLAB without directly inter-linking the packages via the source code. This
motivated the direct linking of a FORTRAN version of MATLAB to an optimization package, ADS.
While this proved a useful and effective optimization tool it lacked the support of Pro- and
PC-MATLAB versions in terms of graphics and other utilities. This prompted the development of a
MATLAB Optimization Toolbox which could be directly integrated into the MATLAB environment.
"CACSD using Optimization Methods" PhD Thesis A.C.W.Grace Univ. of Wales. Bangor. 1989 C-i
CONCLUSIONS OPTIMIZATION
OPTIMIZATION
In Part 2 optimization methods were discussed for a number of different types of problem
formulation. Methods which are generally considered robust and iteratively efficient were identified and
implemented as a number of routines coded in the MATLAB command language. These routines form a
MATLAB Toolbox which has possible wide ranging applications.
SQP was highlighted as a state-of-the-art method for Non-linear Programming. Further efficiencies
for the SQP method could be achieved by using an active set strategy so that the gradients of all the
constraints are not required at every major iteration (cf. [3]). Such a strategy would reduce the number of
constraint gradient calculations for some problems significantly. This would make the possibility of
considering semi-infinite problems using discretization strategies viable (cf. [2]).
A number of minor changes could also be made to the Quadratic Programming solution in order to
improve the efficiency of the SQP method. One possible improvement would be to use updating
factorizations of the projected Hessian matrix in the solution of the QP sub-problem in Section 2.7.2 as
suggested in [3]. However, such procedures carry an overhead in terms of more coding which is likely to
result in a beneficial trade-off only for larger problems.
Multi-objective optimization was discussed and the Goal Attainment method was introduced as a
convenient method for solving problems with conflicting design requirements. Algorithm
improvements were proposed to the SQP method and implemented as part of the Optimization Toolbox.
Further research is necessary to develop other multi-objective methods and methods for statistical
design in which the system operates with a level of uncertainty (cf.[4]).
"CACSD using Optimization Methods" PhD Thesis A.C.W.Grace Univ. of Wales, Bangor. 1989 C-ii
CONCLUSIONS REVIEW
REVIEW
Overall this thesis has presented a viable and attractive way to perform Control System Design and
other types of engineering optimization. The aim has been to provide a set of tools using the MATLAB
environment which are accessible to a wide user group of control engineers and other workers. The
methods employed have been chosen for their effectiveness and have been founded on well established
mathematical theory. Control System Design methods have been used employing integral quadratic
measures of control. The theory has been extended to include servomechanisms, output disturbances and
a two-degree-of-freedom control structure.
Further areas of research have been suggested within the areas of CACSD, Optimization and
Control System Design. These consist of both minor improvements to existing methods and new
research areas which encompass broad aspects of the design approach.
REFERENCES
[1] Schittkowski K., "NLQPL: A FORTRAN-subroutine solving constrained nonlinear programming
problems," Annals of Operations Research, Vol.5, pp.485-500, 1985.
[2] Polak, E. and Tits, A., "A recursive quadratic programming algorithm for semi-infinite
optimization problems," Appl. Math. Optim., Vol.8, pp.325-349, 1982
[3] Gill P.E., Murray W., and Wright M.H. "Practical Optimization", Academic Press, London,
1981.
[4] Brayton, R.K., Hachtel, G.D. and Sangiovanni-Vincentelli, A.L., "A survey of optimization
techniques for integrated-circuit design," Proc. of IEEE, Vol.69, No.10, pp.1334-1363, 1981
"CACSD using Optimization Methods" PhD Thesis A.C.W.Grace Univ. of Wales, Bangor. 1989
Appendix A
OVERVIEW A-1
TUTORIAL A-2
Unconstrained Optimization A-2
Adding Constraints A-3
User-Supplied Gradients A-3
Gradient Check A-4
Adding Bounds A-5
Maximization A-6
Greater Than Zero Constraints A-6
Equality Constraints A-6
Changing The Default Settings A-7
Speeding Up The Optimization A-9
Storing The Results A-9
Graphics Facilities A-10
Interrupting The Optimization A-13
Common Problems A-14
REFERENCE A-15
unconstr A-16
constr A-18
attaingoal A-20
minimax A-23
solve A-25
leastsq A-26
lp A-28
qp A-29
setpara A-30
optimglob A-31
OPTIMIZATION TOOLBOX: Overview
The Optimization Toolbox is a set of easy-to-use routines for solving optimization problems. It
consists of a set of MATLAB m-files which implement a number of non-linear programming
algorithms. The principal routines are as follows:
The routines are designed to work with scalars, vectors and matrices. Matrices are indicated by upper-
case bold letters, vectors by lower case bold letters and scalars by plain letters.
All the routines except lp and qp are called on an iterative basis by a user-defined function, a script
file or by the user. The optimization routines do not call any user-supplied functions. Instead, the
information in terms of functions evaluations and any available gradient information is supplied to the
optimization functions at each iteration. This means that the optimization may be interactively
interrupted in order to update or change the problem formulation or optimization parameters. It also
gives the user freedom over program structure and helps to promote modularity.
Emphasis has been placed on ease-of-programming. The intention has been to provide a robust
and iteratively efficient set of routines. The routines are ideal for complex problem solving and for
design applications involving non-linear objectives. All the main routines are supported by graphics
facilities which consist of graphical monitoring of both the design variable location and the function
values. This is performed using contour plots and x-y graphs.
Default optimization parameters are used extensively, these may be changed by the user through the
vector para. Gradients when needed are calculated using a finite difference approximation method unless
they are supplied using the optional variable grad.
"CACSD using Optimization Methods" PhD Thesis A.C.W.Grace Univ. of Wales. Bangor. 1989 A-1
OPTIMIZATION TOOLBOX: A Tutorial
In this section the use of the Optimization Toolbox will be presented through examples. Although
only the functions unconstr and constr will be considered, the features described below apply to all
the optimization routines (attaingoal, minimax, minimaxabs, leastsq and solve). The only difference
between the routines is in the problem formulation and the termination criteria.
Unconstrained Optimization
Consider initially the problem of finding a set of design variables [x1,x2] to minimize the following function:
min  e^x1 (4x1² + 2x2² + 4x1x2 + 2x2 + 1)                                (1)
 x
A solution to this problem can be found by typing in the following commands, entering them in a
script file or as part of a user-defined function.
Unconstrained Example
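A sketch of the script, based on the synopsis of unconstr (the same listing also appears in the optimglob reference entry later in this appendix), is:
PARA=0; %Initialize optimization parameters
X=[-1,1]; %Initial guess for the design variables
while PARA(1)~=1 %Check Termination
F=exp(X(1))*(4*X(1)^2+2*X(2)^2+4*X(1)*X(2)+2*X(2)+1); %Evaluate F
[X,PARA]=unconstr(X,F,PARA); %Call Optimizer
end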
This routine is contained in the script file testunconstr2.m and may be run as a demonstration or
used as a template for writing other optimization problems. Executing this script file gives the
following solution after 49 iterations.
F =
   3.1158e-12
X =
   0.5000   -1.0000
At each iteration the function unconstr returns new values for the design variables [x1,x2]. The function must then be evaluated at this new point and the value passed back to unconstr. Notice that we have to make an initial guess for the design variables [x1,x2], which may affect both the number of iterations and the value of the solution point should there exist a number of local minima. In the example above X has been initialized to [-1,1].
There is also a variable para which must be passed to unconstr. This is a vector of optimization
parameters which may be used to change the characteristics of the optimization solution procedure. It
contains values such as termination tolerances and algorithm choices. The first element of para is used
for control flow. Initially this is set to zero, to ensure initialization of the optimization procedure. If
no other values are specified for para a vector of default parameters is returned. When sufficient
termination criteria have been met (see Reference Manual) the optimizer returns: para(1)=1. More
about changing the default settings later.
"CACSD using Optimization Methods" PhD Thesis A.C.W.Grace Univ. of Wales, Bangor. 1989 A-2
Appendix A Optimization Toolbox: TUTORIAL Adding Constraints
Adding Constraints
Suppose now we wish to add inequality constraints to the problem in (1), giving a problem of the form:
min  e^x1 (4x1² + 2x2² + 4x1x2 + 2x2 + 1)
 x
subject to:  x1·x2 - x1 - x2 + 1.5 ≤ 0
             -x1·x2 - 10 ≤ 0                                             (2)
Constrained Example
PARA = 0; %Initialization
X = [-1,1];
while PARA(1)~=1 %Check Termination
F = exp(X(1))*(4*X(1)^2 + 2*X(2)^2 + 4*X(1)*X(2) + 2*X(2) + 1); %Evaluate F
G(1) = 1.5 + X(1)*X(2) - X(1) - X(2); %Evaluate Constraints
G(2) = -X(1)*X(2) - 10;
[X,PARA] = constr(X,F,G,PARA); %Call Optimizer recursively
end
This problem is found in the file testconstr2.m and gives the following solution after 37
iterations:
F=
0.0236
G=
1.0e-13 *
0.0489 -0.5151
x=
-9.5474 1.0474
User-Supplied Gradients
The above problem solution procedure uses a method to systematically perturb each of the design
variables in order to estimate the function and constraint gradients. The problem can be solved more
accurately and efficiently if we supply analytic partial derivatives of the function and constraints. This is
done by introducing an additional argument, grad, in the function call to constr. The first column of
grad contains the partial derivatives of the objective function, f(X), with respect to x. The next
columns contain the partial derivatives of the constraints in order of location. When the constraints, G,
are in the form of a matrix then the (i+1)th column of grad refers to the ith constraint of G when it is
arranged as a column-wise vector using the command G(:). Gradients are only required when the
optimizer returns para(1)=2. Thus problem (2) with analytic gradients can be programmed as:
"CACSD using Optimization Methods" PhD Thesis A.C.W.Grace Univ. of Wales, Bangor. 1989 A-3
Appendix A Optimization Toolbox: TUTORIAL User-Supplied Gradients
User-Supplied Gradients
PARA = 0; %Initialization
X=[-1,1];
while PARA(1) ~= 1 %Check Termination
if PARA(1) == 2 %Calculate Analytic Gradients If Needed
TEMP=exp(X(1))*(4*X(1)^2+2*X(2)^2+4*X(1)*X(2)+2*X(2)+1);
grad= [TEMP + 4*exp(X(1)) * (2*X(1) + X(2)), X(2)-1, -X(2)
4*exp(X(1))*(X(1)+X(2)+0.5), X(1)-1, -X(1)];
else %Otherwise Calculate F and G
G(1) =1.5 + X(1)*X(2) - X(1) - X(2);
G(2) = -X(1)*X(2) - 10;
F = exp(X(1)) * (4*X(1)^2 + 2*X(2)^2 + 4*X(1)*X(2) + 2*X(2) + 1);
end
[X,PARA] = constr(X,F,G,PARA,grad); %Call Optimizer
end
This is contained in the file testconstr3.m and gives the following result after 12 function
evaluations and 12 gradient evaluations.
F=
0.0236
G=
1.0e-12*
-0.2558 0.2558
x=
-9.5474 1.0474
Gradient Check
When user-supplied gradients are available, the user has the option of checking these, in the first few
evaluations of the optimization process, with a set calculated using finite difference evaluation. This is
particularly useful for detecting typing or other errors in either the objective function or the gradient
calculation. It can also be used in other applications where the numerical evaluation of partial derivatives
is required.
If such gradient checks are required then para(1) should be initialized with the value -1. The first
cycle of the optimization is then concerned with checking the user-supplied gradients. If they do not
match within a given tolerance the user is informed of the discrepancy and is given the option either to
abort the optimization or to continue.
The routines may also be used to evaluate the gradient (partial derivatives) at a point without performing any optimization. If this is the case then para(1) should be initialized with -2. The gradient is then returned as a third output argument from the optimization function, e.g.
[X,PARA,GRAD] = unconstr(X,F,PARA).
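For example, a sketch of the derivative-only mode applied to the objective of problem (2) (assuming, as with ordinary optimization, that the routine is called repeatedly until para(1)=1):
PARA = -2; %Request evaluation of partial derivatives only
X = [-1,1];
while PARA(1) ~= 1 %Supply F for each point the routine requests
F = exp(X(1))*(4*X(1)^2+2*X(2)^2+4*X(1)*X(2)+2*X(2)+1);
[X,PARA,GRAD] = unconstr(X,F,PARA); %GRAD returned as the third output argument
end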
"CACSD using Optimization Methods" PhD Thesis A.C.W.Grace Univ. of Wales, Bangor. 1989 A-4
Appendix A Optimization Toolbox: TUTORIAL Adding Bounds
Adding Bounds
Suppose we wish to restrict the variables to be within certain limits. This can be achieved by using
the bounded syntax of the appropriate function. For constr the syntax is as follows
[X,PARA] = constr(X,F,G,PARA,VLB,VUB);
or [X,PARA] = constr(X,F,G,PARA,VLB,VUB,GRAD);
Where VLB and VUB contain lower and upper bounds on the variables. Thus the commands to restrict
the variables in problem (2) to be greater than zero can be written as:
Bounded Example
PARA = 0;
X = [-1,1];
while PARA(1) ~= 1
F=exp(X(1))*(4*X(1)^2+2*X(2)^2+4*X(1)*X(2)+2*X(2)+1);
G(1)=1.5 + X(1)*X(2) -X(1) -X(2);
G(2)=-X(1)*X(2)-10;
[X,PARA]=constr(X,F,G,PARA,zeros(X),[ ]); %Add Bounds
end
Generally VLB and VUB should be of the same size as X although the routines will also accept
smaller sizes and will assume the undefined variables to be unbounded. Notice in this example that since
there are no upper bounds we have passed down the empty matrix [ ]. This can also be done for lower
bounds and for the constraint variable, G. Thus constr can also be used as an unconstrained optimization
routine.
The above program is coded in testconstr4.m and gives the following solution after 7 iterations:
F=
8.5000e+00
G=
0 -1.0000e+01
X =
0 1.5000e+00
Note that when we express lower bounds on the design variables then we must also express upper
bounds on the variables although either may be set to the empty matrix as in the above example. This is
so that constr can distinguish between the syntax for when user-supplied gradients are given,
[X,PARA]=constr(X,F,G,PARA,GRAD), and when only bounds are supplied,
[X,PARA]=constr(X,F,G,PARA,VLB,VUB). Alternatively we can express bounds using linear
inequality constraints. This may be more appropriate when there are only a few bounded variables, e.g. an upper bound xi ≤ UB should be written as the constraint xi - UB ≤ 0.
"CACSD using Optimization Methods" PhD Thesis A.C.W.Grace Univ. of Wales, Bangor. 1989 A-5
Appendix A Optimization Toolbox: TUTORIAL Equality Constraints
Notice, in the above problems, that the more constrained and bounded the problems have been, the fewer iterations have been required. This is because the optimization can make better decisions
regarding steplength and regions of feasibility than in the unconstrained case. It is therefore always
wise to bound and constrain any problem whenever possible to promote a fast convergence to the
solution.
Maximization
The optimization functions (unconstr, constr, attaingoal, minimax, minimaxabs, leastsq) all
perform minimization of the objective function(s), F. Maximization is achieved by supplying the
routines with -F.
e.g. for max f(X) over X, subject to G(X) ≤ 0, use: [X,PARA]=constr(X,-F,G,PARA)
Equality Constraints
Equality constraints are expressed in the first few elements of the matrix G. Para(13) must be initialized with the number of equality constraints. For example, a program which adds the constraint x1 + x2 = 0 to problem (2) is as follows:
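A sketch of such a program, assuming the equality constraint is placed first and the inequality constraints of problem (2) are retained, is:
PARA = 0; PARA(13) = 1; %One equality constraint
X = [-1,1];
while PARA(1) ~= 1 %Check Termination
F = exp(X(1))*(4*X(1)^2 + 2*X(2)^2 + 4*X(1)*X(2) + 2*X(2) + 1); %Evaluate F
G(1) = X(1) + X(2); %Equality constraint x1+x2=0 (placed first)
G(2) = 1.5 + X(1)*X(2) - X(1) - X(2); %Inequality constraints
G(3) = -X(1)*X(2) - 10;
[X,PARA] = constr(X,F,G,PARA); %Call Optimizer
end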
This is coded in the script file testconstr5.m and produces the following solution after 13
iterations:
F=
1.8951e+00
G=
-4.2633e-14 8.5265e-14 -8.5000e+00
x=
-1.2247e+00 1.2247e+00
"CACSD using Optimization Methods" PhD Thesis A.C.W.Grace Univ. of Wales, Bangor. 1989 A-6
Appendix A Optimization Toolbox: TUTORIAL Changing The Default Settings
"CACSD using Optimization Methods" PhD Thesis A.C.W.Grace Univ. of Wales, Bangor. 1989 A-7
Changing The Default Settings
Element   Parameter            Default   Description
17        Finite Differences   0.1       Maximum change in variables for finite difference gradients.
18        Step Length                    Step length parameter. Generally on the first iteration this is set conservatively to a value of 1 or less depending on the gradients.
19        Graphics             0         Turns graphics facilities on (>0) or off (0). The value of para(19) indicates the type of plot required. Set to 1 for a performance monitoring graph; set to 2 for a contour plot; set to 3 for an isometric plot; set to 4 for a contour plot, an isometric plot and an on-line performance plot on the same screen.
The default parameters can be changed in a number of ways, either before the initialization or during the running of the optimization cycle. An example of how to change the termination criteria in problem (1) to 1e-8 might be:
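One way of doing this, sketched here using setpara (described below) to obtain the default vector before tightening the tolerances in para(2) and para(3), is:
PARA = setpara([ ]); %Obtain the default parameter vector (PARA(1)=0 in the defaults)
PARA(2) = 1e-8; %Termination tolerance on the design variables
PARA(3) = 1e-8; %Termination tolerance on the objective function
X = [-1,1];
while PARA(1) ~= 1
F = exp(X(1))*(4*X(1)^2 + 2*X(2)^2 + 4*X(1)*X(2) + 2*X(2) + 1);
[X,PARA] = unconstr(X,F,PARA);
end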
The above code is contained in the file testunconstr6.m and gives the following solution after 63
function evaluations:
F=
2.1965e-14
x=
0.5000 -1.0000
To get on-line help for the meanings of the parameters enter the command help setpara. To get a set
of default parameters use the command: para=setpara([ ]).
"CACSD using Optimization Methods" PhD Thesis A.C.W.Grace Univ. of Wales, Bangor. 1989 A-8
Speeding Up The Optimization
The use of global variables is highly recommended for reasons of execution efficiency. However, in certain cases it may be advantageous to store the variables to file at every iteration; for instance, it may be necessary to investigate how a background job is progressing, in which case this can be done by loading the last saved version of tempoptim while the background job is still executing. The global variables have been named with capital letters and a G_ prefix so that it is highly unlikely that they will coincide with other variable names.
The global variables are as follows:
G_MATL G_MATX G_PCNT G_STEPMIN G_SD G_GCNT G_OLDF G_GRAD G_HOW
G_CHG G_LAMBDA G_GLOBFLAG G_LAMBDABEST G_XBEST G_FBEST
There are also a number of global variables associated with the graphics facilities; they are as follows:
G_MESH G_GPARA G_MDX G_MDY G_GXCNT G_GYCNT G_CONTOURS G_GSX
G_GSY G_AXIS G_MESHC G_AXIS2
Of particular interest is the string variable G_HOW which contains a complete history of the
optimization cycle. Another variable G_HESS contains an estimate of the Hessian matrix at the
solution point.
"CACSD using Optimization Methods" PhD Thesis A.C.W.Grace Univ. of Wales. Bangor. 1989 A-9
Appendix A Optimization Toolbox: TUTORIAL Graphics Facilities
Graphics Facilities
All the main routines are supported by graphics facilities which are invoked simply by setting
para(19) to an appropriate value on the first call to the optimization routine. The options for the
graphs consist of a contour plot, an isometric (mesh) plot and/or on-line performance plots. The
selection is achieved through para(19) which is set accordingly:
para(19)=1 gives performance monitoring plot(s).
para(19)=2 gives a position monitoring contour plot.
para(19)=3 gives an isometric(mesh) plot of the objective function and/or constraints.
para(19)=4 gives all of the above plots on the same screen.
When using the graphics facilities, global variables must be set up using optimglob. This is for
reasons of execution efficiency. If they are not set the user will be informed and the program will
abort. When plotting facilities are requested then the user is prompted for the necessary information to
determine plotting parameters. This information is put in a global variable called G_GPARA which
may be altered. Alternatively this information can be directly entered into G_GPARA by setting the
elements with the following information.
Element of G_GPARA    Description
1 x-axis element variable for contour plot e.g. for x(3) set G_GPARA(1)=3.
2 y-axis element variable for contour plot e.g. for x(2) set G_GPARA(2)=2.
3 minimum value for x-axis variable on contour plot.
4 maximum value for x-axis variable on contour plot.
5 minimum value for y-axis variable on contour plot.
6 maximum value for y-axis variable on contour plot.
7 minimum value for objective function monitor plot.
8 maximum value for objective function monitor plot.
9 minimum value for constraint plot (log scale).
10 maximum value for constraint plot (log scale).
11 estimated maximum number of iterations to be displayed on plot.
12 number of subdivisions on contour plot for each axis.
13 position on screen for contour plot e.g. 111 is all of screen 221 is first
quadrant.
14 position on screen for isometric plot.
15 position on screen for function performance plot.
16 position on screen for constraint performance plot.
17 when set to 1 the last generated mesh and inputted axis information is used
for subsequent plots. When set to 2 a new mesh is generated but using the axis
information in the elements of G_GPARA. These settings are useful when
trying different starting values for the same optimization problem.
18-20 internal parameters (18 = indicator for termination of contour plotting,
19 = number of constraints, 20 = iteration count for the last plotted point).
G_CONTOURS is another global variable associated with the plotting which is either a scalar
containing the number of contours or a vector containing precise values for the contours.
G_GPARA(17)=1 allows the same contour and axis information to be used on subsequent cycles
without recalculation of the contour values.
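Alternatively, the prompts can be avoided by assigning the elements directly after running optimglob; the values below are illustrative only and correspond to the answers given in the example that follows:
optimglob %Make the graphics global variables available
G_GPARA(1) = 1; G_GPARA(2) = 2; %Contour plot of x(1) against x(2)
G_GPARA(3) = -2; G_GPARA(4) = 2; %Range of the x-axis variable
G_GPARA(5) = -1; G_GPARA(6) = 4; %Range of the y-axis variable
G_GPARA(11) = 50; %Estimated maximum number of iterations
G_GPARA(12) = 30; %Number of contour subdivisions per axis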
"CACSD using Optimization Methods" PhD Thesis A.C.W.Grace Univ. of Wales. Bangor. 1989 A-10
Appendix A Optimization Toolbox: TUTORIAL Graphics Facilities
Example
Consider the doubly constrained problem below, which is in the script file testconstr16.m:
Graphics Example
optimglob %Set up global variables
if ~exist('PARA'), PARA(19)=4; else PARA(1)=0; end %Set up PARA
if ~exist('X'), X=[2,4]; end
while PARA(1)~=1
F=100*(X(2)-X(1)^2)^2+(1-X(1))^2;
G(1)=(X(1))*(X(1))-0.3*(X(2)-2)*(X(2)-2)-0.01;
G(2)=0.5*(X(1))*(X(1))+(X(2)-2)*(X(2)-2)-2;
[X,PARA]=constr(X,F,G,PARA);
end
We have used the exist command in this routine in order to change the settings after a first running.
After executing this code the user is prompted with information regarding the way the graphs are
plotted. Here is a typical input:
Minimum value of x(1) = ? -2
Maximum value of x(1) = ? 2
Minimum value of x(2) = ? -1
Maximum value of x(2) = ? 4
Number of contour points to be taken for each axis(e.g. 10) ? 30
Enter the number of contours you want displayed
- alternatively enter a vector containing contour values: exp(2:2:20)
Maximum likely number of iterations (e.g. 100) ? 50
Minimum function value for plot ? 0
Maximum function value for plot ? 10
Give exponent (10^exp) for the minimum constraint value for plot ? - 10
Give exponent (10^exp) for the maximum constraint value for plot ? 2
Having entered this information the contour points are evaluated by the optimization routine. An
isometric plot of the objective function is then displayed and the user is given the prompt:
Rotate ? (l=left, r=right, u=up, d=down, q=quit, 0=fun, 1,2..=constraint):
Options (l,r,u,d) have the effect of rotating the plot in increments of 10 degrees. When constraints
exist, entering a number greater than zero produces an isometric plot of the respective constraint, which is plotted in place of the objective function; this may then be rotated. Entering 0 brings back the plot of
the objective function. Typing the option q ends the isometric plotting phase. The remaining graphs are
then displayed and the optimization begins. Points are then plotted as the optimization progresses. The
above settings gave the following results and graphs.
Results after 51 iterations:
X = -7.2831e-01 6.8289e-01
G = 2.0841e-12 1.0418e-12
F= 5.3113e+00
"CACSD using Optimization Methods" PhD Thesis A.C.W.Grace Univ. of Wales, Bangor. 1989 A-11
Appendix A Optimization Toolbox: TUTORIAL Graphics Facilities
[Figure: isometric plot of the objective function, together with performance plots of the objective function and the largest constraint value against the number of function evaluations.]
The isometric plot is that of the objective function. The contour plot is that of the objective function
and the constraints (dense contours). The constraint contours are only displayed for infeasible values and are displayed densely to indicate the area of infeasibility. The starting point of the optimization cycle is at the point [2,4], at the top right-hand corner of the contour diagram. The path of the solution is represented by the dotted line, which is plotted at the completion of each major iteration. The above example illustrates the ability of the optimizer to locate the feasible region despite the small gap between the constraints through which it must pass in order to enter the bottom part of the feasible
the global minimum in the face of a non-convex solution boundary. Consider what happens when we
start the optimization problem from the opposing side of the contour diagram. We can use the contour
points from the last plot and make the contour diagram fill the whole screen using the following
commands:
PARA(19)=2 %Set graph parameter to just plot contour diagram
X=[-2,4] %Reset X to new point.
G_GPARA(17)=1 %Tell graphics routine to use last evaluated set of contours
testconstr16 %Start optimization
Executing these instructions gives the following solution and graph:
"CACSD using Optimization Methods" PhD Thesis A.C.W.Grace Univ. of Wales, Bangor. 1989 A-12
Appendix A Optimization Toolbox: TUTORIAL Graphics Facilities
[Figure: contour plot of the objective function and constraints, with x1 on the horizontal axis and x2 on the vertical axis, showing the path of the optimization from the new starting point [-2,4].]
The minimum located from this starting point has a function value which is less than that achieved from the original starting point. This stresses the importance of trying different starting values when it is
suspected that there may be a number of local minima.
"CACSD using Optimization Methods" PhD Thesis A.C.W.Grace Univ. of Wales, Bangor. 1989 A-13
Appendix A Optimization Toolbox: TUTORIAL Common Problems
Common Problems
• The routines may only give local solutions; it is therefore necessary to try the optimization from a number of different starting points if global solutions are sought.
• If the routines do not converge, check that you have not posed an infeasible problem.
• The routines make use of finite difference gradients if they are not user-supplied. It may therefore be
necessary to interpolate all discrete functions such as time and frequency responses to avoid excessive
errors in the gradient evaluation. This can be achieved using the spline function or any other
interpolation method.
• Sometimes the optimization may give values for which it is impossible to evaluate F and G, such as
the evaluation of a time response when the system is unstable. It is therefore necessary to properly
bound the design variables or to give a large positive value to F and G when infeasibility is encountered.
• The function to be minimized must have continuous first and second derivatives. However, some
success may be achieved for certain classes of discontinuities if the finite difference parameters are
adjusted to appropriate values.
• The optimization routines do not provide for the case when the variable X can only take on discrete values; however, some success may be achieved by resetting the global vector G_CHG at each iteration. This variable corresponds to the finite difference gradient perturbation levels for the matrix X multiplied by para(15), i.e. for each variable indexed by i a partial derivative is calculated by perturbing X using the formula: x(k+1) = x(k) + Δx = x(k) + G_CHG(i)*para(15), where para(16) < Δx < para(17).
"CACSD using Optimization Methods" PhD Thesis A.C.W.Grace Univ. of Wales. Bangor. 1989 A-14
Appendix A Optimization Toolbox: REFERENCE Common Problems
OPTIMIZATION TOOLBOX
Reference
This section contains detailed descriptions of the main OPTIMIZATION TOOLBOX functions. The
routines contained in this section are as follows:
MAIN ROUTINES
unconstr      solves unconstrained optimization problems
constr        solves constrained optimization problems
attaingoal    solves the multi-objective goal attainment problem
minimax       solves minimax optimization problems
solve         solves non-linear equations
leastsq       solves non-linear least squares problems
lp            solves linear programming problems
qp            solves quadratic programming problems
UTILITY ROUTINES
setpara parameter settings and help
optimglob sets up global variables
"CACSD using Optimization Methods" PhD Thesis A.C.W.Grace Univ. of Wales, Bangor. 1989 A-15
unconstr
Purpose:
Solves unconstrained optimization problems.
Synopsis:
[x,para]=unconstr(x,f,para)
[x,para]=unconstr(x,f,para,grad)
Description:
Unconstr minimizes a scalar objective function of the form:
min f(X)
X
Values of the scalar function f(X) must be supplied to unconstr on an iterative basis. Values of X
are returned at each iteration and a new value of f(X) must be evaluated. X may be a scalar, vector or a
matrix.
Upon initialization para(1) should be set to 0. If other values for para are not supplied then
unconstr returns default parameters (see Tutorial). Unconstr returns a value of para(1)=1 when the
optimization has terminated following sufficient convergence or when the number of iterations exceeds
para(14). The optimization will terminate successfully following convergence if the precision of X at a
minimum is within the tolerance given by para(2) (default: 1e-4) and the objective function is estimated to be within the tolerance given by para(3) (default: 1e-4).
If the optional variable grad is not supplied gradients are calculated using a finite differences
approximation.
Example:
Find values of X which minimize 100·(x2 - x1²)² + (1 - x1)², starting at the point [-1.2, 1]:
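A sketch of the calling sequence, following the same pattern as the Tutorial examples:
PARA = 0; %Initialize
X = [-1.2,1]; %Starting point
while PARA(1) ~= 1 %Check Termination
F = 100*(X(2)-X(1)^2)^2 + (1-X(1))^2; %Evaluate F
[X,PARA] = unconstr(X,F,PARA); %Call Optimizer
end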
F =
   8.8348e-11
X =
   1.0000    1.0000
Limitations:
The function to be minimized must be continuous. Unconstr may only give local solutions.
Algorithm:
The default algorithm is a quasi-Newton method. Setting para(5)=1 implements the simplex method of Nelder and Mead[1], as programmed by S.Hancock (gradients should not be supplied when using this method). If a quasi-Newton method is used then the default algorithm for updating the approximation of the Hessian matrix is the BFGS[2-5] formula. The DFP[6,7] formula, which avoids direct calculation of the inverse Hessian, may also be selected by setting para(6)=1. A steepest descent method is selected by setting para(6)=2 (although this is not recommended). The default line-search algorithm
"CACSD using Optimization Methods" PhD Thesis A.C.W.Grace Univ. of Wales, Bangor. 1989 A-16
Appendix A Optimization Toolbox: REFERENCE unconstr
(para(7)=0) is a safeguarded mixed quadratic and cubic polynomial interpolation and extrapolation
method. Safeguarded cubic interpolation is the default line-search algorithm (para(7)=1) when
gradients are supplied.
See Also:
setpara, optimglob
References:
[1] Nelder J.A. and Mead R., "A simplex method for function minimization", Computer Journal, Vol. 7, pp. 308-313, 1965.
[2] Broyden C.G., "The convergence of a class of double-rank minimization algorithms", J. of the Inst. of Mathematics and its Applic., Vol. 6, pp. 76-90, 1970.
[3] Fletcher R., "A new approach to variable metric algorithms", Computer Journal, Vol. 13, pp. 317-322, 1970.
[4] Goldfarb D., "A family of variable metric updates derived by variational means", Mathematics of Computing, Vol. 24, pp. 23-26, 1970.
[5] Shanno D.F., "Conditioning of quasi-Newton methods for function minimization", Mathematics of Computation, Vol. 24, pp. 647-656, 1970.
[6] Davidon W.C., "Variable metric method for minimization", A.E.C. Research and Development Report ANL-5990, 1959.
[7] Fletcher R. and Powell M.J.D., "A rapidly convergent descent method for minimization", Computer Journal, Vol. 6, pp. 163-168, 1963.
[8] Fletcher R., Practical Methods of Optimization, Vol. 1: Unconstrained Optimization, John Wiley and Sons, 1980.
[9] Grace A.C.W., Computer-Aided Control System Design using Optimization Techniques, Ph.D. Thesis, University of Wales, Bangor, 1989.
"CACSD using Optimization Methods" PhD Thesis A.C.W.Grace Univ. of Wales. Bangor. 1989 A-17
constr
Purpose:
Solves constrained optimization problems.
Synopsis:
[x,para]=constr(x,f,g,para)
[x,para]=constr(x,f,g,para,grad)
[x,para]=constr(x,f,g,para,vlb,vub)
[x,para]=constr(x,f,g,para,vlb,vub,grad)
Description:
Solves the constrained problem of the form:
min  f(X)
 X
subject to:  G(X) ≤ 0
Values of the scalar objective function, f(X), and constraints, G(X), must be supplied to constr on
an iterative basis. Values of X are returned at each iteration and new values of f(X) and G(X) must be
evaluated. X and G(X) may be scalars, vectors or matrices.
Upon initialization para(1) should be set to 0. If other values for para are not supplied then constr
returns default parameters (see Tutorial). Constr returns a value of para(1)=1 when the optimization
has terminated following convergence and when all the termination criteria (para(2:5)) have been met or
when the number of iterations exceeds para(14). Para(2) is a measure of the precision required of X before the optimization will terminate (default: 1e-4). Para(3) is a measure of the precision required of the objective function at the solution (default: 1e-4). Para(4) indicates the maximum constraint violation that can be tolerated before the optimization will terminate (default: 1e-7).
Gradient information, if available, need only be supplied when para(1)=2. The first column of grad
should contain the gradient of f(X) ; the remaining columns should contain the gradients of G(X) . To
change default settings and for more information refer to Tutorial.
Lower and upper bounds on the design variables are set using the optional variables vlb, vub which
may also be empty. Equality constraints should be put in the first few elements of g and para(13)
should be set with the number of them (see Tutorial).
Example:
Find values of X which minimize:  -x1·x2·x3,
subject to the constraints:  -x1 - 2x2 - 2x3 ≤ 0,
x1 + 2x2 + 2x3 ≤ 72,   starting at the point X=[10,10,10].
PARA=0; %Reset Optimization Parameters
X=[10,10,10]; %Initialize Design Variables
while PARA(1)~=1 %Check Termination Parameter
F =-X(1)*X(2)*X(3); %Evaluate F
G(1)=-X(1)-2*X(2)-2*X(3); %Evaluate Constraints
G(2)= X(1)+2*X(2)+2*X(3)-72;
[X,PARA]=constr(X,F,G,PARA); %Call Optimizer
end
"CACSD using Optimization Methods" PhD Thesis A.C.W.Grace Univ. of Wales, Bangor. 1989 A-18
Appendix A Optimization Toolbox: REFERENCE constr
This program is contained in the script file testconstr1.m and gives the following solution after 39
iterations:
F=
-3.4560e+03
G=
-7.2000e+01 -3.8654e-12
x=
2.4000e+01 1.2000e+01 1.2000e+01
Algorithm:
Constr uses a Sequential Quadratic Programming(SQP) method. In this method a Quadratic
Programming (QP) sub-problem is solved at each iteration. An estimate of the Hessian of the Lagrangian
is updated at each iteration using the BFGS formula (see unconstr ref.[3-6]). A line search is performed
using a merit function similar to that proposed by Han[1] and Powell[2,3]. The QP sub-problem is
solved using an active set strategy similar to that described in Gill and Murray[4].
Limitations:
The function to be minimized and the constraints must be continuous. Constr may only give local
solutions.
See Also:
setpara,optimglob
References:
[1] Han S.P., "A globally convergent method for nonlinear programming", J. of Optimization Theory and Applications, Vol. 22, p. 297, 1977.
[2] Powell M.J.D., "The convergence of variable metric methods for nonlinearly constrained optimization calculations", in Nonlinear Programming 3 (O.L. Mangasarian, R.R. Meyer and S.M. Robinson, eds.), Academic Press, 1978.
[3] Powell M.J.D., "A fast algorithm for nonlinearly constrained optimization calculations", Numerical Analysis (G.A. Watson, ed.), Lecture Notes in Mathematics, Vol. 630, Springer-Verlag, 1978.
[4] Gill P.E., Murray W. and Wright M.H., Practical Optimization, Academic Press, London, 1981.
"CACSD using Optimization Methods" PhD Thesis A.C.W.Grace Univ. of Wales, Bangor. 1989 A-19
attaingoal
Purpose:
Solves the Multi-Objective Goal Attainment[1] problem.
Synopsis:
[x,para]=attaingoal(x,f,goal,w,para)
[x,para]=attaingoal(x,f,goal,w,para,grad)
[x,para]=attaingoal(x,f,goal,w,para,vlb,vub)
[x,para]=attaingoal(x,f,goal,w,para,vlb,vub,grad)
Description:
Attempts to make a matrix of objectives, F(X), attain a matrix of goal values, GOAL, by solving
the problem:
min  γ
X, γ
subject to:  fij(X) - wij·γ ≤ goalij,    i = 1,...,m1,  j = 1,...,m2
The objectives, F(X), may not reach the required goals in GOAL (under-attainment) or may be
better than the goal values (over-attainment). The amount of under- or over-attainment can be
controlled by setting the variable W. If it is desired for the objectives to be less than the goals, then set W=abs(GOAL). If it is desired for the objectives to be greater than the goals, then set W=-abs(GOAL). This will ensure the same percentage under- or over-attainment of the active objectives. For hard constraints set wij=0.
If it is desired for a number of the objectives to be in the neighbourhood of (or equal to) the goals then set PARA(13) with the number of objectives for which this is required. Such objectives should be partitioned into the first few elements of F. W should be set to GOAL (or -GOAL), which will ensure the same percentage of over- or under-attainment over the required values. The variables GOAL, W and F(X) may be scalars, vectors or matrices of equal size; they should be supplied to attaingoal on an iterative basis. Para(8) contains the value of γ.
Attaingoal returns a value of para(1)=1 when the optimization has terminated following sufficient
convergence or when the number of iterations exceeds para(14). Para(2) is a measure of the precision
required of X before the optimization will terminate (default le-4). Para(3) is a measure of the
precision required of the objective function (para(8)) at the solution. Para(4) indicates the maximum
constraint violation that can be tolerated as a function of y before the optimization will terminate
(default: le-7).
If the optional variable grad is not supplied, gradients are calculated using a fmite differences
approximation. Set para(1)=0 on first iteration or to re-initialize. To change default settings, such as
termination criteria, refer to the Tutorial. Gradient information, if available, need only be supplied when
para(1)=2. The first columns of grad contain the gradients for the respective elements of F(X) with
respect to X. Lower and upper bounds on the design variables are set using the optional variables vlb,
vub which may also be empty.
"CACSD using Optimization Methods" PhD Thesis A.C.W.Grace Univ. of Wales, Bangor. 1989 A-20
Appendix A Optimization Toolbox: REFERENCE attaingoal
Example:
A system requires its eigenvalues to lie on the real axis in the complex plane to the left of the points [-1,-2,-3]. A proportional output feedback controller is to be designed with restrictions on the control gains not to exceed a value of 4 or be less than -4. The plant is a 2-input 2-output, open loop unstable system which is given in terms of a state space description (A,B,C matrices). A set of goal values for the closed loop eigenvalues is initialized as GOAL=[-1,-2,-3]. To ensure the same percentage under- or over-attainment in the active objectives at the solution, the weighting matrix, W, is set equal to abs(GOAL). Starting with a controller X=[0,0;0,0], the problem is coded as follows.
PARA(1)=0; %Initialize
optimglob %Use global variables for faster execution
A=[-0.5 0 0; 0 -2 10; 0 1 -2]; B=[1 0; -2 -2; 0 1]; C=[1 0 0; 0 0 1]; %Plant Matrices
X=zeros(2); %Initialize controller matrix
GOAL=[-1,-2,-3]; %Set goal values for the eigenvalues
W=abs(GOAL); %Set W to give same percentage under- or over-attainment
VLB=-4*ones(X); VUB=4*ones(X); %Set lower and upper bounds on the gains
while PARA(1)~=1 %Check Termination Parameter
F=sort(eig(A+B*X*C))'; %Evaluate Objectives
[X,PARA]=attaingoal(X,F,GOAL,W,PARA,VLB,VUB); %Call Optimizer
end
This program is contained in the script file testattaingoa11.m and gives the following solution
after 85 iterations.
The attainment factor PARA(8) = -0.3865
F=
-6.9313 -4.1588 -1.4099
x=
-4.0000 -0.2564
-4.0000 -4.0000
The set of active constraints is:
12
Discussion
The attainment factor indicates that each of the objectives has been over-achieved by at least 38.65%
over the original design goals. The set of active constraints indicates those objectives which are barriers
to further improvement and for which the percentage over-attainment is met exactly.
In the above design the optimizer tries to make the objectives less than the goals. For a worst case
problem in which it is desired for the objectives to be as near as possible to the goals then PARA(13)
should be set with the number of objectives for which this is required.
Consider the above problem where it is desired that the eigenvalues be equal to the goal values. A
solution to this problem is found by adding a line to the beginning of the above program:
PARA(13)=3;
On execution of this program the following results were obtained after 49 iterations.
The attainment factor PARA(8) =4.0409e-23
F=
-1.0000 -3.0000 -5.000
x=
-1.5785 1.2185
-0.4028 -2.9215
"CACSD using Optimization Methods" PhD Thesis A.C.W.Grace Univ. of Wales, Bangor. 1989 A-21
Appendix A Optimization Toolbox: REFERENCE attaingoal
In this case the objectives have tried to match the goals. The attainment factor of 4.0409e-23 indicates that the goals have been matched exactly (within a tolerance of 4.0409e-21%).
Notes
These types of problem are often non-convex and the solution is dependent on the starting values
given for the variable X. When the objectives and goals are complex then attaingoal tries to achieve the
goals in a least squares sense.
Algorithm:
Attaingoal uses the same algorithms as constr with modifications similar to those described in [3].
The choice of merit function is set using PARA(7). The default is to use the merit function of Han[2]
and Powell[3]. An exact merit function together with a modified Hessian (see [5] and [6]) can be used by
setting PARA(7)=1. The exact merit function method tends to be more robust than the method
proposed by Han and Powell but suffers from slower convergence in a number of examples.
Limitations:
The objectives must be continuous. Attaingoal may only give local solutions.
See Also:
setpara, optimglob .
References:
[1] Gembicki F.W., "Vector Optimization for Control with Performance and Parameter Sensitivity Indices", Ph.D. Dissertation, Case Western Reserve Univ., Cleveland, Ohio, USA, 1974.
[2] Han S.P., "A globally convergent method for nonlinear programming", J. of Optimization Theory and Applications, Vol. 22, p. 297, 1977.
[3] Powell M.J.D., "A fast algorithm for nonlinearly constrained optimization calculations", Numerical Analysis (G.A. Watson, ed.), Lecture Notes in Mathematics, Vol. 630, Springer-Verlag, 1978.
[4] Fleming P.J. and Pashkevich A.P., "Computer Aided Control System Design using a Multi-Objective Optimisation Approach", Control '85 Conference, Cambridge, UK, pp. 174-179.
[5] Brayton R.K., Director S.W., Hachtel G.D. and Vidigal L., "A new algorithm for statistical circuit design based on quasi-Newton methods and function splitting", IEEE Trans. Circuits Syst., Vol. CAS-26, pp. 784-794, Sept. 1979.
[6] Grace A.C.W., Computer-Aided Control System Design using Optimization Techniques, Ph.D. Thesis, University of Wales, Bangor, 1989.
"CACSD using Optimization Methods" PhD Thesis A.C.W.Grace Univ. of Wales, Bangor. 1989 A-22
minimax
Purpose:
Solves minimax optimization problems.
Synopsis:
[x,para]=minimax(x,f,g,para)
[x,para]=minimax(x,f,g,para,grad)
[x,para]=minimax(x,f,g,para,vlb,vub)
[x,para]=minimax(x,f,g,para,vlb,vub,grad)
Description:
Attempts to minimize the worst case value of the matrix F(X) by varying X:
min  max { fij(X) }
 X    i,j
subject to:  G(X) ≤ 0
Values of the objective matrix, F(X), and constraint matrix, G(X), must be supplied to minimax on an iterative basis. Values of X are returned at each iteration and new values of F(X) and G(X) must be evaluated. X, F(X) and G(X) may be scalars, vectors or matrices; G(X) may be the empty matrix.
If it is required to minimize the worst case absolute value of F (i.e. minimax abs(F(X))) then set PARA(13) with the number of objectives for which this is required. Such objectives should be partitioned into the first few elements of F(X).
Upon initialization para(1) should be set to 0. If other values for para are not supplied then minimax returns default parameters (see Tutorial). Minimax returns a value of para(1)=1 when the optimization has terminated following sufficient convergence or when the number of iterations exceeds para(14). Para(2) is a measure of the precision required of X before the optimization will terminate (default: 1e-4). Para(3) is a measure of the precision required of the objective function at the solution. Para(4) indicates the maximum constraint violation that can be tolerated before the optimization will terminate (default: 1e-7).
Gradient information, if available, need only be supplied when para(1)=2. The first columns of
grad should contain the gradients of F(X) ; the remaining columns should contain the gradients of
G(X). To change default settings and for more information refer to the Tutorial. Lower and upper
bounds on the design variables are set using the optional variables vlb, vub. Equality constraints should
be put in the first few elements of g and para(13) should be set with the number of them (see
Tutorial).
Example:
Find values of X which minimize the maximum value of [f1,f2,f3,f4,f5]
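A sketch of one formulation that is consistent with the results quoted below (the starting point shown is illustrative):
PARA = 0; %Reset Optimization Parameters
X = [0,1]; %Illustrative starting point
while PARA(1) ~= 1 %Check Termination Parameter
F(1) = 2*X(1)^2 + X(2)^2 - 48*X(1) - 40*X(2) + 304; %Evaluate Objectives
F(2) = -X(1) - 3*X(2);
F(3) = X(1) + 3*X(2) - 18;
F(4) = -X(1) - X(2);
F(5) = X(1) + X(2) - 8;
[X,PARA] = minimax(X,F,[ ],PARA); %Call Optimizer (no constraints)
end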
"CACSD using Optimization Methods" PhD Thesis A.C.W.Grace Univ. of Wales, Itangce. 1989 A-23
Appendix A Optimization Toolbox: REFERENCE minimax
This program is contained in the script file testminimax1.m and gives the following solution after 29
iterations.
F=
0.0000 -16.0000 -2.0000 -8.0000 0.0000
x=
4.0000 4.0000
Notes
The worst case absolute values of the elements of F can be minimized by setting PARA(13) with
the number of elements for which this is required. They should be partitioned into the first few
elements of F.
For example consider the above problem in which it is required to find values of X which minimize
the maximum absolute value of [f 11f2,f3 ,f4,f5]. This is solved by adding as the first line
PARA(13)=5;
On execution of this program the following results were obtained after 39 iterations.
F=
10.7609 -10.7609 -7.2391 -9.4382 1.4382
x=
8.7769 0.6613
Algorithm:
Minimax uses a Sequential Quadratic Programming algorithm, as for constr. The choice of merit function is changed by setting PARA(7). The default is to use the merit function of Han and Powell (see the constr references). An exact merit function together with a modified Hessian (see [1]) can be used by setting PARA(7)=1. The exact merit function method tends to be more robust than that proposed by Han and Powell but suffers from slower convergence in a number of examples.
Limitations:
The function to be minimized must be continuous. Minimax may only give local solutions. Minimax
does not allow equality constraints to be expressed.
See Also:
setpara, optimglob
References:
[1] Brayton R.K., Director S.W., Hachtel G.D. and Vidigal L., "A new algorithm for statistical circuit design based on quasi-Newton methods and function splitting", IEEE Trans. Circuits Syst., Vol. CAS-26, pp. 784-794, Sept. 1979.
[2] Madsen K. and Schjaer-Jacobsen H., "Algorithms for worst case tolerance optimization", IEEE Trans. Circuits and Systems, Vol. CAS-26, Sept. 1979.
"CACSD using Optimization Methods" PhD Thesis A.C.W.Grace Univ. of Wales. Bangor. 1989 A-24
solve
Purpose:
Solves Non-linear Equations.
Synopsis:
[x,para]=solve(x,f,para)
[x,para]=solve(x,f,para,grad)
Description:
Finds roots of algebraic non-linear equations of the form:
fij(X) = 0,    i = 1,...,m1,  j = 1,...,m2
Values of F(X) must be supplied to solve on an iterative basis. Values of X are returned at each iteration and new values of F(X) must be evaluated. X and F(X) may be scalars, vectors or matrices.
Upon initialization para(1) should be set to 0. If other values for para are not supplied then solve
returns default parameters (see Tutorial).
Solve returns a value of para(1)=1 when a root has been found or when the number of iterations
exceeds para(14). Para(3) indicates the precision with which a root is required (default: 1e-7).
Gradient information, if available, need only be supplied when para(1)=2. The columns of grad
should contain the gradients (partial derivatives) of F(X) for each element of X. If the optional variable
grad is not supplied gradients are calculated using a finite differences approximation.
Example:
Find a matrix X which satisfies the equation; X*X*X= [1, 2 ; 3, 4]; starting at the point X=[1,1;1,1].
PARA=0; %Reset Optimization Parameters
X=ones(2); %Initialize Design Variables
while PARA(1)~=1 %Check Termination Parameter
F=X*X*X-[1,2;3,4]; %Evaluate Non-linear Equation
[X,PARA]=solve(X,F,PARA); %Call Optimizer
end
This program is contained in the script file testsolve1.m and gives the following solution after
75 iterations:
A root has been found to the tolerance = 4.4239e-10
x=
-1.2915e-01 8.6022e-01
1.2903e+00 1.1612e+00
Limitations:
The function to be solved must be continuous. Solve only gives one root if successful. Solve may
converge to a non-zero point in which case other starting values should be tried.
Algorithm:
The choice of algorithm is made by setting para(5). The default algorithm (para(5)=0) is the
Levenberg-Marquardt method. Other Least Squares methods can be chosen by setting para(5) as given in
the leastsq reference section. Setting para(5)=5 implements a minimax method.
See Also:
setpara, optimglob
"CACSD using Optimization Methods" PhD Thesis A.C.W.Grace Univ. of Wales, Bangor. 1989 A-25
leastsq
Purpose:
Solves Non-linear least squares optimization problems.
Synopsis:
[x,para]=leastsq(x,f,para)
[x,para]=leastsq(x,f,para,grad)
Description:
Minimizes a non-linear function composed of squared terms:
min   Σ fij(X)²,    i = 1,...,m1,  j = 1,...,m2
 X    i,j
Values of F(X) must be supplied to leastsq on an iterative basis. Values of X are returned at each iteration and new values of F(X) must be evaluated. X and F(X) may be scalars, vectors or matrices.
Upon initialization para(1) should be set to 0. If other values for para are not supplied then leastsq
returns default parameters (see Tutorial).
Leastsq returns a value of para(1)=1 when the optimization has terminated following sufficient
convergence or when the number of iterations exceeds para(14). The optimization will terminate
successfully following convergence if the precision of X at a minimum is within the tolerance given by para(2) (default: 1e-4) and the objective function is estimated to be within the tolerance given by para(3) (default: 1e-4).
Gradient information, if available, need only be supplied when para(1)=2. The columns of grad
should contain the gradients (partial derivatives) of F(X) for each element of X. If the optional variable
grad is not supplied gradients are calculated using a finite differences approximation.
Example:
Find values of X which minimize the sum of squares
Σ (i = 1 to 10) fi(x)²,   where   fi(x) = 2 + 2i - e^(i·x1) - e^(i·x2),
starting at the point [0.3, 0.4]:
PARA=0; %Reset Optimization Parameters
X=[0.3,0.4]; %Initialize Design Variables
while PARA(1)~=1 %Check Termination Parameter
for i=1:10, F(i)=2+2*i-exp(X(1)*i)-exp(X(2)*i); end %Function Evaluations
[X,PARA]=leastsq(X,F,PARA); %Call Optimizer
end
This program is contained in the script file testsolve1.m and gives the following solution after 33
iterations:
The sum of squares = 124.3622
x=
0.25783 0.25783
"CACSD using Optimization Methods" PhD Thesis A.C.W.Grace Univ. of Wales. Bangor. 1989 A-26
Appendix A Optimization Toolbox: REFERENCE leastsq
Limitations:
The function to be minimized must be continuous. Leastsq may only give local solutions.
Algorithm:
The choice of algorithm is made by setting para(5). The default is the Levenberg-Marquardt method [1-3]. Setting para(5)=1 implements a Gauss-Newton method (see, for example, [4]). Setting
para(5)=2 implements an unconstrained optimization method.
See Also:
setpara, optimglob
References:
[1] Levenberg K., "A method for the solution of certain problems in least squares", Quart. Appl. Math., Vol. 2, pp. 164-168, 1944.
[2] Marquardt D., "An algorithm for least-squares estimation of nonlinear parameters", SIAM J. Appl. Math., Vol. 11, pp. 431-441, 1963.
[3] More J.J., "The Levenberg-Marquardt algorithm: implementation and theory", Numerical Analysis (G.A. Watson, ed.), Lecture Notes in Mathematics 630, Springer-Verlag, pp. 105-116, 1977.
[4] Dennis J.E., Jr., "Nonlinear least squares", State of the Art in Numerical Analysis (D. Jacobs, ed.), Academic Press, pp. 269-312, 1977.
"CACSD using Optimization Methods" PhD Thesis A.C.W.Grace Univ. of Wales, Bangor. 1989 A-27
lp
Purpose:
Solves Linear Programming(LP) problems.
Synopsis:
[x,para]=lp(f,A,b)
Description:
Solves the linear programming problem:
min  fᵀx
 x
subject to:  A·x ≤ b
where f is the vector of coefficients of the linear objective function. The matrix, A, and
vector, b, are the coefficients of the linear constraints. The vector, x, is the set of design variables.
Example:
Find values of x which minimize:  -5x1 - 4x2 - 6x3
subject to:  x1 - x2 + x3 ≤ 20,   3x1 + 2x2 + 4x3 ≤ 42,   3x1 + 2x2 ≤ 30,   x1, x2, x3 ≥ 0
Entering the following commands
f=[-5,-4,-6]
A=[ 1 -1  1
    3  2  4
    3  2  0
   -1  0  0
    0 -1  0
    0  0 -1]
b=[20;42;30;0;0;0]
x=lp(f,A,b)
gives the solution:
x=
0   15.0000    3.0000
Algorithm:
lp uses a variation of the qp algorithm.
"CACSD using Optimization Methods" PhD Thesis A.C.W.Grace Univ. of Wales, Bangor. 1989 A-28
qp
Purpose:
Solves Quadratic Programming(QP) problems.
Synopsis:
[x]=qp(H,c,A,b)
[x,lambda]=qp(H,c,A,b)
Description:
Solves the Quadratic Programming problem:
min  xᵀHx + cᵀx
 x
subject to:  A·x ≤ b
Where the Hessian matrix, H, and vector, c, are the set of coefficients of the quadratic objective
function. The matrix, A, and vector, b, are the coefficients of the linear constraints. The vector, x, is a
set of design variables.
Example:
Find values of x which minimize: f(X) = x 1 x2 [-1 lix, _ [2 6 ]x, + 10
L 1 —2 x2 2
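As an illustration of the calling sequence only (the data below are hypothetical and are not those of the example above):
H = [1 -1; -1 2]; %Hypothetical quadratic coefficient matrix (positive definite)
c = [-2; -6]; %Hypothetical linear coefficients
A = [1 1; -1 2; 2 1]; %Hypothetical linear constraint coefficients
b = [2; 2; 3];
x = qp(H,c,A,b) %Solve the QP subject to A*x <= b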
Algorithm:
Qp uses an active set method (which is also a projection method) similar to that described in
Gill and Murray [1]. Another method is implemented in the routine qp2 (with the same arguments as
qp) which uses Wolfe's method with a modified Simplex linear programming algorithm [2]
(programmed by S.Hancock).
References:
[1]Gill P.E., Murray W., and Wright M.H. Practical Optimization, Academic Press, London,
1981.
[2]Wolfe P., The simplex method for quadratic programming, Econometrica, Vol. 27 pp.382-398,
1959
setpara
Purpose:
Gives help and returns default settings for the optimization parameters.
Synopsis:
help setpara
para=setpara([ ])
para=setpara(para)
Description:
Setpara returns a vector of default parameters used in the optimization process. Typing help setpara gives details about the optimization parameters used in the routines. For a fuller description of the parameters refer to the Tutorial section.
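For example (the element numbers are those described in the Tutorial; the values are illustrative):
para = setpara([ ]); %Return the vector of default parameters
para(3) = 1e-8; %Tighten the objective function termination tolerance
para(19) = 2; %Request a contour plot when the optimizer is called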
"CACSD using Optimization Methods" PhD Thesis A.C.W.Grace Univ. of Wales, Bangor. 1989 A-30
optimglob
Purpose:
Sets up Global Optimization parameters for faster execution and later inspection of the variables.
Synopsis:
optimglob
Description:
Optimglob is a script file which sets up a number of global variables used in the optimization routines.
The advantage of this is that at each iteration it is no longer necessary to store the variables to an external
file. This serves to improve efficiency and allows the inspection of the variables at the end of the
optimization cycle.
In order to use global variables enter the command optimglob at the beginning of each session (or in your matlab.m or startup.m file). It is also necessary to use this command after every use of clear. Alternatively this command may be used as the first line in every optimization script file.
The global variables are as follows:
G_MATL G_MATX G_PCNT G_STEPMIN G_SD G_GCNT G_OLDF G_GRAD G_HOW
G_CHG G_LAMBDA G_GLOBFLAG G_LAMBDABEST G_XBEST G_FBEST
There are also a number of global variables associated with graphics facilities they are as follows:
G_MESH G_GPARA G_MDX G_MDY G_GXCNT G_GYCNT G_CONTOURS G_GSX G_GSY
G_AXIS G_MESHC G_AXIS2
The variables have been given the prefix G_ to avoid naming confusions in other routines. Of particular
interest is the string variable G_HOW which contains a complete history of the optimization cycle.
Example:
The file testunconstr2.m can be made to use global variables by adding optimglob as the first line in the script file:
optimglob
PARA=0;
X=[-1,1];
while PARA(1)~=1
F=exp(X(1))*(4*X(1)^2+2*X(2)^2+4*X(1)*X(2)+2*X(2)+1);
[X,PARA]=unconstr(X,F,PARA);
end
On execution of this program the global variables may be inspected. For example, the variable G_HOW
contains a complete history of the optimization cycle: The above example gives the following contents for
G_HOW.
"CACSD using Optimization Methods" PhD Thesis A.C.W.Grace Univ. of Wales, Bangor. 1989 A-31
Appendix A Optimization Toolbox: REFERENCE optimglob
G_HOW =
where ITERCNT is the number of iterations; F is the value of the objective function, F(X); STEP is the step length used in the line search; GRADIENT is the gradient at the new point in the direction of search; UPDATE is a measure of positive-definiteness of the Hessian update. The remaining columns give information regarding the procedures being performed at each stage, such as updating of the Hessian (update), step-length increase (incstep) or decrease (red_step, inter_step), or interpolation (inter).
Limitations
Optimglob must not be executed from a user-defined function, only from a script file or through keyboard entry.
"CACSD using Optimization Methods" PhD Thesis A.C.W.Grace Univ. of Wales, Bangor. 1989 A-32
Appendix B: GRADIENT MATRICES
The gradient matrix, ∂f(X)/∂X, of f(X), denoted ∇f(X), is the n×m matrix whose ijth element is defined by
[∇f(X)]ij = ∂f(X)/∂xij                                            (B.1)
Using the techniques supplied in [1] to [3], a number of gradient matrices were calculated for functions of the form f(X) = tr(F(X)), where F(X) is a square matrix function of the elements xij of the n×m matrix X. They are shown in Table B.1.
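As an illustration of how the entries of Table B.1 follow from definition (B.1), consider F(X) = AXBXᵀ; the derivation below is a worked sketch using standard trace identities:
\[
d\,\operatorname{tr}(AXBX^{T}) = \operatorname{tr}(BX^{T}A\,dX) + \operatorname{tr}\big((AXB)^{T}dX\big)
\quad\Longrightarrow\quad
\nabla\operatorname{tr}(AXBX^{T}) = (BX^{T}A)^{T} + AXB = A^{T}XB^{T} + AXB,
\]
which is the entry given in the table.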
REFERENCES
[1] Athans M. and Levine W.S., "Gradient matrices and matrix calculations," MIT Lincoln Labs.,
Lexington, Mass., Tech.Note 1965-53, 1965
[2] Athans M., "The matrix minimum principle," Inform. Contr., Vol.11, 1967
[3] Geering H.P., "On calculating gradient matrices," IEEE Trans. on Autom. Control, Vol AC-21,
No. 1, pp.615-616, 1976
Table B.1  Gradient matrices ∇tr(F(X))
F(X)                ∇tr(F(X))
AX                  Aᵀ
AXᵀ                 A
AXB                 AᵀBᵀ
AXᵀB                BA
XX                  2Xᵀ
XXᵀ                 2X
AXBX                AᵀXᵀBᵀ + BᵀXᵀAᵀ
XAXᵀ                XAᵀ + XA
AXBXᵀ               AᵀXBᵀ + AXB
XAXBXᵀ              XBᵀXᵀAᵀ + AᵀXᵀXBᵀ + XAXB
AXBXCXᵀ             AᵀXCᵀXᵀBᵀ + BᵀXᵀAᵀXCᵀ + AXBXC
AXBXCXDXᵀ           AᵀXDᵀXᵀCᵀXᵀBᵀ + BᵀXᵀAᵀXDᵀXᵀCᵀ + CᵀXᵀBᵀXᵀAᵀXDᵀ + AXBXCXD
AXBXᵀCXᵀ            AᵀXCᵀXBᵀ + CXᵀAXB + AXBXᵀC
AXBXᵀCXᵀDXᵀ         AᵀXDᵀXCᵀXBᵀ + CXᵀDXᵀAXB + DXᵀAXBXᵀC + AXBXᵀCXᵀD
"CACSD using Optimization Methods" PhD Thesis A.C.W.Grace Univ. of Wales, Bangor. 1989 B-2
Appendix B SYSTEM MATRICES
Ac = [ 0  0  0   -1.4955
       1  0  0  -11.9179
       0  1  0  -2114791
       0  0  1  -11.1962 ]
Bcff = [ -3.5311        Bcfb = [ 1
         -1.5529                 0
         -0.8727                 0
         -0.0929 ]               0 ]
"CACSD using Optimization Methods" PhD Thesis A.C.W.Grace Univ. of Wales. Bangor. 1989 B-3
Appendix B SYSTEM MATRICES
-
0 1.0000e+00 0 0 0
- 2.5690e+00 -1.0420e+00 1.1860e-04 -7.6040e-03 - 2.0820e-01
- 2.6730e+01 5.1230e-03 -5.1610e-02 1.6030e-02 - 1.1730e-05
A= - 1.8370e+02 -3.3090e+00 -1.8640e-01 -5.4360e-01 - 1.0310e+00
0 0 - 2.0000e+01
0 0 0
o o 0
o 5.3460e-05 0
0 5.3460e-05 0
o o o
- -
0 0 0
4.6160e-02 5.2490e-02 -2.7490e-0 -2.9040e-06
-2.3950e-02 2.2190e+01 1.6200e+0 3.3760e-03
-7.9150e-02 -2.7580e+00 -1.4910e+0 -3.1070e-04
o o 0
-1.0000e+0 o o 0
-5.0000e+00 o 0
o -2.8130e+00 2.6830e+0 9.2780e-04
o 5.2970e-05 -1.6420e+0 6.2720e-04
3.1390e+0 o o -1.3330e+01
0 0
0 0
0 0
B= 0 0
-20 0
0 -10
0 0
0 0
0 0
0 0_
(2 = 0 0 0 -1.6960e-01 0
p 7.6290e-04 0 5.9220e-01 0 0
o o 0 0 0
o 0 0 0 0
]
"CACSD using Optirnizstion Methods" PhD Thesis A.C.W.Grace Univ. of Wales, Bangor. 1989 B-4