Mathematical Programming for Power
Systems Operation

From Theory to Applications in Python

Alejandro Garcés
Technological University of Pereira
Pereira, Colombia

This edition first published 2022
© 2022 by The Institute of Electrical and Electronics Engineers, Inc. All rights reserved.

Published by John Wiley & Sons, Inc., Hoboken, New Jersey.

Published simultaneously in Canada.

No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any
form or by any means, electronic, mechanical, photocopying, recording, scanning, or otherwise,
except as permitted under Section 107 or 108 of the 1976 United States Copyright Act, without
either the prior written permission of the Publisher, or authorization through payment of the
appropriate per-copy fee to the Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers,
MA 01923, (978) 750-8400, fax (978) 750-4470, or on the web at www.copyright.com. Requests to
the Publisher for permission should be addressed to the Permissions Department, John Wiley &
Sons, Inc., 111 River Street, Hoboken, NJ 07030, (201) 748-6011, fax (201) 748-6008, or online at
http://www.wiley.com/go/permission.

Limit of Liability/Disclaimer of Warranty: While the publisher and author have used their best
efforts in preparing this book, they make no representations or warranties with respect to the
accuracy or completeness of the contents of this book and specifically disclaim any implied
warranties of merchantability or fitness for a particular purpose. No warranty may be created or
extended by sales representatives or written sales materials. The advice and strategies contained
herein may not be suitable for your situation. You should consult with a professional where
appropriate. Neither the publisher nor author shall be liable for any loss of profit or any other
commercial damages, including but not limited to special, incidental, consequential, or other
damages.

For general information on our other products and services or for technical support, please contact
our Customer Care Department within the United States at (800) 762-2974, outside the United
States at (317) 572-3993 or fax (317) 572-4002.

Wiley also publishes its books in a variety of electronic formats. Some content that appears in print
may not be available in electronic formats. For more information about Wiley products, visit our
web site at www.wiley.com.

Library of Congress Cataloging-in-Publication Data

A catalogue record for this book is available from the Library of Congress

Paperback ISBN: 9781119747260; ePub ISBN: 9781119747284;

ePDF ISBN: 9781119747277; oBook ISBN: 9781119747291

Cover image: © Redlio Designs/Getty Images

Cover design by Wiley

Set in 9.5/12.5pt STIXTwoText by Integra Software Services Pvt. Ltd, Pondicherry, India

10 9 8 7 6 5 4 3 2 1

Contents

Acknowledgment
Introduction

1 Power systems operation
1.1 Mathematical programming for power systems operation
1.2 Continuous models
1.2.1 Economic and environmental dispatch
1.2.2 Hydrothermal dispatch
1.2.3 Effect of the grid constraints
1.2.4 Optimal power flow
1.2.5 Hosting capacity
1.2.6 Demand-side management
1.2.7 Energy storage management
1.2.8 State estimation and grid identification
1.3 Binary problems in power systems operation
1.3.1 Unit commitment
1.3.2 Optimal placement of distributed generation and capacitors
1.3.3 Primary feeder reconfiguration and topology identification
1.3.4 Phase balancing
1.4 Real-time implementation
1.5 Using Python

Part I Mathematical programming

2 A brief introduction to mathematical optimization
2.1 About sets and functions
2.2 Norms
2.3 Global and local optimum
2.4 Maximum and minimum values of continuous functions
2.5 The gradient method
2.6 Lagrange multipliers
2.7 Newton’s method
2.8 Further readings
2.9 Exercises

3 Convex optimization
3.1 Convex sets
3.2 Convex functions
3.3 Convex optimization problems
3.4 Global optimum and uniqueness of the solution
3.5 Duality
3.6 Further readings
3.7 Exercises

4 Convex Programming in Python
4.1 Python for convex optimization
4.2 Linear programming
4.3 Quadratic forms
4.4 Semidefinite matrices
4.5 Solving quadratic programming problems
4.6 Complex variables
4.7 What is inside the box?
4.8 Mixed-integer programming problems
4.9 Transforming MINLP into MILP
4.10 Further readings
4.11 Exercises

5 Conic optimization
5.1 Convex cones
5.2 Second-order cone optimization
5.2.1 Duality in SOC problems
5.3 Semidefinite programming
5.3.1 Trace, determinant, and the Schur complement
5.3.2 Cone of semidefinite matrices
5.3.3 Duality in SDP
5.4 Semidefinite approximations
5.5 Polynomial optimization
5.6 Further readings
5.7 Exercises

6 Robust optimization
6.1 Stochastic vs robust optimization
6.1.1 Stochastic approach
6.1.2 Robust approach
6.2 Polyhedral uncertainty
6.3 Linear problems with norm uncertainty
6.4 Defining the uncertainty set
6.5 Further readings
6.6 Exercises

Part II Power systems operation

7 Economic dispatch of thermal units
7.1 Economic dispatch
7.2 Environmental dispatch
7.3 Effect of the grid
7.4 Loss equation
7.5 Further readings
7.6 Exercises

8 Unit commitment
8.1 Problem definition
8.2 Basic unit commitment model
8.3 Additional constraints
8.4 Effect of the grid
8.5 Further readings
8.6 Exercises

9 Hydrothermal scheduling
9.1 Short-term hydrothermal coordination
9.2 Basic hydrothermal coordination
9.3 Non-linear models
9.4 Hydraulic chains
9.5 Pumped hydroelectric storage
9.6 Further readings
9.7 Exercises

10 Optimal power flow
10.1 OPF in power distribution grids
10.1.1 A brief review of power flow analysis
10.2 Complex linearization
10.2.1 Sequential linearization
10.2.2 Exponential models of the load
10.3 Second-order cone approximation
10.4 Semidefinite approximation
10.5 Further readings
10.6 Exercises

11 Active distribution networks
11.1 Modern distribution networks
11.2 Primary feeder reconfiguration
11.3 Optimal placement of capacitors
11.4 Optimal placement of distributed generation
11.5 Hosting capacity of solar energy
11.6 Harmonics and reactive power compensation
11.7 Further readings
11.8 Exercises

12 State estimation and grid identification
12.1 Measurement units
12.2 State estimation
12.3 Topology identification
12.4 Y bus estimation
12.5 Load model estimation
12.6 Further readings
12.7 Exercises

13 Demand-side management
13.1 Shifting loads
13.2 Phase balancing
13.3 Energy storage management
13.4 Further readings
13.5 Exercises

A The nodal admittance matrix

B Complex linearization

C Some Python examples
C.1 Basic Python
C.2 NumPy
C.3 Matplotlib
C.4 Pandas

Bibliography
Index

Acknowledgment

Throughout the writing of this book, I have received a great deal of support and
assistance from many people. I would first like to thank my friends Lucas Paul
Perez at Welltec, Adrian Correa at Universidad Javeriana in Bogotá-Colombia,
Ricardo Andres Bolaños at XM (the transmission system operator in Colombia),
Raymundo Torres at Sintef-Norway, and Juan Carlos Bedoya at the Pacific
Northwest National Laboratory (USA), who, in 2020 (during the COVID-19
pandemic), agreed to discuss some practical aspects associated with power
system operation problems. The discussions during these video conferences were
invaluable in improving the content of the book. I am also very grateful to my
students, who are the primary motivation for writing this book. Special thanks
to my former Ph.D. students, Danilo Montoya and Walter Julian Gil. Finally, I
want to thank the Department of Electric Power Engineering at the Universidad
Tecnológica de Pereira in Colombia and the Von Humboldt Foundation in
Germany for the financial support required to continue my research on the
operation and control of power systems.
Alejandro Garcés


Introduction

Electrification is the most outstanding engineering achievement of the 20th
century, a well-deserved distinction if we consider the high complexity of
generation, transmission, and distribution systems. An electric power system includes
hundreds or even thousands of generation units, transformers, and transmission
lines, located throughout an entire country and operated continuously 24
hours per day. Running such a complex system is a great challenge that requires
advanced mathematical techniques.
All industrial systems seek to increase their competitiveness by improving
their efficiency, and electric power systems are no exception. We can
improve efficiency by introducing new technologies but also by implementing
mathematical optimization models into daily operation. Every mathematical
programming model requires four critical stages, depicted in
Figure 0.1. The first stage is an informed review of reality, identifying
opportunities for improvement. This stage may include conversations with experts in
order to establish the available data and the variables that are subject to
optimization. The second stage is the formulation of an optimization model as given
below:

$$
\begin{aligned}
\min\ & f(x) \\
\text{subject to}\ & x \in \Omega
\end{aligned}
\tag{0.1}
$$

where 𝑥 is the vector of decision variables, 𝑓 is the objective function, and Ω is
the set of feasible solutions. Going from stage one (reality) to stage two (model)
is more of an art than a science. One problem may have different models with
different degrees of complexity. Practice and experience are required to master
this stage, as some models are easier to solve than others. Subsequently, the
third stage consists of the implementation of the mathematical model in
software. After that, the fourth stage is the analysis of results in the context of
the real problem.
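
As a toy illustration of stages two and three, the sketch below writes a one-variable instance of model (0.1) and solves it in plain Python. The objective 𝑓(𝑥) = (𝑥 − 3)², the feasible set Ω = [0, 2], the step size, and the function names are all assumptions made for this example. The projected gradient iteration is only one of many possible algorithms and is not necessarily the one used later in the book.

```python
def project(x, lower, upper):
    """Project x onto the interval [lower, upper], which plays the role of Omega."""
    return max(lower, min(upper, x))


def projected_gradient(grad, x0, lower, upper, step=0.1, iterations=200):
    """Take gradient steps and project back onto the feasible interval each time."""
    x = project(x0, lower, upper)
    for _ in range(iterations):
        x = project(x - step * grad(x), lower, upper)
    return x


def f(x):
    """Hypothetical objective function."""
    return (x - 3.0) ** 2


def grad_f(x):
    """Derivative of the hypothetical objective."""
    return 2.0 * (x - 3.0)


x_opt = projected_gradient(grad_f, x0=0.0, lower=0.0, upper=2.0)
print(x_opt, f(x_opt))
```

The unconstrained minimum (𝑥 = 3) lies outside Ω, so the iteration settles on the boundary of the feasible set. Interpreting that boundary solution is the kind of analysis the fourth stage refers to.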


Figure 0.1 Stages of solving an optimization problem.

This book will focus on stages two and three, associated with power system
operation models. In particular, we are interested in models with a geometric
characteristic called convexity, which presents several advantages, namely:

● We can guarantee the global optimum and a unique solution under well-defined
conditions. This aspect is interesting from both theoretical and practical
points of view. In general, a global optimum is advisable in real operation
problems.
● There are efficient algorithms for solving convex problems. In addition, we
can guarantee convergence of these algorithms. This is a critical aspect for
operation problems that must be solved in real time.
● There are commercial and open-source packages for solving convex
optimization models. In particular, we are going to use CvxPy, a free
Python-embedded modeling language for convex problems.
● Many power system operation problems are already convex; for example,
the economic and environmental dispatches, the hydrothermal coordination,
and the load estimation problem. Besides, it is possible to find efficient convex
approximations to non-convex problems such as the optimal power flow.

In summary, convex problems have both theoretical and practical advantages
for power systems operation. This book studies both aspects. The book is
aimed at undergraduate and graduate students of power systems engineering.
Concepts related to power systems analysis such as per-unit representation,
the nodal admittance matrix, and the power flow problem are taken for granted.
A previous course in linear programming is desirable but not mandatory. We do
not intend to cover all the theory behind convex optimization; instead,
we try to present particular aspects of convex optimization that are useful in
power systems operation. The book is divided into two parts. In the first part,
the main concepts of convex optimization are presented, including a separate
chapter on conic optimization. After that, selected applications for power
systems operation are presented. Most solvers for convex optimization
allow mixed-integer convex problems. Therefore, we include models that can
be solved in this framework too. Students are encouraged to do numerical
experiments in order to acquire practical intuition about the problems.
All applications are presented in Python, a language that is becoming
more important in power systems applications. Students are not expected
to have previous knowledge of Python, although basic programming concepts
(in any language) are helpful. Our methodology is based on many
examples and toy-models. We made a great effort to show the simplest
model with clean code. Of course, these toy-models are an oversimplification
of the real problem; however, they allow us to understand the model and
its coding. In practice, we may have complex models that combine different
aspects such as the economic dispatch, the unit commitment, and/or the
optimal power flow. A real operation model may require a sophisticated platform
that integrates the model with the supervisory control and data acquisition
(SCADA) system operating in real time. The development of such a real industrial
model is beyond the objectives of this book.

1 Power systems operation

Learning outcomes

By the end of this chapter, the student will be able to:

● Identify problems related to power systems operation.
● Link mathematical optimization models to power systems operation
problems.

1.1 Mathematical programming for power systems operation

Mathematical optimization is a fundamental tool for the electrical supply
chain, from generation through transmission, distribution, and end-use. It can
be applied across time frames ranging from a few milliseconds to several years.
This book concentrates on optimization problems for power systems operation.
These problems are usually continuous and have a time frame from several
minutes to one day. Optimization problems with faster dynamics belong to
control and stability analysis, whereas problems with slower dynamics are
planning problems.
Mathematical optimization problems associated with power system operation
have existed since the beginning of operations research as an independent
area, back in the middle of the 20th century. However, modern
technologies such as renewable energies and electric vehicles, and current
concepts such as smart grids, active distribution networks, and microgrids,
have created a renewed interest in mathematical optimization applied to power
systems. Smart grids involve a massive use of technologies such as power
electronics, communications, and advanced metering. However, the smart
aspect of these grids comes from mathematical techniques, such as mathematical
optimization, that manage these technologies in order to improve the
efficiency, reliability, security, and resilience of the system.
Figure 1.1 depicts schematically four common types of mathematical
optimization models. These are linear programming (LP), mixed-integer linear
programming (MILP), non-linear programming (NLP), and mixed-integer
non-linear programming (MINLP). Another classification separates them into
convex and non-convex problems. The former include LPs and some NLPs;
the latter are the rest of the problems. Convex problems are well-behaved in
the sense that they have theoretical guarantees, such as a global optimum, and
practical algorithms with a fast convergence rate. The first part of the book
presents these theoretical aspects. However, not all power systems operation
problems are convex; therefore, we need to develop convex approximations
for those problems, most of them based on conic optimization as presented in
Chapter 5.
A power system is quite complex, and therefore, modeling and implement-
ing mathematical optimization problems are equally complex. We need to gain
experience in the complex art of modeling and solving mathematical opti-
mization problems for power system applications. Our approach is to create
toy-models for each problem. These are simplified models that allow us to
understand the central issues and do numerical experiments. In the following
sections, we briefly describe each of these toy-models, explained in detail in the
second part of the book.

Figure 1.1 Types of optimization models.


1.2 Continuous models

1.2.1 Economic and environmental dispatch
The economic and environmental dispatch of thermal units is one of the most
classic problems in power systems operation. It consists of minimizing the oper-
ating costs or the total CO2 emissions, subject to physical constraints such as the
power balance and the maximum generation capacity. For the economic dis-
patch, each generation unit has a cost function 𝑓𝑖 , which is usually quadratic
and depends on the power generated by each unit. Thus, the objective is to
minimize the total cost (or emissions), subject to power balance, as presented
below:
$$
\begin{aligned}
\min\ & \sum_{i} f_i(p_i) \\
& \sum_{i} p_i = d \\
& p_{\min} \le p_i \le p_{\max}
\end{aligned}
\tag{1.1}
$$

where 𝑝𝑖 is the power generated by each thermal unit, and 𝑑 is the total
demand. Environmental dispatch introduces quadratic or exponential functions
in the objective function, but the problem’s structure is the same.
Moreover, power flow constraints can be introduced into the model, although,
in that case, it is more precise to call the problem an optimal power flow
(OPF). Chapter 7 presents the economic and environmental dispatches, while
Chapter 10 presents the OPF problem.
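
To make the structure of model (1.1) concrete, the sketch below solves a two-unit economic dispatch in plain Python. It assumes quadratic costs of the form 𝑓𝑖(𝑝𝑖) = 0.5·a·𝑝𝑖² + b·𝑝𝑖 with a > 0, and it uses the classic equal-marginal-cost (lambda iteration) idea with bisection rather than a modeling language such as CvxPy. The unit data, the function names, and the choice of algorithm are assumptions made for illustration.

```python
def dispatch_at_price(lam, units):
    """Output of each unit when its marginal cost a*p + b equals lam, clipped to its limits."""
    return [min(max((lam - b) / a, p_min), p_max) for (a, b, p_min, p_max) in units]


def economic_dispatch(units, demand, tol=1e-9):
    """Bisection on the marginal price lam until total generation matches demand."""
    if not sum(u[2] for u in units) <= demand <= sum(u[3] for u in units):
        raise ValueError("demand is outside the total generation limits")
    low = min(b + a * p_min for (a, b, p_min, p_max) in units)
    high = max(b + a * p_max for (a, b, p_min, p_max) in units)
    while high - low > tol:
        lam = 0.5 * (low + high)
        if sum(dispatch_at_price(lam, units)) < demand:
            low = lam   # too little generation: the price must rise
        else:
            high = lam  # enough generation: the price can fall
    lam = 0.5 * (low + high)
    return dispatch_at_price(lam, units), lam


# Hypothetical units: (a, b, p_min, p_max), cost f(p) = 0.5*a*p**2 + b*p, power in MW
units = [
    (0.02, 10.0, 0.0, 200.0),
    (0.04, 8.0, 0.0, 150.0),
]
demand = 250.0

p, lam = economic_dispatch(units, demand)
print(p, lam)
```

In this example neither unit reaches a limit, so both operate at the same marginal cost, which is the optimality condition that the power balance constraint imposes through its Lagrange multiplier.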
Another problem closely related to the economic dispatch of thermal units
is the unit commitment. This problem considers not only the operating
costs but also the start-up and shut-down costs of thermal units. There-
fore, the problem becomes binary and dynamic. This problem is studied in
Chapter 8.
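
The binary and dynamic nature of the unit commitment can be seen in a very small example. The sketch below enumerates every on/off schedule for two units over three hours and keeps the cheapest feasible one. All data are hypothetical, costs are linear, minimum generation limits and shut-down costs are ignored, and all units are assumed to start switched off. Brute-force enumeration is used only because the instance is tiny; it is not a practical solution method.

```python
from itertools import product

# Hypothetical data: (capacity in MW, variable cost in $/MWh, no-load cost in $/h, start-up cost in $)
UNITS = [(100.0, 20.0, 100.0, 500.0), (50.0, 40.0, 50.0, 100.0)]
DEMAND = [60.0, 120.0, 80.0]  # hourly demand in MW


def period_cost(on, load):
    """Cost of serving `load` with the committed units, cheapest first; None if infeasible."""
    committed = sorted((u for u, flag in zip(UNITS, on) if flag), key=lambda u: u[1])
    cost, remaining = 0.0, load
    for capacity, variable_cost, no_load_cost, _ in committed:
        power = min(capacity, remaining)
        cost += no_load_cost + variable_cost * power
        remaining -= power
    return cost if remaining <= 1e-9 else None


def unit_commitment():
    """Enumerate every on/off schedule and keep the cheapest feasible one."""
    states = list(product((0, 1), repeat=len(UNITS)))
    best_cost, best_plan = float("inf"), None
    for plan in product(states, repeat=len(DEMAND)):
        total, previous = 0.0, (0,) * len(UNITS)  # assumption: all units start switched off
        for on, load in zip(plan, DEMAND):
            cost = period_cost(on, load)
            if cost is None:
                total = float("inf")
                break
            start_up = sum(u[3] for u, now, before in zip(UNITS, on, previous) if now and not before)
            total += cost + start_up
            previous = on
        if total < best_cost:
            best_cost, best_plan = total, plan
    return best_cost, best_plan


best_cost, best_plan = unit_commitment()
print(best_cost, best_plan)
```

The number of schedules grows exponentially with the number of units and periods, which is why realistic instances are formulated as mixed-integer programs instead.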

1.2.2 Hydrothermal dispatch

The economic dispatch problem may include hydroelectric power plants; these
plants introduce two fundamental changes into the model. On the one hand, the
model becomes dynamic since current operational decisions affect the future
operation of the system. On the other hand, the problem becomes stochastic
because the inflows are usually random variables, especially in long-term
models. The latter aspect is usually handled by accurate forecasting of the
loads and the inflows; hence, it is possible to formulate a deterministic problem,
called hydrothermal dispatch or hydrothermal coordination. The basic model
has the following structure:
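$$
\begin{aligned}
\min\ & \sum_{t} \sum_{i} f_i(p_{it}) \\
& \sum_{i} p_{it} + \sum_{j} p_{jt} = d_t \quad \forall t \in \mathcal{T} \\
& p_{jt} = g(q_{jt}, v_{jt}) \\
& v_{j(t+1)} = v_{jt} + \alpha (a_{jt} - q_{jt} - s_{jt}) \\
& p_{\min} \le p_{it} \le p_{\max} \\
& q_{\min} \le q_{jt} \le q_{\max} \\
& s_{\min} \le s_{jt} \le s_{\max} \\
& v_{\min} \le v_{jt} \le v_{\max}
\end{aligned}
\tag{1.2}
$$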

where 𝑖 represents thermal units and 𝑗 enumerates hydroelectric units; 𝑡
represents the time; thus, 𝑝𝑖𝑡 is the power generated by the thermal unit 𝑖 at time 𝑡.
The values of 𝑎𝑗𝑡, 𝑣𝑗𝑡, 𝑝𝑗𝑡, 𝑞𝑗𝑡, and 𝑠𝑗𝑡 are, respectively, the inflow, volume, power,
outflow, and spillage of the hydroelectric unit 𝑗 at time 𝑡. Figure 1.2 shows
these variables.
Figure 1.2 Schematic representation of the variables associated with a hydroelectric
generation unit.

In this model, 𝑔 represents the relation between generated power, outflow,
and water volume stored in the dam. Although the planning horizon 𝒯 may
be short-term (1 day to 1 week), medium-term (1 month), or long-term (1
or more years), we are interested only in the short-term model. As mentioned
above, the problem may be stochastic since power demands 𝑑𝑡 and inflows 𝑎𝑗𝑡
are all random variables. However, a deterministic model is suitable to
understand the problem and its practical implementation. The situation becomes
even more problematic when introducing other renewable energies, such as
wind generation and photovoltaic solar generation. Chapter 9 presents the
hydrothermal dispatch problem.
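
The dynamic coupling in model (1.2) comes from the reservoir balance. The sketch below simulates that balance for a single reservoir and a given release schedule, checking the volume limits along the way. The numerical data are hypothetical, and 𝛼 is interpreted here as the factor that converts flow over one period into volume, which is an assumption since the text above does not define it.

```python
def simulate_reservoir(v0, inflow, outflow, spillage, alpha, v_min, v_max):
    """Apply v[t+1] = v[t] + alpha*(a[t] - q[t] - s[t]) and check the volume limits."""
    volumes = [v0]
    for a, q, s in zip(inflow, outflow, spillage):
        v_next = volumes[-1] + alpha * (a - q - s)
        if not v_min <= v_next <= v_max:
            raise ValueError(f"volume {v_next} violates the reservoir limits")
        volumes.append(v_next)
    return volumes


# Hypothetical three-period schedule (consistent but arbitrary units)
volumes = simulate_reservoir(
    v0=100.0,
    inflow=[10.0, 10.0, 10.0],
    outflow=[15.0, 5.0, 10.0],
    spillage=[0.0, 0.0, 0.0],
    alpha=1.0,
    v_min=50.0,
    v_max=150.0,
)
print(volumes)
```

Releasing more water than the inflow in the first period lowers the stored volume, which limits what can be generated later. In the optimization model the outflows are decision variables rather than fixed inputs.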

1.2.3 Effect of the grid constraints

Both the economic dispatch of thermal units and the hydrothermal dispatch are
relatively simple problems. However, the transmission grid introduces addi-
tional constraints. Let us consider, for example, a power system with three
operation areas, as shown in Figure 1.3. Each line has a maximum power flow
capacity that introduces additional constraints in the model (both economic
dispatch and hydrothermal dispatch). This constraint limits the flow among
areas and modifies the results. These aspects are studied in Chapter 7 from a
classic perspective and later, in Chapter 10, under a modern view based on conic
optimization.

1.2.4 Optimal power ﬂow

The power flow is one of the most important tools for the analysis of power
systems. It determines the state of the power system from the voltage
magnitude and active power at all generating nodes and the active and reactive
power demanded by the loads. This results in a non-linear system of equations
in complex variables, as presented below:

$$
s_k^* = \sum_{m \in \mathcal{N}} y_{km} v_k^* v_m \tag{1.3}
$$

Figure 1.3 Economic dispatch by areas considering network constraints with the
transport model.


where 𝑠𝑘 = 𝑝𝑘 + j𝑞𝑘 represents the active and reactive power in node 𝑘 (j is the
imaginary unit); 𝒩 represents the set of nodes of the grid; 𝑦𝑘𝑚 is the entry 𝑘𝑚 of
the 𝑌bus matrix; 𝑣𝑘 and 𝑣𝑚 are the voltages at nodes 𝑘 and 𝑚, respectively, both
represented as complex variables; and 𝑠𝑘∗ and 𝑣𝑘∗ are the complex conjugates of the
respective variables. This complex representation can be split into real and
imaginary parts. However, as presented in Chapter 10, a complex representation is
suitable both for modeling and implementation purposes.
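
Python handles complex numbers natively, so Equation (1.3) can be evaluated directly. The sketch below computes the nodal powers of a two-node grid from a given voltage profile. The line admittance and the voltages are hypothetical per-unit values, and the function name is an assumption made for this example.

```python
def nodal_power(y_bus, v):
    """Evaluate conj(s_k) = conj(v_k) * sum_m y_km * v_m at every node and return s_k."""
    n = len(v)
    s = []
    for k in range(n):
        current = sum(y_bus[k][m] * v[m] for m in range(n))  # injected current at node k
        s_conj = v[k].conjugate() * current                   # right-hand side of Equation (1.3)
        s.append(s_conj.conjugate())
    return s


# Hypothetical two-node grid in per unit: a single line with series admittance y
y = 4 - 8j
y_bus = [[y, -y],
         [-y, y]]
v = [1.0 + 0.0j, 0.95 - 0.05j]

s = nodal_power(y_bus, v)
losses = sum(s)
print(s, losses)
```

The sum of the nodal powers equals the power lost in the line, which gives a quick consistency check on the computed values.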
These constraints can be introduced into the economic dispatch, as well as
into an optimization model that minimizes total power loss (𝑝loss). In both cases,
we call the problem an OPF. The basic model has the following structure:

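$$
\begin{aligned}
\min_{v,s}\ & p_{\text{loss}}(v) \\
& s_k^* = \sum_{m \in \mathcal{N}} y_{km} v_k^* v_m \\
& p_{\min} \le p \le p_{\max} \\
& \| s \| \le s_{\max} \\
& v_{\min} \le \| v_k \| \le v_{\max} \\
& \operatorname{angle}(v_0) = 0
\end{aligned}
\tag{1.4}
$$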

This problem is highly complex due to the non-linear and non-convex nature
of the power flow equations. Therefore, we need approximations that simplify
the model. Chapter 10 presents three of them: linearization, second-order cone
approximation, and semidefinite programming approximation.
Linearization is, perhaps, the most straightforward way to solve the prob-
lem. Although the concept of linearization is well-known in real numbers, in
this case, we do a linearization on the complex domain. This linearization uses
Wirtinger’s calculus since Equation (1.3) is non-holomorphic (i.e., it does not
have a derivative in the complex numbers). Chapter 10 and Appendix B study
this aspect in detail.
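
As a rough sketch of the idea, consider the bilinear term in Equation (1.3) and treat a voltage and its conjugate as independent variables, as Wirtinger's calculus allows. A first-order expansion around an operating point (𝑣𝑘0, 𝑣𝑚0), where the subscript 0 and the increments Δ are notation assumed here, drops the product of the increments:

$$
\begin{aligned}
v_k^* v_m &= (v_{k0}^* + \Delta v_k^*)(v_{m0} + \Delta v_m) \\
&\approx v_{k0}^* v_{m0} + v_{k0}^* \Delta v_m + v_{m0} \Delta v_k^* \\
&= v_{k0}^* v_m + v_{m0} v_k^* - v_{k0}^* v_{m0}
\end{aligned}
$$

The result is affine in 𝑣𝑚 and 𝑣𝑘∗. For a flat operating point (𝑣𝑘0 = 𝑣𝑚0 = 1 per unit) it reduces to 𝑣𝑚 + 𝑣𝑘∗ − 1. The formulation developed in Chapter 10 and Appendix B may differ in its details.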
We also solve the optimal power flow using second-order cone and semidef-
inite programming. These approximations demand a basic understanding of
conic optimization. Therefore, we present a general background of conic opti-
mization in Chapter 5, and its application to the optimal power flow problem
in Chapter 10, including a complete discussion about their advantages and
disadvantages.
An optimal power flow may optimize both power systems and power distribution
grids. However, the latter case presents some particularities that deserve
an independent study. Moreover, both solar and wind energy require power
electronic converters connected at the power distribution level. These converters
are capable of controlling reactive power, and therefore, it is possible to
formulate an OPF in which the decision variables are the power factors of the
converters. The problem can be deterministic for real-time operation or stochastic
for day-ahead planning. In both cases, the model has, at least, the same
complexity as the basic OPF.

1.2.5 Hosting capacity

The concept of hosting capacity refers to the amount of solar photovoltaic gen-
eration (or wind) that can be hosted on a power distribution network, at a
given time, without adversely impacting safety, power quality, reliability, or
other operational features [1]. This analysis can use different performance cri-
teria, including transient and stationary state analysis. The latter alludes to
the maximum amount of generated power possible to host without creating
over-voltage problems and maintaining operational limits on the distribution
lines.
A hosting capacity analysis requires considering the stochastic nature of solar
radiation and power demand [2]. Furthermore, it needs to consider the non-
linear characteristics related to the power flow equations. Therefore, it is a
problem as complex as the optimal power flow (Chapter 11 investigates this
problem).

1.2.6 Demand-side management

Demand-side management constitutes a paradigm shift in the context of smart
grids, where loads are active components subject to optimization. The role of
demand-side management is crucial to decrease CO2 emissions, reduce
bottlenecks in the transmission system, diminish operational cost, and improve
efficiency. In order to attain these objectives, it is necessary to apply
mechanisms of electrical load management with static and dynamic techniques. Static
techniques involve administrative measures such as policies and activities that
incentivize end-users to change their energy demand pattern; dynamic techniques
include actions to reduce electricity consumption, such as peak clipping,
valley filling, and load shifting, among others. Information and communication
technologies (ICT), as well as the concept of the Internet of Things
(IoT), make it possible to control the loads and integrate this control in a
centralized optimization model.


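In general, controllable loads are introduced in a model very similar to the
economic dispatch. Its basic structure is presented below:

$$
\begin{aligned}
\min\ & \sum_{t} \sum_{i} c_{it} p_{it} \\
& \sum_{i} (\bar{p}_{it} - p_{it}) \ge d_t \\
& 0 \le p_{it} \le \bar{p}_{it}
\end{aligned}
\tag{1.5}
$$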

where 𝑝̄𝑖𝑡 is the power required by the load 𝑖 at time 𝑡; 𝑝𝑖𝑡 is the amount of
power that is reduced due to the demand-side management model; 𝑐𝑖𝑡 is the
cost of disconnecting one unit of power; and 𝑑𝑡 is the minimum demand. This
is only the basic optimization model, which can be modified in order to include
more types of loads and other aspects of the operation of the system.
Some loads can be moved in time, for example, a washing machine in a
household. These loads, known as shifting loads, can be optimized by defining
the load’s optimal starting time. This optimization model is binary but tractable,
as presented in Chapter 13.
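
For a single shifting load, choosing the starting time amounts to selecting exactly one candidate hour, which is what a binary variable per candidate would express. The sketch below simply enumerates the candidates. The price profile, the load profile, and the function name are hypothetical, and the load is assumed to run without interruption once started.

```python
def best_start_time(prices, load_profile):
    """Return (start, cost) minimizing the energy cost of an uninterruptible load."""
    duration = len(load_profile)
    best = None
    for start in range(len(prices) - duration + 1):
        cost = sum(prices[start + h] * load_profile[h] for h in range(duration))
        if best is None or cost < best[1]:
            best = (start, cost)
    return best


prices = [30.0, 25.0, 20.0, 22.0, 40.0, 45.0]  # hypothetical price in cents/kWh for six hours
washing_machine = [2.0, 1.0]                     # hypothetical energy in kWh for each hour of the cycle

start, cost = best_start_time(prices, washing_machine)
print(start, cost)
```

With several loads that share a power limit, the choices are coupled and enumeration no longer scales, which is where a mixed-integer formulation becomes useful.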
A demand-side management model can also include a model for tertiary control
in microgrids or a model for charging electric vehicles. The latter is usually
called vehicle-to-grid or V2G. In these cases, the optimization model must
be executed in real time by an aggregator, as depicted in Figure 1.4.
An aggregator is a crucial component in modern smart distribution networks.
This device receives information from the final users (in this case, the
electric vehicles) and issues the control actions in order to obtain a smart
operation. However, the intelligent part of this system is not in the hardware but
in the optimization required to solve the problem efficiently and in real time;
therein lies the importance of understanding the optimization model.
Figure 1.4 Vehicle-to-grid concept with an aggregator that centralizes control
actions. Dashed lines represent a communication architecture with the aggregator.

A V2G strategy can be unidirectional or bidirectional. In unidirectional V2G,
an aggregator controls the charging of the electric vehicles in the same way as
shifting loads. In bidirectional V2G, the electric vehicle can inject power into
the grid if required for improving the operation. In any case, the model can
become stochastic since the state of charge of the vehicles can be unknown, and
the aggregator does not control the arrival and departure times of the vehicles¹.
The aggregator can also incorporate economic dispatch and OPF models to manage
other distributed resources such as local batteries, solar panels, and wind
turbines. Chapter 11 examines these problems.

1.2.7 Energy storage management

Modern power systems can integrate renewable energy and energy storage
devices through a virtual power plant (VPP), an entity that groups and
centralizes the operation of distributed resources to be dispatched by the power
system operator. A VPP can encompass an entire region with different renewable
sources and energy storage devices. It can also group other microgrids
along a distribution feeder.
There are at least two moments where optimization models are required:
day-ahead dispatch and real-time operation. Day-ahead dispatch corresponds
to the optimization model executed the day before the operation as an economic
dispatch model (see Section 1.2.1). This model must include the availability
of generation and consider a forecast of the primary resource (inflows, wind,
and solar radiation). Moreover, it gives the power that the VPP operator
commits to deliver on the day of the operation. During the operation, the VPP
must satisfy operating constraints and correct errors in the forecast of the
primary resource. Again, a real-time algorithm is necessary for energy storage
management.

1.2.8 State estimation and grid identiﬁcation

The problem of state estimation is classic in power systems. It is also a key com-
ponent in Supervisory Control And Data Acquisition (SCADA) systems. The
problem consists in determining the most probable state of the system from
redundant measurements and knowledge of the topology and electrical rela-
tions of the grid. When the variables to be measured are active and reactive
powers, a non-convex problem is obtained with the same degree of complexity
as the load flow. Modern technologies such as phasor measurement units
(PMUs) allow direct measurements of voltages and angles to be included.

1 We incorporate uncertainty in the models using robust optimization. Chapter 6 is dedicated to this aspect.

The problem can also be formulated in power distribution networks and
microgrids, both AC and DC. Figure 1.5 shows, for example, a microgrid with
a centralized control. Each active element of the network can have both
voltage and current measurements. We can use these measurements to find
the most likely state of the system based on the least-squares model shown
below:

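$$
\begin{aligned}
\min_{I,V}\quad & (I - J)^\top M (I - J) + (V - U)^\top N (V - U) \\
\text{s.t.}\quad & I = YV \\
& I_{\min} \le I \le I_{\max} \\
& V_{\min} \le V \le V_{\max}
\end{aligned} \tag{1.6}
$$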

where 𝐽, 𝑈 are measurements of current and voltage, respectively; 𝐼, 𝑉 are the
corresponding estimations; and 𝑀, 𝑁 are diagonal matrices that represent the
weight of each measurement. The state estimation problem is closely related to
the optimal power flow. In fact, some authors call it the inverse
power flow problem. The problem is studied in more detail in the second part
of the book (Chapter 12).
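As a rough illustration of Model (1.6), the sketch below estimates the state of a small DC grid with NumPy. It is written under simplifying assumptions that are not part of the original model: the bounds on I and V are ignored, so substituting I = YV leaves an unconstrained weighted least-squares problem whose normal equations are (Y^T M Y + N) V = Y^T M J + N U. The three-node admittance matrix, the weights, and the noise level are invented for the example.

```python
import numpy as np

# Hypothetical 3-node DC grid in per unit (illustrative values only)
Y = np.array([[ 2.1, -1.0, -1.0],
              [-1.0,  2.1, -1.0],
              [-1.0, -1.0,  2.1]])      # nodal admittance matrix
V_true = np.array([1.00, 0.98, 0.97])   # "real" voltages, unknown in practice
I_true = Y @ V_true

rng = np.random.default_rng(0)
U = V_true + 0.01 * rng.standard_normal(3)   # noisy voltage measurements
J = I_true + 0.01 * rng.standard_normal(3)   # noisy current measurements

M = np.eye(3)   # weights of the current measurements (assumed equal)
N = np.eye(3)   # weights of the voltage measurements (assumed equal)

def wls_cost(V):
    """Objective of (1.6) after substituting I = Y V."""
    I = Y @ V
    return (I - J) @ M @ (I - J) + (V - U) @ N @ (V - U)

# Normal equations of the unconstrained problem (bounds neglected)
A = Y.T @ M @ Y + N
b = Y.T @ M @ J + N @ U
V_est = np.linalg.solve(A, b)
I_est = Y @ V_est

print("estimated voltages:", V_est)
print("estimated currents:", I_est)
print("cost at estimate  :", wls_cost(V_est))
print("cost at raw data  :", wls_cost(U))
```

When the bounds are active, or when the grid is AC, a solver is needed; those cases are outside what this sketch covers.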
Another operation problem, closely linked to the state estimation, is the iden-
tification of the network. In this case, we have measurements of both voltages
and currents at different operating points. Our goal is to estimate the value
of the nodal admittance matrix from these measurements. In this case, the
optimization model is the following:

$$
\min_{Y}\; f(Y) = \frac{1}{2} \left\| I - YV \right\|_F^2 \tag{1.7}
$$

The decision variable is the nodal admittance matrix 𝑌, and the objective
function is the norm of error between measurements and estimations2 .
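Model (1.7) is an ordinary least-squares problem in the entries of Y, so a first estimate can be obtained without an optimization solver. The following sketch uses a made-up three-node matrix and synthetic measurements: the measurements taken at m operating points are stacked as the columns of V and I, and the system V^T Y^T = I^T is solved with `numpy.linalg.lstsq`. It does not use any structural knowledge about Y.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical admittance matrix to be recovered (illustrative values)
Y_true = np.array([[ 3.0, -1.0, -2.0],
                   [-1.0,  2.5, -1.5],
                   [-2.0, -1.5,  3.5]])

m = 20                                               # number of operating points
V = 1.0 + 0.1 * rng.standard_normal((3, m))          # one voltage vector per column
I = Y_true @ V + 1e-3 * rng.standard_normal((3, m))  # currents with small noise

# min_Y 0.5*||I - Y V||_F^2 is the least-squares solution of V^T Y^T = I^T
Y_T, *_ = np.linalg.lstsq(V.T, I.T, rcond=None)
Y_est = Y_T.T

print(np.round(Y_est, 2))
print("largest entry error:", np.abs(Y_est - Y_true).max())
```

The estimate is only as good as the variety of operating points: if the columns of V are nearly identical, the least-squares system becomes ill-conditioned.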
The model can include information about the structure of the matrix 𝑌. For
example, we already know that the matrix is symmetric and that some of its
entries are zero. In that case, the optimization model is the following:

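$$
\begin{aligned}
\min_{Y}\quad & f(Y) = \frac{1}{2} \left\| I - YV \right\|_F^2 \\
\text{s.t.}\quad & y_{ij} = y_{ji}, \quad \forall i, j \\
& y_{ij} = 0, \quad \text{if } i \text{ is not connected to } j
\end{aligned} \tag{1.8}
$$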

2 It is usual to consider the Frobenius norm as explained in Chapter 12.


Figure 1.5 Example of a microgrid with a centralized control/measurement in the aggregator.

Both AC and DC grids may handle this type of estimation. In this case, we
only presented the DC case because it is easier to develop. The entire model
must be implemented in an aggregator structure, as depicted in Figure 1.5.

1.3 Binary problems in power systems operation

Some of the problems previously described are binary. These problems appear
in electrical systems, both in the planning and the operation stages. In the
planning stage, the problems of transmission expansion planning and distri-
bution planning are typical. Heuristic techniques usually solve these prob-
lems together with mixed-integer programming approximations [3]. Although
mixed-integer problems are non-convex, they can be solved efficiently using
methodologies such as the branch-and-bound algorithm. We present a brief
introduction of this method in Chapter 4. Besides, several modules can be called
from Python to solve these types of problems. We used Gurobi and Mosek
for this task, although there are plenty of options available. Binary operation
problems include unit commitment and phase balancing in power distribution
networks.

1.3.1 Unit commitment

As aforementioned, the unit commitment problem consists in determining the
order of starting and stopping of the thermal units, taking into account the costs
of turning-ON and turning-OFF, as well as the starting ramps. The optimization
model is similar to the economic dispatch, but in this case, there are binary
variables $s_k$ related to the state ON or OFF of each thermal unit. The simplest
model is presented below:
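$$
\begin{aligned}
\min\quad & \sum_k f(p_k) + \sum_k h(s_k) \\
\text{s.t.}\quad & \sum_k p_k = p_D \\
& r(s_k) \le 0 \\
& p_{\min} s_k \le p_k \le p_{\max} s_k \\
& s_k \in \{0, 1\}
\end{aligned} \tag{1.9}
$$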

where 𝑓 represents operative costs, ℎ are the costs of turning ON and turning OFF,
𝑟 are starting ramps, and $p_{\min}$, $p_{\max}$ are the operation limits of each unit. Binary
problems are usually difficult; however, it is possible to find either convex
approximations or MILP equivalents, as presented in Chapter 5.
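To make the role of the binary variables in Model (1.9) concrete, the sketch below solves a tiny single-period instance by brute force. It is not the MILP approach of Chapter 5: it assumes linear variable costs, a fixed cost for each unit that is ON, no ramp constraints, and made-up data for three units. For each ON/OFF combination, the cheapest dispatch is found by loading the units in merit order.

```python
import itertools
import numpy as np

# Hypothetical data for three thermal units
c    = np.array([10.0, 20.0, 30.0])     # variable cost per unit of power
h    = np.array([500.0, 100.0, 50.0])   # fixed cost paid when the unit is ON
pmin = np.array([50.0, 20.0, 10.0])
pmax = np.array([200.0, 100.0, 50.0])
pD   = 120.0                            # demand

def dispatch(s):
    """Cheapest dispatch for a fixed ON/OFF vector s, or None if infeasible."""
    lo, hi = pmin * s, pmax * s
    if not (lo.sum() <= pD <= hi.sum()):
        return None
    p = lo.copy()                  # every ON unit starts at its minimum
    remaining = pD - p.sum()
    for k in np.argsort(c):        # fill the rest with the cheapest units first
        step = min(hi[k] - p[k], remaining)
        p[k] += step
        remaining -= step
    return p

best = None
for s in itertools.product([0, 1], repeat=3):   # 2^3 ON/OFF combinations
    s = np.array(s)
    p = dispatch(s)
    if p is None:
        continue
    cost = c @ p + h @ s
    if best is None or cost < best[0]:
        best = (cost, s, p)

print("cost:", best[0], "states:", best[1], "dispatch:", best[2])
```

Enumeration grows as 2^n in the number of units, which is why realistic instances are handled with mixed-integer solvers instead.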

1.3.2 Optimal placement of distributed generation and capacitors

We can use the same convex approximations of the optimal power flow problem
in problems such as the optimal placement of distributed generation in power
distribution networks. The problem consists of determining the placement and
capacity of distributed generation, subject to physical constraints, such as volt-
age regulation and transmission lines’ capacity. Besides, the problem includes
binary constraints related to the placement of the distributed generation. The
objective function is usually total power loss, although other objectives, such
as reliability and optimal costs, are also suitable.
Another binary problem closely related to the OPF is the optimal placement
of capacitors. In this problem, fixed or variable capacitors are placed along
the primary feeder to minimize power loss. The capacitors’ location and the
number of fixed capacitors in each node are binary variables considered in
the model. Chapter 11 examines the optimal placement of capacitors and the
optimal placement of distributed generation.


1.3.3 Primary feeder reconfiguration and topology identification

The topology of a grid is not constant, especially in power distribution
networks. There are several sectionalizing switches along the primary feeders
that permit transferring load among circuits. Feeder reconfiguration is an
operating model that determines the optimal topology to minimize power loss.
This model is non-linear, non-convex, and binary. In addition, it has a con-
straint related to the connectivity and radiality of the graph that is tricky to
represent in equations. All these aspects are studied in Chapter 11.
Just as state estimation is the inverse of the optimal power flow problem, there
is a problem that can be considered the inverse of the primary feeder
reconfiguration: topology identification in power distribution grids, which
takes measurements of currents and voltages in different parts of the grid and
determines the state of each sectionalizing switch. This problem is studied in
Chapter 12.

1.3.4 Phase balancing

Phase balancing is a combinatorial problem that consists of phase swapping of
the loads and generators to reduce power loss. Despite being a classic problem,
it is still relevant since the unbalance is a common phenomenon in micro-
grids, especially when single-phase photovoltaic units are included. Because
it is a combinatorial problem, phase balancing requires heuristic algorithms
with high computational effort. However, it is possible to generate simplified
instances of the problem, as presented in Chapter 13.
Each three-phase node has six possible configurations, as depicted in
Figure 1.6. The problem consists of defining the phase in which each load or
generator is connected in order to reduce the grid's total losses. Therefore, the
problem is combinatorial since there are $6^n$ possible configurations, where $n$
is the number of three-phase nodes. The problem is nontrivial even in small
networks; for example, a microgrid with 10 nodes will have $6^{10} \approx 6 \times 10^7$ possible
configurations.

Figure 1.6 Set of possible configurations in a three-phase node.
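The size of the search space can be checked in a couple of lines. The snippet below is only a counting exercise: it lists the orderings of the phase labels (the labels a, b, c are an assumption of this sketch) and raises the count to the number of nodes.

```python
import itertools

phases = ("a", "b", "c")
# Each ordering of the three phases is one way to connect a three-phase node
configs = list(itertools.permutations(phases))
print(len(configs))            # 6

n = 10                         # number of three-phase nodes
print(len(configs) ** n)       # 60466176, roughly 6e7
```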
Phase-balancing problems appear in many applications such as aircraft elec-
tric systems [4], and in power distribution grids with high penetration of
electric vehicles [5]. Due to its combinatorial nature, the problem necessitates
the use of heuristics [6], and meta-heuristics [7] as well as expert systems
[8]. Modern approaches include the uncertainty associated with the load and
generator [9].

1.4 Real-time implementation

A receding horizon control can implement most of the optimization algorithms
presented in this book. Figure 1.7 depicts the main architecture of this
simple but powerful strategy for the real-time implementation of operation models.
These optimization models may be the optimal power flow, economic dispatch,
energy storage management, or a mixed model that includes multiple models.
An unbiased forecast predicts variables such as wind speed, solar radiation, and
power demand. Moreover, a state estimator gives accurate measurements of the
system variables.
Equation (1.10) represents the optimization model, where 𝑥 is the vector
of decision variables for each time 𝑡, and 𝛼 are the parameters predicted by the
forecast module. Of course, this forecast may change since renewable resources
and loads may be highly variable in modern power systems. Therefore, the
optimization model must be continuously executed and the solution updated.

$$
\min_{x \in \Omega} f(x, \alpha) \tag{1.10}
$$
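The loop below sketches the receding horizon idea on a deliberately trivial problem. Everything in it is a hypothetical stand-in chosen for illustration: the `forecast` function plays the role of the forecast module, the objective is a quadratic tracking term, and the feasible set Ω is the box [0, 1], so the optimization step reduces to a projection. Only the structure of the loop (forecast, optimize, apply the first decision, repeat) is the point.

```python
import numpy as np

rng = np.random.default_rng(0)
horizon, steps = 4, 6
x_min, x_max = 0.0, 1.0

def forecast(t):
    """Hypothetical forecast module: predicted parameter alpha over the horizon."""
    k = np.arange(t, t + horizon)
    return 0.6 + 0.5 * np.sin(0.8 * k) + 0.05 * rng.standard_normal(horizon)

def solve(alpha):
    """Toy model: min ||x - alpha||^2 subject to x_min <= x <= x_max.
    The problem is separable, so the solution is the projection onto the box."""
    return np.clip(alpha, x_min, x_max)

applied = []
for t in range(steps):
    alpha = forecast(t)     # the forecast is refreshed at every step
    x = solve(alpha)        # optimize over the whole horizon
    applied.append(x[0])    # apply only the first decision, then move on

print(np.round(applied, 3))
```

In a real application `solve` would be an economic dispatch, OPF, or storage model, and the state estimator would supply the initial conditions at each step.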

Figure 1.7 A possible architecture for implementing an optimization model for power systems operation. (Blocks: optimization model, forecast module, state estimation, power system.)


In many cases, the forecast has an error that introduces uncertainty in the
model. Either stochastic or robust optimization is a suitable option to face this
uncertainty. Chapter 6 presents the latter option.

1.5 Using Python

Python is a general-purpose programming language that is gaining attention in
power systems optimization. Although it is neither mathematical software nor an
algebraic modeling language, it has many free tools for solving optimization
problems. Moreover, there are many other tools for data manipulation,
plotting, and integration with other software. Hence, it is a convenient platform for
solving practical problems and integrating different resources3 .
We use a module named CvxPy [10] that allows solving convex and
mixed-integer convex optimization problems; this module, together with NumPy,
MatplotLib, and Pandas, forms a robust platform for solving a wide range of
optimization problems in power systems applications. Let us consider, for instance,
the following optimization problem:

$$
\begin{aligned}
\min\quad & c^\top x \\
\text{s.t.}\quad & \sum_i x_i = 1 \\
& x_i \ge 0
\end{aligned} \tag{1.11}
$$

where $x \in \mathbb{R}^6$ and $c = (5, 3, 2, 4, 8, 7)^\top$. A model in CvxPy for this problem
looks as follows:
import numpy as np
import cvxpy as cvx
c = np.array([5,3,2,4,8,7])
x = cvx.Variable(6)
objective = cvx.Minimize(c.T @ x)          # linear cost, matrix product with @
constraints = [cvx.sum(x) == 1, x >= 0]    # entries add up to one and are non-negative
Model = cvx.Problem(objective,constraints)
Model.solve()

Without much effort, we can identify variables, the objective function, and con-
straints. This neat code feature is an essential aspect of Python and CvxPy.
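For this particular problem, the answer can be cross-checked by hand. The feasible set is the unit simplex, so the objective is a weighted average of the entries of c and cannot be smaller than the smallest entry. The NumPy sketch below builds that solution directly; it is only a sanity check for the model above, not a replacement for the solver.

```python
import numpy as np

c = np.array([5, 3, 2, 4, 8, 7])

# On the simplex {x >= 0, sum(x) = 1}, c^T x is a weighted average of the
# entries of c, so its minimum is min(c), attained by putting all the weight
# on the cheapest entry.
x_opt = np.zeros(c.size)
x_opt[np.argmin(c)] = 1.0

print("x* =", x_opt)         # all the weight on the third entry
print("f* =", c @ x_opt)     # 2.0

# Any other feasible point gives a cost that is at least as large
rng = np.random.default_rng(0)
x_rand = rng.random(c.size)
x_rand /= x_rand.sum()
print("random feasible point:", c @ x_rand)
```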
After the problem is solved, we can make additional analyses using the same
platform. This combination of tools is, of course, a great advantage; however,
we must avoid any fanaticism for software. There are many programming
languages and many modules for mathematical optimization. What is learned
in this book may be translated to any other language. The problem is the same,
although its implementation may change from one platform to another.

3 See Appendix C for a brief introduction to Python.
We made a great effort in making the examples as simple as possible (we call
them toy-models). This approach allows us to understand each problem indi-
vidually and do numerical experiments. Real operation models may include
different aspects of these toy-models; for instance, they may combine economic
dispatch, optimal power flow, and state estimation. These models are highly
involved with thousands of variables and constraints. Nevertheless, they can
be solved using the same paradigm presented in this book.


Part I

Mathematical programming


A brief introduction to mathematical optimization

Learning outcomes

By the end of this chapter, the student will be able to:

● Establish first-order conditions for locally optimal solutions.
● Solve unconstrained optimization problems, using the gradient
method, implemented in Python.
● Solve equality-constrained optimization problems, using Newton’s
method implemented in Python.

2.1 About sets and functions

Sets and functions are familiar concepts in mathematics. A set is a well-defined
collection of distinct objects, considered an object in its own right. A function
is a map that takes objects from one set (i.e., input or domain) and returns an
object in another set (i.e., output or image). An optimization problem consists
of finding the best object in the output set and its corresponding input, as shown
schematically in Figure 2.1. The input set Ω is called the set of feasible solutions,
and the best object corresponds to a minimum or a maximum, according to an
objective function $f : \Omega \subseteq \mathbb{R}^n \to \mathbb{R}$.
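Solving an optimization problem implies finding not only the value of the
objective function (e.g., $f_{\min}$ or $f_{\max}$) but also the corresponding value of $x$ in
the input set Ω (e.g., $x_{\min}$, $x_{\max}$). These values are represented as follows:

$$
\min_{x \in \Omega} f(x) = f_{\min}, \qquad \max_{x \in \Omega} f(x) = f_{\max} \tag{2.1}
$$

$$
\operatorname*{argmin}_{x \in \Omega} f(x) = x_{\min}, \qquad \operatorname*{argmax}_{x \in \Omega} f(x) = x_{\max} \tag{2.2}
$$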

Mathematical Programming for Power Systems Operation: From Theory to Applications in Python. First Edition. Alejandro Garcés.
© 2022 by The Institute of Electrical and Electronics Engineers, Inc. Published 2022 by John Wiley & Sons, Inc.

Figure 2.1 Representation of the sets related to a general optimization problem. (Left: Ω, the input set, domain, or set of feasible solutions; right: the output set or image, with its min and max marked.)

Notice that $f_{\min}$ and $f_{\max}$ are numbers whereas $x_{\min}$ and $x_{\max}$ are vectors. The
following example shows the difference between min and argmin (the same
applies for max and argmax).
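Example 2.1. Let us consider four simple optimization problems and their
respective solutions, namely1 :

$$
\min\, (x - 5)^2 = 0, \qquad \operatorname{argmin}\, (x - 5)^2 = 5 \tag{2.3}
$$

$$
\min_{x \ge 10} (x - 5)^2 = 25, \qquad \operatorname*{argmin}_{x \ge 10} (x - 5)^2 = 10 \tag{2.4}
$$

$$
\min\, (x - 5)^2 + (y - 8)^2 = 0, \qquad \operatorname{argmin}\, (x - 5)^2 + (y - 8)^2 = [5, 8] \tag{2.5}
$$

$$
\min \cos(x) = -1, \qquad \operatorname{argmin} \cos(x) = \pi \text{ or } 3\pi \text{ or } 5\pi \text{ or } \ldots \tag{2.6}
$$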
As aforementioned, the operator min returns a number while the operator
argmin can return a vector. Two or more points could produce the same
minimum. In that case, the argmin is not unique.
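The first two cases of Example 2.1 can be reproduced numerically by evaluating the function on a grid. This is a brute-force check, not an optimization method, and the grid limits and spacing are arbitrary choices of this sketch. NumPy returns the value with `min` and the position of that value with `argmin`.

```python
import numpy as np

x = np.linspace(0.0, 20.0, 161)      # grid with step 0.125
f = (x - 5.0) ** 2

# Unconstrained case, Equation (2.3)
print(f.min(), x[f.argmin()])                      # 0.0 5.0

# Constrained case, Equation (2.4): keep only the points with x >= 10
mask = x >= 10.0
print(f[mask].min(), x[mask][f[mask].argmin()])    # 25.0 10.0
```

Note that `argmin` returns an index into the grid, so the minimizer is recovered by indexing `x` with it.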
But what exactly does "the best solution" mean? And what characteristics
should the sets and the functions involved in the problem have? Some
mathematical sophistication is required to answer these questions. Finding the
best solution in a set implies comparing one element with the rest of the set
elements. A comparison is a relation of the form 𝑥 ≤ 𝑦 or 𝑥 ≥ 𝑦. However, not
all sets allow these types of comparisons; those that do are called ordered
sets. For instance, the real numbers and the integer numbers are both ordered
sets. However, complex numbers are non-ordered because such a comparison is
not possible (which number is greater: $z_A = 1 + j$ or $z_B = 1 - j$?).
The objective function establishes a criterion of comparison. Therefore, its
output must be an ordered set. Nevertheless, the input set may be ordered
or non-ordered; it depends on the problem's representation. For example, the
optimal power flow in power distribution networks targets an ordered set since
the active power losses belong to the real numbers; however, the input may be
represented as a vector $(v, \theta) \in \mathbb{R}^{2 \times n}$ or as a set of phasors ($v e^{j\theta} \in \mathbb{C}^n$). The
former is an ordered set, whereas the latter is non-ordered. Other possible
representations can be the set of positive definite matrices or positive polynomials,
as presented in Chapter 10. In conclusion, the objective function must point to
an ordered set, but the input set (i.e., the set of feasible solutions) can be any
arbitrary set.

1 At this point, the only tool we have to check these results is plotting the function and locating the optimum.

We usually compare values in the output set since our objective is to
minimize or maximize the objective function. It is also possible to compare values
in Ω when it is an ordered set. However, a comparison between elements of the
input set may be different in the output set. A function 𝑓 is monotone (or
monotonic) increasing if 𝑥 ≤ 𝑦 implies that 𝑓(𝑥) ≤ 𝑓(𝑦), that is to say, the function
preserves the inequality. Similarly, a function is monotone decreasing if 𝑥 ≤ 𝑦
implies that 𝑓(𝑥) ≥ 𝑓(𝑦), that is to say, the function reverses the inequality.
Example 2.2. The function $f(x) = x^2$ is not monotone; for example, −3 ≤ 1
but $f(-3) \nleq f(1)$. Nevertheless, the function is monotone increasing in $\mathbb{R}_+$. In
this set, 4 ≤ 8 implies that 𝑓(4) ≤ 𝑓(8) since both 4 and 8 belong to $\mathbb{R}_+$.
An ordered set Ω ⊆ ℝ admits the following definitions:

● Supreme (or supremum): the supreme of a set, denoted by sup(Ω), is the minimum value
greater than or equal to all the elements of Ω.
● Infimum: the infimum of a set, denoted by inf(Ω), is the maximum value
lower than or equal to all the elements of Ω.

The supreme and the infimum are closely related to the maximum and the
minimum of a set. They are equal in most practical applications. The main
difference is that the infimum and the supreme can be outside the set. For
example, the supreme of the set Ω = {𝑥 ∶ 3 ≤ 𝑥 < 5} is 5 whereas its maximum
does not exist. It may seem like a simple difference, but several theoretical
analyses require this distinction.
Some properties of the supreme and the infimum are presented below:

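$$
\sup_x f(x) = -\inf_x \left( -f(x) \right) \tag{2.7}
$$

$$
\sup_x \alpha \cdot f(x) = \alpha \cdot \sup_x f(x), \quad \alpha \ge 0 \tag{2.8}
$$

$$
\sup_x \{ f(x) + g(x) \} \le \sup_x f(x) + \sup_x g(x) \tag{2.9}
$$

$$
\sup_x \{ f(x) + \alpha \} = \alpha + \sup_x f(x) \tag{2.10}
$$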

Moreover, the last case implies that:

𝑥̃ = argmin {𝑓(𝑥) + 𝛼} = argmin {𝑓(𝑥)} (2.11)


That is to say, the value of 𝑥 that minimizes the function 𝑓(𝑥) + 𝛼 is the same
value that minimizes 𝑓(𝑥); for this reason, it is typical to neglect the constant
𝛼 in practical problems.
Example 2.3. Table 2.1 shows some examples of maximum, minimum,
supreme, and infimum.

Table 2.1 Bounds of some ordered sets.

Set                           sup   max   inf   min
Ω1 = {1, 2, 3, 4}              4     4     1     1
Ω2 = {𝑥 ∈ ℝ ∶ 1 ≤ 𝑥 ≤ 2}       2     2     1     1
Ω3 = {𝑥 ∈ ℝ ∶ 3 < 𝑥 ≤ 8}       8     8     3     -
Ω4 = {𝑥 ∈ ℝ ∶ 2 ≤ 𝑥 < 9}       9     -     2     2
Ω5 = {𝑥 ∈ ℝ ∶ 4 < 𝑥 < 7}       7     -     4     -

2.2 Norms
In many practical problems, we may be interested in measuring the objects
in a set, either as an objective function or as a way of analyzing solutions. A
norm is a geometric concept that allows us to make this measurement. The
most common norm is the Euclidean distance given by Equation (2.12)
$$
\|x\| = \sqrt{x_1^2 + x_2^2 + \dots + x_n^2} \tag{2.12}
$$
However, this function is not the only way to measure a distance. In general,
we can define a norm as a function ‖⋅‖ ∶ Ω → ℝ that fulfills the following
conditions:
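$$
\|x\| > 0 \quad \forall x \in \Omega - \{\vec{0}\} \tag{2.13}
$$

$$
\|x\| = 0 \rightarrow x = \vec{0} \tag{2.14}
$$

$$
\|\alpha x\| = |\alpha| \, \|x\| \tag{2.15}
$$

$$
\|x + y\| \le \|x\| + \|y\| \tag{2.16}
$$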
The first two conditions indicate that a norm must return a positive value,
except when the input is the vector 0⃗. The third condition indicates that it is
scalable; for example, the norm must be twice the original vector’s norm if we
multiply all the vector entries by 2. The last condition, known as the triangle
inequality, is a generalization of the triangles’ property (therein lies its name).

The sum of any two sides' lengths is greater than or equal to the remaining side's
length. This property is intuitive for the Euclidean norm, but surprisingly it
also holds for many other functions, such as Equation (2.17):

$$
\|x\|_p = \left( \sum_i |x_i|^p \right)^{1/p} \tag{2.17}
$$

This function is known as p-norm, where 𝑝 ≥ 1. Three of the most common
examples of p-norms in $\mathbb{R}^n$ have a well-defined representation, as presented
below:

$$
\|x\|_1 = \sum_i |x_i| \tag{2.18}
$$

$$
\|x\|_2 = \sqrt{\sum_i x_i^2} \tag{2.19}
$$

$$
\|x\|_\infty = \sup_i |x_i| \tag{2.20}
$$

The Euclidean distance is equivalent to a 2-norm (see Figure 2.2a), whereas the
1-norm, also known as Manhattan distance, consists in measuring the distance
along axes at right angles (see Figure 2.2b), and the infinity-norm or uniform
norm takes the maximum distance along axes, as shown in Figure 2.2c. In general,
$\|x\|_\infty \le \|x\|_2 \le \|x\|_1$. All of these norms are suitable ways to measure vectors in
the space.
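The three norms can be evaluated with `numpy.linalg.norm`, whose second argument selects the norm. The sketch below measures the vector v = (3, 2) used in Figure 2.2 and also evaluates the general p-norm of Equation (2.17) through a small helper, `p_norm`, that is defined here only for illustration.

```python
import numpy as np

v = np.array([3.0, 2.0])

n1   = np.linalg.norm(v, 1)        # Manhattan distance: 5.0
n2   = np.linalg.norm(v, 2)        # Euclidean norm: sqrt(13), about 3.606
ninf = np.linalg.norm(v, np.inf)   # uniform norm: 3.0
print(n1, n2, ninf)

def p_norm(x, p):
    """General p-norm of Equation (2.17), valid for p >= 1."""
    return np.sum(np.abs(x) ** p) ** (1.0 / p)

for p in (1, 2, 4, 16):
    print(p, p_norm(v, p))         # decreases toward the infinity-norm as p grows
```

For this vector the values satisfy ‖v‖∞ ≤ ‖v‖2 ≤ ‖v‖1, and the p-norm approaches the infinity-norm as p increases.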
We can use a norm to define a set given by all the points at a distance less than or
equal to a given value 𝑟, as given in Equation (2.21).

$$
\mathcal{B} = \{ x \in \mathbb{R}^n : \|x\| \le r \} \tag{2.21}
$$

This set is known as a ball of radius 𝑟. Figure 2.3 shows the shape of unit balls
(i.e., balls of radius 1), generated by each of the three previously mentioned
norms.

Figure 2.2 Three ways to measure the vector 𝑣 = (3, 2) in $\mathbb{R}^2$: a) 2-norm or Euclidean
norm, b) 1-norm or Manhattan distance, c) infinity-norm or uniform norm.


Figure 2.3 Comparison among unit balls defined by norm-2, norm-1, and norm-∞.

Notice that a ball is not necessarily round, at least with this definition. All
balls share a common geometric property known as convexity that is studied in
Chapter 3.

2.3 Global and local optimum

Let us consider a mathematical optimization problem represented as
Equation (2.22).

$$
\min_{x \in \Omega} f(x, \beta) \tag{2.22}
$$

where $f : \mathbb{R}^n \to \mathbb{R}$ is the objective function, 𝑥 are decision variables, Ω is the
feasible set, and 𝛽 are constant parameters of the problem.
A point $\tilde{x}$ is a local optimum of the problem if there exists an open set $\mathcal{N}(\tilde{x})$,
named neighborhood, that contains $\tilde{x}$ such that $f(x) \ge f(\tilde{x})$, $\forall x \in \mathcal{N}(\tilde{x})$.
If 𝒩 = Ω, then the optimum is global. Figure 2.4 shows the concept for two
functions in ℝ with their respective neighborhoods 𝒩.
There are two local minima in the first case, whereas there is a unique global
minimum in the second case. This concept is more than a fancy theoretical
notion; what good is a local optimum if there are even better solutions in
another region of the feasible set? In practice, we require global or close-to-
global optimum solutions.
On the other hand, several points may be optimal, as shown in Figure 2.5.
In that case, all the points in the interval 𝑥1 ≤ 𝑥 ≤ 𝑥2 are global optima.
Thus, the question is not only if the optimal point is global but also
if it is unique. Both globality and uniqueness are geometrical questions
with practical implications, especially in competitive markets.
Convex optimization answers these questions naturally, as explained in
Chapter 3.


Figure 2.4 Example of local and global optima: a) function with two local minima
and their respective neighborhoods, b) function with a unique global minimum (the
neighborhood is the entire domain of the function).

Figure 2.5 Example of a function with several optimal points.

2.4 Maximum and minimum values of continuous functions
It is well-known, from basic mathematics, that the optimum of a continuous
differentiable function is attained at a point where its derivative is zero. This fact can be
formalized in view of the concepts presented in previous sections. Consider a
function 𝑓 ∶ ℝ → ℝ with a local minimum at $\tilde{x}$. A neighborhood is defined as
$\mathcal{N} = \{x \in \mathbb{R} : x = \tilde{x} \pm t, |t| < t_0\}$, with the following condition:

$$
f(\tilde{x} \pm t) \ge f(\tilde{x}) \tag{2.23}
$$

where 𝑡 can be positive or negative. If 𝑡 > 0, then $f(\tilde{x})$ can be subtracted from
both sides of Equation (2.23) and the result divided by 𝑡 without modifying the
direction of the inequality, to then take the limit when $t \to 0^+$ as presented below:

$$
\lim_{t \to 0^+} \frac{f(\tilde{x} + t) - f(\tilde{x})}{t} \ge 0 \tag{2.24}
$$
The same calculation can be made if 𝑡 < 0, just in that case, the direction of the
inequality changes as follows:
$$
\lim_{t \to 0^-} \frac{f(\tilde{x} + t) - f(\tilde{x})}{t} \le 0 \tag{2.25}
$$
Notice that this limit is the definition of derivative; hence, $f'(\tilde{x}) \ge 0$ and
$f'(\tilde{x}) \le 0$. These two conditions hold simultaneously only when $f'(\tilde{x}) = 0$.
Consequently, a local optimum of a differentiable function is a point where the
derivative vanishes. This condition is local in the neighborhood 𝒩.
This idea can be easily extended to multivariable functions as follows:
consider a function $f : \mathbb{R}^n \to \mathbb{R}$ of type $\mathcal{C}^1$ (continuously differentiable) and
a neighborhood given by $\mathcal{N} = \{x \in \mathbb{R}^n : x = \tilde{x} + \Delta x\}$ with $\Delta x \in \mathcal{B}_r$. Now,
define a function $g(t) = f(\tilde{x} + t\Delta x)$. If $\tilde{x}$ is a local minimum of 𝑓, then

$$
f(\tilde{x} + t\Delta x) \ge f(\tilde{x}) \tag{2.26}
$$

In terms of the new function 𝑔, Equation (2.26) leads to the following
condition:

𝑔(𝑡) ≥ 𝑔(0) (2.27)

This condition implies that 0 is a local optimum of 𝑔; moreover,

$$
g'(0) = \lim_{t \to 0^+} \frac{f(\tilde{x} + t\Delta x) - f(\tilde{x})}{t} = \left( \nabla f(\tilde{x}) \right)^\top \Delta x \tag{2.28}
$$

Notice that 𝑔 is a function of one variable, so the optimality condition $g'(0) = 0$
must be met regardless of the direction of $\Delta x$. Therefore, the optimum of a multivariate
function is attained when the gradient is zero ($\nabla f(\tilde{x}) = 0$). This condition permits
finding local optimal points, as presented in the next section. Two questions
are still open: under what conditions is the optimum global? And when is the
solution unique? We will answer these relevant questions in the next chapter.
For now, let us see how to find the optimum using the gradient.
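Equation (2.28) can be checked numerically with a forward difference. The sketch below uses a simple quadratic function chosen only for illustration, with a known minimizer at (1, −3), and compares the finite-difference estimate of g'(0) with the product of the gradient and the direction Δx. At the minimizer both are (numerically) zero for every direction; at any other point they agree but are generally nonzero.

```python
import numpy as np

def f(x):
    # Illustrative function with minimizer at (1, -3)
    return (x[0] - 1.0) ** 2 + 2.0 * (x[1] + 3.0) ** 2

def grad_f(x):
    return np.array([2.0 * (x[0] - 1.0), 4.0 * (x[1] + 3.0)])

def directional_derivative(x, dx, t=1e-6):
    """Forward-difference estimate of g'(0), with g(t) = f(x + t*dx)."""
    return (f(x + t * dx) - f(x)) / t

x_opt = np.array([1.0, -3.0])      # minimizer of this particular f
x_other = np.array([2.0, 0.0])     # an arbitrary non-optimal point
directions = [np.array([1.0, 0.0]), np.array([0.0, 1.0]), np.array([-0.6, 0.8])]

for dx in directions:
    print("at optimum:", directional_derivative(x_opt, dx), grad_f(x_opt) @ dx,
          "| elsewhere:", directional_derivative(x_other, dx), grad_f(x_other) @ dx)
```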

2.5 The gradient method

The gradient method is, perhaps, the simplest and best-known algorithm
for solving optimization problems. Cauchy invented the basic method in the
19th century, but the advent of computers led to different applications that
encompass power systems operation and machine learning. Let us consider the
following unconstrained optimization problem:

min 𝑓(𝑥) (2.29)

where the objective function $f : \mathbb{R}^n \to \mathbb{R}$ is differentiable. The gradient ∇𝑓(𝑥)
represents the direction of greatest increase of 𝑓. Thus, minimizing 𝑓 implies
moving in the direction opposite to the gradient. Therefore, we use the following
iteration:

𝑥 ← 𝑥 − 𝑡∇𝑓(𝑥) (2.30)

The gradient method consists in applying this iteration, where 𝑡 > 0 is the step
size, until the gradient is small enough, i.e., until ‖∇𝑓(𝑥)‖ ≤ 𝜖. It is easier to
understand the algorithm by considering concrete problems and their
implementation in Python, as given in the next examples.
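Before the concrete examples, the iteration (2.30) and its stopping rule can be written once as a small generic routine. This is a minimal sketch with a constant step: the function name `gradient_method`, its default values, and the quadratic test function are assumptions made here for illustration.

```python
import numpy as np

def gradient_method(grad, x0, t=0.1, eps=1e-6, max_iter=10_000):
    """Apply x <- x - t*grad(x) until ||grad(x)|| <= eps or max_iter is reached.
    Returns the last point and the number of iterations performed."""
    x = np.asarray(x0, dtype=float)
    for k in range(max_iter):
        g = grad(x)
        if np.linalg.norm(g) <= eps:    # stopping criterion
            return x, k
        x = x - t * g                   # step against the gradient
    return x, max_iter

# Illustrative quadratic: f(x) = (x0 - 1)^2 + 2*(x1 + 3)^2, minimizer (1, -3)
def grad(x):
    return np.array([2.0 * (x[0] - 1.0), 4.0 * (x[1] + 3.0)])

x_star, iters = gradient_method(grad, x0=[10.0, 10.0])
print(x_star, iters)
```

The `max_iter` guard matters in practice because a poorly chosen step may never satisfy the stopping criterion.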
Example 2.4. Consider the following optimization problem:

$$
\min f(x, y) = 10x^2 + 15y^2 + \exp(x + y) \tag{2.31}
$$

The gradient of this function is presented below:

$$
\nabla f(x, y) = \begin{pmatrix} 20x + \exp(x + y) \\ 30y + \exp(x + y) \end{pmatrix} \tag{2.32}
$$

We need to find a value (𝑥, 𝑦) such that this gradient is zero. Therefore, we
use the gradient method. The algorithm starts from an initial point (for example,
𝑥 = 10, 𝑦 = 10) and calculates new points as follows:

$$
x \leftarrow x - t \frac{\partial f}{\partial x} \tag{2.33}
$$

$$
y \leftarrow y - t \frac{\partial f}{\partial y} \tag{2.34}
$$
This step can be implemented in a script in Python, as presented below:

import numpy as np
x = 10
y = 10
t = 0.03
for k in range(50):
    dx = 20*x + np.exp(x+y)   # partial derivative with respect to x
    dy = 30*y + np.exp(x+y)   # partial derivative with respect to y
    x += -t*dx
    y += -t*dy
    print('grad:',np.abs([dx,dy]))
print('argmin:',x,y)


In the first line, we import the module NumPy with the alias np. This
module contains mathematical functions such as sin, cos, exp, and log, among others.
The gradient has two components, dx and dy, which are evaluated in
each iteration, multiplied by the step t, and subtracted from the previous point
(x,y). We repeat the process 50 times and print the value of the gradient in each
iteration. Notice that all the indented statements belong to the for-statement,
and hence the gradient is printed in each iteration. In contrast, the argmin is
printed only at the end of the process.

Example 2.5. Python allows calculating the gradient automatically using the
module AutoGrad. It is quite intuitive to use. Consider the following script,
which solves the same problem presented in the previous example:

import autograd.numpy as np
from autograd import grad  # gradient calculation

def f(x):
    z = 10.0*x[0]**2 + 15*x[1]**2 + np.exp(x[0]+x[1])
    return z

g = grad(f)  # create a function g that returns the gradient

x = np.array([10.0,10.0])
t = 0.03
for k in range(50):
    dx = g(x)
    x = x - t*dx
print('argmin:',x)

In this case, we defined a function 𝑓 and its gradient 𝑔, where (𝑥, 𝑦)
was replaced by a vector $(x_0, x_1)$. The module NumPy was loaded using
autograd.numpy to obtain a gradient function automatically. The code
executes the same 50 iterations, obtaining the same result. The reader
should execute and compare the two codes in terms of computation time and
results.

Example 2.6. Consider a small photovoltaic system formed by three solar
panels 𝐴, 𝐵, and 𝐶, placed as depicted in Figure 2.6. Each solar panel has a power
electronic converter that must be connected to a common point 𝐸 before the
power is transmitted to the final user in 𝐷. The converters and the user's location are
fixed, but the common point 𝐸 can be moved at will. The coordinates of the
solar panels and the final user are 𝐴 = (0, 40), 𝐵 = (20, 70), 𝐶 = (30, 0), and
𝐷 = (100, 50), respectively.
Figure 2.6 A small photovoltaic system with three solar panels.

The cost of the cables is different since each cable carries a different current.
Our objective is to find the best position of 𝐸 in order to minimize the total cost
of the cable. Therefore, the following unconstrained optimization problem is
formulated:

$$
\min f = \text{cost}_{AE} \left\| AE \right\| + \text{cost}_{BE} \left\| BE \right\| + \text{cost}_{CE} \left\| CE \right\| + \text{cost}_{DE} \left\| DE \right\| \tag{2.35}
$$

where $\text{cost}_{ij}$ is the unitary cost of the cable that connects the points $i$ and $j$, and
$\left\| ij \right\|$ is the corresponding length.
The costs of the cables are $\text{cost}_{AE} = 12$, $\text{cost}_{BE} = 13$, $\text{cost}_{CE} = 11$, and
$\text{cost}_{DE} = 18$. The distance between any two points $U = (u_0, u_1)$ and $V = (v_0, v_1)$
is given by the following expression:

$$
\text{dist}(U, V) = \sqrt{(u_0 - v_0)^2 + (u_1 - v_1)^2} \tag{2.36}
$$

This equation is required several times; thus, it is useful to define a function, as
presented below:

import numpy as np
A = (0,40)
B = (20,70)
C = (30,0)
D = (100,50)
def dist(U,V):
    return np.sqrt((U[0]-V[0])**2 + (U[1]-V[1])**2)

P = [31,45]
f = 12*dist(P,A) + 13*dist(P,B) + 11*dist(P,C) + 18*dist(P,D)
print(f)


The function is evaluated at a point 𝑃 = (31, 45) to show its usage2 . The value
of the objective function is easily calculated as a function of dist(𝑈, 𝑉). Likewise,
the gradient of 𝑓 is defined as a function of the gradient of dist(𝑈, 𝑉) with respect
to 𝑈, with 𝑉 fixed, as presented below:

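$$
\nabla \text{dist}(U, V) = \frac{1}{\text{dist}(U, V)} \begin{pmatrix} u_0 - v_0 \\ u_1 - v_1 \end{pmatrix} \tag{2.37}
$$

then,

$$
\nabla f(x) = 12 \nabla \text{dist}(x, A) + 13 \nabla \text{dist}(x, B) + 11 \nabla \text{dist}(x, C) + 18 \nabla \text{dist}(x, D) \tag{2.38}
$$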

These functions are easily defined in Python as follows:

def g_d(U,V):
    "gradient of the distance"
    # an explicit NumPy array, so the result can be scaled and added safely
    return np.array([U[0]-V[0],U[1]-V[1]])/dist(U,V)

def grad_f(E):
    "gradient of the objective function"
    return 12*g_d(E,A)+13*g_d(E,B)+11*g_d(E,C)+18*g_d(E,D)
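A quick way to gain confidence in a hand-derived gradient such as Equation (2.37) is to compare it with a central finite difference. The check below restates dist and g_d so that it runs on its own; the helper `numerical_gradient` and the step h are choices of this sketch, not part of the example.

```python
import numpy as np

def dist(U, V):
    return np.sqrt((U[0] - V[0]) ** 2 + (U[1] - V[1]) ** 2)

def g_d(U, V):
    "gradient of the distance with respect to U, Equation (2.37)"
    return np.array([U[0] - V[0], U[1] - V[1]]) / dist(U, V)

def numerical_gradient(fun, x, h=1e-6):
    """Central-difference approximation of the gradient of fun at x."""
    x = np.asarray(x, dtype=float)
    g = np.zeros_like(x)
    for i in range(x.size):
        e = np.zeros_like(x)
        e[i] = h
        g[i] = (fun(x + e) - fun(x - e)) / (2.0 * h)
    return g

A = (0, 40)
E = np.array([10.0, 10.0])
analytic = g_d(E, A)
numeric = numerical_gradient(lambda x: dist(x, A), E)
print(analytic)
print(numeric)
```

The gradient of a distance is a unit vector pointing away from the fixed point, which gives a second, independent check.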

Now the gradient method consists in applying the iteration given by
Equation (2.30), as presented below:

t = 0.5
E = np.array([10.0,10.0])    # initial point as a float array
for k in range(50):
    E = E - t*grad_f(E)      # gradient step, Equation (2.30)
f = 12*dist(E,A) + 13*dist(E,B) + 11*dist(E,C) + 18*dist(E,D)
print("Position:",E)
print("Gradient",grad_f(E))
print("Cost",f)

In this case, 𝑡 = 0.5 and an initial point 𝐸 = (10, 10) with 50 iterations were
enough to find the solution. The reader is invited to try with other values and
analyze the effect on the algorithm's convergence.
The step 𝑡 is very important for the convergence of the gradient method. It can
be constant or variable, according to a well-defined update rule. There are many
variants of this algorithm, most of them with sophisticated ways to calculate
this step3 . A plot of ‖∇𝑓‖ versus the number of iterations may be useful for
choosing a suitable value of 𝑡 and showing the convergence rate of the
algorithm, as presented in the next example. We expect a linear convergence
for the gradient method, although the algorithm can lead to oscillations and
even divergence if the parameter 𝑡 is not selected carefully. Fortunately, there
are modules in Python that do this work automatically.

2 Notice 𝑃 is defined in a line outside the function definition. Recall that $x^2$ is represented as x**2 in Python (see Appendix C).
3 A complete discussion about the calculation of 𝑡 is beyond this book's objectives. Interested readers can consult the work of Nesterov and Nemirovskii, in [11] and [12].

Example 2.7. The convergence of the algorithm can be visualized by using the module Matplotlib as follows:

import matplotlib.pyplot as plt

t = 0.5
conv = []
E = np.array([10.0,10.0])   # float array, so that += updates E in place
for k in range(50):
    E += - t*grad_f(E)
    conv += [np.linalg.norm(grad_f(E))]   # store the norm of the gradient
plt.semilogy(conv)
plt.grid()
plt.xlabel("Iteration")
plt.ylabel("|Gradient|")
plt.show()

The result of the script is shown in Figure 2.7. As expected, the convergence rate is linear; that is to say, the convergence plot describes almost a line in a semi-logarithmic scale. The norm of the gradient can be used as a convergence criterion with a tolerance 𝜖 (a gradient ‖∇𝑓‖ ≤ 10⁻⁴ can be considered as the local optimum for this problem).
Notice that addition was simplified by the statement +=. In general, a statement such as x=x+1 is equivalent to x+=1. More details about this aspect are presented in Appendix C.
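
The stopping criterion mentioned above can be sketched as follows. This is only an illustration: the helper `gradient_method`, its arguments, and the iteration cap of 10 000 are assumptions introduced here, not part of the original listing. The fixed points and weights are those of the example.

```python
import numpy as np

# Fixed points and weights taken from the example above
points = np.array([[0.0, 40.0], [20.0, 70.0], [30.0, 0.0], [100.0, 50.0]])
weights = np.array([12.0, 13.0, 11.0, 18.0])

def grad_f(E):
    """Gradient of f(E) = sum_i w_i*||E - P_i|| (valid when E is not a fixed point)."""
    diff = E - points                     # one row per fixed point
    d = np.linalg.norm(diff, axis=1)      # distances to each fixed point
    return (weights / d) @ diff           # weighted sum of unit vectors

def gradient_method(grad, x0, t=0.5, tol=1e-4, max_iter=10_000):
    """Hypothetical helper: fixed-step gradient method that stops when ||grad|| <= tol."""
    x = np.array(x0, dtype=float)
    for k in range(max_iter):
        g = grad(x)
        if np.linalg.norm(g) <= tol:
            return x, k                   # converged after k iterations
        x = x - t * g
    return x, max_iter                    # tolerance not reached

E_opt, n_iter = gradient_method(grad_f, [10.0, 10.0])
print("Position:", E_opt, "iterations:", n_iter)
```

Changing `t` in the call is a simple way to observe how the step affects the number of iterations.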

Figure 2.7 Convergence of the gradient method.


2.6 Lagrange multipliers

Reality imposes physical constraints on the problems, and these constraints must be considered in the model. For example, an optimization model may include equality constraints, as presented below:

min 𝑓(𝑥)
𝑔(𝑥) = 𝑎 (2.39)

For solving this problem, a function called lagrangian is defined as follows:

ℒ(𝑥, 𝜆) = 𝑓(𝑥) + 𝜆(𝑎 − 𝑔(𝑥)) (2.40)

This new function depends on the original decision variables 𝑥 and a new
variable 𝜆, known as Lagrange multiplier or dual variable. By means of the
lagrangian function, a constrained optimization problem was transformed
into an unconstrained optimization problem that can be solved numerically,
namely:

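$$\frac{\partial \mathcal{L}}{\partial x} = \frac{\partial f}{\partial x} - \lambda \frac{\partial g}{\partial x} = 0 \tag{2.41}$$

$$\frac{\partial \mathcal{L}}{\partial \lambda} = a - g(x) = 0 \tag{2.42}$$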
By a small abuse of notation, 𝜕𝑓∕𝜕𝑥 is used instead of ∇𝑓, which is the formal representation for the n-dimensional case (the same for ℒ and 𝑔). Notice that the optimal conditions of ℒ imply an optimal solution in 𝑓 but also feasibility in terms of the constraint.
The first condition implies that the gradient of the objective function must be parallel to the gradient of the constraint, and the Lagrange multiplier is the proportionality constant. Besides this geometrical interpretation, Lagrange multipliers have another interesting interpretation. Suppose a local optimum 𝑥̃ is found for a constrained optimization problem, and we want to know the sensitivity of this optimum with respect to 𝑎. The following derivative can be calculated, relating the change of the objective function with respect to a change in the constraint:

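$$\frac{\partial \mathcal{L}}{\partial a} = \left(\frac{\partial f}{\partial x}\right)\left(\frac{\partial x}{\partial a}\right) + \left(\frac{\partial \lambda}{\partial a}\right)(a - g(x)) + \lambda\left(1 - \left(\frac{\partial g}{\partial x}\right)\left(\frac{\partial x}{\partial a}\right)\right) \tag{2.43}$$

$$= \left(\frac{\partial f}{\partial x} - \lambda\frac{\partial g}{\partial x}\right)\left(\frac{\partial x}{\partial a}\right) + (a - g(x))\left(\frac{\partial \lambda}{\partial a}\right) + \lambda \tag{2.44}$$

$$= \lambda \tag{2.45}$$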


The first two terms on the right-hand side of the equation vanish, in view of the optimal conditions of 𝑥̃; thus, the following expression is obtained:

$$\lambda = \frac{\partial \mathcal{L}}{\partial a} \tag{2.46}$$

this means that 𝜆 is the variation of the lagrangian (and hence the objective function), for a small variation in the parameter 𝑎 (see Chapter 3 for more details about dual variables).
Example 2.8. Consider the following optimization problem

min 𝑥² + 𝑦²
𝑥 + 𝑦 = 1 (2.47)

If the problem were unconstrained, the solution would be 𝑥 = 0, 𝑦 = 0; however, the solution must fulfill the constraint 𝑥 + 𝑦 = 1. Therefore, a new function is defined as follows:

ℒ(𝑥, 𝑦, 𝜆) = 𝑥² + 𝑦² + 𝜆(1 − 𝑥 − 𝑦) (2.48)

with the following optimal conditions

2𝑥 − 𝜆 = 0 (2.49)
2𝑦 − 𝜆 = 0 (2.50)
1 − 𝑥 − 𝑦 = 0 (2.51)

This is a linear system of equations with solution 𝑥 = 1∕2, 𝑦 = 1∕2, and 𝜆 = 1 that constitutes the optimum of the problem.
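
The linear system of Example 2.8 can also be solved numerically. The sketch below is an illustration under stated assumptions: the function name `solve_kkt` and the generalization of the right-hand side to a parameter `a` (so that the constraint reads 𝑥 + 𝑦 = 𝑎) are introduced here to check the interpretation of 𝜆 given by Equation (2.46) with a finite difference.

```python
import numpy as np

def solve_kkt(a):
    """Solve 2x - lam = 0, 2y - lam = 0, x + y = a for (x, y, lam)."""
    K = np.array([[2.0, 0.0, -1.0],
                  [0.0, 2.0, -1.0],
                  [1.0, 1.0,  0.0]])
    rhs = np.array([0.0, 0.0, a])
    return np.linalg.solve(K, rhs)

x, y, lam = solve_kkt(1.0)
print("x, y, lambda:", x, y, lam)

# Sensitivity check: change of the optimal cost for a small change in a
h = 1e-6
x2, y2, _ = solve_kkt(1.0 + h)
slope = ((x2**2 + y2**2) - (x**2 + y**2)) / h
print("finite-difference slope:", slope)   # should be close to lambda
```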

2.7 The Newton’s method

The optimal conditions for both constrained and unconstrained problems constitute a set of algebraic equations: 𝑆(𝑥) for the unconstrained case and 𝑆(𝑥, 𝜆) for the constrained case. This set can be solved using Newton’s method⁴.
Consider a set of algebraic equations 𝑆(𝑥) = 0 where 𝑆 ∶ ℝⁿ → ℝⁿ is continuous and differentiable. Thus, the following approximation is made:

𝑆(𝑥) ≈ 𝑆(𝑥₀) + 𝒥(𝑥₀)∆𝑥 (2.52)

where 𝒥 is the Jacobian matrix of 𝑆 and ∆𝑥 = 𝑥 − 𝑥₀. This constitutes the first-order approximation of 𝑆 around a point 𝑥₀; finding the zero of 𝑆 means to approximate successively the solution by using the following iteration:

∆𝑥 ← [𝒥(𝑥)]⁻¹ 𝑆(𝑥) (2.53)
𝑥 ← 𝑥 − ∆𝑥 (2.54)

This iteration is the basic Newton’s method. Compared to the gradient method, this method is faster since it includes information from the second derivative⁵. In addition, Newton’s method does not require defining a step 𝑡 as in the gradient method. However, each iteration of Newton’s method is computationally expensive since it requires formulating a Jacobian and solving a linear system.

⁴ Again, a classic method that became more important with the advent of the computer.
Example 2.9. Consider the following optimization problem:

min 10𝑥² + 15𝑦² + exp(𝑥 + 𝑦)
𝑥 + 𝑦 = 5 (2.55)

Its corresponding lagrangian is presented below:

ℒ(𝑥, 𝑦, 𝜆) = 10𝑥² + 15𝑦² + exp(𝑥 + 𝑦) + 𝜆(5 − 𝑥 − 𝑦) (2.56)

and the optimal conditions form the following set of algebraic equations:

$$S(x,y,\lambda) = \left\{ \begin{array}{l} 20x + \exp(x+y) - \lambda = 0 \\ 30y + \exp(x+y) - \lambda = 0 \\ 5 - x - y = 0 \end{array} \right\} \tag{2.57}$$

The corresponding Jacobian is the following matrix:

$$\mathcal{J} = \begin{pmatrix} 20 + \exp(x+y) & \exp(x+y) & -1 \\ \exp(x+y) & 30 + \exp(x+y) & -1 \\ -1 & -1 & 0 \end{pmatrix} \tag{2.58}$$

It is possible to formulate Newton’s method using the information described above. The algorithm implemented in Python is presented below:
import numpy as np

def Fobj(x,y):
    "Objective function"
    return 10*x**2 + 15*y**2 + np.exp(x+y)

def Grad(x,y,z):
    "Gradient of the Lagrangian, Equation (2.57); z is the multiplier lambda"
    dx = 20*x + np.exp(x+y) - z
    dy = 30*y + np.exp(x+y) - z
    dz = 5 - x - y
    return np.array([dx,dy,dz])

def Jac(x,y,z):
    "Jacobian of Grad, Equation (2.58)"
    p = np.exp(x+y)
    return np.array([[20+p,p,-1],[p,30+p,-1],[-1,-1,0]])

(x,y,z) = (10,10,1) # initial condition

G = Grad(x,y,z)
while np.linalg.norm(G) >= 1E-8:
    J = Jac(x,y,z)
    step = -np.linalg.solve(J,G)   # Newton step
    (x,y,z) = (x,y,z) + step
    G = Grad(x,y,z)
    print('Gradient: ',np.linalg.norm(G))

print('Optimum point: ', np.round([x,y,z],2))
print('Objective function: ', Fobj(x,y))

⁵ The jacobian matrix of 𝑆 is equivalent to the hessian matrix of 𝑓.

In this case, we used a tolerance of 10⁻⁸. The algorithm achieves convergence in a few iterations, as the reader can check by running the code.

2.8 Further readings

This chapter presented basic optimization methods for constrained and unconstrained problems. Conditions for convergence of these algorithms were not presented here. However, they are incorporated into modules and solvers called from a modeling/programming language such as Python. Our approach is to use these solvers and concentrate on studying the characteristics of the models. Readers interested in details of the algorithms are invited to review [12] and [11], for a formal analysis of convergence; other variants of the algorithms can be found in [13]. Moreover, a complete review of Lagrange multipliers can be studied in [14].
This chapter is also an excuse for presenting Python’s features as a programming and modeling language. The reader can review Appendix C for more details about each of the commands used in this section.

2.9 Exercises
1. Find the highest and lowest points of the set given by the intersection of the cylinder 𝑥² + 𝑦² ≤ 1 with the plane 𝑥 + 𝑦 + 𝑧 = 1, as shown in Figure 2.8.

Figure 2.8 Intersection of an affine space with a cylinder.

2. What are the new values of 𝑧max and 𝑧min if the cylinder increases its radius by a small value, that is, if the radius changes from 𝑟 = 1 to 𝑟 = 1 + ∆𝑟? (Consider the interpretation of the Lagrange multipliers.)
3. The following algebraic equation gives the mechanical power in a wind turbine:

$$P = \frac{1}{2}\rho A C_p(\lambda) v^3 \tag{2.59}$$

where 𝑃 is the power extracted from the wind; 𝜌 is the air density; 𝐶𝑝 is the performance coefficient or power coefficient; 𝜆 is the tip speed ratio; 𝑣 is the wind velocity, and 𝐴 is the area covered by the rotor (see [15] for details). Determine the value of 𝜆 that produces maximum efficiency if the performance coefficient is given by Equation (2.60):

$$C_p = 0.73\left(\frac{151}{\lambda} - 13.2\right)\exp\left(\frac{-18.4}{\lambda}\right) \tag{2.60}$$

Use the gradient method, starting from 𝜆 = 10 and a step of 𝑡 = 0.1. Hint: use the module SymPy to obtain the expression of the gradient.
4. Solve the following optimization problem using the gradient method:

min 𝑓(𝑥₀, 𝑥₁) = (𝑥₀ − 10)² + (𝑥₁ − 8)² (2.61)

Depart from the point (0, 0) and use a fixed step 𝑡 = 0.8. Repeat the problem with a fixed step 𝑡 = 1.1. Show a plot of convergence.
5. Solve the following optimization problem using the gradient method.

$$\min f(x) = \frac{1}{2}(x - 1_n)^\top H (x - 1_n) + b^\top x \tag{2.62}$$

where 1ₙ is a column vector of size 𝑛, with all entries equal to 1; 𝑏 is a column vector such that 𝑏ₖ = 𝑘𝑛²; and 𝐻 is a symmetric matrix of size 𝑛 × 𝑛 constructed in the following way: ℎₖₘ = (𝑚 + 𝑘)∕2 if 𝑘 ≠ 𝑚 and ℎₖₘ = 𝑛² + 𝑛 if 𝑘 = 𝑚. Show the convergence of the method for different steps 𝑡 and starting from an initial point 𝑥 = 0. Use 𝑛 = 10, 𝑛 = 100, and 𝑛 = 1000. All indices 𝑘 and 𝑚 start at zero.
6. Show that Euclidean, Manhattan, and uniform norms fulfill the four con-
ditions to be considered a norm.
7. Consider a modified version of Example 2.6, where the position of the com-
mon point 𝐸 must be such that 𝑥𝐸 = 𝑦𝐸 . Solve this optimization problem
using Newton’s method.
8. Solve the problem of Item 4 with the following constraint (use Newton’s
method):
𝑥0 + 3𝑥1 = 5 (2.63)
9. Solve the problem of Item 5 including the following constraint (use Newton’s method):

$$1_n^\top x = 1 \tag{2.64}$$
10. Newton’s method can be used to solve unconstrained optimization prob-
lems. Solve the following problem using Newton’s method and compare
the convergence rate and the solution with the gradient method.
1
min (𝑥 + 3𝑦 + 1)2 + (𝑥 − 2𝑦)4 (2.65)
4


Convex optimization

Learning outcomes

By the end of this chapter, the student will be able to:

● Identify convex functions, convex sets, and convex optimization
problems.
● Recognize when a problem has a global optimum and a unique solution.
● Formulate and solve a dual problem.

3.1 Convex sets

The set of feasible solutions of an optimization problem, henceforth feasible set, may have different shapes and properties. It can be open or closed, discrete or continuous, linear or non-linear. Each shape determines the type of optimization problem. However, there are sets remarkably well-behaved for optimization problems. These are the convex sets.
A set is convex if, for any pair of points within the set, the line segment that joins these points also belongs to the set. Figure 3.1 shows an example of a convex and a non-convex set. In the first case, all the points in the line segment 𝑥₁𝑥₂ belong to the set. In the second case, there are some points outside the set.
It is easy to identify convex sets in ℝ2 and even in ℝ3 . We only require to draw
the set, and the property becomes evident. However, power systems operation
problems are usually in ℝ𝑛 . Therefore, we require a systematic way to identify
convex sets.


Figure 3.1 Example of a convex set Ω𝐴 and a non-convex set Ω𝐵 .

Example 3.1. Equations and inequalities usually define sets. Let us consider,
for instance, the following three sets, related to the 2-norm:
Ω𝐴 = {(𝑥, 𝑦) ∶ 𝑥² + 𝑦² ≤ 1} (3.1)
Ω𝐵 = {(𝑥, 𝑦) ∶ 𝑥² + 𝑦² = 1} (3.2)
Ω𝐶 = {(𝑥, 𝑦) ∶ 𝑥² + 𝑦² ≥ 1} (3.3)

The set Ω𝐴 is convex since any pair of points 𝑚, 𝑛 generates a line segment whose points belong to the set. The set Ω𝐵 is not convex since it is defined only in the ball’s boundary, and hence, any line segment between two different points will have points outside the set. The set Ω𝐶 is also non-convex since several points leave the set if we draw, for instance, a line segment that passes through the point (0, 0). Figure 3.2 shows a graphical representation of each of these situations. Notice that we can be misled if we only look carelessly at the equations.
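
The three situations of Example 3.1 can be reproduced numerically. The following sketch is only an illustration: the membership functions, the helper `segment_inside`, and the chosen pair of points are assumptions made here. A single pair of points can disprove convexity, but it cannot prove it.

```python
import numpy as np

# Membership tests for the three sets (a small tolerance is used on the boundary)
def in_A(p): return p[0]**2 + p[1]**2 <= 1 + 1e-12
def in_B(p): return abs(p[0]**2 + p[1]**2 - 1) <= 1e-12
def in_C(p): return p[0]**2 + p[1]**2 >= 1 - 1e-12

def segment_inside(member, m, n, samples=11):
    """Check whether sampled points of the segment between m and n belong to the set."""
    m, n = np.asarray(m, float), np.asarray(n, float)
    return all(member((1 - lam)*m + lam*n) for lam in np.linspace(0, 1, samples))

m, n = (1.0, 0.0), (-1.0, 0.0)       # two points that belong to the three sets
print(segment_inside(in_A, m, n))    # the segment stays inside the disk
print(segment_inside(in_B, m, n))    # interior points of the segment leave the circle
print(segment_inside(in_C, m, n))    # the segment crosses the point (0, 0)
```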

Figure 3.2 Example of three different sets with a similar definition. Only the first set is convex.


We might identify a convex set by a graph in ℝ² or ℝ³, as in the previous example. However, most of the power systems problems are in ℝⁿ, where we may not have a simple graphic representation. Therefore, we require a more formal definition, namely: A set Ω ⊆ ℝⁿ is convex if for any pair of points 𝑥, 𝑦 ∈ Ω we have that,

(1 − 𝜆)𝑥 + 𝜆𝑦 ∈ Ω (3.4)

for any 𝜆 ∈ ℝ, 0 ≤ 𝜆 ≤ 1.
This definition allows determining if a set is convex in a general and systematic way. We only need to select two general points 𝑥 and 𝑦 that belong to the set, and demonstrate that any point 𝑧 in the segment 𝑥𝑦 also belongs to the set. Let us perform this calculation algebraically, as presented in the following example.

Example 3.2. Let us consider a set defined as Ω = {𝑥 ∈ ℝⁿ ∶ 𝐴𝑥 = 𝑏}. This set is called an affine set. Then, two points 𝑥 ∈ Ω and 𝑦 ∈ Ω must be such that 𝐴𝑥 = 𝑏 and 𝐴𝑦 = 𝑏. Let us consider now a point 𝑧 between 𝑥 and 𝑦, that is 𝑧 = (1 − 𝜆)𝑥 + 𝜆𝑦, with 0 ≤ 𝜆 ≤ 1. Then we have the following:

𝐴𝑥 = 𝑏 (3.5)
𝐴𝑦 = 𝑏 (3.6)
(1 − 𝜆)𝐴𝑥 + 𝜆𝐴𝑦 = (1 − 𝜆)𝑏 + 𝜆𝑏 (3.7)
𝐴𝑧 = 𝑏 (3.8)

The last equation implies that 𝑧 ∈ Ω and hence, Ω is a convex set. Notice that we did not select a particular point 𝑥 or 𝑦. Our demonstration was general for any pair of points.

An important property of convex sets is that the intersection of two convex sets generates another convex set. However, the union of two convex sets is not necessarily convex. Figure 3.3 shows an example of these set-operations.

Figure 3.3 Union and intersection of two convex sets (a triangle and a ball). The intersection is convex but the union may be non-convex.


Fortunately, the feasible set of an optimization problem is given by the intersection of the sets generated by its constraints. Consider, for instance, the following optimization problem:

min 𝑓(𝑥)
𝑔(𝑥) ≤ 0 (3.9)
ℎ(𝑥) ≤ 0

This problem is equivalent to Equation (3.10),

min 𝑓(𝑥) with 𝑥 ∈ Ω (3.10)

where

Ω = Ω𝑔 ∩ Ωℎ (3.11)
Ω𝑔 = {𝑥 ∈ ℝⁿ ∶ 𝑔(𝑥) ≤ 0} (3.12)
Ωℎ = {𝑥 ∈ ℝⁿ ∶ ℎ(𝑥) ≤ 0} (3.13)

Therefore, it is enough to check that each constraint defines a convex set. We show several classic examples of convex sets below.
Example 3.3. An affine set is a subset of ℝⁿ given by Ω = {𝑥 ∈ ℝⁿ ∶ 𝐴𝑥 = 𝑏}. This set is convex as demonstrated in Example 3.2. We prefer the term affine instead of linear, since Ω is not a linear space, unless 𝑏 = 0 (a linear space must contain the zero vector [16]).
Example 3.4. A polytope is a set defined as follows:

𝒫 = {𝑥 ∈ ℝ𝑛 ∶ 𝐴𝑥 ≤ 𝑏} (3.14)

where 𝑏 is a vector and 𝐴 is a matrix of proper size. It is easy to see that 𝒫 is convex. Let us consider two points 𝑥 ∈ 𝒫 and 𝑦 ∈ 𝒫, then, we have the following results:

𝐴𝑥 ≤ 𝑏 (3.15)
𝐴𝑦 ≤ 𝑏 (3.16)

Now, let us define an intermediate point 𝑧 = 𝜆𝑥 + (1 − 𝜆)𝑦 with 0 ≤ 𝜆 ≤ 1, then,

𝜆𝐴𝑥 + (1 − 𝜆)𝐴𝑦 ≤ 𝜆𝑏 + (1 − 𝜆)𝑏 (3.17)
𝐴𝑧 ≤ 𝑏 (3.18)

Therefore, 𝑧 ∈ 𝒫 which means that 𝒫 is convex.


Example 3.5. The following set forms a polytope, as depicted in Figure 3.4:

2𝑥 + 5𝑦 ≤ 10
2𝑦 − 2𝑥 ≤ 1
𝑥 ≥ 0 (3.19)
𝑦 ≥ 0

This is a convex set that is represented as Equation (3.14) with the following parameters:

$$A = \begin{pmatrix} 2 & 5 \\ -2 & 2 \\ -1 & 0 \\ 0 & -1 \end{pmatrix} \tag{3.20}$$

and,

𝑏 = (10 1 0 0)⊤ (3.21)
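
A quick numerical check of this representation is sketched below. The rows of `A` are written directly from the inequalities in (3.19), with each row ordered as (coefficient of 𝑥, coefficient of 𝑦); the helper `in_polytope` and the test points are assumptions introduced for illustration.

```python
import numpy as np

# Each row of A and entry of b encodes one inequality of (3.19) as A @ [x, y] <= b
A = np.array([[ 2.0,  5.0],    # 2x + 5y <= 10
              [-2.0,  2.0],    # 2y - 2x <= 1
              [-1.0,  0.0],    # x >= 0
              [ 0.0, -1.0]])   # y >= 0
b = np.array([10.0, 1.0, 0.0, 0.0])

def in_polytope(p, tol=1e-9):
    """Return True if the point p fulfills all the inequalities."""
    return bool(np.all(A @ np.asarray(p, float) <= b + tol))

x = np.array([0.0, 0.5])   # feasible point on the line 2y - 2x = 1
y = np.array([5.0, 0.0])   # feasible point on the line 2x + 5y = 10
for lam in np.linspace(0, 1, 5):
    z = lam*x + (1 - lam)*y              # intermediate point
    print(lam, z, in_polytope(z))
print(in_polytope([0.0, 1.0]))           # this point violates 2y - 2x <= 1
```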

Figure 3.4 Example of a polytope in the plane.

Example 3.6. A unit ball with center in zero is a set ℬ₀, defined as follows:

ℬ₀ = {𝑥 ∈ ℝⁿ ∶ ‖𝑥‖ ≤ 1} (3.22)

where ‖⋅‖ represents a norm in ℝⁿ. Let us consider two points 𝑥 ∈ ℬ₀ and 𝑦 ∈ ℬ₀, then we have that,

‖𝑥‖ ≤ 1 (3.23)
‖𝑦‖ ≤ 1 (3.24)

Now, let us consider an intermediate point 𝑧 = 𝜆𝑥 + (1 − 𝜆)𝑦 with 0 ≤ 𝜆 ≤ 1, then we have the following result¹:

‖𝑧‖ = ‖𝜆𝑥 + (1 − 𝜆)𝑦‖ (3.25)
≤ 𝜆‖𝑥‖ + (1 − 𝜆)‖𝑦‖ (3.26)
≤ 𝜆 + (1 − 𝜆) = 1 (3.27)

Therefore 𝑧 ∈ ℬ₀ and consequently, ℬ₀ is a convex set. The set Ω𝐴 (as shown in Figure 3.2) is a particular case of a unit ball in ℝ². Notice that the set ℳ𝑟 = {𝑥 ∈ ℝⁿ ∶ ‖𝑥‖ ≥ 𝑟} is not a convex set, for the same reasons given in Example 3.1.

In summary, the procedure to demonstrate convexity is straightforward: first, we define two points in the set; then, we define an intermediate point; finally, we demonstrate that this intermediate point belongs to the set. Let us see a final example.

Example 3.7. An ellipsoid is a set given by

ℰ = {𝑥 ∈ ℝⁿ ∶ 𝑥⊤𝐴𝑥 ≤ 1} (3.28)

where 𝐴 is a symmetric positive definite matrix (i.e., all its eigenvalues are positive). A set of this form is equivalent to a unit ball, as follows:

ℬ = {𝑥 ∈ ℝⁿ ∶ ‖𝑥‖𝐴 ≤ 1} (3.29)

with a norm ‖⋅‖𝐴 defined as Equation (3.30),

$$\|x\|_A = \sqrt{x^\top A x} \tag{3.30}$$

We already demonstrated in the previous example that a unit ball is a convex set. The reader is invited to demonstrate that Equation (3.30) is actually a norm, if 𝐴 is positive definite.

Example 3.8. A second-order cone (SOC) is a convex set defined as follows:

SOC = {𝑥 ∈ ℝⁿ ∶ ‖𝐴𝑥 + 𝑏‖ ≤ 𝑐⊤𝑥 + 𝑑} (3.31)

where ‖⋅‖ represents the 2-norm; 𝐴 is a square matrix; 𝑏 and 𝑐 are vectors in ℝⁿ; and 𝑑 is a scalar. Second-order cones play a crucial role in the convex approximations for the optimal power flow, as presented in Chapter 10.

¹ We used the properties of the norm, already explained in the previous chapter.

Figure 3.5 Comparison between a convex and a non-convex function.

3.2 Convex functions

As discussed in Chapter 2, an optimization problem is constituted by a feasible
set and an objective function. We already observed that the feasible set might
be convex. Now, we can extend this concept to the objective function.
A function 𝑓 ∶ ℝⁿ → ℝ is convex if it fulfills the following inequality, commonly known as Jensen’s inequality, for any pair of points 𝑥, 𝑦 ∈ ℝⁿ:

𝑓(𝜆𝑥 + (1 − 𝜆)𝑦) ≤ 𝜆𝑓(𝑥) + (1 − 𝜆)𝑓(𝑦) (3.32)

where 0 ≤ 𝜆 ≤ 1. If the inequality holds strictly for 𝑥 ≠ 𝑦 and 0 < 𝜆 < 1, then the function is strictly convex². We say the function 𝑓(𝑥) is concave if −𝑓(𝑥) is convex.
Figure 3.5 depicts a convex and a non-convex function. In case a) we can draw a line segment between (𝑥₁, 𝑦₁) and (𝑥₂, 𝑦₂) but there are some parts of the function that are above the line segment (i.e., the function is non-convex). In case b) we can see that every point in the line segment is above the function itself. This function is convex.

Example 3.9. A list of convex functions in ℝ is presented in Figure 3.6. A convex function is continuous but not necessarily differentiable; for example, 𝑓(𝑥) = |𝑥| is a convex function but it does not have a derivative at 𝑥 = 0. In addition, a function might be convex only in a certain domain; for instance, 𝑓(𝑥) = − cos(𝑥) is convex only for 𝑥 ∈ [−𝜋∕2, 𝜋∕2].
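
Jensen’s inequality (3.32) can be tested on sample points, as sketched below. The helper `jensen_holds`, the grids, and the tolerance are assumptions introduced here. Such a test can only disprove convexity: passing it on a finite sample is not a proof.

```python
import numpy as np

def jensen_holds(f, points, lambdas=np.linspace(0, 1, 11), tol=1e-12):
    """Test Jensen's inequality for every pair of sample points."""
    for x in points:
        for y in points:
            for lam in lambdas:
                lhs = f(lam*x + (1 - lam)*y)
                rhs = lam*f(x) + (1 - lam)*f(y)
                if lhs > rhs + tol:
                    return False          # a violation was found
    return True

neg_cos = lambda x: -np.cos(x)
print(jensen_holds(np.abs, np.linspace(-2, 2, 21)))               # |x| on [-2, 2]
print(jensen_holds(neg_cos, np.linspace(-np.pi/2, np.pi/2, 21)))  # -cos(x) on [-pi/2, pi/2]
print(jensen_holds(neg_cos, np.linspace(0, 2*np.pi, 21)))         # -cos(x) on [0, 2*pi]
```

The last call uses a domain where − cos(𝑥) is not convex; for example, the midpoint of 𝑥 = 0 and 𝑦 = 2𝜋 gives a function value above the line segment.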

There are three properties that are useful for defining and identifying convex
functions, especially in the general case of ℝ𝑛 .

² This type of function ensures uniqueness of the solution, as will be presented in Section 3.4.

Figure 3.6 List of common convex functions.

Property 1: The sum of two or more convex functions is also a convex function. However, the difference of convex functions is not necessarily convex.
Property 2: The composition of a convex and an affine function is also a convex
function, e.g., 𝑓(𝐴𝑥 + 𝑏) is convex if 𝑓 is convex.
Property 3: The composition of a convex function with a convex non-
decreasing function is also a convex function (see Figure 3.7).

Let us demonstrate the third property (the reader is invited to demonstrate the first two properties). Consider a convex function 𝑓 ∶ ℝⁿ → ℝ, then it fulfills Jensen’s inequality:

𝑓(𝜆𝑥 + (1 − 𝜆)𝑦) ≤ 𝜆𝑓(𝑥) + (1 − 𝜆)𝑓(𝑦) (3.33)

Let us consider now a function 𝑔 which is non-decreasing, then we can apply this function to Equation (3.33) without modifying the direction of the inequality:

𝑔(𝑓(𝜆𝑥 + (1 − 𝜆)𝑦)) ≤ 𝑔(𝜆𝑓(𝑥) + (1 − 𝜆)𝑓(𝑦)) (3.34)

If 𝑔 is also convex, then the right-hand side of Equation (3.34) is not greater than 𝜆𝑔(𝑓(𝑥)) + (1 − 𝜆)𝑔(𝑓(𝑦)); hence, 𝑔(𝑓(𝑥)) also fulfills Jensen’s inequality and, therefore, it is convex.

Figure 3.7 Composition of a convex function and a convex non-decreasing function.
Example 3.10. The following functions are convex since they can be defined as the sum of convex functions:

$$f(x) = 3x + 2e^{-x} \tag{3.35}$$
$$f(x) = 5x + 3x^2 \tag{3.36}$$
$$f(x) = 8x^2 - \ln(x) \tag{3.37}$$

Notice that the difference of convex functions is not necessarily convex. For example, 𝑓(𝑥) = 5𝑥 − 3𝑥² is non-convex.
Example 3.11. The following functions are convex since they are the composition of a convex and an affine function:

$$f(x) = \left\| a^\top x + b \right\| \tag{3.38}$$
$$f(x) = (a^\top x + b)^2 \tag{3.39}$$
$$f(x) = -\ln(a^\top x + b) \tag{3.40}$$

Example 3.12. The following function is convex since it is the composition of a convex function 𝑥⊤𝑥, and a convex/monotonic-increasing function exp(𝑥):

𝑓(𝑥) = exp(𝑥⊤𝑥) (3.41)

3.3 Convex optimization problems

A convex optimization problem is a problem given by Equation (3.42),
min 𝑓(𝑥)
𝑥∈Ω (3.42)
where 𝑓 is a convex function and Ω is a convex set. The latter can be represented
in terms of equality and inequality constraints due to the properties already
defined in the previous section.


Let us consider a function 𝑔 ∶ ℝⁿ → ℝ and define a new set known as the epigraph of 𝑔, as follows:

epi(𝑔, 𝑡) = {𝑥 ∈ ℝ𝑛 ∶ 𝑔(𝑥) ≤ 𝑡} (3.43)

A function is convex if and only if its epigraph is convex. This property is very
useful to identify convex sets and convex functions. For instance, the set Ω𝐴
in Example 3.1 is convex since the function 𝑓(𝑥, 𝑦) = 𝑥2 + 𝑦 2 is a convex
function.

Example 3.13. An optimization problem of the form

min 𝑓(𝑥) (3.44)

can be transformed using the epigraph into the following problem:

min 𝑡
𝑓(𝑥) ≤ 𝑡 (3.45)

This model is convex if 𝑓 is convex.

In general, a set of inequalities of the form 𝑔ᵢ(𝑥) ≤ 0 can be represented as the intersection of the epigraphs of 𝑔ᵢ with 𝑡 = 0, namely:

$$\Omega = \bigcap_i \operatorname{epi}(g_i, 0) \tag{3.46}$$

Recall that the intersection of convex sets is also a convex set. Equality con-
straints of the form 𝐴𝑥 = 𝑏 (affine) may complement the model. Hence,
the canonical representation of a convex optimization problem is presented
below:

min 𝑓(𝑥)
𝑔(𝑥) ≤ 0 (3.47)
𝐴𝑥 = 𝑏

where 𝑓 and 𝑔 are convex functions. It is important to notice that equality con-
straints must be affine and inequality constraints must be convex. A constraint
of the form 𝑓(𝑥) = 0 or 𝑓(𝑥) ≥ 0 does not define a convex set, even if 𝑓 is convex
(recall the case of Ω𝐵 and Ω𝐶 in Figure 3.2). Below we present some common
examples of convex optimization problems.


Example 3.14. A linear programming problem is a problem where both the objective function and the constraints are affine.

min 𝑐⊤𝑥
𝐴𝑥 = 𝑏 (3.48)
𝑥 ≥ 0

This type of problem is convex since affine functions are also convex. The feasible set is a polytope and the optimum is usually a vertex. Several problems in power systems operation can be represented as linear programming models, for example, the economic dispatch and the demand-side management.

Example 3.15. Quadratic programming problems are problems where the objective function is quadratic and the constraints are affine.

$$\begin{aligned} \min\ & \frac{1}{2} x^\top H x + c^\top x \\ & Ax = b \\ & x \geq 0 \end{aligned} \tag{3.49}$$

This problem is convex if 𝐻 is positive semidefinite. The feasible set is also a polytope but the optimum could be inside the set, or in the boundary. The economic dispatch of thermal units is a common example of quadratic programming problems.

Example 3.16. Quadratic programming problems with quadratic constraints have the following structure:

$$\begin{aligned} \min\ & \frac{1}{2} x^\top H x + c^\top x \\ & Ax \leq b \\ & \frac{1}{2} x^\top M x + s^\top x \leq t \end{aligned} \tag{3.50}$$

This problem is convex if both 𝐻 and 𝑀 are positive semidefinite. It is very important to emphasize the direction of the sign ≤ in Equation (3.50). Notice that the following constraints are not convex even if 𝑀 is positive semidefinite:

$$\frac{1}{2} x^\top M x + s^\top x = t \tag{3.51}$$
$$\frac{1}{2} x^\top M x + s^\top x \geq t \tag{3.52}$$

Quadratic constraints can be transformed into second-order cone constraints, as will be presented in Chapter 5 and in Chapter 10 for the optimal power flow problem.
Example 3.17. A problem with the following objective function is convex, since exp(𝑥) is a convex function.

$$\begin{aligned} \min\ & \sum_k \exp(a_k x_k) \\ & Ax \leq b \end{aligned} \tag{3.53}$$

Environmental dispatch is an instance of this type of problem.

3.4 Global optimum and uniqueness of the solution

The main feature of convex optimization problems is the capability to guarantee a global optimum; therein lies our interest in this type of model. The intuition behind this feature is quite simple: let us consider a convex optimization problem represented as Equation (3.47) with a local optimum 𝑥̃, that is to say:

$$f(\tilde{x}) = \inf \{ f(x) : x \in \mathcal{N} \} \tag{3.54}$$

where 𝒩 ⊂ Ω is a neighborhood of 𝑥̃ defined as follows:

$$\mathcal{N} = \{ x \in \Omega : \| x - \tilde{x} \| < r \} \tag{3.55}$$

Let us consider now a feasible point 𝑦 ∈ Ω outside of 𝒩. Since Ω is convex, the following point is also feasible:

$$x = \alpha \tilde{x} + \beta y \tag{3.56}$$

where 𝛼 + 𝛽 = 1 and 0 ≤ 𝛼 ≤ 1, 0 ≤ 𝛽 ≤ 1. Moreover, we can choose 𝛽 > 0 small enough so that 𝑥 belongs to 𝒩. Now, since 𝑓 is convex, then we have the following:

$$f(x) \leq \alpha f(\tilde{x}) + \beta f(y) \tag{3.57}$$

Evidently 𝑓(𝑥̃) ≤ 𝑓(𝑥) since 𝑥 is in the neighborhood of 𝑥̃, therefore:

$$f(\tilde{x}) \leq f(x) \leq \alpha f(\tilde{x}) + \beta f(y) \tag{3.58}$$
$$(1 - \alpha) f(\tilde{x}) \leq \beta f(y) \tag{3.59}$$
$$\beta f(\tilde{x}) \leq \beta f(y) \tag{3.60}$$
$$f(\tilde{x}) \leq f(y) \tag{3.61}$$

Consequently, the objective function at any feasible point outside the neighborhood of 𝑥̃ is also greater than or equal to 𝑓(𝑥̃) and hence, 𝑥̃ is the optimal point in the entire set Ω, that is to say, 𝑥̃ is a global optimum of the problem.

Figure 3.8 Schematic representation of the neighborhood of 𝑥̃ in a convex optimization problem over a set Ω.
Now, a global optimum does not imply uniqueness, since we could have several 𝑥̃ with the same value of the objective function³. We require the function to be strictly convex in order to guarantee the solution is unique. Let us see why this is so: consider two different optimal points 𝑥̃, 𝑦̃, namely:

$$f(\tilde{x}) = f(\tilde{y}) \leq f(z) \tag{3.62}$$

for all 𝑧 ∈ Ω. Let us consider now an intermediate point 𝑧 = 𝛼𝑥̃ + 𝛽𝑦̃ with 𝛼 + 𝛽 = 1, 𝛼 > 0, and 𝛽 > 0; since Ω is convex, then this point must belong to the feasible set. If the function is strictly convex, then we have that:

$$f(z) < \alpha f(\tilde{x}) + \beta f(\tilde{y}) = f(\tilde{x}) = f(\tilde{y}) \tag{3.63}$$

but 𝑓(𝑧) < 𝑓(𝑥̃) contradicts our initial assumption. Therefore, there is no way that 𝑥̃ ≠ 𝑦̃ for a strictly convex function (i.e., 𝑥̃ = 𝑦̃ and the optimum is unique).
One way to identify a strictly convex function is by defining a new function 𝑔 as follows:

$$g(x) = f(x) - \mu \| x \|^2 \tag{3.64}$$

with 𝜇 > 0. If 𝑔 is convex, then 𝑓 is not only convex but also strictly convex (in this case, 𝑓 is said to be strongly convex). The relation among strongly, strictly, and convex functions is depicted in Figure 3.9:

Figure 3.9 Relation among the concepts of strict and strong convexity. All strongly
convex functions are strictly convex. Both are, of course, convex.

³ See Figure 2.5 in Chapter 2.

3.5 Duality
In this section, we briefly study duality theory for convex optimization problems. We can associate a complementary optimization problem, called a dual problem, with any optimization problem, convex or not convex. Let us consider a convex optimization problem as Equation (3.47), then we can define a Lagrangian function as follows:

ℒ(𝑥, 𝜆, 𝜇) = 𝑓(𝑥) + 𝜇⊤𝑔(𝑥) + 𝜆⊤(𝐴𝑥 − 𝑏) (3.65)

where 𝜆 are the Lagrange multipliers associated with equality constraints, and 𝜇 are the Lagrange multipliers associated with inequality constraints. These multipliers represent a change in the objective function for a change in the constraint, as previously demonstrated in Section 2.6. We assume that 𝜇 ≥ 0.
Now, we define a new function called the dual function as presented below:

$$\mathcal{W}(\lambda, \mu) = \inf_x \mathcal{L}(x, \lambda, \mu) \tag{3.66}$$

Notice the dual function depends on 𝜆 and 𝜇 but not on 𝑥, as is the case of ℒ. Since 𝜇 ≥ 0, then for any feasible point 𝑥, we have that,

ℒ(𝑥, 𝜆, 𝜇) ≤ 𝑓(𝑥) (3.67)

Since the point is feasible, then 𝐴𝑥 − 𝑏 = 0 and 𝑔(𝑥) ≤ 0; therefore, the term 𝜇⊤𝑔(𝑥) is non-positive and Equation (3.67) is clearly met. This property is known as weak duality and holds for any feasible point, even at the minimum. Since the dual function is the infimum of ℒ over 𝑥, we have also that

𝒲(𝜆, 𝜇) ≤ 𝑓(𝑥) (3.68)

Hence, the dual function is a lower bound of the optimization problem in Equation (3.47) (henceforth, the primal problem). We can define a dual optimization problem as given below:

max 𝒲(𝜆, 𝜇)
𝜇 ≥ 0 (3.69)

The relation between the primal and the dual problem is depicted in Figure 3.10. The primal problem searches for a minimum point in the feasible set while the dual problem searches for a maximum point in its own feasible set. The maximum point of 𝒲 constitutes a lower bound of the problem since any feasible point in the primal returns a value higher than or equal to the value given by any feasible point in the dual problem.

Figure 3.10 Relation between primal and dual problems.

Example 3.18. The dual of a linear programming problem of the form min 𝑐⊤𝑥 subject to 𝐴𝑥 ≤ 𝑏 is calculated as follows:

$$\mathcal{W}(\mu) = \inf_x \left\{ c^\top x + \mu^\top (Ax - b) \right\} \tag{3.70}$$

This infimum is finite only if (𝑐⊤ + 𝜇⊤𝐴) = 0⊤, and thus, 𝒲(𝜇) = −𝜇⊤𝑏. Maximizing −𝜇⊤𝑏 is equivalent to minimizing 𝑏⊤𝜇; therefore, the dual problem is:

min 𝑏⊤𝜇
𝑐 + 𝐴⊤𝜇 = 0 (3.71)
𝜇 ≥ 0

That is to say, the dual of a linear programming problem is another linear programming problem.

Example 3.19. Let us consider the following quadratic programming problem, that represents a simple economic dispatch:

$$\begin{aligned} \min\ & \sum_{k=1}^{n} \frac{a_k}{2} p_k^2 + b_k p_k \\ & \sum_{k=1}^{n} p_k = d \\ & p_k \geq 0 \end{aligned} \tag{3.72}$$

The dual function is given below:

$$\mathcal{W}(\mu, \lambda) = \inf_p \left\{ \sum_{k=1}^{n} \left( \frac{a_k}{2} p_k^2 + b_k p_k \right) + \lambda \left( d - \sum_{k=1}^{n} p_k \right) + \sum_{k=1}^{n} \mu_k (-p_k) \right\} \tag{3.73}$$

The infimum is attained when 𝑎ₖ𝑝ₖ + 𝑏ₖ − 𝜆 − 𝜇ₖ = 0 and hence, the dual problem is the following:

$$\begin{aligned} \max\ & \lambda d - \sum_{k=1}^{n} \frac{1}{2 a_k} (\lambda + \mu_k - b_k)^2 \\ & \mu_k \geq 0 \end{aligned} \tag{3.74}$$

In general, the dual of a quadratic programming problem is another quadratic programming problem.
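
The relation between the primal problem (3.72) and the dual problem (3.74) can be checked on a small instance. The sketch below uses hypothetical data for two units (the values of `a`, `b`, and `d` are not taken from the text) and assumes that the constraints 𝑝ₖ ≥ 0 are not active, so that 𝜇 = 0; this assumption holds for the chosen data.

```python
import numpy as np

# Hypothetical data for two thermal units (not taken from the text)
a = np.array([0.02, 0.04])   # quadratic cost coefficients
b = np.array([10.0, 8.0])    # linear cost coefficients
d = 100.0                    # demand

def dual(lam, mu):
    """Objective of the dual problem in Equation (3.74)."""
    return lam*d - np.sum((lam + mu - b)**2 / (2*a))

# With mu = 0, the derivative of the dual with respect to lam is
# d - sum((lam - b)/a); setting it to zero gives the multiplier
mu = np.zeros(2)
lam = (d + np.sum(b/a)) / np.sum(1/a)
p = (lam + mu - b) / a       # minimizer of the Lagrangian for these multipliers

primal = np.sum(a/2*p**2 + b*p)
print("lambda:", lam, "dispatch:", p)
print("primal cost:", primal, "dual value:", dual(lam, mu))
```

For these data, the dispatch is non-negative and meets the demand, and both values coincide; any other value of 𝜆 returns a lower value of the dual function.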
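The dual function is concave, even if the primal problem is non-convex. Remember that a function 𝒲 is concave if −𝒲 is convex. Therefore, we must check if −𝒲 fulfills Jensen’s inequality. Let us take two points in the dual function, (𝜆₁, 𝜇₁) and (𝜆₂, 𝜇₂) with two real values 𝛼 ≥ 0, 𝛽 ≥ 0 with 𝛼 = 1 − 𝛽; then, let us evaluate the function in a generic intermediate point, as presented below⁴:

$$\begin{aligned} -\mathcal{W}(\alpha\lambda_1 + \beta\lambda_2, \alpha\mu_1 + \beta\mu_2) &= -\inf_x \left\{ \mathcal{L}(x, \alpha\lambda_1 + \beta\lambda_2, \alpha\mu_1 + \beta\mu_2) \right\} \\ &= \sup_x \left\{ -\mathcal{L}(x, \alpha\lambda_1 + \beta\lambda_2, \alpha\mu_1 + \beta\mu_2) \right\} \\ &\leq \alpha \sup_x \left\{ -\mathcal{L}(x, \lambda_1, \mu_1) \right\} + \beta \sup_x \left\{ -\mathcal{L}(x, \lambda_2, \mu_2) \right\} \\ &= -\alpha \mathcal{W}(\lambda_1, \mu_1) - \beta \mathcal{W}(\lambda_2, \mu_2) \end{aligned} \tag{3.75}$$

The inequality holds because ℒ is affine in (𝜆, 𝜇) for a fixed 𝑥, and the supremum of a sum is not greater than the sum of the suprema. Therefore, 𝒲 is concave, and max 𝒲 = − min −𝒲 is a convex problem which is, sometimes, easier to solve than the primal problem.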
Example 3.20. Let us consider the following optimization problem:

\[
\begin{aligned}
\min\ & f(x) = x^4 - 5x^2 - 3x \\
& x \ge 1
\end{aligned}
\tag{3.76}
\]

The objective function of this problem is the polynomial plotted in Figure 3.11a which is, evidently, non-convex. However, the dual function is concave and defines a lower bound of the problem, as shown in Figure 3.11b.
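The dual function of this example can be approximated numerically. The sketch below is only an illustration: it replaces the infimum over 𝑥 by a minimum over a finite grid (the grid limits and resolution are assumptions) and writes the constraint as 1 − 𝑥 ≤ 0.

```python
import numpy as np

def f(x):
    return x**4 - 5 * x**2 - 3 * x

# Grid approximation (assumption of this sketch): the infimum over x is taken
# over a finite grid that is wide enough to contain the minimizers.
x = np.linspace(-4.0, 4.0, 4001)
mu = np.linspace(0.0, 10.0, 201)

# Lagrangian L(x, mu) = f(x) + mu * (1 - x), one row per value of mu
L = f(x)[None, :] + mu[:, None] * (1.0 - x)[None, :]
W = L.min(axis=1)                      # dual function evaluated on the mu grid

primal_best = f(x[x >= 1.0]).min()     # best primal value among feasible grid points
dual_best = W.max()                    # best lower bound found on the mu grid

print("best primal value:", primal_best)
print("best dual value:  ", dual_best)
```

On the grid, 𝒲 is a pointwise minimum of functions that are affine in 𝜇, so its second differences are non-positive, which is the discrete counterpart of concavity.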

4 Recall that inf (𝑊) = − sup(−𝑊).


Figure 3.11 Example of a primal non-convex problem and its corresponding dual
function.

Weak duality allows finding a lower bound of the primal problem by maximizing the dual. However, very often, the optimal value of the dual is equal to the optimal value of the primal. In that case, we say the problem fulfills the strong duality conditions. There is a simple criterion for strong duality in convex optimization problems. Informally, the primary condition to guarantee strong duality is that the feasible region must have a non-empty relative interior. This criterion is known as Slater's condition [17].
Let us consider the feasible set Ω, of a primal convex optimization problem,
namely:

Ω = {𝑥 ∈ ℝ𝑛 ∶ 𝐴𝑥 − 𝑏 = 0, 𝑔(𝑥) ≤ 0} (3.77)

we define a new set known as the relative interior relint(Ω) as follows:

relint(Ω) = {𝑥 ∈ ℝ𝑛 ∶ 𝐴𝑥 − 𝑏 = 0, 𝑔(𝑥) < 0} (3.78)

Notice that the only difference between Ω and relint(Ω) is in the inequality constraint. Slater's condition states that relint(Ω) ≠ ∅, that is to say, there is at least one point that fulfills the equality constraints and fulfills the inequality constraints strictly. Let us analyze these concepts in the following example:
Example 3.21. Consider the following optimization problem

\[
\begin{aligned}
\min\ & x^2 \\
& x \ge 3
\end{aligned}
\tag{3.79}
\]

The Lagrangian function is

ℒ(𝑥, 𝜇) = 𝑥2 + 𝜇(3 − 𝑥) (3.80)

There is no 𝜆 because the model has only inequality constraints. The dual
function is calculated taking the minimum of this function (i.e., when
2𝑥 − 𝜇 = 0):


Figure 3.12 Comparison between the primal and the dual problems.

\[
\mathcal{W}(\mu) = \inf_{x} \mathcal{L}(x, \mu) = 3\mu - \mu^2/4 \tag{3.81}
\]

Now, we can define a new optimization problem as max 𝒲(𝜇) with 𝜇 ≥ 0.

Both the primal and the dual problems are shown in Figure 3.12; as expected, the value of 𝒲 at any feasible point of the dual is not above the value of 𝑓 at any feasible point of the primal (feasible points in the primal are those 𝑥 ≥ 3 and feasible points in the dual are those 𝜇 ≥ 0). In addition, there are feasible points that strictly fulfill the inequality constraint (for example 𝑥 = 4); therefore, we can guarantee that min 𝑓(𝑥) = max 𝒲(𝜇) (in this case, the optimum is 𝑓(𝑥̃) = 9, attained at 𝑥̃ = 3 and 𝜇̃ = 6).

Dual variables represent the change in the objective function for a change in
the constraint. Let us consider an optimization problem as

min 𝑓(𝑥)
𝐴𝑥 = 𝑏 (3.82)
𝑔(𝑥) ≤ 𝑡

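Let us suppose we know the optimum 𝑥̃ and the corresponding dual variables 𝜆̃ and 𝜇̃; then we have that:

\[
\tilde{\lambda} = \frac{\Delta f(\tilde{x})}{\Delta b} \tag{3.83}
\]

\[
\tilde{\mu} = \frac{\Delta f(\tilde{x})}{\Delta t} \tag{3.84}
\]

These relations hold for small changes in 𝑏 and 𝑡; the sign of each ratio depends on how the corresponding constraint is written in the Lagrangian.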
This interpretation of the dual variables is vital in common practical prob-
lems. These variables can be used to define new investments, as explained in
Chapter 7.
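This interpretation can be verified on Example 3.21 with a finite difference. The sketch below assumes a perturbed version of that problem, min 𝑥² subject to 𝑥 ≥ 3 + 𝑡 (the perturbation 𝑡 and the helper `optimal_value` are introduced here only for illustration), and compares the change of the optimal value with the optimal dual variable.

```python
# Perturbed version of Example 3.21 (assumption of this sketch):
#   min x**2  subject to  x >= 3 + t
def optimal_value(t):
    x_opt = max(3.0 + t, 0.0)      # the unconstrained minimizer is x = 0
    return x_opt**2

# Dual of the unperturbed problem: W(mu) = 3*mu - mu**2/4, maximized at mu = 6
mu_opt = 6.0
dual_opt = 3 * mu_opt - mu_opt**2 / 4

# Finite-difference estimate of the change of the optimum per unit change of t
dt = 1e-6
sensitivity = (optimal_value(dt) - optimal_value(0.0)) / dt

print("primal optimum:", optimal_value(0.0))
print("dual optimum:  ", dual_opt)
print("finite-difference sensitivity:", sensitivity, " dual variable:", mu_opt)
```

With the constraint written as 𝑥 ≥ 3 + 𝑡, tightening the constraint increases the cost at a rate given by the dual variable; writing the perturbation on the other side of the inequality would flip the sign.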

3.6 Further readings

A summary of the main properties of convex optimization problems is pre-
sented in Table 3.1. A problem is convex if the objective function and inequality


constraints are convex, and the equality constraints are affine. Once we have
checked these conditions, we can conclude that the optimum is global. Therein
lies the importance of identifying convex functions in an optimization model.
In this chapter, we presented common examples of convex functions. More
exotic functions can be studied in [17]. Other types of convex problems, such
as second-order cone (SOC) and semidefinite programming (SDP), will be studied
in Chapter 5. Uniqueness can be guaranteed by using the concepts of strong
and strict convexity; usually, it is simpler to identify a strongly convex func-
tion. More details about these types of functions can be studied in [18]. The
concept of duality can be studied in more detail in [19]. We only presented
the most basic concept of duality and some conditions to interpret results. The
interpretation of the dual variables is critical in many practical problems. Other
concepts such as the Karush–Kuhn–Tucker conditions can be studied in [20].
Finally, it is recommended to review general concepts of linear algebra; two
excellent references are [21] and [16]. We avoided solving optimization prob-
lems by hand since our objective is to solve large optimization problems. All
repetitive calculations must be made by the computer, as presented in the next
chapter.

Table 3.1 Summary of the main properties of convex optimization problems.

| Definition | Consequence |
|---|---|
| Convex problem: min 𝑓(𝑥) with 𝑓 convex; 𝐴𝑥 = 𝑏 (affine); 𝑔(𝑥) ≤ 0 with 𝑔 convex | global optimum |
| Strictly convex function: 𝑓(𝛼𝑥 + 𝛽𝑦) < 𝛼𝑓(𝑥) + 𝛽𝑓(𝑦) | unique solution |
| Strongly convex function: 𝑓(𝑥) − 𝜇‖𝑥‖² also convex | strictly convex |
| Dual function: 𝒲(𝜇, 𝜆) = inf { 𝑓(𝑥) + 𝜆⊤(𝐴𝑥 − 𝑏) + 𝜇⊤𝑔(𝑥) } | concave |
| Dual problem: max 𝒲(𝜇, 𝜆) with 𝜇 ≥ 0 | weak duality: max 𝒲 ≤ min 𝑓 |
| Slater conditions: there exists at least one 𝑥 such that 𝐴𝑥 − 𝑏 = 0 and 𝑔(𝑥) < 0 | strong duality: max 𝒲 = min 𝑓 |
| Dual variables: 𝜆, 𝜇 | change of the objective function for a change in the constraint |


3.7 Exercises
1. Show that the following set is convex:

\[
\Omega = \left\{ (x, y) \in \mathbb{R}^2 : x \ge 0,\ y \ge 0,\ xy \ge 1 \right\} \tag{3.85}
\]

2. Show that the sum of two convex functions is also a convex function.
3. Show that the composition of a convex function and an affine function results in a convex function.
4. Demonstrate that a function is convex if and only if its epigraph is convex.
5. Identify which of these functions are convex:

\[
f(x) = x^\top A x, \quad \text{with } A \succ 0 \tag{3.86}
\]
\[
f(x) = \exp\left( (Bx + c)^\top A (Bx + c) \right), \quad \text{with } A \succeq 0 \tag{3.87}
\]
\[
f(x) = -\ln\left( (Bx + c)^\top A (Bx + c) \right), \quad \text{with } A \succeq 0 \tag{3.88}
\]

6. Show the following relation using Jensen's inequality (use the fact that − ln(𝑥) is convex and monotone).

\[
\prod_{k=1}^{n} x_k^{1/n} \le \sum_{k=1}^{n} \frac{1}{n} x_k \tag{3.89}
\]

7. Show the Lagrangian and the dual function associated to the following optimization problem:

\[
\begin{aligned}
\min\ & f(x) = (x - 1)^2 \\
& x^2 \le 0
\end{aligned}
\tag{3.90}
\]

Show the feasible space of both the dual and the primal problems. Analyze the conditions of weak and strong duality.
8. Determine the dual problem associated to the quadratically constrained quadratic programme presented below:

\[
\begin{aligned}
\min\ & \frac{1}{2} x^\top H x + b x \\
& \frac{1}{2} x^\top A x + c x \le d
\end{aligned}
\tag{3.91}
\]

9. Consider the following optimization problem

\[
\begin{aligned}
\min\ & x^2 + 1 \\
& (x - 2)(x - 4) \le 0
\end{aligned}
\tag{3.92}
\]

Show the set of feasible solutions and find the optimum. Plot the objective function vs 𝑥; on the same graph, plot the Lagrangian function ℒ(𝑥, 𝜇) for different values of 𝜇 (for example 𝜇 = 1, 𝜇 = 5, and 𝜇 = 10). Formulate the dual problem and solve it. Analyze the conditions of strong and weak duality.
10. Solve the following problem using both the primal and the dual formulations (use graphs of the function and the feasible set in order to find the optimum)

\[
\begin{aligned}
\min\ & x + y \\
& x^2 + y^2 \le 1
\end{aligned}
\tag{3.93}
\]


4 Convex Programming in Python

Learning outcomes

By the end of this chapter, the student will be able to:

● Identify linear and quadratic optimization problems.
● Solve linear and quadratic problems using Python.

4.1 Python for convex optimization

In the previous chapter, we learned that local optima are guaranteed to be global in convex optimization problems. Other theoretical properties, such as the uniqueness of the solution and strong duality, were also assured under well-defined conditions. These properties are intrinsic to the model, regardless of the solution algorithm; hence, any solver should return the same optimal value, up to numerical tolerance.
We use Python and the module CvxPy as a modeling language for convex optimization problems [10]. This module checks the convexity of the problem and calls a solver that returns the solution of the model (see Figure 4.1); then, results can be analyzed in Python using other modules, such as NumPy for linear algebra operations and Matplotlib for plotting the results. In this way, Python is transformed into a complete modeling language that permits writing and analyzing the optimization problem in a systematic form. However, we must carefully define the model to ensure it is convex, using a set of functions and rules that construct the model and guarantee convexity. This philosophy for solving optimization problems is known as disciplined convex programming [22].


Figure 4.1 Using Python for mathematical optimization.

Python and CvxPy allow us to concentrate on the model without worrying about the solution algorithm since, as mentioned above, the model is well-behaved when it is convex. The following sections present the implementation of linear and quadratic programming models to solve general optimization problems. After that, a brief review of the algorithms is presented in order to understand what is inside the box.

4.2 Linear programming

Reality is non-linear in nature. However, in many cases, we can formulate lin-
ear programming approximations that simplify complicated problems. In a lin-
ear programming problem, both the objective function and the constraints are
affine1 , generating a polytope as the feasible set. The canonical representation
of a linear programming problem is presented below:

\[
\begin{aligned}
\min\ & c^\top x \\
& Ax = b \\
& x \ge 0
\end{aligned}
\tag{4.1}
\]

where 𝑥 ∈ ℝ^𝑛 are the decision variables; 𝑐 ∈ ℝ^𝑛 is a vector that usually represents costs; 𝑏 ∈ ℝ^𝑚 is a vector that represents resources; and 𝐴 ∈ ℝ^(𝑚×𝑛) is a matrix that defines physical constraints of the process that is being optimized. Any linear programming problem can be transformed into the canonical representation. However, this is not the only representation and, sometimes, it is not the most convenient either. Practical applications will come with their own representation.

1 Remember that an affine function is of the form 𝑓(𝑥) = 𝑎𝑥 + 𝑏.


Example 4.1. Let us transform the following linear programming problem, to

the canonical form:

max 3𝑥 − 2𝑦
𝑥+𝑦 ≤5 (4.2)
𝑥, 𝑦 ≥ 0

First, we multiply the objective function by −1 to obtain a minimization problem; then, we define a slack variable 𝑧 that transforms the inequality into an equality; finally, we organize the model as follows:

min − 3𝑥 + 2𝑦
𝑥+𝑦+𝑧 =5 (4.3)
𝑥, 𝑦, 𝑧 ≥ 0

In this case, the decision vector is (𝑥, 𝑦, 𝑧)⊤, with 𝑐 = (−3, 2, 0)⊤ , 𝐴 = (1, 1, 1), and 𝑏 = 5.

The set of feasible solutions of a linear programming problem is a geometric object known as a polytope². The optimum of a linear programming problem, if it exists, is attained at a vertex. Therefore, optimization algorithms such as the simplex method search among the vertices until they reach the optimum. For problems in ℝ² and ℝ³, we can use the gradient direction to identify the optimal direction and find the optimal solution graphically. The next example shows the methodology.

Example 4.2. Let us consider the following linear programming problem:

\[
\begin{aligned}
\max\ & 3x_0 + 3x_1 \\
& x_0 + 2x_1 \le 4 \\
& 4x_0 + 2x_1 \le 12 \\
& -x_0 + x_1 \le 1 \\
& x_0 \ge 0 \\
& x_1 \ge 0
\end{aligned}
\tag{4.4}
\]

The set of feasible solutions for this problem is shown in Figure 4.2. The direc-
tion of the gradient of the objective function is represented as ∇obj; using this
direction, we can easily identify the optimum. As expected, the optimum is in
a vertex, in this case in 𝑑 = (8∕3, 2∕3).
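Since the optimum is attained at a vertex, a two-variable problem like this one can also be solved by enumerating the vertices. The sketch below is a brute-force illustration and not the simplex method: it intersects every pair of constraint lines, keeps the feasible intersections, and selects the best one. The matrix names `G` and `h` are introduced here only for the example.

```python
import itertools
import numpy as np

# Constraints of Example 4.2 written as G x <= h (including x0 >= 0 and x1 >= 0)
G = np.array([[1.0, 2.0], [4.0, 2.0], [-1.0, 1.0], [-1.0, 0.0], [0.0, -1.0]])
h = np.array([4.0, 12.0, 1.0, 0.0, 0.0])
c = np.array([3.0, 3.0])              # objective to maximize

best_x, best_val = None, -np.inf
for i, j in itertools.combinations(range(len(h)), 2):
    M = G[[i, j]]
    if abs(np.linalg.det(M)) < 1e-12:
        continue                       # parallel lines do not define a vertex
    v = np.linalg.solve(M, h[[i, j]])  # intersection of constraints i and j
    if np.all(G @ v <= h + 1e-9) and c @ v > best_val:
        best_x, best_val = v, c @ v    # feasible vertex with a better objective

print("optimal vertex:", best_x, " objective:", best_val)
```

The number of candidate vertices grows combinatorially with the number of constraints, which is why this enumeration is only practical for very small problems.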

2 We already studied this object in Example 3.4 (Chapter 3) and concluded that it is convex.


Figure 4.2 Set of feasible solutions for the linear programming problem in
Example 4.2.

Practical linear programming problems for power systems operation could have thousands of decision variables and constraints. Therefore, the graphical method is not enough; these problems must be solved using a computer. In the following examples, we show how to use Python and CvxPy to solve linear programming problems.

Example 4.3. Let us solve the linear programming problem presented in

Example 4.1 using Python. The script that solves this problem is presented
below:

import numpy as np
import cvxpy as cvx
# decision variables
x = cvx.Variable()
y = cvx.Variable()
# objective function and constraints of Equation (4.2)
obj = cvx.Maximize(3*x-2*y)
res = [x+y <= 5, x>=0, y>=0]
Model = cvx.Problem(obj,res)
Model.solve()
# results rounded to two decimal places
print(np.round(obj.value,2),np.round(x.value,2),
      np.round(y.value,2))

The code is intuitive; it starts by defining decision variables with the command cvx.Variable; then, the objective function and the set of constraints are defined; the constraints are stored in a list named res; after that, the model is solved, and results are printed, rounded to two decimal places. Notice that we did not need to change the problem to the canonical form; instead, the problem was represented as it was stated.


We can select different solvers and display the iterations of each one, as shown
below (each solver must be installed in the Python environment). The complete list of solvers is available in [23].

Model.solve(solver=cvx.OSQP,verbose=True)
print(obj.value,x.value,y.value)
Model.solve(solver=cvx.ECOS,verbose=True)
print(obj.value,x.value,y.value)
Model.solve(solver=cvx.SCS,verbose=True)
print(obj.value,x.value,y.value)

Example 4.4. Let us solve the linear programming problem presented in

Example 4.2.

import cvxpy as cvx

# vector of two non-negative decision variables
x = cvx.Variable(2,nonneg=True)
obj = cvx.Maximize(3*x[0]+3*x[1])
res = [ x[0] + 2*x[1] <= 4,
        4*x[0] + 2*x[1] <= 12,
        -x[0] + x[1] <= 1]
Model = cvx.Problem(obj,res)
Model.solve(solver=cvx.SCS)
print('objective:',obj.value)
print('decisions:',x.value)

The script is quite similar to the previous example. The only difference is in the definition of the variable 𝑥, which is a vector in ℝ² with non-negative entries. Therefore, we defined the size of the variable and a non-negativity condition (i.e., nonneg=True).

Example 4.5. The transportation problem is a special type of linear programming problem, which consists of minimizing the cost of transporting a commodity from a set of sources to a set of destinations (this commodity can be, of course, electric power). Each source has a limited supply, while each destination has a demand to be satisfied. Decision variables are represented in a matrix 𝑥, where 𝑥𝑖𝑗 represents the amount of product transported from 𝑖 to 𝑗. Each route has a unit cost 𝑐𝑖𝑗, the amount of product available in each source is represented by 𝑠𝑖, and the amount of product demanded in each destination is 𝑑𝑗. The problem consists of minimizing total costs, subject to the balance of each source and destination. A general mathematical model is presented below:


Figure 4.3 Oriented graph for a transportation problem with four sources (s) and
three destinations (d). The numbers on the arrows correspond to the unit costs for
each route.

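\[
\begin{aligned}
\min\ & \sum_{i=1}^{m} \sum_{j=1}^{n} c_{ij} x_{ij} \\
& \sum_{j=1}^{n} x_{ij} = s_i \\
& \sum_{i=1}^{m} x_{ij} = d_j \\
& x_{ij} \ge 0
\end{aligned}
\tag{4.5}
\]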

Let us consider a problem with 𝑚 = 4 sources and 𝑛 = 3 destinations, as shown in Figure 4.3. All the parameters required to solve the problem are depicted in the figure.
The implementation in Python for this transportation problem is given below. The reader is invited to run the script and analyze the solution.

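import cvxpy as cvx
import numpy as np
m = 4   # number of sources
n = 3   # number of destinations
# unit costs stored as an n x m array: c[j,i] is the cost of the route from i to j
c = np.array([[3,3,5,8],[7,2,5,8],[4,6,2,3]])
s = np.array([30,20,20,30])   # supply of each source
d = np.array([35,42,23])      # demand of each destination
x = cvx.Variable((m,n),nonneg=True)
# trace(c@x) adds c[j,i]*x[i,j] over all the routes
obj = cvx.Minimize(cvx.trace(c@x))
res = [x@np.ones(n) == s,     # balance of each source
       x.T@np.ones(m) == d]   # balance of each destination
Model = cvx.Problem(obj,res)
Model.solve()
print(np.round(x.value,2))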

Again, we avoid constraints in the form x ≥ 0 by defining the variable as non-negative, with the modifier nonneg=True. Notice that equality constraints are defined by ==.

4.3 Quadratic forms

It is common to find optimization problems in power systems operations that
have quadratic objective functions. For example, the classic economic dispatch
of thermal units is usually a quadratic problem. Therefore, it is essential to
understand some properties of these types of functions.
A quadratic form 𝑞 ∶ ℝ𝑛 → ℝ is a multivariate polynomial with variables
𝑥0 , 𝑥1 , ...𝑥𝑛−1 , where all the terms are, at most, of order 2. Any quadratic form
can be written as given in Equation (4.6):

𝑞(𝑥) = 𝑥 ⊤ 𝐴𝑥 + 𝑏⊤ 𝑥 + 𝑐 (4.6)

where 𝐴 is a square matrix, 𝑏 is a column vector, and 𝑐 is a constant.

Example 4.6. Let us consider the following multivariate polynomial:

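\[
q(x_0, x_1) = 5x_0^2 + 2x_0 x_1 + 3x_1^2 + 8x_0 + 4x_1 + 15 \tag{4.7}
\]

This polynomial is a quadratic form because the maximum exponent is 2. Therefore, it can be written as Equation (4.6) with the following structure:

\[
q(x_0, x_1) = \begin{pmatrix} x_0 \\ x_1 \end{pmatrix}^\top \begin{pmatrix} 5 & 2 \\ 0 & 3 \end{pmatrix} \begin{pmatrix} x_0 \\ x_1 \end{pmatrix} + \begin{pmatrix} 8 \\ 4 \end{pmatrix}^\top \begin{pmatrix} x_0 \\ x_1 \end{pmatrix} + 15 \tag{4.8}
\]

Notice that this representation is not unique. Another possible representation is presented below:

\[
q(x_0, x_1) = \begin{pmatrix} x_0 \\ x_1 \end{pmatrix}^\top \begin{pmatrix} 5 & 10 \\ -8 & 3 \end{pmatrix} \begin{pmatrix} x_0 \\ x_1 \end{pmatrix} + \begin{pmatrix} 8 \\ 4 \end{pmatrix}^\top \begin{pmatrix} x_0 \\ x_1 \end{pmatrix} + 15 \tag{4.9}
\]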

It is important to note that not all quadratic forms are convex. In this case, the
form is convex since it describes a paraboloid, as may be easily demonstrated
by plotting the function.
The main properties of a quadratic form are determined by the matrix 𝐴.
Two interesting cases are when this matrix is symmetric and when it is skew-
symmetric. A matrix 𝐴 is symmetric when 𝐴 = 𝐴⊤ and skew-symmetric when
𝐴 = −𝐴⊤ . Let us analyze the latter case.


Consider a quadratic form 𝑞(𝑥) = 𝑥⊤ 𝑁𝑥, where 𝑁 is skew-symmetric; in that

case, we have the following3 :

\[
\begin{aligned}
q(x) &= q(x)^\top && (4.10) \\
&= (x^\top N x)^\top && (4.11) \\
&= (N x)^\top x && (4.12) \\
&= x^\top N^\top x && (4.13) \\
&= -x^\top N x = -q(x) && (4.14)
\end{aligned}
\]

Therefore, 𝑞(𝑥) = −𝑞(𝑥) for any value of 𝑥 and hence, 𝑞(𝑥) = 0. In conclusion,
a quadratic form 𝑞(𝑥) = 𝑥 ⊤ 𝑁𝑥 is zero if 𝑁 is skew-symmetric.
Any square matrix 𝐴 can be written in terms of a symmetric and a skew-
symmetric matrix, as follows:

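\[
\begin{aligned}
A &= \frac{1}{2}(A + A) + \frac{1}{2}(A^\top - A^\top) && (4.15) \\
&= \frac{1}{2}(A + A^\top) + \frac{1}{2}(A - A^\top) && (4.16) \\
&= \frac{1}{2} M + \frac{1}{2} N && (4.17)
\end{aligned}
\]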

where 𝑀 = 𝐴 + 𝐴⊤ and 𝑁 = 𝐴 − 𝐴⊤ . Matrix 𝑀 is symmetric (i.e., 𝑀 = 𝑀 ⊤ )

and 𝑁 is skew-symmetric (i.e., 𝑁 = −𝑁 ⊤ ). Therefore, we have the following:

\[
\begin{aligned}
q(x) &= x^\top A x && (4.18) \\
&= \frac{1}{2} x^\top M x + \frac{1}{2} x^\top N x && (4.19) \\
&= \frac{1}{2} x^\top M x && (4.20)
\end{aligned}
\]

In other words, any quadratic form 𝑞(𝑥) can be written in terms of a symmetric
matrix.
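The decomposition can be checked numerically. The sketch below uses the non-symmetric matrix of Equation (4.9) and a few random test points (the seed and the number of points are arbitrary choices) to confirm that the skew-symmetric part contributes nothing to the quadratic form.

```python
import numpy as np

A = np.array([[5.0, 10.0], [-8.0, 3.0]])    # matrix of Equation (4.9)
M = A + A.T                                  # symmetric matrix of Equation (4.17)
N = A - A.T                                  # skew-symmetric matrix of Equation (4.17)

rng = np.random.default_rng(0)               # arbitrary test points
X = rng.normal(size=(5, 2))

# Evaluate x^T S x for every row x of X
quad = lambda S: np.einsum("ij,jk,ik->i", X, S, X)
q_full = quad(A)
q_sym = 0.5 * quad(M)
q_skew = quad(N)

print("x^T A x      :", q_full)
print("(1/2) x^T M x:", q_sym)
print("x^T N x      :", q_skew)
```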

Example 4.7. The quadratic form given in Equation (4.7) can be written in
terms of a symmetric matrix as given below:

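\[
q(x_0, x_1) = \frac{1}{2} \begin{pmatrix} x_0 \\ x_1 \end{pmatrix}^\top \begin{pmatrix} 10 & 2 \\ 2 & 6 \end{pmatrix} \begin{pmatrix} x_0 \\ x_1 \end{pmatrix} + \begin{pmatrix} 8 \\ 4 \end{pmatrix}^\top \begin{pmatrix} x_0 \\ x_1 \end{pmatrix} + 15 \tag{4.21}
\]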

3 Notice that 𝑞(𝑥) is a scalar, not a matrix, and hence 𝑞 = 𝑞 ⊤ .


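where the symmetric matrix is obtained from the matrix of Equation (4.9) as follows:

\[
\frac{1}{2} \left( \begin{pmatrix} 5 & 10 \\ -8 & 3 \end{pmatrix} + \begin{pmatrix} 5 & -8 \\ 10 & 3 \end{pmatrix} \right) = \frac{1}{2} \begin{pmatrix} 10 & 2 \\ 2 & 6 \end{pmatrix} \tag{4.22}
\]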

Any quadratic form 𝑞(𝑥) given by Equation (4.6) is continuous and differentiable. Its gradient is given by Equation (4.23):

∇𝑞 = (𝐴 + 𝐴⊤ )𝑥 + 𝑏 (4.23)

If 𝐴 is symmetric, then ∇𝑞 = 2𝐴𝑥 + 𝑏, and its Hessian is simply 2𝐴.
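The gradient formula of Equation (4.23) can be compared against a finite-difference approximation. The sketch below uses the data of Equation (4.8), whose matrix is not symmetric; the test point and the step size are arbitrary choices.

```python
import numpy as np

A = np.array([[5.0, 2.0], [0.0, 3.0]])    # matrix of Equation (4.8), not symmetric
b = np.array([8.0, 4.0])
c = 15.0

def q(x):
    return x @ A @ x + b @ x + c

def grad_q(x):
    return (A + A.T) @ x + b               # Equation (4.23)

x0 = np.array([0.7, -1.2])                # arbitrary test point
eps = 1e-6
# Central differences along each coordinate direction
numeric = np.array([(q(x0 + eps * e) - q(x0 - eps * e)) / (2 * eps)
                    for e in np.eye(2)])

print("analytic gradient:", grad_q(x0))
print("numeric gradient: ", numeric)
```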

4.4 Semideﬁnite matrices

There is a particular type of quadratic forms that are never negative. The matrices that represent these forms are known as positive semidefinite matrices. Thus, we say a symmetric matrix 𝐴 ∈ ℝ^(𝑛×𝑛) is positive semidefinite if 𝑞(𝑢) = 𝑢⊤ 𝐴𝑢 ≥ 0 for any 𝑢 ∈ ℝ^𝑛. These types of matrices are represented as 𝐴 ⪰ 0 (notice that the symbol is different from ≥). Moreover, we say that the matrix is positive definite (𝐴 ≻ 0) if 𝑞(𝑢) > 0 for any 𝑢 ≠ 0. Likewise, we say the matrix is negative semidefinite or negative definite if (−𝐴) ⪰ 0 or (−𝐴) ≻ 0, respectively.
A symmetric and positive semidefinite matrix has the following properties:

● Its eigenvalues are all non-negative (and all positive if the matrix is positive definite).

● The matrix 𝐴 can be factorized as 𝐴 = 𝐶𝐶 ⊤ where 𝐶 is a triangular matrix.
This is called Cholesky factorization, and the matrix 𝐶 is usually represented
as 𝐴^(1∕2).
● If 𝐴 ≻ 0 then 𝐴⁻¹ ≻ 0
● If 𝐴 ≻ 0 and 𝐵 ≻ 0 then 𝐴 + 𝐵 ≻ 0
● However, if 𝐴 ≻ 0 and 𝐵 ≻ 0 we cannot guarantee that 𝐴𝐵 is positive definite, since 𝐴𝐵 is not symmetric in general.
● If 𝐴 ≻ 0 and 𝐵 ≻ 0 then 𝐴◦𝐵 ≻ 0 where ◦ represents the Hadamard product
(i.e., the point-wise product).
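These properties can be explored numerically. In the sketch below, the matrix `B` is a hypothetical example chosen only for illustration, and `min_eig` is a small helper introduced here; it relies on np.linalg.eigvalsh, which assumes a symmetric argument.

```python
import numpy as np

A = np.array([[2.0, 1.0], [1.0, 1.0]])
B = np.array([[1.0, -3.0], [-3.0, 10.0]])   # hypothetical matrix, chosen for illustration

def min_eig(S):
    """Smallest eigenvalue of a symmetric matrix."""
    return np.linalg.eigvalsh(S).min()

C = np.linalg.cholesky(A)                   # lower triangular factor, A = C @ C.T
print("A and B positive definite:", min_eig(A) > 0, min_eig(B) > 0)
print("inverse of A:             ", min_eig(np.linalg.inv(A)) > 0)
print("sum A + B:                ", min_eig(A + B) > 0)
print("Hadamard product A * B:   ", min_eig(A * B) > 0)

P = A @ B                                   # ordinary matrix product
u = np.array([1.0, 0.0])
print("A @ B is symmetric:", np.allclose(P, P.T))
print("u^T (A B) u =", u @ P @ u)           # may be negative although A, B > 0
```

For this particular pair, the product 𝐴𝐵 is not symmetric and its quadratic form takes a negative value, which illustrates why nothing can be concluded about 𝐴𝐵 in general.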

Example 4.8. The following matrix is positive definite

\[
A = \begin{pmatrix} 2 & 1 \\ 1 & 1 \end{pmatrix} \tag{4.24}
\]

since its eigenvalues are both positive, i.e., 𝜆 = {0.3819, 2.61803}, and it has the following Cholesky factorization⁴:

\[
A = \begin{pmatrix} 2 & 1 \\ 1 & 1 \end{pmatrix} = \begin{pmatrix} \sqrt{2} & \sqrt{2}/2 \\ 0 & \sqrt{2}/2 \end{pmatrix}^\top \begin{pmatrix} \sqrt{2} & \sqrt{2}/2 \\ 0 & \sqrt{2}/2 \end{pmatrix} \tag{4.25}
\]

This matrix defines the following quadratic form, which is evidently positive for any 𝑥 ≠ 0:

\[
\begin{aligned}
q(x) = x^\top A x &= 2x_0^2 + 2x_0 x_1 + x_1^2 && (4.26) \\
&= (x_1 + x_0)^2 + x_0^2 && (4.27)
\end{aligned}
\]

Example 4.9. We can define a function in Python that identifies positive definite and positive semidefinite matrices using the eigenvalues. The code for this function is presented below:

import numpy as np
def IsSD(M, tol=1e-9):
    # M is assumed to be symmetric; tol is a numerical tolerance for zero
    Lmin = min(np.linalg.eigvalsh(M))
    if abs(Lmin) <= tol:
        print('Positive semidefinite')
    elif Lmin > 0:
        print('Positive definite')
    else:
        print('It is not positive semidefinite')
# usage
A = [[2,1],[1,1]]
IsSD(A)

Example 4.10. A quadratic function with a positive semidefinite matrix is

convex. Let us consider the following two examples:
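\[
q_1(x, y) = \frac{1}{2} \begin{pmatrix} x \\ y \end{pmatrix}^\top \begin{pmatrix} 1 & 1/4 \\ 1/4 & 1 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} \tag{4.28}
\]

\[
q_2(x, y) = \frac{1}{2} \begin{pmatrix} x \\ y \end{pmatrix}^\top \begin{pmatrix} 1 & 1/4 \\ 1/4 & -1 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} \tag{4.29}
\]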

4 The command in Python for Cholesky factorization is np.linalg.cholesky(A) and the command for calculating the eigenvalues is np.linalg.eigvals(A), where np comes from the NumPy module.


Figure 4.4 Example of two quadratic functions, 𝑞1 (left) is convex whereas 𝑞2 (right)
is not.

In this case, the eigenvalues of the matrix associated to 𝑞1 are 𝜆 = {1.25, 0.75}
whereas the eigenvalues of the matrix associated to 𝑞2 are 𝜆 = {1.03, −1.03}.
Clearly, 𝑞1 is convex whereas 𝑞2 is not (see Figure 4.4).

4.5 Solving quadratic programming problems

A quadratic form 𝑞(𝑥) is convex if its symmetric matrix is positive semidefinite⁵. In that case, we can use CvxPy for solving problems that involve quadratic forms in the objective function or in the inequality constraints. The first case is known as quadratic programming, and the second case is known as quadratic programming with quadratic constraints. Let us consider the second case, which is more general, namely:

min 𝑞0 (𝑥)
𝑞1 (𝑥) ≤ 0 (4.30)

where both 𝑞0 and 𝑞1 are convex quadratic forms. It is important to remark that
the constraint must be an inequality of the type ≤ 0, otherwise the problem is
non-convex. Consider the following optimization problem:

min 𝑞0 (𝑥)
𝑞1 (𝑥) ≥ 0 (4.31)
𝑞2 (𝑥) = 0

5 Besides, it is strictly convex if it is positive definite.


This problem is not convex, even if 𝑞1 and 𝑞2 are convex quadratic forms, because equality constraints must be affine and inequality constraints must be of the type ≤ (see Example 3.1 in Chapter 3). In what follows, we present a series of simple examples to familiarize ourselves with the functions available in CvxPy for solving quadratic problems.
Example 4.11. An unconstrained convex quadratic optimization problem is trivial and can be solved directly. Let us consider the following problem:

\[
\min\ q(x) = \frac{1}{2} x^\top H x + b^\top x + c \tag{4.32}
\]

where 𝐻 = 𝐻 ⊤ ≻ 0 (symmetric and positive definite); then we have the following:

\[
\begin{aligned}
\nabla q(x) = Hx + b &= 0 && (4.33) \\
x &= -H^{-1} b && (4.34)
\end{aligned}
\]

We require that 𝐻 is positive definite in order to guarantee that the inverse exists. In case the matrix is only positive semidefinite, we can use the Moore–Penrose inverse to find a solution for 𝑥, provided that a minimizer exists [16].
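Both cases can be reproduced with NumPy. The matrices and vectors in the sketch below are hypothetical data chosen for illustration; in the semidefinite case, the vector `b_sd` is deliberately taken in the range of `H_sd` so that a minimizer exists.

```python
import numpy as np

# Positive definite case (hypothetical data)
H = np.array([[4.0, 1.0], [1.0, 3.0]])
b = np.array([-1.0, 2.0])
x_opt = np.linalg.solve(H, -b)        # solves H x = -b without forming the inverse
gradient = H @ x_opt + b              # should vanish at the optimum

# Positive semidefinite case (hypothetical data, b_sd lies in the range of H_sd)
H_sd = np.array([[1.0, 1.0], [1.0, 1.0]])
b_sd = np.array([-2.0, -2.0])
x_sd = -np.linalg.pinv(H_sd) @ b_sd   # Moore-Penrose inverse
gradient_sd = H_sd @ x_sd + b_sd

print("definite case:    ", x_opt, gradient)
print("semidefinite case:", x_sd, gradient_sd)
```

In the semidefinite case the minimizer is not unique; the pseudo-inverse returns the solution with minimum Euclidean norm.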
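Example 4.12. Let us consider the following optimization problem:

\[
\begin{aligned}
\min\ & 5x_0^2 + 2x_0 x_1 + 3x_1^2 + 7x_0 + x_1 + 10 \\
& x_0 + x_1 = 1 \\
& x_0, x_1 \ge 0
\end{aligned}
\tag{4.35}
\]

where the objective function can be written as the following quadratic form:

\[
q(x_0, x_1) = \begin{pmatrix} x_0 \\ x_1 \end{pmatrix}^\top \begin{pmatrix} 5 & 2 \\ 0 & 3 \end{pmatrix} \begin{pmatrix} x_0 \\ x_1 \end{pmatrix} + \begin{pmatrix} 7 \\ 1 \end{pmatrix}^\top \begin{pmatrix} x_0 \\ x_1 \end{pmatrix} + 10 \tag{4.36}
\]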
The script in Python for solving this problem is presented below:

import numpy as np
import cvxpy as cvx
A = np.array([[5,2],[0,3]])   # matrix of Equation (4.36)
H = 1/2*(A+A.T)               # symmetric equivalent
IsSD(H)                       # function defined in Example 4.9
b = np.array([7,1])
c = 10
x = cvx.Variable(2, nonneg = True)
q = cvx.quad_form(x,H)+b.T@x + c
obj = cvx.Minimize(q)
res = [x[0]+x[1]==1]
Model = cvx.Problem(obj,res)
Model.solve()
print(x.value)

First, we define a matrix for the quadratic form and build a symmetric equivalent (see Example 4.7). Then, we determine if it is semidefinite using the function created in Example 4.9; the quadratic form is then represented by the function quad_form that is part of the module CvxPy. The rest of the model is intuitive.
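The result of the solver can be cross-checked without CvxPy, because the feasible set of this example is a line segment. The sketch below is only a sanity check under that observation: it parametrizes the segment as 𝑥 = (𝑡, 1 − 𝑡) with 0 ≤ 𝑡 ≤ 1 and evaluates the objective on a grid (the grid resolution is an arbitrary choice).

```python
import numpy as np

H = np.array([[5.0, 1.0], [1.0, 3.0]])   # symmetric matrix of the objective function
b = np.array([7.0, 1.0])
c = 10.0

def q(x):
    return x @ H @ x + b @ x + c

# Feasible set of Equation (4.35): x = (t, 1 - t) with 0 <= t <= 1
t = np.linspace(0.0, 1.0, 1001)
values = np.array([q(np.array([ti, 1.0 - ti])) for ti in t])

t_best = t[values.argmin()]
x_best = np.array([t_best, 1.0 - t_best])
print("best point on the segment:", x_best, " objective:", values.min())
```

Along the segment the objective reduces to a one-dimensional parabola whose unconstrained minimizer has a negative 𝑡, so the best feasible point lies at the end of the segment where 𝑥0 = 0.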
Example 4.13. Let us define the unit ball in ℝ² with center in 𝑎 = (𝑎0 , 𝑎1 ) as ℬ𝑎 , given by Equation (4.37):

\[
\mathcal{B}_a = \left\{ (x_0, x_1) : (x_0 - a_0)^2 + (x_1 - a_1)^2 \le 1 \right\} \tag{4.37}
\]

We are interested in finding a point that minimizes the function 𝑓(𝑥0 , 𝑥1 ) = 𝑥0 + 𝑥1 , such that (𝑥0 , 𝑥1 ) belongs to the intersection of the unit balls ℬ𝑎 and ℬ𝑏 , with 𝑎 = (1, 1) and 𝑏 = (0, 0), namely:

\[
\begin{aligned}
\min\ & x_0 + x_1 \\
& (x_0, x_1) \in \mathcal{B}_a \cap \mathcal{B}_b
\end{aligned}
\tag{4.38}
\]

This problem is convex since unit balls are convex sets. Moreover, each ball can be represented by a quadratic form, as presented below:

\[
\mathcal{B}_a = \left\{ x \in \mathbb{R}^n : (x - a)^\top I (x - a) \le 1 \right\} \tag{4.39}
\]

where 𝐼 is the identity matrix, and 𝑎 ∈ ℝ^𝑛 is a vector that represents the center of the ball. So, a script to solve the model in Equation (4.38) is presented below:

import numpy as np
import cvxpy as cvx
x = cvx.Variable(2, nonneg = True)
a = np.array([1,1])
q_a = cvx.quad_form(x-a,np.identity(2))
q_b = cvx.quad_form(x,np.identity(2))
obj = cvx.Minimize(x[0]+x[1])
res = [q_a <= 1, q_b <= 1]
Model = cvx.Problem(obj,res)
Model.solve()
print(x.value)

The student is invited to plot this problem and analyze the results.
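As a complement to the plot, the solution can be cross-checked with a simple grid search. The sketch below is a brute-force illustration and not a recommended solution method; the grid resolution is an arbitrary choice. It also evaluates the point of ℬ𝑎 that is farthest in the direction −(1, 1), which minimizes 𝑥0 + 𝑥1 over ℬ𝑎 and happens to lie inside ℬ𝑏.

```python
import numpy as np

# Both balls are contained in the square [0, 1] x [0, 1] where they intersect
g = np.linspace(0.0, 1.0, 1001)
X0, X1 = np.meshgrid(g, g)
inside = ((X0 - 1) ** 2 + (X1 - 1) ** 2 <= 1) & (X0**2 + X1**2 <= 1)

# Objective on feasible grid points, infinity elsewhere
cost = np.where(inside, X0 + X1, np.inf)
k = np.unravel_index(cost.argmin(), cost.shape)
x_grid = np.array([X0[k], X1[k]])

# Candidate from geometry: move one unit from the center a along -(1, 1)/sqrt(2)
x_exact = np.array([1.0, 1.0]) - np.ones(2) / np.sqrt(2)

print("grid search:", x_grid, " objective:", cost.min())
print("geometry:   ", x_exact, " objective:", x_exact.sum())
```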


4.6 Complex variables

Several problems in power systems operation have a simple representation in the set of the complex numbers; hence, it is natural to formulate optimization problems using complex decision variables. However, we must be very careful with this type of formulation. Unlike the real and the integer numbers, which are totally ordered sets, the complex numbers are not. In general, an optimization model on the complex numbers may have the following canonical representation:

min 𝑓(𝑧)
𝑔(𝑧) ≤ 0
ℎ(𝑧) = 0 (4.40)
𝑧 ∈ ℂ𝑛

which is similar to the representation given in Equation (3.47). Nonetheless, we must ensure that the co-domain of both the objective function and the inequality constraints is the set of real numbers, that is to say, 𝑓 ∶ ℂ^𝑛 → ℝ and 𝑔 ∶ ℂ^𝑛 → ℝ. A constraint in the form 𝑔(𝑧) ≤ 0 does not make sense if the image of 𝑔 is complex. Of course, Equation (4.40) may also be represented in terms of real and imaginary variables, as given below:

min 𝑓(𝑥, 𝑦)
𝑔(𝑥, 𝑦) ≤ 0
real (ℎ(𝑥, 𝑦)) = 0 (4.41)
imag (ℎ(𝑥, 𝑦)) = 0
𝑥, 𝑦 ∈ ℝ𝑛

where 𝑧 = 𝑥 + 𝑗𝑦. However, (4.40) is a more compact representation of the

problem in many practical applications. Convexity and similar mathematical
properties must be evaluated in Equation (4.41). Thus, Equation (4.40) is, in
most of the cases, only a convenient representation of the problem.
Example 4.14. The optimization problem presented in Equation (4.38) may
be written in terms of complex variables; for this, we define a complex variable
𝑧 = 𝑥0 + 𝑥1 𝑗, and the optimization model presented below:

min 𝑧real + 𝑧imag

‖𝑧 − (1 + 𝑗)‖ ≤ 1 (4.42)
‖𝑧‖ ≤ 1


The script for solving this problem is the following:

import cvxpy as cvx
# complex decision variable z = x0 + x1*j
z = cvx.Variable(complex=True)
obj = cvx.Minimize(cvx.real(z)+cvx.imag(z))
res = [cvx.abs(z-(1+1j)) <= 1,
       cvx.abs(z) <= 1]
Model = cvx.Problem(obj,res)
Model.solve()
print(z.value)
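The equivalence between the complex model of Equation (4.42) and the real model of Equation (4.38) can be illustrated with NumPy alone. The sketch below draws arbitrary test points (the seed, the sampling box, and the number of points are assumptions of this example) and checks that both formulations classify each point in the same way.

```python
import numpy as np

a = np.array([1.0, 1.0])
rng = np.random.default_rng(1)                      # arbitrary test points
points = rng.uniform(-1.0, 2.0, size=(100, 2))
z = points[:, 0] + 1j * points[:, 1]                # z = x0 + x1*j

# Real formulation (4.38): quadratic forms bounded by one
real_ok = (np.sum((points - a) ** 2, axis=1) <= 1) & (np.sum(points**2, axis=1) <= 1)

# Complex formulation (4.42): moduli bounded by one
complex_ok = (np.abs(z - (1 + 1j)) <= 1) & (np.abs(z) <= 1)

# Objective functions of both formulations
cost_real = points[:, 0] + points[:, 1]
cost_complex = z.real + z.imag

print("same feasibility:", np.array_equal(real_ok, complex_ok))
print("same objective:  ", np.allclose(cost_real, cost_complex))
```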

4.7 What is inside the box?

In this chapter, we used CvxPy as a modeling platform. This module calls external solvers to find an optimal solution to the problem. Most of these solvers use efficient variations of the gradient and Newton's methods for unconstrained problems, whereas inequality-constrained problems are usually solved by the interior point method or similar barrier methods [17]. Details of these methods are beyond the objectives of this book; however, it is interesting to see the main idea by using the following problem:

min 𝑓(𝑥)
𝑔(𝑥) ≤ 0 (4.43)

We define an indicator function for the inequality constraint as follows:

\[
I_g(x) = \begin{cases} 0 & \text{if } g(x) \le 0 \\ \infty & \text{otherwise} \end{cases} \tag{4.44}
\]

This function returns zero when 𝑥 is a feasible solution of the problem and infinity otherwise. Now, we can define a new function 𝐵 given by Equation (4.45):

𝐵(𝑥) = 𝑓(𝑥) + 𝐼𝑔 (𝑥) (4.45)

This function is similar to the Lagrangian. However, it is not continuous and cannot be minimized by simply setting its derivative to zero. Therefore, we define a continuous approximation of the indicator function, as given in Equation (4.46). This function is called the logarithmic barrier.

𝐼𝑔 (𝑥) ≈ 𝜙𝜇 (𝑥) = −𝜇 ln(−𝑔(𝑥)) (4.46)

Figure 4.5 shows the indicator function and the corresponding logarithmic barrier. The barrier function approximates the indicator function as 𝜇 → 0. The idea is to solve the problem with the approximated function using a continuous method (for example, Newton's method) and to decrease the value of 𝜇 iteratively until achieving convergence.

Figure 4.5 Indicator function and logarithmic barrier.

Although this is an oversimplification of the algorithm implemented in practice, it is useful to understand the concept. There are several variants of the interior point algorithm. Interested readers can refer to [17] for more details.
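The idea can be sketched in a few lines for the problem of Example 3.21, min 𝑥² subject to 𝑥 ≥ 3, whose barrier function is 𝐵(𝑥) = 𝑥² − 𝜇 ln(𝑥 − 3). The code below is a didactic sketch and not the algorithm used by any particular solver: the step-halving safeguard, the starting point, the reduction factor of 𝜇, and the helper `barrier_minimize` are all assumptions of this example.

```python
import math

def barrier_minimize(mu, x, iters=50):
    """Newton iterations on B(x) = x**2 - mu*ln(x - 3), keeping x > 3."""
    for _ in range(iters):
        grad = 2 * x - mu / (x - 3)
        hess = 2 + mu / (x - 3) ** 2
        step = -grad / hess
        while x + step <= 3:          # halve the step to stay strictly feasible
            step *= 0.5
        x += step
    return x

x = 4.0                               # strictly feasible starting point
mu = 1.0
history = []
while mu > 1e-8:
    x = barrier_minimize(mu, x)       # warm start from the previous solution
    history.append((mu, x))
    mu *= 0.1                         # decrease the barrier parameter

for mu_k, x_k in history:
    # closed-form minimizer of the barrier function, used only for comparison
    x_ref = (6 + math.sqrt(36 + 8 * mu_k)) / 4
    print(f"mu = {mu_k:.1e}   x = {x_k:.8f}   reference = {x_ref:.8f}")
```

Each solution stays strictly inside the feasible set and approaches the constrained optimum 𝑥 = 3 as 𝜇 decreases, which is the behavior suggested by Figure 4.5.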

4.8 Mixed-integer programming problems

A mixed-integer programming (MIP) problem is an optimization problem with both continuous and discrete variables. Discrete spaces are non-convex, and hence, all MIP problems are non-convex. More notably, MIPs are NP-hard, which means they are among the most challenging problems in terms of theoretical complexity. However, some mixed-integer programming problems can be solved in Python using CvxPy with a suitable solver. The most common approach for solving this type of problem is the Branch & Bound (B&B) algorithm, which is based on the idea of dividing the problem into smaller subproblems until a binary solution is found.
Let us see the basic philosophy of the algorithm by considering the following
binary optimization problem:

min 𝑓(𝑥)
𝑥𝑖 ∈ 𝔹 (4.47)

where 𝑓 ∶ ℝ𝑛 → ℝ is convex and 𝔹 = {0, 1}. First, we solve the following

relaxed problem:

min 𝑓(𝑥)
0 ≤ 𝑥𝑖 ≤ 1 (4.48)
𝑥𝑖 ∈ ℝ


We would be lucky if this solution turned out to be binary. Most probably, the solution contains real values such as 𝑥𝑖 = 0.8 or 𝑥𝑖 = 0.3. This solution is obviously not feasible from the point of view of the binary problem. However, its objective value is a lower bound 𝑓 lower of the problem. An upper bound, given by the best binary solution found so far, is also required and marked as 𝑓 upper .
The B&B algorithm departs from 𝑓 lower and evaluates different problem instances using a branching rule. For example, we evaluate the solution with 𝑥𝑖 = 0 and the solution with 𝑥𝑖 = 1. If the value of one of these branching problems is higher than 𝑓 upper , then the solution is discarded, as well as the branching stages below this instance. If the solution is binary and lower than 𝑓 upper , then we have a new 𝑓 upper and continue the algorithm. The main drawback of this algorithm is the high computational load associated with evaluating each node as the tree is built. Therefore, we require efficient branching rules to reduce the number of nodes that are evaluated. Most of the commercial solvers have additional techniques to accelerate the process. The following example shows how the algorithm works in practice.
Example 4.15. Consider the following optimization problem:

min 𝑓(𝑥) = 𝑥⊤ 𝐻𝑥
∑
𝑥𝑖 = 3 (4.49)
𝑥𝑖 ∈ 𝔹
with 𝐻 = diag(0.41, 0.51, 0.32, 0.20, 0.31, 0.21) and 𝑖 ∈ {0, 1, … , 6}. First, we
relax the binary constraint obtaining the following convex model:

⎧ min 𝑥⊤ 𝐻𝑥 ⎫
∑
𝐴= 𝑥𝑖 = 3 (4.50)
⎨ ⎬
0 ≤ 𝑥𝑖 ≤ 1
⎩ ⎭
If the solution of this problem is binary, then our problem is solved. However,
this is not the case, so 𝐴 is just a lower bound. Then we generate new instances
of the problem with 𝑥0 = 0 and 𝑥0 = 1, namely

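\[
B = \left\{ \begin{array}{l} \min\ x^\top H x \\ \sum_i x_i = 3 \\ 0 \le x_i \le 1 \\ x_0 = 0 \end{array} \right\}
\tag{4.51}
\]
and
\[
C = \left\{ \begin{array}{l} \min\ x^\top H x \\ \sum_i x_i = 3 \\ 0 \le x_i \le 1 \\ x_0 = 1 \end{array} \right\}
\tag{4.52}
\]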

Figure 4.6 Tree generated by the branch and bound algorithm.

The process continues generating a tree, as depicted in Figure 4.6. Eventually,
the algorithm finds a binary solution; in this case, in node 𝐻 with 𝑓 = 0.72
and 𝑥 = (0, 0, 0, 1, 1, 1)⊤. This binary solution is an upper bound that blocks
any solution resulting from nodes 𝐸 and 𝐺, whose values are already higher.
The branching process must be executed from nodes 𝐷 and 𝐹 until a better
binary solution is found or until the values of the nodes become higher than
the upper bound. If a better binary solution is found, it is set as the new
upper bound, and the process continues until the tree is completed.
Table 4.1 shows in detail the first nodes generated by the algorithm for the
states depicted in Figure 4.6. The enumeration tree generated by the branch
and bound method may be large, and hence, the algorithm is not as efficient as
the algorithms for continuous optimization. However, this is the primary tool
for most of the integer and binary problems in practice.
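
The steps of Example 4.15 can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the implementation used by any solver: the helper names `solve_relaxation` and `branch_and_bound` are hypothetical, the relaxation is solved with a bisection that exploits the diagonal structure of 𝐻 (from the optimality conditions, each free variable equals a common multiplier divided by ℎ𝑖, clipped to [0, 1]), and the tree is explored depth-first, branching on the first fractional variable.

```python
# Illustrative branch and bound for Example 4.15 (pure Python, no solver).
H = [0.41, 0.51, 0.32, 0.20, 0.31, 0.21]   # diagonal of H
TOTAL = 3                                   # required value of sum(x)

def solve_relaxation(fixed):
    """Solve min sum h_i x_i^2, sum x_i = TOTAL, 0 <= x_i <= 1, with some x_i fixed.

    Each free variable is x_i = min(1, mu / h_i); mu is found by bisection so
    that the free variables add up to the remaining amount.
    Returns (value, x), or None if the node is infeasible.
    """
    free = [i for i in range(len(H)) if i not in fixed]
    rest = TOTAL - sum(fixed.values())
    if rest < 0 or rest > len(free):
        return None
    lo, hi = 0.0, max(H)            # at mu = max(H) every free x_i equals 1
    for _ in range(100):
        mu = 0.5 * (lo + hi)
        if sum(min(1.0, mu / H[i]) for i in free) < rest:
            lo = mu
        else:
            hi = mu
    x = [0.0] * len(H)
    for i, v in fixed.items():
        x[i] = float(v)
    for i in free:
        x[i] = min(1.0, hi / H[i])
    return sum(h * xi ** 2 for h, xi in zip(H, x)), x

def branch_and_bound():
    best_value, best_x = float("inf"), None    # incumbent (upper bound)
    stack = [{}]                               # a node is a dict {index: fixed value}
    while stack:
        fixed = stack.pop()
        node = solve_relaxation(fixed)
        if node is None:
            continue
        value, x = node
        if value >= best_value - 1e-9:
            continue                           # prune: bound not better than incumbent
        fractional = [i for i, xi in enumerate(x) if 1e-6 < xi < 1 - 1e-6]
        if not fractional:                     # binary solution: new upper bound
            best_value, best_x = value, [round(xi) for xi in x]
            continue
        i = fractional[0]                      # branching rule: first fractional variable
        stack.append({**fixed, i: 1})
        stack.append({**fixed, i: 0})
    return best_value, best_x

value, x = branch_and_bound()
print(round(value, 4), x)
```

The root relaxation reproduces the value of node 𝐴 in Table 4.1, and the search ends with the binary solution of node 𝐻. A different branching rule or search order would visit the nodes in a different sequence but should reach the same optimum.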

Example 4.16. There are efficient solvers for binary problems available in
CvxPy, so we do not require generating the enumeration tree by hand. The code
for the previous example is presented below:

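import numpy as np
import cvxpy as cvx

# six binary decision variables
x = cvx.Variable(6, boolean=True)
H = np.diag([0.41,0.51,0.32,0.20,0.31,0.21])
f = cvx.Minimize(cvx.quad_form(x,H))
# the bounds 0 <= x <= 1 are redundant for boolean variables but harmless
res = [ 0<= x, x<=1, cvx.sum(x)==3]
BinaryProblem = cvx.Problem(f,res)
# a solver with mixed-integer capabilities must be installed alongside CvxPy
BinaryProblem.solve()
print('Optimal value',np.round(f.value,4),np.round(x.value,4))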

Table 4.1 Details of the nodes generated by the branch and bound algorithm.

Node   f(x)     x0       x1       x2       x3       x4       x5
A      0.4388   0.3567   0.2868   0.4570   0.7313   0.4718   0.6964
B      0.4980   0        0.3255   0.5187   0.8299   0.5354   0.7904
C      0.6313   1        0.2170   0.3458   0.5533   0.3570   0.5269
D      0.5586   0        0        0.5818   0.9309   0.6006   0.8866
E      0.7583   0        1        0.3879   0.6206   0.4004   0.5911
F      0.6583   1        0        0.3879   0.6206   0.4004   0.5911
G      0.9821   1        1        0.1939   0.3103   0.2002   0.2955
H      0.7200   0        0        0        1        1        1

The only difference with respect to the continuous problem is that variable 𝑥
is defined as binary with the parameter boolean=True.
Although it is easy to generate binary problems, we must keep in mind that
each binary variable implies an increase in the enumeration tree's size (in the
worst case, the number of nodes doubles with each additional binary variable),
which makes the model more complex and the algorithm slower. As a basic rule,
we should reduce, if possible, the number of binary variables in our models.

4.9 Transforming MINLP into MILP

Mixed-integer non-linear programming problems (MINLP) are among the
most complicated mathematical optimization problems in theory and practice.
Therefore, it is convenient to transform these models into mixed-integer linear
programming problems (MILP). Below, we present some examples of heuristic
transformations.

Example 4.17. Let us consider a set of constraints as presented below:

𝑦 = 𝑢𝑥
𝑥low ≤ 𝑥 ≤ 𝑥up (4.53)
𝑢 ∈ 𝔹,𝑥 ∈ ℝ, 𝑦 ∈ ℝ

The first constraint is non-linear, non-convex, and mixed-integer. Therefore,
it is convenient to transform it into a set of mixed-integer affine constraints
as follows:

𝑢𝑥low ≤ 𝑦 ≤ 𝑢𝑥up
𝑥 − (1 − 𝑢)(𝑥up − 𝑥low ) ≤ 𝑦 ≤ 𝑥 + (1 − 𝑢)(𝑥up − 𝑥low )
𝑥low ≤ 𝑥 ≤ 𝑥up (4.54)
𝑢 ∈ 𝔹,𝑥 ∈ ℝ, 𝑦 ∈ ℝ
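Notice that if 𝑢 = 0 then the first constraint is reduced to 𝑦 = 0, whereas if
𝑢 = 1 the first two constraints result in 𝑥low ≤ 𝑦 ≤ 𝑥up and 𝑦 = 𝑥, respectively.
These conditions are equivalent to 𝑦 = 𝑢𝑥 with 𝑢 ∈ 𝔹. For 𝑢 = 0, the second
constraint must remain inactive for every admissible 𝑥, which holds when
𝑥low ≤ 0 ≤ 𝑥up; for other bounds, it can be replaced by
𝑥 − (1 − 𝑢)𝑥up ≤ 𝑦 ≤ 𝑥 − (1 − 𝑢)𝑥low.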
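
The equivalence can be checked numerically with a short sketch. It assumes hypothetical bounds 𝑥low = −1 and 𝑥up = 2 (an interval that contains zero) and a helper `y_interval`, which is not part of any library; the helper returns the range of 𝑦 allowed by the affine constraints of Equation (4.54) for fixed 𝑢 and 𝑥.

```python
def y_interval(u, x, x_low, x_up):
    """Range of y allowed by the affine constraints of Equation (4.54)."""
    big_m = x_up - x_low
    lower = max(u * x_low, x - (1 - u) * big_m)
    upper = min(u * x_up, x + (1 - u) * big_m)
    return lower, upper

x_low, x_up = -1.0, 2.0          # hypothetical bounds with x_low <= 0 <= x_up
ok = True
for u in (0, 1):
    for k in range(11):
        x = x_low + k * (x_up - x_low) / 10
        lower, upper = y_interval(u, x, x_low, x_up)
        # the interval must collapse to the single value y = u*x
        ok = ok and abs(lower - u * x) < 1e-12 and abs(upper - u * x) < 1e-12
print("constraints reproduce y = u*x:", ok)
```

For every sampled 𝑥, the interval collapses to the single point 𝑦 = 𝑢𝑥, which is what the product constraint requires.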
Example 4.18. Mixed-integer quadratic programming problems such as the one
given in Equation (4.49) can be easily transformed into a MILP problem. First,
we write the quadratic form as a polynomial:
\[
f(x) = \sum_k \sum_m h_{km} x_k x_m
\tag{4.55}
\]

Then, we notice that 𝑥² = 𝑥 for 𝑥 ∈ 𝔹. Therefore, the terms in the diagonal of
the quadratic form can be replaced by linear terms as presented below:
\[
f(x) = \sum_k h_{kk} x_k + \sum_k \sum_{m \neq k} h_{km} x_k x_m
\tag{4.56}
\]

Next, the bi-linear terms 𝑥𝑘𝑥𝑚 are replaced by a new binary variable 𝑦𝑘𝑚, that
is to say:
\[
f(x) = \sum_k h_{kk} x_k + \sum_k \sum_{m \neq k} h_{km} y_{km}
\tag{4.57}
\]

Finally, we add the following auxiliary constraints:

𝑥𝑘 + 𝑥𝑚 − 1 ≤ 𝑦𝑘𝑚
𝑦𝑘𝑚 ≤ 𝑥𝑘 (4.58)
𝑦𝑘𝑚 ≤ 𝑥𝑚
Notice that, under these constraints, 𝑦𝑘𝑚 = 1 if 𝑥𝑘 = 1 and 𝑥𝑚 = 1; otherwise
𝑦𝑘𝑚 = 0. Since the objective function and all the constraints are now affine,
the model is an MILP problem.
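
A brute-force check of this transformation is sketched below. The matrix `h` is a hypothetical example, and `linearized_value` is an illustrative helper that evaluates Equation (4.57) after selecting, for every pair, the only binary value of 𝑦𝑘𝑚 allowed by Equation (4.58).

```python
from itertools import product

def linearized_value(h, x):
    """Evaluate the linear objective (4.57), with y_km taken from constraints (4.58)."""
    n = len(x)
    total = sum(h[k][k] * x[k] for k in range(n))   # x_k^2 = x_k for binary x_k
    for k in range(n):
        for m in range(n):
            if m == k:
                continue
            # binary values of y_km that satisfy x_k + x_m - 1 <= y <= min(x_k, x_m)
            allowed = [y for y in (0, 1)
                       if x[k] + x[m] - 1 <= y <= min(x[k], x[m])]
            assert allowed == [x[k] * x[m]]          # exactly one value: the product
            total += h[k][m] * allowed[0]
    return total

h = [[2, 1, 0], [1, 3, -1], [0, -1, 4]]              # hypothetical symmetric matrix
match = all(
    linearized_value(h, x)
    == sum(h[k][m] * x[k] * x[m] for k in range(3) for m in range(3))
    for x in product((0, 1), repeat=3)
)
print(match)
```

The linear model and the quadratic form agree on all eight binary points, so both problems share the same optimum.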

4.10 Further readings

In this chapter, we learned how to solve linear and quadratic problems in
Python using CvxPy. The reader is invited to see the module’s manual in [23]
and reproduce the examples presented there. Most of the solvers called by
Python are variants of the gradient, Newton's, and/or interior-point
methods. These methods have well-defined conditions that guarantee

convergence if the problem is convex. The theory behind these algorithms is

beyond this book’s objectives. It can be found in [11] and [24].
A reader interested in delving into the theory can continue with Chapter 5
where a family of convex optimization problems, known as conic optimiza-
tion, is studied. A reader interested in applications for power systems oper-
ation can go directly to Chapter 7, which presents the economic dispatch of
thermal units.
There is extensive literature about mixed-integer problems that goes beyond
the practical objectives of this book. Most of the solvers for mixed-integer
optimization are based on the branch and bound method, although other
algorithms such as cutting planes and greedy algorithms are also used.
A good presentation of these methods, both in theory and practice, is
available in [25].

4.11 Exercises
1. Solve the following linear programming problem:
\[
\begin{aligned}
\min\ & c^\top x \\
& \sum_i x_i = 1 \\
& x_i \ge 0
\end{aligned}
\tag{4.59}
\]
where 𝑐, 𝑥 ∈ ℝ𝑛 and 𝑐𝑖 = 𝑖 + 1. Solve the problem for 𝑛 = 2, 𝑛 = 10, and
𝑛 = 100.
2. Solve the transportation problem with six sources and eight demands
described in Table 4.2
3. Solve the following problem in Python using the module CvxPy.
\[
\begin{aligned}
\min\ & 3x^2 + 2y^2 + 5z^2 \\
& x + y + z = 1 \\
& x, y, z \ge 0
\end{aligned}
\tag{4.60}
\]
4. Solve the following problem, similar to Example 4.13, for 𝑛 = 4 and 𝑛 = 5.
\[
\begin{aligned}
\min\ & \sum_{i=0}^{n-1} x_i \\
& x \in \mathcal{B}_a \cap \mathcal{B}_b
\end{aligned}
\tag{4.61}
\]
where 𝑎 = 1𝑛 (i.e., a vector with all entries equal to 1) and 𝑏 = 0𝑛 (i.e., a
vector of zeros).

Table 4.2 Parameters for a transportation problem with six sources and eight demands.

cij    0     1     2     3     4     5     si
0      43    90    10    58    95    60    175
1      49    41    65    75    25    17    62
2      33    41    26    64    72    29    118
3      16    49    84    26    36    91    118
4      8     95    82    66    2     17    58
5      90    92    28    32    55    66    175
6      95    90    71    87    69    72    173
7      66    87    29    40    37    52    122
𝑑𝑖     212   144   92    168   201   184   total = 1001

5. Solve the following optimization problem using Python and the module
CvxPy
\[
\begin{aligned}
\min\ & x^2 + y^2 \\
& (x-1)^2 + (y-1)^2 \le 1 \\
& (x-1)^2 + (y+1)^2 \le 1
\end{aligned}
\tag{4.62}
\]

6. Solve the following quadratic optimization problem:
\[
\begin{aligned}
\min\ & \tfrac{1}{2}(x - 1_n)^\top H (x - 1_n) \\
& \sum_i x_i = 1
\end{aligned}
\tag{4.63}
\]
where 1𝑛 is a vector with all entries equal to 1 and 𝐻 is a symmetric matrix
of size 𝑛 × 𝑛 constructed in the following way: ℎ𝑘𝑚 = (𝑚 + 𝑘)∕2 if 𝑘 ≠ 𝑚
and ℎ𝑘𝑚 = 𝑛² + 𝑛 if 𝑘 = 𝑚. Use 𝑛 = 2, 𝑛 = 10, and 𝑛 = 100.
7. Solve the following optimization model using the basic interior point
method described in Section 4.7.

min 𝑥²
𝑥 ≥ 5 (4.64)

8. A matrix 𝐴 is diagonally dominant if its entries are such that
\[
a_{kk} \ge \sum_{m \neq k} |a_{km}|
\tag{4.65}
\]

Show that every symmetric diagonally dominant matrix is positive semidefinite,
but the converse is not true.
9. Define a function in Python that generates a random positive definite
matrix of size 𝑛 × 𝑛. Use this function to generate random matrices 𝐴 and
𝐵; evaluate numerically each of the properties given in Section 4.4.
10. Finish Example 4.15 and compare the solution with Example 4.16.


5 Conic optimization

Learning outcomes

By the end of this chapter, the student will be able to:

● Identify the main features related to semidefinite and second-order
cone optimization.
● Transform optimization problems into standard SDP or SOC models.
● Solve SDP and SOC problems using Python.

5.1 Convex cones

A cone is a set 𝒞 ⊆ ℝ𝑛 such that if 𝑥 ∈ 𝒞 then 𝛼𝑥 ∈ 𝒞 for every scalar 𝛼 ≥ 0.
A convex cone is a set that is simultaneously a cone and a convex set, as
depicted in Figure 5.1. A conic optimization problem minimizes a convex
function over the intersection of an affine subspace and a convex cone. Two
particular types of convex cones are relevant in power systems operation: the
cone generated by semidefinite matrices and the second-order cone. Linear,
quadratic, and quadratically constrained problems can be considered as
particular cases of optimization over these cones. In addition, there are
several solvers available in CvxPy that efficiently solve conic optimization
problems. The following sections study theoretical and practical aspects of
conic optimization in Python.

5.2 Second-order cone optimization

A second-order cone or SOC is a set in ℝ𝑛+1 given by the following expression:
\[
\mathcal{C}_{SOC} = \left\{ (x, z) \in \mathbb{R}^{n+1} : \|x\| \le z \right\}
\tag{5.1}
\]

Figure 5.1 Example of a convex cone 𝒞A and a non-convex cone 𝒞B

Figure 5.2 Representation of the second order cone Ω = {‖𝑥‖ ≤ 𝑧} with 𝑥 ∈ ℝ2 and
𝑧 ∈ ℝ.

where 𝑥 is a vector in ℝ𝑛; 𝑧 is a real variable; and ‖⋅‖ is the Euclidean norm,
given by Equation (5.2).
\[
\|x\| = \sqrt{x_0^2 + x_1^2 + x_2^2 + \cdots + x_{n-1}^2}
\tag{5.2}
\]

Figure 5.2 depicts the boundary of an SOC with 𝑥 ∈ ℝ2 .

Let us see why Equation (5.1) defines a cone. Consider a point (𝑥, 𝑧) ∈ 𝒞SOC
and a positive scalar 𝛼. Now, consider a point scaled by that 𝛼. This new point
holds the same inequality as given below:

‖𝛼𝑥‖ ≤ 𝛼𝑧 (5.3)

In other words, the point (𝛼𝑥, 𝛼𝑧) ∈ 𝒞SOC . Therefore, 𝒞SOC is a cone. It is
straightforward to demonstrate that this cone is also convex using the properties
of the norm (Section 2.2) and the Jensen’s inequality.
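
The convexity argument can be written out in one line. The derivation below is a sketch that relies only on the triangle inequality and the homogeneity of the norm; the symbols 𝛽, (𝑥₁, 𝑧₁), and (𝑥₂, 𝑧₂) are introduced here for illustration. Take two points of the cone and two scalars 𝛼, 𝛽 ≥ 0 with 𝛼 + 𝛽 = 1:

```latex
\[
\|\alpha x_1 + \beta x_2\|
\le \alpha \|x_1\| + \beta \|x_2\|
\le \alpha z_1 + \beta z_2
\]
```

Hence the intermediate point (𝛼𝑥₁ + 𝛽𝑥₂, 𝛼𝑧₁ + 𝛽𝑧₂) also satisfies the inequality that defines the cone.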
On the other hand, both sides of the inequality in Equation (5.1) can be
composed with affine functions, obtaining a general representation of an SOC
constraint, as given below:

‖𝐴𝑥 + 𝑏‖ ≤ 𝑐⊤ 𝑥 + 𝑑 (5.4)

where 𝐴 is a matrix, 𝑏 and 𝑐 are vectors, and 𝑑 is a scalar. Thus, a general

representation of an SOC optimization problem is the following:

min ℎ⊤ 𝑥
‖𝐴𝑥 + 𝑏‖ ≤ 𝑐⊤ 𝑥 + 𝑑 (5.5)

However, many practical problems are a combination of different types of con-

straints. Therefore, a practical problem might not be an SOC optimization
problem but a convex problem with second-order cone constraints.
Among convex optimization problems, SOC optimization is particularly
appealing in practice for three main reasons: first, there are several available
algorithms and solvers for SOC problems. These algorithms are fast and effi-
cient in practice. Second, several optimization problems may be represented
as SOC problems. For example, a linear programming problem is a particu-
lar case of an SOC problem with 𝐴 = 0, and a quadratic-convex restriction is
also presentable as an SOC constraint. Third, it is possible to transform some
non-convex problems into equivalent SOC problems. The examples below show
some of these equivalents.

Example 5.1. A convex-quadratic constraint can be represented as an SOC.

Let us consider the following inequality that is convex if 𝑀 ≻ 0:

𝑥 ⊤ 𝑀𝑥 − 𝑛⊤ 𝑥 − 𝑠 ≤ 0 (5.6)

Since 𝑀 ≻ 0 (see Section 4.3), it has a Cholesky factorization given by
\( M = (M^{1/2})^\top M^{1/2} \).
Let us define two new variables 𝑢 and 𝑧 as follows:
\[ u = M^{1/2} x \tag{5.7} \]
\[ z = n^\top x + s \tag{5.8} \]

then, Equation (5.6) is equivalent to the following constraint:

𝑢⊤ 𝑢 ≤ 𝑧 (5.9)

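This inequality can be transformed into an SOC by adding 𝑧²∕4 − 𝑧∕2 + 1∕4 to
both sides of the inequality, as given below:

\[ u^\top u + \frac{z^2}{4} - \frac{z}{2} + \frac{1}{4} \le z + \frac{z^2}{4} - \frac{z}{2} + \frac{1}{4} \tag{5.10} \]
\[ u^\top u + \left(\frac{z-1}{2}\right)^2 \le \left(\frac{1+z}{2}\right)^2 \tag{5.11} \]
\[ \left\| \begin{pmatrix} u \\ \frac{z-1}{2} \end{pmatrix} \right\| \le \frac{1+z}{2} \tag{5.12} \]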
Returning to the original variables, we have the following expression:
\[
\left\| \begin{pmatrix} M^{1/2} x \\ \frac{n^\top x + s - 1}{2} \end{pmatrix} \right\| \le \frac{1 + n^\top x + s}{2}
\tag{5.13}
\]
By using this method, any convex-quadratic model can be transformed into an
SOC problem.
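
The equivalence between Equation (5.6) and Equation (5.13) can be tested numerically. The sketch below assumes NumPy is available and uses hypothetical random data (`M`, `n_vec`, `s`); the factor 𝑀^{1/2} is taken as the transpose of the lower-triangular factor returned by `np.linalg.cholesky`.

```python
import numpy as np

rng = np.random.default_rng(0)
G = rng.standard_normal((3, 3))
M = G @ G.T + np.eye(3)            # hypothetical positive definite matrix
n_vec = rng.standard_normal(3)     # vector n in Equation (5.6)
s = 1.0

R = np.linalg.cholesky(M).T        # upper-triangular factor, M = R.T @ R

def quadratic_ok(x):
    """Convex quadratic constraint, Equation (5.6)."""
    return x @ M @ x - n_vec @ x - s <= 0

def soc_ok(x):
    """Second-order cone form, Equation (5.13)."""
    z = n_vec @ x + s
    lhs = np.linalg.norm(np.append(R @ x, (z - 1) / 2))
    return lhs <= (1 + z) / 2

samples = 0.5 * rng.standard_normal((2000, 3))
agree = all(quadratic_ok(x) == soc_ok(x) for x in samples)
print(agree, sum(quadratic_ok(x) for x in samples))
```

Both tests classify every sampled point in the same way; the second printed number counts how many samples are feasible.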

Example 5.2. Consider the following set of inequality constraints:

𝑥𝑦 ≥ 𝑤⊤ 𝑤
𝑥≥0 (5.14)
𝑦≥0

where 𝑤 is a vector in ℝ𝑛 and 𝑥, 𝑦 are variables in ℝ. These inequalities define
a hyperbolic set that, at first glance, looks like a non-convex set. However, it
is in fact convex and can be transformed into an SOC; for this, we start from
the following representation of a hyperbolic paraboloid:
\[
xy = \frac{1}{4}(x + y)^2 - \frac{1}{4}(x - y)^2
\tag{5.15}
\]
then, we have the following:

\[ xy \ge w^\top w \tag{5.16} \]
\[ \frac{1}{4}(x + y)^2 - \frac{1}{4}(x - y)^2 \ge w^\top w \tag{5.17} \]
\[ (x + y)^2 \ge (x - y)^2 + 4 w^\top w \tag{5.18} \]

The last inequality can be easily transformed into an SOC as follows:
\[
x + y \ge \left\| \begin{pmatrix} 2w \\ x - y \end{pmatrix} \right\|
\tag{5.19}
\]

This example shows that SOC is a very general way to represent many non-
linear problems.

Example 5.3. Consider the following set:

𝑧 = 𝑥𝑦
0≤𝑥≤1 (5.20)
0≤𝑦≤1

A constraint of the form 𝑧 = 𝑥𝑦 is not convex. Therefore, it is common to use a
linearization to include this type of constraint in an optimization problem,
namely:

\[
\begin{aligned}
z &\ge 0 \\
z &\le x \\
z &\le y \\
z &\ge x + y - 1
\end{aligned}
\tag{5.21}
\]

This linearization works in most cases. However, for a constraint of the
form 𝑧² = 𝑥𝑦, it is more precise to use an SOC approximation. We transform the
equality into the inequality 𝑧² ≤ 𝑥𝑦 and use Equation (5.19) as follows:

\[
\begin{aligned}
& \left\| \begin{pmatrix} 2z \\ x - y \end{pmatrix} \right\| \le x + y \\
& 0 \le x \le 1 \\
& 0 \le y \le 1 \\
& z \ge 0
\end{aligned}
\tag{5.22}
\]

The SOC approximation maintains the non-linear nature of the problem while
making it convex. Notice that the point (𝑥, 𝑦, 𝑧) = (1∕4, 1, 1∕2) satisfies
𝑧² = 𝑥𝑦 and is feasible in the SOC approximation. However, it is infeasible in
the linearization, since it violates 𝑧 ≤ 𝑥.

Example 5.4. The function 𝑓(𝑥) = − ln(𝑥) is convex and hence the following
set is also convex:

\[
\begin{aligned}
& -\ln(x + 1) \le z \\
& -\tfrac{1}{2} \le x \le \tfrac{1}{2}
\end{aligned}
\tag{5.23}
\]

However, it is possible to obtain an SOC approximation of this set by
considering a quadratic expansion of the logarithmic function:
\[
\ln(x + 1) \approx x - \frac{x^2}{2}
\tag{5.24}
\]
This approximation can be included into Equation (5.23), obtaining the
following SOC constraint:
\[
\begin{aligned}
& \left\| \begin{pmatrix} x \\ x + z - 1/2 \end{pmatrix} \right\| \le \frac{1 + 2(x + z)}{2} \\
& -\tfrac{1}{2} \le x \le \tfrac{1}{2}
\end{aligned}
\tag{5.25}
\]
The student is invited to plot the set given by Equation (5.23) and the set given
by Equation (5.25) and compare these results.
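
Before plotting, the two sets can be compared numerically. The following sketch uses only the standard library; `soc_holds` is an illustrative helper that evaluates the cone constraint of Equation (5.25), and the grid of 101 points is an arbitrary choice.

```python
import math

def soc_holds(x, z):
    """SOC constraint of Equation (5.25)."""
    return math.hypot(x, x + z - 0.5) <= (1 + 2 * (x + z)) / 2

worst = 0.0
consistent = True
for k in range(101):
    x = -0.5 + k / 100
    z_exact = -math.log(x + 1)      # boundary of the original set (5.23)
    z_quad = -x + x ** 2 / 2        # boundary given by the expansion (5.24)
    worst = max(worst, abs(z_exact - z_quad))
    # points just above the quadratic boundary satisfy (5.25); points just below do not
    consistent = consistent and soc_holds(x, z_quad + 1e-6)
    consistent = consistent and not soc_holds(x, z_quad - 1e-6)
print(consistent, round(worst, 4))
```

The largest gap between both boundaries appears at 𝑥 = −1∕2, where the expansion is least accurate.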
Example 5.5. An SOC problem can be easily solved in Python. Consider the
following optimization model:
min 𝑧
‖𝐴𝑥 + 𝑏‖ ≤ 𝑧 (5.26)
where 𝑥 ∈ ℝ𝑛 , 𝐴 ∈ ℝ𝑚×𝑛 , and 𝑏 ∈ ℝ𝑚 . The following code solves a random
instance of this problem, for 𝑛 = 10 and 𝑚 = 6:
import numpy as np
import cvxpy as cvx

n = 10
m = 6
A = np.random.rand(m,n)   # random problem data
b = np.random.rand(m)
x = cvx.Variable(n)
z = cvx.Variable()
obj = cvx.Minimize(z)
# cvx.SOC(z, w) encodes the constraint ||w|| <= z
res = [cvx.SOC(z,A@x+b)]
prob = cvx.Problem(obj, res)
prob.solve()
print("Optimal value:", obj.value)

The key function in this example is the SOC constraint, cvx.SOC(z, A@x+b),
which encodes ‖𝐴𝑥 + 𝑏‖ ≤ 𝑧. As always, the reader is invited to experiment
with this code.

5.2.1 Duality in SOC problems

Duality theory is easily applied to second-order cone optimization problems.
Consider the SOC model given by Equation (5.5) which is equivalent to the
model presented below:

min ℎ⊤ 𝑥
‖𝑢‖ ≤ 𝑐⊤ 𝑥 + 𝑑 (5.27)
𝑢 = 𝐴𝑥 + 𝑏

We define a Lagrangian function as follows:

ℒ(𝑥, 𝑢, 𝑦, 𝑧) = ℎ⊤ 𝑥 + 𝑦(‖𝑢‖ − 𝑐⊤ 𝑥 − 𝑑) + 𝑧⊤ (𝑢 − 𝐴𝑥 − 𝑏) (5.28)

with 𝑦 ≥ 0. Now, we take the infimum in 𝑥 and 𝑢, in order to obtain the dual
function. Fortunately, the problem is separable as presented below:
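\[
\begin{aligned}
\inf_{x,u} \mathcal{L} &= \inf_{x,u} \left( h^\top x + y(\|u\| - c^\top x - d) + z^\top (u - Ax - b) \right) && (5.29) \\
&= \inf_{x,u} \left( (-yd - b^\top z) + (h^\top - y c^\top - z^\top A) x + (y\|u\| + z^\top u) \right) && (5.30) \\
&= -yd - b^\top z + \inf_{x} (h^\top - y c^\top - z^\top A) x + \inf_{u} (y\|u\| + z^\top u) && (5.31)
\end{aligned}
\]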

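The infimum of ℒ with respect to 𝑥 is finite only if the coefficient of 𝑥
vanishes, that is:

\[ h - y c - A^\top z = 0 \tag{5.32} \]

and the infimum of ℒ with respect to 𝑢 is obtained from:

\[ \inf_{u} \; y\|u\| + z^\top u \tag{5.33} \]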

If we are using the Euclidean norm, then the Cauchy-Schwarz inequality is valid:

\[ |z^\top u| \le \|z\| \|u\| \tag{5.34} \]

and consequently

\[ -\|z\| \|u\| \le z^\top u \le \|z\| \|u\| \tag{5.35} \]
\[ -\|z\| \|u\| + y\|u\| \le y\|u\| + z^\top u \le \|z\| \|u\| + y\|u\| \tag{5.36} \]
\[ (y - \|z\|) \|u\| \le y\|u\| + z^\top u \le (y + \|z\|) \|u\| \tag{5.37} \]

The infimum in both the right and the left-hand side of this inequality is zero
as long as 𝑦 − ‖𝑧‖ ≥ 0. Combining all these results, we have the following dual
problem:

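\[
\begin{aligned}
\max\ & -(y d + b^\top z) \\
& y c + A^\top z = h \\
& \|z\| \le y
\end{aligned}
\tag{5.38}
\]

In conclusion, the dual of an SOC problem is another SOC problem (maximizing
−(𝑦𝑑 + 𝑏⊤𝑧) is the same as minimizing 𝑦𝑑 + 𝑏⊤𝑧, with the sign of the optimal
value reversed). Since an SOC problem is convex, we can conclude that the
optimal value of the dual is equal to that of the primal as long as Slater's
conditions are fulfilled.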
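
Weak duality can be illustrated numerically without calling a solver. The sketch below assumes NumPy and builds a hypothetical random instance in which a primal-feasible 𝑥 and a dual-feasible pair (𝑦, 𝑧) are known by construction; the dual value is taken as −𝑦𝑑 − 𝑏⊤𝑧, the constant term of Equation (5.31).

```python
import numpy as np

rng = np.random.default_rng(1)
m, n = 4, 3
A = rng.standard_normal((m, n))
b = rng.standard_normal(m)
c = rng.standard_normal(n)

# dual-feasible pair: ||z|| <= y
z = rng.standard_normal(m)
y = np.linalg.norm(z) + 0.5
h = y * c + A.T @ z                      # enforces Equation (5.32)

# primal-feasible point: d is chosen so that ||A x + b|| <= c'x + d holds
x = rng.standard_normal(n)
d = np.linalg.norm(A @ x + b) - c @ x + 1.0

primal = h @ x                           # primal objective at the feasible x
dual = -(y * d + b @ z)                  # dual function at the feasible (y, z)
print(primal >= dual, primal - dual)
```

The gap is non-negative for any feasible pair; it closes only at the optimal solutions of both problems.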

5.3 Semidefinite programming

Another important cone for mathematical optimization is the cone generated
by positive semidefinite matrices. Here we are interested in solving problems
with the following structure:
min tr(𝐶𝑋)
𝐴𝑋 = 𝐵 (5.39)
𝑋⪰0
where 𝐴, 𝐵, 𝐶, 𝑋 are matrices. This problem looks very different from the con-
vex problems we have studied so far; decision variables are now matrices
𝑋 ∈ ℝ𝑛×𝑛 and the objective function is the trace of a matrix product. More
importantly, there is a constraint of form 𝑋 ⪰ 0 that indicates the matrix is
positive-semidefinite1 .
Before studying these problems, let us review some basic concepts from
matrix algebra.

5.3.1 Trace, determinant, and the Schur complement

The trace is an operator that takes a square matrix 𝐴 and returns a scalar equal
to the sum of the entries in the diagonal, as follows:
tr(𝐴) = 𝑎11 + 𝑎22 + ⋯ + 𝑎𝑛𝑛 (5.40)
This operator has some useful properties, namely
● tr(𝐴 + 𝐵) = tr(𝐴) + tr(𝐵)
● tr(𝛼𝐴) = 𝛼 tr(𝐴)
● tr(𝐴) = tr(𝐴⊤)
● tr(𝐴⊤𝐵) = tr(𝐴𝐵⊤)
● tr(𝐴𝐵) = tr(𝐵𝐴) but in general, tr(𝐴𝐵) ≠ tr(𝐴) tr(𝐵)
● if 𝐴 ≻ 0 and 𝐵 ≻ 0 then tr(𝐴𝐵) ≥ 0
● 𝑥⊤𝐻𝑥 = tr(𝐻𝑥𝑥⊤)
● tr(𝐻) = ∑ᵢ 𝜆𝑖 where 𝜆𝑖 are the eigenvalues of 𝐻

1 Notice the symbol is different from an inequality.

Example 5.6. Let us experiment in Python by taking random matrices 𝐴, 𝐵

and checking the aforementioned properties:
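import numpy as np
n = 10
A = np.random.rand(n,n)
B = np.random.rand(n,n)
# additivity: both numbers coincide
print("sum",np.trace(A+B), np.trace(A)+np.trace(B))
# the trace is not multiplicative: the numbers differ in general
print("prd",np.trace(A@B), np.trace(A)*np.trace(B))
# cyclic property: tr(AB) = tr(BA)
print("cyc",np.trace(A@B), np.trace(B@A))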

The determinant is another operator that takes a square matrix and returns
a scalar. It has the following properties:

● A matrix 𝐴 has a unique inverse if det(𝐴) ≠ 0

● det(𝐴𝐵) = det(𝐴) det(𝐵) but in general det(𝐴 + 𝐵) ≠ det(𝐴) + det(𝐵)
● det(𝐴⊤ ) = det(𝐴)
● If a matrix is triangular, then the determinant is the product of the entries in
the main diagonal.
● det(𝐴⁻¹) = 1∕ det(𝐴)
● det(𝐴) = det(𝑃⁻¹𝐴𝑃)
● det(𝐴) = ∏ᵢ 𝜆𝑖 where 𝜆𝑖 are the eigenvalues of 𝐴

These are useful properties of the determinant, which is a complex and inter-
esting operator2 . The following example shows a geometric interpretation of
the determinant.
Example 5.7. As we have seen, a quadratic form 𝑥⊤ 𝐻𝑥 with 𝐻 ≻ 0 may have
different interpretations. For instance, it can generate an ellipsoid ℰ defined as
follows:
\[
\mathcal{E} = \left\{ x \in \mathbb{R}^n : x^\top H x \le 1 \right\}
\tag{5.41}
\]

Since 𝐻 is positive definite, it admits a Cholesky factorization, namely:

\[ H = (H^{1/2})^\top H^{1/2} \tag{5.42} \]

which allows us to define the linear transformation \( y = H^{1/2} x \) that, in
turn, transforms the ellipsoid into the set ℬ = {𝑦 ∈ ℝ𝑛 ∶ 𝑦⊤𝑦 ≤ 1}. This is a
unit ball, whose hypervolume is known3. The determinant allows us to calculate
the volume of the ellipsoid, as shown in Figure 5.3.

Figure 5.3 Area of an ellipsoid seen as a linear transformation of a unit ball

2 We suppose the student is familiar with the way the determinant is calculated. A student
interested in a formal definition can refer to [21] pg 632.
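
As a sketch of this idea in ℝ², the area of the ellipse can be computed from the determinant: since 𝑥 = 𝐻^{−1/2}𝑦, areas are scaled by det(𝐻^{−1/2}) = 1∕√det(𝐻). The matrix `H` below is a hypothetical example, and the result is cross-checked against the classical formula 𝜋𝑎𝑏, where the semi-axes are the inverse square roots of the eigenvalues.

```python
import numpy as np

H = np.array([[2.0, 0.5], [0.5, 1.0]])     # hypothetical positive definite matrix

# area of the unit ball (pi) scaled by the determinant of the inverse transformation
area_det = np.pi / np.sqrt(np.linalg.det(H))

# cross-check: the semi-axes of the ellipse are 1/sqrt(lambda_i)
lam = np.linalg.eigvalsh(H)
area_axes = np.pi * np.prod(1.0 / np.sqrt(lam))
print(area_det, area_axes)
```

Both expressions agree because the determinant is the product of the eigenvalues.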

To end this review of matrix algebra, let us define the Schur complement.
Consider the following block matrix

\[
M = \begin{pmatrix} A & B \\ B^\top & C \end{pmatrix}
\tag{5.43}
\]

then, we define the Schur complement of 𝑀 with respect to 𝐶 as follows:

\[ A - B C^{-1} B^\top \tag{5.44} \]

If 𝐶 ≻ 0, then 𝑀 ⪰ 0 if and only if this Schur complement is positive
semidefinite.
We can also define the Schur complement with respect to 𝐴 as given below:

\[ C - B^\top A^{-1} B \tag{5.45} \]

The Schur complement is useful because we can identify when a matrix is
positive semidefinite by evaluating the condition in some sub-matrices.
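
The test can be sketched numerically as follows, assuming NumPy and a positive definite block 𝐶. The helpers `is_psd` and `schur_test` are illustrative names, and the blocks are hypothetical data chosen so that one case passes and the other fails.

```python
import numpy as np

def is_psd(S, tol=1e-9):
    """True if all eigenvalues of the symmetric matrix S are non-negative."""
    return bool(np.all(np.linalg.eigvalsh(S) >= -tol))

def schur_test(A, B, C):
    """PSD test of M = [[A, B], [B.T, C]] through A - B C^{-1} B.T (C positive definite)."""
    return is_psd(A - B @ np.linalg.solve(C, B.T))

A = np.array([[2.0, 0.0], [0.0, 2.0]])
B = np.array([[1.0], [1.0]])
results = []
for c in (2.0, 0.5):                       # hypothetical values of the 1x1 block C
    C = np.array([[c]])
    M = np.block([[A, B], [B.T, C]])
    results.append((is_psd(M), schur_test(A, B, C)))
print(results)
```

In both cases the direct eigenvalue test on 𝑀 and the test on the smaller Schur complement give the same answer.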

3 The hypervolume is the generalization of measurements such as length, area, and volume
for ℝ𝑛. The hypervolume of a unit ball in ℝ² is the area, i.e., 𝜋 ⋅ 1² = 𝜋.

5.3.2 Cone of semidefinite matrices

The set of semidefinite matrices defines a convex cone. To prove it, consider the
following set:

Ω = {𝑋 ∈ ℝ𝑛×𝑛 ∶ 𝑋 ⪰ 0} (5.46)

This set defines a cone since if 𝑋 ∈ Ω then 𝛼𝑋 is also in Ω (𝛼𝑋 is positive

semidefinite if 𝑋 ⪰ 0 and 𝛼 ≥ 0). Now consider two matrices 𝑋, 𝑌 ∈ Ω and
two scalars 𝛼, 𝛽 such that 𝛼 + 𝛽 = 1 and 𝛼, 𝛽 ≥ 0. Since an intermediate matrix
𝑍 = 𝛼𝑋 + 𝛽𝑌 also belongs to Ω, we conclude the set is convex.
A semidefinite programming problem or SDP is any problem that includes
semidefinite constraints as in Equation (5.39). Far from being a meaningless or
obscure mathematical theory, SDP is a practical tool for different optimization
problems, as shown in the following examples.

Example 5.8. An SOC is a particular case of an SDP. Consider the following
SOC:

\[ \|u\| \le t \tag{5.47} \]

which is equivalent (with 𝑡 ≥ 0) to

\[ u^\top u \le t^2 \tag{5.48} \]
\[ t^2 - u^\top I u \ge 0 \tag{5.49} \]

where 𝐼 is the identity matrix. This constraint can be transformed into a
semidefinite constraint using the Schur complement as follows:

\[
\begin{pmatrix} tI & u \\ u^\top & t \end{pmatrix} \succeq 0
\tag{5.50}
\]

As a direct consequence of this, there is a hierarchy among SOC, SDP, QP, and
LP problems as given in Figure 5.4, since linear programming (LP), quadratic
programming (QP), and quadratically constrained quadratic programming
(QCQP) are particular cases of SOC.
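
The link between Equation (5.47) and Equation (5.50) can be observed through the eigenvalues of the block matrix: its smallest eigenvalue equals 𝑡 − ‖𝑢‖, so the matrix is positive semidefinite exactly when ‖𝑢‖ ≤ 𝑡. The sketch below assumes NumPy; `arrow` is an illustrative helper and the vector `u` is a hypothetical example.

```python
import numpy as np

def arrow(u, t):
    """Matrix of Equation (5.50) for a vector u and a scalar t."""
    n = len(u)
    return np.block([[t * np.eye(n), u.reshape(n, 1)],
                     [u.reshape(1, n), np.array([[t]])]])

u = np.array([3.0, 4.0])                   # ||u|| = 5
checks = []
for t in (6.0, 4.0):
    lam_min = np.linalg.eigvalsh(arrow(u, t)).min()
    checks.append((t, lam_min, t - np.linalg.norm(u)))
    print(checks[-1])
```

For 𝑡 = 6 the point belongs to the cone and the matrix is positive definite; for 𝑡 = 4 it does not, and a negative eigenvalue appears.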

Example 5.9. Semidefinite programming is of practical interest because it is

possible to solve SDP problems using efficient algorithms. However, some prob-
lems can be represented either as SDP or SOC problems. Below, we present the
solution in Python of a random instance of the following optimization problem:

Figure 5.4 Venn diagram that represents the relation among different conic
problems

min 𝑡
𝑐⊤ 𝑢 ≥ 1 (5.51)
‖𝑢‖ ≤ 𝑡

import numpy as np
import cvxpy as cvx

n = 10
c = np.random.rand(n)
u = cvx.Variable(n)
t = cvx.Variable()

# SOC model
obj = cvx.Minimize(t)
res = [c@u >= 1, cvx.SOC(t,u)]
ModelSOC = cvx.Problem(obj,res)
ModelSOC.solve()
print("SOC:",obj.value)
print("Time:",ModelSOC.solver_stats.solve_time)

# SDP model: X = [[t*I, u], [u', t]] must be positive semidefinite
I = np.eye(n)
X = cvx.Variable((n+1,n+1),symmetric=True)
obj = cvx.Minimize(t)
res = [c@u >= 1, X >>0]
res += [X[n,n] == t]
for k in range(n):
    res += [ X[k,n] == u[k]]
    res += [ X[n,k] == u[k]]
    for m in range(n):
        res += [X[k,m] == t*I[k,m]]
ModelSDP = cvx.Problem(obj,res)
ModelSDP.solve()
print("SDP:",obj.value)
print("Time:",ModelSDP.solver_stats.solve_time)

A semidefinite constraint is generated by a new variable 𝑋 and the symbol >>
that indicates positive semidefinite4. It is interesting to see the difference
in computation time for SDP and SOC problems. Usually, SOC is solved more
efficiently than SDP. The student is invited to experiment with different sizes
of the problem, changing the values of 𝑛.

Example 5.10. Semidefinite programming is a highly flexible tool for mod-

eling; as we have seen, SOC, QP, and LP are particular SDP cases. Many
other optimization problems can be represented as SDP models. Table 5.1
shows a list of optimization constraints that can be transformed into an SDP
constraint.

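Table 5.1 Constraints that can be transformed to SDP.

Constraint                                  Semidefinite constraint

\( \|u\| \le t \)                           \( \begin{pmatrix} tI & u \\ u^\top & t \end{pmatrix} \succeq 0 \)

\( x^2 + y^2 \le 1 \)                       \( \begin{pmatrix} 1+x & y \\ y & 1-x \end{pmatrix} \succeq 0 \)

\( \dfrac{(c^\top x)^2}{d^\top x} \le t \)  \( \begin{pmatrix} d^\top x & c^\top x \\ c^\top x & t \end{pmatrix} \succeq 0 \)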

5.3.3 Duality in SDP

Duality theory can be easily extended to SDP problems, for this, consider the
following problem:

4 It is essential to remember that >> is different from >=. The first symbol indicates the
matrix is positive semidefinite, whereas the second means that all the matrix entries are
non-negative.

min tr(𝐶𝑋)
tr(𝐴𝑖 𝑋) = 𝑏𝑖 ∀𝑖 (5.52)
𝑋⪰0

where affine constraints are represented using a trace function. Let us define a
Lagrangian function as follows:
\[
\mathcal{L}(X, z, Y) = \operatorname{tr}(CX) + \sum_i z_i \left( b_i - \operatorname{tr}(A_i X) \right) - \operatorname{tr}(YX)
\tag{5.53}
\]

where 𝑌 is a square-symmetric and positive semidefinite matrix (𝑌 = 𝑌 ⊤ ⪰ 0)

and 𝑧 is a vector. We can define the dual function just as in the case of general
convex optimization problems, namely:

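\[ \mathcal{W}(z, Y) = \inf_{X} \mathcal{L}(X, z, Y) \tag{5.54} \]

The following conditions are required to guarantee the existence of this
infimum5:
\[ \operatorname{tr}(CX) - \sum_i z_i \operatorname{tr}(A_i X) - \operatorname{tr}(YX) = 0 \tag{5.55} \]
\[ \operatorname{tr}\Big( CX - \sum_i z_i A_i X - YX \Big) = 0 \tag{5.56} \]
\[ \therefore\ \sum_i z_i A_i + Y = C \tag{5.57} \]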

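therefore, the dual problem takes the following form

\[
\begin{aligned}
\max\ & b^\top z \\
& Y + \sum_i z_i A_i = C \\
& Y \succeq 0
\end{aligned}
\tag{5.58}
\]

Just as in any optimization problem, the dual problem is such that dual ≤
primal (weak duality), with equality when the Slater conditions hold.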
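
The inequality dual ≤ primal can be illustrated with a small numerical sketch. It assumes NumPy and hypothetical random data: a positive semidefinite 𝑋 defines the right-hand sides 𝑏𝑖, and a positive semidefinite 𝑌 together with an arbitrary 𝑧 defines 𝐶 through Equation (5.57), so both points are feasible by construction.

```python
import numpy as np

rng = np.random.default_rng(2)
n, k = 4, 3

def random_psd(size):
    """Random positive semidefinite matrix G G'."""
    G = rng.standard_normal((size, size))
    return G @ G.T

A = [(G + G.T) / 2 for G in rng.standard_normal((k, n, n))]   # symmetric data matrices
X = random_psd(n)                                   # primal-feasible point
b = np.array([np.trace(Ai @ X) for Ai in A])        # makes tr(A_i X) = b_i hold
z = rng.standard_normal(k)
Y = random_psd(n)                                   # dual matrix, Y >= 0
C = Y + sum(zi * Ai for zi, Ai in zip(z, A))        # enforces Equation (5.57)

gap = np.trace(C @ X) - b @ z                       # primal value minus dual value
print(gap, np.trace(Y @ X))
```

The gap equals tr(𝑌𝑋), which is non-negative because both matrices are positive semidefinite.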

5.4 Semidefinite approximations

In Example 5.3, we showed how a hyperbolic constraint might be transformed
into a second-order cone constraint. Similarly, we can convert some non-convex
quadratic constraints into semidefinite approximations.

5 Here, we use the fact that the trace is distributive with respect to the sum.

Let us consider the following quadratic equality constraint:

𝑥 ⊤ 𝐻𝑥 = 1 (5.59)

where 𝐻 is a square matrix. This constraint is evidently non-convex, even in the

case of 𝐻 semidefinite (recall equality constraints must be affine). In order to
find a semidefinite approximation, we define a new matrix 𝑋 = 𝑥𝑥 ⊤ , namely:

\[
X = x x^\top =
\begin{pmatrix}
x_1 x_1 & x_1 x_2 & \dots & x_1 x_n \\
x_2 x_1 & x_2 x_2 & \dots & x_2 x_n \\
\vdots & \vdots & & \vdots \\
x_n x_1 & x_n x_2 & \dots & x_n x_n
\end{pmatrix}
\tag{5.60}
\]
We can express the quadratic form as function of this matrix as given below6 :

𝑥 ⊤ 𝐻𝑥 = tr(𝐻𝑋) (5.61)

Evidently, 𝑋 is positive semidefinite and rank(𝑋) = 1. Therefore, the following
set of constraints is equivalent to Equation (5.59):

tr(𝐻𝑋) = 1
𝑋⪰0 (5.62)
rank(𝑋) = 1

This set is convex except for the rank constraint. Therefore, we can generate a
convex approximation by relaxing this constraint. Notice the semidefinite con-
dition is imposed in 𝑋 and not in 𝐻. Thus, the approximation is very general
for quadratic and hyperbolic problems.

Example 5.11. Consider the following quadratic optimization problem with

quadratic constraints:

min 𝑥 ⊤ 𝑄𝑥
𝑥⊤ 𝑥 = 1 (5.63)

with 𝑥 ∈ ℝ2 and 𝑄 = 𝑄⊤ ∈ ℝ2×2 given below:

\[
Q = \begin{pmatrix} 1 & 0.3 \\ 0.3 & -2 \end{pmatrix}
\tag{5.64}
\]

This problem is evidently non-convex; however, given its small size, it can be
solved easily using Lagrange multipliers. To do so, we define the following

6 Recall properties given in Section 5.3.1

Figure 5.5 Graphical representation for the problem in Equation (5.63): constraint
(dark line), level curves of the objective function (light lines)

Lagrangian function:

ℒ(𝑥, 𝜆) = 𝑥 ⊤ 𝑄𝑥 + 𝜆(1 − 𝑥 ⊤ 𝑥) (5.65)

with the following optimal conditions

\[ Qx - \lambda x = 0 \tag{5.66} \]
\[ x^\top x = 1 \tag{5.67} \]

Therefore, we can conclude that 𝜆 are the eigenvalues of 𝑄 and 𝑥 are the
corresponding unit eigenvectors. Since 𝑥⊤𝑄𝑥 = 𝜆𝑥⊤𝑥 = 𝜆 at these points, the
optimal value is just the minimum eigenvalue. Figure 5.5 shows the level curves
of the objective function and the equality constraint in the space 𝑥0, 𝑥1. The
minimum and the maximum are achieved at different points, and each of them is
attained at two opposite points, 𝑥 and −𝑥; that is to say, the solution is not
unique.
This problem was easy to solve because it was defined in ℝ²; the situation
becomes more and more complex as 𝑛 increases since we are required to evaluate
all possible eigenvalues and their corresponding eigenvectors.
Let us solve the problem using a convex approximation. We define a new
matrix 𝑋 = 𝑥𝑥 ⊤ and the following optimization model which is completely
equivalent to Equation (5.63):

min tr(𝑄𝑋)
tr(𝑋) = 1
𝑋⪰0 (5.68)
rank(𝑋) = 1

Now, we relax the rank constraint obtaining a semidefinite optimization prob-

lem with the following coding in Python:

import numpy as np
import cvxpy as cvx

Q = np.array([[1,0.3],[0.3,-2]])
X = cvx.Variable((2,2),symmetric=True)
fo = cvx.Minimize(cvx.trace(Q@X))
# relaxed model: the rank constraint is dropped
re = [X >> 0, cvx.trace(X)==1]
SDaprox = cvx.Problem(fo,re)
SDaprox.solve()
print('Aprox',fo.value)
# eigenvalues of Q; the smallest one is the optimum of Equation (5.63)
print('Optimal',np.linalg.eigvals(Q))

We have transformed Model Equation (5.63) with two decision variables

to Model Equation (5.68) with 4 decision variables. However, Model
Equation (5.68) is convex (after relaxing the rank constraint) whereas Model
Equation (5.63) is not. Finding a convex model improves the problem; in
most applications, we prefer a large convex model instead of a medium-size
non-convex problem.
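
For this particular example, the quality of the relaxation can be examined with NumPy alone. The sketch below builds the rank-one matrix 𝑋 = 𝑣𝑣⊤ from the unit eigenvector 𝑣 of the smallest eigenvalue of 𝑄 and evaluates the relaxed model on it; the variable names are illustrative. Since tr(𝑄𝑋) ≥ 𝜆min tr(𝑋) for any 𝑋 ⪰ 0, no feasible point of the relaxed model can do better than this candidate.

```python
import numpy as np

Q = np.array([[1.0, 0.3], [0.3, -2.0]])
lam, V = np.linalg.eigh(Q)          # eigenvalues in ascending order
v = V[:, 0]                         # unit eigenvector of the smallest eigenvalue
X = np.outer(v, v)                  # rank-one candidate for the relaxed model

objective = np.trace(Q @ X)
print(objective, lam[0])                               # both equal the minimum eigenvalue
print(np.trace(X), np.linalg.matrix_rank(X, tol=1e-9)) # feasibility and rank
```

The candidate is feasible, has rank one, and attains the lower bound, so the relaxation is exact in this case and 𝑥 = ±𝑣 solves the original problem.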

Example 5.12. Binary problems may be solved using semidefinite approxima-

tions. Consider the following quadratic problem with binary constraints

min 𝑥 ⊤ 𝑄𝑥
𝑥𝑖 ∈ {−1, 1} (5.69)

In this case, the problem is binary although it takes values of 𝑥𝑖 = ±1 instead

of 0, 1 as usual. This binary constraint is transformed into a quadratic equality
constraint that is equivalent:

𝑥𝑖² = 1 (5.70)

Now we proceed as in the previous example obtaining a semidefinite model:

min tr(𝑄𝑋)
diag(𝑋) = 1 (5.71)
𝑋⪰0

This approximation may be efficient in some problems, for example, in the max-
cut problem [26, 27].

5.5 Polynomial optimization

A polynomial optimization problem is a model that can be represented as
follows:

min 𝑓(𝑥)
𝑝(𝑥) = 0 (5.72)
𝑞(𝑥) ≤ 0

where 𝑓, 𝑝, and 𝑞 are multivariable polynomials. These types of problems
are, in principle, non-convex and highly complex. Polynomial optimization
is very general since it encompasses linear, quadratic, and second-order
optimization problems as well as many non-convex problems. In addition, binary
constraints can also be transformed into polynomial constraints. For example,
a binary variable that takes values 𝑥 = ±1 may be represented as 𝑥² = 1.
More importantly, polynomial optimization problems can be efficiently
transformed into semidefinite optimization problems, as presented in this
section.
A particular type of polynomial is one that can be represented as a
sum-of-squares (SOS), namely:
\[
p(x) = \sum_i (q_i(x))^2
\tag{5.73}
\]
where 𝑞𝑖 are multivariable polynomials. This type of problem can be transformed
into semidefinite equivalents that are easier to solve. Below, we present a
series of examples that show how to transform SOS problems into SDP. Our
exposition is practically oriented and based on the work presented by Parrilo
et al. in [28].
The main idea is that any SOS polynomial can be written in the form
presented below:
\[
p(x) = m(x)^\top Q \, m(x)
\tag{5.74}
\]
where 𝑚(𝑥) is a vector of monomials associated to 𝑝, and 𝑄 is a positive
semidefinite matrix.

Example 5.13. Let us consider the following multivariate polynomial,

namely:

𝑝(𝑥, 𝑦) = 2𝑥⁴ + 5𝑥² + 𝑦² + 2𝑦𝑥² − 2𝑥𝑦 (5.75)

This polynomial is SOS since it can be represented as follows:

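\[
p(x, y) =
\begin{pmatrix} x \\ y \\ x^2 \end{pmatrix}^{\!\top}
\begin{pmatrix} 5 & -1 & 0 \\ -1 & 1 & 1 \\ 0 & 1 & 2 \end{pmatrix}
\begin{pmatrix} x \\ y \\ x^2 \end{pmatrix}
\tag{5.76}
\]
The vector of monomials is 𝑚(𝑥, 𝑦) = (𝑥, 𝑦, 𝑥²)⊤ and a matrix 𝑄 is defined as
follows:
\[
p(x, y) =
\begin{pmatrix} x \\ y \\ x^2 \end{pmatrix}^{\!\top}
\begin{pmatrix} q_{00} & q_{01} & q_{02} \\ q_{10} & q_{11} & q_{12} \\ q_{20} & q_{21} & q_{22} \end{pmatrix}
\begin{pmatrix} x \\ y \\ x^2 \end{pmatrix}
\tag{5.77}
\]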
Therefore, the polynomial may be written as presented below:

𝑝(𝑥, 𝑦) = 𝑞00 𝑥² + (𝑞01 + 𝑞10)𝑥𝑦 + (𝑞02 + 𝑞20)𝑥³ + 𝑞11 𝑦² + (𝑞12 + 𝑞21)𝑦𝑥² + 𝑞22 𝑥⁴ (5.78)

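Matching term by term with (5.75), the following set of constraints is obtained:

$$\begin{aligned} q_{00} &= 5 \\ q_{01} + q_{10} &= -2 \\ q_{02} + q_{20} &= 0 \\ q_{11} &= 1 \\ q_{12} + q_{21} &= 2 \\ q_{22} &= 2 \\ Q &\succeq 0 \end{aligned} \tag{5.79}$$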

This is a feasibility problem that can be solved in Python, as presented below:

import numpy as np
import cvxpy as cvx

# Gram matrix associated with the monomial vector m(x,y) = (x, y, x^2)
Q = cvx.Variable((3,3))
obj = cvx.Minimize(0)  # feasibility problem: there is nothing to minimize
# coefficient-matching constraints (5.79)
res = [Q[0,0] == 5,
       Q[0,1]+Q[1,0] == -2,
       Q[0,2]+Q[2,0] == 0,
       Q[1,1] == 1,
       Q[1,2]+Q[2,1] == 2,
       Q[2,2] == 2,
       Q >> 0.01]  # semidefinite constraint with a small margin
SOS = cvx.Problem(obj,res)
SOS.solve()
print(np.round(Q.value,3))

The result of this script is the following matrix:

$$Q = \begin{pmatrix} 5 & -1 & 0 \\ -1 & 1 & 1 \\ 0 & 1 & 2 \end{pmatrix} \tag{5.80}$$

The eigenvalues of this matrix are 𝜆 = (5.25, 0.23, 2.52). All of them are positive, so the matrix is positive semidefinite. Therefore, 𝑝(𝑥, 𝑦) is a sum-of-squares polynomial.
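The certificate of Example 5.13 can be cross-checked numerically without an optimization solver. The sketch below is only an illustration: it evaluates the polynomial (5.75) and the quadratic form built with the matrix (5.80) at random points (the seed and the sampling range are arbitrary assumptions) and computes the eigenvalues of 𝑄 with NumPy.

```python
import numpy as np

# Gram matrix reported in (5.80) for the monomial vector m = (x, y, x^2)
Q = np.array([[5.0, -1.0, 0.0],
              [-1.0, 1.0, 1.0],
              [0.0, 1.0, 2.0]])

def p(x, y):
    """Polynomial of Example 5.13, Equation (5.75)."""
    return 2*x**4 + 5*x**2 + y**2 + 2*y*x**2 - 2*x*y

def gram_form(x, y):
    """Quadratic form m^T Q m evaluated at the point (x, y)."""
    m = np.array([x, y, x**2])
    return m @ Q @ m

rng = np.random.default_rng(0)            # fixed seed, illustrative only
points = rng.uniform(-3, 3, size=(1000, 2))
max_gap = max(abs(p(x, y) - gram_form(x, y)) for x, y in points)
eigenvalues = np.linalg.eigvalsh(Q)       # ascending order for a symmetric matrix

print("largest mismatch between p and m^T Q m:", max_gap)
print("eigenvalues of Q:", np.round(eigenvalues, 2))
```

If the mismatch is at the level of rounding error and every eigenvalue is positive, the matrix is a valid SOS certificate for the polynomial.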

Example 5.14. Let us consider the polynomial given in Equation (5.81),

$$p(x) = x^4 - 30x^2 + 8x - 15 \tag{5.81}$$

A plot of this polynomial is depicted in Figure 5.6.

From the plot, we can conclude that the function is non-convex and has two local minima at 𝑥 ≈ −3.94 and 𝑥 ≈ 3.8, with 𝑝(−3.94) ≈ −271.25 and 𝑝(3.8) ≈ −209.28, respectively. The global optimum is therefore ≈ −271.25. Now, we formulate the problem as an SOS optimization model. If 𝑝(𝑥) − 𝑡 is SOS, then 𝑝(𝑥) ≥ 𝑡 for all 𝑥, so we look for the largest lower bound 𝑡, as presented below:

$$\begin{aligned} \max\ & t \\ & x^4 - 30x^2 + 8x - 15 - t \in SOS \end{aligned} \tag{5.82}$$

Figure 5.6 Plot of polynomial 𝑝(𝑥) = 𝑥⁴ − 30𝑥² + 8𝑥 − 15


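The SOS factorization of this polynomial is given below:

$$x^4 - 30x^2 + 8x - 15 - t = \begin{pmatrix} 1 \\ x \\ x^2 \end{pmatrix}^{\top} \begin{pmatrix} q_{11} & q_{12} & q_{13} \\ q_{21} & q_{22} & q_{23} \\ q_{31} & q_{32} & q_{33} \end{pmatrix} \begin{pmatrix} 1 \\ x \\ x^2 \end{pmatrix} \tag{5.83}$$

Matching term by term, we obtain the following equivalent semidefinite programming model:

$$\begin{aligned} \max\ & t \\ & Q = Q^\top \succeq 0 \\ & q_{33} = 1 \\ & 2q_{23} = 0 \\ & q_{22} + 2q_{13} = -30 \\ & 2q_{12} = 8 \\ & q_{11} = -15 - t \end{aligned} \tag{5.84}$$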

The optimal solution to this problem is obtained using CVXPY. The optimal value is 𝑡 = −271.2461, which is the expected value according to Figure 5.6. Therefore, we have found the global optimum of the problem. The critical step in this problem was to transform a question from the space of multivariate polynomials to the space of positive semidefinite matrices and solve the resulting model.
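The value reported for Example 5.14 can be cross-checked without a semidefinite solver, because the polynomial has a single variable and its stationary points are the real roots of a cubic. The following sketch is an independent check, not the SDP model itself; it uses NumPy's polynomial helpers to locate the global minimum of (5.81).

```python
import numpy as np

# p(x) = x^4 - 30x^2 + 8x - 15, coefficients from highest to lowest degree
p = np.poly1d([1.0, 0.0, -30.0, 8.0, -15.0])

# stationary points: real roots of p'(x) = 4x^3 - 60x + 8
roots = np.roots(p.deriv().coeffs)
stationary = np.sort(np.real(roots[np.abs(np.imag(roots)) < 1e-9]))

values = p(stationary)                 # polynomial evaluated at each stationary point
x_best = stationary[np.argmin(values)]
p_best = values.min()

print("stationary points:", np.round(stationary, 3))
print("global minimum: p(%.3f) = %.4f" % (x_best, p_best))
```

The smallest stationary value should agree with the optimal 𝑡 of (5.84) up to solver tolerance, since the SOS bound is the largest 𝑡 for which 𝑝(𝑥) − 𝑡 remains non-negative.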

5.6 Further readings

Second-order cone optimization and semidefinite programming are two of the most common types of conic optimization problems. Other applications and theoretical details can be studied in [29] and [30]. A review of linear algebra is essential to understand these concepts. See for example [16] and [21], where there is an excellent presentation of basic concepts such as determinants and quadratic forms. This chapter showed fundamental mathematical aspects. However, conic optimization has several power systems applications, especially in the optimal power flow problem. This aspect will be presented in Chapter 10.
Some aspects associated with polynomial optimization and, in particular, the sum-of-squares problems were also presented in this chapter. A complete study of this subject from the point of view of algebraic geometry is found in [28].


Importantly, SOS is not the only type of polynomial optimization problem. However, SOS problems have the advantage of being representable as SDP and, therefore, they can be solved efficiently by means of convex optimization algorithms, such as the interior point method. Another approach closely related to SOS and polynomial optimization is the Lasserre hierarchy. A good review of this subject can be found in [31] and [32].

5.7 Exercises
1. Make a function in Python that generates random square-semidefinite
matrices of size 𝑛 × 𝑛.
2. Generate random instances of the following quadratically constrained
quadratic program:

min 𝑥⊤ 𝐻𝑥 + 𝑟⊤ 𝑥
𝑥 ⊤ 𝑀𝑥 + 𝑏⊤ 𝑥 + 𝑐 ≤ 0 (5.85)

Where 𝐻 and 𝑀 are positive semidefinite. Transform the problem into an

SOC model and solve in Python.
3. Generate an example in Python for each of the cases given in Table 5.1.
4. Make a plot of the average execution time vs 𝑛 for both the SOC and the
SDP representation. Analyze the results.
5. A linear programming problem can be represented as an SDP prob-
lem. Generate a random instance of a linear programming problem and
transform it in an SDP. Implement the code in Python and analyze the
results.
6. A simple way to calculate an area, volume, or hypervolume is by means of Monte Carlo integration. Suppose we are interested in calculating the shaded area in Figure 5.7; then we proceed as follows: first, we initialize a variable 𝑠 = 0; next, we generate random values of 𝑥, 𝑦 such that −𝑥max ≤ 𝑥 ≤ 𝑥max and −𝑦max ≤ 𝑦 ≤ 𝑦max; if the point (𝑥, 𝑦) belongs to the area (such as the area A) then 𝑠 ← 𝑠 + 1; we repeat this procedure for 𝑛 iterations (𝑛 must be large in order to obtain a good approximation); the area can be calculated as (𝑠∕𝑛)(4𝑥max 𝑦max). Use this procedure to calculate the area of the ellipsoid given in Example 5.7 for a random positive definite matrix 𝐻. Analyze also the case of calculating the volume of an ellipsoid in ℝ³ and a hypervolume in ℝ⁴.
Figure 5.7 Example of a Monte Carlo integration. We add only the points inside the area to be calculated.

7. The exponential function is convex; however, it can be approximated by an SOC constraint. Use a Taylor expansion around 0 and generate an SOC approximation of the following set:

$$\begin{aligned} & \exp(x) \le z \\ & -\frac{1}{2} \le x \le \frac{1}{2} \end{aligned} \tag{5.86}$$

Evaluate the accuracy of the approximation.

8. Solve the following SDP using Python:

$$\min\ x + y \tag{5.87}$$

$$\begin{pmatrix} x & 1 \\ 1 & y \end{pmatrix} \succeq 0 \tag{5.88}$$

$$x + y \le 3 \tag{5.89}$$

Plot the feasible set, formulate and solve the dual problem, and analyze the results.
9. Semidefinite programming models can be solved by interior point methods. In this case, a constraint of the form 𝑋 ⪰ 0 can be penalized by a barrier function given by 𝜙(𝑋) = −𝜇 ln(det(𝑋)). Use this fact to solve the following feasibility problem:

$$\begin{aligned} \min\ & 0 \\ & X \succeq 0 \\ & \operatorname{tr}(X) = 1 \end{aligned} \tag{5.90}$$

The following equations can be useful:

$$\begin{aligned} \frac{\partial \ln(\det(X))}{\partial X} &= X^{-1} \\ \frac{\partial \operatorname{tr}(AX)}{\partial X} &= A \end{aligned} \tag{5.91}$$

where both 𝐴 and 𝑋 are square matrices.
10. Generate a random instance of the problem presented in Example 5.12 for 𝑄 ∈ ℝ¹⁰. Find the optimal solution using a brute-force search and a semidefinite approximation. Compare the solutions.


Robust optimization

Learning outcomes

By the end of this chapter, the student will be able to:

● Represent uncertainties through an uncertainty set.
● Formulate simple, robust optimization problems.

6.1 Stochastic vs robust optimization

So far, we have been concerned with problems of the following form:

$$\min\ f(x, \beta) \tag{6.1}$$

where 𝑥 ∈ ℝⁿ is a vector of decision variables, and 𝛽 is a vector of parameters. However, it is common that 𝛽 is not completely known. Hence, we must make decisions in the presence of uncertainty. There are two classical ways to deal with this problem, namely: stochastic optimization and robust optimization. In stochastic optimization, it is assumed that the probability distribution of 𝛽 is known. We need extensive statistical data to define this distribution. In practice, this information may not be available. In robust optimization, we require less information since we only assume that 𝛽 is in a closed uncertainty set 𝒰. In this case, our goal is to find a solution that is optimal in the sense of 𝑓 and, at the same time, feasible regardless of the value of 𝛽 ∈ 𝒰. We want the best solution in the worst-case scenario. Below, we present a brief introduction to stochastic optimization, and next, we present the robust approach.


6.1.1 Stochastic approach

There are several ways to deal with randomness and risk in stochastic optimiza-
tion. One of the strategies, called here and now, assumes that the optimization
problem is solved at the beginning of the planning horizon, taking into account
future uncertainty. The second strategy is called wait and see. In this strategy, we
make some decisions at the first stage and make modifications throughout the
planning horizon. In both cases, we require a suitable model of the stochastic
process.
A common objective for stochastic problems is to minimize an expected value, as presented below:

$$\begin{aligned} \min\ & \mathbb{E}\left(f(x, \beta)\right) \\ & \beta \in \mathcal{M} \end{aligned} \tag{6.2}$$

where ℳ represents the probability space in which 𝛽 lives. There are many ways to represent ℳ. For example, it may be represented as a set of discrete scenarios with a given probability, or as a continuous distribution function. In any case, the challenge is not only the representation of the uncertainty but also the tractability of the resulting optimization model.

6.1.2 Robust approach

In a robust optimization problem, we have a continuous uncertainty set 𝒰 that contains 𝛽. This set plays a role similar to that of ℳ in the stochastic approach. However, we do not have additional information related to the probability, i.e., we know that 𝛽 belongs to 𝒰, but we do not know which regions are more probable. Therefore, we can define an optimization problem that seeks to minimize 𝑓 under the worst-case scenario of 𝛽, as given in (6.3),

$$\min_{x} \left\{ \sup_{\beta \in \mathcal{U}} f(x, \beta) \right\} \tag{6.3}$$

A critical step in the problem above is defining the uncertainty set 𝒰 to obtain a tractable and realistic problem. In the following sections, we present how to define this set and how it is related to the objective function and the problem's constraints. We use the following simple linear programming problem to present the main concepts:

$$\begin{aligned} \min\ & c^\top x \\ & a^\top x \le b \end{aligned} \tag{6.4}$$

where 𝑐, 𝑎, and/or 𝑏 may be uncertain parameters of the model.
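The difference between the two approaches can be illustrated with a toy problem. The sketch below is not taken from the text: the cost 𝑓(𝑥, 𝛽) = (𝑥 − 𝛽)², the three scenarios, and their probabilities are assumptions chosen only for illustration. The stochastic decision minimizes the expected cost as in (6.2), while the robust decision minimizes the worst case over the interval [1, 6] as in (6.3); for this convex cost the worst case is attained at an endpoint of the interval, and both endpoints are in the scenario list.

```python
import numpy as np

def f(x, beta):
    """Toy cost (assumed for illustration): squared mismatch between x and beta."""
    return (x - beta) ** 2

scenarios = np.array([1.0, 2.0, 6.0])         # possible values of beta
probabilities = np.array([0.5, 0.3, 0.2])     # known only in the stochastic case
x_grid = np.linspace(0.0, 7.0, 701)           # candidate decisions, step 0.01

costs = f(x_grid[:, None], scenarios[None, :])   # rows: decisions, columns: scenarios
expected_cost = costs @ probabilities            # objective of (6.2)
worst_cost = costs.max(axis=1)                   # inner supremum of (6.3)

x_stochastic = x_grid[np.argmin(expected_cost)]
x_robust = x_grid[np.argmin(worst_cost)]
print("stochastic decision:", round(x_stochastic, 2))
print("robust decision:    ", round(x_robust, 2))
```

The stochastic decision sits at the mean of 𝛽, which depends on the probabilities, whereas the robust decision sits at the midpoint of the interval and ignores them.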


6.2 Polyhedral uncertainty

Let us consider the case in which 𝑎 is contained in a polyhedral uncertainty set 𝒰 ⊂ ℝⁿ defined as follows:

$$\mathcal{U} = \left\{ a : D^\top a \le d \right\} \tag{6.5}$$

The problem seems highly complex since both 𝑎 and 𝑥 are unknown. However, the problem can be solved in two steps: first, we determine a model with a worst-case outcome, and then we optimize this model. The worst-case outcome for (6.4) is given below:

$$\left( \sup_{a \in \mathcal{U}} a^\top x \right) \le b \tag{6.6}$$

Then, the following optimization problem arises:

$$\mathcal{P}(a) = \left\{ \sup_{a \in \mathcal{U}} a^\top x \right\} \tag{6.7}$$

This is a linear programming problem, where the decision variables are 𝑎 (the vector 𝑥 is held fixed), and the uncertainty set 𝒰 is a polytope given by (6.5). Therefore, we can define a primal problem 𝒫(𝑎) written as follows:

$$\mathcal{P}(a) = \left\{ \begin{aligned} \max\ & a^\top x \\ & D^\top a \le d \end{aligned} \right\} \tag{6.8}$$

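Next, we define the dual problem 𝒟(𝑦), as given below¹:

$$\mathcal{D}(y) = \left\{ \begin{aligned} \min\ & y^\top d \\ & D y = x \\ & y \ge 0 \end{aligned} \right\} \tag{6.9}$$

Any feasible 𝑦 gives an upper bound 𝑦⊤𝑑 on the supremum in (6.6). Therefore, the following linear programming problem is obtained after replacing into (6.4):

$$\begin{aligned} \min\ & c^\top x \\ & y^\top d \le b \\ & D y = x \\ & y \ge 0 \end{aligned} \tag{6.10}$$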
The previous analysis considered only data uncertainty in the constraints.
However, the objective function may also be subject to uncertainty. In that case,

1 The reader is invited to review Section 3.5 for studying the basic duality theory.


it is convenient to transform (6.4) into the following equivalent representation:

$$\begin{aligned} \min\ & z \\ & c^\top x \le z \\ & a^\top x \le b \end{aligned} \tag{6.11}$$

Thus, the uncertainty set must now include the values of 𝑐. The following example helps to understand this model.

Example 6.1. Let us consider the following optimization problem:

$$\begin{aligned} \min\ & 3x_0 + 5x_1 \\ & x_0 + x_1 \ge 1 \\ & x_0, x_1 \ge 0 \end{aligned} \tag{6.12}$$

The solution to this problem is 𝑥 = (1, 0) with objective function 𝑧 = 3. Let us suppose now that the coefficients in the objective function are uncertain, with a maximum deviation of ±0.5. The following polyhedral uncertainty set is defined:

$$\mathcal{U} = \left\{ (c_0, c_1) : 2.5 \le c_0 \le 3.5,\ 4.5 \le c_1 \le 5.5 \right\} \tag{6.13}$$

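We define the worst-case outcome as follows:

$$\begin{aligned} \min\ & z \\ & \left( \sup_{(c_0,c_1) \in \mathcal{U}} c_0 x_0 + c_1 x_1 \right) \le z \\ & x_0 + x_1 \ge 1 \\ & x_0, x_1 \ge 0 \end{aligned} \tag{6.14}$$

The primal problem 𝒫 associated with the uncertainty set is given below:

$$\begin{aligned} \max\ & c_0 x_0 + c_1 x_1 \\ & \begin{pmatrix} 1 & 0 \\ -1 & 0 \\ 0 & 1 \\ 0 & -1 \end{pmatrix} \begin{pmatrix} c_0 \\ c_1 \end{pmatrix} \le \begin{pmatrix} 3.5 \\ -2.5 \\ 5.5 \\ -4.5 \end{pmatrix} \end{aligned} \tag{6.15}$$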


Now, we formulate the dual model to obtain a robust equivalent, namely:

$$\begin{aligned} \min\ & z \\ & 3.5y_0 - 2.5y_1 + 5.5y_2 - 4.5y_3 \le z \\ & y_0 - y_1 = x_0 \\ & y_2 - y_3 = x_1 \\ & x_0 + x_1 \ge 1 \\ & x_0, x_1 \ge 0 \\ & y_0, y_1, y_2, y_3 \ge 0 \end{aligned} \tag{6.16}$$

The solution to this problem is 𝑧 = 3.5. This solution is, indeed, worse than the solution of (6.12). However, it is the best solution in the worst-case scenario, i.e., it is a robust solution.
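The result of Example 6.1 can be verified by enumeration, since a linear function attains its supremum over a box at one of the vertices. The sketch below is an illustrative check rather than a general method: it restricts the candidate decisions to the segment 𝑥₀ + 𝑥₁ = 1 (an assumption that holds here because all costs are positive, so the constraint is active) and then builds a dual certificate for (6.16).

```python
import itertools
import numpy as np

# box uncertainty set (6.13) written as A c <= d, as in (6.15)
A = np.array([[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0], [0.0, -1.0]])
d = np.array([3.5, -2.5, 5.5, -4.5])
vertices = np.array(list(itertools.product((2.5, 3.5), (4.5, 5.5))))

def worst_case_cost(x):
    """sup of c^T x over the box; a linear function attains it at a vertex."""
    return float(np.max(vertices @ np.asarray(x, dtype=float)))

# candidate decisions on the segment x0 + x1 = 1 with x >= 0
candidates = [np.array([s, 1.0 - s]) for s in np.linspace(0.0, 1.0, 11)]
x_best = min(candidates, key=worst_case_cost)

# dual certificate for (6.16): y = (x0, 0, x1, 0) is feasible and attains the bound
y = np.array([x_best[0], 0.0, x_best[1], 0.0])
print("robust decision:", x_best, "worst-case cost:", worst_case_cost(x_best))
print("dual objective:", d @ y, "dual feasible:", np.allclose(A.T @ y, x_best))
```

The worst-case cost of the best candidate and the dual objective coincide, which is the strong-duality argument behind the robust equivalent.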

6.3 Linear problems with norm uncertainty

Uncertainty may also be represented by a closed ball ℬ in ℝⁿ. In that case, all coefficients 𝑎 are uncertain. However, we might know they are inside a ball centered at 𝛼 with radius 𝛿, as follows:

$$\begin{aligned} \min\ & c^\top x \\ & a^\top x \le b \\ & a \in \mathcal{B} = \left\{ a \in \mathbb{R}^n : a = \alpha + \delta\xi, \text{ with } \|\xi\| \le 1 \right\} \end{aligned} \tag{6.17}$$

where ‖⋅‖ is any norm in ℝⁿ. As in the case of polyhedral uncertainty, we need to determine the worst-case outcome, which is given below:

$$\left( \sup_{a \in \mathcal{B}} a^\top x \right) \le b \tag{6.18}$$

However, this is not a linear programming problem, since the set of feasible solutions, ℬ, is non-linear. Therefore, the following optimization problem arises:

$$\begin{aligned} \max\ & \alpha^\top x + \delta\xi^\top x \\ & \|\xi\| \le 1 \end{aligned} \tag{6.19}$$


Table 6.1 Dual norms for the most common cases.

Norm      Dual norm
norm-1    norm-∞
norm-2    norm-2
norm-∞    norm-1

where the decision variables are 𝜉. We can formulate the dual model associated with this problem and proceed in the same way as in the previous cases. Nevertheless, a more systematic way to solve the problem is by defining a new function called dual norm, as follows:

$$\|y\|_{\mathcal{D}} = \sup \left\{ y^\top x : \|x\| \le 1 \right\} \tag{6.20}$$

This norm holds all properties presented in Section 2.2. In addition, the dual of the dual norm is the original norm, i.e., the dual norm of ‖⋅‖𝒟 is again ‖⋅‖. Table 6.1 shows the dual norms of the three most common cases.
With this useful definition, the supremum of 𝛿𝜉⊤𝑥 over ‖𝜉‖ ≤ 1 in (6.19) equals 𝛿‖𝑥‖𝒟, so we can easily formulate a tractable model for (6.17), namely:

$$\begin{aligned} \min\ & c^\top x \\ & \alpha^\top x + \delta \|x\|_{\mathcal{D}} \le b \end{aligned} \tag{6.21}$$

Notice that this is a convex optimization problem since a norm is a convex function. The problem might be reduced to a linear programming problem for the cases of the 1-norm and the ∞-norm. It is a second-order cone optimization problem for the case of the 2-norm.
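The pairs listed in Table 6.1 can be checked numerically from the definition (6.20). The sketch below is a minimal illustration: for each norm it builds a point of the unit ball that maximizes 𝑦⊤𝑥 (the helper `dual_norm_maximizer` and the test vector are hypothetical names and values introduced here) and compares the attained value with the corresponding dual norm computed by NumPy.

```python
import numpy as np

def dual_norm_maximizer(y, p):
    """Point x with ||x||_p <= 1 that maximizes y^T x, for p in {1, 2, inf}."""
    y = np.asarray(y, dtype=float)
    if p == 1:                       # put all the weight on the largest |y_i|
        x = np.zeros_like(y)
        i = np.argmax(np.abs(y))
        x[i] = np.sign(y[i])
    elif p == 2:                     # align x with y
        x = y / np.linalg.norm(y, 2)
    elif p == np.inf:                # every entry at +/-1 with the sign of y_i
        x = np.sign(y)
    else:
        raise ValueError("only p = 1, 2, inf are covered in this sketch")
    return x

y = np.array([3.0, -4.0, 1.0])             # arbitrary nonzero vector
dual_of = {1: np.inf, 2: 2, np.inf: 1}     # pairs listed in Table 6.1
results = {}
for p, p_dual in dual_of.items():
    x_star = dual_norm_maximizer(y, p)
    results[p] = (y @ x_star, np.linalg.norm(y, p_dual))
    print("norm-%s: sup = %.4f, dual norm = %.4f" % (p, *results[p]))
```

In each row the supremum over the unit ball matches the norm in the second column of the table, which is the statement of (6.20) for these three cases.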
Example 6.2. Let us consider the following linear programming problem:

$$\begin{aligned} \min\ & z = -8x_0 - 7x_1 - 9x_2 \\ & x_0 + x_1 + x_2 \le 10 \\ & x \ge 0 \end{aligned} \tag{6.22}$$

This problem has an optimum at 𝑥 = (0, 0, 10)⊤ with 𝑧 = −90. Now, let us consider the case in which the coefficients associated with the first constraint are 𝑎 = (𝑎₀, 𝑎₁, 𝑎₂)⊤ with

$$(a_0 - 1)^2 + (a_1 - 1)^2 + (a_2 - 1)^2 \le 0.1 \tag{6.23}$$

This constraint is equivalent to saying that 𝑎 = (1, 1, 1)⊤ + 𝜉 with ‖𝜉‖₂ ≤ √0.1 ≈ 0.3162. Therefore, the equivalent robust optimization problem is given by the following model:


$$\begin{aligned} \min\ & -8x_0 - 7x_1 - 9x_2 \\ & x_0 + x_1 + x_2 + 0.3162 \|x\|_2 \le 10 \\ & x \ge 0 \end{aligned} \tag{6.24}$$

The optimal solution of this problem is 𝑧 = −70.1356 with 𝑥̃ = (2.685, 0, 5.406)⊤.
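The numbers reported in Example 6.2 can be checked directly. The following sketch does not solve the model; it only evaluates the robust constraint (6.24) at the nominal and robust solutions and samples random coefficient vectors on the boundary of the ball (6.23) to confirm that the robust solution remains feasible (the seed and the number of samples are arbitrary assumptions).

```python
import numpy as np

c = np.array([-8.0, -7.0, -9.0])
alpha = np.ones(3)                  # nominal coefficients of the constraint
delta = np.sqrt(0.1)                # radius of the ball in (6.23), about 0.3162
b = 10.0

def robust_lhs(x):
    """Left-hand side of (6.24): alpha^T x + delta * ||x||_2."""
    return alpha @ x + delta * np.linalg.norm(x, 2)

x_nominal = np.array([0.0, 0.0, 10.0])      # optimum of (6.22)
x_robust = np.array([2.685, 0.0, 5.406])    # solution reported for (6.24)

# random coefficient vectors on the boundary of the uncertainty ball
rng = np.random.default_rng(1)              # fixed seed, illustrative only
xi = rng.normal(size=(5000, 3))
xi /= np.linalg.norm(xi, axis=1, keepdims=True)
a_samples = alpha + delta * xi
worst_sampled = (a_samples @ x_robust).max()

print("robust lhs at nominal solution:", round(robust_lhs(x_nominal), 3))
print("robust lhs at robust solution: ", round(robust_lhs(x_robust), 3))
print("worst sampled a^T x:", round(worst_sampled, 3), " objective:", c @ x_robust)
```

The nominal solution violates the robust constraint, whereas the robust solution satisfies it for every sampled coefficient vector at the price of a worse objective.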

6.4 Deﬁning the uncertainty set

The uncertainty set can be defined using statistical information about the parameters. For example, it might be the case that 𝑎 ∈ ℝ is a single variable which is normally distributed, i.e., 𝑎 ∼ 𝒩(𝛼, 𝜎), where 𝛼 is the mean and 𝜎 is the standard deviation. The probability density function of 𝑎 is presented below:

$$\psi(a) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left( -\frac{1}{2} \left( \frac{a - \alpha}{\sigma} \right)^2 \right) \tag{6.25}$$

We can define a confidence interval 𝒰 for 𝑎. Hence, the probability that 𝑎 belongs to 𝒰 is given by (6.26),

$$\text{Prob}(a \in \mathcal{U}) = \int_{\mathcal{U}} \psi(a)\, da \tag{6.26}$$

With this simple approach, we can define the uncertainty set for a single parameter 𝑎. The main idea is to replace the constraint in (6.4) by a chance constraint as follows:

$$\text{Prob}(ax \le b) \ge \eta \tag{6.27}$$

where 𝜂 is the probability given by (6.26). This equation implies that the probability of meeting the constraint is at least a given value 𝜂.
Example 6.3. Figure 6.1 shows the probability density function for a parameter 𝑎 with mean 𝛼 = 20 and standard deviation 𝜎 = 1. The set 𝒰 constitutes a confidence interval for the robust optimization problem. Two uncertainty sets are defined in Figure 6.1, namely: 𝒰𝐴 = [19, 21] and 𝒰𝐵 = [18, 22]. The set 𝒰𝐴 includes the values within ±𝜎 of the mean, with a probability Prob(𝑎 ∈ 𝒰𝐴) ≈ 68%, while the set 𝒰𝐵 includes the values within ±2𝜎 of the mean and its probability is Prob(𝑎 ∈ 𝒰𝐵) ≈ 95%. These probabilities were calculated using (6.26).
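The two probabilities of Example 6.3 follow from the cumulative distribution function of the normal distribution, which can be written with the error function. The sketch below uses only the standard library; the helper names are introduced here for illustration.

```python
import math

def normal_cdf(a, mean, std):
    """Cumulative distribution function of a univariate normal variable."""
    return 0.5 * (1.0 + math.erf((a - mean) / (std * math.sqrt(2.0))))

def interval_probability(lower, upper, mean=20.0, std=1.0):
    """Prob(lower <= a <= upper), i.e. the integral (6.26) over an interval."""
    return normal_cdf(upper, mean, std) - normal_cdf(lower, mean, std)

prob_A = interval_probability(19.0, 21.0)   # one standard deviation around the mean
prob_B = interval_probability(18.0, 22.0)   # two standard deviations around the mean
print(f"Prob(a in U_A) = {prob_A:.4f}")
print(f"Prob(a in U_B) = {prob_B:.4f}")
```

A wider interval gives a higher probability of containing 𝑎 and therefore a more conservative robust model.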
Figure 6.1 Probability density function for a variable with mean 𝛼 = 20 and standard deviation 𝜎 = 1

This same idea may be applied for 𝑎 in higher dimensions, for example, 𝑎 ∈ ℝⁿ. In that case, the univariate normal distribution is replaced by a multivariate normal distribution with the following probability density function:

$$\psi(a) = \frac{1}{\sqrt{(2\pi)^n \det(S)}} \exp\left( -\frac{1}{2} (a - \alpha)^\top S^{-1} (a - \alpha) \right) \tag{6.28}$$

where 𝛼 is a vector of mean values, and 𝑆 is a positive definite covariance matrix. This multivariate distribution can be used to define the uncertainty set as presented in the next example.

Figure 6.2 A multivariate normal distribution in ℝ²

Example 6.4. Figure 6.2 depicts a probability density function for ℝ². The confidence intervals are now replaced by the following confidence regions,

$$\mathcal{U}_\gamma = \left\{ a \in \mathbb{R}^2 : (a - \alpha)^\top S^{-1} (a - \alpha) \le \gamma \right\} \tag{6.29}$$

These confidence regions are ellipsoids, as depicted in Figure 6.3.

Figure 6.3 Confidence regions for a multivariate normal distribution in ℝ²

Example 6.5. The probability associated with a confidence region 𝒰𝛾 can be calculated using a numerical approach based on Monte Carlo simulation. Let us consider a multivariate normal distribution with 𝛼 = (5, 3) and 𝑆 = diag(3.24, 0.25). We use the function multivariate_normal from the module scipy.stats to generate 𝑛points random scenarios of 𝑎; next, we count the number of scenarios in which 𝑎 ∈ 𝒰𝛾; this number is stored in a variable 𝜂in; the probability Prob(𝑎 ∈ 𝒰𝛾) is given by 𝜂in∕𝑛points. This approach is more precise for a high value of 𝑛points. The code in Python is presented below:

import numpy as np
from scipy.stats import multivariate_normal

alpha = np.array([5,3])             # mean vector
S = np.array([[3.24,0],[0,0.25]])   # covariance matrix
p = multivariate_normal(alpha,S)
n_points = 10000
Sinv = np.linalg.inv(S)
gamma = 5
eta_in = 0                          # scenarios that fall inside the region
for k in range(n_points):
    a = p.rvs()                     # random scenario of a
    w = np.array(a-alpha)
    z = w.T@Sinv@w                  # left-hand side of (6.29)
    if z <= gamma:
        eta_in += 1
print('Prob = ',eta_in/n_points)

Notice this method can be applied to any distribution in ℝ𝑛 .
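For the particular case of a normal distribution in ℝ², the Monte Carlo estimate can be cross-checked analytically: the quadratic form (𝑎 − 𝛼)⊤𝑆⁻¹(𝑎 − 𝛼) follows a chi-square distribution with two degrees of freedom, whose cumulative distribution function is 1 − exp(−𝛾∕2). The sketch below compares this closed form with a vectorized version of the simulation; the seed and sample size are arbitrary assumptions, and NumPy's random generator is used instead of SciPy only to keep the block self-contained.

```python
import numpy as np

alpha = np.array([5.0, 3.0])
S = np.array([[3.24, 0.0], [0.0, 0.25]])
gamma = 5.0

# closed form, valid only for a normal distribution in R^2
prob_exact = 1.0 - np.exp(-gamma / 2.0)

# vectorized Monte Carlo estimate of the same probability
rng = np.random.default_rng(42)           # fixed seed, illustrative only
a = rng.multivariate_normal(alpha, S, size=200_000)
w = a - alpha
z = np.einsum("ij,jk,ik->i", w, np.linalg.inv(S), w)   # w_i^T S^-1 w_i for each row
prob_mc = np.mean(z <= gamma)

print(f"closed form: {prob_exact:.4f}")
print(f"Monte Carlo: {prob_mc:.4f}")
```

Both values are close to 92% for 𝛾 = 5. In higher dimensions the closed form changes, but the simulation applies without modification.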

A linear constraint 𝑎⊤𝑥 ≤ 𝑏 with 𝑎 ∼ 𝒩(𝛼, 𝑆) can be transformed into a robust constraint by defining a confidence ellipsoid given by (6.29). Without loss of generality, we define a new parameter 𝛿 = 𝑎 − 𝛼 ∼ 𝒩(0, 𝑆); therefore, the linear constraint takes the following form:

$$\alpha^\top x + \left( \sup_{\delta \in \mathcal{U}_\gamma} \delta^\top x \right) \le b \tag{6.30}$$

with

$$\mathcal{U}_\gamma = \left\{ \delta \in \mathbb{R}^n : \delta^\top S^{-1} \delta \le \gamma \right\} \tag{6.31}$$

This set can also be defined using a second-order cone, as presented below:

$$\mathcal{U}_\gamma = \left\{ \delta \in \mathbb{R}^n : \gamma^{-1/2} \left\| S^{-1/2} \delta \right\| \le 1 \right\} \tag{6.32}$$

where 𝑆⁻¹ᐟ² is a Cholesky factor of 𝑆⁻¹. This decomposition exists since 𝑆 is positive definite. Let us define a new variable 𝑧 = 𝛾⁻¹ᐟ²𝑆⁻¹ᐟ²𝛿; then, the robust constraint can be represented as follows:

$$\alpha^\top x + \left( \sup_{\|z\| \le 1} \gamma^{1/2} z^\top S^{1/2} x \right) \le b \tag{6.33}$$

The supremum on the left-hand side of (6.33) can be represented as the dual norm of norm-2. Consequently, the robust constraint is defined by the following second-order cone:

$$\alpha^\top x + \gamma^{1/2} \left\| S^{1/2} x \right\| \le b \tag{6.34}$$

This is, evidently, a convex constraint.
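The step from (6.30) to (6.34) can be checked numerically in ℝ². The sketch below is only an illustration with assumed data (the covariance matrix and the vector 𝑥 are arbitrary): it compares the closed form 𝛾¹ᐟ²‖𝑆¹ᐟ²𝑥‖ = √(𝛾 𝑥⊤𝑆𝑥) with the largest value of 𝛿⊤𝑥 found on a fine sampling of the boundary of the ellipsoid 𝛿⊤𝑆⁻¹𝛿 ≤ 𝛾.

```python
import numpy as np

S = np.array([[2.0, 0.6], [0.6, 1.0]])    # illustrative covariance (assumed)
gamma = 5.0
x = np.array([1.5, -0.7])                 # illustrative decision vector (assumed)

# closed form used in (6.34): gamma^(1/2) * ||S^(1/2) x|| = sqrt(gamma * x^T S x)
closed_form = np.sqrt(gamma * (x @ S @ x))

# maximizer of delta^T x on the ellipsoid delta^T S^-1 delta <= gamma
delta_star = np.sqrt(gamma) * (S @ x) / np.sqrt(x @ S @ x)

# sample the boundary of the ellipsoid: delta = sqrt(gamma) * L z with ||z|| = 1
L = np.linalg.cholesky(S)                 # S = L L^T
theta = np.linspace(0.0, 2.0 * np.pi, 2000)
z = np.vstack((np.cos(theta), np.sin(theta)))
deltas = np.sqrt(gamma) * (L @ z)         # one boundary point per column
sampled_sup = np.max(x @ deltas)

print("closed form:       ", round(closed_form, 4))
print("at the maximizer:  ", round(delta_star @ x, 4))
print("sampled supremum:  ", round(sampled_sup, 4))
```

The sampled supremum approaches the closed form from below, which is the behavior expected if the second-order cone constraint is the exact worst case over the ellipsoid.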

Example 6.6. Let us consider the following linear programming problem:

$$\begin{aligned} \min\ & q = -10x_0 - 15x_1 \\ & a_0 x_0 + a_1 x_1 \le 10 \\ & x_0, x_1 \ge 0 \end{aligned} \tag{6.35}$$

where 𝑎 = (𝑎₀, 𝑎₁)⊤ is normally distributed with mean 𝛼 = (5, 3) and covariance matrix 𝑆 = diag(3.24, 0.25) (the same parameters of Example 6.5). The solution to this linear programming problem for 𝑎 = 𝛼 is 𝑞 = −50 and 𝑥 = (0, 10∕3)⊤. The robust solution can be calculated with different degrees of robustness. For instance, for 𝛾 = 5 the confidence region has a probability of 92%. The equivalent robust optimization problem is coded in Python as presented below:

import numpy as np
import cvxpy as cvx

c = np.array([-10,-15])
alpha = np.array([5,3])
S = np.array([[3.24,0],[0,0.25]])
M = np.linalg.cholesky(S)   # lower-triangular factor, S = M M^T
r = np.sqrt(1/5)            # gamma^(-1/2) with gamma = 5
x = cvx.Variable(2, nonneg=True)
obj = cvx.Minimize(c.T@x)
# second-order cone form of (6.34): ||M^T x|| <= r*(10 - alpha^T x)
res = [cvx.SOC(r*10-r*alpha.T@x,M.T@x)]
Model = cvx.Problem(obj,res)
Model.solve(verbose=True)
print(np.round(x.value,3))
print(np.round(obj.value))

The solution of this problem is 𝑞 = −36 and 𝑥 = (0, 2.428)⊤. This objective is clearly worse than that of the base problem (−36 versus −50). However, this solution is robust enough to guarantee that the constraint is satisfied in at least 92% of the scenarios.
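The probability of satisfying the constraint at the robust solution can be evaluated in closed form, because 𝑎⊤𝑥 is a normal variable with mean 𝛼⊤𝑥 and variance 𝑥⊤𝑆𝑥. The following sketch uses the solution reported above and only the standard normal cumulative distribution; it is a check of the reported numbers, not part of the optimization model.

```python
import math
import numpy as np

alpha = np.array([5.0, 3.0])
S = np.array([[3.24, 0.0], [0.0, 0.25]])
b = 10.0
x = np.array([0.0, 2.428])            # robust solution reported in Example 6.6

# a^T x is normal with mean alpha^T x and variance x^T S x
mean = alpha @ x
std = math.sqrt(x @ S @ x)
prob_feasible = 0.5 * (1.0 + math.erf((b - mean) / (std * math.sqrt(2.0))))

print(f"mean = {mean:.3f}, std = {std:.3f}")
print(f"Prob(a^T x <= b) = {prob_feasible:.4f}")
```

The value is above the 92% associated with the ellipsoid, which suggests that the confidence region acts as a sufficient, and somewhat conservative, condition for the chance constraint.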

Another simple but common case of robust optimization is when the uncertainty is associated with 𝑏 in (6.4). In that case, the constraint may be transformed into a robust constraint as presented below:

$$a^\top x \le \phi_b^{-1}(1 - \eta) \tag{6.36}$$

where 𝜙𝑏⁻¹ is the quantile function² associated with the distribution of 𝑏, and 𝜂 is the probability of satisfying the constraint; the right-hand side is the value that 𝑏 exceeds with probability 𝜂. It is not required that this distribution is normal.

Example 6.7. Let us consider the following constraint:

$$8x + 15y \le b \tag{6.37}$$

where 𝑏 is normally distributed with mean 𝜇 = 10 and standard deviation 𝜎 = 1. We can see the histogram for this parameter by generating a high number of random scenarios and using the corresponding function in Matplotlib, as follows:

import numpy as np
import matplotlib.pyplot as plt
b = 10 + np.random.randn(10000)
plt.hist(b,20)
plt.grid()
plt.show()

The quantile with a given probability is obtained using the quantile function,
as presented below:

2 The quantile function is defined as 𝜙𝑏−1 (𝑝) = inf {𝑏 ∈ ℝ ∶ 𝐹(𝑏) ≥ 𝑝}; 𝐹 is the cumulative
distribution function.


print('Quantile at 98%: ',np.quantile(b,1-0.98))

This function results in a value of 𝑏 ≈ 7.97 (the exact value changes slightly between runs because the scenarios are random). Therefore, the robust constraint associated with (6.37) is (6.38),

$$8x + 15y \le 7.97 \tag{6.38}$$
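When the distribution of 𝑏 is known in closed form, the sampled quantile can be compared with the exact one. The sketch below uses `statistics.NormalDist` from the Python standard library for the normal distribution of Example 6.7; it is meant as a cross-check of the sampling approach, which remains the general-purpose option.

```python
from statistics import NormalDist

eta = 0.98                                # required probability of feasibility
b_dist = NormalDist(mu=10.0, sigma=1.0)   # distribution of b in Example 6.7

# Prob(b >= rhs) = eta  <=>  rhs is the (1 - eta)-quantile of b
rhs = b_dist.inv_cdf(1.0 - eta)
prob_check = 1.0 - b_dist.cdf(rhs)

print(f"robust right-hand side: {rhs:.3f}")
print(f"Prob(b >= rhs) = {prob_check:.3f}")
```

The sampled value differs from the exact quantile only by the statistical error of the 10000 random scenarios.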

Example 6.8. The numerical method presented in the previous example can be extended to obtain robust solutions in problems where 𝑏 has a distribution different from the normal distribution. That is the case of the wind velocity, which is often approximated by the Weibull distribution. However, the distribution of the generated power is different, since the output power of a wind turbine depends on its control. Usually, a wind turbine is controlled to obtain maximum efficiency for wind velocities between 0 and 𝑣nom; for wind velocities between 𝑣nom and 𝑣max, the turbine is controlled to obtain nominal power; finally, for wind velocities higher than 𝑣max, the turbine is blocked. This control can be represented by the following equation:

$$p(v) = \begin{cases} p_{\text{nom}} \left( v / v_{\text{nom}} \right)^3 & 0 \le v \le v_{\text{nom}} \\ p_{\text{nom}} & v_{\text{nom}} < v \le v_{\text{max}} \\ 0 & v > v_{\text{max}} \end{cases} \tag{6.39}$$

Let us consider a wind turbine with 𝑣nom = 12 and 𝑣max = 25. This turbine is located at an offshore site where the wind varies according to a Weibull distribution with scale factor 𝜆 = 13 and shape 𝑎 = 2. A histogram of this variable can be obtained as follows:
import numpy as np
import matplotlib.pyplot as plt
v = 13*np.random.weibull(2, 10000)
plt.hist(v,20)
plt.grid()
plt.show()

However, we may be interested in the distribution of the output power; therefore, we define (6.39) as a Python function, as presented below:

def wind_power(w):
    # power curve (6.39) with v_nom = 12, v_max = 25 and a nominal power of 2
    p = 0
    if (w>0) and (w<=12):
        p = 2*(w/12)**3
    if (w>12) and (w<=25):
        p = 2
    return p

pt = np.zeros(len(v))
for k in range(len(v)):
    pt[k] = wind_power(v[k])   # output power for each wind scenario
plt.hist(pt)
plt.grid()
plt.show()

Finally, we can evaluate the quantile function for different probabilities,

namely:

q = []
for k in range(100):
    q += [np.quantile(pt,1-k/100)]
plt.plot(q)
plt.grid()

This plot allows obtaining different quantiles according to the expected proba-
bility.
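The power curve (6.39) can also be written in vectorized form, which avoids the explicit loop over scenarios. The sketch below is one possible variant: the nominal power of 2 is taken from the listing above, while the seed, the sample size, and the selected probabilities are assumptions made for illustration.

```python
import numpy as np

def wind_power(v, p_nom=2.0, v_nom=12.0, v_max=25.0):
    """Vectorized version of the power curve (6.39)."""
    v = np.asarray(v, dtype=float)
    cubic = p_nom * (v / v_nom) ** 3
    return np.where(v <= v_nom, cubic, np.where(v <= v_max, p_nom, 0.0))

rng = np.random.default_rng(7)                # fixed seed, illustrative only
v = 13.0 * rng.weibull(2.0, size=100_000)     # Weibull with scale 13 and shape 2
pt = wind_power(v)

# power that is available with probability eta: the (1 - eta)-quantile of pt
for eta in (0.5, 0.8, 0.95):
    print(f"eta = {eta:.2f}: p >= {np.quantile(pt, 1.0 - eta):.3f}")
```

Each printed value plays the role of the right-hand side of (6.36) when the uncertain parameter is the wind power.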

6.5 Further readings

Robust optimization is a rich area of research with many impressive theoretical
results. Several applications can be found in scientific journals; for example, an
application for smart grids can be found in [33]. A general review of robust
optimization is available in [34].
Another approach to deal with the uncertainty in optimization problems is
using stochastic optimization. A complete analysis of this approach is beyond
the objectives of this book. An excellent presentation of this subject is given
in [35].

6.6 Exercises
1. Demonstrate the relation between the norm and dual norm for the cases
shown in Table 6.1. Use the definition of dual norm, given by (6.20), and the
theory presented in Section 2.2.
2. Consider a norm ‖𝑥‖ and its corresponding dual norm ‖𝑥‖𝒟. Show that these norms satisfy the following inequality:

$$x^\top y \le \|x\| \, \|y\|_{\mathcal{D}} \tag{6.40}$$

3. Example 6.2 used norm-2 to define the uncertainty set. Formulate and solve
the same problem but now using norm-1 and norm-∞.


4. Consider a matrix 𝐻 = 𝐻⊤ ≻ 0 and the function

$$n(x) = \sqrt{x^\top H x} \tag{6.41}$$

Show that 𝑛(𝑥) is a norm and determine its dual norm.
5. Consider the following optimization problem

$$\begin{aligned} \min\ & f(x) \\ & a^\top x \le b \end{aligned} \tag{6.42}$$

where 𝑎 is a vector contained in the ellipsoid ℰ = {𝑎 ∶ 𝑎⊤𝐻𝑎 ≤ 1}, with 𝐻 ≻ 0.
6. Solve the problem in Example 6.6 for different values of 𝛾. Show plots of 𝑞
vs 𝛾 and Probability vs 𝛾.
7. Repeat the calculations of Example 6.5 for 𝑎 ∈ ℝ³; consider a normal distribution 𝑎 ∼ 𝒩(𝛼, 𝑆) with 𝛼 = (15, 12, 25)⊤ and 𝑆 given by the following matrix:

$$S = \begin{pmatrix} 1.0 & 0.5 & 0.1 \\ 0.5 & 2.0 & 0.1 \\ 0.1 & 0.1 & 2.0 \end{pmatrix} \tag{6.43}$$
8. Find a robust counterpart for the following optimization problem:

$$\begin{aligned} \min\ & ax + by + cz \\ & x + y + z = 100 \\ & 0 \le x \le \delta \\ & 0 \le y \le \delta \\ & 0 \le z \le \delta \end{aligned} \tag{6.44}$$

with 𝛿 = 50; 𝑎 ∼ 𝒩(30, 4), 𝑏 ∼ 𝒩(31, 2), and 𝑐 ∼ 𝒩(32, 1).
9. Solve the previous problem but now with 𝛿 ∼ 𝒩(50, 3).
10. Robust optimization tends to be too conservative since robust solutions may occur when all the parameters deviate simultaneously to the worst condition. This condition, although robust, may be unlikely in practice. One way to qualify the solution is by using cardinality constrained uncertainty. In this approach, the uncertainty set is represented as a polyhedron; however, we allow at most Γ coefficients to deviate. Let us consider Model (6.4) with 𝑎 = 𝑎̄ ± 𝛿, with the following primal problem:

$$\begin{aligned} \max\ & \delta^\top v \\ & v_i = |x_i| w_i \\ & \sum_i w_i \le \Gamma \\ & 0 \le w_i \le 1 \end{aligned} \tag{6.45}$$

Formulate the dual problem associated with (6.45) and the robust counterpart of the original problem.


Part II

Power systems operation


Economic dispatch of thermal units

Learning outcomes

By the end of this chapter, the student will be able to:

● Formulate the problems of economic and environmental dispatch.
● Include constraints related to the active power loss.
● Include constraints related to the transmission lines’ capacity con-
sidering the transportation model or the linear power ﬂow.

7.1 Economic dispatch

The economic dispatch of thermal units was one of the first applications of mathematical programming to power systems operation. Historically, the first implementations of economic dispatch models coincided with the development of computers, which made it possible to perform automatic calculations efficiently and in real time [36]. The problem consists of determining the most economical manner of operation to supply a given load condition. Each thermal power plant has a different relationship between input (i.e., fuel) and output (i.e., electric power) according to the type of fuel, thermodynamic cycle, and particular plant characteristics. Therefore, a cost function is defined for each thermal unit. Figure 7.1 depicts schematically the economic dispatch problem for three thermal units supplying a single load. In its most basic form, the effect of the grid is neglected, leading to an optimization problem as given in Equation (7.1),

$$\begin{aligned} \min\ & \sum_{k \in \mathcal{T}} f_k(p_k) \\ & \sum_{k \in \mathcal{T}} p_k = d \end{aligned} \tag{7.1}$$

where 𝒯 is the set of thermal units, 𝑓ₖ is the cost function for each unit 𝑘 ∈ 𝒯, 𝑝ₖ is the generated power, and 𝑑 is the total demand. In this model, a number of simplifications were made on the manner in which power systems would be operated. For instance, power losses, grid constraints, and capacity of the generation units were neglected (these aspects will be considered later on in this chapter). Model (7.1) can be solved by the method of Lagrange multipliers with the following Lagrangian:

$$\mathcal{L}(p, \lambda) = \sum_{k \in \mathcal{T}} f_k(p_k) + \lambda \left( d - \sum_{k \in \mathcal{T}} p_k \right) \tag{7.2}$$

The first-order condition for an optimal solution is obtained by differentiating ℒ with respect to 𝑝ₖ:

$$\frac{\partial f_k}{\partial p_k} - \lambda = 0 \tag{7.3}$$

The value of 𝜕𝑓ₖ∕𝜕𝑝ₖ is known as incremental cost. Therefore, the optimal dispatch is obtained when the incremental costs of all thermal units are the same. The second condition is obtained by differentiating ℒ with respect to 𝜆 and gives the power balance (i.e., the sum of the generation must be equal to the demand).
Cost functions are usually represented as quadratic functions, as given in Equation (7.4),

$$f_k(p_k) = \frac{a_k}{2} p_k^2 + b_k p_k + c_k \tag{7.4}$$

where 𝑎ₖ, 𝑏ₖ, 𝑐ₖ are constants fitted from data of the input-output relation of each thermal unit. A quadratic representation of the thermal units simplifies the problem enormously. The optimality conditions are the following set of linear equations that can be easily solved in practice:

$$a_k p_k + b_k = \lambda, \quad \forall k \in \mathcal{T} \tag{7.5}$$

$$\sum_{k \in \mathcal{T}} p_k = d \tag{7.6}$$


Figure 7.1 Three thermal units with their respective cost functions for the economic
dispatch problem.

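Actual power units have limits of minimum and maximum generation (𝑝min, 𝑝max) that must be included into the model as follows:

$$\begin{aligned} \min\ & \sum_{k \in \mathcal{T}} f_k(p_k) \\ & \sum_{k \in \mathcal{T}} p_k = d \\ & p_k^{\min} \le p_k \le p_k^{\max}, \quad \forall k \in \mathcal{T} \end{aligned} \tag{7.7}$$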

The effect of these inequality constraints is shown in Figure 7.2. We may obtain
the most economical dispatch by solving a set of linear equations if the solution
is within the operating limits (that is the case for 𝜆𝐴 ). However, one unit may
achieve full load before the others, as is the case of Unit 2 for the incremental
cost 𝜆𝐵 . In that case, Unit 2 is set to the maximum (𝑝2 = 𝑝max ) and the rest
of the demand is supplied by the other two units, at equal incremental cost. In
general, the problem is solved as a quadratic optimization problem as presented
in the following examples.


Figure 7.2 Incremental cost for three thermal units considering capability
limits.

Example 7.1. Let us consider a system with two thermal units that supply a demand of 𝑑 = 200 MW, where the cost function of each unit is given below:

$$f_0(p_0) = \frac{0.31}{2} p_0^2 + 38 p_0 \tag{7.8}$$

$$f_1(p_1) = \frac{0.22}{2} p_1^2 + 46 p_1 \tag{7.9}$$

Our objective is to supply the demand at the minimum cost. Therefore, we look for a stationary point of the following Lagrangian:

$$f_0(p_0) + f_1(p_1) + \lambda (d - p_0 - p_1) \tag{7.10}$$

which results in the following set of linear equations:

$$0.31 p_0 + 38 - \lambda = 0 \tag{7.11}$$

$$0.22 p_1 + 46 - \lambda = 0 \tag{7.12}$$

$$p_0 + p_1 = 200 \tag{7.13}$$

The solution of this linear system is 𝑝₀ = 98.11, 𝑝₁ = 101.88 and 𝜆 = 68.42.
The code in Python for solving this simple problem is presented below:

import numpy as np

# rows: equations (7.11), (7.12), (7.13); unknowns: p0, p1, lambda
A = [[0.31,0,-1],[0,0.22,-1],[1,1,0]]
b = [-38,-46,200]
x = np.linalg.solve(A,b)
print(x)

This problem was simple enough to be solved without any optimization solver.
However, as the number of variables increases, the problem becomes more
complicated. In addition, the presence of box constraints for the maximum and
minimum power generation requires the use of a general quadratic
programming solver.


Table 7.1 Cost functions and operative limits for a system with six thermal units [37].

Unit   pmin   pmax   ak         bk        𝜶k          𝜷k
       (MW)   (MW)   ($∕MWh²)   ($∕MWh)   (lb∕MWh²)   (lb∕MWh)
T0     10     125    0.30494    38.5390   0.00838      0.32767
T1     10     150    0.21174    46.1591   0.00838      0.32767
T2     35     210    0.07092    38.3055   0.01366     −0.54551
T3     35     225    0.05606    40.3965   0.01366     −0.54551
T4     125    315    0.03598    38.2704   0.00922     −0.51116
T5     130    325    0.04222    36.3278   0.00922     −0.51116

Example 7.2. Consider a system with six units with parameters given in Table
7.1 and demand 𝑑 = 1200 MW.
For the sake of simplicity, the values of 𝑐𝑘 were set to zero¹. The economic
dispatch is a quadratic programming problem, given by Equation (7.7),
that can be coded in Python as follows:
import numpy as np
import cvxpy as cvx

pmin = [10,10,35,35,125,130]
pmax = [125,150,210,225,315,325]
a = np.diag([0.30494,0.21174,0.07092,0.05606,0.03598,0.04222])
b = [38.5390,46.1591,38.3055,40.3965,38.2704,36.3278]
d = 1200
p = cvx.Variable(6)
# Quadratic cost: 1/2*p'*a*p + b'*p
obj = cvx.Minimize(1/2*cvx.quad_form(p,a) + np.array(b)@p)
# The balance is written as >=; it is active at the optimum because costs grow with p
res = [cvx.sum(p) >= d, p >= pmin, p <= pmax]
Model = cvx.Problem(obj,res)
Model.solve()
print(p.value)
print('Incremental Cost:',res[0].dual_value)

The economic dispatch for this case is 𝑝 = (66, 59, 210, 225, 315, 325)⊤ and the
incremental cost is 𝜆 = 58.66 (the reader is invited to run the code to
verify the results). Notice that the objective function is convex since 𝑎𝑘 ≥ 0.
Moreover, since 𝑎𝑘 > 0 for every unit, the function is strictly convex; therefore,
this is the global and unique optimum of the problem.
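
The result of Example 7.2 can be cross-checked without an optimization solver. The following is a minimal sketch, not the method used in the book: it performs a bisection on the incremental cost 𝜆, setting each unit to the output at which its incremental cost equals 𝜆 and clipping that output to the limits of Table 7.1. The function names `dispatch_at` and `equal_incremental_cost` are hypothetical and only serve this illustration.

```python
# Data of Table 7.1 (quadratic cost a/2*p^2 + b*p, limits in MW)
a = [0.30494, 0.21174, 0.07092, 0.05606, 0.03598, 0.04222]
b = [38.5390, 46.1591, 38.3055, 40.3965, 38.2704, 36.3278]
pmin = [10, 10, 35, 35, 125, 130]
pmax = [125, 150, 210, 225, 315, 325]
d = 1200


def dispatch_at(lam):
    """Output of every unit when its incremental cost a*p + b equals lam, clipped to its limits."""
    return [min(max((lam - bk) / ak, lo), hi)
            for ak, bk, lo, hi in zip(a, b, pmin, pmax)]


def equal_incremental_cost(demand, iterations=100):
    """Bisection on lam until the total generation matches the demand."""
    lam_lo = min(ak * lo + bk for ak, bk, lo in zip(a, b, pmin))
    lam_hi = max(ak * hi + bk for ak, bk, hi in zip(a, b, pmax))
    for _ in range(iterations):
        lam = 0.5 * (lam_lo + lam_hi)
        if sum(dispatch_at(lam)) < demand:
            lam_lo = lam   # not enough generation: raise the incremental cost
        else:
            lam_hi = lam
    return lam, dispatch_at(lam)


lam, p = equal_incremental_cost(d)
print([round(pk) for pk in p], round(lam, 2))
```

The sketch assumes that the demand lies between the sum of the minimum and the sum of the maximum outputs; units that hit a limit simply stay there while the remaining units share the rest of the demand at equal incremental cost.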
Example 7.3. Linear models are common for the economic dispatch problem
in modern electricity markets. In that case, the optimization model is highly
simplified, as presented below:

1 Parameters 𝛼, 𝛽 will be used in other examples below.

$$
\begin{aligned}
\min\ & \sum_{k} c_k p_k \\
& \sum_{k} p_k = d \\
& 0 \le p_k \le p_k^{\max}
\end{aligned}
\tag{7.14}
$$

The following heuristic algorithm, known as the merit order method, can solve
this linear problem: First, all units are sorted according to the price 𝑐𝑘 , from
the unit with minimum cost to the unit with maximum cost (i.e., ascending
order of price). Next, each unit is dispatched with its maximum power until
the total demand is supplied. The spot price is the dual variable associated with
the power balance constraint.
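
The merit order method described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `merit_order` and the three-unit data are hypothetical, and the spot price is taken as the price of the last unit that was needed, which coincides with the dual variable when that unit is not exactly at its maximum.

```python
def merit_order(cost, pmax, demand):
    """Dispatch units from cheapest to most expensive until the demand is met.

    Returns the dispatch (same order as the inputs) and the spot price,
    taken here as the price of the last unit that was needed.
    """
    dispatch = [0.0] * len(cost)
    remaining = demand
    spot_price = None
    for k in sorted(range(len(cost)), key=lambda i: cost[i]):
        if remaining <= 0:
            break
        dispatch[k] = min(pmax[k], remaining)
        remaining -= dispatch[k]
        spot_price = cost[k]
    if remaining > 1e-9:
        raise ValueError("demand exceeds the installed capacity")
    return dispatch, spot_price


# Hypothetical three-unit system: prices in $/MWh, capacities and demand in MW
dispatch, price = merit_order(cost=[30.0, 20.0, 45.0],
                              pmax=[100.0, 80.0, 120.0],
                              demand=150.0)
print(dispatch, price)
```

In this hypothetical case the cheapest unit is fully loaded, the second cheapest covers the remaining 70 MW and sets the price, and the most expensive unit is not dispatched.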

Example 7.4. A linear model might be obtained from the quadratic cost
as given in Figure 7.3. A linear approximation enormously simplifies the
optimization model.
In that case, each thermal unit is represented by a linear cost function 𝜁𝑘 𝑝𝑘 ,
where the value of 𝜁𝑘 can be calculated as follows:

$$ \min_{\zeta_k} \int_{0}^{p_k^{\max}} \left( \zeta_k p_k - \frac{a_k}{2} p_k^2 - b_k p_k \right)^2 dp_k \tag{7.15} $$

This is a simple least-squares problem with the following solution:

$$ \zeta_k = \frac{3 a_k}{8} p_k^{\max} + b_k \tag{7.16} $$

The economic dispatch is transformed into a linear programming model with
this approximation. Example 7.2 was solved with this model obtaining the
following result: 𝑝 = (115, 10, 210, 225, 315, 325)⊤ . Notice the linear model agreed
with the quadratic model in 𝑝2 to 𝑝5 but it was not accurate for 𝑝0 and 𝑝1 .
However, total costs were not that different in one and the other case.

Figure 7.3 Linear approximation of quadratic cost functions.
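
As a rough check of Example 7.4, the slopes of Equation (7.16) can be evaluated for the units of Table 7.1 and the resulting linear problem solved by merit order. The sketch below is an assumption about how this could be done rather than the author's script: every unit starts at its minimum output and the remaining demand is filled from the cheapest to the most expensive slope.

```python
# Data of Table 7.1
a = [0.30494, 0.21174, 0.07092, 0.05606, 0.03598, 0.04222]
b = [38.5390, 46.1591, 38.3055, 40.3965, 38.2704, 36.3278]
pmin = [10, 10, 35, 35, 125, 130]
pmax = [125, 150, 210, 225, 315, 325]
d = 1200

# Equation (7.16): slope of the linear cost that best fits each quadratic cost
zeta = [3 * ak * hi / 8 + bk for ak, bk, hi in zip(a, b, pmax)]
order = sorted(range(len(zeta)), key=lambda i: zeta[i])

# Start from the minimum output of every unit, then fill the rest in merit order
p = list(pmin)
remaining = d - sum(p)
for k in order:
    extra = min(pmax[k] - p[k], remaining)
    p[k] += extra
    remaining -= extra

print([round(z, 2) for z in zeta])
print(order)
print(p)
```

The dispatch printed by this sketch can be compared directly with the result reported in the example.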
Example 7.5. It might be the case that 𝑐𝑘 in Equation (7.14) is unknown, but
determined by a time series, i.e., 𝑐𝑘 = 𝑐̄𝑘 ± 𝛿𝑘 , where 𝛿𝑘 is normally
distributed with mean 𝜇 = 0 and standard deviation 𝜎𝑘 . In that case, we can
obtain a robust optimization model for the economic dispatch problem. The
robust optimization problem is the one presented below:

$$
\begin{aligned}
\min\ & z \\
& \sum_{k} \bar{c}_k p_k + \gamma^{1/2} \left\| \sum_{k} \sigma_k p_k \right\| \le z \\
& \sum_{k} p_k = d \\
& 0 \le p_k \le p_k^{\max}
\end{aligned}
\tag{7.17}
$$

where 𝛾 defines the size of the confidence region². The second-order cone in
the first constraint can be interpreted as a penalization factor for a deviation of
the cost 𝑐𝑘 . A high penalization factor results in a robust although perhaps not
very efficient solution. A trade-off between cost and robustness can be defined
by this parameter.
Example 7.6. Uncertainty in the load or renewable power generation (e.g.,
solar and wind) can be introduced in the model by using robust optimization. In
that case, we just replace the power balance by the following robust constraint,
which requires the generation to cover the load with probability 𝜂:

$$ \sum_{k} p_k \ge \phi_d^{-1}(\eta) \tag{7.18} $$

where 𝜙𝑑⁻¹ is the quantile function of the load 𝑑, and 𝜂 is the desired probability.
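
To make the quantile function concrete, the following sketch evaluates it for a purely hypothetical load model: a normal distribution with a mean of 1200 MW and a standard deviation of 50 MW. These numbers are assumptions for illustration only and do not come from the book; `NormalDist` belongs to the Python standard library.

```python
from statistics import NormalDist

# Hypothetical load model: normally distributed, mean 1200 MW, standard deviation 50 MW
load = NormalDist(mu=1200.0, sigma=50.0)
eta = 0.95  # desired probability

# Quantile function of the load evaluated at eta, i.e. the right-hand side of (7.18)
level = load.inv_cdf(eta)
print(round(level, 2))
```

With these assumed values, the level is roughly 82 MW above the mean load, which is the margin associated with a probability of 0.95.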

7.2 Environmental dispatch

Another problem closely related to economic dispatch is environmental
dispatch. In this problem, the cost function is replaced by an emission function
that considers greenhouse gas emissions such as sulfur oxides (SOx) and
nitrogen oxides (NOx). The former has a quadratic form, whereas the latter is
usually characterized by an equation consisting of a straight line and an
exponential function. Thus, the environmental dispatch has the same structure as
Equation (7.7) with emission functions given by Equation (7.19).

$$ f_k(p_k) = \frac{\alpha_k}{2} p_k^2 + \beta_k p_k + \gamma_k \exp(\eta_k p_k) \tag{7.19} $$

Notice this function is convex if 𝛼𝑘 ≥ 0 and 𝛾𝑘 ≥ 0, since its second derivative,
𝛼𝑘 + 𝛾𝑘 𝜂𝑘² exp(𝜂𝑘 𝑝𝑘 ), is then non-negative. The analysis of this type of problem is
straightforward.

2 See Chapter 6.
Example 7.7. Solve the environmental dispatch problem for a power system
with six units presented in Table 7.1 where only SOx emissions are considered
(i.e., the objective function is quadratic). The script for solving the problem
is the same as in Example 7.2, replacing the values of 𝑎𝑘 by 𝛼𝑘 and 𝑏𝑘 by 𝛽𝑘 .
The solution is 𝑝 = (125, 150, 188, 188, 275, 275)⊤ . Notice this solution is
different from the economic dispatch since economic and environmental objectives
are usually contradictory. For example, the unit 𝑇0 was dispatched with less
power in the economic dispatch than in the environmental dispatch since it is
more expensive but less polluting.
Since the economic dispatch and the environmental dispatch are usually
contradictory objectives, it is required to study the problem as a multiobjective
optimization. Although there is a trend in the scientific literature of power
systems toward using heuristic algorithms in multiobjective optimization
problems, the economic/environmental dispatch has convex objectives that allow
an immediate solution, as presented below.
Consider two optimization problems with the same set of feasible solutions
but two contradictory objectives, 𝑓𝐴 and 𝑓𝐵 ; this situation can be represented
as Equation (7.20).
$$
\begin{aligned}
\min\ & \{ f_A(x), f_B(x) \} \\
& f(x) = 0 \\
& g(x) \le 0
\end{aligned}
\tag{7.20}
$$
Solving the problem concerning 𝑓𝐴 may lead to an unacceptable solution
regarding 𝑓𝐵 and vice versa. Therefore, it is required to find a trade-off between
the two objectives. We can find this trade-off through the concept of the Pareto
frontier. Consider three feasible solutions A, B, C depicted in Figure 7.4 for a
two-objective optimization problem. Solution A is better than B regarding 𝑓𝐴 ,
but B is better than A regarding 𝑓𝐵 . Neither of them is better than the other in both
objectives simultaneously; solutions of this type are called non-dominated.
In contrast, C is a dominated solution since there are better solutions in both
objectives (i.e., solutions A and B are better than C in both objectives). The set of
non-dominated solutions is called the Pareto frontier, and the multiobjective
problem is considered solved when the Pareto frontier is found. The final
decision about the dispatch is made by the transmission system operator using this
frontier; for example, the transmission system operator may decide the
maximum deviation of the economic dispatch that it is willing to accept to reduce
emissions.

Figure 7.4 Example of a Pareto frontier for two contradictory objective functions
that require to be minimized.
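
The notion of dominance used above can be stated as a short test. The sketch below is only illustrative: the function names and the numerical values assigned to solutions A, B and C are hypothetical, chosen so that A and B are better than C in both objectives, as in the discussion.

```python
def dominates(u, v):
    """True if u is at least as good as v in every objective and strictly better in one (minimization)."""
    return (all(ui <= vi for ui, vi in zip(u, v))
            and any(ui < vi for ui, vi in zip(u, v)))


def pareto_front(points):
    """Keep only the points that no other point dominates."""
    return [u for u in points if not any(dominates(v, u) for v in points)]


# Hypothetical values (f_A, f_B) for three feasible solutions A, B and C
solutions = {"A": (1.0, 3.0), "B": (3.0, 1.0), "C": (4.0, 4.0)}
front = pareto_front(list(solutions.values()))
print(front)
```

Only A and B survive the filter, since C is dominated by both of them.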
The Pareto frontier can be found by transforming the multiobjective problem
into a single-objective problem as follows:

$$
\begin{aligned}
\min\ & \xi f_A(x) + (1 - \xi) f_B(x) \\
& f(x) = 0 \\
& g(x) \le 0
\end{aligned}
\tag{7.21}
$$

where 𝜉 is a real number between 0 and 1. This model is convex since both 𝑓𝐴
and 𝑓𝐵 are convex. The Pareto frontier is obtained by solving the problem for
different values of 𝜉 as presented in the example below:

Example 7.8. Consider the economic/environmental dispatch for the system
presented in Table 7.1. In this case, it is required to define the single-objective
model of Equation (7.21) and solve it for different values of 𝜉 between 0 and 1 as
follows:

import numpy as np
import cvxpy as cvx
import matplotlib.pyplot as plt

pmin = [10,10,35,35,125,130]
pmax = [125,150,210,225,315,325]
a = np.diag([0.30494,0.21174,0.07092,0.05606,0.03598,0.04222])
b = np.array([38.5390,46.1591,38.3055,40.3965,38.2704,36.3278])
alpha = np.diag([0.00838,0.00838,0.01366,0.01366,0.00922,0.00922])
beta = np.array([0.32767,0.32767,-0.54551,-0.54551,-0.51116,-0.51116])
d = 1200

def Pareto(xi):
    # Solve the weighted problem (7.21) and return the value of both objectives
    p = cvx.Variable(6)
    f_ecn = 1/2*cvx.quad_form(p,a)+b@p          # economic objective
    f_env = 1/2*cvx.quad_form(p,alpha)+beta@p   # emission objective
    fo = cvx.Minimize(xi*f_ecn+(1-xi)*f_env)
    res = [cvx.sum(p) >= d, p >= pmin, p <= pmax]
    Model = cvx.Problem(fo,res)
    Model.solve()
    return [f_ecn.value,f_env.value]

points = 10
F_ecn = np.zeros(points)
F_env = np.zeros(points)
# Weights spread evenly over [0, 1]
for k, xi in enumerate(np.linspace(0,1,points)):
    F_ecn[k],F_env[k] = Pareto(float(xi))

plt.plot(F_ecn,F_env,marker='o')
plt.grid()
plt.xlabel('Economic')
plt.ylabel('Environmental')
plt.show()

The reader is invited to execute the script and compare the results with previous
examples.

7.3 Effect of the grid

Transmission lines impose restrictions on the economic/environmental dis-
patch that must be considered in the model. These constraints can be repre-
sented by three different models, namely: transportation model, linear power
flow (DC-model), and non-linear power flow equations (or AC-model). The
first two models are discussed in this section. The third model is studied in
Chapter 10 together with the optimal power flow problem.
A first approximation of the grid constraints is based on the classic
transportation model (see Example 4.5 in Chapter 4). In this model, the grid is represented
by a graph 𝒢 = {𝒩, ℰ} where 𝒩 is the set of nodes and ℰ ⊆ 𝒩 × 𝒩 is the set of
branches (i.e., transmission lines and transformers). Branches are included in
the model by adding new variables 𝑠𝑗 that represent the active power flow for
each branch 𝑗. The main constraint is the capacity of the line/transformer:

$$ |s_j| \le s_j^{\max} \tag{7.22} $$

This constraint can be transformed into a box constraint as follows:

$$ -s_j^{\max} \le s_j \le s_j^{\max} \tag{7.23} $$

On the other hand, loads are now distributed along the nodes forming a vector
of ℝⁿ where 𝑛 is the number of nodes. The power balance is now defined in
each node as follows:

$$ \sum_{i\in\Lambda_k} p_i - d_k = \sum_{j\in\Omega_k} \pm s_j \tag{7.24} $$

where Λ𝑘 and Ω𝑘 are the sets of generators and lines connected to node 𝑘. The
sum on the right-hand side of the equation must take into account the
orientation of the flow; thus, 𝑠𝑗 is positive if 𝑗 departs from 𝑘 and negative if it arrives
at 𝑘. This concept is better understood through the following example:
Example 7.9. Consider the grid shown in Figure 7.5. Solve the economic
dispatch considering the cost functions given in Table 7.1 and the transportation
model of the grid. All lines have a capacity of 300 MW. The complete model is
as follows:

$$
\begin{aligned}
\min\ & \sum_{i} f_i(p_i) \\
& d_0 + s_{01a} + s_{01b} + s_{03} + s_{04} - p_0 - p_1 = 0 \\
& d_1 - s_{01a} - s_{01b} + s_{12} + s_{15} = 0 \\
& d_2 - s_{12} + s_{25} - p_2 - p_3 = 0 \\
& d_3 - s_{03} + s_{34} - p_4 = 0 \\
& d_4 - s_{34} - s_{04} + s_{45} = 0 \\
& d_5 - s_{15} - s_{25} - s_{45} - p_5 = 0 \\
& -s_j^{\max} \le s_j \le s_j^{\max}, \quad \forall j \in \{01a, 01b, 03, 04, 12, 15, 25, 34, 45\} \\
& p_i^{\min} \le p_i \le p_i^{\max}, \quad \forall i \in \{0, 1, 2, 3, 4, 5\}
\end{aligned}
\tag{7.25}
$$

Power balance equations include power demand, generation and flows in the
directions shown in Figure 7.5. Two lines can connect the same nodes, as is the
case of 01𝑎 and 01𝑏; also, two generators can be connected to the same node, as
is the case of 𝑝0 , 𝑝1 and 𝑝2 , 𝑝3 . Loads are now a vector 𝑑 = (0, 800, 0, 0, 400, 0)⊤ .
The script for solving this problem starts as in Example 7.2 and continues
with the following code:
ng = 6 # number of generators
nl = 9 # number of lines
nn = 6 # number of nodes
smax = nl*[300] # line capacities (MW)
d = [0,800,0,0,400,0] # demand at each node (MW)
Lambda = (0,0,2,2,3,5) # node where each generator is connected
# grid: each line is given as (departure node, arrival node)
Omega = ((0,1),(0,1),(0,3),(0,4),(1,2),(1,5),(2,5),(3,4),(4,5))
s = cvx.Variable(nl) # power flows
EqB = nn*[0] # power balance expression of each node
for j in range(nl):
    k = Omega[j][0]
    m = Omega[j][1]
    EqB[k] += s[j]  # flow departing from node k
    EqB[m] += -s[j] # flow arriving at node m
for k in range(ng):
    n1 = Lambda[k]
    EqB[n1] += -p[k]
res = [p>=pmin, p<=pmax, s<=smax, -s<=smax]
for k in range(nn):
    res += [EqB[k] + d[k] == 0]
obj = 1/2*cvx.quad_form(p,a) + np.array(b)@p
Model = cvx.Problem(cvx.Minimize(obj),res)
Model.solve()
print('Generation:',np.round(p.value))
print('Flows:',np.round(s.value))

Figure 7.5 Power grid with six nodes and six generators. All lines have a capacity of
300 MW.

The active power flow in each line, including the double circuit between nodes
0 and 1, is represented by the array 𝑠; loads are represented as a vector of size six,
and EqB represents Equation (7.24), i.e., the power balance for each node.
The rounded solution for this model matches that of Example 7.2.
Power flows are 𝑠 = (137, 137, −200, 50, −241, −284, 194, 115, −235)⊤ . Notice
some power flows such as 𝑠03 and 𝑠12 are negative, which indicates that the power
flows in the direction opposite to the one assumed.


The transportation model allows grid constraints to be included in the
economic/environmental dispatch problem. However, it is an oversimplification
of the problem (it could be more useful in radial grids). We must include power
flow equations into the model. A simple yet accurate approximation of the
power flow equations is the linear power flow, also known as dc power flow³.
In this case, the power flow in each branch 𝑗 = 𝑘𝑚 is given by the following
expression:

$$ s_j = s_{\text{base}} \frac{\theta_k - \theta_m}{x_j} \tag{7.26} $$

where 𝜃𝑘 and 𝜃𝑚 are the angles of the voltages at nodes 𝑘 and 𝑚; 𝑥𝑗 is the
branch impedance in per unit and 𝑠base is the nominal power for the per unit
representation (recall 𝑝 and 𝑠𝑗 are given in MW). Let us consider the following
example to understand the influence of this constraint.
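
Before moving to the example, a small numerical illustration of Equation (7.26) may help. The helper `dc_flow` below is a hypothetical name introduced only for this sketch; the values 𝑥 = 0.02 and 𝑠base = 100 are those of the example that follows, while the angle difference of 0.06 rad is an assumed value chosen to show how a 300 MW flow arises.

```python
def dc_flow(theta_k, theta_m, x, s_base=100.0):
    """Active power (MW) from node k to node m, Equation (7.26); angles in radians, x in per unit."""
    return s_base * (theta_k - theta_m) / x


# With x = 0.02 and s_base = 100, an angle difference of 0.06 rad gives about 300 MW
flow_km = dc_flow(0.06, 0.0, 0.02)
# Reversing the angles reverses the direction of the flow
flow_mk = dc_flow(0.0, 0.06, 0.02)
print(flow_km, flow_mk)
```

The sign of the result follows the sign of the angle difference, which is why negative entries of 𝑠 indicate a flow against the assumed direction.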

Example 7.10. The economic dispatch problem presented in Example 7.9 is
now solved taking into account the linear power flow. All lines have the same
impedance 𝑥 = 0.02 with 𝑠nom = 100 (the base power 𝑠base of Equation (7.26)).
The model is the same as in the previous example including additional variables
and constraints as follows:

snom = 100 # base power
x = nl*[0.02] # branch impedances in per unit
th = cvx.Variable(nn) # nodal angles
res += [th[0] == 0] # angle reference
for j in range(nl):
    k = Omega[j][0]
    m = Omega[j][1]
    res += [x[j]*s[j] == snom*(th[k]-th[m])] # Equation (7.26)
ModelLPF = cvx.Problem(cvx.Minimize(obj),res)
ModelLPF.solve()

The new economic dispatch is 𝑝 = (94, 100, 178, 188, 315, 325)⊤ , caused
by a redistribution of the power flows, which are now given by the
vector 𝑠 = (133, 133, −129, 57, −300, −234, 66, 186, −157)⊤ . Notice that line 12
reaches its maximum capacity. This effect was not identified with the
transportation model, hence the importance of including power flow equations. The
new dispatch is more expensive, but it is feasible under the conditions of the grid.

3 The term dc power flow comes from an analogy between the linearized model and a linear
dc grid [38]. We discourage this name since it can be confused with the power flow in grids
that are actually dc.


7.4 Loss equation

Losses can be included in the economic dispatch model by a simple quadratic
equation and a power flow calculation. We must include both inductances and
resistances of transmission lines in the model through the nodal admittance
matrix or 𝑌bus . Therefore, nodal currents and nodal voltages are related by the
following expression:

$$ I_{\text{bus}} = Y_{\text{bus}} V_{\text{bus}} \tag{7.27} $$

We can also define the nodal impedance matrix 𝑍bus = 𝑌bus⁻¹; this inverse exists
as long as the graph is connected, including a connection with ground;
therefore, the capacitance of the lines must also be included. Nodal voltages are given
by Equation (7.28):

$$ V_{\text{bus}} = Z_{\text{bus}} I_{\text{bus}} \tag{7.28} $$

The nodal power in each node is given by the following equation:

$$ p_k + j q_k = v_k e^{j\theta_k} I_{\text{bus}(k)}^{*} \tag{7.29} $$

where 𝑣𝑘 and 𝜃𝑘 are the magnitude and the angle of the voltage. Therefore,
the current can be represented as a function of the nodal powers and voltages as
given in Equation (7.30):

$$ I_{\text{bus}(k)} = \left( \frac{p_k \cos(\theta_k) + q_k \sin(\theta_k)}{v_k} \right) + j \left( \frac{p_k \sin(\theta_k) - q_k \cos(\theta_k)}{v_k} \right) \tag{7.30} $$

On the other hand, total power losses are given by

$$ p_{\text{loss}} = \operatorname{real}\left( V_{\text{bus}}^{\mathsf{H}} I_{\text{bus}} \right) \tag{7.31} $$

where (⋅)ᴴ represents the transpose and complex conjugate. Replacing
Equation (7.28) into this expression, the losses can be written in terms of the real
and imaginary parts of the current, namely:

$$ p_{\text{loss}} = \operatorname{real}(I_{\text{bus}})^{\top} R_{\text{bus}} \operatorname{real}(I_{\text{bus}}) + \operatorname{imag}(I_{\text{bus}})^{\top} R_{\text{bus}} \operatorname{imag}(I_{\text{bus}}) \tag{7.32} $$

where 𝑅bus = real(𝑍bus ) = [𝑟𝑘𝑚 ] ∈ ℝⁿˣⁿ. Replacing Equation (7.30) into
Equation (7.32), and after lengthy but straightforward algebraic manipulations,
the following loss equation is obtained:

$$ p_{\text{loss}} = p^{\top} B p + h^{\top} p + w \tag{7.33} $$


where 𝐵 ∈ ℝⁿˣⁿ is a positive definite matrix whose entries are given by
Equation (7.34), with 𝜃𝑘𝑚 = 𝜃𝑘 − 𝜃𝑚 ,

$$ b_{km} = r_{km} \frac{\cos(\theta_{km})}{v_k v_m} \tag{7.34} $$

and ℎ is a vector given by Equation (7.35),

$$ h_k = -2 \sum_{m\in\mathcal{N}} r_{km} \frac{\sin(\theta_{km})}{v_k v_m} q_m \tag{7.35} $$

finally, 𝑤 is a scalar given by Equation (7.36)

$$ w = \sum_{k\in\mathcal{N}} \sum_{m\in\mathcal{N}} r_{km} \frac{\cos(\theta_{km})}{v_k v_m} q_k q_m \tag{7.36} $$

Notice that 𝑟𝑘𝑚 is the entry 𝑘𝑚 of 𝑅bus and not the resistance of the line 𝑘𝑚;
in fact, it may be the case that there is no transmission line between 𝑘 and 𝑚
and yet, there is an 𝑟𝑘𝑚 in the 𝑅bus matrix. Equation (7.33) constitutes a convex
quadratic form; therefore, it can be relaxed to the following convex inequality:

$$ p_{\text{loss}} \ge p^{\top} B p + h^{\top} p + w \tag{7.37} $$

It is common in the literature of economic dispatch to neglect ℎ and 𝑤 in
the loss Equation (7.33). This approximation could be justified in cases where
reactive power is negligible. However, it is advisable to consider these terms in
cases where the information is available⁴.
Under ideal conditions, the Lagrangian function associated with the economic
dispatch with losses is given by the following expression:

$$ \mathcal{L}(p, \lambda) = \sum_{k\in\mathcal{T}} f_k(p_k) + \lambda \left( d + p_{\text{loss}} - \sum_{k} p_k \right) \tag{7.38} $$

therefore, optimal conditions are given by

$$ \frac{\partial f_k}{\partial p_k} + \lambda \left( \frac{\partial p_{\text{loss}}}{\partial p_k} - 1 \right) = 0 \tag{7.39} $$

and hence the incremental costs are

$$ \lambda = \xi_k \frac{\partial f_k}{\partial p_k} \tag{7.40} $$

4 This simplification was also common in times where the computational resources were
limited as presented in [39].


where 𝜉𝑘 is a penalty factor that considers the effect of power loss; this factor is
given by Equation (7.41)

$$ \xi_k = \frac{1}{1 - \partial p_{\text{loss}} / \partial p_k} \tag{7.41} $$

Example 7.11. Let us solve the economic dispatch for the system presented in
Example 7.2 with the following loss matrix

$$
B = \begin{pmatrix}
50 & 10 & 0 & 0 & 20 & 0 \\
10 & 50 & 0 & 0 & 0 & 10 \\
0 & 0 & 60 & 0 & 0 & 10 \\
0 & 0 & 0 & 350 & 20 & 0 \\
20 & 0 & 0 & 20 & 370 & 40 \\
0 & 10 & 10 & 0 & 40 & 480
\end{pmatrix} \times 10^{-6}
\tag{7.42}
$$

Both ℎ and 𝑤 are zero; therefore, it is easy to transform Equation (7.37) into the
following second-order cone constraint

$$ \left\| B^{1/2} p \right\| \le z \tag{7.43} $$

where 𝐵^{1/2} is obtained from the Cholesky factorization of 𝐵, and 𝑧 is an
auxiliary variable such that 𝑧² = 𝑝loss . The code in Python is the same as in
Example 7.2, modifying the set of constraints as given below:

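# Loss matrix of Equation (7.42), without the factor 1E-6
B = [[50, 10, 0, 0, 20, 0],
     [10, 50, 0, 0, 0, 10],
     [ 0, 0, 60, 0, 0, 10],
     [ 0, 0, 0, 350, 20, 0],
     [20, 0, 0, 20, 370, 40],
     [ 0, 10, 10, 0, 40, 480]]
Bchol = 1E-3*np.linalg.cholesky(B) # Bchol @ Bchol.T = 1E-6*B
z = cvx.Variable()
# Generation must cover the demand plus the losses z**2
res = [cvx.sum(p) >= d + z**2, p>=pmin, p <= pmax]
res += [cvx.SOC(z,Bchol.T@p)] # norm(Bchol.T @ p) <= z
Model = cvx.Problem(obj,res)
Model.solve()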

The new dispatch is 𝑝 = (117, 133, 210, 225, 315, 324)⊤ and the total power loss is
𝑧² = 124 MW. Penalization factors were 𝜉 = (1.03, 1.02, 1.03, 1.2, 1.37, 1.52)⊤ ; the first
two generators had a lower penalization factor compared to the last generator,
and hence it is efficient to dispatch the last generator with less power⁵.

5 The penalization factor was calculated as xi = 1/(1-2E-6*np.array(B)@p.value)
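
The figures reported in Example 7.11 can be reproduced approximately from the rounded dispatch alone. The following sketch is an illustration in plain Python, not the script used in the book: it evaluates the loss 𝑝⊤𝐵𝑝 and the penalty factors of Equation (7.41), using the fact that the derivative of the loss with respect to 𝑝𝑘 is twice the 𝑘-th entry of 𝐵𝑝 when ℎ and 𝑤 are zero. Since the dispatch is rounded, the values are only approximate.

```python
# Loss matrix of Equation (7.42) (to be scaled by 1e-6) and the rounded dispatch of Example 7.11
B = [[50, 10, 0, 0, 20, 0],
     [10, 50, 0, 0, 0, 10],
     [0, 0, 60, 0, 0, 10],
     [0, 0, 0, 350, 20, 0],
     [20, 0, 0, 20, 370, 40],
     [0, 10, 10, 0, 40, 480]]
p = [117, 133, 210, 225, 315, 324]
scale = 1e-6

# Product B p, computed row by row
Bp = [sum(bkm * pm for bkm, pm in zip(row, p)) for row in B]

# Total loss p'Bp and penalty factors 1/(1 - d p_loss/d p_k), with d p_loss/d p_k = 2 (B p)_k
loss = scale * sum(pk * v for pk, v in zip(p, Bp))
xi = [1 / (1 - 2 * scale * v) for v in Bp]

print(round(loss, 1))
print([round(x, 2) for x in xi])
```

The loss obtained this way is consistent with the difference between the total generation and the demand of 1200 MW.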


7.5 Further readings

The economic/environmental dispatch has been studied for a long time in the
scientific literature. It is interesting to see how the problem used to be solved
on analog computers compared to how it is solved today [40]. The
algorithms can be classified into exact methods as presented in this chapter and
metaheuristic algorithms as shown in [37]. Nevertheless, metaheuristic
algorithms and artificial intelligence entail a lack of understanding of the problem;
they are usually based on biological or social metaphors that may divert
attention from the power systems problem (see [41] for an analysis about the use of
these biological metaphors).
Models for the economic dispatch in actual power systems are more complex
than those presented here. They can include constraints related to the stabil-
ity, security, and reliability of the grid [42]. In addition, grid codes and market
regulations introduce conditions that must be considered in the model. Such
complex problems are usually solved by distributed algorithms [43].
Linear and piecewise linear cost functions are also standard for the economic
dispatch problem. The student is invited to read [44] for a complete numerical
review of different linear implementations.

7.6 Exercises
1. Plot the incremental cost function for each of the six thermal units presented
in Example 7.2; plot also the optimal operation cost vs demand for the system
presented in Example 7.2 with 345 ≤ 𝑑 ≤ 1350.
2. Formulate in Python the optimization model for Example 7.4; compare
results.
3. Solve the problem presented in Example 7.2 but this time, the load varies
according to a load curve 𝑑 = (0.5, 0.3, 0.4, 0.6, 0.8, 1.0, 0.9, 0.8)⊤ . Plot the
optimal incremental cost for each load.
4. Solve the multiobjective economic dispatch problem presented in Example
7.8 but this time, consider the emission of NOx with 𝜂 = 0.0123 and 𝛾 = 0.25
for all the units. Show the Pareto frontier as well as a plot of incremental costs
vs incremental emissions.
5. Solve the problem presented in Example 7.11 without considering a limit
in 𝑝max . Compare the results with and without the loss equation.
6. Write Equation (7.37) as a second-order cone constraint considering both ℎ and 𝑤.
7. Power loss in a transmission line can be approximated by the following
equation

$$ p_{km}^{\text{loss}} \approx g_{km} \theta_{km}^2 \tag{7.44} $$

where 𝑔𝑘𝑚 is the conductance of the line. Use this approximation to include
losses into the economic dispatch with linearized power flow. Use the
parameters of Example 7.10 with 𝑟 = 0.01 for all transmission lines.
8. Compare results of the transportation and the linear power flow models.
Experiment with different values of 𝑠max and 𝑥𝑗 .
9. Repeat the previous exercise, eliminating lines 3–4, 4–5, and 2–5.
10. Show the matrix 𝐵 given in Equation (7.34) is positive semidefinite (hint:
take into account that the Hadamard product of two positive semidefinite
matrices is also positive semidefinite).


8 Unit commitment

Learning outcomes

By the end of this chapter, the student will be able to:

● Identify the difference between the economic dispatch and the unit
commitment
● Solve basic problems of deterministic unit commitment
● Include transmission constraints into the model by a linear power
flow formulation.

8.1 Problem definition

The economic dispatch presented in Chapter 7 is a continuous problem decou-
pled in time; the latter means that current decisions do not affect future
operation. However, this picture is incomplete since a power system presents a
dynamic behavior due to loads’ daily changes. Therefore, a dynamical model is
required.
The starting-up of a thermal power plant is not instantaneous since the boiler
requires suitable pressure and temperature conditions to generate power, as
shown in Figure 8.1. Shutting-down is not instantaneous either. Besides, sev-
eral physical and economic limitations such as the minimum operative time
and the maximum off-line time must be considered in the model. All these
constraints introduce binary variables into the optimization problem. Thus,
the unit starts generating power when the temperature is between 𝑇min and
𝑇nom ; below 𝑇min the unit requires fuel to maintain the temperature, but the
output power is zero. This implies costs that must be included in an
optimization model named unit commitment. This problem is discrete since the
time in which the unit is connected (committed) or disconnected from the grid
(de-committed) is part of the decision; hence the problem is dynamic. Additional
constraints are also included in the model, related to the start-up and
shut-down ramps of each thermal unit. In the following sections, we present the basic
model with the most common constraints. As in all the previous chapters, we
offer only toy-models in order to understand the problem.

Figure 8.1 Simplified model for the starting-up of a thermal power plant
(temperature and output power, each with its minimum and nominal levels).

8.2 Basic unit commitment model

Unlike the economic dispatch, the unit commitment requires considering each
individual unit in the model. For example, a combined cycle power plant may
have five gas units and two steam units. Each of these units must be considered in
the model since they have a different dynamic performance. Gas units are more
flexible, with relatively fast start-up times compared to steam units.
We consider a time horizon 𝖳 = {0, 1, … , 𝑇}, for example one day or one week
with discrete steps in hours. Thermal units are grouped in a set 𝒯 with three
types of costs, namely:

𝑓operation + 𝑓start-up + 𝑓shut-down (8.1)

Operation costs (𝑓operation ) are linear or quadratic functions as presented in
Chapter 7; start-up (𝑓start-up ) and shut-down costs (𝑓shut-down ) are linear
functions that depend on the status (on/off) of each power unit. Therefore, binary
variables are required to identify each transition. A set of variables 𝜁𝑘𝑡 is
defined such that 𝜁𝑘𝑡 = 1 if the unit 𝑘 is operating at time 𝑡. The
starting-up action is defined by another binary variable 𝜇𝑘𝑡 such that 𝜇𝑘𝑡 = 1 if the
unit 𝑘 was disconnected at time 𝑡 − 1 and connected at time 𝑡. Similarly, a
binary variable 𝛿𝑘𝑡 is defined such that 𝛿𝑘𝑡 = 1 if the unit was connected at time
𝑡 − 1 but disconnected at time 𝑡. Additional constraints are required in order to
identify starting-up and shutting-down as presented below:

$$ \mu_{kt} - \delta_{kt} = \zeta_{kt} - \zeta_{k,t-1} \tag{8.2} $$

$$ \mu_{kt} + \delta_{kt} \le 1 \tag{8.3} $$

$$ \zeta_{kt}, \mu_{kt}, \delta_{kt} \in \{0, 1\} \tag{8.4} $$

Notice that these constraints uniquely meet the conditions presented in Table
8.1 for binary variables 𝜁, 𝜇, 𝛿. Thus, 𝜇 = 1 only in the case the unit starts to
operate; similarly, 𝛿 = 1 only in the case the unit is disconnected.

Table 8.1 Logic table for operation (𝜁), start-up (𝜇) and shut-down (𝛿) conditions.

𝜻k,t−1   𝜻kt   𝝁kt   𝜹kt
0        0     0     0
0        1     1     0
1        0     0     1
1        1     0     0
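
The claim that Equations (8.2) and (8.3) reproduce Table 8.1 uniquely can be verified by brute force. The sketch below, written only for illustration, enumerates the four combinations of the previous and current status and lists every binary pair (𝜇, 𝛿) that satisfies both constraints; the variable names are hypothetical.

```python
from itertools import product

# Rows of Table 8.1: (zeta at t-1, zeta at t) -> (mu, delta)
table = {(0, 0): (0, 0), (0, 1): (1, 0), (1, 0): (0, 1), (1, 1): (0, 0)}

found = {}
for zeta_prev, zeta in product((0, 1), repeat=2):
    # All binary pairs (mu, delta) that satisfy Equations (8.2) and (8.3)
    found[(zeta_prev, zeta)] = [
        (mu, delta) for mu, delta in product((0, 1), repeat=2)
        if mu - delta == zeta - zeta_prev and mu + delta <= 1
    ]
    print(zeta_prev, zeta, found[(zeta_prev, zeta)])
```

Each row admits exactly one pair. Without Equation (8.3), the pair 𝜇 = 𝛿 = 1 would also be admissible when the status does not change, which is why that constraint is needed.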

With these binary variables, it is possible to define the cost functions for a
time horizon, as presented below:

$$ f_{\text{operation}} = \sum_{k\in\mathcal{T}} \sum_{t\in\mathsf{T}} \left( a_k p_{kt}^2 + b_k p_{kt} + c_k \zeta_{kt} \right) \tag{8.5} $$

$$ f_{\text{start-up}} = \sum_{k\in\mathcal{T}} \sum_{t\in\mathsf{T}} c_k^{\text{up}} \mu_{kt} \tag{8.6} $$

$$ f_{\text{shut-down}} = \sum_{k\in\mathcal{T}} \sum_{t\in\mathsf{T}} c_k^{\text{down}} \delta_{kt} \tag{8.7} $$

These functions may be more complex in practice. For example, the start-up
cost depends on how long the unit was de-committed since the cost is lower as
the initial temperature is higher. A detailed model of the start-up cost may include
exponential cost functions as presented in [45]. However, it is common practice
to linearize these functions. Notice that 𝜁𝑘𝑡 affects the fixed costs in 𝑓operation
under the assumption that the unit incurs this cost only when connected.


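The variable 𝜁𝑘𝑡 also affects the operation limits of the thermal units as
follows:

$$ \zeta_{kt} p_k^{\min} \le p_{kt} \le p_k^{\max} \zeta_{kt} \tag{8.8} $$

thus, 𝑝𝑘𝑡 = 0 when 𝜁𝑘𝑡 = 0. The model is completed with the power balance at
each time 𝑡 as presented below:

$$
\begin{aligned}
\min\ & \sum_{k\in\mathcal{T}} \sum_{t\in\mathsf{T}} \left( a_k p_{kt}^2 + b_k p_{kt} + c_k \zeta_{kt} + c_k^{\text{up}} \mu_{kt} + c_k^{\text{down}} \delta_{kt} \right) \\
& \sum_{k\in\mathcal{T}} p_{kt} = d_t, \quad \forall t \in \mathsf{T} \\
& \zeta_{kt} p_k^{\min} \le p_{kt} \le p_k^{\max} \zeta_{kt}, \quad \forall t \in \mathsf{T}, k \in \mathcal{T} \\
& \mu_{kt} - \delta_{kt} = \zeta_{kt} - \zeta_{k,t-1}, \quad \forall t \in \mathsf{T}, k \in \mathcal{T} \\
& \mu_{kt} + \delta_{kt} \le 1, \quad \forall t \in \mathsf{T}, k \in \mathcal{T} \\
& \zeta_{kt}, \mu_{kt}, \delta_{kt} \in \{0, 1\}, \quad \forall t \in \mathsf{T}, k \in \mathcal{T}
\end{aligned}
\tag{8.9}
$$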

This is the basic model of the unit commitment problem. However, it can be
complemented with additional constraints related to the grid. Practical applica-
tions combine the hydrothermal schedule with the unit commitment and even
with the ac optimal power flow. Therefore, Equation (8.9) may be considered as
a toy-model used to understand the problem. Let us see the model in practice:

Example 8.1. Let us solve the basic unit commitment problem for a system
with three thermal units. Parameters of the system are presented directly in
the following Python script:

import numpy as np
import cvxpy as cvx
import matplotlib.pyplot as plt

a = np.array([0.0004984, 0.001246, 0.00623 ])   # quadratic cost
b = np.array([16.821 , 40.6196, 21.9296])       # linear cost
c = np.array([220.4174, 161.8554, 171.2004])    # fixed cost
c_up = np.array([124.69, 249.22, 0])            # start-up cost
z_ini = np.array([1,1,0])                       # initial on/off status
pmax = np.array([220, 100, 20])
pmin = np.array([100,10,10])
d = np.array([178.690,168.450,161.840,157.830,158.160,163.690,
              176.860,194.210,209.670,221.540,233.180,240.820,
              247.030,248.470,253.830,260.900,261.120,251.680,
              250.890,242.100,242.050,231.680,205.070,200.690])

These values were adapted from [46] for 24 h operation with 𝑐down = 0. The
optimization model is easily translated from Equation (8.9) to a Python script,
where T is the number of time steps in 𝖳 and n is the number of units in 𝒯, as
presented below:


T = len(d) # number of time steps
n = len(a) # number of thermal units
zeta = cvx.Variable((n,T), boolean=True)  # on/off status
mu = cvx.Variable((n,T), boolean=True)    # start-up
delta = cvx.Variable((n,T), boolean=True) # shut-down
p = cvx.Variable((n,T))                   # generated power

fop = 0 # operation cost
fsup = 0 # start-up cost
res = []
for t in range(T):
    for k in range(n):
        fop = fop + a[k]*p[k,t]**2+b[k]*p[k,t]+c[k]*zeta[k,t]
        fsup = fsup + c_up[k]*mu[k,t]
        res += [p[k,t] >= pmin[k]*zeta[k,t]]
        res += [p[k,t] <= pmax[k]*zeta[k,t]]

for t in range(T):
    res += [cvx.sum(p[:,t])==d[t]]

for t in range(1,T):
    for k in range(n):
        res += [mu[k,t]-delta[k,t] == zeta[k,t]-zeta[k,t-1]]
        res += [mu[k,t]+delta[k,t] <= 1]

# The first time step is linked to the initial status z_ini
for k in range(n):
    res += [mu[k,0]-delta[k,0] == zeta[k,0]-z_ini[k]]
    res += [mu[k,0]+delta[k,0] <= 1]

obj = cvx.Minimize(fop+fsup)
UnitC = cvx.Problem(obj,res)
UnitC.solve() # requires a mixed-integer solver that accepts quadratic objectives
print(UnitC.status, obj.value)

The optimal value was 100807. Binary variables can be plotted as follows:

plt.subplot(4,1,1)
plt.plot(p.value.T)       # generated power of each unit
plt.subplot(4,1,2)
plt.pcolor(zeta.value)    # on/off status
plt.subplot(4,1,3)
plt.pcolor(mu.value)      # start-up
plt.subplot(4,1,4)
plt.pcolor(delta.value)   # shut-down
plt.show()


Figure 8.2 Unit commitment for a system with three thermal units.

Figure 8.2 shows the results for the binary variables of the problem. Notice
that 𝜇 and 𝛿 properly identify the times at which each thermal unit is
committed and de-committed.

8.3 Additional constraints

The unit commitment problem can be complemented with additional con-
straints. For instance, a thermal unit may require maintenance at a particular
time. In that case, binary variables 𝜁𝑘𝑡 must be zero during the time expected
to run maintenance.
Likewise, thermal units have minimum uptime and downtime. The former is
the minimum time the unit must be committed once it is turned on, while the
latter is the minimum time the unit is de-committed before it can be turned on
again. For example, if the minimum uptime is 4 h, we can define the following
constraint for each unit, where the subscript denotes the time period:
$$\zeta_{k+1} + \zeta_{k+2} + \zeta_{k+3} + \zeta_{k+4} \geq 4\mu_{k} \tag{8.10}$$
This constraint forces $\zeta_{k+1} = \cdots = \zeta_{k+4} = 1$ when $\mu_k = 1$, that is to say, when
the unit is turned on. The same can be done for the minimum downtime. All
these constraints are affine, and hence, the model remains tractable.
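A minimal sketch of how Equation (8.10) can be verified on a candidate schedule is given below. The function name, the schedules, and the treatment of the end of the horizon (windows that do not fit are skipped) are assumptions made for illustration.

```python
def min_uptime_ok(zeta, mu, hours=4):
    """Check Equation (8.10) for one unit: after a start-up at period t,
    the unit must stay committed during periods t+1, ..., t+hours."""
    T = len(zeta)
    for t in range(T - hours):             # windows beyond the horizon are skipped
        window = zeta[t + 1 : t + 1 + hours]
        if sum(window) < hours * mu[t]:
            return False
    return True

mu = [0, 1, 0, 0, 0, 0, 0, 0]                            # start-up at period 1
print(min_uptime_ok([0, 1, 1, 1, 1, 1, 0, 0], mu))       # True
print(min_uptime_ok([0, 1, 1, 0, 0, 0, 0, 0], mu))       # False

# In a CVXPY model such as the one of Example 8.1, an analogous constraint
# could be written (untested sketch) as:
#   res += [cvx.sum(zeta[k, t+1:t+5]) >= 4*mu[k, t]]
```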
Thermal units cannot reach full load instantaneously; likewise, they cannot
drop instantaneously from full to zero load. Therefore, the turning-on and
turning-off processes must be gradual. This is represented by ramping limits
as follows:

$$p_{k,t} - p_{k,t-1} \leq \rho_k^{\text{up}} \tag{8.11}$$
$$p_{k,t-1} - p_{k,t} \leq \rho_k^{\text{down}} \tag{8.12}$$
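The ramping limits can be illustrated with a short check on a given dispatch. This is only a sketch: the function name and the numerical values are hypothetical.

```python
def ramp_violations(p, rho_up, rho_down):
    """Return the periods where (8.11) or (8.12) is violated for one unit."""
    bad = []
    for t in range(1, len(p)):
        step = p[t] - p[t - 1]
        if step > rho_up or -step > rho_down:
            bad.append(t)
    return bad

p = [100, 140, 200, 180, 110]                          # hypothetical dispatch in MW
print(ramp_violations(p, rho_up=50, rho_down=50))      # [2, 4]
```

The steps are 40, 60, -20 and -70 MW, so the upward limit is violated at period 2 and the downward limit at period 4.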

On the other hand, a given amount of power must be kept available to
hedge the system against sudden changes in load and generation. This
quantity, known as spinning reserve 𝜎, is included in the model as presented
below:
$$\sum_{k\in\mathcal{T}} \left(\zeta_{kt}\, p_k^{\max} - p_{kt}\right) \geq \sigma_t, \quad \forall t \in \mathsf{T} \tag{8.13}$$

Notice the spinning reserve may be different for each time. For example, it could
be higher for the periods of peak load where load changes are greater.
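As a small numerical illustration of Equation (8.13), the following sketch computes the spinning reserve available in one period. The unit data are hypothetical; only committed units contribute to the reserve.

```python
def spinning_reserve(zeta_t, p_t, pmax):
    """Left-hand side of Equation (8.13) for a single period."""
    return sum(z * pm - p for z, p, pm in zip(zeta_t, p_t, pmax))

pmax = [200, 150, 100]      # hypothetical capacities in MW
zeta_t = [1, 1, 0]          # the third unit is off
p_t = [180, 120, 0]         # dispatch in MW
reserve = spinning_reserve(zeta_t, p_t, pmax)
print(reserve)              # 50
print(reserve >= 20)        # True for a required reserve of 20 MW
```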
Thermal units may have physical operation limitations that create prohibited
operating zones of power. These restricted zones must be included in the model
as additional binary variables and/or constraints.

8.4 Effect of the grid

Just like the economic dispatch and the hydrothermal schedule, the unit com-
mitment may include power flow constraints, either as dc or ac formulations.
For ease of explanation, we present here the transportation model. AC power flow
equations can be included with linear or conic approximations, as explained in
Chapter 10.
Example 8.2. Consider the problem presented in Example 8.1 with the grid
shown in Figure 8.3. All lines have a maximum capacity of 100MW.

Figure 8.3 Grid constraints for the unit commitment problem.


Let us code this grid in Python, this time using the module NetworkX,
which provides tools to operate on graphs; the command DiGraph generates an
oriented graph with the connections depicted in Figure 8.3. The graph can be
plotted as presented below:
import networkx as nx
grid = nx.DiGraph([(0,1),(0,3),(1,2),(1,3),(1,4),(2,4),(3,4)])
smax = np.array([100,100,100,100,100,100,100]) # maximum capacity of each line
nl = 7 # number of lines
nn = 5 # number of nodes
plt.figure()
nx.draw(grid,with_labels=True)
plt.show()

We neglect power losses; therefore, nodal power can be represented as a
function of the power flows, as follows:

𝑝node = 𝐴𝑝flow (8.14)

where 𝐴 is the incidence matrix which is calculated by the module NetworkX:

A = nx.incidence_matrix(grid,oriented=True)
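To see what the incidence matrix contains, the sketch below builds it in plain Python for the same list of edges and evaluates Equation (8.14) for a set of hypothetical flows. The sign convention assumed here (−1 at the sending node and +1 at the receiving node of each line) is intended to mirror the oriented matrix returned by NetworkX; it is worth verifying against the installed version.

```python
edges = [(0, 1), (0, 3), (1, 2), (1, 3), (1, 4), (2, 4), (3, 4)]
nn = 5                                        # number of nodes

# A[i][e] = -1 if line e leaves node i, +1 if line e arrives at node i
A = [[0] * len(edges) for _ in range(nn)]
for e, (k, m) in enumerate(edges):
    A[k][e] = -1
    A[m][e] = 1

p_flow = [10, 20, -5, 0, 15, 5, 20]           # hypothetical line flows in MW
p_node = [sum(A[i][e] * p_flow[e] for e in range(len(edges))) for i in range(nn)]
print(p_node)                                 # [-30, 0, -10, 0, 40]
print(sum(p_node))                            # 0, since losses are neglected
```

Each column of 𝐴 adds to zero, so the nodal powers always balance in this lossless model.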

Finally, we define a new variable 𝑝flow that represents the power flow in each
line at each time. We assume thermal units are connected to nodes 0 to 2 and
the load is distributed between nodes 3 and 4, with 60% in node 3 and 40% in
node 4. These considerations are transformed into constraints as presented below:
p_flow = cvx.Variable((nl,T)) # power flows
p_node = cvx.Variable((nn,T)) # nodal powers
for t in range(T):
    res +=[ p_flow[:,t] >= -smax]
    res +=[ p_flow[:,t] <= smax]
    res +=[ p_node[:,t] == A@p_flow[:,t]]
    res +=[ p_node[0,t] == p[0,t]]
    res +=[ p_node[1,t] == p[1,t]]
    res +=[ p_node[2,t] == p[2,t]]
    res +=[ p_node[3,t] == -d[t]*0.6]
    res +=[ p_node[4,t] == -d[t]*0.4]

obj = cvx.Minimize(fop)
UnitG = cvx.Problem(obj,res)
UnitG.solve()
print(UnitG.status, obj.value)

The optimal value is now 106795. As expected, the unit commitment was
affected by grid constraints. In this case, the starting-up and shutting-down
conditions are the same as in Example 8.1, but this is not the general case:
this system is tiny compared to current power systems, and the grid
has enough capability to transport all generated power. The reader is invited to
experiment with this model, modifying 𝑠max and adding practical constraints to
the model.

8.5 Further readings

A complete review of the unit commitment problem can be found in [47] that
includes stochastic and robust versions of the problem. More details about the
mathematical formulation and especially the use of binary variables can be
found in [45]. The model may include ac power flow constraints as given in
[46], where different test systems are available. An extension of the problem
for power distribution systems with renewable energy and storage devices can
be studied in [48].
Besides the mixed-integer programming approach presented in this chapter,
the unit commitment problem can be solved by heuristic techniques and
dynamic programming. A complete review of these approaches can be found
in [49].

8.6 Exercises
1. Make a comparative table between the unit commitment problem and the
economic dispatch.
2. Solve the problem presented in Example 8.1 as an economic dispatch
problem, that is to say, without including start-up and shut-down
costs. Compare the results with the unit commitment.
3. Solve the unit commitment problem in the system presented in Example 8.1
considering a spinning reserve of 𝜎 = 20 𝑀𝑊. Compare results.
4. Include now ramping limits as 𝜌up = 𝜌down = (55, 50, 20)⊤ .
5. Include a minimum uptime limit of 4 h for all thermal units.
6. Include power loss into the unit commitment model. Use a quadratic
approximation as presented in Example 7.11. Compare results.
7. Solve the unit commitment problem considering the transportation model
(Example 8.2) without using the module NetworkX.
8. Solve the problem presented in Example 8.2 using a linear power flow
instead of the transportation model. Use 𝑥𝑘𝑚 = 0.01pu. Compare results.


9. Solve the problem presented in Example 8.2, but this time the load is shared
equally (50% each) between nodes 3 and 4.
10. Formulate the unit commitment problem considering the ac power flow
constraints. Identify the main characteristics of this model (we will learn
how to solve this type of problem in Chapter 10).


Hydrothermal scheduling

Learning outcomes

By the end of this chapter, the student will be able to:

● Formulate the problem of the hydrothermal dispatch.
● Include hydraulic chains into the model.
● Study some non-linear constraints related to the model of hydraulic
units.

9.1 Short-term hydrothermal coordination

Hydropower is a renewable energy source with high potential around the world
[50]. Despite its advantages, such as high flexibility and fast dynamic response,
hydropower generation is highly vulnerable to complex weather patterns such
as the El Niño-Southern Oscillation. Therefore, systems with high hydropower
generation capability are usually complemented with thermal units, and hence
economic dispatch requires considering both hydroelectric and thermal power
stations. This problem is more complex than the economic dispatch of
all-thermal systems for two reasons: first, the problem is coupled in time; and
second, the system may have hydraulic chains. The first aspect implies that
an operation decision at one time can affect the future operation of the grid.
The second aspect implies that an operative decision in a hydroelectric unit
upstream in a river (or hydraulic chain) may affect the hydroelectric units
placed downstream in the same hydraulic chain. These two aspects are studied
in this chapter.
Hydrothermal scheduling requires considering the dynamics of the electric
part and the dynamics of the hydraulic system, which includes the change in

Mathematical Programming for Power Systems Operation: From Theory to Applications in

the volume of the reservoirs and water discharges and spillage. These variables
may be related to other uses of the reservoir, for example, irrigation; hence,
additional constraints must be included in the model. These constraints can
also be related to the safety limits of the volume and/or discharges of the
reservoir.
The level of detail of the model of the hydroelectric system may vary from
one system to another. There are many types of hydroelectric plants and
reservoirs, and each one has a different type of model. Thus, the mathematical
relation between volume/water discharge and power may be linear or non-linear.
In addition, hydraulic chains can introduce delays in the inflows that affect the
entire system's dynamics. To all this, we must add the non-linear constraints
related to losses and the cost of thermal units, obtaining a highly non-linear
model that must be solved in real time.
On the other hand, modern power systems may include pumped hydroelectric
storage power plants. This type of storage is becoming more popular
with the high penetration of renewable energies. A pumped storage plant is simply
a hydroelectric unit with two reservoirs and a reversible capability: water can be
pumped from the lower reservoir to the upper reservoir to store energy. The effect
of other renewable energies, such as wind and solar, must be included in the
model as well.

9.2 Basic hydrothermal coordination

Let us consider a hydrothermal system where the units are grouped in two sets,
ℋ for hydraulic units and 𝒯 for thermal units. Incremental costs of hydroelectric
units are usually neglected in practice; thus, the objective function consists
of minimizing the costs of thermal units, just as in the conventional economic
dispatch. However, the optimization model must include physical constraints
related to hydroelectric power plants. The problem becomes coupled in
time since an operative decision at one instant can affect the subsequent
operation. Moreover, decision variables are not only the generated power but also
the water discharge, spillage, and volume of the reservoirs.
Generated power in each hydroelectric unit 𝑖 ∈ ℋ can be calculated as given
in Equation (9.1),

$$p_i = (\rho g \eta_i h_i)\, q_i \tag{9.1}$$

where 𝜌 is the water density (≈ 1000 kg/m³); 𝑔 is the acceleration of gravity
(9.81 m/s²); ℎ𝑖 is the head of the reservoir, measured in meters; 𝜂𝑖 is the
efficiency of the turbine-generator group; and 𝑞𝑖 is the water discharge in m³/s
(i.e., the flow passing through the turbine). The three most common types of
turbines are Pelton, Francis, and Kaplan. Pelton turbines are impulse turbines used for high-head

plants, while Francis and Kaplan are reaction turbines used for medium and
low heads, respectively. With regard to the water flow, Pelton turbines are used for
relatively low water flow rates while Francis and Kaplan are used for high water
flow [51]. The head and efficiency of the unit depend on the volume of the
reservoir and the water discharge; however, they can be considered constant for
hydro plants with large reservoir capability and in cases where the powerhouse
is placed at a long distance below the dam. In those cases, generated power can
be considered proportional to the water discharge, as given in Equation (9.2):

𝑝𝑖 = 𝜇𝑖 𝑞𝑖 , ∀𝑖 ∈ ℋ (9.2)

where 𝜇𝑖 = 𝜌𝑔𝜂𝑖 ℎ𝑖 is called turbine factor.
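The turbine factor can be evaluated numerically as in the sketch below. The efficiency, head, and discharge are hypothetical values; the division by 10⁶ converts watts to megawatts so that 𝜇 is expressed in MW per m³/s.

```python
RHO = 1000.0   # water density in kg/m^3
G = 9.81       # acceleration of gravity in m/s^2

def turbine_factor(efficiency, head_m):
    """Turbine factor mu = rho*g*eta*h, expressed in MW per (m^3/s)."""
    return RHO * G * efficiency * head_m / 1e6

mu_factor = turbine_factor(efficiency=0.9, head_m=100.0)   # hypothetical unit
power = mu_factor * 20.0                                   # discharge of 20 m^3/s
print(round(mu_factor, 4), "MW per m^3/s")                 # 0.8829
print(round(power, 3), "MW")                               # 17.658
```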

The dynamics of the reservoir must be included in the model. They involve
the volume of the reservoir 𝑣, water discharge 𝑞, inflows 𝑎, and spillage 𝑠 as
given in Equation (9.3),

$$v_{i,t+1} = v_{it} + \Delta T\,(a_{it} - q_{it} - s_{it}), \quad \forall i \in \mathcal{H},\ t \in \mathsf{T} \tag{9.3}$$

where ∆𝑇 is the time step used to discretize the operation horizon
𝖳 = {0, 1, … , 𝑇} (usually ∆𝑇 = 1 h). The horizon may be one day, one week, or one
month, discretized in hours or even minutes, according to the desired level of
detail. In this model, we assume an accurate forecast of the inflows, so the
model is deterministic.
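Equation (9.3) is a simple recursion, as the following sketch shows. The function name and the data are hypothetical, and volumes and flows are assumed to be expressed in consistent units (for instance, volume in flow-hours when ∆𝑇 = 1 h).

```python
def simulate_volume(v0, inflow, discharge, spillage, dt=1.0):
    """Volume trajectory of one reservoir according to Equation (9.3)."""
    v = [v0]
    for a, q, s in zip(inflow, discharge, spillage):
        v.append(v[-1] + dt * (a - q - s))
    return v

v = simulate_volume(v0=150, inflow=[10, 9, 8],
                    discharge=[12, 12, 12], spillage=[0, 0, 0])
print(v)   # [150, 148.0, 145.0, 141.0]
```

The trajectory has one more element than the number of periods, which is why the volume variable in the scripts of this chapter is one entry longer than the other variables.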
The entire model for the short-term hydrothermal dispatch is presented
below:
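$$
\begin{aligned}
\min\ & \sum_{t\in\mathsf{T}} \sum_{k\in\mathcal{T}} f_k(p_{kt}) \\
& p_{it} = \mu_i q_{it}, && \forall i\in\mathcal{H} \\
& v_{i,t+1} = v_{it} + a_{it} - q_{it} - s_{it}, && \forall i\in\mathcal{H},\ t\in\mathsf{T} \\
& \sum_{i\in\mathcal{H}} p_{it} + \sum_{k\in\mathcal{T}} p_{kt} = d_t, && \forall t\in\mathsf{T} \\
& v_i^{\min} \leq v_{it} \leq v_i^{\max}, && \forall i\in\mathcal{H},\ t\in\mathsf{T} \\
& q_i^{\min} \leq q_{it} \leq q_i^{\max}, && \forall i\in\mathcal{H},\ t\in\mathsf{T} \\
& p_i^{\min} \leq p_{it} \leq p_i^{\max}, && \forall i\in\mathcal{H},\ t\in\mathsf{T} \\
& p_k^{\min} \leq p_{kt} \leq p_k^{\max}, && \forall k\in\mathcal{T},\ t\in\mathsf{T} \\
& 0 \leq s_{it} \leq s_i^{\max}, && \forall i\in\mathcal{H},\ t\in\mathsf{T} \\
& v_{i0} = v_i^{\text{initial}} \\
& v_{iT} = v_i^{\text{final}}
\end{aligned}
\tag{9.4}
$$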


The model is similar to the conventional economic dispatch of thermal units,
except that, in this case, the power balance equation includes both thermal and
hydropower plants. The objective function is associated with thermal power
plants' operation cost since the incremental operating cost of hydroelectric
units is almost zero (a suitable approximation in practice). The objective
function can be either quadratic or linear, according to the cost model of thermal
units. The rest of the constraints are related to the dynamics of the reservoir,
the generated power of the hydroelectric units, and box constraints that represent
the limits of each variable. The constraints are affine in this basic model,
and hence the problem is convex, although it may present a high number of
decision variables. Initial and final values of the volume in the reservoirs are
obtained by a medium or long-term model of hydrothermal coordination¹.
Example 9.1. Let us consider a hydrothermal system with one hydropower
plant 𝑝𝐻 and one thermal unit 𝑝𝑇 with a linear cost function 𝑓 = 19.2𝑝𝑇. Other
parameters of the system are $p_T^{\min} = 0$, $p_T^{\max} = 250$, $p_H^{\min} = 0$,
$p_H^{\max} = 150$, $\mu_H = 8.5$, $v^{\max} = 150$, $v^{\min} = 80$. The initial volume
of the reservoir is $v^{\text{initial}} = 150$ while the final volume is required to be
at least $v^{\text{final}} = 80$. The demand and inflows are presented in Figure 9.1.
The code in Python is a direct representation of Equation (9.4) as presented
below:
import numpy as np
import cvxpy as cvx
import matplotlib.pyplot as plt
d = (137, 139, 136, 129, 129, 141, 165, 200, 224, 232, 223, 231,
     223, 220, 213, 207, 213, 214, 224, 228, 224, 212, 185, 159)  # load
a = (10,9,8,7,7,7,8,9,10,10,10,9,8,8,8,8,8,9,9,9,9,9,9,9)  # inflows
pH = cvx.Variable(24) # power hydro unit
pT = cvx.Variable(24) # power thermo unit
q = cvx.Variable(24) # water discharge
s = cvx.Variable(24) # spillage
v = cvx.Variable(25) # volume
cost = 0
res = [v[0]==150, v[24]==80]
for t in range(24):
    cost = cost + 19.2*pT[t]
    res += [pH[t] == 8.5*q[t]]
    res += [pH[t] + pT[t] == d[t]]
    res += [80 <= v[t], v[t] <= 150]
    res += [0 <= pT[t], pT[t] <= 250]
    res += [0 <= pH[t], pH[t] <= 150]
    res += [s[t] >= 0]
    res += [v[t+1] == v[t] + a[t] - s[t] - q[t]]
HydTh = cvx.Problem(cvx.Minimize(cost), res)
HydTh.solve()

1 Long-term hydrothermal coordination is a stochastic problem related to operational
planning. Although its model is similar to the short-term hydrothermal coordination, the
study of this problem is beyond the objectives of this book. The interested reader is
invited to see [52].

Despite being a system with only two units, the model has 121 decision
variables, a very high number compared to a dispatch in an all-thermal system.
Nevertheless, the model is linear or, at most, quadratic. Generated power in each
unit is given in Figure 9.2 together with the total demand. The power generated
by the hydroelectric unit is very flat, following a curve that guarantees the reser-
voir’s initial and final volume and minimizes the use of the thermal unit. The
power in the thermal unit tries to follow the demand at a minimum cost.
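A quick water-budget calculation gives a lower bound for the cost of Example 9.1, which is useful as a sanity check of the solver output. The sketch below assumes, as in the script, that the final volume is fixed at 80 and that spillage is non-negative; the remaining limits are ignored, so the actual optimum may be higher than this bound. It also counts the decision variables of the model.

```python
d = (137, 139, 136, 129, 129, 141, 165, 200, 224, 232, 223, 231,
     223, 220, 213, 207, 213, 214, 224, 228, 224, 212, 185, 159)   # load
a = (10, 9, 8, 7, 7, 7, 8, 9, 10, 10, 10, 9,
     8, 8, 8, 8, 8, 9, 9, 9, 9, 9, 9, 9)                           # inflows

n_vars = 4 * 24 + 25                  # pH, pT, q, s (24 each) and v (25)
water = 150 - 80 + sum(a)             # maximum water that can be discharged
hydro_energy = 8.5 * water            # maximum energy from the hydro unit
thermal_energy = sum(d) - hydro_energy
min_cost = 19.2 * thermal_energy      # lower bound of the objective function

print(n_vars)                         # 121
print(water, hydro_energy)            # 277 2354.5
print(round(min_cost, 1))             # 43267.2
```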

9.3 Non-linear models

The linear model given by Equation (9.2) may be insufficient to represent
hydroelectric units accurately, especially in the case of variable-head hydro
plants. In those cases, a quadratic model that relates the output power with the
water discharge and the volume of the reservoir is required, as given in Equation (9.5),

Figure 9.1 Power demand and inflows for a hydrothermal system.


Figure 9.2 Power demand and generated power.

$$p_{it} = -x^\top A x + b^\top x + c \tag{9.5}$$

where 𝑥 = (𝑣𝑖𝑡 , 𝑞𝑖𝑡 )⊤ and 𝐴 is a real square 2 × 2 matrix. This equation
includes the effect of the turbine efficiency as well as the head variations which,
in most hydroturbines, are given by the so-called hill-chart curve [53]. This curve
describes a concave surface, and hence 𝐴 can be adjusted to a positive definite
matrix, making Equation (9.5) a concave quadratic form² as shown in
Figure 9.3.
As an equality constraint, Equation (9.5) is non-convex; however, it can be
approximated either by a semidefinite or a second-order cone constraint as
presented in [54] and [55], respectively.
On the other hand, the effect of the grid can be included in the model, just
as in the case of thermal units, as presented in the following example adapted
from [56].

Example 9.2. A hydrothermal system consists of four units labeled as
{0, 1, 2, 3}, where {0, 1} are hydroelectric units and {2, 3} are thermal power
plants. Each hydroelectric unit has a non-linear relation to the water discharge
given by the following concave quadratic functions:

$$p_0 = -0.58 q_0^2 + 3.60 q_0 + 0.89 \tag{9.6}$$

$$p_1 = -0.78 q_1^2 + 3.96 q_1 + 1.13 \tag{9.7}$$

2 Notice the sign minus in Equation (9.5).

Figure 9.3 Quadratic function for a hydropower unit.

The minimum permissible water discharge is 𝑞 = 0.5 for both units. Generation
costs of the thermal plants are adjusted to the following quadratic forms:

$$f_2(p_2) = 0.80 p_2 + 0.02 p_2^2 \tag{9.8}$$

$$f_3(p_3) = 0.78 p_3 + 0.03 p_3^2 \tag{9.9}$$

In addition, power losses are given by the following loss-formula matrix:

$$
B = \begin{pmatrix}
 0.05 & -0.02 &  0.01 & 0.00 \\
-0.02 &  0.06 & -0.02 & 0.01 \\
 0.01 & -0.02 &  0.04 & 0.00 \\
 0.00 &  0.01 &  0.00 & 0.02
\end{pmatrix}
\tag{9.10}
$$
Initial and final volume are 10 pu for reservoir 0 and 12 pu for reservoir 1.
Inflows and power demand for 12 h operation are included in the following
script:
import numpy as np
import cvxpy as cvx
p = cvx.Variable((12,4)) # hydro = {0,1} thermal = {2,3}
pL = cvx.Variable(12) # losses
v = cvx.Variable((13,2)) # volume
q = cvx.Variable((12,2)) # water discharge
a = [[0, 0.6, 1.2, 1.2, 1.2, 1.8, 2.4, 1.5, 1.2, 0.9, 0, 0],
     [0, 0.0, 0.0, 1.5, 3.0, 4.5, 4.5, 1.5, 0.0, 0.0, 0, 0]]  # inflows
B = np.array([[ 0.05,-0.02, 0.01, 0.00],
              [-0.02, 0.06,-0.02, 0.01],
              [ 0.01,-0.02, 0.04, 0.00],
              [ 0.00, 0.01, 0.00, 0.02]]) # loss matrix
d = [8, 7, 7, 6, 7, 8, 9, 8, 7, 7, 8, 8] # demand
cost = 0
res = [v[0,0] == 10, v[0,1] == 12, v[12,0]==10, v[12,1] == 12]
for t in range(12):
    cost = cost + 0.02*p[t,2]**2 + 0.80*p[t,2]
    cost = cost + 0.03*p[t,3]**2 + 0.78*p[t,3]
    res += [cvx.sum(p[t,:]) == d[t] + pL[t]]
    res += [pL[t] >= cvx.quad_form(p[t,:],B)]
    res += [q[t,:] >= 0.5]
    res += [p[t,:] >= 0.0]
    # relaxed Equations (9.6) and (9.7): p <= -a2*q**2 + a1*q + a0
    res += [p[t,0] + 0.58*q[t,0]**2 - 3.60*q[t,0] - 0.89 <= 0 ]
    res += [p[t,1] + 0.78*q[t,1]**2 - 3.96*q[t,1] - 1.13 <= 0 ]
    res += [v[t+1,0] == v[t,0] + a[0][t] - q[t,0]]
    res += [v[t+1,1] == v[t,1] + a[1][t] - q[t,1]]
HydTh = cvx.Problem(cvx.Minimize(cost), res)
HydTh.solve()

In this case, the code is a little different from the previous example. First, the
generated power is stored in a single matrix variable where the first two columns
correspond to hydraulic units and the last two to thermal units. Spillages were set
to zero, and the quadratic functions were included as inequality constraints. Notice
this is a relaxation whose accuracy must be checked after solving the optimization
problem. As always, the reader is invited to execute and experiment with the
script.
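One possible way to evaluate the relaxation is to measure, for each period, how far the solution is from the original equalities. The sketch below does this in plain Python for the hydro curve of Equation (9.6) and for the loss formula of Equation (9.10); the function names and the operating points are hypothetical, and in practice the values would be taken from p.value, q.value, and pL.value after solving.

```python
B = [[ 0.05, -0.02,  0.01, 0.00],
     [-0.02,  0.06, -0.02, 0.01],
     [ 0.01, -0.02,  0.04, 0.00],
     [ 0.00,  0.01,  0.00, 0.02]]

def hydro_curve(q, a2, a1, a0):
    """Concave power curve p = -a2*q**2 + a1*q + a0, as in Equation (9.6)."""
    return -a2 * q**2 + a1 * q + a0

def losses(p):
    """Quadratic loss formula p' B p."""
    return sum(p[i] * B[i][j] * p[j] for i in range(4) for j in range(4))

# gap of the relaxed hydro constraint p <= h(q); zero means the equality holds
q0, p0 = 2.0, 5.0                                    # hypothetical values
gap_hydro = hydro_curve(q0, 0.58, 3.60, 0.89) - p0
print(round(gap_hydro, 4))                           # 0.77

# gap of the relaxed loss constraint pL >= p' B p
p_t, pL_t = [1.0, 1.0, 1.0, 1.0], 0.13               # hypothetical values
gap_loss = pL_t - losses(p_t)
print(abs(gap_loss) < 1e-9)                          # True: the relaxation is tight here
```

A gap close to zero in every period indicates that the relaxed solution also satisfies the original equality constraints.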

9.4 Hydraulic chains

Generation systems may contain hydraulic chains where the spillage and water
discharge of one unit are part of the inflows of other units downstream, as
shown in Figure 9.4. Hydraulic chains constitute a hydraulic network where
each node represents a reservoir. Therefore, a balance of flows must be included
for each of these reservoirs as given in Equation (9.11),
$$v_{it} = v_{i,t-1} + a_{it} - q_{it} - s_{it} + \sum_{j\in\Omega_i} \left(q_{j,t-\tau} + s_{j,t-\tau}\right), \quad \forall i\in\mathcal{H},\ t\in\mathsf{T} \tag{9.11}$$

where Ω𝑖 represents the set of upstream reservoirs that discharge into reservoir 𝑖 in the
hydraulic chain³. Notice that Equation (9.11) is an affine constraint; hence, the
problem remains convex. On the other hand, flows that come from an upper
reservoir to a lower reservoir do not arrive immediately. Therefore, a time delay

3 For the hydraulic chain given in Figure 9.4, we have that Ω2 = {0, 1} and Ω3 = {2}, whereas
Ω0 and Ω1 are just empty.

Figure 9.4 Example of a hydraulic chain where 𝑎𝑖 represents inflows, 𝑣𝑖 volume, 𝑞𝑖
flow, 𝑠𝑖 spillage, and 𝜏𝑖 delays.

𝜏𝑖 must be considered in each branch of the hydraulic network. This delay
is not relevant for medium and long-term hydrothermal coordination, but it
is important in short-term scheduling. Box constraints related to spillage and
water discharge must be carefully considered in hydraulic chains. Safety levels
of the rivers and other uses of the reservoir introduce additional constraints to
the problem [57].
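The balance of flows with delays can be sketched as a simulation in plain Python. The data structures used here (a dictionary of upstream reservoirs and a dictionary of delays per branch) and the numerical values are hypothetical; releases that would have occurred before the start of the horizon are ignored, which is an assumption of this sketch.

```python
def chain_volumes(v0, inflow, q, s, upstream, delay):
    """Volumes v[t][i] of a hydraulic chain with delayed upstream releases."""
    n, T = len(v0), len(inflow)
    v = [list(v0)]
    for t in range(T):
        row = []
        for i in range(n):
            arriving = 0.0
            for j in upstream[i]:
                tau = delay[(j, i)]
                if t - tau >= 0:           # releases before the horizon are ignored
                    arriving += q[t - tau][j] + s[t - tau][j]
            row.append(v[t][i] + inflow[t][i] - q[t][i] - s[t][i] + arriving)
        v.append(row)
    return v

upstream = {0: [], 1: [0]}                 # reservoir 0 discharges into reservoir 1
delay = {(0, 1): 1}                        # water takes one period to arrive
inflow = [[5, 0]] * 3
q = [[10, 2]] * 3
s = [[0, 0]] * 3
v = chain_volumes([100, 50], inflow, q, s, upstream, delay)
print(v[-1])                               # [85.0, 64.0]
```

In the first period the lower reservoir only loses its own discharge; from the second period on, it receives the water released upstream one period earlier.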

Example 9.3. A power generation system consists of a large thermal unit and
the hydraulic chain depicted in Figure 9.4. The operation cost of the thermal unit is
linear and given by 𝑓(𝑝) = 19.2𝑝. Parameters of the system, including demand
and inflows, are coded in Python as follows:

numh = 4
vmin = [80,60,100,70]
vmax = [150,120,240,160]
vini = [100,80,170,120]
vend = [120,80,170,100]
qmin = numh*[0]
qmax = [15,15,30,30]
smin = numh*[0]
smax = numh*[10]
pHmax = numh*[500]
pHmin = numh*[0]
pTmax = 2500
pTmin = 0
a = [[ 9.0, 7.5, 2.5, 2.8],
[ 9.2, 8.0, 3.0, 2.4],
[ 9.5, 8.8, 4.0, 1.6],
[ 9.6, 9.0, 4.5, 1.0],
[ 9.8, 9.3, 4.3, 1.0],
[ 9.9, 9.5, 4.0, 1.0],
[10.0, 10.0, 3.0, 1.0],
[10.3, 10.2, 2.0, 1.3],
[10.5, 10.3, 1.5, 1.5],
[11.0, 10.3, 1.0, 1.6],
[11.2, 10.5, 1.0, 1.7],
[11.5, 10.4, 1.8, 1.5],
[11.4, 10.3, 2.3, 1.5],
[11.3, 10.0, 3.0, 1.3],
[11.2, 9.8, 3.0, 1.2],
[10.0, 9.5, 2.8, 1.2],
[ 9.3, 9.3, 2.5, 1.2],
[ 8.6, 9.0, 2.0, 1.0],
[ 7.5, 8.8, 1.8, 1.0],
[ 7.0, 8.7, 1.6, 0.8],
[ 7.2, 8.6, 1.6, 0.8],
[ 7.3, 8.3, 1.8, 0.8],
[ 7.4, 8.0, 2.0, 0.8],
[ 7.5, 8.0, 2.0, 0.8]]
d = [685,695,680,645,645,705,825,1000, 1120,1160,1115,1155,
1115,1100,1065,1035,1065,1070,1120,1140, 1120,1060,925,795]
miu = [6.5,5.5,9.4,4.7]


A linear model for the hydrothermal scheduling is presented below, where
delays were neglected:
import cvxpy as cvx
pT = cvx.Variable(24)         # thermal power
pH = cvx.Variable((24,numh))  # hydro power
v = cvx.Variable((25,numh))   # volume
q = cvx.Variable((24,numh))   # water discharge
s = cvx.Variable((24,numh))   # spillage
cost = 5000
res = []
for t in range(24):
    cost=cost + 19.2*pT[t]
    res+=[pT[t] >= pTmin, pT[t] <= pTmax]
    res+=[cvx.sum(pH[t,:])+pT[t]==d[t]]
    # balance of flows in each reservoir of the chain
    res+=[v[t+1,0]==v[t,0]+a[t][0]-s[t,0]-q[t,0]]
    res+=[v[t+1,1]==v[t,1]+a[t][1]-s[t,1]-q[t,1]]
    res+=[v[t+1,2]==v[t,2]+a[t][2]-s[t,2]-q[t,2]
          +s[t,0]+q[t,0]+s[t,1]+q[t,1]]
    res+=[v[t+1,3]==v[t,3]+a[t][3]-s[t,3]-q[t,3]+s[t,2]+q[t,2]]
    for k in range(numh):
        res += [v[0,k] == vini[k], v[24,k] == vend[k]]
        res += [v[t,k] >= vmin[k], v[t,k] <= vmax[k]]
        res += [q[t,k] >= qmin[k], q[t,k] <= qmax[k]]
        res += [s[t,k] >= smin[k], s[t,k] <= smax[k]]
        res += [pH[t,k] >= pHmin[k], pH[t,k] <= pHmax[k]]
        res += [pH[t,k] == miu[k]*q[t,k]]
HydTh = cvx.Problem(cvx.Minimize(cost), res)
HydTh.solve()

The model does not grow significantly in the number of variables. It is required
only to include additional constraints related to the balance of flows in each
reservoir in the hydraulic chain. Otherwise, the model is the same as the pre-
vious cases. Spillages are important in hydraulic chains since they affect the
power production of hydraulic units placed downstream. In some cases, the
optimization model could introduce spillages in one reservoir in order to supply
another reservoir and increase power generation.

9.5 Pumped hydroelectric storage

Pumped hydroelectric storage is a classic technology that is regaining attention
due to the increasing penetration of wind and solar generation. These new
types of renewable resources present high variability, and hence energy stor-
age is required. A pumped energy storage system consists of two reservoirs
connected with a combined pump/turbine system as shown in Figure 9.5. In
generation mode, water flows from the upper to the lower reservoir, generating


Figure 9.5 Schematic representation of a pumped hydroelectric storage system.

power to the electric grid. In charging mode, water is pumped from the lower
to the upper reservoir taking electric power from the grid.
Finding a suitable place for building a pumped hydroelectric storage sys-
tem is the main limitation of this technology. However, construction of a lower
reservoir placed deep underground and directly below the upper reservoir can
reduce this limitation [58]. Efficiency and energy density are other important
limitations; the total efficiency of existing pumped hydroelectric storage systems
is around 70−85% [59]. However, the use of variable speed systems can increase
these values [60].
Compared to other storage technologies, pumped hydroelectric storage has the
largest capacity in both energy and power, which varies from 1 to 300 MW.
The turbine/pump system is usually placed just below the upper reservoir,
connected with a vertical tunnel or penstock. Many existing pumped hydroelectric
plants consist of separate pump and turbine systems, but current configurations are
based on reversible turbines. A separate pump and turbine system allows for a
shorter transition time between pumping and generation modes, but its cost is
high.
The model of a pumped hydroelectric plant requires considering the dynamics of
the reservoirs, just as in the case of hydraulic chains:
$$v_{t+1}^{\text{up}} = v_t^{\text{up}} - q_t \tag{9.12}$$
$$v_{t+1}^{\text{dw}} = v_t^{\text{dw}} + q_t \tag{9.13}$$

The model must include inflows and spillage in case they exist. The model
must consider the net efficiency 𝜂 for a charge/discharge cycle as presented
in Equation (9.14).
$$p_t^{\text{gen}} - \eta\, p_t^{\text{pmp}} = \mu q_t \tag{9.14}$$


where $p^{\text{gen}}$ is the power generated by the pumped hydroelectric plant and
$p^{\text{pmp}}$ is the power required from the system to pump water to the upper
reservoir. Pumping requires more energy than is obtained by generating, and hence 𝜂 ≤ 1.
Equation (9.14) is valid only if $p^{\text{gen}}$ and $p^{\text{pmp}}$ are not positive
simultaneously; that is to say, the system is generating or pumping but not both at the
same time. This condition can be added to the model as the following set of constraints:

$$
\begin{aligned}
0 \leq p_t^{\text{gen}} &\leq p_t^{\max} x_t \\
0 \leq p_t^{\text{pmp}} &\leq p_t^{\max} (1 - x_t)
\end{aligned}
\tag{9.15}
$$

where 𝑥𝑡 is a binary variable. When 𝑥𝑡 = 0, the generated power is zero and
𝑝pmp takes values from zero to its maximum; the opposite occurs when 𝑥𝑡 = 1.
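The bookkeeping implied by Equations (9.12) to (9.15) can be sketched in plain Python as follows. The function name, the schedule, and the parameters (𝜂 = 0.8, 𝜇 = 1, and reservoirs of 120 units) are hypothetical values chosen for illustration.

```python
def simulate_phs(p_gen, p_pmp, eta=0.8, mu=1.0, v_up0=0.0, v_dw0=120.0):
    """Reservoir volumes of a pumped storage unit for a given schedule."""
    v_up, v_dw = [v_up0], [v_dw0]
    for g, c in zip(p_gen, p_pmp):
        if g > 0 and c > 0:                # excluded by the binary variable in (9.15)
            raise ValueError("generating and pumping at the same time")
        q = (g - eta * c) / mu             # Equation (9.14); q < 0 when pumping
        v_up.append(v_up[-1] - q)          # Equation (9.12)
        v_dw.append(v_dw[-1] + q)          # Equation (9.13)
    return v_up, v_dw

p_pmp = [30, 30, 0, 0]                     # pump during two periods
p_gen = [0, 0, 24, 24]                     # generate during the next two
v_up, v_dw = simulate_phs(p_gen, p_pmp)
print([round(x, 6) for x in v_up])         # [0.0, 24.0, 48.0, 24.0, 0.0]
print(sum(p_gen) - sum(p_pmp))             # -12: net energy taken from the grid
print(sum(p_gen) / sum(p_pmp))             # 0.8: round-trip efficiency
```

When the reservoirs return to their initial state, the generated energy equals 𝜂 times the pumped energy, so the net exchange with the grid over a full cycle is negative.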
The conventional use of pumped hydroelectric balances the load allowing
nuclear plants to maintain constant power and/or to compensate for the high
variability of wind and solar systems, as presented in the following example.
Example 9.4. Let us consider a generation system consisting of a large solar
farm (100MW), a small thermal unit (10MW), and a pumped hydroelectric
(30MW/120MWh). The system can buy and sell energy to the main grid; the
objective is to maximize total income. Therefore, the pumped hydroelectric can
buy energy from the grid at periods of low price to sell this energy at periods
of a high price. The system is also able to store the energy generated by the
solar plant. The price of the energy 𝑐𝑡 is variable according to the hour, and the
operation costs of the thermal unit are assumed linear. Therefore, the objective
function is as follows:
$$\max\ f = \sum_{t\in\mathsf{T}} \left(c_t p_t - \alpha p_t^{\text{thm}}\right) \tag{9.16}$$

where 𝛼 is the incremental cost of the thermal unit, and 𝑝𝑡thm is generated power
at the time 𝑡. Moreover, 𝑝𝑡 is the total power trade with the main grid, that is to
say:
$$p_t = p_t^{\text{thm}} + p_t^{\text{sol}} + p_t^{\text{gen}} - p_t^{\text{pmp}} \tag{9.17}$$

where 𝑝gen is the power injected by the hydroelectric in generation mode, 𝑝pmp
is the power taken from the grid in pumping mode, and 𝑝sol is the power gen-
erated by the solar farm; notice that 𝑝𝑡 may be negative, meaning the system is
taking energy from the main grid to pump water.
The optimization model implemented in Python is presented below:
import numpy as np
import cvxpy as cvx
pS = [0,0,0,0,0,0,0,26,50,71,87,97,100,97,87,71,50,26,0,0,0,0,0,0]  # solar power
c = [0.4, 0.4, 0.4, 0.4, 0.4, 0.5, 0.6, 0.6, 0.6, 0.5, 0.5, 0.4,
     0.4, 0.4, 0.5, 0.5, 0.6, 0.9, 1.1, 1.1, 1.0, 0.8, 0.7, 0.5]    # energy price
vup = cvx.Variable(25,nonneg=True)   # volume of the upper reservoir
vdw = cvx.Variable(25,nonneg=True)   # volume of the lower reservoir
pgen = cvx.Variable(24,nonneg=True)  # power in generation mode
ppmp = cvx.Variable(24, nonneg=True) # power in pumping mode
pthm = cvx.Variable(24, nonneg=True) # power of the thermal unit
q = cvx.Variable(24)                 # water flow between reservoirs
p = cvx.Variable(24)                 # power traded with the main grid
x = cvx.Variable(24, boolean=True)   # 1: generating, 0: pumping
f = 0
res = [vup[0] == 0, vdw[0] == 120] # initial conditions
for t in range(24):
    f = f + c[t]*p[t]-0.95*pthm[t]
    res += [vup[t] <= 120, vdw[t] <= 120]
    res += [pgen[t] - 0.8*ppmp[t] == 1*q[t]]
    res += [pgen[t] - ppmp[t] + pS[t] + pthm[t] == p[t]]
    res += [pgen[t] <= 30*x[t], ppmp[t] <= 30*(1-x[t])]
    res += [vup[t+1] == vup[t] - q[t] ]
    res += [vdw[t+1] == vdw[t] + q[t] ]
    res += [pthm[t] <= 10]
PHS = cvx.Problem(cvx.Maximize(f),res)
PHS.solve()
print('eff:', np.sum(pgen.value-ppmp.value))

Results of this problem are shown in Figure 9.6. The lower reservoir starts
full and the upper reservoir empty. Prices are low in the first four hours, and
hence, the hydroelectric unit starts pumping water; from 4 am to 9 am prices
increase, making it viable to generate with this stored energy. At midday, solar
generation is maximum, but prices are minimum. Therefore, it is convenient to
store this energy by pumping water to the upper reservoir; this energy is released
to the grid from 16 h to 20 h, when the prices are maximum. The thermal unit
is turned on in this last period. The storage system ends with the same starting
conditions (lower reservoir full and upper reservoir empty). Total efficiency of
the storage process can be calculated by adding 𝑝gen − 𝑝pmp; in this case, the
result is 40MW.

9.6 Further readings

The hydrothermal schedule has been usually solved by classic techniques such
as linear programming, Lagrangian relaxation [61], and dynamic programming
[56]. There is a vast literature about the use of metaheuristic techniques, such
as simulated annealing [62] and genetic algorithms [63]. However, modern
approaches are based on convex optimization, including semidefinite pro-
gramming [54] and second-order cone approximations [55]. Other renewable
generation can also be introduced in the model using stochastic optimization
as presented in [64].


Figure 9.6 Results for the Example 9.4.

All models presented in this chapter simplify real operation problems, which
can consider coupling with other models such as the unit commitment [65].
There is a vast literature in the field, especially in the power systems
community of the IEEE; however, the problem has also been studied by other
communities, for example, the operations research community [66]. The problem may be
extended to operational planning that includes periods of one or several
years; in that case, the problem is also stochastic. A tutorial on stochastic
programming to solve this problem can also be found in [67].

9.7 Exercises
1. Solve the problem given in Example 9.1 considering the grid depicted in
Figure 9.7. Use the transportation model with 𝑝max = 150MW in all
transmission lines.
2. Solve the hydrothermal scheduling problem given in Example 9.1 but
now, consider a non-linear model of the hydroelectric power given by
Equation (9.18),

$$p_H = h(v, q) = -0.0042 v^2 + 0.03 v q - 0.42 q^2 + 0.9 v + 10 q - 50 \tag{9.18}$$

where 𝑣 is the volume of the reservoir and 𝑞 is the water discharge. Plot the
surface and transform the equation into a second-order inequality constraint
ℎ(𝑣, 𝑞) ≥ 𝑝𝐻 . Solve the corresponding hydrothermal dispatch and compare
results with the linear model.
3. Solve the problem presented in Example 9.2 but without considering losses.
Analyze and compare results.


Figure 9.7 Three node system for hydrothermal scheduling.

4. Quadratic equality constraints related to power loss and water discharge
were relaxed to convex inequality constraints in Example 9.2. Evaluate the
accuracy of this approximation.
5. Execute the script presented in Example 9.3. Plot volume, water discharge,
spillage, and generated power in each unit vs time.
6. Solve the hydrothermal scheduling problem given in Example 9.3 but
assume that each hydroelectric unit is independent, i.e., without the
hydraulic chain; compare results.
7. Solve the hydrothermal dispatch problem given in Example 9.3 considering
time delays in the hydraulic chain. Consider 𝜏1 = 𝜏2 = 1 and 𝜏3 = 2.
8. Solve the hydrothermal scheduling problem with pumped hydroelectric
storage presented in Example 9.4 without allowing charge from the grid.
9. Solve the problem presented in Example 9.4 without considering the ther-
mal unit.
10. Introduce a pumped hydroelectric storage unit into Example 9.1. Use the
parameters of Example 9.4.


10 Optimal power flow

Learning outcomes

By the end of this chapter, the student will be able to:

● Formulate the optimal power flow problem.
● Solve linear, SOC, and SDP approximations for the OPF.
● Identify the advantages and disadvantages of each approximation.

10.1 OPF in power distribution grids

Modern power distribution grids include renewable energy sources and energy
storage devices that inject active and reactive power into the grid; each
configuration of generation and demand results in a different operation point.
However, not all operation points are equally good; in practice, we seek operation
points with minimum losses. Finding such a point is the main objective of the
optimal power flow (OPF).
A power distribution grid is represented as an oriented graph 𝒢 = {𝒩, ℰ},
where 𝒩 = {0, 1, 2, … , 𝑛 − 1} is the set of nodes and ℰ ⊆ 𝒩 × 𝒩 is the set of
edges. By convention, the slack node is 0 and its voltage is 𝑣0 = 1∠0. The nodal
admittance matrix 𝑌bus = [𝑦𝑘𝑚] ∈ ℂ^(𝑛×𝑛) allows the nodal currents to be
calculated from the nodal voltages, as given in (10.1).

\[ i_k = \sum_{m \in \mathcal{N}} y_{km} v_m, \quad \forall k \in \mathcal{N} \tag{10.1} \]

This is an affine equation and is therefore easily included in any optimization model.

However, loads and generators are usually represented in terms of active
and reactive power. Therefore, the nodal equations become non-linear, as given
in (10.2).

\[ \left( \frac{s_k - d_k}{v_k} \right)^{*} = \sum_{m \in \mathcal{N}} y_{km} v_m, \quad \forall k \in \mathcal{N} \tag{10.2} \]

where (⋅)∗ represents the complex conjugate, 𝑠𝑘 is the generated nodal power,
and 𝑑𝑘 is the corresponding load. For the sake of a compact representation of
the model, we will assume that subscripts 𝑚 and 𝑘 belong to 𝒩 in all cases. The
model is presented in complex variables; for example, Equation (10.2) is written
in the complex domain. This is only a representation, since the equation
must be separated into real and imaginary parts before it is solved. However, a
complex representation is more direct when implemented in Python¹.
Although the problem may consider different objectives and may be combined
with problems such as the economic/environmental dispatch, the typical
application consists of minimizing the power losses given by (10.3):

\[ p_L = \operatorname{real}\left( \sum_{k} \sum_{m} y_{km} v_k v_m^{*} \right) \tag{10.3} \]

This equation can be represented in the real domain by splitting 𝑦𝑘𝑚 = 𝑔𝑘𝑚 + 𝑗𝑏𝑘𝑚
and 𝑣 = 𝑣^real + 𝑗𝑣^imag, resulting in the following equivalent expression:

\[ p_L = \sum_{k} \sum_{m} g_{km} \left( v_k^{\text{real}} v_m^{\text{real}} + v_k^{\text{imag}} v_m^{\text{imag}} \right) \tag{10.4} \]

Since 𝐺 = [𝑔𝑘𝑚] ∈ ℝ^(𝑛×𝑛) is positive semidefinite², 𝑝𝐿 is a convex function.
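The equivalence between (10.3) and (10.4), and the positive semidefiniteness of 𝐺, can be checked numerically. The sketch below is only an illustration: the three-node feeder, its impedances, and the voltages are made-up values, not data from the book.

```python
import numpy as np

# Hypothetical 3-node radial feeder (values chosen only for illustration)
A = np.array([[ 1.0,  0.0],
              [-1.0,  1.0],
              [ 0.0, -1.0]])                       # node-by-branch incidence matrix
y_branch = np.array([1/(0.01+0.02j), 1/(0.02+0.03j)])
Ybus = A @ np.diag(y_branch) @ A.T                 # same construction as Ybus = A Yp A^T
G = Ybus.real                                      # conductance matrix

v = np.array([1.0+0.0j, 0.98-0.01j, 0.97-0.02j])   # arbitrary nodal voltages

# Equation (10.3): losses written with complex variables
pL_complex = np.real(v @ Ybus @ np.conj(v))

# Equation (10.4): losses written with real and imaginary parts only
pL_real = v.real @ G @ v.real + v.imag @ G @ v.imag

print(pL_complex, pL_real)
print(np.linalg.eigvalsh(G))   # eigenvalues of G: non-negative up to round-off
```

Both expressions should agree to machine precision, and the smallest eigenvalue of 𝐺 should be zero up to round-off, which is what makes the quadratic form convex.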

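Thus, the basic model for the OPF is the following:

\[
\begin{aligned}
\min \quad & \operatorname{real}\left( \sum_{k} \sum_{m} y_{km} v_k v_m^{*} \right) \\
& v_0 = 1 + j0 \\
& 1 - \delta \le \|v_k\| \le 1 + \delta, && \forall k \in \mathcal{N} \\
& p_k^{\max} \ge s_k^{\text{real}} \ge p_k^{\min}, && \forall k \in \mathcal{N} \\
& s_k^{\max} \ge \|s_k\|, && \forall k \in \mathcal{N} \\
& i_{km}^{\max} \ge \|y_{km}(v_k - v_m)\|, && \forall (km) \in \mathcal{E} \\
& \left( \frac{s_k - d_k}{v_k} \right)^{*} = \sum_{m} y_{km} v_m, && \forall k \in \mathcal{N}
\end{aligned}
\tag{10.5}
\]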

1 See Section 4.6 in Chapter 4 for more details about optimization on the complex field.
2 This can be easily demonstrated by noting that 𝐺 can be calculated as 𝐺 = 𝐴𝐺𝑝𝐴⊤,
where 𝐴 is the incidence matrix and 𝐺𝑝 is a diagonal matrix with the (non-negative)
conductance of each branch.


As we have seen, the objective function is convex. The first constraint is affine
and represents the voltage at the slack node. The right-hand side of the second
constraint is a second-order cone constraint that represents the maximum deviation
of the nodal voltage, whereas the left-hand side is a non-convex constraint that
represents the minimum deviation of the nodal voltage. The value of the deviation
𝛿 is usually between 0.05 pu and 0.10 pu, according to the grid code in each
country. The third and fourth constraints represent the maximum capacity of each
renewable source; the fifth constraint represents the thermal limit of each line,
and the final constraint is the set of power flow equations. The latter is the
primary source of complexity of this model: since it is non-convex, convex
approximations to the OPF are necessary.

10.1.1 A brief review of power flow analysis

Before presenting convex approximations to the OPF, let us review some basic
concepts from power flow analysis. First, it is important to differentiate
power flow analysis from the OPF. The former is the solution of a set of
equations, whereas the latter is an optimization problem. The power flow
problem allows us to calculate the nodal voltages from the nodal powers. Since
we know the voltage at the slack node (𝑣0 = 1 + 𝑗0), we can divide the set
of nodes as 𝒩 = {0, 𝑁}, where 𝑁 is the set of nodes where the voltage is unknown.
Therefore, the nodal admittance matrix can be represented as follows³:

\[ Y_{\text{bus}} = \begin{pmatrix} Y_{00} & Y_{0N} \\ Y_{N0} & Y_{NN} \end{pmatrix} \tag{10.6} \]

With a slight abuse of notation, we can represent (10.2) in matrix form as given
below:

\[ \left( \frac{S_N - D_N}{V_N} \right)^{*} = v_0 Y_{N0} + Y_{NN} V_N \tag{10.7} \]

where 𝑉𝑁 = (𝑣1, 𝑣2, … )⊤, 𝑆𝑁 = (𝑠1, 𝑠2, … )⊤, and 𝐷𝑁 = (𝑑1, 𝑑2, … )⊤ are column
vectors⁴. This is a set of non-linear algebraic equations that requires a
numerical method to find the value of 𝑉𝑁; it can be solved by different
methods, such as Newton's method and the Gauss–Seidel method. Here, we present a
method based on a fixed-point iteration, which is similar to the Gauss–Seidel method
and has a simple implementation in Python. Let us define the impedance matrix
𝑍𝑁𝑁 = 𝑌𝑁𝑁⁻¹; this inverse exists as long as the graph that represents the grid is
connected, which is the usual case. Then Equation (10.7) can be represented as
3 See Appendix A for more details about the construction of the admittance matrix.
4 𝑆∕𝑉 indicates a division term to term.


the following fixed point:

𝑉𝑁 = 𝑇(𝑉𝑁 ) (10.8)

where 𝑇 is a non-linear map from ℂ^(𝑛−1) to ℂ^(𝑛−1) (one entry per non-slack node)
given by Equation (10.9).

\[ T(V_N) = Z_{NN} \left( \left( \frac{S_N - D_N}{V_N} \right)^{*} - v_0 Y_{N0} \right) \tag{10.9} \]

The algorithm departs from an initial point 𝑉𝑁 = 1𝑁, where 1𝑁 is a column
vector with entries equal to one. Then, we evaluate 𝑉𝑁 ← 𝑇(𝑉𝑁), and this
iteration is repeated until reaching a fixed point, i.e., a point where 𝑇(𝑉𝑁) = 𝑉𝑁.
This is a solution to the set of algebraic equations. Although this is not the
most efficient method to calculate a load flow, it is enough for our purposes,
and, as we will see later, it is straightforward to implement. It is important to
notice that a system may lack a solution or have several fixed points, some of
them without practical meaning. However, under certain conditions, we can
guarantee convergence and uniqueness of the solution with this approach, as
was demonstrated in [68] for DC grids. Formal conditions for convergence
and uniqueness of the solution are beyond the objectives of this book. Our
approach is practice-oriented, and hence, convergence is checked by executing
the algorithm.
Figure 10.1 shows a simple power distribution grid that will be used in later
examples. These examples serve two purposes: first, to familiarize the reader with the
implementation of graphs in Python, and second, to show the implementation
of the power flow algorithms. This system will be used later in the OPF mod-
els, so it is recommended to implement and understand the following examples
before continuing with subsequent sections.

Figure 10.1 Example of a power distribution grid with distributed resources.


Example 10.1. All the information related to the power distribution grid
depicted in Figure 10.1 can be stored in a single variable using the module
networkx as presented below:

import numpy as np
import networkx as nx
G = nx.DiGraph()
# nodes: smax is the generation capacity and d the complex load (in pu)
G.add_node(0,name='slack',smax=10,d=0)
G.add_node(1,name='step',smax=0,d=0)
G.add_node(2,name='house',smax=0,d=1.2+0.3j)
G.add_node(3,name='solar',smax=1,d=0)
G.add_node(4,name='building',smax=0,d=2.5+0.9j)
G.add_node(5,name='wind',smax=1.5,d=0)
# edges: y is the series admittance and thlim the thermal limit
G.add_edge(0,1,y=1/(0.0075+0.010j),thlim=2)
G.add_edge(1,2,y=1/(0.0080+0.011j),thlim=2)
G.add_edge(2,3,y=1/(0.0090+0.018j),thlim=2)
G.add_edge(1,4,y=1/(0.0040+0.004j),thlim=2)
G.add_edge(4,5,y=1/(0.0050+0.006j),thlim=2)
nx.draw(G,with_labels=True,pos=nx.spectral_layout(G))

All the examples below depart from this definition of the graph, stored in a
variable 𝐺. More details of this module are presented in Appendix A.
Example 10.2. We need to build 𝑌bus and the block matrices given
in (10.6). The nodal admittance matrix is calculated as given in (10.10):
𝑌bus = 𝐴𝑌𝑝 𝐴⊤ (10.10)
where 𝐴 is the incidence matrix of the oriented graph and 𝑌𝑝 is a diag-
onal matrix of the branch admittance. The incidence matrix can be easily
obtained using the module networkx named as nx in Example 10.1, see the code
below:

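A = nx.incidence_matrix(G,oriented=True).toarray()  # dense node-by-edge incidence matrix
Yp = np.diag([G.edges[k]['y'] for k in G.edges])    # diagonal matrix of branch admittances
Ybus = A@Yp@A.T
print(Ybus)
print(np.linalg.eigvals(Ybus.real))                 # eigenvalues of the real part of Ybus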

In the last line, we checked if the real part of this matrix is positive semidefinite
by calculating its eigenvalues.
Block matrices given in (10.6) are calculated from the 𝑌bus as follows:
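n = G.number_of_nodes()
YN0 = Ybus[1:n,0]              # block Y_N0: coupling with the slack node
YNN = Ybus[1:n,1:n]            # block Y_NN: nodes with unknown voltage
ZNN = np.linalg.inv(YNN)       # impedance matrix used by the fixed-point map
d = np.array([G.nodes[k]['d'] for k in G.nodes])
print(YN0)
print(YNN)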


Example 10.3. The power flow equations seen as fixed point (10.8) allow a
simple algorithm for calculating the operation point of the system. Let us define
a function for the load flow calculation using this fixed point map with ten
iterations:
def LoadFlow(sN,dN):
    v0 = 1+0j
    vN = np.ones(n-1)*v0                        # flat start
    for t in range(10):
        vN = ZNN@(np.conj((sN-dN)/vN)-v0*YN0)   # fixed-point map (10.9)
    vT = np.hstack([v0,vN])
    sT = vT*np.conj(Ybus@vT)
    err = np.linalg.norm(sT[1:n]-(sN-dN))       # power mismatch at non-slack nodes
    print('Load Flow, after 10 iterations the error is',err)
    return vT

The algorithm departs from 𝑉𝑁 = 1𝑁 and evaluates the fixed-point map (10.9).
After that, the new voltages are stored in a variable 𝑉𝑇, including the slack node.
The power mismatch (error) is displayed at the end of the process. The algorithm
can be improved using a while-loop instead of a for-loop (in this example, we
preferred a compact code over an efficient algorithm).
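One possible while-loop variant is sketched below. It is not the book's code: the function name `load_flow_tol`, the tolerance, the iteration cap, and the three-node test feeder are all assumptions made for illustration. The loop stops when the change between two consecutive iterates falls below `tol`.

```python
import numpy as np

def load_flow_tol(Ybus, sN, dN, tol=1e-8, max_iter=100):
    """Fixed-point load flow (10.9) that stops when the update is below tol."""
    n = Ybus.shape[0]
    v0 = 1.0 + 0.0j
    YN0 = Ybus[1:n, 0]
    ZNN = np.linalg.inv(Ybus[1:n, 1:n])
    vN = np.ones(n - 1, dtype=complex)          # flat start
    it = 0
    step = np.inf
    while step > tol and it < max_iter:
        v_new = ZNN @ (np.conj((sN - dN) / vN) - v0 * YN0)
        step = np.linalg.norm(v_new - vN)       # size of the last update
        vN = v_new
        it += 1
    return np.hstack([v0, vN]), it, step

# Hypothetical 3-node feeder used only to exercise the function
A = np.array([[1.0, 0.0], [-1.0, 1.0], [0.0, -1.0]])
Ybus = A @ np.diag([1/(0.01+0.02j), 1/(0.02+0.03j)]) @ A.T
sN = np.zeros(2, dtype=complex)                 # no generation
dN = np.array([0.5+0.1j, 0.3+0.1j])             # illustrative loads in pu
vT, iterations, step = load_flow_tol(Ybus, sN, dN)
mismatch = np.linalg.norm((vT*np.conj(Ybus @ vT))[1:] - (sN - dN))
print(iterations, step, mismatch)
```

Returning the iteration count and the last step size makes it easy to detect a case that hit `max_iter` without converging, which the fixed ten-iteration loop cannot report.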
We can evaluate the function using results from Example 10.2, considering
loads exclusively (i.e., the solar panel and the wind turbine have generation
equal to zero):

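VT = LoadFlow(np.zeros(n-1),d[1:n])
ST = VT*np.conj(Ybus@VT)
pL = sum(ST)                        # complex sum; its real part is the active power loss
print('Loss',pL)
for (k,m) in G.edges:
    Sf = Ybus[k,m]*(VT[k]-VT[m])    # branch quantity y_km (v_k - v_m), up to sign
    print('flow',(k,m),np.abs(Sf))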

Results can be stored in a DataFrame as follows:

import pandas as pd
results = pd.DataFrame()
results['name'] = [G.nodes[k]['name'] for k in G.nodes]
results['vpu'] = np.abs(VT)
results['ang'] = np.angle(VT)*180/np.pi
results['pnode'] = np.round(ST.real,4)
results['qnode'] = np.round(ST.imag,4)
results.head(n)

The reader can verify that the nodal voltage magnitudes are
𝑣 = (1, 0.956, 0.943, 0.943, 0.943, 0.943)⊤ and the power loss is 𝑝𝐿 = 0.173.


Example 10.4. Node 1 in the system depicted in Figure 10.1 does not have
generation or load. Therefore, it can be eliminated using a Kron reduction. Let
us split the set of nodes 𝒩 = {𝑠, 𝑟} where 𝑠 is the set of nodes with nodal current
equal to zero, and 𝑟 are the remaining nodes. Then, we have the following:

0 = 𝑌𝑠𝑠 𝑉𝑠 + 𝑌𝑠𝑟 𝑉𝑟 (10.11)

𝐼𝑟 = 𝑌𝑟𝑠 𝑉𝑠 + 𝑌𝑟𝑟 𝑉𝑟 (10.12)

where 𝑌𝑠𝑠, 𝑌𝑠𝑟, 𝑌𝑟𝑠, 𝑌𝑟𝑟 are block matrices from 𝑌bus. Therefore, we can define
a reduced admittance matrix 𝑌kron as follows:

\[ Y_{\text{kron}} = Y_{rr} - Y_{rs} Y_{ss}^{-1} Y_{sr} \tag{10.13} \]

This equation can be coded in Python as presented below for a single node
𝑠 = [1]:

s = [1]                                   # node to eliminate
r = sorted(set(range(n)).difference(s))   # remaining nodes
nn = len(r)
Ykron = np.zeros((nn,nn),dtype=complex)
for k in range(nn):
    for m in range(nn):
        Ykron[k,m] = (Ybus[r[k],r[m]]
                      -Ybus[r[k],s[0]]*Ybus[s[0],r[m]]/Ybus[s[0],s[0]])

Kron reduction is used extensively in many power systems applications, and
therefore, it is useful to have this code for future examples.
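For more than one eliminated node, Equation (10.13) can also be evaluated with block indexing. The sketch below is an illustration under assumptions: the helper name `kron_reduction` and the four-node chain are hypothetical, and `np.linalg.solve` is used in place of an explicit inverse.

```python
import numpy as np

def kron_reduction(Ybus, s):
    """Eliminate the zero-injection nodes listed in s using (10.13)."""
    n = Ybus.shape[0]
    r = [k for k in range(n) if k not in s]      # remaining nodes, in order
    Yrr = Ybus[np.ix_(r, r)]
    Yrs = Ybus[np.ix_(r, s)]
    Ysr = Ybus[np.ix_(s, r)]
    Yss = Ybus[np.ix_(s, s)]
    # solve Yss X = Ysr instead of forming the inverse of Yss explicitly
    Ykron = Yrr - Yrs @ np.linalg.solve(Yss, Ysr)
    return Ykron, r

# Hypothetical 4-node chain 0-1-2-3 (illustrative impedances)
A = np.array([[ 1,  0,  0],
              [-1,  1,  0],
              [ 0, -1,  1],
              [ 0,  0, -1]], dtype=float)
Yp = np.diag([1/(0.01+0.02j), 1/(0.02+0.02j), 1/(0.01+0.03j)])
Ybus = A @ Yp @ A.T
Ykron, r = kron_reduction(Ybus, s=[1, 2])
print(r)
print(Ykron)
```

In this chain, eliminating the two intermediate nodes should leave a single equivalent branch between nodes 0 and 3 whose impedance is the sum of the three series impedances, which is a convenient sanity check.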

10.2 Complex linearization

As we have seen, the fundamental OPF problem given by (10.5) is non-
convex due to the power flow equations, and hence, a convex approximation is
required. In this section, we present a simple linearization based on Wirtinger
calculus. There are many other linearizations in the literature (most of them
equivalent), but the representation presented here has advantages in terms of
accuracy and simple implementation.
Let us represent (10.2) as the following equivalent algebraic system:

\[ s_k^{*} - d_k^{*} = \sum_{m} y_{km} w_{km} \tag{10.14} \]

𝑤𝑘𝑚 = 𝑣𝑘∗ 𝑣𝑚 (10.15)


where 𝑤𝑘𝑚 is a new complex variable⁵. Notice that Equation (10.14) is affine
and the non-convexity appears in Equation (10.15). This equation can be
linearized in the complex plane around a given point (𝑢𝑘, 𝑢𝑚) using Wirtinger's
calculus (see Appendix B for more details):
𝑤𝑘𝑚 − 𝑢𝑘∗ 𝑢𝑚 = 𝑢𝑘∗ (𝑣𝑚 − 𝑢𝑚 ) + 𝑢𝑚 (𝑣𝑘∗ − 𝑢𝑘∗ ) (10.16)
Usually 𝑢𝑘 = 𝑢𝑚 = 1 pu, resulting in the following affine constraint:
𝑤𝑘𝑚 = 𝑣𝑘∗ + 𝑣𝑚 − 1 (10.17)
This simple equation constitutes a convex linearization of the power flow
equations.
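The size of the linearization error can be checked numerically. Subtracting (10.16) from (10.15) leaves only the second-order term (𝑣𝑘∗ − 𝑢𝑘∗)(𝑣𝑚 − 𝑢𝑚), so the error shrinks as the linearization point approaches the actual voltages. The sketch below uses made-up voltages, and the function names are hypothetical.

```python
import numpy as np

def w_exact(vk, vm):
    return np.conj(vk) * vm                      # Equation (10.15)

def w_linear(vk, vm, uk=1.0+0.0j, um=1.0+0.0j):
    # Equation (10.16) solved for w_km; reduces to (10.17) when uk = um = 1
    return (np.conj(uk)*um + np.conj(uk)*(vm - um)
            + um*(np.conj(vk) - np.conj(uk)))

vk = 0.97 - 0.02j     # illustrative voltages in per unit
vm = 0.95 - 0.03j

# error when linearizing around the flat point 1 pu
err_flat = abs(w_exact(vk, vm) - w_linear(vk, vm))
# error when linearizing around a point closer to the actual voltages
u = 0.96 - 0.02j
err_near = abs(w_exact(vk, vm) - w_linear(vk, vm, uk=u, um=u))
print(err_flat, err_near)
```

The second error is smaller than the first, which is the motivation for re-linearizing around an updated operating point.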
On the other hand, the voltage limits introduce another set of non-convex
constraints, namely:
1 − 𝛿 ≤ ‖𝑣𝑘 ‖ ≤ 1 + 𝛿 (10.18)
The right-hand side is a ball of radius 1 + 𝛿, which is, of course, convex. However,
the left-hand side is a non-convex constraint, since it is the exterior of an open ball
of radius 1 − 𝛿. The set defined by (10.18) (both left- and right-hand sides) is
named an annulus and is a non-convex set. In practice, this set can be replaced
by the following box around 1 + 0𝑗:
1 − 𝛿 ≤ 𝑣𝑘^real ≤ 1 + 𝛿 (10.19)
−𝛿 ≤ 𝑣𝑘^imag ≤ 𝛿 (10.20)
or, in compact form, as:
‖𝑣𝑘 − 1‖1 ≤ 𝛿 (10.21)
where the norm is applied to the real and imaginary parts of 𝑣𝑘 − 1. Strictly
speaking, the box defined by (10.19) and (10.20) is the ball of the maximum
(infinity) norm; the ball of the usual 1-norm is the smaller diamond inscribed in
this box. Although this approximation may seem arbitrary, the following example
shows the logic behind it.
Example 10.5. Constraint (10.21) is a suitable approximation for values of 𝛿 =
0.1 and below. Figure 10.2 shows a subsection of the annulus 0.9 ≤ ‖𝑣𝑘 ‖ ≤ 1.1
for values around 1 + 0𝑗. The box constraint (10.21) is represented as a shaded
square, which is a close approximation of the set for angles between 𝜃 = ±5° and
𝜃 = ±7°. Voltage angles are usually small in power distribution networks [69],
so this is a fair approximation. The model may be complemented by constraints
on the angle, which are convex. An exact value for the maximum angle would
require stability criteria beyond the objectives of this book. A more conservative
constraint is obtained by replacing the 1-norm with a 2-norm in (10.21).
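A quick numerical experiment illustrates how closely a box around 1 + 0𝑗 follows the annulus. The sketch below samples the box with real part in [1 − 𝛿, 1 + 𝛿] and imaginary part in [−𝛿, 𝛿] for 𝛿 = 0.1 and reports the range of voltage magnitudes and the fraction of samples that also satisfy (10.18). The sample size and the random seed are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)      # fixed seed, arbitrary choice
delta = 0.1
N = 100_000

# uniform samples in the box around 1 + 0j
v = (1 + delta*(2*rng.random(N) - 1)) + 1j*delta*(2*rng.random(N) - 1)
mag = np.abs(v)

# fraction of box samples that also lie in the annulus (10.18)
inside = np.mean((mag >= 1 - delta) & (mag <= 1 + delta))
print(mag.min(), mag.max(), inside)
```

Only the samples near the two right-hand corners of the box exceed 1 + 𝛿 in magnitude, so the box is a slightly optimistic but close stand-in for the annulus in this region.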

5 This new variable increases the dimension of the set of feasible solutions; sometimes the
nature of the problem is only revealed when we change our perspective to a higher dimension.


Figure 10.2 Approximation of the voltage restriction as a box constraint.

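Combining the aforementioned approximations, the OPF is transformed into
the following convex optimization problem:

\[
\begin{aligned}
\min \quad & \operatorname{real}\left( \sum_{k} \sum_{m} y_{km} v_k v_m^{*} \right) \\
& v_0 = 1 + j0 \\
& \delta \ge \|v_k - 1\|_1, && \forall k \in \mathcal{N} \\
& p_k^{\max} \ge \operatorname{real}(s_k) \ge p_k^{\min}, && \forall k \in \mathcal{N} \\
& s_k^{\max} \ge \|s_k\|, && \forall k \in \mathcal{N} \\
& i_{km}^{\max} \ge \|y_{km}(v_k - v_m)\|, && \forall (km) \in \mathcal{E} \\
& s_k^{*} - d_k^{*} = \sum_{m} y_{km} w_{km}, && \forall k \in \mathcal{N} \\
& w_{km} = v_k^{*} + v_m - 1, && \forall (km) \in \mathcal{N} \times \mathcal{N}
\end{aligned}
\tag{10.22}
\]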

Notice that 𝑤𝑘𝑚 increases the number of variables of the model; however, the
new equations are affine and hence this is not a problem in practice. It is also
possible to substitute (10.17) into (10.14) to obtain a model with the same
number of variables as the original problem. Here, we are prioritizing a simple
representation over the efficiency of the algorithm.
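The substitution mentioned above can be written out explicitly. The derivation below is a short sketch; the last simplification assumes a grid without shunt elements, as in the construction 𝑌bus = 𝐴𝑌𝑝𝐴⊤ of (10.10), for which every row of 𝑌bus sums to zero.

```latex
\begin{aligned}
s_k^{*} - d_k^{*} &= \sum_{m} y_{km}\,(v_k^{*} + v_m - 1) \\
                  &= (v_k^{*} - 1)\sum_{m} y_{km} + \sum_{m} y_{km} v_m \\
                  &= \sum_{m} y_{km} v_m
                     \qquad \text{if } \sum_{m} y_{km} = 0 \ \text{(no shunt elements)}
\end{aligned}
```

Under that assumption, the linearized power balance around 1 pu is an affine relation between 𝑠𝑘 and the nodal voltages, and the variables 𝑤𝑘𝑚 are no longer needed.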
Notice the model is still non-linear since the objective function is quadratic.
In addition, there are second-order constraints related to the nodal voltage
and the capacity of each renewable power resource. However, these non-linear
equations generate a convex model that can be efficiently solved using CvxPy,
as shown in the following example.
Example 10.6. Let us solve the OPF problem for the system given in
Figure 10.1 using a convex linearization of the power flow equations. We
assume we have stored the graph as given in Example 10.1 and calculated
𝑌bus as shown in Example 10.2. Both the solar panel and the wind turbine are
available to generate their nominal power. The code is presented below:

import cvxpy as cvx

smax = np.array([G.nodes[k]['smax'] for k in G.nodes])
d = np.array([G.nodes[k]['d'] for k in G.nodes])
v = cvx.Variable(n,complex=True)
s = cvx.Variable(n,complex=True)
W = cvx.Variable((n,n),complex=True)
obj = cvx.Minimize(cvx.quad_form(cvx.real(v),Ybus.real)+
                   cvx.quad_form(cvx.imag(v),Ybus.real))
res = [v[0] == 1.0]
M = Ybus@W
for k in G.nodes:
    res += [cvx.conj(s[k]-d[k]) == M[k,k]]       # power balance (10.14)
    res += [cvx.abs(v[k]-1) <= 0.05]             # voltage limit
    res += [cvx.abs(s[k]) <= smax[k]]            # generation capacity
    for m in G.nodes:
        res += [W[m,k] == cvx.conj(v[k])+v[m]-1] # linearization (10.17)
for (k,m) in G.edges:
    res += [cvx.abs(Ybus[k,m]*(v[k]-v[m])) <= G.edges[(k,m)]['thlim']]
OPF = cvx.Problem(obj,res)
OPF.solve()
print('pL',obj.value,OPF.status)

Most of the lines in this code are self-explanatory; however, some
aspects require careful explanation. First, notice that 𝑊 = [𝑤𝑘𝑚 ] is a
matrix of the same size as 𝑌bus; therefore, we can define a new matrix given
by (10.23):

𝑀 = 𝑌bus 𝑊 (10.23)

This matrix allows us to represent (10.14) as follows:

𝑠𝑘∗ − 𝑑𝑘∗ = 𝑚𝑘𝑘 (10.24)

In the code, 𝑤𝑘𝑚 is stored in the entry W[m,k], so that the diagonal entry
M[k,k] is exactly the sum in (10.14).
Second, set representations such as ∀𝑘 ∈ 𝒩 help to define the for-loops in the
code. So, ∀(𝑘𝑚) ∈ 𝒩 × 𝒩 indicates a nested loop, whereas ∀(𝑘𝑚) ∈ ℰ indicates
a for-loop over the set of edges. In this case, 𝒩 is equivalent to G.nodes (we
can also use range(n)), and ℰ is equivalent to G.edges.
The reader can verify that the result of this problem is 𝑠3 = 0.96 + 0.28𝑗
and 𝑠5 = 1.42 + 0.49𝑗, with 𝑝𝐿 = 0.0406. However, this is an approximation
of the power loss, which must be recalculated via a power flow
analysis:


VT = LoadFlow(s.value[1:n],d[1:n])
ST = VT*np.conj(Ybus@VT)
pL = sum(ST)
print('Loss',pL)

After executing this code, the power loss is 𝑝𝐿 = 0.0406 (a great reduction in
comparison to Example 10.3). Notice that although the solar panel has a capacity
of 𝑠𝑘max = 1, not all of this capacity is used for active power. The algorithm
chooses to reduce its active power in order to generate some reactive power and
minimize the power loss. In case the primary resource (i.e., wind/solar) is limited,
we need to include constraints of the form real(𝑠𝑘 ) ≤ 𝑝𝑘max . This constraint is, of
course, affine and does not complicate the model. The reader is
invited to compare the nodal voltages in the system with and without distributed
generation.

10.2.1 Sequential linearization

We can improve the results of the linearization by linearizing again in the new
operating point. The algorithm is quite simple; we start with a vector 𝑈 = 1𝑁
and linearize the power flow equations around this point. Then we solve Model
(10.22) obtaining a vector 𝑆 with the power generated by each unit. Then, we
calculate a power flow, using, for instance, the fixed-point algorithm given in
Example 10.3. This algorithm returns a new set of voltages 𝑈, which are used
to linearize the model again using (10.16). The optimization model is again
solved using this new linearization, and the steps are repeated until achieving
convergence.
This method does not guarantee a global optimum, but it is efficient in practice,
as shown in the following example:
Example 10.7. In the following code, we make three iterations of sequential
linearization in order to obtain a better approximation of the optimal solution.
First, we define a function named LinearOPF, which solves the optimization
model for a linearization around a point 𝑈:

def LinearOPF(u):
    v = cvx.Variable(n,complex=True)
    s = cvx.Variable(n,complex=True)
    W = cvx.Variable((n,n),complex=True)
    obj = cvx.Minimize(cvx.quad_form(cvx.real(v),Ybus.real)+
                       cvx.quad_form(cvx.imag(v),Ybus.real))
    M = Ybus@W
    res = [v[0] == 1.0]
    for k in range(n):
        res += [cvx.conj(s[k]-d[k]) == M[k,k]]
        res += [cvx.abs(s[k]) <= smax[k]]
        for m in range(n):
            # linearization (10.16) around the point u
            res += [W[m,k] == cvx.conj(v[k])*u[m]+
                    np.conj(u[k])*v[m]
                    -np.conj(u[k])*u[m]]
    OPF = cvx.Problem(obj,res)
    OPF.solve()
    print('pL',obj.value,OPF.status)
    return s.value

The main difference of this model with respect to the model in Example 10.6 is
the point in which 𝑤𝑘𝑚 is linearized; in this case, we linearize around 𝑈. Next,
we evaluate this function as well as the power flow already defined in Example
10.3, namely:

VT = np.ones(n)*(1.0+0.0j)
for t in range(3):
    ST = LinearOPF(VT)
    VT = LoadFlow(ST[1:n],d[1:n])
    print('Loss',sum(VT*np.conj(Ybus@VT)))

Power loss is 𝑝𝐿 = 0.04258 for both the load flow and the linear OPF.

10.2.2 Exponential models of the load

Loads in power distribution grids are usually represented by exponential
models, as presented below:

\[ d_k = d_k^{\text{real}} \|v_k\|^{\alpha} + j\, d_k^{\text{imag}} \|v_k\|^{\beta} \tag{10.25} \]

where 𝛼, 𝛽 are real numbers that represent the variation of the active and reac-
tive power, with respect to the voltage. Typical values of 𝛼, 𝛽 are 𝛼 = 𝛽 = 0 for
industrial loads, 𝛼 = 𝛽 = 1 for commercial loads and 𝛼 = 𝛽 = 2 for residential
loads. Nevertheless, fractional values are allowed.
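A direct evaluation of Equation (10.25) may help to see the effect of the exponents. The sketch below is illustrative only: the function name `exponential_load`, the nominal load, and the voltage are assumptions. The labels in the loop are the usual names for the cases 𝛼 = 𝛽 = 0, 1, 2 (constant power, constant current, and constant impedance).

```python
import numpy as np

def exponential_load(d_nominal, v, alpha, beta):
    """Evaluate Equation (10.25) for a nominal complex load and a nodal voltage."""
    vmag = np.abs(v)
    return d_nominal.real * vmag**alpha + 1j * d_nominal.imag * vmag**beta

d_nom = 1.2 + 0.3j          # nominal load at 1 pu (illustrative value)
v = 0.95 * np.exp(-0.02j)   # illustrative nodal voltage

for name, a in [('constant power', 0),
                ('constant current', 1),
                ('constant impedance', 2)]:
    print(name, exponential_load(d_nom, v, a, a))
```

With a voltage below 1 pu, the demand decreases as the exponent grows, which is why the load model affects the optimal operation point.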
Equation (10.25) leads to a non-convex constraint; however, it can be
easily linearized using Wirtinger's calculus. We present only the linearization
of ‖𝑣𝑘‖^𝛼, since the linearization of ‖𝑣𝑘‖^𝛽 follows the same procedure. First,
consider the following complex function:

\[ \|v\|^{\alpha} = (v v^{*})^{\alpha/2} \tag{10.26} \]


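then, we linearize this equation by differentiating with respect to 𝑣 and 𝑣∗ and
evaluating at a reference value 𝑣0, with ∆𝑣 = 𝑣 − 𝑣0:

\[
\begin{aligned}
\|v\|^{\alpha} &\approx (v_0 v_0^{*})^{\frac{\alpha}{2}} + \frac{\alpha}{2}(v_0 v_0^{*})^{\frac{\alpha}{2}-1} v_0^{*}\,\Delta v + \frac{\alpha}{2}(v_0 v_0^{*})^{\frac{\alpha}{2}-1} v_0\,\Delta v^{*} && \text{(10.27)} \\
&= (v_0 v_0^{*})^{\frac{\alpha}{2}} + \frac{\alpha}{2}(v_0 v_0^{*})^{\frac{\alpha}{2}-1} v_0^{*}\,(v - v_0) + \frac{\alpha}{2}(v_0 v_0^{*})^{\frac{\alpha}{2}-1} v_0\,(v^{*} - v_0^{*}) && \text{(10.28)} \\
&= a + b v + b^{*} v^{*} && \text{(10.29)}
\end{aligned}
\]

with

\[ a = (1 - \alpha)\,\|v_0\|^{\alpha} \tag{10.30} \]

\[ b = \frac{\alpha}{2}(v_0 v_0^{*})^{\frac{\alpha}{2}-1} v_0^{*} \tag{10.31} \]

For the case of 𝑣0 = 1 + 0𝑗, the linearization simplifies to (10.32):

\[ \|v\|^{\alpha} \approx 1 - \alpha + \frac{\alpha}{2}(v + v^{*}) \tag{10.32} \]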
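The coefficients 𝑎 and 𝑏 of (10.30) and (10.31) can be computed for any reference voltage, not only for 1 + 0𝑗. The sketch below is a minimal illustration; the function names and the numerical values are assumptions. It uses the identity 𝑏𝑣 + 𝑏∗𝑣∗ = 2 real(𝑏𝑣) to evaluate (10.29).

```python
import numpy as np

def norm_power_linearization(v0, alpha):
    """Return (a, b) of (10.30)-(10.31) for the reference voltage v0."""
    m2 = (v0 * np.conj(v0)).real                 # |v0|^2
    a = (1 - alpha) * m2**(alpha/2)
    b = alpha/2 * m2**(alpha/2 - 1) * np.conj(v0)
    return a, b

def evaluate(a, b, v):
    # a + b*v + conj(b)*conj(v), written with 2*Re(b*v)
    return a + 2*np.real(b*v)

alpha = 1.5
v0 = 0.97 - 0.03j          # illustrative reference voltage
a, b = norm_power_linearization(v0, alpha)
v = 0.95 - 0.05j           # illustrative evaluation point
print(abs(v)**alpha, evaluate(a, b, v))
```

At 𝑣 = 𝑣0 the linearization reproduces ‖𝑣0‖^𝛼 exactly, and for 𝑣0 = 1 + 0𝑗 the coefficients reduce to 𝑎 = 1 − 𝛼 and 𝑏 = 𝛼/2, in agreement with (10.32).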
Example 10.8. We are going to evaluate the accuracy of Equation (10.32) for
voltages close to 1 pu. Let us define the following functions:

\[ f(v) = \|v\|^{\alpha} \tag{10.33} \]

\[ g(v) = 1 - \alpha + \frac{\alpha}{2}(v + v^{*}) \tag{10.34} \]

\[ \epsilon(v) = \|f(v) - g(v)\| \tag{10.35} \]

where 𝑓 ∶ ℂ → ℝ is the exact exponential function, 𝑔 ∶ ℂ → ℝ is its
linearization, and 𝜖 ∶ ℂ → ℝ is the total error. We evaluate this error at 𝑛 = 10⁴
random points generated in the set Ω presented below:

\[ \Omega = \left\{ v \in \mathbb{C} : 0.9 \le v^{\text{real}} \le 1.1,\ -0.1 \le v^{\text{imag}} \le 0.1 \right\} \tag{10.36} \]

An empirical distribution function is obtained, which gives the probability that
the error stays below a given value. The corresponding script in Python is presented
below:

import numpy as np
import matplotlib.pyplot as plt
n = 10000
vreal = [0.9+0.2*np.random.rand() for k in range(n)]
vimag = [0.1-0.2*np.random.rand() for k in range(n)]
v = np.array(vreal) + 1j*np.array(vimag)
alpha = 2
f = np.abs(v)**alpha
g = 1-alpha + alpha/2*(v+v.conj())
error = np.abs(f - g)


error.sort()
probability = np.linspace(0,1,n)
plt.plot(error*100,probability)
plt.grid()
plt.show()

Figure 10.3 shows the results for 𝛼 = 2. This plot represents the cumulative
distribution function of 𝜖. In this case, 80% of the randomly generated points
produced an error less than 1%.
This demonstrates the high accuracy of the method. The student is invited to
generate this plot for different 𝑛 and different values of 𝛼 > 0.

10.3 Second-order cone approximation

A second-order cone approximation is a convenient way to include the power
flow equations in an optimization model, especially for power distribution
applications. In this case, we convexify the equations while maintaining the
non-linear nature of the problem. We depart from (10.15), which is the primary
non-convex constraint in the problem. Let us multiply it by 𝑤𝑘𝑚∗, obtaining the
following equation:

\[ w_{km} w_{km}^{*} = (v_k^{*} v_m)(v_k v_m^{*}) \tag{10.37} \]

which can be written as

\[ \|w_{km}\|^2 = \|v_k\|^2 \|v_m\|^2 \tag{10.38} \]

Figure 10.3 Cumulative distribution function of the linearization error for 𝛼 = 2
(horizontal axis: error in %; vertical axis: probability).

Let us define a new vector 𝐻 ∈ ℝ^𝑛 with entries ℎ𝑘 = ‖𝑣𝑘‖²; then (10.38) is
transformed into (10.39):

\[ \|w_{km}\|^2 = h_k h_m \tag{10.39} \]

At this point, the constraint is still non-convex; therefore, we propose an
approximation that consists of relaxing the equality into an inequality and
writing the resulting hyperbolic set as a second-order cone, as previously
presented in Example 5.3, Chapter 5, namely:

\[ \left\| \begin{pmatrix} 2 w_{km} \\ h_k - h_m \end{pmatrix} \right\| \le h_k + h_m \tag{10.40} \]
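The equivalence between the cone (10.40) and the relaxed hyperbolic constraint ‖𝑤𝑘𝑚‖² ≤ ℎ𝑘ℎ𝑚 follows from squaring both sides: 4‖𝑤𝑘𝑚‖² + (ℎ𝑘 − ℎ𝑚)² ≤ (ℎ𝑘 + ℎ𝑚)² reduces to ‖𝑤𝑘𝑚‖² ≤ ℎ𝑘ℎ𝑚 when ℎ𝑘, ℎ𝑚 ≥ 0. The sketch below checks this on random samples; the function names, sampling ranges, and seed are arbitrary assumptions.

```python
import numpy as np

def soc_form(w, hk, hm):
    # constraint (10.40): norm of the vector (2w, hk - hm) against hk + hm
    lhs = np.sqrt(4*np.abs(w)**2 + (hk - hm)**2)
    return lhs <= hk + hm

def hyperbolic_form(w, hk, hm):
    # relaxed version of (10.39): |w|^2 <= hk*hm
    return np.abs(w)**2 <= hk*hm

rng = np.random.default_rng(1)      # fixed seed, arbitrary choice
N = 10_000
hk = rng.uniform(0.8, 1.2, N)
hm = rng.uniform(0.8, 1.2, N)
w = rng.uniform(0.5, 1.5, N) * np.exp(1j*rng.uniform(-0.2, 0.2, N))

# fraction of samples on which both tests give the same answer
agree = np.mean(soc_form(w, hk, hm) == hyperbolic_form(w, hk, hm))
print(agree)
```

The two tests should agree on essentially every sample; disagreement can only come from round-off for points lying on the boundary.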
The limit of each distribution line can be represented as a function of the new
variables ℎ𝑘 by multiplying the branch current by 𝑣𝑘∗ as follows:

𝑖𝑘𝑚 𝑣𝑘∗ = 𝑦𝑘𝑚 𝑣𝑘∗ (𝑣𝑘 − 𝑣𝑚 ), ∀(𝑘𝑚) ∈ ℰ (10.41)

which in turn is transformed into the following affine equation:

𝑠𝑘𝑚 = 𝑦𝑘𝑚 (ℎ𝑘 − 𝑤𝑘𝑚 ) (10.42)

For the sake of simplicity, we assume 𝑖𝑘𝑚^max = 𝑠𝑘𝑚^max to complete the model; thus,
the SOC approximation for the OPF is presented below:
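\[
\begin{aligned}
\min \quad & \operatorname{real}\left( \sum_{k} (s_k - d_k) \right) \\
& h_0 = 1 \\
& (1 + \delta)^2 \ge h_k \ge (1 - \delta)^2, && \forall k \in \mathcal{N} \\
& p_k^{\max} \ge \operatorname{real}(s_k) \ge p_k^{\min}, && \forall k \in \mathcal{N} \\
& s_k^{\max} \ge \|s_k\|, && \forall k \in \mathcal{N} \\
& i_{km}^{\max} \ge \|y_{km}(h_k - w_{km})\|, && \forall (km) \in \mathcal{E} \\
& s_k^{*} - d_k^{*} = \sum_{m} y_{km} w_{km}, && \forall k \in \mathcal{N} \\
& \left\| \begin{pmatrix} 2 w_{km} \\ h_k - h_m \end{pmatrix} \right\| \le h_k + h_m, && \forall (km) \in \mathcal{N} \times \mathcal{N}
\end{aligned}
\tag{10.43}
\]

In this model, we calculated the power loss as the sum of the net nodal powers
(generation minus load). This expression is equivalent to (10.3).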
Example 10.9. Let us implement the SOC model given by (10.43) for the
distribution system shown in Figure 10.1. The code in Python is given below:


smax = np.array([G.nodes[k]['smax'] for k in G.nodes])
d = np.array([G.nodes[k]['d'] for k in G.nodes])
h = cvx.Variable(n)
s = cvx.Variable(n,complex=True)
W = cvx.Variable((n,n),complex=True)
M = Ybus@W
res = [h[0] == 1.0]
for k in range(n):
    res += [cvx.conj(s[k]-d[k]) == M[k,k]]
    res += [cvx.abs(s[k]) <= smax[k]]
    res += [h[k] >= 0.9025]      # (1-0.05)**2
    res += [h[k] <= 1.1025]      # (1+0.05)**2
    res += [W[k,k] == h[k]]
    for m in range(n):
        res += [cvx.SOC(h[k]+h[m], cvx.vstack([2*W[k,m],h[k]-h[m]]))]
        res += [W[m,k] == cvx.conj(W[k,m])]
for (k,m) in G.edges:
    ylin = np.abs(G.edges[(k,m)]['y'])
    slin = G.edges[(k,m)]['thlim']
    res += [cvx.abs(h[k]-W[k,m]) <= slin/ylin]
    res += [cvx.abs(h[m]-W[m,k]) <= slin/ylin]
obj = cvx.Minimize(cvx.sum(cvx.real(s-d)))
OPFSOC = cvx.Problem(obj,res)
OPFSOC.solve()
print('pL',obj.value,OPFSOC.status)

After executing this code, the power loss is 𝑝𝐿 = 0.04259 with 𝑠3 = 0.9534 + 0.3018𝑗
and 𝑠5 = 1.4115 + 0.5076𝑗. The student may observe that this solution is close to the
solution obtained by the power flow analysis.

Example 10.10. The magnitude of the nodal voltages can be recovered from
the results of Model (10.43) without executing a new power flow analysis as
follows:

\[ \|v_k\| = \sqrt{h_k} \tag{10.44} \]

The angle can be calculated by evaluating the angle of each edge of the graph as
given below:

𝜃𝑘 − 𝜃𝑚 = angle(𝑤𝑘𝑚 ), ∀(𝑘𝑚) ∈ ℰ (10.45)

The reader is invited to evaluate these equations and compare the results
with those of a power flow analysis that uses the powers resulting from the SOC
approximation.

Figure 10.4 Comparison between the constraint 𝑤𝑘𝑚 = 𝑣𝑘 + 𝑣𝑚 − 1 (solid line) and
the constraint 𝑤𝑘𝑚² ≤ 𝑣𝑘² 𝑣𝑚² (dashed line) for (𝑣𝑘 , 𝑣𝑚 , 𝑤𝑘𝑚 ) ∈ ℝ³.

Example 10.11. Both the SOC approximation and the linearization are based on
Equation (10.15), which goes from ℂ² to ℂ. Although it is difficult to
visualize a function in ℂ², it is possible to make a plot when it goes
from ℝ² to ℝ (that is the case of the OPF in DC grids). Figure 10.4
shows a comparison between the two approximations. The dashed line
defines a linear approximation around 1 pu, whereas the solid line defines
a hyperbolic set that can be transformed into an SOC. Both approximations
are quite similar, although the linear approximation becomes less precise
as the voltages move away from 1 pu. In practice, voltages are within 1 ± 0.1 pu,
so the linear approximation is accurate enough. It should be noted that the
OPF constitutes the tertiary control in active distribution networks, and
it needs to be evaluated in real time. Therefore, we need to define
a suitable trade-off between accuracy and speed. Linearization is perhaps
the best approach in this case (see [70] for more details about the linear
approximation).
Another way to visualize the difference between linear and SOC approxima-
tions is by generating random samples of 𝑣𝑘 and 𝑣𝑚 on the complex set Ω given
by (10.36), and evaluate the error as in Example 10.8. The reader is invited to
do this numerical experiment.


10.4 Semidefinite approximation

Semidefinite programming allows us to generate a highly accurate approximation
for the OPF problem. Unlike the approaches previously presented, in this case
it is convenient to separate the nodal voltages into real and imaginary parts as
𝑣𝑘 = 𝑒𝑘 + 𝑗𝑓𝑘 and define the following block matrix:

\[ \begin{pmatrix} E & Z \\ Z^{\top} & F \end{pmatrix} \tag{10.46} \]

where the entries of each block matrix are defined as follows:

𝐸𝑘𝑚 = 𝑒𝑘 𝑒𝑚 (10.47)

𝑍𝑘𝑚 = 𝑒𝑘 𝑓𝑚 (10.48)

𝐹𝑘𝑚 = 𝑓𝑘 𝑓𝑚 (10.49)

Notice that (10.46) is positive semidefinite and has rank 1. Therefore, it is possible
to generate an SDP approximation of (10.5) by representing the whole model as a
function of this matrix and relaxing the rank constraint.
As we have seen, a key step is to find a suitable representation for 𝑤𝑘𝑚
in (10.15), and in this case, it is easy to see the following:

𝑤𝑘𝑚 = 𝐸𝑘𝑚 + 𝐹𝑘𝑚 + 𝑗(𝑍𝑘𝑚 − 𝑍𝑚𝑘 ) (10.50)
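These properties can be verified numerically for any given voltage vector. The sketch below assumes that 𝑒𝑘 is the real part and 𝑓𝑘 the imaginary part of 𝑣𝑘 (the convention consistent with the constraints 𝐸00 = 1 and 𝐹0𝑘 = 0 of the model) and that 𝑤𝑘𝑚 = 𝑣𝑘∗𝑣𝑚 as in (10.15); the voltages are made-up values.

```python
import numpy as np

v = np.array([1.0+0.0j, 0.98-0.02j, 0.97-0.03j])   # illustrative voltages
e, f = v.real, v.imag                              # assumes v_k = e_k + j f_k

E = np.outer(e, e)                                 # E_km = e_k e_m
Z = np.outer(e, f)                                 # Z_km = e_k f_m
F = np.outer(f, f)                                 # F_km = f_k f_m
X = np.block([[E, Z], [Z.T, F]])                   # block matrix (10.46)

eigs = np.linalg.eigvalsh(X)
rank = np.linalg.matrix_rank(X, tol=1e-9)
print(eigs)                                        # one positive eigenvalue, the rest ~0
print(rank)

W_from_blocks = E + F + 1j*(Z - Z.T)               # Equation (10.50)
W_direct = np.outer(np.conj(v), v)                 # w_km = conj(v_k) v_m
print(np.max(np.abs(W_from_blocks - W_direct)))
```

The matrix is an outer product of the stacked vector (𝑒, 𝑓) with itself, which is why it has a single non-zero eigenvalue; the SDP approximation keeps the semidefinite condition and drops the rank condition.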

Moreover, we can obtain a suitable representation of the objective function
from (10.4):

𝑝𝐿 = trace(𝐺bus 𝐸 + 𝐺bus 𝐹) (10.51)

where 𝐺bus = real(𝑌bus ). With this simple change of variables, we obtain the
following semidefinite problem:

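\[
\begin{aligned}
\min \quad & \operatorname{trace}(G_{\text{bus}} E + G_{\text{bus}} F) \\
& E_{00} = 1 \\
& F_{0k} = 0 \\
& s_k^{*} - d_k^{*} = \sum_{m} y_{km} w_{km} \\
& w_{km} = E_{km} + F_{km} + j(Z_{km} - Z_{mk}) \\
& \begin{pmatrix} E & Z \\ Z^{\top} & F \end{pmatrix} \succeq 0 \\
& \begin{pmatrix} s_k^{\max} & 0 & \operatorname{real}(s_k) \\ 0 & s_k^{\max} & \operatorname{imag}(s_k) \\ \operatorname{real}(s_k) & \operatorname{imag}(s_k) & s_k^{\max} \end{pmatrix} \succeq 0
\end{aligned}
\tag{10.52}
\]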

For simplicity, we have omitted some constraints related to the thermal limit
and voltage regulation. In addition, the last constraint, which is equivalent to
‖𝑠𝑘 ‖ ≤ 𝑠𝑘max , is written as a semidefinite constraint in order to obtain a
pure SDP problem.
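The equivalence between the 3×3 semidefinite constraint and ‖𝑠𝑘‖ ≤ 𝑠𝑘max can be checked through the eigenvalues of the matrix, which are 𝑠𝑘max and 𝑠𝑘max ± ‖𝑠𝑘‖. The sketch below is only a numerical illustration; the function name, the capacity, and the test points are assumptions.

```python
import numpy as np

def lmi_holds(s, smax, tol=1e-9):
    """Check the 3x3 semidefinite constraint of (10.52) through its eigenvalues."""
    M = np.array([[smax,   0.0,    s.real],
                  [0.0,    smax,   s.imag],
                  [s.real, s.imag, smax  ]])
    return np.linalg.eigvalsh(M).min() >= -tol

smax = 1.5                                   # illustrative capacity
for s in [1.0+0.5j, 0.3-1.2j, 1.4+0.8j]:
    # the norm test and the eigenvalue test should give the same answer
    print(s, abs(s) <= smax, lmi_holds(s, smax))
```

The smallest eigenvalue is 𝑠𝑘max − ‖𝑠𝑘‖, so the matrix is positive semidefinite exactly when the apparent power stays within the capacity.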

Example 10.12. The code presented below represents an SDP approximation
for the optimal power flow problem:

smax = np.array([G.nodes[k]['smax'] for k in G.nodes])
E = cvx.Variable((n,n),symmetric=True)
F = cvx.Variable((n,n),symmetric=True)
Z = cvx.Variable((n,n))
s = cvx.Variable(n,complex=True)
W = cvx.Variable((n,n),complex=True)
obj = cvx.Minimize(cvx.trace(Ybus.real@E+Ybus.real@F))
M = Ybus@W
res = [E[0,0] == 1]
res += [cvx.bmat([[E,Z],[Z.T,F]]) >> 0]
res += [cvx.trace(Ybus.real@E+Ybus.real@F) == cvx.sum(cvx.real(s-d))]
for k in range(n):
    res += [cvx.conj(s[k]-d[k]) == M[k,k]]
    res += [F[k,0] == 0]
    res += [F[0,k] == 0]
    res += [cvx.bmat([[smax[k],0,cvx.real(s[k])],
                      [0,smax[k],cvx.imag(s[k])],
                      [cvx.real(s[k]),cvx.imag(s[k]),smax[k]]]) >> 0]
    for m in range(n):
        res += [W[k,m] == E[k,m] + F[k,m] + 1j*(Z[k,m]-Z[m,k])]
OPFSDP = cvx.Problem(obj,res)
OPFSDP.solve()
print('pL',obj.value,OPFSDP.status)

The results of this model evaluated in the distribution system depicted in
Figure 10.1 are 𝑝𝐿 = 0.042587, 𝑠solar = 0.9534 + 0.3017𝑗, and 𝑠wind = 1.4115 +
0.5077𝑗.


10.5 Further readings

The OPF has been studied for a long time, with classic approaches based
on non-linear programming as can be found in [71], [72], and [73].
The interior-point method seems to work well in practice, even for non-
convex formulations, as demonstrated in [74]. However, these applications
do not analyze characteristics such as convergence and global optima.
Therefore, linearizations and cone approximations are usually
proposed.
Although there is a proliferation of linearizations for the power flow
equations (most of them entirely equivalent), the approximations presented
in this chapter are based on [75] and [70] that allow complex representa-
tions easily implementable in Python even for three-phase unbalanced power
distribution systems.
There is also a vast literature on second-order cone approximations, notably
the work of Low (see, for example, [76] and [77]). A complete review of
the problem can be found in [78]. This review includes linearizations and cone
approximations.
The OPF can be extended to DC grids in both high voltage and
microgrids. The interested reader can refer to [79] for the case of
microgrids.
Semidefinite programming has also been an active area of research for
OPF problems [80]. A complete analysis of the geometry of the problem can
be found in [81]. Most of the approximations presented in this chapter are
suitable for radial distribution grids; an analysis for meshed grids can be
found in [82].

10.6 Exercises
1. Make a comparative analysis among linearization, sequential linearization,
SOC and SDP approximations. Identify the advantages and disadvantages
of each approach.
2. Analyze convergence properties of the power flow algorithm presented in
Section 10.1.1, by plotting a curve of error vs iterations for different values
of load.
3. Compare numerical results for linearization, sequential linearization, SOC
and SDP approximations in the system depicted in Figure 10.1 for different
values of loads and power factor.


4. Calculate the OPF for the system depicted in Figure 10.1 if a new line is
included between Node 3 and Node 5, with 𝑧15 = 0.0060 + 0.01.
5. Solve the OPF using the linearization and SOC approximation, for the
power distribution system given in Table 10.1. Assume there is dis-
tributed generation at nodes 11, 20, 21, and 26 with nominal capacity
𝑠max = 0.08pu.
6. T1, T2, and T3 in Table 10.1 represent the load curve for the 34-bus test
system. T1 represents the load factor for operation between 0h:8h, T2
for 8h:16h, and T3 for 16h:24h. Solve the OPF problem for each of these
operating points.
7. Conventional distributed generation is based on synchronous machines
instead of power electronic converters. This type of machine presents
a capability curve as shown in Figure 10.5. Include this capabil-
ity curve into the OPF model (notice this curve generates a convex
constraint).
8. Include voltage constraints in the SDP approximation given in (10.52).
Recover the nodal voltages from the semidefinite approximation presented
in Example 10.12.
9. The optimal power flow problem can be extended to the operation of
DC distribution systems. Consider the DC distribution system given in

Figure 10.5 Example of a capability curve for a conventional synchronous machine.


Table 10.1 34-bus test system taken from [83].

From To 𝒓𝒌𝒎 [pu] 𝒙𝒌𝒎 [pu] 𝒑[pu] 𝒒[pu] T1 T2 T3

1 2 0.00967 0.00397 0.0230 0.01425 0.55 0.70 0.65
2 3 0.00886 0.00364 0.0000 0.00000 0.00 0.00 0.00
3 4 0.01359 0.00377 0.0230 0.01425 0.55 0.70 0.65
4 5 0.01236 0.00343 0.0230 0.01425 0.55 0.70 0.65
5 6 0.01236 0.00343 0.0000 0.00000 0.00 0.00 0.00
6 7 0.02598 0.00446 0.0000 0.00000 0.00 0.00 0.00
7 8 0.01732 0.00298 0.0230 0.01425 0.55 0.70 0.65
8 9 0.02598 0.00446 0.0230 0.01425 0.55 0.70 0.65
9 10 0.01732 0.00298 0.0000 0.00000 0.00 0.00 0.00
10 11 0.01083 0.00186 0.0230 0.01425 0.55 0.70 0.65
11 12 0.00866 0.00149 0.0137 0.00840 0.50 0.60 0.55
3 13 0.01299 0.00223 0.0072 0.00450 0.45 0.65 0.60
13 14 0.01732 0.00298 0.0072 0.00450 0.45 0.65 0.60
14 15 0.00866 0.00149 0.0072 0.00450 0.45 0.65 0.60
15 16 0.00433 0.00074 0.0014 0.00075 0.60 0.70 0.65
6 17 0.01483 0.00412 0.0230 0.01425 0.55 0.70 0.65
17 18 0.01359 0.00377 0.0230 0.01425 0.55 0.70 0.65
18 19 0.01718 0.00391 0.0230 0.01425 0.55 0.70 0.65
19 20 0.01562 0.00355 0.0230 0.01425 0.55 0.70 0.65
20 21 0.01562 0.00355 0.0230 0.01425 0.55 0.70 0.65
21 22 0.02165 0.00372 0.0230 0.01425 0.55 0.70 0.65
22 23 0.02165 0.00372 0.0230 0.01425 0.55 0.70 0.65
23 24 0.02598 0.00446 0.0230 0.01425 0.55 0.70 0.65
24 25 0.01732 0.00298 0.0230 0.01425 0.55 0.70 0.65
25 26 0.01083 0.00186 0.0230 0.01425 0.55 0.70 0.65
26 27 0.00866 0.00149 0.0137 0.00850 0.50 0.60 0.55
7 28 0.01299 0.00223 0.0075 0.00480 0.55 0.75 0.70
28 29 0.01299 0.00223 0.0075 0.00480 0.55 0.75 0.70
29 30 0.01299 0.00223 0.0075 0.00480 0.55 0.75 0.70
10 31 0.01299 0.00223 0.0057 0.00345 0.57 0.63 0.58
31 32 0.01732 0.00298 0.0057 0.00345 0.57 0.63 0.58
32 33 0.01299 0.00223 0.0057 0.00345 0.57 0.63 0.58
33 34 0.00866 0.00149 0.0057 0.00345 0.57 0.63 0.58


Table 10.2; solve the corresponding OPF problem using linearization and
SOC approximation.

Table 10.2 Parameters of a 21-node DC power distribution grid

From To 𝒓[pu] 𝒅[pu] 𝒑max [pu]

1 2 0.0053 0.70 0.0
1 3 0.0054 0.00 0.0
3 4 0.0054 0.36 0.0
4 5 0.0063 0.04 0.0
4 6 0.0051 0.36 0.0
3 7 0.0037 0.00 0.0
7 8 0.0079 0.32 0.0
7 9 0.0072 0.80 1.5
3 10 0.0053 0.00 0.0
10 11 0.0038 0.45 0.0
11 12 0.0079 0.68 1.5
11 13 0.0078 0.10 0.0
10 14 0.0083 0.00 0.0
14 15 0.0065 0.22 0.0
15 16 0.0064 0.23 0.0
16 17 0.0074 0.43 0.0
16 18 0.0081 0.34 1.5
14 19 0.0078 0.09 0.0
19 20 0.0084 0.21 0.0
19 21 0.0082 0.21 3.0

10. Equation (10.26) may be represented as follows:

\[ \|v_k\|^{\alpha} = \left(\sqrt{x^2 + y^2}\right)^{\alpha} \tag{10.53} \]

with $x = v_k^{\text{real}}$ and $y = v_k^{\text{imag}}$. This is an equation from ℝ² to ℝ. Use a Taylor
expansion to linearize this equation around $x = v_k^{\text{real}} = 1$ and $y = v_k^{\text{imag}} = 0$.
Compare the result with (10.32).


Active distribution networks

Learning outcomes

By the end of this chapter, the student will be able to:

● Formulate mixed-integer convex models for the optimal placement
of capacitors and distributed generation.
● Formulate a mixed-integer convex model for the optimal placement
of distributed generation.
● Formulate a convex model for hosting capacity.

11.1 Modern distribution networks

Modern power distribution networks include several active elements, such as
distributed generation and electric vehicles, that must be included in the oper-
ation via optimization models. Convex approximations for the optimal power
flow equations previously presented in Chapter 10 are key to formulate these
optimization models; therefore, it is convenient to review these formulations,
especially the linear formulation before continuing with the sections below. We
present three main problems, namely: optimal placement of capacitors, optimal
placement and size of distributed generation, and hosting capacity. Although
these problems are closely related to planning rather than operation, they share
most of the properties of the OPF, and hence it is possible to obtain convex and
mixed-integer convex approximations.


Figure 11.1 Three-feeder test system for power system reconﬁguration [84].

11.2 Primary feeder reconﬁguration

Power distribution networks have tie/sectionalizing switches 𝜇𝑘 that allow
transferring load from one feeder to another, as depicted in Figure 11.1. This
action may be performed automatically from the control center. However, we
require an optimization algorithm that guides the process in order to mini-
mize loss and improve efficiency. This algorithm is known as primary feeder
reconfiguration [84].
In simple terms, the algorithm determines each switching state (on/off) that
minimizes power loss. However, the problem is complex for three main rea-
sons: first, the model must include power flow equations that are non-convex,
as discussed in Chapter 10; second, the switches along the feeder impose binary
constraints into the model; and third, it is required to impose constraints that
guarantee that each primary feeder is radial and connected. We address each
of these problems below.
We must represent the grid as an oriented graph 𝒢 = {𝒩, ℰ}, where 𝒩 is the
set of nodes and ℰ ⊆ 𝒩 × 𝒩 is the set of edges (connected or disconnected).
Each node has an associated voltage 𝑣𝑘 and a value of active and reactive power,
𝑝𝑘 and 𝑞𝑘 , respectively. Besides, each edge has an admittance 𝑦𝑘𝑚 and a binary
variable 𝜇𝑘𝑚 that represents the corresponding switch state. All substations are
represented by the same node, marked as 0 and with voltage 𝑣0 = 1∠0. Circuit
relations are represented by the incidence matrix 𝐴, as presented below:

\[ I_{\mathcal{N}} = A I_{\mathcal{E}} \tag{11.1} \]
\[ V_{\mathcal{E}} = A^\top V_{\mathcal{N}} \tag{11.2} \]


where 𝐼𝒩 and 𝑉𝒩 are the vectors of nodal current and voltage, and 𝐼ℰ, 𝑉ℰ are
the vectors of branch current and voltage, respectively. Ohm's law in each
edge and the power balance in each node are also included in the model as
follows:

\[ i_{km} = \mu_{km} y_{km} v_{km}, \quad \forall km \in \mathcal{E} \tag{11.3} \]
\[ i_k = (s_k / v_k)^*, \quad \forall k \in \mathcal{N} \tag{11.4} \]
\[ \mu_{km} \in \{0, 1\} \tag{11.5} \]

These equations constitute the main binary and non-linear/non-convex
constraints of the problem. Below, we propose a linear approximation for these
constraints.
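The circuit relations (11.1) to (11.4) can be traced on a tiny network before any approximation is introduced. The sketch below is illustrative only: the 3-node feeder, its impedances, the nodal voltages, and the sign convention of the incidence matrix (+1 where an edge leaves a node, −1 where it enters) are assumptions, not data from the book.

```python
import numpy as np

# Hypothetical 3-node feeder with edges 0->1 and 1->2
edges = [(0, 1), (1, 2)]
n_nodes, n_edges = 3, len(edges)

# Node-to-edge incidence matrix A (assumed sign convention)
A = np.zeros((n_nodes, n_edges))
for col, (k, m) in enumerate(edges):
    A[k, col] = 1.0    # edge leaves node k
    A[m, col] = -1.0   # edge enters node m

y = 1.0 / np.array([0.01 + 0.005j, 0.01 + 0.005j])   # edge admittances
mu = np.array([1, 1])                                 # both switches closed
V_node = np.array([1.0 + 0.0j, 0.98 - 0.01j, 0.97 - 0.02j])

V_edge = A.T @ V_node               # (11.2): voltage drop on each edge
I_edge = mu * y * V_edge            # (11.3): Ohm's law with switch state
I_node = A @ I_edge                 # (11.1): nodal current injections
S_node = V_node * np.conj(I_node)   # (11.4) rearranged: s_k = v_k * conj(i_k)

# Same nodal currents from the switched admittance matrix A diag(mu*y) A^T
Y_switched = A @ np.diag(mu * y) @ A.T
print(np.allclose(I_node, Y_switched @ V_node), abs(I_node.sum()) < 1e-12)
```

Setting an entry of mu to zero removes that edge from the circuit, which is exactly the bilinear product of (11.3) that the following linearization replaces.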
First, we define an auxiliary complex variable 𝑗𝑘𝑚 for the current in each edge
of the graph, regardless of whether the edge is connected or not. This current,
given in (11.6), lacks physical meaning if the edge is disconnected and is used
only as an auxiliary variable.

𝑗𝑘𝑚 = 𝑦𝑘𝑚 𝑣𝑘𝑚 (11.6)

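Then, the bi-linear equation related to current in each edge is replaced by a
linear equivalent as explained in Chapter 4:

\[ -\delta I^{\text{real}} \mu_{km} \leq i_{km}^{\text{real}} \leq \delta I^{\text{real}} \mu_{km} \tag{11.7} \]
\[ -\delta I^{\text{imag}} \mu_{km} \leq i_{km}^{\text{imag}} \leq \delta I^{\text{imag}} \mu_{km} \tag{11.8} \]
\[ j_{km}^{\text{real}} - (1 - \mu_{km})\,\delta I^{\text{real}} \leq i_{km}^{\text{real}} \leq j_{km}^{\text{real}} + (1 - \mu_{km})\,\delta I^{\text{real}} \tag{11.9} \]
\[ j_{km}^{\text{imag}} - (1 - \mu_{km})\,\delta I^{\text{imag}} \leq i_{km}^{\text{imag}} \leq j_{km}^{\text{imag}} + (1 - \mu_{km})\,\delta I^{\text{imag}} \tag{11.10} \]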

where 𝛿𝐼 represents the maximum deviation of the current in each branch.
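The effect of constraints (11.7) and (11.9) on one current component can be seen by computing the interval they leave feasible. This is a small sketch with a hypothetical helper and made-up numbers; it assumes the auxiliary current satisfies |j| ≤ δI.

```python
def current_interval(mu, j, delta):
    """Interval [lo, hi] allowed for i by (11.7) and (11.9), one component."""
    lo = max(-delta * mu, j - (1 - mu) * delta)
    hi = min(delta * mu, j + (1 - mu) * delta)
    return lo, hi

delta = 2.0   # assumed bound on the branch current
j = 0.75      # auxiliary current y_km * v_km (hypothetical value)

closed = current_interval(1, j, delta)   # switch closed
opened = current_interval(0, j, delta)   # switch open
print(closed, opened)
```

With the switch closed the interval collapses to the single point i = j, and with the switch open it collapses to i = 0, so the four linear inequalities reproduce the product 𝜇𝑘𝑚 𝑗𝑘𝑚 for binary 𝜇𝑘𝑚.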

Next, the power balance in each node is linearized using a complex linearization
around 𝑣𝑘 = 1∠0°, as presented below¹:

𝑖𝑘 = 𝑠𝑘∗ (2 − 𝑣𝑘∗ ) (11.11)
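The quality of this linearization near the nominal voltage can be checked with plain complex arithmetic. The values of s and v below are hypothetical; the sketch simply compares the exact current (11.4) with the linear expression (11.11).

```python
s = 0.05 + 0.02j    # hypothetical nodal power in pu
v = 0.98 - 0.01j    # hypothetical voltage close to 1 pu

i_exact = (s / v).conjugate()                    # (11.4)
i_linear = s.conjugate() * (2 - v.conjugate())   # (11.11)
error = abs(i_exact - i_linear)

# At v = 1 both expressions coincide
error_at_nominal = abs((s / 1.0).conjugate() - s.conjugate() * (2 - 1.0))
print(error, error_at_nominal)
```

The difference equals |s|·|1 − v|²/|v|, so it decays quadratically as the voltage approaches 1 pu, which is the usual operating range of a distribution feeder.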

1 See Appendix B for details about complex linearizations.

At this point, the model is mixed-integer linear. However, it is required to
impose a radiality constraint; otherwise, the model would connect all switches,
since a meshed grid is more efficient than a radial grid. Radiality is required
in classic power distribution networks because the protections are calibrated for
such a configuration. We use the radiality constraints proposed in [85], which are
based on two key observations from graph theory: first, a radial grid (i.e., a tree)
with 𝑛 nodes has exactly 𝑛 − 1 edges; and second, the graph must be connected.
The first observation can be imposed in the model as the following affine constraint:
\[ \sum_{km \in \mathcal{E}} \mu_{km} = n - 1 \tag{11.12} \]

where 𝑛 is the number of nodes. In this way, we ensure there are only 𝑛 − 1
switches connected in the grid. The second condition can be imposed by
noticing that the Laplacian matrix associated to the graph must be diagonally
dominant. The Laplacian matrix 𝑊 is defined as follows:

𝑊 = 𝐴 diag(𝜇)𝐴⊤ (11.13)

where diag(𝜇) is a diagonal matrix of size |ℰ| × |ℰ|. We could impose a con-
straint such that 𝑊 is positive semidefinite in which case, we would obtain a
semidefinite programming problem. However, it is straightforward to impose a
simple linear constraint related to the diagonal-dominant characteristic of the
Laplacian matrix, namely:
\[ w_{kk} \geq \sum_{m} w_{km}, \quad \forall k \tag{11.14} \]
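The two graph-theoretic observations can be explored on a small example. The sketch below uses a hypothetical 4-node graph with four candidate switches, and it tests connectivity through the rank of the Laplacian (a connected graph with n nodes has a Laplacian of rank n − 1); this rank test is a swapped-in check for illustration, not the linear constraint (11.14) used in the model.

```python
import numpy as np

# Hypothetical candidate switches on a 4-node graph
edges = [(0, 1), (1, 2), (0, 2), (2, 3)]
n = 4
A = np.zeros((n, len(edges)))
for col, (k, m) in enumerate(edges):
    A[k, col], A[m, col] = 1.0, -1.0

def check(mu):
    """Return (edge count equals n-1, graph is connected) for switch states mu."""
    W = A @ np.diag(mu) @ A.T                      # Laplacian (11.13)
    has_n_minus_1 = int(sum(mu)) == n - 1          # condition (11.12)
    connected = bool(np.linalg.matrix_rank(W) == n - 1)
    return has_n_minus_1, connected

tree = check([1, 1, 0, 1])   # edges 0-1, 1-2, 2-3: a radial grid
loop = check([1, 1, 1, 0])   # triangle 0-1-2 with node 3 isolated
print(tree, loop)
```

The second configuration satisfies (11.12) but leaves one node without supply, which shows why the edge count alone does not guarantee a radial and connected feeder.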

Collecting all the aforementioned approximations, we obtain a mixed-integer
linear programming model for the primary feeder reconfiguration. Let us see
the use of the model by a simple example.

Example 11.1. Let us solve the power system reconfiguration problem in
the classic three-feeder test system proposed by Civanlar in [84]. For the sake
of completeness, the parameters of these feeders are presented in Table 11.1.
We store all the parameters in a graph using the module networkx (see
Appendix A). The inputs of the model are the matrix of admittance 𝑌ℰ , the
incidence matrix, and the vector of nodal powers. The corresponding code for
minimizing active power loss is presented below:

import cvxpy as cvx

# num_nodes, num_edges, A, Yedge_real, Yedge_imag, S_real, S_imag,
# deltaI_real and deltaI_imag are assumed to be already defined
Vnode_real = cvx.Variable(num_nodes)
Vnode_imag = cvx.Variable(num_nodes)
Inode_real = cvx.Variable(num_nodes)
Inode_imag = cvx.Variable(num_nodes)
Vedge_real = cvx.Variable(num_edges)
Vedge_imag = cvx.Variable(num_edges)
Iedge_real = cvx.Variable(num_edges)
Iedge_imag = cvx.Variable(num_edges)
Jedge_real = cvx.Variable(num_edges)
Jedge_imag = cvx.Variable(num_edges)
W = cvx.Variable((num_nodes,num_nodes))
mu = cvx.Variable(num_edges, integer=True)

re = [mu >= 0, mu <= 1,
      Vnode_real[0] == 1,
      Vnode_imag[0] == 0,
      Vnode_real <= 1.2,
      Vnode_real >= 0.8,
      Vnode_imag <= 0.05,
      Vnode_imag >= -0.05,
      Vedge_real == A.T@Vnode_real,
      Vedge_imag == A.T@Vnode_imag,
      Jedge_real == Yedge_real@Vedge_real-Yedge_imag@Vedge_imag,
      Jedge_imag == Yedge_real@Vedge_imag+Yedge_imag@Vedge_real,
      Inode_real == A@Iedge_real,
      Inode_imag == A@Iedge_imag,
      W == A@cvx.diag(mu)@A.T,
      cvx.sum(mu) == num_nodes-1]
for k in range(1,num_nodes):
    # linearized power balance (11.11), real and imaginary parts
    re += [Inode_real[k]==S_real[k]*(2-Vnode_real[k])+
           S_imag[k]*(Vnode_imag[k])]
    re += [Inode_imag[k]==S_real[k]*(Vnode_imag[k])
           -S_imag[k]*(2-Vnode_real[k])]
    sm = 0
    for m in range(num_nodes):
        sm = sm + W[k,m]
    re += [sm >= 0]

for k in range(num_edges):
    # linear equivalent of i = mu*j, constraints (11.7)-(11.10)
    re += [-mu[k]*deltaI_real[k] <= Iedge_real[k]]
    re += [-mu[k]*deltaI_imag[k] <= Iedge_imag[k]]
    re += [Iedge_real[k] <= mu[k]*deltaI_real[k]]
    re += [Iedge_imag[k] <= mu[k]*deltaI_imag[k]]
    re += [Iedge_real[k] <= Jedge_real[k]+(1-mu[k])*deltaI_real[k]]
    re += [Iedge_imag[k] <= Jedge_imag[k]+(1-mu[k])*deltaI_imag[k]]
    re += [Iedge_real[k] >= Jedge_real[k]-(1-mu[k])*deltaI_real[k]]
    re += [Iedge_imag[k] >= Jedge_imag[k]-(1-mu[k])*deltaI_imag[k]]

fo = cvx.Minimize(Inode_real[0])
Reconfiguration = cvx.Problem(fo,re)

Notice that minimizing power loss is equivalent to minimizing the power
injected at the substation (i.e., 𝑖0). The reader is invited to experiment with
this code.


Table 11.1 Parameters of the three-feeder test system for power system reconfiguration [84].

From To 𝒓𝒌𝒎 (pu) 𝒙𝒌𝒎 (pu) 𝒑𝒌 (pu) 𝒒𝒌 (pu)

SL N4 0.0750 0.1000 0.02 0.02

N4 N5 0.0800 0.1100 0.03 0.00
N4 N6 0.0900 0.1800 0.02 0.00
N6 N7 0.0400 0.0400 0.02 0.01
SL N8 0.1100 0.1100 0.04 0.03
N8 N9 0.0800 0.1100 0.05 0.02
N8 N10 0.1100 0.1100 0.01 0.01
N9 N11 0.1100 0.1100 0.01 −0.01
N9 N12 0.0800 0.1100 0.05 −0.02
SL N13 0.1100 0.1100 0.01 0.01
N13 N14 0.0900 0.1200 0.01 −0.01
N13 N15 0.0800 0.1100 0.01 0.01
N15 N16 0.0400 0.0400 0.02 −0.01
N5 N11 0.0400 0.0400 0.00 0.00
N10 N14 0.0400 0.0400 0.00 0.00
N7 N16 0.0900 0.1200 0.00 0.00

11.3 Optimal placement of capacitors

Large primary feeders require reactive compensation (e.g., shunt capacitors)
installed at appropriate locations in order to reduce power and energy losses
and improve voltage profile [83]. An optimization model is required to define
the size and place of each of these shunt capacitors, resulting in a problem
that is closely related to the optimal power flow (OPF), previously studied in
Chapter 10.
Shunt capacitors are represented as discrete injections of reactive power per
unit. These capacitors can be fixed or switching capacitor banks located along
the primary feeder. For the sake of simplicity, we address only the case of fixed
capacitors.
The objective function consists in minimizing power loss and/or cost,
whereas the result of the optimization is the size and placement of the shunt
capacitors. It is not economically viable to place capacitors in all nodes and
hence, the amount of reactive power must be limited. A basic optimization
model is presented below:


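\[
\begin{aligned}
\min\ & f_{\text{objective}} \\
& p_L \geq \operatorname{real}\Big( \sum_{k} \sum_{m} y_{km} v_k v_m \Big) \\
& v_0 = 1 + 0j \\
& \delta \geq \|v_k - 1\|, \quad \forall k \in \mathcal{N} \\
& i_{km}^{\max} \geq \|y_{km}(v_k - v_m)\|, \quad \forall km \in \mathcal{E} \\
& s_k^* - d_k^* - j \xi_k q_{\text{nom}} = \sum_{m} y_{km} v_k^* v_m, \quad \forall k \in \mathcal{N} \\
& \|s_k\| \leq s_k^{\max}, \quad \forall k \in \mathcal{N} \\
& \sum_{k} \xi_k \leq \xi^{\text{available}} \\
& \xi_k \in \{0, 1\}, \quad \forall k \in \mathcal{N}
\end{aligned}
\tag{11.15}
\]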

where 𝑞nom is the nominal value of the shunt capacitors to be placed in the
feeder and 𝜉𝑘 is a binary variable that indicates the placement of one capacitor
in node 𝑘; the number of capacitors to be placed in the feeder is limited by
𝜉available; the rest of the variables and constraints have the same interpretation
as in the OPF problem studied in Chapter 10.
Model (11.15) has two sources of complexity: first, power flow
equations are non-affine; and second, the problem is mixed-integer. The first
issue can be addressed by using convex approximations, whereas the second
issue is handled directly in CvxPy through a mixed-integer solver. Either
linearization or conic approximations (i.e., SOC and SDP) can be used in this
problem. A linearization is convenient since the model results in a mixed-integer
quadratic programming (MIQP) problem, and this type of model is solvable in
practice; the quadratic term comes from the power loss equation, which is convex.
SOC and SDP approximations result in mixed-integer second-order cone and
mixed-integer semidefinite programming problems, which are computationally
more demanding.
We may be interested in minimizing operation costs in a planning period
(e.g., one year). In that case, the objective function includes costs associated
with the power and energy losses, as well as the cost of installation of shunt
capacitors. The objective function is therefore given by Equation (11.16):

𝑓objective = 𝑓loss (𝑝𝐿 , 𝑒𝐿 ) + 𝑓installation (𝑞) (11.16)

where 𝑒𝐿 is the energy loss, and 𝑓loss, 𝑓installation are functions of annual costs
and installation, respectively. These functions are usually linear. The number
of binary variables is the same in this case, but the feeder must be represented
with a load curve in order to calculate both power and energy losses. Some
countries have different penalization costs for power and energy loss, and then
the model must be adapted to each grid code.

Figure 11.2 Radial distribution network with 𝑧𝑘𝑚 = 0.01 + 0.005𝑗. Loads at nodes 1 to 7 (pu): 0.020 + 0.015j, 0.025 + 0.015j, 0.024 + 0.017j, 0.030 + 0.025j, 0.020 + 0.015j, 0.023 + 0.015j, 0.020 + 0.010j.

Example 11.2. Let us consider the 8-node primary feeder shown in Figure
11.2; loads are depicted in the figure and impedance is 𝑧𝑘𝑚 = 0.01 + 0.005𝑗 for
all line segments. We already calculated the matrix 𝑌bus and have a vector 𝑑 of
size 𝑛 that stores the loads; up to 0.01pu reactive power compensation will be
allowed. The optimization model is presented below:

import numpy as np
import cvxpy as cvx

# Ybus and d are assumed to be already defined
n = 8
v = cvx.Variable(n,complex=True)       # voltages
W = cvx.Variable((n,n),complex=True)   # linearization
s0 = cvx.Variable(complex=True)        # power at slack
xi = cvx.Variable(n, boolean=True)     # shunt capacitors
pL = cvx.Variable()                    # power loss
s = n*[0]
s[0] = s0
M = Ybus@W
res = [pL >= cvx.quad_form(cvx.real(v),Ybus.real)+
       cvx.quad_form(cvx.imag(v),Ybus.real)]
res += [v[0]==1.0]
for k in range(n):
    res += [cvx.conj(s[k]+0.01j*xi[k]-d[k]) == M[k,k]]
    res += [cvx.abs(v[k]-1) <= 0.1]
    res += [xi[k]>= 0]
    res += [xi[k]<= 1]
    for m in range(n):
        res += [W[m,k] == cvx.conj(v[k])+v[m]-1]
res += [cvx.sum(xi) <= 1]
obj = cvx.Minimize(pL)
OPCAP = cvx.Problem(obj,res)
OPCAP.solve()
print('pL',pL.value)
print('shunt capacitors', np.round(xi.value))


In this case, we used a linear approximation of the power flow equations.
The model places a capacitor at Node 7, resulting in a power loss of
𝑝𝐿 = 0.00102. The reader is invited to experiment with this model; for
example, relax the binary constraint (boolean=False) and compare the
results.

Example 11.3. Power distribution networks may include active components
such as D-STATCOMs (distribution static synchronous compensators). These
components are basically voltage source converters equipped with a suitable control
that maintains a constant reactive power or nodal voltage. The model for optimal
placement of capacitors can be easily extended to the optimal placement
of D-STATCOMs. In that case, the term 𝜉𝑘 𝑞nom in the power flow equations of
Model (11.15) is replaced by a new continuous variable 𝑞𝑘, and a new
constraint is included as follows:

−𝜉𝑘 𝑞nom ≤ 𝑞𝑘 ≤ 𝜉𝑘 𝑞nom (11.17)

Therefore, the optimization model returns not only the placement of the
component but also the reactive power that it must inject into the grid.

11.4 Optimal placement of distributed generation

Modern power distribution networks include a massive penetration of dis-
tributed generation, especially renewable sources such as photovoltaic and
wind generation, motivated by a growing concern about global warming.
The optimal placement of this distributed generation constitutes an
optimization problem that can be efficiently solved by the convex
approximations presented in the previous chapter. The problem is discrete,
although a good approximation can be obtained if binary variables are
relaxed [86].
The main objective is to minimize power loss 𝑝𝐿 , although other objectives
such as costs or reliability can be considered, subject to physical constraints
similar to the OPF problem. A binary variable 𝜉𝑘 ∈ {0, 1} is defined for each node, where
𝜉𝑘 = 1 if a distributed generator is placed in Node 𝑘. For the sake of simplicity,
all generators are considered to have the same capacity 𝑠nom, resulting in the following
optimization model:


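\[
\begin{aligned}
\min\ & p_L \\
& p_L \geq \operatorname{real}\Big( \sum_{k} \sum_{m} y_{km} v_k v_m \Big) \\
& v_0 = 1 + 0j \\
& \delta \geq \|v_k - 1\|, \quad \forall k \in \mathcal{N} \\
& i_{km}^{\max} \geq \|y_{km}(v_k - v_m)\|, \quad \forall km \in \mathcal{E} \\
& s_k^* - d_k^* = \sum_{m} y_{km} v_k^* v_m, \quad \forall k \in \mathcal{N} \\
& \|s_k\| \leq \xi_k s_{\text{nom}}, \quad \forall k \in \mathcal{N} \\
& \sum_{k} \xi_k \leq \xi^{\max} \\
& \xi_k \in \{0, 1\}, \quad \forall k \in \mathcal{N}
\end{aligned}
\tag{11.18}
\]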

where 𝜉max is the maximum number of distributed generators to be placed in
the system. This is a mixed-integer convex programming problem that can be
efficiently solved using mixed-integer methods such as the Branch and Bound
method. The model returns not only the placement but also the sizing of
distributed generators. The costs of these generators can also be included in the
objective function; in any case, the model remains (mixed-integer) convex.

Example 11.4. We are interested in placing two distributed generators of
𝑠nom = 0.02pu in the primary distribution feeder presented in Figure 11.2,
with the objective of minimizing power loss. The code in Python for solving this
problem is presented below (we assume parameters of the grid are stored in a
graph G):

import numpy as np
import cvxpy as cvx

# G (networkx graph) and Ybus are assumed to be already defined
n = G.number_of_nodes()
d = np.array([G.nodes[k]['d'] for k in G.nodes])

nt = 2
v = cvx.Variable(n,complex=True)
W = cvx.Variable((n,n),complex=True)
s = cvx.Variable(n,complex=True)
xi = cvx.Variable(n,nonneg=True)   # continuous relaxation of the binary variable
pL = cvx.Variable()
M = Ybus@W
res = [pL >= cvx.quad_form(cvx.real(v),Ybus.real)+
       cvx.quad_form(cvx.imag(v),Ybus.real)]
res += [v[0]==1.0]
for k in range(n):
    res += [cvx.conj(s[k]-d[k]) == M[k,k]]
    res += [cvx.abs(v[k]-1) <= 0.05]
    for m in range(n):
        res += [W[m,k] == cvx.conj(v[k])+v[m]-1]
for k in range(1,n):   # except the slack
    res += [cvx.abs(s[k]) <= 0.02*xi[k]]
    res += [xi[k]>=0, xi[k]<=1]
res += [cvx.sum(xi) <= nt]

obj = cvx.Minimize(pL)
HOSTCAP = cvx.Problem(obj,res)
HOSTCAP.solve()
print(HOSTCAP.status,obj.value)
print(np.round(xi.value,3))

Although the model is binary, we use a continuous relaxation that returns a
final solution of 𝜉6 = 𝜉7 = 1. Notice that most of the code is similar to the
OPF problem for a linear approximation. The reader is invited to test an SOC
approximation for the problem.

11.5 Hosting capacity of solar energy

High penetration of renewable resources, especially solar photovoltaic, could
create overvoltages along primary feeders [87]. Therefore, it is necessary
to define the amount of solar energy that can be hosted on a power distribution
network without adversely impacting safety, power quality, reliability, or other
operational features [1].
The hosting capacity model is similar to the OPF; however, in this case
the capacity of each distributed generator is also a variable. The objective is
to determine the maximum amount of power that can be generated in each
node without jeopardizing the normal operation of the system. The model is
presented below using a complex linearization of the power flow equations:
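\[
\begin{aligned}
\max\ & f_{\text{objective}} \\
& v_0 = 1 + 0j \\
& \delta \geq \|v_k - 1\|, \quad \forall k \in \mathcal{N} \\
& i_{km}^{\max} \geq \|y_{km}(v_k - v_m)\|, \quad \forall km \in \mathcal{E} \\
& s_k^* - d_k^* = \sum_{m} y_{km} v_k^* v_m, \quad \forall k \in \mathcal{N} \\
& (1/\rho)\, h_k \geq \|s_k\|, \quad \forall k \in \mathcal{N} - \{0\} \\
& h_k \leq \operatorname{real}(s_k), \quad \forall k \in \mathcal{N} - \{0\}
\end{aligned}
\tag{11.19}
\]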


Figure 11.3 Example of two conﬁgurations of distributed generation.

In this model, ℎ𝑘 represents the maximum active power that Node 𝑘 can
host; the objective function measures the maximum amount of power that
the system can host, subject to the power flow equations and limits of voltage
and current flow; 𝜌 represents the minimum power factor in each power
electronic converter.
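The role of 𝜌 as a minimum power factor can be verified with a few lines of arithmetic. The sketch below assumes the two converter constraints in the form coded in Example 11.5, namely ‖𝑠𝑘‖ ≤ ℎ𝑘∕𝜌 and real(𝑠𝑘) ≥ ℎ𝑘, with 1∕𝜌 = 1.2; the helper functions and the two operating points are hypothetical.

```python
rho = 1 / 1.2   # minimum power factor implied by the factor 1.2 in Example 11.5

def hosting_ok(s, h, rho):
    """Check |s| <= h/rho and real(s) >= h for one converter."""
    return abs(s) <= h / rho + 1e-12 and s.real >= h - 1e-12

def power_factor(s):
    return s.real / abs(s)

s_good = 1.0 + 0.5j   # power factor of about 0.89
s_bad = 1.0 + 0.8j    # power factor of about 0.78

ok_good = hosting_ok(s_good, 1.0, rho)
ok_bad = hosting_ok(s_bad, 1.0, rho)
print(ok_good, power_factor(s_good), ok_bad, power_factor(s_bad))
```

Combining both inequalities gives real(𝑠𝑘)∕‖𝑠𝑘‖ ≥ 𝜌, so any operating point accepted by the pair of constraints has a power factor of at least 𝜌.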
There are different metrics to measure the hosting capacity. For instance, we
may be interested in maximizing the total distributed generation; in that case
the objective function is given by Equation (11.20),

\[ f_{\text{objective}} = \sum_{k} h_k \tag{11.20} \]

However, it may be the case that most of the distributed generation
concentrates in a single node. To solve this problem, a metric based on the
hypervolume of the new distributed generation is proposed, namely:

\[ \max \prod_{k} h_k \tag{11.21} \]

where ∏ represents the product among the active power generated in the grid.
In order to understand the logic behind Equation (11.21), consider a
system with two distributed generators ℎ1, ℎ2 with two possible configurations
shown in Figure 11.3. Both configurations host the same amount of distributed
generation, i.e., host(𝐴) = 60MW + 40MW = 100MW and host(𝐵) = 90MW +
10MW = 100MW; however, 𝐵 concentrates most of the power in one node
whereas 𝐴 distributes the power more equitably; this can be measured by the
area(𝐴) = 60 × 40 = 2400, which is greater than area(𝐵) = 90 × 10 = 900. In a system
with three generators, we can calculate the volume vol = ℎ1 ℎ2 ℎ3 instead of the area, and
in the general case we can calculate the hypervolume given by Equation (11.21).
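The comparison between the two metrics takes only a few lines of Python. The sketch uses the two configurations quoted in the text (60 MW + 40 MW and 90 MW + 10 MW); the variable names are illustrative.

```python
import math

config_A = (60.0, 40.0)   # MW hosted at each node, configuration A
config_B = (90.0, 10.0)   # MW hosted at each node, configuration B

# Total hosted generation: identical for both configurations
total_A, total_B = sum(config_A), sum(config_B)

# Hypervolume metric: product of the hosted power at each node
area_A, area_B = math.prod(config_A), math.prod(config_B)

print(total_A, total_B, area_A, area_B)
```

The sum cannot distinguish the two configurations, whereas the product rewards the more balanced one; the same computation extends to any number of nodes by adding entries to the tuples.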


Equation (11.21) can be transformed using a logarithmic function,
as follows:

\[ \ln\Big( \prod_{k} h_k \Big) = \sum_{k} \ln(h_k) \tag{11.22} \]

Notice that ln is a monotone and concave function, and hence, we can define the
following convex objective function:

\[ \min\ -\sum_{k} \ln(h_k) \tag{11.23} \]

subject to the same constraints of Model (11.19). In the following
example, we compare objectives (11.20) and (11.23):

Example 11.5. Let us define the hosting capacity of the 8-node radial
distribution network presented in Figure 11.2; we already defined a graph G with all
the parameters of the grid. The code in Python for Model (11.19) with the
objective (11.20) is presented below:
import numpy as np
import cvxpy as cvx

# G (networkx graph) and Ybus are assumed to be already defined
n = G.number_of_nodes()
d = np.array([G.nodes[k]['d'] for k in G.nodes])

v = cvx.Variable(n,complex=True)
W = cvx.Variable((n,n),complex=True)
s = cvx.Variable(n, complex=True)
h = cvx.Variable(n)
M = Ybus@W
res = [v[0]==1.0]
for k in range(n):
    res += [cvx.conj(s[k]-d[k]) == M[k,k]]
    res += [cvx.abs(v[k]-1) <= 0.05]
    for m in range(n):
        res += [W[m,k] == cvx.conj(v[k])+v[m]-1]

htotal = 0
for k in range(1,n):   # except the slack
    htotal = htotal + h[k]
    res += [cvx.abs(s[k]) <= 1.2*h[k]]   # minimum power factor
    res += [cvx.real(s[k]) >= h[k]]
obj = cvx.Maximize(htotal)
HOSTCAP = cvx.Problem(obj,res)
HOSTCAP.solve()
print(HOSTCAP.status,obj.value)
print('hosting:',np.round(h.value,3))

After executing this code, a total power of 4.63pu is placed along the
feeder with ℎ = (0, 4.54, 0.03, 0.02, 0.01, 0.01, 0.01, 0.01)⊤; notice that most
of the new generation is concentrated in a single node. However, in
the case of the maximum hypervolume, the hosting capacity vector is
ℎ = (0, 0.73, 0.37, 0.24, 0.18, 0.15, 0.12, 0.10)⊤. The maximum hypervolume
approach gives a better distribution of the new distributed generation.

11.6 Harmonics and reactive power compensation

Power distribution systems may include non-linear loads, such as diode rec-
tifiers and/or saturated magnetic devices, that introduce harmonic currents to
the system. These harmonics, which are represented as currents at a multiple of
the fundamental frequency, create power quality problems in the grid, so
they must be reduced or, if possible, eliminated. One simple and efficient
way to reduce these harmonics is by means of active filters as depicted in Figure
11.4. An active filter is a power electronic converter, usually a pulse-width mod-
ulated voltage-source converter, that injects currents that compensate for both
reactive power and harmonic content. Most renewable resources, such as solar
photovoltaics and wind energy, are integrated through these types of power
electronic devices; hence, the compensation action may be performed by these
devices.
A power electronic converter is able to control the output currents by using
techniques such as carrier-based modulation, space vector modulation, or
hysteresis control². The reference of these currents is defined by a compensation
theory. There are different compensation theories, as well as different definitions
of reactive power under harmonic distortion. In this section, we present a
simple compensation theory based on mathematical optimization.
Indeed, the value of the current injected by the active filter can be obtained
by a simple optimization model. Let us consider a three-phase system with load
currents 𝑖𝐴, 𝑖𝐵, and 𝑖𝐶; line-to-neutral voltages are given by 𝑣𝐴, 𝑣𝐵, and 𝑣𝐶; the
currents injected by the active filter are 𝑢𝐴, 𝑢𝐵, and 𝑢𝐶. Therefore, the objective
is to minimize the root mean square of the line current in one period,
subject to power balance, as presented below:
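\[
\begin{aligned}
\min\ & \frac{1}{T} \int_{t}^{t+T} \Big( \sum_{k \in \Phi} (i_k - u_k)^2 \Big)\, dt' \\
& \frac{1}{T} \int_{t}^{t+T} \Big( \sum_{k \in \Phi} v_k u_k \Big)\, dt' = 0
\end{aligned}
\tag{11.24}
\]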

2 See [88] for more details about modulation and control of power electronic converters, for
the integration of renewable energies.


Figure 11.4 Schematic representation of the voltages and currents involved in the
model for reactive power compensation and harmonic ﬁltering.

where 𝑇 = 1∕𝑓 is the period of the fundamental frequency and Φ = {𝐴, 𝐵, 𝐶};
notice that minimizing line currents also reduces the power losses of the system.
The constraint indicates that the average power delivered by the active filter
over one period is zero; that is, on average, the active filter does not deliver
power to the grid.
This model is designed for real-time control, meaning that all variables are
time-dependent. The optimization model is solved using Lagrange multipliers;
therefore, the Lagrangian function is calculated, namely:

𝑡+𝑇 𝑡+𝑇
1 ∑ 𝜆 ∑
ℒ(𝑢, 𝜆) = ∫ ( (𝑖𝑘 − 𝑢𝑘 )2 ) 𝑑𝑡′ + ∫ ( 𝑣𝑘 𝑢𝑘 ) 𝑑𝑡′ (11.25)
𝑇 𝑘∈Φ
𝑇 𝑘∈Φ
𝑡 𝑡

The first-order optimality conditions imply that the derivative of ℒ with
respect to each filter current 𝑢𝑘 is equal to zero,

\[
\frac{1}{T}\int_{t}^{t+T}\big(-2(i_k-u_k)+\lambda v_k\big)\,dt' = 0
\tag{11.26}
\]

This condition holds if the integrand is zero at every instant. Therefore,
we have the following expression:

−2(𝑖𝑘 − 𝑢𝑘 ) + 𝜆𝑣𝑘 = 0 (11.27)


Therefore, the current in each phase is given by the equation presented below:
\[
u_k = i_k - \frac{\lambda v_k}{2}
\tag{11.28}
\]
Let us multiply Equation (11.27) by 𝑣𝑘 and sum over the three phases to obtain
the following expression:
\[
\sum_{k\in\Phi}\big(-2 i_k v_k + 2 v_k u_k + \lambda v_k^2\big) = 0
\tag{11.29}
\]

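Now, we integrate this expression over one period and use the fact that the
active filter does not deliver average power, so the term in 𝑣𝑘𝑢𝑘 vanishes.
Therefore, the following expressions are obtained:
\[
\frac{1}{T}\int_{t}^{t+T}\Big(\sum_{k\in\Phi}\big(-2 i_k v_k + 2 v_k u_k + \lambda v_k^2\big)\Big)\,dt' = 0
\tag{11.30}
\]
\[
-2\bar{p} + \lambda \bar{v}^2 = 0
\tag{11.31}
\]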

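where 𝑝̄ is the average power, given by the equation presented below:

\[
\bar{p} = \frac{1}{T}\int_{t}^{t+T}\Big(\sum_{k\in\Phi} v_k i_k\Big)\,dt'
\tag{11.32}
\]

and 𝑣̄² is the three-phase square voltage, given by the following expression:
\[
\bar{v}^2 = \frac{1}{T}\int_{t}^{t+T}\Big(\sum_{k\in\Phi} v_k^2\Big)\,dt'
\tag{11.33}
\]
\[
\bar{v}^2 = v_{A(\mathrm{rms})}^2 + v_{B(\mathrm{rms})}^2 + v_{C(\mathrm{rms})}^2
\tag{11.34}
\]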

Finally, substituting 𝜆 from Equation (11.31) into Equation (11.28), the
compensation current is obtained:
\[
u_k = i_k - \frac{\bar{p}}{\bar{v}^2}\, v_k
\tag{11.35}
\]
This simple expression defines the optimal current that the active filter must
inject in order to reduce line currents and power loss. One aspect that is missing
in this description is the effect on the power factor of the proposed approach.
The example presented below shows this effect in practice.
Example 11.6. Let us consider a non-linear load with the fifth harmonic. To
analyze this load, we define first a function that generates three-phase variables
at the positive sequence, as presented in the code below:

Figure 11.5 Three-phase currents for a non-linear load with 5th harmonic.

import numpy as np
import matplotlib.pyplot as plt

t = np.linspace(0,2/60,100)   # two periods at 60 Hz
w = 2*np.pi*60                # angular frequency (rad/s)
def three_phase(w,m,ph):
    # positive-sequence signals with magnitude m and phase ph
    xA = m*np.cos(w*t+ph)
    xB = m*np.cos(w*t+ph-2*np.pi/3)
    xC = m*np.cos(w*t+ph+2*np.pi/3)
    return xA, xB, xC

This function receives the angular frequency w, a magnitude m, and a phase
ph. Next, the function is used to generate three-phase voltages and currents, as
follows:
vA,vB,vC = three_phase(w,170,0) # voltage
iA,iB,iC = three_phase(w,10,-0.8) # fundamental current
hA,hB,hC = three_phase(5*w,1,0.1) # harmonic current

iA = iA+hA
iB = iB+hB
iC = iC+hC

Figure 11.5 shows three-phase currents.

The compensation current is calculated using Equation (11.35) as presented
below:
pm = vA*iA+ vB*iB + vC*iC
vm = (vA**2).mean() + (vB**2).mean() + (vC**2).mean()
uA = iA - pm.mean()/vm*(vA)
uB = iB - pm.mean()/vm*(vB)
uC = iC - pm.mean()/vm*(vC)

This compensation not only reduces the harmonic contents of the line currents
but also achieves a unity power factor. The reader is invited to plot currents
𝑖𝑘 − 𝑢𝑘 and analyze the results.
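The check suggested above can be sketched numerically as follows. This is a minimal, self-contained sketch rather than the script of the example: the helper names three_phase (here taking the time vector as an explicit argument) and compensation_current are assumptions introduced for illustration, and the time grid excludes its end point so that the averages cover exactly two periods.

```python
import numpy as np

def three_phase(t, w, m, ph):
    """Positive-sequence three-phase signal, returned with shape (3, len(t))."""
    shifts = np.array([0.0, -2*np.pi/3, 2*np.pi/3])
    return m*np.cos(w*t[None, :] + ph + shifts[:, None])

def compensation_current(v, i):
    """Filter current from Equation (11.35); v and i have shape (3, N)."""
    p_avg = (v*i).sum(axis=0).mean()    # average three-phase power
    v_sq = (v**2).sum(axis=0).mean()    # sum of the squared rms voltages
    return i - (p_avg/v_sq)*v

f = 60.0
w = 2*np.pi*f
t = np.linspace(0, 2/f, 400, endpoint=False)   # exactly two periods
v = three_phase(t, w, 170, 0)
i = three_phase(t, w, 10, -0.8) + three_phase(t, 5*w, 1, 0.1)
u = compensation_current(v, i)
line = i - u                                   # current seen by the grid

filter_power = (v*u).sum(axis=0).mean()        # average power of the filter
rms_load = np.sqrt((i**2).mean(axis=1))
rms_line = np.sqrt((line**2).mean(axis=1))
print("average filter power:", filter_power)
print("rms load current per phase:", rms_load)
print("rms line current per phase:", rms_line)
```

Under these assumptions, the line current is a scaled copy of the voltage, so it carries no harmonic component and is in phase with the voltage, while the average power of the filter vanishes up to rounding error.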


11.7 Further readings

All methods presented in this chapter may be extended to three-phase
unbalanced power grids. In that case, an efficient way to store the parameters
of the system is required, although the main ideas remain the same.
The interested reader is referred to [89].
Heuristic algorithms have also been proposed to solve the optimal capacitor
placement as well as the optimal placement of distributed generation. See for
example [83] and [90] for the capacitor problem and [91] for the distributed
generation problem. Switched capacitors under unbalanced representation can
be included in the problem as given in [92]. Typically, these algorithms use a
master-slave strategy where a master problem chooses the placement and size
of the component, and the slave algorithm solves a power flow to obtain the
continuous variables. However, the random nature of the solutions obtained
by metaheuristics makes them unsuitable for real applications. In addition, the
abuse of biological and social metaphors tends to hide the mathematical and
physical structure of the original problem [93].
The hosting capacity problem can be solved using Monte Carlo simulation [94].
This method generates a large number of scenarios, and a power flow analysis
is performed for each one [95]. The large number of scenarios makes the method
cumbersome for everyday operation. Risk assessment tools have also been
proposed to solve the problem [96]. However, they also require the generation
of multiple scenarios, just as in the case of Monte Carlo methods.
Heuristic and metaheuristic methods have also been proposed for solving these
types of models [97]. However, a convex approximation, like the one presented
in this chapter, is sufficient to solve the problem.
The theory presented here for compensation of reactive power and active
filtering is, perhaps, the simplest approach to this problem. However, there
are several methodologies and theories that may be more complete and rigorous.
These theories can include the effect of the neutral current and other
compensation objectives, as was demonstrated in [98]. Other compensation
theories can be found in [99] and [88], and in seminal papers such as [100]
and [101].

11.8 Exercises
1. Solve the problem presented in Example 11.2 using a SOC formulation for
the power flow equations. Compare the results.
2. Solve the problem of optimal placement of D-STATCOMs in the system
presented in Example 11.2. Use the same parameters of costs and 𝑞nom .


3. Find the optimal capacitor placement for the 34-bus test system presented in
Table 10.1 (Chapter 10); T1 represents the load factor for operation between
0h:8h, T2 for 8h:16h, and T3 for 16h:24h. Try different objective functions,
for example power loss, energy loss, and/or costs.
4. Solve the problem presented in Example 11.4 considering binary variables
𝜉𝑘 and 𝑠nom = 0.3pu.
5. Solve the problem presented in Example 11.4 using an SDP approximation
for the power flow equations. Compare the results.
6. Solve the primary feeder reconfiguration for the test system presented in
Table 11.2.
7. Solve the problem presented in Example 11.5 using a SOC approximation
for the power flow equations.
8. Consider the problem presented in Example 11.5 but now, a loop is created
connecting nodes 2 and 5. Compare the results.
9. Determine the hosting capacity for the power distribution network pre-
sented in Table 10.1 (Chapter 10).
10. The concept of hosting capacity can be extended to dc distribution grids.
Formulate and solve the problem in the 21-nodes dc-distribution system
presented in Table 10.2.


Table 11.2 IEEE 33 nodes test distribution network [102].

From To 𝒓𝒌𝒎 (pu) 𝒙𝒌𝒎 (pu) 𝒑𝒌 (pu) 𝒒𝒌 (pu)

0 1 0.000575259 0.000297612 0.100 0.060

1 2 0.003075952 0.001566676 0.090 0.040
2 3 0.002283567 0.001162997 0.120 0.080
3 4 0.002377779 0.001211039 0.060 0.030
4 5 0.005109948 0.004411152 0.060 0.020
5 6 0.001167988 0.003860850 0.200 0.100
6 7 0.010677857 0.007706101 0.200 0.100
7 8 0.006426430 0.004617047 0.060 0.020
8 9 0.006488823 0.004617047 0.060 0.020
9 10 0.001226637 0.000405551 0.045 0.030
10 11 0.002335976 0.000772420 0.060 0.035
11 12 0.009159223 0.007206337 0.060 0.035
12 13 0.003379179 0.004447963 0.120 0.080
13 14 0.003687398 0.003281847 0.060 0.010
14 15 0.004656354 0.003400393 0.060 0.020
15 16 0.008042397 0.010737754 0.060 0.020
16 17 0.004567133 0.003581331 0.090 0.040
1 18 0.001023237 0.000976443 0.090 0.040
18 19 0.009385084 0.008456683 0.090 0.040
19 20 0.002554974 0.002984859 0.090 0.040
20 21 0.004423006 0.005848052 0.090 0.040
2 22 0.002815151 0.001923562 0.090 0.050
22 23 0.005602849 0.004424254 0.420 0.200
23 24 0.005590371 0.004374340 0.420 0.200
5 25 0.001266568 0.000645139 0.060 0.025
25 26 0.001773196 0.000902820 0.060 0.025
26 27 0.006607369 0.005825590 0.060 0.020
27 28 0.005017607 0.004371221 0.120 0.070
28 29 0.003166421 0.001612847 0.200 0.600
29 30 0.006079528 0.006008401 0.150 0.070
30 31 0.001937288 0.002257986 0.210 0.100
31 32 0.002127585 0.003308052 0.060 0.040
7 20 0.012478500 0.012478500 0 0
8 14 0.012478500 0.012478500 0 0
11 21 0.012478500 0.012478500 0 0
17 32 0.003119600 0.003119600 0 0
24 28 0.003119600 0.003119600 0 0

State estimation and grid identiﬁcation

Learning outcomes

By the end of this chapter, the student will be able to:

● Solve basic state estimation problems using the gradient method.
● Identify the 𝑌bus from measurements of voltage and current.
● Solve optimization problems including norms in the objective func-
tion.

12.1 Measurement units

Synchrophasors or phasor measurement units (PMUs) are devices that measure
voltage and current in magnitude and angle via global positioning system (GPS)
synchronization; i.e., the time-stamp given by the GPS is used to synchronize
measurements and obtain accurate values of nodal angles. PMUs are
common in modern power systems and constitute the primary tool to improve
observability. However, measurements alone are not enough to obtain an accurate
picture of the state of the grid. Therefore, a state estimation algorithm must
filter redundant data and compensate for spurious measurements [103]. This
algorithm is integrated into the supervisory control and data acquisition
(SCADA) system, and it must be accurate and highly efficient to operate in
real time.
The use of PMUs also allows estimating the 𝑌bus and even the grid’s topology
in a model known as grid identification or inverse power flow [104]. Both the
state estimation and the grid identification are studied in this chapter under
the assumption there are PMUs in all nodes. In order to keep our philosophy of
toy-models, our presentation is based on the module CvxPy. However, practical
implementations use tailored algorithms based on the gradient method, which
are faster and more efficient.

12.2 State estimation

Measurement instruments such as voltmeters and ammeters are never perfect
but have an intrinsic measurement error. Therefore, the actual state of a system
is always unknown, although it can be estimated using available measurements
and an optimization method known as state estimation.
To understand the problem, let us consider a simple circuit made up of a
voltage source and a resistor. Let us suppose the resistance is 5.0 Ω, and a
voltmeter connected in parallel to this resistor gives a voltage of 9.6 V,
whereas an ammeter connected in series gives a current of 1.83 A. We know that
the circuit must comply with Ohm’s law, but 1.83 A × 5.0 Ω = 9.15 V ≠ 9.6 V.
Which measurement should we trust, the voltmeter or the ammeter? It is not a
large discrepancy, but this type of error can spread in a system with thousands
of measurements. We require a systematic approach.
Consider a power system with different types of real-time measurement
instruments as well as pseudo-measurements (i.e., forecasts or historical data).
Then, a non-linear measurement model may be defined as (12.1):

𝑧 = ℎ(𝑥) + 𝑒 (12.1)

where 𝑧 ∈ ℝ^𝑚 is the vector of measurements (and pseudo-measurements),
𝑥 ∈ ℝ^𝑛 is the true state vector, ℎ ∶ ℝ^𝑛 → ℝ^𝑚 relates measurements and states,
and 𝑒 ∈ ℝ^𝑚 is the measurement error. The dimension of 𝑧 is higher than the
dimension of 𝑥 (𝑚 > 𝑛) in order to obtain an overdetermined system of
non-linear equations.¹ We only know 𝑧 and ℎ; thus, our objective is to find an
estimate for 𝑥 such that the estimation error is minimized. Therefore, the
problem can be represented as a weighted least squares model:
\[
\min_{x}\; \frac{1}{2}\,\big(z - h(x)\big)^{\top} W \big(z - h(x)\big)
\tag{12.2}
\]
where 𝑊 is a diagonal matrix that represents the weight associated with each
measurement. We assume each measurement instrument is independent, with zero
mean error and variance 𝜎𝑖²; therefore, 𝑊 = diag(1∕𝜎𝑖²). The problem may be
complemented with other equality and inequality constraints that represent
operating limits and unobservable parts of the network. For the sake of
simplicity, we focus on the unconstrained case; the reader who wishes to delve
into the subject can refer to [103].

1 In our naive example, we have one state 𝑥 = 𝑖, two equations ℎ1 = 𝑅𝑖, ℎ2 = 𝑖, and two
measurements 𝑧1 = 𝑣, 𝑧2 = 𝑖.
In the classic formulation of the problem, the state variables 𝑥 are the nodal
voltage (magnitude and angle), whereas the measurements 𝑧 include volt-
age magnitudes, active and reactive power flows, active and reactive power
injections, and current magnitudes, among others. Modern state estimation
models include phasor measurement units which allow obtaining the angles
as real-time and synchronized measurements.
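As a minimal illustration of Model (12.2), the resistor example introduced at the beginning of this section can be solved through the normal equations of the weighted least squares problem. The standard deviations in sigma are assumptions (the text does not give the accuracy of the instruments), and the names Hm, i_hat, and v_hat are introduced here only for illustration.

```python
import numpy as np

R = 5.0                                   # known resistance (ohm)
z = np.array([9.6, 1.83])                 # voltmeter (V) and ammeter (A) readings
Hm = np.array([[R], [1.0]])               # linear model h(x) = Hm x, with state x = i
sigma = np.array([0.1, 0.1])              # assumed standard deviations (not in the text)
W = np.diag(1/sigma**2)

# Normal equations of the weighted least squares problem: (Hm' W Hm) x = Hm' W z
i_hat = np.linalg.solve(Hm.T @ W @ Hm, Hm.T @ W @ z)[0]
v_hat = R*i_hat
print(f"estimated current: {i_hat:.4f} A, estimated voltage: {v_hat:.4f} V")
```

With equal weights, the estimate is roughly 1.92 A and 9.58 V, a compromise between both instruments that satisfies Ohm's law exactly. Changing the entries of sigma moves the estimate toward the instrument that is trusted more.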
In general, Model (12.2) is non-convex since ℎ is made up of non-linear
relations between states and measurements. In those cases, the problem is solved
using Newton-based methods without a guarantee of finding the global optimum.
The general case includes inequality constraints, and hence it is solved
using interior-point methods, again without a theoretical guarantee of
convergence or optimality. However, we are interested in linear measurement
models that make the problem convex and solvable in CvxPy. On the positive
side, this approach is close to reality to the extent that modern systems rely
on PMUs, which generate linear relationships between states and measurements.
Moreover, we seek toy models that allow us to understand the problem and
solve it using the paradigm of disciplined convex optimization. On the negative
side, a state estimation algorithm is mainly designed for real-time operation,
and hence Python may not be the best option in practice, since algorithms
implemented in Python tend to be slower than their counterparts in compiled
languages such as C or C++.
The simplest instance of the problem is the dc state estimation. In this
case, the grid is represented by the linear equations of the dc power flow;
hence, the state of the system is given by the angles of the nodal voltages 𝜃.
Our objective is to find a vector 𝑥th that estimates 𝜃 using available
measurements of power at each substation. Three types of measurements are
available, as shown in Figure 12.1, namely: nodal powers 𝑧p, power flows
departing from the node 𝑧pf1, and power flows arriving at the node 𝑧pf2. Nodal
powers and power flows in the transmission lines are linearly related to the
nodal angles as follows:
𝑧p = 𝐵𝑥th + 𝑒p (12.3)

Figure 12.1 Power measurements at a given node 𝑖 for dc state estimation.

𝑧pf1 = 𝐻𝑥th + 𝑒pf1 (12.4)

𝑧pf2 = −𝐻𝑥th + 𝑒pf2 (12.5)
where 𝐵 is the Jacobian for the linear (dc) formulation of the power flow,
given by (12.6) as a function of the node-branch incidence matrix 𝐴 and the
diagonal matrix of branch admittances 𝑌:
𝐵 = 𝐴𝑌𝐴⊤ (12.6)
and 𝐻 is a matrix representation of the power flows, namely:
𝐻 = 𝑌𝐴⊤ (12.7)
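The error associated with each measurement has a variance $\sigma_{\mathrm{p}}^2$,
$\sigma_{\mathrm{pf1}}^2$, and $\sigma_{\mathrm{pf2}}^2$, which allows weight
factors to be defined as follows:
\[
W_{\mathrm{p}} = \operatorname{diag}\big(1/\sigma_{\mathrm{p}}^2\big)
\tag{12.8}
\]
\[
W_{\mathrm{pf1}} = \operatorname{diag}\big(1/\sigma_{\mathrm{pf1}}^2\big)
\tag{12.9}
\]
\[
W_{\mathrm{pf2}} = \operatorname{diag}\big(1/\sigma_{\mathrm{pf2}}^2\big)
\tag{12.10}
\]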
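These weight factors allow us to formulate the following optimization model for
estimating the angles of the system:
\[
\begin{aligned}
\min\; & \tfrac{1}{2}\,(z_{\mathrm{p}} - B x_{\mathrm{th}})^{\top} W_{\mathrm{p}}\, (z_{\mathrm{p}} - B x_{\mathrm{th}}) \\
 & + \tfrac{1}{2}\,(z_{\mathrm{pf1}} - H x_{\mathrm{th}})^{\top} W_{\mathrm{pf1}}\, (z_{\mathrm{pf1}} - H x_{\mathrm{th}}) \\
 & + \tfrac{1}{2}\,(z_{\mathrm{pf2}} + H x_{\mathrm{th}})^{\top} W_{\mathrm{pf2}}\, (z_{\mathrm{pf2}} + H x_{\mathrm{th}})
\end{aligned}
\tag{12.11}
\]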
The model may be complemented with the additional constraint 𝑥th (0) = 0 to
ensure the slack node has an angle equal to zero. This model is a convex
quadratic problem.
Example 12.1. Figure 12.2 shows a network with three buses, two generators,
and a load. All voltages have the same magnitude (1 pu), but their angles
(𝜃𝑖) are unknown. Power metering systems are placed to measure both nodal
powers and power flows. The state variables are 𝜃𝑖, and the measurements can
be related to the states via linear equations, resulting in a convex problem.

Figure 12.2 Three-bus system. ▾ represents points with power metering systems.

The state of the system can be completely represented by the angles of the
system 𝑥th = (𝜃0, 𝜃1, 𝜃2). In addition, there are three sets of measurements,
namely: 𝑧p = (𝑝0, 𝑝1, 𝑝2)⊤ for nodal powers, 𝑧pf1 = (𝑝01, 𝑝02, 𝑝12)⊤ for the
power flows measured at the beginning of the lines, and 𝑧pf2 = (𝑝10, 𝑝20, 𝑝21)⊤
for the power flows measured at the end of the lines. Each of these vectors
has a linear relation with the nodal angles as given in (12.3) to (12.5), with
the following numerical values for matrices 𝐵 and 𝐻:

\[
B = \begin{pmatrix}
1/0.012 + 1/0.010 & -1/0.012 & -1/0.010 \\
-1/0.012 & 1/0.012 + 1/0.011 & -1/0.011 \\
-1/0.010 & -1/0.011 & 1/0.011 + 1/0.010
\end{pmatrix}
\tag{12.12}
\]

\[
H = \begin{pmatrix}
1/0.012 & -1/0.012 & 0 \\
1/0.010 & 0 & -1/0.010 \\
0 & 1/0.011 & -1/0.011
\end{pmatrix}
\tag{12.13}
\]
Let us suppose the nodal powers are 𝑝 = (0.8, 0.7, −1.5)⊤; then, the nodal
angles 𝜃 = th and the power flows pf can be calculated as follows:
import numpy as np
import networkx as nx
G = nx.DiGraph()
G.add_edges_from([(0,1),(0,2),(1,2)])
A = nx.incidence_matrix(G,oriented=True).toarray() # dense incidence matrix
Y = 1/np.array([0.012,0.010,0.011])
B = A@np.diag(Y)@A.T
p = np.array([0.8,0.7,-1.5])
th = np.zeros(3) # the angle in the slack is 0
# B is singular, so the row and column of the slack node are removed
th[1:] = np.linalg.solve(B[1:,1:],p[1:])
H = -np.diag(Y)@A.T
pf1 = H@th # power flow at the beginning of the line
pf2 = -pf1 # power flow at the end of the line

This is the real state of the grid; however, we can only obtain measurements
with normally distributed noise with zero mean and standard deviations 𝜎𝑝 and
𝜎𝑓 for the nodal powers and the power flows, respectively. This effect can be
considered as follows:


sigma_p = 0.01
sigma_f = 0.01
zp = p + np.random.randn(3)*sigma_p
zf1 = pf1 + np.random.randn(3)*sigma_f
zf2 = pf2 + np.random.randn(3)*sigma_f

With these measurements we have all the elements to formulate Model (12.11)
as given below:
import cvxpy as cvx
Wp = (1/sigma_p**2)*np.identity(3)
Wf = (1/sigma_f**2)*np.identity(3)
x_th = cvx.Variable(3)
obj = cvx.Minimize(1/2*cvx.quad_form(zp-B@x_th,Wp)+
1/2*cvx.quad_form(zf1-H@x_th,Wf)+
1/2*cvx.quad_form(zf2+H@x_th,Wf))
res = [x_th[0] == 0]
WLS = cvx.Problem(obj,res)
WLS.solve(verbose=True)

We can know how accurate our model is by calculating the distance between 𝜃
(the real state) and 𝑥th (the estimation).
print(np.linalg.norm(x_th.value-th)*100)

The reader is invited to experiment with the model by executing the code
several times with different values of 𝜎p and 𝜎pf .
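Since the measurement model of this example is linear, the estimate can also be cross-checked with the normal equations of the weighted least squares problem. The sketch below is self-contained and does not reuse the variables of the example: the incidence matrix is written by hand with the same sign convention as NetworkX, the random generator is seeded only for repeatability, and the slack angle is removed from the unknowns instead of being imposed as a constraint. The names J, w_diag, and th_hat are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)                 # seeded only for repeatability
x_line = np.array([0.012, 0.010, 0.011])       # reactances of lines 0-1, 0-2, 1-2
A = np.array([[-1.0, -1.0,  0.0],              # node-branch incidence matrix
              [ 1.0,  0.0, -1.0],              # (-1 sending node, +1 receiving node)
              [ 0.0,  1.0,  1.0]])
Yd = np.diag(1/x_line)
B = A @ Yd @ A.T
H = -Yd @ A.T

# True state with the slack angle fixed at zero
p = np.array([0.8, 0.7, -1.5])
th = np.zeros(3)
th[1:] = np.linalg.solve(B[1:, 1:], p[1:])

# Noisy measurements following (12.3) to (12.5)
sigma_p, sigma_f = 0.01, 0.01
zp = B @ th + sigma_p*rng.standard_normal(3)
zf1 = H @ th + sigma_f*rng.standard_normal(3)
zf2 = -H @ th + sigma_f*rng.standard_normal(3)

# Stack the measurement model and drop the column of the slack angle
J = np.vstack([B, H, -H])[:, 1:]
z = np.concatenate([zp, zf1, zf2])
w_diag = np.concatenate([np.full(3, 1/sigma_p**2), np.full(6, 1/sigma_f**2)])
W = np.diag(w_diag)

th_hat = np.zeros(3)
th_hat[1:] = np.linalg.solve(J.T @ W @ J, J.T @ W @ z)
print("estimation error:", np.linalg.norm(th_hat - th))
```

For this unconstrained linear case, the normal equations and the CvxPy model describe the same minimizer, so both approaches should agree up to solver tolerance.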
Example 12.2. Consider the power distribution grid depicted in Figure 10.1,
Chapter 10. The 𝑌bus is calculated as in Example 10.2, and the real state of the
system is obtained by the load flow algorithm given in Example 10.3. Then, nodal
values can be calculated as given below:
d = np.array([G.nodes[k]['d'] for k in G.nodes])
s = np.array([G.nodes[k]['smax'] for k in G.nodes])
Vn = LoadFlow(s[1:n],d[1:n])
In = Ybus@Vn
Sn = Vn*In.conj()

Let us suppose this is the real state of the system; however, we only have
inaccurate measurements of voltages and currents. This can be represented in
the model as measurements 𝑧𝑣, 𝑧𝑖 with random noise, namely:
zv = Vn + np.random.randn(n)*0.001*np.exp(0.1j*np.random.randn(n))
zi = In + np.random.randn(n)*0.0001*np.exp(0.1j*np.random.randn(n))


Both voltages and currents are inaccurate so we cannot rely on one to calcu-
late the other. However, we can generate the following measurement model in
complex variable:

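\[
\begin{pmatrix} z_v \\ z_i \end{pmatrix}
= \begin{pmatrix} 1_n \\ Y_{\mathrm{bus}} \end{pmatrix} x_v
+ \begin{pmatrix} e_v \\ e_i \end{pmatrix}
\tag{12.14}
\]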

where the state variables are represented by the vector 𝑥𝑣 that estimates the
nodal voltages and the measurements are 𝑧 = (𝑧𝑣, 𝑧𝑖)⊤ = (𝑉𝑛 + 𝑒𝑣, 𝐼𝑛 + 𝑒𝑖)⊤.
The matrix 1𝑛 is the identity of size 𝑛 and the errors are given by (𝑒𝑣, 𝑒𝑖)⊤. We
assume both voltages and currents have the same accuracy, and hence the weight
matrix can be given as the identity. Therefore, Model (12.2) can be directly
implemented in Python as follows:
import cvxpy as cvx
xv = cvx.Variable(n, complex=True)
id = np.identity(n)
obj = cvx.Minimize(1/2*cvx.quad_form(zv-xv,id)+
                   1/2*cvx.quad_form(zi-Ybus@xv,id)) # @ is the matrix product
WLS = cvx.Problem(obj)
WLS.solve(verbose=True)

At first glance, it would seem illogical to calculate 𝑥𝑣 given that we have a set of
voltage measurements 𝑧𝑣 , however, note that 𝑥𝑣 has a smaller deviation from
the true state of the system 𝑉𝑛 (which we do not know in practice), thanks
to the information provided by the other measurements. Let us calculate this
deviation in percentage:
print(np.linalg.norm(zv-Vn)*100)
print(np.linalg.norm(xv.value-Vn)*100)

After executing this code, the deviation of 𝑧𝑣 with respect to 𝑉𝑛 is around 0.4%,
whereas the deviation of the estimation is around 0.03% (results can change
from one execution to another due to the random error introduced in the code).
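The same stacked model (12.14) can be solved with an ordinary complex least-squares routine, which helps to see why the estimate fits all the measurements better than the raw voltage readings alone. The sketch below is only illustrative: it uses a hypothetical three-node feeder with made-up line impedances and voltages (not the grid of Figure 10.1), identity weights, and a helper complex_noise introduced here for convenience.

```python
import numpy as np

rng = np.random.default_rng(1)
# Hypothetical three-node feeder (illustrative data, not the grid of Figure 10.1)
y01, y12 = 1/(0.01+0.02j), 1/(0.02+0.03j)
Ybus = np.array([[ y01,      -y01,    0],
                 [-y01,  y01+y12,  -y12],
                 [   0,      -y12,  y12]])
Vn = np.array([1.0, 0.98*np.exp(-0.01j), 0.97*np.exp(-0.02j)])  # assumed true state
In = Ybus @ Vn

def complex_noise(scale, size):
    """Complex Gaussian noise with the given scale in each component."""
    return scale*(rng.standard_normal(size) + 1j*rng.standard_normal(size))

zv = Vn + complex_noise(1e-3, 3)
zi = In + complex_noise(1e-3, 3)

# Stacked linear model (12.14): [zv; zi] = [1_n; Ybus] xv + e
M = np.vstack([np.eye(3), Ybus])
z = np.concatenate([zv, zi])
xv, *_ = np.linalg.lstsq(M, z, rcond=None)

def objective(x):
    """Least-squares objective with identity weights."""
    return 0.5*np.linalg.norm(z - M @ x)**2

print("objective at raw voltages:", objective(zv))
print("objective at estimate    :", objective(xv))
print("deviation of zv:", np.linalg.norm(zv - Vn))
print("deviation of xv:", np.linalg.norm(xv - Vn))
```

By construction, the estimate never has a larger objective value than the raw voltage measurements; whether its deviation from the true state is also smaller depends on the noise realization, as noted in the example.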

12.3 Topology identiﬁcation

Power distribution networks are usually operated radially. However, along with
primary feeders, there are tie and sectionalizing switches that allow changing
the topology, transferring load from one feeder to another. Modern
sectionalizing switches may be controlled centrally, but the effect of each
switching action must be verified. This observability problem for
smart-distribution networks is known as topology identification.
On the other hand, the growth of measurement technologies for power systems
applications, such as smart meters and PMUs, results in improved
controllability and observability. These aspects are essential for the
development of smart-grids at power and distribution levels. However, the
actual implementation of these technologies in power distribution networks is
limited by their costs. Early implementations of smart-distribution
applications will therefore come with low-cost technologies that have limited
measurement capability, and we require efficient algorithms for topology
identification that guarantee real-time operation using such technologies [105].
Let us consider a power distribution network as the one depicted in
Figure 12.1. We suppose the system is equipped with sectionalizing switches
that modify the topology according to an optimization model that seeks loss
reduction (see Section 11.2, Chapter 11, for the distribution feeder
reconfiguration problem). The grid is also equipped with non-contact line
current sensors at specific points. A set of pseudo-measurements of the nodal
power is also considered in the problem.
The grid is represented as an oriented graph 𝒢 = {𝒩, ℰ} with 𝒩 =
{0, 1, … , 𝑘, … 𝑛} the set of nodes and ℰ ⊆ 𝒩 × 𝒩 the set of edges. The
following variables are considered in the model: the nodal voltage 𝑉𝒩 = [𝑣𝑘]
estimated at node 𝑘; the current 𝐼𝒩 = [𝑖𝑘] estimated at node 𝑘; the edge current
𝐼ℰ = [𝑖𝑘𝑚] estimated at branch 𝑘𝑚; and a binary variable 𝜇𝑘𝑚 that represents
the switching status of edge 𝑘𝑚. Besides, the following inputs and parameters
are considered: the pseudo-measurement of the nodal power 𝑠𝑘 at each
node 𝑘; a subset ℳ ⊂ ℰ that represents the edges with current sensors; the
measurement of these current sensors 𝜉𝑘𝑚; the admittance of each branch of
the grid 𝑌ℰ = [𝑔𝑘𝑚]; and the incidence matrix 𝐴 of the graph, including all the
branches.
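The optimization model consists of minimizing the error between measured and
estimated variables, subject to power flow constraints as presented below:
\[
\begin{aligned}
\min\; & \sum_{km\in\mathcal{M}} \left| i_{km} - \xi_{km} \right| \\
& \mu_{km} \in \{0, 1\} \\
& V_{\mathcal{E}} = A^{\top} V_{\mathcal{N}} \\
& I_{\mathcal{N}} = A I_{\mathcal{E}} \\
& I_{\mathcal{E}} = \mu Y_{\mathcal{E}} V_{\mathcal{E}} \\
& i_k = (s_k / v_k)^{*}
\end{aligned}
\tag{12.15}
\]
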

This model is non-linear and mixed-integer. Hence, a mixed-integer linear
approximation is developed as follows: first, the non-linear equation of the
nodal power is approximated by a linear model [89]:

𝑖𝑘 = 𝑠𝑘∗ (2 − 𝑣𝑘∗ ) (12.16)
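The quality of the linearization (12.16) can be checked numerically. Dividing the linear expression by the exact current (𝑠𝑘∕𝑣𝑘)∗ gives (2 − 𝑣𝑘∗)𝑣𝑘∗ = 1 − (1 − 𝑣𝑘∗)², so the relative error equals |1 − 𝑣𝑘|². The sketch below confirms this for a few voltages; the function name linearization_error and the values of s and v are illustrative assumptions.

```python
import numpy as np

def linearization_error(s, v):
    """Relative error of Equation (12.16) with respect to i = (s/v)^*."""
    i_exact = np.conj(s/v)
    i_lin = np.conj(s)*(2 - np.conj(v))
    return abs(i_lin - i_exact)/abs(i_exact)

s = 0.10 + 0.06j                          # nodal power in pu (illustrative value)
for mag in (1.00, 0.98, 0.95, 0.90):
    v = mag*np.exp(-0.03j)                # assumed voltage phasor in pu
    err = linearization_error(s, v)
    print(f"|v| = {mag:.2f}  relative error = {100*err:.3f} %  "
          f"|1-v|^2 = {100*abs(1 - v)**2:.3f} %")
```

The error grows quadratically with the distance between the voltage and 1 pu, which suggests the approximation is adequate as long as voltages stay close to their nominal value.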

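Next, the product of binary and continuous variables is approximated as
presented below:²
\[
\begin{aligned}
-\mu_{km}\delta_{km} &\le i_{km}^{\mathrm{real}} \le \mu_{km}\delta_{km} \\
-\mu_{km}\delta_{km} &\le i_{km}^{\mathrm{imag}} \le \mu_{km}\delta_{km} \\
\lambda_{km}^{\mathrm{real}} - (1-\mu_{km})\delta_{km} &\le i_{km}^{\mathrm{real}} \le \lambda_{km}^{\mathrm{real}} + (1-\mu_{km})\delta_{km} \\
\lambda_{km}^{\mathrm{imag}} - (1-\mu_{km})\delta_{km} &\le i_{km}^{\mathrm{imag}} \le \lambda_{km}^{\mathrm{imag}} + (1-\mu_{km})\delta_{km} \\
\lambda_{km} &= y_{km} v_{km}
\end{aligned}
\tag{12.17}
\]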

where 𝜆𝑘𝑚 is an auxiliary complex variable related to the current for each
branch 𝑘𝑚 and 𝛿𝑘𝑚 is the current capacity of each branch.
2 See Section 4.9, Chapter 4.

Example 12.3. The mixed-integer model for topology identification in power
distribution is implemented in Python as follows:
Vnode_real = cvx.Variable(num_nodes)
Vnode_imag = cvx.Variable(num_nodes)
Inode_real = cvx.Variable(num_nodes)
Inode_imag = cvx.Variable(num_nodes)
Vedge_real = cvx.Variable(num_edges)
Vedge_imag = cvx.Variable(num_edges)
Iedge_real = cvx.Variable(num_edges)
Iedge_imag = cvx.Variable(num_edges)
Jedge_real = cvx.Variable(num_edges)
Jedge_imag = cvx.Variable(num_edges)
mu = cvx.Variable(num_edges, integer=True)
re = [mu >= 0, mu <= 1,
      Vnode_real[0] == 1, Vnode_imag[0] == 0,
      Vedge_real == A.T@Vnode_real, Vedge_imag == A.T@Vnode_imag,
      Inode_real == A@Iedge_real, Inode_imag == A@Iedge_imag,
      Jedge_real == Yedge_real@Vedge_real-Yedge_imag@Vedge_imag,
      Jedge_imag == Yedge_real@Vedge_imag+Yedge_imag@Vedge_real]
for k in range(1,num_nodes):
    re += [Inode_real[k]==S_real[k]*(2-Vnode_real[k])+
           S_imag[k]*(Vnode_imag[k])]
    re += [Inode_imag[k]==S_real[k]*(Vnode_imag[k])-
           S_imag[k]*(2-Vnode_real[k])]
for k in range(num_edges):
    re += [-mu[k]*deltaI_real[k] <= Iedge_real[k]]
    re += [-mu[k]*deltaI_imag[k] <= Iedge_imag[k]]
    re += [Iedge_real[k] <= mu[k]*deltaI_real[k]]
    re += [Iedge_imag[k] <= mu[k]*deltaI_imag[k]]
    re += [Iedge_real[k] <= Jedge_real[k]+(1-mu[k])*deltaI_real[k]]
    re += [Iedge_imag[k] <= Jedge_imag[k]+(1-mu[k])*deltaI_imag[k]]
    re += [Iedge_real[k] >= Jedge_real[k]-(1-mu[k])*deltaI_real[k]]
    re += [Iedge_imag[k] >= Jedge_imag[k]-(1-mu[k])*deltaI_imag[k]]
fo = cvx.sum([email protected](Iedge_real-IE.real))
fo += cvx.sum([email protected](Iedge_imag-IE.imag))
Identification = cvx.Problem(cvx.Minimize(fo),re)
Identification.solve(verbose=True)

where the inputs are the measurements IE, the incidence matrix A, and
the pseudo-measurements of power S_real and S_imag. The model
was separated into real and imaginary parts in order to simplify its
implementation.
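The role of the constraints in (12.17) can be verified without a solver by computing, for a single current component, the interval that the four inequalities leave open for each value of the binary variable. The function current_bounds and the numerical values below are hypothetical and serve only to illustrate the mechanism.

```python
def current_bounds(mu, lam, delta):
    """Interval allowed for one current component by the constraints in (12.17).

    mu    : switch status (0 = open, 1 = closed)
    lam   : auxiliary value y*v for that component
    delta : current capacity of the branch (acts as the big-M constant)
    """
    lower = max(0.0 - mu*delta, lam - (1 - mu)*delta)
    upper = min(mu*delta, lam + (1 - mu)*delta)
    return lower, upper

delta = 2.0          # illustrative capacity in pu
lam = 0.7            # illustrative value of y*v, with |lam| <= delta
print("open switch  :", current_bounds(0, lam, delta))
print("closed switch:", current_bounds(1, lam, delta))
```

With the switch open, the interval collapses to zero current; with the switch closed, it collapses to the value of 𝜆𝑘𝑚, that is, to Ohm's law for the branch. This only works if 𝛿𝑘𝑚 is large enough to bound |𝜆𝑘𝑚|, which is why the current capacity of the branch is a natural choice.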

12.4 𝑌bus estimation

Estimating the 𝑌bus is a crucial problem in the operation of power systems, both
at power and distribution levels. We are interested in generating an estimation
𝑌 closest to the real value 𝑌bus . Therefore, we require a way to measure the
distance between 𝑌 and 𝑌bus , that is to say, we require a norm, as defined in
Chapter 2, but in the space of the complex matrices; so, given a norm ‖⋅‖ in ℝ𝑛 ,
we can generate a new norm in space ℝ𝑚×𝑛 as given in (12.18).

\[
\|M\| = \sup_{\|x\|=1} \|Mx\|
\tag{12.18}
\]

Like the norm of a vector, a matrix norm is a convex function that may be used as
the objective function in convex optimization problems implemented in CvxPy.
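Definition (12.18) can be illustrated numerically for the Euclidean vector norm, in which case the induced matrix norm is the largest singular value. The sketch below uses an arbitrary random complex matrix (an assumption made only for illustration), bounds the supremum from below by sampling unit vectors, and shows that the supremum is attained at the leading right singular vector.

```python
import numpy as np

rng = np.random.default_rng(2)
M = rng.standard_normal((3, 4)) + 1j*rng.standard_normal((3, 4))  # arbitrary matrix

# Induced 2-norm as reported by NumPy (largest singular value)
norm_2 = np.linalg.norm(M, 2)

# Lower bound on the supremum obtained by sampling unit vectors
X = rng.standard_normal((4, 2000)) + 1j*rng.standard_normal((4, 2000))
X /= np.linalg.norm(X, axis=0)
sampled = np.linalg.norm(M @ X, axis=0).max()

# The supremum is attained at the leading right singular vector
_, s, Vh = np.linalg.svd(M)
attained = np.linalg.norm(M @ Vh[0].conj())

print("induced norm:", norm_2)
print("best sampled value:", sampled)
print("value at leading singular vector:", attained)
```

No sampled vector exceeds the induced norm, and the leading singular vector reaches it, which is consistent with reading (12.18) as the maximum gain of the matrix over all unit vectors.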
Consider a power grid in which all nodes are equipped with PMUs that
allow measuring both voltages and currents. Let us assume we do not know
the exact values of the parameters of the transmission lines and the topology of
the grid; therefore, a 𝑌bus estimation algorithm is required. A central
computer receives and stores the information for different scenarios, as
depicted in Figure 12.3.

Figure 12.3 Four-node grid with PMUs in all nodes for 𝑌bus identiﬁcation.

The basic estimation algorithm is defined as the following unconstrained
optimization problem:
\[
\min_{Y}\; \| Y V - I \|^{2}
\tag{12.19}
\]

where ‖⋅‖ is any matrix norm, 𝑉 and 𝐼 are matrices of size 𝑛𝑛 × 𝑛𝑠 with 𝑛𝑛 the
number of nodes and 𝑛𝑠 the number of scenarios. Notice this optimization prob-
lem is defined in the set of the complex matrices and not in the set of vectors as
in the optimization problems presented in previous chapters.
The model may be complemented with additional constraints related to the
main features of the 𝑌bus . For instance, we know it is symmetric, meaning that

𝑌 = 𝑌⊤ (12.20)

In addition, we know that 𝐺 = real(𝑌) and 𝐵 = − imag(𝑌) are positive
semidefinite, diagonally dominant, and sparse. The topology of the graph
defines entries that are already known and equal to zero. All these features
can be added to the model in order to obtain a more accurate estimation.


Example 12.4. Let us estimate the 𝑌bus for the system shown in Figure 12.3.
Our example is divided into three parts: first, we generate the exact model of
the grid, in order to obtain the correct value of the 𝑌bus ; then, a random set of
measurement scenarios are generated with this matrix; and finally, the 𝑌bus is
estimated by the proposed optimization model.
We use the module NetworkX for generating the 𝑌bus as presented below;
parameters of the transmission lines are included in the code:

import numpy as np
import networkx as nx
Grid = nx.DiGraph()
Grid.add_edges_from(((1,2),(2,3),(3,4),(4,1)))
Yp = 1/np.array([0.002+0.02j,0.001+0.03j,0.001+0.02j,
                 0.001+0.04j])
A = nx.incidence_matrix(Grid,oriented=True).toarray() # dense incidence matrix
Ybus = A@np.diag(Yp)@A.T

Now, we generate a set of random scenarios for the voltages around 1 pu,
and these voltages are then used to calculate nodal currents for each scenario.
Finally, random noise is added in order to emulate possible inaccuracies of
the measuring devices. The code in Python is presented below:

nn = 4 # number of nodes
ns = 20 # number of scenarios
v = 0.9*np.ones((nn,ns))+0.2*np.random.random((nn,ns))
a = np.random.random((nn,ns))-0.5
Vbus = v*np.exp(a*1j)
Ibus = Ybus@Vbus + np.random.normal(0,0.1,(nn,ns))
Ibus = Ibus + 1j*np.random.normal(0,0.1,(nn,ns))

The optimization model consists of minimizing (12.19) subject to the
constraint 𝑌 = 𝑌⊤ as follows:

import cvxpy as cvx

Y = cvx.Variable((nn,nn), complex=True)
fo = cvx.Minimize(cvx.norm(Y@Vbus-Ibus))
re = [Y==Y.T]
Est = cvx.Problem(fo,re)
Est.solve()
print(np.linalg.norm(Y.value-Ybus))

The result of this model is different each time the script is executed due to the
random scenarios. However, the order of magnitude of the error is the same
according to the number of scenarios. The script was executed with a different
number of scenarios, and the results are shown in Figure 12.4; it shows that
a high number of scenarios does not necessarily increase the accuracy of the
estimation.


Figure 12.4 Estimation error vs number of scenarios.
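When the Frobenius norm is chosen in (12.19) and no constraints are imposed, the problem reduces to an ordinary least-squares problem that can be solved without CvxPy. The sketch below makes that choice explicitly (it is an assumption of this sketch, not necessarily the norm used in the script of Example 12.4), rebuilds the ring network by hand with 0-based node labels, and compares noise-free and noisy data.

```python
import numpy as np

rng = np.random.default_rng(3)
nn, ns = 4, 20
# Ring network of Example 12.4, built without NetworkX (0-based node labels)
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
Yp = 1/np.array([0.002+0.02j, 0.001+0.03j, 0.001+0.02j, 0.001+0.04j])
A = np.zeros((nn, len(edges)))
for e, (k, m) in enumerate(edges):
    A[k, e], A[m, e] = -1.0, 1.0
Ybus = A @ np.diag(Yp) @ A.T

v = 0.9 + 0.2*rng.random((nn, ns))
a = rng.random((nn, ns)) - 0.5
Vbus = v*np.exp(1j*a)
Ibus = Ybus @ Vbus                      # noise-free currents

# Y V = I  <=>  V^T Y^T = I^T, a standard least-squares problem for Y^T
Yt, *_ = np.linalg.lstsq(Vbus.T, Ibus.T, rcond=None)
Y_hat = Yt.T
print("noise-free error:", np.linalg.norm(Y_hat - Ybus))

noise = 0.1*(rng.standard_normal((nn, ns)) + 1j*rng.standard_normal((nn, ns)))
Yt_noisy, *_ = np.linalg.lstsq(Vbus.T, (Ibus + noise).T, rcond=None)
print("noisy error     :", np.linalg.norm(Yt_noisy.T - Ybus))
```

Without noise, the matrix is recovered up to rounding error as long as the voltage scenarios are linearly independent; with noise, the error depends on the noise level and on how different the scenarios are from each other.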

Example 12.5. Consider now the previous example with semidefinite
constraints as follows:
Y = cvx.Variable((nn,nn), complex=True)
fo = cvx.Minimize(cvx.norm(Y@Vbus-Ibus))
re = [Y==Y.T, cvx.real(Y) >> 0, -cvx.imag(Y)>>0]
Est2 = cvx.Problem(fo,re)
Est2.solve()
print(np.linalg.norm(Y.value-Ybus))

The reader is invited to execute the script for different numbers of scenarios 𝑛𝑠.
In general, the positive semidefinite constraints do not improve the accuracy
of the solution for a large set of scenarios (e.g., 𝑛𝑠 = 20), and instead, the
computation time increases considerably. However, for a small data set (for
instance, 𝑛𝑠 = 3), the semidefinite constraints greatly improve the results.
Example 12.6. We can show that the following conditions hold in the grid of
Example 12.4:
\[
\begin{aligned}
g_{kk} &\ge \sum_{m\neq k} -g_{km} \\
-b_{kk} &\ge \sum_{m\neq k} b_{km}
\end{aligned}
\tag{12.21}
\]

These conditions are general, easy to implement, and less time-consuming than
semidefinite constraints for problems with few measurement scenarios. A
script with these constraints is presented below:
Y = cvx.Variable((nn,nn), complex=True)
fo = cvx.Minimize(cvx.norm(Y@Vbus-Ibus))
re = [Y==Y.T]
re += [cvx.real(Y[0,0]) >= -cvx.real(Y[0,1]+Y[0,2]+Y[0,3])]
re += [cvx.real(Y[1,1]) >= -cvx.real(Y[1,0]+Y[1,2]+Y[1,3])]
re += [cvx.real(Y[2,2]) >= -cvx.real(Y[2,0]+Y[2,1]+Y[2,3])]
re += [cvx.real(Y[3,3]) >= -cvx.real(Y[3,0]+Y[3,1]+Y[3,2])]
re += [-cvx.imag(Y[0,0]) >= cvx.imag(Y[0,1]+Y[0,2]+Y[0,3])]

re += [-cvx.imag(Y[1,1]) >= cvx.imag(Y[1,0]+Y[1,2]+Y[1,3])]
re += [-cvx.imag(Y[2,2]) >= cvx.imag(Y[2,0]+Y[2,1]+Y[2,3])]
re += [-cvx.imag(Y[3,3]) >= cvx.imag(Y[3,0]+Y[3,1]+Y[3,2])]
Est2 = cvx.Problem(fo,re)
Est2.solve()
print(np.linalg.norm(Y.value-Ybus))

The student is invited to compare results with the previous examples for 𝑛𝑠 = 3.
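
Conditions (12.21) can also be checked numerically on a known admittance matrix before they are imposed as constraints. The following is a minimal sketch, assuming a hypothetical three-node grid: the line admittances and the shunt conductance are illustrative values and not the data of Example 12.4. The grid is assumed to have no capacitive shunt elements.

```python
import numpy as np

# Hypothetical series admittances (per unit) of a 3-node grid; assumed values
lines = {(0, 1): 1/(0.01 + 0.05j),
         (1, 2): 1/(0.02 + 0.08j),
         (0, 2): 1/(0.015 + 0.06j)}
g_shunt = np.array([0.0, 0.1, 0.0])  # assumed resistive shunt at node 1

nn = 3
Ybus = np.zeros((nn, nn), dtype=complex)
for (k, m), y in lines.items():
    Ybus[k, k] += y
    Ybus[m, m] += y
    Ybus[k, m] -= y
    Ybus[m, k] -= y
Ybus += np.diag(g_shunt)

G, B = Ybus.real, Ybus.imag
off_G = G.sum(axis=1) - np.diag(G)   # sum of g_km for m != k
off_B = B.sum(axis=1) - np.diag(B)   # sum of b_km for m != k
tol = 1e-9                           # tolerance for rounding errors
cond_g = np.diag(G) >= -off_G - tol  # first condition in (12.21)
cond_b = -np.diag(B) >= off_B - tol  # second condition in (12.21)
print(cond_g, cond_b)
```

Both conditions are evaluated row by row, so the same check may be applied to an estimated matrix such as Y.value to see which rows violate them.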

12.5 Load model estimation

The loads in distribution systems depend on the voltage; therefore, they are
usually represented by quadratic functions of the nodal voltage as follows:

\[ p = p_0 \left( a_p v^2 + b_p v + c_p \right) \tag{12.22} \]
\[ q = q_0 \left( a_q v^2 + b_q v + c_q \right) \tag{12.23} \]

where 𝑝0, 𝑞0, and 𝑣 represent the nominal active/reactive power and the nodal
voltage in per unit. Each coefficient in the quadratic form has a physical
meaning according to the type of load: 𝑐𝑝 and 𝑐𝑞 represent constant-power
loads, the coefficients of the linear part (𝑏𝑝 and 𝑏𝑞) represent
constant-current loads, and the coefficients associated with the quadratic
terms (𝑎𝑝 and 𝑎𝑞) represent constant-impedance loads. This model is known as
the ZIP model, since each term represents the share of constant-impedance
loads (Z), constant-current loads (I), and constant-power loads (P).
Therefore, all coefficients in the model are non-negative and the following
constraints hold:

𝑎𝑝 + 𝑏𝑝 + 𝑐𝑝 = 1 (12.24)
𝑎𝑞 + 𝑏𝑞 + 𝑐𝑞 = 1 (12.25)
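
To make the role of each coefficient concrete, the sketch below evaluates (12.22) for the three pure load types at a voltage of 0.95 pu. The function name zip_power and the numerical values are illustrative assumptions, not part of the original text.

```python
def zip_power(v, p0, a, b, c):
    """Power of a ZIP load at voltage v (per unit), following (12.22)."""
    assert abs(a + b + c - 1) < 1e-9, "coefficients must add up to one"
    return p0 * (a * v**2 + b * v + c)

v = 0.95  # voltage in per unit (assumed value)
p_z = zip_power(v, 1.0, 1.0, 0.0, 0.0)  # pure constant impedance
p_i = zip_power(v, 1.0, 0.0, 1.0, 0.0)  # pure constant current
p_p = zip_power(v, 1.0, 0.0, 0.0, 1.0)  # pure constant power
print(p_z, p_i, p_p)
```

A constant-impedance load reduces its demand with the square of the voltage, a constant-current load reduces it linearly, and a constant-power load is insensitive to the voltage.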

Every load exhibits a mixture of such voltage-dependent behavior. However,
constant impedance is most commonly found in residential loads, constant
current in commercial loads, and constant power in industrial loads. The use of
Advanced Metering Infrastructure (AMI) for real-time monitoring and control
of a distribution system allows estimating the ZIP model of the load accurately
by a simple optimization problem.
Let us consider a database with measurements of nodal voltage and
active/reactive power. Our objective is to fit these measurements to a ZIP
model, taking into account the previously mentioned physical constraints.
Therefore, the following least-squares estimation model is proposed for the
case of the active power (the model for the reactive power has the same
structure):


\[
\begin{aligned}
\min \quad & \frac{1}{2} \sum_{k=0}^{n-1} \left( a_p v_k^2 + b_p v_k + c_p - r_p p_k \right)^2 \\
& a_p + b_p + c_p = 1 \\
& a_p, b_p, c_p, r_p \geq 0
\end{aligned}
\tag{12.26}
\]

where 𝑝𝑘 and 𝑣𝑘 are measurements of power and nodal voltage with dim(𝑝) =
dim(𝑣) = 𝑛 and 𝑟𝑝 = 1∕𝑝0. The objective is to minimize the sum of the squares
of the error between the data and the model, subject to (12.24). Notice that
Model (12.26) is convex and can be easily solved using CvxPy. This model can
be executed for a database grouped by hours, in order to obtain values for each
hour of the day, but it can also be executed to evaluate the average behavior
of the load.

Example 12.7. Let us see how Model (12.26) works in practice. First, we
generate a set of 𝑛 = 800 synthetic measurements of voltage and active/reactive
power. Voltages are randomly generated through a normal distribution such
that most of the data is between 0.95 and 1.05. To do this, we define a mean of
1 and a standard deviation of 0.05∕3 (thus, about 99.7% of the data falls into
this interval, which spans three standard deviations on each side of the mean).
Active and reactive power are calculated by a predefined ZIP model
with additional noise, also generated by a normal distribution, as follows:

import numpy as np
n = 800
v = np.random.normal(1,0.05/3,n)
p = 1.2*(0.3*v**2+0.2*v+0.5) + np.random.normal(0,0.004,n)
q = 0.2*(0.6*v**2+0.2*v+0.2) + np.random.normal(0,0.001,n)

Figure 12.5 shows the results for one execution of this code.
A script for the estimation of the ZIP model for the active power is straight-
forward for this case:

import cvxpy as cvx

ap = cvx.Variable(nonneg=True)
bp = cvx.Variable(nonneg=True)
cp = cvx.Variable(nonneg=True)
rp = cvx.Variable(nonneg=True)
fo = cvx.Minimize(1/2*cvx.sum((ap*v**2+bp*v+cp-rp*p)**2))
re = [ap+bp+cp == 1]
Model = cvx.Problem(fo,re)
Model.solve()

The student is invited to execute the script and evaluate the accuracy of the
estimation, taking into account that this synthetic data set is perhaps more
distorted than actual measurements. More general models are possible, including
the effect of the frequency. These models also have a least-squares structure
and can be solved quickly, as presented here.

Figure 12.5 Synthetic data of voltage and power for load estimation.

Example 12.8. The least-squares model for the load estimation can be solved
directly by relaxing the inequality constraints. In that case, the Lagrangian
is given by (12.27):
\[
\mathcal{L} = \frac{1}{2} \sum_{k=0}^{n-1} \left( a_p v_k^2 + b_p v_k + c_p - r_p p_k \right)^2
+ \lambda_p \left( a_p + b_p + c_p - 1 \right)
\tag{12.27}
\]

The optimality conditions can be easily obtained by taking the derivative of ℒ
with respect to 𝑎𝑝, 𝑏𝑝, 𝑐𝑝, 𝑟𝑝, and 𝜆𝑝, resulting in the following linear
system:

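\[
\begin{pmatrix}
\sigma_{v4} & \sigma_{v3} & \sigma_{v2} & -\sigma_{pv2} & 1 \\
\sigma_{v3} & \sigma_{v2} & \sigma_{v1} & -\sigma_{pv1} & 1 \\
\sigma_{v2} & \sigma_{v1} & n & -\sigma_{p1} & 1 \\
\sigma_{pv2} & \sigma_{pv1} & \sigma_{p1} & -\sigma_{p2} & 0 \\
1 & 1 & 1 & 0 & 0
\end{pmatrix}
\begin{pmatrix} a_p \\ b_p \\ c_p \\ r_p \\ \lambda_p \end{pmatrix}
=
\begin{pmatrix} 0 \\ 0 \\ 0 \\ 0 \\ 1 \end{pmatrix}
\tag{12.28}
\]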


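with

\[
\begin{aligned}
\sigma_{v4} &= \sum_{k=0}^{n-1} v_k^4, & \sigma_{v3} &= \sum_{k=0}^{n-1} v_k^3, \\
\sigma_{v2} &= \sum_{k=0}^{n-1} v_k^2, & \sigma_{v1} &= \sum_{k=0}^{n-1} v_k, \\
\sigma_{pv2} &= \sum_{k=0}^{n-1} p_k v_k^2, & \sigma_{pv1} &= \sum_{k=0}^{n-1} p_k v_k, \\
\sigma_{p1} &= \sum_{k=0}^{n-1} p_k, & \sigma_{p2} &= \sum_{k=0}^{n-1} p_k^2
\end{aligned}
\tag{12.29}
\]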

By solving this system of five linear equations, we obtain the values of 𝑎𝑝,
𝑏𝑝, 𝑐𝑝, and 𝑟𝑝; however, in some pathological cases, some of these values may
be negative, violating the inequality constraints.
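
The linear system (12.28) can be assembled and solved in a few lines of NumPy. The sketch below is only an illustration under stated assumptions: the data are noise-free and cover a wider voltage range (0.8 to 1.2 pu) than in Example 12.7, so that the predefined coefficients are recovered. With noisy data concentrated around 1 pu the system is poorly conditioned and the estimates may differ noticeably.

```python
import numpy as np

# Noise-free synthetic data (assumed): the true model should be recovered
v = np.linspace(0.8, 1.2, 50)
p = 1.2 * (0.3 * v**2 + 0.2 * v + 0.5)
n = len(v)

# Sums defined in (12.29)
sv4, sv3, sv2, sv1 = (np.sum(v**e) for e in (4, 3, 2, 1))
spv2, spv1 = np.sum(p * v**2), np.sum(p * v)
sp1, sp2 = np.sum(p), np.sum(p**2)

# Linear system (12.28): unknowns are (ap, bp, cp, rp, lambda_p)
K = np.array([[sv4,  sv3,  sv2, -spv2, 1],
              [sv3,  sv2,  sv1, -spv1, 1],
              [sv2,  sv1,  n,   -sp1,  1],
              [spv2, spv1, sp1, -sp2,  0],
              [1,    1,    1,    0,    0]])
rhs = np.array([0, 0, 0, 0, 1])
ap, bp, cp, rp, lam = np.linalg.solve(K, rhs)
print(ap, bp, cp, 1 / rp)  # ZIP coefficients and nominal power p0 = 1/rp
```

Since the inequality constraints are relaxed, the signs of the solution should be checked afterwards. If any coefficient is negative, the constrained Model (12.26) must be solved instead.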

12.6 Further readings

One of the first formulations of the state estimation problem can be found in
[106]; see also [103] and the references therein for a complete review of the
classic formulation. A modern approach considering PMUs can be found in
[107]. In general, the problem is non-convex. However, there are convex approx-
imations including semidefinite programming [108], just as in the case of the
power flow equations [109]. The problem may be complemented with other
algorithms that check connectivity and observability of the grid as well as the
presence of insufficient data as presented in [110].
Although methods for grid identification have been known for a long time,
their application has recently been enhanced by the development of PMUs. Modern
approaches to the problem are based on statistical algorithms such as the least
absolute shrinkage and selection operator (LASSO) proposed in [104], where
the problem is referred to as the inverse power flow.
Topology identification is an active research area with different models
according to the type of available measurement. The model presented here was
proposed by Farajollahi et al. in [105]. This model is attractive for practical
application since it requires very few low-cost measurements. In addition, it
is notably robust to the variations of the pseudo-measurements.
Parameter identification for aggregate load modeling has been studied under
different approaches. For example, in [111] a hybrid learning algorithm was
proposed. This algorithm combines heuristics with a non-linear
Levenberg–Marquardt method and allows for the representation of static loads
such as induction motors and residential, commercial, and industrial loads, as
presented in this chapter. The advantages of heuristic algorithms compared to
simpler least-squares methods are questionable in terms of computation time.
However, the current development of machine learning and artificial
intelligence is based on these types of approaches, and the research continues.
It is vital to note that although the mathematical models presented in this
chapter were solved using CvxPy, they could be solved using the gradient
method or by direct calculation of the quadratic problem. These methods may
be more efficient for online estimation where results are required in real-time.
However, the approach presented here is enough for off-line applications even
with large data sets.

12.7 Exercises
1. Solve the dc state estimation problem presented in Example 12.1 including
PMU measurements of the angles of the system, that is, including a new
set of measurements 𝑧𝜃 = (𝜃0, 𝜃1, 𝜃2)⊤ + (𝑒𝜃0, 𝑒𝜃1, 𝑒𝜃2)⊤ with 𝜎𝜃 = 1 × 10⁻⁶.
Compare the results with the classic problem without PMUs.
2. Solve the ac state estimation problem presented in Example 12.2, includ-
ing nodal and branch power flows measurements. Formulate a second-order
approximation to make the problem convex.
3. Repeat the previous problem using a semidefinite approximation.
4. Solve the non-convex formulation of the state estimation problem, consider-
ing nodal and branch power flows measurements, using Newton’s method.
Compare results with the previous problems.
5. Solve the topology identification problem on the IEEE 33-bus test system
presented in Table 11.2, Chapter 11.
6. Prove that the matrix norm defined as (12.18) has the following property:

‖𝐴 ⋅ 𝐵‖ ≤ ‖𝐴‖ ⋅ ‖𝐵‖ (12.30)

7. The gradient method may also be used with matrix functions. Consider the
following optimization problem:

\[ \min f(X) = \frac{1}{2} \left\| AX - B \right\|_F^2 \tag{12.31} \]

where 𝐴, 𝐵, 𝑋 are square matrices and ‖⋅‖𝐹 is the Frobenius norm, defined
as (12.32).

\[ \left\| X \right\|_F = \sqrt{\operatorname{tr}(X X^\top)} \tag{12.32} \]

The derivative of the square of this norm is given by (12.33),

\[ \frac{\partial \left\| X \right\|_F^2}{\partial X} = 2X \tag{12.33} \]

Notice this is a derivative with respect to a matrix, since the gradient method
must be formulated in terms of matrices and not vectors. Solve the optimization
problem (12.31) for two randomly generated matrices 𝐴, 𝐵 ∈ ℝ3×3.
8. Prove the properties of the 𝑌bus given in (12.21).
9. Repeat examples presented in Section 12.4 but this time use the Frobenius
norm. Compare results and computation time.
10. Solve the problem in Example 12.7 by direct derivation of (12.27). Com-
pare the solution with the solution given by CvxPy at different randomly
generated instances of the problem.


Demand-side management

Learning outcomes

By the end of this chapter, the student will be able to:

● Formulate basic models for demand-side management, including
energy storage devices.
● Discuss the effect of electric vehicles.
● Solve problems related to phase balancing and load shifting.

13.1 Shifting loads

End users usually pay for consumed energy regardless of the shape of the load
curve; that is to say, they pay for kWh and not for kW. However, power is
becoming more critical in modern electricity markets, where the price of
energy varies throughout the day. In these cases, it is advisable to adjust the
load curve in order to avoid a load peak at high-price hours. Consider, for
example, the duty cycle of a washing machine of a residential user; it is
preferable to have a peak of demand in hours when the price of energy is low,
early in the morning, instead of having it when the price is high in the
evening. The same applies to many industrial loads that may be moved in time to
reduce the total energy cost.
An optimization model for shifting loads modifies the starting time but
maintains the shape and the total energy, as depicted in Figure 13.1. This
shifting action may be represented as a cyclic permutation.
Figure 13.1 Example of a shifting load. In both cases the shape and the total
energy are the same, but the peak has moved.

Let us consider a shifting of one hour. The load curve is represented as a
vector 𝑠 ∈ ℝ24 with one entry for each hour 𝑡; the shifted load is represented
by another vector 𝑝 ∈ ℝ24, and the permutation matrix 𝑀 is given by
Equation (13.1):

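\[
M = \begin{pmatrix}
0 & 1 & 0 & 0 & \cdots & 0 & 0 \\
0 & 0 & 1 & 0 & \cdots & 0 & 0 \\
0 & 0 & 0 & 1 & \cdots & 0 & 0 \\
\vdots & \vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\
0 & 0 & 0 & 0 & \cdots & 0 & 1 \\
1 & 0 & 0 & 0 & \cdots & 0 & 0
\end{pmatrix}
\tag{13.1}
\]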

This matrix shifts the load by one position. With 𝑀 as defined in Equation
(13.1), the product 𝑝 = 𝑀𝑠 returns 𝑝1 = 𝑠2, 𝑝2 = 𝑠3, and so on, until
𝑝23 = 𝑠24 and, finally, 𝑝24 = 𝑠1; that is, the same load curve advanced by one
hour. The transpose 𝑀⊤ (equivalently, the row-vector product 𝑠⊤𝑀) delays the
curve by one hour instead, returning 𝑝2 = 𝑠1, 𝑝3 = 𝑠2, and so on, until
𝑝24 = 𝑠23 and 𝑝1 = 𝑠24. Both conventions generate the same set of cyclic
permutations. If we desire to shift the load by two positions, then we must
apply the same permutation twice, namely 𝑝 = 𝑀𝑀𝑠 = 𝑀²𝑠. In general, a shift
permutation of 𝑘 positions is given by Equation (13.2):

𝑝 = 𝑀𝑘 𝑠 (13.2)

Therefore, we must choose among 24 different cyclic permutation matrices,
including the identity permutation (notice that 𝑀²⁴ is the identity matrix).
With this simple idea, the model for shifting load optimization of a set 𝒟 of
loads is the following:


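\[
\begin{aligned}
\min \quad & c^\top p_{\text{total}} \\
& p_{\text{total}} = \sum_{i \in \mathcal{D}} p_i \\
& p_i = \Big( \sum_{k} x_{ik} M^k \Big) s_i, \quad \forall i \in \mathcal{D} \\
& \sum_{k} x_{ik} = 1, \quad \forall i \in \mathcal{D} \\
& x_{ik} \in \{0, 1\}
\end{aligned}
\tag{13.3}
\]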

where 𝑐 is the vector of energy costs for a 24 h operation; 𝑥𝑖𝑘 is a binary
variable which is 1 if the load 𝑖 is shifted by 𝑘 positions; and 𝑀 is the
one-shift permutation given by Equation (13.1). The objective function seeks to
minimize total costs, subject to the energy balance. Additional constraints may
be included, such as a maximum peak load 𝑝total ≤ 𝑝max and hours of banned
operation (e.g., we might avoid operation between 0:00 and 7:00 AM). Notice
that Model (13.3) is a mixed-integer linear programming problem, since each 𝑀ᵏ
is a constant matrix.

Example 13.1. Let us make a small experiment to see the properties of the
cyclic permutation matrices. In the script below, we create the matrix 𝑀 and
a permutation 𝑀 𝑘 with 𝑘 as input. The result is the expected permutation
according to 𝑘.

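import numpy as np

# One-shift cyclic permutation matrix of Equation (13.1)
M = np.zeros((24,24))
M[23,0] = 1
for k in range(23):
    M[k,k+1] = 1

t = int(input('Enter the time shifting:'))
p = np.linspace(1,24,24)          # dummy load curve: 1, 2, ..., 24
R = np.linalg.matrix_power(M,t)   # permutation for a shift of t positions
m = p @ R                         # row-vector product: delays p by t hours
for k in range(24):
    print(p[k],'->',m[k])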

The student is invited to use different integer values of 𝑘, even values greater
than 24.
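
The direction of the shift depends on the side on which the permutation matrix is applied. The following sketch, which builds 𝑀 with np.roll as an assumed shortcut for the loop above, compares both products and checks that 24 consecutive one-hour shifts return the original curve.

```python
import numpy as np

# One-shift cyclic permutation matrix, as in (13.1)
M = np.roll(np.eye(24, dtype=int), 1, axis=1)
s = np.arange(1, 25)          # dummy load curve: s[t] = t + 1

advanced = M @ s              # entry t takes the value of s[t+1]
delayed = M.T @ s             # entry t takes the value of s[t-1]; same as s @ M

# Applying the one-hour shift 24 times returns the original curve
M24 = np.linalg.matrix_power(M, 24)
print(advanced[:3], delayed[:3])
print(np.array_equal(M24, np.eye(24, dtype=int)))
```

Either convention may be used in the optimization model, since the 24 powers of 𝑀 and the 24 powers of its transpose describe the same set of cyclic shifts.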

Example 13.2. An industry has three specific loads that correspond to
different industrial processes. These loads can be shifted in order to minimize
total costs. The capacity of the transformer is 𝑝max = 20 kW and no process can
be performed between 0:00 and 6:00 AM. Table 13.1 shows the data for the
current operation.

Table 13.1 Three industrial processes and price of the energy at each hour.

Hour Price Load 1 Load 2 Load 3

0 460 0 0 0
1 450 0 0 0
2 450 0 0 0
3 450 0 0 0
4 470 0 0 0
5 470 0 0 0
6 500 0 0 0
7 530 0 0 2
8 580 5 0 3
9 600 10 3 4
10 620 10 4 10
11 600 8 10 12
12 600 5 12 7
13 600 3 9 5
14 580 0 8 4
15 570 0 5 0
16 560 0 0 0
17 565 0 0 0
18 550 0 0 0
19 610 0 0 0
20 650 0 0 0
21 650 0 0 0
22 610 0 0 0
23 500 0 0 0

The following code shows the implementation of Model (13.3) for these three
loads. First, we load the data, which was stored in a CSV file named
Table.csv.

import numpy as np
import matplotlib.pyplot as plt
from pandas import read_csv
import cvxpy as cvx

data = read_csv('Table.csv')
c = data['Price'].to_numpy(dtype=float)
s1 = data['Load 1'].to_numpy(dtype=float)
s2 = data['Load 2'].to_numpy(dtype=float)
s3 = data['Load 3'].to_numpy(dtype=float)

Now, we define the matrix 𝑀:

M = np.zeros((24,24))
M[23,0] = 1
for k in range(23):
    M[k,k+1] = 1

Finally, we build the optimization model:

x1 = cvx.Variable(24,boolean=True)
x2 = cvx.Variable(24,boolean=True)
x3 = cvx.Variable(24,boolean=True)
pt = cvx.Variable(24)
p1 = cvx.Variable(24)
p2 = cvx.Variable(24)
p3 = cvx.Variable(24)
u = np.ones(24)
pmax = 20
Eq1 = 0
Eq2 = 0
Eq3 = 0
for k in range(24):
    Mk = np.linalg.matrix_power(M,k)  # cyclic shift of k positions
    Eq1 = Eq1 + x1[k]*(Mk@s1)
    Eq2 = Eq2 + x2[k]*(Mk@s2)
    Eq3 = Eq3 + x3[k]*(Mk@s3)
res = [pt == p1+p2+p3,
pt <= pmax,
p1 == Eq1,
p2 == Eq2,
p3 == Eq3,
cvx.sum(x1) == 1,
cvx.sum(x2) == 1,
cvx.sum(x3) == 1,
pt[0:6] == 0]
fo = cvx.Minimize(c.T@pt)
SL = cvx.Problem(fo,res)
SL.solve()

Most of the code is self-explanatory. Notice there are only 24 binary variables
for each load in this model. The rest of the variables are continuous.
Results are shown in Figure 13.2. Initial loads are shown at the top, while
shifted loads are shown at the bottom. Notice the peak of the load was 30 kW
in the first case, while the new peak is only 20 kW in the shifted loads. The
shifting load optimization reduces the total cost by 5%, a great reduction
taking into account that we only needed to move the loads in time.

Figure 13.2 Results for the shifting load problem. Original loads (top), shifted
loads (bottom).
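
Because this instance has only 24³ combinations of shifts, the result of the mixed-integer model can be cross-checked by plain enumeration. The sketch below is not the method used in the example; it is a brute-force check that assumes the data of Table 13.1 typed in by hand and uses np.roll to apply each cyclic shift. It is practical only for very small instances.

```python
import itertools
import numpy as np

# Data of Table 13.1 (hours 0 to 23)
c = np.array([460, 450, 450, 450, 470, 470, 500, 530, 580, 600, 620, 600,
              600, 600, 580, 570, 560, 565, 550, 610, 650, 650, 610, 500])
s1 = np.array([0]*8 + [5, 10, 10, 8, 5, 3] + [0]*10)
s2 = np.array([0]*9 + [3, 4, 10, 12, 9, 8, 5] + [0]*8)
s3 = np.array([0]*7 + [2, 3, 4, 10, 12, 7, 5, 4] + [0]*9)
pmax = 20

original = s1 + s2 + s3
best_cost, best_shift, best_pt = np.inf, None, None
for k1, k2, k3 in itertools.product(range(24), repeat=3):
    # np.roll applies a cyclic shift of k positions to each load
    pt = np.roll(s1, k1) + np.roll(s2, k2) + np.roll(s3, k3)
    # keep only schedules within the peak limit and idle from 0:00 to 6:00
    if pt.max() <= pmax and not pt[0:6].any():
        cost = c @ pt
        if cost < best_cost:
            best_cost, best_shift, best_pt = cost, (k1, k2, k3), pt

print('original cost:', c @ original, 'peak:', original.max())
print('best cost:', best_cost, 'peak:', best_pt.max(), 'shifts:', best_shift)
```

The enumeration grows as 24 to the power of the number of loads, which is why the mixed-integer formulation is preferred for larger problems.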

13.2 Phase balancing

Power distribution networks are usually unbalanced due to the presence of
single-phase loads. This phenomenon is undesired since unbalance increases
neutral currents and, consequently, power losses. In addition, unbalanced
currents may reduce motor lifetime since extra heat, due to zero-sequence
currents, increases the operating temperature of windings and might break down
the insulation, entailing motor failure. Mechanical vibrations due to
unbalanced voltages/currents in industrial loads may also reduce the lifetime
of motors. Therefore, we require an optimization model to determine the phase
to which loads are connected in order to reduce zero-sequence currents [112].
Figure 13.3 Phase balancing as an assignment problem.

Let us consider a system with 𝑛 single-phase loads represented as 𝑑𝑗 with
𝑗 ∈ {0, 1, … , 𝑛 − 1}, which are connected to a three-phase power distribution
network; the problem is basically an assignment problem (see [113] for more
details about the general assignment problem) that consists in assigning a
phase to each load, as shown schematically in Figure 13.3. Therefore, a binary
variable 𝑥𝑖𝑗 is defined such that 𝑥𝑖𝑗 = 1 if the load 𝑑𝑗 is connected to the
phase 𝑖 ∈ {𝐴, 𝐵, 𝐶}. The total power in each phase is given by the matrix
equation (13.4).

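\[
\begin{pmatrix} p_A \\ p_B \\ p_C \end{pmatrix}
=
\begin{pmatrix}
x_{A0} & x_{A1} & \cdots & x_{A,n-1} \\
x_{B0} & x_{B1} & \cdots & x_{B,n-1} \\
x_{C0} & x_{C1} & \cdots & x_{C,n-1}
\end{pmatrix}
\begin{pmatrix} d_0 \\ d_1 \\ \vdots \\ d_{n-1} \end{pmatrix}
\tag{13.4}
\]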

Each load must be connected to a unique phase; therefore, the following
constraint must be satisfied:

\[ \sum_{i \in \{A,B,C\}} x_{ij} = 1, \quad \forall j \in \{0, 1, \dots, n-1\} \tag{13.5} \]

An unbalance index 𝜓 is defined as the deviation of the power in each phase
with respect to the mean:

\[
\psi = \frac{1}{3} \left( \| p_A - p_{\text{mean}} \| + \| p_B - p_{\text{mean}} \| + \| p_C - p_{\text{mean}} \| \right)
\tag{13.6}
\]

where 𝑝𝐴, 𝑝𝐵, 𝑝𝐶 are the total power in each phase and 𝑝mean is the average
power per phase given by Equation (13.7):

\[ p_{\text{mean}} = \frac{p_A + p_B + p_C}{3} \tag{13.7} \]

Under ideal balanced conditions, 𝑝𝐴, 𝑝𝐵, and 𝑝𝐶 would be equal and 𝜓 would be
zero; however, this condition is not always possible and hence, our best
alternative is to find the configuration that minimizes 𝜓. This configuration
may not be unique, but the optimal value of 𝜓 is. Notice that Equation (13.6)
is a convex function of the phase powers, although it is not strongly convex
since it is a sum of absolute values. The entire model is presented below:

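\[
\begin{aligned}
\min \quad & \psi = \frac{1}{3} \left( \| p_A - p_{\text{mean}} \| + \| p_B - p_{\text{mean}} \| + \| p_C - p_{\text{mean}} \| \right) \\
& p_{\text{mean}} = \frac{p_A + p_B + p_C}{3} \\
& p = x d \\
& \sum_{i \in \{A,B,C\}} x_{ij} = 1, \quad \forall j \\
& x_{ij} \in \{0, 1\}
\end{aligned}
\tag{13.8}
\]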

where 𝑥, 𝑝𝐴, 𝑝𝐵, 𝑝𝐶, and 𝑝mean are decision variables and 𝑑 is the vector of
single-phase loads. The model can be easily transformed into a mixed-integer
programming problem that can be solved using CvxPy, as presented in the
following example:

Example 13.3. Let us balance a system with nine single-phase loads given by
a vector 𝑑 = (5, 8, 7, 3, 5, 5, 3, 4, 4)⊤ kW. The model consists of Equations
(13.4) to (13.7) with 𝑥 a Boolean matrix:

import cvxpy as cvx

import numpy as np
d = np.array([5,8,7,3,5,5,3,4,4]) # single-phase loads
n = len(d)
A, B, C = 0, 1, 2
x = cvx.Variable((3,n), boolean = True)
p = cvx.Variable(3)
pmean = cvx.Variable()
psi = cvx.Minimize(1/3*(cvx.abs(pmean-p[A])+
cvx.abs(pmean-p[B])+
cvx.abs(pmean-p[C])))
re = [pmean == sum(p)/3]
re += [p == x@d]
re += [x[A,j] + x[B,j] + x[C,j] == 1 for j in range(n)]
PhaseBalance = cvx.Problem(psi,re)
PhaseBalance.solve()
print(np.round(np.abs(x.value)))
print(p.value,pmean.value)

The optimal value is 𝜓 = 0.444 with a total power of 15 kW in two of the phases
and 14 kW in the remaining phase. This solution is, of course, not unique since
there are different ways to obtain the same unbalance index. The student
is invited to test different solvers for the model and create instances with
more loads.
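
For an instance of this size, the optimal unbalance index can be verified without a solver by enumerating the 3⁹ possible assignments. The sketch below uses only the standard library; the helper name unbalance is an illustrative choice, and the enumeration is a verification aid rather than the approach of the example.

```python
import itertools

d = [5, 8, 7, 3, 5, 5, 3, 4, 4]   # single-phase loads of Example 13.3 (kW)

def unbalance(p):
    """Unbalance index (13.6) for the three phase powers p."""
    pmean = sum(p) / 3
    return sum(abs(pk - pmean) for pk in p) / 3

best_psi, best_phases, best_p = float('inf'), None, None
# Each load is assigned to phase 0 (A), 1 (B) or 2 (C): 3**9 combinations
for phases in itertools.product(range(3), repeat=len(d)):
    p = [0, 0, 0]
    for load, phase in zip(d, phases):
        p[phase] += load
    psi = unbalance(p)
    if psi < best_psi - 1e-12:
        best_psi, best_phases, best_p = psi, phases, p

print(round(best_psi, 3), best_p, best_phases)
```

The total load is 44 kW, which is not divisible by three, so a perfectly balanced assignment does not exist and the best possible split is 15, 15, and 14 kW.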

The previous model is valid for single-phase loads. In the case of three-phase
loads, the assignment model must take into account that each phase of the load
must be unequivocally connected to each phase of the system, as was presented
in Figure 1.6, Chapter 1. The problem is now a permutation with six possible
configurations given in Table 13.2, where each permutation is described by a
3 × 3 matrix. The first three permutations maintain the sequence of the original
loads, while the last three permutations reverse the sequence. The determinant
of the matrix indicates this property: det(𝑀𝑖) = 1 if the permutation maintains
the sequence. A change in the sequence of the load may have practical effects;
for example, reversing the sequence of a motor causes an opposite rotation. In
general, the problem may be stated in terms of these possible permutations.
Other representations are possible; however, this representation reduces the
number of binary variables (see Exercise 6 at the end of this chapter).

Table 13.2 Feasible permutations for the phase-balancing problem.

Matrix  Value (rows)                  Permutation  Determinant
𝑀0      (1 0 0), (0 1 0), (0 0 1)     ABC          +1
𝑀1      (0 1 0), (0 0 1), (1 0 0)     BCA          +1
𝑀2      (0 0 1), (1 0 0), (0 1 0)     CAB          +1
𝑀3      (1 0 0), (0 0 1), (0 1 0)     ACB          −1
𝑀4      (0 1 0), (1 0 0), (0 0 1)     BAC          −1
𝑀5      (0 0 1), (0 1 0), (1 0 0)     CBA          −1

A set 𝒟 is defined to represent the three-phase loads that require to be
balanced. Each load is a vector 𝑑𝑘 ∈ ℝ3 with 𝑘 ∈ 𝒟. For the sake of simplicity,
we assume only real power, but the model can be extended to consider active
and reactive power. A new vector of loads 𝑝𝑘 is defined with the required
permutation as given below:
𝑝𝑘 = (𝑥0𝑘 𝑀0 + 𝑥1𝑘 𝑀1 + 𝑥2𝑘 𝑀2 + 𝑥3𝑘 𝑀3 + 𝑥4𝑘 𝑀4 + 𝑥5𝑘 𝑀5 ) 𝑑𝑘 (13.9)
where 𝑥𝑖𝑘 is a binary variable which is 1 if the permutation 𝑖 is activated in
the load 𝑘. This decision must be unique, so the following constraint must be
added:
𝑥0𝑘 + 𝑥1𝑘 + 𝑥2𝑘 + 𝑥3𝑘 + 𝑥4𝑘 + 𝑥5𝑘 = 1 (13.10)
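Finally, the power per phase is represented by a vector 𝑠 ∈ ℝ3 and the average
power is represented by 𝑠mean. The mathematical model is given below:

\[
\begin{aligned}
\min \quad & \psi = \frac{1}{3} \sum_{j \in \Omega} \left| s_{\text{mean}} - s_j \right| \\
& p_k = \sum_{i=0}^{5} x_{ik} M_i d_k, \quad \forall k \in \mathcal{D} \\
& \sum_{i=0}^{5} x_{ik} = 1, \quad \forall k \in \mathcal{D} \\
& s_{\text{mean}} = \frac{1}{3} \sum_{j \in \Omega} s_j \\
& s = \sum_{k \in \mathcal{D}} p_k \\
& x_{ik} \in \{0, 1\}
\end{aligned}
\tag{13.11}
\]

where Ω = {𝐴, 𝐵, 𝐶} represents the phases of the system; the phase index is
written as 𝑗 to distinguish it from the load index 𝑘. Notice this is a
mixed-integer linear programming model that can be easily coded in CvxPy. This
is only a toy model, since the phase balance problem may be integrated with the
power flow equations to minimize power loss in power distribution systems. The
interested reader is referred to [114].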
Example 13.4. Let us make an instance of Model (13.11) for a system
with four loads 𝑑0 = (10, 12, 11)⊤ , 𝑑1 = (12, 15, 11)⊤ , 𝑑2 = (11, 12, 15)⊤ , 𝑑3 =
(13, 10, 12)⊤ . First, we define a list that contains the permutation matrices given
in Table 13.2:
import numpy as np
import cvxpy as cvx
M = []
M += [np.array([[1,0,0],[0,1,0],[0,0,1]])]
M += [np.array([[0,1,0],[0,0,1],[1,0,0]])]
M += [np.array([[0,0,1],[1,0,0],[0,1,0]])]

M += [np.array([[1,0,0],[0,0,1],[0,1,0]])]
M += [np.array([[0,1,0],[1,0,0],[0,0,1]])]
M += [np.array([[0,0,1],[0,1,0],[1,0,0]])]

Now, we define the optimization model, whose implementation is straightforward:
d = np.array([[10,12,11,13],
[12,15,12,10],
[11,11,15,12]])

num_loads = len(d.T)
x = cvx.Variable((6,num_loads),boolean=True)
p = cvx.Variable((3,num_loads))
s_mean = cvx.Variable()
s = cvx.Variable(3)
res = []
for k in range(num_loads):
    # load k is connected through exactly one of the six permutations
    res += [p[:,k] == (x[0,k]*M[0] +
                       x[1,k]*M[1] +
                       x[2,k]*M[2] +
                       x[3,k]*M[3] +
                       x[4,k]*M[4] +
                       x[5,k]*M[5])@d[:,k]]
    res += [cvx.sum(x[:,k]) == 1]

res += [s_mean==sum(s)/3]
res += [s == sum(p.T)]

obj = cvx.Minimize(1/3*(cvx.abs(s_mean-s[0])+
cvx.abs(s_mean-s[1])+
cvx.abs(s_mean-s[2])))
PhaseBalancing = cvx.Problem(obj,res)
PhaseBalancing.solve()

Finally, we print the result with the following code:

print(s.value)
u = np.array([0,1,2])
pha = 'ABC'
for k in range(num_loads):
    for i in range(6):
        if np.round(x[i,k].value) == 1:
            Mk = M[i]
            w = Mk@u   # phase index taken by each position of load k
            print(pha[w[0]],pha[w[1]],pha[w[2]])

This result could have been obtained by hand since the problem is quite simple.
However, the model is general for any number of loads. The student is invited
to generate more instances of the model with a high number of loads.
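
With four loads there are only 6⁴ = 1296 configurations, so the claim that the problem can be solved by hand may be checked by enumeration. The sketch below is a standard-library verification aid and not part of the original example; it uses itertools.permutations to generate the six possible connections of each load.

```python
import itertools

# Three-phase loads of Example 13.4, one tuple (phase A, B, C) per load
loads = [(10, 12, 11), (12, 15, 11), (11, 12, 15), (13, 10, 12)]

best_psi, best_config, best_s = float('inf'), None, None
# itertools.permutations yields the six possible connections of each load
options = [list(itertools.permutations(d)) for d in loads]
for config in itertools.product(*options):
    # total power per phase for this combination of connections
    s = [sum(load[phase] for load in config) for phase in range(3)]
    s_mean = sum(s) / 3
    psi = sum(abs(s_mean - sk) for sk in s) / 3
    if psi < best_psi:
        best_psi, best_config, best_s = psi, config, s

print(best_psi, best_s)
for load, connection in zip(loads, best_config):
    print(load, '->', connection)
```

In this instance a perfectly balanced configuration exists, with 48 kW in each phase.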


Figure 13.4 Load curve in a typical day: (- -) normal load without solar generation,
(—) load curve including the effect of solar generation (duck curve).

13.3 Energy storage management

High levels of wind and solar generation lead to a drastic change in the load
curve. For example, a peak of solar radiation is expected in the middle of the
day, creating a drastic reduction of the demand seen by the power distribution
system. Likewise, the peak of the load is usually expected in the late evening
when the capacity for solar generation is reduced. This produces a steep slope
of the load as shown in Figure 13.4, a phenomenon known as the duck curve (a
term coined by the California Independent System Operator and now used in the
power system literature). This creates problems for the power system operator,
especially in the late evening when demand begins to rise, since this steep
slope affects the unit commitment and the economic dispatch.
One way to mitigate problems related to the duck curve is using energy
storage devices such as batteries, flywheels, compressed air, or
superconducting magnetic energy storage. The cycles of charge/discharge of
these devices should be optimized to reduce the adverse effects of the duck
curve. They can also be optimized to reduce power loss or to increase profits
from selling energy to the grid. All cases result in a convex optimization
problem that can be easily solved in Python.
Here, we propose a toy model for the ideal grid-connected microgrid with
solar system and energy storage depicted in Figure 13.5. This model, although
particular, may be easily modified to include more general phenomena and
components. Our main objective is to minimize cost, although other objectives
such as peak shaving or loss minimization are also viable.

Figure 13.5 Example of a grid-connected microgrid.


Let us define 𝑠𝑡 as the power generated by the solar system and 𝑔𝑡 as the power
injected by the main grid; 𝑝𝑡 is the power required by the storage system (in this
case a battery) and 𝑑𝑡 is the power required by the load. Therefore, the balance
of power in the common bus-bar is given by Equation (13.12):

𝑠𝑡 + 𝑔𝑡 = 𝑝𝑡 + 𝑑𝑡 (13.12)

where the sub-index 𝑡 indicates the time in hours for a 24 h operation. Power
is measured in kW, time in h, and energy in kWh; thus, with one-hour steps, the
energy in the next hour is given by Equation (13.13):

𝑒𝑡+1 = 𝑒𝑡 + 𝑝𝑡 (13.13)
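
Equations (13.12) and (13.13) can be applied directly to simulate a given charging schedule. The sketch below assumes a made-up four-hour data set (not Table 13.3), a battery that starts empty, and the sign convention in which a positive storage power means charging.

```python
# Assumed illustrative data for four hours (not taken from Table 13.3)
solar = [0.0, 40.0, 60.0, 0.0]    # s_t in kW
load = [50.0, 50.0, 50.0, 80.0]   # d_t in kW
p_stor = [0.0, 0.0, 10.0, -10.0]  # p_t in kW (>0 charging, <0 discharging)

energy = [0.0]                    # e_0 = 0, battery initially discharged
grid = []
for s_t, d_t, p_t in zip(solar, load, p_stor):
    grid.append(p_t + d_t - s_t)     # g_t from the balance (13.12)
    energy.append(energy[-1] + p_t)  # e_{t+1} from (13.13), 1 h steps

print(grid)    # power taken from the main grid at each hour
print(energy)  # stored energy at the start of each hour, plus the final value
```

In this schedule the battery stores the solar surplus of the third hour and returns it in the fourth hour, when the load is highest.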

By convention, we assume 𝑝𝑡 > 0 for charging mode and 𝑝𝑡 < 0 for discharging
mode. For the sake of simplicity, we assume an ideal process with 100%
efficiency. The price of the energy 𝑐𝑡, the power generated by the solar
system 𝑠𝑡, and the load 𝑑𝑡 are known a priori via an accurate forecast, so the
optimization model is a deterministic linear programming problem:
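\[
\begin{aligned}
\min \quad & \sum_{t} c_t g_t \\
& g_t + s_t = p_t + d_t \\
& e_{t+1} = e_t + p_t \\
& 0 \leq e_t \leq e_{\max} \\
& |p_t| \leq p_{\max} \\
& e_0 = 0
\end{aligned}
\tag{13.14}
\]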


As mentioned before, the objective is to minimize total costs, and the main
result is the schedule for charge and discharge of the battery. The following
example shows the use of the Model in practice.

Example 13.5. Table 13.3 shows a forecast of solar generation, load and
price for a microgrid such as the one shown in Figure 13.5. This table is
stored in a CSV file named Table.csv. We are going to use data frames in Pandas
as a tool for analysis and data manipulation.
Model (13.14) is implemented in Python with 𝑒max = 100 kWh and 𝑝max = 30 kW.
The battery starts discharged at 𝑡 = 0. The corresponding code is presented
below:

import numpy as np
import matplotlib.pyplot as plt
import cvxpy as cvx
from pandas import read_csv

data = read_csv('Table.csv')
p_stor = cvx.Variable(24)  # battery power: p>0 charging, p<0 discharging
p_grid = cvx.Variable(24)  # power injected by the main grid
e_stor = cvx.Variable(25)  # stored energy at the start of each hour
pmax = 30
emax = 100
f = 0
res = [p_stor>=-pmax,
       p_stor<= pmax,
       e_stor>=0,
       e_stor<=emax,
       e_stor[0]==0]
for t in range(24):
    f += data['Price'][t]*p_grid[t]
    res += [e_stor[t+1] == e_stor[t] + p_stor[t]]
    res += [p_grid[t]+data['Solar'][t]==p_stor[t]+data['Load'][t]]
BM = cvx.Problem(cvx.Minimize(f),res)
BM.solve()

Results for 24 h operation are plotted with the following code:

plt.subplot(3,1,1)
data['Solar'].plot()
data['Load'].plot()
plt.plot(p_stor.value)
plt.plot(p_grid.value)
plt.legend(['Solar','Load','Battery','Grid'])
plt.grid()
plt.ylabel('Power (kW)')
plt.subplot(3,1,2)
plt.plot(e_stor.value)
plt.grid()
plt.ylabel('Energy (kWh)')
plt.subplot(3,1,3)
data['Price'].plot()
plt.ylabel('Spot price ($/kW)')
plt.grid()
plt.show()

As usual throughout the book, the reader is invited to experiment with
this code. For example, compare results for larger values of 𝑝max and 𝑒max.
Compare the results with and without energy storage.
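
A baseline for the comparison without storage requires no optimization, since the grid simply supplies the net load. The sketch below assumes the data of Table 13.3 typed in by hand and computes the corresponding cost, the extreme values of the net load, and the steepest hourly ramp of the resulting duck curve.

```python
# Data of Table 13.3 (hours 0 to 23)
solar = [0, 0, 0, 0, 0, 0, 0, 26, 50, 71, 87, 97,
         100, 97, 87, 71, 50, 26, 0, 0, 0, 0, 0, 0]
load = [68.5, 69.5, 68, 64.5, 64.5, 70.5, 82.5, 100, 112, 112, 112.5, 115.5,
        111.5, 111, 100, 98, 99, 110, 130, 130, 130, 120, 92.5, 79.5]
price = [140, 140, 140, 140, 140, 175, 210, 210, 210, 175, 170, 170,
         173, 175, 180, 190, 210, 315, 385, 385, 350, 280, 245, 175]

# Without storage p_t = 0, so the grid supplies the net load g_t = d_t - s_t
net = [d - s for d, s in zip(load, solar)]
cost = sum(c * g for c, g in zip(price, net))
ramps = [net[t + 1] - net[t] for t in range(23)]
t_max = max(range(23), key=lambda t: ramps[t])

print('cost without storage:', cost)
print('minimum and maximum net load:', min(net), max(net))
print('steepest ramp:', ramps[t_max], 'kW between hours', t_max, 'and', t_max + 1)
```

The steepest ramp appears in the late afternoon, when solar generation vanishes while the load approaches its peak. This is the interval in which the storage is expected to discharge.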

13.4 Further readings

One of the first models for demand-side management can be found in [115],
while a modern vision of the problem can be found in [116] and in [117]. An
interesting model for local flexibility markets can be found in [33]. Besides, a
good review of shifting load models is available in [118].
There are different technologies for energy storage, each one with a particular
niche market. For example, batteries are used for energy management, while
flywheels are used for stability purposes. A complete review about current
technologies for energy storage can be found in [59].
The classic approach for phase balancing in power distribution grids is
based on mixed-integer programming as in [114]. However, the problem is
more complex when it considers the effects of the power distribution network
and the non-linear model of loads. In that case, a mixed-integer non-linear
programming model is required. This type of model is usually solved by heuris-
tic algorithms such as simulated annealing [119]. A significant challenge of
heuristic algorithms in this type of applications is the representation of the solu-
tions; recent studies have demonstrated the advantages of using group-based
codification that reduces the space of solutions. Interested readers are invited
to review [120], and the references therein, for a complete analysis of this type
of codification.

13.5 Exercises
1. Modify the Model presented in Example 13.5 for peak shaving. Compare
results with the case of cost minimization.
2. Evaluate Example 13.5 with different values of 𝑝max and 𝑒max for both cost
minimization and peak shaving. What is more important in each case: to
have a large energy storage capacity or a large power capacity?


Table 13.3 Expected generation, demand and price for 24 h operation of a
microgrid.

Hour Solar Load Price

0 0 68.5 140
1 0 69.5 140
2 0 68 140
3 0 64.5 140
4 0 64.5 140
5 0 70.5 175
6 0 82.5 210
7 26 100 210
8 50 112 210
9 71 112 175
10 87 112.5 170
11 97 115.5 170
12 100 111.5 173
13 97 111 175
14 87 100 180
15 71 98 190
16 50 99 210
17 26 110 315
18 0 130 385
19 0 130 385
20 0 130 350
21 0 120 280
22 0 92.5 245
23 0 79.5 175

3. Modify the code presented in Example 13.1 to show the cyclic permutation
if the problem is divided into 15-minute intervals.
4. Modify Example 13.2 for minimization of the peak load.
5. Generate random instances with different sizes of the phase balancing prob-
lem presented in Example 13.3. Evaluate the quality of the solution and time
calculation according to the size of the problem. Use different solvers.


6. The phase balancing problem for three-phase loads may be represented
without using variables 𝑥𝑖𝑘 of Model Equation (13.11). In this case, we define
a matrix variable 𝑀𝑘 ∈ 𝔹^{3×3} for each load, such that
$$ p_k = M_k d_k \tag{13.15} $$
with additional constraints on the entries 𝑚𝑖𝑗 of each matrix 𝑀𝑘 :
$$ \sum_{i} m_{ij} = 1 \tag{13.16} $$
$$ \sum_{j} m_{ij} = 1 \tag{13.17} $$

Formulate the phase balancing problem in these terms and compare the
solutions. How many binary variables are required in this formulation
compared to Model Equation (13.11)?
7. Solve the phase balancing problem for three-phase loads avoiding permuta-
tions that reverse the sequence.
8. Solve the phase balancing problem relaxing the binary variables. Use a
randomly generated instance of the problem with more than ten loads.
9. Formulate the phase balancing problem considering active and reactive
power. Propose a suitable objective function in this case.
10. Demand-side management is a vast area of research; there are many other
problems such as thermal loads and V2G that are closely related. Search on
the internet for these problems and formulate their corresponding models.


A The nodal admittance matrix

We require some concepts from graph theory in order to obtain a systematic
representation of the network equations and to solve large optimization
problems such as the optimal power flow. A graph is a structure 𝒢 = {𝑁, 𝐵}
that groups together a set of nodes 𝑁 = {0, 1, … , 𝑛 − 1} and a set of edges
(branches) 𝐵 ⊆ 𝑁 × 𝑁 that connect these nodes. For example, Figure
A.1 depicts a graph with nodes 𝑁 = {0, 1, 2, 3, 4, 5} and branches 𝐵 =
{(0, 1), (0, 3), (1, 2), (1, 4), (1, 5), (2, 5), (3, 4), (4, 5)}. In this case, the graph is
oriented, since the set of branches 𝐵 defines not only the connectivity between
nodes but also the direction of these connections. This is useful for representing
an electric network, since it allows us to define the direction of the current
and/or the power flows in the branches.
The connectivity of the graph can be represented by a matrix 𝐴 known
as the incidence matrix. This matrix is of size 𝑛 × 𝑚, where 𝑛 is the number of
nodes and 𝑚 is the number of branches¹. Every entry 𝑎𝑖𝑗 represents the
connection of node 𝑖 with branch 𝑗; in the case of an oriented graph,
𝑎𝑖𝑗 = −1 if branch 𝑗 leaves node 𝑖, 𝑎𝑖𝑗 = 1 if branch 𝑗 arrives
at node 𝑖, and 𝑎𝑖𝑗 = 0 if branch 𝑗 is not connected to node 𝑖. For
example, the incidence matrix for the graph depicted in Figure A.1 is the
following:

Figure A.1 Example of an oriented graph.

1 Notice that some books define the node–branch incidence matrix which is the transpose of
our definition.


$$
A = \begin{pmatrix}
-1 & -1 &  0 &  0 &  0 &  0 &  0 &  0 \\
 1 &  0 & -1 & -1 & -1 &  0 &  0 &  0 \\
 0 &  0 &  1 &  0 &  0 & -1 &  0 &  0 \\
 0 &  1 &  0 &  0 &  0 &  0 & -1 &  0 \\
 0 &  0 &  0 &  1 &  0 &  0 &  1 & -1 \\
 0 &  0 &  0 &  0 &  1 &  1 &  0 &  1
\end{pmatrix} \tag{A.1}
$$

The incidence matrix is used to define relations between nodal and branch
variables in a power grid. For instance, nodal currents can be calculated from
branch currents as presented below:

𝐼𝑁 = 𝐴𝐼𝐵 (A.2)

Likewise, branch voltages can be calculated from nodal voltages as follows:

𝑉𝐵 = 𝐴 ⊤ 𝑉𝑁 (A.3)

Branch currents are in turn related to the branch voltages as given in (A.4),
which constitutes a matrix representation of Ohm’s law:

𝐼𝐵 = 𝑌𝐵 𝑉𝐵 (A.4)

where 𝑌𝐵 = diag(𝑦𝑖𝑗 ) is a diagonal matrix with diagonal entries equal to the
admittance of each branch (𝑖𝑗) ∈ 𝐵. Substituting (A.3) into (A.4), and the
result into (A.2), we obtain a direct relation between nodal currents and nodal
voltages:

𝐼𝑁 = 𝐴𝑌𝐵 𝐴⊤ 𝑉𝑁 (A.5)

From (A.5) we can obtain a direct definition of the nodal admittance matrix
𝑌bus :

𝑌bus = 𝐴𝑌𝐵 𝐴⊤ (A.6)

This is, of course, one of many ways to obtain the 𝑌bus matrix. Another approach
consists in defining its entries directly as follows:
$$ Y_{\text{bus}}(i,i) = \sum_{ij \in \Omega_i} y_{ij} \tag{A.7} $$
$$ Y_{\text{bus}}(i,j) = -y_{ij} \tag{A.8} $$
where 𝑦𝑖𝑗 is the admittance of each branch (𝑖𝑗) ∈ 𝐵 and Ω𝑖 represents the set of
branches connected to node 𝑖. The following example shows how to build
the 𝑌bus in practice.


Example A.1. Python makes it easy to calculate the incidence matrix of a
graph using the module NetworkX. Let us see how the graph of Figure A.1 is
represented:
import networkx as nx
G = nx.DiGraph()
G.add_nodes_from([0,1,2,3,4,5])
G.add_edges_from([(0,1),(0,3),(1,2),(1,4),(1,5),(2,5),(3,4),
(4,5)])

Intuitively, we have defined 𝑁 and 𝐵 in the last two lines. Thus, we can
obtain a representation of the graph using the code given below, which is
self-explanatory:

import matplotlib.pyplot as plt

nx.draw(G,with_labels = True)
plt.show()

Remember that it is the connections and not the shape of the graph that matter
in this figure. Therefore, the drawing may look different from Figure A.1, but the
graph 𝐺 is the same.
Example A.2. The incidence matrix can be easily calculated in Python using
the module NetworkX. Let us continue with the graph depicted in Figure A.1,
where the incidence matrix is calculated as given below:

A = nx.incidence_matrix(G,oriented = True)

Let us suppose the admittance of each branch is 𝑦𝑖𝑗 = −10𝑗, then, the 𝑌bus is
obtained as follows:
import numpy as np
yB = -10j*np.identity(8) # one diagonal entry per branch (8 branches)
Ybus = A @ yB @ A.T

Thus, the 𝑌bus is built in only a few lines of code.
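As a cross-check, the following minimal sketch builds the 𝑌bus of Figure A.1 in two ways using only NumPy: from the incidence matrix, as in (A.6), and entry by entry, as in (A.7) and (A.8). The incidence matrix is assembled by hand here instead of calling NetworkX, and the names branches, Ybus_inc and Ybus_dir are illustrative choices that do not appear in the original example.

```python
import numpy as np

# Oriented branches of Figure A.1 and a common branch admittance
branches = [(0,1),(0,3),(1,2),(1,4),(1,5),(2,5),(3,4),(4,5)]
n, m = 6, len(branches)
y = -10j

# Incidence matrix: -1 where the branch leaves, +1 where it arrives
A = np.zeros((n, m))
for b, (i, j) in enumerate(branches):
    A[i, b] = -1.0
    A[j, b] = 1.0

# Ybus from the incidence matrix, Equation (A.6)
YB = y*np.identity(m)
Ybus_inc = A @ YB @ A.T

# Ybus built entry by entry, Equations (A.7) and (A.8)
Ybus_dir = np.zeros((n, n), dtype=complex)
for (i, j) in branches:
    Ybus_dir[i, i] += y
    Ybus_dir[j, j] += y
    Ybus_dir[i, j] -= y
    Ybus_dir[j, i] -= y

print(np.allclose(Ybus_inc, Ybus_dir))
```

Every row of the resulting matrix adds up to zero, because this example does not include shunt elements.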


B Complex linearization

It is common, in power systems applications, to find equality constraints
defined in the complex domain. Representing the equation in the complex
domain may turn out to be more straightforward. Compare, for example, the
complex representation of the nodal power, presented below:
$$ s_k^* = \sum_m y_{km} v_k^* v_m \tag{B.1} $$
with its counterpart separated into real and imaginary parts, namely:
$$ p_k = \sum_m g_{km} v_k v_m \cos(\theta_{km}) + b_{km} v_k v_m \sin(\theta_{km}) \tag{B.2} $$
$$ q_k = \sum_m b_{km} v_k v_m \cos(\theta_{km}) - g_{km} v_k v_m \sin(\theta_{km}) \tag{B.3} $$

Complex equality constraints, such as Equation (B.1), are usually non-convex,
so an affine approximation is advisable to convexify the space. A suitable way
to make this approximation is to split the function into real and imaginary
parts; then, a truncated Taylor series may be used to obtain an affine function.
However, it may be convenient to obtain the approximation directly in
the complex domain, as presented in this appendix.
We define the imaginary unit as 𝑗 = √−1; thus, a complex variable is
represented as 𝑧 = 𝑥 + 𝑗𝑦, where 𝑥 and 𝑦 are the real and imaginary parts,
respectively. A function 𝑓 ∶ ℂ → ℂ is also defined as 𝑓(𝑧) = 𝑢 + 𝑗𝑣, where
𝑢 = real(𝑓) and 𝑣 = imag(𝑓). In the context of mathematical optimization, a
complex function can be used to represent an equality constraint, as presented
below¹:
𝑓(𝑧) = 0 (B.4)

1 Notice that an inequality constraint, such as 𝑓(𝑧) ≤ 0, may be meaningless in the
complex domain, since the complex numbers do not form an ordered set. See Chapter 2
for a discussion about ordered sets.


This constraint is equivalent to the following set of constraints in the real
domain, namely:
$$ u(x, y) = 0, \qquad v(x, y) = 0 \tag{B.5} $$

Obviously, Equation (B.4) is a simpler representation than Equation (B.5).
Moreover, Python allows us to work directly with complex variables, so it is more
convenient to formulate the problem directly in the complex domain. A
well-known tool to define a linear approximation is the derivative.
However, a derivative in the complex domain is not as intuitive as a derivative
in the real domain.
A derivative, in the complex domain, is defined by the following limit:
$$ f'(z) = \lim_{\Delta z \to 0} \frac{f(z + \Delta z) - f(z)}{\Delta z} \tag{B.6} $$

This is the same definition as in the real numbers; however, there are infinitely
many directions in which this limit may be taken in the complex plane. This
implies an important consideration related to the continuity of the function. In
a real function, we require that the limits from the left and the right are equal.
Here, we require the same limit from all directions, as depicted in Figure B.1.
This fact restricts the analysis to a special set of functions, known as
holomorphic functions. Conditions for differentiating a complex function are
given by the Cauchy–Riemann equations (necessary in general, and sufficient
when the partial derivatives are continuous), as presented below:
$$ \frac{\partial u}{\partial x} = \frac{\partial v}{\partial y}, \qquad \frac{\partial u}{\partial y} = -\frac{\partial v}{\partial x} \tag{B.7} $$

Figure B.1 Possible directions for taking the limit that defines the derivative
in the complex plane.


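Unfortunately, many expressions in power systems operation problems are
not holomorphic. Therefore, we require another mathematical tool that allows
us to linearize non-holomorphic functions.
Given a complex function 𝑓 = 𝑢 + 𝑗𝑣 with 𝑢 = 𝑢(𝑥, 𝑦), 𝑣 = 𝑣(𝑥, 𝑦), we define
Wirtinger’s derivative and the conjugate Wirtinger’s derivative as follows:
$$ \frac{\partial f}{\partial z} = \frac{1}{2}\left(\frac{\partial u}{\partial x} + \frac{\partial v}{\partial y}\right) + \frac{j}{2}\left(\frac{\partial v}{\partial x} - \frac{\partial u}{\partial y}\right) \tag{B.8} $$
$$ \frac{\partial f}{\partial z^*} = \frac{1}{2}\left(\frac{\partial u}{\partial x} - \frac{\partial v}{\partial y}\right) + \frac{j}{2}\left(\frac{\partial v}{\partial x} + \frac{\partial u}{\partial y}\right) \tag{B.9} $$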
If 𝑓 is holomorphic, then Wirtinger’s derivative is equal to the standard com-
plex derivative. However, in the general case, Wirtinger’s derivatives are not the
same as the complex derivative.
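Both Wirtinger’s derivative and the conjugate Wirtinger’s derivative behave
similarly to a partial derivative. So, we can apply the common rules for
differentiation concerning the sum and the product of functions as follows:
$$ \frac{\partial (f + g)}{\partial z} = \frac{\partial f}{\partial z} + \frac{\partial g}{\partial z} \tag{B.10} $$
$$ \frac{\partial (f + g)}{\partial z^*} = \frac{\partial f}{\partial z^*} + \frac{\partial g}{\partial z^*} \tag{B.11} $$
$$ \frac{\partial (f \cdot g)}{\partial z} = f \frac{\partial g}{\partial z} + g \frac{\partial f}{\partial z} \tag{B.12} $$
$$ \frac{\partial (f \cdot g)}{\partial z^*} = f \frac{\partial g}{\partial z^*} + g \frac{\partial f}{\partial z^*} \tag{B.13} $$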
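Thus, a linearization around a point 𝑧0 is given by the following simple relation,
which resembles a Taylor expansion:
$$ f(z) \approx f(z_0) + \frac{\partial f}{\partial z} \Delta z + \frac{\partial f}{\partial z^*} \Delta z^* \tag{B.14} $$
where ∆𝑧 = 𝑧 − 𝑧0 and both derivatives are evaluated at 𝑧0.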
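The definitions (B.8) and (B.9) can also be evaluated numerically, which gives a simple way to check a linearization of the form (B.14). The sketch below is only illustrative: the helper wirtinger, the step h, the test function 𝑓 = 𝑧𝑧∗ and the point z0 are assumptions chosen for this check, and the partial derivatives are approximated by central differences.

```python
import numpy as np

def wirtinger(f, z0, h=1e-6):
    """Numerical Wirtinger derivatives of f at z0, following (B.8)-(B.9)."""
    df_dx = (f(z0 + h) - f(z0 - h))/(2*h)          # du/dx + j dv/dx
    df_dy = (f(z0 + 1j*h) - f(z0 - 1j*h))/(2*h)    # du/dy + j dv/dy
    df_dz = 0.5*(df_dx - 1j*df_dy)                 # Equation (B.8)
    df_dzc = 0.5*(df_dx + 1j*df_dy)                # Equation (B.9)
    return df_dz, df_dzc

f = lambda z: z*np.conj(z)      # f = z z*, a non-holomorphic function
z0 = 1.0 + 2.0j
df_dz, df_dzc = wirtinger(f, z0)

def f_lin(z):
    # Affine approximation around z0, Equation (B.14)
    dz = z - z0
    return f(z0) + df_dz*dz + df_dzc*np.conj(dz)

z = z0 + 0.01 - 0.02j
print(df_dz, df_dzc)            # close to conj(z0) and z0, respectively
print(abs(f(z) - f_lin(z)))     # small: second order in |z - z0|
```

For this test function the product rules (B.12) and (B.13) give 𝜕𝑓∕𝜕𝑧 = 𝑧∗ and 𝜕𝑓∕𝜕𝑧∗ = 𝑧, which is what the numerical values approach.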
Example B.1. Let us formulate a linear approximation for the following
function, around the point 𝑧0 = 0,
$$ f = z z^* + 8z + 5z^* \tag{B.15} $$
First, we calculate Wirtinger’s derivative and the conjugate Wirtinger’s
derivative, as presented below:
$$ \frac{\partial f}{\partial z} = z^* + 8 \tag{B.16} $$
$$ \frac{\partial f}{\partial z^*} = z + 5 \tag{B.17} $$


Now, we evaluate these derivatives at the point 𝑧0 , obtaining the following
affine function:
$$ f(z) \approx 0 + 8\Delta z + 5\Delta z^* \tag{B.18} $$

Example B.2. Let us formulate a linear approximation of the function defined
in the previous example, but now, we split the function into real and imaginary
parts. First, the function is defined as presented below:
$$ u(x, y) = x^2 + y^2 + 8x + 5x, \qquad v(x, y) = 8y - 5y \tag{B.19} $$

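Now, we define the derivatives with respect to each real variable, namely:
$$ \frac{\partial u}{\partial x} = 2x + 13, \quad \frac{\partial u}{\partial y} = 2y, \quad \frac{\partial v}{\partial x} = 0, \quad \frac{\partial v}{\partial y} = 3 \tag{B.20} $$
These derivatives form a Jacobian matrix, with one row per function and one
column per variable:
$$ J = \begin{pmatrix} 2x + 13 & 2y \\ 0 & 3 \end{pmatrix} \tag{B.21} $$
The linear approximation is obtained by a Taylor expansion around 𝑥 = 0 and
𝑦 = 0, that is:
$$ \begin{pmatrix} u \\ v \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix} + \begin{pmatrix} 13 & 0 \\ 0 & 3 \end{pmatrix} \begin{pmatrix} \Delta x \\ \Delta y \end{pmatrix} \tag{B.22} $$
Then, the linear approximation is the following:
$$ u = 13\Delta x, \qquad v = 3\Delta y \tag{B.23} $$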


Example B.3. Equation (B.18) is equivalent to Equation (B.23); this
equivalence is easily demonstrated by the following calculation:
$$ 8\Delta z + 5\Delta z^* = 8(\Delta x + j\Delta y) + 5(\Delta x - j\Delta y) \tag{B.24} $$
$$ = (8\Delta x + 5\Delta x) + j(8\Delta y - 5\Delta y) \tag{B.25} $$
$$ = 13\Delta x + j3\Delta y \tag{B.26} $$
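The same equivalence can be checked numerically. The short sketch below evaluates the function (B.15) and both forms of its linear approximation at a small step away from 𝑧0 = 0; the step dz is an arbitrary value chosen for illustration.

```python
import numpy as np

f = lambda z: z*np.conj(z) + 8*z + 5*np.conj(z)   # Equation (B.15)

dz = 0.01 + 0.02j                           # small step away from z0 = 0
lin_complex = 8*dz + 5*np.conj(dz)          # Equation (B.18)
lin_real = 13*dz.real + 1j*(3*dz.imag)      # Equation (B.23)

print(np.isclose(lin_complex, lin_real))    # both forms coincide
print(abs(f(dz) - lin_complex))             # error equals |dz|^2
```

The approximation error is exactly the neglected term 𝑧𝑧∗, so it decreases quadratically as the step gets smaller.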

Example B.4. Power flow equations in a power distribution grid are given by
Equation (B.27):
$$ s_k^* = \sum_m y_{km} v_k^* v_m \tag{B.27} $$
This equation is non-linear and non-holomorphic. However, we can obtain a
linear approximation around 𝑣 = 1 + 0𝑗 as follows:
$$ s_k^* = \sum_m y_{km} \left( (1 + 0j)^* (1 + 0j) + (1 + 0j)\Delta v_k^* + (1 + 0j)^* \Delta v_m \right) \tag{B.28} $$
$$ = \sum_m y_{km} \left( 1 + (v_k^* - 1) + (v_m - 1) \right) \tag{B.29} $$
$$ = \sum_m y_{km} \left( v_k^* + v_m - 1 \right) \tag{B.30} $$
This linearization is convenient for optimization problems such as the optimal
power flow, where both 𝑠 and 𝑣 are decision variables. Notice that the equation
is affine in both variables.
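To get a feeling for the accuracy of (B.30), the following sketch compares it with the exact expression (B.27). The matrix Y and the voltages v are made-up values for a three-node case, used here only for illustration; they are not data from the book.

```python
import numpy as np

# Hypothetical 3-node data: Y[k, m] plays the role of y_km in (B.27)
Y = np.array([[ 2-6j, -1+3j, -1+3j],
              [-1+3j,  2-6j, -1+3j],
              [-1+3j, -1+3j,  2-6j]])
v = np.array([1.0+0.0j, 0.98-0.02j, 0.97-0.03j])   # voltages close to 1 pu

# Exact expression (B.27): s_k^* = sum_m y_km v_k^* v_m
s_conj_exact = np.conj(v)*(Y @ v)

# Linearization (B.30): s_k^* = sum_m y_km (v_k^* + v_m - 1)
s_conj_lin = Y.sum(axis=1)*(np.conj(v) - 1) + Y @ v

print(np.abs(s_conj_exact - s_conj_lin).max())
```

The difference between both expressions is the neglected second-order term, that is, the sum of 𝑦𝑘𝑚 (𝑣𝑘∗ − 1)(𝑣𝑚 − 1), so the approximation improves as the voltages get closer to 1 + 0𝑗.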
Example B.5. We can create a different linearization for the power flow
equations when 𝑠𝑘 is constant. In that case, the nodal current is given by
Equation (B.31):
$$ i_k = \left( \frac{s_k}{v_k} \right)^* \tag{B.31} $$
This equation is, again, non-linear and non-holomorphic. The complex
linearization around 𝑣 = 1 + 0𝑗 is the following:
$$ i_k = s_k^* \left( \frac{1}{(1 + 0j)^*} - \frac{1}{((1 + 0j)^*)^2} \Delta v_k^* \right) \tag{B.32} $$
$$ = s_k^* \left( 1 - (v_k^* - 1) \right) \tag{B.33} $$
$$ = s_k^* \left( 2 - v_k^* \right) \tag{B.34} $$
This linearization is convenient for problems in which 𝑠𝑘 is constant, for
example, the primary feeder reconfiguration.
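A quick numerical comparison shows how the approximation (B.34) behaves as the voltage moves away from the linearization point. The power s and the voltages v below are hypothetical values chosen only for this check.

```python
import numpy as np

s = 0.5 + 0.2j                          # hypothetical constant power (pu)
v = np.array([1.0+0.0j, 0.98-0.02j, 0.95+0.03j, 0.90-0.05j])

i_exact = np.conj(s/v)                  # Equation (B.31)
i_lin = np.conj(s)*(2 - np.conj(v))     # Equation (B.34)
err = np.abs(i_exact - i_lin)

for vk, ek in zip(v, err):
    print(vk, ek)                       # the error grows as v moves away from 1
```

The error is zero at 𝑣 = 1 + 0𝑗 and grows with the square of the distance to that point, which is the expected behavior of a first-order approximation.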


C Some Python examples

Python is a powerful programming language for all types of applications, from
power systems to game development. There are hundreds of libraries available
for free to solve a wide variety of problems. Likewise, there is a vast amount of
material available on the internet, such as examples and tutorials. Those
tutorials might be more detailed and up-to-date than the examples presented
in this appendix, which is only a brief introduction to Python programming.
Python is a high-level, interpreted language, which means that the code is
executed by an interpreter program, in contrast to compiled languages, such as
C++, that return an independent executable. Bypassing the compilation step
makes development faster, although the program itself may be a bit slower.
The Python interpreter may be downloaded from https://www.python.org/ and
works in both Linux and Windows. We do not require anything other than
this interpreter to execute the examples presented in this book. The code may
be written in a plain-text document generated in Notepad. However, there are
many IDEs (Integrated Development Environments) that simplify the
development. We have no preference for any IDE; all of them are good enough
for our purposes. In addition, there are online platforms such as Jupyter and
Colab that allow us to execute the examples without installing the interpreter.
By the time you read this book, there will probably be many other platforms.
Below, we present a series of basic examples that demonstrate the main
features of the language. These examples are pretty simple but enough to
understand the logic behind the examples presented in this book.

C.1 Basic Python

Example C.1. Our first example is the traditional hello world program that
displays the famous message. Scripting in Python is simple and clean; the
corresponding code is given below:

print("Hello world")

Notice that we do not require any additional library or configuration to obtain
a simple output message.
Example C.2. Let us perform simple mathematical operations as follows:
x = 5
x = x + 1
print('the value of x is ', x)

Other operations such as multiplication and division are straightforward. In
addition, there are commands for floor division and exponentiation, namely:
"basic mathematical commands"
x = 10/4 # division
y = 10//4 # floor division (divide and round down)
z = 10*2 # multiplication
w = 10**2 # exponentiation
print(x,y,z,w)

Notice that the last case is a square, i.e., 𝑤 = 10^2 = 100. All comments in the
code are written using a hash mark (#).
Example C.3. An array in Python may be stored in different ways; here, we
mainly use lists and tuples. A tuple is defined by parentheses and is immutable,
whereas a list is defined by square brackets and can change in size. See the
difference in the code presented below:
X = (10,15,12) # this is a tuple
Y = [10,15,12] # this is a list

Example C.4. We use tuples when the size of the array is fixed, for example:
A = (10,15,12)

Here, 𝐴 is a vector with three entries. We can access each entry as
A[0], A[1], and A[2]. In this case, A[0]=10, A[1]=15, and A[2]=12.
Besides, we can count backward from the last entry in the following way:
A[-1]=12, A[-2]=15, and A[-3]=10.
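The practical consequence of immutability can be seen in the following short sketch, which tries to reassign an entry of a tuple and of a list. The variable names are illustrative only.

```python
A = (10, 15, 12)        # tuple: entries cannot be reassigned
B = [10, 15, 12]        # list: entries can be reassigned

B[0] = 99               # allowed
try:
    A[0] = 99           # not allowed for a tuple
except TypeError as error:
    message = str(error)
    print("Tuples are immutable:", message)

print(A, B)
```

The tuple keeps its original entries, whereas the first entry of the list is replaced.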
Example C.5. We use lists if we need to modify the entries of the array or
its size, for example:
A = [10,15,12]


We access the entries of 𝐴 in the same way as a tuple. Moreover, we can increase
the size of 𝐴, as follows:

A += [30]

Here, we have added an entry at the end of the list. Therefore, the new list has
entries A[0]=10,A[1]=15,A[2]=12, and A[3]=30.

Example C.6. The operator * is not a conventional multiplication when
applied to a list. For instance, the following command returns a list of length 8
with all entries equal to 5:

B = 8*[5]

Example C.7. One distinctive characteristic of Python is the indentation, i.e.,
the spaces at the beginning of a line of code. Indentation is used to indicate a
block of code. Let us consider a simple conditional structure:

x = 10
y = 20
if x <= y:
    print("x is less than or equal to y")
    print("This line is inside the body of if")
    print("This line is still inside the body of if")
print("This is outside the body of if")

Indentation in Python allows a neat code since we do not require any command
to begin and end the block of code inside the conditional. However, we must be
cautious to avoid unnecessary spaces at the beginning of a line of code. A single
space may change the results of the algorithm drastically.
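The following sketch illustrates this warning with two almost identical conditionals that differ only in the indentation of their last line. The counters count_a and count_b are hypothetical names introduced for the illustration.

```python
x = 30
y = 20
count_a = 0
count_b = 0

if x <= y:
    count_a += 1
    count_a += 1      # indented: runs only when the condition is true

if x <= y:
    count_b += 1
count_b += 1          # not indented: runs regardless of the condition

print(count_a, count_b)
```

Since the condition is false, the first counter is never modified, while the second one is increased once by the line placed outside the body of the conditional.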

Example C.8. Like conditionals, a for-loop is quite intuitive in Python. Let
us define a simple script that prints the numbers from 0 to 4 and their squares:

for k in range(5):
    y = k**2
    print('k is ',k,' and k^2 is ',y)
print('This is the end')

The first print is indented, meaning that this command is executed inside the
body of the for structure. The second print is outside the body of the for
structure.


Example C.9. A function is defined by the command def. The following script
shows the definition of the function 𝑓(𝑥) = 1∕𝑥^5 :

def f(x):
    y = 1/(x**5)
    return y

After defining the function, we can evaluate it in any real variable, namely:

a = 5.3
b = f(a)
print(b)

We can also evaluate the function in a complex number, as follows:

a = 5.3 + 2.0j
b = f(a)
print(b)

C.2 NumPy
One of the most useful modules in Python is NumPy, which allows operation
with multidimensional arrays and matrices similarly to Matlab. The following
examples show the use of this module.

Example C.10. The script below shows a simple definition of a NumPy array:

import numpy as np
x = np.array([10,15,12])

The first line imports the library and defines an alias (np). The second line
defines the array itself; this array behaves as expected in linear algebra. For
instance, we can multiply by a scalar as presented below:

y = 5*x

This operation returns a vector 𝑦 ∈ ℝ^3 with entries y[0]=50, y[1]=75, and
y[2]=60. Note that the result would be very different if 𝑥 were a list; in that
case, the result would be a list of size 15.
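The difference between both behaviors can be verified with the short sketch below; the names x_list and x_array are illustrative.

```python
import numpy as np

x_list = [10, 15, 12]
x_array = np.array(x_list)

y_list = 5*x_list       # repeats the list five times
y_array = 5*x_array     # multiplies every entry by 5

print(len(y_list))      # 15
print(y_array)          # [50 75 60]
```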


Example C.11. A matrix may be easily defined using NumPy. Consider the
following 3 × 3 matrix:

$$ A = \begin{pmatrix} 4 & 8 & 7 \\ 3 & 0 & 1 \\ 4 & 2 & 1 \end{pmatrix} \tag{C.1} $$
This array is defined as follows:

A = np.array([[4,8,7],[3,0,1],[4,2,1]])

Now, we can make different operations related to linear algebra. These oper-
ations are available in the linalg submodule. Some common functions are
presented below:

B = np.linalg.inv(A) # inverse
d = np.linalg.det(A) # determinant
L,V = np.linalg.eig(A) # eigenvalues and eigenvectors

Example C.12. We can solve a linear system of equations 𝐴𝑥 = 𝑏, where 𝑏 is a
NumPy array of suitable size, for instance:

A = np.array([[4,8,7],[3,0,1],[4,2,1]])
b = np.array([12,15,9])
x = np.linalg.solve(A,b)
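It is good practice to verify a computed solution. The sketch below repeats the system of this example, checks the residual 𝐴𝑥 − 𝑏, and compares the result with the one obtained from the explicit inverse; the names residual and x_inv are illustrative.

```python
import numpy as np

A = np.array([[4,8,7],[3,0,1],[4,2,1]])
b = np.array([12,15,9])
x = np.linalg.solve(A, b)

# The residual of the linear system should be numerically zero
residual = A @ x - b
print(np.allclose(residual, 0))

# The explicit inverse gives the same result here, although solve
# is usually preferred for numerical reasons
x_inv = np.linalg.inv(A) @ b
print(np.allclose(x, x_inv))
```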

Example C.13. Conventional matrix multiplication is performed by the
operator @. Let us consider the evaluation of the following quadratic
form:

x = np.array([1,8,3])
H = np.array([[4,8,7],[3,0,1],[4,2,1]])
f = x.T @ H @ x

Example C.14. Conventional mathematical functions are also defined in
NumPy, as follows:
x = 0.5
a = np.sin(x)
b = np.cos(x)
c = np.tan(x)
d = np.exp(x)

C.3 Matplotlib
Example C.15. Matplotlib is a library that allows us to obtain plots in a way
as simple as Matlab. The code below shows the plot of the function 𝑓(𝑥) =
sin(𝑥)∕𝑥 for −10 ≤ 𝑥 ≤ 10:
import numpy as np
import matplotlib.pyplot as plt
xr = np.linspace(-10,10,100) # vector with 100 points from -10 to 10
yr = np.sin(xr)/xr
plt.plot(xr,yr)

The command linspace creates a vector with 100 points between −10 and 10;
next, the function 𝑓 is evaluated and plotted. After that, we can add a grid and
some labels to the axes, as follows:
plt.grid()
plt.xlabel('abscissa')
plt.ylabel('ordinate')
plt.show()

C.4 Pandas
Most of the examples presented in this book are toy-models. However, they can
be extended to solve large models. In that case, we require a simple and effi-
cient way to read, store, and manipulate data. The module Pandas allows these
operations.
Example C.16. The essential component of Pandas is the DataFrame, which
allows us to store and manipulate data. The following code shows the creation
of a DataFrame that stores the information given in Table C.1:
import pandas as pd

mytable = pd.DataFrame()
mytable["Source"] = ["Solar","Wind","Hydro","Geothermal"]
mytable["Installed"] = [12.1, 61.1, 78.4, 3.4]
mytable["Increasing"] = [10.9, 26.0, 7.3, 0.2]
mytable["Percentage"] = [1.14, 5.76, 7.69,0.32]
mytable.head()

This table may also be stored in a CSV file. In that case, we can open the file
with a single line of code, as presented below:
mytable = pd.read_csv("MYFILE.csv")

where the table is stored in a file named MYFILE.csv inside the same folder as
the script. The following line returns the source in the second row (i.e., wind):
print(mytable["Source"][1])

Table C.1 Comparison of the installed power and increase in the United States
from 2008 to 2013. Taken from [121]

Source Installed Increasing Percentage

Solar 12.1 10.9 1.14
Wind 61.1 26.0 5.76
Hydro 78.4 7.3 7.39
Geothermal 3.4 0.2 0.32

We can also plot the information given in the DataFrame using Matplotlib,
as follows:
import matplotlib.pyplot as plt
mytable.plot()
plt.grid()
plt.show()

There are many functionalities available in Pandas. This is just a hint about the
possibilities of the module. As always, the reader is invited to explore further
functions.
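As a small taste of those functionalities, the sketch below filters the rows of the table and finds the source with the largest increase. It rebuilds part of Table C.1 from a dictionary, and the threshold of 50 is an arbitrary value chosen for illustration.

```python
import pandas as pd

mytable = pd.DataFrame({
    "Source": ["Solar", "Wind", "Hydro", "Geothermal"],
    "Installed": [12.1, 61.1, 78.4, 3.4],
    "Increasing": [10.9, 26.0, 7.3, 0.2],
})

# Rows whose installed power exceeds a threshold
large = mytable[mytable["Installed"] > 50]

# Source with the largest increase
fastest = mytable.loc[mytable["Increasing"].idxmax(), "Source"]

print(large["Source"].tolist())
print(fastest)
```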


Bibliography

[1] Stanfield S, Safdi S, Mihaly S. Optimizing the grid, a regulator’s guide to
hosting capacity analyses for distributed energy resources. 1st ed. NY:
IREC; 2017.
[2] Ding F, Mather B. On Distributed PV Hosting Capacity Estimation,
Sensitivity Study, and Improvement. IEEE Transactions on Sustainable
Energy. 2017 July;8(3):1010–1020.
[3] Conejo A, Baringo L. Power system operations. Springer; 2018.
[4] Terorde M, Wattar H, Schulz D. Phase balancing for aircraft electrical
distribution systems. IEEE Transactions on Aerospace and Electronic
Systems. 2015 July;51(3):1781–1792.
[5] Weckx S, Driesen J. Load Balancing With EV Chargers and PV Inverters in
Unbalanced Distribution Grids. IEEE Transactions on Sustainable Energy.
2015 April;6(2):635–643.
[6] Chia-Hung Lin, Chao-Shun Chen, Hui-Jen Chuang, Cheng-Yu Ho.
Heuristic rule-based phase balancing of distribution systems by considering
customer load patterns. IEEE Transactions on Power Systems. 2005
May;20(2):709–716.
[7] Zhu J, Bilbro G, Chow MY. Phase balancing using simulated annealing.
IEEE Transactions on Power Systems. 1999 Nov;14(4):
1508–1513.
[8] Lin C, Chen C, Chuang H, Huang M, Huang C. An Expert System for
Three-Phase Balancing of Distribution Feeders. IEEE Transactions on
Power Systems. 2008 Aug;23(3):1488–1496.
[9] Soltani S, Rashidinejad M, Abdollahi A. Stochastic Multiobjective
Distribution Systems Phase Balancing Considering Distributed Energy
Resources. IEEE Systems Journal. 2017;PP(99):1–12.
[10] Agrawal A, Verschueren R, Diamond S, Boyd S. A Rewriting System for
Convex Optimization Problems. Journal of Control and Decision.
2018;5(1):42–60.


[11] Nesterov Y. Introductory Lectures on Convex Programming Volume I: Basic
course. Springer; 2008.
[12] Nesterov Y, Nemirovskii A. Interior point polynomial algorithms in convex
programming. vol. 1 of 10. 1st ed. Philadelphia: SIAM studies in applied
mathematics; 1994.
[13] Bertsekas D. Convex Optimization Algorithms. Massachusetts Institute of
Technology, Athena Scientific, Belmont, Massachusetts; 2015.
[14] Rockafellar T. Lagrange Multipliers and Optimality. SIAM Review.
1993;35(2):183–238.
[15] Slootweg JG, de Haan SWH, Polinder H, Kling WL. General model for
representing variable speed wind turbines in power system dynamics
simulations. IEEE Transactions on Power Systems. 2003;18(1):
144–151.
[16] Axler S, Gehring F, Ribet K. Linear algebra done right. 2nd ed. NY:
Springer; 2009.
[17] Boyd S, Vandenberghe L. Convex optimization. Cambridge university press;
2004.
[18] Takahashi W. Introduction to Nonlinear and Convex Analysis. vol. 1. 1st
ed. Yokohama: Yokohama Publishers; 2009.
[19] Luenberger D. Optimization by vector space methods. Wiley professional
paperback series; 1969.
[20] Luenberger D, Ye Y. Linear and Nonlinear Programming. Springer; 2008.
[21] Hubbard JH, Hubbard BB. Vector Calculus, Linear Algebra, And
Differential Forms A Unified Approach. Prentice Hall; 1999.
[22] Grant M, Boyd S, Ye Y. Disciplined Convex Programming. In: Liberti L,
Maculan N, editors. Global Optimization: From Theory to Implementation;
book series Nonconvex Optimization and its Applications. NY: Springer;
2006. p. 155–210.
[23] CVXPY. CVXPY; 2020. Available from: https://www.cvxpy.org/tutorial/
advanced/index.html.
[24] Nocedal J, Wright SJ. Numerical optimization. Springer; 2006.
[25] Lee J. A first course in combinatorial optimization. 1st ed. Cambridge:
Cambridge university press; 2004.
[26] Goemans MX, Williamson DP. Improved Approximation Algorithms for
Maximum Cut and Satisfiability Problems Using Semidefinite
Programming. J ACM. 1995 Nov;42(6):1115–1145. Available from:
https://doi.org/10.1145/227683.227684.
[27] Poljak S, Tuza Z. The expected relative error of the polyhedral
approximation of the maxcut problem. Operations Research Letters.
1994;16(4):191–198. Available from: http://www.sciencedirect.com/
science/article/pii/016763779490068X.


[28] Blekherman G, Parrilo P, Thomas RA. Semidefinite optimization and
convex algebraic geometry. SIAM; 2013.
[29] Anjos MF, Lasserre JB. Handbook on semidefinite, conic and polynomial
optimization. Springer; 2012.
[30] Wolkowicz H, Romesh S, Lieven V. Handbook of Semidefinite
Programming Theory, Algorithms, and Applications. NY: Springer US;
2000.
[31] Cominetti R, Facchinei F, Lasserre J. Modern Optimization Modelling
Techniques. Berlin: Springer Basel; 2012.
[32] Anjos MF, Lasserre JB. Handbook on Semidefinite, Conic and Polynomial
Optimization. Springer US; 2012.
[33] Correa-Florez CA, Michiorri A, Kariniotakis G. Optimal Participation of
Residential Aggregators in Energy and Local Flexibility Markets. IEEE
Transactions on Smart Grid. 2020;11(2):1644–1656.
[34] Bertsimas D, Brown DB, Caramanis C. Theory and Applications of Robust
Optimization. SIAM Review. 2011;53(3):464–501. Available from:
http://www.jstor.org/stable/23070141.
[35] Birge JR, Louveaux F. Introduction to stochastic programming. 2nd ed. NY:
Springer; 2011.
[36] Robinson C, Tompsett DH. Power-system engineering problems with
reference to the use of digital computers. Proceedings of the IEE - Part B:
Radio and Electronic Engineering. 1956;103(1):26–34.
[37] Basu M. Economic environmental dispatch using multi-objective
differential evolution. Applied Soft Computing. 2011;11(2):2845–2853.
The Impact of Soft Computing for the Progress of Artificial Intelligence.
Available from: http://www.sciencedirect.com/science/article/
pii/S1568494610002917.
[38] Stott B, Jardim J, Alsac O. DC Power Flow Revisited. IEEE Transactions on
Power Systems. 2009;24(3):1290–1300.
[39] Harker DC, Jacobs WE, Ferguson RW, Harder EL. Loss Evaluation; Parts I
to V. Transactions of the American Institute of Electrical Engineers Part III:
Power Apparatus and Systems. 1954;73(1):709–716.
[40] Burnett KN, Halfhill DW, Shepard BR. A New Automatic Dispatching
System for Electric Power Systems [includes discussion]. Transactions of
the American Institute of Electrical Engineers Part III: Power Apparatus
and Systems. 1956;75(3):1049–1056.
[41] Sörensen K. Metaheuristics—the metaphor exposed. International
Transactions in Operational Research. 2015;22(1):3–18. Available from:
https://onlinelibrary.wiley.com/doi/abs/10.1111/itor.12001.
[42] Barcelo WR, Rastgoufard P. Dynamic economic dispatch using the
extended security constrained economic dispatch algorithm. IEEE
Transactions on Power Systems. 1997;12(2):961–967.
[43] Li Q, Gao DW, Zhang H, Wu Z, Wang F. Consensus-Based Distributed
Economic Dispatch Control Method in Power Systems. IEEE Transactions
on Smart Grid. 2019;10(1):941–954.
[44] Coffrin C, Knueven B, Holzer J, Vuffray M. The impacts of convex piecewise
linear cost formulations on AC optimal power flow. ArXiv. 2020;0(0):42–60.
[45] Morales-España G, Latorre JM, Ramos A. Tight and Compact MILP
Formulation for the Thermal Unit Commitment Problem. IEEE
Transactions on Power Systems. 2013;28(4):4897–4908.
[46] Castillo A, Laird C, Silva-Monroy CA, Watson J, O’Neill RP. The Unit
Commitment Problem With AC Optimal Power Flow Constraints. IEEE
Transactions on Power Systems. 2016;31(6):4853–4866.
[47] Anjos M, Conejo A. Unit commitment in electric energy systems. Boston:
NOW Foundations and Trends in Electric Energy Systems; 2017.
[48] González-Castellanos A, Pozo D, Bischi A. In: Distribution System
Operation with Energy Storage and Renewable Generation Uncertainty.
Cham: Springer International Publishing; 2020. p. 183–218. Available from:
https://doi.org/10.1007/978-3-030-36115-0_6.
[49] Padhy NP. Unit commitment-a bibliographical survey. IEEE Transactions
on Power Systems. 2004;19(2):1196–1205.
[50] IHA. 2020-hydropower status report, sector trends and insights.
International hydropower association; 2020.
[51] WAPA. Small-scale hydroelectric power, a brief assessment. Western area
power administration; 1984.
[52] Wood A, Wollenberg B, Sheble G. Power Generation, Operation, and
Control. 3rd ed. NY: Wiley; 2013.
[53] Victorov G. Guidelines for the application of small hydraulic turbines.
United Nations Industrial Development Organization; 1986.
[54] Fuentes-Loyola, Quintana VH. Medium-term hydrothermal coordination
by semidefinite programming. IEEE Transactions on Power Systems. 2003
November;18:1515–1522.
[55] Castano JC, Garces A, Fosso O. Short-Term Hydrothermal Coordination
with Solar and Wind Farms Using Second-Order Cone Optimization with
Chance-box Constraints. in press. 2020.
[56] Agarwal SK, Nagrath IJ. Optimal scheduling of hydrothermal systems.
Proceedings of the Institution of Electrical Engineers. 1972;119(2):169–173.
[57] Diniz AL, Souza TM. Short-Term Hydrothermal Dispatch With River-Level
and Routing Constraints. IEEE Transactions on Power Systems. 2014;29(5):
2427–2435.

[58] Pickard WF. The History, Present State, and Future Prospects of
Underground Pumped Hydro for Massive Energy Storage. Proceedings of
the IEEE. 2012;100(2):473–483.
[59] Luo X, Wang J, Dooner M, Clarke J. Overview of current development in
electrical energy storage technologies and the application potential in power
system operation. Applied Energy. 2015;137:511–536. Available from:
http://www.sciencedirect.com/science/article/pii/S0306261914010290.
[60] Suul JA. Variable Speed Pumped Storage Hydropower Plants for
Integration of Wind Power in Isolated Power Systems. Renewable Energy,
T J Hammons, InTech; 2009.
[61] Redondo NJ, Conejo AJ. Short-term hydro-thermal coordination by
Lagrangian relaxation: solution of the dual problem. IEEE Transactions on
Power Systems. 1999;14(1):89–95.
[62] Wong KP, Wong YW. Short-term hydrothermal scheduling part. I.
Simulated annealing approach. IEE Proceedings - Generation,
Transmission and Distribution. 1994;141(5):497–501.
[63] Gil E, Bustos J, Rudnick H. Short-term hydrothermal generation scheduling
model using a genetic algorithm. IEEE Transactions on Power Systems.
2003;18(4):1256–1264.
[64] van Ackooij W, Finardi EC, Ramalho GM. An Exact Solution Method for
the Hydrothermal Unit Commitment Under Wind Power Uncertainty With
Joint Probability Constraints. IEEE Transactions on Power Systems.
2018;33(6):6487–6500.
[65] Bruninx K, Dvorkin Y, Delarue E, Pandžić H, D’haeseleer W, Kirschen DS.
Coupling Pumped Hydro Energy Storage With Unit Commitment. IEEE
Transactions on Sustainable Energy. 2016;7(2):786–796.
[66] Terry L, Pereira M, Araripe-Neto T, Silva L, Sales P. Coordinating the Energy
Generation of the Brazilian National Hydrothermal Electrical Generating
System. INFORMS Journal on Applied Analytics. 1986;p. 361–379.
[67] Finardi E, Decker B, de Matos V. An Introductory Tutorial on Stochastic
Programming Using a Long-term Hydrothermal Scheduling Problem.
Journal of Control Automation and Electric Systems. 2013;24:361–379.
[68] Garces A. Uniqueness of the power flow solutions in low voltage direct
current grids. Electric Power Systems Research. 2017;151:149–153.
Available from:
http://www.sciencedirect.com/science/article/pii/S0378779617302298.
[69] Ochoa LF, Wilson DH. Angle constraint active management of distribution
networks with wind power. In: 2010 IEEE PES Innovative Smart Grid
Technologies Conference Europe (ISGT Europe); 2010. p. 1–5.

[70] Garces A, Ramirez D, Mora J. A Wirtinger Linearization for the Power Flow
in Microgrids. Presented at the 2019 IEEE Power and Energy Society General
Meeting, Atlanta. 2019 Aug.
[71] Dommel HW, Tinney WF. Optimal Power Flow Solutions. IEEE
Transactions on Power Apparatus and Systems.
1968;PAS-87(10):1866–1876.
[72] Peschon J, Bree DW, Hajdu LP. Optimal power-flow solutions for power
system planning. Proceedings of the IEEE. 1972;60(1):64–70.
[73] Gómez Expósito A, Romero Ramos E. Reliable load flow technique for
radial distribution networks. IEEE Transactions on Power Systems.
1999;14(3):1063–1069.
[74] Torres GL, Quintana VH. An interior-point method for nonlinear optimal
power flow using voltage rectangular coordinates. IEEE Transactions on
Power Systems. 1998;13(4):1211–1218.
[75] Garces A. A Linear Three-Phase Load Flow for Power Distribution Systems.
IEEE Transactions on Power Systems. 2016 Jan;31(1):827–828.
[76] Low SH. Convex Relaxation of Optimal Power Flow—Part I: Formulations
and Equivalence. IEEE Transactions on Control of Network Systems. 2014
March;1(1):15–27.
[77] Low SH. Convex Relaxation of Optimal Power Flow—Part II: Exactness.
IEEE Transactions on Control of Network Systems. 2014;1(2):177–189.
[78] Molzahn DK, Hiskens IA. A Survey of Relaxations and Approximations of
the Power Flow Equations. now; 2019. Available from:
https://ieeexplore.ieee.org/document/8635446.
[79] Li J, Liu F, Wang Z, Low SH, Mei S. Optimal Power Flow in Stand-Alone
DC Microgrids. IEEE Transactions on Power Systems. 2018;33(5):
5496–5506.
[80] Lavaei J, Low SH. Zero Duality Gap in Optimal Power Flow Problem. IEEE
Transactions on Power Systems. 2012;27(1):92–107.
[81] Molzahn DK, Hiskens IA. Convex Relaxations of Optimal Power Flow
Problems: An Illustrative Example. IEEE Transactions on Circuits and
Systems I: Regular Papers. 2016;63(5):650–660.
[82] Madani R, Sojoudi S, Lavaei J. Convex Relaxation for Optimal Power Flow
Problem: Mesh Networks. IEEE Transactions on Power Systems.
2015;30(1):199–211.
[83] Chis M, Salama MMA, Jayaram S. Capacitor placement in distribution
systems using heuristic search strategies. IEE Proceedings - Generation,
Transmission and Distribution. 1997;144(3):225–230.
[84] Civanlar S, Grainger JJ, Yin H, Lee SSH. Distribution feeder reconfiguration
for loss reduction. IEEE Transactions on Power Delivery. 1988;3(3):
1217–1223.

[85] Lavorato M, Franco JF, Rider MJ, Romero R. Imposing Radiality
Constraints in Distribution System Optimization Problems. IEEE
Transactions on Power Systems. 2012;27(1):172–180.
[86] Gil-González W, Garces A, Montoya OD, Hernández JC. A Mixed-Integer
Convex Model for the Optimal Placement and Sizing of Distributed
Generators in Power Distribution Networks. Applied Sciences. 2021;11(2).
Available from: https://www.mdpi.com/2076-3417/11/2/627.
[87] Divan D, Kandula P. Increasing solar hosting capacity is the key to
sustainability. In: 2016 First International Conference on Sustainable
Green Buildings and Communities (SGBC); 2016. p. 1–5.
[88] Teodorescu R, Liserre M, Rodriguez P. Grid Converters for Photovoltaic and
Wind Power Systems. 1st ed. NY: IEEE Power Engineering Society, NJ,
Wiley-Interscience; 2011.
[89] Garces A. A Linear Three-Phase Load Flow for Power Distribution Systems.
Power Systems, IEEE Transactions on. 2015;PP(99):1–2.
[90] Gallego RA, Monticelli AJ, Romero R. Optimal capacitor placement in
radial distribution networks. IEEE Transactions on Power Systems.
2001;16(4):630–637.
[91] Kefayat M, Ara AL, Niaki SN. A hybrid of ant colony optimization and
artificial bee colony algorithm for probabilistic optimal placement and
sizing of distributed energy resources. Energy Conversion and
Management. 2015;92:149–161.
[92] Su X, Masoum MAS, Wolfs PJ. PSO and Improved BSFS Based Sequential
Comprehensive Placement and Real-Time Multi-Objective Control of
Delta-Connected Switched Capacitors in Unbalanced Radial MV
Distribution Networks. IEEE Transactions on Power Systems.
2016;31(1):612–622.
[93] Sorensen K. Metaheuristics—the metaphor exposed. International
Transactions in Operational Research. 2015;22(1):3–18. Available from:
https://onlinelibrary.wiley.com/doi/abs/10.1111/itor.12001.
[94] Dubey A, Santoso S. On Estimation and Sensitivity Analysis of Distribution
Circuit’s Photovoltaic Hosting Capacity. IEEE Transactions on Power
Systems. 2017 July;32(4):2779–2789.
[95] Torquato R, Salles D, Oriente Pereira C, Meira PCM, Freitas W. A
Comprehensive Assessment of PV Hosting Capacity on Low-Voltage
Distribution Systems. IEEE Transactions on Power Delivery. 2018
April;33(2):1002–1012.
[96] Al-Saadi H, Zivanovic R, Al-Sarawi SF. Probabilistic Hosting Capacity for
Active Distribution Networks. IEEE Transactions on Industrial
Informatics. 2017 Oct;13(5):2519–2532.

[97] Gensollen N, Horowitz K, Palmintier B, Ding F, Mather B. Beyond Hosting
Capacity: Using Shortest-Path Methods to Minimize Upgrade Cost
Pathways. IEEE Journal of Photovoltaics. 2019 July;9(4):1051–1056.
[98] Garces A, Molinas M, Rodriguez P. A generalized compensation theory for
active filters based on mathematical optimization in ABC frame. Electric
Power Systems Research. 2012;90:1–10. Available from:
https://www.sciencedirect.com/science/article/pii/S0378779612000806.
[99] Akagi H, Watanabe EH, Aredes M. Instantaneous Power Theory and
Applications to Power Conditioning. 1st ed. NY: IEEE Power Engineering
Society, NJ, Wiley-Interscience; 2007.
[100] Czarnecki LS. Instantaneous reactive power p-q theory and power
properties of three-phase systems. IEEE Transactions on Power Delivery.
2006;21(1):362–367.
[101] Herrera RS, Salmeron P. Instantaneous Reactive Power Theory: A
Comparative Evaluation of Different Formulations. IEEE Transactions on
Power Delivery. 2007;22(1):595–604.
[102] Baran ME, Wu FF. Network reconfiguration in distribution systems for loss
reduction and load balancing. IEEE Transactions on Power Delivery.
1989;4(2):1401–1407.
[103] Monticelli A. Electric power system state estimation. Proceedings of the
IEEE. 2000;88(2):262–282.
[104] Ardakanian O, Wong VWS, Dobbe R, Low SH, von Meier A, Tomlin CJ,
et al. On Identification of Distribution Grids. IEEE Transactions on Control
of Network Systems. 2019;6(3):950–960.
[105] Farajollahi M, Shahsavari A, Mohsenian-Rad H. Topology Identification in
Distribution Systems Using Line Current Sensors: An MILP Approach.
IEEE Transactions on Smart Grid. 2020;11(2):1159–1170.
[106] Schweppe FC, Wildes J. Power System Static-State Estimation, Part I: Exact
Model. IEEE Transactions on Power Apparatus and Systems.
1970;PAS-89(1):120–125.
[107] Zhao J, Gómez-Expósito A, Netto M, Mili L, Abur A, Terzija V, et al. Power
System Dynamic State Estimation: Motivations, Definitions,
Methodologies, and Future Work. IEEE Transactions on Power Systems.
2019;34(4):3188–3198.
[108] Zhu H, Giannakis GB. Power System Nonlinear State Estimation Using
Distributed Semidefinite Programming. IEEE Journal of Selected Topics in
Signal Processing. 2014;8(6):1039–1050.
[109] Madani R, Lavaei J, Baldick R. Convexification of Power Flow Equations in
the Presence of Noisy Measurements. IEEE Transactions on Automatic
Control. 2019;64(8):3101–3116.

[110] Wu FF, Monticelli A. Network Observability: Theory. IEEE Transactions on
Power Apparatus and Systems. 1985;PAS-104(5):1042–1048.
[111] Bai H, Zhang P, Ajjarapu V. A Novel Parameter Identification Approach via
Hybrid Learning for Aggregate Load Modeling. IEEE Transactions on
Power Systems. 2009;24(3):1145–1154.
[112] Garces A, Gil-González W, Montoya OD, Chamorro HR, Alvarado-Barrios
L. A Mixed-Integer Quadratic Formulation of the Phase-Balancing Problem
in Residential Microgrids. Applied Sciences. 2021;11(5). Available from:
https://www.mdpi.com/2076-3417/11/5/1972.
[113] Bazaraa M, Jarvis J, Sherali H. Linear programming and network flows.
Wiley; 2010.
[114] Jinxiang Zhu, Mo-Yuen Chow, Fan Zhang. Phase balancing using
mixed-integer programming [distribution feeders]. IEEE Transactions on
Power Systems. 1998;13(4):1487–1492.
[115] Gellings CW. The concept of demand-side management for electric utilities.
Proceedings of the IEEE. 1985;73(10):1468–1470.
[116] Meyabadi AF, Deihimi MH. A review of demand-side management:
Reconsidering theoretical framework. Renewable and Sustainable Energy
Reviews. 2017;80:367–379. Available from:
http://www.sciencedirect.com/science/article/pii/S1364032117308481.
[117] Deng R, Yang Z, Chow M, Chen J. A Survey on Demand Response in Smart
Grids: Mathematical Models and Approaches. IEEE Transactions on
Industrial Informatics. 2015;11(3):570–582.
[118] Hu RL, Skorupski R, Entriken R, Ye Y. A Mathematical Programming
Formulation for Optimal Load Shifting of Electricity Demand for the Smart
Grid. IEEE Transactions on Big Data. 2020;6(4):638–651.
[119] Zhu J, Bilbro G, Mo-Yuen Chow. Phase balancing using simulated
annealing. IEEE Transactions on Power Systems. 1999;14(4):1508–1513.
[120] Castaño JC, Garcés A, Rios MA. In: Phase Balancing in Power Distribution
Grids: A Genetic Algorithm with a Group-Based Codification. Cham:
Springer International Publishing; 2020. p. 325–342. Available from:
https://doi.org/10.1007/978-3-030-36115-0_11.
[121] Li K, Bian H, Liu C, Zhang D, Yang Y. Comparison of geothermal with solar
and wind power generation systems. Renewable and Sustainable Energy
Reviews. 2015;42:1464–1474. Available from:
https://www.sciencedirect.com/science/article/pii/S1364032114008740.

Index

𝑌bus, 175
+=, 31

a
affine set, 42
argmin, 19

b
branch and bound, 77

c
capacitor placement, 202
cardinality constrained uncertainty, 122
Cholesky, 69, 70, 87, 93, 118
concave, 45

d
day ahead dispatch, 9
dc state estimator, 218
determinant, 93
dual function, 54
dual norm, 114
dual problem, 54
Duck curve, 246

e
economic dispatch, 3, 127
epigraph, 48
Euclidean norm, 23

f
Francis turbine, 157

g
global optimum, 24, 51

h
hydraulic chains, 162
hydroelectric, 156
hydrothermal coordination, 4, 155
hydrothermal dispatch, 4, 155
hydrothermal scheduling, 155

i
Identification, topology, 221
incremental cost, 128
inf, 19
information and communication technologies, 7
Internet of the things, 7

k
Kaplan turbine, 157
Kron, 176

l
Lagrange multipliers, 32
least square, 230
local optimum, 24

m
Manhattan norm, 23
matrix norm, 224
MI-SOC, 201
minimum square, 230
MIQP, 201
model predictive control, 14

n
negative semidefinite matrix, 69
networkx, 175
norm, 22

o
OPF, 171

p
Pelton turbine, 157
PMU, 9
PMUs, 215
polytope, 42
positive semidefinite matrix, 69
power flow, 173
pumped hydroelectric storage, 165

q
quadratic form, 67
quantile function, 120, 133

r
receding horizon, 14
Reconfiguration, primary feeder, 196
robust economic dispatch, 133
robust optimization, 14, 133

s
SCADA, 9, 215
SDP, 188
sectionalizing switches, 196
semidefinite, 69, 95
semidefinite matrix, 67, 69
skew-symmetric, 68
SOC, 184
strictly convex, 45
strong duality, 54
strongly convex, 51
sup, 19

t
trace, 92

u
uniform norm, 23
unit commitment, 148

v
V2G, 8
vehicle-to-grid, 8
virtual power plant, 9
VPP, 9

w
weak duality, 54
Wirtinger, 177

z
ZIP model, 228

Mathematical Programming for Power Systems Operation: From Theory to Applications in
Python. First Edition. Alejandro Garcés.
© 2022 by The Institute of Electrical and Electronics Engineers, Inc. Published 2022 by John
Wiley & Sons, Inc.

WILEY END USER LICENSE AGREEMENT
Go to www.wiley.com/go/eula to access Wiley’s ebook EULA.
