
MTL103: Optimization Methods and Applications

Lecture 1 - January 4th, 2024


Hemang Sidana - Zaurez Ahmad - Dhruv Gupta - Musaib Gani Pirzada
2021CS10078 - 2021CS10090 - 2021CS50125 - 2021MT10227

Optimization Problems

• An optimization problem looks like:

– Minimize f0 (x)
– Subject to fi (x) ≥ bi , i = 1, . . . , m

• In the above equations the vector x = (x1 , . . . , xn ) is the optimization variable or decision
variable of the problem. The function f0 : Rn → R is the objective function. The functions
fi : Rn → R, i = 1, . . . , m are the (inequality) constraint functions. The constants b1 , . . . , bm
are the limits or bounds, for the constraints.

Terminology

• Decision Variables: These are variables in the mathematical expression of linear program-
ming that we are trying to find the optimal values for to optimize the objective function.

• Free variable: These are variables in a linear programming problem that are not restricted in sign, i.e., they are not subject to any nonnegativity or nonpositivity constraint.

• Objective function: This is the mathematical expression in linear programming that rep-
resents the target quantity that needs to be optimized. In linear programming problem it is
a linear combination of the decision variables with coefficients given by the cost vector.

• Cost vector: Cost vector c is a vector such that we seek to minimize cT x as the objective
function in linear programming problem.

• Feasible solution: Any set of values for the decision variables that satisfies all the constraints
in the linear programming problem is said to be a feasible solution.

• Feasible region: This is the set of all feasible solutions in the decision space in linear
programming.

• Optimal solution: A vector x∗ is an optimal solution for the minimization problem if it has the smallest objective value among all vectors that satisfy the constraints. In other words, a vector x∗ is an optimal solution if for every z satisfying fi (z) ≥ bi for i = 1, . . . , m, we have f0 (z) ≥ f0 (x∗).

• Cost is unbounded: In a linear programming problem, the cost is said to be unbounded if the objective function has no finite optimum over the feasible set. For the minimization problem this means that for every feasible solution x there exists another feasible solution x∗ with f0 (x∗) < f0 (x), so the objective can be made arbitrarily small; the analogous statement (arbitrarily large objective) holds for a maximization problem.
Forms of Linear Programming Problems

1. General Form

The general form of linear programming problem is denoted as follows. For a given cost vector c
minimize cT x subject to constraints -

aTi x ≥ bi , i ∈ M1
aTi x ≤ bi , i ∈ M2
aTi x = bi , i ∈ M3
xj ≥ 0, j ∈ N1
xj ≤ 0, j ∈ N2

2. Standard Form

The standard form of linear programming problem is denoted as follows. For a given cost vector c
minimize cT x subject to constraints -

Ax = b
x≥0

3. Canonical Form

The canonical form of linear programming problem is denoted as follows. For a given cost vector
c minimize cT x subject to constraints -

Ax ≥ b
x≥0

Reduction from General Form to Canonical Form

• Given General Form


Minimize cT x subject to constraints,

aTi x ≥ bi , i ∈ M1
aTi x ≤ bi , i ∈ M2
aTi x = bi , i ∈ M3
xj ≥ 0, j ∈ N1
xj ≤ 0, j ∈ N2

• Equivalent Canonical Form



Replace xj with x+j − x−j if j ∉ N1 ∪ N2.
Minimize cT x subject to constraints,

aTi x ≥ bi , i ∈ M1
−aTi x ≥ −bi , i ∈ M2
aTi x ≥ bi , i ∈ M3
−aTi x ≥ −bi , i ∈ M3
xj ≥ 0, j ∈ N1
−xj ≥ 0, j ∈ N2
x+j ≥ 0, j ∉ N1 ∪ N2
x−j ≥ 0, j ∉ N1 ∪ N2

Reduction from Canonical Form to General Form

Canonical Form is a special case of General Form.

Reduction from Canonical Form to Standard Form

• Given Canonical Form


Minimize cT x subject to constraints,

Ax ≥ b
x≥0

• Equivalent Standard Form


Introduce slack variable si . Let S be the column vector of si .
Minimize cT x subject to constraints,

Ax − S = b
x≥0
S≥0

Reduction from Standard Form to Canonical Form

• Given Standard Form


Minimize cT x subject to constraints,
Ax = b
x≥0

• Equivalent Canonical Form


Minimize cT x subject to constraints,
Ax ≥ b
−Ax ≥ −b
x≥0

Equivalence of Three Forms

We have shown that general form and canonical form are reducible to each other. So, general form
and canonical form are equivalent. Similarly, canonical form and standard form are equivalent.
Hence, all three forms are equivalent.
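To make the reductions above concrete, here is a small illustrative sketch (not part of the original notes): a tiny canonical-form LP is converted to standard form by introducing surplus variables, and both forms are solved with scipy.optimize.linprog to check that they give the same optimum. The matrices and numbers are made up for illustration, and scipy/numpy are assumed to be available.

```python
# Hypothetical example: minimize c^T x subject to Ax >= b, x >= 0 (canonical form),
# and the equivalent standard form Ax - s = b with surplus variables s >= 0.
import numpy as np
from scipy.optimize import linprog

c = np.array([1.0, 2.0])
A = np.array([[1.0, 1.0],
              [1.0, 3.0]])
b = np.array([2.0, 3.0])

# Canonical form: linprog expects "<=" constraints, so Ax >= b is passed as -Ax <= -b.
res_canonical = linprog(c, A_ub=-A, b_ub=-b, bounds=[(0, None)] * 2)

# Standard form: variables are (x, s); constraints become [A | -I][x; s] = b.
c_std = np.concatenate([c, np.zeros(2)])      # the surplus variables have zero cost
A_eq = np.hstack([A, -np.eye(2)])
res_standard = linprog(c_std, A_eq=A_eq, b_eq=b, bounds=[(0, None)] * 4)

print(res_canonical.x, res_canonical.fun)     # optimal x and optimal cost
print(res_standard.x[:2], res_standard.fun)   # same x and same cost
```

Both calls should report the same optimal cost, which is exactly the equivalence argued above.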

Examples of Linear Programming

1. Transportation Problems

The problem of optimising cost for transporting coffee can be modelled using linear programming
as follows -

• Consider a coffee company that has m plants indexed 1 to m such that production capacity
of plant i is Pi .
• Coffee is shipped every week to n warehouses which are indexed 1 to n such that the demand
at warehouse j is dj .
• Unit shipping cost from plant i to warehouse j is cij .

Objective is to find the production-shipping pattern xij , from plant i to warehouse j, i ∈ {1, . . . m},
j ∈ {1, . . . n}, which minimizes the overall shipping cost.

Mathematical Formulation and Cost Function

Since shipping from plant i to warehouse j costs cij per unit, and we ship xij units, the total cost to ship from plant i to warehouse j is cij · xij. Thus our total cost over all i, j pairs is Σ_{i=1}^{m} Σ_{j=1}^{n} cij · xij. This is our objective function, which we need to minimise. Further, the constraints which we have can be formulated as follows:

• Since each plant i can produce only Pi units of goods, the total amount of goods leaving this plant should be less than or equal to Pi. We get

Σ_{j=1}^{n} xij ≤ Pi ∀ i ∈ {1, . . . , m}

• Since the demand of each warehouse needs to be met, and we do not need to send any more than is required, for each warehouse j the total shipment arriving from all plants should equal dj. Thus we get

Σ_{i=1}^{m} xij = dj ∀ j ∈ {1, . . . , n}

• The shipment from plant i to warehouse j, cannot be negative, so we get

xij ≥ 0 ∀ i ∈ {1 . . . m}, ∀ j ∈ {1 . . . n}

Linear Programming Problem

Thus the problem can be formulated as follows:

Minimise Σ_{i=1}^{m} Σ_{j=1}^{n} cij · xij
Subject to Σ_{j=1}^{n} xij ≤ Pi ∀ i ∈ {1, . . . , m}
Σ_{i=1}^{m} xij = dj ∀ j ∈ {1, . . . , n}
xij ≥ 0 ∀ i ∈ {1, . . . , m}, ∀ j ∈ {1, . . . , n}
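As an illustrative aside (not part of the original notes), the following Python sketch solves a hypothetical instance of this transportation LP with m = 2 plants and n = 2 warehouses using scipy.optimize.linprog; all capacities, demands and costs are made-up numbers.

```python
# Hypothetical transportation instance; decision variables ordered as x11, x12, x21, x22.
import numpy as np
from scipy.optimize import linprog

P = [30.0, 20.0]               # plant capacities P_i (made up)
d = [25.0, 15.0]               # warehouse demands d_j (made up)
C = np.array([[4.0, 6.0],      # unit shipping costs c_ij (made up)
              [5.0, 3.0]])

c = C.flatten()                # objective: sum_{i,j} c_ij * x_ij

A_ub = np.array([[1, 1, 0, 0],     # capacity: x11 + x12 <= P_1
                 [0, 0, 1, 1]],    #           x21 + x22 <= P_2
                dtype=float)
A_eq = np.array([[1, 0, 1, 0],     # demand:   x11 + x21 = d_1
                 [0, 1, 0, 1]],    #           x12 + x22 = d_2
                dtype=float)

res = linprog(c, A_ub=A_ub, b_ub=P, A_eq=A_eq, b_eq=d, bounds=[(0, None)] * 4)
print(res.x.reshape(2, 2))     # optimal shipping pattern x_ij
print(res.fun)                 # minimum total shipping cost
```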

2. Choosing Paths in a Communication Network

In this subsection, we will have a look at another example of a real world problem which can be
formulated as a linear programming problem. The problem is as follows -

• Consider a communication network of n nodes connected by links allowing one-way transmission. A link from node i to node j is represented by an ordered pair (i, j), which can carry up to uij bits per second; let the set of all links be A.

• There is a positive charge cij per bit transmitted along the link (i, j) ∈ A.

• Each node k generates b^{kl} bits per second that need to be transmitted to node l (directly or indirectly).

• The problem is to transmit all data to their destinations while minimizing the cost.

Mathematical Formulation and Cost Function

We introduce the decision variables of the problem as follows: let x^{kl}_{ij} be the amount of data per second flowing in the link (i, j) ∈ A which originated at node k and is destined for node l.

We can further notice that if node k generates b^{kl} bits per second of data that need to be transmitted to node l, then node k acts as a source of this data while node l acts as a sink, and the rest of the nodes simply help to transmit this data from node k to node l. To formalize this notion in terms of variables, we introduce the following quantities, which denote whether data is entering or leaving the network at a particular node:

d^{kl}_i = b^{kl} if i = k,   −b^{kl} if i = l,   0 if i ̸= k and i ̸= l.

We can see that this definition of d^{kl}_i encapsulates the fact that b^{kl} bits per second are ‘created’ at node k and are ‘destroyed’ at node l, independent of the remaining nodes.

Now that we have defined some variables, we can see that the cost C that we need to minimize is
the following -

C = Σ_{(i,j)∈A} Σ_{k=1}^{n} Σ_{l=1}^{n} cij x^{kl}_{ij}

The cost function C is easily seen to be linear in the decision variables x^{kl}_{ij}.

The constraints that we need to take care of are as follows -

• Non-Negativity Constraints - The data flowing in each link with any source and destina-
tion must be non-negative -
x^{kl}_{ij} ≥ 0

This should be true for all allowed values of i, j, k, l.

• Capacity Constraints - The data flowing at each link should not be more than the link’s
capacity, this means that for all values of (i, j) ∈ A -
Σ_{k=1}^{n} Σ_{l=1}^{n} x^{kl}_{ij} ≤ uij

• Data Constraints - We have one final set of constraints, which tell us that the amount of
data that is coming into a node must equal to amount of data that is exiting the node, except
for the data that is being generated or consumed at that node. If data is being generated
at that node, the incoming data will be less than the outgoing data and vice-versa for data
consumption at a node. When we talk about data, we talk about a particular type of data

that was generated at some particular node k and is destined for some other particular node
l, and this constraint should hold for all valid values of k and l. Mathematically, these set of
constraints can be written as -
Σ_{j | (i,j)∈A} x^{kl}_{ij} = Σ_{j | (j,i)∈A} x^{kl}_{ji} + d^{kl}_i

for each particular choice of i, k, l.


The summation in the left hand side of the above equation represents the amount of data
leaving node i (with destination l and source k ), and the summation on the right hand side
represents the amount of data entering i (with destination l and source k ). These two should
be the same if i is different from both k and l, the LHS summation should be greater by an
amount of bkl if i = k and the RHS summation should be greater by the same amount if i = l.
All of this is contained in the single quantity d^{kl}_i that we defined above in the first part of this subsection. Note that this constraint can be rewritten as

Σ_{j | (i,j)∈A} x^{kl}_{ij} − Σ_{j | (j,i)∈A} x^{kl}_{ji} = d^{kl}_i

Linear Programming Problem

Now, we see that in this problem, we need to minimize the cost function

C = Σ_{(i,j)∈A} Σ_{k=1}^{n} Σ_{l=1}^{n} cij x^{kl}_{ij}

Subject to the constraints:

x^{kl}_{ij} ≥ 0 for all (i, j) ∈ A and all k, l

Σ_{k=1}^{n} Σ_{l=1}^{n} x^{kl}_{ij} ≤ uij for all (i, j) ∈ A

Σ_{j | (i,j)∈A} x^{kl}_{ij} − Σ_{j | (j,i)∈A} x^{kl}_{ji} = d^{kl}_i for all i, k, l

We can see that the cost function as well as the constraints are all linear in the decision variables x^{kl}_{ij}, so this problem has been formulated as a linear programming problem.

References

D. Bertsimas and J.N. Tsitsiklis. Introduction to linear optimization. Athena Scientific, 1997.

MTL103: Optimization Methods and Applications
Lecture 2 - 5th January, 2024
Lakshaya Jain - Ravi Kumawat - Kamal Nehra - Vaishali Choudhary
2022MT11933 - 2022MT12027 - 2022MT61979 - 2022MT11948

Recap

Minimize f0 (x)
Subject to fi (x) ≤ bi , i = 1, ..., m.

The vector x = (x1 , . . . , xn ) is the optimization variable or Decision Variable of the problem.
The function f0 : Rn → R is the objective function.
The functions fi : Rn → R, i = 1, . . . , m, are the (inequality) constraint functions.
The constants b1 , . . . , bm are the limits, or bounds, for the constraints.
A vector x∗ is an optimal solution if:
• It has the smallest objective value among all vectors that satisfy the constraints
• For any z with f1 (z) ≤ b1 , . . . , fm (z) ≤ bm , we have f0 (z) ≥ f0 (x∗ )

Canonical form:
A canonical form of the optimization problem is of the form:
Minimize cT x
Subject to Ax ≥ b, x ≥ 0.
• A minimization problem is in Canonical Form if all restrictions are of the ≥ type and all
variables are non-negative.
• A maximization problem is in Canonical Form if all restrictions are of the ≤ type and all
variables are non-negative.

Standard form:
A standard form of an optimization problem is of the form:
Minimize cT x
Subject to Ax = b, x ≥ 0.
• All restrictions are equalities and all variables are non-negative.

Conversion of canonical form to standard form:


• Elimination of free variables/ imposing non-negativity constraints
• Elimination of inequality constraints

Choosing paths in a Communication Network

The problem statement is described below:


• Consider a communication network consisting of n nodes.
• A link allowing one-way transmission from node i to node j is represented by an ordered pair
(i, j).
• Let A be the set of all links.
• Assume that each link (i, j) ∈ A can carry up to uij bits per second.
• There is a positive charge cij per bit transmitted along the link (i, j) ∈ A.
• Each node k generates bkl bits per second that need to be transmitted to node l (direct or
indirect).
• The problem is to transmit all data to their destinations while minimizing the total cost.

Mathematical formulation and cost function


We would like to minimize the cost of flow through the network.
Let x^{kl}_{ij} represent the amount of data per second flowing in the link (i, j) ∈ A that originated at node k and is destined for node l.
We can notice that if node k generates bkl bits per second of data that needs to be transmitted to
node l, then node k acts as a source and node l acts as a sink, and the rest of the nodes simply
transmit all the data they receive further.
Let d^{kl}_i denote the net amount of data (per second) generated at node i for the traffic that originates at node k and is destined for node l:

d^{kl}_i = b^{kl} if i = k,   −b^{kl} if i = l,   0 otherwise.

After defining these variables, we can write the objective function as follows:
Minimize C = Σ_{(i,j)∈A} Σ_{k=1}^{n} Σ_{l=1}^{n} cij x^{kl}_{ij}

Subject to the constraints:

x^{kl}_{ij} ≥ 0

Σ_{k=1}^{n} Σ_{l=1}^{n} x^{kl}_{ij} ≤ uij

Σ_{j | (i,j)∈A} x^{kl}_{ij} − Σ_{j | (j,i)∈A} x^{kl}_{ji} = d^{kl}_i

We can see that all the constraints and the cost function are linear; hence, the problem has been formulated as a linear programming problem.
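As an illustrative aside (not part of the original notes), here is a minimal sketch of this network-flow LP for a hypothetical 3-node network with links A = {(1, 2), (2, 3), (1, 3)} and a single source-destination pair (k, l) = (1, 3) with b^{13} = 10, solved with scipy.optimize.linprog. All charges and capacities are made-up numbers.

```python
# Variables are the flows x_12, x_23, x_13 of the single commodity (k, l) = (1, 3).
import numpy as np
from scipy.optimize import linprog

cost = [1.0, 1.0, 3.0]     # charges c_12, c_23, c_13 per bit (made up)
cap = [8.0, 8.0, 10.0]     # capacities u_12, u_23, u_13 (made up)
b13 = 10.0                 # bits per second to send from node 1 to node 3

# Flow conservation (outgoing minus incoming equals d_i) at nodes 1, 2 and 3.
A_eq = np.array([[ 1.0,  0.0,  1.0],     # node 1:  x_12 + x_13        = +b13
                 [-1.0,  1.0,  0.0],     # node 2: -x_12 + x_23        =  0
                 [ 0.0, -1.0, -1.0]])    # node 3:        -x_23 - x_13 = -b13
b_eq = np.array([b13, 0.0, -b13])

res = linprog(cost, A_eq=A_eq, b_eq=b_eq, bounds=[(0, u) for u in cap])
print(res.x)     # expected: 8 bits/s routed 1->2->3 and 2 bits/s routed 1->3
print(res.fun)   # minimum total cost
```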

Different types of functions

Concave function
A function f : Rn → R is called concave if for every x, y ∈ Rn , and every λ ∈ [0, 1], we have
f (λx + (1 − λ)y) ≥ λf (x) + (1 − λ)f (y)

Convex function

A function f : Rn → R is called convex if for every x, y ∈ Rn , and every λ ∈ [0, 1], we have

f (λx + (1 − λ)y) ≤ λf (x) + (1 − λ)f (y)

Theorem 1. Let f1 , ..., fm : Rn → R be convex functions. Then the function f defined by f (x) =
maxi=1,...,m fi (x) is also convex.

Proof: For any x, y ∈ Rn and λ ∈ [0, 1],
• f (λx + (1 − λ)y) = max_{i=1,...,m} fi (λx + (1 − λ)y)
• ≤ max_{i=1,...,m} (λfi (x) + (1 − λ)fi (y))   (since each fi is convex)
• ≤ λ max_{i=1,...,m} fi (x) + (1 − λ) max_{i=1,...,m} fi (y)
• = λf (x) + (1 − λ)f (y)
Hence, the function f is a convex function.

Piecewise Linear Convex Objective function

A function of the form maxi=1,...,m (ci T x + di ) is called a piecewise linear convex function.

• Example: The absolute value function f (x) = |x| = max{x, −x}.

A generalization of a linear programming problem where the objective function is piecewise linear
and convex:

Minimize max_{i=1,...,m} ((ci )T x + di )
Subject to Ax ≤ b

This is equivalent to the linear program: minimize t subject to Ax ≤ b and t ≥ (ci )T x + di for i = 1, . . . , m.

Problem involving absolute values

Minimize Σ_{i=1}^{n} ci |xi | (where ci ≥ 0)
Subject to Ax ≥ b

This problem can be reduced to:

Minimize Σ_{i=1}^{n} ci ti
Subject to Ax ≥ b, ti ≥ xi , ti ≥ −xi , i = 1, . . . , n,

since at an optimal solution each ti equals max{xi , −xi } = |xi |.
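As an illustrative aside (not part of the original notes), the following sketch applies this reduction to a tiny made-up instance, minimize |x1| + 2|x2| subject to x1 + x2 ≥ 1, using scipy.optimize.linprog; the variable ordering (x1, x2, t1, t2) and the data are my own choices.

```python
# Reformulation with auxiliary variables: minimize t1 + 2*t2 with t_i >= x_i and t_i >= -x_i.
import numpy as np
from scipy.optimize import linprog

c = np.array([0.0, 0.0, 1.0, 2.0])       # cost only on t1, t2

# linprog takes "<=" rows, so each ">=" constraint is written with its sign flipped:
#   x1 + x2 >= 1  ->  -x1 - x2 <= -1;   t1 >= x1  ->  x1 - t1 <= 0;   t1 >= -x1  ->  -x1 - t1 <= 0
A_ub = np.array([[-1, -1,  0,  0],
                 [ 1,  0, -1,  0],
                 [-1,  0, -1,  0],
                 [ 0,  1,  0, -1],
                 [ 0, -1,  0, -1]], dtype=float)
b_ub = np.array([-1.0, 0.0, 0.0, 0.0, 0.0])

bounds = [(None, None), (None, None), (0, None), (0, None)]   # x free, t >= 0
res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds)
print(res.x[:2], res.fun)     # expected optimum x = (1, 0) with value 1
```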


MTL103: Optimization Methods and Applications Spring 2024

Lecture 04 — 9th Jan, 2024


Lecturer: Prof. Minati De

Scribe:
Name Entry Number
Aayushi Singh 2022MT11955
Sanya Sachan 2022MT11286
Sarthak Gangwal 2022MT11275
Shikhar Gupta 2022MT11925

1 Overview

In the last lecture, we covered the optimization of data flow in a communication network using linear
programming. The network, comprised of n nodes and one-way links with specific capacities and
charges, aimed to minimize the total cost of data transmission. Decision variables x^{kl}_{ij} represented data flow, and the objective function Σ_{(i,j)∈A} Σ_{k=1}^{n} Σ_{l=1}^{n} cij x^{kl}_{ij} sought to minimize the associated
cost. Linear programming constraints, including non-negativity, capacity, and flow conservation,
were discussed. The lecture also introduced the concepts of convex functions and piecewise linear
convex functions, and how to deal with problems involving absolute values.
In this lecture we will explore the Chebyshev Approximation Problem and its formulation using
linear programming, building on our knowledge from communication network optimization. We’ll
also dive into a practical example involving chocolate production, showcasing linear programming’s
role in profit maximization within constraints. Throughout, we’ll discuss the nuances of optimal
solution existence and uniqueness in different scenarios, providing concise insights into these key
aspects of linear programming.

2 Chebyshev Approximation Problem

2.1 Problem Statement

Given data points (ai , bi ) for i = 1, . . . , n, where ai ∈ Rn and bi ∈ R, the objective is to predict bm given am for a new index m.

2.2 Approach

The problem can be framed as a linear system aTi · x = bi , where x represents the coefficients of the
approximating polynomial.

Formulate as a Linear Programming Problem

1. Minimization Objective:

• Original Problem: Minimize

max_{i=1,...,n} |aTi x − bi |

• Improved Approximation: Minimize

Σ_{i=1}^{n} |aTi x − bi |   or   Σ_{i=1}^{n} (aTi x − bi )^2

The choice between minimizing the maximum absolute deviation or the sum of absolute
deviations or the sum of squared deviations or any other possible objective function
depends on the specific requirements of the approximation.

2. Decision Variables: The vector x representing the coefficients of the approximating poly-
nomial.

3. Constraints: The linear system constraints aTi · x = bi ensure that the polynomial approxi-
mates the given data points.

4. Interpretation: The objective aims to minimize the maximum (or total) absolute devia-
tion between the predicted values aTi · x and the actual values bi . This ensures a robust
approximation for all data points.

2.3 Conclusion

The Chebyshev Approximation Problem, when formulated as a linear programming problem, pro-
vides a versatile framework for data fitting, offering flexibility in terms of the optimization objective
and allowing for a balance between accuracy and simplicity in representation.
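As an illustrative aside (not part of the original notes), here is a minimal sketch of the minimax objective as a linear program: minimize t subject to aTi x − bi ≤ t and −(aTi x − bi) ≤ t, solved with scipy.optimize.linprog on random made-up data.

```python
# Chebyshev (minimax) fit: decision vector is z = (x, t), objective is t alone.
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(0)
n_points, dim = 20, 3
A = rng.normal(size=(n_points, dim))     # rows are the data vectors a_i (made up)
b = rng.normal(size=n_points)            # targets b_i (made up)

c = np.zeros(dim + 1)
c[-1] = 1.0                              # minimize t

ones = np.ones((n_points, 1))
A_ub = np.vstack([np.hstack([ A, -ones]),    #  a_i^T x - t <=  b_i
                  np.hstack([-A, -ones])])   # -a_i^T x - t <= -b_i
b_ub = np.concatenate([b, -b])

res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(None, None)] * (dim + 1))
x, t = res.x[:dim], res.x[-1]
print(t, np.max(np.abs(A @ x - b)))      # the two numbers should coincide
```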

3 An Example of Solving a Two Variable Linear Programming Problem Graphically

3.1 Problem

Consider that you start a chocolate company which can produce dark chocolate and milk chocolate. You need cocoa powder, sugar and milk to manufacture both, but in different quantities. You have only 10 units of cocoa powder, 10 units of sugar and 8 units of milk. One unit of dark chocolate produces 5 units of revenue and one unit of milk chocolate produces 4 units of revenue. How much of each chocolate should you manufacture to maximize the revenue?

Ingredients Dark Milk
Cocoa 5 2
Sugar 2 5
Milk 1 4

Table 1: Quantity to produce 1 unit of both the chocolates

3.2 Solution

Let us formulate this problem as a linear programming problem.


Let d be the amount of dark chocolate and m be the amount of milk chocolate produced.

Objective: Maximize revenue i.e. 5d + 4m.

Constraints:

• Quantity produced is non-negative: d ≥ 0, m ≥ 0.

• Amount of cocoa powder used: 5d + 2m ≤ 10.

• Amount of sugar used: 2d + 5m ≤ 10.

• Amount of milk used: d + 4m ≤ 8.

Approach: We would plot all the constraints over a graph to determine a feasible set of solutions.
Then we shall check which point in the set maximizes the objective.

The quadrilateral ABCD formed inside is the set of all solutions satisfying all the constraints. We plot the lines 5d + 4m = c, i.e. the objective function, for different values of c. Each of these lines is called an isocost line because all points lying on it, for a fixed value of c, have the same cost. We find that the objective is maximized at point A, which is the point (10/7, 10/7). At this point the objective value is 90/7. This shows that you should produce 10/7 ≈ 1.43 units of dark chocolate and 10/7 ≈ 1.43 units of milk chocolate to earn the maximum revenue of 90/7 ≈ 12.86 units.
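As a quick check (not part of the original notes), the graphical solution can be verified numerically with scipy.optimize.linprog; since linprog minimizes, we minimize −(5d + 4m) and negate the result.

```python
# Verify the chocolate LP: maximize 5d + 4m subject to the three resource constraints.
from scipy.optimize import linprog

c = [-5.0, -4.0]                  # negate to turn maximization into minimization
A_ub = [[5.0, 2.0],               # cocoa:  5d + 2m <= 10
        [2.0, 5.0],               # sugar:  2d + 5m <= 10
        [1.0, 4.0]]               # milk:    d + 4m <=  8
b_ub = [10.0, 10.0, 8.0]

res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(0, None)] * 2)
print(res.x)      # approximately (10/7, 10/7) = (1.4286, 1.4286)
print(-res.fun)   # approximately 90/7 = 12.857
```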

3.3 Additional Notes

While working in higher dimensions we are not able to solve the problem graphically. To find an
optimal solution we may use Simplex method. Its basic ideas are as follows

• Find an edge (or extreme ray) in which the objective value is continually improving.

• Go to the next corner point. If there is no such point then stop, the objective is unbounded.

• Continue until no adjacent corner has a better objective value.

4 Does Uniqueness of Optimal Value Imply Uniqueness of Optimal Solution?

No! The optimal value of an optimization problem, when it exists, is a single number, but there is no inherent relationship between having a (unique) optimal value and having a unique optimal solution.
We can show this using the different examples explained below:

• Objective function: Minimize c1 x1 + c2 x2

• Constraints: x1 ≥ 0;
x2 ≥ 0;

4.1 Example 1: c1 = 1 and c2 = 1

Here, the minimum value is 0.


And there is a unique optimal solution i.e (0,0).

4.2 Example 2: c1 = -1 and c2 = -1

Here, the minimum value is −∞ (the cost is unbounded below).

And no finite optimal solution exists.

4.3 Example 3: c1 = 1 and c2 = 0

Here, the minimum value is 0.


And there are infinitely many optimal solutions, namely {(0, y) : y ≥ 0}.

4.4 Example 4: c1 = 0 and c2 = 1

Here, the minimum value is 0.


And there are infinitely many optimal solutions, namely {(x, 0) : x ≥ 0}.

5 Existence of Optimal solution

As we have seen in the previous section, in different scenarios we may have a unique optimal solution or multiple optimal solutions, but does an optimal solution always exist? No! There are several scenarios in which the optimal solution and the optimal value may not exist.
Trivially, if the feasible region is empty, an optimal solution does not exist. Even if the feasible region is non-empty, an optimal solution may not exist for the following reasons.

5.1 Unbounded feasible region

This occurs when the feasible region is unbounded in a direction along which the cost keeps improving, so there exist feasible solutions of arbitrarily small cost.


Example: Minimize x + y subject to x ≤ 1 , x + 2y ≤ 2

In this example, for any cost C there exist feasible solutions with smaller cost, so no solution is optimal. Moreover, the minimum value tends to −∞ due to the unboundedness of the feasible region.

5.2 Open boundaries of feasible region

In this case feasible solutions approach the optimum near an open boundary, but an optimal solution does not exist.
Example: Maximize x such that |x| < 5.
For any feasible value of x, say 5 − ϵ with ϵ > 0, there exists a larger feasible value, such as 5 − ϵ/2. Thus no value of x can be optimal. (Note that this feasible set involves a strict inequality and is therefore not a polyhedron.)

6 References

1. Introduction to Linear Optimization by Dimitris Bertsimas and John Tsitsiklis

2. Lecture slides for the course, MTL103: Optimization Methods and Applications by Prof.
Minati De

MTL103: Optimization Methods and Applications Spring 2024

Lecture 5 — 10 Jan, 2024


Lecturer: Prof. Minati De

Scribe:
Name Entry Number
Md Sanaur Rahman 2022MT11946
Saem Habeeb 2022MT62004
Vatsal 2022MT11926
Rohit Jadekar 2022MT61984

1 Overview

In the last lecture we covered Chebyshev Approximation Problem and its formulation using linear
programming. We also solved a practical example of maximizing profit in chocolate production
within some constraints. We learned how to solve linear programming problems graphically and concluded that the isocost lines are perpendicular to the cost vector. Also, uniqueness of the optimal value does not imply uniqueness of the optimal solution.
In this lecture we will cover definitions of Hyperplane, half-space, Polyhedron and Convex
Sets. We will also see that a polyhedron is the intersection of convex sets. Throughout, we’ll
discuss extreme points and the condition for their existence in a polyhedron, providing concise insights into these key concepts.

2 Polyhedra and Convex Set

To better understand Linear Programming graphically we need to learn some definitions

2.1 Definitions

1. Hyperplane

Let a be a non-zero vector in Rn and let b be a scalar. The set {x ∈ Rn | aT x = b} is called


a Hyperplane.

2. Half-space

Let a be a non-zero vector in Rn and let b be a scalar. The set {x ∈ Rn | aT x ≤ b} is called


a Half-space

• A hyperplane is the boundary of a corresponding half-space.

• Vector a in the definition of the hyperplane is perpendicular to the hyperplane itself(If
x and y belong to the same hyperplane, then aT x = aT y. Hence aT (x − y) = 0 and
therefore a is orthogonal to any direction vector confined to the hyperplane)

Figure 1: A hyperplane and two halfspaces

3. Polyhedron

A Polyhedron is a set that can be described in the form {x ∈ Rn | Ax ≤ b}, where A is an


m × n matrix and b is a vector in Rm .

• The feasible set of any linear programming problem can be described by inequality
constraints of the form of Ax ≤ b and is therefore a polyhedron.
• A set of the form {x ∈ Rn | Ax = b, x ≥ 0} is also a polyhedron and is said to be in standard form representation.

Figure 2: The polyhedron


{x | aTi x ≤ bi , for i = 1, 2, . . . , 5}

4. Convex Set

A set S ⊂ Rn is convex if for every x, y ∈ S, λx + (1 − λ)y ∈ S, where λ ∈ [0, 1].

• If λ ∈ [0, 1] , λx + (1 − λ)y is a weighted average of the vectors x, y and therefore belongs


to the line segment joining x and y. Thus a set is convex if the segment joining any two
of its elements is contained in the set.

Figure 3: Set S is convex but set Q is not


• Some convex sets

Figure 4: Convex Sets


• Some sets that are not convex

Figure 5: Non-Convex Sets

Lemma: The intersection of two convex sets is also a convex set.
Proof: The proof of this theorem is by contradiction. Suppose for convex sets S and T there are
elements a and b such that a and b both belong to S ∩ T , i.e., a belongs to S and T and b belongs
to S and T and there is a point c on the straight line between a and b that does not belong to
S ∩ T . This would mean that c does not belong to one of the sets S or T or both. For whichever set
c does not belong to this is a contradiction of that set’s convexity, contrary to assumption. Thus
no such c and a and b can exist and hence S ∩ T is convex.
Theorem 1: Every Polyhedron is a Convex Set
Proof: Let H be a polyhedron, H = {x ∈ Rn | Ax ≤ b}, and let x1 , x2 ∈ H, so that Ax1 ≤ b and Ax2 ≤ b. For any λ ∈ [0, 1],

A(λx1 + (1 − λ)x2 ) = λAx1 + (1 − λ)Ax2 ≤ λb + (1 − λ)b = b

so λx1 + (1 − λ)x2 ∈ H. Hence the convexity of H is established.

3 Extreme Point

Extreme Point: A point p ∈ P is an extreme point of a polyhedron P if we cannot find two points y and z in P , both different from p, and a scalar λ ∈ [0, 1], such that p = λy + (1 − λ)z.
Remarks: A necessary and sufficient condition for a non-empty polyhedron to have at least one extreme point is that it does not contain a line.

3.1 Existence of Extreme points

Theorem 2: Suppose that a polyhedron P = {x ∈ Rn | aTi x ≥ bi , for i = 1, 2, . . . , m} is non-empty.


The following are equivalent:

1. The polyhedron P does not contain a line

2. The polyhedron P has at least one extreme point

3. There exist n vectors out of the family a1 , a2 , . . . , am which are linearly independent.

Proof. (1) =⇒ (2) Let x ∈ P and I = {i | aTi x = bi }
If n of the vectors ai , i ∈ I are linearly independent then, we have n hyperplanes intersecting at x
and we have an extreme point.
If not, then all ai , i ∈ I lie in a proper subspace of Rn spanned by all the a′i s.
=⇒ ∃d ∈ Rn , d ̸= 0 s.t. aTi d = 0 ∀i ∈ I.
Let us consider the line y = x + λd where λ ∈ R. For i ∈ I, we have aTi y = aTi x + aTi d = aTi x = bi
=⇒ the constraints satisfying equality at x are also satisfied at all y. But as P doesn’t contain
a line, at some λ′ some constraints will get violated and a new constraint will become active. Let
this constraint be aTj (x + λ′ d) = bj , with j ∉ I.
Since aTj x ̸= bj (by definition of I), we must have aTj d ̸= 0. As aTi d = 0 for all i ∈ I, the vector aj cannot be a linear combination of the vectors ai , i ∈ I, and is hence linearly independent of them.
Thus, by moving from x to x + λ′ d, the number of linearly independent active constraint vectors has increased by at least one. As x and the line were arbitrary, this argument can be repeated till we get n linearly independent vectors. Once we have these n linearly independent vectors, we will have n hyperplanes intersecting at a point, and this point is an extreme point.
(2) =⇒ (3) We know, in Rn , a point is obtained by the intersection of n hyperplanes. So, if P has
an extreme point x, then there exist n constraints aTi x = bi s.t. x satisfies them with equality and
the a′i s are linearly independent. Hence, we have a family of n linearly independent vectors.
(3) =⇒ (1) We have that n of the a′i s are linearly independent; WLOG let a1 , a2 , . . . , an be these n linearly independent vectors.
Suppose the polyhedron P contains the line y = x + λd (λ ∈ R, d ̸= 0). Then for every λ and every i,
aTi (x + λd) ≥ bi
If aTi d ̸= 0 for some i, then by varying λ we could make aTi (x + λd) arbitrarily small, violating this inequality. Hence
aTi d = 0 for all i
Now, as the vectors ai , i = 1, 2, . . . , n are linearly independent, we get d = 0, which contradicts our assumption. Hence P does not contain a line.

Corollary: Every nonempty polyhedron in standard form has at least one extreme point.

3.2 Optimality of Extreme points

Theorem 3: Consider the Linear Programming Problem (LPP) of minimizing cT x over a polyhe-
dron P . Suppose that P has at least one extreme point and that there exists an optimal solution.
Then there exists an optimal solution which is an extreme point of P .
Proof: Let P = {x ∈ Rn : Ax ≤ b}, where A is a matrix and b is a vector.
Let v be the optimum cost of the Linear Programming Problem (LPP).
Define Q = {x ∈ P : cT x = v}.
Since P has at least one extreme point, P does not contain a line (by Theorem 2). As Q is a subset of P , Q also does not contain a line. This implies that Q has an extreme point.

Let x0 be an extreme point of Q.
For contradiction, assume x0 is not an extreme point of P .
Then there exist y, z ∈ P , both different from x0 , and λ ∈ [0, 1] such that:

x0 = λy + (1 − λ)z

Then, the optimum cost can be expressed as:

v = cT x0 = λcT y + (1 − λ)cT z

Since v is the optimal cost, cT y ≥ v and cT z ≥ v, so this forces cT y = cT z = v. Hence y, z ∈ Q, and x0 is a convex combination of two other points of Q, contradicting the fact that x0 is an extreme point of Q.
Therefore, x0 must be an extreme point of P .

References

1. Dimitris Bertsimas and John Tsitsiklis, Introduction to Linear Optimization.

2. Prof. Minati De, Lecture slides for the course MTL103: Optimization Methods and Applica-
tions.

MTL103: Optimization Methods and Applications Spring 2024

Lecture 6 – 12th Jan 2024


Lecturer: Prof. Minati De

Scribes :
Shamil Mohammed 2022MT11938
Dev Elvis Kannath 2022MT61961
Raj Aryan 2022MT11945
Soumya Namdeo 2022MT61978

1 Overview

In the previous lecture, we explored fundamental definitions such as Hyperplane, Half-space, Poly-
hedron, and Convex Sets. Real-life applications and examples demonstrated polyhedra as intersec-
tions of convex sets. The discussion delved into extreme points within a polyhedron, elucidating
the conditions for their existence. This comprehensive exploration provided valuable insights into
both theoretical foundations and practical implications in the realm of optimization.
In this lecture, we delve into vertices and active constraints, examining basic and basic feasible
solutions. The exploration connects geometric and algebraic spaces within a polyhedron, bridging
theory with practical applications. By understanding these concepts, we gain insights into optimiz-
ing solutions and navigating the intricate intersections of mathematical and geometric domains.

2 Vertex and Active constraints :

In this section, we will define some basic terminologies.

2.1 Definition of Vertex:

Let P be a polyhedron. A vector x ∈ P is a vertex of P if there exists some c such that cT x < cT y for all y ∈ P with y ̸= x.
Alternatively, x is a vertex of P if and only if P lies on one side of a hyperplane that intersects
P only at the point x.
To provide a definition based on the representation of a polyhedron in terms of linear constraints,
we need some additional terminology.

2.2 Definition of Active Constraints :

Let P ⊂ Rn be a polyhedron defined by

aTi x ≥ bi , i ∈ M1

aTi x ≤ bi , i ∈ M2
aTi x = bi , i ∈ M3

where M1 , M2 , M3 are finite index sets, each ai is a vector in Rn , and each bi is a scalar.
If a vector x∗ satisfies aTi x∗ = bi for some i ∈ M1 ∪ M2 ∪ M3 , we say that the corresponding constraint is active at x∗ .
If there are n active constraints at a vector x∗ , then x∗ satisfies a certain system of n linear equations in n unknowns. This system has a unique solution if and only if these n equations are linearly independent.

Theorem : let x∗ ∈ Rn . Let I = {i : aTi x∗ = bi } be a set of indices of active constraints at x∗ .


Then the following are equivalent:
(a) there exist n vectors in the set {ai : i ∈ I} which are linearly independent.
(b) span({ai : i ∈ I}) = Rn .
(c) the system of linear equations aTi x = bi , i ∈ I, has a unique solution.

Proof :

1. a ⇔ b
Assume that S = {ai : i ∈ I} is a set of n linearly independent vector ai . Note that the
dim(Rn ) is n.
We know that a set of linearly independent vectors of right length forms a basis of Rn . Thus
we can say S is a basis and hence span(S) = Rn .

Conversely, assume span(S) = Rn .


Again, S has n vectors that spans Rn and dim(Rn ) = n and we know that a spanning list of
right length forms a basis.
⇒ S is a basis of Rn and thus the vectors in S are linearly independent.

2. b ⇔ c
Assume that the system of equations aTi x = bi , i ∈ I has multiple solutions. let x1 , x2
(x1 ̸= x2 ) be vectors satisfying the above equations.

⇒ aTi x1 = bi and aTi x2 = bi . Let d = x1 − x2 ̸= 0. Then

aTi d = aTi x1 − aTi x2 = 0

⇒ aTi d = 0 for all i ∈ I.

⇒ d is orthogonal to every ai , i ∈ I.

⇒ d (being non-zero and orthogonal to every ai , i ∈ I) is not a linear combination of these vectors; thus d ∉ span(S).
Hence S does not span Rn .
Conversely, assume S does not span Rn .

Choose a non-zero vector d which is orthogonal to the subspace spanned by these vectors.
i.e. aTi d = 0 for all i ∈ I.

If x satisfies aTi x = bi for all i ∈ I, then

aTi (x + d) = aTi x + aTi d = bi ,

⇒ x + d also satisfies the equations; hence the system aTi x = bi , i ∈ I, does not have a unique solution.

3 Basic and Basic Feasible Solution :

Consider a polyhedron P defined by linear equality and inequality constraints, and let x∗ ∈ Rn .
(a) The vector x∗ is a basic solution if:
(1) All equality constraints are active.
(2) Out of the constraints that are active at x∗ , there are n of them that are linearly independent.
(b) If x∗ is a basic solution that satisfies all of the constraints, we say that it is a basic feasible
solution.
NOTE: If the number of constraints, say m, used to define the polyhedron is less than n, then the number of active constraints at any given point is less than n, and there are no basic or basic feasible solutions.

4 Connecting Geometric and Algebraic aspects

Theorem: let P be a non empty polyhedron. let x∗ ∈ P. Then the following are equivalent.
(a) x∗ is a vertex.
(b) x∗ is an extreme point.
(c) x∗ is a basic feasible solution.

Proof :
(1) a ⇒ b
Let x∗ ∈ P be a vertex point. By definition, there exists c such that cT x∗ < cT y for all y ∈ P and
y ̸= x∗ .

Assume x∗ is not an extreme point, which implies that x∗ = λy + (1 − λ)z for some y, z ∈ P , both different from x∗ , and λ ∈ [0, 1].
Since x∗ is a vertex point, we have cT x∗ < cT y and cT x∗ < cT z.
Now, we can write these equations explicitly:

cT x∗ < cT y
cT x∗ < cT z

But from the equation we obtained:

cT x∗ = cT (λy + (1 − λ)z)

= λcT y + (1 − λ)cT z

But since cT x∗ < cT y and cT x∗ < cT z (and λ ∈ (0, 1) is forced because y, z ̸= x∗ ), we get λcT y + (1 − λ)cT z > cT x∗ , which contradicts the equality above.

Hence, we conclude that x∗ is an extreme point.
(2) b ⇒ c
Suppose x∗ ∈ P is an extreme point. Assume x∗ is not a basic feasible solution.

Let I = {i : aTi x∗ = bi }.
Since x∗ is not a basic feasible solution, there do not exist n linearly independent vectors in {ai : i ∈ I}. Thus, the vectors ai , i ∈ I, lie in a proper subspace of Rn , and there exists some nonzero vector d ∈ Rn such that
aTi d = 0, ∀i ∈ I.

Let ε > 0 be a small positive number, and consider vectors y = x∗ + εd and z = x∗ − εd.
Notice that aTi y = aTi (x∗ + εd) = aTi x∗ = bi , ∀i ∈ I.
Furthermore, for i ∉ I we have aTi x∗ > bi , and provided that ε is small enough, we will also have aTi y > bi .
Thus, when ε is small enough, y ∈ P, and by a similar argument, z ∈ P. Finally, notice that x∗ = (y + z)/2, which implies that x∗ is not an extreme point, a contradiction.

(3) c ⇒ a
Let x∗ ∈ Rn be a basic feasible solution, and let I = {i : aTi x∗ = bi }. Let c = Σ_{i∈I} ai .

Then we have:

cT x∗ = Σ_{i∈I} aTi x∗ = Σ_{i∈I} bi

Furthermore, for any x ∈ P and any i ∈ I, we have aTi x ≥ bi , and hence

cT x = Σ_{i∈I} aTi x ≥ Σ_{i∈I} bi = cT x∗

This shows that x∗ is an optimal solution to the problem of minimizing cT x over P.

Furthermore, equality holds if and only if aTi x = bi for all i ∈ I.
Since x∗ is a basic feasible solution, there are n linearly independent constraints that are active at x∗ , and x∗ is the unique solution to the system of equations aTi x = bi , ∀i ∈ I.
It follows that equality holds only when x = x∗ , i.e., cT x > cT x∗ for every x ∈ P with x ̸= x∗ . Therefore, x∗ is a vertex of P .

MTL103: Optimization Methods and Applications Spring 2024

Lecture 07 : 18th January, 2024


Lecturer: Prof. Minati De
Scribes:
1. Tejas Asija 2022MT61890
2. Hemant Ramgaria 2022MT11854
3. Sumit Sonowal 2022MT11296
4. Nikita 2022MT11951

1 Overview

In the previous lecture, we discussed the meanings of hyperplane, halfspace, polyhedron, and con-
vex sets. Additionally, we explored the proof of the theorem asserting that every polyhedron is a
convex set. The lecture also delved into topics such as extreme points, the presence of extreme
points, and the optimality associated with extreme points.
In this lecture, we’ll primarily discuss polyhedron in standard form, explore the definitions of fea-
sible solutions and basic feasible solutions, then some relevant theorems, and subsequently explore
methods for constructing basic solutions.

2 Polyhedron in Standard Form

Theorem. Let P = {x ∈ Rn | Ax = b, x ≥ 0} be a non-empty polyhedron, where A is a matrix of dimension m × n whose rows aT1 , aT2 , . . . , aTk are linearly independent and whose remaining rows are linear combinations of these.
Consider,

Q = {x ∈ Rn | aTi x = bi , i ∈ {1, 2, . . . , k}, x ≥ 0}

Then,

P = Q

Proof. We need to show that P ⊆ Q and Q ⊆ P .

Since every row aTi of A is a linear combination of the linearly independent rows aT1 , . . . , aTk , we can write aTi = Σ_{j=1}^{k} λij aTj for i = 1, 2, . . . , m, where the λij are scalars. As P ̸= ϕ, pick any x ∈ P ; then

bi = aTi x = Σ_{j=1}^{k} λij aTj x = Σ_{j=1}^{k} λij bj ,   i ∈ {1, 2, . . . , m}

1. Claim: P ⊆ Q
Let x ∈ P . Then x ≥ 0 and aTi x = bi for all i ∈ {1, 2, . . . , m}; in particular aTi x = bi for i ∈ {1, 2, . . . , k}.
∴ x ∈ Q
∴ P ⊆ Q

2. Claim: Q ⊆ P
Let y ∈ Q. Then y ≥ 0 and aTj y = bj for j ∈ {1, 2, . . . , k}, so for every i ∈ {1, 2, . . . , m},

aTi y = Σ_{j=1}^{k} λij aTj y = Σ_{j=1}^{k} λij bj = bi

∴ y ∈ P
∴ Q ⊆ P

Hence,

P = Q

2.1 Basic Feasible Solutions

Definition (Basic Feasible Solution). A feasible vector x ∈ Rn is a basic feasible solution if,

1. All equality constraints are active.

2. Out of the active constraints at x, there are n of them that are linearly independent

Remark. At any basic solution, at least n − m of the x′i s should be zero

Theorem. Consider the constraint Let Ax = b, x ≥ 0, where A is a matrix of dimension m × n


with linearly independent rows.
A vector x∗ is a basic feasible solution if and only if we have Ax∗ = b and there exist indices B(1), B(2), . . . , B(m) such that

(a) The columns AB(1) , AB(2) , . . . , AB(m) are linearly independent.

(b) If i ∈
/ {B(1), B(2), . . . , B(m)} then xi = 0.

Proof. .

1. (−→)
Consider some x ∈ Rn and suppose there are indices B(1), B(2), . . . , B(m) that satisfy (a) &
(b) in the statement of the theorem.
The active constraints xi = 0, i ̸= B(1), B(2), . . . , B(m) & Ax = b imply that

Σ_{i=1}^{m} AB(i) xB(i) = Σ_{i=1}^{n} Ai xi = Ax = b

Since the columns AB(i) , i = 1, 2, . . . , m are linearly independent , xB(1) , xB(2) , . . . , xB(m) are
uniquely determined
∴ The system of equations formed by the active constraints has a unique solution.
By theorem on active constraints, there are n linearly independent active constraints
=⇒ x is a basic solution

2. (←−)
Assume x is a basic solution. Let xB(1) , xB(2) , . . . , xB(k) be the non-zero components of x.
Since x is a basic solution,
=⇒ the system of linear equations formed by the active constraints Σ_{i=1}^{n} Ai xi = b and xi = 0, i ̸= B(1), B(2), . . . , B(k), has a unique solution, by the theorem on active constraints
⇐⇒ the equation Σ_{i=1}^{k} AB(i) xB(i) = b has a unique solution.

Claim. The Columns AB(1) , AB(2) , . . . , AB(k) are Linearly Independent

Proof. By Contradiction
Assume AB(1) , AB(2) , . . . , AB(k) are not Linearly Independent
=⇒ ∃ λ1 , λ2 , . . . , λk , not all of them zero, such that Σ_{i=1}^{k} AB(i) λi = 0
=⇒ Σ_{i=1}^{k} AB(i) (xB(i) + λi ) = b, contradicting the uniqueness of the solution
∴ AB(1) , AB(2) , . . . , AB(k) are linearly independent.

Now, since AB(1) , AB(2) , . . . , AB(k) are linearly independent,
=⇒ k ≤ m.
Since A has m linearly independent rows, it also has m linearly independent columns, which span Rm.
=⇒ We can find m − k additional columns AB(k+1) , AB(k+2) , . . . , AB(m) so that the columns AB(i) , i = 1, 2, . . . , m, are linearly independent.
In addition, if i ̸= B(1), B(2), . . . , B(m), then i ̸= B(1), B(2), . . . , B(k), and hence xi = 0.
∴ Both conditions (a) and (b) are satisfied.

2.1.1 Method for Constructing Basic Feasible Solution

For the standard form problem (minimize cT x subject to Ax = b, x ≥ 0):

1. Choose m linearly independent columns AB(1) , AB(2) , . . . , AB(m)

2. Let xi = 0 ∀ i ∉ {B(1), . . . , B(m)}

3. Solve the system of m equations Ax = b for the unknowns xB(1) , xB(2) , . . . , xB(m)

Example: Let the constraint Ax = b be of the form

[ 1 1 2 1 0 0 0 ]       [ 8  ]
[ 0 1 6 0 1 0 0 ] x  =  [ 12 ]
[ 1 0 0 0 0 1 0 ]       [ 4  ]
[ 0 1 0 0 0 0 1 ]       [ 6  ]

∴ Rank(A) = 4
Let us choose A4 , A5 , A6 , A7 as our basic columns. Note that they are linearly independent and the corresponding basis matrix is the identity matrix.
∴ x1 = x2 = x3 = 0

[ 1 0 0 0 ] [ x4 ]   [ 8  ]
[ 0 1 0 0 ] [ x5 ] = [ 12 ]
[ 0 0 1 0 ] [ x6 ]   [ 4  ]
[ 0 0 0 1 ] [ x7 ]   [ 6  ]

∴ x4 = 8, x5 = 12, x6 = 4, x7 = 6
=⇒ Basic solution is (0, 0, 0, 8, 12, 4, 6)
Question: Is this basic solution feasible?
Ans: Yes, it is feasible as xi ≥ 0 ∀ i ∈ {1, . . . , 7}.
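As an illustrative aside (not part of the original notes), the construction above can be reproduced in a few lines of numpy: pick the basic columns, set the remaining variables to zero, and solve AB xB = b.

```python
# Reconstruct the basic solution of the example with basic columns A4, A5, A6, A7.
import numpy as np

A = np.array([[1, 1, 2, 1, 0, 0, 0],
              [0, 1, 6, 0, 1, 0, 0],
              [1, 0, 0, 0, 0, 1, 0],
              [0, 1, 0, 0, 0, 0, 1]], dtype=float)
b = np.array([8, 12, 4, 6], dtype=float)

basic = [3, 4, 5, 6]                  # 0-based indices of the columns A4, A5, A6, A7
A_B = A[:, basic]
x = np.zeros(A.shape[1])
x[basic] = np.linalg.solve(A_B, b)    # assumes A_B is nonsingular

print(x)                              # [0, 0, 0, 8, 12, 4, 6]
print(bool(np.all(x >= 0)))           # True, so the basic solution is feasible
```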

References

[1] Dimitris Bertsimas, John N. Tsitsiklis Introduction to Linear Optimization, Athena Scientific
Series in Optimization and Neural Computation, 6/1997, Athena Scientific

MTL103: Optimization Methods and Applications Spring 2024

Lecture 8 – 19th Jan, 2024 Lecturer: Prof. Minati De

Scribes:
Keshav Rai 2022MT61968 Parth Bhardwaj 2022MT11257
Aahna Jain 2022MT11930 Umangh Nagal 2022MT11928

1 Overview

In the last lecture we applied the theory of basic solutions to the special case of polyhedra in standard form (under the full row rank assumption), which are central to the development of the simplex method, which we will learn later. This allowed us to develop a method to construct all basic (and hence basic feasible) solutions of a standard form polyhedron.
In this lecture we further develop the geometric and algebraic intuition on this construction method
and delve into the ideas of Adjacent basic solutions, basis and Degeneracy.

2 Main Section

2.1 Procedure for constructing basic solutions

Let P={x|Ax=b, x ≥ 0} be a polyhedron in standard form, where A is an m x n matrix.

The LP objective is to minimize cT x. To construct a basic solution:


1) Choose m linearly independent columns AB(1) , . . . , AB(m) . These are called basic columns and thus form a basis of Rm. By arranging the m basic columns next to each other, we obtain an m × m matrix AB , called the basis matrix.
2) Let xi = 0 for all i ∉ {B(1), . . . , B(m)}. These variables are called non-basic variables.
3) Solve the system of m equations Ax = b for the unknowns xB(1) , . . . , xB(m) . These variables are
called basic variables.

   
Example 1: Let

A = [ 1 1 2 1 0 0 0 ]       b = [ 8  ]
    [ 0 1 6 0 1 0 0 ]           [ 12 ]
    [ 1 0 0 0 0 1 0 ]           [ 4  ]
    [ 0 1 0 0 0 0 1 ]           [ 6  ]

along with the constraints of standard form Ax = b, x ≥ 0.
In order to find a basic solution, we first select 4 linearly independent columns, namely A4 , A5 , A6 , and A7 . Then, we set xi = 0 for i = 1, 2, 3. Finally, we arrive at the solution by solving the system of 4 equations AB xB = b (which can further be solved using standard algorithms such as Gaussian elimination). Here,

[ 1 0 0 0 ] [ x4 ]   [ 8  ]
[ 0 1 0 0 ] [ x5 ] = [ 12 ]
[ 0 0 1 0 ] [ x6 ]   [ 4  ]
[ 0 0 0 1 ] [ x7 ]   [ 6  ]

This finally gives x = (0, 0, 0, 8, 12, 4, 6), which is a basic feasible solution.

Similarly, if we select A3 , A5 , A6 , and A7 , then we set xi = 0 for i = 1, 2, 4. Hence,

[ 2 0 0 0 ] [ x3 ]   [ 8  ]
[ 6 1 0 0 ] [ x5 ] = [ 12 ]
[ 0 0 1 0 ] [ x6 ]   [ 4  ]
[ 0 0 0 1 ] [ x7 ]   [ 6  ]

This finally gives x = (0, 0, 4, 0, −12, 4, 6), which is not a feasible solution.
Note : Different basic solutions necessarily correspond to different basis because a basis uniquely
determines a basic solution. However, two different bases may lead to the same basic solution.

2.1.1 Adjacent basic solutions and Adjacent basis

Definition: Two distinct basic solutions to a set of linear constraints in Rn are said to be adjacent
if we can find n-1 linearly independent constraints that are active at both of them.

For standard form problems, we also say that two bases are adjacent if they share all but one basic column. Adjacent basic solutions can always be obtained from two adjacent bases.

Remark: The two bases chosen in Example 1 are adjacent. If both solutions were feasible, they would be referred to as adjacent basic feasible solutions; however, x = (0, 0, 4, 0, −12, 4, 6) is not a feasible solution. Therefore, the bases are adjacent but the solutions are not.

2.2 Degeneracy

According to our definition, at a basic solution, we must have n linearly independent active con-
straints. This allows for the possibility that the number of active constraints is greater than n .

Definition:A basic solution x ∈ Rn is said to be degenerate if more than n constraints are


active at x.
Example: Consider the basis consisting of the linearly independent columns A1 , A2 , A3 , and A7 in Example 1, which gives (4, 0, 2, 0, 0, 0, 6) as a basic solution. This is a degenerate solution because 8 constraints are active at this point: the 4 equality constraints Ax = b and the 4 nonnegativity constraints xi = 0 for i = 2, 4, 5, 6, which is more than n = 7.

Remark: The above basic feasible solution is also obtained when we start with A1 , A3 , A4 , and A7 . This might cause a problem while designing algorithms, as we might end up cycling in an infinite loop.

2.2.1 Degeneracy in Standard Form

At a basic solution of a polyhedron in standard form, the m equality constraints are always active.
Therefore, having more than n active constraints is the same as having more than n - m xi being
0.

Definition: Consider a polyhedron in standard form defined by an m × n matrix A, and let x* be a basic solution. The vector x* is a degenerate basic solution if more than n − m of the components of x* are 0.

Example: Let x* be a non-degenerate basic feasible solution of a standard form polyhedron P = {x | Ax = b, x ≥ 0}, where A is of dimensions m × n, so that exactly n − m of the variables xi * are equal to zero. Let us now represent P in the form P = {x | Ax ≥ b, Ax ≤ b, x ≥ 0}. Then, at the basic feasible solution x*, we have n − m variables set to zero and an additional 2m inequality constraints satisfied with equality. We therefore have n + m active constraints and x* is degenerate under this representation.
This example shows that degeneracy is highly dependent on representation of polyhedra.

Question: Let P = {(x1 , x2 , x3 ) | x1 − x2 = 0, x1 + x2 + 2x3 = 2, x1 , x2 , x3 ≥ 0}.


Check the degeneracy of:
(i) (1, 1, 0)
(ii) (0, 0, 1)
Answer:
(i) Non-degenerate, as only n = 3 constraints are active at the given basic solution.
(ii) Degenerate, as more than n constraints, namely 4 (x1 − x2 = 0, x1 + x2 + 2x3 = 2, x1 ≥ 0, x2 ≥ 0), are active at the given basic solution.
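As a small check (not part of the original notes), the active-constraint counts for the two points can be verified directly; here n = 3, so more than 3 active constraints means degeneracy.

```python
# Count active constraints at a point of P: two equalities plus the sign constraints.
def active_constraints(point, tol=1e-9):
    x1, x2, x3 = point
    count = 0
    count += abs(x1 - x2) < tol                 # x1 - x2 = 0
    count += abs(x1 + x2 + 2 * x3 - 2) < tol    # x1 + x2 + 2x3 = 2
    count += sum(abs(v) < tol for v in point)   # active nonnegativity constraints
    return count

print(active_constraints((1, 1, 0)))    # 3 active constraints -> non-degenerate
print(active_constraints((0, 0, 1)))    # 4 active constraints -> degenerate
```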

References

[1] Dimitris Bertsimas and John N. Tsitsiklis, Introduction To Linear Optimisation

MTL103: Optimization Methods and Applications Spring 2024
Lecture 10 — 23rd Jan, 2024
Lecturer: Prof. Minati De

Scribes:
1. Siddharth Saini- 2022MT11283
2. Rudranil Naskar- 2022MT11287
3. Tushar Goyal- 2022MT11266
4. Niraj Agarwal- 2022MT11921

1 Overview

In the last lecture we learnt about reduced cost vector and feasible directions. Let us recall.
Let x be a basic feasible solution to the following standard form problem,

minimize c′ x
subject to Ax = b
x≥0

and let P be the corresponding feasible set. We assume that the dimensions of the matrix A are
m × n and that its rows are linearly independent. We continue using our previous notation: Ai is
the ith column of the matrix A, and a′i is its ith row. Then we have

B = {B(1), . . . , B(m)},    AB = [ AB(1) · · · AB(m) ]

where B denotes the set of basic indices, and AB denotes the basis matrix. We define the reduced cost vector c̄ whose jth component is

c̄j = cj − cTB AB−1 Aj

If c̄ ≥ 0, then x is an optimal solution (optimality condition theorem).


In this lecture we will complete the development of the simplex method. Our main task is to work out the details of how to move to a better basic feasible solution whenever a profitable basic direction is discovered, using an example.

2 Main Section

2.1 Example: Consider the linear programming problem

minimize c1 x1 + c2 x2 + c3 x3 + c4 x4
subject to x1 + x2 + x3 + x4 = 2
2x1 + 3x3 + 4x4 = 2
x1 , x2 , x3 , x4 ≥ 0.

The first two columns of the matrix A are A1 = (1, 2) and A2 = (1, 0). Since they are linearly
independent, we can choose x1 and x2 as our basic variables. The corresponding basis matrix is

 
AB = [ 1 1 ]
     [ 2 0 ]

We set x3 = x4 = 0, and solve for x1 , x2 , to obtain x1 = 1 and x2 = 1. We have thus obtained a


nondegenerate basic feasible solution.
A basic direction corresponding to an increase in the nonbasic variable x3 , is constructed as follows.
We have d3 = 1 and d4 = 0. The direction of change of the basic variables is obtained using Eq.
(1):

        
d1 dB(1) −1 0 1/2 1 −3/2
= = dB = −AB A3 = − =
d2 dB(2) 1 −1/2 3 1/2

The cost of moving along this basic direction is c′ d = −3c1 /2 + c2 /2 + c3 . This is the same as the
reduced cost of the variable x3 .

3 How to choose θ . . .

Let us assume that every basic feasible solution is nondegenerate. This assumption will remain
in effect until it is explicitly relaxed later in this section. Suppose that we are at a basic feasible
solution x and that we have computed the reduced costs c̄j of the nonbasic variables. If all of them
are nonnegative, Optimality condition theorem shows that we have an optimal solution, and
we stop. If on the other hand, the reduced cost c̄j of a nonbasic variable xj is negative, the j th
basic direction d is a feasible direction of cost decrease. [This is the direction obtained by letting
dj = 1, di = 0 for i ̸= B(1), . . . , B(m), j, and dB = −AB −1 Aj .] While moving along this direction
d, the nonbasic variable xj becomes positive and all other nonbasic variables remain at zero. We
describe this situation by saying that xj (or Aj ) enters or is brought into the basis.
Once we start moving away from x along the direction d, we are tracing points of the form x + θd,
where θ ≥ 0. Since costs decrease along the direction d, it is desirable to move as far as possible.
This takes us to the point x + θ∗ d, where

θ∗ = max{θ ≥ 0 | x + θd ∈ P }

The resulting cost change is θ∗ c′ d, which is the same as θ∗ c̄j .
We now derive a formula for θ∗ . Given that Ad = 0, we have A(x+ θd) = Ax = b for all θ, and
the equality constraints will never be violated. Thus, x + θd can become infeasible only if one of
its components becomes negative. We distinguish two cases:

3.1 Case - 1: d ≥ 0

If d ≥ 0, then x + θd ≥ 0 for all θ ≥ 0, the vector x + θd never becomes infeasible, and we let
θ∗ = ∞.

3.2 Case - 2: ∃ i, di < 0

If di < 0 for some i, the constraint xi + θdi ≥ 0 becomes θ ≤ −xi /di . This constraint on θ must be
satisfied for every i with di < 0. Thus, the largest possible value of θ is

 
θ∗ = min_{i | di < 0} ( −xi / di )

Recall that if xi is a nonbasic variable, then either xi is the entering variable and di = 1, or else
di = 0. In either case, di is non-negative. Thus, we only need to consider the basic variables and
we have the equivalent formula

θ∗ = min_{i=1,...,m | dB(i) < 0} ( −xB(i) / dB(i) )
Note that θ∗ > 0, because xB(i) > 0 for all i, as a consequence of non-degeneracy.

3.3 Continuation of Example 2.1 . . .

minimize c1 x1 + c2 x2 + c3 x3 + c4 x4
subject to x1 + x2 + x3 + x4 = 2
2x1 + 3x3 + 4x4 = 2
x1 , x2 , x3 , x4 ≥ 0.

Let us again consider the basic feasible solution x = (1, 1, 0, 0) and recall that the reduced cost c̄3
of the nonbasic variable x3 was found to be −3c1 /2 + c2 /2 + c3 . Suppose that c = (2, 0, 0, 0), in
which case, we have c̄3 = −3. Since c̄3 is negative, we form the corresponding basic direction, which
is d = (−3/2, 1/2, 1, 0), and consider vectors of the form x + θd, with θ ≥ 0. As θ increases, the
only component of x that decreases is the first one (because d1 < 0 ). The largest possible value
of θ is given by θ∗ = − (x1 /d1 ) = 2/3. This takes us to the point y = x + 2d/3 = (0, 4/3, 2/3, 0).
Note that the columns A2 and A3 corresponding to the nonzero variables at the new vector y are
(1, 0) and (1, 3), respectively, and are linearly independent. Therefore, they form a basis and the
vector y is a new basic feasible solution. In particular, the variable x3 has entered the basis and
the variable x1 has exited the basis.
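As an illustrative aside (not part of the original notes), the numbers in this example can be reproduced with a short numpy computation of the reduced cost, the basic direction, the step size θ∗ and the new basic feasible solution.

```python
# Example 2.1 with c = (2, 0, 0, 0), starting from the basis {x1, x2}.
import numpy as np

A = np.array([[1.0, 1.0, 1.0, 1.0],
              [2.0, 0.0, 3.0, 4.0]])
b = np.array([2.0, 2.0])
c = np.array([2.0, 0.0, 0.0, 0.0])

B = [0, 1]                                   # 0-based indices of x1, x2
A_B = A[:, B]
x = np.zeros(4)
x[B] = np.linalg.solve(A_B, b)               # basic feasible solution (1, 1, 0, 0)

j = 2                                        # the nonbasic variable x3 enters
c_bar = c[j] - c[B] @ np.linalg.solve(A_B, A[:, j])    # reduced cost = -3
d = np.zeros(4)
d[j] = 1.0
d[B] = -np.linalg.solve(A_B, A[:, j])        # dB = (-3/2, 1/2)

theta = min(-x[i] / d[i] for i in B if d[i] < 0)       # ratio test, theta* = 2/3
y = x + theta * d
print(c_bar, theta, y)                       # -3.0, 0.6667, [0, 4/3, 2/3, 0]
```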

Once θ∗ is chosen, and assuming it is finite, we move to the new feasible solution y = x + θ∗ d.
Since xj = 0 and dj = 1, we have yj = θ∗ > 0. Let ℓ be a minimizing index in Eq. (4), that is,

−xB(ℓ) / dB(ℓ) = min_{i=1,...,m | dB(i) < 0} ( −xB(i) / dB(i) ) = θ∗

in particular,

dB(ℓ) < 0

and

xB(ℓ) + θ∗ dB(ℓ) = 0.

We observe that the basic variable xB(ℓ) has become zero, whereas the nonbasic variable xj has
now become positive, which suggests that xj should replace xB(ℓ) in the basis. Accordingly, we
take the old basis matrix AB and replace AB(ℓ) with Aj , thus obtaining the matrix

 
B = [ AB(1) · · · AB(ℓ−1)  Aj  AB(ℓ+1) · · · AB(m) ]

Equivalently, we are replacing the set {B(1), . . . , B(m)} of basic indices by a new set {B̄(1), . . . , B̄(m)}
of indices given by

B̄(i) = B(i) for i ̸= ℓ,   and   B̄(ℓ) = j

3.4 Equations

dB = −AB−1 Aj    (1)

c̄j = cT d    (2)

c̄j = cj − cTB AB−1 Aj    (3)

θ∗ = min_{i=1,...,m | dB(i) < 0} ( −xB(i) / dB(i) )    (4)

Theorem

(a) The columns AB(i) , i ̸= ℓ, and Aj are linearly independent and, therefore, B is a basis matrix.

(b) The vector y = x + θ∗ d is a basic feasible solution associated with the basis matrix B.

Proof.

(a) If the vectors AB̄(i) , i = 1, . . . , m, are linearly dependent, then there exist coefficients λ1 , . . . , λm ,
not all of them zero, such that

Σ_{i=1}^{m} λi AB̄(i) = 0

which implies that

Σ_{i=1}^{m} λi AB−1 AB̄(i) = 0

and the vectors AB −1 AB̄(i) are also linearly dependent. To show that this is not the case, we
will prove that the vectors AB −1 AB(i) , i ̸= ℓ, and AB −1 Aj are linearly independent. We have
AB −1 AB = I. Since AB(i) is the i th column of AB , it follows that the vectors AB −1 AB(i) , i ̸= ℓ,
are all the unit vectors except for the ℓ th unit vector. In particular, they are linearly independent
and their ℓ th component is zero. On the other hand, AB −1 Aj is equal to −dB . Its ℓ th entry,
−dB(ℓ) , is nonzero by the definition of ℓ. Thus, AB −1 Aj is linearly independent from the unit
vectors AB −1 AB(i) , i ̸= ℓ.
(b) We have y ≥ 0, Ay = b, and yi = 0 for i ̸= B̄(1), . . . , B̄(m). Furthermore, the columns
AB̄(1) , . . . , AB̄(m) have just been shown to be linearly independent. It follows that y is a basic
feasible solution associated with the basis matrix B.
Since θ∗ is positive, the new basic feasible solution x + θ∗ d is distinct from x; since d is a direction
of cost decrease, the cost of this new basic feasible solution is strictly smaller. We have therefore
accomplished our objective of moving to a new basic feasible solution with lower cost. For our
purposes, it is convenient to define a vector u = (u1 , . . . , um ) by letting

u = −dB = A_B^{-1} Aj

where Aj is the column that enters the basis; in particular, ui = −dB(i) , for i = 1, . . . , m.

References
[1] Dimitris Bertsimas and John N. Tsitsiklis, Introduction to Linear Optimisation.

MTL103: Optimization Methods and Applications Spring 2023

Lecture 11 — 30th Jan, 2024


Lecturer: Prof. Minati De

Scribes:

1. Sarthak Maheshwari, 2022MT11258

2. Arnab Goyal, 2022MT61963

3. Soham Sameer Palkar, 2022MT61971

4. Pawan Kumar Jangir, 2022MT61300

1 Overview

In the last lecture we started with the development of the simplex method, which allows us to move
from one basic feasible solution to another. Let us recall:

If x is a basic feasible solution with basis vector xB = [xB(1) ....xB(m) ] such that the reduced
cost vector, ĉ ≥ 0, then x is an optimal solution.
If not, then ∃ j such that ĉj < 0. We then try going to another basic feasible solution, by first
computing the feasible direction (in which the cost decreases). We set dj = 1 and di = 0 for all
i ̸= j, B(1), B(2)...B(m). Also, for the constraints to be valid, we calculated the value of dB , which
is the vector with those components of d that correspond to the basic variables:

dB = −A_B^{-1} Aj

Which gives us all the remaining components of the feasible direction.


If d ≥ 0, then x + θd ∈ P for all θ ≥ 0, so the optimal cost is −∞, because θ can be made arbitrarily
large.
Else, we calculated θ∗ to be:
 
θ∗ = min{i=1,...,m | dB(i) < 0} ( −xB(i) /dB(i) )

We then proved that x + θ∗ d is also a basic feasible solution.


In this lecture we will see how a typical iteration of the simplex method looks like. We will also
prove that the simplex method algorithm terminates after a finite number of iterations. We will
then look at the Naive Implementation of the simplex method, and then look at elementary row
operations, which will be useful in reducing the number of computations in one iteration.

2 Main Section

We begin by describing how a typical iteration of the simplex method looks like.

2.1 An iteration of the simplex method

• Step-1 : In a typical iteration, we start with a basis consisting of the basic columns AB(1) , AB(2) , . . . , AB(m) ,
and an associated basic feasible solution x.
• Step-2 : Compute the reduced costs ĉj = cj − c_B^T A_B^{-1} Aj for all nonbasic indices j.

• Step-3 : If they are all nonnegative, the current basic feasible solution is optimal, and the
algorithm terminates; else, choose some j for which ĉj < 0.

• Step-4 : Compute u = A_B^{-1} Aj . If no component of u is positive, we have θ∗ = ∞, the optimal
cost is −∞, and the algorithm terminates. Else,

θ∗ = min{i=1,...,m | ui > 0} ( xB(i) /ui )

• Step-5 : Let ℓ be such that θ∗ = xB(ℓ) /uℓ . Form a new basis by replacing AB(ℓ) with Aj . If y is
the new basic feasible solution, the values of the new basic variables are yj = θ∗ and
yB(i) = xB(i) − θ∗ ui for i ̸= ℓ, while the exiting variable xB(ℓ) drops to zero and becomes nonbasic.
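The five steps above can be collected into a short NumPy sketch. This is only an illustration: the function name, the 1e-10 tolerance, and the choice of the first column with negative reduced cost are our own assumptions, not part of the lecture.

```python
import numpy as np

def simplex_iteration(A, b, c, basis):
    """One iteration of the simplex method (Steps 1-5 above).

    A: (m, n) constraint matrix, b: (m,) right-hand side, c: (n,) costs,
    basis: list of m column indices of the current basis.
    Returns ("optimal", x), ("unbounded", None) or ("continue", new_basis).
    """
    m, n = A.shape
    A_B = A[:, basis]
    x_B = np.linalg.solve(A_B, b)           # values of the basic variables
    p = np.linalg.solve(A_B.T, c[basis])    # simplex multipliers: p^T = c_B^T A_B^{-1}
    c_bar = c - A.T @ p                     # reduced costs of all variables
    negative = [j for j in range(n) if j not in basis and c_bar[j] < -1e-10]
    if not negative:                        # Step 3: all reduced costs nonnegative
        x = np.zeros(n)
        x[basis] = x_B
        return "optimal", x
    j = negative[0]                         # entering column (any with negative reduced cost)
    u = np.linalg.solve(A_B, A[:, j])       # Step 4: u = A_B^{-1} A_j
    if np.all(u <= 1e-10):
        return "unbounded", None            # theta* = infinity, optimal cost -infinity
    theta, l = min((x_B[i] / u[i], i) for i in range(m) if u[i] > 1e-10)
    new_basis = list(basis)
    new_basis[l] = j                        # Step 5: A_j replaces A_B(l)
    return "continue", new_basis
```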

2.2 Termination of the Simplex Method

Theorem: Assume that the feasible set is nonempty and that every basic feasible solution is
nondegenerate. Then, the simplex method terminates after a finite number of iterations. At
termination, there are the following two possibilities:

a) We have an optimal basis B and an associated basic feasible solution which is optimal.

b) We have found a vector d satisfying Ad = 0, d ≥ 0, and cT d < 0, and the optimal cost is
−∞.

Proof: If the algorithm terminates due to the stopping criterion in Step 3, then the optimality
conditions have been met , B is an optimal basis, and the current basic feasible solution is optimal.

If the algorithm terminates because the criterion in Step 4 has been met , then we are at a basic
feasible solution x and we have discovered a nonbasic variable xj such that ĉj < 0 and such that
the corresponding basic direction d satisfies Ad = 0 and d ≥ 0. In particular, x + θd ∈ P for all
θ > 0. Since cT d = ĉj < 0, by taking θ arbitrarily large, the cost can be made arbitrarily negative,
and the optimal cost is - ∞ .

At each iteration, the algorithm moves by a positive amount θ∗ along a direction d that satis-
fies cT d < 0. Therefore, the cost of every successive basic feasible solution visited by the algorithm

is strictly less than the cost of the previous one, and no basic feasible solution can be visited twice.
Since there is a finite number of basic feasible solutions, the algorithm must eventually terminate.

2.3 Implementations of Simplex Method

We will now discuss some ways of carrying out the mechanics of the simplex method. It should be
clear from the statement of the algorithm that the vectors A−1B Aj play a key role. If these vectors
are available, the reduced costs, the direction of motion, and the stepsize θ∗ are easily computed.
Thus, the main difference between alternative implementations lies in the way that the vectors
A−1
B are computed and on the amount of related information that is carried from one iteration to
the next. When comparing different implementations, it is important to keep the following facts in
mind (cf. Section 1.6) . If AB is a given m x m matrix and b ∈ Rm is a given vector, computing the
inverse of AB or solving a linear system of the form AB x = b takes O(m3 ) arithmetic operations.
Computing a matrix-vector product AB b takes O(m2 ) ) operations. Finally, computing an inner
product pT b of two m-dimensional vectors takes O(m) arithmetic operations.

2.3.1 Naive implementation

We start by describing the most straightforward implementation in which no auxiliary information


is carried from one iteration to the next. At the beginning of a typical iteration, we have the
indices B(1), . . . , B(m) of the current basic variables. We form the basis matrix AB and compute
pT = c_B^T A_B^{-1} by solving the linear system pT AB = c_B^T for the unknown vector p. (This vector p
is called the vector of simplex multipliers associated with the basis B.) The reduced cost
ĉj = cj − c_B^T A_B^{-1} Aj of any variable xj is then obtained according to the formula:

ĉj = cj − pT Aj

Once a column Aj is selected to enter the basis, we solve the linear system AB u = Aj in order to
determine the vector u = A−1 B Aj . At this point, we can form the direction along which we will
be moving away from the current basic feasible solution. We finally determine θ∗ and the variable
that will exit the basis, and construct the new basic feasible solution.
We note that we need O(m3 ) arithmetic operations to solve the systems pT AB = cT
B and AB u = Aj .
In addition, computing the reduced costs of all variables requires O(mn) arithmetic operations,
because we need to form the inner product of the vector p with each one of the nonbasic columns
Aj . Thus, the total computational effort per iteration is O(m3 + mn). We will see shortly that
alternative implementations require only O(m2 + mn) arithmetic operations. Therefore, the im-
plementation described here is rather inefficient, in general.

2.3.2 Elementary Row operations for the Revised simplex method

Much of the computational burden in the naive implementation is due to the need for solving two
linear systems of equations. In an alternative implementation, the matrix A−1
B is made available at
T −1 −1
the beginning of each iteration, and the vectors cB AB and AB Aj are computed by a matrixvector
multiplication. For this approach to be practical, we need an efficient method for updating the

matrix A−1
B each time that we effect a change of basis. This is discussed next.
Let:

AB = [AB(1) .....AB(m) ]

be the basis matrix at the beginning of an iteration and let:

ÂB = [AB(1) ....AB(ℓ−1) Aj AB(ℓ+1) .........AB(m) ]

be the basis matrix at the beginning of the next iteration. These two basis matrices have the
same columns except that the ℓth column AB(ℓ) (the one that exits the basis) has been replaced
by Aj . It is then reasonable to expect that A−1
B contains information that can be exploited in the
−1
computation of ÂB .

Let us consider an example to understand better.


Let:

      1 0 2           1 2
Q =   0 1 0 ,   C =   3 4
      0 0 1           5 6

and note that:

       11 14
QC =    3  4
        5  6

In particular, multiplication from the left by the matrix Q has the effect of multiplying the third
row of C by two and adding it to the first row. Generalizing this example, we see that multiplying
the j th row by β and adding it to the ith row (for i ̸= j) is the same as left-multiplying by the
matrix Q = I + Dij , where Dij is a matrix with all entries equal to zero, except for the (i, j)th
entry which is equal to β. The determinant of such a matrix Q is equal to 1 and, therefore, Q is
invertible.
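As a quick numerical check of this claim (a small NumPy snippet using the matrices of the example; the variable names are ours):

```python
import numpy as np

# Q = I + D_13 with beta = 2: adds twice the third row to the first row.
Q = np.array([[1, 0, 2],
              [0, 1, 0],
              [0, 0, 1]])
C = np.array([[1, 2],
              [3, 4],
              [5, 6]])

print(Q @ C)              # [[11 14], [3 4], [5 6]]
print(np.linalg.det(Q))   # 1.0, so Q is invertible
```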

Suppose now that we apply a sequence of K elementary row operations and that the k th such
operation corresponds to left-multiplication by a certain invertible matrix Qk . Then, the sequence
of these elementary row operations is the same as left-multiplication by the invertible matrix
QK QK−1 · · · Q2 Q1 . We conclude that performing a sequence of elementary row operations on a given
matrix is equivalent to left-multiplying that matrix by a certain invertible matrix. In the next class,
we will use this idea for the development of the revised simplex method.

References
[1] Dimitris Bertsimas and John N. Tsitsiklis, Introduction to Linear Optimisation.

MTL103: Optimization Methods and Applications Spring 2024

Lecture 12 — 1st Feb, 2024


Lecturer: Prof. Minati De

Scribes:

1 . Ryan Azim Shaikh, 2022MT11923

2 . Khushank Galgat, 2022MT11291

3 . Agrim Chandra, 2022MT11952

4 . Prateek Mourya, 2022MT11937

1 Recap

In the previous lecture we discussed an algorithm for one iteration of the simplex method. We
analysed its time complexity and introduced the idea of using the pre-computed information from
the previous iteration in the next iteration.
In this lecture we will explain in detail how to obtain the inverse of the new basis matrix from the
matrix B−1 calculated in the previous iteration. We will provide an algorithm for one iteration of the Revised Simplex
method. We will also introduce the Full Tableau method used to solve linear optimization problems
by hand.

2 Revised Simplex Method

One of the main issues with the simplex method is the fact that we need to compute the matrix
inverse A_B^{-1} in each iteration of the algorithm. Although for small matrices this might not seem
like much computation, when dealing with a large number of variables this computation is expensive.
One way to reduce the computational load is to calculate A_B^{-1} by Gaussian elimination, but the
complexity is still O(m3 ).
Notice that the new basis matrix is very close to the basis matrix AB of the previous iteration:
only one basic variable changes, and hence only one column changes in the basis matrix. The
Revised Simplex Method exploits this by using the previously computed inverse to obtain the new
inverse without inverting the new basis matrix from scratch.

2.1 Computation of B̄−1 from B−1

Since B−1 B = I, we see that B−1 AB(i) is the i-th unit vector ei . Using this observation, we have

B−1 B̄ = [ e1 · · · eℓ−1 u eℓ+1 · · · em ],                         (1)

which is the identity matrix with its ℓ-th column replaced by the vector u,

where u = B−1 Aj . Let us apply a sequence of elementary row operations that will change the
above matrix to the identity matrix. In particular, consider the following sequence of elementary
row operations.
(a) For each i ̸= ℓ, we add the ℓ th row times −ui /uℓ to the i th row. (Recall that uℓ > 0.) This
replaces ui by zero.
(b) We divide the ℓ th row by uℓ . This replaces uℓ by one.

2.1.1 Proof of Computation

We are adding to each row a multiple of the ℓ-th row in order to replace the ℓ-th column u by the
ℓ-th unit vector eℓ . This sequence of elementary row operations is equivalent to left-multiplying
B−1 B̄ by a certain invertible matrix Q. Since the result is the identity, we have QB−1 B̄ = I, which
yields QB−1 = B̄−1 . The last equation shows that if we apply the same sequence of row operations
to the matrix B−1 (equivalently, left-multiply by Q), we obtain B̄−1 . We conclude that all it takes
to generate B̄−1 is to start with B−1 and apply the sequence of elementary row operations described
above.

2.1.2 Example

Let

         1  2  3             −4
B−1 =   −2  3  1 ,    u =     2 ,
         4 −3 −2              2

and suppose that ℓ = 3. Thus, our objective is to transform the vector u to the unit vector
e3 = (0, 0, 1). We multiply the third row by 2 and add it to the first row. We subtract the third
row from the second row. Finally, we divide the third row by 2. We obtain

          9   −4    −1
B̄−1 =   −6    6     3 .
          2  −1.5   −1
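A small NumPy sketch of this update (the helper name update_inverse and the 0-based indexing are ours); applied to the numbers of the example it reproduces B̄−1 without recomputing an inverse:

```python
import numpy as np

def update_inverse(B_inv, u, l):
    """Given B^{-1}, u = B^{-1} A_j and the pivot row index l (0-based),
    return the inverse of the new basis matrix via elementary row operations."""
    B_inv = np.array(B_inv, dtype=float)
    for i in range(B_inv.shape[0]):
        if i != l:
            B_inv[i, :] -= (u[i] / u[l]) * B_inv[l, :]   # step (a): make u_i zero
    B_inv[l, :] /= u[l]                                   # step (b): make u_l one
    return B_inv

B_inv = [[1, 2, 3], [-2, 3, 1], [4, -3, -2]]
u = np.array([-4.0, 2.0, 2.0])
print(update_inverse(B_inv, u, l=2))   # l = 2 corresponds to the third row
# [[ 9.  -4.  -1. ]
#  [-6.   6.   3. ]
#  [ 2.  -1.5 -1. ]]
```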

2.2 Algorithm for Revised Simplex Method

Refer to the previous lecture for the Algorithm for the Simplex method. We will now explain the
changes in the algorithm for the Revised Simplex Method.

Step 6: In the matrix A_B^{-1} perform the following row operations, where j is the pivot column, l is
the pivot row, and u = A_B^{-1} Aj in the current iteration:

(A_B^{-1})i,· ← (A_B^{-1})i,· − (ui /ul ) (A_B^{-1})l,·   for all i ̸= l,
(A_B^{-1})l,· ← (A_B^{-1})l,· / ul .

The result is the inverse of the new basis matrix.

2.3 Time complexity analysis and comparison with naive approach

It should be clear from the algorithm that the vectors B −1 Aj play a key role. If these vectors are
available, the reduced costs, the direction of motion, and the stepsize (Θ∗ ) are easily computed.
If B is a given m × m matrix and b ∈ Rm is a given vector, computing the inverse of B or solving a
linear system of the form Bx = b takes O(m3 ) arithmetic operations. However, in the revised simplex
method the inverse of the new basis is obtained from the already available B −1 by a sequence of
elementary row operations, which takes only O(m2 ) operations. Computing a matrix-vector product Bb takes O(m2 )
operations. Finally, computing an inner product p⊤ b of two m-dimensional vectors takes O(m)
arithmetic operations. In addition, computing the reduced costs of all variables requires O(mn)
arithmetic operations because we need to form the inner product of the vector p with each one of
the nonbasic columns Aj . Thus, the total computational effort per iteration O(m2 +mn) arithmetic
operations.

Method             Time complexity per iteration
Simplex (naive)    Worst case: O(m3 ), Best case: O(mn)
Revised simplex    Worst case: O(mn), Best case: O(m2 )

Comparison of the naive implementation and the revised simplex method. The time requirements
refer to a single iteration.

3 Full Tableau Method

The Tableau Method is essentially the same as the revised simplex method, but it provides an easy
way to represent all the data required for each iteration in a tabular form.
Each iteration consists of selecting a pivot column, a pivot row and a set number of elementary
row operations. Representing the data in this format also updates the reduced cost vector and the
cost of the current basic feasible solution with just a row operation.

3.1 The Tableau

The tableau is an (m + 1) × (n + 1) matrix.

• The element T0,0 is −c_B^T xB , the negative of the cost of the current basic feasible solution.

• The elements (T0,j )1≤j≤n are the reduced costs (ĉj )1≤j≤n .

• The elements (Tk,0 )1≤k≤m are the values xB(k) of the basic variables.

• The elements (Ti,j )1≤i≤m, 1≤j≤n store the entries of the matrix A_B^{-1} A at each iteration.

−c_B^T xB     c̄1   · · ·   c̄n
xB(1)          |             |
  ..         B−1 A1 · · · B−1 An
xB(m)          |             |

3.2 Updating the Tableau

3.2.1 The Algorithm

In each iteration, the Tableau is updated according to the following algorithm.

Step 1: Pick a column j whose zeroth-row entry, the reduced cost ĉj , is negative. Let U be that
column of the tableau from the first element onwards (so U = B−1 Aj ). If no such column exists,
the algorithm terminates.

Step 2: Let θ∗ = min{i | Ui > 0} ( xB(i) /Ui ) and let l be a corresponding minimizing index.

Step 3: Taking the element Tl,j as the pivot element, perform the elementary row operations

Ti,· ← Ti,· − (Ui /Ul ) Tl,·   for all i ̸= l,   and   Tl,· ← Tl,· /Ul .

Step 4: Perform the same row operation on row zero to update the reduced cost vector and the
current cost:

T0,· ← T0,· − (ĉj /Ul ) Tl,·
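A compact NumPy sketch of one such tableau update (the function name and the 1e-10 tolerance are our own; T is a float array laid out as in Section 3.1, with row 0 holding −c_B^T xB and the reduced costs):

```python
import numpy as np

def tableau_pivot(T):
    """One pivot of the full tableau method; T is modified in place.
    Returns "optimal", "unbounded" or "pivoted"."""
    m = T.shape[0] - 1
    negative = np.where(T[0, 1:] < -1e-10)[0]
    if negative.size == 0:
        return "optimal"                      # Step 1: no negative reduced cost
    j = int(negative[0]) + 1                  # pivot column (tableau index)
    U = T[1:, j]
    if np.all(U <= 1e-10):
        return "unbounded"                    # no positive entry in the pivot column
    theta, l = min((T[i + 1, 0] / U[i], i + 1) for i in range(m) if U[i] > 1e-10)
    T[l, :] /= T[l, j]                        # Step 3: scale the pivot row
    for i in range(T.shape[0]):
        if i != l:
            T[i, :] -= T[i, j] * T[l, :]      # Steps 3-4: eliminate the pivot column
    return "pivoted"
```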

3.2.2 Why it works

We pick a pivot column with a negative ĉj , i.e., we select a nonbasic variable such that the cost will
decrease if we bring it into the basis. The corresponding column U = B−1 Aj equals −dB , the negative
of the basic components of the feasible direction. Hence θ∗ is obtained as min{i | Ui > 0} ( xB(i) /Ui ).
In the next step we update the matrix B−1 A. Since any sequence of row operations applied to B−1
can equally be applied to B−1 A, we use the same row operations as in the revised simplex method
to update this part of the tableau.
The final step updates the current cost and the reduced cost vector:[1]

−c_B^T xB ← −c_B^T xB − θ∗ ĉj = −c_B̄^T xB̄

3.3 Example

Consider the problem


minimize − 10x1 − 12x2 − 12x3
subject to
x1 + 2x2 + 2x3 ≤ 20
2x1 + x2 + 2x3 ≤ 20
2x1 + 2x2 + x3 ≤ 20
x1 , x2 , x3 ≥ 0.
After introducing slack variables, we obtain the following standard form problem:

minimize − 10x1 − 12x2 − 12x3

subject to
x1 + 2x2 + 2x3 + x4 = 20
2x1 + x2 + 2x3 + x5 = 20
2x1 + 2x2 + x3 + x6 = 20
x1 , . . . , x6 ≥ 0.

Note that x = (0, 0, 0, 20, 20, 20) is a basic feasible solution and can be used to start the algorithm.
Let accordingly, B(1) = 4, B(2) = 5, and B(3) = 6. The corresponding basis matrix is the identity
matrix I. To obtain the zeroth row of the initial tableau, we note that cB = 0 and, therefore,
c_B^T xB = 0 and c̄ = c. Hence, we have the following initial tableau:

x1 x2 x3 x4 x5 x6
0 −10 −12 −12 0 0 0
x4 20 1 2 2 1 0 0
x5 20 2∗ 1 2 0 1 0
x6 20 2 2 1 0 0 1

We note a few conventions in the format of the above tableau: the label xi on top of the i-th column
indicates the variable associated with that column. The labels xi = in the leftmost column of the
tableau tell us which are the basic variables and in what order. For example, the first basic variable
xB(1) is x4 , and its value is 20. Similarly, xB(2) = x5 = 20, and xB(3) = x6 = 20. Strictly speaking,

these labels are not quite necessary. We know that the column in the tableau associated with the
first basic variable must be the first unit vector. Once we observe that the column associated with
the variable x4 is the first unit vector, it follows that x4 is the first basic variable.
We continue with our example. The reduced cost of x1 is negative, and we let that variable enter
the basis. The pivot column is u = (1, 2, 2). We form the ratios xB(i) /ui , i = 1, 2, 3; the smallest
ratio corresponds to i = 2 and i = 3. We break this tie by choosing i = 2. This determines the
pivot element, which we indicate by an asterisk. The second basic variable xB(2) , which is x5 , exits
the basis. The new basis is given by B(1) = 4, B(2) = 1, and B(3) = 6. We multiply the pivot row
by 5 and add it to the zeroth row. We multiply the pivot row by 1/2 and subtract it from the first
row. We subtract the pivot row from the third row. Finally, we divide the pivot row by 2. This
leads us to the new tableau:
x1 x2 x3 x4 x5 x6
100 0 −7 −2 0 5 0
x4 10 0 1.5 1∗ 1 −0.5 0
x1 10 1 0.5 1 0 0.5 0
x6 0 0 1 −1 0 −1 1

The corresponding basic feasible solution is x = (10, 0, 0, 10, 0, 0). In terms of the original variables
x1 , x2 , x3 , we have moved to point (10, 0, 0). Note that this is a degenerate basic feasible solution
because the basic variable x6 is equal to zero.
We have mentioned earlier that the rows of the tableau (other than the zeroth row) amount to
a representation of the equality constraints B−1 Ax = B−1 b, which are equivalent to the original
constraints Ax = b. In our current example, the tableau indicates that the equality constraints can
be written in the equivalent form:

10 = 1.5x2 + x3 + x4 − 0.5x5
10 =x1 + 0.5x2 + x3 + 0.5x5
0 = x2 − x3 − x5 + x6 .

We now return to the simplex method. With the current tableau, the variables x2 and x3 have
negative reduced costs. Let us choose x3 to be the one that enters the basis. The pivot column is
u = (1, 1, −1). Since U3 < 0, we only form the ratios xB(i) /ui , for i = 1, 2. There is again a tie,
which we break by letting i = 1, and the first basic variable, x4 , exits the basis. An asterisk again
indicates the pivot element. After carrying out the necessary elementary row operations, we obtain
the following new tableau:

x1 x2 x3 x4 x5 x6
120 0 −4 0 2 4 0
x3 10 0 1.5 1 1 −0.5 0
x1 0 1 −1 0 −1 1 0
x6 10 0 2.5∗ 0 1 −1.5 1

In terms of the original variables x1 , x2 , x3 , we have moved to the point (0, 0, 10), and the cost has
been reduced to −120. At this point, x2 is the only variable with negative reduced cost. We bring x2 into the basis,

x6 exits, and the resulting tableau is:

x1 x2 x3 x4 x5 x6
136 0 0 0 3.6 1.6 1.6
x3 4 0 0 1 0.4 0.4 −0.6
x1 4 1 0 0 −0.6 0.4 0.4
x2 4 0 1 0 0.4 −0.6 0.4

We have now reached the optimal point. Its optimality is confirmed by observing that all reduced
costs are nonnegative.
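Assuming the tableau_pivot sketch from Section 3.2.1, the whole example can be replayed from the initial tableau; the pivot order may differ from the one shown above (it depends on which negative reduced cost is picked), but the optimum is the same:

```python
import numpy as np

T = np.array([
    [  0., -10., -12., -12., 0., 0., 0.],
    [ 20.,   1.,   2.,   2., 1., 0., 0.],
    [ 20.,   2.,   1.,   2., 0., 1., 0.],
    [ 20.,   2.,   2.,   1., 0., 0., 1.],
])
while tableau_pivot(T) == "pivoted":
    pass
print(T[0, 0])   # 136.0, i.e. optimal cost -136, attained at x = (4, 4, 4, 0, 0, 0)
```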

References

[1] Dimitris Bertsimas and John N. Tsitsiklis, Introduction to Linear Optimisation.

MTL103: Optimization Methods and Applications Spring 2023

Lecture 13 — 2nd Feb, 2024


Lecturer: Prof. Minati De

Scribes:

1. Mukul Sahu, 2022MT11939

2. Nimkar Abhinav Yashwant, 2022MT11943

3. Priyansh Prakash Mayank, 2022MT11954

4. Chintada Srinivasa Rao, 2022MT11924

1 Overview

In the last lecture we learnt about the Revised Simplex Method which optimises the calculation
of the reduced cost matrix by preserving the inverse of the basis matrix through iterations and
modifying it through elementary row operations to get the inverse of the new basis matrix.
We also learnt the Full Tableau method which takes more space than the revised simplex method
but is more beneficial for applying the simplex method by hand. Thus, it is useful for solving linear
optimisation problems with few variables.
In this lecture we will begin by looking at the procedure of the Full Tableau method through
the means of an example.
We shall also delve into the some more challenges with the application of the simplex method,
namely the possible cycling of the algorithm due to degeneracy and finding a basic feasible
solution for the initialisation of the simplex method.

2 Main Section

We shall begin by describing how a linear programming problem can be solved using the Full
Tableau Method.

2.1 Example of a Full Tableau Method

The problem is as follows,

min − 10x1 − 12x2 − 12x3


s.t x1 + 2x2 + 2x3 ≤ 20 (1)

2x1 + x2 + 2x3 ≤ 20
2x1 + 2x2 + x3 ≤ 20
x1 , x2 , x3 ≥ 0

As we can see, the problem is not in standard form. We will start by first converting it into standard
form. We will introduce slack variables x4 , x5 , x6 for that.
The problem in standard form is now,

min − 10x1 − 12x2 − 12x3


s.t x1 + 2x2 + 2x3 + x4 = 20 (2)
2x1 + x2 + 2x3 + x5 = 20
2x1 + 2x2 + x3 + x6 = 20
x1 , x2 , x3 , x4 , x5 , x6 ≥ 0

The matrix A has dimensions 3 × 6 and rank(A) = 3. Upon observation, we see that A4 , A5 , A6
are linearly independent. Set B(1), B(2), B(3) equal to 4, 5 and 6 respectively. Hence, to
construct a basic solution we can set x1 , x2 and x3 all equal to zero and solve AB xB = b, where
AB = [A4 A5 A6 ] and xB = (x4 , x5 , x6 )T . Hence, x = (0, 0, 0, 20, 20, 20) is a basic solution.
Now we have everything to create the full tableau.

−cTB x x1 x2 x3 x4 x5 x6
0 -10 -12 -12 0 0 0
x4 20 1 2 2 1 0 0
x5 20 2 1 2 0 1 0
x6 20 2 2 1 0 0 1

As the zeroth row of the tableau has at least one negative entry, we may select any column with a
negative reduced cost; here we bring x2 (reduced cost −12) into the basis. For the pivot row we take
l = arg min{i | ui > 0} xB(i) /ui ; the minimum ratio 10 is attained for i = 1 and i = 3, and we pick the
least such l, namely l = 1. The element in the 1st row and the x2 column is our pivot element.
Next, we perform elementary row operations on the tableau so that the pivot column becomes el ,
the l-th column of the identity matrix.

The transformed tableau after the first iteration is,

−cTB x x1 x2 x3 x4 x5 x6
120 -4 0 0 6 0 0
x2 10 0.5 1 1 0.5 0 0
x5 10 1.5 0 1 -0.5 1 0
x6 0 1 0 -1 -1 0 1

After another iteration, we get the following tableau,

−cTB x x1 x2 x3 x4 x5 x6
120 0 0 -4 2 0 4
x2 10 0 1 1.5 1 0 -0.5
x5 10 0 0 2.5 1 1 -1.5
x1 0 1 0 -1 -1 0 1

After another iteration, we get the following tableau,

−cTB x x1 x2 x3 x4 x5 x6
136 0 0 0 3.6 1.6 1.6
x2 4 0 1 0 0.4 -0.6 0.4
x3 4 0 0 1 0.4 0.4 -0.6
x1 4 1 0 0 -0.6 0.4 0.4

As the reduced cost vector is nonnegative, the optimal solution is x = (4, 4, 4, 0, 0, 0) with optimal cost z = −136 (the corner entry of the tableau is −z = 136).
While we have a basic algorithm, there are still two challenges for the simplex algorithm.

2.2 Challenges in the simplex algorithm:

1. Degeneracy: Degeneracy allows two different bases to correspond to the same solution.
This might cause the algorithm to cycle between bases without decreasing the objective function,
so the algorithm may never reach its terminating condition.
2. Initial BFS: To initialise the simplex method, an initial BFS must be found, and we
don’t yet have an algorithm for finding one.

2.3 Example Demonstrating Cycle in Iterations of Simplex Method

The linear programming problem considered for this section has the following tableau during its
typical iteration.

−cTB x      x1      x2     x3      x4      x5     x6     x7
  3        −3/4     20    −1/2      6       0      0      0
x5  0       1/4     −8     −1       9       1      0      0
x6  0       1/2    −12    −1/2      3       0      1      0
x7  1        0       0      1       0       0      0      1

−cTB x      x1      x2     x3      x4      x5     x6     x7
  3          0      −4    −7/2     33       3      0      0
x1  0        1     −32     −4      36       4      0      0
x6  0        0       4     3/2    −15      −2      1      0
x7  1        0       0      1       0       0      0      1

−cTB x      x1      x2     x3      x4      x5     x6     x7
  3          0       0     −2      18       1      1      0
x1  0        1       0      8     −84     −12      8      0
x2  0        0       1     3/8   −15/4    −1/2    1/4     0
x7  1        0       0      1       0       0      0      1

−cTB x      x1      x2     x3      x4      x5     x6     x7
  3         1/4      0      0      −3      −2      3      0
x3  0       1/8      0      1    −21/2    −3/2     1      0
x2  0      −3/64     1      0     3/16    1/16   −1/8     0
x7  1      −1/8      0      0     21/2     3/2    −1      1

−cTB x      x1      x2     x3      x4      x5     x6     x7
  3        −1/2     16      0       0      −1      1      0
x3  0      −5/2     56      1       0       2     −6      0
x4  0      −1/4    16/3     0       1      1/3   −2/3     0
x7  1       5/2    −56      0       0      −2      6      1

−cTB x      x1      x2     x3      x4      x5     x6     x7
  3        −7/4     44     1/2      0       0     −2      0
x5  0      −5/4     28     1/2      0       1     −3      0
x4  0       1/6     −4    −1/6      1       0     1/3     0
x7  1        0       0      1       0       0      0      1

−cTB x      x1      x2     x3      x4      x5     x6     x7
  3        −3/4     20    −1/2      6       0      0      0
x5  0       1/4     −8     −1       9       1      0      0
x6  0       1/2    −12    −1/2      3       0      1      0
x7  1        0       0      1       0       0      0      1

Here the entering variable xj is chosen as the one with the most negative reduced cost c̄j , and ties
in the ratio test are broken in favour of the row with the smallest index. As we can see, we started
with the initial basic variables 5, 6, 7 and ended up with the same basic variables after six iterations
of the algorithm.

3 Cycling in the BFS

The original simplex algorithm starts with an arbitrary basic feasible solution, and then changes
the basis in order to decrease the minimization target and find an optimal solution. Usually, the
target indeed decreases in every step, and thus after a bounded number of steps an optimal solution
is found. However, there are examples of degenerate linear programs, on which the original simplex
algorithm cycles forever. It gets stuck at a basic feasible solution (a corner of the feasible polyhe-
dron) and changes bases in a cyclic way without decreasing the optimization target. In the above
example, every pivot took place at the same degenerate basic feasible solution (with θ∗ = 0), and
the algorithm cycled.

3.1 Anticycling Rules

Bland’s Rule / Smallest subscript rule: Bland’s rule is one such anticycling rule, used
to avoid the problem of cycling. Using Bland’s rule, the simplex algorithm solves feasible linear
optimization problems without cycling.[1][2]
Bland’s rule gives us the steps for choosing the entering and exiting variable as given:

1. Choose the lowest-numbered (i.e., leftmost) nonbasic column with a negative reduced cost,
c̄j < 0.

2. Now among the rows, choose the one with the lowest ratio between the (transformed) right-hand
side and the coefficient in the pivot column, where the coefficient is greater than zero.
If the minimum ratio is shared by several rows, choose the row whose basic variable has the
lowest-numbered index.

This guarantees that the cycle does not occur.
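A sketch of Bland's selection rule for a tableau T laid out as in the previous lectures (0-based NumPy indexing; the function name and the list `basis` of basic variable indices, one per tableau row, are our own conventions):

```python
import numpy as np

def blands_rule(T, basis):
    """Return (entering column, pivot row) chosen by Bland's rule,
    (None, None) if the tableau is optimal, or (j, None) if unbounded."""
    # 1. lowest-numbered nonbasic column with negative reduced cost
    entering = None
    for j in range(1, T.shape[1]):
        if T[0, j] < -1e-10:
            entering = j
            break
    if entering is None:
        return None, None
    # 2. ratio test; ties broken by the lowest-numbered basic variable
    best_key, best_row = None, None
    for i in range(1, T.shape[0]):
        if T[i, entering] > 1e-10:
            key = (T[i, 0] / T[i, entering], basis[i - 1])
            if best_key is None or key < best_key:
                best_key, best_row = key, i
    if best_row is None:
        return entering, None
    return entering, best_row
```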

4 Finding an initial BFS

We have described how a typical iteration of the simplex method looks like as well as when and how
to terminate the simplex algorithm. But we have yet to find an initial BFS for the initialisation of
the simplex algorithm. An initial BFS must be found before the simplex method can be initialised.
For a problem of the form

min cT x
st Ax = b                                                (3)
x ≥ 0                                                    (4)

this can be accomplished by introducing one artificial variable yi per constraint; their columns form
an identity matrix and can serve as the initial basis. (If bi < 0, the i-th equation is multiplied by
−1 beforehand, which does not change the feasible region, so we may assume b ≥ 0.) The new
objective function thus becomes

min Σ_{i=1}^{m} yi

and the new constraint functions are:

st Ax + y = b                                            (5)
x ≥ 0                                                    (6)
y ≥ 0                                                    (7)

This LP problem is known as auxiliary LP. For this problem, we obtain an initial BFS x = 0, y = b.
Using this BFS as the initial BFS, we can run the simplex algorithm on the auxiliary LP to obtain
the minimum cost for the objective function. If the optimum solution for the auxiliary LP has:

• y = 0 and x = x∗ , then x∗ satisfies Ax = b, x∗ ≥ 0, and it can be used as the initial BFS for
the initialisation of the simplex algorithm on the original problem;

• y ̸= 0, then there is no x ≥ 0 satisfying Ax = b, and the original LP is infeasible.
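A small NumPy sketch of setting up this auxiliary problem (the function name is ours; it only builds the data, it does not run the simplex method):

```python
import numpy as np

def auxiliary_lp(A, b):
    """Return (A_aux, b_aux, c_aux, basis) for the auxiliary problem of Ax = b, x >= 0."""
    A = np.array(A, dtype=float)
    b = np.array(b, dtype=float)
    neg = b < 0
    A[neg, :] *= -1                          # negate rows so that b >= 0
    b[neg] *= -1
    m, n = A.shape
    A_aux = np.hstack([A, np.eye(m)])        # artificial columns form an identity matrix
    c_aux = np.concatenate([np.zeros(n), np.ones(m)])   # minimize the sum of the y_i
    basis = list(range(n, n + m))            # the artificial variables are the initial basis
    return A_aux, b, c_aux, basis            # x = 0, y = b is the corresponding BFS
```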

5 Conclusion

In this lecture we have seen a running example of a tableau method as an overview of the revised
simplex method.We have also addressed the issue of cycling in simplex method and studied an
algorithm for finding the initial BFS for initialising the simplex algorithm.

References

[1] Cycling in linear programming problems by Saul I. Gassa; Sasirekha Vinjamuri

[2] J.A.J.Hall, K.I.M.McKinnon; The simplest examples where the simplex method cycles and
conditions where EXPAND fails to prevent cycling

MTL103: Optimization Methods and Applications Spring 2023

Lecture 14 — 6 Feb, 2024


Lecturer: Prof. Minati De

1) Astha Lohia 2022MT11927


2) Kaneesha Jain 2022MT11929
3) Shreya Jain 2022MT11914
4) Tanya Jain 2022MT11935

1 Overview

For the past few lectures, we focused on the development of the Simplex method which is an
algorithm for solving linear programming problems in standard form.
In this lecture we focus on Duality theory which leads to another algorithm for linear program-
ming. We start with a linear programming problem, the original, and introduce another linear
programming problem, called the dual problem.

2 Duality Theory

2.1 Introduction

We begin by understanding the motivation behind Duality theory which is an extension of Lagrange
multiplier method, used in calculus to minimise a function subject to equality constraints.
As we do in Lagrange multiplier method,in linear programming, we associate a price variable with
each constraint and start searching for prices under which the presence or absence of the constraints
does not affect the optimal cost. It turns out that the right prices can be found by solving a new
linear programming problem, called the dual of the original.

2.1.1 Original problem

min_x f0 (x)
subject to: fi (x) ≤ 0 for i = 1, 2, . . . , m
            hi (x) = 0 for i = 1, 2, . . . , p

The set D is defined as the intersection of the domains of fi for i = 0, 1, . . . , m and the domains of
hi for i = 1, 2, . . . , p:

D = ( ∩_{i=0}^{m} domain(fi ) ) ∩ ( ∩_{i=1}^{p} domain(hi ) )

D is assumed to be a non-empty set.

2.1.2 Lagrangian

The Lagrangian function L : Rn × Rm × Rp → R is given by:


L(x, λ, ν) = f0 (x) + Σ_{i=1}^{m} λi fi (x) + Σ_{i=1}^{p} νi hi (x)

where λ ∈ Rm and ν ∈ Rp .

2.1.3 Lagrangian dual function

The Lagrange Dual Function g(λ, ν) is defined as:


g(λ, ν) = inf_{x∈D} ( f0 (x) + Σ_{i=1}^{m} λi fi (x) + Σ_{i=1}^{p} νi hi (x) )

Question: Is this function concave?


Answer: Yes. For each fixed x, L(x, λ, ν) is an affine function of (λ, ν), i.e. a function of the form
aT (λ, ν) + b. The pointwise infimum of a family of affine functions is always concave.

Let λ ≥ 0, and let p∗ be the optimal value of the original problem (refer to Subsection 2.1.1).
Then, we will prove:
g(λ, ν) ≤ p∗
Proof
Let x∗ be a feasible solution of the original problem. Then,

fi (x∗ ) ≤ 0 for i = 1, 2, . . . , m
hi (x∗ ) = 0 for i = 1, 2, . . . , p

Thus, since λ ≥ 0,

Σ_{i=1}^{m} λi fi (x∗ ) + Σ_{i=1}^{p} νi hi (x∗ ) ≤ 0,

that is, L(x∗ , λ, ν) − f0 (x∗ ) ≤ 0, i.e. L(x∗ , λ, ν) ≤ f0 (x∗ ). Since

g(λ, ν) = inf_{x∈D} L(x, λ, ν) ≤ L(x∗ , λ, ν),

we get g(λ, ν) ≤ f0 (x∗ ) for every feasible x∗ . Hence,

g(λ, ν) ≤ p∗

2.1.4 Lagrange Dual Problem

We wish to obtain the best possible lower bound on p∗ , which leads to the Lagrange dual problem:

max g(λ, ν)
subject to: λ ≥ 0

Let d∗ be the optimal value of this dual problem. Then

d∗ ≤ p∗   (the weak duality property),

and p∗ − d∗ is termed the optimal duality gap.

3 Standard Form Problems

min c′ x
subject to: Ax = b
            x ≥ 0

L(x, λ, ν) = c′ x − Σ_{i=1}^{n} λi xi + ν ′ (Ax − b)
           = −b′ ν + (A′ ν − λ + c)′ x

g(λ, ν) = inf_x L(x, λ, ν)
        = −b′ ν + inf_x (A′ ν − λ + c)′ x

        = −b′ ν   if A′ ν − λ + c = 0
        = −∞      otherwise

The dual problem, max g(λ, ν) subject to λ ≥ 0, is therefore

max −b′ ν
such that A′ ν − λ + c = 0 , λ ≥ 0

⇐⇒   max −b′ ν
     such that A′ ν + c ≥ 0

Let −ν = z. Then −b′ ν = b′ z = z ′ b, and the problem becomes

max z ′ b
such that A′ z ≤ c
          z free

This is now converted to a dual problem.
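As a quick sanity check on this recipe, with a small instance of our own (not from the lecture): for the primal min x1 + 2x2 subject to x1 + x2 = 1, x ≥ 0, we have A = [1 1], b = 1 and c = (1, 2), so the dual reads max z subject to z ≤ 1 and z ≤ 2 with z free; its optimum z∗ = 1 equals the primal optimum, attained at x = (1, 0).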

4 General Form Problems

min c′ x
subject to: Ax ≤ b

L(x, λ) = c′ x + λ′ (Ax − b)
        = −b′ λ + (A′ λ + c)′ x

g(λ) = −b′ λ + inf_x (A′ λ + c)′ x

     = −b′ λ   if A′ λ + c = 0
     = −∞      otherwise

The dual problem, max g(λ) subject to λ ≥ 0, is therefore

max −b′ λ
such that A′ λ + c = 0 , λ ≥ 0

Let −λ = z. Then −b′ λ = b′ z = z ′ b, and the problem becomes

max z ′ b
such that A′ z = c
          z ≤ 0

This is now converted to a dual problem.

References

[1] Dimitris Bertsimas, John N.Tsitsiklis, Introduction To Linear Optimization

[2] Stephen Boyd, Lieven Vandenberghe, Convex Optimization

MTL103: Optimization Methods and Applications Spring 2024

Lecture 15 — 8 Feb, 2024


Lecturer: Prof. Minati De

Scribes:

1. Mehta Maalav Pranav, 2022MT11265

2. Rohan Roy, 2022MT11294

3. Adarsh Singh, 2022MT11285

4. Abhishek Singh, 2022MT11934

1 Overview

In the last lecture we saw how we can formulate a dual problem of a given optimization problem
(referred to as the original/primal problem).
We then formulated the dual problem for the specific cases where the primal is a linear programming
problem in canonical form or in the standard form.
We also saw that the optimal cost for the dual problem is always less than or equal to the optimal
cost of the original problem.
Since the dual problem is always convex irrespective of the primal, we can find a lower bound on the
optimal cost of the primal if we are able to solve convex optimization problems. (By convention,
the primal is a minimisation problem and its dual is a maximisation problem). Duality gap is
defined as the difference between the optimal cost of primal and the optimal cost of dual.
In this lecture we will focus on the dual of linear programming problems and study their optimal
costs.

2 The Dual Problem

Primal Problem: Dual Problem:

Minimize: c′ x Maximize: p′ b
Subject to: a′i x ≥ bi ∀i ∈ M1 Subject to: pi ≥ 0 ∀i ∈ M1
a′i x ≤ bi ∀i ∈ M2 pi ≤ 0 ∀i ∈ M2
a′i x = bi ∀i ∈ M3 pi free ∀i ∈ M3

xj ≥ 0 ∀j ∈ N1                    p′ Aj ≤ cj ∀j ∈ N1

xj ≤ 0 ∀j ∈ N2                    p′ Aj ≥ cj ∀j ∈ N2

xj free ∀j ∈ N3                   p′ Aj = cj ∀j ∈ N3

PRIMAL minimize maximise DUAL
≥ bi ≥0
constraints ≤ bi ≤0 variables
= bi free
≥0 ≤ cj
variables ≤0 ≥ cj constraints
free = cj

Table 1: Relation between primal and dual variables and constraints

Let’s look at an example.


Primal:
minimize 5x1 + 3x2 + 7x3
subject to 4x1 + 7x2 ≤ 7
x2 + x3 = 8
x1 + x3 ≥ 10
x1 ≥ 0
x2 ≤ 0
x3 free

Dual:
maximize 7z1 + 8z2 + 10z3
subject to z1 ≤ 0
z2 free
z3 ≥ 0
4z1 + z3 ≤ 5
7z1 + z2 ≥ 3
z2 + z3 = 7
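As a quick numeric illustration of weak duality on this pair (the feasible points below are our own ad hoc choices, not from the lecture):

```python
import numpy as np

c = np.array([5, 3, 7])
A = np.array([[4, 7, 0],     # <= 7
              [0, 1, 1],     # = 8
              [1, 0, 1]])    # >= 10
b = np.array([7, 8, 10])

x = np.array([2, -1, 9])     # primal feasible: 8-7=1<=7, -1+9=8, 2+9=11>=10, x1>=0, x2<=0
z = np.array([0, 3, 4])      # dual feasible: z1<=0, z3>=0, 4z1+z3=4<=5, 7z1+z2=3>=3, z2+z3=7
print(z @ b, "<=", c @ x)    # 64 <= 70, as weak duality requires
```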

2.1 Dual of dual is primal

Now, we will show for the above example that the dual of dual is primal. First, convert the
maximization problem to a minimization problem.
Dual:
minimize − 7z1 − 8z2 − 10z3
subject to 4z1 + z3 ≤ 5
7z1 + z2 ≥ 3
z2 + z3 = 7
z1 ≤ 0
z2 free
z3 ≥ 0

Dual of Dual:

maximize 5y1 + 3y2 + 7y3


subject to y1 ≤ 0
y2 ≥ 0
y3 free
4y1 + 7y2 ≥ −7
y2 + y3 = −8
y1 + y3 ≤ −10

Which is equivalent to the original problem (just replace yi = −xi ).


This in fact is true for all linear programming problems, regardless of the form in which they are written.

2.2 Duals of equivalent problems are equivalent

We also show this with an example. Consider the following primal and dual pairs-
Primal Problem: Dual Problem:

Minimize: c′ x Maximize: p′ b
Subject to: Ax ≥ b Subject to: p ≥ 0
x free p′ A = c′
Primal Problem: Dual Problem:

Minimize: c′ x + 0′ s Maximize: p′ b
Subject to: Ax − s = b Subject to: p free
x free p′ A = c′
s≥0 −p≤0
It is clear that both primal problems are equivalent. In the dual of the latter, the condition that p is
free combined with −p ≤ 0 amounts to p ≥ 0, so the two dual problems are equivalent as well.

3 The Duality Theorem

3.1 Weak Duality

Theorem: If x is a feasible solution to the primal problem and p is a feasible solution to the dual
problem, then
p′ b ≤ c′ x

Proof: For any vectors x and p, define ui = pi (a′i x − bi ) and vj = (cj − p′ Aj )xj .
Suppose that x and p are primal and dual feasible, respectively. The definition of the dual problem
requires the sign of pi to be the same as the sign of a′i x − bi , and the sign of cj − p′ Aj to be the
same as the sign of xj . Thus, all ui and vj are non-negative.
Notice that

Σ_i ui = p′ Ax − p′ b

and

Σ_j vj = c′ x − p′ Ax.

We add these two equalities and use the non-negativity of ui and vj to obtain

0 ≤ Σ_i ui + Σ_j vj = c′ x − p′ b,

and hence the weak duality theorem.

3.2 Corollaries

• If the optimal cost in the primal is −∞, then the dual problem must be infeasible.
Proof: Suppose that the optimal cost in the primal problem is −∞ and that the dual problem
has a feasible solution p. By weak duality, p satisfies p′ b ≤ c′ x for every primal feasible x.
Taking the minimum over all primal feasible x, we conclude that p′ b ≤ −∞. This is impossible
and shows that the dual cannot have a feasible solution.

• If the optimal cost in the dual is +∞, then the primal problem must be infeasible.
Proof: Symmetric argument as above.

• Let x and p be feasible solutions to the primal and the dual, respectively, and suppose
p′ b = c′ x. Then, x and p are optimal solutions to the primal and the dual, respectively.
Proof: For every primal feasible y, the weak duality theorem yields c′ x = p′ b ≤ c′ y, which
proves x is optimal. The proof of optimality of p is similar.

3.3 Strong Duality

Theorem: If a linear programming problem has an optimal solution, so does its dual, and the
optimal costs are equal.

4 Conclusion

In this lecture, we saw how to directly find the dual of any LPP.
We also showed that the dual of the dual problem is the original problem, and showed that if
two formulations of an optimisation problem are equivalent, so are their duals.
Lastly, we saw the weak duality theorem and strong duality theorem for LPPs.

5 Fun Fact

Duality theory has a direct application in Supervised Machine Learning in computing the
classifier for Support Vector Machines (SVMs).[2]

References

[1] Bertsimas, Dimitris & Tsitsiklis, John. (1998). Introduction to Linear Optimization.

[2] https://www.adeveloperdiary.com/data-science/machine-learning/support-vector-machines-for-beginners-duality-problem/

MTL103: Optimization Methods and Applications Spring 2024

Lecture 16 — 13 Feb, 2024


Lecturer: Prof. Minati De

Scribes:

1. Dhruv Chaurasiya, 2022MT11172

2. Aniket Anand, 2022MT11259

3. Praveen Lakhara, 2022MT11280

4. Shivang Goyal, 2022MT11269

1 Overview

In the last lecture we discussed topics related to dual problem and duality theorems like, The weak
Duality, and Strong Duality and corollaries related to them.
In this lecture we will continue the discussion by analysing the relationship between different pos-
sibilities of solutions of Primal and its Dual, and then discuss Complementary slackness Property.

2 Possible Combinations of Solutions of Primal and Dual Problem

We know that for any Linear Programming problem, we have 3 possibilities.

• It has an Optimal Solution

• It is unbounded

• It is infeasible

For every possibility of the primal, we have 3 for the dual problem, which means there are 3 ∗ 3 = 9
possibilities.
Using the theorems of Weak duality and Strong duality, it is possible to determine which combi-
nations of the primal and dual are possible.
The Strong Duality Theorem states that if a primal problem has an optimal solution, then the
dual will also have an optimal solution, and vice versa.
Similarly, the Weak Duality Theorem states that if one of the problems is unbounded, then the
other is infeasible.
With these in hand, we proceed to create the following table:

Primal \ Dual       Optimal Solution    Unbounded     Infeasible
Optimal Solution    Possible            Impossible    Impossible
Unbounded           Impossible          Impossible    Possible
Infeasible          Impossible          Possible      Possible

Example: Consider the following problem:

minimize x1 + 2x2
subject to x1 + x2 = 1
2x1 + 2x2 = 3

This problem is infeasible, because the two constraints cannot be simultaneously satisfied by any
vector (x1 , x2 ) .
Its dual is:

maximize y1 + 3y2
subject to y1 + 2y2 = 1
y1 + 2y2 = 2

This problem is also infeasible, as no possible (y1 , y2 ) can satisfy both the constraints simultane-
ously.

3 Complementary Slackness

Theorem: Let x and p be feasible solutions to the primal and the dual problem, respectively.
The vectors x and p are optimal solutions for the two respective problems if and only if:


pi (a′i x − bi ) = 0, ∀i,

(cj − p′ Aj )xj = 0, ∀j.


Proof: We defined ui = pi (ai x − bi ) and vj = (cj − p′ Aj )xj , and noted that for x primal feasible
and p dual feasible, we have ui ≥ 0 and vj ≥ 0 for all i and j. In addition, we showed that,

c′ x − p′ b = Σ_i ui + Σ_j vj .

By the strong duality theorem, if x and p are optimal, then c′ x = p′ b, which implies that ui =
vj = 0 for all i, j. Conversely, if ui = vj = 0 for all i, j, then c′ x = p′ b, and Corollary 3 (Lecture
15) implies that x and p are optimal.

The first complementary slackness condition is automatically satisfied by every feasible solution to
a problem in standard form. If the primal problem is not in standard form and has a constraint

like ai x ≥ bi , the corresponding complementary slackness condition asserts that the dual variable
pi is zero unless the constraint is active. An intuitive explanation is that a constraint which is not
active at an optimal solution can be removed from the problem without affecting the optimal cost,
and there is no point in associating a nonzero price with such a constraint.
If the primal problem is in standard form and a non degenerate optimal basic feasible solution is
known, the complementary slackness conditions determine a unique solution to the dual problem.
We illustrate this fact in the next example.
Example: Consider a problem in standard form and its dual:

minimize 13x1 + 10x2 + 6x3
subject to 5x1 + x2 + 3x3 = 8
           3x1 + x2 = 3
           x1 , x2 , x3 ≥ 0

maximize 8p1 + 3p2
subject to 5p1 + 3p2 ≤ 13
           p1 + p2 ≤ 10
           3p1 ≤ 6

Solution: As will be verified shortly, the vector x∗ = (1, 0, 1) is a non degenerate optimal
solution to the primal problem. Assuming this to be the case, we use the complementary

slackness conditions to construct the optimal solution to the dual. The condition pi (ai x∗ − bi ) = 0
is automatically satisfied for each i, since the primal is in standard form. The condition
(cj − p′ Aj)x∗j = 0 is clearly satisfied for j=2, because x∗2 = 0. However, since x∗1 > 0 and x∗3 > 0,
we obtain

5p1 + 3p2 = 13

and

3p1 = 6

which we can solve to obtain p1 = 2 and p2 = 1. Note that this is a dual feasible solution whose
cost is equal to 19, which is the same as the cost of x∗ . This verifies that x∗ is indeed an optimal
solution as claimed earlier.
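The same computation can be checked numerically (a small NumPy snippet; the two equations are exactly the active dual constraints identified above):

```python
import numpy as np

M = np.array([[5.0, 3.0],    # 5 p1 + 3 p2 = 13   (since x1* > 0)
              [3.0, 0.0]])   # 3 p1        = 6    (since x3* > 0)
p = np.linalg.solve(M, np.array([13.0, 6.0]))
print(p)                                             # [2. 1.]
print(p @ np.array([8.0, 3.0]))                      # dual cost 8*2 + 3*1 = 19
print(np.array([13.0, 10.0, 6.0]) @ np.array([1.0, 0.0, 1.0]))   # primal cost = 19
```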

We now generalize the above example. Suppose that xj is a basic variable in a non degenerate
optimal basic feasible solution to primal problem in standard form. Then, the complementary
slackness condition (cj − p′ Aj)x∗j = 0 yields cj = p′ Aj for every such j. Since the basic columns
Aj are linearly independent, we obtain a system of equations for p which has a unique solution,
namely, p′ = c′B B−1 . A similar conclusion can also be drawn for problems not in standard form.
On the other hand, if we are given a degenerate optimal basic feasible solution to the primal,
complementary slackness may be of very little help in determining an optimal solution to the dual
problem.
We finally mention that if the primal constraints are of the form Ax ≥ b, x ≥ 0, and the primal
problem has an optimal solution, then there exist optimal solutions to the primal and the dual
which satisfy strict complementary slackness; that is, a variable in one problem is nonzero if and
only if the corresponding constraint in the other problem is active.

4 Conclusion

In this lecture, we found out which combination of primal and dual problems can exist.
We also saw the Complementary Slackness and theorems related to it.

5 References

1. Introduction to Linear Optimization by Dimitris Bertsimas and John Tsitsiklis

2. Lecture slides for the course, MTL103: Optimization Methods and Applications by Prof.
Minati De

MTL103: Optimization Methods and Applications Spring 2023

Lecture 17 — DATE 16, 2023


Lecturer: Prof. Minati De

Scribes:

1. Nikhil Unavekar 2020CS10363

2. Kinjal Anchhara 2021MT60959

3. Sakshi Magarkar 2021MT60965

4. Kosles Gautam 2021MT10908

1 Overview

In last lecture, we found out which combination of primal and dual problems can exist. We also
saw the Complementary Slackness and theorems related to it.
In this lecture, we concentrate on the case where the primal problem is standard form. We develop
the dual simplex method, which is an alternative to the simplex method.

2 Dual Simplex Method and its Examples

The algorithm which maintains primal feasibility and works towards obtaining dual feasibility is
called Primal algorithm. The algorithm which maintains dual feasibility all throughout and moves
towards obtaining primal feasibility is called dual algorithm. The conventional (primal) simplex
method is an algorithm designed to preserve primal feasibility while aiming for dual feasibility.
It begins with a primal feasible solution and endeavors to achieve dual feasibility while uphold-
ing complementary slackness. In contrast, the dual simplex method mirrors the primal simplex
approach. It commences with a dual feasible solution associated with a basis B and progresses to-
wards establishing the corresponding primal solution’s feasibility while preserving complementary
slackness.
Let’s examine a problem formulated in standard form, assuming the usual condition that the rows
of matrix X are linearly independent. Suppose we have a basis matrix Y, comprising m linearly
independent columns of X, and let’s analyze the associated table

−c′Y Y −1 b     c̄′
Y −1 b          Y −1 X

More precisely,

−c′Y xY      c̄1   c̄2   · · ·   c̄n
xY (1)        |                  |
  ..          Y −1 X1   · · ·   Y −1 Xn
xY (m)        |                  |

where xY = Y −1 b denotes the vector of values of the basic variables.

We don’t require Y −1 b to be nonnegative, so we have a basic, but not necessarily feasible, solution
to the primal problem. However, we assume that c̄ ≥ 0, which implies that the vector p′ = c′Y Y −1
satisfies p′ X ≤ c′ , providing us with a feasible solution to the dual problem.
The cost of this dual feasible solution is given by

p′ b = c′Y Y −1 b = c′Y xY .

This corresponds to the negative of the entry at the upper left corner of the table.
If Y −1 b ≥ 0 holds, we also have a primal feasible solution with the same cost, and optimal solutions
to both problems are found. However, if the inequality Y −1 b ≥ 0 fails to hold, we proceed with a
change of basis. We find some l such that xY (l) < 0 and consider the l-th row of the table, known
as the pivot row. This row takes the form (xY (l) , v1 , v2 , . . . , vn ), where vi is the l-th component of
Y −1 Xi . For each i with vi < 0 (if such i exist), we calculate the ratio c̄i /|vi | and select an index j for
which this ratio is the smallest; that is, vj < 0 and c̄j /|vj | is minimized. We call the corresponding
entry vj the pivot element. We then perform a change of basis: column Xj enters the basis, and
column XY (l) exits. This change of basis, or pivot, is executed exactly as in the primal simplex
method: we add to each row of the table a multiple of the pivot row so that all entries in the pivot
column are set to zero, except for the pivot element, which is set to 1. In particular, to set the
reduced cost in the pivot column to zero, we multiply the pivot row by c̄j /|vj | and add it to the zeroth
row. For every i, the new value of c̄i is

c̄i + vi c̄j /|vj | ,

which is nonnegative due to the way j was selected. Consequently, the reduced costs in the new
table will also be nonnegative, and dual feasibility is maintained.

Iteration of Dual Simplex Method

1. The iteration starts with the table associated with a basis matrix Y and with all reduced
costs nonnegative.

2. Examine the components of the vector Y −1 b in the zeroth column of the table. If they are all
nonnegative, we have an optimal basic feasible solution and the algorithm terminates; else,
choose some l such that xY (l) < 0.

3. Consider the l-th row of the table, with elements xY (l) , v1 , . . . , vn (the pivot row). If vi ≥ 0
for all i, then the optimal dual cost is positive infinity, and the algorithm terminates.

4. For each i such that vi < 0, compute the ratio c̄i /|vi | and let j be the index of a column that
corresponds to the smallest ratio. The column XY (l) exits the basis, and the column Xj takes
its place.

5. Add to each row of the table a multiple of the l-th row (the pivot row) so that vj (the pivot
element) becomes 1 and all other entries of the pivot column become 0.
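A NumPy sketch of one such iteration (our own function name and 0-based indexing; the tableau T is laid out as above, with the values Y −1 b in column zero and the reduced costs in row zero). Applied to the initial tableau of the example that follows, a single call reproduces the pivot carried out there.

```python
import numpy as np

def dual_simplex_pivot(T):
    """One dual simplex pivot; T is modified in place.
    Returns "optimal", "primal infeasible" or "pivoted"."""
    x_Y = T[1:, 0]
    if np.all(x_Y >= -1e-10):
        return "optimal"                         # Step 2: Y^{-1} b >= 0
    l = 1 + int(np.argmin(x_Y))                  # a row with x_Y(l) < 0 (here: the most negative)
    v = T[l, 1:]
    if np.all(v >= -1e-10):
        return "primal infeasible"               # Step 3: optimal dual cost is +infinity
    ratio, j = min((T[0, k + 1] / abs(v[k]), k + 1) for k in range(v.size) if v[k] < -1e-10)
    T[l, :] /= T[l, j]                           # Step 5: make the pivot element equal to 1
    for i in range(T.shape[0]):
        if i != l:
            T[i, :] -= T[i, j] * T[l, :]
    return "pivoted"

# Initial tableau of the example below: one pivot reaches the optimal table (corner -3).
T = np.array([
    [ 0.,  2.,  6., 10., 0., 0.],
    [ 2., -2.,  4.,  1., 1., 0.],
    [-1.,  4., -2., -3., 0., 1.],
])
print(dual_simplex_pivot(T))   # "pivoted"
print(dual_simplex_pivot(T))   # "optimal"
```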

Example Question

Problem. Minimize 2x1 + 6x2 + 10x3


Subject to:

−2x1 + 4x2 + x3 + x4 = 2
−4x1 + 2x2 + 3x3 − x5 = 1
x1 , x2 , x3 ≥ 0
x4 , x5 ≥ 0

Solution:
Consider the following table for this -

          x1    x2     x3    x4    x5
  0         2     6     10     0     0
x4 = 2     −2     4      1     1     0
x5 = −1     4    −2∗    −3     0     1
Due to the negative value of xY (2) , we opt for the second row as the pivot row. Negative entries
are present in the second and third columns of the pivot row. By comparing the corresponding
ratios 6/|−2| = 3 and 10/|−3| = 10/3, we find that the smallest ratio is 6/|−2|, indicating that the
second column enters the basis. The pivot element is denoted by an asterisk. We proceed by
multiplying the pivot row by 3 and adding it to the zeroth row, and by multiplying the pivot row
by 2 and adding it to the first row. Subsequently, we divide the pivot row by −2. This results in
the new table:

          x1    x2     x3    x4     x5
 −3        14     0      1     0      3
x4 = 0      6     0     −5     1      2
x2 = 1/2   −2     1     3/2    0    −1/2

The total cost has risen to 3. Additionally, we now observe that Y −1 b ≥ 0, indicating the
attainment of an optimal solution.
It’s noteworthy that the pivot element vj is consistently selected to be negative, while the corre-
sponding reduced cost c̄j remains nonnegative. Suppose momentarily that c̄j is indeed positive. In
such a scenario, to nullify c̄j , we must introduce a positive multiple of the pivot row to the zeroth
row. Given the negativity of xY (l) , this results in the addition of a negative value to the upper left
corner. Consequently, the dual cost experiences an increase. Thus, as long as the reduced cost of

any nonbasic variable remains non-zero, each alteration in the basis leads to a rise in the dual cost,
precluding the repetition of any basis configuration throughout the algorithm’s execution.
Consequently, the algorithm is guaranteed to conclude eventually, and this can unfold in one of two
ways: (a) We ascertain Y −1 b ≥ 0, signaling the presence of an optimal solution. (b) All entries
v1 , v2 , . . . , vn in the pivot row are nonnegative, rendering the identification of a pivot element
unattainable. Analogous to the primal simplex method, this scenario signifies that the optimal
dual cost attains a value of +∞, denoting the primal problem’s infeasibility.

Conclusion

At this juncture, one may naturally ponder over the circumstances where the dual simplex method
proves advantageous. A pertinent scenario emerges when a basic feasible solution of the dual
problem is readily accessible. For instance, consider a scenario where we possess an optimal basis
for a linear programming problem and seek to address the same problem with a modified right-
hand side vector b. The optimal basis for the original problem might become primal infeasible with
the updated value of b. However, the alteration in b does not impact the reduced costs, thereby
ensuring the persistence of a dual feasible solution. Hence, rather than embarking on solving the
new problem afresh, it may be advantageous to employ the dual simplex algorithm, commencing
from the optimal basis established for the original problem.

References

[1] Introduction to Linear Optimization by Dimitris Bertsimas and John Tsitsiklis

[2] Lecture slides for the course, MTL103: Optimization Methods and Applications by Prof.
Minati De
