
A Deep Learning Approach for Linear Complementarity Problems

Wissam AlAli

February 14, 2025

1 A New Learning Algorithm Based on DC Programming


In previous sections, we established the Rectified Convex Relaxation (ReCR) for solving Linear Complementarity Problems (LCPs). Its development was guided by the universal relaxation theory, which provides a theoretical foundation by leveraging convex relaxations and suitably chosen objective functions. However, while ReCR converges under mild assumptions (e.g., boundedness of the iterates), it does not necessarily ensure a strict decrease of the bilinear objective at each step and can stagnate on certain instances.

To overcome some of these limitations, we now propose an alternative scheme that recasts our bilinear relaxation model as a difference-of-convex (DC) program. DC algorithms, often referred to as DCA, are well known for effectively handling non-convex objectives that admit a decomposition into convex and concave parts [6, 7, 8, 9].
Recall from Theorem 3.1 (and Corollary 3.1) of our universal relaxation theory that for an LCP, there exist specific linear objective functions and corresponding dual variables that can lead us to feasible solutions. In particular, if c = Mx for some feasible x, solving

    min c⊤x − r⊤y  subject to  Mx ≥ r, M⊤y ≤ c, x ≤ y, x, y ≥ 0

preserves a primal-dual structure crucial for capturing the complementarity constraint x⊤(Mx − r) = 0.
Recall the bilinear relaxation:

    min_{x, y, c}  c⊤x − r⊤y
    subject to     Mx ≥ r,
                   M⊤y ≤ c,                        (3.8)
                   x ≤ y,
                   x, y ≥ 0,

where M ∈ R^{n×n} and r ∈ R^n. Invoking Theorem 3.1, a particularly effective choice is setting c = Mx. Substituting c = Mx into (3.8) yields

    min_{x, y}  x⊤Mx − r⊤y
    subject to  M⊤y ≤ M⊤x,
                x ≤ y,                             (3.17)
                Mx ≥ r,
                x, y ≥ 0.

The resulting objective

    x⊤Mx − r⊤y

can be decomposed into a difference of convex functions, making (3.17) suitable for algorithms grounded in DC programming [6, 7, 8, 9].
One might wonder why we do not simply decompose the following problem and apply DC algorithms from the literature, such as those in [10], as a stand-alone approach:

    min x⊤Mx − r⊤x  subject to  Mx ≥ r, x ≥ 0.

While this simpler model reflects the LCP constraints and its global solution is a feasible LCP solution (provided the LCP is feasible) [10], it does not incorporate any dual-like component. By contrast, the model (3.17) explicitly introduces y via

    M⊤y ≤ M⊤x,  x ≤ y,

which encodes dual-type constraints aligned with the universal relaxation theory. Consequently, a pair (x, y) that solves (3.17) is far more likely to satisfy the complementarity condition x⊤(Mx − r) = 0. Incorporating y thus enforces a primal-dual perspective, leading to more robust identification of genuinely feasible LCP solutions.
To illustrate that relying solely on a primal formulation, like the one proposed in [10], can still yield infeasible LCP solutions in practice, consider

    M = [  4   2   0   3 ]        r = [  1 ]
        [ -1   4  -3  -6 ]            [ -2 ]
        [  1  -1   1   1 ]            [  1 ]
        [  0   1   0   5 ]            [  1 ]

Decomposing the objective function and applying a DC algorithm, the method can converge to a spurious point such as

    x = (0.53, 0.33, 0.66, 0.13)⊤,

which fails to satisfy x⊤(Mx − r) = 0. Indeed, the sequence may stabilize at this point even though it is not LCP-feasible. By comparison, our full model (3.17) maintains the dual-style constraints x ≤ y, M⊤y ≤ M⊤x, substantially reducing the risk of converging to such infeasible points.
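As a quick sanity check (a small sketch in Python with NumPy, using the data above), one can evaluate the complementarity residual at the reported point:

    import numpy as np

    # Data from the counterexample above.
    M = np.array([[ 4,  2,  0,  3],
                  [-1,  4, -3, -6],
                  [ 1, -1,  1,  1],
                  [ 0,  1,  0,  5]], dtype=float)
    r = np.array([1.0, -2.0, 1.0, 1.0])
    x = np.array([0.53, 0.33, 0.66, 0.13])

    # Complementarity residual x^T (Mx - r); zero at a genuine LCP solution.
    print(x @ (M @ x - r))   # roughly 1.15, clearly nonzero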
Before discussing the DCA-based algorithms for solving our LCP formulation, we provide a
concise overview of DC programming and DCA. For a comprehensive treatment of this topic, we
refer the reader to [6, 7, 8, 9] and the references therein.

1.1 An Introduction to DC Programming and DCA


DC programming addresses optimization problems of the form

    α = inf { f(x) = g(x) − h(x) : x ∈ X },        (1)

where g, h are convex, lower semicontinuous (l.s.c.) functions on R^p, and X ⊆ R^p is convex. A function f = g − h is called a DC function, and g − h is a DC decomposition of f. The set X can be embedded into the objective via an indicator function if desired.
When either g or h is polyhedral (i.e., the pointwise supremum of finitely many affine functions), problem (1) is called a polyhedral DC program. In such cases, the concave part −h is almost everywhere differentiable, and subgradient-based strategies become straightforward to implement.

Subdifferentials. For a proper, l.s.c. convex function θ on R^p, its subdifferential at x ∈ dom(θ) is

    ∂θ(x) := { y ∈ R^p : θ(z) ≥ θ(x) + ⟨z − x, y⟩ for all z ∈ R^p }.

If θ is differentiable at x, then ∂θ(x) = {∇θ(x)}. A necessary local optimality condition for the DC program min{ g(x) − h(x) } is

    ∂h(x*) ⊂ ∂g(x*),

and in many important classes of DC programs (e.g., polyhedral ones), this condition also suffices for local optimality [7, 8, 9].

Duality. The dual of f(x) = g(x) − h(x) takes the form

    α_D = inf { h*(y) − g*(y) : y ∈ R^p },

where g* and h* are the convex conjugates of g and h, respectively. Under mild conditions, strong duality holds, i.e., α_D = α. This symmetry between primal and dual DC programs is central to the theoretical underpinnings of DCA.
The DC Algorithm (DCA) exploits the structure f = g − h by successively linearizing −h, leading to a series of convex subproblems in g. A typical iteration of DCA is as follows [10]:

(S0) Initialization. Pick an initial point x^(0) ∈ X. Set k = 0.

(S1) Subgradient step. Compute a subgradient y^(k) ∈ ∂h(x^(k)). (If h is differentiable at x^(k), then y^(k) = ∇h(x^(k)).)

(S2) Convex minimization. Solve the convex problem

    x^(k+1) ∈ argmin_{x ∈ X} { g(x) − ⟨x − x^(k), y^(k)⟩ − h(x^(k)) },

yielding the next iterate x^(k+1).

(S3) Update. Set k ← k + 1 and repeat from (S1) until a stopping criterion (e.g., a critical-point condition or an iteration limit) is met.

DCA is a descent method without explicit line searches: the sequence {g(x^(k)) − h(x^(k))} is nonincreasing. In practice, DCA often converges linearly for polyhedral DC programs and exhibits strong empirical performance on a wide range of nonconvex applications [6, 7, 8, 9].
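To make steps (S0)-(S3) concrete, here is a minimal one-dimensional illustration in Python (a toy problem of our own choosing, not from the paper): minimize f(x) = x² − 2|x| with g(x) = x² and h(x) = 2|x|, whose global minima are x = ±1 with f(±1) = −1.

    import numpy as np

    def dca_toy(x0, max_iter=50, tol=1e-8):
        x = x0
        for _ in range(max_iter):
            y = 2.0 * np.sign(x)        # (S1) subgradient of h(x) = 2|x|
            x_new = y / 2.0             # (S2) argmin_x { x**2 - y*x } = y/2
            if abs(x_new - x) <= tol:   # (S3) stop at a critical point
                break
            x = x_new
        return x

    print(dca_toy(0.3))   # -> 1.0, a global minimizer here

Note that starting exactly at x = 0 would stall at the critical point 0, which is why the choice of x^(0) matters in practice.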

1.2 Applying DCA to Our LCP Formulation


We now focus on the DC model introduced in (3.17) for the LCP:

    min F(x, y) = x⊤Mx − r⊤y
    subject to   M⊤y ≤ M⊤x,
                 Mx ≥ r,
                 x ≤ y,
                 x, y ≥ 0.

A representative decomposition is to split x⊤Mx into positive and negative parts with respect to M. Specifically, if we denote M⁺ = max(M, 0) and M⁻ = max(−M, 0) elementwise, then

    x⊤Mx = x⊤(M⁺ − M⁻)x = x⊤M⁺x − x⊤M⁻x.

Thus we may set

    f1(x, y) = x⊤M⁺x,    f2(x, y) = x⊤M⁻x + r⊤y,

and write

    F(x, y) = f1(x, y) − f2(x, y).
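As a quick sanity check of this elementwise split (a small NumPy sketch; the random data is ours), the identity above can be verified numerically:

    import numpy as np

    rng = np.random.default_rng(0)
    M = rng.uniform(-1, 1, size=(4, 4))
    x = rng.standard_normal(4)

    Mp, Mm = np.maximum(M, 0), np.maximum(-M, 0)   # elementwise split, M = M+ - M-
    assert np.isclose(x @ M @ x, x @ Mp @ x - x @ Mm @ x)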
Each DCA iteration then involves:

1. Computing a subgradient (or gradient, if differentiable) of f2 at the current iterate (x^(k), y^(k)).

2. Minimizing the convex function f1 − ⟨∇f2, (·)⟩ subject to the linear constraints, which updates (x^(k+1), y^(k+1)).

3. Repeating until convergence or a stopping criterion is met.

In implementation, each subproblem can be solved by a standard convex quadratic programming (QP) or linear programming (LP) solver, depending on the structure of M. The DC approach can thus be carried out with readily available software, and it leverages the complementary slackness enforced by x ≤ y and M⊤y ≤ M⊤x.
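Putting the pieces together, the following is a minimal sketch of the loop in Python with CVXPY (the experiments in Section 2 use Matlab/CVX with Gurobi; this is an illustrative translation, not our experimental code). The function name, starting point, and stopping rule are our own choices, and quad_form(..., assume_PSD=True) simply assumes the convexity that the decomposition requires, since the entrywise split M⁺ need not yield a PSD quadratic form in general.

    import numpy as np
    import cvxpy as cp

    def dca_lcp(M, r, max_iter=100, eps=1e-5):
        """DCA sketch for model (3.17): min x^T M x - r^T y over the set C."""
        n = len(r)
        Mp, Mm = np.maximum(M, 0), np.maximum(-M, 0)   # elementwise split M = M+ - M-
        Sp = (Mp + Mp.T) / 2          # symmetric part of M+, so x'M+x = x'Sp x
        Sm = Mm + Mm.T                # gradient matrix of x'M-x

        xk, Fk = np.ones(n), np.inf
        for _ in range(max_iter):
            gx = Sm @ xk                                 # grad_x f2; grad_y f2 = r
            x, y = cp.Variable(n), cp.Variable(n)
            # Linearized subproblem f1 - <grad f2, (x, y)>; the constant
            # -f2(x_k, y_k) is dropped since it does not affect the argmin.
            obj = cp.quad_form(x, Sp, assume_PSD=True) - gx @ x - r @ y
            cons = [M.T @ y <= M.T @ x, M @ x >= r, x <= y, x >= 0, y >= 0]
            cp.Problem(cp.Minimize(obj), cons).solve()
            xk, yk = x.value, y.value
            Fnew = xk @ M @ xk - r @ yk                  # objective of (3.17)
            if abs(Fk - Fnew) <= eps:                    # stopping test
                break
            Fk = Fnew
        return xk, yk

Here CVXPY's default solver stands in for the CVX/Gurobi stack used in our experiments.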
Now define the set C and consider the following proposition and theorem:

    C := { (x, y) ∈ R^{2n} : M⊤y ≤ M⊤x, Mx ≥ r, x ≤ y, x, y ≥ 0 }.    (2)

Proposition. Let C be the set defined in (2). The following statements are equivalent:

(i) x* ∈ R^n solves the LCP, i.e., x* ≥ 0, Mx* ≥ r, and x*⊤(Mx* − r) = 0.

(ii) (x*, x*) ∈ C yields a zero optimal value in (3.17).

Proof. (i) ⟹ (ii). If x* solves the LCP, then x* ≥ 0, Mx* ≥ r, and x*⊤(Mx* − r) = 0. Choosing y* = x*, the pair (x*, y*) belongs to C and satisfies

    F(x*, y*) = x*⊤Mx* − r⊤x* = x*⊤(Mx* − r) = 0.

Thus F(x*, x*) = 0 in (3.17).

(ii) ⟹ (i). Conversely, suppose (x*, x*) ∈ C attains the optimal value zero in (3.17). Membership in C gives x* ≥ 0 and Mx* ≥ r, so every term of x*⊤(Mx* − r) is nonnegative; the zero objective F(x*, x*) = x*⊤(Mx* − r) = 0 then forces complementarity. Hence x* solves the LCP.
Theorem. Consider the DC program

    min F(x, y) subject to (x, y) ∈ C,

where C is defined by the linear constraints M⊤y ≤ M⊤x, Mx ≥ r, x ≤ y, x, y ≥ 0. Let {(x^(k), y^(k))} be the sequence generated by our DCA scheme, where F is decomposed as F = f1 − f2 with f1(x, y) = x⊤M⁺x and f2(x, y) = x⊤M⁻x + r⊤y. Then the following properties hold:

(i) Descent Property. The sequence {F(x^(k), y^(k))} is nonincreasing. In particular,

    F(x^(k+1), y^(k+1)) ≤ F(x^(k), y^(k))

for all k.

(ii) Convergence to a Critical Point. If the optimal value of the above problem is finite, then any limit point (x*, y*) of the sequence {(x^(k), y^(k))} satisfies the necessary local optimality condition

    ∂( x⊤M⁻x + r⊤y )(x*, y*) ⊂ ∂( x⊤M⁺x )(x*, y*).

Equivalently, (x*, y*) is a critical point of F = f1 − f2.

(iii) Global Minimizer under Zero Objective. If F(x*, y*) = 0, then (x*, y*) is in fact a global minimizer of the above DC program. In particular, if F(x*, y*) = 0 is attained and (x*, y*) ∈ C, it follows that x* solves the original LCP, since

    x*⊤Mx* = r⊤x*  ⟹  x*⊤(Mx* − r) = 0,  together with  Mx* ≥ r, x* ≥ 0.

Thus, a feasible LCP solution is recovered.
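In line with the Proposition, recovered points can be validated numerically. The helper below (a hypothetical name; a NumPy sketch) checks the three LCP conditions up to a tolerance:

    import numpy as np

    def is_lcp_solution(M, r, x, tol=1e-5):
        """Check x >= 0, Mx >= r, and x^T (Mx - r) = 0, each up to tol."""
        s = M @ x - r                       # slack vector Mx - r
        return x.min() >= -tol and s.min() >= -tol and abs(x @ s) <= tol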

2 Computational Results
We tested both our DC-based approach (Section 1) and the Rectified Convex Relaxation (ReCR) method (Section 3.3.1) on various linear complementarity problem (LCP) instances [4]. All implementations were written in Matlab, using CVX interfaced with the Gurobi solver. We employed a uniform tolerance of ε = 10⁻⁵.

2.1 Selected Test Problems


LCP 0 [4]. The matrix M0 and vector r are given by

    M0 = [ 1  1 ]        r = [ -1 ]
         [ 1  1 ],           [ -1 ].

LCP 1 [4]. This 3-dimensional instance is detailed in [4], with

    M1 = [  0  -1   2 ]        r = [ -3 ]
         [  2   0  -2 ]            [  6 ]
         [ -1   1   0 ],           [ -1 ].

LCP 2 [4]. A 4-dimensional instance:

    M2 = [  0   0  20  20 ]        r = [ -1 ]
         [ 10   0  20   0 ]            [ -1 ]
         [ 10  20   0  20 ]            [ -1 ]
         [ 10  20  10  20 ],           [ -1 ].

LCP 3 [4]. Another 4-dimensional case:

    M3 = [ 11  10  10  -1 ]        r = [ -50 ]
         [ 10  10  20  -1 ]            [ -50 ]
         [ 10  20  10  -1 ]            [ -23 ]
         [  0  20  10   1 ],           [   6 ].

LCP 4 [2]. From [2],

    M4 = [ 0   0   1 ]        r = [  0 ]
         [ 0   1   0 ]            [  0 ]
         [ 0  -1   1 ],           [ -1 ].

LCP 5 [3]. As in [3],

    M5 = [ 0   0  -2 ]        r = [  0 ]
         [ 0   0  -1 ]            [  0 ]
         [ 0   2   1 ],           [ -1 ].

LCP 7 [1]. A family of matrices of various dimensions is discussed in [1], with r = −e.

LCP 8 [5]. In [5], a structured block matrix A8 is provided, with r = e.

    LCP    Dim     DC-based              ReCR                Obj.
                   Iter    CPU(s)        Iter    CPU(s)      Value
    0      2       1       0.008         1       0.005       0
    1      3       2       0.007         2       0.07        0
    2      4       1       0.009         2       1           0
    3      4       1       0.009         4       1.3         0
    4      3       1       0.005         1       0.004       0
    5      3       1       0.003         1       1           0
    7      300     1       0.005         1       1           0
    7      500     1       0.005         2       2           0
    7      1000    1       0.005         1       1           0
    8      300     1       0.005         1       1           0
    8      500     1       2.0000        2       3           0
    8      1000    1       4.0000        3       4           0
    10     300     1       0.0800        1       1           0
    10     500     1       3.0000        2       8           0
    10     1000    1       5.0000        3       10          0
    Average        1.067   0.9390        1.8     2.29        0

Table 1: Comparison of the DC-based method vs. ReCR across various LCP instances. "Iter" denotes the main iteration count; "CPU(s)" is the run time in seconds; "Obj. Value" is the final objective.

LCP 10 [5]. Similarly, LCP 10 in [5] uses A10 = diag(1/n, 2/n, …, 1) and r = e.
From Table 1, both methods successfully achieve an objective of zero in all tested instances,
thereby recovering feasible solutions to the LCP. In terms of iteration counts, the DC-based solver
exhibits a smaller average value (roughly 1.067) compared with ReCR (about 1.8). The CPU times
also tend to favor the DC approach, with an average of 0.94 seconds, whereas ReCR reports an
average of 2.29 seconds. Notably, this faster convergence of the DC method is consistent with its
ability to directly incorporate both primal and dual-like constraints, a factor that often leads to
stronger descent properties in the underlying difference-of-convex structure.
From an algorithmic deployment perspective, both approaches are viable and robust. The ReCR
framework has the advantage of a straightforward linear subproblem at each step and a well-defined
rectification step; the DC-based approach, on the other hand, leverages a decomposition that may
yield fewer overall iterations and shorter runtime in practice, given the same solver environment.
The choice between the two methods may thus hinge on the problem size, solver availability, or
the need for guaranteed theoretical behaviors. In settings where time-to-solution is critical, these
numerical results highlight the potential benefit of the DC-based model, particularly for large-scale
LCPs.

2.2 Random LCPs


We further assessed the performance of both the Rectified Convex Relaxation (ReCR) algorithm
(Section 3.3.1) and our DC-based approach (Section 1) on two sets of randomly generated Linear
Complementarity Problems (LCPs).

First Study (General Random LCPs, n = 50). In this study, we generated 100 LCPs of dimension n = 50, without imposing any conditions such as positive (semi)definiteness. The entries of the matrix M were sampled uniformly from [−1, 1], and the vector r was drawn from a standard normal distribution. As expected, many of these instances were infeasible. Results for ReCR and DC are combined in Table 2.

                  ReCR                        DC
    Status        Count   Iter    CPU(s)      Count   Iter    CPU(s)
    Feasible      40      6.54    18.25       44      3.46    7.56
    Infeasible    60      N/A     N/A         56      N/A     N/A

Table 2: Comparison of ReCR and DC for 100 random LCPs with n = 50. "Status" refers to feasibility as determined by each method. "Iter" and "CPU(s)" represent the average iteration count and CPU time, respectively, over the instances in that status category.

From Table 2, 40 out of 100 generated LCPs were found to be feasible under ReCR, while 44
were deemed feasible under the DC approach. This slight discrepancy indicates that each method
may interpret borderline feasibility differently or terminate under different internal criteria. For
the feasible subset, ReCR required about 6.54 iterations on average (18.25 seconds), whereas DC
converged in 3.46 iterations (7.56 seconds). This reflects a shorter CPU time for DC on the feasible
cases, suggesting that its difference-of-convex framework can home in on solutions efficiently.
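For reference, instances matching this description can be generated as follows (a sketch in Python with NumPy; the function name and seeding scheme are our own):

    import numpy as np

    def random_lcp(n=50, seed=None):
        """Random LCP data as in the first study: M uniform on [-1, 1], r standard normal."""
        rng = np.random.default_rng(seed)
        M = rng.uniform(-1.0, 1.0, size=(n, n))
        r = rng.standard_normal(n)
        return M, r

    # e.g., the 100 instances of dimension n = 50 used above:
    instances = [random_lcp(50, seed=i) for i in range(100)]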

Second Study (Scaling to Larger Dimensions). To investigate scalability, we generated LCPs of increasing dimension but with a controlled approach to feasibility. First, a positive definite matrix M and a vector r were constructed to ensure a unique LCP solution. Next, a nonnegative diagonal scaling matrix D was introduced to make DM indefinite while preserving the same LCP solution. Each dimension was repeated 50 times, and averages were collected. Table 3 summarizes the results for ReCR and DC; a sketch of this construction appears at the end of this subsection.

    Dim       ReCR                   DC
              Iter     CPU(s)        Iter    CPU(s)
    5         6.80     5.56          2.0     2.10
    10        6.15     7.26          2.5     3.60
    20        6.20     16.84         4.5     6.30
    50        6.20     50.60         6.4     24.34
    100       12.60    110.60        10.2    99.56
    200       50.70    300.20        45.3    279.34
    500       60.25    680.20        57.2    620.56
    1000      75.60    4900.15       65.3    4500.45

Table 3: Comparison of ReCR and DC for larger-dimensional randomly scaled LCPs. Values represent averages over 50 runs for each dimension.

Table 3 clearly shows that as the dimensionality grows, both algorithms face an increase in average iteration counts and CPU times. Nonetheless, DC consistently maintains slightly lower (or comparable) iteration counts than ReCR, particularly in moderate dimensions. For very large systems (e.g., n = 1000), both methods become computationally demanding, exceeding thousands of seconds on average. However, DC's iteration counts remain below ReCR's in every tested dimension, reflecting the algorithm's ability to leverage difference-of-convex decompositions even in high-dimensional settings.

From a practical standpoint, these outcomes underscore how both methods can solve large-scale
LCPs but at a nontrivial computational cost when the matrix is indefinite and grows in size. The
DC framework, by embedding a dual-like perspective and efficiently handling the non-convex term
x⊤ M x, tends to converge in fewer iterations. ReCR, while globally convergent under boundedness
assumptions, may exhibit longer runtimes or higher iteration counts, especially when the scaling
matrix D significantly alters the conditioning of M .
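As referenced in the setup above, here is one plausible reading of the scaled construction (a NumPy sketch; the exact recipe is not spelled out here, so the specific choices below are assumptions). Note that a strictly positive diagonal D preserves the LCP solution: for x ≥ 0 and Mx ≥ r, each term d_i x_i (Mx − r)_i is nonnegative, so the scaled complementarity sum vanishes exactly when the original one does.

    import numpy as np

    def scaled_indefinite_lcp(n, seed=0):
        rng = np.random.default_rng(seed)
        A = rng.standard_normal((n, n))
        M = A @ A.T + n * np.eye(n)          # positive definite
        x_star = np.abs(rng.standard_normal(n))
        r = M @ x_star                       # x* >= 0, Mx* = r: the LCP solution
                                             # (unique since M is positive definite)
        d = rng.uniform(0.01, 10.0, size=n)  # strictly positive, widely spread scales
        DM, Dr = np.diag(d) @ M, d * r
        # For sufficiently uneven d, the symmetric part of DM is typically
        # indefinite; callers can inspect np.linalg.eigvalsh((DM + DM.T) / 2)
        # and resample if needed.
        return DM, Dr, x_star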

References
[1] Ahn, B.-H.: Iterative methods for linear complementarity problems with upper bounds on
primary variables. Math. Program. 26, 295–315 (1983)

[2] Chen, X., Ye, Y.: On smoothing methods for the P0 matrix linear complementarity problem.
SIAM J. Optim. 11(2), 341–363 (2000)

[3] Fernandes, L., Friedlander, A., Guedes, M.C., Judice, J.: Solution of a general linear com-
plementarity problem using smooth optimization and its application to bilinear programming
and LCP. Appl. Math. Optim. 43, 1–19 (2001)

[4] Floudas, C.A., et al.: Handbook of Test Problems in Local and Global Optimization. Nonconvex
Optimization and Its Applications, vol. 33. Kluwer Academic, Dordrecht (1999)

[5] Geiger, C., Kanzow, C.: On the resolution of monotone complementarity problems. Comput.
Optim. Appl. 5(2), 155–173 (1996)

[6] Le Thi, H.A., Pham Dinh, T.: Solving a class of linearly constrained indefinite quadratic
problems by DC algorithms. J. Glob. Optim. 11(3), 253–285 (1997)

[7] Le Thi, H.A., Pham Dinh, T.: The DC (difference of convex functions) programming and
DCA revisited with DC models of real-world nonconvex optimization problems. Ann. Oper.
Res. 133, 23–46 (2005)

[8] Pham Dinh, T., Le Thi, H.A.: Convex analysis approach to D.C. programming: theory,
algorithms and applications. Acta Math. Vietnam. 22(1), 289–355 (1997)

[9] Pham Dinh, T., Le Thi, H.A.: D.C. optimization algorithms for solving the trust region
subproblem. SIAM J. Optim. 8(2), 476–505 (1998)

[10] Le Thi, H.A., Pham Dinh, T.: On solving linear complementarity problems by DC pro-
gramming and DCA. Comput. Optim. Appl. 50, 507–524 (2011)

