
Iowa State University Capstones, Theses and Dissertations

Graduate Theses and Dissertations

2019

Applications of fixed point theory to distributed optimization, robust convex optimization, and stability of stochastic systems

Seyyed Shaho Alaviani
Iowa State University

Follow this and additional works at: https://lib.dr.iastate.edu/etd


Part of the Applied Mathematics Commons, and the Electrical and Electronics Commons

Recommended Citation
Alaviani, Seyyed Shaho, "Applications of fixed point theory to distributed optimization, robust convex optimization, and stability of stochastic systems" (2019). Graduate Theses and Dissertations. 16956.
https://lib.dr.iastate.edu/etd/16956

This Dissertation is brought to you for free and open access by the Iowa State University Capstones, Theses and Dissertations at Iowa State University
Digital Repository. It has been accepted for inclusion in Graduate Theses and Dissertations by an authorized administrator of Iowa State University
Digital Repository. For more information, please contact [email protected].
Applications of fixed point theory to distributed optimization, robust convex

optimization, and stability of stochastic systems

by

Seyyed Shaho Alaviani

A dissertation submitted to the graduate faculty

in partial fulfillment of the requirements for the degree of

DOCTOR OF PHILOSOPHY

Major: Electrical Engineering

Program of Study Committee:


Nicola Elia, Major Professor
Umesh Vaidya
Yongxin Chen
Domenico D’Alessandro
Soumik Sarkar

The student author, whose presentation of the scholarship herein was approved by the program of
study committee, is solely responsible for the content of this dissertation. The Graduate College
will ensure this dissertation is globally accessible and will not permit alterations after a degree is
conferred.

Iowa State University

Ames, Iowa

2019

Copyright © Seyyed Shaho Alaviani, 2019. All rights reserved.



DEDICATION

To those who have moved the world.



TABLE OF CONTENTS

Page

LIST OF FIGURES . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . v

LIST OF NOMENCLATURE . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . vi

ACKNOWLEDGMENTS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . viii

ABSTRACT . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ix

CHAPTER 1. INTRODUCTION . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1

1.1 Motivation of Problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1

1.2 Literature Review . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3

1.3 Contributions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6

1.4 Outline of Thesis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8

CHAPTER 2. REVIEW OF RELEVANT FIXED POINT THEORY . . . . . . . . . . . . . 10

2.1 Relevant Fixed Point Theorems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11

2.2 Relevant Fixed Point Iterations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13

CHAPTER 3. TWO FRAMEWORKS FOR OPTIMIZATION . . . . . . . . . . . . . . . . . 14

3.1 A Framework for Convex Optimization . . . . . . . . . . . . . . . . . . . . . . . . . . 14

3.1.1 Application to Centralized Convex Optimization . . . . . . . . . . . . . . . . 15

3.1.2 Application to Centralized Robust Convex Optimization . . . . . . . . . . . . 15

3.1.3 Application to Distributed Convex Optimization over Random Networks . . . 16

3.2 A Framework for Distributed Convex Optimization with State-Dependent Interactions 20



CHAPTER 4. ALGORITHMS TO SOLVE THE OPTIMIZATION . . . . . . . . . . . . . . 25

4.1 A Proposed Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26

4.1.1 Convergence Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27

4.1.2 Application to Solve Distributed Optimization over Random Networks . . . . 38

4.2 The Random Picard Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44

4.2.1 Firmly Nonexpansive Random Maps . . . . . . . . . . . . . . . . . . . . . . . 44

4.2.2 Contraction Random Maps . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45

4.3 The Random Krasnoselskii-Mann Algorithm . . . . . . . . . . . . . . . . . . . . . . . 46

4.3.1 Nonexpansive Random Maps . . . . . . . . . . . . . . . . . . . . . . . . . . . 46

CHAPTER 5. SOLVING LINEAR ALGEBRAIC EQUATIONS OVER RANDOM NETWORKS . . . . . . 48

5.1 A Distributed Algorithm for Solving Linear Algebraic Equations over Random Networks . . . . . . 48

5.1.1 Distributed Average Consensus over Random Networks . . . . . . . . . . . . 58

CHAPTER 6. A DISTRIBUTED ALGORITHM FOR DISTRIBUTED CONVEX OPTIMIZATION WITH STATE-DEPENDENT INTERACTIONS AND TIME-VARYING TOPOLOGIES . . . . . . 64

6.1 A Proposed Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64

6.1.1 Convergence Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64

CHAPTER 7. STABILITY OF STOCHASTIC NONLINEAR DISCRETE TIME SYSTEMS 77

7.1 Stability of Stochastic Systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78

7.2 Stability Analysis of Stochastic Nonlinear Discrete-Time Systems . . . . . . . . . . . 79

CHAPTER 8. CONCLUSIONS AND FUTURE WORKS . . . . . . . . . . . . . . . . . . . . 83

8.1 Contributions of This Dissertation . . . . . . . . . . . . . . . . . . . . . . . . . . . . 83

8.2 Future Works . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85

REFERENCES . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 86

LIST OF FIGURES

Page

Figure 4.1 Variables θ1 and θ2 of 20 agents are shown by solid blue lines and dashed

black lines, respectively. The figures show that they are approaching θ∗ =

[0.7417, 0.7417]T . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41

Figure 4.2 2D plot, where θ∗ is shown by o, for 1000 iterations. . . . . . . . . . . . . . 42

Figure 4.3 Root Mean Square Error (RMSE) for two intervals: [0, 300] and [301, 3000]

iterations. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43

Figure 5.1 states’ route . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63

Figure 6.1 Variables y of agents in Example 6.1. This figure shows that the variables

y converge to the average of the initial positions of variables y. . . . . . . . . 74

Figure 6.2 Variables z of agents in Example 6.1. This figure shows that the variables z

converge to the average of the initial positions of variables z. . . . . . . . . . 75

Figure 6.3 The error in Example 6.1. This figure shows that the positions of agents

converge to the average of their initial positions. . . . . . . . . . . . . . . . . 76



LIST OF NOMENCLATURE

N  The set of all natural numbers

ℜ^n  n-dimensional real space

ℜ^{m×n}  The set of all real m-by-n matrices

A^T  Transpose of the matrix A

I_n  Identity matrix of dimension n by n

X  Topological space

M  Metric space

B  Real Banach space

H  Real Hilbert space

B^n  A closed ball with the Euclidean metric in ℜ^n

ρ  A metric

‖·‖_H  Norm of the real Hilbert space H

‖·‖_B  Norm of the real Banach space B

⟨·, ·⟩  Inner product of the real Hilbert space H

P_C  Projection onto the set C in the real Hilbert space H

‖A‖_1  Induced 1-norm of the matrix A

‖A‖_2  Induced 2-norm of the matrix A

‖A‖_∞  Induced ∞-norm of the matrix A

1_n  Vector of dimension n whose entries are all 1

0_n  Vector of dimension n whose entries are all 0

Re(a)  Real part of the complex number a

λ_2(A)  The second eigenvalue of the matrix A when the eigenvalues are sorted in increasing order of real part

E[x]  Expectation of the random variable x

L^1  The space of measurable functions

∇f(x)  Gradient of the function f(x)

∅  The empty set

A ⪰ 0  The matrix A is positive semi-definite



ACKNOWLEDGMENTS

I must thank God who has helped me to come to the U.S., selected Prof. Nicola Elia as my

major professor, and helped me during my Ph.D. program as well as every second in my life. I

would like to thank my major professor who has guided me to the world of optimization theory and

networked systems, discussed interesting topics with me, and supported me during this research. I

mention that Iowa State University and Ames, Iowa, are great places for studying and student life,

respectively.

I would like to thank my parents, my uncle Habib Kharazmi, and my best friends Seyyed

Ghafoor Barzanjeh, Mohammad Ali Saate’, and Seyyed Taha Kamalizadeh for their emotional

support.

This work was supported by National Science Foundation under Grants CCF-1320643, CNS-

1239319, ECCS-1509372, and AFOSR under Grant FA95501510119.



ABSTRACT

Large-scale multi-agent networked systems are becoming more and more popular due to ap-

plications in robotics, machine learning, and signal processing. Although distributed algorithms

have been proposed to replace centralized computation for large-scale data
optimization, existing algorithms still suffer from disadvantages such as distribution
dependency or the B-connectivity assumption on switching communication graphs. This study applies
fixed point theory to analyze distributed optimization problems and to overcome these
difficulties. In this study, a new mathematical terminology and a new mathematical optimization

problem are defined. It is shown that the optimization problem includes centralized optimization

and distributed optimization problems over random networks. Centralized robust convex optimization,
defined on Hilbert spaces, is likewise included in the defined optimization problem. An

algorithm using diminishing step size is proposed to solve the optimization problem under suitable

assumptions. Consequently, as a special case, it results in an asynchronous algorithm for solving

distributed optimization over random networks without distribution dependency or B-connectivity

assumption of random communication graphs. It is shown that the random Picard iteration or the

random Krasnoselskii-Mann iteration may be used for solving the feasibility problem of the defined

optimization. Consequently, as special cases, they result in asynchronous algorithms for solving

linear algebraic equations and average consensus over random networks without distribution de-

pendency or B-connectivity assumption of switching communication graphs. As a generalization of

the proposed algorithm for solving distributed optimization over random networks, an algorithm is

proposed for solving distributed optimization with state-dependent interactions and time-varying

topologies without the B-connectivity assumption on communication graphs. All of these random algorithms
are special cases of stochastic discrete-time systems. It is shown that difficulties such as

distribution dependency of random variable sequences that arise in using Lyapunov’s and LaSalle’s

methods for stability analysis of stochastic nonlinear discrete-time systems may be overcome by

means of fixed point theory.



CHAPTER 1. INTRODUCTION

Optimization has been a backbone of many problems in machine learning, energy efficiency,

optimal control, signal processing etc. Optimization over networks has been a hot topic due to its

applications in real life problems such as large-scale building energy systems [1]. This chapter aims

at presenting motivation and contributions of this study.

1.1 Motivation of Problem

We consider an unconstrained collaborative optimization of a sum of convex functions where

agents make decisions using local information. Each agent has its own private convex cost function
and wishes to reach the minimizer of the sum of the agents' cost functions by interacting with its
neighbors; this is known as distributed optimization. Distributed optimization problems naturally appear

in many distributed problems such as sensor networks, power grid control, source localization,

distributed data regression [2]-[3], and large-scale building energy systems [1].

Consensus problems are long-established problems in automata theory and distributed compu-

tation [4] and management science and statistics [5]. In multi-agent networks, consensus means

reaching an agreement which depends on all agents’ states by interacting with neighbors. Motivated

by the pioneering works of Borkar & Varaiya [6] and Tsitsiklis [7] (see also [8]), many researchers

have paid much attention to consensus and distributed optimization problems [1]-[3], [9]-[134].

In most networks such as sensor networks, since nodes sometimes shut down their transmitters

to save energy, or because physical obstructions block wireless channels, the availability

of communication links in the network is typically random. Indeed, because of packet drops, link

failures, or node failures, random graphs are suitable models for these kinds of networks. As a

matter of fact, consensus problems over random networks have been a hot topic to research due

to their applications in sensor networks [45]–[48]. Therefore, several researchers have investigated

consensus problems and distributed convex optimization problems over random networks [27]-[72].

In a synchronous protocol, all nodes activate at the same time and perform communication

updates. This protocol requires a common notion of time among the nodes. On the other hand,

in an asynchronous protocol, each node has its own notion of time defined by a local timer, and a node's
update is triggered either by its local timer or by a message from neighboring nodes. The algorithms

guaranteed to work with no bound on the time for updates are called totally asynchronous, and

those that need the B-connectivity assumption, namely that there exists a bound B such that over every interval of length B the
union of the graphs is strongly connected and each edge transmits a message at least once, are

called partially asynchronous (see [7] and [8, Ch. 6-7]). As the dimension of the network increases,

synchronization becomes an issue; therefore, some investigators have considered asynchronous dis-

tributed optimization problems [70]-[84], to cite a few.
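The B-connectivity condition described above can be checked mechanically. The sketch below is an illustrative helper (the function names and the edge-set representation of directed graphs on nodes 0..n-1 are assumptions for this example, not notation from the text): it tests whether the union of every length-B window of a graph sequence is strongly connected.

```python
from itertools import chain

def union_strongly_connected(edge_sets, n):
    """Check whether the union of directed graphs on nodes 0..n-1,
    each given as a set of (i, j) edges, is strongly connected."""
    edges = set(chain.from_iterable(edge_sets))
    adj = {i: [] for i in range(n)}
    radj = {i: [] for i in range(n)}
    for i, j in edges:
        adj[i].append(j)
        radj[j].append(i)

    def reachable(start, nbrs):
        seen, stack = {start}, [start]
        while stack:
            u = stack.pop()
            for v in nbrs[u]:
                if v not in seen:
                    seen.add(v)
                    stack.append(v)
        return seen

    # Strongly connected iff node 0 reaches every node in both the
    # graph and its reverse graph.
    return len(reachable(0, adj)) == n and len(reachable(0, radj)) == n

def b_connected(graph_seq, n, B):
    """B-connectivity: the union over every length-B window of the
    sequence must be strongly connected."""
    return all(union_strongly_connected(graph_seq[k:k + B], n)
               for k in range(len(graph_seq) - B + 1))
```

For a random graph sequence, of course, no such deterministic bound B is guaranteed to exist, which is exactly why the partially asynchronous assumption can fail.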

In practice, state-dependent networks appear in several systems such as mobile robotic networks

(see [86] and references therein), wireless networks [87], and predator-prey interaction [88]. In

mobile robotic networks or wireless networks, the quality of the link between two agents depends

on the distance between them, resulting in state-dependent networks in reality (by considering the

position as the state). Furthermore, opinion dynamics and flocking are modeled as state-dependent

networks (see [89] and references therein). Also, the genome is viewed as a state-dependent network

[90]. As stated, state-dependent networks appear in real networks. In [91], existence of consensus

in a multi-robot network has been investigated. Since consensus problems are special cases of

distributed optimization problems, solving distributed convex optimization problems over state-dependent
networks is very important and useful. Therefore, some researchers have paid attention
to solving distributed optimization with state-dependent interactions [52] and [85].

A special case of optimization problems is solving linear algebraic equations. Linear algebraic

equations arise in modeling of many natural phenomena such as forecasting and estimation [92].

Since the processors are physically separated from each other, distributed computations to solve

linear algebraic equations are important and useful. Several authors have proposed algorithms for

solving the problem over non-random networks [93]-[134].

The algorithms for distributed optimization over random networks are special cases of stochas-

tic systems. In these systems, almost sure and moment stability are the most popular notions of

stability [135]-[136]. Lyapunov's direct method has been used for stability analysis of stochastic discrete-time

systems [137]-[152]. Moreover, the Lyapunov measure, which is dual to the Lyapunov function, has been
introduced for stability analysis of stochastic discrete-time systems [153]-[154]. The converse Lyapunov
theorem for stochastic discrete-time systems has been studied in [155]. Recently, a stochastic
version of LaSalle's theorem has been developed for discrete-time systems [156].

As we have shown above, distributed optimization problems and stability of stochastic discrete-

time systems are important and useful in practice.

1.2 Literature Review

Consensus problems: As stated in the previous section, several researchers have investigated

consensus problems. To the best of our knowledge, in all existing results except [23] and [44],

the distribution of random interconnection topologies or B-connectivity assumption is needed. In

[23] for time-varying directed networks, the authors allow the lengths of the B-connectivity
intervals to grow linearly and tend to infinity. In [44], it has been shown that if the elements of a set

of communication graphs whose union is strongly connected occur infinitely often almost surely,

then distributed average consensus occurs.¹ In [44], the authors consider undirected links with

Maximum-degree or Metropolis weights. With Maximum-degree weights, the total number of nodes is
required to set the link weights, while with Metropolis weights only the degrees of an agent's neighbors
are required.
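For reference, Metropolis weights can be computed from local degree information alone. The following is a minimal sketch (the triangle graph used at the end is a hypothetical example): each link (i, j) gets weight 1/(1 + max(d_i, d_j)), and the self-weight absorbs the remainder, which yields a doubly stochastic matrix.

```python
import numpy as np

def metropolis_weights(edges, n):
    """Doubly stochastic weight matrix from an undirected edge list,
    using Metropolis weights: W[i, j] = 1/(1 + max(deg_i, deg_j))."""
    deg = [0] * n
    for i, j in edges:
        deg[i] += 1
        deg[j] += 1
    W = np.zeros((n, n))
    for i, j in edges:
        w = 1.0 / (1 + max(deg[i], deg[j]))
        W[i, j] = W[j, i] = w
    # Self-weights absorb whatever is left so each row sums to 1;
    # symmetry then makes the column sums 1 as well.
    for i in range(n):
        W[i, i] = 1.0 - W[i].sum()
    return W

W = metropolis_weights([(0, 1), (1, 2), (0, 2)], 3)
```

Note that agent i only needs the degrees of its own neighbors, which is what makes this rule attractive in the distributed setting discussed above.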

Distributed optimization with state-independent interactions: As stated in the previous section,

investigators have considered distributed optimization problems with state-independent interactions


¹ That is, the authors in [44] have stated this implicitly.

over random or non-random networks with/without asynchronous protocols. The results are based

on distribution dependency or B-connectivity assumptions of communication graphs.

Distributed optimization with state-dependent interactions: In [52] and [85], the authors have

considered distributed multi-agent optimization problems over state-dependent networks. In [52],

the authors assume that the weighted matrix of the graph is Markovian on the state variables at each

iteration. In a non-random case of the network contemplated in [52], the Markovian assumption

is not satisfied because the state-dependent weighted matrix of the graph at each iteration does

depend on previous iterations. In [85], a continuous-time system is proposed for unconstrained

optimal consensus of convex optimization problem over directed time-varying networks; the authors

assume that the weight of each link has positive lower bound. Furthermore, they assume that the

intersection of the set of optimal solutions of each agent’s cost function should be nonempty. The

continuous-time algorithm they propose needs to project each agent’s state into its optimal solution

set at each time.

Solving linear algebraic equations over networks: The linear algebraic equation considered in

this study is of the form Ax = b that is solved simultaneously by m agents assumed to know only a

subset of the rows of the partitioned matrix [A, b], by using local information from their neighbors;

indeed, each agent i only knows A_i x_i = b_i, i = 1, 2, ..., m, and their goal is to achieve
a consensus x_1 = x_2 = ... = x_m = x̃, where x̃ ∈ {x̄ | x̄ = arg min_x ‖Ax − b‖}. Several authors have

proposed algorithms for solving the problem over non-random networks [93]-[119]. Other distributed

algorithms for solving linear algebraic equations have been proposed by some investigators [120]-[134],
but the problems they consider are not the same as the problem considered here. Some
approaches propose cooperative solution methods that exploit the interconnectivity structure of the matrix A and

have each node in charge of one single solution variable or a dual variable [120]-[122]. One view

of the problem is to formulate it as a constrained consensus problem over random networks and

use the result in [50]; nevertheless, the result in [50] needs each agent to use projection onto its

constraint set with some probability at each time and also needs the weighted matrix of the graph to be

independent at each time. Another view of the problem is to formulate it as a distributed convex

optimization problem over random networks and use the results in [51], [52], [54], [70]. Nevertheless,

the results in [51], [52], [54], [70] are based on subgradient descent or diminishing step sizes, which
have slow convergence as an optimal solution is approached. Furthermore, the results in [51],
[52], [54], [70] need the weighted matrix of the graph to be independent and identically distributed

(i.i.d.). Recently, the authors of [111] have proposed asynchronous algorithms for solving the linear

algebraic equation over time-varying networks where they impose B-connectivity assumption.

Stability of stochastic discrete-time systems: Although Lyapunov’s and LaSalle’s methods have

been useful tools for stability analysis of stochastic discrete-time systems, they need distribution
dependency of random variable sequences. Lyapunov's direct method needs stochastic parameters

to be i.i.d. [137]-[144] or stochastic parameters to be Markov processes [145]-[152]. Moreover,

Lyapunov measure needs stochastic parameters to be i.i.d. [153]-[154]. The converse Lyapunov
theorem [155] needs the random variable sequence to be an i.i.d. process. LaSalle's theorem for

stochastic discrete-time systems [156] needs stochastic parameters to be independent. Quadratic

Lyapunov functions have been useful to analyze stability of linear dynamical systems. Nevertheless,

common quadratic Lyapunov functions may not exist for stability analysis of consensus problems

in networked systems [157]. Furthermore, quadratic Lyapunov functions may not exist for stability

analysis of switched linear systems [158]-[160]. For deterministic discrete-time systems, by proving a

converse to Banach's fixed point theorem and using Banach's fixed point theorem, [161]-[162]
prove necessary and sufficient conditions for global and local exponential stability of deterministic
nonlinear systems that are locally continuously differentiable at their equilibrium points.

As we have shown above, distribution dependency or B-connectivity assumptions are the limita-

tions of existing results for stability of stochastic discrete-time systems and distributed optimization

over random networks with/without asynchronous updates. Furthermore, a positive lower bound on the nonlinear
weights and a nonempty intersection of the optimal solution sets of the agents' cost functions are limitations of
existing works for distributed optimization with state-dependent interactions. We mention that in
practice the Cucker-Smale weight [163], which appears in biological networks, does not have a positive
lower bound.

1.3 Contributions

Before working on distributed optimization problems, the author published two papers [164]

and [165] on applications of fixed point theory to overcome the fundamental difficulties mentioned

in [164] which arise in using Lyapunov’s and LaSalle’s methods for stability analysis of time-varying

systems with time delay. Our contribution in this study is a new perspective on distributed

optimization by using the mathematical theory of random maps. We show that by applying fixed

point theory, we are able to overcome the limitations of existing results for packet drops, synchrony,

and state-dependent weights in distributed optimization. We state our proposed approaches in the

remaining paragraphs.

Consensus problems: We consider the consensus problem over random networks. We show that

this problem is to find a fixed value point of the random operator formed from the random weighted

graph matrices. We assume that the random weighted graph matrices are doubly stochastic for

all possible graphs. This assumption allows us to discard the distribution of random interconnection
topologies. Consequently, this formulation includes asynchronous updates and/or unreliable
communication protocols. Furthermore, this framework needs neither the distribution of random interconnection
topologies nor B-connectivity assumptions. Wireless sensor networks motivate this
framework, since interference among the sensors' communications correlates the link failures over
the probability space or over time. We show that the random Krasnoselskii-Mann iterative algorithm
converges almost surely and in mean square to the average consensus of the initial states of the agents.

We also show that the agents interact among themselves to approach the consensus subspace in

such a way that the projection of their states onto the consensus subspace at each time is equal to

the average consensus of their initial states. Moreover, the algorithm is able to converge even if

the interconnection weighted matrix is periodic and irreducible. We should mention that existing

discrete-time algorithms for consensus problems build on the algorithm proposed by Tsitsiklis [7] and
its generalizations to random cases; that algorithm is in fact the Picard iteration.
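A minimal numerical sketch of this behavior follows. The three-agent network, the two doubly stochastic weight matrices (one per possible graph), the relaxation parameter, and the random switching sequence are all illustrative assumptions for this example, not the dissertation's exact setting; the point is that double stochasticity preserves the average at every step while the random Krasnoselskii-Mann iteration drives the states toward consensus.

```python
import numpy as np

rng = np.random.default_rng(0)

# Two doubly stochastic weight matrices, one per possible graph.
# No i.i.d. or B-connectivity structure is imposed on which one
# occurs at each step; only double stochasticity is used.
W1 = np.array([[0.5, 0.5, 0.0],
               [0.5, 0.5, 0.0],
               [0.0, 0.0, 1.0]])
W2 = np.array([[1.0, 0.0, 0.0],
               [0.0, 0.5, 0.5],
               [0.0, 0.5, 0.5]])

x = np.array([1.0, 5.0, 9.0])   # initial states; their average is 5
avg = x.mean()
alpha = 0.5                     # Krasnoselskii-Mann relaxation parameter

for k in range(200):
    W = W1 if rng.random() < 0.5 else W2
    x = (1 - alpha) * x + alpha * (W @ x)   # random KM step
```

After the loop, the common value approached by all agents equals the average of the initial states, since each doubly stochastic step leaves the mean unchanged.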

Distributed optimization with state-independent interactions: We consider the problem of un-

constrained distributed convex optimization over random networks. We approach the problem

using a random operator formed from the random weighted graph matrices. We show that the dis-

tributed optimization problem can be formulated as minimization of a convex function over the set

of fixed value points of the random operator. Since the random operator is nonexpansive, we define

a mathematical optimization problem, namely minimization of a convex function over the set of

fixed value points of a nonexpansive random operator, which includes the distributed optimization

problem as a special case. The definition of fixed value point is a bridge from deterministic analysis

to random analysis of the algorithm. With the help of the fixed value point set and the nonexpansivity
property of the random operator, we are able to extend deterministic tools to random cases to

prove boundedness, convergence to the feasible set, and convergence to the optimal solution of

the generated sequence. This is very useful because we are able to analyze random processes by

using extended deterministic tools. We propose a discrete-time algorithm using diminishing step

size for almost sure and mean square convergence to the optimal solution of the mathematical

optimization problem. This framework does not need any assumption on the distribution of random

interconnection graphs. The proposed algorithm is also able to reach the optimal solution under

asynchronous updates. Our algorithm is not comparable to existing algorithms since they need

i.i.d. or B-connectivity assumptions.
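To convey the flavor of such a scheme on a toy problem, the sketch below runs a generic consensus-plus-gradient update with diminishing step size over a randomly switching doubly stochastic matrix. This is an illustration under stated assumptions, not the exact algorithm analyzed in Chapter 4: the local quadratic costs, the two weight matrices, and the step-size sequence are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)

# Each of three agents holds a private quadratic f_i(t) = (t - c_i)^2;
# the sum is minimized at the mean of the c_i, which is 2.0 here.
c = np.array([1.0, 2.0, 3.0])
grad = lambda x: 2.0 * (x - c)       # stacked local gradients

W1 = np.array([[0.5, 0.5, 0.0],
               [0.5, 0.5, 0.0],
               [0.0, 0.0, 1.0]])
W2 = np.array([[1.0, 0.0, 0.0],
               [0.0, 0.5, 0.5],
               [0.0, 0.5, 0.5]])

x = np.zeros(3)
for k in range(5000):
    W = W1 if rng.random() < 0.5 else W2
    beta = 1.0 / (k + 2)             # diminishing step size
    x = W @ x - beta * grad(x)       # consensus step + local gradient step
```

The diminishing step size trades convergence speed for robustness: the consensus error and the optimization error both vanish, but only at a sublinear rate, which is the slow-convergence drawback noted in the literature review.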

Distributed optimization with state-dependent interactions: We consider an unconstrained dis-

tributed convex optimization problem over time-varying networks with state-dependent interac-

tions. The union of graphs which occur infinitely often is assumed to be strongly connected, and

the weights depend on the states continuously for each graph. We propose a framework for model-

ing multi-agent optimization problems over state-dependent networks with time-varying topologies,

i.e., the minimization of sum of convex functions over the intersection of fixed point sets of opera-

tors constructed. We assume that each agent’s cost function is strongly convex with Lipschitzian

gradient and that the weighted graph matrix of the network is doubly stochastic with respect to the state

variables at each time. We propose a gradient-based discrete-time algorithm using diminishing

step size for converging to the optimal solution of the problem. Our algorithm does not require

the weights to have positive lower bounds. This allows us to consider Cucker-Smale weights [163].

To the best of our knowledge, in contrast to existing results, our algorithm does not require the
B-connectivity assumption for convergence. Therefore, our results are not comparable with existing
results even in the state-independent case.
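For context, a Cucker-Smale-type interaction weight decays with inter-agent distance and therefore admits no positive lower bound as agents separate. The constants H and β below are illustrative assumptions, not values from [163]:

```python
def cucker_smale_weight(xi, xj, H=1.0, beta=0.6):
    """Cucker-Smale-type weight: decays with squared distance, so it
    has no positive lower bound as the agents move apart."""
    d2 = sum((a - b) ** 2 for a, b in zip(xi, xj))
    return H / (1.0 + d2) ** beta

w_near = cucker_smale_weight([0.0, 0.0], [0.1, 0.0])     # close agents
w_far = cucker_smale_weight([0.0, 0.0], [100.0, 0.0])    # distant agents
```

Since w_far can be made arbitrarily small by moving the agents apart, any analysis that assumes a uniform positive lower bound on the weights excludes this model.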

Solving linear algebraic equations over networks: Several authors in the literature have con-

sidered solving linear algebraic equations over switching networks with B-connectivity assumption

such as [111]. However, B-connectivity assumption is not guaranteed to be satisfied for random

networks. We formulate this problem such that this formulation does not need the distribution of

random communication graphs or B-connectivity assumption if the weighted matrix of the graph

is doubly stochastic. Thus this formulation includes asynchronous updates or unreliable communication
protocols. We assume that the set S = {x | ‖Ax − b‖ = 0} is nonempty. Since the
Picard iterative algorithm may not converge, we apply the random Krasnoselskii-Mann iterative

algorithm for converging almost surely and in mean square to a point in S for any matrix A, vector b,
and any initial conditions. The proposed algorithm, like that of [111], requires that the whole solution
vector be computed and exchanged by each node over the network. Based on the initial conditions of

agents’ states, we show that the limit point to which the agents’ states converge is determined by

the unique solution of a feasible convex optimization problem, independent of the distribution of
random communication graphs.
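A simplified numerical sketch of this idea follows: each agent applies the orthogonal projection onto the hyperplane of its own equation, and the results are mixed through a randomly switching doubly stochastic matrix inside a Krasnoselskii-Mann step. The specific system, weight matrices, and relaxation parameter are illustrative assumptions, not the dissertation's exact operator construction.

```python
import numpy as np

rng = np.random.default_rng(2)

# System Ax = b with exact solution x* = [1, 2]; agent i knows row i only.
A = np.array([[1.0, 1.0],
              [1.0, -1.0],
              [2.0, 1.0]])
b = np.array([3.0, -1.0, 4.0])

def project_row(z, a, beta):
    """Orthogonal projection of z onto the hyperplane {y : a.y = beta}."""
    return z - (a @ z - beta) / (a @ a) * a

W1 = np.array([[0.5, 0.5, 0.0],
               [0.5, 0.5, 0.0],
               [0.0, 0.0, 1.0]])
W2 = np.array([[1.0, 0.0, 0.0],
               [0.0, 0.5, 0.5],
               [0.0, 0.5, 0.5]])

X = np.zeros((3, 2))        # row i holds agent i's current estimate
alpha = 0.5                 # Krasnoselskii-Mann relaxation parameter
for k in range(2000):
    W = W1 if rng.random() < 0.5 else W2
    # Each agent projects onto its own equation, then mixes with neighbors.
    P = np.vstack([project_row(X[i], A[i], b[i]) for i in range(3)])
    X = (1 - alpha) * X + alpha * (W @ P)   # random KM step
```

Each projection is firmly nonexpansive and each mixing matrix is doubly stochastic, so the composed maps share the stacked solution as a common fixed point, and the KM relaxation avoids the possible non-convergence of the plain Picard iteration.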

Stability of stochastic discrete-time systems: We apply fixed point theory to stability analysis

of stochastic nonlinear discrete-time systems to overcome difficulties that arise in using Lyapunov’s

and LaSalle’s approaches such as distribution dependency of random variable sequences.

1.4 Outline of Thesis

The outline of thesis is as follows. In Chapter 2, review of relevant fixed point theory is

given. In Chapter 3, a new optimization problem and its special cases such as robust convex

optimization and distributed optimization over random networks are given. Furthermore, two

frameworks for distributed optimization over state-dependent networks with/without switching

topologies are presented. In Chapter 4, an algorithm for solving the optimization problem defined in

Chapter 3, and its application to distributed optimization, is presented. In Chapter 5, solving linear

algebraic equations, and its special case consensus problems, over random networks is considered.

In Chapter 6, a generalization of the proposed algorithm to solve distributed optimization with

state-dependent interactions and time-varying topologies is given. In Chapter 7, the application of fixed
point theory to stability analysis of stochastic discrete-time systems is presented. In Chapter 8,

conclusions and future works are given.



CHAPTER 2. REVIEW OF RELEVANT FIXED POINT THEORY

In this chapter, we give relevant fixed point theorems and iterations that we use for our results.

Before doing so, we present a brief history of relevant results; the following account of the history of fixed point theory is taken from [166].

The idea of a fixed point of an operator first occurred to Cauchy while dealing with the existence and uniqueness of solutions of certain differential equations, and from this notion a new research area emerged as Fixed Point Theory. It has a two-fold value: one from the classical analysis point of view, and the other from its applications in many branches of science and economics.

After Cauchy, R. Lipschitz simplified Cauchy's proof in 1877 (1876 in [167]) using the Lipschitz condition, and in 1890 G. Peano proved a deeper result that relates most closely to modern fixed point theory. In the same year, Picard applied this method to ordinary and partial differential equations.

equations. In 1886, Poincaré proved a fixed point theorem for a continuous self-mapping f on <n

satisfying condition f (x) + αx = r, kxk = r, ∀x ∈ <n , for some r > 0 and for every α > 0. This

theorem was rediscovered by P. Bohl in 1904. For a long period of time, this branch remained

suppressed until it was redeemed and re-cultivated by the Dutch mathematician L. E. J. Brouwer

who put this branch of mathematics in the front line of the research arena. In 1912 (1910 in [167]),

he proved the well-known Brouwer fixed point theorem for a continuous self-map on the closed unit ball in ℝ^n.

In 1922, S. Banach entered this field with a new concept of mapping called a contraction mapping and showed that a contraction self-mapping on a complete metric space has a unique fixed point. In 1930, R. Caccioppoli remarked on the Banach contraction principle that the contraction

condition may be replaced by the assumption of the convergence of the sequence of iterates, which

led to open another direction of studying fixed point theory, known as approximation of fixed point

of an operator. Among iterative sequences, the Picard iterative scheme has a wide range of applications in different branches of science. Nevertheless, it has been found to have a crucial drawback: the iterative sequence obtained by this method may not always converge. This was pointed out and rectified by W. R. Mann in 1953, who introduced a new type of iteration scheme, called the Mann iterative process. In 1930, J. Schauder obtained a result on the existence of a fixed point for a continuous mapping on a Banach space. Under the assumptions of the Schauder theorem, there was no method for approximating a fixed point of a mapping. However, Krasnoselskii showed in

1955 that a special type of iterative sequence converges to a fixed point of a nonexpansive mapping

on a uniformly convex Banach space.

It is assumed that the reader is familiar with the usual concepts of topological and metric spaces; otherwise, the reader is referred to [168]. Now we present the relevant fixed point theorems and iterations in the following

sections, respectively.

2.1 Relevant Fixed Point Theorems

Before we present the theorems, we need to give some definitions.

Definition 2.1: Let the operator T : X −→ X be a self-map. A point x ∈ X is said to be a fixed point of T if T (x) = x, and F ix(T ) denotes the set of all fixed points of T .

Definition 2.2 [167]: A topological space X is said to possess the fixed point property if every

continuous mapping of X into X has a fixed point.

Definition 2.3 [167]: Let T be a mapping of a metric space M = (X , ρ) into M. T is called a contraction mapping if there exists a number κ with 0 ≤ κ < 1 such that

ρ(T x, T y) ≤ κρ(x, y) for all x, y ∈ X .

Definition 2.4: Let H = (X , ‖.‖H ) with inner product < ., . > be a real Hilbert space. A self-map operator T : H −→ H is said to be nonexpansive if for any x, y ∈ H we have

‖T (x) − T (y)‖H ≤ ‖x − y‖H .



Definition 2.5: The map T : H −→ H is said to be firmly nonexpansive if for each x, y ∈ H,

‖T (x) − T (y)‖H^2 ≤ < T (x) − T (y), x − y > .

Remark 2.1 [169]: φ : H −→ H is a firmly nonexpansive mapping if T : H −→ H is a nonexpansive mapping, where

φ(x) = (1/2)(x + T (x)).

Moreover, every firmly nonexpansive mapping is nonexpansive by the Cauchy–Schwarz inequality.

Let (Ω∗ , σ) be a measurable space (σ a sigma-algebra) and C be a nonempty subset of a metric

space M. A mapping x : Ω∗ −→ M is measurable if x−1 (U ) ∈ σ for each open subset U of M. The

mapping T : Ω∗ ×C −→ M is a random map if for each fixed z ∈ C, the mapping T (., z) : Ω∗ −→ M

is measurable, and it is continuous if for each ω ∗ ∈ Ω∗ the mapping T (ω ∗ , .) : C −→ M is

continuous.

Definition 2.6 [170]: A measurable mapping x : Ω∗ −→ M is a random fixed point of the

random map T : Ω∗ × C −→ M if T (ω ∗ , x(ω ∗ )) = x(ω ∗ ) for each ω ∗ ∈ Ω∗ .

Definition 2.7 [170]: Let C be a nonempty subset of a metric space M and T : Ω∗ × C −→ C

be a random map. The map T is said to be contraction random operator if for each ω ∗ ∈ Ω∗ and

for arbitrary x, y ∈ C we have

ρ(T (ω ∗ , x), T (ω ∗ , y)) ≤ κρ(x, y), 0 ≤ κ < 1.

Definition 2.8 [170]: Let C be a nonempty subset of a real Hilbert space H and T : Ω∗ ×C −→

C be a random map. The map T is said to be nonexpansive random operator if for each ω ∗ ∈ Ω∗

and for arbitrary x, y ∈ C we have

‖T (ω ∗ , x) − T (ω ∗ , y)‖H ≤ ‖x − y‖H .

Definition 2.9 [170]: Let C be a nonempty subset of a real Hilbert space H and T : Ω∗ ×C −→

C be a random map. The map T is said to be firmly nonexpansive random operator if for each

ω ∗ ∈ Ω∗ and for arbitrary x, y ∈ C we have

‖T (ω ∗ , x) − T (ω ∗ , y)‖H^2 ≤ < T (ω ∗ , x) − T (ω ∗ , y), x − y > .



Now we present the relevant fixed point theorems.

Theorem 2.1 [167] (The Brouwer fixed point theorem): Every compact convex non-empty

subset of <n has the fixed point property.

The following theorem is in fact equivalent to the Brouwer fixed point theorem.

Theorem 2.2 [167]: The closed unit ball B^n has the fixed point property.

Theorem 2.3 [167] (The Banach Fixed Point Theorem): Any contraction mapping of a complete non-empty metric space M into itself has a unique fixed point.
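As a small numerical illustration added here (the map T(x) = cos x, the starting point, and the tolerance are arbitrary choices, not from the text), the Picard iteration of Section 2.2 applied to a contraction exhibits the unique fixed point guaranteed by the Banach theorem:

```python
import math

def picard(T, x0, tol=1e-12, max_iter=10_000):
    """Picard iteration x_{n+1} = T(x_n); stops when successive iterates agree to tol."""
    x = x0
    for _ in range(max_iter):
        x_next = T(x)
        if abs(x_next - x) < tol:
            return x_next
        x = x_next
    return x

# cos maps the reals into [-1, 1] and is a contraction there (|cos'(x)| <= sin(1) < 1),
# so the Banach fixed point theorem guarantees a unique fixed point x* = cos(x*),
# which the Picard iterates approach from any starting point.
x_star = picard(math.cos, x0=0.0)
print(round(x_star, 6))  # -> 0.739085
```

The same routine diverges or cycles for non-contractive maps, which motivates the averaged iterations discussed next.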

2.2 Relevant Fixed Point Iterations

The Picard iterative algorithm: The Picard iteration for finding a fixed point of an operator

T (x) is

xn+1 = T (xn ), n ∈ N ∪ {0}. (2.1)

The Krasnoselskii-Mann iterative algorithm [171]-[172]: The Krasnoselskii-Mann iteration for

finding a fixed point of an operator T (x) is

xn+1 = (1 − αn )xn + αn T (xn ), n ∈ N ∪ {0}, (2.2)

where αn ∈ [0, 1].

The Picard iteration may not always converge when T (x) is nonexpansive on a real Hilbert

space H. For example, consider T (x) := −x, x ∈ ℝ. However, Krasnoselskii [171] proved that Algorithm (2.2) with αn = 1/2 always converges to a fixed point of a nonexpansive mapping on H.
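The example T(x) = −x can be sketched numerically (an illustration added here, not part of the original text): the Picard iterates oscillate forever, while the Krasnoselskii–Mann iterates with αn = 1/2 reach the unique fixed point x = 0 immediately.

```python
def T(x):
    """The nonexpansive map T(x) = -x; its only fixed point is 0."""
    return -x

def picard_orbit(x, steps):
    out = [x]
    for _ in range(steps):
        x = T(x)                             # iteration (2.1)
        out.append(x)
    return out

def km_orbit(x, steps, alpha=0.5):
    out = [x]
    for _ in range(steps):
        x = (1 - alpha) * x + alpha * T(x)   # iteration (2.2)
        out.append(x)
    return out

print(picard_orbit(1.0, 4))  # -> [1.0, -1.0, 1.0, -1.0, 1.0]  (never converges)
print(km_orbit(1.0, 4))      # -> [1.0, 0.0, 0.0, 0.0, 0.0]    (fixed point at once)
```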

CHAPTER 3. TWO FRAMEWORKS FOR OPTIMIZATION

In this chapter, we define a new mathematical terminology called fixed value point and use it

to define a new mathematical optimization framework. The framework includes centralized convex

optimization, centralized robust convex optimization, and distributed convex optimization over

random networks. Then we give a framework for the distributed optimization problem with state-

dependent interactions. The reader is assumed to be familiar with convex optimization concepts

or is referred to [173].

Now we give the following definition.

Definition 3.1: If there exists a point x̂ ∈ M such that x̂ = T (ω ∗ , x̂) for all ω ∗ ∈ Ω∗ , we call it a fixed value point, and F V P (T ) represents the set of all fixed value points of T .

Remark 3.1: A random mapping may have a random fixed point but may not have a fixed

value point. For instance, if Ω∗ = {H, G} and T (H, x(H)) = 1, T (G, x(G)) = 0, then the random

variable x(H) = 1, x(G) = 0 is a random fixed point of T . However, T does not have any fixed

value point.

Remark 3.2: A fixed value point of a nonexpansive random mapping is a common fixed point

of a family of nonexpansive non-random mappings T (ω ∗ , .) for each ω ∗ . We refer the interested

reader to [174]-[176] for existence theorems for a common fixed point of nonexpansive non-random

mappings.

3.1 A Framework for Convex Optimization

Let H be a real Hilbert space. Given a convex function f : H −→ ℝ and a nonexpansive random mapping T : Ω∗ × H −→ H, the problem is to find x∗ ∈ argmin_x f (x) such that x∗ is a fixed value point of T (ω ∗ , x), i.e., we have the following minimization problem

min_x f (x)
subject to x ∈ F V P (T )    (3.1)

where F V P (T ) is the set of fixed value points of the random operator T (ω ∗ , x) (see Definition 3.1).

We assume that the problem is feasible, namely F V P (T ) 6= ∅.

The following proposition is a corollary of Proposition 5.3 in [177].

Proposition 3.1: Let C be a closed and convex subset of a real Hilbert space H. If T : C −→ C is nonexpansive, then the fixed point set of T is closed and convex.

Remark 3.3: From Proposition 3.1, the fixed point set of the nonexpansive non-random mapping T (ω ∗ , ·) : C −→ C, where C is a closed convex set, for each fixed ω ∗ , is a closed convex set. It is

known that the intersection of closed convex sets (finite, countable, or uncountable) is closed and

convex. Since, by Remark 3.2, the fixed value point set of a nonexpansive random operator T (ω ∗ , x) is the intersection of the fixed point sets of the nonexpansive non-random mappings T (ω ∗ , ·) for each fixed ω ∗ ∈ Ω∗ , i.e., F V P (T ) = ∩_{ω ∗ ∈Ω∗} F ix(T (ω ∗ , ·)), we have that F V P (T ) is a closed convex set. Therefore, Problem (3.1) is a convex optimization problem.

3.1.1 Application to Centralized Convex Optimization

The unconstrained optimization problem, i.e., min_x f (x), is included in the framework (3.1). In this case, the constraint set is x = x, or x ∈ F ix(T ) where T (x) := x. It is easy to check that T (x) := x

is nonexpansive. Moreover, constrained optimization problems are included in the framework (3.1).

In this case, the constraint set is x = PC (x) where C is a closed convex constraint set, and PC (x)

is the projection map onto C. It is known that T (x) := PC (x) is nonexpansive.
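As a small numerical check added here (the box constraint set C = [−1, 1]^5 is an arbitrary illustrative choice), the projection onto a closed convex set is nonexpansive, which is exactly what lets T (x) := PC (x) fit the framework:

```python
import numpy as np

def project_box(x, lo, hi):
    """Euclidean projection onto the closed convex box C = [lo, hi]^n."""
    return np.clip(x, lo, hi)

rng = np.random.default_rng(0)
x, y = rng.normal(size=5), rng.normal(size=5)
px, py = project_box(x, -1.0, 1.0), project_box(y, -1.0, 1.0)

# Nonexpansivity: ||P_C(x) - P_C(y)||_2 <= ||x - y||_2 for every pair x, y.
assert np.linalg.norm(px - py) <= np.linalg.norm(x - y) + 1e-12
# A point already in C is left unmoved by P_C, so Fix(P_C) = C.
assert np.allclose(project_box(px, -1.0, 1.0), px)
```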

3.1.2 Application to Centralized Robust Convex Optimization

Robust convex optimization has been investigated on Euclidean spaces [178]-[203]. In this

subsection, we define centralized robust convex optimization (CRCO) on real Hilbert spaces. CRCO

on H is in general of the form



min_{x∈C} f (x)
subject to g(x, ω ∗ ) ≤ 0, ∀ω ∗ ∈ Ω∗ ,    (3.2)
where C is a nonempty closed convex subset of H, f : H −→ ℝ is a convex function, g(x, .) : Ω∗ −→ ℝ is measurable for each fixed x ∈ H, g(., ω ∗ ) : H −→ ℝ is a convex function for each fixed

ω ∗ ∈ Ω∗ , and the uncertainty ω ∗ enters into the constraint function g(x, ω ∗ ). Assume the problem

is feasible, i.e., there exists an x∗ ∈ C such that g(x∗ , ω ∗ ) ≤ 0, ∀ω ∗ ∈ Ω∗ .

The constraint set of (3.2), i.e., {x|x ∈ C, g(x, ω ∗ ) ≤ 0, ∀ω ∗ ∈ Ω∗ }, can be converted to

{x|x = P g (ω ∗ , x), ∀ω ∗ ∈ Ω∗ } where P g (ω ∗ , x) is the projection onto the closed convex set {z|z ∈

C, g(z, ω ∗ ) ≤ 0} for each fixed ω ∗ ∈ Ω∗ . The constraint set of the CRCO problem (3.2) is in fact

x ∈ F V P (P g ). Therefore, (3.2) is equivalent to

min_x f (x)
subject to x ∈ F V P (P^g ).    (3.3)
Since the projection operator P g (ω ∗ , x) is nonexpansive, (3.1) includes (3.3) as a special case.
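To make the conversion concrete, here is a minimal sketch (the two-scenario uncertainty set and the choice C = ℝ^2 are hypothetical, not from the thesis): each ω ∗ contributes a halfspace constraint g(x, ω ∗ ) = aᵀx − b ≤ 0, P^g (ω ∗ , ·) is the projection onto that halfspace, and a point is feasible for the robust problem exactly when it is a fixed value point of P^g :

```python
import numpy as np

def proj_halfspace(x, a, b):
    """Projection onto {z : a.z <= b}; nonexpansive for each fixed scenario."""
    viol = max(0.0, a @ x - b)
    return x - (viol / (a @ a)) * a

# Hypothetical uncertainty set Omega* = {w1, w2}, each giving one constraint a.x <= b.
scenarios = [(np.array([1.0, 0.0]), 1.0),
             (np.array([0.0, 1.0]), 1.0)]

x_feas = np.array([0.5, 0.5])    # satisfies g(x, w) <= 0 for every scenario
x_infeas = np.array([2.0, 0.0])  # violates the first scenario's constraint

# x_feas is a fixed value point: it is left unmoved by EVERY scenario's projection.
assert all(np.allclose(proj_halfspace(x_feas, a, b), x_feas) for a, b in scenarios)
# x_infeas is moved by at least one projection, so it is not in FVP(P^g).
assert not np.allclose(proj_halfspace(x_infeas, *scenarios[0]), x_infeas)
```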

3.1.3 Application to Distributed Convex Optimization over Random Networks

In this subsection, we define the distributed convex optimization problem over random networks.

A network of m nodes labeled by the set V = {1, 2, ..., m} is considered. The topology of the

interconnections among nodes is not fixed but defined by a set of graphs G(ω ∗ ) = (V, E(ω ∗ )) where

E(ω ∗ ) is the ordered edge set E(ω ∗ ) ⊆ V × V and ω ∗ ∈ Ω∗ where Ω∗ is the set of all possible

communication graphs, i.e., Ω∗ = {G1 , G2 , ..., GN̄ }. We write Niin (ω ∗ )/Niout (ω ∗ ) for the labels of

agent i’s in/out neighbors at graph G(ω ∗ ) so that there is an arc in G(ω ∗ ) from vertex j/i to

vertex i/j only if agent i receives/sends information from/to agent j. We write Ni (ω ∗ ) when

Niin (ω ∗ ) = Niout (ω ∗ ). We assume that there are no self-looped arcs in the communication graphs.

We define the weighted graph matrix W(ω ∗ ) = [Wij (ω ∗ )] with Wij (ω ∗ ) = aij (ω ∗ ) for j ∈

Niin (ω ∗ ) ∪ {i}, and Wij (ω ∗ ) = 0 otherwise, where aij (ω ∗ ) > 0 is the scalar constant weight that

agent i assigns to the information xj received from agent j. For instance, W(Gk ) = Im for some 1 ≤ k ≤ N̄ implies that there are no edges in Gk , or that no node is activated for a communication update in an asynchronous protocol, or both.

Now we define the distributed convex optimization problem as follows: for each node i ∈ V, we

associate a private convex cost function fi : ℝ^n −→ ℝ which is known to node i. The objective of

each agent is to collaboratively seek the solution of the following optimization problem using local

information exchange with the neighbors and switching communication topologies:


min_x Σ_{i=1}^{m} fi (x)

where x ∈ ℝ^n . We assume that there is no communication delay or noise in delivering a message

from agent j to agent i.

The full formulation of the above problem is as follows:

Problem 3.1: The distributed convex optimization is formulated as

min_x f (x) := Σ_{i=1}^{m} fi (xi )
subject to x1 = x2 = ... = xm    (3.4)

where x = [x1^T , ..., xm^T ]^T , xi ∈ ℝ^n , i = 1, 2, ..., m, fi : ℝ^n −→ ℝ is a private cost function known to

node i, and the constraint is achieved through random graph interactions.

Remark 3.4: The set C = {x ∈ ℝ^{mn} | xi = xj , 1 ≤ i, j ≤ m, xi ∈ ℝ^n } is known as the consensus subspace.

Now we impose the following assumptions.

Assumption 3.1: The weighted graph matrix W(ω ∗ ) is doubly stochastic for each ω ∗ ∈ Ω∗ , i.e.,

i) Σ_{j∈Ni^in (ω ∗ )∪{i}} Wij (ω ∗ ) = 1, i = 1, 2, ..., m,

ii) Σ_{j∈Ni^out (ω ∗ )∪{i}} Wij (ω ∗ ) = 1, i = 1, 2, ..., m.

Note that any network with undirected links satisfies Assumption 3.1.

Assumption 3.2: The union of all of the graphs in Ω∗ is strongly connected.

Now we give the following lemma regarding Assumption 3.2.



Lemma 3.1: The union of all of the graphs in Ω∗ is strongly connected if and only if Re[λ2 (Σ_{ω ∗ ∈Ω∗} (Im − W(ω ∗ )))] > 0.

Proof: Since the union of all of the graphs is strongly connected, the matrix Σ_{ω ∗ ∈Ω∗} W(ω ∗ ) is irreducible. Therefore, according to the Perron–Frobenius theorem for irreducible matrices, it has

a unique positive real largest eigenvalue. Since, by Assumption 3.1, W(ω ∗ ), ∀ω ∗ ∈ Ω∗ , is doubly stochastic, the unique largest eigenvalue of the matrix Σ_{ω ∗ ∈Ω∗} W(ω ∗ ) is λ∗ = N̄ . Thus Re[λ2 (Σ_{ω ∗ ∈Ω∗} (Im − W(ω ∗ )))] > 0. Conversely, we prove it by contradiction. Assume that the

union of all of the graphs is not strongly connected. It is well-known that there exists a permutation matrix P such that

P (Σ_{ω ∗ ∈Ω∗} W(ω ∗ )) P^T = [ A  B ; 0  C ],

where A and C are square matrices. Therefore, spec(Σ_{ω ∗ ∈Ω∗} W(ω ∗ )) = spec(A) ∪ spec(C). From Assumption 3.1, all columns of A have summation equal to N̄ , and all rows of C have summation equal to N̄ . Therefore, the eigenvalue λ = N̄ has multiplicity 2 in spec(Σ_{ω ∗ ∈Ω∗} W(ω ∗ )), which is a
contradiction. Thus the proof is complete.
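The eigenvalue test of Lemma 3.1 can be checked numerically; in this sketch (the two weight matrices are illustrative choices) neither graph is connected on its own, but their union is, and Re[λ2] of the summed Laplacian-structured matrix is positive:

```python
import numpy as np

# Two doubly stochastic weight matrices on m = 3 nodes: W1 couples nodes 1-2,
# W2 couples nodes 2-3; neither graph alone is connected, but their union is.
W1 = np.array([[0.5, 0.5, 0.0],
               [0.5, 0.5, 0.0],
               [0.0, 0.0, 1.0]])
W2 = np.array([[1.0, 0.0, 0.0],
               [0.0, 0.5, 0.5],
               [0.0, 0.5, 0.5]])

Lam = sum(np.eye(3) - W for W in (W1, W2))      # Laplacian-structured matrix
eigs = np.sort_complex(np.linalg.eigvals(Lam))  # sorted by real part
lambda2 = eigs[1].real
# Lemma 3.1's criterion: Re(lambda_2) > 0 iff the union is strongly connected.
print(round(lambda2, 6))  # -> 0.5
```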

Assumption 3.2 ensures that the information sent from each node will be finally obtained by

every other node through a directed path. Now, Problem 3.1 with Assumptions 3.1 and 3.2 can be

reformulated as the following problem.

Problem 3.2: Problem 3.1 under Assumptions 3.1 and 3.2 can be formulated as

min_x f (x) := Σ_{i=1}^{m} fi (xi )
subject to W (ω ∗ )x = x, ∀ω ∗ ∈ Ω∗ ,    (3.5)

where W (ω ∗ ) = W(ω ∗ ) ⊗ In , ω ∗ ∈ Ω∗ .

Now we show that Problems 3.1 and 3.2 are equivalent. We obtain from W (ω ∗ )x = x, ∀ω ∗ ∈ Ω∗

that

(Imn − W (ω ∗ ))x = 0, ∀ω ∗ ∈ Ω∗ ,

which implies that


Σ_{ω ∗ ∈Ω∗} (Imn − W (ω ∗ ))x = 0.    (3.6)
Now we have

Σ_{ω ∗ ∈Ω∗} (Imn − W (ω ∗ )) = Σ_{ω ∗ ∈Ω∗} ((Im − W(ω ∗ )) ⊗ In ) = (Σ_{ω ∗ ∈Ω∗} (Im − W(ω ∗ ))) ⊗ In = Λ ⊗ In ,

where Λ := Σ_{ω ∗ ∈Ω∗} (Im − W(ω ∗ )).
Λ has the following properties: the summation of each row is equal to zero; the diagonal elements are non-negative; the off-diagonal elements are non-positive. Therefore, Λ has the Laplacian matrix

structure. Since Re[λ2 (Λ)] > 0 (see the proof of Lemma 3.1), (3.6) implies that x1 = x2 = . . . = xm .

Therefore, Problems 3.1 and 3.2 are equivalent.

Although double stochasticity is restrictive in a distributed setting [204], we show that Assumption 3.1 allows us to dispense with any assumption on the distribution of random interconnection graphs. Now we show that

the random operator T (ω ∗ , x) := W (ω ∗ )x with Assumption 3.1 is nonexpansive in the Hilbert space H = (ℝ^{mn} , ‖.‖2 ). For arbitrary x, y ∈ ℝ^{mn} , we have

‖T (ω ∗ , x) − T (ω ∗ , y)‖2 = ‖W (ω ∗ )x − W (ω ∗ )y‖2 = ‖W (ω ∗ )(x − y)‖2 ≤ ‖W (ω ∗ )‖2 ‖x − y‖2 .

Now we have the following lemma.

Lemma 3.2 [205]: Let W ∈ ℝ^{m×m} . Then ‖W ‖2 ≤ √(‖W ‖1 ‖W ‖∞ ).

By Assumption 3.1 and Lemma 3.2, we obtain ‖W (ω ∗ )‖2 ≤ 1, ∀ω ∗ ∈ Ω∗ . Thus we have

‖T (ω ∗ , x) − T (ω ∗ , y)‖2 ≤ ‖W (ω ∗ )‖2 ‖x − y‖2 ≤ ‖x − y‖2    (3.7)

which implies that T is nonexpansive.
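A quick numerical sanity check of this bound (the random doubly stochastic matrix below is built by averaging permutation matrices, an illustrative construction):

```python
import numpy as np

rng = np.random.default_rng(1)
m = 5
# Averages of permutation matrices are doubly stochastic (Birkhoff-von Neumann).
perms = [np.eye(m)[rng.permutation(m)] for _ in range(4)]
W = sum(perms) / len(perms)

norm1 = np.linalg.norm(W, 1)          # max column sum = 1 (column stochastic)
norm_inf = np.linalg.norm(W, np.inf)  # max row sum = 1 (row stochastic)
norm2 = np.linalg.norm(W, 2)          # spectral norm

# Lemma 3.2: ||W||_2 <= sqrt(||W||_1 ||W||_inf) = 1, so x -> Wx is nonexpansive.
assert abs(norm1 - 1.0) < 1e-12 and abs(norm_inf - 1.0) < 1e-12
assert norm2 <= np.sqrt(norm1 * norm_inf) + 1e-12
```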

Now we give the following definition.

Definition 3.2: Given a weighted graph matrix W(ω ∗ ), we call T (ω ∗ , x) := W (ω ∗ )x, ω ∗ ∈ Ω∗ ,

weighted random operator of the graph. Similarly, for non-random case, we call T (x) := W x

weighted operator of the graph.

Remark 3.5: We have shown that, since the weighted random operator of the graph is nonexpansive in the Hilbert space H = (ℝ^{mn} , ‖.‖2 ), no assumption on the distribution of random communication topologies is needed.

Remark 3.6: The consensus subspace is in fact the fixed value point set of the weighted random operator of the graph under Assumption 3.2, i.e., C = F V P (T ).

To conclude, we have shown that Problem 3.2 is a special case of (3.1) where T (ω ∗ , x) :=

W (ω ∗ )x.

3.2 A Framework for Distributed Convex Optimization with State-Dependent


Interactions

Before we give a framework for distributed optimization with state-dependent interactions, we

clarify that existing frameworks cannot be applied for this problem.

Fast Lipschitz Optimization has been introduced as a powerful method to capture the unique

solution of convex or non-convex optimization problems [206]-[208]. If W does not depend on the

states, i.e., in the case of state-independent weighted graphs, the condition ‖W ‖ < 1 is not satisfied for our problem because ‖W ‖2 = 1; in fact, this condition makes the operator T (x) := W x a contraction, and the feasible set would be a unique point (see Theorem 2.3) instead of the set C. Therefore, the results given in [206]-[208] cannot be applied here.

Convex minimization over the fixed point set of a nonexpansive mapping has been studied in [209] and references therein. This method has been usefully applied to signal processing, inverse problems,

network bandwidth allocation and so on [210]-[214]. If W does not depend on the states, we have

shown that the operator T (x) := W x is a nonexpansive mapping with Assumption 3.1. For any

W (x), the operator T may not be a nonexpansive mapping. Therefore, the results given in [209] and

references therein cannot be applied here. Similarly, the proposed framework in the previous section

for the non-random case cannot be applied here. Note that the framework of minimization over the fixed point set of a nonexpansive mapping considered in [209] and references therein includes the centralized

convex optimization in Subsection 3.1.1 but does not include the problems in Subsections 3.1.2 and

3.1.3.

Similar to the previous subsection, distributed optimization with state-dependent interactions

can be formulated as
min_x f (x) := Σ_{i=1}^{m} fi (xi )
subject to W (x)x = x    (3.8)

where W (x) = W(x) ⊗ In , and W(x) = [Wij (xi , xj )] is the state-dependent weighted matrix of the

graph satisfying the following assumptions.

Assumption 3.3: The weights Wij : ℝ^n × ℝ^n −→ [0, 1] are continuous, and the state-dependent weighted matrix of the graph is doubly stochastic, i.e.,

i) Σ_{j∈Ni^in ∪{i}} Wij (xi , xj ) = 1, i = 1, 2, ..., m,

ii) Σ_{j∈Ni^out ∪{i}} Wij (xi , xj ) = 1, j = 1, 2, ..., m.

Assumption 3.4: The graph is strongly connected for all x ∈ ℝ^{mn} .

Assumptions 3.3 and 3.4 ensure the connectivity of the graph, and that the information sent

from each node will be finally obtained by every other node through a path. Note that Assumption

3.3 when applied to state-dependent weights would require undirected connections, and directed

graphs are allowed in the special case of state-independent weights.

Now we show that the only solution of W (x)x = x with Assumptions 3.3 and 3.4 is x1 = x2 = ... = xm , i.e., the consensus subspace. Assume an x̃ = [x̃1^T , x̃2^T , ..., x̃m^T ]^T which satisfies W (x̃)x̃ = x̃.

Since the summation of rows of W(x̃) − Im is zero, the matrix W(x̃) − Im has an eigenvalue

zero; moreover, since the graph is strongly connected, this eigenvalue is unique. According to

Assumptions 3.3 and 3.4, W (x̃) has some nonzero elements. Therefore, x̃ must be in the null space

of W (x̃) − Imn which implies x̃1 = x̃2 = ... = x̃m . Therefore, the only solution of W (x)x = x with

Assumptions 3.3 and 3.4 is x1 = x2 = ... = xm .

Note that from Assumption 3.3 and Lemma 3.2, we have that ‖W (x)‖2 ≤ 1, ∀x ∈ ℝ^{mn} . The Hilbert space considered here is H = (ℝ^{mn} , ‖.‖2 ).

Now, we introduce a framework for modeling multi-agent optimization problems.

The problem given by (3.8) with Assumptions 3.3 and 3.4 can be reformulated as
min_x f (x) := Σ_{i=1}^{m} fi (xi )
subject to x ∈ F ix(T )    (3.9)

where T (x) := W (x)x. We obtain ‖T (x)‖2 ≤ ‖W (x)‖2 ‖x‖2 ≤ ‖x‖2 ; in fact, the operator T maps every closed ball B^{mn} into itself. Thus, according to Theorem 2.2, we can guarantee that there

exists a fixed point of T , i.e., there exists a point x̂ such that x̂ = W (x̂)x̂ which is the same as the

constraint W (x)x = x.

Now we generalize the framework (3.9) to distributed optimization with state-dependent interactions and time-varying topologies.

The topology of the network is represented by Gn = (V, En ) at time n ∈ N ∪ {0} with the

ordered edge set En ⊆ V × V. Consider the set G = {Gn : n ∈ N ∪ {0}}. Since m is finite, the cardinality of G, namely |G| = N̄ , is finite.

Now we impose the following assumptions.

Assumption 3.5: For each G ∈ G, the weights Wij (G) : ℝ^n × ℝ^n −→ [0, 1] are continuous, and the state-dependent weighted matrix of the graph is doubly stochastic, i.e.,

i) Σ_{j∈Ni^in ∪{i}} Wij (xi , xj , G) = 1, i = 1, 2, ..., m,

ii) Σ_{j∈Ni^out ∪{i}} Wij (xi , xj , G) = 1, j = 1, 2, ..., m.

Assumption 3.6: The union of the graphs in G is strongly connected for all x ∈ ℝ^{mn} .

Now we give the following lemma regarding Assumption 3.6.

Lemma 3.3: The union of the graphs in G is strongly connected for all x ∈ ℝ^{mn} if and only if Re[λ2 (Σ_{G∈G} (Im − W(x, G)))] > 0 for all x ∈ ℝ^{mn} .

Proof: Since the union of all of the graphs is strongly connected for all x ∈ ℝ^{mn} , the matrix Σ_{G∈G} W(x, G) is irreducible for all x ∈ ℝ^{mn} . Therefore, according to the Perron–Frobenius theorem for irreducible matrices, it has a unique positive real largest eigenvalue for each x ∈ ℝ^{mn} . Since, by Assumption 3.5, W(x, G), ∀G ∈ G, ∀x ∈ ℝ^{mn} , is doubly stochastic, the unique largest eigenvalue of the matrix Σ_{G∈G} W(x, G) is λ∗ (x) = N̄ . Thus Re[λ2 (Σ_{G∈G} (Im − W(x, G)))] > 0. Now we prove

the opposite direction, by contradiction. Assume that the union of all of the graphs in G is not strongly connected for some x̃ ∈ ℝ^{mn} . It is well-known that there exists a permutation matrix P such that

P (Σ_{G∈G} W(x̃, G)) P^T = [ A  B ; 0  C ],

where A and C are square matrices. Therefore, spec(Σ_{G∈G} W(x̃, G)) = spec(A) ∪ spec(C). From Assumption 3.5, all columns of A have summation equal to N̄ , and all rows of C have summation equal to N̄ . Therefore, the eigenvalue λ = N̄ has multiplicity 2 in spec(Σ_{G∈G} W(x̃, G)), which is a

contradiction. Thus the proof is complete.

Now we show that the only solution of W (x, G)x = x, ∀G ∈ G, with Assumptions 3.5 and 3.6 is

the constraint x1 = x2 = ... = xm .

We obtain from W (x, G)x = x, ∀G ∈ G, that

(Imn − W (x, G))x = 0, ∀G ∈ G,

which implies that


Σ_{G∈G} (Imn − W (x, G))x = 0.    (3.10)
Now we have

Σ_{G∈G} (Imn − W (x, G)) = Σ_{G∈G} ((Im − W(x, G)) ⊗ In ) = (Σ_{G∈G} (Im − W(x, G))) ⊗ In = Λ(x) ⊗ In ,

where Λ(x) := Σ_{G∈G} (Im − W(x, G)).

Λ(x) has the following properties: the summation of each row is equal to zero; the diagonal elements are non-negative; the off-diagonal elements are non-positive for all x ∈ ℝ^{mn} . Therefore, Λ(x) has

the Laplacian matrix structure. Since Re[λ2 (Λ(x))] > 0, ∀x ∈ <mn , (see the proof of Lemma 3.3),

(3.10) implies that x1 = x2 = . . . = xm .

Therefore, distributed convex optimization with state-dependent interactions and time-varying

topologies with Assumptions 3.5 and 3.6 can be formulated as


min_x f (x) := Σ_{i=1}^{m} fi (xi )
subject to W (x, G)x = x, ∀G ∈ G.    (3.11)

In the remaining part of this section, we introduce a framework for modeling multi-agent optimization problems. The problem stated by (3.11) can be reformulated as


min_x f (x) := Σ_{i=1}^{m} fi (xi )
subject to x ∈ ∩_{G∈G} F ix(T (x, G))    (3.12)

where T (x, G) := W (x, G)x.

Definition 3.3: We call T (x) := W (x)x state-dependent weighted operator of the graph.

Remark 3.7: The consensus subspace C (see Remark 3.4) is the intersection of the fixed point sets of the state-dependent weighted operators of the graphs under Assumption 3.6, i.e., C = ∩_{G∈G} F ix(T (x, G)).

CHAPTER 4. ALGORITHMS TO SOLVE THE OPTIMIZATION

In this chapter, we propose an algorithm, in the first section, to solve the optimization problem

(3.1). An application of the algorithm is to solve distributed convex optimization over random

networks with/without asynchronous protocols. In the later sections, we show that the random

Picard and the random Krasnoselskii-Mann iterations are useful to solve feasibility problem of (3.1)

under suitable assumptions.

Before we present the algorithm, we give some definitions needed for the next section. For simplicity, we write ‖.‖H = ‖.‖ in this chapter.

Definition 4.1: An operator A : H −→ H is said to be monotone if

< x − y, Ax − Ay > ≥ 0

for all x, y ∈ H.

Definition 4.2: A : H −→ H is called ξ-strongly monotone if

< x − y, Ax − Ay > ≥ ξ‖x − y‖2

for all x, y ∈ H.

Remark 4.1: A function is ξ-strongly convex if its gradient is ξ-strongly monotone.

Definition 4.3: A mapping A : H −→ H is said to be K-Lipschitz continuous if there exists a

K > 0 such that

‖Ax − Ay‖ ≤ K‖x − y‖

for all x, y ∈ H.

Definition 4.4: A sequence of random variables xn is said to converge pointwise (surely) to x if for every ω ∈ Ω,

lim_{n−→∞} ‖xn (ω) − x(ω)‖ = 0.

Definition 4.5: A sequence of random variables xn is said to converge almost surely to x if there exists a subset Θ ⊆ Ω such that P r(Θ) = 0, and for every ω ∈ Ω \ Θ,

lim_{n−→∞} ‖xn (ω) − x(ω)‖ = 0.

Definition 4.6: A sequence of random variables xn is said to converge in mean square to x if E[‖xn − x‖2 ] −→ 0 as n −→ ∞.

4.1 A Proposed Algorithm

Problem (3.1) is a static optimization problem that may involve a large number of constraints, as the cardinality of Ω∗ can be large. Moreover, explicit knowledge of T (ω ∗ , x), ∀ω ∗ ∈ Ω∗ , is in principle necessary to formulate the constraints.

We are interested in obtaining an iterative solution to Problem (3.1) where the constraint set is not known a priori and randomly changes over time. In other words, we solve Problem (3.1) iteratively such that one set of constraints appears at each time. In what follows, we assume that the constraint changes randomly over time.

We propose the following algorithm

xn+1 = αn (xn − β∇f (xn )) + (1 − αn )T̂ (ωn∗ , xn ), (4.1)

where T̂ (ωn∗ , xn ) := (1−η)xn +ηT (ωn∗ , xn ), η ∈ (0, 1), αn ∈ [0, 1], and ωn∗ denotes an outcome ω ∗ ∈ Ω∗

at iteration n. The convergence of the algorithm is proved under the following assumption.

Assumption 4.1: f (x) is ξ-strongly convex, and ∇f (x) is K-Lipschitz continuous.

Remark 4.2: A key distinguishing feature of Algorithm (4.1) is the presence of αn in the

second term. As we will see, this with Assumption 4.1 will introduce nice convergence properties.
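As a minimal end-to-end sketch of iteration (4.1) (all problem data here — the scalar quadratic private costs fi (x) = (1/2)(x − ci )^2, the two graph matrices, and the parameter choices — are illustrative assumptions, not from the thesis), the agents reach consensus at the minimizer of Σi fi , the mean of the ci :

```python
import numpy as np

rng = np.random.default_rng(2)
m = 3
c = np.array([0.0, 3.0, 6.0])      # agent i's private cost: f_i(x) = 0.5*(x - c_i)^2
grad_f = lambda x: x - c           # gradient of the separable cost (xi = K = 1)

# Two doubly stochastic graph matrices whose union is strongly connected.
W1 = np.array([[0.5, 0.5, 0.0], [0.5, 0.5, 0.0], [0.0, 0.0, 1.0]])
W2 = np.array([[1.0, 0.0, 0.0], [0.0, 0.5, 0.5], [0.0, 0.5, 0.5]])
graphs = [W1, W2]

beta, eta = 1.0, 0.5               # beta in (0, 2*xi/K^2) = (0, 2); eta in (0, 1)
x = np.zeros(m)
for n in range(20_000):
    alpha = 1.0 / (1 + n) ** 0.9   # diminishing steps as in Remark 4.4 (zeta = 0.9)
    W = graphs[rng.integers(2)]    # i.i.d. random graph: satisfies Assumption 4.2
    T_hat = (1 - eta) * x + eta * (W @ x)
    x = alpha * (x - beta * grad_f(x)) + (1 - alpha) * T_hat

# The iterates approach consensus at mean(c) = 3, the constrained minimizer.
print(np.round(x, 1))
```

Since the random operator here is the weighted random operator of the graph, no knowledge of the graph distribution enters the update; each agent only uses the currently active weights.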

4.1.1 Convergence Analysis

Consider a probability measure µ defined on the space (Ω, F) where

Ω = Ω∗ × Ω∗ × Ω∗ × . . .

F = σ × σ × σ × ...

such that (Ω, F, µ) forms a probability space. We denote a realization in this probability space by ω ∈ Ω. We have the following assumption.

Assumption 4.2: There exists a nonempty subset K̃ ⊆ Ω∗ such that F V P (T ) = {z̃|z̃ ∈ H, z̃ =

T (ω̄, z̃), ∀ω̄ ∈ K̃}, and each element of K̃ occurs infinitely often almost surely.
Remark 4.3: If the sequence {ωn∗ }_{n=0}^∞ is mutually independent with Σ_{n=0}^∞ P rn (ω̄) = ∞, where P rn (ω̄) is the probability of ω̄ occurring at time n, then according to the Borel–Cantelli lemma [215], Assumption 4.2 is satisfied. Consequently, any i.i.d. random sequence satisfies Assumption 4.2. Any ergodic stationary sequence {ωn∗ }_{n=0}^∞ with P r(ω̄) > 0 satisfies Assumption 4.2 (see the proof of Lemma 1 in [49]). Consequently, any time-invariant Markov chain with its unique stationary distribution as the initial distribution satisfies Assumption 4.2 (see [49]).
as the initial distribution satisfies Assumption 4.2 (see [49]).

Lemma 4.1: Let T̂ (ω ∗ , x) := (1 − η)x + ηT (ω ∗ , x), ω ∗ ∈ Ω∗ , x ∈ H, with a nonexpansive random operator T , F V P (T ) ≠ ∅, and η ∈ (0, 1]. Then

(i) F V P (T ) = F V P (T̂ ).

(ii) < x − T̂ (ω ∗ , x), x − z > ≥ (η/2)‖x − T (ω ∗ , x)‖2 , ∀z ∈ F V P (T ), ∀ω ∗ ∈ Ω∗ .

(iii) T̂ (ω ∗ , x) is nonexpansive.

Proof: (i)

Consider a x̂ ∈ F V P (T ). Thus x̂ = T (ω ∗ , x̂), ∀ω ∗ ∈ Ω∗ . Hence

T̂ (ω ∗ , x̂) = (1 − η)x̂ + ηT (ω ∗ , x̂) = x̂, ∀ω ∗ ∈ Ω∗ ,

which implies that F V P (T ) ⊆ F V P (T̂ ). Conversely, consider a x̂ ∈ F V P (T̂ ). Indeed, x̂ =

T̂ (ω ∗ , x̂), ∀ω ∗ ∈ Ω∗ . Thus we have

x̂ = T̂ (ω ∗ , x̂) = (1 − η)x̂ + ηT (ω ∗ , x̂), ∀ω ∗ ∈ Ω∗ ,



or

x̂ = T (ω ∗ , x̂), ∀ω ∗ ∈ Ω∗ ,

which implies that F V P (T̂ ) ⊆ F V P (T ). Therefore, we can conclude that F V P (T̂ ) = F V P (T ).

Thus the proof of part (i) of Lemma 4.1 is complete.

(ii)

We have from nonexpansivity of T (ω ∗ , x) that

‖T (ω ∗ , x) − z‖2 ≤ ‖x − z‖2 , ∀z ∈ F V P (T ), ∀ω ∗ ∈ Ω∗ .    (4.2)

From the fact that

‖u + v‖2 = ‖u‖2 + ‖v‖2 + 2 < u, v >, ∀u, v ∈ H,    (4.3)

we obtain for all z ∈ F V P (T ) and for all ω ∗ ∈ Ω∗ that

‖T (ω ∗ , x) − z‖2 = ‖T (ω ∗ , x) − x + x − z‖2 = ‖T (ω ∗ , x) − x‖2 + ‖x − z‖2 + 2 < T (ω ∗ , x) − x, x − z > .    (4.4)

Substituting (4.4) into (4.2) yields

2 < x − T (ω ∗ , x), x − z > ≥ ‖T (ω ∗ , x) − x‖2 .    (4.5)

From the definition of T̂ (ω ∗ , x), substituting x − T (ω ∗ , x) = (x − T̂ (ω ∗ , x))/η into the left hand side of inequality (4.5) implies (ii). Thus the proof of part (ii) of Lemma 4.1 is complete.

(iii)

We have from nonexpansivity of T (ω ∗ , x) for arbitrary x, y ∈ H that

‖T̂ (ω ∗ , x) − T̂ (ω ∗ , y)‖ ≤ (1 − η)‖x − y‖ + η‖T (ω ∗ , x) − T (ω ∗ , y)‖ ≤ (1 − η)‖x − y‖ + η‖x − y‖ = ‖x − y‖, ∀ω ∗ ∈ Ω∗ .

Therefore, T̂ (ω ∗ , x) is a nonexpansive random operator, and the proof of part (iii) of Lemma 4.1

is complete.

4.1.1.1 Almost Sure Convergence

Theorem 4.1: Consider Problem (3.1) with Assumptions 4.1 and 4.2. Let β ∈ (0, 2ξ/K^2 ) and αn ∈ [0, 1], n ∈ N ∪ {0}, such that

(a) lim_{n−→∞} αn = 0,

(b) Σ_{n=0}^∞ αn = ∞.

Then starting from any initial point, the sequence generated by (4.1) globally converges almost

surely to the unique solution of the problem.


Remark 4.4: An example of αn satisfying (a) and (b) of Theorem 4.1 is αn := 1/(1 + n)^ζ , where ζ ∈ (0, 1].

Proof of Theorem 4.1:

We prove Theorem 4.1 in three steps:

Step 1: {xn }_{n=0}^∞ , ∀ω ∈ Ω, is bounded.

Step 2: {xn }_{n=0}^∞ converges almost surely to a random variable supported by the feasible set.

Step 3: {xn }_{n=0}^∞ converges almost surely to the optimal solution.

Remark 4.5: The definition of fixed value point is a bridge from deterministic analysis to

random analysis of the algorithm. With the help of fixed value point set and nonexpansivity

property of the random operator T (ω ∗ , x), we are able to: first, prove boundedness of the generated

sequence {xn } in a deterministic way in Step 1; second, extend deterministic tools to random

cases such as part (ii) of Lemma 4.1 and use it for proving the convergence to the feasible set

in Step 2; third, apply deterministic tools to proving the convergence to the optimal solution in

Step 3. Therefore, the definition of fixed value point set with nonexpansivity property of T (ω ∗ , x)

makes analysis of random processes easier than those of existing results regardless of switching

distributions.

Now we give the proofs of Steps 1-3 in detail.

Step 1: {xn}_{n=0}^∞ , ∀ω ∈ Ω, is bounded.

Since the cost function is strongly convex and the constraint set is closed, the problem has a unique solution. Let x∗ be the unique solution of the problem. Since x∗ is the solution, we have

that x∗ = T̂ (ωn∗ , x∗ ), ∀ωn∗ ∈ Ω∗ , ∀n ∈ N ∪ {0} (see part (i) of Lemma 4.1). Also, we can write

x∗ = αn x∗ + (1 − αn )T̂ (ωn∗ , x∗ ), ∀ωn∗ ∈ Ω∗ , ∀n ∈ N ∪ {0}. Therefore, we have

kxn+1 − x∗ k = kαn (xn − β∇f (xn )) + (1 − αn )T̂ (ωn∗ , xn ) − x∗ k

= kαn (xn − β∇f (xn ) − x∗ ) + (1 − αn )(T̂ (ωn∗ , xn ) − T̂ (ωn∗ , x∗ ))k

≤ αn kxn − β∇f (xn ) − x∗ k + (1 − αn )kT̂ (ωn∗ , xn ) − T̂ (ωn∗ , x∗ )k.

Since T̂ (ω ∗ , x) is a nonexpansive random operator (see part (iii) of Lemma 4.1), the above can be

written as

kxn+1 − x∗ k ≤ αn kxn − β∇f (xn ) − x∗ k + (1 − αn )kT̂ (ωn∗ , xn ) − T̂ (ωn∗ , x∗ )k

≤ αn kxn − β∇f (xn ) − x∗ k + (1 − αn )kxn − x∗ k. (4.6)

Since ∇f (x) is ξ-strongly monotone, and ∇f (x) is K-Lipschitz continuous, we obtain from (4.3)

for any x, y ∈ H that

kx − y − β(∇f (x) − ∇f (y))k2 = kx − yk2 − 2β < ∇f (x) − ∇f (y), x − y > +β 2 k∇f (x) − ∇f (y)k2

≤ kx − yk2 − 2ξβkx − yk2 + K 2 β 2 kx − yk2

= (1 − 2ξβ + β 2 K 2 )kx − yk2

= (1 − γ)2 kx − yk2 ,

where γ = 1 − √(1 − β(2ξ − βK 2 )), and selecting β ∈ (0, 2ξ/K²) implies 0 < γ ≤ 1. Indeed, we have

kx − y − β(∇f (x) − ∇f (y))k ≤ (1 − γ)kx − yk. (4.7)

We have that

kxn − β∇f (xn ) − x∗ k = kxn − x∗ − β(∇f (xn ) − ∇f (x∗ )) − β∇f (x∗ )k

≤ kxn − x∗ − β(∇f (xn ) − ∇f (x∗ ))k + βk∇f (x∗ )k. (4.8)

Therefore, (4.7) and (4.8) imply

kxn − β∇f (xn ) − x∗ k ≤ kxn − x∗ − β(∇f (xn ) − ∇f (x∗ ))k + βk∇f (x∗ )k

≤ (1 − γ)kxn − x∗ k + βk∇f (x∗ )k. (4.9)



Substituting (4.9) into (4.6) yields

kxn+1 − x∗ k ≤ (1 − γαn )kxn − x∗ k + αn βk∇f (x∗ )k

= (1 − γαn )kxn − x∗ k + γαn (βk∇f (x∗ )k/γ),

which by induction implies that

kxn+1 − x∗ k ≤ max{kx0 − x∗ k, βk∇f (x∗ )k/γ},

which implies that kxn − x∗ k, n ∈ N ∪ {0}, ∀ω ∈ Ω, is bounded. Therefore, {xn}_{n=0}^∞ is bounded for all ω ∈ Ω.

As seen from above, we proved the boundedness of the sequence with the help of fixed value

point set and nonexpansiveness of T (ω ∗ , x) as well as Assumption 4.1.

Step 2: {xn}_{n=0}^∞ converges almost surely to a random variable supported by the feasible set.

From (4.1) and xn = αn xn + (1 − αn )xn , we have

xn+1 − xn + αn β∇f (xn ) = (1 − αn )(T̂ (ωn∗ , xn ) − xn ), (4.10)

and hence

< xn+1 − xn + αn β∇f (xn ), xn − x∗ >= −(1 − αn ) < xn − T̂ (ωn∗ , xn ), xn − x∗ > . (4.11)

Since x∗ ∈ F V P (T ), we have from part (ii) of Lemma 4.1 that

< xn − T̂ (ωn∗ , xn ), xn − x∗ > ≥ (η/2)kxn − T (ωn∗ , xn )k2 . (4.12)

From (4.11) and (4.12), we obtain

< xn+1 − xn + αn β∇f (xn ), xn − x∗ > ≤ −(η/2)(1 − αn )kxn − T (ωn∗ , xn )k2 (4.13)

or equivalently

− < xn − xn+1 , xn − x∗ > ≤ −αn < β∇f (xn ), xn − x∗ > − (η/2)(1 − αn )kxn − T (ωn∗ , xn )k2 . (4.14)

For any u, v ∈ H we have

< u, v > = −(1/2)ku − vk2 + (1/2)kuk2 + (1/2)kvk2 . (4.15)

From (4.15) we obtain

< xn − xn+1 , xn − x∗ > = −Cn+1 + Cn + (1/2)kxn − xn+1 k2 (4.16)

where Cn = (1/2)kxn − x∗ k2 . From (4.14) and (4.16) we obtain

Cn+1 − Cn − (1/2)kxn − xn+1 k2 ≤ −αn < β∇f (xn ), xn − x∗ > − (η/2)(1 − αn )kxn − T (ωn∗ , xn )k2 . (4.17)

From (4.10) and (4.3) we have

kxn+1 − xn k2 = k − αn β∇f (xn ) + (1 − αn )(T̂ (ωn∗ , xn ) − xn )k2

= αn2 kβ∇f (xn )k2 + (1 − αn )2 kT̂ (ωn∗ , xn ) − xn k2

− 2αn (1 − αn ) < β∇f (xn ), T̂ (ωn∗ , xn ) − xn > . (4.18)

We know that kT̂ (ωn∗ , xn ) − xn k = ηkxn − T (ωn∗ , xn )k. Since αn ∈ [0, 1], we also have that (1 − αn )2 ≤ (1 − αn ). Using these facts as well as multiplying both sides of (4.18) by 1/2 yields

(1/2)kxn+1 − xn k2 = (1/2)αn2 kβ∇f (xn )k2 + (1/2)(1 − αn )2 η 2 kT (ωn∗ , xn ) − xn k2
− αn (1 − αn ) < β∇f (xn ), T̂ (ωn∗ , xn ) − xn >
≤ (1/2)αn2 kβ∇f (xn )k2 + (1/2)(1 − αn )η 2 kT (ωn∗ , xn ) − xn k2
− αn (1 − αn ) < β∇f (xn ), T̂ (ωn∗ , xn ) − xn > . (4.19)

From (4.17) and (4.19), we obtain

Cn+1 − Cn ≤ (1/2)kxn+1 − xn k2 − αn < β∇f (xn ), xn − x∗ >
− (η/2)(1 − αn )kxn − T (ωn∗ , xn )k2
≤ −((1/2) − (η/2))η(1 − αn )kxn − T (ωn∗ , xn )k2 + αn ((1/2)αn kβ∇f (xn )k2
− < β∇f (xn ), xn − x∗ >
− (1 − αn ) < β∇f (xn ), T̂ (ωn∗ , xn ) − xn >). (4.20)



Now we claim that there exists an n0 ∈ N such that the sequence {Cn } is non-increasing for n ≥ n0 .

Assume by contradiction that this is not true. Then there exists a subsequence {Cnj } such that

Cnj +1 − Cnj > 0

which together with (4.20) yields

0 < Cnj +1 − Cnj
≤ −((1/2) − (η/2))η(1 − αnj )kxnj − T (ωn∗j , xnj )k2
+ αnj ((1/2)αnj β 2 k∇f (xnj )k2 − < β∇f (xnj ), xnj − x∗ >
− (1 − αnj ) < β∇f (xnj ), T̂ (ωn∗j , xnj ) − xnj >). (4.21)

Since {xn } is bounded, ∇f (x) is continuous, and η ∈ (0, 1), we obtain from (4.21) by Theorem 4.1

(a) that

0 < lim inf_{j→∞} [−((1/2) − (η/2))η(1 − αnj )kxnj − T (ωn∗j , xnj )k2
+ αnj ((1/2)αnj kβ∇f (xnj )k2 − < β∇f (xnj ), xnj − x∗ >
− (1 − αnj ) < β∇f (xnj ), T̂ (ωn∗j , xnj ) − xnj >)]
≤ 0 (4.22)

which is a contradiction. Therefore, there exists an n0 ∈ N such that the sequence {Cn } is non-

increasing for n ≥ n0 . Since {Cn } is bounded below, it converges for all ω ∈ Ω.

Taking the limit of both sides of (4.20) and using the convergence of {Cn }, continuity of ∇f (x),

Step 1, η ∈ (0, 1), and Theorem 4.1 (a) yield

lim_{n→∞} kxn − T (ωn∗ , xn )k = 0, pointwise,

which implies that {xn}_{n=0}^∞ converges for each ω ∈ Ω since F V P (T ) ≠ ∅. Moreover, this together with Assumption 4.2 implies that {xn } converges almost surely to a random variable supported by F V P (T ).

As seen from above, we proved the convergence to the feasible set in a deterministic way with

the help of fixed value point set and nonexpansivity of T (ω ∗ , x) as well as Lemma 4.1.

Step 3: {xn}_{n=0}^∞ converges almost surely to the optimal solution.

It remains to prove that {xn}_{n=0}^∞ converges almost surely to the optimal solution. Since x∗ ∈ F V P (T ) is the optimal solution, we have

< x̄ − x∗ , ∇f (x∗ ) >≥ 0, ∀x̄ ∈ F V P (T ). (4.23)

We have from (4.3) that

kxn+1 − x∗ k2 = kxn+1 − x∗ + αn β∇f (x∗ ) − αn β∇f (x∗ )k2

= kxn+1 − x∗ + αn β∇f (x∗ )k2 + αn2 kβ∇f (x∗ )k2

− 2αn < β∇f (x∗ ), xn+1 − x∗ + αn β∇f (x∗ ) > . (4.24)

Since x∗ = T̂ (ωn∗ , x∗ ), ∀ωn∗ ∈ Ω∗ , ∀n ∈ N ∪ {0}, we have that x∗ = αn x∗ + (1 − αn )T̂ (ωn∗ , x∗ ), ∀ωn∗ ∈

Ω∗ , ∀n ∈ N ∪ {0}; using this fact and (4.1), we obtain

kxn+1 − x∗ + αn β∇f (x∗ )k2 = kαn [xn − x∗ − β(∇f (xn ) − ∇f (x∗ ))]

+ (1 − αn )[T̂ (ωn∗ , xn ) − T̂ (ωn∗ , x∗ )]k2 . (4.25)

Furthermore, we have

< β∇f (x∗ ), xn+1 − x∗ + αn β∇f (x∗ ) > =< β∇f (x∗ ), xn+1 − x∗ > +αn < β∇f (x∗ ), β∇f (x∗ ) >

=< β∇f (x∗ ), xn+1 − x∗ > +αn kβ∇f (x∗ )k2 . (4.26)

Substituting (4.25) and (4.26) into (4.24) yields

kxn+1 − x∗ k2 = kxn+1 − x∗ + αn β∇f (x∗ )k2 + αn2 kβ∇f (x∗ )k2

− 2αn < β∇f (x∗ ), xn+1 − x∗ + αn β∇f (x∗ ) >

= kαn [xn − x∗ − β(∇f (xn ) − ∇f (x∗ ))] + (1 − αn )[T̂ (ωn∗ , xn ) − T̂ (ωn∗ , x∗ )]k2

− 2αn < β∇f (x∗ ), xn+1 − x∗ > −αn2 kβ∇f (x∗ )k2

= αn2 kxn − x∗ − β(∇f (xn ) − ∇f (x∗ ))k2

+ (1 − αn )2 kT̂ (ωn∗ , xn ) − T̂ (ωn∗ , x∗ )k2

+ 2αn (1 − αn ) < xn − x∗ − β(∇f (xn ) − ∇f (x∗ )), T̂ (ωn∗ , xn ) − T̂ (ωn∗ , x∗ ) >

− 2αn < β∇f (x∗ ), xn+1 − x∗ > −αn2 kβ∇f (x∗ )k2 .

From (4.7), nonexpansivity property of T̂ (ω ∗ , x), and Cauchy–Schwarz inequality, we obtain

< xn − x∗ − β(∇f (xn ) − ∇f (x∗ )), T̂ (ωn∗ , xn ) − T̂ (ωn∗ , x∗ ) >≤ (1 − γ)kxn − x∗ k2 . (4.27)

From (4.7), we also obtain

kxn − x∗ − β(∇f (xn ) − ∇f (x∗ ))k2 ≤ (1 − γ)2 kxn − x∗ k2 . (4.28)

Therefore, from (4.27), (4.28), and nonexpansivity property of T̂ (ω ∗ , x), we have

kxn+1 − x∗ k2 = αn2 kxn − x∗ − β(∇f (xn ) − ∇f (x∗ ))k2

+ (1 − αn )2 kT̂ (ωn∗ , xn ) − T̂ (ωn∗ , x∗ )k2

+ 2αn (1 − αn ) < xn − x∗ − β(∇f (xn ) − ∇f (x∗ )), T̂ (ωn∗ , xn ) − T̂ (ωn∗ , x∗ ) >

− 2αn < β∇f (x∗ ), xn+1 − x∗ > −αn2 kβ∇f (x∗ )k2

≤ (1 − 2γαn )kxn − x∗ k2 + αn (γ 2 αn kxn − x∗ k2 − 2 < β∇f (x∗ ), xn+1 − x∗ >)
= (1 − γαn )kxn − x∗ k2 − γαn kxn − x∗ k2
+ αn (γ 2 αn kxn − x∗ k2 − 2 < β∇f (x∗ ), xn+1 − x∗ >)
≤ (1 − γαn )kxn − x∗ k2
+ αn (γ 2 αn kxn − x∗ k2 − 2 < β∇f (x∗ ), xn+1 − x∗ >)



or, finally,

kxn+1 − x∗ k2 ≤ (1 − γαn )kxn − x∗ k2 + γαn ((γ 2 αn kxn − x∗ k2 − 2 < β∇f (x∗ ), xn+1 − x∗ >)/γ). (4.29)

From Step 1, Step 2, (4.23), and Theorem 4.1 (a), we obtain

lim_{n→∞} (γ 2 αn kxn − x∗ k2 − 2β < ∇f (x∗ ), xn+1 − x∗ >) ≤ 0 almost surely. (4.30)

Now we have the following lemma.

Lemma 4.2 [216]: Let {an}_{n=0}^∞ be a sequence of non-negative real numbers satisfying

an+1 ≤ (1 − bn )an + bn hn + cn ,

where bn ∈ [0, 1], ∑_{n=0}^∞ bn = ∞, lim sup_{n→∞} hn ≤ 0, and ∑_{n=0}^∞ cn < ∞. Then lim_{n→∞} an = 0.
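A small numerical illustration of Lemma 4.2 (the particular sequences bn, hn, cn below are hypothetical choices made only to satisfy the lemma's hypotheses):

```python
# Numerical illustration of Lemma 4.2: with b_n in [0, 1] and divergent sum,
# limsup h_n <= 0, and summable c_n, the recursion
#   a_{n+1} <= (1 - b_n) a_n + b_n h_n + c_n
# forces a_n -> 0.  The choices of b_n, h_n, c_n here are illustrative only.
a = 5.0
for n in range(100_000):
    b = 1.0 / (n + 1)          # b_n in [0, 1], sum b_n = infinity
    h = 1.0 / (n + 1)          # h_n -> 0, so limsup h_n <= 0
    c = 1.0 / (n + 1) ** 2     # sum c_n < infinity
    a = (1 - b) * a + b * h + c

print(a)  # decays roughly like 2 ln(n) / n
```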

According to Lemma 4.2, by setting

an = kxn − x∗ k2 ,

bn = γαn ,

hn = (γ 2 αn kxn − x∗ k2 − 2β < ∇f (x∗ ), xn+1 − x∗ >)/γ,

cn = 0,

we obtain from (4.29), (4.30), and Theorem 4.1 (b) that

lim_{n→∞} kxn − x∗ k2 = 0 almost surely.

Therefore, {xn}_{n=0}^∞ converges almost surely to x∗ .

As seen from above, we proved the convergence to the optimal solution in a deterministic way

(using Lemma 4.2) with the help of fixed value point set and nonexpansivity of T (ω ∗ , x) as well as

Assumption 4.1.

4.1.1.2 Mean Square Convergence

Theorem 4.2: Consider Problem (3.1) with Assumptions 4.1 and 4.2. Suppose that β ∈ (0, 2ξ/K²)

and αn ∈ [0, 1], n ∈ N ∪ {0}, satisfies (a) and (b) of Theorem 4.1. Then starting from any initial

point, the sequence generated by (4.1) globally converges in mean square to the unique solution of

the problem.

Proof: We have from Theorem 4.1 that

lim_{n→∞} kxn − x∗ k = 0 almost surely,

or

lim_{n→∞} kxn − x∗ k2 = 0 almost surely.

From the parallelogram law, we have that

kxn − x∗ k2 ≤ 2(kxn k2 + kx∗ k2 ), ∀n ∈ N.

We define a non-negative measurable function

τn = 2(kxn k2 + kx∗ k2 ) − kxn − x∗ k2 .

Hence, we obtain

lim_{n→∞} τn = 4kx∗ k2 almost surely.

Now we have the following lemma.

Lemma 4.3 [217] (Fatou's Lemma): If τn : Ω −→ [0, ∞] is measurable for each positive integer n, then

∫_Ω (lim inf_{n→∞} τn ) dµ ≤ lim inf_{n→∞} ∫_Ω τn dµ.

Applying Lemma 4.3 yields

∫_Ω (lim inf_{n→∞} τn ) dµ ≤ lim inf_{n→∞} ∫_Ω τn dµ,

or

∫_Ω 4kx∗ k2 dµ ≤ lim inf_{n→∞} (∫_Ω 2kxn k2 dµ + ∫_Ω 2kx∗ k2 dµ − ∫_Ω kxn − x∗ k2 dµ)

= lim_{n→∞} (∫_Ω 2kxn k2 dµ) + ∫_Ω 2kx∗ k2 dµ − lim sup_{n→∞} ∫_Ω kxn − x∗ k2 dµ. (4.31)

Now we have the following lemma.

Lemma 4.4 [217] (The Dominated Convergence Theorem): Let {τn } be a sequence in L1 such that τn −→ τ almost everywhere, and there exists a nonnegative g ∈ L1 such that |τn | ≤ g for all n. Then τ ∈ L1 and ∫_Ω τ dµ = lim_{n→∞} ∫_Ω τn dµ.

Due to boundedness of {xn}_{n=0}^∞ , ∀ω ∈ Ω, we obtain from Lemma 4.4 that

lim_{n→∞} ∫_Ω 2kxn k2 dµ = ∫_Ω 2kx∗ k2 dµ. (4.32)

Thus, we obtain from (4.31) and (4.32) that

∫_Ω 4kx∗ k2 dµ ≤ lim_{n→∞} (∫_Ω 2kxn k2 dµ) + ∫_Ω 2kx∗ k2 dµ − lim sup_{n→∞} ∫_Ω kxn − x∗ k2 dµ

= ∫_Ω 4kx∗ k2 dµ − lim sup_{n→∞} ∫_Ω kxn − x∗ k2 dµ,

or

lim sup_{n→∞} ∫_Ω kxn − x∗ k2 dµ = 0.

Thus we have

lim_{n→∞} E[kxn − x∗ k2 ] = lim_{n→∞} ∫_Ω kxn − x∗ k2 dµ ≤ lim sup_{n→∞} ∫_Ω kxn − x∗ k2 dµ = 0,

which implies that {xn}_{n=0}^∞ converges in mean square to x∗ . Thus the proof of Theorem 4.2 is complete.

4.1.2 Application to Solve Distributed Optimization over Random Networks

So far we have provided the convergence of Algorithm (4.1) to the optimal solution of Problem

(3.1). The algorithm can directly be applied to solving Problem 3.2 in a distributed fashion under

the following considerations. We need to assume that each fi (xi ) is ξ-strongly convex and ∇fi (xi ) is

K-Lipschitz. Therefore, we arrive at the following corollaries of Theorems 4.1 and 4.2, respectively.

Corollary 4.1: Consider Problem 3.2 with Assumption 4.2. Assume that fi (xi ) is ξ-strongly

convex, and ∇fi (xi ) is K-Lipschitz continuous for i = 1, 2, ..., m. Suppose that β ∈ (0, 2ξ/K²), η ∈

(0, 1), and αn ∈ [0, 1], n ∈ N ∪ {0}, satisfies (a) and (b) of Theorem 4.1. Then starting from any

initial point, the sequence generated by

xn+1 = αn (xn − β∇f (xn )) + (1 − αn )((1 − η)xn + ηW (ωn∗ )xn ) (4.33)

globally converges almost surely to the unique solution of the problem.

Corollary 4.2: Consider Problem 3.2 with Assumption 4.2. Assume that fi (xi ) is ξ-strongly

convex, and ∇fi (xi ) is K-Lipschitz continuous for i = 1, 2, ..., m. Suppose that β ∈ (0, 2ξ/K²), η ∈

(0, 1), and αn ∈ [0, 1], n ∈ N ∪ {0}, satisfies (a) and (b) of Theorem 4.1. Then starting from any

initial point, the sequence generated by (4.33) globally converges in mean square to the unique

solution of the problem.
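To make the update (4.33) concrete, here is a minimal sketch on a toy problem (all data are hypothetical, not from the dissertation's example): m = 3 agents with fi(x) = (x − di)², so ξ = K = 2 and β must lie in (0, 2ξ/K²) = (0, 1); the random graph alternates between complete-graph averaging and the empty graph, so each graph occurs infinitely often.

```python
import numpy as np

# Sketch of iteration (4.33) on a toy consensus problem (hypothetical data):
# m = 3 agents, f_i(x) = (x - d_i)^2, strong-convexity modulus xi = 2 and
# Lipschitz constant K = 2, so beta must lie in (0, 2*xi/K^2) = (0, 1).
d = np.array([1.0, 2.0, 3.0])
beta, eta = 0.5, 0.5
m = len(d)

W_avg = np.full((m, m), 1.0 / m)   # complete-graph averaging weights
W_empty = np.eye(m)                # empty graph: agents keep their own value

x = np.array([5.0, -4.0, 0.0])     # arbitrary initial point
for n in range(100_000):
    alpha = 1.0 / (n + 1)                     # step sizes from Remark 4.4
    W = W_avg if n % 2 == 0 else W_empty      # each graph occurs infinitely often
    grad = 2.0 * (x - d)                      # agent-wise gradients of f_i
    x = alpha * (x - beta * grad) + (1 - alpha) * ((1 - eta) * x + eta * (W @ x))

print(x)  # all agents approach the common minimizer mean(d) = 2
```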

Remark 4.6: The authors in [8] have presented a totally asynchronous algorithm for solving

systems of equations of the form x = f (x) where f (x) is a contraction mapping on the Banach space

B∞ = (<n , k.k∞ ) and a partially asynchronous algorithm for solving consensus system x = W x

where W is nonexpansive on B∞ . Here, we are able to obtain a totally asynchronous distributed

algorithm for solving distributed optimization problems (rather than systems of equations) con-

strained by the consensus system in the Hilbert space H = (<n , k.k2 ). Note that nonexpansivity

(or contraction) property of an operator in general may not be preserved from one space to another.

4.1.2.1 Numerical Example

Now we give an instance of a distributed optimization problem over a random network in which

there are distribution dependencies among communication graphs.

Example 4.1: Distributed estimation in wireless sensor networks (WSNs):

Consider a WSN with m = 20 sensors which measure the location of an object. The observation

of the ith sensor is described as yi = Ai θ + νi where Ai ∈ <d×n , θ ∈ <n is the deterministic

parameter to be estimated, and νi is the i.i.d. Gaussian observation noise. We use Maximum

Likelihood Estimator with regularization:

min_θ ∑_{i=1}^{20} (kAi θ − yi k_2^2 + ρi kθk_2^2 ),

where ρi is the regularization parameter of each sensor.

This problem can be formulated as the following distributed problem:

min_{θ1 ,...,θ20} ∑_{i=1}^{20} (kAi θi − yi k_2^2 + ρi kθi k_2^2 )
subject to θ1 = θ2 = ... = θ20 . (4.34)

We consider an undirected graph, i.e., 1 ←→ 2 ←→ · · · ←→ 20. Each sensor gives a weight Wij = 1/|Ni ∪ {i}| to information received from its neighbors. We select yi = [0.25, 0.25][2, 2]T + νi and ρi = 0.2

where νi is the i.i.d. Gaussian observation noise with zero mean and variance 0.01 for each sensor’s

measurement. One can see that fi (θi ) = kAi θi − yi k_2^2 + ρi kθi k_2^2 , i = 1, 2, ..., 20, are 2ρi -strongly convex, and

∇fi (θi ) are Ki -Lipschitz continuous where

Ki = k2ATi Ai + 2ρi I2 k2 = 0.65.

Hence, ξ = min{2ρ1 , . . . , 2ρ20 } = 0.4 and K = max{K1 , . . . , K20 } = 0.65; consequently, we select

β = 1 ∈ (0, 2ξ/K²). Also we select αn = 1/(n + 1), n ≥ 0, and η = 0.5 for simulation.

Here, Ω∗ = {G1 , G2 , G3 , G4 } where

G1 = {(2, 3), (4, 5), (6, 7), (8, 9)},

G2 = {(10, 11), (12, 13), (14, 15), (16, 17), (18, 19)},

G3 = {(1, 2), (3, 4), (5, 6), (7, 8), (9, 10), (11, 12), (13, 14),

(15, 16), (17, 18), (19, 20)},

G4 = {}.

We assume that Gi , i = 1, 2, 3, 4, have i.i.d. Bernoulli distribution with P r(Gi ) = 1/4 in every

N̂ −interval, and at the iteration k N̂ , k = 1, 2, . . . , a graph works that has worked the minimum

number of times in the previous N̂ −interval. If some graphs Gi have the same number of minimum

Figure 4.1 Variables θ1 and θ2 of 20 agents are shown by solid blue lines and dashed black lines, respectively. The figures show that they are approaching θ∗ = [0.7417, 0.7417]T .

Figure 4.2 2D plot, where θ∗ is shown by o, for 1000 iterations.



Figure 4.3 Root Mean Square Error (RMSE) for two intervals: [0, 300] and [301, 3000] iterations.

occurrences in the previous N̂ -interval, then one is chosen randomly. Thus the sequence {ωn∗}_{n=0}^∞ is not independent and has time-varying distributions. In fact, it has a subsequence {ωn∗j}_{j=0}^∞ that

is i.i.d. As a matter of fact, according to Borel-Cantelli lemma [215], Gi , i = 1, 2, 3, 4, occur infinitely

often almost surely in the probability space (Ω̄, F̄, P̄ ) where

Ω̄ = {G1 , . . . , G4 } × {G1 , . . . , G4 } × {G1 , . . . , G4 } × ...

Therefore, G1 , . . . , G4 occur infinitely often almost surely in the probability space (Ω, F, µ) in this

example. Therefore, Assumption 4.2 is satisfied. Indeed, the conditions of Corollaries 4.1 and 4.2

are satisfied. We choose N̂ = 20 and random initial conditions for simulation. The results given

by Algorithm (4.33) are shown in Figure 4.1.

We use the CVX package for Matlab to solve optimization problem (4.34), and the solution is

θi∗ = [0.7417, 0.7417]T , i = 1, . . . , 20. Note that θi∗ may be different due to different observation

noise. The two-dimensional plot is given in Figure 4.2, and the error en = kxn − x∗ k2 is shown in

Figure 4.3.

As seen in this example, Algorithm (4.33) is able to solve distributed optimization problems in

which there are distribution dependencies among possible graphs under mentioned assumptions.

4.2 The Random Picard Algorithm

Although the Picard iterative algorithm may not always converge to a fixed point of an operator

(see Chapter 2), it converges for operators with special properties. This is useful for solving

the feasibility problem of (3.1). In the following two subsections, we show that the random Picard

iteration can solve feasibility problems under suitable conditions.

4.2.1 Firmly Nonexpansive Random Maps

Consider a firmly nonexpansive random mapping T (ω ∗ , x) where T : Ω∗ × <n −→ <n , n ∈ N,

and H = (<n , k.kH ). Now we have the following theorem.

Theorem 4.3: Consider the above firmly nonexpansive random map T (ω ∗ , x) where the cardinality of the set Ω∗ is finite. Assume F V P (T ) ≠ ∅. If each ω ∗ ∈ Ω∗ occurs infinitely often almost

surely, then the sequence {xn}_{n=0}^∞ generated by the random Picard iteration

xn+1 = T (ωn∗ , xn ) (4.35)

converges almost surely and in mean square to a random variable supported by F V P (T ).

Proof: We introduce the following lemma.

Lemma 4.5 [218]: Let φi : H −→ H, i = 1, 2, ..., Ñ , be firmly nonexpansive with ∩_{i=1}^{Ñ} F ix(φi ) ≠ ∅, where H is finite dimensional. Then the random sequence generated by

x0 ∈ D arbitrary, xn+1 = φr(n) (xn ), n ≥ 0, (4.36)

where each element of {1, ..., Ñ } appears in the sequence {r(0), r(1), ...} an infinite number of times, converges to some point in ∩_{i=1}^{Ñ} F ix(φi ).

If each ω ∗ ∈ Ω∗ occurs infinitely often almost surely, we obtain from Lemma 4.5 that the sequence {xn}_{n=0}^∞ generated by (4.35) converges almost surely to a random variable supported by F V P (T ). From the proof of Theorem 4.2, the sequence also converges in mean square to the random variable. Thus the proof is complete.

4.2.2 Contraction Random Maps

Theorem 4.4: Consider a random operator T (ω ∗ , x) where T : Ω∗ × B −→ B, F V P (T ) ≠ ∅,

and T is a contraction random operator with constant 0 ≤ κ < 1. Then starting from any initial

point, the sequence generated by the random Picard iteration (4.35) converges pointwise (surely)

and in mean square to the solution of the problem with exponential rate of convergence.

Proof: We have that for each fixed ω ∗ ∈ Ω∗ , the operator T (ω ∗ , x) is a contraction with

constant κ. Thus, according to Theorem 2.3, it has a unique fixed point for each fixed ω ∗ ∈ Ω∗ .

Since F V P (T ) ≠ ∅, there exists a unique x∗ such that x∗ = T (ω ∗ , x∗ ), ∀ω ∗ ∈ Ω∗ . Hence, we obtain

kxn+1 − x∗ kB ≤ κkxn − x∗ kB
≤ κ2 kxn−1 − x∗ kB
· · ·
≤ κn+1 kx0 − x∗ kB , ∀ω ∈ Ω. (4.37)



Now we show that the sequence {xn}_{n=0}^∞ is a Cauchy sequence in B. We obtain by (4.37) that

kxn+1 − xn kB = kxn+1 − xn + x∗ − x∗ kB

≤ kxn+1 − x∗ kB + kxn − x∗ kB

≤ κn+1 kx0 − x∗ kB + κn kx0 − x∗ kB , ∀ω ∈ Ω.

Therefore, {xn}_{n=0}^∞ is a Cauchy sequence in B and, thus, converges pointwise (surely) to x∗ . We

also obtain from (4.37) that


lim_{n→∞} E[kxn − x∗ k_B^2 ] = lim_{n→∞} ∫_Ω kxn − x∗ k_B^2 dµ
≤ lim_{n→∞} κ^{2n} kx0 − x∗ k_B^2 µ(Ω)
= 0.

Therefore, {xn}_{n=0}^∞ converges in mean square to x∗ . One can see from (4.37) that the rate of

convergence is exponential. Thus the proof of Theorem 4.4 is complete.
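A minimal sketch of Theorem 4.4's setting (the contraction family below is hypothetical: T(ω∗, x) = x∗ + κ_{ω∗}(x − x∗) shares the fixed value point x∗ for every realization, so F V P (T ) ≠ ∅):

```python
import random

# Random Picard iteration x_{n+1} = T(omega_n, x_n) for a hypothetical
# contraction random operator with a common fixed value point x_star:
#   T(omega, x) = x_star + kappa[omega] * (x - x_star),  kappa[omega] <= 0.6.
x_star = 3.0
kappa = {0: 0.3, 1: 0.6}            # per-realization contraction constants

random.seed(0)
x = 10.0
err0 = abs(x - x_star)
for _ in range(50):
    omega = random.randint(0, 1)    # random realization omega_n
    x = x_star + kappa[omega] * (x - x_star)

err = abs(x - x_star)               # bounded by 0.6**50 * err0, cf. (4.37)
print(err)
```

Whatever the realized sequence of ω's, the error shrinks at least by the worst-case factor 0.6 per step, which is the exponential rate claimed in the theorem.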

4.3 The Random Krasnoselskii-Mann Algorithm

In some cases when the Picard iteration may not converge, the Krasnoselskii-Mann iteration may

be useful to solve a problem. In the following subsection, we show that the random Krasnoselskii-

Mann iterative algorithm is useful to solve feasibility problem of (3.1).

4.3.1 Nonexpansive Random Maps

Consider a nonexpansive random map T (ω ∗ , x) where T : Ω∗ × <n −→ <n , n ∈ N, and H =

(<n , k.kH ).

Theorem 4.5: Consider the above nonexpansive random map. Let the cardinality of the set

Ω∗ be finite. Assume F V P (T ) ≠ ∅. If each ω ∗ ∈ Ω∗ occurs infinitely often almost surely, then the

sequence {xn}_{n=0}^∞ generated by the random Krasnoselskii-Mann iteration

xn+1 = (1/2)xn + (1/2)T (ωn∗ , xn ) (4.38)

converges almost surely and in mean square to a random variable supported by F V P (T ).

Proof: Since T (ω ∗ , x) is nonexpansive for each ω ∗ ∈ Ω∗ , the random operator φ(ω ∗ , x) where

φ(ω ∗ , x) := (1/2)(x + T (ω ∗ , x)) (4.39)

is, by Remark 2.1, firmly nonexpansive for each ω ∗ ∈ Ω∗ . From Lemma 4.5, the sequence generated

by the random Krasnoselskii-Mann algorithm (4.38) converges almost surely to a random variable

supported by the F V P (T ). From the proof of Theorem 4.2, the sequence also converges in mean

square to the random variable. Thus the proof is complete.

Remark 4.7: Algorithm (4.38) is a special case of Algorithm (2.2) for the random case where αn = 1/2. In this case, Algorithm (4.38) can be viewed as either the random Krasnoselskii-Mann

iterative algorithm for finding a fixed value point of a nonexpansive random map T (ω ∗ , x) or the

random Picard iterative algorithm for finding a fixed value point of a firmly nonexpansive random

map φ(ω ∗ , x) defined in (4.39).
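The averaging in (4.38) is what buys convergence for merely nonexpansive maps. A deterministic sketch (a single, non-random operator, used only to illustrate the averaging step): rotation by 90° in the plane is nonexpansive with unique fixed point 0, yet its Picard iterates circle forever while the Krasnoselskii-Mann iterates contract.

```python
import numpy as np

# Rotation by 90 degrees: an isometry (hence nonexpansive) whose only fixed
# point is the origin.  Picard iterates stay on the unit circle; the
# Krasnoselskii-Mann average (4.38) contracts by |(1 + i)/2| ~ 0.7071 per step.
R = np.array([[0.0, -1.0],
              [1.0,  0.0]])

x_picard = np.array([1.0, 0.0])
x_km = np.array([1.0, 0.0])
for _ in range(60):
    x_picard = R @ x_picard                   # plain Picard: no convergence
    x_km = 0.5 * x_km + 0.5 * (R @ x_km)      # Krasnoselskii-Mann step

picard_norm = float(np.linalg.norm(x_picard))
km_norm = float(np.linalg.norm(x_km))
print(picard_norm, km_norm)
```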



CHAPTER 5. SOLVING LINEAR ALGEBRAIC EQUATIONS OVER RANDOM NETWORKS

In this chapter, we consider the problem of solving linear algebraic equations over random

networks. This problem includes the distributed consensus problem as a special case. We show that

the random Krasnoselskii-Mann algorithm (4.38) is useful to solve this problem. The real Hilbert

space considered in this chapter is H = (<n , k.k2 ), n ∈ N. For simplicity we write k.k2 = k.k in this

chapter.

5.1 A Distributed Algorithm for Solving Linear Algebraic Equations over Random Networks

Now we define the problem of solving linear algebraic equations over random networks.

Consider m agents. The agents want to solve the problem min_x kAx − bk, A ∈ <^{µ×q} , b ∈ <^µ , where each agent merely knows a subset of the rows of the partitioned matrix [A, b]; precisely, each agent knows a private equation Ai xi = bi , i = 1, 2, ..., m, where Ai ∈ <^{µi ×q} , bi ∈ <^{µi} , ∑_{i=1}^m µi = µ.

The objective of each agent is to collaboratively seek the solution of the following optimization problem using local information in the presence of random interconnection graphs:

min_x ∑_{i=1}^m kAi x − bi k2 ,

where x ∈ <^q .

Problem 5.1: Let the weighted random operator of the graph T (ω ∗ , x) := W (ω ∗ )x be given

(see Definition 3.2). Then the above problem under Assumptions 3.1 and 3.2 can be formulated as

follows:
min_x f (x) := ∑_{i=1}^m kAi xi − bi k2
subject to x ∈ F V P (T ), (5.1)

where x = [x1^T , ..., xm^T ]^T , xi ∈ <^q , i = 1, 2, ..., m.

Before presenting our main results, we impose the following assumption on the equation Ax = b.

Assumption 5.1: The linear algebraic equation Ax = b has a solution, namely S := {x | kAx − bk = 0} ≠ ∅.

Problem 5.1 with Assumption 5.1 can be reformulated as finding x such that

Āx = b̄, (5.2)

and

x ∈ F V P (T ), (5.3)

where Ā = diag(A1 , A2 , ..., Am ) is block diagonal and b̄ = [b1^T , b2^T , ..., bm^T ]^T .

Lemma 5.1: The solution set of (5.2) is equal to the solution set of the following equation:

Ãx + b̃ = x, (5.4)

where

Ã = diag(Iq − θ1 A1^T A1 , Iq − θ2 A2^T A2 , ..., Iq − θm Am^T Am ), (5.5)

b̃ = [(θ1 A1^T b1 )^T , (θ2 A2^T b2 )^T , ..., (θm Am^T bm )^T ]^T , (5.6)

and θi ∈ (0, 2/λmax (Ai^T Ai )), i = 1, 2, ..., m.

Proof: Rows of (5.2) are written as Ai xi = bi , i = 1, 2, ..., m, which is equivalent to xi =

xi − θi ATi (Ai xi − bi ). Consequently, the solution sets of Ai xi = bi and xi = xi − θi ATi (Ai xi − bi )

are the same. This completes the proof of Lemma 5.1.
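Lemma 5.1 can be checked numerically: for θ ∈ (0, 2/λmax(AᵀA)), the fixed points of the affine map x ↦ x − θAᵀ(Ax − b) are exactly the solutions of Ax = b, and for a consistent system the iteration converges to one. A sketch with hypothetical data:

```python
import numpy as np

# Numerical check of Lemma 5.1 on hypothetical data: iterating
#   x <- x - theta * A^T (A x - b),  theta in (0, 2 / lambda_max(A^T A)),
# converges to a solution of the consistent system A x = b.
A = np.array([[1.0, 2.0],
              [3.0, 4.0]])
b = np.array([5.0, 11.0])          # consistent: A @ (1, 2) = (5, 11)

lam_max = np.linalg.eigvalsh(A.T @ A).max()
theta = 1.0 / lam_max              # safely inside (0, 2 / lambda_max)

x = np.zeros(2)
for _ in range(5000):
    x = x - theta * (A.T @ (A @ x - b))

residual = float(np.linalg.norm(A @ x - b))
print(x, residual)
```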

Now Problem 5.1 with Assumption 5.1 reduces to the following problem.

Problem 5.2: Consider Problem 5.1 with Assumption 5.1. Let H(x) := Ãx + b̃, where à and

b̃ are defined in (5.5)-(5.6), and let T (ω ∗ , x) be defined in Definition 3.2. The problem is to find x∗

such that x∗ ∈ F ix(H) ∩ F V P (T ).

Now we give the following theorem.

Theorem 5.1: Consider Problem 5.2 with Assumption 4.2. Then starting from any initial

condition, the sequence generated by the random Krasnoselskii-Mann algorithm

xn+1 = (1/2)xn + (1/2)[(1 − $)W (ωn∗ )xn + $(Ãxn + b̃)] (5.7)

where $ ∈ (0, 1), converges almost surely to x∗ , which is the unique solution of the following convex

optimization problem:

min_x kx − x0 k
subject to x = (1 − $)W (ω ∗ )x + $(Ãx + b̃), ∀ω ∗ ∈ Ω∗ . (5.8)

Remark 5.1: Algorithm (5.7) cannot be derived from generalization of algorithms proposed

in [93]-[119] and [120]-[134] to random case.
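A minimal sketch of iteration (5.7) on a hypothetical two-agent system (agent 1 holds the row x[0] = 1, agent 2 the row x[1] = 2 of a consistent 2 × 2 system; the random graph alternates between full averaging and no communication, so every graph occurs infinitely often):

```python
import numpy as np

# Sketch of the random Krasnoselskii-Mann iteration (5.7) on hypothetical data.
# Stacked state x = [x1; x2] in R^4; the unique consensus solution is
# x1 = x2 = (1, 2).  varpi plays the role of the mixing parameter in (5.7).
theta, varpi = 1.0, 0.5            # theta in (0, 2/lambda_max(Ai^T Ai)) = (0, 2)
A1, b1 = np.array([[1.0, 0.0]]), np.array([1.0])
A2, b2 = np.array([[0.0, 1.0]]), np.array([2.0])

def H(x):
    # H(x) = A_tilde x + b_tilde from Lemma 5.1: a Richardson step per agent.
    x1, x2 = x[:2], x[2:]
    x1 = x1 - theta * (A1.T @ (A1 @ x1 - b1))
    x2 = x2 - theta * (A2.T @ (A2 @ x2 - b2))
    return np.concatenate([x1, x2])

def W(x, connected):
    # Weighted graph operator: full averaging when connected, identity otherwise.
    if not connected:
        return x
    avg = 0.5 * (x[:2] + x[2:])
    return np.concatenate([avg, avg])

x = np.array([5.0, -3.0, 0.0, 7.0])
for n in range(2000):
    connected = (n % 2 == 0)       # both graphs occur infinitely often
    D = (1 - varpi) * W(x, connected) + varpi * H(x)
    x = 0.5 * x + 0.5 * D          # iteration (5.7)

print(x)  # approaches the consensus solution (1, 2, 1, 2)
```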

Before we give the proof of Theorem 5.1, we need to give some lemmas needed in the proof.

Lemma 5.2: Let H(x) be defined in Problem 5.2. Then H : <mq −→ <mq is nonexpansive.

Proof: We have that kH(z) − H(y)k = kÃ(z − y)k, ∀z, y ∈ <^{mq} . Now we prove that kÃ(z − y)k ≤ kz − yk. Let z = [z1^T , z2^T , ..., zm^T ]^T and y = [y1^T , y2^T , ..., ym^T ]^T . We have that

kÃ(z − y)k2 = k[((Iq − θ1 A1^T A1 )(z1 − y1 ))^T , ..., ((Iq − θm Am^T Am )(zm − ym ))^T ]^T k2

= ∑_{j=1}^m k(Iq − θj Aj^T Aj )(zj − yj )k2 .
Since θj ∈ (0, 2/λmax (Aj^T Aj )), we have kIq − θj Aj^T Aj k ≤ 1. Moreover, k(Iq − θj Aj^T Aj )(zj − yj )k ≤ kIq − θj Aj^T Aj k kzj − yj k, j = 1, 2, ..., m. Therefore, we obtain


∑_{j=1}^m k(Iq − θj Aj^T Aj )(zj − yj )k2 ≤ ∑_{j=1}^m kIq − θj Aj^T Aj k2 kzj − yj k2
≤ ∑_{j=1}^m kzj − yj k2 = kz − yk2 ,

or

kÃ(z − y)k ≤ kz − yk. (5.9)

Thus the proof of Lemma 5.2 is complete.
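The bound kIq − θj Ajᵀ Aj k ≤ 1 underlying this proof is easy to verify numerically (a sketch; the matrix is an arbitrary random example):

```python
import numpy as np

# Numerical check of the bound used in Lemma 5.2: for theta in
# (0, 2 / lambda_max(A^T A)), every eigenvalue 1 - theta * lambda of
# I - theta * A^T A lies in (-1, 1], so the spectral norm is at most 1.
rng = np.random.default_rng(0)
A = rng.standard_normal((5, 3))    # arbitrary test matrix

lam_max = np.linalg.eigvalsh(A.T @ A).max()
norms = []
for frac in (0.05, 0.25, 0.5, 0.75, 0.95):
    theta = frac * 2.0 / lam_max   # sweep theta across (0, 2 / lambda_max)
    M = np.eye(3) - theta * (A.T @ A)
    norms.append(float(np.linalg.norm(M, 2)))

print(norms)  # every entry is at most 1
```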

Lemma 5.3: Let T (ω ∗ , x) and H(x) be defined in Definition 3.2 and Problem 5.2, respectively,

and

D(ω ∗ , x) := (1 − $)T (ω ∗ , x) + $H(x), (5.10)

where ω ∗ ∈ Ω∗ , $ ∈ (0, 1). Then F V P (D) = F ix(H) ∩ F V P (T ).

Proof: Assume a z̃ ∈ F ix(H) ∩ F V P (T ); that is, z̃ = H(z̃) and z̃ = T (ω ∗ , z̃), ∀ω ∗ ∈ Ω∗ .

Therefore, we obtain from (5.10) that

D(ω ∗ , z̃) = (1 − $)T (ω ∗ , z̃) + $H(z̃)

= (1 − $)z̃ + $z̃ = z̃, ∀ω ∗ ∈ Ω∗ ,

which implies that F ix(H) ∩ F V P (T ) ⊆ F V P (D). Conversely, assume a z̃ ∈ F V P (D), i.e.,

D(ω ∗ , z̃) = z̃ = (1 − $)T (ω ∗ , z̃) + $H(z̃), ∀ω ∗ ∈ Ω∗ . (5.11)

Since F ix(H) ∩ F V P (T ) ≠ ∅, there exists a y ∗ ∈ F ix(H) ∩ F V P (T ). Now by (5.11) we obtain

kz̃ − y ∗ k = k(1 − $)T (ω ∗ , z̃) + $H(z̃) − y ∗ k.

By the fact that y ∗ = (1 − $)y ∗ + $y ∗ , $ ∈ (0, 1), we obtain

kz̃ − y ∗ k = k(1 − $)T (ω ∗ , z̃) + $H(z̃) − y ∗ k

= k(1 − $)(T (ω ∗ , z̃) − y ∗ ) + $(H(z̃) − y ∗ )k. (5.12)



Since y ∗ = H(y ∗ ) and y ∗ = T (ω ∗ , y ∗ ), ∀ω ∗ ∈ Ω∗ , we obtain from (5.12) for all ω ∗ ∈ Ω∗ that

k(1−$)(T (ω ∗ , z̃)−y ∗ )+$(H(z̃)−y ∗ )k = k(1−$)(T (ω ∗ , z̃)−T (ω ∗ , y ∗ ))+$(H(z̃)−H(y ∗ ))k. (5.13)

Due to nonexpansivity property of T (ω ∗ , x) (see (3.7)), we have for all ω ∗ ∈ Ω∗ that

k(1 − $)(T (ω ∗ , z̃) − T (ω ∗ , y ∗ )) + $(H(z̃) − H(y ∗ ))k ≤ (1 − $)kz̃ − y ∗ k + $kH(z̃) − H(y ∗ )k. (5.14)

By nonexpansivity property of H(x) (see Lemma 5.2), we also have for all ω ∗ ∈ Ω∗ that

k(1 − $)(T (ω ∗ , z̃) − T (ω ∗ , y ∗ )) + $(H(z̃) − H(y ∗ ))k ≤ (1 − $)kT (ω ∗ , z̃) − T (ω ∗ , y ∗ )k + $kz̃ − y ∗ k.

(5.15)

Because of nonexpansivity property of T (ω ∗ , x), we obtain from (5.15) that

(1−$)kT (ω ∗ , z̃)−T (ω ∗ , y ∗ )k+$kz̃−y ∗ k ≤ (1−$)kz̃−y ∗ k+$kz̃−y ∗ k = kz̃−y ∗ k, ∀ω ∗ ∈ Ω∗ . (5.16)

Due to nonexpansivity property of H(x), we also obtain from (5.14) that

(1 − $)kz̃ − y ∗ k + $kH(z̃) − H(y ∗ )k ≤ (1 − $)kz̃ − y ∗ k + $kz̃ − y ∗ k = kz̃ − y ∗ k. (5.17)

From (5.12)-(5.17), we finally obtain

kz̃ − y ∗ k ≤ k(1 − $)(T (ω ∗ , z̃) − T (ω ∗ , y ∗ )) + $(H(z̃) − H(y ∗ ))k

≤ (1 − $)kz̃ − y ∗ k + $kH(z̃) − H(y ∗ )k

≤ kz̃ − y ∗ k, ∀ω ∗ ∈ Ω∗ (5.18)

and

kz̃ − y ∗ k ≤ k(1 − $)(T (ω ∗ , z̃) − T (ω ∗ , y ∗ )) + $(H(z̃) − H(y ∗ ))k

≤ (1 − $)kT (ω ∗ , z̃) − T (ω ∗ , y ∗ )k + $kz̃ − y ∗ k

≤ kz̃ − y ∗ k, ∀ω ∗ ∈ Ω∗ . (5.19)

Thus, equality holds throughout (5.18) and (5.19), which implies that

kz̃ − y ∗ k = k(1 − $)(T (ω ∗ , z̃) − T (ω ∗ , y ∗ )) + $(H(z̃) − H(y ∗ ))k

= kH(z̃) − H(y ∗ )k

= kT (ω ∗ , z̃) − T (ω ∗ , y ∗ )k, ∀ω ∗ ∈ Ω∗ . (5.20)



Now we have the following remark.

Remark 5.2 [219, Ch. 2]: Due to strict convexity of the norm k.k, if kxk = kyk = k(1 − $)x +

$yk where x, y ∈ X and $ ∈ (0, 1), then x = y.

Substituting y ∗ = H(y ∗ ) and y ∗ = T (ω ∗ , y ∗ ), ∀ω ∗ ∈ Ω∗ , into (5.20) yields

kH(z̃) − y ∗ k = kT (ω ∗ , z̃) − y ∗ k = k(1 − $)(T (ω ∗ , z̃) − y ∗ ) + $(H(z̃) − y ∗ )k, ∀ω ∗ ∈ Ω∗ ,

which by Remark 5.2 implies that H(z̃) − y ∗ = T (ω ∗ , z̃) − y ∗ , ∀ω ∗ ∈ Ω∗ , or

H(z̃) = T (ω ∗ , z̃), ∀ω ∗ ∈ Ω∗ . (5.21)

Substituting (5.21) into (5.11) yields

z̃ = H(z̃) = T (ω ∗ , z̃), ∀ω ∗ ∈ Ω∗ ,

which implies that F V P (D) ⊆ F ix(H) ∩ F V P (T ). Therefore, F V P (D) = F ix(H) ∩ F V P (T ).

Thus the proof of Lemma 5.3 is complete.

Lemma 5.4: Let D(ω ∗ , x), ω ∗ ∈ Ω∗ , be defined in Lemma 5.3. Then F V P (D) is a closed

convex nonempty set.

Proof: For any z, y ∈ <mq , we obtain

kD(ω ∗ , z) − D(ω ∗ , y)k = k(1 − $)(T (ω ∗ , z) − T (ω ∗ , y)) + $(H(z) − H(y))k

≤ (1 − $)kT (ω ∗ , z) − T (ω ∗ , y)k + $kH(z) − H(y)k. (5.22)

Because of nonexpansivity of both T (ω ∗ , x) and H(x), we obtain from (5.22) that

kD(ω ∗ , z) − D(ω ∗ , y)k ≤ (1 − $)kT (ω ∗ , z) − T (ω ∗ , y)k + $kH(z) − H(y)k

≤ (1 − $)kz − yk + $kz − yk = kz − yk

that implies that D(ω ∗ , x) is nonexpansive. Indeed, since <mq is closed and convex, we obtain by

Remark 3.3 that F V P (D) is closed and convex. Furthermore, F V P (D) is nonempty by Assumption

5.1 and Lemma 5.3. This completes the proof of Lemma 5.4.

Lemma 5.5: Let T (ω ∗ , x), ω ∗ ∈ Ω∗ , be defined in Definition 3.2, and



S(ω ∗ , x) := (1 − $)T (ω ∗ , x) + $Ãx, ω ∗ ∈ Ω∗ , (5.23)

where $ ∈ (0, 1). Then F V P (S) is nonempty, closed, and convex.

Proof: Since 0mq is a fixed value point of S, we can conclude that F V P (S) is nonempty. Now

for any z, y ∈ <mq , we obtain

kS(ω ∗ , z) − S(ω ∗ , y)k = k(1 − $)(T (ω ∗ , z) − T (ω ∗ , y)) + $Ã(z − y)k

≤ (1 − $)kT (ω ∗ , z) − T (ω ∗ , y)k + $kÃ(z − y)k. (5.24)

Similar to the proof of Lemma 5.2, we obtain

kÃ(z − y)k ≤ kz − yk. (5.25)

Therefore, we obtain from (5.24) by nonexpansivity of T (ω ∗ , x) and (5.25) that

kS(ω ∗ , z) − S(ω ∗ , y)k ≤ (1 − $)kT (ω ∗ , z) − T (ω ∗ , y)k + $kÃ(z − y)k

≤ (1 − $)kz − yk + $kz − yk

= kz − yk

which implies that S(ω ∗ , x), ω ∗ ∈ Ω∗ , is nonexpansive. Therefore, one can obtain by Remark 3.3

that F V P (S) is closed and convex. Thus the proof of Lemma 5.5 is complete.

Lemma 5.6: Assume that the linear algebraic equation Ax = b does not have a unique solution,

i.e., S is not a singleton. Let S(ω ∗ , x) be defined in (5.23). Then F V P (S) is a closed affine subspace.

Proof: By Lemma 5.5, we have that F V P (S) is closed. Since S is not a singleton, F V P (S) is

not a singleton either. Consider two distinct points z̄, ȳ ∈ F V P (S), i.e.,

z̄ = S(ω ∗ , z̄), ȳ = S(ω ∗ , ȳ), ∀ω ∗ ∈ Ω∗ . (5.26)

Now we obtain

S(ω ∗ , ς z̄ + (1 − ς)ȳ) = S(ω ∗ , ς z̄) + S(ω ∗ , (1 − ς)ȳ)

= ςS(ω ∗ , z̄) + (1 − ς)S(ω ∗ , ȳ), (5.27)



where ς ∈ <. Substituting (5.26) for (5.27) yields

S(ω ∗ , ς z̄ + (1 − ς)ȳ) = ςS(ω ∗ , z̄) + (1 − ς)S(ω ∗ , ȳ) = ς z̄ + (1 − ς)ȳ

which implies that ς z̄ + (1 − ς)ȳ ∈ F V P (S). Therefore, F V P (S) is an affine set.

Now we have the following remark.

Remark 5.3 [173]: If C is an affine set and z0 ∈ C, then the set

C − z0 = {z − z0 |z ∈ C}

is a subspace.

Since 0mq ∈ F V P (S), we obtain by Remark 5.3 that the set

F V P (S) − 0mq = F V P (S)

is a subspace. Thus the proof of Lemma 5.6 is complete.

Lemma 5.7: Let

Q1 (ω ∗ , x) := (1/2)x + (1/2)D(ω ∗ , x), ∀ω ∗ ∈ Ω∗ , (5.28)

Q2 (ω ∗ , x) := (1/2)x + (1/2)S(ω ∗ , x), ∀ω ∗ ∈ Ω∗ . (5.29)
Then Q1 (ω ∗ , x) and Q2 (ω ∗ , x) are nonexpansive and F V P (Q1 ) = F V P (D) and F V P (Q2 ) =

F V P (S). Moreover, Q1 (ω ∗ , x) is firmly nonexpansive for each ω ∗ ∈ Ω∗ .

Proof: Since D(ω ∗ , x) and S(ω ∗ , x) are nonexpansive, we obtain by Remark 2.1 that Q1 (ω ∗ , x)

and Q2 (ω ∗ , x) are firmly nonexpansive for each ω ∗ ∈ Ω∗ and thus nonexpansive. Now consider a

z̃ ∈ F V P (D). Thus D(ω ∗ , z̃) = z̃, ∀ω ∗ ∈ Ω∗ . Substituting this fact for (5.28) yields Q1 (ω ∗ , z̃) =

z̃, ∀ω ∗ ∈ Ω∗ which implies that z̃ ∈ F V P (Q1 ). Now consider a z̃ ∈ F V P (Q1 ). Similarly, one can

obtain that z̃ ∈ F V P (D). Therefore, F V P (Q1 ) = F V P (D). With the same procedure, one can

prove by using nonexpansivity of S(ω ∗ , x) (see proof of Lemma 5.5) that F V P (Q2 ) = F V P (S).

Thus the proof of Lemma 5.7 is complete.

Remark 5.4: By Lemma 5.3 and Lemma 5.7, Assumption 5.1 guarantees that the set of

equilibrium points of (5.7) is F ix(H)∩F V P (T ) 6= ∅. Also Assumption 5.1 guarantees the feasibility

of the optimization problem (5.8).



Remark 5.5: Quadratic Lyapunov functions have been useful to analyze stability of linear

dynamical systems. Nevertheless, common quadratic Lyapunov functions may not exist for con-

sensus problems in networked systems [157]. Furthermore, common quadratic Lyapunov functions

may not exist for switched linear systems [158]-[160]. Moreover, other difficulties mentioned in

[164] may arise in using Lyapunov’s direct method to analyze stability of dynamical systems. Also,

LaSalle-type theorem for discrete-time stochastic systems (see [156] and references therein) needs

{ωn∗ }∞ n=0 to be independent. Therefore, we do not try Lyapunov’s and LaSalle’s approaches.

Proof of Theorem 5.1:

From Lemmas 5.3 and 5.7, we can write (5.7) as

xn+1 = Q1 (ωn∗ , xn ). (5.30)

Now we have the following definition and lemmas.

Definition 5.1 [220]: Suppose C is a closed convex nonempty set and {xn }∞ n=0 is a sequence

in H. {xn }∞ n=0 is said to be Fejér monotone with respect to C if

kxn+1 − zk ≤ kxn − zk, ∀z ∈ C, n ≥ 0.

Lemma 5.8 [220]: Suppose the sequence {xn }∞ n=0 is Fejér monotone with respect to C. Then

{xn }∞ n=0 is bounded.

Lemma 5.9 [221]: Let {xn }∞ n=0 be a sequence in H and let C be a closed affine subspace of H.

Suppose that {xn }∞ n=0 is Fejér monotone with respect to C. Then PC xn = PC x0 , ∀n ∈ N.

Consider a c̄ ∈ F V P (D) = F V P (Q1 ). From Lemma 5.7, we have c̄ = Q1 (ω ∗ , c̄). Hence, for all

ω ∈ Ω, we have

kxn+1 − c̄k = kQ1 (ωn∗ , xn ) − Q1 (ωn∗ , c̄)k ≤ kxn − c̄k,

which implies that the sequence {xn } is Fejér monotone with respect to F V P (D) (see Definition 5.1

and Lemma 5.4). Therefore, the sequence is bounded by Lemma 5.8 for all ω ∈ Ω. Since m ∈ N ,

N̄ is finite. Thus we obtain from (5.30), Lemma 4.5, and Assumption 4.2 that {xn }∞ n=0 converges

almost surely to a random variable supported by F V P (Q1 ) = F V P (D) for any initial condition.

It remains to prove that {xn }∞ n=0 converges almost surely to the unique solution x∗ . If Problem

5.2 has a unique solution, then x∗ is the only feasible point of the optimization (5.8); otherwise,

F V P (S) is a closed affine subspace by Lemma 5.6. Consider a fixed ỹ ∈ F V P (D) = F V P (Q1 ).

Thus ỹ = (1/2)ỹ + (1/2)D(ω ∗ , ỹ) and D(ω ∗ , ỹ) = ỹ, ∀ω ∗ ∈ Ω∗ . We obtain from these facts and (5.7) that

xn+1 − ỹ = (1/2)(xn − ỹ) + (1/2)(D(ωn∗ , xn ) − ỹ)

= (1/2)(xn − ỹ) + (1/2)(D(ωn∗ , xn ) − D(ωn∗ , ỹ))

= (1/2)(xn − ỹ) + (1/2)(S(ωn∗ , xn ) − S(ωn∗ , ỹ))

= (1/2)(xn − ỹ) + (1/2)S(ωn∗ , xn − ỹ)

= Q2 (ωn∗ , xn − ỹ). (5.31)

Now consider a c̄ ∈ F V P (S) = F V P (Q2 ). From (5.31) we obtain

kxn+1 − ỹ − c̄k = kQ2 (ωn∗ , xn − ỹ) − c̄k

= kQ2 (ωn∗ , xn − ỹ) − Q2 (ωn∗ , c̄)k

which by nonexpansivity property of Q2 (ω ∗ , x) (see Lemma 5.7) implies

kxn+1 − ỹ − c̄k = kQ2 (ωn∗ , xn − ỹ) − Q2 (ωn∗ , c̄)k

≤ kxn − ỹ − c̄k. (5.32)

Since F V P (S) = F V P (Q2 ) (by Lemma 5.7) is nonempty, closed, and convex (see Lemma 5.5),

the sequence {xn − ỹ}∞ n=0 is Fejér monotone with respect to F V P (Q2 ) = F V P (S) for all ω ∈

Ω. Moreover, F V P (S) = F V P (Q2 ) (by Lemma 5.7) is a closed affine subspace by Lemma 5.6.

Therefore, according to Lemma 5.9, we obtain

lim n−→∞ (xn − ỹ) = PF V P (S) (x0 − ỹ).

As a matter of fact, x∗ = z ∗ + ỹ where z ∗ = PF V P (S) (x0 − ỹ). Indeed, z ∗ can be considered as the

solution of the following convex optimization problem:

min_z kz − (x0 − ỹ)k (5.33)

subject to z = (1 − $)W (ω ∗ )z + $Ãz, ∀ω ∗ ∈ Ω∗ .

By changing variable x = z + ỹ in optimization problem (5.33), (5.33) becomes

min_x kx − x0 k (5.34)

subject to x = (1 − $)W (ω ∗ )(x − ỹ) + $Ã(x − ỹ) + ỹ, ∀ω ∗ ∈ Ω∗ .

where x∗ is the solution of (5.34). By the fact that ỹ = (1 − $)ỹ + $ỹ, the constraint set in (5.34)

becomes

x = (1 − $)(W (ω ∗ )(x − ỹ) + ỹ) + $(Ã(x − ỹ) + ỹ), ∀ω ∗ ∈ Ω∗ . (5.35)

Substituting ỹ = W (ω ∗ )ỹ, ∀ω ∗ ∈ Ω∗ , and ỹ = Ãỹ + b̃ for (5.35) yields

x = (1 − $)W (ω ∗ )x + $(Ãx + b̃). (5.36)

Substituting (5.36) for (5.34) yields (5.8). Because of strict convexity of k.k, the convex optimization

problem (5.8) has a unique solution. Thus the proof of Theorem 5.1 is complete.
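The characterization in Theorem 5.1 says that, when Ax = b has many solutions, the algorithm singles out the feasible point closest to the initial condition x0. The sketch below (a small hypothetical system; numpy assumed) checks this closest-point characterization directly: the minimizer of kx − x0 k over {x : Ax = b} is any particular solution plus the projection of the residual onto the null space of A.

```python
import numpy as np

# A small hypothetical consistent system Ax = b with non-unique solutions.
A = np.array([[1.0, 1.0, 0.0],
              [0.0, 1.0, 1.0]])
b = np.array([2.0, 3.0])
x0 = np.array([5.0, -1.0, 4.0])

# argmin ||x - x0|| over {x : Ax = b}: take any particular solution and
# add the projection of (x0 - particular) onto null(A).
x_part = np.linalg.pinv(A) @ b            # a particular solution
N = np.eye(3) - np.linalg.pinv(A) @ A     # orthogonal projector onto null(A)
x_star = x_part + N @ (x0 - x_part)

assert np.allclose(A @ x_star, b)         # feasibility
assert np.allclose(N @ (x0 - x_star), 0)  # x0 - x_star is orthogonal to null(A)
```

Orthogonality of x0 − x_star to the null space of A is exactly the optimality condition for the projection in (5.8).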

Theorem 5.2: Consider Problem 5.2 with Assumption 4.2. Then starting from any initial

condition, the sequence generated by (5.7) converges in mean square to x∗ which is the unique

solution of the convex optimization problem (5.8).

Proof: One can obtain from Theorem 5.1 and the proof of Theorem 4.2.

5.1.1 Distributed Average Consensus over Random Networks

Now, we define the problem of reaching average consensus over random networks.

The agents want to reach the average of their initial states in the presence of random interconnection

topologies, i.e., x1 = x2 = ... = xm = (1/m) Σ_{i=1}^{m} xi (0), where xi (0) ∈ <d is the initial state of agent i.

Before we present our results, we mention that the algorithm of Tsitsiklis [7] and its generalization to the random case is

xn+1 = W xn . (5.37)

Algorithm (5.37) is in fact the Picard iterative algorithm (2.1) for finding a fixed point of the

nonexpansive operator T (x) := W x. For periodic and irreducible matrices, the authors of [35]-[36]

prove that distributed consensus occurs with asynchronous updates. It is still a question if agents

achieve consensus with synchronous updates. The answer is affirmative.

Remark 5.6: The relaxation method for convex feasibility problems was first investigated in [222]-

[223]. It is shown in [224] that the relaxation method is a special case of the Krasnoselskii-Mann

iteration.

The random Krasnoselskii-Mann iterative algorithm (4.38) for consensus problems reduces to

the following algorithm:


xn+1 = (1/2)xn + (1/2)W (ωn∗ )xn . (5.38)
Now we give the following theorem.

Theorem 5.3: Consider the average consensus problem where W(ω ∗ ) satisfies Assumptions

3.1, 3.2, and 4.2. Then the sequence generated by the random Krasnoselskii-Mann algorithm (5.38)
converges almost surely to x∗ = 1m ⊗ (1/m) Σ_{i=1}^{m} xi (0) so that PC xn = 1m ⊗ (1/m) Σ_{i=1}^{m} xi (0), ∀n ∈ N .

Remark 5.7: We show here that the average consensus of initial states of the agents is in fact

the projection of initial states of agents onto the consensus subspace in the Hilbert space (<md , k.k).

Before we give the proof of Theorem 5.3, we need to present the following lemma needed in the

proof.

Lemma 5.10: Let


D̃(ω ∗ , x) := (1/2)x + (1/2)T (ω ∗ , x), (5.39)

where T (ω ∗ , x) is defined in Definition 3.2. Then F V P (D̃) = F V P (T ).

Proof: One can prove from Lemma 5.3 where β = 1/2 and H(x) := x.

Proof of Theorem 5.3: From (5.38), we have


kxn+1 k = k(1/2)xn + (1/2)W (ωn∗ )xn k

≤ (1/2)kxn k + (1/2)kW (ωn∗ )xn k

≤ (1/2)kxn k + (1/2)kW (ωn∗ )kkxn k.

From (3.7), we have kW (ω ∗ )k ≤ 1, ∀ω ∗ ∈ Ω∗ . Hence we obtain

kxn+1 k ≤ (1/2)kxn k + (1/2)kW (ω ∗ )kkxn k ≤ kxn k, ∀n ∈ N,

which implies that the sequence {xn }∞ n=0 , ∀ω ∈ Ω, is bounded.

Now consider a c∗ ∈ F V P (D̃) = C. Thus we have c∗ = (1/2)c∗ + (1/2)c∗ . Using this fact and (5.38),

we obtain

kxn+1 − c∗ k = k(1/2)xn + (1/2)W (ωn∗ )xn − c∗ k

= k(1/2)(xn − c∗ ) + (1/2)(W (ωn∗ )xn − c∗ )k. (5.40)

Since c∗ ∈ F V P (D̃), we have that c∗ = W (ωn∗ )c∗ , ∀ωn∗ ∈ Ω∗ , ∀n ∈ N ∪ {0}. Therefore, we obtain

k(1/2)(xn − c∗ ) + (1/2)(W (ωn∗ )xn − c∗ )k ≤ (1/2)kxn − c∗ k + (1/2)kW (ωn∗ )(xn − c∗ )k

≤ (1/2)kxn − c∗ k + (1/2)kW (ωn∗ )kkxn − c∗ k. (5.41)

Since kW (ω ∗ )k ≤ 1, ∀ω ∗ ∈ Ω∗ , we obtain from (5.40)-(5.41) that

kxn+1 − c∗ k ≤ (1/2)kxn − c∗ k + (1/2)kW (ωn∗ )kkxn − c∗ k ≤ kxn − c∗ k. (5.42)

Since the number of the agents, m, is finite, the number of the possible graphs N̄ is finite, too. Due

to nonexpansivity of T (ω ∗ , x) for each fixed ω ∗ ∈ Ω∗ , we obtain by Remark 2.1 that

S̃(ω ∗ , x) := (1/2)(x + T (ω ∗ , x))

is firmly nonexpansive. Therefore, by Lemma 4.5, Lemma 5.10, and Assumption 4.2, (5.38) con-

verges almost surely to a random variable supported by C since (5.38) is xn+1 = S̃(ωn∗ , xn ).
It remains to prove that the sequence {xn }∞ n=0 converges almost surely to x∗ = 1m ⊗ (1/m) Σ_{j=1}^{m} xj (0).

We can see by (5.42) and Definition 5.1 that the sequence {xn }∞ n=0 is Fejér monotone with respect

to C for all ω ∈ Ω. Since C is a closed affine subspace, we conclude by Lemma 5.9 that the limit

point of the sequence {xn }∞ n=0 is x∗ = PC x0 , i.e., the solution of the following optimization problem:

minimize_x kx − x0 k (5.43)

subject to x1 = x2 = ... = xm .

The optimization problem (5.43) is equivalent to the following optimization problem:

minimize_x kx − x0 k2 (5.44)

subject to x1 = x2 = ... = xm .

Indeed, the solution of the optimization problem (5.44) is x∗ = 1m ⊗ (1/m) Σ_{j=1}^{m} xj (0), which implies

that the sequence {xn }∞ n=0 converges almost surely to the average of initial states of the agents.

This completes the proof of Theorem 5.3.
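Theorem 5.3 can be checked numerically. The sketch below (illustrative; numpy assumed) runs the random Krasnoselskii-Mann step (5.38) with a doubly stochastic W drawn at random each iteration, here built by averaging a few random permutation matrices so that double stochasticity holds by construction, and verifies that the states settle at the average of the initial states.

```python
import numpy as np

def random_doubly_stochastic(rng, m):
    # Average a few random permutation matrices; by Birkhoff's theorem the
    # result is doubly stochastic. (This random model is only illustrative;
    # it activates every edge infinitely often, as Assumption 4.2 requires.)
    return sum(np.eye(m)[rng.permutation(m)] for _ in range(3)) / 3.0

rng = np.random.default_rng(0)
m = 5
x = rng.standard_normal(m)        # one scalar state per agent
avg = x.mean()                    # the target: average of initial states

for _ in range(2000):
    W = random_doubly_stochastic(rng, m)
    x = 0.5 * x + 0.5 * (W @ x)   # random Krasnoselskii-Mann step (5.38)

# double stochasticity preserves the mean, and mixing drives consensus
assert np.allclose(x, avg, atol=1e-6)
```

Each step preserves the mean of the states (W is doubly stochastic), so the common limit can only be the average.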

Theorem 5.4: Consider the average consensus problem where W(ω ∗ ) satisfies Assumptions

3.1, 3.2, and 4.2. Then the sequence generated by (5.38) converges in mean square to x∗ =
1m ⊗ (1/m) Σ_{i=1}^{m} xi (0) so that PC xn = 1m ⊗ (1/m) Σ_{i=1}^{m} xi (0), ∀n ∈ N .

Proof: One can obtain from Theorem 5.3 and the proof of Theorem 4.2.

The random Krasnoselskii-Mann algorithm (5.7) for consensus problems reduces to the following

algorithm:
xn+1 = (1/2)(1 + $)xn + (1/2)(1 − $)W (ωn∗ )xn (5.45)
where $ ∈ (0, 1). From Algorithms (5.7) and (5.38) and Theorems 5.1-5.4, we arrive at the following

theorem.

Theorem 5.5: Consider the average consensus problem where W(ω ∗ ) satisfies Assumptions

3.1, 3.2, and 4.2. Then the sequence generated by (5.45) in which $ ∈ [0, 1) converges almost
surely and in mean square to x∗ = 1m ⊗ (1/m) Σ_{i=1}^{m} xi (0) so that PC xn = 1m ⊗ (1/m) Σ_{i=1}^{m} xi (0), ∀n ∈ N .

Proof: Almost sure and mean square convergences of the sequence generated by Algorithm

(5.45) where $ ∈ (0, 1) have been proved in Theorems 5.1 and 5.2, respectively. Almost sure and

mean square convergences of the sequence generated by Algorithm (5.45) where $ = 0 have been

proved in Theorems 5.3 and 5.4, respectively. Thus the proof of Theorem 5.5 is complete.

5.1.1.1 Numerical Example

Example 5.1: Consider three agents in the one-dimensional Euclidean space where Ω∗ =

{G1 , G2 , G3 } in which G1 = {}, G2 = {(1, 2)}, and G3 = {(1, 3)} with undirected links, where the weights

of links are assumed to be W12 = 0.25, W13 = 0.3. One can see that W12 and W13 are neither

Maximum-degree nor Metropolis weights. W(ω ∗ ), ∀ω ∗ ∈ Ω∗ is doubly stochastic, and the union of

all graphs in Ω∗ is strongly connected. Therefore, Assumptions 3.1 and 3.2 are satisfied. We assume
that G1 and G2 occur independently with probability Pr(failure) = 1/2, and whenever G2 occurs

and G3 did not occur in the previous iteration, G3 occurs after it. Thus the sequence {ωn∗ }∞ n=0 is

not independent and has time-varying distributions. In fact, it has a subsequence {ωn∗ j }∞ j=0 that

is i.i.d. As a matter of fact, according to the Borel-Cantelli lemma [215], G1 and G2 occur infinitely often

almost surely in the probability space (Ω̄, F̄, P̄ ) where

Ω̄ = {G1 , G2 } × {G1 , G2 } × ...

Therefore, G1 and G2 occur infinitely often almost surely in the probability space (Ω, F, µ) in this example.

Thus G3 occurs infinitely often almost surely, too. Therefore, Assumption 4.2 is satisfied. Indeed, the

conditions of Theorem 5.3 are satisfied. We choose initial conditions x1 (0) = −4, x2 (0) = 2, and

x3 (0) = 5 for simulation. In fact, the average of agents’ initial states is (1/3) Σ_{i=1}^{3} xi (0) = 1. We

should mention that in the three-dimensional Euclidean space, we have that PC ζ = [1, 1, 1]T where

ζ ∈ {[x1 , x2 , x3 ]T ∈ <3 |x1 + x2 + x3 = 3}. As a matter of fact, the agents collaborate among

themselves to approach the average of their initial states in such a way that they remain on the

plane {[x1 , x2 , x3 ]T ∈ <3 |x1 + x2 + x3 = 3} for all n ∈ N . The results are shown in Figure 5.1.
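A sketch of this example can be reproduced as follows (numpy assumed; the switching rule of the example is implemented literally, and the states should approach the average 1).

```python
import numpy as np

W1 = np.eye(3)                          # G1: no links
W2 = np.array([[0.75, 0.25, 0.00],
               [0.25, 0.75, 0.00],
               [0.00, 0.00, 1.00]])     # G2: link (1, 2) with weight 0.25
W3 = np.array([[0.70, 0.00, 0.30],
               [0.00, 1.00, 0.00],
               [0.30, 0.00, 0.70]])     # G3: link (1, 3) with weight 0.3
graphs = [W1, W2, W3]

rng = np.random.default_rng(1)
seq, prev = [], None
for _ in range(4000):
    g = int(rng.integers(2))            # G1 or G2, each with probability 1/2
    seq.append(g)
    if g == 1 and prev != 2:            # G2 occurred and G3 did not occur
        seq.append(2)                   # in the previous iteration: G3 follows
        prev = 2
    else:
        prev = g

x = np.array([-4.0, 2.0, 5.0])          # x1(0), x2(0), x3(0)
for g in seq:
    x = 0.5 * x + 0.5 * (graphs[g] @ x) # algorithm (5.38)

assert np.allclose(x, 1.0, atol=1e-6)   # average of initial states
```

Every W here is doubly stochastic, so x1 + x2 + x3 = 3 is preserved at each step, which is exactly the invariance on the plane described above.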

Figure 5.1 States’ trajectories



CHAPTER 6. A DISTRIBUTED ALGORITHM FOR DISTRIBUTED


CONVEX OPTIMIZATION WITH STATE-DEPENDENT INTERACTIONS
AND TIME-VARYING TOPOLOGIES

In this chapter, we show that a generalization of the proposed algorithm (4.33) can solve dis-

tributed optimization with state-dependent interactions and time-varying topologies. We consider

the real Hilbert space H = (<mn , k.k2 ) in this chapter. For simplicity, we write k.k2 = k.k.

6.1 A Proposed Algorithm

We propose the following generalization of the distributed algorithm (4.33):

xn+1 = αn (xn − β∇f (xn )) + (1 − αn )((1 − η)xn + ηW (xn , Gn )xn ), (6.1)

where η ∈ (0, 1).
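As a concrete illustration of iteration (6.1), the following sketch (a hypothetical two-agent problem; numpy assumed) runs the algorithm with a state-dependent doubly stochastic weight whose off-diagonal entry shrinks with the distance between the agents' states, and with the diminishing step size αn = 1/(1 + n).

```python
import numpy as np

# Hypothetical data: two agents minimizing f(x) = sum_i 0.5*(x_i - c_i)^2
# subject to consensus; the optimum is x1 = x2 = mean(c) = 2.
c = np.array([0.0, 4.0])
grad = lambda x: x - c              # 1-strongly convex, 1-Lipschitz (xi = K = 1)

def W_of(x):
    # state-dependent doubly stochastic weights: the link weight
    # shrinks with the distance between the two agents' states
    w = 0.25 / (1.0 + abs(x[0] - x[1]))
    return np.array([[1.0 - w, w], [w, 1.0 - w]])

eta, beta = 0.7, 1.0                # beta in (0, 2*xi/K^2)
x = np.array([-3.0, 7.0])
for n in range(20000):
    alpha = 1.0 / (1.0 + n)         # diminishing step size (Theorem 4.1)
    x = alpha * (x - beta * grad(x)) \
        + (1.0 - alpha) * ((1.0 - eta) * x + eta * (W_of(x) @ x))

assert np.allclose(x, 2.0, atol=1e-2)
```

The gradient term steers the iterate toward the minimizer while the consensus term, weighted by 1 − αn, eventually dominates and enforces agreement.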

Remark 6.1: The discrete algorithm proposed in [225] can solve consensus problems for a

special weight form where xi ∈ <, i = 1, 2, ..., m. Algorithm (6.1) is able to solve average consensus

problems for weights satisfying Assumptions 3.5-3.6 while it is restricted to diminishing step size.

Continuous algorithms have been proposed in [226] and [227] for solving consensus problems with

state-dependent interactions.

6.1.1 Convergence Analysis

Related to Assumption 3.6, we have the following assumption.

Assumption 6.1: There exists a nonempty subset K̃ ⊆ G such that the union of all elements

in K̃ is strongly connected for all x ∈ <mn , and each element of K̃ occurs infinitely often.

Now we give the following theorem.



Theorem 6.1: Consider the problem (3.12) with Assumptions 3.5, 3.6, and 6.1. Let each

fi (xi ), i = 1, . . . , m, satisfy Assumption 4.1. Suppose that β ∈ (0, 2ξ/K 2 ) and the sequence αn ∈

[0, 1], n ∈ N ∪ {0}, satisfies (a) and (b) of Theorem 4.1. Then the sequence generated by Algorithm

(6.1) globally converges to the unique solution of the problem.

Before we give the proof of Theorem 6.1, we present the following lemma needed in the proof.

Lemma 6.1: Let T̂ (x, G) := (1 − η)x + ηT (x, G), G ∈ G, x ∈ <mn , with T defined in (3.12),

and η ∈ (0, 1]. Then

(i) F ix(T (x, G)) = F ix(T̂ (x, G)).

(ii) < x − T̂ (x, G), x − z > ≥ (η/2)kx − T (x, G)k2 , ∀z ∈ C, ∀G ∈ G.

(iii) kT̂ (x, G) − zk ≤ kx − zk, ∀z ∈ C, ∀x ∈ H, ∀G ∈ G.

Proof: (i)

Consider an x̂ ∈ F ix(T (x, G)). Thus x̂ = T (x̂, G). Hence

T̂ (x̂, G) = (1 − η)x̂ + ηT (x̂, G) = x̂,

which implies that F ix(T (x, G)) ⊆ F ix(T̂ (x, G)). Conversely, consider an x̂ ∈ F ix(T̂ (x, G)). Indeed,

x̂ = T̂ (x̂, G). Thus we have

x̂ = T̂ (x̂, G) = (1 − η)x̂ + ηT (x̂, G),

or x̂ = T (x̂, G), which implies that F ix(T̂ (x, G)) ⊆ F ix(T (x, G)). Therefore, we can conclude that

F ix(T̂ (x, G)) = F ix(T (x, G)). Thus the proof of (i) is complete.

(ii)

Since z ∈ C, we have W (x, G)z = z. Therefore, we obtain

kT (x, G) − zk = kW (x, G)x − W (x, G)zk

≤ kW (x, G)kkx − zk.

Since by Assumption 3.5 W (x, G) is doubly stochastic, we obtain from Lemma 3.2 that kW (x, G)k ≤

1. Hence,

kT (x, G) − zk ≤ kW (x, G)kkx − zk ≤ kx − zk (6.2)



or

kT (x, G) − zk2 ≤ kx − zk2 , ∀z ∈ C, ∀G ∈ G. (6.3)

Also we have

kT (x, G) − zk2 = kT (x, G) − x + x − zk2

= kT (x, G) − xk2 + kx − zk2 + 2 < T (x, G) − x, x − z > . (6.4)

Substituting (6.4) for (6.3) yields

2 < x − T (x, G), x − z >≥ kT (x, G) − xk2 . (6.5)

Substituting x − T (x, G) = (x − T̂ (x, G))/η for the left-hand side of the inequality (6.5) implies (ii). Thus

the proof of (ii) is complete.

(iii)

We have from (6.2) and z = (1 − η)z + ηz that

kT̂ (x, G) − zk ≤ (1 − η)kx − zk + ηkT (x, G) − zk

≤ (1 − η)kx − zk + ηkx − zk

= kx − zk.

Therefore, the proof of (iii) is complete.

Proof of Theorem 6.1:

We prove Theorem 6.1 in three steps.

Step 1: {xn }∞ n=0 is bounded.

Step 2: {xn }∞ n=0 converges to an element in the feasible set.

Step 3: {xn }∞ n=0 converges to the optimal solution.

Now we give the proof of each step in details.

Proof of Step 1:

Since f (x) is strongly convex and C is closed, the problem has a unique solution. Let x∗ be the

unique solution of the problem. We have x∗ = αn x∗ + (1 − αn )x∗ . Therefore, we have from (6.1)

and T̂ (xn , Gn ) = (1 − η)xn + ηT (xn , Gn ) that

kxn+1 − x∗ k = kαn (xn − β∇f (xn )) + (1 − αn )T̂ (xn , Gn ) − x∗ k

= kαn (xn − β∇f (xn ) − x∗ ) + (1 − αn )(T̂ (xn , Gn ) − x∗ )k.

We obtain from (iii) of Lemma 6.1 that

kαn (xn − β∇f (xn ) − x∗ ) + (1 − αn )(T̂ (xn , Gn ) − x∗ )k

≤ αn kxn − β∇f (xn ) − x∗ k + (1 − αn )kT̂ (xn , Gn ) − x∗ k

≤ αn kxn − β∇f (xn ) − x∗ k + (1 − αn )kxn − x∗ k. (6.6)

Since ∇f (x) is ξ-strongly monotone, and ∇f (x) is K-Lipschitz continuous, we obtain

kxn − x∗ − β(∇f (xn ) − ∇f (x∗ ))k2 = kxn − x∗ k2 − 2β < ∇f (xn ) − ∇f (x∗ ), xn − x∗ >

+ β 2 k∇f (xn ) − ∇f (x∗ )k2

≤ kxn − x∗ k2 − 2ξβkxn − x∗ k2 + K 2 β 2 kxn − x∗ k2

= (1 − 2ξβ + β 2 K 2 )kxn − x∗ k2

= (1 − γ)2 kxn − x∗ k2

where γ = 1 − √(1 − β(2ξ − βK 2 )), and selecting β ∈ (0, 2ξ/K 2 ) implies 0 < γ ≤ 1. As a matter of

fact, we have
kxn − x∗ − β(∇f (xn ) − ∇f (x∗ ))k ≤ (1 − γ)kxn − x∗ k. (6.7)

We have that

kxn − β∇f (xn ) − x∗ k = kxn − x∗ − β(∇f (xn ) − ∇f (x∗ )) − β∇f (x∗ )k

≤ kxn − x∗ − β(∇f (xn ) − ∇f (x∗ ))k + βk∇f (x∗ )k

≤ (1 − γ)kxn − x∗ k + βk∇f (x∗ )k. (6.8)

Substituting (6.8) for (6.6) yields

kxn+1 − x∗ k ≤ (1 − γαn )kxn − x∗ k + αn βk∇f (x∗ )k = (1 − γαn )kxn − x∗ k + γαn (βk∇f (x∗ )k/γ)

which by induction implies

kxn+1 − x∗ k ≤ max{kx0 − x∗ k, βk∇f (x∗ )k/γ}.
Thus {xn }∞ n=0 is bounded.

Proof of Step 2:

From (6.1) and xn = αn xn + (1 − αn )xn , we have

xn+1 − xn + αn β∇f (xn ) = (1 − αn )(T̂ (xn , Gn ) − xn ), (6.9)

where T̂ (xn , Gn ) = (1 − η)xn + ηT (xn , Gn ). Hence

< xn+1 − xn + αn β∇f (xn ), xn − x∗ >= −(1 − αn ) < xn − T̂ (xn , Gn ), xn − x∗ > . (6.10)

Since x∗ ∈ C, we have from part (ii) of Lemma 6.1 that

< xn − T̂ (xn , Gn ), xn − x∗ > ≥ (η/2)kxn − T (xn , Gn )k2 . (6.11)
From (6.10) and (6.11), we obtain

< xn+1 − xn + αn β∇f (xn ), xn − x∗ > ≤ −(η/2)(1 − αn )kxn − T (xn , Gn )k2 (6.12)
or equivalently

− < xn − xn+1 , xn − x∗ > ≤ −αn < β∇f (xn ), xn − x∗ > − (η/2)(1 − αn )kxn − T (xn , Gn )k2 . (6.13)
From (4.15) we obtain

< xn − xn+1 , xn − x∗ > = −Cn+1 + Cn + (1/2)kxn − xn+1 k2 , (6.14)

where Cn = (1/2)kxn − x∗ k2 . From (6.13) and (6.14) we obtain

Cn+1 − Cn − (1/2)kxn − xn+1 k2 ≤ −αn < β∇f (xn ), xn − x∗ > − (η/2)(1 − αn )kxn − T (xn , Gn )k2 . (6.15)
From (6.9) we have

kxn+1 − xn k2 = k − αn β∇f (xn ) + (1 − αn )(T̂ (xn , Gn ) − xn )k2

= αn2 kβ∇f (xn )k2 + (1 − αn )2 kT̂ (xn , Gn ) − xn k2

− 2αn (1 − αn ) < β∇f (xn ), T̂ (xn , Gn ) − xn > . (6.16)

We know that kT̂ (xn , Gn )−xn k = ηkxn −T (xn , Gn )k. Since αn ∈ [0, 1], we have also that (1−αn )2 ≤

(1 − αn ). Using these facts, (6.16) becomes

(1/2)kxn+1 − xn k2 ≤ (1/2)αn2 kβ∇f (xn )k2 + (1/2)(1 − αn )η 2 kT (xn , Gn ) − xn k2

− αn (1 − αn ) < β∇f (xn ), T̂ (xn , Gn ) − xn > . (6.17)

From (6.15) and (6.17), we obtain

Cn+1 − Cn ≤ (1/2)kxn+1 − xn k2 − αn < β∇f (xn ), xn − x∗ >

− (η/2)(1 − αn )kxn − T (xn , Gn )k2

≤ −((1/2) − (η/2))η(1 − αn )kxn − T (xn , Gn )k2

+ αn ((1/2)αn kβ∇f (xn )k2 − < β∇f (xn ), xn − x∗ >

− (1 − αn ) < β∇f (xn ), T̂ (xn , Gn ) − xn >). (6.18)

Now we claim that there exists an n0 ∈ N such that the sequence {Cn } is non-increasing for n ≥ n0 .

Assume by contradiction that this is not true. Then there exists a subsequence {Cnj } such that

Cnj +1 − Cnj > 0

which together with (6.18) yields

0 < Cnj +1 − Cnj


≤ −((1/2) − (η/2))η(1 − αnj )kxnj − T (xnj , Gnj )k2

+ αnj ((1/2)αnj β 2 k∇f (xnj )k2 − < β∇f (xnj ), xnj − x∗ >

− (1 − αnj ) < β∇f (xnj ), T̂ (xnj , Gnj ) − xnj >). (6.19)

Since {xn } is bounded, ∇f (x) is continuous, and η ∈ (0, 1), we obtain from (6.19) by Theorem 4.1

(a) that
0 < lim inf j−→∞ (−((1/2) − (η/2))η(1 − αnj )kxnj − T (xnj , Gnj )k2

+ αnj ((1/2)αnj kβ∇f (xnj )k2 − < β∇f (xnj ), xnj − x∗ >

− (1 − αnj ) < β∇f (xnj ), T̂ (xnj , Gnj ) − xnj >))

≤0 (6.20)

which is a contradiction. Therefore, there exists an n0 ∈ N such that the sequence {Cn } is non-

increasing for n ≥ n0 . Since {Cn } is bounded below, it converges.

Taking the limit of both sides of (6.18) and using the convergence of {Cn }, continuity of ∇f (x),

Step 1, η ∈ (0, 1), and Theorem 4.1 (a) yield

lim n−→∞ kxn − T (xn , Gn )k = 0

or, by kT̂ (xn , Gn ) − xn k = ηkxn − T (xn , Gn )k,

lim n−→∞ kxn − T̂ (xn , Gn )k = 0. (6.21)

We have from (6.1) and Step 1 that

lim n−→∞ kxn+1 − T̂ (xn , Gn )k = lim n−→∞ αn kxn − β∇f (xn ) − T̂ (xn , Gn )k = 0. (6.22)

(6.21) and (6.22) together imply

lim n−→∞ kxn+1 − xn k ≤ lim n−→∞ kxn+1 − T̂ (xn , Gn )k + lim n−→∞ kxn − T̂ (xn , Gn )k = 0;

thus the sequence {xn }∞ n=0 is convergent. From this fact, we obtain by Assumption 6.1, part (i) of

Lemma 6.1, and (6.21) that {xn }∞ n=0 converges to an element in the feasible set.

Proof of Step 3:

It remains to prove that lim n−→∞ kxn − x∗ k = 0. We have that

kxn+1 − x∗ k2 = kxn+1 − x∗ + αn β∇f (x∗ ) − αn β∇f (x∗ )k2

= kxn+1 − x∗ + αn β∇f (x∗ )k2 + αn2 kβ∇f (x∗ )k2

− 2αn < β∇f (x∗ ), xn+1 − x∗ + αn β∇f (x∗ ) > . (6.23)



We have that x∗ = αn x∗ + (1 − αn )x∗ , ∀n ∈ N ∪ {0}; using this fact and (6.1), we obtain

kxn+1 −x∗ +αn β∇f (x∗ )k2 = kαn [xn −x∗ −β(∇f (xn )−∇f (x∗ ))]+(1−αn )[T̂ (xn , Gn )−x∗ ]k2 . (6.24)

Moreover, we have

< β∇f (x∗ ), xn+1 − x∗ + αn β∇f (x∗ ) >=< β∇f (x∗ ), xn+1 − x∗ > +αn kβ∇f (x∗ )k2 . (6.25)

Substituting (6.24) and (6.25) for (6.23) yields

kxn+1 − x∗ k2 = kxn+1 − x∗ + αn β∇f (x∗ )k2 + αn2 kβ∇f (x∗ )k2

− 2αn < β∇f (x∗ ), xn+1 − x∗ + αn β∇f (x∗ ) >

= kαn [xn − x∗ − β(∇f (xn ) − ∇f (x∗ ))]

+ (1 − αn )[T̂ (xn , Gn ) − x∗ ]k2

− 2αn < β∇f (x∗ ), xn+1 − x∗ > −αn2 kβ∇f (x∗ )k2

= αn2 kxn − x∗ − β(∇f (xn ) − ∇f (x∗ ))k2

+ (1 − αn )2 kT̂ (xn , Gn ) − x∗ k2

+ 2αn (1 − αn ) < xn − x∗ − β(∇f (xn ) − ∇f (x∗ )), T̂ (xn , Gn ) − x∗ >

− 2αn < β∇f (x∗ ), xn+1 − x∗ > −αn2 kβ∇f (x∗ )k2 .

From (6.7), part (iii) of Lemma 6.1, and Cauchy–Schwarz inequality, we obtain

< xn − x∗ − β(∇f (xn ) − ∇f (x∗ )), T̂ (xn , Gn ) − x∗ >≤ (1 − γ)kxn − x∗ k2 . (6.26)

From (6.7), we also obtain

kxn − x∗ − β(∇f (xn ) − ∇f (x∗ ))k2 ≤ (1 − γ)2 kxn − x∗ k2 . (6.27)



Therefore, from (6.26), (6.27), and part (iii) of Lemma 6.1, we have

kxn+1 − x∗ k2 = αn2 kxn − x∗ − β(∇f (xn ) − ∇f (x∗ ))k2

+ (1 − αn )2 kT̂ (xn , Gn ) − x∗ k2

+ 2αn (1 − αn ) < xn − x∗ − β(∇f (xn ) − ∇f (x∗ )), T̂ (xn , Gn ) − x∗ >

− 2αn < β∇f (x∗ ), xn+1 − x∗ > −αn2 kβ∇f (x∗ )k2

≤ (1 − 2γαn )kxn − x∗ k2

+ αn (γ 2 αn kxn − x∗ k2 − 2 < β∇f (x∗ ), xn+1 − x∗ >)

= (1 − γαn )kxn − x∗ k2 − γαn kxn − x∗ k2

+ αn (γ 2 αn kxn − x∗ k2 − 2 < β∇f (x∗ ), xn+1 − x∗ >)

≤ (1 − γαn )kxn − x∗ k2

+ αn (γ 2 αn kxn − x∗ k2 − 2 < β∇f (x∗ ), xn+1 − x∗ >)

or finally

kxn+1 − x∗ k2 ≤ (1 − γαn )kxn − x∗ k2 + γαn ((γ 2 αn kxn − x∗ k2 − 2 < β∇f (x∗ ), xn+1 − x∗ >)/γ). (6.28)

From Steps 1 and 2, (4.23), and Theorem 4.1 (a), we obtain

lim n−→∞ (γ 2 αn kxn − x∗ k2 − 2β < ∇f (x∗ ), xn+1 − x∗ >) ≤ 0. (6.29)

According to Lemma 4.2 by setting

an = kxn − x∗ k2 ,

bn = γαn ,
hn = (γ 2 αn kxn − x∗ k2 − 2β < ∇f (x∗ ), xn+1 − x∗ >)/γ,

we obtain from (6.28), (6.29), and Theorem 4.1 (b) that lim n−→∞ kxn − x∗ k2 = 0; therefore, {xn }
converges to x∗ as n −→ ∞. Thus the proof of Theorem 6.1 is complete.

6.1.1.1 Numerical Example

Now related to Remark 6.1, we give an instance of average consensus problem in the following

example.

Example 6.1: Consider ten agents in a two-dimensional space that wish to reach the average of their

initial states. The state of each agent is its location in the two-dimensional space, i.e., xi = [yi , zi ]T .

It is known in this case that the cost functions of agents are

fi (y, z) = 0.5(y − y0i )2 + 0.5(z − z0i )2

where [y0i , z0i ]T is initial state of agent i.

The topology of the undirected graph is assumed to be 1 ←→ 2 . . . ←→ 10, and the weight of

the link between agent i and j which is assumed to depend on the Euclidean distance of their states

is considered of the form


Wij = 0.25/(1 + k[yi , zi ]T − [yj , zj ]T k).

The weight models the gain from j to i diminishing with the distance between the agents. One

can see that fi (y, z), i = 1, 2, ..., 10, are 1-strongly convex, and ∇fi (y, z) are 1-Lipschitz continuous.

Let Link 1={(1, 2)}, . . . , Link 9={(9, 10)}, where at each time n, the Link n(mod 9) + 1 works.

Thus, the union of the graphs which occur infinitely often is strongly connected for all x ∈ <20 .

Therefore, Assumption 6.1 is fulfilled. Thus the conditions of Theorem 6.1 are satisfied.
We use η = 0.7, αn = 1/(1 + n), n ≥ 0, β = ξ/K 2 = 1, and initial conditions y0i = i, z0i = 2i for

simulation. The results given by Algorithm (6.1) are shown in Figures 6.1 and 6.2. The error

en = kxn − x∗ k is shown in Figure 6.3. Figures 6.1-6.3 show that the positions of agents converge

to the average of their initial positions.
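A scaled-down version of this example can be reproduced with the sketch below (three agents instead of ten so the run stays short; numpy assumed; the round-robin link schedule and the weight w = 0.25/(1 + distance) follow the example).

```python
import numpy as np

# Scaled-down sketch of Example 6.1: path graph 1-2-3, one link active per
# step in round robin, state-dependent weight w = 0.25/(1 + distance).
m = 3
x0 = np.array([[float(i), 2.0 * i] for i in range(1, m + 1)])  # x_i(0) = [i, 2i]
x = x0.copy()
target = x0.mean(axis=0)            # average of initial positions, here [2, 4]
eta, beta = 0.7, 1.0                # beta = xi/K^2 with xi = K = 1

for n in range(50000):
    i = n % (m - 1)                 # link (i, i+1) is active at time n
    w = 0.25 / (1.0 + np.linalg.norm(x[i] - x[i + 1]))
    W = np.eye(m)
    W[i, i] = W[i + 1, i + 1] = 1.0 - w
    W[i, i + 1] = W[i + 1, i] = w   # doubly stochastic (Assumption 3.5)
    alpha = 1.0 / (1.0 + n)
    grad = x - x0                   # gradient of sum_i 0.5*||x_i - x_i(0)||^2
    x = alpha * (x - beta * grad) \
        + (1.0 - alpha) * ((1.0 - eta) * x + eta * (W @ x))

assert np.allclose(x, target, atol=1e-2)
```

Since each W preserves the mean of the states and the gradients sum to m(mean − target) on the consensus subspace, the limit is the average of the initial positions, as in the figures above.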



Figure 6.1 Variables y of agents in Example 6.1. This figure shows that the variables y
converge to average of the initial positions of variables y.

Figure 6.2 Variables z of agents in Example 6.1. This figure shows that the variables z
converge to average of the initial positions of variables z.

Figure 6.3 The error in Example 6.1. This figure shows that the positions of agents con-
verge to the average of their initial positions.

CHAPTER 7. STABILITY OF STOCHASTIC NONLINEAR DISCRETE


TIME SYSTEMS

So far the random Picard algorithm (4.35) and the random Krasnoselskii-Mann algorithm (4.38)

for finding a fixed value point of a nonexpansive random operator are special cases of stochastic

discrete-time systems. In this chapter, we analyze stability of stochastic nonlinear discrete-time

systems by means of fixed point theory. We show that fixed point theory and the definition of fixed

value point allow us to remove distribution dependency for stability of stochastic discrete-time

systems by using Lyapunov’s and LaSalle’s approaches. We stress from Remark 5.5 that specific

Lyapunov functions may not exist for stability analysis of some stochastic systems.

Since the consensus subspace is a continuum, the equilibrium set of the stochastic system (5.38)

is a continuum. In a continuum of equilibria, since every neighborhood of a non-isolated equilibrium

contains another equilibrium, a non-isolated equilibrium cannot be asymptotically stable in the

sense of Lyapunov. However, given a system that has a continuum of equilibria, it is natural

to ask if the trajectories go to limit points and if the limit points are Lyapunov stable. These

questions lead to consider properties of convergence and semistability. Convergence is the notion

that every trajectory of the system goes to a limit point. The limit point, which is necessarily

an equilibrium point, depends in general on the initial conditions. In a convergent system, the

limit points of trajectories may or may not be Lyapunov stable. Semistability is the additional

requirement that trajectories converge to limit points that are Lyapunov stable. Several authors

have investigated semistability of deterministic and stochastic dynamical systems [228]-[244], to cite

a few. Nevertheless, in this chapter, we only consider stability, but not semistability, of stochastic

discrete-time systems.

7.1 Stability of Stochastic Systems

Now consider the following stochastic discrete-time system:

xt+1 = f (ωt∗ , xt ) (7.1)

where f : Ω∗ × R −→ R is a continuous random map (see Section 2.1), R ⊆ <n is a closed set,

and t represents time. We consider (<n , k.kB ). Note that <n equipped with any norm is a Banach

space. For simplicity, we write k.kB = k.k in this chapter.

Consider a probability measure P defined on the space (Ω, F) where

Ω = Ω∗ × Ω∗ × Ω∗ × . . .

F = σ × σ × σ × ...

such that (Ω, F, P ) forms a probability space. We denote a realization in this probability space by

ω ∈ Ω.

Now we have the following definitions of stability.

Definition 7.1 [245] (Almost sure Lyapunov stability): The equilibrium point x∗ of System

(7.1) is said to be almost surely Lyapunov stable if ∀ε > 0, % > 0, there exists δ = δ(ε, %) > 0 such

that kx0 − x∗ k < δ implies

P {sup t≥0 kxt − x∗ k ≥ ε} ≤ %.

Almost sure Lyapunov stability is also referred to as Lyapunov stability with probability one.

Definition 7.2 [245] (Almost sure asymptotic stability): The equilibrium point of System (7.1)

is said to be almost surely asymptotically stable if it is almost surely Lyapunov stable, and there

exists δ > 0 such that ∀ε > 0, kx0 − x∗ k < δ implies

lim T −→∞ P {sup t≥T kxt − x∗ k ≥ ε} = 0.

Definition 7.3 [246] (Mean square stability): The equilibrium point x∗ of System (7.1) is said

to be mean square stable if ∀ε > 0 there exists δ = δ(ε) > 0 such that E[kxt − x∗ k2 ] < ε whenever

E[kx0 − x∗ k2 ] < δ.

Definition 7.4 [246] (Asymptotic mean square stability): The equilibrium point x∗ of System

(7.1) is said to be asymptotically mean square stable if it is mean square stable and there exists

δ > 0 such that E[kx0 − x∗ k2 ] < δ implies

lim t−→∞ E[kxt − x∗ k2 ] = 0.

7.2 Stability Analysis of Stochastic Nonlinear Discrete-Time Systems

In this section, we show that with the help of fixed point theory and fixed value point, we can

overcome distribution dependency of random variable sequences in existing results using Lyapunov’s

and LaSalle’s approaches. Now we have the following theorem.

Theorem 7.1: Consider stochastic system (7.1). Assume that f (ω ∗ , x) is a contraction random

mapping with constant 0 ≤ κ < 1 and F V P (f ) 6= ∅. Then the equilibrium point x∗ is both almost

surely asymptotically stable and asymptotically mean square stable.

Proof: Since f (ω ∗ , x) is a contraction random map (see Definition 2.3), we obtain for each

ω ∗ ∈ Ω∗ that

kf (ω ∗ , x) − f (ω ∗ , y)k ≤ κkx − yk, ∀x, y ∈ R.

According to Theorem 2.3, the random map f (ω ∗ , x) has a unique fixed point for each fixed ω ∗ ∈ Ω∗ .

Since F V P (f ) 6= ∅, we have x∗ = f (ω ∗ , x∗ ), ∀ω ∗ ∈ Ω∗ . Therefore, we obtain

kxt+1 − x∗ k ≤ κkxt − x∗ k, ∀ω ∈ Ω. (7.2)

We have from (7.2) that

kxt+1 − x∗ k ≤ κkxt − x∗ k ≤ kxt − x∗ k, ∀ω ∈ Ω,

or

kxt − x∗ k ≤ kx0 − x∗ k, ∀t ∈ N. (7.3)

Hence, by setting ε = δ in Definition 7.1, we obtain

kxt − x∗ k ≤ kx0 − x∗ k < ε



which implies that the equilibrium point x∗ is almost surely Lyapunov stable. We also have
E[kxt − x∗ k2 ] = ∫Ω kxt − x∗ k2 dP ≤ ∫Ω kx0 − x∗ k2 dP = E[kx0 − x∗ k2 ].

Thus, for any ε > 0, choosing δ = ε in Definition 7.3, we obtain

E[kxt − x∗ k2 ] ≤ E[kx0 − x∗ k2 ] < ε

which implies that the equilibrium point x∗ is mean square stable.

We have from (7.2) that

kxt+1 − x∗ k ≤ κ^{t+1} kx0 − x∗ k, ∀ω ∈ Ω. (7.4)

Taking the limit of both sides of (7.4) yields

lim_{t−→∞} kxt+1 − x∗ k ≤ lim_{t−→∞} κ^{t+1} kx0 − x∗ k = 0, ∀ω ∈ Ω

implying that the equilibrium point x∗ is almost surely asymptotically stable. We also have
Z
∗ 2
E[kxt − x k ] = lim kxt − x∗ k2 dP ≤ lim κ2t kx0 − x∗ k2 P (Ω) = 0
t−→∞ Ω t−→∞

which implies that the equilibrium point x∗ is asymptotically mean square stable. Thus the proof

is complete.
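The sample-path argument above can be checked numerically. The following Python sketch uses an assumed illustrative scalar map f(ω, x) = ω(x − x∗ ) + x∗ with |ω| ≤ κ = 0.9; the map, the value x∗ = 2, and the uniform distribution of ω are assumptions made only for illustration, not objects from this dissertation. Every realization of the map is a contraction with constant κ, x∗ is a fixed value point, and the iterates obey the pathwise bound (7.4).

```python
import random

# Assumed illustrative example for Theorem 7.1: f(w, x) = w*(x - x_star) + x_star.
# For every w with |w| <= kappa < 1 this map is a contraction with constant kappa,
# and x_star is a fixed value point: x_star = f(w, x_star) for all w.
x_star = 2.0
kappa = 0.9

def f(w, x):
    return w * (x - x_star) + x_star

random.seed(0)
x = 10.0                      # initial condition x_0
bound = abs(x - x_star)       # tracks kappa**t * |x_0 - x_star|
for t in range(200):
    w = random.uniform(-kappa, kappa)   # |f(w,x) - f(w,y)| = |w| |x - y| <= kappa |x - y|
    x = f(w, x)
    bound *= kappa
    # pathwise bound (7.4): |x_t - x_star| <= kappa**t |x_0 - x_star|
    assert abs(x - x_star) <= bound + 1e-12

print(abs(x - x_star))        # essentially 0 on this sample path
```

Since the bound κ^t kx0 − x∗ k tends to 0 on every sample path, both the almost sure and the mean square limits used in the proof follow.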

Corollary 7.1: Consider the stochastic linear system

xt+1 = A(ωt∗ )xt , ωt∗ ∈ Ω∗ . (7.5)

If there exists a norm k.k∗ such that

kA(ω ∗ )k∗ ≤ κ, ∀ω ∗ ∈ Ω∗ ,

where 0 ≤ κ < 1, then the origin is both almost surely asymptotically stable and asymptotically

mean square stable.

Hint: kAxk∗ ≤ kAk∗ kxk∗ where A ∈ <n×n and x ∈ <n .

Remark 7.1: One may consider k.k1 , k.k2 , or k.k∞ for k.k∗ in Corollary 7.1.
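Remark 7.1 suggests checking the hypothesis of Corollary 7.1 in the induced ∞-norm, i.e., the maximum absolute row sum. The sketch below does this for an assumed family of two 2×2 matrices (chosen only for illustration) and then runs the random iteration (7.5) along one sample path.

```python
import random

# Assumed family of matrices for Corollary 7.1; both have induced
# infinity norm (maximum absolute row sum) at most kappa < 1.
A_family = [[[0.5, 0.3],
             [0.1, 0.6]],
            [[0.2, -0.4],
             [0.3, 0.4]]]

def inf_norm(A):
    # induced infinity norm of a matrix = maximum absolute row sum
    return max(sum(abs(a) for a in row) for row in A)

def matvec(A, x):
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

kappa = max(inf_norm(A) for A in A_family)
assert kappa < 1              # hypothesis of Corollary 7.1

random.seed(1)
x = [5.0, -3.0]
for _ in range(100):
    x = matvec(random.choice(A_family), x)  # x_{t+1} = A(w_t) x_t

print(max(abs(xi) for xi in x))  # bounded by kappa**100 * ||x_0||_inf
```

Here κ = 0.8, so after 100 steps the state is within 0.8^100 kx0 k∞ of the origin on every sample path.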

Remark 7.2: Consider Theorem 7.1 where f (x) is deterministic. Here, we do not assume that

the system is locally continuously differentiable at its equilibrium point.

Theorem 7.2: Consider stochastic system (7.1) where the cardinality of the set Ω∗ is finite

and the state space is (<n , k.kH ). Let f (ω ∗ , x) be a firmly nonexpansive random map, and F V P (f ) ≠ ∅. Assume that

there exists a nonempty subset K̄ ⊆ Ω∗ such that {x∗ } = {z̄|z̄ ∈ <n , z̄ = f (ω̃, z̄), ∀ω̃ ∈ K̄}, and

each element of K̄ occurs infinitely often almost surely. Then the equilibrium point x∗ is both

almost surely asymptotically stable and asymptotically mean square stable.

Remark 7.3: We recall Remark 4.3 here. If the sequence {ω̄n }∞n=0 is mutually independent with

Σ∞n=0 P rn (ω̄) = ∞, where P rn (ω̄) is the probability of ω̄ occurring at time n, then, according

to the Borel–Cantelli lemma [215], the assumption in Theorem 7.2 is satisfied. Consequently, any

i.i.d. random sequence satisfies the assumption in Theorem 7.2. Any ergodic stationary sequence

{ωn∗ }∞n=0 with P r(ω̄) > 0 satisfies the assumption in Theorem 7.2 (see the proof of Lemma 1 in

[49]). Consequently, any time-invariant Markov chain with its unique stationary distribution as

the initial distribution satisfies the assumption in Theorem 7.2 (see [49]).

Proof: Since f (ω ∗ , x) is a firmly nonexpansive random operator, we have by Remark 2.1 that it

is nonexpansive. Thus we obtain for each ω ∗ ∈ Ω∗ that

kf (ω ∗ , x) − f (ω ∗ , y)kH ≤ kx − ykH , ∀x, y ∈ <n .

Since x∗ = f (ω ∗ , x∗ ), ∀ω ∗ ∈ Ω∗ , we obtain

kxn+1 − x∗ kH ≤ kxn − x∗ kH , ∀ω ∈ Ω.

Therefore, similar to the proof of Theorem 7.1, we obtain that the equilibrium point x∗ is both

almost surely Lyapunov stable and mean square stable. From Lemma 4.5 and assumptions of

Theorem 7.2, the equilibrium point x∗ is almost surely asymptotically stable. Since the sequence

{xt }∞t=0 is bounded for all ω ∈ Ω, we obtain from the proof of Theorem 4.2 that the equilibrium

point x∗ is asymptotically mean square stable. Thus the proof is complete.
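A concrete way to meet the hypotheses of Theorem 7.2 is to take orthogonal projections onto closed subspaces, which are firmly nonexpansive. In the sketch below, the two lines and the i.i.d. fair-coin selection are assumptions made only for illustration; the only point fixed by both projections is the origin, and by Remark 7.3 an i.i.d. selection makes each map occur infinitely often almost surely.

```python
import math
import random

def proj(ux, uy):
    """Orthogonal projection onto the line through the origin spanned by (ux, uy).

    Projections onto closed subspaces are firmly nonexpansive maps."""
    n = math.hypot(ux, uy)
    ux, uy = ux / n, uy / n
    def P(x, y):
        s = ux * x + uy * y
        return ux * s, uy * s
    return P

f1 = proj(1.0, 0.0)   # projection onto the line y = 0
f2 = proj(1.0, 1.0)   # projection onto the line y = x
# The only z with z = f1(z) and z = f2(z) is the origin: x_star = (0, 0).

random.seed(2)
x, y = 4.0, -7.0
for _ in range(300):
    # i.i.d. selection: each map occurs infinitely often almost surely
    x, y = (f1 if random.random() < 0.5 else f2)(x, y)

print(math.hypot(x, y))   # approaches the common fixed point (0, 0)
```

Each alternation between the two projections shrinks the norm by cos(π/4), so the sample path converges to the unique common fixed point.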

Corollary 7.2: Consider the stochastic linear system (7.5) where the cardinality of the set Ω∗

is finite. Assume that there exists a nonempty subset K̄ ⊆ Ω∗ such that {0n } = {z̄|z̄ ∈ <n , z̄ =

A(ω̃)z̄, ∀ω̃ ∈ K̄}, and each element of K̄ occurs infinitely often almost surely. If for each fixed

ω ∗ ∈ Ω∗ either

I) A(ω ∗ ) − AT (ω ∗ )A(ω ∗ ) ⪰ 0

or

II) A(ω ∗ ) = ½ In + ½ Ā(ω ∗ )

for some Ā(ω ∗ ) with kĀ(ω ∗ )k2 ≤ 1, then the origin is both almost surely asymptotically stable

and asymptotically mean square stable.

Hint: Use Definition 2.5 and Remark 2.1.
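The sketch below checks condition II of Corollary 7.2 on an assumed singleton family (|Ω∗ | = 1, so the infinitely-often assumption holds trivially): with Ā a planar rotation, Āᵀ Ā = I gives kĀk2 = 1, so condition II holds for A = ½I2 + ½Ā, and the only solution of z̄ = Az̄ is the origin. The rotation angle θ = 0.4 is an assumption chosen for illustration.

```python
import math

# Assumed example for condition II of Corollary 7.2:
# A = (1/2) I_2 + (1/2) A_bar with A_bar a rotation by theta, so ||A_bar||_2 = 1.
theta = 0.4
c, s = math.cos(theta), math.sin(theta)
assert abs(c * c + s * s - 1.0) < 1e-12   # A_bar^T A_bar = I, hence ||A_bar||_2 = 1

A = [[0.5 + 0.5 * c, -0.5 * s],
     [0.5 * s, 0.5 + 0.5 * c]]

# A nonzero vector cannot be fixed by a rotation with theta != 0,
# so z = A z holds only for z = 0.
x, y = 3.0, 4.0
for _ in range(1000):
    x, y = A[0][0] * x + A[0][1] * y, A[1][0] * x + A[1][1] * y

print(math.hypot(x, y))   # tends to 0: the origin is asymptotically stable
```

With this choice the eigenvalues of A are ½(1 + e^{±iθ}), of modulus cos(θ/2) ≈ 0.98, so the iterates decay geometrically to the origin.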



CHAPTER 8. CONCLUSIONS AND FUTURE WORKS

In this chapter, we summarize the contributions of this dissertation and give some directions

for future research.

8.1 Contributions of This Dissertation

1- A new mathematical optimization problem based on a new mathematical ter-

minology: We have defined a new mathematical terminology called fixed value point and defined

a new mathematical optimization problem (3.1). This problem includes both centralized and dis-

tributed optimization problems. The results have been published and accepted in [247] and [248],

respectively.

2- Centralized robust convex optimization on Hilbert spaces: We have defined robust

convex optimization on real Hilbert spaces. We have shown that this problem is included in the

problem (3.1).

3- A framework for distributed optimization over random networks: Based on the op-

timization problem (3.1), we have proposed a framework for unconstrained distributed optimization

problems over random networks. The results are given in [247]-[248].

4- A framework for distributed optimization with state-dependent interactions: We

have defined a framework for distributed optimization with state-dependent interactions. Then we

have generalized it to give a framework for distributed optimization with state-dependent interac-

tions and time-varying topologies. The preliminary results have been published in [249].

5- A proposed algorithm for the mathematical optimization: We have proposed an

algorithm with diminishing step size to solve the mathematical optimization problem and analyzed

its convergence with suitable assumptions. The results are given in [248].

6- An algorithm to solve unconstrained distributed optimization over random net-

works: As a special case of the proposed algorithm for solving the mathematical optimization

problem, we have given an asynchronous algorithm with diminishing step size for solving distributed

optimization over random networks that does not require distribution dependency or B-connectivity

assumption on random communication graphs for convergence. The results are given in [248].

7- The random Picard and Krasnoselskii-Mann algorithms for solving the optimiza-

tion problem: We have shown that the random Picard algorithm or the random Krasnoselskii-

Mann algorithm which do not suffer from diminishing step size are useful to solve the feasibility

problem of (3.1).

8- Solving linear algebraic equations over random networks: We have shown that the

random Krasnoselskii-Mann iterative algorithm can be applied for solving linear algebraic equations

over random networks without distribution dependency or B-connectivity assumption on random

communication graphs for convergence. The algorithm is also an asynchronous algorithm. The

preliminary results have been published in [250].

9- Distributed average consensus over random networks: We have shown that the

random Krasnoselskii-Mann iterative algorithm can be applied for distributed average consensus

over random networks without distribution dependency or B-connectivity assumption on random

communication graphs for convergence. The algorithm is also an asynchronous algorithm. We

have shown that the algorithm converges when the weighted matrix of the graph is periodic and

irreducible. The results have been accepted in [251].

10- A distributed algorithm for convex optimization with state-dependent inter-

actions and time-varying topologies: As a generalization of the proposed algorithm for dis-

tributed optimization over random networks, we have proposed an algorithm to solve distributed

optimization with state-dependent interactions and time-varying topologies that does not require

B-connectivity assumption on communication graphs for convergence. We have shown that this al-

gorithm can be applied for solving distributed average consensus with state-dependent interactions

and time-varying topologies.



11- Stability analysis of stochastic nonlinear discrete-time systems by means of fixed

point theory: We have analyzed stability of stochastic nonlinear discrete-time systems by using

fixed point theory to overcome difficulties that arise in using Lyapunov’s and LaSalle’s approaches

such as distribution dependency of random variable sequences.

8.2 Future Works

Several future research directions based on the approaches of this dissertation are:

- Relaxing the strong convexity or K-Lipschitz assumption on the cost function in Assumption 4.1

to mere convexity, in order to propose a distributed asynchronous algorithm that converges for

arbitrary convex cost functions of the agents.

- A distributed asynchronous algorithm without a diminishing step size for solving least-squares

problems.

- Relaxing the doubly stochastic assumption to a row stochastic assumption for distributed

optimization over random networks or with state-dependent interactions.



REFERENCES

[1] Z. Jiang, Distributed optimization for control and learning, Ph.D. dissertation, Dep. Mech.

Eng., Iowa State University, Ames, Iowa, USA, 2018.

[2] S. Boyd, N. Parikh, E. Chu, B. Peleato, and J. Eckstein, Distributed optimization and statisti-

cal learning via the alternating direction method of multipliers, Found. Trends. Mach. Learn.,

vol. 3, pp. 1–122, 2010.

[3] S. S. Ram, A. Nedić, and V. V. Veeravalli, A new class of distributed optimization algorithms:

Application to regression of distributed data, Optim. Methods Softw., vol. 27, no. 1, pp. 71–88,

2012.

[4] N. A. Lynch, Distributed Algorithms, San Mateo, CA: Morgan Kaufmann, 1997.

[5] M. H. DeGroot, Reaching a consensus, J. Amer. Statis. Assoc., vol. 69, no. 345, pp. 118–121,

1974.

[6] V. Borkar and P. P. Varaiya, Asymptotic agreement in distributed estimation, IEEE Trans.

on Automatic Control, vol. AC-27, no. 3, pp. 650–655, 1982.

[7] J. N. Tsitsiklis, Problems in decentralized decision making and computation, Ph.D. disserta-

tion, Dep. Elect. Eng. Comp. Sci., MIT, Cambridge, MA, 1984.

[8] D. P. Bertsekas and J. N. Tsitsiklis, Parallel and Distributed Computation: Numerical Methods,

Englewood Cliffs, NJ: Prentice Hall, 1989.

[9] J. R. Marden, G. Arslan, and J. S. Shamma, Cooperative control and potential games, IEEE

Trans. on Systems, Man, and Cybernetics–Part B: Cybernetics, vol. 39, no. 6, pp. 1393–1407,

2009.

[10] A. Nedić and A. Ozdaglar, Distributed subgradient methods for multi-agent optimization,

IEEE Trans. on Automatic Control, vol. 54, no. 1, pp. 48–61, 2009.

[11] J. Wang and N. Elia, Control approach to distributed optimization, Proc. of 48th Annual

Allerton Conf., Sep. 29–Oct. 1, Allerton House, UIUC, Illinois, USA, pp. 557–561, 2010.

[12] S. S. Kia, J. Cortés, and S. Martínez, Distributed convex optimization via continuous-time

coordination algorithms with discrete-time communication, Automatica, vol. 55, pp. 254–264,

2015.

[13] M. Franceschelli, A. Giua, and A. Pisano, Finite-time consensus on the median value with

robustness properties, IEEE Trans. on Automatic Control, vol. 62, no. 4, pp. 1652–1667, 2017.

[14] D. Varagnolo, F. Zanella, A. Cenedese, G. Pillonetto, and L. Schenato, Newton-Raphson

consensus for distributed convex optimization, IEEE Trans. on Automatic Control, vol. 61,

no. 4, pp. 994–1009, 2016.

[15] T-H. Chang, M. Hong, and X. Wang, Multi-agent distributed optimization via inexact con-

sensus ADMM, IEEE Trans. on Signal Processing, vol. 63, no. 2, pp. 482–497, 2015.

[16] N. Li and J. R. Marden, Designing games for distributed optimization, IEEE J. of Selected

Topics in Signal Processing, vol. 7, no. 2, pp. 230–242, 2013.

[17] P. Chebotarev and R. Agaev, The forest consensus theorem, IEEE Trans. on Automatic Con-

trol, vol. 59, no. 9, pp. 2475–2479, 2014.

[18] G. Shi and K. H. Johansson, The role of persistent graphs in the agreement seeking of social

networks, IEEE Journal on Selected Areas in Communications/Supplement, vol. 31, pp. 595–

606, 2013.

[19] R. Olfati-Saber, J. A. Fax, and R. M. Murray, Consensus and cooperation in networked multi-

agent systems, Proceedings of The IEEE, vol. 95, no. 1, pp. 215–233, 2007.

[20] D. Cheng, J. Wang, and X. Hu, An extension of LaSalle’s invariance principle and its ap-

plication to multi-agent consensus, IEEE Trans. on Automatic Control, vol. 53, no. 7, pp.

1765–1770, 2008.

[21] B. Ning, Q-L. Han, and Z. Zuo, Distributed optimization for multiagent systems: an edge-

based fixed-time consensus approach, IEEE Trans. on Cybernetics, vol. 49, pp. 122-132, 2019.

[22] G. Notarstefano and F. Bullo, Distributed abstract optimization via constraints consensus:

theory and applications, IEEE Trans. on Automatic Control, vol. 56, no. 10, pp. 2247–2261,

2011.

[23] Y. Zhang and Y. Hong, Multi-agent consensus convergence analysis with infinitely jointly

connected and directed communication graphs, Proc. of the 11th World Congress on Intelligent

Control and Automation, June 29–July 4, Shenyang, China, pp. 2091–2096, 2014.

[24] J. Liu and A. Nedić, and T. Başar, Complex constrained consensus, Proc. of 53th IEEE Conf.

on Decision and Control, Dec. 15–17, Los Angeles, CA, USA, pp. 1464–1469, 2014.

[25] A. Nedić and A. Olshevsky, Distributed optimization over time-varying directed graphs, IEEE

Trans. on Automatic Control, vol. 60, no. 3, pp. 601–615, 2015.

[26] A. Jadbabaie, J. Lin, and A. S. Morse, Coordination of groups of mobile autonomous agents

using nearest neighbor rules, IEEE Trans. on Automatic Control, vol. 48, no. 6, pp. 988–1001,

2003.

[27] C. N. Hadjicostis, N. H. Vaidya, and A. D. Domı́nguez-Garcı́a, Robust distributed average

consensus via exchange of running sums, IEEE Trans. on Automatic Control, vol. 61, pp.

1492–1507, 2016.

[28] T. Li and J. Wang, Distributed averaging with random network graphs and noises, IEEE

Trans. on Information Theory, vol. 64, pp. 7063–7080, 2018.



[29] W. Ren and R. W. Beard, Consensus seeking in multiagent systems under dynamically chang-

ing interaction topologies, IEEE Trans. on Automatic Control, vol. 50, no. 5, pp. 655–661,

2005.

[30] Y. Hatano and M. Mesbahi, Agreement over random networks, IEEE Trans. on Automatic

Control, vol. 50, no. 11, pp. 1867–1872, 2005.

[31] J. Wu, Z. Meng, T. Yang, G. Shi, and K. H. Johansson, Sampled-data consensus over random

networks, IEEE Trans. on Signal Processing, vol. 64, no. 17, pp. 4479–4492, 2016.

[32] Y. Wang, L. Cheng, W. Ren, Z-G. Hou, and M. Tan, Seeking consensus in networks of linear

agents: communication noises and Markovian switching topologies, IEEE Trans. on Automatic

Control, vol. 60, no. 5, pp. 1374–1379, 2015.

[33] S. Patterson, B. Bamieh, and A. E. Abbadi, Distributed average consensus with stochastic

communication failures, Proc. of 46th IEEE Conf. on Decision and Control, Dec. 12–14, New

Orleans, LA, USA, pp. 4215–4220, 2007.

[34] J. Wang and N. Elia, Distributed averaging algorithms resilient to communication noise and

dropouts, IEEE Trans. on Signal Processing, vol. 61, no. 9, pp. 2231–2242, 2013.

[35] Y. Qin, M. Cao, and B. D. O. Anderson, Asynchronous agreement through distributed coordi-

nation algorithms associated with periodic matrices, Proc. of 20th IFAC World Congress, July

9–14, Toulouse, France, pp. 1742–1747, 2017.

[36] G. Shi, B. D. O. Anderson, and K. H. Johansson, Consensus over random graph processes: net-

work Borel-Cantelli lemmas for almost sure convergence, IEEE Trans. on Information Theory,

vol. 61, pp. 5690–5707, 2015.

[37] F. Bénézit, V. Blondel, P. Thiran, J. Tsitsiklis, and M. Vetterli, Weighted gossip: distributed

averaging using non-doubly stochastic matrices, Proc. of IEEE International Symposium on

Information Theory, June 13–18, Austin, Texas, USA, pp. 1753–1757, 2010.

[38] A. D. Domı́nguez-Garcı́a and C. N. Hadjicostis, Distributed strategies for average consensus

in directed graphs, Proc. of IEEE Conf. on Decision and Control and European Control Conf.,

Dec. 12–15, Orlando, FL, USA, pp. 2124–2129, 2011.

[39] T. C. Aysal, M. E. Yildiz, A. D. Sarwate, and A. Scaglione, Broadcast gossip algorithms for

consensus, IEEE Trans. on Signal Processing, vol. 57, no. 7, pp. 2748–2761, 2009.

[40] F. Fagnani and S. Zampieri, Average consensus with packet drop communication, SIAM J.

Control and Optimization, vol. 48, no. 1, pp. 102–133, 2009.

[41] N. Bof, R. Carli, and L. Schenato, Average consensus with asynchronous updates and unreliable

communication, Proc. of 20th IFAC World Congress, July 9–14, Toulouse, France, pp. 601–606,

2017.

[42] M. Huang, Stochastic approximation for consensus: A new approach via ergodic backward

products, IEEE Trans. on Automatic Control, vol. 57, no. 12, pp. 2994–3008, 2012.

[43] N. Abaid and M. Porfiri, Consensus over numerosity-constrained random networks, IEEE

Trans. on Automatic Control, vol. 56, no. 3, pp. 649–654, 2011.

[44] L. Xiao, S. Boyd, and S. Lall, A scheme for robust distributed sensor fusion based on av-

erage consensus, Proc. of 4th International Symposium on Information Processing in Sensor

Networks, April 25–27, UCLA, Los Angeles, California, USA, pp. 63–70, 2005.

[45] S. Kar, J. M. F. Moura, Sensor networks with random links: topology design for distributed

consensus, IEEE Trans. on Signal Processing, vol. 56, pp. 3315–3326, 2008.

[46] S. Kar, J. M. F. Moura, Distributed average consensus in sensor networks with random link

failures and communication channel noise, Proc. of Conf. Record of the Forty-First Asilomar

Conf. on Signals, Systems and Computers, pp. 676–680, 2007.

[47] S. Kar and J. M. F. Moura, Distributed average consensus in sensor networks with random link

failures, IEEE Int. Conf. on Acoustics, Speech and Signal Processing, pp. 1013–1016, 2007.

[48] S. S. Pereira and A. Pagès-Zamora, Consensus in correlated random wireless sensor networks,

IEEE Trans. Signal Processing, vol. 59, pp. 6279–6284, 2011.

[49] A. Tahbaz-Salehi and A. Jadbabaie, Consensus over ergodic stationary graph processes, IEEE

Trans. on Automatic Control, vol. 55, pp. 225–230, 2010.

[50] G. Shi and K. H. Johansson, Randomized optimal consensus of multi-agent systems, Automat-

ica, vol. 48, pp. 3018–3030, 2012.

[51] I. Lobel and A. Ozdaglar, Distributed subgradient methods for convex optimization over ran-

dom networks, IEEE Trans. on Automatic Control, vol. 56, no. 6, pp. 1291–1306, 2011.

[52] I. Lobel, A. Ozdaglar, and D. Feiger, Distributed multi-agent optimization with state-

dependent communication, Math. Program. Ser. B, vol. 129, pp. 255–284, 2011.

[53] D. Wang, D. Wang, W. Wang, Y. Liu, and F. E. Alsaadi, Distributed optimization for multi-

agent systems with the first order integrals under markovian switching topologies, International

J. of Systems Science, vol. 48, pp. 1787–1795, 2017.

[54] J. C. Duchi, A. Agarwal, and M. J. Wainwright, Dual averaging for distributed optimization:

convergence analysis and network scaling, IEEE Trans. on Automatic Control, vol. 57, pp.

592–606, 2012.

[55] D. Jakovetić, J. M. F. Xavier, and J. M. F. Moura, Convergence rates of distributed Nesterov-

like gradient methods on random networks, IEEE Trans. on Signal Processing, vol. 62, pp.

868–882, 2014.

[56] T. Wada, I. Masubuchi, K. Hanada, T. Asai, and Y. Fujisaki, Distributed multi-objective opti-

mization over randomly varying unbalanced networks, Proc. of the 20th IFAC World Congress,

July 9–14, Toulouse, France, pp. 2403–2408, 2017.



[57] I. Matei and J. S. Baras, Performance evaluation of the consensus-based distributed subgra-

dient method under random communication topologies, IEEE J. of Selected Topics in Signal

Processing, vol. 5, no. 4, pp. 754–771, 2011.

[58] L. Su, On the convergence rate of average consensus and distributed optimization over un-

reliable networks, Proc. of Asilomar Conf. on Signals, Systems, and Computers, Oct. 28–31,

Pacific Grove, CA, USA, pp. 43–47, 2018.

[59] D. Jakovetic, D. Bajovic, A. K. Sahu, and S. Kar, Convergence rates for distributed stochastic

optimization over random networks, Proc. of IEEE Conf. on Decision and Control, Dec. 17–19,

Miami Beach, FL, USA, pp. 4238–4245, 2018.

[60] A. K. Sahu, D. Jakovetic, D. Bajovic, and S. Kar, Distributed zeroth order optimization over

random networks: a Kiefer-Wolfowitz stochastic approximation approach, Proc. of IEEE Conf.

on Decision and Control, Dec. 17–19, Miami Beach, FL, USA, pp. 4951–4958, 2018.

[61] H. Iiduka, Fixed point optimization algorithms for distributed optimization in networked sys-

tems, SIAM J. Optim., vol. 23, pp. 1–26, 2013.

[62] R. Carli, G. Notarstefano, L. Schenato, and D. Varagnolo, Distributed quadratic program-

ming under asynchronous and lossy communications via Newton-Raphson consensus, Proc. of

European Control Conf., July 15–17, Linz, Austria, pp. 2514–2520, 2015.

[63] N. Bastianello, M. Todescato, R. Carli, L. Schenato, Distributed optimization over lossy net-

works via relaxed Peaceman-Rachford splitting: a robust ADMM approach, Proc. of European

Control Conf., July 12–15, Limassol, Cyprus, pp. 477–482, 2018.

[64] L. Majzoobi, V. Shah-Mansouri, and F. Lahouti, Analysis of distributed ADMM algorithm for

consensus optimization over lossy networks, IET Signal Processing, vol. 12, pp. 786–794, 2018.

[65] J. Xu, S. Zhu, Y. C. Soh, and L. Xie, A Bregman splitting scheme for distributed optimization

over networks, IEEE Trans. on Automatic Control, vol. 63, pp. 3809–3824, 2018.

[66] R. Carli, G. Notarstefano, L. Schenato, and D. Varagnolo, Analysis of Newton-Raphson con-

sensus for multi-agent convex optimization under asynchronous and lossy communications,

Proc. of IEEE 54th Annual Conf. on Decision and Control, Dec. 15–18, Osaka, Japan, pp.

418–424, 2015.

[67] T-H. Chang, A randomized dual consensus ADMM method for multi-agent distributed opti-

mization, Proc. of IEEE Int. Conf. on Acoustics, Speech, and Signal Processing, April 19–24,

Brisbane, Queensland, Australia, pp. 3541–3545, 2015.

[68] N. Bastianello, R. Carli, L. Schenato, and M. Todescato, A partition-based implementation of

the relaxed ADMM distributed convex optimization over lossy networks, Proc. of IEEE Conf.

on Decision and Control, Dec. 17–19, Miami Beach, FL, USA, pp. 3379–3384, 2018.

[69] J. Zhang, K. You, and T. Başar, Distributed discrete-time optimization by exchanging one bit

of information, Proc. of Annual Amer. Cont. Conf., June 27–29, Wisconsin Center, Milwaukee,

USA, pp. 2065–2070, 2018.

[70] A. Nedić, Asynchronous broadcast-based convex optimization over a network, IEEE Trans. on

Automatic Control, vol. 56, no. 6, pp. 1337–1351, 2011.

[71] K. Srivastava and A. Nedić, Distributed asynchronous constrained stochastic optimization,

IEEE J. of Selected Topics in Signal Processing, vol. 5, pp. 772–790, 2011.

[72] B. Touri, A. Nedić, and S. S. Ram, Asynchronous stochastic convex optimization over ran-

dom networks: error bounds, Information Theory and Applications Workshop, Jan. 31-Feb. 5,

University of California, San Diego, USA, pp. 1–10, 2010.

[73] P. Bianchi, W. Hachem, and F. Iutzeler, A coordinate descent primal-dual algorithm and

application to distributed asynchronous optimization, IEEE Trans. on Automatic Control,

vol. 61, pp. 2947–2957, 2016.

[74] P. Bianchi, W. Hachem, and F. Iutzeler, A stochastic primal-dual algorithm for distributed

asynchronous composite optimization, Proc. of IEEE Global Conf. on Signal and Information

Processing, Dec. 3–5, Georgia Tech Hotel & Conference Center, Atlanta, GA, USA, pp. 732–

736, 2014.

[75] I. Notarnicola and G. Notarstefano, Asynchronous distributed optimization via randomized

dual proximal gradient, IEEE Trans. on Automatic Control, vol. 62, pp. 2095–2106, 2017.

[76] F. Iutzeler, P. Bianchi, P. Ciblat, and W. Hachem, Asynchronous distributed optimization us-

ing a randomized Alternating Direction Method of Multipliers, Proc. of 52th Conf. on Decision

and Control, Dec. 10–13, Florence, Italy, pp. 3671–3676, 2013.

[77] I. Notarnicola, R. Carli, and G. Notarstefano, Distributed partitioned big-data optimization

via asynchronous dual decomposition, IEEE Trans. on Control of Network Systems, vol. 5, pp.

1910–1919, 2018.

[78] F. Zanella, D. Varagnolo, A. Cenedese, G. Pillonetto, and L. Schenato, Asynchronous Newton-

Raphson consensus for distributed convex optimization, Proc. of 3rd IFAC Workshop on Dis-

tributed Estimation and Control in Networked Systems, Sep. 14–15, Santa Barbara, CA, USA,

pp. 133–138, 2012.

[79] M. Zhong and C. G. Cassandras, Asynchronous distributed optimization with minimal commu-

nication and connectivity preservation, Proc. of the Joint 48th Conf. on Decision and Control

and 28th Chinese Control Conf., Dec. 16–18, Shanghai, China, pp. 5396–5401, 2009.

[80] J. N. Tsitsiklis, D. P. Bertsekas, and M. Athans, Distributed asynchronous deterministic and

stochastic gradient optimization algorithms, IEEE Trans. on Automatic Control, vol. AC-31,

no. 9, pp. 803–812, 1986.

[81] S. Lee and A. Nedić, Distributed random projection algorithm for convex optimization, IEEE

J. of Selected Topics in Signal Processing, vol. 7, no. 2, pp. 221–229, 2013.

[82] M. Zhong and C. G. Cassandras, Asynchronous distributed optimization with event-driven

communication, IEEE Trans. on Automatic Control, vol. 55, pp. 2735–2750, 2010.

[83] M. Zhong and C. G. Cassandras, Asynchronous distributed optimization with minimal com-

munication, Proc. of the 47th Conf. on Decision and Control, Dec. 9–11, Cancun, Mexico, pp.

363–368, 2008.

[84] W. Liu and Z. Hua, Asynchronous algorithms for distributed optimisation and application to

distributed regression with robustness to outliers, IET Control Theory and Applications, vol.

7, pp. 2084–2089, 2013.

[85] G. Shi, K. H. Johansson, and Y. Hong, Reaching an optimal consensus: dynamical systems

that compute intersections of convex sets, IEEE Trans. on Automatic Control, vol. 58, pp.

610–622, 2013.

[86] A. Simonetto, T. Keviczky, and R. Babuška, Constrained distributed algebraic connectivity

maximization in robotic networks, Automatica, vol. 49, pp. 1348–1357, 2013.

[87] Y. Kim and M. Mesbahi, On maximizing the second smallest eigenvalue of a state-dependent

graph laplacian, IEEE Trans. on Autom. Contr., vol. 51, pp. 116–120, 2006.

[88] D. D. Siljak, Dynamic graphs, Nonlin. Analysis: Hybrid Syst., vol. 2, pp. 544–567, 2008.

[89] S. Motsch and E. Tadmor, Heterophilious dynamics enhances consensus, SIAM Review, vol.

56, pp. 577–621, 2014.

[90] I. Rajapakse, M. Groudine, and M. Mesbahi, Dynamics and control of state-dependent net-

works for probing genomic organization, Proc. of Nation. Acad. of Scien. of the USA, vol. 108,

pp. 17257–17262, 2011.

[91] V. Trianni, D. D. Simone, A. Reina, and A. Baronchelli, Emergence of consensus in a multi-

robot network: from abstract models to empirical validation, IEEE Robotics and Automation

Letters, vol. 1, no. 1, pp. 348–353, 2016.

[92] R. Mehmood and J. Crowcroft, Parallel iterative solution method of large sparse linear equation

systems, Technical Report, University of Cambridge, 2005.



[93] S. Mou and A. S. Morse, A fixed-neighbor, distributed algorithm for solving a linear algebraic

equation, Proc. of European Control Conf., July 17-19, Zürich, Switzerland, pp. 2269–2273,

2013.

[94] Y. Shang, A distributed memory parallel Gauss-Seidel algorithm for linear algebraic systems,

Computers and Mathematics with Applications, vol. 57, pp. 1369–1376, 2009.

[95] G. Shi and B. D. O. Anderson, Distributed network flows solving linear algebraic equations,

Proc. of American Control Conf., Boston Marriott Copley Place, July 6-8, Boston, Mas-

sachusetts, USA, pp. 2864–2869, 2016.

[96] J. Liu, A. S. Morse, A. Nedić, and T. Başar, Stability of a distributed algorithm for solving

linear algebraic equations, Proc. of 53rd IEEE Conf. on Decision and Control, Dec. 15-17, Los

Angeles, California, USA, pp. 3707–3712, 2014.

[97] X. Gao, J. Liu, and T. Başar, Stochastic communication-efficient distributed algorithms for

solving linear algebraic equations, Proc. of IEEE Conf. on Control Applications, part of IEEE

Multi-Conference on Systems and Control, Sept. 19-22, Buenos Aires, Argentina, pp. 380–385,

2016.

[98] J. Liu, X. Chen, T. Başar, and A. Nedić, A continuous-time distributed algorithm for solving

linear equations, Proc. of Amer. Cont. Conf., Boston Marriott Copley Place, July 6-8, Boston,

MA, USA, pp. 5551–5556, 2016.

[99] S. Mou, A. S. Morse, Z. Lin, L. Wang, and D. Fullmer, A distributed algorithm for efficiently

solving linear equations, IEEE 54th Annual Conf. on Decision and Control, Dec. 15–18, Osaka,

Japan, pp. 6791–6796, 2015.

[100] J. Liu, A. S. Morse, A. Nedić, and T. Başar, Exponential convergence of a distributed algo-

rithm for solving linear algebraic equations, Automatica, vol. 83, pp. 37–46, 2017.

[101] S. Mou, J. Liu, and A. S. Morse, A distributed algorithm for solving a linear algebraic

equation, Proc. of 51st Annual Allerton Conf., Allerton House, UIUC, Octob. 2-3, Illinois,

USA, pp. 267–274, 2013.

[102] J. Liu, S. Mou, and A. S. Morse, An asynchronous distributed algorithm for solving a linear

algebraic equation, Proc. of 52nd IEEE Conf. on Decision and Control, Dec. 10-13, Florence,

Italy, pp. 5409–5414, 2013.

[103] S. Mou, J. Liu, and A. S. Morse, A distributed algorithm for solving a linear algebraic

equation, IEEE Trans. on Automatic Control, vol. 60, pp. 2863–2878, 2015.

[104] K. You, S. Song, and R. Tempo, A networked parallel algorithm for solving linear algebraic

equations, Proc. of 55th IEEE Conf. on Decision and Control, ARIA Resort & Casino, Dec.

12-14, Las Vegas, USA, pp. 1727–1732, 2016.

[105] X. Wang, S. Mou, and D. Sun, Further discussions on a distributed algorithm for solving

linear algebra equations, Proc. of Amer. Cont. Conf., Sheraton Seattle Hotel, May 24-26,

Seattle, WA, USA, pp. 4274–4278, 2017.

[106] P. Wang, W. Ren, and Z. Duan, Distributed solution to linear equations from arbitrary

initializations, Proc. of Amer. Cont. Conf., Sheraton Seattle Hotel, May 24-26, Seattle, WA,

USA, pp. 3986–3991, 2017.

[107] W. Shen, B. Yin, X. Cao, Y. Cheng, and X. Shen, A distributed secure outsourcing scheme

for solving linear algebraic equations in ad hoc clouds, IEEE Trans. on Cloud Computing, to

appear.

[108] M. Yang and C. Y. Tang, A distributed algorithm for solving general linear equations over

networks, IEEE 54th Annual Conf. on Decision and Control, Dec. 15–18, Osaka, Japan, pp.

3580–3585, 2015.

[109] P. Wang, Y. Gao, N. Yu, W. Ren, J. Lian, and D. Wu, Communication-efficient distributed solutions to a system of linear equations with Laplacian sparse structure, Proc. of IEEE Conf. on Decision and Control, Dec. 17–19, Miami Beach, FL, USA, pp. 3367–3372, 2018.

[110] J. Zhou, W. Xuan, S. Mou, and B. D. O. Anderson, Distributed algorithm for achieving minimum l1 norm solutions of linear equations, Proc. of Annual Amer. Cont. Conf., June 27–29, Wisconsin Center, Milwaukee, USA, pp. 5857–5862, 2018.

[111] J. Liu, S. Mou, and A. S. Morse, Asynchronous distributed algorithms for solving linear algebraic equations, IEEE Trans. on Automatic Control, vol. 63, pp. 372–385, 2018.

[112] X. Wang, S. Mou, and D. Sun, Improvement of a distributed algorithm for solving linear equations, IEEE Trans. on Industrial Electronics, vol. 64, no. 4, pp. 3113–3117, 2017.

[113] L. Wang, D. Fullmer, and A. S. Morse, A distributed algorithm with an arbitrary initialization for solving a linear algebraic equation, Proc. of American Control Conf., Boston Marriott Copley Place, July 6–8, Boston, Massachusetts, USA, pp. 1078–1081, 2016.

[114] J. Liu, X. Gao, and T. Başar, A communication-efficient distributed algorithm for solving linear algebraic equations, Proc. of 7th International Conf. on Network Games, Control, and Optimization, Oct. 29–31, Trento, Italy, pp. 62–69, 2014.

[115] D. Fullmer, L. Wang, and A. S. Morse, A distributed algorithm for computing a common fixed point of a family of paracontractions, Proc. of 10th IFAC Symposium on Nonlinear Control Systems, August 23–25, Marriott Hotel Monterey, California, USA, pp. 552–557, 2016.

[116] D. Fullmer, J. Li, and A. S. Morse, An asynchronous distributed algorithm for computing a common fixed point of a family of paracontractions, Proc. of 55th IEEE Conf. on Decision and Control, ARIA Resort & Casino, Dec. 12–14, Las Vegas, USA, pp. 2620–2625, 2016.

[117] P. Weng, W. Ren, and Z. Duan, Distributed minimum weighted norm solution to linear equations associated with weighted inner product, Proc. of 55th IEEE Conf. on Decision and Control, ARIA Resort & Casino, Dec. 12–14, Las Vegas, USA, pp. 5220–5225, 2016.

[118] B. D. O. Anderson, S. Mou, A. S. Morse, and U. Helmke, Decentralized gradient algorithm for solution of a linear equation, Numerical Algebra, Control and Optimization, vol. 6, pp. 319–328, 2016.

[119] S. Mou, Z. Lin, L. Wang, D. Fullmer, and A. S. Morse, A distributed algorithm for efficiently solving linear equations and its applications (Special issue JCW), Systems & Control Letters, vol. 91, pp. 21–27, 2016.

[120] J. Wang and N. Elia, Distributed solution of linear equations over unreliable networks, Proc. of Amer. Cont. Conf., Boston Marriott Copley Place, July 6–8, Boston, MA, USA, pp. 6471–6476, 2016.

[121] J. Wang and N. Elia, Solving systems of linear equations by distributed convex optimization in the presence of stochastic uncertainty, Proc. of the 19th IFAC World Congress, August 24–29, Cape Town, South Africa, pp. 1210–1215, 2014.

[122] J. Wang and N. Elia, Distributed least square with intermittent communications, Proc. of Amer. Cont. Conf., June 27–29, Fairmont Queen Elizabeth, Montréal, Canada, pp. 6479–6484, 2012.

[123] Z. Liu and C. Li, Distributed sparse recursive least-squares over networks, IEEE Trans. on Signal Processing, vol. 62, no. 6, pp. 1386–1395, 2014.

[124] D. E. Marelli and M. Fu, Distributed weighted least-squares estimation with fast convergence for large-scale systems, Automatica, vol. 51, pp. 27–39, 2015.

[125] J. Lu and C. Y. Tang, Distributed asynchronous algorithms for solving positive definite linear equations over networks–Part I: agent networks, Proc. of the First IFAC Workshop on Estimation and Control of Networked Systems, Sep. 24–26, Venice, Italy, pp. 252–257, 2009.

[126] J. Lu and C. Y. Tang, Distributed asynchronous algorithms for solving positive definite linear equations over networks–Part II: wireless networks, Proc. of the First IFAC Workshop on Estimation and Control of Networked Systems, Sep. 24–26, Venice, Italy, pp. 258–263, 2009.

[127] X. Wang, J. Zhou, S. Mou, and M. J. Corless, A distributed linear equation solver for least square solutions, Proc. of the IEEE 56th Annual Conf. on Decision and Control, Dec. 12–15, Melbourne, Australia, pp. 5955–5960, 2017.

[128] J. Lu and C. Y. Tang, A distributed algorithm for solving positive definite linear equations over networks with membership dynamics, IEEE Trans. on Control of Network Systems, vol. 5, no. 1, pp. 215–227, 2018.

[129] X. Wang, S. Mou, and B. D. O. Anderson, A distributed algorithm with scalar states for solving linear equations, Proc. of IEEE Conf. on Decision and Control, Dec. 17–19, Miami Beach, FL, USA, pp. 2861–2865, 2018.

[130] A. H. Sayed and C. G. Lopes, Distributed recursive least-squares over adaptive networks, Proc. of Fortieth Asilomar Conf. on Signals, Systems and Computers, Oct. 29–Nov. 1, Pacific Grove, CA, USA, pp. 233–237, 2006.

[131] J. Wang, I. S. Ahn, Y. Lu, T. Yang, and G. Staskevich, A distributed least-squares algorithm in wireless sensor networks with limited communication, Proc. of IEEE Int. Conf. Electro Information Technology, May 14–17, Lincoln, NE, USA, pp. 467–471, 2017.

[132] S. Huang and C. Li, Distributed sparse total least-squares over networks, IEEE Trans. on Signal Processing, vol. 63, pp. 2986–2998, 2015.

[133] G. Mateos, I. D. Schizas, and G. B. Giannakis, Distributed recursive least-squares for consensus-based in-network adaptive estimation, IEEE Trans. on Signal Processing, vol. 57, pp. 4583–4588, 2009.

[134] G. Mateos and G. B. Giannakis, Distributed recursive least-squares: stability and performance analysis, IEEE Trans. on Signal Processing, vol. 60, pp. 3740–3754, 2012.

[135] R. Z. Has’minskiĭ, Stochastic Stability of Differential Equations, Germantown, MD: Sijthoff & Noordhoff, 1980.

[136] L. Arnold, Random Dynamical Systems, Berlin, Heidelberg: Springer-Verlag, 1998.

[137] C. S. Kubrusly and O. L. V. Costa, Mean square stability conditions for discrete stochastic bilinear systems, IEEE Trans. on Automatic Control, vol. AC-30, no. 11, pp. 1082–1087, 1985.

[138] A. R. Teel, J. P. Hespanha, and A. Subbaraman, Equivalent characterizations of input-to-state stability for stochastic discrete-time systems, IEEE Trans. on Automatic Control, vol. 59, no. 2, pp. 516–522, 2014.

[139] F. Wei and T. Jie, Moment and sample path stability of switched discrete stochastic systems, Proc. of the 30th Chinese Control Conf., July 22–24, Yantai, China, pp. 1810–1814, 2011.

[140] L. Huang, H. Hjalmarsson, and H. Koeppl, Almost sure stability and stabilization of discrete-time stochastic systems, Systems & Control Letters, vol. 82, pp. 26–32, 2016.

[141] S. Sathananthan, M. J. Knap, A. Strong, and L. H. Keel, Robust stability and stabilization of a class of nonlinear discrete time stochastic systems: an LMI approach, Applied Mathematics and Computation, vol. 219, pp. 1988–1997, 2012.

[142] C. Possieri and A. R. Teel, Asymptotic stability in probability for stochastic Boolean networks, Automatica, vol. 83, pp. 1–9, 2017.

[143] A. R. Teel, Lyapunov conditions certifying stability and recurrence for a class of stochastic hybrid systems, Annual Reviews in Control, vol. 37, pp. 1–24, 2013.

[144] Y. A. Phillis, On the stabilization of discrete linear time-varying stochastic systems, IEEE Trans. on Syst. Man Cyber., vol. SMC-12, no. 3, pp. 415–417, 1982.

[145] W. Yijing, S. Yunna, and Z. Zhiqiang, Stochastic stability and stabilization of discrete-time Markovian jump systems in the presence of incomplete knowledge of transition probabilities, Proc. of the 30th Chinese Control Conf., July 22–24, Yantai, China, pp. 1395–1399, 2011.

[146] T. Hou and H. Ma, Stability analysis of discrete-time stochastic systems with infinite Markov jump parameter, Proc. of Amer. Cont. Conf., Palmer House Hilton, July 1–3, Chicago, IL, USA, pp. 4192–4197, 2015.

[147] Z. Zuo, Y. Liu, Y. Wang, and H. Li, Finite-time stochastic stability and stabilisation of linear discrete-time Markovian jump systems with partly unknown transition probabilities, IET Control Theory and Applications, vol. 6, pp. 1522–1526, 2012.

[148] O. L. V. Costa and D. Z. Figueiredo, Stochastic stability of jump discrete-time linear systems with Markov chain in a general Borel space, IEEE Trans. on Automatic Control, vol. 59, no. 1, pp. 223–227, 2014.

[149] C. A. C. Gonzaga and O. L. V. Costa, Stochastic stability for discrete-time Markov jump Lur’e systems, Proc. of 52nd IEEE Conf. on Decision and Control, Dec. 10–13, Florence, Italy, pp. 5993–5998, 2013.

[150] Z. Zuo, H. Li, Y. Wang, and Y. Liu, Robust finite-time stochastic stability analysis and control synthesis of uncertain discrete-time Markovian jump linear systems, Proc. of the 10th World Congress on Intelligent Control and Automation, July 6–8, Beijing, China, pp. 1925–1929, 2012.

[151] H. Linlin, Z. Guangdeng, and W. Yuqiang, Exponential l2 − l∞ stochastic stability analysis for discrete-time switching Markov jump linear systems, Proc. of the 30th Chinese Control Conf., July 22–24, Yantai, China, pp. 1722–1727, 2011.

[152] C. A. C. Gonzaga and O. L. V. Costa, Stochastic stabilization and induced l2-gain for discrete-time Markov jump Lur’e systems with control saturation, Automatica, vol. 50, pp. 2397–2404, 2014.

[153] U. Vaidya, Stochastic stability analysis of discrete-time system using Lyapunov measure, Proc. of Amer. Cont. Conf., Palmer House Hilton, July 1–3, Chicago, IL, USA, pp. 4646–4651, 2015.

[154] U. Vaidya and V. Chinde, Computation of the Lyapunov measure for almost everywhere stochastic stability, Proc. of the 54th Annual Conf. on Decision and Control, Dec. 15–18, Osaka, Japan, pp. 7042–7047, 2015.

[155] A. R. Teel, J. P. Hespanha, and A. Subbaraman, A converse Lyapunov theorem and robustness for asymptotic stability in probability, IEEE Trans. on Autom. Cont., vol. 59, no. 9, pp. 2426–2441, 2014.

[156] W. Zhang, X. Lin, and B-S. Chen, LaSalle-type theorem and its applications to infinite horizon optimal control of discrete-time nonlinear stochastic systems, IEEE Trans. on Automatic Control, vol. 62, no. 1, pp. 250–261, 2017.

[157] A. Olshevsky and J. Tsitsiklis, On the nonexistence of quadratic Lyapunov functions for consensus algorithms, IEEE Trans. on Automatic Control, vol. 53, no. 11, pp. 2642–2645, 2008.

[158] J. M. Davis and G. Eisenbarth, On Positivstellensatz and nonexistence of common quadratic Lyapunov functions, Proc. of IEEE 43rd Southeastern Symposium on System Theory, pp. 55–58, 2011.

[159] R. H. Ordóñez-Hurtado and M. A. Duarte-Mermoud, A methodology for determining the non-existence of common quadratic Lyapunov functions for pairs of stable systems, Proc. of 5th International Conf. on Genetic and Evolutionary Computing, August 29–September 01, Kitakyushu Convention Center, Kitakyushu, Japan, pp. 127–130, 2011.

[160] M. A. Duarte-Mermoud, R. H. Ordóñez-Hurtado, and P. Zagalak, A method for determining the non-existence of a common quadratic Lyapunov function for switched linear systems based on particle swarm optimisation, International J. of Systems Science, vol. 43, no. 11, pp. 2015–2029, 2012.

[161] L. Wang and Z. Xu, On characterizations of exponential stability of nonlinear discrete dynamical systems on bounded regions, IEEE Trans. on Autom. Cont., vol. 52, no. 10, pp. 1871–1881, 2007.

[162] Y. Fu, Q. Zhao, and L. Wang, A converse to global exponential stability of a class of difference equations, Advances in Difference Equations, vol. 9, pp. 1–10, 2013.

[163] F. Cucker and S. Smale, Emergent behavior in flocks, IEEE Trans. on Automatic Control, vol. 52, pp. 852–862, 2007.

[164] S. Sh. Alaviani, A necessary and sufficient condition for delay-independent stability of linear time-varying neutral delay systems, J. of The Franklin Institute, vol. 351, pp. 2574–2581, 2014.

[165] S. Sh. Alaviani, Delay-dependent exponential stability of linear time-varying neutral delay systems, Proc. of the 12th IFAC Workshop on Time Delay Systems, June 28–30, University of Michigan, Ann Arbor, Michigan, USA, pp. 177–179, 2015.

[166] D. Bandyopadhyay, On a study of some results on the theory of fixed point of operators, Ph.D. dissertation, Dep. of Pure Math., University of Calcutta, 1997.

[167] D. R. Smart, Fixed Point Theorems, Cambridge Univ. Press, London, 1974.

[168] W. Rudin, Functional Analysis, McGraw-Hill: Singapore, 1991, second edition.

[169] R. T. Rockafellar, Monotone operators and the proximal point algorithm, SIAM J. Control and Optimization, vol. 14, no. 5, pp. 877–898, 1976.

[170] H-K. Xu, Some random fixed point theorems for condensing and nonexpansive operators, Proceedings of The American Mathematical Society, vol. 110, no. 2, pp. 395–400, 1990.

[171] M. A. Krasnoselskii, Two remarks on the method of successive approximations, Uspekhi Mat. Nauk, vol. 10, pp. 123–127, 1955.

[172] W. R. Mann, Mean value methods in iteration, Proc. of the Amer. Math. Soc., vol. 4, pp. 506–510, 1953.

[173] S. Boyd and L. Vandenberghe, Convex Optimization, Cambridge University Press: New York, 2004.

[174] A. Kumar and S. Rathee, Some common fixed point and invariant approximation results for nonexpansive mappings in convex metric space, Fixed Point Theory and Appl., vol. 182, pp. 1–14, 2014.

[175] R. Espinola, P. Lorenzo, and A. Nicolae, Fixed points, selections and common fixed points for nonexpansive mappings, J. Math. Anal. Appli., vol. 382, pp. 503–515, 2011.

[176] I. Bula, Strictly convex metric spaces and fixed points, Mathematica Moravica, vol. 3, pp. 5–16, 1999.

[177] K. Goebel and S. Reich, Uniform Convexity, Hyperbolic Geometry, and Nonexpansive Mappings, New York: Marcel Dekker, 1984.

[178] G. C. Calafiore and M. C. Campi, The scenario approach to robust control design, IEEE Trans. Automatic Control, vol. 51, pp. 742–753, 2006.

[179] G. Calafiore and M. C. Campi, Robust convex programs: randomized solutions and applications in control, Proc. of 42nd IEEE Conf. on Decision and Control, Dec. 9–12, Maui, Hawaii, USA, pp. 2423–2428, 2003.

[180] R. Tempo, G. Calafiore, and F. Dabbene, Randomized Algorithms for Analysis and Control of Uncertain Systems with Applications, Springer-Verlag: London, 2013.

[181] K. Yang, J. Huang, Y. Wu, X. Wang, and M. Chiang, Distributed robust optimization (DRO) part I: framework and example, Optim. Eng., vol. 15, pp. 35–67, 2014.

[182] M. Bürger, G. Notarstefano, and F. Allgöwer, Distributed robust optimization via cutting-plane consensus, Proc. of IEEE Conf. on Decision and Control, Dec. 10–13, Maui, Hawaii, USA, pp. 7457–7463, 2012.

[183] M. Bürger, G. Notarstefano, and F. Allgöwer, A polyhedral approximation framework for convex and robust distributed optimization, IEEE Trans. on Automatic Control, vol. 59, pp. 384–395, 2014.

[184] L. Carlone, V. Srivastava, F. Bullo, and G. C. Calafiore, Distributed random convex programming via constraint consensus, SIAM J. Control Optim., vol. 52, pp. 629–662, 2014.

[185] S. Wang and C. Li, Distributed random optimization in networked system, IEEE Trans. on Cybernetics, vol. 47, pp. 2321–2333, 2017.

[186] K. You and R. Tempo, Parallel algorithms for robust convex programs over networks, Proc. of Amer. Cont. Conf., July 6–8, Boston Marriott Copley Place, Boston, MA, USA, pp. 2017–2022, 2016.

[187] K. You, R. Tempo, and P. Xie, Distributed algorithms for robust convex optimization via the scenario approach, IEEE Trans. on Automatic Control, vol. 64, pp. 880–895, 2019.

[188] A. Ben-Tal and A. Nemirovski, Robust convex optimization, Mathematics of Operations Research, vol. 23, pp. 769–805, 1998.

[189] V. Jeyakumar and G. Y. Li, Strong duality in robust convex programming: complete characterizations, SIAM J. Optim., vol. 20, pp. 3384–3407, 2010.

[190] J. H. Lee and G. M. Lee, On approximate solutions for robust convex optimization problems, www.kurims.kyoto-u.ac.jp/ kyodo/kokyuroku/contents/pdf/2011-02.pdf

[191] K. Cai, W. Li, F. Ju, and X. Zhu, A scenario-based optimization approach to robust estimation of airport apron capacity, Integrated Communications, Navigation, Surveillance Conf., April 10–12, Herndon, VA, USA, pp. 3A1-1–3A1-8, 2018.

[192] B. T. Polyak, Random algorithms for solving convex inequalities, Studies in Computational Mathematics, vol. 8, pp. 409–422, 2001.

[193] T. Wada and Y. Fujisaki, Sequential randomized algorithms for robust convex optimization, IEEE Transactions on Automatic Control, vol. 60, pp. 3356–3361, 2015.

[194] N. Ho-Nguyen and F. Kilinc-Karzan, Online first-order framework for robust convex optimization, Operations Research, vol. 66, pp. 1670–1692, 2018.

[195] C. Kroer, N. Ho-Nguyen, G. Lu, and F. Kilinc-Karzan, Performance evaluation of iterative methods for solving robust convex quadratic problems, 10th NIPS Workshop on Optimization for Machine Learning, Dec. 8, Long Beach, USA, 2017.

[196] A. Mutapcic and S. Boyd, Cutting-set methods for robust convex optimization with pessimizing oracles, Optimization Methods & Software, vol. 24, pp. 381–406, 2009.

[197] K. Margellos, A. Falsone, S. Garatti, and M. Prandini, Distributed constrained optimization and consensus in uncertain networks via proximal minimization, IEEE Trans. on Automatic Control, vol. 63, pp. 1372–1387, 2018.

[198] F. Alismail, P. Xiong, and C. Singh, Optimal wind farm allocation in multi-area power systems using distributionally robust optimization approach, IEEE Trans. on Power Systems, vol. 33, pp. 536–544, 2018.

[199] D. Apostolopoulou, Z. De Greve, and M. McCulloch, Robust optimization for hydroelectric system operation under uncertainty, IEEE Trans. on Power Systems, vol. 33, pp. 3337–3348, 2018.

[200] C. Wang, J. Zhu, and T. Zhu, Decentralized robust optimization for real-time dispatch of power system based on approximate dynamic programming, Int. Conf. on Power System Tech., Nov. 6–8, Guangzhou, China, pp. 1935–1941, 2018.

[201] R. Zhang and B. Zeng, Ambulance deployment with relocation through robust optimization, IEEE Trans. on Automation Science and Engineering, vol. 16, pp. 138–147, 2019.

[202] D. Yazdani, T. T. Nguyen, and J. Branke, Robust optimization over time by learning problem space characteristics, IEEE Trans. on Evolutionary Computation, vol. 23, pp. 143–155, 2019.

[203] M. Boulekchour, N. Aouf, and M. Richardson, Robust L∞ convex optimization for monocular visual odometry trajectory estimation, Robotica, vol. 34, pp. 703–722, 2016.

[204] B. Gharesifard and J. Cortés, When does a digraph admit a doubly stochastic adjacency matrix, Proc. of Amer. Control Conf., Marriott Waterfront, Baltimore, MD, USA, June 30–July 2, pp. 2440–2445, 2010.

[205] R. A. Horn and C. R. Johnson, Matrix Analysis, Cambridge University Press, New York, 1985.

[206] C. Fischione, Fast-Lipschitz optimization with wireless sensor networks applications, IEEE Trans. on Automatic Control, vol. 56, pp. 2319–2331, 2011.

[207] M. Jakobsson and C. Fischione, Extensions of Fast-Lipschitz optimization for convex and nonconvex problems, Proc. of the 3rd IFAC Workshop on Distributed Estimation and Control in Networked Systems, Sep. 14–15, Santa Barbara, California, USA, pp. 162–167, 2012.

[208] M. Jakobsson, S. Magnusson, C. Fischione, and P. C. Weeraddana, Extensions of Fast-Lipschitz optimization, IEEE Trans. on Automatic Control, vol. 61, pp. 861–876, 2016.

[209] H. Iiduka, Acceleration method for convex optimization over the fixed point set of a nonexpansive mapping, Math. Program., Ser. A, vol. 149, pp. 131–165, 2015.

[210] H. Iiduka, Convex optimization over fixed point sets of quasi-nonexpansive and nonexpansive mappings in utility-based bandwidth allocation problems with operational constraints, J. Comput. Appl. Math., vol. 282, pp. 225–236, 2015.

[211] H. Iiduka, Fixed point optimization algorithm and its applications to network bandwidth allocation, J. Comput. Appl. Math., vol. 236, pp. 1733–1742, 2012.

[212] K. Slavakis, I. Yamada, and K. Sakaniwa, Computation of symmetric positive definite Toeplitz matrices by the hybrid steepest descent method, Signal Processing, vol. 83, pp. 1135–1140, 2003.

[213] K. Slavakis and I. Yamada, Robust wideband beamforming by the hybrid steepest descent method, IEEE Trans. on Signal Processing, vol. 55, pp. 4511–4532, 2007.

[214] I. Yamada, N. Ogura, and N. Shirakawa, A numerically robust hybrid steepest descent method for the convexly constrained generalized inverse problems, Contemp. Math., vol. 313, pp. 269–305, 2002.

[215] R. Durrett, Probability: Theory and Examples, New York: Cambridge Univ. Press, 2010, fourth edition.

[216] H. K. Xu, Iterative algorithms for nonlinear operators, J. London Math. Soc., vol. 66, pp. 240–256, 2002.

[217] G. B. Folland, Real Analysis: Modern Techniques and Their Applications, John Wiley & Sons, Inc., USA, 1984.

[218] P. Tseng, On the convergence of the products of firmly nonexpansive mappings, SIAM J. Optimization, vol. 2, no. 3, pp. 425–434, 1992.

[219] I. Cioranescu, Geometry of Banach Spaces, Duality Mappings and Nonlinear Problems, Kluwer Academic Publishers: The Netherlands, 1990.

[220] H. H. Bauschke and J. M. Borwein, On projection algorithms for solving convex feasibility problems, SIAM Review, vol. 38, pp. 367–426, 1996.

[221] H. H. Bauschke and P. L. Combettes, Convex Analysis and Monotone Operator Theory in Hilbert Spaces, Springer: New York, 2011.

[222] S. Agmon, The relaxation method for linear inequalities, Canadian Journal of Mathematics, vol. 6, pp. 382–392, 1954.

[223] T. S. Motzkin and I. J. Schoenberg, The relaxation method for linear inequalities, Canadian Journal of Mathematics, vol. 6, pp. 393–404, 1954.

[224] I. M. Terfaloaga, The convex feasibility problem and Mann-type iteration, Analele Universităţii ”Eftimie Murgu” Reşiţa, vol. 24, pp. 402–410, 2017.

[225] O. Sluc̆iak and M. Rupp, Consensus algorithm with state-dependent weights, IEEE Trans. on Signal Processing, vol. 64, pp. 1972–1985, 2016.

[226] A. Bogojeska, M. Mirchev, I. Mishkovski, and L. Kocarev, Synchronization and consensus in state-dependent networks, IEEE Trans. on Circuits and Systems-I: Regular Papers, vol. 61, pp. 522–529, 2014.

[227] A. Awad, A. Chapman, E. Schoof, A. Narang-Siddarth, and M. Mesbahi, Time-scale separation on networks: consensus, tracking, and state-dependent interactions, IEEE 54th Annual Conf. on Decision and Control, Dec. 15–18, Osaka, Japan, pp. 6172–6177, 2015.

[228] S. L. Campbell and N. J. Rose, Singular perturbation of autonomous linear systems, SIAM J. Math. Anal., vol. 10, pp. 542–551, 1979.

[229] D. S. Bernstein and S. P. Bhat, Lyapunov stability, semistability, and asymptotic stability of matrix second-order systems, ASME Trans. J. Vibr. Acoustics, vol. 117, pp. 145–153, 1995.

[230] S. P. Bhat and D. S. Bernstein, Nontangency-based Lyapunov tests for convergence and stability in systems having a continuum of equilibria, SIAM J. Control Optim., vol. 42, pp. 1745–1775, 2003.

[231] S. P. Bhat and D. S. Bernstein, Arc-length-based Lyapunov test for convergence and stability in systems having a continuum of equilibria, Proc. of the Amer. Cont. Conf., June 4–6, Denver, CO, USA, pp. 2961–2966, 2003.

[232] Q. Hui, W. M. Haddad, and S. P. Bhat, Finite-time semistability and consensus for nonlinear dynamical networks, IEEE Trans. on Automatic Control, vol. 53, pp. 1887–1900, 2008.

[233] Q. Hui, W. M. Haddad, and S. P. Bhat, On robust control algorithms for nonlinear network consensus protocols, Int. J. Robu. Nonlin. Cont., vol. 20, pp. 269–284, 2010.

[234] Q. Hui, W. M. Haddad, and S. P. Bhat, Lyapunov and converse Lyapunov theorems for semistability, Proc. of IEEE Conf. on Decision and Control, Dec. 12–14, New Orleans, LA, USA, pp. 5870–5874, 2007.

[235] Q. Hui, Convergence and stability analysis for iterative dynamics with application in balanced resource allocation: a trajectory distance Lyapunov approach, Proc. of IEEE Conf. on Decision and Control and European Control Conf., Dec. 12–15, Orlando, FL, USA, pp. 7218–7223, 2011.

[236] Q. Hui and W. M. Haddad, Distributed nonlinear control algorithms for network consensus, Automatica, vol. 44, pp. 2375–2381, 2008.

[237] Q. Hui, Lyapunov-based semistability analysis for discrete-time switched network systems, Proc. of Amer. Cont. Conf. on O’Farrell Street, June 29–July 01, San Francisco, CA, USA, pp. 2602–2606, 2011.

[238] J. Shen, J. Hu, and Q. Hui, Semistability of switched linear systems with application to PageRank algorithms, European J. of Control, vol. 20, pp. 132–140, 2014.

[239] Q. Hui and H. Zhang, Optimal balanced coordinated network resource allocation using swarm optimization, IEEE Trans. on Systems, Man, and Cybernetics: Systems, vol. 45, pp. 770–787, 2015.

[240] J. Shen, J. Hu, and Q. Hui, Semistability of switched linear systems with applications to distributed sensor networks: A generating function approach, Proc. of 50th IEEE Conf. on Decision and Control and European Control Conf., Dec. 12–15, Orlando, FL, USA, pp. 8044–8049, 2011.

[241] Q. Hui, Convergence and stability analysis for iterative dynamics with application to compartmental networks: A trajectory distance based Lyapunov approach, J. of The Franklin Institute, vol. 350, pp. 679–697, 2013.

[242] J. Zhou and Q. Wang, Stochastic semistability with application to agreement problems over random networks, Proc. of Amer. Cont. Conf., Marriott Waterfront, Baltimore, MD, USA, June 30–July 02, pp. 568–573, 2010.

[243] T. Rajpurohit and W. M. Haddad, Stochastic thermodynamics: a dynamical systems approach, Entropy, vol. 19, pp. 1–48, 2017.

[244] T. Rajpurohit and W. M. Haddad, Lyapunov and converse Lyapunov theorems for stochastic semistability, Systems & Control Letters, vol. 97, pp. 83–90, 2016.

[245] H. J. Kushner, Introduction to Stochastic Control, Holt, Rinehart and Winston: New York, 1971.

[246] K. Liu, Stability of Infinite Dimensional Stochastic Differential Equations with Applications, vol. 135 of Monographs and Surveys in Pure and Applied Mathematics, Chapman & Hall/CRC, 2006.

[247] S. Sh. Alaviani and N. Elia, Distributed multi-agent convex optimization over random digraphs, Proceedings of American Control Conference, May 24–26, Sheraton Seattle Hotel, Seattle, USA, pp. 5288–5293, 2017.

[248] S. Sh. Alaviani and N. Elia, Distributed multi-agent convex optimization over random digraphs, IEEE Transactions on Automatic Control, to appear.

[249] S. Sh. Alaviani and N. Elia, Distributed convex optimization with state-dependent interactions, Proceedings of 25th Mediterranean Conference on Control and Automation, July 3–6, Valletta, Malta, pp. 129–134, 2017.

[250] S. Sh. Alaviani and N. Elia, A distributed algorithm for solving linear algebraic equations over random networks, Proceedings of IEEE Conference on Decision and Control, Dec. 17–19, Miami Beach, FL, USA, pp. 83–88, 2018.

[251] S. Sh. Alaviani and N. Elia, Distributed average consensus over random networks, Proceedings of American Control Conference, July 10–12, Philadelphia, USA, 2019, to appear.