
Neural Optimization Machine: A Neural Network Approach for

Optimization

Jie Chen a, Yongming Liu b,*

a Department of Mechanical Engineering, Northwestern University, Evanston, IL 60208, USA, E-mail: [email protected]
b School for Engineering of Matter, Transport, and Energy, Arizona State University, Tempe, AZ 85287, USA, E-mail: [email protected]

Abstract: A novel neural network (NN) approach is proposed for constrained optimization.
The proposed method uses a specially designed NN architecture and training/optimization
procedure called Neural Optimization Machine (NOM). The objective functions for the NOM
are approximated with NN models. The optimization process is conducted by the neural
network's built-in backpropagation algorithm. The NOM solves optimization problems by
extending the architecture of the NN objective function model. This is achieved by
appropriately designing the NOM's structure, activation function, and loss function. The NN
objective function can have arbitrary architectures and activation functions. The application of
the NOM is not limited to specific optimization problems, e.g., linear and quadratic
programming. It is shown that increasing the dimension of the design variables does not significantly
increase the computational cost. Then, the NOM is extended for multiobjective
optimization. Finally, the NOM is tested using numerical optimization problems and applied
for the optimal design of processing parameters in additive manufacturing.

Keywords: Neural networks, Constrained optimization, Multiobjective optimization, Fatigue, Additive manufacturing

1. Introduction
Optimization is used in various scientific and engineering applications. Examples include
function approximation and regression analysis, optimal control, system planning, signal
processing, and mechanical design (Effati & Nazemi, 2006). One possible and promising method
for optimization is utilizing neural networks (NNs) due to their inherent massive parallelism
(Effati & Nazemi, 2006; Lopez-Garcia, Coronado-Mendoza, & Domínguez-Navarro, 2020),
the ability to deal with time-varying parameters (Xia, Feng, & Wang, 2008), the robustness of
the computation (X.-S. Zhang, 2013), and a large community for rapid advancement in recent
years (Chen, Gao, & Liu, 2022; C. Wu, Wang, & Kim, 2022).
Tank and Hopfield (Tank & Hopfield, 1986) proposed several neural optimization
networks by applying circuit theory in optimization using neural networks (X.-S. Zhang, 2013).
Dhingra and Rao (Dhingra & Rao, 1992) adapted Hopfield’s neural network to solve nonlinear
programming problems with/without constraints. The solutions from the neural network
approach agree well with those calculated by gradient-based search techniques. Wu et al. (A.
Wu & Tam, 1999) presented a neural network method for quadratic programming problems by
applying the Lagrange multiplier theory. The solutions satisfy the necessary conditions for
optimality. The connections in neural networks are designed according to the optimization
problems. Tagliarini et al. (Tagliarini, Christ, & Page, 1991) presented a rule for neural network
design for optimization problems. Time evolution equations are constructed with equality and
inequality constraints. Effati et al. (Effati & Nazemi, 2006) solved linear and quadratic
programming problems using recurrent neural networks. The NN approach establishes an
energy function and a dynamic system. The solution can be found when the dynamic system
approaches its static state. Xia et al. (Xia, et al., 2008) developed a recurrent neural network
approach for optimization under nonlinear inequality constraints. That model was developed
for convex optimization problems, and is suitable for only a specific category of nonconvex
optimization problems.
Another category of applying neural networks in optimization is to use the NN as a
surrogate model. Neelakantan et al. (Neelakantan & Pundarikanthan, 2000) trained a NN to
approximate the simulation model and applied nonlinear programming methods to find near-
optimal policies. Nascimento et al. (C. A. O. Nascimento, Giudici, & Guardani, 2000) used the
NN to replace the equations of optimization problems. First, a grid search is carried out in the
region of interest. Then, the solutions violating the constraints are excluded. This method
allows one to identify multiple optima. Darvishvand et al. (Darvishvand, Kamkari, & Kowsary,
2018) trained NNs to construct the objective functions and then used Genetic Algorithm to find
the optimal design variables. Villarrubia et al. (Villarrubia, De Paz, Chamoso, & la Prieta, 2018)
used NNs to solve problems when the linear programming or Lagrange multiplier is not
applicable. First, the NN is used for objective function approximation. Then Lagrange method
is used to solve optimization problems. Jeon et al. (Jeon, Lee, & Choi, 2019) proposed a double-
loop process for training and optimizing the NN objective function. The outer process is to
optimize the NN weights. The inner process aims to optimize input variables while fixing the
NN weights. Chandrasekhar et al. (Chandrasekhar & Suresh, 2021) used the NN as the density
function for topology optimization. The inputs are location coordinates, and the outputs are
density values. The density field is optimized by relying on the NN’s backpropagation and a
finite element solver.
This paper proposes a neural network approach for optimizing neural network surrogate
models with and without constraints. The method is called Neural Optimization Machine
(NOM). The NOM has the following features. The objective functions for the NOM are NN
models. The optimization process is conducted by the neural network’s built-in
backpropagation algorithm. The NOM solves optimization problems by extending the
architecture of the NN objective function model. This is achieved by appropriately designing
the NOM’s structure, activation function, and loss function. The proposed NOM has the
following benefits. First, the NOM is very flexible, and the NN objective function can have
arbitrary architectures and activation functions. Second, the NOM is not limited to specific
optimization problems, e.g., linear and quadratic programming. Third, multiple local minima
can be found, which provides the potential for finding the global minimum. Fourth, compared
with current heuristic optimization techniques, e.g., Particle Swarm Optimization and Genetic
Algorithm, increasing the dimension of the design variables does not significantly increase the
computational cost. Fifth, the NOM can be easily extended and adapted for multiobjective
optimization.
The remainder of the paper is organized as follows. First, a brief introduction to neural
networks is provided. Following this, the methodology of the Neural Optimization Machine is
presented. The construction of the NOM for unconstrained and constrained optimization is
described. The NOM is then extended for multiobjective optimization. Next, the NOM is tested
using numerical optimization problems and applied for the design in additive manufacturing.
Next, a discussion is given regarding the NOM finding multiple local minima. Finally, the
conclusions are drawn according to the current study.

2. Neural Optimization Machine

2.1 A brief introduction to neural networks


Fig. 1 shows an illustrative example for explaining the main concepts of neural networks
(Chen & Liu, 2020a, 2021a, 2021b; Laddach, Łangowski, Rutkowski, & Puchalski, 2022).
There are three layers in this example. The first and last layers are the Input and Output Layers,
respectively. The layer in between is the Hidden Layer. The nodes in each layer are called neurons.
There are two neurons in the first layer and one neuron in the last layer in this example. Thus, it is a
two-dimensional problem with one output. The neural network in this example is called a feedforward
neural network, since information flows layer by layer from the Input Layer to the Output Layer.
The mathematical calculation in each neuron can be expressed as follows (Sattarifar &
Nestorović, 2022):
$$z_k^{(l)} = b_k^{(l-1)} + \sum_{j=1}^{p_{l-1}} w_{kj}^{(l-1)} a_j^{(l-1)}, \quad l = 2, \ldots, L, \qquad (1)$$

$$a_k^{(l)} = G_k^{(l)}\left(z_k^{(l)}\right), \quad k = 1, 2, \ldots, p_l, \qquad (2)$$

where l is the layer index; l is 1 for the Input Layer and L for the Output Layer. The number of
neurons at the lth layer is p_l. At the first layer (input layer),
$$a_k^{(1)} = x_k, \quad k = 1, \ldots, p_1. \qquad (3)$$

The kth neuron at layer l has activation function $G_k^{(l)}(\cdot)$. In neural networks, the coefficients

and intercepts in Eq. (1) are named weights and biases, respectively. During the training of the
neural network, weights and biases are updated while minimizing the loss function. The loss
function is the objective function in the training process (Ketkar, 2017; R. G. Nascimento,
Fricke, & Viana, 2020). A common optimization technique in neural networks is stochastic
gradient descent. To alleviate the computational burden, training the neural network is carried
out through mini-batches of the total training data. In each epoch, the training data are used in
a mini-batch manner until all the data are used. A number of epochs are needed to train a neural
network.
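A minimal sketch of such a network and its mini-batch training, assuming TensorFlow/Keras (the data, architecture sizes, and hyperparameter values below are placeholders, not the paper's settings), is:

```python
# A minimal sketch (assuming TensorFlow/Keras) of the Fig. 1-style network trained
# with mini-batch stochastic gradient descent; data and settings are illustrative only.
import numpy as np
import tensorflow as tf

# Synthetic training data: 2-D inputs x and scalar targets y (placeholders).
x = np.random.rand(1000, 2)
y = np.sum(x ** 2, axis=1, keepdims=True)

model = tf.keras.Sequential([
    tf.keras.layers.Dense(5, activation="tanh", input_shape=(2,)),  # hidden layer, Eqs. (1)-(2)
    tf.keras.layers.Dense(1),                                       # output layer
])
model.compile(optimizer=tf.keras.optimizers.SGD(learning_rate=0.01), loss="mse")
model.fit(x, y, epochs=50, batch_size=10, verbose=0)  # mini-batches within each epoch
```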

Fig. 1 An illustrative example of a single hidden layer neural network.

2.2 Neural Optimization Machine (NOM)


Suppose a neural network model NN (X) has been trained. The NN (X) is then used as the
objective function in optimization. The goal is to find the set of inputs X that minimizes the
output of NN(X) under inequality and equality constraints. The problem of constrained
optimization is formulated as
$$\begin{aligned}
\text{Minimize} \quad & f(X) = NN(X) \\
\text{subject to} \quad & g_p(X) \le 0, \quad p = 1, \ldots, P \\
& h_q(X) = 0, \quad q = 1, \ldots, Q
\end{aligned} \qquad (4)$$
where X is the input vector, NN (X) is the NN model used as the objective function, and g and
h are inequality and equality constraints, respectively. A simple neural network model NN (X)
in Fig. 2 is used as the NN objective function for illustration, which is the same as the
architecture in Fig. 1.

Fig. 2 A simple neural network.


The key idea of the developed Neural Optimization Machine (NOM) is to solve the
optimization problem in Eq. (4) using the neural networks’ built-in backpropagation algorithm
by properly designing the NN architecture. On the one hand, in the backpropagation algorithm,
the basic method is stochastic gradient descent. It computes the gradients of the loss function
with respect to the weights and biases. The weights and biases are then updated by the gradient

information (Z. Zhang, Li, & Lu, 2021). At the end of the training, a set of local optimal weights
and biases are obtained. On the other hand, the gradient descent method can also be used to
solve the optimization problem in Eq. (4). It requires computing the gradient of the outputs of
the NN objective function with respect to its inputs. Then, the question is, can we transform
the problem of calculating the gradient of NN outputs with respect to inputs to the problem of
calculating the gradient of the NN loss function with respect to weights and biases? If this can
be achieved, another question is how to consider the constraints in Eq. (4). The NOM is
developed to solve those two issues.

2.2.1 NOM for unconstrained optimization


To illustrate the basic components of the Neural Optimization Machine, we will first
consider the unconstrained optimization problem, i.e., the problem without the constraints in
Eq. (4). The NOM architecture in Fig. 3 is designed to answer the first question, that is,
transforming the problem of calculating the gradient of NN outputs with respect to inputs to
the problem of calculating the gradient of the NN loss function with respect to weights and
biases. The subpart of the NOM architecture shown in the dashed line box in Fig. 3 is the
trained NN objective function to be optimized, as shown in Fig. 2. It is called NN objective
function to differentiate it from the NOM. A new layer, called starting point layer, is added
before the input layer of the NN objective function. There are two starting point neurons in this
layer for this example, as shown in grey. Each neuron in the starting point layer is connected
to one of the neurons in the input layer. No activation functions are applied in this layer. The
values input to the NN objective function are controlled by the starting points and weights and
biases between the starting point layer and the input layer. Following the formulation in Eqs.
(1) - (3), the inputs to the NN objective function are calculated by
$$z_k^{(1)} = b_k^{(s)} + w_{kk}^{(s)} a_k^{(s)}, \quad k = 1, 2, \ldots, p_1, \qquad (5)$$
$$x_k = a_k^{(1)} = z_k^{(1)}, \qquad (6)$$
where the superscript (s) and (1) indicate the starting point layer of the NOM and the input
layer of the NN objective function, respectively, and p1 is the input dimension of the NN
objective function. The optimal xk, k = 1, 2,…, p1, are desired to minimize the NN objective
function. They are obtained from Eq. (6) by training the NOM.
To achieve this goal, we customize the loss function of NOM and constrain the NN weights
and biases. The loss function of the NOM is the output of the NOM:
$$\text{Loss(NOM)} = \text{NOM output} = NN(X). \qquad (7)$$
In this case, the output of the NOM is the same as the output of the NN objective function. That
is, the NOM is trained to minimize the output of the NN objective function. In Fig. 3, different
colors of the connections between neurons mean whether the weights and biases are fixed
(orange) or trainable (blue). The weights and biases are fixed for the NN objective function in
the dashed line box and are trainable for the connection between the starting point layer and
the input layer. This is because when training the NOM, the original NN model (i.e., NN
objective function) should be kept unchanged while the weights and biases between the starting
point layer and the input layer are updated to find the optimal solution to minimize the NOM.
Fig. 3 Neural Optimization Machine for unconstrained optimization.

Only one training data point is used to train the NOM. The data point is referred to as the
starting point of the stochastic gradient descent algorithm in optimizing the NOM. This data
can also be interpreted as the training data for the input of the NOM (i.e., the starting point is
the input to the starting point layer of the NOM). There is no training data for the output of
NOM since the NOM is not trained in a supervised way.
A general drawback of NN training is the convergence to local minima. As a result, the
NOM will produce a locally optimal solution. The following two operations are adopted to
mitigate this issue and possibly find the global minimum. First, a grid search is conducted on
the region of interest to generate multiple good starting points. The constraints can then be
evaluated if needed to exclude the points with constraint violations (C. A. O. Nascimento, et
al., 2000). Next, each good starting point is used as the training data of the NOM to find the
corresponding optimal solution to the NN objective function. Second, before training the NOM,
random initial values are assigned to the weights between the starting point layer and the input
layer. This is a default feature of the optimizer in training the NN. This operation adds some
randomness to the initial search directions. The effects of the above operations are further
discussed in Section 4.
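A minimal sketch of the grid-search step, assuming NumPy and the trained Keras objective nn_obj from the sketch above (the 2-D box and the illustrative constraint used for the feasibility check are placeholders, not the paper's examples), is:

```python
# A minimal sketch (assuming NumPy and a trained Keras objective `nn_obj`) of the
# grid search used to pick good starting points: evaluate the NN objective on a grid,
# drop points that violate the constraints, and keep the few best.
import numpy as np

g1, g2 = np.meshgrid(np.linspace(0.0, 1.0, 100), np.linspace(0.0, 1.0, 100))
grid = np.column_stack([g1.ravel(), g2.ravel()])              # evenly spaced candidate points

# Hypothetical feasibility check g(X) = x1 + x2 - 1 <= 0 (placeholder constraint).
feasible = grid[grid[:, 0] + grid[:, 1] - 1.0 <= 0.0]
values = nn_obj.predict(feasible, verbose=0).ravel()          # NN objective at feasible points
starting_points = feasible[np.argsort(values)[:5]]            # 5 best points, one NOM run each
```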

2.2.2 NOM for constrained optimization


The above description of the NOM shown in Fig. 3 is for unconstrained optimization. On
the basis of the NOM for unconstrained optimization, the NOM is further developed and shown
in Fig. 4 for answering the second question, i.e., how to consider the constraints. Compared
with the NOM in Fig. 3, a constraint layer is added. The neurons in this layer are called
constraint neurons, as shown in green. The number of constraint neurons is equal to the number
of constraints. Following Eqs. (1) and (2), the calculation from the input layer to the constraint
layer can be expressed as
$$z_k^{(c)} = g_k(X), \quad k = 1, \ldots, P, \qquad (8)$$
for inequality constraints, and
$$z_k^{(c)} = h_k(X), \quad k = P+1, \ldots, P+Q, \qquad (9)$$
for equality constraints, and then
$$a_k^{(c)} = G_k^{(c)}\left(z_k^{(c)}\right), \quad k = 1, 2, \ldots, P+Q, \qquad (10)$$

where the superscript (c) indicates the constraint layer. The adopted activation function $G^{(c)}$ for
the constraint layer is modified from the Rectified Linear Unit (ReLU) activation function, which is
expressed as
$$\mathrm{ReLU}(z) = \max(0, z). \qquad (11)$$
For the inequality constraints shown in problem (4), the activation function used for the constraint
neuron is
$$G^{(c)}(z) = c \cdot \mathrm{ReLU}(z), \qquad (12)$$
and for the equality constraints,
$$G^{(c)}(z) = c \cdot \left[\mathrm{ReLU}(-z) + \mathrm{ReLU}(z)\right], \qquad (13)$$
where c is a large number serving as the penalty parameter. The ReLU function (Eq. (11)) and the
two modified ReLU activation functions (Eqs. (12) and (13)) with c = 10 are plotted in Fig. 5. The
physical meaning of the above activation functions is that when the constraints are satisfied, the
outputs of the constraint neurons are zero, and when the constraints are violated, the outputs of the
constraint neurons are large values.
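A minimal sketch of these two penalty activations, assuming TensorFlow, is:

```python
# A minimal sketch (assuming TensorFlow) of the modified ReLU activations in
# Eqs. (12) and (13); c is the penalty parameter.
import tensorflow as tf

def ineq_penalty(z, c=10.0):
    # zero when the inequality constraint g(X) <= 0 holds, c * g(X) when violated (Eq. (12))
    return c * tf.nn.relu(z)

def eq_penalty(z, c=10.0):
    # zero only when the equality constraint h(X) = 0 holds, c * |h(X)| otherwise (Eq. (13))
    return c * (tf.nn.relu(-z) + tf.nn.relu(z))
```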
Another difference between the NOM architectures in Fig. 3 and Fig. 4 is the output of the
NOM. In Fig. 3, the output of the NOM is the same as the output of the NN objective function.
In Fig. 4, the NOM output is the sum of the NN objective function NN(X) and the outputs of
all constraint neurons. The loss function of the NOM for constrained optimization is still the
output of the NOM, i.e.,
$$\text{Loss(NOM)} = \text{NOM output} = NN(X) + \text{outputs of constraint neurons}. \qquad (14)$$
The meaning of Eq. (14) is that when the constraints are violated (i.e., the inputs of the NN
objective function are infeasible), a large penalty value is added to the loss function. The NOM
is trained to minimize the loss function Eq. (14), which brings the inputs to the NN objective
function to the feasible domain by adjusting the weights and biases between the starting point
layer and the input layer.
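A minimal sketch of wiring a constraint neuron and the penalized output of Eq. (14), reusing s_in, x, nn_obj, and ineq_penalty from the sketches above and a purely illustrative inequality constraint g(X) = x1 + x2 − 1 ≤ 0 (not one of the paper's problems), is:

```python
# A minimal sketch (reusing `s_in`, `x`, `nn_obj`, and `ineq_penalty` from the earlier
# sketches; the constraint g(X) = x1 + x2 - 1 <= 0 is hypothetical) of the penalized
# NOM output in Eq. (14).
import tensorflow as tf

g_neuron = tf.keras.layers.Lambda(
    lambda t: ineq_penalty(t[:, 0:1] + t[:, 1:2] - 1.0))(x)   # constraint neuron, Eq. (12)
penalized = tf.keras.layers.Add()([nn_obj(x), g_neuron])      # NN(X) + penalties, Eq. (14)

nom_c = tf.keras.Model(s_in, penalized)
nom_c.compile(optimizer=tf.keras.optimizers.Adam(0.01),
              loss=lambda y_true, y_pred: tf.reduce_mean(y_pred))
```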

Fig. 4 Neural Optimization Machine for constrained optimization.

(a) ReLU. (b) Modified ReLU for inequality constraint neurons. (c) Modified ReLU for equality constraint neurons.
Fig. 5 ReLU and modified ReLU activation functions.

2.3 NOM for multiobjective optimization


The proposed Neural Optimization Machine is a flexible tool. It can be developed to solve
other types of optimization problems involving NNs as the objective functions. This section
shows how the NOM can be developed for multiobjective optimization for NNs.
The multiobjective optimization problem is formulated as

$$\begin{aligned}
\text{Minimize} \quad & F(X) = \left[f_1(X), f_2(X), \ldots, f_M(X)\right] \\
& \phantom{F(X)} = \left[NN_1(X), NN_2(X), \ldots, NN_M(X)\right] \\
\text{subject to} \quad & g_p(X) \le 0, \quad p = 1, \ldots, P \\
& h_q(X) = 0, \quad q = 1, \ldots, Q
\end{aligned} \qquad (15)$$
where the M objective functions are M NN models. Usually, there is no X that can minimize
all objective functions simultaneously (Rao, 2019). Infinitely many solutions can be obtained, each
representing a different trade-off among the objectives. Those solutions are called Pareto optimum
solutions (Mack, Goel, Shyy, & Haftka, 2007; Martínez-Iranzo, Herrero, Sanchis, Blasco, &
García-Nieto, 2009; Sundaram, 2022). A feasible solution X is Pareto optimal if there is no other
feasible solution Y such that fi(Y) ≤ fi(X) for all objectives (i = 1, 2, ..., M) and fj(Y) < fj(X) for at
least one objective function (Rao, 2019). The goal of multiobjective optimization is to find the
Pareto optimum solution set.
Among several methods developed for multiobjective optimization, the bounded objective
function method is used, which is suitable for the proposed Neural Optimization Machine.
Let the maximum and minimum acceptable values for objective function fi be U(i) and
L(i) (i = 1, 2, …, M), respectively. We minimize the most important (say, the rth) objective
function (Rao, 2019):

$$\begin{aligned}
\text{Minimize} \quad & f_r(X) = NN_r(X) \\
\text{subject to} \quad & g_p(X) \le 0, \quad p = 1, \ldots, P \\
& h_q(X) = 0, \quad q = 1, \ldots, Q \\
& L^{(i)} \le f_i(X) = NN_i(X) \le U^{(i)}, \quad i = 1, \ldots, M,\; i \ne r
\end{aligned} \qquad (16)$$

L(i) can be discarded if the goal is not to achieve a solution falling within a range of values for each
objective. In that case, U(i) can be varied systematically to produce the Pareto optimal solution set
(Arora, 2004). That approach is adopted in this paper. A general guideline for selecting U(i) is
(Carmichael, 1980)

$$f_i\left(X_i^*\right) \le U^{(i)} \le f_r\left(X_i^*\right). \qquad (17)$$
More discussion on selecting U(i) can be found in (Cohon, 2004; Stadler, 1988).
A numerical example with two NN objective functions is used to illustrate the proposed
NOM in multiobjective optimization using the bounded objective function method. The NOM
for multiobjective optimization is shown in Fig. 6. The subparts of the NOM in the two dashed
line boxes are the two NN objective functions, NN1 and NN2, respectively. The architecture of
each NN objective function is shown in Fig. 2. Compared with the NOM in Fig. 4, another NN
objective function is added in Fig. 6. Both NN objective functions share the same input layer
in this example. According to the bounded objective function method, NN1 is used as the
objective function, and NN2 is used as the constraint. The output neuron of NN2 is replaced by
the constraint neuron. The upper bound U(2) is varied according to Eq. (17). For each U(2), a
Pareto optimal solution is obtained using the NOM; the set of Pareto optimal solutions is obtained
by varying U(2).
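A minimal sketch of the bounded objective function sweep, assuming a hypothetical helper solve_nom(u2) that builds and trains the constrained NOM of Fig. 6 with NN2(X) ≤ u2 and returns the optimal X (the bound values below are placeholders for f2(X2*) and f1(X2*) in Eq. (17)), is:

```python
# A minimal sketch of sweeping U^(2) to trace the Pareto front; `solve_nom` and the
# numeric bounds are hypothetical placeholders.
import numpy as np

lower, upper = 0.5, 1.6                       # placeholders for f_2(X_2*) and f_1(X_2*), Eq. (17)
pareto_set = []
for u2 in np.linspace(lower, upper, 20):      # vary U^(2) between the Eq. (17) bounds
    pareto_set.append(solve_nom(u2))          # one Pareto optimal solution per U^(2)
```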
Fig. 6 Neural Optimization Machine for multiobjective optimization.

3. Experiments and applications
In this section, the proposed Neural Optimization Machine (NOM) is tested for numerical
problems of constrained optimization and multiobjective optimization. Following that, the
NOM is applied to design processing parameters in additive manufacturing.

3.1 Constrained optimization problems


In this subsection, three optimization problems are analyzed. The results from the NOM
are compared with those from Nelder Mead, Genetic Algorithm, Differential Evolution, and
Particle Swarm Optimization.
The data used to train the neural network (NN) objective function are generated inside the
domain defined by the constraints on the individual variables. 10,000 data points are generated evenly for
each problem. The hyperparameters for training the NN objective functions and the NOM are
identical for all three problems and are listed in Table 1 and Table 2, respectively.

Table 1 The hyperparameters for training the NN objective functions.


Number of hidden layers 1
Number of hidden neurons 20
The activation function of hidden layers Hyperbolic tangent (tanh)
Epochs 500
Minibatch size 10
Learning rate 0.01

Table 2 The hyperparameters for the NOM.


Number of starting points 5
Penalty parameters 10
Epochs 2000
Minibatch size 1
Learning rate 0.01

The first problem (Problem 1) has a cubic objective function with quadratic constraints:

$$\begin{aligned}
\text{Minimize} \quad & f(x_1, x_2) = \frac{1}{1000}\left[(x_1 - 10)^3 + (x_2 - 10)^3\right] \\
\text{subject to} \quad & -(x_1 - 5)^2 - (x_2 - 5)^2 + 100 \le 0 \\
& (x_1 - 6)^2 - (x_2 - 5)^2 - 100 \le 0 \\
& 13 \le x_1 \le 20 \\
& 0 \le x_2 \le 20
\end{aligned} \qquad (18)$$
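For concreteness, a plain NumPy sketch of Problem 1's objective and constraints, together with an evenly spaced sample over the variable bounds of the kind used to train the NN objective function (the 100 × 100 grid size is an assumption chosen only to match the stated 10,000 points), is:

```python
# A plain-Python/NumPy sketch of Problem 1 (Eq. (18)); the grid size is an assumption.
import numpy as np

def f(x1, x2):
    return ((x1 - 10.0) ** 3 + (x2 - 10.0) ** 3) / 1000.0

def feasible(x1, x2):
    g1 = -(x1 - 5.0) ** 2 - (x2 - 5.0) ** 2 + 100.0   # g1(X) <= 0
    g2 = (x1 - 6.0) ** 2 - (x2 - 5.0) ** 2 - 100.0    # g2(X) <= 0
    return (g1 <= 0.0) & (g2 <= 0.0)

x1, x2 = np.meshgrid(np.linspace(13.0, 20.0, 100), np.linspace(0.0, 20.0, 100))
X = np.column_stack([x1.ravel(), x2.ravel()])          # 10,000 evenly spaced samples
y = f(X[:, 0], X[:, 1])                                # targets for training the NN objective
```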
The second problem (Problem 2) has a trigonometric objective function with linear and
quadratic constraints:
$$\begin{aligned}
\text{Minimize} \quad & f(x_1, x_2) = -10\cos(x_1 x_2) + x_1 x_2/10 + 10(x_1 + x_2)\sin(x_1 + x_2) \\
\text{subject to} \quad & x_1 - x_2 \ge 0.5 \\
& x_1 x_2 \le 15 \\
& 0 \le x_1 \le 1.5 \\
& -1 \le x_2 \le 1
\end{aligned} \qquad (19)$$
The third problem (Problem 3) is to optimize a trigonometric objective function with
quadratic constraints:

$$\begin{aligned}
\text{Minimize} \quad & f(x_1, x_2) = \frac{\sin^3(2x_1)\sin(2x_2)}{x_1^3(x_1 + x_2)} \\
\text{subject to} \quad & x_1^2 - x_2 + 1 \le 0 \\
& (x_1 - 2)^2 - x_2 + 1 \le 0 \\
& 0.1 \le x_1 \le 1 \\
& 0.1 \le x_2 \le 7
\end{aligned} \qquad (20)$$
The plots of the three objective functions are shown in Fig. 7. There are zero, one, and two
local minima in the defined domain for Problems 1, 2, and 3, respectively. The results of the NOM
are shown in Table 3 and compared with those of Nelder Mead, Genetic Algorithm, Differential
Evolution, and Particle Swarm Optimization. The results from the NOM are almost identical to
those of the other optimization algorithms. The running time is also shown in Table 3. The direct
search method (Nelder Mead) uses the shortest time. The NOM uses less computational time than
the heuristic algorithms (Genetic Algorithm, Differential Evolution, Particle Swarm Optimization).

Table 3 Results of different optimization algorithms.


                          Nelder     Genetic     Differential   Particle Swarm   Neural Optimization
                          Mead       Algorithm   Evolution      Optimization     Machine
Problem 1   x1            13.660     13.660      13.660         13.694           13.680
            x2             0.000      0.000       0.000          0.000            0.001
            f (x1, x2)    -7.950     -7.950      -7.950         -7.949           -7.948
            Time (s)       3.2       55.2        137.4           36.0             25.7
Problem 2   x1             0.258      0.257       0.256          0.256            0.257
            x2            -0.241     -0.242      -0.243         -0.243           -0.243
            f (x1, x2)    -9.983     -9.984      -9.984         -9.984           -9.984
            Time (s)       3.0       54.6        139.7           35.3             20.0
Problem 3   x1             0.100      0.100       0.100          0.100            0.101
            x2             5.463      5.464       5.464          5.464            5.464
            f (x1, x2)    -1.406     -1.406      -1.406         -1.406           -1.405
            Time (s)       3.0       57.5        143.5           34.8             21.8

(a) Problem 1.

(b) Problem 2.

(c) Problem 3.
Fig. 7 Plots of the objective functions.

3.2 Multiobjective optimization problem


The NOM is tested for multiobjective optimization following the methodology presented
in Section 2.3.
The multiobjective optimization problem is shown in Eq. (21). It is a problem with two
quadratic objective functions and linear constraints. The plots of the objective functions are
shown in Fig. 8.

$$\begin{aligned}
\text{Minimize} \quad & f_1 = (x_1 - 3)^2 + (x_2 - 7)^2 \\
& f_2 = (x_1 - 9)^2 + (x_2 - 8)^2 \\
\text{subject to} \quad & 70 - 4x_2 - 8x_1 \le 0 \\
& -2.5x_2 + 3x_1 \le 0 \\
& -6.8 + x_1 \le 0 \\
& 0 \le x_1 \le 10 \\
& 5 \le x_2 \le 15
\end{aligned} \qquad (21)$$

Fig. 8 Plots of the objective functions of the multiobjective optimization problem.

The architecture shown in Fig. 6 is used for this problem. According to the bounded
objective function method setup in Eq. (16), f1 is used as the single objective function, and f2
is used as one of the constraints. The optimal solution X2* of f2 is x1 = 6.784 and x2 = 8.309.
The values of f2 and f1 at X2*, i.e., f2 (X2*) and f1 (X2*), are 0.501 and 1.611, respectively.
According to Expression (17), f2 (X2*) and f1 (X2*) are used as the lower and upper limits for
varying the constraint of f2, U(2). The hyperparameters for training the NN objective functions
and the NOM are the same as those shown in Table 1 and Table 2. The results are shown in
Fig. 9. The red points are the results obtained using the NOM. The Non-dominated Sorting
Genetic Algorithm (NSGA-II), a multiobjective optimization technique, is also applied for this
problem. Its results are shown as black points in Fig. 9. As can be seen from Fig. 9, the results
obtained from both methods are almost identical.

Fig. 9 Pareto optimum solutions.

3.3 Design of processing parameters in additive manufacturing


The developed NOM is flexible and not restricted to specific NN architectures or activation
functions. In this section, the NOM is applied to an engineering design problem, i.e., the design
of processing parameters in additive manufacturing (AM) to optimize the fatigue life of
Ti-6Al-4V (Ti-64).
Ti-64 is a type of material extensively used in AM (Kumar & Ramamurty, 2020; Pegues,
et al., 2020; Sandgren, et al., 2016; Sharma, et al., 2021; Wycisk, Emmelmann, Siddique, &
Walther, 2013). Common applications of Ti-64 include turbine blades and compressors. In
those cases, components may experience fatigue failure due to cyclic loading (Chen & Liu,
2020b; Günther, et al., 2017). Therefore, the fatigue performance of AM Ti-64 is important in
ensuring structural safety (Chen & Liu, 2021b). The processing parameters influence the quality
of AM parts and hence their fatigue performance. The key factors are in-processing parameters
and post-processing parameters. This paper aims to optimize the fatigue life by designing the processing
parameters.
Chen and Liu (2021b) proposed a neural network approach for fatigue modeling of AM
Ti-64. The model is named Probabilistic Physics-guided Neural Network (PPgNN). That model
can be used to obtain probabilistic stress-life (P-S-N) relationships (Chen, Liu, Zhang, & Liu,
2020) under different processing parameters. The PPgNN model is used as the NN objective
function, as shown in the dashed line box in Fig. 10. The inputs of the PPgNN are fatigue
parameters, AM in-processing parameters, and AM post-processing parameters. Specifically, the
fatigue parameters are stress amplitude S and stress ratio R. Among all the AM in-processing
parameters, scanning velocity v, laser power P, hatch distance h, and layer thickness t are
considered. Heat-treatment temperature HT and heat-treatment time Ht are the considered AM
post-processing parameters. The outputs of the PPgNN are the statistics of the fatigue life, which
are the mean μ and standard deviation σ. The square neurons are designed for missing data problems. For a
comprehensive introduction of the PPgNN, refer to (Chen & Liu, 2021b).
The architecture of the NOM shown in Fig. 10 is built according to Section 2.2.2 for
constrained optimization. The loss function to be minimized by the NOM is
$$\text{Loss(NOM)} = -(\mu - 1.96\sigma) + \text{outputs of constraint neurons}, \qquad (22)$$
where μ – 1.96σ is the 2.5% lower bound of the fatigue life (Chen & Liu, 2020b). In other
words, this problem is to maximize the lower bound. The constraints of this problem are the
ranges of the input variables, and are listed next to the corresponding constraint neurons. The
hyperparameters are the same as in previous examples. The stress amplitude and stress ratio
are fixed to be 520 MPa and 0.1, respectively. The results of the NOM together with Nelder
Mead, Genetic Algorithm, Differential Evolution, and Particle Swarm Optimization are shown
in Table 4. The NOM achieves the same result of log10 (μ – 1.96σ) as Genetic Algorithm,
Differential Evolution, and Particle Swarm Optimization. Nelder Mead provides a worse result
in this case. This problem has more variables than the test examples shown in Section 3.1. As a
result, the computational cost increases for all the other optimization algorithms. However, the
computational cost of the NOM does not increase.
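A minimal sketch of how the Eq. (22) output might be assembled, assuming TensorFlow and interpreting each variable range in Fig. 10 as a pair of inequality-constraint penalties (the helper below is an interpretation, not code from the paper), is:

```python
# A minimal sketch (assuming TensorFlow) of the Eq. (22) NOM output: the negative
# 2.5% lower bound of the fatigue life plus range-constraint penalties; expressing
# each range as two ReLU penalties is an assumption of this sketch.
import tensorflow as tf

def range_penalty(x, lo, hi, c=10.0):
    # two inequality-constraint neurons per variable: c*ReLU(lo - x) + c*ReLU(x - hi)
    return c * (tf.nn.relu(lo - x) + tf.nn.relu(x - hi))

def nom_output(mu, sigma, v, P, h, t, HT, Ht):
    penalties = (range_penalty(v, 1000.0, 1250.0) + range_penalty(P, 170.0, 400.0) +
                 range_penalty(h, 50.0, 160.0) + range_penalty(t, 30.0, 60.0) +
                 range_penalty(HT, 650.0, 1200.0) + range_penalty(Ht, 1.0, 5.0))
    return -(mu - 1.96 * sigma) + penalties        # Eq. (22)
```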

1000 ≤ v ≤ 1250 mm/s
170 ≤ P ≤ 400 W
50 ≤ h ≤ 160 μm
30 ≤ t ≤ 60 μm
650 ≤ HT ≤ 1200 °C
1 ≤ Ht ≤ 5 h
Fig. 10 Neural Optimization Machine for the design of processing parameters in additive
manufacturing using Physics-guided Neural Network as the objective function.

Table 4 Results of different optimization algorithms for AM fatigue optimization.


                    Nelder     Genetic     Differential   Particle Swarm   Neural Optimization
                    Mead       Algorithm   Evolution      Optimization     Machine
v (mm/s)            1053.2     1067.2      1066.6         1064.1           1064.9
P (W)                177.8      170.0       170.2          170.0            171.6
h (μm)                86.3       90.2        88.9           90.6             90.8
t (μm)                59.8       59.9        59.9           59.9             59.9
HT (°C)              797.1      797.5       797.0          797.2            796.9
Ht (h)                 5.0        5.0         5.0            5.0              5.0
log10(μ – 1.96σ)       4.458      4.462       4.462          4.462            4.462
Time (s)              32.2      105.3       261.6          162.2             21.6

4. Discussion
This section discusses the impact of initial weights and starting points. As stated at the end
of Section 2.2.1, the NOM uses the strategy of random initial weights between the starting
point layer and the input layer and multiple starting points to explore the global minimum. The
impacts of these operations are discussed below.
During the training of the NOM, the weights and biases between the starting point layer
and the input layer are updated. This is achieved by calculating the derivatives of the loss
function of the NOM with respect to weights and biases according to the backpropagation
algorithm:
$$\mathrm{d\,Loss} / \mathrm{d}b^{(s)} = f_b\left(b^{(s)}, w^{(s)}, X_0\right), \qquad (23)$$
and
$$\mathrm{d\,Loss} / \mathrm{d}w^{(s)} = f_w\left(b^{(s)}, w^{(s)}, X_0\right), \qquad (24)$$

respectively, where Loss is the loss function of the NOM, w(s) and b(s) are the weights and
biases between the starting point layer and the input layer, respectively, and X0 is the training
data as well as the starting point of the NOM. Eqs. (23) and (24) show that the derivatives are
functions of w(s) and b(s) at the previous step and of X0. If the initial w(s) and b(s) are fixed to be
one and zero, respectively, the derivatives in Eqs. (23) and (24) at the first step are functions of the
starting point only. In this way, the NOM will converge to the local minimum nearest to the
starting point due to the steepest descent nature of the algorithm. As stated in Section 2.2.1, before
training the NOM, random initial values are assigned to w(s). This introduces randomness into the
first step of the gradient calculation. Since the following updates of the weights and biases are
functions of the results from the previous steps, the random initial weights allow the NOM to find
more local minima. In addition, multiple starting points are used to train the NOM. The randomness
of the initial weights and the multiple starting points help the NOM converge to multiple local
minima, which makes it possible to obtain the global minimum.
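A minimal sketch of this multi-start strategy, assuming a hypothetical helper build_nom() that rebuilds the NOM of Fig. 3 with freshly randomized w(s) and returns both the NOM and the sub-model recovering X, together with the starting_points array from the grid search of Section 2.2.1, is:

```python
# A minimal sketch of the multi-start strategy; `build_nom` and `starting_points`
# are assumptions carried over from the earlier sketches.
import numpy as np

candidates = []
for x0 in starting_points:                         # one NOM training run per starting point
    nom, x_model = build_nom()                     # random initial w(s) for each run
    nom.fit(x0[None, :], np.zeros((1, 1)), epochs=2000, batch_size=1, verbose=0)
    candidates.append(x_model.predict(x0[None, :], verbose=0)[0])
# distinct entries in `candidates` are local minima; the best one is the candidate
# global minimum
```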
The above discussion is demonstrated using the following optimization problem:

$$\begin{aligned}
\text{Minimize} \quad & f(x_1, x_2) = \frac{\sin^3(2x_1)\sin(2x_2)}{x_1^3(x_1 + x_2)} \\
\text{subject to} \quad & 0.1 \le x_1 \le 1 \\
& 0.1 \le x_2 \le 7
\end{aligned} \qquad (25)$$
This problem is modified from Problem 3 by deleting the first and second constraints. As
shown in Fig. 7 (c), there are two local minima in the defined domain, and one of those is the
global minimum. This problem aims to test whether the NOM can obtain both local minima or
just one of those. The results of the number of local minima obtained by the NOM are shown
in Table 5. Six cases are investigated according to the number of starting points and whether
initial w(s) are one or random. The initial b(s) is zero for all cases. The test results in Table 5
show that the NOM can obtain both local minima with multiple starting points and random
initial w(s). Therefore, we use random initial w(s) and 5 starting points for all the previous
optimization problems. The starting points correspond to 5 smallest objective function values
determined by the grid search as stated in Section 2.2.1. This strategy benefits finding multiple
local minima, which is possible to find the global minima. However, the other optimization
techniques used in the previous sections only try to find the global minima.

Table 5 Number of local minima obtained by the NOM.


                             1 starting point                              Multiple starting points
                             Training NOM once   Training NOM multiple times
Initial w(s) is one.         1                   1                         2
Initial w(s) is random.      1                   2                         2

5. Conclusions
Neural Optimization Machine (NOM), a novel neural network approach, is proposed for
constrained optimization of neural network models. The objective functions for the NOM are
NN models. The optimization process is conducted by the neural network’s built-in
backpropagation algorithm. The NOM solves optimization problems by extending the
architecture of the NN objective function model. This is achieved through the appropriate
design of the NOM’s structure, activation function, and loss function. The NOM is very flexible
and is extended for multiobjective optimization. The NOM is tested using numerical
optimization problems for constrained optimization and multiobjective optimization. The
results obtained from the NOM are compared with the Nelder Mead, Genetic Algorithm,
Differential Evolution, and Particle Swarm Optimization for single-objective optimization, and
Non-dominated Sorting Genetic Algorithm (NSGA-II) for multiobjective optimization. The
NOM is then applied for the design of processing parameters in additive manufacturing. Based
on the investigation in this paper, the following conclusions are drawn.
1. The NN objective function can have arbitrary architectures and activation functions.
2. The application of the NOM is not limited to specific optimization problems, e.g.,
linear and quadratic programming.
3. Multiple local minima can be found, which provides the potential for finding the global
minimum.
4. Increasing the dimension of the design variables does not significantly increase the
computational cost for the NOM.

Acknowledgments
The research in this paper was partially supported by funds from NASA University
Leadership Initiative program (Contract No. NNX17AJ86A, Project Officer: Dr. Anupa Bajwa,
Principal Investigator: Dr. Yongming Liu). The support is gratefully acknowledged.

References
Arora, J. (2004). Introduction to optimum design: Elsevier.
Carmichael, D. (1980). Computation of Pareto optima in structural design. International Journal for
Numerical Methods in Engineering, 15, 925-929.
Chandrasekhar, A., & Suresh, K. (2021). TOuNN: Topology optimization using neural networks.
Structural and Multidisciplinary Optimization, 63, 1135-1149.
Chen, J., Gao, Y., & Liu, Y. (2022). Multi-fidelity Data Aggregation using Convolutional Neural
Networks. Computer Methods in Applied Mechanics and Engineering, 391, 114490.
Chen, J., Liu, S., Zhang, W., & Liu, Y. (2020). Uncertainty quantification of fatigue SN curves with
sparse data using hierarchical Bayesian data augmentation. International Journal of Fatigue,
134, 105511.
Chen, J., & Liu, Y. (2020a). Probabilistic physics-guided machine learning for fatigue data analysis.
Expert Systems with Applications, 114316.
Chen, J., & Liu, Y. (2020b). Uncertainty quantification of fatigue properties with sparse data using
hierarchical Bayesian model. In AIAA Scitech 2020 Forum (pp. 0680).
Chen, J., & Liu, Y. (2021a). Fatigue modeling using neural networks: a comprehensive review. Fatigue
& Fracture of Engineering Materials & Structures.
Chen, J., & Liu, Y. (2021b). Fatigue property prediction of additively manufactured Ti-6Al-4V using
probabilistic physics-guided learning. Additive Manufacturing, 39, 101876.
Cohon, J. L. (2004). Multiobjective programming and planning (Vol. 140): Courier Corporation.
Darvishvand, L., Kamkari, B., & Kowsary, F. (2018). Optimal design approach for heating irregular-
shaped objects in three-dimensional radiant furnaces using a hybrid genetic algorithm–artificial
neural network method. Engineering optimization, 50, 452-470.
Dhingra, A., & Rao, S. (1992). A neural network based approach to mechanical design optimization.
Engineering optimization, 20, 187-203.
Effati, S., & Nazemi, A. (2006). Neural network models and its application for solving linear and
quadratic programming problems. Applied mathematics and Computation, 172, 305-331.
Günther, J., Krewerth, D., Lippmann, T., Leuders, S., Tröster, T., Weidner, A., Biermann, H., &
Niendorf, T. (2017). Fatigue life of additively manufactured Ti–6Al–4V in the very high cycle
fatigue regime. International Journal of Fatigue, 94, 236-245.
Jeon, Y., Lee, M., & Choi, J. Y. (2019). Neuro-Optimization: Learning Objective Functions Using
Neural Networks. arXiv preprint arXiv:1905.10079.
Ketkar, N. (2017). Deep Learning with Python (Vol. 1): Springer.
Kumar, P., & Ramamurty, U. (2020). High cycle fatigue in selective laser melted Ti-6Al-4V. Acta
Materialia, 194, 305-320.
Laddach, K., Łangowski, R., Rutkowski, T. A., & Puchalski, B. (2022). An automatic selection of
optimal recurrent neural network architecture for processes dynamics modelling purposes.
Applied Soft Computing, 116, 108375.
Lopez-Garcia, T. B., Coronado-Mendoza, A., & Domínguez-Navarro, J. A. (2020). Artificial neural
networks in microgrids: A review. Engineering Applications of Artificial Intelligence, 95,
103894.

Mack, Y., Goel, T., Shyy, W., & Haftka, R. (2007). Surrogate model-based optimization framework: a
case study in aerospace design. In Evolutionary computation in dynamic and uncertain
environments (pp. 323-342): Springer.
Martínez-Iranzo, M., Herrero, J. M., Sanchis, J., Blasco, X., & García-Nieto, S. (2009). Applied Pareto
multi-objective optimization by stochastic solvers. Engineering Applications of Artificial
Intelligence, 22, 455-465.
Nascimento, C. A. O., Giudici, R., & Guardani, R. (2000). Neural network based approach for
optimization of industrial chemical processes. Computers & Chemical Engineering, 24, 2303-
2314.
Nascimento, R. G., Fricke, K., & Viana, F. A. C. (2020). A tutorial on solving ordinary differential
equations using Python and hybrid physics-informed neural network. Engineering Applications
of Artificial Intelligence, 96, 103996.
Neelakantan, T., & Pundarikanthan, N. (2000). Neural network-based simulation-optimization model
for reservoir operation. Journal of water resources planning and management, 126, 57-64.
Pegues, J. W., Shao, S., Shamsaei, N., Sanaei, N., Fatemi, A., Warner, D. H., Li, P., & Phan, N. (2020).
Fatigue of additive manufactured Ti-6Al-4V, Part I: The effects of powder feedstock,
manufacturing, and post-process conditions on the resulting microstructure and defects.
International Journal of Fatigue, 132, 105358.
Rao, S. S. (2019). Engineering optimization: theory and practice: John Wiley & Sons.
Sandgren, H. R., Zhai, Y., Lados, D. A., Shade, P. A., Schuren, J. C., Groeber, M. A., Kenesei, P., &
Gavras, A. G. (2016). Characterization of fatigue crack growth behavior in LENS fabricated
Ti-6Al-4V using high-energy synchrotron x-ray microtomography. Additive Manufacturing, 12,
132-141.
Sattarifar, A., & Nestorović, T. (2022). Damage localization and characterization using one-
dimensional convolutional neural network and a sparse network of transducers. Engineering
Applications of Artificial Intelligence, 115, 105273.
Sharma, A., Chen, J., Diewald, E., Imanian, A., Beuth, J., & Liu, Y. (2021). Data-Driven Sensitivity
Analysis for Static Mechanical Properties of Additively Manufactured Ti–6Al–4V. ASCE-
ASME J Risk and Uncert in Engrg Sys Part B Mech Engrg, 8.
Stadler, W. (1988). Multicriteria Optimization in Engineering and in the Sciences (Vol. 37): Springer
Science & Business Media.
Sundaram, A. (2022). Multiobjective multi verse optimization algorithm to solve dynamic economic
emission dispatch problem with transmission loss prediction by an artificial neural network.
Applied Soft Computing, 124, 109021.
Tagliarini, G. A., Christ, J. F., & Page, E. W. (1991). Optimization using neural networks. IEEE
transactions on computers, 40, 1347-1358.
Tank, D., & Hopfield, J. (1986). Simple 'neural' optimization networks: An A/D converter, signal
decision circuit, and a linear programming circuit. IEEE transactions on circuits and systems,
33, 533-541.
Villarrubia, G., De Paz, J. F., Chamoso, P., & la Prieta, F. D. (2018). Artificial neural networks used in
optimization problems. Neurocomputing, 272, 10-16.

Wu, A., & Tam, P. K.-S. (1999). A neural network methodology of quadratic optimization.
International Journal of Neural Systems, 9, 87-93.
Wu, C., Wang, C., & Kim, J.-W. (2022). Welding sequence optimization to reduce welding distortion
based on coupled artificial neural network and swarm intelligence algorithm. Engineering
Applications of Artificial Intelligence, 114, 105142.
Wycisk, E., Emmelmann, C., Siddique, S., & Walther, F. (2013). High cycle fatigue (HCF) performance
of Ti-6Al-4V alloy processed by selective laser melting. Advanced materials research, 816,
134-139.
Xia, Y., Feng, G., & Wang, J. (2008). A novel recurrent neural network for solving nonlinear
optimization problems with inequality constraints. IEEE Transactions on neural networks, 19,
1340-1353.
Zhang, X.-S. (2013). Neural networks in optimization (Vol. 46): Springer Science & Business Media.
Zhang, Z., Li, L., & Lu, J. (2021). Gradient-based fly immune visual recurrent neural network solving
large-scale global optimization. Neurocomputing, 454, 238-253.
