IFEM Ch09
MultiFreedom Constraints II
Chapter 9: MULTIFREEDOM CONSTRAINTS II 9–2
TABLE OF CONTENTS
§9.1. The Penalty Method
§9.1.1. Physical Interpretation
§9.1.2. Choosing the Penalty Weight
§9.1.3. The Square Root Rule
§9.1.4. Penalty Elements for General MFCs
§9.1.5. *The Theory Behind the Recipe
§9.1.6. Assessment of the Penalty Method
§9.2. Lagrange Multiplier Adjunction
§9.2.1. Physical Interpretation
§9.2.2. Lagrange Multipliers for General MFCs
§9.2.3. *The Theory Behind the Recipe
§9.2.4. Assessment of the Lagrange Multiplier Method
§9.3. *The Augmented Lagrangian Method
§9.4. Summary
§9. Notes and Bibliography
§9. References
§9. Exercises
In this Chapter we continue the discussion of methods to treat multifreedom constraints (MFCs). The
master-slave method described previously was found to exhibit serious shortcomings for treating
arbitrary constraints, although the method has important applications to model reduction.
We now pass to the study of two other methods: penalty augmentation and Lagrange multiplier
adjunction. Both of these techniques are better suited to general implementations of the Finite
Element Method.
§9.1. The Penalty Method

[Figure 9.1. The example structure of Chapter 8, repeated for convenience: nodes 1 through 7 along x,
bar elements (1) through (6), nodal displacements u_1, ..., u_7 and nodal forces f_1, ..., f_7.]
§9.1.1. Physical Interpretation

The penalty method will first be presented using a physical interpretation, leaving the mathematical
formulation to a subsequent section. Consider again the example structure of Chapter 8, which
is reproduced in Figure 9.1 for convenience. To impose $u_2 = u_6$, imagine that nodes 2 and 6 are
connected with a “fat” bar of axial stiffness $w$, labeled with element number 7, as shown in Figure
9.2. This bar is called a penalty element and $w$ is its penalty weight.
[Figure 9.2: the example structure with the penalty element (7), of axial rigidity $w$, connecting
nodes 2 and 6.]
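Such an element, albeit fictitious, can be treated exactly like another bar element insofar as
continuing the assembly of the master stiffness equations. The penalty element stiffness equations,
$\mathbf{K}^{(7)} \mathbf{u}^{(7)} = \mathbf{f}^{(7)}$, are

$$
w \begin{bmatrix} 1 & -1 \\ -1 & 1 \end{bmatrix}
\begin{bmatrix} u_2 \\ u_6 \end{bmatrix}
=
\begin{bmatrix} f_2^{(7)} \\ f_6^{(7)} \end{bmatrix}. \tag{9.1}
$$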
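The remark that the penalty element is handled like any other bar element can be illustrated with a
minimal Python sketch. It assumes a hypothetical seven-freedom chain of bars with unit axial
stiffness (the actual stiffness values of the Chapter 8 example are not used here), and the helper
name `add_bar` is illustrative only. The same routine that merges the ordinary bars also merges the
penalty element (9.1):

```python
def add_bar(K, i, j, stiffness):
    """Merge stiffness*[[1,-1],[-1,1]] into freedoms i, j of K (1-based numbering)."""
    a, b = i - 1, j - 1
    K[a][a] += stiffness
    K[b][b] += stiffness
    K[a][b] -= stiffness
    K[b][a] -= stiffness


n = 7
K = [[0.0] * n for _ in range(n)]

# Assumption: elements (1)..(6) are bars of unit stiffness joining freedoms e and e+1.
for e in range(1, n):
    add_bar(K, e, e + 1, 1.0)

# Element (7): the penalty bar of weight w between freedoms 2 and 6, as in (9.1).
w = 1.0e8
add_bar(K, 2, 6, w)

print(K[1][1], K[5][5])  # diagonal entries of freedoms 2 and 6, each increased by w
print(K[1][5], K[5][1])  # coupling entries between freedoms 2 and 6, each equal to -w
```

Only four entries of the master stiffness matrix change, so the sparsity pattern is widened by a
single symmetric off-diagonal pair.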
Because there is one freedom per node, the two local element freedoms map into global freedoms 2
and 6, respectively. Using the assembly rules of Chapter 3 we obtain the following modified master
distinguish $\mathbf{K}$ from an exactly singular matrix. If $w \ll 10^{16}$ but $w \gg 1$, the effect will be seen in
increasing solution errors affecting the computed displacements $\hat{\mathbf{u}}$ returned by the equation solver.
These errors, however, tend to be more of a random nature than the constraint violation error.
2 Such definitions are more rigorously done by working with binary numbers and base-2 arithmetic, but for the present
discussion the use of decimal powers is sufficient.
§9.1.3. The Square Root Rule

Obviously we have two effects at odds with each other. Making $w$ larger reduces the constraint
violation error but increases the solution error. The best $w$ is that which makes both errors roughly
equal in absolute value. This tradeoff value is difficult to find other than by systematically running
numerical experiments. In practice the heuristic square root rule is often followed.
This rule can be stated as follows. Suppose that the largest stiffness coefficient, before adding
penalty elements, is of the order of $10^k$ and that the working machine precision is $p$ digits.$^3$ Then
choose penalty weights to be of order $10^{k+p/2}$, with the proviso that such a choice would not cause
arithmetic overflow.$^4$
For the above example in which $k \approx 0$ and $p \approx 16$, the optimal $w$ given by this rule would be
$w \approx 10^8$. This $w$ would yield a constraint violation and a solution error of order $10^{-8}$. Note that
there is no simple way to do better than this accuracy aside from using extended (e.g., quad)
floating-point precision. This is not easy to do when using standard low-level programming languages.
The name “square root” arises because the recommended $w$ is in fact $10^k \sqrt{10^p}$. It is seen that
picking the weight by this rule requires knowledge of both stiffness magnitudes and floating-point
hardware properties of the computer used, as well as the precision selected by the program.
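The rule is simple enough to be sketched in a few lines of Python. The function name
`square_root_rule_weight` and the sample matrix are assumptions made for illustration; the order of
magnitude $k$ is estimated by scanning the diagonal of the stiffness matrix, and $p$ defaults to 16
digits as in the discussion above:

```python
import math
import sys


def square_root_rule_weight(K, digits=16):
    """Return a penalty weight of order 10**(k + p/2), where 10**k is the order
    of the largest diagonal stiffness and p = digits is the working precision."""
    k_max = max(abs(K[i][i]) for i in range(len(K)))
    k = round(math.log10(k_max))
    exponent = k + digits / 2
    # Proviso of the rule: the chosen weight must not cause arithmetic overflow.
    if exponent > sys.float_info.max_10_exp:
        raise OverflowError("penalty weight would overflow; rescale the stiffness")
    return 10.0 ** exponent


# Hypothetical stiffness matrix whose largest coefficient is of order 10**0.
K = [[1.0, -1.0, 0.0],
     [-1.0, 2.0, -1.0],
     [0.0, -1.0, 1.0]]
w = square_root_rule_weight(K)
print(f"{w:.0e}")  # prints 1e+08
```

Rounding $\log_{10}$ of the largest diagonal entry is one possible reading of "of the order of
$10^k$"; any estimate within a factor of ten or so serves the heuristic equally well.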
§9.1.4. Penalty Elements for General MFCs

For the constraint $u_2 = u_6$ the physical interpretation of the penalty element is clear. Nodal points 2
and 6 must move in lockstep along $x$, which can be approximately enforced by the heavy bar device
shown in Figure 9.2. But how about $3u_3 + u_5 - 4u_6 = 1$? Or just $u_2 = -u_6$?
The treatment of more general constraints is linked to the theory of Courant penalty functions,
which in turn is a topic in variational calculus. Because the necessary theory given in §9.1.5 is
viewed as an advanced topic, the procedure used for constructing a penalty element is stated here
as a recipe. Consider the homogeneous constraint

$$
3u_3 + u_5 - 4u_6 = 0. \tag{9.4}
$$
3 Such order-of-magnitude estimates can be readily found by scanning the diagonal of K because the largest stiffness
coefficient of the actual structure is usually a diagonal entry.
4 If overflow occurs, the master stiffness should be scaled throughout or a better choice of physical units made.
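Here $\bar{\mathbf{K}}^e$ is the unscaled stiffness matrix of the penalty element. This is now multiplied by the
penalty weight $w$ and assembled into the master stiffness matrix following the usual rules. For the
example problem, augmenting (9.2) with the $w$-scaled penalty element (9.6) yields

$$
\begin{bmatrix}
K_{11} & K_{12} & 0 & 0 & 0 & 0 & 0 \\
K_{12} & K_{22} & K_{23} & 0 & 0 & 0 & 0 \\
0 & K_{23} & K_{33} + 9w & K_{34} & 3w & -12w & 0 \\
0 & 0 & K_{34} & K_{44} & K_{45} & 0 & 0 \\
0 & 0 & 3w & K_{45} & K_{55} + w & K_{56} - 4w & 0 \\
0 & 0 & -12w & 0 & K_{56} - 4w & K_{66} + 16w & K_{67} \\
0 & 0 & 0 & 0 & 0 & K_{67} & K_{77}
\end{bmatrix}
\begin{bmatrix} u_1 \\ u_2 \\ u_3 \\ u_4 \\ u_5 \\ u_6 \\ u_7 \end{bmatrix}
=
\begin{bmatrix} f_1 \\ f_2 \\ f_3 \\ f_4 \\ f_5 \\ f_6 \\ f_7 \end{bmatrix}. \tag{9.7}
$$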
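If the constraint is nonhomogeneous the force vector is also modified. To illustrate this effect,
consider the MFC: $3u_3 + u_5 - 4u_6 = 1$. Rewrite in matrix form as

$$
\begin{bmatrix} 3 & 1 & -4 \end{bmatrix}
\begin{bmatrix} u_3 \\ u_5 \\ u_6 \end{bmatrix} = 1. \tag{9.8}
$$

Premultiply both sides by the transpose of the coefficient matrix:

$$
\begin{bmatrix} 9 & 3 & -12 \\ 3 & 1 & -4 \\ -12 & -4 & 16 \end{bmatrix}
\begin{bmatrix} u_3 \\ u_5 \\ u_6 \end{bmatrix}
=
\begin{bmatrix} 3 \\ 1 \\ -4 \end{bmatrix}. \tag{9.9}
$$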
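Scaling by $w$ and assembling yields

$$
\begin{bmatrix}
K_{11} & K_{12} & 0 & 0 & 0 & 0 & 0 \\
K_{12} & K_{22} & K_{23} & 0 & 0 & 0 & 0 \\
0 & K_{23} & K_{33} + 9w & K_{34} & 3w & -12w & 0 \\
0 & 0 & K_{34} & K_{44} & K_{45} & 0 & 0 \\
0 & 0 & 3w & K_{45} & K_{55} + w & K_{56} - 4w & 0 \\
0 & 0 & -12w & 0 & K_{56} - 4w & K_{66} + 16w & K_{67} \\
0 & 0 & 0 & 0 & 0 & K_{67} & K_{77}
\end{bmatrix}
\begin{bmatrix} u_1 \\ u_2 \\ u_3 \\ u_4 \\ u_5 \\ u_6 \\ u_7 \end{bmatrix}
=
\begin{bmatrix} f_1 \\ f_2 \\ f_3 + 3w \\ f_4 \\ f_5 + w \\ f_6 - 4w \\ f_7 \end{bmatrix}. \tag{9.10}
$$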
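The recipe (write the constraint as a row of coefficients, premultiply by its transpose, scale by
$w$, assemble) can be sketched in Python as follows. The function names `penalty_element` and
`assemble` are illustrative assumptions, and the master system is started from zeros so that only
the penalty contributions of (9.10) show up:

```python
def penalty_element(a, b, w):
    """Return (K_e, f_e) = (w a^T a, w a^T b) for one MFC written as a . u = b."""
    K_e = [[w * ai * aj for aj in a] for ai in a]
    f_e = [w * ai * b for ai in a]
    return K_e, f_e


def assemble(K, f, K_e, f_e, freedoms):
    """Add element arrays into the master K and f at the given 1-based global freedoms."""
    for r, gr in enumerate(freedoms):
        f[gr - 1] += f_e[r]
        for c, gc in enumerate(freedoms):
            K[gr - 1][gc - 1] += K_e[r][c]


# Unscaled element (w = 1) for 3 u3 + u5 - 4 u6 = 1; compare with (9.9).
K_bar, f_bar = penalty_element([3, 1, -4], 1, w=1)
print(K_bar)  # [[9, 3, -12], [3, 1, -4], [-12, -4, 16]]
print(f_bar)  # [3, 1, -4]

# Assemble the w-scaled element into an initially empty 7-freedom master system.
w = 10**8
K = [[0] * 7 for _ in range(7)]
f = [0] * 7
assemble(K, f, *penalty_element([3, 1, -4], 1, w), freedoms=(3, 5, 6))
print(K[2][2] // w, K[2][5] // w, K[5][5] // w)  # 9 -12 16
```

For a homogeneous constraint such as (9.4) the same call with `b = 0` leaves the force vector
untouched, which reproduces (9.7).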
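§9.1.5. *The Theory Behind the Recipe

The rule comes from the following mathematical theory. Suppose we have a set of $m$ linear MFCs. Using the
matrix notation introduced in §8.1.3, these will be stated as

$$
\mathbf{a}_p \mathbf{u} = b_p, \qquad p = 1, \dots, m, \tag{9.11}
$$

where $\mathbf{u}$ contains all degrees of freedom and each $\mathbf{a}_p$ is a row vector with the same length as $\mathbf{u}$. To incorporate
the MFCs into the FEM model one selects a weight $w_p > 0$ for each constraint and constructs the so-called
Courant quadratic penalty function or “penalty energy”

$$
P = \sum_{p=1}^{m} P_p, \quad \text{with} \quad
P_p = \mathbf{u}^T \left( \tfrac{1}{2} w_p \mathbf{a}_p^T \mathbf{a}_p \mathbf{u} - w_p \mathbf{a}_p^T b_p \right)
= \tfrac{1}{2} \mathbf{u}^T \mathbf{K}^{(p)} \mathbf{u} - \mathbf{u}^T \mathbf{f}^{(p)}, \tag{9.12}
$$

where we have called $\mathbf{K}^{(p)} = w_p \mathbf{a}_p^T \mathbf{a}_p$ and $\mathbf{f}^{(p)} = w_p \mathbf{a}_p^T b_p$. $P$ is added to the potential energy function
$\Pi = \tfrac{1}{2} \mathbf{u}^T \mathbf{K} \mathbf{u} - \mathbf{u}^T \mathbf{f}$ to form the augmented potential energy $\Pi_a = \Pi + P$. Minimization of $\Pi_a$ with respect
to $\mathbf{u}$ yields

$$
\left( \mathbf{K} + \sum_{p=1}^{m} \mathbf{K}^{(p)} \right) \mathbf{u} = \mathbf{f} + \sum_{p=1}^{m} \mathbf{f}^{(p)}. \tag{9.13}
$$

Each term of the sum on $p$, which derives from term $P_p$ in (9.12), may be viewed as contributed by a penalty
element with globalized stiffness matrix $\mathbf{K}^{(p)} = w_p \mathbf{a}_p^T \mathbf{a}_p$ and globalized added force term $\mathbf{f}^{(p)} = w_p \mathbf{a}_p^T b_p$.
To use an even more compact form we may write the set of multifreedom constraints as $\mathbf{A}\mathbf{u} = \mathbf{b}$. Then the
penalty augmented system can be written compactly as

$$
(\mathbf{K} + \mathbf{A}^T \mathbf{W} \mathbf{A})\, \mathbf{u} = \mathbf{f} + \mathbf{A}^T \mathbf{W} \mathbf{b}, \tag{9.14}
$$

where $\mathbf{W}$ is a diagonal matrix of penalty weights. This compact form, however, conceals the configuration of
the penalty elements.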
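The compact penalty-augmented system, with stiffness $\mathbf{K} + \mathbf{A}^T\mathbf{W}\mathbf{A}$ and force $\mathbf{f} + \mathbf{A}^T\mathbf{W}\mathbf{b}$, can be tried out on a
very small model. The sketch below is an illustration under stated assumptions: a hypothetical
two-freedom model (two unit springs to ground, unit load on freedom 1, constraint $u_1 = u_2$), exact
rational arithmetic so that the constraint violation is not masked by roundoff, and helper names
`solve` and `penalty_system` that are not part of any library:

```python
from fractions import Fraction


def solve(A, b):
    """Solve A x = b exactly by Gauss-Jordan elimination with row pivoting."""
    n = len(b)
    M = [list(map(Fraction, row)) + [Fraction(rhs)] for row, rhs in zip(A, b)]
    for c in range(n):
        piv = next(r for r in range(c, n) if M[r][c] != 0)
        M[c], M[piv] = M[piv], M[c]
        pivot = M[c][c]
        M[c] = [x / pivot for x in M[c]]
        for r in range(n):
            if r != c and M[r][c] != 0:
                factor = M[r][c]
                M[r] = [x - factor * y for x, y in zip(M[r], M[c])]
    return [row[n] for row in M]


def penalty_system(K, f, A, b, weights):
    """Return (K + A^T W A, f + A^T W b) for constraints A u = b, W = diag(weights)."""
    n = len(K)
    Kp = [[Fraction(K[i][j]) for j in range(n)] for i in range(n)]
    fp = [Fraction(x) for x in f]
    for a_p, b_p, w_p in zip(A, b, weights):
        for i in range(n):
            fp[i] += w_p * a_p[i] * b_p
            for j in range(n):
                Kp[i][j] += w_p * a_p[i] * a_p[j]
    return Kp, fp


# Hypothetical model: K = I, f = (1, 0), one MFC u1 - u2 = 0.
K = [[1, 0], [0, 1]]
f = [1, 0]
A = [[1, -1]]
b = [0]
for w in (1, 100, 10**8):
    u = solve(*penalty_system(K, f, A, b, [w]))
    print(w, [float(x) for x in u], "violation:", float(u[0] - u[1]))
```

For this model the violation $u_1 - u_2$ works out to $1/(1 + 2w)$, so it decays like $1/w$ and never
vanishes for finite $w$, in line with the assessment of the method.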
5 Single freedom constraints, such as those encountered in Chapter 3, are usually processed separately for efficiency.
6 For example, solving the master stiffness equations by Cholesky factorization or conjugate-gradients.
[Figure 9.3. Physical interpretation of Lagrange multiplier adjunction to enforce the MFC $u_2 = u_6$:
constraint forces $-\lambda$ and $\lambda$ act on nodes 2 and 6, respectively.]
Finally, even if optimal weights are selected, the combined solution error cannot be lowered beyond
a threshold value.
From this assessment it is evident that penalty augmentation, although superior to the master-slave
method from the standpoint of generality and ease of implementation, is no panacea.
§9.2. Lagrange Multiplier Adjunction
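This $\lambda$ is called a Lagrange multiplier. Because $\lambda$ is an unknown, let us transfer it to the left-hand
side by appending it to the vector of unknowns:

$$
\begin{bmatrix}
K_{11} & K_{12} & 0 & 0 & 0 & 0 & 0 & 0 \\
K_{12} & K_{22} & K_{23} & 0 & 0 & 0 & 0 & 1 \\
0 & K_{23} & K_{33} & K_{34} & 0 & 0 & 0 & 0 \\
0 & 0 & K_{34} & K_{44} & K_{45} & 0 & 0 & 0 \\
0 & 0 & 0 & K_{45} & K_{55} & K_{56} & 0 & 0 \\
0 & 0 & 0 & 0 & K_{56} & K_{66} & K_{67} & -1 \\
0 & 0 & 0 & 0 & 0 & K_{67} & K_{77} & 0
\end{bmatrix}
\begin{bmatrix} u_1 \\ u_2 \\ u_3 \\ u_4 \\ u_5 \\ u_6 \\ u_7 \\ \lambda \end{bmatrix}
=
\begin{bmatrix} f_1 \\ f_2 \\ f_3 \\ f_4 \\ f_5 \\ f_6 \\ f_7 \end{bmatrix}. \tag{9.16}
$$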
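But now we have 7 equations in 8 unknowns. To render the system determinate, the constraint
condition $u_2 - u_6 = 0$ is appended as an eighth equation:

$$
\begin{bmatrix}
K_{11} & K_{12} & 0 & 0 & 0 & 0 & 0 & 0 \\
K_{12} & K_{22} & K_{23} & 0 & 0 & 0 & 0 & 1 \\
0 & K_{23} & K_{33} & K_{34} & 0 & 0 & 0 & 0 \\
0 & 0 & K_{34} & K_{44} & K_{45} & 0 & 0 & 0 \\
0 & 0 & 0 & K_{45} & K_{55} & K_{56} & 0 & 0 \\
0 & 0 & 0 & 0 & K_{56} & K_{66} & K_{67} & -1 \\
0 & 0 & 0 & 0 & 0 & K_{67} & K_{77} & 0 \\
0 & 1 & 0 & 0 & 0 & -1 & 0 & 0
\end{bmatrix}
\begin{bmatrix} u_1 \\ u_2 \\ u_3 \\ u_4 \\ u_5 \\ u_6 \\ u_7 \\ \lambda \end{bmatrix}
=
\begin{bmatrix} f_1 \\ f_2 \\ f_3 \\ f_4 \\ f_5 \\ f_6 \\ f_7 \\ 0 \end{bmatrix}. \tag{9.17}
$$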
This is called the multiplier-augmented system. Its coefficient matrix, which is symmetric, is
called the bordered stiffness matrix. The process by which λ is appended to the vector of original
unknowns is called adjunction. Solving this system provides the desired solution for the degrees
of freedom while also characterizing the constraint forces through λ.
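The construction of the multiplier-augmented system can be sketched in Python as follows. This is
an illustration under stated assumptions: a hypothetical two-freedom model (two unit springs to
ground, unit load on freedom 1, constraint $u_1 - u_2 = 0$), exact rational arithmetic, and helper
names `solve` and `bordered_system` that are not part of any library. Row pivoting is used because
the bordered matrix has zeros on its trailing diagonal:

```python
from fractions import Fraction


def solve(A, b):
    """Solve A x = b exactly by Gauss-Jordan elimination with row pivoting."""
    n = len(b)
    M = [list(map(Fraction, row)) + [Fraction(rhs)] for row, rhs in zip(A, b)]
    for c in range(n):
        piv = next(r for r in range(c, n) if M[r][c] != 0)
        M[c], M[piv] = M[piv], M[c]
        pivot = M[c][c]
        M[c] = [x / pivot for x in M[c]]
        for r in range(n):
            if r != c and M[r][c] != 0:
                factor = M[r][c]
                M[r] = [x - factor * y for x, y in zip(M[r], M[c])]
    return [row[n] for row in M]


def bordered_system(K, f, A, b):
    """Border K with the constraint rows A and their transpose; corner block is zero."""
    n, m = len(K), len(A)
    Kb = [list(K[i]) + [A[p][i] for p in range(m)] for i in range(n)]
    Kb += [list(A[p]) + [0] * m for p in range(m)]
    return Kb, list(f) + list(b)


# Hypothetical model: K = I, f = (1, 0), one MFC u1 - u2 = 0.
K = [[1, 0], [0, 1]]
f = [1, 0]
A = [[1, -1]]
b = [0]

Kb, fb = bordered_system(K, f, A, b)
sol = solve(Kb, fb)
u, lam = sol[:2], sol[2:]
Ku = [sum(K[i][j] * u[j] for j in range(2)) for i in range(2)]
print("u =", u, "lambda =", lam)
print("recovered node forces K u =", Ku)
```

In this small model the constraint is satisfied exactly, and the recovered forces $\mathbf{K}\mathbf{u}$ differ from
$\mathbf{f}$ by $-\lambda$ on freedom 1 and $+\lambda$ on freedom 2, which is the pair of constraint forces of the physical
interpretation.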
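§9.2.2. Lagrange Multipliers for General MFCs

The general procedure will be stated first as a recipe. Suppose that we want to solve the example
structure subjected to three MFCs

$$
u_2 - u_6 = 0, \qquad 5u_2 - 8u_7 = 3, \qquad 3u_3 + u_5 - 4u_6 = 1. \tag{9.18}
$$

Append these MFCs as three additional equations:

$$
\begin{bmatrix}
K_{11} & K_{12} & 0 & 0 & 0 & 0 & 0 \\
K_{12} & K_{22} & K_{23} & 0 & 0 & 0 & 0 \\
0 & K_{23} & K_{33} & K_{34} & 0 & 0 & 0 \\
0 & 0 & K_{34} & K_{44} & K_{45} & 0 & 0 \\
0 & 0 & 0 & K_{45} & K_{55} & K_{56} & 0 \\
0 & 0 & 0 & 0 & K_{56} & K_{66} & K_{67} \\
0 & 0 & 0 & 0 & 0 & K_{67} & K_{77} \\
0 & 1 & 0 & 0 & 0 & -1 & 0 \\
0 & 5 & 0 & 0 & 0 & 0 & -8 \\
0 & 0 & 3 & 0 & 1 & -4 & 0
\end{bmatrix}
\begin{bmatrix} u_1 \\ u_2 \\ u_3 \\ u_4 \\ u_5 \\ u_6 \\ u_7 \end{bmatrix}
=
\begin{bmatrix} f_1 \\ f_2 \\ f_3 \\ f_4 \\ f_5 \\ f_6 \\ f_7 \\ 0 \\ 3 \\ 1 \end{bmatrix}. \tag{9.19}
$$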
Three Lagrange multipliers, $\lambda_1$, $\lambda_2$ and $\lambda_3$, are required to take care of the three MFCs. Adjoin those
unknowns to the nodal displacement vector. Symmetrize the coefficient matrix by appending 3
columns that are the transpose of the last 3 rows in (9.19), and filling the bottom right-hand corner
with zeros.
On taking W = wS, the general matrix equation (9.13) of the penalty method is recovered.
This relation suggests the construction of iterative procedures in which one tries to improve the accuracy
of the penalty function method while $w$ is kept constant [57]. This strategy circumvents the aforementioned
ill-conditioning problems that arise when the weight $w$ is gradually increased. One such method is easily
constructed by inspecting (9.24). Using superscript $k$ as an iteration index and keeping $w$ fixed, solve
equations (9.24) in tandem as follows:

$$
\begin{aligned}
(\mathbf{K} + \mathbf{A}^T \mathbf{W} \mathbf{A})\, \mathbf{u}^k &= \mathbf{f} + \mathbf{A}^T \mathbf{W} \mathbf{b} - \mathbf{A}^T \boldsymbol{\lambda}^k, \\
\boldsymbol{\lambda}^{k+1} &= \boldsymbol{\lambda}^k + \mathbf{W} (\mathbf{A} \mathbf{u}^k - \mathbf{b}),
\end{aligned} \tag{9.26}
$$

for $k = 0, 1, \dots$, beginning with $\boldsymbol{\lambda}^0 = \mathbf{0}$. Then $\mathbf{u}^0$ is the penalty solution. If the process converges one
recovers the exact Lagrangian solution without having to solve the Lagrangian system (9.23) directly.
The family of iterative procedures that may be derived from (9.24) collectively pertains to the class of
augmented Lagrangian methods.
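The iteration can be exercised on a very small model. The sketch below is illustrative only: it
assumes a hypothetical two-freedom model (two unit springs to ground, unit load on freedom 1,
constraint $u_1 - u_2 = 0$), takes $\mathbf{W} = w\mathbf{I}$, uses exact rational arithmetic, and the helper names
`solve` and `augmented_lagrangian` are not part of any library. The multiplier update is written
with the residual $\mathbf{A}\mathbf{u}^k - \mathbf{b}$, which is the sign consistent with the convention $\mathbf{K}\mathbf{u} + \mathbf{A}^T\boldsymbol{\lambda} = \mathbf{f}$ of
the bordered system (9.17); with the opposite sign this sketch would diverge.

```python
from fractions import Fraction


def solve(A, b):
    """Solve A x = b exactly by Gauss-Jordan elimination with row pivoting."""
    n = len(b)
    M = [list(map(Fraction, row)) + [Fraction(rhs)] for row, rhs in zip(A, b)]
    for c in range(n):
        piv = next(r for r in range(c, n) if M[r][c] != 0)
        M[c], M[piv] = M[piv], M[c]
        pivot = M[c][c]
        M[c] = [x / pivot for x in M[c]]
        for r in range(n):
            if r != c and M[r][c] != 0:
                factor = M[r][c]
                M[r] = [x - factor * y for x, y in zip(M[r], M[c])]
    return [row[n] for row in M]


def augmented_lagrangian(K, f, A, b, w, iterations):
    """Iterate with W = w*I; return one (u, lambda, A u - b) triple per pass."""
    n, m = len(K), len(A)
    # The coefficient matrix K + w A^T A is formed once and reused in every pass.
    M = [[Fraction(K[i][j]) + w * sum(A[p][i] * A[p][j] for p in range(m))
          for j in range(n)] for i in range(n)]
    lam = [Fraction(0)] * m
    history = []
    for _ in range(iterations):
        rhs = [f[i] + sum(A[p][i] * (w * b[p] - lam[p]) for p in range(m))
               for i in range(n)]
        u = solve(M, rhs)
        residual = [sum(A[p][i] * u[i] for i in range(n)) - b[p] for p in range(m)]
        history.append((u, lam, residual))
        lam = [lam[p] + w * residual[p] for p in range(m)]
    return history


# Hypothetical model: K = I, f = (1, 0), one MFC u1 - u2 = 0, moderate weight w = 10.
history = augmented_lagrangian([[1, 0], [0, 1]], [1, 0], [[1, -1]], [0], 10, 4)
for k, (u, lam, res) in enumerate(history):
    print(k, [float(x) for x in u], float(lam[0]), float(res[0]))
```

For this model the constraint residual shrinks by the factor $1/(1 + 2w)$ on every pass even though
$w$ stays at a moderate value, and the multiplier approaches the exact value $1/2$ of the Lagrangian
solution.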
§9.4. Summary

Figure 9.4 gives an assessment of the three techniques in terms of seven attributes.

[Figure 9.4. Assessment summary of three MFC application methods.]
For a general purpose program that tries to attain “black box” behavior (that is, minimal decisions
on the part of users) the method of Lagrange multipliers has the edge. This edge is unfortunately
blunted by a fairly complex computer implementation and by the loss of positive definiteness in the
bordered stiffness matrix.
Notes and Bibliography
A form of the penalty function method, quite close to that described in §9.1.5, was first proposed by Courant
in the early 1940s [45]. It entered the FEM through the work of numerous people in the 1960s. There is a
good description in the book by Zienkiewicz and Taylor [231].
The Lagrange Multiplier method is much older. Multipliers (called initially “coefficients”) were described
by Lagrange in his famous Mécanique Analytique monograph [126], as part of the procedure for forming the
function now called the Lagrangian. Its use in FEM is more recent than penalty methods.
Augmented Lagrangian methods have received much attention since the late 1960s, when they originated in
the field of constrained optimization [110,164]. The use of the Augmented Lagrangian Multiplier method
for FEM kinematic constraints is first discussed in [57], wherein the iterative algorithm (9.26) for the master
stiffness equations is derived.
References
Referenced items have been moved to Appendix R.
Exercises
EXERCISE 9.1 [C+N:20] This is identical to Exercise 8.1, except that the MFC $u_2 - u_6 = 1/5$ is to be
treated by the penalty function method. Take the weight $w$ to be $10^k$, in which $k$ varies as $k = 3, 4, 5, \dots, 16$.
For each sample $w$ compute the Euclidean-norm solution error $e(w) = \| \mathbf{u}_p(w) - \mathbf{u}_{ex} \|_2$, where $\mathbf{u}_p$ is the
computed solution and $\mathbf{u}_{ex}$ is the exact solution listed in (E8.1). Plot $k = \log_{10} w$ versus $\log_{10} e$ and report for
which weight $e$ attains a minimum. (See Slide #5 for a check.) Does it roughly agree with the square root rule
(§9.1.3) if the computations carry 16 digits of precision?
As in Exercise 8.1, use Mathematica, Matlab (or similar) to do the algebra. For example, the following
Mathematica script solves this Exercise:
EXERCISE 9.2 [C+N:15] Again identical to Exercise 8.1, except that the MFC $u_2 - u_6 = 1/5$ is to be
treated by the Lagrange multiplier method. The results for the computed $\mathbf{u}$ and the recovered force vector $\mathbf{K}\mathbf{u}$
should agree with (E8.1). Use Mathematica, Matlab (or similar) to do the algebra. For example, the following
Mathematica script solves this Exercise:
Kmod[[2,8]]=Kmod[[8,2]]= 1;
Kmod[[6,8]]=Kmod[[8,6]]=-1; fmod[[8]]=1/5;
Print["Kmod=",Kmod//MatrixForm];
Print["fmod=",fmod];
umod=LinearSolve[N[Kmod],N[fmod]]; u=Take[umod,7];
Print["Solution u=",u ,", lambda=",umod[[8]]];
Print["Recovered node forces=",K.u];
EXERCISE 9.3 [A:10] For the example structure, show which penalty elements would implement the
following MFCs:

$$
\text{(a)}\ \ u_2 + u_6 = 0, \qquad \text{(b)}\ \ u_2 - 3u_6 = 1/3. \tag{E9.1}
$$

As answer, show the stiffness equations of those two elements in a manner similar to (9.1).
EXERCISE 9.4 [A/C+N:15+15+10] Suppose that the assembled stiffness equations for a one-dimensional
finite element model before imposing constraints are

$$
\begin{bmatrix} 2 & -1 & 0 \\ -1 & 2 & -1 \\ 0 & -1 & 2 \end{bmatrix}
\begin{bmatrix} u_1 \\ u_2 \\ u_3 \end{bmatrix}
=
\begin{bmatrix} 1 \\ 0 \\ 2 \end{bmatrix}. \tag{E9.2}
$$

This system is to be solved subject to the constraint

$$
u_1 = u_3. \tag{E9.3}
$$

(a) Impose the constraint (E9.3) by the master-slave method taking $u_1$ as master, and solve the resulting
$2 \times 2$ system of equations by hand.
(b) Impose the constraint (E9.3) by the penalty function method, leaving the weight $w$ as a free parameter.
Solve the equations by hand or CAS (Cramer’s rule is recommended) and verify analytically that as
$w \to \infty$ the solution approaches that found in (a). Tabulate the values of $u_1$, $u_2$, $u_3$ for $w = 0, 1, 10, 100$.
Hint 1: the value of $u_2$ should not change. Hint 2: the solution for $u_1$ should be $(6w + 5)/(4w + 4)$.
(c) Impose the constraint (E9.3) by the Lagrange multiplier method. Show the $4 \times 4$ multiplier-augmented
system of equations analogous to (9.17) and solve it by computer or calculator.
EXERCISE 9.5 [A/C:10+15+10] The left end of the cantilevered beam-column member illustrated in
Figure E9.1 rests on a skew-roller that forms a $45^\circ$ angle with the horizontal axis $x$. The member is loaded axially
by a force $P$ as shown. The finite element equations upon removing the fixed right end freedoms $\{u_{x2}, u_{y2}, \theta_2\}$,
but before imposing the skew-roller MFC, are

$$
\begin{bmatrix}
EA/L & 0 & 0 \\
0 & 12EI/L^3 & 6EI/L^2 \\
0 & 6EI/L^2 & 4EI/L
\end{bmatrix}
\begin{bmatrix} u_{x1} \\ u_{y1} \\ \theta_1 \end{bmatrix}
=
\begin{bmatrix} P \\ 0 \\ 0 \end{bmatrix}, \tag{E9.4}
$$

where $E$, $A$, and $I = I_{zz}$ are given member properties, $\theta_1$ is the left end rotation, and $L$ is the member length.$^7$
7 The stiffness equations for a beam column are derived in Part III of this book. For now consider (E9.4) as a recipe.
To simplify the calculations set $P = \alpha E A$, and $I = \beta A L^2$, in which $\alpha$ and $\beta$ are dimensionless parameters,
and express the following solutions in terms of $\alpha$ and $\beta$.

[Figure E9.1. Cantilevered beam-column on skew-roller for Exercise 9.5: beam-column member with
properties E, A, I and length L between nodes 1 and 2, skew-roller at $45^\circ$ at node 1, axial force P.]

(a) Apply the skew-roller constraint by the master-slave method (make $u_{y1}$ slave) and solve for $u_{x1}$ and $\theta_1$
in terms of $L$, $\alpha$ and $\beta$. This may be done by hand or a CAS. Partial solution: $u_{x1} = \alpha L/(1 + 3\beta)$.
(b) Apply the skew-roller constraint with the penalty method by inserting a penalty element at node 1.
Follow the rule of §9.1.4 to construct the $2 \times 2$ penalty stiffness. Compute $u_{x1}$ from the modified
equations (Cramer’s rule is recommended if solved by hand). Verify that as $w \to \infty$ the answer obtained
in (a) is recovered. Partial solution: $u_{x1} = \alpha L (3EA\beta + wL)/(3EA\beta + wL(1 + 3\beta))$. Can the
penalty stiffness be physically interpreted in some way?
(c) Apply the skew-roller constraint by Lagrange multiplier adjunction, and solve the resulting $4 \times 4$ system
of equations using a CAS (by hand it will take long). Verify that you get the same solution as in (a).
EXERCISE 9.6 [A:5+5+10+10+5] A cantilever beam-column is to be joined to a plane stress plate mesh as
depicted in Figure E9.2.$^8$ Both pieces move in the plane $\{x, y\}$. Plane stress elements have two degrees of
freedom per node: two translations $u_x$ and $u_y$ along $x$ and $y$, respectively, whereas a beam-column element
has three: two translations $u_x$ and $u_y$ along $x$ and $y$, and one rotation (positive CCW) $\theta_z$ about $z$. To connect
the cantilever beam to the mesh, the following “gluing” conditions are applied:
(1) The horizontal ($u_x$) and vertical ($u_y$) displacements of the beam at their common node (2 of beam,
4 of plate) are the same.
(2) The beam end rotation $\theta_2$ and the mean rotation of the plate edge 3–5 are the same. For infinitesimal
displacements and rotations the latter is $\theta_{35}^{avg} = (u_{x5} - u_{x3})/H$.

[Figure E9.2. Beam linked to plate in plane stress for Exercise 9.6. Beam shown slightly separate from
plate for visualization convenience: nodes 2 and 4 actually are at the same location. The plane beam
spans nodes 1 and 2; the plane stress mesh has nodes 3 through 11, with plate edge 3–4–5 of height H
and node 4 at mid-height H/2.]

Questions:
(a) Write down the three MFC conditions: two from (1) and one from (2), and state whether
they are linear and homogeneous.
(b) Where does the above expression of $\theta_{35}^{avg}$ come from? (Geometric interpretation is OK.) Can it be made
more accurate$^9$ by including $u_{x4}$?
(c) Write down the master-slave transformation matrix if $\{u_{x2}, u_{y2}, \theta_2\}$ are picked as slaves. It is sufficient
to write down the transformation for the DOFs of nodes 2, 3, 4, and 5, which gives a $\mathbf{T}$ of order $9 \times 6$,
since the transformations for the other freedoms are trivial.
(d) If the penalty method is used, write down the stiffness equations of the three penalty elements assuming
the same weight $w$ is used. Their stiffness matrices are of order $2 \times 2$, $2 \times 2$ and $3 \times 3$, respectively.
(Do not proceed further.)
8 This is extracted from a question previously given in the Aerospace Ph.D. Preliminary Exam. Technically it is not
difficult once the student understands what is being asked. This can take some time, but a HW is more relaxed.
9 To answer the second question, observe that the displacements along 3–4 and 4–5 vary linearly. Thus the angle of rotation
about $z$ is constant for each of them, and (for infinitesimal displacements) may be set equal to the tangent.
(e) If Lagrange multiplier adjunction is used, how many Lagrange multipliers will you need to append? (Do
not proceed further).
EXERCISE 9.7 [A:30] Show that the master-slave transformation method $\mathbf{u} = \mathbf{T}\hat{\mathbf{u}}$ can be written down as a
special form of the method of Lagrange multipliers. Start from the augmented functional

$$
\Pi_{MS} = \tfrac{1}{2} \mathbf{u}^T \mathbf{K} \mathbf{u} - \mathbf{u}^T \mathbf{f} + \boldsymbol{\lambda}^T (\mathbf{u} - \mathbf{T}\hat{\mathbf{u}}) \tag{E9.5}
$$

and write down the stationarity conditions of $\Pi_{MS}$ with respect to $\mathbf{u}$, $\boldsymbol{\lambda}$ and $\hat{\mathbf{u}}$ in matrix form.
EXERCISE 9.8 [A:35] Check the matrix equations (9.23) through (9.26) quoted for the Augmented
Lagrangian method.
EXERCISE 9.9 [A:40] (Advanced, close to a research paper). Show that the master-slave transformation
method u = Tû can be expressed as a limit of the penalty function method as the weights go to infinity. Start
from the augmented functional
Write down the matrix stationarity conditions with respect to $\mathbf{u}$ and $\hat{\mathbf{u}}$ and take the limit $w \to \infty$. Hint:
using Woodbury’s formula (Appendix C, §C.5.2)
show that
−1
K = TK−1 TT . (E9.8)