Advanced Computational Electromagnetic Methods and Applications
Wenhua Yu
Wenxing Li
Atef Elsherbeni
Yahya Rahmat-Samii
Editors
All rights reserved. Printed and bound in the United States of America. No part of this book
may be reproduced or utilized in any form or by any means, electronic or mechanical, including
photocopying, recording, or by any information storage and retrieval system, without permission
in writing from the publisher.
All terms mentioned in this book that are known to be trademarks or service marks have been
appropriately capitalized. Artech House cannot attest to the accuracy of this information. Use of
a term in this book should not be regarded as affecting the validity of any trademark or service
mark.
Wenhua Yu
Wenxing Li
Atef Z. Elsherbeni
Yahya Rahmat-Samii
March 2015
Chapter 1
Novelties of Spectral Domain Analysis in
Antenna Characterizations: Concept,
Formulation, and Applications
Joshua M. Kovitz and Yahya Rahmat-Samii
1.1 INTRODUCTION
termed the near fields. In these regions, the antenna no longer appears as a point,
leading to complex field behavior that is difficult to analyze numerically and
analytically. The near-field and far-field regions are illustrated in Figure 1.1 with a
large reflector dish antenna ground station, where an observer in the near-field
region does not perceive the antenna as a point and experiences complex wave
behavior. However, the depicted satellite orbits the Earth at a large distance away
from the surface, where the satellite experiences far-field radiation as if the dish
antenna were a point source that concentrated its power in one direction.
Figure 1.1 Qualitative illustration of the near-field and far-field regions. The observer in the figure is
standing within the near-field region of the large ground dish antenna, whereas the satellite
is located in the far-field region of the dish antenna ground station. Note that in the far-
field region the dish antenna appears nearly as a point source to the satellite, whereas the
antenna does not appear as a point source to the observer in the near-field region.
The approaches that can be used to acquire the near-fields can be divided into
three general categories, which are all depicted in Figure 1.2. The first category
encompasses direct measurements of the EMFs near the antenna. Since the electric
field is the primary quantity of interest, a simple and intuitive technique under this
category would be to measure the electric fields using a simple power meter and
antenna positioner. If the electric field phase is desired as well, then the power
meter can be replaced with a vector network analyzer. While this directly measures
and obtains the near fields, the approach can be cumbersome, time-consuming,
expensive, and in some cases impractical. First, the approach requires having
robust mechanical equipment that can provide motion on three different axes,
which is certainly not straightforward over large volumes. Furthermore, if
measurement is chosen as the tool to determine the near-field values, there is no
guarantee that the design satisfies the antenna near-field requirements.
Consequently, multiple design iterations may be needed, each adding the cost of
rebuilding the antenna in order to satisfy the desired specifications.
Figure 1.2 Depiction of the possible techniques to find the near-field radiation from a given antenna.
Of the three techniques, this chapter specifically focuses on spectral analysis, which often
requires less time or computational effort in comparison to full-wave simulation or direct
measurements. A handy feature of this technique is that the only data required are the far-
field patterns and the radiated power, which are often known in most practical
circumstances.
the near fields still requires a tedious integration over the currents for every
observation point in the near-field region.
The last category of techniques can be classified as spectral analysis. In many
of these techniques, the fields are analyzed by decomposing the fields into an
ensemble of propagating and evanescent waves traveling in different directions. A
simple and intuitive approach is to decompose the fields into plane waves [1, 2].
This enables the rapid calculation of the fields in any region through the use of the
fast Fourier transform (FFT), which is well known in the computational
community for its inherent computational efficiency. Rather than using currents to
predict the near fields, we can directly utilize the far field to evaluate the near-
fields. This is rather convenient since the far-fields are usually known in most
practical cases, where the far fields can be found via simulation or measurements.
With the knowledge of the far field radiation and radiated power, one can
accurately predict the magnitude of the near fields. Often, the antenna is placed in
a complex environment where it can be difficult to characterize the radiation from
interactions between the antenna and other nearby objects. When applied to the
measured far fields, the spectral domain approach conveniently provides the near
fields radiated from all parts of the antenna and any interactions with the antenna’s
environment. This is important in accurately characterizing the near-fields, and can
also be challenging to achieve via the standard computation techniques.
The search for an efficient and accurate near field computational technique is
motivated by personal safety as well as interference concerns. Ensuring safety for
anyone in the antenna vicinity is critical in any antenna installation, and providing
a quick means of characterizing the near fields is an important problem in the field
of antenna engineering and electromagnetics. Often standards are placed by
government organizations such as the U.S. Federal Communications Commission
(FCC) in order to provide safety for individuals and minimize possible interference
with other devices. The knowledge of the antenna near fields also can be used in
the design of compact electronic systems such as CubeSats and other
spacecraft, where the induced fields may cause undesirable interference or breakdown
in electronics placed near the antenna. Once the near fields are known, then either
the electronics can be placed appropriately on the satellite to avoid such problems
or the antenna can be optimized such that the near fields in a particular location are
minimized.
In this chapter, we detail the steps needed to evaluate the near fields based on
the far-field data and the radiated power. Starting from Maxwell’s equations, it will
be revealed how EMFs can be decomposed into a spectrum of plane waves, which
has been popularized as plane wave expansion (PWE). The result is the Fourier
transform relationship between the near fields and far fields, which has seen use in
many applications including theoretical and computational electromagnetics [1–7],
antenna measurements [8–12], and even optics [13]. In our derivations, we provide
general results that can be used for several popular orientations of the coordinate
system describing the antenna. The discretization of the near-field and far-field
data is also discussed in detail, leading to the application of the FFT. The use of the
FFT requires proper normalization to account for sampling. In the past, the data
from the FFT was simply normalized to the maximum, but doing so will not
provide the field values attained in real life. The normalization effectively scales
the results to the desired units and is accomplished with only the knowledge of the
directivity and the power radiated. Without the normalization, the resulting data
only provides relative field strengths, which is not helpful in finding the realized
values of the fields. Interpolation is another important aspect when using the FFT,
since a rectangular sampling grid in the spectral domain must be used. In general,
the far-field values are complex numbers, and care must be taken when
interpolating the values. Some simple and effective choices for interpolation
schemes are highlighted and discussed.
As usual, some mathematical notations and assumptions must be pointed out
to the reader. In the following derivations, the italic notation f represents a complex
scalar, while the bold notation B represents a complex vector in 3-D space. With
the exception of the discussion on FFT, these quantities are given in the phasor
domain, where the engineering $e^{j\omega t}$ time convention is assumed. This will be the
convention used throughout the chapter, unless otherwise noted.
The material derived and discussed in the ensuing sections effectively covers
all necessary aspects to recover the near-field data from the far-field data. The
chapter provides the complete story in the development and use of this technique
to aid any reader in replicating the results and applying the technique to their
antennas in general. To conclude the chapter, the concepts developed herein are
applied towards several instructive examples of well-known aperture distributions,
where the fields are known analytically for comparison. A real-life reflector
antenna example is provided, where we obtain the near fields using the simulated
far fields. Quite commonly the far fields are only known for two principal planes,
and we extend the spectral analysis technique to these cases as well. We compare
the scenario where only two principal planes are known versus the case where the
far-field patterns for all angles are known for a reflector antenna.
In order to obtain the near fields, the theoretical framework behind radiation in the
near field and far field must be established. Radiation from antennas is
characterized by its radiated electric field and magnetic fields, denoted as E and H,
respectively. Both of these physical quantities exhibit complex behavior that is
challenging to model either analytically or numerically. However, spectral analysis
provides an intuitive link between the near fields and the far fields that enables an
efficient and systematic procedure to compute the near fields based upon the
knowledge of the far fields. This is depicted in Figure 1.3, where a new quantity
known as the PWS has been introduced to facilitate a simple relationship between
the fields. As shown, the PWS has a Fourier transform relation to the near-fields in
a plane z = C, where C is some arbitrary constant. Once the PWS is known, then
the near fields in any region are known and can be computed via Fourier
transform, and the far fields can be computed via an asymptotic relation to the
PWS. In this section, the behavior of electromagnetic waves and in particular plane
waves is reviewed and described in detail. These fundamental concepts lay the
foundation to introduce the PWS formally. The relationships between the near-
fields, far-fields, and the PWS are also derived and explained. Lastly, the analytical
procedure to obtain the near fields based upon the far-field distribution and the
radiated power is outlined.
Figure 1.3 Depiction of the relationship between the near fields and far fields provided by spectral
analysis. The technique utilizes the so-called PWS to relate the fields in the near-field and
far-field regions, resulting in a Fourier transform relation to the near-field electric fields in
the planes z = C, where C is an arbitrary constant. The PWS also has a useful asymptotic
relationship to the far fields.
For antennas and EMFs in general, the electric and magnetic fields can be
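mathematically described by Maxwell’s equations, shown below.
$$\nabla \times \mathbf{E} = -j\omega \mathbf{B} \tag{1.1a}$$
$$\nabla \times \mathbf{H} = \mathbf{J} + j\omega \mathbf{D} \tag{1.1b}$$
$$\nabla \cdot \mathbf{D} = \rho \tag{1.1c}$$
$$\nabla \cdot \mathbf{B} = 0 \tag{1.1d}$$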
In the equations above, B represents the magnetic flux density, D represents the
electric flux density, J represents the electric current density, and ρ represents the
electric charge density. Note also that ω is the angular frequency in rad/s. These
equations are known individually as Faraday’s law, Ampere’s law, Gauss’ law, and
the magnetic Gauss’ law, respectively. Maxwell’s equations are often paired with
the constitutive relations B = μH and D = εE, assuming homogeneous, linear, and
isotropic materials are present.
While Maxwell’s equations provide insights into the relationship between the
electric field, magnetic fields, and the electric sources, the solutions to these
equations are not immediately obvious. A few mathematical manipulations of
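these equations can reveal some remarkable insights. Taking the curl of Faraday’s
law
$$\nabla \times \nabla \times \mathbf{E} = -j\omega\, \nabla \times \mathbf{B} \tag{1.2}$$
and using the vector identity $\nabla \times \nabla \times \mathbf{F} = \nabla(\nabla \cdot \mathbf{F}) - \nabla^2 \mathbf{F}$ along with the constitutive
relations, we have
$$\nabla^2 \mathbf{E} + k^2 \mathbf{E} = j\omega\mu\, \mathbf{J} + \frac{1}{\varepsilon}\nabla\rho \tag{1.3}$$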
where $k = \omega\sqrt{\mu\varepsilon}$ is known as the wavenumber. This equation is an inhomogeneous
partial differential equation of second order, and in unbounded space can be solved
using standard techniques (e.g., vector potentials, assuming that J and ρ are
known). Unfortunately, the knowledge of the sources usually comes at a great
computational cost, as discussed in the previous section.
The spectral analysis technique avoids this problem by analyzing the fields in
the source-free regions, where simplifications to the differential equations can be
made. No currents or charges exist in these regions, that is, J = 0 and ρ = 0, which
leads to the Helmholtz equation
$$\nabla^2 \mathbf{E} + k^2 \mathbf{E} = 0 \tag{1.4}$$
A similar equation can also be derived for the magnetic field H. The solutions of
this equation have very interesting implications as discussed in [14–16]. While
many solutions of this equation can be derived for any coordinate system, the most
important and possibly the simplest to understand are the solutions in rectangular
coordinates. In rectangular coordinates, the solutions of this equation are
$$\mathbf{E}(x, y, z) = \mathbf{E}_0\, e^{-j\mathbf{k}\cdot\mathbf{r}} \tag{1.5}$$
$$k_x^2 + k_y^2 + k_z^2 = k^2 \tag{1.7}$$
This forces the speed of the plane wave to be equal to the speed of light in that
medium (i.e., 𝑣𝑝 = 𝜔⁄𝑘 = 1⁄√𝜇𝜀 ). The dispersion relation represents one of
many important properties of plane waves. One consequence is that there are only
two independent components, which means that only two components must be
known to have full knowledge of the propagation constant vector k. In many cases,
only kx and ky are given, but kz can always be found for plane waves using
$$k_z = \pm\sqrt{k^2 - k_x^2 - k_y^2} \tag{1.8}$$
Care must be taken in choosing either positive or negative values of kz, but usually
there is enough information in the problem being solved to determine the sign. We
will highlight those cases in the subsequent sections. Also, if $k_x^2 + k_y^2 > k^2$, then
imaginary values of kz can ensue, resulting in evanescent waves decaying in
magnitude as the observation points move in the +z direction.
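The branch choice for kz can be made explicit in a few lines of code. The following is a minimal NumPy sketch rather than part of the original formulation: the function name `kz_from_kxky` and the unit-wavelength example values are assumptions made for illustration. It assumes a lossless medium (real k), the $e^{-j\mathbf{k}\cdot\mathbf{r}}$ convention of (1.5), and waves leaving the antenna toward +z, so the invisible-region root is taken as negative imaginary.

```python
import numpy as np

def kz_from_kxky(kx, ky, k):
    """Return kz from (1.8) for waves travelling or decaying toward +z.

    Visible region (kx^2 + ky^2 <= k^2): kz is real and non-negative.
    Invisible region (kx^2 + ky^2 > k^2): kz = -j*alpha with alpha > 0,
    so exp(-j*kz*z) = exp(-alpha*z) decays as z increases.
    """
    kx = np.asarray(kx, dtype=float)
    ky = np.asarray(ky, dtype=float)
    kz_squared = np.asarray(k**2 - kx**2 - ky**2, dtype=complex)
    # np.sqrt gives +j*alpha for a negative real argument;
    # the conjugate selects the decaying root -j*alpha.
    return np.conj(np.sqrt(kz_squared))

k = 2 * np.pi                              # wavenumber for a wavelength of 1
kz_vis = kz_from_kxky(0.5 * k, 0.0, k)     # propagating plane wave
kz_inv = kz_from_kxky(2.0 * k, 0.0, k)     # evanescent plane wave
decay = abs(np.exp(-1j * kz_inv * 1.0))    # magnitude one wavelength along +z
```

For a problem in which the waves travel toward -z, the opposite sign would be selected for both the real and the imaginary root.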
Another important feature about the plane wave solution is the electric field
polarization vector E0, which points in the direction of oscillation as time
progresses. There are several important features about this vector. The first is that
the electric field vector E0 in free space (or in isotropic media) will be
orthogonal to the direction of propagation. This can be shown by considering the
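source-free Gauss’ law
$$\nabla \cdot \mathbf{E} = \nabla \cdot \left(\mathbf{E}_0 e^{-j\mathbf{k}\cdot\mathbf{r}}\right) = \mathbf{E}_0 \cdot \nabla e^{-j\mathbf{k}\cdot\mathbf{r}} = -j\,\mathbf{E}_0 \cdot \mathbf{k}\, e^{-j\mathbf{k}\cdot\mathbf{r}} = 0 \tag{1.9}$$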
Figure 1.4 Illustration of a plane wave whose direction of propagation is towards the k direction.
Note that the surfaces of constant phase are planes, hence the term plane waves. The k
vector is orthogonal to these planes, implying that propagation occurs orthogonal to these
surfaces. An example of the electric and magnetic fields of this plane wave is also
shown, where E0 and H0 are orthogonal to k.
$$\mathbf{E} = \sum_{n} \mathbf{E}_{0n}\, e^{-j\mathbf{k}_n\cdot\mathbf{r}} \tag{1.12}$$
With just kx and ky known, we can have full knowledge of the wave directions and
the wave vector k using (1.8). We can also introduce the quantity A, which
represents the spectral density, that is, the field density packed into the spectral
Taking this equation and shrinking the factors $\Delta k_x$ and $\Delta k_y$ to zero produces
The PWS represents a vector quantity that provides a means to relate the near
fields and far-fields in a simple, intuitive, and compact manner. Equation (1.17) from the
previous section demonstrated the intuition behind the PWS as a spectrum of plane
waves propagating in many different directions, all with the same frequency ω.
However, many more properties can be extracted from (1.17) through some
important assumptions.
A special but important case occurs when the observation point lies in the z =
0 plane. In this plane, (1.17) reduces to
$$\mathbf{E}_t(x, y, 0) = \frac{1}{4\pi^2}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} \mathbf{A}(k_x, k_y)\, e^{-jk_x x - jk_y y}\, dk_x\, dk_y \tag{1.18}$$
where C has been set to $C = 1/4\pi^2$ and the ranges of kx and ky have been extended
to cover $-\infty < k_x, k_y < \infty$. The above equation can be recognized as a 2-D Fourier
transform with respect to the parameters kx and ky. The propagation constants kx
and ky are analogous to the angular frequency ω in the more common Fourier transform
relationship
$$s(t) = \int_{-\infty}^{\infty} S(\omega)\, e^{j\omega t}\, d\omega \tag{1.19}$$
between frequency and time dependence of signals. One key difference in (1.18) is
that the kx and ky represent spatial frequencies rather than frequencies in time.
These equations also make it clear that A represents the frequency-domain
components (analogous to S(ω)) and E represents the physical quantity of interest in
space (analogous to s(t)). Another distinction is that a minus sign appears in the
exponential factor of (1.18), while the typical Fourier transform usually has a
positive exponential factor when going back to the time domain.
The fact that the PWS has the 2-D Fourier transform relationship suggests that
the PWS can be obtained via the inverse Fourier transform as
$$\mathbf{A}(k_x, k_y) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} \mathbf{E}_t(x, y, 0)\, e^{jk_x x + jk_y y}\, dx\, dy \tag{1.20}$$
due to the Fourier inversion theorem. Notice that this equation has a positive sign
in the exponential factor. Since the relationship shown in (1.20) shares similarities
with the typical Fourier transform, we denote the Fourier transform by the script
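letter $\mathcal{F}$, where we can rewrite (1.18) and (1.20) more compactly by
$$\mathbf{E}(x, y, 0) = \mathcal{F}^{-1}\{\mathbf{A}(k_x, k_y)\} \tag{1.21a}$$
$$\mathbf{A}(k_x, k_y) = \mathcal{F}\{\mathbf{E}(x, y, 0)\} \tag{1.21b}$$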
Note that the Fourier transform operation in (1.21b) has a positive exponent as
denoted in (1.20). Now, it is interesting to note that all information about the PWS
can be obtained if the electric field is known in one plane. This scenario frequently
occurs within the antenna discipline in theory and measurements. Once the full
PWS has been obtained, then all radiation information relevant to the antenna can
be computed.
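The transform pair in (1.18) and (1.20) maps directly onto the FFT once the sign and scale conventions are matched. The sketch below is one possible discretization, not the chapter's own implementation: the function names `pws_from_field` and `field_from_pws`, the 128 × 128 grid, and the 5-wavelength uniform test aperture are all assumptions chosen for illustration. It relies on the fact that NumPy's `ifft2` carries the positive exponent, which is the sign used in (1.20).

```python
import numpy as np

def pws_from_field(E, dx, dy):
    """Approximate A(kx, ky) = integral of E(x, y, 0) exp(+j(kx x + ky y)) dx dy.

    E[i, j] = E(x_i, y_j) is sampled on a grid centred on the origin.
    Returns (kx, ky, A) with zero spatial frequency at the array centre.
    """
    nx, ny = E.shape
    # ifft2 uses the positive exponent and a 1/(nx*ny) factor, so scaling by
    # nx*ny*dx*dy turns it into a Riemann sum of (1.20).
    A = np.fft.fftshift(np.fft.ifft2(np.fft.ifftshift(E))) * (nx * ny * dx * dy)
    kx = 2 * np.pi * np.fft.fftshift(np.fft.fftfreq(nx, d=dx))
    ky = 2 * np.pi * np.fft.fftshift(np.fft.fftfreq(ny, d=dy))
    return kx, ky, A

def field_from_pws(A, dx, dy):
    """Discrete counterpart of (1.18); inverts pws_from_field."""
    nx, ny = A.shape
    # dkx*dky/(4*pi^2) equals 1/(nx*dx*ny*dy); fft2 supplies the negative exponent.
    return np.fft.fftshift(np.fft.fft2(np.fft.ifftshift(A))) / (nx * ny * dx * dy)

n, dx = 128, 0.1                         # assumed grid, lengths in wavelengths
x = (np.arange(n) - n // 2) * dx
X, Y = np.meshgrid(x, x, indexing="ij")
a = b = 5.0                              # assumed aperture size
E = ((np.abs(X) <= a / 2) & (np.abs(Y) <= b / 2)).astype(complex)

kx, ky, A = pws_from_field(E, dx, dx)
E_back = field_from_pws(A, dx, dx)       # round trip recovers the samples
```

The sample at kx = ky = 0 equals the discrete integral of the field over the plane, which gives a quick check on the scale factor.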
As an example, let us assume an electric field distribution with the form
ˆ 0 ( y)rect( x / )
E( x, y,0) xE (1.22)
ky
A(k x , k y ) xˆE0 sin
sincc (1.23)
2
where the sinc() function is defined as sinc(x) ≡ sin(x)/x. Note that with the PWS
fully known, we can go back and retrieve the electric field at z = 0 using the
inverse Fourier transform in (1.21a).
Another interesting case to consider with (1.17) is an observation plane at a
nonzero z value. If we consider the observation points on a plane defined by z = z0,
then we obtain
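$$\mathbf{E}(x, y, z_0) = \frac{1}{4\pi^2}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} \mathbf{A}(k_x, k_y)\, e^{-jk_z z_0}\, e^{-jk_x x - jk_y y}\, dk_x\, dk_y \tag{1.24}$$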
In this equation, it is important to note that both the $\mathbf{A}(k_x, k_y)$ and the $e^{-jk_z z_0}$ terms
are functions of kx and ky. If we group the two terms into a single modified PWS,
$\mathbf{A}(k_x, k_y)\, e^{-jk_z z_0}$, the equation becomes
$$\mathbf{E}(x, y, z_0) = \mathcal{F}^{-1}\{\mathbf{A}(k_x, k_y)\, e^{-jk_z z_0}\}$$
which means that the electric field in another plane can be computed through a
Fourier transform of the modified PWS. Both sides of the
equation are also vectors, which means that the x, y, and z components of the
electric field can be obtained from the x, y, and z components of the Fourier
transform of the modified PWS.
The resulting equation clearly shows that the electric field at any point in
space can be found assuming that the PWS is already known. While there are a few
methods to obtain this quantity, we will show how to retrieve the PWS from the
far-field patterns in a later section of this chapter. Thus, with the PWS already
known from the far-field patterns, the near-field electric field at any point in space
near the antenna can be found. This is an important consequence of the PWS.
Another important point to consider is that only planes of constant z were discussed;
however, this treatment can be extended to planes of constant x or y as well as
planes tilted at some arbitrary angle.
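The plane-to-plane relation in (1.24) can be sketched as three array operations: transform to the spectral domain, multiply by $e^{-jk_z z_0}$, and transform back. The code below is an illustrative sketch under stated assumptions and not the authors' implementation: the function name `propagate_plane`, the lossless medium, the +z branch of kz, and the on-grid plane-wave test case are all assumptions. Because the FFT treats the sampled plane as periodic, a practical calculation would also need enough zero padding to keep wrap-around contributions small.

```python
import numpy as np

def propagate_plane(E_plane0, dx, dy, k, z0):
    """Field component on z = z0 from its samples on z = 0, following (1.24)."""
    nx, ny = E_plane0.shape
    kx = 2 * np.pi * np.fft.fftfreq(nx, d=dx)[:, None]
    ky = 2 * np.pi * np.fft.fftfreq(ny, d=dy)[None, :]
    # Real kz in the visible region, -j*alpha in the invisible region.
    kz = np.conj(np.sqrt((k**2 - kx**2 - ky**2).astype(complex)))
    # ifft2 has the positive exponent of (1.20); its scale factor cancels
    # against the one in fft2, which has the negative exponent of (1.24).
    A = np.fft.ifft2(E_plane0)
    return np.fft.fft2(A * np.exp(-1j * kz * z0))

wavelength = 1.0
k = 2 * np.pi / wavelength
n, dx = 64, 0.25 * wavelength
x = np.arange(n) * dx
kx0 = 2 * np.pi * 4 / (n * dx)           # spatial frequency that lies on the grid
E_plane0 = np.exp(-1j * kx0 * x)[:, None] * np.ones((1, n))

z0 = 3.0 * wavelength
E_at_z0 = propagate_plane(E_plane0, dx, dx, k, z0)
# A single plane wave only acquires the phase exp(-j*kz*z0).
expected = E_plane0 * np.exp(-1j * np.sqrt(k**2 - kx0**2) * z0)
```

The same function applies to each Cartesian component of the field in turn, since (1.24) is a vector relation.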
Previously, it was shown that the electric field in a plane could be related to the
PWS by the Fourier transform and vice versa. The treatment was generalized such
that any point in space could be obtained assuming that the point was located on
the plane z = z0. Thus, it stands to reason that one could obtain the far field from
the PWS as well. In fact, the PWS represents the field strength devoted to a plane
wave in the given k direction. It then becomes intuitive that in the far field (i.e.,
r → ∞) the only radiation that will be received in the (θ, φ) direction is the plane
wave component traveling in the $\hat{\mathbf{r}} = \mathbf{k}/k$ direction as shown in Figure 1.5. In
particular, the plane wave traveling in the r̂ direction is associated with the kx and
ky propagation constants, where
$$k_x = k \sin\theta \cos\phi \tag{1.27a}$$
$$k_y = k \sin\theta \sin\phi \tag{1.27b}$$
$$k_z = k \cos\theta \tag{1.27c}$$
Figure 1.5 When computing the PWS from the near fields at z = 0, we are decomposing the fields
into the plane wave components propagating in the direction specified by kx and ky. In the
far-field, the only radiation that will reach the point defined by (r, θ, φ) is the plane wave
component of the PWS traveling in the r̂ direction, where $\hat{\mathbf{r}} = \mathbf{k}/k$, or $k_x = k\sin\theta\cos\phi$
and $k_y = k\sin\theta\sin\phi$.
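The mapping between an observation direction and the associated spectral point is simple enough to write out directly. The helper names `angles_to_k` and `k_to_angles` below are hypothetical, and the inverse map is a sketch that assumes a real direction in the +z half-space (0 ≤ θ ≤ π/2).

```python
import math

def angles_to_k(theta, phi, k):
    """Wave-vector components for the direction (theta, phi), as in (1.27)."""
    return (k * math.sin(theta) * math.cos(phi),
            k * math.sin(theta) * math.sin(phi),
            k * math.cos(theta))

def k_to_angles(kx, ky, k):
    """Direction (theta, phi) for a spectral point with kx^2 + ky^2 <= k^2."""
    k_rho = math.hypot(kx, ky)
    if k_rho > k:
        # No real angle corresponds to such a spectral point.
        raise ValueError("kx^2 + ky^2 > k^2: no real (theta, phi) exists")
    return math.asin(k_rho / k), math.atan2(ky, kx)

k = 2 * math.pi                          # wavelength of 1, arbitrary units
kx, ky, kz = angles_to_k(0.4, 1.1, k)    # angles in radians
theta, phi = k_to_angles(kx, ky, k)      # recovers (0.4, 1.1)
```

This inverse map is what ties each (kx, ky) sample on a rectangular spectral grid back to a far-field direction.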
The far-field can be related to PWS by finding the asymptotic form of the
integral shown in (1.17). Using coordinate transformations, we can rewrite this
equation as
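$$\mathbf{E}(r, \theta, \phi) = \frac{1}{4\pi^2}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} \mathbf{A}(k_x, k_y)\, e^{-jr\left(k_x \sin\theta\cos\phi + k_y \sin\theta\sin\phi + k_z \cos\theta\right)}\, dk_x\, dk_y \tag{1.28}$$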
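and assume that r → ∞ and evaluate the resulting integral. In [17], the asymptotic form
was derived using the method of stationary phase, which results in
$$\mathbf{E}(r, \theta, \phi) \approx \frac{jk\, e^{-jkr}}{2\pi r} \cos\theta\; \mathbf{A}(k_x, k_y)\Big|_{\substack{k_x = k\sin\theta\cos\phi \\ k_y = k\sin\theta\sin\phi}} \tag{1.29}$$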
This equation reveals many interesting properties about the relationship between
the far-field radiation and the PWS. First, the only spectral component that
contributes to the far-field radiation in the direction towards (θ, φ) is the
component corresponding to the direction of propagation k fully defined by kx and
ky. An important aspect of the equation is the scaling factor jk/2π, which must be
included if the proper magnitudes are to be obtained in the near field. Interestingly,
this factor has a factor of 2 embedded in comparison to the scaling factors of
vector potentials [5]. This is analogous to utilizing a perfect magnetic conductor
(PMC) sheet and doubling the magnetic current sources in order to work with the
electric field [14]. If both electric and magnetic fields are taken into account, then
the familiar 1/4π would be observed in the equation.
Another interesting artifact of this equation is that only the θ and φ
components will exist in the far field. This is a well-known result and has been
proven through vector potential analysis [14]. It can also be shown by first
remembering that the far field is a source-free region and writing
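$$\nabla \cdot \mathbf{E} = \frac{1}{4\pi^2}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} \nabla \cdot \left[\mathbf{A}(k_x, k_y)\, e^{-j\mathbf{k}\cdot\mathbf{r}}\right] dk_x\, dk_y = 0 \tag{1.30}$$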
where one can argue that $\mathbf{k} \cdot \mathbf{A}(k_x, k_y) = 0$ to satisfy the source-free condition for
all positions r in the source-free region. In the far field, the propagation constant
vector is $\mathbf{k} = k\hat{\mathbf{r}}$, which means that $\hat{\mathbf{r}} \cdot \mathbf{E} = 0$ according to the above equations.
This is in agreement with previous results and is intuitive since the far fields are
considered local plane waves with no electric field component in the direction of
propagation.
We can also use this property for the PWS to provide a direct formula for the
$E_\theta$ and $E_\phi$ components in terms of the PWS components. Since it is common to
have knowledge of only two components of the PWS, such as $A_x$ and $A_y$, one
should first write the PWS in terms of all its components, by
$$\mathbf{A} = \hat{\mathbf{x}} A_x + \hat{\mathbf{y}} A_y - \hat{\mathbf{z}}\, \frac{k_x A_x + k_y A_y}{k_z} \tag{1.32}$$
which provides the full PWS given two components. When using the spectral
analysis for constant z-planes, typically Ax and Ay are the components that are
known. Note however that the theory is not limited to only this case, and other
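coordinate system configurations can be considered. Next, one can write the $E_\theta$
and $E_\phi$ components in terms of the components of A as
$$\mathbf{E}(r, \theta, \phi) \approx \frac{jk\, e^{-jkr}}{2\pi r}\Big[ A_x(k_x, k_y)\left(\hat{\boldsymbol{\theta}} \cos\phi - \hat{\boldsymbol{\phi}} \cos\theta \sin\phi\right) + A_y(k_x, k_y)\left(\hat{\boldsymbol{\theta}} \sin\phi + \hat{\boldsymbol{\phi}} \cos\theta \cos\phi\right) \Big]_{\substack{k_x = k\sin\theta\cos\phi \\ k_y = k\sin\theta\sin\phi}} \tag{1.33}$$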
which provides a direct link between the far-field components $E_\theta$ and $E_\phi$ and the
PWS. This is quite useful and we will utilize these relationships to take data from
the far-field to find the PWS in the following sections.
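As a numerical illustration of (1.32) and (1.33), the sketch below evaluates the far-field components from two PWS components and cross-checks them against the vector form (1.29). The function names `far_field_from_pws` and `az_from_ax_ay` and the sample values are assumptions introduced for the example only.

```python
import numpy as np

def az_from_ax_ay(Ax, Ay, kx, ky, kz):
    """z component of the PWS that enforces k . A = 0, as in (1.32)."""
    return -(kx * Ax + ky * Ay) / kz

def far_field_from_pws(Ax, Ay, theta, phi, k, r):
    """E_theta and E_phi from (1.33).

    Ax and Ay are the PWS values at kx = k sin(theta) cos(phi),
    ky = k sin(theta) sin(phi).
    """
    scale = 1j * k * np.exp(-1j * k * r) / (2 * np.pi * r)
    E_theta = scale * (Ax * np.cos(phi) + Ay * np.sin(phi))
    E_phi = scale * np.cos(theta) * (Ay * np.cos(phi) - Ax * np.sin(phi))
    return E_theta, E_phi

# Assumed sample values (lengths in wavelengths, angles in radians).
theta, phi, k, r = 0.7, 0.3, 2 * np.pi, 100.0
Ax, Ay = 1 + 2j, 0.5 - 1j
kx = k * np.sin(theta) * np.cos(phi)
ky = k * np.sin(theta) * np.sin(phi)
kz = k * np.cos(theta)

Az = az_from_ax_ay(Ax, Ay, kx, ky, kz)
E_theta, E_phi = far_field_from_pws(Ax, Ay, theta, phi, k, r)

# Cross-check: project cos(theta) * A from (1.29) onto the spherical unit vectors.
A = np.array([Ax, Ay, Az])
theta_hat = np.array([np.cos(theta) * np.cos(phi), np.cos(theta) * np.sin(phi), -np.sin(theta)])
phi_hat = np.array([-np.sin(phi), np.cos(phi), 0.0])
scale_129 = 1j * k * np.exp(-1j * k * r) / (2 * np.pi * r) * np.cos(theta)
```

Solving the same two linear relations for $A_x$ and $A_y$ gives the route from measured or simulated $E_\theta$ and $E_\phi$ back to the PWS.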
A keen eye should note that only certain spectral components actually contribute to
the far-field. If kx, ky, and kz satisfy the conditions in (1.27a)–(1.27c), then it
becomes impossible to achieve propagation constant values outside the region
$k_x^2 + k_y^2 \le k^2$ unless we use complex values for (θ, φ). This region has often been
designated by the electromagnetics community as the visible region of the PWS.
The other region satisfying the criterion $k_x^2 + k_y^2 > k^2$ has been referred to as the
invisible region. Both regions are depicted in Figure 1.6 in the spectral domain,
that is, in terms of kx and ky.
The idea behind this terminology is that only the components within the
visible region are observable to an object or receiver in the far field of the antenna.
The components outside of this region decay rapidly as the distance from the
antenna increases due to the imaginary value of kz. Thus, any far-field data will
only contain the contributions from the visible region and the evanescent waves are
invisible to the observer in the far field. Because of this, it becomes difficult to
gain knowledge of the evanescent waves from the far fields. In response, the
components of the PWS in the invisible region are often approximated as zero.
While this implies that one can only gain partial knowledge of the PWS, the
contributions from the evanescent waves are negligible for many practical cases
for antenna engineers. Specifically, electrically large antennas such as reflectors,
arrays, and large horn antennas will excite few evanescent waves since the
spectral content is packed more densely into the visible region. This is analogous
to the inverse relationship of the bandwidth and time extent of signals. As the
antenna size becomes larger, the spectral bandwidth becomes smaller. The time-
frequency analogy is that when any signal is stretched to a longer length of time,
the bandwidth decreases, based on the scaling property of Fourier transforms.
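On a sampled spectral grid, the approximation of setting the invisible-region PWS to zero amounts to applying a circular mask. A minimal sketch follows; the names `visible_mask` and `truncate_to_visible` and the small 5 × 5 grid are assumptions for illustration.

```python
import numpy as np

def visible_mask(kx, ky, k):
    """Boolean grid that is True where kx^2 + ky^2 <= k^2 (visible region)."""
    return kx[:, None]**2 + ky[None, :]**2 <= k**2

def truncate_to_visible(A, kx, ky, k):
    """Copy of the PWS samples with the invisible region set to zero."""
    return np.where(visible_mask(kx, ky, k), A, 0.0)

k = 2 * np.pi
kx = ky = k * np.array([-1.5, -0.5, 0.0, 0.5, 1.5])   # assumed sample points
A = np.ones((5, 5), dtype=complex)                    # placeholder PWS samples
A_visible = truncate_to_visible(A, kx, ky, k)
```

Only the nine samples with |kx| and |ky| no larger than 0.5k survive in this small example, since every sample at 1.5k lies outside the circle of radius k.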
Figure 1.6 Illustration of the invisible and visible spectral components of the PWS in the spectral
domain. Only the spectral components within the visible region contribute to the far-field
region, and are observable to a user in the far-field region. Components (or energy) in the
invisible region represent evanescent waves that decay to zero in the far field.
$$\mathbf{E}(x, y, 0) = \hat{\mathbf{x}} E_0\, \mathrm{rect}\!\left(\frac{x}{a}\right) \mathrm{rect}\!\left(\frac{y}{b}\right) \tag{1.34}$$
Note that the electric fields go immediately to zero outside the region where
$-a/2 \le x \le a/2$ and $-b/2 \le y \le b/2$, resulting in a sharp transition and
ultimately higher spectral content. Others often describe the electric fields in a
plane above the antenna as the aperture distribution, and one can consider a and b
as the lengths of the physical antenna size. It is quite difficult to obtain an aperture
distribution of this form, and usually there will be some transition to zero near the
edge of the antenna. Using the Fourier transform relationship, we can find that the
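PWS has the form
$$\mathbf{A}(k_x, k_y) = \hat{\mathbf{x}}\, a b E_0\, \mathrm{sinc}\!\left(\frac{k_x a}{2}\right) \mathrm{sinc}\!\left(\frac{k_y b}{2}\right) \tag{1.35}$$
$$\mathbf{E}(x, y, 0) = \hat{\mathbf{x}} E_0 \left(1 - \frac{2|x|}{a}\right)\left(1 - \frac{2|y|}{b}\right) \tag{1.36}$$
and the electric fields are zero outside the region $-a/2 \le x \le a/2$ and
$-b/2 \le y \le b/2$. This tapering provides continuity at the edges of the aperture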
and ensures that there are no sharp discontinuities. The result is the PWS given by
Figure 1.7 (a) Normalized electric field distribution at z = 0 for a rectangular pulse distribution with a
size of 5λ × 5λ. (b) Normalized magnitude of the PWS of the electric field in (a). (c)
Normalized electric field distribution at z = 0 for a rectangular pulse distribution with a
size of 10λ × 10λ. (d) Normalized magnitude of the PWS of the electric field in (c). The
black circles in the PWS plots illustrate the boundary of the visible and invisible regions.
$$\mathbf{A}(k_x, k_y) = \hat{\mathbf{x}} E_0\, \frac{ab}{4}\, \mathrm{sinc}^2\!\left(\frac{k_x a}{4}\right) \mathrm{sinc}^2\!\left(\frac{k_y b}{4}\right) \tag{1.37}$$
The electric field distribution and corresponding PWS for an antenna of size 5λ ×
5λ and 10λ × 10λ are shown in Figure 1.8. Comparing the plots in this figure
reveals that the use of the triangular distribution can significantly remove the
higher frequency components in the spectral domain. Again, the black circles
denote the boundary between the visible and invisible regions. For both the small
and large distributions, the most significant spectral content is found in the visible
regions since there are no sharp discontinuities. The formulas provided for the
PWS of the square versus triangular pulses also agree with these observations. The
PWS envelope of the square pulse decays as $1/(k_x k_y)$ whereas the triangular pulse
PWS decays as $1/(k_x k_y)^2$, resulting in a significant decrease of high frequency
spectral components.
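The closed-form spectra of the uniform and triangular apertures can be checked against a direct numerical evaluation of the integral in (1.20). The sketch below does this for the x component; the function names, the midpoint-rule quadrature, and the sample values of kx, ky, a, and b are assumptions made for illustration. Note that `np.sinc(x)` is defined as sin(πx)/(πx), so a small wrapper is used to match the sin(x)/x definition adopted in the text.

```python
import numpy as np

def sinc(x):
    """sin(x)/x as defined in the text."""
    return np.sinc(np.asarray(x) / np.pi)

def pws_uniform(kx, ky, a, b, E0=1.0):
    """x component of the PWS of a uniform a-by-b aperture."""
    return E0 * a * b * sinc(kx * a / 2) * sinc(ky * b / 2)

def pws_triangular(kx, ky, a, b, E0=1.0):
    """x component of the PWS of a triangularly tapered a-by-b aperture."""
    return E0 * (a * b / 4) * sinc(kx * a / 4)**2 * sinc(ky * b / 4)**2

def uniform_profile(u, width):
    return np.ones_like(u)

def triangular_profile(u, width):
    return 1 - 2 * np.abs(u) / width

def pws_numeric(profile, kx, ky, a, b, n=4000):
    """Midpoint-rule evaluation of (1.20) for a separable aperture field."""
    x = (np.arange(n) + 0.5) * a / n - a / 2
    y = (np.arange(n) + 0.5) * b / n - b / 2
    ix = np.sum(profile(x, a) * np.exp(1j * kx * x)) * a / n
    iy = np.sum(profile(y, b) * np.exp(1j * ky * y)) * b / n
    return ix * iy

a = b = 5.0                # aperture size in wavelengths (assumed)
kx, ky = 1.3, -0.7         # arbitrary spectral sample point
A_uni = pws_numeric(uniform_profile, kx, ky, a, b)
A_tri = pws_numeric(triangular_profile, kx, ky, a, b)
```

At kx = ky = 0 the triangular spectrum reduces to E0·ab/4, the integral of the tapered field over the aperture, which is a convenient check on the leading factor.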
Figure 1.8 (a) Normalized electric field distribution at z = 0 for a triangular pulse distribution with a
size of 5λ × 5λ. (b) Normalized magnitude of the PWS of the electric field in (a). (c)
Normalized electric field distribution at z = 0 for a triangular pulse distribution with a size
of 10λ × 10λ. (d) Normalized magnitude of the PWS of the electric field in (c).
Another important feature of the invisible versus visible regions is the power
associated with the evanescent waves and the radiated power. This is of interest in
the near-field problem in order to ensure that the correct electric field values are
being obtained and make sense physically. Parseval’s theorem can be utilized to
relate the power in the electric field distribution at z=0 to the power in the PWS by
$$\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} \left|\mathbf{E}(x, y, 0)\right|^2 dx\, dy = \frac{1}{4\pi^2}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} \left|\mathbf{A}(k_x, k_y)\right|^2 dk_x\, dk_y \tag{1.38}$$
which relates the power in the spatial domain to the power in the spectral domain.
For large antennas, the left side of the equation has been widely used to
approximate the radiated power from an antenna, since once the distribution is
known it is then usually straightforward to integrate. The right side integral in the
spectral domain can be split into two integrals over the visible and invisible
regions by
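$$\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} \left|\mathbf{A}(k_x, k_y)\right|^2 dk_x\, dk_y = I_{\text{visible}} + I_{\text{invisible}} = \iint_{k_x^2 + k_y^2 \le k^2} \left|\mathbf{A}(k_x, k_y)\right|^2 dk_x\, dk_y + \iint_{k_x^2 + k_y^2 > k^2} \left|\mathbf{A}(k_x, k_y)\right|^2 dk_x\, dk_y \tag{1.39}$$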
where the first term corresponds to the visible region while the second term
corresponds to the invisible region. This is one place where the presence of
evanescent waves can make a notable difference if present. If a significant portion
of power gets transferred into the evanescent waves, that is, Iinvisible is on the same
order as Ivisible, then the aperture plane wave approximation may not be an accurate
one. In that case, the radiated power might be better computed directly through the
radiated far-field patterns.
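However, when the antennas are electrically large (with respect to λ), then one
can make the approximation that
$$\iint_{-\infty}^{\infty} \left|\mathbf{A}(k_x, k_y)\right|^2 dk_x\,dk_y \approx \iint_{k_x^2 + k_y^2 \le k^2} \left|\mathbf{A}(k_x, k_y)\right|^2 dk_x\,dk_y \tag{1.40}$$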
since most of the power is within the visible region as illustrated in Figures 1.7 and
1.8. This ultimately implies that a good approximation of the radiated power is
$$P_{rad} \approx \iint \frac{|E_x|^2 + |E_y|^2}{2\eta}\,dx\,dy \approx \frac{1}{4\pi^2} \iint_{k_x^2 + k_y^2 \le k^2} \frac{|A_x|^2 + |A_y|^2}{2\eta}\,dk_x\,dk_y \tag{1.41}$$
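The split into visible and invisible power can be checked numerically on a sampled aperture. The following is a minimal sketch, not the authors' code: the function names are hypothetical, the triangular distribution is generated with `np.bartlett`, and the FFT grid only covers |kx|, |ky| ≤ π/Δx, so any invisible power beyond that band is not counted.

```python
import numpy as np

def triangular_aperture(width, dx, n_grid):
    """Separable triangular field of the given width, zero-padded to n_grid x n_grid."""
    n_taper = int(round(width / dx)) + 1
    taper = np.bartlett(n_taper)            # triangular pulse, zero at both edges
    field = np.zeros((n_grid, n_grid))
    field[:n_taper, :n_taper] = np.outer(taper, taper)
    return field

def visible_power_fraction(field, dx, wavelength):
    """Return I_visible / (I_visible + I_invisible) for a sampled aperture field."""
    k = 2.0 * np.pi / wavelength
    spectrum = np.fft.fft2(field)           # PWS samples up to a constant factor
    kx = 2.0 * np.pi * np.fft.fftfreq(field.shape[1], d=dx)
    ky = 2.0 * np.pi * np.fft.fftfreq(field.shape[0], d=dx)
    kx_grid, ky_grid = np.meshgrid(kx, ky)
    power = np.abs(spectrum) ** 2
    visible = kx_grid**2 + ky_grid**2 <= k**2
    return power[visible].sum() / power.sum()

wavelength = 1.0                            # all lengths in units of the wavelength
dx = wavelength / 8.0
large = visible_power_fraction(triangular_aperture(5.0, dx, 256), dx, wavelength)
small = visible_power_fraction(triangular_aperture(0.5, dx, 256), dx, wavelength)
print(f"5-wavelength aperture:   {large:.4f}")
print(f"0.5-wavelength aperture: {small:.4f}")
```

For the electrically large aperture nearly all of the spectral power falls inside the circle of radius k, whereas the subwavelength aperture places most of its power in the invisible region.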
The last important point to realize is that the (x, y, z) space acts as a filter on
the PWS: propagation suppresses the evanescent waves (i.e., the higher spatial-
frequency components of the PWS). The end result is that data captured in the far
field cannot sense the evanescent waves, because space behaves there as an ideal
bandpass filter that passes only the visible region. This can best be understood by
revisiting the formula to find the electric field at any point in space as
$$\mathbf{E}(x, y, z_0) = \frac{1}{4\pi^2} \iint_{-\infty}^{\infty} \mathbf{A}(k_x, k_y)\, e^{-jk_z z_0}\, e^{-j(k_x x + k_y y)}\,dk_x\,dk_y \tag{1.42}$$
where A is the PWS, which can be computed using electric field data from another
plane or may be provided by other means. The important factor in this equation is
the exponential term $e^{-jk_z z_0}$. This term must be included, and its role well
understood, when computing the Fourier transform to obtain the electric field E.
It can be written in terms of kx and ky as
$$e^{-jk_z z_0} = e^{-jz_0\sqrt{k^2 - k_x^2 - k_y^2}} \tag{1.43}$$
A closer examination of this term reveals that its magnitude remains at unity
within the visible region, where kz is real. In the invisible region, however, kz
becomes purely imaginary, and the term decays rapidly for large positive values of
z0 if we assume a coordinate system with the majority of the waves traveling in
the +z-direction.
Figure 1.9 Illustration of the factor $e^{-jk_z z_0}$ and its role as a filter to remove the higher frequency
components in the spectral domain. As z0 increases, the factor becomes more like an ideal
bandpass filter, removing the evanescent waves. Note that this plot provides values
assuming that ky = 0.
If we plot the magnitude of the $e^{-jk_z z_0}$ factor against kx with ky = 0, we obtain
the result shown in Figure 1.9 for several values of z0. Effectively, the rolloff
becomes faster as z0 increases. Once in the far field (i.e., as z0 becomes infinite),
this factor acts as an ideal bandpass filter, so that only the visible spectrum is
observed by the far-field observer. This explains why the invisible components
are mathematically inaccessible from far-field data.
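The filtering behavior can be reproduced with a few lines of code. This is a small sketch under the sign convention assumed above (fields propagate as exp(−j kz z0), so kz is taken as −jα in the invisible region); the function name is hypothetical.

```python
import cmath
import math

def propagation_factor(kx, ky, z0, k):
    """Return exp(-j*kz*z0) for one spectral sample (kx, ky)."""
    # cmath.sqrt gives +j*alpha for a negative argument; the conjugate selects
    # kz = -j*alpha so that the factor decays (rather than grows) for z0 > 0.
    kz = cmath.sqrt(k * k - kx * kx - ky * ky).conjugate()
    return cmath.exp(-1j * kz * z0)

k = 2.0 * math.pi                      # wavelength normalized to 1
for z0 in (0.0, 0.25, 1.0, 5.0):
    visible = abs(propagation_factor(0.5 * k, 0.0, z0, k))
    invisible = abs(propagation_factor(1.5 * k, 0.0, z0, k))
    print(f"z0 = {z0:4.2f} wavelengths: |factor| = {visible:.3f} at kx = 0.5k, "
          f"{invisible:.3e} at kx = 1.5k")
```

The visible sample keeps unit magnitude at every distance, while the invisible sample is attenuated exponentially with z0.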
Previously the relationship between the PWS and the electric fields in different
planes has been derived. The PWS can be also used to compute the far-field
distribution using asymptotic expansion. From this relationship, it stands to reason
that one could approximate the PWS through the use of the far-field data. This is
quite useful since most antenna engineers have some information about the far
field of the antenna. Thus, our next goal is to use the far-field data to
approximate the PWS distribution. The correct scaling must be used to reflect the
radiated power for the antenna system, which is a parameter that is assumed to be
known through the input power, impedance matching, and antenna radiation
efficiency. Once the proper scaling has been realized, the asymptotic relationship
between the far fields and the PWS can be utilized.
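The radiated power Prad follows from the radiation efficiency εr as
$$P_{rad} = \epsilon_r P_{acc} \tag{1.44}$$
where Pacc is the accepted power that enters the antenna. The radiation efficiency
represents the ohmic and dielectric losses in the antenna when written in this
manner. The accepted power can be related to the impedance matching
performance and the input power as
$$P_{acc} = P_{in}\left(1 - |\Gamma|^2\right) \tag{1.45}$$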
where Pin is the input power and Γ is the reflection coefficient. The reflection
coefficient is the ratio of the voltage wave reflected at the antenna port to the
incident voltage wave, and |Γ|² is the fraction of the incident power that is
reflected. The reflection coefficient can be found through
$$\Gamma = \frac{Z_{in} - Z_0}{Z_{in} + Z_0} \tag{1.46}$$
where Zin is the input impedance of the antenna and Z0 is the characteristic
impedance of the transmission line feeding the antenna [18]. With this in mind, the
radiated power Prad can be computed from the input power Pin through
$$P_{rad} = P_{in}\,\epsilon_r\left(1 - |\Gamma|^2\right) \tag{1.47}$$
where εr denotes the radiation efficiency. This shows that Prad can never exceed
Pin, assuming no amplification is implemented at the antenna level. Usually the
radiation efficiency and reflection coefficient are known to the antenna engineer.
If not, a reasonable approximation is that the antenna is 100% efficient and
minimal reflection occurs (i.e., εr ≈ 1 and Γ ≈ 0).
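As a quick numerical illustration of (1.45) through (1.47), the sketch below computes the radiated power from the input power, the port impedances, and the radiation efficiency. The function name, argument names, and the 50 Ω default are illustrative assumptions.

```python
def radiated_power(p_in, z_in, z0=50.0, efficiency=1.0):
    """Radiated power from input power via mismatch and radiation efficiency."""
    gamma = (z_in - z0) / (z_in + z0)           # reflection coefficient, (1.46)
    p_acc = p_in * (1.0 - abs(gamma) ** 2)      # accepted power, (1.45)
    return efficiency * p_acc                   # radiated power, (1.47)

# A 100-ohm antenna on a 50-ohm line with 90% radiation efficiency:
print(radiated_power(1.0, 100.0, z0=50.0, efficiency=0.9))
# A perfectly matched, lossless antenna radiates all of the input power:
print(radiated_power(1.0, 50.0))
```

Complex input impedances can be passed directly, since only |Γ|² enters the result.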
Now that the power radiated from the system level can be found, we move to
relate Prad to the far fields for proper scaling. In order to remain general in the
derivation, we assume that the antenna radiates two orthogonal polarizations
defined by the â1 and â2 directions. These unit vectors can be the right-hand
circular polarization (RHCP), left-hand circular polarization (LHCP), spherical, or
Ludwig’s polarization vectors, and it is important to remember that the polarization
unit vectors depend on the angle (θ, φ). These polarization vectors are
depicted in Figure 1.10. Furthermore, the antenna has the radiation patterns
associated with each polarization defined as f1(θ, φ) and f2(θ, φ), corresponding to
polarizations 1 and 2. For the sake of generality, we will assume that these patterns
have no normalization. The only assumption we make is that these patterns were
found with the same radiated power. This can be done by controlling the input
power either in simulation or in measurement. Otherwise, the relationship between
f1(θ, φ) and f2(θ, φ) has no meaning.
The electric and magnetic fields can be found by using the point source
approximation in the far field. It is assumed that the angular distribution remains
fixed as r changes, but the electric field’s magnitude and phase change as if the
antenna were behaving like an isotropic source as
$$\mathbf{E}(r, \theta, \phi) = E_0 \frac{e^{-jkr}}{r}\left[\hat{a}_1 f_1(\theta, \phi) + \hat{a}_2 f_2(\theta, \phi)\right] \tag{1.48}$$
Figure 1.10 Coordinate system of the antenna under test (AUT). Note that we assume that the field
can be decomposed into two polarizations â1 and â2 that have arbitrary orientation for a
given direction (θ, φ). This generalization allows the use of θ̂, φ̂ or even Ludwig’s
definitions of copolar/cross-polar fields.
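The magnetic field can be found using the local plane wave relationship
shown in (1.11) by
$$\mathbf{H}(r, \theta, \phi) = \frac{\hat{r} \times \mathbf{E}}{\eta} = \frac{E_0 e^{-jkr}}{\eta r}\left[\hat{a}_2 f_1(\theta, \phi) - \hat{a}_1 f_2(\theta, \phi)\right] \tag{1.49}$$
It is this scaling factor E0 that remains to be found to compute the field magnitudes
in the far field.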
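The Poynting vector P describes the power density propagating in a particular
direction and can be computed by
$$\mathbf{P}(r, \theta, \phi) = \frac{1}{2}\,\mathrm{Re}\left[\mathbf{E}(r, \theta, \phi) \times \mathbf{H}^*(r, \theta, \phi)\right] \tag{1.50}$$
Substituting (1.48) and (1.49) into (1.50), we find that
$$\mathbf{P}(r, \theta, \phi) = \frac{\hat{r}}{2}\,\mathrm{Re}\left[E_1(r, \theta, \phi) H_2^*(r, \theta, \phi) - E_2(r, \theta, \phi) H_1^*(r, \theta, \phi)\right] \tag{1.51}$$
where Ei and Hi are the components of E and H along âi. The two terms are the
power densities carried by each polarization,
$$P_{r,1} = \frac{1}{2}\,\mathrm{Re}\left[E_1(r, \theta, \phi) H_2^*(r, \theta, \phi)\right] = \frac{|E_1|^2}{2\eta} \tag{1.52}$$
$$P_{r,2} = -\frac{1}{2}\,\mathrm{Re}\left[E_2(r, \theta, \phi) H_1^*(r, \theta, \phi)\right] = \frac{|E_2|^2}{2\eta} \tag{1.53}$$
so that
$$\mathbf{P}(r, \theta, \phi) = \hat{r}\left[P_{r,1} + P_{r,2}\right] \tag{1.54}$$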
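We can now compute the total radiated power by integrating the contributions of
the Poynting vector over all angles (θ, φ) by
$$P_{rad} = \int_0^{2\pi}\!\!\int_0^{\pi} \left[P_{r,1} + P_{r,2}\right] r^2 \sin\theta\,d\theta\,d\phi \tag{1.55}$$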
This is often computed when obtaining the directivity D of the antenna, which
describes how much radiation is concentrated in a particular (θ, φ) direction
compared to an antenna with equal radiation in all directions. For practicing
antenna engineers, the directivity is associated with a given polarization, and it can
be computed for the ith polarization by
$$D_i(\theta, \phi) = \frac{4\pi U_i(\theta, \phi)}{P_{rad}} \tag{1.56}$$
where Ui(θ, φ) = r²Pr,i(r, θ, φ) is the radiation intensity associated with the ith
polarization. Note that this equation can be computed for any angle (θ, φ) since this
is a ratio of the radiation intensity to the radiated power. The importance of the
ratio is brought out by rewriting (1.56) in terms of the patterns rather than the
radiated power as
$$D_i(\theta, \phi) = \frac{4\pi \left|f_i(\theta, \phi)\right|^2}{\displaystyle\int_0^{2\pi}\!\!\int_0^{\pi} \left[|f_1|^2 + |f_2|^2\right] \sin\theta\,d\theta\,d\phi} \tag{1.57}$$
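Equation (1.57) lends itself to direct numerical evaluation. The sketch below is one possible implementation, assuming the patterns are supplied as Python callables f(θ, φ) and using a simple midpoint rule; the function names and grid sizes are illustrative choices, not part of the original text.

```python
import math

def pattern_integral(f1, f2, n_theta=180, n_phi=360):
    """Midpoint-rule estimate of the denominator of (1.57)."""
    d_theta = math.pi / n_theta
    d_phi = 2.0 * math.pi / n_phi
    total = 0.0
    for i in range(n_theta):
        theta = (i + 0.5) * d_theta
        for j in range(n_phi):
            phi = (j + 0.5) * d_phi
            density = abs(f1(theta, phi)) ** 2 + abs(f2(theta, phi)) ** 2
            total += density * math.sin(theta) * d_theta * d_phi
    return total

def directivity(fi, f1, f2, theta, phi):
    """Directivity of polarization pattern fi in the direction (theta, phi)."""
    return 4.0 * math.pi * abs(fi(theta, phi)) ** 2 / pattern_integral(f1, f2)

# A sin(theta) pattern in polarization 1 only (short-dipole-like test case).
dipole = lambda theta, phi: math.sin(theta)
zero = lambda theta, phi: 0.0
print(directivity(dipole, dipole, zero, math.pi / 2.0, 0.0))
```

Because (1.57) is a ratio, any common scale factor applied to both patterns cancels, which is why unnormalized patterns can be used directly.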
This shows that one can compute the directivity at any angle with only the
knowledge of the far-field patterns. Since directivity can be computed for any
angle with this information, it remains instructive to rewrite (1.56) further in terms
of the electric fields as
$$D_i(\theta, \phi) = \frac{2\pi r^2 \left|E_i(r, \theta, \phi)\right|^2}{\eta P_{rad}} = \frac{2\pi |E_0|^2}{\eta P_{rad}} \left|f_i(\theta, \phi)\right|^2 \tag{1.58}$$
where the second equality utilizes the definition of the far-field electric field in
(1.48). This equation shows that the magnitude E0 can be found from the patterns
and the radiated power. We can rewrite this equation to find
$$E_0 = \sqrt{\frac{\eta P_{rad} D_i(\theta, \phi)}{2\pi \left|f_i(\theta, \phi)\right|^2}} \tag{1.59}$$
Most often antenna engineers have the maximum directivity D0i for the dominant
polarization on hand, which corresponds to the direction (θ, φ) of peak radiation.
The patterns are also often normalized to the dominant polarization component,
that is, $\max_{i,\theta,\phi} |f_i(\theta, \phi)| = 1$. With this in mind, we can arrive at the simplified scaling as
where f1,n(θ, φ) and f2,n(θ, φ) are the normalized far-field radiation patterns. These
scaled far-field patterns will be the patterns used to compute the near-field electric
field magnitudes.
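The scaling in (1.59) can be sketched in a few lines. This example assumes free space (η ≈ 376.73 Ω) and a pattern normalized so that its value is 1 in the direction where the directivity is quoted; the function name and defaults are hypothetical.

```python
import math

ETA0 = 376.730313668     # free-space wave impedance in ohms (assumed medium)

def far_field_amplitude(p_rad, directivity, pattern_value=1.0, eta=ETA0):
    """Amplitude E0 of (1.48) from radiated power and directivity, per (1.59)."""
    return math.sqrt(eta * p_rad * directivity
                     / (2.0 * math.pi * abs(pattern_value) ** 2))

# 1 W radiated by an antenna with 30 dBi peak directivity, observed at r = 1 km.
e0 = far_field_amplitude(1.0, 10.0 ** (30.0 / 10.0))
print(f"E0 = {e0:.2f} V, |E| at 1 km = {e0 / 1000.0:.4f} V/m")
```

A useful sanity check is the isotropic case: with D = 1, the power density E0²/(2η r²) integrated over a sphere of radius r returns the radiated power.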
Using the results from the previous section, one can find the PWS based on the
properly scaled far-field patterns. For this section we assume that the antenna’s
main beam is pointing in the hemisphere containing 0 ≤ θ ≤ 90° and that the planes
of interest are constant z planes. If other observation planes are desired then the
appropriate PWS corresponding to those planes should be computed using
coordinate transformations.
With the â1 and â2 components known, the first step in the process of
computing the PWS is to find the electric field in rectangular coordinates. This can
be accomplished through the use of a vector transformation matrix Tca, which
converts the components in the âi directions into the Cartesian vectors by
$$\mathbf{E}_c = \mathbf{T}_{ca}\,\mathbf{E}_a \tag{1.62}$$
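If the vector Ea is known in the spherical vector components then this manifests as
$$\begin{bmatrix} E_x \\ E_y \\ E_z \end{bmatrix} = \begin{bmatrix} \cos\theta\cos\phi & -\sin\phi \\ \cos\theta\sin\phi & \cos\phi \\ -\sin\theta & 0 \end{bmatrix} \begin{bmatrix} E_\theta \\ E_\phi \end{bmatrix} \tag{1.63}$$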
Once we have the Cartesian components of E, then we can compute the PWS via
the asymptotic relationship shown in (1.29). Rearranging the equation brings us to
the final relationship as
$$\mathbf{A}(k_x, k_y)\Big|_{\substack{k_x = k\sin\theta\cos\phi \\ k_y = k\sin\theta\sin\phi}} = -j2\pi\,\frac{r\,e^{jkr}}{k\cos\theta}\left[\hat{x}E_x(r, \theta, \phi) + \hat{y}E_y(r, \theta, \phi)\right] \tag{1.64}$$
Thus, we arrive at the PWS from the far fields and can retrieve the near field using
the Fourier transform. Note that only the data from the range 0 ≤ θ ≤ π/2 is
used to compute the PWS. It is interesting to point out that the factor 1/cos θ
has a singularity at θ = 90°, and in practice the radiation patterns may have finite
values at these angles, leading to infinite values in the PWS. A simple way to
overcome this is to smooth the patterns with a windowing function that forces
them to zero at θ = 90°. Using a reasonable windowing function does not
significantly change the final results.
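The conversion from one far-field sample to one PWS sample can be sketched as follows. Several choices here are assumptions rather than the authors' prescription: the exp(−jkr)/r far-field convention (which fixes the −j prefactor), the cosine-squared taper that starts at 80°, and the function name.

```python
import cmath
import math

def pws_sample_from_far_field(e_x, e_y, theta, phi, r, k, taper_start_deg=80.0):
    """Map a far-field sample (E_x, E_y) at (theta, phi) to (kx, ky, A_x, A_y)."""
    kx = k * math.sin(theta) * math.cos(phi)
    ky = k * math.sin(theta) * math.sin(phi)

    # Hypothetical window: 1 below taper_start, falling smoothly to 0 at 90 degrees.
    t0 = math.radians(taper_start_deg)
    ramp = min(max((theta - t0) / (0.5 * math.pi - t0), 0.0), 1.0)
    window = math.cos(0.5 * math.pi * ramp) ** 2

    cos_theta = math.cos(theta)
    if cos_theta < 1e-12:
        scale = 0j                      # windowed to zero at the singular angle
    else:
        scale = (-2j * math.pi * r * cmath.exp(1j * k * r) * window
                 / (k * cos_theta))
    return kx, ky, scale * e_x, scale * e_y

k = 2.0 * math.pi                       # wavelength normalized to 1
for theta_deg in (0.0, 30.0, 85.0, 90.0):
    kx, ky, a_x, a_y = pws_sample_from_far_field(
        1.0, 0.0, math.radians(theta_deg), 0.0, 100.0, k)
    print(f"theta = {theta_deg:4.1f} deg: kx/k = {kx / k:.3f}, |A_x| = {abs(a_x):.2f}")
```

Since the window vanishes quadratically as θ approaches 90° while cos θ vanishes only linearly, the windowed PWS goes smoothly to zero at the edge of the visible region.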
An important observation is that the resulting PWS from the defined
procedure only obtains the visible components due to the inability to observe the
invisible components in the far field, as discussed in Section 1.2. Thus we
approximate the PWS as zero in the invisible region, which is a reasonable
approximation for large antennas as discussed previously.
The electric field in the near fields and the PWS share a Fourier transform
relationship, providing a remarkably insightful and intuitive link between the far-
field and near-field radiation. In practice, however, the far-field data is found at
sampled intervals (Δθ, Δφ) in the far field. In order to make full use of the Fourier
transform relation for practical applications, one must modify these relationships
slightly when using sampled data. Thus, the DFT must be used to compute the
PWS from the far-field data, which comes in the form of the FFT for high-
efficiency computation. In this section we assume that a sampled version of the
far-field patterns is available to the user. With this data, we provide all the
necessary steps to obtain the near fields from the far-field data.
1.4.1 Discretizing the Plane Wave Spectrum and the Electric Field
Distribution
In many cases, only sampled data of the electric field distribution are available to
the user. We know that we can find the PWS by
$$\mathbf{A}(k_x, k_y) = e^{jk_z z_0} \iint_{-\infty}^{\infty} \mathbf{E}(x, y, z_0)\, e^{j(k_x x + k_y y)}\,dx\,dy \tag{1.65}$$
$$y_m = m\Delta y \tag{1.68}$$
which can be recognized as a 2-D DFT and computed efficiently via the FFT.
Note that the extra exponential term exp(jkz,pq z0) is included for
completeness. When the observation points (x, y, z) are at z = 0, this term
disappears from the equation. Another point to note is that the values of kx and ky
range from 0 to 2π/Δx and from 0 to 2π/Δy, respectively. Thus, the parameters kxΔx
and kyΔy are the electromagnetic analogues of a signal’s angular frequency ω.
Just as the Fourier transform is an invertible operation, the FFT operation is
also invertible but with one slight modification. Starting with the inverse
relationship
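$$\mathbf{E}(x, y, z_0) = \frac{1}{4\pi^2} \iint_{-\infty}^{\infty} \mathbf{A}(k_x, k_y)\, e^{-jk_z z_0}\, e^{-j(k_x x + k_y y)}\,dk_x\,dk_y \tag{1.72}$$
the integral can be discretized as
$$\mathbf{E}(x_n, y_m, z_0) = \frac{\Delta k_x \Delta k_y}{4\pi^2} \sum_{p=0}^{N-1} \sum_{q=0}^{M-1} \mathbf{A}_{pq}\, e^{-jk_{z,pq} z_0}\, e^{-j2\pi\left(\frac{np}{N} + \frac{mq}{M}\right)} \tag{1.73}$$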
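Interestingly, this equation does not reflect the traditional form of the inverse FFT.
Since Δkx = 2π/(NΔx) and Δky = 2π/(MΔy), we can rearrange the equation as
$$\mathbf{E}(x_n, y_m, z_0) = \frac{1}{NM\,\Delta x \Delta y} \sum_{p=0}^{N-1} \sum_{q=0}^{M-1} \mathbf{A}_{pq}\, e^{-jk_{z,pq} z_0}\, e^{-j2\pi\left(\frac{np}{N} + \frac{mq}{M}\right)} \tag{1.74}$$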
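which implies that the link via the FFT exists between the following entities
$$\mathbf{E}(x_n, y_m, z_0)\,\Delta x \Delta y \;\underset{\text{iFFT}}{\overset{\text{FFT}}{\rightleftarrows}}\; \mathbf{A}_{pq}\, e^{-jk_{z,pq} z_0} \tag{1.75}$$
Here the 1-D FFT and inverse FFT are taken as
$$F_m = \sum_{n=0}^{N-1} f_n\, e^{j2\pi\frac{mn}{N}} \tag{1.76}$$
$$f_n = \frac{1}{N} \sum_{m=0}^{N-1} F_m\, e^{-j2\pi\frac{mn}{N}} \tag{1.77}$$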
where it is again noted that the minus sign appears in the exponent for the inverse
FFT. Thus, one must utilize the provided scaling constants in order to remain in
agreement with the physical reality. This is an important feature to discuss, and
most researchers discard any constants and simply normalize to the maximum
absolute value of the resulting data matrix. The inclusion of the sampling distances
Δx and Δy in (1.75) is to preserve the physical values of the near fields and will be
discussed later.
Overall, these equations define the relationships that will apply to practical
datasets, that is, sampled data in the near field, PWS, and far fields. An underlying
point that should be highlighted in this discussion is that the sampled far fields lead
directly to the sampled values of the PWS via (1.64). A sample of the far field at
the angle (θ, φ) literally provides a sample of the PWS at the point (kx, ky) =
(k sin θ cos φ, k sin θ sin φ), which only provides data for the visible region. The FFT
must then be used to compute the near-field electric fields from the PWS.
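The scaled 2-D transform in (1.74) can be sketched with NumPy. Note the convention difference, which is an observation about NumPy rather than something stated in the text: `np.fft.fft2` uses a negative exponent and applies no 1/(NM) factor, so under the sign convention assumed here it evaluates the double sum in (1.74) directly and only the division by NMΔxΔy remains. The function name and the grid layout (rows along ky, columns along kx, in FFT ordering) are assumptions.

```python
import numpy as np

def near_field_from_pws(a_pq, dx, dy, z0, k):
    """Evaluate (1.74): field samples on the plane z = z0 from PWS samples a_pq."""
    n_y, n_x = a_pq.shape
    kx = 2.0 * np.pi * np.fft.fftfreq(n_x, d=dx)    # FFT-ordered spectral grid
    ky = 2.0 * np.pi * np.fft.fftfreq(n_y, d=dy)
    kx_grid, ky_grid = np.meshgrid(kx, ky)
    # Conjugate picks kz = -j*alpha in the invisible region (decay for z0 > 0).
    kz = np.conj(np.sqrt((k**2 - kx_grid**2 - ky_grid**2).astype(complex)))
    propagated = a_pq * np.exp(-1j * kz * z0)
    return np.fft.fft2(propagated) / (n_x * n_y * dx * dy)

k = 2.0 * np.pi                          # wavelength normalized to 1
spectrum = np.zeros((16, 16), dtype=complex)
spectrum[0, 2] = 3.0                     # single sample at kx = 0.5k, ky = 0
field = near_field_from_pws(spectrum, 0.25, 0.25, 0.7, k)
print(np.abs(field).min(), np.abs(field).max())
```

A single visible spectral sample yields a plane wave of constant magnitude A/(NMΔxΔy) on every observation plane, which is a convenient check on the scaling.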
The FFT enables the power of the Fourier transform for practical applications
where only samples are known of a given distribution. Assuming that adequate
sampling has been implemented, no information loss should occur. Yet it is not
sufficient to merely apply the FFT operation on a data set; the resulting numbers
only provide relative information. The more interesting data in the application at
hand is the absolute values of the electric field in V/m, which takes some careful
manipulation and interpretation of the formulas. Most researchers apply the FFT
and normalize to make observations on the relative field values. This comes in
many forms, but the most common results to plot are the field values relative to the
maximum. However, in this section we will attempt to uncover the proper scaling
factors (i.e., normalization) that ensure that the units and values are sensible and
accurate when the data is computed from the FFT. Thus, the resulting data predict
the exact near field electric field values in V/m given the radiation pattern and
radiated power.
Finding the proper normalization starts by comparing the equations for the
inverse continuous Fourier transform and the inverse DFT. For simplicity, we list
out the formulas for one dimension, since the extension of the results to two
dimensions is straightforward. These can be written in order as
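$$g(x) = \mathcal{F}^{-1}\left\{G(k_x)\right\} = \frac{1}{2\pi} \int_{-\infty}^{\infty} G(k_x)\, e^{-jk_x x}\,dk_x \tag{1.78}$$
$$h_n = \mathcal{F}^{-1}\left\{G_m\right\} = \frac{1}{N} \sum_{m} G_m\, e^{-j2\pi\frac{mn}{N}} \tag{1.79}$$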
where the sign of the exponent follows the convention of (1.76) and (1.77). The
Gm represents the sampled PWS, while g and hn represent the continuous and sampled
electric fields in the near field, respectively. Note that these equations are written
in the form that would be utilized for the specific problem at hand. Given a
sampled version of the PWS, we want to find the correct near-field values using
the FFT. We denote the sampled version as hn rather than gn to make it clear that
merely applying the FFT on the data is not enough, and it will be shown later that
the values resulting from this operation are quite unreasonable. The first
observation from these equations is the lack of differential length Δx in (1.79), and
this is by definition of the DFT. The lack of the differential length is the first hint
of how the resulting output data should be normalized given the radiation patterns
and radiated power.
The second hint that can be used is Parseval’s theorem, which states that the
total power observed in the spectral domain is equal to that in the spatial domain,
that is,
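$$P = \int_{-\infty}^{\infty} \left|g(x)\right|^2 dx = \frac{1}{2\pi} \int_{-\infty}^{\infty} \left|G(k_x)\right|^2 dk_x \tag{1.80}$$
If each side of the equation is discretized, then one can arrive at the conclusion that
$$\sum_n |g_n|^2\,\Delta x = \frac{1}{2\pi} \sum_n |G_n|^2\,\Delta k_x \tag{1.81}$$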
This formula represents the physical intuition behind the conservation of power.
The left side represents the electric fields in the aperture, while the right side
represents the far-field radiation. Since power is not lost, the total sum of all power
observed in the two domains must be equal. Note the use of the variable gn for the
sampled version of the correct field values. This variable represents the exact
electric fields sampled in the near field. Interestingly, differences exist between
this result and Parseval’s theorem for the DFT, which states
$$\sum_{n=1}^{N} |h_n|^2 = \frac{1}{N} \sum_{n=1}^{N} |G_n|^2 \tag{1.82}$$
Both equations are correct and can be proven using the equations presented in this
chapter. Yet, a mysterious 1/N factor appears in (1.82) that does not appear in the
original. Equation (1.82) can be manipulated further by multiplying both sides by
Δkx = 2π/(NΔx) to reveal
$$\sum_n |h_n|^2\,\Delta k_x = \frac{2\pi}{N} \sum_n \left|\frac{h_n}{\Delta x}\right|^2 \Delta x = \frac{1}{N} \sum_n |G_n|^2\,\Delta k_x \tag{1.83}$$
which, compared with (1.81), immediately suggests that the resulting inverse FFT
(iFFT) output should be scaled such that gn = hn/Δx in order to produce the correct
values of the electric fields. This agrees with the relationship shown in (1.75). By
applying the iFFT to the samples Gn, we find the values hn = gnΔx. We then must
scale by 1/Δx in order to find the true magnitudes.
To illustrate these points, a 1-D example of the PWS will be shown. Assume
that the PWS of a given antenna is known as
$$\mathbf{A}(k_x, k_y) = \hat{x}\,2\pi A(k_x)\,\delta(k_y) = \hat{x}\,2\pi A_0\,\frac{\sin(k_x W/2)}{k_x W/2}\,\delta(k_y) \tag{1.85}$$
which reduces (1.18) down to a scalar 1-D Fourier transform relationship in terms
of A(kx) as
$$E_x(x, y, 0) = \frac{1}{2\pi} \int_{-\infty}^{\infty} A(k_x)\, e^{-jk_x x}\,dk_x \tag{1.86}$$
Evaluating this integral gives
$$E_x(x, y, 0) = \frac{A_0}{W}\,\mathrm{rect}\!\left(\frac{x}{W}\right) \tag{1.87}$$
which is simply the 1-D inverse Fourier transform of the sinc() function. For
numerical purposes, we choose A0 = 10 V, W = 20λ, and f = 8.4 GHz and plot the
results in Figure 1.11. Clearly, the magnitudes are as expected, where the peaks of
the plots in Figures 1.11(a) and 1.11(b) are 10 and 14, respectively. Thus, the
interpretation is that the PWS has either measured or simulated peak values of
10 V, while the electric field is equal to A0/W ≈ 14 V/m at z = 0 (at 8.4 GHz,
λ ≈ 3.57 cm, so W ≈ 0.714 m).
Figure 1.11 Plots of the 1-D PWS example for A0 = 10 V, W = 20λ, and f = 8.4 GHz. These plots
represent the physical reality, and the values shown here are the true values of PWS and
electric fields. (a) PWS function A(kx). (b) Resulting electric field distribution at z = 0.
These values should also be reflected in the sampled versions resulting from
the FFT. For the case with A0 = 10 V, W = 20λ, and f = 8.4 GHz, we have two
examples with two different sampling periods of the PWS. The first case is where
N = 200 and Δx = λ/4, leading to a spectral sampling period of Δkx = 0.04π/λ. This
case is plotted in Figures 1.12(a) and 1.12(b). The other case tested uses N = 300
and Δx = λ/8, leading to a spectral sampling period of Δkx = 0.0533π/λ. This case
is plotted in Figures 1.12(c) and 1.12(d). Note that the electric fields are plotted
when directly implementing the iFFT without any normalization. The resulting
electric field plots (Figures 1.12(b) and 1.12(d)) highlight several important
artifacts of the iFFT operation without normalization. The first observation is that
the magnitude of the electric field is not close to the magnitude of the electric field
given in Figure 1.11(b), which shows an electric field of 14 V/m. Furthermore, the
values are different when using different sampling rates, as seen when comparing
Figures 1.12(b) and 1.12(d). These effects are all due to the lack of normalization
when computing the iFFT (or FFT) and can only be removed with the proper
normalization.
Figure 1.12 Plots of the 1-D PWS example for A0 = 10 V, W = 20λ, and f = 8.4 GHz with different
sampling rates. (a) PWS function A(kx) for N = 200, Δx = λ/4. (b) Resulting electric field
distribution without normalization at z = 0 from iFFT operation. (c) PWS function A(kx)
for N = 300, Δx = λ/8. (d) Resulting electric field distribution without normalization at z =
0 from iFFT operation.
For the same case with A0 = 10 V, W = 20λ, and f = 8.4 GHz, we computed the
electric field using the iFFT and 1/Δx normalization and plotted the results in
Figure 1.13. Notice that both plots predict roughly the same magnitude for the
electric field. More importantly, the electric field values agree well with the exact
values based on the continuous Fourier transform. This demonstrates both the
subtlety and the significance of including the 1/Δx normalization in the iFFT/FFT
operations. Another notable characteristic is the ringing near x = ±10λ,
which can be observed for the sampled electric field distributions. This is due to
the finite truncation of the sinc() function, commonly known as the Gibbs
phenomenon, and the best way to minimize these effects is by increasing N.
Figure 1.13 Plots of the normalized 1-D electric field example for A0 = 10 V, W = 20λ, and f = 8.4
GHz with different sampling rates. (a) Resulting electric field distribution at z = 0 from
iFFT operation for N = 200, Δx = λ/4. (b) Resulting electric field distribution at z = 0
from iFFT operation for N = 300, Δx = λ/8.
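The 1-D example can be reproduced with a short NumPy sketch. It is not the authors' script: the function name is hypothetical, the spectrum is sampled on a centered grid, and `np.fft.fft` (negative exponent, no scaling) is used to evaluate the sum, so the combined factor 1/(NΔx) supplies both the 1/N of the inverse DFT in (1.79) and the 1/Δx normalization.

```python
import numpy as np

C0 = 299_792_458.0                      # speed of light, m/s

def field_from_pws_1d(pws, n, dx):
    """Sample pws(kx) on n points and return (x, E) with E in physical units."""
    dk = 2.0 * np.pi / (n * dx)         # spectral sampling period
    kx = dk * (np.arange(n) - n // 2)   # centered spectral grid
    raw = np.fft.fft(np.fft.ifftshift(pws(kx)))    # sum_p G_p exp(-j 2 pi n p / N)
    field = np.fft.fftshift(raw) / (n * dx)        # 1/N (iDFT) and 1/dx (units)
    x = dx * (np.arange(n) - n // 2)
    return x, field

wavelength = C0 / 8.4e9
a0, width = 10.0, 20.0 * wavelength
# np.sinc(t) = sin(pi t)/(pi t), so this is A0 sin(kx W/2) / (kx W/2).
pws = lambda kx: a0 * np.sinc(kx * width / (2.0 * np.pi))

for n, dx in ((200, wavelength / 4.0), (300, wavelength / 8.0)):
    x, field = field_from_pws_1d(pws, n, dx)
    print(f"N = {n}: |E(0)| = {abs(field[n // 2]):.2f} V/m, "
          f"A0/W = {a0 / width:.2f} V/m")
```

Both sampling choices return nearly the same field value at the aperture center; dropping the division by Δx would instead give values that differ between the two cases.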
An interesting question that one might ask is how finely the electric field
distribution should be sampled in order to recover the electric field exactly. The
which has three spectral components located at (kx, ky) = (kx0, 0), (kx1, 0), and (kx2,
0). To make things interesting, we can place kx0 and kx1 in the visible region and kx2
in the invisible region. With this we have the corresponding electric field
distribution
$$\mathbf{E}(x, y, z_0) = \hat{x}\left[\sum_{i=0}^{1} e^{-j\left(x k_{xi} + z_0\sqrt{k^2 - k_{xi}^2}\right)} + e^{-z_0\sqrt{k_{x2}^2 - k^2}}\, e^{-jk_{x2}x}\right] \tag{1.90}$$
We choose values of kx0 = 0, kx1 = 0.707k, and kx2 = 1.5k to illustrate the points
being made and plot the electric field distribution for several planes of interest in
Figure 1.14. Clearly, the distribution changes dramatically from
one observation plane to the next, where rapid variations in Ex are observed in the
z0 = 0 plane compared to the other planes. This is due to the evanescent component
that would be observed near the source antenna. However, the rapid variations in
the field distribution attenuate as z0 increases, and the only components that
effectively contribute to the field distribution in Figure 1.14(c) are the two
components in the visible region. This is a direct result of the passband filtering
properties of space, as discussed in Section 1.2.3.3.
We can also test the case where only the visible components are known. In this
case, the electric field distribution takes the form
$$\mathbf{E}(x, y, z_0) = \hat{x}\left[e^{-j\left(x k_{x0} + z_0\sqrt{k^2 - k_{x0}^2}\right)} + e^{-j\left(x k_{x1} + z_0\sqrt{k^2 - k_{x1}^2}\right)}\right] \tag{1.91}$$
The distribution is plotted in Figure 1.15, where clear differences can be observed
when compared to the results shown in Figure 1.14. Without the invisible spectral
components, the distribution in Figure 1.15(a) at z0 = 0 is not representative of the
true physical reality of the electric field distribution. However, as z0 increases, the
invisible components become vastly attenuated, and the fields begin to appear very
similar. Even at z0 = λ/4, the distributions share most of the major features, and at
z0 = λ the two distributions in Figures 1.14(c) and 1.15(c) are almost identical. It is
interesting to note that it only takes one wavelength for the evanescent component
to decay to a negligible level (for kx2 = 1.5k, its amplitude falls by a factor of
exp(−2π√1.25) ≈ 9 × 10⁻⁴ at z0 = λ). This is quite a small distance, and from a
practical perspective it highlights the fact that the spectral technique can be readily
applied as long as the observation plane is not too close to the antenna.
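The comparison between the full and visible-only distributions can be sketched directly from the three-component example. The function name is hypothetical, and unit amplitudes are assumed for all components, as in (1.90) and (1.91).

```python
import cmath
import math

def spectral_sum(x, z0, k, kx_values):
    """Sum of unit-amplitude spectral components evaluated at (x, z0)."""
    total = 0j
    for kx in kx_values:
        # Conjugate selects kz = -j*alpha for evanescent components (decay in +z).
        kz = cmath.sqrt(k * k - kx * kx).conjugate()
        total += cmath.exp(-1j * (kx * x + kz * z0))
    return total

k = 2.0 * math.pi                                  # wavelength normalized to 1
all_components = (0.0, 0.707 * k, 1.5 * k)
visible_only = all_components[:2]
xs = [0.01 * i for i in range(-200, 201)]          # -2 to 2 wavelengths

def max_difference(z0):
    """Largest gap between the full field and the visible-only prediction."""
    return max(abs(spectral_sum(x, z0, k, all_components)
                   - spectral_sum(x, z0, k, visible_only)) for x in xs)

for z0 in (0.0, 0.25, 1.0):
    print(f"z0 = {z0:4.2f} wavelengths: max difference = {max_difference(z0):.2e}")
```

The gap equals the magnitude of the evanescent term, which starts at 1 on the z0 = 0 plane and drops below 10⁻³ one wavelength away.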
The last point to highlight is the minimum sampling rate and the minimum
recoverable feature size in the electric field distribution. Since observations are
made in the far-field region, we are limited to only detecting the visible region of
the PWS, which implies that the maximum detectable wavenumber is
$$K = \max\left\{|k_x|, |k_y|\right\} = k = \frac{2\pi}{\lambda} \tag{1.92}$$
Figure 1.15 Predicted magnitude of Ex in (1.90) at several different observation planes with only the
knowledge of the PWS in the visible regions. In this example, the values kx0 = 0, kx1 =
0.707k, and kx2 = 1.5k are chosen. (a) z0 = 0. (b) z0 = λ/4. (c) z0 = λ.
This means that the sampling period can be no larger than
$$\Delta x, \Delta y \le \frac{\pi}{K} = \frac{\lambda}{2} \tag{1.93}$$
in order to ensure that all details are obtained and that the electric field can be
completely recovered. Note that this sampling rate only suffices if only visible
components are present. If the observation plane is near the antenna, then the
evanescent waves can create rapid oscillations that cannot be captured with this
sampling rate. The minimum recoverable feature size is closely related to this
sampling period: only features that change over a distance of λ/2 or longer are
captured by the visible region. Even if a finer resolution in x and y is used, the
fastest observable changes in the electric field distribution will occur over a length
of λ/2 when only the visible components are used to predict the near fields. If the
observation plane is near the antenna where strong evanescent fields are present,
then rapid oscillations in the electric field distributions can occur that cannot be
observed since the changes are subwavelength (< λ/2). This is an important
limitation that should be understood when using the spectral domain analysis.
In the previous section it was shown that the minimum sampling rate is Δx and Δy
< λ/2, and this can be achieved through a well-directed sampling scheme in the far
field. Based on (1.69) and (1.70), there is a relationship between the sampling rate
in the near-field spatial domain and the sampling rate in the PWS, which depends
directly on the far field. Specifically, the spectral sampling period should satisfy
$$\Delta x = \frac{2\pi}{N\Delta k_x} \le \frac{\lambda}{2} \tag{1.94}$$
$$\Delta y = \frac{2\pi}{M\Delta k_y} \le \frac{\lambda}{2} \tag{1.95}$$
leading to
$$N\Delta k_x \ge \frac{4\pi}{\lambda} = 2k \tag{1.96}$$
$$M\Delta k_y \ge \frac{4\pi}{\lambda} = 2k \tag{1.97}$$
which restates the sampling theorem from another perspective.
Another important criterion is the overall size of the antenna under
consideration. In most scenarios, it is generally desired to obtain the near-field
distribution over an observation plane whose area is larger than the physical size of
the antenna. This implies that NΔx ≥ D, where D is the largest dimension of the
antenna, leading (for Δx = λ/2) to a constraint on N as
$$N \ge \frac{2D}{\lambda} \tag{1.98}$$
and similarly for the y component. Note that when increasing N, we will increase
the factor NΔx, resulting in a smaller spectral sampling period. If we assume that N
> 2D/λ and Δx = λ/2, then the corresponding spectral sampling period is
$$\Delta k_x \le \frac{2\pi}{D} \tag{1.99}$$
This is an interesting result that is expected from antenna theory [17]. As the
largest antenna dimension D increases, the PWS (and the far field) will have
increasingly faster variations. Specifically, the beamwidth becomes narrower and
the number of observed side lobes increases. This implies that one must properly
sample the far field in order to observe all of its features in the near field. This is
depicted in Figure 1.16, where the far-field patterns of differently sized antennas
are shown. The antenna with D = 30λ requires many more sampling points
than the one with D = 5λ. The resulting PWS for the D = 30λ antenna will
have very rapid oscillations that call for a smaller sampling period to capture all of
the PWS features.
where the angles (θ1, φ1) and (θ0, φ0) represent far-field angles corresponding to
samples of the far fields. Next, we assume that the observation angle lies in the
principal x-z plane for simplicity, that is, φ = 0, and also assume that
θ1 = θ + Δθ/2 and θ0 = θ − Δθ/2. Then it can be shown that
$$\Delta k_x = 2k\cos\theta\,\sin\!\left(\frac{\Delta\theta}{2}\right) \le \frac{2\pi}{D} \tag{1.101}$$
For Δθ → 0 we can approximate this inequality using sin(x) ≈ x as
$$\Delta\theta \le \frac{\lambda}{D\cos\theta} \tag{1.102}$$
which is very similar to the diffraction limit seen in optics [20]. When θ = 0, the
inequality in (1.102) leads to an angular far-field spacing of Δθ < λ/D, which is a
good starting point for sampling the far-field data. In practice, researchers typically
use smaller sampling periods in order to ensure that the far-field patterns have
sufficient sampling.
Figure 1.16 Illustration of how larger antennas can have faster variations in the PWS and far field.
(a) Far-field magnitude pattern of an antenna with dimensions on the order of D = 30λ.
(b) Far-field magnitude pattern of an antenna with dimensions on the order of D = 5λ. Note
that for both plots.
Notice that the conditions derived in this section and the previous section
merely show how to approach a decision when it comes to the sampling periods in
both space and spectrum. It is recommended to use even higher sampling rates in
order to ensure more accurate results. A typical recommendation for sampling the
far field is to ensure that each sidelobe has a few points sampling it. A reasonable
sampling period is to sample the far field patterns with 10 points within one
sidelobe or within the half-power beamwidth, leading to the final recommended
sampling period given as
$$\Delta\theta \le \frac{6}{D/\lambda} \tag{1.103}$$
where Δθ is in degrees. This ensures that all features of the far fields get properly
incorporated into the PWS in order to compute the near fields.
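The sampling rules of this section can be collected into a small helper. This is an illustrative sketch only: the function name and dictionary keys are hypothetical, and it simply evaluates (1.98), (1.99), (1.102), and (1.103) for an antenna size given in wavelengths.

```python
import math

def sampling_guidelines(d_over_lambda, theta_deg=0.0):
    """Sampling limits for an antenna whose largest dimension is D (in wavelengths)."""
    cos_theta = math.cos(math.radians(theta_deg))
    return {
        # (1.98): points needed to span the antenna at dx = lambda/2
        "n_min": math.ceil(2.0 * d_over_lambda),
        # (1.99): largest spectral step, expressed as a fraction of k
        "dkx_max_over_k": 1.0 / d_over_lambda,
        # (1.102): largest angular step, converted from radians to degrees
        "dtheta_limit_deg": math.degrees(1.0 / (d_over_lambda * cos_theta)),
        # (1.103): recommended angular step in degrees
        "dtheta_recommended_deg": 6.0 / d_over_lambda,
    }

for size in (5.0, 30.0):
    print(size, sampling_guidelines(size))
```

For D = 30λ the recommended angular step is 0.2°, roughly an order of magnitude finer than the limit from (1.102) at broadside.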
imaginary parts are usually reasonably smooth for most data sets. However, phase
is often wrapped into the [−π, π] range by most programs, and thus discontinuities
are a typical feature of a phase distribution. Interpolating across these
discontinuities can produce inaccurate results. It is therefore generally
recommended to interpolate the real and imaginary components independently.
Accordingly, whenever we refer to interpolating the far-field data throughout the
rest of this section, it is assumed that the real and imaginary parts are interpolated
separately.
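The effect of phase wrapping on interpolation is easy to demonstrate with two samples whose phases straddle the ±π discontinuity. The sketch below uses plain linear interpolation between two points; both function names are hypothetical.

```python
import cmath
import math

def lerp(a, b, t):
    """Linear interpolation between two real numbers."""
    return a + (b - a) * t

def interp_real_imag(v0, v1, t):
    """Interpolate the real and imaginary parts independently."""
    return complex(lerp(v0.real, v1.real, t), lerp(v0.imag, v1.imag, t))

def interp_wrapped_polar(v0, v1, t):
    """Interpolate magnitude and wrapped phase (shown only for comparison)."""
    magnitude = lerp(abs(v0), abs(v1), t)
    phase = lerp(cmath.phase(v0), cmath.phase(v1), t)   # phase lies in [-pi, pi]
    return cmath.rect(magnitude, phase)

# Two unit-magnitude samples with phases of +170 and -170 degrees (20 degrees apart).
v0 = cmath.rect(1.0, math.radians(170.0))
v1 = cmath.rect(1.0, math.radians(-170.0))
print("real/imag:    ", interp_real_imag(v0, v1, 0.5))
print("wrapped polar:", interp_wrapped_polar(v0, v1, 0.5))
```

The real/imaginary interpolant lands near −1, close to the true midpoint on the unit circle, whereas the wrapped-phase interpolant averages +170° and −170° to 0° and lands at +1.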
Figure 1.17 (a) Location of the spherical grid points (θ, φ) on the spectral kx and ky domain. The X
markers indicate the location of a PWS sample given a uniformly spaced spherical grid.
(b) Rectangular grid of (kx, ky) points in the spectral domain that is needed for the FFT.
carried out faster. The most ideal is a regular grid of data in the domain of interest,
like the grid shown in Figure 1.17(b). Searching for points within a regular grid is
simplified to a small computation of the indices, which can be predicted by the
sampling periods. A curvilinear grid, like the one shown in Figure 1.17(a), is
somewhat more difficult, but can also be predicted through inverse mapping
functions. However, if the inverse functions are mathematically intractable or
challenging to compute (or if the data is on a random grid), then one may proceed
to use algorithms targeting scattered grids. However, it is highly encouraged to
spend the effort if possible in utilizing intelligent grids; a dramatic acceleration in
the interpolation process can be observed compared to schemes using scattered
grids.
One of the simplest interpolation techniques is the nearest neighbor
approximation, where the value of the point of interest is assigned the value of the
nearest neighboring point. For a given grid, this technique will likely produce the
fastest results with the least amount of memory requirements. However, a finely
meshed grid with a small maximum spacing between samples must be available, or
this technique suffers in accuracy. It will also produce a discontinuous interpolant,
which can be undesirable. This technique is useful when working under severe
memory and hardware speed constraints, but for many applications this technique
is not used.
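A minimal one-dimensional sketch of the nearest neighbor approximation on a regular grid is given below. The grid origin x0, the spacing dx, and the sample values are illustrative assumptions; the point is that the search reduces to an index computation.

```python
def nearest_neighbor(values, x0, dx, x):
    """Return the sample nearest to x on the regular grid x0 + i*dx."""
    i = int(round((x - x0) / dx))          # index predicted from the sampling period
    i = min(max(i, 0), len(values) - 1)    # clamp to the available samples
    return values[i]


samples = [10.0, 20.0, 30.0, 40.0]         # values at x = 0.0, 0.5, 1.0, 1.5
print(nearest_neighbor(samples, 0.0, 0.5, 0.70))
print(nearest_neighbor(samples, 0.0, 0.5, 0.80))
```

The two query points are close together but fall on opposite sides of a cell midpoint, so the interpolant jumps from one sample value to the next. This is the discontinuity mentioned above.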
A popular technique that provides a continuous interpolant is the bilinear
interpolation technique, which has also been referred to as the four-point bivariate
Lagrangian method [9]. One critical assumption in this technique is that the known
sampled data is on a regular (or even rectilinear) grid. First, the four points
neighboring the point of interest are identified at x11 = (x1, y1), x12 = (x1, y2), x21 =
(x2, y1), and x22 = (x2, y2), and the value at (x, y) is computed by
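\[
f(x, y) = f_{11}\frac{(x - x_2)(y - y_2)}{(x_1 - x_2)(y_1 - y_2)}
        + f_{12}\frac{(x - x_2)(y - y_1)}{(x_1 - x_2)(y_2 - y_1)}
        + f_{21}\frac{(x - x_1)(y - y_2)}{(x_2 - x_1)(y_1 - y_2)}
        + f_{22}\frac{(x - x_1)(y - y_1)}{(x_2 - x_1)(y_2 - y_1)} \qquad (1.106)
\]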
where fij = f(xi, yj). It is interesting to note that contrary to the name, the resulting
formula is actually not a linear function in x and y, since a resulting xy term can be
found when expanding this formula. The equation is quite fast to compute for each
point of interest, and rapid results can be achieved with this technique while still
maintaining good accuracy. The speed is achieved in both the search phase and the
computation phase. The only constraint is that the grid must be rectilinear, which
limits its use to only certain sets of far-field data. This can usually be accomplished
by interpolating in the (θ, φ) domain rather than the (kx, ky) domain, since it is
popular to discretize the angles in a regular grid. Other more computationally
complex algorithms exist for those interested in obtaining more accurate results.
\[ f(x, y) = \lambda_1 f(x_1, y_1) + \lambda_2 f(x_2, y_2) + \lambda_3 f(x_3, y_3) \qquad (1.107) \]
where λi are the barycentric coordinates of the triangle defined for the spatial
coordinate of interest x by
\[ \lambda_1 + \lambda_2 + \lambda_3 = 1 \qquad (1.109) \]
and xi are the three vertices of the triangle [22]. While the interpolants are fast to
compute, the triangulation can be extremely time-consuming for large data sets.
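Once the enclosing triangle is known, the linear-triangular interpolant of (1.107) is inexpensive. The sketch below assumes the usual definition of barycentric coordinates, namely that the point of interest is the λ-weighted sum of the three vertices with the weights summing to one as in (1.109); the function names are hypothetical, and the triangulation step itself is not shown.

```python
def barycentric(p, p1, p2, p3):
    """Barycentric coordinates of point p in the triangle (p1, p2, p3)."""
    (x, y), (x1, y1), (x2, y2), (x3, y3) = p, p1, p2, p3
    det = (y2 - y3) * (x1 - x3) + (x3 - x2) * (y1 - y3)
    if det == 0:
        raise ValueError("degenerate triangle")
    l1 = ((y2 - y3) * (x - x3) + (x3 - x2) * (y - y3)) / det
    l2 = ((y3 - y1) * (x - x3) + (x1 - x3) * (y - y3)) / det
    return l1, l2, 1.0 - l1 - l2   # the third weight follows from (1.109)


def interp_triangle(p, vertices, values):
    """Evaluate (1.107): weighted sum of the three vertex values."""
    weights = barycentric(p, *vertices)
    return sum(w * f for w, f in zip(weights, values))


vertices = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
values = [2.0 + 3.0 * vx - vy for vx, vy in vertices]   # samples of f = 2 + 3x - y
print(barycentric((0.2, 0.3), *vertices))
print(interp_triangle((0.2, 0.3), vertices, values))
```

A linear test function is reproduced exactly inside the triangle, which is a convenient check of an implementation.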
These interpolation algorithms have been implemented in a number of
different packages available online, including the functions built into MATLAB.
The bilinear and the cubic splines methods can be executed through the interp2
function. For scattered data points, the griddata function implements the
Delaunay triangulation followed by the linear-triangular interpolation. To highlight
the importance of intelligent grid approaches versus scattered data approaches, we
have tested the runtime when using an example of each approach. We tested this
on a sample size of 3,721 × 3,601 points, and our goal was to interpolate the data
to a set of 2,001 × 2,001 points. We applied the bilinear approach in the (θ, φ)
domain using the interp2 function in MATLAB and compared it to the linear-
triangular approach in MATLAB using the griddata function. The bilinear approach
was 336 times faster than the linear-triangular approach because of the lengthy
triangulation that the latter requires. Both
procedures produced almost identical data. Clearly, the choice of interpolation is
extremely critical in reducing the overall computational time, and one must make
an informed decision on this matter.
1.4.6 Subtle Issues When Implementing the FFT and iFFT Using Pre-Built
Packages and Libraries
One of the greatest advantages in using the FFT is the plethora of packages and
research purely devoted to making its computation faster and more efficient. Since
the advent of personal computing and the internet, many packages and libraries
have been and are being developed to perform even the most challenging of tasks,
including the FFT. Some examples are the Fastest Fourier Transform in the West
(FFTW) subroutine library for C and C++ [23] and the FFTPACK Fortran
packages [24]. MATLAB currently implements the FFTW library with their built-
in functions fft2 and ifft2 [25].
While utilizing these packages can avoid spending large amounts of time
writing code, it is important to recognize some subtle differences in the common
implementation of the FFT and the FFT mentioned in this chapter. The most
common definition of the FFT and iFFT utilizes a negative and positive sign in the
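exponent, respectively, which we will denote by
\[ G_m = \mathcal{F}_m\{g(n)\} = \sum_{n=0}^{N-1} g(n)\, e^{-j2\pi mn/N} \qquad (1.110) \]
\[ g(n) = \mathcal{F}_n^{-1}\{G_m\} = \frac{1}{N}\sum_{m=0}^{N-1} G_m\, e^{+j2\pi mn/N} \qquad (1.111) \]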
While each algorithm implements the transform differently, the bottom line is that
precoded packages have the opposite sign in the exponential compared to this
chapter’s definitions in (1.76) and (1.77). In order to circumvent this issue, one can
rearrange the terms and use some mathematical manipulations to ensure a simple
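and clear implementation. It can be easily shown that the following two operations
\[ \sum_{n=0}^{N-1} g(n)\, e^{+j2\pi mn/N} = \left[\mathcal{F}_m\{g^*(n)\}\right]^* \qquad (1.112) \]
\[ \frac{1}{N}\sum_{n=0}^{N-1} g(n)\, e^{-j2\pi mn/N} = \left[\mathcal{F}_m^{-1}\{g^*(n)\}\right]^* \qquad (1.113) \]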
are equivalent. For example, in order to implement the FFT operation defined by
(1.76) and (1.77) in MATLAB, we could use the following code
G = conj(fft2(conj(g)));
or for the iFFT operation
G = conj(ifft2(conj(g)));
where the conj function performs complex conjugation on the input matrix. This
can also easily be implemented in other languages using the built-in functions such
as CONJG in FORTRAN or the conj operator in the complex class of C++.
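The conjugation identity can be verified numerically without any FFT library. In the Python sketch below, a naive O(N²) DFT stands in for the packaged routines (negative exponent for the forward transform, positive exponent with a 1/N factor for the inverse); all function names are hypothetical, and a one-dimensional sequence is used for brevity.

```python
import cmath


def dft(g, sign):
    """Naive DFT with kernel exp(sign*j*2*pi*m*n/N) and no scaling."""
    n_pts = len(g)
    return [sum(g[n] * cmath.exp(sign * 2j * cmath.pi * m * n / n_pts)
                for n in range(n_pts))
            for m in range(n_pts)]


def package_fft(g):
    """Stand-in for a packaged FFT: negative exponent, as in (1.110)."""
    return dft(g, -1)


def package_ifft(spectrum):
    """Stand-in for a packaged iFFT: positive exponent and 1/N, as in (1.111)."""
    return [v / len(spectrum) for v in dft(spectrum, +1)]


def conj_all(seq):
    """Element-wise complex conjugate."""
    return [v.conjugate() for v in seq]


g = [1 + 2j, -0.5j, 3 + 0j, 2 - 1j]

# Positive-exponent transform obtained from the packaged FFT by conjugation.
plus_via_package = conj_all(package_fft(conj_all(g)))
plus_direct = dft(g, +1)

# Negative-exponent transform with 1/N obtained from the packaged iFFT.
minus_via_package = conj_all(package_ifft(conj_all(g)))
minus_direct = [v / len(g) for v in dft(g, -1)]

print(max(abs(a - b) for a, b in zip(plus_via_package, plus_direct)))
print(max(abs(a - b) for a, b in zip(minus_via_package, minus_direct)))
```

Both printed differences are at the level of floating-point round-off, so wrapping a packaged transform in two conjugations is sufficient to switch the sign of the exponent.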
The difference in the equations can be partially attributed to the choice of time
convention. If the physics time convention were chosen, that is, exp(−iωt), then
the opposite sign would appear in each of the exponentials, matching with the
conventional FFT/iFFT definitions. However, the engineering notation,
exp(jωt), is frequently used within the antenna engineering community, and
thus it was chosen for convenience.
The last detail to consider when implementing the FFT is the arrangement of
the spectral frequencies. In the context of time-frequency signals, each number in
the resulting output vector of the FFT corresponds to some spectral component ωn.
Since the FFT is periodic with 2π, the range of frequencies can be taken as either
(0, 2π) or (−π, π), and the choice is somewhat arbitrary. With many of the packages
available, the output of the FFT corresponds to the (0, 2π) spectral
frequencies. Unfortunately, this representation is not convenient for the near-field
applications, and some adjustments to the data must be made in order to
make full use of existing algorithms. Since A(kx + 2π/Δx, ky) = A(kx, ky + 2π/Δy) =
A(kx + 2π/Δx, ky + 2π/Δy) = A(kx, ky), a circular element shift in the data array can
be used to reorient the data in a (−π, π) range. In MATLAB, this can be
accomplished using the fftshift function, which circularly shifts the elements so
that the zero-frequency component moves to the center of the array. Conversely,
many existing iFFT function implementations must have the data provided as an
argument in the range of (0, 2π). Again, a circular shift can be used to accommodate
this requirement, and MATLAB provides the function ifftshift to accomplish the
circular shift [25].
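The circular shifts are simple to write out explicitly. The following Python sketch mimics the behavior of fftshift and ifftshift for a one-dimensional list (a shift by N//2 positions, which is an assumption about the convention that matches the common library behavior) and applies them to the signed bin indices of an N-point transform.

```python
def fftshift(seq):
    """Circularly shift so that the zero-frequency entry moves to the center."""
    k = len(seq) // 2
    return list(seq[-k:]) + list(seq[:-k]) if k else list(seq)


def ifftshift(seq):
    """Undo fftshift, returning the data to the (0, 2*pi) ordering."""
    k = len(seq) // 2
    return list(seq[k:]) + list(seq[:k])


def signed_bins(n_pts):
    """Bin m of an N-point FFT relabeled into the symmetric range."""
    return [m - n_pts if 2 * m >= n_pts else m for m in range(n_pts)]


for n_pts in (5, 6):
    bins = signed_bins(n_pts)
    print(bins, "->", fftshift(bins))
```

After the shift the bins appear in increasing order from negative to positive frequencies, and applying ifftshift restores the ordering expected by the inverse transform for both even and odd N.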
system. Note that the measurement coordinate system does not have to exclusively
refer to measurement data and coordinates. This can also represent far-field data
found from simulation that was only available for one specific coordinate system.
Since our application will perform the transformation in the far field, some
approximations and assumptions can be made in order to simplify the final
relations. First, the far-field electric field can be written as
\[ \mathbf{E}_{m,u}(r_m, v_m, w_m) = E_0\, \frac{e^{-jkr_m}}{r_m}\, \mathbf{f}(v_m, w_m) \qquad (1.114) \]
which is a general result that is evident in (1.29) and can also be proven through
vector potentials [14]. The factor E0 should be scaled according to the
normalization scheme discussed in Section 1.3. Notice that we assume that the first
measurement coordinate is um = rm in order to make the analysis more amenable to
the concept of the far-field. This will be assumed throughout the rest of the section.
Starting with the available complex vector far-field electric fields in the MCS
(Erm, Evm, Ewm), we attempt to find the complex vector electric field components in
the desired BCS (Exb, Eyb, Ezb) in order to compute the near-field prediction in the
desired planar area of interest. A systematic and intuitive approach uses the
following step by step procedure [26, 27]:
1. Convert the given field data (Erm, Evm, Ewm) defined by (r̂m, v̂m, ŵm) into their
rectangular components (Exm, Eym, Ezm).
2. Transform the rectangular field components (Exm, Eym, Ezm) into desired
coordinate system rectangular components (Exb, Eyb, Ezb) using Eulerian
angles.
3. For each given MCS location rm, vm, wm (or xm, ym, zm) associated with the
field data, compute the location with respect to the BCS.
In step 1, converting the given field data (Erm, Evm, Ewm) into the rectangular
components can be accomplished using transformation matrices. An important
assumption is that the 3-tuple (r̂m, v̂m, ŵm) forms a set of orthonormal vectors
throughout all three-dimensional space, that is, r̂m · v̂m = v̂m · ŵm = r̂m · ŵm = 0 for
all (xm, ym, zm). With this assumption, the transformation matrix can be written as
a matrix whose elements are the projection of the vector components (Erm, Evm,
Ewm) onto the rectangular directions, as
where the subscript “ru” denotes the conversion from the uvw components to the
rectangular components. With the transformation matrix at hand, the rectangular
components can be computed from
Figure 1.18 Coordinate system transformations are critical in converting the fields from one
coordinate system to another. Many times the far-field patterns are measured using a
MCS predefined by the measurement system denoted with the subscript m. The data has
to be converted using coordinate system transformations to obtain the data represented
in the far-field BCS, denoted by the subscript b.
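\[ \bar{R}_{zm} = \begin{bmatrix} \cos\alpha & \sin\alpha & 0 \\ -\sin\alpha & \cos\alpha & 0 \\ 0 & 0 & 1 \end{bmatrix} \qquad (1.120) \]
\[ \bar{R}_{x} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & \cos\beta & \sin\beta \\ 0 & -\sin\beta & \cos\beta \end{bmatrix} \qquad (1.121) \]
\[ \bar{R}_{zb} = \begin{bmatrix} \cos\gamma & \sin\gamma & 0 \\ -\sin\gamma & \cos\gamma & 0 \\ 0 & 0 & 1 \end{bmatrix} \qquad (1.122) \]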
With the transformation matrix now defined, the rectangular vector components in
the BCS can be computed as
leading to the final resulting equation to obtain the far-field vector components
with respect to the BCS
\[ \mathbf{E}_{b,r} = \bar{R}_{zb}\, \bar{R}_{x}\, \bar{R}_{zm}\, \bar{T}_{m,ru}\, \mathbf{E}_{m,u} \qquad (1.124) \]
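The product of the three rotations can be sketched in a few lines of Python. The example assumes Eulerian angles named α, β, and γ and the common passive (coordinate-system) rotation convention, in which the upper-right sine of each elementary matrix is positive; these are assumptions of this illustration, as are the function names.

```python
import math


def rot_z(angle):
    """Passive rotation of the coordinate axes about z by the given angle."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]]


def rot_x(angle):
    """Passive rotation of the coordinate axes about x by the given angle."""
    c, s = math.cos(angle), math.sin(angle)
    return [[1.0, 0.0, 0.0], [0.0, c, s], [0.0, -s, c]]


def matmul(a, b):
    """Product of two 3 x 3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def euler_matrix(alpha, beta, gamma):
    """Rotation about z by alpha, then about x by beta, then about z by gamma."""
    return matmul(rot_z(gamma), matmul(rot_x(beta), rot_z(alpha)))


def apply(matrix, vec):
    """Matrix-vector product for a 3 x 3 matrix and a 3-vector."""
    return [sum(matrix[i][k] * vec[k] for k in range(3)) for i in range(3)]


t_bm = euler_matrix(math.pi / 2, math.pi / 2, math.pi / 2)
e_b = apply(t_bm, [1.0, 2.0, 3.0])   # rectangular components (Exm, Eym, Ezm)
print([[round(v, 12) for v in row] for row in t_bm])
print([round(v, 12) for v in e_b])
```

With all three angles equal to π/2 the matrix simply permutes the axes (up to sign), and because the result is orthogonal its inverse is its transpose.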
Figure 1.19 Eulerian angles used to transform vector components in the MCS to the back-projection
coordinate system. Note that the primed measurement coordinate system (x′m, y′m, z′m)
has the same orientation as the MCS but the origin has been displaced to the origin of
the BCS.
The third and last step is to determine each point’s location in the BCS given its
coordinates in the MCS. Assuming that every point has a known (um, vm, wm)
coordinate, we can write its position vector as
\[ \mathbf{r}_m = r_{xm}(u_m, v_m, w_m)\,\hat{x}_m + r_{ym}(u_m, v_m, w_m)\,\hat{y}_m + r_{zm}(u_m, v_m, w_m)\,\hat{z}_m \qquad (1.125) \]
where rxm, rym, and rzm are the transformation functions relating the (um, vm, wm)
coordinates to the (xm, ym, zm) position. With the position vector, (1.123), and the
origin displacement vector rbm, it can be shown that the observation point location
can be found in rectangular coordinates with respect to the BCS by the relationship
which is depicted in Figure 1.18. This is useful as it provides a direct link between
the coordinates (um, vm, wm) of an observation point in the MCS to the coordinates
(xb, yb, zb) with respect to the BCS.
In the far-field, the distance between the coordinate systems is negligible
compared to the distances to the observation point, that is, |rm| » |rbm| and |rb| »
|rbm|. This leads to two different approximations that are often seen in antenna
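theory. The first is a zeroth-order approximation, which states that
\[ \mathbf{r}_b \approx \bar{T}_{bm}\,\mathbf{r}_m \qquad (1.127) \]
\[ r_b = \sqrt{\mathbf{r}_b^T \mathbf{r}_b} \approx \sqrt{\mathbf{r}_m^T\, \bar{T}_{bm}^T\, \bar{T}_{bm}\, \mathbf{r}_m} = r_m \qquad (1.128) \]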
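using the definition of rotation matrices as orthogonal matrices, \bar{T}_{bm}^{-1} = \bar{T}_{bm}^{T},
where \bar{T}_{bm}^{T} represents the matrix transpose. The first-order approximation can be
used by approximating rb by
\[ r_b \approx r_m - \frac{\mathbf{r}_m^T \mathbf{r}_{bm}}{r_m} = r_m - \hat{r}_m \cdot \mathbf{r}_{bm} \qquad (1.130) \]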
where it should be noted that this becomes exact in the true far field at r → ∞. We
use these two approximations to write the far fields as
\[ \mathbf{E}_{b,r}(r_b, \theta_b, \phi_b) = E_0\, \bar{T}_{bm}\, \bar{T}_{m,ru}\, \mathbf{f}(\theta_b, \phi_b)\, \frac{e^{-jk(r_b + \hat{r}_b \cdot \mathbf{r}_{mb})}}{r_b} \qquad (1.131) \]
where we assume that r̂b ≈ T̄bm r̂m and rmb = T̄bm rbm. The assumptions also lead
to the relationship between the spherical angles in the BCS and MCS as
These are the final results that can be used to transform the electric fields from one
coordinate system to another as well as convert coordinates (um, vm, wm) into the
spherical angles (θb, φb). Equation (1.131) has two critical factors that alter the
original data: the phase factor and transformation matrices. The phase factor
accounts for the origin displacement between the two coordinate systems. The
transformation matrices convert the vectors to another coordinate system
orientation, as discussed previously. Remember that the equation defines the
rectangular components of Eb. The PWS is most often written in rectangular form,
and thus no further steps are required beyond this equation.
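The quality of the first-order distance approximation behind the phase factor can be checked numerically. The sketch below assumes that the observation point as seen from the BCS origin is the MCS position vector minus the origin displacement vector (an assumption about the sign convention of rbm), and it uses an arbitrary displacement expressed in the same length unit as the range.

```python
import math


def dot(u, v):
    """Dot product of two 3-vectors."""
    return sum(a * b for a, b in zip(u, v))


def exact_and_first_order(r_m, r_hat_m, r_bm):
    """Exact distance |r_m - r_bm| and its first-order estimate r_m - r_hat . r_bm."""
    observation = [r_m * c for c in r_hat_m]
    offset = [o - d for o, d in zip(observation, r_bm)]
    exact = math.sqrt(dot(offset, offset))
    first_order = r_m - dot(r_hat_m, r_bm)
    return exact, first_order


theta = math.radians(30.0)
r_hat = (math.sin(theta), 0.0, math.cos(theta))   # observation direction
displacement = (0.3, 0.0, 0.1)                     # hypothetical origin offset

for r_m in (100.0, 1000.0):
    exact, approx = exact_and_first_order(r_m, r_hat, displacement)
    print(r_m, exact - approx)
```

The residual shrinks roughly in proportion to 1/r, which is consistent with the statement that the approximation becomes exact in the true far field.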
As an example, we will derive these relations for an elevation-azimuth (EL-
AZ) coordinate system commonly used in measurement systems. As with any
antenna pattern measurement, the AUT must be placed on a positioner that can
provide motion in at least two axes. A common positioner configuration is EL over
AZ, denoted as EL/AZ, where each axis of rotation directly changes the angles AZ
and EL as depicted in Figure 1.20. The coordinate system shown in Figure 1.20(b)
describes both the MCS and the desired BCS. Notice that the origins are not
displaced for this example, but the orientation of the coordinate systems is
different. In this example, we assume that we know the electric field distribution in
terms of EEL and EAZ for a given (EL, AZ) coordinate in the far field. Our task is to
convert these fields into the BCS in order to find the PWS and compute the near
fields using the FFT.
Figure 1.20 (a) EL/AZ antenna positioner used in antenna pattern measurements. (b) Coordinate
system configuration for the MCS and BCS of the EL/AZ example defining the AZ and
EL angles.
Using (1.115), we can find the rectangular components of the electric field as
where we assume that the radial electric field is zero, that is, Em,r = 0, based on the
far-field assumption. For the Eulerian angles, it can be determined that we have
α = π/2, β = π/2, γ = π/2, which leads to a rotation matrix given by
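\[ \bar{T}_{bm} = \begin{bmatrix} 0 & 1 & 0 \\ -1 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & -1 & 0 \end{bmatrix} \begin{bmatrix} 0 & 1 & 0 \\ -1 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix} = \begin{bmatrix} 0 & 0 & 1 \\ 0 & -1 & 0 \\ 1 & 0 & 0 \end{bmatrix} \qquad (1.134) \]
\[ \begin{bmatrix} E_{b,x} \\ E_{b,y} \\ E_{b,z} \end{bmatrix} = \begin{bmatrix} \sin EL & \cos EL & 0 \\ -\cos EL \sin AZ & \sin EL \sin AZ & -\cos AZ \\ \cos EL \cos AZ & -\sin EL \cos AZ & -\sin AZ \end{bmatrix} \begin{bmatrix} 0 \\ E_{m,EL} \\ E_{m,AZ} \end{bmatrix} \qquad (1.135) \]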
Thus, we have the conversion of the electric field distribution into the BCS well
defined for all points. The last step is to relate the coordinates between the two
coordinate systems. We can relate the (EL, AZ) angles to (θb, φb) angles using the
relationship r̂b = T̄bm T̄m,ru r̂m,u:
With the electric field and its corresponding location computed, one can then
proceed to compute the PWS. As a final note, we can now write the final far-field
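electric field in the BCS as
\[
\mathbf{E}_b(r_b, \theta_b, \phi_b) = \frac{E_0\, e^{-jkr_b}}{r_b} \Big[ \hat{x}_b\, E_{m,EL} \cos EL
  + \hat{y}_b \left( E_{m,EL} \sin EL \sin AZ - E_{m,AZ} \cos AZ \right)
  - \hat{z}_b \left( E_{m,EL} \sin EL \cos AZ + E_{m,AZ} \sin AZ \right) \Big] \qquad (1.137)
\]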
where the (EL, AZ) angles are related to (θb, φb) through the relationship shown in
(1.136), resulting in the equations θb = cos⁻¹(cos(EL)cos(AZ)) and φb = –tan⁻¹
(cot(EL)sin(AZ)). Note also that the phase term did not alter the electric field
phase since the coordinate systems' origins were collocated (i.e., rmb = 0).
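The angle mapping quoted above is easy to evaluate. The Python sketch below uses θb = cos⁻¹(cos EL cos AZ) and writes φb with a two-argument arctangent so that EL = 0 does not cause a division by zero; the particular quadrant choice, which corresponds to a BCS direction vector (sin EL, −cos EL sin AZ, cos EL cos AZ), is an assumption of this example, as is the function name.

```python
import math


def elaz_to_theta_phi(el, az):
    """Map (EL, AZ) in radians to the BCS spherical angles (theta_b, phi_b)."""
    theta_b = math.acos(math.cos(el) * math.cos(az))
    # Two-argument form of phi_b = -atan(cot(EL) * sin(AZ)); for 0 < EL < pi/2
    # it returns the same value as the single-argument expression.
    phi_b = math.atan2(-math.cos(el) * math.sin(az), math.sin(el))
    return theta_b, phi_b


el, az = math.radians(20.0), math.radians(35.0)
theta_b, phi_b = elaz_to_theta_phi(el, az)
phi_from_text = -math.atan(math.sin(az) / math.tan(el))   # cot(EL) * sin(AZ)

print(math.degrees(theta_b), math.degrees(phi_b), math.degrees(phi_from_text))
```

For elevation angles between 0° and 90° the two expressions for φb coincide, so the two-argument form only adds robustness at the edges of the angular range.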
To demonstrate the use of spectral analysis in computing the near fields, the FFT
approach was applied on several examples. Two well-known aperture distributions
with analytical radiation patterns were selected. The resulting near-field
distribution and field values based on the far fields agree quite well with the
theoretical aperture distribution. Both of these problems have theoretical and
practical significance for reflector, array, horn, slot, and other antennas.
Suppose that the electric field in a rectangular aperture, depicted in Figure 1.21, of
width a and length b has the field distribution
\[
\mathbf{E}(x, y, 0) = \begin{cases} \hat{x} E_{xa}, & -a/2 \le x \le a/2,\ -b/2 \le y \le b/2 \\ 0, & \text{elsewhere} \end{cases} \qquad (1.138)
\]
where Exa is the electric field at the rectangular aperture. If the aperture is large,
one can predict the radiated power provided by this aperture by
\[ P_{rad} = \frac{E_{xa}^2}{2\eta}\, ab \qquad (1.139) \]
since the fields mimic plane waves traveling through the aperture.
Figure 1.21 Rectangular aperture electric field distribution used for testing the spectral analysis-FFT
technique.
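\[ \mathbf{A}(k_x, k_y) = \hat{x}\, E_{xa}\, ab\; \mathrm{sinc}\!\left(\frac{k_x a}{2}\right) \mathrm{sinc}\!\left(\frac{k_y b}{2}\right) \qquad (1.140) \]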
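which leads to a far-field distribution of
\[ \mathbf{E}(r, \theta, \phi) = \frac{E_0\, e^{-jkr}}{r}\, \mathbf{f}(\theta, \phi) \qquad (1.141) \]
\[ \mathbf{f}(\theta, \phi) = \mathrm{sinc}\!\left(\frac{ka}{2} \sin\theta \cos\phi\right) \mathrm{sinc}\!\left(\frac{kb}{2} \sin\theta \sin\phi\right) \left[ \hat{\theta} \cos\phi - \hat{\phi} \cos\theta \sin\phi \right] \qquad (1.142) \]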
where E0 contains the scaling factors that would be unknown to the observer. The
function f(θ, φ) represents the pattern that would be known to the user. The
directivity may already be known to the user, or one could predict the directivity
by integrating the pattern. For this particular aperture distribution, we can predict
the directivity by
\[ D_0 = 4\pi\, \frac{ab}{\lambda^2} \qquad (1.143) \]
Using the spectral analysis procedure outlined in Section 1.1 and detailed in the
previous sections, the near-field electric field distribution has been computed for
several planes. As a numerical example, an aperture size of a = b = 33.5λ was
chosen along with a radiated power of Prad = 87 W at 13.4 GHz. Using (1.139), the
electric field magnitude in the aperture becomes Exa = 341.5 V/m for this particular
radiated power. For these particular values, we have plotted the radiation pattern in
Figure 1.22(a). Notice that the beamwidth is fairly small with a half-power
beamwidth of 1.5°. Thus, a sufficiently small spectral sampling period must be
used in order to capture the information from the pattern.
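The numbers quoted above can be cross-checked with a few lines of Python. The sketch assumes (1.139) in the form Prad = Exa² ab/(2η) with the free-space impedance η ≈ 376.73 Ω, (1.143) in the form D0 = 4πab/λ², and the textbook half-power beamwidth of about 0.886λ/a radians for a uniformly illuminated aperture. The beamwidth relation, the constants, and all names are assumptions of this example rather than values taken from this chapter.

```python
import math

C0 = 299_792_458.0      # speed of light in m/s
ETA0 = 376.730313668    # free-space wave impedance in ohms


def aperture_field(p_rad, area):
    """Uniform aperture field magnitude from P_rad = E^2 * area / (2 * eta)."""
    return math.sqrt(2.0 * ETA0 * p_rad / area)


wavelength = C0 / 13.4e9
a = b = 33.5 * wavelength

e_xa = aperture_field(87.0, a * b)                    # V/m
d0 = 4.0 * math.pi * a * b / wavelength ** 2          # directivity (linear)
hpbw_deg = math.degrees(0.886 * wavelength / a)       # half-power beamwidth

print(f"Exa  = {e_xa:.1f} V/m")
print(f"D0   = {10.0 * math.log10(d0):.1f} dBi")
print(f"HPBW = {hpbw_deg:.2f} deg")
```

The computed field magnitude and beamwidth land close to the quoted 341.5 V/m and 1.5°; small differences in the last digit depend on the constants used for the speed of light and the wave impedance. The same function can be reused for other uniform apertures by passing the appropriate area.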
With the radiation pattern readily available, the FFT spectral analysis program
was applied to these fields to predict the aperture field distribution. For the results
shown in Figure 1.22(b), 2,000 points were used for both kx and ky sampling in the
visible region. Note that any values for A(kx,ky) outside the visible region were set
to zero in order to maintain the rectangular grid. The spacing Δx = Δy = λ/4 was
chosen based on the recommendations in Section 1.4, leading to a spectral
sampling period of Δkx = Δky = 0.002k. With this spacing, the angular spacing is
roughly Δθ = 0.11° or larger (since the angular spacing is not uniform in θ).
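The link between the FFT size, the spatial sample spacing, and the resulting spectral and angular sampling can be sketched as follows. The example assumes the usual DFT relation Δkx = 2π/(NΔx), which gives Δkx/k = 1/(N Δx/λ), and evaluates the angular step near broadside from kx = k sin θ; the function names are hypothetical.

```python
import math


def spectral_step_over_k(n_samples, dx_over_lambda):
    """Spectral sampling period normalized to k, assuming dk = 2*pi/(N*dx)."""
    return 1.0 / (n_samples * dx_over_lambda)


def broadside_angle_step_deg(n_samples, dx_over_lambda):
    """Smallest angular step (near theta = 0) implied by kx = k*sin(theta)."""
    return math.degrees(math.asin(spectral_step_over_k(n_samples, dx_over_lambda)))


for n_samples in (2000, 250):
    print(n_samples,
          spectral_step_over_k(n_samples, 0.25),
          round(broadside_angle_step_deg(n_samples, 0.25), 4))
```

Because the spectral grid is uniform in sin θ rather than in θ, the angular step grows away from broadside, which is why the broadside value is the smallest one.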
Figure 1.22 (a) Normalized radiation patterns for the rectangular aperture with a = b = 33.5λ for φ =
0°, 90°. (b) Predicted electric field aperture distribution |Ex| via FFT given the far-field
patterns, directivity, and radiated power. The FFT utilized the sampled far field with N =
M = 2,000 points and spatial sampling period of Δx = Δy = λ/4. The spectral sampling
period was Δkx/k = Δky/k = 0.002.
The resulting aperture fields at z = 0 from the FFT computation are shown in
Figure 1.22(b). We only plot the magnitude of the x-component since the y-
component is negligible. The first and most evident characteristic from the plot is
that the FFT spectral analysis predicts a square-shaped aperture distribution with a
sharp roll-off in the electric field. The length and width of the aperture predicted
by the resulting aperture distribution are roughly 33.5λ, which agrees with the
theoretical development. Some ringing effects can be observed at the outer
periphery of the aperture, but this is due to the fact that only the visible portion of
the spectrum is considered in the approach. This artifact comes from the physical
limitation of the spectral analysis framework. Even more important is the
magnitude of the fields in V/m. It was found that the magnitude of the electric field
component Ex in the aperture had a mean value of 341.1 V/m, agreeing well with
the theoretical value of 341.5 V/m.
It is interesting to observe the effect of larger spectral sampling periods as well
as the spatial sampling on the spectral analysis' ability to predict the near field.
This is compared by examining the changes in the near-field values for different
number of samples N = M (for both x- and y-directions) for the same rectangular
aperture as in Figure 1.22. The spatial sampling spacings Δx, Δy remain constant at
λ/4, which means that changing N presents a change in spectral sampling and
ultimately angular sampling. For the case when N = 250, the smallest angular
sampling is roughly 0.916°. The results of the comparison are shown in
Figure 1.23(a), where the electric field is plotted versus x/λ for y = 0. It is
interesting to note that a larger angular spacing will still provide a satisfactory
prediction as seen in Figure 1.24(a). In each of the sampling schemes, the general
features are observed and no significant difference can be observed. From a
numerical perspective, the smaller values of N lead to faster computation times,
which can be useful in computationally intensive applications.
A comparison of the results for different spatial sampling periods is shown in
Figure 1.23(b) for the same rectangular aperture. The magnitude of the electric field
component Ex is plotted versus x with different spatial sampling spacings Δx = Δy.
This was done while keeping the number of samples constant at N = M = 1,000.
As expected, the curves converge as the spacing between samples becomes
smaller. Ringing is still present in all cases, since only the visible region spectral
components are present. However, it is interesting to note that the λ/2 case seems
to have no ringing. This is due to the spacing of the samples, and a closer
investigation shows that the λ/2 case provides nearly identical values to the smaller
spacing cases. As expected, the smaller sample spacing does not necessarily
provide a dramatic improvement in the values recovered in this procedure, but
rather provides more data points if desired. Note also that the slope of the rolloff
does not become steeper with smaller Δx. The lack of higher spectral components
in the invisible region causes the finite slope, and a steeper slope similar to the
theoretical distribution can only be attained by incorporating higher spectral
content.
Figure 1.23 (a) Near-field electric field distribution in Ex for the rectangular aperture with a = b =
33.5λ at y = 0. Different numbers of samples N were used in the FFT to compare their
effect, and Δx = Δy = λ/4. (b) Near-field electric field distribution in Ex for the
rectangular aperture with a = b = 33.5λ at y = 0. Different spatial sampling was used in
the FFT to compare its effect with N = M = 1,000.
The resulting near-field distributions for the same rectangular aperture are
shown in Figure 1.24 for several different planes. These plots are generated by the
same spectral analysis (FFT) program with different values for z using the far
fields and the radiated power. The fields at z = 50λ (Figure 1.24(a)) demonstrate
similar features to the original aperture distribution, while those farther away
(Figure 1.24(d) with z = 400λ) resemble the far-field radiation patterns, as
expected. The contour plots provide insight into the hotspot locations at various
planes of interest. The plots illustrate an increased field intensity at the corners of
the aperture which eventually shift towards the center (x = y = 0). The maximum
field intensity also does not decrease monotonically versus z. In fact, the largest
field intensity between these plots can be observed at z = 400λ. Similar
observations have been made with previous findings on near-field distributions,
which showed that the fields tend to oscillate rapidly in the near field and gradually
begin to attenuate by 1/r around the far-field region. This axial variation will be
discussed later on.
Figure 1.24 Magnitude of the Ex component of the rectangular aperture radiating 87 W with a = b =
33.5λ with an FFT sampling of Δx = Δy = λ/4 and N = M = 2,000 for (a) z = 50λ, (b) z =
100λ, (c) z = 200λ, and (d) z = 400λ.
Suppose that there exists an electric field over a circular aperture, shown in Figure
1.25, of radius a with the distribution given by
\[
\mathbf{E}(x, y, 0) = \begin{cases} \hat{x} E_{xa}, & x^2 + y^2 \le a^2 \\ 0, & \text{elsewhere} \end{cases} \qquad (1.144)
\]
where Exa is the electric field magnitude in the aperture and does not depend on
space. Similar to the rectangular aperture, the electric field in the aperture can be
related to the power radiated by
\[ P_{rad} = \frac{E_{xa}^2}{2\eta}\, \pi a^2 \qquad (1.145) \]
and the directivity can be predicted by
\[ D_0 = 4\pi\, \frac{\pi a^2}{\lambda^2} \qquad (1.146) \]
Figure 1.25 Circular aperture electric field distribution used for testing the spectral analysis-FFT
technique.
Figure 1.26 (a) Normalized radiation patterns for the circular aperture with a = 16.75λ for φ = 0°,
90°. (b) Predicted electric field aperture distribution |Ex| via FFT given the far-field
patterns and radiated power. The FFT utilized the sampled far field with N = M = 2,000
points and spatial sampling period of Δx = Δy = λ/4. The spectral sampling period was
Δkx/k = Δky/k = 0.002.
For the circular aperture, the far-field patterns can be found by computing
PWS and employing its asymptotic relation to the far fields. The PWS can be
found by taking the 2-D Fourier transform of the circular disc as
\[ \mathbf{A}(k_x, k_y) = \hat{x} E_{xa} \iint_{S_c} e^{\,jk_x x + jk_y y}\, dx\, dy \qquad (1.147) \]
where Sc is the circular area centered about the origin with radius a. The integral
can be rewritten in the aperture cylindrical coordinates as
\[ \mathbf{A}(k_x, k_y) = \hat{x} E_{xa} \int_0^a \!\! \int_0^{2\pi} e^{\,j\rho' (k_x \cos\phi' + k_y \sin\phi')}\, \rho'\, d\phi'\, d\rho' \qquad (1.148) \]
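The exponent in the equation above can be modified to have the form
\[ \mathbf{A}(k_x, k_y) = \hat{x} E_{xa} \int_0^a \!\! \int_0^{2\pi} e^{\,j\rho' \sqrt{k_x^2 + k_y^2}\, \sin(\phi' + \psi)}\, \rho'\, d\phi'\, d\rho' \qquad (1.149) \]
where ψ = tan⁻¹(kx/ky). The integrand is a periodic function in φ′ and thus can be
recognized as the Bessel function of the first kind. Integrating in φ′ leads to
\[ \mathbf{A}(k_x, k_y) = \hat{x}\, 2\pi E_{xa} \int_0^a J_0\!\left(\rho' \sqrt{k_x^2 + k_y^2}\right) \rho'\, d\rho' \qquad (1.150) \]
where Jm is the Bessel function of the first kind of mth order. Setting
t = ρ′√(kx² + ky²) and dt = dρ′√(kx² + ky²) and using the property that
\[ \int_0^t J_0(x)\, x\, dx = t J_1(t) \]
it can be shown that
\[ \mathbf{A}(k_x, k_y) = \hat{x}\, 2\pi E_{xa}\, a^2\, \frac{J_1\!\left(a \sqrt{k_x^2 + k_y^2}\right)}{a \sqrt{k_x^2 + k_y^2}} \qquad (1.151) \]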
which is often referred to as the Airy disc function. Using (1.29), we can find the
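far-field pattern of the circular aperture using its PWS as
\[ \mathbf{E}(r, \theta, \phi) = \frac{E_0\, e^{-jkr}}{r}\, \mathbf{f}(\theta, \phi) \qquad (1.152) \]
\[ \mathbf{f}(\theta, \phi) = \frac{2 J_1(ka \sin\theta)}{ka \sin\theta} \left[ \hat{\theta} \cos\phi - \hat{\phi} \cos\theta \sin\phi \right] \qquad (1.153) \]
where E0 is another arbitrary scaling factor unknown to the observer or user. The
factor of 2 in f(θ, φ) is included in order to normalize the pattern.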
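The Bessel integral property used in the derivation, and the role of the factor of 2 in the pattern, can be checked numerically. The Python sketch below evaluates J0 and J1 from their power series (adequate for the moderate arguments used here) and integrates with a composite Simpson rule; the function names, the number of series terms, and the integration settings are choices made for this illustration.

```python
import math


def bessel_j(order, x, terms=40):
    """Bessel function of the first kind J_order(x) from its power series."""
    return sum((-1) ** k / (math.factorial(k) * math.factorial(k + order))
               * (x / 2.0) ** (2 * k + order) for k in range(terms))


def simpson(func, lo, hi, intervals=200):
    """Composite Simpson rule with an even number of intervals."""
    h = (hi - lo) / intervals
    total = func(lo) + func(hi)
    for i in range(1, intervals):
        total += (4 if i % 2 else 2) * func(lo + i * h)
    return total * h / 3.0


def airy_pattern(u):
    """Normalized pattern factor 2*J1(u)/u, with the limiting value 1 at u = 0."""
    return 1.0 if u == 0 else 2.0 * bessel_j(1, u) / u


t = 3.0
lhs = simpson(lambda x: bessel_j(0, x) * x, 0.0, t)   # integral of J0(x) x dx
rhs = t * bessel_j(1, t)                               # t * J1(t)
print(lhs, rhs)
print(airy_pattern(1e-6))
```

The two sides of the integral property agree to within the accuracy of the quadrature, and the pattern factor tends to unity on boresight, which is the normalization provided by the factor of 2.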
With the radiation pattern readily available, the FFT spectral analysis program
was applied to these fields to predict the aperture field distribution. As a numerical
example, an aperture size of a = 16.75λ was chosen along with a radiated power
of Prad = 87 W at 13.4 GHz. With the radiated power and the area known, the
electric field in the aperture can be computed as Exa = 385.4 V/m using (1.145).
For these particular values and aperture sizes, we have plotted the radiation pattern
in Figure 1.26(a). Overall, the patterns in the principal planes are similar to those
of the rectangular aperture with the exception of the lower sidelobes. Since a
similarly sized aperture was utilized, the beamwidth is also comparable to that of
the rectangular aperture at 1.6°. Similar patterns between φ = 0° and 90° are
realized since the aperture is circular.
For the results shown in Figure 1.26(b), 2,000 points were used for both kx and
ky sampling in the visible region, again placing zeros for any A(kx, ky) falling
outside the visible region. The sample spacing was set to Δx = Δy = λ/4, leading to
a spectral sampling period of Δkx = Δky = 0.002k. With this sample spacing and
spectral period, the smallest angular spacing is Δθ = 0.11°, which provides an
ample number of points to sample the radiation pattern. For the φ = 0 cut, the
spectral sampling period provides roughly 16 points per sidelobe, thus ensuring
that the oscillations in the far-field radiation pattern are well sampled. The
resulting aperture distribution shown in Figure 1.26(b) reflects the original circular
shape with a radius of approximately 16.75λ. The results were generated based on
the knowledge of only the far-field patterns and the radiated power. Besides this,
no other a priori knowledge was utilized to generate the aperture fields.
Near-field electric field distributions were generated for different planes
farther from the aperture plane to show how the near-field distribution and the
maximum electric field value can change along the distance z. In Figure 1.27, the
electric field distribution can be observed for z = 50λ, 100λ, 200λ, and 400λ. The
contour plots reveal that the distribution evolves from that of a uniform circular
aperture spread over a large area into a focused beam. Even at 50λ away from the
aperture, the fields oscillate around 400 V/m in roughly the same area as the
original aperture. At 100λ (Figure 1.27(b)), the oscillations in the fields become
more pronounced, and a sharp beam starts to take shape with large sidelobes. The
electric field distributions at 200λ and 400λ have the appearance of a concentrated
beam. It is also interesting to keep track of the maximum Ex field intensity for each
of the planes. After the FFT computation, the resulting data matrix for the Ex
component was searched to find the location and value of the maximum. The
maximum fields found from the FFT computation were 561 V/m, 723 V/m, 622 V/m,
and 686 V/m for z = 50λ, 100λ, 200λ, and 400λ, respectively. The search also showed
that the maximum values were located at x = y = 0 in every case, as expected from
the plots. To summarize, the near-field predictions made by this tool provide both
insight into the near-field distributions and a direct means for engineers to
evaluate systems against requirements within the vicinity of the
antenna. By knowing how the fields are distributed in V/m, one can directly assess
the interference upon nearby electronic systems as well as the radiation levels
received by individuals in the vicinity of the antenna.
While the rectangular aperture provided some interesting insights into the
evolution of a near field to a far-field distribution, a unique feature of the circular
aperture is that the near fields can be analytically solved along the z-axis. This is a
well-known feature that has been proven insightful in understanding the near-field
behaviors of large antennas. For our purpose, it can serve as a benchmark problem
to ensure the validity of the technique at nonzero distances from the aperture.
Therefore, we will first derive the near fields of a circular aperture and compare
with the results generated from a spectral analysis FFT program.
Figure 1.27 Magnitude of the Ex component of the circular aperture radiating 87 W with a = 16.75λ
with an FFT sampling of Δx = Δy = λ/4 and N = M = 2000 for (a) z = 50λ, (b) z = 100λ, (c)
z = 200λ, and (d) z = 400λ. The maximum Ex fields observed in these planes were 561
V/m, 723 V/m, 622 V/m, and 686 V/m, respectively.
With the same electric field distribution given in (1.144) and illustrated in
Figure 1.25, we can begin to compute the near-field electric field distribution along
the z-axis (i.e., x = y = 0). It has been shown using vector potentials and the surface
equivalence theorem that the radiated electric field from such an aperture can be
found using the integral [14]
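$$
\mathbf{E}(\mathbf{r}) = 2\,\nabla \times \iint_{S_c} \left[ \hat{z} \times \mathbf{E}(x', y', 0) \right] \left( \frac{e^{-jk|\mathbf{r}-\mathbf{r}'|}}{4\pi|\mathbf{r}-\mathbf{r}'|} \right) dx'\,dy' \tag{1.154}
$$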
where Sc is the circular surface of radius a for the integration, r is the observation
point, and r' is the source location along with any other primed coordinates.
Substituting the field distribution in (1.144) and taking the first cross-product leads
to
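$$
\mathbf{E} = 2E_{xa}\,\nabla \times \iint_{S} \hat{y} \left( \frac{e^{-jk|\mathbf{r}-\mathbf{r}'|}}{4\pi|\mathbf{r}-\mathbf{r}'|} \right) dx'\,dy' \tag{1.155}
$$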
The next step is to find the gradient for the factors inside the parentheses as
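$$
\mathbf{E} = 2E_{xa} \iint_{S} \left[ \hat{y} \times \frac{\mathbf{r}-\mathbf{r}'}{|\mathbf{r}-\mathbf{r}'|} \right] \frac{1 + jk|\mathbf{r}-\mathbf{r}'|}{|\mathbf{r}-\mathbf{r}'|} \, \frac{e^{-jk|\mathbf{r}-\mathbf{r}'|}}{4\pi|\mathbf{r}-\mathbf{r}'|} \, dx'\,dy' \tag{1.156}
$$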
Since the observation points are along the z-axis and the source points are only
located in the x-y plane (i.e., x = y = 0 and z' = 0), we can expand this into
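$$
\mathbf{E} = \frac{E_{xa}}{2\pi} \iint_{S} \frac{x'\hat{z} + z\hat{x}}{\sqrt{x'^2 + y'^2 + z^2}} \, \frac{1 + jk\sqrt{x'^2 + y'^2 + z^2}}{x'^2 + y'^2 + z^2} \, e^{-jk\sqrt{x'^2 + y'^2 + z^2}} \, dx'\,dy' \tag{1.158}
$$
which, in polar source coordinates $(\rho', \phi')$, becomes
$$
\mathbf{E} = \frac{E_{xa}}{2\pi} \int_{0}^{a}\!\!\int_{0}^{2\pi} \frac{\rho'\cos\phi'\,\hat{z} + z\hat{x}}{\sqrt{\rho'^2 + z^2}} \, \frac{1 + jk\sqrt{\rho'^2 + z^2}}{\rho'^2 + z^2} \, e^{-jk\sqrt{\rho'^2 + z^2}} \, \rho'\,d\phi'\,d\rho' \tag{1.159}
$$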
where one can immediately see that the z component goes to zero, since cos φ'
integrates to zero over a full period. Integrating in φ' provides
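$$
\mathbf{E} = \hat{x}\, z E_{xa} \int_{0}^{a} \frac{1 + jk\sqrt{\rho'^2 + z^2}}{\left( \rho'^2 + z^2 \right)^{3/2}} \, e^{-jk\sqrt{\rho'^2 + z^2}} \, \rho'\,d\rho' \tag{1.160}
$$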
which can be solved using substitution and integration by parts to find the electric
field as
$$
\mathbf{E} = \hat{x} E_{xa} \left[ e^{-jkz} - \frac{e^{-jk\sqrt{a^2 + z^2}}}{\sqrt{1 + (a/z)^2}} \right] \tag{1.161}
$$
which has been shown in other works as well [28].
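The closed form in (1.161) is easy to check numerically against the radial integral in (1.160). The following is a minimal sketch, not the authors' program: the function names are hypothetical, lengths are expressed in wavelengths, and the field is normalized to the aperture amplitude Exa (set to 1 here as an assumption).

```python
import cmath
import math

def axial_field_closed_form(z, a, wavelength=1.0, e_xa=1.0):
    """On-axis Ex of a uniform circular aperture, closed form (1.161)."""
    k = 2.0 * math.pi / wavelength
    r_edge = math.hypot(a, z)  # distance from the aperture rim to the axial point
    return e_xa * (cmath.exp(-1j * k * z)
                   - cmath.exp(-1j * k * r_edge) / math.sqrt(1.0 + (a / z) ** 2))

def axial_field_quadrature(z, a, wavelength=1.0, e_xa=1.0, steps=20000):
    """Midpoint-rule evaluation of the radial integral (1.160)."""
    k = 2.0 * math.pi / wavelength
    d_rho = a / steps
    total = 0.0 + 0.0j
    for i in range(steps):
        rho = (i + 0.5) * d_rho
        dist = math.hypot(rho, z)
        total += (1.0 + 1j * k * dist) / dist ** 3 * cmath.exp(-1j * k * dist) * rho
    return e_xa * z * total * d_rho

if __name__ == "__main__":
    a = 16.75  # aperture radius in wavelengths
    for z in (50.0, 100.0, 200.0, 400.0):
        exact = abs(axial_field_closed_form(z, a))
        numeric = abs(axial_field_quadrature(z, a))
        print(f"z = {z:5.0f} wavelengths: |E|/Exa = {exact:.4f} (quadrature {numeric:.4f})")
```

Because (1.161) is the difference of two phasors of nearly equal magnitude, |E| on the axis oscillates between roughly 0 and 2Exa as z changes. For a = 16.75λ the ratio of the values at z = 100λ and z = 50λ is about 1.29, which is consistent with the ratio of the maxima quoted for Figure 1.27 (723 V/m and 561 V/m).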
Figure 1.28 Comparison of the exact axial field distribution for some representative uniform circular
apertures. A sample size of N = M = 500 was used in conjunction with a sample spacing
of Δx = Δy = λ/4. The aperture radii were (a) a = 16.75λ, (b) a = 5λ, (c) a = 10λ, and (d)
a = 50λ.
Using this formulation, a comparison can be made with the results from the
FFT program. Using a configuration similar to the previous cases, an aperture
radius of a = 16.75λ was utilized. In order to speed up the computation, the sample
size was chosen to be N = M = 500 with Δx = Δy = λ/4. The results shown in
Figure 1.28(a) agree well even at very close distances. To further demonstrate the
ability of the FFT spectral analysis, other aperture sizes were considered, and the
results are shown in Figures 1.28(b–d). For these other apertures the same sample
size N and sample spacing Δx were used. Remarkably good agreement can be observed
in these plots as well.
These plots exemplify the power of the FFT spectral analysis approach.
Computing the near fields is a major challenge, and many researchers have worked
towards approximating the fields in the near-field and Fresnel-zone regions
through a variety of techniques [4, 28–30]. However, most of those techniques
approximate the fields adequately only down to a distance of a few diameters from
the aperture (e.g., z = 3D), whereas this technique accurately predicts the near
fields to within only a few wavelengths of the aperture. This is because the
evanescent waves that make up the invisible part of the spectrum quickly die out
and no longer contribute to the fields beyond a few wavelengths.
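How quickly the evanescent part of the spectrum dies out can be estimated directly. The sketch below assumes the usual plane-wave spectrum decay, where a component with transverse wavenumber kt > k is attenuated as exp(−z·sqrt(kt² − k²)); the function name is hypothetical.

```python
import math

def evanescent_attenuation_db(kt_over_k, z_in_wavelengths):
    """Attenuation (dB) of a spectral component with transverse wavenumber
    kt = kt_over_k * k after a distance z (given in wavelengths).
    Components with kt <= k propagate and are not attenuated."""
    if kt_over_k <= 1.0:
        return 0.0
    alpha_over_k = math.sqrt(kt_over_k ** 2 - 1.0)        # decay constant / k
    nepers = 2.0 * math.pi * alpha_over_k * z_in_wavelengths
    return 20.0 * math.log10(math.e) * nepers

if __name__ == "__main__":
    for kt in (1.1, 1.5, 2.0):
        for z in (1.0, 3.0):
            print(f"kt = {kt:.1f}k, z = {z:.0f} wavelengths: "
                  f"{evanescent_attenuation_db(kt, z):.1f} dB down")
```

Even a component only 10% outside the visible region (kt = 1.1k) is attenuated by about 25 dB per wavelength, so after a few wavelengths the invisible spectrum is negligible.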
For high-power and high-gain applications, antenna engineers often prefer the
reflector antenna due to its widely proven use, efficiency, and power handling
capabilities. The cost and complexity of scaling the reflector antenna to provide
higher gain are also reasonable compared to other options such as arrays.
Therefore, it is instructive to consider a practical example of a reflector
antenna in the context of computing the near fields. Traditional reflector
antenna systems are made up of two components: the feed and the reflector system.
In general, the feed can be a horn antenna or even an array for added antenna
capabilities. The feed antenna illuminates the reflector(s), and the scattered
radiation becomes focused or directed due to the properties of the reflectors. The
reflector system can be configured to provide many unique functionalities in the
radiation patterns. The reflector(s) can be curved, flat, or corner-shaped for
different purposes, and there can be a single reflector or multiple reflectors. A
common reflector design is the single symmetric parabolic dish fed with a feed at
the focus, as seen in Figure 1.29, where the dish is rotationally symmetric in φ.
When the feed is placed at the focus of the parabola, the scattered fields become
collimated; that is, the scattered fields appear as plane waves traveling in the
+z-direction. Consequently, the fields over the aperture have a uniform phase,
leading to a high directivity.
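The feed is modeled here with a cos^q(θ)-type far-field pattern of the form
$$
\mathbf{E}_f(r, \theta, \phi) = \frac{e^{-jkr}}{r} \left[ \cos^{q_x}\theta \, \cos\phi \left( E_x\hat{\theta} + E_y\hat{\phi} \right) + \cos^{q_y}\theta \, \sin\phi \left( E_y\hat{\theta} - E_x\hat{\phi} \right) \right] \tag{1.162}
$$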
where the choice of Ex and Ey determines the polarization of the feed. Note that
Ef = 0 for θ > 90°. Setting (Ex, Ey) = (1, 0) provides an x-polarized feed
antenna, while setting (Ex, Ey) = (0, 1) provides a y-polarized feed antenna.
Circular polarization can also be achieved by setting Ex = Ey in magnitude along
with quadrature phase between the two components. The most important point to note
from this pattern is that the q-factors qx and qy control the pattern beamwidth in
the x-z and y-z planes, respectively. In the design being discussed, these
q-factors were chosen to provide a certain taper in the aperture fields, widely
known as the edge taper (ET). In the particular example at hand, the desired edge
taper was ET = −10 dB, which is optimal for single parabolic reflector directivity,
providing the maximum aperture efficiency of 81% [31]. The tapering should be
reflected in the aperture distribution, where the fields should be roughly 10 dB
lower at the reflector edge than at the center.
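The link between the edge taper and the q-factor can be sketched as follows. This is an illustration under stated assumptions rather than the authors' design code: it uses the standard paraboloid relations (rim angle from f/D, and a space-attenuation factor of cos²(θs/2) between the vertex and the rim), the function names are hypothetical, and f/D = 0.568 is the value of the symmetric reflector example treated in this section.

```python
import math

def subtended_angle_deg(f_over_d):
    """Angle between the reflector axis and the rim, seen from the focus."""
    return math.degrees(2.0 * math.atan(1.0 / (4.0 * f_over_d)))

def q_for_edge_taper(edge_taper_db, theta_s_deg, include_path_loss=True):
    """Exponent q of a cos^q(theta) feed that gives the requested edge taper
    (a negative number of dB) at the rim angle theta_s."""
    theta_s = math.radians(theta_s_deg)
    path_loss_db = 0.0
    if include_path_loss:
        # Extra spherical spreading to the rim relative to the vertex.
        path_loss_db = 40.0 * math.log10(math.cos(theta_s / 2.0))
    feed_taper_db = edge_taper_db - path_loss_db
    return feed_taper_db / (20.0 * math.log10(math.cos(theta_s)))

if __name__ == "__main__":
    theta_s = subtended_angle_deg(0.568)
    q = q_for_edge_taper(-10.0, theta_s)
    print(f"theta_s = {theta_s:.1f} deg, q = {q:.3f}")
```

With f/D = 0.568 this gives a rim angle of about 47.5° and q ≈ 2.48 for a −10 dB edge taper, of which roughly 1.5 dB comes from the path loss and the remainder from the feed pattern itself.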
Figure 1.29 Symmetric parabolic dish antenna fed with a feed antenna at the parabolic focal point.
(a) Side view. (b) Top view.
The radiation from the feed antenna excites surface currents on the reflector.
The currents in turn radiate the fields observed in the far field along with any
additional radiation from the feed. A good approximation of the current
distribution on the reflector is the physical optics approximation, where the surface
currents can be computed by
$$
\mathbf{J}_{PO} = 2\hat{n} \times \mathbf{H}_f \tag{1.163}
$$
where JPO is the physical optics (PO) surface current, n̂ is the unit normal vector
to the reflector surface, and Hf is the radiated magnetic field from the feed antenna,
which can be computed from (1.162) and the local plane wave relationship in
(1.11). Notice that the only required knowledge is the geometry of the parabolic
reflector (to provide the unit normal vector) and the incident magnetic field. In
reality, the current distribution deviates from the PO prediction due to the
interactions of the feed and reflector in addition to strong edge currents.
Nevertheless, the PO approximation is still quite accurate and useful in predicting
the pattern features in the main beam and its first few sidelobes.
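The PO current of (1.163) needs only the surface normal and the incident magnetic field. A minimal sketch is given below, assuming a paraboloid z = (x² + y²)/(4f) − f with its focus at the origin, a normal pointing toward the illuminated (focus) side, and the local plane wave relation written as H = k̂ × E/η0; all function names are hypothetical.

```python
import math

ETA0 = 376.730313668  # free-space wave impedance (ohms)

def cross(u, v):
    """Cross product of two 3-vectors given as tuples."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def paraboloid_normal(x, y, f):
    """Unit normal of z = (x^2 + y^2)/(4f) - f, pointing toward the focus side."""
    nx, ny, nz = -x / (2.0 * f), -y / (2.0 * f), 1.0
    norm = math.sqrt(nx * nx + ny * ny + nz * nz)
    return (nx / norm, ny / norm, nz / norm)

def local_plane_wave_h(e_vec, k_hat):
    """H = k_hat x E / eta0 for a locally plane incident wave."""
    return tuple(c / ETA0 for c in cross(k_hat, e_vec))

def po_current(n_hat, h_vec):
    """Physical optics surface current J_PO = 2 n x H, as in (1.163)."""
    return tuple(2.0 * c for c in cross(n_hat, h_vec))

if __name__ == "__main__":
    # At the vertex, an x-polarized wave arriving from the focus travels along -z.
    n_hat = paraboloid_normal(0.0, 0.0, 19.03)
    h_inc = local_plane_wave_h((1.0, 0.0, 0.0), (0.0, 0.0, -1.0))
    print(po_current(n_hat, h_inc))  # current directed along +x
```

The induced current is always tangential to the surface, which provides a quick sanity check on any implementation of the normal vector.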
With the currents known on the reflector surface, the radiated fields can be
ascertained through an integration of each infinitesimal current’s contribution. A
generalized treatment of the radiated electric fields applicable in both far-field and
near-field regions has been formulated from vector potentials [32], with the
resulting integration of the current being
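$$
\mathbf{E}_{PO}(\mathbf{r}) = -jk\eta_0 \iint_{\Sigma} \left[ g_1 \mathbf{J}_{PO}(\mathbf{r}') - g_2 \left( \mathbf{J}_{PO}(\mathbf{r}') \cdot \hat{R} \right) \hat{R} \right] \frac{e^{-jkR}}{4\pi R} \, d\Sigma' \tag{1.164}
$$
where $R = |\mathbf{r} - \mathbf{r}'|$, $\hat{R} = (\mathbf{r} - \mathbf{r}')/R$, and
$$
g_1 = 1 - \frac{j}{kR} - \frac{1}{(kR)^2} \tag{1.165}
$$
$$
g_2 = 1 - \frac{j3}{kR} - \frac{3}{(kR)^2} \tag{1.166}
$$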
These are the exact PO integrals that provide the electric field in the near-field and
far-field regions without any approximations. The evaluation of this integral has
been discussed in detail in [32] and other works, but it is not the main focus for this
chapter. Rather, this exact formulation is compared against the results from the
FFT procedure discussed.
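To make the structure of (1.164) concrete, the contribution of a single small current patch can be written out as below. This is only a sketch: g1 is taken in its standard form 1 − j/(kR) − 1/(kR)², the patch is treated as a point source of moment J·area, and the function names are hypothetical. A full PO integration would sum such contributions over the reflector with a proper quadrature rule.

```python
import cmath
import math

ETA0 = 376.730313668  # free-space wave impedance (ohms)

def near_field_factors(k, dist):
    """Return (g1, g2); both tend to 1 in the far field (kR >> 1)."""
    kr = k * dist
    g1 = 1.0 - 1j / kr - 1.0 / kr ** 2
    g2 = 1.0 - 3j / kr - 3.0 / kr ** 2
    return g1, g2

def element_field(j_vec, src, obs, k, area=1.0):
    """E field at obs radiated by one small current patch J * area at src."""
    diff = [o - s for o, s in zip(obs, src)]
    dist = math.sqrt(sum(d * d for d in diff))
    r_hat = [d / dist for d in diff]
    g1, g2 = near_field_factors(k, dist)
    j_dot_r = sum(j * r for j, r in zip(j_vec, r_hat))
    green = cmath.exp(-1j * k * dist) / (4.0 * math.pi * dist)
    return [-1j * k * ETA0 * (g1 * j - g2 * j_dot_r * r) * green * area
            for j, r in zip(j_vec, r_hat)]

if __name__ == "__main__":
    k = 2.0 * math.pi  # wavelength = 1
    # z-directed element observed broadside, 10 wavelengths away.
    print(element_field([0.0, 0.0, 1.0], [0.0, 0.0, 0.0], [10.0, 0.0, 0.0], k))
```

Since g1 − g2 = j2/(kR) + 2/(kR)², the radial field component decays faster than 1/R, and only the transverse part of J survives in the far field.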
For the symmetric reflector example, the chosen diameter was D = 33.5λ and
the ratio f/D = 0.568, leading to a focal length of f = 19.03λ. The frequency of
10 GHz was chosen arbitrarily. Using Figure 1.29, it can be shown that the subtended
angle is θs = 47.5° with this configuration. In order to obtain the edge taper of
ET = −10 dB, the q-factors were chosen as qx = qy = 2.483. The feed was also
x-polarized and the power radiated was 100 W. With a diameter of this size, the
half-power beamwidth can be predicted to be roughly 1.8°, and thus a fine sampling
rate must be applied in order to make an effective prediction using the spectral
analysis-FFT program. The far-field patterns in Figure 1.30 confirm the rapid
variations, where Eθ and Eφ are plotted for the x-z and y-z planes. The far fields
were generated by evaluating (1.164) in the far-field region, taking an overall
computational time of 2.4 hours. The computation was accelerated through
parallelization across four separate cores on a computer equipped with two
quad-core Intel Xeon processors.
Figure 1.30 Far-field patterns in the x-z (φ = 0°) and y-z (φ = 90°) planes for a symmetric parabolic
reflector antenna with D = 33.5λ and f = 19.03λ. The q-factors for the feed were qx = qy =
2.483, and the patterns were generated by integrating the PO currents from (1.164).
Figure 1.31 (a) Near-field aperture distribution of Ex computed via PO integration. (b) Near-field
aperture distribution of Ex computed via FFT. For both cases the geometry was set to D
= 33.5λ and f = 19.03λ and the radiated power was 100 W. The observation plane is z =
h − f = −15.34λ.
The resulting distributions in the aperture of the reflector antenna are shown in
Figures 1.31 and 1.32. In Figure 1.31(a), the aperture fields result from the
computation of the near-field integrals of (1.164), whereas the results from the FFT
computation are shown in Figure 1.31(b). The plots depict |Ex| over the plane at
z = h − f, as illustrated in Figure 1.29. The aperture distribution was computed by
applying the iFFT to the PWS, which was obtained from the far-field patterns. The
near-field distribution shown was computed using the pattern data in the range of
θ = [0, 45°] and φ = [0, 360°], and overall the prediction of the near fields is
quite accurate even with the limited data available. The smallest sidelobe levels
included were 60 dB below the peak, which clearly was enough to recover the near
fields accurately. The program that computes the far-field patterns provided the
patterns over a rectangular θ-φ grid. Therefore, interpolation was also used in
order to convert the data to a rectangular kx-ky grid. As discussed in Section
1.4.5, the interpolation was performed in the θ-φ domain in order to exploit the
rectangular grid available. The FFT along with its sampling parameters (e.g., N,
Δx, and so forth) predetermines the spectral coordinates (kx, ky) at which the
far-field components must be known. The interpolation was performed by first
converting the desired (kx, ky) coordinates into (θ, φ) locations via (1.27).
Bilinear interpolation was subsequently employed to compute the electric fields at
the desired (θ, φ) locations.
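The two interpolation steps can be sketched as follows. The mapping assumes that (1.27) has the usual form kx = k sinθ cosφ, ky = k sinθ sinφ, and the pattern is assumed to be stored as a nested list sampled uniformly in θ and φ (with the φ = 360° column included); the function names and data layout are hypothetical.

```python
import math

def spectral_to_angles(kx, ky, k):
    """Map a visible-region spectral point (kx, ky) to (theta, phi) in degrees."""
    kt = math.hypot(kx, ky)
    if kt > k:
        raise ValueError("point lies outside the visible region")
    theta = math.degrees(math.asin(kt / k))
    phi = math.degrees(math.atan2(ky, kx)) % 360.0
    return theta, phi

def bilinear(grid, theta, phi, d_theta, d_phi):
    """Bilinear interpolation of grid[i][j], sampled at (i*d_theta, j*d_phi)."""
    i = min(int(theta / d_theta), len(grid) - 2)
    j = min(int(phi / d_phi), len(grid[0]) - 2)
    t = theta / d_theta - i   # fractional position inside the cell, 0..1
    u = phi / d_phi - j
    return ((1 - t) * (1 - u) * grid[i][j] + t * (1 - u) * grid[i + 1][j]
            + (1 - t) * u * grid[i][j + 1] + t * u * grid[i + 1][j + 1])

if __name__ == "__main__":
    k = 2.0 * math.pi
    # Toy pattern on a 1-degree grid: theta = 0..45 deg, phi = 0..360 deg.
    pattern = [[2.0 * th + 3.0 * ph for ph in range(361)] for th in range(46)]
    theta, phi = spectral_to_angles(0.3 * k, 0.2 * k, k)
    print(theta, phi, bilinear(pattern, theta, phi, 1.0, 1.0))
```

The same routine works unchanged for complex-valued field samples, since the interpolation weights are real.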
Figure 1.32 (a) Magnitude of Ex along the x-axis (y = 0) for z = h − f = −15.34λ compared between
the FFT and PO integration procedures. (b) Magnitude of Ex along the y-axis (x = 0) for
z = h − f = −15.34λ compared between the FFT and PO integration procedures. For both
cases the geometry was set to D = 33.5λ and f = 19.03λ and the radiated power was
100 W.
Some interesting differences can be observed between the two plots, and
Figure 1.32 highlights some of those features. In Figure 1.32(a), the Ex magnitude
is plotted along the x-axis (i.e., y = 0), where both numerical procedures yield
excellent agreement. Some slight differences between the FFT approach and the
(exact) PO integration can be observed, such as the ripple and the rolloff of the
fields outside the aperture. In spite of this, the overall agreement between the PO
integration and the FFT is notable. The two approaches take totally different
paths in generating the near fields and yet arrive at almost identical solutions.
The results for the z = 10λ and z = 100λ cases, shown in Figures 1.33 through 1.36,
also demonstrate noteworthy agreement.
Figure 1.33 (a) Near-field distribution of Ex computed via PO integration. (b) Near-field distribution
of Ex computed via FFT. For both cases the geometry was set to D = 33.5λ and f =
19.03λ and the radiated power was 100 W. The observation plane is z = 10λ.
Figure 1.34 (a) Magnitude of Ex along the x-axis (y = 0) for z = 10λ compared between the FFT and
PO integration procedures. (b) Magnitude of Ex along the y-axis (x = 0) for z = 10λ
compared between the FFT and PO integration procedures. For both cases the geometry
was set to D = 33.5λ and f = 19.03λ and the radiated power was 100 W.
It is also worth pointing out that the FFT approach provided the resulting
aperture distribution significantly faster than the PO integration approach. Using
the same computer with the same core allocation, the spectral analysis-FFT
procedure finished in roughly 9.1 seconds, including the time for interpolation. The
PO integration approach, by contrast, required roughly 2.91 hours, making it about
1,000 times slower than the FFT approach. The only prerequisite is that the
far-field patterns must already be available to the FFT procedure in order to
calculate the near-field distribution. The PO integration is performed by splitting
the reflector into many small subdomains and computing their contribution to the
integral via composite Gauss-Legendre quadrature.
Figure 1.35 (a) Near-field distribution of Ex computed via PO integration. (b) Near-field distribution
of Ex computed via FFT. For both cases the geometry was set to D = 33.5λ and f =
19.03λ and the radiated power was 100 W. The observation plane is z = 100λ.
Figure 1.36 (a) Magnitude of Ex along the x-axis (y = 0) for z = 100λ compared between the FFT
and PO integration procedures. (b) Magnitude of Ex along the y-axis (x = 0) for z = 100λ
compared between the FFT and PO integration procedures. For both cases the geometry
was set to D = 33.5λ and f = 19.03λ and the radiated power was 100 W.
one can elongate or shorten the aperture in one dimension, making the projected
aperture elliptical in shape, as shown in Figure 1.37. The projected aperture is
characterized by its major and minor axes a and b. By increasing one of the axes
and properly adjusting the feed, the beamwidth along the dimension corresponding
to the axis can be narrowed. As for the reflector geometry, the aperture no longer
lies in a plane since a ≠ b, and thus the maximum parabola heights are not equal
(i.e., hx ≠ hy). In this case, the aperture plane can be considered as
z = max(hx, hy) − f.
The ensuing simulations assumed that the feed’s far-field radiation patterns
followed cos^q(θ) patterns, as for the circular symmetric reflector. Thus, the
design procedure for the elliptical symmetric reflector antenna is nearly identical to
that of the circular symmetric reflector, the only differences being the choice of a
and b and of the feed’s q-values qx and qy. As an example, the geometry was
chosen as a = 16.75λ, b = 25λ, and f = 19.03λ, leading to a narrower beamwidth in
the y-z plane than in the x-z plane. The feed parameters qx and qy were chosen to
provide a −10-dB edge taper as closely as possible. This was accomplished by
computing the subtended angles θsx and θsy and setting the q-values to obtain the
proper feed taper, which also takes the path loss into account. Since the aperture
is longer along the y dimension in this particular example, one can expect qy to be
smaller than qx. The resulting directivity from this design was 41.209 dB,
corresponding to roughly 80% aperture efficiency. The radiated power was set to
Prad = 100 W.
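The quoted aperture efficiency can be cross-checked from the directivity with the standard relation D = 4πA/λ² for a uniformly illuminated aperture. The sketch below assumes that a and b are the semi-axes of the projected ellipse, so that A = πab; the function names are hypothetical.

```python
import math

def uniform_aperture_directivity_db(area_in_sq_wavelengths):
    """Directivity (dB) of a uniformly illuminated aperture, D = 4*pi*A/lambda^2."""
    return 10.0 * math.log10(4.0 * math.pi * area_in_sq_wavelengths)

def aperture_efficiency(directivity_db, area_in_sq_wavelengths):
    """Ratio of the achieved directivity to the uniform-aperture directivity."""
    d_max_db = uniform_aperture_directivity_db(area_in_sq_wavelengths)
    return 10.0 ** ((directivity_db - d_max_db) / 10.0)

if __name__ == "__main__":
    area = math.pi * 16.75 * 25.0   # projected elliptical aperture, in wavelengths^2
    print(f"uniform-aperture directivity: {uniform_aperture_directivity_db(area):.2f} dB")
    print(f"aperture efficiency: {aperture_efficiency(41.209, area):.3f}")
```

With these assumptions the uniform-aperture directivity is about 42.2 dB, so a directivity of 41.209 dB corresponds to an efficiency close to 0.80.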
Figure 1.37 Symmetric parabolic dish antenna with an elliptical projected aperture having major and
minor axes of length a and b. (a) 3-D view. (b) Top view.
The radiated far-field patterns were generated over the range of θ = (0, 45°)
and φ = (0, 360°), and the normalized patterns for the two principal planes (the x-z
and y-z planes) are shown in Figure 1.38. The results were found by directly
integrating the PO currents as shown in (1.164) using composite Gauss-Legendre
quadrature on small subdomains of the reflector. The far-field patterns were
computed on a regular (θ, φ) grid with Δθ = 0.1° and Δφ = 0.1°, a reasonable
choice given that the beamwidths in the x-z and y-z planes are 1.8° and 1.4°,
respectively. These beamwidths can be observed in Figure 1.38. The beamwidths
are not drastically different from those of the previous symmetric reflector
antenna, and thus a similar far-field sampling scheme was applied. The sampling
parameters were N = M = 2,000 and Δx = Δy = λ/4, leading to a spectral sampling
period of Δkx = Δky = 0.002k. The data were interpolated to obtain the electric
field over a regular (kx, ky) grid in the same manner as for the symmetric reflector
with a circular aperture.
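The spectral sampling period follows from the FFT parameters alone. A short sketch, assuming the usual DFT relation Δk = 2π/(NΔx) (the function names are hypothetical):

```python
import math

def spectral_step_over_k(n_samples, dx_in_wavelengths):
    """Delta-k / k for an N-point FFT with spatial step dx (in wavelengths)."""
    return 1.0 / (n_samples * dx_in_wavelengths)

def max_spectral_extent_over_k(dx_in_wavelengths):
    """Largest |kx| / k represented on the FFT grid (Nyquist limit)."""
    return 1.0 / (2.0 * dx_in_wavelengths)

if __name__ == "__main__":
    step = spectral_step_over_k(2000, 0.25)
    print(f"delta-k = {step:.4f} k")
    print(f"angular step near broadside = {math.degrees(step):.3f} deg")
    print(f"grid extends to |kx| = {max_spectral_extent_over_k(0.25):.1f} k")
```

For N = 2,000 and Δx = λ/4 the step is 0.002k, or roughly 0.11° near broadside, which is comparable to the 0.1° angular grid of the computed patterns. The λ/4 spacing also means the spectral grid extends to |kx| = 2k, well into the invisible region.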
Figure 1.38 Normalized far-field patterns in the x-z (φ = 0°) and y-z (φ = 90°) planes for an elliptical
symmetric reflector antenna with a = 16.75λ, b = 25λ, and f = 19.03λ. The q-factors for
the feed were qx = 2.483 and qy = 1.24, and the patterns were generated by integrating
the PO currents using (1.164).
Figure 1.39 (a) Near-field aperture distribution of |Ex| computed via PO integration. (b) Near-field
aperture distribution of |Ex| computed via FFT. For both cases the geometry was set to a
= 16.75λ, b = 25λ, and f = 19.03λ and the radiated power was 100 W. The observation
plane is at the aperture plane, located at z = hy − f = −10.82λ.
Figure 1.40 (a) Magnitude of Ex along the x-axis (y = 0) for z = hy − f = −10.82λ compared between
the FFT and PO integration procedures. (b) Magnitude of Ex along the y-axis (x = 0) for
z = hy − f = −10.82λ compared between the FFT and PO integration procedures. For
both cases the geometry was set to a = 16.75λ, b = 25λ, and f = 19.03λ and the radiated
power was 100 W.
Figure 1.41 (a) Near-field distribution of Ex computed via PO integration. (b) Near-field distribution
of Ex computed via FFT. For both cases the geometry was set to a = 16.75λ, b = 25λ,
and f = 19.03λ with a radiated power of 100 W. The observation plane is z = 10λ.
field distribution. For the two planes shown, the beamwidth in the x-z plane is
smaller compared to the y-z plane. However, this will change as the distance z
approaches the far-field region where the y-z plane beamwidth becomes the
smaller beamwidth as expected from antenna theory. Lastly, it should be noted that
the distribution for Ey could also be investigated, but the values are negligible
compared to those for Ex.
Figure 1.42 (a) Magnitude of Ex along the x-axis (y = 0) for z = 10λ compared between the FFT and
PO integration procedures. (b) Magnitude of Ex along the y-axis (x = 0) for z = 10λ
compared between the FFT and PO integration procedures. For both cases the geometry
was set to a = 16.75λ, b = 25λ, and f = 19.03λ and the radiated power was 100 W.
Figure 1.43 (a) Near-field distribution of Ex computed via PO integration. (b) Near-field distribution
of Ex computed via FFT. For both cases the geometry was set to a = 16.75λ, b = 25λ,
and f = 19.03λ and the radiated power was 100 W. The observation plane is z = 100λ.
Figure 1.44 (a) Magnitude of Ex along the x-axis (y = 0) for z = 100λ compared between the FFT
and PO integration procedures. (b) Magnitude of Ex along the y-axis (x = 0) for z = 100λ
compared between the FFT and PO integration procedures. For both cases the geometry
was set to a = 16.75λ, b = 25λ, and f = 19.03λ and the radiated power was 100 W.
In many cases, antenna designers only have knowledge of the far-field radiation
patterns in two principal planes (e.g., φ = 0°, 90°). Clearly, this does not provide
the complete set of data needed to recover the PWS in the visible region. However,
one can attempt to interpolate the patterns in φ to make a good initial prediction of
the near fields. Denoting the radiation patterns as f(θ) and g(θ) for the φ = 0° and
90° cuts, respectively, we can write a simple interpolation in the far field as
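$$
\mathbf{E}(r, \theta, \phi) = \frac{E_0 e^{-jkr}}{r} \left\{ \left[ f(\theta)\cos\phi + g(\theta)\sin\phi \right] \hat{\theta} + \left[ f(\theta)\cos\phi - g(\theta)\sin\phi \right] \hat{\phi} \right\} \tag{1.167}
$$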
where it is also assumed that the pattern functions f(θ) and g(θ) only contain the
magnitudes of the fields. If the phase information is also available, then the minus
sign becomes a plus sign in (1.167). This formulation is fairly general with respect
to the polarization of the far fields, being able to handle x-polarized,
y-polarized, or circularly polarized (CP) fields with the proper insertion of phase
into each component.
The reader should note that this interpolation is quite simplistic and does not
work well if the aperture distribution is not symmetric. Another major assumption
is that the main beam is centered about θ = 0. Different interpolation schemes in φ
must be utilized for more complex patterns such as scanned beams, contour beams,
and asymmetric patterns. A straightforward example of an aperture distribution
whose far fields can be interpolated using the sin(φ)/cos(φ) approach is any aperture
distribution that can be written as E(ρ', φ', 0) = E0 f(ρ'), which has no dependency on
φ'.
$$
\mathbf{E}(r, \theta, \phi) = \frac{E_0 e^{-jkr}}{r} \left[ \hat{\theta} f(\theta)\cos\phi - \hat{\phi}\, g(\theta)\sin\phi \right] \tag{1.168}
$$
in the implementation.
Once the far fields were fully interpolated over the φ = (0, 2π) span, the next
steps to find the near fields followed the usual procedure, where the PWS was
computed using (1.64) and the iFFT was applied with the proper normalization to
obtain the final near-field values. The resulting near-field distributions found
from only two planes (or cuts) are shown in Figure 1.45. The results are nearly
identical to those obtained from the full pattern data, thus demonstrating how much
can be recovered from knowledge of only the two principal-plane far-field
distributions.
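The two-cut rule in the form of (1.168) can be sketched in a few lines. This illustration assumes an x-polarized pattern and magnitude-only cuts; the cut functions used in the demonstration are hypothetical placeholders, not data from the reflector examples.

```python
import math

def interpolate_two_cuts(f_cut, g_cut, theta_deg, phi_deg):
    """Return (E_theta, E_phi) at (theta, phi) from the phi = 0 deg cut f and the
    phi = 90 deg cut g, following the cos/sin rule of (1.168)."""
    phi = math.radians(phi_deg)
    e_theta = f_cut(theta_deg) * math.cos(phi)
    e_phi = -g_cut(theta_deg) * math.sin(phi)
    return e_theta, e_phi

if __name__ == "__main__":
    # Hypothetical principal-plane magnitudes, used only to exercise the rule.
    f_cut = lambda th: math.cos(math.radians(th)) ** 2.0
    g_cut = lambda th: math.cos(math.radians(th)) ** 3.0
    for phi in (0.0, 45.0, 90.0):
        print(phi, interpolate_two_cuts(f_cut, g_cut, 20.0, phi))
```

By construction the rule returns the supplied cuts exactly at φ = 0° and 90°, so any error it introduces appears only in the intermediate planes.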
It is important to note that the cos/sin interpolation in φ only works well for
circularly symmetric aperture distributions, which result in roughly circularly
symmetric far-field patterns. Good performance cannot be guaranteed for aperture
distributions in general with this interpolation. The cos/sin interpolation
was also tested on the elliptical symmetric reflector and the rectangular aperture
distribution, where the near fields at the aperture were computed using only the
two principal planes. The results in Figures 1.46 and 1.47 show that decent
agreement can be obtained for the elliptical case (although there are more
noticeable discrepancies in some areas), while poor results are obtained with the
rectangular aperture distribution. Thus caution must be exercised when applying
this interpolation scheme. Since the elliptical symmetric reflector has similar
patterns throughout φ, interpolating the pattern with simple cos/sin functions works
decently; however, this is not the case for the rectangular aperture distribution.
These observations can be confirmed by examining the far-field patterns for
Eθ and Eφ for the φ = 45° cut, as shown in Figure 1.48. In this figure, the far-field
patterns for the circular symmetric reflector, elliptical symmetric reflector, and the
rectangular aperture are compared between their original (exact) patterns and the
interpolated patterns using sin/cos interpolation. The φ = 45° cut is typically the
plane at which the largest discrepancies can be observed between the exact patterns
and the sin/cos interpolation. In the φ = 0° and 90° cuts, the interpolated patterns
are identical to the exact patterns due to the zeros of the sin/cos functions.
Therefore, the φ = 45° cut is the most interesting one to investigate.
Figure 1.45 Magnitude of Ex compared between the FFT approach with (two cuts) and without
interpolation in φ (all cuts) for several planes for the circular symmetric reflector
antenna. For all cases the geometry was set to D = 33.5λ and f = 19.03λ and the radiated
power was 100 W. (a) Plot along the x-axis (y = 0) for z = h − f = −15.34λ. (b) Plot
along the y-axis (x = 0) for z = h − f = −15.34λ. (c) Plot along the x-axis (y = 0) for
z = 10λ. (d) Plot along the y-axis (x = 0) for z = 10λ. (e) Plot along the x-axis
(y = 0) for z = 100λ. (f) Plot along the y-axis (x = 0) for z = 100λ.
Nearly identical far-field patterns can be observed for the circular symmetric
reflector due to its aperture distribution. However, the elliptical symmetric
reflector and the rectangular aperture show some deviations from the exact
patterns. Both the main-beam beamwidth and the sidelobe levels are noticeably
different in both cases. Of the two, the elliptical reflector shows better
agreement with the exact patterns in terms of both the main beam and the
sidelobes. This is because the elliptical symmetric reflector still has fairly
similar sidelobe levels in the φ = 45° cut compared to the φ = 0°, 90° patterns
shown in Figure 1.38. The exact pattern for the rectangular aperture has
significantly lower sidelobes in the φ = 45° cut than in the φ = 0°, 90° patterns
shown in Figure 1.22, which leads to a poor prediction by the sin/cos interpolation.
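The drop in diagonal-plane sidelobes for a rectangular aperture is easy to reproduce with a simplified model. The sketch below assumes a uniformly illuminated a × b aperture with the separable pattern sinc(X) sinc(Y), ignores the obliquity and polarization factors, and uses hypothetical function names; it is not the exact pattern function used for Figure 1.22.

```python
import math

def sinc(x):
    """Unnormalized sinc, sin(x)/x, with the limit value at x = 0."""
    return 1.0 if x == 0.0 else math.sin(x) / x

def pattern_db(theta_deg, phi_deg, a_wl, b_wl):
    """Normalized pattern (dB) of a uniform a x b aperture (sizes in wavelengths)."""
    s = math.sin(math.radians(theta_deg))
    x = math.pi * a_wl * s * math.cos(math.radians(phi_deg))
    y = math.pi * b_wl * s * math.sin(math.radians(phi_deg))
    mag = abs(sinc(x) * sinc(y))
    return 20.0 * math.log10(max(mag, 1e-12))

def first_sidelobe_db(phi_deg, a_wl, b_wl, step_deg=0.001, theta_max=10.0):
    """Scan theta outward: descend the main beam, then return the first local peak."""
    n = int(theta_max / step_deg)
    vals = [pattern_db(i * step_deg, phi_deg, a_wl, b_wl) for i in range(n + 1)]
    i = 1
    while i < n and vals[i] <= vals[i - 1]:   # walk down to the first null
        i += 1
    while i < n and vals[i + 1] >= vals[i]:   # climb to the first sidelobe peak
        i += 1
    return vals[i]

if __name__ == "__main__":
    for phi in (0.0, 45.0):
        level = first_sidelobe_db(phi, 33.5, 33.5)
        print(f"phi = {phi:4.1f} deg: first sidelobe {level:.1f} dB")
```

In this model the first sidelobe sits near −13.3 dB in the principal planes but near −26.5 dB at φ = 45°, because both sinc factors contribute there. A cos/sin combination of the two principal cuts keeps the sidelobes near the principal-plane level, which is why the interpolation predicts the diagonal plane poorly for this aperture.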
Figure 1.46 Magnitude of Ex compared between the FFT approach with (two cuts) and without
interpolation in φ (all cuts) for several planes for the elliptical symmetrical reflector. For
all cases the geometry was set to a = 16.75λ, b = 25λ, and f = 19.03λ with the radiated
power as 100 W at 10 GHz. (a) Plot along the x-axis (y = 0) for z = −10.82λ. (b) Plot
along the y-axis (x = 0) for z = −10.82λ.
Figure 1.47 Magnitude of Ex compared between the FFT approach with (two cuts) and without
interpolation in φ (all cuts) for several planes from the rectangular aperture case. For all
cases the geometry was set to a = b = 33.5λ and the radiated power was 87 W at 13.4
GHz. (a) Plot along the x-axis (y = 0) for z = 0. (b) Plot along the y-axis (x = 0) for z = 0.
Figure 1.48 Normalized far-field patterns of Eθ and Eφ compared between the exact values
(computed from simulation or the exact pattern function) and the cos/sin interpolation
in φ (two cuts) for the φ = 45° plane. (a) Eθ for the circular symmetric reflector antenna.
(b) Eφ for the circular symmetric reflector antenna. (c) Eθ for the elliptical symmetric
reflector antenna. (d) Eφ for the elliptical symmetric reflector antenna. (e) Eθ for the
rectangular aperture. (f) Eφ for the rectangular aperture. The dimensions for the circular
symmetric reflector antenna, elliptical symmetric reflector antenna, and the rectangular
aperture are the same as those listed in Figures 1.45–1.47, respectively.
REFERENCES
[1] P. Clemmow, The plane wave spectrum representation of electromagnetic fields, New York, NY:
Pergamon Press, Inc., 1966.
[2] R. Rudduck, D. Wu, and M. Intihar, “Near-Field Analysis by the Plane-Wave Spectrum Approach,”
IEEE Transactions on Antennas and Propagation, Vol. 21, No. 2, pp. 231–234, 1973.
[3] H. Booker and P. Clemmow, “The concept of an angular spectrum of plane waves, and its
relation to that of polar diagram and aperture distribution,” Proceedings of the IEE, Vol. 97,
No. 45, pp. 11–17, 1950.
[4] G. Evans, S. Dvorak, and S. Fast, “Efficient computation of Fresnel zone fields associated with
circular apertures,” Radio Science, Vol. 29, No. 4, pp. 705–715, 1994.
[5] E. Jull, “Radiation from Apertures,” in Antenna Handbook, Vol. 2, Y. Lo and S. Lee (eds.), New
York, NY: Van Nostrand Reinhold, 1993.
[6] R. Rudduck and C. Chen, “New Plane Wave Spectrum Formulations for the Near-Fields of
Circular and Strip Apertures,” IEEE Transactions on Antennas and Propagation, Vol. 24, pp.
438–449, 1976.
[7] O. Iupikov, et al., “Fast and Accurate Analysis of Reflector Antennas With Phased Array Feeds
Including Multiple Reflections Between Feed and Reflector,” IEEE Transactions on Antennas and
Propagation, Vol. 62, No. 7, pp. 3450–3462, 2014.
[8] P. Beeckman, “Prediction of the Fresnel region field of a compact antenna test range with serrated
edges,” IEE Proceedings on Microwaves, Antennas and Propagation, Vol. 133, No. 2, pp.
108–114, 1986.
[9] M. Gatti and Y. Rahmat-Samii, “FFT applications to plane-polar near-field antenna
measurements,” IEEE Transactions on Antennas and Propagation, Vol. 36, No. 6, pp. 781–791,
1988.
[10] J. McKay and Y. Rahmat-Samii, “Compact Range Reflector Analysis Using the Plane Wave
Spectrum Approach with an Adjustable Sampling Rate,” IEEE Transactions on Antennas and
Propagation, Vol. 39, No. 6, pp. 746–753, 1991.
[11] Y. Rahmat-Samii, “Surface Diagnosis of Large Reflector Antennas Using Microwave
Holographic Metrology: An Iterative Approach,” Radio Science, Vol. 19, No. 5, pp. 1205–1217,
1984.
[12] J. Wang, “An Examination of the Theory and Practices of Planar Near-Field Measurement,” IEEE
Transactions on Antennas and Propagation, Vol. 36, No. 6, pp. 746–753, 1988.
[13] J. Goodman, Introduction to Fourier Optics, 3rd ed., Greenwood Village, CO: Robert &
Company Publishers, 2005.
[14] C. Balanis, Advanced Engineering Electromagnetics, New York, NY: John Wiley & Sons, 2012.
[15] F. Ulaby, Fundamentals of Applied Electromagnetics, Upper Saddle River, NJ: Pearson, 2004.
[16] L. Shen and J. Kong, Applied Electromagnetism, Boston, MA: PWS Publishing, 1995.
[17] C. Balanis, Antenna Theory: Analysis and Design, New York, NY: John Wiley & Sons, 2005.
[18] D. Pozar, Microwave Engineering, New York, NY: John Wiley & Sons, 2011.
[19] A. Jerri, “The Shannon Sampling Theorem—Its Various Extensions and Applications: A Tutorial
Review,” Proceedings of the IEEE, Vol. 65, No. 11, pp. 1565–1596, 1977.
[20] M. Born and E. Wolf, Principles of Optics, Cambridge, UK: Cambridge University Press, 1997.
[21] D. Shepard, “A Two-Dimensional Interpolation Function for Irregularly-Spaced Data,”
Proceedings of the 1968 ACM National Conference, pp. 517–524, 1968.
[22] D. Watson and G. Philip, “Triangle Based Interpolation,” Journal of the International Association
for Mathematical Geology, Vol. 16, No. 8, pp. 779–795, 1984.
[23] “Fastest Fourier Transform in the West.” Online at http://www.fftw.org/.
[24] P. Swarztrauber, “FFTPACK.” Online at http://www.netlib.org/fftpack/.
[25] MATLAB and Statistics Toolbox Release 2012b, The MathWorks, Inc., Natick, MA.
[26] Y. Rahmat-Samii, “Useful Coordinate Transformations for Antenna Applications,” IEEE
Transactions on Antennas and Propagation, Vol. 27, No. 4, pp. 571–574, 1979.
[27] D. Duan and Y. Rahmat-Samii, “Novel Coordinate System and Rotation Transformations for
Antenna Applications,” Electromagnetics, Vol. 15, No. 1, pp. 17–40, 1995.
[28] V. Galindo-Israel and Y. Rahmat-Samii, “A New Look at Fresnel Field Computation Using the
Jacobi-Bessel Series,” IEEE Transactions on Antennas and Propagation, Vol. 29, No. 6, pp.
885–898, 1981.
[29] M. Hu, “Fresnel Region Fields of Circular Aperture Antennas,” Journal of Research of the
National Bureau of Standards, Section D, Vol. 65, pp. 137–147, 1961.
[30] R. Bickmore and R. Hansen, “Antenna Power Densities in the Fresnel Region,” Proceedings of
the IRE, Vol. 47, pp. 2119–2120, 1959.
[31] Y. Rahmat-Samii, “Reflector Antennas,” in Antenna Handbook, Y. Lo and S. Lee (eds.),
Vol. 2, ch. 15, New York, NY: Van Nostrand Reinhold, 1993.
[32] D. Duan and Y. Rahmat-Samii, “A generalized diffraction synthesis technique for high
performance reflector antennas,” IEEE Transactions on Antennas and Propagation, Vol. 43,
No. 1, pp. 27–40, 1995.
Chapter 2
High-Order FDTD Methods
Mohammed F. Hadi and Atef Z. Elsherbeni
larger physical models with denser digital spaces severely limits FDTD
simulations to dozens of wavelengths at best, even when using the latest and
greatest of today’s supercomputers and other hardware acceleration techniques.
A group of higher-order FDTD methods has been designed to achieve minimal and
nearly isotropic numerical phase velocity errors. Employing any of these methods
should, in principle, yield extremely accurate simulated results for problem
scales in the thousands of wavelengths while using grids that are coarse relative
to the largest wavelength used in the simulation. This promise, however, never
translated to wide acceptance by the FDTD community. This is due to an unfortunate
combination of (1) inexperienced use of these methods without a full
understanding of their theoretical underpinnings, and (2) marrying them with
ancillary modeling tools and practices that were designed for, and are thus
limited by, the same anisotropic and large phase errors as standard FDTD.
This chapter will detail the theoretical basis and analysis of a high-order
FDTD method that has received continuous development over the years and
benefited from a fully designed suite of high-order ancillary modeling tools that
matches its phase accuracy performance. These modeling tools will in turn be fully
explained and verified, paying closer attention to the more critical ones: point and
planar wave initiations, absorbing boundary conditions, and planar and curved
PEC modeling. The chapter will conclude with a brief introduction to advanced
forms of this high-order method which offer substantial performance gains at the
expense of higher complexity of implementation.
Figure 2.1 The building block Yee cell for most FDTD algorithm variants.
1 1
n n
Ey 2 Ey 2
K a K
i, j , k i, j , k
Hx
n
Hx
n b H n
Hx
n
t z i, j , k
1
i, j , k
1 3z x i, j, k
3
i, j , k
3
2 2 2 2
Ka n n Kb n n
H z |i 1 , j , k H z |i 1 , j , k H z |i 3 , j , k H z |i 3 , j , k (2.2b)
x 2 2 3x 2 2
1 1
n n
Ez 2
i, j , k
Ez 2
i, j , k Ka n n Kb n n
Hy Hy
t
x
1
i , j,k
1
3x H y i 3 , j , k H y i 3 , j , k
i , j,k
2 2 2 2
Ka n K
H x |i , j 1 , k H x |n 1 b H x |n 3 H x |n 3 (2.2c)
y 2
i , j ,k
2 3y i , j
2
, k i , j ,
2
k
86 Advanced Computational Electromagnetic Methods and Applications
1 1
n n
H x |i , j ,2k H x |i , j ,2k Ka Kb
n n
E y |i , j , k 1 E y |i , j , k 1
n n
E y |i , j , k 3 E y |i , j , k 3
t z 2 2 3z 2 2
Ka n n Kb n n
Ez |i , j 1 , k Ez |i , j 1 , k Ez |i , j 3 ,k Ez |i , j 3 ,k (2.2d)
y 2 2 3y 2 2
1 1
n n
H y |i , j ,2k H y |i , j ,2k Ka Kb
n n
Ez |i 1 , j , k Ez |i 1 , j , k
n n
Ez |i 3 , j , k Ez |i 3 , j , k
t x 2 2 3x 2 2
Ka n n Kb n n
Ex |i , j , k 1 Ex |i , j , k 1 Ex |i , j , k 3 Ex |i , j , k 3 (2.2e)
z 2 2 3z 2 2
1 1
n n
H z |i , j ,2k H z |i , j ,2k Ka Kb
n n
Ex |i , j 1 , k Ex |i , j 1 , k
n n
Ex |i , j 3 , k Ex |i , j 3 , k
t y 2 2 3y 2 2
Ka n n Kb n n
E y |i 1 , j , k E y |i 1 , j , k E y |i 3 , j , k E y |i 3 , j , k (2.2f)
x 2 2 3x 2 2
where $\Delta t$ is the temporal step, $\Delta x$, $\Delta y$, $\Delta z$ are the spatial steps, and n, (i, j, k) are
temporal and spatial indices in the 3-D FDTD grid. The coefficients $K_a$ and $K_b$ will be
carried through as variables to generalize the entire treatment in this chapter to
high-order FDTD variants that use different coefficient values. For S24 in particular,
their Taylor-series-derived values are $K_a = 9/8$ and $K_b = -1/8$. It might seem
from the above equations that the E and H field values are updated at the same
time step, n. This is only a notational license that simplifies the derivations of
the dispersion and stability analyses, and it agrees with most of the cited
literature for this chapter. In practice, the E and H field values are
updated in a leap-frog manner, as in standard FDTD.
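As a quick illustration of why the Taylor-derived values $K_a = 9/8$ and $K_b = -1/8$ give fourth-order spatial accuracy, the following minimal MATLAB sketch applies the extended stencil used in (2.2) to a known 1-D function and compares it with the standard two-point central difference. The test function sin(x), the evaluation point x0, and the two step sizes are arbitrary assumptions chosen only for the demonstration; they are not taken from the chapter.

```matlab
% Sketch: observed convergence order of the two-point (S22-type) and
% extended (S24-type) spatial differences on a smooth 1-D test function.
Ka = 9/8;  Kb = -1/8;          % S24 coefficients quoted in the text
f  = @(x) sin(x);              % assumed test function
x0 = 0.7;                      % assumed evaluation point
exact = cos(x0);               % exact derivative of sin at x0
h  = [0.2 0.1];                % two step sizes, the second half the first

% Standard two-point central difference across one cell
d2 = (f(x0 + h/2) - f(x0 - h/2))./h;
% Extended stencil: Ka-weighted inner pair plus Kb-weighted outer pair
d4 = Ka*(f(x0 + h/2) - f(x0 - h/2))./h ...
   + Kb*(f(x0 + 3*h/2) - f(x0 - 3*h/2))./(3*h);

e2 = abs(d2 - exact);
e4 = abs(d4 - exact);
order2 = log2(e2(1)/e2(2));    % observed order of the two-point stencil
order4 = log2(e4(1)/e4(2));    % observed order of the extended stencil
fprintf('observed orders: two-point %.2f, extended %.2f\n', order2, order4);
```

Halving the step should cut the two-point error by a factor close to 4 and the extended-stencil error by a factor close to 16, so the two observed orders come out near 2 and 4, respectively.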
The first order of business when deriving or developing a new FDTD
algorithm is to ascertain its stability limit and its dispersion relation. The latter
governs the algorithm’s numerical dispersion error bounds in the discrete space and
is vital to understand and utilize correctly when developing the various modeling
tools. One approach to derive both together is to inject the above difference
equations with the trial plane wave solution
$A \exp\left[j\left(\omega n\Delta t - \tilde\beta_x n_x\Delta x - \tilde\beta_y n_y\Delta y - \tilde\beta_z n_z\Delta z\right)\right]$
(where $\tilde\beta$ is the wave number as numerically rendered by the FDTD grid)
and construct a discrete-operator system of equations [3]. Each set of difference
operators will correspond to what could be called a discrete operator. For example,
(2.1) above will morph into
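$$\frac{\epsilon E_x}{\Delta t}\left(e^{j\omega\Delta t/2} - e^{-j\omega\Delta t/2}\right) = \frac{K_a H_z}{\Delta y}\left(e^{-j\tilde\beta_y\Delta y/2} - e^{j\tilde\beta_y\Delta y/2}\right) + \frac{K_b H_z}{3\Delta y}\left(e^{-3j\tilde\beta_y\Delta y/2} - e^{3j\tilde\beta_y\Delta y/2}\right) - \frac{K_a H_y}{\Delta z}\left(e^{-j\tilde\beta_z\Delta z/2} - e^{j\tilde\beta_z\Delta z/2}\right) - \frac{K_b H_y}{3\Delta z}\left(e^{-3j\tilde\beta_z\Delta z/2} - e^{3j\tilde\beta_z\Delta z/2}\right) \tag{2.3a}$$

$$\epsilon E_x\,\frac{\sin(\omega\Delta t/2)}{\Delta t/2} = -K_a H_z\,\frac{\sin(\tilde\beta_y\Delta y/2)}{\Delta y/2} - K_b H_z\,\frac{\sin(3\tilde\beta_y\Delta y/2)}{3\Delta y/2} + K_a H_y\,\frac{\sin(\tilde\beta_z\Delta z/2)}{\Delta z/2} + K_b H_y\,\frac{\sin(3\tilde\beta_z\Delta z/2)}{3\Delta z/2} \tag{2.3b}$$

$$\epsilon D_t E_x = D_y H_z - D_z H_y \tag{2.3c}$$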
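The other five update equations will morph into similarly succinct discrete-
operator equations, which could be grouped in matrix form

$$\begin{bmatrix}
\epsilon D_t & 0 & 0 & 0 & D_z & -D_y \\
0 & \epsilon D_t & 0 & -D_z & 0 & D_x \\
0 & 0 & \epsilon D_t & D_y & -D_x & 0 \\
0 & -D_z & D_y & \mu D_t & 0 & 0 \\
D_z & 0 & -D_x & 0 & \mu D_t & 0 \\
-D_y & D_x & 0 & 0 & 0 & \mu D_t
\end{bmatrix}
\begin{bmatrix} E_x \\ E_y \\ E_z \\ H_x \\ H_y \\ H_z \end{bmatrix} = 0 \tag{2.4}$$

$$D_x = -jK_a\,\frac{\sin(\tilde\beta_x\Delta x/2)}{\Delta x/2} - jK_b\,\frac{\sin(3\tilde\beta_x\Delta x/2)}{3\Delta x/2} \tag{2.5b}$$

$$D_y = -jK_a\,\frac{\sin(\tilde\beta_y\Delta y/2)}{\Delta y/2} - jK_b\,\frac{\sin(3\tilde\beta_y\Delta y/2)}{3\Delta y/2} \tag{2.5c}$$

$$D_z = -jK_a\,\frac{\sin(\tilde\beta_z\Delta z/2)}{\Delta z/2} - jK_b\,\frac{\sin(3\tilde\beta_z\Delta z/2)}{3\Delta z/2} \tag{2.5d}$$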
Setting the determinant of the above system of equations to zero will result in
the algorithm’s dispersion relation
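$$\mu\epsilon D_t^2 = D_x^2 + D_y^2 + D_z^2 \tag{2.6}$$

$$\Delta t_{\max} = \frac{\sqrt{\mu\epsilon}}{\left|K_a - \dfrac{K_b}{3}\right|\sqrt{\dfrac{1}{\Delta x^2} + \dfrac{1}{\Delta y^2} + \dfrac{1}{\Delta z^2}}} \tag{2.7}$$

$$\Delta t_{\max} = \frac{h\sqrt{\mu\epsilon}}{\sqrt{3}}\cdot\frac{1}{\left|K_a - K_b/3\right|} \tag{2.8}$$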
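The uniform-grid stability limit can be evaluated directly. The short MATLAB sketch below uses the same expression that appears in the chapter's CPML listing, `h/(co*sqrt(3)) * 1/abs(Ka - Kb/3)`, and compares S24 with standard FDTD (obtained by setting $K_a = 1$, $K_b = 0$). The frequency and grid density are assumed values used only to produce concrete numbers.

```matlab
% Sketch: maximum stable time step on a uniform grid for S24 and FDTD
c0 = 299792458;                 % free-space wave speed (m/s)
f  = 1e9;                       % assumed design frequency (Hz)
R  = 20;                        % assumed grid density (cells per wavelength)
h  = (c0/f)/R;                  % uniform cell size (m)

% Stability limit as a function of the stencil coefficients
dtmax = @(Ka, Kb) h/(c0*sqrt(3)) / abs(Ka - Kb/3);

dt_S24 = dtmax(9/8, -1/8);      % S24 coefficients
dt_S22 = dtmax(1, 0);           % standard FDTD (S22)
ratio  = dt_S24/dt_S22;
fprintf('dt_max: S24 = %.4g s, FDTD = %.4g s, ratio = %.4f\n', ...
        dt_S24, dt_S22, ratio);
```

With the S24 coefficients, $K_a - K_b/3 = 7/6$, so the S24 limit is 6/7 of the standard FDTD limit on the same grid.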
Contrary to FDTD, the S24 maximum time step does not coincide with
optimum phase accuracy. This behavior, often unexpected by the casual user, is
caused by the imbalance of differencing order between the spatial
and temporal domains. The optimum time step that minimizes numerical
dispersion error can be found through detailed analysis of the dispersion relation
solutions. The following empirical formula can be used to predict this optimum
value [5]

$$\Delta t_{\mathrm{optimum}} = \frac{\Delta t_{\max}}{0.335R + 0.40} \tag{2.9}$$

where $R = \lambda/h$ is the grid density in FDTD cells per wavelength of interest. This
formula is independent of absolute frequency, as the dependence on frequency is
embedded within $\Delta t_{\max}$.
The inherent phase error in S24 (and FDTD) can be observed by comparing
the numerical wave number, $\tilde\beta$, derived from the dispersion relation with its exact
continuous-space value, $\beta$. This error changes with propagation direction within
the discrete space. A global measure of this error that accounts for all propagation
directions can be constructed as

$$\Phi = \frac{1}{4\pi}\int_0^{\pi}\!\!\int_0^{2\pi}\left|\frac{\tilde\beta(\theta,\phi) - \beta}{\beta}\right|^2 \sin\theta\, d\phi\, d\theta \tag{2.10}$$
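To make the comparison between the numerical and exact wave numbers concrete, the following MATLAB sketch solves the dispersion relation (2.6) for the numerical wave number along a single propagation direction, once with the S24 coefficients and once with the FDTD coefficients. It is only an illustration under stated assumptions: the frequency, grid density, time step (one-tenth of the FDTD limit), and propagation direction are all assumed values, and the common factor j of the discrete operators is dropped so that only real magnitudes are handled. Repeating the calculation over a grid of directions and averaging would give a global measure of the kind described above.

```matlab
% Sketch: per-direction phase error of S24 versus FDTD from (2.6)
c0 = 299792458;  f = 1e9;  w = 2*pi*f;   % assumed design frequency
R  = 20;         h = (c0/f)/R;           % assumed grid density and cell size
k0 = w/c0;                               % exact wave number
dt = 0.1*h/(c0*sqrt(3));                 % assumed time step (illustrative)

Dt = sin(w*dt/2)/(dt/2)/c0;              % temporal operator magnitude over c0
% Spatial operator magnitude for one wave number component k
D  = @(k, Ka, Kb) Ka*sin(k*h/2)/(h/2) + Kb*sin(3*k*h/2)/(3*h/2);

th = pi/2;  ph = 0;                      % assumed direction: along the x-axis
u  = [sin(th)*cos(ph), sin(th)*sin(ph), cos(th)];

% Residual of the dispersion relation for numerical wave number k
F = @(k, Ka, Kb) D(k*u(1),Ka,Kb)^2 + D(k*u(2),Ka,Kb)^2 ...
               + D(k*u(3),Ka,Kb)^2 - Dt^2;

k24 = fzero(@(k) F(k, 9/8, -1/8), k0);   % S24 numerical wave number
k22 = fzero(@(k) F(k, 1, 0), k0);        % FDTD numerical wave number

err24 = abs(k24 - k0)/k0;
err22 = abs(k22 - k0)/k0;
fprintf('relative phase error: S24 %.2e, FDTD %.2e\n', err24, err22);
```

Under these assumptions the S24 error along the grid axis should come out well over an order of magnitude below the FDTD error.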
There are many situations when a user desires to implement S24 in a hybrid
simulation with FDTD. An example of such a situation would be modeling the
vicinity of perfect electric conductor (PEC) boundaries or absorbing boundary
layers with regular FDTD in an otherwise global high-order implementation. A
wave that traverses a virtual boundary in an FDTD grid between two regions, one
updated with S24 and another updated with FDTD, would encounter a numerical
impedance mismatch. As with the continuous domain planar interfaces theory, this
mismatch would cause total wave reflections or surface waves if the wave angle of
incidence upon the virtual boundary is steep enough. The reflection coefficient of
such an interface could be accurately predicted by the following formula [6]
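$$\Gamma = \frac{1 - \dfrac{\cos\phi_2^{P}\,\cos(\tilde\beta_{2x}h/2)}{\cos\phi_1^{P}\,\cos(\tilde\beta_{1x}h/2)}}{1 + \dfrac{\cos\phi_2^{P}\,\cos(\tilde\beta_{2x}h/2)}{\cos\phi_1^{P}\,\cos(\tilde\beta_{1x}h/2)}} \tag{2.11}$$

where

$$\phi^{P} = \tan^{-1}\!\left(\frac{D_y}{D_x}\right) \tag{2.12}$$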
assuming the plane of incidence coincides with the x-y plane and the virtual
interface is along the y-axis. Applying this formula starts with specifying a value
for the incidence angle from medium 1 (S24) into medium 2 (S22). The x- and y-
components of $\tilde\beta_1$ are then computed using the dispersion relation in medium 1.
$\tilde\beta_{2x}$ is computed next, after enforcing $\tilde\beta_{2y} = \tilde\beta_{1y}$ at the interface, using the
dispersion relation of medium 2. Both $\phi^{P}$ values are then computed to eventually
yield the reflection coefficient, $\Gamma$. The necessary dispersion relation is based on
(2.6) for S24. The same relation can be used for FDTD after setting $K_a = 1$ and
$K_b = 0$. At a typical resolution of 20 cells per wavelength, the reflection coefficient
affecting a wave transiting from an S24 medium to an FDTD medium stays
below −60 dB for all incidence angles from normal incidence to 45°. As the
incidence angle grows steeper, however, the reflection coefficient rises rapidly
toward total reflection. This can introduce serious simulation errors in wave
resonance applications or where the virtual interface spans multiple wavelengths.
The following is a complete MATLAB code that computes the reflection
coefficient, $\Gamma$, across an S24/S22 interface:
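A minimal sketch of such a computation is given here. It is not the authors' listing; it follows the procedure described above under several assumptions that should be read as such: a uniform cell size h in both media, an assumed time step of one-tenth of the FDTD stability limit, real operator magnitudes with the common factor j dropped, and the angle of (2.12) evaluated from each medium's own discrete operators (`P1`, `P2`). All variable names are hypothetical.

```matlab
% Sketch: spurious reflection at an S24 (medium 1) / S22 (medium 2) interface
c0 = 299792458;  f = 1e9;  w = 2*pi*f;    % assumed design frequency
R  = 20;         h = (c0/f)/R;            % assumed grid density and cell size
dt = 0.1*h/(c0*sqrt(3));                  % assumed time step (illustrative)
Dt = sin(w*dt/2)/(dt/2)/c0;               % temporal operator magnitude over c0
D  = @(k, Ka, Kb) Ka*sin(k*h/2)/(h/2) + Kb*sin(3*k*h/2)/(3*h/2);

phi = (0:89)*pi/180;                      % incidence angles from medium 1
GdB = zeros(size(phi));
for m = 1:numel(phi)
    C = cos(phi(m));  S = sin(phi(m));
    % Medium 1: solve the S24 dispersion relation for the wave number
    k1  = fzero(@(k) D(k*C,9/8,-1/8)^2 + D(k*S,9/8,-1/8)^2 - Dt^2, w/c0);
    k1x = k1*C;  k1y = k1*S;
    Dx1 = D(k1x, 9/8, -1/8);  Dy1 = D(k1y, 9/8, -1/8);
    % Medium 2: tangential wave number matched, k2y = k1y, S22 operators
    Dy2 = D(k1y, 1, 0);
    Dx2 = sqrt(max(Dt^2 - Dy2^2, 0));     % from the S22 dispersion relation
    k2x = (2/h)*asin(min(Dx2*h/2, 1));    % invert Dx2 = sin(k2x*h/2)/(h/2)
    % Angles of (2.12) in each medium, then the ratio used in (2.11)
    P1 = atan2(Dy1, Dx1);  P2 = atan2(Dy2, Dx2);
    X  = cos(P2)*cos(k2x*h/2)/(cos(P1)*cos(k1x*h/2));
    GdB(m) = 20*log10(abs((1 - X)/(1 + X)));
end
plot(phi*180/pi, GdB); grid on;
xlabel('Incidence angle (degrees)'); ylabel('Reflection coefficient (dB)');
```

Under these assumptions the curve should start far below −60 dB at normal incidence and climb toward 0 dB as the incidence angle approaches grazing, which is the qualitative behavior described in the text. Only the magnitude of the reflection coefficient is plotted.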
Figure 2.2 Collapsing S24 into FDTD (S22) normally at planar boundaries while maintaining the
high phase accuracy of S24 differencing along the transverse plane. (Figure labels: PEC boundary; x- and y-axes.)
Figure 2.3 (plot): reflection coefficient (dB) versus incidence angle (degrees), with and without phase-matching.
One way to mitigate this issue is to adjust the interfacing algorithms such that
their tangential numerical wave numbers are identical. In the example above, this
could be accomplished by modifying medium 2 with S24 differencing along the
y-oriented interface, while maintaining FDTD differencing along the normal x-axis
to facilitate dealing with planar PEC boundaries (see Figure 2.2). Implementing
this seamless hybrid approach will ensure that normal incidence reflection errors
will be the upper error bounds for all wave incidence angles upon the cross-
algorithm interface. Figure 2.3 demonstrates the effect of this phase-matching on
cross-algorithm spurious reflections as explained here.
Applying any of the perfectly matched layer (PML) absorbing boundary conditions
for S24 is basically the same as with FDTD, since both are using the same
temporal differencing order. The established empirical formulas in the literature
for the split-field or uni-axial PML forms equally apply and provide accurate
values for optimum PML parameters. For simulations where there is enough
separation between scatterers and PML regions to ensure little or no steeply
impinging waves on the PML boundaries, regular PML will perform wonderfully,
and there will be no added benefit from using the convolutional PML (CPML).
There are situations, however, when PML regions must stay in close proximity
to large scatterers because of limited computing resources, so that steep wave
incidence and even wave evanescence cannot be avoided. CPML is mandatory in
such situations to effectively absorb all outgoing energy. Furthermore, extreme
care is required when selecting the optimum values for all six of CPML’s
parameters. The above-mentioned empirical formulas do not apply in such
situations, so the user should consider exhaustive-search optimization to find these
optimum parameters. This means running the entire model dozens and
often hundreds of times and comparing the results with a much larger reference
simulation, a brute-force and extremely time-consuming task even for relatively
small simulations.
This approach is impractical for electrically large problems that usually call
for the use of high-order FDTD algorithms such as S24. For such situations, a
direct optimization approach that avoids multiple simulation runs is required. Hadi
recently presented such an approach for FDTD and high-order FDTD algorithms
[7]. The mathematical manipulations in that reference cannot be summarized here
without losing clarity, and the reader is referred to Section III there with the sole
change of redefining Equation (21) in reference [7] to become $K_y = K_b/3$. While
implementing the procedure there is an involved process, it will guarantee
optimum CPML parameters for various situations in the span of a few minutes.
The following two MATLAB program listings work together, using functions from
MathWorks’ Global Optimization Toolbox, to compute optimum values of the
CPML parameters: $\sigma_{\max}$, $n_\sigma$, $\kappa_{\max}$, $n_\kappa$, $a_{\max}$, and $n_a$. Computations account for large
scatterers located in very close proximity to the CPML boundary through

$$\cosh(\chi) = 1 + \left(\frac{1}{\beta_{\min}\, w_{\max}}\right)^{2} \tag{2.13}$$
v = [10 3 10 0.3];
% [sigmax nsig=nkap kapmax amax], na = 1: initial guess
A = [-1 0 0 0; 0 -1 0 0; 0 0 -1 0; 0 0 0 -1];
b = [0; 1; 1; 0];
upper = [100 10 100 100];
lower = [0 1 1 0];
opts = optimset('Algorithm', 'active-set', 'TolX', 0.0001, ...
    'TolFun', 0.0001, 'MaxIter', 2000, 'MaxFunEvals', 2000);
gs = GlobalSearch('Display', 'iter');
problem = createOptimProblem('fmincon', 'x0', v, ...
    'objective', @S24PlateThPars, 'Aineq', A, 'bineq', b, ...
    'ub', upper, 'lb', lower, 'options', opts);
[xming, fming, flagg, outptg, manyminsg] = run(gs, problem);
disp('Optimum [sigmax nsig=nkap kapmax amax na] values:')
Opt_CPML = [xming 1]
disp('Max Refl. Coeff. (in dB) across desired CPML incidence angles:')
Gamma_Max = fming
% Program called by the optimization routine
function GError = S24PlateThPars(v)
sigxmax = v(1);
nsig = v(2);
kapxmax = v(3);
nkap = v(2);
axmax = v(4);
na = 1;
% Optimization is performed for the entire range (0 - max_incidence)
% Max incidence angle on CPML layer, <= 88 degrees
max_incidence = 85;
j = sqrt(-1);
Ka = 9/8;
Kb = -1/8;
KKy = Kb/3;
% Design frequency or where most scattered energy is expected
f = 1e9;
w = 2 * pi * f;
epso = 8.854e-12;
muo = 4 * pi * 1e-7;
co = 1/sqrt(muo * epso);
ko = w/co;
lambda = co/f;
% Grid resolution in cells per minimum wavelength of interest
R = 20;
% Number of CPML layers
N = 10;
h = lambda/R;
% Max dt is used
dt = h/(co * sqrt(3)) * 1/abs(Ka - Kb/3);
thvec = 0 : 1 : 90;
thvec = thvec(1 : length(thvec) - 1);
Gammavec = [];
% Assuming a scatterer with 10 cm largest dimension
scat = 0.1;
% in close proximity to CPML boundary
% Assuming minimum freq. of interest is 1/2 design freq.
fmin = f/2;
kmin = 2 * pi * fmin/co;
% Set chi = 0 if no evanescence is expected
chi = acosh(1 + (1/(kmin * scat))^2);
% Sweeping across all incidence angles, 0 - 90 degrees:
for i = 1 : length(thvec)
th = thvec(i) * pi/180;
oldk = k1;
Dx1 = Ka * sin(k1 * C * h/2)/(h/2) + Kb * sin(3 * k1 * C * h/2)/(3 * h/2);
Dy = Ka * sin(k1 * S * h/2)/(h/2) + Kb * sin(3 * k1 * S * h/2)/(3 * h/2);
dDx1 = Ka * C * cos(k1 * C * h/2) + Kb * C * cos(3 * k1 * C * h/2);
dDy = Ka * S * cos(k1 * S * h/2) + Kb * S * cos(3 * k1 * S * h/2);
fun = (Dx1/Dtx1)^2 + (Dy/Dty)^2 - muo * epso;
dfun = 2 * Dx1 * dDx1/Dtx1^2 + 2 * Dy * dDy/Dty^2;
k1 = k1 - fun/dfun;
end
kx1 = k1 * C;
C = co * Dx1/Dtx1;
alpha = co * dt/h/C;
D = zeros(2 * N + 2, 1);
for n = 4 : 2 * N + 2
sigx = sigxmax * ((n - 3)/(2 * N))^nsig;
kapx = 1 + (kapxmax - 1) * ((n - 3)/(2 * N))^nkap;
alphx = axmax * ((2 * N - (n - 3))/(2 * N))^na;
Ax = 1;
px = exp(-(sigx/kapx + alphx) * dt/epso);
if sigx == 0,
qx = 0;
else
qx = sigx * (px-1)/(kapx * (sigx + alphx * kapx));
end
Bx = 1/kapx + qx/(1 - px * exp(-j * w * dt));
Omx = (exp(j * w * dt/2) - Ax * exp(-j * w * dt/2))/(j * 2 * Bx);
D(n) = 1/(2 * j * Omx);
end
M(n, n) = 1;
M(n, n + 1) = -M(n, n - 1);
M(n, n + 3) = -M(n, n - 3);
end
Injecting point sources in high-order FDTD follows the same guidelines as for
FDTD. Hard sources such as current sources representing antennas and input
probes are injected by simply replacing the update equation at the source location
with the time varying source function. Injecting soft (field) sources involves
adding the source function to the existing update equation at the source location.
The propagated waveform due to a soft source differs slightly from the intended
source function. In this regard, the FDTD discrete system of equations acts as a
pseudo-circuit with its own impulse response, which causes reshaping the injected
soft source function and propagating a slightly modified waveform. This effect is a
function of the Yee grid parameters as well as of the implemented FDTD
algorithm parameters. To counteract this grid/algorithm effect on the desired field
injection, the grid/algorithm impulse response, h[n] , needs to be measured using a
matching grid/algorithm simulation that is unbounded and populated
homogeneously with the same medium hosting the source location [9]. The
impulse response is then stored and reused in the actual simulation run by
convolving it with the field source function of choice, f [n] :
$$E_z|^{n}_{i_s,j_s,k_s} = (\text{update equation}) + f|^{n} - \sum_{l=0}^{n-1} h[n-l]\, f|^{l+1} \tag{2.14}$$
More critically, obtaining impulse response measurements that are long enough to
encompass the entire simulation run can quickly become prohibitive due to
memory and run time limitations imposed on the unbounded reference simulation.
However, those IIR filters could be reliably constructed using impulse response
measurements that are only a few hundred time steps long.
This process starts by constructing the IIR filter, H(z), from the collected time
measurements using, for example, MATLAB’s prony function:

$$H(z) = \frac{\displaystyle\sum_{k=1}^{5} b_k z^{-k}}{1 + \displaystyle\sum_{k=1}^{5} a_k z^{-k}} \tag{2.15}$$

This filter can then be used in the actual simulation to generate a synthesized
impulse response on the fly

$$h_{\mathrm{IIR}}[n] = \sum_{k=1}^{5}\left(b_k\, x[n-k] - a_k\, h_{\mathrm{IIR}}[n-k]\right) \tag{2.16}$$

where

$$x[n] = \begin{cases} 1, & n = 0 \\ 0, & n \neq 0 \end{cases} \tag{2.17}$$

The above choice of fifth-order filters will ensure that source injection error
levels remain below −90 dB. Lower error levels could be obtained by constructing
higher-order IIR filters.
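The recursion in (2.16) can be checked against MATLAB's built-in `filter` function. In the sketch below the coefficient values are purely hypothetical placeholders (they are not fitted to any FDTD impulse response); in practice they would come from a fit such as the one mentioned above. The vectors `b` and `a` hold the coefficients of $z^{0}, z^{-1}, \ldots, z^{-5}$, with `b(1) = 0` and `a(1) = 1` so that they match the form of (2.15).

```matlab
% Sketch: synthesize an impulse response with the recursion of (2.16)
% and compare with MATLAB's filter(). Coefficients are hypothetical.
b = [0  0.5  0.2 -0.1  0.05 0.01];   % b_0 = 0, then b_1..b_5 (placeholders)
a = [1 -0.6  0.2  0.05 -0.02 0.01];  % a_0 = 1, then a_1..a_5 (placeholders)
N = 50;                              % number of time steps to synthesize
x = [1 zeros(1, N)];                 % discrete impulse of (2.17), x[0] = 1

hIIR = zeros(1, N + 1);              % hIIR(n+1) stores h_IIR[n]
for n = 0:N
    acc = 0;
    for k = 1:5
        if n - k >= 0                % terms with negative time index vanish
            acc = acc + b(k+1)*x(n-k+1) - a(k+1)*hIIR(n-k+1);
        end
    end
    hIIR(n+1) = acc;
end

href = filter(b, a, x);              % same response from the built-in filter
maxdiff = max(abs(hIIR - href));
fprintf('max difference between recursion and filter(): %.2e\n', maxdiff);
```

Running the recursion inside the FDTD time loop, one sample per step, avoids storing a long measured impulse response; the comparison with `filter` is only a consistency check of the indexing.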
The following is part of a MATLAB program that records the impulse
response of the FDTD grid/algorithm and computes its corresponding IIR filter.
The number of time steps needed is generally 100 to 200 to get an accurate IIR
filter.
(Grid/algorithm initializations)
% Impulse response, N is number of time steps
IR = zeros(N+1, 1);
% Initial value at source location
Ez(Is, Js, Ks) = 1;
% Time loop begins
for n = 1 : N,
(update H fields)
IR(n+1)=(Ez update equation at source location)
(update E fields)
% simulating a discrete impulse function hard source
Ez(Is, Js, Ks)=0;
end % Time loop ends
The following is part of a MATLAB program that utilizes the computed IIR
filter above. Grid/algorithm parameters must be the same. Number of time steps
could be smaller or much larger than the one used to derive the IIR filter.
(Grid/algorithm initializations)
(Input or read the IIR filter parameters a and b)
% Computes the filter’s impulse response
H = impz(b, a, N+1);
Introducing plane wave sources into an FDTD grid for scattering-type problems is
best performed using a total-field/scattered-field (TFSF) approach [2]. This
approach has recently been perfected to produce machine-precision
accuracy [11], with the introduction of a 1-D propagator that coincides perfectly
with the main FDTD grid in terms of precise source field mapping, finite-
difference matching, and identical numerical dispersion characteristics. Precise
source mapping is accomplished by limiting the plane wave incidence angles
to rational ratios of numbers of FDTD cells along the y- and x-directions (assuming
the plane wave is injected within the x-y plane of the FDTD grid). For example,
instead of selecting an incidence angle of $\phi = 20°$, one would choose instead
$\phi = \tan^{-1}(m_y/m_x) = \tan^{-1}(7/19) \approx 20.2°$. When this choice is coupled with a cell size
of

$$\Delta r = \frac{h\cos\phi}{m_x} \tag{2.18}$$

along the 1-D propagator, mapping source values from the propagator to the main
grid would simplify to direct substitutions that avoid error-causing interpolations
as shown in Figure 2.4. The 1-D propagator is then populated with a colocated
$(H_x^s, H_y^s)$ pair and a colocated $(E_{zx}^s, E_{zy}^s)$ pair that are staggered by a $\Delta r/2$
distance.
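The reason a rational angle removes the need for interpolation can be checked numerically. The MATLAB sketch below uses the example values from the text ($m_x = 19$, $m_y = 7$) and verifies that the projection of a main-grid node onto the propagation direction is an integer multiple of the propagator cell size of (2.18). The cell size h and the chosen node indices are arbitrary assumptions for the demonstration.

```matlab
% Sketch: rational incidence angle and 1-D propagator cell size, (2.18)
mx = 19;  my = 7;               % integer cell counts from the text's example
h  = 0.015;                     % assumed main-grid cell size (m)

phi     = atan(my/mx);          % incidence angle in the x-y plane (rad)
phi_deg = phi*180/pi;
dr      = h*cos(phi)/mx;        % propagator cell size from (2.18)

% Projection of an arbitrary main-grid node (i, j) onto the propagation axis
i = 4;  j = -3;                 % assumed node indices
r = h*(i*cos(phi) + j*sin(phi));
steps = r/dr;                   % number of propagator cells, should be integer

fprintf('phi = %.2f deg, dr = %.4g m, r/dr = %.6f\n', phi_deg, dr, steps);
```

Because $\tan\phi = m_y/m_x$, the ratio `r/dr` reduces to `i*mx + j*my`, which is 55 for the node chosen here, so every main-grid node lands exactly on a propagator node.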
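The finite-difference matching is accomplished through modifying the
difference operator such that every half-step in the main grid is matched with
$m_x/2$ steps for x-differencing and $m_y/2$ steps for y-differencing. The $m_x$ and
$m_y$ values need to be odd integers for proper matching. The corresponding update
equations along the 1-D propagator that will result in an identical dispersion
relation to the S24 algorithm of the main grid would then be [12]

$$\epsilon\,\frac{E_{zx}^{s}|^{n+\frac12}_{m} - E_{zx}^{s}|^{n-\frac12}_{m}}{\Delta t} = \frac{K_a}{h}\left(H_y^{s}|^{n}_{m+\frac{m_x}{2}} - H_y^{s}|^{n}_{m-\frac{m_x}{2}}\right) + \frac{K_b}{3h}\left(H_y^{s}|^{n}_{m+\frac{3m_x}{2}} - H_y^{s}|^{n}_{m-\frac{3m_x}{2}}\right) \tag{2.19a}$$

$$\epsilon\,\frac{E_{zy}^{s}|^{n+\frac12}_{m} - E_{zy}^{s}|^{n-\frac12}_{m}}{\Delta t} = -\frac{K_a}{h}\left(H_x^{s}|^{n}_{m+\frac{m_y}{2}} - H_x^{s}|^{n}_{m-\frac{m_y}{2}}\right) - \frac{K_b}{3h}\left(H_x^{s}|^{n}_{m+\frac{3m_y}{2}} - H_x^{s}|^{n}_{m-\frac{3m_y}{2}}\right) \tag{2.19b}$$

$$\mu\,\frac{H_x^{s}|^{n+\frac12}_{m} - H_x^{s}|^{n-\frac12}_{m}}{\Delta t} = -\frac{K_a}{h}\left(E_z^{s}|^{n}_{m+\frac{m_y}{2}} - E_z^{s}|^{n}_{m-\frac{m_y}{2}}\right) - \frac{K_b}{3h}\left(E_z^{s}|^{n}_{m+\frac{3m_y}{2}} - E_z^{s}|^{n}_{m-\frac{3m_y}{2}}\right) \tag{2.19c}$$

$$\mu\,\frac{H_y^{s}|^{n+\frac12}_{m} - H_y^{s}|^{n-\frac12}_{m}}{\Delta t} = \frac{K_a}{h}\left(E_z^{s}|^{n}_{m+\frac{m_x}{2}} - E_z^{s}|^{n}_{m-\frac{m_x}{2}}\right) + \frac{K_b}{3h}\left(E_z^{s}|^{n}_{m+\frac{3m_x}{2}} - E_z^{s}|^{n}_{m-\frac{3m_x}{2}}\right) \tag{2.19d}$$

where $E_z^s = E_{zx}^s + E_{zy}^s$ and m is the spatial index counter along the 1-D propagator.
A few of the leading field nodes within the 1-D propagator need to be hard-sourced.
Readers are referred to [12] for one possible way of accomplishing it as well as
finer implementation details of this TFSF approach.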
Figure 2.4 Mapping of source nodes from the 1-D propagator to a generalized nonuniform main grid.
No interpolation is required.
This perfect TFSF plane wave injection has also been developed for general
directions within S24 implementation upon 3-D FDTD grids [13]. This
generalization is accomplished by additionally selecting the 1-D propagator angle
off the z-axis as

$$\theta = \tan^{-1}\!\left(\frac{\sqrt{m_x^2 + m_y^2}}{m_z}\right) \tag{2.20}$$

Furthermore, the 1-D propagator would be populated by all six field
components, with all three E field nodes colocated. The same goes for all
three H field nodes, which are staggered from the E nodes by $\Delta r/2$. The spatial
step along the 1-D propagator would be

$$\Delta r = \frac{h\cos\phi\,\sin\theta}{m_x} \tag{2.21}$$
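The 3-D case can be checked the same way as the 2-D one. The sketch below, again only an illustration, takes an assumed integer triplet ($m_x$, $m_y$, $m_z$), computes the two angles and the propagator step from (2.20) and (2.21), and confirms that an arbitrary main-grid node projects onto an integer number of propagator cells. The value of $m_z$, the cell size, and the node indices are assumptions made for this example.

```matlab
% Sketch: 3-D rational propagation direction, (2.20) and (2.21)
mx = 19;  my = 7;  mz = 11;      % assumed integer cell counts
h  = 0.015;                      % assumed main-grid cell size (m)

phi   = atan(my/mx);                     % azimuth angle in the x-y plane
theta = atan(sqrt(mx^2 + my^2)/mz);      % angle off the z-axis, (2.20)
dr    = h*cos(phi)*sin(theta)/mx;        % propagator step, (2.21)

% Unit vector along the propagation direction
u = [sin(theta)*cos(phi), sin(theta)*sin(phi), cos(theta)];

% Projection of an arbitrary main-grid node (i, j, k) onto that direction
node  = [3 -2 5];                        % assumed node indices
r     = h*dot(node, u);
steps = r/dr;                            % should equal i*mx + j*my + k*mz

fprintf('theta = %.2f deg, dr = %.4g m, r/dr = %.6f\n', ...
        theta*180/pi, dr, steps);
```

With these angle definitions the step reduces to $h/\sqrt{m_x^2 + m_y^2 + m_z^2}$, so the projection of node (3, −2, 5) equals 3·19 − 2·7 + 5·11 = 98 propagator cells.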
For structures that involve planar boundaries coinciding with the FDTD grid axes,
there would be no need for subcell conformal modeling once the FDTD grid is
designed properly. Examples of such structures are arrays of microstrip antennas
and equipment emissions/susceptibility modeling for electromagnetic compatibility
purposes. In such cases, the approach discussed in Section 2.2 of surrounding the
PEC boundaries with a one-cell-thick layer that has regular FDTD differencing
normal to the PEC boundary and S24 differencing in the transverse directions
would work admirably. Phase accuracy would be perfectly maintained and only a
small penalty in cross-algorithm spurious reflections would be observed. For the
record, this small error would be negligible compared to the inherent spurious
errors even in the most elaborate of today’s PEC conformal techniques.
In some cases, modeling a PEC object is only a minor consideration with respect to
the main objective of the simulation. Examples would be PEC objects embedded in
highly lossy dielectrics, or subwavelength PEC objects, or PEC backbones of PML
absorbing boundary conditions. In such cases, it would be safe to collapse the S24
algorithm to FDTD within subregions that contain such PEC objects. Within those
subregions, conformal PEC modeling would be accomplished natively within
FDTD.
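$$\begin{aligned}
H_z|^{n+\frac12}_{i,j,k} ={}& H_z|^{n-\frac12}_{i,j,k}
+ \frac{K_a\,\Delta t}{\mu\, h\, s_a}\Big[\, l_{xa}|_{i,j+\frac12,k}\,E_x|^{n}_{i,j+\frac12,k} - l_{xa}|_{i,j-\frac12,k}\,E_x|^{n}_{i,j-\frac12,k}
- l_{ya}|_{i+\frac12,j,k}\,E_y|^{n}_{i+\frac12,j,k} + l_{ya}|_{i-\frac12,j,k}\,E_y|^{n}_{i-\frac12,j,k}\Big] \\
&+ \frac{K_b\,\Delta t}{3\mu\, h\, s_b}\Big[\, l_{xb}|_{i,j+\frac32,k}\,E_x|^{n}_{i,j+\frac32,k} - l_{xb}|_{i,j-\frac32,k}\,E_x|^{n}_{i,j-\frac32,k}
- l_{yb}|_{i+\frac32,j,k}\,E_y|^{n}_{i+\frac32,j,k} + l_{yb}|_{i-\frac32,j,k}\,E_y|^{n}_{i-\frac32,j,k}\Big]
\end{aligned} \tag{2.22}$$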
Figure 2.5 Identifying PEC-free edge lengths for the SC mapping technique. The PEC-free surface
area sa is a subset of sb .
This and the matching update equations for the other magnetic field
components work well for most S24 cells encroached upon by PEC boundaries.
Numerical stability considerations dictate, however, that there is a limit to how
small the PEC-free areas could be before the onset of numerical instability. The SC
technique amends the update equations for these problematic cells by modifying
(reducing) the normalized edge lengths where needed to maintain stability:
where $\min(s_a)$ refers to the smallest of the four normalized PEC-free surfaces
sharing the edge $l_a$. The same modification is applied to the outer loop edges’
lengths
As an exception to the above modifiers, if the inner loop is wholly embedded in the
PEC region ($s_a = 0$), then all a and b edge lengths are set to zero to produce a
zero value for the magnetic field there ($s_a$ should be reset to unity to avoid
division by zero).
As mentioned earlier in this chapter, S24 is one of the simpler forms of the
extended-stencil class of high-order FDTD algorithms. It is well suited to
wide-band applications, has a fairly low count of floating-point operations per
update equation, which suits it well for fine-grained graphical processor computing,
and is the best understood and most widely used high-order form in the literature. Its
main disadvantage is its need to use an optimum time step that is roughly
one-tenth of the maximum value allowed by its stability criterion. This
cuts deep into its efficiency advantages over FDTD. Two additional variants of this
class of high-order FDTD algorithms will be briefly discussed here, which
remedy this disadvantage and increase computational efficiency by orders of
magnitude at the expense of more modeling complexity.
FDTD can be equally derived through applying finite differences to the differential
form of Maxwell’s equations, as well as through applying finite sums to the
integral form of those equations. Most electromagnetics experts would think of
Ampere’s and Faraday’s laws when integral Maxwell’s equations are mentioned.
However, a different and little used form of Maxwell’s equations is available
which will be more useful in deriving an extremely phase-coherent high-order
FDTD algorithm [16]
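$$\int_V \epsilon\,\frac{\partial \mathbf{E}}{\partial t}\, dv = \oint_S d\mathbf{s}\times\mathbf{H} \tag{2.25}$$

$$\int_V \mu\,\frac{\partial \mathbf{H}}{\partial t}\, dv = -\oint_S d\mathbf{s}\times\mathbf{E} \tag{2.26}$$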
FV24 [5] applies finite-sums over two concentric surfaces surrounding the
field node of interest, with the critical advantage of including all the tangential
field nodes on the outer surface as demonstrated in Figure 2.6.
$$\begin{aligned}
\epsilon\,\frac{E_x|^{n+\frac12}_{i,j,k} - E_x|^{n-\frac12}_{i,j,k}}{\Delta t}
={}& \frac{K_a}{h}\Big[\big(H_z|^{n}_{i,j+\frac12,k} - H_z|^{n}_{i,j-\frac12,k}\big) - \big(H_y|^{n}_{i,j,k+\frac12} - H_y|^{n}_{i,j,k-\frac12}\big)\Big] \\
&+ \frac{K_b}{3h}\Big[\big(H_z|^{n}_{i,j+\frac32,k} - H_z|^{n}_{i,j-\frac32,k}\big) - \big(H_y|^{n}_{i,j,k+\frac32} - H_y|^{n}_{i,j,k-\frac32}\big)\Big] \\
&+ \frac{K_c}{12h}\sum_{(p,q)\in\{(\pm1,0),\,(0,\pm1)\}}\Big[\big(H_z|^{n}_{i+p,\,j+\frac32,\,k+q} - H_z|^{n}_{i+p,\,j-\frac32,\,k+q}\big) - \big(H_y|^{n}_{i+p,\,j+q,\,k+\frac32} - H_y|^{n}_{i+p,\,j+q,\,k-\frac32}\big)\Big] \\
&+ \frac{K_d}{12h}\sum_{(p,q)\in\{(\pm1,\pm1)\}}\Big[\big(H_z|^{n}_{i+p,\,j+\frac32,\,k+q} - H_z|^{n}_{i+p,\,j-\frac32,\,k+q}\big) - \big(H_y|^{n}_{i+p,\,j+q,\,k+\frac32} - H_y|^{n}_{i+p,\,j+q,\,k-\frac32}\big)\Big]
\end{aligned} \tag{2.27}$$
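The field nodes are grouped according to their spatial displacement from the
central field node to be updated, in this case $E_x|_{i,j,k}$. Each group is then multiplied
by its own coefficient, and the coefficients in turn are optimized through the
corresponding numerical dispersion relation to yield the least global phase error, $\Phi$,
from equation (2.10). The same dispersion relation in equation (2.6) applies to FV24, with the
following discrete operators, which could be derived as illustrated in Section 2.1:

$$D_x = -jK_a\,\frac{\sin(\tilde\beta_x h/2)}{h/2} - j\,\frac{\sin(3\tilde\beta_x h/2)}{3h/2}\left[K_b + \frac{K_c}{2}\left(\cos\tilde\beta_y h + \cos\tilde\beta_z h\right) + K_d\cos\tilde\beta_y h\,\cos\tilde\beta_z h\right] \tag{2.28a}$$

$$D_y = -jK_a\,\frac{\sin(\tilde\beta_y h/2)}{h/2} - j\,\frac{\sin(3\tilde\beta_y h/2)}{3h/2}\left[K_b + \frac{K_c}{2}\left(\cos\tilde\beta_x h + \cos\tilde\beta_z h\right) + K_d\cos\tilde\beta_x h\,\cos\tilde\beta_z h\right] \tag{2.28b}$$

$$D_z = -jK_a\,\frac{\sin(\tilde\beta_z h/2)}{h/2} - j\,\frac{\sin(3\tilde\beta_z h/2)}{3h/2}\left[K_b + \frac{K_c}{2}\left(\cos\tilde\beta_x h + \cos\tilde\beta_y h\right) + K_d\cos\tilde\beta_x h\,\cos\tilde\beta_y h\right] \tag{2.28c}$$

Figure 2.6 The extended-stencil set of field nodes used for FV24 update equations. Shaded areas are
the constant-field portions of the discrete integrals.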
Figure 2.6 The extended-stencil set of field nodes used for FV24 update equations. Shaded areas are
the constant-field portions of the discrete integrals.
The maximum time step for a stable FV24 can be found, using the approach in
Section 2.1 again, to be

$$\Delta t_{\max} = \frac{h\sqrt{\mu\epsilon}}{\sqrt{3}}\cdot\frac{1}{\left|K_a - (K_b - K_c + K_d)/3\right|} \tag{2.29}$$

The entire FV24 formulation can be collapsed to S24 and FDTD with the
proper selection of the K-tuning parameters; setting $K_a = 9/8$, $K_b = -1/8$, $K_c = K_d = 0$
would yield S24, while setting $K_a = 1$, $K_b = K_c = K_d = 0$ would yield FDTD.
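The collapse of FV24 to S24 and FDTD can be confirmed directly on the discrete operators. The MATLAB sketch below evaluates the real magnitude of an FV24-type operator of the form shown in (2.28), with the common factor j dropped, and compares it with the S24 and FDTD operators for the two parameter choices just quoted. The wave number components and cell size are arbitrary assumed values, and the function names are hypothetical.

```matlab
% Sketch: FV24 discrete operator collapsing to S24 and FDTD
h  = 0.015;                              % assumed cell size (m)
bx = 12.0;  by = 7.5;  bz = 3.1;         % assumed wave number components (rad/m)

% Bracketed tuning term of (2.28); K = [Ka Kb Kc Kd]
tune = @(b1, b2, K) K(2) + K(3)/2*(cos(b1*h) + cos(b2*h)) ...
                         + K(4)*cos(b1*h)*cos(b2*h);
% FV24 operator magnitude along x
Dfv  = @(K) K(1)*sin(bx*h/2)/(h/2) + tune(by, bz, K)*sin(3*bx*h/2)/(3*h/2);

% Reference S24 and FDTD (S22) operator magnitudes along x
Ds24 = (9/8)*sin(bx*h/2)/(h/2) - (1/8)*sin(3*bx*h/2)/(3*h/2);
Ds22 = sin(bx*h/2)/(h/2);

diff24 = abs(Dfv([9/8 -1/8 0 0]) - Ds24);   % FV24 with S24 parameters
diff22 = abs(Dfv([1 0 0 0]) - Ds22);        % FV24 with FDTD parameters
fprintf('collapse check: |FV24-S24| = %.2e, |FV24-S22| = %.2e\n', ...
        diff24, diff22);
```

Both differences should vanish to rounding error, since the transverse cosine terms only enter through $K_c$ and $K_d$.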
The main performance advantage of FV24 over S24 becomes apparent for
single-frequency or narrow-band applications. At a grid resolution of $R = 20$ cells
per wavelength, a properly tuned FV24 algorithm is capable of a global
phase error from (2.10) that is seven orders of magnitude lower than that of S24 at the
design frequency. To put this matter in perspective, the
level of grid resolution refinement required by FDTD for matching the phase
coherence of S24 and FV24 can be stated, respectively, as [5]

$$R_{\mathrm{FDTD}} \approx R_{\mathrm{S24}}^{2} \tag{2.30}$$

$$R_{\mathrm{FDTD}} \approx R_{\mathrm{FV24}}^{3} \tag{2.31}$$

$$\lambda_T = \frac{\lambda}{\sqrt{1 - \left(\beta_z/\beta\right)^{2}}} \tag{2.32}$$
In practical terms, regardless of the electrical size of the waveguide’s cross-
section, a total grid size in the order of 20 × 20 FDTD cells is all that is required to
model it accurately, unless fine details articulation is needed. On the flip side, this
accurate modeling would require a substantial reduction in temporal steps since the
time step must still relate to the unbounded wave period. A well-designed high-
order algorithm using the compact FDTD grid could run over a hundred times
faster than regular FDTD. This time, however, high-order differencing needs to
extend to the time derivatives in Maxwell’s equations.
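Starting with Maxwell’s equations and replacing the spatial derivative along
the waveguide’s longitudinal dimension (assumed here to be along the z-axis) with
the $-j\beta_z$ term, two decoupled sets of equations can be produced. One of these, the
more suitable to use with the grid design in Figure 2.7, is

$$\epsilon\,\frac{\partial E_x}{\partial t} = \frac{\partial H_z}{\partial y} - \beta_z H_y \tag{2.33a}$$

$$\epsilon\,\frac{\partial E_y}{\partial t} = \beta_z H_x - \frac{\partial H_z}{\partial x} \tag{2.33b}$$

$$\epsilon\,\frac{\partial E_z}{\partial t} = \frac{\partial H_y}{\partial x} - \frac{\partial H_x}{\partial y} \tag{2.33c}$$

$$\mu\,\frac{\partial H_x}{\partial t} = -\frac{\partial E_z}{\partial y} - \beta_z E_y \tag{2.33d}$$

$$\mu\,\frac{\partial H_y}{\partial t} = \frac{\partial E_z}{\partial x} + \beta_z E_x \tag{2.33e}$$

$$\mu\,\frac{\partial H_z}{\partial t} = \frac{\partial E_x}{\partial y} - \frac{\partial E_y}{\partial x} \tag{2.33f}$$

Figure 2.7 Compact-FDTD grid for modeling wave propagation through longitudinally invariant
structures.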
Figure 2.7 Compact-FDTD grid for modeling wave propagation through longitudinally invariant
structures.
E x K a K n
H n
Hz
n b H n
Hz
n H (2.34a)
t h z i, j
1
i, j
1 3h z i, j
3
i, j
3 z y i, j
2 2 2 2
E y Ka K
H n
Hz
n b H n
Hz
n H n
(2.34b)
t h z 1
i , j
1
i , j 3h z 3
i , j
3
i , j z x i, j
2 2 2 2
H n n H n n
x 1 Hx 1 x 3 Hx 3
i, j i, j i, j i, j
E K 2 Kb
z a 2
3h
2 2
(2.34c)
t h H n n n n
y i 1 , j H y i 1 , j
Hy 3 Hy 3
i , j i , j
2 2 2 2
High-Order FDTD Methods 111
H x K Kb
a n n
Ez |i , j 1 Ez |i , j 1 Ez |i , j 3 Ez |i , j 3 z E y |i , j (2.34d)
n n n
t h 2 2 3h 2 2
H y Ka Kb
n n
Ez |i 1 , j Ez |i 1 , j Ez |i 3 , j Ez |i 3 , j z Ex |i , j (2.34e)
n n n
t h 2 2 3h 2 2
Ex |n 1 Ex |n 1 Ex |n 3 Ex |n 3
H z K a i, j i, j K b
i, j i, j
2 2
2 2
(2.34f)
t h E y |n 1 E y |n 1 3h E y |n 3 E y |n 3
i , j i , j i , j i , j
2 2 2 2
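$$\frac{\partial E_x}{\partial t} \approx \frac{1}{24\Delta t}\left( 22\,E_x\big|^{n+\frac{1}{2}}_{i,j} - 17\,E_x\big|^{n-\frac{1}{2}}_{i,j} - 9\,E_x\big|^{n-\frac{3}{2}}_{i,j} + 5\,E_x\big|^{n-\frac{5}{2}}_{i,j} - E_x\big|^{n-\frac{7}{2}}_{i,j} \right) \tag{2.35}$$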
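The five-point time difference in (2.35) can be checked numerically. The following C sketch is illustrative only: the function names are hypothetical, and the signs of the coefficients (22, −17, −9, 5, −1) are inferred from the requirement that the weights sum to zero and reproduce a unit first derivative at time level n, since the printed signs did not survive in this copy.

```c
#include <stddef.h>

/* Weights of the five-point time difference, ordered from the newest
   sample (n+1/2) to the oldest (n-7/2). Signs are an inference. */
static const double B4_COEFF[5] = { 22.0, -17.0, -9.0, 5.0, -1.0 };

/* Approximates df/dt at time level n.
   f[0] = f at n+1/2, f[1] = f at n-1/2, ..., f[4] = f at n-7/2. */
double backward_time_derivative(const double f[5], double dt)
{
    double sum = 0.0;
    for (size_t k = 0; k < 5; ++k)
        sum += B4_COEFF[k] * f[k];
    return sum / (24.0 * dt);
}

/* Test helper: samples f(t) = t^3 + 2t at the five staggered time
   levels t_n + dt/2, t_n - dt/2, ..., t_n - 7dt/2. */
void sample_cubic(double t_n, double dt, double f[5])
{
    for (size_t k = 0; k < 5; ++k) {
        double t = t_n + (0.5 - (double)k) * dt;
        f[k] = t * t * t + 2.0 * t;
    }
}
```

With these weights the formula is exact for polynomials up to fourth degree, so sampling the cubic above around t_n = 1 should return its derivative there, up to rounding.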
The same matrix equation (2.4) is used to determine the dispersion relation for
this algorithm, with the following changes to the discrete operators there:
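$$D_t = \frac{1}{24\Delta t}\left( 22e^{j\omega\Delta t/2} - 17e^{-j\omega\Delta t/2} - 9e^{-3j\omega\Delta t/2} + 5e^{-5j\omega\Delta t/2} - e^{-7j\omega\Delta t/2} \right) \tag{2.36a}$$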
Dz z first three rows in (2.4) (2.36b)
1
h
tmax 2 (2.38)
2
h
2K a K b 3 z
2
2
When using compact-FDTD and its high-order variants, the longitudinal
wavenumber $\beta_z$ is an input that needs to be provided. It is chosen such that the
waveguide first mode of operation coincides with the operating frequency. For
electrically large structures, this stipulation translates to $\beta_z$ being nearly identical
to the unbounded wavenumber $\beta$. For example, propagating a 1-GHz signal
through a 6 × 3 m tunnel would require setting $\beta_z/\beta = 0.9982$ [3]. Such values
provide a serious challenge to even high-order modeling algorithms, if not
REFERENCES
[1] J. Fang, Time Domain Finite Difference Computation for Maxwell's Equations, PhD Dissertation,
University of California at Berkeley, Berkeley, California, 1989.
[2] A. Elsherbeni and D. Veysel, The Finite-Difference Time-Domain Method for Electromagnetics
with MATLAB Simulations, Raleigh, NC: Scitech Publishing, Inc. 2009.
[3] M. Hadi and S. Mahmoud, “A High-Order Compact-FDTD Algorithm for Electrically Large
Waveguide Analysis,” IEEE Transactions on Antennas and Propagation, Vol. 56, No. 8, pp.
2589–2598, 2008.
[4] A. Taflove and M. Brodwin, “Numerical Solution of Steady-State Electromagnetic Scattering
Problems Using the Time-Dependent Maxwell’s Equations,” IEEE Trans. Microwave Theory
Techniques, Vol. 23, No. 8, pp. 623–630, 1975.
[5] M. Hadi, “A Finite Volumes-Based 3-D Low Dispersion FDTD Algorithm,” IEEE Transactions
on Antennas and Propagation, Vol. 55, No. 8, pp. 2287–2293, 2007.
[6] M. Hadi and R. Dib, “IEEE Transactions on Antennas and Propagation Low-Dispersion FDTD
Algorithms,” Appl. Computat. Electromag. Soc. J., Vol. 22, No. 3, pp. 306–314, 2007.
[7] M. Hadi, “Near-Field PML Optimization for Low and High Order FDTD Algorithms Using
Closed-Form Predictive Equations,” IEEE Transactions on Antennas and Propagation, Vol. 59,
No. 8, pp. 2933–2942, 2011.
[8] J. Berenger, “Evanescent Waves in PML's: Origin of the Numerical Reflection in Wave-
Structure Interaction Problems,” IEEE Trans. Antennas Propagation, Vol. 47, No. 10, pp.
1497–1503, 1999.
[9] J. Schneider and C. Wagner, “Implementation of Transparent Sources in FDTD Simulations,”
IEEE Transactions on Antennas and Propagation, Vol. 46, No. 8, pp. 1159–1168, 1998.
[10] M. Hadi and N. Almutairi, “Discrete Finite-Difference Time Domain Impulse Response Filters
for Transparent Field Source Implementations,” IET Microw. Antennas Propag., Vol. 4, No. 3,
pp. 381–389, 2010.
[11] T. Tan and M. Potter, “1-D Multipoint Auxiliary Source Propagator for the Total-
Field/Scattered-Field FDTD Formulations,” IEEE Antennas and Wireless Propagation Letters,
Vol. 6, pp. 144–148, 2007.
[12] M. Hadi, “A Versatile Split-Field 1-D Propagator for Perfect FDTD Plane Wave Injection,”
IEEE Transactions on Antennas and Propagation, Vol. 57, No. 9, pp. 2691–2697, 2011.
[13] W. Hui, H. Zhi, W. Xian and W. Lei, “Perfect Plane Wave Injection into 3D FDTD (2,4)
Scheme,” 2011 Cross Strait Quad-Regional Radio Science and Wireless Technology Conference,
Harbin, China, 2011.
[14] I. Zagorodnov, R. Schuhmann, and T. Weiland, “Conformal FDTD-Methods to Avoid Time Step
Reduction With and Without Cell Enlargement,” Journal of Computational Physics, Vol. 225,
No. 2, pp. 1493–1507, 2007.
[15] B. Al-Zohouri and M. Hadi, “Conformal Modelling of Perfect Conductors in the High-Order
M24 Finite-Difference Time-Domain Algorithm,” IET Microw. Antennas Propag., Vol. 5, No. 5,
pp. 583–587, 2011.
[16] N. Madsen and R. Ziolkowski, “A Three-Dimensional Modified Finite Volume Technique for
Maxwell's Equations,” Electromagnetics, Vol. 10, No. 1/2, pp. 147–161, 1990.
[17] M. Hadi and S. Mahmoud, “Optimizing the Compact-FDTD Algorithm for Electrically Large
Waveguiding Structures,” Progress in Electromagnetics Research, Vol. 75, pp. 253–269, 2007.
[18] K. Hwang and J. Ihm, “A Stable Fourth-Order FDTD Method for Modeling Electrically Long
Dielectric Waveguides,” Journal of Lightwave Technology, Vol. 24, No. 2, pp. 1048–1056, 2006.
Chapter 3
GPU Acceleration of FDTD Method for
Simulation of Microwave Circuits
Veysel Demir
3.1 INTRODUCTION
GPU codes. CUDA implementations of the FDTD method are used in commercial
computational electromagnetics software. Furthermore, CUDA has been reported
as the programming environment for implementation of FDTD in several academic
research articles, which include [17–22] as some of the earlier implementations.
OpenCL [23] is another recently introduced programming platform for developing
codes on parallel devices, and it has been used to develop FDTD implementations [24–26].
In this chapter we present an implementation of a three-dimensional FDTD
code using CUDA. The presented code includes an implementation of FDTD using
the C programming language to run on CPU as well as the implementation in
CUDA to run on GPU. Some considerations that a developer needs to keep in
mind to develop a code with better performance are also discussed.
The files of the code presented in this chapter are available on the publisher’s
website. We strongly recommend that the reader download and study the code
while reading the following sections, as these sections discuss the concepts in
parallel with the code and serve as a tutorial. Also, a basic knowledge of CUDA
programming is required. For beginners, we recommend the “NVIDIA CUDA
Getting Started Guide” and “CUDA C Programming Guide” available at
NVIDIA’s web site to start learning CUDA.
The next section presents the implementation of the code and discusses the
core functions programmed in C language to run the program on CPU. The
subsequent section presents the CUDA implementation and discusses the issues
one needs to pay attention to while programming FDTD using CUDA.
The FDTD method is one of the most extensively researched methods in computational
electromagnetics, and many techniques have been developed to model various conditions
(dispersive media, nonlinear media, absorbing boundaries, and so forth) in FDTD. A code
developed to demonstrate even a subset of these extensions to the basic FDTD method can
be covered only in a book. Therefore, in this chapter, we keep the implemented code
limited to the basics of FDTD, while still sufficient to present a GPU implementation that
is useful for solving basic microwave circuits: implementation of electric and magnetic field
updating equations, excitation of ports, calculation of voltages and currents at the
ports, and eventually calculation of scattering parameters are presented.
The presented FDTD code is developed following the assumptions listed below:
The problem space is a closed PEC box; therefore, PEC boundary condition
is used at the boundaries: the tangential electric field components are set to
be zero.
The program reads an input file in which the FDTD problem to be solved is
described. For instance, Figure 3.1 illustrates a lowpass filter [4]. Simulation
parameters of this filter are defined in a text file named lowpass_filter.txt. The
program can be executed from the command line on a Microsoft
Windows operating system as
mwfdtd lowpass_filter.txt
The contents of the input file “lowpass_filter.txt” are shown in Listing 3.1. Here,
run_on_gpu is a parameter that determines whether the simulation is to be run on
CPU (if 0) or GPU (if 1). If there is more than one GPU device on the system,
one can choose which device is used by assigning its device ID to the
gpu_device_id parameter. The parameter number_of_time_steps sets the
number of time steps to run the FDTD time-marching loop. The parameters
cell_size_x, cell_size_y, and cell_size_z set the dimension of a unit cell
in the x-, y-, and z-directions, respectively. It should be noted that the default units
of all the lengths described in the input file are in meters. The parameters
substrate_relative_permittivity and substrate_thickness define the
dielectric constant and the thickness of the substrate. The parameters box_size_x,
box_size_y, and box_size_z set the dimensions of the problem space in the x-,
y-, and z-directions, respectively. One corner of the problem space coincides with
the origin of the Cartesian coordinate system and the problem space box extends in
the x-, y-, and z-directions as illustrated in Figure 3.1.
A rectangular PEC patch can be defined by its start and end coordinates in the
input text file. The parameters microstrip_min_x, microstrip_min_y,
microstrip_max_x, and microstrip_max_y define the start and the end
coordinates. A number of rectangular patches can be combined to create complex
shapes. For instance, the lowpass filter shown in Figure 3.1 is created using three
rectangular patches.
Ports are also defined by their start and end coordinates as shown in Listing
3.1, where two ports are defined. The active source port that is used to excite the
circuit during the FDTD simulation is indicated by the parameter
active_port_index. All ports are 50-ohm ports in the presented code; therefore, the
active port is simulated as a voltage source with 50-ohm internal impedance,
whereas the inactive ports are simulated as 50-ohm resistors. Transient voltage and
current are captured on each port during a simulation. Then the captured voltages
and currents are used to calculate scattering parameters of the circuit. Finally, the
parameters frequency_start, frequency_end, and number_of_frequencies
define the frequencies of interest for the scattering parameter calculations.
box_size_x 0.028448
box_size_y 0.027938
box_size_z 0.003445
microstrip_index 1
microstrip_min_x 0.0097536
microstrip_max_x 0.012192
microstrip_min_y 0.004233
microstrip_max_y 0.012699
microstrip_index 2
microstrip_min_x 0.016256
microstrip_max_x 0.018694
microstrip_min_y 0.015239
microstrip_max_y 0.023705
microstrip_index 3
microstrip_min_x 0.004064
microstrip_max_x 0.024384
microstrip_min_y 0.0127
microstrip_max_y 0.01524
port_index 1
port_min_x 0.0097536
port_max_x 0.012162
port_min_y 0.004233
port_max_y 0.004233
port_index 2
port_min_x 0.016256
port_max_x 0.018694
port_min_y 0.023705
port_max_y 0.023705
active_port_index 1
frequency_start 1e9
frequency_end 20e9
number_of_frequencies 91
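Each line of the input file is a parameter name followed by a value, so a reader can be built around a single line parser. The C sketch below is an illustration only: `InputSketch` and `parse_input_line()` are hypothetical names, only a few of the parameters are handled, and the book's own reader may be organized differently.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical container for a handful of the input parameters. */
typedef struct {
    double box_size_x;
    double box_size_y;
    double box_size_z;
    int    active_port_index;
    int    number_of_frequencies;
} InputSketch;

/* Parses one "key value" line. Returns 1 if the key is recognized and a
   numeric value follows it, 0 otherwise. */
int parse_input_line(const char *line, InputSketch *in)
{
    char key[64];
    double value;

    if (sscanf(line, "%63s %lf", key, &value) != 2)
        return 0;

    if      (strcmp(key, "box_size_x") == 0) in->box_size_x = value;
    else if (strcmp(key, "box_size_y") == 0) in->box_size_y = value;
    else if (strcmp(key, "box_size_z") == 0) in->box_size_z = value;
    else if (strcmp(key, "active_port_index") == 0)
        in->active_port_index = (int)value;
    else if (strcmp(key, "number_of_frequencies") == 0)
        in->number_of_frequencies = (int)value;
    else
        return 0;

    return 1;
}
```

Repeated keys such as microstrip_index would need per-object state (the index of the patch or port currently being filled), which this sketch leaves out.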
Listing 3.2 shows the main() function of the C code of the FDTD program. The
program is structured such that, first, the contents of the input file are read and
relevant parameter values are assigned to associated data elements and arrays.
Then the function setupProblemSpace() is called to create and initialize arrays
for updating coefficients, electric and magnetic fields, and other auxiliary data. The
function setupPorts() is called to set up arrays regarding port calculations,
which include voltage source excitation, sampled voltage, and current calculations.
The functions startTimer() and stopTimer() are used to capture the total
time spent for a simulation. Then port scattering parameters are calculated in the
function calculateScatteringParameters() using the sampled voltages and
currents captured on the ports. Finally, the function saveResults() stores the
results of the simulation in the output MATLAB script file.
One can notice in Listing 3.2 that an if statement is used to branch the main
FDTD time-marching loop to run either on CPU or on GPU. Listing 3.3 shows the
function runTimeMarchingLoopOnCPU(), which runs the time-marching loop
on CPU. In the time-marching loop, at every time step, the magnetic and electric
fields are first updated consecutively. These updates are followed by special updates of
fields due to a voltage source. Then the voltages and currents are captured at the
ports.
updateVoltageSource(time_step);
captureVoltageAndCurrent(time_step);
}
}
$$H_x^{n+0.5}(i,j,k) = H_x^{n-0.5}(i,j,k) + C_{hxey}\left[ E_y^{n}(i,j,k+1) - E_y^{n}(i,j,k) \right] + C_{hxez}\left[ E_z^{n}(i,j+1,k) - E_z^{n}(i,j,k) \right] \tag{3.1}$$
where $C_{hxey} = \Delta t/(\mu_0 \Delta z)$ and $C_{hxez} = -\Delta t/(\mu_0 \Delta y)$. Here, $\Delta t$ is the duration of a time step,
$\mu_0$ is the free space permeability, and $\Delta y$ and $\Delta z$ are the sizes of a unit cell in the y-
and z-directions, respectively.
In the presented implementation, if there are Nx × Ny × Nz cells in a problem
space, the number of field components for each field type is Nfields = (Nx+1) ×
(Ny+1) × (Nz+1). For instance, Figure 3.2 illustrates the Ex field component
distribution on an x-y plane cut. The numbers of field components in the x- and y-
directions are Nx+1 and Ny+1, respectively. The actual number of Ex field
components that lie in the problem space along the x-direction is Nx, and the Ex components in the
rightmost column in Figure 3.2 are outside the problem space boundaries. These
extra field components are included in the 3-D field arrangements so that all 3-D
arrays (i.e., Ex, Ey, Ez, Hx, Hy, and Hz) have the same size, Nfields. When all field
arrays have the same dimensions, the offsets between the field components required
in an update, discussed below, are the same for all types of field components,
which is convenient when programming.
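Because every field array has (Nx+1) × (Ny+1) × (Nz+1) entries, a single pair of offsets reaches the neighbors needed in any update. The C sketch below illustrates this for the Hx update of (3.1). It assumes that x is the fastest-varying index and uses spatially uniform coefficients; `field_index()` and `update_hx()` are hypothetical names, not the functions of the presented code.

```c
/* Linear index of node (i, j, k) in an array holding
   (nx+1)*(ny+1)*(nz+1) entries, assuming x varies fastest. */
int field_index(int i, int j, int k, int nx, int ny)
{
    return i + j * (nx + 1) + k * (nx + 1) * (ny + 1);
}

/* Applies an update of the form (3.1) to Hx using uniform coefficients.
   The same y_offset and z_offset would serve every field array, since
   all of them share the same dimensions. */
void update_hx(float *Hx, const float *Ey, const float *Ez,
               int nx, int ny, int nz, float Chxey, float Chxez)
{
    const int y_offset = nx + 1;               /* step from j to j+1 */
    const int z_offset = (nx + 1) * (ny + 1);  /* step from k to k+1 */

    /* j and k stop one short so that fi + offset stays inside the array. */
    for (int k = 0; k < nz; ++k)
        for (int j = 0; j < ny; ++j)
            for (int i = 0; i <= nx; ++i) {
                int fi = field_index(i, j, k, nx, ny);
                Hx[fi] += Chxey * (Ey[fi + z_offset] - Ey[fi])
                        + Chxez * (Ez[fi + y_offset] - Ez[fi]);
            }
}
```

In a full code the coefficients would normally be arrays indexed by fi so that material properties can vary from cell to cell.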
}
}
int nt = number_of_time_steps;
float* time_waveform;
complex sv, sc;
float Z = 50.0;
float sZ = 2.0*sqrt(Z);
time_waveform = ports[ind].sampled_voltage;
sv = dft(time_array, 0.5*dt, dt, nt,
time_waveform, frequencies[i]);
time_waveform = ports[ind].sampled_current;
sc = dft(time_array, 0.0, dt, nt,
time_waveform, frequencies[i]);
ports[ind].a_wave[i].re = (sv.re + sc.re * Z)/sZ;
ports[ind].a_wave[i].im = (sv.im + sc.im * Z)/sZ;
ports[ind].b_wave[i].re = (sv.re - sc.re * Z)/sZ;
ports[ind].b_wave[i].im = (sv.im - sc.im * Z)/sZ;
}
}
complex a, b, s;
for (int ind=0;ind<np;ind++)
{
for (int i=0;i<nf;i++)
{
a = ports[apind].a_wave[i];
b = ports[ind].b_wave[i];
s.re = (b.re*a.re+b.im*a.im)
/(a.re*a.re+a.im*a.im);
s.im = (b.im*a.re-b.re*a.im)
/(a.re*a.re+a.im*a.im);
ports[apind].S[ind][i] = s;
}
}
}
complex dft_val;
dft_val.re = 0.0;
dft_val.im = 0.0;
float pi = atan(1.0)*4.0;
float w = 2*pi*frequency;
float wt, tw;
Some recommendations for optimization of CUDA programs and the list of best
practices for programming with CUDA are provided in the “CUDA C Best Practices
Guide” available at NVIDIA’s web site
(http://docs.nvidia.com/cuda/cuda-c-best-practices-guide/#axzz39Pwtembc). Among these recommendations, the
following ones are directly applicable to FDTD and are used to optimize the
presented FDTD implementation:
R1. Structure the algorithm in a way that exposes as much data parallelism as
possible. Once the parallelism of the algorithm has been exposed, it needs
to be mapped to the hardware as efficiently as possible.
R2. Ensure that global memory accesses are coalesced whenever possible.
R3. Minimize the use of global memory. Prefer shared memory access where
possible.
The FDTD updates in a cell can be performed separately from the updates in
other cells. Therefore, the FDTD algorithm inherently satisfies the first
recommendation R1 listed above. We will refer to the other recommendations as
well in the subsequent sections as we discuss the implementation of the code.
The main memory space on the GPU device is referred to as global memory and
the global memory is accessed via 32-, 64-, or 128-byte memory transactions such
as reading or writing an array by a block of threads. These memory transactions
must be naturally aligned for the best performance. For instance, Figure 3.3
illustrates misaligned and nonsequential access of memory locations in the global
memory by threads. Aligned and sequential access of memory locations, as
illustrated in Figure 3.4, is referred to as coalesced memory access.
Figure 3.5 A problem space expanded in the x-direction using padded cells.
if (run_on_gpu)
nx = nx_gpu;
else
nx = nx_cpu;
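The code fragment above selects a padded cell count nx_gpu when running on the GPU. One padding rule that reproduces the cell totals reported in Table 3.1 is to round the number of nodes along x, Nx+1, up to the next multiple of 16. The C sketch below is an inference from those totals (60,060 = 70 × 66 × 13 on the CPU versus 67,782 = 79 × 66 × 13 on the GPU, and 140 versus 143 for the fine grid); the factorization into Nx and Ny and the function name `padded_nx()` are assumptions, not values or code taken from the text.

```c
/* Returns the smallest padded cell count such that the node count
   (nx_padded + 1) along x is a multiple of `width`. */
int padded_nx(int nx, int width)
{
    int nodes = nx + 1;  /* field components along x */
    int padded_nodes = ((nodes + width - 1) / width) * width;
    return padded_nodes - 1;
}
```

With a width of 16 this gives 79 cells for Nx = 70 and 143 cells for Nx = 140, so each row of field components starts on a 16-element boundary.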
Remember that the GPU device is a separate computational device that has its own
processors as well as memory spaces. In CUDA terminology, the GPU device is
referred to as device, while the CPU side of the computer is referred to as host.
Thus, if a computation will be performed on the GPU device, the relevant data
need to be transferred to the device memory before the main computation begins.
The main computation in this case is the time-marching loop that requires the
updating coefficients as the input. The task of transferring the input data to the
device memory is performed within the function setupGPU() in Listing 3.2.
During the time-marching loop, the electric and magnetic field distributions are
intermediate outputs, and sampled voltages and currents are the main outputs. The
void setupGPU()
{
setGPUdevice(gpu_device_id);
if (copyArraysToGpuMemory()!=0)
{
printf("Error while copying arrays to GPU memory!\n");
exit (EXIT_FAILURE);
}
setThreadBlocks();
}
cudaError_t et;
et = cudaMalloc((void**)&dvEx, array_size);
et = cudaMalloc((void**)&dvEy, array_size);
et = cudaMalloc((void**)&dvEz, array_size);
et = cudaMalloc((void**)&dvHx, array_size);
et = cudaMalloc((void**)&dvHy, array_size);
et = cudaMalloc((void**)&dvHz, array_size);
et = cudaMalloc((void**)&dvCexhy, array_size);
et = cudaMalloc((void**)&dvCexhz, array_size);
et = cudaMalloc((void**)&dvCeyhz, array_size);
et = cudaMalloc((void**)&dvCeyhx, array_size);
et = cudaMalloc((void**)&dvCeze, array_size);
et = cudaMalloc((void**)&dvCezhy, array_size);
et = cudaMalloc((void**)&dvCezhx, array_size);
array_size = size_int*ports[ind].number_of_sv_fields;
et = cudaMalloc((void**)&(ports[ind].dvsv_indices),
array_size);
cudaMemcpy(ports[ind].dvsv_indices,
ports[ind].sv_indices, array_size,
cudaMemcpyHostToDevice);
array_size = size_float*number_of_time_steps;
et =cudaMalloc((void**)&(ports[ind].dvsampled_voltage),
array_size);
cudaMemcpy(ports[ind].dvsampled_voltage,
ports[ind].sampled_voltage,array_size,
cudaMemcpyHostToDevice);
array_size = size_int*ports[ind].number_of_hx_fields;
et=cudaMalloc((void**)&(ports[ind].dvhxm_indices),
array_size);
cudaMemcpy(ports[ind].dvhxm_indices,
ports[ind].hxm_indices, array_size,
cudaMemcpyHostToDevice);
array_size = size_int*ports[ind].number_of_hx_fields;
et = cudaMalloc((void**)&(ports[ind].dvhxp_indices),
array_size);
cudaMemcpy(ports[ind].dvhxp_indices,
ports[ind].hxp_indices, array_size,
cudaMemcpyHostToDevice);
array_size = size_int*ports[ind].number_of_hy_fields;
et = cudaMalloc((void**)&(ports[ind].dvhym_indices),
array_size);
cudaMemcpy(ports[ind].dvhym_indices,
ports[ind].hym_indices, array_size,
cudaMemcpyHostToDevice);
array_size = size_int*ports[ind].number_of_hy_fields;
et = cudaMalloc((void**)&(ports[ind].dvhyp_indices),
array_size);
cudaMemcpy(ports[ind].dvhyp_indices,
ports[ind].hyp_indices, array_size,
cudaMemcpyHostToDevice);
array_size = size_float*number_of_time_steps;
et =cudaMalloc((void**)&(ports[ind].dvsampled_current),
array_size);
cudaMemcpy(ports[ind].dvsampled_current,
ports[ind].sampled_current, array_size,
cudaMemcpyHostToDevice);
}
return 0;
}
In the FDTD time-marching loop, the fields in a cell can be
updated independently of the field updates in other cells; hence, the updates in
different cells can be performed in parallel. Therefore, each cell
can be assigned to a thread that performs the field updates within that cell.
In CUDA, a number of threads form a block and a number of blocks
form a grid. We can map each thread to a cell and create a sufficient number of
blocks such that the grid spans all cells in the problem space. Various
thread-to-cell mappings are possible. In this chapter, the thread
blocks are constructed as shown in Listing 3.13, which shows a section of the
function setThreadBlocks(). Here, the grid is constructed as a one-dimensional
array of thread blocks and each thread block is constructed as a one-dimensional
array of threads. The number of threads per thread block is denoted as
maximum_threads_per_block in the presented code, which is the maximum
number of threads per block available on the device. Then, in the field updating
kernels, these blocks and threads are mapped to cells. Figure 3.6 illustrates the
thread-to-cell mapping on an x-y plane cut. Here, the cells in the rightmost column
are considered extension cells that contain the extension fields in the rightmost
column in Figure 3.2.
n_blocks = (nsv/maximum_threads_per_block) +
(nsv%maximum_threads_per_block == 0 ? 0 : 1);
shared_memory_size_sv =
maximum_threads_per_block*sizeof(float);
int nh = ports[ind].number_of_hx_fields;
n_blocks = (nh/maximum_threads_per_block) +
(nh%maximum_threads_per_block == 0 ? 0 : 1);
block_sc_hx = dim3(maximum_threads_per_block, 1, 1);
grid_sc_hx = dim3(n_blocks, 1, 1);
shared_memory_size_sc_hx =
2*maximum_threads_per_block*sizeof(float);
nh = ports[ind].number_of_hy_fields;
n_blocks = (nh/maximum_threads_per_block) +
(nh%maximum_threads_per_block == 0 ? 0 : 1);
shared_memory_size_sc_hy =
2*maximum_threads_per_block*sizeof(float);
}
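The block-count expression in the listing above is a ceiling division, and a one-dimensional thread index can be turned back into cell indices with integer division and remainder. The C sketch below shows both steps on the host. The names `blocks_needed()`, `CellIndex`, and `cell_of_thread()` are hypothetical, and the mapping assumes x is the fastest-varying index with nxp1 and nyp1 field components along x and y.

```c
/* Number of blocks needed to cover n_items with threads_per_block
   threads each (same expression as in the listing). */
int blocks_needed(int n_items, int threads_per_block)
{
    return n_items / threads_per_block
         + (n_items % threads_per_block == 0 ? 0 : 1);
}

typedef struct { int i, j, k; } CellIndex;

/* Recovers the cell (i, j, k) handled by a thread from its block index,
   the block size, and its index within the block. */
CellIndex cell_of_thread(int block_idx, int block_dim, int thread_idx,
                         int nxp1, int nyp1)
{
    int fi = block_idx * block_dim + thread_idx;  /* linear field index */
    CellIndex c;
    c.i = fi % nxp1;
    c.j = (fi / nxp1) % nyp1;
    c.k = fi / (nxp1 * nyp1);
    return c;
}
```

Because the last block is usually only partly filled, a kernel using this mapping still has to discard threads whose linear index exceeds the number of field components.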
Figure 3.6 Thread-to-cell mapping on an x-y plane cut, with the extension cells in the rightmost column.
Once the required arrays are allocated on the device memory and relevant data are
copied to the device memory, the program is ready to execute the time-marching
loop. The function runTimeMarchingLoopOnGPU(), shown in Listing 3.14,
executes the time-marching loop on the GPU device. One can compare Listing
3.14 with Listing 3.3 and verify that they contain the same steps: magnetic field
updates, electric field updates, voltage source updates, and voltage and current
calculations.
updateElectricFieldsOnGPU<<<grid_eh, block_eh,
shared_memory_size>>>
(dvCexhy, dvCexhz, dvCeyhz, dvCeyhx, dvCezhx,
dvCezhy, dvCeze, dvEx, dvEy, dvEz, dvHx, dvHy,
dvHz, y_offset, z_offset, n_fields);
updateVoltageSourceOnGPU<<<grid_vs, block_vs>>>
(dvEz, ports[ind].dvsv_indices,
ports[ind].number_of_sv_fields,
time_step, current_value);
ports[ind].dvsampled_voltage,
time_step, dz, nz_substrate);
}
captureCurrentOnGPU<<<grid_sc_hy, block_sc_hy,
shared_memory_size_sc_hy>>>
(dvHy, ports[ind].dvhyp_indices,
ports[ind].dvhym_indices,
ports[ind].number_of_hy_fields,
ports[ind].dvsampled_current,
time_step, dy, nz_substrate);
}
}
}
$$H_y^{n+0.5}(i,j,k) = H_y^{n-0.5}(i,j,k) + C_{hyez}\left[ E_z^{n}(i+1,j,k) - E_z^{n}(i,j,k) \right] + C_{hyex}\left[ E_x^{n}(i,j,k+1) - E_x^{n}(i,j,k) \right] \tag{3.3}$$
One can see from (3.3) that, to calculate fields in a cell, the fields in
neighboring cells are also needed. For instance, one needs Ez(i+1, j, k) and Ex(i, j,
k+1) to calculate Hy(i, j, k). A thread processing a cell (i, j, k) can efficiently read a
memory location pertaining to (i, j, k+1) since this access will be coalesced. However,
access to a memory address pertaining to (i+1, j, k) is uncoalesced and expensive
in terms of computation time. In this case, the shared memory can be utilized to
retain efficiency. Because it is on-chip, shared memory is much faster to access
than local and global memory. Each thread can access the memory location
associated with it and load the relevant data into the shared memory. Once the data are
in the shared memory, they are ready for use by the neighboring cells’ threads.
With the thread-to-cell mapping described in Figure 3.6, a problem still exists:
if a thread on the boundary of a thread block needs to access a field in a cell
mapped to another thread block, that data will not be directly available in the
shared memory. To overcome this problem, some threads are scheduled to load the
field data in those cells in the neighboring block in a separate command as
illustrated in Listing 3.15. The statement sEy[ti] = Ey[fi] loads a block of
data into shared memory and the statement sEy[blockDim.x+ti] =
Ey[blockDim.x+fi] loads the neighboring block of data into the shared memory,
as also illustrated in Figure 3.7. In Listing 3.15, the statement __syncthreads()
is a synchronization barrier that ensures all required data are loaded into the
shared memory by the threads. Once the data are available in the shared memory,
they can be used to update the fields.
int ti = threadIdx.x;
int fi = blockIdx.x * blockDim.x + threadIdx.x;
sEy[ti] = Ey[fi];
sEz[ti] = Ez[fi];
if (ti<16)
{
sEy[blockDim.x+ti] = Ey[blockDim.x+fi];
sEz[blockDim.x+ti] = Ez[blockDim.x+fi];
}
__syncthreads();
if (fi>=n_fields-z_offset) return;
Hx[fi] = Hx[fi]
+ Chxey*(Ey[fi+z_offset]-sEy[ti])
+ Chxez*(Ez[fi+y_offset]-sEz[ti]);
Hy[fi] = Hy[fi]
+ Chyez*(sEz[ti+1]-sEz[ti])
+ Chyex*(Ex[fi+z_offset]-Ex[fi]);
Hz[fi] = Hz[fi]
+ Chzex*(Ex[fi+y_offset]-Ex[fi])
+ Chzey*(sEy[ti+1]-sEy[ti]);
}
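The two-phase pattern of the kernel above (stage a block of data plus a short halo from the next block, synchronize, then read neighbors from the staged copy) can be emulated on the host to see why the halo is needed. The C sketch below is an illustration only: `block_x_difference()` is a hypothetical function, the tile array stands in for shared memory, the two loops stand in for the code before and after __syncthreads(), and the bounds checks are additions that the listing does not show.

```c
#define HALO 16  /* halo width used in the listing */

/* Emulates one thread block: stage block_dim + HALO values of Ez in
   `tile`, then form the x-directed difference Ez(i+1) - Ez(i) from the
   tile only. `tile` must hold block_dim + HALO floats. */
void block_x_difference(const float *Ez, int n_fields,
                        int block_idx, int block_dim,
                        float *tile, float *diff)
{
    int base = block_idx * block_dim;

    /* Phase 1: each "thread" loads its own value; the first HALO
       threads also load values from the start of the next block. */
    for (int ti = 0; ti < block_dim; ++ti) {
        int fi = base + ti;
        tile[ti] = (fi < n_fields) ? Ez[fi] : 0.0f;
        if (ti < HALO) {
            int hi = fi + block_dim;
            tile[block_dim + ti] = (hi < n_fields) ? Ez[hi] : 0.0f;
        }
    }

    /* Phase 2 (after the barrier): the neighbor is read from the tile. */
    for (int ti = 0; ti < block_dim; ++ti) {
        int fi = base + ti;
        if (fi + 1 < n_fields)
            diff[fi] = tile[ti + 1] - tile[ti];
    }
}
```

Without the halo load, the last thread of a block would find no valid entry at tile[ti + 1], which is the block-boundary problem described above.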
Electric field updates are performed similarly to the magnetic field updates by
utilizing the shared memory in the function updateElectricFieldsOnGPU()
shown in Listing 3.16.
int ti = threadIdx.x;
int fi = blockIdx.x * blockDim.x + threadIdx.x;
sHy[ti+16] = Hy[fi];
sHz[ti+16] = Hz[fi];
if (ti<16)
{
sHy[ti] = Hy[fi-16];
sHz[ti] = Hz[fi-16];
}
__syncthreads();
if (fi>=n_fields-z_offset) return;
if (fi<y_offset) return;
if (fi<z_offset) return;
__syncthreads();
sampled_voltage[time_step] =
sampled_voltage[time_step] + sv_sum*scaling;
}
__syncthreads();
sampled_current[time_step] = sampled_current[time_step] +
h_sum*scaling;
}
Sampled voltages and currents on the ports are the main outputs of the section
of the program running on the GPU device. These output data are copied back to
the host memory in the function copyDataBackAndClearGPU(), a section of
which is shown in Listing 3.20. Here the CUDA function cudaMemcpy() is used
for data copy. The program proceeds with scattering parameters calculation, as
shown in Listing 3.2, once the output data are available on the host memory.
cudaMemcpy(ports[ind].sampled_current,
ports[ind].dvsampled_current, array_size,
cudaMemcpyDeviceToHost);
}
The performance of the presented code is examined using the low-pass filter
presented in Figure 3.1 as an example. Figure 3.8 shows the scattering parameters
obtained for this filter up to 20 GHz.
Table 3.1 shows the simulation parameters used for performance evaluation.
The low-pass filter simulation is performed first using a coarse grid and then using
a fine grid that has half the cell size of the coarse grid. Each case is
run on both the CPU and GPU platforms. The simulation times are recorded
and the computation performance is calculated as the number of million cells
processed per second (MCPS) using the formula
$$\mathrm{MCPS} = \frac{N_{steps} \times N_x \times N_y \times N_z}{t_s} \times 10^{-6} \tag{3.4}$$
where Nsteps is the number of time steps the program has been run and ts is the total
simulation time in seconds. Results show a speed-up factor of 14 for the coarse
grid simulations when computation is performed on a GPU versus a CPU. The
speed-up factor is 50 for the fine grid simulations. Results also indicate that
the performance of computations on a GPU is higher when the problem size is
larger.
Table 3.1
Simulation Parameters and Performance of Computations
                        Coarse grid             Fine grid
                        CPU         GPU         CPU         GPU
Nz                      13          13          26          26
Total cells             60,060      67,782      480,480     490,776
Time steps              5,000       5,000       10,000      10,000
Simulation time (s)     8.96        0.71        242         4.97
Performance (MCPS)      34          480         20          990
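The performance figures in Table 3.1 follow directly from (3.4). The short C sketch below, with the hypothetical function name `mcps()`, evaluates the formula from the number of time steps, the total cell count, and the simulation time.

```c
/* Million cells processed per second, following (3.4).
   n_cells is the total cell count Nx * Ny * Nz. */
double mcps(long n_steps, long n_cells, double t_seconds)
{
    return (double)n_steps * (double)n_cells / t_seconds * 1.0e-6;
}
```

For the coarse grid on the CPU, 5,000 steps over 60,060 cells in 8.96 s gives about 33.5 MCPS, which the table rounds to 34; the fine grid on the GPU gives about 987 MCPS, rounded to 990.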
It should be noted that these simulations are performed on a computer that has
an Intel Core 2 Quad Q9550 CPU at 2.83 GHz and an NVIDIA
GTX 480 GPU. CPU simulations are performed on a single core; thus, the multicore
power of the CPU is not fully utilized. The speed-up factors would be different if
the CPU code were parallelized to run on multiple cores.
REFERENCES
[1] K. Yee, “Numerical Solution of Initial Boundary Value Problems Involving Maxwell’s
Equations in Isotropic Media,” IEEE Transactions on Antennas and Propagation, Vol. 14, No. 3,
pp. 302–307, 1966.
[2] A. Taflove and S. Hagness, Computational Electrodynamics: The Finite-Difference Time-
Domain Method, 3rd edition. Norwood, MA: Artech House, 2005.
[3] A. Elsherbeni and V. Demir, The Finite Difference Time Domain Method for Electromagnetics:
With MATLAB Simulations, New York: SciTech Publishing, 2009.
[4] NVIDIA CUDA Parallel Programming and Computing Platform,
http://www.nvidia.com/object/cuda_home_new.html.
[5] S. Krakiwsky, L. Turner, and M. Okoniewski, “Graphics Processor Unit (GPU) Acceleration of
Finite-Difference Time-Domain (FDTD) Algorithm,” Proc. 2004 International Symposium on
Circuits and Systems, Vol. 5, pp. 265–268, 2004.
[6] S. Krakiwsky, L. Turner, and M. Okoniewski, “Acceleration of Finite-Difference Time-Domain
(FDTD) Using Graphics Processor Units (GPU),” 2004 IEEE MTT-S International Microwave
Symposium Digest, Vol. 2, pp. 1033–1036, 2004.
[7] S. Adams, J. Payne, and R. Boppana, “Finite Difference Time Domain (FDTD) Simulations
Using Graphics Processors,” Proceedings of the 2007 DoD High Performance Computing
Modernization Program Users Group (HPCMP) Conference, pp. 334–338, 2007.
[8] M. Inman, A. Elsherbeni, and C. Smith “GPU Programming for FDTD Calculations,” The
Applied Computational Electromagnetics Society (ACES) Conference, Honolulu, HI, 2005.
[9] M. Inman and A. Elsherbeni, “Programming Video Cards for Computational Electromagnetics
Applications,” IEEE Antennas and Propagation Magazine, Vol. 47, No. 6, pp. 71–78, 2005.
[10] M. Inman and A. Elsherbeni, “Acceleration of Field Computations Using Graphical Processing
Units,” The Twelfth Biennial IEEE Conference on Electromagnetic Field Computation CEFC
2006, Miami, FL, 2006.
[11] M. Inman, A. Elsherbeni, J. Maloney, and B. Baker, “Practical Implementation of a CPML
Absorbing Boundary for GPU Accelerated FDTD Technique,” The 23rd Annual Review of
Progress in Applied Computational Electromagnetics Society, ACES'07, Verona, Italy, pp. 19–23,
2007.
[12] M. Inman, A. Elsherbeni, J. Maloney, and B. Baker, “Practical Implementation of a CPML
Absorbing Boundary for GPU Accelerated FDTD Technique,” Applied Computational
Electromagnetics Society Journal, Vol. 23, No. 1, pp. 16–22, 2008.
[13] M. J. Inman and A. Z. Elsherbeni, “Optimization and Parameter Exploration Using GPU Based
FDTD Solvers,” IEEE MTT-S International Microwave Symposium Digest, pp. 149–152, June
2008.
[14] M. Inman, A. Elsherbeni, and V. Demir, “Graphics Processing Unit Acceleration of Finite
Difference Time Domain,” Chapter 12, in The Finite Difference Time Domain Method for
Electromagnetics (with MATLAB Simulations), New York: SciTech Publishing, 2009.
[15] Ian Buck, Brook Spec v0.2, Stanford, CA: Stanford University Press, 2003.
[16] N. Takada, N. Masuda, T. Tanaka, Y. Abe, and T. Ito, “A GPU Implementation of the 2-D
Finite-Difference Time-Domain Code Using High Level Shader Language,” Applied
Computational Electromagnetics Society Journal, Vol. 23, No. 4, pp. 309–316, 2008.
[17] Valcarce, G. de la Roche, and J. Zhang, “A GPU Approach to FDTD for Radio Coverage
Prediction,” Proceedings of the 11th IEEE Singapore International Conference on
Communication Systems (ICCS), pp. 1585–1590, GuangZhou, China, 2008.
[18] P. Sypek and M. Michal, “Optimization of a FDTD Code for Graphical Processing Units,” 17th
International Conference on Microwaves, Radar and Wireless Communications, MIKON 2008,
Wroclaw, Poland, pp. 1–3, 2008.
[19] P. Sypek, A. Dziekonski, and M. Mrozowski, “How to Render FDTD Computations More
Effective Using a Graphics Accelerator,” IEEE Transactions on Magnetics, Vol. 45, No. 3, pp.
1324–1327, 2009.
[20] N. Takada, T. Shimobaba, N. Masuda, and T. Ito, “High-Speed FDTD Simulation Algorithm for
GPU with Compute Unified Device Architecture,” IEEE International Symposium on Antennas
& Propagation & USNC/URSI National Radio Science Meeting, North Charleston, SC, p. 4,
2009.
[21] A. Valcarce, G. De La Roche, A. Jüttner, D. López-Pérez, and J. Zhang, “Applying FDTD to the
Coverage Prediction of WiMAX Femtocells,” EURASIP Journal on Wireless Communications
and Networking, 2009.
[22] V. Demir, and A. Elsherbeni, “Compute Unified Device Architecture (CUDA) Based Finite-
Difference Time-Domain (FDTD) implementation,” Journal of the Applied Computational
Electromagnetics Society (ACES), Vol. 25, No. 4, pp. 303–314, 2010.
[23] The OpenCL Specification, ver. 1.0, Khronos OpenCL Working Group,
http://www.khronos.org/registry/cl/specs/opencl-1.0.48.pdf, 2009.
[24] T. Stefanski, S. Benkler, N. Chavannes, and N. Kuster. “Parallel Implementation of the Finite-
Difference Time-Domain Method in Open Computing Language,” 2010 International
Conference on Electromagnetics in Advanced Applications (ICEAA), pp. 557–560, 2010.
[25] T. P. Stefanski, P. Tomasz, N. Chavannes, and N. Kuster. “Performance Evaluation of the Multi-
Device OpenCL FDTD Solver,” Proceedings of the 5th European Conference on Antennas and
Propagation (EUCAP), pp. 3995–3998, 2011.
[26] D. Sheen, S. Ali, M. Abouzahra, and J. Kong, “Application of the Three-Dimensional Finite-
Difference Time-Domain Method to the Analysis of Planar Microstrip Circuits,” IEEE
Transactions on Microwave Theory and Techniques, Vol. 38, No. 7, pp. 849–857, 1990.
Chapter 4
Recent FDTD Advances for Electromagnetic
Wave Propagation in the Ionosphere
Alireza Samimi, Bach T. Nguyen, and Jamesina J. Simpson
This chapter presents an overview of two recent FDTD [1, 2] modeling advances
for calculating electromagnetic wave propagation in the ionosphere [3]. Section 4.1
introduces the topic and highlights the advantages of using FDTD
over ray tracing techniques for this application. Section 4.2 summarizes the current
state of the art for calculating transionospheric electromagnetic wave propagation.
Section 4.3 provides an overview of the global FDTD Earth-ionosphere models.
Section 4.4 then describes a new and efficient 3-D FDTD magnetized ionospheric
plasma model [4] that may be used to greatly advance the current state of the art
for ionospheric wave propagation. Next, Section 4.5 describes a new capability:
stochastic FDTD (S-FDTD) [5, 6] magnetized ionospheric plasma modeling [7],
which can yield both the mean and the variance of the electric and magnetic fields
resulting from variances and uncertainties in the ionosphere composition. The
chapter concludes with a discussion of input parameters and a list of possible
applications of these models.
4.1 INTRODUCTION
The challenge of accommodating all of the above details and
physics is that the FDTD model may quickly become very memory and time
intensive and thus require significant supercomputing resources. This makes real-
time calculations difficult or sometimes even impossible to obtain. Further, if the
electromagnetic frequency is high enough (and thus the required grid cell size
small enough), the required number of grid cells may become computationally
infeasible, especially for long propagation paths.
In this section, the general modeling approach to the global 3-D FDTD Earth-
ionosphere models is described. Only the solution to Maxwell’s equations is
presented in this section, as it applies to the lithosphere and atmosphere regions.
These solutions to Maxwell’s equations may also be used in the ionosphere region if
a simplified isotropic ionosphere is utilized. Section 4.4 will describe the modeling
approach that may be used in the ionosphere region when the magnetized
ionospheric plasma must be accommodated.
Although global FDTD models are described here, these models may be easily
adapted to simulate only local regions at higher resolutions. This is useful for
higher-frequency applications in which the electromagnetic waves would only
propagate vertically (radially) or only over short distances laterally around the
world.
Two generations of global FDTD models have been developed: latitude-
longitude models [12, 24] and geodesic models [25]. Only a latitude-longitude
model will be described here because of the ease of implementing a magnetized
ionospheric plasma algorithm on its East-West and North-South components rather
than on the variable field component orientations in the geodesic hexagonal and
triangular grid cells.
Note that the model described here from [12] is more efficient than other
global FDTD models [24] because it includes a means of mitigating the grid-cell
eccentricity by merging cells in the East-West direction as either pole is
approached. This mitigation technique permits the use of a larger time step,
close to the Courant limit set by the equatorial cells.
The present model maps the complete Earth-ionosphere cavity onto a 3-D
spherical-coordinate FDTD space lattice that extends ±100 km radially from sea
level. Figure 4.2 illustrates the general layout of the lattice as seen from the
transverse magnetic (TM) plane at a constant radial coordinate. The lattice is
presented in this section in a logically Cartesian 2M × M × K-cell arrangement,
where M is a power of 2, in order to distinguish between the spherical positions of
each cell (r, θ, ϕ) and their corresponding grid cell indices (i, j, k). The lattice-cell
position index in the West-East direction is 1 ≤ i ≤ 2M, the lattice-cell position
index in the South-North direction is 1 ≤ j ≤ M, and the lattice-cell position index
in the radial direction is 1 ≤ k ≤ K. We see that the grid cells follow along lines of
constant latitude, θ = constant, where θ is the usual spherical angle measured from
the North Pole, and along lines of constant longitude, ϕ = constant, where ϕ is the
usual spherical azimuthal angle measured from a specified prime meridian. In this
manner, each TM plane of the grid shown in Figure 4.2 consists of isosceles
trapezoidal cells away from the North and South Poles, and isosceles triangular cells
at the poles. Similarly, each transverse electric (TE) plane at a constant radial
coordinate consists of isosceles trapezoidal cells away from the North and
South Poles, and a polygon at the poles [12].
Figure 4.2 General layout of the 3-D FDTD Earth-ionosphere model as seen from a TM plane at a
constant radial coordinate. ©2014 IEEE [12].
The same angular increment in latitude, Δθ = π/M, is chosen for each cell in the
grid. Thus, the South-North span of each trapezoidal or triangular grid cell is Δs-n =
πR/M, where R is the radial distance from the center of the Earth. To maintain
square or nearly square grid cells near the equator, we select the baseline value of
the angular increment in longitude, Δϕ, to equal Δθ. However, this causes the West-
East span of each cell, Δw-e = R Δϕ sin θ, to be a function of θ. This could be
troublesome for cells near the North and South Poles where θ → 0 and θ → π,
respectively. There, the geometrical eccentricity of each cell, Δs-n / Δw-e = Δθ / (Δϕ
sin θ), would become quite large, and the numerical stability and efficiency of the
FDTD algorithm would be degraded. To address this issue, adjacent grid cells in
the West-East direction are systematically combined as either pole is approached
in order to keep all the grid cells at nearly the same size [12]. This is illustrated
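The actual merging scheme is given in [12]. As a rough illustration only, the sketch below assumes Δϕ = Δθ = π/M and a hypothetical power-of-two merge factor for each latitude row; the function name `merge_factors` and the rounding rule are assumptions made here, not the scheme of [12]. It shows how the eccentricity Δs-n / Δw-e = 1 / sin θ grows toward the poles and how combining West-East neighbors brings it back near unity.

```python
import math

def merge_factors(M):
    """Per latitude row j = 1..M (South to North), return a tuple
    (j, eccentricity, merge factor, eccentricity after merging)."""
    rows = []
    for j in range(1, M + 1):
        theta = math.pi * (M - j + 0.5) / M   # colatitude at the row centre
        ecc = 1.0 / math.sin(theta)           # ds_n / dw_e when dphi == dtheta
        # Hypothetical rule: largest power of two that keeps the merged
        # West-East span no longer than the South-North span.
        factor = 2 ** int(math.floor(math.log2(ecc)))
        rows.append((j, ecc, factor, ecc / factor))
    return rows

if __name__ == "__main__":
    for j, ecc, factor, merged in merge_factors(64)[-4:]:
        print(f"j={j:3d}  eccentricity={ecc:6.2f}  merge={factor:3d}  merged={merged:4.2f}")
```

Under this assumed rule with M = 64, rows at the equator need no merging, while the rows adjacent to either pole merge 32 cells, and every merged cell has an eccentricity between 1 and 2.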
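Given the above assumptions, Ampere’s Law in integral form [2] can be applied to
develop an FDTD time-stepping relation for the electric field Ez at the center of the
(i, j, k)’th trapezoidal grid cell [12]. For example, we have

$$
\begin{aligned}
E_z^{n+1}(i,j,k) = E_z^{n}(i,j,k) + \frac{\Delta t}{\varepsilon_0\, S(j,k)} \Big[ & H_x^{n+0.5}(i,j-0.5,k)\,\Delta_{w-e}(j-0.5,k) \\
& - H_x^{n+0.5}(i,j+0.5,k)\,\Delta_{w-e}(j+0.5,k) \\
& + \big( H_y^{n+0.5}(i+0.5,j,k) - H_y^{n+0.5}(i-0.5,j,k) \big)\,\Delta_{s-n} \Big]
\end{aligned}
\tag{4.1}
$$

where

$$\Delta_{w-e}(j+0.5,k) = R\,\Delta\phi\,\sin\!\left[\frac{\pi\,(M-j)}{M}\right] \tag{4.2a}$$

$$\Delta_{w-e}(j-0.5,k) = R\,\Delta\phi\,\sin\!\left[\frac{\pi\,(M-j+1)}{M}\right] \tag{4.2b}$$

$$S(j,k) = \frac{\Delta_{s-n}}{2}\,\big[\Delta_{w-e}(j+0.5,k) + \Delta_{w-e}(j-0.5,k)\big] \tag{4.2c}$$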
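As a minimal sketch of how (4.1) and (4.2) fit together for a single trapezoidal cell, the code below evaluates the edge spans, the cell area, and one Ez update. It assumes Δϕ = Δθ = π/M, a right-handed (West-East, South-North, radial) orientation for the signs of the H terms, and illustrative function names (`we_span`, `trapezoid_area`, `update_ez`) that do not come from [12].

```python
import math

EPS0 = 8.8541878128e-12  # permittivity of free space, F/m

def we_span(j_edge, M, R):
    """West-East span at a latitude edge j_edge = j + 0.5 or j - 0.5,
    following (4.2a) and (4.2b) with dphi = pi / M."""
    return R * (math.pi / M) * math.sin(math.pi * (M - j_edge + 0.5) / M)

def trapezoid_area(j, M, R):
    """Area of trapezoidal cell j, following (4.2c)."""
    ds_n = math.pi * R / M
    return 0.5 * ds_n * (we_span(j + 0.5, M, R) + we_span(j - 0.5, M, R))

def update_ez(ez, hx_south, hx_north, hy_east, hy_west, j, M, R, dt):
    """One Ez update in the form of (4.1).
    hx_south, hx_north: Hx on the j - 0.5 and j + 0.5 edges.
    hy_east, hy_west:   Hy on the i + 0.5 and i - 0.5 edges."""
    ds_n = math.pi * R / M
    circulation = (hx_south * we_span(j - 0.5, M, R)
                   - hx_north * we_span(j + 0.5, M, R)
                   + (hy_east - hy_west) * ds_n)
    return ez + dt / (EPS0 * trapezoid_area(j, M, R)) * circulation
```

A quick sanity check on the geometry is that the areas of all 2M × M cells at a given radius add up to nearly 4πR², the surface area of the sphere.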
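Similarly, the update for Ez at the center of the i’th triangular grid cell at the
North Pole (j = M) is given by

$$
\begin{aligned}
E_z^{n+1}(i,M,k) = E_z^{n}(i,M,k) + \frac{\Delta t}{\varepsilon_0\, S(M,k)} \Big[ & H_x^{n+0.5}(i,M-0.5,k)\,\Delta_{w-e}(M-0.5,k) \\
& + \big( H_y^{n+0.5}(i+0.5,M,k) - H_y^{n+0.5}(i-0.5,M,k) \big)\,\Delta_{s-n} \Big]
\end{aligned}
\tag{4.3}
$$

where

$$S(M,k) = \frac{\Delta_{w-e}(M-0.5,k)\,\Delta_{s-n}}{2}\,\sin\!\left\{\cos^{-1}\!\left[\frac{\Delta_{w-e}(M-0.5,k)}{2\,\Delta_{s-n}}\right]\right\} \tag{4.4}$$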
Expressions analogous to (4.3) and (4.4) can be derived for the i’th triangular
grid cell at the South Pole (j = 1).
The basic TM-plane FDTD time-stepping algorithm is completed by
specifying the updates for the Hx and Hy fields [12]. For example, for the
trapezoidal grid cell we have
Expressions analogous to (4.7) and (4.8) can be derived for a triangular grid
cell at the South Pole (j = 1).
For additional details of the global FDTD updating equations, including those
for merging of cells, TE field components, and so forth, please refer to [12].
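$$\frac{\partial \mathbf{J}_j}{\partial t} + \nu_j \mathbf{J}_j = \varepsilon_0\,\omega_{pj}^2\,\mathbf{E} - \boldsymbol{\omega}_{cj} \times \mathbf{J}_j \tag{4.9}$$

where $\mathbf{J}_j$ is the electric current density due to species j, the subscript j denotes the
electron or ion species (e for electrons, p for positive ions, or n for negative ions),
$\nu_j$ is the collision frequency, $\varepsilon_0$ is the electric permittivity of free space, $\omega_{pj}$ is the
plasma frequency of species j, and $\boldsymbol{\omega}_{cj}$ is the gyro-frequency of species j. Note that for
electrons and negative ions the gyro-frequency ($\boldsymbol{\omega}_{cj}$) is negative.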
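Equation (4.9) is incorporated into Maxwell’s equations as:

$$\nabla \times \mathbf{H} = \varepsilon_0 \frac{\partial \mathbf{E}}{\partial t} + \sum_{j} \mathbf{J}_j + \mathbf{J}_s \tag{4.10}$$

$$\nabla \times \mathbf{E} = -\mu_0 \frac{\partial \mathbf{H}}{\partial t} \tag{4.11}$$

where $\mathbf{J}_s$ is the external source current density. The discretization technique that is
used is based on the Yee algorithm where the transverse magnetic (TM) and the
transverse electric (TE) planes are stacked in the z-direction. The H field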
components are calculated at each half time step, that is, (n + 1/2), and the E
fields at every integer time step, that is, (n). The FDTD form of Maxwell’s
equations for the Earth-ionosphere model are described in [12] and in Section 4.3.
Thus, the focus of this section is on the efficient computational solution of (4.9)
and its incorporation into (4.10). Equation (4.11) is not modified relative to
traditional FDTD.
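$$\frac{\partial \mathbf{J}_e}{\partial t} + \nu_e \mathbf{J}_e = \varepsilon_0\,\omega_{pe}^2\,\mathbf{E} - \boldsymbol{\omega}_{ce} \times \mathbf{J}_e \tag{4.12}$$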
The difficulty in solving (4.12) in the collisional regime is that the current
density vector is needed at time step (n+1/2), which is not yet known. To resolve
this issue, a two-step method known as the predictor-corrector method is
employed. In the first (predictor) step, the current density vector at (n−1/2) is used to
predict the current density at (n+1/2). In the second (corrector) step, the predicted
current density vector from the first step is used, and all the equations are solved
again. This yields a new current density vector at (n+1/2), known as the corrector
current density vector. The average of the predicted and the corrector current
density vectors at (n+1/2) is then used as the current density vector at
(n+1/2). The predictor-corrector method is also known as the MacCormack method;
it is second-order accurate [29, 30].
Equation (4.12) in discrete form in the predictor step is as follows:
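$$\frac{\mathbf{J}_{e,p}^{\,n+\frac{1}{2}} - \mathbf{J}_{e}^{\,n-\frac{1}{2}}}{\Delta t} + \nu_e\,\mathbf{J}_{e}^{\,n-\frac{1}{2}} = \varepsilon_0\,\omega_{pe}^2\,\mathbf{E}^n - \boldsymbol{\omega}_{ce} \times \frac{\mathbf{J}_{e,p}^{\,n+\frac{1}{2}} + \mathbf{J}_{e}^{\,n-\frac{1}{2}}}{2} \tag{4.13}$$

$$\mathbf{J}^{+} = \mathbf{J}_{e,p}^{\,n+\frac{1}{2}} - \frac{\Delta t\,\varepsilon_0\,\omega_{pe}^2\,\mathbf{E}^n}{2} + \frac{\Delta t\,\nu_e\,\mathbf{J}_{e}^{\,n-\frac{1}{2}}}{2} \tag{4.14}$$

$$\mathbf{J}^{-} = \mathbf{J}_{e}^{\,n-\frac{1}{2}} + \frac{\Delta t\,\varepsilon_0\,\omega_{pe}^2\,\mathbf{E}^n}{2} - \frac{\Delta t\,\nu_e\,\mathbf{J}_{e}^{\,n-\frac{1}{2}}}{2} \tag{4.15}$$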
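Substituting (4.14) and (4.15) into (4.13) leaves only the cross-product term,
$(\mathbf{J}^{+} - \mathbf{J}^{-})/\Delta t = -\boldsymbol{\omega}_{ce} \times (\mathbf{J}^{+} + \mathbf{J}^{-})/2$.
The cross product does not change the energy; therefore, $|\mathbf{J}^{+}| = |\mathbf{J}^{-}|$.
However, the direction of the vector is changed. Figure 4.3 demonstrates the
rotation of the current density vector around $\boldsymbol{\omega}_{ce}$, which for simplicity (only for the
figure) is assumed to be perpendicular to the current density components. The
direction of $\boldsymbol{\omega}_{ce}$ is out of the paper and opposite to the B-field.

$$\left|\tan\frac{\theta}{2}\right| = \frac{\left|\mathbf{J}^{+} - \mathbf{J}^{-}\right|}{\left|\mathbf{J}^{+} + \mathbf{J}^{-}\right|} = \frac{|\omega_{ce}|\,\Delta t}{2}, \qquad \theta = 2\tan^{-1}\!\left(\frac{|\omega_{ce}|\,\Delta t}{2}\right) \tag{4.16}$$

$$\mathbf{J}_0 = \mathbf{J}^{-} \times \mathbf{t} \tag{4.17a}$$

$$\mathbf{J}_1 = \mathbf{J}^{-} + \mathbf{J}_0 \tag{4.17b}$$

$$\mathbf{J}_2 = \mathbf{J}_1 \times \mathbf{s} \tag{4.17c}$$

$$\mathbf{J}^{+} = \mathbf{J}^{-} + \mathbf{J}_2 \tag{4.17d}$$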
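In the corrector step, the corresponding equations are:

$$\frac{\mathbf{J}_{e,c}^{\,n+\frac{1}{2}} - \mathbf{J}_{e}^{\,n-\frac{1}{2}}}{\Delta t} + \nu_e\,\mathbf{J}_{e,p}^{\,n+\frac{1}{2}} = \varepsilon_0\,\omega_{pe}^2\,\mathbf{E}^n - \boldsymbol{\omega}_{ce} \times \frac{\mathbf{J}_{e,c}^{\,n+\frac{1}{2}} + \mathbf{J}_{e}^{\,n-\frac{1}{2}}}{2} \tag{4.18}$$

$$\mathbf{J}^{+} = \mathbf{J}_{e,c}^{\,n+\frac{1}{2}} - \frac{\Delta t\,\varepsilon_0\,\omega_{pe}^2\,\mathbf{E}^n}{2} + \frac{\Delta t\,\nu_e\,\mathbf{J}_{e,p}^{\,n+\frac{1}{2}}}{2} \tag{4.19}$$

$$\mathbf{J}^{-} = \mathbf{J}_{e}^{\,n-\frac{1}{2}} + \frac{\Delta t\,\varepsilon_0\,\omega_{pe}^2\,\mathbf{E}^n}{2} - \frac{\Delta t\,\nu_e\,\mathbf{J}_{e,p}^{\,n+\frac{1}{2}}}{2} \tag{4.20}$$

$$\mathbf{J}_{e}^{\,n+\frac{1}{2}} = \frac{\mathbf{J}_{e,p}^{\,n+\frac{1}{2}} + \mathbf{J}_{e,c}^{\,n+\frac{1}{2}}}{2} \tag{4.21}$$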
The maximum allowed time step that may be used depends on the electron
gyro-frequency; that is, $\Delta t < \pi/\omega_{ce}$.
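The predictor and corrector passes of (4.13) through (4.21) can be sketched for a single grid cell as follows. This is an illustration under stated assumptions rather than the implementation of [4]: the magnetic term is taken in the form −ω_ce × J, and the rotation vectors of (4.17) are assumed to follow the standard Boris choice t = ω_ce Δt/2 and s = 2t/(1 + |t|²) described in [27, 28]. All function names are hypothetical.

```python
import math

def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def add(a, b):
    return tuple(x + y for x, y in zip(a, b))

def boris_rotate(j_minus, omega_c, dt):
    """Rotate J- into J+ following the steps (4.17a)-(4.17d).
    Assumes t = omega_c*dt/2 and s = 2t/(1+|t|^2) (standard Boris choice)."""
    t = tuple(0.5 * dt * w for w in omega_c)
    t2 = sum(x * x for x in t)
    s = tuple(2.0 * x / (1.0 + t2) for x in t)
    j1 = add(j_minus, cross(j_minus, t))       # (4.17a), (4.17b)
    return add(j_minus, cross(j1, s))          # (4.17c), (4.17d)

def step_current(j_old, e_field, omega_c, nu, eps0_wp2, dt):
    """Advance J from n-1/2 to n+1/2 with one predictor and one corrector pass.
    eps0_wp2 is the product eps0 * omega_pe**2."""
    def one_pass(j_collision):
        # Half-step contribution of the field and collision terms.
        kick = tuple(0.5 * dt * (eps0_wp2 * e - nu * jc)
                     for e, jc in zip(e_field, j_collision))
        j_minus = add(j_old, kick)                    # (4.15) / (4.20)
        j_plus = boris_rotate(j_minus, omega_c, dt)   # pure rotation
        return add(j_plus, kick)                      # invert (4.14) / (4.19)
    j_pred = one_pass(j_old)     # predictor: collision term uses J at n-1/2
    j_corr = one_pass(j_pred)    # corrector: collision term uses predicted J
    return tuple(0.5 * (p + c) for p, c in zip(j_pred, j_corr))   # (4.21)
```

With no electric field and no collisions the update reduces to a pure rotation, so the magnitude of J is preserved and the rotation angle per step equals 2 tan⁻¹(ω_ce Δt/2), in line with (4.16).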
Several validation tests are performed in [4] to demonstrate the accuracy and
capability of the newly developed FDTD plasma model. Section 4.4.2 below
summarizes two example validation scenarios. Section 4.4.3 then summarizes the
performance of the new FDTD plasma model.
to the logically Cartesian grid description of Section 4.3). These tests serve as a
high-resolution validation of the global FDTD plasma model of [4, 9, 26].
The spherical waveguide has an internal radius of 2.673 m and an external
radius of 3.6978 m. A magnetic field is considered in the south-north direction and
its strength is $B_0 = 0.06$ T. The electron density is $n_e = 10^{18}\ \mathrm{m^{-3}}$. The source of the
electromagnetic wave is located at 30° S and propagation toward the equator is
examined. First, propagation of a single-frequency sinusoidal wave with frequency
f = 10.34 GHz ($\omega = 6.5 \times 10^{10}$ rad/s) is simulated. The source creates a linearly
polarized electromagnetic plane wave polarized in the radial r-direction.
According to plasma theory, only circular polarization can propagate along the
magnetic field. The electromagnetic wave with linear polarization can be
decomposed into a left-hand and a right-hand circular polarization wave. The right-
hand circular polarization wave is known as R-wave and the left-hand circular
polarization wave is called L-wave. The velocity of the left-hand circularly
polarized wave differs from that of the right-hand circularly polarized wave.
Because of this, the direction of polarization of the initially linearly polarized wave
rotates as the wave moves along the magnetic field. This rotation is known as
Faraday rotation [31]. It can be shown that the rotation angle per unit distance $\theta_F$
may be obtained from the following expression [19]:
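$$\theta_F = \frac{\beta_{LH} - \beta_{RH}}{2} \tag{4.22}$$

where $\beta_{LH}$ and $\beta_{RH}$ are the wave numbers of the L-wave and R-wave, respectively,
and can be calculated as follows [31]:

$$\beta_{LH} = \omega\sqrt{\mu_0\varepsilon_0}\,\sqrt{1 - \frac{(\omega_{pe}/\omega)^2}{1 + |\omega_{ce}|/\omega}} \tag{4.23}$$

$$\beta_{RH} = \omega\sqrt{\mu_0\varepsilon_0}\,\sqrt{1 - \frac{(\omega_{pe}/\omega)^2}{1 - |\omega_{ce}|/\omega}} \tag{4.24}$$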
1.3 mm × 1.3 mm. Figure 4.4 shows the polarization of the electromagnetic wave
at each observation point. The numerical Faraday rotation can be obtained from
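$$\theta_{FN} = \frac{\tan^{-1}\!\left(E_\phi / E_r\right)}{d} \tag{4.25}$$

$$\mathrm{error}_F = \frac{\theta_F - \theta_{FN}}{\theta_F} \tag{4.26}$$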
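For orientation, the analytical rotation rate of (4.22) through (4.24) can be evaluated for the stated test parameters (f = 10.34 GHz, n_e = 10¹⁸ m⁻³, B₀ = 0.06 T). The sketch below is a hedged illustration: it uses the magnitude of the electron gyro-frequency, the standard cold-plasma definitions of ω_pe and ω_ce, and a hypothetical function name.

```python
import math

E_CHARGE = 1.602176634e-19   # elementary charge, C
M_E = 9.1093837015e-31       # electron mass, kg
EPS0 = 8.8541878128e-12      # permittivity of free space, F/m
MU0 = 1.25663706212e-6       # permeability of free space, H/m

def faraday_rotation(freq_hz, n_e, b0):
    """Return (theta_F, beta_LH, beta_RH) in rad/m following (4.22)-(4.24)."""
    w = 2.0 * math.pi * freq_hz
    w_pe = math.sqrt(n_e * E_CHARGE ** 2 / (EPS0 * M_E))   # plasma frequency
    w_ce = E_CHARGE * b0 / M_E                             # |gyro-frequency|
    k0 = w * math.sqrt(MU0 * EPS0)
    x = (w_pe / w) ** 2
    beta_lh = k0 * math.sqrt(1.0 - x / (1.0 + w_ce / w))
    beta_rh = k0 * math.sqrt(1.0 - x / (1.0 - w_ce / w))
    return 0.5 * (beta_lh - beta_rh), beta_lh, beta_rh

if __name__ == "__main__":
    theta_f, beta_lh, beta_rh = faraday_rotation(10.34e9, 1.0e18, 0.06)
    print(f"theta_F = {theta_f:.2f} rad/m = {math.degrees(theta_f) / 1000.0:.2f} deg/mm")
```

Multiplying the returned rate by a propagation distance gives the analytical rotation angle against which a numerical value such as (4.25) can be compared.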
Figure 4.4 Faraday rotation of 10.34-GHz electromagnetic wave propagation along the magnetic
field from 30° South toward the equator inside a small spherical waveguide with an
internal radius of 2.673 m and an external radius of 3.6978 m. Note that the electric field
is recorded at a radius of 3.18 m and between 10 and 80 mm from the source in increments
of 10 mm. ©2014 IEEE [4].
Next, using the same model, a Gaussian pulse is used for the source of the
electromagnetic wave. The source electric field is described by the following
expression:
$$E = \exp\!\left[-\frac{(t - 50\,\Delta t)^2}{2\,(7\,\Delta t)^2}\right] \tag{4.27}$$

This pulse is expected to excite the R-wave and L-wave as well as the low-frequency
whistler mode. The whistler mode is part of the R-wave dispersion relation that can
propagate at frequencies less than the electron gyro-frequency.
Figure 4.5 Time domain electric field waveform in the r-direction recorded ~40 mm from the
source along the magnetic field. ©2014 IEEE [4].
Figure 4.6 Power spectrum of the electric field in the r-direction recorded ~40 mm from the source
along the magnetic field line. ©2014 IEEE [4].
Figure 4.5 shows the time domain electric field in the r-direction, that is,
$E_r(t)$, 40 cells (approximately 40 mm) from the source. The low-frequency
whistler mode arrives at the observation point at around 1.2 ps. Figure 4.6 shows
the power spectrum of the electric field corresponding to the time waveform of
Figure 4.5. The L-wave cutoff frequency, $\omega_L$, the R-wave cutoff frequency, $\omega_R$,
and the whistler mode with frequency band less than the electron cyclotron
frequency ($< \omega_{ce}$) are apparent in the figure. These results are also in very good
agreement with plasma theory and the simulation results of the previous
anisotropic model [19].
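The cutoff frequencies expected from cold-plasma theory can be estimated for the same test parameters. The sketch below uses the standard textbook expressions ω_R = [|ω_ce| + (ω_ce² + 4ω_pe²)^½]/2 and ω_L = [−|ω_ce| + (ω_ce² + 4ω_pe²)^½]/2; these expressions and the function name are supplied here for illustration and are not quoted from [4].

```python
import math

E_CHARGE = 1.602176634e-19   # elementary charge, C
M_E = 9.1093837015e-31       # electron mass, kg
EPS0 = 8.8541878128e-12      # permittivity of free space, F/m

def cutoff_frequencies(n_e, b0):
    """Return (w_L, w_R, w_ce, w_pe) in rad/s for an electron plasma."""
    w_pe = math.sqrt(n_e * E_CHARGE ** 2 / (EPS0 * M_E))
    w_ce = E_CHARGE * b0 / M_E
    root = math.sqrt(w_ce ** 2 + 4.0 * w_pe ** 2)
    return 0.5 * (root - w_ce), 0.5 * (root + w_ce), w_ce, w_pe

if __name__ == "__main__":
    w_l, w_r, w_ce, w_pe = cutoff_frequencies(1.0e18, 0.06)
    for name, w in (("L cutoff", w_l), ("R cutoff", w_r), ("gyro", w_ce)):
        print(f"{name:9s} {w / (2.0 * math.pi * 1e9):6.2f} GHz")
```

For these parameters both cutoffs lie below the 10.34-GHz source frequency used in the Faraday rotation test, which is consistent with both circular polarizations propagating there.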
Note that in this validation test, the time step value for solving Maxwell’s
equations is chosen according to the Courant stability condition and is $\Delta t = 1.5$ ps.
This time step value corresponds to a rotation angle $\theta = 0.9^\circ$ that yields a
numerical electron gyro-frequency error of less than 0.5%. Therefore, there is no
need to use a different time step for solving the current equation compared to
Maxwell’s equations.
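The quoted rotation angle can be checked directly from (4.16). In the sketch below the "gyro-frequency error" is taken to mean the relative difference between the numerical rotation angle 2 tan⁻¹(ω_ce Δt/2) and the exact angle ω_ce Δt; that definition, like the function name, is an assumption made for illustration.

```python
import math

E_CHARGE = 1.602176634e-19   # elementary charge, C
M_E = 9.1093837015e-31       # electron mass, kg

def rotation_per_step(b0, dt):
    """Return (angle in degrees, relative angle error, pi / w_ce)."""
    w_ce = E_CHARGE * b0 / M_E
    exact = w_ce * dt                              # ideal gyration angle
    numerical = 2.0 * math.atan(0.5 * w_ce * dt)   # angle from (4.16)
    return math.degrees(numerical), abs(exact - numerical) / exact, math.pi / w_ce

if __name__ == "__main__":
    angle, rel_err, dt_max = rotation_per_step(0.06, 1.5e-12)
    print(f"angle per step = {angle:.3f} deg, relative error = {rel_err:.2e}")
    print(f"pi / w_ce = {dt_max:.3e} s")
```

For B₀ = 0.06 T and Δt = 1.5 ps the angle is close to 0.9°, and the time step is well below π/ω_ce.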
Next, propagation of a sinusoidal electromagnetic wave perpendicular to the
magnetic field is investigated. Depending upon the direction of the electric field
component with respect to the magnetic field, two types of electromagnetic waves
can propagate: (1) the ordinary mode (O-mode), for which the electric field
component of the electromagnetic wave is parallel to the background magnetic
field; and (2) the extraordinary mode (X-mode), for which the electric field
component of the electromagnetic wave is perpendicular to the magnetic field. The
O-mode wave has a cutoff frequency determined by the plasma frequency; that is,
it propagates only when $\omega_0 > \omega_{pe}$.
Figure 4.7 Frequency power spectrum of the $E_r$ component of an O-mode plane wave propagating
from 30° S toward the equator: (a) under-dense plasma (ω0 > ωpe) 40 mm after the
plasma boundary; (b) under-dense plasma (ω0 > ωpe) 40 mm before the plasma
boundary; (c) over-dense plasma (ω0 < ωpe) 40 mm after the plasma boundary; and (d)
over-dense plasma (ω0 < ωpe) 40 mm before the plasma boundary.
equations are solved the current density vector is updated 25 ($25\,\Delta t_{c1} = \Delta t$) and 50
($50\,\Delta t_{c2} = \Delta t$) times, respectively. Note that for the case of $\Delta t_{c1} = 1.2 \times 10^{-7}$ s,
two simulations are conducted. In the first, collisions are considered in the
ionosphere; in the second, collisions are neglected. Figure 4.8 shows the collision
frequency versus altitude that is used in the collisional simulation case. The
ionosphere is assumed to start from 80 km.
Figure 4.8 Profile of the collision frequency in the ionosphere. ©2014 IEEE [4].
Topographic and bathymetric data are obtained from the National Oceanic and
Atmospheric Administration (NOAA) Global Relief CD-ROM. The magnetic field
is mapped onto the FDTD mesh according to the global geomagnetic field values
at 100 km as obtained from the international geomagnetic reference field (IGRF)
model. For the lithosphere, the same conductivity values of [12] are assigned
depending upon whether the space lattice point is located directly below an ocean
or within a continent. For the low-altitude atmosphere, the same exponential
conductivity profile of [22] is assumed according to [32].
Position and time-dependent density profiles of electrons and ions can be
obtained, for example, from the international reference ionosphere (IRI)
(http://iri.gsfc.nasa.gov/). However, for the general validation study performed
here, exponential profiles for the particle densities and the collision frequencies as
proposed in [33] are utilized. Note that all of these lithosphere, topography,
geomagnetic field, and ionosphere values match those used in the previous (less-
efficient) anisotropic plasma study of [26].
The current source is a 5 km-long Gaussian pulse with a 1/𝑒 full-width of
480Δ𝑡, similar to the source current waveform used in previous studies [12, 26].
The temporal center of the pulse is at 960 Δ𝑡. The source current is above the
Earth’s surface at 47° W on the equator.
Figure 4.9 ELF wave attenuation for westward propagation from ¼ to ½ of the distance between
the source and the antipode. The dotted waveform is from the previous isotropic model
of [12]. ©2014 IEEE [4].
The results of the new algorithm are compared with the validated isotropic
ionosphere FDTD model [12]. Figure 4.9 shows the attenuation of the ELF wave
travelling westward from 1/4 to 1/2 of the distance to the antipode for three cases:
(1) isotropic ionosphere case; (2) collision-less anisotropic case with $\Delta t_{c2} = 6 \times 10^{-8}$ s
for the current equation solver; and (3) collisional anisotropic ionosphere
case with $\Delta t_{c1} = 1.2 \times 10^{-7}$ s for the current equation solver.
The ELF wave attenuations for all three cases are very similar. Simpson and
Taflove [12] showed that the wave attenuation obtained from the isotropic model
agrees well with analytical predictions and measurements.
Finally, the execution time of the new, anisotropic ionosphere plasma
algorithm is compared to the previous anisotropic model of [26]. The global Earth-
ionosphere system is modeled. Both simulations are run on the same machine for only
100 time steps. The execution time of the new algorithm using $\Delta t_{c2} = 6 \times 10^{-8}$ s
(which requires 50 iterations of the current equation solver per time step of
Maxwell’s equations) was 128 seconds (1.28 seconds per time step) in comparison
to 286 seconds (2.86 seconds per time step) for the previous anisotropic algorithm.
Therefore, the new algorithm requires about 55% less execution time than the
previous one for this numerical test.
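As a quick check of the timing comparison, using only the two execution times quoted above, the relative reduction in run time and the equivalent speedup factor are:

$$\frac{286 - 128}{286} \approx 0.55, \qquad \frac{286}{128} \approx 2.2$$

That is, the 55% figure corresponds to a reduction in execution time, or equivalently to a speedup of roughly 2.2 times for this test.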
In summary, the advantages of the new FDTD plasma model [4] over the previous,
stable 3-D magnetized ionosphere plasma formulation of [19, 26] are as follows:
• It permits the use of two different time steps for solving the current equation
versus Maxwell’s equations. The previous anisotropic model did not include
this capability, and so for some cases the time-step requirements of the
current density solutions could drastically slow down the solutions to
Maxwell’s equations. As such, obtaining solutions for cases involving high
collision frequencies was nearly impossible due to the long
computational time required.
• It is faster than the previous model. Depending upon the
size of the time step needed to solve the current equation, the new algorithm
is more than 50 percent faster than the previous version.
• Implementation of the algorithm is much simpler and no matrix equation
must be solved.
• The memory requirement is drastically less than for the previous formulation
(3 additional real numbers are stored per cell relative to traditional FDTD,
compared to 9 additional real numbers stored per cell for the previous
plasma formulation; also, it does not require storage or recalculation of
coefficient matrices of size at least 6 × 6 at every grid cell).
The only disadvantage of the new algorithm is that for simulating wave
propagation in dense plasma, the maximum stable time step can be smaller than the
Courant limit. The plasma frequency puts an additional restriction on the
maximum allowable time step value. Therefore, the time step for Maxwell’s
equations should satisfy both the Courant condition and $\Delta t < 0.87/\omega_{pe}$,
whichever is more restrictive.
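The time-step selection described above can be sketched as follows. For simplicity the Courant limit is evaluated for a uniform Cartesian Yee grid, which is an assumption made here (the global model uses a spherical lattice), and the function names are hypothetical.

```python
import math

C0 = 299792458.0  # speed of light in vacuum, m/s

def courant_limit(dx, dy, dz):
    """Courant limit of a uniform 3-D Cartesian Yee grid (assumed geometry)."""
    return 1.0 / (C0 * math.sqrt(1.0 / dx ** 2 + 1.0 / dy ** 2 + 1.0 / dz ** 2))

def max_time_step(dx, dy, dz, w_pe_max):
    """Upper bound on the time step: the smaller of the Courant limit and
    0.87 / w_pe_max, where w_pe_max is the largest plasma frequency (rad/s)
    in the grid. In practice a value slightly below this bound is used."""
    return min(courant_limit(dx, dy, dz), 0.87 / w_pe_max)

if __name__ == "__main__":
    for w_pe in (5.64e10, 1.0e12):
        dt = max_time_step(1e-3, 1e-3, 1e-3, w_pe)
        print(f"w_pe = {w_pe:.2e} rad/s  ->  dt <= {dt:.3e} s")
```

With 1-mm cells the Courant limit governs at the lower plasma frequency in this example, while the plasma-frequency restriction takes over for the denser plasma.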
4.5.1 Overview
$$P + 1 = \frac{(n + d)!}{n!\,d!} \tag{4.28}$$
where d is the highest polynomial order in the expansion and n is the number of
random variables. It follows that P grows very quickly with the dimension and the
order of the decomposition. In general, the gPC method increases memory
consumption by a factor P + 1 and the simulation time is proportional to (P + 1)².
The gPC results typically converge significantly faster than the Monte Carlo
method in a number of applications. However, the method has an inherent
limitation. It can handle only a limited number of uncertain inputs. For large
numbers of random variables, polynomial chaos becomes very computationally
expensive and Monte Carlo methods are typically more feasible.
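To make the growth of P concrete, the short sketch below evaluates the number of expansion terms P + 1 = (n + d)! / (n! d!) and the corresponding (P + 1)² time factor for a few illustrative choices of n and d. The function name and the sample values are chosen here for illustration only.

```python
import math

def gpc_terms(n, d):
    """Number of gPC expansion terms, P + 1, for n random variables and
    highest polynomial order d."""
    return math.comb(n + d, d)

if __name__ == "__main__":
    for n in (2, 5, 10, 20):
        terms = gpc_terms(n, 3)
        print(f"n={n:2d}, d=3: P+1={terms:5d}, time factor ~ {terms ** 2}")
```

Even at third order, going from 2 to 20 random variables raises the number of terms from 10 to 1771, which illustrates why the method is limited to a modest number of uncertain inputs.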
In summary, each of the above approaches has its own strengths and
limitations. Given the fact that the ionosphere content can vary even up to 100% or
more, the S-FDTD method proposed in [5] and the gPC method are good
candidates for electromagnetic wave propagation modeling in ionosphere plasma.
S-FDTD was recently extended to electromagnetic wave propagation in a
magnetized ionospheric plasma by extending the stochastic variables to both
Maxwell’s equations and the Lorentz equation of motion [7]. The electric fields,
magnetic fields, current densities, electron/ion densities and collision frequencies
are all treated as random variables with their own statistical variation. The
resulting mean and variance calculations of the EMFs and current densities provide
new capabilities, such as the ability to determine the confidence level that a
communications/remote sensing/radar system will operate as expected under
abnormal ionospheric conditions. It may also be useful in a wide variety of
geophysical studies.
In [7] an S-FDTD method is developed for the previous (less efficient)
magnetized plasma algorithm of [19]. In this algorithm, the governing stochastic
equations take the form of a large, complex matrix. As a result, the complexity of
the physical model presents a computational challenge. Aside from the S-FDTD
method, if the gPC method is applied in order to avoid the approximation for the
cross correlation coefficients, the derivation of the explicit equations for the gPC
coefficients can be very difficult, or even impossible. Recently, however, a more
efficient magnetized plasma model was developed [4] as presented in Section 4.4.
In this new algorithm, since no matrix equation must be solved and all equations
are explicit, the gPC method may potentially be applied to electromagnetic wave
propagation in the ionosphere. Therefore, the gPC simulation should be derived as
part of future work and its results compared to the S-FDTD modeling and Monte
Carlo results for validation of the algorithm. It is possible that a hybrid method will
be needed to achieve optimal and efficient results. The ultimate objective is to
develop a stochastic optimization FDTD-based algorithm that is well suited for
large uncertainty quantification of the ionosphere, so that the variability of the
electromagnetic wave propagation is well under control and understood.
In the remainder of this section, general guidelines are provided for extending
the S-FDTD approach to the more efficient magnetized plasma model of Section
4.4 and [4]. The general approach is analogous to that of [7].
Using the delta method [36], the average (or expected) EMFs and current density
values may be found by solving Maxwell’s equations and the current equation
while using mean (average) values of the variables [7]. For the S-FDTD
magnetized cold plasma model, the equations for the mean values of the EMFs and
current densities are of the same form as for those of the regular 3-D FDTD
magnetized cold plasma model. Thus, the mean EMF and current density values
are found by using the mean plasma frequency of ωPe, or equivalently, the mean of
electron density ne.
The variance fields may be derived by using the delta method and the statistical
values. When solving only Maxwell’s equations, the variance field equations may
be solved separately from the mean field equations no matter the dimensionality of
the problem [5]. However, in the 3-D magnetized cold plasma model, the
momentum equation (4.9) is coupled to Maxwell’s equations (4.10) and (4.11), which leads
to a complicated but linear system. As a result, the electric field and current density
variances must be computed simultaneously. When variance equations are derived,
covariances are needed for the E and H fields and current density Je in both time
and space. Equation (4.9) also relates the current density to the collision frequency
and the electric field to the plasma frequency of the ionosphere, resulting in
additional covariance terms between the current density and collision frequency,
and the electric field and plasma frequency. For the S-FDTD method, a critical
step is to approximate these correlation coefficients, which control the accuracy of
the algorithm.
Figure 4.10 shows a diagram of the iteration process for each time step of the
S-FDTD method. What is changed from regular FDTD updating is the addition of
the calculation of the variances after the mean values are obtained. Therefore, the
running time as well as the memory required for S-FDTD is roughly double that
needed for traditional FDTD (and double that for the regular FDTD plasma model).
However, we note that even with this doubling in computation time, S-FDTD is in
general drastically faster than the brute-force Monte Carlo approach.
Since both the mean fields and their variances behave like waves, both require
boundary conditions. Thus, an absorbing boundary condition is needed for the E, H,
and Je mean values as well as for their variances. Mur’s boundary conditions as
used in [7] are an appropriate option because the boundary condition provides
good absorption regardless of the magnetic field direction [37]. The perfectly
matched layer is more complex to implement and also is only effective if the
magnetic field is homogeneous in the vicinity of the boundary region (see for
example, [38]).
[Figure 4.10 flowchart: Source; Mean H; Standard deviation σ(H); Mean E, J; New time step? (No: Stop)]
Since FDTD models may account for highly detailed geometries and materials, it
is useful to populate the FDTD grid with realistic data. The Earth’s topography and
bathymetry data may be obtained, for example, from the National Oceanic and
Atmospheric Administration (NOAA) National Geophysical Data Center (NGDC).
The Earth’s magnetic field data and its direction and amplitude variation with
position may be obtained from the International Geomagnetic Reference Field
(IGRF).
For an isotropic conductivity profile ionosphere to be used in lower-frequency
electromagnetic propagation models, relatively simple profiles based on
measurements and analytical calculations may be used, such as an exponential
conductivity profile [32] or a knee profile [39]. To model an anisotropic
magnetized plasma ionosphere to be used in higher-frequency electromagnetic
propagation models, electron and ion densities and collision frequencies and their
variation with time and position may be obtained from the International Reference
Ionosphere (IRI) and other sources. IRI has recently been expanded to include
stochastic information about the ionosphere composition [40].
4.7 CONCLUSIONS
This chapter provided an overview of two recent FDTD modeling advances for
electromagnetic wave propagation in the ionosphere: (1) a new, efficient 3-D
magnetized ionospheric plasma model; and (2) a stochastic FDTD model of
ionospheric plasma. The combination of these models provides the capability to
model high frequency electromagnetic wave propagation over longer distances
than previously possible, while also solving for not only mean but also variance
electric and magnetic fields due to uncertainties or variances in the ionosphere.
Applications of these models range from propagation studies [12, 24], remote
sensing of ionospheric anomalies [41] and underground oil fields [25, 42], to
modeling of Schumann resonances [39], hypothetical ELF earthquake precursors
[43], space weather [44], and communications.
REFERENCES
[1] K. Yee, “Numerical Solution of Initial Boundary Value Problems Involving Maxwell’s
Equations in Isotropic Media,” IEEE Transactions on Antennas and Propagation, Vol. 14, No. 3,
pp. 302–307, 1966.
[2] A. Taflove and S. Hagness, Computational Electrodynamics: The Finite-Difference Time-
Domain Method, 3rd ed., Norwood, MA: Artech House, 2005.
[3] B. Nguyen, A. Samimi, and J. Simpson, “Recent Advances in FDTD Modeling of
Electromagnetic Wave Propagation in the Ionosphere,” Applied Computational Electromagnetics
Society Journal, Vol. 29, No. 12, pp. 1003-1012, 2014.
[4] A. Samimi and J. Simpson, “An Efficient 3-D FDTD Model of Electromagnetic Wave
Propagation in Magnetized Plasma,” IEEE Transactions on Antennas and Propagation, Vol. 63,
No. 1, pp. 269–279, 2015.
[5] S. Smith and C. Furse, “Stochastic FDTD for Analysis of Statistical Variation in Electromagnetic
Fields,” IEEE Transactions on Antennas and Propagation, Vol. 60, No. 7, pp. 3343–3350, 2012.
[6] T. Tan, A. Taflove, and V. Backman, “Single Realization Stochastic FDTD for Weak Scattering
Waves in Biological Random Media,” IEEE Transactions on Antennas and Propagation, Vol. 61,
pp. 818–828, 2013.
[7] B. Nguyen, C. Furse, and J. Simpson, “A 3-D Stochastic FDTD Model of Electromagnetic Wave
Propagation in Magnetized Ionosphere Plasma,” IEEE Transactions on Antennas and
Propagation, Vol. 63, No. 1, pp. 304–313, 2015.
[8] S. Aune, “Comparison of Ray Tracing through Ionospheric Models,” Master’s thesis,
Department of the Air Force, Air Force Institute of Technology, Wright-Patterson Air Force Base,
Ohio, March 2006.
[9] K. Yeh and C. Liu, “Radio Wave Scintillations in the Ionosphere,” Proceedings of the IEEE, Vol.
70, No. 4, pp. 324–360, 1982.
[10] C. Rino, “A Power Law Phase Screen Model for Ionospheric Scintillation: 1. Weak Scatter,”
Radio Science, Vol. 14, No. 6, pp. 1135–1145, 1979.
[27] J. Boris, The Acceleration Calculation from a Scalar Potential, Plasma Physics Laboratory,
Princeton University, MATT-152, March 1970.
[28] C. Birdsall and A. Langdon, Plasma Physics Via Computer Simulation, Institute of Physics, New
York, 1991.
[29] G. Sod, “A Survey of Several Finite Difference Methods for Systems of Nonlinear Hyperbolic
Conservation Laws,” Journal of Computational Physics, Vol. 27, No. 1, pp. 1–31, 1978.
[30] R. Garcia and R. Kahawita, “Numerical Solution of the St. Venant Equations with the
MacCormack Finite-Difference Scheme,” International Journal for Numerical Methods in Fluids,
Vol. 6, pp. 259–274, 1986.
[31] F. Chen, Introduction to Plasma Physics and Controlled Fusion, Plasma Physics, 2nd ed.,
Springer, 1984.
[32] P. Bannister, “ELF Propagation Update,” IEEE J. Ocean. Eng., Vol. OE-9, No. 3, pp. 179–188,
1984.
[33] J. Wait and K. Spies, Characteristics of the Earth-Ionosphere Waveguide for VLF Radio Waves.
Boulder, CO: National Bureau of Standards, 1964.
[34] R. Edwards, A. Marvin, and S. Porter, “Uncertainty Analyses in the Finite-Difference Time-
Domain Method,” IEEE Transactions on Electromagnetic Compatibility, Vol. 52, No. 1, pp. 155–
163, 2010.
[35] A. Austin and C. Sarris, “Efficient Analysis of Geometrical Uncertainty in the FDTD Method
Using Polynomial Chaos With Application to Microwave Circuits,” IEEE Transactions on
Microwave Theory and Techniques, Vol. 61, No. 12, pp. 4293–4301, 2013.
[36] G. Casella and R. L. Berger, Statistical Inference, 2nd ed., Singapore: Thomson Learning, 2002.
[37] Y. Yu and J. Simpson, “A Magnetic Field-Independent Absorbing Boundary Condition for the
FDTD E-J Collocated Magnetized Cold Plasma Algorithm,” IEEE Antennas and Wireless
Propagation Letters, Vol. 10, pp. 294–297, 2011.
[38] W. Hu and S. Cummer, “The Nearly Perfectly Matched Layer is a Perfectly Matched Layer,”
IEEE Antennas and Wireless Propagation Letters, Vol. 3, pp. 137–140, 2004.
[39] H. Yang and V. Pasko, “Three-Dimensional Finite-Difference Time-Domain Modeling of the
Earth-Ionosphere Cavity Resonances,” Geophysical Research Letters, Vol. 32, No. L03114,
2005.
[40] O. Oladipo, J. Adeniyi, S. Radicella, and I. Adimula, “Variability of the Ionospheric Electron
Density at Fixed Heights and Validation of IRI-2007 Profiles Prediction at Ilorin,” Advances in
Space Research, Vol. 47, No. 3, pp. 496–505, 2011.
[41] J. Simpson and A. Taflove, “ELF Radar System Proposed for Localized D-Region Ionospheric
Anomalies,” IEEE Geoscience and Remote Sensing Letters, Vol. 3, No. 4, pp. 500–503, 2006.
[42] J. Simpson and A. Taflove, “A Novel ELF Radar for Major Oil Deposits,” IEEE Geoscience and
Remote Sensing Letters, Vol. 3, No. 1, pp. 36–39, 2006.
[43] J. Simpson and A. Taflove, “Electrokinetic Effect of the Loma Prieta Earthquake Calculated by
an Entire-Earth FDTD Solution of Maxwell's Equations,” Geophysical Research Letters, Vol. 32,
No. L09302, 2005.
[44] J. Simpson, “On the Possibility of High-Level Transient Coronal Mass Ejection-Induced
Ionospheric Current Coupling to Electric Power Grids,” Journal of Geophysical Research—
Space Phys., vol. 116, no. A11308, 2011.
Chapter 5
Phi Coprocessor Acceleration Techniques in
Computational Electromagnetic Methods
Wenhua Yu, Xiaoling Yang, and Lei Zhao
5.1 INTRODUCTION
Table 5.1
Xeon Phi Coprocessor Types and Technical Specifications
GDDR5 RAM (GB)               6     8     8     16
Number of cores              57    60    60    61
Number of hardware threads   4     4     4     4
RAM bandwidth (GB/s)         240   352   320   352
Memory channel               12    16    16    16
(b) More than one Xeon Phi coprocessor installed on one workstation.
Figure 5.1 Xeon Phi coprocessors (www.intel.com). (a) Appearance of active Phi coprocessors. (b)
More than one Phi coprocessor installed on one computer.
One Xeon Phi coprocessor card can be handled as an independent node in a
Linux cluster. Each card has an on-board flash device that loads the coprocessor
OS on boot, and it can be monitored by the optional cluster monitoring software
Ganglia (http://ganglia.info/) [25]. Xeon Phi coprocessor programming uses the
Many Integrated Core (MIC) instructions. The same source code can be compiled
for the host CPU and for the Xeon Phi coprocessor by changing a single compilation
option, which is an essential difference from a GPU, as shown in Figure 5.2.
Figure 5.2 Possible relationship between CPU and Xeon Phi coprocessor.
The Xeon Phi coprocessor can be used as an independent processor with its
own cores, cache, and memory. It is mounted on a workstation through a PCI
Express ×16 slot and can either run an application code independently or cooperate
with the host CPU on an application. In this section, we introduce
the environment and settings for the Xeon Phi coprocessor.
The Intel Xeon E3 and E5 CPU series are preferred on the Xeon Phi platform since
they can share the same source code with the Xeon Phi coprocessor. Not all
motherboards support the Xeon Phi coprocessor: it requires a motherboard whose
BIOS provides the large Base Address Registers (BARs) option (MMIO addressing
greater than 4 GB). Unless otherwise noted, expressions in a Courier font in this
chapter denote Linux commands, special terms, or special phrases.
Check in the motherboard manual whether the BIOS includes the PCIe/PCI/PnP
Configuration option. This menu contains the option Above 4G Decoding
(available if the system supports 64-bit PCI decoding); setting it to Enabled
allows 64-bit capable PCI devices to be decoded in the address space above 4G.
Any motherboard with at least one PCI Express ×16 slot can be used, provided
that the following BIOS option is active. When turning on the workstation, press
the Del key to open the BIOS setting window. In the BIOS setting window, select
the Advanced option, and then select the PCIE/PCI/PnP Configuration
1. Motherboard
Supermicro X9DRG-QF GPU-ready Server Board
Dual Socket R (LGA 2011): dual Intel Xeon E5-2600 or E5-2600v2 CPUs
16 DIMM: up to 1TB ECC DDR3 memory, up to 1,866 MHz
4 PCI-E 3.0 × 16 (double-width): four Intel Xeon Phi coprocessor cards
2. CPUs
Intel Xeon E5-2640 v2 Processor
Ivy Bridge-EP Eight-Core, Sixteen Threads 2 GHz 20 MB 7.2GT/s 95W
Memory Types: DDR3-800/1066/1333/1600
3. RAM
16GB PC3-12800 DDR3 1,600 MHz Registered ECC Dual-Rank 1.35V × 4
4. Xeon Phi Coprocessor
Intel Xeon Phi Coprocessor 3120A 1.1 GHz 28.5 MB Cache 300W
Number of Cores: 60
Memory: 6 GB GDDR5
5. Chassis
Supermicro SC747TQ-R1620B Tower/4U Chassis
Power Supply: 1,620W Redundant
6. Hard disk
1TB Seagate ST1000DM003 SATA 6.0 Gb/s 7,200 rpm 64 MB 3.5 inches
Regardless of the operating system installed on the host computer, the Xeon Phi
coprocessor itself runs only a micro Linux system for better code performance.
In this section, we use CentOS 6.5 (compatible with Red Hat Enterprise 6.5, free
download from http://www.centos.org) as an example to explain how to install the
Linux operating system on a workstation with Xeon Phi coprocessors.
Select the Use all space option, and check the Review and
modify partition layout box. Click the Next button.
Double-click the /home option in the LVM Volume Group-
VolGroup list, and modify the size to 1 to release space.
o Red Hat Enterprise 6.0, 6.1, 6.2, 6.3, 6.4 and 6.5
o SuSE Linux Enterprise Server (SLES) 11 SP1 and SP2 (MPSS 2.1
release) and SuSE Linux Enterprise Server (SLES) 11 SP2 and SP3
(MPSS 3.x release)
o Microsoft Windows 7 Enterprise SP1, Windows 8 Enterprise,
Windows Server 2008 R2 SP1 and Windows Server 2012
MPSS Installation
(i) Requirements
If the Xeon Phi coprocessor is found in the list, it means the Xeon Phi
coprocessor is recognized. Otherwise, check the hardware installation
and BIOS settings.
user_prompt> cd mpss-3.1.2
user_prompt> ./uninstall.sh
(vi) Identify the tar file for the host OS, then untar and install the Intel
MPSS package. There is a new folder mpss-3.1.2 in the current
folder after the MPSS package is untarred.
(vii) Regenerate the Xeon Phi coprocessor driver for the current OS
user_prompt> ls $HOME/rpmbuild/RPMS/x86_64
(ix) Initialize the MPSS Default Settings using the following command:
(x) Check the Xeon Phi coprocessor connection status using the
following command:
user_prompt> micinfo
Communication with the coprocessor Linux operating system on the Xeon Phi
coprocessor is provided by a standard network interface. The interface uses a
virtual network driver over the PCIe bus. Standard networking tools such as SSH
are supported.
The Xeon Phi coprocessor Linux operating system supports network access
for all users using the SSH keys. The configuration phase of the Intel MPSS
creates users for each coprocessor based on the current user IDs in the host
/etc/passwd file.
For each user listed in the /etc/passwd file (including root), if SSH key
files are found in the user’s .ssh directory, those keys are also copied to the
Xeon Phi coprocessor’s file system. Users without valid keys will
not have network access to the Xeon Phi coprocessor.
To generate the SSH key, each user must execute the following command:
user_prompt> ssh-keygen
The following commands must be executed in order for the MPSS to pick up
any new keys:
user_prompt> micinfo
System Information
Host OS : Linux
OS Version : 2.6.32-431.el6.x86
Driver Version : 3.1.2-1
MPSS Version : 3.1.2
Host Physical Memory : 32848 MB
o Board
PCIe Width : × 16
PCIe Speed : 5 GT/s
Board SKU : C0 QS-3120 P/A
ECC Mode : Enabled
SMC HW Revision : Product 300W Active CS
o Cores
Total No of Active Cores : 60
Frequency : 1100000 kHz
o GDDR
GDDR Density : 2048 Mb
GDDR Size : 5952 MB
GDDR Technology : GDDR5
GDDR Speed : 5.000000 GT/s
user_prompt> micsmc
Figure 5.3 Performance monitoring window on the platform with Xeon Phi coprocessor:
temperature, memory usage, power, and core average utilization for system and user.
In this subsection, we describe the compilation environment for the Xeon Phi
coprocessor and its host that can be used for different applications.
1. Install C++ compiler
We describe how to install the C++ compiler on the system with a Xeon
Phi coprocessor. You need to download three files from the Intel web site:
COM_L_CPP_C94M-W38PX7PH.Lic
I_ccompxe_2013_sp1.1.106.tgz
License.txt
Copy the three files above to a local folder and untar I_ccompxe_XXX.tgz.
The I_ccompxe_2013XXX subfolder will be generated in the current folder.
Enter this folder and double-click the install_GUI.sh file to start
installing the C++ compiler.
Since we need to use the Xeon Phi coprocessor to run application files,
the installation components should include a component like MIC support.
Input the proper license key and the compiler installs, by default, under
the /opt/intel folder. After the compiler is successfully installed, you
need to modify the environment variables.
(a) Create an intel.sh file via the steps below:
Enter /etc/profile.d and create a file intel.sh
Add PATH=$PATH:/opt/intel/bin to the file
Add export PATH to the file
Save the file
user_prompt>micctrl --updateramfs
user_prompt>service mpss restart
In the MPSS folder (/var/mpss/), there are two folders and three
files. Each Xeon Phi card corresponds to one folder, such as mic0, mic1,
mic2, and mic3. The parameters for each Xeon Phi card are kept in its own
folder micX and its own file micX.filelist, whereas the common files
and parameters are located in the common folder and the
common.filelist file, as shown in Figure 5.4.
We use a simple example to demonstrate how to use the Intel Xeon CPU and Xeon
Phi coprocessor to calculate the result of the following formula:
$$\sum_{i=0}^{4096000} a_i b_i + c_i, \tag{5.1}$$
where a, b, and c are floating-point numbers. The C++ code (main.cpp) is shown in
Listing 5.1. It uses fused multiply-add (FMA), a new feature of AVX and MIC,
such that one multiplication and one addition can be completed by a single
instruction.
Listing 5.1 Demonstration code for (5.1) on the Xeon Phi coprocessor
#include <stdio.h>
#include <stdlib.h>   // malloc, free
#include <time.h>
#include <omp.h>
#include <immintrin.h>
#ifdef _WIN32
#include <Windows.h>
#elif defined(__linux__)
#include <sys/time.h>
#endif
template<class T>
void aligned_malloc1D(T *&p, size_t n)
{
p = NULL;
size_t na = ALIGNED_SIZE;
if (na < sizeof(size_t)) na = sizeof(size_t);
p = (T *)(po + nshift);
template<class T>
void aligned_free1D(T *&p)
{
if (p == NULL) return;
free(po);
p = NULL;
}
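template<class T>
void aligned_malloc2D(T **&pp, size_t n, size_t m)
{
pp = NULL;
size_t nm = n * m;
T *p = NULL;
aligned_malloc1D(p, nm); // one contiguous, aligned block for all n * m elements
if (p == NULL) return;
// Row-pointer table; it is released with free(pp) in aligned_free2D.
pp = (T **)malloc(n * sizeof(T *));
if (pp == NULL) { aligned_free1D(p); return; }
for (size_t i = 0; i < n; i ++, p += m)
{
pp[i] = p; // row i starts m elements after row i - 1
}
}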
template<class T>
void aligned_free2D(T **&pp)
{
if (pp == NULL) return;
T *p = pp[0];
free(pp);
aligned_free1D(p);
pp = NULL;
}
struct TTIME
{
int hour, minute, second, millisecond;
};
return tm;
}
GetSystemTime(&current);
tm.hour = current.wHour;
tm.minute = current.wMinute;
tm.second = current.wSecond;
tm.millisecond = current.wMilliseconds;
#elif defined(__linux__)
timeval current;
int main()
{
int n;
n = DIM;
if (n % SIMD_SIZE) n = (n / SIMD_SIZE + 1)* SIMD_SIZE;
#ifdef __MIC__
__m512 *vA, *vB, *vC;
vA = (__m512 *)A;
vB = (__m512 *)B;
vC = (__m512 *)C;
#else
__m128 *vA, *vB, *vC;
vA = (__m128 *)A;
vB = (__m128 *)B;
vC = (__m128 *)C;
#endif
int m = n / SIMD_SIZE;
int nThreads;
nThreads = omp_get_num_threads();
mb = m / nThreads;
if (m % nThreads) mb ++;
iThread = omp_get_thread_num();
i1 = iThread * mb;
i2 = (iThread + 1) * mb;
if (i1 > m) i1 = m;
if (i2 > m) i2 = m;
aligned_free1D(A);
aligned_free1D(B);
aligned_free1D(C);
return 0;
}
}
There are two files in the application folder, namely, GNUmakefile and main.cpp.
The second command will generate two files in the application folder,
namely, main.o and matrixcalc, as shown in Figure 5.5.
Figure 5.5 Two files main.o and matrixcalc in the application folder.
Type the following command to run the code on the CPU platform:
user_prompt>./matrixcalc
The screen message can be read as:
nThreads = 16
GFLOP : 1638.4 (total operations)
Duration : 00:00:17.709 (simulation time)
GFLOPS : 92.5179 (performance)
2. Run the code on the Phi platform in the standalone mode (as an independent processor)
Once again, the Intel Xeon E5 CPU and Xeon Phi coprocessor can share
the same source code; only the additional compilation option -mmic is needed.
Go to the application file folder and type the following commands (make
sure to use the makefile with the -mmic option):
user_prompt>make clean
user_prompt>make
The second command will generate two files in the application folder,
namely, main.o and matrixcalc. Type the following command to send
the code to Phi coprocessor mic0:
Type the following command to run (you are on the Phi platform already):
user_prompt> ./matrixcalc
user_prompt> exit
Logout
Connection to mic0 closed
The same benchmark achieves better performance on the Phi
coprocessor:
export MIC_ENV_PREFIX=MIC
export MIC_LD_LIBRARY_PATH=/opt/intel/composerxe/lib/mic:\
/opt/intel/mic/coi/host-linux-release/lib
The offload mode allows one to send jobs to the Phi coprocessor
automatically. The performance in this mode will be lower than in the standalone
mode. In the offload mode, we need to allocate the memory on both the host
and the Phi coprocessor, for example,
#ifdef _WIN32
__declspec(align(64)) float fa[FLOPS_ARRAY_SIZE];
__declspec(align(64)) float fb[FLOPS_ARRAY_SIZE];
#else
__declspec(target(mic)) float fa[FLOPS_ARRAY_SIZE]
__attribute__((align(64)));
__declspec(target(mic)) float fb[FLOPS_ARRAY_SIZE]
__attribute__((align(64)));
#endif
Use the code segment in Listing 5.4 to tell the system that the following code
will be sent to the Phi coprocessor.
For the same benchmark from Intel, the performance on the Phi coprocessor is:
The total number of available threads is 224, not 228, in the offload mode
since one core is reserved for communication.
[Figure: CPU performance and memory performance versus time; the plot is annotated "Gap grows at 50% per year" and "7%/yr".]
Figure 5.7 Cache is used as a buffer between the processing unit and memory.
3. The cache hit ratio measures how well the data are organized inside cache
and memory. The performance comparison among L1, L2, L3, and
memory is listed in Table 5.2.
Table 5.2
Performance Comparison Between Cache and Memory
[Figure: address translation for matrix A through a page table in memory and the Translation Lookaside Buffer (TLB) to its locations in physical memory.]
5. The cache line is the minimum unit of exchange between cache and memory, as
shown in Figure 5.9. The size of a cache line is 64 bytes on the x86 system,
implying that the cache fetches 64 adjacent bytes of data, or 16 single-precision
floating-point numbers, in one fetch operation. Adjacent instructions should
therefore access nearby data to improve the performance.
[Figure 5.9: on a cache miss, data is loaded from memory into cache.]
6. Cache associativity is the policy that determines where loaded data may be
placed in cache, for instance, a two-way cache, as shown in Figure 5.10. M1 can
be loaded to C1 or C2. If C1 and C2 are both in use, for instance, M3 is in C1
and M5 is in C2, a cache replacement will be made: the content of either C1 or
C2 will be written back to the memory before the cache line is reused.
[Figure 5.10: memory blocks M1 to M5 and a two-way cache with lines C1, C2 and C3, C4.]
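The effect of the 64-byte cache line described in item 5 above can be illustrated with a small counting model. The sketch below is not taken from the chapter: the function name cacheLinesTouched and its parameters are hypothetical, the array is assumed to start on a cache-line boundary, and hardware prefetching is ignored.

```cpp
#include <cassert>
#include <cstddef>
#include <set>

// Illustrative model only: counts the distinct cache lines touched when
// reading `count` floats spaced `strideElems` elements apart, starting at
// byte offset 0 of a line-aligned array.
std::size_t cacheLinesTouched(std::size_t count, std::size_t strideElems,
                              std::size_t lineBytes = 64)
{
    assert(lineBytes > 0 && strideElems > 0);
    std::set<std::size_t> lines;
    for (std::size_t i = 0; i < count; ++i) {
        std::size_t byteOffset = i * strideElems * sizeof(float);
        lines.insert(byteOffset / lineBytes);   // index of the line holding element i
    }
    return lines.size();
}
```

With 4-byte floats, 16 contiguous values fit in one line, whereas 16 values accessed with a stride of 16 elements touch 16 different lines.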
Each core has its own cache for better compute performance, as shown in
Figure 5.11. If core1 and core2 access adjacent memory, cache1 and cache2 will
hold images of the same memory block. To keep the data coherent, any
change in cache1 must be pushed to cache2 immediately, and vice versa. Such an
operation causes an additional cost. Even though this cost in the Phi coprocessor
is much lower than in Intel and AMD CPUs, we still need to avoid it as much as
we can.
In order to solve this problem, let each core process a different data block, and
make each data block at least one cache line in size. One will see a significant
performance improvement in some particular problems.
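A minimal sketch of this idea is given below, assuming a 64-byte cache line and a C++17 compiler (so that the over-aligned type is honored by std::vector). The names PaddedSum and blockedSum are hypothetical; the OpenMP pragma is only active when OpenMP is enabled, and the code runs serially otherwise.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// One accumulator per block, padded to a full 64-byte cache line so that
// two threads never write into the same line (assumed line size: 64 bytes).
struct alignas(64) PaddedSum {
    float value = 0.0f;
};
static_assert(sizeof(PaddedSum) == 64, "each accumulator should fill one cache line");

// Sums `data` in `nBlocks` contiguous blocks; block t is written only by
// loop iteration t, so each thread works on its own data block.
float blockedSum(const std::vector<float> &data, int nBlocks)
{
    assert(nBlocks > 0);
    std::vector<PaddedSum> partial(nBlocks);
    const std::size_t n = data.size();
    const std::size_t blockSize = (n + nBlocks - 1) / nBlocks;

#ifdef _OPENMP
    #pragma omp parallel for
#endif
    for (int t = 0; t < nBlocks; ++t) {
        std::size_t i1 = t * blockSize;
        std::size_t i2 = i1 + blockSize;
        if (i1 > n) i1 = n;
        if (i2 > n) i2 = n;
        for (std::size_t i = i1; i < i2; ++i) partial[t].value += data[i];
    }

    float total = 0.0f;
    for (int t = 0; t < nBlocks; ++t) total += partial[t].value;
    return total;
}
```

Without the padding, neighboring accumulators would share a cache line and every update would trigger the coherence traffic described above.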
Figure 5.11 Each core has its own cache in a modern processor unit.
If the data is accessed with a stride equal to the critical stride (the cache
size divided by the number of ways), cache replacement will happen frequently
since all of these data are assigned to the same cache set, and the capacity of
a cache set is limited. For instance, the L2 cache in the Phi coprocessor is
8-way, so each cache set holds 8 cache lines, as shown in Figure 5.12.
[Figure 5.12: A(10, 0), A(10, 1), A(11, 0), A(11, 1), A(12, 0), and A(12, 1) competing for the same cache lines while other cache lines remain unused.]
Suppose that we have an array A with 13×8 elements and that the data in memory is
contiguous along the column, as shown in Figure 5.12. For example, we calculate
A(12, i) from A(10, i) and A(11, i), i = 0 to 7.
To calculate A(12, 0), we need A(10, 0) and A(11, 0), which have been loaded
into the cache. In fact, A(10, 1) to A(10, 7) and A(11, 1) to A(11, 7) have been
loaded simultaneously. Since the cache set is already occupied, the line holding
A(10, 0) will be replaced by the line holding A(12, 0). Next, when we calculate
A(12, 1) for i = 1, we need to reload A(10, 1), and then A(10, 1) is replaced by
A(12, 1) again. The cache hit ratio will be very low in this way.
Enlarging the column size of array A by one cache line size will avoid the
critical stride above, as shown in Figure 5.13.
Figure 5.13 Increase the column size of array A by one cache line size to avoid the critical stride.
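The benefit of the extra cache line can be checked with a simplified set-associative cache model. The sketch below is illustrative only: the CacheModel structure and the distinctSets function are hypothetical names, and the set index is assumed to be (byte address / line size) modulo the number of sets, which ignores details of the real hardware.

```cpp
#include <cassert>
#include <cstddef>
#include <set>

// Simplified set-associative cache model (an assumption for illustration).
struct CacheModel {
    std::size_t sizeBytes;   // total capacity
    std::size_t ways;        // associativity
    std::size_t lineBytes;   // cache line size
    std::size_t numSets() const { return sizeBytes / (ways * lineBytes); }
    std::size_t criticalStride() const { return numSets() * lineBytes; }
    std::size_t setIndex(std::size_t byteAddr) const {
        return (byteAddr / lineBytes) % numSets();
    }
};

// Counts how many different cache sets are used by the first elements of
// `nBlocks` data blocks that lie `strideBytes` apart in memory.
std::size_t distinctSets(const CacheModel &c, std::size_t nBlocks,
                         std::size_t strideBytes)
{
    assert(c.numSets() > 0);
    std::set<std::size_t> sets;
    for (std::size_t j = 0; j < nBlocks; ++j)
        sets.insert(c.setIndex(j * strideBytes));
    return sets.size();
}
```

For a 512 KB, 8-way cache with 64-byte lines, 13 blocks spaced exactly one critical stride apart all fall into a single set, while adding one cache line to the stride spreads them over 13 sets.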
We now demonstrate how to align memory to meet these requirements. Alignment
means that the starting address of a memory block is a multiple of a particular
integer. For instance, AVX-512 instructions need data to be aligned by 64 bytes,
and the cache access is aligned by 64 bytes, while the page access is aligned by
4,096 bytes. In order to use the vector unit of the Phi coprocessor, one should
use at least 64-byte alignment for memory allocation. This is quite simple with
the Intel compiler, for example,
to allocate a variable A that can hold 4,096 floating-point numbers with 64-byte
address alignment, as shown in Figure 5.14.
[Figure 5.14: array A starts at the 64-byte aligned address 0x207B340 (34059072) and ends at 0x207C33F (34063167).]
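As a portable illustration of 64-byte aligned allocation, the sketch below uses the standard C++17 function std::aligned_alloc rather than any compiler-specific call. It is not the statement used in the chapter; the helper name allocAligned64 is hypothetical, and std::aligned_alloc may be unavailable on some platforms.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Allocates `count` floats on a 64-byte boundary with std::aligned_alloc.
// The byte size is rounded up to a multiple of the alignment, as the
// standard function requires. Release the block with std::free.
float *allocAligned64(std::size_t count)
{
    const std::size_t alignment = 64;
    std::size_t bytes = count * sizeof(float);
    if (bytes % alignment) bytes = (bytes / alignment + 1) * alignment;
    float *p = static_cast<float *>(std::aligned_alloc(alignment, bytes));
    assert(p == nullptr || reinterpret_cast<std::uintptr_t>(p) % alignment == 0);
    return p;
}
```

A call such as allocAligned64(4096) returns a block whose starting address is a multiple of 64, which is the property the vector unit requires.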
We now introduce how to develop a parallel FDTD code on the Xeon Phi coprocessor
platform. A typical Yee cell with the positions of the electric and magnetic
fields is illustrated in Figure 5.15. The complete update equations for all six
field components are used to demonstrate the code development techniques on the
Xeon Phi coprocessor platform. In order to show a realistic case, we consider
electrically and magnetically inhomogeneous media in (5.3); namely, the material
parameters ε and μ are functions of the 3-D spatial coordinates. Instead of
using full material arrays, we can construct a material list and a reference
array to reduce the memory usage and improve the cache hit ratio, as shown in
Figure 5.16. A comparison of the memory usage between regular array allocation
and the material list with the reference array is shown in Table 5.3.
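One possible layout for the material list and the reference array is sketched below. It is an assumption made for illustration, not the data structure of the chapter: the names Material and MaterialGrid are hypothetical, and a 16-bit index is used per cell in place of a pointer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Each distinct material is stored once.
struct Material {
    float eps, mu, sigma, sigmaM;   // permittivity, permeability, electric and magnetic conductivity
};

// Every cell keeps only a small index into the material list instead of
// its own copy of the material parameters.
struct MaterialGrid {
    std::size_t nx, ny, nz;
    std::vector<Material> list;        // material list
    std::vector<std::uint16_t> idx;    // reference array, z index varies fastest

    MaterialGrid(std::size_t nx_, std::size_t ny_, std::size_t nz_)
        : nx(nx_), ny(ny_), nz(nz_), idx(nx_ * ny_ * nz_, 0) {}

    const Material &at(std::size_t i, std::size_t j, std::size_t k) const {
        assert(i < nx && j < ny && k < nz);
        const std::size_t m = idx[(i * ny + j) * nz + k];
        assert(m < list.size());
        return list[m];
    }
};
```

Because the material list is short, it tends to stay in cache, while the reference array costs only two bytes per cell in this sketch.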
Figure 5.15 A Yee cell and positions of the electric and magnetic fields in the FDTD method.
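$$H_x^{n+1/2}\left(i, j+\tfrac12, k+\tfrac12\right) = \frac{\mu_x - 0.5\Delta t\,\sigma_{Mx}}{\mu_x + 0.5\Delta t\,\sigma_{Mx}} H_x^{n-1/2}\left(i, j+\tfrac12, k+\tfrac12\right) + \frac{\Delta t}{\mu_x + 0.5\Delta t\,\sigma_{Mx}} \left[ \frac{E_y^n\left(i, j+\tfrac12, k+1\right) - E_y^n\left(i, j+\tfrac12, k\right)}{\Delta z} - \frac{E_z^n\left(i, j+1, k+\tfrac12\right) - E_z^n\left(i, j, k+\tfrac12\right)}{\Delta y} \right] \tag{5.3a}$$

$$H_y^{n+1/2}\left(i+\tfrac12, j, k+\tfrac12\right) = \frac{\mu_y - 0.5\Delta t\,\sigma_{My}}{\mu_y + 0.5\Delta t\,\sigma_{My}} H_y^{n-1/2}\left(i+\tfrac12, j, k+\tfrac12\right) + \frac{\Delta t}{\mu_y + 0.5\Delta t\,\sigma_{My}} \left[ \frac{E_z^n\left(i+1, j, k+\tfrac12\right) - E_z^n\left(i, j, k+\tfrac12\right)}{\Delta x} - \frac{E_x^n\left(i+\tfrac12, j, k+1\right) - E_x^n\left(i+\tfrac12, j, k\right)}{\Delta z} \right] \tag{5.3b}$$

$$H_z^{n+1/2}\left(i+\tfrac12, j+\tfrac12, k\right) = \frac{\mu_z - 0.5\Delta t\,\sigma_{Mz}}{\mu_z + 0.5\Delta t\,\sigma_{Mz}} H_z^{n-1/2}\left(i+\tfrac12, j+\tfrac12, k\right) + \frac{\Delta t}{\mu_z + 0.5\Delta t\,\sigma_{Mz}} \left[ \frac{E_x^n\left(i+\tfrac12, j+1, k\right) - E_x^n\left(i+\tfrac12, j, k\right)}{\Delta y} - \frac{E_y^n\left(i+1, j+\tfrac12, k\right) - E_y^n\left(i, j+\tfrac12, k\right)}{\Delta x} \right] \tag{5.3c}$$

$$E_x^{n+1}\left(i+\tfrac12, j, k\right) = \frac{\varepsilon_x - 0.5\Delta t\,\sigma_x}{\varepsilon_x + 0.5\Delta t\,\sigma_x} E_x^n\left(i+\tfrac12, j, k\right) + \frac{\Delta t}{\varepsilon_x + 0.5\Delta t\,\sigma_x} \left[ \frac{H_z^{n+1/2}\left(i+\tfrac12, j+\tfrac12, k\right) - H_z^{n+1/2}\left(i+\tfrac12, j-\tfrac12, k\right)}{\Delta y} - \frac{H_y^{n+1/2}\left(i+\tfrac12, j, k+\tfrac12\right) - H_y^{n+1/2}\left(i+\tfrac12, j, k-\tfrac12\right)}{\Delta z} \right] \tag{5.3d}$$

$$E_y^{n+1}\left(i, j+\tfrac12, k\right) = \frac{\varepsilon_y - 0.5\Delta t\,\sigma_y}{\varepsilon_y + 0.5\Delta t\,\sigma_y} E_y^n\left(i, j+\tfrac12, k\right) + \frac{\Delta t}{\varepsilon_y + 0.5\Delta t\,\sigma_y} \left[ \frac{H_x^{n+1/2}\left(i, j+\tfrac12, k+\tfrac12\right) - H_x^{n+1/2}\left(i, j+\tfrac12, k-\tfrac12\right)}{\Delta z} - \frac{H_z^{n+1/2}\left(i+\tfrac12, j+\tfrac12, k\right) - H_z^{n+1/2}\left(i-\tfrac12, j+\tfrac12, k\right)}{\Delta x} \right] \tag{5.3e}$$

$$E_z^{n+1}\left(i, j, k+\tfrac12\right) = \frac{\varepsilon_z - 0.5\Delta t\,\sigma_z}{\varepsilon_z + 0.5\Delta t\,\sigma_z} E_z^n\left(i, j, k+\tfrac12\right) + \frac{\Delta t}{\varepsilon_z + 0.5\Delta t\,\sigma_z} \left[ \frac{H_y^{n+1/2}\left(i+\tfrac12, j, k+\tfrac12\right) - H_y^{n+1/2}\left(i-\tfrac12, j, k+\tfrac12\right)}{\Delta x} - \frac{H_x^{n+1/2}\left(i, j+\tfrac12, k+\tfrac12\right) - H_x^{n+1/2}\left(i, j-\tfrac12, k+\tfrac12\right)}{\Delta y} \right] \tag{5.3f}$$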
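Before any vectorization, the update in (5.3a) can be written as a plain triple loop. The sketch below is a scalar illustration under simplifying assumptions: the medium is homogeneous (a single μ and magnetic conductivity), the z index is contiguous in memory, and the names Field and updateHx are hypothetical. Integer array indices stand for the half-integer positions of the Yee cell.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Minimal 3-D field container with the z index contiguous in memory.
struct Field {
    std::size_t nx, ny, nz;
    std::vector<float> v;
    Field(std::size_t nx_, std::size_t ny_, std::size_t nz_)
        : nx(nx_), ny(ny_), nz(nz_), v(nx_ * ny_ * nz_, 0.0f) {}
    float &operator()(std::size_t i, std::size_t j, std::size_t k) {
        return v[(i * ny + j) * nz + k];
    }
    float operator()(std::size_t i, std::size_t j, std::size_t k) const {
        return v[(i * ny + j) * nz + k];
    }
};

// Scalar form of the Hx update for a homogeneous medium.
void updateHx(Field &Hx, const Field &Ey, const Field &Ez,
              float mu, float sigmaM, float dt, float dy, float dz)
{
    assert(Hx.nx == Ey.nx && Hx.ny == Ey.ny && Hx.nz == Ey.nz);
    assert(Hx.nx == Ez.nx && Hx.ny == Ez.ny && Hx.nz == Ez.nz);
    const float c1 = (mu - 0.5f * dt * sigmaM) / (mu + 0.5f * dt * sigmaM);
    const float c2 = dt / (mu + 0.5f * dt * sigmaM);
    for (std::size_t i = 0; i < Hx.nx; ++i)
        for (std::size_t j = 0; j + 1 < Hx.ny; ++j)
            for (std::size_t k = 0; k + 1 < Hx.nz; ++k)
                Hx(i, j, k) = c1 * Hx(i, j, k)
                            + c2 * ((Ey(i, j, k + 1) - Ey(i, j, k)) / dz
                                  - (Ez(i, j + 1, k) - Ez(i, j, k)) / dy);
}
```

The innermost loop runs along z, so consecutive iterations read adjacent memory, which is the access pattern the later vectorized listings rely on.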
[Figure 5.16: material list m and reference array idx.]
Table 5.3
Memory Comparison Between Regular Array Allocation and Material List
Regular array allocation: 12 3-D floating arrays
Material list with reference array: One 3-D pointer array + one 2-D floating array
[Figure: E and H field components of Domain A and Domain B with an overlap region between the two domains.]
The Xeon Phi coprocessor is thread hungry. The 5110P model needs at least
120 threads to be fully loaded. Unlike on traditional CPUs, we perform
the multithreading over the x-y plane instead of in the x-direction
only, as shown in Figure 5.18 and demonstrated in Listing 5.5.
Listing 5.5 Code segment for the thread division in the x-y plane
int n = (nx + 1) * (ny + 1);
int nthreads, nsize;
#pragma omp parallel
{
#pragma omp single
{
nthreads = omp_get_num_threads();
// Ceiling division: one extra cell per thread when n is not a multiple of nthreads.
if (n % nthreads) nsize = n / nthreads + 1;
else nsize = n / nthreads;
}
int idthread = omp_get_thread_num();
int n1, n2;
n1 = idthread * nsize;
n2 = (idthread + 1) * nsize;
if (n1 > n) n1 = n; // threads beyond the last chunk get an empty range
if (n2 > n) n2 = n;
<< Call field update from n1 to n2 >>
}
Figure 5.18 Thread job assignments for the Xeon Phi coprocessor.
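The index arithmetic of the thread division can be isolated in a small function and checked without OpenMP. The sketch below is illustrative: threadRange is a hypothetical helper that returns the half-open range of x-y cells for one thread, using ceiling division for the chunk size.

```cpp
#include <cassert>
#include <utility>

// Returns the half-open range [n1, n2) of cells assigned to thread `id` when
// `n` cells are split among `nthreads` threads in equal-sized chunks
// (ceiling division, so the last chunks may be shorter or empty).
std::pair<int, int> threadRange(int n, int nthreads, int id)
{
    assert(n >= 0 && nthreads > 0 && id >= 0 && id < nthreads);
    int nsize = n / nthreads;
    if (n % nthreads) nsize++;          // round up when n is not a multiple
    int n1 = id * nsize;
    int n2 = (id + 1) * nsize;
    if (n1 > n) n1 = n;
    if (n2 > n) n2 = n;
    return std::make_pair(n1, n2);
}
```

Concatenating the ranges of all threads covers every cell exactly once; threads whose range starts beyond n simply receive an empty range.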
Each core of a Phi coprocessor can support up to four hardware threads. There is
no penalty for switching jobs among threads. However, all of the threads in
one core share the same L1 and L2 caches. If the data of each thread overwrites
the cache when threads switch, the cache hit ratio will be low. We can, however,
arrange the threads so that memory accesses are as local as possible to
improve the code performance. The job scheduling has the following four
strategies, which are controlled by the environment variable KMP_AFFINITY. For
example, if we have 61 cores (from 0 to 60) and 80 threads (jobs), the
scheduling strategies are described as follows [26]:
1. KMP_AFFINITY = compact
2. KMP_AFFINITY = scatter
80 threads: the first 61 threads are mapped one thread per core to cores 0, …,
60, and the last 19 threads are mapped one (more) thread per core to cores
0, …, 18, as shown in Figure 5.20.
This allows “one thread per core” studies for 1 to 61 threads.
Figure 5.19 Compact job scheduling for 80 jobs and 61 cores (4 or 0 threads per core).
Figure 5.20 Scatter job scheduling for 80 jobs and 61 cores (1 or 2 threads per core).
3. KMP_AFFINITY = balanced
4. KMP_AFFINITY = explicit
export KMP_AFFINITY='explicit,proclist=[0,1,2,3,4]'
But watch out for unexpected logical-to-physical processor mapping and
unexpected OpenMP thread-to-logical processor mapping.
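The compact and scatter placements for the example of 80 threads on 61 cores can be reproduced with a simplified model. The sketch below is not the algorithm of the OpenMP runtime; the function threadsPerCore and the Affinity enumeration are hypothetical and only mimic the thread counts per core given in the captions of Figures 5.19 and 5.20.

```cpp
#include <cassert>
#include <vector>

enum class Affinity { Compact, Scatter };

// Simplified placement model: returns how many threads land on each core.
std::vector<int> threadsPerCore(int nThreads, int nCores, int hwThreadsPerCore,
                                Affinity mode)
{
    assert(nCores > 0 && hwThreadsPerCore > 0);
    assert(nThreads >= 0 && nThreads <= nCores * hwThreadsPerCore);
    std::vector<int> load(nCores, 0);
    for (int t = 0; t < nThreads; ++t) {
        int core = (mode == Affinity::Compact)
                       ? t / hwThreadsPerCore   // fill one core before moving on
                       : t % nCores;            // round-robin over all cores
        load[core]++;
    }
    return load;
}
```

In this model, compact gives four threads to each of cores 0 to 19 and none to the rest, while scatter gives two threads to cores 0 to 18 and one thread to every other core.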
We will use the scatter mode as an example here (see Listing 5.6) and select the
number of threads to be twice the number of cores, n. Then threads i and (i+n)
will sit in the same core. Since the data is contiguous along the z-direction,
we can split the z-direction into two parts and let thread i work on the first
part and thread (i+n) work on the second one. This localizes the memory access
inside a core.
where the number of blocks = the number of cores, the parameter flag = 0 when the
thread index is i, and the parameter flag = 1 when the thread index is (i+n).
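The mapping from a thread index to its block, flag, and half of the z-range can be sketched as follows. This is an illustrative reading of the description above, not the chapter's listing: the structure HalfColumn and the function assignHalfColumn are hypothetical, and threads i and i + nCores are assumed to share core i under scatter affinity.

```cpp
#include <cassert>

struct HalfColumn {
    int block;   // index of the x-y block (one block per core)
    int flag;    // 0: first half of the z-range, 1: second half
    int k1, k2;  // half-open z-range [k1, k2) handled by this thread
};

// With 2 * nCores threads, thread t works on block t % nCores and on the
// half of the z-direction selected by flag = t / nCores.
HalfColumn assignHalfColumn(int threadId, int nCores, int nz)
{
    assert(nCores > 0 && nz >= 0);
    assert(threadId >= 0 && threadId < 2 * nCores);
    HalfColumn h;
    h.block = threadId % nCores;
    h.flag  = threadId / nCores;
    const int half = nz / 2;
    h.k1 = (h.flag == 0) ? 0 : half;
    h.k2 = (h.flag == 0) ? half : nz;
    return h;
}
```

For 60 cores and 64 cells along z, threads 5 and 65 both work on block 5, the first on cells 0 to 31 and the second on cells 32 to 63.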
The parallel processing of a code on the Xeon Phi coprocessor has four levels,
namely, the card level, core level, thread level, and vector unit level [18, 19],
as shown in Figure 5.21. The code can be assigned to different cards, cores,
threads, and vector units. A pseudo code segment is shown in Listing 5.7. We use
the FDTD method to demonstrate how to develop the parallel code on the Xeon Phi
coprocessor platform. If the storage of a 3-D array in memory is contiguous
along the z-direction, we divide the data into 60 blocks in the x-y plane, which
is equal to the number of cores in the Xeon Phi coprocessor, as shown in
Figure 5.22. We select one column in each block and assign it to two threads of
one core; all cores work in this way simultaneously, and each thread works on
one half-column of the selected data. The vector unit works on 16 adjacent data
values at the same time and generates 16 results in each cycle.
[Figure 5.21: levels of parallel processing: cluster, CPU and compute cores (OpenMP), threads, and vector unit (SSE/AVX/FMA/MIC).]
Figure 5.22 Job division and assignment for threads and cores inside the Xeon Phi coprocessor.
© ACES 2014 [19]
The Xeon Phi coprocessor supports the AVX-512 instruction set, namely, the
MIC instruction set. It can process 16 floating-point calculations (32
floating-point calculations with the FMA feature) in a single operation. It
requires the data to be aligned by 64 bytes. We apply the AVX-512 operation to
the data along the z-direction, so the data along the z-direction should be
aligned by 64 bytes. Zero-padding in the z-direction may be required to align an
array by 64 bytes. For example, suppose that an array D[*][*][0] represents one
of the electric or magnetic fields and that there are 12 grids along the
z-direction. It is not aligned by 64 bytes because each column along the
z-direction only includes 48 bytes, as shown in Figure 5.23. Although the array D
has been aligned when it is allocated, D[*][*][0] is not guaranteed to be
aligned. In order to align each column of the array D by 64 bytes, we need to
pad zeroes along the z-direction, say, 16 bytes (four floating-point numbers),
as shown in Figure 5.24. We will use 32 instead of 16 as the padding unit for
coding convenience since each core has two threads employed. The zero padding is
realized by using the following statement:
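[Figure 5.23: without padding, D[0][0] starts at byte 0 and D[0][1] at byte 48.]
[Figure 5.24: with zero-padding, D[0][0] starts at byte 0 and D[0][1] at byte 64.]
[Figure: data along the z-direction processed in groups of 16.]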
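The round-up needed for the zero-padding follows the same pattern as the SIMD_SIZE adjustment in Listing 5.1. The helper below is a sketch with a hypothetical name, paddedLength; the padding unit is passed in as a parameter rather than fixed.

```cpp
#include <cassert>
#include <cstddef>

// Rounds the number of cells along z up to the next multiple of `multiple`
// (16 floats fill one 64-byte vector; 32 is used when two threads share a
// column, as described in the text).
std::size_t paddedLength(std::size_t nz, std::size_t multiple)
{
    assert(multiple > 0);
    if (nz % multiple) nz = (nz / multiple + 1) * multiple;
    return nz;
}
```

For the example of 12 grids along z, the padded length is 16 cells with a unit of 16 and 32 cells with a unit of 32; the cells beyond the original 12 hold zeroes.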
Ideally, the update using the MIC instructions will be 16 times faster than a
regular floating point unit. A typical pseudo code segment is given in Listing 5.8.
The update of the electric fields, Ex, Ey, and Ez, on the Xeon Phi coprocessor
platform is demonstrated in Listing 5.9.
// Ey component
ssereg4 = _mm512_sub_ps(vHx[vk], ssereg2);
ssereg4 = _mm512_mul_ps(ssereg4, vrHdz[vk]);
ssereg3 = _mm512_sub_ps(vHz[vk], vHz10[vk]);
// Ez component
ssereg4 = _mm512_sub_ps(vHy[vk], vHy10[vk]);
ssereg4 = _mm512_mul_ps(ssereg4, vrHdx);
ssereg3 = _mm512_sub_ps(vHx[vk], vHx01[vk]);
ssereg3 = _mm512_fmsub_ps(ssereg3, vrHdy, ssereg4);
ssereg3 = _mm512_mul_ps(ssereg3, vCezh);
vEz[vk] = _mm512_fmsub_ps(vEz[vk], vCeze, ssereg3);
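For readers less familiar with the intrinsics, the Ez segment above can be mirrored one value at a time. The sketch below is a scalar counterpart written under an interpretation of the variable names (Hy10 as Hy at i-1, Hx01 as Hx at j-1, rHdx and rHdy as 1/Δx and 1/Δy), which the listing itself does not document; the function name updateEzScalar is hypothetical. It uses the fact that _mm512_fmsub_ps(a, b, c) computes a*b - c.

```cpp
#include <cassert>

// Scalar counterpart of the vectorized Ez update, following the listing
// line by line for a single field value.
float updateEzScalar(float Ez, float Hy, float Hy10, float Hx, float Hx01,
                     float rHdx, float rHdy, float Ceze, float Cezh)
{
    float t4 = (Hy - Hy10) * rHdx;          // sub, then mul
    float t3 = (Hx - Hx01) * rHdy - t4;     // sub, then fmsub: a*b - c
    t3 = t3 * Cezh;                         // mul
    return Ez * Ceze - t3;                  // fmsub: Ez*Ceze - t3
}
```

Expanding the expression gives Ceze·Ez + Cezh·[(Hy − Hy10)·rHdx − (Hx − Hx01)·rHdy], which has the same structure as (5.3f).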
In both FEM and MoM, solving matrix equations is the most time consuming part.
Now, we investigate how to use the Phi coprocessor to calculate the multiplication
of two matrices. We begin with the following equation:
$$C_{n\times l} = A_{n\times m} B_{m\times l}, \tag{5.4}$$
where Cn×l, An×m, and Bm×l are matrices and the subscripts indicate the number of
rows and columns of the corresponding matrices. If we define a 1-D array and map
a 1-D array to a 2-D array that is used to allocate memory for the matrices in (5.4),
one matrix with 5 × 3 elements can be mapped from a 1-D array with 15 elements,
as shown in Figure 5.26. The detailed mapping relationship from 1-D array to 2-D
array was discussed in Section 3.2.4. It is obvious from Figure 5.26 that the data of
the matrix C is contiguous along its column index l.
[Figure 5.26: 1-D array elements C[0], C[4], C[7], C[10], C[13] and matrix elements C00, C10, C20, C30, C40 along the n direction.]
If the matrices A and B are allocated in the same way, the elements in a
column of matrix B are not contiguous in memory; in turn, the multiplication of
the matrices A and B is very inefficient. It is well known that transposing B
speeds up the matrix multiplication. When we calculate the matrix
multiplication on the Phi coprocessor, we observe that the calculation for
smaller A and B is much faster than for larger A and B. This happens because the
elements of smaller A and B can be held in the cache, which increases the cache
hit ratio. The cache hit ratio becomes lower when matrices A and B become larger.
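The effect of transposing B can be sketched in plain C as follows. This is a minimal illustration under stated assumptions, not the listing used in this chapter: the arrays are assumed to be stored row by row in 1-D memory, and the function names transpose and matmul_bt are hypothetical. With B stored as its transpose, the innermost loop reads both operands with unit stride, which is the access pattern that the cache and the SIMD loads favor.

```c
#include <stddef.h>

/* Store B (m x l, row-major) as its transpose Bt (l x m, row-major). */
void transpose(const float *B, float *Bt, size_t m, size_t l)
{
    for (size_t k = 0; k < m; ++k)
        for (size_t j = 0; j < l; ++j)
            Bt[j * m + k] = B[k * l + j];
}

/* C (n x l) = A (n x m) * B (m x l), with B supplied as Bt (l x m).
   Row i of A and row j of Bt are both contiguous in memory. */
void matmul_bt(const float *A, const float *Bt, float *C,
               size_t n, size_t m, size_t l)
{
    for (size_t i = 0; i < n; ++i) {
        const float *a = A + i * m;          /* row i of A */
        for (size_t j = 0; j < l; ++j) {
            const float *b = Bt + j * m;     /* column j of B */
            float sum = 0.0f;
            for (size_t k = 0; k < m; ++k)
                sum += a[k] * b[k];          /* unit-stride dot product */
            C[i * l + j] = sum;
        }
    }
}
```

The transpose costs one extra pass over B, which is amortized over the n × l dot products that follow.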
We use the following strategy to improve the compute performance. Each
element in matrix C can be calculated in the way described in Figure 5.27.
[Figure 5.27: C = A × B.]
Each core of the Xeon Phi coprocessor has a 512 KB, 8-way L2 cache. The
critical stride is therefore 512/8 = 64 KB. If the column size is equal to 64 KB,
that is, 16,384 single-precision floating-point numbers, we append a zero-padding
area of one cache line to each column:
if (m % 16) m = (m / 16 + 1) * 16;
if (m % 16384 == 0) m += 16;
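The two statements above can be wrapped in a small helper to make the arithmetic explicit. This is only a sketch: the function name padded_length is hypothetical, and it assumes 4-byte floats, so that 16 elements fill one 64-byte cache line and 16,384 elements span the 64 KB critical stride.

```c
#include <stddef.h>

/* Hypothetical helper: padded column length, counted in floats. */
size_t padded_length(size_t m)
{
    if (m % 16)                 /* round up to a whole 64-byte cache line */
        m = (m / 16 + 1) * 16;
    if (m % 16384 == 0)         /* a multiple of the 64 KB critical stride */
        m += 16;                /* shift by one cache line */
    return m;
}
```

For example, a column of 100 floats is padded to 112, while a column of exactly 16,384 floats becomes 16,400, so that consecutive columns no longer map to the same cache set.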
that the previously used data in each block can survive during the whole block
calculations.
[Figure: matrix A, matrix B, and the transpose of B.]
Figure 5.29 Break the matrix into smaller blocks for the better code performance.
SIMD instruction, and the constant nBlockSizeBySIMD is 4 in this case for the
better code performance. The code for a block matrix operation is shown in Listing
5.10.
}
}
}
The nonblock matrix code applies OpenMP along the row direction of C and
AVX-512 to the calculation of each element of C. The block matrix code applies
OpenMP to the set of blocks and AVX-512 to the calculation of each block element.
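The block strategy can be sketched in scalar C as follows. This is a minimal illustration, not the code of Listing 5.10: the tile edge BLOCK and the function name matmul_blocked are hypothetical, B is assumed to be supplied already transposed, and the OpenMP and SIMD parts described above are left out so that the loop structure stays visible.

```c
#include <stddef.h>

#define BLOCK 4  /* hypothetical tile edge; chosen so that the tiles fit in cache */

static size_t min_size(size_t a, size_t b) { return a < b ? a : b; }

/* C (n x l) = A (n x m) * B (m x l); B is passed as its transpose Bt (l x m).
   All arrays are stored row by row. */
void matmul_blocked(const float *A, const float *Bt, float *C,
                    size_t n, size_t m, size_t l)
{
    for (size_t i = 0; i < n * l; ++i)
        C[i] = 0.0f;

    for (size_t i0 = 0; i0 < n; i0 += BLOCK) {
        size_t i1 = min_size(i0 + BLOCK, n);
        for (size_t j0 = 0; j0 < l; j0 += BLOCK) {
            size_t j1 = min_size(j0 + BLOCK, l);
            for (size_t k0 = 0; k0 < m; k0 += BLOCK) {
                size_t k1 = min_size(k0 + BLOCK, m);
                /* Accumulate the contribution of one pair of tiles. */
                for (size_t i = i0; i < i1; ++i)
                    for (size_t j = j0; j < j1; ++j) {
                        float sum = 0.0f;
                        for (size_t k = k0; k < k1; ++k)
                            sum += A[i * m + k] * Bt[j * m + k];
                        C[i * l + j] += sum;
                    }
            }
        }
    }
}
```

Each tile of A and Bt is reused for a whole tile of C before it is evicted, which is the property the block scheme relies on. In a parallel version, the two outer loops over tiles of C would be the natural place for the OpenMP work sharing.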
For two matrices A and B with 2,048 × 2,048 elements, the matrix
multiplication without the domain decomposition technique takes 0.63 seconds on
the Intel Xeon E5-2640 v2 Ivy Bridge CPU, whereas it takes 0.4 seconds with the
domain decomposition technique on the same CPU. With the domain decomposition
technique on the Xeon Phi coprocessor 5110P, it takes 0.15 seconds, as shown in
Figure 5.30.
[Figure 5.30: matrix multiplication time: 0.63 s on the Intel Xeon E5-2640 v2 (Scheme A), 0.4 s on the Intel Xeon E5-2640 v2 (Scheme B), and 0.15 s on the Intel Xeon Phi 5110P (Scheme B).]
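As a quick reading of these timings (simple arithmetic on the numbers quoted above, not an additional measurement), the corresponding speedups are:

```latex
\frac{0.63\ \mathrm{s}}{0.40\ \mathrm{s}} \approx 1.6, \qquad
\frac{0.40\ \mathrm{s}}{0.15\ \mathrm{s}} \approx 2.7, \qquad
\frac{0.63\ \mathrm{s}}{0.15\ \mathrm{s}} = 4.2
```

That is, the domain decomposition alone gives about 1.6 times on the CPU, and running the same scheme on the coprocessor gives a further factor of about 2.7.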
In this section, we use the parallel FDTD code based on the Xeon Phi coprocessor
to demonstrate the performance of the Phi coprocessor [18, 19]. The host computer
includes two Intel Xeon E5-2640 v2 CPUs with 32 GB of DDR3 RAM, and one
5110P Xeon Phi coprocessor with 8 GB of GDDR5 RAM is mounted on the host
through a PCI Express x16 slot. We use a typical example, an empty box truncated
by the PEC boundary, as the benchmark. The first problem we test has a size of
1.29 GB, for which a single Phi card achieves 1,200 MCPS, as defined in (3.4). The
performance increases to 1,350 MCPS when the problem size increases to 7.2 GB,
as shown in Figure 5.31. For the sake of comparison, we plot the performance of
the Xeon E5-2640 v2 (8-core 2.0 GHz CPU) in the same figure. We also demonstrate
the performance of the Phi coprocessor for smaller problems, such as 200,000
cells, as shown in Figure 5.32.
[Plot: performance (Mcells/sec) versus problem size (Mcells, from 27 to 150) for the Phi coprocessor and the Xeon E5-2640 v2; the marked memory sizes are 1.29 GB, 3.07 GB, and 7.2 GB.]
Figure 5.31 Performance of the 5110P Xeon Phi coprocessor for the parallel FDTD code for regular
size of problems. © ACES 2014 [19]
[Plot: performance (Mcells/sec) versus problem size (Mcells, from 0.2 to 1.84); the marked memory sizes are 0.0096 GB, 0.0394 GB, and 0.0883 GB.]
Figure 5.32 Performance of the 5110P Xeon Phi coprocessor for the parallel FDTD code for smaller
size of problems. © ACES 2014 [19]
We next use the code based on the Phi coprocessor to simulate a time domain
reflectometer (TDR) problem. A discontinuous structure is fed by a pair of parallel
plates filled with dielectric material. We investigate how to obtain an accurate
TDR using the FDTD method [30–35]. If the incident pulse has a narrow Gaussian
shape, the TDR is expressed as:

$$\mathrm{TDR} = \frac{\int_0^{T_0} f(t)\,dt - \int_0^{t_0} g(t)\,dt}{\int_0^{t_0} g(t)\,dt} \tag{5.5}$$

where t0 is width of Gaussian pulse, T0 is simulation time, and g(t) and f(t) are the
incident Gaussian pulse and reflected signal, respectively. The corresponding time
domain impedance is defined as [30]:

$$Z = Z_0\,\frac{1 + \mathrm{TDR}}{1 - \mathrm{TDR}} \tag{5.6}$$

$$V = \frac{V_1 + V_2 + V_3}{3} \tag{5.7}$$

where V1, V2, and V3 are the time domain voltages measured at three different
locations, as shown in Figure 5.33(b). The TDR parameter can be calculated using
the arithmetic mean voltage. Now, we define a new parameter, namely, geometric
mean voltage [34]:

$$V = \sqrt[3]{V_1 V_2 V_3} \tag{5.8}$$

The voltages V1, V2, and V3 in (5.7) and (5.8) measured at the center and two
edges might be different due to the finite microstrip in the feed port. Both (5.7) and
(5.8) are two ways to calculate the average of three output variables measured
across the microstrip. The arithmetic mean in (5.7) stands for the average of three
time domain signals measured across the microstrip, and the geometric mean in
(5.8) indicates the central tendency of three measured time domain signals. When
the observation points are located away from the excitation source, the arithmetic
and geometric means generate the same results.
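A small worked example may help to fix the definitions in (5.6) to (5.8). The numbers below are purely illustrative and are not taken from the simulations; in particular, the reference impedance of 50 Ω is an assumption, and (5.6) is read in its usual reflection-coefficient form.

```latex
V_1 = 1,\ V_2 = 2,\ V_3 = 4: \qquad
\frac{V_1 + V_2 + V_3}{3} = \frac{7}{3} \approx 2.33, \qquad
\sqrt[3]{V_1 V_2 V_3} = \sqrt[3]{8} = 2

\mathrm{TDR} = 0.2,\ Z_0 = 50\ \Omega: \qquad
Z = Z_0\,\frac{1 + \mathrm{TDR}}{1 - \mathrm{TDR}} = 50 \times \frac{1.2}{0.8} = 75\ \Omega
```

When the three voltages are equal, both means reduce to the common value, which is consistent with the two means agreeing at observation points far from the excitation.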
[Figure 5.33 panel labels: substrate, excitation, and observation voltages.]
Figure 5.33 Port configuration with excitation and output voltage sampling: (a) one output sampled
at the central stripline; and (b) three outputs sampled at the center and two edges. ©2014
IEEE [30].
more accurate than the regular method and arithmetic mean that is also based on
three observation points. The numerical experiment in Figure 5.34 also shows that
the arithmetic mean does not improve the TDR calculation. The curve “Regular” in
Figure 5.34 indicates the result is calculated by using a single point output. Due to
the numerical dispersion, the incident signal measured in the simulation may not
be the same as the ideal Gaussian pulse. Therefore, the truncation of incident pulse
will affect the result slightly. Here, we set the truncation criterion to 0.1% of the
peak value of the incident pulse.
Figure 5.34 Time domain impedance of the uniform microstrip structure using different algorithms.
©2014 IEEE [30].
Figure 5.35 A nonuniform coaxial microwave connector fed by using a pair of PEC plates.
Figure 5.36 Time-domain impedance of the discontinuous coaxial connector using the different
approaches. ©2014 IEEE [30].
REFERENCES
[1] http://en.wikipedia.org/wiki/Streaming_SIMD_Extensions.
[2] http://en.wikipedia.org/wiki/AVX.
[3] W. Yu, X. Yang and W. Li, VALU, AVX, GPU Acceleration Techniques for Parallel Finite
Difference Time Domain Methods, Raleigh, NC: SciTech Publisher Inc., 2013.
[4] http://www.amd.com.
[5] http://www.intel.com.
[6] http://www.intel.com/content/www/us/en/io/quickpath-technology/quickpath-technology-general.html.
[7] http://sites.amd.com/us/documents/48101a_opteron%20_6000_qrg_rd2.pdf.
[8] http://softpixel.com/~cwright/programming/simd/sse.php.
[9] http://neilkemp.us/src/sse_tutorial/sse_tutorial.html.
[10] https://developer.apple.com/hardwaredrivers/ve/sse.html.
[11] http://en.wikipedia.org/wiki/Advanced_Vector_Extensions.
[12] http://software.intel.com/en-us/avx.
[13] http://devgurus.amd.com/thread/159669.
[14] http://lomont.org/Math/Papers/2011/Intro%20to%20Intel%20AVX-Final.pdf.
[15] https://software.intel.com/en-us/mic-developer#pid-11757-1231.
[16] Intel® Xeon Phi™ Coprocessor: System Software Developers Guide,
http://www.intel.com/content/www/us/en/processors/xeon/xeon-phi-coprocessor-system-software-developers-guide.html.
[17] http://en.wikipedia.org/wiki/Xeon_Phi.
[18] X. Yang and W. Yu, “Phi Coprocessor Acceleration Techniques for Finite Difference Time
Domain Methods,” IEEE International Symposium on Antennas and Propagation and USNC-
URSI Radio Science Meeting, Memphis TN, July 2014.
[19] X. Yang and W. Yu, “Phi Coprocessor Acceleration Techniques for Computational
Electromagnetic Methods,” Applied Computational Electromagnetics Society Journal, Vol. 29,
No. 12, pp. 1013-1017, 2014.
[20] W. Yu, et al., Parallel Finite Difference Time Domain Method, Norwood, MA: Artech House,
2006.
[21] A. Taflove and S. Hagness, Computational Electromagnetics: The Finite-Difference Time-
Domain Method, 3rd ed., Norwood, MA: Artech House, 2005.
[22] A. Elsherbeni and V. Demir, The Finite Difference Time Domain Method for Electromagnetics:
With MATLAB Simulations, Raleigh, NC: SciTech Publisher Inc., 2009.
[23] J. Jin, The Finite Element Method in Electromagnetics, (2nd ed.) New York: John Wiley & Sons,
2002.
[24] A. Peterson and R. Mittra, Computational Methods for Electromagnetics, New York: Wiley-
IEEE Press, 1997.
[30] W. Yu, X. Yang, and H. Fluhler, “Accurate Calculation Technique for Time Domain Impedance
In FDTD Method,” IEEE International Symposium on Antennas and Propagation and USNC-
URSI Radio Science Meeting, Memphis, TN, July 2014.
[31] TDR Impedance Measurements, A Foundation for Signal Integrity, Tektronix application note,
2008.
[32] and M. Diamond, “Feasibility of Reflectometry for Nondestructive
Evaluation of Prestressed Concrete Anchors,” IEEE Sensors Journal, Vol. 9, No. 11, pp.
1322–1329, 2009.
[33] P. Smith, C. Furse, and J. Gunther, “Analysis of Spread Spectrum Time Domain Reflectometry
for Wire Fault Location,” IEEE Sensors Journal, Vol. 5, No. 6, pp. 1469–1478, 2005.
[34] C. Furse, C. Smith, and M. Chet, “Feasibility of Spread Spectrum Sensors for Location of Arcs
on Live Wires,” IEEE Sensors Journal, Vol. 5, No. 6, pp. 1445–1450, 2005.
[35] J. Schneider, Understanding the Finite-Difference Time-Domain Method, Lecture notes, 2013,
Washington State University.
Chapter 6
Domain Decomposition Methods for Finite
Element Analysis of Large-Scale
Electromagnetic Problems
Ming-Feng Xue and Jian-Ming Jin
In this chapter, we focus on the developments that expand the capability and
improve the performance of the existing FETI-DP methods by: (1) lifting the
requirement of conformal meshes on the subdomain interface; (2) speeding up the
convergence rate of the iterative solution of the global interface problem; and (3)
incorporating appropriate truncation boundaries for more accurate simulations.
First, we present the formulations of the finite element tearing and interconnecting
(FETI) and FETI-DP methods based on one and two Lagrange multipliers for
domain decomposition with conformal interface meshes. We then formulate two
nonconformal FETI-DP methods, both of which implement the Robin-type
transmission condition at the subdomain interfaces. One nonconformal method
extends the conformal FETI-DP algorithm that is based on two Lagrange
multipliers to deal with nonconformal interface and corner meshes, whereas the
other employs cement elements on the subdomain interface, combines the global
primal unknowns with the global dual unknowns, and extracts the corner
unknowns to formulate a global coarse problem [14, 15]. Next, we discuss the
implementation of higher-order transmission conditions in the FETI-DP method
with two Lagrange multipliers for a faster convergence of the iterative solution of
the global interface system [16]. These higher-order transmission conditions can
transmit both transverse-electric (TE) and transverse-magnetic (TM) evanescent
modes in addition to the propagating modes [17–25]. They are critical for
obtaining a converged result in the case when perfectly matched layers (PMLs) are
used for mesh truncation. After that, we describe a hybrid scheme to handle multi-
region electromagnetic problems, whose computational domain consists of several
regions that can be meshed independently. In this scheme, the FETI method is
employed to deal with mesh-nonconformal and/or geometry-nonconformal
interfaces between different regions and the FETI-DP method is used for mesh-
conformal and geometry-conformal interfaces inside each region. A unified global
system of equations is then formulated for the interface unknowns from both
conformal and non-conformal interfaces [26, 27]. Higher-order transmission
conditions and a generalized cross-point correction technique are applied to
improve the convergence and ensure a correct interconnection across subdomain
interfaces [27]. Finally, we present several numerical results for the simulation of
wave propagation, finite antenna arrays, radomes, and optical devices to
demonstrate the accuracy, efficiency, capability, and applications of these
algorithms. For simulating large finite antenna arrays, we present an oblique
absorbing boundary condition and apply it to the FETI-DP method [28]. This
boundary condition can be tuned to become reflectionless for all frequencies and
polarizations for the main beam of the radiated wave.
We start our discussion with the review of the FETI method, for two different
versions that are equipped with one Lagrange multiplier (1LM) and two Lagrange
multipliers (2LM), respectively. The FETI method is very effective for some
special domain decompositions, for example, the one-way and onion-like domain
decompositions [29–31]. We illustrate how these two versions enforce the
continuity condition on the tangential electric and magnetic fields across the
subdomain interface.
where $\Gamma^s$ denotes the interface between the sth subdomain and its neighboring
subdomains, $\boldsymbol{\Lambda}_b$ is an unknown variable defined on the subdomain interface, and
$\mathbf{J}_{imp}^s$ is an impressed current. For the portion of the subdomain boundary $S^s$
coinciding with the exterior surface of the entire computational domain $S_0$, we can
use an absorbing boundary condition (ABC), a PML, or a boundary
integral equation (BIE) to deal with the field there. In the following, we will omit
this boundary term in order to focus on the treatment of the subdomain interfaces.
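[Figure 6.1 labels: domain V divided into subdomains V1, V2, and V3 with unit normals n̂(1), n̂(2), and n̂(3), interfaces 12 and 23, and the Lagrange multiplier Λb.]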
Figure 6.1 A computational domain is divided into three nonoverlapping subdomains. An unknown
Neumann boundary condition is assumed on each subdomain interface with the aid of a
global Lagrange multiplier.
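After expanding the vector electric field using vector basis functions such that
$\mathbf{E}^s = \{\mathbf{N}^s\}^T\{E^s\}$ and applying Galerkin’s method, we obtain the subdomain matrix
equation partitioned as

$$\begin{bmatrix} K_{ii}^s & K_{ib}^s \\ K_{bi}^s & K_{bb}^s \end{bmatrix}
\begin{Bmatrix} E_i^s \\ E_b^s \end{Bmatrix}
= \begin{Bmatrix} f_i^s \\ f_b^s \end{Bmatrix} - \begin{Bmatrix} 0 \\ \lambda_b^s \end{Bmatrix}
= \begin{Bmatrix} f_i^s \\ f_b^s \end{Bmatrix} - \begin{Bmatrix} 0 \\ B_b^s \lambda_b \end{Bmatrix} \tag{6.3}$$

where

$$[K_{uv}^s] = \iiint_{V^s} \left[ (\nabla\times\{\mathbf{N}_u^s\}) \cdot \mu_r^{-1} (\nabla\times\{\mathbf{N}_v^s\}^T) - k_0^2 \{\mathbf{N}_u^s\} \cdot \epsilon_r \{\mathbf{N}_v^s\}^T \right] dV \quad (u, v = i, b)$$

$$\{\lambda_b^s\} = \iint_{\Gamma^s} \{\mathbf{N}_b^s\} \cdot \boldsymbol{\Lambda}_b \, dS$$
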
In (6.3), $[K^s]$ is the FEM matrix, $\{E^s\}$ denotes the unknown discrete electric field,
and $\{f^s\}$ denotes the excitation vector contributed by the source. By using the
subscripts i and b, each vector is partitioned into two parts, which are associated
with the interior and interface of the subdomain, respectively. Also, $[B_b^s]$ is a
Boolean matrix introduced to extract the dual unknown $\{\lambda_b^s\}$ on the sth subdomain
interface from the global interface dual unknown vector $\{\lambda_b\}$, such that
$\{\lambda_b^s\} = [B_b^s]\{\lambda_b\}$. Because two neighboring subdomains share the same set of
Lagrange multipliers, this version is usually referred to as the FETI method with
one Lagrange multiplier. Note that $[B_b^s]$ is a signed Boolean matrix that contains
only 0, 1, and –1, because the tangential magnetic field continuity condition (the
Neumann boundary condition) contains opposite signs on the two sides of an
interface as
where another Boolean matrix $[R_b^s]$ is introduced to extract the interface field
$\{E_b^s\}$ out of $\{E^s\}$ such that $\{E_b^s\} = [R_b^s]\{E^s\}$. Normally, we can eliminate the
interior unknown $\{E_i^s\}$ and express the boundary unknown $\{E_b^s\}$ in terms of the
To couple the fields across all the subdomains, we use the fact that at the
interface between two subdomains, the electric field satisfies the tangential
continuity condition
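
$$\hat{n}^s \times \hat{n}^s \times \mathbf{E}_b^s = \hat{n}^q \times \hat{n}^q \times \mathbf{E}_b^q \tag{6.7}$$

This Dirichlet continuity condition can be enforced by assembling (6.6) over all
the subdomains and setting it to zero, which yields

$$\sum_{s=1}^{N_s} [B_b^s]^T \{E_b^s\} = \sum_{s=1}^{N_s} [B_b^s]^T [R_b^s] [K^s]^{-1} \left( \{f^s\} - [R_b^s]^T [B_b^s] \{\lambda_b\} \right) = 0 \tag{6.8}$$
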
This enforcement is valid only when the interface meshes are conformal. It
requires the tangential electric fields on the two sides of an interface to be equal to
each other in an unknown-by-unknown manner. It should be noted that the +1 and
–1 entries in $[B_b^s]$ play an important role in making (6.8) concise. After the global
assembly over all subdomains, we have

$$[\tilde{K}_{bb}]\{\lambda_b\} = \{\tilde{f}_b\} \tag{6.9}$$

where

$$[\tilde{K}_{bb}] = \sum_{s=1}^{N_s} [B_b^s]^T [R_b^s] [K^s]^{-1} [R_b^s]^T [B_b^s]$$

$$\{\tilde{f}_b\} = \sum_{s=1}^{N_s} [B_b^s]^T [R_b^s] [K^s]^{-1} \{f^s\}$$

Once $\{\lambda_b\}$ is obtained by solving (6.9), the electric field in every subdomain can
be calculated using (6.6).
This formulation works well if each subdomain is either lossy or lossless but
with a size small enough so that the subdomain FEM matrix $[K^s]$ is never singular.
Otherwise, the iterative solution of (6.9) will not converge rapidly because of the
ill-conditioned $[\tilde{K}_{bb}]$. This problem can be alleviated with the next formulation.
To derive the formulation of the second version, we still consider the partial
differential equation in (6.1) but with the Robin boundary condition for the sth
subdomain [9]
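
$$\hat{n}^s \times (\mu_r^{-1} \nabla \times \mathbf{E}^s) + \alpha^s \hat{n}^s \times (\hat{n}^s \times \mathbf{E}^s) = \boldsymbol{\Lambda}_b^s \quad \text{on } \Gamma^s \tag{6.10}$$

where

$$\{\lambda_b^s\} = \iint_{\Gamma^s} \{\mathbf{N}_b^s\} \cdot \boldsymbol{\Lambda}_b^s \, dS$$

$\{\lambda_b\} = (\{\lambda_b^1\}, \ldots, \{\lambda_b^{N_s}\})^T$ such that $\{\lambda_b^s\} = [Q^s]\{\lambda_b\}$. Different from $[B_b^s]$, $[Q^s]$
is unsigned. Similar to (6.6), the subdomain matrix equation for the boundary
unknown $\{E_b^s\}$ in terms of the dual unknown $\{\lambda_b^s\}$ can be written as
where $[R_b^s]$ is the same as that defined in Section 6.1.1 but $[K^s]$ now contains the
surface mass matrix $[M_{bb}^s]$.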
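At the interface $\Gamma_{sq}$, we add the two Robin boundary conditions on the two
sides of the interface such that
This equation holds true because $\hat{n}^s \times (\mu_r^{-1} \nabla \times \mathbf{E}^s) = -\hat{n}^q \times (\mu_r^{-1} \nabla \times \mathbf{E}^q)$, which is
the continuity condition for the tangential magnetic field. Further, because of the
continuity condition for the tangential electric field $\hat{n}^s \times (\hat{n}^s \times \mathbf{E}_b^s) = \hat{n}^q \times (\hat{n}^q \times \mathbf{E}_b^q)$,
we obtain the following transmission conditions

$$\begin{aligned}
\boldsymbol{\Lambda}_b^s + \boldsymbol{\Lambda}_b^q &= (\alpha^s + \alpha^q)\, \hat{n}^q \times (\hat{n}^q \times \mathbf{E}_b^q) \\
\boldsymbol{\Lambda}_b^s + \boldsymbol{\Lambda}_b^q &= (\alpha^s + \alpha^q)\, \hat{n}^s \times (\hat{n}^s \times \mathbf{E}_b^s)
\end{aligned} \quad \text{on } \Gamma_{sq} \tag{6.14}$$
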
Note that $[M_{bb}^{sq}]$ is a projection matrix mapping the interface electric field $\{E_b^q\}$
in the qth subdomain to the dual unknown $\{\lambda_b^s\}$ in the sth subdomain. Equation
(6.15) can further be rewritten as

$$\{\lambda_b^s\}_q = -[T_s^q]\{\lambda_b^q\} - [M_{bb}^{sq}][T_s^q]\{E_b^q\} \tag{6.16}$$

where $[T_q^s]$ is a Boolean matrix employed to extract the interface unknowns
defined on $\Gamma_{sq}$ from those defined on $\Gamma^s$ such that $\{E_b^s\}_q = [T_q^s]\{E_b^s\}$ and
$\{\lambda_b^s\}_q = [T_q^s]\{\lambda_b^s\}$. Equation (6.16) can be reduced by eliminating $\{E_b^q\}$ using
(6.12) and the result is
We can then reassemble (6.17) over all s and q to obtain an interface system
for all the subdomains as
Once $\{\lambda_b\}$ is computed by solving (6.18), the electric field in every subdomain
can be calculated using (6.12).
From Sections 6.1.1 and 6.1.2, we can find many similarities between the FETI
method with 1LM and the FETI method with 2LM. First of all, both versions need
to eliminate the interior electric field unknown and express the boundary electric
field unknown in terms of the dual unknown, which can be written symbolically as

$$\{E_b^s\} = f(\{\lambda_b\}, \{f^s\}) \tag{6.19}$$

This step is usually performed by a direct solver. Second, both versions need to
solve a global interface problem after assembling over all the subdomains, which
can be written symbolically as

$$F(\{\lambda_b\}, \{f\}) = 0 \tag{6.20}$$

This final interface equation can be solved iteratively using an iterative solver, for
example, Krylov subspace methods such as the generalized minimum residual
(GMRES) method and the stabilized biconjugate gradient (BiCGStab) method [8].
As we can see, the FETI method hybridizes direct and iterative solvers in a two-
level manner during the solution procedure. The global interface problem of the
FETI method with 1LM is generally indefinite and a Dirichlet preconditioner is
required for a fast convergence [3, 4]. In contrast, the global interface problem of
the FETI method with 2LM is positive-definite and all the eigenvalues are located
within the unit circle centered at (1, 0) on the complex plane.
When we deal with more general cases, for example, the checkerboard domain
decompositions [3, 8], it is inevitable to encounter geometrical crosspoints, which
are the interfaces shared by more than two subdomains, as shown in Figure 6.3.
These geometrical crosspoints are also called corners.
[Figure 6.3 labels: subdomains 1 to 4 with unit normals n̂(1) to n̂(4) and interfaces 12, 13, 24, and 34.]
Figure 6.3 A computational domain is divided into four nonoverlapping subdomains. A
geometrical crosspoint shared by more than two subdomains occurs in this
decomposition.
The FETI method, no matter with 1LM or 2LM, encounters difficulties when
dealing with the continuity condition at a corner. First of all, there are four electric
field unknowns $\{E_c^{(1)}\}$, $\{E_c^{(2)}\}$, $\{E_c^{(3)}\}$, and $\{E_c^{(4)}\}$ defined at the corner in Figure
6.3, but only three independent equations can be obtained from the tangential
electric field continuity condition $\{E_c^s\} = \{E_c^q\}$ ($s, q = 1, 2, 3, 4$ and $s \neq q$). Second,
because the tangential magnetic fields at the corner are interrelated, $\{\lambda_c^{(1)}\}$, $\{\lambda_c^{(2)}\}$,
$\{\lambda_c^{(3)}\}$, and $\{\lambda_c^{(4)}\}$ are not independent at the corner. From these two points, we
can see the inherent redundancy associated with the FETI method in this general
case. To remove the redundancy and the resultant singularity, Farhat and his
colleagues [5–7] proposed a dual-primal (DP) strategy to extract $\{E_c^s\}$ and give
them a unique global index to construct a global corner system. This idea works
perfectly for the 1LM version because the $\{\lambda_c^s\}$ associated with corner edges are
cancelled out during the global assembly process due to the associated Neumann
boundary condition. However, for the 2LM version, this is not the case because of
the Robin boundary condition. As a remedy, we change the Robin boundary
condition into the Neumann boundary condition at the corners for the 2LM version.
As such, the resultant two improved FETI methods are usually referred to as the
dual-primal finite element tearing and interconnecting (FETI-DP) method with
1LM and 2LM, respectively.
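The redundancy at a corner shared by four subdomains can be made explicit with a short count. The block below is an illustrative sketch in the notation of this section; the particular choice of independent equations is arbitrary.

```latex
% Six pairwise continuity conditions can be written for four corner unknowns:
E_c^{(1)} = E_c^{(2)},\quad E_c^{(2)} = E_c^{(3)},\quad E_c^{(3)} = E_c^{(4)},\quad
E_c^{(1)} = E_c^{(3)},\quad E_c^{(1)} = E_c^{(4)},\quad E_c^{(2)} = E_c^{(4)}

% The first three already imply the other three, for example
E_c^{(1)} = E_c^{(2)} \ \text{and}\ E_c^{(2)} = E_c^{(3)} \;\Rightarrow\; E_c^{(1)} = E_c^{(3)}

% so only 4 - 1 = 3 conditions are independent, and the four local unknowns
% collapse to a single global one:
E_c^{(1)} = E_c^{(2)} = E_c^{(3)} = E_c^{(4)} = E_c
```

Assigning one global index to the corner unknown, as the dual-primal strategy does, enforces these conditions by construction instead of through redundant constraint equations.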
Again, we consider the boundary-value problem (BVP) defined by (6.1) and (6.2).
On the discretization level, the subdomain matrix equation is partitioned as
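
$$\begin{bmatrix} K_{ii}^s & K_{ib}^s & K_{ic}^s \\ K_{bi}^s & K_{bb}^s & K_{bc}^s \\ K_{ci}^s & K_{cb}^s & K_{cc}^s \end{bmatrix}
\begin{Bmatrix} E_i^s \\ E_b^s \\ E_c^s \end{Bmatrix}
= \begin{Bmatrix} f_i^s \\ f_b^s \\ f_c^s \end{Bmatrix} - \begin{Bmatrix} 0 \\ \lambda_b^s \\ \lambda_c^s \end{Bmatrix}
= \begin{Bmatrix} f_i^s \\ f_b^s \\ f_c^s \end{Bmatrix} - \begin{Bmatrix} 0 \\ B_b^s \lambda_b \\ \lambda_c^s \end{Bmatrix} \tag{6.21}$$

where

$$[K_{uv}^s] = \iiint_{V^s} \left[ (\nabla\times\{\mathbf{N}_u^s\}) \cdot \mu_r^{-1} (\nabla\times\{\mathbf{N}_v^s\}^T) - k_0^2 \{\mathbf{N}_u^s\} \cdot \epsilon_r \{\mathbf{N}_v^s\}^T \right] dV \quad (u, v = i, b, c)$$

$$\{\lambda_b^s\} = \iint_{\Gamma^s} \{\mathbf{N}_b^s\} \cdot \boldsymbol{\Lambda}_b \, dS$$

$$\begin{bmatrix} K_{rr}^s & K_{rc}^s \\ K_{cr}^s & K_{cc}^s \end{bmatrix}
\begin{Bmatrix} E_r^s \\ E_c^s \end{Bmatrix}
= \begin{Bmatrix} f_r^s \\ f_c^s \end{Bmatrix} - \begin{Bmatrix} [R_{br}^s]^T B_b^s \lambda_b \\ \lambda_c^s \end{Bmatrix} \tag{6.22}$$
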
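where $[R_{br}^s]$ is a Boolean matrix to extract the interface unknowns $\{E_b^s\}$ out of the
remaining unknowns $\{E_r^s\}$ such that $\{E_b^s\} = [R_{br}^s]\{E_r^s\}$. Other matrices and vectors
are defined as

$$[K_{rr}^s] = \begin{bmatrix} K_{ii}^s & K_{ib}^s \\ K_{bi}^s & K_{bb}^s \end{bmatrix}, \quad
[K_{rc}^s] = \begin{bmatrix} K_{ic}^s \\ K_{bc}^s \end{bmatrix}, \quad
[K_{cr}^s] = \begin{bmatrix} K_{ci}^s & K_{cb}^s \end{bmatrix}, \quad
\{f_r^s\} = \begin{Bmatrix} f_i^s \\ f_b^s \end{Bmatrix} \tag{6.23}$$

As we separate the corner unknowns $\{E_c^s\}$ from the noncorner interface unknowns
$\{E_b^s\}$, we can obtain two equations for the sth subdomain after eliminating the
interior unknowns $\{E_i^s\}$ as

$$\{E_b^s\} = [R_{br}^s][K_{rr}^s]^{-1} \left( \{f_r^s\} - [R_{br}^s]^T \{\lambda_b^s\} - [K_{rc}^s][B_c^s]\{E_c\} \right) \tag{6.24}$$

$$\left( [K_{cc}^s] - [K_{cr}^s][K_{rr}^s]^{-1}[K_{rc}^s] \right) [B_c^s]\{E_c\}
= \{f_c^s\} - \{\lambda_c^s\} - [K_{cr}^s][K_{rr}^s]^{-1}\{f_r^s\} + [K_{cr}^s][K_{rr}^s]^{-1}[R_{br}^s]^T[B_b^s]\{\lambda_b\} \tag{6.25}$$

where the Boolean matrix $[B_c^s]$ is introduced to extract the local corner unknowns
from the global corner unknowns, which can be expressed mathematically as
$[B_c^s]\{E_c\} = \{E_c^s\}$.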
To couple the fields over all the subdomains, however, we apply the Dirichlet
continuity condition in (6.7) to all interface electric field unknowns.
Mathematically, we can obtain the following matrix equation by assembling (6.24)
over all the subdomains and setting it to zero, which yields

$$\sum_{s=1}^{N_s} [B_b^s]^T \{E_b^s\} = \sum_{s=1}^{N_s} [B_b^s]^T [R_{br}^s][K_{rr}^s]^{-1} \left( \{f_r^s\} - [R_{br}^s]^T[B_b^s]\{\lambda_b\} - [K_{rc}^s][B_c^s]\{E_c\} \right) = 0 \tag{6.26}$$

This equation is very similar to (6.8) except that it contains the contribution from
the global corner unknowns $\{E_c\}$. However, to obtain the other system equation
relating $\{\lambda_b\}$ to $\{E_c\}$, we can sum (6.25) over all the subdomains as

$$\sum_{s=1}^{N_s} [B_c^s]^T \left( [K_{cc}^s] - [K_{cr}^s][K_{rr}^s]^{-1}[K_{rc}^s] \right) [B_c^s]\{E_c\}
= \sum_{s=1}^{N_s} [B_c^s]^T \left( \{f_c^s\} - \{\lambda_c^s\} - [K_{cr}^s][K_{rr}^s]^{-1}\{f_r^s\} + [K_{cr}^s][K_{rr}^s]^{-1}[R_{br}^s]^T[B_b^s]\{\lambda_b\} \right) \tag{6.27}$$

Because the tangential component of the magnetic field is continuous across the
interface between the subdomains (assume that no surface electric current exists on
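the interface), based on the definition of $\{\lambda_c^s\}$, we have

$$\sum_{s=1}^{N_s} [B_c^s]^T \{\lambda_c^s\} = 0 \tag{6.28}$$

With (6.28), (6.26) and (6.27) can be written compactly as

$$[\tilde{K}_{bb}]\{\lambda_b\} + [\tilde{K}_{bc}]\{E_c\} = \{\tilde{f}_b\} \tag{6.29}$$

$$[\tilde{K}_{cc}]\{E_c\} = \{\tilde{f}_c\} + [\tilde{K}_{cb}]\{\lambda_b\} \tag{6.30}$$

where

$$[\tilde{K}_{bb}] = \sum_{s=1}^{N_s} [B_b^s]^T [R_{br}^s][K_{rr}^s]^{-1}[R_{br}^s]^T[B_b^s]$$

$$[\tilde{K}_{bc}] = \sum_{s=1}^{N_s} [B_b^s]^T [R_{br}^s][K_{rr}^s]^{-1}[K_{rc}^s][B_c^s]$$

$$\{\tilde{f}_b\} = \sum_{s=1}^{N_s} [B_b^s]^T [R_{br}^s][K_{rr}^s]^{-1}\{f_r^s\}$$

$$[\tilde{K}_{cc}] = \sum_{s=1}^{N_s} [B_c^s]^T \left( [K_{cc}^s] - [K_{cr}^s][K_{rr}^s]^{-1}[K_{rc}^s] \right) [B_c^s]$$

$$[\tilde{K}_{cb}] = \sum_{s=1}^{N_s} [B_c^s]^T [K_{cr}^s][K_{rr}^s]^{-1}[R_{br}^s]^T[B_b^s]$$

$$\{\tilde{f}_c\} = \sum_{s=1}^{N_s} [B_c^s]^T \left( \{f_c^s\} - [K_{cr}^s][K_{rr}^s]^{-1}\{f_r^s\} \right)$$
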
In these equations, $[K_{cr}^s] = [K_{rc}^s]^T$ (assume that $\mu_r$ and $\epsilon_r$ are symmetric) and
$[\tilde{K}_{cb}] = [\tilde{K}_{bc}]^T$ because $[K_{rr}^s]$ is symmetric. We can eliminate $\{E_c\}$ in (6.29) and
(6.30) to find
After $\{\lambda_b\}$ is solved for, $\{E_c\}$ can be obtained from (6.30) and the electric field
inside each subdomain can be obtained by solving (6.24) [8].
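The elimination step can be sketched as follows. The block assumes that (6.26), written compactly with the global matrices defined above, reads $[\tilde{K}_{bb}]\{\lambda_b\} + [\tilde{K}_{bc}]\{E_c\} = \{\tilde{f}_b\}$ and that (6.30) reads $[\tilde{K}_{cc}]\{E_c\} = \{\tilde{f}_c\} + [\tilde{K}_{cb}]\{\lambda_b\}$. The signs follow from that reading and should be checked against the original derivation.

```latex
% Solve the corner system for the primal unknown
\{E_c\} = [\tilde{K}_{cc}]^{-1} \left( \{\tilde{f}_c\} + [\tilde{K}_{cb}]\{\lambda_b\} \right)

% Substitute into the interface system to obtain an equation in the dual unknown only
\left( [\tilde{K}_{bb}] + [\tilde{K}_{bc}][\tilde{K}_{cc}]^{-1}[\tilde{K}_{cb}] \right) \{\lambda_b\}
  = \{\tilde{f}_b\} - [\tilde{K}_{bc}][\tilde{K}_{cc}]^{-1}\{\tilde{f}_c\}
```

Only the global corner matrix, whose dimension equals the number of corner unknowns, has to be factorized in this step, which keeps the coarse problem small compared with the subdomain problems.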
Now we consider the BVP defined by (6.1) and (6.10). Adopting the i, b, c
subscript notation, we can write the subdomain matrix equation as
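
$$\{\lambda_b^s\} = \iint_{\Gamma^s} \{\mathbf{N}_b^s\} \cdot \boldsymbol{\Lambda}_b^s \, dS$$

$$\{E_b^s\} = [R_{br}^s][K_{rr}^s]^{-1} \left( \{f_r^s\} - [R_{br}^s]^T\{\lambda_b^s\} - \left( [K_{rc}^s] + [R_{br}^s]^T[L_{bc}^s] \right) [B_c^s]\{E_c\} \right) \tag{6.33}$$

$$\left( [K_{cc}^s] - [K_{cr}^s][K_{rr}^s]^{-1} \left( [K_{rc}^s] + [R_{br}^s]^T[L_{bc}^s] \right) \right) [B_c^s]\{E_c\}
= \{f_c^s\} - \{\lambda_c^s\} - [K_{cr}^s][K_{rr}^s]^{-1}\{f_r^s\} + [K_{cr}^s][K_{rr}^s]^{-1}[R_{br}^s]^T\{\lambda_b^s\} \tag{6.34}$$

where $[R_{br}^s]$, $[B_c^s]$, $[K_{rc}^s]$, $[K_{cr}^s]$, and $\{f_r^s\}$ are the same as those defined in
Section 6.2.1 except for $[K_{rr}^s]$, which is now defined as

$$[K_{rr}^s] = \begin{bmatrix} K_{ii}^s & K_{ib}^s \\ K_{bi}^s & K_{bb}^s + M_{bb}^s \end{bmatrix} \tag{6.35}$$

$$[\tilde{K}_{cc}]\{E_c\} = \{\tilde{f}_c\} + [\tilde{K}_{cb}]\{\lambda_b\} \tag{6.36}$$
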
where

$$[\tilde{K}_{cc}] = \sum_{s=1}^{N_s} [B_c^s]^T \left( [K_{cc}^s] - [K_{cr}^s][K_{rr}^s]^{-1} \left( [K_{rc}^s] + [R_{br}^s]^T[L_{bc}^s] \right) \right) [B_c^s]$$

$$[\tilde{K}_{cb}] = \sum_{s=1}^{N_s} [B_c^s]^T \left( [K_{cr}^s][K_{rr}^s]^{-1}[R_{br}^s]^T \right) [Q^s]$$

$$\{\tilde{f}_c\} = \sum_{s=1}^{N_s} [B_c^s]^T \left( \{f_c^s\} - [K_{cr}^s][K_{rr}^s]^{-1}\{f_r^s\} \right)$$

Here, $[Q^s]$ is the same as that defined in Section 6.1.2. Note that $\{\lambda_c^s\}$ in all
subdomains are cancelled out after the Neumann continuity condition is enforced
at the corner.
However, at the interface $\Gamma_{sq}$, by enforcing the tangential continuity of the
magnetic field and the tangential continuity of the electric field step by step, we
can obtain similar transmission conditions as given in (6.14). Again, taking the sth
subdomain as reference, we can obtain the following matrix equation on the
discretization level as

$$[L_{bc}^s]_q = \iint_{\Gamma_{sq}} \alpha^s (\hat{n}^s \times \{\mathbf{N}_b^s\}) \cdot (\hat{n}^s \times \{\mathbf{N}_c^s\}^T) \, dS$$

$$[L_{bc}^{sq}] = \iint_{\Gamma_{sq}} \alpha^q (\hat{n}^q \times \{\mathbf{N}_b^s\}) \cdot (\hat{n}^q \times \{\mathbf{N}_c^q\})^T \, dS$$

In (6.37), $[M_{bb}^{sq}]$ is the same as that defined in Section 6.1.2. Note that $[L_{bc}^{sq}]$ is a
projection matrix mapping the corner electric field $\{E_c^q\}$ in the qth subdomain to
the dual unknown $\{\lambda_b^s\}$ in the sth subdomain. Equation (6.37) can further be
rewritten as

$$\begin{aligned}
\{\lambda_b^s\}_q = {} & -\left( [T_s^q] - [M_{bb}^{sq}][T_s^q][F_{bb}^q] \right) \{\lambda_b^q\} - [L_{bc}^s]_q [S_q^s][B_c^s]\{E_c\} \\
& - \left( [L_{bc}^{sq}][S_s^q][B_c^q] - [M_{bb}^{sq}][T_s^q][F_{bc}^q] \right) \{E_c\} - [M_{bb}^{sq}][T_s^q]\{d_b^q\}
\end{aligned} \tag{6.39}$$

where

$$[F_{bb}^q] = [R_{br}^q][K_{rr}^q]^{-1}[R_{br}^q]^T$$

$$[F_{bc}^q] = [R_{br}^q][K_{rr}^q]^{-1} \left( [K_{rc}^q] + [R_{br}^q]^T[L_{bc}^q] \right) [B_c^q]$$

$$\{d_b^q\} = [R_{br}^q][K_{rr}^q]^{-1}\{f_r^q\}$$

Finally, we can reassemble (6.39) over all s and q to obtain an interface system for
all subdomains as
where

$$[\tilde{K}_{bb}] = [I] + \sum_{s=1}^{N_s} [Q^s]^T \sum_{q \in \text{neighbor}(s)} [T_q^s]^T \left( [T_s^q] - [M_{bb}^{sq}][T_s^q][F_{bb}^q] \right) [Q^q]$$

$$\begin{aligned}
[\tilde{K}_{bc}] = {} & \sum_{s=1}^{N_s} [Q^s]^T \sum_{q \in \text{neighbor}(s)} [T_q^s]^T [L_{bc}^s]_q [S_q^s][B_c^s] \\
& + \sum_{s=1}^{N_s} [Q^s]^T \sum_{q \in \text{neighbor}(s)} [T_q^s]^T \left( [L_{bc}^{sq}][S_s^q][B_c^q] - [M_{bb}^{sq}][T_s^q][F_{bc}^q] \right)
\end{aligned}$$

$$\{\tilde{f}_b\} = -\sum_{s=1}^{N_s} [Q^s]^T \sum_{q \in \text{neighbor}(s)} [T_q^s]^T [M_{bb}^{sq}][T_s^q]\{d_b^q\}$$

By combining (6.36) and (6.40) to eliminate the primal variable $\{E_c\}$, we can
derive the global interface equation for the dual variable $\{\lambda_b\}$ as
After $\{\lambda_b\}$ is solved, $\{E_c\}$ can be obtained from (6.36) and the electric field inside
each subdomain can be obtained by solving (6.33) [9].
6.2.3 Comparison Between FETI-DP Methods with One and Two Lagrange
Multipliers
Similar to the FETI method, the formulation of the FETI-DP method can also be
written symbolically. For both versions, because we separate the corner unknowns
from the noncorner interface unknowns, after eliminating the interior unknowns
$\{E_i^s\}$, we can extract two equations on the subdomain level. One is for the discrete
fields on the subdomain interface

$$\{E_b^s\} = f(\{E_c\}, \{\lambda_b\}, \{f^s\}) \tag{6.42}$$

which is called the subdomain interface system. The other equation is for the fields
at the corners of the subdomain

$$\{E_c^s\} = g(\{\lambda_b\}, \{f^s\}) \tag{6.43}$$

which is called the subdomain corner system. With these, we have effectively
converted a subdomain volumetric problem into a subdomain surface problem.
Next, we have to couple all the subdomains. After the global assembly
through the noncorner interface and the corner interface, we obtain two global
interface systems, which are

$$F(\{E_c\}, \{\lambda_b\}, \{f\}) = 0 \tag{6.44}$$

An efficient solution strategy is to eliminate $\{E_c\}$ to obtain the final system that
relates the dual unknown $\{\lambda_b\}$ and the excitation vector $\{f\}$. The elimination of
$\{E_c\}$ provides an additional benefit for the FETI-DP method with 2LM, that is, the
resultant condensed linear system becomes positive-definite, which is similar to
that of (6.18). However, if we solve $\{\lambda_b\}$ and $\{E_c\}$ together, the resultant linear
system in both the 1LM and 2LM versions is indefinite, which is not desirable for
an iterative solution [15].
The purpose of the coarse grid correction is twofold: (1) to avoid redundant
auxiliary variables at corner edges because no dual unknowns have to be defined
there; and (2) to introduce a mechanism to propagate the iterative residual error
globally and thus improve the convergence rate.
Up to now, one may find that the FETI-DP method with 1LM is more concise
and simpler to implement. However, it is not scalable with respect to the
subdomain size, that is, a much slower convergence is observed when the electrical
size of a subdomain is large enough to support resonant modes [8]. This is due to
the unknown Neumann boundary condition assumed on the subdomain interface.
In contrast, the FETI-DP method with 2LM is free of numerical resonance and thus
prevails in the high-frequency applications. All modifications and improvements
presented in the following sections are made to the FETI-DP method with 2LM
because all the applications considered in this chapter are pertinent to high-
frequency problems.
On the two sides of a subdomain interface, the conformal FETI-DP method with
2LM does not expand the dual variables (Lagrange multipliers) explicitly and the
continuity conditions across the interface are enforced on an unknown-by-
unknown basis. Unfortunately, such an unknown-by-unknown correspondence
does not exist if the meshes for the two neighboring subdomains are not the same.
Such a case is called a nonconformal interface. Recently, an effort to extend the
FETI-DP algorithm to deal with nonconformal meshes was presented in [32] and
some preliminary results were obtained for the Laplace equation. In this section,
we extend the conformal FETI-DP method with 2LM [14, 15] to the case with
nonconformal interface meshes. The new DDM algorithm is referred to as the
Lagrange-multiplier (LM)-based FETI-DP method in the following context.
We focus on the BVP defined by (6.1) and (6.10), with nonconformal meshes on
the subdomain interface. The resultant subdomain matrix equation, which is very
similar to (6.32), can be written as
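$$
\begin{bmatrix}
K_{ii}^s & K_{ib}^s & K_{ic}^s \\
K_{bi}^s & K_{bb}^s + M_{bb}^s & K_{bc}^s \\
K_{ci}^s & K_{cb}^s & K_{cc}^s
\end{bmatrix}
\begin{Bmatrix} E_i^s \\ E_b^s \\ E_c^s \end{Bmatrix}
=
\begin{Bmatrix} f_i^s \\ f_b^s \\ f_c^s \end{Bmatrix}
-
\begin{Bmatrix} 0 \\ B_{bb}^s \lambda_b^s + L_{bc}^s E_c^s \\ \lambda_c^s \end{Bmatrix}
\tag{6.46}
$$
where
$$[B_{bb}^s] = \iint_{\Gamma^s} \{\mathbf{N}_b^s\} \cdot \{\mathbf{N}_b^s\}^T \, dS$$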
In (6.46), $[K^s]$, $\{E^s\}$, $\{f^s\}$, $[M_{bb}^s]$, $[L_{bc}^s]$, $\{\lambda_b^s\}$, and $\{\lambda_c^s\}$ are the same as those
defined in Section 6.2.2. The only extra term is $[B_{bb}^s]$, which represents the
interaction between the interface electric field and the interface dual unknown.
Different from the conformal FETI-DP method with 2LM, the dual unknown $\boldsymbol{\Lambda}_b^s$
here is explicitly expanded in terms of a set of curl-conforming vector basis
functions defined on $\Gamma^s$ such that $\boldsymbol{\Lambda}_b^s = \{\mathbf{N}_b^s\}^T \{\lambda^s\}$ [15]. From (6.46), we can
obtain two equations involving the interface and dual unknowns on the subdomain
interface. One is
$$\{E_b^s\} = [R_{br}^s][K_{rr}^s]^{-1} \left\{ \{f_r^s\} - [R_{br}^s]^T [B_{bb}^s]\{\lambda_b^s\} - \left([K_{rc}^s] + [R_{br}^s]^T [L_{bc}^s]\right)[B_c^s]\{E_c\} \right\} \tag{6.47}$$
and the other is
$$[N_{bb}^s]_q \{\lambda_b^s\}_q + [L_{bc}^s]_q \{E_c^s\}_q = -[N_{bb}^{sq}]\{\lambda_b^q\}_s - [L_{bc}^{sq}]\{E_c^q\}_s - [M_{bb}^{sq}]\{E_b^q\}_s \tag{6.48}$$
where
$$[N_{bb}^s]_q = \iint_{\Gamma^{sq}} \{\mathbf{N}_b^s\} \cdot \{\mathbf{N}_b^s\}^T \, dS$$
$$[N_{bb}^{sq}] = \iint_{\Gamma^{sq}} \{\mathbf{N}_b^s\} \cdot \{\mathbf{N}_b^q\}^T \, dS$$
and $[L_{bc}^s]_q$, $[L_{bc}^{sq}]$, and $[M_{bb}^{sq}]$ are the same as those defined in Section 6.2.2. Note
that $[N_{bb}^s]_q$ is always diagonally dominant. Therefore, we can take the inversion of
$[N_{bb}^s]_q$ to write the transmission condition (6.48) as
$$\{\lambda_b^s\}_q = -[N_{bb}^s]_q^{-1}[L_{bc}^s]_q[S_q^s]\{E_c^s\} - [N_{bb}^s]_q^{-1}[N_{bb}^{sq}][T_s^q]\{\lambda_b^q\} - [N_{bb}^s]_q^{-1}[L_{bc}^{sq}][S_s^q]\{E_c^q\} - [N_{bb}^s]_q^{-1}[M_{bb}^{sq}][T_s^q]\{E_b^q\} \tag{6.49}$$
where the Boolean matrices $[T_q^s]$ and $[S_q^s]$ are the same as those defined in
Section 6.2.2. Similarly, we can further eliminate $\{E_b^q\}$ with the aid of (6.47) and
then assemble over all the subdomains to obtain the global interface system as
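$$[\tilde{K}_{bb}]\{\lambda_b\} + [\tilde{K}_{bc}]\{E_c\} = \{\tilde{f}_b\} \tag{6.50}$$
where
$$[\tilde{K}_{bb}] = [I] + \sum_{s=1}^{N_s} [Q^s]^T \sum_{q \in \mathrm{neighbor}(s)} [T_q^s]^T [N_{bb}^s]_q^{-1} \left([N_{bb}^{sq}][T_s^q] - [M_{bb}^{sq}][T_s^q][F_{bb}^q]\right)[Q^q]$$
$$[\tilde{K}_{bc}] = \sum_{s=1}^{N_s} [Q^s]^T \sum_{q \in \mathrm{neighbor}(s)} [T_q^s]^T [N_{bb}^s]_q^{-1} [L_{bc}^s]_q [S_q^s][B_c^s] + \sum_{s=1}^{N_s} [Q^s]^T \sum_{q \in \mathrm{neighbor}(s)} [T_q^s]^T [N_{bb}^s]_q^{-1} \left([L_{bc}^{sq}][S_s^q][B_c^q] - [M_{bb}^{sq}][T_s^q][F_{bc}^q]\right)$$
$$\{\tilde{f}_b\} = -\sum_{s=1}^{N_s} [Q^s]^T \sum_{q \in \mathrm{neighbor}(s)} [T_q^s]^T [N_{bb}^s]_q^{-1} [M_{bb}^{sq}][T_s^q]\{d_b^q\}$$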
In these expressions, $[F_{bb}^q] = [R_{br}^q][K_{rr}^q]^{-1}[R_{br}^q]^T[B_{bb}^q]$, and $[F_{bc}^q]$ and $\{d_b^q\}$ are the
same as those defined in Section 6.2.2. By combining (6.36) and (6.50), we can
solve for $\{\lambda_b\}$ and $\{E_c\}$. Afterwards, the electric field inside each subdomain can
be obtained by solving (6.47).
To further enhance the capability of the LM-based FETI-DP method to deal with
arbitrary meshes, we now focus on the extension to the nonconformal corner case.
We start from the continuity condition on one geometrical crosspoint ($\Gamma_c$) as
illustrated in Figure 6.4, where four subdomains share one global corner edge.
We denote the number of unknowns defined on each local corner edge as $N_c$,
then call the corner with the most unknowns the “master” corner and the others “slave”
corners so that $N_c^{\mathrm{slave}} \le N_c^{\mathrm{master}}$. Note that subdomains with more than one
crosspoint could contain both master and slave corners. For the LM-based FETI-DP
method presented in Section 6.3.1, we impose the Dirichlet continuity
condition at the corner as
$$\mathbf{E}_t^{\mathrm{master}} = \mathbf{E}_t^{\mathrm{slave}} \tag{6.51}$$
in a weak sense, where the subscript t specifies the tangential electric field along
the corner edge.
The tangential electric field for the master and slave subdomains (taking one
slave subdomain for example) can be expanded by two independent sets of basis
functions $\{\mathbf{N}_c^{\mathrm{master}}\}$ and $\{\mathbf{N}_c^{\mathrm{slave}}\}$ as
$$\mathbf{E}_t^{\mathrm{slave}} = \sum_{n=1}^{N_c^{\mathrm{slave}}} E_{c,n}^{\mathrm{slave}} \mathbf{N}_{c,n}^{\mathrm{slave}} = \{\mathbf{N}_c^{\mathrm{slave}}\}^T \{E_c^{\mathrm{slave}}\}$$
$$\mathbf{E}_t^{\mathrm{master}} = \sum_{n=1}^{N_c^{\mathrm{master}}} E_{c,n}^{\mathrm{master}} \mathbf{N}_{c,n}^{\mathrm{master}} = \{\mathbf{N}_c^{\mathrm{master}}\}^T \{E_c^{\mathrm{master}}\} \tag{6.52}$$
Figure 6.4 Master and slave corners associated with one shared crosspoint ($\Gamma_c$). The number of
unknowns defined on the master corner is larger than or equal to that defined on each
slave corner ($N_c^{\mathrm{slave}} \le N_c^{\mathrm{master}}$). ©2012 IEEE [15].
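By substituting (6.52) into (6.51) and testing both sides using $\{\mathbf{N}_c^{\mathrm{slave}}\}$, we obtain
$$[G_{cc}^{\mathrm{slv\text{-}slv}}]\{E_c^{\mathrm{slave}}\} = [H_{cc}^{\mathrm{slv\text{-}mst}}]\{E_c^{\mathrm{master}}\} \tag{6.53}$$
where
$$[G_{cc}^{\mathrm{slv\text{-}slv}}] = \int_{\Gamma_c} \{\mathbf{N}_c^{\mathrm{slave}}\} \cdot \{\mathbf{N}_c^{\mathrm{slave}}\}^T \, dl$$
$$[H_{cc}^{\mathrm{slv\text{-}mst}}] = \int_{\Gamma_c} \{\mathbf{N}_c^{\mathrm{slave}}\} \cdot \{\mathbf{N}_c^{\mathrm{master}}\}^T \, dl.$$
Note that the matrix dimensions of $[G_{cc}^{\mathrm{slv\text{-}slv}}]$ and $[H_{cc}^{\mathrm{slv\text{-}mst}}]$ are $N_c^{\mathrm{slave}} \times N_c^{\mathrm{slave}}$ and
$N_c^{\mathrm{slave}} \times N_c^{\mathrm{master}}$, respectively. Because $[G_{cc}^{\mathrm{slv\text{-}slv}}]$ is always diagonally dominant and
thus invertible, we have
$$\{E_c^{\mathrm{slave}}\} = [G_{cc}^{\mathrm{slv\text{-}slv}}]^{-1} [H_{cc}^{\mathrm{slv\text{-}mst}}]\{E_c^{\mathrm{master}}\} \tag{6.54}$$
which means that the corner unknowns defined on the slave corners can be
represented by those on the master corners. Therefore, one can construct a global
coarse problem by using only the corner unknowns on all the master corners. After
incorporating the nonconformal corner scheme into the LM-based FETI-DP
method, we find that the matrices and vectors related to the global corner system
remain the same for the master subdomains but have to be modified for the slave
subdomains [15].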
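The master-to-slave relation can be illustrated numerically. The sketch below is a simplified stand-in: it assumes piecewise-constant scalar basis functions on a straight corner edge of unit length (the chapter uses vector basis functions), so that the slave Gram matrix is diagonal and the coupling matrix holds segment overlap lengths. The function names and node positions are hypothetical.

```python
import numpy as np


def overlap_length(a0, a1, b0, b1):
    """Length of the intersection of intervals [a0, a1] and [b0, b1]."""
    return max(0.0, min(a1, b1) - max(a0, b0))


def corner_projection(slave_nodes, master_nodes):
    """Build G (slave-slave) and H (slave-master) for piecewise-constant bases."""
    n_slave = len(slave_nodes) - 1
    n_master = len(master_nodes) - 1
    G = np.zeros((n_slave, n_slave))
    H = np.zeros((n_slave, n_master))
    for i in range(n_slave):
        # Non-overlapping constant functions give a diagonal Gram matrix.
        G[i, i] = slave_nodes[i + 1] - slave_nodes[i]
        for j in range(n_master):
            H[i, j] = overlap_length(slave_nodes[i], slave_nodes[i + 1],
                                     master_nodes[j], master_nodes[j + 1])
    return G, H


master_nodes = np.linspace(0.0, 1.0, 5)   # 4 master unknowns
slave_nodes = np.linspace(0.0, 1.0, 4)    # 3 slave unknowns
G, H = corner_projection(slave_nodes, master_nodes)

E_master = np.array([1.0, 2.0, 3.0, 4.0])
E_slave = np.linalg.solve(G, H @ E_master)   # slave values follow from master values
```

With these bases each slave value is the length-weighted average of the master values it overlaps, so a constant master field is reproduced exactly on the slave corner.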
As for the CE-based FETI-DP method, we regard all the subdomain interfaces
(except corners) as unknown Neumann boundaries with an auxiliary unknown
representing the surface current density
$$\mathbf{j}^s = \hat{n}^s \times (\mu_r^{-1} \nabla \times \mathbf{E}^s) \quad \text{on } \Gamma^s \tag{6.55}$$
where $\alpha = jk_0$.
To solve the BVP defined in (6.1) and (6.55) using the FEM, each
subdomain is discretized separately into finite elements such as tetrahedra. Based
on the formulation we have derived, we can either choose the same set of vector
(e.g., curl-conforming) basis functions $\{\mathbf{N}_b^s\}$ to expand both the electric field and
the interface auxiliary variable $\mathbf{j}^s$, or choose the orthogonal sets $\{\mathbf{N}_b^s\}$ and
$\hat{n}^s \times \{\mathbf{N}_b^s\}$ (as was done in [34–36]) to expand them, respectively. Here, we use the
same set of vector basis functions to expand both $\mathbf{E}^s$ and $\mathbf{j}^s$ [15]. It should be
noted that the following derivation is also valid if one chooses $\hat{n}^s \times \{\mathbf{N}_b^s\}$ as the
basis function to expand $\mathbf{j}^s$. By applying Galerkin’s method, the FEM equation for
the sth subdomain can be written as
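$$
\begin{bmatrix}
K_{ii}^s & K_{ib}^s & K_{ic}^s & 0 \\
K_{bi}^s & K_{bb}^s & K_{bc}^s & B_{bj}^s \\
K_{ci}^s & K_{cb}^s & K_{cc}^s & B_{cj}^s
\end{bmatrix}
\begin{Bmatrix} E_i^s \\ E_b^s \\ E_c^s \\ j^s \end{Bmatrix}
=
\begin{Bmatrix} f_i^s \\ f_b^s \\ f_c^s \end{Bmatrix}
\tag{6.57}
$$
where
$$[B_{bj}^s] = \iint_{\Gamma^s} \{\mathbf{N}_b^s\} \cdot \{\mathbf{N}_b^s\}^T \, dS$$
$$[B_{cj}^s] = \iint_{\Gamma^s} \{\mathbf{N}_c^s\} \cdot \{\mathbf{N}_b^s\}^T \, dS$$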
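where
$$[D_{jb}^s] = \iint_{\Gamma^s} (\hat{n}^s \times \{\mathbf{N}_b^s\}) \cdot (\hat{n}^s \times \{\mathbf{N}_b^s\}^T) \, dS$$
$$[D_{jc}^s] = \iint_{\Gamma^s} (\hat{n}^s \times \{\mathbf{N}_b^s\}) \cdot (\hat{n}^s \times \{\mathbf{N}_c^s\}^T) \, dS$$
$$[C_{jj}^s] = \frac{j}{k_0} \iint_{\Gamma^s} \{\mathbf{N}_b^s\} \cdot \{\mathbf{N}_b^s\}^T \, dS$$
$$[V_{jj}^{sq}] = \frac{1}{jk_0} \iint_{\Gamma^{sq}} \{\mathbf{N}_b^s\} \cdot \{\mathbf{N}_b^q\}^T \, dS$$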
Equations (6.57) and (6.58) can be combined to form a complete system for the sth
subdomain as
where
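$$\{g^s\} = \sum_{q \in \mathrm{neighbor}(s)} \begin{bmatrix} U_{jb}^{sq} & U_{jc}^{sq} & V_{jj}^{sq} \end{bmatrix} \begin{Bmatrix} E_b^q \\ E_c^q \\ j^q \end{Bmatrix}$$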
After reordering unknowns in each subdomain, we obtain
where
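$$[K_{rr}^s] = \begin{bmatrix} K_{ii}^s & K_{ib}^s & 0 \\ K_{bi}^s & K_{bb}^s & B_{bj}^s \\ 0 & D_{jb}^s & C_{jj}^s \end{bmatrix}, \quad [K_{rc}^s] = \begin{bmatrix} K_{ic}^s \\ K_{bc}^s \\ D_{jc}^s \end{bmatrix}, \quad [K_{cr}^s] = \begin{bmatrix} K_{ci}^s & K_{cb}^s & B_{cj}^s \end{bmatrix}$$
$$\{u_r^s\} = \begin{Bmatrix} E_i^s \\ E_b^s \\ j^s \end{Bmatrix}, \quad \{f_r^s\} = \begin{Bmatrix} f_i^s \\ f_b^s \\ 0 \end{Bmatrix}, \quad \text{and} \quad \{f_g^s\} = \begin{Bmatrix} 0 \\ 0 \\ g^s \end{Bmatrix}$$
Based on the convention adopted in [2], $\{f_g^s\}$ represents the contribution from all
neighbors of the sth subdomain and can be written as $\{f_g^s\} = [R_j^s]^T \{g^s\}$, where
$[R_j^s] = [0 \;\; 0 \;\; I_{bb}^s]$ and $[I_{bb}^s]$ is an identity matrix.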
From (6.61), we can find that the subdomain system matrices for different
subdomains become decoupled, while the interaction with the neighboring
subdomains is included in the mixed boundary condition at the interfaces. By using
the first equation in (6.61) and a Boolean matrix
$$[R_{bb}^s] = \begin{bmatrix} 0 & I_{bb}^s & 0 \\ 0 & 0 & I_{bb}^s \end{bmatrix}$$
the electric field and auxiliary unknowns at the subdomain interfaces can be found
as
$$\{u_b^s\} = \{E_b^s, \; j^s\}^T = [R_{bb}^s]\{u_r^s\}$$
$$\{u\} = \sum_{s=1}^{N_s} [Q^s]^T \{u_b^s\} = \sum_{s=1}^{N_s} [Q^s]^T [R_{bb}^s][K_{rr}^s]^{-1}\{f_r^s\} + \sum_{s=1}^{N_s} [Q^s]^T [R_{bb}^s][K_{rr}^s]^{-1}[R_j^s]^T \{g^s\} - \sum_{s=1}^{N_s} [Q^s]^T [R_{bb}^s][K_{rr}^s]^{-1}[K_{rc}^s][B_c^s]\{E_c\} \tag{6.63}$$
Similar to the conformal FETI-DP method with 2LM, the subdomain-level
corner-unknown system here can be derived from (6.61) by eliminating $\{u_r^s\}$ as
Finally, we obtain two equations to relate the global interface dual unknowns
and the global corner primal unknowns
$$[\tilde{K}_{rr}]\{u\} + [\tilde{K}_{rc}]\{E_c\} = \{\tilde{f}_r\} \tag{6.65}$$
where
$$[\tilde{K}_{rr}] = [I] - \sum_{s=1}^{N_s} [Q^s]^T [R_{bb}^s][K_{rr}^s]^{-1}[R_j^s]^T \sum_{q \in \mathrm{neighbor}(s)} \begin{bmatrix} U_{jb}^{sq} & V_{jj}^{sq} \end{bmatrix} [Q^q]$$
$$[\tilde{K}_{rc}] = \sum_{s=1}^{N_s} [Q^s]^T [R_{bb}^s][K_{rr}^s]^{-1} \Big( [K_{rc}^s][B_c^s] - [R_j^s]^T \sum_{q \in \mathrm{neighbor}(s)} [U_{jc}^{sq}][B_c^q] \Big)$$
$$\{\tilde{f}_r\} = \sum_{s=1}^{N_s} [Q^s]^T [R_{bb}^s][K_{rr}^s]^{-1}\{f_r^s\}$$
and
$$[\tilde{K}_{cc}]\{E_c\} = \{\tilde{f}_c\} - [\tilde{K}_{cr}]\{u\} \tag{6.66}$$
where
$$[\tilde{K}_{cc}] = \sum_{s=1}^{N_s} [B_c^s]^T \Big( \big([K_{cc}^s] - [K_{cr}^s][K_{rr}^s]^{-1}[K_{rc}^s]\big)[B_c^s] + [K_{cr}^s][K_{rr}^s]^{-1}[R_j^s]^T \sum_{q \in \mathrm{neighbor}(s)} [U_{jc}^{sq}][B_c^q] \Big)$$
$$[\tilde{K}_{cr}] = \sum_{s=1}^{N_s} [B_c^s]^T [K_{cr}^s][K_{rr}^s]^{-1}[R_j^s]^T \sum_{q \in \mathrm{neighbor}(s)} \begin{bmatrix} U_{jb}^{sq} & V_{jj}^{sq} \end{bmatrix} [Q^q]$$
$$\{\tilde{f}_c\} = \sum_{s=1}^{N_s} [B_c^s]^T \big( \{f_c^s\} - [K_{cr}^s][K_{rr}^s]^{-1}\{f_r^s\} \big)$$
Similarly, {u} can be solved by using one of the Krylov subspace iterative solvers
after eliminating {Ec } based on (6.65) and (6.66). The electric field inside each
subdomain can finally be obtained by solving (6.62).
To remove the requirement for the conformal corner mesh, one can employ the
scheme described in Section 6.3.2. After incorporating the nonconformal corner
scheme into the nonconformal FETI-DP method with cement elements, we find
that the matrices and vectors related to the global corner system in (6.65) and
(6.66) remain the same for master subdomains but have to be modified for the
slave subdomains [15].
Comparing the formulations of the LM- and CE-based FETI-DP methods, we first
find that the dimension of $[K_{rr}^s]$, which has to be factorized during the tearing
stage, is $N_i^s + N_b^s$ and $N_i^s + 2N_b^s$ for the LM- and CE-based FETI-DP methods,
respectively, where $N_i^s$ and $N_b^s$ denote the number of interior and boundary
unknowns. Even with one more matrix factorization for $[N_{bb}^s]_q$ in (6.49), the
computational cost of the LM-based FETI-DP method is still smaller than that of
the CE-based FETI-DP method. Furthermore, in the global system solution stage,
each iteration step requires solving all subdomain equations directly. Because of
the smaller subdomain matrices, the LM-based FETI-DP method has faster
forward and back substitutions. This advantage also holds for the LM-based FETI-
DP method during the subdomain solution recovering stage. Finally, considering
the global interface system, because the LM-based FETI-DP method includes only
the Lagrange multipliers, the dimension of its global system matrix is only half of
that of the CE-based FETI-DP method, which includes both the interface electric
field and the auxiliary variable. Therefore, if these two methods converge with the
same number of steps, as we have observed in most cases, the LM-based FETI-DP
method is generally more efficient than the CE-based FETI-DP method [15].
The Robin boundary condition (6.10) and the resultant transmission conditions
(6.13) in Section 6.1.2 are equivalent to the first-order transmission condition
(FOTC) defined by (6.55) and (6.56) in Section 6.4.1. The FOTC can only
guarantee the transmission of propagating modes through the subdomain interface
[34, 38]. A higher-order transmission condition can be designed to transmit both
propagating and evanescent modes and therefore can be employed to speed up the
convergence of the iterative solution of the global interface problem [23, 24, 38,
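39]. The transverse-electric second-order transmission condition (SOTC-TE) can
be written as
$$\mathbf{j}^s + \alpha^s \hat{n}^s \times (\hat{n}^s \times \mathbf{E}^s) + \beta^s \nabla \times [\hat{n}^s (\nabla \times \mathbf{E}^s)_n] = -\mathbf{j}^q + \alpha^s \hat{n}^q \times (\hat{n}^q \times \mathbf{E}^q) + \beta^s \nabla \times [\hat{n}^q (\nabla \times \mathbf{E}^q)_n] \tag{6.67}$$
and the full second-order transmission condition (SOTC-FULL) as
$$\mathbf{j}^s + \alpha^s \hat{n}^s \times (\hat{n}^s \times \mathbf{E}^s) + \beta^s \nabla \times [\hat{n}^s (\nabla \times \mathbf{E}^s)_n] + \gamma^s \nabla_t \nabla_t \cdot \mathbf{j}^s = -\mathbf{j}^q + \alpha^s \hat{n}^q \times (\hat{n}^q \times \mathbf{E}^q) + \beta^s \nabla \times [\hat{n}^q (\nabla \times \mathbf{E}^q)_n] - \gamma^s \nabla_t \nabla_t \cdot \mathbf{j}^q \tag{6.68}$$
where $\mathbf{j}^s$ is defined in (6.55). Comparing (6.67) and (6.68) to (6.56), we can find
that two terms, which correspond to the tangential variation of the normal magnetic
flux density $\nabla \times [\hat{n}^s (\nabla \times \mathbf{E}^s)_n]$, where $(\nabla \times \mathbf{E}^s)_n = \hat{n}^s \cdot (\nabla \times \mathbf{E}^s)$, and the tangential
variation of the surface charge density $\nabla_t \nabla_t \cdot \mathbf{j}^s$, are added into the FOTC gradually
to construct the SOTC-TE and the SOTC-FULL. These two added terms transmit
the transverse-electric (TE) and transverse-magnetic (TM) evanescent modes
through subdomain interfaces and thus improve the convergence.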
Similar to the FOTC with Lagrange multipliers that we have used for the
FETI-DP method with 2LM, we can write the SOTC-TE with Lagrange multipliers
as
$$\hat{n}^s \times (\mu_r^{-1} \nabla \times \mathbf{E}^s) + \alpha^s \hat{n}^s \times (\hat{n}^s \times \mathbf{E}^s) + \beta^s \nabla \times [\hat{n}^s (\nabla \times \mathbf{E}^s)_n] = \boldsymbol{\Lambda}_b^s \quad \text{on } \Gamma^s \tag{6.69}$$
where $\beta^s$ can be determined based on the smallest mesh size and the order of
basis functions on the subdomain interface to account for all the evanescent modes
supported by the interface mesh [23, 38]. More specifically, $\beta^s = j / (k_0 + \tilde{k})$,
with $\tilde{k} = -j (k_{\max}^2 - k_0^2)^{1/2}$ and $k_{\max} = \pi / h_{\min}$, where $h_{\min}$ denotes the smallest
mesh size on the subdomain interface. The boundary condition in (6.69) is of
particular interest because it can be implemented without introducing any extra
auxiliary variables on subdomain interfaces. When incorporated into the dual-
primal framework, it does not change the sparsity pattern of the subdomain
matrices compared to that in the FOTC case. The subdomain matrix symmetry is
also preserved, which is highly desirable for the storage and factorization by a
direct sparse solver [16]. The only extra computation is to calculate a localized
surface mass matrix, which is very cheap.
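The parameter choice can be checked with a few lines of arithmetic. The sketch below assumes the $e^{j\omega t}$ time convention with $\alpha = jk_0$, $k_{\max} = \pi / h_{\min}$, $\tilde{k} = -j (k_{\max}^2 - k_0^2)^{1/2}$, and $\beta = j / (k_0 + \tilde{k})$; the signs are an assumption of this example, chosen so that $j\tilde{k} - \alpha + \beta k_{\max}^2 = 0$, that is, so that the condition is matched to the most rapidly decaying TE mode the interface mesh can represent. The function name and the sample values are hypothetical.

```python
import math


def sotc_te_parameters(wavelength, h_min):
    """Return (alpha, beta, k_max, k_tilde) for one interface mesh size."""
    k0 = 2.0 * math.pi / wavelength
    k_max = math.pi / h_min                       # largest transverse wavenumber on the mesh
    if k_max <= k0:
        raise ValueError("mesh too coarse: k_max must exceed k0")
    k_tilde = -1j * math.sqrt(k_max**2 - k0**2)   # assumed decay convention
    alpha = 1j * k0
    beta = 1j / (k0 + k_tilde)                    # assumed sign, see the text above
    return alpha, beta, k_max, k_tilde


# Sample values: wavelength 5 m and mesh size 0.25 m (one twentieth of a wavelength).
alpha, beta, k_max, k_tilde = sotc_te_parameters(wavelength=5.0, h_min=0.25)

# Matching residual for the mode with transverse wavenumber k_max.
residual = 1j * k_tilde - alpha + beta * k_max**2
```

Refining the mesh raises `k_max`, so `beta` has to be recomputed whenever the smallest interface mesh size changes.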
Adding the boundary conditions from two neighboring subdomains and
eliminating the tangential magnetic field, we have
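Then we enforce the continuity of the tangential electric field and the tangential
variation of the normal magnetic flux, $\nabla \times [\hat{n}^s (\nabla \times \mathbf{E}_b^s)_n] = \nabla \times [\hat{n}^q (\nabla \times \mathbf{E}_b^q)_n]$, to
obtain
$$\boldsymbol{\Lambda}_b^s + \boldsymbol{\Lambda}_b^q = (\alpha^s + \alpha^q)\, \hat{n}^q \times (\hat{n}^q \times \mathbf{E}_b^q) + (\beta^s + \beta^q)\, \nabla \times [\hat{n}^q (\nabla \times \mathbf{E}_b^q)_n]$$
$$\boldsymbol{\Lambda}_b^q + \boldsymbol{\Lambda}_b^s = (\alpha^s + \alpha^q)\, \hat{n}^s \times (\hat{n}^s \times \mathbf{E}_b^s) + (\beta^s + \beta^q)\, \nabla \times [\hat{n}^s (\nabla \times \mathbf{E}_b^s)_n] \tag{6.71}$$
Owing to the use of the SOTC-TE, the computation of some matrices in Sections 6.2.2 and
6.3.1 has to be modified as follows:
$$[L_{bc}^s]_q = \iint_{\Gamma^{sq}} \left[ \alpha^s (\hat{n}^s \times \{\mathbf{N}_b^s\}) \cdot (\hat{n}^s \times \{\mathbf{N}_c^s\}^T) - \beta^s (\nabla \times \{\mathbf{N}_b^s\})_n (\nabla \times \{\mathbf{N}_c^s\}^T)_n \right] dS$$
$$[L_{bc}^{sq}] = \iint_{\Gamma^{sq}} \left[ \alpha^q (\hat{n}^q \times \{\mathbf{N}_b^s\}) \cdot (\hat{n}^q \times \{\mathbf{N}_c^q\}^T) - \beta^q (\nabla \times \{\mathbf{N}_b^s\})_n (\nabla \times \{\mathbf{N}_c^q\}^T)_n \right] dS$$
The SOTC-TE can be applied to both the conformal and nonconformal FETI-DP
methods with 2LM.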
Figure 6.5 Two regions of an entire computational domain decomposed into six and four
subdomains. For a better view, two regions are artificially detached. ©2014 IEEE [27].
Figure 6.6 Illustration of a split Lagrange multiplier (associated with the bold line) defined on the
interface between two regions. After splitting, two independent Lagrange multipliers (still
associated with the bold line) are defined on the shaded and solid triangles. ©2014 IEEE
[27].
In the first example, we simulate wave propagation in free space and use the result
to compare the convergence performance for the solution of the global interface
problem in the conformal FETI-DP and LM-based and CE-based nonconformal
FETI-DP methods as described in Sections 6.2.2, 6.3, and 6.4, respectively. We
design three different subdomains as shown in Figure 6.8, and use them to form a
computational domain with 3 × 3 subdomains to test three cases: (1) mesh with
conformal interfaces and conformal corners; (2) mesh with nonconformal
interfaces but conformal corners; and (3) mesh with nonconformal interfaces and
nonconformal corners, as shown in Figure 6.9. It is well known that the
convergence rate of an iterative solution of a linear system is closely related to the
eigenvalue distribution of the system matrix.
Therefore, we compare the eigenspectra of the global interface equations for $\{\lambda_b\}$
(in the conformal and LM-based nonconformal FETI-DP methods) and $\{u\}$ (in the
CE-based nonconformal FETI-DP method) in Figures 6.10, 6.11, and 6.12. For
the case with conformal interface and corner meshes, whose result is plotted in
Figure 6.10, the convergence performance of all three methods is expected to
be similar because their eigenspectra look nearly identical except that the CE-based
FETI-DP has a pure propagation mode corresponding to the (1, 0) point on
the complex plane. Similarly, for the case with nonconformal interfaces and either
conformal or nonconformal corner meshes, we can see from Figures 6.11 and 6.12
that the performance of the LM- and CE-based nonconformal FETI-DP methods is
again similar. Our prediction is further validated by comparing the
convergence history for all the cases in Figure 6.13, where the BiCGStab iterative
solver with a stopping criterion of $10^{-9}$ is employed to solve the global interface
equations.
Figure 6.11 Eigenspectra of the global interface system matrix for the case of nonconformal interface
and conformal corner meshes. (a) LM-based nonconformal FETI-DP (matrix dimension:
1,144 × 1,144). (b) CE-based nonconformal FETI-DP (matrix dimension: 2,288 × 2,288).
©2014 IEEE [15].
Figure 6.12 Eigenspectra of the global interface system matrix for the case of nonconformal
interface and nonconformal corner meshes. (a) LM-based nonconformal FETI-DP
(matrix dimension: 1,240 × 1,240). (b) CE-based non-conformal FETI-DP (matrix
dimension: 2,480 × 2,480). ©2014 IEEE [15].
Figure 6.13 Convergence history for all the cases by using the BiCGStab iterative solver with a
stopping criterion of $10^{-9}$. (a) The case of conformal interface and conformal corner
meshes corresponding to the decomposition pattern in Figure 6.9(a). (b) The case of
nonconformal interface and conformal corner meshes corresponding to the
decomposition pattern in Figure 6.9(b). (c) The case of nonconformal interface and non-
conformal corner meshes corresponding to the decomposition pattern in Figure 6.9(c).
©2014 IEEE [15].
Figure 6.14 (a) A computational domain filled with a PML medium and decomposed into nine
subdomains. (b) Convergence history of the iterative solution of the global interface
problem on the mesh in Figure 6.14(a) using the FETI-DP method with the FOTC. (c)
Using the FETI method with the SOTC-TE. (d) Using the cement element method with
the SOTC-FULL. (e) Using the FETI-DP method with the SOTC-TE.
As we know, this medium is used to absorb waves propagating along the z-axis.
The entire computational domain is further divided into nine subdomains and
discretized into tetrahedral elements with a certain mesh density, as shown in
Figure 6.14(a). In the simulation, we fix the wavelength at $\lambda = 5$ m and decrease the
mesh size h gradually from 0.5 m to 0.0625 m. In all setups, the interface mesh is
required to be conformal. The scalability with respect to the mesh size is shown by
the convergence history of the iterative solution of the global interface problem
using the FETI-DP method with the FOTC, the FETI method with the SOTC-TE,
the cement element method with the SOTC-FULL, and the FETI-DP method with
the SOTC-TE in Figures 6.14(b), 6.14(c), 6.14(d), and 6.14(e), respectively. By
comparing Figure 6.14(e) with Figures 6.14(b) and 6.14(c), it is observed that the
SOTC-TE yields a faster convergence than does the FOTC and that the global
corner coarse grid correction also helps to improve the convergence. A comparison
between Figures 6.14(e) and 6.14(d) shows that the FETI-DP method with the
SOTC-TE can achieve a better convergence performance than the cement element
method with the SOTC-FULL does. It should be noted that the FETI-DP method is
even more efficient due to the reduced size of the global interface system and
symmetry of the subdomain matrices.
Figure 6.15 Eigenspectra of the global interface system on the mesh in Figure 6.14(a), with a mesh
size $h = \lambda / 20$. (a) Using the FETI-DP method with the FOTC. (b) Using the FETI
method with the SOTC-TE. (c) Using the cement element method with the SOTC-FULL.
(d) Using the FETI-DP method with the SOTC-TE.
Figure 6.15 shows the eigenspectra of the global interface equations for the
mesh in Figure 6.14(a) with a mesh size of $h = \lambda/20$, using the four different DDM
solvers. It can be seen that the simplified TC parameters work well for all solvers
except for the FETI-DP method with the FOTC. Because its interface system
matrix is no longer positive-definite, the FOTC converges at the slowest rate when
solving the global interface problem. Comparing Figure 6.15(d) with Figure
6.15(b), one can see that the use of the global corner coarse grid correction helps to
make the eigenvalue distribution more compact, which results in better
convergence as shown in Figure 6.14(e). Note that the interface system of the
FETI-DP method has a dimension of 1,360, whereas those of the FETI method and
the cement element method are 1,472 and 3,160, respectively.
Table 6.1
Comparison of the Active Reflection Coefficient in Terms of the Relative L2 Norm Error, Using the
Result of the FETI-DP Method with 2LM as Reference, for a 10 × 10 Vivaldi Antenna Array,
Simulated at 3 GHz, with the Main Beam Steered to $\theta = 0^\circ$ and $\phi = 0^\circ$
LM-based FETI-DP (with conformal mesh)        $1.48 \times 10^{-5}$    $2.11 \times 10^{-4}$    $8.80 \times 10^{-5}$
CE-based FETI-DP (with conformal mesh)        $1.23 \times 10^{-5}$    $5.36 \times 10^{-5}$    $3.12 \times 10^{-5}$
LM-based FETI-DP (with nonconformal mesh)     $8.95 \times 10^{-4}$    $2.11 \times 10^{-3}$    $1.20 \times 10^{-3}$
CE-based FETI-DP (with nonconformal mesh)     $1.15 \times 10^{-3}$    $1.95 \times 10^{-3}$    $1.35 \times 10^{-3}$
Source: [15].
Table 6.2
Computational Information of the Nonconformal FETI-DP Method for Simulating Various Vivaldi
Antenna Arrays
Figure 6.17 Simulation of the 100 × 100 Vivaldi antenna array at 3 GHz. (a) Convergence history. (b)
Broadside scan E-plane relative pattern. (c) Broadside scan H-plane relative pattern.
©2014 IEEE [15].
Figure 6.18 Computation time as a function of the total number of unknowns for various Vivaldi
antenna arrays. ©2014 IEEE [15].
An antenna array is a typical case where the outgoing wave may propagate towards
the truncation boundary in an oblique direction. If this is the case, no matter how
far away a planar ABC is placed, its absorption is limited and the artificial
reflection may not be reduced to a desired level. To effectively reduce the artificial
reflection, we can employ an oblique ABC given by [28]
$$\hat{n} \times (\nabla \times \mathbf{E}) = -jk_0 \cos\theta_s \, \hat{n} \times (\hat{n} \times \mathbf{E}) + (jk_0 / \cos\theta_s)\, \hat{t} (\hat{t} \cdot \mathbf{E}) \tag{6.74}$$
where $\hat{t} = (\hat{s} \times \hat{n}) \sin\theta_s \cos\phi_s + \hat{s} \sin\theta_s \sin\phi_s$ and $\hat{n}$ denotes the outward unit
normal vector of the planar truncation surface. The angle for perfect absorption of
this ABC can be tuned by the parameters $\theta_s$ and $\phi_s$. Obviously, (6.74) is reduced to
the conventional ABC if $\theta_s = 0^\circ$. Its reflection coefficients for the perpendicular
(E) and parallel (H) polarizations can be derived as
$$R_\perp = \frac{\cos\theta - \cos\theta_s}{\cos\theta + \cos\theta_s}, \qquad R_{\parallel} = \frac{\cos\theta_s - \cos\theta}{\cos\theta_s + \cos\theta} \tag{6.75}$$
If the outgoing wave under simulation propagates towards the truncation boundary
at a certain specified angle, for example, the direction of the main beam of the
radiated wave is specified, we can always tune this ABC to minimize the reflection
error. Figure 6.19 compares the absorption performance of the conventional and
oblique ABCs over a range of incident angles.
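The effect of tuning can be reproduced with a short calculation. The sketch below evaluates the two reflection coefficients of (6.75), read here as $(\cos\theta - \cos\theta_s)/(\cos\theta + \cos\theta_s)$ for the perpendicular polarization and its negative for the parallel polarization; that reading of the signs, the function name, and the sample angles are assumptions of this example.

```python
import math


def oblique_abc_reflection(theta_deg, theta_s_deg):
    """Reflection coefficients (perpendicular, parallel) of the oblique ABC.

    theta_deg   : incident angle measured from the surface normal, in degrees
    theta_s_deg : angle the ABC is tuned to, in degrees (0 gives the conventional ABC)
    """
    c = math.cos(math.radians(theta_deg))
    cs = math.cos(math.radians(theta_s_deg))
    r_perp = (c - cs) / (c + cs)
    r_par = (cs - c) / (cs + c)
    return r_perp, r_par


# Conventional ABC (tuned to normal incidence) seen by a wave arriving at 60 degrees.
r_perp_conv, r_par_conv = oblique_abc_reflection(60.0, 0.0)

# Oblique ABC tuned to 60 degrees, seen by the same wave.
r_perp_tuned, r_par_tuned = oblique_abc_reflection(60.0, 60.0)

# Sweep of incident angles, e.g. for plotting the magnitude against angle.
sweep = [(t, oblique_abc_reflection(t, 60.0)) for t in range(0, 90, 5)]
```

At the tuned angle both coefficients vanish, whereas the conventional ABC reflects with a magnitude of one third at 60 degrees.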
To investigate the performance of the oblique ABC, a 20 × 20 Vivaldi antenna
array is considered. For the mesh truncation of the upper half space, we have two
setups. One is a hemispherical surface with a base radius of $7\lambda$, whereas the other
is a rectangular surface placed $1\lambda$ away from both the top and the side of the
antenna array. The size of the rectangular box is $8.8\lambda \times 9.2\lambda \times 1.33\lambda$. Apparently, the
second setup is computationally more efficient than the first one because its
computational domain is much smaller. However, in the second setup, the radiated
field will be incident on the top truncation surface at a much larger angle than in
the first one if the antenna array is set to radiate away from broadside. In this case,
the oblique ABC can provide a good absorption performance while minimizing the
size of the computational domain. The 20 × 20 Vivaldi antenna array is simulated
at 3.0 GHz using: (1) the conventional ABC with the hemispherical truncation
surface; (2) the conventional ABC with the rectangular truncation surface; and (3)
the oblique ABC with the rectangular truncation surface for the main beam $(\theta_s, \phi_s)$
steered to $(60^\circ, 0^\circ)$. The near-zone field distributions in the x-z plane are plotted in
Figure 6.20. We take the result of Case 1 shown in Figure 6.20(a) as the reference
solution and enlarge the portion close to the antenna array in Figure 6.20(b) for a
better comparison with the results of Cases 2 and 3, which are shown in Figures
6.20(c) and 6.20(d). For the case of $(\theta_s, \phi_s) = (60^\circ, 0^\circ)$, Case 3 yields a visually
much better result than does Case 2, as shown in Figures 6.20(c) and 6.20(d). The
far-field radiation patterns calculated in the three cases above are compared in
Figure 6.21, which shows that the result of Case 2 deviates from the reference
solution by 3 dB whereas the result of Case 3 has a much smaller deviation. For
Cases 2 and 3, it takes 9.2 minutes to finish the simulation of one frequency point
on one computational node which contains 16 Intel Xeon 2.70-GHz processors.
Both cases are computed using the conformal FETI-DP method with 2LM. The
result of the reference case (Case 1) is obtained using the hybrid nonconformal
FETI/conformal FETI-DP method described in Section 6.6 with 43.5 minutes for
one frequency on the same node.
Figure 6.19 Comparison of the reflection coefficients of the conventional ABC and the oblique ABC
(tuned to $\theta_s = 60^\circ$ and $\phi_s = 0^\circ$) for the perpendicular (E) and parallel (H) polarizations.
©2014 Wiley [28].
Figure 6.20 Re(E) for the 20 × 20 Vivaldi antenna array in the x-z plane at 3.0 GHz with steering
angle set at $(\theta_s, \phi_s) = (60^\circ, 0^\circ)$. (a) Computed using the conventional ABC with a
hemispherical truncation surface. (b) Same as (a), but plotted in a limited region for the
purpose of comparison. (c) Computed using the conventional ABC with a rectangular
truncation surface. (d) Computed using the oblique ABC with a rectangular truncation
surface. ©2014 Wiley [28].
Figure 6.21 Copolarized radiation patterns for the 20 × 20 Vivaldi antenna array in the x-z plane at
3.0 GHz when the main beam is steered to $(\theta_s, \phi_s) = (60^\circ, 0^\circ)$. ©2014 Wiley [28].
Figure 6.22 An 11 × 11 NRL antenna array. (a) Measurement setup in anechoic chamber. (b) Shape
of the metal on the middle layer. (c) Shape of the metal on the top (bottom) layer.
©2014 IEEE [27].
At 3.02 GHz, the hemispherical radome has a base radius of $5.5\lambda$. The
thickness and the relative permittivity of the radome are $0.1\lambda$ and $\varepsilon_r = 2.0 - j1.0$,
respectively. The conventional first-order ABC is used on a hemispherical surface
placed $1\lambda$ away from the exterior boundary of the hemispherical radome. In this
case, the first-order ABC is a better choice because the truncation surface can be
made conformal to the radome to reduce the size of the computational domain. In
addition, it provides good absorption for waves radiating along any direction.
Figure 6.23 shows the radiation patterns of the array with and without the radome.
All radiation patterns are normalized by the value in the maximum radiation
direction of the array without the radome. It can be seen that due to the loss of the
radome, the emitted power in the main beam direction is reduced by around 3 dB.
The result using conformal meshes on the interregion interfaces is also plotted for
comparison. Apparently, using nonconformal interregion interface meshes does not
sacrifice the accuracy of the solution since the two sets of data lie on top of each
other. The field distribution is also plotted in Figure 6.24 for the cases with and
without the radome.
Figure 6.23 Comparison between the radiation patterns for the array with and without the radome at
3.02 GHz and steering angle $\theta_s = 60^\circ$ and $\phi_s = 0^\circ$. ©2014 IEEE [27].
Figure 6.24 |E| in the $\phi = 0^\circ$ plane for H-pol excitation at 3.02 GHz and steering angle $\theta_s = 60^\circ$ and
$\phi_s = 0^\circ$. (a) The NRL array itself. (b) The NRL array with a radome.
For the simulation of these two examples, the region containing the 11 × 11
NRL array is decomposed into 256 subdomains, and the interregion interface is
placed just above the top of the antenna elements. The other region containing the
radome is meshed by CUBIT and further decomposed into 200 subdomains by
METIS. This array-radome example involves 11,403,519 unknowns, 1,482,864
dual unknowns, and 18,552 corner unknowns. Finally, the convergence history of
the iterative solution of the global interface problem for the array with the radome
is given in Figure 6.25. It should be noted that for large-scale problems, the
nonconformal meshes on the interfaces between different regions may introduce some
numerical resonance and yield slower convergence than a conformal mesh does.
Figure 6.25 Convergence history of the iterative solution of the global interface problem for the
NRL array with the radome.
In the last two examples, we apply the FETI-DP method with the SOTC-TE to the
analysis of computationally complex optical devices and compare its numerical
performance to that of the FETI-DP method with the FOTC. In the first example,
we simulate the TMz mode in a microring resonator (MRR) shown in Figure 6.26
for the purpose of validation. This structure is invariant along the direction
perpendicular to the page, and thus it can be modeled as a 2-D problem to validate
the 3-D solution. In the 3-D simulation, the ring/bus structure is assumed to have a
finite thickness and a perfectly conducting plane is placed at the top and bottom. If
the thickness is smaller than one-half of a wavelength, a vertically invariant field
can be preserved in the 3-D configuration. The MRR plays as a bandstop filter. To
enhance reflection at one of the resonant frequencies, a first-order grating is
employed along the inner circle of the upper half MRR [16]. If the ring lies in the
x-y plane, the parametric function for the inner radius of the ring is given by
272 Advanced Computational Electromagnetic Methods and Applications
x( ) (r1 sin 2m ) cos , y( ) (r1 sin 2m )sin for 0
(6.76)
x( ) r1 cos , y( ) r1 sin for 2
where $r_1 = 8.267$ μm is the inner radius, $\delta = 8.3 \times 10^{-3}$ μm is the grating size, and
$m = 58$ is the azimuthal order. The outer radius parametric function is given by
$x(\phi) = r_2 \cos\phi$ and $y(\phi) = r_2 \sin\phi$ for $0 \le \phi \le 2\pi$ with $r_2 = 8.68$ μm. The coupling
gap between the ring and the bus waveguide is set to $g = 0.248$ μm. The relative
permittivities of the core and the cladding are 4.0 and 1.0, respectively. The bus
waveguide is placed parallel to the x-axis. Around the wavelength $\lambda = 1{,}550$ nm, the
guided mode in the bus waveguide has the electric field profile in the core and in
the cladding given by
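$$
\mathbf{E}(x, y) =
\begin{cases}
\hat{z}\cos[k_y(y - y_0)]\exp(-j\beta x), & |y - y_0| \le \dfrac{d}{2} \\[4pt]
\hat{z}\cos\!\left(\dfrac{k_y d}{2}\right)\exp\!\left[-\alpha\left(|y - y_0| - \dfrac{d}{2}\right)\right]\exp(-j\beta x), & |y - y_0| > \dfrac{d}{2}
\end{cases}
\tag{6.77}
$$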
where $\beta = 6.8355$ μm$^{-1}$ is the propagation constant, $k_y = 4.3593$ μm$^{-1}$ and
$\alpha = 5.5039$ μm$^{-1}$ describe the y-dependence, $d = 0.413$ μm is the waveguide
width, and $y_0 = 9.135$ μm denotes the center of the bus waveguide.
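The quoted mode parameters can be cross-checked against the standard slab-waveguide relations. The sketch below is only a consistency check, not part of the FETI-DP solver: it assumes the usual separation relations (core: propagation constant squared plus transverse wavenumber squared equals the core permittivity times the free-space wavenumber squared; cladding: propagation constant squared minus decay constant squared equals the cladding permittivity times the free-space wavenumber squared) and the even-mode condition for an electric field parallel to the slab. The function names are hypothetical.

```python
import math

# Values quoted in the text (lengths in micrometers, wavenumbers in 1/micrometer).
WAVELENGTH = 1.550
EPS_CORE, EPS_CLAD = 4.0, 1.0
BETA, K_Y, ALPHA = 6.8355, 4.3593, 5.5039
D = 0.413    # waveguide width
Y0 = 9.135   # center of the bus waveguide as quoted in the text


def relative_mismatch(a, b):
    """Relative difference between two positive numbers."""
    return abs(a - b) / abs(b)


def mode_checks():
    """Return relative mismatches of three textbook slab-waveguide relations."""
    k0 = 2.0 * math.pi / WAVELENGTH
    return {
        # Separation relation in the core.
        "core": relative_mismatch(BETA**2 + K_Y**2, EPS_CORE * k0**2),
        # Separation relation in the cladding.
        "cladding": relative_mismatch(BETA**2 - ALPHA**2, EPS_CLAD * k0**2),
        # Even-mode condition (assumed): k_y * tan(k_y * d / 2) = alpha.
        "eigen": relative_mismatch(K_Y * math.tan(K_Y * D / 2.0), ALPHA),
    }


def ez_profile(y):
    """Transverse dependence of E_z from (6.77), without the exp(-j*beta*x) factor."""
    t = abs(y - Y0)
    if t <= D / 2.0:
        return math.cos(K_Y * t)  # cosine is even, so this equals cos(k_y (y - y0))
    return math.cos(K_Y * D / 2.0) * math.exp(-ALPHA * (t - D / 2.0))


if __name__ == "__main__":
    for name, value in mode_checks().items():
        print(f"{name}: relative mismatch {value:.2e}")
```

With the rounded values in the text, the two separation relations agree to better than one part in a thousand and the even-mode condition to better than one percent, which is about what four or five significant digits allow.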
Figure 6.26 (a) Top view of a ring/bus structure modeled with CUBIT. The ring has the first-order
grating along the upper half on the inner side. (b) Enlarged view of the rectangular
region in Figure 6.26(a) showing the grating. © Optical Society of America 2014 [16].
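As an illustration of the parametrization in (6.76), the following sketch samples the inner boundary of the ring. It assumes that the grating perturbation is added to the inner radius as r1 + δ sin(2mφ) on the upper half (0 ≤ φ ≤ π) and that the lower half is a plain circle; the sign of the perturbation and the name `inner_boundary` are assumptions made for this example.

```python
import math

R1 = 8.267      # inner radius, micrometers
DELTA = 8.3e-3  # grating size, micrometers
M = 58          # azimuthal order


def inner_boundary(phi):
    """Return (x, y) on the inner boundary for an angle phi in [0, 2*pi]."""
    if 0.0 <= phi <= math.pi:
        radius = R1 + DELTA * math.sin(2 * M * phi)  # corrugated upper half
    else:
        radius = R1                                  # smooth lower half
    return radius * math.cos(phi), radius * math.sin(phi)


def grating_periods_on_upper_half():
    """sin(2*m*phi) has period pi/m, so the upper half holds m periods."""
    return round(math.pi / (math.pi / M))


if __name__ == "__main__":
    crest = math.hypot(*inner_boundary(math.pi / (4 * M)))
    print(f"largest radius on the grating: {crest:.4f} um")
    print(f"grating periods on the upper half: {grating_periods_on_upper_half()}")
```

Under these assumptions the radius oscillates between r1 − δ and r1 + δ over 58 periods on the upper half, which is the fine corrugation visible in the enlarged view of Figure 6.26(b).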
For the simulation, the entire ring/bus structure is enclosed by a box, whose
dimensions are 24.8 µm, 24.8 µm, and 0.27 µm in the Cartesian coordinate system.
Except for the top and bottom boundaries, all four sides of the computational
domain are truncated by the PML, which has a thickness of 0.827 μm on each side.
The core and cladding regions are discretized by a mesh size of 0.0744 µm and
0.149 µm, respectively. As a result, there are at least six layers of elements in the
PML for each side truncation. The entire structure is excited by the mode described
in (6.77) through a current sheet located on the bus. Two contradirectional waves
are excited from the sheet and only the forward wave that bypasses the ring is of
interest. Accordingly, the reference planes for reflection and transmission
coefficient calculation are placed on the left and right sides of the current sheet, as
shown in Figure 6.26. After being meshed with CUBIT, the entire computational
domain is decomposed into $N_s = 512$ subdomains, involving 7,522,572 unknowns,
889,488 dual unknowns, and 9,880 corner unknowns when the second-order
hierarchical vector basis functions [43] are employed. The simulation is carried out
from $1.924 \times 10^{14}$ to $1.944 \times 10^{14}$ Hz and the computed reflection and transmission
coefficients are plotted in Figure 6.27 as functions of frequency. To validate the
3-D simulation result, a 2-D simulation is carried out by COMSOL Multiphysics
[44] using the same geometry and material setup in the x-y plane. The two sets of
results are in excellent agreement, as shown in Figure 6.27. The field distribution
in the $z = 0$ μm plane is plotted in Figure 6.28 for $1.9246 \times 10^{14}$ and
$1.9355 \times 10^{14}$ Hz. The standing-wave field profile in the ring shown in Figure 6.28
indicates that the MRR has a stronger resonance at $1.9355 \times 10^{14}$ Hz. Most of the
energy guided on the bus is reflected by the grating in the ring instead of being
delivered to the receiving port. Usually, an iterative solver takes more iterations to
converge at resonant frequencies because the matrix is more ill-conditioned. This
prediction can be verified by the convergence history of the iterative solution of
the global interface problem given in Figure 6.29, which also shows that the FETI-
DP method with the SOTC-TE effectively overcomes the convergence difficulty
encountered with the FOTC when a PML mesh truncation is employed.
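The discretization figures quoted above can be related to one another with a few lines of arithmetic. The sketch below is a back-of-the-envelope check under two assumptions that the text does not state explicitly: the mesh density is measured against the wavelength inside each material at 1,550 nm, and the PML layer count is estimated from the coarser (cladding) mesh size. All function names are hypothetical.

```python
import math

C0 = 299_792_458.0  # speed of light in vacuum, m/s


def wavelength_nm(freq_hz):
    """Free-space wavelength in nanometers for a frequency in hertz."""
    return C0 / freq_hz * 1e9


def elements_per_wavelength(wavelength_um, rel_permittivity, mesh_um):
    """Number of mesh elements per wavelength inside a material."""
    return wavelength_um / math.sqrt(rel_permittivity) / mesh_um


def pml_layers(thickness_um, mesh_um):
    """Smallest whole number of element layers that spans the PML thickness."""
    return math.ceil(thickness_um / mesh_um)


if __name__ == "__main__":
    print(f"sweep: {wavelength_nm(1.944e14):.1f} nm to {wavelength_nm(1.924e14):.1f} nm")
    print(f"core:     {elements_per_wavelength(1.55, 4.0, 0.0744):.1f} elements per wavelength")
    print(f"cladding: {elements_per_wavelength(1.55, 1.0, 0.149):.1f} elements per wavelength")
    print(f"PML layers (cladding mesh): {pml_layers(0.827, 0.149)}")
```

Under these assumptions both materials are meshed at roughly ten elements per local wavelength, the frequency sweep brackets the 1,550 nm design wavelength, and a 0.827 μm PML needs six layers of the coarser elements.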
Figure 6.27 Power reflection and transmission coefficients of the ring/bus structure shown in Figure
6.26 from $1.924 \times 10^{14}$ Hz to $1.944 \times 10^{14}$ Hz. © Optical Society of America 2014 [16].
Figure 6.28 Re($E_z$) in the $z = 0$ μm plane. (a) At $1.9246 \times 10^{14}$ Hz. (b) At $1.9355 \times 10^{14}$ Hz. © Optical
Society of America 2014 [16].
Figure 6.29 Convergence history of the iterative solution of the global interface problem using the
FETI-DP method with the SOTC-TE and the FOTC. © Optical Society of America 2014
[16].
Next, we examine the parallel efficiency of the FETI-DP method with the
SOTC-TE by solving the bus/ring resonator problem using various numbers of
processors. Table 6.3 shows the time for preprocessing, the time for solving the
interface problem, and the total computation time. Based on the computation times
in Table 6.3, the speed-up, which is defined with respect to the wall-clock time
using four processors as
$$
\text{Speed-up} = \frac{T_4}{T_{N_p}} \tag{6.78}
$$
is plotted in Figure 6.30. Note that $T_{N_p}$ is the total wall-clock time using $N_p$
processors. As can be seen, an excellent speed-up has been achieved using up to
128 processors. With 128 processors employed, the peak memory usage is 55.7
GB.
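The definition in (6.78) can be turned into a small helper that also reports parallel efficiency relative to the four-processor reference. The timings in the usage part are placeholders invented for illustration, not the entries of Table 6.3, and the function names are hypothetical.

```python
def speed_up(t_ref, t_np):
    """Speed-up as in (6.78): reference wall-clock time over time on N_p processors."""
    return t_ref / t_np


def parallel_efficiency(t_ref, n_ref, t_np, n_p):
    """Measured speed-up divided by the ideal speed-up n_p / n_ref."""
    return speed_up(t_ref, t_np) / (n_p / n_ref)


if __name__ == "__main__":
    # Hypothetical wall-clock times in seconds, keyed by processor count.
    timings = {4: 3200.0, 16: 820.0, 64: 215.0, 128: 112.0}
    t4 = timings[4]
    for n_p, t in sorted(timings.items()):
        print(n_p, round(speed_up(t4, t), 2), round(parallel_efficiency(t4, 4, t, n_p), 3))
```

Because the reference run uses four processors, the ideal speed-up on 128 processors is 32 rather than 128, which is the line against which a plot such as Figure 6.30 should be read.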
Table 6.3
Computation Times for the Bus/Ring Resonator Problem with 7,522,572 Unknowns and 512
Subdomains
Figure 6.30 Parallel speed-up versus the number of processors with $N_s = 512$. The computation time
using four processors is taken as the reference. © Optical Society of America 2014 [16].
the same format as that in (6.76) for the circle with grating, except that in this
example the grating size is $\delta = 0.1$ μm and the azimuthal order is $m = 200$. The
bus waveguide, which has a width $d = 1$ μm, is placed 345 nm away from the outer
ring at the closest point. The two rings and the bus waveguide have the same
thickness $t = 0.3756$ μm in the vertical direction. The refractive indices are given
by $n_{core} = 1.977$ for the cores of the two rings and the bus waveguide, which are
made of Si3N4, and $n_{cladding} = 1.437$ for the cladding, which is made of SiO2,
respectively [16].
Figure 6.31 (a) Top view of an ECDMRR/bus structure modeled with CUBIT. The inner ring has a
first-order grating along the half on its outer side. (b) Enlarged view of the rectangular
region in Figure 6.31(a). © Optical Society of America 2014 [16].
For the simulation, the entire double ring/bus structure is enclosed by a box-shaped
computational domain, whose dimensions are 66.66 μm, 67.895 μm, and
5.2 μm in the Cartesian coordinate system. All of the six exterior boundaries are
truncated by the PML, which has a thickness of 1.2 μm in each direction. The total
computational volume, which is about $1.9 \times 10^{4} (\lambda / n_{avg})^{3}$ for $n_{avg} = 1.442$ and
Figure 6.32 Power reflection and transmission coefficients of the full-scale ECDMRR. (a)
Comparison between the measured and simulated results. (b) Comparison with the
simulated result shifted by 0.78 nm. © Optical Society of America 2014 [16].
Figure 6.33 Snapshot of $|\mathbf{E}|$ in the $z = 0$ μm plane. (a) With $\lambda = 1{,}550.93$ nm. (b) Enlarged view of
the field near the gap between the bus and the ECDMRR at $\lambda = 1{,}550.93$ nm. (c) With
$\lambda = 1{,}549.14$ nm. (d) Enlarged view of the field near the gap between the bus and the
ECDMRR at $\lambda = 1{,}549.14$ nm. © Optical Society of America 2014 [16].
6.8 SUMMARY
versions construct a global corner system that relates the fields at the crosspoints
between the subdomains through a Dirichlet continuity condition. This corner
system provides a coarse grid correction to improve the convergence of an iterative
solution by propagating residual errors globally in each iteration.
The algorithms described in this chapter expand the capability and improve
the performance of the FETI-DP method by: (1) lifting the requirement of
conformal meshes on the subdomain interface; (2) speeding up the convergence
rate of the iterative solution of the global interface problem; and (3) incorporating
appropriate truncation boundaries for more accurate results. First, we formulated
two nonconformal FETI-DP methods, both of which implement the Robin-type
transmission condition at the subdomain interfaces. One nonconformal method
extends the conformal FETI-DP algorithm, which is based on two Lagrange
multipliers, to deal with nonconformal interface and corner meshes, whereas the
other one employs cement elements on the interface, combines the global primal
unknowns with the global dual unknowns, and extracts the corner unknowns to
formulate a global coarse problem. Second, we implemented higher-order
transmission conditions in the Lagrange multiplier-based FETI-DP method for
faster convergence because higher-order transmission conditions can transmit both
transverse-electric and transverse-magnetic evanescent modes in addition to the
propagating modes. Furthermore, when perfectly matched layers (PMLs) are used
for mesh truncation, higher-order transmission conditions become more critical for a
converged result. Third, for multiregion electromagnetic problems, we developed a
hybrid method that employs the finite element tearing and interconnecting (FETI)
method to deal with mesh-nonconformal and/or geometry-nonconformal interfaces
between regions and the FETI-DP method for mesh-conformal and geometry-
conformal interfaces inside each region. We formulated a unified global system of
equations for the interface unknowns from both nonconformal and conformal
interfaces. In the formulation, we applied higher-order transmission conditions and
a generalized crosspoint correction technique to improve the convergence and
ensure a correct interconnection across subdomain interfaces. Fourth, we described
an oblique absorbing boundary condition and applied it to the FETI-DP method for
simulating large finite antenna arrays. This boundary condition can be tuned to be
reflectionless for all frequencies and polarizations as long as the main beam of the
radiated wave is specified. Finally, we presented numerical results for the
simulation of wave propagation, finite antenna arrays, photonic crystal cavities,
and optical devices to demonstrate the application, accuracy, efficiency, and
capability of these algorithms.
REFERENCES
[1] J. Jin, The Finite Element Method in Electromagnetics, 3rd ed., New York: Wiley, 2014.
[2] J. Jin and D. Riley, Finite Element Analysis of Antennas and Arrays, New York: Wiley, 2008.
[3] C. Farhat and F. Roux, “A Method of Finite Element Tearing and Interconnecting and its
Parallel Solution Algorithm,” Int. J. Numer. Meth. Eng., Vol. 32, No. 6, pp. 1205–1227, 1991.
[4] C. Farhat, A. Macedo, M. Lesoinne, F. Roux, and F. Magoulès, “Two-Level Domain
Decomposition Methods with Lagrange Multipliers for the Fast Iterative Solution of Acoustic
Scattering Problems,” Comput. Methods Appl. Mech. Eng., Vol. 184, No. 2–4, pp. 213–239,
2000.
[5] C. Farhat, M. Lesoinne, P. LeTallec, K. Pierson, and D. Rixen, “FETI-DP: A Dual-Primal
Unified FETI Method—part I: A Faster Alternative to the Two-Level FETI Method,” Int. J.
Numer. Meth. Eng., Vol. 50, No. 7, pp. 1523–1544, 2001.
[6] C. Farhat, J. Li, and P. Avery, “A FETI-DP Method for the Parallel Iterative Solution of
Indefinite and Complex-Valued Solid and Shell Vibration Problems,” Int. J. Numer. Meth. Eng.,
Vol. 63, No. 3, pp. 398–427, 2005.
[7] C. Farhat, P. Avery, R. Tezaur, and J. Li, “FETI-DPH: A Dual-Primal Domain Decomposition
Method for Acoustic Scattering,” J. Comput. Acoust., Vol. 13, No. 3, pp. 499–524, 2005.
[8] Y. Li and J. Jin, “A Vector Dual-Primal Finite Element Tearing and Interconnecting Method for
Solving 3D Large-Scale Electromagnetic Problems,” IEEE Trans. Antennas Propag., Vol. 54,
No. 10, pp. 3000–3009, 2006.
[9] Y. Li and J. Jin, “A New Dual-Primal Domain Decomposition Approach for Finite Element
Simulation of 3D Large-Scale Electromagnetic Problems,” IEEE Trans. Antennas Propag., Vol.
55, No. 10, pp. 2803–2810, 2007.
[10] Y. Li and J. Jin, “Implementation of the Second-Order ABC in the FETI-DPEM Method for 3D
EM Problems,” IEEE Trans. Antennas Propag., Vol. 56, No. 8, pp. 2765–2769, 2008.
[11] Y. Li and J. Jin, “Parallel Implementation of the FETI-DPEM Algorithm for General 3D EM
Simulations,” J. Comput. Phys., Vol. 228, No. 9, pp. 3255–3267, 2009.
[12] A. Toselli and O. Widlund, Domain Decomposition Methods: Algorithms and Theory, Berlin:
Springer-Verlag, 2005.
[13] T. Mathew, Domain Decomposition Methods for the Numerical Solution of Partial Differential
Equations, Berlin: Springer-Verlag, 2008.
[14] M. Xue and J. Jin, “Application of a Nonconformal FETI-DP Method in Antenna Array
Simulations,” IEEE APS Int. Symp. Dig., pp. 1–2, 2012.
[15] M. Xue and J. Jin, “Nonconformal FETI-DP Methods for Large-Scale Electromagnetic
Simulation,” IEEE Trans. Antennas Propag., Vol. 60, No. 9, pp. 4291–4305, 2012.
[16] M. Xue, Y. Kang, A. Arbabi, S. McKeown, L. Goddard, and J. Jin, “Fast and Accurate Finite
Element Analysis of Large-Scale Three-Dimensional Photonic Devices with a Robust Domain
Decomposition Method,” Opt. Express, Vol. 22, No. 4, pp. 4437–4452, 2014.
[17] M. Gander, F. Magoulès, and F. Nataf, “Optimized Schwarz Methods without Overlap for the
Helmholtz Equation,” SIAM J. Sci. Comput., Vol. 24, No. 1, pp. 38–60, 2003.
[18] M. Gander, L. Halpern, and F. Magoulès, “An Optimized Schwarz Method with Two-Sided
Robin Transmission Conditions for the Helmholtz Equation,” Int. J. Numer. Meth. Fluids, Vol.
55, No. 2, pp. 163–175, 2007.
[19] M. Gander and F. Kwok, “Best Robin Parameters for Optimized Schwarz Methods at Cross
Points,” SIAM J. Sci. Comput., Vol. 34, No. 4, pp. 1849–1879, 2010.
[20] P. Collino, G. Delbue, P. Joly, and A. Piacentini, “A New Interface Condition in the Non-
Overlapping Domain Decomposition for the Maxwell Equations,” Comput. Methods Appl. Mech.
Eng., Vol. 148, No. 1–2, pp. 195–207, 1997.
[21] A. Alonso-Rodriguez and L. Gerardo-Giorda, “New Nonoverlapping Domain Decomposition
Methods for the Harmonic Maxwell System,” SIAM J. Sci. Comput., Vol. 28, No. 1, pp. 102–122,
2006.
[22] V. Dolean, M. Gander, and L. Gerardo-Giorda, “Optimized Schwarz Methods for Maxwell’s
Equations,” SIAM J. Sci. Comput., Vol. 31, No. 3, pp. 2193–2213, 2009.
[23] Z. Peng, V. Rawat, and J. Lee, “One Way Domain Decomposition Method with Second Order
Transmission Conditions for Solving Electromagnetic Wave Problems,” J. Comput. Phys., Vol.
229, No. 4, pp. 1181–1197, 2010.
[24] Z. Peng and J. Lee, “Non-Conformal Domain Decomposition Method with Second-Order
Transmission Conditions for Time-Harmonic Electromagnetics,” J. Comput. Phys., Vol. 229, No.
8, pp. 5615–5629, 2010.
[25] Z. Peng and J. Lee, “Non-Conformal Domain Decomposition Method with Mixed True Second
Order Transmission Condition for Solving Large Finite Antenna Arrays,” IEEE Trans. Antennas
Propag., Vol. 59, No. 5, pp. 1638–1651, 2011.
[26] M. Xue and J. Jin, “A Hybrid Nonconformal FETI/Conformal FETI-DP Method for Arbitrary
Nonoverlapping Domain Decomposition Modeling,” IEEE APS Int. Symp. Dig., pp. 1628–1629,
2013.
[27] M. Xue and J. Jin, “A Hybrid Conformal/Nonconformal Domain Decomposition Method for
Multi-Region Electromagnetic Modeling,” IEEE Trans. Antennas Propag., Vol. 62, No. 4, pp.
2009–2021, 2014.
[28] M. Xue and J. Jin, “Application of an Oblique Absorbing Boundary Condition in the Finite
Element Simulation of Phased-Array Antennas,” Microwave Opt. Technol. Lett., Vol. 56, No. 1,
pp. 178–184, 2014.
[29] B. Després, P. Joly, and J. Roberts, “A Domain Decomposition Method for the Harmonic
Maxwell Equations,” Iterative Methods in Linear Algebra, North-Holland, Amsterdam, pp.
475–484, 1992.
[30] B. Stupfel, “A Fast-Domain Decomposition Method for the Solution of Electromagnetic
Scattering by Large Objects,” IEEE Trans. Antennas Propag., Vol. 44, No. 10, pp. 1375–1385,
1996.
[31] B. Stupfel and M. Mognot, “A Domain Decomposition Method for the Vector Wave Equation,”
IEEE Trans. Antennas Propag., Vol. 48, No. 5, pp. 653–660, 2000.
[32] F. Roux, “A FETI-2LM Method for Non-Matching Grids,” Lecture Notes Comput. Sci. Eng., Vol.
70, pp. 121–128, 2009.
[33] Y. Achdou, C. Japhet, Y. Maday, and F. Nataf, “A New Cement to Glue Nonconforming Grids
with Robin Interface Conditions: The Finite Volume Case,” Numer. Math., Vol. 92, pp. 593–620,
2002.
[34] S. Lee, M. Vouvakis, and J. Lee, “A Non-Overlapping Domain Decomposition Method with
Non-Matching Grids for Modeling Large Finite Antenna Arrays,” J. Comput. Phys., Vol. 203,
No. 1, pp. 1–21, 2005.
[35] M. Vouvakis, Z. Cendes, and J. Lee, “A FEM Domain Decomposition Method for Photonic and
Electromagnetic Band Gap Structures,” IEEE Trans. Antennas Propag., Vol. 54, pp. 721–733,
2006.
[36] K. Zhao, V. Rawat, S. Lee, and J. Lee, “A Domain Decomposition Method with Nonconformal
Meshes for Finite Periodic and Semi-Periodic Structures,” IEEE Trans. Antennas Propag., Vol.
55, No. 9, pp. 2559–2570, 2007.
[37] Z. Lu, X. An, and W. Hong, “A Fast Domain Decomposition Method for Solving Three-
Dimensional Large-Scale Electromagnetic Problems,” IEEE Trans. Antennas Propag., Vol. 56,
No. 8, pp. 2200–2210, 2008.
[38] V. Rawat, “Finite Element Domain Decomposition with Second Order Transmission Condition
for Time Harmonic Electromagnetic Problem,” Ph.D. Thesis, The Ohio State University, 2009.
[39] J. Ma, J. Jin, and Z. Nie, “A Nonconformal FEM-DDM with Tree-Cotree Splitting and Improved
Transmission Condition for Modeling Subsurface Detection Problems,” IEEE Trans. Geosci.
Remote Sens., Vol. 52, No. 1, pp. 355–364, 2014.
[40] CUBIT, available online https://cubit.sandia.gov/.
[41] METIS, Serial graph partitioning and fill-reducing matrix ordering, available online
http://glaros.dtc.umn.edu/gkhome/metis/metis/overview.
[42] M. Xue, J. Jin, S. Wong, C. Macon, and M. Kragalott, “Experimental Validation of the FETI-
DPEM Algorithm for Simulating Phased-Array Antennas,” IEEE APS Int. Symp. Dig., pp.
2495–2498, 2011.
[43] J. Webb, “Hierarchical Vector Basis Functions of Arbitrary Order for Triangular and Tetrahedral
Elements,” IEEE Trans. Antennas Propag., Vol. 47, No. 8, pp. 1244–1253, 1999.
[44] COMSOL Multiphysics ver. 4.2, available online http://www.comsol.com/.
Chapter 7
High-Accuracy Computations for
Electromagnetic Integral Equations
Andrew F. Peterson and Malcolm M. Bibby
$$
f(s) \cong \sum_{n} \alpha_n B_n(s) \tag{7.2}
$$
$$
R(s) = Lf - g = \sum_{n} \alpha_n L B_n - g \tag{7.3}
$$
The minimum residual, in the least-squares sense, is produced by the set of
coefficients $\{\alpha_n\}$ that minimizes the expression
$$
\int |R(s)|^2 \, ds \cong \sum_{i} w_i \, |R(s_i)|^2 \tag{7.4}
$$
where $\{w_i, s_i\}$ are the weights and nodes of a quadrature rule.
In the context of the boundary residual method, (7.1) can be discretized into
an overdetermined system of equations by selecting more testing points than basis
functions. Each equation is weighted with the square root of the appropriate
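quadrature rule weight, to produce the M by N system [19–21]
$$
\begin{bmatrix}
\sqrt{w_1}\, LB_1\big|_{s_1} & \cdots & \sqrt{w_1}\, LB_N\big|_{s_1} \\
\vdots & & \vdots \\
\sqrt{w_M}\, LB_1\big|_{s_M} & \cdots & \sqrt{w_M}\, LB_N\big|_{s_M}
\end{bmatrix}
\begin{bmatrix} \alpha_1 \\ \vdots \\ \alpha_N \end{bmatrix}
=
\begin{bmatrix} \sqrt{w_1}\, g(s_1) \\ \vdots \\ \sqrt{w_M}\, g(s_M) \end{bmatrix}
\tag{7.5}
$$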
Then a least-squares solution to (7.5), using standard matrix library routines, will
minimize the residual in (7.4). As noted by Bunch and Grow, the more accurate
the quadrature rule, the better the residual will be minimized. For the results that
follow, we always employed Gauss-Legendre rules for weights and nodes, and
typically M/N = 2. We employ a discrete model of the target under consideration,
and apply the quadrature rule on a cell-by-cell basis when constructing (7.5).
As a byproduct of the least-squares solution, the residual error is easily
obtained at each quadrature node $\{s_i\}$ and used to compute the NRE
$$
\text{NRE} = \sqrt{\frac{\sum_i w_i \, |R(s_i)|^2}{\sum_i w_i \, |g(s_i)|^2}} \tag{7.6}
$$
The NRE exhibits excellent correlation with true error for problems where
exact solutions are available [4].
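The weighted overdetermined system and the normalized residual error can be illustrated on a toy problem. The sketch below is not the authors' solver: it assumes a made-up scalar operator (L f)(s) = f(s) + ∫₀¹ s t f(t) dt on the interval [0, 1], a monomial basis, a three-point Gauss-Legendre rule applied cell by cell, and a normal-equations least-squares solve that is adequate only for tiny, well-conditioned systems. The square root in the NRE is also an assumption of this sketch. All function names are hypothetical.

```python
import math

# Three-point Gauss-Legendre rule on [-1, 1].
GL_NODES = (-math.sqrt(3.0 / 5.0), 0.0, math.sqrt(3.0 / 5.0))
GL_WEIGHTS = (5.0 / 9.0, 8.0 / 9.0, 5.0 / 9.0)


def quadrature(cells):
    """Apply the rule on each cell (a, b); return nodes s_i and weights w_i."""
    nodes, weights = [], []
    for a, b in cells:
        half, mid = 0.5 * (b - a), 0.5 * (a + b)
        for x, w in zip(GL_NODES, GL_WEIGHTS):
            nodes.append(mid + half * x)
            weights.append(half * w)
    return nodes, weights


def apply_l_to_basis(n, s):
    """Toy operator applied to B_n(s) = s**n: s**n + s * integral_0^1 t**(n+1) dt."""
    return s**n + s / (n + 2)


def solve_least_squares(a_mat, b_vec):
    """Minimize ||A x - b|| via the normal equations and Gaussian elimination."""
    n = len(a_mat[0])
    aug = [
        [sum(row[i] * row[j] for row in a_mat) for j in range(n)]
        + [sum(row[i] * bi for row, bi in zip(a_mat, b_vec))]
        for i in range(n)
    ]
    for k in range(n):
        pivot = max(range(k, n), key=lambda r: abs(aug[r][k]))
        aug[k], aug[pivot] = aug[pivot], aug[k]
        for r in range(k + 1, n):
            factor = aug[r][k] / aug[k][k]
            for c in range(k, n + 1):
                aug[r][c] -= factor * aug[k][c]
    x = [0.0] * n
    for k in reversed(range(n)):
        tail = sum(aug[k][c] * x[c] for c in range(k + 1, n))
        x[k] = (aug[k][n] - tail) / aug[k][k]
    return x


def boundary_residual_solve(g, n_basis, cells):
    """Build the sqrt(w_i)-weighted M-by-N system, solve it, and return (alpha, NRE)."""
    nodes, weights = quadrature(cells)
    a_mat = [[math.sqrt(w) * apply_l_to_basis(n, s) for n in range(n_basis)]
             for s, w in zip(nodes, weights)]
    b_vec = [math.sqrt(w) * g(s) for s, w in zip(nodes, weights)]
    alpha = solve_least_squares(a_mat, b_vec)
    num = sum(w * (sum(a * apply_l_to_basis(n, s) for n, a in enumerate(alpha)) - g(s)) ** 2
              for s, w in zip(nodes, weights))
    den = sum(w * g(s) ** 2 for s, w in zip(nodes, weights))
    return alpha, math.sqrt(num / den)


if __name__ == "__main__":
    cells = [(0.0, 0.5), (0.5, 1.0)]  # two cells, M = 6 test points
    for n_basis in (2, 3):
        _, nre = boundary_residual_solve(lambda s: math.exp(s) + s, n_basis, cells)
        print(n_basis, nre)
```

For the excitation g(s) = exp(s) + s the exact solution of the toy problem is f(s) = exp(s), so the NRE falls as basis functions are added. A production code would typically call a QR-based or SVD-based library routine instead of forming the normal equations, which square the condition number.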
In the following, numerical results will usually be computed for a basis of
fixed polynomial order q (for a polynomial of degree p, the order q will be defined
so that q = p + 1), as the cell sizes in use are systematically reduced. Under these
conditions, the slope of NRE versus the reciprocal of the cell size usually
approaches integer values. The slope of an error curve for fixed q, on a log-log plot,
is obtained as [4]
$$
\text{slope}_q = \frac{\log_{10}(\text{NRE}_2) - \log_{10}(\text{NRE}_1)}{\log_{10}(h_1) - \log_{10}(h_2)} \tag{7.7}
$$
for these results approximate straight lines on a log-log plot, with slopes that
approach integers as the cell sizes are reduced. Figure 7.1 shows plots of the NRE
curves for a circular cylinder of radius 1.0 wavelength, for polynomial orders
between q = 1 and q = 9, obtained from a numerical solution of the magnetic-field
integral equation (MFIE) for the TM polarization. Figure 7.2 shows plots of the
slopes of these curves for the four different equations considered in [4, 5], namely
the electric-field integral equation (EFIE) and the MFIE for the transverse
magnetic (TM) and transverse electric (TE) polarizations. It is observed that the
TM EFIE produces results that exhibit slopes approximately equal to q + 1 as the
cell sizes are reduced, while the TE EFIE results exhibit slopes that approach the
integer q – 1. The MFIE for either polarization produces NRE curves with slopes
approximating q.
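The slope definition in (7.7) is easy to apply to a pair of results. The sketch below uses synthetic data in which the error scales exactly as the cell size to the third power; the data and the function name `slope_q` are illustrative assumptions, not results from the chapter.

```python
import math


def slope_q(h1, nre1, h2, nre2):
    """Slope of log10(NRE) versus log10(1/h) between two cell sizes, as in (7.7)."""
    return (math.log10(nre2) - math.log10(nre1)) / (math.log10(h1) - math.log10(h2))


if __name__ == "__main__":
    # Synthetic error model NRE = h**3, sampled at two cell sizes.
    h1, h2 = 0.1, 0.05
    print(slope_q(h1, h1**3, h2, h2**3))
```

With this sign convention a decaying error yields a negative value whose magnitude is the observed order of convergence, so the synthetic data above give a slope of about -3.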
Despite the fact that this is an extremely simple problem, it is of interest
because we will use the slopes of these NRE curves (for the appropriate equation)
as a baseline for future comparison. When results for more complex structures
exhibit the same slopes as those obtained in Figure 7.1, for the same degree of
representation, we will conclude that the results truly exhibit high-order behavior.
Note that the 3-D EFIE should mimic the behavior of the 2-D TE EFIE.
[Figure 7.1 plot: Log10(NRE) versus DOF for q = 1, 3, 5, 7, and 9; MFIE-TM, a = 1, m = 2.]
Figure 7.1 NRE values obtained for the circular perfectly conducting cylinder of unit radius,
obtained from the TM MFIE.
The behavior of NRE has been investigated for other smooth structures,
including a prolate spheroid represented by an EFIE [3] and several toroidal targets
represented by MFIE [7, 15]. For these structures, the slopes of the NRE curves
approximate the same integer values for small cell sizes as those identified above.
The primary difficulty associated with the high-order numerical solution of the
preceding examples is that of maintaining sufficient accuracy in the matrix entries
arising from the MoM procedure. Suffice it to say that the tried-and-true methods
in widespread use in conjunction with low-order solution procedures are not
adequate for high-order procedures. Specific details of these computations, and
procedures for accurate evaluation of the associated integrals, are discussed in [1, 6,
10, 13].
[Figure 7.2 plots: slope-q of NRE versus degrees of freedom for q = 1 to 6; panels EFIE_TE, EFIE_TM, MFIE_TE, and MFIE_TM.]
Figure 7.2 Slope-q values obtained for four sets of numerical results for the circular perfectly
conducting cylinder of unit radius. The slopes approach integer values as the cell sizes
are reduced.
Reference [3] investigated the high-order treatment of the linear dipole antenna,
when excited by a magnetic frill feed model and described by an MFIE. (EFIE
formulations were also investigated [2, 13].) The frill-fed dipole is actually a
model for a monopole fed through a ground plane. Two types of end geometries
were considered in [3]: flat end caps and hemispherical end caps. The NRE was
evaluated on a cell-by-cell basis along the dipole for numerical solutions of
various orders. Polynomial expansions were used in all cells.
Figure 7.3 shows the behavior of the NRE as a function of location along the
surface, for a dipole with flat end caps, after [3]. It should be apparent that as the
representation order is increased, the NRE is reduced by orders of magnitude in
cells along the barrel of the dipole, as well as in cells toward the center of the end
caps, but is not substantially reduced in the vicinity of the corners where the barrel
meets the flat ends. In fact, the charge singularity at those corners is not properly
modeled by the polynomial representation, and the resulting error in the residual
indicates that the accuracy of the result is not ensured. In this, and other examples
involving edges, corners, or even junctions involving a discontinuity in the
curvature of the geometry, purely polynomial representations are not able to
systematically reduce the local NRE levels. In practice, it is desirable to reduce the
NRE to a comparable level across the entire computational domain to ensure a
reliably accurate result.
Figure 7.3 Plot of the local NRE values along the barrel and end caps of a linear dipole of length
0.5 wavelength, radius 0.0625 wavelengths, and excited with a frill feed having b/a =
1.2. Plots show that the NRE is not reduced in the vicinity of the corners where the
barrel meets the flat end caps. ©IEEE 2004 [3].
References [8, 11–13, 16] proposed a technique for the accurate treatment of
singularities in 2-D problems, based on the analytic behavior of the current and
charge densities in the vicinity of an ideal wedge. Figure 7.4 shows a wedge with
interior angle $\alpha$. In the immediate neighborhood of the tip, the current density has the
asymptotic form, valid for small $\rho$,
$$
J_z \cong \sum_{m=0} \sum_{n=1} c_{mn} \, \rho^{\,2m + n\nu - 1} \tag{7.8}
$$
Table 7.1
Exponents Used in Basis Functions of a Given Order, for Cells in the Vicinity of a 60° Corner, TE
Polarization. The corner cells involve twice as many terms and are recommended to be twice as large as
the neighboring cells.
Order    Neighboring cells    Corner cells
1        0                    0, 3/5
2        0, 1                 0, 3/5, 1, 6/5
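The fractional exponent 3/5 in Table 7.1 matches the classical result for a perfectly conducting wedge, where the singularity exponent is π divided by the exterior angle. The sketch below computes that exponent exactly and then merges its multiples with the integer exponents. The merging rule (sort the union and keep twice as many terms as the polynomial order) is only one reading that reproduces the two rows shown in the table; it is an assumption of this example, as are the function names.

```python
from fractions import Fraction


def wedge_nu(interior_angle_deg):
    """Classical wedge exponent pi / (2*pi - alpha) for an interior angle in whole degrees."""
    return Fraction(180, 360 - interior_angle_deg)


def corner_exponents(interior_angle_deg, order):
    """Assumed rule: the 2*order smallest values among integers and multiples of nu."""
    nu = wedge_nu(interior_angle_deg)
    count = 2 * order
    candidates = {Fraction(k) for k in range(count)} | {k * nu for k in range(count)}
    return sorted(candidates)[:count]


if __name__ == "__main__":
    print(wedge_nu(60))
    for order in (1, 2):
        print(order, [str(e) for e in corner_exponents(60, order)])
```

For a 60° corner the exterior angle is 300°, so the exponent is 3/5, and the assumed rule returns 0, 3/5 for order 1 and 0, 3/5, 1, 6/5 for order 2. Whether the same rule holds at higher orders or for the TM polarization is not something the rows shown here can confirm.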
Figure 7.5 Slopes of the NRE curves for results obtained from the TE MFIE, for a perfectly
conducting cylinder whose cross section is an equilateral triangle of side dimension four
wavelengths. Until the solution accuracy reaches the precision limit, these slopes
approximate the same integer values as those of the circular cylinder. ©ACES 2009 [12].
Figure 7.6 Error in the current density as obtained by a method-of-moments solution of the TM
EFIE for a flat strip of seven wavelengths, illuminated by a uniform plane wave. The
reference result is the current density obtained from the eigenfunction expansion in
terms of Mathieu functions.
that approximate those of the circular cylinder problem, suggesting that the
representations are truly producing high-order behavior.
The behavior of NRE has also been used to investigate the satisfaction of boundary
conditions for a problem involving the junction of three conducting strips [11, 16],
as depicted in Figure 7.7. For that target, expansions containing fractional
exponents were required at the strip ends and at the junction. When the appropriate
representations were used in both locations, local residual errors were uniform
across the individual strips and were systematically reduced as the order of the
representations was increased. When representations containing fractional
exponents were dropped from either the end cells or the cells adjacent to the
central junction, the high-order behavior was not observed. Details may be found
in [16].
Figure 7.7 A perfectly conducting target made by connecting three strips at a central junction.
partial differential equations, relatively few attempts have been reported in the
electromagnetics literature in connection with integral equations [4, 14, 22–26].
(There are a number of reports in the mechanical engineering literature, which we
do not attempt to review here.)
While the NRE error estimator described in Section 7.1 is robust, it is a
relatively expensive error estimator. It involves the solution of an overdetermined
system, which is a computational burden that grows as $O(N^3)$, where N is the
number of unknowns in the problem. Furthermore, on a conventional processor,
the least-squares solution of a 2:1 overdetermined system requires approximately
five times as many operations as the solution of a square system by the LU
factorization. Alternative residual-based estimators have been investigated for 2-D
problems [14]. For example, if the tangential field boundary condition is enforced
as part of the formulation, the residual in the normal component of the field may be
used for error estimation. Alternatively, the residual in the magnetic field can be
used to estimate the error in an EFIE solution, and so on.
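The factor of five quoted above can be reproduced from standard leading-order operation counts. The sketch assumes that the least-squares problem is solved by Householder QR, whose textbook count is 2mn² − (2/3)n³ for an m-by-n matrix, and that the square system is solved by LU factorization at (2/3)n³ operations. The text does not say which least-squares algorithm was used, so this is an illustration rather than a statement about the authors' code.

```python
def lu_flops(n):
    """Leading-order operation count for LU factorization of an n-by-n matrix."""
    return 2.0 * n**3 / 3.0


def householder_qr_flops(m, n):
    """Leading-order operation count for Householder QR of an m-by-n matrix (m >= n)."""
    return 2.0 * m * n**2 - 2.0 * n**3 / 3.0


def cost_ratio(n, overdetermination=2):
    """Least-squares cost on an (overdetermination * n)-by-n system relative to LU."""
    return householder_qr_flops(overdetermination * n, n) / lu_flops(n)


if __name__ == "__main__":
    print(cost_ratio(1000))
```

With M/N = 2 the ratio is (4 − 2/3)/(2/3) = 5 independent of N, so both approaches scale as the cube of the number of unknowns and differ only in the constant.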
Residual computations usually involve a cost proportional to $O(N^2)$, which
may be prohibitive. However, it should be noted that this type of computation
lends itself to parallel implementation and should be easily adapted to multicore
processors, Xeon Phi coprocessors, and GPUs. In this situation, the actual overhead
of residual-based estimators may be small in practice.
Residual errors are fundamentally tied to the problem boundary conditions and
are inherently robust [22]. Other less robust estimators can be developed with an
O(N) cost. For example, in situations where current continuity is imposed by the
representation, the discontinuity in the derivative of the current density may be
used to identify regions with larger error. In vector problems where one component
of the current density is made continuous, the discontinuity in the other component
may be used to drive an error estimate. However, these cheaper estimators are not
expected to be as robust as the NRE estimator described above. They may also
require some kind of calibration to enable their use as a global error predictor.
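A derivative-discontinuity indicator of the kind mentioned above can be sketched for a continuous, piecewise-linear current sampled at cell junctions. This is a hypothetical illustration of the idea with an invented function name, not the estimator studied in the cited references, and its output is a relative indicator that would still need calibration before being read as an error level.

```python
def derivative_jumps(nodes, values):
    """Return |jump in dJ/ds| at each interior node of a piecewise-linear current."""
    slopes = [
        (values[i + 1] - values[i]) / (nodes[i + 1] - nodes[i])
        for i in range(len(nodes) - 1)
    ]
    return [abs(slopes[i] - slopes[i - 1]) for i in range(1, len(slopes))]


if __name__ == "__main__":
    nodes = [-1.0, -0.5, 0.0, 0.5, 1.0]
    kinked = [abs(s) for s in nodes]        # a kink at s = 0
    print(derivative_jumps(nodes, kinked))  # largest entry marks the kink
```

The cost is one pass over the cells, which is the O(N) behavior described above; cells next to the largest jumps are the natural candidates for refinement.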
These error estimators are of the explicit type and may be applied to any
approximate solution. Implicit error estimators [27] are often used in connection
with partial differential equation formulations and may also be useful for integral
equation formulations. These techniques involve re-solving subsets of the original
problem with a finer mesh or higher-order representation, measuring how much the
local solution changes, and using that information to estimate the error in the
original result.
C1 ( )r C0 ( )r
1
1 ei oi
where $\nu_{oi}$ and $\nu_{ei}$ are parameters that can be found by a solution of the appropriate
Lamé equations, and $(r, \phi)$ denote sphero-conal coordinates (see reference [34] for
details). As an example, Table 7.2 shows the first few values for $\nu_{oi}$ and $\nu_{ei}$ for a
90° corner.
The expansions in (7.11)–(7.13) for the current behavior near the plate tip are
somewhat analogous to the canonical wedge solution for current behavior near an
edge. Ongoing research is expected to produce suitable representations for the
current and charge densities near plate corners based upon these expressions.
Table 7.2
Exponents of the Radial Variable r Used in the Asymptotic Expansion of the Current Density in the
Vicinity of a 90° Plate Corner
Index    $\nu_{oi}$    $\nu_{ei}$
1        0.81466       0.29658
2        1.59713       1.13125
3        1.95533       1.42651
4        2.52088       2.03957
7.8 SUMMARY
Controlled accuracy computations are nearing realization for fairly general 2-D
EMF problems based on integral equation formulations. The authors’ research on
high-order representations for currents at edges, combined with the existing
technologies on high-order bases for smooth surfaces, curved-cell models, and
accurate integration procedures, enables computations that have been demonstrated
to produce 7–10 or more digits of accuracy for a variety of problems involving
perfectly conducting structures. Additional research into accurate, efficient error
estimators is required, as is a broader experience base with penetrable and lossy
targets.
Much work is still needed for 3-D problems. To date, the authors are aware of
no 3-D target containing edges or tips for which high-accuracy solutions have been
found with integral equation formulations.
REFERENCES
[1] M. Bibby and A. Peterson, “High Accuracy Calculation of the Magnetic Vector Potential on
Surfaces,” Applied Computational Electromagnetics Society (ACES) Journal, Vol. 18, pp. 12–22,
March 2003.
[2] A. Peterson, “Application of the Locally-Corrected Nyström Method to the EFIE for the Linear
Dipole,” IEEE Trans. Antennas Propagat., Vol. 52, pp. 603–605, 2004.
[3] A. Peterson and M. Bibby, “High-Order Numerical Solutions of the MFIE for the Linear
Dipole,” IEEE Trans. Antennas Propagat., Vol. 52, pp. 2684–2691, 2004.
[4] M. Bibby and A. Peterson, “On the Use of Over-Determined Systems in the Adaptive Numerical
Solution of Integral Equations,” IEEE Trans. Antennas Propagat., Vol. 53, pp. 2267–2273, 2005.
296 Advanced Computational Electromagnetic Methods and Applications
[5] A. Peterson, and M. Bibby, “Error Trends in Higher-Order Discretizations of the EFIE and
MFIE,” Digest of the 2005 IEEE Antennas and Propagation Society International Symposium,
Washington, D.C., Vol. 3A, pp. 5255, 2005.
[6] M. Bibby, and A. Peterson, “High Accuracy Evaluation of the EFIE Matrix Entries on a Planar
Patch,” Applied Computational Electromagnetics Society (ACES) Journal, Vol. 20, pp. 198206,
2005.
[7] M. Bibby, C. Coldwell, and A. Peterson, “Normally-Integrated Magnetic Field Integral
Equations for Electromagnetic Scattering,” IEEE Trans. Antennas Propagat., Vol. 55, pp.
25302536, 2007.
[8] M. Bibby, A. Peterson, and C. Coldwell, “High Order Representations for Singular Currents at
Corners,” IEEE Trans. Antennas Propagat., Vol. 56, pp. 22772287, 2008.
[9] M. Bibby, A. Peterson, and C. Coldwell, “Use of Extrapolation to Improve Accuracy and
Enhance Confidence in Numerical Results, ” IEEE Antennas and Propagation Magazine, Vol. 50,
No. 4, pp. 150155, August 2008.
[10] M. Bibby, and A. Peterson, “Highly Accurate Implementations of Singularity Cancellation and
Extraction Methods on a Planar patch,” ACES Journal, Vol. 23, pp. 298302, 2008.
[11] A. Peterson, M. Bibby, and C. Coldwell, “Satisfaction of End, Continuity, and Junction
Conditions by Implicit and Explicit Subsectional Legendre Expansions,” Proceedings of the 25th
Annual Review of Progress in Applied Computational Electromagnetics, Monterey, CA, pp.
771774, 2009.
[12] M. Bibby, A. Peterson, and C. Coldwell, “Optimum Cell Size for High Order Singular Basis
Functions at Geometric Corners,” ACES Journal, Vol. 24, pp. 368374, 2009.
[13] A. Peterson, and M. Bibby, An Introduction to the Locally-Corrected Nyström Method, San
Rafael: Morgan & Claypool Synthesis Lectures, 2010.
[14] U. Saeed, and A. Peterson, “Local Residual Error Estimators for the Method of Moments
Solution of Electromagnetic Integral Equations,” ACES Journal, Vol. 26, pp. 403410, 2011.
[15] M. Bibby, C. Coldwell, and A. Peterson, “A High Order Numerical Investigation of
Electromagnetic Scattering from a Torus and a Circular Loop,” IEEE Trans. Antennas Propagat.,
Vol. 61, pp. 36563661, 2013.
[16] M. Bibby, and A. Peterson, “High-order Treatment of Junctions and Edge Singularities with the
Locally-Corrected Nyström Method,” ACES Journal, Vol. 28, pp. 892902, 2013.
[17] M. Bibby and A. Peterson, Accurate Computation of Mathieu Functions, San Rafael: Morgan &
Claypool Synthesis Lectures, 2014.
[18] J. Davies, “A Least Square Boundary Residual Method for the Numerical Solution of Scattering
Problems,” IEEE Trans. Microwave Theory Tech., Vol. MTT-21, pp. 90103, 1973.
[19] K. Bunch, “Theoretical and Numerical Foundations of a Boundary Residual Method for Solving
Three-dimensional Boundary-Value Problems in Electromagnetics,” PhD Dissertation,
University of Utah, March 1990.
[20] K. Bunch and R. Grow “Numerical Aspects of the Boundary Residual Method,” Int. J. Num.
Modelling, Vol. 3, pp. 5771, 1990.
[21] K. Bunch, and R. Grow, “On the Convergence of the Method of Moments, the Boundary-
Residual Method, and the Point-Matching Method with a Rigorously Convergent Formulation of
the Point Matching Method,” ACES Journal, Vol. 8, no. 2, pp. 188202, 1993.
High-Accuracy Computations for Electromagnetic Integral Equations 297
[22] G. Hsiao, and R. Kleinman, “Mathematical Foundations for Error Estimation in Numerical
Solutions of Integral Equations in Electromagnetics,” IEEE Trans. Antennas Propagat., Vol. 45,
pp. 316328, 1997.
[23] J. Wang, and J. Webb, “Hierarchal Vector Boundary Elements and Adaption for 3-D
Electromagnetic Scattering,” IEEE Trans. Antennas Propagat., Vol. 45, pp. 18691879, 1997.
[24] A. Fourie, D. Nitch, and A. Clark, “Predicting MoM Error Currents by Inverse Application of
Residual E-Fields,” ACES Journal, Vol. 14, pp. 7275, 1999.
[25] F. Bogdanov, and R. Jobava, “Estimating Accuracy of MoM Solutions on Arbitrarily
Triangulated 3-D Geometries Based on Examination of Boundary Conditions Performance and
Accurate Derivation of Scattered Fields,” JEWA, Vol. 18, No. 7, pp. 879897, 2004.
[26] X. Wang, M. Botha, and J. Jin, “An Error Estimator for the Moment Method in Electromagnetic
Scattering,” Microwave and Optical Technology Letters, Vol. 44, pp. 320326, 2005.
[27] M. Ainsworth and J. Oden, A posteriori Error Estimation in Finite Element Analysis. New York:
Wiley, 2000.
[28] M. Ilic, S. Savic, A. Ilic, and B. Notaros, “Constant Speed Parametrization Mapping of Curved
Boundary Surfaces in Higher-Order Moment-Method Electromagnetic Modeling,” IEEE
Antennas and Wireless Propagation Letters, Vol. 10, pp. 14571460, 2011.
[29] B. Notaros, “Higher Order Frequency-Domain Computational Electromagnetics,” IEEE Trans.
Antennas Propagat., Vol. 56, pp. 22512276, August 2008.
[30] R. Graglia, D. Wilton, and A. Peterson, “Higher-Order Interpolatory Vector Bases for
Computational Electromagnetics,” IEEE Trans. Antennas Propagat., Vol. 45, pp. 329342, 1997.
[31] R. Graglia, A. Peterson, and F. Andriulli, “Curl-Conforming Hierarchical Vector Bases for
Triangles and Tetrahedra,” IEEE Trans. Antennas Propagat., Vol. 59, pp. 950959, 2011.
[32] R. Graglia and A. Peterson, “Hierarchical Divergence-Conforming Nedelec Elements for
Volumetric Cells,” IEEE Trans. Antennas Propagat., Vol. 60, pp. 52155227, 2012.
[33] K. Warnick and A. Peterson, “Higher-Order Basis Functions,” in Numerical Analysis for
Electromagnetic Integral Equations, by K. F. Warnick, Norwood MA: Artech House, pp.
161185, 2008.
[34] R. Satterwhite and R. Kouyoumjian, Electromagnetic Diffraction by a Perfectly Conducting
Plane Annular Section, Technical Report 2183-2, AFCRL-69-0401, The Ohio State University,
1970.
[35] J. Boersma, and J. Jansen, Electromagnetic Field Singularities at the Tip of an Elliptic Cone,
Eindhoven University of Technology Report 90-WSK-01, 1990.
Chapter 8
Fast Electromagnetic Solver Based on
Randomized Pseudo-Skeleton Approximation
Xianyang Zhu
8.1 INTRODUCTION
The method of moments (MoM) has been a very popular approach for solving electromagnetic
scattering problems. However, it suffers from a high memory requirement for large dense
impedance matrices and from high computational complexity for large-scale problems. Over the
past decades, significant progress has been made on reducing the memory usage and
computational cost of the MoM. For example, the MLFMA [3–6] combined with iterative
techniques [7, 8] can reduce the memory usage and computational complexity to $O(N \log N)$.
However, one of the major disadvantages of this approach is that the algorithm is not
independent of the integral equation kernel. That is, for integral equations with different
kernels, one has to make appropriate modifications to implement the associated fast
algorithms. Another approach to the compression of operators is based on wavelets [9, 10],
which exploits the smoothness of the matrix elements viewed as a function of their indices
and tends to fail for highly oscillatory operators.
It is well known that the entire impedance matrix derived from the MoM is usually neither
singular nor rank deficient. However, if all the unknowns are assembled into groups as in
MLFMA, then the submatrix blocks representing the interactions between two well-separated
groups (the far interaction terms) are rank deficient. Recently, several approaches based on
a low-rank representation of impedance matrix blocks have been introduced into the field of
CEM. These approaches include but are not limited to IES³ (pronounced “ice cube,” an integral
equation solver) [11], integral equation rank revealing (IE-QR) [12, 13], predetermined
interaction list octree (PILOT) [14], and adaptive cross approximation (ACA) [15–19]. In
these approaches, the impedance matrix blocks associated with the far interaction terms are
represented by a product of two much smaller matrices. For example, if the size of a matrix
block is m × n and its effective rank is r, then we can decompose it as the product of two
matrices with sizes m × r and r × n. Generally, r is much smaller than m and n. Thus, the
memory requirement can be reduced from m × n to r × (m + n), and the same ratio of CPU time
saving can be obtained for matrix-vector multiplication. The beauty of these algorithms is
their purely algebraic nature. That is, the computational speed-up is achieved by linear
algebra manipulations of the impedance matrix. Thus, the implementations of these algorithms
do not depend on complete knowledge of the integral equation kernels. However, the
computational complexity of the aforementioned algorithms depends on the dimensions of the
matrix under decomposition. For example, the computational complexity of the ACA algorithm
is $O(r^3(m+n))$, which is not trivial when m or n is very large.
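The storage and matrix-vector savings quoted above can be checked with a few lines of NumPy. This is only an illustrative sketch: the block sizes and the random factors are assumptions chosen for the example, not data from the chapter.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n, r = 400, 300, 8  # illustrative sizes, not taken from the text

# Build an exactly rank-r block as the product of an m x r and an r x n factor.
U = rng.standard_normal((m, r)) + 1j * rng.standard_normal((m, r))
V = rng.standard_normal((r, n)) + 1j * rng.standard_normal((r, n))
A = U @ V

x = rng.standard_normal(n)
y_dense = A @ x            # about m*n multiply-adds
y_factored = U @ (V @ x)   # about r*(m+n) multiply-adds

dense_entries = m * n            # storage of the full block
factored_entries = r * (m + n)   # storage of the two factors
print(dense_entries, factored_entries, np.allclose(y_dense, y_factored))
```

The factored product is applied right to left, so the full m × n block is never formed during the multiplication.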
In this chapter, we introduce the randomized pseudo-skeleton approximation (RPSA) method [2]
to the CEM community as a means of performing this matrix decomposition. In contrast to the
aforementioned algorithms, its computational complexity is $O(r^3)$, which is independent of
m and n.
In Section 8.2, we will show that most submatrices of the impedance matrix
are rank deficient if the unknowns are partitioned into groups
appropriately. This property can be exploited to reduce the memory
requirement and simulation time. Different partitioning approaches are then
presented in Section 8.3, and different methods for low rank matrix decomposition
are reviewed in Section 8.4. These methods include the well-known SVD method,
randomized algorithms, the popular ACA algorithm, and the RPSA method. For
applications with multiple right-hand sides, the low rank property will be exploited again
$$Z I = V \qquad (8.1)$$
where Z, I, and V are the impedance matrix, the current coefficient vector, and the voltage
vector associated with the incident fields, respectively. Their dimensions are N × N,
N × 1, and N × 1, respectively, where N is the number of current coefficients
(unknowns defined on the edges shared by two neighboring triangles). Each
element of the impedance matrix Z represents the interaction of two points (one
source point and one field point) on the target surface. It is straightforward to find the
scattered fields once the current coefficients are determined.
Clearly, the memory requirement is $N^2$, and the computational complexity is $O(N^3)$ for
direct approaches or $O(MN^2)$ for iterative approaches, where M is the number of iterations
needed to reach a converged solution. For electrically large problems, it is challenging to
solve the large system of linear equations successfully.
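To make these scalings concrete, the short calculation below evaluates them for a hypothetical problem. The problem size, the iteration count, and the 16 bytes per entry (a double-precision complex number) are assumptions made for illustration only.

```python
def dense_mom_cost(n_unknowns, n_iterations, bytes_per_entry=16):
    """Return (memory in GiB, direct-solve operations, iterative-solve operations).

    bytes_per_entry=16 assumes double-precision complex storage.
    """
    memory_gib = n_unknowns**2 * bytes_per_entry / 2**30
    direct_ops = n_unknowns**3                    # O(N^3)
    iterative_ops = n_iterations * n_unknowns**2  # O(M N^2)
    return memory_gib, direct_ops, iterative_ops

# Hypothetical example: 100,000 unknowns and 200 iterations.
mem, direct, iterative = dense_mom_cost(n_unknowns=100_000, n_iterations=200)
print(f"{mem:.0f} GiB, {direct:.1e} vs {iterative:.1e} operations")
```

Even at this moderate size, the dense matrix alone needs on the order of 150 GiB, which is why compression of the far interaction blocks is attractive.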
About 10 samples per wavelength are typically required to get accurate results. This is due
to the singular property of the Green’s functions: the interaction between two points
changes very quickly as the distance between them becomes smaller and smaller. Therefore,
dense sampling is required to model accurately the interactions of two points close to each
other. However, the interactions are much smoother if the distance between the points is
relatively large. That is, oversampling is not required in those cases. For a specific
source point, oversampling is only necessary for field points in its near region and is not
necessary for all the other field points in the far region. Therefore, there must be a lot
of redundant information inside the impedance matrix. The entire impedance matrix is neither
singular nor rank deficient except at the internal resonances, since each unknown is both a
source and a field point and we have to oversample everywhere. That is, the entire matrix
itself is a full rank matrix.
To exploit the redundancy of the impedance matrix, we can partition
all the unknowns into groups according to their spatial positions. Assume that all the
unknowns are partitioned into J groups and that the unknowns inside each group
are indexed consecutively. Then Equation (8.1) can be rewritten as follows:
$$\begin{bmatrix} Z_{11} & Z_{12} & \cdots & Z_{1J} \\ Z_{21} & Z_{22} & \cdots & Z_{2J} \\ \vdots & \vdots & \ddots & \vdots \\ Z_{J1} & Z_{J2} & \cdots & Z_{JJ} \end{bmatrix} \begin{bmatrix} I_1 \\ I_2 \\ \vdots \\ I_J \end{bmatrix} = \begin{bmatrix} V_1 \\ V_2 \\ \vdots \\ V_J \end{bmatrix} \qquad (8.2)$$
Here we reuse the subscripts; they now stand for the indices of groups.
Therefore, each entry is a submatrix of the original impedance matrix Z,
representing the interactions between two groups. If the shortest distance between
two groups is large enough, then the submatrix is highly correlated, since the
unknowns are oversampled everywhere and the Green’s functions in that region
change slowly. That means there is a lot of redundant information in the
submatrix, and therefore the submatrix must be rank deficient.
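The rank deficiency of far interaction blocks can be observed numerically. The sketch below is not an MoM impedance matrix: it assumes point sources placed randomly in one-wavelength cubes and uses the free-space scalar Green's function as a stand-in kernel. The function names, group sizes, separations, and tolerance are all illustrative assumptions.

```python
import numpy as np

def kernel_block(field_pts, source_pts, k):
    """Free-space scalar Green's function between two point sets (stand-in kernel)."""
    R = np.linalg.norm(field_pts[:, None, :] - source_pts[None, :, :], axis=2)
    return np.exp(-1j * k * R) / (4 * np.pi * R)

def effective_rank(block, tol=1e-3):
    """Number of singular values larger than tol times the largest one."""
    s = np.linalg.svd(block, compute_uv=False)
    return int(np.sum(s > tol * s[0]))

rng = np.random.default_rng(1)
k = 2 * np.pi                                             # wavenumber, unit wavelength
src = rng.random((200, 3))                                # source group
near = rng.random((200, 3)) + np.array([1.5, 0.0, 0.0])   # gap of half a wavelength
far = rng.random((200, 3)) + np.array([10.0, 0.0, 0.0])   # gap of nine wavelengths

rank_near = effective_rank(kernel_block(near, src, k))
rank_far = effective_rank(kernel_block(far, src, k))
print(rank_near, rank_far)
```

Both blocks are 200 × 200, but the effective rank of the well-separated pair is expected to be much smaller than that of the nearby pair.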
Several different approaches, including IES³ [11], IE-QR [12, 13], PILOT [14],
and ACA algorithms [15–19], have been developed to exploit the aforementioned
low rank property. In these methods, all the impedance matrix blocks associated
with the far interaction terms are represented by a product of two much smaller
matrices. Without loss of generality, a low rank matrix A can be approximated as
the product of two smaller matrices U and V, namely,
$$A_{m \times n} \approx U_{m \times r} V_{r \times n} \qquad (8.3)$$
Several algorithms are available for the partitioning of the computational domain.
One of the well-known approaches is the octree partitioning technique [22, 23]. It
has been widely used in 3-D graphics and 3-D game engines. An octree is a tree
structure in which each internal node has exactly eight children. It works by recursively
partitioning the entire domain into eight groups (children, octants), as shown in Figure 8.3.
Each group can again be partitioned into 8 subgroups. This process can be repeated
until the partitioning satisfies one or more requirements. For example, it stops
when the number of unknowns in the subgroup is less than a specific number.
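A minimal sketch of such a recursive octant split is given below. It assumes that the unknowns are represented by point coordinates and that each split is made at the center of the bounding box of the current group. The function name, the `max_depth` safeguard, and the random test points are illustrative assumptions.

```python
import random

def octree_partition(points, max_points, depth=0, max_depth=10):
    """Recursively split a list of (x, y, z) points into octants.

    Returns the leaf groups. Each leaf holds at most max_points points
    unless max_depth is reached first.
    """
    if len(points) <= max_points or depth >= max_depth:
        return [points]
    # Split at the center of the bounding box of the current group.
    lo = [min(p[a] for p in points) for a in range(3)]
    hi = [max(p[a] for p in points) for a in range(3)]
    mid = [(lo[a] + hi[a]) / 2 for a in range(3)]
    octants = {}
    for p in points:
        key = tuple(p[a] > mid[a] for a in range(3))  # identifies one of 8 octants
        octants.setdefault(key, []).append(p)
    leaves = []
    for child in octants.values():                    # empty octants are never created
        leaves.extend(octree_partition(child, max_points, depth + 1, max_depth))
    return leaves

random.seed(0)
pts = [(random.random(), random.random(), random.random()) for _ in range(1000)]
leaves = octree_partition(pts, max_points=50)
print(len(leaves), min(map(len, leaves)), max(map(len, leaves)))
```

The printed minimum and maximum leaf sizes illustrate how much the group populations at the same stopping threshold can differ.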
The octree technique has been widely employed in MLFMA to partition a 3-D
space. The main disadvantage is that the number of unknowns inside each
subgroup could be quite different at the same level, which could result in
unbalanced load in parallel processing implementation.
Another partitioning method is based on the cobblestone distance sorting
technique [17]. The steps involved in this partitioning method are summarized as
follows:
1. Create a box bounding all unsorted unknowns and find the diagonal vector of
the box.
2. Project all the unknowns to the diagonal vector and find the unknown that is
the closest to the diagonal vector.
3. Use that unknown as the first point of the group.
4. Compute the distances between this point and all the other unsorted points,
and fill the group with the closest unsorted points.
5. Terminate when the desired group size is obtained or if the next point is
farther than a specified threshold.
6. Repeat the above steps for the next group until all unknowns are partitioned.
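The steps above can be sketched as follows. The seed rule is an assumption: the sketch picks the unknown with the smallest projection onto the bounding-box diagonal, which is one possible reading of step 2 and may differ from the rule used in [17]. The function name and parameters are hypothetical.

```python
import numpy as np

def cobblestone_partition(points, group_size, max_dist=np.inf):
    """Greedy distance-sorted grouping; returns a list of index arrays."""
    points = np.asarray(points, dtype=float)
    unsorted = np.arange(len(points))
    groups = []
    while unsorted.size:
        pts = points[unsorted]
        lo, hi = pts.min(axis=0), pts.max(axis=0)    # bounding box of unsorted points
        diag = hi - lo                               # diagonal vector of the box
        # Assumed seed rule: smallest projection onto the diagonal.
        seed = np.argmin((pts - lo) @ diag)
        dist = np.linalg.norm(pts - pts[seed], axis=1)
        order = np.argsort(dist)[:group_size]        # closest unsorted points first
        order = order[dist[order] <= max_dist]       # distance threshold
        groups.append(unsorted[order])
        unsorted = np.delete(unsorted, order)
    return groups

rng = np.random.default_rng(0)
pts = rng.random((500, 3))
groups = cobblestone_partition(pts, group_size=40)
print(len(groups), [len(g) for g in groups])
```

The seed point always has zero distance to itself, so every pass removes at least one unknown and the loop terminates.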
Figure 8.4 An example of the partitioning of unknowns based on the cobblestone distance sorting
technique; 22,801 unknowns are partitioned into 30 groups.
1. Find the lower and upper limits of all unknowns along the three Cartesian
axes (this is equivalent to finding a bounding box for all unknowns);
2. Choose the axis with the largest extent, and partition all the unknowns along
the axis into two groups equally or almost equally (the maximum difference
is 1);
3. Repeat the above steps until the number of unknowns inside each group
meets the threshold number specified by the users.
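The three steps above can be sketched in a few lines. The function name and the random test points are assumptions made for illustration; the unknowns are represented by their coordinates.

```python
import random

def binary_partition(points, max_points):
    """Recursively bisect a list of (x, y, z) points along the axis of largest extent."""
    if len(points) <= max_points:
        return [points]
    # Step 1: extents of the bounding box along the three Cartesian axes.
    extents = [max(p[a] for p in points) - min(p[a] for p in points) for a in range(3)]
    # Step 2: split along the longest axis into two (almost) equal halves.
    axis = extents.index(max(extents))
    ordered = sorted(points, key=lambda p: p[axis])
    half = len(ordered) // 2
    # Step 3: repeat on both halves until the threshold is met.
    return (binary_partition(ordered[:half], max_points)
            + binary_partition(ordered[half:], max_points))

random.seed(0)
pts = [(random.random(), random.random(), random.random()) for _ in range(1000)]
groups = binary_partition(pts, max_points=20)
sizes = [len(g) for g in groups]
print(len(groups), min(sizes), max(sizes))
```

With 1,000 points and a threshold of 20, six levels of bisection are needed, so the sketch returns 64 groups whose sizes differ by at most one.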
It is obvious that this approach has more cuts along the axis with the largest
extent. Therefore, it has the combined advantages of both the octree and
cobblestone techniques: (1) the number of unknowns is almost the same for all
groups at the same level (the maximum difference is 1); and (2) it can be easily
employed for multilevel algorithms.
An example partitioning result is shown in Figure 8.5, where the target is a
sphere with 39,390 unknowns, which are partitioned into a total of 64 groups.
Figure 8.5 An example of unknowns partitioning based on binary space partitioning technique,
39,390 unknowns are partitioned into a total of 64 groups
In this section we will focus on how to decompose a rank deficient matrix into a
product of two smaller matrices as shown in (8.3).
To this end, several different approaches are available. We will review the
SVD method and then introduce an interesting approach based on the randomized
projection. The ACA algorithm that is one of the most popular compression
techniques of the past decade is then presented. Finally, the randomized pseudo-
skeleton approximation method will be introduced.
The SVD method [25] is a factorization of a real or complex matrix. It has found
many applications in signal processing.
The SVD formulation of a complex matrix A with dimensions of m × n is:
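$$A = U S V' = \begin{bmatrix} u_{11} & u_{12} & \cdots & u_{1m} \\ u_{21} & u_{22} & \cdots & u_{2m} \\ \vdots & \vdots & \ddots & \vdots \\ u_{m1} & u_{m2} & \cdots & u_{mm} \end{bmatrix} \begin{bmatrix} s_1 & 0 & \cdots & 0 \\ 0 & s_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & s_m \end{bmatrix} \begin{bmatrix} v_{11} & v_{12} & \cdots & v_{1n} \\ v_{21} & v_{22} & \cdots & v_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ v_{n1} & v_{n2} & \cdots & v_{nn} \end{bmatrix} \qquad (8.4)$$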
The dimensions of the three matrices on the right side of (8.5) are m × r, r × r, and
r × n, respectively. Once the three matrices are obtained, we can represent the original
matrix as the product of two matrices by grouping either the first two or the last two
together. For example, the matrices U and V in (8.3) can be defined as the product
of the first two matrices and the third matrix on the right side of (8.5), respectively,
or they can be defined as the first matrix and the product of the second and third
matrices on the right side of (8.5), respectively.
It has been proved theoretically that the SVD method would find the best
decomposition for a low rank matrix with a given rank. In other words, for a given
accuracy, SVD will find the associated lowest rank.
However, SVD is very expensive, especially when the matrix dimensions are
large, since the computational complexity of the best algorithm for the SVD
of an m × n matrix is $O(4m^2 n + 22n^3)$. Another disadvantage of SVD is that all the
elements of the original matrix are used in the decomposition. Thus, we
have to calculate all the elements of the matrix. These two factors rule out SVD for
fast solvers.
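For reference, a truncated SVD that returns the two factors of (8.3) can be sketched with NumPy as follows. The function name, the relative tolerance, and the random test matrix are assumptions for illustration; the sketch folds the singular values into the left factor, which is one of the two groupings described above.

```python
import numpy as np

def svd_lowrank(A, tol=1e-3):
    """Return U (m x r) and V (r x n) with A approximately equal to U @ V."""
    W, s, Vh = np.linalg.svd(A, full_matrices=False)
    r = int(np.sum(s > tol * s[0]))   # effective rank for the given relative tolerance
    U = W[:, :r] * s[:r]              # first two SVD factors multiplied together
    V = Vh[:r, :]                     # third SVD factor
    return U, V

rng = np.random.default_rng(0)
A = rng.standard_normal((60, 5)) @ rng.standard_normal((5, 40))   # exact rank 5
U, V = svd_lowrank(A)
print(U.shape, V.shape, np.allclose(U @ V, A))
```

Note that the call to `np.linalg.svd` needs every entry of A, which is exactly the drawback pointed out in the text.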
The randomized projection approach [26, 27] avoids applying the SVD
decomposition to the original matrix directly. Instead, the original matrix is
projected onto a much smaller space first. Then the rank of the matrix and
associated orthonormal bases are found through the much smaller matrix. The last
step is to find the associated coefficient matrix. The associated MATLAB code is
given in Listing 8.1.
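A rough Python counterpart of this procedure is sketched below. It is not a transcription of Listing 8.1: the function name, the Gaussian test matrix, the sample count `l`, and the tolerance are assumptions made for this example.

```python
import numpy as np

def randomized_lowrank(A, l, tol=1e-3, seed=0):
    """Randomized projection: return Q (m x r) and B (r x n) with A ~ Q @ B."""
    rng = np.random.default_rng(seed)
    m, n = A.shape
    Omega = rng.standard_normal((n, l))     # random test matrix
    Y = A @ Omega                           # projection onto a much smaller space
    Q, s, _ = np.linalg.svd(Y, full_matrices=False)
    r = int(np.sum(s > tol * s[0]))         # rank revealed by the small matrix
    Q = Q[:, :r]                            # orthonormal basis of the range of A
    B = Q.conj().T @ A                      # associated coefficient matrix
    return Q, B

rng = np.random.default_rng(1)
A = rng.standard_normal((80, 6)) @ rng.standard_normal((6, 70))   # exact rank 6
Q, B = randomized_lowrank(A, l=12)
print(Q.shape, B.shape, np.allclose(Q @ B, A))
```

The SVD is applied only to the m × l sample matrix, but forming the sample and the coefficient matrix still requires products with the full matrix A.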
The main idea of the ACA algorithm is to use an iterative pivoting procedure
to find the two submatrices adaptively. It uses a series of approximation matrices
$S_0, S_1, \ldots, S_r$ to approximate the original matrix. Note that the symbol S is reused
here with a different meaning. Equation (8.3) can be rewritten in outer-product form as
follows:
$$A_{m \times n} \approx U_{m \times r} V_{r \times n} = U(:,1)V(1,:) + U(:,2)V(2,:) + \cdots + U(:,r)V(r,:) \qquad (8.6)$$
At the very beginning, the approximation matrix is set to zero ($S_0 = 0$). In the
first step of ACA, the pivoting procedure is used to find one column of U and one
row of V to approximate the original matrix. The associated approximation matrix
is defined as
$$S_1 = U(:,1)V(1,:) \qquad (8.7)$$
Then check whether the norm of the matrix associated with the newly added column
and row is small enough compared to the norm of the approximation matrix
so far. If yes, then stop; otherwise, add another column and row to U and V,
respectively. Repeat the above steps until the residual error is smaller than the
specified threshold. The approximation matrix after the kth iteration can be written
as follows:
$$S_k = \sum_{j=1}^{k} U(:,j)V(j,:) \qquad (8.8)$$
After the first iteration, U and V have one column and one row, respectively.
The newly added column and row are chosen in such a way that the same column
and row in the approximation matrix are exactly the same as in the original matrix,
if round-off error is not considered. The other elements in the
approximation matrix are approximated by the outer product of the newly added
column and row. Similarly, after the kth iteration, there are k columns and k
rows in the approximation matrix that are the same as in the original matrix. At the
same time, the difference at the other columns and rows decreases
monotonically. That means that the ACA algorithm is guaranteed to converge
after at most min(m, n) iterations, which is the worst case associated with a full rank
matrix. It should be noted that not all elements of the original low rank matrix are
required. This is one of the most important features of ACA, especially when the
dimensions of the matrix are large and it is expensive to calculate elements. The
ACA algorithm is summarized as follows:
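Find the next row index $i_{k+2}$:
$$|U(i_{k+2}, k+1)| = \max_{i \neq i_{k+1}} |U(i, k+1)|$$
Termination check:
$$\|U(:,k+1)\|_F \, \|V(k+1,:)\|_F \le \varepsilon \, \|A\|_F$$
Norm update:
$$\|S_k\|_F^2 = \|S_{k-1}\|_F^2 + 2\,\mathrm{Re}\left\{ \sum_{j=1}^{k-1} \left[\mathrm{conj}(U(:,j))^T U(:,k)\right] \left[V(k,:)\,\mathrm{conj}(V(j,:))^T\right] \right\} + \|U(:,k)\|_F^2 \, \|V(k,:)\|_F^2 \qquad (8.9)$$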
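A compact sketch of ACA with partial pivoting, written along the lines of the description above, is given below. It is an illustration rather than the chapter's implementation: the callbacks `get_row` and `get_col`, the starting row, and the stopping threshold `eps` are assumptions, and the norm of A in the termination check is replaced by the running norm of the approximation from (8.9).

```python
import numpy as np

def aca(get_row, get_col, m, n, eps=1e-6, max_rank=None):
    """Return U (m x k) and V (k x n) with A ~ U @ V, sampling only rows/columns of A."""
    max_rank = max_rank or min(m, n)
    U = np.zeros((m, 0), dtype=complex)
    V = np.zeros((0, n), dtype=complex)
    used_rows, i, norm2 = [], 0, 0.0
    for _ in range(max_rank):
        used_rows.append(i)
        row = get_row(i) - U[i, :] @ V        # residual of row i
        j = int(np.argmax(np.abs(row)))       # column pivot
        if row[j] == 0:
            break
        v = row / row[j]
        u = get_col(j) - U @ V[:, j]          # residual of column j
        # Norm update of the approximation, following (8.9).
        cross = np.sum((U.conj().T @ u) * (V.conj() @ v))
        term = np.linalg.norm(u) * np.linalg.norm(v)
        norm2 += 2 * cross.real + term**2
        U = np.column_stack([U, u])
        V = np.vstack([V, v])
        if term <= eps * np.sqrt(norm2):      # newly added term is small enough
            break
        candidates = np.abs(u)
        candidates[used_rows] = -1.0          # do not reuse a row
        i = int(np.argmax(candidates))        # next row pivot
    return U, V

rng = np.random.default_rng(0)
X = rng.standard_normal((120, 8)) + 1j * rng.standard_normal((120, 8))
Y = rng.standard_normal((8, 100)) + 1j * rng.standard_normal((8, 100))
A = X @ Y                                     # exact rank 8
U, V = aca(lambda i: A[i, :], lambda j: A[:, j], 120, 100)
rel_err = np.linalg.norm(A - U @ V) / np.linalg.norm(A)
print(U.shape[1], rel_err)
```

Only the sampled rows and columns of A are requested through the callbacks. The revealed rank may exceed the true rank by the last, negligible term that triggers the stopping test.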
The disadvantage of ACA is that the vectors inside U and V are not orthogonal.
That implies that there is still redundant information in U and V, that is, they are
still rank deficient themselves. To remove the redundancies, both the QR
factorization (also called the QR decomposition) and SVD can be employed here.
First apply the QR factorization to U and V', respectively:
$$U = Q_u R_u \qquad (8.10)$$
$$V' = Q_v R_v \qquad (8.11)$$
Then apply SVD to the two middle matrices to find its real rank r based on the
singular values:
During this step, the effective rank r of the matrix A is determined by:
$$r = \mathrm{sum}\left(\mathrm{diag}(S_{tmp}) > tol \cdot S_{tmp}(1,1)\right) \qquad (8.14)$$
where tol is the relative tolerance. Generally, tol is chosen to
be $10^{-3}$. The ACA results can thus be recompressed as follows:
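A minimal sketch of this QR-plus-SVD recompression is given below. The function name and the synthetic test factors are assumptions; the rank selection follows the rule in (8.14).

```python
import numpy as np

def recompress(U, V, tol=1e-3):
    """Remove redundancy from factors U (m x k) and V (k x n) so that U @ V is preserved."""
    Qu, Ru = np.linalg.qr(U)               # U = Qu Ru
    Qv, Rv = np.linalg.qr(V.conj().T)      # V' = Qv Rv
    W, s, Zh = np.linalg.svd(Ru @ Rv.conj().T)   # SVD of the small middle matrix
    r = int(np.sum(s > tol * s[0]))        # effective rank, as in (8.14)
    U_new = Qu @ (W[:, :r] * s[:r])        # m x r
    V_new = Zh[:r, :] @ Qv.conj().T        # r x n
    return U_new, V_new

# Synthetic factors with 12 columns/rows whose product only has rank 5.
rng = np.random.default_rng(0)
X, Y = rng.standard_normal((50, 5)), rng.standard_normal((5, 40))
P = rng.standard_normal((5, 12))
U, V = X @ P, np.linalg.pinv(P) @ Y        # U @ V equals X @ Y
U_new, V_new = recompress(U, V)
print(U_new.shape, V_new.shape, np.allclose(U_new @ V_new, X @ Y))
```

The SVD here acts on a k × k matrix only, so its cost does not depend on m or n.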
where
$$A_{m \times n} = C_{m \times r} \hat{A}^{-1}_{r \times r} R_{r \times n} \qquad (8.18)$$
where G is not necessarily equal to the inverse of $\hat{A}$ and is not even necessarily
nonsingular. For example, G can be chosen as the pseudo-inverse of $\hat{A}$. This kind
of decomposition is called the pseudo-skeleton approximation.
Once the matrices C, G, and R are obtained, one can easily obtain the matrices
U and V defined in (8.3):
$$U = C \qquad (8.20)$$
$$V = GR \qquad (8.21)$$
Or we can have:
$$U = CG \qquad (8.22)$$
$$V = R \qquad (8.23)$$
Similar to Equation (8.19), the original low rank matrix can be approximated
as
where $U_p$ and $V_p$ are the conjugate transposes of $V_{\hat{A}}(1:r, :)$ and
$V_{\hat{A}}(:, 1:r)$, respectively. $S_p$ is still a diagonal matrix, and its diagonal
elements are the inverses of their counterparts in $S_{\hat{A}}$.
Similar to (8.20)–(8.23), the original low rank matrix A can be approximated by the
product of two smaller matrices U and V:
$$U = C U_p \qquad (8.28)$$
$$V = S_p V_p R \qquad (8.29)$$
or
$$U = C U_p S_p \qquad (8.30)$$
$$V = V_p R \qquad (8.31)$$
Note that only l rows and l columns of the original matrix are needed.
Numerical experiments show that l = 2r is good enough to obtain excellent results.
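The procedure can be sketched as follows. This is an illustrative reading of the method, not the author's code: the callbacks `get_rows` and `get_cols`, the uniform random sampling, and the threshold are assumptions, and the factors of the truncated pseudo-inverse are formed in the standard way from the SVD of the intersection matrix before being grouped as in (8.30) and (8.31).

```python
import numpy as np

def rpsa(get_rows, get_cols, m, n, l, tol=1e-3, seed=0):
    """Randomized pseudo-skeleton sketch: return U (m x r), V (r x n), and r."""
    rng = np.random.default_rng(seed)
    rows = rng.choice(m, size=l, replace=False)   # l randomly selected rows
    cols = rng.choice(n, size=l, replace=False)   # l randomly selected columns
    C = get_cols(cols)                            # m x l
    R = get_rows(rows)                            # l x n
    A_hat = C[rows, :]                            # l x l intersection matrix
    W, s, Zh = np.linalg.svd(A_hat)               # cost depends on l only
    r = int(np.sum(s > tol * s[0]))               # revealed rank
    Up = Zh[:r, :].conj().T                       # l x r
    Sp = 1.0 / s[:r]                              # inverted singular values
    Vp = W[:, :r].conj().T                        # r x l
    U = (C @ Up) * Sp                             # grouping of (8.30)
    V = Vp @ R                                    # grouping of (8.31)
    return U, V, r

rng = np.random.default_rng(1)
A = rng.standard_normal((300, 10)) @ rng.standard_normal((10, 200))   # exact rank 10
U, V, r = rpsa(lambda rows: A[rows, :], lambda cols: A[:, cols], 300, 200, l=20)
rel_err = np.linalg.norm(A - U @ V) / np.linalg.norm(A)
print(r, rel_err)
```

Here l = 2r = 20 rows and columns are sampled, and only those entries of A are ever evaluated.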
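[Figure: illustration of the pseudo-skeleton approximation. Recoverable labels: index sets {4, 11, 21, 27, 31} and {2, 7, 18, 26, 31}, and the inverse of the intersection matrix, $\hat{A}^{-1}$.]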
For many practical applications, one is often interested in the monostatic scattering
patterns of a target. In these cases, the aforementioned large system of linear
equations in (8.1) needs to be solved many times. For the same target, the terms
on the left side remain the same, while the voltage vector on the right side changes
with the observation angle. The simulation will be very costly if a large number of
observation angles are considered.
There is another issue associated with the monostatic applications, namely,
how many samples (observation angles) are needed to guarantee that all the details
of the monostatic scattering patterns can be caught? In general, this question can
only be addressed case by case: more observation angles need to be considered if
the scattering pattern is complex with more fast-changing details, and vice versa.
The problem is that we do not know beforehand if the scattering pattern is complex
or not, especially for targets with complex geometries.
The randomized pseudo-skeleton approximation can be employed here to
address the above issues. The right-hand sides can be decomposed as the product of two
much smaller matrices:
$$V = U_v V_v \qquad (8.32)$$
where the dimensions of V, $U_v$, and $V_v$ are $N_{edge} \times N_{observation}$,
$N_{edge} \times k$, and $k \times N_{observation}$, respectively. $N_{edge}$ is the number
of unknowns (defined on the edges shared by two triangles), $N_{observation}$ is the number
of observation angles, and k is the rank of V. It should be noted that V is a very big
matrix since $N_{edge}$ and $N_{observation}$ are very large numbers. But there is no need
to calculate all elements of V. Only a few rows and columns are randomly selected and
calculated.
The k columns of the matrix $U_v$ can be viewed as the principal bases of the
original voltage matrix V, while each column of the matrix $V_v$ represents the
weighting coefficients of all the k bases at the associated observation angle.
Equation (8.32) implies that only k independent voltage vectors need to be
considered for the original $N_{observation}$ observation angles. The steps to solve the
original problems are as follows:
1. Find the low rank decomposition of the right-hand sides using
the randomized pseudo-skeleton approximation.
2. Use each column of $U_v$ as the right-hand side of (8.1), and solve the equations k
times to obtain the current coefficients on the target surface. These current
coefficients are referred to as principal current components.
3. For each of the original $N_{observation}$ observation angles, the current coefficients
are just a linear combination of the k principal current components. The
weighting coefficients are given in the columns of the matrix $V_v$.
Solving the large system of equations is time consuming. But we only need to
do that k times rather than $N_{observation}$ times. Generally, k is much smaller than
$N_{observation}$; thus, a lot of simulation time can be saved.
Some numerical results will be shown in the next section to validate the
approach.
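The three steps can be illustrated on a small synthetic system. In this sketch the factors of the right-hand side matrix are generated directly instead of being computed by RPSA, and the matrix Z is a random, well-conditioned stand-in for an impedance matrix; all sizes and names are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
N, n_obs, k = 60, 200, 5        # unknowns, observation angles, rank of the right-hand sides

# Stand-in for the impedance matrix (random and diagonally dominated).
Z = rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N)) + N * np.eye(N)

# Step 1 (assumed already done): low rank factors of the N x n_obs voltage matrix.
Uv = rng.standard_normal((N, k)) + 1j * rng.standard_normal((N, k))
Vv = rng.standard_normal((k, n_obs)) + 1j * rng.standard_normal((k, n_obs))

# Step 2: solve only k systems to get the principal current components.
I_principal = np.linalg.solve(Z, Uv)          # N x k

# Step 3: currents at every observation angle are linear combinations.
I_all = I_principal @ Vv                      # N x n_obs

# Reference: solve all n_obs right-hand sides directly.
I_ref = np.linalg.solve(Z, Uv @ Vv)
print(np.allclose(I_all, I_ref))
```

The check works because the solve is linear: applying the inverse of Z to the product of the two factors is the same as applying it to the left factor first.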
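$$\begin{bmatrix} L_{11} & 0 & \cdots & 0 \\ L_{21} & L_{22} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ L_{J1} & L_{J2} & \cdots & L_{JJ} \end{bmatrix} \begin{bmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_J \end{bmatrix} = \begin{bmatrix} V_1 \\ V_2 \\ \vdots \\ V_J \end{bmatrix} \qquad (8.34)$$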
It is observed that the most expensive part of this step is the inversion of the
diagonal submatrices. They should be inverted immediately after the block LU
decomposition. Most operations are then matrix multiplications, which
can be highly parallelized using the BLAS library. Hence, the solution time to find
the current distributions is negligible compared to the block LU decomposition of
the impedance matrix.
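The block forward substitution implied by (8.34), with the diagonal blocks inverted once beforehand, can be sketched as follows. The block sizes, the nested-list storage of the blocks, and the random test data are assumptions made for illustration.

```python
import numpy as np

def block_forward_substitution(L, inv_diag, V):
    """Solve the block lower-triangular system L Y = V.

    L[i][j] is the (i, j) block for j <= i, inv_diag[i] is the precomputed
    inverse of L[i][i], and V is a list of right-hand side blocks.
    """
    Y = []
    for i, rhs in enumerate(V):
        acc = rhs.astype(complex)              # working copy of the right-hand side
        for j in range(i):
            acc -= L[i][j] @ Y[j]              # only matrix-vector products
        Y.append(inv_diag[i] @ acc)
    return Y

rng = np.random.default_rng(0)
sizes = [3, 4, 2]
L = [[rng.standard_normal((sizes[i], sizes[j])) + (5 * np.eye(sizes[i]) if i == j else 0)
      for j in range(i + 1)] for i in range(len(sizes))]
inv_diag = [np.linalg.inv(L[i][i]) for i in range(len(sizes))]
V = [rng.standard_normal(s) for s in sizes]
Y = block_forward_substitution(L, inv_diag, V)
print([y.shape for y in Y])
```

Once the inverses of the diagonal blocks are stored, each new right-hand side costs only block matrix-vector products.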
It is worthwhile to note that all the nondiagonal submatrices of L and U are
low rank as well. Therefore, the RPSA algorithm can also be employed here to
compress all those submatrices.
In this section, several different numerical examples are presented to validate the
RPSA method.
The first numerical example is used to determine how many random rows and
columns are needed to guarantee that RPSA works with the specified accuracy.
An electromagnetic-related impedance submatrix representing interaction
between two well separated groups is employed here for the study of the selection
of the sample numbers.
The size of the impedance submatrix is 280 × 280, and its effective rank is 8,
which is determined numerically according to (8.14).
The relative errors (the ratio of the Frobenius norm, also known as the Hilbert-Schmidt
or Schur norm, of the difference matrix to that of the original matrix) as a
function of the number of samples are shown in Figure 8.8. For each number of samples, we
run the code 10,000 times and select the worst case to calculate the relative error.
From the figure, we can see that the relative error is on the order of $10^{-4}$ when
the number of samples is three times the effective rank, and on the order of $10^{-3}$
when the number of samples is twice the effective rank.
This should be sufficient for most applications.
It should be noted that the above results are based on real data. For
synthetic data in which all the smaller singular values are set to zero, the RPSA
algorithm will be successful if the number of samples is larger than the rank.
To test the accuracy of the RPSA algorithm, a random complex low rank matrix is
generated. The size of the matrix is 1,200 × 1,200, and its rank is 10. The real parts
are shown in Figure 8.9. The imaginary parts are very similar.
Thirty randomly chosen rows and columns are used to obtain its pseudo-
skeleton decomposition. The revealed rank is 10, exactly the same as the ground
truth. The difference between the reconstructed matrix and the original one is
shown in Figure 8.10. Notice that the difference is at the level of $10^{-5}$, which
shows that the RPSA method performs very well.
In this subsection, we will compare the performance of RPSA and ACA.
Four low rank random matrices are generated; their dimensions are 500 × 5,000,
1,000 × 1,000, 1,500 × 1,500, and 2,000 × 2,000, and their ranks are 50, 100, 150,
and 200, respectively.
For ACA, the revealed ranks are 55, 106, 158, and 217, respectively.
Therefore, further compression is necessary for ACA. The termination criterion for
the first three cases is $10^{-6}$. However, this criterion yields a wrong decomposition
for the last matrix, and it has to be tightened to $10^{-8}$ to obtain a good result
in that case. The relative Frobenius norm errors in the
four cases are at the level of $10^{-13}$ (3.43e-14, 1.32e-13, 2.40e-13, and 4.30e-13).
For RPSA, the revealed ranks are exactly the same as the ground truths. This
is because the real rank can be found directly when calculating the pseudo-inverse
of the intersection matrix. The threshold for the pseudo-inverse is set to
0.001 for all cases. The relative Frobenius norm errors in the four cases are at the
level of $10^{-15}$ (4.12e-15, 6.25e-15, 5.97e-15, and 1.07e-14). They are at least one
order of magnitude more accurate than those of ACA.
The CPU time for both algorithms is shown in Figure 8.11. Again we can see
that RPSA is at least one order of magnitude faster than ACA.
The far field of a PEC sphere can be expressed in a closed form via Mie series [35].
Therefore, it is often used to test the accuracy of different electromagnetic solvers.
To this end, the RCS of a PEC sphere versus frequency is calculated.
The sphere is modeled by 72,982 triangles (109,473 unknowns), and its radius
is 5 meters. The variation of the normalized RCS with frequency is shown in
Figure 8.12, where the solid and dashed lines represent the Mie series solution and
numerical results based on RPSA, respectively. The differences between the Mie
series solution and the numerical results are shown in Figure 8.13.
Figure 8.13 Differences between the Mie series solution and numerical results.
From the figures we can see that the numerical result based on RPSA agrees
well with the Mie series solution. The differences of the two methods are generally
less than 0.01 dB.
In this section, we will show how RPSA is applied to reduce the number of
simulations for the multiple monostatic scattering analysis.
A generic fighter-size airplane model, VFY218, is shown in Figure 8.14. It has
been widely used as a benchmark by the Electromagnetic Codes Consortium
(EMCC) [36]. It is 15.5 m long from nose to tail, 4.1 m from top to bottom, and
8.9 m from one wing tip to the other.
At 300 MHz, the VFY218 airplane is modeled by using 79,172 triangles with
an average edge length of 0.071 m. The model is closed; thus, the total number of
unknowns is 118,758 (1.5 times the number of triangles).
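The factor of 1.5 follows from counting edges on a closed triangular mesh: every triangle has three edges and every edge is shared by exactly two triangles. The helper below is a hypothetical name used only to check the two mesh sizes quoted in this chapter.

```python
def edge_unknowns_closed_mesh(n_triangles):
    """Number of edges of a closed triangular mesh: 3 edges per triangle, each shared by 2."""
    return 3 * n_triangles // 2

vfy218_unknowns = edge_unknowns_closed_mesh(79_172)   # VFY218 model at 300 MHz
sphere_unknowns = edge_unknowns_closed_mesh(72_982)   # PEC sphere model
print(vfy218_unknowns, sphere_unknowns)
```

Both results reproduce the unknown counts stated in the text for the airplane and sphere models.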
To obtain the monostatic scattering pattern of this model, RPSA is employed
to compress the right-hand sides. The step sizes for both the azimuthal and elevation angles
are 1°. The revealed ranks are 1,511 and 1,456 for the v- and h-polarizations,
respectively. This means that only 1,511 and 1,456 independent simulations are
needed for the monostatic scattering analysis at all observation angles. The current
distributions at all observation angles (65,160 cases) can be reconstructed using the
1,511 or 1,456 eigen-current distributions for the v- or h-polarizations.
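The idea of solving only for the independent right-hand sides can be sketched on a toy problem. The sketch below substitutes a truncated SVD for RPSA (a deliberate simplification), uses a random well-conditioned matrix in place of the MoM impedance matrix, and uses plane-wave phases at random points as excitations; all names and sizes are illustrative assumptions. With 1° steps, 360 azimuth values times 181 elevation values give the 65,160 cases quoted above, assuming both end points of the elevation range are included.

```python
import numpy as np

def solve_with_compressed_rhs(Z, B, tol=1e-10):
    """Solve Z X = B with only r = rank(B) solves instead of one per column.

    B is factored as U_r diag(s_r) Vh_r (truncated SVD, swapped in here
    for RPSA); Z is applied to the r basis excitations only, and all
    solutions are then recovered by linear combination.
    """
    U, s, Vh = np.linalg.svd(B, full_matrices=False)
    r = int(np.sum(s > tol * s[0]))
    X_basis = np.linalg.solve(Z, U[:, :r])         # r independent solves
    X = X_basis @ (s[:r, None] * Vh[:r, :])        # reconstruct all columns
    return X, r

n_cases = 360 * 181                                # 1-degree steps

rng = np.random.default_rng(0)
n, m = 80, 400                                     # unknowns, incidence angles
Z = (rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
     + n * np.eye(n))                              # stand-in system matrix
pts = rng.uniform(-0.5, 0.5, size=(n, 3))          # positions in wavelengths
phi = np.linspace(0.0, 2.0 * np.pi, m, endpoint=False)
k_dirs = np.stack([np.cos(phi), np.sin(phi), np.zeros(m)], axis=1)
B = np.exp(2j * np.pi * pts @ k_dirs.T)            # plane-wave excitations

X, r = solve_with_compressed_rhs(Z, B)
X_ref = np.linalg.solve(Z, B)                      # one solve per angle
rel_err = np.linalg.norm(X - X_ref) / np.linalg.norm(X_ref)
print(r, m, rel_err)
```

Because the system is linear, the reconstruction is exact up to the truncation tolerance, which is the property the monostatic analysis relies on.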
The monostatic RCS patterns on a cut (θ = 90°, φ = 0°~360°) for different
polarizations are shown in Figure 8.15. The two curves associated with the cross-
polarizations should theoretically be identical because of reciprocity, and
they do agree with each other very well.
Figure 8.15 Monostatic RCS on a cut. (a) VV- and HH-polarizations; and (b) VH- and HV-
polarizations.
Figure 8.16 shows the current distribution on the target surface. The incident
elevation and azimuth angles are 0° and 45°, respectively. We can see that the
surface current is dominated by the physical optics contribution in the lit region. In some local
areas, the current is disturbed by multiple reflections between different
parts.
We also calculate the RCSs on the same cut without applying the randomized
pseudo-skeleton approximation to the right-hand sides. The difference is shown in
Figure 8.17. Clearly, the difference is negligible. Thus, the
application of RPSA to the right-hand sides provides an efficient way to reduce the
simulation time of multiple monostatic scattering cases without sacrificing
accuracy.
To test the performance of the parallelized code, we run the code with different
numbers of threads. Its speed-up and efficiency as a function of the number of
threads are illustrated in Figure 8.18.
It is interesting to notice that the performance is better than the ideal case.
This phenomenon is called super-linear speed-up, which is due to the fact that
more cache memory (faster than normal memory) is available.
We can also see that the efficiency appears to saturate around 50%. This is
due to the block LU decomposition, where some threads have to be idle while
waiting for the results from the other threads.
Figure 8.18 Performance of the parallelized code based on OpenMP: (a) speed-up and (b) efficiency.
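For reference, speed-up and efficiency are conventionally defined as S(p) = T(1)/T(p) and E(p) = S(p)/p. The following sketch uses made-up timings purely to illustrate a super-linear point (E > 1) and a point near 50% efficiency; none of the numbers are measurements from this chapter.

```python
def speedup_and_efficiency(t_serial, t_parallel, n_threads):
    """Return (speed-up, efficiency) for a run on n_threads threads."""
    speedup = t_serial / t_parallel
    return speedup, speedup / n_threads

# Hypothetical wall-clock times in seconds (illustrative only).
print(speedup_and_efficiency(100.0, 20.0, 4))    # super-linear: efficiency above 1
print(speedup_and_efficiency(100.0, 12.5, 16))   # efficiency of 0.5
```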
8.9 SUMMARY
The RPSA algorithm is very simple and efficient compared with other low-rank
approximation methods. Like the popular ACA algorithm, it is purely
algebraic. Therefore, its implementation is independent of the integral equation kernel. Its
computational complexity does not depend on the dimensions of the rank-deficient
matrix, but on its effective rank. In addition to CEM, it could also benefit other
communities where low-rank decompositions are employed.
REFERENCES
[1] R. Harrington, Field Computation by Moment Methods, New York: IEEE Press, 1993.
[2] X. Zhu and W. Lin, “Randomised Pseudo-Skeleton Approximation and Its Application in
Electromagnetics,” Electronics Letters, Vol. 47, No. 10, pp. 590–592, 2011.
[3] V. Rokhlin, “Rapid Solution of Integral Equations of Scattering Theory in Two Dimensions,” J.
Comput. Phys., Vol. 86, No. 2, pp. 414–439, 1990.
[4] N. Engheta, W. Murphy, V. Rokhlin, and M. Vassiliou, “The Fast Multipole Method (FMM) for
Electromagnetic Scattering Problems,” IEEE Trans. Antennas Propag., Vol. 40, No. 6, pp.
634–641, 1992.
[5] W. Chew, J. Jin, C. Lu, E. Michielssen, and J. Song, “Fast Solution Methods in
Electromagnetics,” IEEE Trans. Antennas Propag., Vol. 45, No. 3, pp. 533–543, 1997.
[6] N. Geng, A. Sullivan, and L. Carin, “Fast Multipole Method for Scattering from 3-D PEC
Targets Situated in a Half-Space Environment,” Microw. Opt. Tech. Lett., Vol. 21, No. 6, pp.
399–405, 1999.
[7] J. Song, C. Lu, and W. Chew, “Multilevel Fast-Multipole Algorithm for Electromagnetic
Scattering by Large Complex Objects,” IEEE Trans. Antennas Propag., Vol. 45, No. 10, pp.
1488–1493, 1997.
[8] N. Geng, A. Sullivan, and L. Carin, “Multilevel Fast-Multipole Algorithm for Scattering from
Conducting Targets above or Embedded in a Lossy Half Space,” IEEE Trans. Geoscience and
Remote Sensing, Vol. 38, No. 4, pp. 1561–1573, 2000.
[9] G. Beylkin, R. Coifman, and V. Rokhlin, “Fast Wavelet Transforms and Numerical Algorithms
I,” Comm. Pure Appl. Math., Vol. 44, No. 2, pp. 141–183, 1991.
[10] B. Alpert, G. Beylkin, R. Coifman, and V. Rokhlin, “Wavelet-Like Bases for the Fast Solutions
of Second-Kind Integral Equations,” SIAM J. Sci. Comput., Vol. 14, No. 1, pp. 159–184, 1993.
[11] S. Kapur and D. Long, “IES3: Efficient Electrostatic and Electromagnetic Solution,” IEEE
Comput. Sci. Eng., Vol. 5, No. 4, pp. 60–67, 1998.
[12] S. Seo and J. Lee, “A Single-Level Low Rank IE-QR Algorithm for PEC Scattering Problems
Using EFIE Formulation,” IEEE Trans. Antennas Propag., Vol. 52, No. 8, pp. 2141–2146, 2004.
[13] R. Burkholder and J. Lee, “Fast Dual-MGS Block-Factorization Algorithm for Dense MoM
Matrices,” IEEE Trans. Antennas Propag., Vol. 52, No. 7, pp. 1693–1699, 2004.
[14] D. Gope and V. Jandhyala, “Efficient Solution of EFIE via Low-Rank Compression of
Multilevel Predetermined Interactions,” IEEE Trans. Antennas Propag., Vol. 53, No. 10, pp.
3324–3333, 2005.
[15] M. Bebendorf, “Approximation of Boundary Element Matrices,” Numer. Math., Vol. 86, No. 4,
pp. 565–589, 2000.
[16] K. Zhao, M. Vouvakis, and J. Lee, “The Adaptive Cross Approximation Algorithm for
Accelerated Method of Moments Computations of EMC Problems,” IEEE Trans. Electromagn.
Compat., Vol. 47, No. 4, pp. 763–773, 2005.
[17] J. Shaeffer, “Direct Solve of Electrically Large Integral Equations for Problem Sizes to 1M
Unknowns,” IEEE Trans. Antennas Propag., Vol. 56, No. 8, pp. 2306–2313, 2008.
[18] J. Tamayo, A. Heldring, and J. Rius, “Multilevel Adaptive Cross Approximation,” IEEE Trans.
Antennas Propag., Vol. 59, No. 12, pp. 4600–4608, 2011.
[19] A. Heldring, J. Tamayo, C. Simon, E. Ubeda, and J. Rius, “Sparsified Adaptive Cross
Approximation Algorithm for Accelerated Method of Moments Computations,” IEEE Trans.
Antennas Propag., Vol. 61, No. 1, pp. 240–246, 2013.
[20] A. Peterson, S. Ray, and R. Mittra, Computational Methods for Electromagnetics, New York:
IEEE Press, 1998.
[21] S. Rao, D. Wilton, and A. Glisson, “Electromagnetic Scattering by Surfaces of Arbitrary Shape,”
IEEE Trans. Antennas Propag., Vol. 30, No. 3, pp. 409–418, 1982.
[22] D. Meagher, “Octree Encoding: A New Technique for the Representation, Manipulation and
Display of Arbitrary 3-D Objects by Computer,” Rensselaer Polytechnic Institute Technical
Report IPL-TR-80-111.
[23] H. Eberhardt, V. Klumpp, and U. Hanebeck, “Density Trees for Efficient Nonlinear State
Estimation,” Proceedings of the 13th International Conference on Information Fusion,
Edinburgh, United Kingdom, July 2010.
[24] A. Heldring, J. Rius, J. Tamayo, J. Parron, and E. Ubeda, “Multiscale Compressed Block
Decomposition for Fast Direct Solution of Method of Moments Linear System,” IEEE Trans.
Antennas Propag., Vol. 59, No. 2, pp. 526–536, 2011.
[25] G. Golub and C. Van Loan, Matrix Computations, Baltimore, MD: The Johns Hopkins University
Press, 1996.
[26] E. Liberty, F. Woolfe, P. Martinsson, V. Rokhlin, and M. Tygert, “Randomized Algorithms for
the Low-Rank Approximation of Matrices,” PNAS, Vol. 104, No. 51, pp. 20167–20172, 2007.
[27] F. Woolfe, E. Liberty, V. Rokhlin, and M. Tygert, “A Fast Randomized Algorithm for the
Approximation of Matrices,” Dept. of Computer Science, Yale University, Technical Report
1386, 2007.
[28] E. Michielssen and A. Boag, “A Multilevel Matrix Decomposition Algorithm for Analyzing
Scattering from Large Structures,” IEEE Trans. Antennas Propag., Vol. 44, No. 8, pp.
1086–1093, 1996.
[29] R. Pierri and F. Soldovieri, “On the Information Content of the Radiated Fields in the Near Zone
over Bounded Domains,” Inverse Problems, Vol. 14, pp. 321–337, 1998.
[30] R. Piestun and D. Miller, “Electromagnetic Degrees of Freedom of an Optical System,” J. Opt. Soc.
Am. A, Vol. 17, No. 5, pp. 892–902, 2000.
[31] A. Heldring, J. Tamayo, and J. Rius, “On the Degrees of Freedom in the Interaction Between
Sets of Elementary Scatterers,” 3rd European Conference on Antennas and Propagation, Berlin,
Germany, 2009.
[32] J. Dongarra and F. Sullivan, “Guest Editors’ Introduction to the Top 10 Algorithms,” Computing in
Science and Engineering, Vol. 2, No. 1, pp. 22–23, 2000.
[33] en.wikipedia.org/wiki/OpenMP.
[34] en.wikipedia.org/wiki/Basic_Linear_Algebra_Subprograms.
[35] R. Mautz, “Mie Series Solution for a Sphere (Computer Program Descriptions),” IEEE
Transactions on Microwave Theory and Techniques, Vol. 26, No. 5, p. 375, 1978.
[36] J. Kirklabnd, “The Electromagnetic Code Consortium,” IEEE Antennas and Propagation Society
International Symposium, Chicago, IL, 1992.
Chapter 9
Computational Electromagnetics for the
Evaluation of EMC Issues in Multicomponent
Energy Systems
Osama A. Mohammed and Mohammadreza R. Barzegaran
This chapter reviews the physics modeling based on the electromagnetic stray
fields and interference in the electric power network. The low-frequency as well as
high-frequency equivalent source modeling of the power components for the study
of radiated and conducted electromagnetic compatibility is implemented. The 3-D
finite element analysis with some modifications is applied in the solution method
as well as meshing strategies for the simulation of large-scale components.
Moreover, the stray field of the components is utilized to improve the control of
the machine-drive system using a hardware-in-the-loop method. Design
optimization of components such as the power converter, based on electromagnetic
compatibility (EMC) compliance, is also applied. This is achieved by coupling
MATLAB with the 3-D finite element technique in order to apply numerical
optimization techniques. The results are verified experimentally.
9.1 INTRODUCTION
The compliance with the EMC standards is an increasingly important aspect in the
design of practical engineering systems. Consideration of EMC issues at the design
stage is necessary to ensure the functional safety and reliability of complex
modern products, which are increasingly reliant on electronic subsystems to
provide powering, communications, control, and monitoring functions that are
needed to provide enhanced levels of functionality of systems. Typical examples
include transportation vehicles (road, rail, sea, and air), manufacturing plants,
power generation and distribution, and communications. The opportunities for
using numerical simulation techniques to predict and analyze the system EMC
and related issues (e.g., human field exposure and installed antenna
performance) are therefore of considerable interest in many industries.
For efficient control and use of electric energy, electronics and power
electronics are increasingly used within electrical systems. Examples of such
technologies are solar and wind power conversion systems, electric vehicles,
variable speed drives, and energy-efficient lighting systems. These technologies
are also used in evolving smart grid applications. A basic performance of such
modern electrical systems is related to EMC in the area of low-frequency
disturbances. Based on the above background, the importance of low-frequency
EMC study is increasing considerably.
Power electronic technologies are also used in evolving
machine-drive equipment such as vessels and aircraft. The magnetic signature of such equipment is
observable at low frequencies in the local magnetic field, and several
military applications exploit it, including detection and classification by, and subsequent
detonation of, sea mines and the detection and localization of submarines. Because of the
improvement in the sensitivity of EMF sensors and in smart signal processing,
signature reduction is vital. Thus, the first goal is to decrease the detection
range by complying with strict signature requirements.
The other aspect of signature studies of the radiated fields at low frequencies is
condition monitoring of the components. Faults in the machine windings,
switch failures, and many other problems can be detected without the
need to dismantle the system. This is especially beneficial for sensitive
applications in which it is not easy to get near the components
for online testing and in which offline testing of the components is costly.
Previous work on the investigation of radiated fields in power
systems can be categorized into EMC studies in power systems, electromagnetic
computational modeling studies, electromagnetic signature studies, system
monitoring studies, and fault and failure diagnosis. Electromagnetic
computational modeling is the concern of this chapter.
The modeling process in the field of electromagnetic compatibility means to
establish a connection between the source of interference or any other cause and its
effect that can be the response of the component as the part of the system. This
relationship can be established in several ways, depending on the type of problem,
its complexity, and the degree of approximation with respect to an exact
formulation. The possible methods involve:
• Using circuit theory for describing conducted disturbances, such as voltage
dips, over-voltages, voltage stoppages, harmonics, and common ground
coupling [1, 2];
• Using an equivalent model (usually a circuit) with either distributed or lumped
parameters, such as low-frequency EMF coupling expressed in terms of
mutual inductances and stray capacitances, field-to-line coupling using the
transmission line approximation, and cable crosstalk [3, 4];
• Formulating the problem in terms of formal solutions to Maxwell’s equations
and building analytical models based on them [5];
Generally, the methods used in EMC modeling serve not only to visualize
electromagnetic phenomena but also to predict and suppress interference,
and they can be regarded as either theoretical or experimental.
Here, first the procedure of physics-based modeling for the EMC study is
explained. Then the equivalent source modeling versus 3-D full FE modeling is
discussed. Afterward, the EMC modeling of the converter with the purpose of the
optimization of power electronic component performance is described. Through
these sections, the techniques for the physics-based modeling for special purpose
are discussed.
Table 9.1
Classification of the Numerical Level Necessary to Predict the Different Performance Measures
in the Virtual Test Environment
Model Leveling Model Character Objectives
[Figure 9.1 labels: Internal Cables; Enclosures; Power Bus; AC and DC Choke; EMI Filter.]
Figure 9.1 Decomposition of the modeling problem for creation of numerical test environment.
The device level consists of the physical models of all components,
calculated from a 2-D or 3-D quasi-static electromagnetic FE analysis. At the
device level, the component models can be divided into several subsystems based
on their power range, their location inside the components, their
importance with respect to EMC and EMI issues, their forced outage rate, and the related fault
diagnosis issues. The interface level consists of any resistive, capacitive, or
inductive paths between the enclosures of the components, together with the additional
decoupling capacitors that are used to reduce the area of the ground current loop and
to cut the current path so that it does not enter the control units.
The environmental level consists of the physical model of the chamber filled
with air and the enclosure model of each of the components placed in it.
[Flowchart of the modeling procedure, three columns.
Left column: low-frequency physics-based modeling of each component (power bus, machines, switches, DC choke, EMI filters, etc., including enclosure) using a quasi-static FE solution; creation of the high-frequency equivalent circuit of each element of a component; connection of the elements and creation of each active or passive component separately; connection of all of the components and creation of model 1; simulation of the equivalent circuit of each part in Simulink; creation of each part in FE-based software for simulation of electromagnetic field propagation.
Middle column: low- and high-frequency physics-based modeling of the enclosures while they are placed in the chamber using a quasi-static FE solution; creation of the equivalent circuit of the enclosures, including self and mutual inductances between enclosures, mutual capacitance, and floating ground capacitance (model 2); combination of models 1 and 2; simulation of the global equivalent circuit in Simulink and extraction of the current to ground.
Right column: creation of the chamber with a black box model of the enclosures in FE-based software for simulation of electromagnetic field propagation; excitation of the black box model with simulated ground currents from Simulink; excitation of the finite element model with the current extracted from Simulink.
Outputs: signature study; fault diagnosis and prognostic studies.]
[Figure labels: conducted low-frequency (LF) interference (up to about 10 kHz); radiated low-frequency (LF) interference (up to about 10 kHz); voltage unbalance; DC in AC circuits and vice versa; voltage dips and power interruptions; power frequency variation; transient over-voltages due to lightning or switching; LF electric fields, radiated from circuits with a high dv/dt; LF magnetic fields, radiated from circuits with a high di/dt; HF magnetic fields, radiated from circuits with a high di/dt.]
Figure 9.4 A visualized view of the numerical test environment for machine-drive design.
Figure 9.5 Prototype of the proposed machine (SCIM) in finite element analysis. (a) Actual model
and (b) an equivalent line-shape model for EMI and signature studies. ©IEEE 2011 [10].
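$$\mathbf{B}_l = \frac{\mu_0}{4\pi}\int_l \frac{I_l\, d\mathbf{l} \times \hat{\mathbf{a}}_R}{R^2} \tag{9.1}$$
where l is the length of the line, I_l is the current carried by the line, and â_R
is the unit vector directed from dl toward the observation point at distance R. Similarly, for a volume
current, the radiated magnetic field density at a distance R is as follows:
$$\mathbf{B}_v = \frac{\mu_0}{4\pi}\int_v \frac{\mathbf{J}\, dv \times \hat{\mathbf{a}}_R}{R^2} \tag{9.2}$$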
The idea of this model is to reproduce the same field with a line that
has no cross-section. Hence, by equating (9.1) and (9.2) and treating
J, R, ds, and dl as known parameters, the current amplitude of the line, Il, can
be calculated. The node voltages are calculated similarly by equating the
electric fields due to the charge distributions of the line and of the volume. Each
component has some parameters that must be taken into account in this modeling; they are
explained in the section devoted to that component. More details about the basics of the model are
given in [8].
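The equating step can be sketched numerically. The code below evaluates the line integral of (9.1) with a midpoint rule for a straight segment and then uses the linearity of the field in the current to find the line current that reproduces a given target field magnitude at one observation point. The geometry, the function names, and the single-point matching are illustrative assumptions, not the authors' procedure.

```python
import numpy as np

MU0 = 4e-7 * np.pi  # permeability of free space (H/m)

def biot_savart_line(start, end, current, obs, n=2000):
    """Magnetic flux density of a straight line current, per (9.1).

    The integrand dl x a_R / R^2 is written as dl x r / R^3 and summed
    over n equal sub-segments (midpoint rule).
    """
    start, end, obs = (np.asarray(v, dtype=float) for v in (start, end, obs))
    t = (np.arange(n) + 0.5) / n
    pts = start + np.outer(t, end - start)      # sub-segment midpoints
    dl = (end - start) / n
    r = obs - pts                               # source-to-observer vectors
    R = np.linalg.norm(r, axis=1)
    integrand = np.cross(dl, r) / R[:, None] ** 3
    return MU0 * current / (4.0 * np.pi) * integrand.sum(axis=0)

def equivalent_line_current(b_target_mag, start, end, obs):
    """Line current whose field magnitude at obs equals b_target_mag."""
    b_per_amp = np.linalg.norm(biot_savart_line(start, end, 1.0, obs))
    return b_target_mag / b_per_amp

# Illustrative geometry: 0.15 m line along z, observer 1 m away on the x-axis.
L, d, I = 0.15, 1.0, 5.0
start, end, obs = (0, 0, -L / 2), (0, 0, L / 2), (d, 0, 0)
B = biot_savart_line(start, end, I, obs)
B_exact = MU0 * I * L / (4.0 * np.pi * d * np.hypot(L / 2, d))
I_eq = equivalent_line_current(np.linalg.norm(B), start, end, obs)
print(B, B_exact, I_eq)
```

In practice the target field would come from the volume-current integral (9.2) of the full model, and matching would be done over many observation points rather than one.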
A typical power setup consists of electrical generators, such as a synchronous
generator; electrical motors, such as induction and DC motors; connection
cables; and power converters. All of these components except the converter are
modeled using the equivalent source model, and some of them are verified
experimentally. The converter requires special consideration, and its modeling for stray
field analysis is explained in the next section (Section 9.4). In addition to the study
of each component, their coupling is also studied. Finally, the whole setup is
investigated.
[Figure 9.6 labels: V1 to V8; A, B, C.]
Figure 9.6 Prototype of the proposed cube model for replicating the electric field of the actual
machine. ©IEEE 2011 [10].
Consequently, to have both the magnetic and electric field of the model
simultaneously, these two models [Figure 9.5(b) and Figure 9.6] are combined
together. The combined model propagates similar electric and magnetic fields at
far distances. This model is shown in Figure 9.7.
For simulation purposes, a three-phase, 380-V, 5-A, 120 turn/phase
induction machine with a stack length of 0.15 m and an outer diameter of 0.175 m is
simulated in the 3-D electromagnetic FE domain for a specific time. The meshed
final model is shown in Figure 9.8. The number of degrees of freedom of the
source model is made as large as possible in order to obtain accurate results for
the propagation in the measured areas. In addition, an appropriate element growth rate
is applied to the model, and the analysis tolerance is set to 1e-6.
Figure 9.7 The equivalent cylinder-cube model for reproducing radiated electric and magnetic
fields of the actual machine. ©IEEE 2011 [10].
In order to verify the accuracy of the model, propagated electric and magnetic
fields from the proposed model and the actual machine along three lines in the x-,
y-, and z-directions at a far distance, as shown in Figure 9.9(a), are calculated and
compared, as shown in Figure 9.9(b). The position of the reference lines from
which the propagated electric and magnetic fields are measured is also shown in
Figure 9.9(a). As shown in Figures 9.9(b) and 9.10, the model propagates similar
electric and magnetic fields in comparison with the actual model at far distances.
Because the radiated electric fields along the x and y lines nearly coincide, a
magnified view depicting the details is shown in Figure 9.10.
[Figure 9.9 residue: (a) reference lines labeled In X direction, In Y direction, In Z direction; (b) magnetic field density (T, ×10⁻⁶) versus coordinate (m), from −2 to 2, with curves X, Y, and Z Axis (actual) and (model).]
Figure 9.9 (a) Reference lines from which propagated electric and magnetic fields are measured;
and (b) propagated magnetic field from the actual and proposed model in all three axes.
©IEEE 2011 [10].
[Figure 9.10 residue: electric field (V/m, ×10⁻⁴) versus coordinate (m), from −2 to 2, with curves X, Y, and Z Axis (actual) and (model).]
Figure 9.10 Propagated electric field from the actual and proposed model in all three axes. ©IEEE
2011 [10].
Comparing Figures 9.12(a) and 9.12(b), it can be seen that not only do the
wave shapes of the magnetic field density of the two models match, but their
amplitudes are also almost the same at all points of the plane. Also, the electric fields of
both models, which are shown in Figures 9.12(c) and 9.12(d), are the same at
almost all points. This is also valid for all other planes around the model. In
conclusion, it is verified that the equivalent model can replace the actual model for
the signature study analysis of the single-machine case.
[Figure 9.11 residue: magnetic field density (T, ×10⁻⁶) versus coordinates (m), from −2 to 2, with curves X Axis (actual) and X Axis (model).]
Figure 9.11 Radiated magnetic field from the actual and proposed models in the x-axis at 1-m
distance to the models. ©IEEE 2011 [10].
Since the final goal of this research is to use this model in multicomponent
systems, the model is studied for the two-motor case. This can also be considered
a validation of the cylinder-cube model: the model obtained from the one-motor case is
inserted into the setup to investigate a multimachine drive, while the currents in
the branches of the cylinder and the voltages at the nodes of the cube remain the same
as in the first case (single-machine model). The centers of the coordinates of the two
cubes and cylinders are exactly the same as those of the actual machine models. Figures 9.13
and 9.14 show the comparison between the magnetic and electric fields propagated
from the actual and proposed models for the two-motor case. Note that the
planes and lines used for measuring the fields are the same as in the single-
machine case [see Figure 9.9(a)]. As can be seen, the magnetic and electric fields,
as in the single-machine case, follow the same patterns with good accuracy.
The shift in the electric field signatures measured along the z-axis (see Figure
9.14) is caused by the size of the equivalent model. As discussed before, an
optimization method can be used to fit the size of the model. If the parameters of
the optimization are varied, for example, if the mutation factor is increased,
this shift decreases. The same holds for the magnetic field.
Figure 9.12 Magnetic and electric field spectrums throughout the x-y plane propagated from the
actual machine and the proposed model. ©IEEE 2011 [10].
[Figure 9.13 residue: magnetic field density (T, ×10⁻⁶) versus coordinate (m), from −2 to 2, with curves X, Y, and Z Axis (actual) and (model).]
Figure 9.13 Propagated magnetic field from the actual and proposed models for two motors in three
axes. ©IEEE 2011 [10].
[Figure 9.14 residue: electric field (V/m, ×10⁻¹⁴) versus coordinate (m), from −2 to 2, with curves X, Y, and Z Axis (actual) and (model).]
Figure 9.14 Propagated electric field from the actual and proposed models for two motors in all
three axes. ©IEEE 2011 [10].
One of the achievements of this research is the reduction in simulation time. The
comparison between the simulation times shows that this approach makes the
simulation at least 100 times faster than a full 3-D model. More
details of the comparison are given in Table 9.2.
Table 9.2
Comparison of Computation Time for the Actual and Equivalent Models
intensity) of the two models (actual and equivalent source models) is obtained as
shown in Figure 9.16. The H-field streamlines also show that the equivalent source
model gives results very similar to those of the actual model. They also show that the dipole
pattern forms around the equivalent source model at near distance as well. It should be
noted that the purpose of this model is to obtain similar fields at far distances.
Figure 9.15 Arrow plot of magnetic field density of (a) actual machine and (b) equivalent model in
the x-y plane.
Figure 9.16 Stream-line of H-field of (a) actual machine model in the x-y plane (A/m) and (b)
equivalent source model in the x-y plane (A/m).
Figure 9.17 (a) Decomposition and reconstruction by MRA at two levels (L and H represent the low
and high pass reconstruction filters, respectively) and (b) decomposition and
reconstruction by bandwidth of subsignals.
[Figure 9.18 residue: two columns of plots versus coordinate (m), from −2 to 2; left column, normal magnetic flux density (nT); right column, normal electric field (µV/m). One panel per frequency band: 0.0366<f(Hz)<0.0732, 0.0732<f(Hz)<0.1465, 0.1465<f(Hz)<0.293, 0.293<f(Hz)<0.5859, 0.5859<f(Hz)<1.1719, 1.1719<f(Hz)<2.3438, 2.3438<f(Hz)<4.687, and 4.68<f(Hz)<9.375.]
Figure 9.18 shows the reconstructed magnetic and electric fields in the y-
direction at different frequency bands for one machine. It can be
observed that the equivalent and real machines match acceptably
in almost the entire frequency range.
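The band edges quoted in the panels of Figure 9.18 halve from one band to the next, which is the pattern produced by a dyadic multiresolution analysis: the detail signal at level j covers the band from f_top/2^j down to f_top/2^(j+1). A small sketch that regenerates the edges, assuming a top edge of 9.375 Hz (read from the figure) and eight levels; the function name is hypothetical.

```python
def dyadic_bands(f_top, levels):
    """Band edges (low, high) of successive detail levels of a dyadic MRA.

    Level 0 is the highest band (f_top/2, f_top); each further level
    halves both edges.
    """
    return [(f_top / 2 ** (j + 1), f_top / 2 ** j) for j in range(levels)]

bands = dyadic_bands(9.375, 8)
for lo, hi in bands:
    print(f"{lo:.4f} < f < {hi:.4f}")
```

The small mismatches in the printed figure labels (for example 4.68 versus 4.6875) are consistent with rounding of these values.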
Figure 9.19 is the counterpart of Figure 9.18 for the two-machine case study.
As can be observed, there is acceptable agreement between the equivalent and
real models. Moreover, a comparison of Figures 9.18 and 9.19 indicates that a linear
relationship between the one-machine and two-machine cases exists in all
of the frequency bands. However, the relationship found between the one- and
two-machine cases depends on the specific geometrical arrangement of the
two machines.
[Figure 9.19 residue: (a) normal magnetic flux density (nT) and (b) normal electric field (µV/m) versus coordinate (m), from −2 to 2. One panel per frequency band: 0.0366<f(Hz)<0.0732, 0.0732<f(Hz)<0.1465, 0.1465<f(Hz)<0.293, 0.293<f(Hz)<0.5859, 0.5859<f(Hz)<1.1719, 1.1719<f(Hz)<2.3438, 2.3438<f(Hz)<4.687, and 4.68<f(Hz)<9.375.]
Figure 9.19 Comparisons of (a) normal magnetic and (b) electric field at different frequency bands,
equivalent and real machine, two-machine case of study.
Time-Based Analysis
Since the actual induction machine carries alternating current (AC), a time-based
analysis is more useful. In the previous sections, the analysis was time-based;
however, the figures show only one representative instant of time. In this
section, the radiated EMFs at different time instants in one cycle are studied.
For brevity, four time instants are selected (0.0025 second, 0.005 second, 0.0075
second, and 0.0125 second). The voltage amplitude at the terminal of the model
during one time cycle is shown in Figure 9.20.
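To make the four sampling instants concrete, the sketch below evaluates a balanced three-phase set in per unit at those times. The 50-Hz frequency is inferred from the 0.02-second cycle on the time axis of Figure 9.20, and the sine reference and phase sequence are assumptions, so the values need not match the figure's curves.

```python
import numpy as np

def three_phase_pu(t, freq_hz=50.0, phase0=0.0):
    """Balanced three-phase voltages (per unit) at times t.

    Returns an array of shape (..., 3) for phases A, B, and C, with
    B and C shifted by -120 and +120 degrees (assumed sequence).
    """
    shifts = np.array([0.0, -2.0 * np.pi / 3.0, 2.0 * np.pi / 3.0])
    wt = 2.0 * np.pi * freq_hz * np.asarray(t, dtype=float)[..., None]
    return np.sin(wt + phase0 + shifts)

instants = np.array([0.0025, 0.005, 0.0075, 0.0125])  # seconds
v = three_phase_pu(instants)
for t, row in zip(instants, v):
    print(f"t = {t:.4f} s  A = {row[0]:+.3f}  B = {row[1]:+.3f}  C = {row[2]:+.3f}")
```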
First, the radiated magnetic field at a near distance (0.5 m) from the machine
is studied. The magnetic field density at the four time instants is shown in
Figure 9.21. The result shows that the magnetic field rotates as time
varies, although the position of the maximum field point remains unchanged. It can
be inferred from this result that the model resembles the machine and can be used
in its place at all time instants, not just at the one instant for which the model was
designed. Next, the radiated magnetic field at a far distance (~10 m) from the
machine is studied. At this distance, the field
approaches that of a magnetic dipole, as shown in Figure 9.22
[14]. As shown in this figure, the dipoles are sensitive to time changes and
rotate as time changes. Consequently, the equivalent source model can be
used for time-based analysis at both near and far distances.
[Figure 9.20 residue: terminal voltage (p.u.) versus time (s), from 0 to 0.02 s, with curves Phase A, Phase B, and Phase C.]
Figure 9.20 Voltage amplitude of the terminal of the model during one time cycle.
Figure 9.21 Magnetic field density (B) of equivalent model in four different moments of time at near
distance (a) t = 0.0025 second; (b) t = 0.005 second; (c) t = 0.0075 second; and (d) t =
0.0125 second.
Another condition that should be studied for the induction machine is the effect of
its position. In many cases, the location of the motor with
respect to the measurement points changes, and the electromagnetic
signatures are expected to change accordingly. Hence, one specific change is
studied here: the whole machine was rotated around an axis, and the results are
illustrated in Figure 9.23. The magnetic field in this figure is plotted
at a far distance.
Figure 9.22 Magnetic field density (B) of equivalent model at the four different instants of time
at far distance.
[Figure 9.23 residue: magnetic field density (T, ×10⁻⁶) versus coordinate on the x-axis (m), from −20 to 20, with curves for 0, 20, 40, 60, and 80 deg.]
Figure 9.23 Deviation of magnetic field density (B) of the equivalent model due to the rotation of
the whole machine around the z-axis.
(x, y) from right to left. When other angles ranging from 90° to 180° are plotted,
the results are exactly symmetrical to those for 0° to 90°. The magnetic field
density for a 180° rotation is exactly the same as that for 0°. This study is
useful for identifying the orientation of the source machine from its signatures
at far distances. All of these studies can be supplied to a program such as a
genetic algorithm or a neural network, so that the machine can be recognized in
any situation.
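The symmetry noted above can be checked with a minimal sketch under a strong simplification: the machine is replaced by a single point magnetic dipole at the origin, rotated about the z-axis, and the field magnitude is sampled along a line parallel to the x-axis. The 10-m offset, the unit moment, and the function names are assumptions for illustration only.

```python
import math

MU0 = 4.0e-7 * math.pi  # permeability of free space (H/m)

def dipole_b_magnitude(moment, point):
    """|B| of a point magnetic dipole at the origin, observed at `point`."""
    r = math.sqrt(sum(c * c for c in point))
    r_hat = [c / r for c in point]
    m_dot_r = sum(m * c for m, c in zip(moment, r_hat))
    scale = MU0 / (4.0 * math.pi * r ** 3)
    b = [scale * (3.0 * m_dot_r * c - m) for m, c in zip(moment, r_hat)]
    return math.sqrt(sum(c * c for c in b))

def signature_on_line(angle_deg, xs, y=10.0, moment=1.0):
    """|B| along the line (x, y, 0) for a dipole rotated about the z-axis."""
    a = math.radians(angle_deg)
    m = [moment * math.cos(a), moment * math.sin(a), 0.0]
    return [dipole_b_magnitude(m, [x, y, 0.0]) for x in xs]

xs = [float(x) for x in range(-20, 21, 4)]       # symmetric sample points (m)
ref = signature_on_line(0.0, xs)
half_turn = signature_on_line(180.0, xs)         # same magnitudes as 0 degrees
sixty = signature_on_line(60.0, xs)
mirrored = signature_on_line(120.0, xs)[::-1]    # 180 - 60 degrees, x reversed
```

For this idealized source, a 180° rotation only reverses the sign of the moment, so the magnitude profile is unchanged, and the profile for 180° − θ is the mirror image of the profile for θ.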
Table 9.3
The Characteristics of the Components
Components Description
Figure 9.24 The studied experimental setup including the machine and measurement tools.
The coil antenna and the real-time spectrum analyzer used in the measurement
are designed for high-precision low-frequency analysis. The frequency range is
20 Hz to 500 kHz. The antenna winding has 36 turns of 7-41 litz wire, shielded,
with 10-ohm resistance and 340-µH inductance. The antenna and the setup are
positioned according to the standards (MIL-STD-461 [16], MIL-STD-462). The
spectrum analyzer covers 1 Hz to 3 GHz with ±0.5-dB absolute amplitude
accuracy up to 3 GHz. The details of the components are listed in Table 9.3.
[Plot (a): magnetic field intensity (dBµA/m) versus arc length (m), from −2.5 to 2.5 m, for the measurement, the 3DFE model, and the equivalent model. Panel (b): region of the model.]
Figure 9.25 (a) The magnetic field intensity at 55 cm away from the setup in the y-axis while all
components except IM were off at 60 Hz (dBµA/m), and (b) the region of the model
(the model is in the center, and the measured line is shown in dark grey).
The nominal voltage is applied to the machine and the stray magnetic field is
obtained at various distances. The results of the measurement, the full 3-D FE
model, and the equivalent source model at 60 Hz are shown in Figure 9.25. The
magnetic field intensity, the standard index in signature studies, is expressed in
dBµA/m for comparison.
As illustrated in Figure 9.25, the signatures from the two simulation models
match the measurement. The measured curve does not show the distortions seen
in the simulated curves because it contains far fewer sample points than the
simulation results, especially those of the equivalent source model. Along the line
in the y-axis that is used as the measured line, the equivalent source model has
260 points, whereas the measurement has about 10 points. The measured line is
also shown in Figure 9.25.
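The dBµA/m unit used for the comparison is a logarithmic level relative to 1 µA/m. The following minimal sketch shows the conversion in both directions; the function names are hypothetical.

```python
import math

REFERENCE_A_PER_M = 1.0e-6  # 1 uA/m, the reference level of the dBuA/m scale

def a_per_m_to_dbua_per_m(h_a_per_m):
    """Convert a magnetic field intensity from A/m to dBuA/m."""
    return 20.0 * math.log10(h_a_per_m / REFERENCE_A_PER_M)

def dbua_per_m_to_a_per_m(level_db):
    """Convert a level in dBuA/m back to A/m."""
    return REFERENCE_A_PER_M * 10.0 ** (level_db / 20.0)

# A level of -40 dBuA/m corresponds to 0.01 uA/m.
h = dbua_per_m_to_a_per_m(-40.0)
```

Because the scale is logarithmic, a 20-dB difference between two curves corresponds to a factor of 10 in field intensity.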
9.3.2 DC Motor
The induction motor discussed in Section 9.3.1 has only armature and field
windings, and in terms of windings it is the simplest machine. In contrast, the DC
machine has four types of winding: armature, field, compensation, and
commutating windings. Therefore, modeling and merging them as in Section 9.3.1
is not easy. Each of these windings has a specific design that causes a specific type
of electromagnetic signature. Since each winding produces a differently shaped
radiated field at far distances, each is simulated and modeled individually, and
finally all of them are combined into one model.
The second part of the modeling is finding the appropriate size of the model,
which is very important in both far-field and near-field computation.
Basically, the dimensions of the model are based on the size of the machine, but
for more precise results an optimization method is used. The proposed
optimization process is the GA-based PSO explained in Section 9.3.1.
In this method, the quantities to be optimized are the dimensions of the model,
including both the number of dimensions and their lengths; the number of
dimensions can vary from that of a cone or a cube to that of a polyhedron.
A typical schematic of this aspect of modeling is shown in Figure 9.26.
Finally, by combining the previously mentioned methods and strategies, the
equivalent model is obtained. For better investigation and a more accurate
model, the equivalent model of each winding is obtained and shown in
Figures 9.27(a–d).
Figure 9.27 Equivalent models of (a) armature winding; (b) commutation winding; (c) compensation
winding; and (d) field winding in equivalent DC machine.
For simulation purposes, an 800-HP, 750-V, 8-pole, 185-RPM propulsion DC
machine with a length of about 3 m and an outer diameter of 1.7 m is simulated in a
3-D electromagnetic FE domain for one time instant. The actual model and the
mesh structure of this machine in the FE domain are shown in Figures 9.29(a) and
9.29(b), respectively.
Figure 9.29 Schematic of (a) the detailed model of the DC machine and (b) the mesh in FE domain.
The analysis of the actual machine model requires about 7 million degrees of
freedom in the FE analysis, which leads to a simulation time of about
43,000 seconds (~12 hours). In contrast, the equivalent model, with less than 1
million degrees of freedom, takes about 300 seconds (~5 minutes).
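The run times quoted above can be put in perspective with a few lines of arithmetic. This is only a unit conversion and a ratio of the two reported times; the function name is hypothetical.

```python
def speedup(full_seconds, reduced_seconds):
    """Ratio of the full-model run time to the equivalent-model run time."""
    return full_seconds / reduced_seconds

dc_full_seconds = 43_000       # actual DC machine model, as reported
dc_reduced_seconds = 300       # equivalent model, as reported

dc_full_hours = dc_full_seconds / 3600.0        # about 11.9 hours
dc_reduced_minutes = dc_reduced_seconds / 60.0  # 5 minutes
dc_speedup = speedup(dc_full_seconds, dc_reduced_seconds)  # about 143 times faster
```

For the reported figures, the equivalent model runs roughly 143 times faster than the detailed model.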
The solver used in this analysis is the generalized minimal residual method
(usually abbreviated GMRES) with successive over-relaxation (SOR) pre- and
post-smoothers, which was explained in [9].
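As a reminder of what an SOR sweep does, the following minimal sketch applies the SOR iteration as a standalone solver to a small, hypothetical 3×3 system. It does not reproduce the GMRES solver or the FE package used in the chapter, where SOR acts only as a smoother; the matrix, right-hand side, relaxation factor, and function name are assumptions for illustration.

```python
def sor_solve(a, b, omega=1.2, sweeps=100):
    """Solve a x = b by successive over-relaxation, starting from x = 0."""
    n = len(b)
    x = [0.0] * n
    for _ in range(sweeps):
        for i in range(n):
            # Sum over the off-diagonal terms, using the newest values of x.
            sigma = sum(a[i][j] * x[j] for j in range(n) if j != i)
            gauss_seidel = (b[i] - sigma) / a[i][i]
            # Blend the old value with the Gauss-Seidel update.
            x[i] = (1.0 - omega) * x[i] + omega * gauss_seidel
    return x

# Hypothetical symmetric positive definite system with exact solution [1, 2, 3].
A = [[4.0, -1.0, 0.0],
     [-1.0, 4.0, -1.0],
     [0.0, -1.0, 4.0]]
rhs = [2.0, 4.0, 10.0]
solution = sor_solve(A, rhs)
```

With omega = 1 the update reduces to the Gauss-Seidel method; values between 1 and 2 over-relax the update, which is the variant named in the text.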
After simulating both the actual and the equivalent models, the propagated
electric and magnetic fields are evaluated at different locations at a distance
from the source. Figure 9.30 shows the propagated magnetic fields from both
models along different lines.
Figure 9.30 shows that the magnetic field propagated from the actual machine
has different wave shapes along the various measured lines, so it can be inferred
that a single dipole cannot serve as an equivalent model, because a single dipole
shows similar results in all planes. In addition to the wave shapes of the fields,
their amplitudes along the various measured lines also differ, which is another
reason to use an embedded equivalent model. This point can also be seen in the
radiated electric field wave shapes (see Figure 9.31), although the differences
among the electric field wave shapes measured along the various lines are small
and can hardly be recognized. For example, comparing the electric fields in
Figures 9.31 and 9.32, it can be seen that near the peak the results of the actual
machine and the equivalent model differ.
[Plot (a): magnetic field density (T, ×10⁻⁴) versus coordinates (m), from −20 to 20 m, for the actual and equivalent models along lines a, b, and c.]
[Plot: electric field (V/m) versus coordinates (m), from −20 to 20 m, for the actual and equivalent models in cases b and c.]
Figure 9.31 Radiated electric field (case b in Figure 9.30) in the x-y plane when x varies from −20
to 20 and (case c in Figure 9.30) in the y-z plane when z varies from −20 to 20.
[Plot: electric field (V/m) versus coordinates (m), from −20 to 20 m, for the actual and equivalent models in case a.]
Figure 9.32 Radiated magnetic field density in the x-z plane (case d in Figure 9.30) when x varies
from −20 to 20.
In Figures 9.30, 9.31, and 9.32, the magnetic and electric fields of the two
models along single lines are compared and show reasonable similarity. However,
one may argue that dissimilarities might appear if the fields are evaluated along
other lines of a plane. In other words, the measured lines are in the middle of the
planes, where a symmetric dipole pattern of the propagated field is more likely to
occur in the equivalent model, but other lines may not give this type of result.
Hence, for further investigation, the fields are evaluated over a plane and the
results are depicted in Figures 9.33 and 9.34.
Comparing Figure 9.33(a) with Figure 9.33(b), and Figure 9.34(a) with
Figure 9.34(b), it can be seen that the fields propagated from the equivalent model
are very similar to those of the actual model, not just along one line but over a
whole slice.
Figure 9.33 Magnetic field density of (a) the actual machine and (b) the equivalent model.
Figure 9.34 Electric field of (a) the actual machine and (b) the equivalent model (mV/m).
As mentioned in Section 9.3.1, the main goal of this chapter is to study the
signature of a multimachine system. Therefore, to further validate the proposed
equivalent model, a two-machine system is designed. Two equivalent models
of the studied DC machine are placed close to each other and the analysis is then
applied. The branch currents and node voltages of the equivalent model in the
multimachine study are exactly the same as those in the single-machine system.
Figures 9.35 and 9.36 show the comparison between the magnetic and electric
fields propagated from the actual and the proposed models along several lines for
two-motor cases. As can be seen, the magnetic and electric fields follow the same
patterns with excellent accuracy. For brevity, only some planes and lines from
measured planes are considered, which are illustrated in Figure 9.35(b). All other
lines and planes show similar accuracy.
[Plot (a): magnetic field density (T, ×10⁻⁴) versus coordinates (m), from −20 to 20 m, for the actual and equivalent models along lines a, b, and c.]
When the machines operate at a different power rating, the variation
coefficients of the voltages and currents of each actual machine can be applied to
the respective equivalent model.
[Plot: electric field versus coordinates (m), from −20 to 20 m, for the actual and equivalent models in cases a and c.]
Figure 9.36 Radiated electric field of the two-machine case: (case b) in the x-y plane when x varies
from −20 to 20; (case c) in the y-z plane when z varies from −20 to 20.
Figure 9.37 Magnetic field density of (a) the actual machine and (b) the equivalent model.
Figure 9.38 Electric field of (a) the actual machine; and (b) the equivalent model.
different in the synchronous machine. The field winding carries DC and the
armature winding carries AC. Hence, individual models should be made for each
of these windings. The equivalent models of the armature and field windings are
shown in Figures 9.40(a) and 9.40(b). The effect of each winding on the total
signature is investigated next.
Figure 9.39 Prototype of synchronous machine: (a) actual machine; and (b) equivalent model.
Figure 9.40 Equivalent model of individual windings: (a) armature winding; and (b) field winding.
After defining the final equivalent model, the simulation is implemented in the FE
domain. The 3-D electromagnetic FE method is used as an acceptable method for
physics-based simulation. For implementation purposes, a three-phase, 600-kW,
600-V, 1,200-RPM synchronous generator is simulated in a 3-D electromagnetic
FE domain for one time instant. The analysis for the model of the actual machine
requires about 5.5 million degrees of freedom in the FE analysis. This causes the
simulation time to be about 38,000 seconds (~10.5 hours). However, the equivalent
model contains less than 1 million degrees of freedom and takes about 270 seconds
(4.5 minutes).
After solving the problem with the FE model, the magnetic field density
propagated from the machine with and without the armature winding is evaluated
under two conditions, as shown in Figure 9.41.
[Plots (a) and (b): magnitude of the magnetic field density (×10⁻⁶) versus coordinate (m), from −4 to 4 m, with and without the armature winding.]
Figure 9.41 Magnetic field density propagated with and without the armature winding along (a) the
x-axis in the x-z plane; and (b) the x-axis in the x-y plane.
Figure 9.42 EMF comparisons in three planes: (a) magnetic field of actual machine; (b) magnetic
field of equivalent model; (c) electric field of actual machine (V/m); and (d) electric
field of equivalent model (V/m).
The basics of the modeling of cables are similar to the previous cases. However,
since this component is not an electrical machine, some additional considerations
apply.
The actual physical modeling of cables for signature studies requires all the
details to be considered, even in a large region. Cross-linked polyethylene
(XLPE or PEX) cables, like all electromagnetic sources, radiate dipole-like fields
at a far distance. However, the interaction of several components, such as
electrical machines and power converters, modifies the shape and the amplitude of
the dipoles. Therefore, each model should be designed and studied independently.
Nevertheless, there is a problem: the modeling of the relatively small layers of
multicore XLPE cables. The studied region could be about 20,000 times bigger,
which causes deformation of the cable model during meshing in numerical
modeling methods such as the FEM. The present study is performed on an
XLPE-insulated, armored, polyvinyl chloride (PVC) sheathed cable (0.6/1 kV).
Figure 9.43 shows the typical model, as well as the original and deformed
models of the studied cable in the FE analysis environment. To solve this
issue, a dedicated multidipole model with several line currents and node voltages
is designed, which resembles the actual model of the cable for signature studies.
Figure 9.43 Models of the proposed cable in FE design: (a) typical model; (b) original FE
model; and (c) deformed FE model.
The multidipole model of the studied cable is shown in Figure 9.44. A typical
node voltage and line current are displayed in the figure.
[Figure labels: voltage point; current line.]
For simulation purposes, a unit length of the actual XLPE cable and of the model
cable are first simulated and compared using the FEM. Afterward, various
directions of the cable are studied. The cable is then analyzed in multipermittivity
areas, such as undersea. As mentioned earlier, the proposed cable is the
XLPE-insulated, armored, PVC sheathed cable (0.6/1 kV).
Initially, the radiated EMFs of the proposed model and the full model of the cable
are evaluated and compared. To avoid deformation, the actual model is
simulated with a large number of elements, which is feasible only in
simple situations, such as a unit length of cable. This case is studied by applying
two different voltages to the ends of the cable. The field spectra radiated from
the actual and the proposed models are shown in Figures 9.45(a) and 9.45(b),
respectively. Comparing the results in Figures 9.45(a) and 9.45(b) shows that the
proposed model has very good accuracy.
Figure 9.45 Radiated magnetic field density of (a) the actual model; and (b) the equivalent model in
tesla. Note that the cable is very small compared to the region.
Since cables are symmetrical, the radiated fields are the same in all planes of
the region, similar to a simple dipole. Nevertheless, the radiated fields in two
planes are evaluated, and the result shows that the proposed multidipole model
propagates radiated fields similar to those of the actual model. A similar study is
carried out for the radiated electric field, and the result is shown in Figure 9.46.
As shown in the figure, the radiated electric field of the proposed model matches
the actual one. Therefore, both indices of the signature study are represented
accurately by the proposed model, while its simulation time is about 100 times
shorter.
Figure 9.46 Radiated electric field of unit length of the cable (a) for the actual model and (b) for the
equivalent model (mV/m).
Multidirectional Cables
The XLPE cables connecting two components may have many curves or twists;
therefore, various magnetic dipoles are established and the radiated fields
consequently differ. In multidirectional cable analysis, the simulation time
increases significantly, and in cases of coupling with other components the
simulation may become impossible because of the growing number of tiny spaces
between the fragments of each component while the region is huge. As an initial
case of multidirectional cable, perpendicular cables are placed in the same region
and the radiated EMFs are evaluated at a far distance, as displayed in
Figures 9.47 and 9.48. Similar to the single-cable case, the proposed model shows
great accuracy. Additionally, the difference in simulation time between the actual
model and the equivalent model increases.
Figure 9.47 Radiated magnetic field density of perpendicular cables case (a) for the actual model
and (b) for the equivalent model in tesla.
Comparing Figure 9.47 with Figure 9.45, the maximum point of the radiated
magnetic field density in the lateral plane has moved to the corner. This is
because of the interaction of the dipoles of the two perpendicular cables. Since the
source of the signature is no longer symmetrical, the radiated fields in the two
shown planes are different.
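The loss of symmetry can be illustrated with a minimal sketch that idealizes each cable as one point magnetic dipole at the origin; this is a simplification for illustration, not the chapter's multidipole model. The moments, observation points, and function names are assumptions, and the common constant factor of the dipole formula is omitted.

```python
import math

def dipole_b(moment, point):
    """Field of a point magnetic dipole at the origin (constant factor omitted)."""
    r = math.sqrt(sum(c * c for c in point))
    r_hat = [c / r for c in point]
    m_dot_r = sum(m * c for m, c in zip(moment, r_hat))
    return [(3.0 * m_dot_r * c - m) / r ** 3 for m, c in zip(moment, r_hat)]

def total_magnitude(moments, point):
    """Magnitude of the superposed field of several dipoles at one point."""
    total = [0.0, 0.0, 0.0]
    for m in moments:
        total = [t + c for t, c in zip(total, dipole_b(m, point))]
    return math.sqrt(sum(c * c for c in total))

single = [[1.0, 0.0, 0.0]]                          # one dipole along x
perpendicular = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]  # two dipoles at right angles

right = [5.0, 5.0, 0.0]    # two observation points mirrored about the y-axis
left = [-5.0, 5.0, 0.0]

single_right = total_magnitude(single, right)
single_left = total_magnitude(single, left)
pair_right = total_magnitude(perpendicular, right)
pair_left = total_magnitude(perpendicular, left)
```

For the single dipole the two mirrored points see the same magnitude, whereas for the perpendicular pair the field at one corner is twice that at the other in this idealized setting, consistent with the maximum shifting toward a corner.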
Figure 9.48 Radiated electric field of perpendicular cables case (a) for the actual model and (b) for
the equivalent model.
Moreover, to verify the model and extend the study to all dimensions, a more
complex multidirectional cable is analyzed. To do so, four discontinuous units of
the cable are placed arbitrarily at different angles (see Figure 9.49).
Similar to the previous cases, the magnetic and electric fields radiated from the
cables are obtained and shown in Figures 9.50 and 9.51. The results of the
multicable case are not at all similar to those of the single-cable case because of
the presence of cables at various angles. The proposed model matches the actual
model in this case, as in the previous cases. Note that the lines around the region
serve to refine the mesh in the measured planes in order to obtain more accurate
radiated fields.
Figure 9.50 Radiated magnetic field of multidirectional cables case (a) for the actual model and (b)
for the equivalent model.
Figure 9.51 Radiated electric field of multidirectional cables case (a) for the actual model and (b) for
the equivalent model (mV/m).
Figure 9.52 Radiated electric field of the cables in multipermittivity area (a) for the actual model and
(b) for the equivalent model (V/m).
For further verification and studying the application of this type of modeling, the
proposed model is analyzed in connection with a power component. A
synchronous generator is coupled with a multicore XLPE cable. The modeling of a
synchronous generator was explained in Section 9.3.3. The actual and equivalent
models of the cable connected to the machine are shown in Figure 9.53.
Figure 9.53 Schematic of the synchronous machine connected to the cable: (a) the detailed model;
and (b) the equivalent model.
The rated voltage is applied to the cable, which is connected to the machine,
and the radiated field is evaluated at a far distance from the sources. The
current and voltage values of the equivalent model are calculated based on the
individual actual models of the machine and the cable. Figure 9.54 shows the
propagated field of both models along the x-axis in the x-y plane. The proposed
line is also shown in the figure. The difference in amplitude between the two
models is due to the superposition of materials. Since the cable and the machine
are so close together, there is a superposition effect in the magnetic field. The
magnetic field radiated from the cable is induced into the machine and creates an
induced current, which radiates an additional field from the machine. This situation
cannot be simulated perfectly in the proposed multidipole modeling, which results
in a difference between the curves. To clarify the effect of each component
on the total radiated field in Figure 9.54, the radiated field of each component is
calculated and shown in Figure 9.55. As shown in the figure, the contribution of the
cable's radiated field is smaller than that of the machine. This is because of the
volume of the machine and its effect on the current density, which builds the
magnetic field.
[Plot (b): magnetic field density (T, ×10⁻⁹) versus coordinates (m), from −20 to 20 m, for the actual and equivalent models. Panel (a): problem configuration.]
Figure 9.54 Radiated magnetic field density along the x-axis in the x-y plane for the actual and
equivalent models. (a) Problem configuration. (b) Radiated magnetic field density along
the x-axis in the x-y plane.
[Plot: magnetic field density (T, ×10⁻⁹) versus coordinates (m), from −20 to 20 m, for three cases: both on, only machine on, and only cable on.]
Figure 9.55 Radiated magnetic field density along the y-axis in the x-y plane for three cases.
Figure 9.56 Schematic power setup (a) for the full FE model and (b) for the equivalent model.
The synchronous generator and the induction motor are switched on and off to
observe their effects separately and to verify the equivalent source model. In the
following, the synchronous generator is turned on while the other components are
switched off.
As shown in Figure 9.56, the electric and magnetic fields radiated from the
wire model of the synchronous generator match the fields radiated from the
actual machine. The electric fields in Figure 9.57 agree with Maxwell's radiation
theory, since the electric field is in the direction of the poles of the terminal voltage.
That is why the propagated field in Figure 9.57 is stronger in the frontal plane
than in the lateral planes. Conversely, the magnetic fields are established
perpendicular to the direction of the currents; thus, in Figure 9.58 the field in the
lateral planes is stronger than that in the frontal plane.
Figure 9.57 Radiated electric field of (a) the actual model and (b) the equivalent model in tesla while
the synchronous generator is turned on and other components are off.
Figure 9.58 Radiated magnetic field density of (a) the actual model and (b) the equivalent model in
tesla while the synchronous generator is turned on and other components are off.
After testing the fields of each machine separately, the fields of the coupled
motor-generator are evaluated, as shown in Figures 9.59 and 9.60.
Comparing Figure 9.60 with Figure 9.58, the amplitude of the electric field
decreases when the induction motor is connected to the generator. This is because
the terminal voltage of the motor and the generator voltage are out of phase, so the
electric field is not cumulative. However, the direction of the motor current is the
same as that of the generator. In any case, the radiated field of the equivalent
model matches the radiated field of the actual model.
Figure 9.59 Radiated electric field of (a) the actual model and (b) the equivalent model in tesla while
the coupling of machines (generator-motor) is turned on and others are off.
Figure 9.60 Magnetic stray field density of (a) the actual model and (b) the equivalent model in tesla
while the coupling of machines (generator-motor) is turned on and others are off.
Finally, all the components are brought together: the excitation is applied to
the generator and the motor, and a pulse load and the connection cable are
connected to them. The details of the components are given in Table 9.4. The
model is analyzed in full detail in the FE domain. In addition, the equivalent
source model is used to model all these components.
The models are shown in Figure 9.61. As shown in this figure, the equivalent
source model consists of numerous lines carrying specific currents, with
voltages established at the nodes of these wires.
Table 9.4
The Details of the Components in the Tested Setup
Component | Characteristics
Synchronous generator | 13.8 kW, PF: 0.8, length: 25 cm, diameter: 28-30 cm, pole: 4, RPM: 1800, nominal voltage: 230 V, amp: 39.5 A, exc. voltage: 37 V, exc. amp: 1.9 A
Induction machine | 5.5 kW, PF: 0.85, length: 30 cm, diameter: 25 cm, pole: 4
Connection cable | XLPE, diameter: 5 cm, insulated and armored PVC sheathed cable
Figure 9.61 Schematic of the power setup: (a) the full FE model and (b) the equivalent model.
[Plot (a): magnetic flux density (T, ×10⁻⁵) versus length of the line (m), from 0 to 2 m. Panel (b): the problem model.]
Figure 9.62 (a) Radiated magnetic field of the induction machine on the cable, while only the
induction motor is turned on; and (b) the problem model.
The magnetic flux densities radiated from the actual and the equivalent source
models are obtained from the simulation at 7 m from the arrangement and are shown
in Figure 9.63. As illustrated in the figure, the magnetic flux density radiated from
the equivalent source model is similar to that of the actual model. The small
difference between the maximum values of the magnetic flux densities of the two
models is due to the superposition of the components.
Figure 9.63 Radiated magnetic flux density of (a) the actual model and (b) the equivalent model in
tesla.
The optimized result is shown in Figure 9.64. As can be seen, the magnitude
of the radiated magnetic field of the optimized equivalent source model is almost
the same as the actual model.
Figure 9.64 Radiated magnetic flux density of the optimized equivalent source model.
Figure 9.65 The studied setup including machines, measurement devices, and control drive (for
switching).
For the experimental test, all components, including the synchronous generator,
the induction motor, and the electric load, are turned on. The cables carry
current, so they can also be considered on. The same test as in the previous case is
performed here. All switches are turned on, and the H-field is measured
experimentally and also obtained from the simulation models. The machines are
tested at their nominal voltages. The magnetic field intensity (H-field) from the
measurement and the simulation models is shown in Figure 9.66. As shown in this
figure, the full FE model and the equivalent source model give a radiated
H-field similar to the measurement. The small differences in amplitude are
due to the effect of the bodies of the other components around the system.
This study can be applied to system monitoring and fault diagnosis, as
studied in [18].
[Plot: magnetic field intensity (dBµA/m) versus arc length (m), from −2.5 to 2.5 m, for the measurement, the 3DFE model, and the equivalent model.]
Figure 9.66 The measured magnetic field intensity at 55 cm from the setup in the y-axis while all
components are turned on at 60 Hz (dBµA/m).
Each model designed so far is suitable for only one specific situation, defined by
the terminal voltage rating and the physical geometry of the machine. The model
proposed here is optimized so that it can be used for various machine sizes and
operating voltages.
The procedure involves measuring (numerically evaluating) the radiated fields of
an actual AC machine model with a basic size. If the model of the proposed size is
not available, the fields can be estimated using the related equations, based on the
fields measured for the basic size [19]. Optimization factors are then applied as
follows:
$$K_{BS} = \frac{B_{iSnew}}{B_{base}}, \qquad K_{ES} = \frac{E_{iSnew}}{E_{base}} \tag{9.3}$$
where Bbase and Ebase are the magnetic field density and electric field of the basic
case, and BiSnew and EiSnew are magnetic field density and electric field of any
machine size. These parameters could be measured at any random points around
the component (e.g., the maximum B in a plane at a distance from the component).
The KBS and KES factors in (9.3) are applied, respectively, to the currents and
voltages of the equivalent source model to optimize the model for a new machine
with a different size. These factors are applied due to the fact that the magnetic
field density, which is used in these equations, shows strong correlation with the
magnitude of the current of the lines in nonvolumetric models (Biot-Savart Law).
Similarly, the electric field has the same relation with the voltage at the nodes [19].
A similar procedure can be applied for variations of the terminal voltage. In this
case, the factors are as follows:
$$K_{BV} = \frac{B_{iVnew}}{B_{base}}, \qquad K_{EV} = \frac{E_{iVnew}}{E_{base}} \tag{9.4}$$
where BiVnew and EiVnew are the magnetic field density and electric field for any
proposed terminal voltage. The KBV and KEV factors in (9.4) are likewise applied
to the currents and voltages, respectively, of the equivalent source model to
optimize it for a new case with a different terminal voltage. If a case has both
voltage and size variations, both factors are multiplied by the current and voltage
values of the basic equivalent model.
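The use of the factors in (9.3) and (9.4) can be sketched as follows. The data structure, function names, and all numerical values are hypothetical and serve only to show how the B-based factors scale the branch currents and the E-based factors scale the node voltages.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class EquivalentSourceModel:
    branch_currents: List[float]   # line currents of the model (A)
    node_voltages: List[float]     # node voltages of the model (V)

def factor(new_field, base_field):
    """Ratio of a field quantity of the new case to that of the base case."""
    return new_field / base_field

def rescale(model, k_b, k_e):
    """Apply the B-based factor to currents and the E-based factor to voltages."""
    return EquivalentSourceModel(
        [k_b * i for i in model.branch_currents],
        [k_e * v for v in model.node_voltages],
    )

# Hypothetical base model and field values (not taken from the chapter).
base = EquivalentSourceModel(branch_currents=[10.0, -10.0], node_voltages=[230.0, 0.0])
k_bs = factor(2.4e-6, 2.0e-6)   # size variation, magnetic field density (T)
k_es = factor(0.9, 0.6)         # size variation, electric field (V/m)
k_bv = factor(1.0e-6, 2.0e-6)   # voltage variation, magnetic field density (T)
k_ev = factor(0.3, 0.6)         # voltage variation, electric field (V/m)

# With both size and voltage variations, the two factors are multiplied.
scaled = rescale(base, k_bs * k_bv, k_es * k_ev)
```

Keeping the factors as plain ratios makes it straightforward to replace the hypothetical values with factors read from fitted curves.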
Since some material properties of machines, such as the permeability, are
nonlinear, it is not possible to use the currents and voltages instead of B and E
in (9.3) for all working conditions. It might, however, be possible to replace the
magnetic field density (B) with the current or other parameters over a specific
range of currents. Hence, the most reliable parameters are the magnetic field
density and the electric field, which are used in (9.3) and (9.4). To avoid
modeling the actual machine in each different case to obtain BiSnew, EiSnew, BiVnew,
and EiVnew in (9.3) and (9.4), the related curves of the four factors (KBS, KES, KBV,
KEV) are obtained for both AC machines. Random examples of different cases
(size and voltage variation) are evaluated and the related factors are obtained.
These are shown in Tables 9.5 and 9.6, and the curves are established from the
points in Table 9.5 by a curve-fitting procedure.
As shown in Table 9.5, the factors due to the geometrical size changes are not
just based on size ratios, but many other parameters have an effect on the values of
these factors. A curve-fitting technique is used to find an equation to obtain these
factors based on a size ratio as a variable. For example, the equation for KBS of a
synchronous generator is as follows:
K BS 0.06805R3 0.80653R2 0.28031R 0.0028016 (9.5)
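A cubic fit of a factor against the size ratio can be sketched as follows. This is only an illustration under stated assumptions: the fitting routine is a generic least-squares solve of the normal equations, and the sample points are synthetic values generated from a made-up cubic, not the entries of Table 9.5 or the coefficients of (9.5).

```python
def fit_polynomial(xs, ys, degree=3):
    """Least-squares polynomial fit; returns c with y ~ c[0] + c[1]*x + ..."""
    n = degree + 1
    # Normal equations (V^T V) c = V^T y for the Vandermonde matrix V.
    ata = [[sum(x ** (i + j) for x in xs) for j in range(n)] for i in range(n)]
    aty = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(ata[r][col]))
        ata[col], ata[pivot] = ata[pivot], ata[col]
        aty[col], aty[pivot] = aty[pivot], aty[col]
        for row in range(col + 1, n):
            factor = ata[row][col] / ata[col][col]
            for k in range(col, n):
                ata[row][k] -= factor * ata[col][k]
            aty[row] -= factor * aty[col]
    # Back substitution.
    coeffs = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(ata[i][j] * coeffs[j] for j in range(i + 1, n))
        coeffs[i] = (aty[i] - tail) / ata[i][i]
    return coeffs

def evaluate(coeffs, x):
    """Horner evaluation of c[0] + c[1]*x + c[2]*x**2 + ..."""
    result = 0.0
    for c in reversed(coeffs):
        result = result * x + c
    return result

# Synthetic (size ratio, factor) points from a made-up cubic, for illustration only.
ratios = [0.2 * k for k in range(1, 11)]
factors = [0.1 * r ** 3 + 0.5 * r ** 2 + 0.4 * r for r in ratios]

coeffs = fit_polynomial(ratios, factors, degree=3)
k_at_1p4 = evaluate(coeffs, 1.4)  # factor estimated at a size ratio of 1.4
```

Once the coefficients are known, a factor for any new size ratio is obtained by evaluating the polynomial, without rebuilding the full machine model.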
Table 9.5
Some Patterns of Size Variation of Induction Motor and Synchronous (SYN) Generator
Table 9.6
Patterns of Terminal Voltage Variation of Induction Motor and Synchronous (SYN) Generator
Figure 9.67 KB due to size variation of synchronous generator and induction motor
(KB versus size ratio Snew/Sbase; curves for the synchronous generator and the
induction motor). ©IEEE 2012 [20].
Figure 9.68 KE due to change of size of synchronous generator and induction motor
(KE versus size ratio Snew/Sbase; curves for the synchronous generator and the
induction motor). ©IEEE 2012 [20].
between the new equivalent source model and the actual numerical models is
shown in Figures 9.70(a–d).
This case considered variations of both geometrical size and terminal voltage.
The results are shown in Figures 9.71(a–d) for both the equivalent source model
and the detailed numerical model.
Figure 9.69 Field spectrum of induction motor while the geometric size increased 20%: (a) B of
actual model; (b) B of equivalent source model; (c) E of actual model; and (d) E of
equivalent source model. ©IEEE 2012 [20].
two factors are applied to the branch currents and node voltages of the equivalent
source model, respectively. For verification, the actual and optimized equivalent
models are compared in Figures 9.72(a–d).
Comparing the amplitudes and spectra in Figure 9.72(a) with Figure 9.72(b), and
Figure 9.72(c) with Figure 9.72(d), shows that the fields propagated from the
equivalent source model accurately match those of the actual model in both planes
around the machines at a distance of about 20–25 m. Note that, unlike the
induction motor cases in the previous study, the generators are located
horizontally. Locating an induction machine vertically is necessary in some
applications [21]. As shown in Figure 9.72, the actual models are much larger than
the equivalent source model because they are tested with a new size, while the
equivalent source model keeps the same size and shape. Once the optimization
factors are applied, the results match very accurately.
Figure 9.70 Field spectrum of induction motor while the terminal voltage decreased 40%: (a) B
distribution of actual model; (b) B distribution of equivalent model; (c) E distribution of
actual model; and (d) E distribution of equivalent model. ©IEEE 2012 [20].
In the final test case, both AC machine types are located in a region
(50 m × 50 m) and a different variation is applied to each. The size ratio of the
synchronous generator is chosen as 1.4, while that of the induction motor is
chosen as 0.85. In addition, the voltage ratios for the synchronous generator and
the induction motor are selected as 1.2 and 0.7, respectively. Using
Tables 9.4 and 9.5 and the curves derived earlier, the desired factors are estimated.
A diagram for calculating these factors is shown in Figure 9.73. The two factors
for each machine are estimated based on the four factors. The field spectra of
this test are shown in Figures 9.74(a–d).
Figure 9.71 Field spectrum of induction motor while the terminal voltage decreased 40% and
geometric size increased 20%: (a) B distribution of the actual model; (b) B distribution
of the equivalent model; (c) E distribution of the actual model; and (d) E distribution of
the equivalent model. ©IEEE 2012 [20].
Figure 9.72 Field spectrum of synchronous generator while the terminal voltage decreased 20% and
geometric size increased 70%: (a) B distribution of the actual model; (b) B distribution
of the equivalent model; (c) E distribution of the actual model; and (d) E distribution of
the equivalent model. ©IEEE 2012 [20].
Figure 9.73 Calculation diagram of optimization factors for the two AC machines. SYN G and IM
stand for synchronous generator and induction motor. ©IEEE 2012 [20].
This test case shows that even with the change of all the conditions
simultaneously, the equivalent model results match the actual one. In addition to
the verification of the equivalent source model, other aspects in the area of EMC
evaluation can be recognized. For example, comparing Figure 9.74 with Figures
9.72 and 9.70, it is obvious that the field spectrum in Figure 9.74 is similar to
Figure 9.72. This means that coupled AC machine systems radiate similar
signatures to the synchronous machine. This is because of the large difference
between the nominal power of the synchronous generator and induction motor
(857-kVA versus 33-kVA). Therefore, radiated fields of the induction motor only
increase the amplitude of overall fields of the studied system.
Figure 9.74 Radiated field spectra from both synchronous generator and induction motor: (a) B of
the actual model; (b) B of the equivalent source model; (c) E of the actual model; and (d)
E of the equivalent source model. ©IEEE 2012 [20].
In order to increase the accuracy of the equivalent source model, switches are
considered for each machine to turn them on and off to study the superposition
concept.
Finally, by comparing the simulation times of the two models, the actual
model of the machines implemented by the full FE model and the generalized
equivalent source model, the results demonstrated in Table 9.7 are obtained.
Table 9.7
Simulation Characteristics Comparison
The same procedure of modeling is applied for the power converter with the
exception that the power electronics converter has switches and the switching
activities should be considered.
The proposed full FE model is shown in Figure 9.75. This electronic drive
consists of an inverter, AC load, and the armored connection cable. The details of
the devices are identified in Table 9.8.
Figure 9.75 The prototype of the inverter, load, and connection cable (labeled parts:
three-phase inverter, cable, and three-phase AC load).
The inverter operation during other sequences of the 60-Hz reference sine
wave is similar to the aforementioned sequence, except that the opposite phase of
the bridge is switched on and off. The sinusoidal variations of the duty cycle ratios
for each phase were specified by comparing triangular waveforms to the
magnitude of the sinusoidal reference signal. When the value of the reference sine
wave is larger than the value of the upper triangle wave, Sap is switched on;
otherwise, it must be off. The same procedure applies to the other IGBTs.
Figure 9.77 shows the simulated load current for the space vector PWM (SV-PWM)
operation.
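The comparison rule for one upper switch can be sketched as below. This is a simplified sine-triangle comparison with a single carrier, not the space vector scheme or the multi-carrier arrangement used by the authors. The 60-Hz reference and 5-kHz carrier are taken from the text; the modulation index of 0.8, the carrier shape, and all function names are assumptions made for illustration.

```python
import math

F_REFERENCE_HZ = 60.0   # reference sine wave (from the text)
F_CARRIER_HZ = 5000.0   # switching frequency (from the text)

def triangular_carrier(t):
    """Symmetric triangle wave between -1 and +1 at the carrier frequency."""
    phase = (t * F_CARRIER_HZ) % 1.0
    return 4.0 * abs(phase - 0.5) - 1.0

def reference(t, modulation_index=0.8):
    """Sinusoidal reference; the modulation index is an assumed value."""
    return modulation_index * math.sin(2.0 * math.pi * F_REFERENCE_HZ * t)

def upper_switch_on(t):
    """The upper switch (S_ap in the text) is on while reference > carrier."""
    return reference(t) > triangular_carrier(t)

def duty_ratio(t_start, t_stop, samples=20000):
    """Fraction of the interval during which the upper switch is on."""
    step = (t_stop - t_start) / samples
    on_count = sum(upper_switch_on(t_start + (k + 0.5) * step) for k in range(samples))
    return on_count / samples

period = 1.0 / F_REFERENCE_HZ
duty_full_cycle = duty_ratio(0.0, period)            # averages out over one cycle
duty_positive_half = duty_ratio(0.0, period / 2.0)   # reference mostly above carrier
```

Over one full reference cycle the on-time averages to about one half, while it is higher during the positive half cycle, which is what produces the sinusoidal variation of the duty cycle.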
To model the IGBT switches of the inverter for signature studies, the switches
must be considered off for a moment of time and then they must be considered on
for the next time instant. This shift occurs based on the switching frequency of the
converter. In order to do this in the FE simulation, the plate between the load and
the positive bus, as shown in Figure 9.78, is considered a conductive plate for the
switch-on case. Subsequently, this plate is considered a nonconductive plate for the
switch-off case. This alteration of the conductivity of the plate occurs 5,000 times
in a second due to the switching frequency (5 kHz).
The schematic of the converter shown in Figure 9.75 is implemented based on the
above procedure and modification. The simulation takes 6 hours with about
1 million elements, including face, line, and node meshes, and 6 million degrees
of freedom. The large number of elements is necessary because of the very small
surfaces, edges, and lines in the critical parts of the inverter and cable, as
shown in Figure 9.79. The details of the FE modeling are given in [21–25]. The
simulation is run on a computer with a 16-core, 3.47-GHz Intel Xeon CPU and
192 GB of RAM.
Since there are two cases in this study, two types of results are defined. In
case 1, the fields generated by the system on three different surfaces at a
distance in space are considered as the result, and in case 2, the harmonics of the
fields and the frequency responses are investigated. Hence, in case 1, the
generated stray magnetic and electric fields are obtained in 3-D at a given distance
in both switching states. Figures 9.80(a) and (b) show that turning the switches on
and off affects only the amplitude of the magnetic field density; the spatial
distribution of the stray magnetic field on the slices does not change
significantly. This is due to the presence of the AC load, which is discussed
further below. However, the electric field shown in Figure 9.81 illustrates that
when the switches turn on, the electric field in the two lateral planes, the x-y and
y-z planes, increases while the field in the x-z plane decreases. The increase of
the electric field in these planes is due to the flow of current in the switches and
the creation of a current loop. The reduction is due to superposition, which acts as
suppression in this case. The suppression occurs due to the propagation of
fields into the other conductive parts of the devices in the vicinity; therefore, the
stray field induced from the exposed conductive parts decreases. The reason for the
suppression is the opposite direction of the induced field due to Lenz’s law [26].
Therefore, the induced stray field is subtracted from the main stray field and the
total field decreases, as in Figure 9.81(a).
Figure 9.80 Stray magnetic field density of the system: (a) IGBT switched on and (b) IGBT
switched off (µT).
Figure 9.81 Stray electric field distribution: (a) IGBT switched on and (b) IGBT switched off (µV/m).
To identify which element of the setup has the greatest effect on the total field,
the stray magnetic fields of each component in this setup are analyzed individually
and their spectra are compared with the overall field. The results are
shown in Figure 9.82(a–d). Comparing Figure 9.82(c) with Figures 9.82(a) and
9.82(b) shows that when only the load is switched on, the stray magnetic field is
higher than in the other two cases, and the total field is affected by it
[compare Figure 9.82(c) with Figure 9.82(d)]. The reason is that the
AC load has bigger conductive elements, including iron and copper materials,
than the other elements in the setup.
Figure 9.82 Stray magnetic field density of the system (µT): (a) only the cable is switched on; (b)
only the inverter is switched on; (c) only the load is switched on; and (d) the whole
system is switched on.
the resistance of an element is less than another element in the vicinity while there
is no shield between them, EMF will be induced from the component with less
conductivity into the one with higher conductivity [27]. As mentioned above, due
to Lenz’s law, the field radiated from the induced EMF will be the opposite of the
main field. Therefore, the overall field will be less than the aggregation of the
fields.
In this case, the inverter is connected to an induction motor. The aim is to
investigate the radiation of harmonic fields from the inverter while the distance
and the speed of the motor change. The parameters of the induction motor are: 5.5
kW, three-phase, 208 V, PF: 0.85, length: 30 cm, diameter: 25 cm, number of poles: 4.
This case is simulated using the FEM model shown in Figure 9.83(a).
Figure 9.83 The scheme of the setup of case 2: (a) FEM simulation and (b) measurement setup.
The simulation is computed in 6 hours with 950,000 elements and 5.7 million
degrees of freedom. Since the case includes very small elements and also nonlinear
materials (e.g., the core of the machine), the simulation of the inverter connected to
the load or motor may take 8 hours or more for only one time instant.
Generally, linear or nonlinear solvers are used in FEM simulations.
In this case, since there are several materials with nonlinear characteristics, the
linear solver cannot be used. However, using nonlinear materials increases the
simulation time dramatically. Hence, a modification in choosing the solver and the
associated iterative technique is employed. Instead of using a linear or curved
commutation curve, the slope of the curve in several zones was calculated (µr1,
µr2, …) and used in place of the commutation curve, as shown in Figure
9.84.
This modification is justified because the magnetic flux density of a
component varies only within a narrow range under the steady-state condition of
the system. For example, the magnetic flux density of the stator core of the
induction motor is about 1.5–2 T in power frequency analysis (50–60 Hz). For
higher frequencies, it goes down to under 1 T. Therefore, in this case, a specific
zone of the permeability can be chosen for this component. Similarly, the
permeability of other components of the system can be chosen based on the working
frequency. The unused parts of the commutation curves of the elements are thereby
avoided, and the simulation time decreases. This algorithm can be defined in the
material properties part of the FEM simulation.
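The zone selection can be sketched as a simple lookup. This is a minimal illustration only: the zone boundaries and the relative permeability values are hypothetical placeholders, not data for any real core material, and the function name is an assumption.

```python
from bisect import bisect_right

# Hypothetical zone table, for illustration only.
ZONE_LIMITS_T = [1.0, 1.5]            # flux density boundaries between zones (T)
ZONE_MU_R = [4000.0, 1500.0, 200.0]   # assumed constant mu_r1, mu_r2, mu_r3

def zone_permeability(flux_density_t):
    """Return (zone number, constant relative permeability) for an operating B."""
    index = bisect_right(ZONE_LIMITS_T, abs(flux_density_t))
    return index + 1, ZONE_MU_R[index]

# A core operating around 1.7 T falls in the third zone, while one
# operating below 1 T falls in the first zone.
zone_power_freq, mu_power_freq = zone_permeability(1.7)
zone_high_freq, mu_high_freq = zone_permeability(0.8)
```

With the operating zone fixed, the material behaves as linear within that zone, which is what removes the need to iterate over the full curve.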
GMRES, with the Krylov as the preconditioner, was used. The flexible GMRES
(FGMRES) is a variant of the GMRES method with flexible preconditioning that
enables the use of a different preconditioner at each step of the Arnoldi process.
The Krylov subspace is a linear subspace which enables multipreconditioning [28].
In particular, a few steps of GMRES can be used as a preconditioner for FGMRES.
The flexibility of this solution method is beneficial for problems with nonlinear
material characteristics, such as the motor’s core. As a result, the simulation
time decreases from over 8 hours to 20 minutes. More explanation is given in [29].
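The idea of using a few GMRES steps as the preconditioner inside a flexible outer iteration can be sketched as follows. This is a generic textbook-style FGMRES written for a small dense matrix, not the solver used in the FEM package. The test matrix, the right-hand side, and the choice of three inner steps are arbitrary assumptions for illustration.

```python
import math

def matvec(a, x):
    return [sum(aij * xj for aij, xj in zip(row, x)) for row in a]

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def norm(u):
    return math.sqrt(dot(u, u))

def fgmres(a, b, precond=None, m=20, tol=1e-10):
    """Flexible GMRES with zero initial guess; precond may differ at every step."""
    n = len(b)
    beta = norm(b)
    if beta == 0.0:
        return [0.0] * n
    v_basis = [[bi / beta for bi in b]]   # orthonormal Arnoldi vectors
    z_basis = []                          # preconditioned directions
    h_cols = []                           # rotated Hessenberg columns
    cs, sn = [], []                       # Givens rotations
    g = [beta]                            # rotated right-hand side
    for j in range(m):
        z = precond(v_basis[j]) if precond else list(v_basis[j])
        w = matvec(a, z)
        h = []
        for i in range(j + 1):            # modified Gram-Schmidt
            hij = dot(w, v_basis[i])
            h.append(hij)
            w = [wk - hij * vk for wk, vk in zip(w, v_basis[i])]
        w_norm = norm(w)
        h.append(w_norm)
        for i in range(j):                # apply previous rotations
            temp = cs[i] * h[i] + sn[i] * h[i + 1]
            h[i + 1] = -sn[i] * h[i] + cs[i] * h[i + 1]
            h[i] = temp
        d = math.hypot(h[j], h[j + 1])
        if d == 0.0:
            break
        c, s = h[j] / d, h[j + 1] / d
        cs.append(c)
        sn.append(s)
        h[j], h[j + 1] = d, 0.0
        g.append(-s * g[j])
        g[j] = c * g[j]
        h_cols.append(h)
        z_basis.append(z)
        if abs(g[j + 1]) <= tol * beta or w_norm == 0.0:
            break
        v_basis.append([wk / w_norm for wk in w])
    k = len(h_cols)
    y = [0.0] * k
    for i in range(k - 1, -1, -1):        # back substitution
        tail = sum(h_cols[col][i] * y[col] for col in range(i + 1, k))
        y[i] = (g[i] - tail) / h_cols[i][i]
    return [sum(y[col] * z_basis[col][i] for col in range(k)) for i in range(n)]

# Arbitrary small nonsymmetric test system, for illustration only.
A = [[4.0, 1.0, 0.0, 0.0],
     [2.0, 5.0, 1.0, 0.0],
     [0.0, 1.0, 6.0, 2.0],
     [1.0, 0.0, 1.0, 7.0]]
b = [1.0, 2.0, 3.0, 4.0]

# Inner preconditioner: three unpreconditioned GMRES steps on A z = v.
solution = fgmres(A, b, precond=lambda v: fgmres(A, v, m=3))
residual_norm = norm([bi - ri for bi, ri in zip(b, matvec(A, solution))])
```

Because the inner solve is itself an iterative process, it does not act as a fixed linear operator, which is why the outer method stores the preconditioned directions separately instead of assuming one constant preconditioner.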
In addition to the simulation, the experimental setup is implemented in a
chamber, which isolates the setup from the outside environment, as shown in Figure
9.83(b). The coil antenna is located 10 cm away from the inverter to obtain the
stray magnetic field. The fields are transferred to an EMI receiver, a real-time
spectrum analyzer, through a cable with 50-Ω impedance.
The magnetic field intensity (H-field) generated from the setup in simulation
is shown in Figure 9.85. The H-field at 5-kHz frequency is shown on a slice at 10
cm away from the setup, the same as the experimental setup. As illustrated in this
figure, the amplitude of the stray field around the inverter box is higher than other
places. The reason is that the switching frequency of the inverter is 5 kHz, the
same as the frequency depicted from the simulation figure. The simulation is
implemented at several other frequencies but only the switching frequency of the
inverter, which is 5 kHz, is shown here.
Figure 9.85 Stray magnetic field intensity of the setup case 2 at 5 kHz simulated in FEM (µA/m).
Figure 9.86 Measured frequency response of the stray magnetic field intensity of the setup case 2
from DC to 20 kHz (dBµA/m).
The unit of the simulation result is µA/m, while the unit of the experimental
results is dBµA/m. The two units are related by (9.7). Using
this equation, the experimental peak of the stray magnetic field at 5 kHz at the
given distance is −4.37 dBµA/m (0.61 µA/m), which is very close to the simulated
value (see Figure 9.85).
$\mu\mathrm{A/m} = 10^{(\mathrm{dB}\mu\mathrm{A/m})/20}$ (9.7)
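The unit conversion can be checked with a short sketch. It assumes the usual field-quantity convention of 20 log10 relative to 1 µA/m; the function names are illustrative only.

```python
import math

def dbua_per_m_to_ua_per_m(level_db):
    """Convert a level in dBµA/m to a field strength in µA/m."""
    return 10.0 ** (level_db / 20.0)

def ua_per_m_to_dbua_per_m(field_ua_per_m):
    """Convert a field strength in µA/m to a level in dBµA/m."""
    return 20.0 * math.log10(field_ua_per_m)

peak_ua_per_m = dbua_per_m_to_ua_per_m(-4.37)
round_trip_db = ua_per_m_to_dbua_per_m(peak_ua_per_m)
```

Under this convention, any field strength below 1 µA/m, such as a value near 0.6 µA/m, corresponds to a negative level in dBµA/m.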
Figure 9.87 Stray magnetic field intensity of the setup case 2 from DC to 7.5 kHz (dBµA/m) at 5 cm
away from the inverter with and without the shield by means of simulation and
measurement (horizontal axis: frequency in Hz).
Figure 9.88 Stray magnetic field intensity of the setup at case 2 from DC to 20 kHz (dBµA/m) at 5
cm away from the inverter (a) without shield and (b) with shield.
As another application of this case, the shielding in the vicinity of the switch,
5 cm, is tested. Figure 9.87 shows the frequency response of the stray H-field with
and without the shield between the switches and the antenna by means of
simulation and measurement. Using a steel shield, Steel 1018 as an example in this
test, it can be seen that the noises, subharmonics between the main harmonic
orders, decrease dramatically. The experimental results show a wider band of
frequency, DC to 20 kHz, as shown in Figure 9.88 to illustrate the effect of
shielding on the other harmonic orders.
Consequently, considering this test, the main harmonics and the related
subharmonics can help in selecting a shield with proper characteristics, including
the permittivity and permeability. Comparing the curves of Figure 9.87, the
simulation result is similar to the experimental one. Hence, the proposed shield
can be studied and optimized using the physics-based simulation. The permittivity,
permeability, conductivity, and other physical characteristics of the shield can be
altered and optimized for the best electromagnetic compliance or any other purpose
using the simulation and experimental design.
Figure 9.89 Inverter circuit of the AC motor drive, used in simulation, with inclusion of parasitic
components.
system, the three major components of the system (i.e., inverter, cable and PMSM)
are replaced with their corresponding physics-based models.
Figure 9.90 Schematic view of a motor-drive system: (a) schematic of motor-drive system used for
the CM measurement and (b) experimental setup.
The test setup used to measure the frequency spectrum at different points in
the drive system is shown in Figure 9.90(b). The illustrated test setup consists of a
DC power supply, line impedance stabilization network (LISN), inverter circuit, a
2-m long armored power cable, and a 250-watt PMSM. To measure the common
mode current, all these components are assembled on a metallic plate.
Subsequently, the conducted current can be measured between these plates. In
order to avoid a time-consuming computing process and to get a better evaluation,
the frequency-domain simulation approach is used.
Figure 9.91 shows the system structure in the FE model. This model was
solved to estimate the values of the parasitic elements in the circuit model. Figure
9.92 compares the conducted common-mode EMI between the measurement data and
two modeling approaches in the frequency domain.
To study the effectiveness of our models, the equivalent models for cable and
PMSM are added to the inverter model and the simulation results are compared to
the experimental results. To verify the accuracy of our numerical results, the
common mode current of the setup in Figure 9.91 is measured using the current
probe with 100-MHz bandwidth. The current of Figure 9.91 is measured at the
ground port of the input DC power supply.
Figure 9.91 The FE meshes of (a) the BCP and (b) the converter numerical models.
Figure 9.93 Phase comparison of the frequency spectrum of the inverter current between the
equivalent model and experiments (|Y(f)| in dBV versus frequency in Hz; curves for
simulation and experiment).
Figure 9.94 Magnetic flux density at different switching patterns: (a) before switch and (b) during
switching.
It can be easily inferred from the figure that, within this model, various
parameters can be changed and studied in order to identify an EMI mitigation
strategy during the design stage of these systems. In the proposed model, the EMI
can be analyzed at any point or plane within the simulation volume and can be
solved for different switching patterns. The time dependence of the radiated EMI
can also be evaluated using the model. We can see how the magnetic flux density
behaves over time at specific locations and at various switching patterns and
frequencies. Furthermore, the field image can be obtained for various scenarios
specified by the designer and provide them with information that can be obtained
quickly. This would allow for efficient and effective complete design work using
CEM.
Table 9.9 shows the results from the optimization process. The magnetic
component positions of this converter are shown in Figure 9.97. It is clear that this
power converter shows poor EMI performance at the initial design stage, as
shown in Figure 9.97(a). The FE analysis is performed to observe the near-field
effects for the given layout. The best EMI performance versus geometry of the
board and the frequency is shown in Figure 9.97(b).
Figure 9.98 compares the input current spectrum, filter inductor current
spectrum, and output voltage spectrum of the converter in the ideal case and the
physics-based model (nonoptimized case), respectively. In the optimized case, the
peak of the frequency spectrum is lower than in the nonoptimized case. Figure 9.99
shows the circuit layout of the converter in the optimized case. In this case, the
magnetic components are placed so that the magnetic field generated by each one
interferes less with the others. More details are given in [33].
Table 9.9
Optimization Results
Figure 9.97 Layout of the system: (a) before optimization and (b) after optimization.
Figure 9.98 Comparison of the FFT spectrum between optimized and nonoptimized quasi-resonant
converter (|Y(f)| in dBV versus frequency in Hz; curves for the nonoptimized and
optimized cases).
Figure 9.99 Circuit of the zero current switching (ZCS) quasi-resonant buck converter in the
optimized layout.
9.7 SUMMARY
This chapter reviewed physics-based modeling for the purpose of
EMC evaluation in a multicomponent power system. It introduced the algorithm of
physics-based modeling for both low- and high-frequency analysis. The
equivalent source modeling of the powertrain was implemented for EMC studies,
and the results showed that the equivalent model can produce the same result as the
full model with significantly less simulation time. The model has been used for
condition monitoring of the components based on the EM signatures. Moreover,
the optimization of the switching algorithm and the proper placement of the
magnetic components on the PCB were both achieved based on the radiated EMFs.
REFERENCES
[1] W. Zhang, M. Zhang, F. Lee, J. Roudet, and E. Clavel, “Conducted EMI Analysis of a Boost
PFC Circuit,” IEEE Appl. Power Electron. Conf., pp. 223–229, 1997.
[2] B. Revol, et al., “EMI Study of a Three Phase Inverter-Fed Motor Drives,” IEEE Industry
Applications Conference, Vol. 4. 2004.
[3] Y. Zhong, et al., “HF Circuit Model of Conducted EMI of Ground Net Based on PEEC,”
Zhongguo Dianji Gongcheng Xuebao. Vol. 25, No. 17, 2005.
[4] H. Zhu, et al. “Analysis of Conducted EMI Emissions from PWM Inverter Based on Empirical
Models and Comparative Experiments,” 30th IEEE Annual Power Electronics Specialists
Conference, Vol. 2, 1999.
[5] X. Pei, et al., “Analytical Estimation of Common mode Conducted EMI in PWM Inverter,” IEEE
Industry Applications Conference, Vol. 4, pp. 1–4, 2004.
[6] L. Sevgi, et al., “EMC and BEM Engineering Education: Physics-Based Modeling, Hands-on
Training, and Challenges,” IEEE Antennas and Propagation Magazine, Vol. 45, No. 2,
pp. 114–119, 2003.
[7] D. Dixon, M. Obara, and N. Schade. “Finite-Element Analysis (FEA) as an EMC Prediction
Tool,” IEEE Transactions on Electromagnetic Compatibility, Vol. 35, No. 2, pp. 241–248, 1993.
[8] A. Sarikhani, M. Barzegaran, and O. Mohammed, “Optimum Equivalent Models of Multi-Source
Systems for the Study of Electromagnetic Signatures and Radiated Emissions from Electric
Drives,” IEEE Transactions on Magnetics, Vol. 48, No. 2, pp. 1011–1014, 2012.
[9] M. Barzegaran, A. Sarikhani, and O. Mohammed, “An Optimized Equivalent Source Modeling
for the Evaluation of Time Harmonic Radiated Fields from Electrical Machines and Drives,”
Applied Computational Electromagnetics Society Journal, Vol. 28, No. 4, pp. 273–282, 2013.
[10] M. Barzegaran, A. Sarikhani, and O. Mohammed, “An Equivalent Source Model for the Study of
Radiated Electromagnetic Fields in Multi-Machine Electric Drive Systems,” 2011 IEEE
International Symposium on Electromagnetic Compatibility, Long Beach, CA, pp. 442–447,
2011.
[11] A. Rosales, A. Sarikhani, and O. Mohammed, “Evaluation of Radiated Electromagnetic Field
Interference due to Frequency Switching in PWM Motor Drives by 3D Finite Elements,” IEEE
Transactions on Magnetics, Vol. 47, No. 5, pp. 1474–1477, 2011.
[12] M. Vetterli and C. Herley, “Wavelets and Filter Banks: Theory and Design,” IEEE Trans. on
Signal Processing, Vol. 40, No. 9, pp. 2207–2232, 1992.
[13] R. Coifman, Y. Meyer, and M. Wickerhauser, “Wavelet analysis and signal processing” In
Wavelets and their Applications, Boston, MA: Jones and Bartlett, pp.153–178, 1992.
[14] T. Chow, Introduction to electromagnetic theory: a modern perspective, Boston MA: Jones &
Bartlett, 2006
[15] M. Barzegaran, “Physics-Based Modeling of Power System Components for the Evaluation of
Low-Frequency Radiated Electromagnetic Fields,” PhD Dissertation, Florida International
University, FIU Electronic Theses and Dissertations, Paper 1193, 2014.
[16] Department of Defence Interface Standard, Requirements for the Control of Electromagnetic
Interference Characteristics of Subsystems and Equipment, MIL-461-STD, 2007.
[17] M. Ubeid, M. Shabat, and M. Sid-Ahmed, “Effect of Negative Permittivity and Permeability on
the Transmission of Electromagnetic Waves through a Structure Containing Left-Handed
Material,” Natural Science Magazine, Vol. 3, No. 4, pp. 328–333, 2011.
[18] M. Barzegaran, A. Mazloomzadeh, and O. Mohammed, “Fault Diagnosis of the Asynchronous
Machines through Magnetic Signature Analysis Using Finite Element Method and Neural
Networks,” IEEE Transactions on Energy Conversion, Vol. 28, No. 4, pp. 1064–1071, 2013.
[19] F. Ulaby, Fundamentals of Applied Electromagnetics, 5th Edition, Upper Saddle River, NJ:
Prentice Hall, pp. 321–324, 2006.
[20] M. Barzegaran and O. Mohammed, “A Generalized Equivalent Source Model of AC Electric
Machines for Numerical Electromagnetic Field Signature Studies,” IEEE Transactions on
Magnetics, Vol. 48, No. 11, pp. 44404403, 2012.
[21] F. Lattarelo, Electromagnetic Compatibility in Power Systems, 1st edition, New York: Elsevier,
2006.
[22] M. Barzegaran, A. Nejadpak, and O. Mohammed, “Evaluation of High Frequency
Electromagnetic Behavior of Planar Inductor Designs for Resonant Circuits in Switching Power
Converters,” Applied computational electromagnetic society (ACES) journal, Vol. 26, No. 9, pp.
737–748, 2011.
[23] M. Barzegaran and O. Mohammed, “3-D FE Equivalent Source Modeling and Analysis of
Electromagnetic Signatures from Electric Power Drive Components and Systems,” IEEE
Transactions on Magnetics, Vol. 49, No. 5, pp. 1937–1940, 2013.
[24] G. Skibinski, R. Kerkman, and D. Schlegel, “EMI Emissions of Modern PWM AC Drives,”
IEEE Ind. Appl. Mag., Vol. 5, No. 6, pp. 47–80, 1999.
[25] O. Martins, S. Guedon, and Y. Marechal, “A New Methodology for Early Stage Magnetic
Modeling and Simulation of Complex Electronic Systems,” IEEE Trans. Magn., Vol. 48, No. 2,
pp. 319–322, 2012.
[26] D. Giancoli, Physics: Principles with Applications, Upper Saddle River, NJ: Pearson Education,
p. 624, 2005.
[27] C. Paul, Inductance: Loop and Partial, Hoboken, NJ: Wiley-IEEE Press, p. 195, 2011.
[28] W. Arnoldi, “The principle of Minimized Iterations in the Solution of the Matrix Eigenvalue
Problem,” Quarterly of Applied Mathematics, Vol. 9, pp. 17–29, 1951.
[29] J. Stoer and R. Bulirsch, Introduction to Numerical Analysis, 3rd edition, New York: Springer,
2002.
[30] M. Barzegaran and O. Mohammed, “Multi-Dipole Modeling of XLPE Cable for Electromagnetic
Field Studies in Large Power Systems,” International Journal for Comp. and Math. in Electrical
Eng. , Vol. 33, No. 1, 2014.
[31] M. Barzegaran and O. Mohammed, “Near Field Evaluation of Electromagnetic Signatures from
Wound Rotor Synchronous Generators Using Equivalent Source Modeling in Finite Element
Domain,” 28th ACES Conf., Columbus, OH, Apr. 2012.
[32] A. Nejadpak, “Development of Physics-based Models and Design Optimization of Power
Electronic Conversion Systems,” FIU Electronic Theses and Dissertations. P. 824, 2013.
[33] A. Nejadpak and O. Mohammed, “Physics-Based Optimization of EMI Performance in
Frequency Modulated Switch Mode Power Converters,” Electromagnetic Field Problems and
Applications (ICEF), pp. 1–4, 2012.
Chapter 10
Manipulation of Electromagnetic Waves Based
on New Unique Metamaterials: Theory and
Applications
Qun Wu, Jiahui Fu, Fanyi Meng, Kuang Zhang, and Guohui Yang
10.1 INTRODUCTION
cloak [5], a perfect lens [6], and many other kinds of novel applications in the
microwave [7–9], terahertz [10], and optical regimes [11].
Manipulating electromagnetic waves as desired has been a hot topic in the
field of electromagnetism for a long time. The emergence of metamaterials provides
great opportunities to control the transmission and distribution of electromagnetic
waves and energy. In this chapter, the applications of metamaterials in the
manipulation of electromagnetic waves are discussed. In Section 10.2, the theory
of transform optics is introduced. Then, based on the form invariance properties, an
electromagnetic energy concentrator and a waveguide connector are proposed and
simulated. After the constitutive parameters are simplified, simulations are
carried out to verify the theoretical work. In Section 10.3, zero index
metamaterials with matched impedance are constructed and applied to enhance the
gain of a horn antenna. Measurements of the gain and far-field pattern verify the
theoretical design. In Section 10.4, metamaterials are applied to build a novel
broadband absorber. A brief conclusion is given in Section 10.5.
$G^{\alpha\beta}{}_{,\alpha} = J^{\beta}$ (10.2)
where Fαβ represents the matrix of electric field E and the magnetic field B, Gαβ
represents the matrix of electric displacement vector D and the magnetic field
strength H, and Jβ is the vector of the excitation source. These three components
can be expressed as:
$F_{\alpha\beta} = \begin{pmatrix} 0 & E_1 & E_2 & E_3 \\ -E_1 & 0 & -cB_3 & cB_2 \\ -E_2 & cB_3 & 0 & -cB_1 \\ -E_3 & -cB_2 & cB_1 & 0 \end{pmatrix}$ (10.3)
$J^{\beta} = \begin{pmatrix} c\rho \\ J_1 \\ J_2 \\ J_3 \end{pmatrix}$ (10.5)
Based on the expressions above, the material parameters can be expressed as:
$G^{\alpha\beta} = \frac{1}{2} C^{\alpha\beta\mu\nu} F_{\mu\nu}$ (10.6)
where Cαβμυ is the constitutive tensor, including permittivity, permeability,
and bi-anisotropy parameters. Below, we take Cartesian coordinates as
an example to show in detail how the material parameters are derived. As
shown in Figure 10.1, the original coordinate system is denoted OX1X2X3,
the new coordinate system is denoted OX1'X2'X3', and the
transforming function can be expressed as:
$$x_1' = f_1(x_1, x_2, x_3) \tag{10.7}$$
$$x_2' = f_2(x_1, x_2, x_3) \tag{10.8}$$
$$x_3' = f_3(x_1, x_2, x_3) \tag{10.9}$$
Then the relationship between an arbitrary vector X' in the transformed space and
the vector X in the original space can be derived as:
$$X' = \Lambda X \tag{10.10}$$
where Λ is a Jacobian matrix, which can be expressed as:
$$\Lambda_{ij} = \frac{\partial x_i'}{\partial x_j}, \quad i, j = 1, 2, 3 \tag{10.11}$$
where xi' and xj represent the coordinate components in the transformed space and
in the original space, respectively. Then (10.11) can be further expressed into:
Figure 10.1 Sketch of the optical transformation: (a) original coordinate and (b) transformed
coordinate.
Finally, it can be derived that the constitutive tensor of the material in the
transformed space can be expressed as:
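$$\varepsilon' = \frac{\Lambda \varepsilon \Lambda^{T}}{\det \Lambda}, \qquad \mu' = \frac{\Lambda \mu \Lambda^{T}}{\det \Lambda} \tag{10.13}$$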
It can be seen that the constitutive parameters derived from transform optics
depend directly on the transformation function between the original space and the
transformed space. In the next section, we use several examples to show how
transform optics works in the design of novel microwave devices.
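As a numerical illustration of how a transformation fixes the material tensors, the following minimal sketch evaluates the rule ε' = ΛεΛ^T / det Λ, the same form that appears in (10.28), for a 3 × 3 Jacobian. The helper names and the example transformation (a uniform compression x' = x/2) are hypothetical and chosen only for illustration; pure Python lists are used so that no external library is assumed.

```python
def matmul(a, b):
    """Multiply two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def transpose(a):
    """Return the transpose of a 3x3 matrix."""
    return [[a[j][i] for j in range(3)] for i in range(3)]


def det3(a):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (a[0][0] * (a[1][1] * a[2][2] - a[1][2] * a[2][1])
            - a[0][1] * (a[1][0] * a[2][2] - a[1][2] * a[2][0])
            + a[0][2] * (a[1][0] * a[2][1] - a[1][1] * a[2][0]))


def transform_tensor(jacobian, tensor):
    """Return jacobian * tensor * jacobian^T / det(jacobian)."""
    scaled = matmul(matmul(jacobian, tensor), transpose(jacobian))
    d = det3(jacobian)
    return [[value / d for value in row] for row in scaled]


# Relative permittivity of free space (identity tensor).
IDENTITY = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

# Hypothetical example: compress the x-axis by a factor of 2 (x' = x / 2).
squeeze = [[0.5, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
eps_prime = transform_tensor(squeeze, IDENTITY)
```

For this compression the sketch gives a diagonal tensor with a reduced xx component and increased yy and zz components, which shows how even a simple coordinate change produces an anisotropic medium.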
Suppose an arbitrary point H(x0, y0) in the original coordinate system, whose
corresponding point in the transformed system is G(x0', y0'), as shown in Figure
10.2. Then the 2-D arbitrary transformation from the original space to the new
space is expressed as:
$$r' = k_0 r + R_1 \tag{10.14}$$
where r', r, and R1 denote the lengths of OG, OH, and OM, respectively, and M is
the intersection of OG with the inner boundary. The coefficient k0 equals (b − a)/b,
where a and b are the intercepts of the inner and outer polygons with the y-axis,
respectively. Here the cloak is conformal to the inner region, so the corresponding
sides have the same slope.
Suppose that the vertices of the inner polygon are labeled (xn, yn) in clockwise
order, and the corresponding vertices of the outer polygon are labeled (xn', yn').
Then the equation of the nth side, which joins (xn, yn) and (xn+1, yn+1), can be
written as:
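$$\frac{y - y_n}{y_{n+1} - y_n} = \frac{x - x_n}{x_{n+1} - x_n} \tag{10.15}$$
$$\frac{R_1}{r} = \frac{k_1}{y - k_2 x} \tag{10.16}$$
where $k_1 = y_n - x_n \dfrac{y_{n+1} - y_n}{x_{n+1} - x_n}$ and $k_2 = \dfrac{y_{n+1} - y_n}{x_{n+1} - x_n}$. According to the transformation
invariance, the unit vectors in the original space and in the transformed space must
be equal; hence we can get the coordinate transformations as:
$$\frac{x'}{x} = \frac{y'}{y} = \frac{b - a}{b} + \frac{k_1}{y - k_2 x} \tag{10.17a}$$
$$z' = z \tag{10.17b}$$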
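The mapping in (10.14) through (10.17) can be checked numerically with a short sketch. It assumes the reading x'/x = y'/y = k0 + k1/(y − k2 x), with k0 = (b − a)/b, and treats a single sector of the cloak facing one non-vertical side of the inner polygon. The function name and the example geometry (an inner side on y = 1 and an outer side on y = 2) are hypothetical.

```python
def cloak_map(x, y, vertex_n, vertex_n1, a, b):
    """Map a point (x, y) of the original space to (x', y') in the cloak.

    vertex_n, vertex_n1: end points (xn, yn), (xn+1, yn+1) of the inner side.
    a, b: y-axis intercepts of the inner and outer polygons.
    The side must not be vertical, because k2 is its slope.
    """
    xn, yn = vertex_n
    xn1, yn1 = vertex_n1
    k2 = (yn1 - yn) / (xn1 - xn)      # slope of the nth inner side
    k1 = yn - xn * k2                 # intercept of the nth inner side
    k0 = (b - a) / b                  # radial compression factor
    scale = k0 + k1 / (y - k2 * x)    # r'/r = k0 + R1/r
    return x * scale, y * scale


# Hypothetical sector: inner side on y = 1, outer side on y = 2.
inner_side = ((-1.0, 1.0), (1.0, 1.0))
outer_point = cloak_map(0.0, 2.0, inner_side[0], inner_side[1], 1.0, 2.0)
inner_point = cloak_map(0.5, 1.0, inner_side[0], inner_side[1], 1.0, 2.0)
```

A point on the outer side is left where it is, so the cloak joins the surrounding space continuously, while points approaching the origin are pushed onto the inner side.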
Then we can easily compute the Jacobian transformation matrix, which
represents the derivative of the transformed coordinates with respect to the original
coordinates. Using the property that Maxwell’s equations are form invariant in the
original and the transformed spaces, we can obtain permittivity and permeability
tensors of the medium in the transformed space:
xx xx (10.18a)
k02 k0 R1
k0 y k 2 x R1 yk2 x r 2 k2 R12 y k2 x
1 2
xy yx xy yx (10.18b)
k02 k0 R1
yy yy (10.18c)
k02 k0 R1
1
zy zz 2 (10.18d)
k k0 R1
0
k1
where R1 . Above all, (10.18) provides the full design parameters for the
y k2 x
permittivity and permeability tensors of 2-D polygonal cloaks of arbitrary irregular
shape. Next, we use these constitutive tensors in full-wave simulations based on the
finite element method (FEM) to validate the design.
matched and the device is therefore reflectionless. All these results verify the
theoretical design and derivations.
Figure 10.3 Simulation results of E-field distributions of (a) five-sided bulgy polygonal invisibility
cloak and (b) five-sided concave polygonal invisibility cloak.
$$\varepsilon_\theta = \frac{r f'(r)}{f(r)} \tag{10.19b}$$
$$\varepsilon_z = \frac{f'(r) f(r)}{r} \tag{10.19c}$$
where f(r) is the transformation function between the original space and the
transformed space. For the transformation between r' ∈ [0, R2] and r ∈ [0, R1],
namely the core region, the function can be expressed as:
$$r' = f(r) = \frac{R_2}{R_1} r \tag{10.20}$$
Then the constitutive tensor of the core region can be derived as:
$$\varepsilon_r^{c} = \varepsilon_\theta^{c} = 1 \tag{10.21a}$$
$$\varepsilon_z^{c} = \left(\frac{R_2}{R_1}\right)^{2} \tag{10.21b}$$
For the circular region, the values of εr and εθ are reciprocals of each other, so
fixing one of them as a constant also fixes the other. Suppose that:
$$\frac{r g'(r)}{g(r)} = m_0 \tag{10.22}$$
By solving the ordinary differential equation above, the general solution can
be expressed as:
$$f(r) = m_1 r^{m_0} \tag{10.23}$$
where m0 and m1 are unknown coefficients, which are determined by the
boundary conditions. The transformation function f(r) between the original space
and the transformed space should fulfill the boundary conditions
$$f(R_3) = R_3 \tag{10.24a}$$
$$f(R_1) = R_2 \tag{10.24b}$$
Based on the boundary conditions, the unknown coefficients can be solved, and the
constitutive tensor for the circular region can be expressed as:
$$\varepsilon_\theta = \frac{1}{\varepsilon_r} = m_0 \tag{10.25a}$$
$$\varepsilon_z = m_0 \left(\frac{R_3}{r}\right)^{2(1 - m_0)} \tag{10.25b}$$
where $m_0 = \log_{R_3/R_1}(R_3/R_2)$. Hence we have obtained all the constitutive
parameters of the cylindrical electromagnetic concentrator. The relative
permittivities εr and εθ are constants, and only εz is a function of the radius, which
can also be homogenized with a layered structure. Furthermore, the constitutive
tensor is nonsingular and positive, which improves the flexibility of 2-D EM
concentrator designs. Moreover, the impedance of the concentrator at the outer
boundary can be expressed as $Z|_{r=R_3} = \sqrt{\mu_\theta / \varepsilon_z} = 1$. The electromagnetic
concentrator is therefore always impedance matched with free space, which
minimizes the scattered fields of the cylindrical electromagnetic concentrator.
Full-wave simulations based on the constitutive parameters above are presented
next.
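The concentrator parameters can be evaluated with a few lines of code. The sketch below assumes the readings f(r) = (R2/R1) r in the core, f(r) = m1 r^m0 in the surrounding region with m0 = log(R3/R2)/log(R3/R1), and the permittivities of (10.21) and (10.25). The function names are hypothetical, and the radii are those quoted for the simulation (R3 = 2R2 = 4R1 = 0.4 m).

```python
import math


def exponent_m0(R1, R2, R3):
    """m0 = log(R3/R2) / log(R3/R1), the logarithm of R3/R2 to base R3/R1."""
    return math.log(R3 / R2) / math.log(R3 / R1)


def mapping(r, R1, R2, R3):
    """Transformation function f(r) in the core and in the surrounding region."""
    if r <= R1:
        return R2 / R1 * r
    m0 = exponent_m0(R1, R2, R3)
    m1 = R3 ** (1.0 - m0)  # follows from the boundary condition f(R3) = R3
    return m1 * r ** m0


def permittivity(r, R1, R2, R3):
    """Return (eps_r, eps_theta, eps_z) at radius r inside the concentrator."""
    if r <= R1:
        return 1.0, 1.0, (R2 / R1) ** 2
    m0 = exponent_m0(R1, R2, R3)
    return 1.0 / m0, m0, m0 * (R3 / r) ** (2.0 * (1.0 - m0))


R1, R2, R3 = 0.1, 0.2, 0.4  # metres, from R3 = 2*R2 = 4*R1 = 0.4 m
m0 = exponent_m0(R1, R2, R3)
eps_outer = permittivity(R3, R1, R2, R3)
eps_core = permittivity(0.05, R1, R2, R3)
```

With these radii the exponent m0 equals one half, and at r = R3 the θ and z components coincide, which is consistent with the unit impedance stated for the outer boundary if the permeability tensor is taken equal to the permittivity tensor.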
Here lossless cases are studied by FEM simulation. The geometric parameters are
selected to be R3 = 2R2 = 4R1 = 0.4 m, from which the constitutive parameters are
calculated through the equations above. The frequency is selected to be 2 GHz.
Figure 10.5(a) shows the electric field distribution of the concentrator. The electric
fields are concentrated smoothly into the inner core region, and the fields outside
are hardly disturbed. The power flow is also calculated and shown in Figure
10.5(b); it is clearly enhanced in the inner core region. The enhancement ratio is
R2/R1, so the enhancement theoretically diverges as R1 tends to zero.
Figure 10.5 Simulation results of the concentrator: (a) electric field distribution and (b) normalized
power flow.
$$y' = \frac{y_2(x) - y_1(x)}{2a}\, y + \frac{y_2(x) + y_1(x)}{2} \tag{10.26a}$$
$$x' = x \tag{10.26b}$$
$$z' = z \tag{10.26c}$$
where the length of AC is assumed to be 2a, the curve that connects C and D′ is
defined as y1(x), and the curve that connects A and B' is defined as y2(x). The two
curves can be chosen arbitrarily, as long as y1(x) passes through C and D' and y2(x)
passes through A and B'. Hence, the Jacobian transformation matrix can be
obtained from (10.26):
$$A = \begin{pmatrix} 1 & 0 & 0 \\ \dfrac{y}{2a} \dfrac{d[y_2(x) - y_1(x)]}{dx} + \dfrac{1}{2} \dfrac{d[y_2(x) + y_1(x)]}{dx} & \dfrac{y_2(x) - y_1(x)}{2a} & 0 \\ 0 & 0 & 1 \end{pmatrix} \tag{10.27}$$
which represents the derivative of the transformed coordinates with respect to the
original coordinates. Using the property that Maxwell’s equations are form
invariant in the original and transformed spaces, the permittivity and permeability
tensors of the medium in the transformed space can be expressed as:
$$\varepsilon' = \frac{A \varepsilon A^{T}}{\det A} \tag{10.28a}$$
$$\mu' = \frac{A \mu A^{T}}{\det A} \tag{10.28b}$$
where ε and μ represent the constitutive tensors of the original space. Here the
original space is supposed to be free space, so the constitutive tensors of the
original space can be expressed as:
$$\varepsilon = I \varepsilon_0 \tag{10.29a}$$
$$\mu = I \mu_0 \tag{10.29b}$$
Hence, we can easily get the relative permittivity and permeability tensors in the
transformed region as:
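$$\varepsilon_{xx} = \mu_{xx} = \varepsilon_{zz} = \mu_{zz} = \frac{2a}{y_2(x) - y_1(x)} \tag{10.30a}$$
$$\varepsilon_{xy} = \varepsilon_{yx} = \mu_{xy} = \mu_{yx} = \frac{y \dfrac{d[y_2(x) - y_1(x)]}{dx} + a \dfrac{d[y_2(x) + y_1(x)]}{dx}}{y_2(x) - y_1(x)} \tag{10.30b}$$
$$\varepsilon_{yy} = \mu_{yy} = \frac{\left\{ y \dfrac{d[y_2(x) - y_1(x)]}{dx} + a \dfrac{d[y_2(x) + y_1(x)]}{dx} \right\}^{2}}{2a\,[y_2(x) - y_1(x)]} + \frac{y_2(x) - y_1(x)}{2a} \tag{10.30c}$$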
Furthermore, the symmetrical constitutive matrix can be transformed into the
diagonal matrix through rotating the coordinates, which will be more useful in the
construction of metamaterial. The diagonal matrix can be expressed through the
eigenvalues of the symmetrical constitutive matrix:
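$$\varepsilon_{11} = \mu_{11} = \frac{\varepsilon_{xx} + \varepsilon_{yy} - \sqrt{(\varepsilon_{xx} - \varepsilon_{yy})^{2} + 4\varepsilon_{xy}^{2}}}{2} \tag{10.31a}$$
$$\varepsilon_{22} = \mu_{22} = \frac{\varepsilon_{xx} + \varepsilon_{yy} + \sqrt{(\varepsilon_{xx} - \varepsilon_{yy})^{2} + 4\varepsilon_{xy}^{2}}}{2} \tag{10.31b}$$
$$\varepsilon_{33} = \mu_{33} = \varepsilon_{zz} \tag{10.31c}$$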
In summary, (10.31) provides the design parameters for the permittivity and
permeability tensors of the metamaterials filling the waveguide connector. Next,
these constitutive tensors are used in full-wave simulations of arbitrary waveguide
connectors.
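The connector tensors lend themselves to a quick numerical check. The sketch below assumes the component expressions obtained from AA^T/det A for the transformation (10.26), with x and y taken in the original space, and computes the in-plane principal values as the eigenvalues of the symmetric 2 × 2 block. The function names, the linear taper profiles, and the sample point are hypothetical.

```python
import math


def connector_tensor(x, y, a, y1, y2, dy1, dy2):
    """Return (e_xx, e_xy, e_yy, e_zz) at the original-space point (x, y).

    y1, y2: lower and upper boundary curves; dy1, dy2: their derivatives.
    """
    diff = y2(x) - y1(x)
    slope_term = y * (dy2(x) - dy1(x)) + a * (dy2(x) + dy1(x))
    e_xx = 2.0 * a / diff
    e_xy = slope_term / diff
    e_yy = slope_term ** 2 / (2.0 * a * diff) + diff / (2.0 * a)
    return e_xx, e_xy, e_yy, e_xx


def principal_values(e_xx, e_xy, e_yy):
    """Eigenvalues of the symmetric in-plane block [[e_xx, e_xy], [e_xy, e_yy]]."""
    root = math.sqrt((e_xx - e_yy) ** 2 + 4.0 * e_xy ** 2)
    return (e_xx + e_yy - root) / 2.0, (e_xx + e_yy + root) / 2.0


# Hypothetical symmetric linear taper with half-height a = 1 at x = 0.
a = 1.0
e_xx, e_xy, e_yy, e_zz = connector_tensor(
    1.0, 0.3, a,
    y1=lambda x: -1.0 + 0.25 * x, y2=lambda x: 1.0 - 0.25 * x,
    dy1=lambda x: 0.25, dy2=lambda x: -0.25,
)
e_11, e_22 = principal_values(e_xx, e_xy, e_yy)
```

A useful sanity check is that the product of the two in-plane principal values equals one for any choice of the curves, because the in-plane block of AA^T/det A has unit determinant for this transformation.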
Figure 10.6 Sketch of the waveguide connector: (a) sketch of the connector and (b) connector in the
Cartesian coordinate.
In order to validate the constitutive tensors above, we use FEM to simulate
arbitrary waveguide connectors. The geometrical sizes of the waveguides are
selected so that the TE10 mode can propagate at 2 GHz. The simulation domain is
shown in Figure 10.6(a), and port 1 is selected to launch the incident wave. Note
that the simulations are carried out in the transformed space, whereas the
constitutive parameters in (10.31) are expressed as functions of x and y, which are
variables of the original space. The variables of the original space should therefore
be replaced by the variables of the transformed space, which can be obtained
through (10.32):
$$x = x' \tag{10.32a}$$
$$y = \frac{a\,[2y' - y_2(x') - y_1(x')]}{y_2(x') - y_1(x')} \tag{10.32b}$$
$$z = z' \tag{10.32c}$$
First, a simple connector with a symmetrical structure is simulated to verify the
design formulas. The electric field distribution of the connector filled with
metamaterials with the designed constitutive parameters is shown in Figure 10.7(a);
the connector filled with air is also simulated for comparison, as shown in Figure
10.7(b). Although the air-filled connector does transmit electromagnetic waves
from the large waveguide into the small one, reflections occur and part of the
energy is lost. For the connector filled with the designed metamaterial, the
electromagnetic waves are guided from the large waveguide to the small one
without any impact on the guided mode. Several more general models are also
simulated, including a sharper connector and unsymmetrical connectors, and the
results are shown in Figure 10.8. The electric field distributions show that all the
connectors work very well: the electromagnetic waves are transmitted properly
from the large waveguide into the small one.
Figure 10.7 Simulation results of electric field distributions of: (a) waveguide connector based on
optical transformation and (b) traditional waveguide connector.
Figure 10.8 Simulation results of electric field distributions of: (a) sharper symmetrical waveguide
connector; (b) curve-symmetrical waveguide connector; (c) unsymmetrical waveguide
connector I; and (d) unsymmetrical waveguide connector II.
In this section, we focus on a multibeam antenna based on transform optics.
We restrict our investigation to 2-D cases for simplicity, where the field is
invariant in the z-direction. The transformation is constructed in the Cartesian
coordinate system (in the x-y plane), as shown in Figure 10.9. Assume that an ideal
isotropic line source is located at the center point O; r represents the radius of the
inner circle and l represents the length of the n-sided regular polygon. The regular
polygon domain is divided into n isosceles triangles with a vertex angle of θ. In the
triangle OAB, we first make the following transformation: the fan-shaped virtual
space OA′B′ is mapped to the triangular physical space OAB. This nonlinear
transformation inevitably makes the material parameters inhomogeneous. In order
to eliminate the inhomogeneity of the transformation space, a geometrical
simplification is made in the fan-shaped OA′B′: for a small angle θ, the arc length
of A′B′ is approximately equal to the length of segment A′B′. The triangle OA′B′
(x, y, z) is therefore mapped to the triangle OAB (x′, y′, z′). The mapping
transformation function can be written as follows:
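$$x' = ax + by + c \tag{10.33a}$$
$$y' = dx + ey + f \tag{10.33b}$$
$$z' = z \tag{10.33c}$$
with
$$\begin{pmatrix} a \\ b \\ c \end{pmatrix} = \begin{pmatrix} x_O & y_O & 1 \\ x_{A'} & y_{A'} & 1 \\ x_{B'} & y_{B'} & 1 \end{pmatrix}^{-1} \begin{pmatrix} x_O \\ x_A \\ x_B \end{pmatrix} \tag{10.34}$$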
Then the Jacobian matrix can be expressed as:
$$A = \begin{pmatrix} a & b & 0 \\ d & e & 0 \\ 0 & 0 & 1 \end{pmatrix} \tag{10.35}$$
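The coefficients of the linear mapping follow from three point correspondences, which the sketch below solves with Cramer's rule. It assumes that d, e, and f are obtained in the same way as a, b, and c, with the y-coordinates of O, A, and B on the right-hand side, and that the out-of-plane relative permittivity is 1/det A for a free-space virtual region. The function names and the example triangles (a uniform scaling by a factor of 3) are hypothetical.

```python
def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def solve3(m, rhs):
    """Solve the 3x3 linear system m * u = rhs with Cramer's rule."""
    d = det3(m)
    solution = []
    for col in range(3):
        replaced = [row[:] for row in m]
        for i in range(3):
            replaced[i][col] = rhs[i]
        solution.append(det3(replaced) / d)
    return solution


def affine_coefficients(virtual, physical):
    """Return (a, b, c, d, e, f) mapping three virtual points to three physical points."""
    m = [[x, y, 1.0] for x, y in virtual]
    a, b, c = solve3(m, [p[0] for p in physical])
    d, e, f = solve3(m, [p[1] for p in physical])
    return a, b, c, d, e, f


# Hypothetical triangles: O, A', B' (virtual) and O, A, B (physical).
virtual = [(0.0, 0.0), (1.0, 0.5), (1.0, -0.5)]
physical = [(0.0, 0.0), (3.0, 1.5), (3.0, -1.5)]
a, b, c, d, e, f = affine_coefficients(virtual, physical)
eps_zz = 1.0 / (a * e - b * d)  # zz component of A A^T / det A
```

Because the mapping is linear, the Jacobian and therefore the material tensors are constant inside each triangle, which is the reason the resulting medium is homogeneous.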
By the form invariance of Maxwell’s equations, we can obtain the constitutive
parameter tensors of the material in the transformation region:
$$\varepsilon' = \frac{A \varepsilon A^{T}}{\det A} \tag{10.36a}$$
$$\mu' = \frac{A \mu A^{T}}{\det A} \tag{10.36b}$$
1 0 0
0 1 0 (10.37)
0 0 r 2 l 2 cos 2 2
Note that all tensors of the material parameters (permittivity and permeability) are
position independent and are functions of r, l, and θ only. For fixed points O, A, A′,
B, B′, C, and C′, the material parameters are constants. Therefore, we can design
an arbitrary N-beam antenna with homogeneous materials, which are much easier
to realize with metamaterials. It is worth mentioning that the radiation direction
varies with the positions of the points (A, A′, B, B′), which allows more general
radiation directions with arbitrary beams. Although the inhomogeneity of the
media has been eliminated, the permeability is still not unity. In the case of a
transverse electric (TE) incident wave with the electric field polarized along the
z-direction, only ε′zz, μ′xx, and μ′yy in (10.37) are required. According to [5], the
wave trajectory remains unchanged as long as the products εzzμxx and εzzμyy are
kept invariant. Therefore, we can choose the simplest set of material parameters.
Set
r 2 cos 1 2
xx' yy' 1, zz' (10.38)
l2
Only one component of the material parameter tensors is then needed, so such
metamaterials are quite easy to fabricate in practical engineering work.
Full-wave simulations based on FEM are then carried out to verify the
approach. For convenience, a TE incident wave with the electric field polarized
along the z-axis is adopted. The boundaries surrounding the computational region
are set as perfectly matched layers (PMLs) to emulate wave propagation in open
space. The isotropic line source is located at the center O(0, 0). The working
frequency is set to 5.8 GHz.
Three typical cases are considered, namely three-beam, four-beam, and
eight-beam antennas, whose transformed-space parameters are ε′zz = 0.027, 0.056,
and 0.07, respectively; the line source embedded in free space is also simulated for
comparison. The distributions of the magnitude of the z-directed electric field for
all four cases are presented in Figure 10.10. Cylindrical waves are excited by the
isotropic line source in free space, as shown in Figure 10.10(a). When the line
source is embedded in the transformed medium with the constitutive parameters in
(10.38), the propagation path of the cylindrical waves is reshaped as desired, and
three, four, and eight beams are obtained, as shown in Figures 10.10(b–d),
respectively.
Figure 10.10 Electric field distributions for: (a) line source in the free space; (b) three-beam antenna;
(c) four-beam antenna; and (d) eight-beam antenna.
Furthermore, we also compute the far-field radiation patterns of the above four
antennas, as shown in Figure 10.11. The far-field pattern of the line source in free
space, shown in Figure 10.11(a), is taken for comparison. It can be seen from
Figures 10.11(b–d) that the multibeam antennas constructed from metamaterials
based on transform optics provide high-directivity beams in the desired directions.
The far-field patterns agree well with the electric field distributions in Figure
10.10, so both the near-field distributions and the far-field patterns verify the
theoretical design.
Figure 10.11 Far-field patterns of: (a) line source in free space; (b) three-beam antenna; (c) four-beam
antenna; and (d) eight-beam antenna.
The proposed unit cell of the detached ZIML is illustrated in Figure 10.12. It
consists of a metal patch and a modified split ring resonator (MSRR). The unit
cells are aligned along the x- and y-axes. The patches are aligned continuously
along the y-axis to form a metal strip, which produces an electric resonance and
thereby zero permittivity, similar to [41]. The MSRR consists of two square loops,
each with two slots on opposite sides. One loop is obtained by rotating the other by
90°, and the two loops are etched on opposite sides of the dielectric substrate. The
MSRR is used instead of the traditional SRR for its stronger magnetic resonance,
smaller electrical size, and broader resonance bandwidth [42]. Referring to Figure
10.12, the parameters of the metal patch and MSRR are designed as: l2 = 5.4 mm,
l3 = 6.6 mm, t1 = 2.9 mm, w = 0.8 mm, t2 = 0.8 mm, and εr = 2.2. The overall
lengths of the unit cell along the x-, y-, and z-axes are t = 8.2 mm, h = 6.6 mm, and
l1 = 8 mm, respectively.
Figure 10.13 Transmission and reflection coefficients of the unit cell of the detached ZIML.
The S-parameters are calculated for a periodic array of ZIML unit cells with a
thickness of one unit cell. The magnitudes of the S-parameters of the ZIML for a
z-directed incident plane wave are illustrated in Figure 10.13. The magnitude of
S21 is larger than −3 dB from 8.8 GHz to 10.9 GHz, with a peak value of 0 dB at
9 GHz and 9.9 GHz, implying that the field can easily pass through the ZIML
within this frequency band.
Effective constitutive parameters μeff and εeff of the ZIML are extracted from
the corresponding transmission and reflection data [43] and shown in Figure 10.14.
The effective permeability μeff and the effective permittivity εeff approach zero at
9.4 GHz and 9.7 GHz, respectively, which keeps the effective refractive index n
near zero over as broad a band as possible. In particular, the effective permittivity
and permeability are equal at 9.0 GHz and 9.9 GHz, with values of 0.8 and 0.3,
respectively, so that the ZIML has both a near-zero refractive index and perfect
wave-impedance matching with air.
$$S_{21} = \frac{4\tilde{\eta} Z}{(1 + \tilde{\eta})^{2} - (1 - \tilde{\eta})^{2} Z^{2}} \tag{10.39}$$
where Z is the transmission term
$$Z = \exp(-j\tilde{k}d) \tag{10.40}$$
The tildes above the parameters indicate that the parameters are complex. When
the thickness d of the slab is much smaller than the operating wavelength, the
magnitude of S21 approximately equals 1 regardless of the loss of the ZIML. The
thickness of the detached ZIML is 8 mm, which is much smaller than the
wavelength at frequencies from 9.4 GHz to 9.7 GHz, and hence the transmission of
fields through the detached ZIML remains high.
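The behavior of the transmission coefficient for an electrically thin slab can be explored with the following sketch. It assumes the standard slab expression S21 = 4ηZ / [(1 + η)² − (1 − η)² Z²] with Z = exp(−j n k0 d), where η is the normalized wave impedance and n the complex refractive index; the symbol names, the time convention exp(jωt), and the sample material values are assumptions made only for illustration.

```python
import cmath
import math

SPEED_OF_LIGHT = 299792458.0  # m/s


def slab_s21(eta, n, thickness, freq_hz):
    """Transmission coefficient of a homogeneous slab at normal incidence.

    eta: normalized (complex) wave impedance of the slab.
    n: complex refractive index; a negative imaginary part represents loss
       under the exp(+j*omega*t) time convention assumed here.
    """
    k0 = 2.0 * math.pi * freq_hz / SPEED_OF_LIGHT
    z = cmath.exp(-1j * n * k0 * thickness)  # transmission term
    return 4.0 * eta * z / ((1.0 + eta) ** 2 - (1.0 - eta) ** 2 * z ** 2)


# Hypothetical values: an 8 mm slab at 9.5 GHz.
s21_zero_index = slab_s21(2.5 + 0.3j, 0.0, 0.008, 9.5e9)
s21_lossy = slab_s21(1.0, 0.05 - 0.05j, 0.008, 9.5e9)
```

When the phase term Z tends to 1, either because the slab is thin or because the refractive index is near zero, the expression reduces to S21 = 1 for any impedance, which is the limit discussed above.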
The detached ZIML was fabricated and measured with an H-plane horn antenna to
validate its gain enhancement ability, as shown in Figures 10.16 and 10.17. The
detached ZIML is constructed from metal strip slabs and MSRR slabs, as shown in
Figures 10.16(a) and 10.17(a). Each metal strip slab and each MSRR slab is
realized by joining 13 of the patches or MSRRs in Figure 10.12 along the y-axis.
A metal strip slab and an MSRR slab are paired, and nineteen such pairs are
periodically inserted into slits in dielectric fixing slabs to form the detached ZIML,
as shown in Figure 10.17(b). The distance between the metal strip slab and the
MSRR slab is t1, and the period of the pairs is t. The parameters t and t1 are the
same as those in Figure 10.12. The ZIML is placed in front of the H-plane horn
antenna at a distance d = 40 mm, as shown in Figures 10.16(b) and 10.17(b). The
horn has an aperture of 139 mm (a2) × 12.70 mm (a3) and a length along the z-axis
of b = 143 mm, and it is fed by a waveguide of 25.40 mm (a1) × 12.70 mm (a3). In
addition, in Figure 10.17, fixing slab A is used to hold the metal strip slabs and
MSRR slabs that make up the ZIML, fixing slab B is used to fasten the ZIML and
the antenna together, and the fixing rods made of PET (polyethylene terephthalate)
are used to reinforce the ZIML.
Figure 10.16 Schema of the detached ZIML: (a) the enlarged structure of the metal strip slab and
MSRR slab and (b) an H-plane horn with the detached ZIML.
Figure 10.17 Prototype of the fabricated detached ZIML: (a) the metal strip slab and MSRR slab and
the overall view of the detached ZIML and (b) the H-plane horn with the detached
ZIML.
Return losses of the horn antenna with and without the detached ZIML were
measured with a vector network analyzer in a microwave anechoic chamber and
are depicted in Figure 10.18. The return loss of the horn is only slightly affected by
the ZIML, which transmits well owing to its impedance matching and small
thickness. The ZIML therefore barely reduces the total efficiency of the antenna,
which is an advantage over traditional ZIMLs or gradient index lenses.
Figure 10.18 Measured return losses of the H-plane horn with and without the detached ZIML.
The patterns of the horn antenna with and without the detached ZIML are also
investigated by both simulation and measurement. At 9.9 GHz, the simulation
results are shown in Figures 10.19(a) and 10.19(b), and the measured results are
shown in Figures 10.19(c) and 10.19(d). Figures 10.19(a) and 10.19(c) show the
simulated and measured E-plane patterns of the horn antenna with and without the
ZIML, respectively. Placing the ZIML in front of the horn significantly reduces the
width of the E-plane main lobe from 91.40° to 14.80°. Moreover, the measured
E-plane results are highly consistent with the simulated ones. Figures 10.19(b) and
10.19(d) show the simulated and measured H-plane patterns of the horn antenna
with and without the ZIML. In contrast to the E-plane pattern, both the simulated
and measured results indicate that the ZIML only slightly narrows the main lobe of
the H-plane pattern of the horn. The different effects of the ZIML on the E-plane
and H-plane patterns can be explained by the anisotropy of the ZIML. The
magnetic response of the MSRR can be excited only by an incident magnetic field
penetrating the MSRR plane. Referring to Figures 10.12 and 10.16(b), if the wave
is incident at an azimuth angle between the wave vector and the z-axis, the
constitutive parameters will be quite different from those extracted in Figure
10.14, and the refractive index n is no longer near zero. However, the MSRR is
independent of the direction of the electric field if the electric field vector is in the
y-z plane, because the magnetic field can always penetrate the MSRR plane. As a
result, the constitutive parameters vary little if the wave is incident at a pitch angle
and are almost the same as those extracted in Figure 10.14, which means that the
effective refractive index is still near zero and the improvement in the E-plane is
significant.
Figure 10.19 Normalized radiation patterns of the H-plane horn with and without the detached ZIML:
(a) the simulated results of E-plane patterns; (b) the simulated results of H-plane
patterns; (c) the measured results of E-plane patterns; and (d) the measured results of H-
plane patterns.
The gain enhancement of the H-plane horn antenna with the detached
ZIML is also measured, as shown in Figure 10.20. A wideband gain
enhancement from 8.9 GHz to 10.8 GHz is observed. In particular, the gain
enhancement at 9.9 GHz is 3.88 dB, and the peak is 4.02 dB at 9.7 GHz. It is
worth noting that the distance between the ZIML and the antenna is not a
critical parameter for the gain enhancement of the detached ZIML, which
distinguishes it from lenses based on Fabry-Pérot resonance and gradient
index lenses. To verify this, numerical simulations are carried out to compute
the gain of the H-plane horn antenna loaded with the detached ZIML for
different values of the distance d, and the antenna gain variations are shown in Figure
10.21. It can be seen from the figure that the gain enhancement of the horn
Figure 10.20 Measured gain variation of the horn with and without the detached ZIML.
Figure 10.21 Frequency variation of the simulated gains of the horn with the detached ZIML for
different distances between the antenna and ZIML.
The concept of graded index (GRIN) metamaterials was proposed by Smith et al.
[45] before the first GRIN metamaterial lens, which possessed a negative refractive
index, was realized for free-space microwave focusing by Driscoll et al. [46].
Recently, GRIN metamaterial lenses consisting of resonant metamaterials with a
positive index of refraction were designed to transform cylindrical or spherical
waves into plane waves, yielding antennas with increased directivity [47–56].
In particular, a highly sophisticated broadband, dual-linear-polarized,
high-directivity lens horn antenna using GRIN metamaterials composed of
multilayer microstrip square-ring arrays was presented in [55]. However, some
open issues remain in the recent studies of GRIN metamaterial lenses. All GRIN
metamaterial lenses presented so far are highly polarization sensitive: their
characteristic electromagnetic response is supported only for predefined linear
polarization states. Polarization-insensitive GRIN metamaterial lenses would be
highly desirable for many applications, such as satellite communications, where it
is necessary to work with circularly polarized waves. In all these cases, the
implementation of GRIN metamaterial lenses requires polarization independence.
Furthermore, similar to metamaterial cloaks, ideal GRIN metamaterial lenses
rely on a continuous distribution of the effective refractive index, which is
achieved by properly grading the unit cells of the underlying metamaterial
structure. Hence, any continuous index distribution has to be approximated by a
discrete set of metamaterial sections, each containing unit cells with either a
correspondingly altered shape or a substantially different topology. In general, the
geometric parameters of the metamaterial unit cells are obtained through extensive
full-wave numerical simulation, with no regard for the potential applicability of an
approximate analytical synthesis methodology. The design procedure of GRIN
metamaterial lenses may therefore become extremely arduous, not to mention the
resulting high manufacturing costs.
In response to these issues, we introduce a simple and highly efficient
automatic design and fabrication method for broadband polarization-insensitive
GRIN metamaterial lenses. The GRIN metamaterial lens encompasses a
nonresonant metamaterial layer that is represented by an isotropic dielectric slab
accordingly perforated with drill holes of deep-subwavelength dimensions, where
the desired polarization insensitivity is already fostered by the nonresonant nature
of the underlying metamaterial. We also derive analytical formulas describing the
proper distribution rules of the drill holes mimicking the intended grading of the
GRIN lens. The fabricated lens structure is then both numerically and
experimentally validated by placing it on the aperture of a circularly polarized
conical horn antenna.
$$n(r) = n_0 - \frac{\sqrt{L^{2} + r^{2}} - L}{t} \tag{10.41}$$
where n0 is the refractive index of the dielectric material, L is the distance from the
phase center to the incidence plane of the GRIN metamaterial lens, t is the
thickness of the lens, and r is the in-plane radial variable, corresponding to the
radius of the displayed concentric circle centered at the origin O.
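The index profile can be read as a path-length equalization condition, which the following sketch checks numerically. It assumes the reading n(r) = n0 − (√(L² + r²) − L)/t, so that the free-space path from the phase center plus the optical thickness n(r)t is the same for every radius. The function name and the numerical values of n0, L, and t are hypothetical.

```python
import math


def grin_index(r, n0, focal_distance, thickness):
    """Refractive index required at radius r of a flat GRIN lens."""
    extra_path = math.sqrt(focal_distance ** 2 + r ** 2) - focal_distance
    return n0 - extra_path / thickness


def optical_path(r, n0, focal_distance, thickness):
    """Path from the phase center to the exit plane through radius r."""
    free_space = math.sqrt(focal_distance ** 2 + r ** 2)
    return free_space + grin_index(r, n0, focal_distance, thickness) * thickness


# Hypothetical lens: n0 = sqrt(2.2), L = 100 mm, t = 5 mm.
n0 = math.sqrt(2.2)
paths = [optical_path(r, n0, 0.1, 0.005) for r in (0.0, 0.005, 0.010, 0.015)]
```

All entries of `paths` coincide, which is the condition for the outgoing wavefront to be planar. The profile is physically meaningful only while n(r) stays within the range that the perforated dielectric can provide, which limits the usable lens radius for a given thickness.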
Figure 10.22 Geometry of the GRIN metamaterial lens that transforms incoming spherical waves into
outgoing plane waves.
It can be easily reasoned that the key to realizing a polarization-insensitive
GRIN metamaterial lens is to choose: (1) an isotropic background material
substrate; (2) a feasible polarization-insensitive (and nonresonant) metamaterial
unit cell; and (3) a corresponding planar distribution of those unit cells, where the
latter two features have to respect the circular symmetry.
Here, the GRIN metamaterial lens is realized by a dielectric slab containing a
circularly symmetric distribution of deep-subwavelength drill holes, as displayed
in Figure 10.23. The unit cells are organized along concentric annuli centered at
the origin O, maintaining a uniform distribution of drill holes. All holes have the
same diameter d and, within an annulus of thickness a, are equally spaced with the
same central angle ζ; the central angle ζ may differ between annuli. Dielectric
plates with drill holes have proved to be a favorable feature in the design of lenses
and cloaks [57–60]. However, designs with nonuniform drill holes limit how far
the unit cell volume can be reduced, because a larger area is needed to vary the
hole radius to fit the required refractive index; it is thus hard to approximate the
continuous gradient index distribution of the ideal situation. At the same time, the
effective medium theory applies to subwavelength structures, thus the drill holes
with large radii in the design of nonuniform drill holes would make the dielectric
plate
Figure 10.23 Top view of the drill holes in the GRIN metamaterial lens (not drawn to scale).
where εd(v) and Vd(v) are the relative permittivity and the occupied volume of the
two material phases, that is, the dielectric background and the air hole, respectively.
In order to examine the accuracy of this approximation method, we analyze
the effective refractive index of an infinite dielectric slab with a periodic
perforation of air holes (see Figure 10.24) and compare the results obtained from
the mixing rule with numerically simulated data (see Figure 10.25). The relative
permittivity is εd = 2.2, and the thickness of the dielectric slab is chosen to be t = 5
mm. The diameter d of the air holes is fixed at 0.6 mm. The simulated effective
refractive index is obtained from the effective medium theory and the S-parameter
retrieval method [32, 63, 64].
Figure 10.24 Top view of the equivalent infinite dielectric slab with periodically distributed holes.
It can be observed from Figure 10.25(a) that the calculated and the simulated
data are in good agreement, especially for small volume fractions of the air holes;
this is a consequence of the validity range and the intrinsic symmetry of the
mixing formula. The relative error between the calculated and the simulated results
is defined as
$$\Delta = \frac{|n_c - n_s|}{n_c} \times 100\% \tag{10.43}$$
where nc is the calculated value of the refractive index and ns is the simulated one. It
can be deduced from Figure 10.25(b) that the maximum Δ over the frequency
range of 1 GHz to 15 GHz is less than 2.7%, which means that the mixing formula
given by (10.42) performs well in the long-wavelength limit, that is, within the
validity range of the metamaterial approach, and is therefore sufficient to estimate
the effective permittivity and refractive index of the underlying metamaterial.
Based on the possibility of synthesizing the metamaterial effective refractive
index, we can now combine (10.41) and (10.42) to find a distribution rule for
the drill holes that yields the graded refractive index profile for the intended
spherical-to-plane-wave transformation. On the annulus with inner radius
(k − 0.5)a and outer radius (k + 0.5)a, the effective refractive index of the metamaterial
has to match the value n(ka) according to
$$n(r) = n(ka) = n_0 - \frac{\sqrt{L^2 + (ka)^2} - L}{t}, \qquad r \in \big((k-0.5)a,\ (k+0.5)a\big], \quad k = 1, 2, \ldots \tag{10.44}$$
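$$A_d = A_{all} - A_v = \frac{2\pi k a^2 \zeta}{360} - \pi\left(\frac{d}{2}\right)^2, \qquad k = 1, 2, \ldots \tag{10.45}$$

$$\varepsilon_{eff} = \varepsilon_d - \frac{45 d^2 (\varepsilon_d - 1)}{k a^2 \zeta}, \qquad k = 1, 2, \ldots \tag{10.46}$$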
Figure 10.25 Infinite dielectric slab with a periodic perforation of air holes: (a) comparison between
simulated values and calculated values of the effective refractive index for the given
frequency range and (b) relative error between calculated and simulated results.
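$$\zeta(r) = \frac{45 d^2 (\varepsilon_d - 1)}{k a^2 \left[\varepsilon_d - \left(n_0 - \dfrac{\sqrt{L^2 + (ka)^2} - L}{t}\right)^2\right]}, \qquad r \in \big(0.0065(k-0.5),\ 0.0065(k+0.5)\big], \quad k = 1, 2, \ldots \tag{10.47}$$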
This analytical formula fully describes the distribution rule of the drill holes
according to the annulus k and the associated central angle ζ, allowing the GRIN
metamaterial lens to be automatically designed.
In order to validate the automatic design method of the GRIN metamaterial lens, a
prototype is designed based on (10.47) and the FDTD method is employed to
simulate this lens. As shown in Figure 10.26, the simulation model consists of the
GRIN metamaterial lens mounted on a conical horn antenna.
Figure 10.26 Sketch of the conical horn antenna (lower) with the GRIN metamaterial lens (upper).
Referring to Figure 10.26, the geometric dimensions of the horn antenna are
chosen to be L = 101 mm, dw = 24.9 mm, and R0 = 50 mm. The GRIN
metamaterial lens consists of a planar dielectric disk with the dimensions R = 60
mm and t = 40 mm, made of an isotropic dielectric material with a permittivity εd =
2.2. The GRIN lens is slightly larger than the horn antenna in order to collect potential
fringing fields. Referring to Figure 10.23, the diameter of the drill holes amounts to
d = 0.6 mm and the radial extent of the annulus is chosen to be a = 0.65 mm,
leading to a total of 92 annuli including 77 annuli positioned inside the horn
antenna aperture (r ≤ 50 mm) and 15 annuli outside the aperture (50 mm < r ≤ 60
mm). The conical horn antenna operates in the X-band from 8 GHz to 12 GHz.
Substituting the above values into (10.47), the central angle ζ(r) of the
corresponding annulus k = 1, 2…, 77 is calculated according to
$$\zeta(r) = \frac{46.0118}{k\left[\varepsilon_d - n^2(r)\right]}$$
As for the distribution rule of the drill holes outside the horn antenna aperture
(annulus k = 78, 79 …, 92), the design is the same as that on the annulus 77.
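The numerical coefficient 46.0118 can be checked directly from the stated geometry. The sketch below is illustrative only: it assumes a volume-weighted mixing rule with air-filled holes (relative permittivity 1) and a target profile of the form n(r) = n0 − (√(L² + r²) − L)/t, with all lengths in millimeters. The function names are hypothetical, and the on-axis index `n0` is left as a parameter because its value is not stated in this passage.

```python
import math

def hole_coefficient(eps_d=2.2, d=0.6, a=0.65):
    """Prefactor 45*d^2*(eps_d - 1)/a^2 of the central-angle rule (d and a in mm)."""
    return 45.0 * d**2 * (eps_d - 1.0) / a**2

def central_angle_deg(k, n0, eps_d=2.2, d=0.6, a=0.65, L=101.0, t=40.0):
    """Central angle in degrees between neighboring holes on annulus k.

    n0 is the on-axis refractive index; it is an assumed input here.
    """
    # Target index on annulus k (assumed spherical-to-plane-wave profile).
    n_target = n0 - (math.sqrt(L**2 + (k * a)**2) - L) / t
    # Solve eps_eff = eps_d - 45 d^2 (eps_d - 1) / (k a^2 zeta) = n_target^2 for zeta.
    return hole_coefficient(eps_d, d, a) / (k * (eps_d - n_target**2))
```

With d = 0.6 mm, a = 0.65 mm, and εd = 2.2, `hole_coefficient()` evaluates to about 46.0118, matching the constant quoted above. For a fixed `n0`, the angle shrinks with increasing k, so the holes become denser toward the rim where a lower index is required.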
For the purpose of visualization and comparison, the resulting effective
refractive index profile of the GRIN metamaterial lens is calculated by using (10.46)
and $n_{eff} = \sqrt{\varepsilon_{eff}}$ based on the geometric parameters above. As depicted in Figure
10.27, the given theoretical index profile [see (10.41)] is accurately approximated
by the discrete effective refractive index values of the synthesized GRIN
metamaterial lens.
Figure 10.27 Comparison of the theoretical target profile and the realized effective refractive index
distribution of the GRIN metamaterial lens.
Figure 10.28 compares the simulated electric field distribution in the radiative
near-field region of the circularly polarized conical antenna with and without the
designed GRIN metamaterial lens at 10.5 GHz. In particular, Figure 10.28(a)
shows that the electric field distribution of the ordinary horn antenna gradually
diverges, and an associated decrease in the field amplitude is observed in the
radiation direction along the antenna axis. After placing the GRIN metamaterial
lens on the horn antenna, the electric field distribution is improved as expected,
namely, a transversally confined quasi-plane wave with virtually uniform
amplitude (along the antenna axis) appears in the radiation area, as shown in
Figure 10.28(b). All these effects are represented by a considerably reduced width
of the main radiation lobe, which is tantamount to enhanced radiation directivity,
and thus to an increased gain, while underpinning the intended transformation
performance of the GRIN metamaterial lens.
Figure 10.28 Electric-field distributions in the x-z plane of the circularly polarized conical horn
antenna: (a) without GRIN metamaterial lens and (b) with GRIN metamaterial lens, both
at an operation frequency of 10.5 GHz.
Figure 10.29 Normalized far-field gain in the y-z plane of the circularly polarized conical horn antenna
with (solid line) and without (dashed line) GRIN metamaterial lens at the operation
frequencies (a) 8.1 GHz; (b) 10.0 GHz; and (c) 11.9 GHz.
Figure 10.30 Simulated frequency response of the maximum gain of the circularly polarized conical
antenna with (solid line) and without (dashed line) GRIN metamaterial lens.
Figure 10.32 Measured return loss of the conical horn antenna with (solid line) and without (dashed
line) GRIN metamaterial lens.
Figure 10.33 Measured normalized far-field gain patterns in the y-z plane of the circularly polarized
conical horn antenna with (solid line) and without (dashed line) GRIN metamaterial lens
at the operation frequencies: (a) 8.1 GHz, (b) 10.0 GHz, and (c) 11.9 GHz.
Given an upper bound of 0.522 [65] for the utilization factor (i.e., aperture
efficiency) of the optimal circular horn antenna, one readily concludes that the horn
antenna in the experiment is far from optimal and, more importantly, that the
designed GRIN metamaterial lens is capable of increasing the utilization factor
significantly, with peak values well above that upper bound. It is
worth noting that, in calculating the aperture efficiency of the GRIN lens
antenna, we use the aperture dimension of the GRIN lens rather than that of the
horn antenna in order to obtain convincing results.
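For readers who wish to reproduce such a utilization factor, the following sketch uses the standard relation between gain and physical aperture area, η = Gλ²/(4πA), for a circular aperture. It is a minimal illustration, not the authors' processing chain; the function name and the example numbers in the usage note are hypothetical.

```python
import math

C0 = 299_792_458.0  # speed of light in vacuum, m/s

def aperture_efficiency(gain_dbi, freq_hz, radius_m):
    """Aperture efficiency of a circular aperture: G * lambda^2 / (4 * pi * A)."""
    wavelength = C0 / freq_hz
    gain_linear = 10.0 ** (gain_dbi / 10.0)
    # Directivity of a uniformly illuminated circular aperture of this radius.
    max_directivity = (2.0 * math.pi * radius_m / wavelength) ** 2
    return gain_linear / max_directivity
```

As an example, `aperture_efficiency(20.0, 10e9, 0.06)` evaluates the efficiency for an assumed gain of 20 dBi at 10 GHz referred to the 60 mm lens radius. Referring the gain to the larger lens aperture rather than to the 50 mm horn aperture gives the more conservative figure, which is the choice described above.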
As intended by the chosen symmetry of both the air holes and the hole
distribution, the designed GRIN metamaterial lens is expected to have little impact
on the polarization state of incident waves. To verify this, the axial ratio of the
circularly polarized horn antenna with the GRIN metamaterial lens is analyzed and
compared with the corresponding ratio of the bare feeding horn antenna. The
measured axial ratio over an operating frequency range covering the entire X-band
is shown in Figure 10.36. The unloaded horn antenna emits circularly
polarized radiation with an axial ratio lower than 1.5 dB over the entire operating
bandwidth, whereas the inclusion of the GRIN metamaterial lens degrades the
axial ratio only in the subrange between 9.7 GHz and 11.3 GHz, with a maximum
value below 1.6 dB. Another characteristic measure of the quality of circular
polarization is the polarization efficiency, defined as
$$\eta_p = \frac{P_{co}}{P_{co} + P_{cross}} \tag{10.49}$$
where Pco and Pcross are the power of copolarization and cross-polarization,
respectively. The resulting minimum value within the whole X-band amounts to
99.2% for the radiation field emitted after the GRIN metamaterial lens, proving a
high degree of purity of the output circular polarization state.
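The quoted axial ratio and polarization efficiency can be cross-checked against each other. The sketch below assumes the standard relation between the axial ratio and the co- and cross-polarized circular field amplitudes, AR = (|E_co| + |E_cross|)/(|E_co| − |E_cross|); this relation is not stated in the text, and the function names are hypothetical.

```python
def polarization_efficiency(p_co, p_cross):
    """Polarization efficiency P_co / (P_co + P_cross)."""
    return p_co / (p_co + p_cross)

def efficiency_from_axial_ratio_db(ar_db):
    """Polarization efficiency implied by an axial ratio given in dB.

    Assumes AR = (|E_co| + |E_cross|) / (|E_co| - |E_cross|) for circular
    polarization, so that |E_cross| / |E_co| = (AR - 1) / (AR + 1).
    """
    ar_linear = 10.0 ** (ar_db / 20.0)
    amplitude_ratio = (ar_linear - 1.0) / (ar_linear + 1.0)
    # Power ratio is the square of the amplitude ratio.
    return polarization_efficiency(1.0, amplitude_ratio ** 2)
```

Under this assumption, an axial ratio of 1.6 dB corresponds to a polarization efficiency of about 0.992, which is consistent with the minimum value of 99.2% reported above.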
Figure 10.34 Frequency variation of the measured maximum gain of the circularly polarized conical
antenna with (solid line) and without (dashed line) GRIN metamaterial lens.
Figure 10.35 Frequency variation of the measured utilization coefficient (i.e., aperture efficiency) for
the horn antenna with (solid line) and without (dashed line) GRIN metamaterial lens.
Figure 10.36 Frequency variation of the measured axial ratio of the circularly polarized horn antenna
with (solid line) and without (dashed line) GRIN metamaterial lens.
10.5 CONCLUSIONS
In this chapter, methods and applications of metamaterials have been reviewed. The
theory of transformation optics is first summarized, and then the electromagnetic
concentrator and the waveguide connector are presented based on transformation
optics. Both numerical experiments and measurements have validated the
metamaterial theory and design.
REFERENCES
[1] V. Veselago, “The Electrodynamics of Substances with Simultaneously Negative Values of ε and
µ,” Soviet Physics USPEKHI, Vol. 10, No. 4, pp. 509–514, 1968.
[2] J. Pendry, A. Holden, W. Stewart, and I. Youngs, “Extremely Low Frequency Plasmons in
Metallic Mesostructures,” Physical Review Letters, Vol. 76, pp. 4773–4776, 1996.
[3] J. Pendry, A. Holden, D. Robbins, and W. Stewart, “Magnetism from Conductors and Enhanced
Nonlinear Phenomena,” IEEE Transactions on Microwave Theory and Techniques, Vol. 47, pp.
2075–2084, 1999.
[4] S. Enoch, G. Tayeb, P. Sabouroux, N. Guérin, and P. Vincent, “A Metamaterial for Directive
Emission,” Physical Review Letters, Vol. 89, p. 213902, 2002.
[5] D. Schurig, J. Mock, B. Justice, S. Cummer, and J. Pendry, “Metamaterial Electromagnetic Cloak
at Microwave Frequencies,” Science, Vol. 314, p. 977, 2006.
[6] J. Pendry, “Negative Refraction Makes a Perfect Lens,” Physical Review Letters, Vol. 85, p.
3966, 2000.
[7] K. Zhang, F. Meng, Q. Wu, J. Fu, and L. Li, “Waveguide Connector Constructed by Normal
Layered Dielectric Materials Based on Embedded Optical Transformation,” EPL, Vol. 99, p.
47008, 2012.
[8] H. Ma and T. Cui, “Three-Dimensional Broadband and Broad-Angle Transformation-Optics
Lens,” Nature Communications, Vol. 1, p. 124, 2010.
[9] T. Driscoll, G. Lipworth, J. Hunt, N. Landy, N. Kundtz, D. Basov, and D. Smith, “Performance
of a Three Dimensional Transformation-Optical-Flattened Lüneburg Lens,” Optics Express, Vol.
20, pp. 13262–13273, 2012.
[10] L. Cong, W. Cao, Z. Tian, J. Gu, J. Han, and W. Zhang, “Manipulating Polarization States of
Terahertz Radiation Using Metamaterials,” New Journal of Physics, Vol. 14, p. 115013, 2012.
[11] K. Zhang, Q. Wu, J. Fu, and L. Li, “Cylindrical Electromagnetic Concentrator with Only Axial
Constitutive Parameter Spatially Variant,” Journal of the Optical Society of America B, Vol. 28,
pp. 1573–1577, 2011.
[12] J. B. Pendry, “A Chiral Route to Negative Refraction,” Science, Vol. 306, pp. 1353–1355, 2004.
[13] B. Andres-Garcia, L. Garcia-Munoz, V. Gonzalez-Posadas, F. Herraiz-Martinez, and D. Segovia-
Vargas, “Filtering Lens Structure Based on SRRs in the Low THz Band,” Progress in
Electromagnetics Research, Vol. 93, pp. 71–90, 2009.
[14] L. Huang and H. Chen, “Multi-Band and Polarization Insensitive Metamaterial Absorber,”
Progress in Electromagnetics Research, Vol. 113, pp. 103–110, 2011.
[15] J. Pendry, “Negative Refraction Makes a Perfect Lens,” Physical Review Letters, Vol. 85, No. 18,
pp. 3966–3969, 2000.
[16] H. Chen, B. Hou, S. Chen, X. Ao, W. Wen, and C. Chan, “Design and Experimental Realization
of a Broadband Transformation Media Field Rotator at Microwave Frequencies,” Physical
Review Letters, Vol. 102, No. 18, pp. 183903:1–4, 2009.
[17] C. Lim and T. Itoh, “A Reflecto-Directive System Using a Composite Right/Left-Handed (CRLH)
Leaky-Wave Antenna and Hetero-Dyne Mixing,” IEEE Microwave and Wireless Components
Letters, Vol. 14, No. 4, pp. 183–185, 2004.
[18] H. Attia, M. Bait-Suwailam, and O. Ramahi, “Enhanced Gain Planar Inverted-F Antenna with
Metamaterial Superstrate for UMTS Applications,” Progress in Electromagnetics Research
Symposium Proceedings, Cambridge, pp. 494–497, 2010.
[19] H. Bahrami, M. Hakkak, and A. Pirhadi, “Analysis and Design of Highly Compact Bandpass
Waveguide Filter Utilizing Complementary Split Ring Resonators (CSRR),” Progress in
Electromagnetics Research, Vol. 80, pp. 107–122, 2008.
[20] S. Enoch, G. Tayeb, P. Sabouroux, N. Guérin, and P. Vincent, “A Metamaterial for Directive
Emission,” Physical Review Letters, Vol. 89, No. 21, pp. 213902:1–4, 2002.
[21] R. Ziolkowski, “Propagation in and Scattering from a Matched Metamaterial Having a Zero
Index of Refraction,” Physical Review E Statistical, Nonlinear, and Soft Matter Physics, Vol.
70, No. 42, pp. 046608:1–4, 2004.
[22] Q. Wu, P. Pan, F. Meng, L. Li, and J. Wu, “A Novel Flat Lens Horn Antenna Designed Based on
Zero Refraction Principle of Metamaterials,” Applied Physics A Materials Science and
Processing, Vol. 87, No. 2, pp. 151–156, 2007.
[23] Z. Xiao and H. Xu, “Low Refractive Metamaterials for Gain Enhancement of Horn Antenna,”
Journal of Infrared and Millimeter Waves, Vol. 30, pp. 225–232, 2009.
[24] D. Kim and J. Choi, “Analysis of Antenna Gain Enhancement with a New Planar Metamaterial
Superstrate: An Effective Medium and a Fabry-Pérot Resonance Approach,” Journal of Infrared
Millimeter and Terahertz Waves, Vol. 31, No. 11, pp. 1289–1303, 2010.
[25] S. Hrabar, D. Bonefacic, and D. Muha, “ENZ-Based Shortened Horn Antenna - An Experimental
Study,” Antennas and Propagation Society International Symposium, San Diego, CA, pp. 1–4,
2008.
[26] J. Ju, D. Kim, W. Lee, and J. Choi, “Wideband High-Gain Antenna Using Metamaterial
Superstrate with the Zero Refractive Index,” Microwave and Optical Technology Letters, Vol. 51,
No. 8, pp. 1973–1976, 2009.
[27] Q. Cheng, W. Jiang, and T. Cui, “Radiation of Planar Electromagnetic Waves by a Line Source
in Anisotropic Metamaterials,” Journal of Physics D-Applied Physics, Vol. 43, No. 33, pp.
335446:1–6, 2010.
[28] Y. Ma, P. Wang, X. Chen, and C. Ong, “Near-Field Plane-Wave-Like Beam Emitting Antenna
Fabricated by Anisotropic Metamaterial,” Applied Physics Letters, Vol. 94, No. 4, pp.
044107:1–3, 2009.
[29] Z. Jiang and D. Werner, “Anisotropic Metamaterial Lens with a Monopole Feed for High-Gain
Multi-Beam Radiation,” IEEE International Symposium on Antennas and Propagation, pp.
1346–1349, Spokane, WA, 2011.
[30] Z. Weng, Y. Jiao, G. Zhao, and F. Zhang, “Design and Experiment of One Dimension and Two
Dimension Metamaterial Structures for Directive Emission,” Progress in Electromagnetics
Research, Vol. 70, pp. 199–209, 2007.
[31] Z. Weng, X. Wang, Y. Song, Y. Jiao, and F. Zhang, “A Directive Patch Antenna with Arbitrary
Ring Aperture Lattice Metamaterial Structure,” Journal of Electromagnetic Waves and
Applications, Vol. 22, No. 8–9, pp. 1283–1291, 2008.
[32] R. Sauleau, P. Coquet, T. Matsui, and J. Daniel, “A New Concept of Focusing Antennas Using
Plane-Parallel Fabry-Perot Cavities with Nonuniform Mirrors,” IEEE Transactions on Antennas
and Propagation, Vol. 51, No. 11, pp. 3171–3175, 2003.
[33] D. Smith, S. Schultz, S. McCall, and P. Platzmann, “Defect Studies in a 2-Dimensional Periodic
Photonic Lattice,” Journal of Modern Optics, Vol. 41, No. 2, pp. 395–404, 1994.
[34] D. Kaklamani, “Full-Wave Analysis of a Fabry-Perot Type Resonator,” Journal of
Electromagnetic Waves and Applications, Vol. 13, No. 12, pp. 1627–1634, 1999.
[35] B. Zhou, H. Li, X. Zou, and T. Cui, “Broadband and High-Gain Planar Vivaldi Antennas Based
on Inhomogeneous Anisotropic Zero-Index Metamaterials,” Progress in Electromagnetics
Research, Vol. 120, pp. 235–247, 2011.
[36] Q. Wu, J. Turpin, D. Werner, and E. Lier, “Thin Metamaterial Lens for Directive Radiation,”
IEEE International Symposium on Antennas and Propagation, Spokane, WA, pp. 2886–2889,
2011.
[37] J. Turpin, Q. Wu, D. Werner, E. Lier, B. Martin, and M. Bray, “Anisotropic Metamaterial
Realization of a Flat Gain-Enhancing Lens for Antenna Applications,” IEEE International
Symposium on Antennas and Propagation, pp. 2882–2885, Spokane, WA, 2011.
[38] Z. Mei, J. Bai, T. Niu, and T. Cui, “A Half Maxwell Fish-Eye Lens Antenna Based on Gradient-
Index Metamaterials,” IEEE Transactions on Antennas and Propagation, Vol. 60, No. 1, pp.
398–401, 2012.
[39] Y. Zhang, R. Mittra, and W. Hong, “On the Synthesis of a Flat Lens Using a Wideband Low-
Refraction Gradient-Index Metamaterial,” Journal of Electromagnetic Waves and Applications,
Vol. 25, No. 16, pp. 2178–2187, 2011.
[40] J. Neu, B. Krolla, O. Paul, B. Reinhard, R. Beigang, and M. Rahm, “Metamaterial-Based
Gradient Index Lens with Strong Focusing in the THz Frequency Range,” Optics Express, Vol.
18, No. 26, pp. 27748–27757, 2010.
[41] J. Pendry, A. Holden, W. Stewart, and I. Youngs, “Extremely Low Frequency Plasmons in
Metallic Mesostructures,” Physical Review Letters, Vol. 76, No. 25, pp. 4773–4776, 1996.
[42] Q. Tang, F. Meng, Q. Wu, and J. Lee, “A Balanced Composite Backward and Forward Compact
Waveguide Based on Resonant Metamaterials,” Journal of Applied Physics, Vol. 109, No. 7, pp.
07A319:1–3, 2011.
[43] F. Meng, Q. Wu, D. Erni, and L. Li, “Controllable Metamaterial-Loaded Waveguides Supporting
Backward and Forward Waves,” IEEE Transactions on Antennas and Propagation, Vol. 59, No.
9, pp. 3400–3411, 2011.
[44] R. Ziolkowski, “Design, Fabrication, and Testing of Double Negative Metamaterials,” IEEE
Transactions on Antennas and Propagation, Vol. 51, No. 7, pp. 1516–1529, 2003.
[45] D. Smith, J. Mock, A. Starr, and D. Schurig, “Gradient Index Metamaterials,” Physical Review E,
Vol. 71, pp. 036609:1–5, 2005.
[46] T. Driscoll, D. Basov, A. Starr, P. Rye, S. Nemat-Nasser, and D. Schurig et al., “Free-Space
Microwave Focusing by a Negative-Index Gradient Lens,” Applied Physics Letters, Vol. 88, pp.
081101:1–3, 2006.
[47] M. Goldflam, T. Driscoll, B. Chapler, O. Khatib, N. Jokerst, and S. Palit et al., “Reconfigurable
Gradient Index Using VO2 Memory Metamaterials,” Applied Physics Letters, Vol. 99, pp.
044103:1–3, 2011.
[48] O. Paul, B. Reinhard, B. Krolla, R. Beigang, and M. Rahm, “Gradient Index Metamaterial Based on
Slot Elements,” Applied Physics Letters, Vol. 96, pp. 241110:1–3, 2010.
[49] L. Ruopeng, C. Qiang, J. Chin, J. Mock, T. Cui, and D. Smith, “Broadband Gradient Index
Microwave Quasioptical Elements Based on Non-Resonant Metamaterials,” Optics Express, Vol.
17, pp. 21030–21041, 2009.
[50] L. Ruopeng, Y. Mi, J. Gollub, J. Mock, T. Cui, and D. Smith, “Gradient Index Circuit by
Waveguided Metamaterials,” Applied Physics Letters, Vol. 94, pp. 073506:1–3, 2009.
[51] D. Smith, Y. Tsai, and S. Larouche, “Analysis of a Gradient Index Metamaterial Blazed
Diffraction Grating,” IEEE Antennas and Wireless Propagation Letters, Vol. 10, pp. 1605–1608,
2011.
[52] Y. Xin Mi, Z. Xiao Yang, C. Qiang, M. Feng, and T. Cui, “Diffuse Reflections by Randomly
Gradient Index Metamaterials,” Optics Letters, Vol. 35, pp. 808–810, 2010.
[53] L. Zhen, Q. Rui, and C. Zhen, “A Novel Broadband Fabry-Perot Resonator Antenna with
Gradient Index Metamaterial Superstrate,” IEEE International Symposium Antennas and
Propagation and CNC-USNC/URSI Radio Science Meeting, Toronto, pp. 1–4, 2010.
[54] M. Zhong and T. Cui, “Experimental Realization of a Broadband Bend Structure Using Gradient
Index Metamaterials,” Optics Express, Vol. 17, pp. 18354–18363, 2009.
[55] X. Chen, H. Ma, X. Zou, W. Jiang, and T. Cui, “Three-Dimensional Broadband and High-
Directivity Lens Antenna Made of Metamaterials,” Journal of Applied Physics, Vol. 110, pp.
044904:1–8, 2011.
[56] H. Ma, X. Chen, H. Xu, X. Yang, W. Jiang, and T. Cui, “Experiments on High-Performance
Beam-Scanning Antennas Made of Gradient-Index Metamaterials,” Applied Physics Letters, Vol.
95, pp. 094107:1–3, 2009.
[57] Z. Mei, J. Bai, and T. Cui, “Gradient Index Metamaterials Realized by Drilling Hole Arrays,”
Journal of Physics D-Applied Physics, Vol. 43, pp. 055404:1–6, 2010.
[58] H. Ma and T. Cui, “Three-Dimensional Broadband and Broad-Angle Transformation-Optics
Lens,” Nature Communications, Vol. 1, pp. 124:1–6, 2010.
[59] B. Zhou, Y. Yang, H. Li, and T. Cui, “Beam-Steering Vivaldi Antenna Based on Partial
Luneburg Lens Constructed with Composite Materials,” Journal of Applied Physics, Vol. 110, pp.
084908:1–6, 2011.
[60] H. Ma and T. Cui, “Three-Dimensional Broadband Ground-Plane Cloak Made of Metamaterials,”
Nature Communications, Vol. 1, pp. 21:1–6, 2010.
[61] L. Zhijia, S. Yang, and Z. Nie, “A Dielectric Lens Antenna Design by Using the Effective
Medium Theories,” International Symposium on Intelligent Signal Processing and
Communication Systems, Chengdu, China, pp. 1–4, 2010.
[62] A. Ittipiboon and S. Thirakoune, “Investigation on Arrays of Perforated Dielectric Fresnel
Lenses,” IEE Proceedings Microwaves, Antennas and Propagation, Vol. 153, pp. 270–276, 2006.
[63] F. Meng, Q. Wu, D. Erni, and L. Li, “Controllable Metamaterial-Loaded Waveguides Supporting
Backward and Forward Waves,” IEEE Transactions on Antennas and Propagation, Vol. 59, pp.
3400–3411, 2011.
[64] H. Ma and T. Cui, “Three-Dimensional Broadband and Broad-Angle Transformation-Optics
Lens,” Nature Communications, Vol. 1, pp. 124:1–6, 2010.
[65] T. Teshirogi and T. Yoneyama (eds.), Modern Millimeter-Wave Technologies, Burke, VA: IOS
Press, 2001.
Chapter 11
Time-Domain Integral Equation Method for
Transient Problems
Mingyao Xia
This chapter is concerned with the time-domain integral equation (TDIE) method
for solving transient problems. Following a brief introduction to the approach,
various integral equations are derived based on the equivalence principle, retarded
potential theory, and boundary conditions. Discretization schemes are then described,
including geometric meshing and the mathematical treatment that converts the continuous
operator equations into discrete linear systems. Emphasis is placed on the precise
evaluation of matrix elements, which is crucial for stability and accuracy. As an
advanced topic, the method is extended to transient scattering by an arbitrarily moving
body, which may travel at hypervelocity and rotate simultaneously about a center.
Numerous numerical results are provided for both algorithmic validation and
real-world applications.
11.1 INTRODUCTION
equations at each time step. This was achieved by choosing a small time
step and point-matching the equations in both the spatial and temporal domains, so
that the field at the matched space-time point had two contributions:
one from the source on the observer’s patch at the current time step, and one from the
sources on the other patches at earlier time steps. In other words, the
contributions from the sources on other patches at the current time step are
excluded owing to the finite propagation speed. Unfortunately, this approach was
bound to diverge eventually unless an extra stabilizing measure was introduced.
Stable solutions may be achieved by choosing a larger time step, which
led to the so-called implicit procedure [16] that had to solve a sparse matrix
equation at each time step. However, the results might be inaccurate. Another way
to obtain stable solutions was to use entire-domain matching [17] instead of
point-matching or the MOT scheme, at the cost of increased storage
and computing time. It may be said that TDIE methods were plagued
to a large extent by instability until the end of the last century, so that they missed
the key period in which the FDTD method became popular. Fortunately,
significant progress was made in reducing the computational complexity for large-scale
problems, namely, the development of the fast plane wave time domain (PWTD)
algorithm [18], which is the time-domain counterpart of the fast multipole
method (FMM) in the frequency domain.
Since the turn of this century, two important techniques have been introduced
to overcome instability and generate accurate TDIE solutions. The first technique
was the proper choice of temporal basis functions that could postpone the
occurrence of instability. They included the squared cosine function [19],
approximate prolate spheroidal wave function [20], higher order Lagrange
interpolating function [21], and quadratic B-spline function [22]. Another choice
was the Laguerre polynomials that led to the marching-on-in-degree (MOD)
scheme [23] rather than the classical MOT scheme. The second technique
consisted of the precise evaluation of matrix elements [24], which proved to be
very effective in improving stability and accuracy. Precise calculations of matrix
elements for wire structures [25, 26], 2-D configurations [27, 28], and general 3-D
scatterers [29, 30] were reported. It seemed clear that the earlier instability and
inaccuracy of the TDIE methods were caused by inaccurate evaluation of matrix
elements, in addition to improper choice or manipulation of temporal basis functions. It
is expected that a proper choice of temporal basis functions and precise evaluation of
matrix elements, in conjunction with the fast PWTD algorithms, will make
TDIE solvers attractive tools for simulating various transient or ultrawideband
problems.
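The time-stepping idea behind the explicit scheme described earlier can be sketched as follows. This is a minimal toy illustration, not the chapter's solver: it assumes that the current-step interaction matrix is diagonal (only the self term acts at the current step) and that delayed interactions are stored as one matrix per time lag. All names and numbers are hypothetical.

```python
def march_on_in_time(z0_diag, z_hist, excitation):
    """Explicit marching: z0_diag[i] * I[n][i] = V[n][i] - sum_k (Z_k @ I[n-k])[i].

    z0_diag    : self terms acting at the current time step (diagonal of Z_0)
    z_hist     : list of delayed interaction matrices Z_1, Z_2, ...
    excitation : list of right-hand-side vectors V[0], V[1], ...
    """
    n_unknowns = len(z0_diag)
    currents = []
    for n, v_n in enumerate(excitation):
        rhs = list(v_n)
        # Subtract the fields radiated by currents at earlier time steps.
        for lag, z_k in enumerate(z_hist, start=1):
            if n - lag < 0:
                break
            past = currents[n - lag]
            for i in range(n_unknowns):
                rhs[i] -= sum(z_k[i][j] * past[j] for j in range(n_unknowns))
        # Z_0 is diagonal here, so no matrix solve is needed at this step.
        currents.append([rhs[i] / z0_diag[i] for i in range(n_unknowns)])
    return currents
```

In an implicit scheme the current-step matrix is no longer diagonal, and the final division is replaced by a sparse linear solve at every step. Errors in the entries of the delayed matrices accumulate through the recursion, which is one way to see why precise evaluation of the matrix elements matters for late-time stability.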
This chapter gives a basic description of the TDIE methods, from the derivation of
the governing equations to their discretization as time-stepping linear systems, with
emphasis placed on the precise evaluation of matrix elements. An extension to
simulations of scattering by moving objects is presented. Numerical
examples are provided as benchmarks.
For integral equation (IE) methods, the unknown functions to be solved for are the
sources, including real electric currents, equivalent electric currents, equivalent
magnetic currents, equivalent electric dipoles, or other equivalent sources. These
sources are distributed over limited regions, such as conducting wires, conducting
or dielectric surfaces, or a finite volume, and are accordingly called wire sources,
surface sources, or volume sources. The EMFs generated by the sources are
generally expressed in integral form through a Green's function, which is the solution
for a point source under the same boundary conditions as the problem.
Because equivalent sources can be assumed, any physical
interface can be removed and replaced by an equivalent surface source. The same is
true for a volume filled with any medium: the region can be replaced by any other
material plus an equivalent volume source. Therefore, if desired, we can replace any
geometric structure with equivalent sources and treat the problem in
free space, so that the free-space solutions of Maxwell's equations apply.
With equivalent sources assumed, the EMFs are expressed in terms of the sources in
integral form. To determine the source distributions, the fields are required to
satisfy the boundary conditions. Doing so yields a set of integral
equations (IEs), which are precisely the governing equations to be solved.
Once the IEs are solved and the sources are obtained, the EMFs at any space-time
point can be found from the integral expressions. Postprocessing may then
follow using the computed EMFs.
This outlines the complete process by which an IE method is carried out, and the
TDIE method follows the same process. In this section, we concentrate on deriving
the various governing integral equations based on the equivalence principle described
above. Discretization and solution of these TDIEs are left to later sections.
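$$\mathbf{A}(\mathbf{r}, t) = \mu_0 \int_S \frac{\mathbf{J}_s(\mathbf{r}', t - R/c)}{4\pi R}\, dS', \qquad R = |\mathbf{r} - \mathbf{r}'| \tag{11.2}$$

$$\phi(\mathbf{r}, t) = \frac{1}{\varepsilon_0} \int_S \frac{\rho_s(\mathbf{r}', t - R/c)}{4\pi R}\, dS' \tag{11.3}$$

in which $\varepsilon_0$ and $\mu_0$ are the permittivity and permeability of free space and
$c = 1/\sqrt{\varepsilon_0 \mu_0}$ is the speed of light in free space. The scattered fields are retrieved from
the retarded potentials and written as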
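$$\mathbf{E}^s = -\frac{\partial \mathbf{A}}{\partial t} - \nabla\phi = -\eta_0 L_0(\mathbf{J}_s) \tag{11.4}$$

$$\mathbf{H}^s = \frac{1}{\mu_0} \nabla \times \mathbf{A} = -K_0(\mathbf{J}_s) \tag{11.5}$$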
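$$K_0(\mathbf{J}_s) = \int_S \frac{\mathbf{R}}{R} \times \left(\frac{1}{R} + \frac{1}{c}\frac{\partial}{\partial t}\right) \frac{\mathbf{J}_s(\mathbf{r}', t - R/c)}{4\pi R}\, dS' \tag{11.7}$$

$$K_0(\mathbf{J}_s) = \tilde{K}_0(\mathbf{J}_s) \pm \frac{1}{2}\hat{n} \times \mathbf{J}_s, \qquad \mathbf{r} \in S^{\pm} \tag{11.8}$$

where $\tilde{K}_0$ is the principal value part of $K_0$, and $S^{\pm}$ means the outside/inside
surface of the scatterer.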
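The retarded potentials above depend on the free-space constants only through the delay R/c, and the field operators are scaled by the free-space wave impedance √(μ0/ε0). As a small numerical aid, the sketch below evaluates these quantities and the retarded time t − R/c for a source and observation point. The constant values are the CODATA 2018 figures; the function name is hypothetical.

```python
import math

EPS0 = 8.8541878128e-12   # permittivity of free space, F/m (CODATA 2018)
MU0 = 1.25663706212e-6    # permeability of free space, H/m (CODATA 2018)

C0 = 1.0 / math.sqrt(EPS0 * MU0)   # speed of light in free space, m/s
ETA0 = math.sqrt(MU0 / EPS0)       # free-space wave impedance, ohms

def retarded_time(t, r_obs, r_src):
    """Time t - R/c at which a source at r_src contributes to the field at r_obs."""
    distance = math.dist(r_obs, r_src)
    return t - distance / C0
```

A source sample contributes to the field at time t only if its retarded time falls within the interval where the source is nonzero, which is the causality property exploited by time-marching schemes.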
The integral equations are established by enforcing the boundary conditions.
The electric and magnetic field boundary conditions for a PEC object read

$$\hat{n} \times (\mathbf{E}^i + \mathbf{E}^s) = 0 \tag{11.9}$$

$$\hat{n} \times (\mathbf{H}^i + \mathbf{H}^s) = \mathbf{J}_s \tag{11.10}$$

where $\mathbf{E}^i$ and $\mathbf{H}^i$ are the incident electric and magnetic fields. Substituting (11.4)
through (11.8) into the above, we have
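$$\hat{n} \times \left[\frac{1}{c}\frac{\partial}{\partial t} \int_S \frac{\mathbf{J}_s(\mathbf{r}', t - R/c)}{4\pi R}\, dS' + c\nabla \int_S \frac{\rho_s(\mathbf{r}', t - R/c)}{4\pi R}\, dS'\right] = \frac{1}{\eta_0}\hat{n} \times \mathbf{E}^i(\mathbf{r}, t) \tag{11.11}$$

$$\frac{1}{2}\mathbf{J}_s + \hat{n} \times \mathrm{P.V.}\!\int_S \frac{\mathbf{R}}{R} \times \left(\frac{1}{R} + \frac{1}{c}\frac{\partial}{\partial t}\right) \frac{\mathbf{J}_s(\mathbf{r}', t - R/c)}{4\pi R}\, dS' = \hat{n} \times \mathbf{H}^i(\mathbf{r}, t) \tag{11.12}$$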
where P.V. denotes the principal value integral. Either the electric field
integral equation (EFIE) (11.11) or the magnetic field integral equation (MFIE)
(11.12) can be used to solve for the induced currents on the surface. However, for a
closed body, using either the EFIE or the MFIE alone may encounter the interior
resonance problem, which can cause significant errors near the resonant frequencies;
these errors are clearly observed when time-domain data are transformed into the
frequency domain. To overcome the resonance issue, the combined field integral
equation (CFIE) should be employed, which is a combination of the EFIE and the MFIE,
for example, a weighted combination of $\hat{n} \times \mathrm{EFIE}$ and MFIE with weights $\alpha$ and
$(1 - \alpha)$, where $0 \le \alpha \le 1$ is a combination parameter.
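An extension from the 3-D formulation above to the 1-D case is easy. However,
usually only the EFIE is used for a wire problem. The boundary condition (11.9) is
rewritten as $[\mathbf{E}^i + \mathbf{E}^s]_{\tan} = 0$, where the subscript “tan” means taking the tangential
component along the wire. The EFIE then becomes

$$\left[\frac{1}{c}\frac{\partial}{\partial t} \int_L \frac{\mathbf{J}(\mathbf{r}', t - R/c)}{4\pi R}\, dl' + c\nabla \int_L \frac{\rho(\mathbf{r}', t - R/c)}{4\pi R}\, dl'\right]_{\tan} = \frac{1}{\eta_0}\left[\mathbf{E}^i(\mathbf{r}, t)\right]_{\tan} \tag{11.13}$$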
in which $\mathbf{J}$ and $\rho$ are the line distributions of current and charge along the
wire. A wire problem usually means that the thin-wire approximations apply,
which require that the diameter of the wire be much smaller than both its length and a
properly defined minimum wavelength of the incident wave. The scattering
geometry of a wire problem is shown in Figure 11.2. Under the thin-wire
approximations, the currents and charges are taken to lie on the centerline, and the
distance from a source point to an observation point is approximated by

$$R = \sqrt{a^2 + |\mathbf{r} - \mathbf{r}'|^2} \tag{11.14}$$

where a is the radius of the wire, and $\mathbf{r}'$ lies exactly on the centerline.
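A practical consequence of the thin-wire distance is that R never falls below the wire radius, so the 1/(4πR) kernel stays finite even when the source and observation points coincide on the centerline. The short sketch below illustrates this; the function name is hypothetical and the points are given as Cartesian tuples.

```python
import math

def thin_wire_distance(r_obs, r_src, radius):
    """Reduced-kernel distance sqrt(a^2 + |r - r'|^2) for a wire of radius a."""
    return math.sqrt(radius**2 + math.dist(r_obs, r_src)**2)
```

For coincident points the function returns the radius itself, and for separations much larger than the radius it approaches the ordinary distance between the two points.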
The extension from the 3-D formulation to the 2-D case is also straightforward. The
scattering geometry is shown in Figure 11.3. The body is taken to be infinite in the
z-direction. The retarded potentials (11.2) and (11.3) are now modified as
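$$\mathbf{A}(\boldsymbol{\rho}, t) = \frac{\mu_0}{2\pi} \int_C \int_R^{ct} \frac{\mathbf{J}_s(\boldsymbol{\rho}', t - t')}{\sqrt{(ct')^2 - R^2}}\, d(ct')\, dC' \tag{11.15}$$

$$\phi(\boldsymbol{\rho}, t) = \frac{1}{2\pi\varepsilon_0} \int_C \int_R^{ct} \frac{\rho_s(\boldsymbol{\rho}', t - t')}{\sqrt{(ct')^2 - R^2}}\, d(ct')\, dC' \tag{11.16}$$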
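$$\frac{1}{2}\mathbf{J}_s(\boldsymbol{\rho}, t) + \hat{n} \times \mathrm{P.V.}\,\frac{1}{2\pi} \int_C \int_R^{ct} \frac{\mathbf{R}}{R} \times \frac{ct'}{R}\,\frac{1}{c}\frac{\partial}{\partial t}\, \frac{\mathbf{J}_s(\boldsymbol{\rho}', t - t')}{\sqrt{(ct')^2 - R^2}}\, d(ct')\, dC' = \hat{n} \times \mathbf{H}^i \tag{11.18}$$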
We have derived all the integral equations for PEC scatterers above. An extension
to the homogeneous lossless dielectric case is not difficult. We use a 3-D dielectric
body as an example, as shown in Figure 11.4, and assume distributions of
equivalent electric current $\mathbf{J}_s$ and equivalent magnetic current $\mathbf{J}_{ms}$ on the object
surface. By virtue of the duality principle and the forms of (11.4) and (11.5), the
scattered fields outside the body can be written as

$$\mathbf{E}^s = -\eta_0 L_0(\mathbf{J}_s) + K_0(\mathbf{J}_{ms}) \tag{11.19}$$

$$\mathbf{H}^s = -\frac{1}{\eta_0} L_0(\mathbf{J}_{ms}) - K_0(\mathbf{J}_s) \tag{11.20}$$

The transmitted fields inside the body can be written similarly as

$$\mathbf{E}^t = \eta_1 L_1(\mathbf{J}_s) - K_1(\mathbf{J}_{ms}) \tag{11.21}$$

$$\mathbf{H}^t = \frac{1}{\eta_1} L_1(\mathbf{J}_{ms}) + K_1(\mathbf{J}_s) \tag{11.22}$$
Substituting (11.19) and (11.21) into the equations above and extracting the
singularity property as (11.8), we obtain the EFIEs as follows:
1
nˆ 0 L0 (J s ) J ms nˆ K0 (J ms ) nˆ Ei (11.25)
2
1
nˆ 1 L1 (J s ) J ms nˆ K1 (J ms ) 0 (11.26)
2
Similarly, by using the magnetic field boundary conditions, we can obtain the
MFIEs, which are the dual forms of the EFIEs above, that is,
1 1
nˆ L0 (J ms ) J s nˆ K0 (J s ) nˆ Hi (11.27)
0 2
1 1
nˆ L1 (J ms ) J s nˆ K1 (J s ) 0 (11.28)
1 2
By using the EFIEs (11.25) and (11.26) or the MFIEs (11.27) and (11.28), the
equivalent currents $\mathbf{J}_s$ and $\mathbf{J}_{ms}$ can be solved. However, using either the EFIEs or
the MFIEs alone, we may encounter the interior resonance problem, as pointed out in the PEC
case. The reason is that only the electric field or only the magnetic
field boundary conditions are used, but not both. Thus, a simple remedy is to combine the
two sets of equations, so that both the electric field and magnetic field boundary
conditions are enforced. Direct additions of (11.25) to (11.26) and (11.27) to
(11.28) lead to the Poggio-Miller-Chang-Harrington-Wu-Tai (PMCHWT)
equations [31]:
nˆ (0 L0 1 L1 )(Js ) nˆ (K0 +K1 )(J ms ) nˆ Ei (11.29)
1 1
nˆ ( L0 L1 )(J ms ) nˆ (K0 +K1 )(J s ) nˆ Hi (11.30)
0 1
1 p
nˆ (0 L0 p1 L1 )(J s ) J ms nˆ (K0 pK1 )(J ms ) nˆ Ei (11.31)
2
1 q 1 q
nˆ ( L0 L1 )(J ms ) J s nˆ (K0 qK1 )(J s ) nˆ Hi (11.32)
0 1 2
The governing equations derived above must be discretized before they can be
solved numerically. Geometric discretization is also called meshing, which is used
to divide the whole solution domain into many small units. Mathematical
where $\mathbf{r}_n$ $(n = 0, 1, \ldots, N)$ are a set of control points on the wire, and $\varphi_n(\xi)$ are
called shape functions. The classical linear interpolating shape function reads
\[
\varphi(\xi) = \begin{cases}
\varphi^-(\xi) = 1 + \xi, & -1 \le \xi \le 0 \\
\varphi^+(\xi) = 1 - \xi, & 0 \le \xi \le 1
\end{cases} \tag{11.34}
\]
and $\varphi_n(\xi) = \varphi(\xi - n)$. In (11.34), $\varphi(\xi) = 0$ if $|\xi| > 1$, which has been omitted,
and we will take this as a convention throughout this chapter. The linear
shape function fits the wire with a train of straight line segments. The segmented
wire is continuous, but its first-order derivative is discontinuous at the control
points, and higher-order derivatives are not available.
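As an illustration of the linear shape function, the sketch below interpolates a wire from a handful of control points, assuming the representation $\mathbf{r}(\xi) = \sum_n \mathbf{r}_n \varphi_n(\xi)$ with $0 \le \xi \le N$. The function names and the three sample nodes are hypothetical and chosen only for demonstration.

```python
import numpy as np

def phi(xi):
    """Linear ("hat") shape function of (11.34); zero for |xi| > 1."""
    return np.maximum(0.0, 1.0 - np.abs(xi))

def wire_point(xi, nodes):
    """Point on the wire, r(xi) = sum_n r_n * phi(xi - n), for 0 <= xi <= N."""
    nodes = np.asarray(nodes, dtype=float)
    n = np.arange(len(nodes))
    return phi(xi - n) @ nodes

# Three control points forming an L-shaped wire (N = 2 segments).
nodes = [[0, 0, 0], [1, 0, 0], [1, 1, 0]]
print(wire_point(0.5, nodes))    # midpoint of the first segment
print(wire_point(1.25, nodes))   # a quarter of the way along the second segment

# Segment lengths |r_n - r_{n-1}| and the total wire length.
seg = np.linalg.norm(np.diff(np.asarray(nodes, dtype=float), axis=0), axis=1)
print(seg, seg.sum())
```

Because only two hat functions are nonzero at any $\xi$, each point is a convex combination of the two neighboring control points, which is exactly the straight-segment fit described above.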
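\[
\mathbf{r}(\xi) = \sum_{n=0}^{N} \mathbf{r}_n \varphi_n(\xi) + \sum_{n=0}^{N} \mathbf{r}'_n \psi_n(\xi),
\quad 0 \le \xi \le N \tag{11.35}
\]
\[
\varphi(\xi) = \begin{cases}
\varphi^-(\xi) = (1 + \xi)^2 (1 - 2\xi), & -1 \le \xi \le 0 \\
\varphi^+(\xi) = (1 - \xi)^2 (1 + 2\xi), & 0 \le \xi \le 1
\end{cases} \tag{11.36}
\]
\[
\psi(\xi) = \begin{cases}
\psi^-(\xi) = \xi (1 + \xi)^2, & -1 \le \xi \le 0 \\
\psi^+(\xi) = \xi (1 - \xi)^2, & 0 \le \xi \le 1
\end{cases} \tag{11.37}
\]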
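The tangential vector of the wire and the unit directional tangential vector can
be defined as
\[
\mathbf{s}(\xi) = \frac{\partial \mathbf{r}(\xi)}{\partial \xi} = \mathbf{r}'(\xi), \quad
\hat{s}(\xi) = \frac{\mathbf{r}'(\xi)}{|\mathbf{r}'(\xi)|} \tag{11.40}
\]
\[ dl = s(\xi)\, d\xi \tag{11.41} \]
\[
\mathbf{d}_n(\xi) = \mathbf{r}_{n-1}\frac{\partial \varphi_{n-1}(\xi)}{\partial \xi}
+ \mathbf{r}_n \frac{\partial \varphi_n(\xi)}{\partial \xi}
= \mathbf{r}_n - \mathbf{r}_{n-1}, \quad n - 1 \le \xi \le n \tag{11.43}
\]
The lengths of a segment and the whole wire are given by
\[
l_n = \int_{n-1}^{n} |\mathbf{d}_n(\xi)|\, d\xi = |\mathbf{r}_n - \mathbf{r}_{n-1}|, \quad
L = \int_0^N s(\xi)\, d\xi = \sum_{n=1}^{N} l_n \tag{11.44}
\]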
If the higher-order shape functions are adopted, the directional vector and the
segmental length can be calculated in the same way.
After the geometry discretization, a spatial vector basis function associated with
the shape function may be defined as
1
g n ( ) s ( )
n ( ), n 1 n
n
g n (r ( )) l f n ( ) (11.46)
g ( ) 1
n ( ), n n 1
n s n ( )
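For the linear shape function, we have
\[
\mathbf{f}_n(\xi) = \begin{cases}
\mathbf{f}_n^-(\xi) = (1 + \xi - n)\,\hat{s}_n^-, & n - 1 \le \xi \le n \\
\mathbf{f}_n^+(\xi) = (1 - \xi + n)\,\hat{s}_n^+, & n \le \xi \le n + 1
\end{cases} \tag{11.47}
\]
and
\[
g_n(\xi) = \begin{cases}
g_n^-(\xi) = \dfrac{1}{s_n^-}, & n - 1 \le \xi \le n \\
g_n^+(\xi) = -\dfrac{1}{s_n^+}, & n \le \xi \le n + 1
\end{cases} \tag{11.48}
\]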
N 1
(r' , t' ) t I n ( j ) g n (r' ) S ( t' j ) (11.50)
j 1
n 1
where we have defined $\bar{t}' = t'/\Delta t$ with $\Delta t$ being the size of the time step or the
temporal resolution, and $I_n(j)$ is the current at the nth node and jth time step. We
need to solve for only $N - 1$ unknowns because $I_0 = I_N = 0$. If the wire is a closed
loop, the number of unknowns should be $N$. Because the current and charge must
satisfy the continuity equation, and we have already defined
$g_n(\mathbf{r}') = \nabla'_l \cdot \mathbf{f}_n(\mathbf{r}')$, we must impose
\[
T(\bar{t}') = \Delta t\, \frac{\partial S(\bar{t}')}{\partial t'}
= \frac{\partial S(\bar{t}')}{\partial \bar{t}'} = S'(\bar{t}') \tag{11.51}
\]
where $T(\bar{t}')$ and $S(\bar{t}')$ are the temporal basis functions for the currents and
charges, respectively. A simple choice of $T(\bar{t}')$ is the triangle function or linear
interpolating function:
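\[
T(\bar{t}') = \begin{cases}
1 + \bar{t}', & -1 \le \bar{t}' \le 0 \\
1 - \bar{t}', & 0 \le \bar{t}' \le 1
\end{cases} \tag{11.52}
\]
The temporal basis function for the charge is
\[
S(\bar{t}') = \int_{-\infty}^{\bar{t}'} T(u)\, du = \begin{cases}
\tfrac{1}{2}(1 + \bar{t}')^2, & -1 \le \bar{t}' \le 0 \\
1 - \tfrac{1}{2}(1 - \bar{t}')^2, & 0 \le \bar{t}' \le 1 \\
1, & \bar{t}' \ge 1
\end{cases} \tag{11.53}
\]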
Note that the function $S(\bar{t}')$ in (11.53) is not compact: it has an infinite support
width, which results in a full matrix system. To avoid a noncompact
temporal basis function, we may require $S(\bar{t}')$ to have a finite
support interval, so that $T(\bar{t}')$ is compact as well. In that case, the function
$S(\bar{t}')$ must be at least second order and must span at least three intervals.
A proper choice is the quadratic B-spline function [22], which is expressed as
\[
S(\bar{t}) = \begin{cases}
\tfrac{1}{2}(1 + \bar{t})^2, & 0 \le \bar{t} + 1 \le 1 \\
\tfrac{1}{2} + \bar{t} - \bar{t}^2, & 0 \le \bar{t} \le 1 \\
\tfrac{1}{2} - (\bar{t} - 1) + \tfrac{1}{2}(\bar{t} - 1)^2, & 0 \le \bar{t} - 1 \le 1
\end{cases} \tag{11.54}
\]
\[
T(\bar{t}) = S'(\bar{t}) = \begin{cases}
1 + \bar{t}, & 0 \le \bar{t} + 1 \le 1 \\
1 - 2\bar{t}, & 0 \le \bar{t} \le 1 \\
-1 + (\bar{t} - 1), & 0 \le \bar{t} - 1 \le 1
\end{cases} \tag{11.55}
\]
Now, both the temporal basis functions for the currents and charges are
compact. We will use this set of temporal basis functions in the computations later.
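The pair (11.54) and (11.55) is easy to check numerically. The sketch below evaluates both functions on their three-interval support, confirms by finite differences that $T = S'$, and confirms that the shifted copies of $S$ sum to one. The function names, the sample grid, and the tolerances are illustrative choices only.

```python
import numpy as np

def S(t):
    """Quadratic B-spline of (11.54); nonzero only for -1 <= t <= 2."""
    t = np.asarray(t, dtype=float)
    conds = [(t >= -1) & (t < 0), (t >= 0) & (t < 1), (t >= 1) & (t <= 2)]
    vals = [0.5 * (1 + t) ** 2,
            0.5 + t - t ** 2,
            0.5 - (t - 1) + 0.5 * (t - 1) ** 2]
    return np.select(conds, vals, default=0.0)

def T(t):
    """Derivative T = S' of (11.55); piecewise linear on the same support."""
    t = np.asarray(t, dtype=float)
    conds = [(t >= -1) & (t < 0), (t >= 0) & (t < 1), (t >= 1) & (t <= 2)]
    vals = [1 + t, 1 - 2 * t, -1 + (t - 1)]
    return np.select(conds, vals, default=0.0)

t = np.linspace(-0.95, 1.95, 59)
h = 1e-6
dS = (S(t + h) - S(t - h)) / (2 * h)          # central-difference derivative
print(np.max(np.abs(dS - T(t))))              # small: T matches S'

# Shifted copies S(t - j) overlap on three intervals and sum to one.
total = sum(S(t - j) for j in range(-3, 4))
print(np.max(np.abs(total - 1.0)))
```

The compact support is what limits the number of retarded interaction matrices in the time-marching scheme, in contrast to the step-like function of (11.53).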
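\[
\int_{\Gamma_m} \mathbf{f}_m(\mathbf{r}) \cdot \nabla_l \phi \, dl
= -\int_{\Gamma_m} \phi \, \nabla_l \cdot \mathbf{f}_m(\mathbf{r}) \, dl
= -\int_{\Gamma_m} g_m(\mathbf{r})\, \phi \, dl \tag{11.60}
\]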
in which the first equality holds because $\mathbf{f}_m(\mathbf{r})$ vanishes at the two end points. If
the temporal basis functions are compact, that is, $S(j - \bar{R}) = 0$ if $j - \bar{R} < -1$ or
$j - \bar{R} > p$, where $p = 1$ for the triangle function or $p = 2$ for the quadratic
B-spline function, we should have $Z(j) = 0$ if $j < \bar{R}_{\min} - 1$ or $j > \bar{R}_{\max} + p$, where
$\bar{R}_{\min} = 0$ is the minimum dimension of the wire and $\bar{R}_{\max}$ is the maximum
dimension of the wire. Therefore, we should have $0 \le j \le \mathrm{int}(\bar{R}_{\max}) + p$. As a
result, we can rewrite (11.57) as the marching-on-in-time (MOT) form:
\[
[Z(0)]\{I(i)\} = \{V(i)\} - \sum_{j=1}^{\min(i-1,\,L)} [Z(j)]\{I(i-j)\} \tag{11.61}
\]
where $L = \mathrm{int}(\bar{R}_{\max}) + p$ with $p \ge 1$ being the number of support intervals of the
temporal basis function. It is clear that $L \to \infty$ if the temporal basis function is
noncausal or has an infinite support width. It is seen that (11.61) takes a recursive
form, that is, the right side is already known when the coefficient $\{I(i)\}$ is solved for at
the ith time step.
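The recursion in (11.61) can be sketched as a short time-stepping loop. In the example below the array layout (`Z[j]` holding the matrix for delay `j`, `V[i-1]` holding the excitation at step `i`) and the function name are assumptions made for illustration; the matrices themselves would come from the integrals discussed in this chapter.

```python
import numpy as np

def march_on_in_time(Z, V):
    """Solve [Z(0)]{I(i)} = {V(i)} - sum_{j=1}^{min(i-1,L)} [Z(j)]{I(i-j)}.

    Z: array of shape (L+1, N, N); Z[j] is the interaction matrix for delay j.
    V: array of shape (Nt, N); V[i-1] is the excitation vector at time step i.
    Returns I of shape (Nt, N) with I[i-1] = {I(i)}.
    """
    L = Z.shape[0] - 1
    Nt, N = V.shape
    I = np.zeros((Nt, N))
    # Z(0) does not change with i, so it is inverted once here; for large N
    # a reusable LU factorization would normally be preferred.
    Z0_inv = np.linalg.inv(Z[0])
    for i in range(1, Nt + 1):
        rhs = V[i - 1].copy()
        # Subtract the contributions of the already-known past currents.
        for j in range(1, min(i - 1, L) + 1):
            rhs -= Z[j] @ I[i - j - 1]
        I[i - 1] = Z0_inv @ rhs
    return I

# One unknown, one retarded matrix (L = 1), impulse excitation at step 1.
Z_demo = np.array([[[2.0]], [[1.0]]])
V_demo = np.array([[2.0], [0.0], [0.0]])
print(march_on_in_time(Z_demo, V_demo).ravel())
```

Only the $L$ most recent current vectors are needed at each step, which is why a compact temporal basis function keeps both the storage and the work per step bounded.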
Geometric discretization for a 2-D geometry is the same as for the wire structure,
because the contour of the cross-section can be seen as a wire for an open strip or a
closed loop for a cylinder, as shown in Figure 11.6. A 2-D problem is usually
classified as either a TM case or a TE case. The TM case means that the electric
field of the incident wave has only a z-component. The TE case means that the magnetic
field of the incident wave has only a z-component.
For the TM case, $\mathbf{E}^i = \hat{z} E_z^i$ and $\mathbf{H}^i = (\hat{k}^i \times \hat{z}) E_z^i / \eta_0$, and the induced current
on the surface also has only a z-component, that is, $\mathbf{J}_s = J_z \hat{z}$. The charge on the
surface vanishes because $J_z$ is invariant with $z$ such that
$\nabla_s \cdot \mathbf{J}_s = \partial J_z / \partial z = 0 = -\partial \rho_s / \partial t$. As a result, the governing equation (11.17) is
reduced to
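\[
\frac{1}{2\pi} \int_C \int_R^{ct}
\frac{\partial J_z(\boldsymbol{\rho}', t - t')}{\partial (ct)}
\frac{d(ct')\, dC'}{\sqrt{(ct')^2 - R^2}}
= \frac{1}{\eta_0} E_z^i(\boldsymbol{\rho}, t) \tag{11.62}
\]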
The spatial basis function that is suitable for expanding the currents is the
same as (11.45) with $\hat{s}_n(\xi)$ replaced by $\hat{z}$. Thus, the current is expanded as
N
J z (ρ' , t' ) I n ( j )n (' )T ( t' j ) (11.63)
j 1 n 1
where T ( t' ) is taken to be (11.55). Substituting (11.63) into (11.62) and testing it
with m ( ) , and after some recasting, we obtain the discrete version of (11.62) in
MOT form as
i 1
[ Z E,TM (0)]{I (i)} {V E,TM (i)} [ Z E,TM ( j )]{I (i j )} (11.64)
j 1
with
1
VmE,TM (i ) m ( ) Ezi (ρ, it )dC , dC s( ) d (11.65)
0 m
1 1
2π ct m n
Z mE,TM
,n ( j) F ( j, R)m ( )n (' )dC' dC (11.66)
j 1 ct
d(ct )
F j, R T ( j t ) (11.67)
max R , j 2 ct (ct )2 R 2
If we use the MFIE of (11.18), its discrete version in MOT form is the same as
(11.64) just by replacing the elements with
1
(nˆ kˆ ) ( ) Ezi (ρ, it )dC (11.68)
i
VmH,TM (i) m
0 m
1
Z mH,TM
,n ( j) T j m ( )n ( )dC
2 m
1 1 ˆ ) ( ) (' )dC' dC
P.V. G( j, R)(nˆ R m n
(11.69)
2π ct m n
( j 1) ct
T ( j t ) T ( j t ) d(ct )
G( j, R) (ct )
max R ,( j 2) ct
ct' R
ct (ct )2 R 2
(11.70)
N
J s (ρ' , t' ) I n ( j )n (' )sˆ n (' )T ( t' j ) (11.71)
j 1 n 1
1 1
2π ct m n
Z mE,TE
,n ( j)
F ( j , R)fm (ρ) fn (ρ) (ct ) 2 W ( j, R) g m (ρ) g n (ρ) dC'dC
(11.73)
j 1 ct
d(ct )
W j, R S ( j t ) (11.74)
max R , j 2 ct (ct )2 R 2
1 1 1
Z mH,TE
,n ( j) T ( j ) fm (ρ) fn (ρ)dC P.V.
2 m
2π ct m n
For meshing of a 3-D structure, the same procedure as in the wire case applies.
Given a set of control points and some shape functions, a general 3-D surface can
be modeled closely. Commonly used shape functions include planar triangles,
curved triangles, planar quadrangles, and curved quadrangles. Triangular
discretization is the most popular and is adopted here.
We first use the planar triangular meshing. Three control points on the surface
are taken to be the three vertex points $\mathbf{r}_i$ $(i = 1, 2, 3)$ of a planar triangle, as shown
in Figure 11.7. A point inside the triangle is expressed by
\[
\mathbf{r}(\xi_1, \xi_2) = \sum_{i=1}^{3} \varphi_i(\xi_1, \xi_2)\, \mathbf{r}_i
= \xi_1 \mathbf{r}_1 + \xi_2 \mathbf{r}_2 + (1 - \xi_1 - \xi_2)\, \mathbf{r}_3 \tag{11.77}
\]
\[
\begin{cases}
\varphi_1(\xi_1, \xi_2) = \xi_1 \\
\varphi_2(\xi_1, \xi_2) = \xi_2 \\
\varphi_3(\xi_1, \xi_2) = 1 - \xi_1 - \xi_2
\end{cases} \tag{11.78}
\]
\[ dS = J\, d\xi_1\, d\xi_2 \tag{11.79} \]
\[
J = \left| \frac{\partial \mathbf{r}}{\partial \xi_1} \times \frac{\partial \mathbf{r}}{\partial \xi_2} \right|
= \left| (\mathbf{r}_1 - \mathbf{r}_3) \times (\mathbf{r}_2 - \mathbf{r}_3) \right| = 2A \tag{11.80}
\]
Figure 11.7 An arbitrary planar triangle is mapped into a right triangle: (a) an arbitrary planar
triangle and (b) mapped as a right triangle.
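The mapping (11.77) and the Jacobian (11.80) can be illustrated with a few lines of code. The sketch below maps parametric coordinates to a point on a planar triangle and confirms that the Jacobian equals twice the triangle area; the function names and the sample vertices are hypothetical.

```python
import numpy as np

def triangle_point(xi1, xi2, r1, r2, r3):
    """r(xi1, xi2) = xi1*r1 + xi2*r2 + (1 - xi1 - xi2)*r3, as in (11.77)."""
    return xi1 * r1 + xi2 * r2 + (1.0 - xi1 - xi2) * r3

def jacobian(r1, r2, r3):
    """J = |(r1 - r3) x (r2 - r3)| = 2A, as in (11.80)."""
    return np.linalg.norm(np.cross(r1 - r3, r2 - r3))

# A right triangle in the xy-plane with legs 1 and 2, so its area is A = 1.
r1 = np.array([1.0, 0.0, 0.0])
r2 = np.array([0.0, 2.0, 0.0])
r3 = np.array([0.0, 0.0, 0.0])

print(triangle_point(1 / 3, 1 / 3, r1, r2, r3))  # centroid of the triangle
print(jacobian(r1, r2, r3))                      # twice the area
# The reference right triangle has area 1/2, so J * 1/2 recovers A.
print(0.5 * jacobian(r1, r2, r3))
```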
r r 6 6
J
1 2
i 1 j 1
ij (1 , 2 )(ri r j ) (11.83)
i j
with ij (1 , 2 ) .
1 2
Figure 11.8 An arbitrary curved triangle is mapped into a right triangle: (a) an arbitrary curved
triangle and (b) mapped as a right triangle.
The spatial basis function corresponding to the planar triangular discretization
is the Rao-Wilton-Glisson (RWG) vector triangle function, which is
defined over a pair of triangles [32], as shown in Figure 11.9. A point inside the
two triangles is expressed as
1r1 2 r2 (1 1 2 )r3 , r n
r (1 , 2 ) (11.84)
1r1 2 r2 (1 1 2 )r4 , r n
Define
r
s1 r1 r3 , r n
1
s1 (11.85)
s r r r , r
1 1
4 1 n
r
s 2 r2 r3 , r n
2
s2 (11.86)
s r r r , r
2
2
4 2 n
where ln is the length of the nth edge, and An is the area of n , and
ρn 1s1 2 s 2 r r3 , r n
(11.88)
ρn 1s1 2 s 2 r4 r, r n
6
i (1 , 2 )ri , r n
i 1
r (1 , 2 ) 6 (11.90)
( , )r , r
i 1 2 i
i 1
n
r 6
s1 i (1 , 2 )ri
1 1 i 1
s1 (11.91)
s r
6
1
i (1 , 2 )ri
1 1 i 1
r 6
s 2 i (1 , 2 )ri
2 2 i 1
s2 (11.92)
s r 6
2
2
i (1 , 2 )ri
2 i 1
ln
f n (r ) (1s1 2s 2 ), r n
J n
f n (r ) (11.93)
f (r ) ln
( s s ), r
n J n
1 1 2 2 n
where the Jacobi is
r r 6 6
J n
1 2
i 1 j 1
ij (1 , 2 )(ri rj ) (11.94)
2ln
g n (r ) , r n
J n
g n (r ) s f n (r ) (11.95)
2 l
g (r ) n , r
n J n
n
Using the spatial basis functions defined above, the currents and charges on
the body surface are expanded by
N
J s (r, t ) I n ( j )fn (r)T ( t j ) (11.96)
j 1 n 1
N
s (r, t ) t I n ( j ) g n (r) S ( t j ) (11.97)
j 1 n 1
1
VmE (i )
0 f
m
m (r) Ei (r, it )dS (11.99)
f
m
m (r) s dS s fm (r) dS gm (r) dS
m m
(11.101)
in which the first equality is achieved by using the divergence theorem and
noticing that the normal component of fm (r) is continuous across the common
edge and vanishes on the other four edges.
Similarly, substituting (11.96) into the MFIE of (11.12) and testing it by
$\mathbf{f}_m(\mathbf{r})\,\delta(t - i\Delta t)$, after some recasting, we obtain its discretized version in MOT
form:
\[
[Z^H(0)]\{I(i)\} = \{V^H(i)\} - \sum_{j=1}^{\min(i-1,\,L)} [Z^H(j)]\{I(i-j)\} \tag{11.102}
\]
with
VmH (i) f
m
m (r) nˆ Hi (r, it ) dS (11.103)
1 1 T ( j R ) T' ( j R )
Z mH, n ( j ) T ( j ) fm (r ) fn (r )dS
2 m
4π m n R
ct
with
\[
\begin{aligned}
Z^C_{m,n}(j) &= \alpha Z^E_{m,n}(j) + (1 - \alpha) Z^H_{m,n}(j) \\
V^C_m(i) &= \alpha V^E_m(i) + (1 - \alpha) V^H_m(i)
\end{aligned} \tag{11.106}
\]
and $0 \le \alpha \le 1$ is a combination parameter.
N
J s (r, t ) I n(e) ( j )fn (r)T ( t j ) (11.107)
j 1 n 1
N
s (r, t ) t I n(e) ( j ) g n (r) S ( t j ) (11.108)
j 1 n 1
N
J ms (r, t ) 1 I n(m) ( j )f n (r)T ( t j ) (11.109)
j 1 n 1
N
ms (r, t ) 1t I n(m) ( j ) g n (r)S ( t j ) (11.110)
j 1 n 1
The scaling factor in (11.109) keeps the coefficients $I_n^{(e)}$ and $I_n^{(m)}$ on the
same order of magnitude. Substituting these expansions into (11.31) and (11.64)
and testing them by $[\hat{n} \times \mathbf{f}_m(\mathbf{r})]\,\delta(\bar{t} - i)$, we will obtain
i 1 i 1
[Z
j 0
EE
( j )]{I (e) (i j )} [ Z EH ( j )]{I (m) (i j )} {V E (i)}
j 0
(11.111)
i 1 i 1
[Z
j 0
HE
( j )]{I (e) (i j )} [ Z HH ( j )]{I (m) (i j )} {V H (i)}
j 0
(11.112)
m, n ( j ) pr Lm, n ( j )
ZmEE, n ( j ) L(0) (1)
(11.113)
1
VmE (i )
0 f
m
m (r ) Ei (r, it )dS (11.117)
VmH (i) f
m
m (r) Hi (r, it )dS (11.118)
U m, n ( j ) [nˆ f
m n
m (r)] fn (r) dS (11.119)
In the above, $c_0$ and $c_1$ are the speeds of light in the vacuum and the dielectric,
respectively. The set of equations (11.111) and (11.112) can be written in the same
MOT form as (11.105) by replacing $Z^C(j)$, $I(i)$, and $V^C(i)$ with the following,
respectively.
\[
[Z(j)] = \begin{bmatrix}
[Z^{EE}(j)] & [Z^{EH}(j)] \\
[Z^{HE}(j)] & [Z^{HH}(j)]
\end{bmatrix} \tag{11.122}
\]
Stability and accuracy of TDIE methods largely rely on the precise evaluation of
matrix elements. Key integral techniques for 1-D, 2-D, and 3-D geometries are
addressed in this section, for both singular and nonsingular terms.
1, 0 x 1
( x) (11.124)
0, otherwise
By using this function, the quadratic B-spline temporal basis functions given
in (11.54) can be written as
Z m, n ( j ) X m , n ( j 1) 2 X m , n ( j ) X m , n ( j 1)
12 Ym(2) 1 (0) (1) (2)
, n ( j 1) 2 Ym , n ( j ) Ym , n ( j ) Ym , n ( j ) (11.128)
12 Y(0)
m, n ( j 1) Y (1)
m, n
(2)
( j 1) 12 Y
m,n ( j 1)
1 1 2,2
fmp (r) fnq (r' )
4π ct
p , q 1
R
( j R )dl' dl (11.129)
p q
m n
Here the superscripts 1 and 2 denote the $-$ and $+$ segments of a basis function,
respectively, and so forth. It is seen from (11.128) that the
matrix elements can be calculated recursively, that is, the quantities
calculated at previous time steps can be reused to calculate the elements at a few
later time steps. In this manner, at least half the CPU time can be saved in the
matrix setup stage.
If not for the causality imposed by the factor ( j R) , the integrations of
(11.129) and (11.130) may be carried out analytically without difficulty. This
factor complicates the integrals to a great extent, making analytical treatments [26]
too tedious to be practical. As a result, we will give closed-form expressions only
for the self-action term, while for interaction terms Gaussian numerical quadrature
is employed.
Let
g m (r) g n (r' )
Ym(, n); p , q ( j ) ( j R)
( j R )dl' dl (11.132)
mp qn
R
NG NG
Ym(, n); p ,q ( j ) (lmp lnq ) wi wk [ gmp (ri ) gnq (rk )]Fm(,n); p ,q ( j, Rik ) (11.134)
i 1 k 1
where the $w_i$ are the weights, which are given in Table 11.1, and
( ) ( j Ri , k )
F ( j , Ri , k ) ( j Ri , k )
m , n; p , q Ri , k
(11.135)
2
R rmp (i ) rnq ( k ) a 2
i ,k
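with $\mathbf{r}_m^1(\xi_i) = (1 - \xi_i)\mathbf{r}_{m-1} + \xi_i \mathbf{r}_m$ and $\mathbf{r}_m^2(\xi_i) = (1 - \xi_i)\mathbf{r}_m + \xi_i \mathbf{r}_{m+1}$, where $\xi_i$ takes the
values of $x_i$ in Table 11.1.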
Table 11.1
Gauss-Legendre Quadrature Evaluation Points and Weights

NG = 1   x1 = 0.5         w1 = 1
NG = 2   x1 = 0.211325    w1 = 0.5
         x2 = 1 - x1      w2 = 0.5
NG = 3   x1 = 0.112702    w1 = 0.277778
         x2 = 0.5         w2 = 0.444444
         x3 = 1 - x1      w3 = w1
NG = 4   x1 = 0.069432    w1 = 0.173927
         x2 = 0.330009    w2 = 0.326073
         x3 = 1 - x2      w3 = w2
         x4 = 1 - x1      w4 = w1
NG = 5   x1 = 0.046910    w1 = 0.118463
         x2 = 0.230765    w2 = 0.239314
         x3 = 0.5         w3 = 0.284444
         x4 = 1 - x2      w4 = w2
         x5 = 1 - x1      w5 = w1
NG = 6   x1 = 0.033765    w1 = 0.085662
         x2 = 0.169395    w2 = 0.180381
         x3 = 0.380690    w3 = 0.233957
         x4 = 1 - x3      w4 = w3
         x5 = 1 - x2      w5 = w2
         x6 = 1 - x1      w6 = w1
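The entries of Table 11.1 are the standard Gauss-Legendre nodes and weights rescaled from $[-1, 1]$ to the unit interval. As a cross-check, the sketch below regenerates them with NumPy's `numpy.polynomial.legendre.leggauss`; the helper name `gauss_legendre_01` is a hypothetical convenience wrapper, not something defined in the chapter.

```python
import numpy as np

def gauss_legendre_01(ng):
    """Gauss-Legendre points and weights on [0, 1] (weights sum to 1)."""
    x, w = np.polynomial.legendre.leggauss(ng)  # nodes and weights on [-1, 1]
    return 0.5 * (x + 1.0), 0.5 * w

for ng in range(1, 7):
    x, w = gauss_legendre_01(ng)
    print(ng, np.round(x, 6), np.round(w, 6))

# An NG-point rule integrates polynomials up to degree 2*NG - 1 exactly,
# e.g. the integral of x^5 over [0, 1] equals 1/6 with only three points.
x3, w3 = gauss_legendre_01(3)
print(np.dot(w3, x3 ** 5))
```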
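The integrals (11.131) and (11.132) are singular only if $j = 1$ and, at the same time,
the field and source segments overlap. This happens in three cases: (1) if $m = n - 1$,
the $+$ segment of the mth basis function overlaps with the $-$ segment of the nth basis
function; (2) if $m = n$, the two $-$ segments overlap and the two $+$ segments overlap; and (3)
if $m = n + 1$, the $-$ segment of the mth basis function overlaps with the $+$ segment of the
nth basis function.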
(ct )2 a 2 ; thus, the integral domain is the shadowing part in Figure 11.10(b).
1 1
I p(, q) (b) R p' q (1 R )d' d
0 0
1 1 1 1 '
d d' d d' d' d R p' q (11.138)
0 0 min( ,1) 0 min( ,1) 0
(a) (b)
Figure 11.10 (a) Illustration of singularity treatment and (b) the integral domain of (11.138).
Specifically,
1 1 1 1 '
d' d
( 1)
I 0,0 (b)
b2 ( ' )2 a 2
0 0 min( ,1) 0 min( ,1) 0
2 a 2 a2
ln (11.139)
b b a
where min ct , a 2 b 2 . More formulae that are needed include
1 a 2 a2
( 1)
I1,0 ( 1)
(b) I 0,1 (b) ln (11.140)
b b a
2 2a 2 9b 2 (a ) a 2 a2
( 1)
I1,1 (b) ln (11.141)
3b 6b 2 b a
1 a2 2
(0)
I 0,0 (b) 2 2 a2 (11.142)
2b b
2a3 2 3 3b 2 a 2 a 2 2 a 2
(1)
I 0,0 (b) ln (11.143)
3b2 b a
Making use of the above formulae, we obtain
Yn(0) ( 1)
1, n;2,1 (1) I 0,0 (ln ) (11.145)
1 (0)
Yn(1)1, n;2,1 (1) I 0,0
( 1)
(ln ) I 0,0 (ln ) (11.146)
ct
2 (0) 1
Yn(2) ( 1)
1, n;2,1 (1) I 0,0 (ln ) I 0,0 (ln ) (1)
I 0,0 (ln ) (11.147)
ct (ct )2
(ln )2 I1,1
( 1)
(ln ) (11.148)
1 1
(1 R )
X n, n;2,2 (1) (ln )2 (1 )(1 ' ) d' d
0 0
R
(ln )2 I 0,0
( 1) ( 1)
(ln ) 2I1,0 (ln )
( 1)
(ln ) I1,1 (11.149)
1 1
(1 R )
, n; p , p (1)
Yn(0) d' d
0 0
R
( 1) p
I 0,0 (ln ) , ln1 ln , ln2 ln (11.150)
1 1
(1 R )
, n; p , p (1) (1 R )
Yn(1) d' d
0 0
R
( 1) p 1 (0) p
I 0,0 (ln ) I 0,0 (ln ) (11.151)
ct
1 1
(1 R )
, n; p , p (1) (1 R )
Yn(2) d' d
2
0 0
R
2 (0) p 1
( 1)
I 0,0 (lnp ) I 0,0 (ln ) (1)
I 0,0 (lnp ) (11.152)
ct (ct )2
Yn(0) ( 1)
1, n;1,2 (1) I 0,0 (ln ) (11.154)
1 (0)
Yn(1)1, n;1,2 (1) I 0,0
( 1)
(ln ) I 0,0 (ln ) (11.155)
ct
2 (0) 1
Yn(2) ( 1)
1, n;1,2 (1) I 0,0 (ln ) I 0,0 (ln ) (1)
I 0,0 (ln ) (11.156)
ct (ct )2
For a 3-D PEC body, evaluations of the matrix elements of (11.100) are the same
as (11.128) through (11.132), with the spatial basis functions for wire
segmenting replaced by the RWG basis functions for surface meshing. That is, the matrix
element $Z^E_{m,n}(j)$ is still written in the same form as (11.128), with (11.131) and
(11.132) being rewritten as
( j R)
X m , n; p , q ( j )
R
fmp (r) fnq (r' )dS'dS (11.157)
mp nq
( j R)
Ym(, n); ( j R)
p,q ( j ) g mp (r) g nq (r' )dS'dS (11.158)
mp nq
R
NG , NG
( j Ri , k )
Ym(, n); p , q ( j ) ( Amp Anq )
i , k 1
wi wk ( j Ri ,k )
Ri , k
g mp (ri ) g nq (rk ) (11.160)
where the $w_i$ are the weights, $\mathbf{r}_i = \xi_{1i}\mathbf{r}_1 + \xi_{2i}\mathbf{r}_2 + (1 - \xi_{1i} - \xi_{2i})\mathbf{r}_3$ are the
evaluation points, and $A_m^p$ and $A_n^q$ are the areas of the field and source triangles. The evaluation
points and weights are given in Table 11.2, with $x_i$ standing for $\xi_{1i}$, $y_i$
standing for $\xi_{2i}$, and $z_i$ standing for $1 - \xi_{1i} - \xi_{2i}$.
If j 1 and mp nq (the two triangles overlap), the integrals of (11.157)
and (11.158) will be singular. Let ( j R)
1 ( j R) , and then
( j R)
X m, n
; p,q ( j) K m, n; p , q R
fmp (r) fnq (r' )dS'dS (11.161)
mp qn
( j R)
Ym(, n); p , q ( j
) Pm(, n); p , q ( j ) ( j R)
g mp (r) g nq (r' )dS'dS (11.162)
mp qn
R
where
1
Rf
p
K m , n; p , q
m (r) fnq (r' )dS'dS (11.163)
mp nq
1 p
Pm(, n); ( j R)
p,q ( j) g m (r) g nq (r' )dS'dS (11.164)
mp nq
R
Table 11.2
Gauss-Legendre Quadrature Evaluation Points and Weights for a Triangle Domain

NG = 1   x1 = 1/3   y1 = 1/3   z1 = 1/3   w1 = 1
NG = 3   x1 = 1/2   y1 = 1/2   z1 = 0     w1 = 1/3
         x2 = 0     y2 = 1/2   z2 = 1/2   w2 = w1
         x3 = 1/2   y3 = 0     z3 = 1/2   w3 = w1
NG = 4   x1 = 1/3   y1 = 1/3   z1 = 1/3   w1 = -27/48
         x2 = 1/5   y2 = 1/5   z2 = 3/5   w2 = 25/48
         x3 = 3/5   y3 = 1/5   z3 = 1/5   w3 = w2
         x4 = 1/5   y4 = 3/5   z4 = 1/5   w4 = w2
NG = 7   x1 = 1/3   y1 = 1/3   z1 = 1/3   w1 = 9/40
         x2 = a     y2 = b     z2 = b     w2 = (155 + sqrt(15))/1200
         x3 = b     y3 = a     z3 = b     w3 = w2
         x4 = b     y4 = b     z4 = a     w4 = w2
         x5 = c     y5 = d     z5 = d     w5 = (155 - sqrt(15))/1200
         x6 = d     y6 = c     z6 = d     w6 = w5
         x7 = d     y7 = d     z7 = c     w7 = w5
a = 0.05971587, b = 0.47014206
c = 0.79742699, d = 0.10128651
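Two properties make a triangle rule of this kind easy to validate: the weights must sum to one (they are normalized by the triangle area, consistent with the area prefactors in (11.160)), and the rule must reproduce the mean value of low-order monomials over the reference triangle, $2\,p!\,q!/(p+q+2)!$ for $x^p y^q$. The sketch below applies both checks, assuming the four-point rule has a negative centroid weight and the seven-point rule pairs the larger weight with the points $(a, b, b)$; the dictionary layout and function names are illustrative.

```python
import math

SQRT15 = math.sqrt(15.0)
a, b = 0.05971587, 0.47014206
c, d = 0.79742699, 0.10128651

# (x, y) evaluation points and area-normalized weights; z = 1 - x - y.
RULES = {
    1: ([(1 / 3, 1 / 3)], [1.0]),
    3: ([(0.5, 0.5), (0.0, 0.5), (0.5, 0.0)], [1 / 3] * 3),
    4: ([(1 / 3, 1 / 3), (0.2, 0.2), (0.6, 0.2), (0.2, 0.6)],
        [-27 / 48] + [25 / 48] * 3),
    7: ([(1 / 3, 1 / 3), (a, b), (b, a), (b, b), (c, d), (d, c), (d, d)],
        [9 / 40] + [(155 + SQRT15) / 1200] * 3 + [(155 - SQRT15) / 1200] * 3),
}

def triangle_mean(f, ng):
    """Quadrature estimate of the mean of f(x, y) over the reference triangle."""
    pts, wts = RULES[ng]
    return sum(w * f(x, y) for (x, y), w in zip(pts, wts))

def exact_mean(p, q):
    """Exact mean of x^p * y^q over the reference triangle."""
    return 2.0 * math.factorial(p) * math.factorial(q) / math.factorial(p + q + 2)

for ng in RULES:
    print(ng, sum(RULES[ng][1]))  # each set of weights sums to one
print(triangle_mean(lambda x, y: x**2 * y, 4), exact_mean(2, 1))
print(triangle_mean(lambda x, y: x**3 * y**2, 7), exact_mean(3, 2))
```

To integrate over a physical triangle of area $A$, the quadrature sum is multiplied by $A$.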
The second terms in (11.161) and (11.162) are nonsingular and are again
evaluated by Gaussian quadrature. The integrals (11.163) and (11.164)
are given below [33].
1. If $m = n$, then $p = q$ and we have
lm ln 1
Pm(0),n; p , q (1) p q
Amp Anq R dS'dS
mp nq
lm ln 4 2 1 a 1 b 1 c
(1) p q A ln 1 ln 1 ln 1 (11.165)
Amp Anq 3 a s b s c s
lm ln ( j R) p q lm ln
Pm(1), n; p , q (1) p q
Amp Anq
mp nq R
dS'dS jPm(0)
, n ( 1)
ct
(11.166)
lm ln ( j R )2
mp qn R dS'dS
pq
Pm(2)
, n; p , q ( 1)
Amp Anq
lm ln l l 1
j 2 Pm(0),n 2 j (1) p q
ct
(1) p q mp n q
Am An (ct )2 R dS'dS (11.167)
mp nq
lm ln ρmp ρqn lm ln
K m, n; p , q
2 Amp 2 Anq p q R dS'dS 2 Amp 2 Anq I S (11.168)
m n
A2 a 2 b2 a2 c2 a 2 b2 b2 c2
IS 10 3 2
3 a 5 3 2 b
30 c b2 c 2
a2
a2 c2 c2 b2 2 2 2 A2 2 a
5 3 2
2 2 c a 3b 3c 8 2
ln(1 )
b a a a s
A2 4 b 2 A2 4 c
a 2 2b 2 4c 2 6 2 ln(1 ) a 4b 2
2 c 2
6 ln(1 )
b b s c2 c s
(11.169)
In the above, $A = \sqrt{s(s - a)(s - b)(s - c)}$ is the area of the triangle, $s = \frac{1}{2}(a + b + c)$, and
$a$, $b$, and $c$ are the lengths of the three edges; $a = l_m$ is the length of the mth edge, and $b$ and
$c$ are the lengths of the other two edges. The last term in (11.167) can be calculated by Gaussian
quadrature.
2. If $m \ne n$, $I_S$ is written as
A2 a 2 b2 a 2 c2 a 2 b2 b2 c2
I S (1) p q 10 a 5 6 b
60 c2 b2 c 2
a2
a2 c2 c2 b2 2 2 2 A2 12 a
5 2
6 2 c 2a b c 4 2 ln(1 )
b a a a s
A2 2 b 2 A2 2 c
9a 2 3b 2 c 2 4 2 2 2
ln(1 ) 9a b 3c 4 2 ln(1 )
b b s c c s
(11.170)
where $b$ and $c$ are the lengths of the mth and nth edges, and $a$ is the length of
the third edge.
j 1, 0 j 1 R 1
S' ( j R ) S" ( j R ) 1
1 2 j , 0 j R 1 (11.171)
R ct R
j 2, 0 j 1 R 1
Define
1 1, j 1
Dm, n ( j ) ( j ) fm (r) fn (r) dS , ( j ) (11.172)
2 m 0, otherwise
1 ˆ )[f (r ) f (r' )]
( j R ) (nˆ R
P.V.
m n
Vm, n ( j ) dS'dS (11.173)
4π R 2
ˆ
n n (nˆ R) [f m (r ) f n (r' )]
There is no singularity with the last integral (the integrand is taken to be zero
as R 0 ). Thus, the matrix element (11.104) can be cast in a recursive way:
The evaluations of matrix elements for a dielectric object are essentially the
same as the PEC body. Specifically, (11.120) is the same as (11.100), while
(11.121) is only a little different from (11.104), which is easy to handle.
11.4.3.1 TM Case
Refer to Figure 11.6. For the TM case, if we use the EFIE, the matrix element is
(11.66). If we use the MFIE, the matrix element is (11.69). Substituting (11.126) into
(11.67) and (11.70), and making use of
\[
\int \frac{dx}{\sqrt{x^2 - R^2}} = \ln\left( x + \sqrt{x^2 - R^2} \right) \tag{11.175}
\]
\[
\int \frac{1}{x + R}\, \frac{dx}{\sqrt{x^2 - R^2}} = \frac{1}{R} \sqrt{\frac{x - R}{x + R}} \tag{11.176}
\]
we find that
F j, R L j 1, R u ( j 1 R ) 3L j , R u ( j R )
3L j 1, R u ( j 1 R ) L j 2, R u ( j 2 R ) (11.177)
ln( R) ( j 1 R ) 2 ( j R ) ( j 1 R )
G j , R K j 1, R u ( j 1 R ) 3K j , R u ( j R )
(11.178)
3K j 1, R u ( j 1 R ) K j 2, R u ( j 2 R )
(11.179)
1, x 0
u x
0, x 0
smp snq
smp
smp snq
where s1m sm and sm2 sm are the lengths of the portions m and m ,
respectively, and so forth. Using these definitions, (11.66) and (11.69) are
calculated as
1 1 2,2
,n j
Z mE,TM
2π ct
j
p , q 1
p,q
m, n (11.186)
1 2,2
1 1 2,2
,n j
Z mH,TM T j mp ,,qn j p,q
m, n (11.187)
2 p , q 1 2π ct p , q 1
with
mp ,,qn ( j ) X mp,,nq j 1 3 X mp,,nq j 3 X mp,,nq j 1 X mp,,nq j 2
(11.188)
+Ymp,,nq j 1 2Ymp,,nq j Ymp,,nq j 1
The integral (11.184) can be carried out analytically and the results are:
1 1,1 1 1 2,1 1
1,2
m , m 1 sm , m, m sm , 2,2
m, m sm , m, m 1 sm (11.190)
6 3 3 6
p,q
The other m , n ’s are zero. The integrals (11.182) and (11.185) are nonsingular and
can be evaluated by Gaussian quadrature, say, using the seven-point rule. The
integral (11.183) is also nonsingular if j 2 or if mp and qn do not overlap. The
singularity arises only if mp and qn overlap along with j 1 .
Define
x y m n
Pm , n ln x y 1 x y dydx
0 0 ct
min( ct , ) x x
x m dx ln x y y n dy x m dx ln x y y n dy (11.191)
0 0 min( ct , ) x c t
min( ct , ) y y
y n dy ln y x x m dx y n dy ln y x x m dx
0 0 min( ct , ) y ct
The integration domain is shown in Figure 11.11. The integrals for m, n 0,1
can be carried out analytically without difficulty and the results are
2
P0,0 (2ln 3) 2( )(ln 1) (11.192)
2
3
P1,0 P0,1 (2 ln 3) ( 2 2 )(ct ) ln(ct ) 1
4
(11.193)
(ct ) 2
( ) 2 ln(ct ) 1
4
4 2
P1,1 (4 ln 7) ( 3 3 )(ct ) ln(ct ) 1
16 3 (11.194)
1 2
( )(ct ) 2 2 ln(ct ) 1
2
4
in which min(ct , ) , and only the first terms remain if ct or .
By using these results, the singular integrals that will be used are
sm sm
s s' s sm s'
Ym1,2,m 1 (1) ln( s s' ) 1
ct sm sm
ds ds
0 0
(11.195)
1 1
P1,0 ( sm ) 2 P1,1 ( sm )
sm ( sm )
sm sm
s s' s s'
ln( s s' ) 1 ds ds
1,1
Y m, m (1)
ct sm sm
0 0
(11.196)
1
P1,1 ( s )
m
( sm ) 2
sm sm
s s' sm s sm s'
0 0 ds ds
2,2
Y m,m (1) ln( s s' ) 1
ct sm sm
(11.197)
2 1
P0,0 ( sm ) P1,0 ( sm ) 2 P1,1 ( sm )
sm ( sm )
sm sm
s s' sm s s'
0 0 ds ds
2,1
Y m , m 1 (1) ln( s s' ) 1
ct sm sm
(11.198)
1 1
P0,1 ( sm ) 2 P1,1 ( sm )
sm ( sm )
11.4.3.2 TE Case
For TE wave incidence, if we use EFIE, the matrix elements are (11.73), and
(11.74) becomes
j 1 ct
d(ct )
W j, R S ( j t )
max R , j 2 ct (ct )2 R 2
Q( j 1, R)u ( j 1 R ) 3Q( j, R)u ( j R )
3Q( j 1, R)u ( j 1 R ) Q( j 2, R)u ( j 2 R )
1 (11.199)
ln( R) ( j 1) 2 ( j 1 R ) (2 j 2 2 j 1) ( j R )
2
( j 2) 2 ( j 1 R )
with
1 R2 3j R2
Q( j, R) 2 j 2 L ( j , R ) ( jc t ) 2
R 2
(11.200)
4 (ct )2 ct (ct )2
If we use MFIE, the elements are (11.69). Similar to (11.186) and (11.187),
(11.73) and (11.76) are calculated by
1 1 2,2
Z mE,TE
,n ( j)
2π ct
(sˆ
p , q 1
p
m sˆ qn ) mp ,,qn ( j ) (ct )2 mp ,,qn ( j ) (11.201)
1 2,2
,n j
Z mH,TE T j (sˆ mp sˆ nq ) mp ,,qn
2 p , q 1
1 1 2,2
2π ct
(sˆ
p , q 1
p
m sˆ nq ) mp ,,qn ( j ) (sˆ mp sˆ nq ) Λ mp ,,qn ( j ) (11.202)
with
smp snq
W ( j , R) g (ρ) g nq (ρ)dsds
p,q p
m, n ( j) m (11.203)
0 0
smp snq
The integral (11.203) has a logarithmic singularity only if the integrations are
performed over the same segments along with j 1 . Treatment of the singularity is
the same as (11.191) but involves only P0,0 ( ) . The integral (11.204) is
nonsingular and evaluated by Gaussian quadrature. A recursive way to calculate
mp ,,qn ( j ) and Λ mp ,,qn ( j ) , like mp ,,qn ( j ) and mp ,,qn ( j ) may be adopted.
So far, we have provided all the formulae for evaluating the matrix elements
for 1-D, 2-D, and 3-D geometries. Closed-form expressions must be used for the
singular integrals, which occur at the $j = 1$ time step when the source segment/triangle
overlaps with the field segment/triangle. Higher-order Gaussian
quadrature is suggested for the nonsingular integrals.
Refer to Figure 11.12. An object is moving at high speed and acceleration, and in
the meantime, it may rotate about a center. To characterize the motions, we
introduce four reference systems, or frames, as illustrated in Figure 11.13. We
assume that the ground frame, or G-frame, is on the Earth with the x-axis
southward, y-axis eastward, and z-axis upward. The target frame, or T-frame, is
fixed on the target with its z-axis in heading direction, x-axis toward the left wing,
and y-axis upward from the back of the target. Between the G-frame and T-frame,
two intermediate frames, called the S-frame and C-frame, are introduced. The S-
frame characterizes the initial position and orientations of the target, while the C-
frame characterizes the motion of an apparent barycenter that can be superfast
and/or in acceleration.
where r0 is the initial position, t0 is the time that the wave travels from the origin
of the G-frame to the origin of the S-frame, and R ini reflects the orientation of the
object,
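\[
R_{\mathrm{ini}} =
\begin{bmatrix}
1 & 0 & 0 \\
0 & \cos(\tfrac{1}{2}\pi - \theta_v) & \sin(\tfrac{1}{2}\pi - \theta_v) \\
0 & -\sin(\tfrac{1}{2}\pi - \theta_v) & \cos(\tfrac{1}{2}\pi - \theta_v)
\end{bmatrix}
\begin{bmatrix}
\cos(\varphi_v + \tfrac{1}{2}\pi) & \sin(\varphi_v + \tfrac{1}{2}\pi) & 0 \\
-\sin(\varphi_v + \tfrac{1}{2}\pi) & \cos(\varphi_v + \tfrac{1}{2}\pi) & 0 \\
0 & 0 & 1
\end{bmatrix} \tag{11.206}
\]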
where $\varphi_v$ is the azimuthal angle measured from the $x_G$-axis to the projection of the
velocity vector onto the $x_G$-$y_G$ plane, and $\theta_v$ is the elevation angle measured from
the projection to the velocity vector, as shown in Figure 11.14.
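A quick numerical check helps fix the sign conventions of the initial orientation matrix. The sketch below builds $R_{\mathrm{ini}}$ as a rotation about the z-axis by $\varphi_v + \frac{1}{2}\pi$ followed by a rotation about the x-axis by $\frac{1}{2}\pi - \theta_v$. The signs of the sine entries are an assumption, chosen so that the velocity direction maps onto the z-axis (the heading direction) and the new x-axis points to the left of the heading; the function name and the test angles are illustrative.

```python
import numpy as np

def r_ini(phi_v, theta_v):
    """Initial orientation matrix; sine signs are an assumed convention."""
    g = 0.5 * np.pi - theta_v   # rotation angle about the x-axis
    b = phi_v + 0.5 * np.pi     # rotation angle about the z-axis
    rx = np.array([[1.0, 0.0, 0.0],
                   [0.0, np.cos(g), np.sin(g)],
                   [0.0, -np.sin(g), np.cos(g)]])
    rz = np.array([[np.cos(b), np.sin(b), 0.0],
                   [-np.sin(b), np.cos(b), 0.0],
                   [0.0, 0.0, 1.0]])
    return rx @ rz

phi_v, theta_v = 0.7, 0.3  # azimuth and elevation in radians (sample values)
# Unit velocity direction in the G-frame (x south, y east, z up).
v_hat = np.array([np.cos(theta_v) * np.cos(phi_v),
                  np.cos(theta_v) * np.sin(phi_v),
                  np.sin(theta_v)])
R = r_ini(phi_v, theta_v)
print(R @ v_hat)          # the heading maps onto the z-axis
print(R @ R.T)            # orthogonality: identity matrix
print(np.linalg.det(R))   # proper rotation: determinant +1
```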
The transforms between the S-frame and the C-frame are relativistic
transforms. If the translations are uniform, they are expressed as
ctC ctC
ctS 0 ( zC ) 0 cosh
sinh 0 0
ct ct
z ( z ) cosh C sinh C
S 0 C 0 0
(11.208)
where R mic characterizes the rotations about three axes intercepting at the apparent
barycenter, named as micro-motions. For a plane-like object,
R mic R roll R pitch R yaw
cos sin 0 1 0 0 cos 0 sin
(11.210)
sin cos 0 0 cos sin 0 1 0
0 0 1 0 sin cos sin 0 cos
where (t ), (t ), and (t ) are the angles of yaw, pitch, and roll maneuvers. For a
missile-like object,
R mic Rspinning R nutation R coning
cos sin 0 1 0 0 cos sin 0
(11.211)
sin cos 0 0 cos sin sin cos 0
0 0 1 0 sin cos 0 0 1
with
(t) s t 0
) p m sin(n t 0 )
(t (11.212)
(t ) c t 0
with
0 0 0 0 0 0 0
L 0 0 0 , K 0 0
0 0 (11.215)
0 0 1 0 0 0
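If rotating effects are negligible, that is, $\omega_{\max} D_{\max} \ll c$, where
$\omega_{\max} = \max(\omega_s, \omega_n, \omega_c)$ and $D_{\max}$ is the dimension of the target, the transforms from the
C-frame to the T-frame are
\[ \mathbf{E}_T = R_{\mathrm{mic}} \cdot \mathbf{E}_C, \quad \mathbf{B}_T = R_{\mathrm{mic}} \cdot \mathbf{B}_C \tag{11.216} \]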
where $\hat{p}_G$ indicates the polarization, $\hat{q}_G = \hat{k}_G \times \hat{p}_G$ with $\hat{k}_G$ being the propagation
direction, and $\mathbf{k}_G = k_G \hat{k}_G$ with $k_G = \omega_G / c$. Because we ignore acceleration
effects and all four frames are inertial systems, the field expressions in the four
frames take the same forms as (11.217) and (11.218). Thus, the fields in the
T-frame are
ET (rT , tT ) pˆ T E0 cos(T tT k T rT 0 ) (11.219)
pˆ T =R mic pˆ C
=R mic (L pˆ S +K qˆ S ) (11.221)
=R mic L (R ini pˆ G )+K (R ini qˆ G )
qˆ T =R mic qˆ C
=R mic (L qˆ S K pˆ S ) (11.222)
=R mic L (R ini qˆ G ) K (R ini pˆ G )
It is not difficult to show that $\hat{q}_T = \hat{k}_T \times \hat{p}_T$, and the transforms for $(\omega, c\mathbf{k})$ are
the same as those for the space-time $(ct, \mathbf{r})$, that is,
T 1 0 C
ck 0 R ck
T mic C
0 0 0 0 0
1 0 0 1 0 0 S
0 R mic 0 0 1 0 ck S
0 0 0 0 0
(11.223)
0 0 0 0 0
1 0 0 1 0 0 1 0 G
0 R mic 0 0 1 0 0 R ini ck G
0 0 0 0 0
A modulated Gaussian impulse in the G-frame and the T-frame is written in
the same form as
t
2
EG (rG , tG ) pˆ T E0 exp T dG cos(G G ) (11.224)
2 G
t
2
ET (rT , tT ) pˆ T E0 exp T dT cos(T T 0 ) (11.225)
2 T
G tdG 0
tdT , T G G (11.226)
T T
It is seen that the nominal bandwidth or the effective pulse duration has been
changed, which may be defined as f bw,G 6 / (2π G ) and f bw,T 6 / (2π T ) .
0 1 ˆ s ˆ s
EsT, n (rT , tT ) k T k T f n (r' )T' ( T kˆ sT r' )dS' (11.228)
4πrT ct n
ctT 1 0 ctC
r 0 R r
T mic C
0 0 0 0 0
1 0 0 1 0 0 ctS
0 R mic 0 0 1 0 rS
0 0 0 0 0
0 0 0 0 0
1 0 0 1 0 ctG ct0
0 1 0
(11.229)
0 R mic 0 0 1 0 0 R ini rG r0
0 0 0 0 0
3. Transform EsT (rT , tT ) in the T-frame to EsG (rG , tG ) in the G-frame by the
inverse transforms for fields, that is,
EsG (rG , t
G)
T
R ini T
ESs R ini
L EsC K cBsC
T
R ini L R mic
T
EsT K R mic
T
cBsT (11.230)
where $R_{\mathrm{ini}}^T$ and $R_{\mathrm{mic}}^T$ are the transposes of $R_{\mathrm{ini}}$ and $R_{\mathrm{mic}}$, respectively,
while $L$ and $K$ are given in (11.215).
EsG ( f G )
2
where $N$ is the number of pulses, $E^s_G(n, f_c)$ is the frequency-domain response for
the nth pulse, and $f_D = 1/(N T_{\mathrm{prf}})$ is the Doppler resolution with $T_{\mathrm{prf}}$ being the
repetition period. To ensure coherence, $T_{\mathrm{prf}}$ is required to be an integer multiple of
$T_c$, where $T_c = 1/f_c$ is the period of the carrier wave. One may estimate the
Doppler shift at any frequency $f_a$ by replacing $f_c$ in (11.232) with $f_a$.
In addition to the superfast translation, the target may rotate about an apparent
center. Such rotations are called micro-motions and may produce observable micro-Doppler
effects. To estimate the micro-Doppler effects, we may use a total of $M$
narrowband pulses and use the RCS defined in (11.231) to calculate
M
where K is the length of recorded time sequence, and rm is the distance from the
observer to the target when the mth pulse is transmitted. Replacing the (m, f G ) in
(11.233) with E (m) of (11.234), we can identify the micro-Doppler effects as well.
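\[
\mathbf{E}^i(\mathbf{r}, t) = \hat{p} E_0 \exp\left[ -\frac{(\tau - t_d)^2}{2\sigma^2} \right] \cos(2\pi f_c \tau), \quad
\sigma = \frac{6}{2\pi f_{bw}}, \quad \tau = t - \hat{k}^i \cdot \mathbf{r} / c \tag{11.235}
\]
\[
\mathbf{E}^i(\mathbf{r}, f) = \hat{p} E_0\, \frac{1}{2} \left[ A(f - f_c) + A(f + f_c) \right] e^{-j \mathbf{k}^i \cdot \mathbf{r}}, \quad
\mathbf{k}^i = \frac{2\pi f}{c} \hat{k}^i \tag{11.236}
\]
with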
2 c
with
The impulse function (11.235) and the magnitude of its spectrum (11.236) are
shown in Figure 11.15 with $f_c = 450$ MHz and $f_{bw} = 300$ MHz.
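As an illustration, the pulse of (11.235) and its magnitude spectrum can be sampled numerically. The following is a minimal sketch, not the authors' code: it evaluates the pulse at the origin (so that the retarded time equals t), and the delay `td`, the time window, and the sampling step are assumptions chosen only for this example.

```python
import numpy as np

def modulated_gaussian(t, fc, fbw, td, e0=1.0):
    """Modulated Gaussian pulse at r = 0, with sigma = 6 / (2*pi*fbw)."""
    sigma = 6.0 / (2.0 * np.pi * fbw)
    envelope = np.exp(-((t - td) ** 2) / (2.0 * sigma ** 2))
    return e0 * envelope * np.cos(2.0 * np.pi * fc * t)

fc, fbw = 450e6, 300e6                 # values quoted for Figure 11.15
sigma = 6.0 / (2.0 * np.pi * fbw)
td = 8.0 * sigma                       # assumed delay: pulse starts near zero
dt = 1.0 / (20.0 * (fc + fbw / 2.0))   # assumed step: 20 samples per shortest period
t = np.arange(0.0, 16.0 * sigma, dt)   # assumed window of 16 sigma

e = modulated_gaussian(t, fc, fbw, td)

# Magnitude spectrum by a discrete Fourier transform scaled by dt.
spectrum = np.abs(np.fft.rfft(e)) * dt
freqs = np.fft.rfftfreq(t.size, dt)
f_peak = freqs[np.argmax(spectrum)]    # expected to sit close to fc
```

Plotting `e` against `t` and `spectrum` against `freqs` should give curves of the kind shown in Figure 11.15, with most of the energy between $f_c - f_{bw}/2$ and $f_c + f_{bw}/2$.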
(a) (b)
Figure 11.15 (a) A modulated Gaussian impulse and (b) its magnitude spectrum.
Once the MOT equations are solved and the expansion coefficients are
extracted, we can compute the fields at any space-time position. Usually, far-zone
field properties are most interesting, including the time-domain waveforms and
frequency responses or wideband radar cross-section (RCS). In the far zone, the
electric field is calculated by the transverse parts of the first term of (11.4) for the
PEC body, that is,
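$$\mathbf{E}^s(r, \hat{r}, \tau) = \hat{r} \times \hat{r} \times \frac{\partial \mathbf{A}}{\partial t} = \frac{\mu_0}{4\pi r}\, \hat{r} \times \hat{r} \times \frac{\partial}{\partial t} \int_S \mathbf{J}_s(\mathbf{r}', \tau + \hat{r} \cdot \mathbf{r}'/c)\, dS' \qquad (11.238)$$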
where
1 1 0 Nt N
H s (r , rˆ , ) rˆ Es (r , rˆ , ) rˆ I n ( j )U n ( j; rˆ , ) (11.242)
0 0 4πr j 1 n 1
For a dielectric geometry, the far-zone electric and magnetic fields produced
by the equivalent electric currents, denoted by $\mathbf{E}_e^s(r, \hat{k}^s, \tau)$ and $\mathbf{H}_e^s(r, \hat{k}^s, \tau)$, have
the same forms as (11.239) and (11.242), but with $I_n(j)$ replaced by $I_n^{(e)}(j)$ (refer
to (11.107)). The far-zone magnetic and electric fields produced by the equivalent
magnetic currents, by the duality principle, are
0 Nt N
H ms (r , rˆ ,
) rˆ rˆ I n(m) ( j )U n ( j; rˆ , ) (11.243)
4πr j 1
n 1
N N t
The total scattered far-zone fields are the sums, that is,
$$\mathbf{E}^s = \mathbf{E}_e^s + \mathbf{E}_m^s, \quad \mathbf{H}^s = \mathbf{H}_e^s + \mathbf{H}_m^s \qquad (11.245)$$
$$\sigma(f) = \lim_{r \to \infty} 4\pi r^2 \frac{|\mathbf{E}^s(f)|^2}{|\mathbf{E}^i(f)|^2} \qquad (11.246)$$
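The wideband RCS of (11.246) can be estimated from sampled time-domain waveforms by taking the ratio of their spectra. The sketch below is only an illustration under stated assumptions: `wideband_rcs` and its `rel_threshold` parameter are hypothetical names, the far-zone field is assumed to be sampled at a finite but large distance `r`, and frequency bins where the incident spectrum is negligible are masked to avoid dividing by numerical noise.

```python
import numpy as np

def wideband_rcs(e_scat, e_inc, dt, r, rel_threshold=1e-3):
    """Estimate sigma(f) = 4*pi*r^2 |Es(f)|^2 / |Ei(f)|^2 from time samples.

    e_scat, e_inc: equally sampled scalar field components (same length).
    Bins where |Ei(f)| is below rel_threshold times its peak return NaN.
    """
    es_f = np.fft.rfft(e_scat) * dt
    ei_f = np.fft.rfft(e_inc) * dt
    freqs = np.fft.rfftfreq(len(e_inc), dt)
    valid = np.abs(ei_f) > rel_threshold * np.abs(ei_f).max()
    sigma = np.full(freqs.shape, np.nan)
    sigma[valid] = (4.0 * np.pi * r ** 2
                    * np.abs(es_f[valid]) ** 2 / np.abs(ei_f[valid]) ** 2)
    return freqs, sigma

# Synthetic check: a scattered field equal to 0.5 * Ei / r has sigma = pi.
dt = 1e-11
t = np.arange(0.0, 2e-8, dt)
e_inc = np.exp(-((t - 1e-8) ** 2) / (2.0 * (1e-9) ** 2))
r = 100.0
e_scat = 0.5 * e_inc / r
freqs, sigma = wideband_rcs(e_scat, e_inc, dt, r)
```

Masking by the incident spectrum reflects the fact that the RCS is only meaningful inside the band actually illuminated by the pulse.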
The first example is a scattering problem for a dipole antenna (see the inset of
Figure 11.16). The length of the dipole antenna is 1 m, and its diameter is 1 cm.
The incident wave in this example is an unmodulated Gaussian impulse:
$$\mathbf{E}^i(\mathbf{r}, t) = \hat{p}\, \frac{4}{T\sqrt{\pi}} \exp\left\{-\left[\frac{4}{T}\left(ct - ct_0 - \hat{k}^i \cdot \mathbf{r}\right)\right]^2\right\} \qquad (11.247)$$
(a) (b)
Figure 11.16 Scattering of a dipole antenna by a Gaussian impulse: (a) the short circuit current at the
feeding point and (b) the magnitude of current at the feeding point.
(a) (b)
(c)
Figure 11.17 Radiation of a V-shape dipole antenna fed by a Gaussian voltage: (a) the input current;
(b) the magnitude of input current; and (c) the far-zone radiation field.
(a) (b)
(c)
Figure 11.18 Radiation of a helical monopolar antenna fed by a Gaussian voltage: (a) the input current;
(b) the magnitude of input current; and (c) the far-zone radiation field.
The third example is also a radiation problem, this time for a helical antenna [see
the inset of Figure 11.18] that has six turns and a pitch (rise) angle of 14°. Its length
is 1.24 m, and it is divided into 12 segments. A Gaussian source voltage in (11.248) is
applied at the vertex (feeding point). The current at the vertex is depicted in Figure
11.18(a), and its late-time behavior until 1,000 LM (4,000 time steps) is shown in
Figure 11.18(b). The far-zone field is displayed in Figure 11.18(c).
As demonstrated by the three examples above, the TDIE solutions converge at
exponential rates as long as the matrix elements are evaluated precisely. Here,
precise evaluation means closed-form expressions for the self-interaction terms
and Gaussian quadrature with at least four points for the non-self-interaction
terms. If the one-point rule is used, the MOT solution eventually diverges.
For 2-D problems, the first example is a PEC circular cylinder with a radius of 1 m
(see the inset of Figure 11.19). The incident wave is the modulated Gaussian
impulse as (11.235) with kˆ i xˆ , E0 120π , and fc f bw 300 MHz. For a TM
wave incidence, pˆ zˆ , and for a TE wave incidence, pˆ yˆ . The induced current
at the point (1, 0) is shown in Figure 11.19(a) for TM polarization. The
convergence behavior of the currents is shown in Figure 11.19(b). Results
obtained with EFIE, MFIE, and CFIE are compared, together with the IDFT
solution, which solves the problem by using the MoM at 256 frequency points from
150 MHz to 450 MHz and then converts the data from the frequency domain to the
time domain. We repeat the computing procedure after changing the incident wave
to TE polarization. The convergence behavior of the magnitude of the current at
the same point is displayed in Figure 11.19(c). It is seen from the figures that
CFIE gives much more accurate results than EFIE and MFIE.
The second example is a double strip structure (see the inset of Figure 11.20).
The width of the strips is 1m, and they are separated by 0.5m. The incident wave
is the same as the previous example, except for the incident direction for the TE
case, which is changed to kˆ i (xˆ yˆ ) / 2 such that pˆ (xˆ yˆ ) / 2 . For the
TM case, the magnitude of the induced current at the point (0, 0.25) is shown in
Figure 11.20(a), and the bistatic RCS at 450 MHz is shown in Figure 11.20(b). For
the TE case, the magnitude of the induced current at the same point is shown in
Figure 11.21(a), and the bistatic RCS at 150 MHz is shown in Figure 11.21(b). It
can be seen from the graphs that the magnitudes first decay exponentially and
eventually level off.
To verify the convergence behavior for the narrowband case, we repeat the
computations in Figure 11.22(a) with the structure enlarged 10 times and the
nominal bandwidth narrowed to 3 MHz (the effective band is from 298.5 MHz to
301.5 MHz). The result is illustrated in Figure 11.22(a), which levels off at around
$10^{-8}$ A/m. A close-up of a portion of the waveform is plotted in Figure 11.22(b).
Good convergence and accuracy are achieved.
(a) (b)
(c)
Figure 11.19 Scattering of modulated Gaussian impulse by a conducting cylinder: (a) the induced
current at point (1, 0) for TM case; (b) the magnitude of currents for TM case; and (c)
the magnitude of currents for TE case. ©IEEE 2014 [28].
(a) (b)
Figure 11.20 Scattering of modulated Gaussian impulse by a double strip structure for TM
polarization: (a) the magnitude of currents at point (0, 0.25) and (b) the bistatic RCS at
450 MHz. ©IEEE 2014 [28].
(a) (b)
Figure 11.21 Scattering of modulated Gaussian impulse by a double strip structure for TE
polarization: (a) the magnitude of currents at point (0, 0.25) and (b) the bistatic RCS at
150 MHz. ©IEEE 2014 [28].
(a) (b)
Figure 11.22 Scattering by a large-size structure to verify the convergent performance: (a) the time
domain waveform and (b) a zoom-in look at a part of the waveform.
The above results demonstrate that the present TDIE methods for 2-D
structures are stable and accurate.
For 3-D geometries, the first example is a square PEC plate with a side length of
1 m, lying in the x-y plane. It is discretized by using 1,825 RWG basis functions.
The incident wave is the modulated Gaussian impulse with $f_c = 600$ MHz and
$f_{bw} = 1.2$ GHz. The induced current waveform at the plate center is shown in
Figure 11.23(a) from 0 to 10 LM (400 time steps). The convergence behavior of the
magnitudes is shown in Figure 11.23(b) up to 4,000 time steps (100 LM); the
magnitude levels off below $10^{-15}$ A/m. The backscattered field is plotted in Figure 11.23(c).
Using the scattered field data, the monostatic RCS from 0 to 1.2 GHz can be found as
shown in Figure 11.23(d), where comparisons by using different time step sizes
and different numbers of spatial unknowns, as well as the MoM solutions, are
illustrated. Good stability and accuracy are achieved.
(a) (b)
(c) (d)
Figure 11.23 Backscattering by square PEC plate with side size 1m: (a) the current waveform at the
plate center, (b) the magnitude of the currents at the center, (c) the backscattering field,
and (d) the backscattering wideband RCS.
The next example is a PEC cube with an edge length of 0.5 m. It is
discretized by using 2,178 RWG basis functions. The parameters of the incident
wave are the same as in the previous example. The induced current waveform at the
center of the top surface is shown in Figure 11.24(a) from 0 to 10 LM (400 time
steps). The convergence behavior of the magnitudes is shown in Figure 11.24(b)
up to 4,000 time steps (100 LM); the magnitude levels off at about $10^{-16}$ A/m. The
monostatic RCS from 0 to 1.2 GHz is shown in Figure 11.24(c), and the bistatic
RCS at 900 MHz is shown in Figure 11.24(d), where a comparison with the MoM
solution and measured data [34] is provided. Again, it is observed from Figure
11.24 that the results are stable and accurate.
(a) (b)
(c) (d)
Figure 11.24 Backscattering by a PEC cube with an edge size of 0.5m: (a) the current waveform at the
center of the top side, (b) the magnitude of the currents at the center point, (c) the
monostatic RCS, and (d) the bistatic RCS at 900 MHz.
(a) (b)
Figure 11.25 Scattering by a NASA almond: (a) the current waveform at the point P and (b) the
bistatic RCS at 1.57 GHz.
(a) (b)
(c) (d)
Figure 11.26 Scattering by a dielectric sphere: (a) the equivalent electric current at the polar point; (b) the
equivalent magnetic current at the polar point; (c) the scattered far-zone field; and (d)
wideband backscattering RCS.
The third example is the NASA almond [35]. We use 2,031 curved RWG
basis functions to discretize the surface of the almond. The carrier frequency and
nominal bandwidth of the incident modulated Gaussian impulse are $f_c = 1.57$ GHz
and $f_{bw} = 3.14$ GHz, respectively. The induced current at the point P is shown in
Figure 11.25(a), and the bistatic RCS at 1.57 GHz is shown in Figure 11.25(b);
the results are in good agreement with the measured data.
The last example is a dielectric sphere [see the inset of Figure 11.26(a)], for
which an analytical solution is available. The diameter of the sphere is 0.5 m, and
the relative permittivity is $\varepsilon_r = 4.0$. It is modeled by 6,672 curved RWG basis
functions. The carrier frequency and bandwidth are $f_c = 500$ MHz and $f_{bw} = 1.0$
GHz, respectively. The x-component of the equivalent electric currents and the
y-component of the equivalent magnetic currents at the point (0, 0, 0.25) are plotted
in Figures 11.26(a) and 11.26(b), respectively. The backscattered far-zone field
and RCS are displayed in Figures 11.26(c) and 11.26(d), respectively. For the sake
of comparison, the analytical Mie series solution is provided in the same figure,
and it is observed from the figure that they are in good agreement.
The first example is a PEC sphere [see the inset of Figure 11.27(b)] with a radius
of a = 1 m. It is located on the z-axis at $z_0 = 10^7$ m at the initial instant and
moves along the z-axis at an exaggerated speed $v_0 = 3 \times 10^7$ m/s. The incident wave
is an elementary Gaussian pulse:
t0 2 z z0
EiG ( ) xˆ exp , t c (11.249)
2
[Figure 11.27 plots: (a) Ex (V/m) versus time (×10⁻⁸ s); (b) spectrum of Ex (V·s/m) versus frequency (×10⁸ Hz); each panel shows curves for v = 0 and v = 3e7.]
(a) (b)
Figure 11.27 The time-domain and frequency-domain responses of a moving PEC sphere: (a) the echo
waveform and (b) the spectrum of the echo.
The second example is a missile model that consists of a cylinder and a half
sphere (see the inset of Figure 11.28). The height and diameter of the cylinder are
1.5 m and 1 m, respectively. It flies at v = 1 km/s and in the meantime seesaws
about the track axis at 1 kHz. A very narrowband modulated Gaussian impulse
with carrier frequency $f_c = 200$ MHz and nominal bandwidth $f_{bw} = 1$ MHz is
incident upon it at an elevation angle of 45°. The time-domain waveform of the
backscattered far-zone field looks somewhat like Figure 11.22. Taking the Fourier
[Figure 11.28 plot: spectrum (×10⁻⁹ V·s/m) versus f − f0 (kHz).]
Figure 11.28 The Doppler and micro-Doppler effects of a model missile that moves at 1 km/s and
vibrates at 1 kHz about its track axis.
The last example is a moving cone, or warhead model. The diameter of its
base is 42 cm, and its height is 145 cm. It flies horizontally at a speed v = 3.4 km/s,
and in the meantime rotates in coning at $f_{coning} = 2$ Hz and in nutation at
$f_{nutation} = 3.3$ Hz. Both the precession angle and the maximum nutation angle are
set to 10° [refer to (11.212)]. A train of modulated Gaussian pulses
is transmitted to the target from a ground station (G-frame), as illustrated in Figure
11.29. Each incident pulse has the waveform shown in Figure 11.15 and is
delayed by $T_{PRT} = 0.02$ s, and a total of 128 pulses are used. Each pulse is
first transformed from the G-frame to the T-frame; the scattered pulse is then
calculated in the T-frame and finally transformed back to the G-frame, where it is
recorded as a vector echo:
S(n, tG ) rn EGs (n, tG 2rn / c) (11.250)
where rn is the distance from the station to the target when the nth pulse is
transmitted. Three echoes (only HH-polarization components) are illustrated in
Figure 11.29. The normalized energy of each echo may be calculated by
$$E(n) = \sum_{k=1}^{K} \left|\mathbf{S}(n, k\Delta t)\right|^2 \qquad (11.251)$$
where K is the number of samples of each recorded echo. This energy sequence is
a measure of the time-varying scattering capability of the target (amounting to the
scattering cross-section); it is plotted in Figure 11.30(a), and its Fourier
transform is shown in Figure 11.30(b). The peak positions are located at
$f_{peak} = m f_{coning} + n f_{nutation}$, where m and n are a pair of integers. For example, the
first four peaks correspond to (m, n) = (2, -1), (-1, 1), (1, 0), and (0, 1), that is,
to 0.7 Hz, 1.3 Hz, 2.0 Hz, and 3.3 Hz, respectively. It might be possible to identify
the micro-motion characteristics by using a group of peak positions.
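The candidate peak positions can be enumerated directly from the two micro-motion frequencies and compared with the resolution of the pulse train. The following is a minimal sketch using the values quoted in this example; the function name `candidate_peaks` and the cutoff `max_order` on |m| + |n| are assumptions made for illustration, not part of the method described above.

```python
f_coning, f_nutation = 2.0, 3.3   # Hz, from the example
t_prt, n_pulses = 0.02, 128       # pulse repetition time (s) and pulse count

# Resolution and unambiguous range of the spectrum of the energy sequence.
doppler_resolution = 1.0 / (n_pulses * t_prt)   # Hz
nyquist_frequency = 1.0 / (2.0 * t_prt)         # Hz

def candidate_peaks(f1, f2, max_order=3):
    """List positive frequencies m*f1 + n*f2 with |m| + |n| <= max_order.

    Returns (frequency, (m, n)) pairs sorted by frequency. The cutoff
    max_order is an assumption: higher-order combinations are ignored.
    """
    peaks = []
    for m in range(-max_order, max_order + 1):
        for n in range(-max_order, max_order + 1):
            if abs(m) + abs(n) > max_order:
                continue
            f = m * f1 + n * f2
            if f > 1e-9:
                peaks.append((round(f, 6), (m, n)))
    return sorted(peaks)

peaks = candidate_peaks(f_coning, f_nutation)
for f, (m, n) in peaks[:4]:
    print(f"f = {f:.1f} Hz  (m, n) = ({m}, {n})")
```

With these values, the lowest candidates are separated by more than the resolution $1/(N T_{PRT})$, which is a necessary condition for distinguishing them in the spectrum of the energy sequence.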
Figure 11.29 Illustration of pulse responses from a flying cone target that rotates in coning and
nutation in the meantime.
[Figure 11.30 plots: (a) echo energy (V²) versus time (s); (b) spectrum of echo energy versus frequency (Hz).]
(a) (b)
Figure 11.30 The echo energy and its spectrum for 128 pulses repeated every 0.02 second: (a) the
normalized echo energy and (b) the spectrum of echo energy. The target flies at 3 km/s
and rotates in coning at 2 Hz and in nutation at 3.3 Hz; both the precession angle and
maximum nutation angle are 10°.
11.7 SUMMARY
REFERENCES
[1] C. Bennett, “A Technique for Computing Approximate Impulse Response for Conducting
Bodies,” Electrical Engineering, West Lafayette, IN: Purdue University, 1968.
[2] C. Bennett and W. Weeks, “Electromagnetic Pulse Response of Cylindrical Scatters,” IEEE
G-AP International Symposium, Northeastern University, pp. 176–183, 1968.
[3] C. Bennett, “Transient Scattering from Conducting Cylinders,” IEEE Trans. Antennas Propagat.,
Vol. 18, pp. 627–633, 1970.
[4] E. Sayre and R. Harrington, “Time Domain Radiation and Scattering by Thin Wires,” App. Sci.
Res., Vol. 26, pp. 413–444, 1972.
[5] T. Lui and K. Mei, “A Time Domain Integral Equation Solution for Linear Antenna and
Scatterers,” Radio Sci., Vol. 8, pp. 797–804, 1973.
[6] E. Miller, J. Poggio, and G. Burke, “An Integro-Differential Equation Technique for the Time
Domain Analysis of Thin-Wire Structures, I. The Numerical Method,” J. Comput. Phys., Vol. 12,
No. 1, pp. 24–48, 1973.
[7] R. Mittra, “Integral Equation Methods for Transient Scattering,” Transient Electromagnetic
Fields, edited by L. B. Felsen, New York: Springer-Verlag, pp. 83–138, 1976.
[8] A. Tijhuis, “Toward a Stable Marching-on-in-Time Method for Two Dimensional
Electromagnetic Scattering Problems,” Radio Sci., Vol. 19, pp. 1311–1317, 1984.
[9] B. Rynne, “Instability in Time Marching Methods for Scattering Problems,” Electromagnetics,
Vol. 6, pp. 129–144, 1986.
[10] P. Smith, “Instability in Time Marching Methods for Scattering: Cause and Rectification,”
Electromagnetics, Vol. 10, pp. 439–451, 1990.
[11] D. Vechinski and S. Rao, “A Stable Procedure to Calculate the Transient Scattering by
Conducting Surfaces of Arbitrary Shape,” IEEE Trans. Antennas Propagat., Vol. 40, pp.
661–665, 1992.
[12] A. Sadigh and E. Arvas, “Treating the Instabilities in Marching-on-in-Time Method from a
Different Perspective,” IEEE Trans. Antennas Propagat., Vol. 41, pp. 1695–1702, 1993.
[13] P. Davies, “A Stability Analysis of a Time Marching Scheme for the General Surface Electric
Field Integral Equation,” Applied Numerical Mathematics, Vol. 27, pp. 33–57, 1994.
[14] P. Davies and D. Duncan, “Averaging Techniques for Time-Marching Schemes for Retarded
Potential Integral Equations,” App. Numer. Math., Vol. 23, pp. 291–310, 1997.
[15] E. Miller, “A Selective Survey of Computational Electromagnetics,” IEEE Trans. Antennas
Propagat., Vol. 30, pp. 29, 1988.
[16] S. Rao, T. Sarkar, and M. Bluck, “Time-Domain Modeling of Two-Dimensional Conducting
Cylinders Utilizing an Implicit Scheme-TM Incidence,” Microwave Opt. Technol. Lett., Vol. 15,
pp. 342–347, 1997.
[17] Y. Shifman and Y. Leviatan, “On the Use of Spatiotemporal Multiresolution Analysis in Method
of Moments Solutions for the Time-Domain Integral Equation,” IEEE Trans. Antennas Propagat.,
Vol. 49, pp. 1123–1129, 2001.
[18] A. Ergin, B. Shanker, and E. Michielssen, “Fast Evaluation of Three Dimensional Transient
Wave Fields Using Diagonal Translation Operators,” J. Comput. Phys., Vol. 146, pp. 157–180,
1998.
[19] J. Hu and C. Chan, “Improved Temporal Basis Functions for Time Domain Electric Field
Integral Equation Method,” Electronics Letters, Vol. 35, No. 11, pp. 883–885, 1999.
[20] D. Weile, G. Pisharody, N. Chen, B. Shanker, and E. Michielssen, “A Novel Scheme for the
Solution of the Time-Domain Integral Equations of Electromagnetics,” IEEE Trans. Antennas
Propagat., Vol. 52, pp. 283–295, 2004.
[21] H. Bagci, A. Yilmaz, V. Lomakin, and E. Michielssen, “Fast Solution of Mixed-Potential Time-
Domain Integral Equations for Half-Space Environments,” IEEE Trans. Geosci. Remote Sensing,
Vol. 43, pp. 269–279, 2005.
[22] M. Xia, G. Zhang, G. Dai, and C. Chan, “Stable Solution of Time Domain Integral Equation
Methods Using Quadratic B-Spline Basis Functions,” Journal of Computational Mathematics,
Vol. 25, pp. 374–384, 2007.
[23] Y. Chung, T. Sarkar, B. Jung, M. Salazar-Palma, J. Zhong, J. Seongman, and K. Kyungjung,
“Solution of Time Domain Integral Equation Using the Laguerre Polynomials,” IEEE Trans.
Antennas Propagat., Vol. 52, pp. 2319–2328, 2004.
[24] M. Lu and E. Michielssen, “Closed Form Evaluation of Time Domain Fields due to Rao-Wilton-
Glisson Sources for Use in Marching-on-in-Time Based EFIE Solvers,” IEEE APS Int. Symp.
Dig., pp. 74–77, 2002.
[25] B. Zubik-Kowal and P. Davies, “Numerical Approximation of Time Domain Electromagnetic
Scattering from a Thin Wire,” Numerical Algorithms, Vol. 30, pp. 25–35, 2002.
[26] G. Zhang, M. Xia, and X. Jiang, “Transient Analysis of Wire Structures Using Time Domain
Integral Equation Method with Exact Elements,” Progress in Electromagnetics Research, Vol.
92, pp. 281–298, 2009.
[27] M. Lu, K. Yegin, B. Shanker, and E. Michielssen, “Fast Time Domain Integral Equation Solvers
for Analyzing Two-Dimensional Scattering Phenomena; Part I: Temporal Acceleration,”
Electromagnetics, Vol. 24, pp. 425–449, 2004.
[28] X. Guo, M. Xia, and C. Chan, “Stable TDIE-MOT Solver for Transient Scattering by Two-
Dimensional Conducting Structures,” IEEE Trans. Antennas Propagat., Vol. 62, pp. 2149–2157,
2014.
[29] B. Shanker, M. Lu, J. Yuan, and E. Michielssen, “Time Domain Integral Equation Analysis of
Scattering from Composite Bodies via Exact Evaluation of Radiation Fields,” IEEE Trans.
Antennas Propagat., Vol. 57, No. 5, pp. 1506–1520, 2009.
[30] Y. Shi, M. Xia, R. Chen, E. Michielssen, and M. Lu, “Stable Electric Field TDIE Solvers via
Quasi-Exact Evaluation of MOT Matrix Elements,” IEEE Trans. Antennas Propagat., Vol. 59,
pp. 574–584, 2011.
[31] B. Kolundzija and A. Djordjevic, Electromagnetic Modeling of Composite Metallic and
Dielectric Structures, Norwood, MA: Artech House, pp. 170–171, 2002.
[32] S. Rao, D. Wilton, and A. Glisson, “Electromagnetic Scattering by Surfaces of Arbitrary Shape,”
IEEE Trans. Antennas Propagat., Vol. 30, pp. 409–418, 1982.
[33] P. Arcioni, M. Bressan, and L. Perregrini, “On the Evaluation of the Double Surface Integrals
Arising in the Application of the Boundary Integral Method to 3-D Problems,” IEEE Trans.
Microwave Theory Tech., Vol. 45, pp. 436–438, 1997.
[34] M. Cote, M. Woodworth, and A. Yaghjian, “Scattering from the Perfectly Conducting Cube,”
IEEE Trans. Antennas Propagat., Vol. 36, pp. 1321–1329, 1988.
[35] J. Volakis, A. Woo, H. Wang, M. Schuh, and M. Sanders, “Benchmark Radar Targets for the
Validation of Computational Electromagnetics Problems,” IEEE Antennas and Propagation
Magazine, Vol. 35, pp. 84–89, 1993.
[36] G. Kobidze, J. Guo, B. Shanker, and E. Michielssen, “A Fast Time Domain Integral Equation
Based Scheme for Analyzing Scattering from Dispersive Objects,” IEEE Trans. Antennas
Propagat., Vol. 53, pp. 1215–1226, 2005.
[37] N. Gres, A. Ergin, B. Shanker, and E. Michielssen, “Volume Integral Equation Based Analysis of
Transient Electromagnetic Scattering from Three-Dimensional Inhomogeneous Dielectric
Objects,” Radio Sci., Vol. 36, No. 3, pp. 379–386, 2001.
[38] P. Jiang and E. Michielssen, “Temporal Acceleration of Time-Domain Integral Equation Solvers
for Electromagnetic Scattering from Objects Residing in Lossy Media,” Microwave and Optical
Technology Letters, Vol. 44, pp. 223–230, 2005.
[39] B. Jung, J. Zhong, T. Sarkar, and M. Salazar-Palma, “A Comparison of Marching-on in Time
Method with Marching-on in Degree Method for the TDIE Solver,” Progress In
Electromagnetics Research, Vol. 70, pp. 281–296, 2007.
[40] Y. Beghein, K. Cools, H. Bagci, and D. Zutter, “A Space-Time Mixed Galerkin Marching-on-in-
Time Scheme for the Time-Domain Combined Field Integral Equation,” IEEE Trans. Antennas
12.1 INTRODUCTION
polynomial expansion (GPCE), which can be used to build these surrogate models
with a parsimonious number of FDTD simulations.
$$\mathrm{SAR} = \frac{1}{2}\frac{\sigma |E|^2}{\rho} \qquad (12.1)$$
model Zubal [5], and more recently the Virtual Family [6] and the Chinese [7]
models. The phantoms that have been developed recently have a millimeter
resolution, while some of the earlier ones have a resolution of a few millimeters.
Using some of these phantoms, studies have been carried out [8] to assess the
human exposure induced by a frontal plane wave from 20 MHz to 2.4 GHz. As
shown in Figure 12.2, the frequency plays an important role in the human exposure.
Figures 12.2 and 12.3 also show the large influence of the morphology.
(a) (b)
(c)
Figure 12.1 Influence of the presence of tissues on the antenna pattern of a mobile device: (a) far-field
pattern of a mobile device alone; (b) configuration in the FDTD simulation; and (c) far-field
pattern of a mobile device with a human head.
Frequency (MHz)
Figure 12.2 Whole-body SAR versus frequency from 20 MHz to 2.4 GHz for an incident power
density of 1 W/m².
Deviation from mean wb SAR in %
Frequency (MHz)
Figure 12.3 Variability analysis of SAR from 20 MHz to 2.4 GHz. © Phys. Med. Biol. 2008 [8].
It is evident from Figures 12.2 and 12.3 that the exposure of the different human
models varies significantly with frequency. For instance, there is a large variability in
the frequency bands close to 100 MHz, where the human body behaves like an
antenna that absorbs energy efficiently. The influence of morphologies occurs
Statistical Methods and Computational Electromagnetics 523
also at frequencies close to 1,800 MHz; in this case, the influence on the SAR
value is due to the human cross-section, which varies between individuals.
Since head and body morphologies evolve with age, much effort has also been
devoted to developing models of a child head [9, 10] as well as of a fetus [11] at
different stages [12, 13], in order to assess the SAR induced by a mobile phone in
the tissues of young children.
Figure 12.4 Electric field strength coming from a closed Femto box calculated using the FDTD
method combined with Huygens’ surface and the spherical wave modes.
To avoid meshing unnecessary free space, the equivalence principle is often used
to model the excitation source using the incident EMFs on a surface
surrounding the exposed object (a human in the current case). This method has
proven efficient when the coupling between the source and the exposed object
is negligible. It has long been used in FDTD through the
well-known Huygens’ surface. However, only a plane wave is modeled most of the
time. With the recent use of small mobile devices that are quite close to the human
body, the plane wave model is questionable. An efficient way to overcome this limit
is to use the spherical wave expansion (SWE). The EMF emitted by the sources is
expressed as a combination of spherical waves (SW), which form an orthogonal
basis of the EMF space [14]:
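$$\mathbf{E}(r, \theta, \varphi) = \frac{k}{\sqrt{\eta}} \sum_{s=1}^{2} \sum_{n=1}^{N} \sum_{m=-n}^{n} Q_{s,m,n}\, \mathbf{F}_{s,m,n}(r, \theta, \varphi) \qquad (12.2)$$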
where E and H are the electric and magnetic fields expressed in spherical
coordinates (r is the distance from the source to the observation point, 𝜃 is the
elevation angle, and 𝜑 is the azimuth angle), k is the free-space propagation
constant, 𝜂 is the free-space admittance, N is the number of modes, Q is the
coefficient, and F is the spherical
wave function of index s (TM or TE fields), order m, and degree n. The fields are
fully characterized by this expansion. Theoretically, there can be an infinite number
of spherical modes, but in practice the number N of modes is chosen to be
sufficient to correctly describe the field emitted by the antenna [15]. Such
an approach can be used to calculate the SAR induced by small devices such as
“femto-cells” that can be close to the human body, for which a plane wave model
cannot be used. Figure 12.4 shows the electric field distribution
computed with FDTD using Huygens’ surface and the spherical wave modes.
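To get a feel for how the size of the expansion grows with the truncation, the mode indices can simply be enumerated. The sketch below assumes the usual index ranges of a spherical wave expansion (s = 1, 2; n = 1, ..., N; m = -n, ..., n); the function name `swe_mode_indices` is hypothetical, and the code only counts modes without evaluating any wave function.

```python
def swe_mode_indices(n_max):
    """Enumerate (s, m, n) triples for degrees n = 1..n_max.

    Assumed ranges: s in {1, 2}, m from -n to n. For each degree n there
    are 2 * (2n + 1) modes, so the total is 2 * n_max * (n_max + 2).
    """
    return [(s, m, n)
            for n in range(1, n_max + 1)
            for m in range(-n, n + 1)
            for s in (1, 2)]

for n_max in (1, 5, 10, 20):
    print(n_max, len(swe_mode_indices(n_max)))
```

The quadratic growth of the mode count with N is one reason why N is kept as small as the required field accuracy allows.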
$$S_y^2 = \frac{1}{n-1} \sum_{i=1}^{n} \left(M(x_i) - m_y\right)^2 \qquad (12.5)$$
Thanks to the central limit theorem, the estimator $m_y$ is asymptotically
Gaussian. As a consequence, if n is sufficiently large, with $q_{1-\alpha/2}$ the $(1-\alpha/2)$
quantile of the centered and reduced Gaussian law 𝒩(0, 1), the uncertainty of the
estimator is given by $\pm q_{1-\alpha/2}\, S_y/\sqrt{n}$. For example, with a typical risk value of
$\alpha$ = 5%, the uncertainty is $\pm 1.96\, S_y/\sqrt{n}$ (with $q_{1-\alpha/2} = 1.96$). A similar
formula exists for the confidence interval of the standard deviation.
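The estimator and its confidence interval can be written in a few lines. This is a minimal sketch: the function name `mc_mean_with_ci` is hypothetical, and the squared-input "model" in the usage lines is only a cheap stand-in for an expensive simulation such as an FDTD run.

```python
import math
from statistics import NormalDist

import numpy as np

def mc_mean_with_ci(samples, alpha=0.05):
    """Return (m_y, S_y, half_width) for the Monte Carlo mean estimator.

    S_y uses the unbiased 1/(n-1) normalization, and half_width is
    q_{1-alpha/2} * S_y / sqrt(n), with q taken from the N(0, 1) law.
    """
    y = np.asarray(samples, dtype=float)
    n = y.size
    m_y = y.mean()
    s_y = y.std(ddof=1)
    q = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    return m_y, s_y, q * s_y / math.sqrt(n)

# Usage with a stand-in model (an assumption, not an FDTD computation).
rng = np.random.default_rng(0)
x = rng.uniform(0.0, 1.0, size=500)
y = x ** 2
m_y, s_y, half_width = mc_mean_with_ci(y)
print(f"mean = {m_y:.4f} +/- {half_width:.4f} (95% confidence)")
```

Because the half-width decreases only as $1/\sqrt{n}$, halving the uncertainty requires four times as many simulations.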
The main advantage of the Monte Carlo method is its simplicity and
applicability to a large class of problems, but its main limitation is the very large
number of experiments required for problems involving a large number of inputs
or for the estimation of higher-order moments.
Other methods exist [21], but they also require a large number of
samples, which is not compatible with time-consuming numerical methods
such as FDTD. As described previously, the main advantage of the
FDTD method is that it proceeds without any matrix inversion, which can be
cumbersome, but the main constraint is the computation time, which can be very
large (i.e., at least a few hours if the calculation involves the whole body) for
human RF exposure assessment.
Much effort has been put toward high-performance computing using parallel
architectures with recently developed graphics processing units. But even with this
push, the computation time is still not compatible with Monte Carlo methods, which
can require from a few hundred to a few thousand simulations depending on
the required precision of the highest computed moment of the probability
distribution.
The problem described in the previous section is not specific to radio
frequency electromagnetic exposure assessment; it may occur in many other
physics or engineering problems involving heavy use of computer simulations that
require significant computation time. Typical examples can be found in
mechanics, with the optimization of a shape design. Indeed, for many
real-world problems, a single simulation, as in RF dosimetry or antenna design,
can take several minutes or hours. Similar problems occur when the objective is to
characterize the influence of input variations on the statistical distribution of the
outputs calculated with the simulations.
A way to overcome such a limitation is to build simpler approximation
models, known as alternative models, surrogate models, response surfaces, or
metamodels, that mimic the complex response of the model (represented in
dosimetry by the FDTD simulations) as closely as possible while being cheap to
evaluate [22].
A model of a physical problem or system can be represented using a general
function $M: x \mapsto y = M(x)$; with this notation, x is a vector composed of the input
parameters of the model ($x \in D \subset \mathbb{R}^M$). The model response, $y = M(x)$, is also a
vector, with a dimension possibly different from that of the input. Within this
formalism, described in Figure 12.5, the objective is to build an approximation of
the model response: $\hat{y} = \hat{M}(x)$.
[Figure 12.5 diagram: the input x (geometry design, frequency, shape, ...) is fed to the surrogate model, which returns the approximation output.]
We will now focus on the design of experiments used to build the surrogate model
of the calculation code. The experiments are the configurations that will be
computed numerically (e.g., via FDTD). In the case of a person exposed to an
electromagnetic plane wave with variable angles of incidence (𝜃, 𝜑), the
experimental design consists of selecting a set of incidence directions {(𝜃𝑖, 𝜑𝑖),
. . . , (𝜃𝑗, 𝜑𝑗), . . . } (these are the experiments), for which the whole-body SAR, for
example, will be calculated using FDTD.
Clearly, the experiments must be sampled optimally to estimate the model
parameters. The fundamental difference between the design of experiments
developed in the laboratory for physical experiments and the design of experiments
built for numerical calculations is that we assume the presence of random
measurement errors in the laboratory but not in numerical simulations. Repeating a
numerical experiment under the same conditions is pointless, since it provides no
additional information, which is not the case with physical experiments.
The choice of the design of numerical experiments, and therefore of the points at
which the simulations will be conducted, must meet several constraints. The first
one is to distribute these points in space as uniformly as possible to capture
possible nonlinearities (relative to the input variables) of the simulated phenomenon.
The second one is that the uniform distribution must persist if a dimensionality
reduction is performed. Indeed, when problems have a large dimension, that is to
say that they have many input parameters, it is common to observe that the
calculations depend heavily on only a few influential variables or on principal
components consisting of linear combinations of these variables. It is therefore
important to preserve the uniform filling even in projections onto subspaces.
The last constraint, but not the least, is parsimony. The number of simulations
must be large enough to estimate all the coefficients of the approximation
model, but it must be limited to reduce the simulation cost. In the case
of SAR calculations, for which the computation time can be a few hours, this last
constraint is fundamental.
Literature and textbooks exist on the design of experiments. An easy way to
address the problem of filling the space is to select points on a regular grid in the
experimental area. However, such an approach can lead not only to a large number
of experiments but also to a wrong model, because the regular spacing may miss
some component of the phenomenon. An easy way to avoid the problems due to the
regular spacing is to select the points randomly. But such an approach can create,
as shown in Figure 12.6, a nonuniform sampling of the space that overweights some
parts of the space and can lead to a biased surrogate model.
Figure 12.6 Random sampling of 10 points for two variables having uniform distribution (arbitrary
units for the x- and y-axes).
\left( \prod_{p=0}^{M-1} (M - p) \right)^{N-1} = (M!)^{N-1}    (12.6)
Some of the possible combinations do not fill the space uniformly, as shown in
Figure 12.8. Following the LHS rules does not prevent a possibly bad space filling.
The combination inducing the best space filling can be identified
using the “maximin” criterion. The minimum Euclidean distance di between
the points of a plan can be calculated for all the
possible experiment plans. The plan having the maximum
distance di can be considered as the best plan from the space-filling point of view.
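The maximin selection described above can be sketched as follows. This is only an illustration under stated assumptions: the function names (`latin_hypercube`, `min_distance`, `maximin_lhs`) are hypothetical, the variables are taken as uniform on [0, 1], and the best plan is searched among a limited number of random LHS candidates rather than among all possible plans.

```python
import math
import random

def latin_hypercube(m, n, rng):
    """Return m points in [0, 1]^n with exactly one point per interval on each axis."""
    columns = []
    for _ in range(n):
        perm = list(range(m))
        rng.shuffle(perm)  # random pairing of the m intervals of this axis
        columns.append([(k + rng.random()) / m for k in perm])
    return [tuple(col[i] for col in columns) for i in range(m)]

def min_distance(points):
    """Smallest Euclidean distance between any two points of a plan."""
    return min(math.dist(p, q)
               for i, p in enumerate(points) for q in points[i + 1:])

def maximin_lhs(m, n, n_candidates=200, seed=0):
    """Among random LHS plans, keep the one with the largest minimum distance."""
    rng = random.Random(seed)
    plans = [latin_hypercube(m, n, rng) for _ in range(n_candidates)]
    return max(plans, key=min_distance)

plan = maximin_lhs(m=10, n=2)
```

Increasing `n_candidates` improves the chance of finding a well-spread plan, at a cost that stays negligible compared with a single FDTD simulation.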
Figure 12.7 LHS for M = 10 points and two variables having uniform distribution (arbitrary units for
the x- and y-axes).
sampling (NLHS) [24, 25]. The principle of this technique is to complement the
existing LHS plan and keep, at least approximately, the LHS plan configuration.
With an initial LHS designed for N variables and M samples, adding a new point
leads to an LHS for N variables and M+1 samples. According to the LHS approach,
the range of each variable is divided into M+1 equally probable intervals and M+1
sample points should be placed in these intervals. The new intervals are by
definition smaller than the initial ones, so all the existing points are located in
different intervals, and NLHS will keep the existing points and only add a point to
satisfy the Latin hypercube criterion (only one sample in each row and each
column).
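A minimal sketch of this enrichment step is given below, assuming variables uniform on [0, 1]. The function name `add_nested_lhs_point` and the four-point initial plan are hypothetical. On each axis, the new point is placed in one of the M+1 new intervals that contains no existing point; with M existing points, at least one such interval always exists.

```python
import random

def add_nested_lhs_point(points, rng):
    """Add one point to an M-point plan so that, on every axis, it falls
    in one of the M+1 new intervals that holds no existing point."""
    m_new = len(points) + 1
    new_point = []
    for d in range(len(points[0])):
        occupied = {min(int(p[d] * m_new), m_new - 1) for p in points}
        empty = [k for k in range(m_new) if k not in occupied]
        k = rng.choice(empty)  # at least one interval is always empty
        # stay away from the interval edges to avoid rounding surprises
        new_point.append((k + 0.1 + 0.8 * rng.random()) / m_new)
    return points + [tuple(new_point)]

rng = random.Random(1)
# hypothetical initial LHS with M = 4 points and N = 2 variables
initial = [(0.10, 0.70), (0.35, 0.20), (0.60, 0.90), (0.85, 0.45)]
enriched = add_nested_lhs_point(initial, rng)
```

The existing simulations are kept untouched, which is the point of the approach when each of them costs hours of computation.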
Figure 12.8 (a) The LHS design obtained with interval selection with N = 2 (dimensions) and M = 6
(intervals) and (b) a different LHS design obtained with the same constraints.
The use of alternative models requires a method to validate these surrogate models.
Studies have been performed on the assessment of the accuracy of a model, and
the first intuitive approach is to analyze the prediction errors. In statistical
data analysis, the variability of the data set is measured through different sums of
squares, in particular the total sum of squares (known as SST or TSS) and the
residual sum of squares (known as RSS or SSR). SST and SSR are given by
SST = \sum_i (y_i - \bar{y})^2    (12.7)
SSR = \sum_i (y_i - \hat{y}_i)^2    (12.8)
where 𝑦i and 𝑦̂i are, respectively, the observed and predicted values and 𝑦̅ is the
mean value given by:
\bar{y} = \frac{1}{N} \sum_{i=1}^{N} y_i    (12.9)
build a model and one experiment to test it. If N is large enough, then the accuracy
of a model based on (N−1) experiments is similar to that of a model based on N
experiments. If we consider the N simulations, 𝑌 = {𝑦1 , 𝑦2 , … , 𝑦𝑖 , … , 𝑦𝑁 }, that
have been performed and if one notes 𝑀̂−𝑖 the model based on the (N−1) simulations
{𝑦1 , 𝑦2 , … , 𝑦𝑁 } − {𝑦𝑖 }, then we can estimate the mean square error of the model using
Err_{LOO} = \frac{1}{N} \sum_{i=1}^{N} \left( M(x_i) - \hat{M}_{-i}(x_i) \right)^2    (12.12)
where 𝜎̂𝑦2 is the variance of the outputs Y. Having Q2 close to one indicates a good
generalization ability of the surrogate model.
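The leave-one-out procedure can be sketched as follows for a one-dimensional polynomial surrogate. Two points are assumptions of this sketch and not statements from the text: the surrogate is an ordinary polynomial fit (`numpy.polyfit`), and Q2 is computed with the usual definition Q2 = 1 − Err_LOO / σ̂y². The function name `loo_q2` and the test function are hypothetical.

```python
import numpy as np

def loo_q2(x, y, degree):
    """Leave-one-out error and Q2 of a polynomial surrogate of given degree."""
    n = len(x)
    sq_err = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i                      # drop the i-th simulation
        coeffs = np.polyfit(x[keep], y[keep], degree)  # model built on N-1 points
        sq_err[i] = (y[i] - np.polyval(coeffs, x[i])) ** 2
    err_loo = sq_err.mean()
    q2 = 1.0 - err_loo / np.var(y)  # assumed definition of Q2
    return err_loo, q2

x = np.linspace(-1.0, 1.0, 12)
y = x**3 - x                      # hypothetical "simulation" outputs
err_cubic, q2_cubic = loo_q2(x, y, degree=3)
err_lin, q2_lin = loo_q2(x, y, degree=1)
```

A surrogate rich enough to represent the response (here degree 3) yields a Q2 close to one, while the linear surrogate does not.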
The other quite simple and intuitive method is the bootstrap [26], which does not
require any information other than that available in the sample. This
approach is based on constructing a number of new samples (called bootstrap
samples or resamples), each obtained by random sampling with replacement from the
original sample and having the same size as the observed dataset.
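As an illustration, the resampling step can be written in a few lines. The sample values and the function name `bootstrap_means` are hypothetical; the statistic studied here is the mean, and the spread of the resampled means gives an estimate of its standard error.

```python
import random
import statistics

def bootstrap_means(sample, n_resamples, seed=0):
    """Mean of each bootstrap resample (same size as the sample, drawn with replacement)."""
    rng = random.Random(seed)
    n = len(sample)
    return [statistics.fmean(rng.choices(sample, k=n)) for _ in range(n_resamples)]

sample = [1.2, 0.8, 1.9, 1.4, 1.1, 0.7, 1.6, 1.3]  # hypothetical observed data
means = bootstrap_means(sample, n_resamples=2000)
std_error = statistics.stdev(means)  # bootstrap estimate of the standard error of the mean
```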
F(x) = \sum_{i=0}^{n} y_i \prod_{j=0, j \neq i}^{n} \frac{x - x_j}{x_i - x_j}    (12.14)
If the polynomial approximation cannot be done, then a linear regression can
be performed to model the relationship between a response variable and explanatory
variables. For instance, consider a set of data relating the m components of a
vector 𝑦 to explanatory variables having n components:
\{y_1, x_{11}, x_{12}, \ldots, x_{1n}\}, \{y_2, x_{21}, \ldots, x_{2n}\}, \ldots, \{y_p, x_{p1}, \ldots, x_{pn}\}, \ldots, \{y_m, x_{m1}, \ldots, x_{mn}\}    (12.15)
y_p = \sum_i \beta_i x_{pi} + \varepsilon_p = x_p^T \beta + \varepsilon_p    (12.16)
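\beta = (\beta_1, \ldots, \beta_n)^T, \quad y = (y_1, \ldots, y_m)^T, \quad \varepsilon = (\varepsilon_1, \ldots, \varepsilon_m)^T    (12.19)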
Figure 12.9 Example of linear regression in one-dimension distribution (arbitrary units for the x-
and y-axes).
\hat{\beta} = (x^T x)^{-1} x^T y    (12.20)
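The least-squares estimate (12.20) can be checked numerically with a short sketch. The data below are synthetic and noise-free (an assumption made so that the exact coefficients are recovered), and the variable names are hypothetical. The normal equations are solved directly, then compared with `numpy.linalg.lstsq`, which is usually preferable when the matrix is poorly conditioned.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 50, 3                               # m observations, n explanatory variables
x = rng.uniform(-1.0, 1.0, size=(m, n))    # design matrix, one row per observation
beta_true = np.array([2.0, -1.0, 0.5])     # hypothetical coefficients
y = x @ beta_true                          # noise-free responses

# Normal equations: (x^T x) beta = x^T y
beta_hat = np.linalg.solve(x.T @ x, x.T @ y)

# Same estimate through a numerically more robust least-squares solver
beta_lstsq, *_ = np.linalg.lstsq(x, y, rcond=None)
```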
As explained previously, the Monte Carlo method has been and still is the most
commonly used method when statistics are evaluated on the outputs. The
disadvantage of this method is its low rate of convergence with the number of
simulations, m: the error decreases proportionally to the inverse of the square
root of m, that is, as O(1/\sqrt{m}). If higher-order moments such as the variance are needed,
this method is often prohibitive.
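The 1/√m behavior can be observed with a toy experiment, sketched below under simple assumptions: the "model" is just a uniform random number on [0, 1] whose mean (0.5) is estimated, and the function name `rms_error_of_mean` is hypothetical. Multiplying the number of samples by 100 should divide the error by about 10.

```python
import math
import random

def rms_error_of_mean(m, n_repeats, rng):
    """RMS error of the Monte Carlo estimate of the mean of U[0, 1] using m samples."""
    true_mean = 0.5
    sq = 0.0
    for _ in range(n_repeats):
        estimate = sum(rng.random() for _ in range(m)) / m
        sq += (estimate - true_mean) ** 2
    return math.sqrt(sq / n_repeats)

rng = random.Random(0)
err_small = rms_error_of_mean(100, 100, rng)      # m = 100 samples
err_large = rms_error_of_mean(10_000, 100, rng)   # m = 10,000 samples
ratio = err_small / err_large                     # expected to be close to 10
```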
In electromagnetics, methods based on the stochastic finite element were
introduced [27–29] in the past decades. These methods had already been used in
other fields such as mechanics and fluid dynamics to incorporate random
fluctuations in the deterministic finite-element method. The key results on which
these approaches are based are due to Norbert Wiener [30], who used Hermite
polynomials to model stochastic processes with Gaussian random
variables. A few years later, Cameron and Martin [31] showed that such an
expansion converges (in the ℒ 2 sense) for any arbitrary stochastic process with finite second
moment. This constraint is not restrictive since most physical systems comply
with it. Recently, studies [20, 32, 33] have contributed to the development and use of
stochastic methods in the engineering domain and have provided a mathematical
framework to manage the variability of the inputs in numerical calculations.
The methods that use polynomial chaos (PC) expansion can be divided into two
broad categories: the intrusive methods, which require modifying the simulation code
of the solver, and the nonintrusive methods, which use solvers as black boxes.
The first category is strongly dependent on the simulation code and requires
manipulation of the governing equations that can be very complex and
analytically cumbersome. This complexity of intrusive approaches explains the
increasing attention given to the nonintrusive methods, which, using the complex
codes as black boxes, are more easily generalizable. The nonintrusive approaches
may themselves be divided into two categories: the stochastic collocation methods
and the stochastic spectral methods. The stochastic
collocation method, in which the polynomial approximation is constrained to fit
exactly the model response at a suitable point set, relies upon well-established
results on Lagrange polynomial interpolation [34]. In the spectral
methods, the polynomial chaos coefficients are estimated using spectral
projections or least-square regressions. In this section we will pay specific
attention to these spectral methods.
The aim of this chapter is to consider the variations of the outputs of a
physical phenomenon or system induced by the variations of the inputs. Therefore,
of interest is a mathematical model 𝑀 having M inputs, 𝑦 = 𝑀(𝑥), with the inputs
x affected by some possible random variations or uncertainties. Because of that, a
probabilistic framework needs to be defined.
Let us note the probability space (Ω, ℱ, 𝒫), where Ω is the event space
equipped with the σ-algebra ℱ and the probability measure 𝒫. In the following, the M random
variables are noted by uppercase letters, 𝑋(𝜔): Ω ⟶ 𝒟𝑥 ⊂ ℝ𝑀 ; their realizations are
noted by the corresponding lowercase letters (e.g., 𝑥).
Let us also note ℒ 2 (Ω, ℱ, 𝒫, ℝ) the space of square-integrable real-valued
functions equipped with the inner product:
\langle X_1, X_2 \rangle_{\mathcal{L}^2} = E[X_1 X_2] = \int_{\Omega} X_1(\omega) X_2(\omega) \, d\mathcal{P}(\omega) = \int_{D_x} x_1 x_2 f_{X_1,X_2}(x_1, x_2) \, dx_1 dx_2    (12.21)
where 𝑓𝑋1,𝑋2 is the joint probability density function (PDF) of the vector {𝑋1 , 𝑋2 }.
This inner product also provides a norm:
\| X \| = \sqrt{E[X^2]}    (12.22)
Under this formalism, and assuming that the components of the input random
vector are independent, any scalar-valued model ℳ: ℝM ⟼ ℝ having a
random response 𝑌(ω) = 𝑀(𝑋(ω)) with a finite second-order moment Ε(𝑌 2 ) <
+∞ can be described [35] using an infinite modal expansion such as:
Y = M(X) = \sum_{k \in \mathbb{N}^M} \alpha_k \psi_k(X)    (12.23)
\langle \phi_j^{(i)}(X_i), \phi_k^{(i)}(X_i) \rangle = E[\phi_j^{(i)}(X_i) \, \phi_k^{(i)}(X_i)] = \delta_{j,k}    (12.25)
\psi_k(x) = \prod_{i=1}^{M} \phi_{k_i}^{(i)}(x_i)    (12.26)
The PCE was originally formulated with standard Gaussian random variables
and Hermite polynomials. It was later extended to other classical random variables
together with their associated orthogonal polynomial bases. The decomposition is then often referred to as
Generalized PCE (GPCE). Table 12.1 provides some of the most common
continuous distributions and the associated families of polynomials.
Table 12.1
Example of Relationship Between Families of Orthogonal Polynomials
in GPCE and the Usual Input Distributions
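(n + 1) P_{n+1}(x) = (2n + 1) x P_n(x) - n P_{n-1}(x), \quad n \in \mathbb{N}    (12.28)
\int P_k(x) P_l(x) \frac{\mathbf{1}_{[-1,1]}(x)}{2} \, dx = \frac{1}{2k + 1} \delta_{k,l}    (12.29)
P_1(x) = x, \quad P_2(x) = \frac{1}{2}(3x^2 - 1), \quad P_3(x) = \frac{1}{2}(5x^3 - 3x)    (12.30)
He_{-1}(x) = He_0(x) = 1    (12.31)
He_{n+1}(x) = x He_n(x) - n He_{n-1}(x), \quad n \in \mathbb{N}    (12.32)
\int He_m(x) He_n(x) \frac{1}{\sqrt{2\pi}} e^{-x^2/2} \, dx = n! \, \delta_{m,n}    (12.33)
He_1(x) = x, \quad He_2(x) = x^2 - 1, \quad He_3(x) = x^3 - 3x    (12.34)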
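The recurrence relations and the orthogonality properties of the Legendre and (probabilists') Hermite polynomials can be verified numerically. In the sketch below, the function names (`legendre`, `hermite_e`, `legendre_gram`, `hermite_gram`) are hypothetical; the expectations are computed with the Gauss quadrature rules provided by NumPy (`leggauss` for the weight 1 on [−1, 1], `hermegauss` for the weight exp(−x²/2)).

```python
import math
import numpy as np

def legendre(n, x):
    """P_n(x) from (n+1) P_{n+1} = (2n+1) x P_n - n P_{n-1}, with P_0 = 1 and P_1 = x."""
    p_prev, p = np.ones_like(x), x
    if n == 0:
        return p_prev
    for k in range(1, n):
        p_prev, p = p, ((2 * k + 1) * x * p - k * p_prev) / (k + 1)
    return p

def hermite_e(n, x):
    """He_n(x) from He_{n+1} = x He_n - n He_{n-1}, with He_0 = 1 and He_1 = x."""
    h_prev, h = np.ones_like(x), x
    if n == 0:
        return h_prev
    for k in range(1, n):
        h_prev, h = h, x * h - k * h_prev
    return h

xg, wg = np.polynomial.legendre.leggauss(8)      # nodes/weights on [-1, 1]
xh, wh = np.polynomial.hermite_e.hermegauss(8)   # nodes/weights for exp(-x^2/2)

def legendre_gram(k, l):
    """E[P_k(X) P_l(X)] for X uniform on [-1, 1] (density 1/2)."""
    return float(np.sum(wg / 2.0 * legendre(k, xg) * legendre(l, xg)))

def hermite_gram(m, n):
    """E[He_m(X) He_n(X)] for X standard Gaussian."""
    scale = math.sqrt(2.0 * math.pi)
    return float(np.sum(wh / scale * hermite_e(m, xh) * hermite_e(n, xh)))
```

Dividing P_k by the square root of 1/(2k + 1) and He_n by the square root of n! gives orthonormal families, which is the normalization assumed when variances are read directly from squared coefficients.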
If the statistical distributions of the input data used in a problem are not those
associated with a well-defined family of polynomials, then
an iso-probabilistic transformation can be used. Let us denote P as the probability
governing a random variable 𝑋, and 𝐹𝑋 (𝑥) as the cumulative distribution function (CDF)
of the random variable 𝑋, which is monotone and defined as:
F_X(x) = P(X \le x)    (12.35)
With Y = F_X(X), the CDF of Y is
F_Y(u) = P(Y \le u) = P(F_X(X) \le u) = P(X \le F_X^{-1}(u)) = F_X(F_X^{-1}(u))    (12.36)
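For a uniform input, the iso-probabilistic transformation reduces to an affine map, as in the following sketch. The function names `to_standard` and `to_physical` are hypothetical. The physical variable is first mapped to u = F_X(x), which is uniform on [0, 1], and then to the standard variable on [−1, 1] expected by the Legendre polynomials.

```python
def to_standard(x, low, high):
    """Map x, uniform on [low, high], to the standard variable uniform on [-1, 1]."""
    u = (x - low) / (high - low)   # u = F_X(x), uniform on [0, 1]
    return 2.0 * u - 1.0           # inverse CDF of the uniform law on [-1, 1]

def to_physical(xi, low, high):
    """Inverse map, from the standard variable back to the physical one."""
    return low + (xi + 1.0) * (high - low) / 2.0

# Example with a hypothetical angle uniformly distributed on [0, 30] degrees
xi_mid = to_standard(15.0, 0.0, 30.0)   # center of the interval maps to 0
```

For non-uniform inputs the same two steps apply, with F_X replaced by the CDF of the input and the second step by the inverse CDF of the standard variable.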
P = C_{N+M}^{N} = \frac{(N + M)!}{N! \, M!}    (12.37)
Let us note 𝑌̂ as a truncation of the GPCE:
\hat{Y} = \sum_{k=0}^{N-1} \alpha_k \psi_k(X)    (12.38)
𝑌̂ represents the surrogate model we are looking for and will be the substitute
to the complex and cumbersome FDTD calculations. The truncation and the
polynomials involved in this substitution model influence the accuracy of
such a model. The validation methods described previously will have to be used to
check the validity of such a model and adapt it if necessary through the number and
type of polynomials used in 𝑌̂. The next step is to assess the expansion coefficients.
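\alpha_m = E[M(X) \psi_m(X)] = \int M(x) \psi_m(x) f_X(x) \, dx    (12.39)
\alpha_m = \sum_{k=0}^{\infty} \alpha_k \int \psi_k(x) \psi_m(x) f_X(x) \, dx    (12.40)
\alpha_m = \sum_{k=0}^{\infty} \alpha_k \langle \psi_k, \psi_m \rangle = \sum_{k=0}^{\infty} \alpha_k \delta_{k,m}    (12.41)
\alpha_m \approx \hat{\alpha}_m = \sum_{i=1}^{L} w_i M(x_i) \psi_m(x_i)    (12.42)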
The accuracy depends on the number and choice of the sampling points. The
simplest method is to use the Monte Carlo method. In this case the
standard error decreases as L−1/2, which induces a low convergence rate, a
well-known drawback of Monte Carlo simulations. Other methods such as
Latin hypercube sampling and quasi-random or low-discrepancy sequences are
more efficient than the Monte Carlo method (MCM), but they still require a large
number of simulations, which grows with the number of inputs and of requested coefficients.
Such a large number of calculations is often not compatible with the FDTD
constraints.
Figure 12.10 Collocation points with sparse grids on the left, with a tensorial product on the right. (a)
Sparse grids. (b) Tensorial product.
An alternative approach for selecting the integration nodes and weights is the
use of quadrature techniques, but the main drawback of this approach is still the
curse of dimensionality. For multiple input variables, the basic method based on the
tensor product requires the use of L^N points, where N is the number of random input
variables and L is the number of points used by the one-dimensional quadrature.
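A one-dimensional sketch of the projection by quadrature is given below. The assumptions are the following: the input is uniform on [−1, 1], the basis is the orthonormal Legendre family, and the "model" is simply the exponential function, chosen because its exact moments are known. The names `psi` and `project` are hypothetical.

```python
import math
import numpy as np
from numpy.polynomial import legendre

def psi(k, x):
    """Orthonormal Legendre polynomial of degree k for X uniform on [-1, 1]."""
    c = np.zeros(k + 1)
    c[k] = 1.0
    return math.sqrt(2 * k + 1) * legendre.legval(x, c)

def project(model, n_terms, n_nodes):
    """Estimate the first n_terms coefficients with an n_nodes Gauss-Legendre rule."""
    nodes, weights = legendre.leggauss(n_nodes)
    weights = weights / 2.0   # the uniform density on [-1, 1] is 1/2
    return np.array([np.sum(weights * model(nodes) * psi(k, nodes))
                     for k in range(n_terms)])

alpha = project(np.exp, n_terms=8, n_nodes=12)   # 12 model evaluations only
n_tensor_points = 12 ** 4   # same rule in 4 dimensions with a full tensor product
```

The first coefficient is the mean of the response and the sum of the squared coefficients approaches its second moment. With four inputs, the same 12-point rule would already require 12^4 = 20,736 model evaluations.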
Table 12.2
Number of Simulations versus Order and Number of Uncertain Variables for Sparse Grid
         Number of uncertain variables
Order      1      2      3      4
1          3      5      7      9
2          5     13     25     41
3          9     29     69    137
4         17     65    177    401
In order to bypass this issue, sparse quadrature schemes based on the Smolyak
algorithm [38], which builds a multidimensional sparse grid, can
be used to reduce the simulation effort. Advanced methods such as the Clenshaw-Curtis
formulation [39] can reduce the number of collocation points even more
(see Figure 12.10).
In spite of these efforts, as shown in Table 12.2, the number of simulations
required is still large even for advanced methods involving the Clenshaw-Curtis
rule. In conclusion, the quadrature approach combined with sparse grids can be
used for problems involving a small number of variables. In practical problems the
number of inputs is often higher than 3 or 4, and in this case the projection
approach is not really compatible with FDTD.
The calculation of the GPCE coefficients using regression aims at computing the
GPCE coefficients that minimize the mean-square error of approximation of the
model response. Consider a model 𝑌̂ that has been built using a truncated GPCE:
\hat{Y} = \sum_{k=0}^{N-1} \alpha_k \psi_k(X)    (12.43)
If the model has M variables and we want to build a model of order p,
according to (12.37) the number of coefficients is (M + p)!/(p! M!) (e.g., 70 if the model has
four inputs and we start with a polynomial order of 4).
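The regression approach can be sketched on a small two-variable case. Everything specific in this example is an assumption made for illustration: the inputs are uniform on [−1, 1], the basis is the orthonormal Legendre family of total degree 2, the "model" is a simple polynomial (so that the exact coefficients are known), and the sample is drawn at random rather than with LHS. The helper names are hypothetical.

```python
import itertools
import math
import numpy as np
from numpy.polynomial import legendre

def psi_1d(k, x):
    """Orthonormal Legendre polynomial of degree k for X uniform on [-1, 1]."""
    c = np.zeros(k + 1)
    c[k] = 1.0
    return math.sqrt(2 * k + 1) * legendre.legval(x, c)

def basis_indices(n_vars, order):
    """Multi-indices of all polynomials with total degree <= order."""
    return [k for k in itertools.product(range(order + 1), repeat=n_vars)
            if sum(k) <= order]

def design_matrix(x, indices):
    """One column per multivariate polynomial, one row per sample point."""
    cols = [np.prod([psi_1d(ki, x[:, i]) for i, ki in enumerate(k)], axis=0)
            for k in indices]
    return np.column_stack(cols)

rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, size=(40, 2))                # 40 "simulations"
y = 1.0 + 2.0 * x[:, 0] + 0.5 * x[:, 0] * x[:, 1]       # hypothetical model

indices = basis_indices(n_vars=2, order=2)              # 6 polynomials
A = design_matrix(x, indices)
alpha, *_ = np.linalg.lstsq(A, y, rcond=None)           # least-squares coefficients
coef = dict(zip(indices, alpha))
```

The number of sample points (40) is kept well above the number of coefficients (6) so that the least-squares problem remains well conditioned.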
The construction of a surrogate model using the polynomial chaos can follow the
process described in Figure 12.11. The first step is to identify and characterize the
input variables, and the second step is to identify and build the polynomial
family (see the previous section) according to the characteristics of
the inputs (e.g., Legendre polynomials for uniform inputs). After that, the
computational budget has to be taken into account. The size of the LHS (see the previous
section) has to take into account the number of inputs (given by the problem) and
the degree of the GPCE we want to start with. For example, with the previous example
(four inputs and a polynomial order of 4) we need 70 coefficients, so the LHS
has to contain more than 70 points, and enough of them to avoid a bad conditioning of the
information matrix of the least-square formulation. The coefficients are then obtained
through a regression. The quality of the surrogate model can be tested using the
leave-one-out method. If the accuracy is in line with the target accuracy, then the
surrogate model is ready; otherwise, new points have to complement the initial LHS
(i.e., new FDTD simulations have to be performed) until the quality reaches the
target accuracy.
Figure 12.11 Computational scheme of the surrogate built using full GPCE.
Figure 12.12 Cardinal of the GPCE basis with 5th maximum order versus the number of variables
(horizontal axis: number of variables; vertical axis: cardinal of the GPCE).
With a GPCE that uses all the polynomials, the cardinal of the GPCE basis
is given by (12.37). With two input variables and a polynomial order up to 7, the
cardinal of the GPCE basis is 36, while it is 792 for the same order but with 5 input
variables. As shown in Figure 12.12, the number of polynomials can be huge, and
this constraint is known as the curse of dimensionality.
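The cardinal of the full basis is a binomial coefficient and can be evaluated directly. The function name `gpce_cardinality` is hypothetical; the values printed by this sketch reproduce the figures quoted in the chapter.

```python
from math import comb

def gpce_cardinality(n_vars, order):
    """Number of polynomials of total degree <= order in n_vars variables."""
    return comb(n_vars + order, order)

card_2_7 = gpce_cardinality(2, 7)     # two inputs, order 7
card_5_7 = gpce_cardinality(5, 7)     # five inputs, order 7
card_4_4 = gpce_cardinality(4, 4)     # four inputs, order 4
card_10_5 = gpce_cardinality(10, 5)   # ten inputs, order 5
```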
In fact, as we can imagine, not all the polynomials have the same importance in
the GPCE truncation. Studies have been performed to build iteratively a sparse
polynomial chaos expansion for uncertainty propagation and sensitivity analysis
[40]. The objective, as shown in Figure 12.13, is to select the most important
polynomials under the constraint of a constant cardinal of the
polynomial basis.
Figure 12.13 Example of selection of polynomials in a 2-D case. The x- and y-axes represent the
order of the univariate polynomials. (a) shows a full GPCE and (b) shows a sparse GPCE.
(a) shows, in the red boxes, the polynomials having an order below 5. (b) shows, in the
green boxes, a possible selection with fewer polynomials than in (a) but with
higher-order polynomials. In both cases, the blue
crosses represent the possible polynomials having a pure order below 7.
Among the approaches that have been studied to build the sparse GPCE, there
is a method based on the sparsity-of-effects principle, which states that most
models are principally governed by the main effects and low-order interactions.
Within these approaches we have, for instance, the hyperbolic index sets [41] that
are quite easy to implement (select the polynomial having a global order below a
hyperbolic curve) and have been used in mechanical problems but seem less
relevant in electromagnetism.
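A hyperbolic index set can be sketched in a few lines. The selection rule used here is an assumption, since the chapter does not give the formula: a multi-index k is kept when its q-norm, (Σ k_i^q)^(1/q) with 0 < q ≤ 1, does not exceed the maximum order p. With q = 1 the full basis is recovered; a smaller q removes the high-order interaction terms. The function name `hyperbolic_index_set` is hypothetical.

```python
import itertools

def hyperbolic_index_set(n_vars, order, q):
    """Multi-indices whose q-norm is below the maximum order (assumed rule)."""
    kept = []
    for k in itertools.product(range(order + 1), repeat=n_vars):
        q_norm = sum(ki ** q for ki in k) ** (1.0 / q)
        if q_norm <= order + 1e-9:   # small tolerance for floating-point rounding
            kept.append(k)
    return kept

full = hyperbolic_index_set(n_vars=4, order=8, q=1.0)     # full truncation
sparse = hyperbolic_index_set(n_vars=4, order=8, q=0.5)   # hyperbolic truncation
```

The pure univariate polynomials up to order 8 are always kept, while the mixed terms are kept only at low order, in line with the sparsity-of-effects principle.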
Though sparse polynomial chaos expansions based on the least angle regression method
(LAR, also known as “LARS”) [42] and on the least absolute shrinkage and selection
operator method (LASSO) [43] are not easy to implement, they are much more
efficient and are well adapted for engineering problems [44], including
electromagnetics and bio-electromagnetism problems.
Among a large set of polynomials forming a full truncation, the LARS
objective is to select iteratively those polynomials having the greatest impact from
the point of view of their correlation with the residual. As a consequence, the
algorithm chooses, one by one, the polynomials in descending order of influence.
Therefore, this method provides many possible truncations having increasing sizes.
Accordingly, in the computational scheme, the step that calculates the GPCE
coefficients using regression has to be replaced, as shown in Figure 12.14, by a new one
that selects the polynomials using LARS and then selects,
using LOOCV, the best truncation.
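The idea of selecting polynomials by their correlation with the residual can be illustrated with a deliberately simplified stand-in. The sketch below is not LARS: it is a plain greedy forward selection (at each step the most correlated polynomial is added and the coefficients are refitted by least squares), written only to show the mechanism. The one-dimensional setting, the deterministic sample, and the names `psi` and `greedy_selection` are assumptions of this example.

```python
import math
import numpy as np
from numpy.polynomial import legendre

def psi(k, x):
    """Orthonormal Legendre polynomial of degree k for X uniform on [-1, 1]."""
    c = np.zeros(k + 1)
    c[k] = 1.0
    return math.sqrt(2 * k + 1) * legendre.legval(x, c)

def greedy_selection(design, y, n_select):
    """Greedy stand-in for LARS: pick the column most correlated with the residual."""
    active = []
    coeffs = np.zeros(0)
    residual = y.copy()
    for _ in range(n_select):
        corr = np.abs(design.T @ residual)
        for j in active:
            corr[j] = -np.inf            # never pick the same polynomial twice
        active.append(int(np.argmax(corr)))
        coeffs, *_ = np.linalg.lstsq(design[:, active], y, rcond=None)
        residual = y - design[:, active] @ coeffs
    return active, coeffs

x = np.linspace(-1.0, 1.0, 201)                           # deterministic sample
design = np.column_stack([psi(k, x) for k in range(7)])   # candidate polynomials
y = 2.0 * psi(1, x) + 0.5 * psi(4, x)                     # hypothetical sparse response
active, coeffs = greedy_selection(design, y, n_select=2)
```

Each intermediate active set is one candidate truncation; in the scheme described above, the best one would then be chosen with the leave-one-out error.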
Figure 12.14 Computational scheme of the surrogate built using sparse GPCE.
To verify the compliance of mobile phones with safety limits when used close to
the head, technical standards have defined two test positions close to the head
(known as the cheek and tilt positions). These configurations are used in a procedure
that uses a homogeneous phantom (known as SAM) designed to overestimate
human exposure. This approach is useful to ensure compliance with exposure
limits but does not address the need of epidemiological studies to characterize
the distribution of the exposure associated with various uses of phones. To
characterize such real exposure, it is of interest to investigate the impact of
variable phone positions on head exposure. As explained in previous sections, the usual
approaches based on the Monte Carlo method are not suitable for FDTD. To
overcome this limitation, and in line with the previous section, a sparse GPCE has
been used [45] to analyze the influence of variable phone usage on the SAR10g
(maximum SAR over 10 grams in the head) response. Figure 12.15 shows the
configuration studied: a phone located close to the head of the Duke human
phantom (part of the virtual family [6]). The handset model is a generic one
composed of a printed circuit board (PCB), a screen, a battery, and a patch antenna
located on the top of the phone model.
Four parameters, 𝑋1 , 𝑋2 , 𝑋3 , 𝑋4 , govern the rotation and translation of the
phone model (see Figure 12.15) relative to the head. The supports of the uniform
distributions of these parameters are, respectively, [0°, 30°], [−15°, 15°], [5 mm, 30
mm], and [−10 mm, +10 mm]. The procedure described previously has been used
to build a SAR10g surrogate model using a sparse approach. The initial
experimental design is composed of N = 25 points (x_1^{(1)}, x_2^{(1)}, x_3^{(1)}, x_4^{(1)}), …,
(x_1^{(i)}, x_2^{(i)}, x_3^{(i)}, x_4^{(i)}), …, selected using the LHS method, and it has been enriched
iteratively using NLHS.
Figure 12.15 Generic phone model located close to the head of the Duke human phantom. (a) Face
side view. (b) Phone side view.
The input variables are uniform; therefore, the Legendre orthogonal polynomials
are suitable for the GPCE. Since the GPCE inputs in the
case of Legendre polynomials must lie in [−1, 1], an iso-probabilistic transform has to
be used to link 𝑋1 , 𝑋2 , 𝑋3 , 𝑋4 and the standardized inputs of the GPCE Legendre
polynomials. The sparse GPCE has been obtained using the hyperbolic index sets and
using the LARS. For the hyperbolic approach, as illustrated in Table 12.3, the
sparse GPCE surrogate model produced by the iterative procedure reaches an
accuracy of 10^-2 (assessed using LOOCV) with a GPCE degree of p = 8 while
containing only 18 terms instead of the 495 terms required by the usual
full GPCE according to Equation (12.37).
As explained previously, the LARS is much more efficient than other sparse
approaches. In the present example, a LOOCV (1 − 𝑄2) accuracy of 1% is
obtained with a seventh order and 71 simulations (15 significant polynomial
coefficients, to be compared with the 330 required by a full GPCE). An accuracy of
0.1% is obtained with 103 simulations (i.e., 78 simulations added iteratively to the
25 initial ones using the NLHS); in this case, 27 polynomials are involved in the
surrogate model. Figure 12.16 shows the probability density function (PDF)
of the 122 simulations that have been performed.
Table 12.3
Order of GPCE Polynomials, Number of Simulations, and 𝑄2 of the Sparse SAR10g Surrogate GPCE
Model Obtained with the Iterative Process and the “Hyperbolic” Index Set
Order    Number of polynomials    Number of simulations    𝑄2
p=2      7                        30                       0.9
p=5      9                        43                       0.95
p=8      18                       88                       0.99
Figure 12.16 PDF of the 122 FDTD simulations that have been performed (horizontal axis: SAR10g).
Figure 12.17 PDF of the SAR10g based on different surrogate models (“full” GPCE, sparse
“hyperbolic” GPCE, and sparse LARS GPCE) and 10,000 positions of the phone model
relative to the head.
Figure 12.16 provides the PDF of the FDTD simulations that have been
performed, but even if the experimental design has been done with LHS, the
resulting PDF is not necessarily fully representative of that of the SAR10g linked
to the variations of the position of the phone model relative to the head. To assess
this statistical distribution, one can use the SAR10g surrogate model that has been
built and the Monte Carlo approach to generate a large number of outputs. Figure
12.17 shows the PDF of the SAR10g estimated using the surrogate models based
on “full” GPCE (order 3), sparse GPCE using the hyperbolic index set, and the
sparse GPCE built using LARS, evaluated at 10,000 points (x_1^{(i)}, x_2^{(i)}, x_3^{(i)}, x_4^{(i)}) selected
using a usual Monte Carlo process.
Sensitivity analysis (SA) studies how the uncertainty in the output of
a mathematical model or system can be apportioned to the different sources of
uncertainty in its inputs. SA is often divided into local and global sensitivity analysis. Local
SA addresses the influence on the outputs of small variations of the inputs in the
vicinity of given values. Global SA quantifies the output uncertainties due to
changes of the inputs over their whole domains of variation. Different methods exist [46]
to perform SA; among them, the variance-based methods, also known as ANOVA
(analysis of variance) with the Sobol decomposition [47], are often used. With this
ANOVA approach, the response Y = M(x) of a system having finite variance and
independent inputs can be decomposed [48] into main effects and interactions.
The response variance D = Var[Y] can be decomposed into partial variances
D_{i,j,k} = \mathrm{Var}\left[ E[Y \mid X_i = x_i, X_j = x_j, X_k = x_k] \right] - D_{i,j} - D_{i,k} - D_{j,k} - D_i - D_j - D_k    (12.46)
where E[Y | X_i = x_i] is the mean model response when the i-th input parameter is
kept fixed at a given value 𝑥𝑖 ; the more this conditional mean varies as a function
of 𝑥𝑖 , the greater its variance. The partial variance 𝐷𝑖
measures the contribution of 𝑋𝑖 alone to the uncertainty (variance) in Y (averaged
over variations in the other variables). The Sobol indices, which are known to be good
descriptors of the sensitivity of the model response to its input parameters since
they do not assume any kind of linearity of the model, are defined as
S_{i_1,\ldots,i_s} = \frac{D_{i_1,\ldots,i_s}}{D}    (12.47)
The Sobol indices are usually estimated using Monte Carlo
approaches, which are impractical when each calculation is costly. The use of a
surrogate model alleviates the procedure. In the case of GPCE, the calculations are
greatly reduced since the knowledge of the coefficients of the GPCE gives the
Sobol indices without further calculation. Indeed, thanks to the orthonormality of
the polynomials involved in the GPCE, the total and partial variances can be
assessed using the total or partial sums of the squared coefficients. With the same
notation as in (12.44), the total variance is given by:
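\hat{D} = \mathrm{Var}(\hat{Y}) = \sum_{k=1}^{N-1} \alpha_k^2    (12.48)
\hat{D}_{i_1,\ldots,i_s} = \sum_{k \in I_{i_1,\ldots,i_s}} \alpha_k^2    (12.49)
where
I_{i_1,\ldots,i_s} = \{ k : k_j > 0 \iff j \in \{i_1, \ldots, i_s\} \}    (12.50)
with k_j the degree in x_j of the k-th polynomial of the expansion.
Equation (12.49) shows that the partial variances \hat{D}_{i_1,\ldots,i_s} are obtained by
summing up the squared coefficients of the polynomials that depend only
on x_{i_1}, …, x_{i_s}.
\hat{S}_{i_1,\ldots,i_s} = \frac{\sum_{k \in I_{i_1,\ldots,i_s}} \alpha_k^2}{\sum_{k=1}^{N-1} \alpha_k^2}    (12.51)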
The total sensitivity indices 𝑆𝑖𝑇 have also been defined to quantify the total
effect of an input parameter on the output. They are defined as the sum of all the
partial sensitivity indices 𝑆i1,…is involving parameter i:
S_i^T = 1 - \frac{D_{\sim i}}{D}    (12.52)
where D_{\sim i} is the sum of the partial variances that do not involve parameter i.
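Once the coefficients are known, the Sobol indices follow from simple sums, as in the sketch below. It assumes an orthonormal basis and represents each polynomial by its multi-index (the tuple of degrees in each variable); the function name `sobol_indices` and the coefficient values are hypothetical.

```python
def sobol_indices(coefficients):
    """Partial and total Sobol indices from coefficients on an orthonormal basis.

    coefficients maps a multi-index (degree in each variable) to a coefficient.
    """
    # total variance: all squared coefficients except the constant term
    total_var = sum(a * a for k, a in coefficients.items() if any(k))
    partial = {}
    for k, a in coefficients.items():
        group = tuple(i + 1 for i, ki in enumerate(k) if ki > 0)  # variables involved
        if group:
            partial[group] = partial.get(group, 0.0) + a * a / total_var
    n_vars = len(next(iter(coefficients)))
    total = {i: sum(s for g, s in partial.items() if i in g)
             for i in range(1, n_vars + 1)}
    return partial, total

# hypothetical coefficients of a two-variable expansion
coeffs = {(0, 0): 1.0, (1, 0): 2.0, (0, 1): 1.0,
          (2, 0): 1.0, (1, 1): 1.0, (0, 2): 0.0}
partial, total = sobol_indices(coeffs)
```

In this example the partial indices sum to one, and the total index of a variable adds its own contribution to those of the interactions in which it takes part.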
For instance, the GPCE coefficients estimated in Section 12.7.4 can be used to
perform the sensitivity analysis of the head exposure with respect to the four
parameters, 𝑋1 , 𝑋2 , 𝑋3 , 𝑋4 , governing the rotation and translation of the phone model
relative to the head (see Figure 12.15). Figure 12.18 shows the Sobol indices
estimated using the GPCE coefficients. These indices show that the four most
important indices are S1, S3, S12 and S13. The total sensitivity indices have also
been estimated. The most important are 𝑆1𝑇 and 𝑆3𝑇 , which are, respectively, about
85% and 10%. The least important are 𝑆2𝑇 and 𝑆4𝑇 , which contribute less than 5%
together. This analysis shows that the most important parameter is the rotation in
the plane composed of the ears and mouth.
The signature of the GPCE is also of great interest to analyze the importance
of the polynomials involved in the GPCE. Such an analysis has been performed to
study the variation of the field scattered by building facades. The initial
analysis method is based on Green’s functions of a semi-infinite medium [49].
The method is fast, but not fast enough to perform a large statistical study. GPCE has
been used to perform the stochastic analysis of the field scattered by building facades
[50], where the number of required input samples has been reduced by more than
one order of magnitude compared to a Monte Carlo approach for the same precision
in the output distribution. As shown in Figure 12.19, the problem has eight input variables [49,
50] (height and width of the windows and façade, separation distances between
windows and between windows and façade edges).
Figure 12.18 Sobol indices estimated using the GPCE coefficients (vertical axis: Sobol indices).
Figure 12.19 Building façade configuration and its input variables (labels shown in the figure: W, H,
D1, D2, D3, D4; x- and y-axes).
Figure 12.20 Coefficient values for all the polynomials involved in the full GPCE (horizontal axis:
standardized order of chaos polynomials; groups labeled “Pure order of degree 1” and “Mixed
order of degree 2”).
Uniform distribution and NLHS have been used in this study [50]. Figure
12.20 shows the signature of the GPCE composed of the coefficient values for all
the polynomials involved in the GPCE. The GPCE’s signature helps to identify the
most important polynomial and is therefore a valuable complement to the
sensitivity analysis.
12.8 KRIGING
The Kriging method, also known as Gaussian process (GP) regression, is another
method to build surrogate models. The method is named after the South
African engineer D. G. Krige, who initiated it [51]. The formalism and the
popularization of the method are based on the work of Georges Matheron [52].
This approach, and more generally the geostatistical methods, interprets the
sampled data as the result of a random process. This does not mean that a
phenomenon such as the RF exposure is produced by a random phenomenon; rather, this
approach provides a well-defined mathematical framework to manage
the spatial inference of quantities in unobserved areas and to quantify the
uncertainty associated with the estimator.
\hat{y}(x^{(0)}) = \hat{y}^{(0)} = \sum_i \lambda_i y^{(i)}    (12.53)
Variograms and covariances are often used in geostatistics, and in Kriging in particular.
These are functions describing the degree of spatial dependence of a spatial
random field or stochastic process. The covariance between the random variables
𝑋(𝑥, 𝜔) and 𝑌(𝑦, 𝜔) is noted K(x, y) and is defined as
K(x, y) = \mathrm{cov}(X(x, \omega), Y(y, \omega)) = E[(X(x, \omega) - E[X(x, \omega)]) (Y(y, \omega) - E[Y(y, \omega)])]    (12.54)
For random vectors of dimension m and n, respectively, the cross-covariance
matrix is given by:
\mathrm{cov}(X(x, \omega), Y(y, \omega)) = E[X(x, \omega) Y(y, \omega)^T] - E[X(x, \omega)] E[Y(y, \omega)]^T    (12.55)
Stationarity is a standard assumption in many applications, but not all
phenomena are stationary. A process is said to be stationary if the
covariance k(x + h, x) does not depend on x. In this case the notation k(x + h, x)
is often reduced to k(h). Classical geostatistical theory [53, 54] relies on a weaker
assumption: the intrinsic hypothesis. The random function Y is called
intrinsically stationary if the increment process Y(x) – Y(x + h) is
stationary. In this case, E[Y(x) – Y(x + h)] and E[(Y(x) – Y(x + h))^2] do not depend on x.
The variogram, often noted as 𝛾(𝑥, 𝑦), covers this case. It is defined as the
variance of the difference between field values at two locations. Any stationary
process is intrinsically stationary, but the converse is not true.
The choice of the covariance is a key question for the Kriging method. The covariance
can be known, but often that is not the case; the covariance model therefore has to be
fitted to the data (i.e., the covariance model has to be chosen and the parameters
involved in the model assessed using the data).
Large efforts have been dedicated to the covariance models that can be used
in GP [55]. Among the possible functions (e.g., 𝛾-exponential function or Matern
function based on the Bessel function), the squared exponential (SE) is quite
simple and often used. The SE covariance is given by
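K(r) = e^{-\frac{r^2}{2 l^2}}    (12.56)
where the parameter l defines the characteristic length-scale. This parameter has to
be assessed using the existing data \{ \ldots, K_{i,j} = e^{-(x^{(i)} - x^{(j)})^2 / (2 l^2)}, \ldots \}.
Consider {x(1) , x(2) , …, x(n)} as the sampling sites. The sample covariance K between
them and the covariance K0 of the samples with the estimation point 𝑥 (0) are
expressed as:
K = \begin{pmatrix} K_{1,1} & \cdots & K_{1,n} \\ \vdots & \ddots & \vdots \\ K_{n,1} & \cdots & K_{n,n} \end{pmatrix} = \left( \mathrm{cov}(Y(x^{(i)}), Y(x^{(j)})) \right)_{1 \le i,j \le n}    (12.57)
K_0 = \begin{pmatrix} K_{1,0} \\ \vdots \\ K_{n,0} \end{pmatrix} = \left( \mathrm{cov}(Y(x^{(i)}), Y(x^{(0)})) \right)_{1 \le i \le n}    (12.58)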
Let us note
\lambda = (\lambda_1, \ldots, \lambda_n)^T \quad \text{and} \quad Y = (y^{(1)}, \ldots, y^{(n)})^T    (12.59)
If we consider that the mean of the process, noted m, is known then we can
consider without any loss of generality the case m=0. The best linear unbiased
prediction (BLUP) of the estimate value 𝑌(𝑥 (0) ) is given by the system
\hat{Y}(x^{(0)}) = \lambda^T Y = \sum_i \lambda_i y^{(i)}, \quad \lambda = K^{-1} K_0    (12.60)
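In the case of OK, the mean value is unknown, and the system becomes
$$\sum_j \lambda_j K_{i,j} + \mu = K_{i,0} \quad \forall i, \qquad \sum_j \lambda_j = 1 \tag{12.61}$$
where $\mu$ is the Lagrange multiplier that enforces the unbiasedness constraint.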
In this case (12.60) is still valid, but $K^{OK}$, $K_0^{OK}$, and $\lambda^{OK}$ provided in (12.62)
are slightly different from (12.57), (12.58), and (12.59).
$$\mathrm{Var}\big(\hat{Y}(x^{(0)}) - Y(x^{(0)})\big) = \sigma_0^2 = K_{0,0} - \sum_i \lambda_i K_{i,0} \tag{12.63}$$
Figure 12.21 shows the Kriging method applied to 𝑦 = 𝑥 sin(𝑥) with
samples taken at x = 1, 2, 3, 5, 6, 8, and 10. The main advantage of the
method is that it provides the uncertainty of the estimate.
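An experiment of this kind can be sketched as follows. The code is a minimal illustration under stated assumptions and does not reproduce the authors' computation: it uses ordinary Kriging with a unit-variance SE covariance and a hypothetical length-scale l = 1, it solves the constrained system in augmented form (sample covariance bordered by ones, with a Lagrange multiplier as the last unknown), and the names `se_cov` and `ordinary_kriging` are made up for the example. Because the process variance is simply assumed to be 1, the width of the resulting interval is illustrative only.

```python
import numpy as np

def se_cov(xa, xb, length_scale=1.0):
    """Squared exponential covariance exp(-r^2 / (2 l^2)) between 1-D point sets."""
    xa = np.asarray(xa, dtype=float).reshape(-1, 1)
    xb = np.asarray(xb, dtype=float).reshape(1, -1)
    return np.exp(-(xa - xb)**2 / (2.0 * length_scale**2))

def ordinary_kriging(x_samples, y_samples, x_new, length_scale=1.0):
    """Return the OK estimate and estimation variance at each point of x_new."""
    x_samples = np.asarray(x_samples, dtype=float)
    y_samples = np.asarray(y_samples, dtype=float)
    x_new = np.atleast_1d(np.asarray(x_new, dtype=float))
    n = x_samples.size
    # Augmented system [[K, 1], [1^t, 0]] [lambda; mu] = [K0; 1]
    A = np.zeros((n + 1, n + 1))
    A[:n, :n] = se_cov(x_samples, x_samples, length_scale)
    A[:n, n] = 1.0
    A[n, :n] = 1.0
    B = np.ones((n + 1, x_new.size))
    B[:n, :] = se_cov(x_samples, x_new, length_scale)
    sol = np.linalg.solve(A, B)  # each column holds [lambda; mu] for one point
    mean = sol[:n, :].T @ y_samples
    # K_{0,0} = 1 for the unit-variance SE covariance; the sum runs over the
    # augmented vectors, so it includes the Lagrange multiplier term
    var = 1.0 - np.sum(sol * B, axis=0)
    return mean, np.clip(var, 0.0, None)  # clip tiny negative round-off

x_samples = np.array([1.0, 2.0, 3.0, 5.0, 6.0, 8.0, 10.0])
y_samples = x_samples * np.sin(x_samples)
x_grid = np.linspace(0.0, 10.0, 101)

mean, var = ordinary_kriging(x_samples, y_samples, x_grid)
half_width = 1.96 * np.sqrt(var)  # Gaussian 95% interval
lower, upper = mean - half_width, mean + half_width
```

The estimation variance vanishes at the sampling sites and grows in the gaps between them (for instance around x = 4), which is the behavior a confidence band is meant to convey.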
[Figure 12.21 plot: Y = x·sin(x), the BLUP (best linear unbiased predictor), and the 95% confidence interval; horizontal axis: x (arbitrary unit), from 0 to 10; vertical axis: Y (arbitrary unit).]
Figure 12.21 Kriging applied to y = x sin(x) with the samples taken at x = 1, 2, 3, 5, 6, 8, 10.
Figure 12.22 OK results obtained for different numbers of samples: (a) 200; (b) 15; (c) 30; and (d) 60.
Because of the varied uses of mobile phones, tablets, and computers, efforts
have recently been dedicated to estimating fetus exposure [11, 56, 57]. In spite of the
progress in high-performance computation (e.g., GPU, parallel computing) the
12.9 CONCLUSION
REFERENCES
[1] M. Ackerman, “Accessing the Visible Human Project,” D-Lib Magazine, 1995.
(www.nlm.nih.gov/cresearch/visible/visible_human.html)
[2] P. Dimbylow, “Development of the Female Voxel Phantom, NAOMI and Its Application to
Calculations of Induced Current Densities and Electric Fields from Applied Low Frequency
Magnetic and Electric Fields,” Physics in Medicine and Biology, Vol. 50, No. 6, pp. 1047–1070,
2005.
[3] T. Nagaoka, et al., “Development of Realistic High-Resolution Whole-Body Voxel Models of
Japanese Adult Males and Females of Average Height and Weight, and Application of Models to
[21] B. Sudret, Uncertainty Propagation and Sensitivity Analysis in Mechanical Models. Contribution
to Structural Reliability and Stochastic Spectral Method, Habilitation à Diriger des Recherches,
Université Blaise Pascal, Clermont-Ferrand, France, 2007.
(http://www.ibk.ethz.ch/su/publications/Reports/HDRSudret.pdf)
[22] http://www.openturns.org
[23] M. McKay, W. Conover, and R. Beckman, “A Comparison of Three Methods for Selecting
Values of Input Variables in the Analysis of Output from a Computer Code,” Technometrics, Vol.
21, No. 2, pp. 239–245, 1979.
[24] G. Wang, “Adaptive Response Surface Method Using Inherited Latin Hypercube Design Points,”
J. Mech. Des., Vol. 125, No. 2, pp. 210–220, 2003.
[25] P. Qian, “Nested Latin Hypercube Designs,” Biometrika, Vol. 96, No. 4, pp. 957–970, 2009.
[26] B. Efron, “Bootstrap Methods: Another Look at the Jackknife,” Annals of Statistics, Vol. 7, No.
1, pp. 1–26, 1979.
[27] C. Chauvière, J. Hesthaven, and L. Lurati, “Computational Modeling of Uncertainty in Time-Domain
Electromagnetics,” SIAM J. Sci. Comput., Vol. 28, No. 2, pp. 751–775, 2006.
[28] D. Xiu and J. Hesthaven, “High-Order Collocation Methods for Differential Equations with
Random Input,” SIAM J. Sci. Comput., Vol. 27, No. 3, pp. 1118–1139, 2005.
[29] J. Silly-Carette, D. Lautru, M. Wong, A. Gati, J. Wiart, and V. Fouad Hanna, “Variability on the
Propagation of a Plane Wave Using Stochastic Collocation Methods in a Bio Electromagnetic
Application,” IEEE Microwave and Wireless Components Letters, Vol. 19, No. 4, pp. 185–187,
2009.
[30] N. Wiener, “The Homogeneous Chaos,” Amer. J. Math., Vol. 60, No. 4, pp. 897–936, 1938.
[31] R. Cameron and W. Martin, “The Orthogonal Development of Nonlinear Functionals in Series of
Fourier-Hermite Functionals,” Ann. of Math., Vol. 48, No. 2, pp. 385–392, 1947.
[32] R. Ghanem and P. Spanos, Stochastic Finite Elements: A Spectral Approach, New York:
Springer-Verlag, 1991.
[33] W. Schoutens, Stochastic Processes and Orthogonal Polynomials, New York: Springer-Verlag,
2000.
[34] M. Abramowitz and I. Stegun, Handbook of Mathematical Functions with Formulas, Graphs,
and Mathematical Tables, 9th printing, New York: Dover Publications, 1972.
[35] Ch. Soize and R. Ghanem, “Physical Systems with Random Uncertainties: Chaos
Representations with Arbitrary Probability Measure,” SIAM Journal on Scientific Computing,
Vol. 26, No. 2, pp. 395–410, 2004.
[36] A. Nataf, “Détermination des Distributions Dont Les Marges Sont Données,” C.R. de
l’Académie des Sciences, Vol. 225, pp. 42–43, 1962.
[37] M. Rosenblatt, “Remarks on a Multivariate Transformation,” The Annals of Mathematical
Statistics, Vol. 23, pp. 470–472, 1952.
[38] S. Smolyak, “Quadrature and Interpolation Formulas for Tensor Products of Certain Classes of
Functions,” Soviet Math. Dokl., Vol. 4, pp. 240–243, 1963.
[39] C. Clenshaw and A. Curtis, “A Method for Numerical Integration on an Automatic Computer,”
Numer. Math., Vol. 2, pp. 197–205, 1960.
[40] G. Blatman, Adaptive Sparse Polynomial Chaos Expansions for Uncertainty Propagation and
Sensitivity Analysis, Ph.D. Thesis, Université Blaise Pascal, Clermont-Ferrand, 2009.
[41] G. Blatman and B. Sudret, “Anisotropic Parsimonious Polynomial Chaos Expansions Based on
the Sparsity-of-Effects Principle,” Int. Conf. on Structural Safety and Reliability, Osaka, Japan,
2009.
[42] B. Efron, T. Hastie, I. Johnstone, and R. Tibshirani, “Least Angle Regression,” Annals of
Statistics, Vol. 32, pp. 407–499, 2004.
[43] R. Tibshirani, “Regression Shrinkage and Selection via the Lasso,” J. Royal Stat. Soc., Series B,
Vol. 58, pp. 267–288, 1996.
[44] G. Blatman and B. Sudret, “Adaptive Sparse Polynomial Chaos Expansion Based on Least Angle
Regression,” Journal of Computational Physics, Vol. 230, No. 6, pp. 2345–2367, 2011.
[45] A. Ghanmi, Analyse de l’exposition aux Ondes électromagnétiques des Enfants Dans le Cadre
des Nouveaux Usages et Nouveaux Réseaux, Ph.D. Thesis, Université Marne La Vallée, 2013.
[46] A. Saltelli, K. Chan, and E. Scott (eds.), Sensitivity Analysis, New York: John Wiley & Sons,
2000.
[47] I. Sobol, “Sensitivity Estimates for Nonlinear Mathematical Models,” Math Model & Comput
Exp., Vol. 1, pp. 407–414, 1993.
[48] B. Efron and C. Stein, “The Jackknife Estimate of Variance,” Annals of Statistics, Vol. 9, No. 3, pp.
586–596, 1981.
[49] S. Mostarshedi, et al., “Fast and Accurate Calculation of Scattered Electromagnetic Fields from
Building Faces Using Green's Functions of Semi-Infinite Medium,” IET Microwaves, Antennas
& Propagation, Vol. 4, No. 1, pp. 78–82, 2010.
[50] P. Kersaudy, et al., “Stochastic Analysis of Scattered Field by Building Facades Using
Polynomial Chaos,” IEEE Trans. on Antennas and Propagation, Vol. 62, No. 12, pp. 6382–6393,
2014.
[51] D. Krige, “A Statistical Approach to Some Basic Mine Valuation Problems on the
Witwatersrand,” Journal of the Chemical, Metallurgical and Mining Society, Vol. 52, pp. 119–
139, 1951.
[52] G. Matheron, Traité de géostatistique appliquée, Tome I., In E. Technip (ed.), Mémoires du
Bureau de Recherches Géologiques et Minières, No. 14, Paris, 1962.
[53] G. Matheron, “The Intrinsic Random Functions and Their Applications,” Adv. Appl. Prob., Vol. 5,
pp. 439–468, 1973.
[54] J. Chiles and P. Delfiner, Geostatistics: Modeling Spatial Uncertainty, New York: John Wiley
and Sons, 2012.
[55] C. Rasmussen and C. Williams, Gaussian Processes for Machine Learning, University Press
Group Limited, New Era Estate, 2006.
[56] M. Jala, et al., “Simplified Pregnant Woman Models for the Fetus Exposure Assessment,” C.R.
Physique, Vol. 14, No. 5, pp. 412–417, 2013.
[57] M. Jala, Plans D'expériences Adaptatifs Pour le Calcul de Quantiles et Application à la
Dosimétrie Numérique, Ph.D. Thesis, Telecom Paris-Tech, 2013.
[58] T. Nagaoka, et al., “An Anatomically Realistic Whole-Body Pregnant-Woman Model and Specific
Absorption Rates for Pregnant-Woman Exposure to Electromagnetic Plane Waves from 10
MHz to 2 GHz,” Phys. Med. Biol., Vol. 52, pp. 6731–6745, 2007.
About the Authors
Jiahui Fu received B.S. and M.S. degrees from the Harbin Institute of
Technology in 1995 and 1998, respectively, and a Ph.D. degree in
information and communication engineering from the Harbin Institute of
Technology, China, in 2005.
He is currently a professor in the School of Electronics and
Information Engineering, Harbin Institute of Technology. His research
interests include microwave and millimeter-wave circuits, antennas,
metamaterials, and electromagnetic compatibility.
Champaign, and was appointed as the first Henry Magnuski Outstanding Young Scholar in
the Department of Electrical and Computer Engineering in 1998 and later as a Sony scholar
in 2005. He was appointed as a distinguished visiting professor in the Air Force Research
Laboratory in 1999 and was awarded adjunct, visiting, guest, or chair professorship by City
University of Hong Kong, University of Hong Kong, Anhui University, Beijing Institute of
Technology, Peking University, Southeast University, Nanjing University, Zhejiang
University, Shanghai Jiao Tong University, and Xidian University. His name appeared over
20 times in the University of Illinois at Urbana-Champaign’s List of Excellent Instructors.
His students have won the best paper awards in IEEE 16th Topical Meeting on Electrical
Performance of Electronic Packaging and 25th and 27th Annual Review of Progress in
Applied Computational Electromagnetics. He served as an associate editor and guest editor
for the IEEE Transactions on Antennas and Propagation, Radio Science, Electromagnetics,
Microwave and Optical Technology Letters, and Medical Physics. He was the Symposium
cochairman and technical program chairman of the Annual Review of Progress in Applied
Computational Electromagnetics in 1997 and 1998, respectively.
Joe Wiart received a Ph.D. from Telecom Paris Tech and University P
VI in 1995. He has been a telecommunication engineer from Telecom
Paris Tech since 1992, and the head of the research unit of Orange
(www.orange.com, formerly France Telecom) in charge of studies relative
to the human exposure to electromagnetic fields since 1997. Since 1999
Dr. Wiart has served as the chairman of the working group of the
European Committee for Electrotechnical Standardization (CENELEC) in
charge of mobile and base station standards. He is one of the founders of the common
laboratory of the Institute Mines-Telecom and the Orange Labs
(http://whist.mines-telecom.fr/), which he has managed since its creation in 2009.
Dr. Wiart is the present chairman of the International Union of Radio Science (URSI)
commission K. He has been the chairman of the French chapter of URSI and a consultant
to ICNIRP. He has been an emeritus member of The Society of Environmental Engineers
(SEE) since 2008 and a senior member of the Institute of Electrical and Electronics
Engineers (IEEE) since 2002. He has led more than 10 national projects dedicated to
dosimetry (http://whist.mines-telecom.fr/) and was involved in several EU projects
(Interphone, Mobi-Kids, and Geronimo). Since the end of 2012, he has been the leader of
the EU project LEXNET (http://www.lexnet-project.eu/). His research interests are
dosimetry, numerical methods, and statistics applied to electromagnetism, and stochastic
dosimetry. His work has resulted in more than 90 publications and more than 120
communications (including numerous invited communications).
August 2000 and from January 2002 to June 2002, he was a senior research assistant and a
research Fellow, respectively, with the City University of Hong Kong. He joined Peking
University (PKU), Beijing, China, as an associate professor in 2002, and was promoted to
full professor in 2004. He moved to the University of Electronic Science and Technology of
China, Chengdu, China, as a Chang-Jiang Professor nominated by the Ministry of Education
of China in 2010. He returned to PKU after finishing the appointment in 2013. His research
interests include computational electromagnetics, wave propagation and scattering,
microwave remote sensing, antennas, and microwave components. He has authored one
book and a few book chapters, and more than 80 peer-reviewed papers.
Prof. Xia was the recipient of the Young Scientist Award of the URSI in 1993. He was
awarded the first-class prize on Natural Science by the Chinese Academy of Sciences in
2001. He was the recipient of the Foundation for Outstanding Young Investigators
presented by the National Natural Science Foundation of China in 2008.