A-STABLE IMPLICIT RAPID SCHEME AND SOFTWARE SOLUTION FOR
ELECTROMAGNETIC WAVE PROPAGATION
By
Mathialakan Thavappiragasam
A DISSERTATION
Submitted to
Michigan State University
in partial fulﬁllment of the requirements
for the degree of
Electrical Engineering - Doctor of Philosophy
Computational Mathematics, Science and Engineering - Dual Major
2019
ABSTRACT
A-STABLE IMPLICIT RAPID SCHEME AND SOFTWARE SOLUTION FOR
ELECTROMAGNETIC WAVE PROPAGATION
By
Mathialakan Thavappiragasam
A robust and rapid scheme to solve electromagnetics (EM) is an important requirement in
the scientiﬁc computing environment in which there are several useful methods used to solve
tasks in EM. Our research study is motivated by this need and is targeted to develop a fast
A-stable implicit numerical scheme and scalable software solution for EM wave propagation.
Our scheme is based on the Method Of Lines Transpose (MOLT) approach which discretizes
time ﬁrst and then solves boundary value problems. By applying the free-space Green’s
function, the solution is derived by decomposing particular and homogeneous solutions. The
compact Simpson’s quadrature based, O(N) fast convolution, a recursive algorithm, is used
to solve the particular solution for N number of grid points. The homogeneous solution
is obtained using a particular solution at the boundary points and the applied boundary
conditions. The multi-dimensional scheme is developed using the ADI splitting approach
and an arbitrary order accuracy in time is achieved by switching the time derivation to a
spatial derivation using the Lax-Wendroﬀ approach.
The focus of the work in this thesis has been to overcome the limitations in Neumann
and outﬂow boundary conditions to get high-order accuracy by using special treatments that
deal with a choice of the interpolation, ﬁnite diﬀerence stencil, and the initial conditions.
In addition, we have extended these ideas to construct perfectly electrically conducting
boundary conditions in 2D for the MOLT.
In addition to introducing higher-order boundary conditions, an embedded boundary
method is employed to deal with complex geometries. As the method is A-stable, it does not
suﬀer from small-time step limitations that are found in explicit ﬁnite diﬀerence time domain
methods when using either embedded boundary or cut cell methods to capture geometry.
Further, we are developing an open source code MOLTN (Method Of Lines Transpose, Nth
order) which is intended to be a hardware-independent, scalable software tool, using multi-
node MPI, multi-core OpenMP, and GPU CUDA implementation. As a test case of the
method, we implement and study the A6 magnetron with our embedded boundary method
using point sources inside of the domain.
The eventual goal is to combine this method with a novel particle method for the simu-
lations of plasma. The particle method would treat particles as point particles that generate
ﬁelds that are tracked on the mesh. No density or current will be mapped to the mesh.
The consistency and performance of the scheme are evaluated for EM wave propagation and
scattering using diﬀerent shaped objects including curved boundaries and the introduction
of true point sources that demonstrate how we handle particles. Stable solutions result for
a wide range of mesh sizes and potential to leverage novel computing architectures, such as
GPU, have been demonstrated.
Copyright by
MATHIALAKAN THAVAPPIRAGASAM
2019
For my beloved late parents (Thavappiragasam and Pushpavathy) who motivate me
beyond, to move toward my dream
v
ACKNOWLEDGEMENTS
My ﬁrst and deepest gratitude goes to my advisor Professor Andrew Christlieb for giving
me point to point guidance and support to achieve the target successfully; thank god for
giving me such a mentor for my Ph.D. studies. Dr. John Luginsland is the next person to
bring me through these studies. Thank you so much for your valuable guidance sir.
I will never forget your help, Professor Shanker Balasubramaniam; you established a
successful path to move forward, thank you so much! I would also like to thank Professor
John Verboncoeur, Professor Edward J. Rothwell, Professor Brian O’Shea for their guidance
and thoughts over my Ph.D. studies. Without your support, I couldn’t imagine success.
I would like to thank Dr. Matthew F. Causley, Dr. Aditya Viswanathan, and Dr. Pierson
Guthrey for their support and collaborative works to accomplish every milestone, and also
I am happy to thank Dr. Eric Wolf for his valuable and timely supports. The Christlieb-
group, the bleeding edge research team with active researchers, is the team I have been
working throughout my Ph.D. studies. Thank you colleagues, William Sands, Dr. Hana
Cho, Dr. Michel Crockatt, Dr. Xiao Feng, Dr. Bosu Choi, Dr. Yan Jiang, Dr. Wei Guo,
Dr. Bankim Mandal, Dr. Rouchun Zhang for your tips and tricks. I would also like to thank
Dr. Bernard G. Pope, a professor emeritus of physics at the Michigan State University, for
thesis proofreading and editing.
My thanks also go to professors who taught me courses at Michigan State University,
and the faculty and staﬀ that are working in the Department of Electrical and Computer
Engineering and Computational Mathematics, Science, and Engineering.
Finally, I am proud to thank my family for their support and patience, especially my
loving wife Kukapalini for her supports and inspirational words of encouragement to carry
my studies and our family concurrently. I also want to thank friends and relatives for their
motivation.
vi
TABLE OF CONTENTS
LIST OF TABLES . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
x
LIST OF FIGURES . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
xi
LIST OF ALGORITHMS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xxi
CHAPTER 1: Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
1.1 Context and Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
1.2 High Power Magnetrons
. . . . . . . . . . . . . . . . . . . . . . . . . . . . .
1.3 Research Goal and Objectives . . . . . . . . . . . . . . . . . . . . . . . . . .
1.4 Electromagnetic Wave Propagation . . . . . . . . . . . . . . . . . . . . . . .
1.4.1 Electromagnetic Fields . . . . . . . . . . . . . . . . . . . . . . . . . .
1.4.2 Boundary Conditions for Electromagnetic Fields . . . . . . . . . . . .
1.5 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
CHAPTER 2: Mathematical Model for Wave Equation Solver
. . . . . . .
2.1
Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
2.2 One Dimensional Implicit Wave Equation Solver . . . . . . . . . . . . . . . .
2.2.1
Semi-discretization . . . . . . . . . . . . . . . . . . . . . . . . . . . .
2.2.2 Fast Convolution Algorithm . . . . . . . . . . . . . . . . . . . . . . .
2.2.3 Boundary Condition . . . . . . . . . . . . . . . . . . . . . . . . . . .
2.2.3.1 Dirichlet Boundary Condition . . . . . . . . . . . . . . . . .
2.2.3.2 Neumann Boundary Condition . . . . . . . . . . . . . . . .
2.2.3.3 Periodic Boundary Condition . . . . . . . . . . . . . . . . .
2.2.3.4 Outﬂow Boundary Condition . . . . . . . . . . . . . . . . .
2.3 Multi-Dimensional Implicit Wave Equation Solver using ADI Scheme . . . .
2.4 Arbitrary Temporal Order Accuracy . . . . . . . . . . . . . . . . . . . . . .
2.4.1 Arbitrary Order One Dimensional Scheme . . . . . . . . . . . . . . .
2.4.2 Extension to Higher Dimensions for Arbitrary Order Accuracy . . . .
2.5 Treatment for Variable Speed Wave Propagation . . . . . . . . . . . . . . . .
2.6 Numerical Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . . . . .
2.6.1.1 Higher Order Wave Solvers in Two Dimension . . . . . . . .
2.6.1.2 Higher Order Wave Solvers in Three Dimensions
. . . . . .
2.6.2 Three Dimensional Waves
. . . . . . . . . . . . . . . . . . . . . . . .
2.6.3 Waves with Variable Speeds . . . . . . . . . . . . . . . . . . . . . . .
2.6.3.1 One Dimensional Case . . . . . . . . . . . . . . . . . . . . .
2.6.3.2 Two Dimensional Case . . . . . . . . . . . . . . . . . . . . .
2.6.3.3 Three Dimensional Cases
. . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . .
2.6.4.1 Problem Deﬁnition . . . . . . . . . . . . . . . . . . . . . . .
2.6.4 Waveguide in a Photonic Crystal
2.6.1 Convergence Studies
1
1
3
6
7
7
10
11
13
13
16
16
23
24
25
26
27
27
31
35
37
41
44
46
46
46
48
48
50
50
53
55
56
56
vii
2.6.4.2 Geometry of the Problem . . . . . . . . . . . . . . . . . . .
2.6.4.3 Result and Discussion . . . . . . . . . . . . . . . . . . . . .
2.7 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
CHAPTER 3: Higher Order Non-reﬂecting Boundary Condition . . . . . .
3.1
Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
3.2 Extension to Higher Order in Accuracy . . . . . . . . . . . . . . . . . . . . .
3.3 Appropriate Initial Condition . . . . . . . . . . . . . . . . . . . . . . . . . .
3.4 Numerical Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
3.4.1 Convergence Test for High-Order Outﬂow . . . . . . . . . . . . . . .
3.4.2 Rotating Gaussian Pulse . . . . . . . . . . . . . . . . . . . . . . . . .
3.4.2.1 Time Evaluation of 50o angled Gaussian pulse
. . . . . . .
3.4.3 Outﬂow Boundary Condition Along Curve Boundaries
. . . . . . . .
3.4.4 Convergence Studies Using Analytical Solution . . . . . . . . . . . . .
3.5 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
57
58
61
62
62
63
67
68
68
70
72
73
74
76
CHAPTER 4: Higher Order Embedded Neumann Boundary Method . . .
Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
4.1
4.2 Derivation of The Scheme . . . . . . . . . . . . . . . . . . . . . . . . . . . .
4.2.1 One Dimensional Scheme . . . . . . . . . . . . . . . . . . . . . . . . .
4.2.2 Two Dimensional Scheme
. . . . . . . . . . . . . . . . . . . . . . . .
4.2.3 Three Dimensional Scheme . . . . . . . . . . . . . . . . . . . . . . . .
4.3 Treatment for Complex Geometries . . . . . . . . . . . . . . . . . . . . . . .
4.3.1 Pre-computing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
4.3.2 Geometry for A6 MDO in 2D . . . . . . . . . . . . . . . . . . . . . .
4.4 Numerical Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . . . . .
77
77
79
79
81
85
88
90
92
93
95
4.4.1.1 Convergence studies
96
Spherical Scattering
98
4.4.2.1 Convergence Studies . . . . . . . . . . . . . . . . . . . . . . 100
4.4.3 Electron Beam in Magnetrons . . . . . . . . . . . . . . . . . . . . . . 101
4.4.3.1 A6 Magnetron with Diﬀraction Output (MDO) in 2D . . . . 101
4.4.3.2 A6 Magnetron in 3D . . . . . . . . . . . . . . . . . . . . . . 103
4.4.3.3 Convergence Studies . . . . . . . . . . . . . . . . . . . . . . 104
4.5 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 106
4.4.1 Cylindrical Scattering
4.4.2
CHAPTER 5: Second-Order PEC Boundary Condition . . . . . . . . . . . . 107
Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 107
5.1
2D Electric Scalar and Magnetic Vector Potential
. . . . . . . . . . . . . . . 108
5.2
5.3 Perfect Electric Conductor Boundary . . . . . . . . . . . . . . . . . . . . . . 109
5.4 Numerical Approximation for PEC Boundary Conditions . . . . . . . . . . . 111
5.4.1 Using Dirichlet Boundary Conditions for A to Capture PEC Boundary 112
5.4.2 Embedded Boundary Method, An eﬀective Neumann to Dirichlet
. . . . . . . . . . . . . . . . . . . . . 115
5.5 Experimental Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 120
Map for Creating Ghost Points
viii
5.5.1
Square Cavity Rotated Through Diﬀerent Angles
. . . . . . . . . . . 120
5.5.1.1 Convergence Studies and Error Analysis . . . . . . . . . . . 123
5.5.2
Square Cavity with a leak (diﬀraction Q)
. . . . . . . . . . . . . . . 126
5.5.3 A6 Magnetron . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 128
5.5.4 Rising Sun Magnetrons . . . . . . . . . . . . . . . . . . . . . . . . . . 129
5.6 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 132
CHAPTER 6: Software Solution and Code Acceleration . . . . . . . . . . . 134
6.1
Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 134
6.2 Multi-core OpenMP, Multi-node MPI version . . . . . . . . . . . . . . . . . 134
6.2.1 Data Structures and Data Flows . . . . . . . . . . . . . . . . . . . . . 135
6.2.2 Domain Decomposition . . . . . . . . . . . . . . . . . . . . . . . . . . 136
6.2.3 Performance Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . 139
6.3 GPGPU CUDA version . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 141
6.3.1 Task Organization . . . . . . . . . . . . . . . . . . . . . . . . . . . . 141
6.3.2 Performance Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . 142
6.4 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 144
CHAPTER 7: Summary and Potential Future Directions . . . . . . . . . . . 145
7.1
Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 145
7.2 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 145
7.3 Future Directions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 146
7.3.1 Code Acceleration for MOLTN . . . . . . . . . . . . . . . . . . . . . 146
7.3.2 Complex Geometry in MOLTN . . . . . . . . . . . . . . . . . . . . . 147
7.3.3 Further HPM tube simulations
. . . . . . . . . . . . . . . . . . . . . 147
Incorporate with Particles . . . . . . . . . . . . . . . . . . . . . . . . 148
7.3.4
7.3.4.1 Derive the scheme for 3D point source . . . . . . . . . . . . 149
7.3.4.2 Numerical Example
. . . . . . . . . . . . . . . . . . . . . . 151
7.4 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 153
BIBLIOGRAPHY . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 154
ix
LIST OF TABLES
Table 2.1: The maximum values βmax for which the P th order scheme remains A-stable
[14]. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
40
Table 5.1: Frequency (in GHz) obtained at the point (3.36cm, 3.36cm) using the ping
test.
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 123
Table 5.2: Frequency (in GHz) obtained at the point (3.36cm, 3.36cm) using the ping
test for CFL 1. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 123
Table 5.3: Numerical error in the frequency computation using the ping test at the
point (3.36cm, 3.36cm).
. . . . . . . . . . . . . . . . . . . . . . . . . . . . 125
x
LIST OF FIGURES
Figure 1.1:
3D Relativistic A6 magnetron with diﬀraction output (MDO) with a
. . . . . . . . . . . . . . . . . . .
cylindrical cathode in the center [61]
Figure 1.2:
2D view of A6 magnetron with a cathode of radius rc and the anode with
inner radius ra, vane radius rv, vane angle α1, and cavity angle α2 . . . .
Figure 1.3: Boundary surface between two regions with electric ﬁelds E1 and E2,
magnetic ﬁeld H1 and H2, electric ﬂux densities D1 and D2, and magnetic
ﬂux densities B1 and B2 respectively. Here, Js is the surface current, ρs
is the surface charge density, and n is the normal vector pointed out from
the region two.
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Figure 2.1: Fast convolution line integration which decomposes left I L[u](x) and right
I R[u](x)) oriented integrals between the domain a and b with the spatial
step size, ∆xj(= xj − xj−1).
. . . . . . . . . . . . . . . . . . . . . . . .
Figure 2.2:
(a) x-, (b) y- and (c) xy lines with exact circular boundary points on the
mesh lines that are used for the ADI x (a) and y (b) sweeps.
. . . . . .
Figure 2.3: Fourth order time convergence of the 2D wave solver using Dirichlet (a)
and periodic (b) boundary conditions with ∆x = ∆y = 6.25 × 10−3.
This is a self-reﬁnement study which measures L∞ norm of the error
at time T = 2.0 on a square domain Ω = [−1, 1]2 with a point source
cos(ωt), ω = 1 at the center of the box, (0, 0). The CFL (= c∆t
∆x ) value
changes proportional to the time step size ∆t with ﬁxed spatial step size.
Figure 2.4: Fourth order convergence of 3D wave solver using Dirichlet boundary
conditions with ∆x = ∆y = ∆z = 1.25 × 10−2 ω = 1. This is a self-
reﬁnement study which measures L∞ norm of the error at time T = 2.0
on a cubic domain Ω = [−1, 1]3 with a point source cos(ωt) at the center
of the cube, (0, 0, 0) for (a) ω = 0.1 and (b) ω = 1. The CFL (= c∆t
∆x )
value changes proportional to the time step size ∆t with ﬁxed spatial step
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
size.
Figure 2.5: Time evolution of a point source ﬁeld cos(ωt), ω = 1 within a cubic do-
main (Ω = [−1, 1]3) by imposing Dirichlet boundary condition along the
surface with the spatial step size, ∆x = ∆y = ∆z = 0.0625 ω = 1 and
the time step size, ∆t = 0.0313.
. . . . . . . . . . . . . . . . . . . . . .
5
6
10
23
33
47
49
50
xi
Figure 2.6: Time evolution of a point source ﬁeld cos(ωt), ω = 1 within a cubic (Ω =
[−1, 1]3) by imposing periodic boundary condition along the surface with
the spatial step size, ∆x = ∆y = ∆z = 0.0625 ω = 1 and the time step
size, ∆t = 0.0313.
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Figure 2.7: Time evolution of a point source ﬁeld cos(ωt), ω = 1 within a cubic (Ω =
[−1, 1]3) by imposing outﬂow boundary condition along the surface with
the spatial step size, ∆x = ∆y = ∆z = 0.0625 ω = 1 and the time step
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
size, ∆t = 0.0313.
Figure 2.8:
1D wave travelling over the two domains Ω1 and Ω2 with piecewise con-
stant speed c1 = 1.0 and c2 = 2.0. For this experiment, we choose spatial
step ∆x = 0.002, and time step ∆t = 0.005.
. . . . . . . . . . . . . . .
Figure 2.9:
1D wave travelling over the layered media with eight domains and the
wave travels with the same speed in every bi-domain ( c1 = 1.0 and
c2 = 2.0). For this experiment, we choose spatial step ∆x = 0.002, and
time step ∆t = 0.005 along layered media with piecewise constant wave
speed. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Figure 2.10: Geometrical view of 0.5 × 0.5 (a) one and (b) four square patches in the
square domain Ω = [−1, 1]2 with c2(= 0.1) and c1(= 1.0) are the wave
speeds in the patches and the remaining area. . . . . . . . . . . . . . . .
Figure 2.11: Time evolution of a Gaussian ﬁeld e−25(x2+y2) in a square domain Ω =
[−1, 1]2 with a 0.52 square patch as shown in Figure 2.10-(a). Here, the
spatial step size, ∆x = ∆y = 0.02 and the time step size, ∆t = 0.005. . .
Figure 2.12: Time evolution of a Gaussian ﬁeld e−25(x2+y2) in a square domain Ω =
[−1, 1]2 with four 0.52 square patches as shown in Figure 2.10-(b). Here,
the spatial step size, ∆x = ∆y = 0.02 and the time step size, ∆t = 0.005.
Figure 2.13: Geometrical view of 0.5 × 0.5 × 0.5 (a) one and (b) four cubic patches
in the cubical domain Ω = [−1, 1]3 with c2(= 0.1) and c1(= 1.0) are the
wave speeds in the patches and the remaining area. . . . . . . . . . . . .
Figure 2.14: Time evolution of a Gaussian ﬁeld e−25(x2+y2+z2) in a cubical domain Ω =
[−1, 1]3 with a 0.53 cubic patch as shown in Figure 2.13-(a). Here, the
spatial step size, ∆x = ∆y = ∆z = 0.0625 and the time step size,
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
∆t = 0.0313.
Figure 2.15: Time evolution of a Gaussian ﬁeld e−25(x2+y2+z2) in a cubical domain Ω =
[−1, 1]3 with four 0.53 cubic patches as shown in Figure 2.13-(b). Here,
the spatial step size, ∆x = ∆y = ∆z = 0.0625 and the time step size,
∆t = 0.0313.
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
50
50
52
53
54
54
54
55
56
56
xii
Figure 2.16: Schematic illustration of a line defect (in red) within a set of cylindrical
rods (in green) on a photonic crystal. . . . . . . . . . . . . . . . . . . . .
Figure 2.17: 2D Lattice of cylindrical rods which are periodic along x and y axes with
. . . . . . . . . . . . . . . . . .
lattice constant a on a photonic crystal.
Figure 2.18: Square lattice of square rods of side length 0.38a with lattice constant a
and dielectric constant = 8.9 in 3D (a) and 2D (b), a linear defect is
. . . . . . . . . . . . . . . . . . .
formed by removing a column of rods.
Figure 2.19: The wave generated by the point source, e(−iωt), where ω = cky, ky =
, = 8.9, and µ = 1 travelling through the photonic
0.88(2a), c = 1√
waveguide using the model shown in Figure 2.18 . . . . . . . . . . . . .
(µ)
Figure 2.20: Square lattice of cylindrical rods of side length 0.38a with lattice constant
a and dielectric constant = 8.9 in 3D (a) and 2D (b), a linear defect is
. . . . . . . . . . . . . . . . . . .
formed by removing a column of rods.
Figure 2.21: The wave generated by the point source, e(−iωt), where ω = cky, ky =
, = 8.9, and µ = 1 travelling through the photonic
0.88(2a), c = 1√
waveguide using the model shown in Figure 2.20 . . . . . . . . . . . . .
(µ)
Figure 2.22: The wave, e(−25(x2+y2)) traveling through the photonic waveguide using
. . . . . . . . . . . . . . . . . . . . . .
the model shown in Figure 2.20.
Figure 3.1: Fourth order convergence of (a) 2D and (b) 3D wave solver using out-
ﬂow boundary conditions with the spatial step size h = 1.25 × 10−2.
This is a self-reﬁnement study which measures L∞ norm of the error at
time T = 2.0 on a (a) square and (b) cubic domain with a point source
cos(ωt), ω = 1 at the center of the domain. The CFL (= c∆t
h ) value
changes proportional to the time step size ∆t with ﬁxed spatial step size,
h.
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Figure 3.2: Fourth order convergence of the oval shape Gaussian pulse (however it
reduces to the above third order near to θ = Π
4 due to the ADI splitting
error) given by the Equation 3.8 rotated through Π
2 on square do-
main Ω = [−1, 1]2 by imposing Outﬂow boundary conditions along the
boundaries. This self-reﬁnement study measures L∞ norm of the error at
time T = 2.0 with h = 0.02.
. . . . . . . . . . . . . . . . . . . . . . . .
18 to Π
xiii
57
58
59
59
60
60
60
70
71
Figure 3.3: Time evolution of a Gaussian pulse given by the Equation 3.8 with angle
θ = 50o on square domain Ω = [−1, 1]2 by imposing outﬂow boundary
conditions along the boundaries. Here the time step size ∆t = 3.3× 10−3
. . .
and CFL value is 0.5. The wave is leaving completely as expected.
Figure 3.4: Evolution of a Gaussian ﬁeld given by the Equation 3.8 at time t =
0.6864, rotated through diﬀerent angles (a) θ = 30o, (b) θ = 50o, (c)
θ = 70o, and (d) θ = 90o on square domain Ω = [−1, 1]2 by imposing
outﬂow boundary conditions along the boundaries, Here the time step
size ∆t = 3.3 × 10−3 and CFL value is 0.5 at time t = 0.1287 using
outﬂow boundary conditions. We observe the same behaviour of the
wave for rotated Gaussian pulse through diﬀerent angles. The waves
leave completely through the curved boundary.
. . . . . . . . . . . . . .
Figure 3.5:
(a) Geometry and (b) graph representation of the 2D object constructed
by using an oval of width 2a and height 2b and a rectangle of 2rw × (rh1 +
.
rh2). Here rh1 = b and v1, v2, v3, and v4 are the vertices of the object.
Figure 3.6: Time evolution of the point source ﬁeld cos ωt with ω = 0.5 at (0, 100∆x)
in the object as shown in Figure 3.5 by imposing outﬂow boundary con-
ditions along the curved boundary and homogeneous Dirichlet along the
straight boundaries. Here the spatial step size, ∆x = ∆y = 0.039 and
the time step size, ∆t = 0.0195 . . . . . . . . . . . . . . . . . . . . . . .
Figure 3.7:
(a) Snapshot of the point source ﬁeld at time T = 5.0 and (b) Fourth
order convergence of 2D wave solver using Outﬂow boundary conditions
along the curve boundary with the spatial grid width ∆x = ∆y = 0.078.
This is a self-reﬁnement study which measures L∞ norm of the error at
time T = 5.0 on the object shown in Figure 3.5 with a point source
. . . . . . . . . . .
cos(ωt), ω = 0.5 at the center of the box, (0, 100∆x)
Figure 3.8: Fourth order convergence of 3D wave solver using outﬂow boundary con-
ditions with the spatial step size h = 2.5× 10−2. This measures L∞ norm
of the error which compares with analytical solution using a point source
sin(4πt) placed at the center of the domain. . . . . . . . . . . . . . . . .
Figure 4.1: Geometry of 1D embedded Neumann boundary domain Ω ≡ [xa, xb] with
left boundary xB, ghost point xG, and interpolation points xI and xII,
where the distance ξI = |xI − xa|, ξII = |xII − xa|, and ξG = |xG − xa|
.
Figure 4.2:
(a)Geometry of 2D embedded Neumann boundary method with a bound-
ary point (xB, yB), ghost point (xG, yG), and interpolation points (xI, yI),
(xII, yII), (xIII, yIII) and (xIV , yIV ) and (b) symmetric 8-points stencil to
interpolate the solution at the point (xI, yI).
. . . . . . . . . . . . . . .
72
72
73
74
75
76
80
82
xiv
Figure 4.3: Geometry of 3D embedded Neumann boundary method with a bound-
ary point (xB, yB, zB), ghost point (xG, yG, zG), and interpolation points
(xI, yI, zI), (xII, yII, zII), (xIII, yIII, zIII) and (xIV , yIV , zIV ) . . . . . . .
Figure 4.4:
2D graph with 3 vertices (v1(x1, y1), v2(x2, y2), and v3(x3, y3), 2 straight
edges es1(≡ (v1, v2)), es2(≡ (v3, v1)) and 1 circular arch (a) or elliptical
arch (b) edge ea1(≡ (v2, v3, Oa1)) centered at Oa1(xa1, ya1).
. . . . . . .
Figure 4.5: Flow diagram of pre-computation for 3D problems with complex geome-
tries; it cuts a 3D object into 2D slices Sxy, Sxz, makes 2D graphs Gxy,
Gxz, and identiﬁes the properties of the line segments Px, Py, and Pz.
.
Figure 4.6: A6 MDO (a) Axial view with a solid cathode of radius rc in the center,
the anode with inner radius ra, vane radius rv, vane angle α1, cavity angle
α2, and a coupling horn of radius rh (b) the angle of kth cavity which is
represented by the angles θk
. . . . . . . . . . . .
1 and θk
2 , θk
2 = θk
1 + α1.
Figure 4.7: Geometrical structure of A6 MDO with key points; intersection, bound-
ary, ghost, and interpolation points required to obtain (a) x− and (b) y−
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
sweeps.
Figure 4.8:
2D model for a single cylindrical scatter at the (a) center and two (b)
symmetric and (c) asymmetric cylindrical scatters in the corners. Here
the red mark indicates the point source.
. . . . . . . . . . . . . . . . . .
86
89
90
93
94
95
Figure 4.9: Time evolution of a point source ﬁeld cos(ωt), ω = 10 with a circular
scatter of radius r = 0.5 in the center of the square domain Ω = [−1, 1]2
with a spatial step size ∆x = ∆y = 0.0156 and time step size ∆t = 0.0078. 96
Figure 4.10: Time evolution of a point source ﬁeld cos(ωt), ω = 10 with two symmetric
circular scatters of radii r1 = r2 = 0.3 in the bottom-left and top-right
corners of the square domain Ω = [−2, 2]2 with a spatial step size ∆x =
. . . . . . . . . . . . . . .
∆y = 0.0156 and time step size ∆t = 0.0078.
Figure 4.11: Time evolution of a point source ﬁeld cos(ωt), ω = 10 with two asymmet-
rical circular scatters of radii r1 = 0.3 and r2 = 0.15 in the bottom-left
and top-right corners of the square domain Ω = [−2, 2]2 with a spatial
step size ∆x = ∆y = 0.0156 and time step size ∆t = 0.0078. . . . . . . .
Figure 4.12: Fourth order convergence of 2D wave solver using a point source cos(ωt),
ω = 1 placed at (0,−1 + 3∆x), ∆x = 0.0156 on a square domain Ω =
[−1, 1]2 with a circular scatter of radius r = 0.5 at the center by imposing
Neumann and outﬂow along the circular and square boundaries. This is
a self-reﬁnement study which measures L∞ norm of the error at the ﬁnal
time T = 2.0. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
96
97
97
xv
Figure 4.13: Convergence plots for a) space using entire solution and b) space using
boundary derivative on a square domain Ω = [−1, 1]2 with a circular
scatter of radius r = 0.5 at the center by imposing Neumann and outﬂow
along the circular and square boundaries. This is a self-reﬁnement study
which measures L2 norm of the error at the ﬁnal time T = 2.0.
. . . . .
Figure 4.14: 3D model for a single spherical scatter at the center of a cube, two b)
. . . . .
symmetric and c) asymmetric spherical scatters in the corners.
98
99
Figure 4.15: Time evolution of a point source ﬁeld cos(ωt), ω = 10 with a spherical
scatter of radii r = 0.5 in the center of the cubic domain Ω = [−1, 1]3 with
a spatial step size ∆x = ∆y = ∆z = 0.0156 and time step size ∆t = 0.0078. 99
Figure 4.16: Time evolution of a point source ﬁeld cos(ωt), ω = 10 with two symmetric
spherical scatters of radii r1 = r2 = 0.3 in the bottom-left and top-
right corners of the cubic domain Ω = [−1, 1]3 with a spatial step size
∆x = ∆y = ∆z = 0.0156 and time step size ∆t = 0.0078.
. . . . . . . . 100
Figure 4.17: Time evolution of a point source ﬁeld cos(ωt), ω = 10 with two asymmet-
ric spherical scatters of radii r1 = 0.3 and r2 = 0.15 in the bottom-left
and top-right corners of the cubic domain Ω = [−1, 1]3 with a spatial step
size ∆x = ∆y = ∆z = 0.0156 and time step size ∆t = 0.0078.
. . . . . . 100
Figure 4.18: Fourth order (a) time and (b) space convergence plots using a spherical
scatter of radius r = 0.5 at the center of a cubic domain Ω = [−1, 1]3 by
imposing Neumann and outﬂow along the surface of the sphere and cubic
boundaries. This is a self-reﬁnement study which measures (a) L∞ and
(b) L2 norm of the error at the ﬁnal time T = 2.0.
. . . . . . . . . . . . 101
Figure 4.19: Time evolution of a circular source ﬁeld cos(ωt)(|x2+y2−r2
c| < 2∆x), ω =
10 in the model of A6RM with a solid cathode of radius rc = 1.58 and
anode of inner radius ra = 2.11, vane radius rv = 4.11 and the coupling
horn radius rh = 4.9 and the angular width of the vane, α1 = π
6 , and the
cavity, α2 = π
6 .
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 102
Figure 4.20: Time evolution of ﬁelds provided by six point sources in the model of
A6M with a transparent cathode of radius rc = 1.58 and anode of inner
radius ra = 2.11, vane radius rv = 4.11 and the coupling horn radius
6 , and the cavity, α2 = π
rh = 4.9 and the angular width of the vane, α1 = π
6 .102
xvi
Figure 4.21: Fourth order convergence of ﬁelds provided by a point source cos(ωt),
ω = 1 at the center (0, 0), of A6MDO with the anode of radii rh = 4.9
rv = 4.11, and ra = 2.11 and angular width of the vane, α1 = π
6 and the
cavity, α2 = π
6 . This is a self-reﬁnement study which measures L∞ norm of
the error at time T = 2.0 for ﬁxed spatial step size ∆x = ∆y = 3.13× 10−2 103
Figure 4.22: Key points used for the simulation of 3D A6 magnetron; intersection,
boundary, ghost, and interpolation points required to obtain (a) x−, (b)
y−, and (c) z− sweeps.
. . . . . . . . . . . . . . . . . . . . . . . . . . . 104
Figure 4.23: Time evolution of ﬁelds provided by six point sources in the model of 3D
A6 magnetron with a transparent cathode of radius rc = 1.58 and anode
of inner radius ra = 2.11, vane radius rv = 4.11 and the coupling horn
radius rh = 4.9, the angular width of the vane, α1 = π
9 , cavity angle,
α2 = 2π
9 , and thickness h = 6. . . . . . . . . . . . . . . . . . . . . . . . . 105
Figure 4.24: Fourth order (a) time convergence and (b) space convergence of ﬁelds
provided by six point sources in the model of 3D A6 magnetron with
a transparent cathode and the anode of radii rh = 4.9 rv = 4.11, and
ra = 2.11 and angular width of the vane, α1 = π
6 and the cavity, α2 = π
6 .
This is a self-reﬁnement study which measures L∞ norm of the error at
time T = 2.0 for a ﬁxed spatial step size of ∆x = ∆y = 3.13 × 10−2 . . . 105
Figure 5.1: Boundary surface between two regions with electric ﬁelds E1 and E2,
magnetic ﬁeld H1 and H2, electric ﬂux densities D1 and D2, and magnetic
ﬂux densities B1 and B2 respectively. Here, Js is the surface current, ρs
is the surface charge density, and n is the normal vector pointed out from
the region two (Perfect Electric Conductor). . . . . . . . . . . . . . . . . 110
Figure 5.2: A PEC square cavity rotated by an angle θ with normal vectors nL, nD,
nR, and nU along the left, down, right, and up boundaries. . . . . . . . . 112
Figure 5.3: Geometry of 2D embedded PEC boundary method with a boundary point
(xB, yB), ghost point (xG, yG), and interpolation points (xI, yI), (xII, yII),
and (xIII, yIII).
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 116
Figure 5.4: Frequency distribution for 31.42o rotated 21cm×21cm square cavity com-
puted using the measured vector potential at the point (3.36cm, 3.36cm)
for the impulse response, h = 0.084 cm . . . . . . . . . . . . . . . . . . . 121
Figure 5.5: A strong fundamental mode for 31.42o rotated 21cm×21cm square cavity
computed for diﬀerent resolutions.
. . . . . . . . . . . . . . . . . . . . . 121
xvii
Figure 5.6: Convergence plots for the natural modes (a) 1-3, (b) 3-3, (c) 1-5, (d) 3-5,
and (e) 5-5 with analytically computed frequencies (a) f = 2.2588GHz,
(b) f = 3.0305GHz, (c) f = 3.6422GHz, (d) f = 4.1650GHz, and (e)
f = 5.0508GHz for 31.42o rotated 21cm × 21cm square cavity computed
for diﬀerent resolutions.
. . . . . . . . . . . . . . . . . . . . . . . . . . . 122
Figure 5.7: Time evolution of a point source ﬁeld sin(2πf t) with f = 1GHz placed
at the center of a PEC square box of grid size 100 × 100 rotated by the
angles 0o, 31.42o, 45o, time step size ∆t = 7ps.
. . . . . . . . . . . . . . 124
Figure 5.8: Convergence plots for a) space using entire solution and b) space using
boundary derivative on a PEC square domain [0cm, 21cm]2. This study
measures L2 norm of the error at time T = 1.0ns compared with the
analytical solution, for the cases of mesh aligned and nonaligned (rotated
by 45o). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 125
Figure 5.9: Convergence plots for (a) time and (b) space on a PEC a square domain
[0cm, 21cm]2. This study measures L∞ norm of the error at time T =
1.0ns compared with the analytical solution, for the cases of mesh aligned
and nonaligned (rotated by 45o).
. . . . . . . . . . . . . . . . . . . . . 126
Figure 5.10: Time evaluation of A in (a) mesh aligned and (b) θ = 31.420 rotated
square cavities with a leak imposed by outﬂow boundary condition and
diﬀraction Q which is obtained for an oscillating point source (sin(2πf t),
f = 1GHz) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 127
Figure 5.11: Frequency spectrum of 2D A6M with vane resonators rv = 4.11cm, anode
radius ra = 2.11cm, cathode radius rc = 1.58cm, angular width of vane,
20o, and cavity angle, 40o, the grid size 128×128, time step size ∆t = 39ps,
averaging parameter β = 1.4 and dissipation coeﬃcient = 0.1.
. . . . . 128
Figure 5.12: Evolution of the transparent cathode A6M with vane resonators rv =
4.11cm, anode radius ra = 2.11cm, cathode radius rc = 1.58cm, angular
width of vane, 20o, and cavity angle, 40o, the grid size 128×128, time step
size ∆t = 39ps, averaging parameter β = 1.4 and dissipation coeﬃcient
= 0.1.
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 129
Figure 5.13: 2D view of the 18 cavity rising sun magnetron (AX9) with a cathode of
radius rc and anode of inner radius ra, short and long cavity radii rvh and
rvl.
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 130
xviii
Figure 5.14: Evolution of the transparent cathode 12 cavity rising sun magnetron
with vane resonators rvs = 4.11cm and rvl = 4.91cm, anode radius
ra = 2.11cm, cathode radius rc = 1.58cm, angular width of vane, 12o,
and cavity angle, 18o, the grid size 128 × 128, time step size ∆t = 39ps,
averaging parameter β = 1.4 and dissipation coeﬃcient = 0.1.
. . . . . 131
Figure 5.15: Evolution of the transparent cathode 18 cavity rising sun magnetron
(AX9) with vane resonators rvs = 4.11cm and rvl = 4.91cm, anode radius
ra = 2.11cm, cathode radius rc = 1.58cm, angular width of vane, 8o,
and cavity angle, 12o, the grid size 128 × 128, time step size ∆t = 39ps,
averaging parameter β = 1.4 and dissipation coeﬃcient = 0.1.
. . . . . 132
Figure 6.1: Domain Decomposition of the domain [a, b] to three 1D sub domains
j ) and right
[c0, c1], [c1, c2],and [c2, c3], each evolves own internal left (I L
(I R
j ) sweeps over the domain of length ∆cj.
. . . . . . . . . . . . . . . . 137
Figure 6.2: Log-log plot for wall time using 28 core Intel(R) Xeon(R) CPU E5-2680
v4 @ 2.40GHz processor, for the grid size N 2, N = 64, 128, 256, 1024, 2048.139
Figure 6.3: Log-log plot for speedup in using 28 core Intel(R) Xeon(R) CPU E5-2680
v4 @ 2.40GHz processor for the grid size N 2, N = 64, 128, 256, 512, 1024,
2048.
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 140
Figure 6.4: Log-log plot for eﬃciency using 28 core Intel(R) Xeon(R) CPU E5-2680
v4 @ 2.40GHz processor for the grid size N 2, N = 64, 128, 256, 512, 1024,
2048.
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 140
Figure 6.5: Task parallelism for the computation of un+1 based on Cxy and parallel
stages executing through CUDA streams.
. . . . . . . . . . . . . . . . . 142
Figure 6.6: Log-log plot for wall time using Intel(R) Xeon(R) CPU, Tesla K20, and
K80 GPUs for the grid sizeN 2, N = 32, 64, 128, 256, 512, 1024, 2048,
4096, 8192.
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 143
Figure 6.7:
Intel(R) Xeon(R) CPU vs Tesla K20 and K80 GPU speedup (log-log plot)
for the grid size N 2, N = 32, 64, 128, 256, 512, 1024, 2048, 4096, 8192.
. 143
Figure 7.1: Magnetron with eight circular hole cavity anode (this ﬁgure is from En-
cyclopedia Britannica, Inc.)
. . . . . . . . . . . . . . . . . . . . . . . . 148
Figure 7.2: Two-cavity klystron (this ﬁgure is from Reference [30]) . . . . . . . . . . 148
Figure 7.3: Evolution of the two 3D point sources sin(2πf t) with the same frequency
f = 1 at the corners of the cubical domain ([−1, 1]3), spatial step size
∆x = ∆y = ∆z = 0.013, and time step size ∆t = 0.0067.
. . . . . . . . 152
xix
Figure 7.4: Time (a) and space (b) convergence studies using 3D point source sin(2πf t)
with frequencies f = 1 at the center of the cubical domain ([−1, 1]3).
. . 152
Figure 7.5: Evolution of the two 3D point sources sin(2πf t) with frequencies 1 and
10 at the corners of the cubical domain ([−1, 1]3), spatial step size ∆x =
∆y = ∆z = 0.013, and time step size ∆t = 0.0067.
. . . . . . . . . . . . 153
xx
LIST OF ALGORITHMS
Algorithm 1
Fast Convolution . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Algorithm 2
Two Dimensional Solver . . . . . . . . . . . . . . . . . . . . . . . . .
Algorithm 3
Algorithm 4
compute u : Compute u at time tn+1
. . . . . . . . . . . . . . . . . .
linv : L−1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Algorithm 5
Compute u at time tn+1 with high-order accuracy . . . . . . . . . . .
Algorithm 6
poly highorder : Compute the coeﬃcients for high-order accuracy . .
Algorithm 7
Algorithm 8
Algorithm 9
Algorithm 10
compute u high :Compute u in 2D with high-order accuracy . . . . .
compute c : Compute operator C in 2D; Cxy = L−1
compute d :Compute operator D in 2D; Dxy = 1 − L−1
compute cd : Compute operators C and D in 2D; Dxy and Cxy
y Dx + L−1
y L−1
x Dy . . . .
. . . . . .
. . . .
x
Algorithm 11 Compute u at time tn+1 by imposing outﬂow boundary conditions
.
linv out: Compute L−1 for outﬂow boundary condition . . . . . . . .
Algorithm 12
Algorithm 13
apply bc outf low: Apply outﬂow boundary conditions . . . . . . . . .
Algorithm 14
gamma : Compute the γ coeﬃcients for a given order of accuracy . .
25
35
36
36
39
40
43
44
45
45
67
68
69
69
70
81
84
85
88
Algorithm 15 E : Compute the coeﬃcients E based on lemma 2.1
. . . . . . . . .
Algorithm 16 Compute L−1 for 1D scheme . . . . . . . . . . . . . . . . . . . . . . .
Algorithm 17 Compute wx(= L−1
Algorithm 18 Compute wyx(= L−1
Algorithm 19 Compute wzyx(= L−1
Algorithm 20 Compute wx(= L−1
Algorithm 21 Compute wyx(= L−1
x [u]) . . . . . . . . . . . . . . . . . . . . . . . . . .
y [wx]) . . . . . . . . . . . . . . . . . . . . . . . .
z [wyx]) . . . . . . . . . . . . . . . . . . . . . . .
x [A])
y [wx]) . . . . . . . . . . . . . . . . . . . . . . . . 119
. . . . . . . . . . . . . . . . . . . . . . . . . 118
xxi
CHAPTER 1: Introduction
1.1 Context and Motivation
Scientists have been researching Higher Power Microwave (HPM) tubes to improve their
performance in new technologies, as well as identifying applications in diﬀerent areas, such
as medicine, waste decomposition, radar systems, plasma screens, linear accelerators, etc.
Beyond working in the Electromagnetic (EM) experimental labs, researchers also have been
working on the development of HPM simulation tools using diﬀerent numerical schemes.
Simulations can provide insight into real experiments. Moreover, they are very ﬂexible to
implement and can test new ideas using easy, fast, and cost-eﬃcient design cycles. In order
to perform EM simulations, ﬁrst, we have to solve Maxwell’s equations. Several approaches
have been introduced to solve Maxwell’s equations. They basically fall in the broad category
of Finite Diﬀerence (FD) Methods, Finite Element Methods (FEM), Method of Moments
(MOM), or hybrid techniques which combine two or more methods ( e.g., Finite Element
Time Domain (FETD) with Finite Diﬀerent Time Domain (FDTD) or FETD with MOM).
Each of the aforementioned methods has pros and cons, For example, the FDTD-based
explicit Yee scheme [71] has been used as a benchmark method for more than 60 years because
it preserves certain physical properties very well namely, Gauss’s law and the continuity
condition. The drawback of the method is that it fails to handle curved boundaries in
Cartesian grids due to staircasing issues. Further, explicit FD schemes can become unstable
when the surface bounds a medium [52]. In contrast, the FETD method is a very prominent
approach for complex geometries but this requires the solution of large systems of linear
equations. Such methods are expensive and hard to scale due to the bottleneck of matrix
inversions. Further, FETD schemes based on the Newmark beta method are unconditionally
1
stable but exhibit only second-order accuracy [29]. In the same spirit as FEM, the MOM
was developed to address complex problems in EM. The method transforms the boundary-
value problem into a system of linear equations [42]. Consequently, it suﬀers from the same
performance issues as FEM, since these matrices need to be inverted. The MOLT based
scheme developed by Causley et.al. [15; 13] was designed to address the issues faced by FDTD
and FETD methods. It is an unconditionally stable, fast, and implicit scheme, capable of
arbitrary order accuracy in time for Dirichlet and periodic boundaries on Cartesian grids.
The method reduces to second order for outﬂow boundary condition and Neumann boundary
condition on curved boundaries. My work extended the order of accuracy for these problems
to arbitrary order, which was veriﬁed through reﬁnement studies. We show fourth-order
spatiotemporal accuracy for complex geometry problems including curved boundaries.
Further, a general framework was deﬁned for problems with complex geometry using an
embedded boundary approach. Simulation tools for HPM tubes, such as A6 magnetron, 12
and 18 cavity rising suns were developed using a PEC boundary condition. The scheme
was derived for the electromagnetic vector potential using the Lorenz gauge which imposed
the PEC boundary condition in 2D. I evaluated the simulation of A6 magnetron using a
ping test and obtained six-strong resonance modes. This is identiﬁed as a challenging task
because the existing tools may fail to give an accurate or a robust solution [39; 40; 51; 67],
or the methods are limited to second-order accuracy [32]. The Discontinuous Galerkin PIC-
based approach fails to obtain the exact six resonance modes for the simulation of an A6
magnetron due to the diﬀusivity of the scheme [39; 40]. The multiphysics simulation software
tool VSim which is based on conformal Finite Diﬀerence Time Domain (FDTD) methods
and conformal particle boundary conditions, fails to give robust solutions in the simulation
of A6 magnetrons for every resolution. The user-conﬁgurable code, MAGIC which is an EM
FDTD-PIC code, is limited to the second-order accuracy[32].
We will review the needs and challenges for the simulation of magnetrons, speciﬁcally A6
magnetron in the next section.
2
1.2 High Power Magnetrons
The Magnetron is the most tunable Higher Power Microwave (HPM) source with 30%
tunable range [62]. It was invented by Arthur Hull in 1913, and magnetrons were built on
his original principles in the 1920s and 1930s with power levels of about 100W. In the 1940s,
the magnetrons developed by Boot and Randell achieved 10kW on their initial development.
In August 1940, the cavity magnetron based new weapons development for the US defense
systems during World War-II was led by Sir Henry Tizard. The magnetron driven radar had
a great impact on the war and the subsequent technical development of the magnetrons was
mainly carried out at the MIT radiation laboratory. The postwar development of magnetron
and related technologies has been enhanced in multi-directions. Such technology is strapping,
a mode selection technology which shifts the frequency spectrum by connecting alternate
resonators and prevents mode hopping. The rising suns, designed by alternating between
two resonators’ shape to separate the mode in frequency. A new generation of high-power
frequency-agile magnetrons was introduced for electronic warfare applications. The coaxial
magnetron, introduced by Feinstein and Collier, allows stable tuning by use of a cavity
surrounding the magnetron resonator. The Magnetron Injection Gun was introduced by
French and the relativistic magnetron (high current extension of the conventional magnetron)
was developed by Bekeﬁ and Orzechowski in 1976 [5; 23; 32].
Nowadays, magnetrons, high output power (GW-class) microwave tubes, are used in nu-
merous applications such as communication systems, radar, warfare, medical X-ray sources,
and microwave ovens. To study and enhance the performance of the magnetrons, magnetron
developers need a reliable and accurate simulation method. The research studies on the sim-
ulation of magnetrons have been carried out in academic research labs, national research labs
as well as electrical-electronic-computer engineering industries. For this simulation, we need
to derive a time-dependent solution for Maxwell’s equations that is applicable for complex
geometries and should have the ability to incorporate with particles. In recent years, the
conventional FDTD particle-in-cell (PIC) method has been widely used to study magnetrons
3
as well as other microwave tubes. Even though the FDTD based Yee scheme is a well-agreed
method for Maxwell’s equations and preserves divergence-free quantities, because of an ex-
plicit scheme it suﬀers from CFL restrictions and is not suitable for curved surfaces due to the
cut-cell staircasing approximation. Another approach, the conformal ﬁnite diﬀerence time
domain (CFDTD) PIC method, treats cut-cells diﬀerently in order to maintain second-order
accuracy for curved surfaces [51]. The CFDTD Dey-Mittra algorithm adjusts the area and
lengths of cut cells in the equation of Faraday’s law in Maxwell’s equation, but its stability
is limited by the CFL condition and it wouldn’t be scalable. The ADI-FDTD is one of the
most powerful schemes to solve Maxwell’s equations because it relies on simple, one dimen-
sional, tridiagonal system solvers in contrast to a single large system solver as is required by
the Crank-Nicholson implicit method. However, the ADI-FDTD method is mostly used to
solve non-complicated domains such as rectangular domain and is not applicable for complex
geometries because of showing only ﬁrst-order accuracy for stair-stepped curved boundaries.
ADI-FDTD method combined with the Dey-Mittra embedded boundary method can model
the curved domains associated with complex structures and time step sizes beyond the CFL
limit [67]. The eﬃciency of this method depends on the one-dimensional tridiagonal solvers
which are used underneath and that will cause a major bottle-neck which eﬀects the scala-
bility of the scheme. Further, the order of accuracy is limited to second-order.
Based on these studies, in order to simulate the HPM tube, we need a robust implicit
scheme that applies to complex geometries. Hence, the MOLT based A-stable implicit scheme
introduced in [14] is chosen to simulate HPM magnetrons and imposed embedded boundary
methods to deal with complex geometries associated with these kinds of accelerator struc-
tures. We derived a time-dependent solution to Maxwell’s equations in the vector potential
form under the Lorenz gauge and imposed a perfect electric conductor (PEC) embedded
boundary method. The scheme is unconditionally stable, shows fourth-order accuracy be-
yond the CFL restriction, and is freely scalable with high performance (requires O(n) cost
for n grids). It doesn’t depend on scalability restricting matrix operations but instead uses
4
free-space Green’s function based fast convolution. We performed a cold test using our sim-
ulation study. The cold test is an experiment of classical electromagnetic/microwave studies
without including charged particles and it can be done in either the frequency or the time
domain. We determined the dispersion relation of the interacting structures or slow-wave
structures and obtained the fundamental frequency mode using ping tests.
It would be
possible to carry out a follow up hot test which will include particles.
Figure 1.1 shows the 3D view of a relativistic A6 magnetron with diﬀraction output
(MDO) [61]. It has a cylindrical cathode in the center, an anode with six vanes around the
circle symmetrically, and a coupling horn which is connected with a cavity. We simulate
the open cavity A6 MDO using our MOLT based scheme by imposing embedded Neumann
(Chapter 4)/PEC (Chapter 5) and outﬂow boundary conditions for the scheme evaluation.
Further, our simulation study includes a symmetrical A6 magnetron using PEC boundary
conditions in 2D and Neumann boundary conditions in 3D. Further, we evaluate the A6
magnetron for the impulse response and resonance mode frequencies of it. A 2D view of
an A6 magnetron without diﬀraction output is shown in Figure 1.2. This has a cathode of
radius rc and an anode with inner radius ra, vane radius rv, vane angle α1, and cavity angle
α2.
Figure 1.1: 3D Relativistic A6 magnetron with diﬀraction output (MDO) with a cylindrical
cathode in the center [61]
5
Figure 1.2: 2D view of A6 magnetron with a cathode of radius rc and the anode with inner
radius ra, vane radius rv, vane angle α1, and cavity angle α2
.
1.3 Research Goal and Objectives
We have been working towards developing a linear-time, fast higher-order, multi-dimensional,
A-stable implicit solver that can be applicable for electromagnetic (EM) problems, specif-
ically targeting plasma science. This approach uses MOLT formulation combined with an
ADI scheme. In this way, algebraic approximations of original complex multi-physics prob-
lems can be mapped to utilize a computer’s core competencies. We aim to develop several
classes for leadership distributed multi-core computing platforms based on GP-GPU that is
an O(N ) direct implicit solver, and thereby enable exploration of a range of grand-challenge
problems. We are motivated by the following objectives:
• The development of high-order outﬂow and Neumann boundary conditions in 3D and
implementing a general geometric framework.
• The development of a high-order EM solver with no CFL constraint by developing
6
suitable boundary conditions for the vector-potential form of Maxwells equations under
the Lorenz gauge. Develop a scheme for perfectly electrically conducting layers (PECs)
in 2D.
• The development of high-order multi-core 2D and 3D implicit solvers on GPGPU for
general geometries.
• The development of a scalable software solution using multi-core openMP and multi-
node MPI.
We choose classic EM problems with complex geometries to assess our solver. These
include photonic crystal waveguides, scattering of laser light on spherical objects, and sim-
ulating electron beams in an A6 relativistic magnetron.
1.4 Electromagnetic Wave Propagation
1.4.1 Electromagnetic Fields
The mathematical relationship of electromagnetic ﬁelds is governed by Maxwell’s equa-
tions. The diﬀerential form of the macroscopic Maxwell’s equations can be expressed as:
= 0
∇ × E +
∂B
∂t
∇ × H − ∂D
∂t
= J
∇ · B = 0
∇ · D = ρ
(1.1)
(1.2)
(1.3)
(1.4)
Where Equations 1.1 and 1.2 represent Faraday’s law and Ampere’s law respectively, and
Equations 1.3 and 1.4 represent Gauss’s law. Here B is the magnetic ﬂux density (Wbm−2),
D is the electric ﬂux density (Cm−2), E is the electric ﬁeld intensity (Vm−1), H is the
magnetic ﬁeld intensity (Am−1), J is the electric current density (Am−2), and ρ is the
7
electric charge density (Cm−3). In linear isotropic media, the electric ﬂux density D and the
magnetic ﬂux density B have the constitutive relations with electric ﬁeld intensity E and
magnetic ﬁeld intensity H as follows:
D = E = 0rE
B = µH = µ0µrH
(1.5)
(1.6)
where the dielectric constant is the permittivity (Fm−1), 0 is the free space permittivity
and r is the relative permittivity. Similarly, µ is the permeability, µ0 is the free space
permeability (Hm−1), and µr is the relative permeability of the media. Upon applying the
approximations (1.5 and 1.6), we get
∇ · H = 0
∇ · E =
ρ
Now consider the case where and µ do not vary with time, thus
∇ × E + µ
∇ × H −
∂H
∂t
∂E
∂t
= 0
= J
(1.7)
(1.8)
(1.9)
(1.10)
Since we have solvers for second-order wave equations, we convert the ﬁrst order Maxwell
system to second-order form.
We know the following relationship with the curl and divergence operations,
∇ × (∇ × v) = ∇(∇ · v) − ∇2v
(1.11)
Applying Equation 1.11 to E, we ﬁnd the wave equation for the electric ﬁeld in the
8
presence of source,
∇(∇ · E) − ∇2E = ∇ × (∇ × E)
(cid:17) − ∇2E = −µ(∇ × ∂H
∇(cid:16) ρ
)
∂t
∂(∇ × H)
= −µ
= −µ
= −µ
∂J
∂t
∂t
∂( ∂E
∂t + J)
∂t
+ ∇(cid:16) ρ
(cid:17)
∂t2 − µ
∂2E
∂J
∂t
∇2E − 1
c2
∂2E
∂t2 = µ
(1.12)
where the speed c = 1√
µ. Suppose there are no free charges, ρ = 0, and J = 0 then we
have source-free wave equation for the electric ﬁeld,
∇2E − 1
c2
∂2E
∂t2 = 0
(1.13)
In the same way, upon applying Equation 1.11 to H, we obtain
∇(∇ · H) − ∇2H = ∇ × (∇ × H)
(cid:18)
(cid:19)
−∇2H = ∇ ×
=
∂t
+ J
+ ∇ × J
∂E
∂t
∂(∇ × E)
∂(−µ ∂H
∂t )
+ ∇ × J
∂t2 + ∇ × J
∂t
∂2H
=
= −µ
∇2H − 1
c2
∂2H
∂t2 = −∇ × J
(1.14)
If J = 0 then we have source-free wave equation for the magnetic ﬁeld,
9
∇2H − 1
c2
∂2H
∂t2 = 0
(1.15)
We have derived a wave equation for the electric ﬁeld (1.12) and the magnetic ﬁeld (1.14)
previously. However, since our targeting application domain is plasma science, we need to
obtain electric scalar and magnetic vector potential formulation from the mixed potential
form of Maxwell’s equation which can easily deal with charged particles. We give a detailed
derivation for the vector potential formulation in Chapter 5
1.4.2 Boundary Conditions for Electromagnetic Fields
The electromagnetic ﬁelds and ﬂux densities are subject to boundary conditions; they
can be derived from the integral form of Maxwell’s equations. Let’s consider a current and
charge-free boundary s that separates the regions 1 and 2 as shown in Figure 1.3.
Figure 1.3: Boundary surface between two regions with electric ﬁelds E1 and E2, magnetic
ﬁeld H1 and H2, electric ﬂux densities D1 and D2, and magnetic ﬂux densities
B1 and B2 respectively. Here, Js is the surface current, ρs is the surface charge
density, and n is the normal vector pointed out from the region two.
Faraday’s and Ampere’s laws predict electric and magnetic ﬁelds are continuous along
the boundary, so the tangential boundary conditions can be expressed as,
ˆn × (E1 − E2) = 0
ˆn × (H1 − H2) = 0
(1.16)
Gauss’s law predicts the electric and magnetic ﬂuxes are continuous along the normal to
10
the boundary surface, so the normal boundary conditions can be expressed as,
ˆn · (D1 − D2) = 0
ˆn · (B1 − B2) = 0
(1.17)
Suppose the surface s has the induced surface electric current density Js and the induced
surface electric charge density ρs, the tangential and normal boundary conditions can be
predicted from Faraday’s, Ampere’s, and Gauss’s laws as follows,
ˆn × (E1 − E2) = 0
ˆn · (D1 − D2) = ρs
ˆn × (H1 − H2) = Js
ˆn · (B1 − B2) = 0
(1.18)
(1.19)
1.5 Overview
We give a general overview of this dissertation and draw attention to the original contri-
butions of the work. Basically, this work begins with MOLT based A-stable implicit solvers
introduced by [15] and later developed to achieve higher-order schemes [14] and adopt em-
bedded boundary methods [13]. We provide high-order outﬂow and Neumann boundary con-
ditions in 3D while targeting 3D multi-core high-order implicit EM solvers on the GP-GPU
platform. This work contributes to the development of high-order outﬂow and Neumann
boundary conditions in 3D for general geometries and 3D high-order EM solver using the
vector-potential form of Maxwells equations under the Lorenz gauge. Further, this work
implements these solvers on a multi-core computing platform, GP-GPU in order to release
a full-throttle multi-scale solver to enhance spatiotemporal performance. We now describe
the structure of the dissertation section by section.
We start with an introduction of our work in Chapter 1 including motivation (in Section
1.1) and objectives (Section 1.3).
In Chapter 2, we describe the available MOLT based
implicit scheme (in Section 2.2.1) and explain how we extend the technique to a 3D solver
11
using an ADI scheme ( in Section 2.3). We present numerical results for newly developed
3D solvers including variable speed test cases. Waveguides on a photonic crystal are used to
assess our 2D variable speed solver.
High-order temporal schemes are explained in Chapter 3 including a novel scheme for
high-order outﬂow boundary conditions described in Section 3.2. Further, this chapter in-
cludes numerical results for the proposed high-order 3D solver.
In Chapter 4, we begin with a description of the embedded method presented by [13] and
extend it to higher-order using the method given in [15] for 2D and 3D problems. Further, we
develop a general approach to deal with complex boundary problems in Section 4.3. We give
an explanation of our high-order 3D scheme in Section 4.2.3 and present numerical results
for complex boundary problems, electromagnetic wave scattering by PEC cylinders (4.4.1),
electron beams in A6 Relativistic Magnetron in 2D (4.4.3), and scattering of Laser light
on spherical objects in 3D (4.4.2) which are implemented using a fourth-order embedded
boundary scheme.
In Chapter 5, we explain our versatile approach for electromagnetic vector potential and
the implementation of PEC boundary conditions. We provide several test cases including
frequency mode analysis for A6 magnetron in 2D. Chapter 6 explains nuts and bolts of a
software solution which can be scalable using high-performance computing systems. Finally
in Chapter 7 we summarize our work that was carried out throughout this study and give a
list of possible directions that can be used for further advancement.
12
CHAPTER 2: Mathematical Model for Wave Equation Solver
2.1 Introduction
In this chapter, we review the novel impact method which we will further develop in
later chapters. The new work in this chapter is the extension to 3D. We also discuss the
advantages of this approach over other methods.
Maxwell’s equations in free-space result in a hyperbolic wave equation with ﬁnite speed of
propagation of a ﬁeld quantity. We develop a numerical scheme in 3D to solve the hyperbolic
wave equation by imposing diﬀerent boundary conditions, speciﬁcally outﬂow and embedded
Neumann and PEC boundaries. We are mainly focused on the development of an EM
solver for problems with complex geometries, which can be easily accepted on multi-core
technologies facilitated by GP-GPU computing platforms. Our main target is to derive a
time-dependent solution for Maxwell’s equations with higher-order accuracy in time and
space.
In order to obtain the solution, we choose an implicit unconditionally stable Method Of
Lines Transpose (MOLT - also known as the transverse method of lines or Rothes method) [4;
17; 38; 41; 54; 50; 31] based numerical scheme coupled with an embedded boundary approach
to deal with complex geometry problems [15; 14; 13]. This was implemented in 2D and we
extend the scheme to 3D by successful implementations for diﬀerent applications. This
approach uses a MOLT formulation combined with an Alternating Direction Implicit (ADI)
scheme. The method avoids the use of matrices that typically result from the discretization
of the spatial parts of a problem and thus eliminates the main bottleneck in scaling implicit
methods. In this scheme, a PDE is ﬁrst discretized in time, and then the resulting boundary-
value problems are solved using a Green’s function method. In particular, the inverse of the
13
resulting modiﬁed Helmholtz operator is analytically constructed and evaluated eﬃciently
using an O(N ) recursive fast convolution algorithm. Extension to multi-dimensions is formed
using an ADI scheme, and each line is solved independently. Next, we describe why this
scheme is an optimal solution.
When we use this scheme to solve the hyperbolic wave equation, the spatial information
along a line in the simulation is connecting the boundaries to all points in space along that
line to advance a computational solution in time. Along this line, every point in space knows
about the other points due to the nature of the Green’s function used in the solution. The
approximation of spatial convolution depends on the quadrature size we choose, and the
accuracy of the time derivation can be extended by using the higher-order scheme which
performs a set of spatial convolutions/sweeps in multiple computing levels. The hyperbolic
wave equation form of Maxwell’s equation is preserved by the time discretization and the
spatial convolution. Further, we proved the scheme is strictly dispersing ( no diﬀusive for-
mulation) [53; 15], and satisﬁes the Gauss’s divergence-free conditions. When we extend
the scheme to multi-D using ADI splitting, the spatial points talk to each other by moving
in each direction, and they communicate along the lines in that direction. We reduce the
splitting error by doing a couple of sweeps along each direction and taking the average in
multiple computing levels.
Historically, time-dependent solutions of the wave equation are most commonly con-
structed with the method-of-lines (MOL) approach [47; 48; 72; 59]. The MOL schemes
consist of either collocation (nodal) methods or Galerkin (modal) methods [7; 12; 20]. The
collocation methods (e.g.ﬁnite-diﬀerence, ﬁnite-volume, and ﬁnite-element methods) are used
to ﬁnd a solution at a particular point, but in Galerkin methods, the solution is projected
onto a set of basis functions and the time-dependent coeﬃcients are computed. The ap-
proach which has inﬂuenced most numerical methods in computational electromagnetics is
Finite-diﬀerence time-domain (FDTD) methods [46; 65; 19], which use an explicit scheme,
so these schemes are restricted by Courant-Friedrichs-Lewy (CFL) condition (concerning the
14
ratio of the time step to the spatial step) that limits the time step size. In 1966, Kane S. Yee
derived an FDTD based time-dependent solution for Maxwell’s equations [71]. The FDTD
Yee scheme has been used for a large number of problems in computational electromagnetics.
However, the scheme is limited to Cartesian grids and suﬀers from the staircase eﬀect on
curved boundaries. Further, it cannot be used to deal with discontinuities across material
interfaces while maintaining a high-order of accuracy. Hence, the scheme is not suitable for
curved objects or media with material interfaces [57; 19]. There have been many attempts
apart from FDTD methods [46; 65; 19] to complex geometry and raise the order of these
methods above two [21; 19; 11; 10; 18]. However, these methods have not made it into
the mainstream because they are not robust. In addition to stability issues, these methods
typically suﬀer from very small CFL [21; 19; 11; 18] conditions because of small cells at the
boundaries.
In contrast, implicit schemes break the CFL restriction. Hence, the implicit schemes
are most preferable to the problems with tightly coupled scales such as problems in plasma
science. Alternating direction implicit FDTD (ADI-FDTD) is an implicit scheme that is
used to develop an A-stable implicit Maxwell solver in several plasma-related problems [73;
58; 26; 27], but they are all second order in time. Further, the implicit Finite Element Time
Domain (FETD) method based on the Newmark beta approach is unconditionally stable,
but it gives second-order accuracy. Further, the implicit and frequency domain schemes
require cost-expensive matrix inversion. Therefore, the MOLT based scheme is chosen to
be an appropriate scheme for this simulation study in order to obtain high-order accuracy
rapidly.
In Section 2.2, we give the implicit semi-discrete solution for the 1D wave equation, in
Section 2.3 we explain how we can extend this framework to multi-dimensions.
In Sec-
tion 2.2.3, we derive equations for several applicable boundary conditions including outﬂow
boundary conditions, and Section 2.4 explains details of the higher-order scheme, and ﬁnally
Section 2.6 reports a set of test cases. The subsections of section 2.4 are laid out as follows.
15
In Section 2.4.1, we derive a family of schemes of order 2P for the one-dimensional case,
and in Section 2.4.2, we generalize the higher-order scheme for higher spatial dimensions,
producing ADI methods of order 2P which will be A-stable. Here P refers to the number of
terms taken in the MOLT expansion.
2.2 One Dimensional Implicit Wave Equation Solver
Based on the initial boundary value problem with consistent boundary conditions, the
wave equation for one spatial dimension can be written as follows,
1
c2 ∂ttu − ∂xxu = S(x, t),
x ∈ [a, b],
t < 0,
(2.1)
with initial conditions
u(x, 0) = f (x),
∂tu(x, 0) = g(x),
x ∈ [a, b],
x ∈ [a, b],
where c is the propagation speed and S(x, t) denotes a source. This is well-posed once
consistent boundary conditions are appended, such as:
1. Dirichlet boundary condition: u(a, t) = UL(t) and u(b, t) = UR(t).
2. Neumann boundary condition: ∂xu(a, t) = VL(t) and ∂xu(b, t) = VR(t).
3. Periodic boundary condition: u(a, t) = u(b, t) and ∂xu(a, t) = ∂xu(b, t).
4. Outﬂow boundary condition: ∂tu(a, t) = c∂xu(a, t) and ∂tu(b, t) = −c∂xu(b, t).
2.2.1 Semi-discretization
The semi-discrete form of the wave equation can be obtained as follows:
We start by discretizing utt using the second order time centered ﬁnite diﬀerence approxi-
16
mation,
∂ttun =
un+1 − 2un + un−1
∆t2
− ∆t2
12
∂ttttu(x, η),
η ∈ [tn−1, tn+1].
(2.2)
The Laplacian has to be evaluated at time tn+1 to obtain an implicit scheme. To perform
this, we can apply a symmetric 3-point averaging for the Laplacian because we choose a
centered ﬁnite diﬀerence stencil.
∂xxun = ∂xx
where β > 0.
un +
un+1 − 2un + un−1
β2
(cid:17) − ∆t2
β2 ∂ttxxu(x, η).
(2.3)
Using (2.2) and (2.3), we obtain the second order semi-discrete equation,
(cid:16)
∂xx − β2
(c∆t)2
(cid:17)(cid:16)
un +
(cid:17)
un+1 − 2un + un−1
β2
= − β2
(c∆t)2 − S(x, tn) + O(∆t2).
(2.4)
(cid:16)
(cid:16)
The diﬀerential operator in Equation (2.4) can be interpreted as a modiﬁed Helmholtz
operator with,
Lx[u] :=
(cid:17)
1 − 1
α2 ∂xx
u(x), α =
β
c∆t
,
x ∈ [a, b].
(2.5)
On rearranging equation (2.4), we form the modiﬁed Helmholtz equation which can be
written as,
Lx[un+1 − (2 − β2)un + un−1] = β2(cid:16)
un +
α2 Sn(cid:17)
1
.
(2.6)
We solve this modiﬁed Helmholtz equation (2.6) by inverting the Helmholtz operator L
using the free space Greens function. In this way, we start with the solution subject to the
boundary [−∞,∞] and then map it to the actual boundary [a, b] by making the boundary
adjustment terms. Next, we describe the process in detail beginning with the derivation of
17
Green’s formula. Equation (2.4) can be written as,
(cid:16)
1 − 1
(cid:17)
where ψ(x) = un+1−(2−β2)un+un−1 and s(x) = β2(cid:16)
ψ(x) = s(x),
α2 ∂xx
x ∈ [a, b],
α2 Sn(cid:17)
un+ 1
, subject to the homogeneous
(2.7)
Dirichlet boundary condition
ψ(a) = 0, ψ(b) = 0,
(2.8)
The semi-discrete form of Equation (2.6) approximates the hyperbolic wave equation
with the second-order of accuracy in time using centered ﬁnite diﬀerence, and the symmetric
formulation removes the numerical dissipation. The scheme was proved as purely dispersive
[53], so it is more favorable for long-time simulations and overcomes the spurious eﬀect of
numerical diﬀusion. Consistency of the scheme was proved for free space and with boundary
conditions. Further, the scheme was proved as an unconditionally stable scheme using Von-
Neumann analysis (see [53] for proof). Based on those characteristics/proofs of the scheme,
(cid:16)
1− 1
α2 ∂xx
(cid:17)
the hyperbolic nature of the equation is preserved.
Upon multiplying the expression
ψ(x) with a test function P (x) and integrate
(cid:16)
(cid:12)(cid:12)(cid:12)b
a
P (x)∂xψ(x)
(cid:17)
(cid:12)(cid:12)(cid:12)b
a
− ψ(x)∂xP (x)
(2.9)
it from a to b,
b(cid:90)
a
=
(cid:16)
b(cid:90)
1 − 1
α2 ∂xx
ψ(x)P (x)dx
(cid:17)
(cid:16)
(cid:17)
ψ(x)
1 − 1
α2 ∂xx
P (x)dx − 1
α2
a
Now we have the Green’s formula,
18
(cid:104)
b(cid:90)
a
(cid:16)
(cid:17)
P (x) −(cid:16)
(cid:17)
1 − 1
α2 ∂xx
1 − 1
α2 ∂xx
(cid:105)
ψ(x)P (x)
dx =
(cid:16)
1
α2
(cid:12)(cid:12)(cid:12)b
a
P (x)∂xψ(x)
ψ(x)
Suppose the function P is a free space Green’s function
(cid:16)
(cid:17)
1 − 1
α2 ∂xx
P (x, x(cid:48)) = δ(x − x(cid:48))
(cid:17)
(cid:12)(cid:12)(cid:12)b
a
− ψ(x)∂xP (x)
(2.10)
(2.11)
b(cid:90)
a
ψ(x)δ(x − x(cid:48))dx −
ψ(x(cid:48)) −
b(cid:90)
b(cid:90)
a
a
s(x)P (x, x(cid:48))dx =
s(x)P (x, x(cid:48))dx =
(cid:16)
(cid:16)
1
α2
1
α2
Hence,
P (x, x(cid:48))∂xψ(x, x(cid:48))
− ψ(x)∂xP (x, x(cid:48))
P (x, x(cid:48))∂xψ(x)
− ψ(x)∂xP (x, x(cid:48))
(2.12)
(cid:17)
(cid:12)(cid:12)(cid:12)b
(cid:17)
a
(cid:12)(cid:12)(cid:12)b
a
(cid:12)(cid:12)(cid:12)b
a
(cid:12)(cid:12)(cid:12)b
a
b(cid:90)
a
ψ(x(cid:48)) =
s(x)P (x, x(cid:48))dx +
(cid:16)
1
α2
(cid:12)(cid:12)(cid:12)b
a
P (x, x(cid:48))∂xψ(x)
− ψ(x)∂xP (x, x(cid:48))
(cid:17)
(cid:12)(cid:12)(cid:12)b
a
(2.13)
This ends up with the particular solution and boundary correction which can be obtained by
the boundary condition we chose (Dirichlet, Neumann, periodic, outﬂow etc.). Here we chose
to use the free space Greens function and enforce the boundary conditions on u(a, t) and
u(b, t) through expressions we derive for the unknown terms. Let’s apply the homogeneous
Dirichlet boundary condition (2.8),
b(cid:90)
a
ψ(x(cid:48)) =
s(x)P (x, x(cid:48))dx +
(cid:16)
1
α2
(cid:12)(cid:12)(cid:12)x=b
P (b, x(cid:48))∂xψ(x)
− P (a, x(cid:48))∂xψ(x)
(cid:17)
(cid:12)(cid:12)(cid:12)x=a
(2.14)
19
Now we have two unknowns ∂xψ(x)
and they can be obtained by solving
the systems of linear equations that are derived by the limits on Equation (2.14) to x(cid:48) → a
and x(cid:48) → b,
and ∂xψ(x)
(cid:12)(cid:12)(cid:12)x=b
(cid:12)(cid:12)(cid:12)x=a
(cid:16)
(cid:16)
1
α2
1
α2
(cid:12)(cid:12)(cid:12)x=b
(cid:12)(cid:12)(cid:12)x=b
− P (a, a)∂xψ(x)
− P (a, b)∂xψ(x)
(cid:17)
(cid:17)
(2.15)
(2.16)
(cid:12)(cid:12)(cid:12)x=a
(cid:12)(cid:12)(cid:12)x=a
b(cid:90)
b(cid:90)
a
(cid:12)(cid:12)(cid:12)x=a
(cid:12)(cid:12)(cid:12)x=b
ψ(a) =
s(x)P (x, a)dx +
P (b, a)∂xψ(x)
ψ(b) =
s(x)P (x, b)dx +
P (b, b)∂xψ(x)
a
Since here, we are imposing homogeneous boundary condition, ψ(a) = 0 and ψ(b) = 0
b(cid:90)
(cid:16) − P (a, b)
b(cid:90)
(cid:16) − P (b, b)
a
b(cid:90)
b(cid:90)
a
(cid:17)
(cid:17)
(2.17)
(2.18)
∂xψ(x)
= wα
s(x)P (x, a)dx + P (a, a)
s(x)P (x, b)dx
∂xψ(x)
= wα
s(x)P (x, a)dx + P (b, a)
s(x)P (x, b)dx
where
a
a
wα =
α2
P (a, a)P (b, b) − P (a, b)P (b, a)
Now we compute the exact free space Green’s function to substitute the test function P ,
(cid:16)
1 − 1
α2 ∂xx
(cid:17)
G(x, x(cid:48)) = δ(x − x(cid:48))
(2.19)
subject to G(x, x(cid:48)) = 0 when x → −∞ and x → ∞. We can show that the Green’s
x(cid:48)− = −α2.
function satisﬁes the continuity condition G
Consider the homogeneous solution φ, Lφ = 0 and decompose it to two solutions φ1 and φ2
for x > x(cid:48) and x < x(cid:48). So they also satisfy Lφ1 = 0 and Lφ2 = 0 and φ1 and φ2 can be
x(cid:48)− = 0 and jump conditions ∂xG
(cid:12)(cid:12)(cid:12)x(cid:48)+
(cid:12)(cid:12)(cid:12)x(cid:48)+
20
written in the form of,
φ1 = c11sinh(αx) + c12cosh(αx)
φ2 = c21sinh(αx) + c22cosh(αx)
(2.20)
(2.21)
Upon applying the continuity and jump conditions ,
c11sinh(αx(cid:48)) + c12cosh(αx(cid:48)) − c21sinh(αx) − c22cosh(αx(cid:48)) = 0
c11cosh(αx(cid:48)) − c12sinh(αx(cid:48)) + −c21cosh(αx) + c22sinh(αx(cid:48)) = −α
(2.22)
(2.23)
Let’s consider x(cid:48) = 0, then we get, c12 = c22 and c21 = c11 + α. Now we write the solutions
φ1 and φ2 in the exponential form,
(cid:16)eαx − e−αx
(cid:17)
(cid:16) eαx − e−αx
2
+ c12
(cid:16)eαx + e−αx
(cid:17)
(cid:17)
(cid:16)eαx + e−αx
2
+ c12
2
φ2 = (c11 + α)
2
φ1 = c11
(cid:17)
(2.24)
(2.25)
For x > x(cid:48), limx(cid:48)→∞ φ1 = 0, we choose c11 = −c12 such that eαx vanishes, so,
φ1 = −c11e−αx
Similarly, for x < x(cid:48), limx(cid:48)→−∞ φ2 = 0, we choose c11 = −α/2 such that e−αx vanishes,
so,
Hence, the Green’s function G(x, x(cid:48)) = α
φ2 =
eαx
α
2
2 e−α|x−x(cid:48)|, and therefore the inverted modiﬁed
Helmholtz operator can be written as,
L−1
x [u] =
(cid:124)(cid:123)(cid:122)(cid:125)
Ix[u]
Particular solution
(cid:124)
+ Ae−α(x−a) + Be−α(b−x)
.
(2.26)
(cid:123)(cid:122)
(cid:125)
Homogeneous solution
21
where,
b(cid:90)
a
α
2
Ix[u] :=
e−α|x−y|u(y)dy,
x ∈ [a, b],
and
(cid:12)(cid:12)(cid:12)x=a
∂xψ(x)
A = −ψ(a)
2
− 1
2α
,
B = −ψ(b)
2
+
1
2α
(cid:12)(cid:12)(cid:12)x=b
∂xψ(x)
(2.27)
(2.28)
A and B also can be written in integral form as follows,
a(cid:90)
−∞
A =
α
2
e−α(a−y)u(y)dy,
B =
α
2
∞(cid:90)
b
e−α(y−b)u(y)dy.
(2.29)
The coeﬃcients A and B of the homogeneous solution are determined by applying bound-
ary conditions. Now, equation (2.6) can be written using the inverted Helmholtz operator,
L−1 as follows,
un+1 − 2un + un−1 = −β2un + β2L−1
x [un] + β2L−1
x
(2.30)
α2 Sn(cid:105)
(cid:104) 1
.
α2 Sn(cid:105)
(cid:104) 1
The deﬁnition can additionally be modiﬁed to include boundary corrections on a ﬁnite
domain (see [15]).
We also introduce a new operator D related to equation (2.26) that we are going to use
frequently in higher-order and higher dimensional solutions.
un+1 − 2un + un−1 = −β2Dx[un] + β2L−1
x
, Dx[u] := u − L−1
x [u].
(2.31)
The computational cost of evaluating Ix[u] in equation (2.26) is typically O(N 2); however it
can be computed eﬃciently with second order accuracy at O(N ) cost using the fast convo-
lution method detailed next.
22
2.2.2 Fast Convolution Algorithm
We are going to see how we can compute the particular solution using a fast convolution
algorithm. The integration for the particular solution, I[u](x) can be performed by decom-
posing it into left (I L[u](x)) and right (I R[u](x))) oriented integrals as shown in the Figure
2.1
Figure 2.1: Fast convolution line integration which decomposes left I L[u](x) and right
I R[u](x)) oriented integrals between the domain a and b with the spatial step
size, ∆xj(= xj − xj−1).
We begin by splitting the integration at y = x, and integrating from a to x and x to b;
i.e.,
I[u](x) = I L[u](x) + I R[u](x)
(2.32)
x(cid:90)
a
α
2
I L[u](x) =
u(y)e−α(x−y)dy,
I R[u](x) =
b(cid:90)
x
α
2
u(y)e−α(y−x)dy,
so that both integrands decay exponentially away from x. Additionally, they satisfy expo-
nential recurrence relations, which means that
I L[u](xj) = e−α∆xj I L[u](xj−1) + J L[u](xj)
j = 1 to N
J L[u](xj) =
α
2
e−αyu(x − y)dy,
(2.33)
∆xj(cid:90)
0
23
I R[u](xj) = e−α∆xj I R[u](xj+1) + J R[u](x),
j = N − 1 to 0
∆xj(cid:90)
J R[u](xj) =
α
2
e−αyu(x + y)dy.
(2.34)
0
where the ∆xj = xj − xj−1 which is the spatial grid size. The recursive representation
and the localization reduces the computational complexity.
In particular, using a second
order polynomial interpolant in place of u in the integral allows us to compute I[u](x) in
O(N ) time.
The local integrals ((2.33) and (2.34)) may be evaluated using quadrature or analytically.
For example, we can derive a second order quadrature using compact Simpson’s rule and
Lemma 2.1,
j ≈ P uj + Quj−1 + R (uj+1 − 2uj + uj−1) ,
J L
j ≈ P uj + Quj+1 + R (uj+1 − 2uj + uj−1) ,
J R
(2.35)
with quadrature weights,
P = 1 − 1 − d
v
, Q = −d +
1 − d
v
, R =
1 − d
v2 − 1 + d
2v
,
where v = α∆x, d = e−v, and uj denotes the solution at grid point xj
This method can be summarized as follows (Algorithm 1),
2.2.3 Boundary Condition
We can determine the Homogeneous coeﬃcients (A, B) from the imposed boundary con-
ditions and the end/boundary point values of the particular solution (Ix(xstart) and Ix(xend)
where xstart, and xend are two boundary points). Let us work with several common boundary
conditions in 1D ﬁrst. Then these can be expandable to higher dimension.
24
Algorithm 1 Fast Convolution
1: un - array of u values at time tn,
2: ν - array of weighted nodes along the speciﬁc direction,
3: expweight - array of exponentially weighted nodes along the speciﬁc direction,
4: ps - order of accuracy in space
5: for j = 1 to n do
6:
Compute JL(j + 1) and JR(j) via quadrature (see equation (2.35) for second order
quadrature) or analytical integration using expweight and ps
IL(j + 1) = d(j)IL(j) + JL(j + 1)
IR(n − j + 1) = d(n − j + 2)IR(n − j + 2) + JL(n − j + 1)
7: end for
8: d = e−ν
9: for j = 1 to n do
10:
11:
12: end for
13: for j = 1 to n do
I = 1
14:
2
15: end for
(cid:0)IL(j) + IR(j)(cid:1)
2.2.3.1 Dirichlet Boundary Condition
Let us begin with Dirichlet boundary conditions. Evaluating the semi-discrete solution
deﬁned in equation (2.30) at x = a and b, we get.
UL(tn+1) − 2UL(tn) + UL(tn−1) = −β2UL(tn) + β2
UR(tn+1)−2UR(tn)+UR(tn−1) = −β2UR(tn)+β2(cid:16)
(cid:20)
(cid:18)
(cid:104)
(cid:21)
1
α2 Sn
α2 Sn(cid:105)
1
I
un+1 +
(a) + A + Be−α(b−a)
I
un+1+
(b)+Ae−α(b−a)+B
, (2.37)
(cid:19)
,
(2.36)
(cid:17)
We can rearrange the linear system by unknown and known values,
A + µB = −wD
a ,
µA + B = −wD
b .
Solving the linear system for the unknowns A and B gives,
(wD
a − µwD
b )
(µ2 − 1)
A =
, B =
(wD
b − µwD
a )
(µ2 − 1)
,
(2.38)
25
where,
(cid:104)
(cid:104)
wD
a = I
un +
wD
b = I
un +
1
α2 Sn(cid:105)
α2 Sn(cid:105)
1
(a) − 1
β2
(a) − 1
β2
(cid:16)
(cid:16)
(cid:17)
(cid:17)
,
(2.39)
(2.40)
UL(tn+1) + (β2 − 2)UL(tn) + UL(tn−1)
UR(tn+1) + (β2 − 2)UR(tn) + UR(tn−1)
,
and µ = e−α(b−a).
2.2.3.2 Neumann Boundary Condition
For Neumann boundary condition, we obtain the identities, I
(b) =
−αI(b) using the characteristics of the integral solution (equation (2.27)); speciﬁcally, all
(a) = αI(a),
I
(cid:48)
(cid:48)
dependence on x is on the Greens function which is a simple exponential function. Diﬀer-
entiating the semi-discrete solution (2.30), and applying the Neumann boundary conditions
at x = a and b yields,
VL(tn+1)−2VL(tn)+VL(tn−1) = −β2VL(tn)+β2(cid:16)
I
(cid:104)
un+1+
and
VR(tn+1)−2VR(tn)+VR(tn−1) = −β2VR(tn)+β2(cid:16)
(cid:104)
I
un+1+
1
α2 Sn(cid:105)
α2 Sn(cid:105)
1
(a)+A+Be−α(b−a)(cid:17)
, (2.41)
(cid:17)
. (2.42)
(b)+Ae−α(b−a)+B
We can rearrange the linear system by unknown and known values,
A − µB = wN
a ,
−µA + B = wN
b .
Solving the linear system for the unknowns A and B gives,
A =
(wN
a + µwN
b )
(µ2 − 1)
, B =
(wN
b + µwN
a )
(µ2 − 1)
,
(2.43)
26
where,
(cid:104)
(cid:104)
wN
a = I
un +
wN
b = I
un +
1
α2 Sn(cid:105)
α2 Sn(cid:105)
1
(a) − 1
β2
(a) − 1
β2
(cid:16)
(cid:16)
2.2.3.3 Periodic Boundary Condition
VL(tn+1) + (β2 − 2)VL(tn) + VL(tn−1)
(2.44)
VR(tn+1) + (β2 − 2)VR(tn) + VR(tn−1)
.
(2.45)
(cid:17)
(cid:17)
,
By applying the periodic boundary conditions on the semi-discrete solution (2.30), we
obtain,
I
(cid:104)
α2 Sn(cid:105)
1
(cid:104)
un +
(a) + A + µB = I
un +
(a) + µA + B,
and
α
(cid:104)
(cid:16)
I
un +
α2 Sn(cid:105)
1
(cid:17)
(cid:16) − I
= α
(a) − A + µBα
(cid:104)
un +
(cid:17)
.
(a) − µA + Bα
(2.46)
(2.47)
1
1
α2 Sn(cid:105)
α2 Sn(cid:105)
(cid:104)
(cid:104)
We can rearrange the linear system to obtain,
(µ − 1)A + (1 − µ)B = I
(1 − µ)A + (1 − µ)B = I
un +
un +
(a) − I
un +
(a) + I
un +
(cid:104)
(cid:104)
1
α2 Sn(cid:105)
α2 Sn(cid:105)
1
1
α2 Sn(cid:105)
α2 Sn(cid:105)
1
(b),
(b).
Upon solving these linear equations we get,
(cid:104)
α2 Sn(cid:105)
(cid:104)
α2 Sn(cid:105)
I
A =
un + 1
(1 − µ)
(b)
I
, B =
un + 1
(a)
(1 − µ)
.
(2.48)
2.2.3.4 Outﬂow Boundary Condition
We can extend the deﬁnition of the outﬂow boundary conditions to the domains exterior
to [a, b]. Assuming that our initial condition has compact support in [a, b], after time t = tn,
the domain of dependence of un(x) extends to [a − ctn, b + ctn], since the propagation speed
27
is c. Now the solution of L−1 ((2.26)) can be written as.
L−1[un(x)] =
α
2
where
e−α|x−y|un(y)dy = I[u(x)] + Ane−α(x−a) + Bne−α(b−x).
b+ctn(cid:90)
a−ctn
An :=
α
2
a(cid:90)
b+ctn(cid:90)
a−ctn
e−α(a−y)un(y)dy,
Bn :=
α
2
e−α(y−b)un(y)dy.
(2.49)
(2.50)
b
Now we turn these spatial integrals into time integrals which exist at precisely the end points
x = a and b. First we consider outﬂow along the right boundary, x > b and assume this
region contains only right traveling waves, u(x, t) = u(x − ct). By tracing backward along a
characteristic ray we ﬁnd,
u(b + y, t) = u(b, t − y/c), y > 0.
Hence,
ctn(cid:90)
e−α(y)un(b + y, tn)dy =
0
tn(cid:90)
0
αc
2
Bn =
α
2
e−α(cs)un(b, tn − s)ds.
Now, we can impose outﬂow boundary conditions using the behavior of u at x = b. A
temporal recurrence relation due to the exponential will be,
tn−1(cid:90)
0
e−αcsu(b, tn − s)ds + e−αc∆t
e−αcsu(b, tn−1 − s)ds,
e−βzu(b, tn − z∆t)dz + e−βBn−1.
(2.51)
∆t(cid:90)
1(cid:90)
0
Bn =
αc
2
=
β
2
0
28
where β = αc∆t. Therefore, the boundary coeﬃcient Bn can be computed locally in
both time and space. We choose a quadratic interpolant in terms of u to have second order
accuracy.
u(b, tn − z∆t) = un(b) − z
2
(cid:16)
(cid:17)
un+1(b) − un−1(b)
(cid:16)
+
z2
2
un+1(b) − 2un(b) − un−1(b)
(cid:17)
.
(2.52)
Using the Equation (2.52) would give the second order approximation for outﬂow. To
compute the values that come from this approximation, we will make use of the following
lemma which comes from integration by parts (see [15] for proof),
Lemma 2.1. For integers m ≥ 0 and real v > 0,
1(cid:90)
Em := v
e−vzdz =
zm
m!
1
vm
(cid:16)
(cid:17)
1 − e−vPm(v)
m(cid:88)
l=0
vl
l!
where Pm(v) =
0
is the Taylor series expansion of order m of ev.
Using this lemma and integrating the expression (2.52) analytically, we can get,
Bn = e−βBn−1 + γ0un+1(b) + γ1un(b) + γ2un−1(b),
(2.53)
where,
γ0 = E2(β) − 1
E1(β),
2
γ1 = −2E2(β) + E0(β),
γ2 = E2(β) +
1
2
E1(β).
and
(2.54)
There are two unknowns in this equation (2.53), Bn and un+1, so we need one more equation,
29
that can be derived by evaluating equation (2.30) at x = b,
un+1(b) − 2un(b) + un−1(b) = −β2(cid:16)
un(b) −(cid:0)I(b) + µAn + Bn(cid:1)(cid:17)
.
(2.55)
Upon solving these two linear equations (2.53) and (2.55) we can obtain,
−Γ0µAn + (1 − Γ0)Bn = e−βBn−1 + Γ0I(b) + Γ1un(b) + Γ2un−1(b),
(2.56)
where
Γ0 =
β2
2
γ0, Γ1 = γ1 − γ0(β2 − 2), Γ2 = γ2 − γ0.
Similarly, by considering outﬂow at the other end x < a, we get
(1 − Γ0)An − Γ0µBn = e−βAn−1 + Γ0I(a) + Γ1un(a) + Γ2un−1(a).
(2.57)
Solving the linear system of equations (2.56) and (2.57) gives,
(1 − Γ0)wOut
a + µΓ0wOut
b
(1 − Γ0)2 − (µΓ0)2
, Bn =
An =
where
(1 − Γ0)wOut
b + µΓ0wOut
a
(1 − Γ0)2 − (µΓ0)2
,
(2.58)
a = e−βAn−1 + Γ0I(a) + Γ1un(a) + Γ2un−1(a),
wOut
b = e−βBn−1 + Γ0I(b) + Γ1un(b) + Γ2un−1(b)
wOut
According to the solution of the coeﬃcients An and Bn, we realize that we need to know
their values at the previous time step(An−1 and Bn−1). For utilization purposes, we set
A0 = B0 = 0 .
30
2.3 Multi-Dimensional Implicit Wave Equation Solver using ADI
Scheme
The extension to multi-dimension is formed using an alternating direction implicit (ADI)
scheme, and each line is solved independently.
The multi-dimensional wave equation can be represented using the initial boundary value
problem,
1
c2
∂2u(x, t)
∂t2 − ∇2u(x, t) = S(x, t), x ∈ Ω,
t > 0,
(2.59)
u(x, 0) = f (x), x ∈ Ω,
∂tu(x, 0) = g(x), x ∈ Ω.
with consistent boundary conditions, u(x, t) = h(x, t), x ∈ ∂Ω, t > 0, source term S(x, t),
and wave speed c
The multi-dimensional implicit scheme uses an ADI scheme [15], and each ADI line is solved
independently. Consider a scheme for three spatial dimensions,
1 − 1
=
+
(cid:16)
1
α4
∂2
∂z2
∂2
∂y2
(cid:17)(cid:16)
∂x2 +
∂2
∂x2
(cid:17)
(cid:17)(cid:16)
∂2
∂y2 +
1 − 1
α2
(cid:16) ∂2
α2∇2 = 1 − 1
α2
(cid:17) − 1
(cid:16) ∂4
1 − 1
α2
α2∇2 = LxLyLz + O(cid:0)(c∆t)4(cid:1)
∂x2∂y2 +
∂x2∂z2 +
1 − 1
α2
∂2
∂z2
∂y2∂z2
α6
∂4
∂4
(cid:17)
∂6
∂x2∂y2∂z2
(2.60)
Hence,
1 − 1
where Lx, Ly, and Lz are one dimensional univariate modiﬁed Helmholtz operators [15]
31
applied in the indicated spatial variable. Now the semi-discrete equation,
(cid:104)
β2un + un+1 − 2un + un−1(cid:105)
LxLyLz
= β2un +
β2
α2 Sn(x, t)
(2.61)
Upon inverting the modiﬁed Helmholtz operators Lx, Ly, and Lz, we have a semi-discrete
solution,
un+1 − 2un + un−1 = −β2Dxyz[un] + β2L−1
z L−1
y L−1
x
α2 Sn(cid:105)
(cid:104) 1
(x, y, z)
(2.62)
where the three dimensional operator is, Dxyz[u] := u − L−1
z L−1
y L−1
x [u],
while in two dimensions it is
un+1−2un+un−1 = −β2Dxy[un]+β2L−1
y L−1
x
α2 Sn(cid:105)
(cid:104) 1
(x, y), Dxy[u] := u−L−1
y L−1
x [u]. (2.63)
Inversion of the operator L−1
is performed sequentially leading to the usual x, y and z
“sweeps” of the ADI algorithm that is represented by a combination of the operator L−1
γ .
γ
Because the inverse operators are deﬁned along lines, it is quite natural to discretize 2D and
3D regions along Cartesian lines. However, the endpoints of these lines are not restricted to
residing at mesh points and can always be chosen to lie on the boundary ∂Ω. For example,
the x, y, boundary points for the x-sweep and y-sweep for a circle are shown in Figure 2.2.
We ﬁrst perform x-sweeps along with x-lines with actual boundary ends and obtain the
intermediate solution and then apply y-sweeps over the intermediate solution, along with y-
lines with actual boundary ends. The xy-sweeps form a complete circular boundary (Figure
2.2 - (c)). Similarly, we can perform y-sweeps ﬁrst then x-sweeps next and take an average
to obtain the solution
The general approach for implementation of the 3D ADI scheme follows the steps shown
below: Assume the number of x, y, and z lines are nx, ny, and nz respectively.
At each time step,
32
(a) x-lines
(b) y-lines
Figure 2.2: (a) x-, (b) y- and (c) xy lines with exact circular boundary points on the mesh
lines that are used for the ADI x (a) and y (b) sweeps.
(c) xy-lines
1. Perform the x-sweep
Operate L−1 on u(x, y, z) along x, and store the result into a temporary variable
wx(x, yi, zk) for 1 ≤ k ≤ nz and 1 ≤ i ≤ ny. The boundary conditions are imposed at
x = xa(ik) and xb(ik).
2. Perform the y-sweep
Operate L−1 on wx(x, y, z) along y, and store the result into a temporary variable
wyx(xj, y, zk) for 1 ≤ k ≤ nz and 1 ≤ j ≤ nx. The boundary conditions are imposed
33
at y = ya(jk) and yb(jk).
3. Perform the z-sweep
For 1 ≤ i ≤ ny and 1 ≤ j ≤ nx using wyx(x, y, z), solve for the equation un+1 =
2un + un−1 − β2Dxyz, where Dxyz = un − L−1
z [wyx]. The boundary conditions are now
applied at z = za(ij) and zb(ij).
In order to improve the accuracy of the ADI solve, the inversion of the x, y and z
Helmholtz operators is symmetrized, by averaging the results of x—y —z, y —z —x, and z
—x —y solves. That means we need to apply the operators Dxyz, Dyzx, and Dzyx and then
take the average.
We give the detailed algorithm for the two-dimensional implicit wave equation solver in
Algorithm 2 and relevant supporting functions are given in Algorithms 3 and 4. Algorithm
3 computes solution u at time tn+1 using our two dimensional implicit solver that calls
the function LINV given in Algorithm 4 to apply the operator L−1 to perform x and y
sweeps. LINV may be written in two functions for x and y sweeps separately. The function
applyBC called in Algorithm 4 is used to ﬁnd the homogeneous coeﬃcients An and Bn using
appropriate boundary conditions. Those are retrieved as the ﬁrst and second elements of the
vector H respectively. Since we have to use the value of the coeﬃcients at the previous time
step (tn−1), An−1 and Bn−1 to compute An and Bn for outﬂow boundary condition, these
algorithms, however, need to be modiﬁed with minor changes by passing reference type
parameters to the function applyBC. The parameter should be deﬁned within the main
Algorithm 2 and be called via computeU . Further, although Algorithm 2 would typically
be used to deal with problems with rectangular boundaries, it can be extended to adopt
problems with randomly placed boundary points in each grid line such as when dealing with
a circular boundary using the embedded boundary method. Here, we need to use matrices
x and y instead of vectors and deﬁne additional vectors to keep boundary points of each
horizontal and vertical grid line.
34
Algorithm 2 Two Dimensional Solver
1: xA, xB and yA , yB - boundary points along x and y respectively,
2: nx, ny - number of grid points along x and y respectively,
3: ∆t - time step size,
4: T - total time,
5: c - wave speed,
6: β - averaging parameter,
7: ps -order of accuracy in space.
8: x = grid vector(xA, xB, nx)
9: y = grid vector(yA, yB, ny)
10: set initialvalue(un−1, t0)
11: set initialvalue(un, t1)
12: α = β
c∆t
13: nt = T
∆t
14: µx = e−α(x(n)−x(0))
15: µy = e−α(y(n)−y(0))
16: expAx = e−α(x−x(0))
17: expBx = e−α(x(n)−x)
18: expAy = e−α(y−y(0))
19: expBy = e−α(y(n)−y)
20: νx = α(x(2 : nx) − x(1 : nx − 1))
21: νy = α(y(2 : ny) − y(1 : ny − 1))
22: expweightx = exp weights(νx, p)
23: expweighty = exp weights(νy, p
24: for tn = 1 to nt do
25:
26:
27:
28: end for
compute u(un−1, un, nx, ny, β, µx, µy, xA, xB, yA, yB, expAx, expBx,
=
un+1
expAy, expBy, νx, νy, expweightx, expweighty, ps)
un−1 = un
un = un+1
2.4 Arbitrary Temporal Order Accuracy
So far, we have derived multi-dimensional semi-discretizations of the wave equation with
second-order accuracy in time. In order to take large time steps in problems, we need higher-
order accuracy in time. While there is no stability restriction placed on our A-stable scheme,
considerations of accuracy present themselves when the CFL number becomes large. Indeed,
when large time steps are taken, the anisotropies introduced by the dimensional splitting are
very pronounced.
35
Algorithm 3 compute u : Compute u at time tn+1
1: un−1, un - array of u values at time tn−1 and tn respectively,
2: nx, ny - number of grid points along x and y respectively,
3: β - averaging parameter,
4: µx = e−α(x(n)−x(0)),
5: µy = e−α(y(n)−y(0)),
6: xA, xB and yA , yB - boundary points along x and y respectively,
7: expAx, expBx - exponential vector of grid distance from left boundary and right boundary
respectively along x,
8: expAy, expBy - exponential vector of grid distance from bottom boundary and up bound-
ary respectively along y,
9: νx, νy - array of weighted nodes along x and y respectively,
10: expweightx, expweighty - array of exponentially weighted nodes along x and y respec-
tively,
11: ps -order of accuracy in space
12: linv x = linv(u1, ny, β, µx, xA, xB, expAx, expBx, νx, expweightx, ps, 0)
13: linv yx = linv(Linvx, nx, β, µy, yA, yB, expAy, expBy, νy, expweighty, p, 1)
14: Dxy = un − linv yx
15: un+1 = 2un + un−1 − β2Dxy
Algorithm 4 linv : L−1
1: un - array of u values at time tn,
2: n - number of lines that need to be sweep,
3: bdryA, bdryB - boundary points along the speciﬁc line,
4: expA expB- exponential vector of grid distance from left boundary and right boundary
respectively along the line,
I = f astconvolution(u(i, :), ν, expweight, ps)
H = apply bc(u(bdryA), u(bdryB), I(bdryA), I(bdryB), β, bdryA, bdryB, µ)
Linv(i, :) = I + H(1)expA + H(2)expB
5: ν - array of weighted nodes along the line,
6: expweight - array of exponentially weighted nodes along the line,
7: p -order of accuracy in space,
8: dir - determine which sweep is going to do, along x or y determined by 0 or 1 respectively
9: if dir == 0 then
10:
11:
12:
13:
14:
15: else
16:
17:
18:
19:
20:
21: end if
I = f astconvolution(u(:, j), ν, expweight, ps)
H = apply bc(bdryA), u(bdryB), I(bdryA), I(bdryB), β, bdryA, bdryB, µ)
Linv(:, j) = I + H(1)expA + H(2)expB
end for
for i = 1 to n do
end for
for j = 1 to n do
36
In this section, we discuss A-stable schemes of arbitrary order using a MOLT formula-
tion of the wave equation, and by implicitly including higher-order derivatives. However,
we construct the derivatives using a novel approach: recursive convolution. The cornerstone
of our method lies in the fact that by using recursive applications of the convolution oper-
ators introduced in [15], the inversions of higher derivatives can be performed analytically.
So the resulting scheme is made explicit, even at the semi-discrete level. In constructing
the analytical convolution operators, we incorporate the boundary conditions directly. In
this way, without additional complexity Dirichlet and periodic boundary conditions can be
implemented to higher-order. Since dealing with outﬂow boundary conditions depend on
previous time step values at the boundaries, we need to do a little more work to implement
it in higher-order. We use the recursive convolution algorithm which consumes O(N ) oper-
ations. So, the schemes we obtain in d = 1, 2 and 3 dimensions will achieve accuracy 2P in
O(P dN ) operations per time step.
2.4.1 Arbitrary Order One Dimensional Scheme
A higher-order scheme can be achieved by including more spatial derivatives in the nu-
merical scheme. We choose a Lax-Wendroﬀ approach to increase the temporal accuracy of
our second order semi-discrete solution. This approach requires us to exchange time deriva-
tives with spatial derivatives in the Taylor expansion. Let’s start with the semi-discrete wave
equation and apply the Lax-Wendroﬀ procedure. We have
un+1 − 2un + un−1 = 2
(∂tt)m un
(2.64)
Hence, by using the fact,
(∂tt)m u =(cid:0)c2∂xx
∆t2m
(2m)!
∞(cid:88)
(cid:1)m u, m ≥ 1.
m=1
37
un+1 − 2un + un−1 = 2
= 2
∞(cid:88)
∞(cid:88)
m=1
m=1
(c∆t)2m
(2m)!
(∂xx)m un
(cid:18)∂xx
(cid:19)m
β2m
(2m)!
α2
un.
(2.65)
Diﬀerential operators (∂xx)m have to be approximated properly. To do this, we can
consider the convolution operator D from (2.31). We begin by considering the equation,
(cid:20)(cid:18)∂xx
(cid:19)m(cid:21)
α2
F
(cid:18) k
(cid:19)2m
α
= (−1)m
where F denotes the Fourier transform. Now, we need to ﬁnd (k/α)2; it can be formed using
D as follows,
ˆD := F [D] = 1 − F(cid:2)L−1(cid:3) = 1 −
1 +(cid:0) k
1
α
(cid:1)2 =
(cid:0) k
(cid:1)2
1 +(cid:0) k
α
α
(cid:1)2 .
Solving for the quantity (k/α)2 in terms of ˆD, gives an expression for all even derivatives as
follows,
(cid:18) k
(cid:18) k
α
α
(cid:19)2
(cid:19)2m
=
=
ˆD
1 − ˆD
(cid:32) ˆD
1 − ˆD
=
∞(cid:88)
(cid:33)m
p=1
ˆDp
=
and
(cid:19)
(cid:18) p − 1
m − 1
∞(cid:88)
p=m
ˆDp,
By inserting these into the Taylor expansion (2.65), we obtain
∞(cid:88)
m=1
(cid:19)
(cid:18) p − 1
m − 1
∞(cid:88)
p=m
Dp[un].
un+1 − 2un + un−1 =
(−1)m 2β2m
(2m)!
Upon reversing the order of summation, we have
un+1 − 2un + un−1 =
Ap(β)Dp[un],
(2.66)
∞(cid:88)
p=1
38
where Ap(β) is a polynomial of coeﬃcients in β2,
Ap(β) = 2
(−1)m β2m
(2m)!
p(cid:88)
m=1
(cid:19)
(cid:18) p − 1
m − 1
.
(2.67)
From this scheme (2.4.1), 1D second (identical to the original second order scheme (2.31))
and fourth order equations are
un+1 − 2un + un−1 = −β2D[un]
un+1 − 2un + un−1 = −β2D[un] −
(cid:18)
β2 − β4
12
(cid:19)
D2[un]
(2.68)
(2.69)
Algorithm 3 can be modiﬁed as follows to be able to compute the solution with higher-
order accuracy in 1D.
Algorithm 5 Compute u at time tn+1 with high-order accuracy
1: un−1, un - array of u values at time tn−1 and tn respectively,
2: n - number of points along the line,
3: β - averaging parameter,
4: µ = e−α(x(n)−x(0)),
5: bdryA, bdryB - boundary points along the speciﬁc line
6: expA, expB - exponential vector of grid distance from left boundary and right boundary
respectively along the line,
7: ν - array of weighted nodes along the line,
8: expweight - array of exponentially weighted nodes along the line,
9: ps -order of accuracy in space,
10: pt -order of accuracy in time
11: P = pt
2
12: Dterms = 0
13: for k = 1 to P do
14: D = u − linv(u, n, β, µ, bdryA, bdryB, expA, expB, ν, expweight, ps)
15: Dterms = Dterms + poly highorder(β, P )D
u = D
16:
17: end for
18: un+1 = 2un − un−1 + Dterms
Temporal cost for a scheme of order P is O(P N ) per time step for N spatial points.
Since the local truncation errors are O((c∆t/β)2P +2), the error constant will decrease with
39
Algorithm 6 poly highorder : Compute the coeﬃcients for high-order accuracy
1: β - averaging parameter,
2: k - order of accuracy at current stage
3: coef = 0;
4: for m = 1 to k do
5:
6: end for
7: coef = 2coef
coef = coef + (−1)m β2m
(cid:1)
(cid:0) k−1
m−1
(2m)!
increasing β. However, it should be limited with an upper value to maintain stability. The
stability of this scheme was proved in [15] using Von-Neumann analysis and they obtained
an upper limit of β for several even orders of temporal accuracy which is summarized in
Table 2.1. We can notice that βmax decreases with increasing P
P
Order
βmax
1
2
2
2
4
3
6
4
8
5
10
1.4840
1.2345
1.0795
0.9715
Table 2.1: The maximum values βmax for which the P th order scheme remains A-stable [14].
We need to set appropriate initial conditions to achieve the expected order. Suppose
we would like to get P th order accuracy, the initial value should be computed to O(∆t2P ).
For our 3-step scheme (2.66), u0 and u1 have to be initialized. They may be computed
analytically. However, in general, we know the exact value of u0 (= f (x)), and we need to
approximate the value for u1 = u(x, ∆t). To do so, we can apply the Lax-Wendroﬀ procedure
in Taylor expansion, we obtain
40
m=0
∞(cid:88)
∞(cid:88)
∞(cid:88)
∞(cid:88)
m=0
m=0
u1 =
=
=
=
(2m)!
t u0
∂m
∆tm
m!
(cid:18) ∆t2m
(cid:18)(c∆t)2m
(cid:18)(c∆t)2m
(2m)!
(2m)!
m=0
(cid:19)
∆t2m+1
(2m + 1)!
∂2m+1
t
u0
∂2m
t u0 +
∂2m
x u0 +
(c∆t)2m+1
(2m + 1)!
∂2m
x
1
c
∂tu0
∂2m
x f (x) +
(c∆t)2m+1
(2m + 1)!
∂2m
x
1
c
(cid:19)
(cid:19)
g(x)
.
(2.70)
where g(x) = ut(x, 0). This expansion can now be truncated at O(∆t2P ) to have P th
order of accuracy.
2.4.2 Extension to Higher Dimensions for Arbitrary Order Accuracy
We derive higher-order accurate solutions for the wave equation in higher dimensions
using an ADI scheme. We begin with the expansion of Equation (2.64) by introducing the
Laplacian in the Lax-Wendroﬀ procedure.
un+1 − 2un + un−1 = 2
∆t2m
(2m)!
(∂m
tt ) un = 2
∞(cid:88)
m=1
∞(cid:88)
m=1
β2m
(2m)!
(cid:18)∇2
(cid:19)m
α2
un.
Before approximating higher-order powers of the Laplacian operator, consider univariate
modiﬁed Helmholtz operators, and their corresponding D operators as given below,
Lγ := 1 − ∂2
γ
α2 , Dγ := 1 − L−1
γ ,
γ = x, y, z.
Since these operators satisfy identities, LγDγ[u] = L[u] − u = − ∂γγ
α2 u, the Laplacian in
3D can be written by,
−∇2
α2 = LxDx + LyDy + LzDz = LxLyLz[Cxyz],
(2.71)
41
where
Cxyz : L−1
y L−1
z Dx + L−1
z L−1
x Dy + L−1
x L−1
y Dz
Consider the D operator in three dimensions.
Dxyz := 1 − L−1
x L−1
y L−1
z .
(2.72)
(2.73)
Upon rearranging and inverting (2.73), we get an identity LxLyLz = (1 − Dxyz)
−1 , Now,
the Laplacian becomes,
−∇2
α2 = (1 − Dxyz)
−1 Cxyz.
By expanding (1−D)−m as a power series, we can construct even orders of the Laplacian as
follows,
(cid:18)∇2
(cid:19)m
α2
(cid:19)
(cid:18) p − 1
m − 1
∞(cid:88)
p=m
= (−1)mCm
xyz
Dp−m
xyz .
(2.74)
Upon omitting the subscripts, the semi-discrete scheme for 2D and 3D is
un+1 − 2un + un−1 =
=
=
(cid:18)∇2
(cid:19)m
2β2m
α2
(2m)!
(−1)m 2β2m
(2m)!
Cm
m=1
∞(cid:88)
∞(cid:88)
∞(cid:88)
m=1
p(cid:88)
(−1)m 2β2m
(2m)!
p=1
m=1
(cid:19)
un
(cid:18) p − 1
∞(cid:88)
(cid:18) p − 1
(cid:19)
m − 1
p=m
m − 1
Dp−m[un]
xyzDp−m
Cm
xyz [un].
We give second and fourth-order schemes here,
un+1 − 2un + un−1 = −β2Cxyz[un]
un+1 − 2un + un−1 = −β2Cxyz[un] −
(cid:18)
β2Dxyz − β4
12
(cid:19)
Cxyz
Cxyz[un]
(2.75)
(2.76)
(2.77)
As in the 1D case, these schemes will be unconditionally stable for all ∆t, and the same
42
range for β as shown in Table 2.1.
The algorithm for computing u in two dimensions using ADI splitting with a given order
of accuracy in time is designed as below in Algorithm 7. Using this algorithm, we can retrieve
the solution with accuracy O (pt) in time.
Algorithm 7 compute u high :Compute u in 2D with high-order accuracy
1: un−1, un - array of u values at time tn−1 and tn respectively,
2: nx, ny - number of points along x and y respectively,
3: β - averaging parameter,
4: µx = e−α(x(n)−x(0)),µy = e−α(y(n)−y(0)),
5: xA, xB, and yA, yB - boundary points in x and y directions respectively,
6: expAx, expBx - exponential vector of grid distance from left boundary and right boundary
respectively along x,
7: expAy, expBy - exponential vector of grid distance from bottom boundary and up bound-
ary respectively along y,
8: νx, νy - array of weighted nodes along x and y respectively,
9: expweightx, expweighty - array of exponentially weighted nodes along x and y respec-
tively,
expAy, expBy, νx, νy, expweightx, expweighty, ps)
10: ps and pt - order of accuracy in space and time respectively
11: P = pt
2
12: struct outarray(0)
13: cd terms = −β2structoutarray(0)
14: for k = 2 to P do
15:
16:
=
compute c(u, nx, ny, β, µx, µy, xA, xB, yA, yB, expAx, expBx,
struct inarray = struct outarray
struct outarray(1) = compute cd(struct inarray(0), nx, ny, β, µx, µy, xA, xB, yA,
yB, expAx, expBx, expAy, expBy, νx, νy, expweightx, expweighty, ps)
for i = 2 to k − 1 do
struct outarray(i) = compute d(struct inarray(i), nx, ny, β, µx, µy, xA, xB, yA, yB,
expAx, expBx, expAy, expBy, νx, νy, xpweightx, expweighty, ps)
17:
18:
cd terms = cd terms + ploy highorder(β, i, k)struct outarray(i − 1)
end for
for i = k to 1 do
19:
20:
21:
22:
23: end for
24: un+1 = 2un − un−1 + cd terms
end for
Two unique functions were designed to compute solutions for the convolution operators D
(Algorithm 9) and C (Algorithm 8) by leveraging the fast convolution algorithm (Algorithm
1). Even though these two algorithms can be used together to compute the operators C and
43
D by applying them on the same vector of values at any stage, we need a separate function
as given in Algorithm 10 in order to avoid redundant calculations and follow up issues with
computing certain homogeneous boundary coeﬃcients. For this algorithm, we expand the
equation
Cxy = L−1
y Dx + L−1
x Dy as follows using Dγ = 1 − L−1
,
γ
Cxy = (L−1
y + L−1
x ) − (L−1
y L−1
x + L−1
x L−1
y )
(2.78)
The Algorithm 6 computes the coeﬃcients of the equation (2.75) for higher dimension,
higher-order accuracy in terms of β using (2.81).
Algorithm 8 compute c : Compute operator C in 2D; Cxy = L−1
1: u - array of u values at time t,
2: nx, ny - number of points along x and y respectively,
3: β - averaging parameter,
4: µx = e−α(x(n)−x(0)),
5: µy = e−α(y(n)−y(0)),
6: xA, xB, and yA, yB - boundary points along x and y respectively,
7: expAx, expBx - exponential vector of grid distance from left boundary and right boundary
y Dx + L−1
x Dy
respectively along x,
8: expAy, expBy - exponential vector of grid distance from bottom boundary and up bound-
ary respectively along y,
9: νx, νy - array of weighted nodes along x and y respectively,
10: expweightx, expweighty - array of exponentially weighted nodes along x and y respec-
tively,
11: ps -order of accuracy in space
12: Dx = u − Linv(u, ny, β, µx, xA, xB, expAx, expBx, νx, expweightx, ps, 0)
13: linv yDx = linv(Dx, nx, β, µy, yA, yB, expAy, expBy, νy, expweighty, ps, 1)
14: Dy = u − linv(u, nx, β, µy, yA, yB, expAy, expBy, νy, expweighty, ps, 1)
15: linv xDy = linv(Dy, ny, β, µx, xA, xB, expAx, expBx, νx, expweightx, ps, 0)
16: Cxy = linv yDx + linv xDy
2.5 Treatment for Variable Speed Wave Propagation
This procedure of exchanging time derivatives with spatial derivatives in the semi-discrete
equation using the Lax-Wendroﬀ procedure will also work in the case of variable wave speeds.
44
Algorithm 9 compute d :Compute operator D in 2D; Dxy = 1 − L−1
1: u - array of u values at time t, nx, ny - number of points along x and y respectively,
2: β - averaging parameter,
3: µx = e−α(x(n)−x(0)),
4: µy = e−α(y(n)−y(0)),
5: xA, xB, and yA, yB - boundary points along x and y directions respectively,
6: expAx, expBx - exponential vector of grid distance from left boundary and right boundary
y L−1
x
respectively along x,
7: expAy, expBy - exponential vector of grid distance from bottom boundary and up bound-
ary respectively along y,
8: νx, νy - array of weighted nodes along x and y respectively,
9: expweightx, expweighty - array of exponentially weighted nodes along x and y respec-
tively,
10: ps -order of accuracy in space
11: linv x = linv(u, ny, β, µx, xA, xB, expAx, expBx, νx, expweightx, ps, 0)
12: linv yx = linv(linv x, nx, β, µy, yA, yB, expAy, expBy, νy, expweighty, ps, 1)
13: Dxy = u − linv yx
Algorithm 10 compute cd : Compute operators C and D in 2D; Dxy and Cxy
1: u - array of u values at time t, nx, ny - number of points along x and y respectively,
2: β - averaging parameter,
3: µx = e−α(x(n)−x(0)),
4: µy = e−α(y(n)−y(0)),
5: xA, xB, and yA, yB - boundary points along x and y respectively,
6: expAx, expBx - exponential vector of grid distance from left boundary and right boundary
respectively along x,
7: expAy, expBy - exponential vector of grid distance from bottom boundary and up bound-
ary respectively along y,
8: νx, νy - array of weighted nodes along x and y respectively,
9: expweightx, expweighty - array of exponentially weighted nodes along x and y respec-
tively,
10: ps -order of accuracy in space
11: linv x = linv(u, ny, β, µx, xA, xB, expAx, expBx, νx, expweightx, ps, 0)
12: linv y = linv(u, nx, β, µy, yA, yB, expAy, expBy, νy, expweighty, ps, 1)
13: linv yx = linv(linv x, nx, β, µy, yA, yB, expAy, expBy, νy, expweighty, ps, 1)
14: linv xy = linv(linv y, ny, β, µx, xA, xB, expAx, expBx, νx, expweightx, ps, 0)
15: Dxy = u − linv yx
16: Cxy = linv x + linv y − (linv yx + linv xy)
The wave equation with variable wave speed can be written as,
utt = [c(x)]2uxx
(2.79)
45
Assume that the speed is ﬁnite and bounded as c1 ≤ c(x) ≤ c2. It can be normalized by
¯c(x) = c(x)/c2. We choose α = β
β > 0. Thus,
c2∆t,
un+1 − 2un + un−1 = 2
m=1
m=1
∞(cid:88)
∞(cid:88)
∞(cid:88)
∞(cid:88)
∞(cid:88)
m=1
m=1
p=1
= 2
= 2
=
=
p(cid:88)
m=1
∆t2m
(2m)!
∆t2m
(2m)!
β2m
(2m)!
(∂tt)mun
(cid:0)c2(x)∂xx
(cid:18)
¯c2(x)
∂xx
α2
(−1)m 2β2m
(2m)!
¯c2m(x)
Ap(β)Dp[un], .
(cid:1)m un
(cid:19)m
∞(cid:88)
p=m
un
(cid:18) p − 1
(cid:19)
m − 1
Dp[un]
(2.80)
.
(2.81)
(cid:19)
(cid:18) p − 1
m − 1
(−1)m (β¯c(x))2m
(2m)!
and the polynomial of coeﬃcients will be,
Ap(β) = 2
2.6 Numerical Results
We present a set of test cases that evaluates our framework; speciﬁcally, we consider
problems with variable/piecewise constant wave speed in one, two, and three dimensions,
using a Gaussian pulse as an initial source or a point source. For 2D, we choose a classic
EM problem - a transverse magnetic (TM) mode photonic crystal waveguide.
2.6.1 Convergence Studies
2.6.1.1 Higher Order Wave Solvers in Two Dimension
In this section, we consider a two-dimensional fourth-order temporally accurate solver.
A square domain (Ω = [−1, 1]× [−1, 1]) is chosen to assess the two dimensional higher-order
solver. We place a point source cos(ωt) at the center of the domain, (0, 0) and apply diﬀerent
46
boundary conditions.
(a) Dirichlet boundary condition
(b) Periodic boundary conditions
Figure 2.3: Fourth order time convergence of the 2D wave solver using Dirichlet (a) and
periodic (b) boundary conditions with ∆x = ∆y = 6.25 × 10−3. This is a self-
reﬁnement study which measures L∞ norm of the error at time T = 2.0 on a
square domain Ω = [−1, 1]2 with a point source cos(ωt), ω = 1 at the center of
the box, (0, 0). The CFL (= c∆t
∆x ) value changes proportional to the time step
size ∆t with ﬁxed spatial step size.
47
Figure 2.3 shows fourth order time convergence obtained for Dirichlet and periodic bound-
ary conditions by performing a reﬁnement study on a square domain Ω = [−1, 1] × [−1, 1]
with a point source cos(ωt), ω = 1 at the center of the box, (0, 0). The CFL (= c∆t
∆x )
value changes proportional to the time step size ∆t with ﬁxed spatial step size, ∆x = ∆y =
0.00625. We consider a ﬁnal time T = 2.0, with a ﬁxed spatial resolution of 320× 320 spatial
points. The discrete L∞ norm of the error is constructed at each time step and the maximum
over all time steps is used to generate the graph in Figure 2.3.
2.6.1.2 Higher Order Wave Solvers in Three Dimensions
We ﬁrst show fourth order convergence for Dirichlet boundary condition by performing a
time reﬁnement study on a cubic domain Ω = [−1, 1] × [−1, 1] × [−1, 1] with a point source
cos(ωt) at center of the domain, (0, 0, 0). This runs up to time T = 2.0, with a ﬁxed spatial
resolution of 160× 160× 160 spatial points. The discrete L∞ norm of the error is constructed
at each time step and the maximum error over all time steps is used to graph Figure 2.4 for
∆t = 625e−2
, k = 1 to 5, with Dirichlet boundary conditions.
2k
2.6.2 Three Dimensional Waves
In this section, we validate our three dimensional solver using a cubic domain (Ω =
[−1, 1] × [−1, 1] × [−1, 1]). We set the spatial grid size as ∆x = ∆y = ∆z = 0.0625
(32 × 32 × 32 grid points) and the time step size to ∆t = 0.0313. We successfully tested the
solver for a point source by applying Dirichlet, periodic and outﬂow boundary conditions
along the cubic faces. Figures 2.5, 2.6, and 2.7 show the ﬁeld of a point source cos(ωt), ω = 1
that is placed at (0, −1 + 3∆x, 0) using Dirichlet, periodic, and outﬂow boundary conditions
along the six faces respectively.
In each case, a three dimensional wave generated by a point source (Figure 2.5, 2.6,
and 2.7) smoothly propagates in a cube and obeys applied boundary conditions along the
surfaces of the cube.
48
(a) ω = 0.1
(b) ω = 1
Figure 2.4: Fourth order convergence of 3D wave solver using Dirichlet boundary conditions
with ∆x = ∆y = ∆z = 1.25 × 10−2 ω = 1. This is a self-reﬁnement study which
measures L∞ norm of the error at time T = 2.0 on a cubic domain Ω = [−1, 1]3
with a point source cos(ωt) at the center of the cube, (0, 0, 0) for (a) ω = 0.1
and (b) ω = 1. The CFL (= c∆t
∆x ) value changes proportional to the time step
size ∆t with ﬁxed spatial step size.
49
Figure 2.5: Time evolution of a point source ﬁeld cos(ωt), ω = 1 within a cubic domain
(Ω = [−1, 1]3) by imposing Dirichlet boundary condition along the surface with
the spatial step size, ∆x = ∆y = ∆z = 0.0625 ω = 1 and the time step size,
∆t = 0.0313.
Figure 2.6: Time evolution of a point source ﬁeld cos(ωt), ω = 1 within a cubic (Ω = [−1, 1]3)
by imposing periodic boundary condition along the surface with the spatial step
size, ∆x = ∆y = ∆z = 0.0625 ω = 1 and the time step size, ∆t = 0.0313.
Figure 2.7: Time evolution of a point source ﬁeld cos(ωt), ω = 1 within a cubic (Ω = [−1, 1]3)
by imposing outﬂow boundary condition along the surface with the spatial step
size, ∆x = ∆y = ∆z = 0.0625 ω = 1 and the time step size, ∆t = 0.0313.
2.6.3 Waves with Variable Speeds
2.6.3.1 One Dimensional Case
A one dimensional wave travels through two diﬀerent consecutive domains, Ω1 ∈ [−1, 0]
and Ω2 ∈ [0, 1] with piecewise constant speed c1 = 1.0 and c2 = 2.0 respectively. The solution
50
until the wave reaches the location of discontinuity, x = 0 is,
u(x, t) = aie−η(x−c1(t−t0))2
(2.82)
where ai is the initial amplitude, η is the Gaussian shape parameter, and t0 is a time oﬀset
which delays the Gaussian pulse.
A portion of the traveling wave in domain one, Ω1 will be reﬂected at the location of interface,
x = 0, while the remaining waves are transmitted to domain two, Ω2. Thus, we need to
consider the combination of transmitted, reﬂected, and incident waves from the location of
the interface, x = 0 and onward. The left domain (Ω1) will have components of the incident
and reﬂected waves, and the right domain (Ω2) will have the transmitted component of the
original wave.
In this way, the solution obeys the following equations,
u1(t) = aieη(x−c1(t−t0))2 + areη(x+c1(t−t0))2 x < 0
u2(t) = ate¯η(x−c2(t−t0))2 x ≥ 0,
where, ar and at are amplitudes of the reﬂected and transmitted wave respectively and ¯η
is the shape parameter of the transmitted Gaussian pulse. Because of the zero transverse
displacement at the interface, u1(0, t) = u2(0, t), that gives us,
(ai + ar)eηc2
1(t−t0)2 = ate¯ηc2
2(t−t0)2.
By equating the coeﬃcients we get,
ai + ar = at
ηc2
1 = ¯ηc2
2.
51
(2.83)
(2.84)
From energy conservation, we obtain
a2
i√
η
=
a2
r√
η
+
a2
t√
¯η
.
(2.85)
Upon solving equations (2.84), and (2.85), we obtain
,
2aic2
at =
(c1 + c2)
ar = ai − at.
, and ar = 1 − at. Further,
For our ﬁrst numerical example, we choose ai = 1, so at = 2c2
c1+c2
we chose c1 = 1.0 and c2 = 2.0, and apply our solver along the domain Ω1 ∪ Ω2 with 1024
uniform grid points of size ∆x = 0.002, while maintaining a CFL of 2.5 by choosing time step
∆t = 0.005. Figure 2.8 shows time snapshots of the moving wave using outﬂow boundary
conditions at the left and right boundaries. The numerical results agree closely with the
theoretical solutions as can be seen in the ﬁgure.
Figure 2.8: 1D wave travelling over the two domains Ω1 and Ω2 with piecewise constant speed
c1 = 1.0 and c2 = 2.0. For this experiment, we choose spatial step ∆x = 0.002,
and time step ∆t = 0.005.
A problem with multiple domains was selected as our second numerical test case with
the same spatial step size, ∆x = 0.002, and time step size ∆t = 0.005, hence CFL remains
2.5. The entire domain has eight equal width sub-domains of size 0.25, and the wave travels
at the same speed in every bi-domain. The ﬁrst sub-domain is between -1 and -0.75, has
waves travelling with speed c1 (=1.0); the wave speed is c2 (= 2.0) in the even-numbered
sub-domains.
52
Figure 2.9 shows time snapshots of the moving wave using outﬂow boundary conditions at
the external left and right boundaries.
Figure 2.9: 1D wave travelling over the layered media with eight domains and the wave
travels with the same speed in every bi-domain ( c1 = 1.0 and c2 = 2.0). For
this experiment, we choose spatial step ∆x = 0.002, and time step ∆t = 0.005
along layered media with piecewise constant wave speed.
2.6.3.2 Two Dimensional Case
In this section, we consider the two-dimensional variable speed solver. A square domain
with square patches is chosen to assess the two-dimensional variable speed solver. Due to
the material property of patches, waves travel with diﬀerent speeds through these patches
compared with the remaining area. We choose a Gaussian pulse e−25(x2+y2) as an initial
solution and apply outﬂow boundary condition along the border of the square domain (Ω =
[−1, 1] × [−1, 1]). Since we use spatial step size ∆x = ∆y = 0.02 and time step size
∆t = 0.005, applicable CFL is 0.25. We demonstrate two test cases: One and Four patches
are placed for case-i and case-ii respectively as shown in Figure 2.10
Case-i - Single patch
A 0.5 × 0.5 square patch is placed at the left top corner centered at, (-0.5, 0.5). The wave
speed is set to be c2(= 0.1) in the patch, and c1(= 1.0) in the remaining area. Figure 2.11
shows time snapshots of the solution.
Case-ii - Four patches
There are four 0.5 x 0.5 square patches placed at the four corners; i.e., centered at (-0.5,
0.5), (-0.5, -0.5), (0.5, 0.5), and (0.5, -0.5). The wave speed is set to be c2(= 0.1) in every
patch, and c1(= 1.0) in the remaining area. Figure 2.12 shows time snapshots of the wave
53
(a) single patch
(b) 4 patches
Figure 2.10: Geometrical view of 0.5× 0.5 (a) one and (b) four square patches in the square
domain Ω = [−1, 1]2 with c2(= 0.1) and c1(= 1.0) are the wave speeds in the
patches and the remaining area.
Figure 2.11: Time evolution of a Gaussian ﬁeld e−25(x2+y2) in a square domain Ω = [−1, 1]2
with a 0.52 square patch as shown in Figure 2.10-(a). Here, the spatial step
size, ∆x = ∆y = 0.02 and the time step size, ∆t = 0.005.
at diﬀerent time instants.
Figure 2.12: Time evolution of a Gaussian ﬁeld e−25(x2+y2) in a square domain Ω = [−1, 1]2
with four 0.52 square patches as shown in Figure 2.10-(b). Here, the spatial
step size, ∆x = ∆y = 0.02 and the time step size, ∆t = 0.005.
54
2.6.3.3 Three Dimensional Cases
In this section, we discuss the three-dimensional variable speed wave-solver. A cubic
domain with cubic patches is chosen to assess the three-dimensional variable speed solver.
Due to the material property of patches, waves travel with diﬀerent speeds through these
patches compared with the remaining area. We place a Gaussian pulse e−36(x2+y2+z2) as
an initial solution and apply outﬂow boundary conditions along the border of the cube
(Ω = [−1, 1] × [−1, 1] × [−1, 1]). Since we use a spatial grid size ∆x = ∆y = ∆z = 0.0625
(32 × 32 × 32 grid points) and time step size ∆t = 0.0313, applicable CFL is 0.5. We
demonstrate two test cases: one, and four patches for case-i and case-ii respectively as
shown in Figure 2.13
(a) single patch
(b) 4 patches
Figure 2.13: Geometrical view of 0.5 × 0.5 × 0.5 (a) one and (b) four cubic patches in the
cubical domain Ω = [−1, 1]3 with c2(= 0.1) and c1(= 1.0) are the wave speeds
in the patches and the remaining area.
Case-i - Single patch
A 0.5 × 0.5 × 0.5 cubic patch is placed at the left top corner centered at, (-0.5, 0, 0.5). The
wave speed is set to be c2(= 0.1) in the patch, and c1(= 1.0) in the remaining area. Figure
2.14 shows snapshots of the wave at diﬀerent time instants.
Case-ii - Four patches
There are four 0.5× 0.5× 0.5 cubic patches placed at the four corners; i.e., centered at (-0.5,
55
Figure 2.14: Time evolution of a Gaussian ﬁeld e−25(x2+y2+z2) in a cubical domain Ω =
[−1, 1]3 with a 0.53 cubic patch as shown in Figure 2.13-(a). Here, the spa-
tial step size, ∆x = ∆y = ∆z = 0.0625 and the time step size, ∆t = 0.0313.
0, 0.5), (-0.5, 0, -0.5), (0.5, 0, 0.5), and (0.5, 0, -0.5). The wave speed is set to be c2(= 0.1) in
every patch, and c1(= 1.0) in the remaining area. Figure 2.15 shows snapshots of the wave
at diﬀerent time instants.
Figure 2.15: Time evolution of a Gaussian ﬁeld e−25(x2+y2+z2) in a cubical domain Ω =
[−1, 1]3 with four 0.53 cubic patches as shown in Figure 2.13-(b). Here, the
spatial step size, ∆x = ∆y = ∆z = 0.0625 and the time step size, ∆t = 0.0313.
The 3D Gaussian pulse smoothly travels through the diﬀerent material interfaces provided
by patches (one in Figure 2.14 and four in Figure 2.15) and the rest of the area.
2.6.4 Waveguide in a Photonic Crystal
2.6.4.1 Problem Deﬁnition
We study the propagation of light in a photonic crystal, a low-loss periodic dielectric
medium, in which the atoms or molecules are replaced by macroscopic media with diﬀerent
56
dielectric constants, and the periodic potential is replaced by periodic dielectric functions
[43]. These are very useful in applications such as the design of ﬁber-optic cables, laser
engineering, high-speed computing, and spectroscopy.
In order to study this propagation, we ﬁrst cast Maxwell equations written in terms of
the electric ﬁeld, E and magnetic ﬁeld H in the form of a second-order wave equation.
2.6.4.2 Geometry of the Problem
A two-dimensional photonic crystal which is periodic in the x, y directions and homo-
geneous in the z direction is chosen as the test domain. The crystal has photonic band
gaps in the xy plane and it can prevent light from propagating in any direction within the
plane. This system has discrete translational symmetry in the xy plane, and this symmetry
property can be used to characterize its electromagnetic modes. Any modes that propagate
in the xy plane are invariant under reﬂections through the plane. The Transverse Magnetic
(TM) modes have the H ﬁeld in the plane and, the E ﬁeld normal to the plane. As shown
in Figure 2.16, a line defect is introduced along the y direction by removing a line of rods in
a set of alumina rods. This can be represented using a square lattice of cylindrical rods in
the air (Figure 2.17).
Figure 2.16: Schematic illustration of a line defect (in red) within a set of cylindrical rods
(in green) on a photonic crystal.
57
In our model, we chose an m-by-n lattice that has dielectric rods which are periodic along
x (n rods) and y (m rods) axes with the lattice constant a. The cylindrical rod with radius r
and dielectric constant is homogeneous along the z direction (we imagine the cylinders are
inﬁnitely tall). A column of rods is removed to produce a cavity, so it has reﬂecting walls on
both sides along the y direction. We consider the waveguide in TM mode using the following
boundary conditions: periodic in the x direction and outﬂow in the y direction. We assume
that there are no free charges, so only the dielectric constants are changing in space due to
the rods.
Figure 2.17: 2D Lattice of cylindrical rods which are periodic along x and y axes with lattice
constant a on a photonic crystal.
2.6.4.3 Result and Discussion
We performed numerical simulations of a waveguide on the photonic crystal using square
and cylindrical rods. A rectangular (3.9 × 2.1) lattice of rods was chosen, the rods were
placed in x, y directions with distance a = 0.3. The dielectric constant of rods was chosen
to be = 8.9. For the ADI scheme, we chose nx = ny = 100, with time step t = 0.0105.
Using square rods
The set of square rods of size r = 0.38a is modeled as shown in Figure 2.18, and we obtained
the result as shown in Figure 2.19 using the boundary conditions: periodic in the direction
58
x and outﬂow in the direction y for a point source e(−iωt), where ω = cky, ky = 0.88(2a),
c = 1√
, = 8.9, and µ = 1.
(µ)
(a) 3D
(b) 2D
Figure 2.18: Square lattice of square rods of side length 0.38a with lattice constant a and
dielectric constant = 8.9 in 3D (a) and 2D (b), a linear defect is formed by
removing a column of rods.
Figure 2.19: The wave generated by the point source, e(−iωt), where ω = cky, ky = 0.88(2a),
, = 8.9, and µ = 1 travelling through the photonic waveguide using
c = 1√
the model shown in Figure 2.18
(µ)
Using cylindrical rods
We replaced the square rods with cylindrical rods and used the same parameter choices as
in case i (Figure 2.20).
We obtained the result as shown in Figure 2.21 using the following boundary conditions:
periodic in x and outﬂow in y
Further, we tested the model for a source-free wave equation with initial solution e(−25(x2+y2))
using the same model (Figure 2.20) and obtained the result as shown in Figure 2.22. It is
59
(a) 3D
(b) 2D
Figure 2.20: Square lattice of cylindrical rods of side length 0.38a with lattice constant a
and dielectric constant = 8.9 in 3D (a) and 2D (b), a linear defect is formed
by removing a column of rods.
Figure 2.21: The wave generated by the point source, e(−iωt), where ω = cky, ky = 0.88(2a),
, = 8.9, and µ = 1 travelling through the photonic waveguide using
c = 1√
the model shown in Figure 2.20
(µ)
important to know that with the embedded boundary method, such as we are pursuing here,
it is no ”harder” to introduce an embedded boundary method for a square then it is for a
circle.
Figure 2.22: The wave, e(−25(x2+y2)) traveling through the photonic waveguide using the
model shown in Figure 2.20.
60
2.7 Summary
We derived the implicit semi-discrete solution for the multi-dimension wave equation sub-
ject to several applicable boundary conditions including Dirichlet, periodic, Neumann, and
outﬂow boundary conditions. We extended the second-order scheme to arbitrary higher-
order using Dirichlet and periodic boundaries on Cartesian grids. The P th order scheme
consumes O(P N ) time and O(P 2N ) space for the domain with N grid points, it can be
reduced to ideally linear time and space complexity O(N ) for considerably large problems
(N >>). The three-dimensional high-order scheme is implemented for Dirichlet and pe-
riodic boundary conditions and evaluated using diﬀerent applications including a variable
speed wave. In the next chapter, we describe the derivation of high-order outﬂow boundary
conditions.
61
CHAPTER 3: Higher Order Non-reﬂecting Boundary Condition
3.1 Introduction
In this chapter, we develop a high-order non-reﬂecting boundary condition that can be
applicable for curved boundaries based on the scheme described in Chapter 2.
A way to deﬁne a ﬁnite computational domain in the inﬁnite physical domain is by using
an artiﬁcial boundary which is called non-reﬂecting or open boundary condition. Since we
develop simulation of A6 magnetron with diﬀraction output (A6 MDO), we need a scheme
that successfully handles non-reﬂecting boundary conditions, even for curved boundaries.
The approach described in Chapter 2 gives second-order accuracy for the outﬂow boundary
condition. Basically, this approach extends the spatial domain for a speciﬁc time instance
and switches spatial integration into time integration at the boundary ends. We use higher-
order interpolation polynomial based on a higher-order ﬁnite-diﬀerence stencil for the time
integration approximation and apply a suitable initial condition to reach high-order accuracy.
Historically, the open boundary condition was developed for outgoing wave propagation
from the Sommerfeld radiation condition [66] by Zienkiewicz and Newton [74]. The wave-
based interpretation of this boundary condition is more applicable for EM wave propagation
[25; 60; 63]. Later, Higdon developed a sequence of increasing-order boundary conditions at
various incidence angles [35]. The absorbing layer-based method, Perfectly Matched Layer
(PML), is found to be the most popular method. It was developed by Berenger [6] for the
2-D Maxwell equations, and has been extended to 3D. PML imposes both losses of electric
and magnetic ﬁelds, identifying the correct loss term for PML so that the outgoing waves
not reﬂected in the domain can be problem-speciﬁc. However, it is diﬃcult to choose an
optimal conductive parameter(σ) for the best performance of the PML. Another approach,
62
the sponge layer method [16], requires extra grid points to make padding which implements
the functions to prevent the reﬂection. Kirchhoﬀ’s formula-based method developed by Ting
and Miksis [69] also requires more computation.
Recently, Bradley [1] introduced a high-order nonreﬂecting boundary condition which
uses a compressed nonreﬂecting boundary kernel to achieve high-order for 2D and 3D cases.
This boundary kernel approach is very complicated and consumes O(M N logN ) work in 2D
and O(M N log2N ) work in 3D, for M time steps and N spatial grid points. Without using
the complicated boundary kernel approach, our higher-order scheme for outﬂow boundary
condition works well for multi-dimensional problems with consuming O(P M N ) works, for
the temporal order of accuracy 2P [14]. For arbitrarily big problems, M N >> P , so the
complexity will be reduced to O(M N ).
In this chapter, we discuss 4-th order derivation for outﬂow boundary conditions. In Sec-
tion 3.2, we derive the high-order outﬂow boundary condition and give a concrete deﬁnition.
Section 3.3 gives the suitable initial condition for this approach, and ﬁnally we report a set
of test cases in Section 3.4.
3.2 Extension to Higher Order in Accuracy
We need to apply boundary conditions appropriately to obtain high order accuracy.
Dirichlet, Neumann, and periodic boundary conditions can be implemented straightforwardly
( see [15] for details) using the approach shown in [14]. Since we have to know the behavior
of the wave equation outside of the domain x ≤ a and x ≥ b, the outﬂow boundary condition
requires little additional computation as detailed in the following.
In order to obtain higher-order outﬂow boundary conditions, we consider more terms in
the Taylor series. Note, equation (2.52) uses three terms for second-order accuracy. We
use the same approach as explained previously in section 2.2.3.4. Besides, we perform some
iterations along the boundary stencil to ensure convergence. For this procedure, we need
to know the solution at time steps, tn+1, tn+2, and so on, if we choose the centered ﬁ-
63
nite diﬀerence stencil to derive equations with higher-order accuracy (as was used to derive
second-order outﬂow boundary conditions). A problem with this approach is that we need
data at a future time, tn+2, which we don’t have direct access to.
Instead, we prefer to
use one-sided backward ﬁnite diﬀerence stencils to obtain higher-order accuracy and obtain
boundary coeﬃcients explicitly.
Let us derive outﬂow boundary conditions with fourth order accuracy. First, we construct
a time interpolant at the right boundary (x > b) using a Taylor series expression of the form
(Here z = s/∆t as introduced in 2.52 )
u(b, tn − z∆t) ≈ u(b, tn) − z∆tut(b, tn) +
z2∆t2
2
utt(b, tn) − z3∆t3
6
uttt(b, tn) +
z4∆t4
24
utttt(b, tn)
(3.1)
By truncating higher order error terms, we only need to approximate the ﬁrst time derivative
to fourth order accuracy, second time derivative to third order accuracy, third time derivative
to second order accuracy, and fourth time derivative to ﬁrst order accuracy. To perform this
approximation, we use a ﬁve point backward ﬁnite diﬀerence stencil to approximate ut, utt,
uttt, and utttt to the desired order of accuracy. we obtain,
u(b, tn − z∆t) ≈ un(b) − z
(cid:16)25
12
(cid:16)35
(cid:16)5
(cid:16)
un−1(b) +
un(b) − 4un−1(b) + 3un−2(b) − 4
3
un−3(b) +
un(b) − 26
3
12
un(b) − 9un−1(b) + 12un−2(b) − 7un−3(b) +
2
un(b) − 4un−1(b) + 6un−2(b) − 4un−3(b) + un−4(b)
un−3(b) +
11
12
un−4(b)
un−2(b) − 14
3
19
2
3
2
(cid:17)
z2
+
2
− z3
6
z4
24
+
un−4(b)
1
4
un−4(b)
(cid:17)
(cid:17)
(cid:17)
(3.2)
Integrating this expression analytically using Lemma 2.1, we arrive at
Bn = e−βBn−1 + γ0un(b) + γ1un−1(b) + γ2un−2(b) + γ3un−3(b) + γ4un−4(b)
(3.3)
64
where,
E3(β) + E4(β),
35
12
E2(β) − 5
2
γ0 = E0(β) − 25
E1(β) +
12
E2(β) + 9E3(β) − 4E4(β),
γ1 = 4E1(β) − 26
3
E2(β) − 12E3(β) + 6E4(β),
γ2 = −3E1(β) +
19
2
E2(β) + 7E3(β) − 4E4(β),
14
3
E2(β) − 3
11
2
12
4
γ3 =
3
γ4 = −1
4
E3(β) + E4(β).
E1(β) +
E1(β) +
(3.4)
Likewise, by considering the left boundary x < a, we get
An = e−βAn−1 + γ0un(a) + γ1un−1(a) + γ2un−2(a) + γ3un−3(a) + γ4un−4(a).
(3.5)
We can also compute the coeﬃcients for second-order accuracy using the explicit approach
in this form,
where,
An = e−βAn−1 + γ0un(a) + γ1un−1(a) + γ2un−2(a).
Bn = e−βBn−1 + γ0un(b) + γ1un−1(b) + γ2un−2(b).
E1(β) + E2(β),
γ0 = E0(β) − 3
2
γ1 = 2E1(β) − 2E2(β),
γ2 = −1
2
E1(β) + E2(β).
(3.6)
(3.7)
Now we have equations to compute the homogeneous boundary coeﬃcients An and Bn
to second and fourth-order accuracy. Note that the boundary constants corresponding to
operator D[u] (denoted by An
1 , Bn
1 ) are independent of the boundary constants corresponding
65
to operator D2[u] (denoted by An
2 , Bn
2 ). Therefore, our fourth-order wave solution can be
constructed on two levels.
• Level 1
Compute D[u] using u; An
1 and Bn
1 are obtained by second order solution implicitly
(2.58) or explicitly (3.6).
• Level 2
Compute D2[u] using D[u]; An
2 and Bn
2 are obtained by fourth order solution (3.3) and
(3.5) explicitly
Now, we provide algorithms for fourth-order outﬂow boundary conditions in one dimen-
sion. We ﬁrst modify Algorithm 5 to work for outﬂow boundary conditions with fourth-order
accuracy in time. This Algorithm 11 computes γ coeﬃcients (see 3.7 and 3.4 ) using second-
order centered, and fourth-order backward ﬁnite diﬀerent stencils in Level 1 and Level 2
computations respectively. Initially, outﬂow boundary coeﬃcients (vector H) are set to zero,
meaning boundary coeﬃcients at time step t0 are zero, and previous solutions at boundaries
are maintained (ubppr) for the computation.
During the computation of L−1 using outﬂow (Algorithm 12), we use an iterative method
to ensure the solution converges. In each iteration, it computes boundary coeﬃcients and
updates the solutions within the boundary stencil. We perform the iteration until it satisﬁes
a criterion |u − utemp|∞ < tol where tol is a tolerance which may be chosen to be quite
small. We developed an Algorithm 13 to compute outﬂow boundary coeﬃcients using the
previously explained approach.
A general procedure to compute the γ coeﬃcients for any order based on a given ﬁnite dif-
ference stencil is designed in Algorithm 14. This algorithm utilizes a function f dcoef f F (k, ¯x, x)
published by [49] to compute coeﬃcients for ﬁnite diﬀerence approximations of the derivative
of order k at ¯x based on grid values at points in the boundary stencil x. In Algorithm 15,
the function E represents the expression deﬁned in lemma 2.1
66
Algorithm 11 Compute u at time tn+1 by imposing outﬂow boundary conditions
1: un, un−1, and un−2 - array of u values at time tn, tn−1, and tn−2 respectively,
2: bdryA, bdryB - boundary points,
3: expA expB- exponential vector of grid distance from left boundary and right boundary
respectively along the line,
4: ν - array of weighted nodes along the line,
5: expweight - array of exponentially weighted nodes along the line,
6: ps, pt -order of accuracy in space and time respectively,
7: xA, xB - left and right boundary points respectively.
8: stimp2 = 1 : −1 : 1
9: stexp2 = 0 : −1 : −4
10: g(1, :) = gamma(β, 2, stimp2)
11: g(2, :) = gamma(β, 4, stexp4)
12: P = 2
13: Dterms = 0
14: for k = 1 to P do
15: D
=
u − linv out(un, β, bdryA, bdryB, expA, expB, ν, expweight, ps, 2k, H,
ubppr, g, maxit, tol)
16:
17: Dterms = Dterms + poly high(β, P )D
u = D
18:
19: end for
20: un+1 = 2un − un−1 + Dterms
3.3 Appropriate Initial Condition
Our MOLT based implicit numerical scheme computes solution un+1 at time tn+1 using
the solutions un and un−1 at previous time steps tn, and tn−1. Therefore our initial condition
should be handled carefully in order to avoid order reduction of the scheme. In Causley’s
higher order paper [14] he introduced a technique to approximate u1 at time t1 using the
initial solution at time t0, u0.
However, since our high-order outﬂow scheme requires solutions un, un−1, un−2, un−3,
and un−4 at the boundary points for the time levels tn, tn−1, tn−2, tn−3, and tn−4 to compute
homogeneous coeﬃcients, the above approximation is not enough to go with high-order
outﬂow scheme. Therefore, we require the initial solution u1, u2, u3, and u4 to use our
higher order scheme. In order to obtain those values, we ﬁrst use the second order scheme
with the initial approximation of u1 and obtain u2, u3, and u4, then afterwords we use our
67
Algorithm 12 linv out: Compute L−1 for outﬂow boundary condition
1: u - array of u values at time tn,
2: β - averaging parameter,
3: bdryA, bdryB - boundary points,
4: expA expB- exponential vector of grid distance from left boundary and right boundary
respectively along the line,
5: ν - array of weighted nodes along the line,
6: expweight - array of exponentially weighted nodes along the line,
7: ps -order of accuracy in space,
8: pt - order of accuracy in time.
9: H - boundary coeﬃcients at time step tn,
10: ubppr - solution at the boundary points at previous time steps,
11: g - a vector of γ coeﬃcients for a given order of accuracy,
12: maxit - maximum number of iterations,
13: tol - tolerance number for the convergence.
14: Compute particular solution
15: I = f astconvolution(u, ν, expweight, ps)
16: Boundary correction
17: utemp = u
18: it = 0
19: repeat
20: H = apply bc outf low(uA, uB, bdryA, bdryB, β, H, ubppr, g, pt)
21:
22:
utemp = update stencil(utemp, I, expA, expB)
it = it + 1
23: until (|u − utemp|∞ < tol) OR (it < maxit)
24: Operate L−1
25: k = pt
2
26: linv = I + H(k, 1)expA + H(k, 2)expB
fourth order scheme to compute the solution u5.
3.4 Numerical Results
In this section, several test cases are reported for higher-order accuracy in 2D and 3D.
3.4.1 Convergence Test for High-Order Outﬂow
We ﬁrst show fourth order convergence of outﬂow boundary conditions by performing
a reﬁnement study on a rectangular domain Ω = [−1, 1] × [−1, 1] and a cubic domain
Ω = [−1, 1] × [−1, 1] × [−1, 1] for 2D and 3D respectively.
In each case, we use a point
68
Algorithm 13 apply bc outf low: Apply outﬂow boundary conditions
1: IA, IB - solution at boundary points,
2: bdryA, bdryB - boundary points,
3: H - boundary coeﬃcients at time step tn,
4: ubppr - solution at the boundary points at previous time steps,
5: g - a vector of γ coeﬃcients for a given order of accuracy,
6: β - averaging parameter,
7: p - order of accuracy in time.
8: p = 2 ∗ k
9: H(k, 1) = H(k, 1)e−β + g(k, 1)IA
10: H(k, 2) = H(k, 2)e−β + g(k, 1)IB
11: for j = 2 to p +1 do
for bp = 0 to 1 do
12:
13:
14:
15: end for
16: for j = 1 to p do
17:
18: end for
19: ubppr(p + 1, 0) = uxA
20: ubppr(p + 1, 1) = uxB
H(k, bp) = H(k, bp) + g(k, j)ubppr(p + 2 − j, bp)
ubppr(j, :) = ubppr(j + 1, :)
end for
Algorithm 14 gamma : Compute the γ coeﬃcients for a given order of accuracy
cf s(:, j) = f dcoef f F (j, 0, stencil)
1: β - averaging parameter,
2: p - order of accuracy in time,
3: stencil - array of uniform grid points
4: for j = 1 to p do
5:
6: end for
7: for k = 1 to p+1 do
for j = 1 to p do
8:
9:
10:
11: end for
12: g(1) = g(1) + E(0, β)
g(k) = g(k) + (−1)jE(j, β)
end for
source cos(ωt), ω = 1 placed at center of the domain ( (0, 0) in 2D and (0, 0, 0) in 3D),
and run the evaluation up to dimensionless time T = 2.0, with a ﬁxed spatial resolution of
160 × 160 spatial points in 2D and 160 × 160 × 160 in 3D. The discrete L∞ norm of the
error is constructed at each time step and maximum error over all time steps is used to
graph in Figure 3.1 for ∆t = 125×10−3
, k = 1 to 5, with outﬂow boundary conditions. The
2k
69
Algorithm 15 E : Compute the coeﬃcients E based on lemma 2.1
1: m and v - integers,
2: m ≥ 0 and v > 0
3: Pmv = 0
4: for l = 0 to m do
Pmv = Pmv + vl
5:
l!
6: end for
7: Emv = 1
vm (1 − e−v)Pmv
CFL (= c∆t
∆x ) value changes proportional to the time step size ∆t with ﬁxed spatial step size,
∆x = ∆y = ∆z.
(a) in 2D
(b) in 3D
Figure 3.1: Fourth order convergence of (a) 2D and (b) 3D wave solver using outﬂow bound-
ary conditions with the spatial step size h = 1.25×10−2. This is a self-reﬁnement
study which measures L∞ norm of the error at time T = 2.0 on a (a) square and
(b) cubic domain with a point source cos(ωt), ω = 1 at the center of the domain.
The CFL (= c∆t
h ) value changes proportional to the time step size ∆t with ﬁxed
spatial step size, h.
3.4.2 Rotating Gaussian Pulse
In this section, we examine the ability of our outﬂow scheme to handle the eﬀect of
a rotating Gaussian pulse. First, we perform reﬁnement studies on a square domain Ω =
[−1, 1]×[−1, 1] for a rotating Gaussian pulse through diﬀerent angles. We chose the Gaussian
70
pulse as an initial solution,
(cid:16)
−36
e
( cos(θ)x+sin(θ)y
r1
)2
+( sin(θ)x+cos(θ)y
r2
)2(cid:17)
(3.8)
with r1 = 2 and r2 = 1 and rotate the angle θ from Π
18 to Π
2 for this reﬁnement studies.
We run the evaluation up to the dimensionless time T = 2.0, with a ﬁxed spatial resolution
of 100 × 100 spatial points. The discrete L∞ norm of the error is constructed at each time
step and maximum error over all time step is used to graph in Figure 3.2 for ∆t = 0.2
2k , k = 0
to 4, with our fourth order outﬂow boundary conditions.
Figure 3.2: Fourth order convergence of the oval shape Gaussian pulse (however it reduces to
4 due to the ADI splitting error) given by the
2 on square domain Ω = [−1, 1]2 by imposing
the above third order near to θ = Π
Equation 3.8 rotated through Π
18 to Π
Outﬂow boundary conditions along the boundaries. This self-reﬁnement study
measures L∞ norm of the error at time T = 2.0 with h = 0.02.
The convergence plot shows that the order of the accuracy reduces from fourth-order to
above third-order while the angle goes to Π
4 . This is acceptable behavior because the highest
splitting error should be obtained for the angle Π
4 due to the ADI splitting.
71
3.4.2.1 Time Evaluation of 50o angled Gaussian pulse
In this section, we present a time evolution of the same Gaussian pulse deﬁned in equa-
tion 3.8 with r1 = 2, r2 = 1 and angle θ = 50o using outﬂow boundary conditions. We chose
a rectangular domain Ω = [−1, 1]× [−1, 1] with spatial resolution of 300× 300 spatial points,
and set the time step size ∆t = 3.3 × 10−3 to maintain CFL value 0.5. Figure 3.3 shows
snapshots of the wave at diﬀerent time instants
(a)
t = 0.3597
(b)
t = 0.6864
(c)
t = 1.0131
(d)
t = 1.3695
Figure 3.3: Time evolution of a Gaussian pulse given by the Equation 3.8 with angle θ = 50o
on square domain Ω = [−1, 1]2 by imposing outﬂow boundary conditions along
the boundaries. Here the time step size ∆t = 3.3 × 10−3 and CFL value is 0.5.
The wave is leaving completely as expected.
(a) θ = 30o
(b) θ = 50o
(c) θ = 70o
(d) θ = 90o
Figure 3.4: Evolution of a Gaussian ﬁeld given by the Equation 3.8 at time t = 0.6864,
rotated through diﬀerent angles (a) θ = 30o, (b) θ = 50o, (c) θ = 70o, and (d)
θ = 90o on square domain Ω = [−1, 1]2 by imposing outﬂow boundary conditions
along the boundaries, Here the time step size ∆t = 3.3 × 10−3 and CFL value is
0.5 at time t = 0.1287 using outﬂow boundary conditions. We observe the same
behaviour of the wave for rotated Gaussian pulse through diﬀerent angles. The
waves leave completely through the curved boundary.
72
3.4.3 Outﬂow Boundary Condition Along Curve Boundaries
We performed an experiment to examine the ability of our outﬂow boundary treatment
along the curved boundaries. For this experiment, we built a 2D geometry with a curved
boundary as shown in Figure 3.5 (a). The curve is a portion of an ellipse which has axes a
and b and centred at the origin (0, 0).
rw =
ab
√
b2 + a2 tan2 θ
rh2 = rw tan θ
rh1 = b
This 2D object can be represented as a 2D graph (Figure 3.5), G(V, E(ES, EA)) with a
set of vertices V = (v1, v2, v3, v4), straight edges ES = [es1(≡ (v2, v3)), es2(≡ (v3, v4)), es3(≡
(v3, v4)), es4(≡ (v4, v1))], and an arch edge EA = [ea1(≡ (v1, v2, O))].
(a) Geometry
(b) Graph
Figure 3.5: (a) Geometry and (b) graph representation of the 2D object constructed by using
an oval of width 2a and height 2b and a rectangle of 2rw×(rh1+rh2). Here rh1 = b
and v1, v2, v3, and v4 are the vertices of the object.
73
We made an instant of this object with a = 1.6, b = 4.8, and θ = 50o, place a point source
cos ωt with ω = 0.5 at O(0, 100∆x), apply outﬂow boundary condition along the elliptical
curved edge and Dirichlet along the remaining straight edges. Time evaluation of the source
ﬁeld for the spatial grid width ∆x = ∆y = 0.039 and time step size ∆t = 0.0195 is shown in
Figure 3.6
(a) t = 1.8135s
(b) t = 11.778s
(c) t = 12.6945s
(d) t = 13.7085s
Figure 3.6: Time evolution of the point source ﬁeld cos ωt with ω = 0.5 at (0, 100∆x) in the
object as shown in Figure 3.5 by imposing outﬂow boundary conditions along
the curved boundary and homogeneous Dirichlet along the straight boundaries.
Here the spatial step size, ∆x = ∆y = 0.039 and the time step size, ∆t = 0.0195
Further, we performed reﬁnement studies for the same structure and obtained the fourth-
order accuracy in time by computing the L∞ error for the ﬁnal time T = 5.0 as shown in
the plot 3.7.
3.4.4 Convergence Studies Using Analytical Solution
We perform this experiment to conﬁrm the order of accuracy of our scheme using an
analytical solution. We use the fourth order scheme (see [14] for the proof) as given in
Equation 3.9 which includes the source term s,
un+1 − 2un + un−1 = −β2Cxyz[un] −
β2Dxyz − β4
12
(cid:0)sn+1 + 10sn + sn−1(cid:1) +
+
β2
12α2
(cid:19)
Cxyz
Cxyz[un]
β2
12α4Cxyz[sn]
(3.9)
(cid:18)
74
(a) Field at t = 5.0
(b) Convergence plot
Figure 3.7: (a) Snapshot of the point source ﬁeld at time T = 5.0 and (b) Fourth order
convergence of 2D wave solver using Outﬂow boundary conditions along the
curve boundary with the spatial grid width ∆x = ∆y = 0.078. This is a self-
reﬁnement study which measures L∞ norm of the error at time T = 5.0 on the
object shown in Figure 3.5 with a point source cos(ωt), ω = 0.5 at the center of
the box, (0, 100∆x)
Here we use three dimensional integration operator Cxyz acting on the source term s for a
high-order correction. We derive an analytical solution for a point source sin(4πt) placed at
(x0, y0, z0) as follows,
(cid:112)(x − x0)2 + (y − y0)2 + (z − z0)2 + δ2
e−36(t/15−1).2sin(4πt)
1
4π
(3.10)
where δ ∈ R+, 0 < δ2 < ∆x
We choose a cubic domain Ω = [−1, 1]× [−1, 1]× [−1, 1], place a point source sin(4πt) at
the center (0, 0, 0), impose outﬂow boundary conditions along the all six boundary surfaces,
and run the evaluation up to dimensionless time T = 3.7, with a ﬁxed spatial resolution
of 80 × 80 × 80. The discrete L∞ norm of the error is constructed at each time step and
maximum error over all time steps is used to graph in Figure 3.8 for ∆t = 2×10−1
, k = 0 to
2k
3.
75
Figure 3.8: Fourth order convergence of 3D wave solver using outﬂow boundary conditions
with the spatial step size h = 2.5 × 10−2. This measures L∞ norm of the error
which compares with analytical solution using a point source sin(4πt) placed at
the center of the domain.
3.5 Summary
We derived a solution for high-order outﬂow boundary conditions using our MOLT based
scheme which uses the extended backward ﬁnite diﬀerence time-stencil along the boundaries
and applies a suitable initial condition. The computation begins with the second-order
scheme followed by the 4th-order scheme after three time-steps. We evaluated the scheme
using several test cases including curved boundaries. In the next chapter, we describe the
derivation of the high-order embedded Neumann boundary method for multi-dimensions.
76
CHAPTER 4: Higher Order Embedded Neumann Boundary Method
4.1 Introduction
In this chapter, we review the second-order embedded Neumann boundary method for
the scheme described in Chapter 2 and extend the order of spatiotemporal accuracy for
this scheme to arbitrary high-order. Further, we develop a general framework for complex
geometries in 3D.
We choose an embedded boundary method to deal with complicated geometries. As an
initial test for the simulation of magnetrons, we impose the embedded Neumann boundary
condition (later, in Chapter 5 we introduce our embedded PEC boundary condition). The
MOLT based scheme described in Chapter 2 gives second-order accuracy for embedded
Neumann boundary condition, therefore we want to improve the accuracy for arbitrary
higher-order in time and space.
In this chapter, we describe a high-order 3D embedded
Neumann boundary method and generalize it for complex geometries in 3D. The high-order
accuracy in time is achieved using the Lax-Wendroﬀ approach as explained previously in
Chapter 2. Our treatment for embedded Neumann boundary condition includes the ghost
points and boundary corrections. First, the solution at the ghost points are computed using
Hermite-Birkhoﬀ interpolation, and then interior points were updated using a boundary
correction iteration along with boundary stencils. Hence, we extend the interpolation stencils
to achieve high-order accuracy in space.
The embedded boundary method can deal with curved boundaries and material inter-
faces by superimposing boundaries on Cartesian grids and it also can handle oﬀ-grid points.
Thus, the embedded boundary method is the most suitable approach for complicated geome-
tries in diﬀerent application areas such as ﬂuid dynamics in aerospace engineering [55; 44],
77
electromagnetics [3], and acoustics [45; 2]. Further, several material interface problems are
dealt with embedded boundary methods such as dielectric materials. Here, the question
is how to handle wave propagation along with material/domain interfaces. Researchers
propose appropriate jump conditions [11; 3; 46; 22] which were derived from the domain’s
material properties and apply the suitable boundary conditions along the interfaces using
interpolation techniques. In [46], the embedded method was used for numerical modeling of
electromagnetic scattering of an incident plane wave by a dielectric circular cylinder. They
dealt with discontinuous wave propagation speed using appropriate jump conditions, but
the scheme is in second-order temporal and spatial accuracy for 2D problems. In [11] and
[18] second and fourth-order accuracy was achieved using an upwinding embedded boundary
method for Maxwell’s equations based on Runge-Kutta type time discretization. However,
the approach is very cost expensive and is not as fast as our MOLT based scheme. Another
method proposed in [3] uses interior boundary points instead of exterior ghost points to
apply Neumann or Dirichlet boundary conditions with the embedded boundary method and
gives fourth-order accuracy for the wave equation in 2D. The method is based on a compact
Pade-type discretization of spatial derivatives together with the Taylor series method in time
and it can remove small-cell stiﬀness problems for both Neumann or Dirichlet boundary con-
ditions. However, it also deals with cost expensive and slow matrix operations to obtain the
solution.
An approach similar to our scheme has been used in [9; 8]. They provide an implicit ADI
based numerical scheme for solving the heat and wave equations in a general embedded-
boundary domain with high-order spatial accuracy using Fourier continuation spatial approx-
imations and second-order temporal accuracy. Thus, the big diﬀerence from our approach
is that they use a Fourier-based method to invert the semi-discrete diﬀerential operator
instead of constructing and inverting the modiﬁed Helmholtz operator as we do. Besides,
the Fourier continuation method allows for minimal dispersion errors for wave propagation
problems. However, the schemes are only second-order accurate in time and higher-order
78
accuracy requires resorting to Richardson extrapolations.
In the following sections, we describe the higher-order Neumann boundary conditions
in Section 4.2, detailing its implementation in 1D, 2D, and 3D. In Section 4.3, we describe
the generalized embedded method suitable for any complex geometry and ﬁnally provide
numerical results.
4.2 Derivation of The Scheme
Neumann boundary conditions for boundary geometries along grid lines can be directly
imposed using a two-point boundary correction by utilizing surface grids in 2D and volu-
metric grids in 3D. The method can be applied to polygonal domains by the use of multiple
overset grids, each aligned with a boundary segment, which communicates with the interior
grid through interpolation on a ghost cell region. We chose the embedded boundary methods
explained in [13] as the basis to impose Neumann boundary conditions for curved boundary
surfaces in 3D. Here we determine the corresponding Dirichlet boundary values at the end-
points of each x−, y− and z− sweep lines that result in the approximate satisfaction of the
Neumann boundary condition.
The Neumann boundary condition is unconditionally unstable even when space is set
continuously [46; 22]. This means all embedded boundary methods need a touch of diﬀusion
to stabilize it. In our earlier work, we developed a O(∆t2) with one higher-order diﬀusion
term to stabilize it. Here we extend the diﬀusion term to the fourth-order accuracy.
4.2.1 One Dimensional Scheme
Let us consider 1D domain Ω ≡ [xa, xb] with a left boundary xB, and we choose a ghost
point xG, which is the exterior-point to the simulation domain (it should have at least one
interior neighbor). We give the formulation of the left domain in detail next,
As shown in Figure 4.1, xI and xII are two interior points, their distance from boundary
79
Figure 4.1: Geometry of 1D embedded Neumann boundary domain Ω ≡ [xa, xb] with left
boundary xB, ghost point xG, and interpolation points xI and xII, where the
distance ξI = |xI − xa|, ξII = |xII − xa|, and ξG = |xG − xa|
point are ξI = |xI − xa| and ξII = |xII − xa| along the normal. We deﬁne ξG as the distance
between ghost and boundary point (ξG = |xG− xa|). We apply Hermite-Birkhoﬀ interpolate,
P (ξ) through the points, xG, xa, xI, and xII by imposing the conditions P (cid:48)(0) = 0, P (ξI) = uI
and P (ξII) = uII, and explicitly solving for uG gives:
(cid:19)
(cid:18) ξ2
II − ξ2
II − ξ2
ξ2
I
G
(cid:18) ξ2
G − ξ2
II − ξ2
ξ2
I
I
(cid:19)
ul
II + O(∆x2),
ul+1
G =
ul
I +
We have introduced additional notation of l + 1 and l, which designate how we will set up
a ﬁxed point iteration using the Hermite-Birkhoﬀ interpolate. The iteration itself is a simple
ﬁrst order ﬁxed point method. Let w(l) and w(l+1) be the solutions at the lth and (l + 1)th
iterations. We choose a tolerance tol, and maximum number of allowed iterations mit, and
deﬁne a stopping criterion as |w(l+1) − w(l)|∞ < tol OR nit > mit where nit is the current
iteration number. The detailed steps for this iterative approach are given in Algorithm 16,
As detailed in [13] we need to include an artiﬁcial dissipation term into the solution to
maintain stability for such an embedded boundary method. Thus the second-order solution
un+1(x) at the next time step looks like,
un+1(x) − 2un(x) + un−1(x) = −β2D(1)
x [un] − D(2)
x [un−1].
(4.1)
where is an artiﬁcial dissipation coeﬃcient, that satisﬁes 0 < < 1. The Value of D(2)
at the previous time step can be used as D(2)
go with one more computing-level to obtain D(2)
x [un]
x [un−1] at the current time step, and we need to
x [un]. According to [13], β has to be reduced
80
Algorithm 16 Compute L−1 for 1D scheme
1: Compute particular solution I n+1 = I[un]
2: Initial guess wn+1(0) ≈ 3un − 3un−1 + un−2
3: Initialize the iteration counter nit = 0
4: repeat
5:
and wn+1(l)
I
II
G
=
G
I
I
6:
7:
(cid:17)
(cid:17)
wn+1(l)
II−ξ2
II−ξ2
ξ2
(cid:16) ξ2
(cid:16) ξ2
, at the interpolation points xI, and xII and as well as at
Compute wn+1(l)
the interpolation points along the right boundary
Compute solution at the ghost points using the following Hermite-Birkhoﬀ interpolant
wn+1(l)
Compute homogeneous coeﬃcients a1 and b1 using the fact (Here, xaG and xbG are
ghost points ),
wn+1(l) = I n+1 + a1e−α(x−xaG) + b1e−α(xbG−x).
Compute solution and update boundary stencil
wn+1(l+1) = I n+1 + a1e−α(x−xaG) + b1e−α(xbG−x)
nit = nit + 1
10: until (|wn+1(l+1) − wn+1(l)| < tol OR nit > mit)
G−ξ2
II−ξ2
ξ2
wn+1(l)
II
8:
9:
+
I
I
by a small amount from the case published in [14].
4.2.2 Two Dimensional Scheme
When extending the embedded Neumann boundary method to 2D cases, we need to
consider two important changes: every spatial point will be treated in x and y Cartesian
coordinates, and the ghost point approximation will be made by using bi-linear interpolation.
To achieve high-order accuracy, we need more computing-levels (perform P computing-levels
for 2P -th order of accuracy in time ) and extend the interpolation stencils by using 4-points
along the normal for the Hermite-Birkhoﬀ interpolation and an eight-point stencil for the bi-
linear interpolation. We can derive second-order and fourth-order 2D solution with artiﬁcial
dissipation ([13]) as given below,
un+1 − 2un + un−1 = −β2C(1)
(cid:18)
(cid:19)
xy [un] + C(2)
xy [un−1],
un+1 − 2un + un−1 = −β2C(1)
xy [un] −
β2D(2)
xy − β4
12
C(2)
xy
C(1)
xy [un] + C(3)
xy [un−1].
(4.2)
(4.3)
We now consider the situation of a two-dimensional domain with a curved boundary Γ
81
displayed in Figure 4.2. In the 2D case, we deﬁne a ghost point (xG, yG) which is an exterior
point (xi, yj), but at least one of the neighboring points (xi±1, yj) or (xi, yj±1) should be an
interior point. In order to compute the value of unknown uG at the ghost point location
(xaG, yaG), we construct a quadratic Hermite-Birkhoﬀ boundary interpolant P (ξ) along the
direction normal to the boundary, which intersects the boundary curve Γ at location (xa, ya).
We choose four interior points selected along the normal of the surface with distances, ξI =
|(xI, yI)− (xa, ya)| = ∆sI, ξII = |(xII, yII)− (xa, ya)| = 2∆sI, ξIII = |(xIII, yIII)− (xa, ya)| =
3∆sI and ξIV = |(xIV , yIV )−(xa, ya)| = 4∆sI, where we will typically take ∆sI =
2∆x, and
√
compute values uI, uII, uIII, and uIV at points (xI, yI), (xII, yII), (xIII, yIII), and (xIV , yIV )
respectively by interpolating from interior grid points, and use these interpolation values uI,
uII, uIII, and uIV to perform the Hermite-Birkhoﬀ interpolant P (ξ). Further, the distance
from the boundary to the ghost point is deﬁned as ξG = |(xG, yG) − (xa, ya)|.
(a) Geometry
(b) 8-points stencil
Figure 4.2: (a)Geometry of 2D embedded Neumann boundary method with a boundary
point (xB, yB), ghost point (xG, yG), and interpolation points (xI, yI), (xII, yII),
(xIII, yIII) and (xIV , yIV ) and (b) symmetric 8-points stencil to interpolate the
solution at the point (xI, yI).
The solution at the ghost point, uG can be computed using the Hermite-Birkhoﬀ inter-
polant P (ξ) by imposing the conditions P (cid:48)(0) = 0, P (ξI) = uI, P (ξII) = uII, P (ξIII) = uIII,
82
and P (ξIV ) = uIV given by,
ul+1
G =
ul
I +
(cid:19)
(cid:18) ξ2
III − ξ2
III − ξ2
ξ2
I
G
(cid:18) ξ2
IV − ξ2
IV − ξ2
ξ2
G
II
(cid:19)
ul
II +
(cid:18) ξ2
G − ξ2
III − ξ2
ξ2
I
I
(cid:19)
ul
III +
(cid:18) ξ2
G − ξ2
IV − ξ2
ξ2
II
II
(cid:19)
IV + O(∆x2),
ul
We obtain uI and uII by the approximations using the symmetric 8-point stencil inter-
polation. Suppose the interpolation point uI lies in a cell with corners (xi, yj), (xi+1, yj),
(xi+1, yj+1), and (xi, yj+1), we have the following approximation for uI
uI = w1ui,j + w2ui+1,j + w3ui+1,j+1 + w4ui,j+1
+ w5ui−1,j−1 + w6ui+2,j−1 + w7ui+2,j+2 + w8ui−1,j+2
(4.4)
where
w1 =
w3 =
w5 =
w7 =
4∆x∆y
(xi+2 − xI)(yj+2 − yI)
(xI − xi−1)(yI − yj−1)
(xi+1 − xI)(yj+1 − yI)
(xI − xi)(yI − yj)
4∆x∆y
4∆x∆y
4∆x∆y
w2 =
w4 =
w6 =
w8 =
4∆x∆y
(xI − xi−1)(yj+2 − yI)
(xi+2 − xI)(yI − yj−1)
(xI − xi)(yj+1 − yI)
(xi+1 − xI)(yI − yj)
4∆x∆y
4∆x∆y
4∆x∆y
As described in 1D (4.2.1), we use an iterative scheme to improve the accuracy of the
solution. For each term in the operator on the right hand side of the second order solution
given by equation 4.2, or fourth order in time solution given by equation 4.3, we must identify
x [un] +L−1
the correct ghost point that allows that term to guarantee that the normal derivative is zero
along the boundary. Taking the C(1)[un] operator and expanding the operator out we have
C(1)[un] = L−1
the ghost point for wx = L−1
y [wx] such
that when solving for the boundary correction terms the operator satisﬁes ∂nwx|∂Ω = 0,
∂nwy|∂Ω = 0, ∂nwxy|∂Ω = 0, ∂nwyx|∂Ω = 0.
y [un]. The ﬁxed point iteration identiﬁes
x [wy], and wyx = L−1
y [un], wxy = L−1
x [un], wy = L−1
y [un]−L−1
x [un]−L−1
x L−1
y L−1
83
Let us consider wx and wyx = L−1
y [wx], knowing that wy and wxy are similar. The
iterative process for wx starts by making an initial guess at the direct boundary values
wx|∂Ω, using an extrapolate in time wx
n−2 at each boundary
n+1,(0) ≈ 3wx
n−1 + wx
n − 3wx
point (Algorithm 17).
Algorithm 17 Compute wx(= L−1
x [u])
1: Compute the interior sweep (particular solution) I n+1
n − 3wx
2: Initial guess wx
3: Initialize the iteration counter nit = 0
4: repeat
5:
6:
n+1(0) ≈ 3wx
for k = 1 to ny do
n−1 + wx
x = Ix[un]
n−2
Compute wxI(k), and wxII(k), at the interpolation points (xI(k), y(k)), and
(xII(k), y(k)) and as well as at the interpolation points along the right boundary
using bilinear interpolation
Compute solution at the ghost points (xaG(k), y(k)) and (xbG(k), y(k)) using the
Hermite-Birkhoﬀ interpolant
Compute homogeneous coeﬃcients a1(k) and b1(k) using the fact
wx
Compute solution and update boundary stencil
wx
n+1(l) = I n+1 + ax(k)e−α(x−xaG(k)) + bx(k)e−α(xbG(k)−x)
n+1(l+1) = I n+1 + ax(k)e−α(x−xaG(k)) + bx(k)e−α(xbG(k)−x)
7:
8:
9:
10:
11:
end for
nit = nit + 1
12: until (|wx
n+1(l+1) − wx
n+1(l)| < tol OR nit > mit)
Given the update on wx, the next step is to consider wyx = L−1
y [wx]. The process starts
with the initial guess wx
n+1,(0) ≈ 3un − 3un−1 + un−2 at the boundary (Algorithm 18).
The same process is done for wy and wxy. The Hermite-Birkhoﬀ interpolate enforces
that the normal derivative is zero for wx, wy, wxy and wyx such that C(1)[un] satisﬁes the
normal derivative condition on ∂Ω to within tolerance. For the fourth order formulation, we
compute the boundary conditions for C(1)[un] and then we repeat the process for C(2) and
D(2) acting on C(1)[un].
Next, we will eamine the higher order embedded Neumann boundary method for 3D in
detail,
84
y [wx])
Algorithm 18 Compute wyx(= L−1
1: Compute the interior sweep (particular solution) I n+1
2: Initial guess wyx
3: Initialize the iteration counter nit = 0
4: repeat
5:
6:
n+1(0) ≈ 3un − 3un−1 + un−2
for k = 1 to nx do
y = Iy[wx
n]
Compute wyxI(k), and wyxII(k), at the interpolation points (x(k), yI(k), and
(x(k), yII(k)) and as well as at the interpolation points along the right boundary
using bilinear interpolation
Compute solution at the ghost points (x(k), yaG(k)) and (x(k), ybG(k)) using the
Hermite-Birkhoﬀ interpolant
Compute homogeneous coeﬃcients a1(k) and b1(k) using the fact
wyx
Compute solution and update boundary stencil
wyx
n+1(l) = I n+1 + ay(k)e−α(y−yaG(k)) + by(k)e−α(ybG(k)−y)
n+1(l+1) = I n+1 + ay(k)e−α(y−yaG(k)) + by(k)e−α(ybG(k)−y)
7:
8:
9:
10:
11:
end for
nit = nit + 1
12: until (|wyx
n+1(l+1) − wyx
n+1(l)| < tol OR nit > mit)
4.2.3 Three Dimensional Scheme
We now consider the situation of a three-dimensional domain with a curved-surface
boundary Γ displayed in Figure 4.3. In the 3D case, we deﬁne a ghost point (xaG, yaG, zaG)
which is an exterior point (xi, yj, zk), but at least one of the neighbouring points (xi±1, yj, zk)
or (xi, yj±1, zk) or (xi, yj, zk ± 1) of it should be an interior point. As described in the one and
two dimensional cases, to compute uG, we construct a quadratic Hermite-Birkhoﬀ boundary
interpolant P (ξ) along the direction normal to the boundary surface, which intersects the
boundary curved-surface Γ at location (xa, ya, za). We choose two interior points selected
along the normal of the surface with distances, ξI = |(xI, yI, zI) − (xa, ya, za) = ∆sI| and
ξII = |(xII, yII, zII) − (xa, ya, za) = 2∆sI|, where we will typically take ∆sI =
√
3∆x, and
compute values uI and uII, at points (xI, yI, zI) and (xII, yII, zII) respectively by interpo-
lating from interior grid points, and use these interpolation values uI and uII to perform the
Hermite-Birkhoﬀ interpolant P (ξ). Further, the distance from the boundary to the ghost
point is deﬁned as ξG = |((xaG, yaG, zaG)) − (xa, ya, za)|.
The solution at ghost point, uG can be computed using the Hermite-Birkhoﬀ interpolant
85
Figure 4.3: Geometry of 3D embedded Neumann boundary method with a boundary
point (xB, yB, zB), ghost point (xG, yG, zG), and interpolation points (xI, yI, zI),
(xII, yII, zII), (xIII, yIII, zIII) and (xIV , yIV , zIV )
P (ξ) by imposing the conditions P (cid:48)(0) = 0, P (ξI) = uI, and P (ξII) = uII, given by,
(cid:19)
(cid:18)ξ2
II − ξ2
II − ξ2
ξ2
I
G
(cid:19)
(cid:18) ξ2
G − ξ2
II − ξ2
ξ2
I
I
ul+1
G =
ul
I +
ul
II + O(∆x2),
We obtain uI and uII by the approximations using the tri-linear interpolation. Sup-
pose the interpolation point uI lies in a cell of rectangular prism with corners (xi, yj, zk),
(xi+1, yj, zk), (xi+1, yj+1, zk) , (xi, yj+1, zk), (xi, yj, zk+1), (xi+1, yj, zk+1), (xi+1, yj+1, zk+1), and
(xi, yj+1, zk+1), we have the following approximation for uI
uI = w1ui,j,k + w2ui+1,j,k + w3ui+1,j+1,k + w4ui,j+1,k
+ w5ui,j,k+1 + w6ui+1,j,k+1 + w7ui+1,j+1,k+1 + w8ui,j+1,k+1
(4.5)
86
where
w1 =
w3 =
w5 =
w7 =
∆x∆y∆z
(xi+1 − xI)(yj+1 − yI)(zk+1 − zI)
(xI − xi)(yI − yj)(zk+1 − zI)
(xi+1 − xI)(yj+1 − yI)(zI − zk)
(xI − xi)(yI − yj)(zI − zk)
∆x∆y∆z
∆x∆y∆z
∆x∆y∆z
∆x∆y∆z
(xI − xi)(yj+1 − yI)(zk+1 − zI)
(xi+1 − xI)(yI − yj)(zk+1 − zI)
(xI − xi)(yj+1 − yI)(zI − zk)
(xi+1 − xI)(yI − yj)(zI − zk)
∆x∆y∆z
∆x∆y∆z
w2 =
w4 =
w6 =
w8 =
∆x∆y∆z
Three dimensional 2-nd and 4-th order schemes can be deﬁned as follows,
un+1 − 2un + un−1 = −β2C(1)
(cid:18)
xyz[un−1],
xyz[un] + C(2)
(cid:19)
un+1 − 2un + un−1 = −β2C(1)
xyz[un] −
β2D(2)
xyz − β4
12
C(2)
xyz
C(1)
xyz[un] + C(3)
xyz[un−1].
(4.6)
(4.7)
y
x and L−1
in the z−
to perform z− sweeps. Taking the C(1)[un] operator
x [un]−
z [un]. The ﬁxed point iteration identiﬁes the
We use the operators C and D in three dimensions, and need to operate L−1
direction in addition to L−1
and expanding the operator out we have C(1)[un] = L−1
L−1
x L−1
z L−1
ghost point for wx = L−1
wxz = L−1
operators satisﬁes ∂n(.)|∂Ω = 0 when solving for the boundary correction terms.
z [un], wyx = L−1
x [wzy], and wyxz = L−1
y [wx], wzy = L−1
y [wxz], and each of these
z L−1
y [un] − L−1
x [un], wy = L−1
z [wyx], wxzy = L−1
x L−1
y L−1
y [un], wz = L−1
x [wz], wzyx = L−1
x [un] − L−1
z [un] +L−1
y [un] +L−1
y L−1
y L−1
x L−1
z L−1
z [wy],
z
Let us consider wx, wyx = L−1
y [wx] and wzyx = L−1
z [wyx], knowing that computing
of (wy, wzy, wxzy), and (wz, wxz, wyxz) are similar. The iterative process for wx starts
by making an initial guess at the direct boundary values wx|∂Ω, using an extrapolation in
n−2 at each boundary point. Given the update on
n+1,(0) ≈ 3wx
n−1 + wx
n − 3wx
time wx
wx, the next step is to consider wyx = L−1
n+1,(0) ≈ 3wyx
n− 3wyx
wyx
the following step is to consider wzyx = L−1
n−1 + wyx
y [wx]. The process starts with the initial guess
n−2 at the boundary, and then using the update on wyx,
z [wyx]. The process starts with the initial guess
87
in Algorithm 19.
z
z [wyx])
x , L−1
y , and L−1
n+1,(0) ≈ 3un− 3un−1 + un−2 at the boundary. We use three separate algorithms for each
wzyx
sweep L−1
z , and each algorithm processes three major tasks: computation of a
particular solution, a boundary correction, and then obtaining the value of actual L−1. We
give the detailed steps for L−1
Algorithm 19 Compute wzyx(= L−1
1: Compute the interior sweep (particular solution) I n+1
2: Initial guess wzyx
3: Initialize the iteration counter nit = 0
4: repeat
5:
6:
7:
n+1(0) ≈ 3un − 3un−1 + un−2
for j = 1 to nx do
for i = 1 to ny do
z = Iz[wyx
n]
Compute wzyxI(i,j), and wzyxII(i,j), at the interpolation points (x(j), y(i), zI(i,j)),
and (x(j), y(i), (zII(i,j)) and as well as at the interpolation points along the right
boundary using bi-linear interpolation
Compute solution at the ghost points (x(j), y(i), zaG(i,j)) and (x(j), y(i), zbG(i,j)) using
the Hermite-Birkhoﬀ interpolant
Compute homogeneous coeﬃcients az(i,j) and bz(i,j) using the fact
wzyx
Compute solution and update boundary stencil
wzyx
end for
n+1(l) = I n+1 + az(i,j)e−α(z−zaG(i,j)) + bz(i,j)e−α(zbG(i,j)−z)
n+1(l+1) = I n+1 + az(i,j)e−α(z−zaG(i,j)) + bz(i,j)e−α(zbG(i,j)−z)
8:
9:
10:
11:
12:
13:
end for
nit = nit + 1
14: until (|wzyx
n+1(l+1) − wzyx
n+1(l)| < tol OR nit > mit)
Next, we are going to develop a generalized higher dimensional, higher-order scheme for
problems with complex geometries.
4.3 Treatment for Complex Geometries
We now consider domains with complex geometries such as a model for A6 magnetron
that has a set of arch areas joined together in 2D. The A6 magnetron has a complex ge-
ometry (Figure 1.2), and so it serves as a nice numerical test case for this boundary con-
dition as opposed to a physical accurate boundary condition in a magnetron. We chose an
embedded boundary method to solve such complex problems. The complex geometries in
88
higher-dimensional problems may be even harder. In our approach, however, since the higher
dimensional problems are solved with ADI schemes, we need to worry about one-dimensional
lines. Thus, we need to know all the relevant properties of each line to solve it. However,
these lines are going to be broken into several segments due to complex boundaries. First,
we compute boundary intersection points ( which may be in oﬀ-grids), where the grid lines
intersect with boundaries.
We give a detailed explanation for 3D complex geometries in this section. In our approach,
ﬁrst, we decompose higher-dimensional geometry to 2D slices and then represent it using 2D
graphs with a set of vertices and edges. The edges can be classiﬁed as straight-line edges
and curves or arches. The graph G can be given as,
G(V, E(ES, EA)), ES ≡ (vsi, vsj),
and EA ≡ (Oij, vai, vaj)
(4.8)
where vsi, vsj, vai, and vaj are the vertices and Oij is center of the arch (vai, vaj) which
may be a circular arch (4.4 (a)) or an elliptical arch (4.4 (b)). In Figure 4.4, each object
has three vertices (v1, v2, v3), two straight edges es1(≡ (v1, v2)), es2(≡ (v3, v1)), and one arch
edge ea1(≡ (v2, v3, Oa1)) that can be represented as a graph G([v1, v2, v3], [[es1, es2], [ea1)]])
(a) with circular arch
(b) with elliptical arc
Figure 4.4: 2D graph with 3 vertices (v1(x1, y1), v2(x2, y2), and v3(x3, y3), 2 straight edges
es1(≡ (v1, v2)), es2(≡ (v3, v1)) and 1 circular arch (a) or elliptical arch (b) edge
ea1(≡ (v2, v3, Oa1)) centered at Oa1(xa1, ya1).
89
4.3.1 Pre-computing
Before beginning the PDE evolution, we need to perform a pre-computation to identify
key characteristics of the geometry such as boundary and relevant parameter values along
the grid lines in each direction. The ﬂow diagram in Figure 4.5 shows major tasks performed
in pre-computation and results obtained at the end of each task.
Figure 4.5: Flow diagram of pre-computation for 3D problems with complex geometries; it
cuts a 3D object into 2D slices Sxy, Sxz, makes 2D graphs Gxy, Gxz, and identiﬁes
the properties of the line segments Px, Py, and Pz.
1. Slicing
This task decomposes a given 3D object into a set of 2D objects by slicing along the
grid lines. We can form three sets of 2D slices Syz, Sxz, and Sxy in each direction x,
y, and z respectively, but two sets are suﬃcient for our computation (if the object is
uniform along any direction, we need to process slicing in that direction only - yielding
90
a set of slices). Suppose we apply the slicing along z and y directions, we will have a
set of xy slices, Sxy placed along z with distance ∆z apart and a set of xz slices, Sxz
placed along y with distance ∆y.
2. Graphing
This task deﬁnes the geometry of each 2D slice using 2D graphs as expressed in equation
4.8. We have two sets of 2D slices, Sxy of size nz and Sxz of size ny where nz and ny are
the number of spatial grid points along the z and y directions respectively. Each slice
should have at least one 2D object. Suppose we assume a 3D object has only convex
surfaces along with primary directions x, y, and z, then 2D objects in the slices will
be deﬁned by
Gxy(k),
k = 1, 2, ...nz
Gxz(j),
j = 1, 2, ...ny
Therefore, we will obtain, nz + ny graphs.
3. Segmentation
This task generates line segments along each grid line in each direction x, y, and z using
intersections between the grid lines and the surface of the object. The segments along
x and y can be computed using the graph Gxy and along x and z can be computed
using the graph Gxz, but we should avoid unneeded duplicate computations for x in
order to reduce computing cost. In the ﬁnal stage of the pre-computation all relevant
parameters/properties Px of size nx × nsx, Py of size ny × nsy and Pz of size nz × nsz
for each line segment will be computed. Here nsx, nsy and nsz are the number of line
segments along x, y, and z respectively. Since, however, each line in the same direction
does not need to have the same amount of segments, nsx, nsy and nsz are not single
variables, they are vectors/arrays to keep track of the number of segments in each line.
91
Now, the sizes of Px, Py, and Pz are
nx(cid:88)
i=1
nsx(i),
ny(cid:88)
j=1
nsy(j) , and
nz(cid:88)
k=1
nsz(k) respectively.
After we are done with the pre-computation, we will have three sets of line segments with
their properties (Px, Py, and Pz) for each direction. These segments will be utilized by our
higher-order solver to obtain a solution with higher-order accuracy. Actually, during each
sweep, each line segment is treated individually with its own piecewise constant wave speed,
and then the 3D higher-order equation 2.77 is applied to compute the solution un+1 at the
next time step tn+1.
4.3.2 Geometry for A6 MDO in 2D
A model of A6M is a useful device to study high power microwave sources. This device
has a solid cathode and a solid anode with six cavities as shown in Figure 4.6-(a)). In our
modeling, we consider both categories solid and transparent cathodes ( [28; 56]) and design
the coupling horn by removing the top portion of a rectangular pyramid and ﬁxing it with a
cavity of the anode. If we install the whole pyramid in the same place, the top will be at the
center of the anode. In order to model the A6 magnetron with diﬀraction output (MDO)
in 2D, we assume that the cathode and anode are inﬁnitely long in the z-direction. Figure
4.6-(a) shows the axial view of the MDO in 2D. Suppose α1 and α2 are the angular widths
of the vane and cavity respectively, we can compute the angles θ1 and θ2 for each cavity of
arch in terms of α1 and α2 (Figure 4.6-(b)). The angle of the kth arch is given by,
1 = θk−1
θk
2 = θk−1
θk
1 + (k − 1)(α1 + α2)
1 + kα1 + (k − 1)α2
1 + α1 + α2 = θ1
1 + 2α1 + α2 = θ1
(4.9)
(4.10)
where, since this is a symmetric uniform model α1 + α2 = π
3 . The model has two closed
objects, a six wings object of anode with an object of circle inside which represents the
cathode. The geometry of a six wings object can be represented with a graph which has 24
vertices, 12 straight line edges and 12 arches [68].
92
During the pre-computation, for each horizontal and vertical lines, we ﬁnd the intersection
(a) Axial view
(b) Cavity angle
Figure 4.6: A6 MDO (a) Axial view with a solid cathode of radius rc in the center, the
anode with inner radius ra, vane radius rv, vane angle α1, cavity angle α2, and
a coupling horn of radius rh (b) the angle of kth cavity which is represented by
the angles θk
1 and θk
2 , θk
2 = θk
1 + α1.
points with edges of graphs and compute ghost, boundary, and interior points( (xI, yI) and
(xII, yII)) Figure 4.7 and relevant parameters which are needed to apply bi-linear interpo-
lation. The pre-computation task forms two sets of line segments for each direction which
have to be treated during x and y sweeps with our numerical scheme.
4.4 Numerical Results
In order to evaluate our numerical scheme, we consider several test cases in 2D and
3D using classic EM problems such as electromagnetic scattering of a plane wave by metal
circular cylinders ( which can be extended to dielectric circular cylinders and applicable to
photonic crystal applications), scattering of laser light by spherical objects, and electron
beams in A6M.
93
Figure 4.7: Geometrical structure of A6 MDO with key points; intersection, boundary, ghost,
and interpolation points required to obtain (a) x− and (b) y− sweeps.
94
4.4.1 Cylindrical Scattering
We begin with electromagnetic wave scattering of a plane wave by a metal cylinder of
radius r which is homogeneous along the z direction. By assuming the cylinder is inﬁnitely
tall, we can represent this case in a 2D model. We choose a rectangular domain Ω =
[−1, 1]×[−1, 1] with a circle of radius r = 0.5 with center at (0, 0), and analyse a point source
ﬁeld (cos(ωt), ω = 10) placed at (−1 + 3∆x, 0), ∆x = 0.0156 (Figure 4.8 - (a)) by applying
Neumann, outﬂow and Dirichlet boundary conditions along the circular boundary, vertical
rectangular borders (in y direction), and horizontal rectangular borders (in x direction)
respectively. We set the grid size to be 128 × 128, time step size ∆t = 0.0078, averaging
parameter β = 1.4 and dissipation coeﬃcient = 0.1. Figure 4.9 shows snapshots of the
scattered wave at diﬀerent time instants.
(a) A cylinder at the center (b) Two
cylinders
symmetrical
(c) Two
cylinders
asymmetrical
Figure 4.8: 2D model for a single cylindrical scatter at the (a) center and two (b) symmet-
ric and (c) asymmetric cylindrical scatters in the corners. Here the red mark
indicates the point source.
For the following two experiments, we use two circular scatters instead of one and consider
symmetric and asymmetric geometry with a rectangular domain Ω = [−2, 2] × [−2, 2]. We
move the point source to the bottom right corner and place another circular scatter with the
same radius r1 = 0.3 for the symmetric case (Figure 4.8 - (c)) and with radius r2 = 0.15 for
the asymmetric case (Figure 4.8 - (d)) at the top right corner (1.5, 1.5). We apply Neumann
and outﬂow boundary conditions along the circular and the four rectangular boundaries
respectively and retain the same values for parameters ∆x, ∆t, β , and . Figures 4.10 and
95
(a) t = 1.0608
(b) t = 2.0436
(c) t = 3.1668
(d) t = 4.329
Figure 4.9: Time evolution of a point source ﬁeld cos(ωt), ω = 10 with a circular scatter of
radius r = 0.5 in the center of the square domain Ω = [−1, 1]2 with a spatial
step size ∆x = ∆y = 0.0156 and time step size ∆t = 0.0078.
4.11 show snapshots of the wave at diﬀerent time instants for the symmetric and asymmetric
cases respectively.
(a) t = 1.2714
(b) t = 1.9032
(c) t = 2.886
(d) t = 4.29
Figure 4.10: Time evolution of a point source ﬁeld cos(ωt), ω = 10 with two symmetric
circular scatters of radii r1 = r2 = 0.3 in the bottom-left and top-right corners
of the square domain Ω = [−2, 2]2 with a spatial step size ∆x = ∆y = 0.0156
and time step size ∆t = 0.0078.
4.4.1.1 Convergence studies
We demonstrate fourth order convergence by performing a reﬁnement study on a 2D
square domain Ω = [−1, 1] × [−1, 1] with a circular scatter of radius r = 0.5 at the center,
and a point source cos(ωt), ω = 1 placed at (0,−1 + 3∆x), ∆x = 0.0156. This runs up to the
ﬁnal time T = 2.0, with a ﬁxed spatial resolution of 160 × 160 spatial points. The discrete
L∞ norm of the error is constructed at each time step and maximum error over all time
steps is used to graph 4.12 for ∆t = 6.25
2k
, k = 1 to 5, with Neumann and outﬂow boundary
96
(a) t = 1.2714
(b) t = 1.9032
(c) t = 2.886
(d) t = 4.29
Figure 4.11: Time evolution of a point source ﬁeld cos(ωt), ω = 10 with two asymmetrical
circular scatters of radii r1 = 0.3 and r2 = 0.15 in the bottom-left and top-right
corners of the square domain Ω = [−2, 2]2 with a spatial step size ∆x = ∆y =
0.0156 and time step size ∆t = 0.0078.
conditions along the circular and rectangular boundaries respectively.
Figure 4.12: Fourth order convergence of 2D wave solver using a point source cos(ωt), ω = 1
placed at (0,−1 + 3∆x), ∆x = 0.0156 on a square domain Ω = [−1, 1]2 with
a circular scatter of radius r = 0.5 at the center by imposing Neumann and
outﬂow along the circular and square boundaries. This is a self-reﬁnement
study which measures L∞ norm of the error at the ﬁnal time T = 2.0.
For the space convergence study, a point source cos(ωt), ω = 1 is placed at (−1+3∆x, 0),
and runs up to time T = 2.0, with ﬁxed CFL values 2 and 1, where the wave speed c is
97
chosen to be 1. Our L∞ norm error plots of the solution at time T = 2.0 with varying
resolutions show fourth order accuracy (Figure 4.13 - (a)). We also note that the normal
derivative along the boundary converges with fourth order accuracy (Figure 4.13 - (b)).
(a) Space convergence of solution
(b) Space convergence of boundary derivative
Figure 4.13: Convergence plots for a) space using entire solution and b) space using boundary
derivative on a square domain Ω = [−1, 1]2 with a circular scatter of radius
r = 0.5 at the center by imposing Neumann and outﬂow along the circular and
square boundaries. This is a self-reﬁnement study which measures L2 norm of
the error at the ﬁnal time T = 2.0.
4.4.2 Spherical Scattering
In this section, we evaluate our 3D scheme using the scattering of light by a spherical
object.
For the ﬁrst test case, we chose a cubic domain (Ω = [−1, 1] × [−1, 1] × [−1, 1]) with a
spherical scatter of radius r = 0.5 at the center of the cube (0, 0, 0) (Figure 4.14, (a)). We
place a source ﬁeld (cos(ωt), ω = 10) placed 3∆z above the center of the bottom surface(
xy plane), at (0, 0,−1 + 3∆z), ∆z = 0.0156 by applying Neumann and outﬂow boundary
conditions along the surface of the sphere and six surfaces of the cube respectively. We set
the grid size to be 128× 128× 128, time step size ∆t = 0.0078, averaging parameter β = 1.4
and dissipation coeﬃcient = 0.1. Figure 4.15 shows snapshots of the wave at diﬀerent time
98
instants.
(a) Single sphere
(b) Two symmetric spheres
(c) Two asymmetrical spheres
Figure 4.14: 3D model for a single spherical scatter at the center of a cube, two b) symmetric
and c) asymmetric spherical scatters in the corners.
Figure 4.15: Time evolution of a point source ﬁeld cos(ωt), ω = 10 with a spherical scatter
of radii r = 0.5 in the center of the cubic domain Ω = [−1, 1]3 with a spatial
step size ∆x = ∆y = ∆z = 0.0156 and time step size ∆t = 0.0078.
For the next two experiments we placed two equal sized spheres of radii r1 = r2 = 0.3
for the ﬁrst case and two spheres with diﬀerent radii r1 = 0.3 and r2 = 0.15 for the second
case at the corners of the cubical domain as shown in Figure 4.14 - (b), (c). We move the
point source to (0, 0,−1 + 3∆z), ∆z = 0.0156 and simulate the wave propagation by apply-
ing Neumann and outﬂow boundary conditions along the surface of the two spheres and six
surfaces of the cube respectively. We retain the same values of parameters ∆x, ∆t, β, and
as before. Figure 4.16 and 4.17 show snapshots of the wave at diﬀerent time instants for
the symmetric and asymmetric cases respectively.
99
Figure 4.16: Time evolution of a point source ﬁeld cos(ωt), ω = 10 with two symmetric
spherical scatters of radii r1 = r2 = 0.3 in the bottom-left and top-right corners
of the cubic domain Ω = [−1, 1]3 with a spatial step size ∆x = ∆y = ∆z =
0.0156 and time step size ∆t = 0.0078.
Figure 4.17: Time evolution of a point source ﬁeld cos(ωt), ω = 10 with two asymmetric
spherical scatters of radii r1 = 0.3 and r2 = 0.15 in the bottom-left and top-
right corners of the cubic domain Ω = [−1, 1]3 with a spatial step size ∆x =
∆y = ∆z = 0.0156 and time step size ∆t = 0.0078.
4.4.2.1 Convergence Studies
We demonstrate fourth order convergence in time and space by performing self-reﬁnement
studies on a rectangular domain Ω = [−1, 1]3 with a spherical scatter of radius r = 0.5 at
the center.
For the time convergence study, a point source cos(ωt), ω = 1 is placed at (0, 0,−1 +
3∆z), ∆z = 0.0156, and runs up to dimensionless time T = 2.0, with a ﬁxed spatial resolution
of 160×160×160 spatial points. The discrete L∞ norm of the error is constructed at each time
step and maximum error over all time steps is used to graph Figure 4.18 -a) for ∆t = 0.025
2k
, k = 1 to 5, with Neumann and outﬂow boundary conditions along the sphere surface and
cube surfaces respectively.
For the space convergence study, a point source cos(ωt), ω = 1 is placed at (0, 0,−1 +
3∆z), and evolves to dimensionless time T = 2.0, with ﬁxed CFL values 2, 1, and 0.5 where
100
(a) Time convergence
(b) Space convergence
Figure 4.18: Fourth order (a) time and (b) space convergence plots using a spherical scatter
of radius r = 0.5 at the center of a cubic domain Ω = [−1, 1]3 by imposing
Neumann and outﬂow along the surface of the sphere and cubic boundaries.
This is a self-reﬁnement study which measures (a) L∞ and (b) L2 norm of the
error at the ﬁnal time T = 2.0.
the wave speed c is chosen to be 1. Our L2 norm error plots of the solution at dimensionless
time T = 2.0 with varying resolutions show fourth order accuracy (Figure 4.18 -b)).
4.4.3 Electron Beam in Magnetrons
4.4.3.1 A6 Magnetron with Diﬀraction Output (MDO) in 2D
In this section, we test our solver using a modiﬁed model for the A6M in 2D. We use a
point source instead of voltage source and made two experiments using solid and transparent
cathodes with a coupling horn of radius rh = 4.9, radius of the vane resonators rv = 4.11,
the anode radius ra = 2.11, the cathode radius rc = 1.58, the angular width of the vane,
6 , and the cavity, α2 = π
α1 = π
circular source, cos(ωt)(|x2 + y2 − r2
6 . For the ﬁrst test case, we simulate the scattered wave of a
c| < 2∆x), ω = 10 placed along with the cathode. 4.19
shows snapshots of the wave captured at diﬀerent time instants.
In the second test case, we use six point-sources that are placed around the circle of the
101
cathode each nearby a cavity (we maintain 15o from left edge-radius of each cavity) and
simulate the waves propagated by these sources. 4.20 shows snapshots of the wave captured
at diﬀerent time instants. We applied Neumann boundary conditions along with the model
for both cases.
Figure 4.19: Time evolution of a circular source ﬁeld cos(ωt)(|x2 +y2−r2
c| < 2∆x), ω = 10 in
the model of A6RM with a solid cathode of radius rc = 1.58 and anode of inner
radius ra = 2.11, vane radius rv = 4.11 and the coupling horn radius rh = 4.9
and the angular width of the vane, α1 = π
6 , and the cavity, α2 = π
6 .
Figure 4.20: Time evolution of ﬁelds provided by six point sources in the model of A6M with
a transparent cathode of radius rc = 1.58 and anode of inner radius ra = 2.11,
vane radius rv = 4.11 and the coupling horn radius rh = 4.9 and the angular
width of the vane, α1 = π
6 , and the cavity, α2 = π
6 .
Further, we perform a convergence test by a reﬁnement study on a model for A6M
without cathodes. We set the radii for coupling horn, vane resonators, and anode as rh = 4.9
rv = 4.11, and ra = 2.11 respectively. The angular width of the vane, α1 = π
6 and the cavity
angle, α2 = π
6 . We place a point source cos(ωt), ω = 1 at the center (0, 0), and evolve to the
ﬁnal dimensionless time T = 2.0, with a ﬁxed spatial resolution of 160 × 160 spatial points.
The discrete L∞ norm of the error is constructed at each time step and maximum error
102
over all time steps is used to graph Figure 4.21 for ∆t = 0.3125
2k
, k = 1 to 6, with Neumann
boundary conditions along the model. It shows fourth order convergence.
(a) A6M
Figure 4.21: Fourth order convergence of ﬁelds provided by a point source cos(ωt), ω = 1
at the center (0, 0), of A6MDO with the anode of radii rh = 4.9 rv = 4.11, and
ra = 2.11 and angular width of the vane, α1 = π
6 . This
is a self-reﬁnement study which measures L∞ norm of the error at time T = 2.0
for ﬁxed spatial step size ∆x = ∆y = 3.13 × 10−2
6 and the cavity, α2 = π
4.4.3.2 A6 Magnetron in 3D
For this experiment, a 3D domain of Ω = [−5, 5]3 is chosen with radius of the vane
resonators rv = 4.11, anode radius ra = 2.11, cathode radius rc = 1.58, angular width of
vane 20o, cavity angle 40o, and thick h = 6 along z. We set the grid size to be 1283, time
step size ∆t = 0.039, averaging parameter β = 1.4 and dissipation coeﬃcient = 0.1. In
this simulation of A6 magnetron with a transparent cathode which is mimicked by using
six emitters/line sources, we calculated the time evolution by imposing embedded Neumann
boundary conditions over the boundary stencil during the x−, y−, and z− sweeps (Figure
103
4.23). Figure 4.22 shows the geometrical setup of the 3D A6 magnetron, indicating key
on/oﬀ mesh grid points used by the scheme.
(a) on x−sweep
(b) on y−sweep
(c) on z−sweep
Figure 4.22: Key points used for the simulation of 3D A6 magnetron; intersection, boundary,
ghost, and interpolation points required to obtain (a) x−, (b) y−, and (c) z−
sweeps.
4.4.3.3 Convergence Studies
Further, we perform self-reﬁnement studies to evaluate the consistency of our scheme
for A6 magnetron with a transparent cathode. We retain the same values of geometrical
parameters (rh, rv, ra, rc, α1, α2, and h) and emitters. The simulation evolves to the ﬁnal
104
(a) t = 1.5625
(b) t = 4.6875
(c) t = 6.25
(d) t = 11.7187
Figure 4.23: Time evolution of ﬁelds provided by six point sources in the model of 3D A6
magnetron with a transparent cathode of radius rc = 1.58 and anode of inner
radius ra = 2.11, vane radius rv = 4.11 and the coupling horn radius rh = 4.9,
the angular width of the vane, α1 = π
9 , and thickness
h = 6.
9 , cavity angle, α2 = 2π
time T = 2.0 by imposing Neumann boundary conditions along the model. For the time-
convergence test, we set a ﬁxed spatial resolution of 1283 spatial points, the discrete L∞
norm of the error is constructed at each time step and maximum error over all time steps is
used to graph Figure 4.24 -a) for ∆t = 0.3125
2k
, k = 1 to 4. It shows fourth order convergence.
(a) time convergence
(b) space convergence
Figure 4.24: Fourth order (a) time convergence and (b) space convergence of ﬁelds provided
by six point sources in the model of 3D A6 magnetron with a transparent
cathode and the anode of radii rh = 4.9 rv = 4.11, and ra = 2.11 and angular
width of the vane, α1 = π
6 . This is a self-reﬁnement
study which measures L∞ norm of the error at time T = 2.0 for a ﬁxed spatial
step size of ∆x = ∆y = 3.13 × 10−2
6 and the cavity, α2 = π
105
For space-convergence test, we use ﬁxed CFL values 2, 1, and 0.5 where the wave speed
c is chosen to be 1. Our L∞ norm error plots of the solution at dimensionless time T = 2.0
with varying resolutions show fourth order accuracy (Figure 4.24 -b)).
4.5 Summary
We proposed a fourth-order 3D embedded boundary method and developed a generalized
framework for complex geometry domains in 3D. We use the extended stencil for both com-
putations of the solution at the interpolation points (8-point stencil) and the ghost points (4-
point stencil) to achieve 4th-order convergence in space. And we use the same Lax-Wendroﬀ
approach (exchange the time derivative to spatial derivative) to obtain higher-order accuracy
in time. We gave a detailed description of the high-order Neumann boundary conditions and
its implementation in 1D, 2D, and 3D. We evaluated our scheme for several problems with
complex geometries including A6 magnetron in 3D. Since the magnetrons originally designed
using metal, our best choice to represent its boundaries should be PEC. Therefore, we discuss
the PEC boundary condition in the next chapter. We give descriptions of the derivation of
PEC boundary conditions for our MOLT based scheme and its evaluation.
106
CHAPTER 5: Second-Order PEC Boundary Condition
5.1 Introduction
In this chapter, we develop an embedded PEC boundary condition for the scheme de-
scribed in Chapter 2 and develop an exact simulation for magnetrons by solving the vector
potential form of Maxwell’s equations.
Since magnetrons are physically designed by metal boundaries and have complex ge-
ometries, we need PEC boundary conditions to represent the exact metal boundaries and
the embedded boundary method to deal with complex geometries. Hence, we derive an
embedded PEC boundary condition based on the electromagnetic vector potential for the
simulation of magnetrons in this chapter. We introduce a fast and geometrically ﬂexible
approach to calculate the solution to Maxwell’s equations in vector potential form under the
Lorenz gauge. As presented, the method is 4th order in time and second-order in space, but
the A-stable formulation could be extended to higher order in both time and space. While
there is no conceptual limitation to develop this in 3D, our initial work has centered on 2D.
The eventual goal is to combine this method with particle methods for the simulations of
plasma. In the current work, the scheme is evaluated for EM wave propagation within an
object that is bounded by PEC.
Vector potential formulations of electromagnetism are widely used in classical and quan-
tum physics. Maxwell’s equations describe the time-evolution of four ﬁelds: the magnetic ﬂux
density (B), the electric ﬁeld intensity (E), the electric ﬂux density (D), and the magnetic
ﬁeld intensity (H). It is often convenient to employ a formulation based on the magnetic
vector potential A and electric scalar potential φ. Maxwell’s equations then reduce to un-
coupled wave equations for the vector potential A and scalar potential φ under the Lorenz
107
gauge condition [37]. We use our MOLT based implicit A-stable scheme to solve these wave
equations.
A Perfect Electric Conductor (PEC) boundary condition is derived based on the conti-
nuity of the magnetic ﬂux normal to the boundary and is applied to the magnetic vector
potential. Our formulation avoids the additional compatibility conditions used in [34]. This
embedded PEC boundary condition is a slightly modiﬁed embedded Neumann boundary
condition. The fundamental diﬀerence is that one of the partial spatial derivatives in the
PDE is negative. As presented, the current embedded boundary formulation is second-order
in space but could be extended to higher-order.
In this chapter, we ﬁrst derive wave equations for vector and scalar potentials A and
φ under the Lorenz gauge condition discussed in section 5.2. Then we describe our two-
dimensional high-order implicit scheme using ADI splitting and the higher-order embedded
PEC boundary conditions in section 5.3, and ﬁnally we give numerical results for several test
cases in Section 5.5.
5.2 2D Electric Scalar and Magnetic Vector Potential
The macroscopic Maxwell’s equations are:
∂tB = −∇ × E,
∂tD = ∇ × H − J,
∇ · B = 0,
∇ · D = ρ.
(5.1)
(5.2)
(5.3)
(5.4)
where J is the electric current density, and ρ is the electric charge density. In a linear
isotropic medium, D = E and B = µH. The dielectric constant = 0r with 0 and r
being the free space and relative permittivity respectively, and the permeability µ = µ0µr
with µ0 and µr being the free space and relative permeability respectively.
108
The electric scalar potential φ and magnetic vector potential A are related to E and B by
E = −∇φ − ∂tA,
B = ∇ × A.
For free space, Ampere’s law (Equation 5.2) is,
1
c2 ∂tE = ∇ × B − µ0J.
(5.5)
(5.6)
(5.7)
where the free space phase velocity c = 1√
(µ00).
Substituting E and B using equations 5.5 and 5.6 and imposing the Lorenz gauge con-
dition
1
c2 ∂tφ = −∇A.
a wave equation in terms of the magnetic vector potential A results:
t A − ∇2A = µ0J.
1
c2 ∂2
(5.8)
Similarly by imposing the Lorenz gauge condition, a wave equation for the scalar potential
φ in free space is obtained:
t φ − ∇2φ =
1
c2 ∂2
ρ
0
.
(5.9)
5.3 Perfect Electric Conductor Boundary
The electric ﬁeld (E) is continuous along the boundary and the magnetic ﬂux (B) is
continuous along the normal to the boundary. Since the Perfectly Conducting Boundary
(PEC) has inﬁnite electrical conductivity (σ = ∞), there will be no interior electric ﬁeld
(E2 = 0) in the perfect conductor. It also follows that there is no magnetic ﬁeld (H2 = 0).
109
Figure 5.1: Boundary surface between two regions with electric ﬁelds E1 and E2, magnetic
ﬁeld H1 and H2, electric ﬂux densities D1 and D2, and magnetic ﬂux densities
B1 and B2 respectively. Here, Js is the surface current, ρs is the surface charge
density, and n is the normal vector pointed out from the region two (Perfect
Electric Conductor).
The boundary conditions for the PEC become:
n × E1 = 0,
n × H1 = Js,
n · B1 = 0,
n · D1 = ρs.
(5.10)
(5.11)
(5.12)
(5.13)
where n is the unit normal vector to the boundary and Js and ρs are the surface current
density and charge density respectively.
Consider a two-dimensional vector/scalar potential (we are solving 2D problems in this
section). If we choose to restrict B to the x-y plane, the vector potential A only has a z
component,
A = Az(x, y)z.
(5.14)
Using equation 5.6 and the boundary condition represented by equation 5.12 we get
n · (∂yAz − ∂xAz) = 0.
(5.15)
110
5.4 Numerical Approximation for PEC Boundary Conditions
In this section we discuss enforcing boundary conditions with the above method. For
PEC boundary conditions, we extend our previous work on Neumann boundary conditions
to PEC (Chapter 4, [13]). Before we move with boundary conditions, let us express the
second and fourth order two dimensional schemes,
An+1 − 2An + An−1 = −β2C(1)
An+1 − 2An + An−1 = −β2C(1)
xy [An].
xy [An] −
(cid:18)
β2D(2)
xy − β4
12
(cid:19)
C(2)
xy
C(1)
xy [An].
(5.16)
(5.17)
where superscripts on the operators Cxy and Dxy denote level numbers.
On a domain Ω = [xa, xb]× [ya, yb] outﬂow boundary and PEC conditions can be deﬁned
by,
1. PEC:
∂yAz(xa, t) = 0, −∂yAz(xb, t) = 0,
−∂xAz(ya, t) = 0,
∂xAz(yb, t) = 0.
2. Outﬂow boundary condition:
∂tA(xa, t) = c∂xA(xa, t) , ∂tA(xb, t) = −c∂xA(xb, t),
∂tA(ya, t) = c∂yA(ya, t) , ∂tA(yb, t) = −c∂yA(yb, t).
Unlike Neumann boundary conditions, the challenge here is that PEC needs y derivatives
during the x− sweeps and x derivatives during the y− sweeps. This is challenging for an
ADI method because of the separation of directional information.
For a rotated square, the PEC boundary condition can use a similar approach as used
in the original method for Neumann boundaries in [13] because the directions are coupled
at the boundary. Rotating the domain through any angle, say θ, as shown in ﬁgure 5.2, we
have,
111
1. Left: cos(θ)∂yAz − sin(θ)∂xAAz = 0,
2. Right: −cos(θ)∂yAz + sin(θ)∂xAz = 0,
3. Up: sin(θ)∂yAz − cos(θ)∂xAz = 0,
4. Down: −sin(θ)∂yAz + cos(θ)∂xAz = 0.
Figure 5.2: A PEC square cavity rotated by an angle θ with normal vectors nL, nD, nR, and
nU along the left, down, right, and up boundaries.
In the next sections, we demonstrate how to construct a solution for A given the Dirichlet
boundaries, followed by an extension of the ﬁxed point map for Neumann boundaries in [13]
to the case of PEC to generate the Dirichlet boundary conditions (including an adaptation
for the mesh aligned case).
5.4.1 Using Dirichlet Boundary Conditions for A to Capture PEC Boundary
In our embedded boundary method for Neumann, we constructed an eﬃcient conver-
gent Neumann to Dirichlet Map, converting a boundary condition on the outward normal
derivative of a function into a condition that constructs ghost points outside the domain.
The method is designed such that the generated ghost points enforce the desired Neumann
112
condition on the boundary. In the next two sections, we adapt this idea to PEC, which is a
condition on the tangential derivative.
We now show how to carry out the sweeps with given values at the end of each line of
the 4th order ADI method. In our 2D scenario, to construct the operator L−1
to
build C(1), C(2) and D(2), each line is treated as a 1D problem, where the lines are evaluated
with x− sweeps and y− sweeps respectively [13]. We note that the operators that make up
D(k) and C(k) have their own boundary conditions to enforce at level k.
x and L−1
y
The objective is given the boundary condition, solve for each ax and bx along the x−
direction and ay and by along the y− direction. We will denote A(ti, xa) = AL(ti), A(ti, xb) =
AR(ti), A(ti, ya) = AD(ti) and A(ti, yb) = AU (ti). Here, [xa, xb] and [ya, yb] are horizontal
and vertical lines that we make sweeps along. These permit the boundary conditions to be
incorporated into the method.
As we did for wave solvers in our early work, [15], taking AL(ti) and AR(ti) as ﬁxed, for
component of the second order term C(1) along the line ∂Ω = [xa, xb], we arrive at
the L−1
x
two equations in two unknowns:
AL(tn+1) + (β2 − 2)AL(tn) + AL(tn−1)
I
AL(tn) +
(xa) + ax − Mxbx
,
AR(tn+1)(β2 − 2)AR(tn) + AR(tn−1)
I
AR(tn) +
(xb) + Mxax + bx
.
= β2(cid:16)
= β2(cid:16)
(cid:104)
(cid:104)
µ0
α2 Jn(cid:105)
α2 Jn(cid:105)
µ0
(cid:17)
(cid:17)
where Mx = e−α(xb−xa). We can rearrange the linear system into unknown and known values,
ax + Mxbx = wP
a ,
Mxax + bx = wP
b .
113
Solving the linear system for the unknowns ax and bx gives
ax =
(wP
a − MxwP
b )
(1 − M 2
x )
,
bx =
(wP
b − MxwP
a )
(1 − M 2
x )
,
(5.18)
where
(cid:16)
(cid:16)
wP
a =
wP
b =
1
β2
1
β2
AL(tn+1) + (β2 − 2)AL(tn) + AL(tn−1)
AR(tn+1) + (β2 − 2)AR(tn) + AR(tn−1)
(cid:104)
(cid:17) − I
(cid:17) − I
(cid:104)
µ0
α2 Jn(cid:105)
α2 Jn(cid:105)
µ0
AL(tn) +
(xa),
AR(tn) +
(xb).
Taking AD(ti) and AU (ti) as ﬁxed for L−1
y of the second order term C(1) along the line
∂Ω = [ya, yb], we arrive at two equations in two unknowns for ay and by:
AD(tn+1)(β2 − 2)AD(tn) + AD(tn−1) = β2(cid:16)
AU (tn+1)(β2 − 2)AU (tn) + AU (tn−1) = β2(cid:16)
(cid:104)
(cid:104)
µ0
α2 Jn(cid:105)
α2 Jn(cid:105)
µ0
I
AD +
(ya) + ay + Myby
I
AU +
(yb) + Myay + by
(cid:17)
(cid:17)
.
,
(5.19)
(5.20)
where My = e−α(yb−ya). Solving the linear system for the unknowns ay and by gives
ay =
(wP
a − MywP
b )
(1 − M 2
y )
,
by =
(wP
b − MywP
a )
(1 − M 2
y )
,
where
wP
a =
wP
b =
1
β2
1
β2
(cid:16)
(cid:16)
AD(tn+1) + (β2 − 2)AD(tn) + AD(tn−1)
AU (tn+1) + (β2 − 2)AU (tn) + AU (tn−1)
(cid:17) − I
(cid:104)
(cid:17) − I
(cid:104)
AD(tn) +
AU (tn) +
µ0
α2 Jn(cid:105)
α2 Jn(cid:105)
µ0
(ya)
(yb).
For the operator C(2) + D(2) acting on C(1)[An], which is the higher order correction, one
can obtain similar equations, except that the values of AL(ti), AR(ti), AD(ti) and AU (ti) at
the boundaries are explicitly zero. This is due to linearity and the leading order operator
satisﬁes the boundary condition exactly.
114
5.4.2 Embedded Boundary Method, An eﬀective Neumann to Dirichlet Map
for Creating Ghost Points
Given that we know how to solve for the Dirichlet boundary conditions for equations
5.16 and 5.17, as in [14], we would like to extend this process to the PEC case. As discussed
in [13], the reason to employ an embedded boundary approach for a diﬀerential boundary
condition has to do with stability. The diﬀerence here is we need to develop an iterative
map to ﬁnd the homogeneous coeﬃcients a and b for each piece of the operator instead of
directly computing these coeﬃcients.
Our embedded boundary method is based on our initial work in [13]. In this paper, we
developed a second-order embedded boundary method for Neumann boundary conditions
which was second-order accurate in time and space. Using the same ideas, we extended
this method to PEC boundary conditions for the vector potential in 2D. However, in [13],
direct application through the ADI of the operators is more direct. Extending this work to
fourth-order involves developing an approach to solving for the boundary conditions for the
operators C(1), C(2), and D(2).
We use an iterative method to obtain accurate values at the boundaries using an initial
approximation and correct it by imposing our PEC boundary condition through a ﬁxed
point iteration. The ﬁxed point iteration is a multi-step process that converts the derivative
condition along the boundary to a Dirichlet boundary condition, to be set at the ghost
point. These Dirichlet boundary values force the solution to satisfy the PEC condition at
the boundary to within the tolerance of iteration of the ﬁxed point method. It should
be noted that the iteration is local at the boundary and does not involve re-
computing the interior sweeps, making this update cost-eﬀective.
To understand the method, we start by considering a collection of uniform points with
a boundary passing through, see ﬁgure 5.3. We deﬁne the ghost points to be the collection
of points that are greater than one half of a grid spacing away, but less than two and
a half grid spacings away, from the boundary. As in our work in [13], starting from the
115
Figure 5.3: Geometry of 2D embedded PEC boundary method with a boundary point
(xB, yB), ghost point (xG, yG), and interpolation points (xI, yI), (xII, yII), and
(xIII, yIII).
ghost point we deﬁne a normal to the surface, along which we will enforce the boundary
condition. We deﬁne the ﬁxed point method along this normal.
In ﬁgure 5.3, the ghost
point being considered is the red grid point labeled (xG, yG), the blue point labeled (xB, yB)
is the boundary point and there are three interior points needed that are related to the
normal, at (xI, yI), (xII, yII), and (xIII, yIII). Their distances from the boundary point
2∆sI , ξII = |(xII, yII) − (xB, yB)| =
(xB, yB) are ξI = |(xI, yI) − (xB, yB)| =
and ξIII = |(xIII, yIII) − (xB, yB)| = 2∆sI, where we will typically take ∆sI =
√
2∆sI,
√
2∆x.
√
The ﬁxed point method starts by assuming we know AI, AII, AIII, and ∂T AB, which
is taken as a solution at {(xI, yI), (xII, yII) and (xIII, yIII)} and the tangential derivative
of a solution at (xB, yB). Here we enforce ∂T AB = 0 by making ∂T P|(xB,yB) = 0 one
of the conditions we use to solve for the coeﬃcients of the Hermite-Birkhoﬀ interpolating
polynomial, P (x, y) = c0 +c1x+c2y +c3xy. Enforcing that the Hermite-Birkhoﬀ interpolates
these points and explicitly solving for AG gives:
Al+1
G = ΓIAl
I + ΓIIAl
II + ΓIIIAl
III + O(∆x2),
(5.21)
116
where,
where,
ΓI = 1 +
γ3(γ4 + γ5) − γ6(γ1 + γ2)
(γ4γ2 − γ5γ1)
,
ΓII =
ΓIII =
,
(γ6γ2 + γ3γ5)
(γ4γ2 − γ5γ1)
(γ6γ1 + γ3γ4)
(γ4γ2 − γ5γ1)
,
γ1 = xII − xI + yII − yI,
γ2 = xI − xIII + yI − yIII,
γ3 = xG − xI + yG − yI,
γ4 = (xII − xI)(xB − yB) + xIyI,
γ5 = (xI − xIII)(xB − yB) + xIIIyIII,
γ6 = (xB − yB)(xG − x1) + xG(yG − y1).
(5.22)
(5.23)
We have introduced additional notation of l + 1 and l, which designate how we will set up a
ﬁxed point iteration using the Hermite-Birkhoﬀ interpolant. The iteration itself is a simple
ﬁrst order ﬁxed point method. Let w(l) and w(l+1) be the solutions at the lth and (l + 1)th
iterations. We choose a tolerance tol, and maximum number of allowed iterations mit, and
deﬁne a stopping criterion as |w(l+1) − w(l)|∞ < tol OR nit > mit where nit is the current
iteration number. We will now discuss the quasi local process for obtaining w(l+1) given the
rest of the information.
For each term in the operator on the right hand side of the second order solution given
by equation 5.16, or fourth order in time solution given by equation 5.17, we must identify
the correct ghost point that allows that term to guarantee that the tangential derivative is
zero along the boundary. Taking the C(1)[An] operator and expanding the operator out we
117
x [An] + L−1
have C(1)[An] = L−1
y [An] − L−1
identiﬁes the ghost point for wx = L−1
L−1
y [wx] such that when solving for the boundary correction terms the operator satisﬁes
∂T wx|∂Ω = 0, ∂T wy|∂Ω = 0, ∂T wxy|∂Ω = 0, ∂T wyx|∂Ω = 0.
x L−1
y [An], wxy = L−1
y L−1
x [An] − L−1
x [An], wy = L−1
y [An]. The ﬁxed point iteration
x [wy], and wyx =
Let us consider wx and wyx = L−1
y [wx], knowing that wy and wxy are similar. The
iterative process for wx starts by making an initial guess at the direct boundary values
wx|∂Ω, using an extrapolate in time wx
n−2 at each boundary
n+1,(0) ≈ 3wx
n−1 + wx
n − 3wx
point (Algorithm 20).
Algorithm 20 Compute wx(= L−1
x [A])
1: Compute the interior sweep (particular solution) I n+1
n − 3wx
2: Initial guess wx
3: Initialize the iteration counter nit = 0
4: repeat
5:
6:
n+1(0) ≈ 3wx
for k = 1 to ny do
n−1 + wx
n−2
x = Ix[An]
Compute wxI(k), and wxII(k), at the interpolation points (xI(k), y(k)), and
(xII(k), y(k)) and as well as at the interpolation points along the right boundary
using bilinear interpolation
Compute solution at the ghost points (xaG(k), y(k)) and (xbG(k), y(k)) using the
Hermite-Birkhoﬀ interpolant
Compute homogeneous coeﬃcients a1(k) and b1(k) using the fact
wx
Compute solution and update boundary stencil
wx
n+1(l) = I n+1 + ax(k)e−α(x−xaG(k)) + bx(k)e−α(xbG(k)−x)
n+1(l+1) = I n+1 + ax(k)e−α(x−xaG(k)) + bx(k)e−α(xbG(k)−x)
7:
8:
9:
10:
11:
end for
nit = nit + 1
12: until (|wx
n+1(l+1) − wx
n+1(l)| < tol OR nit > mit)
Given the update on wx, the next step is to consider wyx = L−1
y [wx]. The process starts
with the initial guess wx
n+1,(0) ≈ 3An − 3An−1 + An−2 at the boundary (Algorithm 21).
The same process is done for wy and wxy. The Hermite-Birkhoﬀ interpolate enforces
that the tangential derivative is zero for wx, wy, wxy and wyx such that C(1)[An] satisﬁes
the tangential derivative condition on ∂Ω to within tolerance.
For the fourth order formulation, we compute the boundary conditions for C(1)[An] and
then we repeat the process for C(2) and D(2) acting on C(1)[An].
118
y [wx])
Algorithm 21 Compute wyx(= L−1
1: Compute the interior sweep (particular solution) I n+1
2: Initial guess wyx
3: Initialize the iteration counter nit = 0
4: repeat
5:
6:
n+1(0) ≈ 3An − 3An−1 + An−2
for k = 1 to nx do
y = Iy[wx
n]
Compute wyxI(k), and wyxII(k), at the interpolation points (x(k), yI(k), and
(x(k), yII(k)) and as well as at the interpolation points along the right boundary
using bilinear interpolation
Compute solution at the ghost points (x(k), yaG(k)) and (x(k), ybG(k)) using the
Hermite-Birkhoﬀ interpolant
Compute homogeneous coeﬃcients a1(k) and b1(k) using the fact
wyx
Compute solution and update boundary stencil
wyx
n+1(l) = I n+1 + ay(k)e−α(y−yaG(k)) + by(k)e−α(ybG(k)−y)
n+1(l+1) = I n+1 + ay(k)e−α(y−yaG(k)) + by(k)e−α(ybG(k)−y)
7:
8:
9:
10:
11:
end for
nit = nit + 1
12: until (|wyx
n+1(l+1) − wyx
n+1(l)| < tol OR nit > mit)
Further as detailed in [13] we need to include an artiﬁcial dissipation term to maintain
stability for these embedded boundary domains. Thus we have
(cid:18)
β2D(2)
xy − β4
12
C(2)
xy
(cid:19)
An+1 − 2An + An−1 = −β2C(1)
xy [An] −
C(1)
xy [An] + C(3)
xy [An−1].
where is an artiﬁcial dissipation coeﬃcient that satisﬁes 0 < < 1. The value of C(3)
at the previous time step can be used in place of C(3)
we need to go to one more computing level to obtain C(3)
xy [An]
xy [An−1] at the current time step, and
xy [An] [68]. We obtain AI, AII, and
AIII approximately, using the bilinear interpolation. Suppose the interpolation point AI
lies in a rectangular cell with corners (xi, yj), (xi+1, yj), (xi+1, yj+1), and (xi, yj+1). Then we
have the following approximation for AI
AI =w1Ai,j + w2Ai+1,j + w3Ai+1,j+1 + w4Ai,j+1,
(5.24)
119
where
w1 =
w3 =
(xi+1 − xI)(yj+1 − yI)
(xI − xi)(yI − yj)
∆x∆y
∆x∆y
,
,
w2 =
w4 =
(xI − xi)(yj+1 − yI)
(xi+1 − xI)(yI − yj)
∆x∆y
∆x∆y
,
.
(5.25)
5.5 Experimental Results
5.5.1 Square Cavity Rotated Through Diﬀerent Angles
For this experiment, we chose a square cavity bounded by a PEC, placed a point source
at the center of the cavity, and performed a ping test and mode analysis. First, we computed
the vector potential A at a point inside the box using our scheme, then computed the derived
frequency by taking FFT over the time history of the measured A, and ﬁnally analyzed the
frequency modes for varying CFL value, step size, and rotation angle. Further, we have
transformed into physical units here in contrast to the normalized units we have used up to
this point.
We chose a square box 21cm × 21cm with PEC boundaries in a square domain (Ω =
[−21cm, 21cm]2), placed a point source 1 at the center of the box,(0, 0), and turned it on for
a single time step t = ∆tns. The vector potential is measured at the point (3.36cm, 3.36cm)
for the time period t = [∆tns, T ns]. The derived frequencies for diﬀerent CFL values (0.5,
1.0, 2.0), rotation angles (0o, 31.42o, 45o), and resolutions are summarized in tables 5.5.1
and 5.5.1. Table 5.5.1 consists of frequencies obtained for the CFL value 1. Here, we set
the wave speed c = 30 Gcms−1, averaging parameter β = 1.4 and dissipation coeﬃcient
= 0.1. Figure 5.4 shows the frequency distribution for θ = 31.42o which consists the 1-1
strong mode fundamental frequency 1GHz as expected and Figure 5.5 shows a frequency
mode computed for diﬀerent resolutions with CFL value 1. We see clear convergence to the
analytically computed 1-1 mode fundamental frequency 1GHz. For the ﬁxed CFL and wave
speed, time step size ∆t reduces with reducing spatial step size ∆x. Hence, the amplitude
120
is reducing by the decreasing spatial step size.
Figure 5.4: Frequency distribution for 31.42o rotated 21cm × 21cm square cavity computed
using the measured vector potential at the point (3.36cm, 3.36cm) for the impulse
response, h = 0.084 cm
Figure 5.5: A strong fundamental mode for 31.42o rotated 21cm × 21cm square cavity com-
puted for diﬀerent resolutions.
121
(a) 1-3 mode,
f = 2.2588GHz
(b) 3-3 mode,
f = 3.0305GHz
(c) 1-5 mode,
f = 3.6422GHz
(d) 3-5 mode,
f = 4.1650GHz
(e) 5-5 mode,
f = 5.0508GHz
Figure 5.6: Convergence plots for the natural modes (a) 1-3, (b) 3-3, (c) 1-5, (d) 3-5, and
(e) 5-5 with analytically computed frequencies (a) f = 2.2588GHz, (b) f =
3.0305GHz, (c) f = 3.6422GHz, (d) f = 4.1650GHz, and (e) f = 5.0508GHz
for 31.42o rotated 21cm× 21cm square cavity computed for diﬀerent resolutions.
122
Rotated angle, θ
Grid size, N
CF L = 0.5, T = 50ns CF L = 2, T = 200ns
45o
31.42o
31.42o
45o
0o
0o
502
1002
1502
2002
2502
0.98
1.00
1.00
1.00
1.00
0.99
1.00
1.00
1.00
1.00
0.96
0.98
0.98
1.00
1.00
0.98
1.00
0.99
1.00
1.00
0.98
0.99
1.00
1.00
1.00
0.96
0.98
0.98
0.99
1.00
Table 5.1: Frequency (in GHz) obtained at the point (3.36cm, 3.36cm) using the ping test.
Rotated angle, θ
Grid size, N
0o
31.42o
45o
502
1002
1502
2002
2502
9.599802E-1
9.899800E-1
9.899800E-1
1.000015E+0
1.000008E+0
9.899800E-1
9.999800E-1
9.999800E-1
1.000115E+0
1.000108E+0
9.899800E-1
9.999800E-1
9.999800E-1
1.000015E+0
1.000008E+0
Table 5.2: Frequency (in GHz) obtained at the point (3.36cm, 3.36cm) using the ping test
for CFL 1.
Further, we evaluated the frequencies of the normal modes 1-3, 3-3, 1-5, 3-5, and 5-5.
Our convergence plots (Figure 5.6) show the clear convergence to the analytically computed
frequencies 2.2588GHz, 3.0305GHz, 3.6422GHz, 4.1650GHz, and 5.0508GHz for the modes
1-3, 3-3, 1-5, 3-5, and 5-5 respectively.
Figure 5.7 shows the time evolution of the potential A generated by a point source
sin(2πf t) with f = 1GHz placed at the center of the box. We chose the same domain as
used in the previous experiment and set the grid size to be 100×100, time step size ∆t = 7ps.
5.5.1.1 Convergence Studies and Error Analysis
We evaluate the consistency of our scheme via (1) the ping test and (2) space-time
convergence studies for the magnetic vector potential A in the PEC square cavity.
First, we perform a convergence study given the initial condition sin(cid:0) mπx
(cid:1) sin(cid:0) nπy
(cid:1) with
L
H
frequency mode 3-2 (m = 3, n = 2) over a square domain [0cm, 21cm]2 and compare our
123
(a) θ = 0o,
t = 0.637ns
(b) θ = 31.42o,
t = 0.637ns
(c) θ = 45o,
t = 0.637ns
(d) θ = 0o,
t = 1.176ns
(e) θ = 31.42o,
t = 1.176ns
(f) θ = 45o,
t = 1.176ns
Figure 5.7: Time evolution of a point source ﬁeld sin(2πf t) with f = 1GHz placed at the
center of a PEC square box of grid size 100×100 rotated by the angles 0o, 31.42o,
45o, time step size ∆t = 7ps.
numerical solution to the exact solution given by,
A(t, x, y) = cos
c
(
mπ
L
)2 + (
nπ
H
(cid:18)
(cid:114)
(cid:19)
)2t
sin
(cid:16)mπx
(cid:17)
L
(cid:16)nπy
(cid:17)
H
sin
, where the wave speed c is chosen to be 30Gcms−1. Our L2 norm error plots of the solution
at time T = 1.0ns with varying resolutions and a ﬁxed CFL( = 1) for the cases of mesh
aligned and nonaligned (rotated by 45o) square cavities show second order accuracy (Figure
5.8 - (a)). We also note that the tangential derivative along the boundary converges with
124
third order accuracy (Figure 5.8 - (b)).
(a) Space convergence of solution
(b) Space convergence of boundary derivative
Figure 5.8: Convergence plots for a) space using entire solution and b) space using boundary
derivative on a PEC square domain [0cm, 21cm]2. This study measures L2 norm
of the error at time T = 1.0ns compared with the analytical solution, for the
cases of mesh aligned and nonaligned (rotated by 45o).
For the second evaluation, we compute the error with our fundamental frequency com-
putation using the ping test explained above and summarize it in Table 5.5.1.1. The error
is the diﬀerence between the frequencies (1-1 mode) obtained analytically (= 1GHz) and
numerically using our scheme.
Rotated angle θ
Grid size, N
CF L = 0.5, T = 50ns
31.42o
0o
CF L = 2, T = 200ns
31.42o
0o
502
1002
1502
2002
2502
1.001980E-2
1.999960E-5
1.999960E-5
1.999960E-5
8.0000064E-6
4.501910E-2
2.001960E-2
2.001960E-2
1.500023E-5
8.000064E-6
1.501970E-2
5.019900E-3
1.999960E-3
5.015075E-4
5.0080040E-5
4.501910E-2
2.001960E-2
1.501970E-2
4.985075E-4
8.000064E-6
Table 5.3: Numerical error in the frequency computation using the ping test at the point
(3.36cm, 3.36cm).
Finally, we perform a self-reﬁnement study for time and space convergence using a point
source. Figure 5.9 shows the time and space convergence plots for the cases of mesh aligned
and nonaligned (rotated by 45o) square cavities with the same conﬁguration as used in
125
the above experiment done for the time evolution of a point source ﬁeld. For the time
convergence test, we maintain the resolution at 320 × 320 and reduce the time step size
∆t = 10.9ps
to
0.3418ps. For the space convergence test, we maintain the CFL value
as 1 and reduce the spatial step size ∆x = 1.313cm to 0.082cm. We obtained fourth-order
convergence in time and second-order in space.
(a) Time convergence
(b) Space convergence
Figure 5.9: Convergence plots for (a) time and (b) space on a PEC a square domain
[0cm, 21cm]2. This study measures L∞ norm of the error at time T = 1.0ns
compared with the analytical solution, for the cases of mesh aligned and non-
aligned (rotated by 45o).
5.5.2 Square Cavity with a leak (diﬀraction Q)
The Q factor (quality factor) of a resonator is a measure of the damping, with a lower
Q indicating higher damping [33]. It measures the harmonic losses of the oscillations. The
diﬀraction Q factor is the external Q factor which characterized the energy loss due to
radiation. The value of Q is inﬁnite for a closed cavity and ﬁnite for an open cavity and it
can be used to determine the amount of energy loss. The diﬀraction Q can be expressed as,
Q = −πf (t2 − t1)
ln( A1t1
A2t2
)
where A1 and A2 are the vector potentials at time t1 and t2.
126
In the second experiment, an open boundary is placed along the center of the right edge
of the square cavity, and the diﬀraction quality factor Q is obtained for an oscillating point
source (sin(2πf t), f = 1GHz) placed at the center.
Every conﬁguration was the same as above, except for imposing an outﬂow boundary
condition along the open boundary and keeping the Gaussian pulse running for the entire
time period [∆tns, T ns]. The vector potential A is measured at the point (3.36cm, 3.36cm)
and the computation of Q is done for aligned and nonaligned mesh cases. For non aligned
meshes, we rotate the square by an angle θ = 45o.
Analytic calculation of the quality factor treating an open cavity like a transmission line
with a load simulating free space (377Ω) gives a value of Q=24 [36]. The Q values obtained
from our scheme (Q = 26.5768 for θ = 0o and Q = 25.7875 for θ = 31.420) are very close to
the analytically obtained value. Figure 5.10 shows the time evaluation of vector potential A
at the point (3.36cm, 3.36cm) for both cases.
(a) θ = 0o, Q = 26.5768
(b) θ = 31.420, Q = 25.7875
Figure 5.10: Time evaluation of A in (a) mesh aligned and (b) θ = 31.420 rotated square
cavities with a leak imposed by outﬂow boundary condition and diﬀraction Q
which is obtained for an oscillating point source (sin(2πf t), f = 1GHz)
127
5.5.3 A6 Magnetron
We chose a 2D A6 magnetron to test the applicability of our PEC scheme to complicated
geometry. For this experiment, a 2D domain of Ω = [0cm, 5cm]2 is chosen with radius of
the vane resonators rv = 4.11cm, anode radius ra = 2.11cm, cathode radius rc = 1.58cm,
angular width of vane, 20o, and cavity angle, 40o. We set the grid size to be 128 × 128, time
step size ∆t = 39ps, averaging parameter β = 1.4 and dissipation coeﬃcient = 0.1. The
embedded PEC boundary condition is imposed over the boundary stencil during the x− and
y− sweeps.
This experiment was conducted for frequency mode analysis. A point source was placed
in the center of the A6M and the frequency distribution examined using a ping test at the
point (1.4063cm, 0.8203cm). We obtained six strong frequency modes as shown in ﬁgure
5.11, associated with the ﬁrst two passbands, clearly showing the eﬀects of all six symmetric
resonances.
Figure 5.11: Frequency spectrum of 2D A6M with vane resonators rv = 4.11cm, anode radius
ra = 2.11cm, cathode radius rc = 1.58cm, angular width of vane, 20o, and
cavity angle, 40o, the grid size 128 × 128, time step size ∆t = 39ps, averaging
parameter β = 1.4 and dissipation coeﬃcient = 0.1.
We further simulated the A6 magnetron with a transparent cathode (5.12) which is
128
mimicked by using six emitters/point sources and calculated the time evolution for the same
conﬁguration as above. The formulation maintains symmetry and high accuracy for relatively
coarse-grained solutions. For example, in the A6 magnetron results, there are only 12 points
across the neck of the magnetron vanes.
(a) t = 1.23ns
(b) t = 3.01ns
(c) t = 5.68ns
(d) t = 8.81ns
(e) t = 11.95ns
(f) t = 17.87ns
Figure 5.12: Evolution of the transparent cathode A6M with vane resonators rv = 4.11cm,
anode radius ra = 2.11cm, cathode radius rc = 1.58cm, angular width of vane,
20o, and cavity angle, 40o, the grid size 128 × 128, time step size ∆t = 39ps,
averaging parameter β = 1.4 and dissipation coeﬃcient = 0.1.
5.5.4 Rising Sun Magnetrons
We chose Rising Sun magnetrons with 12 and 18 cavities for further evaluation of our
scheme to complicated geometry. Figure shows the 2D view of the 18 cavity rising sun
129
magnetron (AX9) with the cathode of radius rc and the anode with inner radius ra, short
vane radius rvs and long vane radius rvl. For this experiment, a 2D domain of Ω = [0cm, 5cm]2
is chosen with radius of the wider and shorter vane resonators rvs = 4.11cm, and rvl = 4.91cm
respectively, anode radius ra = 2.11cm, cathode radius rc = 1.58cm. For the 12 cavity rising
sun magnetron , we chose the angular width of vane, 12o, and cavity angle, 18o. For the 18
cavity rising sun magnetron , we chose the angular width of vane, 12o, and cavity angle, 8o.
We set the grid size to be 128 × 128, time step size ∆t = 39ps, averaging parameter β = 1.4
and dissipation coeﬃcient = 0.1. The embedded PEC boundary condition is imposed over
the boundary stencil during the x− and y− sweeps.
Figure 5.13: 2D view of the 18 cavity rising sun magnetron (AX9) with a cathode of radius
rc and anode of inner radius ra, short and long cavity radii rvh and rvl.
We simulated the rising sun magnetrons with transparent cathodes ( Figures 5.14 and 5.15
) which are mimicked by using 12 and 18 emitters/point sources for 12 and 18 cavity rising
sun magnetrons respectively and calculated the time evolution for the same conﬁguration as
above. The formulation maintains symmetry and high accuracy for relatively coarse-grained
solutions.
130
(a) t = 0.72s
(b) t = 0.72s
(c) t = 5.68s
(d) t = 11.86s
(e) t = 5.68s
(f) t = 11.86s
Figure 5.14: Evolution of the transparent cathode 12 cavity rising sun magnetron with vane
resonators rvs = 4.11cm and rvl = 4.91cm, anode radius ra = 2.11cm, cathode
radius rc = 1.58cm, angular width of vane, 12o, and cavity angle, 18o, the grid
size 128 × 128, time step size ∆t = 39ps, averaging parameter β = 1.4 and
dissipation coeﬃcient = 0.1.
131
(a) t = 0.81s
(b) t = 0.81s
(c) t = 2.05s
(d) t = 5.72s
(e) t = 5.72s
(f) t = 5.72s
Figure 5.15: Evolution of the transparent cathode 18 cavity rising sun magnetron (AX9)
with vane resonators rvs = 4.11cm and rvl = 4.91cm, anode radius ra = 2.11cm,
cathode radius rc = 1.58cm, angular width of vane, 8o, and cavity angle, 12o,
the grid size 128 × 128, time step size ∆t = 39ps, averaging parameter β = 1.4
and dissipation coeﬃcient = 0.1.
5.6 Summary
We derived wave equations for vector and scalar potentials A and φ under the Lorenz
gauge condition and obtained the solution for the vector potentials based wave equation
subject to PEC boundary condition. We explained the embedded PEC boundary conditions
in 2D and the consistency and performance of the scheme are conﬁrmed using the ping
test and frequency mode analysis for rotated square cavities. We then demonstrate the
132
diﬀraction Q value test and the use of this method for simulating an A6 magnetron. We
obtained strong six resonance modes for the impulse response of A6 magnetron as expected.
In the next chapter, we describe our scalable software solution MOLTN in detail.
133
CHAPTER 6: Software Solution and Code Acceleration
6.1 Introduction
In this chapter, we describe our new open-source code development work and give scala-
bility studies for the OpenMP and GP-GPU CUDA versions.
The exciting EM software solutions in plasma science are developed based on the FDTD-
PIC method [51; 67; 32; 70] and they have limitations as described previously in section 1.1.
We have proved that our MOLT based scheme can overcome those limitations. Hence, we
are motivated to develop a software solution to bring our scheme to the scientiﬁc commu-
nity. We are developing an open source code MOLTN (Method Of Line Transpose based
Nth arbitrary order scheme) which is intended to be an architecture-independent, scalable
software tool, using MPI, OpenMP, and as well as GPU CUDA implementation. We are
working with the templated C++ library Kokkos [24] at the lowest levels of the code to
achieve this goal.
In this chapter, we give the design of MOLTN software and scalability studies for the
OpenMP version in section 6.2 and the parallel design and scalability studies for the GP-GPU
CUDA version in section 6.3.
6.2 Multi-core OpenMP, Multi-node MPI version
As an initial work, we developed the code in C++ using multi-core, shared memory
OpenMP and multi-node, distributed memory MPI. For this implementation, we ﬁrst de-
compose the domain as a set of subdomains and assign each sub-domain per node. We use
OpenMP pragmas for the data parallelization within the nodes and each node communicates
134
with each other using the MPI library. In our scheme, internal sweeps are carried out on
each node and boundary information is passed between the adjacent nodes during the left
and right oriented sweeps for the fast convolution integration (as described in the section
2.2.2). We give an overview of the structure of the MOLTN in the following section,
6.2.1 Data Structures and Data Flows
As we have seen, we know the kernel of this scheme is the O(n) one-dimensional recursive
fast convolution to obtain the particular solution and the homogeneous solution ( section
2.2). Thus, we deﬁned the MOLT kernel using a C++ object to implement this algorithm
with subroutines to build quadrature weights, perform left and right oriented sweeps, and
apply boundary conditions. We deﬁned another major object, MOLT mesh, to represent
the domain with every relevant parameter. The MOLT mesh acts as the main storage
place of our implementation, which consists of a set of one-dimensional C++ pointer arrays.
The controller object deﬁned in MOLTN is the MOLT time-step which receives the user-
deﬁned object and parameters, builds MOLT mesh and performs needed computation and
communication. We use node-core hybrid parallelism using MPI and OpenMP, thus initially,
the whole domain has to be partitioned and distributed over the HPC cluster nodes. Then
computing tasks have to be performed and the resultant relevant data must be transferred
between the nodes. The MOLT time-step performs this task by invoking appropriate
routines deﬁned in the MOLTN library API that includes MOLT kernel, MOLT mesh,
and MOLT time-step. So the application class can include these library ﬁles to carry out
its work. The application of these libraries can be extended beyond the wave solver such as
advection-diﬀusion, magnetohydrodynamic (MHD), and ﬂuid dynamics.
135
6.2.2 Domain Decomposition
Here we present our domain decomposition strategy ([64]). Consider the L−1(u) operator:
M (u) =
u(y)e−α|x−y|dy + A0e−α(x−a) + B0e−α(b−x).
a
We seek to partition this expression for subdomains, so we deﬁne ci such that c0 < ... < ci <
... < cN for some N , where c0 = a and cN = b, and thus for ci < x < ci+1,
Mi[u](x) =
u(y)e−α|x−y|dy + Aie−α(x−ci) + Bie−α(ci+1−x).
This expression can be rewritten in terms of the left and right sweep operators
ci+1(cid:90)
Mi[u](x) =e−α(x−ci)
u(y)eα(y−ci)dy + e−α(ci+1−x)
u(y)eα(ci+1−y)dy
ci
x
+ Aie−α(x−ci) + Bie−α(ci+1−x).
We evaluate this expression at x = ci:
Mi[u](ci) =e−α(ci+1−ci)
u(y)eα(ci+1−y)dy + Ai + Bie−α(ci+1−ci),
(6.1)
(6.2)
b(cid:90)
ci+1(cid:90)
ci
x(cid:90)
This is computed in the following way: we deﬁne
ci+1(cid:90)
ci
x(cid:90)
b(cid:90)
a
I L[u](x) =
I R[u](x) =
x(cid:90)
b(cid:90)
a
u(y)e−α|x−y|dy = e−αx
u(y)eαydy
u(y)e−α|x−y|dy = eαx
u(y)e−αydy
x
x
136
So then
M (u) = I L[u](x) + I R[u](x) + Ae−α(x−a) + Be−α(b−x)
Suppose we decompose the domain [a, b] to three sub domains 1,2, and 3 with pseudo
boundaries c0, c1, c2, and c3 as shown in Figure 6.1, we can now apply the above expression
for the each sub domain,
Figure 6.1: Domain Decomposition of the domain [a, b] to three 1D sub domains [c0, c1],
j ) sweeps over
[c1, c2],and [c2, c3], each evolves own internal left (I L
the domain of length ∆cj.
j ) and right (I R
A1e−α(x−a) + B1e−α(c1−x) − A0e−α(x−a) − B0e−α(b−x)
= e−α(c1−x)I R
2 [u](c1) + e−α(c2−x)I R
3 [u](c2)
A2e−α(x−c1) + B2e−α(c2−x) − A0e−α(x−a) − B0e−α(b−x)
= e−α(x−c1)I L
2 [u](c1) + e−α(c2−x)I R
3 [u](c2)
A3e−α(x−c2) + B3e−α(b−x) − A0e−α(x−a) − B0e−α(b−x)
= e−α(x−c1)I L
1 [u](c1) + e−α(x−c2)I L
2 [u](c2)
137
We deﬁne ∆c1 = a − c1, ∆c2 = c2 − c1, and ∆c3 = b − c2. We now write:
A1e−α(x−a) + B1e−α(c1−x) − A0e−α(x−a) − B0e−α(c1−x)e−α∆c2e−α∆c3
= e−α(c1−x)I R
2 [u](c1) + e−α(c1−x)e−α∆c2I R
3 [u](c2)
A2e−α(x−c1) + B2e−α(c2−x) − A0e−α∆c1e−α(x−c1) − B0e−α∆c3e−α(c2−x)
= e−α(x−c1)I L
2 [u](c1) + e−α(c2−x)I R
3 [u](c2)
A3e−α(x−c2) + B3e−α(b−x) − A0e−α∆c1e−α∆c2e−α(x−c2) − B0e−α(b−x)
= e−α(x−c2)e−α∆c2I L
1 [u](c1) + e−α(x−c2)I L
2 [u](c2)
From the equations 6.3, 6.4, and 6.5 we can discern
(6.3)
(6.4)
(6.5)
2 [u](c1) + e−α∆c2I R
3 [u](c2)
A1 = A0
B1 = B0e−α∆c2e−α∆c3 + I R
A2 = A0e−α∆c1 + I L
B2 = B0e−α∆c3 + I R
A3 = A0e−α∆c1e−α∆c2 + e−α∆c2I L
3 [u](c2)
2 [u](c1)
1 [u](c1) + I L
2 [u](c2)
B3 = B0
We choose ∆t such that e−α∆ci = 0 for i = 1...N , hence,
A1 = A0, B1 = I R
2 [u](c1),
A2 = I L
2 [u](c1), B2 = I R
3 [u](c2),
A3 = I L
2 [u](c2), B3 = B0.
138
6.2.3 Performance Analysis
As an initial evaluation of MOLTN, we tested it using a single node multi-core platform.
We chose a node with 28 cores which facilitated with Intel(R) Xeon(R) CPU E5-2680 v4 @
2.40GHz processor and 492 GB memory to examine our OpenMP parallel MOLTN code with-
out using any accelerators. This strong scaling study shows nearly linear speedup (in Figure
6.3) and approximately 85% eﬃciency ( in Figure 6.4). We observe less speedup/eﬃciency
for a small amount of work due to the leading eﬀect of communication cost including com-
munication overhead. Speedup and eﬃciency reduce with increasing core counts due to
memory bounds.Hence, it shows positive signs for the scalability of our scheme, but we need
to prove the scalability of the hybrid version and also have to work with the performance
enhancement using the accelerators such as NVIDIA GPU and Intel SIMD vectorization. We
believe that designing an algorithm based on the architecture is the best practice to achieve
high performance, and focus to work deeply with lower-level code optimization technologies,
NVIDIA warp primitive-level programming and SIMD vectorization intrinsics.
Figure 6.2: Log-log plot for wall time using 28 core Intel(R) Xeon(R) CPU E5-2680 v4 @
2.40GHz processor, for the grid size N 2, N = 64, 128, 256, 1024, 2048.
139
Figure 6.3: Log-log plot for speedup in using 28 core Intel(R) Xeon(R) CPU E5-2680 v4 @
2.40GHz processor for the grid size N 2, N = 64, 128, 256, 512, 1024, 2048.
Figure 6.4: Log-log plot for eﬃciency using 28 core Intel(R) Xeon(R) CPU E5-2680 v4 @
2.40GHz processor for the grid size N 2, N = 64, 128, 256, 512, 1024, 2048.
140
6.3 GPGPU CUDA version
In this section, we describe our initial work of GPGPU CUDA implementation of our
scheme and performance evaluation.
6.3.1 Task Organization
For this implementation, we chose the two-dimensional second-order scheme,
un+1 − 2un + un−1 = −β2C(1)
xy [un]
where,
Cxy = (L−1
y + L−1
x ) − (L−1
y L−1
x + L−1
x L−1
y )
(6.6)
Based on the structure of the scheme, the computation of the solution un+1 at the time step
tn+1 can be obtained through four parallel stages using task paralleling,
1. Perform the ﬁrst set of x- and y- sweeps:
Compute wx = L−1
x [u] and wy = L−1
y [u] concurrently,
2. Perform the second set of x- and y- sweeps:
Compute wyx = L−1
x [wx] and wxy = L−1
y [wy] concurrently,
3. Obtain Cxy:
Compute Cxy1 = wx − wyx and Cxy2 = wy − wxy concurrently,
4. Obtain un+1:
Compute un+1
1 = 2un − β2Cxy1 and un+1
2 = −un−1 − β2Cxy2 concurrently,
and ﬁnally obtain un+1 = un+1
1 + un+1
2
. We choose two CUDA streams to go through
these stages, Figure 6.5 shows the processing ﬂow of the two streams.
141
Figure 6.5: Task parallelism for the computation of un+1 based on Cxy and parallel stages
executing through CUDA streams.
6.3.2 Performance Analysis
We evaluated the scalability and performance of the scheme using NVIDIA Tesla K20 and
K80 GPU accelerators. Tesla K20 (2880 CUDA cores) and K80 (4992 CUDA cores) GPUs
were compared with a CPU, Intel(R) Xeon(R) CPU E5-2680 v4 @ 2.40GHz processor. This
study shows the nearly 45X and 35X speedup for Tesla K80 and K20 respectively with full
utilized CUDA cores(in Figure 6.7) and linear time complexity for both CPU and GPU
architectures (Figure 6.6). Even though the evaluation result shows reasonable improve-
ment in time and cost, we have to include the MPI library to communicate between nodes
for the distributed multi-node system and also consider the lower-level code optimization
technologies, NVIDIA warp primitive-level programming.
142
Figure 6.6: Log-log plot for wall time using Intel(R) Xeon(R) CPU, Tesla K20, and K80
GPUs for the grid sizeN 2, N = 32, 64, 128, 256, 512, 1024, 2048, 4096, 8192.
Figure 6.7: Intel(R) Xeon(R) CPU vs Tesla K20 and K80 GPU speedup (log-log plot) for
the grid size N 2, N = 32, 64, 128, 256, 512, 1024, 2048, 4096, 8192.
143
6.4 Summary
We have been developing an architecture-independent, scalable software tool MOLTN
using MPI, OpenMP, and as well as GPU CUDA implementation. In this chapter, we have
described the overview of the design of MOLTN, parallel implementation of GP-GPU CUDA
version, and scalability studies for both OpenMP and CUDA versions. In the next chapter,
we summarize the work carried out through these studies and give useful ideas for future
work.
144
CHAPTER 7: Summary and Potential Future Directions
7.1 Introduction
In this chapter, I conclude the work carried out through this study, emphasize my con-
tributions to this research work, and identify potential future directions.
7.2 Conclusion
Based on my literature review, I identiﬁed the MOLT based scheme developed by Christlieb
team [15] as an unconditionally stable, fast, and implicit scheme and arbitrary order accuracy
for Dirichlet and periodic boundaries on a Cartesian grid. Hence, I chose the scheme for my
simulation study and overcame the identiﬁed limitations on it. The major limitation is that
the scheme was second order for outﬂow and Neumann boundary conditions. So, I worked
to extend the order of accuracy for Neumann and outﬂow for arbitrary order and achieved a
scheme with fourth-order spatiotemporal accuracy for complex geometry problems including
curved boundaries. Further, I deﬁned a general framework for complex geometry problems
and simulated such problems using an embedded boundary method. Speciﬁcally, I developed
a simulation tool for HPM tubes, such as A6 magnetron, 12 and 18 cavity rising suns us-
ing PEC boundary condition. For this simulation, I derived the scheme for electromagnetic
vector potential using the Lorenz gauge and imposed PEC boundary condition in 2D. I eval-
uated the simulation of A6 magnetron using a ping test and obtained six strong resonance
modes. This is one of the challenging tasks that I accomplished.
I extended the implementation of the scheme to 3D and obtained multiple applications
such as 3D variable speed waves, spherical scattering, 3D point source, and A6 magnetron in
145
3D. The simulation of three-dimensional A6 magnetron is an additional important milestone.
I developed codes in C++ for these simulations and improved those performances by code
optimization and acceleration using parallel technologies in diﬀerent architectures including
NVIDIA GPU. Further, I have been working with the scalable open source MOLTN devel-
opment and contributed to the development of multi-core shared memory OpenMP version
and wave kernel development using kokkos.
7.3 Future Directions
7.3.1 Code Acceleration for MOLTN
As described in the Chapter 6, we are developing an open source scalable software to make
visible our MOLT based scheme to the scientiﬁc community. We are on the initial stage of
this development and will have to optimize the code in diﬀerent levels; i) low-level architec-
ture based ii) compiler related iii) high-level code based optimizations. To obtain accelerated
code in Intel architectures, we can use the SIMD vectorization using Intel intrinsic instruc-
tions, which are C style functions that provide access to many Intel instructions, including
Intel SSE (128 bits Streaming SIMD Extension), AVX (256 bits Advanced Vector Extension),
and AVX512 (512 bits Advanced Vector Extension) etc. without the need to write assembly
code. The SIMD vectorization techniques can increase processor throughput by perform-
ing multiple computations in a single instruction. We can use Arithmetic and Elementary
Math functions such as the function m256d mm256 add pd( m256d,
m256d) to add
packed double-precision (64-bit) ﬂoating-point elements using AVX. We can also use the
compiler ﬂags for the auto-vectorization (e.g. −O2 or higher in intel). Another well-known
accelerator is NVIDIA GPUs, which execute warps of 32 parallel threads using SIMT (Sin-
gle Instruction Multiple Threads - multiple threads issue common instructions to arbitrary
data) technologies. The explicit warp-level programming using warp-level primitives which
were introduced in CUDA 9.0, is safe, eﬀective and will give high-performance. We can use
146
NVIDIA’s intrinsic functions to make the code even faster with reduced accuracy, but they
can be used in device codes only.
7.3.2 Complex Geometry in MOLTN
We have already implemented simple rectangular-shaped boundary problems on the
Cartesian grids for MOLTN, so we have to deﬁne a general framework for complex ge-
ometry problems on MOLTN in the future. To obtain this, i) convert MOLT kernel to
able to operate on line-segments instead of lines because line-segments are unit elements for
complex geometry cases. ii) we have to deal with the work load-balancing and data locality
carefully in order to make sure better performance is obtained. I can suggest a max-ﬁt policy
which takes care of load balancing and locality at the same time.
7.3.3 Further HPM tube simulations
Even though the concept of the magnetrons is the same, every magnetron is not designed
by the same structure. They may diﬀer in the number of cavities (12 or 18 cavity rising
suns), the shape of the cavities (triangular or circular cavity magnetrons), and radius of
the cathode/anode, etc. In these studies, my simulations are developed based on triangular
cavities. In the future, we can consider the circular cavity-anode structure as shown in Figure
7.3.3 which is a typical magnetron for microwave ovens in cutaway view. The cylindrical
anode structure contains a number of equally spaced cavity resonators with the anode surface
adjacent to the cylindrical cathode. Permanent magnets are used to provide the necessary
magnetic ﬁeld, which is perpendicular to the electric ﬁeld between the cathode and the
anode. The power output is coupled through an antenna that runs from one of the cavities
to a waveguide that channels the microwave radiation to the cooking chamber.
Further, beyond the magnetrons, we can develop simulations for other kinds of HPM
tubes such as the accelerator klystron. Figure 7.3.3 describes the two-cavity klystron [30]
147
Figure 7.1: Magnetron with eight circular hole cavity anode (this ﬁgure is from Encyclopedia
Britannica, Inc.)
Figure 7.2: Two-cavity klystron (this ﬁgure is from Reference [30])
7.3.4
Incorporate with Particles
We obtained a cold test (in Chapter 5) using the electromagnetic waves without including
particles successfully. Now, we are planning to perform hot tests by including particles in
148
the simulation. For this purpose, we tested for a point source as a delta function in three
dimensions. Next we will see how to derive the scheme which includes the point source.
7.3.4.1 Derive the scheme for 3D point source
Let’s rewrite the Helmholtz operator in 1D,
(cid:16)
(cid:17)
(.)
Lx(.) =
1 − 1
α2 ∂xx
We use the free space Green’s function to solve the L−1 as previously described. The one
dimensional Green’s function can be given by,
Upon using the Green’s function,
G(x|x0) =
e−α|x−x0|2
|x − x0|2
1
4π
(cid:90)
G(x|x0)(.)dx0
G(x|x0)(.)dx0 +
(7.1)
(7.2)
(cid:90)
L−1
x (.) =
Ωx0
Ωx0
Now, we break the solution into two pieces, the particular solution and the homogeneous
solutions:
where,
b(cid:90)
a
Ix(.) =
α
2
L−1
x (.) := Ix(.) + Hx
(7.3)
e−α|x−y|(.)dy, Hx = axe−α(x−xa) + bxe−α(xb−x).
For three dimensional inverse operator L−1(.) for the ADI splitting is performed one line
149
at a time. Recall,
where,
b(cid:90)
a
Ik(.) =
α
2
L−1
k (.) := Ik(.) + Hk
(7.4)
e−α|k−k0|(.)dk0, Hk = ake−α(k−ka) + bke−α(kb−k).
So the three dimensional operator L−1(.) can be expressed as,
L−1(.) = L−1
x (L−1
y (L−1
z (.))) + O(∆t2)
or
or
L−1(.) = Ix,y,z(.) + Hx,y,z(.) + O(∆t2)
L−1(.) = Ix(Iy(Iz(.) + Hz) + Hy) + Hx + O(∆t2)
By plugging into the second order scheme,
un+1 = (2 − β2)un − un−1 + β2 ¯L−1
un+1 = (2 − β2)un − un−1 + β2L−1
x
3D [un]
(cid:2)L−1
y
(cid:2)L−1
z [un](cid:3)(cid:3) .
(7.5)
(7.6)
and including a point source S,
un+1 = (2 − β2)un − un−1 + β2L−1
x
(cid:2)L−1
y
(cid:2)L−1
z [un](cid:3)(cid:3) +
β2
α2 Sn.Gδ
3D.
(7.7)
where,
Gδ
3D =
1
4π
√
(cid:112)(x − x0)2 + (y − y0)2 + (z − z0)2 + δ2
(x−x0)2+(y−y0)2+(z−z0)2
e−α
150
Gδ
3D(x/x0) =
(cid:112)(x − x0)2 + (y − y0)2 + (z − z0)2 + δ2
e−α||x−x0||2
1
4π
where δ ∈ R+, 0 < δ2 < ∆x
7.3.4.2 Numerical Example
For the next two experiments we placed two point sources sin(2πf t) with equal fre-
quencies (f = 1) for the ﬁrst case and two point sources sin(2πf t) with diﬀerent fre-
quencies f = 1 and f = 10 for the second case at the corners (−0.613,−0.613,−0.613)
and (1.613, 1.613, 1.613) of the cubical domain ([−1, 1]3). We simulate the wave propaga-
tion by applying outﬂow and Dirichlet boundary conditions along the right surface of the
cube and the other ﬁve surfaces of the cube respectively. We set the values of parameters
∆x = ∆y = ∆z = 0.013, ∆t = 0.0067, and β = 2. Figure 7.3 and 7.5 show snapshots of the
wave at diﬀerent time instants for the symmetric and asymmetric cases.
Finally, we perform a self-reﬁnement study for time and space convergence using a point
source sin(2πf t) with frequencies f = 1 at the center of the cubical domain ([−1, 1]3). Figure
7.4 shows the time and space convergence plots for the time evolution of a point source ﬁeld.
For the time convergence test, we maintain the resolution at 40 × 40 × 40 and reduce the
time step size ∆t = 0.1 to 0.0125. For the space convergence test, we maintain the CFL
value as 1.3 and reduce the spatial step size ∆x = 1.313 to 0.082. We obtained second-order
convergence in time and space
This shows a positive result to deal with particles properly, and the approach can be
extended to high-order by using the Taylor series expansion in terms of the source function.
151
(a) t = 1.23ns
(b) t = 3.01ns
(c) t = 5.68ns
(d) t = 8.81ns
Figure 7.3: Evolution of the two 3D point sources sin(2πf t) with the same frequency f = 1
at the corners of the cubical domain ([−1, 1]3), spatial step size ∆x = ∆y =
∆z = 0.013, and time step size ∆t = 0.0067.
(a) time convergent
(b) space convergent
Figure 7.4: Time (a) and space (b) convergence studies using 3D point source sin(2πf t)
with frequencies f = 1 at the center of the cubical domain ([−1, 1]3).
152
(a) t = 1.23ns
(b) t = 3.01ns
(c) t = 5.68ns
(d) t = 8.81ns
Figure 7.5: Evolution of the two 3D point sources sin(2πf t) with frequencies 1 and 10 at the
corners of the cubical domain ([−1, 1]3), spatial step size ∆x = ∆y = ∆z = 0.013,
and time step size ∆t = 0.0067.
7.4 Summary
I gave the summary and contributions of this research work and gave potential future
directions in this chapter. I strengthen the MOLT based scheme by including high-order
Neumann, outﬂow and PEC boundary conditions, general geometry framework, 3D im-
plementations, and accelerated codes and wave kernel for MOLTN development. We are
proposing to work with particles, several HPM tube simulations, and optimization and de-
velopment of MOLTN in the future.
153
BIBLIOGRAPHY
154
BIBLIOGRAPHY
[1] Alpert, B., L. Greengard, and T. Hagstrom (2000), Rapid evaluation of nonreﬂect-
ing boundary kernels for time-domain wave propagation, SIAM Journal on Numerical
Analysis, 37 (4), 1138–1164, doi:10.1137/S0036142998336916.
[2] AlSalem, H., P. Petrov, G. Newman, and J. Rector (2018), Embedded boundary meth-
ods for modeling 3d ﬁnite-diﬀerence laplace-fourier domain acoustic-wave equation
with free-surface topography, GEOPHYSICS, 83 (5), T291–T300, doi:10.1190/geo2017-
0828.1.
[3] Appelo, D., and N. A. Petersson (2012), A Fourth-Order Accurate Embedded Boundary
Method for the Wave Equation, SIAM J. Sci. Comput., 34 (6), A2982–A3008.
[4] Ascher, U., R. Mattheij, and R. Russell (1995), Numerical Solution of Boundary Value
Problems for Ordinary Diﬀerential Equations, Society for Industrial and Applied Math-
ematics, doi:10.1137/1.9781611971231.
[5] Benford, J. (2010), History and future of the relativistic magnetron, in 2010 Interna-
tional Conference on the Origins and Evolution of the Cavity Magnetron, pp. 40–45,
doi:10.1109/CAVMAG.2010.5565566.
[6] Berenger, J.-P. (1994), A perfectly matched layer for the absorption of elec-
tromagnetic waves, Journal of Computational Physics, 114 (2), 185 – 200, doi:
https://doi.org/10.1006/jcph.1994.1159.
[7] Bernacki, M., S. Lanteri, and S. Piperno (2006), Time-Domain Parallel Simulation of
Heterogeneous Wave Propagation on Unstructured Grids Using Explicit, Nondiﬀusive,
Discontinuous Galerkin Methods, Journal of Computational Acoustics, 14 (01), 57–81,
doi:10.1142/S0218396X06002937.
[8] Bruno, O. P., and M. Lyon (2010), High-order unconditionally stable FC-AD solvers for
general smooth domains I . Basic elements, Journal of Computational Physics, 229 (6),
2009–2033.
[9] Bruno, O. P., and M. Lyon (2010), High-order unconditionally stable FC-AD solvers for
general smooth domains I. Basic elements , Journal of Computational Physics, 229 (6),
2009 – 2033, doi:http://dx.doi.org/10.1016/j.jcp.2009.11.020.
[10] Cai, W. (2004), Numerical methods for maxwell’s equations in inhomogeneous media
with material interfaces, Journal of Computational Mathematics, 22 (2), 156–167.
[11] Cai, W., and S. Deng (2003), An upwinding embedded boundary method for Maxwells
equations in media with material interfaces: 2D case, Journal of Computational Physics,
190 (1), 159 – 183, doi:http://dx.doi.org/10.1016/S0021-9991(03)00269-9.
155
[12] Catella, A., V. Dolean, and S. Lanteri (2008), An Unconditionally Stable Discontin-
uous Galerkin Method for Solving the 2-D Time-Domain Maxwell Equations on Un-
structured Triangular Meshes, IEEE Transactions on Magnetics, 44 (6), 1250–1253,
doi:10.1109/TMAG.2007.916578.
[13] Causley, M., A. Christlieb, and E. Wolf (2017), Method of Lines Transpose: An Eﬃcient
Unconditionally Stable Solver for Wave Propagation, Journal of Scientiﬁc Computing,
70 (2), 896–921, doi:10.1007/s10915-016-0268-8.
[14] Causley, M. F., and A. J. Christlieb (2014), Higher Order A-Stable Schemes for the
Wave Equation Using a Successive Convolution Approach, SIAM Journal on Numerical
Analysis, 52 (1), 220–235, doi:10.1137/130932685.
[15] Causley, M. F., A. Christlieb, Y. G¨u¸cl¨u, and E. Wolf (2013), Method of Lines Transpose:
A Fast Implicit Wave Propagator, Mathematics of Computation, submitted.
[16] Chini, G. P., and S. Leibovich (2003), An analysis of the klemp and dur-
internal waves,
doi:10.1175/1520-
ran radiation boundary condition as applied to dissipative
of Physical Oceanography,
Journal
0485(2003)033¡2394:AAOTKA¿2.0.CO;2.
33 (11),
2394–2407,
[17] Coifman, R., V. Rokhlin, and S. Wandzura (1993), The fast multipole method for the
wave equation: a pedestrian prescription, IEEE Antennas and Propagation Magazine,
35 (3), 7–12, doi:10.1109/74.250128.
[18] Deng, S., and W. Cai (2006), A Fourth-Order Upwinding Embedded Boundary Method
(UEBM) for Maxwells Equations in Media with Material Interfaces: Part I.
[19] Ditkowski, A., K. Dridi, and J. Hesthaven (2001), Convergent Cartesian Grid Methods
for Maxwell’s Equations in Complex Geometries, Journal of Computational Physics,
170 (1), 39 – 80, doi:http://dx.doi.org/10.1006/jcph.2001.6719.
[20] Dolean, V., H. Fahs, L. Fezoui, and S. Lanteri (2010), Locally implicit discontinuous
Galerkin method for time domain electromagnetics, Journal of Computational Physics,
229 (2), 512 – 526, doi:http://dx.doi.org/10.1016/j.jcp.2009.09.038.
[21] Dridi, K. H., J. S. Hesthaven, and A. Ditkowski (2001), Staircase-free ﬁnite-diﬀerence
time-domain formulation for general materials in complex geometries, IEEE Transac-
tions on Antennas and Propagation, 49 (5), 749–756, doi:10.1109/8.929629.
[22] Duru, K., and G. Kreiss (2012), A Well-Posed and Discretely Stable Perfectly Matched
Layer for Elastic Wave Equations in Second Order Formulation, Communications in
Computational Physics, pp. 1–35.
[23] Dwivedi, S., and P. K. Jain (2013), Magnetically Insulated Line Oscillator (MILO)
Performance Study and its Parameter Optimization, IEEE Transactions on Plasma
Science, 41 (9), 2532–2538, doi:10.1109/TPS.2013.2277859.
156
[24] Edwards, H. C., C. R. Trott,
En-
abling manycore performance portability through polymorphic memory access pat-
terns, Journal of Parallel and Distributed Computing, 74 (12), 3202 – 3216, doi:
https://doi.org/10.1016/j.jpdc.2014.07.003, domain-Speciﬁc Languages and High-Level
Frameworks for High-Performance Computing.
and D. Sunderland (2014), Kokkos:
[25] Engquist, B., and A. Majda (1977), Absorbing boundary conditions for the nu-
merical simulation of waves, Mathematics of Computation, 31 (139), 629–651, doi:
10.1090/S0025-5718-1977-0436612-4.
[26] Fornberg, B. (2001), A short proof of the unconditional stability of the ADI-FDTD
scheme, Tech. rep., University of Colorado.
[27] Fornberg, B., J. Zuev, and J. Lee (2007), Stability and accuracy of time-extrapolated
ADI-FDTD methods for solving wave equations, Journal of Computational and Applied
Mathematics, 200 (1), 178 – 192.
[28] Fuks, M., and E. Schamiloglu (2005), Rapid start of oscillations in a mag-
205,101, doi:
netron with a ”transparent” cathode, Phys. Rev. Lett.,
10.1103/PhysRevLett.95.205101.
95,
[29] Gedney, S. D., and U. Navsariwala (1995), An unconditionally stable ﬁnite element
time-domain solution of the vector wave equation, IEEE Microwave and Guided Wave
Letters, 5 (10), 332–334, doi:10.1109/75.465046.
[30] Gewartowski, H. A. W., J. W. (1965), Principles of electron tubes, D. Van Nostrand
Company, Inc.
[31] Gimbutas, Z., and V. Rokhlin (2003), A Generalized Fast Multipole Method for
Nonoscillatory Kernels, SIAM Journal on Scientiﬁc Computing, 24 (3), 796–817, doi:
10.1137/S1064827500381148.
[32] Goplen, B., L. Ludeking, D. Smith, and G. Warren (1995), User-conﬁgurable MAGIC
for electromagnetic PIC calculations, Computer Physics Communications, 87 (1), 54 –
86, doi:https://doi.org/10.1016/0010-4655(95)00010-D, particle Simulation Methods.
[33] Harlow, J. (2003), Electric Power Transformer Engineering, The Electric Power Engi-
neering Hbk, Second Edition, CRC Press.
[34] Henshaw, W. D. (2006), A High-Order Accurate Parallel Solver for Maxwell’s Equations
on Overlapping Grids, SIAM J. Scientiﬁc Computing, 28, 1730–1765.
[35] Higdon, R. L. (1987), Numerical absorbing boundary conditions for the wave equation,
Mathematics of Computation, 49 (179), 65–90.
[36] Huang, Y. J., L. H. Yeh, and K. R. Chu (2014), An analytical study on the
diﬀraction quality factor of open cavities, Physics of Plasmas, 21 (10), 103,112, doi:
10.1063/1.4900415.
157
[37] Jackson, J. D. (1999), Classical Electrodynamics, 3rd ed. ed., Wiley, New York, NY.
[38] Jackson, S. D., and P. H. Muir (2001), Theory and numerical simulation of nth-
order cascaded Raman ﬁber lasers, J. Opt. Soc. Am. B, 18 (9), 1297–1306, doi:
10.1364/JOSAB.18.001297.
[39] Jacobs, G., and J. Hesthaven (2009), Implicit explicit time integration of a high-order
particle-in-cell method with hyperbolic divergence cleaning, Computer Physics Commu-
nications, 180 (10), 1760 – 1767, doi:https://doi.org/10.1016/j.cpc.2009.05.020.
[40] Jacobs, G. B., J. S. Hesthaven, and G. Lapenta (2006), Simulations of the Weibel
instability with a high-order discontinuous galerkin particle-in-cell solver, Collection of
Technical Papers - 44th AIAA Aerospace Sciences Meeting, 19, 14,189–14,199.
[41] Jia, J., and J. Huang (2008), Krylov deferred correction accelerated method of lines
transpose for parabolic problems, Journal of Computational Physics, 227 (3), 1739–
1753.
[42] Jin, J. (2010), Theory and Computation of Electromagnetic Fields, John Wiley & Sons,
Inc, doi:10.1002/9780470874257.
[43] Joannopoulos, J., S. Johnson, J. Winn, and R. Meade (2011), Photonic Crystals: Mold-
ing the Flow of Light (Second Edition), Princeton University Press.
[44] Johansen, H., and P. Colella (1998), A Cartesian Grid Embedded Boundary Method
for Poisson’s Equation on Irregular Domains, J. Computational. Phys., 147 (1), 60–85,
doi:10.1006/jcph.1998.5965.
[45] Kreiss, H., and N. Petersson (2006), An embedded boundary method for the wave
equation with discontinuous coeﬃcients, SIAM Journal on Scientiﬁc Computing, 28 (6),
2054–2074, doi:10.1137/050641399.
[46] Kreiss, H., and N. A. Petersson (2006), An Embedded Boundary Method for the
Wave Equation with Discontinuous Coeﬃcients, SIAM Journal on Scientiﬁc Comput-
ing, 28 (6), 2054–2074, doi:10.1137/050641399.
[47] Lambers, J. V. (2016), Solution of time-dependent PDE through component-wise ap-
proximation of matrix functions, IAENG Journal of Applied Mathematics, 41 (1), 1–26.
[48] Lapidus, L., and G. F. Pinder (1999), Numerical Solution of Partial Diﬀerential Equa-
tions in Science and Engineering, 677 pp., John Wiley and Sons Inc.
[49] LeVeque, R. (2007), Finite Diﬀerence Methods for Ordinary and Partial Diﬀerential
Equations: Steady-State and Time-Dependent Problems, Society for Industrial and Ap-
plied Mathematics.
[50] Li, P., H. Johnston, and R. Krasny (2009), A Cartesian treecode for screened
Coulomb interactions , Journal of Computational Physics, 228 (10), 3858 – 3868, doi:
http://dx.doi.org/10.1016/j.jcp.2009.02.022.
158
[51] M. C. Lin, C. Nieter, P. H. Stoltz, D. N. Smithe (2010), Accurately and Ef-
ﬁciently Studying the RF Structures Using a Conformal Finite-Diﬀerence Time-
Domain Particle-in-Cell Method, The Open Plasma Physics Journal, 3, 48–52, doi:
10.2174/1876534301003010048.
[52] Marfurt, K. J. (1984), Accuracy of ﬁnitediﬀerence and ﬁniteelement modeling
of the scalar and elastic wave equations, GEOPHYSICS, 49 (5), 533–549, doi:
10.1190/1.1441689.
[53] Matthew Causley, B. O., Andrew Christlieb, and L. V. Groningen (2014), Method of
lines transpose: An implicit solution to the wave equation, Mathematics of Computation,
83, 2763–2786, doi:https://doi.org/10.1090/S0025-5718-2014-02834-2.
[54] Mazzia, A., and F. Mazzia (1997), High-order transverse schemes for the numerical
solution of PDEs, Journal of Computational and Applied Mathematics, 82 (1), 299 –
311, doi:http://dx.doi.org/10.1016/S0377-0427(97)00090-3.
[55] McCorquodale, P., P. Colella, and H. Johansen (2001), A Cartesian grid embedded
boundary method for the heat equation on irregular domains.
[56] Mendonca, C., S. Prasad, E. Schamiloglu, and T. Fleming (2011), 3d icepic simulations
of a pulsed relativistic magnetron with transparent cathode: A comparative study, in
2011 IEEE Pulsed Power Conference, pp. 823–828, doi:10.1109/PPC.2011.6191521.
[57] Monk, P., and E. Suli (1994), Error estimates for Yee’s method on non-uniform grids,
IEEE Transactions on Magnetics, 30 (5), 3200–3203, doi:10.1109/20.312618.
[58] Namiki, T. (1999), A new FDTD algorithm based on alternating-direction implicit
method, IEEE Transactions on Microwave Theory and Techniques, 47 (10), 2003–2007,
doi:10.1109/22.795075.
[59] Newmark, N. (1959), A Method of Computation for Structural Dynamics, no. nos. 179-
181 in A Method of Computation for Structural Dynamics, American Society of Civil
Engineers.
[60] Orlanski, I. (1976), A simple boundary condition for unbounded hyperbolic ﬂows,
Journal of Computational Physics, 21 (3), 251 – 269, doi:https://doi.org/10.1016/0021-
9991(76)90023-1.
[61] Palevsky, A. (1980), Generation of Intense Microwave Radiation by the Relativistic E-
beam Magnetron (experiment and Numerical Simulation), Massachusetts Institute of
Technology, Department of Physics.
[62] Palevsky, A., and G. Bekeﬁ (1979), Microwave emission from pulsed, relativistic ebeam
diodes. II. The multiresonator magnetron, The Physics of Fluids, 22 (5), 986–996, doi:
10.1063/1.862663.
159
[63] Pearson, R. A. (1974), Consistent boundary conditions for numerical models of systems
that admit dispersive waves, Journal of the Atmospheric Sciences, 31 (6), 1481–1489,
doi:10.1175/1520-0469(1974)031¡1481:CBCFNM¿2.0.CO;2.
[64] Pierson Guthrey, W. S. A. C., Thavappiragasam Mathialakan (2019), Domain Decom-
position Strategy for Method of Line Transpose Based Scheme.
[65] Ricci, P., G. Lapenta,
(2002), A Simpliﬁed Implicit
Maxwell Solver, Journal of Computational Physics, 183 (1), 117 – 141, doi:
http://dx.doi.org/10.1006/jcph.2002.7170.
and J. Brackbill
[66] Sommerfeld, A. (1949), p. iv, doi:https://doi.org/10.1016/B978-0-12-654658-3.50002-1.
[67] T. Austin, D. N. S., J. R. Cary, and C. Nieter (2010), Alternating Direction Implicit
Methods for FDTD Using the Dey-Mittra Embedded Boundary Method, The Open
Plasma Physics Journal, 3, 29–35, doi:10.2174/1876534301003010029.
[68] Thavappiragasam, M., A. Viswanathan, and A. Christlieb (2017), MOLT based fast
high-order three dimensional A-stable scheme for wave propagation, Journal of Coupled
Systems and Multiscale Dynamics, 5 (2), 151–163, doi:doi:10.1166/jcsmd.2017.1137.
[69] Ting, L., and M. J. Miksis (1986), Exact boundary conditions for scattering prob-
lems, The Journal of the Acoustical Society of America, 80 (6), 1825–1827, doi:
10.1121/1.394297.
[70] Wang, Y., and S. Langdon (2010), Design of wave ports in fdtd and its application
to microwave circuits and antennas, in 2010 IEEE Antennas and Propagation Society
International Symposium, pp. 1–4, doi:10.1109/APS.2010.5562023.
[71] Yee, K. (1966), Numerical solution of
initial boundary value problems involving
Maxwell’s equations in isotropic media, IEEE Transactions on Antennas and Propa-
gation, 14 (3), 302–307, doi:10.1109/TAP.1966.1138693.
[72] Zafarullah, A. (1970), Application of the Method of Lines to Parabolic Par-
tial Diﬀerential Equations With Error Estimates, J. ACM, 17 (2), 294–302, doi:
10.1145/321574.321583.
[73] Zheng, F., Z. Chen, and J. Zhang (1999), A ﬁnite-diﬀerence time-domain method with-
out the Courant stability conditions, IEEE Microwave and Guided Wave Letters, 9 (11),
441–443, doi:10.1109/75.808026.
[74] Zienkiewicz, O., and R. Newton (1969), Coupled Vibrations of a Structure Submerged
in a Compressible Fluid.
160