A-STABLE IMPLICIT RAPID SCHEME AND SOFTWARE SOLUTION FOR

ELECTROMAGNETIC WAVE PROPAGATION

By

Mathialakan Thavappiragasam

A DISSERTATION

Submitted to

Michigan State University

in partial fulﬁllment of the requirements

for the degree of

Electrical Engineering - Doctor of Philosophy

Computational Mathematics, Science and Engineering - Dual Major

2019

ABSTRACT

A-STABLE IMPLICIT RAPID SCHEME AND SOFTWARE SOLUTION FOR

ELECTROMAGNETIC WAVE PROPAGATION

By

Mathialakan Thavappiragasam

A robust and rapid scheme to solve electromagnetics (EM) is an important requirement in

the scientiﬁc computing environment in which there are several useful methods used to solve

tasks in EM. Our research study is motivated by this need and is targeted to develop a fast

A-stable implicit numerical scheme and scalable software solution for EM wave propagation.

Our scheme is based on the Method Of Lines Transpose (MOLT) approach which discretizes

time ﬁrst and then solves boundary value problems. By applying the free-space Green’s

function, the solution is derived by decomposing particular and homogeneous solutions. The

compact Simpson’s quadrature based, O(N) fast convolution, a recursive algorithm, is used

to solve the particular solution for N number of grid points. The homogeneous solution

is obtained using a particular solution at the boundary points and the applied boundary

conditions. The multi-dimensional scheme is developed using the ADI splitting approach

and an arbitrary order accuracy in time is achieved by switching the time derivation to a

spatial derivation using the Lax-Wendroﬀ approach.

The focus of the work in this thesis has been to overcome the limitations in Neumann

and outﬂow boundary conditions to get high-order accuracy by using special treatments that

deal with a choice of the interpolation, ﬁnite diﬀerence stencil, and the initial conditions.

In addition, we have extended these ideas to construct perfectly electrically conducting

boundary conditions in 2D for the MOLT.

In addition to introducing higher-order boundary conditions, an embedded boundary

method is employed to deal with complex geometries. As the method is A-stable, it does not

suﬀer from small-time step limitations that are found in explicit ﬁnite diﬀerence time domain

methods when using either embedded boundary or cut cell methods to capture geometry.

Further, we are developing an open source code MOLTN (Method Of Lines Transpose, Nth

order) which is intended to be a hardware-independent, scalable software tool, using multi-

node MPI, multi-core OpenMP, and GPU CUDA implementation. As a test case of the

method, we implement and study the A6 magnetron with our embedded boundary method

using point sources inside of the domain.

The eventual goal is to combine this method with a novel particle method for the simu-

lations of plasma. The particle method would treat particles as point particles that generate

ﬁelds that are tracked on the mesh. No density or current will be mapped to the mesh.

The consistency and performance of the scheme are evaluated for EM wave propagation and

scattering using diﬀerent shaped objects including curved boundaries and the introduction

of true point sources that demonstrate how we handle particles. Stable solutions result for

a wide range of mesh sizes and potential to leverage novel computing architectures, such as

GPU, have been demonstrated.

Copyright by
MATHIALAKAN THAVAPPIRAGASAM
2019

For my beloved late parents (Thavappiragasam and Pushpavathy) who motivate me

beyond, to move toward my dream

v

ACKNOWLEDGEMENTS

My ﬁrst and deepest gratitude goes to my advisor Professor Andrew Christlieb for giving

me point to point guidance and support to achieve the target successfully; thank god for

giving me such a mentor for my Ph.D. studies. Dr. John Luginsland is the next person to

bring me through these studies. Thank you so much for your valuable guidance sir.

I will never forget your help, Professor Shanker Balasubramaniam; you established a

successful path to move forward, thank you so much! I would also like to thank Professor

John Verboncoeur, Professor Edward J. Rothwell, Professor Brian O’Shea for their guidance

and thoughts over my Ph.D. studies. Without your support, I couldn’t imagine success.

I would like to thank Dr. Matthew F. Causley, Dr. Aditya Viswanathan, and Dr. Pierson

Guthrey for their support and collaborative works to accomplish every milestone, and also

I am happy to thank Dr. Eric Wolf for his valuable and timely supports. The Christlieb-

group, the bleeding edge research team with active researchers, is the team I have been

working throughout my Ph.D. studies. Thank you colleagues, William Sands, Dr. Hana

Cho, Dr. Michel Crockatt, Dr. Xiao Feng, Dr. Bosu Choi, Dr. Yan Jiang, Dr. Wei Guo,

Dr. Bankim Mandal, Dr. Rouchun Zhang for your tips and tricks. I would also like to thank

Dr. Bernard G. Pope, a professor emeritus of physics at the Michigan State University, for

thesis proofreading and editing.

My thanks also go to professors who taught me courses at Michigan State University,

and the faculty and staﬀ that are working in the Department of Electrical and Computer

Engineering and Computational Mathematics, Science, and Engineering.

Finally, I am proud to thank my family for their support and patience, especially my

loving wife Kukapalini for her supports and inspirational words of encouragement to carry

my studies and our family concurrently. I also want to thank friends and relatives for their

motivation.

vi

TABLE OF CONTENTS

LIST OF TABLES . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

x

LIST OF FIGURES . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

xi

LIST OF ALGORITHMS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xxi

CHAPTER 1: Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
1.1 Context and Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
1.2 High Power Magnetrons
. . . . . . . . . . . . . . . . . . . . . . . . . . . . .
1.3 Research Goal and Objectives . . . . . . . . . . . . . . . . . . . . . . . . . .
1.4 Electromagnetic Wave Propagation . . . . . . . . . . . . . . . . . . . . . . .
1.4.1 Electromagnetic Fields . . . . . . . . . . . . . . . . . . . . . . . . . .
1.4.2 Boundary Conditions for Electromagnetic Fields . . . . . . . . . . . .
1.5 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

CHAPTER 2: Mathematical Model for Wave Equation Solver

. . . . . . .
2.1
Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
2.2 One Dimensional Implicit Wave Equation Solver . . . . . . . . . . . . . . . .
2.2.1
Semi-discretization . . . . . . . . . . . . . . . . . . . . . . . . . . . .
2.2.2 Fast Convolution Algorithm . . . . . . . . . . . . . . . . . . . . . . .
2.2.3 Boundary Condition . . . . . . . . . . . . . . . . . . . . . . . . . . .
2.2.3.1 Dirichlet Boundary Condition . . . . . . . . . . . . . . . . .
2.2.3.2 Neumann Boundary Condition . . . . . . . . . . . . . . . .
2.2.3.3 Periodic Boundary Condition . . . . . . . . . . . . . . . . .
2.2.3.4 Outﬂow Boundary Condition . . . . . . . . . . . . . . . . .
2.3 Multi-Dimensional Implicit Wave Equation Solver using ADI Scheme . . . .
2.4 Arbitrary Temporal Order Accuracy . . . . . . . . . . . . . . . . . . . . . .
2.4.1 Arbitrary Order One Dimensional Scheme . . . . . . . . . . . . . . .
2.4.2 Extension to Higher Dimensions for Arbitrary Order Accuracy . . . .
2.5 Treatment for Variable Speed Wave Propagation . . . . . . . . . . . . . . . .
2.6 Numerical Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . . . . .
2.6.1.1 Higher Order Wave Solvers in Two Dimension . . . . . . . .
2.6.1.2 Higher Order Wave Solvers in Three Dimensions
. . . . . .
2.6.2 Three Dimensional Waves
. . . . . . . . . . . . . . . . . . . . . . . .
2.6.3 Waves with Variable Speeds . . . . . . . . . . . . . . . . . . . . . . .
2.6.3.1 One Dimensional Case . . . . . . . . . . . . . . . . . . . . .
2.6.3.2 Two Dimensional Case . . . . . . . . . . . . . . . . . . . . .
2.6.3.3 Three Dimensional Cases
. . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . .
2.6.4.1 Problem Deﬁnition . . . . . . . . . . . . . . . . . . . . . . .

2.6.4 Waveguide in a Photonic Crystal

2.6.1 Convergence Studies

1
1
3
6
7
7
10
11

13
13
16
16
23
24
25
26
27
27
31
35
37
41
44
46
46
46
48
48
50
50
53
55
56
56

vii

2.6.4.2 Geometry of the Problem . . . . . . . . . . . . . . . . . . .
2.6.4.3 Result and Discussion . . . . . . . . . . . . . . . . . . . . .
2.7 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

CHAPTER 3: Higher Order Non-reﬂecting Boundary Condition . . . . . .
3.1
Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
3.2 Extension to Higher Order in Accuracy . . . . . . . . . . . . . . . . . . . . .
3.3 Appropriate Initial Condition . . . . . . . . . . . . . . . . . . . . . . . . . .
3.4 Numerical Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
3.4.1 Convergence Test for High-Order Outﬂow . . . . . . . . . . . . . . .
3.4.2 Rotating Gaussian Pulse . . . . . . . . . . . . . . . . . . . . . . . . .
3.4.2.1 Time Evaluation of 50o angled Gaussian pulse
. . . . . . .
3.4.3 Outﬂow Boundary Condition Along Curve Boundaries
. . . . . . . .
3.4.4 Convergence Studies Using Analytical Solution . . . . . . . . . . . . .
3.5 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

57
58
61

62
62
63
67
68
68
70
72
73
74
76

CHAPTER 4: Higher Order Embedded Neumann Boundary Method . . .
Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
4.1
4.2 Derivation of The Scheme . . . . . . . . . . . . . . . . . . . . . . . . . . . .
4.2.1 One Dimensional Scheme . . . . . . . . . . . . . . . . . . . . . . . . .
4.2.2 Two Dimensional Scheme
. . . . . . . . . . . . . . . . . . . . . . . .
4.2.3 Three Dimensional Scheme . . . . . . . . . . . . . . . . . . . . . . . .
4.3 Treatment for Complex Geometries . . . . . . . . . . . . . . . . . . . . . . .
4.3.1 Pre-computing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
4.3.2 Geometry for A6 MDO in 2D . . . . . . . . . . . . . . . . . . . . . .
4.4 Numerical Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . . . . .

77
77
79
79
81
85
88
90
92
93
95
4.4.1.1 Convergence studies
96
Spherical Scattering
98
4.4.2.1 Convergence Studies . . . . . . . . . . . . . . . . . . . . . . 100
4.4.3 Electron Beam in Magnetrons . . . . . . . . . . . . . . . . . . . . . . 101
4.4.3.1 A6 Magnetron with Diﬀraction Output (MDO) in 2D . . . . 101
4.4.3.2 A6 Magnetron in 3D . . . . . . . . . . . . . . . . . . . . . . 103
4.4.3.3 Convergence Studies . . . . . . . . . . . . . . . . . . . . . . 104
4.5 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 106

4.4.1 Cylindrical Scattering

4.4.2

CHAPTER 5: Second-Order PEC Boundary Condition . . . . . . . . . . . . 107
Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 107
5.1
2D Electric Scalar and Magnetic Vector Potential
. . . . . . . . . . . . . . . 108
5.2
5.3 Perfect Electric Conductor Boundary . . . . . . . . . . . . . . . . . . . . . . 109
5.4 Numerical Approximation for PEC Boundary Conditions . . . . . . . . . . . 111
5.4.1 Using Dirichlet Boundary Conditions for A to Capture PEC Boundary 112
5.4.2 Embedded Boundary Method, An eﬀective Neumann to Dirichlet

. . . . . . . . . . . . . . . . . . . . . 115
5.5 Experimental Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 120

Map for Creating Ghost Points

viii

5.5.1

Square Cavity Rotated Through Diﬀerent Angles
. . . . . . . . . . . 120
5.5.1.1 Convergence Studies and Error Analysis . . . . . . . . . . . 123
5.5.2
Square Cavity with a leak (diﬀraction Q)
. . . . . . . . . . . . . . . 126
5.5.3 A6 Magnetron . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 128
5.5.4 Rising Sun Magnetrons . . . . . . . . . . . . . . . . . . . . . . . . . . 129
5.6 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 132

CHAPTER 6: Software Solution and Code Acceleration . . . . . . . . . . . 134
6.1
Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 134
6.2 Multi-core OpenMP, Multi-node MPI version . . . . . . . . . . . . . . . . . 134
6.2.1 Data Structures and Data Flows . . . . . . . . . . . . . . . . . . . . . 135
6.2.2 Domain Decomposition . . . . . . . . . . . . . . . . . . . . . . . . . . 136
6.2.3 Performance Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . 139
6.3 GPGPU CUDA version . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 141
6.3.1 Task Organization . . . . . . . . . . . . . . . . . . . . . . . . . . . . 141
6.3.2 Performance Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . 142
6.4 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 144

CHAPTER 7: Summary and Potential Future Directions . . . . . . . . . . . 145
7.1
Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 145
7.2 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 145
7.3 Future Directions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 146
7.3.1 Code Acceleration for MOLTN . . . . . . . . . . . . . . . . . . . . . 146
7.3.2 Complex Geometry in MOLTN . . . . . . . . . . . . . . . . . . . . . 147
7.3.3 Further HPM tube simulations
. . . . . . . . . . . . . . . . . . . . . 147
Incorporate with Particles . . . . . . . . . . . . . . . . . . . . . . . . 148
7.3.4
7.3.4.1 Derive the scheme for 3D point source . . . . . . . . . . . . 149
7.3.4.2 Numerical Example
. . . . . . . . . . . . . . . . . . . . . . 151
7.4 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 153

BIBLIOGRAPHY . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 154

ix

LIST OF TABLES

Table 2.1: The maximum values βmax for which the P th order scheme remains A-stable
[14]. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

40

Table 5.1: Frequency (in GHz) obtained at the point (3.36cm, 3.36cm) using the ping

test.

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 123

Table 5.2: Frequency (in GHz) obtained at the point (3.36cm, 3.36cm) using the ping

test for CFL 1. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 123

Table 5.3: Numerical error in the frequency computation using the ping test at the

point (3.36cm, 3.36cm).

. . . . . . . . . . . . . . . . . . . . . . . . . . . . 125

x

LIST OF FIGURES

Figure 1.1:

3D Relativistic A6 magnetron with diﬀraction output (MDO) with a
. . . . . . . . . . . . . . . . . . .
cylindrical cathode in the center [61]

Figure 1.2:

2D view of A6 magnetron with a cathode of radius rc and the anode with
inner radius ra, vane radius rv, vane angle α1, and cavity angle α2 . . . .

Figure 1.3: Boundary surface between two regions with electric ﬁelds E1 and E2,
magnetic ﬁeld H1 and H2, electric ﬂux densities D1 and D2, and magnetic
ﬂux densities B1 and B2 respectively. Here, Js is the surface current, ρs
is the surface charge density, and n is the normal vector pointed out from
the region two.
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

Figure 2.1: Fast convolution line integration which decomposes left I L[u](x) and right
I R[u](x)) oriented integrals between the domain a and b with the spatial
step size, ∆xj(= xj − xj−1).
. . . . . . . . . . . . . . . . . . . . . . . .

Figure 2.2:

(a) x-, (b) y- and (c) xy lines with exact circular boundary points on the
mesh lines that are used for the ADI x (a) and y (b) sweeps.
. . . . . .

Figure 2.3: Fourth order time convergence of the 2D wave solver using Dirichlet (a)
and periodic (b) boundary conditions with ∆x = ∆y = 6.25 × 10−3.
This is a self-reﬁnement study which measures L∞ norm of the error
at time T = 2.0 on a square domain Ω = [−1, 1]2 with a point source
cos(ωt), ω = 1 at the center of the box, (0, 0). The CFL (= c∆t
∆x ) value
changes proportional to the time step size ∆t with ﬁxed spatial step size.

Figure 2.4: Fourth order convergence of 3D wave solver using Dirichlet boundary
conditions with ∆x = ∆y = ∆z = 1.25 × 10−2 ω = 1. This is a self-
reﬁnement study which measures L∞ norm of the error at time T = 2.0
on a cubic domain Ω = [−1, 1]3 with a point source cos(ωt) at the center
of the cube, (0, 0, 0) for (a) ω = 0.1 and (b) ω = 1. The CFL (= c∆t
∆x )
value changes proportional to the time step size ∆t with ﬁxed spatial step
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
size.

Figure 2.5: Time evolution of a point source ﬁeld cos(ωt), ω = 1 within a cubic do-
main (Ω = [−1, 1]3) by imposing Dirichlet boundary condition along the
surface with the spatial step size, ∆x = ∆y = ∆z = 0.0625 ω = 1 and
the time step size, ∆t = 0.0313.
. . . . . . . . . . . . . . . . . . . . . .

5

6

10

23

33

47

49

50

xi

Figure 2.6: Time evolution of a point source ﬁeld cos(ωt), ω = 1 within a cubic (Ω =
[−1, 1]3) by imposing periodic boundary condition along the surface with
the spatial step size, ∆x = ∆y = ∆z = 0.0625 ω = 1 and the time step
size, ∆t = 0.0313.
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

Figure 2.7: Time evolution of a point source ﬁeld cos(ωt), ω = 1 within a cubic (Ω =
[−1, 1]3) by imposing outﬂow boundary condition along the surface with
the spatial step size, ∆x = ∆y = ∆z = 0.0625 ω = 1 and the time step
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
size, ∆t = 0.0313.

Figure 2.8:

1D wave travelling over the two domains Ω1 and Ω2 with piecewise con-
stant speed c1 = 1.0 and c2 = 2.0. For this experiment, we choose spatial
step ∆x = 0.002, and time step ∆t = 0.005.
. . . . . . . . . . . . . . .

Figure 2.9:

1D wave travelling over the layered media with eight domains and the
wave travels with the same speed in every bi-domain ( c1 = 1.0 and
c2 = 2.0). For this experiment, we choose spatial step ∆x = 0.002, and
time step ∆t = 0.005 along layered media with piecewise constant wave
speed. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Figure 2.10: Geometrical view of 0.5 × 0.5 (a) one and (b) four square patches in the
square domain Ω = [−1, 1]2 with c2(= 0.1) and c1(= 1.0) are the wave
speeds in the patches and the remaining area. . . . . . . . . . . . . . . .

Figure 2.11: Time evolution of a Gaussian ﬁeld e−25(x2+y2) in a square domain Ω =
[−1, 1]2 with a 0.52 square patch as shown in Figure 2.10-(a). Here, the
spatial step size, ∆x = ∆y = 0.02 and the time step size, ∆t = 0.005. . .

Figure 2.12: Time evolution of a Gaussian ﬁeld e−25(x2+y2) in a square domain Ω =
[−1, 1]2 with four 0.52 square patches as shown in Figure 2.10-(b). Here,
the spatial step size, ∆x = ∆y = 0.02 and the time step size, ∆t = 0.005.
Figure 2.13: Geometrical view of 0.5 × 0.5 × 0.5 (a) one and (b) four cubic patches
in the cubical domain Ω = [−1, 1]3 with c2(= 0.1) and c1(= 1.0) are the
wave speeds in the patches and the remaining area. . . . . . . . . . . . .

Figure 2.14: Time evolution of a Gaussian ﬁeld e−25(x2+y2+z2) in a cubical domain Ω =
[−1, 1]3 with a 0.53 cubic patch as shown in Figure 2.13-(a). Here, the
spatial step size, ∆x = ∆y = ∆z = 0.0625 and the time step size,
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
∆t = 0.0313.

Figure 2.15: Time evolution of a Gaussian ﬁeld e−25(x2+y2+z2) in a cubical domain Ω =
[−1, 1]3 with four 0.53 cubic patches as shown in Figure 2.13-(b). Here,
the spatial step size, ∆x = ∆y = ∆z = 0.0625 and the time step size,
∆t = 0.0313.
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

50

50

52

53

54

54

54

55

56

56

xii

Figure 2.16: Schematic illustration of a line defect (in red) within a set of cylindrical
rods (in green) on a photonic crystal. . . . . . . . . . . . . . . . . . . . .

Figure 2.17: 2D Lattice of cylindrical rods which are periodic along x and y axes with
. . . . . . . . . . . . . . . . . .

lattice constant a on a photonic crystal.

Figure 2.18: Square lattice of square rods of side length 0.38a with lattice constant a
and dielectric constant  = 8.9 in 3D (a) and 2D (b), a linear defect is
. . . . . . . . . . . . . . . . . . .
formed by removing a column of rods.
Figure 2.19: The wave generated by the point source, e(−iωt), where ω = cky, ky =
,  = 8.9, and µ = 1 travelling through the photonic

0.88(2a), c = 1√
waveguide using the model shown in Figure 2.18 . . . . . . . . . . . . .

(µ)

Figure 2.20: Square lattice of cylindrical rods of side length 0.38a with lattice constant
a and dielectric constant  = 8.9 in 3D (a) and 2D (b), a linear defect is
. . . . . . . . . . . . . . . . . . .
formed by removing a column of rods.
Figure 2.21: The wave generated by the point source, e(−iωt), where ω = cky, ky =
,  = 8.9, and µ = 1 travelling through the photonic

0.88(2a), c = 1√
waveguide using the model shown in Figure 2.20 . . . . . . . . . . . . .

(µ)

Figure 2.22: The wave, e(−25(x2+y2)) traveling through the photonic waveguide using
. . . . . . . . . . . . . . . . . . . . . .

the model shown in Figure 2.20.

Figure 3.1: Fourth order convergence of (a) 2D and (b) 3D wave solver using out-
ﬂow boundary conditions with the spatial step size h = 1.25 × 10−2.
This is a self-reﬁnement study which measures L∞ norm of the error at
time T = 2.0 on a (a) square and (b) cubic domain with a point source
cos(ωt), ω = 1 at the center of the domain. The CFL (= c∆t
h ) value
changes proportional to the time step size ∆t with ﬁxed spatial step size,
h.
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

Figure 3.2: Fourth order convergence of the oval shape Gaussian pulse (however it
reduces to the above third order near to θ = Π
4 due to the ADI splitting
error) given by the Equation 3.8 rotated through Π
2 on square do-
main Ω = [−1, 1]2 by imposing Outﬂow boundary conditions along the
boundaries. This self-reﬁnement study measures L∞ norm of the error at
time T = 2.0 with h = 0.02.
. . . . . . . . . . . . . . . . . . . . . . . .

18 to Π

xiii

57

58

59

59

60

60

60

70

71

Figure 3.3: Time evolution of a Gaussian pulse given by the Equation 3.8 with angle
θ = 50o on square domain Ω = [−1, 1]2 by imposing outﬂow boundary
conditions along the boundaries. Here the time step size ∆t = 3.3× 10−3
. . .
and CFL value is 0.5. The wave is leaving completely as expected.

Figure 3.4: Evolution of a Gaussian ﬁeld given by the Equation 3.8 at time t =
0.6864, rotated through diﬀerent angles (a) θ = 30o, (b) θ = 50o, (c)
θ = 70o, and (d) θ = 90o on square domain Ω = [−1, 1]2 by imposing
outﬂow boundary conditions along the boundaries, Here the time step
size ∆t = 3.3 × 10−3 and CFL value is 0.5 at time t = 0.1287 using
outﬂow boundary conditions. We observe the same behaviour of the
wave for rotated Gaussian pulse through diﬀerent angles. The waves
leave completely through the curved boundary.
. . . . . . . . . . . . . .

Figure 3.5:

(a) Geometry and (b) graph representation of the 2D object constructed
by using an oval of width 2a and height 2b and a rectangle of 2rw × (rh1 +
.
rh2). Here rh1 = b and v1, v2, v3, and v4 are the vertices of the object.

Figure 3.6: Time evolution of the point source ﬁeld cos ωt with ω = 0.5 at (0, 100∆x)
in the object as shown in Figure 3.5 by imposing outﬂow boundary con-
ditions along the curved boundary and homogeneous Dirichlet along the
straight boundaries. Here the spatial step size, ∆x = ∆y = 0.039 and
the time step size, ∆t = 0.0195 . . . . . . . . . . . . . . . . . . . . . . .

Figure 3.7:

(a) Snapshot of the point source ﬁeld at time T = 5.0 and (b) Fourth
order convergence of 2D wave solver using Outﬂow boundary conditions
along the curve boundary with the spatial grid width ∆x = ∆y = 0.078.
This is a self-reﬁnement study which measures L∞ norm of the error at
time T = 5.0 on the object shown in Figure 3.5 with a point source
. . . . . . . . . . .
cos(ωt), ω = 0.5 at the center of the box, (0, 100∆x)

Figure 3.8: Fourth order convergence of 3D wave solver using outﬂow boundary con-
ditions with the spatial step size h = 2.5× 10−2. This measures L∞ norm
of the error which compares with analytical solution using a point source
sin(4πt) placed at the center of the domain. . . . . . . . . . . . . . . . .
Figure 4.1: Geometry of 1D embedded Neumann boundary domain Ω ≡ [xa, xb] with
left boundary xB, ghost point xG, and interpolation points xI and xII,
where the distance ξI = |xI − xa|, ξII = |xII − xa|, and ξG = |xG − xa|
.

Figure 4.2:

(a)Geometry of 2D embedded Neumann boundary method with a bound-
ary point (xB, yB), ghost point (xG, yG), and interpolation points (xI, yI),
(xII, yII), (xIII, yIII) and (xIV , yIV ) and (b) symmetric 8-points stencil to
interpolate the solution at the point (xI, yI).
. . . . . . . . . . . . . . .

72

72

73

74

75

76

80

82

xiv

Figure 4.3: Geometry of 3D embedded Neumann boundary method with a bound-
ary point (xB, yB, zB), ghost point (xG, yG, zG), and interpolation points
(xI, yI, zI), (xII, yII, zII), (xIII, yIII, zIII) and (xIV , yIV , zIV ) . . . . . . .

Figure 4.4:

2D graph with 3 vertices (v1(x1, y1), v2(x2, y2), and v3(x3, y3), 2 straight
edges es1(≡ (v1, v2)), es2(≡ (v3, v1)) and 1 circular arch (a) or elliptical
arch (b) edge ea1(≡ (v2, v3, Oa1)) centered at Oa1(xa1, ya1).
. . . . . . .

Figure 4.5: Flow diagram of pre-computation for 3D problems with complex geome-
tries; it cuts a 3D object into 2D slices Sxy, Sxz, makes 2D graphs Gxy,
Gxz, and identiﬁes the properties of the line segments Px, Py, and Pz.
.

Figure 4.6: A6 MDO (a) Axial view with a solid cathode of radius rc in the center,
the anode with inner radius ra, vane radius rv, vane angle α1, cavity angle
α2, and a coupling horn of radius rh (b) the angle of kth cavity which is
represented by the angles θk
. . . . . . . . . . . .

1 and θk

2 , θk

2 = θk

1 + α1.

Figure 4.7: Geometrical structure of A6 MDO with key points; intersection, bound-
ary, ghost, and interpolation points required to obtain (a) x− and (b) y−
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
sweeps.

Figure 4.8:

2D model for a single cylindrical scatter at the (a) center and two (b)
symmetric and (c) asymmetric cylindrical scatters in the corners. Here
the red mark indicates the point source.
. . . . . . . . . . . . . . . . . .

86

89

90

93

94

95

Figure 4.9: Time evolution of a point source ﬁeld cos(ωt), ω = 10 with a circular
scatter of radius r = 0.5 in the center of the square domain Ω = [−1, 1]2
with a spatial step size ∆x = ∆y = 0.0156 and time step size ∆t = 0.0078. 96

Figure 4.10: Time evolution of a point source ﬁeld cos(ωt), ω = 10 with two symmetric
circular scatters of radii r1 = r2 = 0.3 in the bottom-left and top-right
corners of the square domain Ω = [−2, 2]2 with a spatial step size ∆x =
. . . . . . . . . . . . . . .
∆y = 0.0156 and time step size ∆t = 0.0078.

Figure 4.11: Time evolution of a point source ﬁeld cos(ωt), ω = 10 with two asymmet-
rical circular scatters of radii r1 = 0.3 and r2 = 0.15 in the bottom-left
and top-right corners of the square domain Ω = [−2, 2]2 with a spatial
step size ∆x = ∆y = 0.0156 and time step size ∆t = 0.0078. . . . . . . .

Figure 4.12: Fourth order convergence of 2D wave solver using a point source cos(ωt),
ω = 1 placed at (0,−1 + 3∆x), ∆x = 0.0156 on a square domain Ω =
[−1, 1]2 with a circular scatter of radius r = 0.5 at the center by imposing
Neumann and outﬂow along the circular and square boundaries. This is
a self-reﬁnement study which measures L∞ norm of the error at the ﬁnal
time T = 2.0. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

96

97

97

xv

Figure 4.13: Convergence plots for a) space using entire solution and b) space using
boundary derivative on a square domain Ω = [−1, 1]2 with a circular
scatter of radius r = 0.5 at the center by imposing Neumann and outﬂow
along the circular and square boundaries. This is a self-reﬁnement study
which measures L2 norm of the error at the ﬁnal time T = 2.0.
. . . . .

Figure 4.14: 3D model for a single spherical scatter at the center of a cube, two b)
. . . . .

symmetric and c) asymmetric spherical scatters in the corners.

98

99

Figure 4.15: Time evolution of a point source ﬁeld cos(ωt), ω = 10 with a spherical
scatter of radii r = 0.5 in the center of the cubic domain Ω = [−1, 1]3 with
a spatial step size ∆x = ∆y = ∆z = 0.0156 and time step size ∆t = 0.0078. 99

Figure 4.16: Time evolution of a point source ﬁeld cos(ωt), ω = 10 with two symmetric
spherical scatters of radii r1 = r2 = 0.3 in the bottom-left and top-
right corners of the cubic domain Ω = [−1, 1]3 with a spatial step size
∆x = ∆y = ∆z = 0.0156 and time step size ∆t = 0.0078.

. . . . . . . . 100

Figure 4.17: Time evolution of a point source ﬁeld cos(ωt), ω = 10 with two asymmet-
ric spherical scatters of radii r1 = 0.3 and r2 = 0.15 in the bottom-left
and top-right corners of the cubic domain Ω = [−1, 1]3 with a spatial step
size ∆x = ∆y = ∆z = 0.0156 and time step size ∆t = 0.0078.

. . . . . . 100

Figure 4.18: Fourth order (a) time and (b) space convergence plots using a spherical
scatter of radius r = 0.5 at the center of a cubic domain Ω = [−1, 1]3 by
imposing Neumann and outﬂow along the surface of the sphere and cubic
boundaries. This is a self-reﬁnement study which measures (a) L∞ and
(b) L2 norm of the error at the ﬁnal time T = 2.0.

. . . . . . . . . . . . 101

Figure 4.19: Time evolution of a circular source ﬁeld cos(ωt)(|x2+y2−r2

c| < 2∆x), ω =
10 in the model of A6RM with a solid cathode of radius rc = 1.58 and
anode of inner radius ra = 2.11, vane radius rv = 4.11 and the coupling
horn radius rh = 4.9 and the angular width of the vane, α1 = π
6 , and the
cavity, α2 = π
6 .

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 102

Figure 4.20: Time evolution of ﬁelds provided by six point sources in the model of
A6M with a transparent cathode of radius rc = 1.58 and anode of inner
radius ra = 2.11, vane radius rv = 4.11 and the coupling horn radius
6 , and the cavity, α2 = π
rh = 4.9 and the angular width of the vane, α1 = π

6 .102

xvi

Figure 4.21: Fourth order convergence of ﬁelds provided by a point source cos(ωt),
ω = 1 at the center (0, 0), of A6MDO with the anode of radii rh = 4.9
rv = 4.11, and ra = 2.11 and angular width of the vane, α1 = π
6 and the
cavity, α2 = π
6 . This is a self-reﬁnement study which measures L∞ norm of
the error at time T = 2.0 for ﬁxed spatial step size ∆x = ∆y = 3.13× 10−2 103

Figure 4.22: Key points used for the simulation of 3D A6 magnetron; intersection,
boundary, ghost, and interpolation points required to obtain (a) x−, (b)
y−, and (c) z− sweeps.

. . . . . . . . . . . . . . . . . . . . . . . . . . . 104

Figure 4.23: Time evolution of ﬁelds provided by six point sources in the model of 3D
A6 magnetron with a transparent cathode of radius rc = 1.58 and anode
of inner radius ra = 2.11, vane radius rv = 4.11 and the coupling horn
radius rh = 4.9, the angular width of the vane, α1 = π
9 , cavity angle,
α2 = 2π

9 , and thickness h = 6. . . . . . . . . . . . . . . . . . . . . . . . . 105

Figure 4.24: Fourth order (a) time convergence and (b) space convergence of ﬁelds
provided by six point sources in the model of 3D A6 magnetron with
a transparent cathode and the anode of radii rh = 4.9 rv = 4.11, and
ra = 2.11 and angular width of the vane, α1 = π
6 and the cavity, α2 = π
6 .
This is a self-reﬁnement study which measures L∞ norm of the error at
time T = 2.0 for a ﬁxed spatial step size of ∆x = ∆y = 3.13 × 10−2 . . . 105

Figure 5.1: Boundary surface between two regions with electric ﬁelds E1 and E2,
magnetic ﬁeld H1 and H2, electric ﬂux densities D1 and D2, and magnetic
ﬂux densities B1 and B2 respectively. Here, Js is the surface current, ρs
is the surface charge density, and n is the normal vector pointed out from
the region two (Perfect Electric Conductor). . . . . . . . . . . . . . . . . 110

Figure 5.2: A PEC square cavity rotated by an angle θ with normal vectors nL, nD,

nR, and nU along the left, down, right, and up boundaries. . . . . . . . . 112

Figure 5.3: Geometry of 2D embedded PEC boundary method with a boundary point
(xB, yB), ghost point (xG, yG), and interpolation points (xI, yI), (xII, yII),
and (xIII, yIII).

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 116

Figure 5.4: Frequency distribution for 31.42o rotated 21cm×21cm square cavity com-
puted using the measured vector potential at the point (3.36cm, 3.36cm)
for the impulse response, h = 0.084 cm . . . . . . . . . . . . . . . . . . . 121

Figure 5.5: A strong fundamental mode for 31.42o rotated 21cm×21cm square cavity

computed for diﬀerent resolutions.

. . . . . . . . . . . . . . . . . . . . . 121

xvii

Figure 5.6: Convergence plots for the natural modes (a) 1-3, (b) 3-3, (c) 1-5, (d) 3-5,
and (e) 5-5 with analytically computed frequencies (a) f = 2.2588GHz,
(b) f = 3.0305GHz, (c) f = 3.6422GHz, (d) f = 4.1650GHz, and (e)
f = 5.0508GHz for 31.42o rotated 21cm × 21cm square cavity computed
for diﬀerent resolutions.

. . . . . . . . . . . . . . . . . . . . . . . . . . . 122

Figure 5.7: Time evolution of a point source ﬁeld sin(2πf t) with f = 1GHz placed
at the center of a PEC square box of grid size 100 × 100 rotated by the
angles 0o, 31.42o, 45o, time step size ∆t = 7ps.

. . . . . . . . . . . . . . 124

Figure 5.8: Convergence plots for a) space using entire solution and b) space using
boundary derivative on a PEC square domain [0cm, 21cm]2. This study
measures L2 norm of the error at time T = 1.0ns compared with the
analytical solution, for the cases of mesh aligned and nonaligned (rotated
by 45o). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 125

Figure 5.9: Convergence plots for (a) time and (b) space on a PEC a square domain
[0cm, 21cm]2. This study measures L∞ norm of the error at time T =
1.0ns compared with the analytical solution, for the cases of mesh aligned
and nonaligned (rotated by 45o).

. . . . . . . . . . . . . . . . . . . . . 126

Figure 5.10: Time evaluation of A in (a) mesh aligned and (b) θ = 31.420 rotated
square cavities with a leak imposed by outﬂow boundary condition and
diﬀraction Q which is obtained for an oscillating point source (sin(2πf t),
f = 1GHz) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 127

Figure 5.11: Frequency spectrum of 2D A6M with vane resonators rv = 4.11cm, anode
radius ra = 2.11cm, cathode radius rc = 1.58cm, angular width of vane,
20o, and cavity angle, 40o, the grid size 128×128, time step size ∆t = 39ps,
averaging parameter β = 1.4 and dissipation coeﬃcient  = 0.1.

. . . . . 128

Figure 5.12: Evolution of the transparent cathode A6M with vane resonators rv =
4.11cm, anode radius ra = 2.11cm, cathode radius rc = 1.58cm, angular
width of vane, 20o, and cavity angle, 40o, the grid size 128×128, time step
size ∆t = 39ps, averaging parameter β = 1.4 and dissipation coeﬃcient
 = 0.1.

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 129

Figure 5.13: 2D view of the 18 cavity rising sun magnetron (AX9) with a cathode of
radius rc and anode of inner radius ra, short and long cavity radii rvh and
rvl.

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 130

xviii

Figure 5.14: Evolution of the transparent cathode 12 cavity rising sun magnetron
with vane resonators rvs = 4.11cm and rvl = 4.91cm, anode radius
ra = 2.11cm, cathode radius rc = 1.58cm, angular width of vane, 12o,
and cavity angle, 18o, the grid size 128 × 128, time step size ∆t = 39ps,
averaging parameter β = 1.4 and dissipation coeﬃcient  = 0.1.

. . . . . 131

Figure 5.15: Evolution of the transparent cathode 18 cavity rising sun magnetron
(AX9) with vane resonators rvs = 4.11cm and rvl = 4.91cm, anode radius
ra = 2.11cm, cathode radius rc = 1.58cm, angular width of vane, 8o,
and cavity angle, 12o, the grid size 128 × 128, time step size ∆t = 39ps,
averaging parameter β = 1.4 and dissipation coeﬃcient  = 0.1.

. . . . . 132

Figure 6.1: Domain Decomposition of the domain [a, b] to three 1D sub domains
j ) and right

[c0, c1], [c1, c2],and [c2, c3], each evolves own internal left (I L
(I R

j ) sweeps over the domain of length ∆cj.

. . . . . . . . . . . . . . . . 137

Figure 6.2: Log-log plot for wall time using 28 core Intel(R) Xeon(R) CPU E5-2680

v4 @ 2.40GHz processor, for the grid size N 2, N = 64, 128, 256, 1024, 2048.139

Figure 6.3: Log-log plot for speedup in using 28 core Intel(R) Xeon(R) CPU E5-2680
v4 @ 2.40GHz processor for the grid size N 2, N = 64, 128, 256, 512, 1024,
2048.

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 140

Figure 6.4: Log-log plot for eﬃciency using 28 core Intel(R) Xeon(R) CPU E5-2680
v4 @ 2.40GHz processor for the grid size N 2, N = 64, 128, 256, 512, 1024,
2048.

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 140

Figure 6.5: Task parallelism for the computation of un+1 based on Cxy and parallel

stages executing through CUDA streams.

. . . . . . . . . . . . . . . . . 142

Figure 6.6: Log-log plot for wall time using Intel(R) Xeon(R) CPU, Tesla K20, and
K80 GPUs for the grid sizeN 2, N = 32, 64, 128, 256, 512, 1024, 2048,
4096, 8192.

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 143

Figure 6.7:

Intel(R) Xeon(R) CPU vs Tesla K20 and K80 GPU speedup (log-log plot)
for the grid size N 2, N = 32, 64, 128, 256, 512, 1024, 2048, 4096, 8192.

. 143

Figure 7.1: Magnetron with eight circular hole cavity anode (this ﬁgure is from En-

cyclopedia Britannica, Inc.)

. . . . . . . . . . . . . . . . . . . . . . . . 148

Figure 7.2: Two-cavity klystron (this ﬁgure is from Reference [30]) . . . . . . . . . . 148

Figure 7.3: Evolution of the two 3D point sources sin(2πf t) with the same frequency
f = 1 at the corners of the cubical domain ([−1, 1]3), spatial step size
∆x = ∆y = ∆z = 0.013, and time step size ∆t = 0.0067.

. . . . . . . . 152

xix

Figure 7.4: Time (a) and space (b) convergence studies using 3D point source sin(2πf t)

with frequencies f = 1 at the center of the cubical domain ([−1, 1]3).

. . 152

Figure 7.5: Evolution of the two 3D point sources sin(2πf t) with frequencies 1 and
10 at the corners of the cubical domain ([−1, 1]3), spatial step size ∆x =
∆y = ∆z = 0.013, and time step size ∆t = 0.0067.

. . . . . . . . . . . . 153

xx

LIST OF ALGORITHMS

Algorithm 1

Fast Convolution . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

Algorithm 2

Two Dimensional Solver . . . . . . . . . . . . . . . . . . . . . . . . .

Algorithm 3

Algorithm 4

compute u : Compute u at time tn+1
. . . . . . . . . . . . . . . . . .
linv : L−1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

Algorithm 5

Compute u at time tn+1 with high-order accuracy . . . . . . . . . . .

Algorithm 6

poly highorder : Compute the coeﬃcients for high-order accuracy . .

Algorithm 7

Algorithm 8

Algorithm 9

Algorithm 10

compute u high :Compute u in 2D with high-order accuracy . . . . .
compute c : Compute operator C in 2D; Cxy = L−1
compute d :Compute operator D in 2D; Dxy = 1 − L−1
compute cd : Compute operators C and D in 2D; Dxy and Cxy

y Dx + L−1
y L−1

x Dy . . . .

. . . . . .

. . . .

x

Algorithm 11 Compute u at time tn+1 by imposing outﬂow boundary conditions

.
linv out: Compute L−1 for outﬂow boundary condition . . . . . . . .

Algorithm 12

Algorithm 13

apply bc outf low: Apply outﬂow boundary conditions . . . . . . . . .

Algorithm 14

gamma : Compute the γ coeﬃcients for a given order of accuracy . .

25

35

36

36

39

40

43

44

45

45

67

68

69

69

70

81

84

85

88

Algorithm 15 E : Compute the coeﬃcients E based on lemma 2.1
. . . . . . . . .
Algorithm 16 Compute L−1 for 1D scheme . . . . . . . . . . . . . . . . . . . . . . .
Algorithm 17 Compute wx(= L−1
Algorithm 18 Compute wyx(= L−1
Algorithm 19 Compute wzyx(= L−1
Algorithm 20 Compute wx(= L−1
Algorithm 21 Compute wyx(= L−1

x [u]) . . . . . . . . . . . . . . . . . . . . . . . . . .

y [wx]) . . . . . . . . . . . . . . . . . . . . . . . .

z [wyx]) . . . . . . . . . . . . . . . . . . . . . . .

x [A])

y [wx]) . . . . . . . . . . . . . . . . . . . . . . . . 119

. . . . . . . . . . . . . . . . . . . . . . . . . 118

xxi

CHAPTER 1: Introduction

1.1 Context and Motivation

Scientists have been researching Higher Power Microwave (HPM) tubes to improve their

performance in new technologies, as well as identifying applications in diﬀerent areas, such

as medicine, waste decomposition, radar systems, plasma screens, linear accelerators, etc.

Beyond working in the Electromagnetic (EM) experimental labs, researchers also have been

working on the development of HPM simulation tools using diﬀerent numerical schemes.

Simulations can provide insight into real experiments. Moreover, they are very ﬂexible to

implement and can test new ideas using easy, fast, and cost-eﬃcient design cycles. In order

to perform EM simulations, ﬁrst, we have to solve Maxwell’s equations. Several approaches

have been introduced to solve Maxwell’s equations. They basically fall in the broad category

of Finite Diﬀerence (FD) Methods, Finite Element Methods (FEM), Method of Moments

(MOM), or hybrid techniques which combine two or more methods ( e.g., Finite Element

Time Domain (FETD) with Finite Diﬀerent Time Domain (FDTD) or FETD with MOM).

Each of the aforementioned methods has pros and cons, For example, the FDTD-based

explicit Yee scheme [71] has been used as a benchmark method for more than 60 years because

it preserves certain physical properties very well namely, Gauss’s law and the continuity

condition. The drawback of the method is that it fails to handle curved boundaries in

Cartesian grids due to staircasing issues. Further, explicit FD schemes can become unstable

when the surface bounds a medium [52]. In contrast, the FETD method is a very prominent

approach for complex geometries but this requires the solution of large systems of linear

equations. Such methods are expensive and hard to scale due to the bottleneck of matrix

inversions. Further, FETD schemes based on the Newmark beta method are unconditionally

1

stable but exhibit only second-order accuracy [29]. In the same spirit as FEM, the MOM

was developed to address complex problems in EM. The method transforms the boundary-

value problem into a system of linear equations [42]. Consequently, it suﬀers from the same

performance issues as FEM, since these matrices need to be inverted. The MOLT based

scheme developed by Causley et.al. [15; 13] was designed to address the issues faced by FDTD

and FETD methods. It is an unconditionally stable, fast, and implicit scheme, capable of

arbitrary order accuracy in time for Dirichlet and periodic boundaries on Cartesian grids.

The method reduces to second order for outﬂow boundary condition and Neumann boundary

condition on curved boundaries. My work extended the order of accuracy for these problems

to arbitrary order, which was veriﬁed through reﬁnement studies. We show fourth-order

spatiotemporal accuracy for complex geometry problems including curved boundaries.

Further, a general framework was deﬁned for problems with complex geometry using an

embedded boundary approach. Simulation tools for HPM tubes, such as A6 magnetron, 12

and 18 cavity rising suns were developed using a PEC boundary condition. The scheme

was derived for the electromagnetic vector potential using the Lorenz gauge which imposed

the PEC boundary condition in 2D. I evaluated the simulation of A6 magnetron using a

ping test and obtained six-strong resonance modes. This is identiﬁed as a challenging task

because the existing tools may fail to give an accurate or a robust solution [39; 40; 51; 67],

or the methods are limited to second-order accuracy [32]. The Discontinuous Galerkin PIC-

based approach fails to obtain the exact six resonance modes for the simulation of an A6

magnetron due to the diﬀusivity of the scheme [39; 40]. The multiphysics simulation software

tool VSim which is based on conformal Finite Diﬀerence Time Domain (FDTD) methods

and conformal particle boundary conditions, fails to give robust solutions in the simulation

of A6 magnetrons for every resolution. The user-conﬁgurable code, MAGIC which is an EM

FDTD-PIC code, is limited to the second-order accuracy[32].

We will review the needs and challenges for the simulation of magnetrons, speciﬁcally A6

magnetron in the next section.

2

1.2 High Power Magnetrons

The Magnetron is the most tunable Higher Power Microwave (HPM) source with 30%

tunable range [62]. It was invented by Arthur Hull in 1913, and magnetrons were built on

his original principles in the 1920s and 1930s with power levels of about 100W. In the 1940s,

the magnetrons developed by Boot and Randell achieved 10kW on their initial development.

In August 1940, the cavity magnetron based new weapons development for the US defense

systems during World War-II was led by Sir Henry Tizard. The magnetron driven radar had

a great impact on the war and the subsequent technical development of the magnetrons was

mainly carried out at the MIT radiation laboratory. The postwar development of magnetron

and related technologies has been enhanced in multi-directions. Such technology is strapping,

a mode selection technology which shifts the frequency spectrum by connecting alternate

resonators and prevents mode hopping. The rising suns, designed by alternating between

two resonators’ shape to separate the mode in frequency. A new generation of high-power

frequency-agile magnetrons was introduced for electronic warfare applications. The coaxial

magnetron, introduced by Feinstein and Collier, allows stable tuning by use of a cavity

surrounding the magnetron resonator. The Magnetron Injection Gun was introduced by

French and the relativistic magnetron (high current extension of the conventional magnetron)

was developed by Bekeﬁ and Orzechowski in 1976 [5; 23; 32].

Nowadays, magnetrons, high output power (GW-class) microwave tubes, are used in nu-

merous applications such as communication systems, radar, warfare, medical X-ray sources,

and microwave ovens. To study and enhance the performance of the magnetrons, magnetron

developers need a reliable and accurate simulation method. The research studies on the sim-

ulation of magnetrons have been carried out in academic research labs, national research labs

as well as electrical-electronic-computer engineering industries. For this simulation, we need

to derive a time-dependent solution for Maxwell’s equations that is applicable for complex

geometries and should have the ability to incorporate with particles. In recent years, the

conventional FDTD particle-in-cell (PIC) method has been widely used to study magnetrons

3

as well as other microwave tubes. Even though the FDTD based Yee scheme is a well-agreed

method for Maxwell’s equations and preserves divergence-free quantities, because of an ex-

plicit scheme it suﬀers from CFL restrictions and is not suitable for curved surfaces due to the

cut-cell staircasing approximation. Another approach, the conformal ﬁnite diﬀerence time

domain (CFDTD) PIC method, treats cut-cells diﬀerently in order to maintain second-order

accuracy for curved surfaces [51]. The CFDTD Dey-Mittra algorithm adjusts the area and

lengths of cut cells in the equation of Faraday’s law in Maxwell’s equation, but its stability

is limited by the CFL condition and it wouldn’t be scalable. The ADI-FDTD is one of the

most powerful schemes to solve Maxwell’s equations because it relies on simple, one dimen-

sional, tridiagonal system solvers in contrast to a single large system solver as is required by

the Crank-Nicholson implicit method. However, the ADI-FDTD method is mostly used to

solve non-complicated domains such as rectangular domain and is not applicable for complex

geometries because of showing only ﬁrst-order accuracy for stair-stepped curved boundaries.

ADI-FDTD method combined with the Dey-Mittra embedded boundary method can model

the curved domains associated with complex structures and time step sizes beyond the CFL

limit [67]. The eﬃciency of this method depends on the one-dimensional tridiagonal solvers

which are used underneath and that will cause a major bottle-neck which eﬀects the scala-

bility of the scheme. Further, the order of accuracy is limited to second-order.

Based on these studies, in order to simulate the HPM tube, we need a robust implicit

scheme that applies to complex geometries. Hence, the MOLT based A-stable implicit scheme

introduced in [14] is chosen to simulate HPM magnetrons and imposed embedded boundary

methods to deal with complex geometries associated with these kinds of accelerator struc-

tures. We derived a time-dependent solution to Maxwell’s equations in the vector potential

form under the Lorenz gauge and imposed a perfect electric conductor (PEC) embedded

boundary method. The scheme is unconditionally stable, shows fourth-order accuracy be-
yond the CFL restriction, and is freely scalable with high performance (requires O(n) cost

for n grids). It doesn’t depend on scalability restricting matrix operations but instead uses

4

free-space Green’s function based fast convolution. We performed a cold test using our sim-

ulation study. The cold test is an experiment of classical electromagnetic/microwave studies

without including charged particles and it can be done in either the frequency or the time

domain. We determined the dispersion relation of the interacting structures or slow-wave

structures and obtained the fundamental frequency mode using ping tests.

It would be

possible to carry out a follow up hot test which will include particles.

Figure 1.1 shows the 3D view of a relativistic A6 magnetron with diﬀraction output

(MDO) [61]. It has a cylindrical cathode in the center, an anode with six vanes around the

circle symmetrically, and a coupling horn which is connected with a cavity. We simulate

the open cavity A6 MDO using our MOLT based scheme by imposing embedded Neumann

(Chapter 4)/PEC (Chapter 5) and outﬂow boundary conditions for the scheme evaluation.

Further, our simulation study includes a symmetrical A6 magnetron using PEC boundary

conditions in 2D and Neumann boundary conditions in 3D. Further, we evaluate the A6

magnetron for the impulse response and resonance mode frequencies of it. A 2D view of

an A6 magnetron without diﬀraction output is shown in Figure 1.2. This has a cathode of

radius rc and an anode with inner radius ra, vane radius rv, vane angle α1, and cavity angle

α2.

Figure 1.1: 3D Relativistic A6 magnetron with diﬀraction output (MDO) with a cylindrical

cathode in the center [61]

5

Figure 1.2: 2D view of A6 magnetron with a cathode of radius rc and the anode with inner

radius ra, vane radius rv, vane angle α1, and cavity angle α2

.

1.3 Research Goal and Objectives

We have been working towards developing a linear-time, fast higher-order, multi-dimensional,

A-stable implicit solver that can be applicable for electromagnetic (EM) problems, specif-

ically targeting plasma science. This approach uses MOLT formulation combined with an

ADI scheme. In this way, algebraic approximations of original complex multi-physics prob-

lems can be mapped to utilize a computer’s core competencies. We aim to develop several

classes for leadership distributed multi-core computing platforms based on GP-GPU that is
an O(N ) direct implicit solver, and thereby enable exploration of a range of grand-challenge

problems. We are motivated by the following objectives:

• The development of high-order outﬂow and Neumann boundary conditions in 3D and

implementing a general geometric framework.

• The development of a high-order EM solver with no CFL constraint by developing

6

suitable boundary conditions for the vector-potential form of Maxwells equations under

the Lorenz gauge. Develop a scheme for perfectly electrically conducting layers (PECs)

in 2D.

• The development of high-order multi-core 2D and 3D implicit solvers on GPGPU for

general geometries.

• The development of a scalable software solution using multi-core openMP and multi-

node MPI.

We choose classic EM problems with complex geometries to assess our solver. These

include photonic crystal waveguides, scattering of laser light on spherical objects, and sim-

ulating electron beams in an A6 relativistic magnetron.

1.4 Electromagnetic Wave Propagation

1.4.1 Electromagnetic Fields

The mathematical relationship of electromagnetic ﬁelds is governed by Maxwell’s equa-

tions. The diﬀerential form of the macroscopic Maxwell’s equations can be expressed as:

= 0

∇ × E +
∂B
∂t
∇ × H − ∂D
∂t

= J
∇ · B = 0
∇ · D = ρ

(1.1)

(1.2)

(1.3)

(1.4)

Where Equations 1.1 and 1.2 represent Faraday’s law and Ampere’s law respectively, and
Equations 1.3 and 1.4 represent Gauss’s law. Here B is the magnetic ﬂux density (Wbm−2),
D is the electric ﬂux density (Cm−2), E is the electric ﬁeld intensity (Vm−1), H is the
magnetic ﬁeld intensity (Am−1), J is the electric current density (Am−2), and ρ is the

7

electric charge density (Cm−3). In linear isotropic media, the electric ﬂux density D and the

magnetic ﬂux density B have the constitutive relations with electric ﬁeld intensity E and

magnetic ﬁeld intensity H as follows:

D = E = 0rE

B = µH = µ0µrH

(1.5)

(1.6)

where the dielectric constant  is the permittivity (Fm−1), 0 is the free space permittivity

and r is the relative permittivity. Similarly, µ is the permeability, µ0 is the free space
permeability (Hm−1), and µr is the relative permeability of the media. Upon applying the

approximations (1.5 and 1.6), we get

∇ · H = 0
∇ · E =
ρ


Now consider the case where  and µ do not vary with time, thus

∇ × E + µ
∇ × H − 

∂H
∂t
∂E
∂t

= 0

= J

(1.7)

(1.8)

(1.9)

(1.10)

Since we have solvers for second-order wave equations, we convert the ﬁrst order Maxwell

system to second-order form.

We know the following relationship with the curl and divergence operations,

∇ × (∇ × v) = ∇(∇ · v) − ∇2v

(1.11)

Applying Equation 1.11 to E, we ﬁnd the wave equation for the electric ﬁeld in the

8

presence of source,

∇(∇ · E) − ∇2E = ∇ × (∇ × E)

(cid:17) − ∇2E = −µ(∇ × ∂H

∇(cid:16) ρ


)
∂t
∂(∇ × H)

= −µ

= −µ
= −µ
∂J
∂t

∂t
∂( ∂E
∂t + J)
∂t
+ ∇(cid:16) ρ
(cid:17)
∂t2 − µ

∂2E

∂J
∂t


∇2E − 1
c2

∂2E
∂t2 = µ

(1.12)

where the speed c = 1√

µ. Suppose there are no free charges, ρ = 0, and J = 0 then we

have source-free wave equation for the electric ﬁeld,

∇2E − 1
c2

∂2E
∂t2 = 0

(1.13)

In the same way, upon applying Equation 1.11 to H, we obtain

∇(∇ · H) − ∇2H = ∇ × (∇ × H)

(cid:18)

(cid:19)

−∇2H = ∇ ×


= 

∂t

+ J

+ ∇ × J

∂E
∂t
∂(∇ × E)
∂(−µ ∂H
∂t )
+ ∇ × J
∂t2 + ∇ × J

∂t
∂2H

= 
= −µ

∇2H − 1
c2

∂2H

∂t2 = −∇ × J

(1.14)

If J = 0 then we have source-free wave equation for the magnetic ﬁeld,

9

∇2H − 1
c2

∂2H
∂t2 = 0

(1.15)

We have derived a wave equation for the electric ﬁeld (1.12) and the magnetic ﬁeld (1.14)

previously. However, since our targeting application domain is plasma science, we need to

obtain electric scalar and magnetic vector potential formulation from the mixed potential

form of Maxwell’s equation which can easily deal with charged particles. We give a detailed

derivation for the vector potential formulation in Chapter 5

1.4.2 Boundary Conditions for Electromagnetic Fields

The electromagnetic ﬁelds and ﬂux densities are subject to boundary conditions; they

can be derived from the integral form of Maxwell’s equations. Let’s consider a current and

charge-free boundary s that separates the regions 1 and 2 as shown in Figure 1.3.

Figure 1.3: Boundary surface between two regions with electric ﬁelds E1 and E2, magnetic
ﬁeld H1 and H2, electric ﬂux densities D1 and D2, and magnetic ﬂux densities
B1 and B2 respectively. Here, Js is the surface current, ρs is the surface charge
density, and n is the normal vector pointed out from the region two.

Faraday’s and Ampere’s laws predict electric and magnetic ﬁelds are continuous along

the boundary, so the tangential boundary conditions can be expressed as,

ˆn × (E1 − E2) = 0

ˆn × (H1 − H2) = 0

(1.16)

Gauss’s law predicts the electric and magnetic ﬂuxes are continuous along the normal to

10

the boundary surface, so the normal boundary conditions can be expressed as,

ˆn · (D1 − D2) = 0

ˆn · (B1 − B2) = 0

(1.17)

Suppose the surface s has the induced surface electric current density Js and the induced

surface electric charge density ρs, the tangential and normal boundary conditions can be

predicted from Faraday’s, Ampere’s, and Gauss’s laws as follows,

ˆn × (E1 − E2) = 0

ˆn · (D1 − D2) = ρs

ˆn × (H1 − H2) = Js
ˆn · (B1 − B2) = 0

(1.18)

(1.19)

1.5 Overview

We give a general overview of this dissertation and draw attention to the original contri-

butions of the work. Basically, this work begins with MOLT based A-stable implicit solvers

introduced by [15] and later developed to achieve higher-order schemes [14] and adopt em-

bedded boundary methods [13]. We provide high-order outﬂow and Neumann boundary con-

ditions in 3D while targeting 3D multi-core high-order implicit EM solvers on the GP-GPU

platform. This work contributes to the development of high-order outﬂow and Neumann

boundary conditions in 3D for general geometries and 3D high-order EM solver using the

vector-potential form of Maxwells equations under the Lorenz gauge. Further, this work

implements these solvers on a multi-core computing platform, GP-GPU in order to release

a full-throttle multi-scale solver to enhance spatiotemporal performance. We now describe

the structure of the dissertation section by section.

We start with an introduction of our work in Chapter 1 including motivation (in Section

1.1) and objectives (Section 1.3).

In Chapter 2, we describe the available MOLT based

implicit scheme (in Section 2.2.1) and explain how we extend the technique to a 3D solver

11

using an ADI scheme ( in Section 2.3). We present numerical results for newly developed

3D solvers including variable speed test cases. Waveguides on a photonic crystal are used to

assess our 2D variable speed solver.

High-order temporal schemes are explained in Chapter 3 including a novel scheme for

high-order outﬂow boundary conditions described in Section 3.2. Further, this chapter in-

cludes numerical results for the proposed high-order 3D solver.

In Chapter 4, we begin with a description of the embedded method presented by [13] and

extend it to higher-order using the method given in [15] for 2D and 3D problems. Further, we

develop a general approach to deal with complex boundary problems in Section 4.3. We give

an explanation of our high-order 3D scheme in Section 4.2.3 and present numerical results

for complex boundary problems, electromagnetic wave scattering by PEC cylinders (4.4.1),

electron beams in A6 Relativistic Magnetron in 2D (4.4.3), and scattering of Laser light

on spherical objects in 3D (4.4.2) which are implemented using a fourth-order embedded

boundary scheme.

In Chapter 5, we explain our versatile approach for electromagnetic vector potential and

the implementation of PEC boundary conditions. We provide several test cases including

frequency mode analysis for A6 magnetron in 2D. Chapter 6 explains nuts and bolts of a

software solution which can be scalable using high-performance computing systems. Finally

in Chapter 7 we summarize our work that was carried out throughout this study and give a

list of possible directions that can be used for further advancement.

12

CHAPTER 2: Mathematical Model for Wave Equation Solver

2.1 Introduction

In this chapter, we review the novel impact method which we will further develop in

later chapters. The new work in this chapter is the extension to 3D. We also discuss the

advantages of this approach over other methods.

Maxwell’s equations in free-space result in a hyperbolic wave equation with ﬁnite speed of

propagation of a ﬁeld quantity. We develop a numerical scheme in 3D to solve the hyperbolic

wave equation by imposing diﬀerent boundary conditions, speciﬁcally outﬂow and embedded

Neumann and PEC boundaries. We are mainly focused on the development of an EM

solver for problems with complex geometries, which can be easily accepted on multi-core

technologies facilitated by GP-GPU computing platforms. Our main target is to derive a

time-dependent solution for Maxwell’s equations with higher-order accuracy in time and

space.

In order to obtain the solution, we choose an implicit unconditionally stable Method Of

Lines Transpose (MOLT - also known as the transverse method of lines or Rothes method) [4;

17; 38; 41; 54; 50; 31] based numerical scheme coupled with an embedded boundary approach

to deal with complex geometry problems [15; 14; 13]. This was implemented in 2D and we

extend the scheme to 3D by successful implementations for diﬀerent applications. This

approach uses a MOLT formulation combined with an Alternating Direction Implicit (ADI)

scheme. The method avoids the use of matrices that typically result from the discretization

of the spatial parts of a problem and thus eliminates the main bottleneck in scaling implicit

methods. In this scheme, a PDE is ﬁrst discretized in time, and then the resulting boundary-

value problems are solved using a Green’s function method. In particular, the inverse of the

13

resulting modiﬁed Helmholtz operator is analytically constructed and evaluated eﬃciently

using an O(N ) recursive fast convolution algorithm. Extension to multi-dimensions is formed

using an ADI scheme, and each line is solved independently. Next, we describe why this

scheme is an optimal solution.

When we use this scheme to solve the hyperbolic wave equation, the spatial information

along a line in the simulation is connecting the boundaries to all points in space along that

line to advance a computational solution in time. Along this line, every point in space knows

about the other points due to the nature of the Green’s function used in the solution. The

approximation of spatial convolution depends on the quadrature size we choose, and the

accuracy of the time derivation can be extended by using the higher-order scheme which

performs a set of spatial convolutions/sweeps in multiple computing levels. The hyperbolic

wave equation form of Maxwell’s equation is preserved by the time discretization and the

spatial convolution. Further, we proved the scheme is strictly dispersing ( no diﬀusive for-

mulation) [53; 15], and satisﬁes the Gauss’s divergence-free conditions. When we extend

the scheme to multi-D using ADI splitting, the spatial points talk to each other by moving

in each direction, and they communicate along the lines in that direction. We reduce the

splitting error by doing a couple of sweeps along each direction and taking the average in

multiple computing levels.

Historically, time-dependent solutions of the wave equation are most commonly con-

structed with the method-of-lines (MOL) approach [47; 48; 72; 59]. The MOL schemes

consist of either collocation (nodal) methods or Galerkin (modal) methods [7; 12; 20]. The

collocation methods (e.g.ﬁnite-diﬀerence, ﬁnite-volume, and ﬁnite-element methods) are used

to ﬁnd a solution at a particular point, but in Galerkin methods, the solution is projected

onto a set of basis functions and the time-dependent coeﬃcients are computed. The ap-

proach which has inﬂuenced most numerical methods in computational electromagnetics is

Finite-diﬀerence time-domain (FDTD) methods [46; 65; 19], which use an explicit scheme,

so these schemes are restricted by Courant-Friedrichs-Lewy (CFL) condition (concerning the

14

ratio of the time step to the spatial step) that limits the time step size. In 1966, Kane S. Yee

derived an FDTD based time-dependent solution for Maxwell’s equations [71]. The FDTD

Yee scheme has been used for a large number of problems in computational electromagnetics.

However, the scheme is limited to Cartesian grids and suﬀers from the staircase eﬀect on

curved boundaries. Further, it cannot be used to deal with discontinuities across material

interfaces while maintaining a high-order of accuracy. Hence, the scheme is not suitable for

curved objects or media with material interfaces [57; 19]. There have been many attempts

apart from FDTD methods [46; 65; 19] to complex geometry and raise the order of these

methods above two [21; 19; 11; 10; 18]. However, these methods have not made it into

the mainstream because they are not robust. In addition to stability issues, these methods

typically suﬀer from very small CFL [21; 19; 11; 18] conditions because of small cells at the

boundaries.

In contrast, implicit schemes break the CFL restriction. Hence, the implicit schemes

are most preferable to the problems with tightly coupled scales such as problems in plasma

science. Alternating direction implicit FDTD (ADI-FDTD) is an implicit scheme that is

used to develop an A-stable implicit Maxwell solver in several plasma-related problems [73;

58; 26; 27], but they are all second order in time. Further, the implicit Finite Element Time

Domain (FETD) method based on the Newmark beta approach is unconditionally stable,

but it gives second-order accuracy. Further, the implicit and frequency domain schemes

require cost-expensive matrix inversion. Therefore, the MOLT based scheme is chosen to

be an appropriate scheme for this simulation study in order to obtain high-order accuracy

rapidly.

In Section 2.2, we give the implicit semi-discrete solution for the 1D wave equation, in

Section 2.3 we explain how we can extend this framework to multi-dimensions.

In Sec-

tion 2.2.3, we derive equations for several applicable boundary conditions including outﬂow

boundary conditions, and Section 2.4 explains details of the higher-order scheme, and ﬁnally

Section 2.6 reports a set of test cases. The subsections of section 2.4 are laid out as follows.

15

In Section 2.4.1, we derive a family of schemes of order 2P for the one-dimensional case,

and in Section 2.4.2, we generalize the higher-order scheme for higher spatial dimensions,

producing ADI methods of order 2P which will be A-stable. Here P refers to the number of

terms taken in the MOLT expansion.

2.2 One Dimensional Implicit Wave Equation Solver

Based on the initial boundary value problem with consistent boundary conditions, the

wave equation for one spatial dimension can be written as follows,

1

c2 ∂ttu − ∂xxu = S(x, t),

x ∈ [a, b],

t < 0,

(2.1)

with initial conditions

u(x, 0) = f (x),

∂tu(x, 0) = g(x),

x ∈ [a, b],
x ∈ [a, b],

where c is the propagation speed and S(x, t) denotes a source. This is well-posed once

consistent boundary conditions are appended, such as:

1. Dirichlet boundary condition: u(a, t) = UL(t) and u(b, t) = UR(t).

2. Neumann boundary condition: ∂xu(a, t) = VL(t) and ∂xu(b, t) = VR(t).

3. Periodic boundary condition: u(a, t) = u(b, t) and ∂xu(a, t) = ∂xu(b, t).

4. Outﬂow boundary condition: ∂tu(a, t) = c∂xu(a, t) and ∂tu(b, t) = −c∂xu(b, t).

2.2.1 Semi-discretization

The semi-discrete form of the wave equation can be obtained as follows:

We start by discretizing utt using the second order time centered ﬁnite diﬀerence approxi-

16

mation,

∂ttun =

un+1 − 2un + un−1

∆t2

− ∆t2
12

∂ttttu(x, η),

η ∈ [tn−1, tn+1].

(2.2)

The Laplacian has to be evaluated at time tn+1 to obtain an implicit scheme. To perform

this, we can apply a symmetric 3-point averaging for the Laplacian because we choose a

centered ﬁnite diﬀerence stencil.

∂xxun = ∂xx

where β > 0.

un +

un+1 − 2un + un−1

β2

(cid:17) − ∆t2

β2 ∂ttxxu(x, η).

(2.3)

Using (2.2) and (2.3), we obtain the second order semi-discrete equation,

(cid:16)

∂xx − β2
(c∆t)2

(cid:17)(cid:16)

un +

(cid:17)

un+1 − 2un + un−1

β2

= − β2

(c∆t)2 − S(x, tn) + O(∆t2).

(2.4)

(cid:16)

(cid:16)

The diﬀerential operator in Equation (2.4) can be interpreted as a modiﬁed Helmholtz

operator with,

Lx[u] :=

(cid:17)

1 − 1

α2 ∂xx

u(x), α =

β
c∆t

,

x ∈ [a, b].

(2.5)

On rearranging equation (2.4), we form the modiﬁed Helmholtz equation which can be

written as,

Lx[un+1 − (2 − β2)un + un−1] = β2(cid:16)

un +

α2 Sn(cid:17)

1

.

(2.6)

We solve this modiﬁed Helmholtz equation (2.6) by inverting the Helmholtz operator L

using the free space Greens function. In this way, we start with the solution subject to the
boundary [−∞,∞] and then map it to the actual boundary [a, b] by making the boundary

adjustment terms. Next, we describe the process in detail beginning with the derivation of

17

Green’s formula. Equation (2.4) can be written as,

(cid:16)

1 − 1

(cid:17)

where ψ(x) = un+1−(2−β2)un+un−1 and s(x) = β2(cid:16)

ψ(x) = s(x),

α2 ∂xx

x ∈ [a, b],

α2 Sn(cid:17)

un+ 1

, subject to the homogeneous

(2.7)

Dirichlet boundary condition

ψ(a) = 0, ψ(b) = 0,

(2.8)

The semi-discrete form of Equation (2.6) approximates the hyperbolic wave equation

with the second-order of accuracy in time using centered ﬁnite diﬀerence, and the symmetric

formulation removes the numerical dissipation. The scheme was proved as purely dispersive

[53], so it is more favorable for long-time simulations and overcomes the spurious eﬀect of

numerical diﬀusion. Consistency of the scheme was proved for free space and with boundary

conditions. Further, the scheme was proved as an unconditionally stable scheme using Von-

Neumann analysis (see [53] for proof). Based on those characteristics/proofs of the scheme,

(cid:16)

1− 1

α2 ∂xx

(cid:17)

the hyperbolic nature of the equation is preserved.

Upon multiplying the expression

ψ(x) with a test function P (x) and integrate

(cid:16)

(cid:12)(cid:12)(cid:12)b

a

P (x)∂xψ(x)

(cid:17)

(cid:12)(cid:12)(cid:12)b

a

− ψ(x)∂xP (x)

(2.9)

it from a to b,

b(cid:90)

a

=

(cid:16)
b(cid:90)

1 − 1

α2 ∂xx

ψ(x)P (x)dx

(cid:17)

(cid:16)

(cid:17)

ψ(x)

1 − 1

α2 ∂xx

P (x)dx − 1
α2

a

Now we have the Green’s formula,

18

(cid:104)

b(cid:90)

a

(cid:16)

(cid:17)

P (x) −(cid:16)

(cid:17)

1 − 1

α2 ∂xx

1 − 1

α2 ∂xx

(cid:105)

ψ(x)P (x)

dx =

(cid:16)

1
α2

(cid:12)(cid:12)(cid:12)b

a

P (x)∂xψ(x)

ψ(x)

Suppose the function P is a free space Green’s function

(cid:16)

(cid:17)

1 − 1

α2 ∂xx

P (x, x(cid:48)) = δ(x − x(cid:48))

(cid:17)

(cid:12)(cid:12)(cid:12)b

a

− ψ(x)∂xP (x)

(2.10)

(2.11)

b(cid:90)

a

ψ(x)δ(x − x(cid:48))dx −

ψ(x(cid:48)) −

b(cid:90)
b(cid:90)

a

a

s(x)P (x, x(cid:48))dx =

s(x)P (x, x(cid:48))dx =

(cid:16)
(cid:16)

1
α2

1
α2

Hence,

P (x, x(cid:48))∂xψ(x, x(cid:48))

− ψ(x)∂xP (x, x(cid:48))

P (x, x(cid:48))∂xψ(x)

− ψ(x)∂xP (x, x(cid:48))

(2.12)

(cid:17)

(cid:12)(cid:12)(cid:12)b
(cid:17)

a

(cid:12)(cid:12)(cid:12)b

a

(cid:12)(cid:12)(cid:12)b

a

(cid:12)(cid:12)(cid:12)b

a

b(cid:90)

a

ψ(x(cid:48)) =

s(x)P (x, x(cid:48))dx +

(cid:16)

1
α2

(cid:12)(cid:12)(cid:12)b

a

P (x, x(cid:48))∂xψ(x)

− ψ(x)∂xP (x, x(cid:48))

(cid:17)

(cid:12)(cid:12)(cid:12)b

a

(2.13)

This ends up with the particular solution and boundary correction which can be obtained by

the boundary condition we chose (Dirichlet, Neumann, periodic, outﬂow etc.). Here we chose

to use the free space Greens function and enforce the boundary conditions on u(a, t) and

u(b, t) through expressions we derive for the unknown terms. Let’s apply the homogeneous

Dirichlet boundary condition (2.8),

b(cid:90)

a

ψ(x(cid:48)) =

s(x)P (x, x(cid:48))dx +

(cid:16)

1
α2

(cid:12)(cid:12)(cid:12)x=b

P (b, x(cid:48))∂xψ(x)

− P (a, x(cid:48))∂xψ(x)

(cid:17)

(cid:12)(cid:12)(cid:12)x=a

(2.14)

19

Now we have two unknowns ∂xψ(x)
and they can be obtained by solving
the systems of linear equations that are derived by the limits on Equation (2.14) to x(cid:48) → a
and x(cid:48) → b,

and ∂xψ(x)

(cid:12)(cid:12)(cid:12)x=b

(cid:12)(cid:12)(cid:12)x=a

(cid:16)
(cid:16)

1
α2

1
α2

(cid:12)(cid:12)(cid:12)x=b
(cid:12)(cid:12)(cid:12)x=b

− P (a, a)∂xψ(x)

− P (a, b)∂xψ(x)

(cid:17)
(cid:17)

(2.15)

(2.16)

(cid:12)(cid:12)(cid:12)x=a
(cid:12)(cid:12)(cid:12)x=a

b(cid:90)
b(cid:90)

a

(cid:12)(cid:12)(cid:12)x=a
(cid:12)(cid:12)(cid:12)x=b

ψ(a) =

s(x)P (x, a)dx +

P (b, a)∂xψ(x)

ψ(b) =

s(x)P (x, b)dx +

P (b, b)∂xψ(x)

a

Since here, we are imposing homogeneous boundary condition, ψ(a) = 0 and ψ(b) = 0

b(cid:90)
(cid:16) − P (a, b)
b(cid:90)
(cid:16) − P (b, b)

a

b(cid:90)
b(cid:90)

a

(cid:17)
(cid:17)

(2.17)

(2.18)

∂xψ(x)

= wα

s(x)P (x, a)dx + P (a, a)

s(x)P (x, b)dx

∂xψ(x)

= wα

s(x)P (x, a)dx + P (b, a)

s(x)P (x, b)dx

where

a

a

wα =

α2

P (a, a)P (b, b) − P (a, b)P (b, a)

Now we compute the exact free space Green’s function to substitute the test function P ,

(cid:16)

1 − 1

α2 ∂xx

(cid:17)

G(x, x(cid:48)) = δ(x − x(cid:48))

(2.19)

subject to G(x, x(cid:48)) = 0 when x → −∞ and x → ∞. We can show that the Green’s
x(cid:48)− = −α2.
function satisﬁes the continuity condition G
Consider the homogeneous solution φ, Lφ = 0 and decompose it to two solutions φ1 and φ2
for x > x(cid:48) and x < x(cid:48). So they also satisfy Lφ1 = 0 and Lφ2 = 0 and φ1 and φ2 can be

x(cid:48)− = 0 and jump conditions ∂xG

(cid:12)(cid:12)(cid:12)x(cid:48)+

(cid:12)(cid:12)(cid:12)x(cid:48)+

20

written in the form of,

φ1 = c11sinh(αx) + c12cosh(αx)

φ2 = c21sinh(αx) + c22cosh(αx)

(2.20)

(2.21)

Upon applying the continuity and jump conditions ,

c11sinh(αx(cid:48)) + c12cosh(αx(cid:48)) − c21sinh(αx) − c22cosh(αx(cid:48)) = 0
c11cosh(αx(cid:48)) − c12sinh(αx(cid:48)) + −c21cosh(αx) + c22sinh(αx(cid:48)) = −α

(2.22)

(2.23)

Let’s consider x(cid:48) = 0, then we get, c12 = c22 and c21 = c11 + α. Now we write the solutions

φ1 and φ2 in the exponential form,

(cid:16)eαx − e−αx

(cid:17)
(cid:16) eαx − e−αx

2

+ c12

(cid:16)eαx + e−αx
(cid:17)

(cid:17)
(cid:16)eαx + e−αx

2

+ c12

2

φ2 = (c11 + α)

2

φ1 = c11

(cid:17)

(2.24)

(2.25)

For x > x(cid:48), limx(cid:48)→∞ φ1 = 0, we choose c11 = −c12 such that eαx vanishes, so,

φ1 = −c11e−αx

Similarly, for x < x(cid:48), limx(cid:48)→−∞ φ2 = 0, we choose c11 = −α/2 such that e−αx vanishes,

so,

Hence, the Green’s function G(x, x(cid:48)) = α

φ2 =

eαx

α
2
2 e−α|x−x(cid:48)|, and therefore the inverted modiﬁed

Helmholtz operator can be written as,

L−1
x [u] =

(cid:124)(cid:123)(cid:122)(cid:125)

Ix[u]

Particular solution

(cid:124)

+ Ae−α(x−a) + Be−α(b−x)

.

(2.26)

(cid:123)(cid:122)

(cid:125)

Homogeneous solution

21

where,

b(cid:90)

a

α
2

Ix[u] :=

e−α|x−y|u(y)dy,

x ∈ [a, b],

and

(cid:12)(cid:12)(cid:12)x=a

∂xψ(x)

A = −ψ(a)
2

− 1
2α

,

B = −ψ(b)
2

+

1
2α

(cid:12)(cid:12)(cid:12)x=b

∂xψ(x)

(2.27)

(2.28)

A and B also can be written in integral form as follows,

a(cid:90)

−∞

A =

α
2

e−α(a−y)u(y)dy,

B =

α
2

∞(cid:90)

b

e−α(y−b)u(y)dy.

(2.29)

The coeﬃcients A and B of the homogeneous solution are determined by applying bound-

ary conditions. Now, equation (2.6) can be written using the inverted Helmholtz operator,
L−1 as follows,

un+1 − 2un + un−1 = −β2un + β2L−1

x [un] + β2L−1

x

(2.30)

α2 Sn(cid:105)
(cid:104) 1

.

α2 Sn(cid:105)
(cid:104) 1

The deﬁnition can additionally be modiﬁed to include boundary corrections on a ﬁnite

domain (see [15]).
We also introduce a new operator D related to equation (2.26) that we are going to use

frequently in higher-order and higher dimensional solutions.

un+1 − 2un + un−1 = −β2Dx[un] + β2L−1

x

, Dx[u] := u − L−1

x [u].

(2.31)

The computational cost of evaluating Ix[u] in equation (2.26) is typically O(N 2); however it
can be computed eﬃciently with second order accuracy at O(N ) cost using the fast convo-

lution method detailed next.

22

2.2.2 Fast Convolution Algorithm

We are going to see how we can compute the particular solution using a fast convolution

algorithm. The integration for the particular solution, I[u](x) can be performed by decom-

posing it into left (I L[u](x)) and right (I R[u](x))) oriented integrals as shown in the Figure

2.1

Figure 2.1: Fast convolution line integration which decomposes left I L[u](x) and right
I R[u](x)) oriented integrals between the domain a and b with the spatial step
size, ∆xj(= xj − xj−1).

We begin by splitting the integration at y = x, and integrating from a to x and x to b;

i.e.,

I[u](x) = I L[u](x) + I R[u](x)

(2.32)

x(cid:90)

a

α
2

I L[u](x) =

u(y)e−α(x−y)dy,

I R[u](x) =

b(cid:90)

x

α
2

u(y)e−α(y−x)dy,

so that both integrands decay exponentially away from x. Additionally, they satisfy expo-

nential recurrence relations, which means that

I L[u](xj) = e−α∆xj I L[u](xj−1) + J L[u](xj)

j = 1 to N

J L[u](xj) =

α
2

e−αyu(x − y)dy,

(2.33)

∆xj(cid:90)

0

23

I R[u](xj) = e−α∆xj I R[u](xj+1) + J R[u](x),

j = N − 1 to 0

∆xj(cid:90)

J R[u](xj) =

α
2

e−αyu(x + y)dy.

(2.34)

0

where the ∆xj = xj − xj−1 which is the spatial grid size. The recursive representation

and the localization reduces the computational complexity.

In particular, using a second

order polynomial interpolant in place of u in the integral allows us to compute I[u](x) in
O(N ) time.

The local integrals ((2.33) and (2.34)) may be evaluated using quadrature or analytically.

For example, we can derive a second order quadrature using compact Simpson’s rule and

Lemma 2.1,

j ≈ P uj + Quj−1 + R (uj+1 − 2uj + uj−1) ,
J L
j ≈ P uj + Quj+1 + R (uj+1 − 2uj + uj−1) ,
J R

(2.35)

with quadrature weights,

P = 1 − 1 − d

v

, Q = −d +

1 − d
v

, R =

1 − d
v2 − 1 + d

2v

,

where v = α∆x, d = e−v, and uj denotes the solution at grid point xj

This method can be summarized as follows (Algorithm 1),

2.2.3 Boundary Condition

We can determine the Homogeneous coeﬃcients (A, B) from the imposed boundary con-

ditions and the end/boundary point values of the particular solution (Ix(xstart) and Ix(xend)

where xstart, and xend are two boundary points). Let us work with several common boundary

conditions in 1D ﬁrst. Then these can be expandable to higher dimension.

24

Algorithm 1 Fast Convolution
1: un - array of u values at time tn,
2: ν - array of weighted nodes along the speciﬁc direction,
3: expweight - array of exponentially weighted nodes along the speciﬁc direction,
4: ps - order of accuracy in space
5: for j = 1 to n do
6:

Compute JL(j + 1) and JR(j) via quadrature (see equation (2.35) for second order
quadrature) or analytical integration using expweight and ps

IL(j + 1) = d(j)IL(j) + JL(j + 1)
IR(n − j + 1) = d(n − j + 2)IR(n − j + 2) + JL(n − j + 1)

7: end for
8: d = e−ν
9: for j = 1 to n do
10:
11:
12: end for
13: for j = 1 to n do
I = 1
14:
2
15: end for

(cid:0)IL(j) + IR(j)(cid:1)

2.2.3.1 Dirichlet Boundary Condition

Let us begin with Dirichlet boundary conditions. Evaluating the semi-discrete solution

deﬁned in equation (2.30) at x = a and b, we get.

UL(tn+1) − 2UL(tn) + UL(tn−1) = −β2UL(tn) + β2

UR(tn+1)−2UR(tn)+UR(tn−1) = −β2UR(tn)+β2(cid:16)

(cid:20)

(cid:18)
(cid:104)

(cid:21)

1
α2 Sn

α2 Sn(cid:105)

1

I

un+1 +

(a) + A + Be−α(b−a)

I

un+1+

(b)+Ae−α(b−a)+B

, (2.37)

(cid:19)

,

(2.36)

(cid:17)

We can rearrange the linear system by unknown and known values,

A + µB = −wD
a ,
µA + B = −wD
b .

Solving the linear system for the unknowns A and B gives,

(wD

a − µwD
b )
(µ2 − 1)

A =

, B =

(wD

b − µwD
a )
(µ2 − 1)

,

(2.38)

25

where,

(cid:104)
(cid:104)

wD

a = I

un +

wD

b = I

un +

1

α2 Sn(cid:105)
α2 Sn(cid:105)

1

(a) − 1
β2

(a) − 1
β2

(cid:16)
(cid:16)

(cid:17)
(cid:17)

,

(2.39)

(2.40)

UL(tn+1) + (β2 − 2)UL(tn) + UL(tn−1)

UR(tn+1) + (β2 − 2)UR(tn) + UR(tn−1)

,

and µ = e−α(b−a).

2.2.3.2 Neumann Boundary Condition

For Neumann boundary condition, we obtain the identities, I

(b) =
−αI(b) using the characteristics of the integral solution (equation (2.27)); speciﬁcally, all

(a) = αI(a),

I

(cid:48)

(cid:48)

dependence on x is on the Greens function which is a simple exponential function. Diﬀer-

entiating the semi-discrete solution (2.30), and applying the Neumann boundary conditions

at x = a and b yields,

VL(tn+1)−2VL(tn)+VL(tn−1) = −β2VL(tn)+β2(cid:16)

I

(cid:104)

un+1+

and

VR(tn+1)−2VR(tn)+VR(tn−1) = −β2VR(tn)+β2(cid:16)

(cid:104)

I

un+1+

1

α2 Sn(cid:105)
α2 Sn(cid:105)

1

(a)+A+Be−α(b−a)(cid:17)

, (2.41)

(cid:17)

. (2.42)

(b)+Ae−α(b−a)+B

We can rearrange the linear system by unknown and known values,

A − µB = wN
a ,
−µA + B = wN
b .

Solving the linear system for the unknowns A and B gives,

A =

(wN

a + µwN
b )
(µ2 − 1)

, B =

(wN

b + µwN
a )
(µ2 − 1)

,

(2.43)

26

where,

(cid:104)
(cid:104)

wN

a = I

un +

wN

b = I

un +

1

α2 Sn(cid:105)
α2 Sn(cid:105)

1

(a) − 1
β2

(a) − 1
β2

(cid:16)
(cid:16)

2.2.3.3 Periodic Boundary Condition

VL(tn+1) + (β2 − 2)VL(tn) + VL(tn−1)

(2.44)

VR(tn+1) + (β2 − 2)VR(tn) + VR(tn−1)

.

(2.45)

(cid:17)
(cid:17)

,

By applying the periodic boundary conditions on the semi-discrete solution (2.30), we

obtain,

I

(cid:104)

α2 Sn(cid:105)

1

(cid:104)

un +

(a) + A + µB = I

un +

(a) + µA + B,

and

α

(cid:104)

(cid:16)

I

un +

α2 Sn(cid:105)

1

(cid:17)

(cid:16) − I

= α

(a) − A + µBα

(cid:104)

un +

(cid:17)

.

(a) − µA + Bα

(2.46)

(2.47)

1

1

α2 Sn(cid:105)
α2 Sn(cid:105)
(cid:104)
(cid:104)

We can rearrange the linear system to obtain,

(µ − 1)A + (1 − µ)B = I
(1 − µ)A + (1 − µ)B = I

un +

un +

(a) − I

un +

(a) + I

un +

(cid:104)
(cid:104)

1

α2 Sn(cid:105)
α2 Sn(cid:105)

1

1

α2 Sn(cid:105)
α2 Sn(cid:105)

1

(b),

(b).

Upon solving these linear equations we get,

(cid:104)

α2 Sn(cid:105)

(cid:104)

α2 Sn(cid:105)

I

A =

un + 1

(1 − µ)

(b)

I

, B =

un + 1

(a)

(1 − µ)

.

(2.48)

2.2.3.4 Outﬂow Boundary Condition

We can extend the deﬁnition of the outﬂow boundary conditions to the domains exterior

to [a, b]. Assuming that our initial condition has compact support in [a, b], after time t = tn,
the domain of dependence of un(x) extends to [a − ctn, b + ctn], since the propagation speed

27

is c. Now the solution of L−1 ((2.26)) can be written as.

L−1[un(x)] =

α
2

where

e−α|x−y|un(y)dy = I[u(x)] + Ane−α(x−a) + Bne−α(b−x).

b+ctn(cid:90)

a−ctn

An :=

α
2

a(cid:90)
b+ctn(cid:90)

a−ctn

e−α(a−y)un(y)dy,

Bn :=

α
2

e−α(y−b)un(y)dy.

(2.49)

(2.50)

b

Now we turn these spatial integrals into time integrals which exist at precisely the end points

x = a and b. First we consider outﬂow along the right boundary, x > b and assume this
region contains only right traveling waves, u(x, t) = u(x − ct). By tracing backward along a

characteristic ray we ﬁnd,

u(b + y, t) = u(b, t − y/c), y > 0.

Hence,

ctn(cid:90)

e−α(y)un(b + y, tn)dy =

0

tn(cid:90)

0

αc
2

Bn =

α
2

e−α(cs)un(b, tn − s)ds.

Now, we can impose outﬂow boundary conditions using the behavior of u at x = b. A

temporal recurrence relation due to the exponential will be,

tn−1(cid:90)

0

e−αcsu(b, tn − s)ds + e−αc∆t

e−αcsu(b, tn−1 − s)ds,

e−βzu(b, tn − z∆t)dz + e−βBn−1.

(2.51)

∆t(cid:90)
1(cid:90)

0

Bn =

αc
2

=

β
2

0

28

where β = αc∆t. Therefore, the boundary coeﬃcient Bn can be computed locally in

both time and space. We choose a quadratic interpolant in terms of u to have second order

accuracy.

u(b, tn − z∆t) = un(b) − z
2

(cid:16)

(cid:17)

un+1(b) − un−1(b)

(cid:16)

+

z2
2

un+1(b) − 2un(b) − un−1(b)

(cid:17)

.

(2.52)

Using the Equation (2.52) would give the second order approximation for outﬂow. To

compute the values that come from this approximation, we will make use of the following

lemma which comes from integration by parts (see [15] for proof),

Lemma 2.1. For integers m ≥ 0 and real v > 0,

1(cid:90)

Em := v

e−vzdz =

zm
m!

1
vm

(cid:16)

(cid:17)

1 − e−vPm(v)

m(cid:88)

l=0

vl
l!

where Pm(v) =

0

is the Taylor series expansion of order m of ev.

Using this lemma and integrating the expression (2.52) analytically, we can get,

Bn = e−βBn−1 + γ0un+1(b) + γ1un(b) + γ2un−1(b),

(2.53)

where,

γ0 = E2(β) − 1
E1(β),
2
γ1 = −2E2(β) + E0(β),

γ2 = E2(β) +

1
2

E1(β).

and

(2.54)

There are two unknowns in this equation (2.53), Bn and un+1, so we need one more equation,

29

that can be derived by evaluating equation (2.30) at x = b,

un+1(b) − 2un(b) + un−1(b) = −β2(cid:16)

un(b) −(cid:0)I(b) + µAn + Bn(cid:1)(cid:17)

.

(2.55)

Upon solving these two linear equations (2.53) and (2.55) we can obtain,

−Γ0µAn + (1 − Γ0)Bn = e−βBn−1 + Γ0I(b) + Γ1un(b) + Γ2un−1(b),

(2.56)

where

Γ0 =

β2
2

γ0, Γ1 = γ1 − γ0(β2 − 2), Γ2 = γ2 − γ0.

Similarly, by considering outﬂow at the other end x < a, we get

(1 − Γ0)An − Γ0µBn = e−βAn−1 + Γ0I(a) + Γ1un(a) + Γ2un−1(a).

(2.57)

Solving the linear system of equations (2.56) and (2.57) gives,

(1 − Γ0)wOut

a + µΓ0wOut

b

(1 − Γ0)2 − (µΓ0)2

, Bn =

An =

where

(1 − Γ0)wOut

b + µΓ0wOut

a

(1 − Γ0)2 − (µΓ0)2

,

(2.58)

a = e−βAn−1 + Γ0I(a) + Γ1un(a) + Γ2un−1(a),
wOut
b = e−βBn−1 + Γ0I(b) + Γ1un(b) + Γ2un−1(b)
wOut

According to the solution of the coeﬃcients An and Bn, we realize that we need to know
their values at the previous time step(An−1 and Bn−1). For utilization purposes, we set

A0 = B0 = 0 .

30

2.3 Multi-Dimensional Implicit Wave Equation Solver using ADI

Scheme

The extension to multi-dimension is formed using an alternating direction implicit (ADI)

scheme, and each line is solved independently.

The multi-dimensional wave equation can be represented using the initial boundary value

problem,

1
c2

∂2u(x, t)

∂t2 − ∇2u(x, t) = S(x, t), x ∈ Ω,

t > 0,

(2.59)

u(x, 0) = f (x), x ∈ Ω,
∂tu(x, 0) = g(x), x ∈ Ω.

with consistent boundary conditions, u(x, t) = h(x, t), x ∈ ∂Ω, t > 0, source term S(x, t),

and wave speed c

The multi-dimensional implicit scheme uses an ADI scheme [15], and each ADI line is solved

independently. Consider a scheme for three spatial dimensions,

1 − 1

=

+

(cid:16)

1
α4

∂2
∂z2
∂2
∂y2

(cid:17)(cid:16)

∂x2 +
∂2
∂x2

(cid:17)
(cid:17)(cid:16)

∂2
∂y2 +
1 − 1
α2

(cid:16) ∂2
α2∇2 = 1 − 1
α2
(cid:17) − 1
(cid:16) ∂4
1 − 1
α2
α2∇2 = LxLyLz + O(cid:0)(c∆t)4(cid:1)

∂x2∂y2 +

∂x2∂z2 +

1 − 1
α2

∂2
∂z2

∂y2∂z2

α6

∂4

∂4

(cid:17)

∂6

∂x2∂y2∂z2

(2.60)

Hence,

1 − 1

where Lx, Ly, and Lz are one dimensional univariate modiﬁed Helmholtz operators [15]

31

applied in the indicated spatial variable. Now the semi-discrete equation,

(cid:104)

β2un + un+1 − 2un + un−1(cid:105)

LxLyLz

= β2un +

β2
α2 Sn(x, t)

(2.61)

Upon inverting the modiﬁed Helmholtz operators Lx, Ly, and Lz, we have a semi-discrete

solution,

un+1 − 2un + un−1 = −β2Dxyz[un] + β2L−1

z L−1

y L−1

x

α2 Sn(cid:105)
(cid:104) 1

(x, y, z)

(2.62)

where the three dimensional operator is, Dxyz[u] := u − L−1

z L−1

y L−1

x [u],

while in two dimensions it is

un+1−2un+un−1 = −β2Dxy[un]+β2L−1

y L−1

x

α2 Sn(cid:105)
(cid:104) 1

(x, y), Dxy[u] := u−L−1

y L−1

x [u]. (2.63)

Inversion of the operator L−1

is performed sequentially leading to the usual x, y and z
“sweeps” of the ADI algorithm that is represented by a combination of the operator L−1
γ .

γ

Because the inverse operators are deﬁned along lines, it is quite natural to discretize 2D and

3D regions along Cartesian lines. However, the endpoints of these lines are not restricted to

residing at mesh points and can always be chosen to lie on the boundary ∂Ω. For example,

the x, y, boundary points for the x-sweep and y-sweep for a circle are shown in Figure 2.2.

We ﬁrst perform x-sweeps along with x-lines with actual boundary ends and obtain the

intermediate solution and then apply y-sweeps over the intermediate solution, along with y-

lines with actual boundary ends. The xy-sweeps form a complete circular boundary (Figure

2.2 - (c)). Similarly, we can perform y-sweeps ﬁrst then x-sweeps next and take an average

to obtain the solution

The general approach for implementation of the 3D ADI scheme follows the steps shown

below: Assume the number of x, y, and z lines are nx, ny, and nz respectively.

At each time step,

32

(a) x-lines

(b) y-lines

Figure 2.2: (a) x-, (b) y- and (c) xy lines with exact circular boundary points on the mesh

lines that are used for the ADI x (a) and y (b) sweeps.

(c) xy-lines

1. Perform the x-sweep

Operate L−1 on u(x, y, z) along x, and store the result into a temporary variable
wx(x, yi, zk) for 1 ≤ k ≤ nz and 1 ≤ i ≤ ny. The boundary conditions are imposed at

x = xa(ik) and xb(ik).

2. Perform the y-sweep

Operate L−1 on wx(x, y, z) along y, and store the result into a temporary variable
wyx(xj, y, zk) for 1 ≤ k ≤ nz and 1 ≤ j ≤ nx. The boundary conditions are imposed

33

at y = ya(jk) and yb(jk).

3. Perform the z-sweep

For 1 ≤ i ≤ ny and 1 ≤ j ≤ nx using wyx(x, y, z), solve for the equation un+1 =
2un + un−1 − β2Dxyz, where Dxyz = un − L−1

z [wyx]. The boundary conditions are now

applied at z = za(ij) and zb(ij).

In order to improve the accuracy of the ADI solve, the inversion of the x, y and z

Helmholtz operators is symmetrized, by averaging the results of x—y —z, y —z —x, and z
—x —y solves. That means we need to apply the operators Dxyz, Dyzx, and Dzyx and then

take the average.

We give the detailed algorithm for the two-dimensional implicit wave equation solver in

Algorithm 2 and relevant supporting functions are given in Algorithms 3 and 4. Algorithm

3 computes solution u at time tn+1 using our two dimensional implicit solver that calls
the function LINV given in Algorithm 4 to apply the operator L−1 to perform x and y

sweeps. LINV may be written in two functions for x and y sweeps separately. The function

applyBC called in Algorithm 4 is used to ﬁnd the homogeneous coeﬃcients An and Bn using

appropriate boundary conditions. Those are retrieved as the ﬁrst and second elements of the

vector H respectively. Since we have to use the value of the coeﬃcients at the previous time
step (tn−1), An−1 and Bn−1 to compute An and Bn for outﬂow boundary condition, these

algorithms, however, need to be modiﬁed with minor changes by passing reference type

parameters to the function applyBC. The parameter should be deﬁned within the main

Algorithm 2 and be called via computeU . Further, although Algorithm 2 would typically

be used to deal with problems with rectangular boundaries, it can be extended to adopt

problems with randomly placed boundary points in each grid line such as when dealing with

a circular boundary using the embedded boundary method. Here, we need to use matrices

x and y instead of vectors and deﬁne additional vectors to keep boundary points of each

horizontal and vertical grid line.

34

Algorithm 2 Two Dimensional Solver
1: xA, xB and yA , yB - boundary points along x and y respectively,
2: nx, ny - number of grid points along x and y respectively,
3: ∆t - time step size,
4: T - total time,
5: c - wave speed,
6: β - averaging parameter,
7: ps -order of accuracy in space.
8: x = grid vector(xA, xB, nx)
9: y = grid vector(yA, yB, ny)
10: set initialvalue(un−1, t0)
11: set initialvalue(un, t1)
12: α = β
c∆t
13: nt = T
∆t
14: µx = e−α(x(n)−x(0))
15: µy = e−α(y(n)−y(0))
16: expAx = e−α(x−x(0))
17: expBx = e−α(x(n)−x)
18: expAy = e−α(y−y(0))
19: expBy = e−α(y(n)−y)
20: νx = α(x(2 : nx) − x(1 : nx − 1))
21: νy = α(y(2 : ny) − y(1 : ny − 1))
22: expweightx = exp weights(νx, p)
23: expweighty = exp weights(νy, p
24: for tn = 1 to nt do
25:

26:
27:
28: end for

compute u(un−1, un, nx, ny, β, µx, µy, xA, xB, yA, yB, expAx, expBx,

=

un+1
expAy, expBy, νx, νy, expweightx, expweighty, ps)
un−1 = un
un = un+1

2.4 Arbitrary Temporal Order Accuracy

So far, we have derived multi-dimensional semi-discretizations of the wave equation with

second-order accuracy in time. In order to take large time steps in problems, we need higher-

order accuracy in time. While there is no stability restriction placed on our A-stable scheme,

considerations of accuracy present themselves when the CFL number becomes large. Indeed,

when large time steps are taken, the anisotropies introduced by the dimensional splitting are

very pronounced.

35

Algorithm 3 compute u : Compute u at time tn+1
1: un−1, un - array of u values at time tn−1 and tn respectively,
2: nx, ny - number of grid points along x and y respectively,
3: β - averaging parameter,
4: µx = e−α(x(n)−x(0)),
5: µy = e−α(y(n)−y(0)),
6: xA, xB and yA , yB - boundary points along x and y respectively,
7: expAx, expBx - exponential vector of grid distance from left boundary and right boundary

respectively along x,

8: expAy, expBy - exponential vector of grid distance from bottom boundary and up bound-

ary respectively along y,

9: νx, νy - array of weighted nodes along x and y respectively,
10: expweightx, expweighty - array of exponentially weighted nodes along x and y respec-

tively,

11: ps -order of accuracy in space
12: linv x = linv(u1, ny, β, µx, xA, xB, expAx, expBx, νx, expweightx, ps, 0)
13: linv yx = linv(Linvx, nx, β, µy, yA, yB, expAy, expBy, νy, expweighty, p, 1)
14: Dxy = un − linv yx
15: un+1 = 2un + un−1 − β2Dxy
Algorithm 4 linv : L−1
1: un - array of u values at time tn,
2: n - number of lines that need to be sweep,
3: bdryA, bdryB - boundary points along the speciﬁc line,
4: expA expB- exponential vector of grid distance from left boundary and right boundary

respectively along the line,

I = f astconvolution(u(i, :), ν, expweight, ps)
H = apply bc(u(bdryA), u(bdryB), I(bdryA), I(bdryB), β, bdryA, bdryB, µ)
Linv(i, :) = I + H(1)expA + H(2)expB

5: ν - array of weighted nodes along the line,
6: expweight - array of exponentially weighted nodes along the line,
7: p -order of accuracy in space,
8: dir - determine which sweep is going to do, along x or y determined by 0 or 1 respectively
9: if dir == 0 then
10:
11:
12:
13:
14:
15: else
16:
17:
18:
19:
20:
21: end if

I = f astconvolution(u(:, j), ν, expweight, ps)
H = apply bc(bdryA), u(bdryB), I(bdryA), I(bdryB), β, bdryA, bdryB, µ)
Linv(:, j) = I + H(1)expA + H(2)expB

end for

for i = 1 to n do

end for

for j = 1 to n do

36

In this section, we discuss A-stable schemes of arbitrary order using a MOLT formula-

tion of the wave equation, and by implicitly including higher-order derivatives. However,

we construct the derivatives using a novel approach: recursive convolution. The cornerstone

of our method lies in the fact that by using recursive applications of the convolution oper-

ators introduced in [15], the inversions of higher derivatives can be performed analytically.

So the resulting scheme is made explicit, even at the semi-discrete level. In constructing

the analytical convolution operators, we incorporate the boundary conditions directly. In

this way, without additional complexity Dirichlet and periodic boundary conditions can be

implemented to higher-order. Since dealing with outﬂow boundary conditions depend on

previous time step values at the boundaries, we need to do a little more work to implement

it in higher-order. We use the recursive convolution algorithm which consumes O(N ) oper-

ations. So, the schemes we obtain in d = 1, 2 and 3 dimensions will achieve accuracy 2P in

O(P dN ) operations per time step.

2.4.1 Arbitrary Order One Dimensional Scheme

A higher-order scheme can be achieved by including more spatial derivatives in the nu-

merical scheme. We choose a Lax-Wendroﬀ approach to increase the temporal accuracy of

our second order semi-discrete solution. This approach requires us to exchange time deriva-

tives with spatial derivatives in the Taylor expansion. Let’s start with the semi-discrete wave

equation and apply the Lax-Wendroﬀ procedure. We have

un+1 − 2un + un−1 = 2

(∂tt)m un

(2.64)

Hence, by using the fact,

(∂tt)m u =(cid:0)c2∂xx

∆t2m
(2m)!

∞(cid:88)
(cid:1)m u, m ≥ 1.

m=1

37

un+1 − 2un + un−1 = 2

= 2

∞(cid:88)
∞(cid:88)

m=1

m=1

(c∆t)2m
(2m)!

(∂xx)m un

(cid:18)∂xx

(cid:19)m

β2m
(2m)!

α2

un.

(2.65)

Diﬀerential operators (∂xx)m have to be approximated properly. To do this, we can

consider the convolution operator D from (2.31). We begin by considering the equation,

(cid:20)(cid:18)∂xx

(cid:19)m(cid:21)

α2

F

(cid:18) k

(cid:19)2m

α

= (−1)m

where F denotes the Fourier transform. Now, we need to ﬁnd (k/α)2; it can be formed using
D as follows,

ˆD := F [D] = 1 − F(cid:2)L−1(cid:3) = 1 −

1 +(cid:0) k

1

α

(cid:1)2 =

(cid:0) k
(cid:1)2
1 +(cid:0) k

α

α

(cid:1)2 .

Solving for the quantity (k/α)2 in terms of ˆD, gives an expression for all even derivatives as

follows,

(cid:18) k
(cid:18) k

α

α

(cid:19)2
(cid:19)2m

=

=

ˆD
1 − ˆD

(cid:32) ˆD

1 − ˆD

=

∞(cid:88)
(cid:33)m

p=1

ˆDp

=

and

(cid:19)

(cid:18) p − 1

m − 1

∞(cid:88)

p=m

ˆDp,

By inserting these into the Taylor expansion (2.65), we obtain

∞(cid:88)

m=1

(cid:19)

(cid:18) p − 1

m − 1

∞(cid:88)

p=m

Dp[un].

un+1 − 2un + un−1 =

(−1)m 2β2m
(2m)!

Upon reversing the order of summation, we have

un+1 − 2un + un−1 =

Ap(β)Dp[un],

(2.66)

∞(cid:88)

p=1

38

where Ap(β) is a polynomial of coeﬃcients in β2,

Ap(β) = 2

(−1)m β2m
(2m)!

p(cid:88)

m=1

(cid:19)

(cid:18) p − 1

m − 1

.

(2.67)

From this scheme (2.4.1), 1D second (identical to the original second order scheme (2.31))

and fourth order equations are

un+1 − 2un + un−1 = −β2D[un]
un+1 − 2un + un−1 = −β2D[un] −

(cid:18)

β2 − β4
12

(cid:19)

D2[un]

(2.68)

(2.69)

Algorithm 3 can be modiﬁed as follows to be able to compute the solution with higher-

order accuracy in 1D.

Algorithm 5 Compute u at time tn+1 with high-order accuracy
1: un−1, un - array of u values at time tn−1 and tn respectively,
2: n - number of points along the line,
3: β - averaging parameter,
4: µ = e−α(x(n)−x(0)),
5: bdryA, bdryB - boundary points along the speciﬁc line
6: expA, expB - exponential vector of grid distance from left boundary and right boundary

respectively along the line,

7: ν - array of weighted nodes along the line,
8: expweight - array of exponentially weighted nodes along the line,
9: ps -order of accuracy in space,
10: pt -order of accuracy in time
11: P = pt
2
12: Dterms = 0
13: for k = 1 to P do
14: D = u − linv(u, n, β, µ, bdryA, bdryB, expA, expB, ν, expweight, ps)
15: Dterms = Dterms + poly highorder(β, P )D
u = D
16:
17: end for
18: un+1 = 2un − un−1 + Dterms

Temporal cost for a scheme of order P is O(P N ) per time step for N spatial points.

Since the local truncation errors are O((c∆t/β)2P +2), the error constant will decrease with

39

Algorithm 6 poly highorder : Compute the coeﬃcients for high-order accuracy

1: β - averaging parameter,
2: k - order of accuracy at current stage
3: coef = 0;
4: for m = 1 to k do
5:
6: end for
7: coef = 2coef

coef = coef + (−1)m β2m

(cid:1)

(cid:0) k−1

m−1

(2m)!

increasing β. However, it should be limited with an upper value to maintain stability. The

stability of this scheme was proved in [15] using Von-Neumann analysis and they obtained

an upper limit of β for several even orders of temporal accuracy which is summarized in

Table 2.1. We can notice that βmax decreases with increasing P

P
Order
βmax

1
2
2

2
4

3
6

4
8

5
10

1.4840

1.2345

1.0795

0.9715

Table 2.1: The maximum values βmax for which the P th order scheme remains A-stable [14].

We need to set appropriate initial conditions to achieve the expected order. Suppose

we would like to get P th order accuracy, the initial value should be computed to O(∆t2P ).

For our 3-step scheme (2.66), u0 and u1 have to be initialized. They may be computed

analytically. However, in general, we know the exact value of u0 (= f (x)), and we need to

approximate the value for u1 = u(x, ∆t). To do so, we can apply the Lax-Wendroﬀ procedure

in Taylor expansion, we obtain

40

m=0

∞(cid:88)
∞(cid:88)
∞(cid:88)
∞(cid:88)

m=0

m=0

u1 =

=

=

=

(2m)!

t u0
∂m

∆tm
m!

(cid:18) ∆t2m
(cid:18)(c∆t)2m
(cid:18)(c∆t)2m

(2m)!

(2m)!

m=0

(cid:19)

∆t2m+1
(2m + 1)!

∂2m+1
t

u0

∂2m
t u0 +

∂2m
x u0 +

(c∆t)2m+1
(2m + 1)!

∂2m
x

1
c

∂tu0

∂2m
x f (x) +

(c∆t)2m+1
(2m + 1)!

∂2m
x

1
c

(cid:19)

(cid:19)

g(x)

.

(2.70)

where g(x) = ut(x, 0). This expansion can now be truncated at O(∆t2P ) to have P th

order of accuracy.

2.4.2 Extension to Higher Dimensions for Arbitrary Order Accuracy

We derive higher-order accurate solutions for the wave equation in higher dimensions

using an ADI scheme. We begin with the expansion of Equation (2.64) by introducing the

Laplacian in the Lax-Wendroﬀ procedure.

un+1 − 2un + un−1 = 2

∆t2m
(2m)!

(∂m

tt ) un = 2

∞(cid:88)

m=1

∞(cid:88)

m=1

β2m
(2m)!

(cid:18)∇2

(cid:19)m

α2

un.

Before approximating higher-order powers of the Laplacian operator, consider univariate
modiﬁed Helmholtz operators, and their corresponding D operators as given below,

Lγ := 1 − ∂2

γ

α2 , Dγ := 1 − L−1

γ ,

γ = x, y, z.

Since these operators satisfy identities, LγDγ[u] = L[u] − u = − ∂γγ

α2 u, the Laplacian in

3D can be written by,

−∇2
α2 = LxDx + LyDy + LzDz = LxLyLz[Cxyz],

(2.71)

41

where

Cxyz : L−1

y L−1

z Dx + L−1

z L−1

x Dy + L−1

x L−1

y Dz

Consider the D operator in three dimensions.

Dxyz := 1 − L−1

x L−1

y L−1
z .

(2.72)

(2.73)

Upon rearranging and inverting (2.73), we get an identity LxLyLz = (1 − Dxyz)

−1 , Now,

the Laplacian becomes,

−∇2
α2 = (1 − Dxyz)

−1 Cxyz.

By expanding (1−D)−m as a power series, we can construct even orders of the Laplacian as

follows,

(cid:18)∇2

(cid:19)m

α2

(cid:19)

(cid:18) p − 1

m − 1

∞(cid:88)

p=m

= (−1)mCm

xyz

Dp−m
xyz .

(2.74)

Upon omitting the subscripts, the semi-discrete scheme for 2D and 3D is

un+1 − 2un + un−1 =

=

=

(cid:18)∇2

(cid:19)m

2β2m
α2
(2m)!
(−1)m 2β2m
(2m)!

Cm

m=1

∞(cid:88)
∞(cid:88)
∞(cid:88)

m=1

p(cid:88)

(−1)m 2β2m
(2m)!

p=1

m=1

(cid:19)

un

(cid:18) p − 1
∞(cid:88)
(cid:18) p − 1
(cid:19)

m − 1

p=m

m − 1

Dp−m[un]

xyzDp−m
Cm

xyz [un].

We give second and fourth-order schemes here,

un+1 − 2un + un−1 = −β2Cxyz[un]
un+1 − 2un + un−1 = −β2Cxyz[un] −

(cid:18)

β2Dxyz − β4
12

(cid:19)

Cxyz

Cxyz[un]

(2.75)

(2.76)

(2.77)

As in the 1D case, these schemes will be unconditionally stable for all ∆t, and the same

42

range for β as shown in Table 2.1.

The algorithm for computing u in two dimensions using ADI splitting with a given order

of accuracy in time is designed as below in Algorithm 7. Using this algorithm, we can retrieve
the solution with accuracy O (pt) in time.

Algorithm 7 compute u high :Compute u in 2D with high-order accuracy
1: un−1, un - array of u values at time tn−1 and tn respectively,
2: nx, ny - number of points along x and y respectively,
3: β - averaging parameter,
4: µx = e−α(x(n)−x(0)),µy = e−α(y(n)−y(0)),
5: xA, xB, and yA, yB - boundary points in x and y directions respectively,
6: expAx, expBx - exponential vector of grid distance from left boundary and right boundary

respectively along x,

7: expAy, expBy - exponential vector of grid distance from bottom boundary and up bound-

ary respectively along y,

8: νx, νy - array of weighted nodes along x and y respectively,
9: expweightx, expweighty - array of exponentially weighted nodes along x and y respec-

tively,

expAy, expBy, νx, νy, expweightx, expweighty, ps)

10: ps and pt - order of accuracy in space and time respectively
11: P = pt
2
12: struct outarray(0)
13: cd terms = −β2structoutarray(0)
14: for k = 2 to P do
15:
16:

=

compute c(u, nx, ny, β, µx, µy, xA, xB, yA, yB, expAx, expBx,

struct inarray = struct outarray
struct outarray(1) = compute cd(struct inarray(0), nx, ny, β, µx, µy, xA, xB, yA,
yB, expAx, expBx, expAy, expBy, νx, νy, expweightx, expweighty, ps)
for i = 2 to k − 1 do

struct outarray(i) = compute d(struct inarray(i), nx, ny, β, µx, µy, xA, xB, yA, yB,
expAx, expBx, expAy, expBy, νx, νy, xpweightx, expweighty, ps)

17:
18:

cd terms = cd terms + ploy highorder(β, i, k)struct outarray(i − 1)

end for
for i = k to 1 do

19:
20:
21:
22:
23: end for
24: un+1 = 2un − un−1 + cd terms

end for

Two unique functions were designed to compute solutions for the convolution operators D
(Algorithm 9) and C (Algorithm 8) by leveraging the fast convolution algorithm (Algorithm
1). Even though these two algorithms can be used together to compute the operators C and

43

D by applying them on the same vector of values at any stage, we need a separate function

as given in Algorithm 10 in order to avoid redundant calculations and follow up issues with

computing certain homogeneous boundary coeﬃcients. For this algorithm, we expand the

equation
Cxy = L−1

y Dx + L−1

x Dy as follows using Dγ = 1 − L−1

,

γ

Cxy = (L−1

y + L−1

x ) − (L−1

y L−1

x + L−1

x L−1
y )

(2.78)

The Algorithm 6 computes the coeﬃcients of the equation (2.75) for higher dimension,

higher-order accuracy in terms of β using (2.81).
Algorithm 8 compute c : Compute operator C in 2D; Cxy = L−1
1: u - array of u values at time t,
2: nx, ny - number of points along x and y respectively,
3: β - averaging parameter,
4: µx = e−α(x(n)−x(0)),
5: µy = e−α(y(n)−y(0)),
6: xA, xB, and yA, yB - boundary points along x and y respectively,
7: expAx, expBx - exponential vector of grid distance from left boundary and right boundary

y Dx + L−1

x Dy

respectively along x,

8: expAy, expBy - exponential vector of grid distance from bottom boundary and up bound-

ary respectively along y,

9: νx, νy - array of weighted nodes along x and y respectively,
10: expweightx, expweighty - array of exponentially weighted nodes along x and y respec-

tively,

11: ps -order of accuracy in space
12: Dx = u − Linv(u, ny, β, µx, xA, xB, expAx, expBx, νx, expweightx, ps, 0)
13: linv yDx = linv(Dx, nx, β, µy, yA, yB, expAy, expBy, νy, expweighty, ps, 1)
14: Dy = u − linv(u, nx, β, µy, yA, yB, expAy, expBy, νy, expweighty, ps, 1)
15: linv xDy = linv(Dy, ny, β, µx, xA, xB, expAx, expBx, νx, expweightx, ps, 0)
16: Cxy = linv yDx + linv xDy

2.5 Treatment for Variable Speed Wave Propagation

This procedure of exchanging time derivatives with spatial derivatives in the semi-discrete

equation using the Lax-Wendroﬀ procedure will also work in the case of variable wave speeds.

44

Algorithm 9 compute d :Compute operator D in 2D; Dxy = 1 − L−1
1: u - array of u values at time t, nx, ny - number of points along x and y respectively,
2: β - averaging parameter,
3: µx = e−α(x(n)−x(0)),
4: µy = e−α(y(n)−y(0)),
5: xA, xB, and yA, yB - boundary points along x and y directions respectively,
6: expAx, expBx - exponential vector of grid distance from left boundary and right boundary

y L−1

x

respectively along x,

7: expAy, expBy - exponential vector of grid distance from bottom boundary and up bound-

ary respectively along y,

8: νx, νy - array of weighted nodes along x and y respectively,
9: expweightx, expweighty - array of exponentially weighted nodes along x and y respec-

tively,

10: ps -order of accuracy in space
11: linv x = linv(u, ny, β, µx, xA, xB, expAx, expBx, νx, expweightx, ps, 0)
12: linv yx = linv(linv x, nx, β, µy, yA, yB, expAy, expBy, νy, expweighty, ps, 1)
13: Dxy = u − linv yx
Algorithm 10 compute cd : Compute operators C and D in 2D; Dxy and Cxy
1: u - array of u values at time t, nx, ny - number of points along x and y respectively,
2: β - averaging parameter,
3: µx = e−α(x(n)−x(0)),
4: µy = e−α(y(n)−y(0)),
5: xA, xB, and yA, yB - boundary points along x and y respectively,
6: expAx, expBx - exponential vector of grid distance from left boundary and right boundary

respectively along x,

7: expAy, expBy - exponential vector of grid distance from bottom boundary and up bound-

ary respectively along y,

8: νx, νy - array of weighted nodes along x and y respectively,
9: expweightx, expweighty - array of exponentially weighted nodes along x and y respec-

tively,

10: ps -order of accuracy in space
11: linv x = linv(u, ny, β, µx, xA, xB, expAx, expBx, νx, expweightx, ps, 0)
12: linv y = linv(u, nx, β, µy, yA, yB, expAy, expBy, νy, expweighty, ps, 1)
13: linv yx = linv(linv x, nx, β, µy, yA, yB, expAy, expBy, νy, expweighty, ps, 1)
14: linv xy = linv(linv y, ny, β, µx, xA, xB, expAx, expBx, νx, expweightx, ps, 0)
15: Dxy = u − linv yx
16: Cxy = linv x + linv y − (linv yx + linv xy)

The wave equation with variable wave speed can be written as,

utt = [c(x)]2uxx

(2.79)

45

Assume that the speed is ﬁnite and bounded as c1 ≤ c(x) ≤ c2. It can be normalized by
¯c(x) = c(x)/c2. We choose α = β

β > 0. Thus,

c2∆t,

un+1 − 2un + un−1 = 2

m=1

m=1

∞(cid:88)
∞(cid:88)
∞(cid:88)
∞(cid:88)
∞(cid:88)

m=1

m=1

p=1

= 2

= 2

=

=

p(cid:88)

m=1

∆t2m
(2m)!

∆t2m
(2m)!

β2m
(2m)!

(∂tt)mun

(cid:0)c2(x)∂xx
(cid:18)

¯c2(x)

∂xx
α2

(−1)m 2β2m
(2m)!

¯c2m(x)

Ap(β)Dp[un], .

(cid:1)m un
(cid:19)m
∞(cid:88)

p=m

un

(cid:18) p − 1

(cid:19)

m − 1

Dp[un]

(2.80)

.

(2.81)

(cid:19)

(cid:18) p − 1

m − 1

(−1)m (β¯c(x))2m
(2m)!

and the polynomial of coeﬃcients will be,

Ap(β) = 2

2.6 Numerical Results

We present a set of test cases that evaluates our framework; speciﬁcally, we consider

problems with variable/piecewise constant wave speed in one, two, and three dimensions,

using a Gaussian pulse as an initial source or a point source. For 2D, we choose a classic

EM problem - a transverse magnetic (TM) mode photonic crystal waveguide.

2.6.1 Convergence Studies

2.6.1.1 Higher Order Wave Solvers in Two Dimension

In this section, we consider a two-dimensional fourth-order temporally accurate solver.
A square domain (Ω = [−1, 1]× [−1, 1]) is chosen to assess the two dimensional higher-order

solver. We place a point source cos(ωt) at the center of the domain, (0, 0) and apply diﬀerent

46

boundary conditions.

(a) Dirichlet boundary condition

(b) Periodic boundary conditions

Figure 2.3: Fourth order time convergence of the 2D wave solver using Dirichlet (a) and
periodic (b) boundary conditions with ∆x = ∆y = 6.25 × 10−3. This is a self-
reﬁnement study which measures L∞ norm of the error at time T = 2.0 on a
square domain Ω = [−1, 1]2 with a point source cos(ωt), ω = 1 at the center of
the box, (0, 0). The CFL (= c∆t
∆x ) value changes proportional to the time step
size ∆t with ﬁxed spatial step size.

47

Figure 2.3 shows fourth order time convergence obtained for Dirichlet and periodic bound-
ary conditions by performing a reﬁnement study on a square domain Ω = [−1, 1] × [−1, 1]

with a point source cos(ωt), ω = 1 at the center of the box, (0, 0). The CFL (= c∆t
∆x )

value changes proportional to the time step size ∆t with ﬁxed spatial step size, ∆x = ∆y =
0.00625. We consider a ﬁnal time T = 2.0, with a ﬁxed spatial resolution of 320× 320 spatial

points. The discrete L∞ norm of the error is constructed at each time step and the maximum

over all time steps is used to generate the graph in Figure 2.3.

2.6.1.2 Higher Order Wave Solvers in Three Dimensions

We ﬁrst show fourth order convergence for Dirichlet boundary condition by performing a
time reﬁnement study on a cubic domain Ω = [−1, 1] × [−1, 1] × [−1, 1] with a point source

cos(ωt) at center of the domain, (0, 0, 0). This runs up to time T = 2.0, with a ﬁxed spatial
resolution of 160× 160× 160 spatial points. The discrete L∞ norm of the error is constructed

at each time step and the maximum error over all time steps is used to graph Figure 2.4 for
∆t = 625e−2

, k = 1 to 5, with Dirichlet boundary conditions.

2k

2.6.2 Three Dimensional Waves

In this section, we validate our three dimensional solver using a cubic domain (Ω =
[−1, 1] × [−1, 1] × [−1, 1]). We set the spatial grid size as ∆x = ∆y = ∆z = 0.0625
(32 × 32 × 32 grid points) and the time step size to ∆t = 0.0313. We successfully tested the

solver for a point source by applying Dirichlet, periodic and outﬂow boundary conditions

along the cubic faces. Figures 2.5, 2.6, and 2.7 show the ﬁeld of a point source cos(ωt), ω = 1
that is placed at (0, −1 + 3∆x, 0) using Dirichlet, periodic, and outﬂow boundary conditions

along the six faces respectively.

In each case, a three dimensional wave generated by a point source (Figure 2.5, 2.6,

and 2.7) smoothly propagates in a cube and obeys applied boundary conditions along the

surfaces of the cube.

48

(a) ω = 0.1

(b) ω = 1

Figure 2.4: Fourth order convergence of 3D wave solver using Dirichlet boundary conditions
with ∆x = ∆y = ∆z = 1.25 × 10−2 ω = 1. This is a self-reﬁnement study which
measures L∞ norm of the error at time T = 2.0 on a cubic domain Ω = [−1, 1]3
with a point source cos(ωt) at the center of the cube, (0, 0, 0) for (a) ω = 0.1
and (b) ω = 1. The CFL (= c∆t
∆x ) value changes proportional to the time step
size ∆t with ﬁxed spatial step size.

49

Figure 2.5: Time evolution of a point source ﬁeld cos(ωt), ω = 1 within a cubic domain
(Ω = [−1, 1]3) by imposing Dirichlet boundary condition along the surface with
the spatial step size, ∆x = ∆y = ∆z = 0.0625 ω = 1 and the time step size,
∆t = 0.0313.

Figure 2.6: Time evolution of a point source ﬁeld cos(ωt), ω = 1 within a cubic (Ω = [−1, 1]3)
by imposing periodic boundary condition along the surface with the spatial step
size, ∆x = ∆y = ∆z = 0.0625 ω = 1 and the time step size, ∆t = 0.0313.

Figure 2.7: Time evolution of a point source ﬁeld cos(ωt), ω = 1 within a cubic (Ω = [−1, 1]3)
by imposing outﬂow boundary condition along the surface with the spatial step
size, ∆x = ∆y = ∆z = 0.0625 ω = 1 and the time step size, ∆t = 0.0313.

2.6.3 Waves with Variable Speeds

2.6.3.1 One Dimensional Case

A one dimensional wave travels through two diﬀerent consecutive domains, Ω1 ∈ [−1, 0]
and Ω2 ∈ [0, 1] with piecewise constant speed c1 = 1.0 and c2 = 2.0 respectively. The solution

50

until the wave reaches the location of discontinuity, x = 0 is,

u(x, t) = aie−η(x−c1(t−t0))2

(2.82)

where ai is the initial amplitude, η is the Gaussian shape parameter, and t0 is a time oﬀset

which delays the Gaussian pulse.

A portion of the traveling wave in domain one, Ω1 will be reﬂected at the location of interface,

x = 0, while the remaining waves are transmitted to domain two, Ω2. Thus, we need to

consider the combination of transmitted, reﬂected, and incident waves from the location of

the interface, x = 0 and onward. The left domain (Ω1) will have components of the incident

and reﬂected waves, and the right domain (Ω2) will have the transmitted component of the

original wave.

In this way, the solution obeys the following equations,

u1(t) = aieη(x−c1(t−t0))2 + areη(x+c1(t−t0))2 x < 0
u2(t) = ate¯η(x−c2(t−t0))2 x ≥ 0,

where, ar and at are amplitudes of the reﬂected and transmitted wave respectively and ¯η

is the shape parameter of the transmitted Gaussian pulse. Because of the zero transverse

displacement at the interface, u1(0, t) = u2(0, t), that gives us,

(ai + ar)eηc2

1(t−t0)2 = ate¯ηc2

2(t−t0)2.

By equating the coeﬃcients we get,

ai + ar = at

ηc2

1 = ¯ηc2
2.

51

(2.83)

(2.84)

From energy conservation, we obtain

a2
i√
η

=

a2
r√
η

+

a2
t√
¯η

.

(2.85)

Upon solving equations (2.84), and (2.85), we obtain

,

2aic2

at =
(c1 + c2)
ar = ai − at.

, and ar = 1 − at. Further,
For our ﬁrst numerical example, we choose ai = 1, so at = 2c2
c1+c2
we chose c1 = 1.0 and c2 = 2.0, and apply our solver along the domain Ω1 ∪ Ω2 with 1024

uniform grid points of size ∆x = 0.002, while maintaining a CFL of 2.5 by choosing time step

∆t = 0.005. Figure 2.8 shows time snapshots of the moving wave using outﬂow boundary

conditions at the left and right boundaries. The numerical results agree closely with the

theoretical solutions as can be seen in the ﬁgure.

Figure 2.8: 1D wave travelling over the two domains Ω1 and Ω2 with piecewise constant speed
c1 = 1.0 and c2 = 2.0. For this experiment, we choose spatial step ∆x = 0.002,
and time step ∆t = 0.005.

A problem with multiple domains was selected as our second numerical test case with

the same spatial step size, ∆x = 0.002, and time step size ∆t = 0.005, hence CFL remains

2.5. The entire domain has eight equal width sub-domains of size 0.25, and the wave travels

at the same speed in every bi-domain. The ﬁrst sub-domain is between -1 and -0.75, has

waves travelling with speed c1 (=1.0); the wave speed is c2 (= 2.0) in the even-numbered

sub-domains.

52

Figure 2.9 shows time snapshots of the moving wave using outﬂow boundary conditions at

the external left and right boundaries.

Figure 2.9: 1D wave travelling over the layered media with eight domains and the wave
travels with the same speed in every bi-domain ( c1 = 1.0 and c2 = 2.0). For
this experiment, we choose spatial step ∆x = 0.002, and time step ∆t = 0.005
along layered media with piecewise constant wave speed.

2.6.3.2 Two Dimensional Case

In this section, we consider the two-dimensional variable speed solver. A square domain

with square patches is chosen to assess the two-dimensional variable speed solver. Due to

the material property of patches, waves travel with diﬀerent speeds through these patches
compared with the remaining area. We choose a Gaussian pulse e−25(x2+y2) as an initial

solution and apply outﬂow boundary condition along the border of the square domain (Ω =
[−1, 1] × [−1, 1]). Since we use spatial step size ∆x = ∆y = 0.02 and time step size

∆t = 0.005, applicable CFL is 0.25. We demonstrate two test cases: One and Four patches

are placed for case-i and case-ii respectively as shown in Figure 2.10

Case-i - Single patch

A 0.5 × 0.5 square patch is placed at the left top corner centered at, (-0.5, 0.5). The wave

speed is set to be c2(= 0.1) in the patch, and c1(= 1.0) in the remaining area. Figure 2.11

shows time snapshots of the solution.

Case-ii - Four patches

There are four 0.5 x 0.5 square patches placed at the four corners; i.e., centered at (-0.5,

0.5), (-0.5, -0.5), (0.5, 0.5), and (0.5, -0.5). The wave speed is set to be c2(= 0.1) in every

patch, and c1(= 1.0) in the remaining area. Figure 2.12 shows time snapshots of the wave

53

(a) single patch

(b) 4 patches

Figure 2.10: Geometrical view of 0.5× 0.5 (a) one and (b) four square patches in the square
domain Ω = [−1, 1]2 with c2(= 0.1) and c1(= 1.0) are the wave speeds in the
patches and the remaining area.

Figure 2.11: Time evolution of a Gaussian ﬁeld e−25(x2+y2) in a square domain Ω = [−1, 1]2
with a 0.52 square patch as shown in Figure 2.10-(a). Here, the spatial step
size, ∆x = ∆y = 0.02 and the time step size, ∆t = 0.005.

at diﬀerent time instants.

Figure 2.12: Time evolution of a Gaussian ﬁeld e−25(x2+y2) in a square domain Ω = [−1, 1]2
with four 0.52 square patches as shown in Figure 2.10-(b). Here, the spatial
step size, ∆x = ∆y = 0.02 and the time step size, ∆t = 0.005.

54

2.6.3.3 Three Dimensional Cases

In this section, we discuss the three-dimensional variable speed wave-solver. A cubic

domain with cubic patches is chosen to assess the three-dimensional variable speed solver.

Due to the material property of patches, waves travel with diﬀerent speeds through these
patches compared with the remaining area. We place a Gaussian pulse e−36(x2+y2+z2) as

an initial solution and apply outﬂow boundary conditions along the border of the cube
(Ω = [−1, 1] × [−1, 1] × [−1, 1]). Since we use a spatial grid size ∆x = ∆y = ∆z = 0.0625
(32 × 32 × 32 grid points) and time step size ∆t = 0.0313, applicable CFL is 0.5. We

demonstrate two test cases: one, and four patches for case-i and case-ii respectively as

shown in Figure 2.13

(a) single patch

(b) 4 patches

Figure 2.13: Geometrical view of 0.5 × 0.5 × 0.5 (a) one and (b) four cubic patches in the
cubical domain Ω = [−1, 1]3 with c2(= 0.1) and c1(= 1.0) are the wave speeds
in the patches and the remaining area.

Case-i - Single patch

A 0.5 × 0.5 × 0.5 cubic patch is placed at the left top corner centered at, (-0.5, 0, 0.5). The

wave speed is set to be c2(= 0.1) in the patch, and c1(= 1.0) in the remaining area. Figure

2.14 shows snapshots of the wave at diﬀerent time instants.

Case-ii - Four patches

There are four 0.5× 0.5× 0.5 cubic patches placed at the four corners; i.e., centered at (-0.5,

55

Figure 2.14: Time evolution of a Gaussian ﬁeld e−25(x2+y2+z2) in a cubical domain Ω =
[−1, 1]3 with a 0.53 cubic patch as shown in Figure 2.13-(a). Here, the spa-
tial step size, ∆x = ∆y = ∆z = 0.0625 and the time step size, ∆t = 0.0313.

0, 0.5), (-0.5, 0, -0.5), (0.5, 0, 0.5), and (0.5, 0, -0.5). The wave speed is set to be c2(= 0.1) in

every patch, and c1(= 1.0) in the remaining area. Figure 2.15 shows snapshots of the wave

at diﬀerent time instants.

Figure 2.15: Time evolution of a Gaussian ﬁeld e−25(x2+y2+z2) in a cubical domain Ω =
[−1, 1]3 with four 0.53 cubic patches as shown in Figure 2.13-(b). Here, the
spatial step size, ∆x = ∆y = ∆z = 0.0625 and the time step size, ∆t = 0.0313.

The 3D Gaussian pulse smoothly travels through the diﬀerent material interfaces provided

by patches (one in Figure 2.14 and four in Figure 2.15) and the rest of the area.

2.6.4 Waveguide in a Photonic Crystal

2.6.4.1 Problem Deﬁnition

We study the propagation of light in a photonic crystal, a low-loss periodic dielectric

medium, in which the atoms or molecules are replaced by macroscopic media with diﬀerent

56

dielectric constants, and the periodic potential is replaced by periodic dielectric functions

[43]. These are very useful in applications such as the design of ﬁber-optic cables, laser

engineering, high-speed computing, and spectroscopy.

In order to study this propagation, we ﬁrst cast Maxwell equations written in terms of

the electric ﬁeld, E and magnetic ﬁeld H in the form of a second-order wave equation.

2.6.4.2 Geometry of the Problem

A two-dimensional photonic crystal which is periodic in the x, y directions and homo-

geneous in the z direction is chosen as the test domain. The crystal has photonic band

gaps in the xy plane and it can prevent light from propagating in any direction within the

plane. This system has discrete translational symmetry in the xy plane, and this symmetry

property can be used to characterize its electromagnetic modes. Any modes that propagate

in the xy plane are invariant under reﬂections through the plane. The Transverse Magnetic

(TM) modes have the H ﬁeld in the plane and, the E ﬁeld normal to the plane. As shown

in Figure 2.16, a line defect is introduced along the y direction by removing a line of rods in

a set of alumina rods. This can be represented using a square lattice of cylindrical rods in

the air (Figure 2.17).

Figure 2.16: Schematic illustration of a line defect (in red) within a set of cylindrical rods

(in green) on a photonic crystal.

57

In our model, we chose an m-by-n lattice that has dielectric rods which are periodic along

x (n rods) and y (m rods) axes with the lattice constant a. The cylindrical rod with radius r

and dielectric constant  is homogeneous along the z direction (we imagine the cylinders are

inﬁnitely tall). A column of rods is removed to produce a cavity, so it has reﬂecting walls on

both sides along the y direction. We consider the waveguide in TM mode using the following

boundary conditions: periodic in the x direction and outﬂow in the y direction. We assume

that there are no free charges, so only the dielectric constants are changing in space due to

the rods.

Figure 2.17: 2D Lattice of cylindrical rods which are periodic along x and y axes with lattice

constant a on a photonic crystal.

2.6.4.3 Result and Discussion

We performed numerical simulations of a waveguide on the photonic crystal using square
and cylindrical rods. A rectangular (3.9 × 2.1) lattice of rods was chosen, the rods were

placed in x, y directions with distance a = 0.3. The dielectric constant of rods was chosen

to be  = 8.9. For the ADI scheme, we chose nx = ny = 100, with time step t = 0.0105.

Using square rods

The set of square rods of size r = 0.38a is modeled as shown in Figure 2.18, and we obtained

the result as shown in Figure 2.19 using the boundary conditions: periodic in the direction

58

x and outﬂow in the direction y for a point source e(−iωt), where ω = cky, ky = 0.88(2a),
c = 1√

,  = 8.9, and µ = 1.

(µ)

(a) 3D

(b) 2D

Figure 2.18: Square lattice of square rods of side length 0.38a with lattice constant a and
dielectric constant  = 8.9 in 3D (a) and 2D (b), a linear defect is formed by
removing a column of rods.

Figure 2.19: The wave generated by the point source, e(−iωt), where ω = cky, ky = 0.88(2a),
,  = 8.9, and µ = 1 travelling through the photonic waveguide using

c = 1√
the model shown in Figure 2.18

(µ)

Using cylindrical rods

We replaced the square rods with cylindrical rods and used the same parameter choices as

in case i (Figure 2.20).

We obtained the result as shown in Figure 2.21 using the following boundary conditions:

periodic in x and outﬂow in y

Further, we tested the model for a source-free wave equation with initial solution e(−25(x2+y2))

using the same model (Figure 2.20) and obtained the result as shown in Figure 2.22. It is

59

(a) 3D

(b) 2D

Figure 2.20: Square lattice of cylindrical rods of side length 0.38a with lattice constant a
and dielectric constant  = 8.9 in 3D (a) and 2D (b), a linear defect is formed
by removing a column of rods.

Figure 2.21: The wave generated by the point source, e(−iωt), where ω = cky, ky = 0.88(2a),
,  = 8.9, and µ = 1 travelling through the photonic waveguide using

c = 1√
the model shown in Figure 2.20

(µ)

important to know that with the embedded boundary method, such as we are pursuing here,

it is no ”harder” to introduce an embedded boundary method for a square then it is for a

circle.

Figure 2.22: The wave, e(−25(x2+y2)) traveling through the photonic waveguide using the

model shown in Figure 2.20.

60

2.7 Summary

We derived the implicit semi-discrete solution for the multi-dimension wave equation sub-

ject to several applicable boundary conditions including Dirichlet, periodic, Neumann, and

outﬂow boundary conditions. We extended the second-order scheme to arbitrary higher-

order using Dirichlet and periodic boundaries on Cartesian grids. The P th order scheme

consumes O(P N ) time and O(P 2N ) space for the domain with N grid points, it can be

reduced to ideally linear time and space complexity O(N ) for considerably large problems

(N >>). The three-dimensional high-order scheme is implemented for Dirichlet and pe-

riodic boundary conditions and evaluated using diﬀerent applications including a variable

speed wave. In the next chapter, we describe the derivation of high-order outﬂow boundary

conditions.

61

CHAPTER 3: Higher Order Non-reﬂecting Boundary Condition

3.1 Introduction

In this chapter, we develop a high-order non-reﬂecting boundary condition that can be

applicable for curved boundaries based on the scheme described in Chapter 2.

A way to deﬁne a ﬁnite computational domain in the inﬁnite physical domain is by using

an artiﬁcial boundary which is called non-reﬂecting or open boundary condition. Since we

develop simulation of A6 magnetron with diﬀraction output (A6 MDO), we need a scheme

that successfully handles non-reﬂecting boundary conditions, even for curved boundaries.

The approach described in Chapter 2 gives second-order accuracy for the outﬂow boundary

condition. Basically, this approach extends the spatial domain for a speciﬁc time instance

and switches spatial integration into time integration at the boundary ends. We use higher-

order interpolation polynomial based on a higher-order ﬁnite-diﬀerence stencil for the time

integration approximation and apply a suitable initial condition to reach high-order accuracy.

Historically, the open boundary condition was developed for outgoing wave propagation

from the Sommerfeld radiation condition [66] by Zienkiewicz and Newton [74]. The wave-

based interpretation of this boundary condition is more applicable for EM wave propagation

[25; 60; 63]. Later, Higdon developed a sequence of increasing-order boundary conditions at

various incidence angles [35]. The absorbing layer-based method, Perfectly Matched Layer

(PML), is found to be the most popular method. It was developed by Berenger [6] for the

2-D Maxwell equations, and has been extended to 3D. PML imposes both losses of electric

and magnetic ﬁelds, identifying the correct loss term for PML so that the outgoing waves

not reﬂected in the domain can be problem-speciﬁc. However, it is diﬃcult to choose an

optimal conductive parameter(σ) for the best performance of the PML. Another approach,

62

the sponge layer method [16], requires extra grid points to make padding which implements

the functions to prevent the reﬂection. Kirchhoﬀ’s formula-based method developed by Ting

and Miksis [69] also requires more computation.

Recently, Bradley [1] introduced a high-order nonreﬂecting boundary condition which

uses a compressed nonreﬂecting boundary kernel to achieve high-order for 2D and 3D cases.

This boundary kernel approach is very complicated and consumes O(M N logN ) work in 2D

and O(M N log2N ) work in 3D, for M time steps and N spatial grid points. Without using

the complicated boundary kernel approach, our higher-order scheme for outﬂow boundary

condition works well for multi-dimensional problems with consuming O(P M N ) works, for

the temporal order of accuracy 2P [14]. For arbitrarily big problems, M N >> P , so the

complexity will be reduced to O(M N ).

In this chapter, we discuss 4-th order derivation for outﬂow boundary conditions. In Sec-

tion 3.2, we derive the high-order outﬂow boundary condition and give a concrete deﬁnition.

Section 3.3 gives the suitable initial condition for this approach, and ﬁnally we report a set

of test cases in Section 3.4.

3.2 Extension to Higher Order in Accuracy

We need to apply boundary conditions appropriately to obtain high order accuracy.

Dirichlet, Neumann, and periodic boundary conditions can be implemented straightforwardly

( see [15] for details) using the approach shown in [14]. Since we have to know the behavior
of the wave equation outside of the domain x ≤ a and x ≥ b, the outﬂow boundary condition

requires little additional computation as detailed in the following.

In order to obtain higher-order outﬂow boundary conditions, we consider more terms in

the Taylor series. Note, equation (2.52) uses three terms for second-order accuracy. We

use the same approach as explained previously in section 2.2.3.4. Besides, we perform some

iterations along the boundary stencil to ensure convergence. For this procedure, we need

to know the solution at time steps, tn+1, tn+2, and so on, if we choose the centered ﬁ-

63

nite diﬀerence stencil to derive equations with higher-order accuracy (as was used to derive

second-order outﬂow boundary conditions). A problem with this approach is that we need

data at a future time, tn+2, which we don’t have direct access to.

Instead, we prefer to

use one-sided backward ﬁnite diﬀerence stencils to obtain higher-order accuracy and obtain

boundary coeﬃcients explicitly.

Let us derive outﬂow boundary conditions with fourth order accuracy. First, we construct

a time interpolant at the right boundary (x > b) using a Taylor series expression of the form

(Here z = s/∆t as introduced in 2.52 )

u(b, tn − z∆t) ≈ u(b, tn) − z∆tut(b, tn) +

z2∆t2

2

utt(b, tn) − z3∆t3

6

uttt(b, tn) +

z4∆t4

24

utttt(b, tn)

(3.1)

By truncating higher order error terms, we only need to approximate the ﬁrst time derivative

to fourth order accuracy, second time derivative to third order accuracy, third time derivative

to second order accuracy, and fourth time derivative to ﬁrst order accuracy. To perform this

approximation, we use a ﬁve point backward ﬁnite diﬀerence stencil to approximate ut, utt,

uttt, and utttt to the desired order of accuracy. we obtain,

u(b, tn − z∆t) ≈ un(b) − z

(cid:16)25

12

(cid:16)35
(cid:16)5
(cid:16)

un−1(b) +

un(b) − 4un−1(b) + 3un−2(b) − 4
3
un−3(b) +

un(b) − 26
3
12
un(b) − 9un−1(b) + 12un−2(b) − 7un−3(b) +
2
un(b) − 4un−1(b) + 6un−2(b) − 4un−3(b) + un−4(b)

un−3(b) +
11
12
un−4(b)

un−2(b) − 14
3

19
2

3
2

(cid:17)

z2
+
2
− z3
6
z4
24

+

un−4(b)

1
4
un−4(b)

(cid:17)

(cid:17)

(cid:17)

(3.2)

Integrating this expression analytically using Lemma 2.1, we arrive at

Bn = e−βBn−1 + γ0un(b) + γ1un−1(b) + γ2un−2(b) + γ3un−3(b) + γ4un−4(b)

(3.3)

64

where,

E3(β) + E4(β),

35
12

E2(β) − 5
2

γ0 = E0(β) − 25
E1(β) +
12
E2(β) + 9E3(β) − 4E4(β),
γ1 = 4E1(β) − 26
3
E2(β) − 12E3(β) + 6E4(β),
γ2 = −3E1(β) +
19
2
E2(β) + 7E3(β) − 4E4(β),
14
3
E2(β) − 3
11
2
12

4
γ3 =
3
γ4 = −1
4

E3(β) + E4(β).

E1(β) +

E1(β) +

(3.4)

Likewise, by considering the left boundary x < a, we get

An = e−βAn−1 + γ0un(a) + γ1un−1(a) + γ2un−2(a) + γ3un−3(a) + γ4un−4(a).

(3.5)

We can also compute the coeﬃcients for second-order accuracy using the explicit approach

in this form,

where,

An = e−βAn−1 + γ0un(a) + γ1un−1(a) + γ2un−2(a).
Bn = e−βBn−1 + γ0un(b) + γ1un−1(b) + γ2un−2(b).

E1(β) + E2(β),

γ0 = E0(β) − 3
2
γ1 = 2E1(β) − 2E2(β),
γ2 = −1
2

E1(β) + E2(β).

(3.6)

(3.7)

Now we have equations to compute the homogeneous boundary coeﬃcients An and Bn

to second and fourth-order accuracy. Note that the boundary constants corresponding to

operator D[u] (denoted by An

1 , Bn

1 ) are independent of the boundary constants corresponding

65

to operator D2[u] (denoted by An

2 , Bn

2 ). Therefore, our fourth-order wave solution can be

constructed on two levels.

• Level 1

Compute D[u] using u; An

1 and Bn

1 are obtained by second order solution implicitly

(2.58) or explicitly (3.6).

• Level 2

Compute D2[u] using D[u]; An

2 and Bn

2 are obtained by fourth order solution (3.3) and

(3.5) explicitly

Now, we provide algorithms for fourth-order outﬂow boundary conditions in one dimen-

sion. We ﬁrst modify Algorithm 5 to work for outﬂow boundary conditions with fourth-order

accuracy in time. This Algorithm 11 computes γ coeﬃcients (see 3.7 and 3.4 ) using second-

order centered, and fourth-order backward ﬁnite diﬀerent stencils in Level 1 and Level 2

computations respectively. Initially, outﬂow boundary coeﬃcients (vector H) are set to zero,

meaning boundary coeﬃcients at time step t0 are zero, and previous solutions at boundaries

are maintained (ubppr) for the computation.

During the computation of L−1 using outﬂow (Algorithm 12), we use an iterative method

to ensure the solution converges. In each iteration, it computes boundary coeﬃcients and

updates the solutions within the boundary stencil. We perform the iteration until it satisﬁes
a criterion |u − utemp|∞ < tol where tol is a tolerance which may be chosen to be quite

small. We developed an Algorithm 13 to compute outﬂow boundary coeﬃcients using the

previously explained approach.

A general procedure to compute the γ coeﬃcients for any order based on a given ﬁnite dif-

ference stencil is designed in Algorithm 14. This algorithm utilizes a function f dcoef f F (k, ¯x, x)

published by [49] to compute coeﬃcients for ﬁnite diﬀerence approximations of the derivative

of order k at ¯x based on grid values at points in the boundary stencil x. In Algorithm 15,

the function E represents the expression deﬁned in lemma 2.1

66

Algorithm 11 Compute u at time tn+1 by imposing outﬂow boundary conditions
1: un, un−1, and un−2 - array of u values at time tn, tn−1, and tn−2 respectively,
2: bdryA, bdryB - boundary points,
3: expA expB- exponential vector of grid distance from left boundary and right boundary

respectively along the line,

4: ν - array of weighted nodes along the line,
5: expweight - array of exponentially weighted nodes along the line,
6: ps, pt -order of accuracy in space and time respectively,
7: xA, xB - left and right boundary points respectively.
8: stimp2 = 1 : −1 : 1
9: stexp2 = 0 : −1 : −4
10: g(1, :) = gamma(β, 2, stimp2)
11: g(2, :) = gamma(β, 4, stexp4)
12: P = 2
13: Dterms = 0
14: for k = 1 to P do
15: D

=

u − linv out(un, β, bdryA, bdryB, expA, expB, ν, expweight, ps, 2k, H,

ubppr, g, maxit, tol)

16:
17: Dterms = Dterms + poly high(β, P )D
u = D
18:
19: end for
20: un+1 = 2un − un−1 + Dterms

3.3 Appropriate Initial Condition

Our MOLT based implicit numerical scheme computes solution un+1 at time tn+1 using
the solutions un and un−1 at previous time steps tn, and tn−1. Therefore our initial condition

should be handled carefully in order to avoid order reduction of the scheme. In Causley’s

higher order paper [14] he introduced a technique to approximate u1 at time t1 using the

initial solution at time t0, u0.

However, since our high-order outﬂow scheme requires solutions un, un−1, un−2, un−3,
and un−4 at the boundary points for the time levels tn, tn−1, tn−2, tn−3, and tn−4 to compute

homogeneous coeﬃcients, the above approximation is not enough to go with high-order

outﬂow scheme. Therefore, we require the initial solution u1, u2, u3, and u4 to use our

higher order scheme. In order to obtain those values, we ﬁrst use the second order scheme

with the initial approximation of u1 and obtain u2, u3, and u4, then afterwords we use our

67

Algorithm 12 linv out: Compute L−1 for outﬂow boundary condition
1: u - array of u values at time tn,
2: β - averaging parameter,
3: bdryA, bdryB - boundary points,
4: expA expB- exponential vector of grid distance from left boundary and right boundary

respectively along the line,

5: ν - array of weighted nodes along the line,
6: expweight - array of exponentially weighted nodes along the line,
7: ps -order of accuracy in space,
8: pt - order of accuracy in time.
9: H - boundary coeﬃcients at time step tn,
10: ubppr - solution at the boundary points at previous time steps,
11: g - a vector of γ coeﬃcients for a given order of accuracy,
12: maxit - maximum number of iterations,
13: tol - tolerance number for the convergence.
14: Compute particular solution
15: I = f astconvolution(u, ν, expweight, ps)
16: Boundary correction
17: utemp = u
18: it = 0
19: repeat
20: H = apply bc outf low(uA, uB, bdryA, bdryB, β, H, ubppr, g, pt)
21:
22:

utemp = update stencil(utemp, I, expA, expB)
it = it + 1
23: until (|u − utemp|∞ < tol) OR (it < maxit)
24: Operate L−1
25: k = pt
2
26: linv = I + H(k, 1)expA + H(k, 2)expB

fourth order scheme to compute the solution u5.

3.4 Numerical Results

In this section, several test cases are reported for higher-order accuracy in 2D and 3D.

3.4.1 Convergence Test for High-Order Outﬂow

We ﬁrst show fourth order convergence of outﬂow boundary conditions by performing
a reﬁnement study on a rectangular domain Ω = [−1, 1] × [−1, 1] and a cubic domain
Ω = [−1, 1] × [−1, 1] × [−1, 1] for 2D and 3D respectively.

In each case, we use a point

68

Algorithm 13 apply bc outf low: Apply outﬂow boundary conditions
1: IA, IB - solution at boundary points,
2: bdryA, bdryB - boundary points,
3: H - boundary coeﬃcients at time step tn,
4: ubppr - solution at the boundary points at previous time steps,
5: g - a vector of γ coeﬃcients for a given order of accuracy,
6: β - averaging parameter,
7: p - order of accuracy in time.
8: p = 2 ∗ k
9: H(k, 1) = H(k, 1)e−β + g(k, 1)IA
10: H(k, 2) = H(k, 2)e−β + g(k, 1)IB
11: for j = 2 to p +1 do
for bp = 0 to 1 do
12:
13:
14:
15: end for
16: for j = 1 to p do
17:
18: end for
19: ubppr(p + 1, 0) = uxA
20: ubppr(p + 1, 1) = uxB

H(k, bp) = H(k, bp) + g(k, j)ubppr(p + 2 − j, bp)

ubppr(j, :) = ubppr(j + 1, :)

end for

Algorithm 14 gamma : Compute the γ coeﬃcients for a given order of accuracy

cf s(:, j) = f dcoef f F (j, 0, stencil)

1: β - averaging parameter,
2: p - order of accuracy in time,
3: stencil - array of uniform grid points
4: for j = 1 to p do
5:
6: end for
7: for k = 1 to p+1 do
for j = 1 to p do
8:
9:
10:
11: end for
12: g(1) = g(1) + E(0, β)

g(k) = g(k) + (−1)jE(j, β)

end for

source cos(ωt), ω = 1 placed at center of the domain ( (0, 0) in 2D and (0, 0, 0) in 3D),

and run the evaluation up to dimensionless time T = 2.0, with a ﬁxed spatial resolution of
160 × 160 spatial points in 2D and 160 × 160 × 160 in 3D. The discrete L∞ norm of the

error is constructed at each time step and maximum error over all time steps is used to
graph in Figure 3.1 for ∆t = 125×10−3

, k = 1 to 5, with outﬂow boundary conditions. The

2k

69

Algorithm 15 E : Compute the coeﬃcients E based on lemma 2.1
1: m and v - integers,
2: m ≥ 0 and v > 0
3: Pmv = 0
4: for l = 0 to m do
Pmv = Pmv + vl
5:
l!
6: end for
7: Emv = 1

vm (1 − e−v)Pmv

CFL (= c∆t

∆x ) value changes proportional to the time step size ∆t with ﬁxed spatial step size,

∆x = ∆y = ∆z.

(a) in 2D

(b) in 3D

Figure 3.1: Fourth order convergence of (a) 2D and (b) 3D wave solver using outﬂow bound-
ary conditions with the spatial step size h = 1.25×10−2. This is a self-reﬁnement
study which measures L∞ norm of the error at time T = 2.0 on a (a) square and
(b) cubic domain with a point source cos(ωt), ω = 1 at the center of the domain.
The CFL (= c∆t
h ) value changes proportional to the time step size ∆t with ﬁxed
spatial step size, h.

3.4.2 Rotating Gaussian Pulse

In this section, we examine the ability of our outﬂow scheme to handle the eﬀect of

a rotating Gaussian pulse. First, we perform reﬁnement studies on a square domain Ω =
[−1, 1]×[−1, 1] for a rotating Gaussian pulse through diﬀerent angles. We chose the Gaussian

70

pulse as an initial solution,

(cid:16)

−36
e

( cos(θ)x+sin(θ)y

r1

)2

+( sin(θ)x+cos(θ)y

r2

)2(cid:17)

(3.8)

with r1 = 2 and r2 = 1 and rotate the angle θ from Π

18 to Π

2 for this reﬁnement studies.

We run the evaluation up to the dimensionless time T = 2.0, with a ﬁxed spatial resolution
of 100 × 100 spatial points. The discrete L∞ norm of the error is constructed at each time
step and maximum error over all time step is used to graph in Figure 3.2 for ∆t = 0.2

2k , k = 0

to 4, with our fourth order outﬂow boundary conditions.

Figure 3.2: Fourth order convergence of the oval shape Gaussian pulse (however it reduces to
4 due to the ADI splitting error) given by the
2 on square domain Ω = [−1, 1]2 by imposing

the above third order near to θ = Π
Equation 3.8 rotated through Π
18 to Π
Outﬂow boundary conditions along the boundaries. This self-reﬁnement study
measures L∞ norm of the error at time T = 2.0 with h = 0.02.

The convergence plot shows that the order of the accuracy reduces from fourth-order to

above third-order while the angle goes to Π

4 . This is acceptable behavior because the highest

splitting error should be obtained for the angle Π

4 due to the ADI splitting.

71

3.4.2.1 Time Evaluation of 50o angled Gaussian pulse

In this section, we present a time evolution of the same Gaussian pulse deﬁned in equa-

tion 3.8 with r1 = 2, r2 = 1 and angle θ = 50o using outﬂow boundary conditions. We chose
a rectangular domain Ω = [−1, 1]× [−1, 1] with spatial resolution of 300× 300 spatial points,
and set the time step size ∆t = 3.3 × 10−3 to maintain CFL value 0.5. Figure 3.3 shows

snapshots of the wave at diﬀerent time instants

(a)

t = 0.3597

(b)

t = 0.6864

(c)

t = 1.0131

(d)

t = 1.3695

Figure 3.3: Time evolution of a Gaussian pulse given by the Equation 3.8 with angle θ = 50o
on square domain Ω = [−1, 1]2 by imposing outﬂow boundary conditions along
the boundaries. Here the time step size ∆t = 3.3 × 10−3 and CFL value is 0.5.
The wave is leaving completely as expected.

(a) θ = 30o

(b) θ = 50o

(c) θ = 70o

(d) θ = 90o

Figure 3.4: Evolution of a Gaussian ﬁeld given by the Equation 3.8 at time t = 0.6864,
rotated through diﬀerent angles (a) θ = 30o, (b) θ = 50o, (c) θ = 70o, and (d)
θ = 90o on square domain Ω = [−1, 1]2 by imposing outﬂow boundary conditions
along the boundaries, Here the time step size ∆t = 3.3 × 10−3 and CFL value is
0.5 at time t = 0.1287 using outﬂow boundary conditions. We observe the same
behaviour of the wave for rotated Gaussian pulse through diﬀerent angles. The
waves leave completely through the curved boundary.

72

3.4.3 Outﬂow Boundary Condition Along Curve Boundaries

We performed an experiment to examine the ability of our outﬂow boundary treatment

along the curved boundaries. For this experiment, we built a 2D geometry with a curved

boundary as shown in Figure 3.5 (a). The curve is a portion of an ellipse which has axes a

and b and centred at the origin (0, 0).

rw =

ab

√
b2 + a2 tan2 θ

rh2 = rw tan θ

rh1 = b

This 2D object can be represented as a 2D graph (Figure 3.5), G(V, E(ES, EA)) with a
set of vertices V = (v1, v2, v3, v4), straight edges ES = [es1(≡ (v2, v3)), es2(≡ (v3, v4)), es3(≡
(v3, v4)), es4(≡ (v4, v1))], and an arch edge EA = [ea1(≡ (v1, v2, O))].

(a) Geometry

(b) Graph

Figure 3.5: (a) Geometry and (b) graph representation of the 2D object constructed by using
an oval of width 2a and height 2b and a rectangle of 2rw×(rh1+rh2). Here rh1 = b
and v1, v2, v3, and v4 are the vertices of the object.

73

We made an instant of this object with a = 1.6, b = 4.8, and θ = 50o, place a point source

cos ωt with ω = 0.5 at O(0, 100∆x), apply outﬂow boundary condition along the elliptical

curved edge and Dirichlet along the remaining straight edges. Time evaluation of the source

ﬁeld for the spatial grid width ∆x = ∆y = 0.039 and time step size ∆t = 0.0195 is shown in

Figure 3.6

(a) t = 1.8135s

(b) t = 11.778s

(c) t = 12.6945s

(d) t = 13.7085s

Figure 3.6: Time evolution of the point source ﬁeld cos ωt with ω = 0.5 at (0, 100∆x) in the
object as shown in Figure 3.5 by imposing outﬂow boundary conditions along
the curved boundary and homogeneous Dirichlet along the straight boundaries.
Here the spatial step size, ∆x = ∆y = 0.039 and the time step size, ∆t = 0.0195

Further, we performed reﬁnement studies for the same structure and obtained the fourth-

order accuracy in time by computing the L∞ error for the ﬁnal time T = 5.0 as shown in

the plot 3.7.

3.4.4 Convergence Studies Using Analytical Solution

We perform this experiment to conﬁrm the order of accuracy of our scheme using an

analytical solution. We use the fourth order scheme (see [14] for the proof) as given in

Equation 3.9 which includes the source term s,

un+1 − 2un + un−1 = −β2Cxyz[un] −

β2Dxyz − β4
12

(cid:0)sn+1 + 10sn + sn−1(cid:1) +

+

β2
12α2

(cid:19)

Cxyz
Cxyz[un]
β2
12α4Cxyz[sn]

(3.9)

(cid:18)

74

(a) Field at t = 5.0

(b) Convergence plot

Figure 3.7: (a) Snapshot of the point source ﬁeld at time T = 5.0 and (b) Fourth order
convergence of 2D wave solver using Outﬂow boundary conditions along the
curve boundary with the spatial grid width ∆x = ∆y = 0.078. This is a self-
reﬁnement study which measures L∞ norm of the error at time T = 5.0 on the
object shown in Figure 3.5 with a point source cos(ωt), ω = 0.5 at the center of
the box, (0, 100∆x)

Here we use three dimensional integration operator Cxyz acting on the source term s for a

high-order correction. We derive an analytical solution for a point source sin(4πt) placed at

(x0, y0, z0) as follows,

(cid:112)(x − x0)2 + (y − y0)2 + (z − z0)2 + δ2

e−36(t/15−1).2sin(4πt)

1
4π

(3.10)

where δ ∈ R+, 0 < δ2 < ∆x

We choose a cubic domain Ω = [−1, 1]× [−1, 1]× [−1, 1], place a point source sin(4πt) at

the center (0, 0, 0), impose outﬂow boundary conditions along the all six boundary surfaces,

and run the evaluation up to dimensionless time T = 3.7, with a ﬁxed spatial resolution
of 80 × 80 × 80. The discrete L∞ norm of the error is constructed at each time step and
maximum error over all time steps is used to graph in Figure 3.8 for ∆t = 2×10−1

, k = 0 to

2k

3.

75

Figure 3.8: Fourth order convergence of 3D wave solver using outﬂow boundary conditions
with the spatial step size h = 2.5 × 10−2. This measures L∞ norm of the error
which compares with analytical solution using a point source sin(4πt) placed at
the center of the domain.

3.5 Summary

We derived a solution for high-order outﬂow boundary conditions using our MOLT based

scheme which uses the extended backward ﬁnite diﬀerence time-stencil along the boundaries

and applies a suitable initial condition. The computation begins with the second-order

scheme followed by the 4th-order scheme after three time-steps. We evaluated the scheme

using several test cases including curved boundaries. In the next chapter, we describe the

derivation of the high-order embedded Neumann boundary method for multi-dimensions.

76

CHAPTER 4: Higher Order Embedded Neumann Boundary Method

4.1 Introduction

In this chapter, we review the second-order embedded Neumann boundary method for

the scheme described in Chapter 2 and extend the order of spatiotemporal accuracy for

this scheme to arbitrary high-order. Further, we develop a general framework for complex

geometries in 3D.

We choose an embedded boundary method to deal with complicated geometries. As an

initial test for the simulation of magnetrons, we impose the embedded Neumann boundary

condition (later, in Chapter 5 we introduce our embedded PEC boundary condition). The

MOLT based scheme described in Chapter 2 gives second-order accuracy for embedded

Neumann boundary condition, therefore we want to improve the accuracy for arbitrary

higher-order in time and space.

In this chapter, we describe a high-order 3D embedded

Neumann boundary method and generalize it for complex geometries in 3D. The high-order

accuracy in time is achieved using the Lax-Wendroﬀ approach as explained previously in

Chapter 2. Our treatment for embedded Neumann boundary condition includes the ghost

points and boundary corrections. First, the solution at the ghost points are computed using

Hermite-Birkhoﬀ interpolation, and then interior points were updated using a boundary

correction iteration along with boundary stencils. Hence, we extend the interpolation stencils

to achieve high-order accuracy in space.

The embedded boundary method can deal with curved boundaries and material inter-

faces by superimposing boundaries on Cartesian grids and it also can handle oﬀ-grid points.

Thus, the embedded boundary method is the most suitable approach for complicated geome-

tries in diﬀerent application areas such as ﬂuid dynamics in aerospace engineering [55; 44],

77

electromagnetics [3], and acoustics [45; 2]. Further, several material interface problems are

dealt with embedded boundary methods such as dielectric materials. Here, the question

is how to handle wave propagation along with material/domain interfaces. Researchers

propose appropriate jump conditions [11; 3; 46; 22] which were derived from the domain’s

material properties and apply the suitable boundary conditions along the interfaces using

interpolation techniques. In [46], the embedded method was used for numerical modeling of

electromagnetic scattering of an incident plane wave by a dielectric circular cylinder. They

dealt with discontinuous wave propagation speed using appropriate jump conditions, but

the scheme is in second-order temporal and spatial accuracy for 2D problems. In [11] and

[18] second and fourth-order accuracy was achieved using an upwinding embedded boundary

method for Maxwell’s equations based on Runge-Kutta type time discretization. However,

the approach is very cost expensive and is not as fast as our MOLT based scheme. Another

method proposed in [3] uses interior boundary points instead of exterior ghost points to

apply Neumann or Dirichlet boundary conditions with the embedded boundary method and

gives fourth-order accuracy for the wave equation in 2D. The method is based on a compact

Pade-type discretization of spatial derivatives together with the Taylor series method in time

and it can remove small-cell stiﬀness problems for both Neumann or Dirichlet boundary con-

ditions. However, it also deals with cost expensive and slow matrix operations to obtain the

solution.

An approach similar to our scheme has been used in [9; 8]. They provide an implicit ADI

based numerical scheme for solving the heat and wave equations in a general embedded-

boundary domain with high-order spatial accuracy using Fourier continuation spatial approx-

imations and second-order temporal accuracy. Thus, the big diﬀerence from our approach

is that they use a Fourier-based method to invert the semi-discrete diﬀerential operator

instead of constructing and inverting the modiﬁed Helmholtz operator as we do. Besides,

the Fourier continuation method allows for minimal dispersion errors for wave propagation

problems. However, the schemes are only second-order accurate in time and higher-order

78

accuracy requires resorting to Richardson extrapolations.

In the following sections, we describe the higher-order Neumann boundary conditions

in Section 4.2, detailing its implementation in 1D, 2D, and 3D. In Section 4.3, we describe

the generalized embedded method suitable for any complex geometry and ﬁnally provide

numerical results.

4.2 Derivation of The Scheme

Neumann boundary conditions for boundary geometries along grid lines can be directly

imposed using a two-point boundary correction by utilizing surface grids in 2D and volu-

metric grids in 3D. The method can be applied to polygonal domains by the use of multiple

overset grids, each aligned with a boundary segment, which communicates with the interior

grid through interpolation on a ghost cell region. We chose the embedded boundary methods

explained in [13] as the basis to impose Neumann boundary conditions for curved boundary

surfaces in 3D. Here we determine the corresponding Dirichlet boundary values at the end-
points of each x−, y− and z− sweep lines that result in the approximate satisfaction of the

Neumann boundary condition.

The Neumann boundary condition is unconditionally unstable even when space is set

continuously [46; 22]. This means all embedded boundary methods need a touch of diﬀusion
to stabilize it. In our earlier work, we developed a O(∆t2) with one higher-order diﬀusion

term to stabilize it. Here we extend the diﬀusion term to the fourth-order accuracy.

4.2.1 One Dimensional Scheme

Let us consider 1D domain Ω ≡ [xa, xb] with a left boundary xB, and we choose a ghost

point xG, which is the exterior-point to the simulation domain (it should have at least one

interior neighbor). We give the formulation of the left domain in detail next,

As shown in Figure 4.1, xI and xII are two interior points, their distance from boundary

79

Figure 4.1: Geometry of 1D embedded Neumann boundary domain Ω ≡ [xa, xb] with left
boundary xB, ghost point xG, and interpolation points xI and xII, where the
distance ξI = |xI − xa|, ξII = |xII − xa|, and ξG = |xG − xa|

point are ξI = |xI − xa| and ξII = |xII − xa| along the normal. We deﬁne ξG as the distance
between ghost and boundary point (ξG = |xG− xa|). We apply Hermite-Birkhoﬀ interpolate,
P (ξ) through the points, xG, xa, xI, and xII by imposing the conditions P (cid:48)(0) = 0, P (ξI) = uI

and P (ξII) = uII, and explicitly solving for uG gives:

(cid:19)

(cid:18) ξ2

II − ξ2
II − ξ2
ξ2

I

G

(cid:18) ξ2

G − ξ2
II − ξ2
ξ2

I

I

(cid:19)

ul
II + O(∆x2),

ul+1
G =

ul
I +

We have introduced additional notation of l + 1 and l, which designate how we will set up

a ﬁxed point iteration using the Hermite-Birkhoﬀ interpolate. The iteration itself is a simple

ﬁrst order ﬁxed point method. Let w(l) and w(l+1) be the solutions at the lth and (l + 1)th

iterations. We choose a tolerance tol, and maximum number of allowed iterations mit, and
deﬁne a stopping criterion as |w(l+1) − w(l)|∞ < tol OR nit > mit where nit is the current

iteration number. The detailed steps for this iterative approach are given in Algorithm 16,

As detailed in [13] we need to include an artiﬁcial dissipation term into the solution to

maintain stability for such an embedded boundary method. Thus the second-order solution

un+1(x) at the next time step looks like,

un+1(x) − 2un(x) + un−1(x) = −β2D(1)

x [un] − D(2)

x [un−1].

(4.1)

where  is an artiﬁcial dissipation coeﬃcient, that satisﬁes 0 <  < 1. The Value of D(2)
at the previous time step can be used as D(2)
go with one more computing-level to obtain D(2)

x [un]
x [un−1] at the current time step, and we need to

x [un]. According to [13], β has to be reduced

80

Algorithm 16 Compute L−1 for 1D scheme
1: Compute particular solution I n+1 = I[un]
2: Initial guess wn+1(0) ≈ 3un − 3un−1 + un−2
3: Initialize the iteration counter nit = 0
4: repeat
5:

and wn+1(l)

I

II

G

=

G

I

I

6:

7:

(cid:17)

(cid:17)

wn+1(l)

II−ξ2
II−ξ2
ξ2

(cid:16) ξ2

(cid:16) ξ2

, at the interpolation points xI, and xII and as well as at

Compute wn+1(l)
the interpolation points along the right boundary
Compute solution at the ghost points using the following Hermite-Birkhoﬀ interpolant
wn+1(l)
Compute homogeneous coeﬃcients a1 and b1 using the fact (Here, xaG and xbG are
ghost points ),
wn+1(l) = I n+1 + a1e−α(x−xaG) + b1e−α(xbG−x).
Compute solution and update boundary stencil
wn+1(l+1) = I n+1 + a1e−α(x−xaG) + b1e−α(xbG−x)
nit = nit + 1
10: until (|wn+1(l+1) − wn+1(l)| < tol OR nit > mit)

G−ξ2
II−ξ2
ξ2

wn+1(l)

II

8:

9:

+

I

I

by a small amount from the case published in [14].

4.2.2 Two Dimensional Scheme

When extending the embedded Neumann boundary method to 2D cases, we need to

consider two important changes: every spatial point will be treated in x and y Cartesian

coordinates, and the ghost point approximation will be made by using bi-linear interpolation.

To achieve high-order accuracy, we need more computing-levels (perform P computing-levels

for 2P -th order of accuracy in time ) and extend the interpolation stencils by using 4-points

along the normal for the Hermite-Birkhoﬀ interpolation and an eight-point stencil for the bi-

linear interpolation. We can derive second-order and fourth-order 2D solution with artiﬁcial

dissipation ([13]) as given below,

un+1 − 2un + un−1 = −β2C(1)

(cid:18)

(cid:19)
xy [un] + C(2)

xy [un−1],

un+1 − 2un + un−1 = −β2C(1)

xy [un] −

β2D(2)

xy − β4
12

C(2)

xy

C(1)
xy [un] + C(3)

xy [un−1].

(4.2)

(4.3)

We now consider the situation of a two-dimensional domain with a curved boundary Γ

81

displayed in Figure 4.2. In the 2D case, we deﬁne a ghost point (xG, yG) which is an exterior

point (xi, yj), but at least one of the neighboring points (xi±1, yj) or (xi, yj±1) should be an

interior point. In order to compute the value of unknown uG at the ghost point location

(xaG, yaG), we construct a quadratic Hermite-Birkhoﬀ boundary interpolant P (ξ) along the

direction normal to the boundary, which intersects the boundary curve Γ at location (xa, ya).

We choose four interior points selected along the normal of the surface with distances, ξI =
|(xI, yI)− (xa, ya)| = ∆sI, ξII = |(xII, yII)− (xa, ya)| = 2∆sI, ξIII = |(xIII, yIII)− (xa, ya)| =
3∆sI and ξIV = |(xIV , yIV )−(xa, ya)| = 4∆sI, where we will typically take ∆sI =

2∆x, and

√

compute values uI, uII, uIII, and uIV at points (xI, yI), (xII, yII), (xIII, yIII), and (xIV , yIV )

respectively by interpolating from interior grid points, and use these interpolation values uI,

uII, uIII, and uIV to perform the Hermite-Birkhoﬀ interpolant P (ξ). Further, the distance
from the boundary to the ghost point is deﬁned as ξG = |(xG, yG) − (xa, ya)|.

(a) Geometry

(b) 8-points stencil

Figure 4.2: (a)Geometry of 2D embedded Neumann boundary method with a boundary
point (xB, yB), ghost point (xG, yG), and interpolation points (xI, yI), (xII, yII),
(xIII, yIII) and (xIV , yIV ) and (b) symmetric 8-points stencil to interpolate the
solution at the point (xI, yI).

The solution at the ghost point, uG can be computed using the Hermite-Birkhoﬀ inter-
polant P (ξ) by imposing the conditions P (cid:48)(0) = 0, P (ξI) = uI, P (ξII) = uII, P (ξIII) = uIII,

82

and P (ξIV ) = uIV given by,

ul+1
G =

ul
I +

(cid:19)

(cid:18) ξ2

III − ξ2
III − ξ2
ξ2

I

G

(cid:18) ξ2

IV − ξ2
IV − ξ2
ξ2

G

II

(cid:19)

ul
II +

(cid:18) ξ2

G − ξ2
III − ξ2
ξ2

I

I

(cid:19)

ul
III +

(cid:18) ξ2

G − ξ2
IV − ξ2
ξ2

II

II

(cid:19)

IV + O(∆x2),
ul

We obtain uI and uII by the approximations using the symmetric 8-point stencil inter-

polation. Suppose the interpolation point uI lies in a cell with corners (xi, yj), (xi+1, yj),

(xi+1, yj+1), and (xi, yj+1), we have the following approximation for uI

uI = w1ui,j + w2ui+1,j + w3ui+1,j+1 + w4ui,j+1

+ w5ui−1,j−1 + w6ui+2,j−1 + w7ui+2,j+2 + w8ui−1,j+2

(4.4)

where

w1 =

w3 =

w5 =

w7 =

4∆x∆y

(xi+2 − xI)(yj+2 − yI)
(xI − xi−1)(yI − yj−1)
(xi+1 − xI)(yj+1 − yI)
(xI − xi)(yI − yj)

4∆x∆y

4∆x∆y

4∆x∆y

w2 =

w4 =

w6 =

w8 =

4∆x∆y

(xI − xi−1)(yj+2 − yI)
(xi+2 − xI)(yI − yj−1)
(xI − xi)(yj+1 − yI)
(xi+1 − xI)(yI − yj)

4∆x∆y

4∆x∆y

4∆x∆y

As described in 1D (4.2.1), we use an iterative scheme to improve the accuracy of the

solution. For each term in the operator on the right hand side of the second order solution

given by equation 4.2, or fourth order in time solution given by equation 4.3, we must identify

x [un] +L−1

the correct ghost point that allows that term to guarantee that the normal derivative is zero
along the boundary. Taking the C(1)[un] operator and expanding the operator out we have
C(1)[un] = L−1
the ghost point for wx = L−1
y [wx] such
that when solving for the boundary correction terms the operator satisﬁes ∂nwx|∂Ω = 0,
∂nwy|∂Ω = 0, ∂nwxy|∂Ω = 0, ∂nwyx|∂Ω = 0.

y [un]. The ﬁxed point iteration identiﬁes

x [wy], and wyx = L−1

y [un], wxy = L−1

x [un], wy = L−1

y [un]−L−1

x [un]−L−1

x L−1

y L−1

83

Let us consider wx and wyx = L−1

y [wx], knowing that wy and wxy are similar. The

iterative process for wx starts by making an initial guess at the direct boundary values
wx|∂Ω, using an extrapolate in time wx
n−2 at each boundary

n+1,(0) ≈ 3wx

n−1 + wx

n − 3wx

point (Algorithm 17).

Algorithm 17 Compute wx(= L−1
x [u])
1: Compute the interior sweep (particular solution) I n+1
n − 3wx
2: Initial guess wx
3: Initialize the iteration counter nit = 0
4: repeat
5:
6:

n+1(0) ≈ 3wx

for k = 1 to ny do

n−1 + wx

x = Ix[un]

n−2

Compute wxI(k), and wxII(k), at the interpolation points (xI(k), y(k)), and
(xII(k), y(k)) and as well as at the interpolation points along the right boundary
using bilinear interpolation
Compute solution at the ghost points (xaG(k), y(k)) and (xbG(k), y(k)) using the
Hermite-Birkhoﬀ interpolant
Compute homogeneous coeﬃcients a1(k) and b1(k) using the fact
wx
Compute solution and update boundary stencil
wx

n+1(l) = I n+1 + ax(k)e−α(x−xaG(k)) + bx(k)e−α(xbG(k)−x)
n+1(l+1) = I n+1 + ax(k)e−α(x−xaG(k)) + bx(k)e−α(xbG(k)−x)

7:

8:

9:

10:
11:

end for
nit = nit + 1
12: until (|wx

n+1(l+1) − wx

n+1(l)| < tol OR nit > mit)

Given the update on wx, the next step is to consider wyx = L−1

y [wx]. The process starts

with the initial guess wx

n+1,(0) ≈ 3un − 3un−1 + un−2 at the boundary (Algorithm 18).

The same process is done for wy and wxy. The Hermite-Birkhoﬀ interpolate enforces
that the normal derivative is zero for wx, wy, wxy and wyx such that C(1)[un] satisﬁes the

normal derivative condition on ∂Ω to within tolerance. For the fourth order formulation, we
compute the boundary conditions for C(1)[un] and then we repeat the process for C(2) and
D(2) acting on C(1)[un].

Next, we will eamine the higher order embedded Neumann boundary method for 3D in

detail,

84

y [wx])

Algorithm 18 Compute wyx(= L−1
1: Compute the interior sweep (particular solution) I n+1
2: Initial guess wyx
3: Initialize the iteration counter nit = 0
4: repeat
5:
6:

n+1(0) ≈ 3un − 3un−1 + un−2

for k = 1 to nx do

y = Iy[wx

n]

Compute wyxI(k), and wyxII(k), at the interpolation points (x(k), yI(k), and
(x(k), yII(k)) and as well as at the interpolation points along the right boundary
using bilinear interpolation
Compute solution at the ghost points (x(k), yaG(k)) and (x(k), ybG(k)) using the
Hermite-Birkhoﬀ interpolant
Compute homogeneous coeﬃcients a1(k) and b1(k) using the fact
wyx
Compute solution and update boundary stencil
wyx

n+1(l) = I n+1 + ay(k)e−α(y−yaG(k)) + by(k)e−α(ybG(k)−y)
n+1(l+1) = I n+1 + ay(k)e−α(y−yaG(k)) + by(k)e−α(ybG(k)−y)

7:

8:

9:

10:
11:

end for
nit = nit + 1
12: until (|wyx

n+1(l+1) − wyx

n+1(l)| < tol OR nit > mit)

4.2.3 Three Dimensional Scheme

We now consider the situation of a three-dimensional domain with a curved-surface

boundary Γ displayed in Figure 4.3. In the 3D case, we deﬁne a ghost point (xaG, yaG, zaG)

which is an exterior point (xi, yj, zk), but at least one of the neighbouring points (xi±1, yj, zk)
or (xi, yj±1, zk) or (xi, yj, zk ± 1) of it should be an interior point. As described in the one and

two dimensional cases, to compute uG, we construct a quadratic Hermite-Birkhoﬀ boundary

interpolant P (ξ) along the direction normal to the boundary surface, which intersects the

boundary curved-surface Γ at location (xa, ya, za). We choose two interior points selected
along the normal of the surface with distances, ξI = |(xI, yI, zI) − (xa, ya, za) = ∆sI| and
ξII = |(xII, yII, zII) − (xa, ya, za) = 2∆sI|, where we will typically take ∆sI =

√
3∆x, and

compute values uI and uII, at points (xI, yI, zI) and (xII, yII, zII) respectively by interpo-

lating from interior grid points, and use these interpolation values uI and uII to perform the

Hermite-Birkhoﬀ interpolant P (ξ). Further, the distance from the boundary to the ghost
point is deﬁned as ξG = |((xaG, yaG, zaG)) − (xa, ya, za)|.

The solution at ghost point, uG can be computed using the Hermite-Birkhoﬀ interpolant

85

Figure 4.3: Geometry of 3D embedded Neumann boundary method with a boundary
point (xB, yB, zB), ghost point (xG, yG, zG), and interpolation points (xI, yI, zI),
(xII, yII, zII), (xIII, yIII, zIII) and (xIV , yIV , zIV )

P (ξ) by imposing the conditions P (cid:48)(0) = 0, P (ξI) = uI, and P (ξII) = uII, given by,

(cid:19)

(cid:18)ξ2

II − ξ2
II − ξ2
ξ2

I

G

(cid:19)

(cid:18) ξ2

G − ξ2
II − ξ2
ξ2

I

I

ul+1
G =

ul
I +

ul
II + O(∆x2),

We obtain uI and uII by the approximations using the tri-linear interpolation. Sup-

pose the interpolation point uI lies in a cell of rectangular prism with corners (xi, yj, zk),

(xi+1, yj, zk), (xi+1, yj+1, zk) , (xi, yj+1, zk), (xi, yj, zk+1), (xi+1, yj, zk+1), (xi+1, yj+1, zk+1), and

(xi, yj+1, zk+1), we have the following approximation for uI

uI = w1ui,j,k + w2ui+1,j,k + w3ui+1,j+1,k + w4ui,j+1,k

+ w5ui,j,k+1 + w6ui+1,j,k+1 + w7ui+1,j+1,k+1 + w8ui,j+1,k+1

(4.5)

86

where

w1 =

w3 =

w5 =

w7 =

∆x∆y∆z

(xi+1 − xI)(yj+1 − yI)(zk+1 − zI)
(xI − xi)(yI − yj)(zk+1 − zI)
(xi+1 − xI)(yj+1 − yI)(zI − zk)
(xI − xi)(yI − yj)(zI − zk)

∆x∆y∆z

∆x∆y∆z

∆x∆y∆z

∆x∆y∆z

(xI − xi)(yj+1 − yI)(zk+1 − zI)
(xi+1 − xI)(yI − yj)(zk+1 − zI)
(xI − xi)(yj+1 − yI)(zI − zk)
(xi+1 − xI)(yI − yj)(zI − zk)

∆x∆y∆z

∆x∆y∆z

w2 =

w4 =

w6 =

w8 =

∆x∆y∆z

Three dimensional 2-nd and 4-th order schemes can be deﬁned as follows,

un+1 − 2un + un−1 = −β2C(1)

(cid:18)

xyz[un−1],

xyz[un] + C(2)
(cid:19)

un+1 − 2un + un−1 = −β2C(1)

xyz[un] −

β2D(2)

xyz − β4
12

C(2)

xyz

C(1)
xyz[un] + C(3)

xyz[un−1].

(4.6)

(4.7)

y

x and L−1

in the z−
to perform z− sweeps. Taking the C(1)[un] operator
x [un]−
z [un]. The ﬁxed point iteration identiﬁes the

We use the operators C and D in three dimensions, and need to operate L−1
direction in addition to L−1
and expanding the operator out we have C(1)[un] = L−1
L−1
x L−1
z L−1
ghost point for wx = L−1
wxz = L−1
operators satisﬁes ∂n(.)|∂Ω = 0 when solving for the boundary correction terms.

z [un], wyx = L−1
x [wzy], and wyxz = L−1

y [wx], wzy = L−1
y [wxz], and each of these

z L−1
y [un] − L−1
x [un], wy = L−1

z [wyx], wxzy = L−1

x L−1

y L−1
y [un], wz = L−1

x [wz], wzyx = L−1

x [un] − L−1

z [un] +L−1

y [un] +L−1

y L−1

y L−1

x L−1

z L−1

z [wy],

z

Let us consider wx, wyx = L−1

y [wx] and wzyx = L−1

z [wyx], knowing that computing

of (wy, wzy, wxzy), and (wz, wxz, wyxz) are similar. The iterative process for wx starts
by making an initial guess at the direct boundary values wx|∂Ω, using an extrapolation in
n−2 at each boundary point. Given the update on

n+1,(0) ≈ 3wx

n−1 + wx

n − 3wx

time wx
wx, the next step is to consider wyx = L−1

n+1,(0) ≈ 3wyx

n− 3wyx

wyx
the following step is to consider wzyx = L−1

n−1 + wyx

y [wx]. The process starts with the initial guess
n−2 at the boundary, and then using the update on wyx,

z [wyx]. The process starts with the initial guess

87

in Algorithm 19.

z

z [wyx])

x , L−1

y , and L−1

n+1,(0) ≈ 3un− 3un−1 + un−2 at the boundary. We use three separate algorithms for each

wzyx
sweep L−1
z , and each algorithm processes three major tasks: computation of a
particular solution, a boundary correction, and then obtaining the value of actual L−1. We
give the detailed steps for L−1
Algorithm 19 Compute wzyx(= L−1
1: Compute the interior sweep (particular solution) I n+1
2: Initial guess wzyx
3: Initialize the iteration counter nit = 0
4: repeat
5:
6:
7:

n+1(0) ≈ 3un − 3un−1 + un−2

for j = 1 to nx do

for i = 1 to ny do

z = Iz[wyx

n]

Compute wzyxI(i,j), and wzyxII(i,j), at the interpolation points (x(j), y(i), zI(i,j)),
and (x(j), y(i), (zII(i,j)) and as well as at the interpolation points along the right
boundary using bi-linear interpolation
Compute solution at the ghost points (x(j), y(i), zaG(i,j)) and (x(j), y(i), zbG(i,j)) using
the Hermite-Birkhoﬀ interpolant
Compute homogeneous coeﬃcients az(i,j) and bz(i,j) using the fact
wzyx
Compute solution and update boundary stencil
wzyx
end for

n+1(l) = I n+1 + az(i,j)e−α(z−zaG(i,j)) + bz(i,j)e−α(zbG(i,j)−z)
n+1(l+1) = I n+1 + az(i,j)e−α(z−zaG(i,j)) + bz(i,j)e−α(zbG(i,j)−z)

8:

9:

10:

11:
12:
13:

end for
nit = nit + 1
14: until (|wzyx

n+1(l+1) − wzyx

n+1(l)| < tol OR nit > mit)

Next, we are going to develop a generalized higher dimensional, higher-order scheme for

problems with complex geometries.

4.3 Treatment for Complex Geometries

We now consider domains with complex geometries such as a model for A6 magnetron

that has a set of arch areas joined together in 2D. The A6 magnetron has a complex ge-

ometry (Figure 1.2), and so it serves as a nice numerical test case for this boundary con-

dition as opposed to a physical accurate boundary condition in a magnetron. We chose an

embedded boundary method to solve such complex problems. The complex geometries in

88

higher-dimensional problems may be even harder. In our approach, however, since the higher

dimensional problems are solved with ADI schemes, we need to worry about one-dimensional

lines. Thus, we need to know all the relevant properties of each line to solve it. However,

these lines are going to be broken into several segments due to complex boundaries. First,

we compute boundary intersection points ( which may be in oﬀ-grids), where the grid lines

intersect with boundaries.

We give a detailed explanation for 3D complex geometries in this section. In our approach,

ﬁrst, we decompose higher-dimensional geometry to 2D slices and then represent it using 2D

graphs with a set of vertices and edges. The edges can be classiﬁed as straight-line edges

and curves or arches. The graph G can be given as,

G(V, E(ES, EA)), ES ≡ (vsi, vsj),

and EA ≡ (Oij, vai, vaj)

(4.8)

where vsi, vsj, vai, and vaj are the vertices and Oij is center of the arch (vai, vaj) which

may be a circular arch (4.4 (a)) or an elliptical arch (4.4 (b)). In Figure 4.4, each object
has three vertices (v1, v2, v3), two straight edges es1(≡ (v1, v2)), es2(≡ (v3, v1)), and one arch
edge ea1(≡ (v2, v3, Oa1)) that can be represented as a graph G([v1, v2, v3], [[es1, es2], [ea1)]])

(a) with circular arch

(b) with elliptical arc

Figure 4.4: 2D graph with 3 vertices (v1(x1, y1), v2(x2, y2), and v3(x3, y3), 2 straight edges
es1(≡ (v1, v2)), es2(≡ (v3, v1)) and 1 circular arch (a) or elliptical arch (b) edge
ea1(≡ (v2, v3, Oa1)) centered at Oa1(xa1, ya1).

89

4.3.1 Pre-computing

Before beginning the PDE evolution, we need to perform a pre-computation to identify

key characteristics of the geometry such as boundary and relevant parameter values along

the grid lines in each direction. The ﬂow diagram in Figure 4.5 shows major tasks performed

in pre-computation and results obtained at the end of each task.

Figure 4.5: Flow diagram of pre-computation for 3D problems with complex geometries; it
cuts a 3D object into 2D slices Sxy, Sxz, makes 2D graphs Gxy, Gxz, and identiﬁes
the properties of the line segments Px, Py, and Pz.

1. Slicing

This task decomposes a given 3D object into a set of 2D objects by slicing along the

grid lines. We can form three sets of 2D slices Syz, Sxz, and Sxy in each direction x,

y, and z respectively, but two sets are suﬃcient for our computation (if the object is

uniform along any direction, we need to process slicing in that direction only - yielding

90

a set of slices). Suppose we apply the slicing along z and y directions, we will have a

set of xy slices, Sxy placed along z with distance ∆z apart and a set of xz slices, Sxz

placed along y with distance ∆y.

2. Graphing

This task deﬁnes the geometry of each 2D slice using 2D graphs as expressed in equation

4.8. We have two sets of 2D slices, Sxy of size nz and Sxz of size ny where nz and ny are

the number of spatial grid points along the z and y directions respectively. Each slice

should have at least one 2D object. Suppose we assume a 3D object has only convex

surfaces along with primary directions x, y, and z, then 2D objects in the slices will

be deﬁned by

Gxy(k),

k = 1, 2, ...nz

Gxz(j),

j = 1, 2, ...ny

Therefore, we will obtain, nz + ny graphs.

3. Segmentation

This task generates line segments along each grid line in each direction x, y, and z using

intersections between the grid lines and the surface of the object. The segments along

x and y can be computed using the graph Gxy and along x and z can be computed

using the graph Gxz, but we should avoid unneeded duplicate computations for x in

order to reduce computing cost. In the ﬁnal stage of the pre-computation all relevant
parameters/properties Px of size nx × nsx, Py of size ny × nsy and Pz of size nz × nsz

for each line segment will be computed. Here nsx, nsy and nsz are the number of line

segments along x, y, and z respectively. Since, however, each line in the same direction

does not need to have the same amount of segments, nsx, nsy and nsz are not single

variables, they are vectors/arrays to keep track of the number of segments in each line.

91

Now, the sizes of Px, Py, and Pz are

nx(cid:88)

i=1

nsx(i),

ny(cid:88)

j=1

nsy(j) , and

nz(cid:88)

k=1

nsz(k) respectively.

After we are done with the pre-computation, we will have three sets of line segments with

their properties (Px, Py, and Pz) for each direction. These segments will be utilized by our

higher-order solver to obtain a solution with higher-order accuracy. Actually, during each

sweep, each line segment is treated individually with its own piecewise constant wave speed,

and then the 3D higher-order equation 2.77 is applied to compute the solution un+1 at the

next time step tn+1.

4.3.2 Geometry for A6 MDO in 2D

A model of A6M is a useful device to study high power microwave sources. This device

has a solid cathode and a solid anode with six cavities as shown in Figure 4.6-(a)). In our

modeling, we consider both categories solid and transparent cathodes ( [28; 56]) and design

the coupling horn by removing the top portion of a rectangular pyramid and ﬁxing it with a

cavity of the anode. If we install the whole pyramid in the same place, the top will be at the

center of the anode. In order to model the A6 magnetron with diﬀraction output (MDO)

in 2D, we assume that the cathode and anode are inﬁnitely long in the z-direction. Figure

4.6-(a) shows the axial view of the MDO in 2D. Suppose α1 and α2 are the angular widths

of the vane and cavity respectively, we can compute the angles θ1 and θ2 for each cavity of

arch in terms of α1 and α2 (Figure 4.6-(b)). The angle of the kth arch is given by,

1 = θk−1
θk
2 = θk−1
θk

1 + (k − 1)(α1 + α2)
1 + kα1 + (k − 1)α2

1 + α1 + α2 = θ1

1 + 2α1 + α2 = θ1

(4.9)

(4.10)

where, since this is a symmetric uniform model α1 + α2 = π

3 . The model has two closed

objects, a six wings object of anode with an object of circle inside which represents the

cathode. The geometry of a six wings object can be represented with a graph which has 24

vertices, 12 straight line edges and 12 arches [68].

92

During the pre-computation, for each horizontal and vertical lines, we ﬁnd the intersection

(a) Axial view

(b) Cavity angle

Figure 4.6: A6 MDO (a) Axial view with a solid cathode of radius rc in the center, the
anode with inner radius ra, vane radius rv, vane angle α1, cavity angle α2, and
a coupling horn of radius rh (b) the angle of kth cavity which is represented by
the angles θk

1 and θk

2 , θk

2 = θk

1 + α1.

points with edges of graphs and compute ghost, boundary, and interior points( (xI, yI) and

(xII, yII)) Figure 4.7 and relevant parameters which are needed to apply bi-linear interpo-

lation. The pre-computation task forms two sets of line segments for each direction which

have to be treated during x and y sweeps with our numerical scheme.

4.4 Numerical Results

In order to evaluate our numerical scheme, we consider several test cases in 2D and

3D using classic EM problems such as electromagnetic scattering of a plane wave by metal

circular cylinders ( which can be extended to dielectric circular cylinders and applicable to

photonic crystal applications), scattering of laser light by spherical objects, and electron

beams in A6M.

93

Figure 4.7: Geometrical structure of A6 MDO with key points; intersection, boundary, ghost,

and interpolation points required to obtain (a) x− and (b) y− sweeps.

94

4.4.1 Cylindrical Scattering

We begin with electromagnetic wave scattering of a plane wave by a metal cylinder of

radius r which is homogeneous along the z direction. By assuming the cylinder is inﬁnitely

tall, we can represent this case in a 2D model. We choose a rectangular domain Ω =
[−1, 1]×[−1, 1] with a circle of radius r = 0.5 with center at (0, 0), and analyse a point source
ﬁeld (cos(ωt), ω = 10) placed at (−1 + 3∆x, 0), ∆x = 0.0156 (Figure 4.8 - (a)) by applying

Neumann, outﬂow and Dirichlet boundary conditions along the circular boundary, vertical

rectangular borders (in y direction), and horizontal rectangular borders (in x direction)
respectively. We set the grid size to be 128 × 128, time step size ∆t = 0.0078, averaging

parameter β = 1.4 and dissipation coeﬃcient  = 0.1. Figure 4.9 shows snapshots of the

scattered wave at diﬀerent time instants.

(a) A cylinder at the center (b) Two
cylinders

symmetrical

(c) Two
cylinders

asymmetrical

Figure 4.8: 2D model for a single cylindrical scatter at the (a) center and two (b) symmet-
ric and (c) asymmetric cylindrical scatters in the corners. Here the red mark
indicates the point source.

For the following two experiments, we use two circular scatters instead of one and consider
symmetric and asymmetric geometry with a rectangular domain Ω = [−2, 2] × [−2, 2]. We

move the point source to the bottom right corner and place another circular scatter with the

same radius r1 = 0.3 for the symmetric case (Figure 4.8 - (c)) and with radius r2 = 0.15 for

the asymmetric case (Figure 4.8 - (d)) at the top right corner (1.5, 1.5). We apply Neumann

and outﬂow boundary conditions along the circular and the four rectangular boundaries

respectively and retain the same values for parameters ∆x, ∆t, β , and . Figures 4.10 and

95

(a) t = 1.0608

(b) t = 2.0436

(c) t = 3.1668

(d) t = 4.329

Figure 4.9: Time evolution of a point source ﬁeld cos(ωt), ω = 10 with a circular scatter of
radius r = 0.5 in the center of the square domain Ω = [−1, 1]2 with a spatial
step size ∆x = ∆y = 0.0156 and time step size ∆t = 0.0078.

4.11 show snapshots of the wave at diﬀerent time instants for the symmetric and asymmetric

cases respectively.

(a) t = 1.2714

(b) t = 1.9032

(c) t = 2.886

(d) t = 4.29

Figure 4.10: Time evolution of a point source ﬁeld cos(ωt), ω = 10 with two symmetric
circular scatters of radii r1 = r2 = 0.3 in the bottom-left and top-right corners
of the square domain Ω = [−2, 2]2 with a spatial step size ∆x = ∆y = 0.0156
and time step size ∆t = 0.0078.

4.4.1.1 Convergence studies

We demonstrate fourth order convergence by performing a reﬁnement study on a 2D
square domain Ω = [−1, 1] × [−1, 1] with a circular scatter of radius r = 0.5 at the center,
and a point source cos(ωt), ω = 1 placed at (0,−1 + 3∆x), ∆x = 0.0156. This runs up to the
ﬁnal time T = 2.0, with a ﬁxed spatial resolution of 160 × 160 spatial points. The discrete

L∞ norm of the error is constructed at each time step and maximum error over all time

steps is used to graph 4.12 for ∆t = 6.25
2k

, k = 1 to 5, with Neumann and outﬂow boundary

96

(a) t = 1.2714

(b) t = 1.9032

(c) t = 2.886

(d) t = 4.29

Figure 4.11: Time evolution of a point source ﬁeld cos(ωt), ω = 10 with two asymmetrical
circular scatters of radii r1 = 0.3 and r2 = 0.15 in the bottom-left and top-right
corners of the square domain Ω = [−2, 2]2 with a spatial step size ∆x = ∆y =
0.0156 and time step size ∆t = 0.0078.

conditions along the circular and rectangular boundaries respectively.

Figure 4.12: Fourth order convergence of 2D wave solver using a point source cos(ωt), ω = 1
placed at (0,−1 + 3∆x), ∆x = 0.0156 on a square domain Ω = [−1, 1]2 with
a circular scatter of radius r = 0.5 at the center by imposing Neumann and
outﬂow along the circular and square boundaries. This is a self-reﬁnement
study which measures L∞ norm of the error at the ﬁnal time T = 2.0.

For the space convergence study, a point source cos(ωt), ω = 1 is placed at (−1+3∆x, 0),

and runs up to time T = 2.0, with ﬁxed CFL values 2 and 1, where the wave speed c is

97

chosen to be 1. Our L∞ norm error plots of the solution at time T = 2.0 with varying

resolutions show fourth order accuracy (Figure 4.13 - (a)). We also note that the normal

derivative along the boundary converges with fourth order accuracy (Figure 4.13 - (b)).

(a) Space convergence of solution

(b) Space convergence of boundary derivative

Figure 4.13: Convergence plots for a) space using entire solution and b) space using boundary
derivative on a square domain Ω = [−1, 1]2 with a circular scatter of radius
r = 0.5 at the center by imposing Neumann and outﬂow along the circular and
square boundaries. This is a self-reﬁnement study which measures L2 norm of
the error at the ﬁnal time T = 2.0.

4.4.2 Spherical Scattering

In this section, we evaluate our 3D scheme using the scattering of light by a spherical

object.

For the ﬁrst test case, we chose a cubic domain (Ω = [−1, 1] × [−1, 1] × [−1, 1]) with a

spherical scatter of radius r = 0.5 at the center of the cube (0, 0, 0) (Figure 4.14, (a)). We

place a source ﬁeld (cos(ωt), ω = 10) placed 3∆z above the center of the bottom surface(
xy plane), at (0, 0,−1 + 3∆z), ∆z = 0.0156 by applying Neumann and outﬂow boundary

conditions along the surface of the sphere and six surfaces of the cube respectively. We set
the grid size to be 128× 128× 128, time step size ∆t = 0.0078, averaging parameter β = 1.4

and dissipation coeﬃcient  = 0.1. Figure 4.15 shows snapshots of the wave at diﬀerent time

98

instants.

(a) Single sphere

(b) Two symmetric spheres

(c) Two asymmetrical spheres

Figure 4.14: 3D model for a single spherical scatter at the center of a cube, two b) symmetric

and c) asymmetric spherical scatters in the corners.

Figure 4.15: Time evolution of a point source ﬁeld cos(ωt), ω = 10 with a spherical scatter
of radii r = 0.5 in the center of the cubic domain Ω = [−1, 1]3 with a spatial
step size ∆x = ∆y = ∆z = 0.0156 and time step size ∆t = 0.0078.

For the next two experiments we placed two equal sized spheres of radii r1 = r2 = 0.3

for the ﬁrst case and two spheres with diﬀerent radii r1 = 0.3 and r2 = 0.15 for the second

case at the corners of the cubical domain as shown in Figure 4.14 - (b), (c). We move the
point source to (0, 0,−1 + 3∆z), ∆z = 0.0156 and simulate the wave propagation by apply-

ing Neumann and outﬂow boundary conditions along the surface of the two spheres and six

surfaces of the cube respectively. We retain the same values of parameters ∆x, ∆t, β, and

 as before. Figure 4.16 and 4.17 show snapshots of the wave at diﬀerent time instants for

the symmetric and asymmetric cases respectively.

99

Figure 4.16: Time evolution of a point source ﬁeld cos(ωt), ω = 10 with two symmetric
spherical scatters of radii r1 = r2 = 0.3 in the bottom-left and top-right corners
of the cubic domain Ω = [−1, 1]3 with a spatial step size ∆x = ∆y = ∆z =
0.0156 and time step size ∆t = 0.0078.

Figure 4.17: Time evolution of a point source ﬁeld cos(ωt), ω = 10 with two asymmetric
spherical scatters of radii r1 = 0.3 and r2 = 0.15 in the bottom-left and top-
right corners of the cubic domain Ω = [−1, 1]3 with a spatial step size ∆x =
∆y = ∆z = 0.0156 and time step size ∆t = 0.0078.

4.4.2.1 Convergence Studies

We demonstrate fourth order convergence in time and space by performing self-reﬁnement
studies on a rectangular domain Ω = [−1, 1]3 with a spherical scatter of radius r = 0.5 at

the center.

For the time convergence study, a point source cos(ωt), ω = 1 is placed at (0, 0,−1 +

3∆z), ∆z = 0.0156, and runs up to dimensionless time T = 2.0, with a ﬁxed spatial resolution
of 160×160×160 spatial points. The discrete L∞ norm of the error is constructed at each time
step and maximum error over all time steps is used to graph Figure 4.18 -a) for ∆t = 0.025
2k

, k = 1 to 5, with Neumann and outﬂow boundary conditions along the sphere surface and

cube surfaces respectively.

For the space convergence study, a point source cos(ωt), ω = 1 is placed at (0, 0,−1 +

3∆z), and evolves to dimensionless time T = 2.0, with ﬁxed CFL values 2, 1, and 0.5 where

100

(a) Time convergence

(b) Space convergence

Figure 4.18: Fourth order (a) time and (b) space convergence plots using a spherical scatter
of radius r = 0.5 at the center of a cubic domain Ω = [−1, 1]3 by imposing
Neumann and outﬂow along the surface of the sphere and cubic boundaries.
This is a self-reﬁnement study which measures (a) L∞ and (b) L2 norm of the
error at the ﬁnal time T = 2.0.

the wave speed c is chosen to be 1. Our L2 norm error plots of the solution at dimensionless

time T = 2.0 with varying resolutions show fourth order accuracy (Figure 4.18 -b)).

4.4.3 Electron Beam in Magnetrons

4.4.3.1 A6 Magnetron with Diﬀraction Output (MDO) in 2D

In this section, we test our solver using a modiﬁed model for the A6M in 2D. We use a

point source instead of voltage source and made two experiments using solid and transparent

cathodes with a coupling horn of radius rh = 4.9, radius of the vane resonators rv = 4.11,

the anode radius ra = 2.11, the cathode radius rc = 1.58, the angular width of the vane,

6 , and the cavity, α2 = π

α1 = π
circular source, cos(ωt)(|x2 + y2 − r2

6 . For the ﬁrst test case, we simulate the scattered wave of a
c| < 2∆x), ω = 10 placed along with the cathode. 4.19

shows snapshots of the wave captured at diﬀerent time instants.

In the second test case, we use six point-sources that are placed around the circle of the

101

cathode each nearby a cavity (we maintain 15o from left edge-radius of each cavity) and

simulate the waves propagated by these sources. 4.20 shows snapshots of the wave captured

at diﬀerent time instants. We applied Neumann boundary conditions along with the model

for both cases.

Figure 4.19: Time evolution of a circular source ﬁeld cos(ωt)(|x2 +y2−r2

c| < 2∆x), ω = 10 in
the model of A6RM with a solid cathode of radius rc = 1.58 and anode of inner
radius ra = 2.11, vane radius rv = 4.11 and the coupling horn radius rh = 4.9
and the angular width of the vane, α1 = π

6 , and the cavity, α2 = π
6 .

Figure 4.20: Time evolution of ﬁelds provided by six point sources in the model of A6M with
a transparent cathode of radius rc = 1.58 and anode of inner radius ra = 2.11,
vane radius rv = 4.11 and the coupling horn radius rh = 4.9 and the angular
width of the vane, α1 = π

6 , and the cavity, α2 = π
6 .

Further, we perform a convergence test by a reﬁnement study on a model for A6M

without cathodes. We set the radii for coupling horn, vane resonators, and anode as rh = 4.9

rv = 4.11, and ra = 2.11 respectively. The angular width of the vane, α1 = π

6 and the cavity

angle, α2 = π
6 . We place a point source cos(ωt), ω = 1 at the center (0, 0), and evolve to the
ﬁnal dimensionless time T = 2.0, with a ﬁxed spatial resolution of 160 × 160 spatial points.

The discrete L∞ norm of the error is constructed at each time step and maximum error

102

over all time steps is used to graph Figure 4.21 for ∆t = 0.3125

2k

, k = 1 to 6, with Neumann

boundary conditions along the model. It shows fourth order convergence.

(a) A6M

Figure 4.21: Fourth order convergence of ﬁelds provided by a point source cos(ωt), ω = 1
at the center (0, 0), of A6MDO with the anode of radii rh = 4.9 rv = 4.11, and
ra = 2.11 and angular width of the vane, α1 = π
6 . This
is a self-reﬁnement study which measures L∞ norm of the error at time T = 2.0
for ﬁxed spatial step size ∆x = ∆y = 3.13 × 10−2

6 and the cavity, α2 = π

4.4.3.2 A6 Magnetron in 3D

For this experiment, a 3D domain of Ω = [−5, 5]3 is chosen with radius of the vane

resonators rv = 4.11, anode radius ra = 2.11, cathode radius rc = 1.58, angular width of

vane 20o, cavity angle 40o, and thick h = 6 along z. We set the grid size to be 1283, time

step size ∆t = 0.039, averaging parameter β = 1.4 and dissipation coeﬃcient  = 0.1. In

this simulation of A6 magnetron with a transparent cathode which is mimicked by using

six emitters/line sources, we calculated the time evolution by imposing embedded Neumann
boundary conditions over the boundary stencil during the x−, y−, and z− sweeps (Figure

103

4.23). Figure 4.22 shows the geometrical setup of the 3D A6 magnetron, indicating key

on/oﬀ mesh grid points used by the scheme.

(a) on x−sweep

(b) on y−sweep

(c) on z−sweep

Figure 4.22: Key points used for the simulation of 3D A6 magnetron; intersection, boundary,
ghost, and interpolation points required to obtain (a) x−, (b) y−, and (c) z−
sweeps.

4.4.3.3 Convergence Studies

Further, we perform self-reﬁnement studies to evaluate the consistency of our scheme

for A6 magnetron with a transparent cathode. We retain the same values of geometrical

parameters (rh, rv, ra, rc, α1, α2, and h) and emitters. The simulation evolves to the ﬁnal

104

(a) t = 1.5625

(b) t = 4.6875

(c) t = 6.25

(d) t = 11.7187

Figure 4.23: Time evolution of ﬁelds provided by six point sources in the model of 3D A6
magnetron with a transparent cathode of radius rc = 1.58 and anode of inner
radius ra = 2.11, vane radius rv = 4.11 and the coupling horn radius rh = 4.9,
the angular width of the vane, α1 = π
9 , and thickness
h = 6.

9 , cavity angle, α2 = 2π

time T = 2.0 by imposing Neumann boundary conditions along the model. For the time-

convergence test, we set a ﬁxed spatial resolution of 1283 spatial points, the discrete L∞

norm of the error is constructed at each time step and maximum error over all time steps is

used to graph Figure 4.24 -a) for ∆t = 0.3125

2k

, k = 1 to 4. It shows fourth order convergence.

(a) time convergence

(b) space convergence

Figure 4.24: Fourth order (a) time convergence and (b) space convergence of ﬁelds provided
by six point sources in the model of 3D A6 magnetron with a transparent
cathode and the anode of radii rh = 4.9 rv = 4.11, and ra = 2.11 and angular
width of the vane, α1 = π
6 . This is a self-reﬁnement
study which measures L∞ norm of the error at time T = 2.0 for a ﬁxed spatial
step size of ∆x = ∆y = 3.13 × 10−2

6 and the cavity, α2 = π

105

For space-convergence test, we use ﬁxed CFL values 2, 1, and 0.5 where the wave speed

c is chosen to be 1. Our L∞ norm error plots of the solution at dimensionless time T = 2.0

with varying resolutions show fourth order accuracy (Figure 4.24 -b)).

4.5 Summary

We proposed a fourth-order 3D embedded boundary method and developed a generalized

framework for complex geometry domains in 3D. We use the extended stencil for both com-

putations of the solution at the interpolation points (8-point stencil) and the ghost points (4-

point stencil) to achieve 4th-order convergence in space. And we use the same Lax-Wendroﬀ

approach (exchange the time derivative to spatial derivative) to obtain higher-order accuracy

in time. We gave a detailed description of the high-order Neumann boundary conditions and

its implementation in 1D, 2D, and 3D. We evaluated our scheme for several problems with

complex geometries including A6 magnetron in 3D. Since the magnetrons originally designed

using metal, our best choice to represent its boundaries should be PEC. Therefore, we discuss

the PEC boundary condition in the next chapter. We give descriptions of the derivation of

PEC boundary conditions for our MOLT based scheme and its evaluation.

106

CHAPTER 5: Second-Order PEC Boundary Condition

5.1 Introduction

In this chapter, we develop an embedded PEC boundary condition for the scheme de-

scribed in Chapter 2 and develop an exact simulation for magnetrons by solving the vector

potential form of Maxwell’s equations.

Since magnetrons are physically designed by metal boundaries and have complex ge-

ometries, we need PEC boundary conditions to represent the exact metal boundaries and

the embedded boundary method to deal with complex geometries. Hence, we derive an

embedded PEC boundary condition based on the electromagnetic vector potential for the

simulation of magnetrons in this chapter. We introduce a fast and geometrically ﬂexible

approach to calculate the solution to Maxwell’s equations in vector potential form under the

Lorenz gauge. As presented, the method is 4th order in time and second-order in space, but

the A-stable formulation could be extended to higher order in both time and space. While

there is no conceptual limitation to develop this in 3D, our initial work has centered on 2D.

The eventual goal is to combine this method with particle methods for the simulations of

plasma. In the current work, the scheme is evaluated for EM wave propagation within an

object that is bounded by PEC.

Vector potential formulations of electromagnetism are widely used in classical and quan-

tum physics. Maxwell’s equations describe the time-evolution of four ﬁelds: the magnetic ﬂux

density (B), the electric ﬁeld intensity (E), the electric ﬂux density (D), and the magnetic

ﬁeld intensity (H). It is often convenient to employ a formulation based on the magnetic

vector potential A and electric scalar potential φ. Maxwell’s equations then reduce to un-

coupled wave equations for the vector potential A and scalar potential φ under the Lorenz

107

gauge condition [37]. We use our MOLT based implicit A-stable scheme to solve these wave

equations.

A Perfect Electric Conductor (PEC) boundary condition is derived based on the conti-

nuity of the magnetic ﬂux normal to the boundary and is applied to the magnetic vector

potential. Our formulation avoids the additional compatibility conditions used in [34]. This

embedded PEC boundary condition is a slightly modiﬁed embedded Neumann boundary

condition. The fundamental diﬀerence is that one of the partial spatial derivatives in the

PDE is negative. As presented, the current embedded boundary formulation is second-order

in space but could be extended to higher-order.

In this chapter, we ﬁrst derive wave equations for vector and scalar potentials A and

φ under the Lorenz gauge condition discussed in section 5.2. Then we describe our two-

dimensional high-order implicit scheme using ADI splitting and the higher-order embedded

PEC boundary conditions in section 5.3, and ﬁnally we give numerical results for several test

cases in Section 5.5.

5.2 2D Electric Scalar and Magnetic Vector Potential

The macroscopic Maxwell’s equations are:

∂tB = −∇ × E,
∂tD = ∇ × H − J,
∇ · B = 0,
∇ · D = ρ.

(5.1)

(5.2)

(5.3)

(5.4)

where J is the electric current density, and ρ is the electric charge density. In a linear

isotropic medium, D = E and B = µH. The dielectric constant  = 0r with 0 and r

being the free space and relative permittivity respectively, and the permeability µ = µ0µr

with µ0 and µr being the free space and relative permeability respectively.

108

The electric scalar potential φ and magnetic vector potential A are related to E and B by

E = −∇φ − ∂tA,
B = ∇ × A.

For free space, Ampere’s law (Equation 5.2) is,

1

c2 ∂tE = ∇ × B − µ0J.

(5.5)

(5.6)

(5.7)

where the free space phase velocity c = 1√

(µ00).

Substituting E and B using equations 5.5 and 5.6 and imposing the Lorenz gauge con-

dition

1

c2 ∂tφ = −∇A.

a wave equation in terms of the magnetic vector potential A results:

t A − ∇2A = µ0J.

1
c2 ∂2

(5.8)

Similarly by imposing the Lorenz gauge condition, a wave equation for the scalar potential

φ in free space is obtained:

t φ − ∇2φ =

1
c2 ∂2

ρ
0

.

(5.9)

5.3 Perfect Electric Conductor Boundary

The electric ﬁeld (E) is continuous along the boundary and the magnetic ﬂux (B) is

continuous along the normal to the boundary. Since the Perfectly Conducting Boundary
(PEC) has inﬁnite electrical conductivity (σ = ∞), there will be no interior electric ﬁeld

(E2 = 0) in the perfect conductor. It also follows that there is no magnetic ﬁeld (H2 = 0).

109

Figure 5.1: Boundary surface between two regions with electric ﬁelds E1 and E2, magnetic
ﬁeld H1 and H2, electric ﬂux densities D1 and D2, and magnetic ﬂux densities
B1 and B2 respectively. Here, Js is the surface current, ρs is the surface charge
density, and n is the normal vector pointed out from the region two (Perfect
Electric Conductor).

The boundary conditions for the PEC become:

n × E1 = 0,
n × H1 = Js,
n · B1 = 0,
n · D1 = ρs.

(5.10)

(5.11)

(5.12)

(5.13)

where n is the unit normal vector to the boundary and Js and ρs are the surface current

density and charge density respectively.

Consider a two-dimensional vector/scalar potential (we are solving 2D problems in this

section). If we choose to restrict B to the x-y plane, the vector potential A only has a z

component,

A = Az(x, y)z.

(5.14)

Using equation 5.6 and the boundary condition represented by equation 5.12 we get

n · (∂yAz − ∂xAz) = 0.

(5.15)

110

5.4 Numerical Approximation for PEC Boundary Conditions

In this section we discuss enforcing boundary conditions with the above method. For

PEC boundary conditions, we extend our previous work on Neumann boundary conditions

to PEC (Chapter 4, [13]). Before we move with boundary conditions, let us express the

second and fourth order two dimensional schemes,

An+1 − 2An + An−1 = −β2C(1)
An+1 − 2An + An−1 = −β2C(1)

xy [An].
xy [An] −

(cid:18)

β2D(2)

xy − β4
12

(cid:19)

C(2)

xy

C(1)
xy [An].

(5.16)

(5.17)

where superscripts on the operators Cxy and Dxy denote level numbers.

On a domain Ω = [xa, xb]× [ya, yb] outﬂow boundary and PEC conditions can be deﬁned

by,

1. PEC:

∂yAz(xa, t) = 0, −∂yAz(xb, t) = 0,
−∂xAz(ya, t) = 0,

∂xAz(yb, t) = 0.

2. Outﬂow boundary condition:

∂tA(xa, t) = c∂xA(xa, t) , ∂tA(xb, t) = −c∂xA(xb, t),
∂tA(ya, t) = c∂yA(ya, t) , ∂tA(yb, t) = −c∂yA(yb, t).

Unlike Neumann boundary conditions, the challenge here is that PEC needs y derivatives
during the x− sweeps and x derivatives during the y− sweeps. This is challenging for an

ADI method because of the separation of directional information.

For a rotated square, the PEC boundary condition can use a similar approach as used

in the original method for Neumann boundaries in [13] because the directions are coupled

at the boundary. Rotating the domain through any angle, say θ, as shown in ﬁgure 5.2, we

have,

111

1. Left: cos(θ)∂yAz − sin(θ)∂xAAz = 0,

2. Right: −cos(θ)∂yAz + sin(θ)∂xAz = 0,

3. Up: sin(θ)∂yAz − cos(θ)∂xAz = 0,

4. Down: −sin(θ)∂yAz + cos(θ)∂xAz = 0.

Figure 5.2: A PEC square cavity rotated by an angle θ with normal vectors nL, nD, nR, and

nU along the left, down, right, and up boundaries.

In the next sections, we demonstrate how to construct a solution for A given the Dirichlet

boundaries, followed by an extension of the ﬁxed point map for Neumann boundaries in [13]

to the case of PEC to generate the Dirichlet boundary conditions (including an adaptation

for the mesh aligned case).

5.4.1 Using Dirichlet Boundary Conditions for A to Capture PEC Boundary

In our embedded boundary method for Neumann, we constructed an eﬃcient conver-

gent Neumann to Dirichlet Map, converting a boundary condition on the outward normal

derivative of a function into a condition that constructs ghost points outside the domain.

The method is designed such that the generated ghost points enforce the desired Neumann

112

condition on the boundary. In the next two sections, we adapt this idea to PEC, which is a

condition on the tangential derivative.

We now show how to carry out the sweeps with given values at the end of each line of

the 4th order ADI method. In our 2D scenario, to construct the operator L−1
to
build C(1), C(2) and D(2), each line is treated as a 1D problem, where the lines are evaluated
with x− sweeps and y− sweeps respectively [13]. We note that the operators that make up
D(k) and C(k) have their own boundary conditions to enforce at level k.

x and L−1

y

The objective is given the boundary condition, solve for each ax and bx along the x−
direction and ay and by along the y− direction. We will denote A(ti, xa) = AL(ti), A(ti, xb) =

AR(ti), A(ti, ya) = AD(ti) and A(ti, yb) = AU (ti). Here, [xa, xb] and [ya, yb] are horizontal

and vertical lines that we make sweeps along. These permit the boundary conditions to be

incorporated into the method.

As we did for wave solvers in our early work, [15], taking AL(ti) and AR(ti) as ﬁxed, for
component of the second order term C(1) along the line ∂Ω = [xa, xb], we arrive at

the L−1

x

two equations in two unknowns:

AL(tn+1) + (β2 − 2)AL(tn) + AL(tn−1)

I

AL(tn) +

(xa) + ax − Mxbx

,

AR(tn+1)(β2 − 2)AR(tn) + AR(tn−1)

I

AR(tn) +

(xb) + Mxax + bx

.

= β2(cid:16)
= β2(cid:16)

(cid:104)
(cid:104)

µ0

α2 Jn(cid:105)
α2 Jn(cid:105)

µ0

(cid:17)
(cid:17)

where Mx = e−α(xb−xa). We can rearrange the linear system into unknown and known values,

ax + Mxbx = wP
a ,

Mxax + bx = wP
b .

113

Solving the linear system for the unknowns ax and bx gives

ax =

(wP

a − MxwP
b )
(1 − M 2
x )

,

bx =

(wP

b − MxwP
a )
(1 − M 2
x )

,

(5.18)

where

(cid:16)
(cid:16)

wP

a =

wP

b =

1
β2
1
β2

AL(tn+1) + (β2 − 2)AL(tn) + AL(tn−1)
AR(tn+1) + (β2 − 2)AR(tn) + AR(tn−1)

(cid:104)
(cid:17) − I
(cid:17) − I
(cid:104)

µ0

α2 Jn(cid:105)
α2 Jn(cid:105)

µ0

AL(tn) +

(xa),

AR(tn) +

(xb).

Taking AD(ti) and AU (ti) as ﬁxed for L−1

y of the second order term C(1) along the line

∂Ω = [ya, yb], we arrive at two equations in two unknowns for ay and by:

AD(tn+1)(β2 − 2)AD(tn) + AD(tn−1) = β2(cid:16)
AU (tn+1)(β2 − 2)AU (tn) + AU (tn−1) = β2(cid:16)

(cid:104)
(cid:104)

µ0

α2 Jn(cid:105)
α2 Jn(cid:105)

µ0

I

AD +

(ya) + ay + Myby

I

AU +

(yb) + Myay + by

(cid:17)
(cid:17)

.

,

(5.19)

(5.20)

where My = e−α(yb−ya). Solving the linear system for the unknowns ay and by gives

ay =

(wP

a − MywP
b )
(1 − M 2
y )

,

by =

(wP

b − MywP
a )
(1 − M 2
y )

,

where

wP

a =

wP

b =

1
β2
1
β2

(cid:16)
(cid:16)

AD(tn+1) + (β2 − 2)AD(tn) + AD(tn−1)
AU (tn+1) + (β2 − 2)AU (tn) + AU (tn−1)

(cid:17) − I
(cid:104)
(cid:17) − I
(cid:104)

AD(tn) +

AU (tn) +

µ0

α2 Jn(cid:105)
α2 Jn(cid:105)

µ0

(ya)

(yb).

For the operator C(2) + D(2) acting on C(1)[An], which is the higher order correction, one

can obtain similar equations, except that the values of AL(ti), AR(ti), AD(ti) and AU (ti) at

the boundaries are explicitly zero. This is due to linearity and the leading order operator

satisﬁes the boundary condition exactly.

114

5.4.2 Embedded Boundary Method, An eﬀective Neumann to Dirichlet Map

for Creating Ghost Points

Given that we know how to solve for the Dirichlet boundary conditions for equations

5.16 and 5.17, as in [14], we would like to extend this process to the PEC case. As discussed

in [13], the reason to employ an embedded boundary approach for a diﬀerential boundary

condition has to do with stability. The diﬀerence here is we need to develop an iterative

map to ﬁnd the homogeneous coeﬃcients a and b for each piece of the operator instead of

directly computing these coeﬃcients.

Our embedded boundary method is based on our initial work in [13]. In this paper, we

developed a second-order embedded boundary method for Neumann boundary conditions

which was second-order accurate in time and space. Using the same ideas, we extended

this method to PEC boundary conditions for the vector potential in 2D. However, in [13],

direct application through the ADI of the operators is more direct. Extending this work to

fourth-order involves developing an approach to solving for the boundary conditions for the
operators C(1), C(2), and D(2).

We use an iterative method to obtain accurate values at the boundaries using an initial

approximation and correct it by imposing our PEC boundary condition through a ﬁxed

point iteration. The ﬁxed point iteration is a multi-step process that converts the derivative

condition along the boundary to a Dirichlet boundary condition, to be set at the ghost

point. These Dirichlet boundary values force the solution to satisfy the PEC condition at

the boundary to within the tolerance of iteration of the ﬁxed point method. It should

be noted that the iteration is local at the boundary and does not involve re-

computing the interior sweeps, making this update cost-eﬀective.

To understand the method, we start by considering a collection of uniform points with

a boundary passing through, see ﬁgure 5.3. We deﬁne the ghost points to be the collection

of points that are greater than one half of a grid spacing away, but less than two and

a half grid spacings away, from the boundary. As in our work in [13], starting from the

115

Figure 5.3: Geometry of 2D embedded PEC boundary method with a boundary point
(xB, yB), ghost point (xG, yG), and interpolation points (xI, yI), (xII, yII), and
(xIII, yIII).

ghost point we deﬁne a normal to the surface, along which we will enforce the boundary

condition. We deﬁne the ﬁxed point method along this normal.

In ﬁgure 5.3, the ghost

point being considered is the red grid point labeled (xG, yG), the blue point labeled (xB, yB)

is the boundary point and there are three interior points needed that are related to the

normal, at (xI, yI), (xII, yII), and (xIII, yIII). Their distances from the boundary point
2∆sI , ξII = |(xII, yII) − (xB, yB)| =
(xB, yB) are ξI = |(xI, yI) − (xB, yB)| =
and ξIII = |(xIII, yIII) − (xB, yB)| = 2∆sI, where we will typically take ∆sI =

√
2∆sI,
√

2∆x.

√

The ﬁxed point method starts by assuming we know AI, AII, AIII, and ∂T AB, which
is taken as a solution at {(xI, yI), (xII, yII) and (xIII, yIII)} and the tangential derivative
of a solution at (xB, yB). Here we enforce ∂T AB = 0 by making ∂T P|(xB,yB) = 0 one
of the conditions we use to solve for the coeﬃcients of the Hermite-Birkhoﬀ interpolating

polynomial, P (x, y) = c0 +c1x+c2y +c3xy. Enforcing that the Hermite-Birkhoﬀ interpolates

these points and explicitly solving for AG gives:

Al+1

G = ΓIAl

I + ΓIIAl

II + ΓIIIAl

III + O(∆x2),

(5.21)

116

where,

where,

ΓI = 1 +

γ3(γ4 + γ5) − γ6(γ1 + γ2)

(γ4γ2 − γ5γ1)

,

ΓII =

ΓIII =

,

(γ6γ2 + γ3γ5)
(γ4γ2 − γ5γ1)
(γ6γ1 + γ3γ4)
(γ4γ2 − γ5γ1)

,

γ1 = xII − xI + yII − yI,
γ2 = xI − xIII + yI − yIII,
γ3 = xG − xI + yG − yI,
γ4 = (xII − xI)(xB − yB) + xIyI,
γ5 = (xI − xIII)(xB − yB) + xIIIyIII,
γ6 = (xB − yB)(xG − x1) + xG(yG − y1).

(5.22)

(5.23)

We have introduced additional notation of l + 1 and l, which designate how we will set up a

ﬁxed point iteration using the Hermite-Birkhoﬀ interpolant. The iteration itself is a simple

ﬁrst order ﬁxed point method. Let w(l) and w(l+1) be the solutions at the lth and (l + 1)th

iterations. We choose a tolerance tol, and maximum number of allowed iterations mit, and
deﬁne a stopping criterion as |w(l+1) − w(l)|∞ < tol OR nit > mit where nit is the current
iteration number. We will now discuss the quasi local process for obtaining w(l+1) given the

rest of the information.

For each term in the operator on the right hand side of the second order solution given

by equation 5.16, or fourth order in time solution given by equation 5.17, we must identify

the correct ghost point that allows that term to guarantee that the tangential derivative is
zero along the boundary. Taking the C(1)[An] operator and expanding the operator out we

117

x [An] + L−1

have C(1)[An] = L−1
y [An] − L−1
identiﬁes the ghost point for wx = L−1
L−1
y [wx] such that when solving for the boundary correction terms the operator satisﬁes
∂T wx|∂Ω = 0, ∂T wy|∂Ω = 0, ∂T wxy|∂Ω = 0, ∂T wyx|∂Ω = 0.

x L−1
y [An], wxy = L−1

y L−1
x [An] − L−1
x [An], wy = L−1

y [An]. The ﬁxed point iteration

x [wy], and wyx =

Let us consider wx and wyx = L−1

y [wx], knowing that wy and wxy are similar. The

iterative process for wx starts by making an initial guess at the direct boundary values
wx|∂Ω, using an extrapolate in time wx
n−2 at each boundary

n+1,(0) ≈ 3wx

n−1 + wx

n − 3wx

point (Algorithm 20).

Algorithm 20 Compute wx(= L−1
x [A])
1: Compute the interior sweep (particular solution) I n+1
n − 3wx
2: Initial guess wx
3: Initialize the iteration counter nit = 0
4: repeat
5:
6:

n+1(0) ≈ 3wx

for k = 1 to ny do

n−1 + wx

n−2

x = Ix[An]

Compute wxI(k), and wxII(k), at the interpolation points (xI(k), y(k)), and
(xII(k), y(k)) and as well as at the interpolation points along the right boundary
using bilinear interpolation
Compute solution at the ghost points (xaG(k), y(k)) and (xbG(k), y(k)) using the
Hermite-Birkhoﬀ interpolant
Compute homogeneous coeﬃcients a1(k) and b1(k) using the fact
wx
Compute solution and update boundary stencil
wx

n+1(l) = I n+1 + ax(k)e−α(x−xaG(k)) + bx(k)e−α(xbG(k)−x)
n+1(l+1) = I n+1 + ax(k)e−α(x−xaG(k)) + bx(k)e−α(xbG(k)−x)

7:

8:

9:

10:
11:

end for
nit = nit + 1
12: until (|wx

n+1(l+1) − wx

n+1(l)| < tol OR nit > mit)

Given the update on wx, the next step is to consider wyx = L−1

y [wx]. The process starts

with the initial guess wx

n+1,(0) ≈ 3An − 3An−1 + An−2 at the boundary (Algorithm 21).

The same process is done for wy and wxy. The Hermite-Birkhoﬀ interpolate enforces
that the tangential derivative is zero for wx, wy, wxy and wyx such that C(1)[An] satisﬁes

the tangential derivative condition on ∂Ω to within tolerance.

For the fourth order formulation, we compute the boundary conditions for C(1)[An] and

then we repeat the process for C(2) and D(2) acting on C(1)[An].

118

y [wx])

Algorithm 21 Compute wyx(= L−1
1: Compute the interior sweep (particular solution) I n+1
2: Initial guess wyx
3: Initialize the iteration counter nit = 0
4: repeat
5:
6:

n+1(0) ≈ 3An − 3An−1 + An−2

for k = 1 to nx do

y = Iy[wx

n]

Compute wyxI(k), and wyxII(k), at the interpolation points (x(k), yI(k), and
(x(k), yII(k)) and as well as at the interpolation points along the right boundary
using bilinear interpolation
Compute solution at the ghost points (x(k), yaG(k)) and (x(k), ybG(k)) using the
Hermite-Birkhoﬀ interpolant
Compute homogeneous coeﬃcients a1(k) and b1(k) using the fact
wyx
Compute solution and update boundary stencil
wyx

n+1(l) = I n+1 + ay(k)e−α(y−yaG(k)) + by(k)e−α(ybG(k)−y)
n+1(l+1) = I n+1 + ay(k)e−α(y−yaG(k)) + by(k)e−α(ybG(k)−y)

7:

8:

9:

10:
11:

end for
nit = nit + 1
12: until (|wyx

n+1(l+1) − wyx

n+1(l)| < tol OR nit > mit)

Further as detailed in [13] we need to include an artiﬁcial dissipation term to maintain

stability for these embedded boundary domains. Thus we have

(cid:18)

β2D(2)

xy − β4
12

C(2)

xy

(cid:19)

An+1 − 2An + An−1 = −β2C(1)

xy [An] −

C(1)
xy [An] + C(3)

xy [An−1].

where  is an artiﬁcial dissipation coeﬃcient that satisﬁes 0 <  < 1. The value of C(3)
at the previous time step can be used in place of C(3)
we need to go to one more computing level to obtain C(3)

xy [An]
xy [An−1] at the current time step, and

xy [An] [68]. We obtain AI, AII, and

AIII approximately, using the bilinear interpolation. Suppose the interpolation point AI

lies in a rectangular cell with corners (xi, yj), (xi+1, yj), (xi+1, yj+1), and (xi, yj+1). Then we

have the following approximation for AI

AI =w1Ai,j + w2Ai+1,j + w3Ai+1,j+1 + w4Ai,j+1,

(5.24)

119

where

w1 =

w3 =

(xi+1 − xI)(yj+1 − yI)
(xI − xi)(yI − yj)

∆x∆y

∆x∆y

,

,

w2 =

w4 =

(xI − xi)(yj+1 − yI)
(xi+1 − xI)(yI − yj)

∆x∆y

∆x∆y

,

.

(5.25)

5.5 Experimental Results

5.5.1 Square Cavity Rotated Through Diﬀerent Angles

For this experiment, we chose a square cavity bounded by a PEC, placed a point source

at the center of the cavity, and performed a ping test and mode analysis. First, we computed

the vector potential A at a point inside the box using our scheme, then computed the derived

frequency by taking FFT over the time history of the measured A, and ﬁnally analyzed the

frequency modes for varying CFL value, step size, and rotation angle. Further, we have

transformed into physical units here in contrast to the normalized units we have used up to

this point.

We chose a square box 21cm × 21cm with PEC boundaries in a square domain (Ω =
[−21cm, 21cm]2), placed a point source 1 at the center of the box,(0, 0), and turned it on for

a single time step t = ∆tns. The vector potential is measured at the point (3.36cm, 3.36cm)

for the time period t = [∆tns, T ns]. The derived frequencies for diﬀerent CFL values (0.5,

1.0, 2.0), rotation angles (0o, 31.42o, 45o), and resolutions are summarized in tables 5.5.1

and 5.5.1. Table 5.5.1 consists of frequencies obtained for the CFL value 1. Here, we set
the wave speed c = 30 Gcms−1, averaging parameter β = 1.4 and dissipation coeﬃcient

 = 0.1. Figure 5.4 shows the frequency distribution for θ = 31.42o which consists the 1-1

strong mode fundamental frequency 1GHz as expected and Figure 5.5 shows a frequency

mode computed for diﬀerent resolutions with CFL value 1. We see clear convergence to the

analytically computed 1-1 mode fundamental frequency 1GHz. For the ﬁxed CFL and wave

speed, time step size ∆t reduces with reducing spatial step size ∆x. Hence, the amplitude

120

is reducing by the decreasing spatial step size.

Figure 5.4: Frequency distribution for 31.42o rotated 21cm × 21cm square cavity computed
using the measured vector potential at the point (3.36cm, 3.36cm) for the impulse
response, h = 0.084 cm

Figure 5.5: A strong fundamental mode for 31.42o rotated 21cm × 21cm square cavity com-

puted for diﬀerent resolutions.

121

(a) 1-3 mode,

f = 2.2588GHz

(b) 3-3 mode,

f = 3.0305GHz

(c) 1-5 mode,

f = 3.6422GHz

(d) 3-5 mode,

f = 4.1650GHz

(e) 5-5 mode,

f = 5.0508GHz

Figure 5.6: Convergence plots for the natural modes (a) 1-3, (b) 3-3, (c) 1-5, (d) 3-5, and
(e) 5-5 with analytically computed frequencies (a) f = 2.2588GHz, (b) f =
3.0305GHz, (c) f = 3.6422GHz, (d) f = 4.1650GHz, and (e) f = 5.0508GHz
for 31.42o rotated 21cm× 21cm square cavity computed for diﬀerent resolutions.

122

Rotated angle, θ

Grid size, N

CF L = 0.5, T = 50ns CF L = 2, T = 200ns
45o

31.42o

31.42o

45o

0o

0o

502
1002
1502
2002
2502

0.98
1.00
1.00
1.00
1.00

0.99
1.00
1.00
1.00
1.00

0.96
0.98
0.98
1.00
1.00

0.98
1.00
0.99
1.00
1.00

0.98
0.99
1.00
1.00
1.00

0.96
0.98
0.98
0.99
1.00

Table 5.1: Frequency (in GHz) obtained at the point (3.36cm, 3.36cm) using the ping test.

Rotated angle, θ

Grid size, N

0o

31.42o

45o

502
1002
1502
2002
2502

9.599802E-1
9.899800E-1
9.899800E-1
1.000015E+0
1.000008E+0

9.899800E-1
9.999800E-1
9.999800E-1
1.000115E+0
1.000108E+0

9.899800E-1
9.999800E-1
9.999800E-1
1.000015E+0
1.000008E+0

Table 5.2: Frequency (in GHz) obtained at the point (3.36cm, 3.36cm) using the ping test

for CFL 1.

Further, we evaluated the frequencies of the normal modes 1-3, 3-3, 1-5, 3-5, and 5-5.

Our convergence plots (Figure 5.6) show the clear convergence to the analytically computed

frequencies 2.2588GHz, 3.0305GHz, 3.6422GHz, 4.1650GHz, and 5.0508GHz for the modes

1-3, 3-3, 1-5, 3-5, and 5-5 respectively.

Figure 5.7 shows the time evolution of the potential A generated by a point source

sin(2πf t) with f = 1GHz placed at the center of the box. We chose the same domain as
used in the previous experiment and set the grid size to be 100×100, time step size ∆t = 7ps.

5.5.1.1 Convergence Studies and Error Analysis

We evaluate the consistency of our scheme via (1) the ping test and (2) space-time

convergence studies for the magnetic vector potential A in the PEC square cavity.

First, we perform a convergence study given the initial condition sin(cid:0) mπx

(cid:1) sin(cid:0) nπy

(cid:1) with

L

H

frequency mode 3-2 (m = 3, n = 2) over a square domain [0cm, 21cm]2 and compare our

123

(a) θ = 0o,

t = 0.637ns

(b) θ = 31.42o,

t = 0.637ns

(c) θ = 45o,

t = 0.637ns

(d) θ = 0o,

t = 1.176ns

(e) θ = 31.42o,

t = 1.176ns

(f) θ = 45o,

t = 1.176ns

Figure 5.7: Time evolution of a point source ﬁeld sin(2πf t) with f = 1GHz placed at the
center of a PEC square box of grid size 100×100 rotated by the angles 0o, 31.42o,
45o, time step size ∆t = 7ps.

numerical solution to the exact solution given by,

A(t, x, y) = cos

c

(

mπ
L

)2 + (

nπ
H

(cid:18)

(cid:114)

(cid:19)

)2t

sin

(cid:16)mπx

(cid:17)

L

(cid:16)nπy

(cid:17)

H

sin

, where the wave speed c is chosen to be 30Gcms−1. Our L2 norm error plots of the solution

at time T = 1.0ns with varying resolutions and a ﬁxed CFL( = 1) for the cases of mesh

aligned and nonaligned (rotated by 45o) square cavities show second order accuracy (Figure

5.8 - (a)). We also note that the tangential derivative along the boundary converges with

124

third order accuracy (Figure 5.8 - (b)).

(a) Space convergence of solution

(b) Space convergence of boundary derivative

Figure 5.8: Convergence plots for a) space using entire solution and b) space using boundary
derivative on a PEC square domain [0cm, 21cm]2. This study measures L2 norm
of the error at time T = 1.0ns compared with the analytical solution, for the
cases of mesh aligned and nonaligned (rotated by 45o).

For the second evaluation, we compute the error with our fundamental frequency com-

putation using the ping test explained above and summarize it in Table 5.5.1.1. The error

is the diﬀerence between the frequencies (1-1 mode) obtained analytically (= 1GHz) and

numerically using our scheme.

Rotated angle θ

Grid size, N

CF L = 0.5, T = 50ns
31.42o

0o

CF L = 2, T = 200ns
31.42o

0o

502
1002
1502
2002
2502

1.001980E-2
1.999960E-5
1.999960E-5
1.999960E-5
8.0000064E-6

4.501910E-2
2.001960E-2
2.001960E-2
1.500023E-5
8.000064E-6

1.501970E-2
5.019900E-3
1.999960E-3
5.015075E-4
5.0080040E-5

4.501910E-2
2.001960E-2
1.501970E-2
4.985075E-4
8.000064E-6

Table 5.3: Numerical error in the frequency computation using the ping test at the point

(3.36cm, 3.36cm).

Finally, we perform a self-reﬁnement study for time and space convergence using a point

source. Figure 5.9 shows the time and space convergence plots for the cases of mesh aligned

and nonaligned (rotated by 45o) square cavities with the same conﬁguration as used in

125

the above experiment done for the time evolution of a point source ﬁeld. For the time
convergence test, we maintain the resolution at 320 × 320 and reduce the time step size

∆t = 10.9ps

to

0.3418ps. For the space convergence test, we maintain the CFL value

as 1 and reduce the spatial step size ∆x = 1.313cm to 0.082cm. We obtained fourth-order

convergence in time and second-order in space.

(a) Time convergence

(b) Space convergence

Figure 5.9: Convergence plots for (a) time and (b) space on a PEC a square domain
[0cm, 21cm]2. This study measures L∞ norm of the error at time T = 1.0ns
compared with the analytical solution, for the cases of mesh aligned and non-
aligned (rotated by 45o).

5.5.2 Square Cavity with a leak (diﬀraction Q)

The Q factor (quality factor) of a resonator is a measure of the damping, with a lower

Q indicating higher damping [33]. It measures the harmonic losses of the oscillations. The

diﬀraction Q factor is the external Q factor which characterized the energy loss due to

radiation. The value of Q is inﬁnite for a closed cavity and ﬁnite for an open cavity and it

can be used to determine the amount of energy loss. The diﬀraction Q can be expressed as,

Q = −πf (t2 − t1)

ln( A1t1
A2t2

)

where A1 and A2 are the vector potentials at time t1 and t2.

126

In the second experiment, an open boundary is placed along the center of the right edge

of the square cavity, and the diﬀraction quality factor Q is obtained for an oscillating point

source (sin(2πf t), f = 1GHz) placed at the center.

Every conﬁguration was the same as above, except for imposing an outﬂow boundary

condition along the open boundary and keeping the Gaussian pulse running for the entire

time period [∆tns, T ns]. The vector potential A is measured at the point (3.36cm, 3.36cm)

and the computation of Q is done for aligned and nonaligned mesh cases. For non aligned

meshes, we rotate the square by an angle θ = 45o.

Analytic calculation of the quality factor treating an open cavity like a transmission line

with a load simulating free space (377Ω) gives a value of Q=24 [36]. The Q values obtained

from our scheme (Q = 26.5768 for θ = 0o and Q = 25.7875 for θ = 31.420) are very close to

the analytically obtained value. Figure 5.10 shows the time evaluation of vector potential A

at the point (3.36cm, 3.36cm) for both cases.

(a) θ = 0o, Q = 26.5768

(b) θ = 31.420, Q = 25.7875

Figure 5.10: Time evaluation of A in (a) mesh aligned and (b) θ = 31.420 rotated square
cavities with a leak imposed by outﬂow boundary condition and diﬀraction Q
which is obtained for an oscillating point source (sin(2πf t), f = 1GHz)

127

5.5.3 A6 Magnetron

We chose a 2D A6 magnetron to test the applicability of our PEC scheme to complicated

geometry. For this experiment, a 2D domain of Ω = [0cm, 5cm]2 is chosen with radius of

the vane resonators rv = 4.11cm, anode radius ra = 2.11cm, cathode radius rc = 1.58cm,
angular width of vane, 20o, and cavity angle, 40o. We set the grid size to be 128 × 128, time

step size ∆t = 39ps, averaging parameter β = 1.4 and dissipation coeﬃcient  = 0.1. The
embedded PEC boundary condition is imposed over the boundary stencil during the x− and
y− sweeps.

This experiment was conducted for frequency mode analysis. A point source was placed

in the center of the A6M and the frequency distribution examined using a ping test at the

point (1.4063cm, 0.8203cm). We obtained six strong frequency modes as shown in ﬁgure

5.11, associated with the ﬁrst two passbands, clearly showing the eﬀects of all six symmetric

resonances.

Figure 5.11: Frequency spectrum of 2D A6M with vane resonators rv = 4.11cm, anode radius
ra = 2.11cm, cathode radius rc = 1.58cm, angular width of vane, 20o, and
cavity angle, 40o, the grid size 128 × 128, time step size ∆t = 39ps, averaging
parameter β = 1.4 and dissipation coeﬃcient  = 0.1.

We further simulated the A6 magnetron with a transparent cathode (5.12) which is

128

mimicked by using six emitters/point sources and calculated the time evolution for the same

conﬁguration as above. The formulation maintains symmetry and high accuracy for relatively

coarse-grained solutions. For example, in the A6 magnetron results, there are only 12 points

across the neck of the magnetron vanes.

(a) t = 1.23ns

(b) t = 3.01ns

(c) t = 5.68ns

(d) t = 8.81ns

(e) t = 11.95ns

(f) t = 17.87ns

Figure 5.12: Evolution of the transparent cathode A6M with vane resonators rv = 4.11cm,
anode radius ra = 2.11cm, cathode radius rc = 1.58cm, angular width of vane,
20o, and cavity angle, 40o, the grid size 128 × 128, time step size ∆t = 39ps,
averaging parameter β = 1.4 and dissipation coeﬃcient  = 0.1.

5.5.4 Rising Sun Magnetrons

We chose Rising Sun magnetrons with 12 and 18 cavities for further evaluation of our

scheme to complicated geometry. Figure shows the 2D view of the 18 cavity rising sun

129

magnetron (AX9) with the cathode of radius rc and the anode with inner radius ra, short

vane radius rvs and long vane radius rvl. For this experiment, a 2D domain of Ω = [0cm, 5cm]2

is chosen with radius of the wider and shorter vane resonators rvs = 4.11cm, and rvl = 4.91cm

respectively, anode radius ra = 2.11cm, cathode radius rc = 1.58cm. For the 12 cavity rising

sun magnetron , we chose the angular width of vane, 12o, and cavity angle, 18o. For the 18

cavity rising sun magnetron , we chose the angular width of vane, 12o, and cavity angle, 8o.
We set the grid size to be 128 × 128, time step size ∆t = 39ps, averaging parameter β = 1.4

and dissipation coeﬃcient  = 0.1. The embedded PEC boundary condition is imposed over
the boundary stencil during the x− and y− sweeps.

Figure 5.13: 2D view of the 18 cavity rising sun magnetron (AX9) with a cathode of radius

rc and anode of inner radius ra, short and long cavity radii rvh and rvl.

We simulated the rising sun magnetrons with transparent cathodes ( Figures 5.14 and 5.15

) which are mimicked by using 12 and 18 emitters/point sources for 12 and 18 cavity rising

sun magnetrons respectively and calculated the time evolution for the same conﬁguration as

above. The formulation maintains symmetry and high accuracy for relatively coarse-grained

solutions.

130

(a) t = 0.72s

(b) t = 0.72s

(c) t = 5.68s

(d) t = 11.86s

(e) t = 5.68s

(f) t = 11.86s

Figure 5.14: Evolution of the transparent cathode 12 cavity rising sun magnetron with vane
resonators rvs = 4.11cm and rvl = 4.91cm, anode radius ra = 2.11cm, cathode
radius rc = 1.58cm, angular width of vane, 12o, and cavity angle, 18o, the grid
size 128 × 128, time step size ∆t = 39ps, averaging parameter β = 1.4 and
dissipation coeﬃcient  = 0.1.

131

(a) t = 0.81s

(b) t = 0.81s

(c) t = 2.05s

(d) t = 5.72s

(e) t = 5.72s

(f) t = 5.72s

Figure 5.15: Evolution of the transparent cathode 18 cavity rising sun magnetron (AX9)
with vane resonators rvs = 4.11cm and rvl = 4.91cm, anode radius ra = 2.11cm,
cathode radius rc = 1.58cm, angular width of vane, 8o, and cavity angle, 12o,
the grid size 128 × 128, time step size ∆t = 39ps, averaging parameter β = 1.4
and dissipation coeﬃcient  = 0.1.

5.6 Summary

We derived wave equations for vector and scalar potentials A and φ under the Lorenz

gauge condition and obtained the solution for the vector potentials based wave equation

subject to PEC boundary condition. We explained the embedded PEC boundary conditions

in 2D and the consistency and performance of the scheme are conﬁrmed using the ping

test and frequency mode analysis for rotated square cavities. We then demonstrate the

132

diﬀraction Q value test and the use of this method for simulating an A6 magnetron. We

obtained strong six resonance modes for the impulse response of A6 magnetron as expected.

In the next chapter, we describe our scalable software solution MOLTN in detail.

133

CHAPTER 6: Software Solution and Code Acceleration

6.1 Introduction

In this chapter, we describe our new open-source code development work and give scala-

bility studies for the OpenMP and GP-GPU CUDA versions.

The exciting EM software solutions in plasma science are developed based on the FDTD-

PIC method [51; 67; 32; 70] and they have limitations as described previously in section 1.1.

We have proved that our MOLT based scheme can overcome those limitations. Hence, we

are motivated to develop a software solution to bring our scheme to the scientiﬁc commu-

nity. We are developing an open source code MOLTN (Method Of Line Transpose based

Nth arbitrary order scheme) which is intended to be an architecture-independent, scalable

software tool, using MPI, OpenMP, and as well as GPU CUDA implementation. We are

working with the templated C++ library Kokkos [24] at the lowest levels of the code to

achieve this goal.

In this chapter, we give the design of MOLTN software and scalability studies for the

OpenMP version in section 6.2 and the parallel design and scalability studies for the GP-GPU

CUDA version in section 6.3.

6.2 Multi-core OpenMP, Multi-node MPI version

As an initial work, we developed the code in C++ using multi-core, shared memory

OpenMP and multi-node, distributed memory MPI. For this implementation, we ﬁrst de-

compose the domain as a set of subdomains and assign each sub-domain per node. We use

OpenMP pragmas for the data parallelization within the nodes and each node communicates

134

with each other using the MPI library. In our scheme, internal sweeps are carried out on

each node and boundary information is passed between the adjacent nodes during the left

and right oriented sweeps for the fast convolution integration (as described in the section

2.2.2). We give an overview of the structure of the MOLTN in the following section,

6.2.1 Data Structures and Data Flows

As we have seen, we know the kernel of this scheme is the O(n) one-dimensional recursive

fast convolution to obtain the particular solution and the homogeneous solution ( section

2.2). Thus, we deﬁned the MOLT kernel using a C++ object to implement this algorithm

with subroutines to build quadrature weights, perform left and right oriented sweeps, and

apply boundary conditions. We deﬁned another major object, MOLT mesh, to represent

the domain with every relevant parameter. The MOLT mesh acts as the main storage

place of our implementation, which consists of a set of one-dimensional C++ pointer arrays.

The controller object deﬁned in MOLTN is the MOLT time-step which receives the user-

deﬁned object and parameters, builds MOLT mesh and performs needed computation and

communication. We use node-core hybrid parallelism using MPI and OpenMP, thus initially,

the whole domain has to be partitioned and distributed over the HPC cluster nodes. Then

computing tasks have to be performed and the resultant relevant data must be transferred

between the nodes. The MOLT time-step performs this task by invoking appropriate

routines deﬁned in the MOLTN library API that includes MOLT kernel, MOLT mesh,

and MOLT time-step. So the application class can include these library ﬁles to carry out

its work. The application of these libraries can be extended beyond the wave solver such as

advection-diﬀusion, magnetohydrodynamic (MHD), and ﬂuid dynamics.

135

6.2.2 Domain Decomposition

Here we present our domain decomposition strategy ([64]). Consider the L−1(u) operator:

M (u) =

u(y)e−α|x−y|dy + A0e−α(x−a) + B0e−α(b−x).

a

We seek to partition this expression for subdomains, so we deﬁne ci such that c0 < ... < ci <

... < cN for some N , where c0 = a and cN = b, and thus for ci < x < ci+1,

Mi[u](x) =

u(y)e−α|x−y|dy + Aie−α(x−ci) + Bie−α(ci+1−x).

This expression can be rewritten in terms of the left and right sweep operators

ci+1(cid:90)

Mi[u](x) =e−α(x−ci)

u(y)eα(y−ci)dy + e−α(ci+1−x)

u(y)eα(ci+1−y)dy

ci

x

+ Aie−α(x−ci) + Bie−α(ci+1−x).

We evaluate this expression at x = ci:

Mi[u](ci) =e−α(ci+1−ci)

u(y)eα(ci+1−y)dy + Ai + Bie−α(ci+1−ci),

(6.1)

(6.2)

b(cid:90)

ci+1(cid:90)

ci

x(cid:90)

This is computed in the following way: we deﬁne

ci+1(cid:90)

ci

x(cid:90)
b(cid:90)

a

I L[u](x) =

I R[u](x) =

x(cid:90)
b(cid:90)

a

u(y)e−α|x−y|dy = e−αx

u(y)eαydy

u(y)e−α|x−y|dy = eαx

u(y)e−αydy

x

x

136

So then

M (u) = I L[u](x) + I R[u](x) + Ae−α(x−a) + Be−α(b−x)

Suppose we decompose the domain [a, b] to three sub domains 1,2, and 3 with pseudo

boundaries c0, c1, c2, and c3 as shown in Figure 6.1, we can now apply the above expression

for the each sub domain,

Figure 6.1: Domain Decomposition of the domain [a, b] to three 1D sub domains [c0, c1],
j ) sweeps over

[c1, c2],and [c2, c3], each evolves own internal left (I L
the domain of length ∆cj.

j ) and right (I R

A1e−α(x−a) + B1e−α(c1−x) − A0e−α(x−a) − B0e−α(b−x)

= e−α(c1−x)I R

2 [u](c1) + e−α(c2−x)I R

3 [u](c2)

A2e−α(x−c1) + B2e−α(c2−x) − A0e−α(x−a) − B0e−α(b−x)

= e−α(x−c1)I L

2 [u](c1) + e−α(c2−x)I R

3 [u](c2)

A3e−α(x−c2) + B3e−α(b−x) − A0e−α(x−a) − B0e−α(b−x)

= e−α(x−c1)I L

1 [u](c1) + e−α(x−c2)I L

2 [u](c2)

137

We deﬁne ∆c1 = a − c1, ∆c2 = c2 − c1, and ∆c3 = b − c2. We now write:

A1e−α(x−a) + B1e−α(c1−x) − A0e−α(x−a) − B0e−α(c1−x)e−α∆c2e−α∆c3

= e−α(c1−x)I R

2 [u](c1) + e−α(c1−x)e−α∆c2I R

3 [u](c2)

A2e−α(x−c1) + B2e−α(c2−x) − A0e−α∆c1e−α(x−c1) − B0e−α∆c3e−α(c2−x)

= e−α(x−c1)I L

2 [u](c1) + e−α(c2−x)I R

3 [u](c2)

A3e−α(x−c2) + B3e−α(b−x) − A0e−α∆c1e−α∆c2e−α(x−c2) − B0e−α(b−x)

= e−α(x−c2)e−α∆c2I L

1 [u](c1) + e−α(x−c2)I L

2 [u](c2)

From the equations 6.3, 6.4, and 6.5 we can discern

(6.3)

(6.4)

(6.5)

2 [u](c1) + e−α∆c2I R

3 [u](c2)

A1 = A0
B1 = B0e−α∆c2e−α∆c3 + I R
A2 = A0e−α∆c1 + I L
B2 = B0e−α∆c3 + I R
A3 = A0e−α∆c1e−α∆c2 + e−α∆c2I L

3 [u](c2)

2 [u](c1)

1 [u](c1) + I L

2 [u](c2)

B3 = B0

We choose ∆t such that e−α∆ci = 0 for i = 1...N , hence,

A1 = A0, B1 = I R

2 [u](c1),

A2 = I L

2 [u](c1), B2 = I R

3 [u](c2),

A3 = I L

2 [u](c2), B3 = B0.

138

6.2.3 Performance Analysis

As an initial evaluation of MOLTN, we tested it using a single node multi-core platform.

We chose a node with 28 cores which facilitated with Intel(R) Xeon(R) CPU E5-2680 v4 @

2.40GHz processor and 492 GB memory to examine our OpenMP parallel MOLTN code with-

out using any accelerators. This strong scaling study shows nearly linear speedup (in Figure

6.3) and approximately 85% eﬃciency ( in Figure 6.4). We observe less speedup/eﬃciency

for a small amount of work due to the leading eﬀect of communication cost including com-

munication overhead. Speedup and eﬃciency reduce with increasing core counts due to

memory bounds.Hence, it shows positive signs for the scalability of our scheme, but we need

to prove the scalability of the hybrid version and also have to work with the performance

enhancement using the accelerators such as NVIDIA GPU and Intel SIMD vectorization. We

believe that designing an algorithm based on the architecture is the best practice to achieve

high performance, and focus to work deeply with lower-level code optimization technologies,

NVIDIA warp primitive-level programming and SIMD vectorization intrinsics.

Figure 6.2: Log-log plot for wall time using 28 core Intel(R) Xeon(R) CPU E5-2680 v4 @

2.40GHz processor, for the grid size N 2, N = 64, 128, 256, 1024, 2048.

139

Figure 6.3: Log-log plot for speedup in using 28 core Intel(R) Xeon(R) CPU E5-2680 v4 @

2.40GHz processor for the grid size N 2, N = 64, 128, 256, 512, 1024, 2048.

Figure 6.4: Log-log plot for eﬃciency using 28 core Intel(R) Xeon(R) CPU E5-2680 v4 @

2.40GHz processor for the grid size N 2, N = 64, 128, 256, 512, 1024, 2048.

140

6.3 GPGPU CUDA version

In this section, we describe our initial work of GPGPU CUDA implementation of our

scheme and performance evaluation.

6.3.1 Task Organization

For this implementation, we chose the two-dimensional second-order scheme,

un+1 − 2un + un−1 = −β2C(1)

xy [un]

where,

Cxy = (L−1

y + L−1

x ) − (L−1

y L−1

x + L−1

x L−1
y )

(6.6)

Based on the structure of the scheme, the computation of the solution un+1 at the time step

tn+1 can be obtained through four parallel stages using task paralleling,

1. Perform the ﬁrst set of x- and y- sweeps:

Compute wx = L−1

x [u] and wy = L−1

y [u] concurrently,

2. Perform the second set of x- and y- sweeps:

Compute wyx = L−1

x [wx] and wxy = L−1

y [wy] concurrently,

3. Obtain Cxy:

Compute Cxy1 = wx − wyx and Cxy2 = wy − wxy concurrently,

4. Obtain un+1:

Compute un+1

1 = 2un − β2Cxy1 and un+1

2 = −un−1 − β2Cxy2 concurrently,

and ﬁnally obtain un+1 = un+1

1 + un+1

2

. We choose two CUDA streams to go through

these stages, Figure 6.5 shows the processing ﬂow of the two streams.

141

Figure 6.5: Task parallelism for the computation of un+1 based on Cxy and parallel stages

executing through CUDA streams.

6.3.2 Performance Analysis

We evaluated the scalability and performance of the scheme using NVIDIA Tesla K20 and

K80 GPU accelerators. Tesla K20 (2880 CUDA cores) and K80 (4992 CUDA cores) GPUs

were compared with a CPU, Intel(R) Xeon(R) CPU E5-2680 v4 @ 2.40GHz processor. This

study shows the nearly 45X and 35X speedup for Tesla K80 and K20 respectively with full

utilized CUDA cores(in Figure 6.7) and linear time complexity for both CPU and GPU

architectures (Figure 6.6). Even though the evaluation result shows reasonable improve-

ment in time and cost, we have to include the MPI library to communicate between nodes

for the distributed multi-node system and also consider the lower-level code optimization

technologies, NVIDIA warp primitive-level programming.

142

Figure 6.6: Log-log plot for wall time using Intel(R) Xeon(R) CPU, Tesla K20, and K80

GPUs for the grid sizeN 2, N = 32, 64, 128, 256, 512, 1024, 2048, 4096, 8192.

Figure 6.7: Intel(R) Xeon(R) CPU vs Tesla K20 and K80 GPU speedup (log-log plot) for

the grid size N 2, N = 32, 64, 128, 256, 512, 1024, 2048, 4096, 8192.

143

6.4 Summary

We have been developing an architecture-independent, scalable software tool MOLTN

using MPI, OpenMP, and as well as GPU CUDA implementation. In this chapter, we have

described the overview of the design of MOLTN, parallel implementation of GP-GPU CUDA

version, and scalability studies for both OpenMP and CUDA versions. In the next chapter,

we summarize the work carried out through these studies and give useful ideas for future

work.

144

CHAPTER 7: Summary and Potential Future Directions

7.1 Introduction

In this chapter, I conclude the work carried out through this study, emphasize my con-

tributions to this research work, and identify potential future directions.

7.2 Conclusion

Based on my literature review, I identiﬁed the MOLT based scheme developed by Christlieb

team [15] as an unconditionally stable, fast, and implicit scheme and arbitrary order accuracy

for Dirichlet and periodic boundaries on a Cartesian grid. Hence, I chose the scheme for my

simulation study and overcame the identiﬁed limitations on it. The major limitation is that

the scheme was second order for outﬂow and Neumann boundary conditions. So, I worked

to extend the order of accuracy for Neumann and outﬂow for arbitrary order and achieved a

scheme with fourth-order spatiotemporal accuracy for complex geometry problems including

curved boundaries. Further, I deﬁned a general framework for complex geometry problems

and simulated such problems using an embedded boundary method. Speciﬁcally, I developed

a simulation tool for HPM tubes, such as A6 magnetron, 12 and 18 cavity rising suns us-

ing PEC boundary condition. For this simulation, I derived the scheme for electromagnetic

vector potential using the Lorenz gauge and imposed PEC boundary condition in 2D. I eval-

uated the simulation of A6 magnetron using a ping test and obtained six strong resonance

modes. This is one of the challenging tasks that I accomplished.

I extended the implementation of the scheme to 3D and obtained multiple applications

such as 3D variable speed waves, spherical scattering, 3D point source, and A6 magnetron in

145

3D. The simulation of three-dimensional A6 magnetron is an additional important milestone.

I developed codes in C++ for these simulations and improved those performances by code

optimization and acceleration using parallel technologies in diﬀerent architectures including

NVIDIA GPU. Further, I have been working with the scalable open source MOLTN devel-

opment and contributed to the development of multi-core shared memory OpenMP version

and wave kernel development using kokkos.

7.3 Future Directions

7.3.1 Code Acceleration for MOLTN

As described in the Chapter 6, we are developing an open source scalable software to make

visible our MOLT based scheme to the scientiﬁc community. We are on the initial stage of

this development and will have to optimize the code in diﬀerent levels; i) low-level architec-

ture based ii) compiler related iii) high-level code based optimizations. To obtain accelerated

code in Intel architectures, we can use the SIMD vectorization using Intel intrinsic instruc-

tions, which are C style functions that provide access to many Intel instructions, including

Intel SSE (128 bits Streaming SIMD Extension), AVX (256 bits Advanced Vector Extension),

and AVX512 (512 bits Advanced Vector Extension) etc. without the need to write assembly

code. The SIMD vectorization techniques can increase processor throughput by perform-

ing multiple computations in a single instruction. We can use Arithmetic and Elementary

Math functions such as the function m256d mm256 add pd( m256d,

m256d) to add

packed double-precision (64-bit) ﬂoating-point elements using AVX. We can also use the
compiler ﬂags for the auto-vectorization (e.g. −O2 or higher in intel). Another well-known

accelerator is NVIDIA GPUs, which execute warps of 32 parallel threads using SIMT (Sin-

gle Instruction Multiple Threads - multiple threads issue common instructions to arbitrary

data) technologies. The explicit warp-level programming using warp-level primitives which

were introduced in CUDA 9.0, is safe, eﬀective and will give high-performance. We can use

146

NVIDIA’s intrinsic functions to make the code even faster with reduced accuracy, but they

can be used in device codes only.

7.3.2 Complex Geometry in MOLTN

We have already implemented simple rectangular-shaped boundary problems on the

Cartesian grids for MOLTN, so we have to deﬁne a general framework for complex ge-

ometry problems on MOLTN in the future. To obtain this, i) convert MOLT kernel to

able to operate on line-segments instead of lines because line-segments are unit elements for

complex geometry cases. ii) we have to deal with the work load-balancing and data locality

carefully in order to make sure better performance is obtained. I can suggest a max-ﬁt policy

which takes care of load balancing and locality at the same time.

7.3.3 Further HPM tube simulations

Even though the concept of the magnetrons is the same, every magnetron is not designed

by the same structure. They may diﬀer in the number of cavities (12 or 18 cavity rising

suns), the shape of the cavities (triangular or circular cavity magnetrons), and radius of

the cathode/anode, etc. In these studies, my simulations are developed based on triangular

cavities. In the future, we can consider the circular cavity-anode structure as shown in Figure

7.3.3 which is a typical magnetron for microwave ovens in cutaway view. The cylindrical

anode structure contains a number of equally spaced cavity resonators with the anode surface

adjacent to the cylindrical cathode. Permanent magnets are used to provide the necessary

magnetic ﬁeld, which is perpendicular to the electric ﬁeld between the cathode and the

anode. The power output is coupled through an antenna that runs from one of the cavities

to a waveguide that channels the microwave radiation to the cooking chamber.

Further, beyond the magnetrons, we can develop simulations for other kinds of HPM

tubes such as the accelerator klystron. Figure 7.3.3 describes the two-cavity klystron [30]

147

Figure 7.1: Magnetron with eight circular hole cavity anode (this ﬁgure is from Encyclopedia

Britannica, Inc.)

Figure 7.2: Two-cavity klystron (this ﬁgure is from Reference [30])

7.3.4

Incorporate with Particles

We obtained a cold test (in Chapter 5) using the electromagnetic waves without including

particles successfully. Now, we are planning to perform hot tests by including particles in

148

the simulation. For this purpose, we tested for a point source as a delta function in three

dimensions. Next we will see how to derive the scheme which includes the point source.

7.3.4.1 Derive the scheme for 3D point source

Let’s rewrite the Helmholtz operator in 1D,

(cid:16)

(cid:17)

(.)

Lx(.) =

1 − 1

α2 ∂xx

We use the free space Green’s function to solve the L−1 as previously described. The one

dimensional Green’s function can be given by,

Upon using the Green’s function,

G(x|x0) =

e−α|x−x0|2
|x − x0|2

1
4π

(cid:90)

G(x|x0)(.)dx0

G(x|x0)(.)dx0 +

(7.1)

(7.2)

(cid:90)

L−1
x (.) =

Ωx0

Ωx0

Now, we break the solution into two pieces, the particular solution and the homogeneous

solutions:

where,

b(cid:90)

a

Ix(.) =

α
2

L−1
x (.) := Ix(.) + Hx

(7.3)

e−α|x−y|(.)dy, Hx = axe−α(x−xa) + bxe−α(xb−x).

For three dimensional inverse operator L−1(.) for the ADI splitting is performed one line

149

at a time. Recall,

where,

b(cid:90)

a

Ik(.) =

α
2

L−1
k (.) := Ik(.) + Hk

(7.4)

e−α|k−k0|(.)dk0, Hk = ake−α(k−ka) + bke−α(kb−k).

So the three dimensional operator L−1(.) can be expressed as,

L−1(.) = L−1

x (L−1

y (L−1

z (.))) + O(∆t2)

or

or

L−1(.) = Ix,y,z(.) + Hx,y,z(.) + O(∆t2)

L−1(.) = Ix(Iy(Iz(.) + Hz) + Hy) + Hx + O(∆t2)

By plugging into the second order scheme,

un+1 = (2 − β2)un − un−1 + β2 ¯L−1
un+1 = (2 − β2)un − un−1 + β2L−1

x

3D [un]

(cid:2)L−1

y

(cid:2)L−1
z [un](cid:3)(cid:3) .

(7.5)

(7.6)

and including a point source S,

un+1 = (2 − β2)un − un−1 + β2L−1

x

(cid:2)L−1

y

(cid:2)L−1
z [un](cid:3)(cid:3) +

β2
α2 Sn.Gδ

3D.

(7.7)

where,

Gδ

3D =

1
4π

√

(cid:112)(x − x0)2 + (y − y0)2 + (z − z0)2 + δ2

(x−x0)2+(y−y0)2+(z−z0)2

e−α

150

Gδ

3D(x/x0) =

(cid:112)(x − x0)2 + (y − y0)2 + (z − z0)2 + δ2

e−α||x−x0||2

1
4π

where δ ∈ R+, 0 < δ2 < ∆x

7.3.4.2 Numerical Example

For the next two experiments we placed two point sources sin(2πf t) with equal fre-

quencies (f = 1) for the ﬁrst case and two point sources sin(2πf t) with diﬀerent fre-
quencies f = 1 and f = 10 for the second case at the corners (−0.613,−0.613,−0.613)
and (1.613, 1.613, 1.613) of the cubical domain ([−1, 1]3). We simulate the wave propaga-

tion by applying outﬂow and Dirichlet boundary conditions along the right surface of the

cube and the other ﬁve surfaces of the cube respectively. We set the values of parameters

∆x = ∆y = ∆z = 0.013, ∆t = 0.0067, and β = 2. Figure 7.3 and 7.5 show snapshots of the

wave at diﬀerent time instants for the symmetric and asymmetric cases.

Finally, we perform a self-reﬁnement study for time and space convergence using a point
source sin(2πf t) with frequencies f = 1 at the center of the cubical domain ([−1, 1]3). Figure

7.4 shows the time and space convergence plots for the time evolution of a point source ﬁeld.
For the time convergence test, we maintain the resolution at 40 × 40 × 40 and reduce the

time step size ∆t = 0.1 to 0.0125. For the space convergence test, we maintain the CFL

value as 1.3 and reduce the spatial step size ∆x = 1.313 to 0.082. We obtained second-order

convergence in time and space

This shows a positive result to deal with particles properly, and the approach can be

extended to high-order by using the Taylor series expansion in terms of the source function.

151

(a) t = 1.23ns

(b) t = 3.01ns

(c) t = 5.68ns

(d) t = 8.81ns

Figure 7.3: Evolution of the two 3D point sources sin(2πf t) with the same frequency f = 1
at the corners of the cubical domain ([−1, 1]3), spatial step size ∆x = ∆y =
∆z = 0.013, and time step size ∆t = 0.0067.

(a) time convergent

(b) space convergent

Figure 7.4: Time (a) and space (b) convergence studies using 3D point source sin(2πf t)

with frequencies f = 1 at the center of the cubical domain ([−1, 1]3).

152

(a) t = 1.23ns

(b) t = 3.01ns

(c) t = 5.68ns

(d) t = 8.81ns

Figure 7.5: Evolution of the two 3D point sources sin(2πf t) with frequencies 1 and 10 at the
corners of the cubical domain ([−1, 1]3), spatial step size ∆x = ∆y = ∆z = 0.013,
and time step size ∆t = 0.0067.

7.4 Summary

I gave the summary and contributions of this research work and gave potential future

directions in this chapter. I strengthen the MOLT based scheme by including high-order

Neumann, outﬂow and PEC boundary conditions, general geometry framework, 3D im-

plementations, and accelerated codes and wave kernel for MOLTN development. We are

proposing to work with particles, several HPM tube simulations, and optimization and de-

velopment of MOLTN in the future.

153

BIBLIOGRAPHY

154

BIBLIOGRAPHY

[1] Alpert, B., L. Greengard, and T. Hagstrom (2000), Rapid evaluation of nonreﬂect-
ing boundary kernels for time-domain wave propagation, SIAM Journal on Numerical
Analysis, 37 (4), 1138–1164, doi:10.1137/S0036142998336916.

[2] AlSalem, H., P. Petrov, G. Newman, and J. Rector (2018), Embedded boundary meth-
ods for modeling 3d ﬁnite-diﬀerence laplace-fourier domain acoustic-wave equation
with free-surface topography, GEOPHYSICS, 83 (5), T291–T300, doi:10.1190/geo2017-
0828.1.

[3] Appelo, D., and N. A. Petersson (2012), A Fourth-Order Accurate Embedded Boundary

Method for the Wave Equation, SIAM J. Sci. Comput., 34 (6), A2982–A3008.

[4] Ascher, U., R. Mattheij, and R. Russell (1995), Numerical Solution of Boundary Value
Problems for Ordinary Diﬀerential Equations, Society for Industrial and Applied Math-
ematics, doi:10.1137/1.9781611971231.

[5] Benford, J. (2010), History and future of the relativistic magnetron, in 2010 Interna-
tional Conference on the Origins and Evolution of the Cavity Magnetron, pp. 40–45,
doi:10.1109/CAVMAG.2010.5565566.

[6] Berenger, J.-P. (1994), A perfectly matched layer for the absorption of elec-
tromagnetic waves, Journal of Computational Physics, 114 (2), 185 – 200, doi:
https://doi.org/10.1006/jcph.1994.1159.

[7] Bernacki, M., S. Lanteri, and S. Piperno (2006), Time-Domain Parallel Simulation of
Heterogeneous Wave Propagation on Unstructured Grids Using Explicit, Nondiﬀusive,
Discontinuous Galerkin Methods, Journal of Computational Acoustics, 14 (01), 57–81,
doi:10.1142/S0218396X06002937.

[8] Bruno, O. P., and M. Lyon (2010), High-order unconditionally stable FC-AD solvers for
general smooth domains I . Basic elements, Journal of Computational Physics, 229 (6),
2009–2033.

[9] Bruno, O. P., and M. Lyon (2010), High-order unconditionally stable FC-AD solvers for
general smooth domains I. Basic elements , Journal of Computational Physics, 229 (6),
2009 – 2033, doi:http://dx.doi.org/10.1016/j.jcp.2009.11.020.

[10] Cai, W. (2004), Numerical methods for maxwell’s equations in inhomogeneous media

with material interfaces, Journal of Computational Mathematics, 22 (2), 156–167.

[11] Cai, W., and S. Deng (2003), An upwinding embedded boundary method for Maxwells
equations in media with material interfaces: 2D case, Journal of Computational Physics,
190 (1), 159 – 183, doi:http://dx.doi.org/10.1016/S0021-9991(03)00269-9.

155

[12] Catella, A., V. Dolean, and S. Lanteri (2008), An Unconditionally Stable Discontin-
uous Galerkin Method for Solving the 2-D Time-Domain Maxwell Equations on Un-
structured Triangular Meshes, IEEE Transactions on Magnetics, 44 (6), 1250–1253,
doi:10.1109/TMAG.2007.916578.

[13] Causley, M., A. Christlieb, and E. Wolf (2017), Method of Lines Transpose: An Eﬃcient
Unconditionally Stable Solver for Wave Propagation, Journal of Scientiﬁc Computing,
70 (2), 896–921, doi:10.1007/s10915-016-0268-8.

[14] Causley, M. F., and A. J. Christlieb (2014), Higher Order A-Stable Schemes for the
Wave Equation Using a Successive Convolution Approach, SIAM Journal on Numerical
Analysis, 52 (1), 220–235, doi:10.1137/130932685.

[15] Causley, M. F., A. Christlieb, Y. G¨u¸cl¨u, and E. Wolf (2013), Method of Lines Transpose:

A Fast Implicit Wave Propagator, Mathematics of Computation, submitted.

[16] Chini, G. P., and S. Leibovich (2003), An analysis of the klemp and dur-
internal waves,
doi:10.1175/1520-

ran radiation boundary condition as applied to dissipative
of Physical Oceanography,
Journal
0485(2003)033¡2394:AAOTKA¿2.0.CO;2.

33 (11),

2394–2407,

[17] Coifman, R., V. Rokhlin, and S. Wandzura (1993), The fast multipole method for the
wave equation: a pedestrian prescription, IEEE Antennas and Propagation Magazine,
35 (3), 7–12, doi:10.1109/74.250128.

[18] Deng, S., and W. Cai (2006), A Fourth-Order Upwinding Embedded Boundary Method

(UEBM) for Maxwells Equations in Media with Material Interfaces: Part I.

[19] Ditkowski, A., K. Dridi, and J. Hesthaven (2001), Convergent Cartesian Grid Methods
for Maxwell’s Equations in Complex Geometries, Journal of Computational Physics,
170 (1), 39 – 80, doi:http://dx.doi.org/10.1006/jcph.2001.6719.

[20] Dolean, V., H. Fahs, L. Fezoui, and S. Lanteri (2010), Locally implicit discontinuous
Galerkin method for time domain electromagnetics, Journal of Computational Physics,
229 (2), 512 – 526, doi:http://dx.doi.org/10.1016/j.jcp.2009.09.038.

[21] Dridi, K. H., J. S. Hesthaven, and A. Ditkowski (2001), Staircase-free ﬁnite-diﬀerence
time-domain formulation for general materials in complex geometries, IEEE Transac-
tions on Antennas and Propagation, 49 (5), 749–756, doi:10.1109/8.929629.

[22] Duru, K., and G. Kreiss (2012), A Well-Posed and Discretely Stable Perfectly Matched
Layer for Elastic Wave Equations in Second Order Formulation, Communications in
Computational Physics, pp. 1–35.

[23] Dwivedi, S., and P. K. Jain (2013), Magnetically Insulated Line Oscillator (MILO)
Performance Study and its Parameter Optimization, IEEE Transactions on Plasma
Science, 41 (9), 2532–2538, doi:10.1109/TPS.2013.2277859.

156

[24] Edwards, H. C., C. R. Trott,

En-
abling manycore performance portability through polymorphic memory access pat-
terns, Journal of Parallel and Distributed Computing, 74 (12), 3202 – 3216, doi:
https://doi.org/10.1016/j.jpdc.2014.07.003, domain-Speciﬁc Languages and High-Level
Frameworks for High-Performance Computing.

and D. Sunderland (2014), Kokkos:

[25] Engquist, B., and A. Majda (1977), Absorbing boundary conditions for the nu-
merical simulation of waves, Mathematics of Computation, 31 (139), 629–651, doi:
10.1090/S0025-5718-1977-0436612-4.

[26] Fornberg, B. (2001), A short proof of the unconditional stability of the ADI-FDTD

scheme, Tech. rep., University of Colorado.

[27] Fornberg, B., J. Zuev, and J. Lee (2007), Stability and accuracy of time-extrapolated
ADI-FDTD methods for solving wave equations, Journal of Computational and Applied
Mathematics, 200 (1), 178 – 192.

[28] Fuks, M., and E. Schamiloglu (2005), Rapid start of oscillations in a mag-
205,101, doi:

netron with a ”transparent” cathode, Phys. Rev. Lett.,
10.1103/PhysRevLett.95.205101.

95,

[29] Gedney, S. D., and U. Navsariwala (1995), An unconditionally stable ﬁnite element
time-domain solution of the vector wave equation, IEEE Microwave and Guided Wave
Letters, 5 (10), 332–334, doi:10.1109/75.465046.

[30] Gewartowski, H. A. W., J. W. (1965), Principles of electron tubes, D. Van Nostrand

Company, Inc.

[31] Gimbutas, Z., and V. Rokhlin (2003), A Generalized Fast Multipole Method for
Nonoscillatory Kernels, SIAM Journal on Scientiﬁc Computing, 24 (3), 796–817, doi:
10.1137/S1064827500381148.

[32] Goplen, B., L. Ludeking, D. Smith, and G. Warren (1995), User-conﬁgurable MAGIC
for electromagnetic PIC calculations, Computer Physics Communications, 87 (1), 54 –
86, doi:https://doi.org/10.1016/0010-4655(95)00010-D, particle Simulation Methods.

[33] Harlow, J. (2003), Electric Power Transformer Engineering, The Electric Power Engi-

neering Hbk, Second Edition, CRC Press.

[34] Henshaw, W. D. (2006), A High-Order Accurate Parallel Solver for Maxwell’s Equations

on Overlapping Grids, SIAM J. Scientiﬁc Computing, 28, 1730–1765.

[35] Higdon, R. L. (1987), Numerical absorbing boundary conditions for the wave equation,

Mathematics of Computation, 49 (179), 65–90.

[36] Huang, Y. J., L. H. Yeh, and K. R. Chu (2014), An analytical study on the
diﬀraction quality factor of open cavities, Physics of Plasmas, 21 (10), 103,112, doi:
10.1063/1.4900415.

157

[37] Jackson, J. D. (1999), Classical Electrodynamics, 3rd ed. ed., Wiley, New York, NY.

[38] Jackson, S. D., and P. H. Muir (2001), Theory and numerical simulation of nth-
order cascaded Raman ﬁber lasers, J. Opt. Soc. Am. B, 18 (9), 1297–1306, doi:
10.1364/JOSAB.18.001297.

[39] Jacobs, G., and J. Hesthaven (2009), Implicit explicit time integration of a high-order
particle-in-cell method with hyperbolic divergence cleaning, Computer Physics Commu-
nications, 180 (10), 1760 – 1767, doi:https://doi.org/10.1016/j.cpc.2009.05.020.

[40] Jacobs, G. B., J. S. Hesthaven, and G. Lapenta (2006), Simulations of the Weibel
instability with a high-order discontinuous galerkin particle-in-cell solver, Collection of
Technical Papers - 44th AIAA Aerospace Sciences Meeting, 19, 14,189–14,199.

[41] Jia, J., and J. Huang (2008), Krylov deferred correction accelerated method of lines
transpose for parabolic problems, Journal of Computational Physics, 227 (3), 1739–
1753.

[42] Jin, J. (2010), Theory and Computation of Electromagnetic Fields, John Wiley & Sons,

Inc, doi:10.1002/9780470874257.

[43] Joannopoulos, J., S. Johnson, J. Winn, and R. Meade (2011), Photonic Crystals: Mold-

ing the Flow of Light (Second Edition), Princeton University Press.

[44] Johansen, H., and P. Colella (1998), A Cartesian Grid Embedded Boundary Method
for Poisson’s Equation on Irregular Domains, J. Computational. Phys., 147 (1), 60–85,
doi:10.1006/jcph.1998.5965.

[45] Kreiss, H., and N. Petersson (2006), An embedded boundary method for the wave
equation with discontinuous coeﬃcients, SIAM Journal on Scientiﬁc Computing, 28 (6),
2054–2074, doi:10.1137/050641399.

[46] Kreiss, H., and N. A. Petersson (2006), An Embedded Boundary Method for the
Wave Equation with Discontinuous Coeﬃcients, SIAM Journal on Scientiﬁc Comput-
ing, 28 (6), 2054–2074, doi:10.1137/050641399.

[47] Lambers, J. V. (2016), Solution of time-dependent PDE through component-wise ap-
proximation of matrix functions, IAENG Journal of Applied Mathematics, 41 (1), 1–26.

[48] Lapidus, L., and G. F. Pinder (1999), Numerical Solution of Partial Diﬀerential Equa-

tions in Science and Engineering, 677 pp., John Wiley and Sons Inc.

[49] LeVeque, R. (2007), Finite Diﬀerence Methods for Ordinary and Partial Diﬀerential
Equations: Steady-State and Time-Dependent Problems, Society for Industrial and Ap-
plied Mathematics.

[50] Li, P., H. Johnston, and R. Krasny (2009), A Cartesian treecode for screened
Coulomb interactions , Journal of Computational Physics, 228 (10), 3858 – 3868, doi:
http://dx.doi.org/10.1016/j.jcp.2009.02.022.

158

[51] M. C. Lin, C. Nieter, P. H. Stoltz, D. N. Smithe (2010), Accurately and Ef-
ﬁciently Studying the RF Structures Using a Conformal Finite-Diﬀerence Time-
Domain Particle-in-Cell Method, The Open Plasma Physics Journal, 3, 48–52, doi:
10.2174/1876534301003010048.

[52] Marfurt, K. J. (1984), Accuracy of ﬁnitediﬀerence and ﬁniteelement modeling
of the scalar and elastic wave equations, GEOPHYSICS, 49 (5), 533–549, doi:
10.1190/1.1441689.

[53] Matthew Causley, B. O., Andrew Christlieb, and L. V. Groningen (2014), Method of
lines transpose: An implicit solution to the wave equation, Mathematics of Computation,
83, 2763–2786, doi:https://doi.org/10.1090/S0025-5718-2014-02834-2.

[54] Mazzia, A., and F. Mazzia (1997), High-order transverse schemes for the numerical
solution of PDEs, Journal of Computational and Applied Mathematics, 82 (1), 299 –
311, doi:http://dx.doi.org/10.1016/S0377-0427(97)00090-3.

[55] McCorquodale, P., P. Colella, and H. Johansen (2001), A Cartesian grid embedded

boundary method for the heat equation on irregular domains.

[56] Mendonca, C., S. Prasad, E. Schamiloglu, and T. Fleming (2011), 3d icepic simulations
of a pulsed relativistic magnetron with transparent cathode: A comparative study, in
2011 IEEE Pulsed Power Conference, pp. 823–828, doi:10.1109/PPC.2011.6191521.

[57] Monk, P., and E. Suli (1994), Error estimates for Yee’s method on non-uniform grids,

IEEE Transactions on Magnetics, 30 (5), 3200–3203, doi:10.1109/20.312618.

[58] Namiki, T. (1999), A new FDTD algorithm based on alternating-direction implicit
method, IEEE Transactions on Microwave Theory and Techniques, 47 (10), 2003–2007,
doi:10.1109/22.795075.

[59] Newmark, N. (1959), A Method of Computation for Structural Dynamics, no. nos. 179-
181 in A Method of Computation for Structural Dynamics, American Society of Civil
Engineers.

[60] Orlanski, I. (1976), A simple boundary condition for unbounded hyperbolic ﬂows,
Journal of Computational Physics, 21 (3), 251 – 269, doi:https://doi.org/10.1016/0021-
9991(76)90023-1.

[61] Palevsky, A. (1980), Generation of Intense Microwave Radiation by the Relativistic E-
beam Magnetron (experiment and Numerical Simulation), Massachusetts Institute of
Technology, Department of Physics.

[62] Palevsky, A., and G. Bekeﬁ (1979), Microwave emission from pulsed, relativistic ebeam
diodes. II. The multiresonator magnetron, The Physics of Fluids, 22 (5), 986–996, doi:
10.1063/1.862663.

159

[63] Pearson, R. A. (1974), Consistent boundary conditions for numerical models of systems
that admit dispersive waves, Journal of the Atmospheric Sciences, 31 (6), 1481–1489,
doi:10.1175/1520-0469(1974)031¡1481:CBCFNM¿2.0.CO;2.

[64] Pierson Guthrey, W. S. A. C., Thavappiragasam Mathialakan (2019), Domain Decom-

position Strategy for Method of Line Transpose Based Scheme.

[65] Ricci, P., G. Lapenta,

(2002), A Simpliﬁed Implicit
Maxwell Solver, Journal of Computational Physics, 183 (1), 117 – 141, doi:
http://dx.doi.org/10.1006/jcph.2002.7170.

and J. Brackbill

[66] Sommerfeld, A. (1949), p. iv, doi:https://doi.org/10.1016/B978-0-12-654658-3.50002-1.

[67] T. Austin, D. N. S., J. R. Cary, and C. Nieter (2010), Alternating Direction Implicit
Methods for FDTD Using the Dey-Mittra Embedded Boundary Method, The Open
Plasma Physics Journal, 3, 29–35, doi:10.2174/1876534301003010029.

[68] Thavappiragasam, M., A. Viswanathan, and A. Christlieb (2017), MOLT based fast
high-order three dimensional A-stable scheme for wave propagation, Journal of Coupled
Systems and Multiscale Dynamics, 5 (2), 151–163, doi:doi:10.1166/jcsmd.2017.1137.

[69] Ting, L., and M. J. Miksis (1986), Exact boundary conditions for scattering prob-
lems, The Journal of the Acoustical Society of America, 80 (6), 1825–1827, doi:
10.1121/1.394297.

[70] Wang, Y., and S. Langdon (2010), Design of wave ports in fdtd and its application
to microwave circuits and antennas, in 2010 IEEE Antennas and Propagation Society
International Symposium, pp. 1–4, doi:10.1109/APS.2010.5562023.

[71] Yee, K. (1966), Numerical solution of

initial boundary value problems involving
Maxwell’s equations in isotropic media, IEEE Transactions on Antennas and Propa-
gation, 14 (3), 302–307, doi:10.1109/TAP.1966.1138693.

[72] Zafarullah, A. (1970), Application of the Method of Lines to Parabolic Par-
tial Diﬀerential Equations With Error Estimates, J. ACM, 17 (2), 294–302, doi:
10.1145/321574.321583.

[73] Zheng, F., Z. Chen, and J. Zhang (1999), A ﬁnite-diﬀerence time-domain method with-
out the Courant stability conditions, IEEE Microwave and Guided Wave Letters, 9 (11),
441–443, doi:10.1109/75.808026.

[74] Zienkiewicz, O., and R. Newton (1969), Coupled Vibrations of a Structure Submerged

in a Compressible Fluid.

160